This question already has answers here:
Reg expression to remove empty Tags (any of them)?
(3 answers)
Closed 9 years ago.
As mentioned in the title, I'd like to remove all empty elements from an XML document.
By empty I mean elements that have no text nodes, either directly or in any of their descendants.
Is it possible to do that with phpQuery?
I used Gordon's code from an answer in this topic: Reg expression to remove empty Tags (any of them)?
First I tried simply passing his XPath query to the phpQueryObject::find() method, but it gave me a warning saying the query is incorrect. I don't know why, since it uses DOMXPath and should work.
Anyway, the solution was still quite simple.
$pqDoc = phpQuery::newDocument(); // phpQueryObject created some way; how doesn't matter here.
$xp = new DOMXPath($pqDoc->getDOMDocument());
foreach ($xp->query('//*[not(node()) or normalize-space() = ""]') as $node) {
    $node->parentNode->removeChild($node);
}
The empty elements are now removed, and you can keep using the modified phpQueryObject, since the XPath query worked on a reference to its underlying DOMDocument.
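For readers without phpQuery, the same idea can be sketched with plain DOMDocument. The sample markup and the root-element guard are assumptions added for illustration; they are not part of the original answer.

```php
<?php
// Hypothetical sample document: <b>, <c> and <d> contain no text.
$xml = '<root><a>text</a><b/><c><d>  </d></c><e><f>more</f></e></root>';

$doc = new DOMDocument();
$doc->loadXML($xml);

$xp = new DOMXPath($doc);
// Matches elements with no child nodes, or whose text content is whitespace only.
foreach ($xp->query('//*[not(node()) or normalize-space() = ""]') as $node) {
    // Guard (an addition): never remove the root element itself.
    if ($node->parentNode !== null && !$node->isSameNode($doc->documentElement)) {
        $node->parentNode->removeChild($node);
    }
}

$result = $doc->saveXML($doc->documentElement);
echo $result, "\n";
```

Note that this query also removes elements that carry only attributes (for example an img tag in HTML), because it looks at child nodes and text content only.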
This question already has answers here:
How do you parse and process HTML/XML in PHP?
(31 answers)
Closed 3 years ago.
I want to extract the href of every anchor that has a certain class, such as link-wrapper.
So, this means I will get the href of a link like:
click here
P.S. It should extract both links if they appear one right after the other, like:
link-1link-2
I tried the solutions already present on Stack Overflow, but none suited my problem, since some of them were in JavaScript and other languages. I looked into DOMDocument, but it's a bit difficult to find a solution that matches exactly.
I tried some preg_match patterns, which didn't work for me, like:
preg_match('/<a(?:(?!class\=")(?:.|\n))*class\="(?:(?!link\-wrapper)(?:.|\n))*link\-wrapper(?:(?!<\/a>)(?:.|\n))*<\/a>/i', $content, $output_array);
You can use DOMDocument and DOMXPath to get your results. First load the HTML into a DOMDocument and then use an XPath query to find all the anchors whose class attribute contains link-wrapper, e.g.
$html = 'click herelink-3
link-1link-2';
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
$urls = [];
foreach ($xpath->query('//a[contains(@class, "link-wrapper")]') as $a) {
    $urls[] = $a->attributes->getNamedItem('href')->nodeValue;
}
foreach ($urls as $url) {
    echo "$url\n";
}
Output:
blaa..blaa
blaa
blaa..again
Demo on 3v4l.org
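One caveat with contains() on the class attribute is that it does a substring match, so a class such as link-wrapper-old would match as well. A stricter variant could look like the sketch below. The markup is hypothetical: the three href values mirror the output shown above, while the other anchors and class names are invented for illustration.

```php
<?php
// Hypothetical markup; only the three matching href values come from the output above.
$html = '<div>'
      . '<a class="link-wrapper" href="blaa..blaa">click here</a>'
      . '<a class="other" href="skip-me">link-3</a>'
      . '<a class="btn link-wrapper" href="blaa">link-1</a>'
      . '<a class="link-wrapper" href="blaa..again">link-2</a>'
      . '<a class="link-wrapper-old" href="also-skipped">link-4</a>'
      . '</div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Pad the class list with spaces so only the whole token "link-wrapper" matches.
$query = '//a[contains(concat(" ", normalize-space(@class), " "), " link-wrapper ")]';

$urls = [];
foreach ($xpath->query($query) as $a) {
    $urls[] = $a->getAttribute('href');
}
print_r($urls);
```

With this query the anchors with class other and link-wrapper-old are skipped, while an anchor carrying several classes still matches as long as one of them is exactly link-wrapper.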
This question already has answers here:
how to use dom php parser
(4 answers)
Closed 9 years ago.
<?php
$html = file_get_contents('http://xpool.xram.co/index.cgi');
echo $html;
?>
I want to get the information inside one tag on a remote website using PHP, and only that tag.
I found this small snippet, which is great for retrieving the entire page source. However, I want to get only a small section. How can I filter out all the other tags and get only the one tag I need?
I'd suggest using a PHP DOM parser such as Simple HTML DOM (http://simplehtmldom.sourceforge.net/manual.htm).
require_once 'simple_html_dom.php';
$html = file_get_html('http://xpool.xram.co/index.cgi'); // Parse the page into a DOM object; file_get_contents() would only return a string.
$p = $html->find('p'); // Find all p tags.
$specific_class = $html->find('.classname'); // Find elements with classname as class.
$element_id = $html->find('#element'); // Find the element with the id element.
Read the docs, there are tons of other options available.
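If adding a third-party library is not an option, roughly the same lookups can be sketched with PHP's built-in DOM extension instead of Simple HTML DOM. The page source below is a hypothetical stand-in; in practice it would be the string returned by file_get_contents().

```php
<?php
// Hypothetical page source standing in for the remote page.
$source = '<html><body><p class="stats">Hash rate: 12</p><p>Other</p>'
        . '<div id="element">Footer</div></body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // real-world HTML often triggers parser warnings
$doc->loadHTML($source);
libxml_clear_errors();

// Collect the text of every p tag.
$paragraphs = [];
foreach ($doc->getElementsByTagName('p') as $p) {
    $paragraphs[] = $p->textContent;
}

// Look up a single element by its id attribute with XPath.
$xpath = new DOMXPath($doc);
$byId = $xpath->query('//*[@id="element"]')->item(0);
$byIdText = $byId !== null ? $byId->textContent : null;

print_r($paragraphs);
echo $byIdText, "\n";
```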
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Xpath fails if an element has a a xmlns attribute
I have been trying for a long time to extract a string from the following XML, with no luck:
http://chris.photobooks.com/xml/default.htm?state=8T
I am trying to get the ASIN number of a book and I have tried
$xpath->query('//MarketplaceASIN/ASIN')->item(0)->nodeValue;
and
$xpath->query('/GetMatchingProductResponse/GetMatchingProductResult[1]/Product/Identifiers/MarketplaceASIN/ASIN')->item(0)->nodeValue;
but neither seems to work. What am I doing wrong here?
The elements in that document are bound to the namespace http://mws.amazonservices.com/schema/Products/2011-10-01.
You may have missed it because it does not use a namespace-prefix and the xmlns="http://mws.amazonservices.com/schema/Products/2011-10-01" just looks like an attribute, but namespace attributes are special.
All of the descendant elements inherit that namespace. You will want to register the namespace with a namespace-prefix and adjust your XPath:
$rootNamespace = $xml->lookupNamespaceUri($xml->namespaceURI);
$xpath->registerNamespace('a', $rootNamespace);
$elementList = $xpath->query('//a:MarketplaceASIN/a:ASIN');
Or you could use a more generic XPath that matches on elements and uses a predicate filter to match the local-name() and namespace-uri():
//*[local-name()='MarketplaceASIN' and namespace-uri()='http://mws.amazonservices.com/schema/Products/2011-10-01']/*[local-name()='ASIN' and namespace-uri()='http://mws.amazonservices.com/schema/Products/2011-10-01']
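To see the effect of the default namespace in isolation, here is a minimal sketch. The XML is a heavily trimmed, hypothetical response: only the namespace URI is taken from the answer above, and the ASIN value is a made-up placeholder.

```php
<?php
// Hypothetical trimmed response; the ASIN value is a placeholder.
$xmlString = '<GetMatchingProductResponse xmlns="http://mws.amazonservices.com/schema/Products/2011-10-01">'
           . '<GetMatchingProductResult><Product><Identifiers><MarketplaceASIN>'
           . '<ASIN>B000000000</ASIN>'
           . '</MarketplaceASIN></Identifiers></Product></GetMatchingProductResult>'
           . '</GetMatchingProductResponse>';

$xml = new DOMDocument();
$xml->loadXML($xmlString);
$xpath = new DOMXPath($xml);

// Unprefixed names in XPath 1.0 only match elements in no namespace, so this finds nothing.
$unprefixed = $xpath->query('//MarketplaceASIN/ASIN')->length;

// Bind the prefix "a" to the root element's namespace and query again.
$xpath->registerNamespace('a', $xml->documentElement->namespaceURI);
$asinNode = $xpath->query('//a:MarketplaceASIN/a:ASIN')->item(0);
$asin = $asinNode !== null ? $asinNode->nodeValue : null;

var_dump($unprefixed, $asin);
```

The prefix a is arbitrary; it only has to match between registerNamespace() and the query, and it does not need to appear in the document.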
This question already has answers here:
Getting actual value from PHP SimpleXML node [duplicate]
(4 answers)
Closed 8 years ago.
I am using simplexml_load_string for XML packets. In my scenario, the XML string I want to convert is known as k.
My problem, however, is that when I use k, tags still remain that weren't parsed (<k>, <\k>).
For example, I use
$x->k, and I get back <k>DATA I WANT HERE<\EK>.
How do I get rid of these?
What the code does: It connects to a game and logs in.
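Cast the element to a string to get the value without the tags:
(string)$x->k
Note that SimpleXML has no InnerNode property; $x->k->InnerNode would look for a child element named InnerNode and come back empty.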
I tried this and seem to be getting the string.
<?php
$str = "<msg t='sys'><body action='rndK' r='-1'><k>qH~e9Gmt</k></body></msg>";
$xml = simplexml_load_string( $str );
echo $xml->body->k; // gives 'qH~e9Gmt'
?>
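A short sketch of why the cast matters, reusing the string from the snippet above: the property access returns a SimpleXMLElement object rather than a string, and only asXML() gives back the markup including the tags. Whether the asker was calling asXML() is an assumption; the question does not say.

```php
<?php
$str = "<msg t='sys'><body action='rndK' r='-1'><k>qH~e9Gmt</k></body></msg>";
$xml = simplexml_load_string($str);

$node   = $xml->body->k;   // still a SimpleXMLElement object
$value  = (string) $node;  // plain PHP string holding the text content
$markup = $node->asXML();  // serialized element, tags included

var_dump($node instanceof SimpleXMLElement); // bool(true)
var_dump($value);                            // string(8) "qH~e9Gmt"
var_dump($markup);                           // string(15) "<k>qH~e9Gmt</k>"
```

echo casts implicitly, which is why the snippet above prints the bare value; an explicit cast is needed when the value is stored, compared with ===, or used as an array key.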
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
PHP SimpleXML doesn't preserve line breaks in XML attributes
I have the following XML:
$xmldatas = '<layer text="name
id"></layer>';
I parse this XML with
$xml = simplexml_load_string($xmldatas);
But when I checked $xml, the \n had been replaced with a space. I want the newline to remain as it is after the XML is parsed.
How can I do that?
Thanks
XML will not keep a literal newline character in the text attribute: parsers normalize whitespace inside attribute values, so the line break becomes a space.
If you are generating the xml maybe you want to do something like this?
$xmldatas = '<layer><text>name
id</text></layer>';
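If the line break has to stay in the attribute, another option is to write it as the numeric character reference &#10;, which attribute-value normalization leaves alone. A minimal sketch comparing the two forms, with variable names chosen only for illustration:

```php
<?php
// Literal newline inside the attribute value: normalized to a space by the parser.
$literal = simplexml_load_string("<layer text=\"name\nid\"></layer>");
// Newline written as a character reference: preserved as a real line feed.
$escaped = simplexml_load_string('<layer text="name&#10;id"></layer>');

$literalText = (string) $literal['text'];
$escapedText = (string) $escaped['text'];

var_dump($literalText === 'name id');  // bool(true)
var_dump($escapedText === "name\nid"); // bool(true)
```

This only helps when you control how the XML is generated; once a literal newline has been parsed out of an attribute, it cannot be told apart from an ordinary space.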