Remove empty XML elements with phpQuery [duplicate] - php

This question already has answers here:
Reg expression to remove empty Tags (any of them)?
(3 answers)
Closed 9 years ago.
As mentioned in the title, I'd like to remove all empty elements from an XML document.
By empty I mean elements that contain no text nodes, either directly or in any of their descendants.
Is it possible to do that with phpQuery?

I used Gordon's code from an answer in this topic: Reg expression to remove empty Tags (any of them)?
At first I tried passing his XPath query straight to the phpQueryObject::find() method, but it gave me a warning saying the query is incorrect. I don't know why, since phpQuery uses DOMXPath internally and it should work.
Anyway, the solution was still quite simple: run the query through DOMXPath on the underlying DOMDocument.
$pqDoc = phpQuery::newDocument(); // A phpQueryObject, created some way. How doesn't matter here.
$xp = new DOMXPath($pqDoc->getDOMDocument());
foreach($xp->query('//*[not(node()) or normalize-space() = ""]') as $node) {
$node->parentNode->removeChild($node);
}
Now the empty elements are removed, and you can keep using the phpQueryObject, since the loop worked on a reference to the same DOMDocument that the object wraps.
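For readers without phpQuery at hand, here is a minimal sketch of the same idea using only PHP's built-in DOM extension. The sample document and its element names are made up for illustration; the XPath expression is the one from the answer above.

```php
<?php
// Hypothetical sample document; the element names are placeholders.
$xml = '<root><a>text</a><b></b><c><d> </d></c><e><f>more</f></e></root>';

$doc = new DOMDocument();
$doc->loadXML($xml);

$xp = new DOMXPath($doc);

// Matches elements with no child nodes, or whose text (including
// descendant text) is empty or whitespace only.
foreach ($xp->query('//*[not(node()) or normalize-space() = ""]') as $node) {
    // The XPath result is a snapshot, so removing nodes while iterating is safe.
    $node->parentNode->removeChild($node);
}

// Serialize only the root element, without the XML declaration.
$result = $doc->saveXML($doc->documentElement);
echo $result, "\n";
```

Here `<b>`, `<c>` and the whitespace-only `<d>` are removed, while `<a>` and `<e>` survive because they contain text somewhere beneath them.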

Related

Is there any regex that will help me choose the anchors only with certain class? [duplicate]

This question already has answers here:
How do you parse and process HTML/XML in PHP?
(31 answers)
Closed 3 years ago.
I want to extract the href of only those anchors that have a certain class, such as link-wrapper.
So, this means I should get the href of a link like:
click here
P.S. It should extract both links if they appear one right after the other, like:
link-1link-2
I tried the solutions already present on Stack Overflow, but none suited my problem, since some of them were in JavaScript and other languages. I looked into DOMDocument, but it's a bit difficult to adapt those examples to my exact case.
I tried some preg_match patterns, which didn't work for me, like:
preg_match('/<a(?:(?!class\=")(?:.|\n))*class\="(?:(?!link\-wrapper)(?:.|\n))*link\-wrapper(?:(?!<\/a>)(?:.|\n))*<\/a>/i', $content, $output_array);
You can use DOMDocument and DOMXPath to get your results. First load the HTML into a DOMDocument and then use an XPath query to find all the anchors that have class including link-wrapper e.g.
$html = 'click herelink-3
link-1link-2';
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
foreach ($xpath->query('//a[contains(@class, "link-wrapper")]') as $a) {
$urls[] = $a->attributes->getNamedItem('href')->nodeValue;
}
foreach ($urls as $url) {
echo "$url\n";
}
Output:
blaa..blaa
blaa
blaa..again
Demo on 3v4l.org
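One caveat with contains(@class, "link-wrapper") is that it is a substring test, so it would also match a class such as link-wrapper-wide. A rough sketch of a stricter, whole-token match follows; the markup, URLs and extra class names are hypothetical and not taken from the question.

```php
<?php
// Hypothetical markup; URLs and class names are placeholders.
$html = '<div>'
      . '<a class="link-wrapper" href="https://example.com/1">link-1</a>'
      . '<a class="btn link-wrapper" href="https://example.com/2">link-2</a>'
      . '<a class="link-wrapper-wide" href="https://example.com/3">link-3</a>'
      . '<a href="https://example.com/4">link-4</a>'
      . '</div>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Pad the class list with spaces so that only the whole token
// "link-wrapper" matches, not classes that merely contain it.
$query = '//a[contains(concat(" ", normalize-space(@class), " "), " link-wrapper ")]';

$urls = [];
foreach ($xpath->query($query) as $a) {
    $urls[] = $a->getAttribute('href');
}

print_r($urls);
```

Only the first two anchors are collected; the third has a different class that happens to start with the same text, and the fourth has no class at all.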

Getting Specific Tag from Remote Site using PHP [duplicate]

This question already has answers here:
how to use dom php parser
(4 answers)
Closed 9 years ago.
<?php
$html = file_get_contents('http://xpool.xram.co/index.cgi');
echo $html;
?>
I want to get the information inside one tag on a remote website using PHP, and only that tag.
I found this small snippet, which is great for retrieving the entire page source. However, I only want a small section of it. How can I filter out all the other tags and get only the one tag I need?
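I'd suggest using the PHP Simple HTML DOM Parser (http://simplehtmldom.sourceforge.net/manual.htm).
require_once ('simple_html_dom.php');
// file_get_html() returns a parsed DOM object; file_get_contents() would only return a string, which has no find() method.
$html = file_get_html('http://xpool.xram.co/index.cgi');
$p = $html->find('p'); // Find all p tags.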
$specific_class = $html->find('.classname'); // Find elements with classname as class.
$element_id = $html->find('#element'); // Find element with the id element
Read the docs, there are tons of other options available.
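If adding a third-party library is not an option, a similar result can be sketched with PHP's built-in DOMDocument. The page source below is a made-up stand-in for whatever file_get_contents() would return from the remote site; the id and class values mirror the placeholders used above.

```php
<?php
// Hypothetical page source standing in for file_get_contents($url).
$html = '<html><body><h1>Pool status</h1>'
      . '<p id="element">first</p><p class="classname">second</p></body></html>';

// Real-world HTML is often invalid; collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Collect the text of every <p> element.
$paragraphs = [];
foreach ($doc->getElementsByTagName('p') as $p) {
    $paragraphs[] = $p->textContent;
}

// Select a single element by its id attribute via XPath.
$xpath = new DOMXPath($doc);
$byId = $xpath->query('//*[@id="element"]')->item(0)->textContent;

print_r($paragraphs);
echo $byId, "\n";
```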

XPath incorrect query path [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Xpath fails if an element has a a xmlns attribute
I have been trying for a long time, with no luck, to extract a string from the following XML:
http://chris.photobooks.com/xml/default.htm?state=8T
I am trying to get the ASIN number of a book and I have tried
$xpath->query('//MarketplaceASIN/ASIN')->item(0)->nodeValue;
and
$xpath->query('/GetMatchingProductResponse/GetMatchingProductResult[1]/Product/Identifiers/MarketplaceASIN/ASIN')->item(0)->nodeValue;
but neither seems to work. What am I doing wrong here?
The elements in that document are bound to the namespace http://mws.amazonservices.com/schema/Products/2011-10-01.
You may have missed it because it does not use a namespace-prefix and the xmlns="http://mws.amazonservices.com/schema/Products/2011-10-01" just looks like an attribute, but namespace attributes are special.
All of the descendant elements inherit that namespace. You will want to register the namespace with a namespace-prefix and adjust your XPath:
$rootNamespace = $xml->lookupNamespaceUri($xml->namespaceURI);
$xpath->registerNamespace('a', $rootNamespace);
$elementList = $xpath->query('//a:MarketplaceASIN/a:ASIN');
Or you could use a more generic XPath that matches on elements and uses a predicate filter to match the local-name() and namespace-uri():
//*[local-name()='MarketplaceASIN' and namespace-uri()='http://mws.amazonservices.com/schema/Products/2011-10-01']/*[local-name()='ASIN' and namespace-uri()='http://mws.amazonservices.com/schema/Products/2011-10-01']
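To make the effect of the default namespace concrete, here is a small self-contained sketch. The XML is a heavily trimmed, hypothetical stand-in for the real response, and the ASIN value is a placeholder; only the namespace URI comes from the question.

```php
<?php
// Hypothetical, trimmed response; the ASIN value is a placeholder.
$xmlString = '<GetMatchingProductResponse'
    . ' xmlns="http://mws.amazonservices.com/schema/Products/2011-10-01">'
    . '<GetMatchingProductResult><Product><Identifiers><MarketplaceASIN>'
    . '<ASIN>B000EXAMPLE</ASIN>'
    . '</MarketplaceASIN></Identifiers></Product></GetMatchingProductResult>'
    . '</GetMatchingProductResponse>';

$doc = new DOMDocument();
$doc->loadXML($xmlString);
$xpath = new DOMXPath($doc);

// Unprefixed names in XPath 1.0 only match elements in no namespace.
$unprefixedCount = $xpath->query('//MarketplaceASIN/ASIN')->length;

// Bind an arbitrary prefix ("a") to the namespace of the root element.
$xpath->registerNamespace('a', $doc->documentElement->namespaceURI);
$asin = $xpath->query('//a:MarketplaceASIN/a:ASIN')->item(0)->nodeValue;

echo $unprefixedCount, "\n";
echo $asin, "\n";
```

The prefix is only a local alias; it does not need to appear in the document itself.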

Tags still remaining after using simplexml_load_string [duplicate]

This question already has answers here:
Getting actual value from PHP SimpleXML node [duplicate]
(4 answers)
Closed 8 years ago.
I am using simplexml_load_string for XML packets. In my scenario, the XML string I want to convert is known as k.
My problem, however, is that when I use k, tags still remain that weren't parsed (<k>, <\k>).
For example, I use
$x->k, and I get back <k>DATA I WANT HERE<\EK>.
How do I get rid of these?
What the code does: It connects to a game and logs in.
SimpleXML has no InnerNode property; $x->k->InnerNode would look for a child element named InnerNode. To get the value without the tags, cast the element to a string:
(string)$x->k
I tried this and seem to be getting the string.
<?php
$str = "<msg t='sys'><body action='rndK' r='-1'><k>qH~e9Gmt</k></body></msg>";
$xml = simplexml_load_string( $str );
echo $xml->body->k; // gives 'qH~e9Gmt'
?>
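The difference between the element object, its string value and its serialized markup can be sketched as follows, reusing the packet from the snippet above. Calling asXML() is only one guess at how the tags might have ended up in the output; the question does not show the exact code.

```php
<?php
$str = "<msg t='sys'><body action='rndK' r='-1'><k>qH~e9Gmt</k></body></msg>";
$xml = simplexml_load_string($str);

// Property access returns a SimpleXMLElement, not a string.
$node = $xml->body->k;

// Casting to string yields only the text content.
$value = (string) $node;

// asXML() serializes the element including its tags.
$markup = $node->asXML();

var_dump($node instanceof SimpleXMLElement);
echo $value, "\n";
echo $markup, "\n";
```

echo performs the string cast implicitly, which is why the earlier snippet prints the bare value; an explicit cast matters when the value is compared, stored or passed to other functions.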

How to detect a new line in XML parsed with simplexml_load_string? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
PHP SimpleXML doesn't preserve line breaks in XML attributes
I have the following XML:
$xmldatas = '<layer text="name
id"></layer>';
I parse this XML with
$xml = simplexml_load_string($xmldatas);
But when I check $xml, the \n has been replaced with a space. I want the new line to remain as it is after the XML is parsed.
How can I do that?
Thanks
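I don't think XML will keep a literal new line character in the text attribute: parsers normalize line breaks inside attribute values to spaces.
If you are generating the XML yourself, maybe you want to move the value into an element instead, like this?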
$xmldatas = '<layer><text>name
id</text></layer>';
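A short sketch comparing three variants may help. The first two lines of XML mirror the question and the suggestion above; the variant using the character reference &#10; is an additional option not mentioned in the original answer, and it only helps if you control how the XML is produced.

```php
<?php
// Literal line break inside an attribute: normalized to a space by the parser.
$literal = simplexml_load_string("<layer text=\"name\nid\"></layer>");

// Line break written as a character reference: preserved in the attribute value.
$escaped = simplexml_load_string('<layer text="name&#10;id"></layer>');

// Line break inside element content: preserved as well.
$element = simplexml_load_string("<layer><text>name\nid</text></layer>");

$fromLiteral = (string) $literal['text'];
$fromEscaped = (string) $escaped['text'];
$fromElement = (string) $element->text;

var_dump($fromLiteral, $fromEscaped, $fromElement);
```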
