Using xPath for sitemap.xml - php

Here is the XML file content:
<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url id="first_url">
<loc>http://example.com</loc>
<lastmod>2014-05-21</lastmod>
</url>
</urlset>
And here goes the PHP code:
<?php
$dom = new DOMDocument('1.0', 'utf-8');
$dom->Load('sitemap.xml');
$xpath = new DOMXPath($dom);
$tags = $xpath->query('//url[@id="first_url"]');
foreach($tags as $tag)
print $tag->getAttribute("id")."<br/>";
?>
This code does not work. But if I remove xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" from the file, it works. Why is that? Thanks!

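This is because of the namespace. An unprefixed name in XPath 1.0 only matches elements that are in no namespace, and the xmlns attribute puts every element of the sitemap into the sitemap namespace. Here is a way to write the query so that it ignores the namespace:
XPath 1.0:
//*[local-name()="url"][@id="first_url"]
XPath 2.0 (not available in PHP's DOMXPath, which supports XPath 1.0 only):
//*:url[@id="first_url"]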
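As a minimal sketch of the XPath 1.0 variant in PHP, the sitemap from the question is inlined below as a string so the snippet runs on its own; the variable name $ids is only illustrative.

```php
<?php
// Sample sitemap from the question, inlined so the sketch is self-contained.
$xml = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url id="first_url">
    <loc>http://example.com</loc>
    <lastmod>2014-05-21</lastmod>
  </url>
</urlset>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// local-name() compares only the local part of the element name,
// so the default namespace on <urlset> no longer prevents a match.
$tags = $xpath->query('//*[local-name()="url"][@id="first_url"]');

$ids = [];
foreach ($tags as $tag) {
    $ids[] = $tag->getAttribute('id');
}
echo implode("\n", $ids), "\n";
```

Note that this matches a url element in any namespace, which is usually fine for a sitemap but can be too broad for documents that mix vocabularies.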

Register the namespace using DOMXPath::registerNamespace:
$xpath->registerNamespace("s",
"http://www.sitemaps.org/schemas/sitemap/0.9");
Then use the prefix in your XPath:
$tags = $xpath->query('//s:url[@id="first_url"]');
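Put together, the approach might look like the sketch below. The sitemap is inlined as a string, and the prefix "s" is an arbitrary choice: it only has to be bound to the same namespace URI as in the document. The sketch also shows that every element step in the path needs the prefix, while the id attribute does not.

```php
<?php
// Sample sitemap from the question, inlined so the sketch is self-contained.
$xml = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url id="first_url">
    <loc>http://example.com</loc>
    <lastmod>2014-05-21</lastmod>
  </url>
</urlset>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

$xpath = new DOMXPath($dom);
// Bind the (arbitrary) prefix "s" to the sitemap namespace URI.
$xpath->registerNamespace('s', 'http://www.sitemaps.org/schemas/sitemap/0.9');

// Child elements are in the same namespace, so they need the prefix too.
$loc = $xpath->evaluate('string(//s:url[@id="first_url"]/s:loc)');

// For comparison: the unprefixed query from the question finds nothing.
$unprefixed = $xpath->query('//url[@id="first_url"]')->length;

echo $loc, "\n";
echo $unprefixed, "\n";
```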

Related

PHP Xpath return zero match

Considering this XML
<?xml version="1.0" encoding="UTF-8"?>
<Assertion xmlns="urn:oasis:names:tc:SAML:2.0:assertion" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:oasis:names:tc:SAML:2.0:assertion http://docs.oasis-open.org/security/saml/v2.0/saml-schema-assertion-2.0.xsd" ID="_a75adf55-01d7-40cc-929f-dbd8372ebdfc" IssueInstant="2009-09-09T00:46:02Z" Version="2.0">
<Subject>
<NameID>801234567890</NameID>
</Subject>
....
</Assertion>
PHP
$dom = new DOMDocument();
$ret = $dom->loadXML($data);
$xp = new DOMXPath($dom);
$node_list = $xp->query('/Assertion');
$node_list->length returns 0. I want to extract the DOMElement, but somehow it doesn't work.
As Sami Kuhmonen stated in the comments, you need to register the namespace. Here is an example:
<?php
$string= <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<Assertion xmlns="urn:oasis:names:tc:SAML:2.0:assertion" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:oasis:names:tc:SAML:2.0:assertion http://docs.oasis-open.org/security/saml/v2.0/saml-schema-assertion-2.0.xsd" ID="_a75adf55-01d7-40cc-929f-dbd8372ebdfc" IssueInstant="2009-09-09T00:46:02Z" Version="2.0">
<Subject>
<NameID>801234567890</NameID>
</Subject>
</Assertion>
XML;
$dom = new DOMDocument();
$dom->loadXML($string);
$xp = new DOMXPath($dom);
// registering the namespaces
$xp->registerNamespace('a','urn:oasis:names:tc:SAML:2.0:assertion');
$xp->registerNamespace('b','http://www.w3.org/2001/XMLSchema-instance');
// using the prefix of the registered namespace in the xpath expression
$node_list = $xp->query('/a:Assertion');
print $node_list->length;
?>
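The same rule applies to nested elements: each element step needs the registered prefix, whereas unprefixed attributes such as ID are in no namespace and are addressed without one. A minimal sketch, assuming a trimmed-down version of the assertion above and an arbitrary prefix "saml":

```php
<?php
// Trimmed-down copy of the assertion from the question.
$string = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<Assertion xmlns="urn:oasis:names:tc:SAML:2.0:assertion" ID="_a75adf55-01d7-40cc-929f-dbd8372ebdfc" Version="2.0">
  <Subject>
    <NameID>801234567890</NameID>
  </Subject>
</Assertion>
XML;

$dom = new DOMDocument();
$dom->loadXML($string);

$xp = new DOMXPath($dom);
// The prefix "saml" is an arbitrary choice for this sketch.
$xp->registerNamespace('saml', 'urn:oasis:names:tc:SAML:2.0:assertion');

// Every element step carries the prefix.
$nameId = $xp->evaluate('string(/saml:Assertion/saml:Subject/saml:NameID)');

// Unprefixed attributes are not in the default namespace, so no prefix here.
$assertionId = $xp->evaluate('string(/saml:Assertion/@ID)');

echo $nameId, "\n";
echo $assertionId, "\n";
```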

Parsing xml response from ebay getsellerlist with php

I am trying to parse XML with PHP. The XML is a response from ebay getsellerlist api, and is structured like so:
<!--?xml version="1.0" encoding="UTF-8"?-->
<getsellerlistresponse xmlns="urn:ebay:apis:eBLBaseComponents">
<timestamp>2016-08-11T14:17:39.869Z</timestamp>
<ack>Success</ack>
<version>967</version>
<build>E967_CORE_APISELLING_17965876_R1</build>
<itemarray>
<item>
<itemid>itemid1</itemid>
<listingdetails>
<viewitemurl>itemurl1</viewitemurl>
</listingdetails>
<primarycategory>
<categoryid>categoryid1</categoryid>
<categoryname>categoryname1</categoryname>
</primarycategory>
<title>title1</title>
<picturedetails>
<galleryurl>url1</galleryurl>
<photodisplay>thumbnail1</pictureurl>
<pictureurl>picture1</pictureurl>
</picturedetails>
</item>
</itemarray>
</getsellerlistresponse>
My php is as follows:
<?
$xml = '<!--?xml version="1.0" encoding="UTF-8"?--><getsellerlistresponse xmlns="urn:ebay:apis:eBLBaseComponents"><timestamp>2016-08-11T14:17:39.869Z</timestamp><ack>Success</ack><version>967</version><build>E967_CORE_APISELLING_17965876_R1</build><itemarray><item><itemid>itemid1</itemid><listingdetails><viewitemurl>itemurl1</viewitemurl></listingdetails><primarycategory><categoryid>categoryid1</categoryid><categoryname>categoryname1</categoryname></primarycategory><title>title1</title><picturedetails><galleryurl>url1</galleryurl><photodisplay>thumbnail1</pictureurl><pictureurl>picture1</pictureurl></picturedetails></item><item><itemid>itemid2</itemid><listingdetails><viewitemurl>itemurl2</viewitemurl></listingdetails><primarycategory><categoryid>categoryid2</categoryid><categoryname>categoryname2</categoryname></primarycategory><title>title1</title><picturedetails><galleryurl>url2</galleryurl><photodisplay>thumbnail2</pictureurl><pictureurl>picture2</pictureurl></picturedetails></item></itemarray></getsellerlistresponse>';
$dom = new DOMDocument();
$dom->loadXML($xml);
$title_nodes = $dom->getElementsByTagName('title');
$titles = array();
foreach ($title_nodes as $node) {
$titles[] = $node->nodeValue;
echo $node->nodeValue;
}
echo $titles[0];
echo count($titles);
?>
When I run it, I get a blank page, no errors, nothing.
If I check $titles length using count(), it comes back as zero.
For some reason it is not getting the title node (or any other nodes) and I can't figure out how to parse the xml string with php and get the node values.
Any help most appreciated, if the question is vague or lacking detail, please let me know and I will correct it.
The XML isn't valid:
Unable to parse any XML input. org.jdom2.input.JDOMParseException: Error on line 2: The element type "photodisplay" must be terminated by the matching end-tag "</photodisplay>".
And that's only after you remove the comments in your XML declaration:
<!--?xml version="1.0" encoding="UTF-8"?-->
should be
<?xml version="1.0" encoding="UTF-8"?>
Working demo:
<?php
$xml = '<?xml version="1.0" encoding="UTF-8"?>
<getsellerlistresponse xmlns="urn:ebay:apis:eBLBaseComponents">
<timestamp>2016-08-11T14:17:39.869Z</timestamp>
<ack>Success</ack>
<version>967</version>
<build>E967_CORE_APISELLING_17965876_R1</build>
<itemarray>
<item>
<itemid>itemid1</itemid>
<listingdetails>
<viewitemurl>itemurl1</viewitemurl>
</listingdetails>
<primarycategory>
<categoryid>categoryid1</categoryid>
<categoryname>categoryname1</categoryname>
</primarycategory>
<title>title1</title>
<picturedetails>
<galleryurl>url1</galleryurl>
<photodisplay>thumbnail1</photodisplay>
<pictureurl>picture1</pictureurl>
</picturedetails>
</item>
</itemarray>
</getsellerlistresponse>';
$dom = new DOMDocument();
$dom->loadXML($xml);
$title_nodes = $dom->getElementsByTagName('title');
$titles = array();
foreach ($title_nodes as $node) {
$titles[] = $node->nodeValue;
echo $node->nodeValue;
}
echo $titles[0];
echo count($titles);
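A blank page with no output usually means the parse failure was reported only as a hidden warning. One way to surface it, sketched below with just the malformed fragment from the question, is to let libxml collect its errors and print them when loadXML() fails. The variable names are illustrative, and the exact error text depends on the libxml version.

```php
<?php
// The malformed fragment from the question:
// <photodisplay> is closed by </pictureurl>.
$broken = '<picturedetails><photodisplay>thumbnail1</pictureurl></picturedetails>';

// Collect parser errors instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$loaded = $dom->loadXML($broken);

$messages = [];
foreach (libxml_get_errors() as $error) {
    $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
}

// Reset the error buffer and restore the previous error handling mode.
libxml_clear_errors();
libxml_use_internal_errors($previous);

if (!$loaded) {
    echo "XML could not be parsed:\n", implode("\n", $messages), "\n";
}
```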

PHP and Xpath - Get node from inner text

I have the following XML structure
<url>
<loc>some-text</loc>
</url>
<url>
<loc>some-other-text</loc>
</url>
My goal is to get the loc node by its inner text (e.g. some-text) or by a part of it (e.g. other-text). Here's my best attempt:
$doc = new DOMDocument('1.0','UTF-8');
$doc->load($filename);
$xpath = new Domxpath($doc);
$locs = $xpath->query('/url/loc');
foreach($locs as $loc) {
if(preg_match("/other-text/i", $loc->nodeValue)) return $loc->parentNode;
}
Is it possible to get a specific loc node without iterating over all nodes, simply by using an XPath query?
Yes, you can use a query like //url/loc[contains(., "other-text")]
Example:
$xml = <<<'XML'
<root>
<url>
<loc>some-text</loc>
</url>
<url>
<loc>some-other-text</loc>
</url>
</root>
XML;
$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);
foreach ($xpath->query('//url/loc[contains(., "other-text")]') as $node) {
echo $dom->saveXML($node);
}
Output:
<loc>some-other-text</loc>
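The loop in the question returned $loc->parentNode, so it may be more convenient to select the url element directly by moving the condition into a predicate on url. A small sketch under the same assumptions as the example above (a wrapping root element and no namespace):

```php
<?php
$xml = <<<'XML'
<root>
  <url>
    <loc>some-text</loc>
  </url>
  <url>
    <loc>some-other-text</loc>
  </url>
</root>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// Select each <url> that has a <loc> child whose text contains the fragment.
$urls = $xpath->query('//url[loc[contains(., "other-text")]]');

$url = $urls->item(0);
// Evaluate a relative expression with the matched <url> as context node.
$locText = $xpath->evaluate('string(loc)', $url);

echo $urls->length, "\n";
echo $locText, "\n";
```

contains() is case-sensitive, unlike the /i flag in the original preg_match() call.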

Creating attribute in domdocument

I need to generate this kind of XML:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.example.com/</loc>
<lastmod>2005-01-01</lastmod>
<changefreq>monthly</changefreq>
<priority>0.8</priority>
</url>
<url>
<loc>http://www.example.com/catalog?item=12&amp;desc=vacation_hawaii</loc>
<changefreq>weekly</changefreq>
</url>
</urlset>
For this I have written the following code:
$dom = new domDocument('1.0', 'utf-8');
$dom->formatOutput = true;
$rootElement = $dom->createElementNS('http://www.sitemaps.org/schemas/sitemap/0.9', 'urlset');
$sxe = simplexml_import_dom( $dom );
$urlMain = $sxe->addChild("url");
$loc = $urlMain->addChild("loc","http://www.example.com");
$lastmod = $urlMain->addChild("lastmod","$date");
$changefreq = $urlMain->addChild("changefreq","daily");
$priority = $urlMain->addChild("priority","1");
Everything works completely fine, but for some reason xmlns for urlset is not getting added. What might be wrong here?
Any suggestion would be helpful.
You need to append the root element to the document prior to conversion to simplexml:
$rootElement = $dom->createElementNS('http://www.sitemaps.org/schemas/sitemap/0.9', 'urlset');
$dom->appendChild($rootElement);
$sxe = simplexml_import_dom( $dom );
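For comparison, the whole sitemap can also be built with DOM alone. The sketch below is one possible variant, not the asker's code: the $entries array is hypothetical sample data, and every element is created with createElementNS() so that it lands in the sitemap namespace.

```php
<?php
$ns = 'http://www.sitemaps.org/schemas/sitemap/0.9';

$dom = new DOMDocument('1.0', 'UTF-8');
$dom->formatOutput = true;

// Create the root in the sitemap namespace and attach it to the document.
$urlset = $dom->createElementNS($ns, 'urlset');
$dom->appendChild($urlset);

// Hypothetical sample data for this sketch.
$entries = [
    ['loc' => 'http://www.example.com/', 'changefreq' => 'monthly'],
    ['loc' => 'http://www.example.com/catalog?item=12&desc=vacation_hawaii', 'changefreq' => 'weekly'],
];

foreach ($entries as $entry) {
    $url = $urlset->appendChild($dom->createElementNS($ns, 'url'));
    foreach ($entry as $name => $value) {
        $child = $url->appendChild($dom->createElementNS($ns, $name));
        // Text nodes are escaped on output, so "&" is written as "&amp;".
        $child->appendChild($dom->createTextNode($value));
    }
}

$output = $dom->saveXML();
echo $output;
```

Adding values through createTextNode() avoids having to escape the ampersand in the second URL by hand.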

How to XML to Php

How can I write PHP that extracts the link from the XML below?
The link will be different each time.
<response>
<redirect>
http://www.example.com/
</redirect>
<code>0</code>
<description>OK</description>
</response>
Try SimpleXML:
$xml ='<response>
<redirect>
http://www.example.com/
</redirect>
<code>0</code>
<description>OK</description>
</response>';
$xml = simplexml_load_string($xml);
echo trim((string) $xml->redirect); // http://www.example.com/
simplexml_load_string
This should answer all your questions...
http://www.php.net/manual/en/function.simplexml-load-string.php
Use DOM+Xpath:
$xml = <<<'XML'
<response>
<redirect>http://www.example.com/</redirect>
<code>0</code>
<description>OK</description>
</response>
XML;
$dom = new DOMDocument();
$dom->loadXml($xml);
$xpath = new DOMXpath($dom);
var_dump($xpath->evaluate('string(/response/redirect)'));
Output:
string(23) "http://www.example.com/"
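If the response keeps the line breaks around the link, as in the question, string() would return them as part of the value. A small sketch using the XPath 1.0 function normalize-space(), which strips leading and trailing whitespace, on the original layout:

```php
<?php
// The response as posted in the question, with line breaks around the link.
$xml = <<<'XML'
<response>
<redirect>
http://www.example.com/
</redirect>
<code>0</code>
<description>OK</description>
</response>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// normalize-space() trims the surrounding whitespace of the text content.
$link = $xpath->evaluate('normalize-space(/response/redirect)');
$code = $xpath->evaluate('string(/response/code)');

var_dump($link);
var_dump($code);
```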
