<html>
<body>
<channel>
<item>
<link>"http://www.example.com/"
</link>
<title>This is a title
</title>
</item>
<item>
<link>"http://www.example2.com/"
</link>
<title>This a 2nd title
</title>
</item>
</channel>
</body>
</html>
$query = '/html/body/channel/item/title';
$xpath->query($query);
$i = 0;
foreach ( $xpath->query($query) as $key )
{
echo '<p>'.$xpath->query($query) -> item($i) -> nodeValue . '</p><br />';
$i++;
}
I tried the following queries:
$query = '/html/body/channel/item/link';
and
$query = '/html/body/channel/item/link/text()';
I can return <item> and <title> just fine. Just not <link>. Is there something I'm missing?
Your code is broken and does not make sense.
1 $query = '/html/body/channel/item/title';
2 $xpath->query($query);
3 $i = 0;
4 foreach ($xpath->query($query) as $key)
5 {
6 echo '<p>'.$xpath->query($query) -> item($i) -> nodeValue . '</p><br />';
7 $i++;
8 }
This code queries for title elements (2), but since the result isn't assigned, that call is superfluous. Then you do a foreach and query again (4). This time you assign each title DOMElement to $key (which is a poor name, imo). Inside the foreach, you run yet another query for title elements (6) and fetch the title element from it using your counter variable (3/6). That is superfluous as well, because you already have that element in $key (4). So you are writing three identical queries where you need just one, and you do a foreach without using its loop variable.
It should be
foreach ($xpath->query('/html/body/channel/item/title') as $titleElement) {
printf('<p>%s</p>', $titleElement->nodeValue);
}
Since you are already using DOM to work with the markup, you could also create the p element with it instead of using string concatenation, e.g.
foreach ($xpath->query('/html/body/channel/item/title') as $titleElement) {
echo $domDocument->saveXml(
$domDocument->createElement('p', $titleElement->nodeValue)
);
}
If you want the link elements, change the XPath accordingly to query for that instead of title. The quotes in the node value have nothing to do with it at all. They will show just fine.
Full working example showing how to combine <title> and <link> elements into <a> elements
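A minimal sketch of such an example, assuming the markup from the question is loaded with loadXML() (an HTML parser may treat <link> as an empty element and drop its text). The variable names and the decision to trim the literal quotes from the URL are illustrative assumptions, not part of the original answer.

```php
<?php
// Sample markup from the question, loaded as XML so the custom tags survive.
$markup = <<<'XML'
<html>
<body>
<channel>
<item>
<link>"http://www.example.com/"
</link>
<title>This is a title
</title>
</item>
<item>
<link>"http://www.example2.com/"
</link>
<title>This a 2nd title
</title>
</item>
</channel>
</body>
</html>
XML;

$domDocument = new DOMDocument();
$domDocument->loadXML($markup);
$xpath = new DOMXPath($domDocument);

$anchors = [];
foreach ($xpath->query('/html/body/channel/item') as $itemElement) {
    // Evaluate relative to the current item; string() returns its text content.
    $title = trim($xpath->evaluate('string(title)', $itemElement));
    // Assumption: strip surrounding whitespace and the literal quotes around the URL.
    $href = trim($xpath->evaluate('string(link)', $itemElement), " \t\r\n\"");

    // Build the <a> element with DOM instead of string concatenation.
    $anchor = $domDocument->createElement('a');
    $anchor->setAttribute('href', $href);
    $anchor->appendChild($domDocument->createTextNode($title));
    $anchors[] = $domDocument->saveXML($anchor);
}

echo implode("\n", $anchors), "\n";
```

Querying the item elements once and evaluating title and link relative to each item keeps the two values paired, which two separate absolute queries would not guarantee.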
I have a bookCatalog.xml file as below
<bookCatalog>
<book id='1'>
<title>html</title>
</book>
<book id='2'>
<title>java</title>
</book>
<book id='3'>
<title>php</title>
</book>
</bookCatalog>
I want to programmatically get the title of a book node by using the variable $id, which holds the id of the book node, and I used the following code:
$doc=new DOMDocument();
$doc->load('bookCatalog.xml');
$xpath= new DOMXPath($doc);
$findBookNode=$xpath->query("//book[@id='$id']")->item(0);
foreach ($findBookNode as $child) {
if ($child->nodeName === 'title') {
$bookTitle = $child->nodeValue;
}
}
But it turned out that the result is not what I want.
If I replace the variable $id with '1', I can get the title of the book node whose id is 1:
$findBookNode=$xpath->query("//book[@id='1']")->item(0);
I just found the problem in my code:
The problem is that the variable $id is assigned from $_POST['id'], which comes from a form in another code segment.
$id=$_POST['id'];
As a result, the value of $id has a trailing space. For example,
$id='1'
becomes
$id='1 ' // one space after number 1
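A minimal sketch of one way to guard against this, assuming ids are purely numeric. The helper name findBookTitle() and the hard-coded '1 ' standing in for $_POST['id'] are illustrative assumptions.

```php
<?php
$catalogXml = <<<'XML'
<bookCatalog>
<book id='1'><title>html</title></book>
<book id='2'><title>java</title></book>
<book id='3'><title>php</title></book>
</bookCatalog>
XML;

// Hypothetical helper: normalizes the raw id, then looks up the title.
function findBookTitle(DOMXPath $xpath, string $rawId): ?string
{
    // Remove leading/trailing whitespace picked up from the form field.
    $id = trim($rawId);

    // Assumption: ids are digits only, so anything else is rejected
    // before it can reach the XPath expression.
    if (!ctype_digit($id)) {
        return null;
    }

    $titleNode = $xpath->query("//book[@id='$id']/title")->item(0);

    return $titleNode ? $titleNode->nodeValue : null;
}

$doc = new DOMDocument();
$doc->loadXML($catalogXml);
$xpath = new DOMXPath($doc);

// '1 ' stands in for a $_POST['id'] value with a trailing space.
var_dump(findBookTitle($xpath, '1 '));
```

Selecting the title directly in the XPath expression also removes the need to loop over the book's children.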
I have an XML file structured as follows:
<pictures>
<picture>
<title></title>
<description></description>
<facts>
<date></date>
<place>United States</place>
</facts>
<people>
<person>John</person>
<person>Sue</person>
</people>
</picture>
<picture>
<title></title>
<description></description>
<facts>
<date></date>
<place>Canada</place>
</facts>
<people>
<person>Sue</person>
<person>Jane</person>
</people>
</picture>
<picture>
<title></title>
<description></description>
<facts>
<date></date>
<place>Canada</place>
</facts>
<people>
<person>John</person>
<person>Joe</person>
<person>Harry</person>
</people>
</picture>
</pictures>
In one case, I need to search for pictures where place="Canada". I have an XPath that does this fine, as such:
$place = "Canada";
$pics = ($pictures->xpath("//*[place='$place']"));
This pulls the entire "picture" node, so I am able to display title, description, etc.
I have another need to find all pictures where person = $person. I use the same type query as above:
$person = "John";
$pics = ($pictures->xpath("//*[person='$person']"));
In this case, the query apparently knows there are 2 pictures with John, but I don't get any of the values for the other nodes. I'm guessing it has something to do with the repeating child node, but can't figure out how to restructure the XPath to pull all of the picture node for each where I have a match on person. I tried using attributes instead of values (and modified the query accordingly), but got the same result.
Can anyone advise what I'm missing here?
Let's replace the variables first. That takes PHP out of the picture. The problem is just the proper XPath expression.
//*[place='Canada']
matches any element node that has a child element node place with the text content Canada.
This is the facts element node - not the picture.
Getting the picture node is slightly different:
//picture[facts/place='Canada']
This would select ANY picture node at ANY DEPTH that matches the condition.
picture[facts/place='Canada']
Would return the same result with the provided XML, but is more specific and matches only picture element nodes that are children of the document element.
Now validating the people node is about the same:
picture[people/person="John"]
You can even combine the two conditions:
picture[facts/place="Canada" and people/person="John"]
Here is a small demo:
$element = new SimpleXMLElement($xml);
$expressions = [
'//*[place="Canada"]',
'//picture[facts/place="Canada"]',
'picture[facts/place="Canada"]',
'picture[people/person="John"]',
'picture[facts/place="Canada" and people/person="John"]',
];
foreach ($expressions as $expression) {
echo $expression, "\n", str_repeat('-', 60), "\n";
foreach ($element->xpath($expression) as $index => $found) {
echo '#', $index, "\n", $found->asXml(), "\n";
}
echo "\n";
}
HINT: You're using dynamic values in your XPath expressions. String literals in XPath 1.0 do not support any kind of escaping, so a quote in the variable can break your expression. See this answer.
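One common workaround for the missing escaping is to build the literal with the XPath concat() function when the value contains both quote types. The following is a minimal sketch; the helper name xpathLiteral() and the sample person name are assumptions made for illustration.

```php
<?php
// Hypothetical helper: turns a PHP string into a safe XPath 1.0 string literal.
function xpathLiteral(string $value): string
{
    // No single quote inside: wrap in single quotes.
    if (strpos($value, "'") === false) {
        return "'" . $value . "'";
    }
    // No double quote inside: wrap in double quotes.
    if (strpos($value, '"') === false) {
        return '"' . $value . '"';
    }
    // Both quote types present: split on single quotes and rejoin the
    // pieces with concat(), inserting each single quote as "'".
    $parts = explode("'", $value);
    $quoted = array_map(fn ($part) => "'" . $part . "'", $parts);

    return 'concat(' . implode(', "\'", ', $quoted) . ')';
}

$element = new SimpleXMLElement(
    '<pictures><picture><people><person>Sue "Q" O\'Neil</person></people></picture></pictures>'
);

$person = 'Sue "Q" O\'Neil';
$matches = $element->xpath('picture[people/person=' . xpathLiteral($person) . ']');

echo xpathLiteral($person), "\n";
echo count($matches), "\n";
```

Because the helper emits the surrounding quotes itself, the expression must not add its own quotes around the placeholder.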
I have an XML file that I'm parsing with PHP's Simplexml, but I'm having an issue with an iteration through nodes.
The XML:
<channel>
<item>
<title>Title1</title>
<category>Cat1</category>
</item>
<item>
<title>Title2</title>
<category>Cat1</category>
</item>
<item>
<title>Title3</title>
<category>Cat2</category>
</item>
</channel>
My counting function:
public function cat_count($cat) {
$count = 0;
$items = $this->xml->channel->item;
$size = count($items);
for ($i=0; $i<$size; $i++) {
if ($items[$i]->category == $cat) {
$count++;
}
}
return $count;
}
Am I overlooking an error in my code, or is there another preferred method for iterating through the nodes? I've also used a foreach and while statement with no luck, so I'm at a loss. Any suggestions?
EDIT: while using the xpath method below, I noticed that using
foreach ($this->xml->channel->item as $item) {
echo $item->category;
}
will print all the category names, but using
foreach ($this->xml->channel->item as $item) {
if ($item->category == $cat) {
echo $item->category;
}
}
will only print one instance of the doubled categories. Even when I copy and paste the lines, only one shows. Does this mean the XML structure could be invalid somehow?
An easy way to count elements with a given name in an XML file is to use xpath. Try this:
private function categoryCount($categoryName) {
$categoryName = $this->sanitize($categoryName); // easy xpath injection protection
return count($this->xml->xpath("//item[category='$categoryName']"));
}
The sanitize() function should remove single and double quotes from $categoryName to prevent XPath injection. To make queries work for a category name that itself contains quotes, you need to build the XPath query string depending on whether the name contains single or double quotes:
// xpath in case of single quotes in category name
$xpath = '//item[category="' . $categoryName . '"]';
// xpath in case of double quotes in category name
$xpath = "//item[category='" . $categoryName . "']";
If you don't have full control over the XML data (for example, if it is created from user-generated content), you should take this into account. Unfortunately, PHP offers no simple way to do this, such as parameterized queries.
See here for the PHP xpath function docs: http://php.net/manual/en/simplexmlelement.xpath.php
See here for an XPath reference: http://www.w3schools.com/xpath/xpath_syntax.asp
I've basically got an XML file full of product information for use in an ecommerce system. I've been creating a script that converts these XML files into a .CSV with the data structured in a format the ecommerce system can handle (So I don't need to copy/paste columns over every time the vendor provides new XML files). The category of each product is defined like this:
<web_category1>3</web_category1>
<web_category2>1</web_category2>
<web_category3>6</web_category3>
web_category3 being the category of the item and 1 and 2 being the parent categories of the product's category. The thing is that some items are nested under 2 categories..or sometimes 5. So I need to figure out a way for PHP to grab the web_category with the highest number after it since that's always going to be the product's category.
Thanks!
@ben's answer is correct, but is a little intense for me. SimpleXMLElement objects are nice because you can easily cast them to an array. So, a simpler solution would be to cast it to an array and use max to determine the highest value in the resulting array:
$str = '
<item>
<web_category1>3</web_category1>
<web_category2>1</web_category2>
<web_category3>6</web_category3>
</item>
';
$xml = new SimpleXMLElement($str);
echo max((array)$xml); // outputs: 6
UPDATE
Based on your comment below, let's assume you need to get the max of all the <item> elements that occur in an XML file and not just one (like above). To handle this you could use SimpleXMLElement::xpath to get an array of all the occurrences of <item>, then apply the same casting trick inside a loop over the xpath result:
$str = '
<xml>
<product1>
<item>
<web_category1>3</web_category1>
<web_category2>1</web_category2>
<web_category3>6</web_category3>
</item>
</product1>
<product2>
<item>
<web_category4>17</web_category4>
<web_category5>0</web_category5>
</item>
</product2>
</xml>
';
$xml = new SimpleXMLElement($str);
$allItems = array();
$items = $xml->xpath('//item');
foreach($items as $item) {
$allItems = array_merge($allItems, (array)$item);
}
echo max($allItems); // outputs: 17
UPDATE 2
Okay, last time. If this isn't exactly what you're trying to do, you should at least have enough examples to figure it out from here. Consider:
$str = '
<xml>
<product1>
<web_category1>3</web_category1>
<web_category2>1</web_category2>
<web_category3>6</web_category3>
</product1>
<product2>
<web_category4>17</web_category4>
<web_category5>0</web_category5>
</product2>
<product3>
<web_category6>17</web_category6>
<web_category7>21</web_category7>
</product3>
</xml>
';
$xml = new SimpleXMLElement($str);
// assumes that product node names start with "product"
$products = $xml->xpath("//*[starts-with(name(),'product')]");
foreach ($products as $p) {
$catNames = array_keys((array)$p);
$catNums = preg_replace("/[^\d]/", "", $catNames);
echo $p->getName() . ' - highest category: ' . max($catNums) . "\n";
}
The above code outputs the following:
product1 - highest category: 3
product2 - highest category: 5
product3 - highest category: 7
Assuming your XML is something like this:
<item>
<web_category1>3</web_category1>
<web_category2>1</web_category2>
<web_category3>6</web_category3>
</item>
And you have a SimpleXMLElement object for <item>, this should do it:
$highest_web_category_number = -1;
$value_of_highest_web_category_number = -1;
// children() yields each child element keyed by its tag name
foreach($item->children() as $name => $data) {
if(strpos($name, 'web_category') === 0) {
// compare the numeric suffix as an integer, not as a string
$web_category_number = (int) substr($name, strlen('web_category'));
if($web_category_number > $highest_web_category_number) {
$highest_web_category_number = $web_category_number;
$value_of_highest_web_category_number = (string) $data;
}
}
}
I've never asked a question here before, so please forgive my question if it's formatted badly or not specific enough. I am just a dabbler and know very little about PHP and XPath.
I have an XML file like this:
<catalogue>
<item>
<reference>A1</reference>
<title>My title1</title>
</item>
<item>
<reference>A2</reference>
<title>My title2</title>
</item>
</catalogue>
I am pulling this file using SimpleXML:
$file = "products.xml";
$xml = simplexml_load_file($file) or die ("Unable to load XML file!");
Then I am using the reference from a URL parameter to get extra details about the 'item' using PHP:
foreach ($xml->item as $item) {
if ($item->reference == $_GET['reference']) {
echo '<p>' . $item->title . '</p>';
}
}
So from a URL like www.mysite.com/file.php?reference=A1
I would get this HTML:
<p>My title1</p>
I realise I might not be doing this right and any pointers to improving this are welcome.
My question is, I want to find the next and previous 'item' details. If I know from the URL that reference=A1, how do I find the reference, title etc of the next 'item'? If I only have 'A1' and I know that's a reference node, how do I get HTML like this:
<p>Next item is My title2</p>
I have read about following-sibling but I don't know how to use it. I can only find the following-sibling of the reference node, which isn't what I need.
Any help appreciated.
You could use:
/catalogue/item[reference='A1']/following-sibling::item[1]/title
Meaning: from an item element that is a child of the catalogue root element and has a reference element with the string value 'A1', navigate to the title child of the first following sibling item element.
I'd probably use xpath to fetch the next/previous (and current) node.
<?php
error_reporting(E_ALL ^ E_NOTICE);
$s = '
<catalogue>
<item>
<reference>A1</reference>
<title>My title1</title>
</item>
<item>
<reference>A2</reference>
<title>My title2</title>
</item>
<item>
<reference>A3</reference>
<title>My title3</title>
</item>
</catalogue>
';
$xml = simplexml_load_string($s);
$reference = 'A3';
// xpath() returns an array; take the first match, or null if there is none
$current = $xml->xpath('/catalogue/item[reference="' . $reference . '"]')[0] ?? null;
if($current !== null) {
print 'current: ' . $current->title . '<br />';
$prev = $current->xpath('preceding-sibling::*[1]')[0] ?? null;
if($prev !== null) {
print 'prev: ' . $prev->title . '<br />';
}
$next = $current->xpath('following-sibling::*[1]')[0] ?? null;
if($next !== null) {
print 'next: ' . $next->title . '<br />';
}
}
See the documentation of SimpleXMLElement::xpath and XPath syntax documentation.