URL decode all values in xml document in php - php

I'm after a way of making simplexml_load_string return a document where all the text values are urldecoded. For example:
$xmlstring = "<my_element>2013-06-19+07%3A20%3A51</my_element>";
$xml = simplexml_load_string($xmlstring);
$value = $xml->my_element;
//and value would contain: "2013-06-19 07:20:51"
Is it possible to do this? I'm not concerned about attribute values, although that would be fine if they were also decoded.
Thanks!

you can run
$value = urldecode( $value )
which will decode your string.
See: http://www.php.net/manual/en/function.urldecode.php

As long as each value is inside an element of its own (in SimpleXML you can not process text-nodes on its own, compare with the table in Which DOMNodes can be represented by SimpleXMLElement?) this is possible.
As others have outlined, this works by applying the urldecode function on each of these elements.
To do that, you need to change and add some lines of code:
$xml = simplexml_load_string($xmlstring, 'SimpleXMLIterator');
if (!$xml->children()->count()) {
$nodes = [$xml];
} else {
$nodes = new RecursiveIteratorIterator($xml, RecursiveIteratorIterator::LEAVES_ONLY);
}
foreach($nodes as $node) {
$node[0] = urldecode($node);
}
This code-example takes care that each leave is processed and in case, it's only the root element, that that one is processed. Afterwards, the whole document is changed so that you can access it as known. Demo:
<?php
/**
* URL decode all values in XML document in PHP
* #link https://stackoverflow.com/q/17805643/367456
*/
$xmlstring = "<root><my_element>2013-06-19+07%3A20%3A51</my_element></root>";
$xml = simplexml_load_string($xmlstring, 'SimpleXMLIterator');
$nodes = $xml->children()->count()
? new RecursiveIteratorIterator(
$xml, RecursiveIteratorIterator::LEAVES_ONLY
)
: [$xml];
foreach ($nodes as $node) {
$node[0] = urldecode($node);
}
echo $value = $xml->my_element; # prints "2013-06-19 07:20:51"

Related

How to get element xml using xpath php

I have issue with how to get element xml using xpath php, i already create a php file to extract the "attributes" xml by using xpath php.
What i want is how to extract every element in xml by using xpath.
test.xml
<?xml version="1.0" encoding="UTF-8"?>
<InvoicingData>
<CreationDate> 2014-02-02 </CreationDate>
<OrderNumber> XXXX123 </OrderNumber>
<InvoiceDetails>
<InvoiceDetail>
<SalesCode> XX1A </SalesCode>
<SalesName> JohnDoe </SalesName>
</InvoiceDetail>
</InvoiceDetails>
</InvoicingData>
read.php
<?php
$doc = new DOMDocument();
$doc->loadXML(file_get_contents("test.xml"));
$xpath = new DOMXpath($doc);
$nodes = $xpath->query('//*');
$names = array();
foreach ($nodes as $node)
{
$names[] = $node->nodeName;
}
echo join(PHP_EOL, ($names));
?>
From the code above it will print like this :
CreationDate OrderNumber InvoiceDetails InvoiceDetail SalesCode
SalesName
So, the problem is, how to get the element inside the attribute, basically this is what i want to print :
2014-02-02 XXXX123 XX1A JohnDoe
You use $node->textContent to get the textual value of the node (and its descendants, if any).
In response to your first comment:
You didn't use $node->textContent. Try this:
$doc = new DOMDocument();
$doc->loadXML(file_get_contents("test.xml"));
$xpath = new DOMXpath($doc);
$nodes = $xpath->query('//*');
$names = array();
$values = array(); // created a separate array for the values
foreach ($nodes as $node)
{
$names[] = $node->nodeName;
$values[] = $node->textContent; // push to $values array
}
echo join(PHP_EOL, ($values));
However, if you only want to push the textual values when they're a direct child of an element and still want to collect all node names as well, you could do something like:
foreach ($nodes as $node)
{
$names[] = $node->nodeName;
// check that this node only contains one text node
if( $node->childNodes->length == 1 && $node->firstChild instanceof DOMText ) {
$values[] = $node->textContent;
}
}
echo join(PHP_EOL, ($values));
And if you only care about the nodes that directly contain textual values, you could do something like this:
// this XPath query only selects those nodes that directly contain non-whitespace text
$nodes = $xpath->query('//*[./text()[normalize-space()]]');
$values = array();
foreach ($nodes as $node)
{
// add nodeName as key
// (only works reliable of there's never a duplicate nodeName in your XML)
// and add textContent as value
$values[ $node->nodeName ] = trim( $node->textContent );
}
var_dump( $values );

can I use str_replace in all xml nodes with php?

I think the problem is with my logic and I am probably going about this the wrong way. What I want is to
open an xml document with php
get elements by Tag name
then for each node that has child nodes
replace every letter a with ა every b with ბ and so on.
here is the code I have so for but it doesn't work.
xmlDoc=loadXMLDoc("temp/word/document.xml");
$nodes = xmlDoc.getElementsByTagName("w:t");
foreach ($nodes as $node) {
while( $node->hasChildNodes() ) {
$node = $node->childNodes->item(0);
}
$node->nodeValue = str_replace("a","ა",$node->nodeValue);
$node->nodeValue = str_replace("b","ბ",$node->nodeValue);
$node->nodeValue = str_replace("g","გ",$node->nodeValue);
$node->nodeValue = str_replace("d","დ",$node->nodeValue);
// More replacements for each letter in the alphabet.
}
I thought it might be because of the multiple str_replace() calls but it doesn't work with even just one. Am I going about this the wrong way or have I missed something?
The $node variable gets overwritten on each iteration, so only the last $node will get modified (if ever). You need to do the replacement inside the loop and then use saveXML() method to return the modified XML markup.
Your code (with some improvements):
$xmlDoc = new DOMDocument();
$xmlDoc->load('temp/word/document.xml');
foreach ($xmlDoc->getElementsByTagName("w:t") as $node) {
while($node->hasChildNodes()) {
$node = $node->childNodes->item(0);
$search = array('a', 'b', 'g', 'd');
$replace = array('ა', 'ბ', 'გ', 'დ');
$node->nodeValue = str_replace($search, $replace, $node->nodeValue);
}
}
echo $xmlDoc->saveXML();

PHP XML file filter on match

I am having a heck of a time getting this working...
What I want to do is filter a xml file by a city (or market in this case).
This is the xml data.
<itemset>
<item>
<id>2171</id>
<market>Vancouver</market>
<url>http://</url></item>
<item>
<id>2172</id>
<market>Toronto</market>
<url>http://</url></item>
<item>
<id>2171</id>
<market>Vancouver</market>
<url>http://</url></item>
This is my code...
<?php
$source = 'get-xml-feed.php.xml';
$xml = new SimpleXMLElement($source);
$result = $xml->xpath('//item/[contains(market, \'Toronto\')]');
while(list( , $node) = each($result)) {
echo '//Item/[contains(Market, \'Toronto\')]',$node,"\n";
}
?>
If I can get this working I would like to access each element, item[0], item[1] base on filtered results.
Thanks
I think this implements what you are looking for using XPath:
<?php
$source = file_get_contents('get-xml-feed.php.xml');
$xml = new SimpleXMLElement($source);
foreach ($xml as $node)
{
$row = simplexml_load_string($node->asXML());
$result = $row->xpath("//item/market[.='Toronto']");
if ($result[0])
{
var_dump($row);
}
}
?>
As another answer mentioned, unless you are wed to the use of XPath it's probably more trouble than it's worth for this application: just load the XML and treat the result as an array.
I propose using simplexml_load_file. The learning curve is less step than using the specific XML objects + XPath. It returns an object in the format you descibe.
Try this and you'll see what I mean:
<?php
$source = 'get-xml-feed.php.xml';
$xml = simplexml_load_file($source);
var_dump($xml);
?>
There is also simplexml_load_string if you just have an XML snippet.
<?php
$source = 'get-xml-feed.php.xml';
//$xml = new SimpleXMLElement($source);
$dom = new DOMDocument();
#$dom->loadHTMLFile($source);
$xml = simplexml_import_dom($dom);
$result = $xml->xpath("//item/market[.='Toronto']/..");
while(list( , $node) = each($result)) {
print_r($node);
}
?>
This will get you the parent nodeset when it contains a node with "Toronto" in it. It returns $node as a simplexml element so you will have to deal with it accordingly (I just printed it as an array).

ignoring nested elements when parsing xml with php

probably a simple question to answer for someone:::
xml:
<foobar>
<foo>i am a foo</foo>
<bar>i am a bar</bar>
<foo>i am a <bar>bar</bar></foo>
</foobar>
In the above, I want to display all elements that are <foo>. When the script gets to the line with the nested < bar > the result is "i am a bar" .. which isn't the result I had hoped for.
Is it not possible to print out the entire contents of that element as it is, so that i see: "i am a <bar>bar</bar>"
php:
$xml = file_get_contents('sample');
$dom = new DOMDocument;
#$dom->loadHTML($xml);
$resources= $dom->getElementsByTagName('foo');
foreach ($resources as $resource){
echo $resource->nodeValue . "\n";
}
After some trolling and trying to do what I needed with SimpleXML, I arrived at the following conclusion. My issue with SimpleXML was where the elements are. If the xml is structured, and the hierarchy is standard ... I have no problem.
If the XML is a web page for example, and the <foo> element is anywhere, SimpleXML doesn't have a good facility like getElementsByTagName to pull out the element wherever it may be....
<?php
$doc = new DOMDocument();
$doc->load('sample');
$element_name = 'foo';
if ($doc->getElementsByTagName($element_name)->length > 0) {
$resources = $doc->getElementsByTagName($element_name);
foreach ($resources as $resource) {
$id = null;
if (!$resource->hasAttribute('id')) {
$resource->setAttribute('id', gen_uuid());
}
$innerHTML = null;
$children = $resource->childNodes;
foreach ($children as $child) {
$tmp_doc = new DOMDocument();
$tmp_doc->appendChild($tmp_doc->importNode($child,true));
$innerHTML .= rtrim($tmp_doc->saveHTML());
}
$resource->nodevalue = $innerHTML;
}
}
echo $doc->saveHTML();
?>
Rather than writing all that code, you might try XPath. That expression would be "//foo", which would get a list of all the elements in the document named "foo".
http://php.net/manual/en/simplexmlelement.xpath.php

Retrieving single node value from a nodelist

I'm having difficulty extracting a single node value from a nodelist.
My code takes an xml file which holds several fields, some containing text, file paths and full image names with extensions.
I run an expath query over it, looking for the node item with a certain id. It then stores the matched node item and saves it as $oldnode
Now my problem is trying to extract a value from that $oldnode. I have tried to var_dump($oldnode) and print_r($oldnode) but it returns the following: "object(DOMElement)#8 (0) { } "
Im guessing the $oldnode variable is an object, but how do I access it?
I am able to echo out the whole node list by using: echo $oldnode->nodeValue;
This displays all the nodes in the list.
Here is the code which handles the xml file. line 6 is the line in question...
$xpathexp = "//item[#id=". $updateID ."]";
$xpath = new DOMXpath($xml);
$nodelist = $xpath->query($xpathexp);
if((is_null($nodelist)) || (! is_numeric($nodelist))) {
$oldnode = $nodelist->item(0);
echo $oldnode->nodeValue;
//$imgUpload = strchr($oldnode->nodeValue, ' ');
//$imgUpload = strrchr($imgUpload, '/');
//explode('/',$imgUpload);
//$imgUpload = trim($imgUpload);
$newItem = new DomDocument;
$item_node = $newItem ->createElement('item');
//Create attribute on the node as well
$item_node ->setAttribute("id", $updateID);
$largeImageText = $newItem->createElement('largeImgText');
$largeImageText->appendChild( $newItem->createCDATASection($largeImgText));
$item_node->appendChild($largeImageText);
$urlANode = $newItem->createElement('urlA');
$urlANode->appendChild( $newItem->createCDATASection($urlA));
$item_node->appendChild($urlANode);
$largeImg = $newItem->createElement('largeImg');
$largeImg->appendChild( $newItem->createCDATASection($imgUpload));
$item_node->appendChild($largeImg);
$thumbnailTextNode = $newItem->createElement('thumbnailText');
$thumbnailTextNode->appendChild( $newItem->createCDATASection($thumbnailText));
$item_node->appendChild($thumbnailTextNode);
$urlB = $newItem->createElement('urlB');
$urlB->appendChild( $newItem->createCDATASection($urlA));
$item_node->appendChild($urlB);
$thumbnailImg = $newItem->createElement('thumbnailImg');
$thumbnailImg->appendChild( $newItem->createCDATASection(basename($_FILES['thumbnailImg']['name'])));
$item_node->appendChild($thumbnailImg);
$newItem->appendChild($item_node);
$newnode = $xml->importNode($newItem->documentElement, true);
// Replace
$oldnode->parentNode->replaceChild($newnode, $oldnode);
// Display
$xml->save($xmlFileData);
//header('Location: index.php?a=112&id=5');
Any help would be great.
Thanks
Wasn't it supposed to be echo $oldnode->firstChild->nodeValue;? I remember this because technically you need the value from the text node.. but I might be mistaken, it's been a while. You could give it a try?
After our discussion in the comments on this answer, I came up with this solution. I'm not sure if it can be done cleaner, perhaps. But it should work.
$nodelist = $xpath->query($xpathexp);
if((is_null($nodelist)) || (! is_numeric($nodelist))) {
$oldnode = $nodelist->item(0);
$largeImg = null;
$thumbnailImg = null;
foreach( $oldnode->childNodes as $node ) {
if( $node->nodeName == "largeImg" ) {
$largeImg = $node->nodeValue;
} else if( $node->nodeName == "thumbnailImg" ) {
$thumbnailImg = $node->nodeValue;
}
}
var_dump($largeImg);
var_dump($thumbnailImg);
}
You could also use getElementsByTagName on the $oldnode, then see if it found anything (and if a node was found, $oldnode->getElementsByTagName("thumbnailImg")->item(0)->nodeValue). Which might be cleaner then looping through them.

Categories