XML parsing with PHP

I would like to create a new, simplified XML document from an existing one (using SimpleXML):
<?xml version="1.0" encoding="UTF-8"?>
<xls:XLS>
<xls:RouteInstructionsList>
<xls:RouteInstruction>
<xls:Instruction>Start</xls:Instruction>
</xls:RouteInstruction>
</xls:RouteInstructionsList>
<xls:RouteInstructionsList>
<xls:RouteInstruction>
<xls:Instruction>End</xls:Instruction>
</xls:RouteInstruction>
</xls:RouteInstructionsList>
</xls:XLS>
The element tags always contain a colon (a namespace prefix), which trips up SimpleXML, so I tried a solution from another answer.
How can I create a new XML document with this structure:
<main>
<instruction>Start</instruction>
<instruction>End</instruction>
</main>
The instruction element gets its content from the former xls:Instruction element.
Here is the updated code; unfortunately the loop body never runs:
$source = "route.xml";
$xmlstr = file_get_contents($source);
$xml = @simplexml_load_string($xmlstr);
$new_xml = simplexml_load_string('<main/>');
foreach($xml->children() as $child){
print_r("xml_has_childs");
$new_xml->addChild('instruction', $child->RouteInstruction->Instruction);
}
echo $new_xml->asXML();
There is no error message if I leave out the "@"…

/* the @ is used to suppress warnings */
$xml = @simplexml_load_string($YOUR_RSS_XML);
$new_xml = simplexml_load_string('<main/>');
foreach ($xml->children() as $child)
{
$new_xml->addChild('instruction', $child->RouteInstruction->Instruction);
}
/* to print */
echo $new_xml->asXML();

You could use xpath to simplify things. Without knowing the full details, I don't know if it will work in all cases:
$source = "route.xml";
$xmlstr = file_get_contents($source);
$xml = @simplexml_load_string($xmlstr);
$new_xml = simplexml_load_string('<main/>');
foreach ($xml->xpath('//Instruction') as $instr) {
$new_xml->addChild('instruction', (string) $instr);
}
echo $new_xml->asXML();
Output:
<?xml version="1.0"?>
<main><instruction>Start</instruction><instruction>End</instruction></main>
Edit: The file at http://www.gps.alaingroeneweg.com/route.xml is not the same as the XML you have in your question. You need to use a namespace like:
$xml = @simplexml_load_string(file_get_contents('http://www.gps.alaingroeneweg.com/route.xml'));
$xml->registerXPathNamespace('xls', 'http://www.opengis.net/xls'); // probably not needed
$new_xml = simplexml_load_string('<main/>');
foreach ($xml->xpath('//xls:Instruction') as $instr) {
$new_xml->addChild('instruction', (string) $instr);
}
echo $new_xml->asXML();
Output:
<?xml version="1.0"?>
<main><instruction>Start (Southeast) auf Sihlquai</instruction><instruction>Fahre rechts</instruction><instruction>Fahre halb links - Ziel erreicht!</instruction></main>
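For what it's worth, the loop in the question most likely never runs because children() without arguments only returns elements that are in no namespace, while every element here carries the xls prefix. Below is a minimal sketch of the non-XPath route. It assumes the root element declares xmlns:xls="http://www.opengis.net/xls" (the URI registered above); the inline sample is made up to match the question.

```php
<?php
// Made-up sample mirroring the question, with the namespace declared on the root.
$xmlstr = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<xls:XLS xmlns:xls="http://www.opengis.net/xls">
<xls:RouteInstructionsList>
<xls:RouteInstruction>
<xls:Instruction>Start</xls:Instruction>
</xls:RouteInstruction>
</xls:RouteInstructionsList>
<xls:RouteInstructionsList>
<xls:RouteInstruction>
<xls:Instruction>End</xls:Instruction>
</xls:RouteInstruction>
</xls:RouteInstructionsList>
</xls:XLS>
XML;

$xml = simplexml_load_string($xmlstr);
$new_xml = simplexml_load_string('<main/>');

// children('xls', true) selects child elements by namespace prefix;
// the explicit calls at each level keep the lookup unambiguous.
foreach ($xml->children('xls', true) as $list) {
    $route = $list->children('xls', true)->RouteInstruction;
    $instr = $route->children('xls', true)->Instruction;
    $new_xml->addChild('instruction', (string) $instr);
}

$result = $new_xml->asXML();
echo $result;
```

If the real file declares no namespace at all, the parser should reject the prefixed tags outright, so checking the return value of simplexml_load_string() instead of silencing it with @ is probably the first thing to try.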

Related

How to use php to remove an XML element [duplicate]

I am trying to develop a function that removes certain URL nodes from my sitemap file. Here is what I have so far.
$xpath = new DOMXpath($DOMfile);
$elements = $xpath->query("/urlset/url/loc[contains(.,'$pageUrl')]");
echo count($elements);
foreach($elements as $element){
//this is where I want to delete the URL
echo $element;
echo "here".$element->nodeValue;
}
Which outputs "111111". I don't know why I can't echo a string in a foreach loop if the $elements count is '1'.
Up until now, I've been doing
$urls = $dom->getElementsByTagName( "url" );
foreach( $urls as $url ){
$locs = $url->getElementsByTagName( "loc" );
$loc = $locs->item(0)->nodeValue;
echo $loc;
if($loc == $fullPageUrl){
$removeUrl = $dom->removeChild($url);
}
}
Which would work fine if my sitemap wasn't so big. It times out right now, so I'm hoping using xpath queries will be faster.
After Gordon's comment, I tried:
$xpath = new DOMXpath($DOMfile);
$query = sprintf('/urlset/url[./loc = "%d"]', $pageUrl);
foreach($xpath->query($query) as $element) {
//this is where I want to delete the URL
echo $element;
echo "here".$element->nodeValue;
}
And it's not returning anything.
I tried going a step further and used codepad, using what was used in the other post mentioned, and did this:
<?php
error_reporting(-1);
$xml = <<< XML
<?xml version="1.0" encoding="UTF-8" ?>
<url>
<loc>professional_services</loc>
<loc>5professional_services</loc>
<loc>6professional_services</loc>
</url>
XML;
$id = '5professional_services';
$dom = new DOMDocument;
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);
$query = sprintf('/url/[loc = $id]');
foreach($xpath->query($query) as $record) {
$record->parentNode->removeChild($record);
}
echo $dom->saveXml();
and I'm getting a "Warning: DOMXPath::query(): Invalid expression" at the foreach loop line. Thanks for the other comment on the urlset. I tried including the double slashes in my code, and it returned nothing.
The XML of a sitemap should look like this:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc></loc>
...
</url>
<url>
<loc></loc>
...
</url>
...
</urlset>
Since it has a namespace, the query is a little more complicated than in my previous answer:
$xpath = new DOMXpath($DOMfile);
// Here register your namespace with a shortcut
$xpath->registerNamespace('sm', "http://www.sitemaps.org/schemas/sitemap/0.9");
// this request should work
$elements = $xpath->query('/sm:urlset/sm:url[sm:loc = "'.$pageUrl.'"]');
foreach($elements as $element){
// This is a hint from the manual comments
$element->parentNode->removeChild($element);
}
echo $DOMfile->saveXML();
I'm writing this from memory just before going to bed. If it doesn't work, I'll test it tomorrow morning. (And yes, I'm aware that it could bring some downvotes.)
If you don't have a namespace (you should, but it's not mandatory, sigh):
$elements = $xpath->query('/urlset/url[loc = "'.$pageUrl.'"]');
There is a concrete working example here: http://codepad.org/vuGl1MAc
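In case the codepad link goes away, here is a self-contained sketch of the namespaced variant. The sample URLs are invented, and it assumes $pageUrl contains no double quote, since the value is pasted straight into the XPath expression.

```php
<?php
// Invented two-entry sitemap using the standard sitemap namespace.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url><loc>http://example.com/a</loc></url>
<url><loc>http://example.com/b</loc></url>
</urlset>
XML;

$pageUrl = 'http://example.com/a'; // hypothetical URL to remove

$dom = new DOMDocument();
$dom->loadXML($xml);

$xpath = new DOMXPath($dom);
// Bind a prefix of our choosing to the sitemap's default namespace.
$xpath->registerNamespace('sm', 'http://www.sitemaps.org/schemas/sitemap/0.9');

$matches = $xpath->query('/sm:urlset/sm:url[sm:loc = "' . $pageUrl . '"]');

// Copy the matches into an array first, then remove each <url> node.
foreach (iterator_to_array($matches) as $url) {
    $url->parentNode->removeChild($url);
}

$remaining = $dom->getElementsByTagName('loc')->length;
$output = $dom->saveXML();
echo $output;
```

Note that the matched node is removed through its own parentNode; calling removeChild() on the document itself, as in the question's loop, only works for direct children of the document.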

PHP Search between two points

I'm trying to deal with some XML in PHP.
I have code, such as this:
<?php
$stream = fopen("xml","r");
?>
Where "xml" contains something such as this:
<name>name1</name>
<key>key1</key>
<name>name2</name>
<key>key2</key>
etc.
I'd like to create an array out of the contents of the <key> tags, so that
keys[0] = "key1"
and
keys[1] = "key2"
Any help is appreciated, thank you very much :)
Solution:
$xmlstr = fread($stream,filesize("xml-file"));
$sxe = new SimpleXMLElement($xmlstr);
echo $sxe->getName() . "\n";
foreach ($sxe->children() as $child) {
echo $child->children();
}
You should use DOM functions for this case. Let's suppose a well-formed XML document (xmltest.xml):
<?xml version="1.0" encoding="utf-8"?>
<root>
<name>name1</name>
<key>key1</key>
<name>name2</name>
<key>key2</key>
</root>
This code loads the XML file into a DOM document and gets all nodes with the tag key:
<?php
$dom = new DOMDocument('1.0','utf-8');
$dom->load('xmltest.xml');
$keys = $dom->getElementsByTagName('key');
for ($i = 0; $i < $keys->length; $i++) {
echo $keys->item($i)->nodeValue . "<br>";
}
?>
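Since the question asks for an array rather than printed output, the same lookup can fill one. This is a minimal sketch that uses an inline string instead of xmltest.xml so it runs on its own; the $keys array name follows the question.

```php
<?php
// Same structure as xmltest.xml above, inlined so the example is self-contained.
$xml = '<root><name>name1</name><key>key1</key><name>name2</name><key>key2</key></root>';

$dom = new DOMDocument();
$dom->loadXML($xml);

// Collect the text content of every <key> element, in document order.
$keys = array();
foreach ($dom->getElementsByTagName('key') as $node) {
    $keys[] = $node->nodeValue;
}

print_r($keys);
```

After the loop, $keys[0] holds "key1" and $keys[1] holds "key2".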

How to load string back xml with simpleXML in PHP?

Would anyone know how I can "explode" a string back into "normal" XML format?
I found this script (ref:gooseflight,2010) that looks like it can do the job but the output comes out stuck together.
Here's the code:
function combineXML($file)
{
global $xmlstr;
$xml = simplexml_load_file($file);
foreach($xml as $element)
$xmlstr .= $element->asXML();
}
$files[] = "tmp.xml";
$files[] = "traduction.xml";
$xmlstr = '<CAB>';
foreach ($files as $file)
combineXML($file);
$xmlstr .= '</CAB>';
// Convert string to XML for further processing
$xml = simplexml_load_string($xmlstr);
$bytes = file_put_contents("combined.xml", $xml->asXML());
Here is the output:
<?xml version="1.0" encoding="UTF-8"?>
<CAB>
<CABO>XXXXXXXXXX0987650003</CABO><ACTIVITY>NONE</ACTIVITY><BEORI>blablaE</BEORI><BEDEST>blabla</BEDEST><NATRELA>more blabla</NATRELA><ANE>2014</ANE><NODEP>1111</NODEP>
</CAB>
So how could I separate the nodes to look like this?
<?xml version="1.0" encoding="UTF-8"?>
<CAB>
<CABO>XXXXXXXXXX0987650003</CABO>
<ACTIVITY>NONE</ACTIVITY>
<BEORI>blablaE</BEORI>
<BEDEST>blabla</BEDEST>
<NATRELA>more blabla</NATRELA>
<ANE>2014</ANE>
<NODEP>1111</NODEP>
.....
</CAB>
Would anyone know how to fix it?
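I would suggest using the DOMDocument class to save the XML; try this:
$dom_obj = new DOMDocument();
// Discard whitespace-only text nodes so formatOutput can re-indent the document
$dom_obj->preserveWhiteSpace = false;
$dom_obj->load($file); // use loadXML($xmlstr) instead if you have an XML string
// Make your changes to the document with DOMDocument methods (e.g. createElement, createAttribute, etc.)
$dom_obj->formatOutput = true;
$dom_obj->save($file);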
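To tie this back to the combined string from the question, here is a small sketch that pretty-prints an XML string directly. The shortened sample string is made up from the question's output; disabling preserveWhiteSpace before loading is what lets formatOutput re-indent documents that already contain whitespace.

```php
<?php
// Shortened, made-up version of the "stuck together" output from the question.
$xmlstr = '<CAB><CABO>XXXXXXXXXX0987650003</CABO><ACTIVITY>NONE</ACTIVITY></CAB>';

$dom = new DOMDocument();
$dom->preserveWhiteSpace = false; // must be set before loading
$dom->formatOutput = true;        // indent child elements on output
$dom->loadXML($xmlstr);

$pretty = $dom->saveXML();
echo $pretty;
```

Each child element should now appear on its own indented line; $dom->save('combined.xml') would write the same formatted text to a file.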

How to remove XML elements and all children?

I need to read an XML file and delete all the elements named <image>, along with all of their children. I have found similar old questions, but their answers did not work. What am I doing wrong? Is there a better method?
XML:
<?xml version='1.0' encoding='UTF-8'?>
<settings>
<background_color>#000000</background_color>
<show_context_menu>yes</show_context_menu>
<image>
<thumb_path>210x245.png</thumb_path>
<big_image_path>620x930.png</big_image_path>
</image>
<image>
<thumb_path>200x295.png</thumb_path>
<big_image_path>643x950.png</big_image_path>
</image>
</settings>
PHP:
$dom = new DOMDocument();
$dom->load('test.xml');
$thedocument = $dom->documentElement;
$elements = $thedocument->getElementsByTagName('image');
foreach ($elements as $node) {
$node->parentNode->removeChild($node);
}
$save = $dom->saveXML();
file_put_contents('test.xml', $save);
I figured it out after a good night of sleep. It was quite simple actually.
$xml = simplexml_load_file('test.xml');
unset($xml->image);
$xml_file = $xml->asXML();
$xmlFile = 'test.xml';
$xmlHandle = fopen($xmlFile, 'w');
fwrite($xmlHandle, $xml_file);
fclose($xmlHandle);
Edit: You probably want to make it save directly:
$file = 'test.xml';
$xml = simplexml_load_file($file);
unset($xml->image);
$success = $xml->asXML($file);
See SimpleXMLElement::asXML() in the PHP manual.
On the PHP manual page (where you should always go first :-) one awesome contributor points out that:
You can't remove DOMNodes from a DOMNodeList as you're iterating over them in a foreach loop.
The contributor then goes on to offer a potential solution. Try something like this instead:
<?php
$domNodeList = $domDocument->getElementsByTagname('p');
$domElemsToRemove = array();
foreach ( $domNodeList as $domElement ) {
// ...do stuff with $domElement...
$domElemsToRemove[] = $domElement;
}
foreach( $domElemsToRemove as $domElement ){
$domElement->parentNode->removeChild($domElement);
}
?>
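Applied to the XML from the question, the same idea can also be written without the second array by walking the live list from the end, so removals never shift the entries still to be visited. This is only a sketch; the inline XML is a trimmed copy of the question's file.

```php
<?php
// Trimmed copy of the settings file from the question.
$xml = '<settings><background_color>#000000</background_color>'
     . '<image><thumb_path>210x245.png</thumb_path></image>'
     . '<image><thumb_path>200x295.png</thumb_path></image></settings>';

$dom = new DOMDocument();
$dom->loadXML($xml);

// getElementsByTagName() returns a live list that shrinks as nodes are removed,
// so iterate backwards by index instead of using foreach.
$images = $dom->getElementsByTagName('image');
for ($i = $images->length - 1; $i >= 0; $i--) {
    $node = $images->item($i);
    $node->parentNode->removeChild($node);
}

$left = $dom->getElementsByTagName('image')->length;
$output = $dom->saveXML();
echo $output;
```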
First of all, your XML is broken: see <thumb>...</thumb_path> and the next line as well. Fix it!
Then it is really simple, in three lines of code:
$xml = simplexml_load_string($x); // $x holds your xml
$count = $xml->image->count()-1;
for ($i = $count;$i >= 0;$i--) unset($xml->image[$i]);
See the live demo at http://codepad.viper-7.com/HkGy5o

PHP DOM: How to move element into default namespace?

What I tried and what doesn't work:
Input:
$d = new DOMDocument();
$d->formatOutput = true;
// Out of my control:
$someEl = $d->createElementNS('http://example.com/a', 'a:some');
// Under my control:
$envelopeEl = $d->createElementNS('http://example.com/default', 'envelope');
$d->appendChild($envelopeEl);
$envelopeEl->appendChild($someEl);
echo $d->saveXML();
$someEl->prefix = null;
echo $d->saveXML();
The output after the substitution is invalid XML:
<?xml version="1.0"?>
<envelope xmlns="http://example.com/default">
<a:some xmlns:a="http://example.com/a"/>
</envelope>
<?xml version="1.0"?>
<envelope xmlns="http://example.com/default">
<:some xmlns:a="http://example.com/a" xmlns:="http://example.com/a"/>
</envelope>
Note that <a:some> may have children. One solution would be to create a new <some> and copy all children from <a:some> to <some>. Is that the way to go?
This is a really interesting question. My first idea was to clone the <a:some> node, remove the xmlns:a attribute from the clone, remove <a:some>, and insert the clone in its place. But this does not work, as PHP does not allow removing the xmlns:a attribute like a regular attribute.
After some struggling with PHP's DOM methods I started to google the problem and found a comment in the PHP documentation (linked in the code below). The user suggests writing a function that clones the node manually without its namespace:
<?php
/**
* This function is based on a comment to the PHP documentation.
* See: http://www.php.net/manual/de/domnode.clonenode.php#90559
*/
function cloneNode($node, $doc){
$unprefixedName = preg_replace('/.*:/', '', $node->nodeName);
$nd = $doc->createElement($unprefixedName);
foreach ($node->attributes as $value)
$nd->setAttribute($value->nodeName, $value->value);
if (!$node->childNodes)
return $nd;
foreach($node->childNodes as $child) {
if($child->nodeName == "#text")
$nd->appendChild($doc->createTextNode($child->nodeValue));
else
$nd->appendChild(cloneNode($child, $doc));
}
return $nd;
}
Using it leads to code like this:
$xml = '<?xml version="1.0"?>
<envelope xmlns="http://example.com/default">
<a:some xmlns:a="http://example.com/a"/>
</envelope>';
$doc = new DOMDocument();
$doc->loadXML($xml);
$elements = $doc->getElementsByTagNameNS('http://example.com/a', 'some');
$original = $elements->item(0);
$clone = cloneNode($original, $doc);
$doc->documentElement->replaceChild($clone, $original);
$doc->formatOutput = TRUE;
echo $doc->saveXML();
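A possible variation, offered only as a sketch: as far as I can tell, createElement() produces an element in no namespace at all, which merely serializes as if it were in the default one. If the element should really belong to the default namespace in the DOM, createElementNS() with the envelope's URI can be used and the children moved across. The helper name moveToNamespace is made up for this example, and prefixed attributes or descendants are not handled.

```php
<?php
// Hypothetical helper: replace $old with an unprefixed element in $nsUri,
// moving (not copying) its plain attributes and child nodes.
function moveToNamespace(DOMElement $old, string $nsUri): DOMElement
{
    $doc = $old->ownerDocument;
    $new = $doc->createElementNS($nsUri, $old->localName);

    foreach ($old->attributes as $attr) {
        $new->setAttribute($attr->nodeName, $attr->value);
    }
    // appendChild() detaches each child from $old, so the loop terminates.
    while ($old->firstChild !== null) {
        $new->appendChild($old->firstChild);
    }
    $old->parentNode->replaceChild($new, $old);
    return $new;
}

$doc = new DOMDocument();
$doc->loadXML('<envelope xmlns="http://example.com/default">'
    . '<a:some xmlns:a="http://example.com/a" id="1"><child/></a:some></envelope>');

$old = $doc->getElementsByTagNameNS('http://example.com/a', 'some')->item(0);
$new = moveToNamespace($old, 'http://example.com/default');

$doc->formatOutput = true;
echo $doc->saveXML();
```

Moving the children instead of cloning them recursively keeps comments and CDATA sections intact, which the name-based cloneNode() above would probably choke on.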
