Replace element value in DOM - php

I want to save changed DOM element values back to an existing XML file. I found a replace function, but it is in JavaScript and I need one in PHP.
I tried the save and saveXML functions, but they didn't work. My XML has tags with a colon in the name, such as "iaiext:auction_title". Fetching them with getElementsByTagName works, and cutting the title to 50 characters works too. But how can I replace the old title with the new one if I don't use a path, as I would with simplexml_load_file? How do I express that path in my script?
$dom = new DOMDocument;
$dom->load('p.xml');
$i = 0;
$tytuly = $dom->getElementsByTagName('auction_title');
foreach ($tytuly as $tytul){
$title = $tytul->nodeValue;
$end_title = doTitleCut($title);
//echo "<pre>";
//echo($end_title);
//echo "<pre>";
$i = $i+1;
}

In your loop, you can update a particular node's value the same way you fetch it: with nodeValue. So in your loop, just update it each time...
$tytul->nodeValue = doTitleCut($title);
Then after your loop, you can just echo the new XML out using
echo $dom->saveXML();
or save it using
$dom->save("3.xml");
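Putting those pieces together, a minimal end-to-end sketch might look like the following. It is only an illustration: the namespace URI urn:example:iaiext is made up (use whatever URI your file binds the iaiext prefix to), and cutTitle() is a hypothetical stand-in for your own doTitleCut().

```php
<?php
// Hypothetical stand-in for doTitleCut(): keep at most $max characters.
function cutTitle(string $title, int $max = 50): string
{
    return mb_substr($title, 0, $max);
}

// Sample data; the namespace URI is an assumption for this sketch.
$xml = <<<'XML'
<offers xmlns:iaiext="urn:example:iaiext">
  <iaiext:auction_title>A very long auction title that certainly runs past the fifty character limit</iaiext:auction_title>
  <iaiext:auction_title>Short title</iaiext:auction_title>
</offers>
XML;

$dom = new DOMDocument;
$dom->loadXML($xml);

// Select the prefixed elements by namespace URI and local name.
$titles = $dom->getElementsByTagNameNS('urn:example:iaiext', 'auction_title');

foreach ($titles as $tytul) {
    // Read the old value, shorten it, and write it back to the same node.
    $tytul->nodeValue = cutTitle($tytul->nodeValue);
}

echo $dom->saveXML();
// To write a file instead: $dom->save('3.xml');
```

The key point is that the node you read from is the node you write to, so no separate path is needed.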

It is the same basic API in PHP. However, browsers implement more or different parts of the API. There are 5 revisions of the API (DOM Level 1 to 4 and DOM LS). DOM 3 added a property to read/write the text content of a node: https://www.w3.org/TR/DOM-Level-3-Core/core.html#Node3-textContent
The following example prefixes the titles:
$xml = <<<'XML'
<auctions>
<auction_title>World!</auction_title>
<auction_title>World &amp; Universe!</auction_title>
</auctions>
XML;
$document = new DOMDocument();
$document->loadXML($xml);
$titleNodes = $document->getElementsByTagName('auction_title');
foreach ($titleNodes as $titleNode) {
$title = $titleNode->textContent;
$titleNode->textContent = 'Hello '.$title;
}
echo $document->saveXML();
Output:
<?xml version="1.0"?>
<auctions>
<auction_title>Hello World!</auction_title>
<auction_title>Hello World &amp; Universe!</auction_title>
</auctions>
PHP's DOMNode::$nodeValue implementation does not match the W3C API definition. It behaves the same as DOMNode::$textContent for reads and does not fully escape on write.

Related

PHP: Keeping HTML inside XML node without CDATA

I've got an xml like this:
<father>
<son>Text with <b>HTML</b>.</son>
</father>
I'm using simplexml_load_string to parse it into SimpleXmlElement. Then I get my node like this
$xml->father->son->__toString(); //output: "Text with .", but expected "Text with <b>HTML</b>."
I need to handle simple HTML such as:
<b>text</b> or <br/> inside the xml which is sent by many users.
My problem is that I can't just ask them to use CDATA, because they won't be able to handle it properly and they are already used to doing without it.
Also, if possible, I don't want the file to be edited, because the information needs to be exactly what the user sent.
The function simplexml_load_string simply drops the HTML node and anything inside it.
How can I keep the information ?
SOLUTION
To handle the problem I used asXml, as explained by @ThW:
$tmp = $xml->father->son->asXml(); //<son>Text with <b>HTML</b>.</son>
I just added a preg_match to erase the node.
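The regular expression itself is not shown above, so the following is only a guess at what that step could look like. It uses preg_replace rather than preg_match, and the pattern is an assumption that only handles a plain <son>...</son> wrapper (not a self-closing <son/>).

```php
<?php
// Load the sample; here the root element is <father>, so the child is $father->son.
$father = simplexml_load_string('<father><son>Text with <b>HTML</b>.</son></father>');

// asXML() on a child element serializes that element including its own tags.
$outer = $father->son->asXML(); // <son>Text with <b>HTML</b>.</son>

// Strip the opening <son ...> at the start and the closing </son> at the end.
$inner = preg_replace('~^<son\b[^>]*>|</son>$~', '', $outer);

echo $inner, "\n"; // Text with <b>HTML</b>.
```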
A CDATA section is a character node, just like a text node. But it does less encoding/decoding. This is mostly a downside, actually. On the upside something in a CDATA section might be more readable for a human and it allows for some BC in special cases. (Think HTML script tags.)
For an XML API they are nearly the same. Here is a small DOM example (SimpleXML abstracts too much).
$document = new DOMDocument();
$father = $document->appendChild(
$document->createElement('father')
);
$son = $father->appendChild(
$document->createElement('son')
);
$son->appendChild(
$document->createTextNode('With <b>HTML</b><br>It\'s so nice.')
);
$son = $father->appendChild(
$document->createElement('son')
);
$son->appendChild(
$document->createCDataSection('With <b>HTML</b><br>It\'s so nice.')
);
$document->formatOutput = TRUE;
echo $document->saveXml();
Output:
<?xml version="1.0"?>
<father>
<son>With &lt;b&gt;HTML&lt;/b&gt;&lt;br&gt;It's so nice.</son>
<son><![CDATA[With <b>HTML</b><br>It's so nice.]]></son>
</father>
As you can see they are serialized very differently - but from the API view they are basically exchangeable. If you're using an XML parser the value you get back should be the same in both cases.
So the first possibility is just letting the HTML fragment be stored in a character node. It is just a string value for the outer XML document itself.
The other way would be using XHTML. XHTML is XML compatible HTML. You can mix and match different XML formats, so you could add the XHTML fragment as part of the outer XML.
That seems to be what you're receiving. But SimpleXML has some problems with mixed nodes. So here is an example of how you can read it in DOM.
$xml = <<<'XML'
<father>
<son>With <b>HTML</b><br/>It's so nice.</son>
</father>
XML;
$document = new DOMDocument();
$document->loadXml($xml);
$xpath = new DOMXpath($document);
$result = '';
foreach ($xpath->evaluate('/father/son[1]/node()') as $child) {
$result .= $document->saveXml($child);
}
echo $result;
Output:
With <b>HTML</b><br/>It's so nice.
Basically you need to save each child of the son element as XML.
SimpleXML is based on the same DOM library internally. That allows you to convert a SimpleXMLElement into a DOM node. From there you can again save each child as XML.
$father = new SimpleXMLElement($xml);
$sonNode = dom_import_simplexml($father->son);
$document = $sonNode->ownerDocument;
$result = '';
foreach ($sonNode->childNodes as $child) {
$result .= $document->saveXml($child);
}
echo $result;

can't access xml node PHP

I have a page in php where I have to parse an xml.
I have done this for example:
$hotelNodes = $xml_data->getElementsByTagName('Hotel');
foreach($hotelNodes as $hotel){
$supplementsNodes2 = $hotel->getElementsByTagName('BoardBase');
foreach($supplementsNodes2 as $suppl2) {
echo'<p>HERE</p>'; //not enter here
}
}
In this code I access each hotel in my XML, and for each hotel I want to find the BoardBase tags, but the inner loop is never entered.
This is my XML (with many parts cut out):
<hotel desc="DESC" name="Hotel">
<selctedsupplements>
<boardbases>
<boardbase bbpublishprice="0" bbprice="0" bbname="Colazione Continentale" bbid="1"></boardbase>
</boardbases>
</selctedsupplements>
</occupancy></occupancies>
</hotel>
Many of my nodes don't have a BoardBase, but sometimes it is there and the loop still isn't entered.
Is it possible that this node isn't accessible?
This XML is received from a server with a SoapClient.
If I inspect the printed XML in Firebug, the node is shown with reduced opacity.
I have also tried this:
$supplementsNodes2 = $hotel->getElementsByTagName('boardbase');
but without success
2 issues I can see from the get-go: XML names are case-sensitive, hence:
$hotelNodes = $xml_data->getElementsByTagName('Hotel');
Can't work, because your xml node looks like:
<hotel desc="DESC" name="Hotel">
hotel => lower-case!
As you can see here:
[...] names for such things as elements, while XML is explicitly case sensitive.
The official specs specify tag names as case-sensitive, so getElementsByTagName('FOO') won't return the same elements as getElementsByTagName('foo')...
Secondly, you seem to have some tag-soup going on:
</occupancy></occupancies>
<!-- tag names don't match, both are closing tags -->
This is just plain invalid markup, it should read either:
<occupancy></occupancy>
or
<occupancies></occupancies>
That would be the first 2 ports of call.
I've set up a quick codepad using this code, which you can see here:
$xml = '<hotel desc="DESC" name="Hotel">
<selctedsupplements>
<boardbases>
<boardbase bbpublishprice="0" bbprice="0" bbname="Colazione Continentale" bbid="1"></boardbase>
</boardbases>
</selctedsupplements>
<occupancy></occupancy>
</hotel>';
$dom = new DOMDocument;
$dom->loadXML($xml);
$badList = $dom->getElementsByTagName('Hotel');
$correctList = $dom->getElementsByTagName('hotel');
echo sprintf("%d",$badList->length),
' compared to ',
$correctList->length, PHP_EOL;
The output was "0 compared to 1", meaning that using a lower-case selector returned 1 element, the one with the upper-case H returned an empty list.
To get to the boardbase tags for each hotel tag, you just have to write this:
$hotels = $dom->getElementsByTagName('hotel');
foreach($hotels as $hotel)
{
$supplementsNodes2 = $hotel->getElementsByTagName('boardbase');
foreach($supplementsNodes2 as $node)
{
var_dump($node);//you _will_ get here now
}
}
As you can see on this updated codepad.
Alessandro, your XML is a mess (=un casino), you really need to get that straight. Elias' answer pointed out some very basic stuff to consider.
I built on the codepad Elias set up, and it works perfectly for me:
$dom = new DOMDocument;
$dom->loadXML($xml);
$hotels = $dom->getElementsByTagName('hotel');
foreach ($hotels as $hotel) {
$bbs = $hotel->getElementsByTagName('boardbase');
foreach ($bbs as $bb) echo $bb->getAttribute('bbname');
}
see http://codepad.org/I6oxkEOC
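As an alternative to the nested loops, the same lookup can be sketched with DOMXPath. This is not from either answer; the wrapping <hotels> element and the shortened sample data are assumptions made so the snippet runs on its own. XPath names are case-sensitive too, so the lower-case spelling still matters.

```php
<?php
// Shortened sample; the <hotels> wrapper is an assumption for this sketch.
$xml = '<hotels>
<hotel desc="DESC" name="Hotel">
<selctedsupplements>
<boardbases>
<boardbase bbpublishprice="0" bbprice="0" bbname="Colazione Continentale" bbid="1"></boardbase>
</boardbases>
</selctedsupplements>
</hotel>
</hotels>';

$dom = new DOMDocument;
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// Collect the bbname attribute of every boardbase nested anywhere inside a hotel.
$names = [];
foreach ($xpath->query('//hotel//boardbase') as $bb) {
    $names[] = $bb->getAttribute('bbname');
}

print_r($names);
```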

Parsing inline tags with SimpleXML

I'm using SimpleXML & PHP to parse an XML element in the following form:
<element>
random text with <inlinetag src="http://url.com/">inline</inlinetag> XML to parse
</element>
I know I can reach inlinetag using $element->inlinetag, but I don't know how to replace the inlinetag with a link built from its src attribute without relying on its location in the text. The result would basically have to look like this:
here is a random text with <a href="http://url.com/">inline</a> XML
This may be a stupid question, I hope someone here can help! :)
I found a way to do this using DOMElement.
One way to replace the element is by cloning it with a different name/attributes. Here is a way to do this, using the accepted answer given on How do you rename a tag in SimpleXML through a DOM object?
function clonishNode(DOMNode $oldNode, $newName, $replaceAttrs = [])
{
$newNode = $oldNode->ownerDocument->createElement($newName);
foreach ($oldNode->attributes as $attr)
{
if (isset($replaceAttrs[$attr->name]))
$newNode->setAttribute($replaceAttrs[$attr->name], $attr->value);
else
$newNode->appendChild($attr->cloneNode());
}
foreach ($oldNode->childNodes as $child)
$newNode->appendChild($child->cloneNode(true));
$oldNode->parentNode->replaceChild($newNode, $oldNode);
}
Now, we use this function to clone the inline element with a new element and attribute name. Here comes the tricky part: iterating over all the nodes will not work as expected. The length of the selected nodes will change as you clone them, as the original node is removed. Therefore, we only select the first element until there are no elements left to clone.
$xml = '<element>
random text with <inlinetag src="http://url.com/">inline</inlinetag> XML to parse
</element>';
$dom = new DOMDocument;
$dom->loadXML($xml);
$nodes= $dom->getElementsByTagName('inlinetag');
echo $dom->saveXML(); //<element>random text with <inlinetag src="http://url.com/">inline</inlinetag> XML to parse</element>
while($nodes->length > 0) {
clonishNode($nodes->item(0), 'a', ['src' => 'href']);
}
echo $dom->saveXML(); //<element>random text with <a href="http://url.com/">inline</a> XML to parse</element>
That's it! All that's left to do is getting the content of the element tag.
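That last step can be sketched as follows. To stay self-contained, this version does not reuse clonishNode(); it builds the a element directly and copies the live node list into an array first, which is a different way around the shrinking-list problem described above. Treat it as one possible approach, not the answer's exact method.

```php
<?php
$dom = new DOMDocument;
$dom->loadXML('<element>random text with <inlinetag src="http://url.com/">inline</inlinetag> XML to parse</element>');

// Snapshot the live list so replacing nodes does not disturb the iteration.
foreach (iterator_to_array($dom->getElementsByTagName('inlinetag')) as $old) {
    $a = $dom->createElement('a');
    $a->setAttribute('href', $old->getAttribute('src'));

    // Appending an existing child moves it, so this loop empties $old.
    while ($old->firstChild) {
        $a->appendChild($old->firstChild);
    }
    $old->parentNode->replaceChild($a, $old);
}

// Serialize each child of the root to get its content without the <element> tags.
$inner = '';
foreach ($dom->documentElement->childNodes as $child) {
    $inner .= $dom->saveXML($child);
}

echo $inner, "\n";
// random text with <a href="http://url.com/">inline</a> XML to parse
```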
Is this the result you want to achieve?
<?php
$data = '<element>
random text with
<inlinetag src="http://url.com/">inline
</inlinetag> XML to parse
</element>';
$xml = simplexml_load_string($data);
foreach($xml->inlinetag as $resource)
{
echo 'Your SRC attribute = '. $resource->attributes()->src; // e.g. name, price, symbol
}
?>

Change the value of a text node using SimpleXML

I am trying to write a code where it will find a specific element in my XML file and then change the value of the text node. The XML file has different namespaces. Till now, I have managed to register the namespaces and also echo the text node of the element, which I want to change.
<?php
$xml = simplexml_load_file('getobs.xml');
$xml->registerXPathNamespace('g','http://www.opengis.net/gml');
$result = $xml->xpath('//g:beginPosition');
foreach ($result as $title) {
echo $title . "\n";
}
?>
My question is: How can I change the value of this element using SimpleXML? I tried to use the nodeValue command but I am not able to make it work.
This is a part of the XML:
<sos:GetObservation xmlns:sos="http://www.opengis.net/sos/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" service="SOS" version="1.0.0" srsName="urn:ogc:def:crs:EPSG:4326">
<sos:offering>urn:gfz:cawa:def:offering:meteorology</sos:offering>
<sos:eventTime>
<ogc:TM_During xmlns:ogc="http://www.opengis.net/ogc" xsi:type="ogc:BinaryTemporalOpType">
<ogc:PropertyName>urn:ogc:data:time:iso8601</ogc:PropertyName>
<gml:TimePeriod xmlns:gml="http://www.opengis.net/gml">
<gml:beginPosition>2011-02-10T01:10:00.000</gml:beginPosition>
Thanks
Dimitris
In the end I managed to do it by using the PHP XML DOM.
Here is the code that I used in order to change the text node of a specific element:
<?php
// create new DOM document and load the data
$dom = new DOMDocument;
$dom->load('getobs.xml');
//var_dump($dom);
// Create new xpath and register the namespace
$xpath = new DOMXPath($dom);
$xpath->registerNamespace('g','http://www.opengis.net/gml');
// query the result and change the value to the new date
$result = $xpath->query("//g:beginPosition");
$result->item(0)->nodeValue = 'sds';
// save the values in a new xml
file_put_contents('test.xml',$dom->saveXML());
?>
Not wanting to switch from the code I've already made for SimpleXML, I found this solution:
http://www.dotdragnet.com/forum/index.php?topic=3979.0
Specifically:
$numvotes = $xml->xpath('/gallery/image[path="'.$_GET["image"].'"]/numvotes');
...
$numvotes[0][0] = $votes;
Hope this helps!
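Applied to the beginPosition case from the question, that assignment trick might look like the sketch below. The <root> wrapper, the shortened document, and the replacement date are assumptions for illustration only.

```php
<?php
// Shortened stand-in for getobs.xml; the <root> wrapper is made up.
$xml = simplexml_load_string(
    '<root xmlns:gml="http://www.opengis.net/gml">'
    . '<gml:TimePeriod><gml:beginPosition>2011-02-10T01:10:00.000</gml:beginPosition></gml:TimePeriod>'
    . '</root>'
);

$xml->registerXPathNamespace('g', 'http://www.opengis.net/gml');
$result = $xml->xpath('//g:beginPosition');

// $result[0] is the matched element; assigning to its [0] replaces its text content.
$result[0][0] = '2012-01-01T00:00:00.000';

echo $xml->asXML();
```

The elements returned by xpath() are still backed by the original document, which is why the change shows up when the root is serialized.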

PHP Dealing with missing XML data

If I have three sets of data, say:
<note><from>Me</from><to>someone</to><message>hello</message></note>
<note><from>Me</from><to></to><message>Need milk & eggs</message></note>
<note><from>Me</from><message>Need milk & eggs</message></note>
and I'm using simplexml is there a way to have simple xml check that there's an empty/absent tag automatically?
I would like the output to be:
FROM TO MESSAGE
Me someone hello
Me NULL Need milk & eggs
Me NULL Need milk & eggs
Right now I'm doing it manually and I quickly realised that it's going to take a very long time to do it for long xml files.
My current sample code:
$xml = simplexml_load_string($string);
if ($xml->from != "") {$out .= $xml->from."\t";} else {$out .= "NULL\t";}
//repeat for all children, checking by name
Sometimes the order is different as well, there might be a xml with:
<note><message>pick up cd</message><from>me</from></note>
so iterating through the children and checking by index count doesn't work.
The actual xml files I'm working with are thousands of lines each, so I obviously can't just code in every tag.
It sounds like you need a DTD (Document Type Definition), which will define the required format of the XML file, and specify which elements are required, optional, what they can contain, etc.
DTDs can be used to validate an XML file before you do any processing with it.
Unfortunately, PHP's simplexml library doesn't do anything with DTD, but the DomDocument library does, so you may want to use that instead.
I'll leave it as a separate exercise for you to research how to create a DTD file. If you need more help with that, I'd suggest asking it as a separate question.
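For orientation, here is a rough sketch of DTD validation with DOMDocument. The DTD is a guess at what the notes format could require (every note must have from, to, and message, in that order); it is embedded in the document so the snippet runs on its own.

```php
<?php
// Collect validation problems instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// The DTD below is an assumption about the intended format.
$xml = <<<'XML'
<?xml version="1.0"?>
<!DOCTYPE notes [
  <!ELEMENT notes (note*)>
  <!ELEMENT note (from, to, message)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT message (#PCDATA)>
]>
<notes>
  <note><from>Me</from><to>someone</to><message>hello</message></note>
  <note><from>Me</from><message>Need milk</message></note>
</notes>
XML;

$dom = new DOMDocument;
$dom->loadXML($xml);

// validate() checks the document against its DTD; the second note lacks <to>.
$valid = $dom->validate();
$errors = libxml_get_errors();

var_dump($valid); // bool(false)
foreach ($errors as $error) {
    echo trim($error->message), "\n";
}
```

Note that a DTD with a fixed child order would also reject notes whose elements arrive in a different order, so it reports problems rather than filling in missing values.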
You could use the DOMDocument instead. I have created a quick demo that splits the <note> elements into an array using the XML tag names as keys. You could then iterate the resultant array to create your output.
I corrected the invalid XML by replacing the ampersand with its entity equivalent (&amp;).
<?php
libxml_use_internal_errors(true);
$xml = <<<XML
<notes>
<note><from>Me</from><to>someone</to><message>hello</message></note>
<note><from>Me</from><to></to><message>Need milk &amp; eggs</message></note>
<note><from>Me</from><message>Need milk &amp; eggs</message></note>
<note><message>pick up cd</message><from>me</from></note>
</notes>
XML;
function getNotes($nodelist) {
$notes = array();
foreach ($nodelist as $node) {
$noteParts = array();
foreach ($node->childNodes as $child) {
$noteParts[$child->tagName] = $child->nodeValue;
}
$notes[] = $noteParts;
}
return $notes;
}
$dom = new DOMDocument();
$dom->recover = true;
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);
$nodelist = $xpath->query("//note");
$notes = getNotes($nodelist);
print_r($notes);
?>
Edit: If you change $noteParts = array(); to $noteParts = array('from' => null, 'to' => null, 'message' => null); then it will always create the full set of keys.
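If you would rather stay with SimpleXML, as in the question, the same idea can be sketched by driving the output from a list of expected column names. The $columns list is an assumption; in a real file you would derive it from the format you expect. Casting a missing child to string yields an empty string, so absent and empty tags are handled the same way and the element order does not matter.

```php
<?php
$xml = simplexml_load_string(<<<'XML'
<notes>
<note><from>Me</from><to>someone</to><message>hello</message></note>
<note><from>Me</from><to></to><message>Need milk &amp; eggs</message></note>
<note><from>Me</from><message>Need milk &amp; eggs</message></note>
<note><message>pick up cd</message><from>me</from></note>
</notes>
XML);

// Expected columns, in output order (an assumption for this sketch).
$columns = ['from', 'to', 'message'];

$rows = [];
foreach ($xml->note as $note) {
    $row = [];
    foreach ($columns as $name) {
        // A missing child casts to '', the same as an empty one.
        $value = trim((string) $note->$name);
        $row[] = $value === '' ? 'NULL' : $value;
    }
    $rows[] = implode("\t", $row);
}

echo implode("\n", $rows), "\n";
```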
