How to replace the content of my xml file in php?

I want to translate the content of my XML file. To do that, I read the file with file_get_contents and convert it to plain text with the Html2Text library. The translation itself works, but the final step, where I use str_replace with two arrays to put the translated text back into the file, only partly works: some of the content is translated and some is not.
Here is my code:
// Split the original text into an array, using the site name as the separator.
$explodesTextForTranslatingArr = array_filter( explode(WEBSITE_NAME, $textForTranslating), 'strlen' );
$explodesTextForTranslatingArr = array_map('trim', $explodesTextForTranslatingArr);
$countExplodestextForTranslating = count($explodesTextForTranslatingArr);
// Split the translated text into an array in the same way.
$explodesTextAddForTranslatingArr = array_filter( explode(WEBSITE_NAME, $textTranslating), 'strlen' );
$explodesTextAddForTranslatingArr = array_map('trim', $explodesTextAddForTranslatingArr);
$countExplodesTextAddForTranslating = count($explodesTextAddForTranslatingArr);
$textForTranslatingArr = [];
foreach ($explodesTextForTranslatingArr as $value) {
if (!empty($value)) {
$textForTranslatingArr[] = $value;
}
}
$textAddForTranslatingArr = [];
foreach ($explodesTextAddForTranslatingArr as $value) {
if (!empty($value)) {
$textAddForTranslatingArr[] = $value;
}
}
$updateAllContentFromXml = str_replace($textForTranslatingArr, $textAddForTranslatingArr, $fileGetContentFileXml);
echo '<pre>';
print_r($fileGetContentFileXml);
Here is a snippet of the result of my XML file:
<book id="bk101">
<author>Gambardella, Matthew</author>
<title>XML Developer's Guide</title>
<genre>Computer</genre>
<price>44.95</price>
<publish_date>2000-10-01</publish_date>
<description>An in-depth look at building applications
with XML.</description>
</book>
<book id="bk102">
<author>Ralls, Kim</author>
<title>Midnight Rain</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2000-12-16</publish_date>
<description>A former architect battles corporate zombies,
an evil sorceress, and her own childhood to become queen
of the world.</description>
</book>
EDIT:
Here is some explanation of how the three variables are used.
$textForTranslatingArr stores the text to translate as an array, after filtering out the site name.
$textAddForTranslatingArr stores the translated text as a new array, filtered the same way: the site name and the surrounding whitespace are removed.
$fileGetContentFileXml holds the content of the XML file; this is the content that should be replaced by the translated text stored in $textAddForTranslatingArr. Now that I have explained what these three variables do, what should I do to fix this bug?
I don't know why the content is not translated exactly as it appears in the array. Can you give me an idea, or show me where the mistake in my code is?
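One possible cause, sketched under assumptions: str_replace with arrays applies the pairs one after another, so a short search string can rewrite part of a longer one before the longer one gets its turn, leaving that entry half translated. strtr with a single array tries the longest key first at each position and does not re-scan its own output. The sample strings and the names $xml, $originals and $translations below are made up for illustration and are not taken from the question. Note also that the question's snippet prints $fileGetContentFileXml, the untouched original, rather than $updateAllContentFromXml.

```php
<?php
// Made-up sample data: "Rain" is a substring of "Midnight Rain".
$xml          = '<title>Midnight Rain</title><genre>Rain</genre>';
$originals    = ['Rain', 'Midnight Rain'];
$translations = ['Pluie', 'Pluie de minuit'];

// str_replace() works through the pairs in order: "Rain" is replaced first,
// so "Midnight Rain" no longer exists when its turn comes.
$sequential = str_replace($originals, $translations, $xml);

// The two arrays must line up one to one before they can be paired.
if (count($originals) !== count($translations)) {
    throw new RuntimeException('Original and translated segment counts differ.');
}

// strtr() with a map prefers the longest matching key at each position.
$singlePass = strtr($xml, array_combine($originals, $translations));

echo $sequential, "\n"; // <title>Midnight Pluie</title><genre>Pluie</genre>
echo $singlePass, "\n"; // <title>Pluie de minuit</title><genre>Pluie</genre>
```

If the two counts differ in the real data, the explode on WEBSITE_NAME probably split the original and translated texts differently, and that would need fixing before any replacement can line up.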

Related

Parsing big XML file - having unescaped html tags - throws error

I am trying to import data from a large 1 GB XML file into WordPress. Because the file is so big, I did some research and found that this would be the best solution: https://github.com/prewk/xml-string-streamer
I implemented a test script like this:
<?php
require('vendor/autoload.php');
// Convenience method for creating a file streamer with the default parser
$streamer = Prewk\XmlStringStreamer::createStringWalkerParser("mybigfile.xml");
$count = 1;
while ($node = $streamer->getNode()) {
echo $node . '<br>';
$simpleXmlNode = simplexml_load_string($node);
if( $simpleXmlNode AND $simpleXmlNode->getName() == 'book' )
{
var_dump( $simpleXmlNode );
echo (string)$simpleXmlNode->name. '<br>';
echo $count++. '<br>';
}
if( $count == 20 ) die;
}
Up to 10 nodes, everything seems to work fine. After that, there is a <description> element containing unescaped HTML tags (e.g. <div>), and these HTML tags cause errors.
My XML file looks somewhat like this:
<?xml version="1.0" encoding="UTF-8"?>
<source>
<lastBuildDate>2021-04-24</lastBuildDate>
<owner>Blahblah</owner>
<book>
<name><![CDATA[Once upon a time in coma]]></name>
<price><![CDATA[USD 20]]></price>
<listDate><![CDATA[2021-04-02]]></listDate>
<description><![CDATA[<div>This is a great book..</div>]]></description>
</book>
<book>
<name><![CDATA[Once upon a time in coma]]></name>
<price><![CDATA[USD 20]]></price>
<listDate><![CDATA[2021-04-02]]></listDate>
<description><![CDATA[<div>This is a great book..</div>]]></description>
</book>
<book>
<name><![CDATA[Once upon a time in coma]]></name>
<price><![CDATA[USD 20]]></price>
<listDate><![CDATA[2021-04-02]]></listDate>
<description><![CDATA[<div>This is a great book..</div>]]></description>
</book>
</source>
The content is not always the same; this is just an example. I believe the XML reader has a hard time telling which elements are XML because of the HTML elements inside the <description> tag. How can I convert the HTML tags to HTML entities on the fly?

Try setting the expectGT option to true. Have a look at https://github.com/prewk/xml-string-streamer#available-options-for-the-stringwalker-parser

php xpath query to get parent node based on value in repeating child nodes

I have an XML file structured as follows:
<pictures>
<picture>
<title></title>
<description></description>
<facts>
<date></date>
<place>Unites States</place>
</facts>
<people>
<person>John</person>
<person>Sue</person>
</people>
</picture>
<picture>
<title></title>
<description></description>
<facts>
<date></date>
<place>Canada</place>
</facts>
<people>
<person>Sue</person>
<person>Jane</person>
</people>
</picture>
<picture>
<title></title>
<description></description>
<facts>
<date></date>
<place>Canada</place>
</facts>
<people>
<person>John</person>
<person>Joe</person>
<person>Harry</person>
</people>
</picture>
</pictures>
In one case, I need to search for pictures where place="Canada". I have an XPath that does this fine, as such:
$place = "Canada";
$pics = ($pictures->xpath("//*[place='$place']"));
This pulls the entire "picture" node, so I am able to display title, description, etc.
I have another need to find all pictures where person = $person. I use the same type query as above:
$person = "John";
$pics = ($pictures->xpath("//*[person='$person']"));
In this case, the query apparently finds that there are 2 pictures with John, but I don't get any of the values for the other nodes. I'm guessing it has something to do with the repeating child node, but I can't figure out how to restructure the XPath to return the whole picture node for each match on person. I tried using attributes instead of values (and modified the query accordingly), but got the same result.
Can anyone advise what I'm missing here?

Let's replace the variables first. That takes PHP out of the picture. The problem is just the proper XPath expression.
//*[place='Canada']
matches any element node that has a child element node place with the text content Canada.
That is the facts element node, not the picture.
Getting the picture node is slightly different:
//picture[facts/place='Canada']
This would select ANY picture node at ANY DEPTH that matches the condition.
picture[facts/place='Canada']
returns the same result with the provided XML, but it is more specific: it matches only picture element nodes that are children of the document element.
Filtering on the people node works much the same way:
picture[people/person="John"]
You can even combine the two conditions:
picture[facts/place="Canada" and people/person="John"]
Here is a small demo:
// $xml holds the XML document from the question as a string.
$element = new SimpleXMLElement($xml);
$expressions = [
'//*[place="Canada"]',
'//picture[facts/place="Canada"]',
'picture[facts/place="Canada"]',
'picture[people/person="John"]',
'picture[facts/place="Canada" and people/person="John"]',
];
foreach ($expressions as $expression) {
echo $expression, "\n", str_repeat('-', 60), "\n";
foreach ($element->xpath($expression) as $index => $found) {
echo '#', $index, "\n", $found->asXml(), "\n";
}
echo "\n";
}
HINT: You are using dynamic values in your XPath expressions. String literals in XPath 1.0 do not support any kind of escaping, so a quote in the variable can break your expression. See this answer.
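A minimal sketch of one way to deal with the quoting problem from the hint. The helper name xpathLiteral is hypothetical (it is not part of SimpleXML), and the shortened inline document and the name O'Brien are made up for the demonstration. The idea is to pick whichever quote character the value does not contain, and to fall back to XPath's concat() when it contains both.

```php
<?php
// Hypothetical helper (not part of SimpleXML): turn any PHP string into a
// valid XPath 1.0 string literal.
function xpathLiteral(string $value): string
{
    if (strpos($value, "'") === false) {
        return "'" . $value . "'";
    }
    if (strpos($value, '"') === false) {
        return '"' . $value . '"';
    }
    // Both quote types present: split on ' and glue the pieces with concat().
    $parts = explode("'", $value);
    return "concat('" . implode("', \"'\", '", $parts) . "')";
}

// Shortened, made-up version of the pictures document.
$pictures = new SimpleXMLElement(
    '<pictures>'
    . '<picture><title>A</title><people><person>John</person></people></picture>'
    . '<picture><title>B</title><people><person>O\'Brien</person></people></picture>'
    . '</pictures>'
);

$person = "O'Brien";
$pics = $pictures->xpath('picture[people/person=' . xpathLiteral($person) . ']');

echo count($pics), ' ', $pics[0]->title, "\n"; // 1 B
```

Interpolating $person directly between single quotes, as in the question, would produce a broken expression for this value.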

xmldiff issues on php

I am having some issues using xmldiff package. I'm using xmldiff package 0.9.2; PHP 5.4.17; Apache 2.2.25.
For example I have two xml files: "from.xml" & "to.xml".
File "from.xml" contains:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<rott>
<NDC>321</NDC>
<NDC>123</NDC>
</rott>
</root>
File "to.xml" contains:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<rott>
<NDC>123</NDC>
<NDC>321</NDC>
</rott>
</root>
I'm using code:
$zxo = new XMLDiff\File;
$dir1 = dirname(__FILE__) . "/upload/from.xml";
$dir2 = dirname(__FILE__) . "/upload/to.xml";
$diff = $zxo->diff($dir1, $dir2);
$file = 'differences.xml';
file_put_contents($file, $diff);
I get result in "differences.xml" file:
<?xml version="1.0"?>
<dm:diff xmlns:dm="http://www.locus.cz/diffmark">
<root>
<rott>
<dm:delete>
<NDC/>
</dm:delete>
<dm:copy count="1"/>
<dm:insert>
<NDC>321</NDC>
</dm:insert>
</rott>
</root>
</dm:diff>
Could you please explain where this comes from?
<dm:delete>
<NDC/>
</dm:delete>
Also, is there a way to diff two XML files without regard to the order of the XML nodes?

What you see is the diff in the libdiffmark format. Right from that page:
<copy/> is used in places where the input subtrees are the same
The documents from your snippet have partially identical subtrees. Effectively, the instructions libdiffmark will execute are:
delete the whole subtree
copy 1 node, meaning the node is the same in both documents, so don't touch it
insert 1 new node
The order of the nodes matters. Think about what a diff would look like if the node order were ignored. Say you had 42 nodes and some of them were the same: how would the copy instruction with its count be applied? It is much easier for a diff to use the exact node order of the two documents. I found some interesting reading here about why node order can be important.
Thanks.

If the document structure is known, I think you can simply sort the necessary parts. Here's a useful article about it. Based on it, I tried some examples and managed to sort a document by node values (just as an example); please look here:
The document, library.xml:
<?xml version="1.0"?>
<library>
<book id="1003">
<title>Jquery MVC</title>
<author>Me</author>
<price>500</price>
</book>
<book id="1001">
<title>Php</title>
<author>Me</author>
<price>600</price>
</book>
<book id="1002">
<title>Where to use IFrame</title>
<author>Me</author>
<price>300</price>
</book>
<book id="1002">
<title>American dream</title>
<author>Hello</author>
<price>300</price>
</book>
</library>
The PHP code, sorting by the <title>
<?php
$dom = new DOMDocument();
$dom->load('library.xml');
$xp = new DOMXPath($dom);
$booklist = $xp->query('/library/book');
$books = iterator_to_array($booklist);
// usort() expects an integer (<0, 0, >0), so return strcmp()'s result directly.
function sort_by_title_node($a, $b)
{
$x = $a->getElementsByTagName('title')->item(0);
$y = $b->getElementsByTagName('title')->item(0);
return strcmp($x->nodeValue, $y->nodeValue);
}
usort($books, 'sort_by_title_node');
$newdom = new DOMDocument("1.0");
$newdom->formatOutput = true;
$root = $newdom->createElement("library");
$newdom->appendChild($root);
foreach ($books as $b) {
// Import each book into the new document once, then append the imported copy.
$node = $newdom->importNode($b, true);
$root->appendChild($node);
}
echo $newdom->saveXML();
And here's the result:
<?xml version="1.0"?>
<library>
<book id="1002">
<title>American dream</title>
<author>Hello</author>
<price>300</price>
</book>
<book id="1003">
<title>Jquery MVC</title>
<author>Me</author>
<price>500</price>
</book>
<book id="1001">
<title>Php</title>
<author>Me</author>
<price>600</price>
</book>
<book id="1002">
<title>Where to use IFrame</title>
<author>Me</author>
<price>300</price>
</book>
</library>
This way you can sort the relevant parts of the document before comparing. After that you can even use the DOM comparison directly. You could also reorder the nodes in place; the approach would be similar.
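As a rough sketch of the sort-then-compare idea applied to the from.xml and to.xml documents from the question: collect the children of the parent whose order should not matter, sort them, and compare the two lists. The function name sortedChildValues is hypothetical, and the documents are inlined as strings here rather than loaded from the upload directory.

```php
<?php
// Hypothetical helper: list "name=text" for every child element of the node
// at $parentPath, sorted so that document order no longer matters.
function sortedChildValues(string $xml, string $parentPath): array
{
    $dom = new DOMDocument();
    $dom->loadXML($xml);
    $xp = new DOMXPath($dom);

    $values = [];
    foreach ($xp->query($parentPath . '/*') as $child) {
        $values[] = $child->nodeName . '=' . trim($child->textContent);
    }
    sort($values, SORT_STRING);

    return $values;
}

$from = '<root><rott><NDC>321</NDC><NDC>123</NDC></rott></root>';
$to   = '<root><rott><NDC>123</NDC><NDC>321</NDC></rott></root>';

// Same children, different order: the sorted lists are identical.
$same = sortedChildValues($from, '/root/rott') === sortedChildValues($to, '/root/rott');
var_dump($same); // bool(true)
```

This only compares one level of children by name and text; nested structures or attributes would need a more thorough canonical form.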
I'm not sure this will be very useful if the number of nodes varies, say if the <NDC> tag were repeated a random number of times and its values were completely different.
And after all, I still think the simplest way would be to ask your supplier to create a more predictable document structure :)
Thanks
Anatol

PHP Recursively Edit XML Document with Simplexml

I have a series of arbitrary XML Documents that I need to parse and perform some string manipulation on each element within the document.
For example:
<sample>
<example>
<name>David</name>
<age>21</age>
</example>
</sample>
For the nodes name and age I might want to run it through a function such as strtoupper to change the case.
I am struggling to do this in a generic way. I have been trying to use RecursiveIteratorIterator with SimpleXMLIterator to achieve this but I am unable to get the parent key to update the xml document:
$iterator = new RecursiveIteratorIterator(new SimpleXMLIterator($xml->asXML()));
foreach ($iterator as $k=> $v) {
$iterator->$k = strtoupper($v);
}
This fails because $k in this example is 'name' so it's trying to do:
$xml->name = strtoupper($value);
When it needs to be
$xml->example->name = strtoupper($value);
Because the schema of the documents changes, I want something generic to process them all, but I don't know how to get the key.
Is this possible with Spl iterators and simplexml?

You are most likely looking for something I once called SimpleXML self-reference. It works here, too.
And yes, SimpleXML supports SPL and RecursiveIteratorIterator.
So first of all, you can make $xml work with tree traversal directly by opening the original XML this way:
$buffer = <<<BUFFER
<sample>
<example>
<name>David</name>
<age>21</age>
</example>
</sample>
BUFFER;
$xml = simplexml_load_string($buffer, 'SimpleXMLIterator');
// The second argument makes every parsed element a SimpleXMLIterator.
That allows you to do all the standard modifications (a SimpleXMLIterator is also a SimpleXMLElement) as well as recursive tree traversal to modify each leaf node:
$iterator = new RecursiveIteratorIterator($xml);
foreach ($iterator as $node) {
// Assigning to $node[0] writes to the element itself (the self-reference).
$node[0] = strtoupper($node);
}
This example of recursive iteration over all leaf nodes shows how to set the self-reference; the key is to assign to $node[0], as outlined in the link above.
All that is left is to output:
$xml->asXML('php://output');
Which then simply gives:
<?xml version="1.0"?>
<sample>
<example>
<name>DAVID</name>
<age>21</age>
</example>
</sample>
And that's the whole example and it should also answer your question.
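Since the question asks for something generic, the same technique could be wrapped in a small function that accepts any callable. This is only a sketch: the name mapLeafText is hypothetical, and it assumes the transformation should be applied to the text of every leaf element.

```php
<?php
// Hypothetical helper: apply $fn to the text of every leaf element and
// return the modified document as a string.
function mapLeafText(string $xmlString, callable $fn): string
{
    $xml = simplexml_load_string($xmlString, 'SimpleXMLIterator');
    if ($xml === false) {
        throw new InvalidArgumentException('Could not parse XML.');
    }

    // The default mode of RecursiveIteratorIterator visits leaves only.
    foreach (new RecursiveIteratorIterator($xml) as $node) {
        // Self-reference: assigning to $node[0] changes the element's own text.
        $node[0] = $fn((string) $node);
    }

    return $xml->asXML();
}

$result = mapLeafText(
    '<sample><example><name>David</name><age>21</age></example></sample>',
    'strtoupper'
);
echo $result;
```

Because the function never names an element, it does not depend on the schema of the document being processed.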

PHP script to echo VLC now playing XML attributes

I've been searching for a while on this and haven't had much luck. I've found plenty of resources showing how to echo data from dynamic XML, but I'm a PHP novice, and nothing I've written seems to grab and print exactly what I want, though from everything I've heard, it should be relatively easy. The source XML (located at 192.168.0.15:8080/requests/status.xml) is as follows:
<root>
<fullscreen>0</fullscreen>
<volume>97</volume>
<repeat>false</repeat>
<version>2.0.5 Twoflower</version>
<random>true</random>
<audiodelay>0</audiodelay>
<apiversion>3</apiversion>
<videoeffects>
<hue>0</hue>
<saturation>1</saturation>
<contrast>1</contrast>
<brightness>1</brightness>
<gamma>1</gamma>
</videoeffects>
<state>playing</state>
<loop>true</loop>
<time>37</time>
<position>0.22050105035305</position>
<rate>1</rate>
<length>168</length>
<subtitledelay>0</subtitledelay>
<equalizer/>
<information>
<category name="meta">
<info name="description">
000003EC 00000253 00000D98 000007C0 00009C57 00004E37 000068EB 00003DC5 00015F90 00011187
</info>
<info name="date">2003</info>
<info name="artwork_url"> file://brentonshp04/music%24/Music/Hackett%2C%20Steve/Guitar%20Noir%20%26%20There%20Are%20Many%20Sides%20to%20the%20Night%20Disc%202/Folder.jpg
</info>
<info name="artist">Steve Hackett</info>
<info name="publisher">Recall</info>
<info name="album">Guitar Noir & There Are Many Sides to the Night Disc 2
</info>
<info name="track_number">5</info>
<info name="title">Beja Flor [Live]</info>
<info name="genre">Rock</info>
<info name="filename">Beja Flor [Live]</info>
</category>
<category name="Stream 0">
<info name="Bitrate">128 kb/s</info>
<info name="Type">Audio</info>
<info name="Channels">Stereo</info>
<info name="Sample rate">44100 Hz</info>
<info name="Codec">MPEG Audio layer 1/2/3 (mpga)</info>
</category>
</information>
<stats>
<lostabuffers>0</lostabuffers>
<readpackets>568</readpackets>
<lostpictures>0</lostpictures>
<demuxreadbytes>580544</demuxreadbytes>
<demuxbitrate>0.015997290611267</demuxbitrate>
<playedabuffers>0</playedabuffers>
<demuxcorrupted>0</demuxcorrupted>
<sendbitrate>0</sendbitrate>
<sentbytes>0</sentbytes>
<displayedpictures>0</displayedpictures>
<demuxreadpackets>0</demuxreadpackets>
<sentpackets>0</sentpackets>
<inputbitrate>0.016695899888873</inputbitrate>
<demuxdiscontinuity>0</demuxdiscontinuity>
<averagedemuxbitrate>0</averagedemuxbitrate>
<decodedvideo>0</decodedvideo>
<averageinputbitrate>0</averageinputbitrate>
<readbytes>581844</readbytes>
<decodedaudio>0</decodedaudio>
</stats>
</root>
What I'm trying to write is a simple PHP script that echoes the artist's name (In this example Steve Hackett). Actually I'd like it to echo the artist, song and album, but I'm confident that if I'm shown how to retrieve one, I can figure out the rest on my own.
What little of my script which actually seems to work goes as follows. I've tried more than what's below, but I left out the bits that I know for a fact aren't working.
<?PHP
$file = file_get_contents('http://192.168.0.15:8080/requests/status.xml');
$sxe = new SimpleXMLElement($file);
foreach($sxe->...
echo "Artist: "...
?>
I think I need to use foreach and echo, but I can't figure out how to do it in a way that will print what's between those info tags.
I'm sorry if I've left anything out. I'm not only new to PHP, but I'm new to StackOverflow too. I've referenced this site in other projects, and it's always been incredibly helpful, so thanks in advance for your patience and help!
Finished working script (thanks to Stefano and all who helped!):
<?PHP
$file = file_get_contents('http://192.168.0.15:8080/requests/status.xml');
$sxe = new SimpleXMLElement($file);
$artist_xpath = $sxe->xpath('//info[@name="artist"]');
$album_xpath = $sxe->xpath('//info[@name="album"]');
$title_xpath = $sxe->xpath('//info[@name="title"]');
$artist = (string) $artist_xpath[0];
$album = (string) $album_xpath[0];
$title = (string) $title_xpath[0];
echo "<B>Artist: </B>".$artist."<br>";
echo "<B>Title: </B>".$title."<br>";
echo "<B>Album: </B>".$album."<br>";
?>

Instead of using a for loop, you can obtain the same result with XPath:
// Extraction split across two lines for clarity
$artist_xpath = $sxe->xpath('//info[@name="artist"]');
$artist = (string) $artist_xpath[0];
echo $artist;
You will have to adjust the XPath expression (i.e. change @name=... appropriately), but you get the idea. Also note that [0] is necessary because xpath() returns an array of matches (and you only need the first), and the (string) cast extracts the text contained in the node.
Besides, your XML is invalid and will be rejected by the parser because of the literal & appearing in the <info name="album"> tag.
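One common workaround for the bare ampersand, sketched here with a shortened inline copy of the feed instead of the live URL: escape every & that does not already start an entity or character reference, then parse as usual. The helper name firstText and the variables $raw and $clean are made up for this example.

```php
<?php
// Shortened stand-in for the body returned by status.xml.
$raw = <<<'XML'
<root>
<information>
<category name="meta">
<info name="artist">Steve Hackett</info>
<info name="album">Guitar Noir & There Are Many Sides to the Night Disc 2
</info>
<info name="title">Beja Flor [Live]</info>
</category>
</information>
</root>
XML;

// Escape each "&" that is not already part of an entity such as &amp; or &#38;.
$clean = preg_replace('/&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)/', '&amp;', $raw);

$sxe = new SimpleXMLElement($clean);

// Hypothetical helper: text of the first <info> with the given name, or ''.
function firstText(SimpleXMLElement $sxe, string $name): string
{
    $matches = $sxe->xpath('//info[@name="' . $name . '"]');
    return $matches ? trim((string) $matches[0]) : '';
}

$artist = firstText($sxe, 'artist');
$album  = firstText($sxe, 'album');
$title  = firstText($sxe, 'title');

echo $artist, ' / ', $title, ' / ', $album, "\n";
```

The trim() call removes the line break that the feed leaves before the closing info tag of the album entry.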

If you look at your code again, what is missing is a function that takes the first result of the XPath expression and casts that SimpleXMLElement to a string.
One way to write this once is to extend SimpleXMLElement:
class BetterXMLElement extends SimpleXMLElement
{
public function xpathString($expression) {
list($result) = $this->xpath($expression);
return (string) $result;
}
}
You then create the more specific BetterXMLElement the same way you created the SimpleXMLElement before:
$file = file_get_contents('http://192.168.0.15:8080/requests/status.xml');
$sxe = new BetterXMLElement($file);
And then you benefit in your following code:
$artist = $sxe->xpathString('//info[@name="artist"]');
$album = $sxe->xpathString('//info[@name="album"]');
$title = $sxe->xpathString('//info[@name="title"]');
echo "<B>Artist: </B>".$artist."<br>";
echo "<B>Title: </B>".$title."<br>";
echo "<B>Album: </B>".$album."<br>";
This spares you some repeated code, which also means fewer places where you can make an error :)
You could optimize this further by accepting an array of multiple XPath queries and returning all the values by name. But that is something you need to write yourself according to your specific needs. So use what you learn in programming to make programming easier :)
If you want more suggestions, here is another, very detailed example using DOMDocument, the sister library of SimpleXML. It is quite advanced but might give you some good inspiration. I think something similar is possible with SimpleXML as well, and it is probably what you're looking for in the end:
Extracting data from HTML using PHP and xPath
