Modify CDATA String and insert again in CDATA format? - php

Actual I have one problem. I want to append something to string in CDATA format in a XML file.
<url><![CDATA[http://www.kinguin.net/category/1356/l-a-noire-steam-key/?nosalesbooster=1&country_store=1&currency=EUR]]</url>
After I modify the string with this code:
$doc = simplexml_import_dom($dom);
$items = $doc->xpath('//url');
foreach ($items as $item) {
$node = dom_import_simplexml($item);
$node->textContent .= 'appendIT';
}
I get:
<url>http://www.kinguin.net/category/1356/l-a-noire-steam-key/?nosalesbooster=1&country_store=1&currency=EURappendIT]]</url>
But how can I say "Put it back into the CDATA format"?
And in addition. I want to send this url to my database. But without the & char. So how can I get the url without the &?
Greetings and Thank You!

As CDATA in XML stands for "character data", you can treat it as a string without parsing it. CDATA ends with ]] (assume not nested CDATA), so
$xmlstrIn = '<url><![CDATA[http://www.kinguin.net/category/1356/l-a-noire-steam-key/?nosalesbooster=1&country_store=1&currency=EUR]]</url>';
$xmlstrOut = preg_replace('|(\]\])|', 'appendIT$1', $xmlstrIn);
and send $xmlstrOut to your database.
Hope this solves the problem (unless you are doing research of SimpleXML extension or something).
Regards,
Nick

Related

PHP: Converting xml to array

I have an xml string. That xml string has to be converted into PHP array in order to be processed by other parts of software my team is working on.
For xml -> array conversion i'm using something like this:
if(get_class($xmlString) != 'SimpleXMLElement') {
$xml = simplexml_load_string($xmlString);
}
if(!$xml) {
return false;
}
It works fine - most of the time :) The problem arises when my "xmlString" contains something like this:
<Line0 User="-5" ID="7436194"><Node0 Key="<1" Value="0"></Node0></Line0>
Then, simplexml_load_string won't do it's job (and i know that's because of character "<").
As i can't influence any other part of the code (i can't open up a module that's generating XML string and tell it "encode special characters, please!") i need your suggestions on how to fix that problem BEFORE calling "simplexml_load_string".
Do you have some ideas? I've tried
str_replace("<","<",$xmlString)
but, that simply ruins entire "xmlString"... :(
Well, then you can just replace the special characters in the $xmlString to the HTML entity counterparts using htmlspecialchars() and preg_replace_callback().
I know this is not performance friendly, but it does the job :)
<?php
$xmlString = '<Line0 User="-5" ID="7436194"><Node0 Key="<1" Value="0"></Node0></Line0>';
$xmlString = preg_replace_callback('~(?:").*?(?:")~',
function ($matches) {
return htmlspecialchars($matches[0], ENT_NOQUOTES);
},
$xmlString
);
header('Content-Type: text/plain');
echo $xmlString; // you will see the special characters are converted to HTML entities :)
echo PHP_EOL . PHP_EOL; // tidy :)
$xmlobj = simplexml_load_string($xmlString);
var_dump($xmlobj);
?>

Data not saved properly into XML file

I am trying to save data to a XML file.
if (isset( $_POST['submit'])) {
$name = mysql_real_escape_string($_POST['name']);
$xml = new DOMDocument("1.0", "ISO-8859-1");
$xml->preserveWhiteSpace = false;
$xml->load('/var/www/Report/file.xml');
$element = $xml->getElementsByTagName('entries');
$newItem = $xml->createElement('reports');
$newItem->appendChild($xml->createElement('timestamp', date("F j, Y, g:i a",time())));
$newItem->appendChild($xml->createElement('name', $name));
$element -> item(0) -> appendChild($newItem);
$xml->formatOutput = true; // this adds spaces, new lines and makes the XML more readable format.
$xmlString = $xml->saveXML(); // $xmlString contains the entire String
$xml->save('/var/www/Report/file.xml');
}
Anytime I use mysql_real_escape_string() to escapes special characters in my string or try to to sanitize my data, my XML file looks like something in the image below.
I don't understand why the name start tag is missing in my XML file and why my data $name isn't saved into the XML file either. How could I to fix this. Any help would be appreciated. Thanks in advance
1) Firefox does not show the xml byte-by-byte, rather it formats it itself. Therefore empty elements appear with only one self closing tag, no matter how they appear in the xml source. If you want to see the exact xml output, then open it in a text editor.
2) mysql_real_escape_string escapes characters that are problematic for mysql, not for XML: for example it does not escape '<'. Instead it, you should use DOMDocument::createTextNode. So instead of
$newItem->appendChild($xml->createElement('name', $name));
use
$nameElement = $xml->createElement('name');
$nameElement->appendChild( $xml->createTextNode($name) );
$newItem->appendChild($nameElement);
3) As you see, the $name is empty, but I don't know what can be the problem with your POST data seeing only this code, maybe there is no <input name='name' /> in your form: I sometimes accidentally set an id attribute to inputs instead of the name attribute.

Ampersands in database

I am trying to write a php function that goes to my database and pulls a list of URLS and arranges them into an xml structure and creates an xml file.
Problem is, Some of these urls will contain an ampersand that ARE HTML encoded. So, the database is good, but currently, when my function tries to grab these URLS, the script will stop at the ampersands and not finish.
One example link from database:
http://www.mysite.com/myfile.php?select=on&league_id=8&sport=15
function buildXML($con) {
//build xml file
$sql = "SELECT * FROM url_links";
$res = mysql_query($sql,$con);
$gameArray = array ();
while ($row = mysql_fetch_array($res))
{
array_push($row['form_link']);
}
$xml = '<?xml version="1.0" encoding="utf-8"?><channel>';
foreach ($gameArray as $link)
{
$xml .= "<item><link>".$link."</link></item>";
}
$xml .= '</channel>';
file_put_contents('../xml/full_rankings.xml',$xml);
}
mysql_close($con);
session_write_close();
If i need to alter the links in the database, that can be done.
You can use PHP's html_entity_decode() on the $link to convert & back to &.
In your XML, you could also wrap the link in <![CDATA[]]> to allow it to contain the characters.
$xml .= "<item><link><![CDATA[" . html_entity_decode($link) . "]]></link></item>";
UPDATE
Just noticed you're actually not putting anything into the $gameArray:
array_push($row['form_link']);
Try:
$gameArray[] = $row['form_link'];
* #Musa looks to have noticed it first, for due credit.
Look at this line
array_push($row['form_link']);
you never put anything in the $gameArray array, it should be
array_push($gameArray, $row['form_link']);
You need to use htmlspecialchars_decode. It will decode any encoded special characters in string passed to it.
This is most likely what you are looking for:
http://www.php.net/manual/en/function.mysql-real-escape-string.php
Read the documentation, there are examples at the bottom of the page...
'&' in oracleSQL and MySQL are used in queries as a logical operator which is why it is tossing an error.
You may also want to decode the HTML...

PHP text to array and with key

I know RegExp not well, I did not succeeded to split string to array.
I have string like:
<h5>some text in header</h5>
some other content, that belongs to header <p> or <a> or <img> inside.. not important...
<h5>Second text header</h5>
So What I am trying to do is to split text string into array where KEY would be text from header and CONTENT would be all the rest content till the next header like:
array("some text in header" => "some other content, that belongs to header...", ...)
I would suggest looking at the PHP DOM http://php.net/manual/en/book.dom.php. You can read / create DOM from a document.
i've used this one and enjoyed it.
http://simplehtmldom.sourceforge.net/
you could do it with a regex as well.
something like this.
/<h5>(.*)<\/h5>(.*)<h5>/s
but this just finds the first situation. you'll have to cut hte string to get the next one.
any way you cut it, i don't see a one liner for you. sorry.
here's a crummy broken 4 liner.
$chunks = explode("<h5>", $html);
foreach($chunks as $chunk){
list($key, $val) = explode("</h5>", $chunk);
$res[$key] = $val;
}
dont parse HTML via preg_match
instead use php Class
The DOMDocument class
example:
<?php
$html= "<h5>some text in header</h5>
some other content, that belongs to header <p> or <a> or <img> inside.. not important...
<h5>Second text header</h5>";
// a new dom object
$dom = new domDocument('1.0', 'utf-8');
// load the html into the object ***/
$dom->loadHTML($html);
/*** discard white space ***/
$dom->preserveWhiteSpace = false;
$hFive= $dom->getElementsByTagName('h5');
echo $hFive->item(0)->nodeValue; // u can get all h5 data by changing the index
?>
Reference

Parsing XML with PHP (simplexml)

Firstly, may I point out that I am a newcomer to all things PHP so apologies if anything here is unclear and I'm afraid the more layman the response the better. I've been having real trouble parsing an xml file in to php to then populate an HTML table for my website. At the moment, I have been able to get the full xml feed in to a string which I can then echo and view and all seems well. I then thought I would be able to use simplexml to pick out specific elements and print their content but have been unable to do this.
The xml feed will be constantly changing (structure remaining the same) and is in compressed format. From various sources I've identified the following commands to get my feed in to the right format within a string although I am still unable to print specific elements. I've tried every combination without any luck and suspect I may be barking up the wrong tree. Could someone please point me in the right direction?!
$file = fopen("compress.zlib://$url", 'r');
$xmlstr = file_get_contents($url);
$xml = new SimpleXMLElement($url,null,true);
foreach($xml as $name) {
echo "{$name->awCat}\r\n";
}
Many, many thanks in advance,
Chris
PS The actual feed
Since no one followed my closevote, I think I can just as well put my own comments as an answer:
First of all, SimpleXml can load URIs directly and it can do so with stream wrappers, so your three calls in the beginning can be shortened to (note that you are not using $file at all)
$merchantProductFeed = new SimpleXMLElement("compress.zlib://$url", null, TRUE);
To get the values you can either use the implicit SimpleXml API and drill down to the wanted elements (like shown multiple times elsewhere on the site):
foreach ($merchantProductFeed->merchant->prod as $prod) {
echo $prod->cat->awCat , PHP_EOL;
}
or you can use an XPath query to get at the wanted elements directly
$xml = new SimpleXMLElement("compress.zlib://$url", null, TRUE);
foreach ($xml->xpath('/merchantProductFeed/merchant/prod/cat/awCat') as $awCat) {
echo $awCat, PHP_EOL;
}
Live Demo
Note that fetching all $awCat elements from the source XML is rather pointless though, because all of them have "Bodycare & Fitness" for value. Of course you can also mix XPath and the implict API and just fetch the prod elements and then drill down to the various children of them.
Using XPath should be somewhat faster than iterating over the SimpleXmlElement object graph. Though it should be noted that the difference is in an neglectable area (read 0.000x vs 0.000y) for your feed. Still, if you plan to do more XML work, it pays off to familiarize yourself with XPath, because it's quite powerful. Think of it as SQL for XML.
For additional examples see
A simple program to CRUD node and node values of xml file and
PHP Manual - SimpleXml Basic Examples
Try this...
$url = "http://datafeed.api.productserve.com/datafeed/download/apikey/58bc4442611e03a13eca07d83607f851/cid/97,98,142,144,146,129,595,539,147,149,613,626,135,163,168,159,169,161,167,170,137,171,548,174,183,178,179,175,172,623,139,614,189,194,141,205,198,206,203,208,199,204,201,61,62,72,73,71,74,75,76,77,78,79,63,80,82,64,83,84,85,65,86,87,88,90,89,91,67,92,94,33,54,53,57,58,52,603,60,56,66,128,130,133,212,207,209,210,211,68,69,213,216,217,218,219,220,221,223,70,224,225,226,227,228,229,4,5,10,11,537,13,19,15,14,18,6,551,20,21,22,23,24,25,26,7,30,29,32,619,34,8,35,618,40,38,42,43,9,45,46,651,47,49,50,634,230,231,538,235,550,240,239,241,556,245,244,242,521,576,575,577,579,281,283,554,285,555,303,304,286,282,287,288,173,193,637,639,640,642,643,644,641,650,177,379,648,181,645,384,387,646,598,611,391,393,647,395,631,602,570,600,405,187,411,412,413,414,415,416,649,418,419,420,99,100,101,107,110,111,113,114,115,116,118,121,122,127,581,624,123,594,125,421,604,599,422,530,434,532,428,474,475,476,477,423,608,437,438,440,441,442,444,446,447,607,424,451,448,453,449,452,450,425,455,457,459,460,456,458,426,616,463,464,465,466,467,427,625,597,473,469,617,470,429,430,615,483,484,485,487,488,529,596,431,432,489,490,361,633,362,366,367,368,371,369,363,372,373,374,377,375,536,535,364,378,380,381,365,383,385,386,390,392,394,396,397,399,402,404,406,407,540,542,544,546,547,246,558,247,252,559,255,248,256,265,259,632,260,261,262,557,249,266,267,268,269,612,251,277,250,272,270,271,273,561,560,347,348,354,350,352,349,355,356,357,358,359,360,586,590,592,588,591,589,328,629,330,338,493,635,495,507,563,564,567,569,568/mid/2891/columns/merchant_id,merchant_name,aw_product_id,merchant_product_id,product_name,description,category_id,category_name,merchant_category,aw_deep_link,aw_image_url,search_price,delivery_cost,merchant_deep_link,merchant_image_url/format/xml/compression/gzip/";
$zd = gzopen($url, "r");
$data = gzread($zd, 1000000);
gzclose($zd);
if ($data !== false) {
$xml = simplexml_load_string($data);
foreach ($xml->merchant->prod as $pr) {
echo $pr->cat->awCat . "<br>";
}
}
<?php
$xmlstr = file_get_contents("compress.zlib://$url");
$xml = simplexml_load_string($xmlstr);
// you can transverse the xml tree however you want
foreach ($xml->merchant->prod as $line) {
// $line->cat->awCat -> you can use this
}
more information here
Use print_r($xml) to see the structure of the parsed XML feed.
Then it becomes obvious how you would traverse it:
foreach ($xml->merchant->prod as $prod) {
print $prod->pId;
print $prod->text->name;
print $prod->cat->awCat; # <-- which is what you wanted
print $prod->price->buynow;
}
$url = 'you url here';
$f = gzopen ($url, 'r');
$xml = new SimpleXMLElement (fread ($f, 1000000));
foreach($xml->xpath ('//prod') as $name)
{
echo (string) $name->cat->awCatId, "\r\n";
}

Categories