I'm currently working a project that has me working with XML a lot. I have to take an XML response and decrypt each text node and then do various tasks with the data. The problem I'm having is taking the response and processing each text node. Originally I was using the XMLToArray library, and that worked fine I would change the XML into an array and then loop through the array and decrypt the values. However some of the XML response I'm dealing with have repeated tags and the XMLToArray library will only return the last values.
Is there a good way that I can take an XML response and process all the text nodes and easily putting the values into an array that has a similar structure to the response?
Thanks in advance.
I would use SimpleXML.
Here's a small example of using it. It loads and parses XML from http://www.w3schools.com/xml/plant_catalog.xml and then outputs values of "COMMON" and "PRICE" tags of each "PLANT" tag.
$xml = simplexml_load_file('http://www.w3schools.com/xml/plant_catalog.xml');
foreach ( $xml->PLANT as $plantNode ) {
echo $plantNode->COMMON, ' - ', $plantNode->PRICE, "\n";
}
If you have any problems with adapting it to your needs, just give an example of your XML so that we can help with it.
All those XML to array libraries are a remain of the times where PHP 4 would force you to write your own XML parser almost from scratch. In recent PHP versions you have a good set of XML libraries that do the hard job. I particularly recommend SimpleXML (for small files) and XMLReader (for large files). If you still find them complicate, you can try phpQuery.
You might want to give SimpleXML a try. Plus it comes by default in php so you dont need to install
Check out SimpleXML, it may offer a bit more for what you are looking for.
Related
I'm having some trouble parsing malformed XML in PHP. In particular I'm querying a third party webservice that returns data in an XML format without encoding the XML entities in actual data. For example one of the the elements contains an ASCII heart, '<3', without the quotes, which the XML parser sees as an opening tag. It should be '<3'.
Right now I'm simply passing the XML string into a SimpleXMLElement which, predictably, fails on these instances. I've done some looking around and it seems like PHP Tidy package might be able to help me, but the amount of configuration you can do is overwhelming :(
Thus, I'm just wondering if anyone else has had a problem like this and, if so, how they were able to solve it.
Thanks!
Try tidy.repairString:
php > $tidy = new tidy();
php > $repaired = $tidy->repairString("<foo>I <3 Philadelphia</foo>", array("input-xml"=>1));
php > print($repaired);
<foo>I <3 Philadelphia</foo>
php > $el = new SimpleXMLElement($repaired);
Read the content as a string.
htmlspecialchars(preg_replace('/[\x-\x8\xb-\xc\xe-\x1f]/','',$string))
Load the transformed string in SimpleXMLElement
It worked for me so far.
I'm working on some system for a few hours now and this little thing is too much for me to think logically about at the moment.
Normally I would wait a few hours but this is a last minute job and I need to finish this.
Here's my problem:
I have an XML file that gets posted to my PHP file, the PHP file inserts certain data into a DB, but some XML nodes have the same name:
<accessoires>
<accessoire>value1</accessoire>
<accessoire>value2</accessoire>
<accessoire>value3</accessoire>
</accessoires>
Now I want to get a var $acclist which contains all values seperated by a comma:
value1,value2,value3,
I bet the solution to this is very easy but I'm at the known point where even the easiest piece of code becomes a hassle. And googling only comes up with nodes that in some way have their own identifiers.
Could someone help me out please?
You can try simplexml_load_string to parse the html then call implode on the node after casting to an array.
NOTE This code was tested in php 5.4.6 and behaves as expected.
<?php
$xml = '<accessoires>
<accessoire>value1</accessoire>
<accessoire>value2</accessoire>
<accessoire>value3</accessoire>
</accessoires>';
$dat = simplexml_load_string($xml);
echo implode(",",(array)$dat->accessoire);
For 5.3.x I had to change to
$xml = '<accessoires>
<accessoire>value1</accessoire>
<accessoire>value2</accessoire>
<accessoire>value3</accessoire>
</accessoires>';
$dat = simplexml_load_string($xml);
$dat = (array)$dat;
echo implode(",",$dat["accessoire"]);
You do this by taking a library that is able to parse and process XML, for example with SimpleXML:
implode(',', iterator_to_array($accessoires->accessoire, FALSE));
The key part here is to use iterator_to_array() as SimpleXML offers the same-named child-elements here as an iterator. Otherwise $accessoires->accessoire gives you auto-magically only the first element (if any).
I'm a somewhat experienced PHP scripter, however I just dove into parsing XML and all that good stuff.
I just can't seem to wrap my head around why one would use a separate XML parser instead of just using the explode function, which seems to be just as simple. Here's what I've been doing (assuming there is a valid XML file at the path xml.php):
$contents = file_get_contents("xml.php");
$array1 = explode("<a_tag>", $contents);
$array2 = explode("</a_tag>", $array1[1]);
$data = $array2[0];
So my question is, what is the practical use for an XML parser if you can just separate the values into arrays and extract the data from that point?
Thanks in advance! :)
Excuse me for not going into details but for starters try parsing
$contents = '<a xmlns="urn:something">
<a_tag>
<b>..</b>
<related>
<a_tag>...</a_tag>
</related>
</a_tag>
<foo:a_tag xmlns:foo="urn:something">
<![CDATA[This is another <a_tag> element]]>
</foo:a_tag>
</a>';
with your explode-approach. When you're done we can continue with some trickier things ;-)
In a nutshell, its consistency. Before XML came into wide use there were numerous undocumented formats for keeping information in files. One of the motivators behind XML was to create a well defined, standard document format. With this well defined format in place, a general set of parsing tools could be developed that would work consistently on documents so long as the documents adhered to the aforementioned well defined format.
In some specific cases, your example code will work. However, if the document changes
...
<!-- adding an attribute -->
<a_tag foo="bar">Contents of the Tag</a_tag>
...
...
<!-- adding a comment to the contents -->
<a_tag>Contents <!-- foobar --> of the Tag</a_tag>
...
Your parsing code will probably break. Code written using a correctly defined XML parser will not.
XML parsers:
Handle encoding
May have xpath support
Allow you to easily modify and save the XML; append/remove child nodes, add/remove attributes, etc.
Don't need to load the whole file into memory (except from DOM parsers)
Know about namespaces
...
How would you explode the same file if a_tag had an attribute?
explode("<a_tag>" ... will work differently than explode("<a_tag attr='value'>" ..., after all.
XML Parsers understand the XML specification. Explode can only handle the simplest of cases, and will most likely fail in a lot of instances of that case.
Using a proven XML parsing method will make the code more maintainable and easy to read. It will also make it more easily adaptable should the schema change, and it can make it easier to determine error conditions. XPath and XSLT exist for a reason, they are proven ways to deal with XML data in a sensible, legible manner. I'd suggest you use whichever is applicable in your given situation. Remember, just because you think you're only writing code for one specific purpose, you never know what a piece of well-written code could evolve into.
are there build in functions in latest versions of php specially designed to aid in this task ?
Use a DOM parser like SimpleXML to split the HTML code into nodes, and walk through the nodes to build the array.
For broken/invalid HTML, SimpleHTMLDOM is more lenient (but it's not built in).
String replace and explode would work if the HTML code is clean and always the same, as soon as you have new attributes it will brake.
So only dependable solution would be using regular expressions or XML/HTML parser.
Check http://php.net/manual/en/book.dom.php
An alternative to using a native DOM parser could be using YQL. This way you dont have to do the actual parsing yourself. The YQL Web Service enables applications to query, filter, and combine data from different sources across the Internet.
For instance, to grab the HTML table with the class example given at
http://www.w3schools.com/html/html_tables.asp
you can do
$yql = 'http://tinyurl.com/yql-table-grab';
$yql = json_decode(file_get_contents($yql));
print_r( $yql->query->results );
I've deliberated shortened the URL so it does not mess up the answer. $yql actually links to the YQL API, adds some options and contains the query:
select * from html
where xpath="//table[#class='example']"
and url="http://www.w3schools.com/html/html_tables.asp"
YQL can return JSON and XML. I've made it return JSON and decoded this then, which then results in a nested structure of stdClass objects and Arrays (so it's not all arrays). You have to see if that fits your needs.
You try out the interactive YQL console to see how it works.
i dont know if this is the faster , but you can check this class (using preg_replace)
http://wonshik.com/snippet/Convert-HTML-Table-into-a-PHP-Array
If you want to convert the html-description of a table, here's how I would do it:
remove all closing tags (</...>) ( http://php.net/manual/de/function.str-replace.php)
split string at opening tags (<...>) using a regular expression ( http://php.net/manual/en/function.split.php)
You have to work out the details on your own, since I do not know if you want to handle different lines as subarrays or you want to merge all lines into one big array or something else.
you could use the explode-function to turn the table cols and rows into arrays.
see: php explode
Is there any php libraries or API's that help when dealing with X12 documents in php? Googling around doesn't help much, so looking for people with experience in this field.
If you know which Segments and what the meaning in all segments. Then its just about php
$file = file_get_contents('/edi.x12');
$segments = explode(~\n,$file);
foreach($segments as $segment){
$elements = explode('*',$segment);
foreach($elements as $element){
switch($elements[0]){
case 'ISA':
break;
/// And so on
}
}
}
Then you will have an Array consisting all Segments in the file. If your just iterate over the array it is possible to get all Elements for given Segment.
But for creating a x12 Edi file it is a little bit more tricky.
I dont see the point in first convert to Xml.
Doing a quick google search, I found a couple of tools that will convert X12 documents to XML. PHP has made a lot of progress in the area of XML parsing.
Is converting to XML first an option?
not PHP, but python: http://bots.sourceforge.net
translates from and to x12.