Troubleshooting parsing of an XML document using PHP SimpleXML

I've used XPath to process XML elements before, but I'm struggling to get the syntax right for this particular XML.
I'm trying to parse a Guardian API response. Sample response:
<response user-tier="approved" current-page="1" start-index="1" page-size="10" pages="1" total="10" status="ok">
<results>
<tag type="series" web-title="Cycling" section-name="Life and style" id="lifeandstyle/series/cycling" api-url="http://content.guardianapis.com/lifeandstyle/series/cycling" section-id="lifeandstyle" web-url="http://www.guardian.co.uk/lifeandstyle/series/cycling"/>
<tag type="keyword" web-title="Cycling" section-name="Sport" id="sport/cycling" api-url="http://content.guardianapis.com/sport/cycling" section-id="sport" web-url="http://www.guardian.co.uk/sport/cycling"/>
<tag type="keyword" web-title="Cycling" section-name="Life and style" id="lifeandstyle/cycling" api-url="http://content.guardianapis.com/lifeandstyle/cycling" section-id="lifeandstyle" web-url="http://www.guardian.co.uk/lifeandstyle/cycling"/>
</results>
</response>
Here is my first try coding it in PHP (I've connected using cURL):
$news_items = new SimpleXMLElement($result); //loads the result of the cURL into a simpleXML response
$news_items = $guardian_response->xpath('results');
foreach ($news_items as $item) { //for each statement every entry will load the news_item and the web_url for the document
$item_block = "<p class=\"web_title\">";
$item_block = "<p class=\"web_url\">";
}
It doesn't retrieve anything. Are there any flaws in my code?

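<?php
// Return the named attribute of a SimpleXMLElement as a string, or null if it is absent.
function getAttribute($object, $attribute) {
    foreach ($object->attributes() as $name => $value) {
        if ($name == $attribute) { return (string) $value; }
    }
    return null;
}
try {
    $xml = simplexml_load_file("parse.xml");
    // simplexml_load_file() returns false on failure rather than throwing
    if ($xml === false) {
        throw new Exception("Could not load parse.xml");
    }
    /* Pay attention to the XPath, include all parents */
    $result = $xml->xpath('/response/results/tag');
    // each() was removed in PHP 8, so iterate with foreach
    foreach ($result as $node) {
        echo getAttribute($node, "type");
    }
} catch (Exception $e) {
    echo "Exception on line ".$e->getLine()." of file ".$e->getFile()." : ".$e->getMessage()."<br/>";
}
?>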
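SimpleXML also exposes attributes through array-style access, so a helper function is not strictly required. The following is only a minimal sketch: `$response` is a hypothetical, trimmed-down copy of the sample response from the question, and `$types` is an illustrative variable name.

```php
<?php
// Hypothetical, shortened copy of the sample response (attribute names taken from the question).
$response = <<<XML
<response status="ok" total="2">
  <results>
    <tag type="series" web-title="Cycling" web-url="http://www.guardian.co.uk/lifeandstyle/series/cycling"/>
    <tag type="keyword" web-title="Cycling" web-url="http://www.guardian.co.uk/sport/cycling"/>
  </results>
</response>
XML;

$xml = simplexml_load_string($response);
if ($xml === false) {
    exit("Could not parse the response as XML\n");
}

$types = [];
// xpath() returns an array of SimpleXMLElement objects, one per <tag>.
foreach ($xml->xpath('/response/results/tag') as $tag) {
    // Attributes are read with array syntax; cast to string to get the plain value.
    $types[] = (string) $tag['type'];
    echo $tag['web-title'], ' => ', $tag['web-url'], "\n";
}
```

Hyphenated attribute names such as web-url work here because they are passed as array keys rather than property names.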

Related

Get elements from a XML content by PHP

I am trying to get elements from this XML content, but my code returns nothing:
<results>
<error>
<string>i</string>
<description>Make I uppercase</description>
<precontext></precontext>
<suggestions>
<option>I</option>
</suggestions>
<type>grammar</type>
</error>
</results>
And this is my code to extract the errors whose type is grammar:
$dom = new DOMDocument();
$dom->loadXml($output);
$params = $dom->getElementsByTagName('error'); // Find Sections
$k=0;
foreach ($params as $param) //go to each section 1 by 1
{
if($param->type == "grammar"){
echo $param->description;
}else{
echo "other type";
}
The problem is that the script returns nothing.
You can use simplexml_load_string():
$output = '<results>
<error>
<string>i</string>
<description>Make I uppercase</description>
<precontext></precontext>
<suggestions>
<option>I</option>
</suggestions>
<type>grammar</type>
</error>
</results>';
$xml = simplexml_load_string($output);
foreach($xml->error as $item)
{
//echo (string)$item->type;
if($item->type == "grammar"){
echo $item->description;
}else{
echo "other type";
}
}
You apparently haven't configured PHP to report errors because your code triggers:
Notice: Undefined property: DOMElement::$type
You need to grab <type> the same way you grab <error>, using DOM methods like e.g. getElementsByTagName(). Same for node value:
if ($param->getElementsByTagName('type')->length && $param->getElementsByTagName('type')[0]->nodeValue === 'grammar') {
// Feel free to add additional checks here:
echo $param->getElementsByTagName('description')[0]->nodeValue;
}else{
echo "other type";
}
I think this is what you want.
<?php
$output = '<results>
<error>
<string>i</string>
<description>Make I uppercase</description>
<precontext></precontext>
<suggestions>
<option>I</option>
</suggestions>
<type>grammar</type>
</error>
</results>';
$dom = new DOMDocument();
$dom->loadXml($output);
$params = $dom->getElementsByTagName('error'); // Find Sections
$k=0;
foreach ($params as $param) //go to each section 1 by 1
{
$string = $param->getElementsByTagName( "string" )->item(0)->nodeValue;
$description = $param->getElementsByTagName( "description" )->item(0)->nodeValue;
$option = $param->getElementsByTagName( "option" )->item(0)->nodeValue;
$type = $param->getElementsByTagName( "type" )->item(0)->nodeValue;
echo $type;
if($type == "grammar"){
echo $description ;
}else{
echo "other type";
}
}
?>
You're mixing DOM with SimpleXML. This is possible, but you would need to convert the DOM element node into a SimpleXML instance with simplexml_import_dom().
Or you use Xpath. getElementsByTagName() is a low level DOM method. Using Xpath expressions allows for more specific access with a lot less code.
$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXpath($document);
foreach ($xpath->evaluate('//error') as $error) {
var_dump(
[
'type' => $xpath->evaluate('string(type)', $error),
'description' => $xpath->evaluate('string(description)', $error)
]
);
}
Output:
array(2) {
["type"]=>
string(7) "grammar"
["description"]=>
string(16) "Make I uppercase"
}
Xpath expressions allow for conditions as well. For example, because <type> is a child element here rather than an attribute, you could fetch all grammar errors using //error[type = "grammar"].
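As a sketch of the conditional form: the predicate below tests the text of the <type> child element, so only matching <error> nodes are returned. The second <error> entry is a made-up addition for illustration, and `$descriptions` is a hypothetical variable name.

```php
<?php
// The first <error> comes from the question; the second one is invented for this example.
$xml = <<<XML
<results>
  <error>
    <string>i</string>
    <description>Make I uppercase</description>
    <type>grammar</type>
  </error>
  <error>
    <string>teh</string>
    <description>Possible typo</description>
    <type>spelling</type>
  </error>
</results>
XML;

$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXPath($document);

$descriptions = [];
// [type = "grammar"] keeps only <error> elements that have a <type> child with that text.
foreach ($xpath->evaluate('//error[type = "grammar"]') as $error) {
    // string(description) returns the text of the first <description> child, relative to $error.
    $descriptions[] = $xpath->evaluate('string(description)', $error);
}
print_r($descriptions);
```

Filtering in the expression keeps the PHP loop free of if/else branches, which may be easier to maintain once more error types appear.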

Cannot parse XML using simplexml_load_string

I have tried various methods from other answers, and many more.
I even tried a function posted in one of them.
The XML looks something like this:
<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" xmlns:a="http://www.w3.org/2005/08/addressing"><s:Header><a:Action s:mustUnderstand="1">http://tempuri.org/IFooEntryOperation/SaveFooStatusResponse</a:Action></s:Header><s:Body><SaveFooStatusResponse xmlns="http://htempuri.org/"><SaveFooStatusResult xmlns:b="http://schemas.datacontract.org/2004/07/FooAPI.Entities.Foo" xmlns:i="http://www.w3.org/2001/XMLSchema-instance"><b:AWBNumber>999999999</b:AWBNumber><b:IsError>true</b:IsError><b:Status><b:FooEntryStatus><b:StatusCode>Foo_ENTRY_FAILURE</b:StatusCode><b:StatusInformation>InvalidEmployeeCode</b:StatusInformation></b:FooEntryStatus></b:Status></SaveFooStatusResult></SaveFooStatusResponse></s:Body></s:Envelope>
And here's one example of my code (I have a dozen variations):
$ReturnData = $row["ReturnData"]; // string frm a database
if (strpos($ReturnData, "s:Envelope") !== false){
$ReturnXML = new SimpleXMLElement($ReturnData);
$xml = simplexml_load_string($ReturnXML);
$StatusCode = $xml["b:StatusCode"];
echo "<br>StatusCode: " . $StatusCode;
$IsError = $xml["b:IsError"];
echo "<br>IsError: " . $IsError;
}
Another option I tried:
$test = json_decode(json_encode($xml, 1); //this didn't work either
I either get an empty array or I get errors like:
"Fatal error: Uncaught exception 'Exception' with message 'String
could not be parsed as XML"
I have tried so many things that I may have lost track of where my code is right now. Please help, I am really stuck.
I also tried:
$ReturnXML = new SimpleXMLElement($ReturnData);
foreach( $ReturnXML->children('b', true)->entry as $entries ) {
echo (string) 'Summary: ' . simplexml_load_string($entries->StatusCode->children()->asXML(), null, LIBXML_NOCDATA) . "<br />\n";
}
Method 1.
You can try the code snippet below to parse it into an array:
$p = xml_parser_create();
xml_parse_into_struct($p, $xml, $values, $indexes);// $xml containing the XML
xml_parser_free($p);
echo "Index array\n";
print_r($indexes);
echo "\nVals array\n";
print_r($values);
Method 2.
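function XMLtoArray($xml) {
    $previous_value = libxml_use_internal_errors(true);
    $dom = new DOMDocument('1.0', 'UTF-8');
    $dom->preserveWhiteSpace = false;
    $loaded = $dom->loadXML($xml);
    // Read the errors before restoring the old setting, because disabling internal errors clears them
    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous_value);
    if (!$loaded || $errors) {
        return [];
    }
    return DOMtoArray($dom);
}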
function DOMtoArray($root) {
$result = array();
if ($root->hasAttributes()) {
$attrs = $root->attributes;
foreach ($attrs as $attr) {
$result['#attributes'][$attr->name] = $attr->value;
}
}
if ($root->hasChildNodes()) {
$children = $root->childNodes;
if ($children->length == 1) {
$child = $children->item(0);
if (in_array($child->nodeType,[XML_TEXT_NODE,XML_CDATA_SECTION_NODE]))
{
$result['_value'] = $child->nodeValue;
return count($result) == 1
? $result['_value']
: $result;
}
}
$groups = array();
foreach ($children as $child) {
if (!isset($result[$child->nodeName])) {
$result[$child->nodeName] = DOMtoArray($child);
} else {
if (!isset($groups[$child->nodeName])) {
$result[$child->nodeName] = array($result[$child->nodeName]);
$groups[$child->nodeName] = 1;
}
$result[$child->nodeName][] = DOMtoArray($child);
}
}
}
return $result;
}
You can get an array using print_r(XMLtoArray($xml));
I don't know how you would do this using SimpleXMLElement, but since you have tried so many things I assume the exact method is not important. In that case the following, which uses DOMDocument and DOMXPath, may be of interest.
/* The SOAP response */
$strxml='<?xml version="1.0" encoding="UTF-8"?>
<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" xmlns:a="http://www.w3.org/2005/08/addressing">
<s:Header>
<a:Action s:mustUnderstand="1">http://tempuri.org/IFooEntryOperation/SaveFooStatusResponse</a:Action>
</s:Header>
<s:Body>
<SaveFooStatusResponse xmlns="http://htempuri.org/">
<SaveFooStatusResult xmlns:b="http://schemas.datacontract.org/2004/07/FooAPI.Entities.Foo" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<b:AWBNumber>999999999</b:AWBNumber>
<b:IsError>true</b:IsError>
<b:Status>
<b:FooEntryStatus>
<b:StatusCode>Foo_ENTRY_FAILURE</b:StatusCode>
<b:StatusInformation>InvalidEmployeeCode</b:StatusInformation>
</b:FooEntryStatus>
</b:Status>
</SaveFooStatusResult>
</SaveFooStatusResponse>
</s:Body>
</s:Envelope>';
/* create the DOMDocument and manually control errors */
libxml_use_internal_errors( true );
$dom=new DOMDocument;
$dom->validateOnParse=true;
$dom->recover=true;
$dom->strictErrorChecking=true;
$dom->loadXML( $strxml );
libxml_clear_errors();
/* Create the XPath object */
$xp=new DOMXPath( $dom );
/* Register the various namespaces found in the XML response */
$xp->registerNamespace('b','http://schemas.datacontract.org/2004/07/FooAPI.Entities.Foo');
$xp->registerNamespace('i','http://www.w3.org/2001/XMLSchema-instance');
$xp->registerNamespace('s','http://www.w3.org/2003/05/soap-envelope');
$xp->registerNamespace('a','http://www.w3.org/2005/08/addressing');
/* make XPath queries for whatever pieces of information you need */
$Action=$xp->query( '//a:Action' )->item(0)->nodeValue;
$StatusCode=$xp->query( '//b:StatusCode' )->item(0)->nodeValue;
$StatusInformation=$xp->query( '//b:StatusInformation' )->item(0)->nodeValue;
printf(
"<pre>
%s
%s
%s
</pre>",
$Action,
$StatusCode,
$StatusInformation
);
The output from the above:
http://tempuri.org/IFooEntryOperation/SaveFooStatusResponse
Foo_ENTRY_FAILURE
InvalidEmployeeCode
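For comparison, the same lookups appear to be possible with SimpleXML as well, provided the namespace is registered before running the XPath query. This is only a sketch: the envelope below is a shortened copy of the response from the question, and the variable names `$statusCode` and `$isError` are illustrative.

```php
<?php
// Shortened copy of the SOAP response from the question.
$strxml = <<<XML
<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope">
  <s:Body>
    <SaveFooStatusResponse xmlns="http://htempuri.org/">
      <SaveFooStatusResult xmlns:b="http://schemas.datacontract.org/2004/07/FooAPI.Entities.Foo">
        <b:IsError>true</b:IsError>
        <b:Status>
          <b:FooEntryStatus>
            <b:StatusCode>Foo_ENTRY_FAILURE</b:StatusCode>
          </b:FooEntryStatus>
        </b:Status>
      </SaveFooStatusResult>
    </SaveFooStatusResponse>
  </s:Body>
</s:Envelope>
XML;

$envelope = simplexml_load_string($strxml);
if ($envelope === false) {
    exit("String could not be parsed as XML\n");
}

// Bind the prefix "b" to the namespace URI for the XPath queries that follow.
$envelope->registerXPathNamespace('b', 'http://schemas.datacontract.org/2004/07/FooAPI.Entities.Foo');

// xpath() returns an array of matches; take the first one if it exists.
$codes = $envelope->xpath('//b:StatusCode');
$flags = $envelope->xpath('//b:IsError');
$statusCode = isset($codes[0]) ? (string) $codes[0] : null;
$isError = isset($flags[0]) ? (string) $flags[0] : null;

echo "StatusCode: $statusCode\nIsError: $isError\n";
```

Array access such as $xml["b:StatusCode"] reads attributes of the current element, not descendant elements, which is one reason the attempts in the question return nothing.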

How to parse linkedin xml so I get single fields values

I have this code:
$xml_response = $linkedin->getProfile("~:(id,firstName,lastName,email-address)");
which generates the following result xml:
<person>
<id>c3g9fdgdbP9-</id>
<first-name>Shoen</first-name>
<last-name>Vergue</last-name>
<email-address>manager@glob....beg.com</email-address>
</person>
How do I get, for example, the email value?
I tried this:
$mail=$xml_response['email-address'];
but it returns nothing.
Thank you in advance.
Check out the SimpleXML Parser, and try something like this:
libxml_use_internal_errors(true);
$xml = simplexml_load_string($xml_response);
if ($xml === false) {
echo "Failed loading XML: ";
foreach(libxml_get_errors() as $error) {
echo "<br>", $error->message;
}
} else {
echo $xml->{"email-address"};
}

Getting SOAP xsi:type from server response

I get a response from a SOAP server which has zero or more transactions of different types in each response.
Each transaction type is an extension of a base transaction type.
Different transaction types are processed differently.
Is there a way in PHP to get the transaction type for each of the transactions in the response
(other than trying to figure out the difference in elements within each complex type)?
There are a lot of types and a lot of elements in each type.
Is there any class which could get this?
The following is just an illustration:
<transactions xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ns2:type1">
<id>24111</id><something>00000000</something><name>Blah</name>
</transactions>
<transactions xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ns2:type8">
<id>24111</id><somethingelse>011</somethingelse>
</transactions>
I'm not quite sure if this answer fits your question exactly. The following code snippet reads the value of the type attribute through its namespace; it does not resolve the namespaced type that the value refers to.
It is done with PHP's own Document Object Model.
<?php
$str = <<<XML
<content>
<transactions xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ns2:type1">
<id>24111</id>
<something>00000000</something>
<name>Blah</name>
</transactions>
<transactions xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ns2:type8">
<id>24111</id>
<somethingelse>011</somethingelse>
</transactions>
</content>
XML;
$doc = new DomDocument();
$doc->loadXML($str);
$nodeList = $doc->getElementsByTagName('transactions');
foreach ($nodeList as $element) {
$value = $element->getAttributeNS('http://www.w3.org/2001/XMLSchema-instance', 'type');
echo $value . "\n";
}
This will output the two given types "ns2:type1" and "ns2:type8".
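If SimpleXML is preferred over DOM, namespaced attributes can be read by passing the namespace URI to attributes(). A minimal sketch, assuming the same wrapped sample as above; `$types` and `$localNames` are hypothetical variable names.

```php
<?php
$str = <<<XML
<content>
  <transactions xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ns2:type1">
    <id>24111</id>
    <something>00000000</something>
    <name>Blah</name>
  </transactions>
  <transactions xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ns2:type8">
    <id>24111</id>
    <somethingelse>011</somethingelse>
  </transactions>
</content>
XML;

$content = simplexml_load_string($str);

$types = [];
$localNames = [];
foreach ($content->transactions as $transaction) {
    // attributes() with a namespace URI returns only the attributes in that namespace.
    $xsi = $transaction->attributes('http://www.w3.org/2001/XMLSchema-instance');
    $type = (string) $xsi->type;
    $types[] = $type;
    // Drop the prefix ("ns2:") to keep just the local type name.
    $parts = explode(':', $type, 2);
    $localNames[] = end($parts);
}
print_r($types);
```

The local names could then drive a switch or a lookup table that dispatches each transaction to its own handler.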
You can also parse your elements with simple_html_dom.
Here is an example:
<?php
include "simple_html_dom.php";
$html_nb = '
<transactions xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ns2:type1"><id>24111</id><something>00000000</something><name>Blah</name>
</transactions>
<transactions xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ns2:type8"><id>24111</id><somethingelse>011</somethingelse>
</transactions>';
function chtml($str){
if(strpos($str, "<html>") !== false)
return '<html><whole_code>'.$str.'</whole_code></html>';
else
return "<whole_code>".$str."</whole_code>";
}
function find_element_type($str){
if(preg_match_all("/\<(.*?)\>/i", $str, $matches))
return $matches[1][0];
else
return false;
}
function get_xsi_type($str){
if(preg_match_all("/xsi\:type\=\"(.*?)\"/i", $str, $matches))
return $matches[1][0];
else
return false;
}
$html = new simple_html_dom();
$html_2 = new simple_html_dom();
$html->load(chtml($html_nb));
$max_type = 10;
$element = $html->find('whole_code');
$e = $element[0]->innertext;
$html_2->load(chtml($e));
$k = 0;
while($html_2->find("whole_code",false)->children($k) != "")
{
$all = $html_2->find("whole_code",false)->children($k);
echo get_xsi_type($all) . "<br>";
echo find_element_type($all) . " : " .$all."<br>";
$k++;
}
echo "<hr>";
The result :
ns2:type1
transactions xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ns2:type1" : 2411100000000Blah
ns2:type8
transactions xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ns2:type8" : 24111011

How to delete xml Dom document in php

I searched for this problem and found some questions, but they didn't address my error.
I'm trying to remove a child of my DOM document, and when I call $x->removeChild($key); nothing happens.
$xmlreq = new DOMDocument;
$xmlreq->loadXML($xmlStr);
$x = $xmlreq->getElementsByTagName('*');
foreach($x as $key)
{
if (substr($key->nodeValue,0,3)=="{{{" and substr($key->nodeValue,-3)=="}}}")
{
$field = explode("|",substr($key->nodeValue,3,strlen($key->nodeValue)-6));
if((int)$field[3]==0)
{
if(trim($_POST[$field[2]])=="")
{
$x->removeChild($key);
}else{
$key->nodeValue = trim($_POST[$field[2]]);
}
}elseif((int)$field[3]==1)
{
if(trim($_POST[$field[2]])=="")
{
$errors.="";
}else{
$key->nodeValue = trim($_POST[$field[2]]);
}
}else{
}
}
}
header("content-type: application/xml");
print $xmlreq->saveXml();
and this is my xml:
<epp xmlns="urn:ietf:params:xml:ns:epp-1.0">
<command>
<check>
<contact:check xmlns:contact="http://epp.nic.ir/ns/contact-1.0">
<contact:id>ghhg-ghgh</contact:id>
<contact:id>45</contact:id>
<contact:id>45</contact:id>
<contact:id>45</contact:id>
<contact:authInfo>
<contact:pw>1561651321321</contact:pw>
</contact:authInfo>
</contact:check>
</check>
<clTRID>TEST-12345</clTRID>
</command>
</epp>
and I want to delete one of the <contact:id>45</contact:id> elements.
Your loop does nothing since the outer conditional is looking for a node where nodeValue starts with {{{ and ends with }}}:
foreach($x as $key)
{
if (substr($key->nodeValue,0,3)=="{{{" and substr($key->nodeValue,-3)=="}}}")
Additionally, there's no removeChild() method in DOMNodeList. You probably want to fetch the node's parent first and call its removeChild() method instead.
A possible alternative:
$x = $xmlreq->getElementsByTagName('*');
$remove = TRUE;
foreach($x as $key)
{
if( $key->nodeName=='contact:id' && $key->nodeValue=='45' ){
if($remove){
$key->parentNode->removeChild($key);
$remove = FALSE;
}
}
}
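Because a DOMNodeList is live, removing nodes while iterating over it can skip elements. One way around that, sketched here under the assumption that the XML is a shortened copy of the document from the question, is to copy the list into a plain array first and then remove the node through its parent. `$remaining` is an illustrative variable name.

```php
<?php
// Shortened copy of the EPP document from the question.
$xmlStr = <<<XML
<epp xmlns="urn:ietf:params:xml:ns:epp-1.0">
  <command>
    <check>
      <contact:check xmlns:contact="http://epp.nic.ir/ns/contact-1.0">
        <contact:id>ghhg-ghgh</contact:id>
        <contact:id>45</contact:id>
        <contact:id>45</contact:id>
      </contact:check>
    </check>
  </command>
</epp>
XML;

$contactNs = 'http://epp.nic.ir/ns/contact-1.0';

$doc = new DOMDocument();
$doc->loadXML($xmlStr);

// Snapshot the live node list so that removals cannot disturb the iteration.
$ids = iterator_to_array($doc->getElementsByTagNameNS($contactNs, 'id'));
foreach ($ids as $id) {
    if ($id->nodeValue === '45') {
        // removeChild() belongs to the parent node, not to the node list.
        $id->parentNode->removeChild($id);
        break; // remove only the first match
    }
}

$remaining = $doc->getElementsByTagNameNS($contactNs, 'id')->length;
echo $doc->saveXML();
```

Dropping the break would remove every matching element instead of just one.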
