Extract XML tag content with PHP

I have a PHP script that extracts data from an XML file, and so far it only reads tag attributes. How can I also extract the tag content?
XML
<test name="Example 1">
<status status="FAIL" starttime="20200501 09:36:52.452" endtime="20200501 09:37:07.159"
critical="yes">Setup failed:
Variable '${EMAIL_INPUT}' not found.</status>
</test>
PHP
foreach ($result->test as $result) {
echo $result['name'], PHP_EOL;
$endtime = $result->status;
echo $endtime['starttime'], PHP_EOL;
echo $endtime['endtime'], PHP_EOL;
echo $endtime['status'], PHP_EOL;
}
What I need is the text in-between the tags:
"Setup failed:Variable '${EMAIL_INPUT}' not found."
Thanks

To get the contents of a node, you can just cast the node to a string:
// I changed the loop to `as $test` because `as $result`
// overwrites the initial `$result` variable
foreach ($result->test as $test) {
$endtime = $test->status;
$text = (string) $endtime;
// Also `echo` will cast `$endtime` to string implicitly
echo $text;
}
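For a fuller picture, here is a minimal sketch that reads both the attributes and the text content in one pass. It assumes the `<test>` element sits inside a root element; the `<suite>` wrapper and the `$messages` array are hypothetical names added for illustration, and collapsing the whitespace is optional.

```php
<?php
// Sample document based on the XML in the question, wrapped in a
// hypothetical <suite> root so that $suite->test matches the loop above.
$source = <<<'XML'
<suite>
  <test name="Example 1">
    <status status="FAIL" starttime="20200501 09:36:52.452" endtime="20200501 09:37:07.159"
            critical="yes">Setup failed:
Variable '${EMAIL_INPUT}' not found.</status>
  </test>
</suite>
XML;

$suite = simplexml_load_string($source);

$messages = [];
foreach ($suite->test as $test) {
    // Attributes use array syntax; the text content comes from a string cast.
    $name   = (string) $test['name'];
    $status = (string) $test->status['status'];
    $text   = (string) $test->status;

    // Optional: collapse the embedded line break into a single space.
    $messages[$name] = $status . ': ' . preg_replace('/\s+/', ' ', trim($text));
}

print_r($messages);
```

The explicit `(string)` cast matters as soon as the value is stored or compared rather than echoed, because otherwise you keep a SimpleXMLElement object instead of plain text.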

Related

Need to remove headers from cURL XML response in PHP

There are a few threads about this, but I couldn't find a solution to this issue in them. I hope it doesn't violate duplicate rules.
I've tested the following code with static XML and it works great, but said XML did not contain any headers.
I'm trying to remove headers through code after making a POST request so I can continue to process the resulting XML, but I'm not having any luck with it.
This is the XML:
<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><AUTOS_Cotizar_PHPResponse xmlns="http://tempuri.org/"><AUTOS_Cotizar_PHPResult><auto xmlns=""><operacion>1555843</operacion><statusSuccess>TRUE</statusSuccess><statusText></statusText><cotizacion><cobertura><codigo>A0</codigo><descripcion>RESPONSABILIDAD CIVIL SOLAMENTE</descripcion><premio>928,45</premio><cuotas>01</cuotas><impcuotas>928,45</impcuotas></cobertura></cotizacion><datos_cotiz><suma>477250</suma><uso>901</uso></datos_cotiz></auto></AUTOS_Cotizar_PHPResult></AUTOS_Cotizar_PHPResponse></soap:Body></soap:Envelope>
this is the code:
//converting raw cURL response to XML
$temp1 = htmlspecialchars ($reply);
//replacing top headers
$temp2 = str_replace('<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><AUTOS_Cotizar_PHPResponse xmlns="http://tempuri.org/"><AUTOS_Cotizar_PHPResult>', "<<<'EOD'", $temp1);
//replacing closing header tags
$temp3 = str_replace('</AUTOS_Cotizar_PHPResult></AUTOS_Cotizar_PHPResponse></soap:Body></soap:Envelope>', "EOD;", $temp2);
//this returns the original $temp1 without having anything replaced
echo $temp3;
//simplexml conversion
$xml = simplexml_load_string($temp3);
//running through the array and printing all values
if ($xml !== false) {
foreach ($xml->cotizacion as $cotizacion) {
foreach ($cotizacion->cobertura as $cobertura) {
echo $cobertura->codigo;
echo '<br>';
echo $cobertura->descripcion;
echo '<br>';
echo $cobertura->premio;
echo '<br>';
echo $cobertura->cuotas;
echo '<br>';
echo $cobertura->impcuotas;
echo '<br>';
}
}
}
There are probably more efficient ways to do this, or maybe I'm not doing this correctly. I'm just learning right now, so feel free to correct me in any way if you want, I'd appreciate it!
Processing the response as a string is a bad idea; you should treat the content as XML and work with it that way. This uses XPath to find a starting point for processing the data (which I can't test with the current sample), but it should help with what you need to do...
// Load the original reply
$xml = simplexml_load_string($reply);
//running through the array and printing all values
if ($xml !== false) {
// Find the <auto> element (use [0] as you want the first one)
$auto = $xml->xpath("//auto")[0];
// Loop through the cotizacion elements in the auto element
foreach ($auto->cotizacion as $cotizacion) {
foreach ($cotizacion->cobertura as $cobertura) {
echo $cobertura->codigo;
echo '<br>';
echo $cobertura->descripcion;
echo '<br>';
echo $cobertura->premio;
echo '<br>';
echo $cobertura->cuotas;
echo '<br>';
echo $cobertura->impcuotas;
echo '<br>';
}
}
}
The SOAP response is still an XML document, so work with it instead of fighting it. Treating it as a string is definitely not great.
As far as I can tell, you're trying to work with all the <cotizacion> elements. It's simple to find elements inside an XML document. Read up on XPath.
$xml = simplexml_load_string($reply);
if ($xml) {
foreach ($xml->xpath('//cotizacion') as $cotizacion) {
// do your thing
}
}
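If you would rather walk the envelope explicitly than use XPath, a sketch along these lines should work. It uses a trimmed copy of the response from the question; the `$rows` array and the conversion of the decimal comma are my own additions, so treat them as assumptions about what you want to do with the values.

```php
<?php
// Trimmed copy of the response from the question.
$reply = '<?xml version="1.0" encoding="utf-8"?>'
    . '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    . '<soap:Body><AUTOS_Cotizar_PHPResponse xmlns="http://tempuri.org/">'
    . '<AUTOS_Cotizar_PHPResult><auto xmlns=""><operacion>1555843</operacion>'
    . '<cotizacion><cobertura><codigo>A0</codigo>'
    . '<descripcion>RESPONSABILIDAD CIVIL SOLAMENTE</descripcion>'
    . '<premio>928,45</premio><cuotas>01</cuotas><impcuotas>928,45</impcuotas>'
    . '</cobertura></cotizacion></auto></AUTOS_Cotizar_PHPResult>'
    . '</AUTOS_Cotizar_PHPResponse></soap:Body></soap:Envelope>';

$xml = simplexml_load_string($reply);
if ($xml === false) {
    throw new RuntimeException('Could not parse the response');
}

// Switch namespace at each level of the envelope.
$body   = $xml->children('http://schemas.xmlsoap.org/soap/envelope/')->Body;
$result = $body->children('http://tempuri.org/')
    ->AUTOS_Cotizar_PHPResponse
    ->AUTOS_Cotizar_PHPResult;
// <auto xmlns=""> is in no namespace, so call children() without arguments.
$auto = $result->children()->auto;

$rows = [];
foreach ($auto->cotizacion->cobertura as $cobertura) {
    $rows[] = [
        'codigo' => (string) $cobertura->codigo,
        // The amounts use a decimal comma; convert before doing arithmetic.
        'premio' => (float) str_replace(',', '.', (string) $cobertura->premio),
        'cuotas' => (int) (string) $cobertura->cuotas,
    ];
}

print_r($rows);
```

The point is the same as above: the SOAP wrapper does not need to be stripped, it only needs the right namespace at each step.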

PHP simplexml_load_file 2 tags in xml file

I have an xml file like this
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<Job>
<JobInfo>
<User>abc</User>
<Computer>acb</Computer>
<Started>2018/04/21-21:58:30:0182-06</Started>
<Ended>2018/04/21-23:10:10:0093-06</Ended>
</JobInfo>
<JobFlags>
<Active>Yes</Active>
<Complete>Yes</Complete>
</JobFlags>
</Job>
I use simplexml_load_file to load the file and print out the User, Computer and Complete elements.
$xml=simplexml_load_file("abc.xml") or die("Error: Cannot create object");
foreach($xml->children() as $xm) {
echo $xm->User . "<br>";
echo $xm->Computer . "<br>";
echo $xm->Complete . "<br>";
}
But it only prints out User and Computer. The result for Complete is empty.
Please help me with this, thank you!
The main issue is that you are trying to mix the fields from different elements. You can see from the original XML that <User> and <Computer> are in the <JobInfo> element and <Complete> is in the <JobFlags> element. This isn't a problem, but when you use your foreach loop, you go through each child element of <Job> and output all of the values for each element. If you change your loop to...
foreach($xml->children() as $tag => $xm) {
echo "element=".$tag . "<br>";
echo $xm->User . "<br>";
echo $xm->Computer . "<br>";
echo $xm->Complete . "<br>";
}
You get (please excuse the wrong markup, I wanted the layout)...
element=JobInfo
abc
acb
element=JobFlags
Yes
If instead you just need those three pieces of information from the document, you could access them using the full path and so just pick out the details you're after without using a foreach...
echo $xml->JobInfo->User . "<br>";
echo $xml->JobInfo->Computer . "<br>";
echo $xml->JobFlags->Complete . "<br>";
Gives you:
abc
acb
Yes
SimpleXML is a bit dated. I'm not sure that a <complete> tag is valid XML; that may be the issue.
If you can, try to use php-dom instead.
https://php.net/manual/en/book.dom.php
Example to retrieve all the tags texts.
#!/usr/bin/php
<?php
$remote =<<<'EOF'
<JobInfo>
<User>abc</User>
<Computer>acb</Computer>
<Started>2018/04/21-21:58:30:0182-06</Started>
<Ended>2018/04/21-23:10:10:0093-06</Ended>
</JobInfo>
<JobFlags>
<Active>Yes</Active>
<Complete>Yes</Complete>
</JobFlags>
EOF;
$i = 0;
$doc = new DOMDocument();
$doc->loadHTML($remote);
foreach($doc->getElementsByTagName('*') as $elem) {
echo $i++.": ".$elem->textContent;
}
The package belongs to php-xml:
Package php-dom is a virtual package provided by: php7.2-xml
Alternatives: https://php.net/manual/en/refs.xml.php
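As a variation on the DOM example above, here is a sketch that uses the XML parser instead of loadHTML. It assumes you are free to wrap the fragment in a root element; the `<Job>` wrapper and the `$values` array are names chosen for illustration.

```php
<?php
$fragment = <<<'EOF'
<JobInfo>
  <User>abc</User>
  <Computer>acb</Computer>
</JobInfo>
<JobFlags>
  <Active>Yes</Active>
  <Complete>Yes</Complete>
</JobFlags>
EOF;

$doc = new DOMDocument();
// A fragment with two top-level elements is not well-formed XML, so wrap it
// in a hypothetical <Job> root before handing it to the XML parser.
$doc->loadXML('<Job>' . $fragment . '</Job>');

$values = [];
foreach ($doc->getElementsByTagName('*') as $elem) {
    // Keep only leaf elements, i.e. those without element children.
    if ($elem->getElementsByTagName('*')->length === 0) {
        $values[$elem->nodeName] = trim($elem->textContent);
    }
}

print_r($values);
```

Parsing as XML keeps the element names case-sensitive, so `Complete` stays `Complete` rather than being lowercased the way an HTML parser would treat it.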
Try to wrap all elements into a parent tag:
<?xml version="1.0"?>
<test>
<JobInfo>
<User>abc</User>
<Computer>acb</Computer>
<Started>2018/04/21-21:58:30:0182-06</Started>
<Ended>2018/04/21-23:10:10:0093-06</Ended>
</JobInfo>
<JobFlags>
<Active>Yes</Active>
<Complete>Yes</Complete>
</JobFlags>
</test>
and then reference in your php:
<?php
$xml=simplexml_load_file("abc.xml") or die("Error: Cannot create object");
echo $xml->JobInfo->User . "<br>";
echo $xml->JobInfo->Computer . "<br>";
echo $xml->JobFlags->Complete . "<br>";
?>
It worked here.
I think it's just a matter of making a correct reference to the entire hierarchical structure of the tags.
See this example taken from http://php.net/manual/en/function.simplexml-load-file.php:
<?php
$xml = '<?xml version="1.0" encoding="UTF-8" ?>
<rss>
<channel>
<item>
<title><![CDATA[Tom & Jerry]]></title>
</item>
</channel>
</rss>';
$xml = simplexml_load_string($xml);
// echo does the casting for you
echo $xml->channel->item->title;
// but var_dump (or print_r) does not!
var_dump($xml->channel->item->title);
// so casting the SimpleXMLElement to 'string' solves this issue
var_dump((string) $xml->channel->item->title);
?>
Above will output:
Tom & Jerry
object(SimpleXMLElement)#4 (0) {}
string(11) "Tom & Jerry"
Reference:
https://stackoverflow.com/a/16972780/5074998
http://php.net/manual/en/function.simplexml-load-file.php
It works just fine for me:
abc
acb
Yes
You might be confused because you are expecting it all on three consecutive lines. Try outputting it like this instead and it may help you visualize that it is echoing all three of your requested elements (whether they exist or not) for two loops (since the XML has two children):
foreach($xml->children() as $xm) {
echo "Current Element: " . $xm->getName() . "<br />";
echo "User: " . $xm->User . "<br />";
echo "Computer: " . $xm->Computer . "<br />";
echo "Complete: " . $xm->Complete . "<br />";
}
which will output:
Current Element: JobInfo
User: abc
Computer: acb
Complete:
Current Element: JobFlags
User:
Computer:
Complete: Yes
As Nick suggested in his comment, if you don't want the <br /> breaks to print when the element doesn't exist, you can use isset like:
foreach($xml->children() as $xm) {
echo isset($xm->User) ? "{$xm->User}<br />" : '';
echo isset($xm->Computer) ? "{$xm->Computer}<br />" : '';
echo isset($xm->Complete) ? "{$xm->Complete}<br />" : '';
}
Or, as NigelRen recommended in their answer, you could skip using children() if you know the full path to the elements you need, and just use those paths instead, like:
echo $xml->JobInfo->User . "<br />";
echo $xml->JobInfo->Computer . "<br />";
echo $xml->JobFlags->Complete . "<br />";
The underlying issue is that when you were echoing $xm->Complete while traversing the JobInfo element, it output just the <br /> because $xml->JobInfo->Complete doesn't exist.

phpQuery: Replace all occurrences of text with another

I am trying to parse a website homepage and convert it into an XML file to be used as an API in my app.
So far I have successfully done so. However, the parsed text contains the & (ampersand) character, which causes the XML parser to fail.
I am looking for a solution that doesn't use CDATA or output CDATA in the XML file.
I want to replace & with "and" at every occurrence. What phpQuery method should I use?
The following causes an error in the browser because the text() method returns text with an & character in it.
<?php
require('phpQuery/phpQuery.php');
$all=phpQuery::newDocumentFileHTML('BPUT.htm', $charset = 'utf-8');
$links = $all['a.myblue'];
echo '<notice>';
foreach ($links as $link) {
echo '<text>';
echo pq($link)->text();
echo '</text>';
echo '<url>';
echo pq($link)->attr('href');
echo '</url>';
}
echo '</notice>';
?>
I do not want to use CDATA, as the CDATA tag is visible in the generated XML :
<?php
header('Content-type: text/xml');
require('phpQuery/phpQuery.php');
$all=phpQuery::newDocumentFileHTML('BPUT.htm', $charset = 'utf-8');
$links = $all['a.myblue'];
echo '<notice>';
foreach ($links as $link) {
echo '<text>';
echo "<![CDATA[";
echo pq($link)->text();
echo "]]>";
echo '</text>';
echo '<url>';
echo pq($link)->attr('href');
echo '</url>';
}
echo '</notice>';
?>
bumping for answers.
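One possible approach does not need a phpQuery method at all, since text() and attr() return plain PHP strings that can be processed before they are echoed. The sketch below uses hypothetical values for `$text` and `$href` in place of the phpQuery calls, and shows both a plain replacement and XML escaping with htmlspecialchars().

```php
<?php
// Hypothetical values standing in for pq($link)->text() and pq($link)->attr('href').
$text = 'Results & Notices';
$href = 'notice.php?id=1&type=2';

// Option 1: replace the ampersand with the word "and", as asked.
$replaced = str_replace('&', 'and', $text);

// Option 2: escape XML special characters, which keeps the original text intact.
$escapedText = htmlspecialchars($text, ENT_XML1 | ENT_QUOTES, 'UTF-8');
$escapedHref = htmlspecialchars($href, ENT_XML1 | ENT_QUOTES, 'UTF-8');

$xml = '<notice><text>' . $escapedText . '</text><url>' . $escapedHref . '</url></notice>';

// The result parses, and the parser hands back the unescaped values.
$parsed = simplexml_load_string($xml);
echo $replaced, PHP_EOL;
echo (string) $parsed->text, PHP_EOL;
echo (string) $parsed->url, PHP_EOL;
```

Escaping is probably the safer of the two, because URLs often contain & as well and replacing it there with "and" would break the link.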

how to give line breaks in xml file using php?

I am displaying MySQL data in an XML file using PHP.
I want to add line breaks in the content. If I add them, the line-break tag shows up in the content. We have to use HTML tags, but we don't want them shown in the XML content.
The output is coming out like this.
At the same time, I also want to remove the empty p tags, that is:
<![CDATA[ <p> </p>]]>
This is the code I have written for the XML.
Please help me solve these problems.
header("Content-Type: application/xml; charset=utf-8");
date_default_timezone_set("Asia/Calcutta");
$this->view->data=$this->CallModel('posts')->GalleryAndContent();
$xml = '
';
$xml.='Thehansindia
http://www.thehansindia.com
Newspaper with a difference';
foreach($this->view->data as $values)
{
$output=strip_tags($values['text_data'],"");
$output = preg_replace('/(<[^>]+) style=".*?"/i', '$1',$output);
$output = preg_replace('/(<[^>]+) class=".*?"/i', '$1', $output);
$output=preg_replace( '/style=(["\'])[^\1]*?\1/i', '', $output, -1 );
$output=preg_replace("/<([a-z][a-z0-9]*)[^>]*?(\/?)>/i",'',$output);
$output=str_replace(array("",""),array("",""),$output);
$output=str_replace(array("",""),array("",""),$output);
//$xml.="<CONTENT>"."<![CDATA[".$output."]]>"."</CONTENT>
$xml.= '<item>';
$dom = new DOMDocument;
@$dom->loadHTML($output);
$xml.="<CONTENT>";
foreach ($dom->getElementsByTagName('p') as $tag){
//$tag->nodeValue=str_replace("<![CDATA[ <p> </p> ]]>","",$tag->nodeValue);
if(!empty($tag->nodeValue)){
//$tag->nodeValue=str_replace("<![CDATA[ <p>& & &</p> ]]>","",$tag->nodeValue);
$xml.="<![CDATA["."<p>".stripslashes($tag->nodeValue)."</p>"."]]>";
}
}
$xml.="</CONTENT>";
$xml.= ' </item>';
}
Example:
//Next replace all new lines with the unicode:
$xml = str_replace("\n","&#10;", $xml);
Reference Link
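To handle both requests together (line breaks and no empty p tags), something like the following sketch may help. The `$html` sample is a hypothetical stand-in for `$values['text_data']`, and building the output with DOMDocument is a design choice of this sketch, not what the original code does.

```php
<?php
// Hypothetical stand-in for $values['text_data'] from the question.
$html = "<p>First line</p><p> </p><p>Second line</p>";

$src = new DOMDocument();
// @ hides the warnings libxml raises for HTML fragments.
@$src->loadHTML($html);

$out = new DOMDocument('1.0', 'UTF-8');
$out->formatOutput = true;
$item    = $out->appendChild($out->createElement('item'));
$content = $item->appendChild($out->createElement('CONTENT'));

$lines = [];
foreach ($src->getElementsByTagName('p') as $p) {
    // Treat non-breaking spaces as whitespace so "empty" paragraphs are skipped.
    $text = trim(str_replace("\xC2\xA0", ' ', $p->textContent));
    if ($text !== '') {
        $lines[] = $text;
    }
}

// One text node; the newline between paragraphs survives serialization.
$content->appendChild($out->createTextNode(implode("\n", $lines)));

echo $out->saveXML();
```

Letting DOMDocument serialize the text also escapes characters such as & for you, so no CDATA section is needed.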

How to find a particular value from a string

I have a string (not XML):
<headername>X-Mailer-Recptid</headername>
<headervalue>15772348</headervalue>
</header>
From this, I need to get the value 15772348, that is, the value of headervalue. How is this possible?
Use PHP DOM and traverse the headervalue tag using getElementsByTagName():
<?php
$doc = new DOMDocument;
@$doc->loadHTML('<headername>X-Mailer-Recptid</headername><headervalue>15772348</headervalue></header>');
$items = $doc->getElementsByTagName('headervalue');
for ($i = 0; $i < $items->length; $i++) {
echo $items->item($i)->nodeValue . "\n";
}
?>
This gives the following output:
15772348
[EDIT]: Code updated to suppress non-HTML warning about invalid headername and headervalue tags as they are not really HTML tags. Also, if you try to load it as XML, it totally fails to load.
This looks XML-like to me. Anyway, if you don't want to parse the string as XML (which might be a good idea), you could try something like this:
<?php
$str = "<headervalue>15772348</headervalue>";
preg_match("/<headervalue\>([0-9]+)<\/headervalue>/", $str, $matches);
print_r($matches);
?>
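To pull out just the number rather than the whole match, read the first capture group. A small sketch, using the string from the question and a `$value` variable added for illustration:

```php
<?php
$str = '<headername>X-Mailer-Recptid</headername>'
     . '<headervalue>15772348</headervalue></header>';

$value = null;
// Use # as the delimiter so the slash in the closing tag needs no escaping.
if (preg_match('#<headervalue>(\d+)</headervalue>#', $str, $matches)) {
    // $matches[0] is the whole match, $matches[1] the first capture group.
    $value = $matches[1];
    echo $value, PHP_EOL;
}
```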
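// find string short way
function my_url_search($se_action_data)
{
// $regex = '/https?\:\/\/[^\" ]+/i';
$regex = "/<headervalue>([0-9]+)<\/headervalue>/";
preg_match_all($regex, $se_action_data, $matches);
// $matches[1] holds the captured digits; $matches[0] would include the tags
$get_url = array_reverse($matches[1]);
return array_unique($get_url);
}
// The function returns an array, so print it with print_r() rather than echo
print_r(my_url_search($se_action_data));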
<?php
// Requires the Simple HTML DOM library (simple_html_dom.php)
$html = str_get_html("<headername>X-Mailer-Recptid</headername><headervalue>15772348</headervalue></header>"); // Use Html dom here
$get_value = $html->find("headervalue", 0)->plaintext;
echo $get_value;
?>
http://simplehtmldom.sourceforge.net/manual.htm#section_find
