Retrieve OuterXML From XML Child Node - php

I need to retrieve the OuterXML for each speak tag.
For example, I need to retrieve this data for the first speak tag in test.ssml:
<speak xmlns="https://www.w3.org/2001/10/synthesis" version="1.0" xml:lang="en-US">
<voice name="en-US-GuyNeural">
<prosody rate="0.00%">Test 1</prosody>
</voice>
</speak>
index.php
set_time_limit(0);
require_once('src/Config.php');
$fileName = __DIR__.DIRECTORY_SEPARATOR.'test.ssml';
$fileContent = file_get_contents($fileName);
// $fileContent = preg_replace( "/\r|\n/", "", $fileContent );
$xml=simplexml_load_file($fileName);
$reader = new XMLReader();
foreach($xml->speak as $child)
{
echo $child->getName() . " ::: " . htmlspecialchars( $reader->readOuterXml ( $child ) ). "<br>";
}
test.ssml (markup lost in extraction; the surviving text content is: all tracks.mp3, bookmarks.dat, Test 1, Test 2)
[Screenshots not recoverable: Current Output in Browser, Desired Output]

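You can get the XML directly with SimpleXML's asXML() method; as far as I can tell, you don't need XMLReader at all. (XMLReader::readOuterXml() takes no arguments and works on the reader's current node, and here the reader never opened a document.)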
$xml=simplexml_load_file($fileName);
foreach($xml->speak as $child)
{
echo $child->asXML()."<br />";
}
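As a self-contained illustration of the answer above, here is a minimal sketch. The inline document is a hypothetical stand-in for test.ssml (whose real content is not shown), and htmlspecialchars() is added on the assumption that the result is viewed in a browser, where unescaped tags would be interpreted rather than displayed.

```php
<?php
// Hypothetical stand-in for test.ssml: a wrapper root holding two speak elements.
$source = '<root>'
    . '<speak version="1.0"><voice name="en-US-GuyNeural">Test 1</voice></speak>'
    . '<speak version="1.0"><voice name="en-US-GuyNeural">Test 2</voice></speak>'
    . '</root>';

$xml = simplexml_load_string($source);

// asXML() called on a child element returns that element's own markup,
// including the opening and closing tags (its "outer XML").
$outer = array();
foreach ($xml->speak as $child) {
    $outer[] = $child->asXML();
}

// Escape the markup so a browser shows the tags instead of interpreting them.
foreach ($outer as $markup) {
    echo htmlspecialchars($markup) . "<br />\n";
}
```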


PHP gzread, gzfile, gzopen, etc.. all strip tags off of XML and return only the values [duplicate]

I have .gz files that contain XML files. I've tried every combination of the approaches shown in the code below. Whenever one of the gz... functions "works", it returns only the values contained inside the XML files, with all the tags and metadata gone. For example, if the XML file looks like this:
<?xml version="1.0" encoding="UTF-8" ?>
<tag1>
<taga>
This
</taga>
<tagb>
is the stuff
</tagb>
</tag1>
<tag2>
<taga>
I get but only
</taga>
<tagb>
This
</tagb>
</tag2>
What I get is:
This is the stuff I get but only This
Here's the code:
<?php
$mailfileObj->zipfile = 'path/to/gzfile.gz'; //ignore the fact that it says zipfile, it is a .gz file
try{
$opengzfile = gzopen($mailfileObj->zipfile, "r");
$contents = gzread($opengzfile, filesize($mailfileObj->zipfile));
gzclose($opengzfile);
var_dump($contents);
echo '<br>';
//$opengzfile = fopen($mailfileObj->zipfile, "r");
//$contents = fread($opengzfile, filesize($mailfileObj->zipfile));
//fclose($opengzfile);
//$contents = file_get_contents($mailfileObj->zipfile);
$contents2 = '';
$lines = gzfile($mailfileObj->zipfile);
foreach ($lines as $line) {
echo $line;
$contents2 = $contents2.$line;
}
//var_dump($contents);
//echo '<br>';
//var_dump($contents);
//echo $contents . '<br><br>';
//$xmlfilegz = $mailfileObj->filename.'.xml';
//$openxmlfile = fopen($xmlfilegz, "w");
//fwrite($openxmlfile, $contents);
//fclose($openxmlfile);
$opengzfile = fopen($mailfileObj->zipfile, "r");
$contents2 = fread($opengzfile, filesize($mailfileObj->zipfile));
fclose($opengzfile);
//$contents2 = file_get_contents($mailfileObj->zipfile);
//$contents2 = gzdecode($contents);
$contents2 = gzinflate($contents);
//$contents2 = gzuncompress($contents);
var_dump($contents2);
}
catch(Exception $e){
echo 'Caught exception: ' . $e->getMessage() . '<br>';
}
?>
What is wrong here? What am I missing?
Thank you.
You're putting the XML in an HTML web page, so the browser is interpreting the XML tags as HTML tags.
Use htmlentities() to encode them so they'll be rendered literally.
foreach ($lines as $line) {
echo htmlentities($line);
$contents2 = $contents2.$line;
}
You might want to show this in a <pre> block so the newlines and indentation will be preserved.
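A minimal sketch of the whole round trip, assuming the zlib extension is available. The sample XML and the temporary file are made up for the demonstration. Note also that the length argument of gzread() counts uncompressed bytes, so passing the filesize() of the compressed file can cut the output short; gzfile() avoids that by reading the entire file.

```php
<?php
// Made-up sample data, written to a temporary .gz file for the demonstration.
$xml = "<?xml version=\"1.0\" encoding=\"UTF-8\" ?>\n<tag1>\n  <taga>This</taga>\n</tag1>\n";
$path = tempnam(sys_get_temp_dir(), 'gz');
file_put_contents($path, gzencode($xml));

// gzfile() decompresses the whole file and returns it as an array of lines.
$lines = gzfile($path);
$contents = implode('', $lines);

// Escape the markup so the browser displays the tags literally,
// and wrap it in <pre> so newlines and indentation are preserved.
$escaped = htmlspecialchars($contents, ENT_QUOTES, 'UTF-8');
echo '<pre>' . $escaped . '</pre>';

unlink($path);
```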

parsing html document for anchor tag

Say I have an HTML page like this:
» Download MP4 « - <b>144p (Video Only)</b> - <span> 19.1</span> MB<br />
I want to parse it with the Simple HTML DOM parser and get "Download MP4 144p 19.1" as output. I tried this code:
foreach($displaybody->find('a') as $element) {
echo $element->innertext . '<br/>';
}
It returned only "Download MP4". How do I get the remaining values (144p and 19.1)? Please help me out.
You can't target only the <a> tag, since some of the text you're trying to access isn't inside it. Target the document itself and then use ->plaintext:
$html = <<<EOT
» Download MP4 « - <b>144p (Video Only)</b> - <span> 19.1</span> MB<br />
EOT;
$displaybody = str_get_html($html);
echo $displaybody->plaintext;
Here is another way of accessing each row, using DOMDocument with XPath:
// load the sites html page in DOMDocument
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$html_page = file_get_contents('http://www.mohammediatechnologies.in/download/downloadtest.php?name=8KPEiGqDQHg');
$dom->loadHTML(mb_convert_encoding($html_page, 'HTML-ENTITIES', 'UTF-8'));
libxml_clear_errors();
$xpath = new DOMXpath($dom);
$data = array();
// target elements which is inside an anchor and a line break (treat them as each row)
$links = $xpath->query('//*[following-sibling::a and preceding-sibling::br]');
$temp = '';
foreach($links as $link) { // for each rows of the link
$temp .= $link->textContent . ' '; // get all text contents
if($link->tagName == 'br') {
$unit = $xpath->evaluate('string(./preceding-sibling::text()[1])', $link);
$data[] = $temp . $unit; // push them inside an array
$temp = '';
}
}
echo '<pre>';
print_r($data);
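For a single row, the same idea can be sketched with only PHP's built-in DOM extension. The fragment below is an assumption: the original anchor markup is not shown, so the href is a placeholder and the guillemets are left out to keep the example free of encoding concerns.

```php
<?php
// Assumed fragment: the anchor's real href is unknown, so "#" is a placeholder.
$html = '<a href="#">Download MP4</a> - <b>144p (Video Only)</b> - <span> 19.1</span> MB<br />';

$dom = new DOMDocument();
libxml_use_internal_errors(true); // silence warnings about the missing document structure
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// string(...) returns the text content of the first node matched by each expression.
$label   = trim($xpath->evaluate('string(//a)'));
$quality = trim($xpath->evaluate('string(//b)'));
$size    = trim($xpath->evaluate('string(//span)'));

echo $label . ' ' . $quality . ' ' . $size . "\n";
```

Querying each element separately keeps the three values apart, which is convenient if the quality and size need to be stored in different fields.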

PHP Get xml nodes by index

How can I loop through any XML file to get each node and its values?
My struggle is: I have 3 XML files:
<namespace>
<node>
<value_a>A</value_a>
<value_b>B</value_b>
</node>
</namespace>
<global>
<country>
<code>UK</code>
</country>
</global>
<geoNames>
<country>
<countryCode>Australia</countryCode>
</country>
</geoNames>
I read them with three near-identical functions that extract the information from the XML and store it as variables by writing a .php data file. Here is one of them:
$parsed_xml_content = "";
$xml = simplexml_load_file("http://" . $srvname . $dirpath . $file_xmlData);
$obj = $xml->xpath("//geonames");
foreach ($obj[0]->country as $country)
{
$keys = (array_keys((array) $country));
$i = 0;
$parsed_xml_content .= "\t\"" . $country->countryCode . "\" => Array(\n";
foreach ($country as $val)
{
$parsed_xml_content .= "\t\t\"$keys[$i]\" => \"$val\",\n";
$i++;
}
$parsed_xml_content .= "\t),\n";
}
$fo = fopen($locpath . $file_roots, "w");
fwrite($fo, "<?php \$isoGeoData = Array(\n" . $parsed_xml_content . "\n); ?>");
fclose($fo);
How can I rewrite it so that it uses indexes instead of node names such as $country->countryCode? Managing three functions gets messy.
Here is a piece of code that I normally use for array-to-XML or XML-to-array conversion:
php-array-to-xml or xml-to-array
You can use the same class, or just copy its toArray function into your script.
After creating the array, you can use PHP's serialize() function and write the result with fwrite().
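The linked class is not reproduced here, so the following is only a sketch of the general idea, not that class's code. The function name xmlToArray is hypothetical. It walks any SimpleXMLElement without knowing the node names in advance, and every child name maps to a list so that repeated elements are handled uniformly.

```php
<?php
// Hypothetical helper: converts any element into nested arrays without
// hard-coding node names. Leaf elements become trimmed strings.
function xmlToArray(SimpleXMLElement $node)
{
    $children = $node->children();
    if (count($children) === 0) {
        return trim((string) $node);
    }
    $result = array();
    foreach ($children as $name => $child) {
        // Each name maps to a list, so repeated siblings are kept in order.
        $result[$name][] = xmlToArray($child);
    }
    return $result;
}

$xml = simplexml_load_string(
    '<geoNames><country><countryCode>Australia</countryCode></country></geoNames>'
);
$data = xmlToArray($xml);

// var_export() yields valid PHP source, an alternative to serialize()
// when the goal is a .php data file.
$php = "<?php \$isoGeoData = " . var_export($data, true) . ";\n";
echo $php;
```

With this shape, one function covers all three files, and values are reached by position, for example $data['country'][0]['countryCode'][0].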

Why is only one (the last) XML file saved?

I use the code below to save the contents of the XML addresses I have in an array. However, only one XML file is saved, specifically the last one. What am I missing here?
$filenames = array('xml url','xml url','xml url');
foreach( $filenames as $filename) {
$xml = simplexml_load_file( $filename );
$xml->asXML("test.xml");
}
You appear to be opening each XML file, then saving them in the same location. File 1 is written, then File 2 overwrites it, then File 3... In short, the last file will overwrite the previous ones, and therefore "only the last one is saved".
What exactly are you trying to do here?
You save them all as the same name, so of course the earlier ones will be lost.
Try this:
$filenames = array('xml url','xml url','xml url');
foreach( $filenames as $key => $filename) {
$xml = simplexml_load_file( $filename );
$xml->asXML('test' . $key. '.xml');
}
That should save the files sequentially as test0.xml, test1.xml, test2.xml and so on.
If you want all your loaded XML URLs to be combined into a single file, you can do something like this:
$filenames = array('xml url','xml url','xml url');
// A well-formed XML document has exactly one root element, so the loaded
// documents are imported under a new wrapper root (the name "files" is arbitrary)
$merged = new DOMDocument('1.0', 'UTF-8');
$root = $merged->appendChild($merged->createElement('files'));
foreach( $filenames as $filename) {
$xml = simplexml_load_file( $filename );
if ($xml === false) {
continue; // skip sources that fail to load
}
// Import a deep copy of this document's root element into the merged document
$root->appendChild($merged->importNode(dom_import_simplexml($xml), true));
}
// Now we can save the entire XML as one file
$merged->save('test.xml');

XMLReader and doctype

I need to parse an XML file, including its doctype. I tried XMLReader, but when I reach a node of type 10 (doctype), I can't get its value.
Is there a way to extract the doctype from an XML file with XMLReader?
Edit: as requested, some sample code. For now it is nothing more than a dump.
$reader = new XMLReader( );
$filename = 'test.xhtml';
$reader->open($filename);
while( $reader->read( ) )
{
$nodeType = $reader->nodeType;
$nodeName = $reader->name;
$nodeValue = $reader->value;
if( $nodeType == 10 )
{
echo $nodeType ."\n";
echo $nodeName ."\n";
echo $nodeValue ."\n";
echo $reader->localName ."\n";
echo $reader->namespaceURI ."\n";
echo $reader->prefix ."\n";
echo $reader->xmlLang ."\n";
echo $reader->readString() . "\n";
echo $reader->readInnerXML() . "\n";
while( $reader->moveToNextAttribute( ) )
{
echo $reader->name . "=" . $reader->value;
}
}
}
You can use DOM to read the DOCTYPE data:
$doc = new DOMDocument();
$doc->loadXML($xmlData);
var_dump($doc->doctype->publicId);
var_dump($doc->doctype->systemId);
var_dump($doc->doctype->name);
var_dump($doc->doctype->entities);
var_dump($doc->doctype->notations);
I have not found a way to do this with XMLReader despite a lot of looking. However you can use DOMDocument to read the doctype quite easily, then revert to XMLReader to read the rest of the stream. For example, to get the system ID part of the doctype before processing the rest of the XML file:
$doc = new DOMDocument();
$doc->load($xmlfile);
$systemId = $doc->doctype->systemId;
unset($doc);
// Then proceed with XMLReader:
$reader = new XMLReader();
$reader->open($xmlfile);
while($reader->read())
{
// etc
I suppose that this may not be practical in all circumstances but it worked for me while processing very large XML files for which I needed to read the system ID from the doctype.
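The two-step approach can be tried on an in-memory document. The sketch below assumes a small XHTML sample (invented for the demonstration) and uses XMLReader's XML() method in place of open(), so that no file is needed.

```php
<?php
// Invented sample document with an XHTML 1.0 Strict doctype.
$xmlData = '<?xml version="1.0"?>'
    . '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    . '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
    . '<html xmlns="http://www.w3.org/1999/xhtml"><head><title>t</title></head><body/></html>';

// Step 1: read the doctype through DOM. loadXML() does not fetch the
// external DTD unless LIBXML_DTDLOAD is requested.
$doc = new DOMDocument();
$doc->loadXML($xmlData);
$name     = $doc->doctype->name;
$publicId = $doc->doctype->publicId;
$systemId = $doc->doctype->systemId;
unset($doc);

// Step 2: stream the same data with XMLReader for the actual processing.
$reader = new XMLReader();
$reader->XML($xmlData);
$firstElement = null;
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT) {
        $firstElement = $reader->name; // stop at the first element for this demo
        break;
    }
}
$reader->close();

echo $name . "\n" . $publicId . "\n" . $systemId . "\n" . $firstElement . "\n";
```

The cost of this design is that the document is parsed twice, once fully in memory by DOM, which is the trade-off the answer above already hints at for very large files.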
