I'm using the W3C validator API, and I get this kind of response:
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">
<env:Body>
<m:markupvalidationresponse env:encodingStyle="http://www.w3.org/2003/05/soap-encoding" xmlns:m="http://www.w3.org/2005/10/markup-validator">
<m:uri>http://myurl.com/</m:uri>
<m:checkedby>http://validator.w3.org/</m:checkedby>
<m:doctype>-//W3C//DTD XHTML 1.1//EN</m:doctype>
<m:charset>utf-8</m:charset>
<m:validity>false</m:validity>
<m:errors>
<m:errorcount>1</m:errorcount>
<m:errorlist>
<m:error>
<m:line>7</m:line>
<m:col>80</m:col>
<m:message>character data is not allowed here</m:message>
<m:messageid>63</m:messageid>
<m:explanation> <![CDATA[
PAGE HTML IS HERE
]]>
</m:explanation>
<m:source><![CDATA[ HTML AGAIN ]]></m:source>
</m:error>
...
</m:errorlist>
</m:errors>
<m:warnings>
<m:warningcount>0</m:warningcount>
<m:warninglist>
</m:warninglist>
</m:warnings>
</m:markupvalidationresponse>
</env:Body>
</env:Envelope>
How can I extract some variables from there?
I need validity, errorcount and, if possible, line, col, and message for each entry in the list of errors :)
Is there an easy way to do this?
You can load the XML string into a SimpleXMLElement with simplexml_load_string and then find the elements using XPath. It's important to register the namespaces involved with registerXPathNamespace before using XPath.
$xml = file_get_contents('example.xml'); // $xml should be the XML source string
$doc = simplexml_load_string($xml);
$doc->registerXPathNamespace('m', 'http://www.w3.org/2005/10/markup-validator');
$nodes = $doc->xpath('//m:markupvalidationresponse/m:validity');
$validity = strval($nodes[0]);
echo 'is valid: ', $validity, "\n";
$nodes = $doc->xpath('//m:markupvalidationresponse/m:errors/m:errorcount');
$errorcount = strval($nodes[0]);
echo 'total errors: ', $errorcount, "\n";
$errors = $doc->xpath('//m:markupvalidationresponse/m:errors/m:errorlist/m:error');
foreach ($errors as $error) {
    // Paths without a leading slash are evaluated relative to the current <m:error>.
    $nodes = $error->xpath('m:line');
    $line = strval($nodes[0]);
    $nodes = $error->xpath('m:col');
    $col = strval($nodes[0]);
    $nodes = $error->xpath('m:message');
    $message = strval($nodes[0]);
    echo 'line: ', $line, ', column: ', $col, ' message: ', $message, "\n";
}
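If you want the values in a variable rather than echoed, the same XPath approach can be wrapped in a function. This is only a sketch: parseValidatorResponse() and its array layout are made-up names for illustration, not part of the validator API, and a missing element simply becomes null (or 0 after the integer cast).

```php
<?php
// Hypothetical helper: collects validity, errorcount and the error list
// from a validator response into a plain array.
function parseValidatorResponse(string $xml): array
{
    $ns = 'http://www.w3.org/2005/10/markup-validator';

    $doc = simplexml_load_string($xml);
    if ($doc === false) {
        throw new InvalidArgumentException('Response is not well-formed XML');
    }

    // Text of the first match for $path, or null when nothing matches.
    $first = function (SimpleXMLElement $ctx, string $path) use ($ns): ?string {
        $ctx->registerXPathNamespace('m', $ns);
        $hits = $ctx->xpath($path);
        return $hits ? (string) $hits[0] : null;
    };

    $doc->registerXPathNamespace('m', $ns);
    $errors = [];
    foreach ($doc->xpath('//m:errorlist/m:error') as $error) {
        $errors[] = [
            'line'    => (int) $first($error, 'm:line'),
            'col'     => (int) $first($error, 'm:col'),
            'message' => $first($error, 'm:message'),
        ];
    }

    return [
        'validity'   => $first($doc, '//m:validity') === 'true',
        'errorcount' => (int) $first($doc, '//m:errorcount'),
        'errors'     => $errors,
    ];
}
```

Calling parseValidatorResponse($xml) on the response from the question should give an array whose 'errors' entry holds one sub-array per <m:error>.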
You should be using a SOAP library to get this in the first place. There are several options: NuSOAP, PHP's built-in SOAP extension (http://php.net/manual/en/book.soap.php), and Zend Framework, which also has a SOAP client and server. Whichever implementation you use will give you access to the data in some way. Doing a var_dump() on whatever holds the initial response should help you navigate it.
If you would rather use PHP's DOMDocument class, you don't have to know XPath to get this working. An example:
$url = "http://www.google.com";
$xml = new DOMDocument();
$xml->load("http://validator.w3.org/check?uri=".urlencode($url)."&output=soap12");
$doctype = $xml->getElementsByTagNameNS('http://www.w3.org/2005/10/markup-validator', 'doctype')->item(0)->nodeValue;
$valid = $xml->getElementsByTagNameNS('http://www.w3.org/2005/10/markup-validator', 'validity')->item(0)->nodeValue;
$errorcount = $xml->getElementsByTagNameNS('http://www.w3.org/2005/10/markup-validator', 'errorcount')->item(0)->nodeValue;
$warningcount = $xml->getElementsByTagNameNS('http://www.w3.org/2005/10/markup-validator', 'warningcount')->item(0)->nodeValue;
$errors = $xml->getElementsByTagNameNS('http://www.w3.org/2005/10/markup-validator', 'error');
foreach ($errors as $error) {
    echo "<br>line: ".$error->childNodes->item(1)->nodeValue;
    echo "<br>col: ".$error->childNodes->item(3)->nodeValue;
    echo "<br>message: ".$error->childNodes->item(5)->nodeValue;
}
// The item() indexes are odd (1, 3, 5) because the whitespace text node between each pair of tags also counts as an item.
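Because those indexes shift whenever the whitespace between tags changes, a variant that looks each field up by name may be more robust. A minimal sketch, assuming the same namespace URI as above; firstText() and listErrors() are hypothetical helper names introduced here:

```php
<?php
const VALIDATOR_NS = 'http://www.w3.org/2005/10/markup-validator';

// Hypothetical helper: text of the first descendant with the given
// local name, or null if there is no such element.
function firstText(DOMElement $parent, string $localName): ?string
{
    $node = $parent->getElementsByTagNameNS(VALIDATOR_NS, $localName)->item(0);
    return $node === null ? null : trim($node->nodeValue);
}

// Returns one array per <m:error>, independent of whitespace text nodes.
function listErrors(DOMDocument $doc): array
{
    $out = [];
    foreach ($doc->getElementsByTagNameNS(VALIDATOR_NS, 'error') as $error) {
        $out[] = [
            'line'    => firstText($error, 'line'),
            'col'     => firstText($error, 'col'),
            'message' => firstText($error, 'message'),
        ];
    }
    return $out;
}
```

With the $xml document loaded as in the example above, listErrors($xml) gives the same fields whether or not the response is pretty-printed.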
What is the best way to format XML within a PHP class?
$xml = "<element attribute=\"something\">...</element>";
$xml = '<element attribute="something">...</element>';
$xml = '<element attribute=\'something\'>...</element>';
$xml = <<<EOF
<element attribute="something">
</element>
EOF;
I'm pretty sure it is the last one!
With DOM you can do
$dom = new DOMDocument;
$dom->preserveWhiteSpace = FALSE;
$dom->loadXML('<root><foo><bar>baz</bar></foo></root>');
$dom->formatOutput = TRUE;
echo $dom->saveXML();
gives:
<?xml version="1.0"?>
<root>
<foo>
<bar>baz</bar>
</foo>
</root>
See the descriptions of the DOMDocument::formatOutput and DOMDocument::preserveWhiteSpace properties.
This function works perfectly: you don't have to use any XML DOM library or anything. Just pass the generated XML string into it, and it will parse it and generate a new one with indentation and line breaks.
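For reuse, the DOM approach can be wrapped in a small function. This is a sketch only: prettyPrintXml() is a name invented here, and throwing on malformed input is one possible choice, not something the answer above prescribes.

```php
<?php
// Hypothetical wrapper around the DOMDocument approach shown above.
function prettyPrintXml(string $xml): string
{
    if ($xml === '') {
        throw new InvalidArgumentException('Input is empty');
    }

    $dom = new DOMDocument();
    // Must be set before loading, so existing indentation is discarded.
    $dom->preserveWhiteSpace = false;
    $dom->formatOutput = true;

    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $loaded = $dom->loadXML($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        throw new InvalidArgumentException('Input is not well-formed XML');
    }
    return $dom->saveXML();
}
```

Usage would be something like echo prettyPrintXml('<root><foo><bar>baz</bar></foo></root>');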
function formatXmlString($xml){
$xml = preg_replace('/(>)(<)(\/*)/', "$1\n$2$3", $xml);
$token = strtok($xml, "\n");
$result = '';
$pad = 0;
$matches = array();
while ($token !== false) :
if (preg_match('/.+<\/\w[^>]*>$/', $token, $matches)) :
$indent=0;
elseif (preg_match('/^<\/\w/', $token, $matches)) :
$pad--;
$indent = 0;
elseif (preg_match('/^<\w[^>]*[^\/]>.*$/', $token, $matches)) :
$indent=1;
else :
$indent = 0;
endif;
$line = str_pad($token, strlen($token)+$pad, ' ', STR_PAD_LEFT);
$result .= $line . "\n";
$token = strtok("\n");
$pad += $indent;
endwhile;
return $result;
}
Here is an example using XMLWriter:
$w = new XMLWriter;
$w->openMemory();
$w->setIndent(true);
$w->startElement('foo');
$w->startElement('bar');
$w->writeElement("key", "value");
$w->endElement();
$w->endElement();
echo $w->outputMemory();
Output:
<foo>
<bar>
<key>value</key>
</bar>
</foo>
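One practical reason to prefer XMLWriter over building strings by hand is that it escapes text content for you. A minimal sketch; buildElement() is a hypothetical helper and the sample value is made up:

```php
<?php
// Hypothetical helper: serialises a single element with text content.
function buildElement(string $name, string $value): string
{
    $w = new XMLWriter();
    $w->openMemory();
    // writeElement() opens the element, writes the escaped text, and closes it.
    $w->writeElement($name, $value);
    return $w->outputMemory();
}

// The "&", "<" and ">" come out as entity references in the result.
echo buildElement('key', 'Fish & <chips>'), "\n";
```

The same value pasted into a heredoc or quoted string would produce XML that does not parse.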
The first is better if you plan to embed values into the XML; the second is better for humans to read. Neither is good if you really intend to work with XML.
However, if you intend to write a simple fire-and-forget function that takes XML as an input parameter, then I would say use the first method, because you will need to embed parameters at some point.
I personally would use PHP's SimpleXML extension. It's very easy to use, and its built-in XPath support makes working with the data returned in XML a dream.
In PHP, I have this code for making an XML header for the Plesk API.
$request = <<<EOF
<packet version="1.6.7.0">
<mail>
<update>
<set>
<filter>
<site-id>$site_id</site-id>
<mailname>
<name>$name</name>
<autoresponder>
<enabled>true</enabled>
<subject>$subject</subject>
<text>$mail_body</text>
<end_date>$date</end_date>
</autoresponder>
</mailname>
</filter>
</set>
</update>
</mail>
</packet>
EOF;
However, I get this response: 1014 Parser error: Cannot parse the XML from the source specified
I have tried formatting the XML with 2, 3, and 4 spaces and with tabs, and it still can't be parsed.
What am I doing wrong?
You can't reliably create valid XML by string concatenation, especially when you have complex content like an email text.
Not all characters are allowed inside XML tags: you have to properly escape the characters that are not allowed. Fortunately, PHP has parsers that do this job for you.
First of all, create an empty XML template (check its validity using an XML validator):
$xml = '<?xml version="1.0" encoding="utf-8" ?>
<packet version="1.6.7.0">
<mail>
<update>
<set>
<filter>
<site-id/>
<mailname>
<name/>
<autoresponder>
<enabled/>
<subject/>
<text/>
<end_date/>
</autoresponder>
</mailname>
</filter>
</set>
</update>
</mail>
</packet>
';
Then, load it into a DOMDocument object and init a DOMXPath object:
$dom = new DomDocument();
$dom->loadXML( $xml );
$xpath = new DOMXPath( $dom );
Then, find each node that you want to change and set/update its node value:
$nodes = $xpath->query( 'mail/update/set/filter/site-id' );
$nodes->item(0)->nodeValue = $site_id;
$nodes = $xpath->query( 'mail/update/set/filter/mailname/name' );
$nodes->item(0)->nodeValue = $name;
For the <autoresponder> children, you can perform a loop through each child, using * at the end of your search pattern:
$nodes = $xpath->query( 'mail/update/set/filter/mailname/autoresponder/*' );
foreach( $nodes as $node )
{
if( 'enabled' == $node->nodeName )
{
$node->nodeValue = 'true';
}
elseif( 'subject' == $node->nodeName )
{
$node->nodeValue = $subject;
}
elseif( 'text' == $node->nodeName )
{
$cdata = $dom->createCDATASection( $mail_body );
$node->appendChild( $cdata );
}
elseif( 'end_date' == $node->nodeName )
{
$node->nodeValue = $date;
}
}
Note the different syntax adopted for the mail body: I use a CDATA node here. If your XML doesn't allow CDATA, replace it with the standard ->nodeValue syntax. Alternatively, you may have to use the CDATA method for all the nodes.
When the XML is ready, you can output it with:
echo $dom->saveXML();
DOMXPath allows you to perform complex searches in the XML tree. It's not mandatory in your case, because you start from a short, empty, unambiguous template. I use it for demonstration purposes, but you can replace a line like this:
$nodes = $xpath->query( 'mail/update/set/filter/site-id' );
with:
$nodes = $dom->getElementsByTagName( 'site-id' );
and it will work fine.
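If you would rather not depend on how a ->nodeValue assignment treats characters such as & or <, one option is to add the value as an explicit text node, which the serializer escapes on output. A minimal sketch; setElementText() is a hypothetical helper and the subject line is made-up sample data:

```php
<?php
// Hypothetical helper: replaces an element's content with literal text.
function setElementText(DOMElement $element, string $text): void
{
    // Drop whatever the element currently contains.
    while ($element->firstChild !== null) {
        $element->removeChild($element->firstChild);
    }
    // A text node holds the raw string; escaping happens in saveXML().
    $element->appendChild($element->ownerDocument->createTextNode($text));
}

$dom = new DOMDocument();
$dom->loadXML('<autoresponder><subject/></autoresponder>');
$subject = $dom->getElementsByTagName('subject')->item(0);
setElementText($subject, 'Out of office <back & Monday>');

echo $dom->saveXML($dom->documentElement), "\n";
// <autoresponder><subject>Out of office &lt;back &amp; Monday&gt;</subject></autoresponder>
```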
Read more about DOMDocument
Read more about DOMXPath
What I tried and what doesn't work:
Input:
$d = new DOMDocument();
$d->formatOutput = true;
// Out of my control:
$someEl = $d->createElementNS('http://example.com/a', 'a:some');
// Under my control:
$envelopeEl = $d->createElementNS('http://example.com/default',
'envelope');
$d->appendChild($envelopeEl);
$envelopeEl->appendChild($someEl);
echo $d->saveXML();
$someEl->prefix = null;
echo $d->saveXML();
Output is invalid XML after substitution:
<?xml version="1.0"?>
<envelope xmlns="http://example.com/default">
<a:some xmlns:a="http://example.com/a"/>
</envelope>
<?xml version="1.0"?>
<envelope xmlns="http://example.com/default">
<:some xmlns:a="http://example.com/a" xmlns:="http://example.com/a"/>
</envelope>
Note that <a:some> may have children. One solution would be
to create a new <some>, and copy all children from <a:some> to <some>. Is
that the way to go?
This is a really interesting question. My first intention was to clone the <a:some> node, remove the xmlns:a attribute, remove the <a:some> and insert the clone - <a>. But this will not work, as PHP does not allow removing the xmlns:a attribute like a regular attribute.
After some struggling with PHP's DOM methods I started to google the problem. I found this comment in the PHP documentation on this. The user suggests writing a function that clones the node manually without its namespace:
<?php
/**
* This function is based on a comment to the PHP documentation.
* See: http://www.php.net/manual/de/domnode.clonenode.php#90559
*/
function cloneNode($node, $doc){
    // Strip any namespace prefix, e.g. "a:some" becomes "some".
    $unprefixedName = preg_replace('/.*:/', '', $node->nodeName);
    $nd = $doc->createElement($unprefixedName);
    // Copy the regular attributes.
    foreach ($node->attributes as $value) {
        $nd->setAttribute($value->nodeName, $value->value);
    }
    foreach ($node->childNodes as $child) {
        if ($child instanceof DOMElement) {
            // Rebuild child elements recursively so their prefixes are stripped too.
            $nd->appendChild(cloneNode($child, $doc));
        } else {
            // Text, CDATA and comment nodes are copied as they are.
            $nd->appendChild($child->cloneNode(true));
        }
    }
    return $nd;
}
Using it leads to code like this:
$xml = '<?xml version="1.0"?>
<envelope xmlns="http://example.com/default">
<a:some xmlns:a="http://example.com/a"/>
</envelope>';
$doc = new DOMDocument();
$doc->loadXML($xml);
$elements = $doc->getElementsByTagNameNS('http://example.com/a', 'some');
$original = $elements->item(0);
$clone = cloneNode($original, $doc);
$doc->documentElement->replaceChild($clone, $original);
$doc->formatOutput = TRUE;
echo $doc->saveXML();
I would like to create a new, simplified XML document based on an existing one (using SimpleXML):
<?xml version="1.0" encoding="UTF-8"?>
<xls:XLS>
<xls:RouteInstructionsList>
<xls:RouteInstruction>
<xls:Instruction>Start</xls:Instruction>
</xls:RouteInstruction>
</xls:RouteInstructionsList>
<xls:RouteInstructionsList>
<xls:RouteInstruction>
<xls:Instruction>End</xls:Instruction>
</xls:RouteInstruction>
</xls:RouteInstructionsList>
</xls:XLS>
Because there are always colons in the element tags, which trips up SimpleXML, I tried to use the linked solution.
How can I create a new XML document with this structure:
<main>
<instruction>Start</instruction>
<instruction>End</instruction>
</main>
The "instruction" element gets its content from the former "xls:Instruction" element.
Here is the updated code, but unfortunately it never loops through:
$source = "route.xml";
$xmlstr = file_get_contents($source);
$xml = @simplexml_load_string($xmlstr);
$new_xml = simplexml_load_string('<main/>');
foreach($xml->children() as $child){
    print_r("xml_has_childs");
    $new_xml->addChild('instruction', $child->RouteInstruction->Instruction);
}
echo $new_xml->asXML();
There is no error message if I leave the "@"…
/* the use of @ is to suppress warnings */
$xml = @simplexml_load_string($YOUR_RSS_XML);
$new_xml = simplexml_load_string('<main/>');
foreach ($xml->children() as $child)
{
    $new_xml->addChild('instruction', $child->RouteInstruction->Instruction);
}
/* to print */
echo $new_xml->asXML();
You could use XPath to simplify things. Without knowing the full details, I don't know if it will work in all cases:
$source = "route.xml";
$xmlstr = file_get_contents($source);
$xml = @simplexml_load_string($xmlstr);
$new_xml = simplexml_load_string('<main/>');
foreach ($xml->xpath('//Instruction') as $instr) {
$new_xml->addChild('instruction', (string) $instr);
}
echo $new_xml->asXML();
Output:
<?xml version="1.0"?>
<main><instruction>Start</instruction><instruction>End</instruction></main>
Edit: The file at http://www.gps.alaingroeneweg.com/route.xml is not the same as the XML you have in your question. You need to use a namespace, like this:
$xml = @simplexml_load_string(file_get_contents('http://www.gps.alaingroeneweg.com/route.xml'));
$xml->registerXPathNamespace('xls', 'http://www.opengis.net/xls'); // probably not needed
$new_xml = simplexml_load_string('<main/>');
foreach ($xml->xpath('//xls:Instruction') as $instr) {
$new_xml->addChild('instruction', (string) $instr);
}
echo $new_xml->asXML();
Output:
<?xml version="1.0"?>
<main><instruction>Start (Southeast) auf Sihlquai</instruction><instruction>Fahre rechts</instruction><instruction>Fahre halb links - Ziel erreicht!</instruction></main>
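If you prefer property access over XPath, SimpleXML's children() method accepts a namespace URI, which is why the plain children() loop in the question finds nothing. A sketch, assuming the xls prefix is bound to http://www.opengis.net/xls as in the edit above; simplifyRoute() is a hypothetical name:

```php
<?php
const XLS_NS = 'http://www.opengis.net/xls';

// Hypothetical helper: builds <main><instruction>…</instruction></main>
// from the namespaced route document.
function simplifyRoute(string $xmlstr): string
{
    $xml = simplexml_load_string($xmlstr);
    if ($xml === false) {
        throw new InvalidArgumentException('Input is not well-formed XML');
    }

    $new = new SimpleXMLElement('<main/>');
    // children(XLS_NS) selects child elements in that namespace at each level.
    foreach ($xml->children(XLS_NS)->RouteInstructionsList as $list) {
        foreach ($list->children(XLS_NS)->RouteInstruction as $step) {
            $text = (string) $step->children(XLS_NS)->Instruction;
            // addChild() does not escape "&" itself, so escape the value first.
            $new->addChild('instruction', htmlspecialchars($text, ENT_XML1 | ENT_NOQUOTES));
        }
    }
    return $new->asXML();
}
```

It would be called as echo simplifyRoute(file_get_contents('route.xml'));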