I've been given data from a previous version of a website (it was a custom CMS) and am looking to get it into a state where I can import it into my WordPress site.
This is what I'm working on - http://www.teamworksdesign.com/clients/ciw/datatest/index.php. If you scroll down to row 187 the data starts to fail (there should be a red message) with the following error message:
Fatal error: Uncaught exception 'Exception' with message 'String could
not be parsed as XML' in
/home/teamwork/public_html/clients/ciw/datatest/index.php:132 Stack
trace: #0
/home/teamwork/public_html/clients/ciw/datatest/index.php(132):
SimpleXMLElement->__construct('
Can anyone see what the problem is and how to fix it?
This is how I'm outputting the data:
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>
<body>
<?php
ini_set('memory_limit','1024M');
ini_set('max_execution_time', 500); // 500 seconds, a little over 8 minutes
echo "<br />memory_limit: " . ini_get('memory_limit') . "<br /><br />";
echo "<br />max_execution_time: " . ini_get('max_execution_time') . "<br /><br />";
libxml_use_internal_errors(true);
$z = new XMLReader;
$z->open('dbo_Content.xml');
$doc = new DOMDocument;
$doc->preserveWhiteSpace = false;
// move to the first <dbo_Content /> node
while ($z->read() && $z->name !== 'dbo_Content');
$c = 0;
// now that we're at the right depth, hop to the next <dbo_Content /> until the end of the tree
while ($z->name === 'dbo_Content')
{
if($c < 201) {
// either one should work
$node = simplexml_import_dom($doc->importNode($z->expand(), true));
if($node->ClassId == 'policydocument') {
$c++;
echo "<h1>Row: $c</h1>";
echo "<pre>";
echo htmlentities($node->XML) . "<br /><br /><br /><b>*******</b><br /><br /><br />";
echo "</pre>";
try{
$xmlObject = new SimpleXMLElement($node->XML);
foreach ($xmlObject->fields[0]->field as $field) {
switch((string) $field['name']) {
case 'parentId':
echo "<b>PARENT ID: </b> " . $field->value . "<br />";
break;
case 'title':
echo "<b>TITLE: </b> " . $field->value . "<br />";
break;
case 'summary':
echo "<b>SUMMARY: </b> " . $field->value . "<br />";
break;
case 'body':
echo "<b>BODY:</b> " . $field->value . "<br />";
break;
case 'published':
echo "<b>PUBLISHED:</b> " . $field->value . "<br />";
break;
}
}
echo '<br /><h2 style="color:green;">Success on node: '.$node->ContentId.'</h2><hr /><br />';
} catch (Exception $e){
echo '<h2 style="color:red;">Failed on node: '.$node->ContentId.'</h2>';
}
}
// go to next <dbo_Content />
$z->next('dbo_Content');
}
} ?>
</body>
</html>
The error message you're getting "String could not be parsed as XML" means that the XML parser found something in the input data that was not valid XML.
You haven't shown us the data, so I can't tell you exactly what is invalid, but something in there is failing to meet the strict rules for XML parsing. There are any number of possible reasons for this.
If I had to stick my neck out and guess, I'd say the most common cause of bad XML in the middle of a file that is otherwise okay would be an unescaped & when it should be the &amp; entity code.
Anyone creating their XML using a proper XML writer shouldn't have this issue, but I've come across plenty of cases where people don't bother using an XML writer and just output raw XML as text and have forgotten to escape the entities, which means that the data is fine until you come to a company name with an & in it.
If it's as simple as that, and it's a one-off import, you may be able to fix the file manually in a text editor.
However that's just a guess. You'll need to actually examine the XML file for yourself to see the problem. If you can't see the problem visually, I'd suggest using a GUI XML tool to analyse the file.
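If no GUI tool is at hand, libxml can report where the parser gave up. The following is a minimal sketch, not the asker's code: describeXmlErrors() is a hypothetical helper name and the sample string is made up to show an unescaped ampersand.

```php
<?php
// Hypothetical helper (not from the original post): parse a string and
// return libxml's complaints instead of a bare exception.
function describeXmlErrors(string $xml): array
{
    // Collect errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $messages = [];
    if (simplexml_load_string($xml) === false) {
        foreach (libxml_get_errors() as $error) {
            // Each LibXMLError carries the line, column and message.
            $messages[] = sprintf(
                'line %d, column %d: %s',
                $error->line,
                $error->column,
                trim($error->message)
            );
        }
    }

    // Restore the previous error-handling mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $messages;
}

// Made-up sample input with an unescaped ampersand.
foreach (describeXmlErrors('<company>Smith & Sons</company>') as $message) {
    echo $message, PHP_EOL;
}
```

Running the same helper on the string that fails at row 187 should point at the line and column the parser objects to.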
Hope that helps.
[EDIT]
Okay, I just took a better look at the data in the link you gave, and one thing sticks out like a sore thumb....
encoding="utf-16"
I note that all the data that has worked was using UTF-8, and all the data that has failed is using UTF-16.
PHP is generally fine with UTF-8, but it won't cope very well at all with UTF-16. So it's fairly clear that this is your problem.
And, to be honest, there's really no need to ever use UTF-16, so the solution here is to switch to UTF-8 encoding for everything.
How easy that is for you to do, I can't say, but worst case I'm sure you could find a batch conversion tool.
Hope that helps.
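If the embedded XML arrives as a PHP string that is already UTF-8 (likely when it has been read out of a UTF-8 outer document) while its declaration still says utf-16, one possible workaround is to relabel the declaration before parsing. This is a sketch under that assumption only: relabelAsUtf8() is a hypothetical helper and the sample document is made up. If the bytes really are UTF-16, convert them instead of relabelling.

```php
<?php
// Hypothetical helper: relabel a UTF-16 XML declaration as UTF-8.
// Assumes the string itself already holds UTF-8 bytes.
function relabelAsUtf8(string $xml): string
{
    return preg_replace(
        '/^(<\?xml[^>]*?encoding=)(["\'])utf-16\2/i',
        '$1$2UTF-8$2',
        $xml,
        1
    );
}

// Made-up sample in the same general shape as the fields/field structure above.
$raw = '<?xml version="1.0" encoding="utf-16"?>'
     . '<content><fields><field name="title"><value>Policy</value></field></fields></content>';

$xmlObject = new SimpleXMLElement(relabelAsUtf8($raw));
echo $xmlObject->fields[0]->field[0]->value, PHP_EOL;
```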
I have an xml file like this
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<Job>
<JobInfo>
<User>abc</User>
<Computer>acb</Computer>
<Started>2018/04/21-21:58:30:0182-06</Started>
<Ended>2018/04/21-23:10:10:0093-06</Ended>
</JobInfo>
<JobFlags>
<Active>Yes</Active>
<Complete>Yes</Complete>
</JobFlags>
</Job>
I use simplexml_load_file to load the file and print out the User, Computer and Complete elements.
$xml=simplexml_load_file("abc.xml") or die("Error: Cannot create object");
foreach($xml->children() as $xm) {
echo $xm->User . "<br>";
echo $xm->Computer . "<br>";
echo $xm->Complete . "<br>";
}
But it only prints out User and Computer. The result for Complete is empty.
Please help me with this, thank you!
The main issue is that you are trying to mix the fields from different elements. You can see from the original XML that <User> and <Computer> are in the <JobInfo> element and <Complete> is in the <JobFlags> element. This isn't a problem, but when you use your foreach loop, you go through each child element of <Job> and output all of the values for each element. If you change your loop to...
foreach($xml->children() as $tag => $xm) {
echo "element=".$tag . "<br>";
echo $xm->User . "<br>";
echo $xm->Computer . "<br>";
echo $xm->Complete . "<br>";
}
You get (please excuse wrong markup, I wanted the layout)...
element=JobInfo
abc
acb

element=JobFlags


Yes
If instead you just need those three pieces of information from the document, you could access them using the full path and so just pick out the details you're after without using a foreach...
echo $xml->JobInfo->User . "<br>";
echo $xml->JobInfo->Computer . "<br>";
echo $xml->JobFlags->Complete . "<br>";
Gives you.
abc
acb
Yes
simplexml is a bit outdated. I'm not sure that a <Complete> tag is valid XML; it may be the issue.
If you can, try to use php-dom instead.
https://php.net/manual/en/book.dom.php
Example to retrieve all the tags texts.
#!/usr/bin/php
<?php
$remote =<<<'EOF'
<JobInfo>
<User>abc</User>
<Computer>acb</Computer>
<Started>2018/04/21-21:58:30:0182-06</Started>
<Ended>2018/04/21-23:10:10:0093-06</Ended>
</JobInfo>
<JobFlags>
<Active>Yes</Active>
<Complete>Yes</Complete>
</JobFlags>
EOF;
$i = 0;
$doc = new DOMDocument();
$doc->loadHTML($remote);
foreach($doc->getElementsByTagName('*') as $elem) {
echo $i++.": ".$elem->textContent;
}
The package belongs to php-xml:
Package php-dom is a virtual package provided by: php7.2-xml
Alternatives: https://php.net/manual/en/refs.xml.php
Try to wrap all elements into a parent tag:
<?xml version="1.0"?>
<test>
<JobInfo>
<User>abc</User>
<Computer>acb</Computer>
<Started>2018/04/21-21:58:30:0182-06</Started>
<Ended>2018/04/21-23:10:10:0093-06</Ended>
</JobInfo>
<JobFlags>
<Active>Yes</Active>
<Complete>Yes</Complete>
</JobFlags>
</test>
and then reference in your php:
<?php
$xml=simplexml_load_file("abc.xml") or die("Error: Cannot create object");
echo $xml->JobInfo->User . "<br>";
echo $xml->JobInfo->Computer . "<br>";
echo $xml->JobFlags->Complete . "<br>";
?>
It worked here.
I think it's just a matter of making a correct reference to the entire hierarchical structure of the tags.
See this example taken from http://php.net/manual/en/function.simplexml-load-file.php:
<?php
$xml = '<?xml version="1.0" encoding="UTF-8" ?>
<rss>
<channel>
<item>
<title><![CDATA[Tom & Jerry]]></title>
</item>
</channel>
</rss>';
$xml = simplexml_load_string($xml);
// echo does the casting for you
echo $xml->channel->item->title;
// but vardump (or print_r) not!
var_dump($xml->channel->item->title);
// so cast the SimpleXML Element to 'string' solve this issue
var_dump((string) $xml->channel->item->title);
?>
Above will output:
Tom & Jerry
object(SimpleXMLElement)#4 (0) {}
string(11) "Tom & Jerry"
Reference:
https://stackoverflow.com/a/16972780/5074998
http://php.net/manual/en/function.simplexml-load-file.php
It works just fine for me:
abc
acb
Yes
You might be confused because you are expecting it all on three consecutive lines. Try outputting it like this instead and it may help you visualize that it is echoing all three of your requested elements (whether they exist or not) for two loops (since the XML has two children):
foreach($xml->children() as $xm) {
echo "Current Element: " . $xm->getName() . "<br />";
echo "User: " . $xm->User . "<br />";
echo "Computer: " . $xm->Computer . "<br />";
echo "Complete: " . $xm->Complete . "<br />";
}
which will output:
Current Element: JobInfo
User: abc
Computer: acb
Complete:
Current Element: JobFlags
User:
Computer:
Complete: Yes
As Nick suggested in his comment, if you don't want the <br /> breaks to print when the element doesn't exist, you can use isset like:
foreach($xml->children() as $xm) {
echo isset($xm->User) ? "{$xm->User}<br />" : '';
echo isset($xm->Computer) ? "{$xm->Computer}<br />" : '';
echo isset($xm->Complete) ? "{$xm->Complete}<br />" : '';
}
Or, as NigelRen recommended in their answer, you could skip using children() if you know the full path to the elements you need, and just use those paths instead, like:
echo $xml->JobInfo->User . "<br />";
echo $xml->JobInfo->Computer . "<br />";
echo $xml->JobFlags->Complete . "<br />";
The underlying issue is that when you were echoing out $xm->Complete while traversing the JobInfo element, it was outputting just the <br /> because $xml->JobInfo->Complete doesn't exist.
I'm trying to get started using XMLReader to process large XML files, but I am getting a strange HTTP 400 Bad Request when I try to run the following code:
<?php
$reader = new XMLReader ();
$reader->open ( "testfile.xml" );
while ( $reader->read () ) {
switch ($reader->nodeType) {
case (XMLREADER::ELEMENT) :
echo "<" . $reader->name . "> <br>";
break;
case (XMLREADER::TEXT) :
if ($reader->hasValue) {
echo $reader->value . "<br>";
}
break;
}
}
$reader->close();
?>
I have also tried it this way and get the same 400 Bad Request error:
<?php
$reader = new XMLReader ();
$reader->open ( "testfile.xml" );
while ( $reader->read() ) {
switch ($reader->nodeType) {
case (XMLREADER::ELEMENT) :
echo "<" . $reader->name . "> <br>";
$reader->read();
if (($reader->nodeType == XMLREADER::TEXT) && $reader->hasValue) {
echo $reader->value . "<br>";
}
break;
}
}
$reader->close();
?>
In both cases, the error goes away when I comment out echo $reader->value . "<br>";. Apache error logs aren't showing anything. Also, in spite of the 400 error, the page is created and rendered as expected with the elements and text values (i.e., the code appears to work, it just gives an HTTP error as well).
It is also worth noting that it seems to work without error on a small, simple test XML file with only one root and one child element with text. It's only on the larger more complicated XML file that I'm actually intending to process that I'm getting the error.
Thanks in advance for any help!
FYI in case anyone else runs into this, I found out I needed to use htmlspecialchars() to escape the value. I changed:
echo $reader->value . "<br>";
to
echo htmlspecialchars($reader->value, ENT_XML1, 'UTF-8') . "<br>";
Guess there must be some html in the XML that the browser was trying to interpret causing the 400 error.
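As a small illustration of what that call does, here is a sketch with a made-up text value (the real values in testfile.xml are not shown in the question). With the ENT_XML1 flag on its own, the characters <, > and & are turned into entities, so the browser displays them rather than interpreting them as markup.

```php
<?php
// Made-up text node value containing markup and an ampersand.
$value = '<b>Terms</b> & conditions';

// Same call as in the fix above: escape before echoing into an HTML page.
$escaped = htmlspecialchars($value, ENT_XML1, 'UTF-8');

echo $escaped, PHP_EOL;
// prints: &lt;b&gt;Terms&lt;/b&gt; &amp; conditions
```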
I tried everything to get the XML from the URL, even smottt's idea from "PHP How to hit a url and download its xml", but it didn't work for me.
my scenario;
URL that generates Dollar Exchange Rates:
nrb.org.np/exportForexXML.php?YY=2015&MM=03&DD=01&YY1=2015&MM1=03&DD1=01
Here YY, MM and DD are the start date and YY1, MM1 and DD1 are the end date of the report. I believe it generates the XML based on the Unix time in Kathmandu, Asia, with a separate XML file name every time, down to the second.
I searched the internet but found nothing.
I want to display the result of xml in a page using php either by downloading the xml from the given url to my localhost folder or directly from web.
please help.
Thanks in Advance
Edited: Code I am using is
$url = "nrb.org.np/exportForexXML.php?YY=2015&MM=03&DD=01&YY1=2015&MM1=03&DD1=01";
$xml = new SimpleXMLElement($url, null, true);
foreach($xml->CurrencyConversionResponse as $CurrencyConversionResponse) {
echo $CurrencyConversionResponse->BaseCurrency . "<br />";
echo $CurrencyConversionResponse->TargetCurrency . "<br />";
echo $CurrencyConversionResponse->ConversionTime . "<br />";
echo $CurrencyConversionResponse->ConversionRate . "<br />";
}
And the error message is
Warning: SimpleXMLElement::__construct() [simplexmlelement.--construct]: I/O warning : failed to load external entity "nrb.org.np/exportForexXML.php?YY=2015&MM=03&DD=01&YY1=2015&MM1=03&DD1=01" in C:\xampp\htdocs\xml.php on line 4
Fatal error: Uncaught exception 'Exception' with message 'String could not be parsed as XML' in C:\xampp\htdocs\xml.php:4 Stack trace: #0 C:\xampp\htdocs\xml.php(4): SimpleXMLElement->__construct('nrb.org.np/expo...', 0, true) #1 {main} thrown in C:\xampp\htdocs\xml.php on line 4
Add http:// to the URL. Without a scheme, the string is treated as a local file path, which is why the load fails:
$url = "http://nrb.org.np/exportForexXML.php?YY=2015&MM=03&DD=01&YY1=2015&MM1=03&DD1=01";
$xml = new SimpleXMLElement($url, null, true);
foreach($xml->CurrencyConversionResponse as $CurrencyConversionResponse) {
echo $CurrencyConversionResponse->BaseCurrency . "<br />";
echo $CurrencyConversionResponse->TargetCurrency . "<br />";
echo $CurrencyConversionResponse->ConversionTime . "<br />";
echo $CurrencyConversionResponse->ConversionRate . "<br />";
}
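If the URL comes from configuration or user input, the scheme can be checked before loading. This is only a sketch: withScheme() is a hypothetical helper, not part of SimpleXML, and it defaults to http because that is what the fix above uses.

```php
<?php
// Hypothetical helper: prepend a scheme when the URL has none.
function withScheme(string $url, string $default = 'http'): string
{
    // parse_url() returns null for the scheme component when it is absent.
    return parse_url($url, PHP_URL_SCHEME) === null
        ? $default . '://' . $url
        : $url;
}

echo withScheme('nrb.org.np/exportForexXML.php?YY=2015&MM=03&DD=01&YY1=2015&MM1=03&DD1=01'), PHP_EOL;
```

The result can then be passed to new SimpleXMLElement($url, 0, true) as in the code above.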
Is there a function that returns HTML break, <br /> when in HTML and PHP_EOL when in CLI?
so that if I code something like:
echo "error is" . appropriateEOL();
will return the appropriate line break. I know how to code appropriateEOL(), I just wonder if there is a built in function.
I am using zf2.
There's nothing built-in to do what you're asking. But it would be trivial to set it up yourself.
if (PHP_SAPI == 'cli') {
define("LINE_BREAK", PHP_EOL);
}
else {
define("LINE_BREAK", "<br/>");
}
Now just use this LINE_BREAK constant.
Though it might be better to stick with non-html in your code and use PHP_EOL, and then run your output through nl2br() before displaying output in your HTML templates.
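A minimal sketch of that second approach, assuming a hypothetical forDisplay() helper and made-up messages: the text is built with plain newlines and only converted with nl2br() when it is not going to the command line.

```php
<?php
// Hypothetical helper: convert newlines to <br /> only for non-CLI output.
// The SAPI name is a parameter so the behaviour can be tried from the CLI too.
function forDisplay(string $text, string $sapi = PHP_SAPI): string
{
    return $sapi === 'cli' ? $text : nl2br($text);
}

// Build the message with plain newlines, no HTML in the string itself.
$lines = ['error is: file not found', 'please try again'];
$text = implode("\n", $lines);

echo forDisplay($text), PHP_EOL;
```

Note that nl2br() inserts the break before each newline and keeps the newline itself.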
$var = "Hi there"."<br/>"."Welcome to my website"."<br/>";
echo $var;
Is there an elegant way to handle line breaks in PHP? I'm not sure about other languages, but C++ has endl, so is there something that's more readable and elegant to use?
Thanks
For line breaks, PHP has "\n" (see double-quoted strings) and PHP_EOL.
Here, you are using <br />, which is not a PHP line break: it's an HTML line break.
You can simplify what you posted (with HTML line breaks): there is no need for the string concatenation, since you can put everything in just one string, like this:
$var = "Hi there<br/>Welcome to my website<br/>";
Or, using PHP linebreaks :
$var = "Hi there\nWelcome to my website\n";
Note : you might also want to take a look at the nl2br() function, which inserts <br> before \n.
I have defined this:
if (PHP_SAPI === 'cli')
{
define( "LNBR", PHP_EOL);
}
else
{
define( "LNBR", "<BR/>");
}
After this I use LNBR wherever I want to use \n.
For line breaks in PHP we can use PHP_EOL (end of line), which works like "\n".
But it will not show up as a break on an HTML page, because HTML needs its own line break element.
So you can define a constant for it:
define ("EOL","<br>");
Then you can use EOL wherever you need a break.
I ended up writing a function that has worked for me well so far:
// pretty print data
function out($data, $label = NULL) {
$CLI = (php_sapi_name() === 'cli') ? 'cli' : '';
$gettype = gettype($data);
if (isset($label)) {
if ($CLI) { $label = $label . ': '; }
else { $label = '<b>'.$label.'</b>: '; }
}
if ($gettype == 'string' || $gettype == 'integer' || $gettype == 'double' || $gettype == 'boolean') {
if ($CLI) { echo $label . $data . "\n"; }
else { echo $label . $data . "<br/>"; }
}
else {
if ($CLI) { echo $label . print_r($data,1) . "\n"; }
else { echo $label . "<pre>".print_r($data,1)."</pre>"; }
}
}
// Usage
out('Hello world!');
$var = 'Hello Stackoverflow!';
out($var, 'Label');
Not very "elegant" and kinda a waste, but if you really care what the code looks like you could make your own fancy flag and then do a str_replace.
Example:
$myoutput = "After this sentence there is a line break..|.. Here is a new line.";
$myoutput = str_replace(".|..","<br />",$myoutput);
or
how about:
$myoutput = "After this sentence there is a line break.E(*)3 Here is a new line.";
$myoutput = str_replace("E(*)3","<br />",$myoutput);
I call the first method "middle finger style" and the second "goatse style".
Because you are outputting to the browser, you have to use <br/>. Otherwise there is \n and \r or both combined.
Well, as with any language there are several ways to do it.
As previous answerers have mentioned, "<br/>" is not a linebreak in the traditional sense, it's an HTML line break. I don't know of a built in PHP constant for this, but you can always define your own:
// Something like this, but call it whatever you like
const HTML_LINEBREAK = "<br/>";
If you're outputting a bunch of lines (from an array of strings for example), you can use it this way:
// Output an array of strings
$myStrings = Array('Line1','Line2','Line3');
echo implode(HTML_LINEBREAK,$myStrings);
However, generally speaking I would say avoid hard coding HTML inside your PHP echo/print statements. If you can keep the HTML outside of the code, it makes things much more flexible and maintainable in the long run.
\n didn't work for me: the \n appeared in the body text of the email I was sending. This is how I resolved it:
str_pad($input, 990); //so that the spaces will pad out to the 990 cut off.