simplexml_load_string and the unwelcome parse error - php

Update: Casting as an array does the trick. See this response, since I don't have enough clout to upvote :)
I started on this problem with many potential culprits, but after lots of diagnostics the problem is still there and no obvious answers remain.
I want to print the placename "Gaborone", which is located at the first tag under the first tag under the first tag of this API-loaded XML file. How can I parse this to return that content?
<?php
# load the XML file
$test1 = (string)file_get_contents('http://www.afdb.org/fileadmin/uploads/afdb/Documents/Generic-Documents/IATIBotswanaData.xml');
#throw it into simplexml for parsing
$xmlfile = simplexml_load_string($test1);
#output the parsed text
echo $xmlfile->iati-activity[0]->location[0]->gazetteer-entry;
?>
Which never fails to return this:
Parse error: syntax error, unexpected '[', expecting ',' or ';'
I've tried changing the syntax to avoid the hyphens in the tag names as such:
echo $xmlfile["iati-activity"][0]["location"][0]["gazetteer-entry"];
. . . but that returns complete nothingness; no error, no source.
I've also tried debugging based on these otherwise-helpful threads, but none of the solutions have worked. Is there an obvious error in my simplexml addressing?

I've tried changing the syntax to avoid the hyphens in the tag names
as such: echo
$xmlfile["iati-activity"][0]["location"][0]["gazetteer-entry"];
The problem here is that array-style access on a SimpleXMLElement reads attributes, not child elements, so $xmlfile["iati-activity"] finds nothing and the whole expression comes back empty. And yes, your guess is correct: hyphenated tag names can't be written as plain object properties, because PHP parses the hyphen as a minus sign. One way around both issues is to convert the value returned by simplexml_load_string() (a SimpleXMLElement) into a nested array. You can use this function for that:
function object2array($object) {
return json_decode(json_encode($object), true);
}
The rest:
// load the XML file
$test1 = file_get_contents('http://www.afdb.org/fileadmin/uploads/afdb/Documents/Generic-Documents/IATIBotswanaData.xml');
$xml = simplexml_load_string($test1);
// Cast an object into array, that makes it much easier to work with
$data = object2array($xml);
$data = $data['iati-activity'][0]['location'][0]['gazetteer-entry']; // Works
var_dump($data); // string(8) "Gaborone"

I had a similar problem parsing XML using the simpleXML command until I did the following string replacements:
//$response contains the XML string
$response = str_replace(array("\n", "\r", "\t"), '', $response); //eliminate newlines, carriage returns and tabs
$response = trim(str_replace('"', "'", $response)); // turn double quotes into single quotes
$simpleXml = simplexml_load_string($response);
$json = json_decode(json_encode($simpleXml)); // an extra step I took so I got it into a nice object that is easy to parse and navigate
If that doesn't work, there's some talk over at PHP about CDATA not always being handled properly - PHP version dependent.
You could try this code prior to calling the simplexml_load_string function:
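if (strpos($content, '<![CDATA[') !== false) { // strpos() returns 0 for a match at the start, so compare against false
function parseCDATA($data) {
return htmlentities($data[1]);
}
$content = preg_replace_callback(
'#<!\[CDATA\[(.*?)\]\]>#', // non-greedy, so each CDATA section is matched on its own
'parseCDATA',
str_replace("\n", " ", $content)
);
}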
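As a possible alternative to rewriting the CDATA sections by hand, libxml can be asked to merge them into ordinary text nodes with the LIBXML_NOCDATA option. This is only a sketch: the sample string in $content is made up for illustration, and the flag mostly matters when the element is later converted to an array through json_encode().

```php
<?php
// Hypothetical input containing a CDATA section.
$content = '<root><note><![CDATA[Tom & Jerry <live>]]></note></root>';

// LIBXML_NOCDATA asks libxml to merge CDATA sections into plain text nodes.
$xml = simplexml_load_string($content, 'SimpleXMLElement', LIBXML_NOCDATA);

// Reading the element as a string returns the unescaped text.
$note = (string) $xml->note;
echo $note, PHP_EOL; // prints "Tom & Jerry <live>"

// The array round trip now keeps the text as well.
$data = json_decode(json_encode($xml), true);
```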
I've reread this, and I think your error is happening on your final line - try this:
echo $xmlfile->{'iati-activity'}[0]->location[0]->{'gazetteer-entry'};
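To show the brace syntax in isolation, here is a small self-contained sketch. The XML is a made-up stand-in shaped like the feed in the question (the real file may be structured differently), so treat the element layout as an assumption.

```php
<?php
// Hypothetical sample document, shaped like the feed described in the question.
$xmlString = <<<XML
<iati-activities>
  <iati-activity>
    <location>
      <gazetteer-entry>Gaborone</gazetteer-entry>
    </location>
  </iati-activity>
</iati-activities>
XML;

$xml = simplexml_load_string($xmlString);

// Braces let a property name contain characters such as "-".
$place = (string) $xml->{'iati-activity'}[0]->location[0]->{'gazetteer-entry'};

echo $place, PHP_EOL; // prints "Gaborone"
```

The (string) cast is there because SimpleXML returns element objects, not plain strings.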

Related

PHP: Converting xml to array

I have an xml string. That xml string has to be converted into PHP array in order to be processed by other parts of software my team is working on.
For the xml -> array conversion I'm using something like this:
if(get_class($xmlString) != 'SimpleXMLElement') {
$xml = simplexml_load_string($xmlString);
}
if(!$xml) {
return false;
}
It works fine most of the time :) The problem arises when my "xmlString" contains something like this:
<Line0 User="-5" ID="7436194"><Node0 Key="<1" Value="0"></Node0></Line0>
Then simplexml_load_string won't do its job (and I know that's because of the "<" character).
As I can't influence any other part of the code (I can't open up the module that generates the XML string and tell it "encode special characters, please!"), I need your suggestions on how to fix that problem BEFORE calling "simplexml_load_string".
Do you have some ideas? I've tried
str_replace("<","&lt;",$xmlString)
but that simply ruins the entire "xmlString"... :(
Well, then you can just replace the special characters inside the quoted attribute values of $xmlString with their HTML entity counterparts, using htmlspecialchars() and preg_replace_callback().
I know this is not performance friendly, but it does the job :)
<?php
$xmlString = '<Line0 User="-5" ID="7436194"><Node0 Key="<1" Value="0"></Node0></Line0>';
$xmlString = preg_replace_callback('~(?:").*?(?:")~',
function ($matches) {
return htmlspecialchars($matches[0], ENT_NOQUOTES);
},
$xmlString
);
header('Content-Type: text/plain');
echo $xmlString; // you will see the special characters are converted to HTML entities :)
echo PHP_EOL . PHP_EOL; // tidy :)
$xmlobj = simplexml_load_string($xmlString);
var_dump($xmlobj);
?>
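When simplexml_load_string() gives up on a string like this, it can help to see exactly what the parser objected to. A minimal sketch, assuming the same malformed sample as in the question, that collects the libxml errors instead of letting them surface as PHP warnings:

```php
<?php
// The malformed sample from the question: a raw "<" inside an attribute value.
$xmlString = '<Line0 User="-5" ID="7436194"><Node0 Key="<1" Value="0"></Node0></Line0>';

// Collect parser errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$xml = simplexml_load_string($xmlString);

if ($xml === false) {
    foreach (libxml_get_errors() as $error) {
        // Each entry is a LibXMLError carrying line, column and message.
        printf("line %d, column %d: %s\n", $error->line, $error->column, trim($error->message));
    }
    libxml_clear_errors();
}
```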

SimpleXMLElement using a string

I want to create a new SimpleXMLElement with data. When I put the data from the link below in the code, I get the following error: Fatal error: Uncaught exception 'Exception' with message 'String could not be parsed as XML'
The encoded data can be found here: http://www.interwebmedia.nl/dataxi/base64.txt
Decoded data: http://www.interwebmedia.nl/dataxi/data.txt
<?php
$str = 'encodeddata';
//echo htmlspecialchars(base64_decode($str),ENT_QUOTES);
$decoded = htmlspecialchars(base64_decode($str),ENT_QUOTES);
$xml = new SimpleXMLElement($decode);
echo $xml->asXML();
?>
I think you've attempted to use HEREDOC syntax (or seen somebody else using it) but completely misunderstood it.
HEREDOC syntax is an alternative way of quoting a string, instead of " or '. It's useful for hard-coding blocks of XML, because it acts like double-quotes, but lets you use double-quotes inside, like this:
$my_xml_string = <<<XML
<some_xml>
<with multiple_lines="here" />
</some_xml>
XML;
That code is precisely equivalent to this:
$my_xml_string = "
<some_xml>
<with multiple_lines=\"here\" />
</some_xml>
";
What you have done instead is taken the literal string "<<<" and added it onto your XML, giving you a string like this:
$my_xml_string = "<<<XML
<some_xml>
<with multiple_lines=\"here\" />
</some_xml>
XML";
Or in your example, the string "<<<XML<data>XML".
As far as the XML parser's concerned, you've just put a load of garbage on the beginning and end of the string, so it rightly complains it's not a valid XML document.
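For comparison, a short sketch of the heredoc string from above being handed to SimpleXMLElement once the delimiters are used as delimiters and not as part of the text. The variable $value is only there for illustration.

```php
<?php
// The opening <<<XML and the closing XML; are delimiters, not part of the string.
$my_xml_string = <<<XML
<some_xml>
<with multiple_lines="here" />
</some_xml>
XML;

$xml = new SimpleXMLElement($my_xml_string);

// Array-style access reads an attribute of the <with> element.
$value = (string) $xml->with['multiple_lines'];
echo $value, PHP_EOL; // prints "here"
```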

Parsing JSON with PHP and displaying contents

I am having issues with this code
This is my PHP that worked with other projects...
$gas = file_get_contents('http://api.mygasfeed.com/stations/loadbygeo/47.9494949/120.23423432/reg|mid|pre|diesel/'. $api . '.json?callback=?');
$json_output = json_decode(utf8_decode($gas));
$location= $json_output->geoLocation->city_id;
This is the JSON result
?({"status":{"error":"NO","code":200,"description":"none","message":"Request ok"},"geoLocation":{"city_id":"13123","city_long":"Hulunber","region_short":"Nei Mongol","region_long":"Nei Mongol","country_long":"China","country_id":"49","region_id":"6010"},"stations":[]})
This code is returning a blank result.
You need to remove the ?( and the ). See WP:JSONP.
One could use substr() for that, or better yet a regex to assert and remove only the desired garbage:
$json = preg_replace("/ ^[?\w(]+ | [)]+\s*$ /x", "", $jsonp);
This would remove arbitrary callbackFnNames( prefixes, but also your strange ?( pseudo function call.
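Putting that together, a rough sketch of unwrapping the JSONP response before decoding it. The helper name jsonp_to_json() is made up for this example, and the sample response is a shortened version of the one in the question.

```php
<?php
// Hypothetical helper: strips a JSONP wrapper such as "callback(...)" or "?(...)".
function jsonp_to_json(string $jsonp): string
{
    // Capture everything between the first "(" and the final ")",
    // allowing an optional ";" and whitespace after it.
    if (preg_match('/^[^(\[{"]*\((.*)\)\s*;?\s*$/s', $jsonp, $m)) {
        return $m[1];
    }
    return $jsonp; // no wrapper found, return the input unchanged
}

$jsonp = '?({"status":{"code":200},"geoLocation":{"city_id":"13123"}})';
$data = json_decode(jsonp_to_json($jsonp));

echo $data->geoLocation->city_id, PHP_EOL; // prints "13123"
```

Plain JSON passes through untouched, because the pattern only matches when a "(" appears before any {, [ or quote.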

json inside function, (php parsing)

I have something like this inside a *.txt file.
function_name({"one": {"id": "id_for_one", "value": "value_for_one"}, ...});
And I am getting the file like this:
$source = 'FILE_NAME.txt';
$json = json_decode(file_get_contents($source),true);
echo $json['one']['value'];
It doesn't work, but when I remove function_name( and ); it works.
How to parse it without removing these strings?
You can't. It is not valid JSON with those. Take a substring that excludes them.
You will have to remove those strings. With the function_name portion it is not valid JSON.
A JSON string will typically either begin with { (object notation) or [ (array notation), but can also be scalar values such as a string or number. You cannot parse it without first making sure the string is valid JSON.
You are reading the contents of a file and trying to decode them as JSON.
The 'function_name' part isn't valid JSON; the rest inside the parentheses is.
How to parse it without removing these strings?
There is no way.
This should work for you.
$data = file_get_contents($source);
$data = substr($data, strlen("function_name("));
$data[strlen($data)-1] = $data[strlen($data)-2] = " "; // blank out the trailing ");"
$json = json_decode($data,true);
Use [] to access individual characters of a string; the older {} syntax was deprecated in PHP 7.4 and removed in PHP 8.0.
The function call in your text file means that it isn't a JSON file.
Remove the wrapper from the string and your problem is fixed.
If the function has a fixed name, do something like this:
$source = 'FILE_NAME.txt';
$json_content = str_replace('function_name(', '', file_get_contents($source));
$json_content = substr($json_content,0,-2);
$json = json_decode($json_content,true);
echo $json['one']['value'];
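If the function name is not fixed, one option is to cut out whatever sits between the first "(" and the last ")". This is just a sketch; $raw stands in for the result of file_get_contents($source).

```php
<?php
// Hypothetical file contents; "function_name" is the wrapper from the question.
$raw = 'function_name({"one": {"id": "id_for_one", "value": "value_for_one"}});';

$start = strpos($raw, '(');   // position of the first opening parenthesis
$end   = strrpos($raw, ')');  // position of the last closing parenthesis

$json = null;
if ($start !== false && $end !== false && $end > $start) {
    // Decode only the text between the two parentheses.
    $json = json_decode(substr($raw, $start + 1, $end - $start - 1), true);
}

echo $json['one']['value'], PHP_EOL; // prints "value_for_one"
```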

PHP json_decode() returns NULL with seemingly valid JSON?

I have this JSON object stored on a plain text file:
{
"MySQL": {
"Server": "(server)",
"Username": "(user)",
"Password": "(pwd)",
"DatabaseName": "(dbname)"
},
"Ftp": {
"Server": "(server)",
"Username": "(user)",
"Password": "(pwd)",
"RootFolder": "(rf)"
},
"BasePath": "../../bin/",
"NotesAppPath": "notas",
"SearchAppPath": "buscar",
"BaseUrl": "http:\/\/montemaiztusitio.com.ar",
"InitialExtensions": [
"nem.mysqlhandler",
"nem.string",
"nem.colour",
"nem.filesystem",
"nem.rss",
"nem.date",
"nem.template",
"nem.media",
"nem.measuring",
"nem.weather",
"nem.currency"
],
"MediaPath": "media",
"MediaGalleriesTable": "journal_media_galleries",
"MediaTable": "journal_media",
"Journal": {
"AllowedAdFileFormats": [
"flv:1",
"jpg:2",
"gif:3",
"png:4",
"swf:5"
],
"AdColumnId": "3",
"RSSLinkFormat": "%DOMAIN%\/notas\/%YEAR%-%MONTH%-%DAY%\/%TITLE%/",
"FrontendLayout": "Flat",
"AdPath": "ad",
"SiteTitle": "Monte Maíz: Tu Sitio",
"GlobalSiteDescription": "Periódico local de Monte Maíz.",
"MoreInfoAt": "Más información aquí, en el Periódico local de Monte Maíz.",
"TemplatePath": "templates",
"WeatherSource": "accuweather:SAM|AR|AR005|MONTE MAIZ",
"WeatherMeasureType": "1",
"CurrencySource": "cotizacion-monedas:Dolar|Euro|Real",
"TimesSingular": "vez",
"TimesPlural": "veces"
}
}
When I try to decode it with json_decode(), it returns NULL. Why?
The file is readable (I tried echoing file_get_contents() and it worked ok).
I've tested JSON against http://jsonlint.com/ and it's perfectly valid.
What's wrong here?
This worked for me
json_decode( preg_replace('/[\x00-\x1F\x80-\xFF]/', '', $json_string), true );
It could be the encoding of the special characters. You could ask json_last_error() to get definite information.
You could try with it.
json_decode(stripslashes($_POST['data']))
If you check the request in Chrome you will see that the JSON is sent as text, so extra whitespace has been added to the JSON.
You can clear it by using
$k=preg_replace('/\s+/', '',$k);
Then you can use:
json_decode($k)
print_r will then show the array.
Maybe some hidden characters are messing with your json, try this:
$json = utf8_encode($yourString);
$data = json_decode($json);
I had the same problem and I solved it simply by replacing the quote character before decode.
$json = str_replace('&quot;', '"', $json);
$object = json_decode($json);
My JSON value was generated by JSON.stringify function.
For me the PHP function stripslashes() works when receiving JSON from JavaScript. When receiving JSON from Python, the second optional parameter to the json_decode call does the trick, since the array is associative. Works for me like a charm.
$json = stripslashes($json); //add this line if json from javascript
$edit = json_decode($json, true); //adding parameter true if json from python
This helps you find out what type of error occurred:
<?php
// A valid json string
$json[] = '{"Organization": "PHP Documentation Team"}';
// An invalid json string which will cause a syntax
// error, in this case we used ' instead of " for quotation
$json[] = "{'Organization': 'PHP Documentation Team'}";
foreach ($json as $string) {
echo 'Decoding: ' . $string;
json_decode($string);
switch (json_last_error()) {
case JSON_ERROR_NONE:
echo ' - No errors';
break;
case JSON_ERROR_DEPTH:
echo ' - Maximum stack depth exceeded';
break;
case JSON_ERROR_STATE_MISMATCH:
echo ' - Underflow or the modes mismatch';
break;
case JSON_ERROR_CTRL_CHAR:
echo ' - Unexpected control character found';
break;
case JSON_ERROR_SYNTAX:
echo ' - Syntax error, malformed JSON';
break;
case JSON_ERROR_UTF8:
echo ' - Malformed UTF-8 characters, possibly incorrectly encoded';
break;
default:
echo ' - Unknown error';
break;
}
echo PHP_EOL;
}
?>
This error means that your JSON string is not valid JSON!
Enable throwing exceptions when an error happens and PHP will throw an exception with the reason for why it failed.
Use this:
$json = json_decode($string, null, 512, JSON_THROW_ON_ERROR);
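A short sketch of what catching that exception might look like. JSON_THROW_ON_ERROR and JsonException need PHP 7.3 or later, and the sample string is deliberately invalid.

```php
<?php
// Single quotes make this invalid JSON on purpose.
$string = "{'Organization': 'PHP Documentation Team'}";

try {
    $json = json_decode($string, null, 512, JSON_THROW_ON_ERROR);
} catch (JsonException $e) {
    // The exception message and code describe why decoding failed.
    $json = null;
    $reason = $e->getMessage();
    echo 'Decoding failed: ', $reason, PHP_EOL;
}
```

With the flag set, a failure can no longer be confused with a document whose value is literally null.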
Just thought I'd add this, as I ran into this issue today. If there is any string padding surrounding your JSON string, json_decode will return NULL.
If you're pulling the JSON from a source other than a PHP variable, it would be wise to "trim" it first:
$jsonData = trim($jsonData);
The most important thing to remember when you get a NULL result from JSON data that looks valid is to use the following command:
json_last_error_msg();
I.e.
var_dump(json_last_error_msg());
string(53) "Control character error, possibly incorrectly encoded"
You then fix that with:
$new_json = preg_replace('/[[:cntrl:]]/', '', $json);
Here is how I solved mine: https://stackoverflow.com/questions/17219916/64923728. The JSON file has to be in UTF-8 encoding. Mine was in UTF-8 with BOM, which added a weird &#65279; to the JSON string output and caused json_decode() to return null.
In my case I was facing the same issue, but it was caused by slashes inside the JSON string, so I used
json_decode(stripslashes($YourJsonString))
OR
json_decode( preg_replace('/[\x00-\x1F\x80-\xFF]/', '', $YourJsonString), true );
If the above doesn't work, first replace the HTML-encoded quotes; this might be happening if you are sending data from JavaScript to PHP.
$YourJsonString = stripslashes($YourJsonString);
$YourJsonString = str_replace('&quot;', '"', $YourJsonString);
$YourJsonString = str_replace('["', '[', $YourJsonString);
$YourJsonString = str_replace('"]', ']', $YourJsonString);
$YourJsonString = str_replace('"{', '{', $YourJsonString);
$YourJsonString = str_replace('}"', '}', $YourJsonString);
$YourJsonObject = json_decode($YourJsonString);
That will solve it.
It's probably BOM, as others have mentioned. You can try this:
// BOM (Byte Order Mark) issue, needs removing to decode
$bom = pack('H*','EFBBBF');
$response = preg_replace("/^$bom/", '', $response);
unset($bom);
$response = json_decode($response);
This is a known bug with some SDKs, such as Authorize.NET
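A variant without a regular expression, as a sketch: check whether the string starts with the three UTF-8 BOM bytes and cut them off only if they are there. The sample $response is invented for the example.

```php
<?php
// Hypothetical response: valid JSON preceded by a UTF-8 byte order mark.
$response = "\xEF\xBB\xBF" . '{"name":"ichsan"}';

// Drop the three BOM bytes only when they are actually present.
if (strncmp($response, "\xEF\xBB\xBF", 3) === 0) {
    $response = substr($response, 3);
}

$data = json_decode($response, true);
echo $data['name'], PHP_EOL; // prints "ichsan"
```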
If you are getting json from database, put
mysqli_set_charset($con, "utf8");
after defining connection link $con
Just to save someone some time: I spent 3 hours finding out that it was just an HTML encoding problem. Try this
$param = $row['your column name']; // get_magic_quotes_gpc() was removed in PHP 8.0, so no stripslashes() check is needed
$param = json_decode(html_entity_decode($param),true);
$json_errors = array(
JSON_ERROR_NONE => 'No error has occurred',
JSON_ERROR_DEPTH => 'The maximum stack depth has been exceeded',
JSON_ERROR_CTRL_CHAR => 'Control character error, possibly incorrectly encoded',
JSON_ERROR_SYNTAX => 'Syntax error',
);
echo 'Last error : ', $json_errors[json_last_error()], PHP_EOL, PHP_EOL;
print_r($param);
So, html_entity_decode() worked for me. Please try this.
$input = file_get_contents("php://input");
$input = html_entity_decode($input);
$event_json = json_decode($input,true);
It took me like an hour to figure it out, but trailing commas (which work in JavaScript) fail in PHP.
This is what fixed it for me:
$JSON = str_replace([PHP_EOL, ",}"], ["", "}"], $JSON); // str_replace() returns the new string, so assign it
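A slightly broader sketch that also covers trailing commas before ] and tolerates whitespace after the comma. It is a blunt instrument: the pattern is my own and it would also change a string value that happens to contain ",}" or ",]", so fixing the source of the JSON is the safer route where that is possible.

```php
<?php
// Hypothetical input with JavaScript-style trailing commas.
$raw = '{"a":[1,2,],"b":3,}';

// Remove a comma that is followed only by whitespace and a closing } or ].
// Caveat: this would also alter a string value containing ",}" or ",]".
$clean = preg_replace('/,\s*([}\]])/', '$1', $raw);

$data = json_decode($clean, true);
var_dump($data === ['a' => [1, 2], 'b' => 3]); // bool(true)
```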
I recommend creating a .json file (e.g. config.json).
Then paste in your whole JSON object and format it. That way you can remove everything that is breaking your JSON object and end up with a clean copy to paste.
I also faced the same issue...
I fixed it with the following steps: 1) I printed the variable in the browser 2) validated the variable's data with freeformatter 3) copied/referred to that data in further processing.
After that, I didn't have any issue.
This happens because you use (') instead of (") in your value or key.
Here is the wrong format.
{'name':'ichsan'}
That will return NULL if you decode it.
You should pass the JSON request like this.
{"name":"ichsan"}
I've solved this issue by printing the JSON, and then checking the page source (CTRL/CMD + U):
print_r(file_get_contents($url));
Turned out there was a trailing <pre> tag.
You should check these points:
1. Your JSON string doesn't contain any unknown characters.
2. The JSON string can be opened in an online JSON viewer (search Google for an online JSON viewer or parser); it should display without any error.
3. Your string doesn't contain HTML entities; it should be plain text.
To explain point 3, change
$html_product_sizes_json=htmlentities($html);
$ProductSizesArr = json_decode($html_product_sizes_json,true);
to (removing the htmlentities() call)
$html_product_sizes_json=$html;
$ProductSizesArr = json_decode($html_product_sizes_json,true);
For my case, it's because of the single quote in JSON string.
JSON format only accepts double-quotes for keys and string values.
Example:
$jsonString = '{\'hello\': \'PHP\'}'; // valid value should be '{"hello": "PHP"}'
$json = json_decode($jsonString);
print $json; // null
I got confused by this because of JavaScript syntax. In JavaScript, of course, we can write it like this:
let json = {
hello: 'PHP' // no quote for key, single quote for string value
}
// OR:
json = {
'hello': 'PHP' // single quote for key and value
}
but later, when those objects are converted to a JSON string, the keys and values come out double-quoted:
JSON.stringify(json); // '{"hello":"PHP"}'
Before applying PHP related solutions, validate your JSON format. Maybe that is the problem. Try this online JSON format validator.
For me, I had to turn off error_reporting to get json_decode() working correctly. It sounds weird, but it was true in my case, because a notice was being printed in the middle of the JSON string that I was trying to decode.
I had exactly the same problem
But it was fixed with this code
$zip = file_get_contents($file);
$zip = json_decode(stripslashes($zip), true);
I had the same issue and none of the answers helped me.
One of the variables in my JSON object had the value Andaman & Nicobar. I removed this & and my code worked perfectly.
<?php
$json_url = "http://api.testmagazine.com/test.php?type=menu";
$json = file_get_contents($json_url);
$json=str_replace('},
]',"}
]",$json);
$data = json_decode($json);
echo "<pre>";
print_r($data);
echo "</pre>";
?>
