json_decode from https url JSON validation on https - php

Can anyone shed any light on why
http://jsonlint.com/
and
http://jsonformatter.curiousconcept.com/
are reporting invalid JSON for URL 1 below but not for URL 2? Both are served by the same JSON-generating code. The only difference is that one is HTTPS and the other is HTTP.
https://www.discussthemarket.com/dev/
JSON.parse: unexpected character at line 1 column 1 of the JSON data
and also
http://www.lambwatch.co.uk/json.htm
Valid JSON
Both have the same JSON generating code behind them, exact same code, but when I put the URLs into
http://jsonlint.com/
for validation, the https site is coming back with a parse error!?
Also, when I do
$json = json_decode(file_get_contents("https://www.discussthemarket.com/dev/"));
$json is NULL
However
$json = json_decode(file_get_contents("http://www.lambwatch.co.uk/json.htm"));
$json is the object as you'd expect
Can anyone shed any light on this?

Your problem is that the HTTPS server is adding a UTF-8 BOM (byte order mark) to the start of the output, which invalidates the otherwise valid JSON response. Without seeing the code, it's unclear why, but it's likely a header issue.
If you're unable to solve it server-side, you can always simply remove it at the other end. Here is an example:
<?php
$response = file_get_contents('https://www.discussthemarket.com/dev/');
$json = remove_utf8_bom($response);
var_dump(json_decode($json));

// Strip a leading UTF-8 BOM (bytes EF BB BF), if present.
function remove_utf8_bom($text) {
    $bom = pack('H*', 'EFBBBF');
    $text = preg_replace("/^$bom/", '', $text);
    return $text;
}
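To confirm that a BOM really is the culprit before stripping anything, it can help to look at the first three bytes of the response. The following is a minimal sketch; the helper name has_utf8_bom and the sample $response string are assumptions made for illustration, standing in for the real HTTPS response.

```php
<?php
// Hypothetical helper: report whether a string starts with the UTF-8 BOM.
function has_utf8_bom(string $text): bool
{
    return strncmp($text, "\xEF\xBB\xBF", 3) === 0;
}

// Stand-in for the HTTPS response: a BOM followed by otherwise valid JSON.
$response = "\xEF\xBB\xBF" . '{"ok":true}';

var_dump(has_utf8_bom($response));           // is the BOM there?
var_dump(bin2hex(substr($response, 0, 3)));  // hex dump of the first three bytes
var_dump(json_decode($response));            // decoding with the BOM still in place
var_dump(json_decode(substr($response, 3))); // decoding after skipping the three BOM bytes
```

If the hex dump shows efbbbf, the BOM is present and removing it (server-side or with remove_utf8_bom) should let json_decode succeed.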

Related

Json empty array json_decode

I have the following code that converts json into an array:
$str = file_get_contents('http://localhost/data.json');
$decodedstr = html_entity_decode($str);
$jarray = json_decode($decodedstr, true);
echo "<pre>";
print_r($jarray);
echo "</pre>";
but my $jarray keeps coming back as null, and I don't know why this is happening. I have already validated my JSON in this question:
validated json question. Could anyone tell me what I am doing wrong, or what is happening? Thanks in advance.
When I echo my $str I get the following:
You are passing json_decode a string that is not valid JSON; that is why NULL is returned.
As I see from the comments, inspecting the error code gives 4, which corresponds to the constant JSON_ERROR_SYNTAX and simply means that the JSON string has a syntax error.
(See http://php.net/manual/en/function.json-last-error.php)
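As a rough illustration of that check, here is a small sketch that decodes a deliberately broken string and then inspects the error state. The $broken sample is made up for the example; it is not the asker's data.

```php
<?php
// Made-up sample: the trailing comma makes this invalid JSON.
$broken = '{"name": "x",}';

$result = json_decode($broken, true);

// NULL alone is ambiguous (the JSON literal null also decodes to NULL),
// so check json_last_error() as well.
if ($result === null && json_last_error() !== JSON_ERROR_NONE) {
    echo 'json_decode failed with code ', json_last_error(),
         ': ', json_last_error_msg(), PHP_EOL;
}
```

The exact wording returned by json_last_error_msg() can differ between PHP builds, so comparing against the JSON_ERROR_* constants is the more dependable test.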
You should inspect (echo) what you get from
$str = file_get_contents('http://localhost/data.json');
(You may edit your question and post it, or a portion of it.)
It is certainly not valid JSON; the problem lies there, in data.json.
Then, once you have fixed things and get what is expected from data.json, I would check whether you really need to use html_entity_decode on the fetched data.
It would be "weird" to have HTML-encoded JSON data.
UPDATE
Looking at what you get from data.json, it seems the JSON data actually contains HTML entities (I can see &nbsp; entities in it).
This is actually weird. The right thing to do would be to fix how data.json is generated, ensuring that non-HTML-encoded JSON data is returned, that the charset is UTF-8 and that the response content type is Content-Type: application/json.
We can't go deeper into this here, as I don't know where data.json comes from or what code generates it. You may want to post another question about that.
So here is a quick fix, with the caveat that the right approach is the one I just suggested above.
When you decode HTML entities, each non-breaking space (&nbsp;) turns into a 2-byte UTF-8 character (byte values 194, 160), which JSON does not accept as whitespace between tokens.
The idea is to remove these characters; your code becomes:
$str = file_get_contents('http://localhost/data.json');
$decodedstr = html_entity_decode($str);
// the character sequence for a decoded HTML &nbsp;
$nbsp = html_entity_decode( "&nbsp;" );
// remove every occurrence of the character sequence
$decodedstr = str_replace( $nbsp, "", $decodedstr );
$jarray = json_decode($decodedstr, true);
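To see the effect in isolation, the sketch below runs the same steps on a made-up string instead of the real data.json. The $fetched sample is an assumption, and the UTF-8 charset is passed explicitly so the result does not depend on the PHP version's default.

```php
<?php
// A decoded &nbsp; is U+00A0, which is the two bytes C2 A0 in UTF-8.
$nbsp = html_entity_decode('&nbsp;', ENT_QUOTES, 'UTF-8');
var_dump(bin2hex($nbsp));

// Made-up stand-in for the HTML-encoded content of data.json.
$fetched = '{&nbsp;"a":&nbsp;1&nbsp;}';
$decoded = html_entity_decode($fetched, ENT_QUOTES, 'UTF-8');

// With the non-breaking spaces still between the tokens, decoding fails.
$before = json_decode($decoded, true);

// After removing them, the same data decodes.
$after = json_decode(str_replace($nbsp, '', $decoded), true);

var_dump($before, $after);
```

Note that str_replace also removes non-breaking spaces inside string values, which is one more reason to prefer fixing the generator of data.json.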
from php manual
http://php.net/manual/en/function.json-decode.php
Return:
... NULL is returned if the json cannot be decoded or if the encoded data is deeper than the recursion limit
So the JSON string passed to json_decode() is surely not valid,
maybe because of html_entity_decode.

valid json doesnt work

I am trying to decode a JSON string in PHP, but somehow json_decode doesn't like my string; I think it is not valid JSON. What is very strange to me is that if I put the JSON response in a variable manually, it works. If I write the JSON response out in the browser and then write out the content of the variable, both are completely the same, like this:
{"id":455463,"Created":"2016-04-30T14:20:38.09","SenderCompanyName":"x","InvoiceNumber":"2555","PaymentDueDate":"2016-04-30T00:00:00","ToBePaidAmount":350.0000}
If I look at the web page source, the content is also completely the same. I have also tried converting to UTF-8, but nothing changed.
How do you usually debug this, or what did I forget?
code:
// calling web service and saving json response in variable
$json_response = CallAPI($method, $url, $json_request);
// the response contains an invalid character at the end, so I am removing it
$json_response = substr($json_response, 0, strpos($json_response, "}"));
// trying to decode it, IT PRINTS OUT NULL
var_dump(json_decode($json_response, true));
// copying the json response from the above and putting it into a variable
$json_response = '{"id":455433,"Created":"2016-04-30T12:55:12.313","SenderCompanyName":"x","InvoiceNumber":"2525","PaymentDueDate":"2016-04-30T00:00:00","ToBePaidAmount":350.0000}';
// trying to decode it, IT PRINTS OUT THE RESULT SUCCESSFULLY
var_dump(json_decode($json_response, true));
Try this:
<?php
$json = '{"id":455463,"Created":"2016-04-30T14:20:38.09","SenderCompanyName":"x","InvoiceNumber":"2555","PaymentDueDate":"2016-04-30T00:00:00","ToBePaidAmount":350.0000}';
var_dump(json_decode($json));
?>
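I finally found the solution.
I had forgotten to add this to my cURL options:
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
Without it, curl_exec() prints the response directly and returns true instead of the response body. After adding it, everything works fine.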

Æøå in returned JSON result - the data doesn't look like it's supposed to

I have fetched some data from a url request using JSON with the following code:
$url = 'https://recruit.zoho.com/ats/private/xml/JobOpenings/getRecords?authtoken=$at&scope=recruitapi';
$request = new WP_Http;
$result = $request->request($url, $data = array());
$input = json_encode($result, true);
var_dump($input);
This code worked absolutely fine, except the data coming out looked really weird, such as:
"content-encoding":"gzip","vary":"Accept-Encoding","strict-transport-security":"max-age=15768000"},"body":"\u003C?xml version=\"1.0\" encoding=\"UTF-8\" ?\u003E\n\u003Cresponse uri=\"\/ats\/private\/xml\/JobOpenings\/getRecords\"\u003E\u003Cresult\u003E\u003CJobOpenings\u003E\u003Crow no=\"1\"\u003E\u003CFL val=\"JOBOPENINGID\"\u003E\u003C![CDATA[213748000001263043]]\u003E\u003C\/FL\u003E\u003CFL val=\"Published in website\"\u003E\u003C![CDATA[false]]\u003E\u003C\/FL\u003E\u003CFL val=\"Modified by\"\u003E\u003C![CDATA
After some research, I realize that part of the problem most likely is the fact that there are æ, ø, and å in the data I'm requesting. Others have solved the problem this way:
$input = json_encode(utf8_decode($result), true);
However this gives me this error:
Warning: utf8_decode() expects parameter 1 to be string, array given in
I know the array is not a string, but how else do I deal with this? It seems to have worked for others, and I can't figure out why.
Thanks.
Edit:
I noticed this in the beginning of the printed data.
string(31486) "{"headers":{"server":"ZGS","date":"Wed, 12 Aug 2015 13:59:32 GMT","content-type":"text\/xml;charset=utf-8"
Does that mean it is already UTF-8 and I'm totally off?
What you receive in $result is a UTF-8 string that seems to represent a URL of some sort. In any case, json_encode escapes every non-ASCII Unicode character as a \uXXXX sequence by default.
If you don't want UTF-8 characters to be escaped, this question is relevant to you: Why does the PHP json_encode function convert UTF-8 strings to hexadecimal entities?
Everything seems to work fine from what I can see, although the string you have provided seems to be truncated; I guess that is an error on your part.
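For illustration, the sketch below encodes a short made-up string with different json_encode flags. One observation that is an inference from the question's output rather than something stated in the thread: the second argument of json_encode is a bitmask of options, and JSON_HEX_TAG has the value 1, so passing true there would plausibly explain why < and > show up as \u003C and \u003E.

```php
<?php
// Made-up sample containing both markup and non-ASCII letters.
$text = '<b>æøå</b>';

// Default: non-ASCII characters become \uXXXX and "/" becomes "\/".
$default = json_encode($text);

// JSON_HEX_TAG additionally turns "<" and ">" into \u003C and \u003E.
$hexTag = json_encode($text, JSON_HEX_TAG);

// Keep the characters readable in the encoded output.
$readable = json_encode($text, JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES);

echo $default, PHP_EOL, $hexTag, PHP_EOL, $readable, PHP_EOL;
```

All three outputs decode back to the same string, so the escaping is cosmetic. The body in the question is XML, so it would need an XML parser rather than JSON handling anyway.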

Error decoding JSON in PHP: "unexpected character"

I have a strange JSON decoding problem in PHP. One particular JSON file (and only this one) reliably produces a decoding error. The PHP part is nothing special:
$data = array();
if (is_file(DIR.'config.json')) {
$data = json_decode(DIR.'config.json');
}
// diagnostics:
var_dump($data); // --> NULL (would be array() if decoding didn't happen)
var_dump(json_last_error()); // --> int(4) === JSON_ERROR_SYNTAX
var_dump(json_last_error_msg()); // --> string(20) "unexpected character"
echo file_get_contents(DIR.'config.json'); // --> '["a"]'
What I've tried:
Simplify JSON. The content of the file now is ["a"].
validate JSON, just to be sure, in four validators, including jsonlint.com and pasting it in browser console
No BOM, plain ASCII file
has read rights. The above echo works just fine.
completely independent of JSON file's content.
path is correct. Proven by above echo statement. JSON file is in the docroot.
Unfortunately, PHP's JSON library doesn't report errors more precisely, so I have no idea what really causes the error.
Does anyone have an idea how I could proceed?
You're not decoding the file contents, but rather the name of the file itself, which is why you're getting null. Your code should be:
$data = array();
if (is_file(DIR.'config.json')) {
$data = json_decode(file_get_contents(DIR.'config.json'));
}
json_decode(DIR.'config.json');
You are trying to decode the string "<whatever-DIR-is>config.json". That's not a valid JSON string. You probably want:
json_decode(file_get_contents(DIR.'config.json'));
You also probably want more coffee.
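One way to make this mistake harder to repeat is to wrap reading and decoding in a single function that fails loudly. This is only a sketch: load_json_file is a hypothetical helper name, and the temporary file stands in for config.json.

```php
<?php
// Hypothetical helper: read a file and decode its contents as JSON,
// throwing instead of silently returning NULL.
function load_json_file(string $path)
{
    $contents = file_get_contents($path);
    if ($contents === false) {
        throw new RuntimeException("Could not read $path");
    }

    $data = json_decode($contents, true);
    if ($data === null && json_last_error() !== JSON_ERROR_NONE) {
        throw new RuntimeException("Invalid JSON in $path: " . json_last_error_msg());
    }

    return $data;
}

// Temporary file standing in for config.json.
$path = tempnam(sys_get_temp_dir(), 'cfg');
file_put_contents($path, '["a"]');

$fromContents = load_json_file($path);   // decodes the file contents
$fromName     = json_decode($path, true); // decodes the path string itself

unlink($path);
var_dump($fromContents, $fromName);
```

The second call reproduces the original bug: the path is just a string that happens not to be JSON, so the result is NULL.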

Trouble decoding Json from Google

Well, I'm trying to retrieve some info, from the Google Suggest tool.
The thing is, the json returned after the request, doesn't seem decode-able (using json_decode) + JSONLint sees it as "invalid".
What's wrong?
{
e: "GooDUs7lFIeXO63LgBA",
c: 0,
u: "https://www.google.com/s?gs_rn\x3d24\x26gs_ri\x3dpsy-ab\x26tok\x3dt8ORbtI13MEFLoCQjPSv6w\x26cp\x3d2\x26gs_id\x3d3i\x26xhr\x3dt\x26q\x3dtemplate\x26es_nrs\x3dtrue\x26pf\x3dp\x26safe\x3doff\x26sclient\x3dpsy-ab\x26oq\x3d\x26gs_l\x3d\x26pbx\x3d1\x26bav\x3don.2,or.r_cp.r_qf.\x26bvm\x3dbv.50500085,d.bGE\x26fp\x3dc513cf9c63a02102\x26biw\x3d1304\x26bih\x3d437\x26tch\x3d1\x26ech\x3d20\x26psi\x3dFYkDUs-xCsrT4QTD9YGwDw.1375963413783.1",
p: true,
d: "[\x22template\x22,[[\x22template\\u003cb\\u003es\\u003c\\/b\\u003e\x22,0],[\x22template\\u003cb\\u003e monster\\u003c\\/b\\u003e\x22,0],[\x22template\\u003cb\\u003e c++\\u003c\\/b\\u003e\x22,0],[\x22template\\u003cb\\u003es for pages\\u003c\\/b\\u003e\x22,0]],{\x22t\x22:{\x22bpc\x22:false,\x22tlw\x22:false},\x22q\x22:\x22YjrI_EdhVrEkZrkqZwaGIJ_Ih4c\x22,\x22j\x22:\x223i\x22}]"
}
That's what JSONLint gives as an error :
Parse error on line 1:
{ e: "GooDUs7lFIeXO63L
-----^
Expecting 'STRING', '}'
P.S. Even after editing it like "e": and so on, it still gives out error regarding the value of u and claiming that it was expecting a STRING or NUMBER etc... :S
The code given in the question is not valid JSON.
To be valid JSON, it would need to have the field names in quotes. There are no quotes around the e variable name, or any of the others.
This is what the JSON decoder is complaining about: it is expecting to see "e", not e.
In addition, JSON does not accept the \x escaping format (a two-digit hexadecimal character reference); it only allows the \u format (a Unicode code unit written as four hexadecimal digits). The code you've provided includes escaped characters in both formats.
The question is, are you using an official Google API? Because they're usually pretty good at providing valid JSON. This isn't valid JSON, so it may be that you're not using the correct API. Another clue is that the variable names aren't very meaningful; official APIs would normally give more meaningful variable names. If it is the correct API, you should try raising a ticket with Google to fix it; broken JSON is not good, but it should be pretty trivial for them to fix.
Assuming you can't get them to fix it and we can't find an alternative API location that does give valid data, how do we deal with what we've got?
While this code may not be valid JSON, it is valid as a JavaScript object literal (the JSON rules are stricter than those of plain JavaScript). It could therefore be run in a JavaScript interpreter using eval(), if you trusted it enough for that.
The only other alternative is to fix the string prior to parsing it so that the variable names are quoted. That's a bit of a pain, but would be doable if the output was consistent. You'll have problems though if it ever changes (and again, if it's an unofficial API, that could happen at any time without warning).
The problem is with the backslashes in the strings (used for the escape characters).
In PHP 5.4 and later, you can use JSON_UNESCAPED_SLASHES:
echo json_encode(JSON_STRING, JSON_UNESCAPED_SLASHES);
Otherwise, you can do the replacement yourself:
str_replace('\\/', '/', json_encode(JSON_STRING));
This works because \/ is a valid way to represent / in JSON.
OK, so this is what I ended up doing (not elegant at all but it works) :
$content = preg_replace_callback(
"(\\\\x([0-9a-f]{2}))i",
function($a) {return chr(hexdec($a[1]));},
$content
);
$content = str_replace("e:","\"e\":",$content);
$content = str_replace("c:","\"c\":",$content);
$content = str_replace("u:","\"u\":",$content);
$content = str_replace("p:","\"p\":",$content);
$content = str_replace("d:","\"d\":",$content);
$content = str_replace("\"[","[",$content);
$content = str_replace("]\"","]",$content);
$content = json_decode($content);
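A somewhat more general variant of the workaround above is sketched here. It rewrites each \xNN escape as the equivalent \u00NN escape instead of decoding it in place, quotes any bare key with one regular expression, and then decodes the d field in a second step because it is itself a JSON string. The helper name js_literal_to_json and the shortened $raw sample are assumptions for illustration, and the key-quoting pattern assumes that no string value contains text that looks like a bare key after a comma.

```php
<?php
// Hypothetical helper: turn a JavaScript-style object literal with bare keys
// and \xNN escapes into a string that json_decode() accepts.
function js_literal_to_json(string $raw): string
{
    // Rewrite \xNN as \u00NN. An escaped backslash (\\) is matched first and
    // left alone, so sequences such as \\u003c are not disturbed.
    $fixed = preg_replace_callback(
        '/\\\\(?:\\\\|x([0-9a-fA-F]{2}))/',
        function (array $m): string {
            return ($m[1] ?? '') !== '' ? '\\u00' . $m[1] : $m[0];
        },
        $raw
    );

    // Quote bare keys that directly follow "{" or ",".
    return preg_replace('/([{,]\s*)([A-Za-z_]\w*)\s*:/', '$1"$2":', $fixed);
}

// Shortened stand-in for the response shown in the question.
$raw = <<<'JS'
{
e: "GooDUs7l",
c: 0,
p: true,
d: "[\x22template\x22,[[\x22template\\u003cb\\u003es\\u003c\\/b\\u003e\x22,0]]]"
}
JS;

$outer = json_decode(js_literal_to_json($raw), true);
$inner = json_decode($outer['d'], true); // "d" holds a second JSON document

var_dump($outer['c'], $inner);
```

Like the original workaround, this depends on the response keeping its current shape, so it should be treated as a stopgap.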
