Can anyone explain this PHP code using json_encode and json_decode? - php

$a = '{ "tag": "<b></b>" }';
echo json_encode( json_decode($a) );
This outputs:
{"tag":"<b><\/b>"}
when you would expect it to output exactly the input. For some reason json_encode adds a backslash before the forward slash.

Because it's part of the JSON standard
http://json.org/
char
any-Unicode-character-
except-"-or-\-or-
control-character
\"
\\
\/ <---- see here?
\b
\f
\n
\r
\t
\u four-hex-digits
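As a quick check of the grammar above, here is a minimal sketch showing that \/ and / are just two spellings of the same character once decoded. The variable names are only for illustration.

```php
<?php
// Both spellings of the slash decode to the same PHP value.
$escaped = '{"tag":"<b><\/b>"}';   // what json_encode printed
$plain   = '{ "tag": "<b></b>" }'; // the original input

$fromEscaped = json_decode($escaped, true);
$fromPlain   = json_decode($plain, true);

var_dump($fromEscaped === $fromPlain); // bool(true)
echo $fromEscaped['tag'], "\n";        // <b></b>
```

So the extra backslash only exists in the serialized text, not in the data.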

To escape the angle brackets as well, use this:
echo json_encode(json_decode($a), JSON_HEX_TAG);
Result will be:
{"tag":"\u003Cb\u003E\u003C\/b\u003E"}
See the manual page for more about json_encode:
http://php.net/manual/en/function.json-encode.php

That's probably a security feature. JavaScript parses the escaped version (i.e. the output) the same way as the unescaped version (i.e. \/ becomes /). With the slash escaped like that, there is less chance of the browser misinterpreting the JavaScript string as HTML. Of course, if you treat the data correctly, this shouldn't be needed, so it's more of a safeguard against a clueless programmer messing things up for himself.

Your input is not valid JSON, but PHP's JSON parser (like most JSON parsers) will parse it anyway.

Related

preg_match removes escaping from json string

I am trying to parse some JSON data using PHP's json_decode function. However, I need to remove certain leading and trailing characters from this long string before decoding, so I am using preg_match to strip those characters first. For some reason, preg_match changes the escaping when it encounters the following substring (in the middle of the string):
{content: \\\"\\200B\\\"}
After preg_match the above string looks like this:
{content: \\"\200B\\"}
Because of this, json_decode fails.
FYI, the preg_match pattern looks like this:
(?<=remove_these_leading_char)(.*)(?=remove_these_trailing_char)
OK, so here is the additional information based on the questions being asked:
Why triple escaping? Fix the triple escaping, etc. The answer is that I don't have any control over it. It is not generated by my code.
The original string is not fully json compliant. It has several leading and trailing characters that need to be removed. Therefore I have to use regex. The format of that string is like this:
returnedHTMLdata({json_object},xx);
It looks like this behavior is not limited to preg_match only. Even substr also does this.
It looks like you've got some JSON with padding. To remove the function name and parentheses, leaving the (unescaped) JSON object, you can do something like this:
$str = <<<'EOS'
returnedHTMLdata({content: \\\"\\200B\\\", foo: \\\"bar\\\", \"baz\": \\\"fez\\\"},xx);
EOS;
$str = preg_replace('/.+?({.+}).+/','$1', $str);
echo $str;
Output:
{content: \\\"\\200B\\\", foo: \\\"bar\\\", \"baz\": \\\"fez\\\"}
Please note that even if you manage to successfully unescape this string, json_decode requires that keys - e.g. "content" - are enclosed in double quotes, so you will need to modify the JSON string/object before calling that function. Or I guess you could instead use something like the old Services_JSON package to decode it, which I believe does not have that requirement.
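For comparison, here is a minimal sketch of stripping the padding when the payload inside is already well-formed JSON. The sample payload is hypothetical (quoted keys, no extra escaping), unlike the string in the question, and the first-"{" / last-"}" approach assumes the wrapper itself contains no braces.

```php
<?php
// Hypothetical, well-formed padded input (not the asker's actual data).
$raw = 'returnedHTMLdata({"content":"\u200B","foo":"bar"},xx);';

// Keep everything from the first "{" through the last "}".
$start = strpos($raw, '{');
$end   = strrpos($raw, '}');
$json  = substr($raw, $start, $end - $start + 1);

$data = json_decode($json, true);
var_dump($data['foo']); // string(3) "bar"
```

Plain string functions avoid any question about what the regex engine does with the backslashes.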

PHP JSON response normalization

I've got JSON response.
Almost all of it is correct, but SOME values need "addslashes" before I can decode without errors.
http://jsonformatter.curiousconcept.com/ says that the following is invalid:
"admiAmn":"DEEE\trtrtrtrtr",
And I agree with jsonformatter.
If I use addslashes, slashes will be added everywhere, and I only need to replace the following:
[NOT_SLASH]\[NOT_SLASH]
with:
[NOT_SLASH]\\[NOT_SLASH]
I can't use either str_replace or addslashes; I must be sure that the '\' being replaced has no '\' before or after it.
Thanks.
I would like to hear your thoughts and ideas.
You can use preg_replace to do the trick like this:
$in = '"admiAmn":"DEEE\trtrtrtrtr\\alma"';
$slash = preg_quote('\\');
echo preg_replace("#(?<!{$slash}){$slash}(?!{$slash})#", $slash.$slash, $in), "\n";
I've moved the escaped \ to a variable to make it more readable. The pattern uses the negative lookbehind and lookahead features to make this work.
However, if you can, you should try to fix the source instead of patching the output (or at least file a bug report of some kind); patching output can be really brittle.
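Here is a small sketch of the same pattern applied to a complete JSON document and then decoded. The input is hypothetical: \d is not a valid JSON escape, so json_decode rejects it until the lone backslash is doubled.

```php
<?php
// Hypothetical input: \d is not a valid JSON escape, so decoding fails.
$in = '{"path":"C:\dir"}';
var_dump(json_decode($in)); // NULL

// Double every backslash that has no backslash directly before or after it.
$slash = preg_quote('\\');
$fixed = preg_replace("#(?<!{$slash}){$slash}(?!{$slash})#", $slash . $slash, $in);

$data = json_decode($fixed, true);
echo $data['path'], "\n"; // C:\dir
```

Note that this also doubles backslashes that were part of valid escapes such as \n or \", which is one reason patching the output is brittle.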

Regex messing up json_decode in php

When I put this into a json checker, it's a valid json, but the json_decode in php gives a decode error. json partial:
"regex":{
"validator":"Regex",
"options":{
"pattern":"\/^[a-zA-Z\\.\\- ]+$\/",
"messages":"Please use letters, spaces, period and dashes only"
}
}
I looked at Regular expression messing up json_decode(); but that didn't help me.
Thanks!
Here is the entire json:
This works.
<?php
error_reporting(E_ALL);
$json = '{"regex":{
"validator":"Regex",
"options":{
"pattern":"\\/^[a-zA-Z\\\\.\\\\- ]+$\\/",
"messages":"Please use letters, spaces, period and dashes only"
}
}
}';
var_dump(json_decode($json, true));
?>
Notice that the entire JSON string is wrapped in {} and that every backslash is escaped with another backslash (so where the regex wants \ we have \\). This works perfectly.
NOTE UPDATE:
Just str_replace("\\", "\\\\", $json); and you will be fine. Also, if this is submitted in a form it SHOULD be fine. I just submitted your entire JSON string through an HTML form and sent it directly to json_decode (without escaping) and it worked. This is because form data never goes through PHP's string-literal parsing, so the backslashes arrive exactly as typed. The extra escaping is only needed when the string is defined as a literal in PHP source.
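If the JSON does have to live in PHP source, one way to avoid the extra layer of escaping is a nowdoc, which PHP does not process for escapes. A minimal sketch, using a trimmed-down version of the JSON from the question:

```php
<?php
// Nowdoc: PHP does no escape processing here, so the JSON is written
// exactly as the validator saw it, with no extra backslashes.
$json = <<<'JSON'
{"regex":{"validator":"Regex","options":{"pattern":"\/^[a-zA-Z\\.\\- ]+$\/"}}}
JSON;

$data = json_decode($json, true);
echo $data['regex']['options']['pattern'], "\n"; // /^[a-zA-Z\.\- ]+$/
```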

magento escape string for JavaScript part 2

This is a follow up on
magento escape string for javascript
where I accepted @AlanStorm's suggestion to use json_encode to escape string literals.
But I now have a new problem with this solution.
When trying to escape a URL that has /'s in it, to be rendered as a string literal for JavaScript, json_encode seems to add redundant \'s in front of the /'s.
Any new suggestions here?
Solutions should take a string variable and return a string that would properly be evaluated to a string literal in JavaScript. (I don't care if it's surrounded with single or double quotes, although I prefer single quotes. And it must also support newlines in the string.)
Thanks
Some more info: how come
<?php $v = array('a' => '/'); echo json_encode($v); ?>
results in
{"a":"\/"} ?
Details can be found here http://bugs.php.net/bug.php?id=49366
work around for this issue:
str_replace('\\/', '/', $jsonEncoded);
for your issue you can do something like
$jsonDecoded = str_replace(array("\\/", "/'s"), array("/", "/\'s"), $jsonEncoded);
Hope this helps
When I check the JSON format I see that solidi are allowed to be escaped so json_encode is in fact working correctly.
(source: json.org)
The bug link posted by satrun77 even says "It's not incorrect to escape slashes."
If you're adamant to do without and (in this case) are certain to be working with a string you can use a hack like this:
echo '["', addslashes($string), '"]';
Obviously that doesn't help for more complicated structures, but as luck has it, you are using Magento, which is highly modifiable. Copy lib/Zend/Json/Encoder.php to app/code/local/Zend/Json/Encoder.php (which forms an override) and fix its _encodeString method.

Why does the JSON encoder add an escaping character when encoding URLs?

I am using json_encode in PHP to encode a URL
$json_string = array ('myUrl'=> 'http://example.com');
echo json_encode ($json_string);
The above code generates the following JSON string:
{"myUrl":"http:\/\/example.com"}
Rather than
{"myUrl":"http://example.com"}
I am just a newbie. Which output is correct? Can a JSON parser evaluate the second output correctly?
According to https://www.json.org/, one should escape that character, although it is not strictly necessary in JavaScript:
Also read this related bug report on php.net for a brief discussion.
See section 2.5 of the RFC:
All Unicode characters may be placed within the quotation marks except for the characters that must be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F).
Any character may be escaped.
So it doesn't sound like it needs to be escaped, but it can be, and the website (and a text diagram in the RFC) illustrates it as being escaped.
My guess is that the writers of that function added that unnecessary encoding through nothing more than plain ignorance. Escaping forward slashes is not required.
A surprisingly large number of programmers I've known are just as bad with keeping their slashes straight as the rest of the world. And an even greater number are really poor with doing encoding and decoding properly.
Update:
After doing some searches, I came across this discussion. It makes a good point that escaping a / is sometimes necessary for bad HTML parsers. I once ran into a problem where IE 6 incorrectly handled content like this:
<script>
var json = { scriptString: "<script> /* JavaScript here */ </script>" };
</script>
IE 6 would see the </script> inside of the string and close out the script tag too early. Thus, this is more IE 6 safe (though the opening script tag in string might also break things... I can't remember):
<script>
var json = { scriptString: "<script> \/* JavaScript here *\/ <\/script>" };
</script>
And they also say that some bad parsers would see the // in http:// and treat the rest of the line like a JavaScript comment.
So it looks like this is yet another case of Internet technologies being hijacked by Browser Fail.
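To tie this back to PHP, here is a small sketch (using the same sample string as above) showing that json_encode's default slash escaping keeps a literal </script> out of the output:

```php
<?php
$payload = array('scriptString' => '<script> /* JavaScript here */ </script>');

$json = json_encode($payload);
echo $json, "\n";
// {"scriptString":"<script> \/* JavaScript here *\/ <\/script>"}

// The closing tag never appears verbatim in the encoded text.
var_dump(strpos($json, '</script>') === false); // bool(true)
```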
If you are using PHP 5.4 or later, you can use the json_encode options; see the manual.
Several options were added in PHP 5.3, but JSON_UNESCAPED_SLASHES arrived in 5.4.
I think this solves your problem
json_encode ($json_string, JSON_UNESCAPED_SLASHES );
You can see the documentation:
https://www.php.net/manual/en/function.json-encode.php https://www.php.net/manual/en/json.constants.php
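A side-by-side sketch of the default behaviour and the flag, using the array from the question (renamed $data here, just for readability):

```php
<?php
$data = array('myUrl' => 'http://example.com');

$default   = json_encode($data);
$unescaped = json_encode($data, JSON_UNESCAPED_SLASHES);

echo $default, "\n";   // {"myUrl":"http:\/\/example.com"}
echo $unescaped, "\n"; // {"myUrl":"http://example.com"}

// Both forms decode to the same value.
var_dump(json_decode($default, true) === json_decode($unescaped, true)); // bool(true)
```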
I see another problem here. The string result {"myUrl":"http://example.com"} should not have the member name myUrl quoted. In JavaScript and JSON, I think all object literal member ids are unquoted strings. So, I would expect the result to be {myUrl:"http://example.com"}.
This seems too big a bug in PHP, so I must be wrong.
Edit, 2/11/11: Yes, I'm wrong. JSON syntax requires even the field names to be in double quotation marks.
