How to json_decode invalid JSON with apostrophe instead of quotation mark - php

Sample code:
<?php
$json = "['foo', 'bar']";
var_dump( json_decode($json) );
It works on my machine with PHP 5.5.3, but it fails on lower PHP versions.
I know it is incorrect JSON, but my web service gives me JSON with ' symbols mixed with " symbols:
['foo', "bar", {'test': "crazy \"markup\""}]
How can I parse JSON data with apostrophes in PHP 5.3? Obviously, the original JSON I want to parse is more complex.
(I can't upgrade PHP on the production server, nor can I get proper JSON from the web service.)

Here's an alternative solution to this problem:
function fixJSON($json) {
$regex = <<<'REGEX'
~
"[^"\\]*(?:\\.|[^"\\]*)*"
(*SKIP)(*F)
| '([^'\\]*(?:\\.|[^'\\]*)*)'
~x
REGEX;
return preg_replace_callback($regex, function($matches) {
return '"' . preg_replace('~\\\\.(*SKIP)(*F)|"~', '\\"', $matches[1]) . '"';
}, $json);
}
This approach is more robust than h2ooooooo's function in two respects:
It preserves double quotes occurring in a single-quoted string by applying additional escaping to them. h2o's variant will replace them with single quotes instead, thus changing the value of the string.
It will properly handle escaped double quotes \", for which h2o's version seems to go into an infinite loop.
Test:
$brokenJSON = <<<'JSON'
['foo', {"bar": "hel'lo", "foo": 'ba"r ba\"z', "baz": "wor\"ld ' test"}]
JSON;
$fixedJSON = fixJSON($brokenJSON);
$decoded = json_decode($fixedJSON);
var_dump($fixedJSON);
print_r($decoded);
Output:
string(74) "["foo", {"bar": "hel'lo", "foo": "ba\"r ba\"z", "baz": "wor\"ld ' test"}]"
Array
(
[0] => foo
[1] => stdClass Object
(
[bar] => hel'lo
[foo] => ba"r ba"z
[baz] => wor"ld ' test
)
)

Here's a simple parser that'll fix your quotes for you. If it encounters a ' quote that isn't inside a double-quoted string, it assumes the quote is wrong, replaces any double quotes inside that string with single quotes, and wraps the string in double quotes instead:
Example:
<?php
function fixJSON($json) {
$newJSON = '';
$jsonLength = strlen($json);
for ($i = 0; $i < $jsonLength; $i++) {
if ($json[$i] == '"' || $json[$i] == "'") {
$nextQuote = strpos($json, $json[$i], $i + 1);
$quoteContent = substr($json, $i + 1, $nextQuote - $i - 1);
$newJSON .= '"' . str_replace('"', "'", $quoteContent) . '"';
$i = $nextQuote;
} else {
$newJSON .= $json[$i];
}
}
return $newJSON;
}
$brokenJSON = "['foo', {\"bar\": \"hel'lo\", \"foo\": 'ba\"r'}]";
$fixedJSON = fixJSON( $brokenJSON );
var_dump($fixedJSON);
print_r( json_decode( $fixedJSON ) );
?>
Output:
string(41) "["foo", {"bar": "hel'lo", "foo": "ba'r"}]"
Array
(
[0] => foo
[1] => stdClass Object
(
[bar] => hel'lo
[foo] => ba'r
)
)

NikiC's answer is already spot on. Your input seems to be manually generated, so it's entirely possible that within ' single-quoted strings, you'll receive unescaped " double quotes. A regex assertion is therefore advisable instead of a plain search and replace.
But there are also a few userland JSON parsers which support a bit more Javascript expression syntax. It's probably best to speak of JSOL, JavaScript Object Literals, at this point.
PEAR's Services_JSON
Services_JSON can decode:
unquoted object keys
and strings enclosed in single quotes.
No additional options are required; just call (new Services_JSON)->decode($jsol);
up_json_decode() in upgradephp
This was actually meant as a fallback for early PHP versions without the JSON extension. It reimplements PHP's json_decode(). But there's also the upgrade.php.prefixed version, which you'd use here.
It introduces an additional flag JSON_PARSE_JAVASCRIPT.
up_json_decode($jsol, false, 512, JSON_PARSE_JAVASCRIPT);
And I totally forgot about mentioning this in the docs, but it also supports single-quoted strings.
For instance:
{ num: 123, "key": "value", 'single': 'with \' and unquoted " dbls' }
Will decode into:
stdClass Object
(
[num] => 123
[key] => value
[single] => with ' and unquoted " dbls
)
Other options
JasonDecoder by @ArtisticPhoenix supports unquoted keys and literals, though not '-quoted strings. However, it's easy to understand and extend.
YAML (1.2) is a superset of JSON, and most parsers support both unquoted keys and single-quoted strings. See also PHP YAML Parsers.
Obviously any JSOL tokenizer/parser in userland is measurably slower than just preprocessing malformed JSON. If you expect no further gotchas from your webservice, go for the regex/quote conversion instead.

One solution would be to build a proxy using NodeJS. NodeJS will handle the faulty JSON just fine and return a clean version:
johan:~ # node
> JSON.stringify(['foo', 'bar']);
'["foo","bar"]'
Maybe write a simple Node script that accepts the JSON data as STDIN and returns the validated JSON to STDOUT. That way you can call it from PHP.
The downside is that your server would need NodeJS. Not sure if that is a problem for you.

If you know that PHP 5.5+ will parse this JSON gracefully, I would pipe the web service responses through a proxy script on a PHP 5.5+ web server, which sanitizes the responses for lower versions, meaning it just does echo json_encode(json_decode($response)); That's a stable and reliable approach.
If you make the web service URL configurable through a config value, lower versions can access the proxy while higher versions access the web service directly.

A fast solution could be str_replace("'","\"",$string). This depends on many things, but I think you could give it a try.
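To see what "depends on many things" can mean in practice, here is a minimal sketch with two made-up inputs. The plain replacement works when apostrophes only appear as string delimiters, but it breaks as soon as an apostrophe occurs inside a double-quoted value:

```php
<?php
// Two hypothetical inputs: one with apostrophes only as delimiters,
// one with an apostrophe inside a double-quoted value.
$simple = "['foo', 'bar']";
$mixed  = "['foo', \"hel'lo\"]";

// Naive fix: turn every apostrophe into a double quote.
$fixedSimple = str_replace("'", '"', $simple);
$fixedMixed  = str_replace("'", '"', $mixed);

var_dump(json_decode($fixedSimple)); // array containing "foo" and "bar"
var_dump(json_decode($fixedMixed));  // NULL, because "hel"lo" is not a valid JSON string
```

So this shortcut is probably only safe if you are sure the values themselves never contain apostrophes or double quotes.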

You could use (and probably modify/extend) a library to build an AST from the supplied JSON and replace the single quotes with double quotes.
https://github.com/Seldaek/jsonlint/blob/master/src/Seld/JsonLint/Lexer.php
Might be a good start.

Related


How to repair a serialized string that has been corrupted due to a removed slash before a single quote?

I am getting an error when trying to unserialize data. The following error occurs:
unserialize(): Error at offset 46 of 151 bytes
Here is the serialized data:
s:151:"a:1:{i:0;a:4:{s:4:"name";s:15:"Chloe O'Gorman";s:6:"gender";s:6:"female";s:3:"age";s:3:"3_6";s:7:"present";s:34:"Something from Frozen or a jigsaw ";}}";
The error is being caused by a single quote in the data. How can I alleviate this problem when the site and database that I am working with are already live?
Unfortunately I cannot rewrite the code that was responsible for serializing and inserting the data to the database. It is highly likely that there are multiple occurrences of this problem across the database.
Is there a function I can use to escape the data?
After doing further research, I have found a workaround. According to this blog post:
"It turns out that if there's a ", ', :, or ; in any of the array
values the serialization gets corrupted."
If I was working on a site that hadn't yet been put live, a prevention method would have been to base64_encode my serialized data before it was stored in the database like so:
base64_encode( serialize( $my_data ) );
And then:
unserialize( base64_decode( $encoded_serialized_string ) );
when retrieving the data.
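A minimal round-trip sketch of that prevention method, using a made-up $my_data array (the values are only illustrative). Because the stored value is plain base64, later quote or slash handling has nothing to corrupt:

```php
<?php
// Hypothetical data containing the characters that caused trouble.
$my_data = array(
    'name'    => "Chloe O'Gorman",
    'present' => 'Something "quoted"; with: punctuation',
);

// Store this value in the database instead of the raw serialized string.
$encoded_serialized_string = base64_encode(serialize($my_data));

// Reverse both steps when reading the value back.
$restored = unserialize(base64_decode($encoded_serialized_string));

var_dump($restored === $my_data); // bool(true)
```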
However, as I cannot change what has already been stored in the database, this very helpful post (the original is no longer available) provides a solution that works around the problem:
$fixed_serialized_data = preg_replace_callback ( '!s:(\d+):"(.*?)";!', function($match) {
return ($match[1] == strlen($match[2])) ? $match[0] : 's:' . strlen($match[2]) . ':"' . $match[2] . '";';
}, $my_data );
$result = unserialize( $fixed_serialized_data );
From what I see, you have a valid serialized string nested inside of a valid serialized string -- meaning serialize() was called twice in the formation of your posted string.
See how you have s:151: followed by:
"a:1:{i:0;a:4:{s:4:"name";s:15:"Chloe O'Gorman";s:6:"gender";s:6:"female";s:3:"age";s:3:"3_6";s:7:"present";s:34:"Something from Frozen or a jigsaw ";}}";
⮤ that is a valid single string that contains pre-serialized data.
After you unserialize THAT, you get:
a:1:{i:0;a:4:{s:4:"name";s:15:"Chloe O'Gorman";s:6:"gender";s:6:"female";s:3:"age";s:3:"3_6";s:7:"present";s:34:"Something from Frozen or a jigsaw ";}}
// ^^--^^^^^^^^^^^^^^-- uh oh, that string value has 14 bytes/characters not 15
It looks like somewhere in the string processing, an escaping slash was removed, and that corrupted the string.
There is nothing foul about single quotes in serialized data.
You can choose to either:
execute an escaping call to blindly apply slashes to ALL single quotes in your string (which may cause breakages elsewhere), assuming you WANT to escape the single quotes for your project's subsequent processes, or
execute my following snippet, which will not escape the single quotes but rather adjust the byte/character count to form a valid serialized string.
Code:
$corrupted_byte_counts = <<<STRING
s:151:"a:1:{i:0;a:4:{s:4:"name";s:15:"Chloe O'Gorman";s:6:"gender";s:6:"female";s:3:"age";s:3:"3_6";s:7:"present";s:34:"Something from Frozen or a jigsaw ";}}";
STRING;
$repaired = preg_replace_callback(
'/s:\d+:"(.*?)";/s',
function ($m) {
return 's:' . strlen($m[1]) . ":\"{$m[1]}\";";
},
unserialize($corrupted_byte_counts) // first unserialize string before repairing
);
echo "corrupted serialized array:\n$corrupted_byte_counts";
echo "\n---\n";
echo "repaired serialized array:\n$repaired";
echo "\n---\n";
print_r(unserialize($repaired)); // unserialize repaired string
echo "\n---\n";
echo serialize($repaired);
Output:
corrupted serialized array:
s:151:"a:1:{i:0;a:4:{s:4:"name";s:15:"Chloe O'Gorman";s:6:"gender";s:6:"female";s:3:"age";s:3:"3_6";s:7:"present";s:34:"Something from Frozen or a jigsaw ";}}";
---
repaired serialized array:
a:1:{i:0;a:4:{s:4:"name";s:14:"Chloe O'Gorman";s:6:"gender";s:6:"female";s:3:"age";s:3:"3_6";s:7:"present";s:34:"Something from Frozen or a jigsaw ";}}
---
Array
(
[0] => Array
(
[name] => Chloe O'Gorman
[gender] => female
[age] => 3_6
[present] => Something from Frozen or a jigsaw
)
)
---
s:151:"a:1:{i:0;a:4:{s:4:"name";s:14:"Chloe O'Gorman";s:6:"gender";s:6:"female";s:3:"age";s:3:"3_6";s:7:"present";s:34:"Something from Frozen or a jigsaw ";}}";
*keep in mind, if you want to return your data to its original Matryoshka-serialized form, you will need to call serialize() again on $repaired.
**if you have substrings that contain "; in them, you might try this extended version of my snippet.
There's nothing wrong with your serialized text as posted. The quotes inside do NOT need to be escaped, because PHP uses the type length indicators to figure out where things start/stop. e.g.
php > $foo = "This string contains a \" quote and a ' quote";
php > $bar = serialize($foo);
php > $baz = unserialize($bar);
php > echo "$foo\n$bar\n$baz\n";
This string contains a " quote and a ' quote
s:44:"This string contains a " quote and a ' quote";
This string contains a " quote and a ' quote
Note the lack of ANY kind of escaping in the serialized string - the quotes inside the string are there as-is, no quoting, no escaping, no encoding.
As posted, your serialized data deserializes into a plain string without issue.
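A small sketch of that point, using a shortened value rather than the full string from the question: the apostrophe is harmless, and only a length prefix that disagrees with the actual byte count makes unserialize() fail.

```php
<?php
// Declared length is 15, but the value is only 14 bytes long: parsing fails.
$corrupted = 's:15:"Chloe O\'Gorman";';
var_dump(@unserialize($corrupted)); // bool(false)

// With the matching length, the apostrophe is read as ordinary data.
$valid = 's:14:"Chloe O\'Gorman";';
var_dump(unserialize($valid)); // string(14) "Chloe O'Gorman"
```

The @ operator is only there to silence the notice that the failing call would otherwise emit.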
You can also pass the serialized string in a PHP nowdoc, so that its quotes need no escaping:
unserialize(<<<'DDDD'
[SERIALIZE_STR]
DDDD
);

Parse JavaScript from remote server using curl

I need to grab a json-string from this page: https://retracted.com
If you view the source, the JSON string starts after var mycarousel_itemList =. I need to parse this string as a correct JSON array in my PHP script.
How can this be done?
EDIT: I've managed to pull this off using explode, but the method is ugly as heck. Is there no built-in function to translate this JSON string into an array?
To clarify: I want the string I grab (which is correct JSON) to be converted into a PHP array.
The JSON in the script block is invalid and needs to be massaged a bit before it can be used in PHP's native json_decode function. Assuming you have already extracted the JSON string from the markup (make sure you exclude the semicolon at the end):
$json = <<< JSON
[ { address: 'Arnegårdsveien 32', … } ]
JSON;
var_dump(
json_decode(
str_replace(
array(
'address:',
'thumb:',
'description:',
'price:',
'id:',
'size:',
'url:',
'\''
),
array(
'"address":',
'"thumb":',
'"description":',
'"price":',
'"id":',
'"size":',
'"url":',
'"'
),
$json
)
,
true
)
);
This will then give an array of arrays of the JSON data.
In other words, the properties have to be double quoted and the values need to be in double quotes as well. If you want an array of stdClass objects instead for the "{}" parts, remove the true.
You can do this either with str_replace as shown above or with a regular expression:
preg_match('
(.+var mycarousel_itemList = ([\[].+);.+function?)smU',
file_get_contents('http://bolig…'),
$match
);
$json = preg_replace(
array('( ([a-z]+)\:)sm', '((\'))'),
array('"$1":', '"'),
$match[1]
);
var_dump(json_decode($json, true));
The above code will fetch the URL, extract the JSON, fix it, and convert it to PHP.
Once you have your JSON data, you can use json_decode (PHP >= 5.2) to convert it into a PHP object or array.
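As a rough illustration of the object-versus-array choice, here is a sketch with a hypothetical, already valid JSON string (the id and url values are invented for the example):

```php
<?php
// Hypothetical, already valid JSON; the real data would come from the page.
$json = '[{"id": 1, "url": "http://example.com/item"}]';

// Second argument true: objects become associative arrays.
$asArrays = json_decode($json, true);
echo $asArrays[0]['url'], "\n";

// Without it: objects become stdClass instances.
$asObjects = json_decode($json);
echo $asObjects[0]->id, "\n";

// json_decode() returns NULL on invalid input, so it is worth checking.
var_dump(json_decode("[{id: 1}]")); // NULL, because the key is unquoted
```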

PHP \uXXXX encoded string convert to utf-8

I have strings like this:
\u041d\u0418\u041a\u041e\u041b\u0410\u0415\u0412
How can I convert this to utf-8 encoding?
And what is the encoding of given string?
Thank you for participating!
The simple approach would be to wrap your string in double quotes and let json_decode convert the \uXXXX escapes. (These happen to be JavaScript string syntax.)
$str = json_decode("\"$str\"");
These seem to be Russian letters: НИКОЛАЕВ. (It's already UTF-8 when json_decode returns it.)
To parse that string in PHP you can use json_decode because JSON supports that unicode literal format.
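Applied to the string from the question, a minimal sketch could look like this. It assumes the input contains no unescaped double quotes or stray backslashes, since those would make the wrapped value invalid JSON:

```php
<?php
// Single quotes keep the backslashes literal, as they would be in received data.
$str = '\u041d\u0418\u041a\u041e\u041b\u0410\u0415\u0412';

// Wrap the text in double quotes so it becomes a JSON string literal.
$decoded = json_decode('"' . $str . '"');

if ($decoded === null) {
    echo "Input is not a valid JSON string body\n";
} else {
    echo $decoded, "\n"; // НИКОЛАЕВ
}
```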
To preface, you generally should not be encountering \uXXXX unicode escape sequences outside of JSON documents, in which case you should be decoding those documents using json_decode() rather than trying to cherry-pick strings out of the middle by hand.
If you want to generate JSON documents without unicode escape sequences, then you should use the JSON_UNESCAPED_UNICODE flag in json_encode(). However, the escapes are default as they are most likely to be safely transmitted through various intermediate systems. I would strongly recommend leaving escapes enabled unless you have a solid reason not to.
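For comparison, a short sketch of both encodings of the same value. JSON_UNESCAPED_UNICODE requires PHP 5.4 or later, and the source file is assumed to be saved as UTF-8:

```php
<?php
$name = 'НИКОЛАЕВ';

// Default: non-ASCII characters are written as \uXXXX escapes.
echo json_encode($name), "\n";

// With the flag: the UTF-8 characters are emitted as-is.
echo json_encode($name, JSON_UNESCAPED_UNICODE), "\n";
```

Both outputs decode back to the same string, so the choice only affects how the document looks in transit.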
Lastly, if you're just looking for something to make unicode text "safe" in some fashion, please instead read over the following SO masterpost: UTF-8 all the way through
If, after three paragraphs of "don't do this", you still want to do this, then here are a couple functions for applying/removing \uXXXX escapes in arbitrary text:
<?php
function utf8_escape($input) {
$output = '';
for( $i=0,$l=mb_strlen($input); $i<$l; ++$i ) {
$cur = mb_substr($input, $i, 1);
if( strlen($cur) === 1 ) {
$output .= $cur;
} else {
$output .= sprintf('\\u%04x', mb_ord($cur));
}
}
return $output;
}
function utf8_unescape($input) {
return preg_replace_callback(
'/\\\\u([0-9a-fA-F]{4})/',
function($a) {
return mb_chr(hexdec($a[1]));
},
$input
);
}
$u_input = 'hello world, 私のホバークラフトはうなぎで満たされています';
$e_input = 'hello world, \u79c1\u306e\u30db\u30d0\u30fc\u30af\u30e9\u30d5\u30c8\u306f\u3046\u306a\u304e\u3067\u6e80\u305f\u3055\u308c\u3066\u3044\u307e\u3059';
var_dump(
utf8_escape($u_input),
utf8_unescape($e_input)
);
Output:
string(145) "hello world, \u79c1\u306e\u30db\u30d0\u30fc\u30af\u30e9\u30d5\u30c8\u306f\u3046\u306a\u304e\u3067\u6e80\u305f\u3055\u308c\u3066\u3044\u307e\u3059"
string(79) "hello world, 私のホバークラフトはうなぎで満たされています"

What kind of notation is this?

I've a string which looks like this:
[{
text: "key 1",
value: "value 1"
}, {
text: "key 2",
value: "value 2"
}, {
text: "key 3",
value: "value 3"
}]
I'm not sure what kind of notation this is. AFAIK it is generated by an ASP.NET backend. It looks a lot like JSON, but calling json_decode() on it fails.
Can someone shed some light on this kind of notation and provide an efficient way to parse it into a key/value array with PHP?
Any way you can change the output? Quoting the key names seems to allow it to parse normally:
$test = '[{"text":"key 1","value":"value 1"},{"text":"key 2","value":"value 2"},{"text":"key 3","value":"value 3"}]';
var_dump(json_decode($test));
It is JSON-like, but apparently not exactly to the spec. The PHP json_decode function only likes double quoted key names:
// the following strings are valid JavaScript but not valid JSON
// the name and value must be enclosed in double quotes
// single quotes are not valid
$bad_json = "{ 'bar': 'baz' }";
json_decode($bad_json); // null
// the name must be enclosed in double quotes
$bad_json = '{ bar: "baz" }';
json_decode($bad_json); // null
// trailing commas are not allowed
$bad_json = '{ bar: "baz", }';
json_decode($bad_json); // null
That sample is valid YAML, which is a superset of JSON. There seem to be at least 3 PHP libraries for YAML.
If it is in fact YAML, you're better off using a real YAML library, than running it through a regex and throwing it at your JSON library. YAML has support for other features (besides unquoted strings) which, if your ASP.NET backend uses, aren't going to survive the trip.
It looks like JavaScript syntax (similar to JSON). Regular expressions are the way to go for parsing it. Strip the '[' and ']', then split on the ','. Then parse each object individually.
It looks like a custom format. Replace the [{ and }] delimiters at the beginning and the end. Then explode on "},{" and you get this:
text:"key 1",value:"value 1"
text:"key 2",value:"value 2"
text:"key 3",value:"value 3"
At that point you can iterate over each element in the array and use preg_match to extract your values.
It looks almost like a sort of array-style data container - text being the index and value being the value.
$string = ....;
$endArray = array();
// Capture the contents of each {...} container; empty containers are skipped
preg_match_all('/\{([^{}]+)\}/', $string, $containers);
foreach ($containers[1] as $arrayItem) {
// $arrayItem looks like: text: "key 1", value: "value 1"
if (preg_match('/text:\s*"([^"]*)"\s*,\s*value:\s*"([^"]*)"/', $arrayItem, $m)) {
$endArray[$m[1]] = $m[2]; // e.g. "key 1" => "value 1"
}
}
To get your JSON data accepted by json_decode(), you could use the following regular expression:
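function json_replacer($match) {
$token = $match[0]; // the callback receives an array; index 0 is the full match
if ($token[0] == '"' || $token[0] == "'") {
return $token; // already a quoted string, leave it untouched
}
else {
return '"'.$token.'"'; // bare key, wrap it in double quotes
}
}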
$json_re = <<<'END'
/ " (?: \\. | [^\\"] )* " # double-quoted string, with escapes
| ' (?: \\. | [^\\'] )* ' # single-quoted string, with escapes
| \b [A-Za-z_] \w* (?=\s*:) # A single word followed by a colon
/x
END;
$json = preg_replace_callback($json_re, 'json_replacer', $json);
Because the matches will never overlap, a word followed by colon inside a string will never match.
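A self-contained sketch of the same idea, run against input shaped like the question's. The helper name quote_bare_keys and the sample values are assumptions made for illustration; note the second item, where a colon inside a string is left alone:

```php
<?php
// Same three alternatives: skip quoted strings, match bare keys.
$bareKeyPattern = <<<'END'
/ " (?: \\. | [^\\"] )* "      # double-quoted string, left untouched
| ' (?: \\. | [^\\'] )* '      # single-quoted string, left untouched
| \b [A-Za-z_] \w* (?=\s*:)    # bare word followed by a colon
/x
END;

// Hypothetical helper: wraps unquoted object keys in double quotes.
function quote_bare_keys($json, $pattern) {
    return preg_replace_callback($pattern, function ($m) {
        $token = $m[0];
        $isString = ($token[0] === '"' || $token[0] === "'");
        return $isString ? $token : '"' . $token . '"';
    }, $json);
}

$input = '[{ text: "key 1", value: "value 1" }, { text: "key: 2", value: "value 2" }]';
$data  = json_decode(quote_bare_keys($input, $bareKeyPattern), true);

echo $data[1]['text'], "\n"; // key: 2
```

Single-quoted strings are skipped rather than converted here, so input that uses them would still need one of the quote-fixing approaches.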
I also found a comparison between different JSON implementations for PHP:
http://gggeek.altervista.org/sw/article_20061113.html
I have never used it, but maybe take a look at json_decode.
