I'm writing data and string parsers for a JSON variant. My understanding is that standard JSON is incompatible with PHP, since PHP uses the same datatype (Array) for JSON arrays and JSON objects. This problem is severe with empty JSON arrays and objects, since once converted to a PHP value, they are indistinguishable. So if you start with JSON strings "[]" and "{}", converting them to values and then back to JSON makes them look the same.
My basic idea is to require PHP arrays intended to be represented as JSON objects, especially empty arrays that should be encoded or stringified to "{}", to be represented differently in PHP programs, so proper error checking and JSON conversion can be done. Of course, for nonempty PHP arrays, it is possible to determine programmatically whether they have indices that are successive integers starting at 0 or not. But such a check fails to be useful if empty JSON arrays and objects are also to be properly distinguished.
The actual distinction discussed here is important to choose well, as programmers will be required to generate and type-check PHP Arrays differently depending on whether they are intended to correspond to "[]" or "{}" syntax in JSON.
So, my question is: what distinctive representation is best? The main candidates are:
for PHP associative arrays representing JSON objects:
$JSONobject=(object){};
$JSONobject=new stdClass();
$JSONobject=json_decode("{}");
I list these separately as I am not sure if they all have the same internal representation.
For PHP sequential list arrays representing JSON arrays, the Array datatype would be used unchanged, so an empty JSON array would be generated by $JSONarray=[]; or $JSONarray=new Array();
I have an answer to my own question, as a result of experimentation: I've found that json_decode("[]") results in a zero-length array, and json_decode("{}") results in a standard class object with no properties, no inheritance, and no methods. Furthermore, json_encode(json_decode ("[]") results in "[]", which is correct, and json_encode(json_decode ("{}") results in "{}", which is also correct.
So PHP itself chose my proposed answer to disambiguate the PHP Array type into two separate subtypes (Array and Class Object).
I think this answers my question in the affirmative. Furthermore, the three ways I listed above for representing JSON "{}" are indeed equivalent, as I guessed.
I hope this question and answer help those who may be puzzled by this issue in the future. The ambiguity of using Array to represent both lists and associative maps is permitted (and probably recommended) to be resolved in just the way I described.
Related
Does the string
["first","second","third"]
always preserve array order and result in the PHP array
array('first','second','third');
when using json_decode()? I realize the answer is NO for objects, but I am asking about a string representing an array as input.
Yes. Arrays are ordered by definition, and JSON preserves this.
The JSON specification says:
An array structure is a pair of square bracket tokens surrounding zero or more values. The values are
separated by commas. The order of the values is significant.
The last sentence implies that a JSON encoder or decoder that changes the order is not in conformance with the specification. I can't find anything in the PHP documentation that explicitly says that it observes this requirement, but I think it can be assumed since it claims to be implementing JSON.
Yes order will be kept.
Alternatively you can use cast array to object, as order of object variables are not modified json_encode((object)$arr).
I am using json_encode to transform my php multidimensional array to output json. Normally, this function would convert all values to strings. To make sure that integers values are send to javascript as integer values, I am using the numeric check:
$json = json_encode($data, JSON_NUMERIC_CHECK);
This works fine in all but one case for my app. In the php array (which is extracted from the database), there is one field that contains very large integers. I save it to database as a VARCHAR, but unfortunately this is converted to an integer when encoding to json. The problem is that since this is a very large integer, it gets rounded and therefore does not represent the true value. How could I tackle this problem?
Do you want the large number to be transformed to an integer? Your question leads me to believe you don't. If that's the case, remove the JSON_NUMERIC_CHECK option from the call and it shouldn't change the encoding of the field.
Documentation about this (and other) constants is here.
Maybe is to late but i did hit the same problem, and stuck on PHP 5.3 on the server because of legacy code that must be run with that version. The solution that i used is dumb, but did work for me: simple add an space character at the end of the long integer that is varchar readed from the db, and before sending it to json encode with JSON_NUMERIC_CHECK.
First of all, I couldn't get clear definition of it from WikiPedia or even from serialize function in the PHP manual. I need to know some cases where we need the term serialization and how things are going without it? In other words, Where you need serialization and without it your code will be missing some important feature.
What is serialization?
Serialization encodes objects into another format.
For example you have an array in PHP like this:
$array = array("a" => 1, "b" => 2, "c" => array("a" => 1, "b" => 2));
And then you want to store it in file or send to other application.
There are several format choices, but the idea is the same:
The array has to be encoded (or you could say "translated"), into text or bytes, that can be written to a file or sent via the network.
For example, in PHP, if you:
$data = serialize($array);
you will get this:
a:3:{s:1:"a";i:1;s:1:"b";i:2;s:1:"c";a:2:{s:1:"a";i:1;s:1:"b";i:2;}}
This is PHP's particular serializing format that PHP understands, and it works vice versa, so you are able to use it to deserialize objects.
For example, you stored a serialized array in a file, and you want it back in your code as an array:
$array = unserialize($data);
But you could choose a different serialization format, for example, JSON:
$json = json_encode($array);
will give you this:
{"a":1,"b":2,"c":{"a":1,"b":2}}
The result is not only easily saved, read by human eye, or sent via network, but is also understandable by almost every other language (JavaScript, Java, C#, C++, ...)
Conclusion
Serialization translate objects to another format, in case you want to store or share data.
Are there any situations, where you cannot do anything, but serialize it?
No. But serialization usually makes things easier.
Are JSON and PHP format the only possible formats?
No, no, no and one more time no. There are plenty of formats.
XML (e.g. using a schema like WSDL or XHTML)
Bytes, Protobuf, etc.
Yaml
...
...
Your own formats (you can create your own format for serialization and use it, but that is a big thing to do and is not worth it, most of the time)
Serialization is the process of converting some in-memory object to another format that could be used to either store in a file or sent over the network. Deserialization is the inverse process meaning the actual object instance is restored from the given serialized representation of the object. This is very useful when communicating between various systems.
The serialization format could be either interoperable or non-interoperable. Interoperable formats (such as JSON, XML, ...) allow for serializing some object using a given platform and deserializing it using a different platform. For example with JSON you could use javascript to serialize the object and send it over the network to a PHP script that will deserialize the object and use it.
The serialize() PHP function uses an non-interoperable format. This means that only PHP could be used to both serialize and deserialize the object back.
You could use the json_encode and json_decode() functions in order to serialize/deserialize PHP objects using the JSON interoperable format.
Serialization is the process of turning data (e.g. variables) into a representation such as a string, that can easily be written and read back from for example a file or the database.
Use cases? There are many, but generally it revolves around the idea of taking a complex, nested array or object and turning it into a simple string that can be saved and read later to retrieve the same structure. For example, provided you have in php:
$blub = array();
$blub['a'] = 1;
$blub['a']['b'] = 4;
$blub['b'] = 27;
$blub['b']['b'] = 46;
Instead of going through every array member individually and writing it one could just:
$dataString = serialize($blub);
And the serialized array is ready to be written anywhere as a simple string, in such a way that retrieving this string again and doing unserialize() over it gets you the exact same array structure you had before. Yes, it's really that simple.
I need to know some cases we need the term serialization and how things are going without it?
Serialization can become handy if you need to store complete structures (like an invoice with all associated data like customer address, sender address, product positions, tax caclulcations etc) that are only valid at a certain point in time.
All these data will change in the future, new tax regulations might come, the address of a customer changes, products go out of life. But still the invoice needs to be valid and stored.
This is possible with serialization. Like a snapshot. The object in memory are serialized into a (often like in PHP) binary form that can be just stored. It can be brought back to live later on (and in a different context). Like with this invoice example: In ten years, the data can still be read and the invoice object is the same as it was ten years earlier.
In other word, Where you must need serialization and without it your code will be missing some important feature.
That was one example. It's not that you always needs that, but if things become more complex, serialization can be helpful.
Since you've tagged it with javascript, one kind of serialization could be form serialization.
Here are the references for the jQuery and prototype.JS equivalents.
What they basically do is serialize form input values into comma-separated name-value pairs.
So considering an actual usage..
$.ajax({
url : 'insert.php?a=10,b=15' //values serialized via .serialize()
type: 'GET'
});
And you would probably do $GET["a"] to retrieve those values, I'm not familiar with PHP though.
Is there a recommended way of sending an object using json between a server and a client ?
Should I use only lower case for properties, should I use an object or an array ?
Update: this question came to me because by default php is encoding an associative array as an object ( after you decode it)
You should make an array, then use PHP's json_encode method. It doesn't matter if the values are uppercase or lower case.
$a = array(
'Test' => 42,
'example' => 'Testing'
);
echo json_encode($a); // {"Test":42,"example":"Testing"}
When decoding in PHP, pass true as the 2nd parameter to json_decode to convert objects to arrays.
$data = json_decode($json, true);
Both these things are entirely up to you.
The casing of your property names is a coding style matter. It really doesn't matter as long as you are consistent -- your project should have fixed standards on this type of thing. If you haven't picked your standards yet, my advice is to go for readability, which usually means lower case or camel-case. I'd avoid upper case, and I'd also avoid using hyphens or underscores, but it is entirely up to you.
As for the choice between objects or arrays, that comes down to what is best suited to the data in question. If it needs named keys, then use an JSON object (ie with curly braces and key:value pairs {'key':'value','key2':'value2'}); if it doesn't, then use an JSON array (ie with square brackets and just values ['value1','value2']). The choice is entirely down to how the data needs to be structured: both are perfectly valid in JSON and neither is better than the other; just used for different purposes.
(PHP, of course, doesn't differentiate -- both keyed and indexed data are stored in PHP arrays, so from the PHP side it makes absolutely no difference).
Naming conventions aren't standardized, you can use whatever you like. It is a good idea to use names that are also valid javascript identifiers and won't clash with javascript keywords. Object vs. array is not a matter of convention, but rather one of meaning. A JSON object is a key-value collection, while an array is a flat list. Those are different things, and even though the syntax for both is somewhat interchangeable in javascript, and PHP can implement both using the same data type, you should make a clear distinction in your design. If it's a flat list, use []. If it's a key-value thing, use {}. On the PHP side, simply use arrays for both: numerically-indexed arrays for [], associative arrays for {}.
a:3:{i:0;i:4;i:1;i:3;i:2;i:2;}
Am I right to say that this is an array of size 3 where the key value pairs are 0->4, 1->3, and 2->2?
If so, I find this representation awfully confusing. At first, I thought it was a listing of values (or the array contained {0, 4, 1, 3, 2, 2}), but I figured that the a:3: was the size of the array. And if 3 was the size, then both the keys and values appeared in the brackets with no way of clearly identifying a key/value pair without counting off.
To clarify where I'm coming from:
Why did the PHP developers choose to serialize in this manner? What advantage does this have over, let's say the way var_dump and/or var_export displays its data?
Yes that's array(4, 3, 2)
a for array, i for integer as key then value. You would have to count to get to a specific one, but PHP always deserialises the whole lot, so it has a count anyway.
Edit: It's not too confusing when you get used to it, but it can be somewhat long-winded compared to, e.g. JSON
Note: var_export() does not handle
circular references as it would be
close to impossible to generate
parsable PHP code for that. If you
want to do something with the full
representation of an array or object,
use serialize().
$string="a:3:{i:0;i:4;i:1;i:3;i:2;i:2;}";
$array=unserialize($string);
print_r($array);
outpts:
Array
(
[0] => 4
[1] => 3
[2] => 2
)
If think the point is that PHP does not differentiate between integer indexed arrays and string indexed hashtables. The serialization format can be used for hashtables exactly the same way: a:<<size>>:{<<keytype>>:<<key>>;<<valuetype>>:<<value>>;...}
As the format is not intended to be human readable but rather to provide a common format to represent all PHP variable types (with exception of resources), I think it's more simple to use the given format because the underlying variable can be reconstructed by reading the string character by character.
Serialized PHP data is not really intended to be human readable - that is not a goal of the format as far as I know.
I think the biggest reason the format looks the way it does is for brevity, and its form may also have underpinnings tied to the speed at which it can be processed.
Why don't you use the unserialize() function to restore the data to how it was before?