php/as3 regex to split multiple json in one - php

For example I have this string of 2 json objects:
{"glossary": {"title": "example glossary"}, "aaa": "1212"}{"adada": "faka"}
I want to split it in the array for PHP and Actionscript 3
Array (
[0] => '{"glossary": {"title": "example glossary"}, "aaa": "1212"}',
[1] => '{"adada": "faka"}'
)
What is the best method to do it?
Edit:
I don't need answer how to parse json. To simplify I need to split
{...{..}....}{....}{........}
into
{...{..}....}
{....}
{........}

either you modify a JSON parser to do it, as Amargosh suggested, or you can make a simple algorithm to do it for you, that skips through strings and comments and keeps track of open braces. when no open braces are left, then you have a complete value. repeat it until you're out of input.
however my suggestion is to try to solve the problem by talking to whoever is responsible for that output and convince him to generate valid JSON, which is [{"glossary": {"title": "example glossary"}, "aaa": "1212"},{"adada": "faka"}]
edit: to separate the JSON objects, you need to prefix every JSON object with a length (4 bytes should suffice). then you read off the socket until you have the right number of bytes. the next bytes will again be the length of the next object. it is imperative that you do this, because TCP is a byte stream that does not preserve message boundaries. Not only can you have multiple JSON objects in one single packet, but you can have one JSON object split between two packets.
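The length-prefix idea can be sketched roughly like this. The helper names frameMessage and extractMessages are made up for illustration, and the 4-byte big-endian prefix is just one reasonable choice, not something the protocol in the question prescribes. The loop at the end only simulates data arriving in arbitrary chunks, the way it would from a socket.

```php
<?php
// Hypothetical helper: prefix a JSON payload with its byte length,
// encoded as a 4-byte big-endian unsigned integer.
function frameMessage(string $json): string {
    return pack('N', strlen($json)) . $json;
}

// Hypothetical helper: pull every complete message out of $buffer.
// Incomplete trailing data stays in $buffer until more bytes arrive.
function extractMessages(string &$buffer): array {
    $messages = array();
    while (strlen($buffer) >= 4) {
        $length = unpack('N', substr($buffer, 0, 4))[1];
        if (strlen($buffer) < 4 + $length) {
            break; // the rest of this message has not arrived yet
        }
        $messages[] = substr($buffer, 4, $length);
        $buffer = (string) substr($buffer, 4 + $length);
    }
    return $messages;
}

// Simulate two objects arriving in 7-byte chunks.
$stream = frameMessage('{"glossary": {"title": "example glossary"}, "aaa": "1212"}')
        . frameMessage('{"adada": "faka"}');
$buffer = '';
$received = array();
foreach (str_split($stream, 7) as $chunk) {
    $buffer .= $chunk;
    $received = array_merge($received, extractMessages($buffer));
}
```

Because the receiver only hands a message to the JSON parser once all of its bytes are in the buffer, it does not matter how the data was cut up in transit.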
Also, a couple of pieces of advice:
do not use PHP for socket servers. it's not made for that. have a look at Haxe, specifically the neko backend.
do not write this kind of stuff on your own unless you really need to. it's boring and dumb work. there are virtually millions of solutions for socket servers. you should also have a look at Haxe remoting that allows transparent communication between e.g. a flash client and a neko socket server. also, please have a look at smartfox and red5.
edit2: you're underestimating the problem and your approach isn't good. you should build a robust solution, so the day you want to send arrays over the wire you won't have a total breakdown, because your "splitter" breaks, or your JSON parser is fed incomplete objects, because only half the object is read. what you want to do can be easily done: split the input using "}{", append "}" to any element but the last and prepend "{" to any element but the first. Nonetheless I heavily suggest you refrain from such an approach, because you will regret it at some point. if you really think you should do things like these on your own, then try to at least do them right.

Sorry for posting on an old question but I just ran into this question when googling for a similar problem and I think I have a pretty easy little solution for the specific problem that was defined in the question.
// Convert original data to valid JSON array
$json = '['.str_replace('}{', '},{', $originalData).']';
$array = json_decode($json, true);
$arrayOfJsonStrings = array();
foreach ($array as $data) {
$arrayOfJsonStrings[] = json_encode($data);
}
var_dump($arrayOfJsonStrings);
I think this should work as long as your JSON data doesn't contain strings that might include '}{'.

Regex can't handle it. If you don't actually need a JSON parser, writing a simple parsing function should do it.
Something like this would do:
function splitJSONString($json) {
$objects = array();
$depth = 0;
$current = '';
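for ($i = 0, $length = strlen($json); $i < $length; $i++) {
// loop on the length: testing the character itself would stop early at a "0"
$char = $json[$i];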
$current .= $char;
if ($char == '{') {
$depth += 1;
} else if ($char == '}' && $depth > 0) {
$depth -= 1;
if ($depth == 0) {
array_push($objects, $current);
$current = '';
}
}
}
return $objects;
}
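A plain depth counter is thrown off by braces inside string values, for example {"a": "}{"}. Here is a sketch of a variant that also tracks whether it is inside a string literal and skips escaped quotes. The function name splitJsonObjects is my own, and it assumes the input is a run of well-formed JSON objects.

```php
<?php
// Sketch: split concatenated JSON objects, ignoring braces inside string literals.
function splitJsonObjects(string $input): array {
    $objects = array();
    $depth = 0;
    $inString = false;
    $escaped = false;
    $start = 0;
    $length = strlen($input);
    for ($i = 0; $i < $length; $i++) {
        $char = $input[$i];
        if ($inString) {
            if ($escaped) {
                $escaped = false;      // this character was escaped, ignore it
            } elseif ($char === '\\') {
                $escaped = true;       // next character is escaped
            } elseif ($char === '"') {
                $inString = false;     // closing quote
            }
            continue;
        }
        if ($char === '"') {
            $inString = true;
        } elseif ($char === '{') {
            if ($depth === 0) {
                $start = $i;           // a new top-level object begins here
            }
            $depth++;
        } elseif ($char === '}' && $depth > 0) {
            $depth--;
            if ($depth === 0) {
                $objects[] = substr($input, $start, $i - $start + 1);
            }
        }
    }
    return $objects;
}

$parts = splitJsonObjects('{"glossary": {"title": "a}{b"}, "aaa": "1212"}{"adada": "faka"}');
```

It works byte by byte, which is safe for UTF-8 because the quote, backslash and brace characters never occur inside a multi-byte sequence.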

Regex is not the right tool for this - use a JSON Parser

Related

PHP Huffman Decode Algorithm

I applied for a job recently and got sent a HackerRank exam with a couple of questions. One of them was a Huffman decoding algorithm. There is a similar problem available here which explains the formatting a lot better than I can.
The actual task was to take two arguments and return the decoded string.
The first argument is the codes, which is a string array like:
[
"a 00",
"b 101",
"c 0111",
"[newline] 1001"
]
Which is like: single character, two tabs, huffman code.
The newline was specified as being in this format due to the way that hacker rank is set up.
The second argument is a string to decode using the codes. For example:
101000111 = bac
This is my solution:
function decode($codes, $encoded) {
$returnString = '';
$codeArray = array();
foreach($codes as $code) {
sscanf($code, "%s\t\t%s", $letter, $code);
if ($letter == "[newline]")
$letter = "\n";
$codeArray[$code] = $letter;
}
print_r($codeArray);
$numbers = str_split($encoded);
$searchCode = '';
foreach ($numbers as $number) {
$searchCode .= $number;
if (isset($codeArray[$searchCode])) {
$returnString .= $codeArray[$searchCode];
$searchCode = '';
}
}
return $returnString;
}
It passed the two initial tests but there were another five hidden tests which it did not pass and gave no feedback on.
I realize that this solution would not pass if the character was white space, so I tried a less optimal solution that used substr to get the first character and regex matching to get the number, but this still passed the first two and failed the hidden five. I tried the function in the HackerRank platform with white space as input and the sandboxed environment could not handle it anyway, so I reverted to the above solution as it was more elegant.
I tried the code with special characters, characters from other languages, codes of various sizes and it always returned the desired solution.
I am just frustrated that I could not find the cases that caused this to fail as I found this to be an elegant solution. I would love some feedback both on why this could fail given that there is no white-space and also any feedback on performance increases.
Your basic approach is sound. Since a Huffman code is a prefix code, i.e. no code is a prefix of another, then if your search finds a match, then that must be the code. The second half of your code would work with any proper Huffman code and any message encoded using it.
Some comments. First, the example you provide is not a Huffman code, since the prefixes 010, 0110, 1000, and 11 are not present. Huffman codes are complete, whereas this prefix code is not.
This brings up a second issue, which is that you do not detect this error. You should be checking to see if $searchCode is empty after the end of your loop. If it is not, then the code was not complete, or a code ended in the middle. Either way, the message is corrupt with respect to the provided prefix code. Did the question specify what to do with errors?
The only real issue I would expect with this code is that you did not decode the code description generally enough. Did the question say there were always two tabs, or did you conclude that? Perhaps it was just any amount of spaces and tabs. Were there other character encodings you needed to convert, like [newline]? I presume you in fact did need to convert them, if one of the examples that worked contained one. Did it? Otherwise, maybe you weren't supposed to convert.
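To make the two suggestions concrete, here is a sketch that accepts any run of spaces or tabs between the symbol and its code, and that reports leftover bits instead of silently dropping them. The function name decodeHuffman and the choice to throw exceptions are my assumptions; the original task may have wanted errors handled differently.

```php
<?php
// Sketch: decode with a prefix code, tolerating any whitespace run between
// symbol and code, and rejecting input that ends in the middle of a code.
function decodeHuffman(array $codes, string $encoded): string {
    $table = array();
    foreach ($codes as $line) {
        // symbol = at least one character (may itself be a space);
        // code = the trailing run of 0/1 digits
        if (!preg_match('/^(.+?)\s+([01]+)\s*$/s', $line, $m)) {
            throw new InvalidArgumentException("Bad code line: $line");
        }
        $table[$m[2]] = ($m[1] === '[newline]') ? "\n" : $m[1];
    }

    $result = '';
    $pending = '';
    $length = strlen($encoded);
    for ($i = 0; $i < $length; $i++) {
        $pending .= $encoded[$i];
        if (isset($table[$pending])) {
            $result .= $table[$pending];
            $pending = '';
        }
    }
    if ($pending !== '') {
        // bits left over: the message is corrupt with respect to this code
        throw new UnexpectedValueException("Undecodable trailing bits: $pending");
    }
    return $result;
}

$codes = array("a\t\t00", "b\t\t101", "c\t\t0111", "[newline]\t\t1001");
$decoded = decodeHuffman($codes, '101000111'); // "bac", as in the question
```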
I had the same question in a coding challenge, with one modification: the input was a List such as (a 111101, b 110010, [newline] 111111 ....)
I took a different approach using a HashMap, but I too had only the 2 sample test cases pass.
Below is my code:
public static String decode(List<String> codes, String encoded) {
// Write your code here
String result = "";
String buildvalue ="";
HashMap <String,String> codeMap= new HashMap<String,String>();
for(int i=0;i<codes.size();i++){
String S= codes.get(i);
String[] splitedData = S.split("\\s+");
String value=splitedData[0];
String key=(splitedData[1].trim());
codeMap.put(key, value);
}
for(int j=0;j<encoded.length();j++){
buildvalue+=Character.toString(encoded.charAt(j));
if(codeMap.containsKey(buildvalue)){
if(codeMap.get(buildvalue).contains("[newline]")){
result+="\n";
buildvalue="";
}
else{
result+=codeMap.get(buildvalue);
buildvalue="";
}
}
}
return result.toString();
}
}

Parsing ridiculous json string in php

I am querying a RESTful web service (ClickBank). I am able to send and receive my requests but (embarrassingly enough) I am having trouble parsing the results to make them readable. When I select XML as the return format, what I get back is a 97,000 character long string (yay var_dump).... with no spaces in it... and no delimiter (such as ',' or '/', or even '<', or '>') separating the values. So, I have selected JSON as the return format. I have been able to decode this herculean string using json_decode and I have deciphered that what I am getting back is an array of 100 objects; each object has 30 vars (à la get_object_vars), but some of these vars are themselves arrays of objects. Forgive my ignorance, but any ideas on how I can parse this so that it's in the realm of readable? By the way, I am currently trying my hand at PHP (as that is what we use at the shop where I work). I am not very proficient with Eclipse PDT, so any suggestions would be welcome....
P.S. I have the following function that has been helpful in determining what the hay is going on but it still doesn't separate things out like I want
function getInfo($datum) {
switch (gettype($datum)) {
case "object":
$GLOBALS['counts']++;
//var_dump($datum);
echo "<hr>";
//$members = get_object_vars($datum);
//getInfo($members);
break;
case "array":
foreach($datum as $v) {
getInfo($v);
}
//var_dump($datum);
break;
default:
echo "<div>$datum</div>";
}
}
This is a bit of a side question: at the bottom of my snippet (see the default case of the switch statement), $datum WAS an element of an array (before it was passed back into the function at the final depth of recursion). Is there a way to extend that echo statement so that it also shows whatever its key was? To put it another way: if that value (probably a string) is the value at some index of an associative array, can I find its key without having to name any of the possible arrays? (Think in_array, but without the $haystack argument.)
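One way to keep track of the key, sketched under the assumption that it is acceptable to pass it down as an extra argument during the recursion: build up a path of keys and attach it to each scalar value. The function name describeValues and the sample JSON are invented for illustration and are not ClickBank's actual response format.

```php
<?php
// Sketch: walk decoded JSON and report each scalar together with the
// chain of keys that leads to it.
function describeValues($datum, string $path = ''): array {
    $lines = array();
    if (is_object($datum)) {
        $datum = get_object_vars($datum); // treat objects like associative arrays
    }
    if (is_array($datum)) {
        foreach ($datum as $key => $value) {
            $childPath = ($path === '') ? (string) $key : $path . '.' . $key;
            $lines = array_merge($lines, describeValues($value, $childPath));
        }
    } else {
        $lines[] = $path . ' = ' . var_export($datum, true);
    }
    return $lines;
}

// Made-up sample data, only to show the shape of the output.
$decoded = json_decode('{"order": {"id": 7, "items": [{"sku": "A1"}, {"sku": "B2"}]}}');
$report = describeValues($decoded);
// e.g. echo '<div>' . htmlspecialchars(implode("\n", $report)) . '</div>';
```

The value itself carries no memory of the array it came from, so the key has to be handed along by the caller like this.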

PHP retrieve json values without loop through the whole json object

I'm a newbie in the PHP area, so please bear with my question.
Basically I have/will have a pretty big json file, and I need to query the file to get a single entry based on the key provided. An example would be as follow:
{
"key1" : {value1},
"key2" : {value2},
...,
"keyn" : {valuen}
}
I will need to retrieve only one value at any one request, and hope to get a better performance.
From my search, the basic way to handle this in PHP is to use json_decode() and then foreach.
However, this approach seems to need to iterate through the whole file, depending on the order of the keys and which key I am looking for. So if I am looking for keyn, then essentially I have to read the large file from top to bottom. (Yes, I could use some sort algorithm to get a better result.)
But from my understanding, JSON is basically another form of HashMap, and since a HashMap lookup is easy and fast, is there a similar way in PHP to get the best performance out of it?
Well, given the structure you provided you definitely don't need to loop through the entire object.
If you're looking for keyn, you would just do:
$obj = json_decode($input);
echo $obj->keyn;
Maybe I'm missing something obvious. If you want to prevent having to json_decode the entire object, your question makes a bit more sense though... but that's not what you're asking.
From JSON.org
JSON is built on two structures:
A collection of name/value pairs. In various languages, this is realized as an object, record, struct, dictionary, hash table, keyed list, or associative array.
An ordered list of values. In most languages, this is realized as an array, vector, list, or sequence.
You can't just interact with json without first using json_decode() to turn it into a usable object. But if you know the keys, after running json_decode() you can interact with it (because it's now an object). for example:
<?php
$string = '{"foo": "bar", "cool": "attr"}';
$result = json_decode($string);
// Result: object(stdClass)#1 (2) { ["foo"]=> string(3) "bar" ["cool"]=> string(4) "attr" }
var_dump($result);
// Prints "bar"
echo $result->foo;
// Prints "attr"
echo $result->cool;
?>
In situations like this, var_dump() and print_r() are your friends.
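The same lookup also works with associative arrays, which is handy when the key is only known at runtime. This is a small sketch with made-up sample data; PHP arrays are hash tables, so the lookup itself needs no loop, although json_decode() still has to parse the whole document first.

```php
<?php
// Made-up sample input in the shape described in the question.
$input = '{"key1": {"v": 1}, "key2": {"v": 2}, "keyn": {"v": 99}}';

// Passing true as the second argument returns associative arrays instead of objects.
$data = json_decode($input, true);

// Direct hash lookup by key; no foreach needed.
$wanted = 'keyn';
$value = array_key_exists($wanted, $data) ? $data[$wanted] : null;
```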
There really isn't any magical way to find the value without using any kind of loop
I haven't benchmarked this:
This is how I would approach the problem, stopping the iteration as soon as a match is found:
$key = 'keyn';
$obj = json_decode(file_get_contents('path/to/your/file'), true);
$match = false;
// walk the key/value pairs and stop at the first matching key
foreach ($obj as $currKey => $curr) {
if ((string) $currKey === $key) {
$match = $curr;
break;
}
}
In PHP you can use file_get_contents to read the JSON file and json_decode to parse it. You have to go through each key-value pair.

XML or JSON response for Flex applications with PHP backend?

I'm developing a floorplanner Flex mini application. I was just wondering whether JSON or XML would be a better choice in terms of performance when generating responses from PHP. I'm currently leaning for JSON since the responses could also be reused for Javascript. I've read elsewhere that JSON takes longer to parse than XML, is that true? What about flexibility for handling data with XML vs JSON in Flex?
I'd go with JSON. We've added native JSON support to Flash Player, so it will be as fast on the parsing side as XML and it's much less verbose/smaller.
=Ryan ryan#adobe.com
JSON is not a native structure to Flex (strange, huh? You'd think that the {} objects could be easily serialized, but not really), XML is. This means that XML is done behind the scenes by the virtual machine while the JSON Strings are parsed and turned into objects through String manipulation (even if you're using AS3CoreLib)... gross... Personally, I've also seen inconsistencies in JSONEncoder (at one point Arrays were just numerically indexed objects).
Once the data has been translated into an AS3 object, it is still faster to search and parse data in XML than it is with Objects. XPath expressions make data traversal a pleasure (almost easy enough to make you smile compared to other things out there).
On the other hand JS is much better at parsing JSON. MUCH, MUCH BETTER. But, since the move to JavaScript is a "maybe... someday..." then you may want to consider, "will future use of JSON be worth the performance hit right now?"
But here is a question: why not simply have two outputs? Since both JS and AS can provide you POSTs with a virtually arbitrary number of variables, you really only need to concern yourself with how the server sends the data, not how it receives it. Here's a potential way to handle this:
// as you are about to output:
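$name = ( isset( $_GET[ 'type' ] ) && is_string( $_GET[ 'type' ] ) ) ? $_GET[ 'type' ] : '';
$type = './outputs/' . $name . '.php';
// only plain alphanumeric names are allowed, so the include cannot leave ./outputs/
if( ctype_alnum( $name ) && file_exists( $type ) )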
{
include( $type );
echo output_data( $data );
}
else
{
// add a 404 if you like
die();
}
Then, when getting a $_GET['type'] == 'js', js.php would be:
function output_data( $data ){ return json_encode( $data ); }
When getting $_GET['type'] == 'xml', xml.php would hold something which had output_data return a string which represented XML (plenty of examples here)
Of course, if you're using a framework, then you could just do something like this with a view instead (my suggestion boils down to "you should have two different views and use MVC").
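For completeness, a hypothetical xml.php could look something like the sketch below. It uses XMLWriter so that text values are escaped for you. The root element name response, the item wrapper for list entries and the sample floorplan data are all my own choices, and the sketch assumes the string keys in $data are valid XML element names.

```php
<?php
// Hypothetical xml.php: same output_data() name as js.php, but an XML body.
function write_xml(XMLWriter $writer, array $data): void {
    foreach ($data as $key => $value) {
        // numeric keys are not valid element names, so list entries become <item>
        $writer->startElement(is_int($key) ? 'item' : $key);
        if (is_array($value)) {
            write_xml($writer, $value);
        } else {
            $writer->text((string) $value); // text() escapes <, > and &
        }
        $writer->endElement();
    }
}

function output_data(array $data): string {
    $writer = new XMLWriter();
    $writer->openMemory();
    $writer->startDocument('1.0', 'UTF-8');
    $writer->startElement('response');
    write_xml($writer, $data);
    $writer->endElement();
    $writer->endDocument();
    return $writer->outputMemory();
}

// Made-up sample data for a floorplan response.
$xml = output_data(array('room' => array('name' => 'Kitchen & Bar', 'width' => 4)));
```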
No, JSON is ALWAYS smaller than XML when their structures are exactly the same. And the cost of parsing text depends almost entirely on the size of the text.
So, JSON is faster than XML and if you have a plan to reuse them on javascript side, choose JSON.
Benchmark JSON vs XML:
http://www.navioo.com/ajax/ajax_json_xml_Benchmarking.php
If you're ever going to use Javascript with it, definitely go with JSON. Both have a very nice structure.
It depends on how well Flex can parse JSON though, so I would look into that. How much data are you going to be passing back? Error/Success messages? User Profiles? What kind of data is this going to contain?
Is it going to need attributes on the tags? Or just a "structure". If it needs attributes and things like that, and you don't want to go too deep into an "array like" structure, go with XML.
If you're just going to have key => value, even multi-dimensional... go with JSON.
All depends on what kind of data you're going to be passing back and forth. That will make your decision for you :)
Download time:
JSON is faster.
Javascript Parse
JSON is faster
Actionscript Parse
XML is faster.
Advanced use within Actionscript
XML is better with all the E4X functionality.
JSON is limited with no knowledge of Vectors, meaning you are limited to Arrays or will need to override the JSON encoder in as3corelib with something such as
else if ( value is Vector.<*> ) {
// converts the vector to an array and
return arrayToString( vectorToArray( value ) );
} else if ( value is Object && value != null ) {
private function vectorToArray(__vector:Object):Array {
var __return : Array;
var __vList : Vector.<*>;
__return = new Array;
if ( !__vector || !(__vector is Vector.<*>) )
{
return __return;
}
for each ( var __obj:* in (__vector as Vector.<*>) )
{
__return.push(__obj);
}
return __return;
}
But I am afraid getting those values back into Vectors is not as nice. I had to make a whole utility class devoted to it.
So which one to use depends on how advanced the objects you are going to be moving are. For more advanced objects, go with XML to make things easier on the ActionScript side.
For simple stuff, go with JSON.

Native PHP function that will grant me access to the string portions directly without having to create a temporary array?

I asked this question before here; however, I think I presented the problem poorly and got quite a few replies that may have been useful to someone but did not address the actual question, so I pose the question again.
Is there a single-line native method in php that would allow me to do the following.
Please, please, I understand there are other ways to do this simple thing, but the question I present is does something exist natively in PHP that will grant me access to the array values directly without having to create a temporary array.
$rand_place = explode(",",loadFile("csvOf20000places.txt")){rand(0,1000)};
This is a syntax error, however ideally it would be great if this worked!
Currently, it seems unavoidable that one must create a temporary array, ie
The following is what I want to avoid:
$temporary_places_array = explode(",",loadFile("csvOf20000places.txt"));
$rand_place = $temporary_places_array[rand(0,1000)];
Also, i must note that my actual intentions are not to parse strings, or pull randomly from an array. I simply want access into the string without a temporary variable. This is just an example which i hope is easy to understand. There are many times service calls or things you do not have control over returns an array (such as the explode() function) and you just want access into it without having to create a temporary variable.
NATIVELY NATIVELY NATIVELY, i know i can create a function that does it.
No, there is no way to do that natively.
You can, however:
1.- Store the unavoidable array instead of the string. Given PHP's limitation this is what makes most sense in my opinion.
Also, don't forget you can unset()
2.- Use strpos() and friends to parse the string into what you need, as shown in other answers I won't paste here.
3.- Create a function.
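For readers on newer versions: PHP 5.4 added function array dereferencing, so the one-line form from the question does exist there, with square brackets instead of braces. A short sketch with a made-up list of places (an array is still created internally, there is just no named temporary variable):

```php
<?php
$places = "alabama,alaska,arizona";

// Function array dereferencing (PHP 5.4+): index the returned array directly.
$second = explode(",", $places)[1];

// Member access on instantiation is also allowed from PHP 5.4 on.
$third = (new ArrayObject(explode(",", $places)))->offsetGet(2);
```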
There is no native PHP method of doing this.
You could make a function like this:
function array_value($array, $key) {
return $array[$key];
}
And then use it like this:
$places = "alabama,alaska,arizona .... zimbabway";
$random_place = array_value(explode(",", $places), rand(0, 1000));
I know it's poor form to answer a question with a question, but why are you concerned about this? Is it a nitpick, or an optimization question? The Zend engine will optimize this away.
However, I'd point out you don't have to create a temporary variable necessarily:
$rand_place = explode(",",loadFile("csvOf20000places.txt"));
$rand_place = $rand_place[rand(0,1000)];
Because of type mutability, you could reuse the variable. Of course, you're still not skipping a step.
list($rand_place) = array_slice(explode(',', loadFile("csvOf20000places.txt")), array_rand(explode(',', loadFile("csvOf20000places.txt"))), 1);
EDIT: Ok, you're right. The question is very hard to understand but, I think this is it. The code above will pull random items from the csv file but, to just pull whatever you want out, use this:
list($rand_place) = array_slice(explode(',', loadFile("csvOf20000places.txt")), {YOUR_NUMBER}, 1);
Replace the holder with the numeric key of the value you want to pull out.
If it's memory concerns, there are other ways of going about this that don't split out into an array. But no, there is nothing builtin to handle this sort of situation.
As an alternative, you might try:
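$num = 0;
$pos = -1;
// count the commas
while(($pos = strpos($places, ',', $pos+1)) !== false) {$num++;}
$which = rand(0, $num);
// walk to the start of the chosen field
$start = 0;
for($i = 0; $i < $which; $i++) {$start = strpos($places, ',', $start) + 1;}
// the field runs up to the next comma, or to the end of the string
$end = strpos($places, ',', $start);
$random_place = ($end === false) ? substr($places, $start) : substr($places, $start, $end - $start);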
I haven't tested this, so there may be a few off-by-one issues in it, but you get the idea. This could be made shorter (and quicker) by caching the positions that you work out in the first loop, but this brings you back to the memory issues.
You can do this:
<?php
function foo() {
return new ArrayObject(explode(",", "zero,one,two"));
}
echo foo()->offsetGet(1); // "one"
?>
Sadly you can't do this before PHP 5.4:
echo (new ArrayObject(explode(",", "zero,one,two")))->offsetGet(2);
I spent a great deal of last year researching advanced CSV parsing for PHP, and I admit it would be nice to randomly seek at will on a file. One of my semi-dead-ends was to scan through a known file and make an index of the position of all known \n's that were not at the beginning of the line.
//Grab and load our index
$index = unserialize(file_get_contents('somefile.ext.ind'));
//What it looks like
$index = array( 0 => 83, 1 => 162, 2 => 178, ....);
$fHandle = fopen("somefile.ext",'rb');
$randPos = rand(0, count($index) - 1);
fseek($fHandle, $index[$randPos]);
$line = explode(",", fgets($fHandle));
That's the only way I could see it being done anywhere close to what you need. As for creating the index, that's rudimentary stuff.
$fHandle = fopen('somefile.ext','rb');
$index = array();
for($i = 0; false !== ($char = fgetc($fHandle)); $i++){
if($char === "\n") $index[] = $i + 1; // offset just past the newline, i.e. where the next line starts
}
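Putting the two halves together, here is a self-contained sketch of the same idea that records the offset at which each line starts (including the first one) by using ftell() and fgets() rather than reading byte by byte. The helper names buildLineIndex and readLineAt are invented, and the temporary file with three lines only stands in for a real CSV file.

```php
<?php
// Sketch: record the byte offset at which each line starts.
function buildLineIndex(string $path): array {
    $handle = fopen($path, 'rb');
    $index = array();
    while (true) {
        $offset = ftell($handle);
        if (fgets($handle) === false) {
            break; // end of file
        }
        $index[] = $offset;
    }
    fclose($handle);
    return $index;
}

// Sketch: jump straight to one line and split it into fields.
function readLineAt(string $path, int $offset): array {
    $handle = fopen($path, 'rb');
    fseek($handle, $offset);
    $line = rtrim(fgets($handle), "\r\n");
    fclose($handle);
    return explode(',', $line);
}

// Stand-in data file, created only for this demonstration.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "alabama,AL\nalaska,AK\narizona,AZ\n");

$index = buildLineIndex($path);
$fields = readLineAt($path, $index[mt_rand(0, count($index) - 1)]);
unlink($path);
```

The index costs one integer per line, which is usually far smaller than holding every field of the file in an array.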
