Parsing (non-standard) json to array/object - php

I have a string like this:
['key1':'value1', 2:'value2', 3:$var, 4:'with\' quotes', 5:'with, comma']
And I want to convert it to an array like this:
$parsed = [
'key1' => 'value1',
2 => 'value2',
3 => '$var',
4 => 'with\' quotes',
5 => 'with, comma',
];
How can I parse that?
Any tips or code would be appreciated.
What can't be used:
Standard JSON parsers
eval()
explode() by , and explode() by :

As you cannot use any pre-built function such as json_decode, you'll have to identify the most likely quoting scenarios and replace them with known placeholder substrings.
The code below assumes that the values and/or keys in the input are wrapped in single quotes.
Please note: this code is untested.
<?php
$input = "[ 'key1':'value1', 2:'value2', 3:\$var, 4:'with\' quotes', 5: '\$var', 'another_key': 'something not usual, like \'this\'' ]"; // \$ keeps PHP from interpolating $var
function extractKeysAndValuesFromNonStandardKeyValueString ( $string ) {
$input = str_replace ( Array ( "\\\'", "\'" ), Array ( "[DOUBLE_QUOTE]", "[QUOTE]" ), $string );
$input_clone = $input;
$return_array = Array ();
if ( preg_match_all ( '/\'?([^\':]+)\'?\s*\:\s*\'([^\']+)\'\s*,?\s*/', $input, $matches ) ) {
foreach ( $matches[0] as $i => $full_match ) {
$key = $matches[1][$i];
$value = $matches[2][$i];
if ( isset ( ${$value} ) ) $value = ${$value};
else $value = str_replace ( Array ( "[DOUBLE_QUOTE]", "[QUOTE]" ), Array ( "\\\'", "\'" ), $value );
$return_array[$key] = $value;
$input_clone = str_replace ( $full_match, '', $input_clone );
}
// process the rest of the string, if anything important is left inside of it
if ( preg_match_all ( '/\'?([^\':]+)\'?\s*\:\s*([^,]+)\s*,?\s*/', $input_clone, $matches ) ) {
foreach ( $matches[0] as $i => $full_match ) {
$key = $matches[1][$i];
$value = $matches[2][$i];
if ( isset ( ${$value} ) ) $value = ${$value};
$return_array[$key] = $value;
}
}
}
return $return_array;
}
The idea behind this function is to first replace all the possible combinations of escaped quotes in the non-standard string with placeholders that are easy to restore, then run a standard regexp against the input, and finally rebuild everything, making sure you restore the previously replaced substrings.
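As an alternative to the placeholder-and-regex idea above, the string can be read one character at a time while tracking whether the current position is inside single quotes. The following is a minimal sketch, not the answer's method: parseLooseArray() is a hypothetical helper name, and it assumes that a backslash inside quotes escapes the next character and that whitespace outside quotes can be ignored.

```php
<?php
// Hypothetical helper, not part of the original answer.
function parseLooseArray(string $input): array
{
    $input = trim($input);
    // Drop the surrounding [ ] if present.
    if (strlen($input) >= 2 && $input[0] === '[' && substr($input, -1) === ']') {
        $input = substr($input, 1, -1);
    }

    $result  = [];
    $key     = null;   // null while the key of the current pair is still being read
    $buffer  = '';     // characters collected for the current token
    $inQuote = false;  // true while inside '...'
    $length  = strlen($input);

    for ($i = 0; $i < $length; $i++) {
        $char = $input[$i];

        if ($inQuote) {
            if ($char === '\\' && $i + 1 < $length) {
                $buffer .= $input[++$i];   // assumption: a backslash escapes the next character
            } elseif ($char === "'") {
                $inQuote = false;
            } else {
                $buffer .= $char;          // commas and colons inside quotes are plain text
            }
        } elseif ($char === "'") {
            $inQuote = true;
        } elseif ($char === ':' && $key === null) {
            $key = $buffer;                // first unquoted colon ends the key
            $buffer = '';
        } elseif ($char === ',') {
            if ($key !== null) {
                $result[$key] = $buffer;   // unquoted comma ends the pair
            }
            $key = null;
            $buffer = '';
        } elseif (!ctype_space($char)) {
            $buffer .= $char;              // unquoted token such as 2 or $var
        }
    }

    if ($key !== null) {
        $result[$key] = $buffer;           // flush the last pair
    }

    return $result;
}

$parsed = parseLooseArray("['key1':'value1', 2:'value2', 3:\$var, 4:'with\\' quotes', 5:'with, comma']");
print_r($parsed);
```

Because it never splits on , or : blindly, this stays within the restrictions listed in the question. Unquoted values such as $var are kept as literal strings rather than resolved to variables.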


PHP parse_str() function allowing passing shorthand array

We are accepting strings from templates to a markup engine, which allows for configuration to be passed in a "simple" form.
The engine parses the strings via PHP, using an adapted version of the parse_str() function - so we can parse any combination of the strings below:
config=posts_per_page:"5",default:"No questions yet -- once created they will appear here."&markup->template="{{ questions }}"
gives:
Array(
[config] => Array
(
[posts_per_page] => 5
[default] => No questions yet -- once created they will appear here.
)
[markup] => Array
(
[template] => {{ questions }}
)
)
OR:
config->default=all:"<p class='ml-3'>No members here yet...</p>"
To Get:
Array (
[config] => Array
(
[default] => Array
(
[all] => <p class='ml-3'>No members here yet...</p>
)
)
)
Another:
config=>handle:"medium"
Returns:
Array (
[config] => Array
(
[>handle] => medium
)
)
Strings can be passed with spaces ( and multi-line spaces ) and string parameters should be passed between "double quotes" to preserve natural spacing - we run the following preg_replace on the string before it is passed to the parse_str method:
// strip white spaces from data that is not passed inside double quotes ( "data" ) ##
$string = preg_replace( '~"[^"]*"(*SKIP)(*F)|\s+~', "", $string );
So far, so good - until we try to pass a "delimiter" inside a string value, then it is treated literally - for example the following string returns a corrupt array:
config=posts_per_page:"5",default:"No questions yet -- once created, they will appear here."&markup->template="{{ questions }}"
Returns the following array:
Array (
[config] => Array
(
[posts_per_page] => 5
[default] => No questions yet -- once created
[ they will appear here."] =>
)
[markup] => Array
(
[template] => {{ questions }}
)
)
The "," was treated literally and the string was broken into an extra array part.
One simple solution is to use delimiters and operators that are less likely to conflict with string values, for example changing "," to "###". But an important property of the markup is that it is easy to write and read: its intended use case is for front-end developers to pass simple arguments to the template parser. This is one reason we have tried to avoid JSON, which is a good fit for passing data but is hard to read and write - of course, that statement is subjective and open to opinion :)
Here is the parse_str method:
public static function parse_str( $string = null ) {
// h::log($string);
// delimiters ##
$operator_assign = '=';
$operator_array = '->';
$delimiter_key = ':';
$delimiter_and_property = ',';
$delimiter_and_key = '&';
// check for "=" delimiter ##
if( false === strpos( $string, $operator_assign ) ){
h::log( 'e:>Passed string format does not include assignment operator "'.$operator_assign.'" -- '.$string );
return false;
}
# result array
$array = [];
# split on outer delimiter
$pairs = explode( $delimiter_and_key, $string );
# loop through each pair
foreach ( $pairs as $i ) {
# split into name and value
list( $key, $value ) = explode( $operator_assign, $i, 2 );
// what about array values ##
// example -- sm:medium, lg:large
if( false !== strpos( $value, $delimiter_key ) ){
// temp array ##
$value_array = [];
// split value into an array at "," ##
$value_pairs = explode( $delimiter_and_property, $value );
// h::log( $value_pairs );
# loop through each pair
foreach ( $value_pairs as $v_pair ) {
// h::log( $v_pair ); // 'sm:medium'
# split into name and value
list( $value_key, $value_value ) = explode( $delimiter_key, $v_pair, 2 );
$value_array[ $value_key ] = $value_value;
}
// check if we have an array ##
if ( is_array( $value_array ) ){
$value = $value_array;
}
}
// $key might be in part->part format, so check ##
if( false !== strpos( $key, $operator_array ) ){
// explode, max 2 parts ##
$md_key = explode( $operator_array, $key, 2 );
# if name already exists
if( isset( $array[ $md_key[0] ][ $md_key[1] ] ) ) {
# stick multiple values into an array
if( is_array( $array[ $md_key[0] ][ $md_key[1] ] ) ) {
$array[ $md_key[0] ][ $md_key[1] ][] = $value;
} else {
$array[ $md_key[0] ][ $md_key[1] ] = array( $array[ $md_key[0] ][ $md_key[1] ], $value );
}
# otherwise, simply stick it in a scalar
} else {
$array[ $md_key[0] ][ $md_key[1] ] = $value;
}
} else {
# if name already exists
if( isset($array[$key]) ) {
# stick multiple values into an array
if( is_array($array[$key]) ) {
$array[$key][] = $value;
} else {
$array[$key] = array($array[$key], $value);
}
# otherwise, simply stick it in a scalar
} else {
$array[$key] = $value;
}
}
}
// h::log( $array );
# return result array
return $array;
}
I will try to skip splitting strings between "double quotes", probably via another regex, but perhaps there are other pitfalls waiting that might make this approach unviable long-term - any help gladly accepted!
One solution is to change the following:
from:
$value_pairs = explode( $delimiter_and_property, $value );
to:
$value_pairs = self::quoted_explode( $value, $delimiter_and_property, '"' );
which calls a new method found on another SO answer ( linked in comment block ):
/**
* Regex Escape values
*/
public static function regex_escape( $subject ) {
return str_replace( array( '\\', '^', '-', ']' ), array( '\\\\', '\\^', '\\-', '\\]' ), $subject );
}
/**
* Explode string, while respecting delimiters
*
* #link https://stackoverflow.com/questions/3264775/an-explode-function-that-ignores-characters-inside-quotes/13755505#13755505
*/
public static function quoted_explode( $subject, $delimiter = ',', $quotes = '\"' )
{
$clauses[] = '[^'.self::regex_escape( $delimiter.$quotes ).']';
foreach( str_split( $quotes) as $quote ) {
$quote = self::regex_escape( $quote );
$clauses[] = "[$quote][^$quote]*[$quote]";
}
$regex = '(?:'.implode('|', $clauses).')+';
preg_match_all( '/'.str_replace('/', '\\/', $regex).'/', $subject, $matches );
return $matches[0];
}
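The same quote-aware split can also be sketched with the (*SKIP)(*F) pattern that the question already uses for stripping whitespace. This is only an illustration under assumptions: splitOutsideQuotes() is a hypothetical helper name, values are assumed to be wrapped in plain double quotes, and escaped quotes inside a value are not handled.

```php
<?php
// Hypothetical helper; the pattern mirrors the (*SKIP)(*F) idea used earlier in the question.
function splitOutsideQuotes(string $value, string $delimiter = ','): array
{
    // First alternative: consume a whole "..." run, then force a failure and skip past it.
    // Second alternative: the delimiter itself, which can now only match outside quotes.
    $pattern = '~"[^"]*"(*SKIP)(*F)|' . preg_quote($delimiter, '~') . '~';

    return preg_split($pattern, $value);
}

$value = 'posts_per_page:"5",default:"No questions yet -- once created, they will appear here."';
$parts = splitOutsideQuotes($value);
print_r($parts);
```

With this input the comma inside the quoted default text no longer produces an extra array part, so the result has two elements instead of three.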

How to get an associative array from a string?

This is the initial string:-
NAME=Marco\nLOCATION=localhost\nSECRET=fjsdgfsjfdskffuv=\n
This is my solution, although the "=" at the end of the string does not appear in the array:
$env = file_get_contents(base_path() . '/.env');
// Split string on every " " and write into array
$env = preg_split('/\s+/', $env);
//create new array to push data in the foreach
$newArray = array();
foreach($env as $val){
// Split string on every "=" and write into array
$result = preg_split ('/=/', $val);
if($result[0] && $result[1])
{
$newArray[$result[0]] = $result[1];
}
}
print_r($newArray);
This is the result I get:
Array ( [NAME] => Marco [LOCATION] => localhost [SECRET] => fjsdgfsjfdskffuv )
But I need:
Array ( [NAME] => Marco [LOCATION] => localhost [SECRET] => fjsdgfsjfdskffuv= )
You can use the limit parameter of preg_split() so that each line is split only once, at the first "=":
http://php.net/manual/en/function.preg-split.php
You should change
$result = preg_split ('/=/', $val);
to
$result = preg_split ('/=/', $val, 2);
Hope this helps
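The limit idea works the same way with explode(), which avoids the regex entirely. A minimal sketch, assuming the content contains real newlines (as it would after file_get_contents()); the inline $env string stands in for the file and is an assumption for illustration.

```php
<?php
// Assumption: real newlines, as read from a .env file.
$env = "NAME=Marco\nLOCATION=localhost\nSECRET=fjsdgfsjfdskffuv=\n";

$newArray = [];
// \R matches any newline sequence; empty lines are dropped.
foreach (preg_split('/\R/', $env, -1, PREG_SPLIT_NO_EMPTY) as $line) {
    // Limit 2: split at the first "=" only, so any later "=" stays in the value.
    $parts = explode('=', $line, 2);
    if (count($parts) === 2 && $parts[0] !== '') {
        $newArray[$parts[0]] = $parts[1];
    }
}
print_r($newArray);
```

Splitting on line breaks rather than on all whitespace also keeps values that contain spaces in one piece.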
$string = 'NAME=Marco\nLOCATION=localhost\nSECRET=fjsdgfsjfdskffuv=\n';
$strXlate = [ 'NAME=' => '"NAME":"' ,
'LOCATION=' => '","LOCATION":"',
'SECRET=' => '","SECRET":"' ,
'\n' => '' ];
$jsonified = '{'.strtr($string, $strXlate).'"}';
$array = json_decode($jsonified, true);
This approach 1) uses strtr() to translate the string into JSON format and 2) uses json_decode() to turn that JSON into an array.
Same result, different approach.
You can also use parse_str to parse URL syntax-like strings to name-value pairs.
Based on your example:
$newArray = [];
$str = file_get_contents(base_path() . '/.env');
$env = explode("\n", $str);
array_walk(
$env,
function ($i) use (&$newArray) {
if (!$i) { return; }
$tmp = [];
parse_str($i, $tmp);
$newArray += $tmp; // merge the parsed pair into one flat associative array
}
);
var_dump($newArray);
Of course, you need to put some sanity check in the function since it can insert some strange stuff in the array like values with empty string keys, and whatnot.

preg_split how to transform the delimiter as index using php

I have strings like this in PHP:
*nJohn*sSmith*fGeorge#*nHenry*sFord
and wish to create an array with
[name],[surname],[fathers] as indexes so it will produce
name_array[1] = (
[name] => 'John',
[surname] => 'Smith',
[fathers] => 'George'
)
name_array[2]=(
[name] => 'Henry',
[surname] => 'Ford'
)
and so on.
How can I do this using preg_split in PHP?
Thanks!
I'd use preg_match_all to get the names. If your string is consistent I think you could do:
$string = '*nJohn*sSmith*fGeorge#*nHenry*sFord';
preg_match_all('/\*n(?<givenname>.*?)\*s(?<surname>.*?)(?:\*f(?<middlename>.*?))?(?:#|$)/', $string, $matches);
print_r($matches);
Regex demo: https://regex101.com/r/1hKzvM/1/
PHP demo: https://eval.in/784879
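To get from the raw matches to one array per person, the PREG_SET_ORDER flag can be used so that each match arrives as its own row. A rough sketch: the capture groups are renamed here to name, surname and fathers to match the keys in the question (an assumption, the pattern above uses different group names), and the result is 0-indexed rather than starting at 1.

```php
<?php
$string  = '*nJohn*sSmith*fGeorge#*nHenry*sFord';
// Same structure as the pattern above, with group names matching the wanted keys.
$pattern = '/\*n(?<name>.*?)\*s(?<surname>.*?)(?:\*f(?<fathers>.*?))?(?:#|$)/';

$name_array = [];
// PREG_SET_ORDER groups the captures per person instead of per capture group.
preg_match_all($pattern, $string, $matches, PREG_SET_ORDER);

foreach ($matches as $match) {
    $person = [];
    foreach (['name', 'surname', 'fathers'] as $field) {
        // An optional group that did not take part in the match is missing or empty.
        if (isset($match[$field]) && $match[$field] !== '') {
            $person[$field] = $match[$field];
        }
    }
    $name_array[] = $person;
}
print_r($name_array);
```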
Solution without using regex:
$string = '*nJohn*sSmith*fGeorge#*nHenry*sFord';
$result = array();
$persons = explode('#', $string);
foreach ($persons as $person) {
$identials = explode('*', $person);
unset($r);
foreach ($identials as $idential) {
if(!$idential){
continue; //empty string
}
switch ($idential[0]) { //first character
case 'n':
$key = 'name';
break;
case 's':
$key = 'surname';
break;
case 'f':
$key = 'fathers';
break;
}
$r[$key] = substr($idential, 1);
}
$result[] = $r;
}
This function produces the result you want, but it is not the only way and not fully general (it hard-codes exactly two people). I used preg_split as you asked.
function splitMyString($str){
$array_names = [];
$mainString = explode('#', $str);
$arr1 = preg_split("/\*[a-z]/", $mainString[0]);
unset($arr1[0]);
$arr1_values = array_values($arr1);
$arr1_keys = ['name','surname','fathers'];
$result1 = array_combine($arr1_keys, $arr1_values);
// second part of string
$arr2 = preg_split("/\*[a-z]/", $mainString[1]);
unset($arr2[0]);
$arr2_values = array_values($arr2);
$arr2_keys = ['name','surname'];
$arr2 = array_combine($arr2_keys, $arr2_values);
$array_names[] = $result1; // the keyed array built above, not the raw split
$array_names[] = $arr2;
return $array_names;
}
// test result !
print_r(splitMyString("*nJohn*sSmith*fGeorge#*nHenry*sFord"));
thanks to all!
For some reason the site blocks my voting for some 'reputation' reason which I find not-fully democracy compliant! On the other hand who cares about democracy these days!
Nevertheless I am using solution #2, without indicating that solution 1 or 3 are not great!
Regards.
However inspired by your answers I came up with mine also, here it is!
$string = '*nJohn*sSmith*fGeorge#*nHenry*sFord';
split_to_key ( $string, array('n'=>'Name','s'=>'Surname','f'=>'Middle'));
function split_to_key ( $string,$ind=array() )
{
$far=null;
$i=0;
$fbig=preg_split('/#/',$string,-1,PREG_SPLIT_NO_EMPTY);
foreach ( $fbig as $fsmall ) {
$f=preg_split('/\*/u',$fsmall,-1,PREG_SPLIT_NO_EMPTY);
foreach ( $f as $fs ) {
foreach( array_keys($ind) as $key ) {
if( preg_match ('/^'.$key.'/u',$fs ) ) {
$fs=preg_replace('/^'.$key.'/u','',$fs);
$far[$i][$ind[$key]]=$fs;
}
}
}
$i++;
}
print_r($far);
}
Like Chris, I wouldn't use preg_split(). My method uses just one preg_ function and one loop to prepare the filtered output in your desired format (note that my output is 0-indexed, though).
Input (I extended your input sample for testing):
$string='*nJohn*sSmith*fGeorge#*nHenry*sFord#*nJames*sWashington#*nMary*sMiller*fRichard';
Method:
if(preg_match_all('/\*n([^*]*)\*s([^*]*)(?:\*f([^#]*))?(?=#|$)/', $string, $out)){
$out=array_slice($out,1); // drop the full-match element to prepare for array_column()
foreach($out[0] as $i=>$v){
$name_array[$i]=array_combine(['name','surname','father'],array_column($out,$i));
if($name_array[$i]['father']==''){unset($name_array[$i]['father']);}
}
}
var_export($name_array);
Output:
array (
0 =>
array (
'name' => 'John',
'surname' => 'Smith',
'father' => 'George',
),
1 =>
array (
'name' => 'Henry',
'surname' => 'Ford',
),
2 =>
array (
'name' => 'James',
'surname' => 'Washington',
),
3 =>
array (
'name' => 'Mary',
'surname' => 'Miller',
'father' => 'Richard',
),
)
My regex pattern is optimized for speed by using negated character classes. I chose not to use named capture groups because they nearly double the size of the output array from preg_match_all(), and that array requires further preparation anyway.

Convert an associative array to a simple string of its values in php

I would like to convert the array:
Array
(
[0] => Array
(
[send_to] => 9891616884
)
[1] => Array
(
[send_to] => 9891616884
)
)
to
$value = 9891616884, 9891616884
Try this:
//example array
$array = array(
array('send_to'=>3243423434),
array('send_to'=>11111111)
);
$value = implode(', ',array_column($array, 'send_to'));
echo $value; //prints "3243423434, 11111111"
You can use array_map:
$input = array(
array(
'send_to' => '9891616884'
),
array(
'send_to' => '9891616884'
)
);
echo implode(', ', array_map(function ($entry) {
return $entry['send_to'];
}, $input));
Quite simple, try this:
// initialize an empty string
$str = '';
// Loop through each array index
foreach ($array as $arr) {
$str .= $arr["send_to"] . ", ";
}
//removes the final comma and whitespace
$str = trim($str, ", ");

Query string like parameters regex

From a text like:
category=[123,456,789], subcategories, id=579, not_in_category=[111,333]
I need a regex to get something like:
$params[category][0] = 123;
$params[category][1] = 456;
$params[category][2] = 789;
$params[subcategories] = ; // I just need to know that this exists
$params[id] = 579;
$params[not_in_category][0] = 111;
$params[not_in_category][1] = 333;
Thanks everyone for the help.
PS
As you suggested, I clarify that the structure and the number of items may change.
Basically the structure is:
key=value, key=value, key=value, ...
where value can be:
a single value (e.g. category=123 or postID=123 or mykey=myvalue, ...)
an "array" (e.g. category=[123,456,789])
a "boolean" where the TRUE value is an assumption from the fact that "key" exists in the array (e.g. subcategories)
This method should be flexible enough:
$str = 'category=[123,456,789], subcategories, id=579, not_in_category=[111,333]';
$str = preg_replace('#,([^0-9 ])#',', $1',$str); //fix for string format with no spaces (count=10,paginate,body_length=300)
preg_match_all('#(.+?)(,[^0-9]|$)#',$str,$sections); //get each section
$params = array();
foreach($sections[1] as $param)
{
list($key,$val) = array_pad(explode('=',$param,2),2,null); //Put either side of the "=" into variables $key and $val; $val is NULL when there is no "="
if(!is_null($val) && preg_match('#\[([0-9,]+)\]#',$val,$match)>0)
{
$val = explode(',',$match[1]); //turn the comma separated numbers into an array
}
$params[$key] = is_null($val) ? '' : $val;//Use blank string instead of NULL
}
echo '<pre>'.print_r($params,true).'</pre>';
var_dump(isset($params['subcategories']));
Output:
Array
(
[category] => Array
(
[0] => 123
[1] => 456
[2] => 789
)
[subcategories] =>
[id] => 579
[not_in_category] => Array
(
[0] => 111
[1] => 333
)
)
bool(true)
Alternate (no string manipulation before process):
$str = 'count=10,paginate,body_length=300,rawr=[1,2,3]';
preg_match_all('#(.+?)(,([^0-9,])|$)#',$str,$sections); //get each section
$params = array();
foreach($sections[1] as $k => $param)
{
list($key,$val) = array_pad(explode('=',$param,2),2,null); //Put either side of the "=" into variables $key and $val; $val is NULL when there is no "="
$key = isset($sections[3][$k-1]) ? trim($sections[3][$k-1]).$key : $key; //Fetch first character stolen by previous match
if(!is_null($val) && preg_match('#\[([0-9,]+)\]#',$val,$match)>0)
{
$val = explode(',',$match[1]); //turn the comma separated numbers into an array
}
$params[$key] = is_null($val) ? '' : $val;//Use blank string instead of NULL
}
echo '<pre>'.print_r($params,true).'</pre>';
Another alternate: full re-format of string before process for safety
$str = 'count=10,paginate,body_length=300,rawr=[1, 2,3] , name = mike';
$str = preg_replace(array('#\s+#','#,([^0-9 ])#'),array('',', $1'),$str); //fix for varying string formats
preg_match_all('#(.+?)(,[^0-9]|$)#',$str,$sections); //get each section
$params = array();
foreach($sections[1] as $param)
{
list($key,$val) = array_pad(explode('=',$param,2),2,null); //Put either side of the "=" into variables $key and $val; $val is NULL when there is no "="
if(!is_null($val) && preg_match('#\[([0-9,]+)\]#',$val,$match)>0)
{
$val = explode(',',$match[1]); //turn the comma separated numbers into an array
}
$params[$key] = is_null($val) ? '' : $val;//Use blank string instead of NULL
}
echo '<pre>'.print_r($params,true).'</pre>';
You can also use JSON, which is native in PHP: http://php.net/manual/fr/ref.json.php
It would be easier ;)
<?php
$subject = "category=[123,456,789], subcategories, id=579, not_in_category=[111,333]";
$pattern = '/category=\[(.*?)\,(.*?)\,(.*?)\]\,\s(subcategories),\sid=(.*?)\,\snot_in_category=\[(.*?)\,(.*?)\]/';
preg_match($pattern, $subject, $matches); // no offset: the pattern has to match from the start of "category="
print_r($matches);
?>
I think this will get the matches out. I didn't actually test it, but it might be a good starting point.
Then you just need to put the matches in the correct place in the array you need. Also test whether the subcategories string exists, with strcmp or something similar.
Also, note that I assumed your subject string has that fixed structure. If it changes often, you'll need much more than this.
$str = 'category=[123,456,789], subcategories, id=579, not_in_category=[111,333]';
$main_arr = preg_split('/(,\s)+/', $str);
$params = array();
foreach( $main_arr as $value) {
$pos = strpos($value, '=');
if($pos === false) {
$params[$value] = null;
} else {
$index_part = substr($value, 0, $pos);
$value_part = substr($value, $pos+1, strlen($value));
$match = preg_match('/\[(.*?)\]/', $value_part,$xarr);
if($match) {
$inner_arr = preg_split('/(,)+/', $xarr[1]);
foreach($inner_arr as $v) {
$params[$index_part][] = $v;
}
} else {
$params[$index_part] = $value_part;
}
}
}
print_r( $params );
Output :
Array
(
[category] => Array
(
[0] => 123
[1] => 456
[2] => 789
)
[subcategories] =>
[id] => 579
[not_in_category] => Array
(
[0] => 111
[1] => 333
)
)
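The structure described in the question (key=value pairs, bracketed lists, and bare keys that act as flags) can also be handled without any regex by tracking bracket depth while scanning for top-level commas. The following is a sketch under assumptions rather than any of the answers' methods: parseParams() is a hypothetical helper name, brackets are assumed not to nest in a meaningful way, and bare keys are stored as empty strings.

```php
<?php
// Hypothetical helper, a sketch rather than any answer's method.
function parseParams(string $str): array
{
    $tokens = [];
    $token  = '';
    $depth  = 0;   // greater than 0 while inside [ ... ]

    foreach (str_split($str) as $char) {
        if ($char === '[') {
            $depth++;
        } elseif ($char === ']') {
            $depth = max(0, $depth - 1);
        }
        if ($char === ',' && $depth === 0) {
            $tokens[] = $token;   // a top-level comma ends a section
            $token = '';
        } else {
            $token .= $char;
        }
    }
    $tokens[] = $token;

    $params = [];
    foreach ($tokens as $section) {
        $section = trim($section);
        if ($section === '') {
            continue;
        }
        $pair = explode('=', $section, 2);
        $key  = trim($pair[0]);
        if (count($pair) === 1) {
            $params[$key] = '';   // bare key: only its presence matters
            continue;
        }
        $val = trim($pair[1]);
        if (strlen($val) >= 2 && $val[0] === '[' && substr($val, -1) === ']') {
            // "[1, 2,3]" becomes ['1', '2', '3']
            $val = array_map('trim', explode(',', substr($val, 1, -1)));
        }
        $params[$key] = $val;
    }

    return $params;
}

$params = parseParams('category=[123,456,789], subcategories, id=579, not_in_category=[111,333]');
print_r($params);
```

Since sections are only split on commas outside brackets, the same function copes with inputs that have no spaces after commas or extra spaces around "=", without reformatting the string first.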
