Read lua-like code in php - php

I have a question.
I have code like this, and I want to read it with PHP.
NAME
{
title
(
A_STRING
);
settings
{
SetA( 15, 15 );
SetB( "test" );
}
desc
{
Desc
(
A_STRING
);
Cond
(
A_STRING
);
}
}
I want:
$arr['NAME']['title'] = "A_STRING";
$arr['NAME']['settings']['SetA'] = "15, 15";
$arr['NAME']['settings']['SetB'] = "test";
$arr['NAME']['desc']['Desc'] = "A_STRING";
$arr['NAME']['desc']['Cond'] = "A_STRING";
I don't know how I should start :/. The variables aren't always the same.
Can someone give me a hint on how to parse such a file?
Thx

This looks like a real grammar, so you should use a parser generator. This discussion should get you started.
There are a few ready-made options for PHP: a lexer generator module and a parser generator module.

This is not an answer, but a suggestion:
Maybe you can modify your input code to be compatible with JSON which has similar syntax. JSON parsers and generators are available for PHP.
http://www.json.org/
http://www.php.net/json
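As a rough illustration of this suggestion, the sample input could be rewritten by hand as JSON and read with json_decode(). The JSON text below is an assumed translation of the question's data, not something produced by a tool.

```php
<?php
// Hypothetical JSON rewrite of the sample input from the question.
$json = <<<'JSON'
{
    "NAME": {
        "title": "A_STRING",
        "settings": { "SetA": "15, 15", "SetB": "test" },
        "desc": { "Desc": "A_STRING", "Cond": "A_STRING" }
    }
}
JSON;

// Passing true as the second argument returns nested associative arrays.
$arr = json_decode($json, true);
if ($arr === null) {
    die('Invalid JSON: ' . json_last_error_msg());
}

echo $arr['NAME']['settings']['SetA'], "\n"; // 15, 15
```

The resulting array has the shape the question asks for, at the cost of changing the file format.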

If the files are this simple, then rolling your own homegrown parser is probably a lot easier. You'll eventually end up writing regex with lexers anyway. Here's a quick hack example: (in.txt should contain the input you provided above.)
<pre>
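<?php
$input_str = file_get_contents("in.txt");
print_r(parse_lualike($input_str));

function parse_lualike($str){
    // Newlines and semicolons carry no meaning here, so drop them.
    $str = preg_replace('/[\r\n]|[;]/', '', $str);
    // Tokens: identifiers, "( ... )" values (captured in group 1), "{" and "}".
    // The lazy [^)]*? keeps trailing whitespace out of the captured value.
    preg_match_all('/[a-zA-Z][a-zA-Z0-9_]*|[(]\s*([^)]*?)\s*[)]|[{]|[}]/', $str, $matches);
    $tree = array();
    $stack = array();   // references to the arrays currently being filled
    $pos = 0;
    $ident = null;      // most recently seen identifier
    $stack[$pos] = &$tree;
    foreach($matches[0] as $index => $token){
        if($token == '{'){
            // Open a nested block under the last identifier.
            $node = &$stack[$pos];
            $node[$ident] = array();
            $pos++;
            $stack[$pos] = &$node[$ident];
        }elseif($token == '}'){
            unset($stack[$pos]);
            $pos--;
        }elseif($token[0] == '('){
            // Store the text between the parentheses as the value.
            $stack[$pos][$ident] = $matches[1][$index];
        }else{
            $ident = $token;
        }
    }
    return $tree;
}
?>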
Quick explanation: The first preg_replace removes all newlines and semicolons, as they seem superfluous. The next part divides the input string into different 'tokens': names, brackets, and the text in between parentheses. Add a print_r($matches); there to see what it does.
Then there's just a really hackish state machine (read: a for-loop) that goes through the tokens and adds them to a tree. It also has a stack to be able to build nested trees.
Please note that this algorithm is in no way tested. It will probably break when presented with "real life" input. For instance, a parenthesis inside a value will cause trouble. Also note that it doesn't remove quotes from strings. I'll leave all that to someone else...
But, as you requested, it's a start :)
Cheers!
PS. Here's the output of the code above, for convenience:
Array
(
[NAME] => Array
(
[title] => A_STRING
[settings] => Array
(
[SetA] => 15, 15
[SetB] => "test"
)
[desc] => Array
(
[Desc] => A_STRING
[Cond] => A_STRING
)
)
)
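The answer notes that quotes are not removed from string values. One possible follow-up, assuming a tree shaped like the output of parse_lualike() above, is to post-process the leaves. The helper name strip_quotes is made up for this sketch, and it only handles a single pair of surrounding double quotes.

```php
<?php
// Hypothetical helper: removes one pair of surrounding double quotes
// from every string leaf of a nested array, in place.
function strip_quotes(array &$tree)
{
    array_walk_recursive($tree, function (&$value) {
        if (is_string($value) && preg_match('/^"(.*)"$/s', $value, $m)) {
            $value = $m[1];
        }
    });
}

// Sample tree shaped like the output shown above.
$tree = array('NAME' => array('settings' => array('SetA' => '15, 15', 'SetB' => '"test"')));
strip_quotes($tree);
print_r($tree['NAME']['settings']);
```

Values without surrounding quotes, such as 15, 15, are left untouched.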

Related

php challenge: parse pseudo-regex

I have a challenge that I have not been able to figure out, but it seems like it could be fun and relatively easy for someone who thinks in algorithms...
If my search term has a "?" character in it, it means that it should not care if the preceding character is there (as in regex). But I want my program to print out all the possible results.
A few examples: "tab?le" should print out "table" and "tale". The number of results is always 2 to the power of the number of question marks. As another example: "carn?ati?on" should print out:
caraton
caration
carnaton
carnation
I'm looking for a function that will take in the word with the question marks and output an array with all the results...
Following your example of "carn?ati?on":
You can split the word/string into an array on "?" then the last character of each string in the array will be the optional character:
[0] => carn
[1] => ati
[2] => on
You can then create the two separate possibilities (ie. with and without that last character) for each element in the first array and map these permutations to another array. Note the last element should be ignored for the above transformation since it doesn't apply. I would make it of the form:
[0] => [carn, car]
[1] => [ati, at]
[2] => [on]
Then I would iterate over each element in the sub arrays to compute all the different combinations.
If you get stuck in applying this process just post a comment.
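The split-and-combine process described above could be sketched as follows. The function name expand_optional is invented for illustration, and the sketch assumes the term neither starts with "?" nor contains "??".

```php
<?php
// Expands every "?" (optional preceding character) into all variants.
// Assumes the term does not start with "?" and has no "??" sequences.
function expand_optional($term)
{
    $parts = explode('?', $term);
    $last = array_pop($parts);           // the final piece has no optional character
    $results = array('');
    foreach ($parts as $part) {
        $without = substr($part, 0, -1); // piece with its last character dropped
        $next = array();
        foreach ($results as $prefix) {
            $next[] = $prefix . $without;
            $next[] = $prefix . $part;
        }
        $results = $next;
    }
    // Append the fixed final piece to every combination.
    foreach ($results as $i => $prefix) {
        $results[$i] = $prefix . $last;
    }
    return $results;
}

print_r(expand_optional('carn?ati?on'));
```

Each "?" doubles the number of prefixes, which matches the 2-to-the-power-of-question-marks count mentioned in the question.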
I think a loop like this should work:
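$to_process = array("carn?ati?on");
$results = array();
// Compare against null so that an empty or "0" item does not end the loop early.
while (($item = array_shift($to_process)) !== null) {
    $pos = strpos($item, "?");
    if ($pos === false) {
        // No "?" left: this variant is complete.
        $results[] = $item;
    }
    elseif ($pos === 0) {
        throw new Exception("A term (".$item.") cannot begin with ?");
    }
    else {
        // Queue one variant that keeps the optional character and one that drops it.
        $to_process[] = substr($item, 0, $pos).substr($item, $pos+1);
        $to_process[] = substr($item, 0, $pos-1).substr($item, $pos+1);
    }
}
var_dump($results);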

read array from string php

I have a string like this
$php_string = '$user["name"] = "Rahul";$user["age"] = 12;$person["name"] = "Jay";$person["age"] = 12;';
or like this
$php_string = '$user = array("name"=>"Rahul","age"=>12);$person= array("name"=>"Jay","age"=>12);';
I need to get the array from the string.
Expected result is
print_r($returned);
Array
(
[name] => Rahul
[age] => 12
)
Please note that the string may contain other content, including comments, other PHP code, etc.
Instead of relying on some magical regular expression, I would go a slightly easier route and use token_get_all() to tokenize the string and create a very basic parser that can create the necessary structures based on both array construction methods.
I don't think many people have rolled this themselves but it's likely the most stable solution.
Use a combination of eval and preg_match_all like so:
if(preg_match_all('/array\s*\(.*\)/U', $php_string, $arrays)){
    // $arrays[0] holds the full matches; never eval() untrusted input
    foreach($arrays[0] as $array){
        $myArray = eval("return {$array};");
        print_r($myArray);
    }
}
That will work as long as your array doesn't contain ) but can be modified further to handle that case
or as Jack suggests use token_get_all() like so:
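$tokens = token_get_all('<?php ' . $php_string); // the tokenizer needs an opening tag
$code = null; // source text of the array(...) expression being collected
$depth = 0;
foreach($tokens as $token){
    $text = is_array($token) ? $token[1] : $token;
    if($code === null){
        if(is_array($token) && $token[0] === T_ARRAY) $code = $text; // T_ARRAY is only the word "array"
        continue;
    }
    $code .= $text;
    if($text === '(') $depth++;
    if($text === ')' && --$depth === 0){
        print_r(eval("return {$code};")); // the parentheses are balanced, so evaluate
        $code = null;
    }
}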

Create variable from print_r output [duplicate]

How can I create a variable from its print_r output? In other words, I'd like to know if something similar to my fictional var_import function exists in PHP. var_import would be the inverse of var_export.
Here is a use case:
$a = var_import('Array ( [0] => foo [1] => bar )');
$output = var_export($a);
echo $output; // Array ( [0] => foo [1] => bar )
If such a function does not exist, is there a tool (or online tool) to do this?
I am also interested in doing the same with var_dump output.
EDIT: The variable is only available as print_r output (a string). To clarify what I need, imagine the following situation: someone posts a sample on the internet somewhere with print_r output. To test their code, you need to import their print_r variable into your code. This is an example where var_import would be useful.
Amusingly the PHP manual contains an example that tries to recreate the original structure from the print_r output:
print_r_reverse()
http://www.php.net/manual/en/function.print-r.php#93529
However it does depend on whitespace being preserved. So you would need the actual HTML content, not the rendered text to pipe it back in.
Also, it doesn't look like it could understand anything but arrays, and it does not descend into nested structures. That would be incredibly difficult, as there is no real termination marker for strings, which can contain newlines, ), or even [0] and =>, any of which could be mistaken for print_r delimiters. Correctly reparsing the print_r structure would be near impossible even with a recursive regex (and an actual parser); it could only be accomplished by guesswork splitting like in the code linked above. (There are two more versions there; you may have to try each of them to find one that works for your specific case.)
Why don't you use var_export instead ?
var_export(array(1, 2, 3)); // array(1, 2, 3)
You can import var_export's output with eval(), however I would recommend you to avoid this function as much as possible.
The following functions are better for exporting and importing variables:
serialize() and unserialize():
$string = serialize(array(1, 2, 3));
$array = unserialize($string); // array(1, 2, 3);
Or json_encode() and json_decode():
$string = json_encode(array(1, 2, 3));
$array = json_decode($string);
You can wrap it in an output buffer:
ob_start();
print_r(array(1,2,3));
$arr = ob_get_clean();
echo $arr;
Ok so I misunderstood the first question. I think I have another solution which actually does answer your question:
<?php
$ar = array('foo','bar');
$p = print_r($ar, true);
$reg = '/\[([0-9]+)\] \=\> ([a-z]+)/';
$m = preg_match_all($reg, $p, $ms);
$new_ar = $ms[2];
echo "Your wanted result:\n";
print_r($new_ar);
If you want to import a var_export()'s variable, you can run the eval() function.
Or if you save the contents into a file (with a return statement), you can use the return value of include() or require().
But I would rather use serialize() and unserialize() or json_encode() and json_decode().
define('EXPORT_JSON', 1);
define('EXPORT_SERIALIZE', 2);
function exportIntoFile($var, $filename, $method=EXPORT_JSON)
{
if ( $method & EXPORT_JSON )
file_put_contents( $filename, json_encode($var) );
else if ($method & EXPORT_SERIALIZE)
file_put_contents( $filename, serialize($var) );
}
function importFromFile($filename, $method=EXPORT_JSON)
{
if ( $method & EXPORT_JSON )
return json_decode( file_get_contents($filename) );
else if ($method & EXPORT_SERIALIZE)
return unserialize( file_get_contents($filename) );
}
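The other route mentioned above, saving var_export() output to a file with a return statement and loading it with include, might look like this. The temporary file and the sample data are assumptions made for the sketch.

```php
<?php
// Write the variable as PHP source that returns it, then load it back.
$data = array('foo', 'bar', 'nested' => array(1, 2, 3));
$file = tempnam(sys_get_temp_dir(), 'exp'); // temporary file, for the sketch only

file_put_contents($file, '<?php return ' . var_export($data, true) . ';');
$restored = include $file; // include yields the file's return value
unlink($file);

var_dump($restored === $data); // bool(true)
```

Because the file is executed as PHP, this only makes sense for files you wrote yourself.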
I'm not good enough at regex to code the final cleanup. Here is how far I could get though:
$str = 'Array ( [0] => foo [1] => bar [2] => baz)';
$t1 = explode('(', $str);
$t2 = explode(')', $t1[1]);
$t3 = explode(' => ', $t2[0]);
unset($t3[0]);
print_r($t3);
output:
Array
(
[1] => foo [1]
[2] => bar [2]
[3] => baz
)

php array processing question

Before I write my own function to do it, is there any built-in function, or simple one-liner to convert:
Array
(
[0] => pg_response_type=D
[1] => pg_response_code=U51
[2] => pg_response_description=MERCHANT STATUS
[3] => pg_trace_number=477DD76B-B608-4318-882A-67C051A636A6
)
Into:
Array
(
[pg_response_type] => D
[pg_response_code] =>U51
[pg_response_description] =>MERCHANT STATUS
[pg_trace_number] =>477DD76B-B608-4318-882A-67C051A636A6
)
Just trying to avoid reinventing the wheel. I can always loop through it and use explode.
I can always loop through it and use explode.
that's what you should do.
Edit - didn't read the question right at all, whoops..
A foreach through the array is the quickest way to do this, e.g.
$new_arr = array();
foreach($arr as $key=>$val)
{
    // A limit of 2 keeps any "=" inside the value intact
    $new_vals = explode("=", $val, 2);
    $new_arr[$new_vals[0]] = $new_vals[1];
}
This should be around five lines of code. Been a while since I've done PHP but here's some pseudocode
foreach element in the array
explode result on the equals sign, set limit = 2
assign that key/value pair into a new array.
Of course, this breaks on keys that have more than one equals sign, so it's up to you whether you want to allow keys to have equals signs in them.
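A runnable version of that pseudocode might look like the following. The function name pairs_to_assoc and the choice of an empty string for entries without "=" are assumptions of this sketch.

```php
<?php
// Splits "key=value" strings into an associative array.
// The limit of 2 keeps any further "=" characters inside the value.
function pairs_to_assoc(array $lines)
{
    $result = array();
    foreach ($lines as $line) {
        $parts = explode('=', $line, 2);
        // Entries without "=" get an empty string as their value.
        $result[$parts[0]] = isset($parts[1]) ? $parts[1] : '';
    }
    return $result;
}

print_r(pairs_to_assoc(array(
    'pg_response_type=D',
    'pg_response_description=MERCHANT STATUS',
    'token=a=b',
)));
```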
You could do it like this:
$foo = array(
'pg_response_type=D',
'pg_response_code=U51',
'pg_response_description=MERCHANT STATUS',
'pg_trace_number=477DD76B-B608-4318-882A-67C051A636A6',
);
parse_str(implode('&', $foo), $foo);
var_dump($foo);
Just be sure to encapsulate this code in a function whose name conveys the intent.

Searching an array of different strings inside a single string in PHP

I have an array of strings that I want to try and match to the end of a normal string. I'm not sure the best way to do this in PHP.
This is sorta what I am trying to do:
Example:
Input: abcde
Search array: er, wr, de
Match: de
My first thought was to write a loop that goes through the array and crafts a regular expression by adding "\b" on the end of each string and then check if it is found in the input string. While this would work it seems sorta inefficient to loop through the entire array. I've been told regular expressions are slow in PHP and don't want to implement something that will take me down the wrong path.
Is there a better way to see if one of the strings in my array occurs at the end of the input string?
The preg_filter() function looks like it might do the job but is for PHP 5.3+ and I am still sticking with 5.2.11 stable.
For something this simple, you don't need a regex. You can loop over the array and use strrpos to see if the index of the last occurrence is length(input) - length(test). If every entry in the search array has the same length, you can also speed things up by chopping the end off the input once, then comparing that to each item in the array.
You can't avoid going through the whole array, as in the worst general case, the item that matches will be at the end of the array. However, unless the array is huge, I wouldn't worry too much about performance - it will be much faster than you think.
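A minimal sketch of the loop described in this answer, using substr() to compare only the tail of the input. The function name find_suffix is hypothetical.

```php
<?php
// Returns the first suffix from $suffixes that $input ends with, or false.
function find_suffix($input, array $suffixes)
{
    $inputLength = strlen($input);
    foreach ($suffixes as $suffix) {
        $suffixLength = strlen($suffix);
        if ($suffixLength === 0 || $suffixLength > $inputLength) {
            continue; // empty or too long to be a suffix
        }
        // Compare only the tail of the input against the candidate.
        if (substr($input, -$suffixLength) === $suffix) {
            return $suffix;
        }
    }
    return false;
}

var_dump(find_suffix('abcde', array('er', 'wr', 'de'))); // string(2) "de"
```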
Though compiling the regular expression takes some time I wouldn't dismiss using pcre so easily. Unless you find a compare function that takes several needles you need a loop for the needles and executing the loop + calling the compare function for each single needle takes time, too.
Let's take a test script that fetches all the function names from php.net and looks for certain endings. This was only an adhoc script but I suppose no matter which strcmp-ish function + loop you use it will be slower than the simple pcre pattern (in this case).
count($hs)=5549
pcre: 4.377925157547 s
substr_compare: 7.951938867569 s
identical results: bool(true)
This was the result when searching for nine different patterns. If there were only two ('yadda', 'ge') both methods took the same time.
Feel free to criticize the test script (aren't there always errors in synthetic tests that are obvious for everyone but oneself? ;-) )
<?php
/* get the test data
All the function names from php.net
*/
$doc = new DOMDocument;
$doc->loadhtmlfile('http://docs.php.net/quickref.php');
$xpath = new DOMXPath($doc);
$hs = array();
foreach( $xpath->query('//a') as $a ) {
$hs[] = $a->textContent;
}
echo 'count($hs)=', count($hs), "\n";
// should find:
// ge, e.g. imagick_adaptiveblurimage
// ing, e.g. m_setblocking
// name, e.g. basename
// ions, e.g. assert_options
$ns = array('yadda', 'ge', 'foo', 'ing', 'bar', 'name', 'abcd', 'ions', 'baz');
sleep(1);
/* test 1: pcre */
$start = microtime(true);
for($run=0; $run<100; $run++) {
$matchesA = array();
$pattern = '/(?:' . join('|', $ns) . ')$/';
foreach($hs as $haystack) {
if ( preg_match($pattern, $haystack, $m) ) {
#$matchesA[$m[0]]+= 1;
}
}
}
echo "pcre: ", microtime(true)-$start, " s\n";
flush();
sleep(1);
/* test 2: loop + substr_compare */
$start = microtime(true);
for($run=0; $run<100; $run++) {
$matchesB = array();
foreach( $hs as $haystack ) {
$hlen = strlen($haystack);
foreach( $ns as $needle ) {
$nlen = strlen($needle);
if ( $hlen >= $nlen && 0===substr_compare($haystack, $needle, -$nlen) ) {
#$matchesB[$needle]+= 1;
}
}
}
}
echo "substr_compare: ", microtime(true)-$start, " s\n";
echo 'identical results: '; var_dump($matchesA===$matchesB);
I might approach this backwards;
if your string-ending list is fixed or varies rarely,
I would start by preprocessing it to make it easy to match against,
then grab the end of your string and see if it matches!
Sample code:
<?php
// Test whether string ends in predetermined list of suffixes
// Input: string to test
// Output: if matching suffix found, returns suffix as string, else boolean false
function findMatch($str) {
$matchTo = array(
2 => array( 'ge' => true, 'de' => true ),
3 => array( 'foo' => true, 'bar' => true, 'baz' => true ),
4 => array( 'abcd' => true, 'efgh' => true )
);
foreach($matchTo as $length => $list) {
$end = substr($str, -$length);
if (isset($list[$end]))
return $end;
}
return false;
}
?>
This might be an overkill but you can try the following.
Create a hash for each entry of your search array and store them as keys in the array (that will be your lookup array).
Then go from the end of your input string one character at a time (e, de, cde, etc.) and compute a hash of the substring at each iteration. If a hash is in your lookup array, you have a match.
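Since PHP arrays are already hash tables, one way to try this idea is to use the suffixes themselves as array keys instead of computing a separate hash. The function name find_suffix_by_lookup is made up for this sketch, and preferring the longest match is a design choice, not part of the suggestion above.

```php
<?php
// Looks for the longest known suffix by probing a key lookup table.
function find_suffix_by_lookup($input, array $suffixes)
{
    $lookup = array_flip($suffixes);        // suffix => index, for isset() checks
    $maxLength = 0;
    foreach ($suffixes as $suffix) {
        $maxLength = max($maxLength, strlen($suffix));
    }
    $found = false;
    $limit = min($maxLength, strlen($input));
    // Try the tails "e", "de", "cde", ... up to the longest suffix length.
    for ($length = 1; $length <= $limit; $length++) {
        $tail = substr($input, -$length);
        if (isset($lookup[$tail])) {
            $found = $tail;                 // keep going to prefer longer matches
        }
    }
    return $found;
}

var_dump(find_suffix_by_lookup('abcde', array('er', 'wr', 'de'))); // string(2) "de"
```

The number of lookups depends on the longest suffix length rather than on the size of the search array.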
