I have a PHP file as a string. I am looking for places where certain functions are called, and I want to extract the arguments passed to them.
I need to match the following cases:
some_function_name("abc123", ['key' => 'value'])
some_function_name("abc123", array("key" => 'value'))
So far I have this, but it breaks as soon as the arguments contain nested parentheses:
(function_name)\(([^()]+)\)
$text = "test test test test some_function_name('abc123', ['key' => 'value']) sdohjsh dsfkjh spkdo sdfopmsdfohp some_function_name('abc123', array('key' => 'value'))";
preg_match_all('/\w+\(.*?\)(\)|!*)/', $text, $matches);
var_dump($matches[0]);
Is this the result you want?
$text = "blah some_function_name('abc123', ['key' => 'value']) blah some_function_name('abc123', array('key' => 'value')) blah";
preg_match_all('/\w+\(.+?(?:array\(.+?\)|\[.+?\])\)/', $text, $matches);
var_dump($matches);
Output:
array(1) {
[0]=>
array(2) {
[0]=>
string(48) "some_function_name('abc123', ['key' => 'value'])"
[1]=>
string(53) "some_function_name('abc123', array('key' => 'value'))"
}
}
Explanation:
\w+ # 1 or more word character (i.e. [a-zA-Z0-9_])
\( # opening parenthesis
.+? # 1 or more any character, not greedy
(?: # non capture group
array\(.+?\) # array(, 1 or more any character, )
| # OR
\[.+?\] # [, 1 or more any character, ]
) # end group
\) # closing parenthesis
I managed to solve it using the following pattern:
((\'.*?\'|\".*?\")(\s*,\s*.*?)*?\);?
Thanks everyone for your suggestions!
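For the nesting problem raised in the question, one option that none of the answers above use is PCRE recursion, which lets a pattern match balanced parentheses to any depth. Below is a minimal sketch under some assumptions: the function name is known in advance, the sample $text is made up for illustration, and no parenthesis appears inside a string literal (a case this pattern does not handle).

```php
<?php
// Sample input invented for this sketch; the second call nests array() twice.
$text = "foo some_function_name(\"abc123\", ['key' => 'value']) bar "
      . "some_function_name('abc123', array('key' => array('x' => 1))) baz";

// Group 1 matches "(", then any run of non-parenthesis characters or a
// recursive call to group 1 itself via (?1), then the closing ")".
$pattern = '/\bsome_function_name(\((?:[^()]++|(?1))*+\))/';

preg_match_all($pattern, $text, $matches);

// Drop the outer parentheses to keep only the raw argument list.
$args = array_map(function ($m) { return substr($m, 1, -1); }, $matches[1]);

print_r($args);
```

Splitting the extracted list into individual arguments is a separate step. For real PHP source, tokenizing with token_get_all() is probably more robust than any regex.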
I have a problem. I would like to take this string:
[NAME: abc] [EMAIL: email#gm.com] [TIMEFRAME: 3 weeks] [BUDGET: 1000 dollars] [MESSAGE: bla bla bla]
and convert it into an array of the form:
array(
'NAME' => 'abc',
'EMAIL' => 'email#gm.com',
'TIMEFRAME' => '3 weeks',
'BUDGET' => '1000 dollars',
'MESSAGE' => 'bla bla bla' );
I tried to do something like this:
$content = str_replace(array('[', ']'), '', '[NAME: abc] [EMAIL: email#gm.com] [TIMEFRAME: 3 weeks] [BUDGET: 1000 dollars] [MESSAGE: bla bla bla]');
preg_match_all('/[A-Z]+\:/', $content, $inputs);
I managed to pull out the "keys", but I do not know how to pull out their "values". Any ideas?
Thank you in advance for your help and I apologize for my English.
You may use the following regex:
'~\[(\w+):\s*([^][]*)]~'
See the regex demo.
Details
\[ - a [ char
(\w+) - Group 1: 1+ letters, digits or _
: - a colon
\s* - 0+whitespaces
([^][]*) - Group 2: 0+ chars other than [ and ]
] - a ] char.
See the PHP demo:
$s = "[NAME: abc] [EMAIL: cde] [TIMEFRAME: efg] [BUDGET: hij] [MESSAGE: klm]";
if (preg_match_all('~\[(\w+):\s*([^][]*)]~', $s, $m)) {
array_shift($m); // Removes whole match values from array
print_r(array_combine($m[0], $m[1])); // Build the result with keys (Group 1) and values (Group 2)
}
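As a quick check against the string from the question (whose values contain # and spaces), here is a small sketch that uses the same pattern with PREG_SET_ORDER, so each match carries its own key and value. The variable names are only illustrative.

```php
<?php
$s = '[NAME: abc] [EMAIL: email#gm.com] [TIMEFRAME: 3 weeks] [BUDGET: 1000 dollars] [MESSAGE: bla bla bla]';

$result = array();
if (preg_match_all('~\[(\w+):\s*([^][]*)]~', $s, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        // $m[1] is the key (Group 1), $m[2] is the value (Group 2).
        $result[$m[1]] = $m[2];
    }
}
print_r($result);
```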
I have this as an input to my command line interface as parameters to the executable:
-Parameter1=1234 -Parameter2=38518 -param3 "Test \"escaped\"" -param4 10 -param5 0 -param6 "TT" -param7 "Seven" -param8 "secret" "-SuperParam9=4857?--SuperParam10=123"
What I want is to get all of the parameters as a key-value (associative) array in PHP, like this:
$result = [
'Parameter1' => '1234',
'Parameter2' => '38518',
'param3' => 'Test \"escaped\"',
'param4' => '10',
'param5' => '0',
'param6' => 'TT',
'param7' => 'Seven',
'param8' => 'secret',
'SuperParam9' => '4857',
'SuperParam10' => '123',
];
The difficulties are the following:
the parameter prefix can be - or --
the parameter glue (value assignment operator) can be either an = sign or a whitespace ' '
some parameters may sit inside a quoted block and use different separators, glues and prefixes, e.g. a ? mark as the separator.
So far, since I'm still learning regex, all I have is this:
/(-[a-zA-Z]+)/gui
With it I can get all the parameter names starting with a -.
I could explode the whole string and parse it by hand, but there are far too many edge cases to think about.
You can try this, which uses the branch reset feature (?|...|...) to deal with the different possible formats of the values:
$str = '-Parameter1=1234 -Parameter2=38518 -param3 "Test \"escaped\"" -param4 10 -param5 0 -param6 "TT" -param7 "Seven" -param8 "secret" "-SuperParam9=4857?--SuperParam10=123"';
$pattern = '~ --?(?<key> [^= ]+ ) [ =]
(?|
" (?<value> [^\\\\"]*+ (?s:\\\\.[^\\\\"]*)*+ ) "
|
([^ ?"]*)
)~x';
preg_match_all ($pattern, $str, $matches);
$result = array_combine($matches['key'], $matches['value']);
print_r($result);
In a branch reset group, the capture groups have the same number or the same name in each branch of the alternation.
This means that (?<value> [^\\\\"]*+ (?s:\\\\.[^\\\\"]*)*+ ) is (obviously) the value named capture, but that ([^ ?"]*) is also the value named capture.
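To see the branch reset behaviour in isolation, here is a stripped-down sketch (the input string and variable names are made up for illustration). Both alternatives write to group 1, so the caller does not need to know which branch matched.

```php
<?php
$input = 'foo "bar baz" qux';

// Branch reset: the quoted and the unquoted alternative both fill group 1.
preg_match_all('/(?|"([^"]*)"|(\S+))/', $input, $reset);

// Three values, with the quotes stripped from the second one.
print_r($reset[1]);
```

With a plain non-capturing group (?:...) instead, the quoted value would land in group 1 and the unquoted ones in group 2, and the two lists would have to be merged afterwards.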
You could use
--?
(?P<key>\w+)
(?|
=(?P<value>[^-\s?"]+)
|
\h+"(?P<value>.*?)(?<!\\)"
|
\h+(?P<value>\H+)
)
See a demo on regex101.com.
Which in PHP would be:
<?php
$data = <<<DATA
-Parameter1=1234 -Parameter2=38518 -param3 "Test \"escaped\"" -param4 10 -param5 0 -param6 "TT" -param7 "Seven" -param8 "secret" "-SuperParam9=4857?--SuperParam10=123"
DATA;
$regex = '~
--?
(?P<key>\w+)
(?|
=(?P<value>[^-\s?"]+)
|
\h+"(?P<value>.*?)(?<!\\\\)"
|
\h+(?P<value>\H+)
)~x';
if (preg_match_all($regex, $data, $matches)) {
$result = array_combine($matches['key'], $matches['value']);
print_r($result);
}
?>
This yields
Array
(
[Parameter1] => 1234
[Parameter2] => 38518
[param3] => Test \"escaped\"
[param4] => 10
[param5] => 0
[param6] => TT
[param7] => Seven
[param8] => secret
[SuperParam9] => 4857
[SuperParam10] => 123
)
I want to extract parameters like the ones that appear in annotations.
This is what I have so far:
$str = "(action=bla,arg=2,test=15,op)";
preg_match_all('/([^\(,]+)=([^,\)]*)/', $str, $m);
$data = array_combine($m[1], $m[2]);
var_dump($data);
This gives the following output:
array (size=3)
'action' => string 'bla' (length=3)
'arg' => string '2' (length=1)
'test' => string '15' (length=2)
This ignores op (I want it to be present with a null or empty value).
I would also like to improve this so it can handle these cases:
(action='val',abc) in this case the value inside the single quotes is assigned to action
(action="val",abc) same as above, but the value is taken from between the double quotes
(action=[123,11,23]) now action will contain the array 123,11,23 (the elements should be extracted with or without quotation marks)
I don't need a complete solution (though it is most welcome if you can provide one), but I need at least the first two.
EDIT
(edit as per discussion with r3mus)
The output should look like:
array (size=4)
'action' => string 'bla' (length=3)
'arg' => string '2' (length=1)
'test' => string '15' (length=2)
'op' => NULL
Edit:
This ended up being a lot more complex than just a simple regex. It ended up looking (first pass) like this:
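function validate($str)
{
    // Temporarily swap the commas inside [...] for "+" so the later split on "," leaves the list intact.
    if (preg_match('/=\[(.*)\]/', $str, $m))
    {
        $newstr = str_replace(",", "+", $m[1]);
        // str_replace avoids treating the captured text as a regex pattern.
        $str = str_replace($m[1], $newstr, $str);
    }
    if (!preg_match('/\((.*)\)/', $str, $m))
        return false;
    $output = array();
    foreach (explode(",", $m[1]) as $value)
    {
        // A limit of 2 keeps any "=" inside the value; a bare name such as "op" gets NULL.
        $pair = explode("=", $value, 2);
        $val = isset($pair[1]) ? $pair[1] : null;
        // Strip the surrounding brackets before splitting the list back into elements.
        if ($val !== null && preg_match('/^\[(.*)\]$/', $val, $list))
            $val = explode("+", $list[1]);
        $output[$pair[0]] = $val;
    }
    // Kept from the first pass: input where 'op' carries a value is rejected.
    if (!isset($output['op']))
        return $output;
    else
        return false;
}
print_r(validate("(action=[123,11,23],arg=2,test=15)"));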
Old stuff that wasn't adequate:
How about:
([^\(,]+)=(\[.*\]|['"]?(\w*)['"]?)
Working example/sandbox: http://regex101.com/r/bZ8qE6
Or if you need to capture only the array within the []:
([^\(,]+)=(\[(.*)\]|['"]?(\w*)['"]?)
I know it's answered, but you could do this, which I think is what you wanted:
$str = '(action=bla,arg=2,test=15,op)';
preg_match_all('/([^=,()]+)(?:=([^,)]+))?/', $str, $m);
$data = array_combine($m[1], $m[2]);
echo '<pre>' . print_r($data, true) . '</pre>';
OUTPUTS
Array
(
[action] => bla
[arg] => 2
[test] => 15
[op] =>
)
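For the first two cases in the question (single- and double-quoted values), the pattern above can be extended with a branch reset group. This is only a sketch: it assumes quoted values contain no escaped quotes, the helper name parse_params is made up for illustration, and array values are not handled.

```php
<?php
// Hypothetical helper: returns name => value, with NULL for bare names such as "op".
function parse_params($str)
{
    // After "=", try a single-quoted value, then a double-quoted one, then a bare one.
    $pattern = '/(\w+)(?:=(?|\'([^\']*)\'|"([^"]*)"|([^,)]*)))?/';
    preg_match_all($pattern, $str, $matches, PREG_SET_ORDER);

    $data = array();
    foreach ($matches as $m) {
        // Group 2 is absent from $m when the name has no "=value" part.
        $data[$m[1]] = isset($m[2]) ? $m[2] : null;
    }
    return $data;
}

print_r(parse_params('(action=\'val\',abc)'));
print_r(parse_params('(action="val",arg=2,op)'));
```

Using PREG_SET_ORDER here is a deliberate choice: it makes a missing value distinguishable from an empty one, which array_combine on the default ordering cannot do.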
You can use this code:
<pre><?php
$subject = '(action=bla,arg=2,test=15,op, arg2=[1,2,3],arg3 = "to\\"t,o\\\\", '
. 'arg4 = \'titi\',arg5=) blah=312';
$pattern = <<<'LOD'
~
(?: \(\s* | \G(?<!^) ) # a parenthesis or contiguous to a precedent match
(?<param> \w+ )
(?: \s* = \s*
(?| (?<value> \[ [^]]* ] ) # array
| "(?<value> (?> [^"\\]++ | \\{2} | \\. )* )" # double quotes
| '(?<value> (?> [^'\\]++ | \\{2} | \\. )* )' # single quotes
| (?<value> [^\s,)]++ ) # other value
)? # the value can be empty
)? # the parameter can have no value
\s*
(?:
, \s* # a comma
| # OR
(?= (?<control> \) ) ) # followed by the closing parenthesis
)
~xs
LOD;
preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER);
foreach($matches as $match) {
printf("<br>%s\t%s", $match['param'], $match['value'] ?? '');
if (isset($match['control'])) echo '<br><br>#closing parenthesis#';
}
?></pre>
I have numbers like these:
1.80
2.75
#1.55
These numbers are inside strings, and I'm trying to extract them with preg_match. So far I have this:
$pattern = '/ [0-9]{1}\.[0-9]{2}/';
$result = preg_match($pattern, $feed, $matches);
This works fairly well, but I need my preg_match to be more precise and I haven't found a solution.
With this pattern, numbers like 1.556 are also matched. I don't want that: my numbers are always 4 characters long, dot included.
Also, I can only catch numbers preceded by a space, not those preceded by a #. How can I do this?
$result = preg_match($pattern, 'test 1.556 red #1.62 blue 2.33 ?', $matches);
Here the results needed are 1.62 and 2.33
As an alternative to regular expressions, you can use PHP's sanitization filters:
$array = explode(' ', 'test 1.556 red #1.62 blue 2.33 ?');
$result = filter_var_array(
array(
'convert' => $array
),
array(
'convert' => array(
'filter' => FILTER_SANITIZE_NUMBER_FLOAT,
'flags' => FILTER_FLAG_ALLOW_FRACTION | FILTER_FORCE_ARRAY
)
)
);
var_dump(array_filter(array_map('floatval', $result['convert'])));
results in:
array(3) {
[1]=>
float(1.556)
[3]=>
float(1.62)
[5]=>
float(2.33)
}
The following pattern will match all numbers in the format #.## with an optional leading space or hash sign.
[ #]?(\d{1}\.\d{2})\b
Demo: http://regex101.com/r/eB4bL5
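Applied to the test string from the question, that pattern can be used with preg_match_all as in this sketch (variable names are illustrative). The \b after the second decimal is what rejects 1.556.

```php
<?php
$feed = 'test 1.556 red #1.62 blue 2.33 ?';

// Optional leading space or #, then one digit, a dot and exactly two digits.
preg_match_all('/[ #]?(\d\.\d{2})\b/', $feed, $matches);

// Group 1 holds the numbers without their prefix.
print_r($matches[1]);
```

Because the prefix is optional, a longer number such as 21.62 would still yield 1.62. Making the prefix mandatory or adding a lookbehind would tighten this if such inputs can occur.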
If you want the match limited to four characters and the # to be caught as well, this is what you need:
$pattern = '/ #*([0-9]{1}\.[0-9]{2})\b /';
I'm having problems matching the [*], which is sometimes there and sometimes not. Does anyone have suggestions?
$name = 'hello $this->row[today1][] dfh fgh df $this->row[test1] ,how good $this->row[test2][] is $this->row[today2][*] is monday';
echo $name."\n";
preg_match_all( '/\$this->row[.*?][*]/', $name, $match );
var_dump( $match );
output:
hello $this->row[test] ,how good $this->row[test2] is $this->row[today][*] is monday
array (
0 =>
array (
0 => '$this->row[today1][*]',
1 => '$this->row[test1] ,how good $this->row[test2][*]',
2 => '$this->row[today2][*]',
),
)
Now the [0][1] match takes in too much, because it matches up to the next '[]' instead of ending at '$this->row[test]'. I'm guessing the [*]/ adds a wildcard. Somehow I need to check whether the next character is [ before matching the []. Anyone?
Thanks
[, ] and * are special metacharacters in regex, so you need to escape them. You also need to make the last [] optional, as per your question.
With these changes, the following should work:
$name = 'hello $this->row[today1][] dfh fgh df $this->row[test1] ,how good $this->row[test2][] is $this->row[today2][*] is monday';
echo $name."\n";
preg_match_all( '/\$this->row\[.*?\](?:\[.*?\])?/', $name, $match );
var_dump( $match );
OUTPUT:
array(1) {
[0]=>
array(4) {
[0]=>
string(20) "$this->row[today1][]"
[1]=>
string(17) "$this->row[test1]"
[2]=>
string(19) "$this->row[test2][]"
[3]=>
string(21) "$this->row[today2][*]"
}
}
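The difference between the escaped and unescaped brackets can be checked directly. In this sketch (the shortened $name is made up for illustration) the unescaped pattern finds nothing, because [.*?] is a character class matching a single ., * or ? rather than a bracketed index.

```php
<?php
$name = 'a $this->row[test1] b $this->row[today2][*] c';

// Unescaped: "[.*?]" and "[*]" are character classes, not literal brackets.
$unescapedCount = preg_match_all('/\$this->row[.*?][*]/', $name, $ignored);

// Escaped brackets, with the second [...] made optional.
preg_match_all('/\$this->row\[.*?\](?:\[.*?\])?/', $name, $match);

// preg_quote() does the escaping for a literal fragment.
$quoted = preg_quote('[*]');

var_dump($unescapedCount, $match[0], $quoted);
```

When a literal fragment comes from a variable, running it through preg_quote() before building the pattern avoids having to escape each metacharacter by hand.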