Improve preg / PCRE regex to find PHP variable names

String to parse:
$str = "
public $xxxx123;
private $_priv ;
$xxx = 'test';
private $arr_123 = array();
";
// Goal: get only the variable names (xxxx123, _priv, xxx, arr_123)
What I got so far:
$str = preg_match_all('/\$\S+(;|[[:space:]])/', $str, $matches);
foreach ($matches[0] as $match) {
$match = str_replace('$', '', $match);
$match = str_replace(';', '', $match);
}
It works, but I'd like to know whether the pattern can be improved, e.g. to get rid of the two str_replace calls and perhaps to include \t in (;|[[:space:]]).

Using a positive lookbehind, you can match only what you need. To be sure that only valid variable names are matched, I've used this:
preg_match_all('/(?<=\$)[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*/',$str,$matches);
var_dump($matches);
which correctly shows:
array (
0 =>
array (
0 => 'xxxx123',
1 => '_priv',
2 => 'xxx',
3 => 'arr_123'
)
)
That is all you need: no memory is wasted on an array containing all the variables with their leading and/or trailing characters.
The expression:
(?<=\$) is a positive lookbehind: the name must be preceded by a literal $, which is not included in the match
[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]* is the pattern the PHP manual itself gives for valid variable names
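As a rough sketch, the lookbehind pattern could be wrapped for reuse as follows. The function name extract_variable_names is made up for this example, and the sample string is single-quoted so that PHP does not try to interpolate the variables in it.

```php
<?php
// Hypothetical helper, not part of the original answer.
function extract_variable_names(string $source): array
{
    // (?<=\$) requires a "$" right before the match without consuming it;
    // the two character classes follow the PHP manual's rule for variable names.
    preg_match_all('/(?<=\$)[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*/', $source, $matches);
    return $matches[0];
}

$str = '
public $xxxx123;
private $_priv ;
$xxx = \'test\';
private $arr_123 = array();
';

print_r(extract_variable_names($str));
```

Because the first character class excludes digits, something like $1abc yields no match at all rather than a partial name.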

Simply use a capturing group:
preg_match_all('/\$(\S+?)[;\s=]/', $str, $matches);
foreach ($matches[1] as $match) {
// $match is now only the name of the variable without $ and ;
}

I changed the regex a little bit, take a look:
$str = '
public $xxxx123;
private $_priv ;
$xxx = "test";
private $arr_123 = array();
';
$matches = array();
//preg_match_all('/\$(\S+)[; ]/', $str, $matches);
// preg_match_all returns the number of matches, so its result is not assigned back to $str
preg_match_all('/\$(\S+?)(?:[=;]|\s+)/', $str, $matches); // credit to #booobs for this regex
print_r($matches);
The output:
Array
(
[0] => Array
(
[0] => $xxxx123;
[1] => $_priv
[2] => $xxx
[3] => $arr_123
)
[1] => Array
(
[0] => xxxx123
[1] => _priv
[2] => xxx
[3] => arr_123
)
)
Now you can use $matches[1] in the foreach loop.
::Update::
After switching to the regex "/\$([a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*)/", the output looks correct.
String:
$str = '
public $xxxx123; $input1;$input3
private $_priv ;
$xxx = "test";
private $arr_123 = array();
';
And the output:
Array
(
[0] => Array
(
[0] => $xxxx123
[1] => $input1
[2] => $input3
[3] => $_priv
[4] => $xxx
[5] => $arr_123
)
[1] => Array
(
[0] => xxxx123
[1] => input1
[2] => input3
[3] => _priv
[4] => xxx
[5] => arr_123
)
)

Related

PHP Explode string between two characters to arrays? (Noob question)

Hello :) I am a beginner in PHP.
I tried several times but did not succeed.
I would like to parse a string like:
[1,[01,11,12],[20,21,22]]
to
`
arr[0][0]=>1
arr[1][0]=>01
arr[1][1]=>11
arr[1][2]=>12
arr[2][0]=>20
arr[2][1]=>21
arr[2][2]=>22
`
You can split your string on a comma that is not enclosed by [ and ] using this regex (inspired by this answer) with preg_split:
,(?![^\[]*\])
and then trim surrounding [ and ] from the resultant parts and split those strings on commas into succeeding elements of the output array. For example:
$string = '[1,[01,11,12] ,4 ,5, [20,21,22]]';
$parts = preg_split('/,(?![^\[]*\])/', $string, -1, PREG_SPLIT_DELIM_CAPTURE);
$output = array();
foreach ($parts as $part) {
$part = trim($part, '[] ');
$output[] = explode(',', $part);
}
print_r($output);
Output:
Array
(
[0] => Array
(
[0] => 1
)
[1] => Array
(
[0] => 01
[1] => 11
[2] => 12
)
[2] => Array
(
[0] => 4
)
[3] => Array
(
[0] => 5
)
[4] => Array
(
[0] => 20
[1] => 21
[2] => 22
)
)
Demo on 3v4l.org
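If you're 100% certain of the source and safety of the string, you can also just use eval (the [...] array syntax needs PHP 5.4 or later):
eval("\$output = $string;");
The result is nested the same way but is not identical: top-level scalars such as 1 stay scalars instead of becoming one-element arrays, the values are integers rather than strings, and a number with a leading zero is read as an octal literal (01 becomes 1, and a value such as 08 is not a valid octal literal).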

Regex with lookahead and lookbehind

I have the following regexp that works great.
$str = "ID: {{item:id}} & First name: {{item:first_name}} & Page Title: {{page:title}}";
preg_match_all('/(?<={{)[^}]*(?=}})/', $str, $matches);
print_r($matches);
Returns:
Array
(
[0] => Array
(
[0] => item:id
[1] => item:first_name
[2] => page:title
)
)
How do I need to modify the regex to force it to match the item:id and item:first_name only (or any other string starting with "item:")? I tried adding the "item" to the regex (in several different places) but it didn't work.
You can use:
preg_match_all('/(?<={{)item:[^}]*(?=}})/', $str, $matches);
print_r($matches[0]);
Array
(
[0] => item:id
[1] => item:first_name
)
With this you can group the tokens, so you don't need to limit the expression to any single type:
(?<={{)(.+?)(?:\:)(.+?)(?=}})
Example usage:
$str = "ID: {{item:id}} & First name: {{item:first_name}} & Page Title: {{page:title}}";
preg_match_all('/(?<={{)(.+?)(?:\:)(.+?)(?=}})/', $str, $matches);
$tokens = array();
foreach ($matches[0] as $i => $v) {
$tokens[$matches[1][$i]][] = $matches[2][$i];
}
echo '<pre>';
print_r($tokens);
Output:
Array
(
[item] => Array
(
[0] => id
[1] => first_name
)
[page] => Array
(
[0] => title
)
)
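A possible variation on the grouping approach, offered only as a sketch: named groups with negated character classes instead of lookarounds and .+?, so that each part cannot run past a colon or a closing brace. The group names type and name are chosen here for illustration.

```php
<?php
$str = "ID: {{item:id}} & First name: {{item:first_name}} & Page Title: {{page:title}}";

// [^:}]+ and [^}]+ keep both parts inside a single {{...}} pair.
// PREG_SET_ORDER returns one array per match, which is easier to loop over.
preg_match_all('/\{\{(?<type>[^:}]+):(?<name>[^}]+)\}\}/', $str, $matches, PREG_SET_ORDER);

$tokens = array();
foreach ($matches as $m) {
    $tokens[$m['type']][] = $m['name'];
}
print_r($tokens);
```

This groups the tokens the same way as the loop above: id and first_name under item, title under page.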

How to extract multiple values from a string to call an array?

I want to extract values from a string to call an array for basic template functionality:
$string = '... #these.are.words-I_want.to.extract# ...';
$output = preg_replace_callback('~\#([\w-]+)(\.([\w-]+))*\#~', function($matches) {
print_r($matches);
// Replace matches with array value: $these['are']['words-I_want']['to']['extract']
}, $string);
This gives me:
Array
(
[0] => #these.are.words-I_want.to.extract#
[1] => these
[2] => .extract
[3] => extract
)
But I'd like:
Array
(
[0] => #these.are.words-I_want.to.extract#
[1] => these
[2] => are
[3] => words-I_want
[4] => to
[5] => extract
)
Which changes do I need to make to my regex?
It seems that the words are simply dot-separated, so match runs of characters that are neither # nor .:
preg_replace_callback('/[^#.]+/', function($match) {
// ...
}, $str);
Should give the expected results.
However, if the # characters are the boundary of where the matching should take place, you would need a separate match and then use a simple explode() inside:
preg_replace_callback('/#(.*?)#/', function($match) {
$parts = explode('.', $match[1]);
// ...
}, $str);
You can use the array_merge() function to merge the two resulting arrays:
$string = '... #these.are.words-I_want.to.extract# ...';
$result = array();
if (preg_match('~#([^#]+)#~', $string, $m)) {
$result[] = $m[0];
$result = array_merge($result, explode('.', $m[1]));
}
print_r($result);
Output:
Array
(
[0] => #these.are.words-I_want.to.extract#
[1] => these
[2] => are
[3] => words-I_want
[4] => to
[5] => extract
)
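To connect this back to the question's goal of replacing the placeholder with a nested array value, here is a minimal sketch that combines the #...# match with explode(). The $data array and its VALUE entry are invented for the example, and leaving unknown placeholders untouched is just one possible policy.

```php
<?php
// Hypothetical template data; the keys mirror the placeholder from the question.
$data = array(
    'these' => array('are' => array('words-I_want' => array('to' => array('extract' => 'VALUE')))),
);

$string = '... #these.are.words-I_want.to.extract# ...';

$output = preg_replace_callback('~#([\w-]+(?:\.[\w-]+)*)#~', function ($m) use ($data) {
    $value = $data;
    // Walk down the array one dot-separated key at a time.
    foreach (explode('.', $m[1]) as $key) {
        if (!is_array($value) || !array_key_exists($key, $value)) {
            return $m[0]; // unknown path: leave the placeholder as it is
        }
        $value = $value[$key];
    }
    // Only scalar values can be substituted into the string.
    return is_scalar($value) ? (string) $value : $m[0];
}, $string);

echo $output, "\n";
```

The regex only has to find the whole placeholder; splitting it into keys is left to explode(), which sidesteps the problem that a repeated capture group keeps only its last iteration.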

Is this possible with preg_match?

I have strings that look similar to this:
"size:34,35,36,36,37|color:blue,red,white"
Is it possible to match all the colors with preg_match(_all),
so that I get "blue", "red" and "white" in the output array?
The colors can be anything, so I can't just use (blue|red|white).
Explode on |
Explode on :
Explode on ,
???
Profit!
Code
IMHO using regular expressions like what's been suggested in the other answers is a much "uglier" solution than something simple like so:
$input = 'size:34,35,36,36,37|color:blue,red,white|undercoating:yes,no,maybe,42';
function get_option($name, $string) {
$raw_opts = explode('|', $string);
$pattern = sprintf('/^%s:/', $name);
foreach( $raw_opts as $opt_str ) {
if( preg_match($pattern, $opt_str) ) {
$temp = explode(':', $opt_str);
return $opts = explode(',', $temp[1]);
}
}
return false; //no match
}
function get_all_options($string) {
$options = array();
$raw_opts = explode('|', $string);
foreach( $raw_opts as $opt_str ) {
$temp = explode(':', $opt_str);
$options[$temp[0]] = explode(',', $temp[1]);
}
return $options;
}
print_r(get_option('undercoating', $input));
print_r(get_all_options($input));
Output:
Array
(
[0] => yes
[1] => no
[2] => maybe
[3] => 42
)
Array
(
[size] => Array
(
[0] => 34
[1] => 35
[2] => 36
[3] => 36
[4] => 37
)
[color] => Array
(
[0] => blue
[1] => red
[2] => white
)
[undercoating] => Array
(
[0] => yes
[1] => no
[2] => maybe
[3] => 42
)
)
You can achieve it in a roundabout way with preg_match_all(), but I'd recommend explode instead.
preg_match_all('/([a-z]+)(?:,|$)/', "size:34,35,36,36,37|color:blue,red,white", $a);
print_r($a[1]);
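In regex flavours that allow variable-length lookbehind, I think it's possible with a lookbehind:
/(?<=(^|\|)color:([^,|]+,)*)[^,|]+(?=\||,|$)/g
PCRE, which backs preg_match_all, does not accept an unbounded quantifier inside a lookbehind and has no g modifier, so this pattern will not compile in PHP as written.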
Your explode solution is obviously cleaner :-)
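For a regex-only route that PHP's PCRE engine does accept, one option is to anchor on \G instead of a lookbehind. This is a minimal sketch and assumes the values never contain a comma or a pipe.

```php
<?php
$input = 'size:34,35,36,36,37|color:blue,red,white';

// \bcolor:   start right after the "color:" label, or
// \G(?!^),   continue after a comma exactly where the previous match ended
// \K         discard what was matched so far, so only the value is reported
preg_match_all('/(?:\bcolor:|\G(?!^),)\K[^,|]+/', $input, $m);
print_r($m[0]);

// The chain stops at the next "|", so values of a later option are not picked up.
preg_match_all('/(?:\bcolor:|\G(?!^),)\K[^,|]+/', 'color:blue,red|size:1,2', $m2);
print_r($m2[0]);
```

It works, but it is harder to read than the three explode() calls, which is a fair argument for the explode approach.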

empty return preg_split [how]?

I have a string like this:
$string = '$foo$wow$123$$$ok$';
I want to keep the empty strings and store the parts in an array like this:
0 = foo
1 = wow
2 = 123
3 =
4 =
5 = ok
I am using PREG_SPLIT_NO_EMPTY, and I know that this flag drops the empty results, but I want the empty entries to be kept. The result should end up in an array that I can index with $chars[$i], just as with PREG_SPLIT_NO_EMPTY.
This is my preg_split call:
$chars = preg_split('/[\s]*[$][\s]*/', $string, -1, PREG_SPLIT_NO_EMPTY);
for($i=0;$i<=5;$i++){
echo $i.' = '.$chars[$i];
}
I want to print the result with exactly this plain for loop, not some other kind of loop:
for($i=0;$i<=5;$i++){
echo $i.' = '.$chars[$i];
}
How should I use preg_split to get this?
Thanks in advance.
Use explode:
$str = '$foo$wow$123$$$ok$';
$res = explode ("$",$str);
print_r($res);
Array
(
[0] =>
[1] => foo
[2] => wow
[3] => 123
[4] =>
[5] =>
[6] => ok
[7] =>
)
Using explode adds empty entries at the front and the back.
This one matches the asker's expected output:
$str = '$foo$wow$123$$$ok$';
preg_match_all("#(?<=\\$)[^\$]*(?=\\$)#", $str, $res);
echo "<pre>";
print_r($res);
echo "</pre>";
[0] => Array
(
[0] => foo
[1] => wow
[2] => 123
[3] =>
[4] =>
[5] => ok
)
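Since the question asks about preg_split specifically, here is one way it might be done, as a sketch: drop the PREG_SPLIT_NO_EMPTY flag so the empty fields survive, then trim the two outer entries produced by the leading and trailing $. This assumes the string always starts and ends with $, as in the example.

```php
<?php
$string = '$foo$wow$123$$$ok$';

// Without PREG_SPLIT_NO_EMPTY, empty fields between consecutive "$" are kept.
$chars = preg_split('/\s*\$\s*/', $string);

// The leading and trailing "$" each produce one extra empty entry; drop both.
$chars = array_slice($chars, 1, -1);

for ($i = 0; $i < count($chars); $i++) {
    echo $i . ' = ' . $chars[$i] . "\n";
}
```

array_slice() reindexes the result from 0, so the plain for loop from the question works unchanged.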
