Find Logic from String using Regex - php

I want to split an input string into its individual logic clauses. For example:
Input:
pageType == "static" OR pageType == "item" AND pageRef == "index"
How do I get the clauses like this:
0 => pageType == "static"
1 => pageType == "item"
2 => pageRef == "index"
The logic clause must be complete based on what is entered.
I tried this:
$input = 'pageType == "static" OR pageType == "item" AND pageRef == "index"';
preg_match_all('/(.+)(?!AND|OR)(.+)/s', $input, $loX);
var_dump($loX);
but the array only shows:
0 => pageType == "static" OR pageType == "item" AND pageRef == "index"
1 => pageType == "static" OR pageType == "item" AND pageRef == "index
2 => "
Please help me, thanks ^_^

One option is to make use of the \G anchor and a capture group:
\G(\w+\h*==\h*"[^"]*")(?:\h+(?:OR|AND)\h+|$)
The pattern matches:
\G Get continuous matches asserting the position at the end of the previous match from the start of the string
(\w+\h*==\h*"[^"]*") Match 1+ word characters == and the value between double quotes
(?: Non capture group for the alternatives
\h+(?:OR|AND)\h+ Match either OR or AND between spaces
| Or
$ Assert the end of the string
) Close the group
$re = '/\G(\w+\h*==\h*"[^"]*")(?:\h+(?:OR|AND)\h+|$)/';
$str = 'pageType == "static" OR pageType == "item" AND pageRef == "index"';
preg_match_all($re, $str, $matches);
print_r($matches[1]);
Output
Array
(
[0] => pageType == "static"
[1] => pageType == "item"
[2] => pageRef == "index"
)
Another option to get the results is to split on the AND or OR surrounded by spaces:
$result = preg_split('/\h+(?:OR|AND)\h+/', $str);
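If the operators themselves are also needed (an assumption, the question only asks for the clauses), a minimal sketch is to wrap OR|AND in a capture group and pass PREG_SPLIT_DELIM_CAPTURE, so preg_split returns clauses and operators alternately. The variable names $clauses and $operators are only illustrative.

```php
<?php
$str = 'pageType == "static" OR pageType == "item" AND pageRef == "index"';

// The capture group around OR|AND makes preg_split return the delimiters too
// when PREG_SPLIT_DELIM_CAPTURE is set.
$parts = preg_split('/\h+(OR|AND)\h+/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);

$clauses = [];
$operators = [];
foreach ($parts as $i => $part) {
    // Even positions hold clauses, odd positions hold the captured operators.
    if ($i % 2 === 0) {
        $clauses[] = $part;
    } else {
        $operators[] = $part;
    }
}

print_r($clauses);
print_r($operators);
```

Unlike the \G pattern, splitting does not validate the input: a quoted value that itself contains " OR " would be split as well.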

Related

Regular Expression to extract php code partially (( array definition ))

I have php code stored (( array definition )) in a string like this
$code=' array(
0 => "a",
"a" => $GlobalScopeVar,
"b" => array("nested"=>array(1,2,3)),
"c" => function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; },
); ';
Is there a regular expression to extract this array? I mean I want something like
$array=(
0 => '"a"',
'a' => '$GlobalScopeVar',
'b' => 'array("nested"=>array(1,2,3))',
'c' => 'function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; }',
);
PS: I did some research trying to find a regular expression, but found nothing.
PS2: gods of Stack Overflow, let me put a bounty on this now and I will offer 400 :3
PS3: this will be used in an internal app, where I need to extract an array from a PHP file to be 'processed' in parts. I try to explain with this: codepad.org/td6LVVme
Regex
So here's the MEGA regex I came up with:
\s* # white spaces
########################## KEYS START ##########################
(?: # We\'ll use this to make keys optional
(?P<keys> # named group: keys
\d+ # match digits
| # or
"(?(?=\\\\")..|[^"])*" # match string between "", works even 4 escaped ones "hello \" world"
| # or
\'(?(?=\\\\\')..|[^\'])*\' # match string between \'\', same as above :p
| # or
\$\w+(?:\[(?:[^[\]]|(?R))*\])* # match variables $_POST, $var, $var["foo"], $var["foo"]["bar"], $foo[$bar["fail"]]
) # close group: keys
########################## KEYS END ##########################
\s* # white spaces
=> # match =>
)? # make keys optional
\s* # white spaces
########################## VALUES START ##########################
(?P<values> # named group: values
\d+ # match digits
| # or
"(?(?=\\\\")..|[^"])*" # match string between "", works even 4 escaped ones "hello \" world"
| # or
\'(?(?=\\\\\')..|[^\'])*\' # match string between \'\', same as above :p
| # or
\$\w+(?:\[(?:[^[\]]|(?R))*\])* # match variables $_POST, $var, $var["foo"], $var["foo"]["bar"], $foo[$bar["fail"]]
| # or
array\s*\((?:[^()]|(?R))*\) # match an array()
| # or
\[(?:[^[\]]|(?R))*\] # match an array, new PHP array syntax: [1, 3, 5] is the same as array(1,3,5)
| # or
(?:function\s+)?\w+\s* # match functions: helloWorld, function name
(?:\((?:[^()]|(?R))*\)) # match function parameters (wut), (), (array(1,2,4))
(?:(?:\s*use\s*\((?:[^()]|(?R))*\)\s*)? # match use(&$var), use($foo, $bar) (optionally)
\{(?:[^{}]|(?R))*\} # match { whatever}
)?;? # match ; (optionally)
) # close group: values
########################## VALUES END ##########################
\s* # white spaces
I've added some comments. Note that you need to use 3 modifiers:
x : lets me write comments inside the pattern
s : makes dots match newlines
i : makes matching case-insensitive
PHP
$code='array(0 => "a", 123 => 123, $_POST["hello"][\'world\'] => array("is", "actually", "An array !"), 1234, \'got problem ?\',
"a" => $GlobalScopeVar, $test_further => function test($noway){echo "this works too !!!";}, "yellow" => "blue",
"b" => array("nested"=>array(1,2,3), "nested"=>array(1,2,3),"nested"=>array(1,2,3)), "c" => function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; }
"bug", "fixed", "mwahahahaa" => "Yeaaaah"
);'; // Sample data
$code = preg_replace('#(^\s*array\s*\(\s*)|(\s*\)\s*;?\s*$)#s', '', $code); // Just to get rid of array( at the beginning, and ); at the end
preg_match_all('~
\s* # white spaces
########################## KEYS START ##########################
(?: # We\'ll use this to make keys optional
(?P<keys> # named group: keys
\d+ # match digits
| # or
"(?(?=\\\\")..|[^"])*" # match string between "", works even 4 escaped ones "hello \" world"
| # or
\'(?(?=\\\\\')..|[^\'])*\' # match string between \'\', same as above :p
| # or
\$\w+(?:\[(?:[^[\]]|(?R))*\])* # match variables $_POST, $var, $var["foo"], $var["foo"]["bar"], $foo[$bar["fail"]]
) # close group: keys
########################## KEYS END ##########################
\s* # white spaces
=> # match =>
)? # make keys optional
\s* # white spaces
########################## VALUES START ##########################
(?P<values> # named group: values
\d+ # match digits
| # or
"(?(?=\\\\")..|[^"])*" # match string between "", works even 4 escaped ones "hello \" world"
| # or
\'(?(?=\\\\\')..|[^\'])*\' # match string between \'\', same as above :p
| # or
\$\w+(?:\[(?:[^[\]]|(?R))*\])* # match variables $_POST, $var, $var["foo"], $var["foo"]["bar"], $foo[$bar["fail"]]
| # or
array\s*\((?:[^()]|(?R))*\) # match an array()
| # or
\[(?:[^[\]]|(?R))*\] # match an array, new PHP array syntax: [1, 3, 5] is the same as array(1,3,5)
| # or
(?:function\s+)?\w+\s* # match functions: helloWorld, function name
(?:\((?:[^()]|(?R))*\)) # match function parameters (wut), (), (array(1,2,4))
(?:(?:\s*use\s*\((?:[^()]|(?R))*\)\s*)? # match use(&$var), use($foo, $bar) (optionally)
\{(?:[^{}]|(?R))*\} # match { whatever}
)?;? # match ; (optionally)
) # close group: values
########################## VALUES END ##########################
\s* # white spaces
~xsi', $code, $m); // Matching :p
print_r($m['keys']); // Print keys
print_r($m['values']); // Print values
// Since some keys may be empty in case you didn't specify them in the array, let's fill them up !
foreach($m['keys'] as $index => &$key){
if($key === ''){
$key = 'made_up_index_'.$index;
}
}
$results = array_combine($m['keys'], $m['values']);
print_r($results); // printing results
Output
Array
(
[0] => 0
[1] => 123
[2] => $_POST["hello"]['world']
[3] =>
[4] =>
[5] => "a"
[6] => $test_further
[7] => "yellow"
[8] => "b"
[9] => "c"
[10] =>
[11] =>
[12] => "mwahahahaa"
[13] => "this is"
)
Array
(
[0] => "a"
[1] => 123
[2] => array("is", "actually", "An array !")
[3] => 1234
[4] => 'got problem ?'
[5] => $GlobalScopeVar
[6] => function test($noway){echo "this works too !!!";}
[7] => "blue"
[8] => array("nested"=>array(1,2,3), "nested"=>array(1,2,3),"nested"=>array(1,2,3))
[9] => function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; }
[10] => "bug"
[11] => "fixed"
[12] => "Yeaaaah"
[13] => "a test"
)
Array
(
[0] => "a"
[123] => 123
[$_POST["hello"]['world']] => array("is", "actually", "An array !")
[made_up_index_3] => 1234
[made_up_index_4] => 'got problem ?'
["a"] => $GlobalScopeVar
[$test_further] => function test($noway){echo "this works too !!!";}
["yellow"] => "blue"
["b"] => array("nested"=>array(1,2,3), "nested"=>array(1,2,3),"nested"=>array(1,2,3))
["c"] => function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; }
[made_up_index_10] => "bug"
[made_up_index_11] => "fixed"
["mwahahahaa"] => "Yeaaaah"
["this is"] => "a test"
)
Known bug (fixed)
$code='array("aaa", "sdsd" => "dsdsd");'; // fail
$code='array(\'aaa\', \'sdsd\' => "dsdsd");'; // fail
$code='array("aaa", \'sdsd\' => "dsdsd");'; // succeed
// Which means, if a value with no keys is followed
// by key => value and they are using the same quotation
// then it will fail (first value gets merged with the key)
Credits
Goes to Bart Kiers for his recursive pattern to match nested brackets.
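As a rough illustration of that recursive idea in isolation, here is a minimal sketch that matches balanced parentheses only. The sample subject is made up, and the pattern deliberately ignores parentheses inside string literals, which the full pattern has to care about.

```php
<?php
// (?R) re-enters the whole pattern, so each nested (...) is consumed
// as part of the enclosing match.
$pattern = '/\((?:[^()]++|(?R))*\)/';
$subject = 'array(1, array(2, 3)), foo(4)'; // hypothetical sample input

preg_match_all($pattern, $subject, $m);
print_r($m[0]);
```

Each top-level parenthesised group comes back as one match, with its nested groups included.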
Advice
You should perhaps go with a parser, since regexes are fragile. @bwoebi has done a great job in his answer.
Even though you asked for a regex, this also works in pure PHP. token_get_all is the key function here. For a regex, check out @HamZa's answer.
The advantage here is that it is more dynamic than a regex. A regex has a static pattern, while with token_get_all you can decide after every single token what to do. It even escapes single quotes and backslashes where necessary, which a regex would not do.
Also, even a commented regex makes it hard to picture what it is supposed to do; PHP code is much easier to follow.
$code = ' array(
0 => "a",
"a" => $GlobalScopeVar,
"b" => array("nested"=>array(1,2,3)),
"c" => function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; },
"string_literal",
12345
); ';
$token = token_get_all("<?php ".$code);
$newcode = "";
$i = 0;
while (++$i < count($token)) { // enter into array; then start.
if (is_array($token[$i]))
$newcode .= $token[$i][1];
else
$newcode .= $token[$i];
if ($token[$i] == "(") {
$ending = ")";
break;
}
if ($token[$i] == "[") {
$ending = "]";
break;
}
}
// init variables
$escape = 0;
$wait_for_non_whitespace = 0;
$parenthesis_count = 0;
$entry = "";
// main loop
while (++$i < count($token)) {
// don't match commas in func($a, $b)
if ($token[$i] == "(" || $token[$i] == "{") // ( -> normal parenthesis; { -> closures
$parenthesis_count++;
if ($token[$i] == ")" || $token[$i] == "}")
$parenthesis_count--;
// begin new string after T_DOUBLE_ARROW
if (!$escape && $wait_for_non_whitespace && (!is_array($token[$i]) || $token[$i][0] != T_WHITESPACE)) {
$escape = 1;
$wait_for_non_whitespace = 0;
$entry .= "'";
}
// here is a T_DOUBLE_ARROW, there will be a string after this
if (is_array($token[$i]) && $token[$i][0] == T_DOUBLE_ARROW && !$escape) {
$wait_for_non_whitespace = 1;
}
// entry ended: comma reached
if (!$parenthesis_count && $token[$i] == "," || ($parenthesis_count == -1 && $token[$i] == ")" && $ending == ")") || ($ending == "]" && $token[$i] == "]")) {
// go back to the first non-whitespace
$whitespaces = "";
if ($parenthesis_count == -1 || ($ending == "]" && $token[$i] == "]")) {
$cut_at = strlen($entry);
while ($cut_at && ord($entry[--$cut_at]) <= 0x20); // 0x20 == " "
$whitespaces = substr($entry, $cut_at + 1, strlen($entry));
$entry = substr($entry, 0, $cut_at + 1);
}
// $escape == true means: there was somewhere a T_DOUBLE_ARROW
if ($escape) {
$escape = 0;
$newcode .= $entry."'";
} else {
$newcode .= "'".addcslashes($entry, "'\\")."'";
}
$newcode .= $whitespaces.($parenthesis_count?")":(($ending == "]" && $token[$i] == "]")?"]":","));
// reset
$entry = "";
} else {
// add actual token to $entry
if (is_array($token[$i])) {
$addChar = $token[$i][1];
} else {
$addChar = $token[$i];
}
if ($entry == "" && $token[$i][0] == T_WHITESPACE) {
$newcode .= $addChar;
} else {
$entry .= $escape?str_replace(array("'", "\\"), array("\\'", "\\\\"), $addChar):$addChar;
}
}
}
//append remaining chars like whitespaces or ;
$newcode .= $entry;
print $newcode;
Demo at: http://3v4l.org/qe4Q1
Should output:
array(
0 => '"a"',
"a" => '$GlobalScopeVar',
"b" => 'array("nested"=>array(1,2,3))',
"c" => 'function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; }',
'"string_literal"',
'12345'
)
To get the array's data, you can run print_r(eval("return $newcode;")); which prints the entries of the array:
Array
(
[0] => "a"
[a] => $GlobalScopeVar
[b] => array("nested"=>array(1,2,3))
[c] => function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; }
[1] => "string_literal"
[2] => 12345
)
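To see what the tokenizer approach above works with, here is a small sketch that only lists the tokens token_get_all produces for a one-entry array. The input string and the $names variable are illustrative. It relies on the tokenizer extension, which is enabled in default PHP builds.

```php
<?php
// token_get_all needs an opening tag to switch into PHP mode.
$tokens = token_get_all('<?php array(0 => "a");');

$names = [];
foreach ($tokens as $token) {
    // Multi-character tokens are arrays [id, text, line];
    // single characters such as ( ) ; , come back as plain strings.
    $names[] = is_array($token) ? token_name($token[0]) : $token;
}

print_r($names);
```

The mix of arrays and plain strings is why the loop above keeps checking is_array($token[$i]) before reading $token[$i][0] or $token[$i][1].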
The clean way to do this is obviously to use the tokenizer (but keep in mind that the tokenizer alone doesn't solve the problem).
For the challenge, I propose a regex approach.
The idea is not to describe the full PHP syntax, but to describe it in a negative way (in other words, I describe only the basic PHP structures needed to obtain the result). The advantage of this basic description is that it can deal with more complex objects than functions, strings, integers or booleans. The result is a more flexible pattern that can handle, for example, multi-line and single-line comments and heredoc/nowdoc syntaxes:
<pre><?php
$code=' array(
0 => "a",
"a" => $GlobalScopeVar,
"b" => array("nested"=>array(1,2,3)),
"c" => function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; },
); ';
$pattern = <<<'EOD'
~
# elements
(?(DEFINE)
# comments
(?<comMulti> /\* .*? (?:\*/|\z) ) # multiline comment
(?<comInlin> (?://|\#) \N* $ ) # inline comment
(?<comments> \g<comMulti> | \g<comInlin> )
# strings
(?<strDQ> " (?>[^"\\]+|\\.)* ") # double quote string
(?<strSQ> ' (?>[^'\\]+|\\.)* ') # single quote string
(?<strHND> <<<(["']?)([a-zA-Z]\w*)\g{-2} (?>\R \N*)*? \R \g{-1} ;? (?=\R|$) ) # heredoc and nowdoc syntax
(?<string> \g<strDQ> | \g<strSQ> | \g<strHND> )
# brackets
(?<braCrl> { (?> \g<nobracket> | \g<brackets> )* } )
(?<braRnd> \( (?> \g<nobracket> | \g<brackets> )* \) )
(?<braSqr> \[ (?> \g<nobracket> | \g<brackets> )* ] )
(?<brackets> \g<braCrl> | \g<braRnd> | \g<braSqr> )
# nobracket: content between brackets except other brackets
(?<nobracket> (?> [^][)(}{"'</\#]+ | \g<comments> | / | \g<string> | <+ )+ )
# ignored elements
(?<s> \s+ | \g<comments> )
)
# array components
(?(DEFINE)
# key
(?<key> [0-9]+ | \g<string> )
# value
(?<value> (?> [^][)(}{"'</\#,\s]+ | \g<s> | / | \g<string> | <+ | \g<brackets> )+? (?=\g<s>*[,)]) )
)
(?J)
(?: \G (?!\A)(?<!\)) | array \g<s>* \( ) \g<s>* \K
(?: (?<key> \g<key> ) \g<s>* => \g<s>* )? (?<value> \g<value> ) \g<s>* (?:,|,?\g<s>*(?<stop> \) ))
~xsm
EOD;
if (preg_match_all($pattern, $code, $m, PREG_SET_ORDER)) {
foreach($m as $v) {
echo "\n<strong>Whole match:</strong> " . $v[0]
. "\n<strong>Key</strong>:\t" . $v['key']
. "\n<strong>Value</strong>:\t" . $v['value'] . "\n";
if (isset($v['stop']))
echo "\n<strong>done</strong>\n\n";
}
}
Here is what you asked for, very compact.
Please let me know if you'd like any tweaks.
THE CODE (you can run this straight in php)
$code=' array(
0 => "a",
"a" => $GlobalScopeVar,
"b" => array("nested"=>array(1,2,3)),
"c" => function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; },
); ';
$regex = "~(?xm)
^[\s'\"]*([^'\"\s]+)['\"\s]*
=>\s*+
(.*?)\s*,?\s*$~";
if(preg_match_all($regex,$code,$matches,PREG_SET_ORDER)) {
$array=array();
foreach($matches as $match) {
$array[$match[1]] = $match[2];
}
echo "<pre>";
print_r($array);
echo "</pre>";
} // END IF
THE OUTPUT
Array
(
[0] => "a"
[a] => $GlobalScopeVar
[b] => array("nested"=>array(1,2,3))
[c] => function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; }
)
$array contains your array.
You like?
Please let me know if you have any questions or require tweaks. :)
Just for this situation:
$code=' array(
0=>"a",
"a"=>$GlobalScopeVar,
"b"=>array("nested"=>array(1,2,3)),
"c"=>function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; },
); ';
preg_match_all('#\s*(.*?)\s*=>\s*(.*?)\s*,?\s*$#m', $code, $m);
$array = array_combine($m[1], $m[2]);
print_r($array);
Output:
Array
(
[0] => "a"
["a"] => $GlobalScopeVar
["b"] => array("nested"=>array(1,2,3))
["c"] => function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; }
)

Does not equal performing opposite in a foreach loop

I have an array that looks something like
array (size=7)
'car_make' => string 'BMW' (length=3)
'car_model' => string 'M3' (length=2)
'car_year' => string '2001' (length=4)
'car_price' => string '10000' (length=5)
'car_kilometers' => string '100000' (length=6)
'paint' => string 'black' (length=5)
'tires' => string 'pirelli' (length=7)
So basically there are a few base items in it that start with car_ and then a few extras.
I'm trying to search for each key that isn't car_* so paint and tires in this case. So I'm doing something like
foreach($_SESSION['car'][0] as $key=>$value)
{
if($key != preg_match('/car_.*/', $key))
{
echo 'Match';
}
}
I expected this to echo 2 Matches because of the 2 non car_ keys. Instead, it echoes 5, for the car_ keys.
But when I do
if($key == preg_match('/car_.*/', $key))
It echoes out 2 Matches for the 2 non car_ keys.
Where am I messing up or misunderstanding?
preg_match docs say about its return values:
preg_match() returns 1 if the pattern matches given subject, 0 if it does not, or FALSE if an error occurred.
So this wouldn't have worked in the first place.
I would use:
if ( substr($key, 0, 4) === "car_" )
...which is much less expensive as an operation than preg_match anyway.
According to the documentation, preg_match returns 1 when the pattern matches and 0 when it does not (or false on error), not true/false. So it appears that you are having some fun with PHP's type juggling.
So rather than comparing the key to the return value, you should check the return value itself:
if (1 === preg_match('/car_.*/', $key))
This should take care of it.
Also, per the documentation:
Do not use preg_match() if you only want to check if one string is
contained in another string. Use strpos() or strstr() instead as they
will be faster.
Which would be:
if (false === strpos($key, 'car_'))
From http://php.net/manual/en/function.preg-match.php
Return Values
preg_match() returns 1 if the pattern matches given subject, 0 if it
does not, or FALSE if an error occurred.
So you should compare it to 1 or to 0, not to a string.
The reason your code was working, but backwards, is that comparing a number to a string produces interesting results (under the loose comparison rules before PHP 8, a non-numeric string is converted to 0):
From http://php.net/manual/en/types.comparisons.php
1 == "" is FALSE
0 == "" is TRUE
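Putting the answers above together, a minimal sketch that collects the non car_ keys by testing the return value of preg_match directly, plus the regex-free variant with strpos. The $car array is a shortened, hypothetical stand-in for $_SESSION['car'][0].

```php
<?php
$car = [
    'car_make'  => 'BMW',
    'car_model' => 'M3',
    'paint'     => 'black',
    'tires'     => 'pirelli',
];

// Variant 1: preg_match returns 1 (match), 0 (no match) or false (error),
// so compare the return value itself, never the key.
$extras = [];
foreach ($car as $key => $value) {
    if (preg_match('/^car_/', $key) === 0) {
        $extras[] = $key;
    }
}

// Variant 2: no regex. strpos returns 0 only when the key starts with "car_".
$extrasNoRegex = [];
foreach ($car as $key => $value) {
    if (strpos($key, 'car_') !== 0) {
        $extrasNoRegex[] = $key;
    }
}

print_r($extras);
print_r($extrasNoRegex);
```

Both variants rely on strict comparison (===, !==), so the result does not depend on how a given PHP version compares strings with numbers.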

Eliminate Partial Strings from Array of Strings PHP

I can do this, I'm just wondering if there is a more elegant solution than the 47 hacked lines of code I came up with...
Essentially I have an array (the value is the occurrences of said string);
[Bob] => 2
[Theresa] => 3
[The farm house] => 2
[Bob at the farm house] => 1
I'd like to iterate through the array and eliminate any entries that are sub-strings of others so that the end result would be;
[Theresa] => 3
[Bob at the farm house] => 1
Initially I was looping like (calling this array $baseTags):
foreach($baseTags as $key=>$count){
foreach($baseTags as $k=>$c){
if(stripos($k,$key)){
unset($baseTags[$key]);
}
}
}
I'm assuming I'm looping through each key in the array and if there is the occurrence of that key inside another key to unset it... doesn't seem to be working for me though. Am I missing something obvious?
Thank you in advance.
-H
You're mis-using strpos/stripos. They can return a perfectly valid 0 if the string you're searching for happens to be at the START of the 'haystack' string, e.g. your Bob value. You need to explicitly test for this with
if (stripos($k, $key) !== FALSE) {
unset(...);
}
If strpos/stripos do not find the needle, they return a boolean false, which under PHP's normal weak comparison rules is equal to 0. Using the strict comparison operators (===, !==), which compare type AND value, you'll get the proper results.
Don't forget: as well as needing !== false, you need $k != $key so that strings don't match themselves.
You have two problems inside your code-example:
You compare each key with every key, so not only with the others but also with itself. That would remove all entries, because "Bob" is a substring of "Bob", too.
stripos returns false if the needle is not found, which can be confused with 0 (found at position 0); 0 is equal but not identical to false.
You need to add an additional check to not remove the same key and then fix the check for the "not found" case (Demo):
$baseTags = array(
'Bob' => 2,
'Theresa' => 3,
'The farm house' => 2,
'Bob at the farm house' => 1,
);
foreach ($baseTags as $key => $count)
{
foreach ($baseTags as $k => $c)
{
if ($k === $key)
{
continue;
}
if (false !== stripos($k, $key))
{
unset($baseTags[$key]);
}
}
}
print_r($baseTags);
Output:
Array
(
[Theresa] => 3
[Bob at the farm house] => 1
)
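The same two fixes can also be packaged without mutating the array while looping over it. This is only a sketch: the function name removeSubstringKeys is made up for illustration, and it uses array_filter with ARRAY_FILTER_USE_KEY instead of the nested foreach/unset shown above.

```php
<?php
// Hypothetical helper: drops every key that occurs (case-insensitively)
// inside some other key of the same array.
function removeSubstringKeys(array $tags): array
{
    $keys = array_keys($tags);

    return array_filter($tags, function ($key) use ($keys) {
        foreach ($keys as $other) {
            // Skip the key itself, and test stripos against false strictly
            // so that a hit at position 0 still counts as found.
            if ($other !== $key && stripos((string) $other, (string) $key) !== false) {
                return false;
            }
        }
        return true;
    }, ARRAY_FILTER_USE_KEY);
}

$baseTags = [
    'Bob' => 2,
    'Theresa' => 3,
    'The farm house' => 2,
    'Bob at the farm house' => 1,
];

$filtered = removeSubstringKeys($baseTags);
print_r($filtered);
```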

Convert Regexp in Js into PHP?

I have the following regular expression in JavaScript and I would like to have the exact same functionality (or similar) in PHP:
// -=> REGEXP - match "x bed" , "x or y bed":
var subject = query;
var myregexp1 = /(\d+) bed|(\d+) or (\d+) bed/img;
var match = myregexp1.exec(subject);
while (match != null){
if (match[1]) { "X => " + match[1]; }
else{ "X => " + match[2] + " AND Y => " + match[3]}
match = myregexp1.exec(subject);
}
This code searches a string for a pattern matching "x beds" or "x or y beds".
When a match is located, variable x and variable y are required for further processing.
QUESTION:
How do you construct this code snippet in php?
Any assistance appreciated guys...
You can use the regex unchanged. The PCRE syntax supports everything that Javascript does. Except the /g flag which isn't used in PHP. Instead you have preg_match_all which returns an array of results:
preg_match_all('/(\d+) bed|(\d+) or (\d+) bed/im', $subject, $matches,
    PREG_SET_ORDER);
foreach ($matches as $match) {
    // $match[1] holds X for "x bed"; otherwise $match[2] and $match[3] hold X and Y
}
PREG_SET_ORDER is the other trick here, and will keep the $match array similar to how you'd get it in Javascript.
I've found RosettaCode to be useful when answering these kinds of questions.
It shows how to do the same thing in various languages. Regex is just one example; they also have file io, sorting, all kinds of basic stuff.
You can use preg_match_all( $pattern, $subject, &$matches, $flags, $offset ), to run a regular expression over a string and then store all the matches to an array.
After running the regexp, all the matches can be found in the array you passed as the third argument. You can then iterate through these matches using foreach.
Without setting $flags, your array will have a structure like this:
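$array[0] => array ( // An array of all strings that matched (e.g. "5 bed" or "8 or 9 bed")
0 => "5 bed",
1 => "8 or 9 bed"
);
$array[1] => array ( // The first group for each match ("" when the group did not take part)
0 => "5",
1 => ""
);
$array[2] => array ( // The second group for each match
0 => "",
1 => "8"
);
$array[3] => array ( // The third group for each match
0 => "",
1 => "9"
);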
This behaviour isn't exactly the same, and I personally don't like it that much. To change the behaviour to a more "JavaScript-like"-one, set $flags to PREG_SET_ORDER. Your array will now have the same structure as in JavaScript.
$array[0] => array(
0 => "5 bed", // the full match
1 => "5", // the first value between brackets
);
$array[1] => array(
0 => "8 or 9 bed",
1 => "", // the first group did not take part in this match
2 => "8",
3 => "9"
);
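For completeness, a sketch of the whole JavaScript loop ported to PHP with PREG_SET_ORDER. The sample $subject and the $found array are assumptions for illustration. Note that a group that did not take part in a match comes back as an empty string when a later group matched, so the check is against '' rather than null.

```php
<?php
$subject = 'Flat with 5 beds, or a house with 8 or 9 beds'; // hypothetical input

preg_match_all('/(\d+) bed|(\d+) or (\d+) bed/i', $subject, $matches, PREG_SET_ORDER);

$found = [];
foreach ($matches as $match) {
    if ($match[1] !== '') {
        // First alternative matched: "x bed"
        $found[] = 'X => ' . $match[1];
    } else {
        // Second alternative matched: "x or y bed"
        $found[] = 'X => ' . $match[2] . ' AND Y => ' . $match[3];
    }
}

print_r($found);
```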

Negotiate arrays inside an array

When I run a regular expression
preg_match_all('~(https?://([-\w\.]+)+(:\d+)?(/([\w/_\.]*(\?\S+)?)?)?)~', $content, $turls);
print_r($turls);
I get an array inside an array. I need a single array only.
How do I get at the arrays nested inside the outer array?
By default preg_match_all() uses PREG_PATTERN_ORDER flag, which means:
Orders results so that $matches[0] is
an array of full pattern matches,
$matches[1] is an array of strings
matched by the first parenthesized
subpattern, and so on.
See http://php.net/preg_match_all
Here is sample output:
array(
0 => array( // Full pattern matches
0 => 'http://www.w3.org/TR/html4/strict.dtd',
1 => ...
),
1 => array( // First parenthesized subpattern.
// In your case it is the same as full pattern, because first
// parenthesized subpattern includes all pattern :-)
0 => 'http://www.w3.org/TR/html4/strict.dtd',
1 => ...
),
2 => array( // Second parenthesized subpattern.
0 => 'www.w3.org',
1 => ...
),
...
)
So, as R. Hill answered, you need $matches[0] to access all matched urls.
And as budinov.com pointed out, you should remove the outer parentheses so that the first group does not duplicate the full match, e.g.:
preg_match_all('~https?://([-\w\.]+)+(:\d+)?(/([\w/_\.]*(\?\S+)?)?)?~', $content, $turls);
// where $turls[0] is what you need
Not sure what you mean by 'negotiate'. If you mean fetching the inner array, this should work:
$urls = preg_match_all('~(https?://([-\w\.]+)+(:\d+)?(/([\w/_\.]*(\?\S+)?)?)?)~', $content, $matches) ? $matches[0] : array();
if ( count($urls) ) {
...
}
Generally you can replace your regexp with one that doesn't contain parentheses (). This way your results will be held only in $turls[0]:
preg_match_all('/https?\:\/\/[^\"\'\s]+/i', file_get_contents('http://www.yahoo.com'), $turls);
and then make the URLs unique like this:
$result = array_keys(array_flip($turls[0]));
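A self-contained sketch of the same idea on a fixed string, so that it runs without fetching a remote page. The $content value is made up, and array_unique is swapped in for the array_flip trick; both remove duplicates, and array_values then reindexes the result.

```php
<?php
// Hypothetical page content with one duplicated URL.
$content = 'See http://example.com/a and https://example.org/b?x=1 and again http://example.com/a';

// No capture groups, so everything of interest ends up in $turls[0].
preg_match_all('~https?://[^"\'\s]+~i', $content, $turls);

// Drop duplicates and reindex from 0.
$urls = array_values(array_unique($turls[0]));

print_r($urls);
```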
