PHP regular expressions: extract data inside brackets - php

I have a string like Western Australia 223/5 (59.3 ov)
I would like to split this string and extract the following information with regular expressions:
$team = 'Western Australia'
$runs = 223/5
$overs = 59.3
The issue is that the format of the text varies; it may be any of the following:
Western Australia 223/5 (59.3 ov)
Australia 223/5 (59.3 ov)
KwaZulu-Natal Inland
Sri Lanka v West Indies
Any help (for example, whether this is possible with a single regexp) will be appreciated.

if (preg_match(
'%^ # start of string
(?P<team>.*?) # Any number of characters, as few as possible (--> team)
(?:\s+ # Try to match the following group: whitespace plus...
(?P<runs>\d+ # a string of the form number...
(?:/\d+)? # optionally followed by /number
) # (--> runs)
)? # optionally
(?:\s+ # Try to match the following group: whitespace plus...
\( # (
(?P<overs>[\d.]+) # a number (optionally decimal) (--> overs)
\s+ov\) # followed by ov)
)? # optionally
\s* # optional whitespace at the end
$ # end of string
%six',
$subject, $regs)) {
$team = $regs['team'];
$runs = $regs['runs'];
$overs = $regs['overs'];
} else {
$result = "";
}
You might need to catch an error if the matches <runs> and/or <overs> are not actually present in the string. I don't know much about PHP. (Don't know much biology...SCNR)
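On the missing-group point: without extra flags, preg_match leaves trailing unmatched groups out of the result array, and unmatched groups in the middle come back as empty strings, so reading $regs['runs'] directly can raise an undefined-index notice. A minimal sketch of one way to handle that, assuming PHP 7.1+ (for ?? and nullable return types). The function name parse_score_line is made up for illustration, and the pattern is the regex above written on one line:

```php
<?php
// Hypothetical helper (not from the original answer).
function parse_score_line(string $line): ?array
{
    $pattern = '%^(?P<team>.*?)(?:\s+(?P<runs>\d+(?:/\d+)?))?(?:\s+\((?P<overs>[\d.]+)\s+ov\))?\s*$%si';
    if (!preg_match($pattern, $line, $m)) {
        return null;
    }
    // Unmatched groups are either missing (trailing) or '' (in the middle),
    // so normalise both cases to null.
    $pick = function (string $name) use ($m): ?string {
        return ($m[$name] ?? '') !== '' ? $m[$name] : null;
    };
    return ['team' => $m['team'], 'runs' => $pick('runs'), 'overs' => $pick('overs')];
}

print_r(parse_score_line('Western Australia 223/5 (59.3 ov)'));
print_r(parse_score_line('KwaZulu-Natal Inland'));
```

For the second line, team holds the whole string and runs and overs are both null.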

Assuming you are using preg_match, you can use the following:
preg_match('/^([\w\s-]+?)\s*(\d+\/\d+)?\s*(\(\d+\.\d+ ov\))?$/', $input, $matches);
Then you can inspect $matches to see which of the options you are supposed to handle was found.
See preg_match documentation for more information.

Related

preg_match_all Compilation failed: range out of order in character class at offset

I am having trouble finding a specific object with a preg_match_all pattern. I have some text, but I would like to find just one specific part of it.
For example, I have this string of text:
sadasdasd:{"website":["https://bitcoin.org/"]tatic/cloud/img/coinmarketcap_grey_1.svg?_=60ffd80');display:inline-block;background-position:center;background-repeat:no-repeat;background-size:contain;width:239px;height:41px;} .cqVqre.cmc-logo--size-large{width:263px;height:45px;}
/* sc-component-id: sc-2wt0ni-0 */
However, I just need to find "website":["https://bitcoin.org/"], where the website is dynamic data. For example, the website could be Google: "website":["https://google.com/"]
Right now I have something like this. It just returns a bulk of URLs; I need just the specific one.
$parsePage = "sadasdasd:{"website":["https://bitcoin.org/"]tatic/cloud/img/coinmarketcap_grey_1.svg?_=60ffd80');display:inline-block;background-position:center;background-repeat:no-repeat;background-size:contain;width:239px;height:41px;} .cqVqre.cmc-logo--size-large{width:263px;height:45px;}
/* sc-component-id: sc-2wt0ni-0 */";
$pattern = '/
\"website":[" # [ character
(?: # non-capturing group
[^{}] # anything that is not a { or }
| # OR
(?R) # recurses the entire pattern
)* # previous group zero or more times
\"] # ] character
/x';
preg_match_all($pattern, $parsePage, $matches);
print_r($matches[0]);
Try:
$pattern = '~"website":\["([^"]*)"~';
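A short sketch of how that pattern could be used, assuming only the first "website" entry is wanted. The $page value is a shortened stand-in for the scraped text, not the real page. As far as I can tell, the "range out of order" error in the question comes from the unescaped [ in the original pattern: it opens a character class that runs on into the comments, where something like n-c in "non-capturing" is read as a reversed range. Escaping it as \[ avoids that.

```php
<?php
// Shortened stand-in for the scraped text in the question (assumption).
$page = 'sadasdasd:{"website":["https://bitcoin.org/"]tatic/cloud/img/x.svg';

// \[ matches a literal bracket; group 1 captures everything up to the next quote.
$pattern = '~"website":\["([^"]*)"~';

$website = null;
if (preg_match($pattern, $page, $m) === 1) {
    $website = $m[1];
}

echo $website, PHP_EOL;
```

preg_match stops at the first hit, so there is no need for preg_match_all when only one URL is expected.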

PHP: Parse comma-delimited string outside single and double quotes and parentheses

I've found several partial answers to this question, but none that cover all my needs...
I am trying to parse a user generated string as if it were a series of php function arguments to determine the number of arguments:
This string:
$arg1,$arg2='ABC,DEF',$arg3="GHI\",JKL",$arg4=array(1,'2)',"3\"),")
will be inserted as the arguments of a function:
function my_function( [insert string here] ){ ... }
I need to parse the string on the commas, taking into account single- and double-quotes, parentheses, and escaped quotes and parentheses to create an array:
array(4) {
[0] => $arg1
[1] => $arg2='ABC,DEF'
[2] => $arg3="GHI\",JKL"
[3] => $arg4=array(1,'2)',"3\"),")
}
Any help with a regular expression or parser function to accomplish this is appreciated!
It isn't possible to solve this problem with a classical CSV tool, since more than one character can protect parts of the string.
Using preg_split is possible but will result in a very complicated and inefficient pattern, so the best way is to use preg_match_all. There are, however, several problems to solve:
a comma enclosed in quotes or parentheses must be ignored (seen as a character without special meaning, not as a delimiter)
you need to extract the params, but you also need to check that the string is well formed; otherwise the match results may be totally wrong!
For the first point, you can define subpatterns to describe each case: the quoted parts, the parts enclosed in parentheses, and a more general subpattern that matches a complete param and uses the two previous subpatterns when needed.
Note that the parentheses subpattern needs to refer to the general subpattern too, since it can contain anything (including commas).
The second point can be solved using the \G anchor, which ensures that all matches are contiguous. But you also need to be sure that the end of the string has been reached. To do that, you can add an optional empty capture group at the end of the main pattern that is created only if the anchor for the end of the string, \z, succeeds.
$subject = <<<'EOD'
$arg1,$arg2='ABC,DEF',$arg3="GHI\",JKL",$arg4=array(1,'2)',"3\"),")
EOD;
$pattern = <<<'EOD'
~
# named groups definitions
(?(DEFINE) # this definition group allows to define the subpatterns you want
# without matching anything
(?<quotes>
' [^'\\]*+ (?s:\\.[^'\\]*)*+ ' | " [^"\\]*+ (?s:\\.[^"\\]*)*+ "
)
(?<brackets> \( \g<content> (?: ,+ \g<content> )*+ \) )
(?<content> [^,'"()]*+ # ' # (<-- comment for SO syntax highlighting)
(?:
(?: \g<brackets> | \g<quotes> )
[^,'"()]* # ' #
)*+
)
)
# the main pattern
(?: # two possible beginnings
\G(?!\A) , # a comma contiguous to a previous match
| # OR
\A # the start of the string
)
(?<param> \g<content> )
(?: \z (?<check>) )? # create an item "check" when the end is reached
~x
EOD;
$result = false;
if ( preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER) &&
isset(end($matches)['check']) )
$result = array_map(function ($i) { return $i['param']; }, $matches);
else
echo 'bad format' . PHP_EOL;
var_dump($result);
You could split the argument string at ,$ and then prepend $ back to the array values:
$args_array = explode(',$', $arg_str);
foreach($args_array as $key => $arg_raw) {
$args_array[$key] = '$'.ltrim($arg_raw, '$');
}
print_r($args_array);
Output:
Array
(
[0] => $arg1
[1] => $arg2='ABC,DEF'
[2] => $arg3="GHI\",JKL"
[3] => $arg4=array(1,'2)',"3\"),")
)
If you want to use a regex, you can use something like this:
(.+?)(?:,(?=\$)|$)
Php code:
$re = '/(.+?)(?:,(?=\$)|$)/';
$str = "\$arg1,\$arg2='ABC,DEF',\$arg3=\"GHI\\\",JKL\",\$arg4=array(1,'2)',\"3\\\"),\")\n";
preg_match_all($re, $str, $matches);
Match information:
MATCH 1
1. [0-5] `$arg1`
MATCH 2
1. [6-21] `$arg2='ABC,DEF'`
MATCH 3
1. [22-39] `$arg3="GHI\",JKL"`
MATCH 4
1. [40-67] `$arg4=array(1,'2)',"3\"),")`
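Since the question also asks about a parser function, here is a rough sketch of a character-by-character splitter as an alternative to the regex answers. It is only an illustration under stated assumptions: the name split_top_level_args is made up, backslash escapes are honoured only inside quotes, and unbalanced quotes or parentheses are not reported as errors.

```php
<?php
// Hypothetical helper: split an argument string on top-level commas only.
function split_top_level_args(string $input): array
{
    $parts = [];
    $current = '';
    $depth = 0;     // parenthesis nesting level outside quotes
    $quote = null;  // active quote character, or null when outside quotes
    $length = strlen($input);

    for ($i = 0; $i < $length; $i++) {
        $char = $input[$i];

        if ($quote !== null) {
            $current .= $char;
            if ($char === '\\' && $i + 1 < $length) {
                // Copy the escaped character verbatim so \" does not end the quote.
                $current .= $input[++$i];
            } elseif ($char === $quote) {
                $quote = null;
            }
            continue;
        }

        if ($char === '"' || $char === "'") {
            $quote = $char;
        } elseif ($char === '(') {
            $depth++;
        } elseif ($char === ')') {
            $depth--;
        } elseif ($char === ',' && $depth === 0) {
            // A comma outside quotes and parentheses ends the current argument.
            $parts[] = $current;
            $current = '';
            continue;
        }
        $current .= $char;
    }
    $parts[] = $current;

    return $parts;
}

$subject = <<<'EOD'
$arg1,$arg2='ABC,DEF',$arg3="GHI\",JKL",$arg4=array(1,'2)',"3\"),")
EOD;
$parts = split_top_level_args($subject);
print_r($parts);
```

Unlike splitting on ,$ this does not depend on every argument starting with a dollar sign, at the cost of more code.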

Looping within a regular expression

Is a regex able to find a pattern for this?
{{foo.bar1.bar2.bar3}}
where the groups would be
$1 = foo $2 = bar1 $3 = bar2 $4 = bar3 and so on..
It would be like re-running the expression over and over again until it fails to get a match.
The current expression I am working on is
(?:\{{2})([\w]+).([\w]+)(?:\}{2})
Here's a link from regexr.
http://regexr.com?3203h
--
OK, I guess I didn't explain well what I'm trying to achieve here.
Let's say I am trying to replace all
.barX inside a {{foo . . . }}
My expected result should be
$foo->bar1->bar2->bar3
This should work, assuming no braces are allowed within the match:
preg_match_all(
'%(?<= # Assert that the previous character(s) are either
\{\{ # {{
| # or
\. # .
) # End of lookbehind
[^{}.]* # Match any number of characters besides braces/dots.
(?= # Assert that the following regex can be matched here:
(?: # Try to match
\. # a dot, followed by
[^{}]* # any number of characters except braces
)? # optionally
\}\} # Match }}
) # End of lookahead%x',
$subject, $result, PREG_PATTERN_ORDER);
$result = $result[0];
I'm not a PHP person, but I managed to construct this piece of code here:
preg_match_all("([a-z0-9]+)",
"{{foo.bar1.bar2.bar3}}",
$out, PREG_PATTERN_ORDER);
foreach($out[0] as $val)
{
echo($val);
echo("<br>");
}
The code above prints the following:
foo
bar1
bar2
bar3
It should allow you to exhaustively search a given string by using a simple regular expression. I think that you should also be able to get what you want by removing the braces and splitting the string.
I don't think so, but it's relatively painless to just split the string on periods like so:
$str = "{{foo.bar1.bar2.bar3}}";
$str = str_replace(array("{","}"), "", $str);
$values = explode(".", $str);
print_r($values); // Yields an array with values foo, bar1, bar2, and bar3
EDIT: In response to your question edit, you could replace all barX in a string by doing the following:
$str = "{{foo.bar1.bar2.bar3}}";
$newStr = preg_replace("#bar\d#", "hi", $str);
echo $newStr; // outputs "{{foo.hi.hi.hi}}"
I don't know the correct syntax in PHP, for pulling out the results, but you could do:
\{{2}(\w+)(?:\.(\w+))*\}{2}
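In engines that keep every capture of a repeated group (such as .NET), that would capture the first hit in the first capturing group and the rest in the second capturing group. PHP's PCRE keeps only the last repetition, so there the second group would hold just bar3. regexr.com is lacking the ability to show that as far as I can see though. Try out Expresso, and you'll see what I mean.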
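For the edited goal in the question ({{foo.bar1.bar2.bar3}} becoming $foo->bar1->bar2->bar3), one possible sketch is to match the whole placeholder once and rewrite the dots in a callback, rather than trying to loop inside the regex. The function name template_to_property_chain is made up for illustration, and it assumes each segment consists of word characters only:

```php
<?php
// Hypothetical helper: turn {{a.b.c}} placeholders into $a->b->c.
function template_to_property_chain(string $template): string
{
    return preg_replace_callback(
        '/\{\{(\w+(?:\.\w+)*)\}\}/',
        function (array $m): string {
            // $m[1] is the dotted path without the braces.
            return '$' . str_replace('.', '->', $m[1]);
        },
        $template
    );
}

echo template_to_property_chain('{{foo.bar1.bar2.bar3}}'), PHP_EOL;
```

The repetition lives inside a single capture group, so nothing is lost to the "last repetition only" behaviour of repeated groups.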

How to use regular expressions in php to extract data like this?

I am using the following code in php to extract username, password and email:
$subject = "fjcljt # 123456789 # chengyong702#126.com";
$pattern2 = '/^(\w+\ # ){2}?\w+ ?/';
preg_match($pattern2, $subject, $matches);
but the returned result using print_r is Array ( [0] => fjcljt # 123456789 # chengyong702 [1] => 123456789 # )
What am I doing wrong with preg_match here?
If " # " delimits your values, there is no need for regex at all:
$subject = "fjcljt # 123456789 # chengyong702#126.com";
$subject = array_map('trim', explode(" # ", $subject));
The result of preg_match captures the entire string in [0], and then each captured group in [i]. A captured group is denoted by the brackets in your $pattern2. Since there's only one set of brackets, there's only one captured group.
Even though your pattern matches twice, only the latest match is stored in group 1, being 123456789 # (overriding the fjcljt #).
To get explicit groups you have to write the captured groups in your regex explicitly as opposed to with the {2}:
$pattern2 = '/^(\w+\ # )(\w+\ # )\w+ ?/';
Then your return array will have [1] being fjcljt # and [2] being 123456789 #.
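To see the difference side by side, here is a small sketch that runs both forms against the string from the question. The variable names $repeated and $explicit are just for illustration:

```php
<?php
$subject = 'fjcljt # 123456789 # chengyong702#126.com';

// One quantified group: only the last repetition is kept in group 1.
preg_match('/^(\w+ # ){2}/', $subject, $repeated);

// Two explicit groups: each part gets its own slot.
preg_match('/^(\w+ # )(\w+ # )/', $subject, $explicit);

var_dump($repeated[1]);
var_dump($explicit[1], $explicit[2]);
```

Here $repeated has only indexes 0 and 1, while $explicit has 0, 1 and 2.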
list($username, $password, $email) = explode(' # ', $subject);
Try using explode instead of regex; regular expressions use more resources.
$data = explode('#','fjcljt # 123456789 # chengyong702#126.com');
then you can access data like below:
$data[0]; //username
$data[1]; //password
$data[2]; //email
EDIT
For whitespace, use a delimiter like below:
" # "
There are two things here. First, you are using a quantifier {2} on a match group (..). What happens is that you only get the last of the two matches as result group [1]. If you want to get both numbers/words separately, you have to expand the regex.
The second problem is that \w+ does not include #. So you only get half the email.
$pattern2 = '/^(\w+) # (\w+) # ([\w#.]+)/';
Might be what you wanted.
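A quick sketch of that pattern in use, with the three groups unpacked into variables. The variable names are illustrative, and it assumes the input always has exactly the three " # "-separated parts shown in the question:

```php
<?php
$subject = 'fjcljt # 123456789 # chengyong702#126.com';

$username = $password = $email = null;
if (preg_match('/^(\w+) # (\w+) # ([\w#.]+)/', $subject, $m) === 1) {
    // Skip $m[0] (the full match) and unpack the three groups.
    list(, $username, $password, $email) = $m;
}

var_dump($username, $password, $email);
```

Without the x flag, # and the spaces around it are matched literally, so no escaping is needed.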
Not sure what you're trying to do; however, if you're trying to get login credentials, you could try something like this:
if(preg_match('/^.*\#.*$/i', $type_of_login) > 0)
{
$request = User::get_by_email($type_of_login);
}
else
{
//get by username or whatevers....
}
//then extract the password!!

PHP: Get last Tag of a String with Regular Expressions

Quite simple problem (but difficult solution): I have a string in PHP as follows:
['one']['two']['three']
From this, I must extract the last tag, so I finally get three
It is also possible that there is a number, like
[1][2][3]
and then I must get 3
How can I solve this?
Thanks for your help!
Flo
Your tag is \[[^\]]+\].
3 Tags are: (\[[^\]]+\]){3}
3 Tags at end are: (\[[^\]]+\]){3}$
N Tags at end are: (\[[^\]]+\])*$ (N 0..n)
Example:
<?php
$string = "['one']['two']['three'][1][2][3]['last']";
preg_match("/((?:\[[^\]]+\]){3})$/", $string, $match);
print_r($match); // Array ( [0] => [2][3]['last'] [1] => [2][3]['last'] )
This tested code may work for you:
function getLastTag($text) {
$re = '/
# Match contents of last [Tag].
\[ # Literal start of last tag.
(?: # Group tag contents alternatives.
\'([^\']+)\' # Either $1: single quoted,
| (\d+) # or $2: un-quoted digits.
) # End group of tag contents alts.
\] # Literal end of last tag.
\s* # Allow trailing whitespace.
$ # Anchor to end of string.
/x';
if (preg_match($re, $text, $matches)) {
if ($matches[1] !== '') return $matches[1]; // Either single quoted,
return $matches[2]; // or non quoted digits.
}
return null; // No match. Return NULL.
}
Here is a regex that may work for you. Try this:
[^\[\]']*(?='?\]$)
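A small sketch wrapping that lookahead idea in a function. The name last_tag is made up, it assumes PHP 7.1+ for the nullable return type, and it uses + instead of * so that an empty tag does not produce an empty match:

```php
<?php
// Hypothetical helper: return the content of the last [tag], or null.
function last_tag(string $text): ?string
{
    // The lookahead requires an optional quote, then "]", then end of string.
    if (preg_match('/[^\[\]\']+(?=\'?\]$)/', $text, $m) === 1) {
        return $m[0];
    }
    return null;
}

var_dump(last_tag("['one']['two']['three']"));
var_dump(last_tag('[1][2][3]'));
```

Because the lookahead is anchored with $, only the tag at the very end of the string can satisfy it.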
