PHP: Matching strings between two strings

I have a problem with preg_match that I can't figure out.
Let the code explain it:
function::wp_statistics_useronline::end
function::wp_statistics_visitor|today::end
function::wp_statistics_visitor|yesterday::end
function::wp_statistics_visitor|week::end
function::wp_statistics_visitor|month::end
function::wp_statistics_visitor|total::end
These are some strings that run functions inside PHP.
When I use just one function::*::end, it works fine.
But when the input contains more than one function, it does not work the way I want.
It parses the match like this:
function::wp_statistics_useronline::end function::wp_statistics_visitor|today::end AND ....::end
So basically I need a regex that separates them and gives me one array entry for each function::*::end

I assume you were actually using function::(.*)::end since function::*::end is never going to work (it can only match strings like "function::::::end").
The reason your regex failed with multiple matches on the same line is that the quantifier * is greedy by default, matching as many characters as possible. You need to make it lazy: function::(.*?)::end
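The difference is easy to see by running both quantifiers on the same line. This is a minimal sketch that assumes the two-call line from the question as input; the variable names are only illustrative.

```php
<?php
// Two calls on one line, as in the question's second example.
$subject = 'function::wp_statistics_useronline::end function::wp_statistics_visitor|today::end';

// Greedy: .* runs on to the LAST "::end", so the whole line is a single match.
preg_match_all('/function::(.*)::end/', $subject, $greedy);

// Lazy: .*? stops at the FIRST "::end" it can reach, giving one match per call.
preg_match_all('/function::(.*?)::end/', $subject, $lazy);

print_r($greedy[1]);
print_r($lazy[1]);
```

The greedy version returns one element that still contains "::end function::" in the middle, while the lazy version returns the two function names separately.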

It's pretty straightforward:
$result = preg_match_all('~function::(\S*)::end~m', $subject, $matches)
? $matches[1] : [];
Which gives:
Array
(
[0] => wp_statistics_useronline
[1] => wp_statistics_visitor|today
[2] => wp_statistics_visitor|yesterday
[3] => wp_statistics_visitor|week
[4] => wp_statistics_visitor|month
[5] => wp_statistics_visitor|total
)
And (for the second example):
Array
(
[0] => wp_statistics_useronline
[1] => wp_statistics_visitor|today
)
The regex in the example puts a capturing group around the middle part, which contains no whitespace, so \S* is a good fit.
Because it is the first capturing group, its matches are available in $matches[1] once the regular expression has run.

This is what you're looking for:
function\:\:(.*?)\:\:end
Make sure you have the dot-matches-all modifier (s) set.
After you get the matches, run them through a for loop and explode each one on "|", push the pieces to an array and boom goes the dynamite, you've got what you're looking for.
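The loop-and-explode step could look roughly like the sketch below. The array keys 'name' and 'arg' are hypothetical, chosen for illustration only, and it assumes that everything after the first "|" is a single argument.

```php
<?php
$subject = "function::wp_statistics_useronline::end\n"
         . "function::wp_statistics_visitor|today::end";

// Lazy match between the two delimiters; the s modifier lets . cross newlines.
preg_match_all('/function::(.*?)::end/s', $subject, $matches);

$calls = [];
foreach ($matches[1] as $body) {
    // A limit of 2 keeps any further "|" characters inside the argument.
    $parts = explode('|', $body, 2);
    $calls[] = [
        'name' => $parts[0],
        'arg'  => $parts[1] ?? null, // null when the call has no argument
    ];
}

print_r($calls);
```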

Related

PHP regex match multiple tags within a string and add to array

I have a string with multiple tags in it, like so:
<item>foo bar</item> <item>foo bar</item>
I need to match each of these (they can be on new lines) and add them to an array, but I can't seem to match them. I'm new to regex, so I don't understand what is going wrong. An explanation would be great, thanks!
preg_match_all('/<item>(.*)<\/item>/',$content,$matches);
At the moment, it returns two empty indexes in the matches array.
I have also tried:
<item>([\s\S]*)<\/item>
This matches from the first tag until the very last one, so grabs everything essentially.
You can use this:
preg_match_all('/<item>(.*?)<\/item>/',$content,$matches);
Result
Array
(
[0] => Array
(
[0] => <item>foo bar</item>
[1] => <item>foo bar</item>
)
[1] => Array
(
[0] => foo bar
[1] => foo bar
)
)
I only added ? to the regex, which makes the quantifier lazy so that it stops at the nearest closing tag instead of the last one.
Read about lazy and greedy here: What do lazy and greedy mean in the context of regular expressions?
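Since the question mentions that the tags can span new lines, here is a small sketch combining the lazy quantifier with the s modifier (without it, . does not match a newline). The sample content is made up for illustration.

```php
<?php
// Hypothetical input: the first item contains a line break.
$content = "<item>foo\nbar</item>\n<item>baz</item>";

// .*? is lazy (stops at the nearest </item>); s lets . match newlines too.
preg_match_all('/<item>(.*?)<\/item>/s', $content, $matches);

print_r($matches[1]);
```

Here $matches[1] holds the two inner texts, the first one with its line break preserved.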

PHP preg_match_all: extract parameters of a command

I have the following LaTeX command:
\autocites[][]{}[][]{}
where the parameters inside [] are optional and the others inside {} are mandatory. The \autocites command can be extended by additional groups of arguments like:
\autocites[a1][a2]{a3}[b1][b2]{b3}
\autocites[a1][a2]{a3}[b1][b2]{b3}[c1][c2]{c3}
...
It can also be used like this:
\autocites{a}{b}
\autocites{a}[b1][]{b3}
\autocites{a}[][b2]{b3}
...
I'd like to extract its parameters by using a regular expression in PHP. This is my first attempt:
/\\autocites(\[(.*?)\])(\[(.*?)\])(\{(.*?)\})(\[(.*?)\])(\[(.*?)\])(\{(.*?)\})/
Although this works fine if \autocites contains only two groups of three parameters I'm not able to figure out how to get it working for an unknown number of parameters.
I also tried using the following expression:
/\\autocites((\[(.*?)\]\[(.*?)\])?\{(.*?)\}){2,}/
This time I'm able to match even larger numbers of parameters but then I'm not able to extract all values because PHP always just gives me the content of the last three parameters:
Array
(
[0] => Array
(
[0] => \autocites[a][b]{c}[d][e]{f}[a][a]{a}
)
[1] => Array
(
[0] => [a][a]{a}
)
[2] => Array
(
[0] => [a][a]
)
[3] => Array
(
[0] => a
)
[4] => Array
(
[0] => a
)
[5] => Array
(
[0] => a
)
)
Any help is greatly appreciated.
You'll have to do this in two steps. Only .NET can retrieve an arbitrary number of captures. In all other flavors, the number of resulting captures is fixed by the number of groups in your pattern (repeating a group will only overwrite previous captures).
So first, match the entire thing to get the parameters, and then extract them in a second step:
preg_match('/\\\\autocites((?:\{[^}]*\}|\[[^]]*\])+)/', $input, $autocite);
preg_match_all('/(?|\{([^}]*)\}|\[([^]]*)\])/', $autocite[1], $parameters);
// $parameters[1] will now be an array of all parameters
Using a slightly more elaborate approach and the anchor \G we could also do it all in one go, by using an arbitrary amount of matches instead of captures:
preg_match_all('/
    (?|                           # two alternatives whose group numbers both begin at 1
      \\\\autocites               # match the command
      (?|\{([^}]*)\}|\[([^]]*)\]) # and a parameter in group 1
    |                             # OR
      \G                          # anchor the match to the end of the last match
      (?|\{([^}]*)\}|\[([^]]*)\]) # and match a parameter in group 1
    )
    /x',
    $input,
    $parameters);
// again, you'll have an array of parameters in $parameters[1]
Note that with this approach, if you have multiple autocites in your code, you'll get all parameters from all commands in a single list. There are some ways to alleviate that, but I think the first approach would be cleaner in that case.
If you want to be able to distinguish between optional and mandatory parameters (with any approach), capture the opening or closing bracket/brace along with the parameter, and check against that character to find out which type it is.
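One way to sketch that last idea on top of the two-step approach: capture the opening bracket in group 1 and the content in group 2, then map the bracket to a type. The 'type' and 'value' keys and the sample input are assumptions made for illustration.

```php
<?php
$input = '\autocites[a1][a2]{a3}{b3}';

// Step 1: grab the whole run of [...] and {...} groups after the command.
preg_match('/\\\\autocites((?:\{[^}]*\}|\[[^]]*\])+)/', $input, $autocite);

// Step 2: branch reset (?|...) so that in both alternatives
// group 1 is the opening bracket and group 2 is the parameter content.
preg_match_all(
    '/(?|(\{)([^}]*)\}|(\[)([^]]*)\])/',
    $autocite[1],
    $found,
    PREG_SET_ORDER
);

$parameters = [];
foreach ($found as $match) {
    $parameters[] = [
        'type'  => $match[1] === '[' ? 'optional' : 'mandatory',
        'value' => $match[2],
    ];
}

print_r($parameters);
```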

PHP preg_match: comma separated decimals

This regex finds the right string, but only returns the first result. How do I make it search the rest of the text?
$text =",415.2109,520.33970,495.274100,482.3238,741.5634
655.3444,488.29980,741.5634";
preg_match("/[^,]+[\d+][.?][\d+]*/",$text,$data);
echo $data;
Follow up:
I'm pushing the initial expectations of this script, and I'm at the point where I'm pulling out more verbose data. I've wasted many hours on this... can anyone shed some light?
Here's my string:
155.101.153.123:simple:mass_mid:[479.0807,99.011, 100.876],mass_tol:[30],mass_mode: [1],adducts:[M+CH3OH+H],
130.216.138.250:simple:mass_mid:[290.13465,222.34566],mass_tol:[30],mass_mode:[1],adducts:[M+Na],
And here's my regex:
"/mass_mid:[((?:\d+)(?:.)(?:\d+)(?:,)*)/"
I'm really banging my head on this one! Can someone tell me how to exclude the mass_mid:[ prefix from the results and keep the comma-separated values?
Use preg_match_all rather than preg_match
From the PHP Manual:
(`preg_match_all`) searches subject for all matches to the regular expression given in pattern and puts them in matches in the order specified by flags.
After the first match is found, the subsequent searches are continued on from end of the last match.
http://php.net/manual/en/function.preg-match-all.php
Don't use a regex. Use explode() to split apart your inputs on the commas (the old split() function was removed in PHP 7).
Regexes are not a magic wand you wave at every problem that happens to involve strings.
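A regex-free version might look like the sketch below. It assumes the sample string from the question, where the two rows are separated by a line break rather than a comma, so the line break is first turned into a comma.

```php
<?php
$text = ",415.2109,520.33970,495.274100,482.3238,741.5634\n655.3444,488.29980,741.5634";

// Treat the line break as one more separator, then split on commas.
$pieces = explode(',', str_replace(["\r\n", "\n"], ',', $text));

// Trim stray whitespace and drop empty pieces (e.g. before the leading comma).
$values = array_values(array_filter(array_map('trim', $pieces), 'strlen'));

print_r($values);
```

The values stay strings here; cast them with (float) if numbers are needed, keeping in mind that trailing zeros such as in 520.33970 are lost in that case.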
Description
To extract a list of numeric values that may each include a single decimal point, you could use this regex:
\d*\.?\d+
PHP Code Example:
<?php
$sourcestring=",415.2109,520.33970,495.274100,482.3238,741.5634
655.3444,488.29980,741.5634";
preg_match_all('/\d*\.?\d+/im',$sourcestring,$matches);
echo "<pre>".print_r($matches,true);
?>
yields matches
$matches Array:
(
[0] => Array
(
[0] => 415.2109
[1] => 520.33970
[2] => 495.274100
[3] => 482.3238
[4] => 741.5634
[5] => 655.3444
[6] => 488.29980
[7] => 741.5634
)
)
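For the follow-up about mass_mid, one possible sketch is to capture only what sits between the square brackets and then split that list. It assumes each line holds exactly one mass_mid:[...] entry, as in the sample data.

```php
<?php
$line = '155.101.153.123:simple:mass_mid:[479.0807,99.011, 100.876],'
      . 'mass_tol:[30],mass_mode: [1],adducts:[M+CH3OH+H],';

$masses = [];
// The literal "mass_mid:[" is matched but sits outside the capturing group,
// so group 1 contains only the comma-separated list.
if (preg_match('/mass_mid:\[([^\]]*)\]/', $line, $m)) {
    // Split on commas, tolerating spaces around them.
    $masses = preg_split('/\s*,\s*/', trim($m[1]));
}

print_r($masses);
```

Note that the bracket has to be escaped as \[ in the pattern; an unescaped [ starts a character class, which is one reason the regex in the question does not behave as intended.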

Problem (un-)greedy RegExp

Consider the following Strings:
1: cccbbb
2: cccaaabbb
What I would like to end up with are matches like this:
1: Array
(
[1] =>
[2] => bbb
)
2: Array
(
[1] => aaa
[2] => bbb
)
How can I match both in one RegExp?
Here's my try:
#(aaa)?(.*)$#
I have tried many variants of greedy and ungreedy modifications but it doesn't work out. As soon as I add the '?' everything is matched in [2]. Making [2] ungreedy doesn't help.
My RegExp works as expected if I omit the 'ccc', but I have to allow other characters at the beginning...
/(aaa)?((.)\3*)$/
There will be an extra [3] though. I don't think that's a problem.
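A quick check of this pattern against both strings is sketched below. The reason it behaves differently from the original attempt, as far as I can tell: group 2 must be a run of one repeated character reaching the end of the string, so the engine cannot succeed at offset 0 by letting (aaa)? match nothing and has to move forward.

```php
<?php
// Group 3 captures one character; \3* repeats that same character up to the end.
$pattern = '/(aaa)?((.)\3*)$/';

$results = [];
foreach (['cccbbb', 'cccaaabbb'] as $str) {
    preg_match($pattern, $str, $m);
    $results[$str] = $m;
}

print_r($results);
```

For "cccbbb" group 1 comes back as an empty string, and for "cccaaabbb" it is "aaa"; group 2 is "bbb" in both cases.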
Thanks for the brainstorming here guys! I have finally been able to figure something out that's working:
^(?:([^a]*)(aaa))?(.*)$
Here's a non-regex way: search for "aaa" and, if it is found, split on it, then store the part to the right of "aaa" in the array.
$str="cccaaabbb";
$array = []; // initialize so print_r() works even when "aaa" is absent
if (strpos($str,"aaa")!==FALSE){
$array[]="aaa";
$s = explode("aaa",$str);
$array[]=end($s); // everything after the last "aaa"
}
print_r($array);
output
$ php test.php
Array
(
[0] => aaa
[1] => bbb
)
As for [1], depending on your criteria when "aaa" is not found, it can be as simple as getting the substring from character 4 onwards using substr().
This will match the groups, but it's not very flexible. Can you give a little more detail about what you need to do? It may be much easier to grab three characters at a time and evaluate them.
Also, I tested this in PowerShell, which has a slightly different flavor of regex.
(a{3,3})*(b{3,3})
Do it like this:
$sPattern = "/(aaa?|)(bbb)/";
This works well.

Can preg_match() (or other php regex function) match a variable number of parenthesized subpatterns?

Suppose I have '/srv/www/site.com/htdocs/system/application/views/' and want to test it against a regexp that matches each directory name in the path?
Something like this pattern: '(/[^/])'
That yields an array with 'srv','www','site.com'... etc.
PS: the regexp syntax I wrote is just to illustrate, it's not tested and surely wrong, but just to give an idea.
PS2: I know there's explode() but let's see if we can do this with a regexp (it's useful for other languages and frameworks which don't have explode).
preg_match_all:
$str = '/srv/www/site.com/htdocs/system/application/views/';
preg_match_all('/\/([^\/]+)/', $str, $matches);
// $matches[0] contains matching strings
// $matches[1] contains first subgroup matches
print_r($matches[1]);
Output:
Array
(
[0] => srv
[1] => www
[2] => site.com
[3] => htdocs
[4] => system
[5] => application
[6] => views
)
There is preg_split for splitting strings on regular expressions, similar to explode, and then there is preg_match_all, which does what you want.
I don't think you can, but you could instead use preg_match_all() to get multiple matches from a regular expression. There is also preg_split() which may be more appropriate.
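For comparison, a preg_split() version of the same task could look like this sketch. The PREG_SPLIT_NO_EMPTY flag drops the empty strings that the leading and trailing slashes would otherwise produce.

```php
<?php
$str = '/srv/www/site.com/htdocs/system/application/views/';

// Split on one or more slashes; -1 means "no limit" on the number of pieces.
$dirs = preg_split('#/+#', $str, -1, PREG_SPLIT_NO_EMPTY);

print_r($dirs);
```

Using # as the pattern delimiter avoids having to escape the slash inside the pattern.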
