Preg_match multiple instances reuse delimiter


I've revised the question because I did not explain it correctly the first time.
Can someone please help me with this regex? I can't figure out how to use the same delimiter as the end of one match and then reuse it as the start of the next.
In the following code I'm trying to match everything between each delimiter_test statement.
$string = "
delimiter_test this is a test
this is more data,etc
delimiter_test this is another test
and this is more data
delimiter_test this yet another test
and this is even more data
";
Here is the regex I've tried:
preg_match_all('/delimiter_test(.*?)delimiter_test/s', $string, $matches);
And here are my results:
Array
(
[0] => Array
(
[0] => delimiter_test this is a test
this is more data,etc
delimiter_test
)
[1] => Array
(
[0] => this is a test
this is more data,etc
)
)
So it only gets what is between the first and second 'delimiter_test'.
Hopefully that makes sense.
Thanks, Max

Updated answer:
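You can use lookarounds to achieve this. Your pattern consumes the closing delimiter_test, so the search for the next match starts after it. A lookbehind and a lookahead only assert that the delimiter is there without consuming it, so the same delimiter_test can end one match and start the next.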
preg_match_all('/(?<=delimiter_test).*?(?=delimiter_test|$)/s', $string, $matches);
print_r($matches[0]);
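For reference, here is a self-contained sketch that runs the pattern above against the sample string from the question. The array_map('trim', ...) step and the $blocks variable are additions for readability and are not part of the answer; without them each block keeps its leading space and trailing newline.

```php
<?php
$string = "
delimiter_test this is a test
this is more data,etc
delimiter_test this is another test
and this is more data
delimiter_test this yet another test
and this is even more data
";

// The lookbehind and lookahead assert the delimiter without consuming it,
// so each delimiter_test can close one block and open the next.
preg_match_all('/(?<=delimiter_test).*?(?=delimiter_test|$)/s', $string, $matches);

// Assumption: surrounding whitespace is not wanted, so trim every block.
$blocks = array_map('trim', $matches[0]);

print_r($blocks); // three blocks, one per delimiter_test
```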

Related

Extracting all the emojis from a string using REGEX

I have been trying to extract all the emojis from a string using the regex below. However, it is sometimes inaccurate: it returns extra emojis that I did not expect.
The regex that I am using is this one:
preg_match_all('/([0-9|#][\x{20E3}])|[\x{00ae}|\x{00a9}|\x{203C}|\x{2047}|\x{2048}|\x{2049}|\x{3030}|\x{303D}|\x{2139}|\x{2122}|\x{3297}|\x{3299}][\x{FE00}-\x{FEFF}]?|[\x{2190}-\x{21FF}][\x{FE00}-\x{FEFF}]?|[\x{2300}-\x{23FF}][\x{FE00}-\x{FEFF}]?|[\x{2460}-\x{24FF}][\x{FE00}-\x{FEFF}]?|[\x{25A0}-\x{25FF}][\x{FE00}-\x{FEFF}]?|[\x{2600}-\x{27BF}][\x{FE00}-\x{FEFF}]?|[\x{2600}-\x{27BF}][\x{1F000}-\x{1FEFF}]?|[\x{2900}-\x{297F}][\x{FE00}-\x{FEFF}]?|[\x{2B00}-\x{2BF0}][\x{FE00}-\x{FEFF}]?|[\x{1F000}-\x{1F9FF}][\x{FE00}-\x{FEFF}]?|[\x{1F000}-\x{1F9FF}][\x{1F000}-\x{1FEFF}]?/u', $string, $emojis);
When I print $emojis[0] after this, the result is sometimes wrong.
For example,
CODE:
$string = "Get into it !!! 🤰🏻🍴";
preg_match_all('/([0-9|#][\x{20E3}])|[\x{00ae}|\x{00a9}|\x{203C}|\x{2047}|\x{2048}|\x{2049}|\x{3030}|\x{303D}|\x{2139}|\x{2122}|\x{3297}|\x{3299}][\x{FE00}-\x{FEFF}]?|[\x{2190}-\x{21FF}][\x{FE00}-\x{FEFF}]?|[\x{2300}-\x{23FF}][\x{FE00}-\x{FEFF}]?|[\x{2460}-\x{24FF}][\x{FE00}-\x{FEFF}]?|[\x{25A0}-\x{25FF}][\x{FE00}-\x{FEFF}]?|[\x{2600}-\x{27BF}][\x{FE00}-\x{FEFF}]?|[\x{2600}-\x{27BF}][\x{1F000}-\x{1FEFF}]?|[\x{2900}-\x{297F}][\x{FE00}-\x{FEFF}]?|[\x{2B00}-\x{2BF0}][\x{FE00}-\x{FEFF}]?|[\x{1F000}-\x{1F9FF}][\x{FE00}-\x{FEFF}]?|[\x{1F000}-\x{1F9FF}][\x{1F000}-\x{1FEFF}]?/u', $string, $emojis);
print_r($emojis[0]);
OUTPUT:
Array ( [0] => 🤰 [1] => 🏻 [2] => 🍴 )
This is not expected as the second element in the above array was not in the inputted string.
Is this a REGEX issue? Is there any better REGEX for this? Or anything other than REGEX to extract emojis?

You are dealing with "Fitzpatrick modifiers" (skin-tone modifiers). The 🏻 in your output is a separate code point, U+1F3FB, that follows 🤰 in your input string.
I haven't had a close look at your regex pattern to make refinements, but I can offer a quick solution.
Add (?:[\x{1f3fb}-\x{1f3ff}](*SKIP)(*FAIL))| at the start of your pattern to disqualify the modifiers.
Code:
$string = "Pregnant Woman: 🤰🏻 Pregnant Woman: 🤰 Fork and Knife: 🍴 Light Skin Tone: 🏻 (a pale skin tone modifier)";
//$string = "Get into it !!! 🤰🏻🍴";
preg_match_all('/(?:[\x{1f3fb}-\x{1f3ff}](*SKIP)(*FAIL))|[0-9|#][\x{20E3}]|[\x{00ae}|\x{00a9}|\x{203C}|\x{2047}|\x{2048}|\x{2049}|\x{3030}|\x{303D}|\x{2139}|\x{2122}|\x{3297}|\x{3299}][\x{FE00}-\x{FEFF}]?|[\x{2190}-\x{21FF}][\x{FE00}-\x{FEFF}]?|[\x{2300}-\x{23FF}][\x{FE00}-\x{FEFF}]?|[\x{2460}-\x{24FF}][\x{FE00}-\x{FEFF}]?|[\x{25A0}-\x{25FF}][\x{FE00}-\x{FEFF}]?|[\x{2600}-\x{27BF}][\x{FE00}-\x{FEFF}]?|[\x{2600}-\x{27BF}][\x{1F000}-\x{1FEFF}]?|[\x{2900}-\x{297F}][\x{FE00}-\x{FEFF}]?|[\x{2B00}-\x{2BF0}][\x{FE00}-\x{FEFF}]?|[\x{1F000}-\x{1F9FF}][\x{FE00}-\x{FEFF}]?|[\x{1F000}-\x{1F9FF}][\x{1F000}-\x{1FEFF}]/u', $string, $emojis);
print_r($emojis[0]);
Output:
Array
(
[0] => 🤰
[1] => 🤰
[2] => 🍴
)
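To see the (*SKIP)(*FAIL) idea in isolation, here is a minimal sketch with a deliberately simplified pattern. The single range U+1F000 to U+1F9FF is only a stand-in for the full pattern above, not a replacement for it, and the "\u{...}" string escapes assume PHP 7.0 or later.

```php
<?php
// Simplified, illustrative pattern (an assumption, not the full emoji regex):
// the first alternative consumes a skin-tone modifier (U+1F3FB to U+1F3FF),
// then (*SKIP)(*FAIL) discards it and resumes the search after it;
// the second alternative keeps any other code point in U+1F000 to U+1F9FF.
$pattern = '/[\x{1F3FB}-\x{1F3FF}](*SKIP)(*FAIL)|[\x{1F000}-\x{1F9FF}]/u';

// Pregnant woman, light skin tone modifier, fork and knife.
$string = "Get into it !!! \u{1F930}\u{1F3FB}\u{1F374}";

preg_match_all($pattern, $string, $emojis);
print_r($emojis[0]); // the modifier is absent: only U+1F930 and U+1F374 remain
```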

Finding values in a string via regex in php

I am trying to get information out of a textarea that contains certain strings (e.g. [name]) and find each item encased in the square brackets using regex patterns (currently tried using preg_match, preg_split, preg_quote, preg_match_all). It seems that the problem is in my regex pattern that I am providing for it.
My current regex:
$menuItems = preg_match_all('/[^[][([^[].*)]/U', $_SESSION['emailBody'], $menuItems);
I have tried many other patterns e.g.
/(?[...]\w+): (?[...]\d+)/
Any help that can be provided with this is greatly appreciated.
EDIT:
Sample input:
[email] address [to] name [from] someone
Message displayed on var_dump of the $menuItems variable:
array(1) { [0]=> string(0) "" }
EDIT 2:
Thank you to everyone for the help and support with this, I am pleased to say that it is all up and running perfectly!

From the comment stream above, you can simplify the regular expression as follows:
preg_match_all('/\[(.*)\]/U', $_SESSION['emailBody'], $menuItems);
One thing to note:
preg_match_all() fills the array in its 3rd parameter with the results of the matches. Your example line then overwrites this array with the result of preg_match_all() (an integer).
You should then be able to iterate over the results by using the following loop:
foreach ($menuItems[1] as $menuItem) {
// ...
}
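A short sketch of the point above, keeping the return value and the matches array in separate variables. The $emailBody and $count names are illustrative; $emailBody stands in for $_SESSION['emailBody'] so the snippet runs on its own.

```php
<?php
// Stand-in for $_SESSION['emailBody'] (sample input from the question).
$emailBody = '[email] address [to] name [from] someone';

// The return value is the number of full matches; the matches themselves
// are written into the third argument.
$count = preg_match_all('/\[(.*)\]/U', $emailBody, $menuItems);

echo $count, "\n"; // 3

// $menuItems[1] holds the text captured between the brackets.
foreach ($menuItems[1] as $menuItem) {
    echo $menuItem, "\n";
}
```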

Escape the square brackets and remove the dot:
$menuItems = preg_match_all('/[^[]\[([^[]*)\]/U', $_SESSION['emailBody'], $menuItems);
// changes: "[" escaped as "\[", the "." before "*" removed, "]" escaped as "\]"
preg_match_all doesn't return a string. You have to add an array for the last parameter:
preg_match_all('/\[([^[\]]*)\]/U', $_SESSION['emailBody'], $matches);
The matches are in the array $matches
print_r($matches);
Working example:
$str = '[email] address [to] name [from] someone';
preg_match_all('/\[([^[\]]*)\]/U', $str, $matches);
print_r($matches);
Output:
Array
(
[0] => Array
(
[0] => [email]
[1] => [to]
[2] => [from]
)
[1] => Array
(
[0] => email
[1] => to
[2] => from
)
)

Here is a simple solution. This regex captures every item enclosed in square brackets, including the brackets themselves.
If you don't want the brackets in the result, change the regex to $regex = "/(?:\\[(\\w+)\\])/mi"; and read the names from $matches[1].
$subject = "[email] address [to] name [from] someone";
$regex = "/(\\[\\w+\\])/mi";
$matches = array();
preg_match_all($regex, $subject, $matches); // no "&" here: call-time pass-by-reference is a fatal error since PHP 5.4
print_r($matches);

Php Regex: isolate and count occurrences

So far I managed to isolate and count pictures in a string by doing this:
preg_match_all('#<img([^<]+)>#', $string, $temp_img);
$count=count($temp_img[1]);
I would like to do something similar with parts that would look like this:
"code=mYrAnd0mc0dE123".
For instance, let's say I have this string:
$string="my first code is code=aZeRtY and my second one is code=qSdF1E";
I would like to store "aZeRtY" and "qSdF1E" in an array.
I tried a bunch of regex to isolate the "code=..." but none has worked for me.
Obviously, regex is beyond me.

Are you looking for this?
preg_match_all('#code=([A-Za-z0-9]+)#', $string, $results);
$count = count($results[1]);

This:
$string = '
code=jhb2345jhbv2345ljhb2435
code=jhb2345jhbv2345ljhb2435
code=jhb2345jhbv2345ljhb2435
code=jhb2345jhbv2345ljhb2435
';
preg_match_all('/(?<=code=)[a-zA-Z0-9]+/', $string, $matches);
echo('<pre>');
print_r($matches);
echo('</pre>');
Outputs:
Array
(
[0] => Array
(
[0] => jhb2345jhbv2345ljhb2435
[1] => jhb2345jhbv2345ljhb2435
[2] => jhb2345jhbv2345ljhb2435
[3] => jhb2345jhbv2345ljhb2435
)
)
However without a suffixing delimiter, it won't work correctly if this pattern is concatenated, eg: code=jhb2345jhbv2345ljhb2435code=jhb2345jhbv2345ljhb2435
But perhaps that won't be a problem for you.
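The concatenation caveat can be reproduced with a short sketch. The sample string and the second, lazy pattern are illustrative assumptions; the lazy variant only helps if a code never contains the literal text "code=".

```php
<?php
// Two codes glued together with nothing in between.
$string = 'code=abc123code=def456';

// Greedy version from above: the first match swallows the next "code".
preg_match_all('/(?<=code=)[a-zA-Z0-9]+/', $string, $greedy);
print_r($greedy[0]); // first element is "abc123code"

// Possible workaround (an assumption): match lazily and stop before the
// next "code=", before any non-alphanumeric character, or at the end.
preg_match_all('/(?<=code=)[a-zA-Z0-9]+?(?=code=|[^a-zA-Z0-9]|\z)/', $string, $lazy);
print_r($lazy[0]); // "abc123" and "def456"
```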

newbie php regex issue

I have the following code:
<?php
$data="000ffe-fcc9f4 1 000fbe-fccabe";
$pattern='/([0-9A-F]{6})-([0-9A-F]{6})$/i';
echo "the pattern we are using is: ".$pattern."<BR>";
preg_match_all($pattern,$data,$matches, PREG_SET_ORDER );
print_r($matches[0]);
?>
I don't understand why it's not finding both mac addresses as matches.
Here's what the output on the page looks like:
the pattern we are using is: /([0-9A-F]{6})-([0-9A-F]{6})$/i
Array ( [0] => 000fbe-fccabe [1] => 000fbe [2] => fccabe )
I was expecting that element [0] would contain both 000ffe-fcc9f4 and 000fbe-fccabe.
Can you tell me what I'm doing wrong?
Thanks.

It isn't finding both because of the $ at the end of your regex, which anchors the pattern to the end of the string, so only the last address can match.
Change $pattern to /([0-9A-F]{6})-([0-9A-F]{6})/i and it will match both.
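One more detail from the question: with PREG_SET_ORDER, $matches[0] is the first match set (full match plus its two groups), not the list of all full matches. A small sketch of both orderings with the unanchored pattern; the $sets variable name is illustrative.

```php
<?php
$data = "000ffe-fcc9f4 1 000fbe-fccabe";
$pattern = '/([0-9A-F]{6})-([0-9A-F]{6})/i'; // no trailing $ anchor

// Default PREG_PATTERN_ORDER: $matches[0] lists every full match.
preg_match_all($pattern, $data, $matches);
print_r($matches[0]); // 000ffe-fcc9f4 and 000fbe-fccabe

// PREG_SET_ORDER: one sub-array per match, so loop over all of them.
preg_match_all($pattern, $data, $sets, PREG_SET_ORDER);
foreach ($sets as $set) {
    echo $set[0], " => ", $set[1], " / ", $set[2], "\n";
}
```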

PHP Regexp: ignoring everything before a defined substring

I'm trying to parse a web page.
Basically it gets stored in a string that will look like this:
"[HTML CODE ...]world:[HTML CODE ...]my_number[REST OF HTML_CODE ...]"
Of course "world:" and "my_number" are part of the HTML code; however, I would like to ignore everything before the first occurrence of "world:". What I need is the first number that appears after the first occurrence of "world:", keeping in mind that a bunch of HTML code will sit between the two.
I could substring the html code but I would like to do this all just by using a single regex if possible.
This is the regular expression I tried to match:
'/(?<=world:)\D+?[0-9]+/'
But this also returns all the HTML between "world:" and my number, not just the number.
Thanks!

I think you were close to getting it. I was able to use this on the string you provided.
$subject = "[HTML CODE ...]world:[HTML CODE ...]3334[REST OF HTML_CODE ...]";
$pattern = "/world:\D+?(?<my_number>[0-9]+)/";
$matches = array();
$result = preg_match_all($pattern, $subject, $matches); // no "&" here: call-time pass-by-reference is a fatal error since PHP 5.4
print_r($matches);
Results in:
Array
(
[0] => Array
(
[0] => world:[HTML CODE ...]3334
)
[my_number] => Array
(
[0] => 3334
)
[1] => Array
(
[0] => 3334
)
)
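If only the number itself is wanted as the match, a possible variant uses preg_match (which stops at the first match) together with \K, which drops everything matched so far from the reported result. This is a sketch rather than the answer's method; \D* is used instead of \D+? so that a number directly after "world:" would also be found.

```php
<?php
$subject = "[HTML CODE ...]world:[HTML CODE ...]3334[REST OF HTML_CODE ...]";

// "world:" and the non-digits after it are matched but then discarded by \K,
// so $m[0] contains only the digits that follow.
if (preg_match('/world:\D*\K\d+/', $subject, $m)) {
    echo $m[0], "\n"; // 3334
}
```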
