PHP regex: isolate and count occurrences

So far I managed to isolate and count pictures in a string by doing this:
preg_match_all('#<img([^<]+)>#', $string, $temp_img);
$count=count($temp_img[1]);
I would like to do something similar with parts that would look like this:
"code=mYrAnd0mc0dE123".
For instance, let's say I have this string:
$string="my first code is code=aZeRtY and my second one is code=qSdF1E";
I would like to store "aZeRtY" and "qSdF1E" in an array.
I tried a bunch of regexes to isolate the "code=..." parts, but none of them worked for me.
Obviously, regex is beyond me.

Are you looking for this?
preg_match_all('#code=([A-Za-z0-9]+)#', $string, $results);
$count = count($results[1]);
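As a quick check against the sample string from the question, the capture group can be read back as shown here. This is a minimal sketch; the variable name $codes is illustrative and not from the original post.

```php
<?php
// Sample string from the question.
$string = "my first code is code=aZeRtY and my second one is code=qSdF1E";

// Group 1 captures only the characters that follow "code=".
preg_match_all('#code=([A-Za-z0-9]+)#', $string, $results);

$codes = $results[1];   // the captured codes, without the "code=" prefix
$count = count($codes); // number of occurrences

print_r($codes);        // aZeRtY, qSdF1E
echo $count, PHP_EOL;   // 2
```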

This:
$string = '
code=jhb2345jhbv2345ljhb2435
code=jhb2345jhbv2345ljhb2435
code=jhb2345jhbv2345ljhb2435
code=jhb2345jhbv2345ljhb2435
';
preg_match_all('/(?<=code=)[a-zA-Z0-9]+/', $string, $matches);
echo('<pre>');
print_r($matches);
echo('</pre>');
Outputs:
Array
(
[0] => Array
(
[0] => jhb2345jhbv2345ljhb2435
[1] => jhb2345jhbv2345ljhb2435
[2] => jhb2345jhbv2345ljhb2435
[3] => jhb2345jhbv2345ljhb2435
)
)
However, without a trailing delimiter it won't work correctly if codes are concatenated, e.g. code=jhb2345jhbv2345ljhb2435code=jhb2345jhbv2345ljhb2435: the character class also matches the letters of the next "code", so the first match runs on into it.
But perhaps that won't be a problem for you.
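To make the caveat concrete, the sketch below runs the lookbehind pattern on two concatenated codes, then tries one possible workaround: a negative lookahead that stops each match where the next "code=" begins. The workaround is an assumption for illustration, not part of the original answer, and it would cut short any code that itself contains "code=".

```php
<?php
$string = 'code=abc123code=def456';

// Original pattern: the first match also swallows the letters "code".
preg_match_all('/(?<=code=)[a-zA-Z0-9]+/', $string, $greedy);
print_r($greedy[0]);   // abc123code, def456

// Possible workaround: consume a character only if "code=" does not start there.
preg_match_all('/(?<=code=)(?:(?!code=)[a-zA-Z0-9])+/', $string, $tempered);
print_r($tempered[0]); // abc123, def456
```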

Related

Extracting all the emojis from a string using REGEX

I have been trying to extract all the emojis from a string using a regex function listed below. However, this function is not accurate sometimes as it adds up additional emojis in the process.
The regex that I am using is this one:
preg_match_all('/([0-9|#][\x{20E3}])|[\x{00ae}|\x{00a9}|\x{203C}|\x{2047}|\x{2048}|\x{2049}|\x{3030}|\x{303D}|\x{2139}|\x{2122}|\x{3297}|\x{3299}][\x{FE00}-\x{FEFF}]?|[\x{2190}-\x{21FF}][\x{FE00}-\x{FEFF}]?|[\x{2300}-\x{23FF}][\x{FE00}-\x{FEFF}]?|[\x{2460}-\x{24FF}][\x{FE00}-\x{FEFF}]?|[\x{25A0}-\x{25FF}][\x{FE00}-\x{FEFF}]?|[\x{2600}-\x{27BF}][\x{FE00}-\x{FEFF}]?|[\x{2600}-\x{27BF}][\x{1F000}-\x{1FEFF}]?|[\x{2900}-\x{297F}][\x{FE00}-\x{FEFF}]?|[\x{2B00}-\x{2BF0}][\x{FE00}-\x{FEFF}]?|[\x{1F000}-\x{1F9FF}][\x{FE00}-\x{FEFF}]?|[\x{1F000}-\x{1F9FF}][\x{1F000}-\x{1FEFF}]?/u', $string, $emojis);
When I print $emojis[0] after this, the result is sometimes inaccurate.
For example,
CODE:
$string = "Get into it !!! 🀰🏻🍴";
preg_match_all('/([0-9|#][\x{20E3}])|[\x{00ae}|\x{00a9}|\x{203C}|\x{2047}|\x{2048}|\x{2049}|\x{3030}|\x{303D}|\x{2139}|\x{2122}|\x{3297}|\x{3299}][\x{FE00}-\x{FEFF}]?|[\x{2190}-\x{21FF}][\x{FE00}-\x{FEFF}]?|[\x{2300}-\x{23FF}][\x{FE00}-\x{FEFF}]?|[\x{2460}-\x{24FF}][\x{FE00}-\x{FEFF}]?|[\x{25A0}-\x{25FF}][\x{FE00}-\x{FEFF}]?|[\x{2600}-\x{27BF}][\x{FE00}-\x{FEFF}]?|[\x{2600}-\x{27BF}][\x{1F000}-\x{1FEFF}]?|[\x{2900}-\x{297F}][\x{FE00}-\x{FEFF}]?|[\x{2B00}-\x{2BF0}][\x{FE00}-\x{FEFF}]?|[\x{1F000}-\x{1F9FF}][\x{FE00}-\x{FEFF}]?|[\x{1F000}-\x{1F9FF}][\x{1F000}-\x{1FEFF}]?/u', $string, $emojis);
print_r($emojis[0]);
OUTPUT:
Array ( [0] => 🀰 [1] => 🏻 [2] => 🍴 )
This is not expected, as the second element of the array above was not in the input string.
Is this a REGEX issue? Is there any better REGEX for this? Or anything other than REGEX to extract emojis?
You are dealing with "Fitzpatrick modifiers" (skin tone modifiers).
I haven't had a close look at your regex pattern to make refinements, but I can offer a quick solution.
Use (?:[\x{1f3fb}-\x{1f3ff}](*SKIP)(*FAIL))| at the start of your pattern to disqualify the modifiers.
Code:
$string = "Pregnant Woman: 🀰🏻 Pregnant Woman: 🀰 Fork and Knife: 🍴 Light Skin Tone: 🏻 (a pale skin tone modifier)";
//$string = "Get into it !!! 🀰🏻🍴";
preg_match_all('/(?:[\x{1f3fb}-\x{1f3ff}](*SKIP)(*FAIL))|[0-9|#][\x{20E3}]|[\x{00ae}|\x{00a9}|\x{203C}|\x{2047}|\x{2048}|\x{2049}|\x{3030}|\x{303D}|\x{2139}|\x{2122}|\x{3297}|\x{3299}][\x{FE00}-\x{FEFF}]?|[\x{2190}-\x{21FF}][\x{FE00}-\x{FEFF}]?|[\x{2300}-\x{23FF}][\x{FE00}-\x{FEFF}]?|[\x{2460}-\x{24FF}][\x{FE00}-\x{FEFF}]?|[\x{25A0}-\x{25FF}][\x{FE00}-\x{FEFF}]?|[\x{2600}-\x{27BF}][\x{FE00}-\x{FEFF}]?|[\x{2600}-\x{27BF}][\x{1F000}-\x{1FEFF}]?|[\x{2900}-\x{297F}][\x{FE00}-\x{FEFF}]?|[\x{2B00}-\x{2BF0}][\x{FE00}-\x{FEFF}]?|[\x{1F000}-\x{1F9FF}][\x{FE00}-\x{FEFF}]?|[\x{1F000}-\x{1F9FF}][\x{1F000}-\x{1FEFF}]/u', $string, $emojis);
print_r($emojis[0]);
Output:
Array
(
[0] => 🀰
[1] => 🀰
[2] => 🍴
)
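To see the (*SKIP)(*FAIL) idea in isolation, here is a minimal sketch with a deliberately simplified emoji range instead of the full pattern above. The single range U+1F000 to U+1F9FF is an assumption for illustration and does not cover every emoji. The input is written with \u{...} escapes, which need PHP 7 or later.

```php
<?php
// Pregnant woman (U+1F930) + light skin tone modifier (U+1F3FB) + fork and knife (U+1F374).
$string = "Get into it !!! \u{1F930}\u{1F3FB}\u{1F374}";

// First alternative: when a skin tone modifier is found, (*SKIP)(*FAIL) rejects it
// and resumes the search after it, so the second alternative never sees it.
$pattern = '/[\x{1F3FB}-\x{1F3FF}](*SKIP)(*FAIL)|[\x{1F000}-\x{1F9FF}]/u';

preg_match_all($pattern, $string, $emojis);
print_r($emojis[0]); // two matches: U+1F930 and U+1F374
```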

Preg_match multiple instances reuse delimiter

I've revised the question as I did not explain correctly the first time.
Can someone please help me with this regex? I can't seem to figure out how to use the same delimiter as the end of one match and then reuse it as the start of the next.
In the following code I'm trying to match everything in between each delimiter_test statement.
$string = "
delimiter_test this is a test
this is more data,etc
delimiter_test this is another test
and this is more data
delimiter_test this yet another test
and this is even more data
";
Here is the regex I've tried:
preg_match_all('/delimiter_test(.*?)delimiter_test/s', $string, $matches);
And here are my results:
Array
(
[0] => Array
(
[0] => delimiter_test this is a test
this is more data,etc
delimiter_test
)
[1] => Array
(
[0] => this is a test
this is more data,etc
)
)
So it only gets what is between the first and second 'delimiter_test'.
Hopefully that makes sense.
Thanks, Max
Updated answer:
You can use Lookarounds to achieve this.
preg_match_all('/(?<=delimiter_test).*?(?=delimiter_test|$)/s', $string, $matches);
print_r($matches[0]);
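The original pattern stops after one match because the closing delimiter_test is consumed and is no longer available to open the next match. Lookarounds assert the delimiter without consuming it. A minimal sketch using the string from the question; the trim() step and the $blocks name are additions to tidy the surrounding whitespace and are not part of the answer above.

```php
<?php
$string = "
delimiter_test this is a test
this is more data,etc
delimiter_test this is another test
and this is more data
delimiter_test this yet another test
and this is even more data
";

// The lookbehind and lookahead check for the delimiter but leave it unconsumed,
// so the same delimiter can end one match and start the next.
preg_match_all('/(?<=delimiter_test).*?(?=delimiter_test|$)/s', $string, $matches);

// Strip the leading space and trailing newline from each block.
$blocks = array_map('trim', $matches[0]);
print_r($blocks); // three blocks, one per delimiter_test
```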

Pattern for preg_match

I have a string that contains the pattern "[link:activate/$id/$test_code]". I need to extract the word activate, $id and $test_code whenever the [link.....] pattern occurs.
I also tried getting the inner items with grouping, but I only get activate and $test_code; I can't get $id. Please help me get the action name and all the parameters in an array.
Below is my code and output
Code
function match_test()
{
$string = "Sample string contains [link:activate/\$id/\$test_code] again [link:anotheraction/\$key/\$second_param]]] also how the other ationc like [link:action] works";
$pattern = '/\[link:([a-z\_]+)(\/\$[a-z\_]+)+\]/i';
preg_match_all($pattern,$string,$matches);
print_r($matches);
}
Output
Array
(
[0] => Array
(
[0] => [link:activate/$id/$test_code]
[1] => [link:anotheraction/$key/$second_param]
)
[1] => Array
(
[0] => activate
[1] => anotheraction
)
[2] => Array
(
[0] => /$test_code
[1] => /$second_param
)
)
Try this:
$subject = <<<'LOD'
Sample string contains [link:activate/$id/$test_code] again [link:anotheraction/$key/$second_param]]] also how the other ationc like [link:action] works
LOD;
$pattern = '~\[link:([a-z_]+)((?:/\$[a-z_]+)*)]~i';
preg_match_all($pattern, $subject, $matches);
print_r($matches);
If you need $id and $test_code in separate groups, you can use this instead:
$pattern = '~\[link:([a-z_]+)(/\$[a-z_]+)?(/\$[a-z_]+)?]~i';
Is this what you are looking for?
/\[link:([\w\d]+)\/(\$[\w\d]+)\/(\$[\w\d]+)\]/
Edit:
Also the problem with your expression is this part:
(\/\$[a-z\_]+)+
Although you have repeated the group, the match will only return one capture, because it is still only one group declaration: a repeated group keeps only its last iteration. The regex won't invent extra group numbers for you (not that I've ever seen, anyway).
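If the number of parameters varies, one option is to capture the whole parameter list as a single group, as in the first pattern above, and split it afterwards. A minimal sketch; the $links array and its action/params keys are hypothetical names chosen for illustration.

```php
<?php
$subject = 'Sample string contains [link:activate/$id/$test_code] again '
         . '[link:anotheraction/$key/$second_param]]] and [link:action] works';

// Group 1: action name. Group 2: every "/$param" segment as one string.
$pattern = '~\[link:([a-z_]+)((?:/\$[a-z_]+)*)]~i';
preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER);

$links = [];
foreach ($matches as $m) {
    $links[] = [
        'action' => $m[1],
        // Split on "/" and drop empty pieces; yields [] when there are no parameters.
        'params' => preg_split('~/~', $m[2], -1, PREG_SPLIT_NO_EMPTY),
    ];
}
print_r($links);
```

Splitting after the match avoids hard-coding a fixed number of optional groups in the pattern.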

trying to filter string with <br> tags using explode, does not work

I get a string that looks like this
<br>
ACCEPT:YES
<br>
SMMD:tv240245ce
<br>
It is contained in the variable $_SESSION['result'].
I am trying to parse this string and get the following, either in an array or as separate variables:
ACCEPT:YES
tv240245ce
I first tried to explode the string using <br> as the delimiter, and that did not work.
Then I tried:
$yes = explode(":", strip_tags($_SESSION['result']));
echo print_r($yes);
which gives me an array like so
Array ( [0] => ACCEPT [1] => YESSEED [2] => tv240245ce ) 1
which gives me one of my answers.
What would be a good way to achieve this?
Is there a way to get rid of the first and last <br>, then use the remaining one as a delimiter to explode the string?
Or what's the best way to go about this?
This will do it:
$data=preg_split('/\s?<br>\s?/', str_replace('SMMD:','',$data), -1, PREG_SPLIT_NO_EMPTY);
You can also skip caring about the spurious <br> and treat the whole string as key:value format with a simple regex like:
preg_match_all('/^(\w+):(.*)/m', $text, $result, PREG_SET_ORDER);
This requires that you really have line breaks in it, though (the m modifier makes ^ match at the start of every line). It gives you a $result list that is easy to convert into an associative array afterwards:
[0] => Array
(
[0] => ACCEPT:YES
[1] => ACCEPT
[2] => YES
)
[1] => Array
(
[0] => SMMD:tv240245ce
[1] => SMMD
[2] => tv240245ce
)
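The conversion to an associative array can be done with array_column(), which is available from PHP 5.5. A minimal sketch, assuming the input really contains line breaks as shown in the question; the $pairs name is illustrative, and the m modifier is used so that ^ matches at the start of every line.

```php
<?php
$text = "<br>\nACCEPT:YES\n<br>\nSMMD:tv240245ce\n<br>";

// The m modifier lets ^ match at the start of each line; "<br>" lines do not match \w+.
preg_match_all('/^(\w+):(.*)/m', $text, $result, PREG_SET_ORDER);

// Use group 1 as the key and group 2 as the value.
$pairs = array_column($result, 2, 1);
print_r($pairs); // ACCEPT => YES, SMMD => tv240245ce
```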
First, do a str_replace to remove all instances of "SMMD:". Then explode on "<br>\n".
Include the new line character and you should get the array you want:
$mystr = str_replace( 'SMMD:', '', $mystr );
$res_array = explode( "<br>\n", $mystr );

return empty string from preg_split

Right now I'm trying to get this:
Array
(
[0] => hello
[1] =>
[2] => goodbye
)
Where index 1 is the empty string.
$toBeSplit= 'hello,,goodbye';
$textSplitted = preg_split('/[,]+/', $toBeSplit, -1);
$textSplitted looks like this:
Array
(
[0] => hello
[1] => goodbye
)
I'm using PHP 5.3.2
[,]+ matches one or more consecutive commas, as many as possible, so ",," counts as a single separator. Use just /,/ and it works:
$textSplitted = preg_split('/,/', $toBeSplit, -1);
But you don’t even need a regular expression:
$textSplitted = explode(',', $toBeSplit);
How about this:
$textSplitted = preg_split('/,/', $toBeSplit, -1);
Your split regex was grabbing all the commas, not just one.
Your pattern splits the text using a sequence of commas as the separator (the character class is also unnecessary), so two (or two hundred) commas count as just one.
Anyway, since you're just using a literal character as the separator, use explode():
$str = 'hello,,goodbye';
print_r(explode(',', $str));
output:
Array
(
[0] => hello
[1] =>
[2] => goodbye
)
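The three variants discussed above can be compared side by side. A minimal sketch using the string from the question; the variable names $collapsed, $kept and $exploded are illustrative.

```php
<?php
$toBeSplit = 'hello,,goodbye';

// One or more commas form a single separator, so the empty field disappears.
$collapsed = preg_split('/,+/', $toBeSplit);

// Exactly one comma per separator keeps the empty field.
$kept = preg_split('/,/', $toBeSplit);

// explode() does the same without the regex engine.
$exploded = explode(',', $toBeSplit);

var_dump($collapsed === ['hello', 'goodbye']); // bool(true)
var_dump($kept === ['hello', '', 'goodbye']);  // bool(true)
var_dump($kept === $exploded);                 // bool(true)
```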
