newbie php regex issue - php

I have the following code:
<?php
$data="000ffe-fcc9f4 1 000fbe-fccabe";
$pattern='/([0-9A-F]{6})-([0-9A-F]{6})$/i';
echo "the pattern we are using is: ".$pattern."<BR>";
preg_match_all($pattern,$data,$matches, PREG_SET_ORDER );
print_r($matches[0]);
?>
I don't understand why it's not finding both mac addresses as matches.
Here's what the output on the page looks like:
the pattern we are using is: /([0-9A-F]{6})-([0-9A-F]{6})$/i
Array ( [0] => 000fbe-fccabe [1] => 000fbe [2] => fccabe )
I was expecting that element [0] would contain both 000ffe-fcc9f4 and 000fbe-fccabe.
Can you tell me what I'm doing wrong?
Thanks.

It isn't finding both because of the $ at the end of your regex, which anchors the match to the end of the string, so only the last MAC address can match.
Change $pattern to /([0-9A-F]{6})-([0-9A-F]{6})/i and it will match both. Note that with PREG_SET_ORDER, $matches[0] holds only the first match set, so print $matches itself to see both.
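A minimal sketch of the fix, reusing the data from the question. The variable names $count and $set are introduced here for illustration only:

```php
<?php
$data = "000ffe-fcc9f4 1 000fbe-fccabe";

// Same pattern as in the question, minus the trailing $ anchor.
$pattern = '/([0-9A-F]{6})-([0-9A-F]{6})/i';

// preg_match_all() returns the number of full matches found.
$count = preg_match_all($pattern, $data, $matches, PREG_SET_ORDER);
echo $count, "\n";

// With PREG_SET_ORDER, each element of $matches is one complete match:
// index 0 is the full match, 1 and 2 are the two captured halves.
foreach ($matches as $set) {
    echo $set[0], " => ", $set[1], " / ", $set[2], "\n";
}
```

Looping over $matches instead of printing $matches[0] is what makes both addresses show up.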

Related

php preg_match s and m modifiers not working for multiple lines

I have the following input string which consists of multiple lines:
BYTE $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$13,$14,$01,$19,$20,$01,$20,$17,$08,$09,$0C,$05,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66 // comment
BYTE $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
I use the following preg_match statement to match the data part (so only the hexadecimal values) and not the preceding white space and text, nor the trailing white space and comment sections:
preg_match('/(\$.*?) /s', $sFileContents, $aResult);
The output is this:
output: Array
(
[0] => $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$13,$14,$01,$19,$20,$01,$20,$17,$08,$09,$0C,$05,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
[1] => $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$13,$14,$01,$19,$20,$01,$20,$17,$08,$09,$0C,$05,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
)
As you may be able to see, the match appears to be correct but the first input line is repeated twice. The 's' modifier should help me get past the end of line, but I cannot seem to get past the first line.
Does anyone have an idea of how to proceed?
You can easily match the data from all lines:
preg_match_all('/\$[\dA-Fa-f,\$]+/', $sFileContents, $aResult);
echo "<pre>".print_r($aResult,true);
Output:
Array
(
[0] => Array
(
[0] => $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$13,$14,$01,$19,$20,$01,$20,$17,$08,$09,$0C,$05,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
[1] => $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
)
)
You don't need s (DOTALL) flag for this. You can use:
preg_match_all('/(\$[0-9A-Fa-f]{2}(?:,\$[0-9A-Fa-f]{2})+)/', $input, $m);
print_r($m[1]);
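To see why the original attempt returned one line twice, it may help to compare preg_match() and preg_match_all() side by side. The sketch below uses a shortened, made-up version of the input (three bytes per line instead of forty); $one and $all are illustrative names:

```php
<?php
// Shortened stand-in for the file contents in the question.
$sFileContents = <<<'TXT'
BYTE $66,$20,$13 // comment
BYTE $66,$20,$66
TXT;

// preg_match() stops after the first match. Index 0 is the full match
// (including the trailing space), index 1 is the capture group, so the
// same line shows up twice in the result.
preg_match('/(\$.*?) /s', $sFileContents, $one);
print_r($one);

// preg_match_all() keeps scanning and returns one entry per line of data.
preg_match_all('/\$[0-9A-Fa-f]{2}(?:,\$[0-9A-Fa-f]{2})+/', $sFileContents, $all);
print_r($all[0]);
```

The s modifier only changes what the dot matches; it does not make preg_match() continue past its first match.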

Finding values in a string via regex in php

I am trying to get information out of a textarea that contains certain strings (e.g. [name]) and find each item encased in the square brackets using regex patterns (currently tried using preg_match, preg_split, preg_quote, preg_match_all). It seems that the problem is in my regex pattern that I am providing for it.
My current regex:
$menuItems = preg_match_all('/[^[][([^[].*)]/U', $_SESSION['emailBody'], $menuItems);
I have tried many other patterns e.g.
/(?[...]\w+): (?[...]\d+)/
Any help that can be provided with this is greatly appreciated.
EDIT:
Sample input:
[email] address [to] name [from] someone
Message displayed on var_dump of the $menuItems variable:
array(1) { [0]=> string(0) "" }
EDIT 2:
Thank you to everyone for the help and support with this, I am pleased to say that it is all up and running perfectly!
From the comment stream above, you can simplify the regular expression as follows:
preg_match_all('/\[(.*)\]/U', $_SESSION['emailBody'], $menuItems);
One thing to note:
preg_match_all() fills the array in its 3rd parameter with the results of the matches. Your example line then overwrites this array with the result of preg_match_all() (an integer).
You should then be able to iterate over the results by using the following loop:
foreach ($menuItems[1] as $menuItem) {
// ...
}
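The difference between the return value and the third parameter can be sketched as follows. The $body variable is a stand-in for $_SESSION['emailBody'], filled with the sample input from the question, and $count is an illustrative name:

```php
<?php
// Stand-in for $_SESSION['emailBody'], using the sample input.
$body = '[email] address [to] name [from] someone';

// The return value is the number of matches; the matches themselves
// are written into the third argument.
$count = preg_match_all('/\[(.*)\]/U', $body, $menuItems);
var_dump($count);

// $menuItems[0] holds the full matches (with brackets),
// $menuItems[1] holds the first capture group (without brackets).
foreach ($menuItems[1] as $menuItem) {
    echo $menuItem, "\n";
}
```

Keeping the count in a separate variable avoids overwriting the array of matches with an integer.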
Escape the square brackets and remove the dot:
$menuItems = preg_match_all('/[^[]\[([^[]*)\]/U', $_SESSION['emailBody'], $menuItems);
// here __^ __^ ^
preg_match_all doesn't return a string. You have to add an array for the last parameter:
preg_match_all('/\[([^[\]]*)\]/U', $_SESSION['emailBody'], $matches);
The matches are in the array $matches
print_r($matches);
Working example:
$str = '[email] address [to] name [from] someone';
preg_match_all('/\[([^[\]]*)\]/U', $str, $matches);
print_r($matches);
Output:
Array
(
[0] => Array
(
[0] => [email]
[1] => [to]
[2] => [from]
)
[1] => Array
(
[0] => email
[1] => to
[2] => from
)
)
Here is a simple solution. This regex will capture all items encased in brackets along with brackets as well.
If you don't want brackets in result change regex to $regex = "/(?:\\[(\\w+)\\])/mi";
$subject = "[email] address [to] name [from] someone";
$regex = "/(\\[\\w+\\])/mi";
$matches = array();
preg_match_all($regex, $subject, $matches);
print_r($matches);

Preg_match multiple instances reuse delimiter

I've revised the question as I did not explain correctly the first time.
Can someone please help me with this regex? I can't seem to figure out how to use the same delimiter as the end of one match and then reuse it as the start of the next.
In the following code I'm trying to match everything in between each delimiter_test statement.
$string = "
delimiter_test this is a test
this is more data,etc
delimiter_test this is another test
and this is more data
delimiter_test this yet another test
and this is even more data
";
Here is the regex I've tried:
preg_match_all('/delimiter_test(.*?)delimiter_test/s', $string, $matches);
And here are my results:
Array
(
[0] => Array
(
[0] => delimiter_test this is a test
this is more data,etc
delimiter_test
)
[1] => Array
(
[0] => this is a test
this is more data,etc
)
)
So it only gets what is between the first and second 'delimiter_test'.
Hopefully that makes sense.
Thanks, Max
Updated answer:
You can use Lookarounds to achieve this.
preg_match_all('/(?<=delimiter_test).*?(?=delimiter_test|$)/s', $string, $matches);
print_r($matches[0]);
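If the lookarounds are not a requirement, splitting on the delimiter may be a simpler route to the same chunks. This is a different technique from the one above (preg_split() rather than preg_match_all()), sketched with the string from the question; $chunks is an illustrative name:

```php
<?php
$string = "
delimiter_test this is a test
this is more data,etc
delimiter_test this is another test
and this is more data
delimiter_test this yet another test
and this is even more data
";

// Split on the delimiter; each piece is the text between two delimiters
// (or between the last delimiter and the end of the string).
$chunks = preg_split('/delimiter_test/', $string);

// Trim surrounding whitespace and drop the empty piece that precedes
// the first delimiter.
$chunks = array_values(array_filter(array_map('trim', $chunks), 'strlen'));

print_r($chunks);
```

Because splitting consumes each delimiter only once, the problem of one match "using up" the delimiter that the next match needs does not arise.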

Regex in PHP not working

My regex is:
$regex = '/(?<=Α: )(([\w-\.]+)#((?:[\w]+\.)+)([a-zA-Z]{2,4}))/';
My content among others is:
Q: Email Address
A: name#example.com
Rad Software Regular Expression Designer says that it should work.
Various online sites return the correct results.
If I remove the (?<=Α: ) lookbehind the regex returns all emails correctly.
When I run it from php it returns no matches.
What's going on?
I've also used the specific type of regex (ie (?<=Email: ) with different content. It works just fine in that case.
You are most likely not using the DOTALL flag s here, which makes the dot match newlines as well in your regex:
$str = <<< EOF
Q: Email Address
A: name#example.com
EOF;
if (preg_match_all('/(?<=A: )(([\w-\.]+)#((?:[\w]+\.)+)([a-zA-Z]{2,4}))/s',
$str, $arr))
print_r($arr);
OUTPUT:
Array
(
[0] => Array
(
[0] => name#example.com
)
[1] => Array
(
[0] => name#example.com
)
[2] => Array
(
[0] => name
)
[3] => Array
(
[0] => example.
)
[4] => Array
(
[0] => com
)
)
This is my newer monster script for verifying whether an e-mail "validates" or not. You can feed it strange things and break it, but in production this handles 99.99999999% of the problems I've encountered. A lot more false positives really from typos.
<?php
$pattern = '!^[^#\s]+#[^.#\s]+\.[^#\s]+$!';
$examples = array(
'email#email.com',
'my.email#email.com',
'e.mail.more#email.co.uk',
'bad.email#..email.com',
'bad.email#google',
'#google.com',
'my#email#my.com',
'my email#my.com',
);
foreach($examples as $test_mail){
if(preg_match($pattern,$test_mail)){
echo ("$test_mail - passes\n");
} else {
echo ("$test_mail - fails\n");
}
}
?>
Output
email#email.com - passes
my.email#email.com - passes
e.mail.more#email.co.uk - passes
bad.email#..email.com - fails
bad.email#google - fails
#google.com - fails
my#email#my.com - fails
my email#my.com - fails
Unless there's a reason for the look-behind, you can match all of the emails in the string with preg_match_all(). Since you're working with a longer string rather than a single address, you would modify the regex slightly:
$string_only_pattern = '!\s([^#\s]+#[^.#\s]+\.[^#\s]+)\s!s';
$mystring = '
email#email.com - passes
my.email#email.com - passes
e.mail.more#email.co.uk - passes
bad.email#..email.com - fails
bad.email#google - fails
#google.com - fails
my#email#my.com - fails
my email#my.com - fails
';
preg_match_all($string_only_pattern,$mystring,$matches);
print_r ($matches[1]);
Output from string only
Array
(
[0] => email#email.com
[1] => my.email#email.com
[2] => e.mail.more#email.co.uk
[3] => email#my.com
)
The problem is that your regular expression contains Α (the Greek capital letter alpha), but the content contains the Latin letter A. They look the same but are different characters, so the lookbehind doesn't match.
I changed the regex to:
$regex = '/(?<=A: )(([\w-\.]+)#((?:[\w]+\.)+)([a-zA-Z]{2,4}))/';
and it works.
Outside of your regex issue itself, you should really consider not trying to write your own e-mail address regex parser. See the Stack Overflow post "Using a regular expression to validate an email address" for why. The upshot: the RFC is long and demanding on your regex abilities.
The A char in your subject is the "normal" char with the code 65 (Unicode or ASCII). But the Α you use in the lookbehind of your pattern has the code 913 (Unicode). They look similar but are different.
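One way to make the difference visible is to dump the bytes of both characters and run the lookbehind with each. This is a sketch that assumes PHP 7 or later for the "\u{...}" escape; the email part of the pattern is simplified, and the variable names are illustrative:

```php
<?php
$latin = 'A';          // U+0041, code 65
$greek = "\u{0391}";   // U+0391, code 913 (Greek capital alpha)

// The two strings are not equal, and their UTF-8 bytes differ.
var_dump($latin === $greek);
echo bin2hex($latin), "\n";
echo bin2hex($greek), "\n";

$content = "Q: Email Address\nA: name#example.com";

// Simplified version of the pattern from the question, once per letter.
$withGreek = '/(?<=' . $greek . ': )[\w.-]+#(?:\w+\.)+[a-zA-Z]{2,4}/';
$withLatin = '/(?<=A: )[\w.-]+#(?:\w+\.)+[a-zA-Z]{2,4}/';

$greekCount = preg_match($withGreek, $content, $m1);
$latinCount = preg_match($withLatin, $content, $m2);

var_dump($greekCount, $latinCount);
```

Retyping the letter from the keyboard, rather than pasting it, is usually enough to get rid of the look-alike character.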

Why isn't this regular expression returning anything?

preg_match_all('/<p>.*:</p>/gm', $content, $matches);
var_dump($matches); // output is NULL
I want something like this: <p>The Ideal Candidate:</p> to match but not <p>Some more text</p>. The requirement is that it must contain a <p> tag followed by some text, and at the end it must contain a : followed by the closing p tag (</p>).
Note: I tried escaping the ending p tag, but it is still not working.
Updated code:
preg_match_all('/<p>.*:<\/p>/gm', "<p>The Ideal Candidate:</p>", $matches);
Escape the / or use another delimiter
/<p>(.*?:)<\/p>/m
or
#<p>(.*?:)</p>#m
Tested:
preg_match_all('#<p>(.*?:)</p>#m', "<p>The Ideal Candidate:</p>", $m);
print_r($m);
output:
Array
(
[0] => Array
(
[0] => <p>The Ideal Candidate:</p>
)
[1] => Array
(
[0] => The Ideal Candidate:
)
)
g is not a valid modifier, see PHP: Pattern modifiers. You should pay close attention to the warnings PHP issues. When running
preg_match_all("/<p>.*:<\/p>/gm", "<p>The Ideal Candidate:</p>", $matches);
print_r($matches);
I get
Warning: preg_match_all(): Unknown modifier 'g' in Command line code on line 1
Whereas the same line without the g modifier yields
Array
(
[0] => Array
(
[0] => <p>The Ideal Candidate:</p>
)
)
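The g modifier is not needed because preg_match_all() already returns every match. A small sketch with made-up input containing three paragraphs ($html and $count are illustrative names):

```php
<?php
// Hypothetical input: two headings ending in a colon and one plain paragraph.
$html = "<p>The Ideal Candidate:</p>\n<p>Some more text</p>\n<p>Requirements:</p>";

// # is used as the delimiter so that the / in </p> needs no escaping.
// No g modifier: preg_match_all() finds all matches on its own.
$count = preg_match_all('#<p>(.*?:)</p>#', $html, $m);

echo $count, "\n";
print_r($m[1]);
```

The paragraph without a trailing colon is skipped because the dot does not match newlines by default, so a match cannot run on into the next line.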
