PHP regular expression to match multiple email ID formats


I am trying to obtain the local part of an email ID using regex.
The challenge here is that the local part comes in two different formats and I need to figure out which format I'm reading and prepare the alternate form of that email ID. As always the snippet of my code that does this is pasted below.
$testarray = array("user1#gmail.com", "user2.tp#gmail.com", "user3#gmail.com", "user4.tp#gmail.com", "user5.tp#gmail.com");
foreach($testarray as $emailID) {
preg_match("/([\w\d]*)\.([\w\d]*)#gmail.com/", $emailID, $match);
if ($match[2] == "tp") {
$altform = $match[1] . "#gmail.com";
} else {
$altform = $match[1] . ".tp#gmail.com";
}
error_log("ALTERNATE FORM OF $emailID IS $altform");
}
The problem I'm facing is that I'm not getting the desired result: neither $match[1] nor $match[2] matches anything for "user1#gmail.com".

You need to use an optional group around the dot + word chars subpattern, and then check if the group matched after executing the search:
foreach($testarray as $emailID) {
$altform = "";
if (preg_match("/(\w+)(?:\.(\w+))?#gmail\.com/", $emailID, $match))
{
if (!empty($match[2]) && $match[2] == "tp") {
$altform = $match[1] . "#gmail.com";
} else {
$altform = $match[1] . ".tp#gmail.com";
}
}
print_r("ALTERNATE FORM OF $emailID IS $altform\n");
}
See the online PHP demo.
Notes on the pattern:
(\w+) - Capturing group 1: one or more word chars
(?:\.(\w+))? - 1 or 0 occurrences (due to the ? quantifier) of:
    \. - a dot
    (\w+) - Capturing group 2: one or more word chars
#gmail\.com - the literal string #gmail.com (note the . is escaped to match a literal dot).
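The optional-group idea can also be wrapped in a small function. This is only a sketch: the name alternateForm() is hypothetical, and the ^ and $ anchors are an assumption added here (the pattern above is unanchored) so that strings with extra text around the address are rejected.

```php
<?php
// Hypothetical helper, not part of the original answer.
// Returns the alternate form of the address, or null if it does not match.
function alternateForm(string $emailID): ?string
{
    // Same pattern as above, plus ^ and $ anchors (an assumption).
    if (!preg_match('/^(\w+)(?:\.(\w+))?#gmail\.com$/', $emailID, $match)) {
        return null;
    }
    // Group 2 is absent from $match when the optional part did not match.
    $suffix = $match[2] ?? '';
    return $suffix === 'tp'
        ? $match[1] . '#gmail.com'
        : $match[1] . '.tp#gmail.com';
}
```

Checking `$match[2] ?? ''` avoids an undefined-index notice when the optional group did not take part in the match.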

Related

php - find match in string

I'm trying to work out how to find a match in a string.
I'm looking for a match on any of the following - = ? [] ~ # ! 0-9 A-Z a-z and I need to know what it matched on.
Eg: matched on !, or matched on = and # and ?
Normally I'd use this:
$a = 'How are you?';
if (strpos($a, 'are') !== false) {
echo 'true';
}
However I'm not sure how to do that so it looks up the characters needed.
Also, where I have [], it could be [] or [xxxx], where xxxx is any number of alphanumeric characters.
I need to match any of the characters listed and return them so I know what was matched; if the [] contains a value, return that as well.
Eg:
$a = 'DeviceLocation[West12]';
Would return: $match = '[]'; $match_val= 'West12';
$a = '#=Device';
Would return:$match = '#,=';
$a= '?[1234]=#Martin';
Would return: $match = '?, [], =, #'; $match_val= '1234';
Can any one advise how I can do this.
Thanks

Well, the requirements are a bit vague, but this is what I deduced:
1) if there is an alphanumeric string inside square brackets get it as a separate value
2) all other mentioned chars should be matched one by one and then imploded.
You may use the following regex to get the values you need (with / as the delimiter, since # appears inside the pattern and would otherwise end it early):
$re = '/\[([a-zA-Z0-9]+)\]|[][=?~#!]/';
Details:
\[ - a [
([a-zA-Z0-9]+) - Group 1 value capturing 1 or more alphanumeric symbols
\] - a closing ]
| - or
[][=?~#!] - a single char, one of the defined chars in the set.
See the regex demo. The most important part here is the code that gets the matches (feel free to adapt):
$re = '/\[([a-zA-Z0-9]+)\]|[][=?~#!]/';
$strs = array('DeviceLocation[West12]', '#=Device', '?[1234]=#Martin');
foreach ($strs as $str) {
preg_match_all($re, $str, $matches, PREG_SET_ORDER);
$results = array();
$match_val = "";
foreach ($matches as $m) {
if (!empty($m[1])) {
$match_val = trim($m[1], "[]");
array_push($results, "[]");
} else {
array_push($results, $m[0]);
}
}
echo "Value: " . $match_val . "\n";
echo "Symbols: " . implode(", ", $results);
echo "\n-----\n";
}
See the PHP demo
Output:
Value: West12
Symbols: []
-----
Value:
Symbols: #, =
-----
Value: 1234
Symbols: ?, [], =, #
-----
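The question also mentions that the brackets may be empty ([]). The following is a minimal sketch of one way to cover that case, not the answer's own code: the function name findSymbols() and the returned array shape are assumptions, and the quantifier is changed from + to * so that an empty pair is reported as a single "[]" symbol. Unpaired brackets are ignored by this variant.

```php
<?php
// Hypothetical helper, not part of the original answer.
function findSymbols(string $str): array
{
    // Bracket pair with optional alphanumeric content, or one listed symbol.
    $re = '/\[([a-zA-Z0-9]*)\]|[=?~#!]/';
    preg_match_all($re, $str, $matches, PREG_SET_ORDER);

    $symbols = [];
    $value = '';
    foreach ($matches as $m) {
        if ($m[0][0] === '[') {
            // A bracket pair was matched; keep its content if there is any.
            $symbols[] = '[]';
            if (isset($m[1]) && $m[1] !== '') {
                $value = $m[1];
            }
        } else {
            $symbols[] = $m[0];
        }
    }
    return ['symbols' => $symbols, 'value' => $value];
}
```

If a string contains several bracket pairs, this sketch keeps only the last non-empty value.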

Please use regular expressions, e.g. with preg_match.

Try this
It will match the string in []
preg_match_all("/\[([^\]]*)\]/", $text, $matches);
And this will match string after ? and #=
preg_match_all("/^#=(\S+)|\?(.*)/", $text, $matches);
var_dump($matches);

You need regular expressions to check for any text inside another with different properties, here is a simple tutorial on that link.

Regular Expression not working in PHP

How can I check the line below with a regular expression?
[albums album_id='41']
Everything is static except the album_id, which may be 41 or any other number.
Below is the code I have tried, but it is not working:
$str = "[albums album_id='41']";
$regex = '/^[albums album_id=\'[0-9]\']$/';
if (preg_match($regex, $str)) {
echo $str . " is a valid album ID.";
} else {
echo $str . " is an invalid album ID. Please try again.";
}
Thank you

You need to escape the first [ and add a + quantifier to [0-9]. Left unescaped, the first [ opens a character class, [albums album_id=\'[0-9], which is not what you intended.
Use
$regex = '/^\[albums album_id=\'[0-9]+\']$/';
Pattern details:
^ - start of string
\[ - a literal [
albums album_id=\' - a literal string albums album_id='
[0-9]+ - one or more digits (thanks to the + quantifier; if there may be no digits here, use the * quantifier)
\'] - a literal string ']
$ - end of string.
See PHP demo:
$str = "[albums album_id='41']";
$regex = '/^\[albums album_id=\'[0-9]+\']$/';
if (preg_match($regex, $str)) {
echo $str . " is a valid album ID.";
} else {
echo $str . " is an invalid album ID. Please try again.";
}
// => [albums album_id='41'] is a valid album ID.
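If the numeric ID itself is needed and not just a yes/no answer, a capturing group around [0-9]+ can return it. This is a sketch under assumptions: the function name albumId() is hypothetical, and the closing ] is escaped here only for readability.

```php
<?php
// Hypothetical helper, not part of the original answer.
// Returns the album ID as an integer, or null if the string is not valid.
function albumId(string $str): ?int
{
    // Same pattern as above, with the digits wrapped in a capturing group.
    if (preg_match('/^\[albums album_id=\'([0-9]+)\'\]$/', $str, $m)) {
        return (int) $m[1];
    }
    return null;
}
```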

You have errors in your regex: the first [ must be escaped, and [0-9] needs a quantifier. Use this:
$regex = '/^\[albums album_id=\'[0-9]+\']$/';
The + after [0-9] means one or more digits between 0 and 9 (you can put * instead if you want zero or more).
To test your regex before using it in your code, you can use regex101.

php regex to separate out characters stuck to left, right or in the middle

Looking for a php regex that will allow me to separate out certain characters from words (if they're sticking to the left or right of the word, or even anywhere within the word).
For example,
hello. -> hello .
.hello -> . hello
hello.hello -> hello . hello
I have the below code but it won't work for all cases. Please note that $value could be '.', '?', or any character.
$regex = "/(?<=\S)\\" . $value . "|\\" . $value . "(?=\S)/";
$this->str = preg_replace_callback($regex, function($word) {
return ' ' . $word[0];
}, $this->str);
Also, please help with specifying the part where I can turn on (or off) the 3rd condition.
[UPDATE]
I think there might be confusion about the exact requirements, so let me be more specific. I want a regex that will help me separate out certain characters which are at either the end or the beginning of a group of text. What is a group of text? It can be any length (>=1) and contain any characters, but it must begin with a-z or 0-9. Again, it would be nice if this aspect were highlighted in the solution, so that the group of text could later be allowed to begin and end with more characters (not just a-z or 0-9).
$character = '.', string is ".hello.world." => ". hello.world ."
$character = '.', string is ".1ello.worl2." => ". 1ello.worl2 ."
$character = '.', string is ".?1ello.worl2." => ".?1ello.worl2 ."
$character = '.', string is "4/5.5" => "4/5.5"
$character = '.', string is "4.?1+/5" => "4.?1+/5"
$character = '.', string is ".4/5.5." => ". 4/5.5 ."
$character = '/', string is ".hello?.world/" => ".hello?.world /"
$character = '/', string is ".hello?.worl9/" => ".hello?.worl9 /"
Hope it's clearer now.

You can use 3 alternatives each captured into its own capture group, and use a preg_replace_callback to apply the corresponding replacement:
$wrd = ".";
$re = '~(?<=\S)(' . preg_quote($wrd, '~') . ')(?=\S)|(?<=\S)(' . preg_quote($wrd, '~') . ')|(' . preg_quote($wrd, '~') . ')(?=\S)~';
$str = "hello.\n.hello\nhello.hello";
$result = preg_replace_callback($re, function($m) {
if (!empty($m[1])) {
return " " . $m[1] . " ";
} else if (!empty($m[2])) {
return " " . $m[2];
} else return $m[3] . " ";
}, $str);
echo $result;
See the IDEONE demo
The regex will be
(?<=\S)(\.)(?=\S)|(?<=\S)(\.)|(\.)(?=\S)
       | 1|              | 2| | 3|
See regex demo
The first group is your Case 3 (hello.hello -> hello . hello), the second group is your Case 1 (hello. -> hello .) and the third group signals your Case 2 (.hello -> . hello).
UPDATE (handling exceptions)
If you have exceptions, you can add more capturing groups. E.g., you want to protect the dot in float numbers. Add a (\d\.\d) alternative, and check inside the callback function if it is not empty. If not, just restore it with return $m[n]:
$wrd = ".";
$re = '~(\d\.\d)|(?<=\S)(' . preg_quote($wrd, '~') . ')(?=\S)|(?<=\S)(' . preg_quote($wrd, '~') . ')|(' . preg_quote($wrd, '~') . ')(?=\S)~';
$str = "hello.\n.hello\nhello.hello\nhello. 3.5/5\nhello.3\na./b";
$result = preg_replace_callback($re, function($m) {
if ( !empty($m[1])) { // The dot as a decimal separator
return $m[1]; // No space is inserted
}
else if (!empty($m[2])) { // A special char is enclosed with non-spaces
return " " . $m[2] . " "; // Add spaces around
} else if (!empty($m[3])) { // A special char is preceded by a non-space only
return " " . $m[3]; // Add a space before the special char
} else return $m[4] . " "; // A special char is followed with a non-space, add a space on the right
}, $str);
echo $result;
See an updated code demo
Another code demo, based on matching the positions before and after the . that are not adjacent to whitespace (and protecting float values); it builds on @bobblebubble's solution (since deleted):
$wrd = ".";
$re = '~(\d\.\d)|(?<!\s)(?=' . preg_quote($wrd, '~') . ')|(?<=' . preg_quote($wrd, '~') . ')(?!\s)~';
$str = "hello.\n.hello\nhello.hello\nhello. 3.5/5\nhello.3\na./b";
$result = preg_replace_callback($re, function($m) {
if ( !empty($m[1])) { // The dot as a decimal separator
return $m[1]; // No space is inserted
}
else return " "; // Just insert a space
}, $str);
echo $result;
SUMMARY:
You cannot use \b since your . / ? etc. can appear in mixed "word" and "non-word" contexts
You need to use capturing and preg_replace_callback since there are different replacement schemes
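The question also asks how to switch the third condition (a character in the middle of a word) on or off. One possible sketch, assuming a hypothetical function name separateChar() and a hypothetical boolean parameter $splitInside, reuses the three-alternative pattern from the first snippet and simply returns the match unchanged when the toggle is off:

```php
<?php
// Hypothetical helper, not part of the original answer.
function separateChar(string $str, string $char, bool $splitInside = true): string
{
    $q = preg_quote($char, '~'); // also escape the ~ delimiter
    $re = '~(?<=\S)(' . $q . ')(?=\S)|(?<=\S)(' . $q . ')|(' . $q . ')(?=\S)~';

    return preg_replace_callback($re, function ($m) use ($splitInside) {
        // Compare against '' instead of using empty(), so "0" also works.
        if (isset($m[1]) && $m[1] !== '') {
            // Character enclosed by non-spaces: split only if enabled.
            return $splitInside ? ' ' . $m[1] . ' ' : $m[1];
        }
        if (isset($m[2]) && $m[2] !== '') {
            return ' ' . $m[2];   // character at the end of a word
        }
        return $m[3] . ' ';       // character at the start of a word
    }, $str);
}
```

For example, separateChar("hello.hello", ".", false) leaves the string untouched, while the default call inserts spaces on both sides of the dot.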

You can use a regex based on word boundaries.
\b(?=\.(?!\S))|(?<=(?<!\S)\.)\b
This matches the zero-width boundary between a word and a literal dot, provided the dot is not followed by a non-whitespace character (\S) on one side, or not preceded by one on the other; lookarounds do the checking.
See demo at regex101. Use in a PHP function with value parameter and replace with space.
// $v = character
function my_func($str, $v=".")
{
$v = preg_quote($v, '/');
return preg_replace('/\b(?='.$v.'(?!\S))|(?<=(?<!\S)'.$v.')\b/', " ", $str);
}
PHP demo at eval.in

From what I understand the . can be any non-word character. If that's the case, try this:
$patron = '/(\W+)/';
$this->str = trim(preg_replace($patron, ' $1 ', $this->str));

(\s?[.]\s?)
If you use the above regex, you can simply replace all the matches with " . "
How it works:
I used \s? to capture a leading and trailing whitespace, if there is any.
[.] is a char class, so you should add all of the "certain characters" you want to find.
A regex that catches the first 2 conditions and never the third is (\s[.]\s?|\s?[.]\s). (Again, you'll need to replace the capture with " . ", and also add your "certain characters" to the char classes.)
You can then choose which regex you will use.

preg_match: can't find substring which has trailing special characters

I have a function which uses preg_match to check whether a substring is in another string.
Today I realized that if the substring has trailing special characters, such as special regular expression characters (. \ + * ? [ ^ ] $ ( ) { } = ! < > | : -) or #, my preg_match can't find the substring even though it is there.
This works, returns "A match was found."
$find = "website scripting";
$string = "PHP is the website scripting language of choice.";
if (preg_match("/\b" . $find . "\b/i", $string)) {
echo "A match was found.";
} else {
echo "A match was not found.";
}
But this doesn't, returns "A match was not found."
$find = "website scripting #";
$string = "PHP is the website scripting # language of choice.";
if (preg_match("/\b" . $find . "\b/i", $string)) {
echo "A match was found.";
} else {
echo "A match was not found.";
}
I have tried preg_quote, but it doesn't help.
Thank you for any suggestions!
Edit: Word boundary is required, that's why I use \b. I don't want to find "phone" in "smartphone".

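The trailing \b fails because a word boundary needs a word character on one side, and # followed by a space has none. You can instead check that the characters around the search word are not word characters, using look-arounds: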
$find = "website scripting #";
$string = "PHP is the website scripting # language of choice.";
if (preg_match("/(?<!\\w)" . preg_quote($find, '/') . "(?!\\w)/i", $string)) {
echo "A match was found.";
} else {
echo "A match was not found.";
}
See IDEONE demo
Result: A match was found.
Note the double backslash used with \w in (?<!\\w) and (?!\\w): inside a double-quoted PHP string it is safest to escape the backslash so that a literal \w reaches the regex engine.
The preg_quote function is necessary as the search word - from what I see - can have special characters, and some of them must be escaped if intended to be matched as literal characters.
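The look-around check can be packaged as a reusable function. This is only a sketch; the name containsPhrase() is hypothetical and not from the original answer.

```php
<?php
// Hypothetical helper, not part of the original answer.
// True if $needle occurs in $haystack and is not glued to word characters.
function containsPhrase(string $haystack, string $needle): bool
{
    // Single quotes keep \w literal; preg_quote also escapes the / delimiter.
    $pattern = '/(?<!\w)' . preg_quote($needle, '/') . '(?!\w)/i';
    return preg_match($pattern, $haystack) === 1;
}
```

With this, "phone" is found in "my phone rings" but not in "I have a smartphone", which is the behaviour the question asks for.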
UPDATE
There is a way to build a regex with smartly placed word boundaries around the keyword, but the performance will be worse compared with the approach above. Here is sample code:
$string = "PHP is the website scripting # language of choice.";
$find = "website scripting #";
$find = preg_quote($find, '/'); // also escape the / delimiter used below
if (preg_match('/\w$/u', $find)) { // Setting trailing word boundary
$find .= '\\b';
}
if (preg_match('/^\w/u', $find)) { // Setting leading word boundary
$find = '\\b' . $find;
}
if (preg_match("/" . $find . "/ui", $string)) {
echo "A match was found.";
} else {
echo "A match was not found.";
}
See another IDEONE demo

If you only need to find one string inside another, you can use strpos().
Ex.
<?php
$find = "website scripting";
$string = "PHP is the website scripting language of choice.";
if (strpos($string,$find) !== false) {
echo 'true';
} else {
echo 'false';
}

How do i parse a Google Voice email address with PHP/Regex?

I would like to extract 16197226146 from the following string using PHP:
"(480) 710-6186" <18583894531.16197226146.S7KH51hwhM#txt.voice.google.com>
Could someone please help me with the regex?

<\d*?\.(\d+)
< Match "<"
\d Match digits
* 0 or more times
? Lazy, take as little as possible
\. Match a "."
( Capture
\d Match digits
+ 1 or more times
) Stop capturing
That matches the second number, the one after the first dot. The match is in group 1.
if (preg_match("/<\d*?\.(\d+)/", $subject, $regs)) {
$result = $regs[1];
} else {
$result = "";
}
You can play with the regex here.

You could use explode.
$value = '"(480) 710-6186" <18583894531.16197226146.S7KH51hwhM#txt.voice.google.com>';
$result = explode('.', $value);
echo $result[1]; // is 16197226146
