I want to test a string to see whether it contains certain words.
For example:
$string = "The rain in spain is certain as the dry on the plain is over and it is not clear";
preg_match('`\brain\b`',$string);
But that method only matches one word. How do I check for multiple words?
Something like:
preg_match_all('#\b(rain|dry|clear)\b#', $string, $matches);
preg_match('~\b(rain|dry|certain|clear)\b~i',$string);
You can use the pipe character (|) as an "or" in a regex.
If you just need to know whether any of the words is present, use preg_match as above. If you need to match all the occurrences of any of the words, use preg_match_all:
preg_match_all('~\b(rain|dry|certain|clear)\b~i', $string, $matches);
Then check the $matches variable.
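To make the difference between the two calls concrete, here is a minimal sketch using the sample string from the question. The variable names $pattern, $found and $count are only illustrative.

```php
<?php
$string = "The rain in spain is certain as the dry on the plain is over and it is not clear";
$pattern = '~\b(rain|dry|certain|clear)\b~i';

// preg_match() stops at the first hit: 1 if found, 0 if not (false on error).
$found = preg_match($pattern, $string);

// preg_match_all() returns the number of hits and fills $matches.
$count = preg_match_all($pattern, $string, $matches);

// $matches[0] holds the full matches in order of appearance:
// rain, certain, dry, clear
print_r($matches[0]);
```

$matches[1] holds the text captured by the parenthesised group, which here is the same list of words.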
http://php.net/manual/en/function.preg-match.php
"Do not use preg_match() if you only want to check if one string is contained in another string. Use strpos() or strstr() instead as they will be faster."
preg_match('/\brain\b/', $string, $matches);
var_dump($matches);
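One caveat with the strpos() advice: it looks for a substring, not a whole word. The sketch below uses a made-up sample string ("laundry day") to show where the two approaches disagree.

```php
<?php
// Made-up sample text, chosen because "dry" appears only inside another word.
$text = "laundry day";

// strpos() returns the offset of the substring, or false if it is absent.
// Compare with !== false, because a hit at offset 0 would be falsy.
$hasSubstring = strpos($text, 'dry') !== false;

// \b requires "dry" to stand alone as a word.
$hasWord = preg_match('/\bdry\b/', $text) === 1;

var_dump($hasSubstring); // bool(true)
var_dump($hasWord);      // bool(false)
```

So strpos() is the faster choice for plain containment, while the \b regex is still needed when only whole words should count.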
Related
The first question is this:
I am using http://www.phpliveregex.com/ to check that my regex is right, and there it finds more than one matching line.
I am doing this regex:
$lines = explode('\n', $text);
foreach($lines as $line) {
$matches = [];
preg_match("/[0-9]+[A-Z][a-z]+ [A-Z][a-z]+S[0-9]+\-[0-9]+T[0-9]+/uim", $line, $matches);
print_r($matches);
}
on the $text which looks like this: http://pastebin.com/9UQ5wNRu
The problem is that printed matches is only one match:
Array
(
[0] => 3Bajus StanislavS2415079249-2615T01
)
Why is this happening? Any ideas what could fix the problem?
The second question
Maybe you've noticed the non-ASCII alphabetic characters of the Slovak language inside the text (from pastebin). How can I match those characters and select the users which have this format:
{number}{first_name}{space}{last_name}{id_number}
How can I do that?
OK, the first issue is fixed. Thank you @chris85. I should have used preg_match_all and run it on the whole text. Now I get an array of all students who have only non-Slovak (English) letters in their names.
preg_match is for one match. You need to use preg_match_all for a global search.
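A small sketch of why the loop in the question printed a single match. The sample text "a1\nb2\nc3" is made up for illustration. Note also that in PHP a single-quoted '\n' is a literal backslash followed by n, not a newline, so explode('\n', ...) does not split the text into lines.

```php
<?php
// Made-up three-line sample; the double quotes make \n a real newline.
$text = "a1\nb2\nc3";

// Single-quoted '\n' is two characters (backslash, n), so nothing is split.
$notSplit = explode('\n', $text);
// Double-quoted "\n" is a newline, so this yields three lines.
$split = explode("\n", $text);

// preg_match() stops after the first match...
preg_match('/[a-z][0-9]/', $text, $first);
// ...while preg_match_all() collects every match in the subject.
preg_match_all('/[a-z][0-9]/', $text, $all);

var_dump(count($notSplit), count($split)); // int(1) int(3)
print_r($first[0]); // a1
print_r($all[0]);   // a1, b2, c3
```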
[A-Z] does not include any characters outside that range. Since you are using the i modifier, that character class is effectively [A-Za-z], which may or may not be what you want. You can use \p{L} in its place to match letters from any language.
Demo: https://regex101.com/r/L5g3C9/1
So your PHP code would just be:
preg_match_all("/^[0-9]+\p{L}+ \p{L}+S[0-9]+\-[0-9]+T[0-9]+$/uim", $text, $matches);
print_r($matches);
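Here is a hedged sketch of the effect of \p{L}. The first record is the one quoted in the question; the second record and the "not a record" line are invented for illustration, and the file is assumed to be saved as UTF-8.

```php
<?php
// Line 1 comes from the question; lines 2 and 3 are hypothetical test data.
$text = "3Bajus StanislavS2415079249-2615T01\n"
      . "4Kováč ĽubošS1234567890-1234T02\n"
      . "not a record";

// ASCII-only letter classes miss the name containing Slovak letters.
$ascii = '/^[0-9]+[a-z]+ [a-z]+S[0-9]+\-[0-9]+T[0-9]+$/uim';
// \p{L} matches any Unicode letter; it relies on the u modifier.
$unicode = '/^[0-9]+\p{L}+ \p{L}+S[0-9]+\-[0-9]+T[0-9]+$/uim';

$asciiCount = preg_match_all($ascii, $text, $asciiMatches);
$unicodeCount = preg_match_all($unicode, $text, $unicodeMatches);

var_dump($asciiCount);   // int(1)
var_dump($unicodeCount); // int(2)
```

The m modifier lets ^ and $ anchor at each line, which is what allows the pattern to run on the whole text instead of line by line.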
You can also use T-Regx library:
pattern("^[0-9]+\p{L}+ \p{L}+S[0-9]+\-[0-9]+T[0-9]+$", 'uim')->match($text)->all();
I am trying to extract the individual numbers that are run together in this string.
110.0046102.005699.0008103.0104....
Each number should end 4 digits after the dot (point/period):
110.0046
102.0056
99.0008
103.0104
I was wondering if this is possible with a regular expression or if I should use some other way.
// replace the variable $numbers with your numbers
$numbers = "110.0046102.005699.0008103.0104";
preg_match_all("#\d+\.\d{4}#", $numbers, $matches);
var_dump($matches); // outputting all matches
https://regex101.com/r/oG1dK1/1 -> you can see the regex in action here. The numbers are in the box MATCH INFORMATION on the right.
Try this regex:
(\d{1,}\.\d{4})
Demo here: https://regex101.com/r/uJ1wU6/1
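As a quick check of this pattern in PHP: \d{1,} means the same as \d+, and the surrounding parentheses only add a capture group, so $matches[1] repeats $matches[0]. The conversion to floats at the end is an optional extra, not part of the original answers.

```php
<?php
$numbers = "110.0046102.005699.0008103.0104";

// The / delimiters are added here because PHP patterns need them.
preg_match_all('/(\d{1,}\.\d{4})/', $numbers, $matches);

// $matches[0]: full matches; $matches[1]: group 1 (identical in this case).
print_r($matches[0]); // 110.0046, 102.0056, 99.0008, 103.0104

// Optional: turn the matched strings into floats.
$values = array_map('floatval', $matches[0]);
```

The split works because \d{4} takes exactly four digits after each dot, so the next match starts right where the previous number ends.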
I'm a regex-noobie, so sorry for this "simple" question:
I've got a URL like the following:
http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx
What I want to achieve is getting the number sequence (aka Job-ID) right before the ".aspx" with preg_replace.
I've already figured out that the regex for finding it could be
(?!.*-).*(?=\.)
Now preg_replace needs the opposite of that regular expression. How can I achieve that? Also worth mentioning:
The URL can have multiple numbers in it. I only need the sequence right before ".aspx". Also, there could be some parameters after the ".aspx", like "&mobile=true".
Thank you for your answers!
You can use:
$re = '/[^-.]+(?=\.aspx)/i';
preg_match($re, $input, $matches);
//=> 146370543
This matches a run of characters that are neither a hyphen nor a dot and that is immediately followed by .aspx, which is checked with the lookahead (?=\.aspx).
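A short sketch of this lookahead approach on the URL from the question, with the "&mobile=true" suffix the question mentions appended. The variable name $input is the one used in the snippet above.

```php
<?php
$input = "http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx&mobile=true";

$re = '/[^-.]+(?=\.aspx)/i';

// The lookahead checks for ".aspx" without consuming it,
// so $matches[0] is only the ID itself.
if (preg_match($re, $input, $matches)) {
    echo $matches[0]; // 146370543
}
```

Because the pattern is not anchored to the end of the string, anything after ".aspx" does not affect the match.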
You can just use preg_match (you don't need preg_replace, since you don't want to change the original string) and capture the number before the .aspx. Assuming .aspx is always at the end of the string, the simplest way I could think of is:
<?php
$string = "http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx";
$regex = '/([0-9]+)\.aspx$/';
preg_match($regex, $string, $results);
print $results[1];
?>
A short explanation:
$results contains the array of matches. The first element holds the text matched by the complete regex, which is 146370543.aspx in this example. The second element holds the group captured by the parentheses around [0-9]+.
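Since the question mentions that parameters such as "&mobile=true" may follow ".aspx", the $ anchor above would stop matching in that case. The following is a sketch of one possible variant, not part of the original answer: the optional group (?:[?&].*)? is an assumption about what such a suffix looks like.

```php
<?php
$string = "http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx&mobile=true";

// Original pattern: requires ".aspx" at the very end of the string.
$strict = '/([0-9]+)\.aspx$/';
// Variant (assumption): also allow a suffix starting with ? or &.
$tolerant = '/([0-9]+)\.aspx(?:[?&].*)?$/';

$strictHit = preg_match($strict, $string, $strictResults);
$tolerantHit = preg_match($tolerant, $string, $results);

var_dump($strictHit); // int(0)
echo $results[1];     // 146370543
```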
You can get the opposite by using this regex:
(\D*)\d+(.*)
Working demo
MATCH 1
1. [0-100] `http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-`
2. [109-114] `.aspx`
And if you just want the number from that URL, you can use this regex:
(\d+)
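If preg_replace has to be used, as the question asks, one way is to match the whole string and replace it with the captured ID. The pattern below is a sketch written for this illustration and is not taken from the answers; it keys on ".aspx" so that other numbers earlier in the URL are ignored.

```php
<?php
$url = "http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx";

// Match everything, capture only the digits between the last "-" and ".aspx",
// then replace the whole string with that capture ($1).
$pattern = '/^.*-(\d+)\.aspx.*$/';

$jobId = preg_replace($pattern, '$1', $url);
$jobIdWithSuffix = preg_replace($pattern, '$1', $url . '&mobile=true');

echo $jobId; // 146370543
```

If the pattern does not match at all, preg_replace returns the input unchanged, so a check with preg_match is safer when the URL format is not guaranteed.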
Given the string:
100,000 this is some text 12,000 this is text I want to match.
I need a regular expression that matches 12,000 based on matching
text I want to match
So, we can get a position with:
strpos($haystack, 'text I want to match');
Then, I guess we could use a regular expression to look backwards:
But, this is where I need help.
If you know that the digits will always precede the context you want to match ...
preg_match('/([\d,]+)\D*text I want to match/', $str, $match);
var_dump($match[1]);
It is simple:
/ ([0-9,]+) this is text I want to match\.$/
Demo:
http://sandbox.onlinephpfunctions.com/code/b288ca9a322c7a5b54c6490334540ab142b6a979
Another solution:
$re = "/([\\d,]+)(?=\\D*text I want to match)/";
$str = "100,000 this is some text 12,000 this is text I want to match.";
preg_match($re, $str, $matches);
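A small comparison of what ends up in the result array with and without the lookahead, using the string from the question. The names $consuming and $lookahead are only illustrative.

```php
<?php
$str = "100,000 this is some text 12,000 this is text I want to match.";

// Without a lookahead, the trailing context is part of the full match.
$consuming = '/([\d,]+)\D*text I want to match/';
// With a lookahead, the context is checked but not consumed.
$lookahead = '/([\d,]+)(?=\D*text I want to match)/';

preg_match($consuming, $str, $a);
preg_match($lookahead, $str, $b);

var_dump($a[0]); // "12,000 this is text I want to match"
var_dump($a[1]); // "12,000"
var_dump($b[0]); // "12,000"
```

In both cases 100,000 is skipped, because \D* cannot cross the digits of 12,000 to reach the required text.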