RegEx in PHP: find the first matching string

I want to find the first matching string in a very long text. I know I can use preg_grep() and take the first element of the returned array, but that is inefficient if I only need the first match (or if I know in advance that there is exactly one match). Any suggestions?

preg_match()?
preg_match() returns the number of times pattern matches. That will be either 0 times (no match) or 1 time because preg_match() will stop searching after the first match. preg_match_all() on the contrary will continue until it reaches the end of subject. preg_match() returns FALSE if an error occurred.
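To make the difference concrete, here is a minimal sketch that runs both functions over the same subject. The sample text and the id= pattern are made up for illustration.

```php
<?php
// Hypothetical sample text with three candidate matches.
$text = 'id=17; id=42; id=99';

// preg_match() stops at the first match and returns 1 (or 0 if there is none).
$firstCount = preg_match('/id=(\d+)/', $text, $first);

// preg_match_all() scans the whole subject and returns the total number of matches.
$allCount = preg_match_all('/id=(\d+)/', $text, $all);

echo $firstCount, ' ', $first[1], "\n";            // 1 17
echo $allCount, ' ', implode(',', $all[1]), "\n";  // 3 17,42,99
```

If only the first match is needed, preg_match() avoids scanning the rest of the text.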

Here's an example of how you can do it:
$string = 'A01B1/00asdqwe';
$pattern = '~^[A-Z][0-9][0-9][A-Z][0-9]+~';
if (preg_match($pattern, $string, $match) ) {
    echo "We have matched: $match[0]\n";
} else {
    echo "Not matched\n";
}
You can try print_r($match) to check the array structure and test your regex.
Side notes on the regex:
The tildes ~ are just delimiters that wrap the pattern.
The caret ^ anchors the match to the start of the string (optional).
The plus + means that one or more digits may follow, so A01B1, A01B12 and A01B123 will all be matched.

Related

Regular expression return only certain values PHP

I can't remember what to use to return only a specific part of a string.
I have a string like this:
$str = "return(me or not?)";
I want to get the word that comes after (. In this example, me would be my result. How can I do this?
I don't think substr() is what I am looking for, since it returns a value based on the index you provide, and in this case I don't know the index; it can vary. All I know is that I want whatever comes after "(" and before the space " ". The index positions will always be different, so I can't use substr().
This regular expression should do the trick. Since you didn't provide general rules but only an example it might need further changes though.
preg_match('/\((\S+)/', $input, $matches);
$matches[1] contains "me" then.
<?php
// Your input string
$string = "return(me or not?)";
// Pattern explanation:
// \( -- Match opening parentheses
// ([^\s]+) -- Capture at least one character that is not whitespace.
if (preg_match('/\(([^\s]+)/', $string, $matches) === 1)
// preg_match() returns 1 on success.
echo "Substring: {$matches[1]}";
else
// No match found, or bad regular expression.
echo 'No match found';
The first capture group holds the result when you use this regex with preg_match():
$regex = '/\((\w+)/';
See the preg_match() documentation for a working reference.
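The answers above use \S+ (equivalently [^\s]+) and \w+, which agree on the question's example but can differ on other input. A small sketch, assuming a made-up input where punctuation directly follows the wanted word:

```php
<?php
// Hypothetical input: a comma directly follows the wanted word.
$input = 'return(me, or not?)';

preg_match('/\((\S+)/', $input, $nonSpace);  // \S+ runs up to the next whitespace
preg_match('/\((\w+)/', $input, $wordOnly);  // \w+ stops at the first non-word character

echo $nonSpace[1], "\n"; // me,
echo $wordOnly[1], "\n"; // me
```

Which one fits depends on whether punctuation should count as part of the word.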

How to preg_match '{95}1340{113}1488{116}1545{99}1364'

I want to match the following string with preg_match(), exactly as it is:
$this_string = '{95}1340{113}1488{116}1545{99}1364';
My best try was
preg_match('/^[\{\d+\}\d+]+$/', $this_string);
That matches
{95}1340{113}1488
but also
{95}1340{113}
which is wrong.
I know why it matches the last example: one {95}1340 item has already matched, so the '+' is satisfied from then on. But I don't know how to require that every repetition be a complete match of what is inside '[…]'.
I expect only matches like these:
{…}…
{…}…{…}…
{…}…{…}…{…}…
One of my attempts:
^(\{\d+\}\d+)+$
also returns
{99}1364
from the very end of the string as a second match, so I get back an array with two elements:
Array[0] = {95}1340{113}1488{116}1545{99}1364 and
Array[1] = {99}1364
The problem is the unnecessary character class in your regex, i.e. the [ and ].
You can use:
'/^(\{\d+\}\d+)+$/'
Written more plainly, your regex is equivalent to /^[\{\}0-9+]+$/: one or more characters, each taken from the set {}0123456789+, in any order.
What you want is grouping, and grouping needs parentheses () rather than a character class [], so replace the [] with ().
Short answer: '/^(\{\d+\}\d+)+$/'
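The difference between the two patterns can be checked with a quick sketch. The sample strings come from the question, plus one made-up string of stray braces. The sketch uses a non-capturing group (?:...) instead of (...); that is a substitution on my part, chosen because it keeps the extra {99}1364 element out of the result array.

```php
<?php
$class = '/^[\{\d+\}\d+]+$/';    // character class: any mix of { } + and digits
$group = '/^(?:\{\d+\}\d+)+$/';  // group: one or more complete {digits}digits items

$samples = ['{95}1340{113}1488', '{95}1340{113}', '}}{{12'];
$results = [];
foreach ($samples as $s) {
    $results[$s] = [preg_match($class, $s), preg_match($group, $s)];
    echo $s, ' class=', $results[$s][0], ' group=', $results[$s][1], "\n";
}

// With a non-capturing group, only the full match is returned.
preg_match($group, '{95}1340{113}1488{116}1545{99}1364', $m);
echo count($m), "\n"; // 1
```

The character class accepts all three samples, while the grouped pattern accepts only the first.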
What you are trying to do is a little unclear. Since your last edit, I assume that you want to check the global format of the string and to extract all items (i.e. {121}1231) one by one. To do that you can use this code:
$str = '{95}1340{113}1488{116}1545{99}1364';
$pattern = '~\G(?:{\d+}\d+|\z)~';
if (preg_match_all($pattern, $str, $matches) && empty(array_pop($matches[0])))
print_r($matches[0]);
\G is an anchor for the start of the string or the end of the previous match
\z is an anchor for the end of the string
The alternation with \z is only needed to check that the last match is at the end of the string. If the last match is empty, you are sure that the format is correct until the end.

perl regex match any number that is not

given a string:
//foo.bar/baz/123/index.html
I am trying to match the number after baz, so long as it is not 123.
//foo.bar/baz/124/index.html (WOULD MATCH)
//foo.bar/baz/123/index.html (WOULD NOT MATCH)
How can I express this? I keep trying things like:
/baz\/d+^(123)/index/
but have not been successful. Any help is appreciated!
Use negative look-ahead to assert that there is not 123 after baz/. Then go on to match with \d+:
m~baz/(?!123\b)\d+/index~
In Perl, you can use different delimiter when your regex pattern already contains /, to avoid escaping them. Here I've used ~.
If the substring to reject is always /baz/123/, you can also do it with the index() function:
$str = "//foo.bar/baz/124/index.html";
$needle = "/baz/123/";
if (index($str, $needle) == -1) {
print "Match found\n";
}
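PCRE supports the same negative look-ahead syntax, so the idea can also be tried in PHP. The following is only a sketch: the capture group around \d+ and the third URL are additions for illustration. It also shows why the \b matters: without it, 1234 would be rejected along with 123.

```php
<?php
// Same look-ahead as above, with a capture group added to extract the number.
$pattern = '~baz/(?!123\b)(\d+)/index~';

$urls = [
    '//foo.bar/baz/124/index.html',
    '//foo.bar/baz/123/index.html',
    '//foo.bar/baz/1234/index.html', // hypothetical: starts with 123 but is a different number
];

$hits = [];
foreach ($urls as $url) {
    // Store the captured number, or null when the URL does not match.
    $hits[$url] = preg_match($pattern, $url, $m) ? $m[1] : null;
    echo $url, ' => ', $hits[$url] ?? 'no match', "\n";
}
```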

Regex s modifier, not working?

Okay, I am a noob to regex, and I am using this site for my regex primer:
Question: using the s modifier, the code below is supposed to echo 4 because it has found 4 newline characters.
However, when I run it I get 1. Why?
<?php
/*** create a string with new line characters ***/
$string = 'sex'."\n".'at'."\n".'noon'."\n".'taxes'."\n";
/*** look for a match using s modifier ***/
echo preg_match("/sex.at.noon/s", $string, $matches);
/*The above code will echo 4 as it has found 4 newline characters.*/
?>
Use preg_match_all() instead which doesn't stop after the first match.
preg_match() returns the number of times pattern matches. That will be either 0 times (no match) or 1 time because preg_match() will stop searching after the first match. preg_match_all() on the contrary will continue until it reaches the end of subject . preg_match() returns FALSE if an error occurred. —PHP.net
However, the code will still output only 1, because what you are matching is the regex "sex.at.noon" and not a line break.
preg_match() will only ever return 0 or 1 because it stops after the first time the pattern matches. If you use preg_match_all() it will still return 1 because your pattern only matches once in the string you're matching against.
If you want the number of newlines via regex:
echo preg_match_all("/\n/m", $string, $matches);
Or via string functions:
echo substr_count($string, "\n");
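What the s modifier actually changes is whether the dot may match a newline. A short sketch using the string from the question shows the pattern with and without s, next to the newline count:

```php
<?php
$string = 'sex' . "\n" . 'at' . "\n" . 'noon' . "\n" . 'taxes' . "\n";

// With s, the dot also matches "\n", so the pattern matches once.
$withS = preg_match('/sex.at.noon/s', $string);

// Without s, the dot does not match "\n", so there is no match.
$withoutS = preg_match('/sex.at.noon/', $string);

// Counting the newlines themselves gives 4.
$newlines = preg_match_all('/\n/', $string);

echo $withS, ' ', $withoutS, ' ', $newlines, "\n"; // 1 0 4
```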

Match number at the end of the string

Given the following string how can I match the entire number at the end of it?
$string = "Conacu P PPL Europe/Bucharest 680979";
I should mention that the length of the string is not constant.
My language of choice is PHP.
Thanks.
You could use a regex with preg_match, like this :
$string = "Conacu P PPL Europe/Bucharest 680979";
$matches = array();
if (preg_match('#(\d+)$#', $string, $matches)) {
var_dump($matches[1]);
}
And you'll get :
string '680979' (length=6)
And here is some information:
The # at the beginning and the end of the regex are the delimiters. They don't mean anything; they just mark the beginning and end of the regex, and you could use almost any character you want (people often use /).
The $ at the end of the pattern means "end of the string".
The () means you want to capture what is between them.
With preg_match, the array given as the third parameter will contain the captured data.
The first item in that array will be the whole matched string.
The next ones will contain the data matched by each set of ().
The \d means "a digit".
The + means "one or more times".
So:
match one or more digits
at the end of the string
For more information, you can take a look at PCRE Patterns and Pattern Syntax.
The following regex should do the trick:
/(\d+)$/
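Wrapped in a small helper, the same pattern could look like the sketch below. The function name trailingNumber is hypothetical, and returning the digits as a string is a design choice so that leading and trailing zeros survive. Note that $ also matches just before a final newline; the D modifier or \z would be stricter.

```php
<?php
// Hypothetical helper: returns the digits at the end of $text, or null if there are none.
function trailingNumber(string $text): ?string
{
    return preg_match('/(\d+)$/', $text, $m) === 1 ? $m[1] : null;
}

var_dump(trailingNumber('Conacu P PPL Europe/Bucharest 680979')); // string(6) "680979"
var_dump(trailingNumber('Room 100'));                             // string(3) "100"
var_dump(trailingNumber('no digits here'));                       // NULL
```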
EDIT: This answer checks if the very last character in a string is a digit or not. As the question https://stackoverflow.com/q/12258656/1331430 was closed as an exact duplicate of this one, I'll post my answer for it here. For what this question's OP is requesting though, use the accepted answer.
Here's my non-regex solution for checking if the last character in a string is a digit:
if (ctype_digit(substr($string, -1))) {
//last character in string is a digit.
}
substr passing start=-1 will return the last character of the string, which then is checked against ctype_digit which will return true if the character is a digit, or false otherwise.
References:
substr
ctype_digit
To get the number at the end of a string, without using regex:
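function getNumberAtEndOfString(string $string) : ?int
{
    // Reverse the string so the trailing number comes first, then let sscanf() read it.
    $result = sscanf(strrev($string), "%d%s");
    // Caveat: a number ending in zeros (e.g. "100") reverses to "001" and is read back as 1.
    if (isset($result[0])) return (int) strrev((string) $result[0]);
    return null;
}
var_dump(getNumberAtEndOfString("Conacu P PPL Europe/Bucharest 680979")); //int(680979)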
