How to extract the last 2 delimited numbers using regex - php

I have to extract the first instance of a number-number. For example I want to extract 8236497-234783 from the string bnjdfg/dfg.vom/fdgd3-8236497-234783/dfg8jfg.vofg. The string has no apparent structure besides the number followed by a dash and followed by a number which is the thing I want to extract.
The thing I want to extract may be at the very start of the string, or the middle, or the end, or maybe the entire string itself is just a number-number.
$b = "bnjdfg/dfg.vom/fdgd3-8236497-234783/dfg8jfg.vofg";
preg_match('\d-\d', $b, $matches);
echo($matches[0]);
// Expecting to print 8236497-234783

You're missing the delimiters around the regexp. PHP's preg functions require the pattern to begin with a delimiter (any character that is not alphanumeric, a backslash, or whitespace), and they look for the matching delimiter at the end of the regexp, because flags can be put after the second delimiter.
\d just matches a single digit. If you want to match a string of digits, you should write \d+.
You should require that the numbers be surrounded by word boundaries with \b, otherwise it will match the 3 at the end of fdgd3.
preg_match('/\b\d+-\d+\b/', $b, $matches);
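As a quick check of the three points above, here is a minimal sketch that runs the corrected pattern against the four positions mentioned in the question. The helper name extract_pair and every sample string except the first are made up for illustration.

```php
<?php
// Hypothetical helper: returns the first "digits-digits" token, or null if none.
function extract_pair(string $s): ?string
{
    // \b keeps the match from starting inside a word such as "fdgd3".
    return preg_match('/\b\d+-\d+\b/', $s, $m) === 1 ? $m[0] : null;
}

$samples = [
    'bnjdfg/dfg.vom/fdgd3-8236497-234783/dfg8jfg.vofg', // in the middle
    '8236497-234783/dfg8jfg.vofg',                      // at the start
    'bnjdfg/dfg.vom/8236497-234783',                    // at the end
    '8236497-234783',                                   // the whole string
];

foreach ($samples as $s) {
    echo extract_pair($s), PHP_EOL; // prints 8236497-234783 each time
}
```

Returning null instead of reading $matches[0] directly avoids an undefined-index notice when the string contains no number-number token.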

Related

Why regex with lookaheads doesn't match?

I need (in PHP) to split a sentence by the word that cannot be the first or the last one in the sentence. Say the word is "pression" and here is my regex
/^.+?(?=[\s\.\,\:\;])pression(?=[\s\.\,\:\;]).+$/i
Live here: https://regex101.com/r/CHAhKj/1/
First, it doesn't match.
Next, I wonder: is it possible at all to split that way? I tried a simplified example
print_r(preg_split('/^.+pizza.+$/', 'my pizza is cool'));
live here http://sandbox.onlinephpfunctions.com/code/10b674900fc1ef44ec79bfaf80e83fe1f4248d02
and it prints an array of 2 empty strings, when I expect
['my ', ' is cool']
I need (in PHP) to split a sentence by the word that cannot be the first or the last one in the sentence
You may use this regex:
(?<=[^\s.?]\h)pression(?=\h[^\s.?])
RegEx Demo
RegEx Details:
(?<=[^\s.?]\h): Lookbehind to assert that before the current position we have a character that is not a whitespace, not a dot and not a ?, followed by a horizontal space.
pression: Match word pression
(?=\h[^\s.?]): Lookahead to assert that after the current position we have a horizontal space followed by a character that is not a whitespace, not a dot and not a ?.
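To see this lookaround pattern do the actual splitting, here is a small sketch with preg_split. It swaps in the asker's simplified "pizza" sentence; the other two test strings are made up to cover the first-word and last-word cases.

```php
<?php
// Split only where the word sits between two other words on the same line.
$pattern = '/(?<=[^\s.?]\h)pizza(?=\h[^\s.?])/';

print_r(preg_split($pattern, 'my pizza is cool')); // two parts: "my " and " is cool"
print_r(preg_split($pattern, 'pizza is cool'));    // no split: the word is first
print_r(preg_split($pattern, 'I like pizza'));     // no split: the word is last
```

Because lookarounds consume no characters, the spaces around the word stay in the resulting pieces, which is what the expected output in the question asks for.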
First, ^.+?(?=[\s\.\,\:\;])pression(?=[\s\.\,\:\;]).+$ can't match any string at all because the (?=[\s\.\,\:\;])p part requires p to be also either a whitespace char, or a ., ,, : or ;, which invalidates the whole match at once.
Second, the ^.+pizza.+$ pattern does not ensure that the matched pizza is not the first or last word in a sentence, as . matches whitespace, too. It also does not return anything meaningful: preg_split removes each match and returns the pieces around it, and since this pattern matches the entire string, the only pieces left are the empty string before the match and the empty string after it.
That said, all you need is:
preg_match('~^(.*?\w\W+)pression(\W+\w.*)$~is', $text, $m)
See the regex demo. Details:
^ - start of string
(.*?\w\W+) - Capturing group 1: any zero or more chars, as few as possible, then a word char and then one or more non-word chars
pression - a word
(\W+\w.*) - Capturing group 2: one or more non-word chars, a word char, and then any zero or more chars as many as possible
$ - end of string.
s makes the . match across lines and i flag makes the pattern match in a case insensitive way.
See the PHP demo:
$text = "You can use any regular expression pression inside the lookahead ";
if (preg_match('~^(.*?\w\W+)pression(\W+\w.*)$~is', $text, $m)) {
echo $m[1] . " << | >> " . $m[2];
}
// => You can use any regular expression << | >> inside the lookahead
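If the word to split on is not fixed, the same pattern can be wrapped in a function. This is only a sketch: the name split_on_inner_word and its return shape are assumptions, and preg_quote is used so that a word containing regex metacharacters cannot break the pattern.

```php
<?php
// Hypothetical helper: returns [before, after], or null when the word is
// missing, or is the first or the last word of the text.
function split_on_inner_word(string $text, string $word): ?array
{
    // preg_quote escapes regex metacharacters in $word, including the "~" delimiter.
    $pattern = '~^(.*?\w\W+)' . preg_quote($word, '~') . '(\W+\w.*)$~is';

    return preg_match($pattern, $text, $m) === 1 ? [$m[1], $m[2]] : null;
}

var_dump(split_on_inner_word('my pizza is cool', 'pizza')); // "my " and " is cool"
var_dump(split_on_inner_word('pizza is cool', 'pizza'));    // NULL
```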

Two or more occurrence of at least one in character set with PHP regex

I want to make PHP regex to find if text has two or more of at least one character in character set {-, l, s, i, a}.
I made like this.
preg_match("/[-lisa]{2,}/", $text);
But this doesn't work.
Please help me.
Matching two or more occurrences means matching two is enough for the check to be valid.
At least one in character set might mean either that the same char from the set must occur twice, or that any two chars from the set must occur. For the former, when the same char repeats, you can use preg_match('~([-lisa]).*?\1~', $string, $match) (note the single quotes delimiting the string literal; if you use double quotes, the backreference must have a double backslash). For the latter, i.e. you want to match ..l...i.., you can use preg_match('~[-lisa].*?[-lisa]~', $string, $match) or preg_match('~([-lisa]).*?(?1)~', $string, $match) (where (?1) is a regex subroutine that repeats the corresponding group pattern).
If your strings contain line breaks, do not forget to add s modifier, preg_match('~([-lisa]).*?\1~s', $string, $match).
More than that, if you want to check for consecutive character repetition, you should remove .*? from the above patterns, i.e. 1) must be preg_match('~([-lisa])\1~', $string, $match) and 2) must be preg_match('~[-lisa]{2}~', $string, $match) (though this is not what you want judging by your own feedback, so this example is here just for the record).
The ([-lisa])\1{2} pattern that you find useful matches a repeated -, l, i, s or a char three times (---, lll, sss, etc.), thus only use it if it fits your requirements.
Note that preg_match searches for a match anywhere inside a string and does not require a full string match (thus, there is no need to add .* (or ^.*, .*$) at the start and end of the pattern).
See a sample regex demo, feel free to test your strings in this environment.
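The three readings discussed above can be compared side by side. The sketch below uses made-up test words and variable names; the patterns themselves are the ones from the answer.

```php
<?php
$sameCharTwice = '~([-lisa]).*?\1~s';    // one character from the set occurs twice
$anyTwoFromSet = '~[-lisa].*?[-lisa]~s'; // any two characters from the set
$doubledChar   = '~([-lisa])\1~';        // the same character twice in a row

foreach (['salt', 'banana', 'hello', 'xyz'] as $word) {
    printf(
        "%-6s same:%d any:%d doubled:%d\n",
        $word,
        preg_match($sameCharTwice, $word),
        preg_match($anyTwoFromSet, $word),
        preg_match($doubledChar, $word)
    );
}
```

Only "hello" satisfies all three readings, "banana" has a repeated but not doubled "a", "salt" has three different characters from the set, and "xyz" has none.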

How do I test if string matches integer:integer with preg_match?

I need a regular expression to test if string matches integer:integer (ex: 9:4).
I have tried
preg_match("[0-9]:[0-9]", $str)
but it's not correct.
You have to mark the start and end of the regular expression, usually with /.
Try this:
preg_match("/[0-9]:[0-9]/", $str)
One hint: you can use \d instead of [0-9].
If you want to make sure that the string only contains digit:digit, use ^ as the marker for the start of the string and $ for the end:
preg_match("/^[0-9]:[0-9]$/", $str)
Also, add + to match numbers of more than one digit:
preg_match("/^[0-9]+:[0-9]+$/", $str)
^[0-9](:[0-9])*$
^ matches the start of the string and $ matches the end, ensuring that you're examining the entire string. The pattern matches a single digit followed by zero or more instances of a colon and a digit, so it also accepts a lone digit such as 9 and longer chains such as 9:4:2.
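One detail worth knowing when validating input this way: without the D modifier, $ in PCRE also matches just before a trailing newline. The sketch below is a variant, not what the answers above use: it swaps ^ and $ for \A and \z, and the helper name is_int_pair is hypothetical.

```php
<?php
// Hypothetical helper: true only when the whole string is digits, a colon, digits.
function is_int_pair(string $s): bool
{
    // \A and \z anchor to the absolute start and end of the string; unlike "$",
    // \z does not also match just before a trailing newline.
    return preg_match('/\A[0-9]+:[0-9]+\z/', $s) === 1;
}

var_dump(is_int_pair('9:4'));    // bool(true)
var_dump(is_int_pair('12:345')); // bool(true)
var_dump(is_int_pair('9:'));     // bool(false)
var_dump(is_int_pair("9:4\n"));  // bool(false)

// For comparison, the ^...$ form accepts the trailing newline:
var_dump(preg_match('/^[0-9]+:[0-9]+$/', "9:4\n")); // int(1)
```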

PHP/Laravel trim all but last word in a namespace

Trying to trim a fully qualified namespace so as to use just the last word. Example namespace is App\Models\FruitTypes\Apple where that final word could be any number of fruit types. Shouldn't this...
$fruitName = 'App\Models\FruitTypes\Apple';
trim($fruitName, "App\\Models\\FruitTypes\\");
...do the trick? It is returning an empty string. If I try to trim just App\\Models\\ it returns FruitTypes\Apples as expected. I know the backslash is an escape character, but doubling should treat those as actual backslashes.
If you want to use native functionality for this rather than string manipulation, then ReflectionClass::getShortName will do the job:
$reflection = new ReflectionClass('App\\Models\\FruitTypes\\Apple');
echo $reflection->getShortName();
Apple
See https://3v4l.org/eVl9v
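preg_match() with the regex pattern \\([[:alpha:]]*)$ should do the trick. Inside a PHP string literal each of those two backslashes has to be escaped again, so the literal backslash is written with four backslashes, and the matches array has to be passed as the third argument:
preg_match('/\\\\([[:alpha:]]*)$/', $fruitName, $trimmed);
Your result will then be in $trimmed[1]. If you don't mind the pattern being a bit less explicit, you could do: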
preg_match('/([[:alpha:]]*)$/', $fruitName, $trimmed);
And your result would then be in $trimmed[0].
If matches is provided, then it is filled with the results of search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.
preg_match - php.net
(matches is the third parameter that I named $trimmed, see documentation for full explanation)
An explanation for the regex pattern
\\ matches the character \ literally to establish the start of the match.
The parentheses () create a capturing group to return the match or a substring of the match.
In the capturing group ([[:alpha:]]*):
[:alpha:] matches an alphabetic character [a-zA-Z]
The * quantifier means match between zero and unlimited times, as many times as possible
Then $ asserts position at the end of the string.
So basically, "Find the last \ then return all the letters between it and the end of the string".
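As for why the original trim() call returns an empty string: the second argument of trim() is a list of characters to strip from both ends, not a prefix to remove. Every character of the example name appears in that list, so nothing is left. The sketch below shows this, together with a plain string alternative based on strrpos() and substr(); that alternative is not taken from the answers above, and the variable names are illustrative.

```php
<?php
$fruitName = 'App\Models\FruitTypes\Apple';

// trim() strips any of the listed characters from both ends, so the whole
// string is consumed here.
var_dump(trim($fruitName, "App\\Models\\FruitTypes\\")); // string(0) ""

// Alternative: take everything after the last backslash.
$pos = strrpos($fruitName, '\\');
$shortName = $pos === false ? $fruitName : substr($fruitName, $pos + 1);

echo $shortName, PHP_EOL; // Apple
```

Unlike ReflectionClass, this works on any string and does not require the class to exist or be autoloadable.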

how to extract a certain digit from a String using regular expression in php?

I have a String (filename): s_113_2.3gp
How can I extract the number that appears after the second underscore? In this case it's '2' but in some cases that can be a few digits number.
Also the number of digits that appears after the first underscore can vary so the length of this String is not constant.
You can use a capturing group:
preg_match('/_(\d+)\.\w+$/', $str, $matches);
$number = $matches[1];
\d+ represents 1 or more digits. The parentheses around that capture it, so you can later retrieve it with $matches[1]. The . needs to be escaped, because otherwise it would match any character but line breaks. \w+ matches 1 or more word characters (digits, letters, underscores). And finally the $ represents the end of the string and "anchors" the regular expression (otherwise you would get problems with strings containing multiple .).
This also allows for arbitrary file extensions.
As Ωmega pointed out below there is another possibility, that does not use a capturing group. With the concept of lookarounds, you can avoid matching _ at the start and the \.\w+$ at the end:
preg_match('/(?<=_)\d+(?=\.\w+$)/', $str, $matches);
$number = $matches[0];
However, I would recommend profiling, before applying this rather small optimization. But it is something to keep in mind (or rather, to read up on!).
Using regex lookarounds, the code is very short:
$n = preg_match('/(?<=_)\d+(?=\.)/', $str, $m) ? $m[0] : "";
...which reads: find one or more digits \d+ that are between underscore (?<=_) and period (?=\.)
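The anchored and unanchored patterns above agree on the filename from the question but can differ when a filename contains more than one dot. A small sketch, with hypothetical helper names and made-up extra filenames:

```php
<?php
// Capturing group, anchored at the end of the string.
function number_before_extension(string $file): string
{
    return preg_match('/_(\d+)\.\w+$/', $file, $m) ? $m[1] : '';
}

// Lookarounds without the end anchor: takes the first "_digits." it finds.
function first_number_before_dot(string $file): string
{
    return preg_match('/(?<=_)\d+(?=\.)/', $file, $m) ? $m[0] : '';
}

foreach (['s_113_2.3gp', 's_7_1024.mp4', 'a_12.b_34.mp4'] as $file) {
    echo $file, ' => ', number_before_extension($file),
        ' / ', first_number_before_dot($file), PHP_EOL;
}
// s_113_2.3gp => 2 / 2
// s_7_1024.mp4 => 1024 / 1024
// a_12.b_34.mp4 => 34 / 12
```

If filenames may contain several dots, the anchored form is the safer choice, since it always refers to the number directly before the final extension.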
