Regex for extracting comma delimited numbers in brackets - php

String:
lorem ipsum 999
[id:284,286]
[id:28]
Block in brackets may contain a lot of numbers.
Regex:
\[id:(\d+)(,\d+)*\]
What I'd like to see:
284
286
28
Solution using PHP:
preg_match_all('/\[id:(.*)\]/', $input, $ids);
if (strpos($ids[1][0], ',')) {
$ids = explode(',', $ids[1][0]);
foreach ($ids as $id) {
echo $id . "\n";
}
} else {
echo $ids[1][0];
}
But is it possible using regex without explode()?

Using explode() is perhaps the best way. Unfortunately, PCRE keeps only the last value captured by a repeated group, so you either do it in 2 steps (with explode()) or use a \G based regex. Here is a safer regex than the one you are using (assuming there are no spaces between the numbers):
$input = "lorem ipsum 999 [id:284,286] [id:28]";
preg_match_all('~\[id:([\d,]*)]~', $input, $ids);
foreach ($ids[1] as $id) {
print_r(explode(',', $id));
}
See the IDEONE demo
The '~\[id:([\d,]*)]~' regex matches [id:, then captures into Group 1 zero or more (the * quantifier) digits (\d) or commas, and finally matches the closing ].
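As a minimal sketch of the two-step approach, the snippet below gathers the numbers from all blocks into one flat array. The variable names and the choice of preg_split with PREG_SPLIT_NO_EMPTY (instead of explode) are assumptions made for illustration, not part of the original answer.

```php
<?php
// Two-step extraction: the regex finds each bracketed list,
// then each list is split on commas.
$input = "lorem ipsum 999 [id:284,286] [id:28]";

$ids = [];
if (preg_match_all('~\[id:([\d,]*)]~', $input, $blocks)) {
    foreach ($blocks[1] as $list) {
        // PREG_SPLIT_NO_EMPTY skips empty pieces from inputs such as "1,,2"
        foreach (preg_split('~,~', $list, -1, PREG_SPLIT_NO_EMPTY) as $id) {
            $ids[] = (int) $id;
        }
    }
}

print_r($ids);
```

Note that "999" is left out because it is not inside an [id:...] block.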
If you need a one-regex solution in PHP and you process individual strings, you can use a \G based regex: it sets up the leading boundary and then matches the consecutive numbers:
'~(?:\[id:|(?!^)\G,)\K\d+~'
See the regex demo and this IDEONE demo:
$re = '~(?:\[id:|(?!^)\G,)\K\d+~';
$strs = array("lorem ipsum 999", "[id:284,286]", "[id:28]");
foreach ($strs as $s) {
preg_match_all($re, $s, $matches);
print_r($matches[0]);
}
Pattern details:
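(?:\[id:|(?!^)\G,) - match either the literal [id: character sequence or, via \G, the position where the previous match ended (the (?!^) lookahead rules out the start of the string), followed by a comma
\K - discard the text matched so far, so that it is not part of the returned match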
\d+ - only match 1+ digits
If there can be whitespace between the numbers, add \s* after (and perhaps, before) the comma.
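The \G pattern can also be sketched against one string that holds several blocks, with the optional \s* around the comma added. The helper name extractIds is hypothetical and only serves the illustration.

```php
<?php
// Hypothetical helper: collect every number from [id:...] blocks in one pass.
function extractIds(string $input): array
{
    // \s* around the comma tolerates "284, 286"; drop it if spaces cannot occur
    $re = '~(?:\[id:|(?!^)\G\s*,\s*)\K\d+~';
    preg_match_all($re, $input, $m);
    return $m[0];
}

print_r(extractIds("lorem ipsum 999 [id:284,286] [id:28]"));
print_r(extractIds("[id:1, 2 ,3]"));
```

This pattern never checks for the closing ], so an unterminated block such as "[id:12,34" would still yield its numbers; the two-step version is stricter in that respect.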

Related

Regex only grabbing first digit

I'm trying to grab everything after the following digits, so I end up with just the store name in this string:
full string: /stores/1077029-gacha-pins
what I want to ignore: /stores/1077029-
what I need to grab: gacha-pins
Those digits can change at any time so it's not specifically that ID, but any numbers after /stores/
My attempt so far is only grabbing /stores/1
\/stores\/[0-9]
I'm still trying, just thought I would see if I can get some help in the meantime too, will post an answer if I solve.
You may use
'~/stores/\d+-\K[^/]+$~'
Or a more specific one:
'~/stores/\d+-\K\w+(?:-\w+)*$~'
See the regex demo and this regex demo.
Details
/stores/ - a literal string
\d+ - 1+ digits
- - a hyphen
\K - match reset operator
[^/]+ - any 1+ chars other than /
\w+(?:-\w+)* - 1+ word chars and then 0+ sequences of - and 1+ word chars
$ - end of string.
See the PHP demo:
$s = "/stores/1077029-gacha-pins";
$rx = '~/stores/\d+-\K[^/]+$~';
if (preg_match($rx, $s, $matches)) {
echo "Result: " . $matches[0];
}
// => Result: gacha-pins
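The reason the original attempt stops at /stores/1 is that [0-9] without a quantifier matches exactly one digit. A small sketch comparing the attempt, the quantified version, and a named-group alternative to \K (the group name "name" is an arbitrary choice for this example):

```php
<?php
$s = "/stores/1077029-gacha-pins";

// [0-9] without a quantifier matches exactly one digit
preg_match('~/stores/[0-9]~', $s, $one);
// [0-9]+ matches the whole run of digits
preg_match('~/stores/[0-9]+~', $s, $all);
// A named group is an alternative to \K for returning only the store name
preg_match('~^/stores/[0-9]+-(?<name>[^/]+)$~', $s, $m);

echo $one[0], "\n";
echo $all[0], "\n";
echo $m['name'], "\n";
```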
You should do it like this:
$string = '/stores/1077029-gacha-pins';
preg_match('#/stores/[0-9-]+(.*)#', $string, $matches);
$part = $matches[1];
print_r($part);

PHP Extract Specific Character from string

I have the string below:
$LINE = "TCNU1573105 HDPE HTA108 155 155 000893520918 PAL990 25.2750 MT 28.9750 MT";
and I want to extract PAL990 from it. More precisely, I want to extract PAL990 or any string that has PAL followed by some digits, like PAL222 or PAL123.
I tried many ways and could not get the result. I used
substr($LINE, 77, 3)
but when the value is in a different position, I get the wrong value.
You may use
$LINE = "TCNU1573105 HDPE HTA108 155 155 000893520918 PAL990 25.2750 MT 28.9750 MT";
if (preg_match('~\bPAL\d+\b~', $LINE, $res)) {
echo $res[0]; // => PAL990
}
See the PHP demo and this regex demo.
Details
\b - a word boundary
PAL - a PAL substring
\d+ - 1+ digits
\b - a word boundary.
The preg_match function will return the first match.
Note that if your string may contain similar strings adjacent to hyphens or other non-word characters, word boundaries are no longer reliable; use custom whitespace boundaries instead, i.e.:
'~(?<!\S)PAL\d+(?!\S)~'
See this regex demo
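To illustrate the difference between the two kinds of boundaries, here is a short sketch on a made-up input (the string below is hypothetical and not taken from the question):

```php
<?php
// Hypothetical input: one PAL code glued to a hyphen, one standing alone
$line = "X-PAL990 PAL123 25.2750 MT";

// \b accepts the position between "-" and "P"
preg_match('~\bPAL\d+\b~', $line, $byWord);
// (?<!\S) and (?!\S) require whitespace or a string edge on both sides
preg_match('~(?<!\S)PAL\d+(?!\S)~', $line, $bySpace);

echo $byWord[0], "\n";
echo $bySpace[0], "\n";
```

The word-boundary version picks up PAL990 from inside X-PAL990, while the whitespace-boundary version skips it and returns PAL123.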
EDIT
If you may have an optional whitespace between PAL and digits, you may use
preg_replace('~.*\b(PAL)\s?(\d+)\b.*~s', '$1$2', $LINE)
See this PHP demo and this regex demo.
Or, match the string you need with spaces, and then remove them:
if (preg_match('~\bPAL ?\d+\b~', $LINE, $res)) {
echo str_replace(" ", "", $res[0]);
}
See yet another PHP demo
Note that ? makes the preceding pattern optional (1 or 0 occurrences are matched).
$string = "123ABC1234 *$%^&abc.";
$newstr = preg_replace('/[^a-zA-Z\']/','',$string);
echo $newstr;
Output: ABCabc

Search string for first word that has an exclamation-mark

I have a string like this:
$string = 'Hello k-on! Lorem Ipsum! Lorem.';
I want to get the first word that is followed by an exclamation-mark. So in the example above, it should be:
$word = 'k-on';
I'm lost as to what's the appropriate approach to take. Maybe a regex solution?
If you need to only support ASCII letter words, you can use
/\b[a-z]+(?:-[a-z]+)*!/i
See regex demo
If you plan to support Unicode, use \p{L}:
/\b\p{L}+(?:-\p{L}+)*!/u
See another regex demo
Here is the pattern explanation:
\b - a word boundary (the previous character must be a non-word one or the beginning of the string)
\p{L}+ - 1 or more Unicode letters (or ASCII letters if [a-zA-Z] is used)
(?:-\p{L}+)* - zero or more sequences of:
- - a literal hyphen
\p{L}+ - 1 or more Unicode letters (or ASCII letters if [a-zA-Z] is used)
! - a literal ! symbol
PHP demo:
$re = '/\b\p{L}+(?:-\p{L}+)*!/u';
$str = "Hello k-ąn! Lorem Ipsum! Lorem.";
preg_match($re, $str, $match);
print_r($match);
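The patterns above include the trailing ! in the match. If only the word itself is wanted, one possible variation (an assumption added here, not part of the original answer) is to turn the ! into a lookahead:

```php
<?php
$string = 'Hello k-on! Lorem Ipsum! Lorem.';

// (?=!) requires a following "!" without making it part of the match
if (preg_match('/\b\p{L}+(?:-\p{L}+)*(?=!)/u', $string, $match)) {
    $word = $match[0];
    echo $word, "\n";
}
```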
I think this might do what you're looking for. Basically split the string into words, look for the first word that ends in '!', do whatever then break out of the loop:
$string = 'Hello k-on! Lorem Ipsum! Lorem.';
$arry = explode(" ", $string);
foreach ($arry as $word) {
if (substr($word,-1) == "!") {
// do something with $word here
break;
}
}
$string = 'Hello k-on! Lorem Ipsum! Lorem.';
preg_match('/[A-Za-z0-9-]+!/', $string, $match);
$yourWord = str_replace("!", "", $match[0]); // $yourWord is now k-on
Obviously, the solution for this requirement is a regular expression. Here I used a simple expression that allows alphanumeric characters and, as an exception, the hyphen (-). preg_match applies the pattern to the string and returns the first match, which in your case is k-on!, and str_replace then removes the exclamation mark from the returned string.
Know more about preg_match: http://php.net/manual/en/function.preg-match.php

split string in numbers and text but accept text with a single digit inside

Let's say I want to split this string in two variables:
$string = "levis 501";
I will use
preg_match('/\d+/', $string, $num);
preg_match('/\D+/', $string, $text);
but then let's say I want to split this one in two
$string = "levis 5° 501";
as $text = "levis 5°"; and $num = "501";
So my guess is I should add a rule to the preg_match('/\d+/', $string, $num); that looks for numbers only at the END of the string and I want it to be between 2 and 3 digits.
But also the $text match now has one number inside...
How would you do it?
To split a string in two parts, use any of the following:
preg_match('~^(.*?)\s*(\d+)\D*$~s', $s, $matches);
This regex matches:
^ - the start of the string
(.*?) - Group 1 capturing any zero or more characters, as few as possible (as *? is a "lazy" quantifier) up to...
\s* - zero or more whitespace symbols
(\d+) - Group 2 capturing 1 or more digits
\D* - zero or more characters other than digit (it is the opposite shorthand character class to \d)
$ - end of string.
The s modifier is the DOTALL one: it makes the . match any character, including a newline, which it does not match without this modifier.
Or
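preg_split('~\s*(?=\s*\d+\D*$)~', $s, 2);
This \s*(?=\s*\d+\D*$) pattern (used with a limit of 2, because the lookahead also succeeds between the trailing digits and would otherwise split them apart):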
\s* - zero or more whitespaces, but only if followed by...
(?=\s*\d+\D*$) - zero or more whitespaces followed with 1+ digits followed with 0+ characters other than digits followed with end of string.
The (?=...) construct is a positive lookahead that does not consume characters and just checks if the pattern inside matches and if yes, returns "true", and if not, no match occurs.
See IDEONE demo:
$s = "levis 5° 501";
preg_match('~^(.*?)\s*(\d+)\D*$~s', $s, $matches);
print_r($matches[1] . ": ". $matches[2]. PHP_EOL);
print_r(preg_split('~\s*(?=\s*\d+\D*$)~', $s, 2));
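The question also asks for the trailing number to be between 2 and 3 digits long, which the patterns above do not enforce. A hedged sketch of one way to add that restriction follows; the helper name splitTextAndNumber and the choice to return null on a non-matching input are assumptions.

```php
<?php
// Hypothetical helper: split "text + trailing 2-3 digit number".
// Returns null when the input does not end in such a number.
function splitTextAndNumber(string $s): ?array
{
    // (?<!\d) stops the group from taking the tail of a longer number
    if (preg_match('~^(.*?)\s*(?<!\d)(\d{2,3})$~s', $s, $m)) {
        return [$m[1], $m[2]];
    }
    return null;
}

var_dump(splitTextAndNumber("levis 5° 501"));
var_dump(splitTextAndNumber("levis 501"));
var_dump(splitTextAndNumber("levis 5"));
```

The first two calls split off 501, while the third returns null because a single trailing digit is treated as part of the text rather than as the number.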

Regex PHP - dont match specific string followed by numeric

I'm looping over a large number of files in a directory and want to extract all the numeric values from a filename when it starts with lin64exe. For instance, lin64exe005458002.17 would give 005458002.17. I have this part sorted, but the directory also contains other files, such as part005458. How can I make sure I only get the digits (and the .) that come after lin64exe?
This is what I have so far:
[^lin64exe][^OTHERTHINGSHERE$][0-9]+
The regex to match the number with a decimal point that comes right after lin64exe is:
^lin64exe\K\d+\.\d+$
DEMO
<?php
$mystring = "lin64exe005458002.17";
$regex = '~^lin64exe\K\d+\.\d+$~';
if (preg_match($regex, $mystring, $m)) {
$yourmatch = $m[0];
echo $yourmatch; //=> 005458002.17
}
?>
You can try with look around as well
(?<=^lin64exe)\d+(\.\d+)?$
Here is demo
Pattern explanation:
(?<= look behind to see if there is:
^ the beginning of the string
lin64exe 'lin64exe'
) end of look-behind
\d+ digits (0-9) (1 or more times (most possible))
( group and capture to \1 (optional):
\. '.'
\d+ digits (0-9) (1 or more times (most possible))
)? end of \1
$ the end of the string
Note: use i for ignore case
sample code:
$re = "/(?<=^lin64exe)\\d+(\\.\\d+)?$/im"; // m makes ^ and $ match at every line of $str
$str = "lin64exe005458002.17\nlin64exe005458002\npart005458";
preg_match_all($re, $str, $matches);
You can use this regex and use captured group #1 for your number:
^lin64exe\D*([\d.]+)$
RegEx Demo
Code:
$re = '/^lin64exe\D*([\d.]+)$/im'; // m makes ^ and $ match at every line of $str
$str = "lin64exe005458002.17\npart005458";
if ( preg_match($re, $str, $m) )
var_dump ($m[1]);
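For context, the original attempt [^lin64exe] is a negated character class: it matches one character that is not l, i, n, 6, 4, e or x, rather than "anything except the word lin64exe". Since the question describes looping over file names, a minimal sketch of that loop is given below; the array of names is hypothetical and stands in for a real directory listing.

```php
<?php
// Hypothetical file names standing in for a directory listing
$files = ["lin64exe005458002.17", "part005458", "lin64exe005458002", "notes.txt"];

$numbers = [];
foreach ($files as $name) {
    // Each name is tested on its own, so ^ and $ need no m flag here
    if (preg_match('~^lin64exe\K\d+(?:\.\d+)?$~', $name, $m)) {
        $numbers[] = $m[0];
    }
}

print_r($numbers);
```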
