Regex to match specific pattern - php

I am so bad at creating regex and I'm struggling with what I am SURE is a simple, stupid regex.
I am using PHP to do this match. Here is what I have until now.
Test string: 8848842356063003
if(!preg_match('/^[0-2]|[7-9]{16}/', $token)) {
return array('status' => 'failed', 'message' => "Invalid token", 'token' => '');
}
The regex must comply with this: start with 0-2 or 7-9 and have EXACTLY 16 characters. What am I doing wrong? This is what I get as a match:
array(
0 => 8
)
And I should get:
array(
0 => 8848842356063003
)
By the way: I am using PHP Live Regex to test my regex string.
Thanks in advance,
Ares D.

The regex must comply with this: start with 0-2 or 7-9 and have EXACTLY 16 characters
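In your pattern, the | splits the whole expression into two alternatives, ^[0-2] and [7-9]{16}, so the anchor applies only to the first branch and the quantifier only to the second. Instead, you can put the starting digits in the same character class and use an end anchor after matching 15 more characters: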
/^[0-27-9].{15}$/
If you want to match only digits then use:
/^[0-27-9]\d{15}$/
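As a quick check, the digit-only pattern can be wrapped in a small function and run against the test string from the question. This is only a minimal sketch; the function name isValidToken() is made up for illustration and is not part of the question.

```php
<?php
// Hypothetical helper (the name is an assumption): returns true only for a
// 16-digit token whose first digit is 0-2 or 7-9.
function isValidToken(string $token): bool
{
    // ^ and $ anchor the whole string; [0-27-9] is one class covering both ranges.
    return preg_match('/^[0-27-9]\d{15}$/', $token) === 1;
}

var_dump(isValidToken('8848842356063003')); // bool(true)
var_dump(isValidToken('3848842356063003')); // bool(false): starts with 3
var_dump(isValidToken('884884235606300'));  // bool(false): only 15 digits
```

Note that $ also matches just before a trailing newline; if that matters for your tokens, adding the D modifier to the pattern should rule it out.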

Related

Validation Challenge: Does string contain all characters in character mask as a continuous substring?

Given a haystack string (single word) consisting only of lowercase letters and a character mask containing only unique lowercase letters, determine if all letters in the character mask occur consecutively at any point in the haystack string. Letters in the character mask may be used in any order and may be used more than once to form a qualifying string if necessary.
Test strings and commented expected boolean results:
$tests = [
['word' => 'example', 'mask' => 'lmp'], // true (mpl)
['word' => 'goodness', 'mask' => 'dns'], // false (dn, ss)
['word' => 'slippers', 'mask' => 'eip'], // true (ippe)
['word' => 'slippers', 'mask' => 'ips'], // false (s, ipp, s)
['word' => 'google', 'mask' => 'go'], // true (goog)
['word' => 'food', 'mask' => 'go'], // false (oo)
['word' => 'bananas', 'mask' => 'ans'], // true (ananas)
['word' => 'candle', 'mask' => 'ace'], // false (ca, e)
['word' => 'mississippi', 'mask' => 'i'], // true (i)
['word' => 'executive', 'mask' => 'ecitx'], // false (exec, ti, e)
];
I am interested in elegant, efficient, and/or abstract answers as an exercise in imaginative programming. Have fun with it!
There are many pre-existing questions on Stack Overflow across a spectrum of languages that have similar requirements, but they do not have the same combination of rules. In this case, the qualifying substring must consist entirely of characters in the mask and all characters in the mask must be used at least once.
This question is a salvage operation after an interesting but incomplete question from another user was closed, abandoned, and deleted by the Roomba. I have arbitrarily added details to clarify the task, limited the scope, and populated a battery of test cases.
My first creation uses preg_match_all() to extract runs of consecutive qualifying characters, then uses ltrim() to strip the extracted letters from the character mask; if nothing is left of the mask, every mask letter was used.
preg_match_all("/[$mask]+/", $word, $m)
&& array_filter($m[0], fn($chars) => !ltrim($mask, $chars))
Then I realized that preg_match_all() might be matching substrings that can be eliminated earlier because they have insufficient length to clear the mask characters. I added a minimum quantifier to the regex based on the length of the mask. It may or may not be worth the extra function call versus the decreased pattern readability.
preg_match_all("/[$mask]{" . strlen($mask) . ",}/", $word, $m)
&& array_filter($m[0], fn($chars) => !ltrim($mask, $chars))
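To make the ltrim() trick easier to follow, here is a rough sketch of the same idea as a standalone function. The name maskCoveredByRun() is hypothetical, and the code assumes the mask contains only unique lowercase letters, as the task states, so no escaping is needed inside the character class.

```php
<?php
// Hypothetical wrapper (name is an assumption) around the snippet above.
// Assumes $mask holds only unique lowercase letters.
function maskCoveredByRun(string $word, string $mask): bool
{
    // Only consider runs of mask letters at least as long as the mask itself.
    $pattern = "/[$mask]{" . strlen($mask) . ",}/";
    if (!preg_match_all($pattern, $word, $m)) {
        return false; // no run of mask letters is long enough
    }
    foreach ($m[0] as $run) {
        // ltrim() strips leading mask letters that occur in $run;
        // an empty result means the run used every letter of the mask.
        if (ltrim($mask, $run) === '') {
            return true;
        }
    }
    return false;
}

var_dump(maskCoveredByRun('example', 'lmp'));  // bool(true)
var_dump(maskCoveredByRun('slippers', 'ips')); // bool(false)
```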
Finally, I wanted to see if the task could be done solely with regex and avoid doing needless surplus matching. Using a technique that is typically used to validate password strength, I called preg_replace() to generate a series of lookaheads and built a pattern that preg_match() could use to ensure that every letter from the mask is present in the isolated substring.
(bool) preg_match('/' . preg_replace('/[a-z]/', "(?=[$mask]*$0)", $mask) . "[$mask]+/", $word)
Opinions will vary about which one is most/least readable/maintainable and I did not perform any benchmarks to see which one performs best. Here is the PHP demo.
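For anyone who wants to try the lookahead version locally, the following sketch wraps it in a function and runs it over the test battery from the question. The function name and the [word, mask, expected] layout are my own assumptions for illustration; the pattern construction is the one shown above.

```php
<?php
// Hypothetical wrapper (name is an assumption). Assumes $mask contains only
// unique lowercase letters, so it is safe inside a character class.
function maskOccursConsecutively(string $word, string $mask): bool
{
    // One lookahead per mask letter, e.g. (?=[lmp]*l)(?=[lmp]*m)(?=[lmp]*p)
    $lookaheads = preg_replace('/[a-z]/', "(?=[$mask]*$0)", $mask);
    return (bool) preg_match('/' . $lookaheads . "[$mask]+/", $word);
}

// Same cases as the question, reshaped as [word, mask, expected].
$cases = [
    ['example', 'lmp', true],     ['goodness', 'dns', false],
    ['slippers', 'eip', true],    ['slippers', 'ips', false],
    ['google', 'go', true],       ['food', 'go', false],
    ['bananas', 'ans', true],     ['candle', 'ace', false],
    ['mississippi', 'i', true],   ['executive', 'ecitx', false],
];

$failures = 0;
foreach ($cases as [$word, $mask, $expected]) {
    if (maskOccursConsecutively($word, $mask) !== $expected) {
        $failures++;
        echo "Mismatch for $word / $mask\n";
    }
}
echo $failures === 0 ? "all cases match\n" : "$failures case(s) failed\n";
```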

regex to match kbt-y102_9999_0001v-s001r and kbt-y102_999a

I'm looking for a Regex that converts strings like
kbt-y102_9999_0001v-s001v
into N1v-s1v
and
kbt-y102_999a
into Na
kbt-y102_ => ignore everything until first underscore
9999 => N
_0001v => 1v
-s001v => -s1v
kbt-y102_9999_0001v-s001r => N1v-s1r
kbt-y102_9999_0002r-s001v => N2r-s1v
kbt-y102_9999_0001v => N1v
kbt-y102_9999_0002r => N2r
kbt-y102_999a => Na
kbt-y102_999aa => Naa
kbt-y102_9999a => Na
kbt-y102_9999aa => Naa
My attempt covers the first four cases: (.*)_[0-9]{4}_[0-9]{3}([0-9][vr])?((-s)0{0,2}+([0-9][vr]))? (regex fiddle)
But I'm struggling with 999a.
Following your patterns this is a general Regular Expression to extract required data:
^[^_]*_\d+([a-z]*)(?:_0*([1-9][a-z])(?:(-[a-z])0*([1-9][a-z]))?)?
It's long, but it uses nothing more than a few ordinary tokens in the proper places. You need to replace the match with:
N$1$2$3$4
Live demo
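In PHP, the pattern and replacement above could be applied with preg_replace(), roughly as sketched here. The helper name shortenName() is hypothetical; the sample inputs are the ones listed in the question.

```php
<?php
// Hypothetical helper (name is an assumption); the pattern and the
// replacement string are the ones given in the answer above.
function shortenName(string $name): string
{
    $pattern = '/^[^_]*_\d+([a-z]*)(?:_0*([1-9][a-z])(?:(-[a-z])0*([1-9][a-z]))?)?/';
    // Groups that did not take part in the match are replaced by an empty string.
    return preg_replace($pattern, 'N$1$2$3$4', $name) ?? $name;
}

$samples = [
    'kbt-y102_9999_0001v-s001r', // N1v-s1r
    'kbt-y102_9999_0002r-s001v', // N2r-s1v
    'kbt-y102_9999_0001v',       // N1v
    'kbt-y102_999a',             // Na
    'kbt-y102_9999aa',           // Naa
];

foreach ($samples as $sample) {
    echo $sample, ' => ', shortenName($sample), "\n";
}
```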

Get values from formatted, delimited string with quoted labels and values

I have an input string like this:
"Day":June 8-10-2012,"Location":US,"City":Newyork
I need to match 3 value substrings:
June 8-10-2012
US
Newyork
I don't need the labels.
Per my comment above, if this is JSON, you should definitely use PHP's JSON functions (e.g. json_decode()), as they are better suited to this.
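However, you can use the following regex. PHP has no g flag (that is regex101 notation), so call preg_match_all() to collect every match; the captured values end up in $matches[1].
/:([a-zA-Z0-9\s-]*)/
<?php
preg_match_all('/:([a-zA-Z0-9\s-]*)/', '"Day":June 8-10-2012,"Location":US,"City":Newyork', $matches);
print_r($matches[1]);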
The regex demo is here:
https://regex101.com/r/BbwVQ5/1
Here are a couple of simple ways:
Code: (Demo)
$string = '"Day":June 8-10-2012,"Location":US,"City":Newyork';
var_export(preg_match_all('/:\K[^,]+/', $string, $out) ? $out[0] : 'fail');
echo "\n\n";
var_export(preg_split('/,?"[^"]+":/', $string, 0, PREG_SPLIT_NO_EMPTY));
Output:
array (
0 => 'June 8-10-2012',
1 => 'US',
2 => 'Newyork',
)
array (
0 => 'June 8-10-2012',
1 => 'US',
2 => 'Newyork',
)
Pattern #1 (Demo): \K restarts the full-string match after the colon, so a positive lookbehind can be avoided (saving "steps" / improving pattern efficiency). Matching all following characters that are not a comma means a capture group can be avoided as well (again saving "steps" / improving pattern efficiency).
Pattern #2 (Demo): ,? makes the comma optional, and "[^"]+" matches the leading double-quoted "key". The substring to split on therefore covers the optional comma and the full "key", and ends at the following colon.

Ordered List Group Numbers, Symbols

I currently have code that displays data like so:
1
11 Title Here
2
21 Guns
A
Awesome
Using this:
foreach($animes as $currentAnime){
$thisLetter = strtoupper($currentAnime->title[0]);
$sorted[$thisLetter][] = array('title' => $currentAnime->title, 'id' => $currentAnime->id);
unset($thisLetter);
}
How do I group all numbers to a #, and all Symbols to a ~?
Like so:
#
11 Title Here
21 Guns
~
.ahaha
A
Awesome
Thank you for the advice.
You can check with is_numeric() if this is a number and with preg_match() if this is a symbol.
foreach ($animes as $currentAnime) {
    $thisLetter = strtoupper($currentAnime->title[0]);
    if (is_numeric($thisLetter)) {
        // Titles starting with a digit are grouped under '#'
        $sorted['#'][] = array('title' => $currentAnime->title, 'id' => $currentAnime->id);
    } elseif (preg_match('/[^a-zA-Z0-9]+/', $thisLetter)) {
        // Titles starting with anything that is neither a letter nor a digit go under '~'
        $sorted['~'][] = array('title' => $currentAnime->title, 'id' => $currentAnime->id);
    } else {
        // Everything else is grouped under its upper-cased first letter
        $sorted[$thisLetter][] = array('title' => $currentAnime->title, 'id' => $currentAnime->id);
    }
    unset($thisLetter);
}
[^a-zA-Z0-9]+ matches when the character is neither a letter (a-z or A-Z) nor a digit (0-9). You didn't specify exactly which characters count as symbols, so I used this basic class; you can add more characters to it if some should not be treated as symbols. You could also test for a letter explicitly with [a-zA-Z]+ and put those titles in the letter groups, leaving the last else for anything that is neither a digit nor a letter.
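Here is a small self-contained sketch of the grouping with sample data, in case it helps to see it run. The groupKey() helper and the sample objects are hypothetical; it also assumes titles are non-empty and start with an ASCII character.

```php
<?php
// Hypothetical helper (name is an assumption): picks the group key for a title.
// Assumes a non-empty title whose first byte is an ASCII character.
function groupKey(string $title): string
{
    $first = strtoupper($title[0]);
    if (is_numeric($first)) {
        return '#'; // digits
    }
    if (preg_match('/[^a-zA-Z0-9]/', $first)) {
        return '~'; // symbols (anything that is not a letter or digit)
    }
    return $first;  // plain letters keep their own group
}

// Made-up sample data shaped like the objects in the question.
$animes = [
    (object) ['title' => '11 Title Here', 'id' => 1],
    (object) ['title' => '21 Guns', 'id' => 2],
    (object) ['title' => '.ahaha', 'id' => 3],
    (object) ['title' => 'Awesome', 'id' => 4],
];

$sorted = [];
foreach ($animes as $currentAnime) {
    $sorted[groupKey($currentAnime->title)][] = [
        'title' => $currentAnime->title,
        'id'    => $currentAnime->id,
    ];
}

print_r(array_keys($sorted)); // group keys in order of first appearance
```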

Regex Optional Matches

I'm trying to match two types of strings using the preg_match function in PHP which could be the following.
'_mything_to_newthing'
'_onething'
'_mything_to_newthing_and_some_stuff'
In the third one above, I only want the "mything" and "newthing", so everything that comes after the third part is just optional text the user could add. Ideally, the regex would return the following for the cases above:
'mything', 'newthing'
'onething'
'mything', 'newthing'
The patterns should match a-zA-Z0-9 if possible :-)
My regex is terrible, so any help would be appreciated!
Thanks in advance.
Assuming you're talking about _ delimited text:
$regex = '/^_([a-zA-Z0-9]+)(|_to_([a-zA-Z0-9]+).*)$/';
$string = '_mything_to_newthing_and_some_stuff';
preg_match($regex, $string, $match);
$match = array(
0 => '_mything_to_newthing_and_some_stuff',
1 => 'mything',
2 => '_to_newthing_and_some_stuff',
3 => 'newthing',
);
For anything further, please provide more details and better sample text/output.
Edit: You could always just use explode:
$parts = explode('_', $string);
$parts = array(
0 => '',
1 => 'mything',
2 => 'to',
3 => 'newthing',
4 => 'and',
5 => 'some',
6 => 'stuff',
);
As long as the format is consistent, it should work well...
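If only the names are wanted, a variation on the regex above with a non-capturing optional group might look like the sketch below. The function name extractNames() is hypothetical, and the pattern is my own adaptation, not the one from the answer.

```php
<?php
// Hypothetical helper (name and pattern are assumptions, adapted from the
// answer above): returns the first name and, if present, the name after "_to_".
function extractNames(string $input): array
{
    // (?:...)? makes the "_to_<name>" part optional without capturing it as a whole.
    // There is no end anchor, so any trailing text is simply ignored.
    if (!preg_match('/^_([a-zA-Z0-9]+)(?:_to_([a-zA-Z0-9]+))?/', $input, $m)) {
        return [];
    }
    // Drop the full match at index 0; preg_match() omits a trailing
    // group that did not participate in the match.
    return array_slice($m, 1);
}

print_r(extractNames('_mything_to_newthing'));                // mything, newthing
print_r(extractNames('_onething'));                           // onething
print_r(extractNames('_mything_to_newthing_and_some_stuff')); // mything, newthing
```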
