Using php regex to split string

I'm having trouble splitting this string into components.
The example string I have is Criminal.Minds.S10E22.WEB-DL.x264-FUM[ettv]. I'm trying to split it into the following:
Criminal Minds, 10, 22.
Though I've dabbled a bit in Perl regex, the PHP implementation is confusing me.
I've written the following:
$word = "Criminal.Minds.S10E22.WEB-DL.x264-FUM[ettv]";
// First replace periods and dashes by spaces
$patterns = array();
$patterns[0] = '/\./';
$patterns[1] = '/-/';
$replacement = ' ';
$word = preg_replace($patterns, $replacement, $word);
print_r(preg_split('#([a-zA-Z])+\sS(\d+)E(\d+)#i', $word));
Which outputs Array ( [0] => Criminal [1] => WEB DL x264 FUM[ettv] )
Please point me in the right direction.

Use matching rather than splitting if the string is always in this format:
$word = "Criminal.Minds.S10E22.WEB-DL.x264-FUM[ettv]";
preg_match('~^(?<name>.*?)\.S(?<season>\d+)E(?<episode>\d+)~', $word, $m);
print_r($m);
See the PHP demo
Then, you can access the name, season and episode values using $m["name"], $m["season"] and $m["episode"].
Pattern details:
^ - start of string
(?<name>.*?) - a "name" named capturing group matching any 0+ chars other than line break symbols, as few as possible, up to the first occurrence of the subpatterns that follow
\.S - .S substring of literal chars
(?<season>\d+) - a "season" named capturing group matching 1+ digits
E - a literal char E
(?<episode>\d+) - an "episode" named capturing group matching 1+ digits
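To get from the named groups to the exact output asked for in the question (Criminal Minds, 10, 22), the dots left in the name still have to be turned into spaces. A minimal sketch, assuming the input always follows the Name.SxxEyy layout; parseEpisode is a hypothetical helper name, not part of PHP:

```php
<?php
// Hypothetical helper: parse a release name such as "Criminal.Minds.S10E22.WEB-DL..."
function parseEpisode(string $release): ?array
{
    if (!preg_match('~^(?<name>.*?)\.S(?<season>\d+)E(?<episode>\d+)~', $release, $m)) {
        return null; // the string is not in the expected format
    }

    return [
        'name'    => str_replace('.', ' ', $m['name']), // "Criminal.Minds" -> "Criminal Minds"
        'season'  => (int) $m['season'],
        'episode' => (int) $m['episode'],
    ];
}

$parts = parseEpisode('Criminal.Minds.S10E22.WEB-DL.x264-FUM[ettv]');
echo implode(', ', $parts), "\n"; // Criminal Minds, 10, 22
```

Returning null on a failed match keeps the caller from reading undefined array keys when a string does not contain an SxxEyy marker.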

Related

Regex only grabbing first digit

I'm trying to grab everything after the following digits, so I end up with just the store name in this string:
full string: /stores/1077029-gacha-pins
what I want to ignore: /stores/1077029-
what I need to grab: gacha-pins
Those digits can change at any time so it's not specifically that ID, but any numbers after /stores/
My attempt so far is only grabbing /stores/1
\/stores\/[0-9]
I'm still trying, but thought I'd ask for help in the meantime; I'll post an answer if I solve it.

You may use
'~/stores/\d+-\K[^/]+$~'
Or a more specific one:
'~/stores/\d+-\K\w+(?:-\w+)*$~'
See the regex demo and this regex demo.
Details
/stores/ - a literal string
\d+ - 1+ digits
- - a hyphen
\K - match reset operator
[^/]+ - any 1+ chars other than /
\w+(?:-\w+)* - 1+ word chars and then 0+ sequences of - and 1+ word chars
$ - end of string.
See the PHP demo:
$s = "/stores/1077029-gacha-pins";
$rx = '~/stores/\d+-\K[^/]+$~';
if (preg_match($rx, $s, $matches)) {
echo "Result: " . $matches[0];
}
// => Result: gacha-pins
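The attempt in the question stopped at /stores/1 because [0-9] without a quantifier matches exactly one digit. If \K feels unfamiliar, a capturing group gives the same result. A small sketch of that alternative; storeSlug is a hypothetical helper name:

```php
<?php
// Hypothetical helper: return the store slug from a path like "/stores/1077029-gacha-pins"
function storeSlug(string $path): ?string
{
    // \d+ consumes the whole numeric ID; group 1 captures what \K would leave as the match
    return preg_match('~/stores/\d+-([^/]+)$~', $path, $m) ? $m[1] : null;
}

var_dump(storeSlug('/stores/1077029-gacha-pins')); // string(10) "gacha-pins"
var_dump(storeSlug('/stores/gacha-pins'));         // NULL (no numeric ID after /stores/)
```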

You should do it like this:
$string = '/stores/1077029-gacha-pins';
preg_match('#/stores/[0-9-]+(.*)#', $string, $matches);
$part = $matches[1];
print_r($part);

Regex expected value in a position depends on a random value in another position

I need a regex to find all shortcode tag pairs that look like [sc1-g-data]b[/sc1-g-data]. The number next to sc can vary, but it must be the same in the opening and closing tags.
Something like \[sc(.*?)\-((.|\n)*?)\[\/sc(.*?)\- won't work, because it also matches mismatched tag pairs such as [sc1-g-data]b[/sc2-g-data], which I don't want.
So the number expected in the closing tag depends on whichever number appears in the opening tag.

You may use a regex like:
\[(sc\d*-[^\]\[]*)\]([\s\S]*?)\[\/\1\]
See the regex demo
\[ - a [ char
(sc\d*-[^\]\[]*) - Capturing group 1: sc, 0+ digits, -, and then 0+ chars other than ] and [
\] - a ] char
([\s\S]*?) - Capturing group 2: any 0+ chars, as few as possible
\[\/ - a [/ string
\1 - the same text stored in Group 1
\] - a ] char
PHP demo:
$pattern = '~\[(sc\d*-[^][]*)](.*?)\[/\1]~s';
$string = '[sc1-g-data]a[/sc1-g-data] ';
if (preg_match($pattern, $string, $matches)) {
print_r($matches);
}
Mind the use of a single-quoted string literal: if you use a double-quoted one, you will need to write \\1 rather than \1, since '\1' != "\1" in PHP.
Output:
Array
(
[0] => [sc1-g-data]a[/sc1-g-data]
[1] => sc1-g-data
[2] => a
)
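To see the backreference reject a mismatched pair, the same pattern can be run with preg_match_all over a string that mixes valid and invalid pairs. This is only an illustrative sketch; the sample string is made up for the demonstration:

```php
<?php
$pattern = '~\[(sc\d*-[^][]*)](.*?)\[/\1]~s';

// Made-up sample: the middle pair opens with sc1 but closes with sc2
$string = '[sc1-g-data]a[/sc1-g-data] [sc1-g-data]b[/sc2-g-data] [sc2-x]c[/sc2-x]';

preg_match_all($pattern, $string, $matches, PREG_SET_ORDER);
foreach ($matches as $m) {
    // $m[1] is the tag name, $m[2] is the content between the tags
    echo $m[1], ' => ', $m[2], "\n";
}
// sc1-g-data => a
// sc2-x => c
```

The pair containing b is skipped because \1 requires the closing tag to repeat the text captured from the opening tag.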

If your tags are just anything between brackets [blah][/blah] you can use:
\[(.*?)\].*?\[\/\1\]

Match regex pattern that isn't within a bbcode tag

I am attempting to create a regex pattern that will match words in a string that begin with #
Regex that solves this initial problem is '~(#\w+)~'
A second requirement of the code is that it must also ignore any matches that occur within [quote] and [/quote] tags
A couple of attempts that have failed are:
(?:[0-9]+|~(#\w+)~)(?![0-9a-z]*\[\/[a-z]+\])
/[quote[\s\]][\s\S]*?\/quote](*SKIP)(*F)|~(#\w+)~/i
Example: the following string should have an array output as displayed:
$results = [];
$string = "#friends #john [quote]#and #jane[/quote] #doe";
//run regex match
preg_match_all('regex', $string, $results);
//dump results
var_dump($results[1]);
//results: array consisting of:
[1]=>"#friends"
[2]=>"#john"
[3]=>"#doe"

You may use the following regex (based on another related question):
'~(\[quote](?:(?1)|.)*?\[/quote])(*SKIP)(*F)|#\w+~s'
See the regex demo. The regex accounts for nested [quote] tags.
Details
(\[quote](?:(?1)|.)*?\[/quote])(*SKIP)(*F) - matches the pattern inside capturing parentheses and then (*SKIP)(*F) make the regex engine omit the matched text:
\[quote] - a literal [quote] string
(?:(?1)|.)*? - any 0+ (but as few as possible) occurrences of the whole Group 1 pattern ((?1)) or any char (.)
\[/quote] - a literal [/quote] string
| - or
#\w+ - a # followed with 1+ word chars.
PHP demo:
$results = [];
$string = "#friends #john [quote]#and #jane[/quote] #doe";
$rx = '~(\[quote\](?:(?1)|.)*?\[/quote])(*SKIP)(*F)|#\w+~s';
preg_match_all($rx, $string, $results);
print_r($results[0]);
// => Array ( [0] => #friends [1] => #john [2] => #doe )
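The nested case is where the (?1) recursion matters. A short sketch with a made-up string in which one quote sits inside another:

```php
<?php
$rx = '~(\[quote](?:(?1)|.)*?\[/quote])(*SKIP)(*F)|#\w+~s';

// Made-up sample: #d follows the inner [/quote] but is still inside the outer quote
$string = "#a [quote]#b [quote]#c[/quote] #d[/quote] #e";

preg_match_all($rx, $string, $results);
echo implode(' ', $results[0]), "\n"; // #a #e
```

Without the recursion, the lazy quantifier would stop at the first [/quote], and #d would be reported even though it is still inside the outer quote.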

PHP Regex matches between Slash and Subtract

Hello I need a regex to get a string "trkfixo" from
SIP/trkfixo-000072b6
I was trying to use explode but I prefer a regex solution.
$ex = explode("/",$sip);
$ex2 = explode("-",$ex[1]);
echo $ex2[0];

You may use '~/([^-]+)~':
$re = '~/([^-]+)~';
$str = "SIP/trkfixo-000072b6";
preg_match($re, $str, $match);
echo $match[1]; // => trkfixo
See the regex demo and a PHP demo
Pattern details:
/ - matches a /
([^-]+) - Group 1 capturing 1 or more (+) characters other than - ([^-] is a negated character class: it matches any character except those listed inside it).

preg_match('/\/([a-zA-Z]+)-/', "SIP/trkfixo-000072b6", $match);
echo $match[1]; // => trkfixo

Words finder regex fails

I'm using this pattern to check if certain words exist in a string:
/\b(apple|ball|cat)\b/i
It works on the string "cat ball apple"
but not on "no spaces catball smallapple"
How can the pattern be modified so that the words match even if they are combined with other words and even if there are no spaces?

Remove \b from the regex. \b will match a word boundary, and you want to match the string that is not a complete word.
You can also remove the capturing group (denoted by ()) as it is not required any longer.
Use
/apple|ball|cat/i
Regex Demo
An IDEONE PHP demo:
$re = "/apple|ball|cat/i";
$str = "no spaces catball smallapple";
preg_match_all($re, $str, $matches);
print_r($matches[0]);
Results:
[0] => cat
[1] => ball
[2] => apple
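If the word list comes from a variable rather than being typed into the pattern, the alternation can be assembled at run time. A minimal sketch, assuming the words are plain strings; findWords is a hypothetical helper name:

```php
<?php
// Hypothetical helper: find every listed word anywhere in $text, even inside longer words
function findWords(array $words, string $text): array
{
    // preg_quote escapes regex metacharacters (and the "/" delimiter) in each word
    $escaped = array_map(function ($w) {
        return preg_quote($w, '/');
    }, $words);

    // No \b anchors, so "cat" is found inside "catball"
    preg_match_all('/' . implode('|', $escaped) . '/i', $text, $matches);

    return $matches[0];
}

print_r(findWords(['apple', 'ball', 'cat'], 'no spaces catball smallapple'));
// matches, in order: cat, ball, apple
```

Escaping each word first means an entry such as c++ is treated as literal text instead of as regex syntax.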
