Include matched part in regular expression

Include matched part in regular expression - php

My try:
$a = preg_split("/[0-9](\-)[0-9]/", $d);
print_r($a);
If $d=sometext9-9sometext, I want to be able to get from print_r($a);
Array
(
[0] => sometext9
[1] => -
[2] => 9sometext
)
What am I missing?

You may use
$re = "/(?<=[0-9])(-)(?=[0-9])/";
$str = "sometext9-9sometext";
$a = preg_split($re, $str, -1, PREG_SPLIT_DELIM_CAPTURE); // third argument is the limit; -1 means no limit
print_r($a);
See IDEONE demo. Since the - is in Group 1 (enclosed with (...)) and we use the PREG_SPLIT_DELIM_CAPTURE flag, the hyphen is returned as part of the resulting array.
PREG_SPLIT_DELIM_CAPTURE
If this flag is set, parenthesized expression in the delimiter pattern will be captured and returned as well.
The lookarounds (?<=[0-9]) and (?=[0-9]) check for the digits on both sides of the hyphen but do not consume them, so the digits stay in the elements adjoining the -. See more on that behavior at Lookarounds Stand their Ground.
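To make the difference concrete, here is a small sketch that runs both the pattern from the question and the lookaround pattern over the same input. The variable names $consumed and $kept are only illustrative.

```php
<?php
$str = 'sometext9-9sometext';

// Pattern from the question: the digits belong to the delimiter, so they are consumed.
$consumed = preg_split('/[0-9](-)[0-9]/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);

// Lookaround pattern: the digits are only asserted, so they stay in the pieces.
$kept = preg_split('/(?<=[0-9])(-)(?=[0-9])/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);

print_r($consumed); // sometext, -, sometext
print_r($kept);     // sometext9, -, 9sometext
```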

Related

Matching whole words between commas, or a comma at the beginning, or a comma at the end with Regex

I have a string like this:
page-9000,page-template,page-type,page-category-128,image-195,listing-latest,rss-latest,even-more-info,even-more-tags
I made this regex that I expect to get the whole tags with:
(?<=\,)(rss-latest|listing-latest-no-category|category-128|page-9000)(?=\,)
I want it to match all the occurrences.
In this case:
page-9000 and rss-latest.
This regex checks whole words between commas just fine but it ignores the first and the last because it's not between commas (obviously).
I've also tried that it checks if it's between commas OR one comma at the beginning OR one comma to the end, however it would give me false positives, as it would match:
category-128
while the string contains:
page-category-128
Any help?

Try using the following pattern:
(?<=,|^)(rss-latest|listing-latest-no-category|category-128|page-9000)(?=,|$)
The only change I have made is to add boundary markers ^ and $ to the lookarounds to also match on the start and end of the input.
Script:
$input = "page-9000,page-template,page-type,page-category-128,image-195,listing-latest,rss-latest,even-more-info,even-more-tags";
preg_match_all("/(?<=,|^)(rss-latest|listing-latest-no-category|category-128|page-9000)(?=,|$)/", $input, $matches);
print_r($matches[1]);
This prints:
Array
(
[0] => page-9000
[1] => rss-latest
)

Here is a non-regex way using explode and array_intersect:
$arr1 = explode(',', 'page-9000,page-template,page-type,page-category-128,image-195,listing-latest,rss-latest,even-more-info,even-more-tags');
$arr2 = explode('|', 'rss-latest|listing-latest-no-category|category-128|page-9000');
print_r(array_intersect($arr1, $arr2));
Output:
Array
(
[0] => page-9000
[6] => rss-latest
)
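Note that array_intersect preserves the keys of the first array, which is why the output above shows keys 0 and 6. If a plain list is wanted, a minimal sketch (the names $tags, $wanted and $found are just for illustration) could reindex the result with array_values:

```php
<?php
$tags = explode(',', 'page-9000,page-template,page-type,page-category-128,image-195,listing-latest,rss-latest,even-more-info,even-more-tags');
$wanted = ['rss-latest', 'listing-latest-no-category', 'category-128', 'page-9000'];

// array_intersect keeps the original keys (0 and 6); array_values renumbers them from 0.
$found = array_values(array_intersect($tags, $wanted));

print_r($found);
```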

The (?<=\,) and (?=\,) lookarounds require a , on both sides of the matching pattern. You also want to match at the start and end of the string, so you need to either say explicitly "a , or the start/end of the string", or use double-negating logic: a negated character class inside a negative lookaround.
You may use
(?<![^,])(?:rss-latest|listing-latest-no-category|category-128|page-9000)(?![^,])
See the regex demo
Here, (?<![^,]) matches the start of string position or a , and (?![^,]) matches the end of string position or ,.
Now, you do not even need a capturing group, you may get rid of its overhead using a non-capturing group, (?:...). preg_match_all won't have to allocate memory for the submatches and the resulting array will be much cleaner.
PHP demo:
$re = '/(?<![^,])(?:rss-latest|listing-latest-no-category|category-128|page-9000)(?![^,])/m';
$str = 'page-9000,page-template,page-type,page-category-128,image-195,listing-latest,rss-latest,even-more-info,even-more-tags';
if (preg_match_all($re, $str, $matches)) {
print_r($matches[0]);
}
// => Array ( [0] => page-9000 [1] => rss-latest )
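If the list of tags is not hard-coded, the same pattern could be assembled at run time. The following is only a sketch under that assumption; buildTagPattern is a hypothetical helper, not part of PHP, and it uses preg_quote so that tags containing regex metacharacters stay literal.

```php
<?php
// Hypothetical helper: builds the "whole item between commas" pattern from a tag list.
function buildTagPattern(array $tags): string
{
    $quoted = array_map(function ($tag) {
        return preg_quote($tag, '/'); // escape metacharacters and the delimiter
    }, $tags);

    return '/(?<![^,])(?:' . implode('|', $quoted) . ')(?![^,])/';
}

$re = buildTagPattern(['rss-latest', 'listing-latest-no-category', 'category-128', 'page-9000']);
$str = 'page-9000,page-template,page-type,page-category-128,image-195,listing-latest,rss-latest,even-more-info,even-more-tags';

preg_match_all($re, $str, $matches);
print_r($matches[0]); // category-128 is not reported, because it only occurs inside page-category-128
```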

regex: select all characters before and after a specific string

I want to select all text before and after a specific substring. I used the following expression to do that, but it is not selecting all the needed text:
/^(?:(?!\<\?php echo[\s?](.*?)\;[\s?]\?\>).)*/
for example:
$re = '/^(?:(?!\<\?php echo[\s?](.*?)\;[\s?]\?\>).)*/';
$str = 'customFields[<?php echo $field["id"]; ?>][type]';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
it will select only this part customFields[, while the expected result should be customFields[ and ][type]
check this link for debugging

The pattern ^(?:(?!\<\?php echo[\s?](.*?)\;[\s?]\?\>).)* uses a tempered greedy token which matches any character except a newline from the start of the string ^ that fulfills the assertion of the negative lookahead.
That will only match customFields[
For your example data you could keep using a tempered greedy token (regex demo), but you could also use negated character classes together with (*SKIP)(*FAIL), which makes the engine match the echo tag, discard it, and continue after it:
^[^[]+\[|<\?php echo\s(.*?)\;\s\?\>(*SKIP)(*FAIL)|\]\[[^]]*\]
Regex demo | Php demo
For example
$re = '/^[^[]+\[|<\?php echo\s(.*?)\;\s\?\>(*SKIP)(*FAIL)|\]\[[^]]*\]/';
$str = 'customFields[<?php echo $field["id"]; ?>][type]';
preg_match_all($re, $str, $matches, PREG_SET_ORDER);
print_r($matches);
Result
Array
(
[0] => Array
(
[0] => customFields[
)
[1] => Array
(
[0] => ][type]
)
)
To get a more exact match you might also use capturing groups:
^((?:(?!<\?php echo[\s?](?:.*?)\;\s\?>).)*)<\?php echo\s(?:.*?)\;[\s?]\?>(.*)$
regex demo | Php demo
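The capturing-group pattern above is given without PHP code, so here is a minimal sketch of how it might be used. The ~ delimiter and the variable names are choices made for this example; group 1 holds the text before the echo tag and group 2 the text after it.

```php
<?php
$re = '~^((?:(?!<\?php echo[\s?](?:.*?)\;\s\?>).)*)<\?php echo\s(?:.*?)\;[\s?]\?>(.*)$~';
$str = 'customFields[<?php echo $field["id"]; ?>][type]';

if (preg_match($re, $str, $m)) {
    $before = $m[1]; // text in front of the echo tag
    $after  = $m[2]; // text behind the echo tag
    echo $before, "\n", $after, "\n";
}
```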

What about using positive lookarounds:
(.*)(?=\<\?php echo)|(?<=\?\>)(.*)
Demo

Get numbers after string php regex

I need a regex that can actually get any number that is inserted after "ab" and "cr". For example, I have a string like this:
rw200-208-ab66
fg200-cr30-201
I need to print ab66 and cr30.
I have tried using strpos:
if (strpos($part,'ab') !== false) {
$a = explode("ab", $part);
echo 'ab'.$a[1];
}
That does not work for the second item.

You could use \K to discard the previously matched characters from the final match. The regex below returns only the number that follows ab or cr.
(?:ab|cr)\K\d+
To get the number together with the letters, use
preg_match_all('~(?:ab|cr)\d+~', $str, $match);
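As a quick side-by-side sketch of the two patterns on the sample data from the question (the variable names are illustrative only):

```php
<?php
$str = "rw200-208-ab66\nfg200-cr30-201";

// With \K the prefix is matched but dropped from the reported match.
preg_match_all('~(?:ab|cr)\K\d+~', $str, $digitsOnly);

// Without \K the prefix is part of the reported match.
preg_match_all('~(?:ab|cr)\d+~', $str, $withPrefix);

print_r($digitsOnly[0]); // 66, 30
print_r($withPrefix[0]); // ab66, cr30
```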

Use this regex:
(?>ab|cr)\d+
See IDEONE demo:
$re = "#(?>ab|cr)\d+#";
$str = "rw200-208-ab66\nfg200-cr30-201";
preg_match_all($re, $str, $matches);
print_r($matches[0]);
Output:
Array
(
[0] => ab66
[1] => cr30
)

Get all occurrences of words between curly brackets

I have a text like:
This is a {demo} phrase made for {test}
I need to get
demo
test
Note: My text can have more than one block of {}, not always two. Example:
This is a {demo} phrase made for {test} written in {English}
I used this expression /{([^}]*)}/ with preg_match but it returns only the first word, not all words inside the text.

Use preg_match_all instead:
preg_match_all($pattern, $input, $matches);
It's much the same as preg_match, with the following stipulations:
Searches subject for all matches to the regular expression given in
pattern and puts them in matches in the order specified by flags.
After the first match is found, the subsequent searches are continued
on from end of the last match.
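A short sketch of the difference, using the three-block example from the question and its original pattern (the variable names $first and $all are just for illustration):

```php
<?php
$text = 'This is a {demo} phrase made for {test} written in {English}';

// preg_match stops after the first match.
preg_match('/{([^}]*)}/', $text, $first);

// preg_match_all keeps searching from the end of each previous match.
preg_match_all('/{([^}]*)}/', $text, $all);

echo $first[1], "\n"; // demo
print_r($all[1]);     // demo, test, English
```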

Your expression is correct, but you should be using preg_match_all() instead to retrieve all matches. Here's a working example of what that would look like:
$s = 'This is a {demo} phrase made for {test}';
if (preg_match_all('/{([^}]*)}/', $s, $matches)) {
echo join("\n", $matches[1]);
}
To also capture the position of each match, pass PREG_OFFSET_CAPTURE as the fourth parameter to preg_match_all. Each match then becomes an array holding the matched text and its byte offset:
if (preg_match_all('/{([^}]*)}/', $s, $matches, PREG_OFFSET_CAPTURE)) {
foreach ($matches[1] as $match) {
echo "{$match[0]} occurs at position {$match[1]}\n";
}
}

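As { and } are part of regex quantifier syntax, it is safest to escape them when you mean literal braces. (PCRE treats a { that does not begin a valid quantifier as a literal, so the unescaped pattern also works, but escaping removes the ambiguity.)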
<?php
$text = <<<EOD
this {is} some text {from}
which I {may} want to {extract}
some words {between} brackets.
EOD;
preg_match_all("!\{(\w+)\}!", $text, $matches);
print_r($matches);
?>
produces
Array
(
[0] => Array
(
[0] => {is}
[1] => {from}
[2] => {may}
[3] => {extract}
[4] => {between}
)
... etc ...
)
This example may be helpful to understand the use of curly brackets in regexes:
<?php
$str = 'abc212def3456gh34ij';
preg_match_all("!\d{3,}!", $str, $matches);
print_r($matches);
?>
which returns:
Array
(
[0] => Array
(
[0] => 212
[1] => 3456
)
)
Note that '34' is excluded from the results because the \d{3,} requires a match of at least 3 consecutive digits.

Matching the portions between a pair of braces with a regex is a quick-and-dirty patch. For properly parsing and processing the input string, in particular when braces can be nested, a stack is the better tool. Visit here for the concept and here for applying the same.
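The stack idea could look roughly like the sketch below. extractBraced is a hypothetical helper written for this example, not a built-in function. It assumes single-byte brace characters and simply ignores unbalanced closing braces.

```php
<?php
// Hypothetical helper: returns the content of every balanced {...} group,
// including nested ones, in the order in which their closing braces appear.
function extractBraced(string $text): array
{
    $stack = [];  // offsets of the '{' characters that are still open
    $found = [];
    $length = strlen($text);

    for ($i = 0; $i < $length; $i++) {
        if ($text[$i] === '{') {
            $stack[] = $i;
        } elseif ($text[$i] === '}' && $stack !== []) {
            $start = array_pop($stack);
            $found[] = substr($text, $start + 1, $i - $start - 1);
        }
    }

    return $found;
}

$flat = extractBraced('This is a {demo} phrase made for {test}');
$nested = extractBraced('a {outer {inner} text} b');

print_r($flat);
print_r($nested);
```

For flat input this gives the same words as the regex answers. The nested case is where the stack earns its keep, since [^}]* stops at the first closing brace.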

Regex For Get Last URL

I have:
stackoverflow.com/.../link/Eee_666/9_uUU/66_99U
What regex for /Eee_666/9_uUU/66_99U?
Eee_666, 9_uUU, and 66_99U is a random value
How can I solve it?

As simple as that:
$link = "stackoverflow.com/.../link/Eee_666/9_uUU/66_99U";
$regex = '~link/([^/]+)/([^/]+)/([^/]+)~';
# captures anything that is not a / in three different groups
preg_match_all($regex, $link, $matches);
print_r($matches);
Be aware though that it eats up any character except the / (including newlines), so you either want to exclude other characters as well or feed the engine only strings with your format.
See a demo on regex101.com.

You can use \K here to make it more thorough.
stackoverflow\.com/.*?/link/\K([^/\s]+)/([^/\s]+)/([^/\s]+)
See demo.
https://regex101.com/r/jC8mZ4/2

In case you don't know the length of the string:
$string = 'stackoverflow.com/.../link/Eee_666/9_uUU/66_99U';
$regexp = '~([^/]+$)~';
preg_match($regexp, $string, $m);
Result:
$m[1] = 66_99U
Be careful: it may also capture the end-of-line character.
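The end-of-line caveat can be seen with a trailing newline in the input. One possible way around it, shown here as a sketch rather than the answer's own method, is to exclude whitespace from the character class as well:

```php
<?php
$line = "stackoverflow.com/.../link/Eee_666/9_uUU/66_99U\n";

// [^/] also accepts the newline, so it ends up inside the capture.
preg_match('~([^/]+$)~', $line, $loose);

// Excluding whitespace leaves the newline out; $ still matches just before it.
preg_match('~([^/\s]+)$~', $line, $strict);

var_dump($loose[1]);
var_dump($strict[1]);
```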

For this kind of requirement, it's simpler to use preg_split combined with array_slice:
$url = 'stackoverflow.com/.../link/Eee_666/9_uUU/66_99U';
$elem = array_slice(preg_split('~/~', $url), -3);
print_r($elem);
Output:
Array
(
[0] => Eee_666
[1] => 9_uUU
[2] => 66_99U
)
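Since the delimiter here is a fixed string rather than a pattern, explode should give the same result without involving the regex engine. A minimal sketch:

```php
<?php
$url = 'stackoverflow.com/.../link/Eee_666/9_uUU/66_99U';

// Split on the literal "/" and keep the last three segments.
$elem = array_slice(explode('/', $url), -3);

print_r($elem);
```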

