RegEx: match all occurrences of X not enclosed by Y


Is it possible to create a regular expression of a pattern X that is not enclosed by a pattern Y using preg_match in PHP?
For example, consider this string:
hello, i said <a>hello</a>
I want a regex that matches the first hello but not the second. I couldn't think of anything.

Use a negative lookbehind, which rejects any hello that immediately follows <a>:
(?<!<a>)hello
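A quick sketch of where the lookbehind stops helping. The second sample string is the more complex one used in the next answer; the variable names are only illustrative.

```php
<?php
// Sample strings: the second one has extra text between <a> and hello.
$simple  = 'hello, i said <a>hello</a>';
$complex = 'hello, i said <a>after arriving say hello</a>';

// The lookbehind only rejects a hello that directly follows "<a>".
$pattern = '/(?<!<a>)hello/';

$simpleCount  = preg_match_all($pattern, $simple, $simpleMatches);
$complexCount = preg_match_all($pattern, $complex, $complexMatches);

echo $simpleCount, "\n";  // 1: only the hello outside the tag
echo $complexCount, "\n"; // 2: the hello inside the tag is matched too
```

So the lookbehind is enough for the exact string in the question, but not once other text sits between the opening tag and the word.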

Description
Assuming your use case is a bit more complex than hello, i said <a>hello</a>, for example if you were looking for every hello in hello, i said <a>after arriving say hello</a>, you might want to capture both the good and the bad matches, then use some programming logic to process only the ones you're interested in.
This expression matches all the <a>...</a> substrings and all the hello strings. Because the alternative for the undesirable substring comes first, a hello that appears inside <a>...</a> is consumed by that alternative and is never included in capture group 1.
<a>.*?<\/a>|\b(hello)\b
Example
Live example: http://ideone.com/jpcqSR
Sample Text
Chello said Hello, i said <a>after arriving say hello</a>
Code
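$string = 'Chello said Hello, i said <a>after arriving say hello</a>';
$regex = '/<a>.*?<\/a>|\b(hello)\b/ims';
// PREG_SET_ORDER returns one entry per match, so $match[1] is that match's group 1
preg_match_all($regex, $string, $matches, PREG_SET_ORDER);
foreach ($matches as $key => $match) {
    // group 1 is only filled when a hello outside <a>...</a> was matched
    if (!empty($match[1])) {
        echo $key . "=" . $match[1];
    }
}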
Output
Note the uppercase H in Hello, which shows that the match is the desired substring outside the <a> tags.
0=Hello
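If the goal is to change the matches rather than list them, the same capture trick can be combined with preg_replace_callback. This is only a sketch: wrapping the word in brackets is a made-up replacement for illustration.

```php
<?php
$string = 'Chello said Hello, i said <a>after arriving say hello</a>';
$regex  = '/<a>.*?<\/a>|\b(hello)\b/is';

// Hypothetical replacement: wrap each hello found outside <a>...</a> in brackets.
$result = preg_replace_callback($regex, function (array $m): string {
    // Group 1 is only populated when the second alternative matched.
    if (isset($m[1]) && $m[1] !== '') {
        return '[' . $m[1] . ']';
    }
    // Otherwise an <a>...</a> block matched: return it unchanged.
    return $m[0];
}, $string);

echo $result, "\n";
// Chello said [Hello], i said <a>after arriving say hello</a>
```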

Related

preg_replace - similar patterns

I have a string that contains something like "LAB_FF, LAB_FF12" and I'm trying to use preg_replace to look for both patterns and replace them with different strings using a pattern match of:
/LAB_[0-9A-F]{2}|LAB_[0-9A-F]{4}/
So input would be
LAB_FF, LAB_FF12
and the output would need to be
DAB_FF, HAD_FF12
Problem is, for the second string, it interprets it as "LAB_FF" instead of "LAB_FF12" and so the output is
DAB_FF, DAB_FF
I've tried splitting the input line out using 2 different preg_match statements, the first looking for the {2} pattern and the second looking for the {4} pattern. This sort of works in that I can get the correct output into 2 separate strings but then can't combine the two strings to give the single amended output.

\b is a word boundary. It makes the pattern match only where the word actually ends, so LAB_[0-9A-F]{2} can no longer match the first part of LAB_FF12.
https://regex101.com/r/upY0gn/1
$pattern = "/\bLAB_[0-9A-F]{2}\b|\bLAB_[0-9A-F]{4}\b/";
Regarding the comment on the other answer about how to replace the string, this is one way.
The pattern creates an empty entry in the output array for each capture group that fails before the one that matches.
In this case there is one (the first).
Then it's just a matter of substr.
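$re = '/(\bLAB_[0-9A-F]{2}\b)|(\bLAB_[0-9A-F]{4}\b)/';
$str = 'LAB_FF12';
preg_match($re, $str, $matches);
var_dump($matches);
// Index 0 is unused so that $substitutes lines up with the capture group numbers.
$substitutes = ["", "DAB", "HAD"];
$result = $str; // fall back to the input if nothing matched
for ($i = 1; $i < count($matches); $i++) {
    if ($matches[$i] != "") {
        // Swap the "LAB" prefix for the substitute that belongs to this group.
        $result = $substitutes[$i] . substr($matches[$i], 3);
        break;
    }
}
echo $result;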
https://3v4l.org/gRvHv
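For the full input from the question, a shorter route might be to pass both patterns to preg_replace as arrays. This is a sketch that assumes the word-boundary patterns above; the capture groups and variable names are added for illustration.

```php
<?php
$input = 'LAB_FF, LAB_FF12';

// One pattern per length; \b stops the 2-digit pattern from matching inside LAB_FF12.
$patterns = [
    '/\bLAB_([0-9A-F]{2})\b/',
    '/\bLAB_([0-9A-F]{4})\b/',
];
// ${1} re-inserts the captured hex digits after the new prefix.
$replacements = [
    'DAB_${1}',
    'HAD_${1}',
];

$output = preg_replace($patterns, $replacements, $input);
echo $output, "\n"; // DAB_FF, HAD_FF12
```

The patterns are applied one after the other, so the second one only sees what the first left untouched.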

You can specify exact amounts in one set of curly braces, e.g. {2,4}.
Just tested this and seems to work:
/LAB_[0-9A-F]{2,4}/
LAB_FF, LAB_FFF, LAB_FFFF
EDIT: My mistake, that actually matches between 2 and 4. If you change the order of your selections it matches the first it comes to, e.g.
/LAB_([0-9A-F]{4}|[0-9A-F]{2})/
LAB_FF, LAB_FFFF
EDIT2: The following will match LAB_even_amount_of_characters:
/LAB_([0-9A-F]{2})+/
LAB_FF, LAB_FFFF, LAB_FFFFFF...
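To see the effect of the alternation order on the input from the question, here is a small comparison. It is a sketch only; the non-capturing group (?:...) is my substitution for the plain parentheses above.

```php
<?php
$input = 'LAB_FF, LAB_FF12';

// Shorter alternative first: the engine stops at LAB_FF and never tries {4}.
preg_match_all('/LAB_[0-9A-F]{2}|LAB_[0-9A-F]{4}/', $input, $shortFirst);

// Longer alternative first: {4} is tried before falling back to {2}.
preg_match_all('/LAB_(?:[0-9A-F]{4}|[0-9A-F]{2})/', $input, $longFirst);

echo implode(', ', $shortFirst[0]), "\n"; // LAB_FF, LAB_FF
echo implode(', ', $longFirst[0]), "\n";  // LAB_FF, LAB_FF12
```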

preg_replace with Regex - find number-sequence in URL

I'm a regex newbie, so sorry for this "simple" question:
I've got a URL like the following:
http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx
What I'm trying to achieve is getting the number sequence (aka the Job-ID) right before the ".aspx" with preg_replace.
I've already figured out that the regex for finding it could be
(?!.*-).*(?=\.)
Now preg_replace needs the opposite of that regular expression. How can I achieve that? Also worth mentioning:
The URL can have multiple numbers in it. I only need the sequence right before ".aspx". Also, there could be some parameters after the ".aspx", like "&mobile=true".
Thank you for your answers!

You can use:
$re = '/[^-.]+(?=\.aspx)/i';
preg_match($re, $input, $matches);
// $matches[0] => 146370543
This matches a run of characters that are neither a hyphen nor a dot and that is followed by .aspx. The lookahead (?=\.aspx) checks for .aspx without making it part of the match.
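A runnable sketch of the lookahead pattern, tried against the URL from the question and against a hypothetical variant with "&mobile=true" appended:

```php
<?php
// The second URL is a hypothetical variant with a trailing parameter.
$urls = [
    'http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx',
    'http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx&mobile=true',
];

$re = '/[^-.]+(?=\.aspx)/i';
$ids = [];
foreach ($urls as $url) {
    // $matches[0] holds the whole match; the lookahead text is not part of it.
    if (preg_match($re, $url, $matches)) {
        $ids[] = $matches[0];
    }
}

print_r($ids); // both entries are 146370543
```

Because the pattern is not anchored to the end of the string, anything after ".aspx" does not get in the way.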

You can just use preg_match (you don't need preg_replace, as you don't want to change the original string) and capture the number before the .aspx. Assuming .aspx is always at the end of the string, the simplest way I could think of is:
<?php
$string = "http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx";
$regex = '/([0-9]+)\.aspx$/';
preg_match($regex, $string, $results);
print $results[1];
?>
A short explanation:
$results contains an array of results. The first element contains the text matched by the complete regex, so it would be 146370543.aspx in this example. The second element contains the group captured by the parentheses around [0-9]+.

You can get the opposite by using this regex:
(\D*)\d+(.*)
MATCH 1
1. [0-100] `http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-`
2. [109-114] `.aspx`
If you only want the number from that URL, you can use this regex:
(\d+)
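If you do want to stay with preg_replace, as the question asks, one possible sketch is to match the whole string and replace it with the captured digits. The pattern below is my own assumption, not one of the regexes above, and the URL is a hypothetical variant with a trailing parameter.

```php
<?php
// Hypothetical variant of the URL from the question, with a trailing parameter.
$url = 'http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx&mobile=true';

// Match the whole string, capture the digits that sit between a "-" and ".aspx",
// and replace everything with that capture.
$jobId = preg_replace('/^.*-(\d+)\.aspx.*$/', '${1}', $url);

echo $jobId, "\n"; // 146370543
```

Note that preg_replace returns the input unchanged when the pattern does not match, so a URL without "-digits.aspx" would come back as-is.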

Regular Expressions: Numeric Value before Occurrence PHP

Given the string:
100,000 this is some text 12,000 this is text I want to match.
I need a regular expression that matches 12,000 based on matching
text I want to match
So, we can get a position with:
strpos($haystack, 'text I want to match');
Then, I guess we could use a regular expression to look backwards, but this is where I need help.

If you know that the digits will always precede the context you want to match ...
preg_match('/([\d,]+)\D*text I want to match/', $str, $match);
var_dump($match[1]);

It is simple:
/ ([0-9,]+) this is text I want to match\.$/
Demo:
http://sandbox.onlinephpfunctions.com/code/b288ca9a322c7a5b54c6490334540ab142b6a979

Another solution:
$re = "/([\\d,]+)(?=\\D*text I want to match)/";
$str = "100,000 this is some text 12,000 this is text I want to match.";
preg_match($re, $str, $matches);
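A side-by-side sketch of the capture-group and lookahead answers on the string from the question (variable names are illustrative):

```php
<?php
$str = '100,000 this is some text 12,000 this is text I want to match.';

// Capture-group version: the full match includes the trailing text.
preg_match('/([\d,]+)\D*text I want to match/', $str, $withContext);

// Lookahead version: the trailing text is checked but not consumed.
preg_match('/([\d,]+)(?=\D*text I want to match)/', $str, $numberOnly);

echo $withContext[1], "\n"; // 12,000
echo $numberOnly[0], "\n";  // 12,000
```

Both skip 100,000 because \D* cannot step over the digits of 12,000 to reach the target text.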

Matching multiple variables using regex (PHP/JS)

I understand how to use PHP's preg_match() to extract a variable sequence from a string. However, I'm not sure what to do if there are 2 variables that I need to match.
Here's the code I'm interested in:
$string1 = "help-xyz123#mysite.com";
$pattern1 = '/help-(.*)#mysite.com/';
preg_match($pattern1, $string1, $matches);
print_r($matches[1]); // prints "xyz123"
$string2 = "business-321zyx#mysite.com";
So basically I'm wondering how to extract two patterns: 1) whether the string's first part is "help" or "business" and 2) whether the second part is "xyz123" vs. "321zyx".
The optional bonus question is what the answer would look like written in JS. I've never really figured out whether a regex (i.e., the code including the slashes, /..../) is always the same in PHP vs. JS (or any language, for that matter).

The solution is pretty simple actually. For each pattern you want to match, place that pattern between parentheses (...). So to extract any pattern, use what you've already used: (.*). To simply distinguish "help" vs. "business", you can use | in your regex pattern:
/(help|business)-(.*)#mysite.com/
The above regex should match both formats. (help|business) basically says, either match help or business.
So the final answer is this:
$string1 = "help-xyz123#mysite.com";
$pattern1 = '/(help|business)-(.*)#mysite.com/';
preg_match($pattern1, $string1, $matches);
print_r($matches[1]); // prints "help"
echo '<br>';
print_r($matches[2]); // prints "xyz123"
The same regex pattern should be usable in JavaScript. You don't need to tweak it.
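Here is a small sketch that runs the two-group pattern over both strings from the question. The ^ and $ anchors and the escaped dot are my additions, so treat them as assumptions rather than part of the answer above.

```php
<?php
$addresses = ['help-xyz123#mysite.com', 'business-321zyx#mysite.com'];

// Same idea as above, with the dot escaped so it only matches a literal ".".
$pattern = '/^(help|business)-(.*)#mysite\.com$/';

$parsed = [];
foreach ($addresses as $address) {
    if (preg_match($pattern, $address, $matches)) {
        // $matches[1] is the prefix, $matches[2] the variable part.
        $parsed[] = [$matches[1], $matches[2]];
    }
}

print_r($parsed);
```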

Yes, Kemal is right. You can use the same pattern in JavaScript.
var str="business-321zyx#mysite.com";
var patt1=/(help|business)-(.*)#mysite.com/;
document.write(str.match(patt1));
Just pay attention to the return values, which differ between the two functions.
PHP returns an array with more information than this JavaScript code does.

How to get the word which is in first quotation marks?

Let's say I have this text (not to be treated as PHP code):
$this->validation->set('username','username','trim');
$this->validation->set('password','password','trim');
$this->validation->set('password2','password2','trim');
$this->validation->set('name','name','trim');
$this->validation->set('surname','surname','trim');
I want to get the list of the first words after set( that are in quotation marks on every line, so the output for the input above must look like this:
username
password
password2
name
surname
I think it's possible with regular expressions. My question is: how can I get the list of the words that are in the first quotation marks with PHP?

Let's say the variable $text holds the data from your question.
Let's analyse the regular expression /set\('(.*?)'/:
/ is the delimiter.
set\(' and ' are the strings set(' and ', respectively.
(.*?) captures the least amount of (arbitrary) characters between the two aforementioned strings. [1]
As a result, this regular expression matches the set('username' part of:
$this->validation->set('username','username','trim');
To store all the strings you need in the array $matches[1], we can use the function preg_match_all.
It suffices to call preg_match_all("/set\('(.*?)'/", $text, $matches).
[1] See also: Regex Tutorial - Repetition with Star and Plus - Laziness Instead of Greediness
Example code:
$text = <<<EOF
\$this->validation->set('username','username','trim');
\$this->validation->set('password','password','trim');
\$this->validation->set('password2','password2','trim');
\$this->validation->set('name','name','trim');
\$this->validation->set('surname','surname','trim');
EOF;
preg_match_all("/set\('(.*?)'/", $text, $matches);
print_r($matches[1]);

$arr = explode("'","this->validation->set('surname','surname','trim')");
print_r($arr);
Not sure why you would want to do something like that, but the above should work: the first quoted word ends up in $arr[1].
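To apply the explode idea to several lines, something like the following sketch might do. It assumes the lines are separated by "\n" and uses a shortened copy of the text from the question.

```php
<?php
// Nowdoc (quoted EOF) keeps the $ signs literal.
$text = <<<'EOF'
$this->validation->set('username','username','trim');
$this->validation->set('password','password','trim');
$this->validation->set('name','name','trim');
EOF;

$words = [];
foreach (explode("\n", $text) as $line) {
    $parts = explode("'", $line);
    // $parts[0] is everything before the first quote, $parts[1] the first quoted word.
    if (isset($parts[1])) {
        $words[] = $parts[1];
    }
}

print_r($words);
```

Unlike the regex answer, this does not check that the quote comes right after set(, so it only fits input shaped exactly like the sample.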
