don't match string in brackets php regex


I've been trying to use preg_replace() in PHP to replace text in a string. I want to match and replace every 's' in the strings below, but the only solution I came up with matches just the 's' between 'b' and 'c' or the 's' between > and <. Is there a way to use a negative lookbehind not just for the character '>' but for a whole string? I don't want to replace anything inside the brackets.
<text size:3>s<text size:3>absc
<text size:3>xxetxx<text size:3>sometehing
Edit:
I only want to get the 's' in >s< and in bsc. Then, when I change the search string, for example from 's' to 'te', it should replace the 'te' in xtex and sometehing. So I am looking for a regular expression that avoids replacing anything inside <....>

You can use this pattern:
$pattern = '/((<[^>]*>)*)([^s]*)s/';
$replace = '\1\3■'; # ■ = your replacement string
$result = preg_replace( $pattern, $replace, $str );
regex101 demo
Pattern explanation:
( # group 1:
(<[^>]*>)* # group 2: zero-or-more <...>
)
([^s]*) # group 3: zero-or-more not “s”
s # literally “s”
For case-insensitive matching, add the “i” modifier at the end of the pattern:
$pattern = '/((<[^>]*>)*)([^s]*)s/i';
Edit: Replacement explanation
In the search pattern we have 3 groups surrounded by round brackets. In the replace string we can refer to groups by syntax \1, where 1 is the group number.
So, the replace string in the example means: replace group 1 with itself, replace group 3 with itself, and replace “s” with the desired replacement. Group 2 is not referenced because its text is already contained in group 1: a repeated capturing group keeps only its last repetition, so group 1 is wrapped around it to capture all the tags.
In the demo string:
abs<text size:3>ssss<text size:3><img src="img"><text size:3>absc
└┘╵└───────────┘╵╵╵╵└───────────────────────────────────────┘└┘╵╵
└─┘└────────────┘╵╵╵└──────────────────────────────────────────┘
1 2 345 6
Pattern matches:
       group 1     group 3     s
       ---------   ---------   ---------
1 >    0           1           1
2 >    1           0           1
3 >    0           0           1
4 >    0           0           1
5 >    0           0           1
6 >    3           1           1
The last “c” is not matched, so it is not replaced.
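An alternative sketch, not part of the answer above: PCRE's backtracking control verbs (*SKIP)(*F) let the pattern consume each <...> tag and then discard it, so only text outside the tags can match. This may help with input like the question's second line, where text without an “s” is followed by another tag and [^s]* could run on into that tag. The function name replaceOutsideTags is hypothetical, and the sketch assumes tags never contain a literal > inside them.

```php
<?php
// Hypothetical helper: replace $search only where it occurs outside <...> tags.
function replaceOutsideTags(string $text, string $search, string $replacement): string
{
    // First alternative: match a whole tag, then (*SKIP)(*F) rejects it and
    // resumes scanning after the tag. Second alternative: the literal search text.
    $pattern = '/<[^>]*>(*SKIP)(*F)|' . preg_quote($search, '/') . '/';

    // A callback keeps the replacement literal (no \1 or $1 interpretation).
    $result = preg_replace_callback(
        $pattern,
        function () use ($replacement) {
            return $replacement;
        },
        $text
    );

    return $result ?? $text; // fall back to the input if PCRE reports an error
}

echo replaceOutsideTags('<text size:3>s<text size:3>absc', 's', 'X'), "\n";
// <text size:3>X<text size:3>abXc
echo replaceOutsideTags('<text size:3>xxetxx<text size:3>sometehing', 'te', 'Y'), "\n";
// <text size:3>xxetxx<text size:3>someYhing
```

Because the tag alternative always fails after skipping, nothing inside a tag is ever reported as a match, whatever the search string is.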

Use preg_match_all to get all the s letters and use it with flag PREG_OFFSET_CAPTURE to get the indices.
The regular expression $pat contains a negative lookahead and lookbehind so that the s inside the brackets expression is not matched.
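In this example I replace s with the single character 5. Assigning to a string offset writes only one byte, so this loop works for one-character replacements only; for a longer replacement, pass the same pattern to preg_replace():
<?php
$s = " <text size:3>s<text size:3>absc";
// Skip an s that is preceded by "<text " or followed by "ize:3>", i.e. the s in "size".
$pat = "/(?<!\<text )s(?!ize:3\>)/";
preg_match_all($pat, $s, $matches, PREG_OFFSET_CAPTURE);
foreach ($matches[0] as $match) {
    // $match[1] is the byte offset of this s in $s.
    $s[$match[1]] = "5";
}
print_r(htmlspecialchars($s));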

Related

Regular expression for highlighting numbers between words

Site users enter numbers in different ways, example:
from 8 000 packs
432534534
from 344454 packs
45054 packs
04 555
434654
54 564 packs
I am looking for a regular expression that captures the words before the digits (if there are any), the digits in any format, and the words after them (if there are any), preferably without the surrounding spaces.
This is the pattern I have now, but it does not work correctly:
(^[0-9|a-zA-Z].*?)\s([0-9].*?)\s([a-zA-Z]*$)
The main purpose of this is to put the strings in order, bring them to the same form, format them in PHP digit format, etc.
As a result, I need to get the text before the digits, the digits themselves and the text after them into the variables separately.
$before = 'from';
$num = '8000';
$after = 'packs';
Thank you for any help in this matter)

I think you may try this:
^(\D+)?([\d \t]+)(\D+)?$
group 1: optional (?) group that contains anything but digits
group 2: mandatory group that contains only digits and whitespace characters such as space and tab
group 3: optional (?) group that contains anything but digits
Demo
Source (run)
$re = '/^(\D+)?([\d \t]+)(\D+)?$/m';
$str = 'from 8 000 packs
432534534
from 344454 packs
45054 packs
04 555
434654
54 564 packs
';
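preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
foreach ($matches as $matchgroup)
{
    echo "before: ".$matchgroup[1]."\n";
    // Strip the whitespace inside the number, e.g. "8 000" becomes "8000".
    echo "number:".preg_replace('/\D/m','',$matchgroup[2])."\n";
    // With PREG_SET_ORDER a trailing group that did not match is left out of the array.
    echo "after:".($matchgroup[3] ?? '')."";
    echo "\n\n\n";
}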

I corrected your regex and added groups, the regex looks like this:
^(?<before>[a-zA-Z]+)?\s?(?<number>[0-9].*?)\s?(?<after>[a-zA-Z]+)?$
Test regex here: https://regex101.com/r/QLEC9g/2
By using groups you can easily separate the words and numbers, and handle them any way you want.
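As a rough illustration of how the named groups could be read in PHP, here is a minimal sketch built around the pattern above. The function name splitAroundNumber and the returned array shape are assumptions for this example, not part of the answer.

```php
<?php
// Hypothetical helper: split one line into the text before the number,
// the number with its inner spaces removed, and the text after it.
function splitAroundNumber(string $line): ?array
{
    $re = '/^(?<before>[a-zA-Z]+)?\s?(?<number>[0-9].*?)\s?(?<after>[a-zA-Z]+)?$/';
    if (!preg_match($re, $line, $m)) {
        return null; // the line contains no digit in the expected place
    }

    return [
        // ?? '' covers optional groups that did not take part in the match.
        'before' => $m['before'] ?? '',
        'number' => preg_replace('/\D/', '', $m['number']),
        'after'  => $m['after'] ?? '',
    ];
}

print_r(splitAroundNumber('from 8 000 packs'));
print_r(splitAroundNumber('432534534'));
```

Named groups appear in the match array under both their name and their numeric index, so $m['number'] and $m[2] refer to the same text.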

Your pattern does not match because there are 4 required parts that all expect 1 character to be present:
(^[0-9|a-zA-Z].*?)\s([0-9].*?)\s([a-zA-Z]*$)
  ^^^^^^^^^^^^    ^^ ^^^^^    ^^
The other thing to note is that the first character class [0-9|a-zA-Z] can also match digits (you can omit the |, as inside a character class it matches a literal pipe char).
If you would allow all other chars than digits on the left and right, and there should be at least a single digit present, you can use a negated character class [^\d\r\n]* optionally matching any character except a digit or a newline:
^([^\d\r\n]*)\h*(\d+(?:\h+\d+)*)\h*([^\d\r\n]*)$
^ Start of string
([^\d\r\n]*) Capture group 1, match any char except a digit or a newline
\h* Match optional horizontal whitespace chars
(\d+(?:\h+\d+)*) Capture group 2, match 1+ digits and optionally repeat matching spaces and 1+ digits
\h* Match optional horizontal whitespace chars
([^\d\r\n]*) Capture group 3, match any char except a digit or a newline
$ End of string
See a regex demo and a PHP demo.
For example
$re = '/^([^\d\r\n]*)\h*(\d+(?:\h+\d+)*)\h*([^\d\r\n]*)$/m';
$str = 'from 8 000 packs
test from 8 000 packs test
432534534
from 344454 packs
45054 packs
04 555
434654
54 564 packs';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
foreach($matches as $match) {
list(,$before, $num, $after) = $match;
echo sprintf(
"before: %s\nnum:%s\nafter:%s\n--------------------\n",
$before, preg_replace("/\h+/", "", $num), $after
);
}
Output
before: from
num:8000
after:packs
--------------------
before: test from
num:8000
after:packs test
--------------------
before:
num:432534534
after:
--------------------
before: from
num:344454
after:packs
--------------------
before:
num:45054
after:packs
--------------------
before:
num:04555
after:
--------------------
before:
num:434654
after:
--------------------
before:
num:54564
after:packs
--------------------
If there should be at least a single digit present, and the only allowed characters are a-z for the word(s), you can use a case insensitive pattern:
(?i)^((?:[a-z]+(?:\h+[a-z]+)*)?)\h*(\d+(?:\h+\d+)*)\h*((?:[a-z]+(?:\h+[a-z]+)*)?)?$
See another regex demo and a php demo.

regex expected value in a position depends on a random value in another position

I need a regex to find all shortcode tag pairs that look like [sc1-g-data]b[/sc1-g-data]. The number next to sc can vary, but it must be the same in the opening and closing tags.
Something like \[sc(.*?)\-((.|\n)*?)\[\/sc(.*?)\- won't work, because it also matches mismatched tag pairs such as [sc1-g-data]b[/sc2-g-data], which I don't want.
So the expected number in the second tag depends on the number that happens to be in the first tag.

You may use a regex like:
\[(sc\d*-[^\]\[]*)\]([\s\S]*?)\[\/\1\]
See the regex demo
\[ - a [ char
(sc\d*-[^\]\[]*) - Capturing group 1: sc, 0+ digits, -, and then 0+ chars other than ] and [
\] - a ] char
([\s\S]*?) - Capturing group 2: any 0+ chars, as few as possible
\[\/ - a [/ string
\1 - the same text stored in Group 1
\] - a ] char
PHP demo:
$pattern = '~\[(sc\d*-[^][]*)](.*?)\[/\1]~s';
$string = '[sc1-g-data]a[/sc1-g-data] ';
if (preg_match($pattern, $string, $matches)) {
print_r($matches);
}
Mind the single-quoted string literal: in a double-quoted string you would need \\1 rather than \1, because "\1" is an octal escape sequence in PHP, so '\1' != "\1".
Output:
Array
(
[0] => [sc1-g-data]a[/sc1-g-data]
[1] => sc1-g-data
[2] => a
)
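To see the backreference reject a mismatched pair, here is a small sketch that runs the same pattern with preg_match_all over a longer input. The function name findMatchedShortcodes and the sample text are made up for this illustration.

```php
<?php
// Hypothetical helper: return [tag, content] for every shortcode pair whose
// closing tag repeats the opening tag exactly.
function findMatchedShortcodes(string $text): array
{
    // \1 must repeat whatever group 1 captured, so [sc1-...] ... [/sc2-...] cannot match.
    $pattern = '~\[(sc\d*-[^][]*)](.*?)\[/\1]~s';
    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

    $pairs = [];
    foreach ($matches as $m) {
        $pairs[] = [$m[1], $m[2]];
    }
    return $pairs;
}

$text = "[sc1-g-data]a[/sc1-g-data] [sc1-g-data]b[/sc2-g-data] [sc22-x]c\nd[/sc22-x]";
print_r(findMatchedShortcodes($text));
// Finds sc1-g-data => "a" and sc22-x => "c\nd"; the sc1/sc2 pair is skipped.
```

The s modifier lets the dot cross the line break inside the last pair.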

If your tags are just anything between brackets [blah][/blah] you can use:
\[(.*?)\].*?\[\/\1\]

Get all matched groups PREG PHP flavor

Pattern : '/x(?: (\d))+/i'
String : x 1 2 3 4 5
Returned : 1 Match Position[11-13] '5'
I want to catch all possible repetitions, or does it return 1 result per group?
I want the following :
Desired Output:
MATCH 1
1. [4-5] `1`
2. [6-7] `2`
3. [8-9] `3`
4. [10-11] `4`
5. [12-13] `5`
I was able to achieve this by copying and pasting the group, but that is not what I want. I want the group captures to be dynamic:
Pattern: x(?: (\d))(?: (\d))(?: (\d))(?: (\d))(?: (\d))

You cannot use one group to capture multiple texts and then access them all with PCRE: a repeated group keeps only its last capture. Instead, you can either match the whole substring with \d+(?:\s+\d+)* and then split it on whitespace:
$str = "x 1 2 3 4 5";
$re2 = '~\d+(?:\s+\d+)*~';
if (preg_match($re2, $str, $match2)) {
    print_r(preg_split("/\\s+/", $match2[0]));
}
Alternatively, use a \G based regex to return multiple matches:
(?:x|(?!^)\G)\s*\K\d+
See demo
Here is a PHP demo:
$str = "x 1 2 3 4 5";
$re1 = '~(?:x|(?!^)\G)\s*\K\d+~';
preg_match_all($re1, $str, $matches);
var_dump($matches);
Here, (?:x|(?!^)\G) is acting as a leading boundary (match the whitespaces and digits only after x or each successful match). When the digits are encountered, all the characters matched so far are omitted with the \K operator.
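The boundary behaviour described above can be checked with a few extra inputs. This is a minimal sketch around the same pattern; the function name digitsAfterX and the additional sample strings are assumptions for this example.

```php
<?php
// Hypothetical helper: collect the digit groups that directly follow an "x",
// separated only by whitespace.
function digitsAfterX(string $input): array
{
    // x starts a chain; (?!^)\G continues it at the end of the previous match.
    // \K drops everything matched so far, so only the digits are returned.
    preg_match_all('~(?:x|(?!^)\G)\s*\K\d+~', $input, $matches);
    return $matches[0];
}

print_r(digitsAfterX('x 1 2 3 4 5')); // 1, 2, 3, 4, 5
print_r(digitsAfterX('1 2 x 3 4'));   // 3, 4 (digits before the x are ignored)
print_r(digitsAfterX('x 1 2 y 3'));   // 1, 2 (the chain breaks at y)
```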

Regex negative prefix and suffix

I have these strings for example:
download-google-chrome-free
download-mozilla-firefox-free
And also these strings:
google-chrome
mozilla-firefox
My prefix is download- and my suffix is -free. I need a regular expression that captures the name, e.g. google-chrome. If the input string has no download- prefix and no -free suffix, the captured group is the string itself.
String 1 = download-google-chrome-free (with prefix and suffix)
String 1 Group 1 = download-
String 1 Group 2 = google-chrome
String 1 Group 3 = -free
String 2 = google-chrome (without prefix and suffix)
String 2 Group 1 = '' (empty)
String 2 Group 2 = google-chrome
String 2 Group 3 = '' (empty)
Can this be done? I am using preg_match in PHP.

preg_match('/(download-)?(google-chrome|mozilla-firefox)(-free)?/', $string, $match);
The ? indicates that the group before it is optional. If the prefix or suffix isn't in the string, those capture groups will be empty in $match.
If you don't actually want to return the groups with the optional prefix and suffix, make them non-capturing groups by putting ?: at the beginning of the group:
preg_match('/(?:download-)?(google-chrome|mozilla-firefox)(?:-free)?/', $string, $match);
Now $match[1] will contain the browser name that you want.
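If the names are not limited to the two browsers, a more general variant is possible. The following is a sketch under that assumption, not the answer's own pattern: it anchors the match and uses a lazy middle group in place of the fixed alternation. The function name splitAffixes is hypothetical.

```php
<?php
// Hypothetical helper: return [prefix, name, suffix] for the fixed
// prefix "download-" and suffix "-free"; missing parts come back as ''.
function splitAffixes(string $input): array
{
    // The lazy (.+?) stops as early as possible, leaving "-free" to group 3.
    if (!preg_match('/^(download-)?(.+?)(-free)?$/', $input, $m)) {
        return ['', $input, ''];
    }
    // A trailing group that did not match is absent from $m, hence ?? ''.
    return [$m[1], $m[2], $m[3] ?? ''];
}

print_r(splitAffixes('download-google-chrome-free')); // download-, google-chrome, -free
print_r(splitAffixes('mozilla-firefox'));             // '', mozilla-firefox, ''
```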

How do i break string into words at the position of number

I have some strings with alphanumeric values, like us01name, phc01name and others, i.e. letters + number + letters.
I would like to get the first letters + number in one string and the remainder in a second.
How can I do it in PHP?

You can use a regular expression:
// if statement checks there's at least one match
// [A-Za-z] is used because the range A-z also covers [ \ ] ^ _ and `
if(preg_match('/([A-Za-z]+[0-9]+)([A-Za-z]+)/', $string, $matches) > 0){
    $firstbit = $matches[1];
    $nextbit = $matches[2];
}
Just to break the regular expression down into parts so you know what each bit does:
( Begin group 1
[A-Za-z]+ As many letters as there are (upper or lower case)
[0-9]+ As many digits as there are
) End group 1
( Begin group 2
[A-Za-z]+ As many letters as there are (upper or lower case)
) End group 2

Try this code:
preg_match('~([^\d]+\d+)(.*)~', "us01name", $m);
var_dump($m[1]); // 1st string + number
var_dump($m[2]); // 2nd string
OUTPUT
string(4) "us01"
string(4) "name"
Even this more restrictive regex will also work for you:
preg_match('~([A-Z]+\d+)([A-Z]+)~i', "us01name", $m);

You could use preg_split on the digits with the delimiter-capture flag. It returns all the pieces, so you'd have to put them back together. However, in my opinion it is more intuitive and flexible than a complete pattern regex. Plus, preg_split() is underused :)
Code:
$str = 'user01jason';
$pieces = preg_split('/(\d+)/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);
print_r($pieces);
Output:
Array
(
[0] => user
[1] => 01
[2] => jason
)
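Putting the pieces back together could look like the sketch below. The function name splitAtNumber is hypothetical, and treating everything after the first run of digits as the second string is an assumption about what the question wants.

```php
<?php
// Hypothetical helper: return [letters + first number, remainder].
function splitAtNumber(string $str): array
{
    $pieces = preg_split('/(\d+)/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);
    if (count($pieces) < 3) {
        return [$str, '']; // no digits found, nothing to split on
    }
    // Piece 0 is the leading text, piece 1 the first number; the rest is rejoined.
    $first = $pieces[0] . $pieces[1];
    $rest = implode('', array_slice($pieces, 2));
    return [$first, $rest];
}

print_r(splitAtNumber('user01jason')); // user01, jason
print_r(splitAtNumber('phc01name2x')); // phc01, name2x
```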
