Regex negative prefix and suffix - php

I have these strings for example:
download-google-chrome-free
download-mozilla-firefox-free
And also these strings:
google-chrome
mozilla-firefox
My prefix is download- and my suffix is -free. I need a regular expression that captures the words google-chrome. If the input string does not have the download- prefix and the -free suffix, the captured group should be the string itself.
String 1 = download-google-chrome-free (with prefix and suffix)
String 1 Group 1 = download-
String 1 Group 2 = google-chrome
String 1 Group 3 = -free
String 2 = google-chrome (without prefix and suffix)
String 2 Group 1 = '' (empty)
String 2 Group 2 = google-chrome
String 2 Group 3 = '' (empty)
Can this be done? I am using PHP with preg_match.

preg_match('/(download-)?(google-chrome|mozilla-firefox)(-free)?/', $string, $match);
The ? indicates that the group before it is optional. If the prefix isn't in the string, its capture group will be an empty string in $match; if the suffix is missing, that trailing group is left out of $match altogether.
If you don't actually want to return the groups with the optional prefix and suffix, make them non-capturing groups by putting ?: at the beginning of the group:
preg_match('/(?:download-)?(google-chrome|mozilla-firefox)(?:-free)?/', $string, $match);
Now $match[1] will contain the browser name that you want.
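If the middle part is not limited to a fixed list of browser names, the same idea can be generalized. The following is only a sketch: the helper name extractName is hypothetical, and it assumes the whole input string is the value to parse, so the pattern is anchored with ^ and $ and the middle group is made lazy so that it does not swallow the optional suffix.

```php
<?php
// Hypothetical helper (not part of the original answer): strips an optional
// "download-" prefix and an optional "-free" suffix and returns the rest.
function extractName(string $input): ?string
{
    // (.+?) is lazy, so the optional (?:-free)? gets a chance to match
    // before the end-of-string anchor.
    if (preg_match('/^(?:download-)?(.+?)(?:-free)?$/', $input, $m)) {
        return $m[1];
    }
    return null; // empty input or no match
}

echo extractName('download-google-chrome-free'), "\n";
echo extractName('mozilla-firefox'), "\n";
```

Both calls return the bare name, whether or not the prefix and suffix are present.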

Related

Regular expression for dotted string with exception

I'm not sure that is it possible to do in one regex, but it doesn't hurt to ask.
I created the expression:
/(?<variable>\w+)((\.(?<method>\w+)\((?<parameter>[^{}%]*)\))|(\.(?<subvariable>\w+)))?/i
which helps me to "convert" dotted strings to arrays or call to methods:
core.settings => $core['settings']
core.set(param1, param2) => $core->set('param1', 'param2')
It works very well, but I have no idea how to build a multi-level expression that works like this:
string: core.settings
group <variable> = core
group <subvariable> = settings
string: core.get(param)
group <variable> = core
group <method> = get
group <parameter> = param
string: core.settings.time
group <variable> = core
group <subvariable> = settings.time
string: core.settings.time.set(param)
group <variable> = core
group <subvariable> = settings.time
group <method> = set
group <parameter> = param
Any ideas? Is this possible at all?
You can use
^(?<variable>\w+)(?:\.(?<subvariable>\w+(?:\.\w+)*))??(?:\.(?<method>\w+)\((?<parameter>[^{}%]*)\))?$
See the regex demo.
Details:
^ - start of string
(?<variable>\w+) - Group "variable": one or more word chars
(?:\.(?<subvariable>\w+(?:\.\w+)*))?? - zero or one occurrence (lazy, so the engine first tries to skip it) of . and then Group "subvariable" matching one or more word chars followed by zero or more occurrences of a . and one or more word chars
(?:\.(?<method>\w+)\((?<parameter>[^{}%]*)\))? - an optional sequence of
    \. - a dot
    (?<method>\w+) - Group "method": one or more word chars
    \( - a ( char
    (?<parameter>[^{}%]*) - Group "parameter": zero or more chars other than {, }, %
    \) - a ) char
$ - end of string.
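To see how the groups fill in for the four sample strings, here is a small sketch. The helper name parseDotted is hypothetical, and dropping empty groups is an assumption made for readability (it also hides an empty parameter list such as core.get()).

```php
<?php
$re = '/^(?<variable>\w+)(?:\.(?<subvariable>\w+(?:\.\w+)*))??(?:\.(?<method>\w+)\((?<parameter>[^{}%]*)\))?$/';

// Hypothetical helper: returns only the named groups that matched non-empty text.
function parseDotted(string $re, string $input): array
{
    if (!preg_match($re, $input, $m)) {
        return [];
    }
    // preg_match returns both numeric and named keys; keep the named ones.
    return array_filter(
        $m,
        function ($value, $key) {
            return is_string($key) && $value !== '';
        },
        ARRAY_FILTER_USE_BOTH
    );
}

foreach (['core.settings', 'core.get(param)', 'core.settings.time', 'core.settings.time.set(param)'] as $s) {
    echo $s, ' => ', json_encode(parseDotted($re, $s)), "\n";
}
```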

How to split a string into two parts then join them in reverse order as a new string?

This is an example:
$str="this is string 1 / 4w";
$str=preg_replace(?); var_dump($str);
I want to capture 1 / 4w in this string and move this portion to the beginning of the string.
Result: 1/4W this is string
Just give me the variable that contains the capture.
The last portion 1 / 4W may be different.
e.g. 1 / 4w can be 1/ 16W , 1 /2W , 1W , or 2w
The character W may be an upper case or a lower case.
Use a capture group if you want to capture a substring:
$str = "this is string 1 / 4w"; // "1 / 4w" can be 1/ 16W, 1 /2W, 1W, 2w
$str = preg_replace('~^(.*?)(\d+(?:\s*/\s*\d+)?w)~i', "$2 $1", $str);
var_dump($str);
Without seeing more varied sample inputs, I assume there are no digits in the first substring. For this reason, I use a negated character class to capture the first substring, leave out the delimiting space, and then capture the rest of the string as the second substring. This makes my pattern very efficient (6x faster than Toto's, and with no lingering white-space characters).
Pattern Demo
Code:
$str="this is string 1 / 4w";
$str=preg_replace('/([^\d]+) (.*)/',"$2 $1",$str);
var_export($str);
Output:
'1 / 4w this is string'
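As a rough check against the variants listed in the question, the first pattern can be combined with an end anchor and a \s* between the two groups so that no trailing space is left behind. This is only a sketch: the helper name moveTokenToFront and the sample prefix "resistor" are made up for illustration, and it assumes the token always sits at the very end of the string.

```php
<?php
// Hypothetical helper: moves a trailing token such as "1 / 4w" to the front.
function moveTokenToFront(string $str): string
{
    // Group 1: leading text (lazy). \s* eats the separating whitespace.
    // Group 2: digits, an optional "/ digits" part, then w or W (flag i).
    return preg_replace('~^(.*?)\s*(\d+(?:\s*/\s*\d+)?w)$~i', '$2 $1', $str);
}

foreach (['this is string 1 / 4w', 'resistor 1/ 16W', 'resistor 1 /2W', 'resistor 2w'] as $s) {
    var_dump(moveTokenToFront($s));
}
```

Strings that do not end in such a token are returned unchanged, because preg_replace leaves the subject as-is when nothing matches.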

don't match string in brackets php regex

I've been trying to use preg_replace() in PHP to replace a string. I want to match and replace every 's' in this string, but I only came up with a solution that matches 's' between 'b' and 'c' or 's' between > and <. Is there any way to use a negative lookbehind not just for the character '>' but for a whole string? I don't want to replace anything inside the brackets.
<text size:3>s<text size:3>absc
<text size:3>xxetxx<text size:3>sometehing
edit:
I only want to get the 's' in >s< and in bsc. Then, when I change the search string, for example from 's' to 'te', it should replace 'te' in xtex and sometehing. So I am looking for a regular expression that avoids replacing anything inside <....>
You can use this pattern:
$pattern = '/((<[^>]*>)*)([^s]*)s/';
$replace = '\1\3■'; # ■ = your replacement string
$result = preg_replace( $pattern, $replace, $str );
regex101 demo
Pattern explanation:
( # group 1:
(<[^>]*>)* # group 2: zero-or-more <...>
)
([^s]*) # group 3: zero-or-more not “s”
s # literally “s”
If you want to match case-insensitively, add an “i” at the end of the pattern:
$pattern = '/((<[^>]*>)*)([^s]*)s/i';
Edit: Replacement explanation
In the search pattern we have 3 groups surrounded by round brackets. In the replacement string we can refer to a group with the syntax \1, where 1 is the group number.
So the replacement string in the example means: replace group 1 with itself, replace group 3 with itself, and replace “s” with the desired replacement. We don't need to use group 2 because it is included in group 1 (a repeated group only keeps its last iteration, so the repetitions cannot be retrieved individually).
In the demo string:
abs<text size:3>ssss<text size:3><img src="img"><text size:3>absc
└┘╵└───────────┘╵╵╵╵└───────────────────────────────────────┘└┘╵╵
└─┘└────────────┘╵╵╵└──────────────────────────────────────────┘
1 2 345 6
Pattern matches:
       group 1    group 3        s
      ---------  ---------  ---------
1 >       0          1          1
2 >       1          0          1
3 >       0          0          1
4 >       0          0          1
5 >       0          0          1
6 >       3          1          1
The last “c” is not matched, so it is not replaced.
Use preg_match_all to find every s, and pass the PREG_OFFSET_CAPTURE flag to get their offsets.
The regular expression $pat contains a negative lookbehind and a negative lookahead so that the s inside the bracketed expression is not matched.
In this example I replace s with the string 5. Change it to the character you want to substitute; assigning through a string offset writes only a single byte, so this approach works for one-character replacements only:
<?php
$s = " <text size:3>s<text size:3>absc";
$pat = "/(?<!\<text )s(?!ize:3\>)/";
preg_match_all($pat, $s, $matches, PREG_OFFSET_CAPTURE);
foreach ($matches[0] as $match) {
    $s[$match[1]] = "5";
}
print_r(htmlspecialchars($s));
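A different technique, not used in either answer above, is to let the pattern consume each <...> tag and then discard it with the PCRE backtracking verbs (*SKIP)(*F), so that only text outside the tags can match. The sketch below assumes tags never contain a nested > character; the helper name replaceOutsideTags is hypothetical.

```php
<?php
// Hypothetical helper: replaces $search everywhere except inside <...> tags.
function replaceOutsideTags(string $search, string $replacement, string $subject): string
{
    // First alternative: match a whole tag, then force a failure with (*F).
    // (*SKIP) makes the engine resume after the tag instead of retrying inside it.
    // Second alternative: the literal search string, escaped for use in a regex.
    $pattern = '/<[^>]*>(*SKIP)(*F)|' . preg_quote($search, '/') . '/';
    return preg_replace($pattern, $replacement, $subject);
}

echo replaceOutsideTags('s', 'te', '<text size:3>s<text size:3>absc'), "\n";
echo replaceOutsideTags('te', 'X', '<text size:3>xxetxx<text size:3>sometehing'), "\n";
```

Unlike the string-offset approach, the replacement here can be any length, and the "s" in size and the "te" in text are left alone because they sit inside a tag.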

Get all matched groups PREG PHP flavor

Pattern : '/x(?: (\d))+/i'
String : x 1 2 3 4 5
Returned : 1 Match Position[11-13] '5'
I want to catch all the repetitions. Or does it only return one result per group?
I want the following :
Desired Output:
MATCH 1
1. [4-5] `1`
2. [6-7] `2`
3. [8-9] `3`
4. [10-11] `4`
5. [12-13] `5`
Which I was able to achieve just by copy pasting the group, but this is not what I want. I want a dynamic group capturing
Pattern: x(?: (\d))(?: (\d))(?: (\d))(?: (\d))(?: (\d))
You cannot use one group to capture multiple texts and then access them with PCRE. Instead, you can either match the whole substring with \d+(?:\s+\d+)* and then split with space:
$re2 = '~\d+(?:\s+\d+)*~';
if (preg_match($re2, $str, $match2)) {
    print_r(preg_split("/\\s+/", $match2[0]));
}
Alternatively, use a \G based regex to return multiple matches:
(?:x|(?!^)\G)\s*\K\d+
See demo
Here is a PHP demo:
$str = "x 1 2 3 4 5";
$re1 = '~(?:x|(?!^)\G)\s*\K\d+~';
preg_match_all($re1, $str, $matches);
var_dump($matches);
Here, (?:x|(?!^)\G) is acting as a leading boundary (match the whitespaces and digits only after x or each successful match). When the digits are encountered, all the characters matched so far are omitted with the \K operator.
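Since the question also asks for the position of each digit, the same \G pattern can be combined with the PREG_OFFSET_CAPTURE flag. This is a sketch only; the offsets it prints are zero-based byte offsets, which may be numbered differently from the positions shown in the question.

```php
<?php
$str = "x 1 2 3 4 5";
$re  = '~(?:x|(?!^)\G)\s*\K\d+~';

// With PREG_OFFSET_CAPTURE every match becomes a [text, byteOffset] pair.
preg_match_all($re, $str, $matches, PREG_OFFSET_CAPTURE);

foreach ($matches[0] as $i => [$digit, $offset]) {
    // Print "n. [start-end] `digit`" in the style of the desired output.
    printf("%d. [%d-%d] `%s`\n", $i + 1, $offset, $offset + strlen($digit), $digit);
}
```

Because \K resets the start of the reported match, each offset points at the digit itself rather than at the preceding x or whitespace.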

Is it possible to match all attributes in a preg_match with empty or missing attributes?

I'm having a bit of an issue with preg_match.
I have a string that can come with attributes in any order (eg. [foobar a="b" c="d" f="g"] or [foobar c="d" a="b" f="g"] or [foobar f="g" a="b" c="d"] etc.)
These are the patterns I have tried:
// Matches when all searched for attributes are present
// doesn't match if one of them is missing
// http://www.phpliveregex.com/p/dHi
$pattern = '\[foobar\b(?=\s)(?=(?:(?!\]).)*\s\ba=(["|'])((?:(?!\1).)*)\1)(?=(?:(?!\]).)*\s\bc=(["'])((?:(?!\3).)*)\3)(?:(?!\]).)*]'
// Matches only when attributes are in the right order
// http://www.phpliveregex.com/p/dHj
$pattern = '\[foobar\s+a=["\'](?<a>[^"\']*)["\']\s+c=["\'](?<c>[^"\']*).*?\]'
I'm trying to figure it out, but can't seem to get it right.
Is there a way to match all the attributes, even when other ones are missing or empty (a='')?
I've even toyed with using explode on the spaces between the attributes and then str_replace, but that seemed like overkill and not the right way to go about this.
In the links I've only matched for a="b" and c="d" but I also want to match these cases even if there is an e="f" or a z="x"
If you have the [...] strings as separate strings, not inside larger text, it is easy to use a \G based regex to mark a starting boundary ([some_text) and then match any key-value pair with some basic regex subpatterns using negated character classes.
Here is the regex:
(?:\[foobar\b|(?!^)\G)\s+\K(?<key>[^=]+)="(?<val>[^"]*)"(?=\s+[^=]+="|])
Here is what it matches in human words:
(?:\[foobar\b|(?!^)\G) - a leading boundary, the regex engine should find it first before proceeding, and it matches literal [foobar or the end of the previous successful match (\G matches the string start or position right after the last successful match, and since we need the latter only, the negative lookahead (?!^) excludes the beginning of the string)
\s+ - 1 or more whitespaces (they are necessary to delimit tag name with attribute values)
\K - regex operator that forces the regex engine to omit all the matched characters grabbed so far. A cool alternative to a positive lookbehind in PCRE.
(?<key>[^=]+) - Named capture group "key" matching 1 or more characters other than a =.
=" - matches a literal =" sequence
(?<val>[^"]*) - Named capture group "val" matching 0 or more characters (due to * quantifier) other than a "
" - a literal " that is a closing delimiter for a value substring.
(?=\s+[^=]+="|]) - a positive lookahead making sure there is a next attribute or the end of the [tag xx="yy"...] entity.
PHP code:
$re = '/(?:\[foobar\b|(?!^)\G)\s+\K(?<key>[^=]+)="(?<val>[^"]*)"(?=\s+[^=]+="|])/';
$str = "[foobar a=\"b\" c=\"d\" f=\"g\"]";
preg_match_all($re, $str, $matches);
print_r(array_combine($matches["key"], $matches["val"]));
Output: [a] => b, [c] => d, [f] => g.
You could use the following function:
function toAssociativeArray($str) {
    // Single key/value pair extraction pattern:
    $pattern = '(\w+)\s*=\s*"([^"]*)"';
    $res = array();
    // Valid string?
    if (preg_match("/\[foobar((\s+$pattern)*)\]/", $str, $matches)) {
        // Yes, extract key/value pairs:
        preg_match_all("/$pattern/", $matches[1], $matches);
        for ($i = 0; $i < count($matches[1]); $i += 1) {
            $res[$matches[1][$i]] = $matches[2][$i];
        }
    }
    return $res;
}
This is how you could use it:
// Some test data:
$testData = array(
    '[foobar a="b" c="d" f="g"]',
    '[foobar a="b" f="g" a="d"]',
    '[foobar f="g" a="b" c="d"]',
    '[foobar f="g" a="b"]',
    '[foobar f="g" c="d" f="x"]'
);
// Properties I am interested in, with a default value:
$base = array("a" => "null", "c" => "nothing", "f" => "");
// Loop through the test data:
foreach ($testData as $str) {
    // get the key/value pairs and merge with defaults:
    $res = array_merge($base, toAssociativeArray($str));
    // print value of the "a" property
    echo "value of a is {$res['a']} <br>";
}
This script outputs:
value of a is b
value of a is d
value of a is b
value of a is b
value of a is null
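The question also mentions empty values such as a='' and single-quoted attributes, which the patterns above (double quotes only) do not cover. A possible variation is sketched below. The helper name parseAttributes is hypothetical, and it assumes attribute values never contain a ] character or their own quote character.

```php
<?php
// Hypothetical helper: collects key/value pairs from a [foobar ...] tag,
// accepting single or double quotes and empty values.
function parseAttributes(string $tag): array
{
    $attrs = [];
    // Only look inside a well-formed [foobar ...] tag.
    if (!preg_match('/^\[foobar\b([^\]]*)\]$/', $tag, $m)) {
        return $attrs;
    }
    // Group 2 remembers the opening quote; \2 requires the same closing quote.
    preg_match_all('/(\w+)\s*=\s*(["\'])(.*?)\2/', $m[1], $pairs, PREG_SET_ORDER);
    foreach ($pairs as $pair) {
        $attrs[$pair[1]] = $pair[3]; // a later duplicate key overwrites an earlier one
    }
    return $attrs;
}

print_r(parseAttributes("[foobar a='' c=\"d\" z=\"x\"]"));
```

As with toAssociativeArray above, the result can be merged with an array of defaults via array_merge to fill in attributes that are missing.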
