I am trying to write a regex for preg_match_all in PHP to match a string between two $ symbols, like $abc$, but only if it contains no space. For example, I don't need to match $ab c$.
I wrote this regex /[\$]\S(.*)[\$]/U and some variations but can't get it to work.
Thanks for your help guys.
Overview
Your regex: [\$]\S(.*)[\$]
[\$] - There is no need to escape $ inside [], where it is already a literal character, and no need to wrap \$ in [], since \$ is already the escaped form. Use one or the other: [$] or \$.
\S(.*) - Matches one non-whitespace character, followed by any character (except \n) any number of times, so spaces are still allowed after the first character.
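To make the point above concrete, here is a small sketch that runs the pattern from the question unchanged. The two-line sample input is an assumption based on the examples in the question.

```php
<?php
// The pattern from the question, unchanged. The sample input is assumed.
$re  = '/[\$]\S(.*)[\$]/U';
$str = '$abc$' . "\n" . '$ab c$';

preg_match_all($re, $str, $matches);

print_r($matches[0]); // full matches: the string with a space is matched too
print_r($matches[1]); // group 1 starts at the second character, because \S consumed the first
```

Only the first character after the opening $ is checked for whitespace, which is why $ab c$ still gets through.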
Code
\$\S+\$
\$ Match $ literally
\S+ Match any non-whitespace character one or more times
\$ Match $ literally
Usage
$re = '/\$\S+\$/';
$str = '$abc$
$ab c$';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
var_dump($matches);
I think this will suit your needs.
https://regex101.com/r/WgUwh9/1
\$([a-zA-Z]*)\$
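It will match ASCII letters (a-z, A-Z) of any length, with no spaces, between two $. Note that * also allows zero letters, so $$ matches as well; use + to require at least one letter.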
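The two suggested patterns do not accept the same inputs. The sketch below compares them on a made-up sample string (the input is an assumption, not taken from the question).

```php
<?php
// Hypothetical sample: letters, letters plus a digit, an empty pair, and a pair with a space.
$str = 'x $abc$ y $a1$ z $$ w $ab c$';

// \S+ accepts any run of non-whitespace characters, digits included.
preg_match_all('/\$\S+\$/', $str, $anyNonSpace);

// [a-zA-Z]* accepts ASCII letters only, and also the empty pair "$$".
preg_match_all('/\$([a-zA-Z]*)\$/', $str, $lettersOnly);

print_r($anyNonSpace[0]);
print_r($lettersOnly[0]);
```

Which one fits depends on whether digits, punctuation, or empty pairs should count as valid content.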
Related
I have a URL:
https:\u002F\u002Fsite.vid.com\u002F93836af7-f465-4d2c-9feb-9d8128827d85\u002F6njx6dp3gi.m3u8?token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJjb3VudHJ5IjoiSU4iLCJkZXZpY2VfaWQiOiI1NjYxZTY3Zi0yYWE3LTQ1MjUtOGYwYy01ODkwNGQyMjc3ZmYiLCJleHAiOjE2MTA3MjgzNjEsInBsYXRmb3JtIjoiV0VCIiwidXNlcl9pZCI6MH0.c3Xhi58DnxBhy-_I5yC2XMGSWU3UUkz5YgeVL1buHYc","
And I want to match it using preg_match_all. My regex expression is:
preg_match_all('/(https:\/\/site\.vid\.com\/.*\",")/', $input_lines, $output_array);
But I am not able to match the special characters \ and u002F with the code above. I tried escaping them, but it still does not match. I know it may be a basic question, but if anyone could help me match or escape \ and u002F in preg_match_all, that would be helpful.
Question Edit:
I want to use only preg_match_all because I am trying to extract above URL from a html page.
You may use
preg_match_all('~https:(?://|(?:\\\\u002F){2})site\.vid\.com(?:/|\\\\u002F)[^"]*~', $string)
See the regex demo. Details:
https: - a literal string (if s is optional, use https?:)
(?://|(?:\\u002F){2}) - a non-capturing group matching either // or (|) two occurrences of \u002F
site\.vid\.com - a literal site.vid.com string (the dot is a metacharacter that matches any char but line break chars, so it must be escaped)
(?:/|\\u002F) - a non-capturing group matching / or \u002F text
[^"]* - a negated character class matching zero or more chars other than ".
See the PHP demo:
$re = '~https:(?://|(?:\\\\u002F){2})site\.vid\.com(?:/|\\\\u002F)[^"]*~';
$str = 'https:\\u002F\\u002Fsite.vid.com\\u002F93836af7-f465-4d2c-9feb-9d8128827d85\\u002F6njx6dp3gi.m3u8?token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJjb3VudHJ5IjoiSU4iLCJkZXZpY2VfaWQiOiI1NjYxZTY3Zi0yYWE3LTQ1MjUtOGYwYy01ODkwNGQyMjc3ZmYiLCJleHAiOjE2MTA3MjgzNjEsInBsYXRmb3JtIjoiV0VCIiwidXNlcl9pZCI6MH0.c3Xhi58DnxBhy-_I5yC2XMGSWU3UUkz5YgeVL1buHYc","';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
print_r($matches[0]);
// => Array( [0] => https:\u002F\u002Fsite.vid.com\u002F93836af7-f465-4d2c-9feb-9d8128827d85\u002F6njx6dp3gi.m3u8?token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJjb3VudHJ5IjoiSU4iLCJkZXZpY2VfaWQiOiI1NjYxZTY3Zi0yYWE3LTQ1MjUtOGYwYy01ODkwNGQyMjc3ZmYiLCJleHAiOjE2MTA3MjgzNjEsInBsYXRmb3JtIjoiV0VCIiwidXNlcl9pZCI6MH0.c3Xhi58DnxBhy-_I5yC2XMGSWU3UUkz5YgeVL1buHYc )
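The four backslashes in the pattern are easy to misread, so here is a minimal sketch of the two escaping layers involved: the PHP string literal first, then the regex engine. The shortened URL is an assumed sample.

```php
<?php
// Layer 1: in a single-quoted PHP string, '\\\\' is two backslash characters.
// Layer 2: PCRE reads those two characters as one escaped, literal backslash.
$re  = '~\\\\u002F~';
$str = 'https:\u002F\u002Fsite.vid.com'; // \u is not an escape in single quotes

echo strlen('\\\\'), "\n";            // length of the PHP string holding the backslashes
echo preg_match_all($re, $str), "\n"; // number of \u002F occurrences found
```

Writing only '\\u002F' would hand PCRE the two characters \u, which it does not treat as a literal backslash followed by u.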
I am trying to extract file paths, without the drive letter, from inside text. For example, from AvastSecureBrowserElevationService; C:\Program Files (x86)\AVAST Software\Browser\Application\elevation_service.exe [X] I want to extract :\Program Files (x86)\AVAST Software\Browser\Application\elevation_service.exe.
My current regex looks like this, but it stops at any space, and file names can contain spaces.
(?<=:\\)([^ ]*)
The solution I came up with is to match up to the first space character after a dot, because there is very little chance of a directory name having a space after a dot, and I will always do a quick manual check. But I do not know how to write this as a regex.
You may use this regex for this purpose:
(?<=[a-zA-Z]):[^.]+\.\S+
RegEx Details:
(?<=[a-zA-Z]): Lookbehind to assert we have an English letter before :
:: Match literal :
[^.]+: Match 1+ non-dot characters
\.: Match literal .
\S+: Match 1+ non-whitespace characters
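As a rough usage sketch, the lookbehind pattern can be applied from PHP like this. The input is the sample line from the question; the variable names are only illustrative.

```php
<?php
$re  = '/(?<=[a-zA-Z]):[^.]+\.\S+/';
$str = 'AvastSecureBrowserElevationService; C:\Program Files (x86)\AVAST Software\Browser\Application\elevation_service.exe [X]';

if (preg_match($re, $str, $m)) {
    // The drive letter is only asserted by the lookbehind, so it is not part of the match.
    echo $m[0], "\n";
}
```

Because [^.]+ stops at the first dot, a directory name that contains a dot would end the match early, so the manual check mentioned in the question is still worthwhile.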
Alternatively, we can match the entire string, capture the part we wish to output, and use preg_replace to keep only the captured group:
.+C(:\\.+\..+?)\s.+
Test
// A literal backslash in the regex must be written as \\\\ in a single-quoted PHP string.
$re = '/.+C(:\\\\.+\..+?)\s.+/m';
$str = 'AvastSecureBrowserElevationService; C:\\Program Files (x86)\\AVAST Software\\Browser\\Application\\elevation_service.exe [X]';
$subst = '$1';
$result = preg_replace($re, $subst, $str);
echo $result;
You can use the following regex:
[A-Z]\K:.+\.\w+
It will match any capital letter followed by :, then any string of characters ending with ., followed by at least one word character.
\K removes from the match what comes before it.
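One thing to keep in mind with this pattern is that .+ is greedy, so it runs to the last dot on the line. The sketch below uses a made-up input with a version number after the path to show the effect, next to a lazy variant that is my own assumption and not part of the answer.

```php
<?php
// Hypothetical input: a path followed by extra text that also contains a dot.
$str = 'Svc; C:\Dir\app.exe [X] v1.2';

// Greedy: .+ extends to the last "." that is followed by word characters.
preg_match('/[A-Z]\K:.+\.\w+/', $str, $greedy);

// Lazy variant (assumption): stops at the first such ".".
preg_match('/[A-Z]\K:.+?\.\w+/', $str, $lazy);

echo $greedy[0], "\n";
echo $lazy[0], "\n";
```

The lazy form has the opposite weakness: it would cut the path short at a directory name that contains a dot, so neither variant removes the need for a manual check.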
I have 2 texts in a string:
%Juan%
%Juan Gonzalez%
And I only want to get %Juan% and not the one with the space. I have been trying several regexes without luck. I currently use:
/%(.*)%/U
but it matches both. I tried adding and playing with [^\s], but it doesn't work.
Any help please?
The issue is that . matches any character but a newline, spaces included. The /U ungreedy modifier only makes .* lazy, so it captures the text from a % up to the nearest % to its right, whatever that text contains.
If your strings contain one pair of %...%, you may use
/%(\S+)%/
See the regex demo
The \S+ pattern matches 1+ characters other than whitespace. Note that \S also matches %, so adjacent pairs can be merged into a single match.
If you have multiple %...% pairs, you may use
/%([^\h%]+)%/
See another regex demo. Here, [^\h%] is a negated character class that matches any character other than a horizontal whitespace (\h) and the % symbol.
PHP demo:
$re = '/%([^\h%]+)%/';
$str = "%Juan%\n%Juan Gonzalez%";
preg_match_all($re, $str, $matches);
print_r($matches[1]);
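The difference between the two patterns only shows up when pairs sit next to each other. A small sketch, using an assumed input with two adjacent pairs:

```php
<?php
// Hypothetical input: two adjacent pairs, then a pair containing a space.
$str = '%Juan%%Pedro% %Juan Gonzalez%';

// \S+ also matches "%", so the two adjacent pairs are merged into one match.
preg_match_all('/%(\S+)%/', $str, $nonSpace);

// [^\h%]+ cannot cross a "%", so each pair is matched separately.
preg_match_all('/%([^\h%]+)%/', $str, $noPercent);

print_r($nonSpace[1]);
print_r($noPercent[1]);
```

In both cases %Juan Gonzalez% is skipped, since neither class can match the space.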
In PHP I am trying to complete a simple task of pulling some information from a string using preg_match_all
I have a string like this for example 0(a)1(b)2(c)3(d)4(e)5(f)
and I am trying to return all the contents inside of each () BUT having respect for the fact that escaped parenthesis might exist inside of these.
I have tried multiple combinations but I just can't get any regular expression to allow for something like this 4(here are some escaped parens\(\) more text) to return this here are some escaped parens\(\) more text rather than this here are some escaped parens\(\)
I have a regular expression that works, but not with escaped parenthesis
[0-9]*\(([^ESCAPED PARENTHESIS])*?\)
Can someone give me an idea on how to accomplish this?
You can use a negative lookbehind so that the regex engine only matches a closing parenthesis that is not preceded by a backslash:
\((.+?)(?<!\\)\)
See Demo https://regex101.com/r/oU9sF2/1
You can use this regex to match your text:
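// A literal backslash in the regex must be written as \\\\ in a single-quoted PHP string;
// with only \\ the lookbehind would end up escaping its own closing parenthesis.
preg_match_all('/(?<!\\\\)\((.*?)(?<!\\\\)\)/', $str, $matches);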
print_r($matches[1]);
You can use this pattern:
$pattern = <<<'EOD'
~[0-9]+\([^)\\]*+(?s:\\.[^)\\]*)*+\)~
EOD;
The idea is to match all characters that are neither a closing parenthesis nor a backslash. When a backslash is reached, the character after it is matched too, whatever it is, and then matching continues with more characters that are not a closing parenthesis or a backslash. This repeats until an unescaped closing parenthesis is reached.
Note: possessive quantifiers *+ are only here to limit the backtracking when there is no closing parenthesis.
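A possible way to use this pattern is sketched below. The pattern in the answer has no capture group, so the extra pair of parentheses around the content is my own addition, and the input is adapted from the question.

```php
<?php
// Nowdoc: backslashes reach PCRE unchanged, so no double escaping is needed.
// The capture group around the content is an assumption added for this sketch.
$pattern = <<<'EOD'
~[0-9]+\(([^)\\]*+(?s:\\.[^)\\]*)*+)\)~
EOD;

$str = '0(a)1(b)4(here are some escaped parens\(\) more text)5(f)';

preg_match_all($pattern, $str, $matches);
print_r($matches[1]); // contents of each (...) with escaped parentheses kept
```

The nowdoc form avoids the doubled backslashes that a single-quoted string would need for the same pattern.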
Here is a working regex:
[0-9]*\(([^()\\]*(?:\\.[^()\\]*?)*)\)
See regex demo
See IDEONE demo:
$re = '~[0-9]*\(([^()\\\\]*(?:\\\\.[^()\\\\]*?)*)\)~s';
$str = "0(a)1(b)2(c)3(d)4(here are some escaped parens\(\) more text)5(f)";
preg_match_all($re, $str, $matches);
print_r($matches[1]);
Regex breakdown:
[0-9]* - matches 0 or more digits
\( - matches a literal (
([^()\\]*(?:\\.[^()\\]*?)*) - matches and captures
[^()\\]* - 0 or more symbols other than \, ( and )
(?:\\.[^()\\]*?)* - matches 0 or more sequences of...
\\. - a backslash followed by any character (an escaped character), followed by
[^()\\]*? - as few as possible characters other than \, ( and )
\) - matches a literal )
I have a string like this
05/15/2015 09:19 PM pt_Product2017.9.abc.swl.px64_kor_7700
I need to select pt_Product2017.9.abc.swl.px64_kor from that (start with pt_ and end with _kor).
$str = "05/15/2015 09:19 PM pt_Product2017.9.abc.swl.px64_kor_7700";
preg_match('/^pt_*_kor$/',$str, $matches);
But it doesn't work.
You need to remove the anchors, add a \b at the beginning so that pt_ is only matched when it is not preceded by a word character, and use \S with * (\S is a shorthand character class that matches any character but whitespace):
preg_match('/\bpt_\S*_kor/',$str, $matches);
See regex demo
In your regex, ^ and $ force the regex engine to search for pt_ at the beginning and _kor at the end of the string, and _* matches 0 or more underscores. Note that regex patterns are not the same as wildcards.
In case there can be whitespace between pt_ and _kor, use .*:
preg_match('/\bpt_.*_kor/',$str, $matches);
I should also mention greediness: if you have pt_something_kor_more_kor, the .*/\S* will match the whole string, but .*?/\S*? will match just pt_something_kor. Please adjust according to your requirements.
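The greediness point can be checked with a short sketch. The input string is the hypothetical pt_something_kor_more_kor example from the paragraph above, padded with surrounding text.

```php
<?php
$str = 'x pt_something_kor_more_kor y';

// Greedy: \S* runs as far as it can, then backtracks to the last "_kor".
preg_match('/\bpt_\S*_kor/', $str, $greedy);

// Lazy: \S*? stops at the first "_kor".
preg_match('/\bpt_\S*?_kor/', $str, $lazy);

echo $greedy[0], "\n";
echo $lazy[0], "\n";
```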
^ and $ match the start and end of the complete string, not just of the matched part. So simply use (pt_.+_kor) to match everything between pt_ and _kor: preg_match('/(pt_.+_kor)/',$str, $matches);
Here's a demo: https://regex101.com/r/qL4fW9/1
The ^ and $ that you have used in the regular expression mean that the string should start with pt_ AND end with _kor. But it neither starts with pt_ nor ends with _kor (in fact, it ends with _kor_7700).
Try removing the ^ and $, and you'll get the match:
preg_match('/pt_.*_kor/',$str, $matches);