get initialized string regex - php

kNO = "Get this value now if you can";
How do I get "Get this value now if you can" from that string? It looks easy but I don't know where to start.

Start by reading PHP PCRE and see the examples. For your question:
$str = 'kNO = "Get this value now if you can";';
preg_match('/kNO\s+=\s+"([^"]+)"/', $str, $m);
echo $m[1]; // Get this value now if you can
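Explanation:
kNO Matches the literal text "kNO" in the input string
\s+=\s+ Matches an equals sign with one or more whitespace characters on each side
"([^"]+)" Captures one or more characters that are not double quotes, between the surrounding double quotes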
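If the input might not contain the assignment at all, it helps to check the return value of preg_match before reading the capture group. A minimal sketch, assuming a hypothetical helper name (extractQuotedValue) and a slightly looser \s* so that kNO="..." without spaces is accepted as well:

```php
<?php
// Hypothetical helper: returns the quoted value assigned to $name, or null if absent.
function extractQuotedValue(string $name, string $source): ?string
{
    // preg_quote() guards against regex metacharacters in the variable name.
    $pattern = '/' . preg_quote($name, '/') . '\s*=\s*"([^"]+)"/';

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $source, $m) === 1 ? $m[1] : null;
}

$str = 'kNO = "Get this value now if you can";';
var_dump(extractQuotedValue('kNO', $str));   // string: Get this value now if you can
var_dump(extractQuotedValue('kNO', 'x = 1')); // NULL
```

Returning null keeps the "not found" case separate from an empty string.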

Depending on how you're getting that input, you could use parse_ini_file or parse_ini_string. Dead simple.
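A rough sketch of the parse_ini_string route, assuming the input is text with one assignment per line. The trailing semicolon should be read by the INI parser as the start of a comment, but verify that against your real input before relying on it:

```php
<?php
// Assumption: one "name = value" assignment per line, terminated by a newline.
$ini = "kNO = \"Get this value now if you can\";\n";

// parse_ini_string() returns an associative array, or false on a syntax error.
$values = parse_ini_string($ini);

if ($values !== false && isset($values['kNO'])) {
    echo $values['kNO'], "\n";
}
```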

Use a negated character class to capture everything from the opening quote to the next quote:
$str = 'kNO = "Get this value now if you can";';
preg_match('~"([^"]*)"~', $str, $matches);
print_r($matches[1]);
Explanation:
~ // PHP requires explicit regex delimiters
" // match the first literal double quote
( // begin the capturing group; grouping the relevant part keeps the quotes themselves out of the result
[^"] // character class, matches any character that is NOT a double quote
* // match the preceding character class zero or more times (covers the empty-string case)
) // end group
" // closing quote for the string
~ // closing delimiter
EDIT: you may also want to account for escaped quotes. Use the following regex instead:
'~"((?:[^\\\\"]+|\\\\.)*)"~'
This pattern is slightly more difficult to wrap your head around. Essentially it is broken into two possible matches (separated by the regex OR character |):
[^\\\\"]+ //match any character that is NOT a backslash and is NOT a double quote
| //or
\\\\. //match a backslash followed by any character.
The logic is fairly straightforward. The first alternative matches any run of characters that contains neither a double quote nor a backslash. When a quote or a backslash is reached, the regex tries the second alternative. If the character is a backslash, \\\\. matches it together with the character that follows, effectively skipping whatever was escaped. The pattern only stops matching when it reaches a lone, unescaped double quote.
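To see the escaped-quote pattern in action, here is a small sketch. The sample input is made up, and calling stripslashes afterwards is just one possible way to unescape the captured text:

```php
<?php
// Made-up input; in a single-quoted PHP string, \" stays as backslash + quote.
$str = 'kNO = "He said \"now\" and left";';

// [^\\"]+ consumes ordinary characters; \\. consumes a backslash plus the character it escapes.
$pattern = '~"((?:[^\\\\"]+|\\\\.)*)"~';

if (preg_match($pattern, $str, $m) === 1) {
    $raw = $m[1];                 // He said \"now\" and left
    $value = stripslashes($raw);  // He said "now" and left
    echo $value, "\n";
}
```

With the simpler ~"([^"]*)"~ pattern the match would stop at the first escaped quote.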


Ignoring apostrophe while capturing contents in single quotes REGEX

The issue for me here is to capture the content inside single quotes (like 'xyz').
But the apostrophe, which is the same symbol as a single quote ('), is getting in the way!
The regex I've written is: /(\w\'\w)(*SKIP)(*F)|(\'[^\']*\')/
The example I have used is: Hello ma'am 'This is Prashanth's book.'
What needs to be captured is: 'This is Prashanth's book.'
But what's captured is: 'This is Prashanth'!
Any help is greatly appreciated. Thank you!
You can't use [^\'] to capture text that itself contains a ', and in your example This is Prashanth's book. contains a ' within the text. Modify your regex to use .*? instead of [^\']*, and write it as:
(\w'\w)(*SKIP)(*F)|('.*?'\B)
Also, you don't need to escape a single quote ' as that has no special meaning in regex.
From your example, it is not clear whether you want the captured match to include the surrounding ' characters. If you don't want them in the match, you can use a lookaround-based regex:
(?<=\B').*?(?='\B)
Explanation of regex:
(?<=\B') - This positive lookbehind ensures the match is preceded by a single quote that is itself not preceded by a word character (that is what \B checks)
.*? - Captures the text in a non-greedy manner
(?='\B) - Ensures the matched text is followed by a single quote, and \B ensures that quote is not immediately followed by a word character. E.g. it won't treat the ' in 's as the closing quote
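Both variants can be tried from PHP with preg_match. A quick sketch using the example sentence from the question (the variable names are just for illustration):

```php
<?php
$text = "Hello ma'am 'This is Prashanth's book.'";

// Variant that keeps the surrounding quotes in the match.
preg_match('/(\w\'\w)(*SKIP)(*F)|(\'.*?\'\B)/', $text, $withQuotes);

// Lookaround variant that leaves the quotes out of the match.
preg_match('/(?<=\B\').*?(?=\'\B)/', $text, $withoutQuotes);

echo $withQuotes[0], "\n";    // 'This is Prashanth's book.'
echo $withoutQuotes[0], "\n"; // This is Prashanth's book.
```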
For the string you have provided, you can use the regex:
\B'\K(?:(?!'\B).)+
Explanation:
\B - a non-word boundary
' - matches a '
\K - forget everything matched so far
(?:(?!'\B).)+ - matches 1+ characters (any character except newline), stopping before a ' that is followed by a non-word boundary
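In PHP the \K variant could be used like this; because \K drops the opening quote from the reported match, the result is available directly in element 0:

```php
<?php
$text = "Hello ma'am 'This is Prashanth's book.'";

// \B'  : a quote that does not directly follow a word character
// \K   : discard what has been matched so far (the opening quote)
// (?:(?!'\B).)+ : consume characters until a closing quote not followed by a word character
preg_match('/\B\'\K(?:(?!\'\B).)+/', $text, $m);

echo $m[0], "\n"; // This is Prashanth's book.
```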

PCRE regex with lookahead and lookbehind always returns true

I’m trying to create a regex for form validation but it always returns true. The user must be able to add something like {user|2|S} as input but also use brackets if they are escaped with \.
This code checks for the left bracket { for now.
$regex = '/({(?=([a-zA-Z0-9]+\|[0-9]*\|(S|D[0-9]*)}))|[^{]|(?<=\\\){)*/';
if (preg_match($regex, $value)) {
return TRUE;
} else {
return FALSE;
}
A possible correct input would be:
Hello {user|1|S}, you have {amount|2|D2}
or
Hello {user|1|S}, you have {amount|2|D2} in \{the_bracket_bank\}
However, this should return false:
Hello {user|1|S}, you have {amount|2}
and this also:
Hello {user|1|S}, you have {amount|2|D2} in {the_bracket_bank}
A live example can be found here: http://regexr.com?37tpu Note that there is a \ in the lookbehind at the end, PHP was giving me error messages because I had to escape it an extra time in my code.
The main error is that you do not specify that the regex should match from the beginning to the end of the checked string. Use the ^ and $ assertions.
I think you have to escape { and } in your regex as they have special meaning. Together they form a quantifier.
The (?<=\\\) is better written as (?<=\\\\). The backslash has to be escaped twice because it has special meaning both in a single-quoted string and in a PCRE regex. Using \\\ works too: a single-quoted string treats any escape sequence other than \\ and \' as a literal backslash followed by the character, so \) is taken literally. But explicitly escaping the backslash twice seems easier to read to me.
The regex should be
$regex = '/^(\{(?=([a-zA-Z0-9]+\|[0-9]*\|(S|D[0-9]*)\}))|[^{]|(?<=\\\\)\{)*$/';
But notice that the look-around assertions are not necessary. This regex should do the job too:
$regex = '/^([^{]|\\\{|\{[a-zA-Z0-9]+\|[0-9]*\|(S|D[0-9]*)\})*$/';
Any non-{ character is matched by the first alternative. When a { is read, one of the remaining two alternatives is used. Either the pattern for the brace placeholder matches, or the regex engine backtracks one character and tries to match the \{ character sequence. If both fail, it backtracks further until it reaches the start of the string and fails completely.
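A small sketch that runs the lookaround-free pattern against the four inputs from the question. The function name isValidTemplate is only an illustration:

```php
<?php
// Hypothetical helper wrapping the lookaround-free pattern.
function isValidTemplate(string $value): bool
{
    $regex = '/^([^{]|\\\{|\{[a-zA-Z0-9]+\|[0-9]*\|(S|D[0-9]*)\})*$/';

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($regex, $value) === 1;
}

$samples = [
    'Hello {user|1|S}, you have {amount|2|D2}',
    'Hello {user|1|S}, you have {amount|2|D2} in \{the_bracket_bank\}',
    'Hello {user|1|S}, you have {amount|2}',
    'Hello {user|1|S}, you have {amount|2|D2} in {the_bracket_bank}',
];

foreach ($samples as $sample) {
    // The first two print true, the last two print false.
    printf("%s => %s\n", $sample, var_export(isValidTemplate($sample), true));
}
```

Comparing against 1 with === also turns a preg_match error (false) into a failed validation rather than a silent pass.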
Matching without lookbehind
You can write a regex for this without lookbehinds or lookaheads (which is usually recommended).
For example, your requirement is that any character may appear except { and }, unless the brace is preceded by a \. Put another way:
match any character except { and }, OR match \{ or \}. To match any character except { and }, use:
[^{}]
To match a \{ use:
\\\{
One backslash is for escaping the { (which might not be necessary, depending on your regex compiler) and one backslash is for escaping the other backslash.
You would end up with this:
(?:
[^{}]
|
\\\{
|
\\\}
)+
I formatted this regex across several lines so that it's readable. If you want to use it in your code like this, make sure to use the PCRE_EXTENDED (x) modifier.
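As a sketch of how the multi-line form might look in PHP: a nowdoc string keeps the backslashes exactly as typed, and the x modifier makes PCRE ignore the layout whitespace and the # comments. The ^ and $ anchors are an addition here so that the whole input has to match:

```php
<?php
// Nowdoc: no escape processing, so \\\{ reaches PCRE unchanged.
$pattern = <<<'REGEX'
/^(?:
    [^{}]    # any character other than a brace
  | \\\{     # a backslash followed by an opening brace
  | \\\}     # a backslash followed by a closing brace
)+$/x
REGEX;

var_dump(preg_match($pattern, 'plain text \{escaped\}')); // int(1)
var_dump(preg_match($pattern, 'bad {brace}'));            // int(0)
```

Note that comments inside an x-mode pattern must not contain the delimiter character (here /).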
Looks more like a job for a lookbehind to me:
/((?<!\\\\)\{[a-zA-Z0-9]+\|[0-9]+\|[SD][0-9]*\})/
However, the obfuscation factor is so high that I would rather recognize all bracketed strings and parse them later.
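One way the "recognize first, parse later" idea might look. This is only a sketch: the function name findInvalidPlaceholders and the per-field checks are assumptions based on the formats shown in the question:

```php
<?php
// Hypothetical helper: returns every unescaped {...} group that fails the field checks.
function findInvalidPlaceholders(string $value): array
{
    // (?<!\\) : the opening brace must not be preceded by a backslash.
    preg_match_all('/(?<!\\\\)\{([^{}]*)\}/', $value, $matches);

    $invalid = [];
    foreach ($matches[1] as $body) {
        $parts = explode('|', $body);
        $ok = count($parts) === 3
            && preg_match('/^[a-zA-Z0-9]+\z/', $parts[0]) === 1
            && preg_match('/^[0-9]*\z/', $parts[1]) === 1
            && preg_match('/^(S|D[0-9]*)\z/', $parts[2]) === 1;
        if (!$ok) {
            $invalid[] = '{' . $body . '}';
        }
    }
    return $invalid;
}

print_r(findInvalidPlaceholders('Hello {user|1|S}, you have {amount|2}'));
```

Splitting the work this way also makes it easy to tell the user which placeholder is wrong, which a single yes/no regex cannot do.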

Regex to match possibly-escaped quotes

I'm trying to write a regex to match single quotes, which may be escaped. A matched quote should have an even number of backslashes before it (an odd number means that the quote is escaped). For example, in these five strings:
'quotes should be matched'
\'quotes should NOT be matched\'
\\'quotes should be matched\\'
\\\'quotes should NOT be matched\\\'
\\\\'quotes should be matched\\\\'
Here is the regex that I have:
(?<=[^\\](?:\\\\)*)'
However, this does not match anything in the above example. I find this strange because removing the * from the regex matches the quotes with two backslashes, as it should:
(?<=[^\\](?:\\\\))' matches \\'
As far as I can see, it's not possible to match just the '. This is because you can't have variable-length lookbehinds, as Wiseguy pointed out.
The following regex would, however, match the correct ' AND any backslashes leading up to it. Not sure if this will be of any use.
(?<!\\)(?:\\\\)*'
It matches an arbitrary number of double backslashes, not preceded by a backslash and followed by a '.
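If the engine is PCRE (as in PHP), a possible refinement is to put \K just before the quote. \K resets the start of the reported match, so only the ' itself is returned while the backslashes are still checked. A sketch using the five sample lines from the question, written as nowdoc strings so the backslashes stay literal:

```php
<?php
// Nowdoc strings keep backslashes exactly as typed, so the pattern reads like plain PCRE.
$pattern = <<<'REGEX'
/(?<!\\)(?:\\\\)*\K'/
REGEX;

$input = <<<'TEXT'
'quotes should be matched'
\'quotes should NOT be matched\'
\\'quotes should be matched\\'
\\\'quotes should NOT be matched\\\'
\\\\'quotes should be matched\\\\'
TEXT;

foreach (preg_split('/\R/', $input) as $line) {
    // preg_match_all() returns the number of matches found in the line.
    $count = preg_match_all($pattern, $line);
    printf("%d unescaped quote(s): %s\n", $count, $line);
}
```

The counts come out as 2, 0, 2, 0, 2 for the five lines.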

Getting regular expression

How can I extract https://domain.com/gamer?hid=.115f12756a8641 from the string below, i.e. the value of url?
rrth:'http://www.google.co',cctp:'323',url:'https://domain.com/gamer?hid=.115f12756a8641',rrth:'https://another.com'
P.S.: I am new to regular expressions and still learning. But the string above seems to be formatted, so there must be some sort of shortcut.
If your input string is called $str:
preg_match('/url:\'(.*?)\'/', $str, $matches);
$url = $matches[1];
(.*?) captures everything between url:' and ' and can later be retrieved with $matches[1].
The ? is particularly important. It makes the repetition ungreedy, otherwise it would consume everything until the very last '.
If your actual input string contains multiple url:'...' sections, use preg_match_all instead. $matches[1] will then be an array of all required values.
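For the multiple-section case, a sketch with preg_match_all. The second url entry in the sample string is made up for illustration:

```php
<?php
// The second url:'...' entry is invented for this example.
$str = "rrth:'http://www.google.co',url:'https://domain.com/gamer?hid=.115f12756a8641',"
     . "cctp:'323',url:'https://example.org/other'";

// Same pattern as above; preg_match_all() collects every occurrence.
preg_match_all('/url:\'(.*?)\'/', $str, $matches);

// $matches[1] holds the captured URLs in order of appearance.
print_r($matches[1]);
```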
Simple regex:
preg_match('/url\s*\:\s*\'([^\']+)/i',$theString,$match);
echo $match[1];//should be the url
How it works:
/url\s*\:\s*: matches url + [any number of spaces] + : (colon) + [any number of spaces]. But we don't need this part in the result, that's where the second part comes in
\'([^\']+)/i: matches ', then the brackets (()) create a group that will be stored separately in the $match array. What will be matched is [^']+: any character except the apostrophe (the [] create a character class, the ^ means: exclude these chars). So this class will match any character up to the point where it reaches the closing/delimiting apostrophe.
/i: in case the string might contain URL:'http://www.foo.bar', I've added that i, which is the case-insensitive flag.
That's about it. Perhaps you could sniff around here to get a better understanding of regexes
note: I've had to escape the single quotes because the pattern string uses single quotes as delimiters: "/url\s*\:\s*'([^']+)/i" works just as well. If you don't know whether you'll be dealing with single or double quotes, you could replace the quotes with another char class:
preg_match('/url\s*\:\s*[\'"]([^\'"]+)/i',$string,$match);
Obviously, in that scenario, you'll have to escape the delimiters you've used for the pattern string...

Can you use back references in the pattern part of a regular expression?

Is there a way to back reference in the regular expression pattern?
Example input string:
Here is "some quoted" text.
Say I want to pull out the quoted text, I could create the following expression:
"([^"]+)"
This regular expression would match some quoted.
Say I want it to also support single quotes, I could change the expression to:
["']([^"']+)["']
But what if the input string has a mixture of quotes say Here is 'some quoted" text. I would not want the regex to match. Currently the regex in the second example would still match.
What I would like to be able to do is if the first quote is a double quote then the closing quote must be a double. And if the start quote is single quote then the closing quote must be single.
Can I use a back reference to achieve this?
My other related question: Getting text between quotes using regular expression
You can make use of the regex:
(["'])[^"']+\1
() : used for grouping
[..] : is the char class, so ["'] matches either " or ', equivalent to "|'
[^..] : char class with negation. It matches any char not listed after the ^
+ : quantifier for one or more
\1 : backreference to the first group, which is (["'])
In PHP you'd use this as:
preg_match('#(["\'])[^"\']+\1#', $str);
preg_match('/(["\'])([^"\']+)\1/', 'Here is \'quoted text" some quoted text.');
Explanation: /(["'])([^"']+)\1/ I placed the first quote in parentheses. Because this is the first group, its back reference number is 1. Then, where the closing quote would be, I placed \1, which means whichever character was matched in group 1.
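To check the behaviour, here is a short sketch that runs the backreference pattern against a properly quoted string and a mixed-quote string (the sample inputs follow the question; the variable names are illustrative):

```php
<?php
$pattern = '/(["\'])([^"\']+)\1/';

// Matching quotes: group 1 is the quote character, group 2 the quoted text.
$matched = preg_match($pattern, 'Here is "some quoted" text.', $m);
echo $matched, ' ', $m[2], "\n"; // 1 some quoted

// Mixed quotes: \1 requires the same character that opened the string, so nothing matches.
$mixed = preg_match($pattern, 'Here is \'some quoted" text.');
echo $mixed, "\n"; // 0
```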
/"\(.*?\)".*?\1/ should work, but it depends on the regular expression engine
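This is old, but you need to provide the $matches variable: preg_match($pattern, $subject, $matches). The manual's signature shows &$matches because the argument is taken by reference; you don't write the & when calling.
Then you can inspect it with var_dump($matches)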
see https://www.php.net/manual/en/function.preg-match
