How can I check if a string EXACTLY matches a regex pattern? - php

I'm working on a registration script for my client's product sales website.
I'm currently working on a reference ID input area, and I want to make sure that the reference ID is within the correct parameters of the payment method.
The Reference ID will look something like this: XXXXX-XXXXX-XXXXX
I'm trying to use this RegEx pattern to match it: /(\w+){5}-(\w+){5}-(\w+){5}/
This matches it perfectly, but it also matches XXXXX-XXXXX-XXXXXXXXXX
Or at least it finds a match in there. I want it to make sure the entire string matches. I'm not too familiar with RegEx
How can I do this?

You need to use start and end anchors. Also, if you don't need to capture those groups, you can omit the parentheses.
Also, (\w+){5} means "one or more word characters, repeated exactly five times", so each group can match more than five characters. I believe you didn't want that, so I dropped the +.
/^\w{5}-\w{5}-\w{5}\z/
Also, I used \z so your string doesn't match "abcde-12345-edcba\n".
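To see the difference between no anchors, ^...$ and ^...\z, it can help to run all three against the same inputs. This is only a minimal sketch: the sample IDs and the array keys are made up for illustration, and only the XXXXX-XXXXX-XXXXX shape comes from the question.

```php
<?php
// Sample IDs are made up; only the XXXXX-XXXXX-XXXXX shape comes from the question.
$inputs = [
    'valid'            => 'abcde-12345-edcba',
    'too long'         => 'abcde-12345-edcbaXXXXX',
    'trailing newline' => "abcde-12345-edcba\n",
];

$patterns = [
    'unanchored' => '/\w{5}-\w{5}-\w{5}/',
    'dollar'     => '/^\w{5}-\w{5}-\w{5}$/',
    'z'          => '/^\w{5}-\w{5}-\w{5}\z/',
];

$results = [];
foreach ($patterns as $name => $pattern) {
    foreach ($inputs as $label => $input) {
        // preg_match returns 1 on a match and 0 on no match.
        $results[$name][$label] = preg_match($pattern, $input) === 1;
    }
}

foreach ($results as $name => $row) {
    printf("%-10s %s\n", $name, json_encode($row));
}
```

Only the \z version rejects both the over-long ID and the one with a trailing newline. The $ version still accepts the trailing newline unless the D modifier is added.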

Use ^ and $ to match the start and end of the input string, respectively.
Also note that your use of + was a mistake: (\w+){5} means "one or more word characters, five times over", so each group can match five or more characters. You probably meant (\w){5} (or just \w{5} if you don't need the backreference; I'll assume in my example that you do).
/^(\w){5}-(\w){5}-(\w){5}$/

Put the regular expression between ^ and $ so that it has to match the whole string, then check whether it matches. Drop the + as well, otherwise each group can still be longer than five characters.
Example:
/^(\w){5}-(\w){5}-(\w){5}$/

Try
/^([\w]{5,5})-([\w]{5,5})-([\w]{5,5})$/i
There are several online regex testers out there; I try my patterns in one before I code.

Enclose it in "^" and "$", and drop the + so each group is exactly five characters, thus:
/^(\w){5}-(\w){5}-(\w){5}$/

You need ^ to match the start of the string and $ to match the end:
/^\w{5}-\w{5}-\w{5}$/
Note that (\w+){5} is incorrect because it means five repetitions of \w+, and \w+ itself means "one or more word characters", so each group can be longer than five characters.

/^(\w){5}-(\w){5}-(\w){5}$/
You need to explicitly say that you want the pattern to start at the beginning of the string and end at its end.
You can improve it: /^((\w){5}-){2}(\w){5}$/ ; this way, you can easily modify the number of elements your serial number might have.
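If the number of blocks really might change, the repeated-group idea can be wrapped in a small helper. This is just a sketch: the function name referenceIdPattern and its parameters are hypothetical, and it uses a non-capturing group and \z rather than the capturing groups and $ shown above.

```php
<?php
// Hypothetical helper: builds an anchored pattern for $groups blocks of
// $length word characters, separated by hyphens.
function referenceIdPattern(int $groups, int $length): string
{
    if ($groups < 1 || $length < 1) {
        throw new InvalidArgumentException('groups and length must be positive');
    }
    // ($groups - 1) "block plus hyphen" prefixes, followed by one final block.
    return sprintf('/^(?:\w{%d}-){%d}\w{%d}\z/', $length, $groups - 1, $length);
}

$pattern = referenceIdPattern(3, 5);
var_dump(preg_match($pattern, 'abcde-12345-edcba') === 1);
var_dump(preg_match($pattern, 'abcde-12345-edcbaXXXXX') === 1);
```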

Use ^ and $ to mark the start and end of the subject string:
/^\w{5}-\w{5}-\w{5}$/
http://www.regular-expressions.info/anchors.html

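In preg, \b marks word boundaries. So you could try something like the pattern below. Note that it only requires word boundaries around the ID, so it still finds an ID inside a longer string; the + is dropped so each group is exactly five characters.
/\b(\w){5}-(\w){5}-(\w){5}\b/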

Related

Regex - Match characters but don't include within results

I have got the following Regex, which ALMOST works...
(?:^https?:\/\/)(?:www|[a-z]+)\.([^.]+)
I need the result to be the only result, or within the same position in the Array.
So for example this http://m.facebook.com/ matches perfect, there is only 1 group.
However, if I change it to http://facebook.com/ then I get com/ in place of where facebook should be. So I need to have (?:www|[a-z]+) as an optional check really.
Edit:
What I expect is just to match facebook, if ANY of the strings are as follows:
http://www.facebook.com
http://facebook.com
http://m.facebook.com
And obviously the https counterparts.
This is my Regex now
(?:^https?:\/\/)(?:www)?\.?([^.]+)
This is close, however it matches the m when I try http://m.facebook.com
https://regex101.com/r/GDapY5/1
"So I need to have (?:www|[a-z]+) as an optional check really."
A ? at the end of a pattern is generally used for "optional" bits -- it means "match zero or one" of that thing, so your subpattern would be something like this:
(?:www|[a-z]+)?
If you're simply trying to get the second level domain, I wouldn't bother with regex, because you'll be constantly adjusting it to handle special cases you come across. Just split on dots and take the penultimate value:
$domain = array_reverse(explode('.', parse_url($str)['host']))[1];
Or:
$domain = array_reverse(explode('.', parse_url($str, PHP_URL_HOST)))[1];
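The split-on-dots idea could be wrapped up roughly like this. It is a sketch only: the function name secondLevelLabel is made up, and it returns null when there is no host or the host has no dot, a case the one-liners above do not handle.

```php
<?php
// Hypothetical helper: returns the label just before the last one in the
// host name (e.g. "facebook" for m.facebook.com), or null if there is none.
function secondLevelLabel(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return null; // no host component, or the URL could not be parsed
    }
    $labels = explode('.', $host);
    // At least "name.tld" is needed for a penultimate label to exist.
    return count($labels) >= 2 ? $labels[count($labels) - 2] : null;
}

foreach (['http://www.facebook.com', 'http://facebook.com/', 'https://m.facebook.com'] as $url) {
    echo $url, ' => ', secondLevelLabel($url), "\n";
}
```

Like the one-liners, this takes the penultimate label literally, so a host such as example.co.uk would give "co".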
Perhaps you could make the first m. part optional with (?:\w+\.)?.
Instead of a capturing group you could use \K to reset the starting point of the reported match.
Then match one or more word characters \w+ and use a positive lookahead to assert that what follows is a dot (?=\.)
For example:
^https?://(?:www)?(?:\w+\.)?\K\w+(?=\.)
Edit: Or you could match for m. or www. using an alternation:
^https?://(?:m\.|www\.)?\K\w+(?=\.)
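In PHP the \K version could be used roughly as follows. This is a sketch: the ~ delimiter is chosen here so the slashes need no escaping, and the variable names are illustrative. Because \K resets the start of the reported match, the name ends up in $m[0] with no capturing group.

```php
<?php
// ~ is used as the delimiter so that // in the pattern needs no escaping.
$pattern = '~^https?://(?:m\.|www\.)?\K\w+(?=\.)~';

$urls = [
    'http://www.facebook.com',
    'http://facebook.com',
    'http://m.facebook.com',
    'https://www.facebook.com',
];

$names = [];
foreach ($urls as $url) {
    // \K drops everything matched so far, so $m[0] holds only the name.
    $names[$url] = preg_match($pattern, $url, $m) === 1 ? $m[0] : null;
}

print_r($names);
```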

Positive look ahead regex confusing

I'm building this regex with a positive lookahead in it. Basically it must select all text in the line up to the last period that precedes a ":" and add a "|" to the end to delimit it. Some sample text is below. I am testing this in gskinner and EditPad Pro, which apparently has full grep regex support, so if I could get the answers in that form I'd appreciate it.
The regex below works to a degree but I am unsure if it is correct. Also it falls down if the text contains brackets.
Finally, I would like to add another ignore rule like the one that ignores but includes "Co." in the selection. This second rule would ignore but include periods that have a single capital letter before them. Sample text is below too. Thanks for all the help.
^(?:[^|]+\|){3}(.*?)[^(?:Co)]\.(?=[^:]*?\:)
121| Ryan, T.N. |2001. |I like regex. But does it like me (2) 2: 615-631.
122| O' Toole, H.Y. |2004. |(Note on the regex). Pages 90-91 In: Ryan, A. & Toole, B.L. (Editors) Guide to the regex functionality in php. Timmy, Tommy& Stewie, Quohog. * Produced for Family Guy in Quohog.
I don't think I understand what you want to do. But this part [^(?:Co)] is definitely not correct.
With the square brackets you are creating a character class, and because of the ^ it is a negated class. That means that at this position you match exactly one character that is not one of ( ? : C o ), in other words any character other than those in "?)(:Co".
Update:
I don't think it's possible. How should I distinguish between L. Co. or something similar and the end of the sentence?
But I found another error in your regex. The last part (?=[^:]*?\:) should be (?=[^.]*?\:) if you want to match the last dot before the colon; with your expression it matches at the first dot.
This seems to do what you want.
(.*\.)(?=[^:]*?:)
It quite simply matches all text up to the last full stop that occurs before the colon.
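As a rough illustration in PHP (the question was tested in other tools, so using preg_replace here is an assumption), the pattern can be combined with a replacement that appends the "|" delimiter. The input is the first sample line from the question.

```php
<?php
// First sample line from the question.
$line = '121| Ryan, T.N. |2001. |I like regex. But does it like me (2) 2: 615-631.';

// Greedy .* backtracks to the last period that still has a colon after it,
// with no other colon in between; '$1|' re-emits that text plus the delimiter.
$result = preg_replace('/(.*\.)(?=[^:]*?:)/', '$1|', $line, 1);

echo $result, "\n";
// 121| Ryan, T.N. |2001. |I like regex.| But does it like me (2) 2: 615-631.
```

This does nothing about the "Co." or single-capital-letter exceptions from the question; it only shows where the delimiter lands for the plain case.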

PHP string replace question

If I have a string that equals "firstpart".$unknown_var."secondpart", how can I delete everything between "firstpart" and "secondpart" (on a page that does not know the value of $unknown_var)?
Thanks.
Neel
Use substr_replace.
The start and length arguments can be computed with strpos. Or you could go the regex route if you're comfortable learning about regular expressions.
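A minimal sketch of the strpos plus substr_replace route might look like this. The function name stripBetween is hypothetical, and returning the input unchanged when a marker is missing is an assumption about what you would want.

```php
<?php
// Hypothetical helper: removes whatever sits between the first $first and the
// next $second after it. Returns the input unchanged if either is missing.
function stripBetween(string $subject, string $first, string $second): string
{
    $start = strpos($subject, $first);
    if ($start === false) {
        return $subject;
    }
    $start += strlen($first);                  // position just after $first
    $end = strpos($subject, $second, $start);  // next $second after that
    if ($end === false) {
        return $subject;
    }
    // Replace the span between the two markers with an empty string.
    return substr_replace($subject, '', $start, $end - $start);
}

echo stripBetween('firstpart' . 'anything at all' . 'secondpart', 'firstpart', 'secondpart'), "\n";
```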
As long as $unknown_var contains neither firstpart nor secondpart, you can match against
firstpart(.*)secondpart
and replace it with
firstpartsecondpart
You should use a regexp to do so.
preg_replace('/firstpart(.*)secondpart/','firstpartsecondpart',$yourstring);
This replaces everything between the first occurrence of firstpart and the last occurrence of secondpart. If the string contains several firstpart...secondpart pairs and you want to empty each of them separately, make the expression ungreedy by replacing (.*) with (.*?):
preg_replace('/firstpart(.*?)secondpart/','firstpartsecondpart',$yourstring);
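The difference between the greedy and ungreedy versions only shows up when the markers occur more than once. Here is a small sketch with a made-up input string:

```php
<?php
// Made-up input containing two firstpart...secondpart pairs.
$subject = 'firstpart A secondpart B firstpart C secondpart';

// Greedy: (.*) runs from the first "firstpart" to the LAST "secondpart".
$greedy = preg_replace('/firstpart(.*)secondpart/', 'firstpartsecondpart', $subject);

// Ungreedy: (.*?) stops at the nearest "secondpart", so each pair is handled.
$lazy = preg_replace('/firstpart(.*?)secondpart/', 'firstpartsecondpart', $subject);

echo $greedy, "\n"; // firstpartsecondpart
echo $lazy, "\n";   // firstpartsecondpart B firstpartsecondpart
```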

REGEX: Match at the beginning of a string OPTIONALLY

I'm building a regex to match the word combo W7. Not W73 or NW7 or 2W7.
So far I have
^w7{1}\b
which works perfectly. However, I have a problem.
I also need to have //W7 (with 2 forward slashes) also match. So if W7 or //W7 are entered they should match
Any ideas?
Thanks!
Just add an optional // at the start.
^(//)?w7\b
You may need to escape them.
^(\/\/)?w7\b
You could just add an optional group to your regex
^(?://)?W7\b
Remember to use a non-/ delimiter (it's tidier than escaping those slashes).
If you want the subject string to only ever contain //W7 or W7 then an alternative (full pattern) would be:
~^(?://)?W7$~D
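A quick check of that full pattern against the cases from the question could look like the sketch below. The $cases list is illustrative, and the pattern is case-sensitive as written, so add the i modifier if lowercase w7 should match too.

```php
<?php
// ~ delimiters avoid escaping the slashes; D makes $ match only at the very
// end of the string, not before a trailing newline.
$pattern = '~^(?://)?W7$~D';

// Illustrative inputs based on the question.
$cases = ['W7', '//W7', 'W73', 'NW7', '2W7', "W7\n"];

$matched = [];
foreach ($cases as $case) {
    $matched[$case] = preg_match($pattern, $case) === 1;
}

var_export($matched);
```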
What about ^(//)?W7 as the pattern? The question mark after the group indicates zero or one occurrence.

Parse block with php regex

I'm trying to write a (I think) pretty simple RegEx with PHP but it's not working.
Basically I have a block defined like this:
%%%%blockname%%%%
stuff goes here
%%%%/blockname%%%%
I'm not any good at RegEx, but this is what I tried:
preg_match_all('/^%%%%(.*?)%%%%(.*?)%%%%\/(.*?)%%%%$/i',$input,$matches);
It returns an array with 4 empty entries.
I guess it also, apart from actually working, needs some sort of pointer for the third match because it should be equal to the first one?
Please enlighten me :)
You need to allow the dot to match newlines, and to allow ^ and $ to match at the start and end of lines (not just the entire string):
preg_match_all('/^%%%%(.*?)%%%%(.*?)%%%%\/(.*?)%%%%$/sm',$input,$matches);
The s (single-line) option makes the dot match any character including newlines.
The m (multi-line) option allows ^ and $ to match at the start and end of lines.
The i option is unnecessary in your regex since there are no letters in it.
Then, to answer the second part of your question: If blockname is the same in both cases, then you can make that explicit by using a backreference to the first capturing group:
preg_match_all('/^%%%%(.*?)%%%%(.*?)%%%%\/\1%%%%$/sm',$input,$matches);
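Here is a small sketch of the backreference version in use. The block names header and footer and the $blocks array are made up for illustration; PREG_SET_ORDER is used so that each match arrives as one array.

```php
<?php
// Made-up input with two blocks in the format from the question.
$input = "%%%%header%%%%\nstuff goes here\n%%%%/header%%%%\n"
       . "%%%%footer%%%%\nmore stuff\n%%%%/footer%%%%";

// s: dot matches newlines; m: ^ and $ match at line boundaries;
// \1 requires the closing tag to repeat the name captured in group 1.
preg_match_all('/^%%%%(.*?)%%%%(.*?)%%%%\/\1%%%%$/sm', $input, $matches, PREG_SET_ORDER);

$blocks = [];
foreach ($matches as $m) {
    $blocks[$m[1]] = trim($m[2]); // block name => block body
}

print_r($blocks);
```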
I'm pretty sure you can't since these operations would need to save a variable and you can't in regex. You should try to do this using PHP's built-in token parser. http://php.net/manual/en/function.token-get-all.php
