Problems with PHP PCRE - php

I'm having a problem with PHP PCRE, and I'm used to POSIX, so I'm not too sure about what I'm doing wrong. Basically, this function is matching up to 10 numbers separated by commas. However, it's also matching the string sdf (and probably many others), which I don't see the reason for. Can anyone help me?
$pattern='^\d{0,5},? ?\d{0,5},? ?\d{0,5},? ?\d{0,5},? ?\d{0,5},? ?\d{0,5},? ?\d{0,5},? ?\d{0,5},? ?\d{0,5},? ?\d{0,5},? ?^';
$leftcheck=preg_match($pattern, $leftmodules);
$centercheck=preg_match($pattern, $centermodules);
$rightcheck=preg_match($pattern, $rightmodules);
if(!$leftcheck OR !$centercheck OR !$rightcheck)
{
$editpage = $_SERVER['HTTP_REFERER'].'?&error=1';
die("Location:$editpage");
}

^\d{1,5}(, *\d{1,5}){0,9}$
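A minimal sketch of how the pattern above might be used in PHP, assuming / as the delimiter. The function name isValidModuleList is made up for illustration and does not come from the original code.

```php
<?php
// Hypothetical helper: true when $list holds 1 to 10 numbers of up to
// five digits each, separated by a comma and optional spaces.
function isValidModuleList(string $list): bool
{
    // The slashes are PCRE delimiters; ^ and $ anchor the whole string.
    return preg_match('/^\d{1,5}(, *\d{1,5}){0,9}$/', $list) === 1;
}

var_dump(isValidModuleList('1, 22,333')); // bool(true)
var_dump(isValidModuleList('sdf'));       // bool(false)
```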

I'm assuming the following:
Spaces may or may not be there.
Numbers can be any length.
Only numbers, spaces, and commas are allowed.
Trailing commas without a number after them are allowed.
Between 1 and 10 numbers separated by commas are OK.
Given that:
$pattern = '/^(\d+,* *){1,10}$/';
works.

From what I can see, the regular expression you provided will match anything you pass into it. Here's why:
\d{0,5} #\d matches any digit character, while {0,5} means the
#preceding character must be repeated between **0** and five times
So your regular expression is essentially short-circuiting. The engine looks at the start of your string and asks, "Has a digit been repeated 0 times? Yes? OK, it's a match!"
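The zero-repetition point can be checked directly. This is only a sketch with a reduced pattern (one \d{0,5} instead of ten) and / delimiters, which are assumptions for the sake of a short example:

```php
<?php
// With no anchors, \d{0,5} is satisfied by zero digits, so it matches
// the empty string at the very start of any subject.
$result = preg_match('/\d{0,5}/', 'sdf', $m);

var_dump($result); // int(1)
var_dump($m[0]);   // string(0) ""
```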

I think if your numbers are only separated by commas, something like this should do it (note the / delimiters, which preg_match requires):
$pattern = '/^\d{0,5},\d{0,5},\d{0,5},\d{0,5},\d{0,5},\d{0,5},\d{0,5},\d{0,5},\d{0,5},\d{0,5}$/';

You need to enclose the pattern between two identical delimiter characters for it to be valid. People usually use /.
$pattern = '/some pattern/';
To match the whole thing you want to have ^ at the start and $ at the end. Getting this wrong is probably why your sdf was matching.
$pattern = '/^whole pattern match$/';
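It's a bit confusing how the numbers will be separated. Is it comma or space? Are both OK? What about none? Here's my best guess though. (The lower bound is written out as {0,5} because older PCRE versions treat {,5} as literal text.)
$pattern = '/^\d{0,5}[, ](\d{0,5}[, ]){0,9}$/';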
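To illustrate the delimiter point: PHP accepts any non-alphanumeric, non-backslash, non-whitespace character as a delimiter, so a pattern written as '^...^' has its carets consumed as delimiters and loses its anchors. A small sketch with a shortened pattern (the variable names are illustrative):

```php
<?php
// Here the carets act as delimiters, leaving the unanchored pattern
// \d{0,5}, which matches an empty string in any subject.
$caretDelimited = preg_match('^\d{0,5}^', 'sdf');

// With slashes as delimiters, ^ and $ work as anchors again.
$anchored = preg_match('/^\d{0,5}$/', 'sdf');

var_dump($caretDelimited); // int(1)
var_dump($anchored);       // int(0)
```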

Related

regular expression gone wrong

I want to find all strings looking like [!plugin=tesplugin arg=dfd arg=2!] and put them in array.
Important feature: the string may or may not contain arguments (arg=...), and of course there could be any number of args. So the string could look like:
[!plugin=myname!] or [!plugin=whatever1 arg=22!] or even [!plugin=gal-one arg=1 arg=text arg=tx99!]. I need to put them all in $strarray items
Here is what i did...
$inp = "[!plugin=tesplugin arg=dfd!] sometxt [!plugin=second arg=1 arg=2!] 1sd";
preg_match_all('/\[!plugin=[a-z0-9 -_=]*!]/i', $inp, $str);
but $str[0][0] contains:
[!plugin=tesplugin arg=dfd!] sometxt [!plugin=second arg=1 arg=2!]
instead of putting each expression in a new array item.
I think the problem is in my regex, but I can't find it. Please help...
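The - in the character class needs to be at the start, at the end, or escaped. As written, " -_" is a range of ASCII characters from space to underscore, which includes !, [ and ], so the class runs straight through the first closing !] and into the next tag. Escaping the last ] is optional in PCRE but makes the intent clearer.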
\[!plugin=[a-z0-9 \-_=]*!\]
Regex101 Demo: https://regex101.com/r/zV4bO2/1
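A short sketch of the corrected pattern applied to the sample input from the question, keeping the question's variable names:

```php
<?php
$inp = "[!plugin=tesplugin arg=dfd!] sometxt [!plugin=second arg=1 arg=2!] 1sd";

// The escaped hyphen is a literal "-", so the class no longer contains
// the " " to "_" range (which includes "!", "[" and "]").
preg_match_all('/\[!plugin=[a-z0-9 \-_=]*!\]/i', $inp, $str);

print_r($str[0]);
// Array
// (
//     [0] => [!plugin=tesplugin arg=dfd!]
//     [1] => [!plugin=second arg=1 arg=2!]
// )
```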

Regex to extract substring

Really struggling with this... hopefully someone can put me on the right path to a solution.
My input string is structured like this:
66-2141-A-AC107-7
I'm interested in extracting the string 'AC107' using a single regular expression. I know how to do this with other PHP string functions, but I have to do this with a regular expression.
What I need is to extract all data between the third and fourth hyphens. The structure of each section is not fixed (i.e, 66 may be 8798709 and 2141 may be 38). The presence of the number of hyphens is guaranteed (i.e., there will always be a total of four (4) hyphens).
Any help/guidance is greatly appreciated!
This will do what you need:
(?:[^-]*-){3}([^-]+)
Explanation:
(?:[^-]*-) Look for zero or more non-hyphen characters followed by a hyphen
{3} Look for three of the blocks just described
([^-]+) Capture all the consecutive non-hyphen characters from that point forward (will automatically cut off before the next hyphen)
You can use it in PHP like this:
$str = '66-2141-A-AC107-7';
preg_match('/^(?:[^-]*-){3}([^-]+)/', $str, $matches);
echo $matches[1]; // prints AC107
This should look for anything followed by a hyphen, three times over; group 2 (the second set of parentheses) will then hold your value, followed by another hyphen and anything else.
/^(.*-){3}(.*)-(.*)/
You can access it by using $2. In php, it would be like this:
$string = '66-2141-A-AC107-7';
preg_match('/^(.*-){3}(.*)-(.*)/', $string, $matches);
$special_id = $matches[2];
print $special_id;
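Either pattern leaves $matches empty when the input does not fit, so checking the return value of preg_match before reading the group is worthwhile. A sketch using the first pattern; the function name fourthSegment is invented for this example:

```php
<?php
// Hypothetical helper: returns the text between the third and fourth
// hyphens, or null when the pattern does not match.
function fourthSegment(string $str): ?string
{
    if (preg_match('/^(?:[^-]*-){3}([^-]+)/', $str, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

var_dump(fourthSegment('66-2141-A-AC107-7')); // string(5) "AC107"
var_dump(fourthSegment('no hyphens here'));   // NULL
```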

PHP Regex: match text urls until space or end of string

This is the text sample:
$text = "asd dasjfd fdsfsd http://11111.com/asdasd/?s=423%423%2F gfsdf http://22222.com/asdasd/?s=423%423%2F
asdfggasd http://3333333.com/asdasd/?s=423%423%2F";
This is my regex pattern:
preg_match_all( "#http:\/\/(.*?)[\s|\n]#is", $text, $m );
That matches the first two URLs, but how do I match the last one? I tried adding [\s|\n|$] but that also matches only the first two URLs.
Don't try to match \n (there's no line break after all!) and instead use $ (which will match to the end of the string).
Edit:
I'd love to hear why my initial idea doesn't work, so in case you know it, let me know. I'd guess because [] tries to match one character, while end of line isn't one? :)
This one will work:
preg_match_all('#http://(\S+)#is', $text, $m);
Note that you don't have to escape the / due to them not being the delimiting character, but you'd have to escape the \ as you're using double quotes (so the string is parsed). Instead I used single quotes for this.
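Run against the sample text from the question, the \S+ version can be sketched like this:

```php
<?php
$text = "asd dasjfd fdsfsd http://11111.com/asdasd/?s=423%423%2F gfsdf http://22222.com/asdasd/?s=423%423%2F
asdfggasd http://3333333.com/asdasd/?s=423%423%2F";

// \S+ consumes every non-whitespace character, so each match stops at a
// space, a line break, or the end of the string without naming any of them.
preg_match_all('#http://(\S+)#is', $text, $m);

print_r($m[0]);
// Array
// (
//     [0] => http://11111.com/asdasd/?s=423%423%2F
//     [1] => http://22222.com/asdasd/?s=423%423%2F
//     [2] => http://3333333.com/asdasd/?s=423%423%2F
// )
```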
I'm not familiar with PHP, so I don't have the exact syntax, but maybe this will give you something to try. The [] means a character class, so |$ inside it will literally look for a $. I think what you'll need is a lookahead instead, with a lazy quantifier so each match stops at the first whitespace, something like this:
#http:\/\/(.*?)(?=\s|$)#
I apologize if this is way off, but maybe it will give you another angle to try.
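For what it's worth, the lookahead idea can be tried in PHP roughly as follows. The sample text here is shortened and made up for the example, and the pattern uses a lazy quantifier:

```php
<?php
// Illustrative input, not the text from the question.
$text = "one http://11111.com/a?s=1 two http://22222.com/b\nthree http://3333333.com/c";

// Outside a character class, $ is an anchor, so (?=\s|$) means
// "followed by whitespace or the end of the string".
// The lazy .*? stops at the first position where that holds.
preg_match_all('#http://(.*?)(?=\s|$)#', $text, $m);

print_r($m[1]);
// Array
// (
//     [0] => 11111.com/a?s=1
//     [1] => 22222.com/b
//     [2] => 3333333.com/c
// )
```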
See What is the best regular expression to check if a string is a valid URL?
It has some very long regular expressions that will match all urls.

Regular expression to match single dot but not two dots?

I'm trying to create a regex pattern for an email address check that will allow a dot (.) but not more than one dot in a row.
Should match:
test.test#test.com
Should not match:
test..test#test.com
Now I know there are thousands of examples on internet for e-mail matching, so please don't post me links with complete solutions, I'm trying to learn here.
Actually the part that interests me the most is just the local part:
test.test that should match and test..test that should not match.
Thanks for helping out.
You may allow any number of [^\.] (any character except a dot) and ([^\.])\.[^\.] (a dot enclosed by two non-dots) by using a disjunction (the pipe symbol |) between them and putting the whole thing with * (any number of those) between ^ and $ so that the entire string consists of those. Here's the code:
$s1 = "test.test#test.com";
$s2 = "test..test#test.com";
$pattern = '/^([^\.]|([^\.])\.[^\.])*$/';
echo "$s1: ", preg_match($pattern, $s1),"<p>","$s2: ", preg_match($pattern, $s2);
Yields:
test.test#test.com: 1
test..test#test.com: 0
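One thing to watch with the pattern above: the second alternative consumes the character on each side of the dot, so an input such as a.b.c (a single character between two dots) is rejected. A variant sketch for the local part only, which is not the answer's pattern; the helper name hasOnlySingleDots is invented here:

```php
<?php
// Hypothetical helper, a variant of the pattern from the answer above:
// one or more runs of non-dot characters, joined by single dots.
function hasOnlySingleDots(string $local): bool
{
    return preg_match('/^[^.]+(?:\.[^.]+)*$/', $local) === 1;
}

var_dump(hasOnlySingleDots('test.test'));  // bool(true)
var_dump(hasOnlySingleDots('test..test')); // bool(false)
var_dump(hasOnlySingleDots('a.b.c'));      // bool(true)
```

Like the original, this variant also rejects a leading or trailing dot.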
This seems more logical to me:
/[^.]([\.])[^.]/
And it's simple. The look-ahead & look-behinds are indeed useful because they don't capture values. But in this case the capture group is only around the middle dot.
strpos($input,'..') === false
The strpos function is simpler: if $input does not contain '..', the test succeeds.
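Wrapped in a function, the strpos check might look like this (the name hasNoDoubleDot is hypothetical):

```php
<?php
// Hypothetical helper: true when $input contains no two consecutive dots.
function hasNoDoubleDot(string $input): bool
{
    // strpos() returns false when the needle is absent; the strict ===
    // keeps a match at offset 0 from being mistaken for "not found".
    return strpos($input, '..') === false;
}

var_dump(hasNoDoubleDot('test.test'));  // bool(true)
var_dump(hasNoDoubleDot('test..test')); // bool(false)
var_dump(hasNoDoubleDot('..test'));     // bool(false)
```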
To answer the question in the title, I'd update the RegExp by Junuxx and allow dots in the beginning and end of the string:
'/^\.?([^\.]|([^\.]\.))*$/'
which is optional . in the beginning followed by any number of non-. or [non-. followed by .].
^([^.]+\.?)+#$
That should do for what comes before the #; I'll leave the rest for you.
Note that you should optimise it more to avoid other strange character setups, but this seems sufficient in answering what interests you
Don't forget the ^ and $ like I first did :(
Also forgot to slash the . - silly me

Regular Expressions find and replace

I am having problems with RegEx in PHP and can't seem to find the answer.
I have a string, which is 3 letters, all caps, i.e. COS.
The letters will change but will always be 3 characters long and in caps; the string will also be in the middle of another string, surrounded by commas.
I need a regex to find 3 capital letters inside a string and change them from COS to 'COS'
(I'm doing this to amend an SQL insert string).
I can't seem to find the regex unless I use specific letters, but the letters will change.
I need something along the lines of
[A-z]{3} then replace with '[A-Z]' (I know this isn't anywhere near correct, just shorthand)
Anyone any suggestions?
Cheers
EDIT:
Just wanted to add, in case anyone comes across this question at a later date:
the SQL insert string (provided by an external source and FTP'd to my server daily)
contained the 3-capital string twice, once with commas and once without,
so I also had to remove the doubled quotes added by the first regex:
$sqlString = preg_replace('/([A-Z]{3})/', "'$1'", $sqlString);
$sqlString = preg_replace('/\'\'([A-Z]{3})\'\'/', "'$1'", $sqlString);
Thanks everyone
You were actually very close. You could use:
echo preg_replace('/([A-Z]{3})/', "'$1'", 'COS'); //will output 'COS'
For MySQL statements I would advise using the function mysql_real_escape_string() though (it was removed in PHP 7; there, use mysqli_real_escape_string() or prepared statements).
$string = preg_replace('/([A-Z]{3})/', "'$1'", $string);
http://php.net/manual/en/function.preg-replace.php
Assuming it's like you said, "three capital letters surrounded by commas", e.g.
Foo bar,COS,Foo Bar
You can use look-ahead and look-behinds and find the letters:
(?<=,)([A-Z]{3})(?=,)
Then a simple replace to surround with single quotes will be adequate:
'$1'
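Put together in PHP, the lookaround pattern and the replacement could be sketched like this, using the example string from this answer:

```php
<?php
$string = 'Foo bar,COS,Foo Bar';

// The lookbehind and lookahead require a comma on each side without
// consuming it, so only the three letters are replaced.
$quoted = preg_replace('/(?<=,)([A-Z]{3})(?=,)/', "'$1'", $string);

echo $quoted; // Foo bar,'COS',Foo Bar
```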
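All together, here it is working. (The replacement is written as $2 because "${2}" inside a double-quoted PHP string is parsed as a variable rather than a backreference.)
preg_replace('/(^|\b)([A-Z]{3})(\b|$)/', "'$2'", $string);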
