preg_replace returns unexpected results to $1

<?php
$data='123
[test=abc]cba[/test]
321';
$test = preg_replace("(\[test=(.+?)\](.+?)\[\/test\])is","$1",$data);
echo $test;
?>
I expect the above code to return
abc
but instead of returning abc it returns
123 abc 321
Please tell me what I am doing wrong.

You're only replacing the matched part (the BBcode section). You're leaving the rest of the string untouched.
If you also want to remove the leading/trailing text, include those in the expression:
$test = preg_replace("(.*\[test=(.+?)\](.+?)\[\/test\].*)is","$1",$data);
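To make the difference concrete, here is a minimal sketch that runs both variants on the sample input from the question. The variable names $partial and $whole are only illustrative, and ~ is used as the delimiter instead of parentheses.

```php
<?php
// Sample input from the question (three lines of text).
$data = "123\n[test=abc]cba[/test]\n321";

// Only the tag is matched, so only the tag is replaced;
// the text before and after it is left in place.
$partial = preg_replace('~\[test=(.+?)\](.+?)\[/test\]~is', '$1', $data);

// With .* on both sides (and the s flag so that . also matches newlines)
// the match spans the whole string, so only group 1 survives.
$whole = preg_replace('~.*\[test=(.+?)\](.+?)\[/test\].*~is', '$1', $data);

var_dump($partial); // string(11) "123\nabc\n321" (with real newlines)
var_dump($whole);   // string(3) "abc"
```

In a browser the newlines in the first result collapse to spaces, which is why the question shows 123 abc 321.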

I don't know if you're aware of this, but the outermost set of parentheses in your regex does not form a group (capturing or otherwise). PHP is interpreting them as regex delimiters. If you are aware of that, and you're using them as delimiters on purpose, please don't. It's usually best to use a non-bracketing character that never has any special meaning in regexes (~, %, #, etc.).
I agree with Casimir that preg_match() is the tool you should be using, not preg_replace(). But his solution is trickier than it needs to be. Your original regex works fine; all you have to do is grab the contents of the first capturing group, like so:
if (preg_match('%\[test=(.+?)\](.+?)\[/test\]%si', $data, $match)) {
    $test = $match[1];
}
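A quick way to check the delimiter point is to run the same pattern with both kinds of delimiter and compare the results. This is only a sketch; $withParens and $withPercent are made-up names.

```php
<?php
$data = "123\n[test=abc]cba[/test]\n321";

// Outer parentheses act as delimiters here, not as a group,
// so group 1 is still the (.+?) that follows "test=".
preg_match('(\[test=(.+?)\](.+?)\[/test\])is', $data, $withParens);

// The same pattern with % as the delimiter.
preg_match('%\[test=(.+?)\](.+?)\[/test\]%is', $data, $withPercent);

var_dump($withParens === $withPercent); // bool(true)
echo $withParens[1], PHP_EOL;           // abc
```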

You don't need a replacement here; all you need is to extract part of the string, and preg_match is better suited to that:
$data='123
[test=abc]cba[/test]
321';
$test = preg_match('~\[test=\K[^\]]++~', $data, $match);
echo $match[0];
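For readers who haven't met \K before: it discards everything matched so far from the reported match, so no capturing group is needed. A small sketch comparing it with the capturing-group form (the names $viaK and $viaGroup are illustrative only):

```php
<?php
$data = "123\n[test=abc]cba[/test]\n321";

// \K resets the start of the reported match, so index 0
// contains only what follows "[test=".
preg_match('~\[test=\K[^\]]++~', $data, $viaK);

// Equivalent result using a capturing group and index 1.
preg_match('~\[test=([^\]]+)~', $data, $viaGroup);

echo $viaK[0], PHP_EOL;     // abc
echo $viaGroup[1], PHP_EOL; // abc
```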

Related

PHP preg_match regular expression for find date in string

I'm trying to build a system that can detect a date in a string. Here is the code:
$string = "02/04/16 10:08:42";
$pattern = "/\<(0?[1-9]|[12][0-9]|3[01])\/\.- \/\.- \d{2}\>/";
$found = preg_match($pattern, $string);
if ($found) {
echo ('The pattern matches the string');
} else {
echo ('No match');
}
The result I get is "No match", so I don't think I used the correct regex for the pattern. Can somebody tell me what I must do to fix this code?

First of all, remove all gibberish from the pattern. This is the part you'll need to work on:
(/0?[1-9]|[12][0-9]|3[01]/)
(As you said, you need the date only, not the datetime).
The main problem with the pattern is that you are using the logical OR operators (|) at the delimiters. If the delimiters are slashes, then you need to replace the pipe characters with escaped slashes. You need to escape them so that the parser does not treat them as pattern delimiters. Like this: \/.
Now, you need to solve some logical tasks here, to match the numbers correctly and you're good to go.
(I'm not gonna solve the homework for you :) )
These articles will help you to solve the problem though:
Character classes
Repetition operators
Special characters
Pipe character (alternation operator)
Good luck!

In your comment you say you are looking for yyyy, but the example says yy.
I wrote a pattern for yy because that is what you gave us; you can easily change the last 2 to a 4 to get yyyy.
preg_match("/((0|1|2|3)[0-9])\/\d{2}\/\d{2}/", $string, $output_array);
echo $output_array[0]; // date (index 0 is the full match; index 1 would be the first two digits only)
Edit:
If you use this pattern it will match the time too, which makes a false match less likely.
((0|1|2|3)[0-9])/\d{2}/\d{2}\s+\d{2}:\d{2}:\d{2}
http://www.phpliveregex.com/p/fjP
Edit2:
Also, you can skip one line of code.
You first assign the preg_match result to $found and then test $found in an if.
This works too:
if (preg_match($pattern, $string, $found)) {
    echo $found[0]; // the full match
} else {
    echo "nothing found";
}
With pattern and string as referred to above.
As you can see, $found is passed to preg_match as the output array, so the if is true whenever there is a match.
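Putting the hints from both answers together, a stricter pattern might look like the sketch below. It assumes the date is written dd/mm/yy, which the question does not actually state, and the names $pattern and $m are only for illustration.

```php
<?php
$string = "02/04/16 10:08:42";

// Assumption: day/month/two-digit year, separated by slashes.
// ~ is the delimiter, so the slashes need no escaping.
$pattern = '~\b(0?[1-9]|[12][0-9]|3[01])/(0?[1-9]|1[0-2])/(\d{2})\b~';

if (preg_match($pattern, $string, $m)) {
    echo "date:  {$m[0]}", PHP_EOL; // date:  02/04/16
    echo "day:   {$m[1]}", PHP_EOL; // day:   02
    echo "month: {$m[2]}", PHP_EOL; // month: 04
    echo "year:  {$m[3]}", PHP_EOL; // year:  16
} else {
    echo 'No match', PHP_EOL;
}
```

The range checks only rule out impossible numbers such as day 32 or month 13; they do not validate the calendar (31/02/16 would still match).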

Regex pattern for matching mm <sup>3<sup>

I’m trying to write a regular expression to change mm3 to mL:
<?php
$match = 'mm<sup>3</sup>';
if(preg_match('/\b(mm<sup>3</sup>)\b/', $match))
{
$replacement = 'ml';
$replac = preg_replace('/\b(mm<sup>3</sup>)\b/', $replacement, $match);
echo $replac;
}
?>
But my regular expression doesn't capture the content in $match variable, and the $replac value isn't output. What am I doing wrong?

Change:
if(preg_match('/\b(mm<sup>3</sup>)\b/',$match))
to:
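if(preg_match('#\bmm<sup>3</sup>#',$match))
and similarly in the preg_replace call.
Since your regular expression contains /, you need to either escape it or use a different delimiter around the regular expression.
The trailing \b has to go as well: a word boundary needs a word character on one side, and > followed by the end of the string (or by a space) has none, so \b fails there.
There's also no need for the parentheses, since you're not doing anything with the groups.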
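Both ways of dealing with the / can be sketched side by side. The sample text and the names $a and $b are made up for the example.

```php
<?php
$text = 'Volume: 5 mm<sup>3</sup>';

// Option 1: keep / as the delimiter and escape the / inside the pattern.
$a = preg_replace('/\bmm<sup>3<\/sup>/', 'mL', $text);

// Option 2: use # as the delimiter so the / needs no escaping.
$b = preg_replace('#\bmm<sup>3</sup>#', 'mL', $text);

echo $a, PHP_EOL; // Volume: 5 mL
echo $b, PHP_EOL; // Volume: 5 mL
```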

You need to either escape that / in your regexp (by hand, or with preg_quote and its delimiter argument), or use a different delimiter (usually # is used).
Also, the \b after the > is worse than unnecessary: it stops the pattern from matching whenever the end of the string or a space follows. The parentheses aren't needed either, since you don't seem to be capturing anything; you're basically doing a more expensive str_replace.
Finally, you can do everything in one move. If there's no match, nothing will happen.
<?php
$match = 'mm<sup>3</sup>';
$replacement='ML';
$replac = preg_replace('#\\bmm<sup>3</sup>#',
$replacement,
$match);
echo $replac;
?>
If you want to be picky, I guess you should also replace with 'ml', not 'ML' :-)
(for replacement of multiple strings, preg_replace supports arrays).
Note: unless you're sure that is the exact HTML you want replaced, maybe you ought to try
$pattern = '#\\bmm\\s*<sup>\\s*3\\s*</sup>#';
in order to catch mm <sup>3</sup> and similar, in addition to mm<sup>3</sup> (in some circumstances they may look alike, and some editors might use or automatically "correct" either form into the other).
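A short sketch of the whitespace-tolerant variant, run against a few hand-written samples (the sample strings and the names $pattern, $samples and $results are assumptions for illustration). The last line also shows what a trailing \b does after the closing tag.

```php
<?php
// Optional whitespace is allowed before <sup> and around the 3.
$pattern = '#\bmm\s*<sup>\s*3\s*</sup>#';

$samples = ['mm<sup>3</sup>', 'mm <sup>3</sup>', 'mm<sup> 3 </sup>'];
$results = [];
foreach ($samples as $sample) {
    $results[] = preg_replace($pattern, 'mL', $sample);
}
print_r($results); // every entry becomes "mL"

// With \b after </sup>, nothing matches at the end of the string,
// because neither ">" nor the string end is a word character.
$withTrailingBoundary = preg_match('#\bmm<sup>3</sup>\b#', 'mm<sup>3</sup>');
var_dump($withTrailingBoundary); // int(0)
```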

regex to clean up url

I am looking for a way to get a valid url out of a string like:
$string = 'http://somesite.com/directory//sites/9/my_forms/3-895a3e/somefilename.jpg|:||:||:||:|19845';
My original solution was:
preg_match('#^[^:|]*#', str_replace('//', '/', $string), $modifiedPath);
But obviously it's going to remove a slash from the http:// instead of the one in the middle of the string.
My expected output that I want from the original is:
http://somesite.com/directory/sites/9/my_forms/3-895a3e/somefilename.jpg
I could always break off the http part of the string first but would like a more elegant solution in the form of regex if possible. Thanks.

This will do exactly what you are asking:
<?php
$string = 'http://somesite.com/directory//sites/9/my_forms/3-895a3e/somefilename.jpg|:||:||:||:|19845';
preg_match('/^([^|]+)/', $string, $m); // get everything up to and NOT including the first pipe (|)
$string = $m[1];
$string = preg_replace('/(?<!:)\/\//', '/' ,$string); // replace all occurrences of // as long as they are not preceded by :
echo $string; // outputs: http://somesite.com/directory/sites/9/my_forms/3-895a3e/somefilename.jpg
exit;
?>
EDIT:
(?<!X) in regular expressions is the syntax for what is called a lookbehind. The X is replaced with the character(s) we are testing for.
The following expression would match every instance of double slashes (//):
\/\/
But we need to make sure that the match we are looking for is NOT preceded by the : character so we need to 'lookbehind' our match to see if the : character is there. If it is then we don't want it to be counted as a match:
(?<!:)\/\/
The ! is what says NOT to match in our lookbehind. If we changed it to (?<=:)\/\/ then it would only match the double slashes that did have the : preceding them. (Note that (?=:) without the < is a lookahead, which tests what follows instead.)
A tutorial on lookahead and lookbehind can explain it all better than I can.
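The two kinds of lookbehind can be compared on a shortened version of the URL. This is just a sketch; the shortened URL and the names $cleaned and $m are illustrative.

```php
<?php
$url = 'http://somesite.com/directory//sites/9/file.jpg';

// Negative lookbehind: replace // only where it is NOT preceded by ":".
$cleaned = preg_replace('~(?<!:)//~', '/', $url);
echo $cleaned, PHP_EOL; // http://somesite.com/directory/sites/9/file.jpg

// Positive lookbehind: find // only where it IS preceded by ":".
preg_match_all('~(?<=:)//~', $url, $m, PREG_OFFSET_CAPTURE);
echo count($m[0]), ' match at offset ', $m[0][0][1], PHP_EOL; // 1 match at offset 5
```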

Assuming all your strings are in the form given, you don't need any but the simplest of regexes to do this; if you want an elegant solution, then a regex is definitely not what you need. Also, double slashes are legal in a URL, just like in a Unix path, and most servers treat them the same as a single slash, so you may not need to get rid of them at all.
Why not just
$url = preg_split('/\|/', $string)[0];
?
If you really, really care about getting rid of the double slashes in the URL, then you can follow this with
$url = preg_replace('/([^:])\/\//', '$1/', $url);
or even combine them into
$url = preg_replace('/([^:])\/\//', '$1/', preg_split('/\|/', $string)[0]);
although that last form gets a little bit hairy.

Since this is a quite strictly defined situation, I'd consider just one preg to be the most elegant solution.
From the top of my head:
$sanitizedURL = preg_replace('~((?<!:)/(?=/)|\\|.+)~', '', $rawURL);
Basically, what this does is look for any forward slash that IS NOT preceded by a colon (:), and IS followed by another forward slash. It also searches for any pipe character and any character following it.
Anything found is removed from the result.
I can explain the RegEx in more detail if you like.
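One way to spell out the parts of that expression is to write it in extended (x) mode, where whitespace is ignored and # starts a comment. The sketch below is an equivalent rewrite of the pattern without the redundant outer group, not the answerer's own code.

```php
<?php
$rawURL = 'http://somesite.com/directory//sites/9/my_forms/3-895a3e/somefilename.jpg|:||:||:||:|19845';

// Extended mode (x flag): whitespace in the pattern is ignored
// and everything after # on a line is a comment.
$pattern = '~
    (?<!:) / (?=/)   # a slash not preceded by ":" and followed by another slash
  |                  # or
    \| .+            # the first pipe and everything after it
~x';

// Both kinds of match are replaced by the empty string.
$sanitizedURL = preg_replace($pattern, '', $rawURL);
echo $sanitizedURL, PHP_EOL;
// http://somesite.com/directory/sites/9/my_forms/3-895a3e/somefilename.jpg
```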

regex back-reference not working in PHP PCRE

I want to match matching tags like <tag>...</tag>. I tried the regex
~<([^>]+)>.*?</\1>~
but this fails. The expression worked when I used the exact text inside the angle brackets, i.e,
~<(tag)>.*?</tag>~
works, but even
~<(tag)>.*?</\1>~
fails.
I'm assuming that the back reference is not working here.
Can someone help me out please. Thanks
PS: I'm not using this to parse HTML. I know I shouldn't.

You didn't show your PHP code, but I surmise you have your regex in double quotes. If so then the backreference \1 actually is converted into an ASCII character ☺ before it reaches PCRE. (All \123 sequences are interpreted as C-string octal escapes there.)
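Since the question's PHP code isn't shown, the following is only a sketch of the suspected cause. It compares the same pattern written with single quotes, with double quotes, and with double quotes plus an escaped backslash; the variable names are made up.

```php
<?php
$subject = '<tag>x</tag>';

$single = '~<([^>]+)>.*?</\1>~';   // \1 reaches PCRE as a backreference
$double = "~<([^>]+)>.*?</\1>~";   // PHP turns \1 into the byte chr(1) first
$fixed  = "~<([^>]+)>.*?</\\1>~";  // \\ keeps the backslash in double quotes

var_dump(preg_match($single, $subject)); // int(1)
var_dump(preg_match($double, $subject)); // int(0)
var_dump(preg_match($fixed, $subject));  // int(1)
```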

It worked for me...
$str = '<a></a>';
var_dump(preg_match('~<([^>]+)>.*?</\1>~', $str)); // int(1)
Also, have you considered an XML parser? Otherwise it won't like a piece of HTML like this...
<a title="Is 4 > 6?"></a>

Apart from the fact that it's not always a good idea to try and match markup languages using regex, your regex looks OK. Maybe you're using it wrong?
if (preg_match('~<([^>]+)>.*?</\1>~', $subject, $regs)) {
$result = $regs[0];
} else {
$result = "";
}
should work.

Use single quotes in the pattern
preg_match_all('/(sens|respons)e and \1ibility/', "sense and sensibility", $matches);
print_r($matches);

Simple RegEx PHP

Since I am completely useless at regex and this has been bugging me for the past half an hour, I think I'll post this up here as it's probably quite simple.
hey.exe
hey2.dll
pomp.jpg
In PHP I need to extract what's between the <a> tags example:
hey.exe
hey2.dll
pomp.jpg

Avoid using '.*' even if you make it ungreedy, until you have some more practice with RegEx. I think a good solution for you would be:
'/<a[^>]+>([^<]+)<\/a>/i'
Note the '/' delimiters - you must use the preg suite of regex functions in PHP. It would look like this:
preg_match_all($pattern, $string, $matches);
// matches get stored in '$matches' variable as an array
// matches in between the <a></a> tags will be in $matches[1]
print_r($matches);
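Put together as a runnable sketch, it might look like this. The markup in the question was lost, so the input string below, including its href values, is invented for the example.

```php
<?php
// Hypothetical input; the original markup in the question is not shown.
$string = '<a href="/files/hey.exe">hey.exe</a> <a href="/files/hey2.dll">hey2.dll</a> <a href="/files/pomp.jpg">pomp.jpg</a>';

// [^>]+ stays inside the opening tag, [^<]+ stays inside the link text.
$pattern = '/<a[^>]+>([^<]+)<\/a>/i';

preg_match_all($pattern, $string, $matches);
print_r($matches[1]); // hey.exe, hey2.dll, pomp.jpg
```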

This appears to work:
$pattern = '/<a.*?>(.*?)<\/a>/';

([^<]*)

I found this regular expression tester to be helpful.

Here is a very simple one:
<a.*>(.*)</a>
However, you should be careful if you have several matches in the same line, e.g.
hey.exehey2.dll
In this case, the correct regex would be:
<a.*?>(.*?)</a>
Note the '?' after the '*' quantifier. By default, quantifiers are greedy, which means they eat as many characters as they can (meaning they would return only "hey2.dll" in this example). By appending a question mark, you make them ungreedy, which should better fit your needs.
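The difference is easy to see on a line with two links. In this sketch the input markup is made up (the original example lost its tags), and $greedy and $lazy are illustrative names.

```php
<?php
// Hypothetical one-line input with two links.
$html = '<a href="#">hey.exe</a><a href="#">hey2.dll</a>';

// Greedy: the first .* runs as far as it can, so the single match
// spans both links and the group holds only the last link text.
preg_match_all('~<a.*>(.*)</a>~', $html, $greedy);

// Lazy: each quantifier stops as early as possible, giving one match per link.
preg_match_all('~<a.*?>(.*?)</a>~', $html, $lazy);

print_r($greedy[1]); // hey2.dll
print_r($lazy[1]);   // hey.exe, hey2.dll
```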
