regex for double and single quotes in PHP - php

I'm trying to make a regex that will match the argument for a function in PHP. The only problem is the regex doesn't work, or I'm not escaping it well, since I need both double quotes and single quotes to be matched, depending on what the developer used in code.
fn_name("string");
fn_name('string');
I've made two expressions, one for each case, but I guess that can be done way better.
/fn_name\("(.*?)\"\)/
/fn_name\('(.*?)\'\)/
How do I make that one regex, and escape it properly from this?
preg_match_all('/fn_name\("(.*?)\"\)/', file_get_contents($filename), $out);
Thanks.

Use a character class ["'] to capture the opening quote and a backreference (\1) to require the same quote at the end:
preg_match_all('/fn_name\((["\'])(.*?)\1\)/', "fn_name('string');", $out);
preg_match_all('/fn_name\((["\'])(.*?)\1\)/', 'fn_name("string");', $out);
See demo.
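As a minimal sketch of how the backreference behaves (the sample input below is made up for illustration), group 1 captures the opening quote, \1 requires the same character to close the string, and the argument ends up in group 2:

```php
<?php
// Hypothetical sample input: two valid calls and one with mismatched quotes.
$code = <<<'CODE'
fn_name("first");
fn_name('second');
fn_name("broken');
CODE;

// Group 1 = opening quote, group 2 = argument, \1 = the same quote again.
preg_match_all('/fn_name\((["\'])(.*?)\1\)/', $code, $out);

// $out[2] holds the arguments of the calls that matched.
print_r($out[2]);
```

The call with mismatched quotes is skipped, because \1 can only match the quote character that opened the string.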

Try this:
<?php
$s = <<<END
fn_name("string"); fn_name("same line");
fn_name( "str\"ing" );
fn_name("'str'ing'");
fn_name ('string');
fn_name('str\'ing' );
fn_name( '"st"ring"');
fn_name("string'); # INVALID
fn_name('string"); # INVALID
END;
preg_match_all('~fn_name\s*\(\s*([\'"])(.+?)\1\s*\)~', $s, $match, PREG_SET_ORDER);
print_r($match);
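With PREG_SET_ORDER, each element of $match is one complete match, so the argument text sits at index 2 of every element. A small sketch, assuming a shortened made-up input and a hypothetical $args variable, that collects just the arguments:

```php
<?php
// Nowdoc: nothing inside is interpreted, so quotes need no escaping.
$s = <<<'CODE'
fn_name("one"); fn_name( 'two' );
fn_name("bad');
CODE;

preg_match_all('~fn_name\s*\(\s*([\'"])(.+?)\1\s*\)~', $s, $match, PREG_SET_ORDER);

// Pull column 2 (the argument) out of every match set.
$args = array_column($match, 2);
print_r($args);
```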

Related

preg_match_all() html tags

Using preg_match_all(), I want to match something like:
...randomtext...>MATCH1</a>" (MATCH2)"...randomtext... EDIT: to clarify, this is exactly the string I'm trying to extract data from, including the brackets, quotes, angle-brackets etc.
Here's what I've tried: preg_match_all("/^>(.+?)</a>\" \((.+?)\)\"$/", $htmlfile, $matches);
It should extract MATCH1 as $matches[1][0] and MATCH2 as $matches[2][0]
Any idea why it isn't working?
Thanks
You didn't escape the / in your end tag </a>. Since / is also the pattern delimiter, the unescaped slash ends the pattern early.
This should work:
preg_match_all("/>(.+?)<\/a>\" \((.*?)\)/", $htmlfile, $matches);
See Codepad example.
You need to escape the / in your pattern, and you don't want your pattern anchored to ^ and $
So probably this will work: preg_match_all("/>(.+?)<\/a>\" \((.+?)\)\"/", $htmlfile, $matches);
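Another way to sidestep the escaping problem is to pick a delimiter that does not occur in the pattern. A sketch, assuming the sample string from the question and ~ as the delimiter:

```php
<?php
// Sample input modeled on the string in the question.
$htmlfile = '...randomtext...>MATCH1</a>" (MATCH2)"...randomtext...';

// With ~ as the delimiter, the / in </a> needs no escaping.
// Single quotes around the pattern mean the " characters need none either.
preg_match_all('~>(.+?)</a>" \((.+?)\)"~', $htmlfile, $matches);

echo $matches[1][0], "\n"; // first capture group
echo $matches[2][0], "\n"; // second capture group
```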

php: how to regex replace the following strings?

I would like to replace all occurrences of the following, double quotes included:
"http://somebunchofchar"
to
"link"
I came up with preg_replace("/\"http:\/\/.\"/i", "\"link\"", $string);
Just add an asterisk and a question mark after the dot, so that it lazily matches any number of characters instead of exactly one:
preg_replace("/\"http:\/\/.*?\"/i", "\"link\"", $string);
$string = preg_replace('#"http://.+"#', '"link"', $string);
You can use:
preg_replace('~"http://[^"]*"~i', '"link"', $string);
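The three patterns suggested above behave differently once a line contains more than one quoted URL. A short comparison, using a made-up input string:

```php
<?php
// Made-up input with two quoted URLs on one line.
$string = 'a "http://x.example" b "http://y.example" c';

// Greedy .+ runs to the last quote on the line, swallowing both URLs.
$greedy = preg_replace('#"http://.+"#', '"link"', $string);

// Lazy .*? stops at the first closing quote.
$lazy = preg_replace('/"http:\/\/.*?"/i', '"link"', $string);

// [^"]* cannot cross a quote at all, so each URL is replaced separately.
$negated = preg_replace('~"http://[^"]*"~i', '"link"', $string);

echo $greedy, "\n", $lazy, "\n", $negated, "\n";
```

On this input the greedy version collapses everything between the first and last quote into a single "link", while the lazy and negated-class versions replace each URL on its own.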
Look here:
http://regexlib.com/DisplayPatterns.aspx?cattabindex=1&categoryId=2&AspxAutoDetectCookieSupport=1
to find a pattern that correctly matches a URL, then use preg_replace with that pattern (you can easily add the quotes at the start and end of the pattern yourself) :-)

issues with preg_match PHP

I have a string:
$str="(94896)content is here(/94896)(94897)content is here(/94897)(94898)content is here(/94898)(94899)content is here(/94899)";
the (number) and (/number) act as tags to take certain content out of the string.
and I have a preg_match to take the content out:
if(preg_match('/(94896)\"(.*)\"(\/94896)/',$str,$c)) {echo "I found the content, its:".$co[1];}
Now for some reason, it doesn't find a match in the string ($str), though it's clearly there.
Any ideas on what I'm doing wrong here?
You need to take the double-quotes out of your regex string, since they don't appear in $str, but are expected by the regex.
'/(94896)\"(.*)\"(\/94896)/'
// ^^ ^^
// These aren't in the string.
EDIT: I think you'll also need to escape your brackets, since they will be getting read as grouping operators, not actual brackets.
Your expression should be:
'/\(94896\)(.*)\(\/94896\)/'
Parentheses are used in a regex to denote subpatterns. If you want to search these characters in a string, you must escape them:
preg_match('/\(94896\)(.*)\(\/94896\)/',$str,$c)
If the pattern is found, the content is in the first capture group ($c[0] would be the whole match, tags included):
echo "I found the content, its:".$c[1];
Oh, and as Karl Nicoll says, why are the quotations in your pattern?
To match all content:
$str="(94896)content is here(/94896)(94897)content is here(/94897)(94898)content is here(/94898)(94899)content is here(/94899)";
$re = '/\((\d+)\)(.*)\(\/\1\)/';
preg_match_all($re, $str, $matches,PREG_SET_ORDER);
var_dump($matches);
Number will be in $matches[*][1], content in $matches[*][2].
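As a sketch of how those set-order results might be consumed (the shortened input and the $contentById name are illustrative assumptions), the matches can be folded into an array keyed by tag number. The lazy .*? is a small change from the greedy .* above. Because the backreference pins the closing tag to the same number, both find the same matches on this input.

```php
<?php
// Shortened, made-up input in the same (number)...(/number) format.
$str = '(94896)first(/94896)(94897)second(/94897)';

// \1 forces the closing tag to carry the same number as the opening tag.
preg_match_all('/\((\d+)\)(.*?)\(\/\1\)/', $str, $matches, PREG_SET_ORDER);

// Build a lookup table: tag number => content.
$contentById = [];
foreach ($matches as $m) {
    $contentById[$m[1]] = $m[2];
}
print_r($contentById);
```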

Regex pattern matching literal repeated \n

Given a literal string such as:
Hello\n\n\n\n\n\n\n\n\n\n\n\nWorld
I would like to reduce the repeated \n's to a single \n.
I'm using PHP, and have been playing around with a bunch of different regex patterns. So here's a simple example of the code:
$testRegex = '/(\\n){2,}/';
$test = 'Hello\n\n\n\n\n\n\n\n\nWorld';
$test2 = preg_replace($testRegex ,'\n',$test);
echo "<hr/>test regex<hr/>".$test2;
I'm new to PHP, not that new to regex, but it seems '\n' conforms to special rules. I'm still trying to nail those down.
Edit: I've placed the literal code I have in my php file here, if I do str_replace() I can get good things to happen, but that's not a complete solution obviously.
To match a literal \n with regex, your string literal needs four backslashes to produce a string with two backslashes, which the regex engine interprets as an escape for one backslash.
$testRegex = '/(\\\\n){2,}/';
$test = 'Hello\n\n\n\n\n\n\n\n\n\n\n\nWorld';
$test2 = preg_replace($testRegex, '\n', $test);
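The layers of escaping can be checked step by step. A minimal sketch using the same pattern on a shorter made-up input:

```php
<?php
// Single quotes: \n stays as two characters, a backslash and an n.
$test = 'Hello\n\n\n\nWorld';

// '\\\\n' is the PHP string \\n, which PCRE reads as "literal backslash, then n".
$testRegex = '/(\\\\n){2,}/';

// In the replacement, '\n' is again a literal backslash followed by n.
$test2 = preg_replace($testRegex, '\n', $test);

echo $test2, "\n";
var_dump(strlen($test2)); // 5 + 2 + 5 characters
```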
Perhaps you need to double up the escape in the regular expression?
$pattern = "/\\n+/";
$awesome_string = preg_replace($pattern, "\n", $string);
Edit: Just read your comment on the accepted answer. Doesn't apply, but is still useful.
If you intend to expand this logic to include other forms of white-space too:
$output = preg_replace('%(\s)+%', '$1', $input);
Reduces each run of white-space characters to a single instance of the last white-space character in the run.
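It does conform to special rules: inside single quotes PHP leaves \n as a backslash followed by n, and the regex engine then reads that sequence as a newline character. The "multiline" modifier, m, only changes how ^ and $ behave, so it is not needed here. A pattern for actual newline characters would look like
$pattern = '/(\n)+/';
which matches runs of real newlines. See the doc for all modifiers and their detailed meaning.
If your string contains real newlines rather than a literal backslash followed by n, the pattern above should work with the rest of your code. Good luck!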
Try this regular expression:
/[\n]*/

regex back-reference not working in PHP PCRE

I want to match matching tags like <tag>...</tag>. I tried the regex
~<([^>]+)>.*?</\1>~
but this fails. The expression worked when I used the exact text inside the angle brackets, i.e,
~<(tag)>.*?</tag>~
works, but even
~<(tag)>.*?</\1>~
fails.
I'm assuming that the back reference is not working here.
Can someone help me out please. Thanks
PS: I'm not using this to parse HTML. I know I shouldn't.
You didn't show your PHP code, but I surmise you have your regex in double quotes. If so then the backreference \1 actually is converted into an ASCII character ☺ before it reaches PCRE. (All \123 sequences are interpreted as C-string octal escapes there.)
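The difference is easy to observe directly. A small sketch, with a made-up subject string, comparing the same pattern in double and single quotes:

```php
<?php
// In double quotes PHP itself consumes \1 as an octal escape (character code 1).
$double = "~<([^>]+)>.*?</\1>~";
// In single quotes the backslash and the 1 both reach PCRE, which sees a backreference.
$single = '~<([^>]+)>.*?</\1>~';

$str = '<tag>text</tag>';

// The double-quoted pattern really does contain the control character.
var_dump(strpos($double, chr(1)) !== false);

var_dump(preg_match($double, $str)); // looks for a literal chr(1), not a backreference
var_dump(preg_match($single, $str)); // backreference to group 1
```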
It worked for me...
$str = '<a></a>';
var_dump(preg_match('~<([^>]+)>.*?</\1>~', $str)); // int(1)
CodePad.
Also, have you considered an XML parser? Otherwise it won't like a piece of HTML like this...
<a title="Is 4 > 6?"></a>
CodePad.
Apart from the fact that it's not always a good idea to try and match markup languages using regex, your regex looks OK. Maybe you're using it wrong?
if (preg_match('~<([^>]+)>.*?</\1>~', $subject, $regs)) {
$result = $regs[0];
} else {
$result = "";
}
should work.
Use single quotes around the pattern, so that PHP does not interpret \1 before the regex engine sees it:
preg_match_all('/(sens|respons)e and \1ibility/', "sense and sensibility", $matches);
print_r($matches);
