Why is this regular expression not working? - php

Content of 1.txt:
Image" href="images/product_images/original_images/9961_1.jpg" rel="disable-zoom:false; disable-expand: false"><img src="im
Code that does not work:
<?php
$pattern = '/(images\/product_images\/original_images\/)(.*)(\.jpg)/i';
$result = file_get_contents("1.txt");
preg_match($pattern,$result,$match);
echo "<h3>Preg_match Pattern test:</h3><br><br><pre>";
print_r($match);
echo "</pre>";
?>
I expect this result:
Array
(
[0] => images/product_images/original_images/9961_1.jpg
[1] => images/product_images/original_images/
[2] => 9961_1
[3] => .jpg
)
But I get this instead:
Array
(
[0] => images/product_images/original_images/9961_1.jpg" rel="disable-zoom:false; disable-expand: false">
[1] => images/product_images/original_images/
[2] => 9961_1.jpg" rel="disable-zoom:false; disable-expand: false">
)
I'm tired of trying a million combinations of this regexp and I don't know what's wrong. Please help, and thanks a lot!

Make it ungreedy:
$pattern = '/(images\/product_images\/original_images\/)(.*?)(\.jpg)/i';

Remember that regular expression quantifiers are greedy by default. Your second capture (.*) matches any character except a newline (unless the s "dotall" modifier is set), as many times as possible, so it runs on to the last .jpg it can find on the line.
You can make it ungreedy as suggested by Wrikken, but I prefer to state exactly what I want to capture. In your case, that is part of the value of the href attribute: at least one character that is not a quote, followed by the .jpg extension:
$pattern = '/(images\/product_images\/original_images\/)([^\'"]+)(\.jpg)/i';
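To see the difference between the three variants side by side, here is a minimal sketch. The sample line is the one from 1.txt with a second, hypothetical .jpg reference appended (that part is my assumption, added only so the greedy version has something extra to swallow).

```php
<?php
// Line from the question's 1.txt, plus an assumed second .jpg reference.
$line = 'Image" href="images/product_images/original_images/9961_1.jpg" rel="disable-zoom:false; disable-expand: false"><img src="images/thumb.jpg"';

$prefix = 'images\/product_images\/original_images\/';

$patterns = [
    'greedy'  => '/(' . $prefix . ')(.*)(\.jpg)/i',       // original pattern
    'lazy'    => '/(' . $prefix . ')(.*?)(\.jpg)/i',      // ungreedy quantifier
    'negated' => '/(' . $prefix . ')([^\'"]+)(\.jpg)/i',  // no quotes allowed
];

$names = [];
foreach ($patterns as $label => $pattern) {
    preg_match($pattern, $line, $m);
    $names[$label] = $m[2]; // second capture group: the file name part
    echo $label, ': ', $m[2], "\n";
}
```

Only the greedy pattern runs past the first .jpg; the lazy and negated versions both stop at 9961_1.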

Here's the basic regex:
href="((.*/)(.*?)(\.jpg))"

Do not parse HTML with regex.
Do not parse HTML with regex.
Do not parse HTML with regex.
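If you want to follow that advice, a rough sketch with DOMDocument could look like the one below. The markup is a hypothetical, well-formed version of the fragment in the question (the surrounding a tag and the img src are my assumptions), and it needs PHP's dom extension.

```php
<?php
// Hypothetical, well-formed version of the fragment from the question.
$html = '<a title="Image" href="images/product_images/original_images/9961_1.jpg" '
      . 'rel="disable-zoom:false; disable-expand: false"><img src="images/thumb.jpg"></a>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$files = [];
foreach ($doc->getElementsByTagName('a') as $link) {
    $href = $link->getAttribute('href');
    // pathinfo() splits the path into directory, file name and extension.
    $info = pathinfo($href);
    if (strtolower($info['extension'] ?? '') === 'jpg') {
        $files[] = $info['filename'];
    }
}

print_r($files);
```

This trades the regex for a parser, so attribute order and quoting style no longer matter.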

Related

preg match to get text after # symbol and before next space using php

I need help extracting, with preg_match in PHP, the strings in a text that start with # and run up to the next space.
Ex : I want to get #string from this line as separate.
In this example, I need to extract "#string" alone from this line.
Could anybody help me find a solution for this?
Thanks in advance!
PHP and Python are not the same in regard to searches. If you've already used a function like strip_tags on your capture, then something like this might work better than the Python example provided in one of the other answers since we can also use look-around assertions.
<?php
$string = <<<EOT
I want to get #string from this line as separate.
In this example, I need to extract "#string" alone from this line.
#maybe the username is at the front.
Or it could be at the end #whynot, right!
dog#cat.com would be an e-mail address and should not match.
EOT;
echo $string."<br>";
preg_match_all('~(?<=[\s])#[^\s.,!?]+~',$string,$matches);
print_r($matches);
?>
Output results
Array
(
[0] => Array
(
[0] => #string
[1] => #maybe
[2] => #whynot
)
)
Update
If you're pulling straight from the HTML stream itself, however, note that Twitter's HTML formats a mention like this:
<s>#</s><b>UserName</b>
So to match a username in the HTML stream, you would use the following:
<?php
$string = <<<EOT
<s>#</s><b>Nancy</b> what are you on about?
I want to get <s>#</s><b>string</b> from this line as separate. In this example, I need to extract "#string" alone from this line.
<s>#</s><b>maybe</b> the username is at the front.
Or it could be at the end <s>#</s><b>WhyNot</b>, right!
dog#cat.com would be an e-mail address and should not match.
EOT;
$matchpattern = '~(<s>(#)</s><b\>([^<]+)</b>)~';
preg_match_all($matchpattern,$string,$matches);
$users = array();
foreach ($matches[0] as $username){
$cleanUsername = strip_tags($username);
$users[]=$cleanUsername;
}
print_r($users);
Output
Array
(
[0] => #Nancy
[1] => #string
[2] => #maybe
[3] => #WhyNot
)
Simply do:
preg_match('/#\S+/', $string, $matches);
The result is in $matches[0]
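For comparison, here is a small sketch of how the simple pattern and a lookbehind-based one behave on the same text. The sample sentence and the (?<!\S) variant are my own assumptions, not taken from either answer; (?<!\S) just means "not preceded by a non-space character", so it also allows a tag at the very start of the string.

```php
<?php
// Assumed sample text with an e-mail-like token and a real tag.
$string = 'Mail dog#cat.com or ping #whynot, right!';

// Simple version: # followed by any run of non-space characters.
preg_match_all('/#\S+/', $string, $simple);

// Stricter version: # must not follow a non-space character,
// and the tag stops at whitespace or common punctuation.
preg_match_all('~(?<!\S)#[^\s.,!?]+~', $string, $strict);

print_r($simple[0]); // includes "#cat.com" and the trailing comma
print_r($strict[0]); // only "#whynot"
```

Which one is "right" depends on whether punctuation and mid-word # characters can occur in your input.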

preg_match on special characters

I’m trying to make a preg_match in my PHP code, but I can't seem to get it to work.
I have these 3 tags
{$success url='http://www.domain.localhost/form-success'}
{$error url='http://www.domain.localhost/form-error'}
{$button title='send form'}
How can I get my preg_match to accept what I’m trying to do?
I’m trying to capture {$button(*)} in my match, and I want to do the same with the other two, {$success(*)} and {$error(*)}.
Now I think it should look like this
preg_match("/\{\$button(.+?)}/s", $content, $match);
But it still isn’t working, so I hope other people can help me here.
\$ already means "literal $" in double-quoted PHP strings, so the backslash never reaches the regex engine, and the bare $ that does reach it just means "end of string".
Try \\\$ instead.
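A quick way to check this is to print what each string literal actually hands to the regex engine. This is only a sketch; the $content value is one of the tags from the question.

```php
<?php
// One of the tags from the question (single quotes keep $button literal).
$content = '{$button title=\'send form\'}';

$double = "/\{\$button(.+?)}/s";    // PHP turns \$ into $  -> /\{$button(.+?)}/s
$fixed  = "/\{\\\$button(.+?)}/s";  // \\ -> \ and \$ -> $  -> /\{\$button(.+?)}/s
$single = '/\{\$button(.+?)}/s';    // single quotes: passed through unchanged

foreach (['double' => $double, 'fixed' => $fixed, 'single' => $single] as $label => $pattern) {
    $found = preg_match($pattern, $content, $m);
    echo $label, ': ', $pattern, ' => ', $found ? $m[1] : 'no match', "\n";
}
```

Using single quotes for patterns avoids the double layer of escaping altogether.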
I think the correct regex to retrieve the 'button' tag should be: \{\$button([^\}]*)\}
You can try your expression on http://regexpal.com/
So with PHP (note the single quotes, so that \$ reaches the regex engine intact):
preg_match('/\{\$button([^\}]*)\}/s', $content, $match);
Escape the closing } to be safe, and use preg_match_all to match all the lines.
Try this regex:
preg_match_all('/\{\$(?:success|error|button)\s+([^}]+)\}/i', $content, $match);
Working Demo: http://ideone.com/fKq5L2
OUTPUT (print_r($match[0])):
Array
(
[0] => {$success url='http://www.domain.localhost/form-success'}
[1] => {$error url='http://www.domain.localhost/form-error'}
[2] => {$button title='send form'}
)
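If you also need to know which tag each match belongs to, one possible variation is to capture the tag name and use PREG_SET_ORDER. This is a sketch only; the $tags array is a name I made up for illustration.

```php
<?php
// The three tags from the question, in a nowdoc so $ stays literal.
$content = <<<'EOT'
{$success url='http://www.domain.localhost/form-success'}
{$error url='http://www.domain.localhost/form-error'}
{$button title='send form'}
EOT;

// Group 1: tag name, group 2: everything up to the closing brace.
preg_match_all(
    '/\{\$(success|error|button)\s+([^}]+)\}/i',
    $content,
    $matches,
    PREG_SET_ORDER
);

$tags = []; // hypothetical result map: tag name => attribute string
foreach ($matches as $m) {
    $tags[$m[1]] = $m[2];
}

print_r($tags);
```

With PREG_SET_ORDER each element of $matches is one complete match, which makes the loop easier to read.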

php preg_match replace result duplicate

I am a newbie with preg_match patterns, so I would be glad if someone could help me with the following situation:
I need to replace this string:
[popup="about"]about me[/popup]
to
<a href="#PopupAbout" data-plugin-options='{"type":"inline", preloader: false}'>about me</a>
I have tried $pattern = '/\[popup="(.*?)"\](.*?)\[\/popup\]/'; but it does not give me the expected result; it gives duplicated results. How can I replace them all in a simple way?
Regards!
How about:
$str = preg_replace('~\[popup="about"\](.+?)\[/popup\]~',
"<a href=\"#PopupAbout\" data-plugin-options='{\"type\":\"inline\", preloader: false}'>$1</a>",
$str);
Try this:
preg_match('/\[popup="(.*)"\](.*?)\[\/popup\]/', $input_line, $output_array);
I get this result:
Array
(
[0] => [popup="about"]about me[/popup]
[1] => about
[2] => about me
)
You can test it online here: http://www.phpliveregex.com/
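If the popup name is not always "about", a callback can build the anchor from whatever name was captured. A rough sketch follows; the rule that the target is "#Popup" plus the capitalised name is my assumption based on the single example in the question, and the second popup in the sample string is made up.

```php
<?php
// Sample input: the question's tag plus an assumed second one.
$str = 'Read [popup="about"]about me[/popup] or [popup="contact"]contact us[/popup].';

$out = preg_replace_callback(
    '~\[popup="([^"]+)"\](.+?)\[/popup\]~',
    function (array $m): string {
        // Assumption: the anchor is "#Popup" followed by the capitalised name.
        $target = '#Popup' . ucfirst($m[1]);
        return '<a href="' . $target . '" data-plugin-options=\''
             . '{"type":"inline", preloader: false}\'>' . $m[2] . '</a>';
    },
    $str
);

echo $out, "\n";
```

preg_replace_callback replaces every occurrence in one pass, so there is no need to loop over preg_match results yourself.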

Regex - How to match one pattern at a time

I've this function that parses some content to retrieve homemade link tag and convert it to normal link tag.
Possible input:
<p>blabalblahhh <moolinkx pageid="121">text to click</moolinkx> blablabah</p>
Output :
<p>blabalblahhh text to click blablabah</p>
Here is my code:
$regex = '/\<moolinkx pageid="(.{1,})"\>(.{1,})\<\/moolinkx\>/';
preg_match_all( $regex, $string, $matches );
It works perfectly well if there is only one tag in the string, but as soon as there is a second one, it doesn't work.
Input:
<p>blabalblahhh <moolinkx pageid="121">text to click</moolinkx> blablabah.</p>
<p>Another <moolinkx pageid="128">text to clickclick</moolinkx> again blablablah.</p>
That's what I got when I print_r($matches):
Array
(
[0] => Array
(
[0] => <moolinkx pageid="121">text to click</moolinkx> blablabah.</p><p>Another <moolinkx pageid="128">text to clickclick</moolinkx>
)
[1] => Array
(
[0] => 121">text to click</moolinkx> blablabah.</p><p>Another <moolinkx pageid="128
)
[2] => Array
(
[0] => text to clickclick
)
)
I'm not at ease with regex, so it must be something very trivial... but I can't pinpoint what it is :(
Thank you very much in advance!
NB: This is my first post here, though I've been using this terrific Q&A for ages!
Use a negated character class:
$regex = '/<moolinkx pageid="([^"]+)">([^<]+)<\/moolinkx>/';
Explained demo here: http://regex101.com/r/sI3wK5
You are using a greedy quantifier, which treats everything between the first opening tag and the last closing tag as the content between the tags. Change your regex to:
$regex = '/\<moolinkx pageid="(.+?)"\>(.+?)\<\/moolinkx\>/';
preg_match_all( $regex, $string, $matches );
Notice the .{1,} has changed to .+?. The + means one or more instances, and the ? tells the regex to select the fewest characters it can to fulfil the expression.
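Once each tag matches on its own, the conversion itself can be done in one preg_replace call. Here is a minimal sketch using the two-paragraph input from the question; the page.php?id=... URL is a made-up scheme for illustration, since the question does not show the real link format.

```php
<?php
// Two-tag input from the question.
$string = '<p>blabalblahhh <moolinkx pageid="121">text to click</moolinkx> blablabah.</p>' . "\n"
        . '<p>Another <moolinkx pageid="128">text to clickclick</moolinkx> again blablablah.</p>';

// Negated classes keep each capture inside its own tag.
$regex = '/<moolinkx pageid="([^"]+)">([^<]+)<\/moolinkx>/';

// page.php?id=... is a hypothetical URL scheme, not from the question.
$html = preg_replace($regex, '<a href="page.php?id=$1">$2</a>', $string, -1, $count);

echo $count, " link(s) converted\n", $html, "\n";
```

The fifth argument receives the number of replacements, which is a handy sanity check that both tags were handled separately.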

Is there a way to match recursively/nested with regex? (PHP, preg_match_all)

How can I match both (http://[^"]+)'s?:
(I know it's an illegal URL, but same idea)
I want the regex to give me these two matches:
1 http://yoursite.com/goto/http://aredirectURL.com/extraqueries
2 http://aredirectURL.com/extraqueries
Without running multiple preg_match_all's
Really stumped, thanks for any light you can shed.
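This regular expression will get you the output you want: ((?:http://[^"]+)(http://[^"]+)). In the PHP call the outer pair of parentheses acts as the pattern delimiters (PHP accepts bracket-style delimiters), so the effective pattern is (?:http://[^"]+)(http://[^"]+). Note the use of the non-capturing group (?:regex): the full match holds the whole URL, while only the second URL is captured as group 1. To read more about non-capturing groups, see Regular Expression Advanced Syntax Reference.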
<?php
preg_match_all(
'((?:http://[^"]+)(http://[^"]+))',
'',
$out);
echo "<pre>";
print_r($out);
echo "</pre>";
?>
The above code outputs the following:
Array
(
[0] => Array
(
[0] => http://yoursite.com/goto/http://aredirectURL.com/extraqueries
)
[1] => Array
(
[0] => http://aredirectURL.com/extraqueries
)
)
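To convince yourself that the outer parentheses in that call are only delimiters, you can compare it with the same pattern written with ~ delimiters. This is a sketch; the subject string is my guess at what the question's markup looked like, not the original input.

```php
<?php
// Assumed subject: the nested URL from the question inside an href.
$subject = '<a href="http://yoursite.com/goto/http://aredirectURL.com/extraqueries">link</a>';

// Outer ( ) are bracket-style delimiters here.
preg_match_all('((?:http://[^"]+)(http://[^"]+))', $subject, $a);

// Same pattern with ~ as the delimiter.
preg_match_all('~(?:http://[^"]+)(http://[^"]+)~', $subject, $b);

var_dump($a === $b); // both calls produce the same groups
print_r($a);
```

Index 0 holds the full nested URL and index 1 the inner redirect URL, from a single preg_match_all call.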
You can split the string with this function:
http://de.php.net/preg_split
Each part of the resulting array can then contain, for example, one of the URLs.
If there is more content, you could call preg_split repeatedly, for example from a callback, while the full text is being processed.
$str = '';
preg_match("/\"(http:\/\/.*?)(http:\/\/.*?)\"/i", $str, $match);
echo "{$match[1]}{$match[2]}\n";
echo "{$match[2]}\n";
