Regular expression - need to get string from html comment - php

I need to get a string from a comment in an HTML file. I tried to do it with DOM, but I didn't find a good solution with that method.
So I want to try regular expressions instead, but I can't find a satisfactory pattern. Can you help me, please?
This is the comment I need to parse:
<!--adress-"String here I need to get"-->
Thanks in advance for any answers.

Inspect $matches after running this code; $matches[1] holds the quoted string:
preg_match('~<!--adress-"(.*?)"-->~msi', $string, $matches);

HTML comments are regular; you can just match <!--adress-"([^">]+)"--> and get the first group.
This assumes that the comments are always well-formed and always have a quoted string containing no quotes.
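As a minimal sketch of how the pattern above could be used, assuming the page source is already in a string (the sample $html and the $address variable are made up for illustration):

```php
<?php
// Hypothetical sample input; the "adress" spelling is taken from the question.
$html = '<p>Intro</p><!--adress-"String here I need to get"--><p>More</p>';

// [^">]+ stops at the first quote, so the match cannot run past the comment.
if (preg_match('~<!--adress-"([^">]+)"-->~i', $html, $matches)) {
    $address = $matches[1]; // text between the quotes
} else {
    $address = null;        // comment missing or not in the expected form
}

var_dump($address);
```

Checking the return value of preg_match() before reading $matches[1] avoids an undefined-index notice when the comment is absent.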

This one is more flexible: it accepts any label before the hyphen and makes the quotes optional:
$regex = '~<!--(.+?)-"{0,1}(.+?)"{0,1}-->~';
preg_match_all($regex, $html, $matches_array);
Just run var_dump($matches_array) and look at the results.

Related

Regular expression to match a sample shortcode

I need help with a PHP regular expression that would match the sample shortcode below:
[smiley set_name="Happy" filename="smiling.gif"]
I'd like to extract "Happy" and "smiling.gif" from the above shortcode. I would highly appreciate your help. Thanks in advance.
I'm not sure how much of your string you can assume to know but maybe something like:
/\[smiley set_name="([\w\s]+)" filename="(\w+\.(?:gif|jpg|png|jpeg))"\]/
So your code may look like:
$string = '[smiley set_name="Happy" filename="smiling.gif"]';
preg_match('/\[smiley set_name="([\w\s]+)" filename="(\w+\.(?:gif|jpg|png|jpeg))"\]/', $string, $matches);
var_dump($matches);
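If the attribute names or their order are not fixed, a different approach (not the one from the answer above) is to collect every name="value" pair. This is only a sketch; the $attributes array is an illustrative name, and it assumes values never contain a double quote:

```php
<?php
$shortcode = '[smiley set_name="Happy" filename="smiling.gif"]';

// Collect every name="value" pair; PREG_SET_ORDER yields one array per pair.
preg_match_all('/(\w+)="([^"]*)"/', $shortcode, $pairs, PREG_SET_ORDER);

$attributes = [];
foreach ($pairs as $pair) {
    $attributes[$pair[1]] = $pair[2]; // $pair[1] = name, $pair[2] = value
}

print_r($attributes);
```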

Preg_match_all and complex regexp for text match

For example, I have a data response from here:
http://www.facebook.com/ajax/shares/view/?target_fbid=410558838979218&__a=1
It contains a pattern that looks like this:
data-hovercard=\"\/ajax\/hovercard\/hovercard.php?id=655581307\">
How can I parse it with preg_match_all() in PHP?
I know I need a complex regular expression, but I don't have a clue how to write one for such a pattern in the text.
Thanks for the help.
UPD:
The following code does give the id:
$str = 'hovercard.php?id=655581307';
preg_match_all('/[0-9]{9}/', $str , $matches);
print_r($matches);
but this one doesn't:
$url = 'http://www.facebook.com/ajax/shares/view/?target_fbid=410558838979218&__a=1';
$html = file_get_contents($url);
preg_match_all('/[0-9]{9}/', $html, $matches);
print_r($matches);
This gets a bit messy due to the backslashes escaping stuff, but to match exactly that string this call to preg_match_all() should work:
preg_match_all('#(data-hovercard=\\\\"\\\\/ajax\\\\/hovercard\\\\/hovercard.php\?id=[0-9]+\\\\">)#', $str, $matches);
That will give you the whole string you posted in $matches. However, if you only want the numbers from id, you can add extra parentheses around that part, like so:
preg_match_all('#(data-hovercard=\\\\"\\\\/ajax\\\\/hovercard\\\\/hovercard.php\?id=([0-9]+)\\\\">)#', $str, $matches);
And the numbers will appear individually in $matches (similarly, you can remove the parentheses that wrap the whole regexp to stop capturing the whole string).
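If only the ids are needed, a shorter alternative is to anchor on the part of the string that contains no backslashes. The sketch below assumes the response contains text like the excerpt in the question; the second id is invented for illustration:

```php
<?php
// Hypothetical excerpt of the escaped response. In a single-quoted PHP
// string, \" and \/ stay as literal backslash sequences.
$str = 'data-hovercard=\"\/ajax\/hovercard\/hovercard.php?id=655581307\"> ... '
     . 'data-hovercard=\"\/ajax\/hovercard\/hovercard.php?id=123456789\">';

// Match only the unescaped tail and capture the digits after "id=".
preg_match_all('#hovercard\.php\?id=([0-9]+)#', $str, $matches);

print_r($matches[1]); // one entry per id found
```

This trades strictness for readability: it would also match hovercard.php?id=... outside a data-hovercard attribute.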
Update:
Now I see the question has been updated. If your new example fails, it's because there is no sequence of 9 digits in the data you get. When I try it myself I simply get a response saying I need to log in, so maybe your matching issue is in fact caused by not getting the data you expect. Try dumping $html to see whether what you are looking for is actually in there.

find url with regex on text

There are a lot of topics like this one, but I can't find the error in my code, even after many attempts.
This is the original text:
onclick="NewWindow('http://google.com','name','800','600','yes');return false">
This is my code:
$re1='(onclick)';
$re2='(=)';
$re3='(.)';
$re4='(NewWindow)';
$re5='(\\()';
$re6='(.)';
$re7='((?:http|https)(?::\\/{2}[\\w]+)(?:[\\/|\\.]?)(?:[^\\s"]*))';
$c=preg_match_all ("/".$re1.$re2.$re3.$re4.$re5.$re6.$re7."/is", $txt, $matches);
print_r($matches);
Can anyone help me get the URL using a regular expression in PHP?
What is wrong with this code?
Regards
preg_match("/NewWindow\('([^']*)'/",$txt, $matches);
$matches[1] contains the URL.
Is that what you need?
(Edit: put in a code block because a parenthesis was not escaped correctly.)
This should work:
preg_match("/onclick=\"NewWindow\('(.*)','n/",$txt,$matches);
I'd use non-greedy matching for this:
preg_match("/onclick=\"NewWindow\('(.*?)'/", $txt, $matches);
Based on your description, the regex I would use is:
~(?<=NewWindow\(\').*(http://|https://)[^\'\"]*~i
or
~(?<=onclick=\"NewWindow\(\').*(http://|https://)[^\'\"]*~i
A great tool for testing your regex is: http://gskinner.com/RegExr/
It matches just the URL, and only when the URL is preceded by "NewWindow('" (first pattern) or "onclick="NewWindow('" (second pattern); in your case the match is http://google.com.
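To compare the two styles suggested in the answers, here is a small sketch that runs a capture-group pattern and a lookbehind pattern against the text from the question. The variable names $m1 and $m2 are illustrative, and the lookbehind pattern is a simplified variant of the one above:

```php
<?php
$txt = 'onclick="NewWindow(\'http://google.com\',\'name\',\'800\',\'600\',\'yes\');return false">';

// Capture group: the URL is everything up to the next single quote.
preg_match('/NewWindow\(\'([^\']*)\'/', $txt, $m1);

// Lookbehind: the prefix is required but is not part of the match itself.
preg_match('~(?<=NewWindow\(\')https?://[^\'"]*~i', $txt, $m2);

echo $m1[1], "\n"; // URL from capture group 1
echo $m2[0], "\n"; // URL as the whole match
```

With the capture group the URL is in index 1; with the lookbehind it is the entire match in index 0, which can be convenient with preg_replace().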

url matching regex for php

I'm looking for a pattern that can match URLs.
All of them will contain ".no", since the input only contains Norwegian domains.
I think what's needed is this:
look for whitespace (a space or line break) on either side of the text containing '.no'; everything in between is the link.
Some examples of what it should match (all with text around it):
test.no
test.no/blablabla/
test.no/blablabla/test.html
test.no/blablabla/test.php
test.no/blablabla/test.htm
Each match should then be replaced with:
MATCH
Can anyone figure this out?
This should do it:
$html = preg_replace("#\w+\.no[\w/.-]*#", '$0', $html);
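As written, replacing each match with '$0' leaves the text unchanged. Assuming the goal is to turn each match into a clickable link (the anchor markup and the http:// prefix below are assumptions, not taken from the question), a sketch could look like this:

```php
<?php
// Hypothetical input text for illustration.
$text = 'Visit test.no/blablabla/test.html or example.com today.';

// $0 is the whole match; here it is used both as link target and as link text.
$linked = preg_replace(
    '#\w+\.no[\w/.-]*#',
    '<a href="http://$0">$0</a>',
    $text
);

echo $linked, "\n";
```

Note that the trailing character class includes '.', so a URL at the very end of a sentence would swallow the full stop.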
Check John Gruber's Regular expression for URLs and go on from there.
I hope this is strict enough:
^[\w.-]+\.no.*$

Regular expressions PHP preg_match_all

Hi I am trying to use preg_match_all() to extract the number in bold out of an image URL...
http://profile.ak.fbcdn.net/hprofile-ak-snc4/174844_39677118233_8277870_t.jpg
Could someone please help me with the regular expression needed as I am stumped.
I've used this so far:
preg_match_all("(http://profile.ak.fbcdn.net/hprofile-ak-snc4/.*_t.jpg)siU", $this->html, $matching_data);
return $matching_data[0];
}
Which is just giving me an array of the full links.
Hope someone can help, thanks!!!
This will give you all occurrences:
preg_match_all('!/hprofile-ak-snc4/[0-9]+_([0-9]+)[^/]+?\.jpg!i', $txt, $matches);
print_r ($matches);
With the following pattern, the number you have bolded should be in capture group 3, i.e. $matches[3][$n] for the n-th URL:
preg_match_all("#http://profile\.ak\.fbcdn\.net/(.*?)/([0-9]+)_([0-9]+)_([0-9]+)_t\.jpg#is", $string, $matches);
print_r($matches);
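For reading the results per URL, PREG_SET_ORDER can be easier to work with than the default grouping. The sketch below assumes the wanted number is the middle one of the three (that is what the patterns above capture); the second URL and the $ids name are made up:

```php
<?php
$txt = 'http://profile.ak.fbcdn.net/hprofile-ak-snc4/174844_39677118233_8277870_t.jpg '
     . 'http://profile.ak.fbcdn.net/hprofile-ak-snc4/111_222_333_t.jpg'; // second URL is invented

// With PREG_SET_ORDER each element of $matches is one complete match:
// index 0 is the matched text, index 1 is the captured middle number.
preg_match_all('#/hprofile-ak-snc4/\d+_(\d+)_\d+_t\.jpg#i', $txt, $matches, PREG_SET_ORDER);

$ids = [];
foreach ($matches as $match) {
    $ids[] = $match[1];
}

print_r($ids);
```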
Try this:
([a-z][a-z0-9+\-.]*:(//[^/?#]+)?)?
([a-z0-9\-._~%!$&'()*+,;=:#/]*)
(?:(?:\d+_)(\d+)(?:_\d+))\3
I've split it across multiple lines for easier reading; join the lines before using it. You will want to use capture group 4.
Or (just minimized a bit):
(?:[a-z][a-z0-9+\-.]*:(?://[^/?#]+)?)?
([a-z0-9\-._~%!$&'()*+,;=:#/]*)
(?:(?:\d+_)(\d+)(?:_\d+))\1
and use capture group 2
