I need to preg_match for
src="http:// "
where the blank space following // is the rest of the URL, ending with the closing ". My adapted pattern doesn't seem to work:
preg_match('#src="(http://[^"]+)#', $data, $match);
And I am also struggling to get text that starts with > and ends with EITHER a full stop . or an exclamation mark ! or a question mark ? I have no idea how to do this one. An example of the text I want to preg_match for is:
blahblahblah>Hello world this is what I want.
I'm hoping a kind preg_match guru can tell me the answer and save me hours of headscratching.
Thanks for reading.
As for the URL:
preg_match('#src="(.*?)"#', $data, $match);
and for the second case, use />(.*?)(\.|!|\?)/
(.*?)" will match any character lazily (non-greedily), stopping at the first closing double quote it sees.
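Both cases can be tried out with a short script. This is only a sketch: the sample strings are made up, and the second pattern uses the character class [.!?] in place of the alternation (\.|!|\?), which matches the same three characters.

```php
<?php
// Hypothetical sample inputs, not taken from the original question.
$html = '<img src="http://example.com/pic.png" alt="demo">';
$text = 'blahblahblah>Hello world this is what I want. And more.';

// Capture everything between src=" and the next double quote.
$url = preg_match('#src="(http://[^"]+)"#', $html, $m) ? $m[1] : null;

// Capture from just after ">" up to and including the first . ! or ?
$sentence = preg_match('/>(.*?[.!?])/', $text, $m) ? $m[1] : null;

echo $url, "\n", $sentence, "\n";
```

If the text can span several lines, the s modifier would be needed so that . also matches newlines.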
It seems that you want to parse a document or string that follows an HTML, DOM, XML or similar structure.
Use XPath: select the tag and let it return the src attribute. This will save much trouble and you can forget about regular expressions.
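A minimal sketch of the XPath approach, assuming the src attributes sit on img tags (the question does not say which tag) and that PHP's DOM extension is available. The sample markup is invented for illustration.

```php
<?php
// Hypothetical sample markup.
$html = '<html><body><img src="http://example.com/a.png"><p>text</p>'
      . '<img src="http://example.com/b.png"></body></html>';

$doc = new DOMDocument();
// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

// Select the src attribute node of every img element.
$xpath = new DOMXPath($doc);
$sources = [];
foreach ($xpath->query('//img/@src') as $attr) {
    $sources[] = $attr->nodeValue;
}

print_r($sources);
```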
I think I need to use the preg_replace function but not sure exactly how to type in the patterns I want to find and replace. Basically, I want to replace this:
: u"x"x",
with this:
: u"x'x",
x means that any characters can go there. But I don't know how to write the x in PHP.
Thank you!
Edit: basically, I want to replace that middle double-quote with a single-quote. And I'll be searching through a big JSON file to do it. Probably should have said this at the start.
You could use this regular expression:
$result = preg_replace('#(: u".*?)"(.*?")#', "$1'$2", $string);
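One caveat with the lazy pattern: if a value has no inner quote but another double quote follows on the same line, the pattern can replace the value's closing quote instead. The sketch below is a stricter variant under two assumptions that are not in the question: each value sits on one line, and the value's closing quote is immediately followed by a comma. The sample data is made up.

```php
<?php
// Hypothetical sample lines: the first has a stray inner quote, the second does not.
$string = '"size": u"5" screen",' . "\n" . '"name": u"plain",';

// [^"\n]* cannot cross a quote or a line break, so only a quote that is
// followed later on the same line by the closing ", is replaced.
$result = preg_replace('#(: u"[^"\n]*)"([^"\n]*",)#', '$1\'$2', $string);

echo $result, "\n";
```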
I'm trying to scrape some information, just for learning PHP and regex, and I would like to extract it from an HTML page.
The HTML text is an entire webpage, but it has some patterns like somehtmltext_andtags_andeverything /ajax/hovercard/user.php?id=THE_ID_I_WANT andmore_text_and_tags.
I can isolate the pattern with TextEdit on a Mac, but I want to separate it!
How could I do it in PHP?
Thank you in advance!
Rafael.
Sorry, I was very unclear.
I want to separate only the ID, so if you see the image, the only text you would get is 100009799451329. If the final result is the whole sentence (ajax/hovercard/user.php?id=100009799451329), it doesn't matter, that works fine for me!
try this
$matchArr = NULL;
preg_match_all("/\/ajax\/hovercard\/user\.php\?id=(.*?)\&/", $yourStr, $matchArr);
print_r($matchArr);
You can use the following pattern to find the id:
\/ajax\/hovercard\/user\.php\?id=(\d+)
See a demo.
Explanation:
\/ajax\/hovercard\/user\.php\?id= will match /ajax/hovercard/user.php?id= (the dot is escaped so that it matches only a literal dot)
(\d+) captures a sequence of digits, in this case the user id.
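In PHP the pattern could be used roughly like this. The sample HTML is invented, and # is used as the delimiter so the slashes need no escaping.

```php
<?php
// Hypothetical page fragment containing two hovercard links.
$html = '<a href="/ajax/hovercard/user.php?id=100009799451329&extra=1">A</a>'
      . '<a href="/ajax/hovercard/user.php?id=42">B</a>';

// Group 1 holds the digits that follow id= for every occurrence.
preg_match_all('#/ajax/hovercard/user\.php\?id=(\d+)#', $html, $matches);
$ids = $matches[1];

print_r($ids);
```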
Ok, so here's my issue:
I have a link, say: http://www.blablabla.com/watch?v=1lyu1KKwC74&feature=list_other&playnext=1&list=AL94UKMTqg-9CfMhPFKXPXcvJ_j65v7UuV
And the link is between two tags say like this:
<br>http://www.blablabla.com/watch?v=1lyu1KKwC74&feature=list_other&playnext=1&list=AL94UKMTqg-9CfMhPFKXPXcvJ_j65v7UuV<br></p>
Using this regex with preg_replace:
'#(^|[^\/]|[^>])('.addcslashes($link,'.?+').')([^\w\/]|[^<]$)#i'
As such:
preg_replace('#(^|[^\/]|[^>])('.addcslashes($link,'.?+').')([^\w\/]|[^<]$)#i', "***",$strText);
The resulted string is :
<br***p>
Which is wrong!!
It should have been
<br>***<br></p>
How can I get the desired result? I have blasted my head out trying to solve this one out.
I would like to mention that str_replace replaces even the link within another valid link, so it's not a good method, I need an exact match between two boundaries, even if the boundary is text or another HTML tag.
Assuming you don't want to use a DOM parser for some reason, I believe doing what you intended is as simple as the following:
preg_replace('#(^|[^\/]|[^>])('.addcslashes($link,'.?+').')([^\w\/]|[^<]$)#i', "$1***$3",$strText);
This uses $1 and $3 to put back the delimiting text you matched in your regular expression.
As others have pointed out, using a DOM parser is more reliable.
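As a sketch of the same replacement, the snippet below keeps the boundary groups from the question but swaps addcslashes for preg_quote, which escapes every regex metacharacter plus the given delimiter. The shortened link and the second test string are made up for illustration.

```php
<?php
// Hypothetical shortened link and surrounding markup.
$link = 'http://www.blablabla.com/watch?v=1lyu1KKwC74&feature=list_other';
$strText = '<br>' . $link . '<br></p>';

// preg_quote() escapes ., ?, + and the rest, plus the # delimiter.
$pattern = '#(^|[^\/]|[^>])(' . preg_quote($link, '#') . ')([^\w\/]|[^<]$)#i';

// $1 and $3 put the matched boundary characters back around the stars.
$result = preg_replace($pattern, '$1***$3', $strText);

// The same link as a prefix of a longer URL is left alone.
$longer = '<br>' . $link . 'XYZ<br>';
$untouched = preg_replace($pattern, '$1***$3', $longer);

echo $result, "\n", $untouched, "\n";
```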
Does this do what you want?
OK, I tried to google it but couldn't find a solution, so I am asking here. I am trying to save HTML tags into a variable in PHP. I am trying to use preg_match but cannot find the right pattern (regex). I did find one regex, '\s*(.*?)\s*>\s*'. This works OK on the functions-online site where I tried it and gives me the whole tag, i.e. <body>, but when I try to run it in my program I get
preg_match(): Delimiter must not be alphanumeric or backslash
It would be helpful if anyone could sort out this issue and even better if anyone could give the regex to get the data within the angle brackets(HTML tags)
Please also let me know if there is another method to store the html tags in php i.e.
<body>
then $var=body
RegEx match open tags except XHTML self-contained tags <-- Read the 1st answer if you are considering "parsing" HTML with regexes
You need to add so called delimiters: '/\s*(.*?)\s*>\s*/'
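A short sketch of what the delimiters look like in practice. It assumes the pattern was meant to start with a literal < (the pattern as quoted in the question has none), and the input string is made up.

```php
<?php
$html = '<body>';
$var = null;

// The first and last character of the pattern string are the delimiters.
// # is used here instead of / so that a closing tag like </p> needs no escaping.
if (preg_match('#<\s*(.*?)\s*>#', $html, $m)) {
    $var = $m[1];   // text between the angle brackets
}

echo $var, "\n";
```

Any non-alphanumeric, non-backslash, non-whitespace character can serve as the delimiter, as long as the same one closes the pattern.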
Ok Thanx to the link provided by killerx I did find a regex which could be used. It is not the best method, but it should work for my task
'\'<([a-z]+)[^>]*(?<!/)>\''
This should work. It will get the full tag in an array and the tag description in the other.
Thanx a ton for helping me out
Been beating my head against a wall trying to get this to work - help from any regex gurus would be greatly appreciated!
The text that has to be matched
[template option="whatever"]
<p>any amount of html would go here</p>
[/template]
I need to pull the 'option' value (i.e. 'whatever') and the html between the template tags.
So far I have:
/\[template\s*option=["\']([^"\']+)["\']\]((?!\[\/template\]))/
Which gets me everything except the html between the template tags.
Any ideas?
Thanks, Chris
edit: [\s\S] will match anything that is space or not space.
You may have a problem when there are consecutive blocks in a large string, because the greedy [\s\S]+ runs on to the last [/template]. In that case you will need a more specific quantifier: either non-greedy (+?) or a range such as {1,200}, or make the [\s\S] class more specific.
/\[template\s*option=["\']([^"\']+)["\']\]([\s\S]+)\[\/template\]/
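For the consecutive-block case, a sketch with the non-greedy +? could look like this. The two sample blocks are invented, and trimming the captured HTML is a choice made for the example only.

```php
<?php
// Hypothetical input with two consecutive template blocks.
$text = "[template option=\"first\"]\n<p>one</p>\n[/template]\n"
      . "[template option='second']\n<p>two</p>\n[/template]";

// +? makes [\s\S] stop at the nearest [/template] instead of the last one.
$pattern = '/\[template\s*option=["\']([^"\']+)["\']\]([\s\S]+?)\[\/template\]/';
preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

// Map each option value to the HTML found inside its block.
$blocks = [];
foreach ($matches as $m) {
    $blocks[$m[1]] = trim($m[2]);
}

print_r($blocks);
```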
Try this
/\[template\s*option=\"(.*)\"\](.*)\[\/template]/
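Basically, instead of using a complex regex to match every single thing, just use (.*), which means "everything", since you want everything in between and don't need to verify the data in between. Note that . does not match newlines by default, so add the s modifier after the closing delimiter if the HTML spans several lines.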
The assertion ?! method is unneeded. Just match with .*? to get the minimum giblets (the s modifier lets . match newlines too).
/\[template\s*option=\pP([\h\w]+)\pP\] (.*?) \[\/template\]/xs
Chris,
I see you've already accepted an answer. Great!
However, I don't think use of regular expressions is the right solution here. I think you can get the same effect by using string manipulations (substrings, etc)
Here is some code that may help you. If not now, maybe later in your coding endeavors.
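<?php
$string = '[template option="whatever"]<p>any amount of html would go here</p>[/template]';
// Everything from 'option=' onward, then skip the 8 characters of 'option="'.
$extractoptionline = strstr($string, 'option=');
$chopoff = substr($extractoptionline, 8);
// The option value runs up to the closing '"]'.
$option = substr($chopoff, 0, strpos($chopoff, '"]'));
echo "option: $option<br />\n";
// Everything from '"]' onward, then skip those 2 characters.
$extracthtmlpart = strstr($string, '"]');
$chopoffneedle = substr($extracthtmlpart, 2);
// The HTML runs up to the '[/' that opens the closing tag.
$html = substr($chopoffneedle, 0, strpos($chopoffneedle, '[/'));
echo "html: $html<br />\n";
?>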
Hope this helps anyone looking for a similar answer with a different flavor.