Extract YouTube URL from random text in PHP

I'm trying to extract a YouTube link from arbitrary text, e.g.:
This is some random text and url is http://www.youtube.com/watch?v=-d3RYW0YoEk&feature=channel and I want to pull this URL out of this text in PHP. Can't seem to figure it out. Found a solution in another language but don't know how to convert it.
Thanks for the help.

You can use preg_match_all to grab all such URLs:
if(preg_match_all('~(http://www\.youtube\.com/watch\?v=[%&=#\w-]*)~',$input,$m)){
// matches found in $m
}

You could try using a regex:
http://php.net/manual/en/function.preg-match.php

Use preg_match.
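The pattern should be something like this (the character class includes a hyphen because IDs such as -d3RYW0YoEk contain one, which \w alone would not match):
/(http\:\/\/www\.youtube\.com\/watch\?v=[\w-]{11})/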
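Putting the answers above together, here is a minimal sketch of pulling watch URLs out of free text. The function name extractYoutubeUrls is made up for illustration, and the pattern assumes 11-character video IDs and optional https and www prefixes, which goes slightly beyond what the question asked for.

```php
<?php
// Hypothetical helper: returns every YouTube watch URL found in $text.
function extractYoutubeUrls(string $text): array
{
    // https?           : accept http and https (assumption)
    // (?:www\.)?       : the www prefix is optional (assumption)
    // [\w-]{11}        : an 11-character ID that may contain "-" or "_"
    // (?:&[\w=%.-]*)*  : keep trailing parameters such as &feature=channel
    $pattern = '~https?://(?:www\.)?youtube\.com/watch\?v=[\w-]{11}(?:&[\w=%.-]*)*~i';
    preg_match_all($pattern, $text, $matches);
    return $matches[0]; // full matches only
}

$input = 'This is some random text and url is '
       . 'http://www.youtube.com/watch?v=-d3RYW0YoEk&feature=channel '
       . 'and I want to pull this URL out of this text.';

print_r(extractYoutubeUrls($input));
```

If only the first URL is needed, preg_match with the same pattern would do; preg_match_all is used here so that text containing several links is handled as well.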

Related

Regex to extract and format part of URL

I need to extract p5925 and c98 of the following URL:
http://www.example.com/mens-shoes/mens-boots/p5925/c98/colour/Black
The format I want is:
p:5925, c:98
The regex I have now is ([^pc]\d+), which matches just the numbers, but I'm not sure how to get the result in the format I want.
I used the code preg_match('/([^pc]\d+)/', $ls_url, $la_matches); to split it.
You need to use preg_match_all:
preg_match_all('~(?<=/)([pc])(\d+)(?=/)~', $ls_url, $la_matches);
or
preg_match_all('~\b([pc])(\d+)\b~', $ls_url, $la_matches);
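The patterns above capture the letter and the digits separately, but the question also asks for the output format p:5925, c:98. One way to build that string, as a sketch using the first pattern and the PREG_SET_ORDER flag (the variable names $parts and $formatted are illustrative):

```php
<?php
$ls_url = 'http://www.example.com/mens-shoes/mens-boots/p5925/c98/colour/Black';

// PREG_SET_ORDER groups the results per match: [full match, letter, digits].
preg_match_all('~(?<=/)([pc])(\d+)(?=/)~', $ls_url, $la_matches, PREG_SET_ORDER);

$parts = [];
foreach ($la_matches as $match) {
    // $match[1] is "p" or "c", $match[2] is the number that follows it.
    $parts[] = $match[1] . ':' . $match[2];
}

$formatted = implode(', ', $parts);
echo $formatted, "\n"; // p:5925, c:98
```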

PHP preg_match search for specific pattern coming from a path

I'm trying to scrape some information, just to learn PHP and regex, and I would like to extract it from an HTML page.
The HTML text is an entire web page, but it contains patterns like somehtmltext_andtags_andeverything /ajax/hovercard/user.php?id=THE_ID_I_WANT andmore_text_and_tags.
I can isolate the pattern with TextEdit on a Mac, but I want to separate it out!
How could I do that in PHP?
Thank you in advance!
Rafael.
Sorry, I was very unclear.
I want to separate out only the ID, so if you see the image, the only text you would get is 100009799451329. If the final result is the whole string (ajax/hovercard/user.php?id=100009799451329), it doesn't matter, that works fine for me!
Try this:
$matchArr = NULL;
preg_match_all("/\/ajax\/hovercard\/user\.php\?id=(.*?)\&/", $yourStr, $matchArr);
print_r($matchArr);
You can use the following pattern to find the id:
\/ajax\/hovercard\/user\.php\?id=(\d+)
Explanation:
\/ajax\/hovercard\/user\.php\?id= will match /ajax/hovercard/user.php?id=
(\d+) captures a sequence of digits, in this case the user id.
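As a sketch of how that pattern could be used from PHP: the sample markup in $html below is invented for illustration, and ~ is used as the delimiter so the slashes do not need escaping.

```php
<?php
// Invented sample input; a real page would be loaded from a file or a request.
$html = '<a data-hovercard="/ajax/hovercard/user.php?id=100009799451329&amp;extra=1">Name</a>';

$userId = null;
// (\d+) stops at the first non-digit, so a trailing "&..." is not included.
if (preg_match('~/ajax/hovercard/user\.php\?id=(\d+)~', $html, $m)) {
    $userId = $m[1]; // the captured digits only
}

echo $userId, "\n";
```

Unlike the (.*?)\& version, this does not require an ampersand after the ID, so it also matches when the ID is the last parameter.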

PHP preg_match for grabbing YouTube ID from img.youtube.com too

I have implemented a function for grabbing a YouTube ID, and I use this regular expression:
preg_match('%(?:youtube(?:-nocookie)?\.com/(?:(?:v|e(?:mbed)?)/|.*[?&]v=|[^/]+/.+/)|youtu\.be/)([^"&?/ ]{11})%i', $sourceCode, $youtube);
My problem is that this regular expression doesn't grab the ID from a URL like: http://img.youtube.com/vi/Di9mW35zprs/0.jpg
So, how can I change it so that it works as it does now but also gives me "Di9mW35zprs" from the URL "http://img.youtube.com/vi/Di9mW35zprs/0.jpg"?
There are many other questions about grabbing a YouTube ID with preg_match, but none of them answers how to grab it from a URL like http://img.youtube.com/vi/Di9mW35zprs/0.jpg too.
All I want to achieve is to replace my current preg_match with one that can also grab the YouTube ID from http://img.youtube.com/vi/Di9mW35zprs/0.jpg URLs.
Thank you!
Use parse_url() and explode() first. Now your regexp looks like it came straight from hell.
Try using this:
$videoId = preg_replace("/.*v[=i]\/?(\w+).*/", '\1', "http://img.youtube.com/vi/Di9mW35zprs/0.jpg");
The function will return the id you want. It should work with both types of URL you described.
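The parse_url() and explode() suggestion above could look roughly like this for the thumbnail case. This is only a sketch: the function name youtubeIdFromThumbnail is hypothetical, and it assumes the path always has the form /vi/ID/file.jpg as in the question.

```php
<?php
// Hypothetical helper: returns the video ID from an img.youtube.com
// thumbnail URL, or null when the URL does not have the expected shape.
function youtubeIdFromThumbnail(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    $path = parse_url($url, PHP_URL_PATH);

    if ($host !== 'img.youtube.com' || !is_string($path)) {
        return null;
    }

    // "/vi/Di9mW35zprs/0.jpg" becomes ["vi", "Di9mW35zprs", "0.jpg"]
    $segments = explode('/', trim($path, '/'));

    if (count($segments) >= 2 && $segments[0] === 'vi') {
        return $segments[1];
    }
    return null;
}

echo youtubeIdFromThumbnail('http://img.youtube.com/vi/Di9mW35zprs/0.jpg'), "\n";
```

A caller could try this first and fall back to the existing preg_match for the other URL forms when it returns null.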

preg_match and images URL

I have a little problem with the preg_match function in PHP. I think I will never learn how to use it. I want to extract the URL of an image from HTML without the name of the image. For example, if I have a link to an image:
"/data/images/2013-10-03/someimage.jpg"
or
"http://something.com//data/images/2013-10-03/someimage.jpg"
How can I use preg_match to delete everything to the left of the last forward slash, so that I get only the image name from the URL?
Maybe it's smarter to use a different function, but I don't know which one.
P.S. Can you recommend a good tutorial for the preg_match function?
Maybe I forgot to say: I don't know how long the image name is or what exactly it is. I need a function that extracts only what is to the right of the last forward slash.
$pattern = '/[\w\-]+\.(jpg|png|gif|jpeg)/';
$subject = 'http://something.com//data/images/2013-10-03/someimage.png';
$result = preg_match($pattern, $subject, $matches);
echo $matches[0]; // someimage.png
No need for regex or anything fancy:
$var = "http://something.com/data/images/2013-10-03/someimage.jpg";
$image = basename($var);
You need to use preg_replace(), and you can experiment with regular expressions online, which is a fast way to learn regex: http://preg_replace.onlinephpfunctions.com/
For example, replace /\/someimage.jpg/ with '' (an empty string).
It will return http://something.com//data/images/2013-10-03 from http://something.com//data/images/2013-10-03/someimage.jpg.
You can use Simple HTML DOM Parser to get the href of the a tags.
For example:
foreach ($html->find('a[class="your class"]') as $var) {
    echo $var->href; // print the href of each matching link
}
Hope this helps!

Using PHP's preg_match_all to extract a URL

I have been struggling for a while now to make the following work. Basically, I'd like to be able to extract a URL from an expression contained in an HTML template, as follows:
{rssfeed:url(http://www.example.com/feeds/posts/default)}
The idea is that, when this is found, the URL is extracted, and an RSS feed parser is used to get the RSS and insert it here. It all works, for example, if I hardcode the URL in my PHP code, but I just need to get this regex figured out so the template is actually flexible enough to be useful in many situations.
I've tried at least ten different regex expressions, mostly found here on SO, but none are working. The regex doesn't even need to validate the URL; I just want to find it and extract it, and the delimiters for the URL don't need to be parens, either.
Thank you!
Could this work for you?
'#((https?://)?([-\w]+\.[-\w\.]+)+\w(:\d+)?(/([-\w/_\.]*(\?\S+)?)?)*)#'
I use it to match URLs in text.
Example:
$subject = "{rssfeed:url(http://www.example.com/feeds/posts/default)}";
$pattern ='#((https?://)?([-\w]+\.[-\w\.]+)+\w(:\d+)?(/([-\w/_\.]*(\?\S+)?)?)*)#';
preg_match_all($pattern, $subject, $matches);
print($matches[1][0]);
Output:
http://www.example.com/feeds/posts/default
Note:
There is also a nice article on Daring Fireball called An Improved Liberal, Accurate Regex Pattern for Matching URLs that could be interesting for you.
/\{rssfeed\:url\(([^)]*)\)\}/
preg_match_all('/\{rssfeed\:url\(([^)]*)\)\}/', '{rssfeed:url(http://www.example.com/feeds/posts/default)}', $matches, PREG_PATTERN_ORDER);
print_r($matches[1]);
You should be able to get all the URLs in the content from $matches[1].
Note: this will only get URLs in the {rssfeed:url()} format, not all the URLs in the content.
You can try this here: http://www.spaweditor.com/scripts/regex/index.php
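Since the goal in the question is to insert the feed where the tag appears in the template, the same pattern can be combined with preg_replace_callback. In this sketch renderFeed() is a hypothetical stand-in for the asker's RSS parser; here it only returns a placeholder string.

```php
<?php
// Hypothetical stand-in for the real RSS parser: it would fetch and
// render the feed, but here it only returns a placeholder.
function renderFeed(string $url): string
{
    return '[feed from ' . $url . ']';
}

$template = '<div>{rssfeed:url(http://www.example.com/feeds/posts/default)}</div>';

// Each {rssfeed:url(...)} tag is replaced by whatever renderFeed() returns
// for the captured URL ($m[1]).
$output = preg_replace_callback(
    '/\{rssfeed:url\(([^)]*)\)\}/',
    function (array $m): string {
        return renderFeed($m[1]);
    },
    $template
);

echo $output, "\n";
```

Because [^)]* stops at the first closing parenthesis, a URL that itself contains ")" would be cut short; that limitation is inherited from the pattern in the answer above.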
