preg_match and images URL - php

I have a small problem with the preg_match function in PHP, and I suspect I will never learn how to use it. I want to extract the URL of an image from HTML without the name of the image. For example, if I have a link to an image such as:
"/data/images/2013-10-03/someimage.jpg"
or
"http://something.com//data/images/2013-10-03/someimage.jpg"
How can I use the preg_match function to remove everything to the left of the last forward slash, so that I get only the image name from the URL?
Maybe it's smarter to use a different function, but I don't know which one.
P.S. Can you recommend a good tutorial for the preg_match function?
Maybe I forgot to say: I don't know how long the image name is or what exactly it is. I need a function that extracts only what is to the right of the last forward slash.

$pattern = '/[\w\-]+\.(jpg|png|gif|jpeg)/';
$subject = 'http://something.com//data/images/2013-10-03/someimage.png';
// preg_match() returns 1 on a match and 0 when nothing matches
if (preg_match($pattern, $subject, $matches)) {
    echo $matches[0]; // someimage.png
}

No need for regex or anything fancy:
$var = "http://something.com/data/images/2013-10-03/someimage.jpg";
$image = basename($var);
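To illustrate the basename() approach on both URLs from the question, here is a minimal sketch. The helper name imageNameFromUrl is made up for this example, and the parse_url() step is an addition that assumes a URL might carry a query string, which the question does not mention.

```php
<?php
// Hypothetical helper (not from the original answer): returns the file name
// part of an image URL or path.
function imageNameFromUrl(string $url): string
{
    // parse_url() drops any query string or fragment; fall back to the raw
    // input if no path component can be extracted.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        $path = $url;
    }
    return basename($path);
}

$relative  = imageNameFromUrl('/data/images/2013-10-03/someimage.jpg');
$absolute  = imageNameFromUrl('http://something.com//data/images/2013-10-03/someimage.jpg');
// The query string below is an assumed variation, not part of the question.
$withQuery = imageNameFromUrl('http://something.com/data/images/2013-10-03/someimage.jpg?size=large');
```

All three calls should yield someimage.jpg. Plain basename() alone would keep the ?size=large suffix in the last case, which is why the path is extracted first.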

You need to use preg_replace(). You can also experiment with regular expressions online, which is a fast way to learn regex: http://preg_replace.onlinephpfunctions.com/
For example, replace /\/someimage\.jpg/ with '' (an empty string).
This returns http://something.com//data/images/2013-10-03 from http://something.com//data/images/2013-10-03/someimage.jpg.
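Since the question says the image name is not known in advance, the same preg_replace() idea can be sketched with a generic pattern rather than a hard-coded file name. The pattern below is an assumption of this example, not something given in the answer.

```php
<?php
// Remove the last "/" and everything after it.
// "#" is used as the delimiter so the slash needs no escaping.
$url = 'http://something.com//data/images/2013-10-03/someimage.jpg';
$folder = preg_replace('#/[^/]*$#', '', $url);
```

Here [^/]*$ matches whatever follows the final slash, so the file name never has to be spelled out.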

You can use Simple HTML DOM Parser to get the href of the a tags.
For example:
foreach ($html->find('a[class="your class"]') as $var) {
    echo $var->href;
}
Hope this helps!

Related

PHP remove page name Regex - preg_replace

I have these URLs (and several similar ones):
images/image1/image1.jpg
images/images1/images2/image2.jpg
images/images2/images3/images4/image4.jpg
I have this regex, but I want it to strip the image name from the string:
<?php $imageurlfolder = $pagename1;
$imageurlfolder = preg_replace('/[A-Za-z0-9]+.asp/', '', $pagename1);?>
The resulting string should look like the URLs above, e.g. images/images2/images3/images4/, but without image4.jpg.
Hope you can help.
Thanks
For this particular purpose, the dirname() function would be sufficient:
<?php echo dirname('images/images2/images3/images4/image4.jpg'); ?>
Would return:
images/images2/images3/images4
I think you can use the dirname function.
For instance (from that page),
dirname("/etc/passwd")
would return
/etc
A quite straightforward way to do it:
preg_replace("#(?<=/)[^/]+$#","",$your_string);
It will remove everything between the last / and the end of the string.
Edit: as many people pointed out, you can also use dirname, which might prove faster.
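The two approaches above are not quite interchangeable, which a small side-by-side sketch makes visible. The variable names are chosen for this example only.

```php
<?php
$path = 'images/images2/images3/images4/image4.jpg';

// dirname() returns the parent directory without a trailing slash.
$viaDirname = dirname($path);

// The lookbehind (?<=/) requires a preceding slash but does not consume it,
// so the trailing slash stays in the result.
$viaRegex = preg_replace('#(?<=/)[^/]+$#', '', $path);
```

$viaDirname holds images/images2/images3/images4 while $viaRegex holds images/images2/images3/images4/, so the choice depends on whether the trailing slash is wanted.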

html pattern to parse in php

I have the following pattern inside an HTML file that I would like to parse in PHP to get a link. So far I don't see a solution, because I am trying to use QueryPath and my case is simply not a common DOM element:
<script>
to.addVariable("site_name","http://www.sitename.com");
</script>
I would just like to return the link part of that pattern in order to print it.
Hope someone can recommend how to do this.
Thank you.
UPDATE: I would like to get http://www.sitename.com as a value from the code above using php, maybe with phpQuery or QueryPath.
Something like this should work, I guess:
<?PHP
$text = '
<script>
to.addVariable("site_name","http://www.sitename.com");
</script>
';
preg_match('#to\.addVariable\("site_name","([^"]+)"\);#', $text, $matches);
echo $matches[1];
?>
You can also use preg_match_all if you have more than one to.addVariable(... strings in your <script> section.
Try this regular expression:
$regex = '#to\.addVariable\("(.+?)",\s*"(.+?)"\)#';
Then use preg_match_all to get the matches. If you want to check that the URL is an actual URL, then get any regular expression that matches URLs and put it in place of the second .+?. These patterns will match anything between the double quotes, so you should check that you have what you need unless you trust the source.
NOTE: I'm not sure whether " needs to be escaped in a regex, so check that.
Hope I can help!
If you don't understand something drop a comment!
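As a rough sketch of the preg_match_all suggestion, the snippet below collects every to.addVariable(...) pair into an array. The second addVariable() call and its values are invented for this example, and the pattern uses [^"]+ with an optional space after the comma, which is an assumption about how the script might be formatted.

```php
<?php
// Sample input; the "site_title" line is made up for this sketch.
$text = '
<script>
to.addVariable("site_name","http://www.sitename.com");
to.addVariable("site_title", "Example title");
</script>
';

// [^"]+ stops at the closing quote; \s* tolerates a space after the comma.
preg_match_all(
    '#to\.addVariable\("([^"]+)",\s*"([^"]+)"\)#',
    $text,
    $matches,
    PREG_SET_ORDER
);

$variables = [];
foreach ($matches as $match) {
    // $match[1] is the variable name, $match[2] is its value.
    $variables[$match[1]] = $match[2];
}
```

With PREG_SET_ORDER each element of $matches is one complete match, which keeps the name and value of a call together.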

How can I use a PHP regex to transform the contents of certain HTML tag attributes?

I think I am right in assuming that a regex can do this job; I'm just not sure how I would do it!
Basically, I have a number of links on my website that are in the following format:
Example
I need some code that will transform the href value so that it is output in lowercase without affecting the anchor text. E.g.:
Example
Is this possible? And if so, what would be the code to do this?
You can use preg_replace_callback, something like this:
function replace($matches) {
    return strtolower($matches[0]);
}
...
$str = preg_replace_callback('/(href="[^"]*")/i', 'replace', $str);
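Using the preg_match and strtolower functions:
preg_match('/\<a(.*)\>(.*)\<\/a\>/i', $cadena, $a);
$a[1] = strtolower($a[1]); // lowercase the tag's attributes, including href
// Rebuild the tag around the lowercased attributes; $2 keeps the anchor text
$cadena = preg_replace('/\<a(.*)\>(.*)\<\/a\>/i', '<a' . $a[1] . '>$2</a>', $cadena);
echo $cadena;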
Regards!
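For a variant that lowercases only the href value and leaves the attribute name, other attributes and the anchor text alone, here is a minimal sketch using preg_replace_callback with a closure. The sample markup is a placeholder, since the original example link did not survive in the question.

```php
<?php
// Placeholder markup; the URL and anchor text are assumptions.
$html = '<a href="/Some/PATH.html">Example Link</a>';

$result = preg_replace_callback(
    '/(href=")([^"]*)(")/i',
    function (array $m): string {
        // $m[1] and $m[3] are the surrounding href=" and ", kept as they are;
        // only the value in $m[2] is lowercased.
        return $m[1] . strtolower($m[2]) . $m[3];
    },
    $html
);
```

This assumes double-quoted attributes. For markup that is not under your control, a DOM parser is generally a safer choice than a regex.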

extract YouTube URL from random text in PHP

I'm trying to extract a YouTube link from arbitrary text, e.g.
This is some random text and url is http://www.youtube.com/watch?v=-d3RYW0YoEk&feature=channel and I want to pull this URL out of this text in PHP. Can't seem to figure it out. Found a solution in another language but don't know how to convert it.
Thanks for the help.
You can use preg_match_all to grab all such URLs:
if(preg_match_all('~(http://www\.youtube\.com/watch\?v=[%&=#\w-]*)~',$input,$m)){
// matches found in $m
}
You could try using a regex:
http://php.net/manual/en/function.preg-match.php
Use preg_match.
The pattern should be something like:
/(http\:\/\/www\.youtube\.com\/watch\?v=[\w-]{11})/
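Applied to the text from the question, the preg_match suggestion could look like the sketch below. It assumes that a video ID is 11 characters drawn from letters, digits, underscore and hyphen, which fits the example ID but is not stated anywhere in the question.

```php
<?php
$text = 'This is some random text and url is '
      . 'http://www.youtube.com/watch?v=-d3RYW0YoEk&feature=channel '
      . 'and I want to pull this URL out of this text in PHP.';

$videoUrl = null;
// "~" as delimiter avoids escaping the slashes; [\w-] also allows the
// leading hyphen in the example ID, which \w alone would not match.
if (preg_match('~https?://www\.youtube\.com/watch\?v=[\w-]{11}~', $text, $m)) {
    $videoUrl = $m[0];
}
```

This keeps only the part up to the video ID and drops &feature=channel. The broader character class from the preg_match_all answer can be used instead if the extra parameters should be kept.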

regex to get current page or directory name?

I am trying to get the page or last directory name from a URL.
For example, if the URL is http://www.example.com/dir/ I want it to return dir, and if the URL is http://www.example.com/page.php I want it to return page. Notice that I do not want the trailing slash or the file extension.
I tried this:
$regex = "/.*\.(com|gov|org|net|mil|edu)/([a-z_\-]+).*/i";
$name = strtolower(preg_replace($regex,"$2",$url));
I ran this regex in PHP and it returned nothing (however, I tested the same regex in ActionScript and it worked!).
So what am I doing wrong here, and how do I get what I want?
Thanks!!!
Don't use / as the regex delimiter if it also contains slashes. Try this:
$regex = "#^.*\.(com|gov|org|net|mil|edu)/([a-z_\-]+).*$#i";
You could try to escape the "/" in the middle; unescaped, it simply closes your regex. So this may work:
$regex = "/.*\.(com|gov|org|net|mil|edu)\/([a-z_\-]+).*/i";
You may also make the regex somewhat more general, but that's another problem.
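As a quick check of the delimiter fix, the sketch below runs the "#"-delimited pattern against both URLs from the question. Single-quoted strings are used here so that PHP does not try to interpret $ or backslashes; that choice is this example's, not the answers'.

```php
<?php
$regex = '#^.*\.(com|gov|org|net|mil|edu)/([a-z_\-]+).*$#i';

$names = [];
foreach (['http://www.example.com/dir/', 'http://www.example.com/page.php'] as $url) {
    // '$2' refers to the second capture group: the first path segment.
    $names[$url] = strtolower(preg_replace($regex, '$2', $url));
}
```

Both URLs reduce to the first path segment after the domain (dir and page), so the original pattern was fine apart from its delimiter.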
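You can use this:
$parts = explode('/', rtrim($url, '/')); // rtrim() handles a trailing slash as in /dir/
$last = array_pop($parts); // array_pop() needs a variable, not a function result
Then apply a simple regex to remove any file extension.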
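Put together, the explode-then-strip-extension idea might look like the following sketch. The function name and the extension pattern are assumptions of this example.

```php
<?php
// Hypothetical helper: last path segment of a URL without its extension.
function lastSegmentWithoutExtension(string $url): string
{
    // Trim a trailing slash so that ".../dir/" yields "dir", not "".
    $parts = explode('/', rtrim($url, '/'));
    $last = array_pop($parts);

    // Drop a final ".something" suffix if there is one.
    return preg_replace('/\.[^.]+$/', '', $last);
}

$dir  = lastSegmentWithoutExtension('http://www.example.com/dir/');
$page = lastSegmentWithoutExtension('http://www.example.com/page.php');
```

Note that this works on the raw string, so a URL with no path at all (just the domain) would return part of the host name instead.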
Assuming you want to match the entire address after the domain portion:
$regex = "%://[^/]+/([^?#]+)%i";
The above assumes a URL of the format extension://domainpart/everythingelse.
Then again, it seems the problem here isn't that your regex isn't powerful enough, just that it was mistyped (the closing delimiter appears in the middle of the pattern). I'll leave this up for posterity, but I strongly recommend you check out PHP's parse_url() function.
This should adequately deliver:
substr($s = basename($_SERVER['REQUEST_URI']), 0, strrpos($s,'.') ?: strlen($s))
But this is better:
preg_replace('/[#\.\?].*/','',basename($path));
Your example is short, though, so I cannot tell whether you want to preserve the entire path or just the last element of it. The preceding example preserves only the last piece, but this should keep the whole path while being generic enough to work with just about anything that can be thrown at you:
preg_replace('~(?:/$|[#\.\?].*)~','',substr(parse_url($path, PHP_URL_PATH),1));
As much as I personally love using regular expressions, more 'crude' (for want of a better word) string functions might be a good alternative for you. The snippet below uses sscanf to parse the path part of the URL for the first bunch of letters.
$url = "http://www.example.com/page.php";
$path = parse_url($url, PHP_URL_PATH);
sscanf($path, '/%[a-z]', $part);
// $part = "page";
This expression:
(?<=^[^:]+://[^.]+(?:\.[^.]+)*/)[^/]*(?=\.[^.]+$|/$)
Gives the following results:
http://www.example.com/dir/ dir
http://www.example.com/foo/dir/ dir
http://www.example.com/page.php page
http://www.example.com/foo/page.php page
Apologies in advance if this is not valid PHP regex - I tested it using RegexBuddy.
Save yourself the regular expression and make PHP's other functions feel more loved.
$url = "http://www.example.com/page.php";
$filename = pathinfo(parse_url($url, PHP_URL_PATH), PATHINFO_FILENAME);
Warning: PATHINFO_FILENAME requires PHP 5.2 or later.
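Here is a small sketch of the parse_url() plus pathinfo() combination, run against both URLs from the question. The wrapper function pageOrDirName is hypothetical and only exists to make the example reusable.

```php
<?php
// Hypothetical wrapper around the parse_url() + pathinfo() combination.
function pageOrDirName(string $url): string
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        // No path component (or an unparsable URL): nothing to return.
        return '';
    }
    // PATHINFO_FILENAME is the last path element without its extension.
    return pathinfo($path, PATHINFO_FILENAME);
}

$dir  = pageOrDirName('http://www.example.com/dir/');
$page = pageOrDirName('http://www.example.com/page.php');
```

Because parse_url() isolates the path first, a query string or fragment on the URL does not leak into the result.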
