RegEx to grab url out of another url - php

String to pull from : http:\/\/c.ypcdn.com\/2\/c\/rtd?vrid=357c99c36bd7ed631eda2e43fc9e30f8&rid=283d465f-f63b-4b0d-90b0-be6c12ed7617&ptid=943aw4l8qj&ypid=11720135&lid=194823099&tl=6&lsrc=SP&dest=http%3A%2F%2FCleanation.com
RegEx I have used before: www\..*?\.\w{2,5}
However, the above RegEx only grabs the URL if it has a "www." in it. If I take the "www." out of the RegEx, it just grabs c.ypcdn.com. I want to grab the Cleanation.com at the end of the string.
It needs to be dynamic, so it can grab any URL without a "www." out of that string.

Why not use parse_url() and then parse_str() on the returned query string to get it?
Example:
$url= "http://c.ypcdn.com/2/c/rtd?vrid=357c99c36bd7ed631eda2e43fc9e30f8&rid=283d465f-f63b-4b0d-90b0-be6c12ed7617&ptid=943aw4l8qj&ypid=11720135&lid=194823099&tl=6&lsrc=SP&dest=http%3A%2F%2FCleanation.com";
$query = parse_url($url, PHP_URL_QUERY);
parse_str($query,$params);
echo $params['dest'];

If this is always the dest parameter, you can grab it with something like:
"dest=https?%3A%2F%2F([^?&]+)"
If it's always the last parameter, you can grab it with:
"dest=https?%3A%2F%2F(.+)$"

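A minimal sketch of the regex approach in PHP, assuming the destination always arrives in a dest parameter and using a greedy capture group ([^?&]+). The URL below is a shortened, hypothetical version of the one in the question, and the variable names are illustrative only.

```php
<?php
// Shortened, hypothetical version of the question's URL (same dest parameter).
$url = 'http://c.ypcdn.com/2/c/rtd?tl=6&lsrc=SP&dest=http%3A%2F%2FCleanation.com';

// Capture everything after the encoded "http://" or "https://"
// up to the next "&" or "?" (or the end of the string).
$host = null;
if (preg_match('/dest=https?%3A%2F%2F([^?&]+)/i', $url, $m)) {
    // The captured value is still percent-encoded, so decode it.
    $host = urldecode($m[1]);
}

echo $host; // Cleanation.com
```

If the parameter name can vary, the parse_url() and parse_str() route is likely the more robust choice, since it does not depend on how the value happens to be encoded.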
Related

Regex to filter int from url

I'm trying to filter out a value of a url.
The url looks like the following:
http://userimages-akm.imvu.com/catalog/includes/modules/phpbb2/images/avatars/145870556_47076915459092eafd7b69.jpg
Now I'm trying to get only the following part of the URL: 145870556
I thought about using a regex, but I couldn't get one working apart from this:
^((http[s]?|ftp):\/)?\/?([^:\/\s]+)((\/\w+)*\/)([\w\-\.]+[^#?\s]+)(.*)?(#[\w\-]+)?$
Is there a better regex to use?
If the image filename always follows the same format <timestamp>_<hex-value>.<extension>, then you don't need to match the entire URL.
$url = 'http://userimages-akm.imvu.com/catalog/includes/modules/phpbb2/images/avatars/145870556_47076915459092eafd7b69.jpg';
// preg_match() is enough here because only one match is needed.
preg_match('~\/(\d+)_.*$~', $url, $matches);
// $matches[1] is '145870556'
https://regex101.com/r/vsDnoj/2

Cannot get the full value of a URL passed in the query string

I want to send a URL to another page like this:
http://localhost/l.php?u=http://www.simplesite.com?view=photo&id=13
Where the URL http://www.simplesite.com?view=photo&id=13 is the value of the parameter u.
In the l.php file my result looks like this:
echo $_GET['u']; // http://www.simplesite.com?view=photo
// &id=13 is missing
What is wrong with this? I want to redirect to the URL http://www.simplesite.com?view=photo&id=13, but the &id=13 part is missing.
I create the links with preg_replace, so I cannot apply a PHP function to the $1 variable:
preg_replace("/(https?:\/\/[\w-?#&;~=\.\/\#]+[\w\/])/i","<a target=\"_blank\"
href=\"l.php?u=$1\">$1</a>",$text);
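Because you have an ampersand in your URL. An ampersand starts a new parameter in the outer query string, so everything after it is no longer part of u. Apply urlencode to the URL before putting it in the link. PHP already decodes the values in $_GET, so urldecode is only needed when you decode such a string yourself.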
E.g.: urlencode('http://localhost/l.php?u=http://www.simplesite.com?view=photo&id=13');
output: http%3A%2F%2Flocalhost%2Fl.php%3Fu%3Dhttp%3A%2F%2Fwww.simplesite.com%3Fview%3Dphoto%26id%3D13
To decode it again:
urldecode('http%3A%2F%2Flocalhost%2Fl.php%3Fu%3Dhttp%3A%2F%2Fwww.simplesite.com%3Fview%3Dphoto%26id%3D13');
output: http://localhost/l.php?u=http://www.simplesite.com?view=photo&id=13
Try this code:
// Assign the result: preg_replace_callback() returns the new string.
$text = preg_replace_callback("/(https?:\/\/[\w-?#&;~=\.\/\#]+[\w\/])/i", function ($m) {
    return "<a target='_blank' href='l.php?u=" . urlencode($m[0]) . "'>" . $m[0] . "</a>";
}, $text);
This replaces every link with an anchor tag and URL-encodes the target as well.
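To see why encoding only the parameter value fixes the problem, here is a small round-trip sketch. It assumes the l.php link from the question and simulates how PHP fills $_GET by running parse_str() on the query string. The variable names are illustrative.

```php
<?php
// The URL that should arrive intact as the value of "u".
$target = 'http://www.simplesite.com?view=photo&id=13';

// Encode only the value, not the whole link.
$link = 'l.php?u=' . urlencode($target);
echo $link, "\n"; // l.php?u=http%3A%2F%2Fwww.simplesite.com%3Fview%3Dphoto%26id%3D13

// Simulate what PHP does when it populates $_GET for l.php.
$query = parse_url('http://localhost/' . $link, PHP_URL_QUERY);
parse_str($query, $get);

echo $get['u'], "\n"; // http://www.simplesite.com?view=photo&id=13
```

Because the ampersand is sent as %26, it no longer splits the query string, and u is the only parameter that l.php receives.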

URL with query string validation using PHP

I need a PHP validation function for URLs with a query string (parameters separated by &). Currently I have the following function for validating URLs:
$pattern = '/^(([\w]+:)?\/\/)?(([\d\w]|%[a-fA-f\d]{2,2})+(:([\d\w]|%[a-fA-f\d]{2,2})+)?#)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?(\/([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)*(\?(&?([-+_~.\d\w]|%[a-fA-f\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)?$/';
echo preg_match($pattern, $url);
This function correctly validates input like
google.com
www.google.com
http://google.com
http://www.google.com ...etc
But it fails to validate a URL that comes with parameters (a query string), e.g.
http://google.com/index.html?prod=gmail&act=inbox
I need a function that accepts both types of URL inputs. Please help. Thanks in advance.
A simple filter_var
if(filter_var($yoururl, FILTER_VALIDATE_URL))
{
echo 'Ok';
}
might do the trick, although there are problems with URLs that lack a scheme:
http://codepad.org/1HAdufMG
You can work around the issue by placing http:// in front of URLs that lack it.
As suggested by @DaveRandom, you could do something like:
$parsed = parse_url($url);
if (!isset($parsed['scheme'])) $url = "http://$url";
before feeding the filter_var() function.
Overall, it's still a simpler solution than an overly complicated regex.
It also has these flags available (both are used with FILTER_VALIDATE_URL):
FILTER_FLAG_PATH_REQUIRED: requires the URL to contain a path part.
FILTER_FLAG_QUERY_REQUIRED: requires the URL to contain a query string.
http://php.net/manual/en/function.parse-url.php
This may not be 100% bulletproof, but it is worth trying as a start.
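Putting the pieces of this answer together, a sketch might look like the following. The function name looksLikeUrl and the default http:// scheme are assumptions made for illustration. Only parse_url(), filter_var() and the FILTER_* constants come from PHP itself.

```php
<?php
// Hypothetical helper; the name is not part of PHP or of the answer above.
function looksLikeUrl(string $url, int $flags = 0): bool
{
    // Prepend a default scheme when parse_url() finds none,
    // because FILTER_VALIDATE_URL rejects scheme-less input.
    if (!parse_url($url, PHP_URL_SCHEME)) {
        $url = 'http://' . $url;
    }

    // filter_var() returns the URL on success and false on failure.
    return filter_var($url, FILTER_VALIDATE_URL, $flags) !== false;
}

var_dump(looksLikeUrl('google.com'));                                        // bool(true)
var_dump(looksLikeUrl('http://google.com/index.html?prod=gmail&act=inbox')); // bool(true)
var_dump(looksLikeUrl('http://google.com', FILTER_FLAG_QUERY_REQUIRED));     // bool(false)
```

Passing FILTER_FLAG_QUERY_REQUIRED is optional. Without it, URLs with and without a query string are both accepted, which is what the question asks for.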

get a specific part of a string

I have the following url. http://domain.com/userfiles/dynamic/images/whatever_dollar_1318105152.png
Everything in the URL can change except the userfiles part and the last underscore. Basically, I want to get the part of the URL which is userfiles/dynamic/images/whatever_dollar_. What is a good way to do this? I'm open to either JavaScript or PHP.
Use parse_url in PHP to split a URL into its various parts. Get the path part that is returned. It contains the path without the domain and the query string.
After that, use strrpos to find the last occurrence of the _ within the path.
With substr you can copy the first part of the path (up to the found _) and you're done.
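The three steps above can be sketched as follows. This assumes the URL from the question, keeps the trailing underscore as the asker requested, and trims the leading slash that parse_url() leaves on the path. The variable names are illustrative.

```php
<?php
$url = 'http://domain.com/userfiles/dynamic/images/whatever_dollar_1318105152.png';

// Step 1: the path without scheme, host, or query string.
$path = parse_url($url, PHP_URL_PATH);

// Step 2: position of the last underscore in the path.
$pos = strrpos($path, '_');

// Step 3: copy everything up to and including that underscore,
// then drop the leading slash. $prefix stays null if there is no underscore.
$prefix = ($pos === false) ? null : ltrim(substr($path, 0, $pos + 1), '/');

echo $prefix; // userfiles/dynamic/images/whatever_dollar_
```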
You could, with JavaScript, try:
var string = "http://domain.com/userfiles/dynamic/images/whatever_dollar_1318105152.png";
var newString = string.substring(string.indexOf('userfiles'),string.lastIndexOf('_'));
alert(newString); // returns: "userfiles/dynamic/images/whatever_dollar" (Without quotes).
Assuming your string is stored in $s, simply:
echo preg_replace('/.*(userfiles.*_).*/', '$1', $s);

regex to get $_GET variables

I have a URL string and would like to extract parts of the URL. I have been trying to understand how to do it with regex, but no luck.
http://www.example.com?id=example.id&v=other.variable
From the example above I would like to extract the id value, i.e. example.id
I'm assuming you're not referring to actual $_GET variables, but to a string containing a URL with a query string.
PHP has built-in functions to process those:
parse_url() to extract the query string from a URL
parse_str() to split the query string into its components
No need for a regexp here; just use PHP's built-in functions parse_url and parse_str:
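$url = 'http://www.example.com?id=example.id&v=other.variable';
// Extract the query string, then split it into an associative array.
parse_str(parse_url($url, PHP_URL_QUERY), $vars);
echo $vars['id']; // example.id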
