I was wondering if someone knows the best method to extract the base address from a longer link. Here's an example:
If I have links in the following format:
http://www.youtube.com/watch?v=35HBFeB4jYg OR
http://it.answers.yahoo.com/question/index?qid=20080520042405AApM2Rv OR
https://www.google.it/search?q=rap+tedesco&aq=f&oq=rap+tedesco&aqs=chrome.0.57j62l2.2287&sourceid=chrome&ie=UTF-8#hl=en&sclient=psy-ab&q=migliori+programatori&oq=migliori+programatori&gs_l=serp.3..0i19j0i13i30i19l3.9986.13880.0.14127.14.10.0.4.4.0.165.931.6j4.10.0...0.0...1c.1.7.psy-ab.tPmiWRyUVXA&pbx=1&bav=on.2,or.r_cp.r_qf.&fp=ffc0e9337f73a744&biw=1280&bih=699
How would I go about extracting only the web pages like so:
http://www.youtube.com
http://it.answers.yahoo.com
https://www.google.it
I was wondering whether there is a regular expression I could use with PHP to achieve this. Also, are regular expressions the way to go?
There is a PHP function for parsing URLs: parse_url
$url = 'http://it.answers.yahoo.com/question/index?qid=20080520042405AApM2Rv';
$p = parse_url($url);
echo $p["scheme"] . "://" . $p["host"];
Use the parse_url function.
$link = "https://www.google.it/search?q=rap+tedesco";
$parseUrl = parse_url($link);
$siteName = $parseUrl['scheme']."://". $parseUrl['host'];
Using Regexp.
preg_match('#http(s?)://([\w]+\.){1}([\w]+\.?)+#',$link,$matches);
echo $matches[0];
You just want the domain of the page. PHP has a function called parse_url that can help with this.
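The parse_url answers above can be wrapped into a small helper. This is only a sketch: the function name url_origin is made up for illustration, and it assumes that returning null is acceptable when the input has no scheme or host.

```php
<?php
// Hypothetical helper (not from the answers above): returns "scheme://host",
// or null when parse_url() cannot find both parts.
function url_origin(string $url): ?string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return null;
    }
    return $parts['scheme'] . '://' . $parts['host'];
}

echo url_origin('http://www.youtube.com/watch?v=35HBFeB4jYg'), "\n";
echo url_origin('http://it.answers.yahoo.com/question/index?qid=20080520042405AApM2Rv'), "\n";
```

Checking for missing keys matters because parse_url returns only the components it finds, so a relative link such as index.php has no 'host' element at all.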
I have a simple URL:
http://store.com/maison-entretien/aspirateur-nettoyeur/tous-les-aspirateurs-nettoyeurs.html?amp%3Butm_medium=push&%3Butm_source=application&p=2&price=12-260
I want to change the URL to be like this:
http://store.com/maison-entretien/aspirateur-nettoyeur/tous-les-aspirateurs-nettoyeurs.html?p=2&price=12-260
I started trying to solve it with preg_replace:
preg_replace('/(&|\?)utm(.+)=[^&]*&/', '$1', preg_replace('/(&|\?)utm(.+)=[^&]*$/', '', $url2))
but I didn't get the result that I wanted.
I would be grateful if someone could help me. Thank you.
I've written some code for you based on the following logic. First, I split the URL into the main URL and the query string using explode. Then I explode the query string again to get the parameters that are needed. Once I have all the required parts, I construct a new URL.
<?php
$url = "http://store.com/maison-entretien/aspirateur-nettoyeur/tous-les-aspirateurs-nettoyeurs.html?amp%3Butm_medium=push&%3Butm_source=application&p=2&price=12-260";
$url = explode("?",$url);
$query = explode("&",$url[1]);
$newUrl = $url[0]."?".$query[2]."&".$query[3];
print_r($newUrl);
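Picking parameters by position only works while the query string keeps exactly this order. A possible alternative, sketched here under the assumption that every unwanted parameter has "utm_" somewhere in its name and that the URL has no #fragment, is to parse the query and drop parameters by name. The function name strip_utm_params is invented for this example.

```php
<?php
// Hypothetical helper: removes every query parameter whose name contains "utm_".
// Assumes the URL carries no #fragment after the query string.
function strip_utm_params(string $url): string
{
    $pieces = explode('?', $url, 2);
    if (count($pieces) < 2) {
        return $url; // no query string, nothing to clean
    }
    parse_str($pieces[1], $params);
    foreach (array_keys($params) as $name) {
        // Matches names such as "amp;utm_medium" as well as plain "utm_source"
        if (strpos((string) $name, 'utm_') !== false) {
            unset($params[$name]);
        }
    }
    $query = http_build_query($params);
    return $query === '' ? $pieces[0] : $pieces[0] . '?' . $query;
}

echo strip_utm_params('http://store.com/maison-entretien/aspirateur-nettoyeur/tous-les-aspirateurs-nettoyeurs.html?amp%3Butm_medium=push&%3Butm_source=application&p=2&price=12-260'), "\n";
```

The substring test is used instead of a prefix test because the names in the example URL decode to "amp;utm_medium" and ";utm_source", so they do not start with "utm_".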
Hi, I have a form in WordPress where users can submit a link to a product, but very often the links come with unnecessary baggage, like tracking codes. I would like to create a filter in WordPress that cleans the links so they consist of just a working link. If possible, I would also like to confirm that the link still works, or use a method that guarantees it will still work.
The main things I want to get rid of in links are utm_source and its contents, utm_medium and its contents, etc. Everything but the clean working link.
So for example, a link like this:
https://www.serenaandlily.com/variationproduct?dwvar_m10055_size=Twin&dwvar_m10055_color=Chambray&pid=m10055&pdp=true&source=detail&utm_source=affiliate&utm_medium=affiliate&utm_campaign=pjdatafeed&publisherId=20648&clickId=2669312134#fo_c=745&fo_k=c0ebaf8359ca7853df8343e535533280&fo_s=pepperjam
Will end up like this:
https://www.serenaandlily.com/variationproduct?dwvar_m10055_size=Twin&dwvar_m10055_color=Chambray&pid=m10055
I'd really appreciate it if someone could point me in the right direction.
Thanks!
You can do what you want with explode, parse_str and http_build_query. This code uses an array of unwanted parameters to decide what to delete from the query string:
$unwanted_params = array('utm_source', 'utm_medium', 'utm_campaign', 'clickId', 'publisherId', 'source', 'pdp', 'details', 'fo_k', 'fo_s');
$url = 'https://www.serenaandlily.com/variationproduct?dwvar_m10055_size=Twin&dwvar_m10055_color=Chambray&pid=m10055&pdp=true&source=detail&utm_source=affiliate&utm_medium=affiliate&utm_campaign=pjdatafeed&publisherId=20648&clickId=2669312134#fo_c=745&fo_k=c0ebaf8359ca7853df8343e535533280&fo_s=pepperjam';
list($path, $query_string) = explode('?', $url, 2);
// parse the query string
parse_str($query_string, $params);
// delete unwanted parameters
foreach ($unwanted_params as $p) unset($params[$p]);
// rebuild the query
$query_string = http_build_query($params);
// reassemble the URL
$url = $path . '?' . $query_string;
echo $url;
Output:
https://www.serenaandlily.com/variationproduct?dwvar_m10055_size=Twin&dwvar_m10055_color=Chambray&pid=m10055
You can do this in PHP itself. The parse_url() function (https://secure.php.net/manual/en/function.parse-url.php) splits a URL into its components, including the query string. After parsing the query string into an array with parse_str(), you can filter out the unwanted parameters. Finally, use http_build_query() (https://secure.php.net/manual/en/function.http-build-query.php) to rebuild the query string for the cleaned URL :)
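The parse_url approach described above could look roughly like this. It is a sketch, not a complete URL rebuilder: the function name remove_query_params is hypothetical, the list of unwanted names is passed in by the caller, and the port, user info and #fragment are deliberately dropped.

```php
<?php
// Hypothetical helper: rebuilds a URL from scheme, host and path, keeping only
// the query parameters that are not listed in $unwanted.
// Port, user info and the #fragment are ignored in this sketch.
function remove_query_params(string $url, array $unwanted): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return $url; // not an absolute URL, leave it untouched
    }
    $params = [];
    if (isset($parts['query'])) {
        parse_str($parts['query'], $params);
    }
    // Drop every parameter whose name appears in the unwanted list
    $params = array_diff_key($params, array_flip($unwanted));
    $clean = $parts['scheme'] . '://' . $parts['host'] . ($parts['path'] ?? '');
    $query = http_build_query($params);
    return $query === '' ? $clean : $clean . '?' . $query;
}

$unwanted = ['utm_source', 'utm_medium', 'utm_campaign', 'clickId', 'publisherId', 'source', 'pdp'];
echo remove_query_params('https://www.serenaandlily.com/variationproduct?pid=m10055&pdp=true&utm_source=affiliate#fo_c=745', $unwanted), "\n";
```

Because parse_url separates the fragment from the query, parameters after the # never reach parse_str here, so they do not need to appear in the unwanted list.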
I have seen posts about using parse_url to get www.domain.tld, but how can I get just "domain" using PHP?
I currently have this regex:
$pattern = '#https?://[a-z]{1,}\.{0,}([a-z]{1,})\.com(\.[a-z]{1,}){0,}#';
But this only works with .com, and I need it to work with all TLDs (.co.uk, .com, .tv, etc.).
Is there a reliable way to do this? I am not sure whether regex is the best way to go. Maybe I could explode on ".", but then subdomains would mess it up.
EDIT
so the desired outcome would be
$url = "https://stackoverflow.com/questions/11952907/get-domain-without-tld-using-php#comment15926320_11952907";
$output = "stackoverflow";
After doing more research: would anyone advise using parse_url to get www.domain.tld and then using explode to get domain?
Try this regex:
#^https?://(www\.)?([^/]*?)(\.co)?\.[^.]+?/#
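You could use the parse_url function (see the parse_url page in the PHP manual).
Something like:
$url = 'http://username:password@hostname/path?arg=value#anchor';
print_r(parse_url($url));
And then, for a URL whose host is dotted (such as www.domain.tld), you can take the 'host' element of the returned array and do:
$parts = parse_url($url);
$arr = explode('.', $parts['host']);
echo $arr[count($arr) - 2]; // second-to-last label; gives "co" for a host like domain.co.uk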
I think you don't need regex.
function getDomain($url){
    $things_like_WWW_at_the_start = array('www');
    $urlContents = parse_url($url);
    $domain = explode('.', $urlContents['host']);
    if (!in_array($domain[0], $things_like_WWW_at_the_start))
        return $domain[0];
    else
        return $domain[1];
}
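None of the answers above handles two-part suffixes such as .co.uk together with arbitrary subdomains. One way to sketch that is to check the host against a list of multi-part suffixes before picking a label. The function name domain_label and the three-entry suffix list below are assumptions made for illustration; a real solution would need a complete list such as the Public Suffix List.

```php
<?php
// Hypothetical helper: returns the label directly in front of the public suffix.
// The default suffix list is a tiny illustrative sample, not a complete list.
function domain_label(string $url, array $multiPartSuffixes = ['co.uk', 'com.au', 'co.jp']): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null;
    }
    $host = strtolower($host);
    $labels = explode('.', $host);
    $suffixLength = 1; // by default the suffix is a single label such as "com" or "tv"
    foreach ($multiPartSuffixes as $suffix) {
        $needle = '.' . $suffix;
        if (substr($host, -strlen($needle)) === $needle) {
            $suffixLength = substr_count($suffix, '.') + 1;
            break;
        }
    }
    $index = count($labels) - $suffixLength - 1;
    return $index >= 0 ? $labels[$index] : null;
}

echo domain_label('https://stackoverflow.com/questions/11952907/get-domain-without-tld-using-php'), "\n";
echo domain_label('http://www.example.co.uk/page'), "\n";
```

Counting from the right end of the host means subdomains on the left do not affect the result, which is the case the explode-based answers struggle with.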
I have a url that looks something like this:
zigzagstudio/#!/page_wedding2
and I need to take the part of the URL after the #. In fact, I need to get page_wedding2 so that I can take the number and compare it with an id from my database. Is this possible using PHP? Does anyone have a code example? I also searched for a solution using JavaScript, but I don't know how to send the value to PHP from JavaScript.
$url = "zigzagstudio/#!/page_wedding2";
$pattern = '~[#!/]+(.*)~'; // capture everything after the first run of "#", "!" or "/"
preg_match($pattern, $url, $string);
$name = $string[1];
echo $name; // prints 'page_wedding2'
$url = 'zigzagstudio/#!/page_wedding2';
echo parse_url($url, PHP_URL_FRAGMENT);
See parse_url() docs.
You'll need to read it in JavaScript and then pass it to your PHP.
Read it in JavaScript like this:
var query = location.href.split('#');
var anchorPart = query[1]; // query[0] is the part before the '#'
Once you have anchorPart and parsed relevant information from it, pass it to your PHP - there may be different ways of doing this, depending on your web application.
You could make an AJAX request to page_wedding2.php and pass it any parsed information in the querystring. Use the returned HTML string in your own web page.
Edit: To clarify, the browser doesn't pass the anchor part of the URL to the server side.
You can use explode in PHP and split in JavaScript.
PHP:
$str = 'zigzagstudio/#!/page_wedding2';
$array = explode('#', $str);
echo array_pop($array);
JavaScript:
var str = 'zigzagstudio/#!/page_wedding2';
var str_array = str.split('#');
alert(str_array.pop());
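Once the fragment string is available in PHP (for example because JavaScript sent it along as a request parameter, which is an assumption here), the number at the end can be pulled out with a short regular expression. The function name trailing_number is made up for this sketch.

```php
<?php
// Hypothetical helper: returns the digits at the end of the string as an int,
// or null when the string does not end in a number.
function trailing_number(string $fragment): ?int
{
    if (preg_match('/(\d+)$/', $fragment, $m) === 1) {
        return (int) $m[1];
    }
    return null;
}

var_dump(trailing_number('!/page_wedding2'));
```

The returned integer can then be compared with the id from the database; a null result signals that the fragment carried no number.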
I have some YouTube URLs stored in a database that I need to rewrite.
They are stored in this format:
http://youtu.be/IkZuQ-aTIs0
I need to have them re-written to look like this:
http://youtube.com/v/IkZuQ-aTIs0
These values are stored as a variable $VideoType
I'm calling the variable like this:
<?php if ($video['VideoType']){
echo "<a rel=\"shadowbox;width=700;height=400;player=swf\" href=\"" . $video['VideoType'] . "\">View Video</a>";
}?>
How do I rewrite them?
Thank you for the help.
You want to use the preg_replace function. Something like:
$oldurl = 'youtu.be/blah';
$pattern = '/youtu\.be/'; // escape the dot so it matches a literal "."
$replacement = 'youtube.com/v';
$newurl = preg_replace($pattern, $replacement, $oldurl);
You can use a regular expression for this, but if you have ONLY YouTube URLs stored in your database, it is sufficient to take the part after the last slash ('IkZuQ-aTIs0') and place it in the href attribute after 'http://www.youtube.com/v/'.
For this simple solution, do something like this:
<?php
if ($video['VideoType']) {
    // Position of the last slash; the video ID starts one character after it
    $last_slash_position = strrpos($video['VideoType'], "/");
    $youtube_url_code = substr($video['VideoType'], $last_slash_position + 1);
    echo "<a rel=\"shadowbox;width=700;height=400;player=swf\" href=\"http://www.youtube.com/v/" . $youtube_url_code . "\">View Video</a>";
}
?>
I cannot test it at the moment, so you may need to experiment with the position of the last slash occurrence. You can also have a look at the function definitions:
http://www.php.net/manual/en/function.substr.php
http://www.php.net/manual/en/function.strrpos.php
However, be aware of the performance. Build a script which parses your database and converts every URL, or store both a short and a long URL in each entry, because running regular expressions in the view on every page load is wasteful.
UPDATE: it would be even better to store ONLY the youtube video identifier / url code in the database for every entry, so in the example's case it would be IkZuQ-aTIs0.
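For the one-off conversion script suggested above, a sketch along these lines might be enough. The function name youtube_long_url is hypothetical, and the code assumes the stored values look like the http://youtu.be/ID example from the question; anything else is returned unchanged.

```php
<?php
// Hypothetical helper: turns "http://youtu.be/ID" into "http://youtube.com/v/ID".
// URLs on any other host, or without an ID in the path, are returned as they are.
function youtube_long_url(string $url): string
{
    $parts = parse_url($url);
    $host = isset($parts['host']) ? strtolower($parts['host']) : '';
    $id = isset($parts['path']) ? trim($parts['path'], '/') : '';
    if ($host !== 'youtu.be' || $id === '') {
        return $url;
    }
    return 'http://youtube.com/v/' . $id;
}

echo youtube_long_url('http://youtu.be/IkZuQ-aTIs0'), "\n";
```

The same function also yields the bare video identifier as an intermediate value ($id), which is what the UPDATE above recommends storing.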