PHP regular expression, remove specific attributes - php

I have a simple URL:
http://store.com/maison-entretien/aspirateur-nettoyeur/tous-les-aspirateurs-nettoyeurs.html?amp%3Butm_medium=push&%3Butm_source=application&p=2&price=12-260
I want to change the URL to be like this:
http://store.com/maison-entretien/aspirateur-nettoyeur/tous-les-aspirateurs-nettoyeurs.html?p=2&price=12-260
I start resolving it with preg_replace:
preg_replace('/(&|\?)utm(.+)=[^&]*&/', '$1', preg_replace('/(&|\?)utm(.+)=[^&]*$/', '', $url2))
but I didn't get the result that I wanted.
I would be happy if someone could help me. Thank you.

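Here is some code based on my approach. First, I split the URL into the base URL and the query string using explode. Then I explode the query string again to get the individual parameters. Once I have the parts I need, I construct a new URL. Note that this relies on the wanted parameters being the third and fourth entries in the query string.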
<?php
$url = "http://store.com/maison-entretien/aspirateur-nettoyeur/tous-les-aspirateurs-nettoyeurs.html?amp%3Butm_medium=push&%3Butm_source=application&p=2&price=12-260";
$url = explode("?",$url);
$query = explode("&",$url[1]);
$newUrl = $url[0]."?".$query[2]."&".$query[3];
print_r($newUrl);
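If the position of the parameters is not fixed, the same idea can be written without hard-coded indexes. The following is only a sketch: the function name stripUtmParams is made up for illustration, and it assumes that every parameter whose name contains "utm_" (including the mangled "amp%3Butm_medium" form from the question) should be dropped.

```php
<?php
// Hypothetical helper (not from the original answer): removes every query
// parameter whose name contains "utm_", wherever it appears in the query string.
function stripUtmParams(string $url): string
{
    // Split into base URL and query string; limit 2 keeps any later "?" intact.
    $parts = explode('?', $url, 2);
    if (!isset($parts[1])) {
        return $url; // nothing to clean
    }

    // parse_str URL-decodes names, so "amp%3Butm_medium" becomes "amp;utm_medium".
    parse_str($parts[1], $params);

    foreach (array_keys($params) as $key) {
        if (strpos((string) $key, 'utm_') !== false) {
            unset($params[$key]);
        }
    }

    $query = http_build_query($params);
    return $query === '' ? $parts[0] : $parts[0] . '?' . $query;
}

echo stripUtmParams('http://store.com/maison-entretien/aspirateur-nettoyeur/tous-les-aspirateurs-nettoyeurs.html?amp%3Butm_medium=push&%3Butm_source=application&p=2&price=12-260'), "\n";
```

Because the query string is parsed rather than matched with a regular expression, the order of the parameters no longer matters.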

Related

How to delete tracking code from links in PHP

Hi, I have a form in WordPress where users can submit a link to a product, but very often the links come with unnecessary baggage, like tracking codes. I would like to create a filter in WordPress that cleans the links so they consist of just a working link. If possible, I would also like to confirm that the cleaned link still works, or use a method that guarantees it will still work.
The main things I want to get rid of in links are utm_source and its contents, utm_medium and its contents, etc. Everything but the clean working link.
So for example, a link like this:
https://www.serenaandlily.com/variationproduct?dwvar_m10055_size=Twin&dwvar_m10055_color=Chambray&pid=m10055&pdp=true&source=detail&utm_source=affiliate&utm_medium=affiliate&utm_campaign=pjdatafeed&publisherId=20648&clickId=2669312134#fo_c=745&fo_k=c0ebaf8359ca7853df8343e535533280&fo_s=pepperjam
Will end up like this:
https://www.serenaandlily.com/variationproduct?dwvar_m10055_size=Twin&dwvar_m10055_color=Chambray&pid=m10055
I'd really appreciate if someone can lead me in the right direction.
Thanks!
You can do what you want with explode, parse_str and http_build_query. This code uses an array of unwanted parameters to decide what to delete from the query string:
$unwanted_params = array('utm_source', 'utm_medium', 'utm_campaign', 'clickId', 'publisherId', 'source', 'pdp', 'details', 'fo_k', 'fo_s');
$url = 'https://www.serenaandlily.com/variationproduct?dwvar_m10055_size=Twin&dwvar_m10055_color=Chambray&pid=m10055&pdp=true&source=detail&utm_source=affiliate&utm_medium=affiliate&utm_campaign=pjdatafeed&publisherId=20648&clickId=2669312134#fo_c=745&fo_k=c0ebaf8359ca7853df8343e535533280&fo_s=pepperjam';
list($path, $query_string) = explode('?', $url, 2);
// parse the query string
parse_str($query_string, $params);
// delete unwanted parameters
foreach ($unwanted_params as $p) unset($params[$p]);
// rebuild the query
$query_string = http_build_query($params);
// reassemble the URL
$url = $path . '?' . $query_string;
echo $url;
Output:
https://www.serenaandlily.com/variationproduct?dwvar_m10055_size=Twin&dwvar_m10055_color=Chambray&pid=m10055
Demo on 3v4l.org
You can do this in PHP itself. The parse_url() function (https://secure.php.net/manual/en/function.parse-url.php) splits a URL into its components, and parse_str() turns the query component into an array of parameters. After parsing, you can filter the parameters and remove the unwanted ones. Finally, use http_build_query() (https://secure.php.net/manual/en/function.http-build-query.php) to rebuild the query string :)
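A minimal sketch of that approach might look like the following. The function name removeQueryParams is an assumption made for illustration, the sketch only handles absolute URLs with a scheme and host, it ignores user/password components, and it deliberately drops the #fragment because the desired output in the question has none.

```php
<?php
// Hypothetical helper: rebuilds an absolute URL without the listed query
// parameters and without the fragment.
function removeQueryParams(string $url, array $unwanted): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return $url; // not an absolute URL this sketch understands
    }

    $params = [];
    if (isset($parts['query'])) {
        parse_str($parts['query'], $params);
    }

    // Keep only the parameters whose names are not in the unwanted list.
    $params = array_diff_key($params, array_flip($unwanted));

    $clean = $parts['scheme'] . '://' . $parts['host'];
    if (isset($parts['port'])) {
        $clean .= ':' . $parts['port'];
    }
    $clean .= $parts['path'] ?? '';

    $query = http_build_query($params);
    if ($query !== '') {
        $clean .= '?' . $query;
    }
    return $clean; // the fragment is intentionally not re-attached
}

$unwanted = ['utm_source', 'utm_medium', 'utm_campaign', 'clickId', 'publisherId', 'source', 'pdp'];
echo removeQueryParams('https://www.serenaandlily.com/variationproduct?dwvar_m10055_size=Twin&dwvar_m10055_color=Chambray&pid=m10055&pdp=true&source=detail&utm_source=affiliate&utm_medium=affiliate&utm_campaign=pjdatafeed&publisherId=20648&clickId=2669312134#fo_c=745&fo_k=c0ebaf8359ca7853df8343e535533280&fo_s=pepperjam', $unwanted), "\n";
```

Since parse_url separates the fragment from the query, the fo_* values after the # never reach parse_str and do not need to be listed as unwanted parameters.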

PHP URL Variable Appending

Hoping this is a simple and easy question. I've seen multiple examples of, and know how to append variables to the URL (i.e. mydomain.com/index.php?id=1&stat=0), but my question is this:
If I have a page on my site that already has variables in the URL (i.e. mydomain.com/tickets.php?stat=Open), how can I append a page number to the end of that URL (i.e. mydomain.com/tickets.php?stat=Open&page=2). This is for pagination purposes of a table with values from my database, that includes a search and select function (select open, closed, or all tickets, and search for a specific ticket number).
I've done several searches with google, and came up dry, as most topics regarding this have you hardcode the url with variables from the get go, and not append them. I may just be using the wrong search parameters as well, and am not sure what to search for exactly.
Any help or insight on this would be greatly appreciated, thank you.
Please note I wish to do this solely in PHP, HTML, and MySQLi. I want to refrain from using javascript or ajax if possible for my clients that may have those features disabled on their browsers.
You can do it this way:
<?php
$domain = "mydomain.com";
$page = "tickets.php?";
$full_page_url = $domain.'/'.$page;
$arr = array('stat' => 'Open', 'page' =>2);
$add= http_build_query($arr);
$correct_url = $full_page_url. $add;
echo $correct_url;
?>
Output: mydomain.com/tickets.php?stat=Open&page=2
I would do it like this:
$page = 2;
$url = 'mydomain.com/tickets.php?stat=Open';
if( false !== strpos($url, '?')){
//if url has a ? split it.
$arr_url = explode('?', $url);
//convert query string to array, $array=['stat'=>'Open']
parse_str($arr_url[1], $array);
//add or replace page by array key
$array['page'] = $page;
//convert it back to a query string.
$query = http_build_query($array);
print_r($query);
}
Outputs
stat=Open&page=2
After that, it's a simple matter of putting $query back together with $arr_url[0]. I'll leave this up to you, but here is a hint: $arr_url[0].'?'.$query
The advantage here is that you don't have to worry about getting into a situation where you are adding page after page after page after...
Like this:
mydomain.com/tickets.php?stat=Open&page=1&page=2&page=3
That is what happens if you simply concatenate the page number onto the end of the URL, and removing the old value by hand is probably just as hard as parsing the query string.
As a side note, you could just use $_GET, since $_GET is the query string already parsed into an array (so you could skip parse_str). But where is the fun in that? Also, the URL may not come from the current request, for example if you are building the link from a string.
So I thought I would show it with parse_str to cover the "harder" case.
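Put together, the parse_str approach can be wrapped in a small helper. This is only a sketch: the name withQueryParam is made up for illustration, and it assumes the URL has no #fragment.

```php
<?php
// Hypothetical helper: sets (adds or replaces) one query parameter on a URL.
function withQueryParam(string $url, string $key, $value): string
{
    // Split off the query string, if any; limit 2 keeps later "?" characters intact.
    $parts = explode('?', $url, 2);

    $params = [];
    if (isset($parts[1])) {
        parse_str($parts[1], $params);
    }

    // Assigning by key replaces an existing value instead of appending a duplicate.
    $params[$key] = $value;

    return $parts[0] . '?' . http_build_query($params);
}

echo withQueryParam('mydomain.com/tickets.php?stat=Open', 'page', 2), "\n";
echo withQueryParam('mydomain.com/tickets.php?stat=Open&page=2', 'page', 3), "\n";
```

Calling it repeatedly on its own output keeps a single page parameter, which is the point of parsing instead of concatenating.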
One last thing: if you are just building a bunch of URLs that are all the same except for the page part, the obvious answer is to set up a base URL and then just loop over the numbers.
$url = 'mydomain.com/tickets.php?stat=Open';
$pagedUrls = [];
$numberPages = 10;
for($i=1; $i<=$numberPages; $i++){
// use . rather than .= so the base URL is not modified on each pass
$pagedUrls[] = $url.'&page='.$i;
}
Or what have you for the number of pages.
It's not really clear from your question exactly what you are trying to do.
Hope that helps.

How do I extract a link from a longer link using PHP?

I was wondering if someone knows what the best method would be to extract a link from another link. Here's an example:
If I have links in the following format:
http://www.youtube.com/watch?v=35HBFeB4jYg OR
http://it.answers.yahoo.com/question/index?qid=20080520042405AApM2Rv OR
https://www.google.it/search?q=rap+tedesco&aq=f&oq=rap+tedesco&aqs=chrome.0.57j62l2.2287&sourceid=chrome&ie=UTF-8#hl=en&sclient=psy-ab&q=migliori+programatori&oq=migliori+programatori&gs_l=serp.3..0i19j0i13i30i19l3.9986.13880.0.14127.14.10.0.4.4.0.165.931.6j4.10.0...0.0...1c.1.7.psy-ab.tPmiWRyUVXA&pbx=1&bav=on.2,or.r_cp.r_qf.&fp=ffc0e9337f73a744&biw=1280&bih=699
How would I go about extracting only the web pages like so:
http://www.youtube.com
http://it.answers.yahoo.com
https://www.google.it
I was wondering if and what regular expression I could use with PHP to achieve this, also are regular expressions the way to go?
There is a PHP function for parsing URLs: parse_url
$url = 'http://it.answers.yahoo.com/question/index?qid=20080520042405AApM2Rv';
$p = parse_url($url);
echo $p["scheme"] . "://" . $p["host"];
Use function parse_url.
$link = "https://www.google.it/search?q=rap+tedesco";
$parseUrl = parse_url($link);
$siteName = $parseUrl['scheme']."://". $parseUrl['host'];
Using Regexp.
preg_match('#http(s?)://([\w]+\.){1}([\w]+\.?)+#',$link,$matches);
echo $matches[0];
Codeviper Demo.
You just want the domain of the page. PHP has a function called parse_url that can help with that.
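As a sketch of how the parse_url answers above could be applied to all three example links, the following wraps the idea in a function. The name siteOrigin is an assumption for illustration, and the function returns null when the input has no scheme or host.

```php
<?php
// Hypothetical helper: returns "scheme://host" for an absolute URL,
// or null if parse_url cannot find both parts.
function siteOrigin(string $url): ?string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return null;
    }
    return $parts['scheme'] . '://' . $parts['host'];
}

$links = [
    'http://www.youtube.com/watch?v=35HBFeB4jYg',
    'http://it.answers.yahoo.com/question/index?qid=20080520042405AApM2Rv',
    'https://www.google.it/search?q=rap+tedesco&aq=f#hl=en',
];

foreach ($links as $link) {
    echo siteOrigin($link), "\n";
}
```

The query string and fragment never need to be inspected, which is why no regular expression is required here.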

Remove certain part of string in PHP [duplicate]

This question already has answers here:
Get domain name (not subdomain) in php
(18 answers)
Closed 10 years ago.
I've already seen a bunch of questions on this exact subject, but none seem to solve my problem. I want to create a function that will remove everything from a website address, except for the domain name.
For example if the user inputs: http://www.stackoverflow.com/blahblahblah I want to get stackoverflow, and the same way if the user inputs facebook.com/user/bacon I want to get facebook.
Does anyone know of a function or a way to remove certain parts of strings? Maybe it could search for http and, when found, remove everything up to and including the //. Then it could search for www and, if found, remove everything up to the first dot. Then it keeps everything until the next dot and removes everything after it. Looking at it now, this might cause problems with sites such as http://www.en.wikipedia.org, because I'd be left with only en.
Any ideas (preferably in PHP, but JavaScript is also welcome)?
EDIT 1:
Thanks to great feedback I think I've been able to work out a function that does what I want:
function getdomain($url) {
$parts = parse_url($url);
if(!isset($parts['scheme'])) { // no scheme given (also avoids breaking https URLs)
$url = 'http://'.$url;
}
$parts2 = parse_url($url);
$host = $parts2['host'];
$remove = explode('.', $host);
$result = $remove[0];
if($result == 'www') {
$result = $remove[1];
}
return $result;
}
It's not perfect, at least where subdomains are concerned, but I think it's possible to do something about that. Maybe I could add a second if statement at the end to check the length of the array: if it has more than two items, choose item 1 instead of item 0. This obviously gives me trouble with any domain using .co.uk (because that will be three items long, but I don't want to return co). I'll try to work on it a little more and see what I come up with. I'd be glad if some of you PHP gurus out there could take a look as well. I'm not as skilled or as experienced as any of you... :P
Use parse_url to split the URL into the different parts. What you need is the hostname. Then you will want to split it by the dot and get the first part:
$url = 'http://facebook.com/blahblah';
$parts = parse_url($url);
$host = $parts['host']; // facebook.com
$foo = explode('.', $host);
$result = $foo[0]; // facebook
You can use PHP's parse_url function, which returns exactly what you want.
Use the parse_url function in PHP to get domain.com, and then replace .com with an empty string.
I am a little rusty on my regular expressions but this should work.
$url='http://www.en.wikipedia.org';
$domain = parse_url($url, PHP_URL_HOST); //Will return www.en.wikipedia.org
$domain = preg_replace('/\.com|\.org/', '', $domain); //pattern needs delimiters
http://php.net/manual/en/function.parse-url.php
PHP REGEX: Get domain from URL
http://rubular.com/r/MvyPO9ijnQ //Check regular expressions
You're looking for info on Regular Expression. It's a bit complicated, so be prepared to read up. In your case, you'll best utilize preg_match and preg_replace. It searches for a match based on your pattern and replaces the matches with your replacement.
preg_match
preg_replace
I'd start with a pattern like this: find .com, .net or .org and delete it and everything after it. Then find the last . and delete it and everything in front of it. Finally, if // exists, delete it and everything in front of it.
// preg_replace returns the new string, so assign the result back to $url
if (preg_match("/^http:\/\//i",$url))
$url = preg_replace("/^http:\/\//i","",$url);
if (preg_match("/www\./i",$url))
$url = preg_replace("/www\./i","",$url);
if (preg_match("/\.com/i",$url))
$url = preg_replace("/\.com/i","",$url);
if (preg_match("/\/*$/",$url))
$url = preg_replace("/\/*$/","",$url);
^ = at the start of the string
i = case insensitive
\ = escape char
$ = the end of the string
This will have to be played around with and tweaked, but it should get you pointed in the right direction.
Javascript:
document.domain.replace(".com","")
PHP:
$url = 'http://google.com/something/something';
$parse = parse_url($url);
echo str_replace(".com","", $parse['host']); //returns google
This is quite a quick method but should do what you want in PHP:
function getDomain( $URL ) {
return explode('.',$URL)[1];
}
I will update it when I get a chance, but basically it splits the URL into pieces at each full stop and then returns the second item, which should be the domain. A bit more logic would be required for longer domains such as www.abc.xyz.com, and for URLs without a www prefix, but for normal URLs it would suffice.
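The answers above can be combined into one sketch that copes with a missing scheme, a www prefix and extra subdomains. The function name domainLabel is hypothetical, and the rule it uses (take the second-to-last label of the host) is an assumption that is known to be wrong for suffixes such as .co.uk; handling those correctly would need a public suffix list, which is outside this sketch.

```php
<?php
// Hypothetical helper: returns the label just before the final suffix,
// e.g. "stackoverflow" for www.stackoverflow.com. Wrong for .co.uk-style suffixes.
function domainLabel(string $url): ?string
{
    // parse_url only reports a host when a scheme is present, so add one if missing.
    if (!preg_match('#^[a-z][a-z0-9+.-]*://#i', $url)) {
        $url = 'http://' . $url;
    }

    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null;
    }

    $labels = explode('.', strtolower($host));
    $count = count($labels);

    // A single-label host such as "localhost" is returned as-is.
    return $count >= 2 ? $labels[$count - 2] : $labels[0];
}

echo domainLabel('http://www.stackoverflow.com/blahblahblah'), "\n";
echo domainLabel('facebook.com/user/bacon'), "\n";
echo domainLabel('http://www.en.wikipedia.org'), "\n";
```

Counting from the end of the host rather than from the start is what lets www.en.wikipedia.org give wikipedia instead of en.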

Rewrite Youtube URL

I have some YouTube URLs stored in a database that I need to rewrite.
They are stored in this format:
http://youtu.be/IkZuQ-aTIs0
I need to have them re-written to look like this:
http://youtube.com/v/IkZuQ-aTIs0
These values are stored as a variable $VideoType
I'm calling the variable like this:
<?php if ($video['VideoType']){
echo "<a rel=\"shadowbox;width=700;height=400;player=swf\" href=\"" . $video['VideoType'] . "\">View Video</a>";
}?>
How do I rewrite them?
Thank you for the help.
You want to use the preg_replace function:
Something like:
$oldurl = 'youtu.be/blah';
$pattern = '/youtu\.be/'; // escape the dot so it only matches a literal "."
$replacement = 'youtube.com/v';
$newurl = preg_replace($pattern, $replacement, $oldurl);
You can use a regular expression to do this for you. If you have ONLY YouTube URLs stored in your database, then it would be sufficient to take the part after the last slash ('IkZuQ-aTIs0') and place it in the href attribute after 'http://www.youtube.com/v/'.
For this simple solution, do something like this:
<?php
if ($video['VideoType']) {
$last_slash_position = strrpos($video['VideoType'], "/");
// start one character after the slash so only the video ID remains
$youtube_url_code = substr($video['VideoType'], $last_slash_position + 1);
echo "<a rel=\"shadowbox;width=700;height=400;player=swf\"
href=\"http://www.youtube.com/v/".$youtube_url_code."\">
View Video</a>";
}
?>
I cannot test it at the moment, so you may need to experiment with the position of the last slash occurrence. You can also have a look at the function definitions:
http://www.php.net/manual/en/function.substr.php
http://www.php.net/manual/en/function.strrpos.php
However, be aware of the performance. Build a script which parses your database and converts every URL, or store both a short and a long URL in each entry, because running regular expressions in the view on every request is not a good idea.
UPDATE: it would be even better to store ONLY the youtube video identifier / url code in the database for every entry, so in the example's case it would be IkZuQ-aTIs0.
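For the one-off conversion script mentioned above, the rewrite itself could be sketched as below. The function name rewriteShortYoutubeUrl is made up for illustration, and the pattern assumes video IDs consist only of letters, digits, hyphens and underscores, as in the example from the question.

```php
<?php
// Hypothetical helper: turns http(s)://youtu.be/<id> into http(s)://youtube.com/v/<id>.
// URLs that do not match the pattern are returned unchanged.
function rewriteShortYoutubeUrl(string $url): string
{
    $result = preg_replace(
        '#^(https?://)youtu\.be/([A-Za-z0-9_-]+)#',
        '$1youtube.com/v/$2',
        $url
    );

    // preg_replace returns null on a regex error; fall back to the input then.
    return $result ?? $url;
}

echo rewriteShortYoutubeUrl('http://youtu.be/IkZuQ-aTIs0'), "\n";
```

Anchoring the pattern at the start of the string and escaping the dot keeps it from touching unrelated URLs that merely contain similar text.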
