Check URL for valid format by pattern - php

I run a social bookmarking website where users can submit links from other websites (using a bookmarklet or bookmark button in the bookmarks bar, or by entering URLs directly).
Users have problems with some URLs when they add links with the bookmark button in their browsers. The problem occurs with URLs that contain the "&" character. Most users on Safari (Mac or Windows) cannot add such links with the bookmark button.
The issue is that for every URL containing "&", $isLink = preg_match($pattern, $url); returns false (see the code below).
Removing part of my code (marked with comments in the snippet) fixed the problem.
But I do not want to remove this code. How can I fix the problem without removing it?
$url = htmlspecialchars(sanitize($_POST['url'], 3));
$url = str_replace('&amp;', '&', $url);
$url = html_entity_decode($url);
if (strpos($url,'http')!==0) {
    $url = "http://$url";
}
// check if URL is valid format
$pattern = '/^(([\w]+:)?\/\/)?(([\d\w]|%[a-fA-f\d]{2,2})+(:([\d\w]|%[a-fA-f\d]{2,2})+)?@)?([\d\w]([-\d\w]{0,253}[\d\w])?\.)+[\w]{2,4}(:[\d]+)?(\/([-+_~.,\d\w]|%[a-fA-f\d]{2,2})*)*(\?(&?([-+_~.,\d\w]|%[a-fA-f\d]{2,2})=?)*)?(#([-+_~.,\/\d\w]|%[a-fA-f\d]{2,2})*)?$/';
// vvv I REMOVED FROM HERE vvv
$isLink = preg_match($pattern, $url); // Returns true if a link
// ^^^ UNTIL HERE ^^^
if($url == "http://" || $url == "") {
    if(Submit_Require_A_URL == false) {
        $linkres->valid = true;
    } else {
        $linkres->valid = false;
    }
    $linkres->url_title = "";
} elseif ($isLink == false) {
    $linkres->valid = false;
}
Website bookmark button code is:
javascript:q=(document.location.href);void(open('http://website.com/submit.php?url='+escape(q),'_self','resizable,location,menubar,toolbar,scrollbars,status'));

Why not use PHP's filter_var() function to check the URL?
$url = $_POST['url'];
$isLink = filter_var($url, FILTER_VALIDATE_URL);
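A slightly fuller sketch of that idea follows. Note that filter_var() with FILTER_VALIDATE_URL returns the URL string on success and false on failure, not a boolean. The function name validateSubmittedUrl is hypothetical, and decoding entities and prepending http:// are assumptions based on what the question's code does before validating.

```php
<?php
// Hypothetical helper: normalizes a submitted URL and validates it with filter_var().
// Returns the cleaned URL, or null when it is not a valid URL.
function validateSubmittedUrl(string $raw): ?string
{
    // Turn "&amp;" (and other entities) back into plain characters such as "&".
    $url = trim(html_entity_decode($raw, ENT_QUOTES, 'UTF-8'));
    if ($url === '') {
        return null;
    }
    // Prepend a scheme when the user left it out (assumption: http is acceptable).
    if (!preg_match('#^https?://#i', $url)) {
        $url = 'http://' . $url;
    }
    // filter_var() gives back the string on success, false on failure.
    $valid = filter_var($url, FILTER_VALIDATE_URL);
    return $valid === false ? null : $valid;
}

var_dump(validateSubmittedUrl('http://example.com/page?a=1&amp;b=2'));
var_dump(validateSubmittedUrl('http://'));
```

A URL with "&" in its query string passes this check, so the regular expression is no longer needed for that case.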


Remove subdomain from URL/host to match domains in affiliate link array

I want to make a redirect file in PHP that automatically adds affiliate tags to all links, similar to how https://freekaamaal.com/links?url=https://www.amazon.in/ works.
Opening the above link automatically adds the affiliate tag, and the final link that opens is 'https://www.amazon.in/?tag=freekaamaal-21'. The same happens for Flipkart and many other sites.
It automatically adds affiliate tags to links for various stores, for example Amazon, Flipkart, Ajio, etc.
I'll be very thankful if anyone can help me with this.
Thanks in advance 🙏
Right now I have the code below, but the problem is that some links have an extra subdomain, for example https://dl.flipkart.com/ or https://m.shopclues.com/. For these links the script does not find the host in the array and redirects to the default link instead.
<?php
$subid = isset($_GET['subid']) ? $_GET['subid'] : 'telegram'; //subid for external tracking
$affid = $_GET['url']; //main link
$parse = parse_url($affid);
$host = $parse['host'];
$host = str_ireplace('www.', '', $host);
//flipkart affiliate link generates here
$url_parts = parse_url($affid);
$url_parts['host'] = 'dl.flipkart.com';
$url_parts['path'] .= "/";
if(strpos($url_parts['path'],"/dl/") !== 0) $url_parts['path'] = '/dl'.rtrim($url_parts['path'],"/");
$url = $url_parts['scheme'] . "://" . $url_parts['host'] . $url_parts['path'] . (empty($url_parts['query']) ? '' : '?' . $url_parts['query']);
$afftag = "harshk&affExtParam1=$subid"; //our affiliate ID
if (strpos($url, '?') !== false) {
    if (substr($url, -1) == "&") {
        $url = $url.'affid='.$afftag;
    } else {
        $url = $url.'&affid='.$afftag;
    }
} else { // start a new query string
    $url = $url.'?affid='.$afftag;
}
$flipkartlink = $url;
//amazon link generates here
$amazon = $affid;
$amzntag = "subhdeals-21"; //our affiliate ID
if (strpos($amazon, '?') !== false) {
    if (substr($amazon, -1) == "&") {
        $amazon = $amazon.'tag='.$amzntag;
    } else {
        $amazon = $amazon.'&tag='.$amzntag;
    }
} else { // start a new query string
    $amazon = $amazon.'?tag='.$amzntag;
}
$amazonlink = $amazon;
$cueurl = "https://linksredirect.com/?subid=$subid&source=linkkit&url="; //cuelinks deeplink for redirection
$ulpsub = '&subid=' .$subid; //subid
$encoded = urlencode($affid); //url encode
$home = $cueurl . $encoded; // default link for redirection.
$partner = array( //Insert links here
    "amazon.in" => "$amazonlink",
    "flipkart.com" => "$flipkartlink",
    "shopclues.com" => $cueurl . $encoded,
    "aliexpress.com" => $cueurl . $encoded,
    "ajio.com" => "https://ad.admitad.com/g/?ulp=$encoded$ulpsub",
    "croma.com" => "https://ad.admitad.com/g/?ulp=$encoded$ulpsub",
    "myntra.com" => "https://ad.admitad.com/g/?ulp=$encoded$ulpsub",
);
$store = array_key_exists($host, $partner) === false ? $home : $partner[$host]; //Checks if the host exists; if not, redirects to your default link
header("Location: $store"); //Do not change
exit(); //Do not change
?>
Thank you for updating your question with the code you have and explaining what the actual problem is. Since your reference array for the affiliate links is indexed by base domain, we need to normalize the hostname to remove any possible subdomains. Right now you have:
$host = str_ireplace('www.', '', $host);
This does the job only if the subdomain is www., obviously. Now, one might be tempted to simply explode by . and take the last two components. However, that would fail with .co.in and other second-level domains. We're better off using a regular expression.
One could craft a universal regular expression that handles all possible second-level domains (co., net., org., edu., ...) but that would become a long list. For your use case, since your list currently only needs the .com, .in and .co.in domain extensions, and is unlikely to need many more, we'll just hard-code these into the regex to keep things fast and simple:
$host = preg_replace('#^.*?([^.]+\.)(com|in|co\.in)$#i', '\1\2', $host);
To explain the regex we're using:
^ start-of-subject anchor;
.*? ungreedy optional match for any characters (if a subdomain, or a sub-subdomain, exists);
([^.]+\.) capturing group for non-. characters followed by . (main domain name);
(com|in|co\.in) capturing group for the domain extension (add to the list as necessary);
$ end-of-subject anchor.
Then we replace the hostname with the contents of the two capture groups, that is, the main domain name with its trailing dot, and the extension. This returns example.com for www.example.com, foo.bar.example.com, or example.com itself; and example.co.in for www.example.co.in, foo.bar.example.co.in, or example.co.in itself. A host whose extension is not in the list is returned unchanged. This should help your script work as intended. If there are further problems, please update the question and we'll see what solutions are available.
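As a quick check of that normalization, here is a minimal sketch. The function name baseDomain and the trimmed-down $partner array are hypothetical, and the extension list (com, in, co.in) is an assumption chosen to match the hosts in the question's array.

```php
<?php
// Hypothetical helper: strips subdomains so the host can be used as an array key.
// Only the hard-coded extensions are handled; other hosts come back unchanged.
function baseDomain(string $host): string
{
    return preg_replace('#^.*?([^.]+\.)(com|in|co\.in)$#i', '$1$2', strtolower($host));
}

// Placeholder lookup table keyed by base domain (values are illustrative only).
$partner = array(
    'amazon.in'    => 'amazon affiliate link',
    'flipkart.com' => 'flipkart affiliate link',
);
$default = 'default link';

foreach (array('www.amazon.in', 'dl.flipkart.com', 'm.shopclues.com') as $host) {
    $key = baseDomain($host);
    // Fall back to the default link when the base domain is not in the table.
    $target = array_key_exists($key, $partner) ? $partner[$key] : $default;
    echo $host, ' => ', $key, ' => ', $target, "\n";
}
```

In the original script, the same call would replace the str_ireplace('www.', '', $host) line before the array lookup.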

How to pass variables from controller to controller in PrestaShop?

Is there a way to pass variables from one controller to another in PrestaShop? I'm trying to pass the new_address variable from an AddressController override like this:
Tools::redirect('index.php?controller='.$back.($mod ? '&back='.$mod : '') . '&new_address=' . $address->id);
Note that this is the original line plus . '&new_address=' . $address->id, so I have to stick with Tools::redirect.
With that line, new_address is not present in $_GET on the next page. I don't see it in OrderController or ParentOrderController either.
I've found the reason in Tools::redirect. It has a line like this:
$url = Tools::strReplaceFirst('&', '?', $url);
Here they are basically dropping every query variable other than the first one, so any that you add are lost. If you have index.php?a=1&b=2, you'll get index.php?a=1?b=2. I don't really see the point... Maybe it's a bug.
So I overrode Tools::redirect like this (modified lines are marked with MOD comments):
public static function redirect($url, $base_uri = __PS_BASE_URI__, Link $link = null, $headers = null){
    if (!$link) $link = Context::getContext()->link;
    // MOD: Save the original query string. I take the last item because sometimes (I think it's a bug) $url looks like index.php?controller=order.php?step=1, with two question marks.
    $parts = explode('?', $url); // array_pop() needs a variable, not an expression
    $querystring = count($parts) > 1 ? array_pop($parts) : ''; // empty when $url has no '?'
    if (strpos($url, 'http://') === false && strpos($url, 'https://') === false && $link) {
        if (strpos($url, $base_uri) === 0) {
            $url = substr($url, strlen($base_uri));
        }
        if (strpos($url, 'index.php?controller=') !== false && strpos($url, 'index.php/') == 0) {
            $url = substr($url, strlen('index.php?controller='));
            if (Configuration::get('PS_REWRITING_SETTINGS')) {
                $url = Tools::strReplaceFirst('&', '?', $url); // ...Don't see the point here...
            }
        }
        $explode = explode('?', $url);
        // don't use ssl if url is home page
        // used when logout for example
        $use_ssl = !empty($url);
        $url = $link->getPageLink($explode[0], $use_ssl);
        if($querystring) $url .= '?'.$querystring; // MOD: adding full querystring!! Also deleted 3 lines that added $explode[1] instead
    }
    // Send additional headers
    if ($headers) {
        if (!is_array($headers)) $headers = array($headers);
        foreach ($headers as $header) {
            header($header);
        }
    }
    header('Location: '.$url);
    exit;
}
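To see what the strReplaceFirst line does to a URL, the three string steps quoted in the method can be traced outside PrestaShop. This is only a sketch: replaceFirst is a hypothetical stand-in for Tools::strReplaceFirst (its behavior is assumed from its name), and the sample URL is made up.

```php
<?php
// Hypothetical stand-in for Tools::strReplaceFirst(); behavior assumed from its name.
function replaceFirst(string $search, string $replace, string $subject): string
{
    $pos = strpos($subject, $search);
    return $pos === false ? $subject : substr_replace($subject, $replace, $pos, strlen($search));
}

$url = 'index.php?controller=order&back=cart&new_address=5';

// Step 1: strip the "index.php?controller=" prefix, leaving the controller name first.
$url = substr($url, strlen('index.php?controller='));

// Step 2: turn the first "&" into "?" so the controller name is separated from the rest.
$url = replaceFirst('&', '?', $url);

// Step 3: split into controller name and remaining query string.
list($page, $query) = array_pad(explode('?', $url, 2), 2, '');

echo $page, "\n";
echo $query, "\n";
```

On this input the replacement seems to exist so that explode('?') can split the controller name from the remaining parameters. The trace does not show what the rest of the core method does with the second part.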

Thinking about domain validation

This is my first question, and by the way, I am not comfortable with regexes.
I am working on a PHP function that validates domains or URLs given as user input. (Sub)domains are collected via an HTML input field.
So I have to deal with different formats such as http(s)://domain.tld and domain.tld, both of which may include a path or be invalid.
The function should correct almost-correct user input rather than return false.
In the end, I want to return the format (sub.)domain.tld, but only for domains that actually exist.
My work-in-progress solution is the following. What do you think of it?
function valDomain($url,$prefix=""){
    // Tidy up the raw input
    $url = trim($url);
    $url = str_replace(" ", "", $url);
    $url = trim($url,'.');
    $url = trim($url,'?');
    $url = trim($url,'-');
    $url = trim($url,'/');
    $url = strtolower($url);
    $url = substr($url,0,100);
    // Without a dot it cannot be a domain
    if(strpos($url,'.') === false) {
        return false;
    }
    // Take the host if a scheme is present
    if(strpos($url,'http') !== false) {
        $x = parse_url($url);
        if(isset($x['host'])){
            $url = $x['host'];
        }
    }
    // Drop any remaining path
    if(strpos($url,'/') !== false) {
        $x = explode("/", $url);
        if(isset($x[0])){
            $url = $x[0];
        }
    }
    // Accept only hosts that have a DNS A record
    if(checkdnsrr($url,"A")){
        return $prefix.$url;
    } else {
        return false;
    }
}
To explain: it tidies up the user input, checks whether it can be a URL/domain at all, takes the host if it is a proper URL, strips the path, and then, when only the bare host should be left, checks whether a DNS entry exists for it. Only if one does, it returns the validated domain. Otherwise it returns false.
Does this make sense?
(The $prefix argument can optionally be used to add http:// to the result in order to render a hyperlink.)
The results will be stored in a database, so they need to be safe against malicious input.
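One possible way to shorten the tidy-up steps is to let parse_url() and filter_var() do most of the work. The sketch below is an assumption-laden variant, not a drop-in replacement: extractHost is a hypothetical name, it needs PHP 7 for FILTER_VALIDATE_DOMAIN, and it leaves out the DNS lookup so that it can run offline.

```php
<?php
// Hypothetical helper: returns the bare host from user input, or null if none is usable.
function extractHost(string $input): ?string
{
    $input = strtolower(trim($input));
    if ($input === '') {
        return null;
    }
    // parse_url() only finds a host when a scheme is present, so add one if missing.
    if (!preg_match('#^[a-z][a-z0-9+.-]*://#', $input)) {
        $input = 'http://' . $input;
    }
    $host = parse_url($input, PHP_URL_HOST);
    // A usable domain needs at least one dot.
    if (!is_string($host) || strpos($host, '.') === false) {
        return null;
    }
    // Reject hosts containing characters that are not allowed in hostnames.
    $host = filter_var($host, FILTER_VALIDATE_DOMAIN, FILTER_FLAG_HOSTNAME);
    return $host === false ? null : $host;
}

var_dump(extractHost('https://Sub.Example.com/path?x=1'));
var_dump(extractHost('nodots'));
```

The existence check from the question, checkdnsrr($host, "A"), can still be applied to the returned host. For the database, prepared statements are the usual safeguard regardless of how the value was validated.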

How to replace ?{GET Variable}= with a /

Please help me with this.
I want to replace the URL http://www.mywebsite.com/index?type=traveler with http://www.mywebsite.com/index/traveler
http://www.mywebsite.com/index?type=traveler is a hyperlink on another page.
I have tried many things but haven't succeeded so far.
You'll just need to add a few lines to your .htaccess file:
RewriteEngine On
RewriteRule index/(.*) index?type=$1 [L]
This passes "traveler" as your type parameter. Any other value after index/ is captured and sent in the query string as the type value as well.
Get the URL, test it against this condition, then include your page and exit:
if (strstr($url, 'index/')) {
    // Take everything after 'index/'
    $urlarr = explode('index/', $url);
    $url = $urlarr[1];
    // Cut the value at the first single quote, if there is one
    $sing_ext = strpos($url, "'");
    if ($sing_ext !== false) {
        $ur_ar = explode("'", $url);
        $url = $ur_ar[0];
    }
    $sub_cats_url = $_REQUEST['tag'] = $urlarr[1];
    include("product.php");
    exit;
}

Url validation with regex for old php version

Note: I'm using an older PHP version so FILTER_VALIDATE_URL is not available at this time.
After many searches I still cannot find an answer that covers all possible URL structures, so for now I am going with the following approach:
1) Function to add a proper scheme
function convertUrl ($url){
    $pattern = '#^http[s]?://#i';
    if(preg_match($pattern, $url) == 1) { // this url has proper scheme
        return $url;
    } else {
        return 'http://' . $url;
    }
}
2) Conditional to check if it is a URL or not
if (preg_match("/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i", $url)) {
    echo "URL is valid";
} else {
    echo "URL is invalid<br>";
}
It works perfectly for all of these possibilities:
$url = "google.com";
$url = "www.google.com";
$url = "http://google.com";
$url = "http://www.google.com";
$url = "https://google.com";
$url = "https://www.codgoogleekarate.com";
$url = "subdomain.google.com";
$url = "https://subdomain.google.com";
But there is still this edge case:
$url = "blahblahblahblah";
The function convertUrl($url) converts this to $url = "http://blahblahblahblah";
and then the regex considers it a valid URL, although it isn't!
How can I change it so that it rejects a URL with the structure http://blahblahblahblah?
If you want to validate internet URLs, add a check to your regex that requires a dot (.) character.
Note: http://blahblahblah is a valid URL, as is http://localhost
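A minimal sketch of such a dot check, written without type hints so it can also run on older PHP versions. The function name hasDottedHost is hypothetical, and the host pattern is a deliberately simple assumption (letters, digits, and hyphens in labels separated by dots), not a complete hostname grammar.

```php
<?php
// Hypothetical helper: true only when the URL's host contains at least one dot.
function hasDottedHost($url)
{
    // Same idea as convertUrl(): add a scheme so parse_url() can find the host.
    if (!preg_match('#^https?://#i', $url)) {
        $url = 'http://' . $url;
    }
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        return false;
    }
    // Two or more non-empty labels separated by dots, e.g. "google.com".
    return preg_match('/^[a-z0-9-]+(\.[a-z0-9-]+)+$/i', $parts['host']) === 1;
}

var_dump(hasDottedHost('subdomain.google.com'));
var_dump(hasDottedHost('blahblahblahblah'));
```

As the note above says, this also rejects hosts such as localhost, so it only fits when public internet URLs are the goal.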
Try this:
if (preg_match("/^(([\w]+:)?\/\/)?(([\d\w]|%[a-fA-F\d]{2,2})+(:([\d\w]|%[a-fA-F\d]{2,2})+)?@)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?(\/([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)*(\?(&?([-+_~.\d\w]|%[a-fA-F\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)?$/", $url)) {
    echo "URL is valid";
} else {
    echo "URL is invalid<br>";
}
