URL validation with regex for an old PHP version

Note: I'm using an older PHP version, so FILTER_VALIDATE_URL is not available.
After many searches I still can't find an answer that covers every possible URL structure, so I have settled on the following approach.
1) A function that adds a scheme when one is missing
function convertUrl ($url){
$pattern = '#^http[s]?://#i';
if(preg_match($pattern, $url) == 1) { // this url has proper scheme
return $url;
} else {
return 'http://' . $url;
}
}
2) Conditional to check if it is a URL or not
if (preg_match("/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i", $url)) {
echo "URL is valid";
}else {
echo "URL is invalid<br>";
}
It works perfectly for all of these inputs:
$url = "google.com";
$url = "www.google.com";
$url = "http://google.com";
$url = "http://www.google.com";
$url = "https://google.com";
$url = "https://www.codgoogleekarate.com";
$url = "subdomain.google.com";
$url = "https://subdomain.google.com";
But I still have this edge case:
$url = "blahblahblahblah";
The function convertUrl($url) converts this to $url = "http://blahblahblahblah";
and the regex then treats it as a valid URL, although it isn't.
How can I change it so that a URL with the structure http://blahblahblahblah does not pass?

If you want to validate internet URLs, add a check to your regex that requires a dot (.) character in the host.
Note: http://blahblahblah is a valid URL, as is http://localhost
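As a rough illustration of the "require a dot" idea, here is a minimal sketch that avoids FILTER_VALIDATE_URL and uses only parse_url() and string functions. The function name hasDottedHost is hypothetical and not from the thread, and the check deliberately rejects single-label hosts such as localhost, which may or may not be what you want.

```php
<?php
// Hypothetical helper (name is an assumption): accepts a URL only if its
// host part contains a dot that is neither the first nor the last character.
function hasDottedHost($url)
{
    // Prepend a default scheme when none is present so parse_url() can find the host.
    if (!preg_match('#^[a-z][a-z0-9+.-]*://#i', $url)) {
        $url = 'http://' . $url;
    }

    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        return false;
    }

    $host = $parts['host'];
    $dot  = strpos($host, '.');

    // A dot must exist, must not lead the host, and must not end it.
    return $dot !== false && $dot > 0 && substr($host, -1) !== '.';
}

var_dump(hasDottedHost('google.com'));
var_dump(hasDottedHost('blahblahblahblah'));
```

This only checks the shape of the host; it says nothing about whether the domain actually exists.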

Try this:
if (preg_match("/^(([\w]+:)?\/\/)?(([\d\w]|%[a-fA-F\d]{2,2})+(:([\d\w]|%[a-fA-F\d]{2,2})+)?@)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?(\/([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)*(\?(&?([-+_~.\d\w]|%[a-fA-F\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)?$/", $url)) {
echo "URL is valid";
}else {
echo "URL is invalid<br>";
}

Related

Checking the URI PHP

I have a problem with checking the URI. I would like to create a pattern that will accept such a URI:
?name=key&name=key
#anchor
foo?name=key&name=key
foo#anchor
foo/bar/*
foo/bar?name=key&name=key
foo/bar#anchor
At the moment I have something like this:
$path = 'foo';
$uri = 'foo/bar/bb';
preg_match('/^('.$path.'[^\w])[\/\w\S]+$/i', $uri);
// OR
preg_match('/^('.$path.'|'.$path.'(\?|#)[\w\S\=\.&%-]+)$/i', $uri);
I would like to simplify it somehow. Thank you in advance for your help.
If you are getting the full URL, this will work.
You can use filter_var with FILTER_VALIDATE_URL:
$url = "https://stackoverflow.com/questions/53304342/checking-the-uri-php";
if (filter_var($url, FILTER_VALIDATE_URL)) {
echo "$url is a valid URL";
} else {
echo"$url is not a valid URL";
}
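The URIs listed in the question are relative, so a full-URL validator does not cover them. The following is only a sketch of one possible single pattern, under the assumption that a valid URI is an optional $path (optionally followed by /...), then an optional ?query, then an optional #fragment. The function name matchesPath is hypothetical.

```php
<?php
// Hypothetical helper (name and grammar are assumptions, not from the thread).
function matchesPath($path, $uri)
{
    // Escape the path so characters like "." or "~" are matched literally.
    $quoted = preg_quote($path, '~');

    // [path[/anything]] [?query] [#fragment]
    $pattern = '~^(?:' . $quoted . '(?:/[^?#]*)?)?(?:\?[^#]*)?(?:#.*)?$~i';

    // The pattern alone would accept an empty string, so reject that explicitly.
    return $uri !== '' && preg_match($pattern, $uri) === 1;
}

var_dump(matchesPath('foo', 'foo/bar?name=key&name=key'));
var_dump(matchesPath('foo', 'bar/baz'));
```

Because every part is optional, the pattern also accepts a bare foo; tighten it if that should be rejected.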

This URL validation is not working properly

I have some code, taken from another Stack Overflow post.
Here it is:
function validate_url($url)
{
$pattern = "/^((ht|f)tp(s?)\:\/\/|~/|/)?([w]{2}([\w\-]+\.)+([\w]{2,5}))(:[\d]{1,5})?/";
if (!preg_match($pattern, $url))
{
$this->form_validation->set_message('validate_url', 'The URL you entered is not correctly formatted.');
return false;
}
return false;
}
It is not working properly: it allows URLs without a suffix such as .com or .in (anything after the dot).
That is, it should allow proper URLs such as
http://something.com or
http://www.something.in
but not
http://something (without .in, .com or any other suffix),
something or
www.something
I don't know much about regular expressions. Please help.
Take a look at this site:
https://mathiasbynens.be/demo/url-regex
It compares a lot of different URL validation regexes.
The one from Diego Perini:
_^(?:(?:https?|ftp)://)(?:\S+(?::\S*)?@)?(?:(?!10(?:\.\d{1,3}){3})(?!127(?:\.\d{1,3}){3})(?!169\.254(?:\.\d{1,3}){2})(?!192\.168(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\x{00a1}-\x{ffff}0-9]+-?)*[a-z\x{00a1}-\x{ffff}0-9]+)(?:\.(?:[a-z\x{00a1}-\x{ffff}0-9]+-?)*[a-z\x{00a1}-\x{ffff}0-9]+)*(?:\.(?:[a-z\x{00a1}-\x{ffff}]{2,})))(?::\d{2,5})?(?:/[^\s]*)?$_iuS
seems so much better than the one used by filter_var.
You can use filter_var('http://example.com', FILTER_VALIDATE_URL); to validate your URLs.
The PHP manual lists all of the validation filters you can use.
FILTER_VALIDATE_URL already requires a scheme such as http(s). If you also want to require a path, you can add a flag:
filter_var('http://example.com/', FILTER_VALIDATE_URL, FILTER_FLAG_PATH_REQUIRED)
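To make the effect of the flag concrete: FILTER_FLAG_PATH_REQUIRED asks for a path component, not for a scheme. The short sketch below compares the same host with and without a trailing slash; the sample URLs are only illustrative.

```php
<?php
// Same host, once without and once with a path component.
$withoutPath = filter_var('http://example.com', FILTER_VALIDATE_URL, FILTER_FLAG_PATH_REQUIRED);
$withPath    = filter_var('http://example.com/', FILTER_VALIDATE_URL, FILTER_FLAG_PATH_REQUIRED);

// filter_var() returns the input string on success and false on failure.
var_dump($withoutPath !== false); // no path, so the flag rejects it
var_dump($withPath !== false);    // "/" counts as a path
```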
Use this
<?php
$url = "http://something";
if (filter_var($url, FILTER_VALIDATE_URL) !== false && @fopen($url, "r")) {
echo("$url is a valid URL");
} else {
echo("$url is not a valid URL");
}
?>
Or use this
<?php
$url = 'http://example';
if(validateURL($url)){
echo "Valid";
}else{
echo "invalid";
}
function validateURL($URL) {
$pattern_1 = "/^(http|https|ftp):\/\/(([A-Z0-9][A-Z0-9_-]*)(\.[A-Z0-9][A-Z0-9_-]*)+.(com|org|net|dk|at|us|tv|info|uk|co.uk|biz|se)$)(:(\d+))?\/?/i";
$pattern_2 = "/^(www)((\.[A-Z0-9][A-Z0-9_-]*)+.(com|org|net|dk|at|us|tv|info|uk|co.uk|biz|se)$)(:(\d+))?\/?/i";
if(preg_match($pattern_1, $URL) || preg_match($pattern_2, $URL)){
return true;
} else{
return false;
}
}
?>
You can use following regular expression
$url = 'http://something.com';
$regex = "((https?|ftp)\:\/\/)?"; // SCHEME
$regex .= "([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?"; // User and Pass
$regex .= "([a-z0-9-.]*)\.([a-z]{2,3})"; // Host or IP
$regex .= "(\:[0-9]{2,5})?"; // Port
$regex .= "(\/([a-z0-9+\$_-]\.?)+)*\/?"; // Path
$regex .= "(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?"; // GET Query
$regex .= "(#[a-z_.-][a-z0-9+\$_.-]*)?"; // Anchor
if(preg_match("/^$regex$/", $url)) {
echo "Matched";
} else {
echo "Not Matched";
}

PHP check if url is valid

I wonder what would be the best way in php to check if provided url is valid... At first I tried with:
filter_var($url, FILTER_VALIDATE_URL) === false
But it does not accept www.example.com (without protocol). So I tried with a simple modification:
protected function checkReferrerUrl($url) {
if(strpos($url, '://') == false) {
$url = "http://".$url;
}
if(filter_var($url, FILTER_VALIDATE_URL) === false) {
return false;
}
return true;
}
Now it works fine with www.example.com, but it also accepts a bare foo, because that is converted to http://foo. I don't think that is a valid public URL, so what would you suggest? Going back to a traditional regexp?
I recommend that you do not use filter_var with the URL filter.
It has quite a few side effects.
For example, these are valid URLs according to filter_var:
http://example.com/"><script>alert(document.cookie)</script>
http://example.ee/sdsf"f
Additionally, FILTER_VALIDATE_URL does not support internationalized domain names (IDN).
For security reasons, I recommend using a regex combined with some additional checks afterwards (for example, on the domain).
When security is not a concern, I use parse_url to extract the parts. But that function has a similar issue when the scheme (http/https) is missing.
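A minimal sketch of the "regex plus a few ifs" idea could look like the following. It is an illustration under stated assumptions, not a complete validator: the function name isPlausiblePublicUrl is hypothetical, only http and https are allowed, and the host pattern accepts ASCII labels only, so IDNs (including punycode TLDs) are rejected.

```php
<?php
// Hypothetical helper (name and rules are assumptions, not from the thread).
function isPlausiblePublicUrl($url)
{
    // Reject whitespace, quotes and angle brackets outright.
    if (preg_match('/[\s"\'<>]/', $url)) {
        return false;
    }

    // parse_url() needs a scheme to locate the host, so add a default one.
    if (strpos($url, '://') === false) {
        $url = 'http://' . $url;
    }

    $parts = parse_url($url);
    if ($parts === false || empty($parts['scheme']) || empty($parts['host'])) {
        return false;
    }

    // Allow-list of schemes.
    if (!in_array(strtolower($parts['scheme']), array('http', 'https'), true)) {
        return false;
    }

    // Host: dot-separated labels followed by an alphabetic suffix of 2+ letters.
    $hostPattern = '/^(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z]{2,}$/i';
    return preg_match($hostPattern, $parts['host']) === 1;
}

var_dump(isPlausiblePublicUrl('www.example.com'));
var_dump(isPlausiblePublicUrl('foo'));
```

The early character check is what rejects the two filter_var examples above that contain a double quote.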
Use this
<?php
$url = 'www.example.com';
if(validateURL($url)){
echo "Valid";
}else{
echo "invalid";
}
function validateURL($URL) {
$pattern_1 = "/^(http|https|ftp):\/\/(([A-Z0-9][A-Z0-9_-]*)(\.[A-Z0-9][A-Z0-9_-]*)+.(com|org|net|dk|at|us|tv|info|uk|co.uk|biz|se)$)(:(\d+))?\/?/i";
$pattern_2 = "/^(www)((\.[A-Z0-9][A-Z0-9_-]*)+.(com|org|net|dk|at|us|tv|info|uk|co.uk|biz|se)$)(:(\d+))?\/?/i";
if(preg_match($pattern_1, $URL) || preg_match($pattern_2, $URL)){
return true;
} else{
return false;
}
}
?>
Try this one too
<?php
// Assign URL to $URL variable
$url = 'http://example.com';
// Check url using preg_match
if (preg_match("/^(https?:\/\/+[\w\-]+\.[\w\-]+)/i",$url)){
echo "Valid";
}else{
echo "invalid";
}
?>

Best Practice for Validating URLs

I have a form that collects several URLs from users, e.g. web address, Facebook address, Twitter address, Google+ address, etc. My problem is how to validate these URLs when the form is submitted. I have tried validating them in PHP with FILTER_VALIDATE_URL and with a plain regular expression.
I would like to know the best way to collect such URLs from users. Is it always a good idea to make them enter the protocol? They may not know whether it is http, https, ftp, ftps, etc., and I think that is hard for some users.
I tried something like this with FILTER_VALIDATE_URL, but it always requires a protocol, and I am sometimes confused about how it works:
// validate url
$url = 'http://www.example.com';
if (filter_var( $url, FILTER_VALIDATE_URL)){
echo "<br>valid";
} else {
echo "<br>invalid";
}
OUTPUT : valid
// validate url
$url = 'hp://www.example.com';
if (filter_var( $url, FILTER_VALIDATE_URL)){
echo "<br>valid";
} else {
echo "<br>invalid";
}
OUTPUT : valid
// validate url
$url = 'http://example.com';
if (filter_var( $url, FILTER_VALIDATE_URL)){
echo "<br>valid";
} else {
echo "<br>invalid";
}
OUTPUT : valid
// validate url
$url = 'http://example.com?id=32&name=kamalani';
if (filter_var( $url, FILTER_VALIDATE_URL)){
echo "<br>valid";
} else {
echo "<br>invalid";
}
OUTPUT : valid
Can you tell me the best way to collect URLs from users and how to validate them?
Any comments are greatly appreciated.
Thank you.
You need to use a regular expression to check for a valid URL here.
Please try this:
$pattern = "/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i";
$URL= 'http://example.com?id=32&name=kamalani';
if(preg_match($pattern, $URL) ){
echo "<br>valid";
} else{
echo "<br>invalid";
}
Output
valid
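The hp://www.example.com result in the question is explained by the fact that FILTER_VALIDATE_URL checks URL syntax, not whether the scheme is one you expect. One way to handle that, sketched here with a hypothetical function name isHttpUrl, is to keep filter_var and add an allow-list of schemes on top.

```php
<?php
// Hypothetical helper (name and allow-list are assumptions).
function isHttpUrl($url)
{
    // Syntax check first.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    // Then accept only the schemes we explicitly allow.
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    return in_array($scheme, array('http', 'https'), true);
}

var_dump(isHttpUrl('http://www.example.com'));
var_dump(isHttpUrl('hp://www.example.com'));
```

If users may omit the protocol, a default scheme can be prepended before calling this check.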

How to add http:// if it doesn't exist in the URL

How can I add http:// to a URL if it doesn't already include a protocol (e.g. http://, https:// or ftp://)?
Example:
addhttp("google.com"); // http://google.com
addhttp("www.google.com"); // http://www.google.com
addhttp("google.com"); // http://google.com
addhttp("ftp://google.com"); // ftp://google.com
addhttp("https://google.com"); // https://google.com
addhttp("http://google.com"); // http://google.com
addhttp("rubbish"); // http://rubbish
A modified version of @nickf's code:
function addhttp($url) {
if (!preg_match("~^(?:f|ht)tps?://~i", $url)) {
$url = "http://" . $url;
}
return $url;
}
Recognizes ftp://, ftps://, http:// and https:// in a case insensitive way.
At the time of writing, none of the answers used a built-in function for this:
function addScheme($url, $scheme = 'http://')
{
return parse_url($url, PHP_URL_SCHEME) === null ?
$scheme . $url : $url;
}
echo addScheme('google.com'); // "http://google.com"
echo addScheme('https://google.com'); // "https://google.com"
See also: parse_url()
Simply check if there is a protocol (delineated by "://") and add "http://" if there isn't.
if (false === strpos($url, '://')) {
$url = 'http://' . $url;
}
Note: This may be a simple and straightforward solution, but Jack's answer using parse_url is almost as simple and much more robust. You should probably use that one.
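One reason the plain strpos check is fragile is that "://" can appear later in the string, for example inside a query parameter. The sketch below contrasts it with a pattern anchored at the start of the string. Both function names are hypothetical, and the anchored pattern accepts any syntactically valid scheme, not just http or ftp.

```php
<?php
// The strpos approach from above, wrapped in a function for comparison.
function addSchemeNaive($url)
{
    return strpos($url, '://') === false ? 'http://' . $url : $url;
}

// Hypothetical alternative: only a scheme at the very start of the string counts.
function addDefaultScheme($url, $scheme = 'http://')
{
    return preg_match('~^[a-z][a-z0-9+.-]*://~i', $url) ? $url : $scheme . $url;
}

$input = 'example.com/?next=http://other.example';

echo addSchemeNaive($input), "\n";   // left unchanged, because "://" occurs in the query
echo addDefaultScheme($input), "\n"; // gets the http:// prefix
```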
The best answer for this would be something like this:
function addhttp($url, $scheme="http://" )
{
return $url = empty(parse_url($url)['scheme']) ? $scheme . ltrim($url, '/') : $url;
}
The protocol is flexible, so the same function can be used with ftp, https, etc.
Scan the string for ://. If it does not have it, prepend http:// to the string... Everything else just use the string as is.
This will work unless you have a rubbish input string.
Try this. It is not watertight [1], but it might be good enough:
function addhttp($url) {
if (!preg_match("#^[hf]tt?ps?://#", $url)) {
$url = "http://" . $url;
}
return $url;
}
1. That is, prefixes like "fttps://" are treated as valid.
nickf's solution modified:
function addhttp($url) {
if (!preg_match("#^https?://#i", $url) && !preg_match("#^ftps?://#i", $url)) {
$url = "http://" . $url;
}
return $url;
}
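<?php
// Prepend http:// unless the submitted URL already starts with http(s):// or ftp(s)://.
// The original pattern /^(http|ftp):/ did not match "https:", so https URLs were prefixed again.
$url = isset($_POST['url']) ? trim($_POST['url']) : '';
if ($url !== '' && !preg_match("~^(?:f|ht)tps?://~i", $url)) {
    $url = 'http://' . $url;
}
?>
This code will add http:// to the URL if it's not there.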
