parsing url - php, check if scheme exists, validate url regex - php

Is this a good way to validate a posted URL?
if (filter_var($_POST['url'], FILTER_VALIDATE_URL)){
echo "valid url";
}else{
echo "invalid url";
}
This is what I wrote to start with, as I could show multiple error messages with it:
function validateURL($url)
{
$pattern = '/^(([\w]+:)?\/\/)?(([\d\w]|%[a-fA-F\d]{2,2})+(:([\d\w]|%[a-fA-F\d]{2,2})+)?@)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?(\/([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)*(\?(&?([-+_~.\d\w]|%[a-fA-F\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)?$/';
return preg_match($pattern, $url);
}
$result = validateURL($_POST['url']);
if ($result == "1"){
$scheme = parse_url($_POST['url'], PHP_URL_SCHEME);
if (isset($scheme)){
echo $scheme . "://" . parse_url($_POST['url'], PHP_URL_HOST);
}else{
echo "error you did not enter http://";
}
}else{
echo "your url is not a valid format";
}

I'd simply go for the built-in FILTER_VALIDATE_URL and use a generic error message like:
Invalid URL. Please remember to input http:// as well.
If you're nice you could check if the first 7/8 letters are http:// or https:// and prepend them if not.
Coming up with and maintaining such a RegEx is not something you should get into if the problem is already solved. There's also usually no need to be any more detailed in the error message, unless you're in the business of explaining URL formats.
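A minimal sketch of the "prepend it if it is missing" idea, assuming that defaulting to http:// is acceptable for your form. The function name normalizeUrl is made up for this example and is not part of PHP:

```php
<?php
// Hypothetical helper: prepend "http://" when no http(s) scheme is present,
// then validate with the built-in filter.
// Returns the normalized URL, or null when it still does not validate.
function normalizeUrl(string $input): ?string
{
    $url = trim($input);
    if ($url === '') {
        return null;
    }
    // Case-insensitive check for a leading http:// or https://
    if (!preg_match('#^https?://#i', $url)) {
        $url = 'http://' . $url;
    }
    return filter_var($url, FILTER_VALIDATE_URL) !== false ? $url : null;
}

var_dump(normalizeUrl('www.example.com'));       // scheme gets prepended
var_dump(normalizeUrl('https://example.com/a')); // already has a scheme
var_dump(normalizeUrl('not a url'));             // spaces make it invalid
```

With this approach the generic error message is only needed for input that fails even after the scheme has been added.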

Have you checked this out Kyle, http://phpcentral.com/208-url-validation-in-php.html

I think a simple
filter_var($var, FILTER_VALIDATE_URL)
plus a strpos() check on the protocol is enough, because a user who wants to can still give you a well-formed URL that does not exist.
Of course you could also check that the domain exists and returns a valid HTTP status, but I think that is overkill.

Related

PHP Check for HTTP and HTTPS in the submitted URL through FORM

I have this code which checks for http:// in the URL submitted. But I want it to also check for https://. So I tried with an or in the if condition but it still checks only for http:// and not https://.
Here is my code.
if(!preg_match("#^http://#i",$turl) or !preg_match("#^https://#i",$turl)){
$msg = "<div class='alert alert-danger'>Invalid Target URL! Please input a standard URL with <span class='text-info'>http://</span> for example <span class='text-info'>http://www.kreatusweb.com</span> </div>";
}
If I now put https:// in the URL and submit, it still returns this error message as now http:// is false here. What logic or code should I use here to check for both. I just don't want users to submit www.somewebsite.com. I want them to submit full URL using either http:// or https://. If either of these two exists in the URL then only the form will be processed further.
You can simplify the regex so the s is optional by just adding a ? after it.
if(!preg_match("#^https?://#i",$turl)){
Replace the or with &&:
if(!preg_match("#^http://#i",$turl) && !preg_match("#^https://#i",$turl))
I used to make this logic mistake when I started coding, because you think of it as if (not something or not something else).
But if (!http || !https) is true for both http and https URLs, because:
1- if it is http, then the !https part is true
2- if it is https, then the !http part is true too
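The difference between the two conditions can be seen by running both against a few inputs. This is only an illustrative sketch; the sample URLs and the helper name hasHttpScheme are made up for the example:

```php
<?php
// Hypothetical helper: true when $url starts with http:// or https://.
function hasHttpScheme(string $url): bool
{
    return preg_match('#^https?://#i', $url) === 1;
}

foreach (['http://a.test', 'https://a.test', 'www.a.test'] as $url) {
    $notHttp  = !preg_match('#^http://#i', $url);
    $notHttps = !preg_match('#^https://#i', $url);

    // "or": true for every input, since no string starts with both prefixes.
    $rejectWithOr = $notHttp || $notHttps;
    // "&&": true only when both prefixes are missing.
    $rejectWithAnd = $notHttp && $notHttps;

    printf(
        "%-16s or-rejects=%s and-rejects=%s single-regex-rejects=%s\n",
        $url,
        var_export($rejectWithOr, true),
        var_export($rejectWithAnd, true),
        var_export(!hasHttpScheme($url), true)
    );
}
```

The && version and the single https? regex agree on every input, while the or version rejects everything.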
Check out the PHP validate filters at http://php.net/manual/en/filter.filters.validate.php.
<?php
$arr = [ 'http:example.com','https:/example.com','https://www.example.com','http://example.com',
'ftp://example.com','www.example.com','www.example.com/test.php','https://www.example.com/test.php?q=6'];
foreach ($arr as $str) {
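// FILTER_FLAG_SCHEME_REQUIRED and FILTER_FLAG_HOST_REQUIRED were deprecated in PHP 7.3 and removed in PHP 8.0; FILTER_VALIDATE_URL already implies both.
$filtered = filter_var($str,FILTER_VALIDATE_URL);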
if (!empty($filtered)) {
if (stripos($filtered,'http') === 0) {
echo $str.' is valid'.PHP_EOL;
} else {
echo $str.' is a valid URL, but not HTTP'.PHP_EOL;
}
} else {
echo $str.' is not a valid URL'.PHP_EOL;
}
}

URL validation must contain http or https

I have searched multiple websites trying to fix this issue. The problem is that I am asking the user to enter a website address and, as people say, you should never trust user input.
So, possible scenario can be like this:
https or http://www.google.com
https or http://google.com
www.google.com
google.com
I want the URL to always look like this: http://www.google.com or https://www.google.com
At the moment I have the code below, but it is not working as expected.
$url = "www.google.com";
if (preg_match("/\b(?:(?:https?):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i", $url)) {
echo "URL is valid";
}
else {
echo "URL is invalid";
}
Check if the start of the string contains http which also includes https AND check if it's a valid URL:
if((strpos($url, 'http') === 0) && filter_var($url, FILTER_VALIDATE_URL)) {
echo "URL is valid";
} else {
echo "URL is invalid";
}
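Note that strpos($url, 'http') === 0 also matches any other scheme that merely begins with "http". A slightly stricter sketch, assuming only http and https should be accepted, compares the parsed scheme instead. The function name isHttpUrl is made up for this example:

```php
<?php
// Hypothetical helper: valid per FILTER_VALIDATE_URL and scheme is exactly http or https.
function isHttpUrl(string $url): bool
{
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }
    // parse_url() returns null for a missing component and false on a parse failure.
    $scheme = parse_url($url, PHP_URL_SCHEME);
    return is_string($scheme)
        && in_array(strtolower($scheme), ['http', 'https'], true);
}

foreach (['https://www.google.com', 'http://google.com', 'www.google.com', 'google.com'] as $url) {
    echo $url, ': ', isHttpUrl($url) ? 'URL is valid' : 'URL is invalid', PHP_EOL;
}
```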
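Try this expression:
/[-a-zA-Z0-9@:%_\+.~#?&//=]{2,256}\.[a-z]{2,4}\b(\/[-a-zA-Z0-9@:%_\+.~#?&//=]*)?/gi
It will accept all the cases that you have mentioned above. It is written as a JavaScript-style literal; PHP's preg functions have no g modifier, so drop the g and keep only the i when using it with preg_match().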

Why does FILTER_VALIDATE_URL consider this to be a valid URL?

For some reason that I don't understand, FILTER_VALIDATE_URL says the following URL is valid:
http://ghjfgh
Don't all valid URLs contain at least one period? I've never seen a TLD that didn't come after one. So why does PHP say it's valid?
Here's the code. You can quickly run it on phpfiddle.org for yourself:
<?php
$URL = "http://ghjfgh";
if($URL != "" && !filter_var($URL, FILTER_VALIDATE_URL)) {
$error = "Please enter a valid URL";
} else {
$error = "All good";
}
echo $error;
?>
It filters according to RFC 2396, and http://ghjfgh is valid according to that spec. An easy example would be http://localhost, which is obviously valid (as @johnconde pointed out in the comments).
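If dot-less hosts should be rejected for your use case, one option is to add a host check on top of the filter. This is only a sketch under that assumption: the function name hasDottedHost is made up, and the check does not verify that the TLD actually exists.

```php
<?php
// Hypothetical helper: valid per FILTER_VALIDATE_URL and the host contains a dot.
// This rejects http://localhost and http://ghjfgh, which may or may not be what you want.
function hasDottedHost(string $url): bool
{
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }
    $host = parse_url($url, PHP_URL_HOST);
    return is_string($host) && strpos($host, '.') !== false;
}

foreach (['http://ghjfgh', 'http://localhost', 'http://example.com'] as $url) {
    echo $url, ': ', hasDottedHost($url) ? 'All good' : 'Please enter a valid URL', PHP_EOL;
}
```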

Best Practice for Validating URLs

I have a form that collects several URLs from users, e.g. web address, Facebook address, Twitter address, Google+ address, etc. My problem is how to validate these URLs when the form is submitted. I have tried validating them in PHP with FILTER_VALIDATE_URL and with a plain regular expression.
I would like to know the best way to collect such URLs from users. Is it always a good idea to make them enter the protocol? Sometimes they may not know whether it is http, https, ftp, ftps, etc., and I think that is hard for some users.
I tried something like this using FILTER_VALIDATE_URL, but it always requires a protocol, and I am sometimes confused about how it works.
// validate url
$url = 'http://www.example.com';
if (filter_var( $url, FILTER_VALIDATE_URL)){
echo "<br>valid";
} else {
echo "<br>invalid";
}
OUTPUT : valid
// validate url
$url = 'hp://www.example.com';
if (filter_var( $url, FILTER_VALIDATE_URL)){
echo "<br>valid";
} else {
echo "<br>invalid";
}
OUTPUT : valid
// validate url
$url = 'http://example.com';
if (filter_var( $url, FILTER_VALIDATE_URL)){
echo "<br>valid";
} else {
echo "<br>invalid";
}
OUTPUT : valid
// validate url
$url = 'http://example.com?id=32&name=kamalani';
if (filter_var( $url, FILTER_VALIDATE_URL)){
echo "<br>valid";
} else {
echo "<br>invalid";
}
OUTPUT : valid
Can you tell me the best ways to get URLs from users and how to validate them?
Any comments are greatly appreciated.
Thank you.
You need to use a regular expression to check for a valid URL here.
Please try this:
$pattern = "/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i";
$URL= 'http://example.com?id=32&name=kamalani';
if(preg_match($pattern, $URL) ){
echo "<br>valid";
} else{
echo "<br>invalid";
}
Output
valid

Regular expression for validation of a facebook page url

I need to validate a Facebook page URL, and the check should pass whether or not http/https or www is given.
I mean the following should be accepted or valid:
www.facebook.com/ABCDE
facebook.com/ABCDE
http://www.facebook.com/ABCDE
https://www.facebook.com/ABCDE
And following should not be accepted or invalid:
http://www.facebook.com/ => User name/page name not given
http://www.facebook.com/ABC => User name/page name should have the minimum length of 5.
For the above requirement I wrote the following regular expression, but it does not enforce the user name or page name, which is the only problem. The rest works fine:
/^(https?:\/\/)?((w{3}\.)?)facebook.com\/(([a-z\d.]{5,})?)$/
I am very new to regular expressions, so I don't have much idea about them.
Any help would be appreciated.
Thanks in advance.
parse_url() can help you with that.
<?php
$array = array(
    "www.facebook.com/ABCDE",
    "facebook.com/ABCDE",
    "http://www.facebook.com/ABCDE",
    "https://www.facebook.com/ABCDE",
    "http://www.facebook.com/",
    "http://www.facebook.com/ABC"
);
foreach ($array as $link) {
    // parse_url() needs a scheme to split host and path reliably, so add one if it is missing.
    if (!preg_match("#^https?://#i", $link)) {
        $link = "http://" . $link;
    }
    $url = parse_url($link);
    $host = $url["host"] ?? "";
    $path = $url["path"] ?? ""; // "path" is absent when the URL has no path at all
    // Anchor the pattern so that hosts such as notfacebook.com do not match.
    if (!preg_match("/^(www\.)?facebook\.com$/i", $host)) {
        // Not a facebook URL
        echo "FALSE!";
    }
    elseif (strlen(trim($path, "/")) < 5) {
        // Path (surrounding slashes not included) is shorter than 5 characters
        echo "FALSE!";
    }
    else {
        // None of the above
        echo "TRUE";
    }
    echo "<br>";
}
Try this one (have not tested it, should work)
'~^(https?://)?(www\.)?facebook\.com/\w{5,}$~i'
\w is like [a-zA-Z0-9_]
Robert
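A quick way to check the pattern above against the URLs from the question (a sketch only; the variable names are made up, and note that \w does not allow dots in the page name):

```php
<?php
$pattern = '~^(https?://)?(www\.)?facebook\.com/\w{5,}$~i';

// Sample URLs taken from the question.
$samples = [
    'www.facebook.com/ABCDE',
    'facebook.com/ABCDE',
    'http://www.facebook.com/ABCDE',
    'https://www.facebook.com/ABCDE',
    'http://www.facebook.com/',
    'http://www.facebook.com/ABC',
];

$results = [];
foreach ($samples as $sample) {
    // preg_match() returns 1 on a match and 0 otherwise.
    $results[$sample] = preg_match($pattern, $sample) === 1;
    printf("%-32s %s\n", $sample, $results[$sample] ? 'valid' : 'invalid');
}
```

The first four samples match and the last two do not, which is what the question asks for.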
