validate url in php - php

I have the following seemingly simple PHP code, but the problem is that it reports every valid link as "not valid". Any help appreciated:
<?php
$m = "urllist.txt";
$n = fopen($m, "r");
while (!feof($n)) {
$l = fgets($n);
if (filter_var($l, FILTER_VALIDATE_URL) === FALSE) {
echo "NOT VALID - $l<br>";
} else {
echo "VALID - $l<br>";
}
}
fclose($n);
?>

The string returned by fgets() contains a trailing newline character that needs to be trimmed before you can validate it. Try the following code:
<?php
$m = "urllist.txt";
$n = fopen($m, "r");
while (!feof($n)) {
$l = fgets($n);
if(filter_var(trim($l), FILTER_VALIDATE_URL)) {
echo "VALID - $l<br>";
} else {
echo "NOT VALID - $l<br>";
}
}
fclose($n);
?>
I tested it with the following URLs:
http://stackoverflow.com/
https://www.google.co.in/
https://www.google.co.in/?gfe_rd=cr&ei=bf4HVLOmF8XFoAOg_4HoCg&gws_rd=ssl
www.google.com
http://www.example.com
example.php?name=Peter&age=37
and got the following result:
VALID - http://stackoverflow.com/
VALID - https://www.google.co.in/
VALID - https://www.google.co.in/?gfe_rd=cr&ei=bf4HVLOmF8XFoAOg_4HoCg&gws_rd=ssl
NOT VALID - www.google.com
VALID - http://www.example.com
NOT VALID - example.php?name=Peter&age=37
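As a variation on the loop above, here is a minimal sketch that reads the whole file with file() and trims every line before validating it. The helper name classify_urls is an assumption made for illustration; urllist.txt is the file from the question.

```php
<?php
// Hypothetical helper: maps each non-empty line to true (valid URL) or false.
function classify_urls(array $lines): array
{
    $result = [];
    foreach ($lines as $line) {
        $url = trim($line);          // drop the trailing "\n" or "\r\n"
        if ($url === '') {
            continue;                // skip blank lines
        }
        $result[$url] = filter_var($url, FILTER_VALIDATE_URL) !== false;
    }
    return $result;
}

// Usage with the file from the question (skipped if the file is missing)
$file = 'urllist.txt';
if (is_readable($file)) {
    $lines = file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    foreach (classify_urls($lines) as $url => $ok) {
        echo ($ok ? 'VALID' : 'NOT VALID') . ' - ' . htmlspecialchars($url) . "<br>";
    }
}
```

Comparing against false explicitly avoids relying on the truthiness of the string that filter_var() returns on success.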

Maybe each line ends with a newline character ('\n').
I think you can just trim $l before validating it, like this:
filter_var(trim($l), FILTER_VALIDATE_URL) !== FALSE
Note that filter_var() returns the filtered string on success, never TRUE, so compare against FALSE. Maybe this will help you.

Please try the different flags available to see where it fails:
- FILTER_FLAG_SCHEME_REQUIRED - requires the URL to be an RFC-compliant URL (like http://example)
- FILTER_FLAG_HOST_REQUIRED - requires the URL to include a host name (like http://www.example.com)
- FILTER_FLAG_PATH_REQUIRED - requires the URL to have a path after the domain name (like www.example.com/example1/test2/)
- FILTER_FLAG_QUERY_REQUIRED - requires the URL to have a query string (like "example.php?name=Peter&age=37")
(copied from http://www.w3schools.com/php/filter_validate_url.asp)
You can try a good old regex too:
if (!preg_match("/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i",$url))

Try this code. I have tested it and it works.
<?php
$m = "urllist.txt";
$n = fopen($m, "r");
while (!feof($n)) {
$l = fgets($n);
if (filter_var(trim($l), FILTER_VALIDATE_URL)) {
echo "URL is valid";
} else {
echo "URL is not valid";
}
}
fclose($n);
?>

Related

recognize any type of link in content with php

I was trying to write code that recognizes links in content, and I found this snippet: filter_var($url, FILTER_VALIDATE_URL)
But I realized it only recognizes a URL at the start of the text content. When the URL sits in the middle of the text, filter_var() cannot recognize it, so I exploded the content into an array and changed the code to:
$words = explode("\n", $text);
$eleman = sizeof($words);
for ($i=0; $i <= $eleman ; $i++) {
if (filter_var($words[$i], FILTER_VALIDATE_URL)) {
echo "yes it is a link...";
return TRUE;
}else{
echo "no it isnt link...";
return FALSE;
}
}
But it still doesn't recognize URLs in the content.
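Two likely reasons the loop above fails: explode("\n", ...) splits on line breaks rather than on words, and the return statements end the loop after the first element. Below is a minimal sketch that splits on any whitespace and collects every token accepted by filter_var(). The function name find_urls is an assumption for illustration, and tokens with attached punctuation are not cleaned up here.

```php
<?php
// Hypothetical helper: returns every whitespace-separated token that is a valid URL.
function find_urls(string $text): array
{
    // Split on runs of whitespace (spaces, tabs, newlines) and drop empty pieces.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    $urls = [];
    foreach ($words as $word) {
        if (filter_var($word, FILTER_VALIDATE_URL) !== false) {
            $urls[] = $word;   // keep going instead of returning on the first word
        }
    }
    return $urls;
}

$text = "Visit http://www.example.com today\nor read the docs.";
print_r(find_urls($text));
```

Returning the list of matches, rather than TRUE or FALSE from inside the loop, lets the caller decide whether "at least one link" or "every link" is what matters.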

Validate the path of a url in PHP

In PHP, I'm trying to validate the path of a URL with a regex.
The current regex that I have tested is this one:
^(\/\w+)+\.\w+(\?(\w+=[\w\d]+(&\w+=[\w\d]+)+)+)*$
public function isValidPath($urlPath)
{
if (!preg_match("#^(\/\w+)+\.\w+(\?(\w+=[\w\d]+(&\w+=[\w\d]+)+)+)*$#i", $urlPath)) { return false; }
else { return true; }
}
$arrUrl = parse_url($url);
$urlPath = $arrUrl['path'];
// valid path ?
if(isValidPath($urlPath)) { echo "OK"; }
else { echo "Invalid Path URL"; }
But it doesn't work with paths that just start with /. The expected results are:
- / -> valid path
- /aaa -> valid path
- /aaa/bbb -> valid path
- /aaa?q=x -> valid path
- aaa -> Not valid path
- /asd/asd./jsp -> Not valid path
- /asd/asd.jsp/ -> Not valid path
- /asd./asd.jsp -> Not valid path
- /asd///asd.js -> Not valid path
- /asd/asd.jsp&bar=baz?inga=42?quux -> Not valid path
I'm not a regex expert and I'm breaking my head trying to do one that seems very simple.
Here you go:
^\/(?!.*\/$)(?!.*[\/]{2,})(?!.*\?.*\?)(?!.*\.\/).*
Sample function:
function validateUrl($url){
if (preg_match('%^/(?!.*\/$)(?!.*[\/]{2,})(?!.*\?.*\?)(?!.*\.\/).*%im', $url)) {
return true;
} else {
return false;
}
}
I've used some negative lookaheads that exclude certain patterns.
It matches only the "valid paths" you specified.
I use @cmorrissey's approach, which does not require a regex:
$result = filter_var('http://www.example.com' . $path, FILTER_VALIDATE_URL);
if ($result !== false) {
$result = true;
}
$result is then true or false depending on path validity. Note that paths should always start with a /; otherwise they are only part of a path rather than a complete path.

How to echo/print only a certain piece of input?

I want to echo/print only a certain piece of input. For example, I have this YouTube URL: http://www.youtube.com/watch?v=p963CeTtJVM. How can I echo only the last piece, "p963CeTtJVM", from the input? As far as I know, it is always 11 characters.
Code:
if (empty($_POST["website"]))
{$website = "";}
else
{
$website = test_input($_POST["website"]);
// check if URL address syntax is valid (this regular expression also allows dashes in the URL)
if (!preg_match("/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i",$website))
{
$websiteErr = "Invalid URL";
}
}
list($void, $query_string) = explode('?', $url, 2); // split() was removed in PHP 7; or list(, $query_string)
parse_str($query_string, $data);
var_dump($data);
For this specific string, substr($str, -11) will take the last 11 characters, but that breaks as soon as the URL carries other parameters. Check out parse_str(); it will probably save you a headache in the long run.
I hope it can help you.
<?php
$url = 'http://www.youtube.com/watch?v=p963CeTtJVM';
$urlParts = explode('v=', $url);
if (count($urlParts) == 2 && isset($urlParts[1])) {
echo "youtube code : {$urlParts[1]}";
} else {
echo "Invalid Youtube url.";
}
You can use substr method to return part of a string.
You can use the explode function to separate the video ID from the rest of the link like this:
$array = explode("=", $website);
echo $array[1];
This parses the URL into its component parts, then parses the query string into an associative array.
$url = parse_url($url);
parse_str($url['query'], $params);
$v = $params['v'];
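The parse_url()/parse_str() approach can be wrapped in a small function that also copes with a missing query string or a missing v parameter. This is a minimal sketch; the function name youtube_id is an assumption for illustration.

```php
<?php
// Hypothetical helper: returns the "v" query parameter, or null if it is absent.
function youtube_id(string $url): ?string
{
    $query = parse_url($url, PHP_URL_QUERY);   // e.g. "v=p963CeTtJVM"
    if (!is_string($query)) {
        return null;                           // no query string, or unparsable URL
    }
    parse_str($query, $params);                // query string -> associative array
    $v = $params['v'] ?? null;
    return (is_string($v) && $v !== '') ? $v : null;
}

echo youtube_id('http://www.youtube.com/watch?v=p963CeTtJVM'), "\n";
```

Unlike substr($str, -11) or explode('v=', ...), this keeps working when other parameters come before or after v in the URL.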

Check if a user entered an email address that has a domain similar to the domain name they enter above

In my signup form, I ask users to enter an email with the same domain name as they enter in the url field above.
Right now, I collect data this way:
URL : http://www.domain.com The domain.com part is what the user enters. The http://www is hard coded.
Email : info@domain.com The info and domain.com parts are entered by the user. The @ is hard coded.
The domain.com part in the url and domain.com part in the email should match. Right now, I can match the two fields since they are separate.
But I want to give up the above approach and let the user enter the entire domain name and email. When that's the case, what would be a good way to check whether a user entered an email with the same domain they entered in the URL field above?
I'm doing all this using php.
<?php
//extract domain from email
$email_domain_temp = explode("@", $_POST['email']);
$email_domain = $email_domain_temp[1];
//extract domain from url
$url_domain_temp = parse_url($_POST['url']);
$url_domain = strip_out_subdomain($url_domain_temp['host']);
//compare
if ($email_domain == $url_domain){
//match
}
function strip_out_subdomain($domain){
//do nothing if only 1 dot in $domain
if (substr_count($domain, ".") == 1){
return $domain;
}
$only_my_domain = preg_replace("/^(.*?)\.(.*)$/","$2",$domain);
return $only_my_domain;
}
So what this does is :
First, split the email string in 2 parts in an array. The second part is the domain.
Second, use PHP's built-in parse_url() function to parse the URL, then extract the "host" while removing the (optional) subdomain.
Then compare.
You can do this with explode().
Suppose $url = 'bla@gmail.com':
$pieces = explode("@", $url);
$new = $pieces[1]; // which will be gmail.com
Now explode again:
$newpc = explode(".", $new);
$new1 = $newpc[0]; // which will be gmail
This is my version (tested, works):
<?php
$domain = 'www2.example.com'; // Set domain here
$email = 'info@example.com'; // Set email here
if(!preg_match('~^https?://.*$~i', $domain)) { // Does the URL start with http?
$domain = "http://$domain"; // No, prepend it with http://
}
if(filter_var($domain, FILTER_VALIDATE_URL)) { // Validate URL
$host = parse_url($domain, PHP_URL_HOST); // Parse the host, if it is an URL
if(substr_count($host, '.') > 1) { // Is there a subdomain?
$host = substr($host, -strrpos(strrev($host), '.')); // Get the host
}
if(strpos(strrev($email), strrev($host)) === 0) { // Does it match the end of the email?
echo 'Valid!'; // Valid
} else {
echo 'Does not match.'; // Invalid
}
} else {
echo 'Invalid domain!'; // Domain is invalid
}
?>
you could do:
$parsedUrl = parse_url($yourEnteredUrl);
$domainHost = str_replace("www.", "", $parsedUrl["host"]);
$emailParts = explode('@', $yourEnteredEmail);
$emailDomain = array_pop($emailParts); // array_pop() takes its argument by reference, so it needs a variable
if( $emailDomain == $domainHost ) {
//valid data
}
$email = 'myemail@example.com';
$site = 'http://example.com';
$emailDomain = ltrim( strstr($email, '@'), '@' );
// or automate it using array_map(). Syntax is correct only for >= PHP5.4
$cases = ['http://'.$emailDomain, 'https://'.$emailDomain, 'http://www.'.$emailDomain, 'https://www.'.$emailDomain];
$bSameDomain = in_array($site, $cases);
var_dump($bSameDomain);
Use regular expressions with positive lookbehinds (i.e. only return the expression I'd like to match if it is preceded by a certain pattern, but don't include the lookbehind itself in the match), like so:
<?php
$url = preg_match("/(?<=http:\/\/www\.).*/",$_POST['url'],$url_match);
$email = preg_match("/(?<=@).*/",$_POST['email'],$email_match);
if ($url_match[0]==$email_match[0]) {
// Success Code
}
else {
// Failure Code
}
?>
Of course this is a bit oversimplified, as you also need to account for https or www2 and the like, but these require only minor changes to the regex, using the question mark as the "optional" operator.
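To tie the answers above together, here is a minimal sketch that compares the URL's host with the part of the email after the @. The function name same_domain is an assumption for illustration. It only ignores a leading "www." and letter case; other subdomains such as www2 are not stripped and would need one of the approaches shown above.

```php
<?php
// Hypothetical helper: true if the email's domain equals the URL's host
// (ignoring case and a leading "www.").
function same_domain(string $url, string $email): bool
{
    // parse_url() only reports a host when a scheme is present, so add one if missing.
    if (!preg_match('~^https?://~i', $url)) {
        $url = 'http://' . $url;
    }
    $host = parse_url($url, PHP_URL_HOST);
    $at = strrpos($email, '@');                 // position of the last @
    if (!is_string($host) || $at === false) {
        return false;
    }
    $host = preg_replace('/^www\./i', '', $host);
    $emailDomain = (string) substr($email, $at + 1);
    return $emailDomain !== '' && strcasecmp($host, $emailDomain) === 0;
}

var_dump(same_domain('http://www.example.com', 'info@example.com'));
var_dump(same_domain('example.com', 'info@other.org'));
```

Comparing the two domains for equality, rather than checking that the email merely ends with the host, avoids accepting an address like info@notexample.com for example.com.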

Remove parts of a string with PHP

I have an input box that tells users to enter a link from imgur.com.
I want a script to check that the link is for the specified site, but I'm not sure how to do it.
The links are as follows: http://i.imgur.com/He9hD.jpg
Please note that after the /, the text may vary e.g. not be a jpg but the main domain is always http://i.imgur.com/.
Any help appreciated.
Thanks, Josh.(Novice)
Try parse_url()
try {
if (!preg_match('#^(https?|ftp)://#', $_POST['url']) AND !substr_count($_POST['url'], '://')) {
// Handle URLs that do not have a scheme
$url = sprintf("%s://%s", 'http', $_POST['url']);
} else {
$url = $_POST['url'];
}
$input = parse_url($url);
if (!$input OR !isset($input['host'])) {
// Either the parsing has failed, or the URL was not absolute
throw new Exception("Invalid URL");
} elseif ($input['host'] != 'i.imgur.com') {
// The host does not match
throw new Exception("Invalid domain");
}
// Prepend URL with scheme, e.g. http://domain.tld
$host = sprintf("%s://%s", $input['scheme'], $input['host']);
} catch (Exception $e) {
// Handle error
}
substr($input, 0, strlen('http://i.imgur.com/')) === 'http://i.imgur.com/'
Check this, using stripos
if(stripos(trim($url), "http://i.imgur.com")===0){
// the link is from imgur.com
}
Try this:
<?php
if(preg_match('#^http\:\/\/i\.imgur\.com\/#', $_POST['url']))
echo 'Valid img!';
else
echo 'Img not valid...';
?>
Where $_POST['url'] is the user input.
I haven't tested this code.
$url_input = $_POST['input_box_name'];
if ( strpos($url_input, 'http://i.imgur.com/') !== 0 )
...
Several ways of doing it.. Here's one:
if ('http://i.imgur.com/' == substr($link, 0, 19)) {
...
}
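The prefix checks above compare raw strings. As an alternative, here is a minimal sketch that compares the parsed host instead. The function name is_imgur_link is an assumption for illustration, and accepting https as well as http is an assumption that goes beyond the question, which only mentions http://i.imgur.com/.

```php
<?php
// Hypothetical helper: true if $link is an http(s) URL whose host is exactly i.imgur.com.
function is_imgur_link(string $link): bool
{
    $parts = parse_url(trim($link));
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return false;   // unparsable, or not an absolute URL
    }
    return in_array(strtolower($parts['scheme']), ['http', 'https'], true)
        && strtolower($parts['host']) === 'i.imgur.com';
}

var_dump(is_imgur_link('http://i.imgur.com/He9hD.jpg'));
var_dump(is_imgur_link('http://i.imgur.com.evil.example/x'));
```

Comparing the host exactly rejects look-alike hosts such as i.imgur.com.evil.example, which a prefix test against "http://i.imgur.com" (without the trailing slash) would accept.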
