How to convert random domain names into lowercase consistent urls? - php

I have this function in a class:
protected $supportedWebsitesUrls = ['www.youtube.com', 'www.vimeo.com', 'www.dailymotion.com'];
protected function isValid($videoUrl)
{
$urlDetails = parse_url($videoUrl);
if (in_array($urlDetails['host'], $this->supportedWebsitesUrls))
{
return true;
} else {
throw new \Exception('This website is not supported yet!');
return false;
}
}
It basically extracts the host name from any random url and then checks if it is in the $supportedWebsitesUrls array to ensure that it is from a supported website. But if I add say: dailymotion.com instead of www.dailymotion.com it won't detect that url. Also if I try to do WWW.DAILYMOTION.COM it still won't work. What can be done? Please help me.

You can use the preg_grep function for this. preg_grep matches a regex against each element of a given array and returns the elements that match.
Sample use:
$supportedWebsitesUrls = array('www.dailymotion.com', 'www.youtube.com', 'www.vimeo.com');
$s = 'DAILYMOTION.COM';
if ( empty(preg_grep('/' . preg_quote($s, '/') . '/i', $supportedWebsitesUrls)) )
echo "This website is not supported yet!\n";
else
echo "found a match\n";
Output:
found a match

You can run a few checks on it.
For lower case vs. upper case, the PHP function strtolower() will sort you out.
As for matching the host with or without the www. at the beginning, you can add an extra check to your if clause:
$host = strtolower($urlDetails['host']);
if (in_array($host, $this->supportedWebsitesUrls) || in_array('www.' . $host, $this->supportedWebsitesUrls))
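A minimal sketch of the same idea done by normalizing the host once, assuming the supported list is stored as bare lowercase domains without the www. prefix. The function name isSupportedHost and its parameters are hypothetical, not part of the original class:

```php
<?php
// Hypothetical helper: lowercases the host and strips a leading "www."
// before comparing it with a list of bare domains.
function isSupportedHost($videoUrl, array $supportedDomains)
{
    $host = parse_url($videoUrl, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return false; // no host found, e.g. the URL has no scheme
    }
    $host = strtolower($host);
    if (strpos($host, 'www.') === 0) {
        $host = substr($host, 4); // drop the "www." prefix
    }
    return in_array($host, $supportedDomains, true);
}

$supported = array('youtube.com', 'vimeo.com', 'dailymotion.com');
var_dump(isSupportedHost('http://WWW.DAILYMOTION.COM/video/x1', $supported)); // bool(true)
var_dump(isSupportedHost('https://dailymotion.com/video/x1', $supported));    // bool(true)
var_dump(isSupportedHost('https://example.com/video/x1', $supported));        // bool(false)
```

Normalizing first means each supported site appears in the list only once, rather than once per spelling.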

Related

Create URL with only A-Z characters that includes variable and extension

I am trying to create file links based a variable which has a "prefix" and an extension at the end.
Here's what I have:
$url = "http://www.example.com/mods/" . ereg("^[A-Za-z_\-]+$", $title) . ".php";
Example output of what I wish to have outputted (assuming $title = testing;):
http://www.example.com/mods/testing.php
What it currently outputs:
http://www.example.com/mods/.php
Thanks in advance!
Perhaps this is what you need:
$title = "testing";
if(preg_match("/^[A-Za-z_\-]+$/", $title, $match)){
$url = "http://www.example.com/mods/".$match[0].".php";
}
else{
// Think of something to do here...
}
Now $url is http://www.example.com/mods/testing.php.
Do you want to keep letters and remove all other chars in the URL?
In this case the following should work:
$title = ...
$fixedtitle=preg_replace("/[^A-Za-z_-]/", "", $title);
$url = "http://www.example.com/mods/".$fixedtitle.".php";
The inverted character class removes everything you do not want.
First, it's important to realize that ereg() is deprecated (as of PHP 5.3) and was removed entirely in PHP 7.0, so you should use preg_match instead.
Secondly, both ereg() and preg_match return the status of the match, not the match itself. So
ereg("^[A-Za-z_\-]+$", $title)
returns the length of the matched string if you pass it a third variable to store the matches in, 1 if there's a match but you didn't pass that variable, and false if there's no match.
With a successful match your code should actually be outputting
http://www.example.com/mods/1.php
rather than
http://www.example.com/mods/.php
The empty segment suggests the match failed: false becomes an empty string when concatenated, so it is worth checking what $title really contains. Either way, it's not doing what you want it to. You need to pass another variable to the function that will store the matches found. If the match is successful (which you can check using the return value of the function), that variable will be an array of matches.
Note that preg_match stops after the first match, but it still fills an array (element 0 is the full match and the following elements are the captured groups), whereas preg_match_all finds every match.
See http://www.php.net/manual/en/function.preg-match.php for more details.
Your regex looks more or less correct, so the proper code should look something like:
$title = 'testing'; //making sure that $title is what we think it is
if (preg_match('/^[A-Za-z_\-]+$/',$title,$matches)) {
$url = "http://www.example.com/mods/" . $matches[0] . ".php";
} else {
//match failed, put error code in here
}

Returning a top level domain with a period at the end in php

Basically the problem I am having is I need to write this function that can take a URL like www.stackoverflow.com and just return the "com". But I need to be able to return the same value even if the URL has a period at the end like "www.stackoverflow.com."
This is what I have so far. The if statement is my attempt to return the element of the array before the period, but I don't think I am using the if statement correctly. Otherwise the rest of the code does exactly what it is supposed to do.
<?php
function getTLD($domain)
{
$domainArray = explode("." , $domain);
$topDomain = end($domainArray);
if ($topDomain == " ")
$changedDomain = prev(end($domainArray));
return $changedDomain;
return $topDomain;
}
?>
Don't use a regex for simple cases like that; it is CPU-costly and harder to read. Just remove the final dot if it exists:
function getTLD($domain) {
    $parts = explode('.', rtrim($domain, '.'));
    // end() takes its argument by reference, so it needs a variable
    return end($parts);
}
The end function is returning an empty string "" (without any spaces). You are comparing $topDomain to single space character so the if is not evaluating to true.
Also prev function requires array input and end($domainArray) is returning a string, so, $changedDomain = prev(end($domainArray)) should throw an E_WARNING.
Since end updates the internal pointer of the array $domainArray, which is already updated when you called $topDomain = end($domainArray), you do not need to call end on $domainArray inside the if block.
Try:
if ($topDomain == "") {
$changedDomain = prev($domainArray);
return $changedDomain; // Will output com
}
Here is the phpfiddle for it.
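Put together, the corrected function from the question might look like the sketch below. It keeps the asker's names and only applies the two fixes described above (comparing against an empty string and calling prev on the array itself):

```php
<?php
function getTLD($domain)
{
    $domainArray = explode('.', $domain);
    // end() moves the internal pointer to the last element and returns it.
    $topDomain = end($domainArray);
    if ($topDomain === '') {
        // A trailing dot leaves an empty last element, so step back by one.
        $topDomain = prev($domainArray);
    }
    return $topDomain;
}

echo getTLD('www.stackoverflow.com'), "\n";  // com
echo getTLD('www.stackoverflow.com.'), "\n"; // com
```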
Use regular expressions for something like this. Try this:
function getTLD($domain) {
return preg_replace("/.*\.([a-z]+)\.?$/i", "$1", $domain );
}
A live example: http://codepad.org/km0vCkLz
Read more about regular expressions and about how to use them: http://www.regular-expressions.info/

Most Efficient Way to Search for "Bad Names" in a User's Name

I have an app that I'm developing, in it users can choose a name for themselves. I need to be able to filter out "bad" names, so I do this for now:
$error_count=0;
$bad_names="badname1badname2";
preg_match_all("/\b".$user_name."\b/i",$global['bad_names'],
$matches,PREG_OFFSET_CAPTURE);
if(count($matches[0])>0)
{
$error_count++;
}
This would tell me if the user's name was inside the bad names list, however, it doesn't tell me if the bad name itself is in the user's name. They could combine a bad word with something else and I wouldn't detect it.
What kind of regex (if I even use regex) would I use for this? I need to be able to take any bad name (preferably in an array like $bad_names), and search through the user's name to see whether that word is within their name. I'm not great with regex, and the only way I can think of is to put it all through a loop which seems highly inefficient. Anyone have a better idea? I guess I need to figure out how to search through a string with an array.
$badnames = array('name1', 'name2');
// you need to quote the names so they can be inserted into the
// regular expression safely
$badnames_quoted = array();
foreach ($badnames as $name) {
$badnames_quoted[] = preg_quote($name, '/');
}
// now construct a RE that will match any bad name
$badnames_re = '/\b('.implode('|', $badnames_quoted).')\b/Siu';
// no need to gather all matches, or even to see what matched
$hasbadname = preg_match($badnames_re, $thestring);
if ($hasbadname) {
// bad name found
}
private static $bad_name = array("word1", "word2", "word3");
// Strings that are rejected anywhere inside the name; "ass" is the example used below.
private static $forbidden_name = array("ass");
private static function userNameValid($name_in) {
    // Whole-word match against the bad-name list.
    $badFound = preg_match("/\b(" . implode("|", self::$bad_name) . ")\b/i", $name_in);
    // Substring match against the forbidden list (i.e. "ass" would be found in "assassin").
    $forbiddenFound = preg_match("/(" . implode("|", self::$forbidden_name) . ")/i", $name_in);
    return !$badFound && !$forbiddenFound;
}
Note that implode takes the separator first; the reversed argument order is no longer accepted as of PHP 8.0.
This works great for me.
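For the case the question actually asks about (a bad name embedded inside a longer user name), a minimal sketch is to build a single alternation without word boundaries. The function name containsBadName is hypothetical, and treating a regex error as a hit is an assumption made here to stay on the safe side:

```php
<?php
// Hypothetical helper: true when any bad name occurs anywhere in $userName.
function containsBadName($userName, array $badNames)
{
    if (count($badNames) === 0) {
        return false; // an empty alternation would match every name
    }
    $quoted = array();
    foreach ($badNames as $name) {
        $quoted[] = preg_quote($name, '/'); // escape regex metacharacters
    }
    // No \b anchors, so "badname1" is also found inside "xXbadname1Xx".
    $pattern = '/(' . implode('|', $quoted) . ')/iu';
    // preg_match returns false on error (e.g. invalid UTF-8 with /u);
    // that is treated as a hit here so malformed input is rejected.
    return preg_match($pattern, $userName) !== 0;
}

var_dump(containsBadName('xXbadname1Xx', array('badname1', 'badname2'))); // bool(true)
var_dump(containsBadName('alice', array('badname1', 'badname2')));        // bool(false)
```

This is still one regex call per user name rather than a loop over the list, which addresses the efficiency concern in the question.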

Most efficient way to check a URL

I'm trying to check if a user submitted URL is valid, it goes directly to the database when the user hits submit.
So far, I have:
$string = $_POST[url];
if (strpos($string, 'www.') && (strpos($string, '/')))
{
echo 'Good';
}
The submitted page should be a page in a directory, not the main site, so http://www.address.com/page
How can I have it check for the second / without it thinking it's from http:// and that doesn't include .com?
Sample input:
Valid:
http://www.facebook.com/pageName
http://www.facebook.com/pageName/page.html
http://www.facebook.com/pageName/page.*
Invalid:
http://www.facebook.com
facebook.com/pageName
facebook.com
if(!parse_url('http://www.address.com/page', PHP_URL_PATH)) {
echo 'no path found';
}
See parse_url reference.
See the parse_url() function. This will give you the "/page" part of the URL in a separate string, which you can then analyze as desired.
filter_var($url, FILTER_VALIDATE_URL, FILTER_FLAG_PATH_REQUIRED)
More information here :
http://ca.php.net/filter_var
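As a rough sketch of the parse_url approach applied to the sample input from the question: the function name hasPagePath is hypothetical, and restricting the scheme to http and https is an assumption based on the examples given.

```php
<?php
// Hypothetical helper: true only for http(s) URLs that have a host
// and a path beyond the bare "/".
function hasPagePath($url)
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'], $parts['path'])) {
        return false; // malformed, or missing scheme, host or path
    }
    if (!in_array(strtolower($parts['scheme']), array('http', 'https'), true)) {
        return false;
    }
    // A path of just "/" still points at the main site.
    return trim($parts['path'], '/') !== '';
}

var_dump(hasPagePath('http://www.facebook.com/pageName'));           // bool(true)
var_dump(hasPagePath('http://www.facebook.com/pageName/page.html')); // bool(true)
var_dump(hasPagePath('http://www.facebook.com'));                    // bool(false)
var_dump(hasPagePath('facebook.com/pageName'));                      // bool(false)
```

Without a scheme, parse_url puts the whole string into the path component, which is why facebook.com/pageName is rejected by the scheme and host check.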
Maybe strrpos will help you. It locates the last occurrence of a string within a string.
To check the format of the URL you could use a regular expression:
preg_match [ http://php.net/manual/en/function.preg-match.php ] is a good start, but a knowledge of regular expressions is needed to make it work.
Additionally, if you actually want to check that it's a valid URL, you could check the URL value to see if it actually resolves to a web page:
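function check_404($url) {
    // @ suppresses the warning get_headers() raises when the request fails
    $headers = @get_headers($url);
    if ($headers === false) {
        return false; // request failed, so the URL did not resolve
    }
    // true when the status line is anything other than a 404
    return strpos($headers[0], ' 404 ') === false;
}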
Try using a regular expression to see that the URL has the correct structure. Here's more reading on this. You need to learn how PCRE works.
A simple example for what you want (disclaimer: not tested, incomplete).
function isValidUrl($url) {
return preg_match('#http://[^/]+/.+#', $url);
}
From here: http://www.blog.highub.com/regular-expression/php-regex-regular-expression/php-regex-validating-a-url/
<?php
/**
* Validate URL
* Allows for port, path and query string validations
* @param string $url string containing url user input
* @return boolean Returns TRUE/FALSE
*/
function validateURL($url)
{
$pattern = '/^(([\w]+:)?\/\/)?(([\d\w]|%[a-fA-F\d]{2,2})+(:([\d\w]|%[a-fA-F\d]{2,2})+)?@)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?(\/([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)*(\?(&?([-+_~.\d\w]|%[a-fA-F\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)?$/';
return preg_match($pattern, $url);
}
$result = validateURL('http://www.google.com');
print $result;
?>

Determine if a character is alphabetic

Having problems with this.
Let's say I have a parameter composed of a single character and I only want to accept alphabetic characters. How will I determine that the parameter passed is a member of the latin alphabet (a–z)?
By the way, I'm using PHP Kohana 3.
Thanks.
http://php.net/manual/en/function.ctype-alpha.php
<?php
$ch = 'a';
if (ctype_alpha($ch)) {
// Accept
} else {
// Reject
}
This also takes locale into account if you set it correctly.
EDIT: To be complete, other posters here seem to think that you need to ensure the parameter is a single character, or else the parameter is invalid. To check the length of a string, you can use strlen(). If strlen() returns any non-1 number, then you can reject the parameter, too.
As it stands, your question at the time of answering, conveys that you have a single character parameter somewhere and you want to check that it is alphabetical. I have provided a general purpose solution that does this, and is locale friendly too.
Use the following guard clause at the top of your method:
if (!preg_match("/^[a-z]$/", $param)) {
// throw an Exception...
}
If you want to allow upper case letters too, change the regular expression accordingly:
if (!preg_match("/^[a-zA-Z]$/", $param)) {
// throw an Exception...
}
Another way to support case insensitivity is to use the /i case insensitivity modifier:
if (!preg_match("/^[a-z]$/i", $param)) {
// throw an Exception...
}
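One detail worth knowing about these patterns: by default $ also matches just before a trailing newline, so "a\n" would pass the checks above. A minimal sketch that closes this gap with the D (dollar end only) modifier; the function name isSingleLatinLetter is hypothetical:

```php
<?php
// Hypothetical helper: true only for exactly one ASCII letter.
function isSingleLatinLetter($param)
{
    // i = case-insensitive, D = "$" matches only at the very end of the string
    return is_string($param) && preg_match('/^[a-z]$/iD', $param) === 1;
}

var_dump(isSingleLatinLetter('a'));   // bool(true)
var_dump(isSingleLatinLetter('Q'));   // bool(true)
var_dump(isSingleLatinLetter("a\n")); // bool(false)
var_dump(isSingleLatinLetter('ab'));  // bool(false)
```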
preg_match('/^[a-zA-Z]$/', $var_vhar);
Method will return int value: for no match returns 0 and for matches returns 1.
I'd use ctype, as Nick suggested, since it is not only faster than regex, it is even faster than most of the string functions built into PHP. But you also need to make sure it is a single character:
if (ctype_alpha($ch) && strlen($ch) == 1) {
// Accept
} else {
// Reject
}
You can't use [a-zA-Z] for Unicode.
Here is an example that works with Unicode:
if ( preg_match('/^\p{L}+$/u', 'my text') ) {
echo 'match';
} else {
echo 'not match';
}
Hopefully this will help. There is a simple function in PHP called ctype_alpha:
$mystring = 'a';
if (ctype_alpha($mystring))
{
//Then do the code here.
}
You can try:
preg_match('/^[a-zA-Z]$/',$input_char);
preg_match returns 1 if $input_char is a single letter and 0 otherwise, so you can use the return value directly in a condition.
