Detecting a url using preg_match? without http:// in the string

I was wondering how I could check a string, split into an array of words, with preg_match to see whether any word starts with www. I already have a function that checks for http://www.
function isValidURL($url)
{
    return preg_match('|^http(s)?://[a-z0-9-]+(.[a-z0-9-]+)*(:[0-9]+)?(/.*)?$|i', $url);
}
$stringToArray = explode(" ",$_POST['text']);
foreach($stringToArray as $key=>$val){
    $urlvalid = isValidURL($val);
    if($urlvalid){
        $_SESSION["messages"][] = "NO URLS ALLOWED!";
        header("Location: http://www.domain.com/post/id/".$_POST['postID']);
        exit();
    }
}
Thanks!
Stefan

You want something like:
%^(https?://|www\.)[a-z0-9-]+(\.[a-z0-9-]+)*(:[0-9]+)?(/.*)?$%i
This uses | to match either http:// (or https://) or www. at the beginning, and escapes the dot between host labels so that it matches only a literal dot. I changed the delimiter to % to avoid clashing with the |.
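A minimal sketch of how a pattern like this could be used in the word-by-word loop from the question. The function name containsUrlToken is hypothetical, and the pattern is a slightly tightened variant with an escaped dot between host labels; treat it as an illustration, not a complete URL validator.

```php
<?php
// Hypothetical helper: returns true if any whitespace-separated word looks like a URL.
function containsUrlToken(string $text): bool
{
    // A word must start with a scheme (http:// or https://) or with "www."
    $pattern = '%^(https?://|www\.)[a-z0-9-]+(\.[a-z0-9-]+)*(:[0-9]+)?(/.*)?$%i';
    foreach (preg_split('/\s+/', trim($text)) as $word) {
        if (preg_match($pattern, $word) === 1) {
            return true;
        }
    }
    return false;
}

var_dump(containsUrlToken('hello how are you www.google.com')); // bool(true)
var_dump(containsUrlToken('hello how are you'));                // bool(false)
```

Because the pattern is anchored with ^ and $, each word has to be a URL in its entirety, which is why the text is split into words first.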

John Gruber of Daring Fireball has posted a very comprehensive regex for all types of URLs that may be of interest. You can find it here:
http://daringfireball.net/2010/07/improved_regex_for_matching_urls

The URL might be halfway through the string, e.g. hello how are you www.google.com, so I explode the string first and check each word in a foreach loop.
Eg:
$string = "hello how are you www.google.com";
$string = explode(" ", $string);
foreach ($string as $word){
    if ( (strpos($word, "http://") === 0) || (strpos($word, "www.") === 0) ){
        // Code you want to execute if the word is a link
    }
}
Note that you have to use the === operator: strpos returns 0 when the match is at the start of the string, and with == that 0 would be treated as false.

I used the code below, which detects URLs anywhere in a string. My particular application is a contact form where no URLs are allowed, to combat spam. It works very well.
Link to resource: https://css-tricks.com/snippets/php/find-urls-in-text-make-links/
My implementation:
<?php
// Validate message
// Default to an empty string so the checks below never see an undefined variable
$message = isset($_POST['message']) ? test_input($_POST['message']) : "";
if ($message == 'Include your order number here if relevant...') {
    $messageError = "Required";
}
if (strlen($message) > 1000) {
    $messageError = "1000 chars max";
}
$reg_exUrl = "/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/";
if (preg_match($reg_exUrl, $message)) {
    $messageError = "URLs not allowed";
}
// Validate data
function test_input($data) {
    $data = trim($data);
    $data = stripslashes($data);
    $data = htmlspecialchars($data);
    return $data;
}
?>
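The pattern above only fires when a scheme such as http:// is present, so a bare www.example.com would pass through. A hedged sketch of one way to catch both forms follows; the function name findUrls and the pattern itself are assumptions for illustration, not part of the linked snippet.

```php
<?php
// Hypothetical helper: returns every URL-like substring found in $text.
function findUrls(string $text): array
{
    // Assumed pattern: a scheme or "www." followed by characters that are
    // not whitespace, angle brackets, or double quotes.
    $pattern = '~\b(?:(?:https?|ftps?)://|www\.)[^\s<>"]+~i';
    preg_match_all($pattern, $text, $matches);
    return $matches[0];
}

print_r(findUrls('Buy now at www.example.com or http://spam.example/offer today'));
```

For a "no URLs allowed" form, a non-empty result would be enough to reject the message. Trailing punctuation such as a full stop directly after a URL is included in the match, which does not matter for rejection but would for linking.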

Try strpos(implode('', $myarray), 'www.') === 0. That implodes your array into one string, then checks whether www. is at the beginning of the string (index 0).

Related

Find URL in string and turn into a link

I'm using the code given on this page to look through a string and turn the URL into an HTML link.
It works quite well, but there is a little issue with the "replace" part of it.
The problem occurs when I have almost identical links. For example:
https://example.com/page.php?goto=200
and
https://example.com/page.php
Everything is fine with the first link, but the second creates an <a> tag inside the first <a> tag.
First run
https://example.com/page.php?goto=200
Second
https://example.com/page.php?goto=200">https://example.com/page.php?goto=200</a>
Because it's also replacing the html link just created.
How do I avoid this?
<?php
function turnUrlIntoHyperlink($string){
//The Regular Expression filter
$reg_exUrl = "/(?i)\b((?:https?:\/\/|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'\".,<>?«»“”‘’]))/";
// Check if there is a url in the text
if(preg_match_all($reg_exUrl, $string, $url)) {
// Loop through all matches
foreach($url[0] as $newLinks){
if(strstr( $newLinks, ":" ) === false){
$link = 'http://'.$newLinks;
}else{
$link = $newLinks;
}
// Create Search and Replace strings
$search = $newLinks;
$replace = '<a href="'.$link.'">'.$link.'</a>';
$string = str_replace($search, $replace, $string);
}
}
//Return result
return $string;
}
?>

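Put a whitespace token \s at the start of your regex in place of \b. \b is only a word boundary, so it also matches right after the quote or > of a link you have just created, whereas \s requires the URL to follow whitespace. Note that a URL at the very start of the string is not preceded by whitespace and will not match.
Your regex can be written as: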
$reg_exUrl = "/(?i)\s((?:https?:\/\/|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'\".,<>?«»“”‘’]))/"
check this one: https://regex101.com/r/YFQPlZ/1

I have changed the replace part a bit, since I couldn't get the suggested regex to work.
Maybe it can be done better, but I'm still learning :)
function turnUrlIntoHyperlink($string){
//The Regular Expression filter
$reg_exUrl = "/(?i)\b((?:https?:\/\/|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'\".,<>?«»“”‘’]))/";
// Check if there is a url in the text
if(preg_match_all($reg_exUrl, $string, $url)) {
// Loop through all matches
$replace = array();
foreach($url[0] as $key => $newLinks){
if(strstr( $newLinks, ":" ) === false){
$link = 'https://'.$newLinks;
}else{
$link = $newLinks;
}
// Swap the first occurrence of the match for a numbered placeholder and remember its link
$replace[$key] = '<a href="'.$link.'">'.$link.'</a>';
$newLinks = '/'.preg_quote($newLinks, '/').'/';
$string = preg_replace($newLinks, '{'.$key.'}', $string, 1);
}
// Swap each placeholder for its link; an array avoids splitting URLs that contain commas
foreach ($replace as $key => $link) {
$string = str_replace('{'.$key.'}', $link, $string);
}
}
//Return result
return $string;
}
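An alternative that avoids the second pass is preg_replace_callback, which rewrites each match exactly once, so a link that has just been created is never scanned again. The sketch below uses a deliberately simplified pattern and a hypothetical function name (linkifyUrls); neither comes from the code above.

```php
<?php
// Hypothetical helper: wraps each URL in $text in an <a> tag, in a single pass.
function linkifyUrls(string $text): string
{
    // Simplified pattern (an assumption, not the regex from the question).
    $pattern = '~\b(?:https?://|www\.)[^\s<>"]+~i';
    return preg_replace_callback($pattern, function (array $m): string {
        $url = $m[0];
        // Prepend a scheme when the match starts with "www."
        $href = (stripos($url, 'http') === 0) ? $url : 'https://' . $url;
        return '<a href="' . htmlspecialchars($href, ENT_QUOTES) . '">'
            . htmlspecialchars($url, ENT_QUOTES) . '</a>';
    }, $text);
}

echo linkifyUrls('see https://example.com/page.php?goto=200 and https://example.com/page.php');
```

Since every match is replaced at its own position, the shorter URL can no longer be substituted inside the longer one.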

PHP code to create a negative word dictionary and search if a post has negative words

I'm trying to develop a PHP application that takes comments from users and matches each comment against a word list to decide whether it is positive or negative. I have a list of negative words in a negative.txt file. Whenever a word from the list is matched, I want a simple integer counter to increment by 1. I tried some links and wrote code to check whether a comment is negative or positive, but it only matches the last word of the file. Here's the code I have so far.
<?php
function teststringforbadwords($comment)
{
$file="BadWords.txt";
$fopen = fopen($file, "r");
$fread = fread($fopen,filesize("$file"));
fclose($fopen);
$newline_ele = "\n";
$data_split = explode($newline_ele, $fread);
$new_tab = "\t";
$outoutArr = array();
//process uploaded file data and push in output array
foreach ($data_split as $string)
{
$row = explode($new_tab, $string);
if(isset($row['0']) && $row['0'] != ""){
$outoutArr[] = trim($row['0']," ");
}
}
//---------------------------------------------------------------
foreach($outoutArr as $word) {
if(stristr($comment,$word)){
return false;
}
}
return true;
}
if(isset($_REQUEST["submit"]))
{
$comments = $_REQUEST["comments"];
if (teststringforbadwords($comments))
{
echo 'string is clean';
}
else
{
echo 'string contains banned words';
}
}
?>
Link Tried : Check a string for bad words?

I added the strtolower function around both your $comment and the input from the file. That way, if someone writes STUPID instead of stupid, the code will still detect the bad word.
I also added trim to remove unnecessary and disruptive whitespace (like newlines).
Finally, I changed how you check the words. I used preg_split to split the comment on whitespace, so we check only whole words and don't accidentally ban innocent strings.
<?php
function teststringforbadwords($comment)
{
    // Split the lowercased comment into words once, on any whitespace
    $commentWords = preg_split('/\s+/', strtolower(trim($comment)));
    $file = "BadWords.txt";
    $fopen = fopen($file, "r");
    $fread = strtolower(fread($fopen, filesize($file)));
    fclose($fopen);
    foreach (explode("\n", $fread) as $bannedWord)
    {
        $bannedWord = trim($bannedWord);
        // Skip blank lines so an empty string never counts as a banned word
        if ($bannedWord === "") {
            continue;
        }
        if (in_array($bannedWord, $commentWords, true)) {
            return false;
        }
    }
    return true;
}

The problem: you only store $row['0'], the first tab-separated field of each line, so any other words on that line are ignored.
Some suggestions:
1) Put the words in the text file one per line, like this, so you can explode on the newline alone and avoid the second explode and loop.
Example: sss.txt
...
bad
stupid
...
...
2) Apply trim and a lowercase function to both the comment and each bad word.
Hope it works as expected.
function teststringforbadwords($comment)
{
    $file = "sss.txt";
    $fopen = fopen($file, "r");
    $fread = fread($fopen, filesize($file));
    fclose($fopen);
    $comment = strtolower(trim($comment));
    foreach (explode("\n", $fread) as $word)
    {
        $word = strtolower(trim($word));
        // Skip blank lines: an empty string is not a valid search word
        if ($word !== "" && strpos($comment, $word) !== false)
        {
            return false;
        }
    }
    return true;
}
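The question asks for a counter rather than a yes/no answer. A minimal sketch of that idea follows, assuming the word list has already been loaded into an array; the function name countNegativeWords is hypothetical.

```php
<?php
// Hypothetical helper: counts how many words of $comment appear in $negativeWords.
function countNegativeWords(string $comment, array $negativeWords): int
{
    // Normalise the list into a lookup table keyed by lowercase word.
    $lookup = array_flip(array_map('strtolower', array_map('trim', $negativeWords)));
    $count = 0;
    // Split on runs of non-word characters so punctuation does not hide a word.
    foreach (preg_split('/\W+/', strtolower($comment), -1, PREG_SPLIT_NO_EMPTY) as $word) {
        if (isset($lookup[$word])) {
            $count++;
        }
    }
    return $count;
}

// The list could come from a file with one word per line, for example:
// $negativeWords = file('negative.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
echo countNegativeWords('You are STUPID and bad, really bad', ['bad', 'stupid']); // 3
```

Each occurrence is counted, so a word repeated twice adds 2; counting distinct words instead would only need an array_unique over the split words.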

Check and get the text after "near " in php

When someone writes:
"Near Tokyo"
I would like to check first if the $search contains "near" and if it does then take the "Tokyo" into a variable $location.
I tried this:
if(strpos($search, 'near') == true){
$search = explode("near ", $location);
echo $location;
exit();
}
It did not work: the body of the if statement is never executed.

You have multiple bugs here:
strpos may return 0, which signifies a match but will not compare equal to true
strpos is case-sensitive, which would make your example not work (look into stripos instead)
explode is also case-sensitive
It would probably be easiest to use a regex for this:
$input = "Near Tokyo";
if (preg_match('/near\s+(\w+)/i', $input, $matches)) {
echo "Near: ".$matches[1]."\n";
}
else {
echo "No match.\n";
}
This particular regex will only match the next word after "near", but this can be modified to suit your requirements.
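As one possible modification, the sketch below captures everything after "near" instead of a single word, so multi-word places are kept whole. The function name extractLocation is hypothetical and the pattern is an assumption about what "suit your requirements" might mean here.

```php
<?php
// Hypothetical helper: returns the text after "near", or null when it is absent.
function extractLocation(string $search): ?string
{
    // The leading \b stops words such as "linear" from matching;
    // \s+ requires whitespace after "near"; (.+) captures the rest of the line.
    if (preg_match('/\bnear\s+(.+)/i', $search, $matches)) {
        return trim($matches[1]);
    }
    return null;
}

var_dump(extractLocation('Hotels near New York')); // string(8) "New York"
var_dump(extractLocation('Tokyo'));                // NULL
```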

strpos returns an int position, or false when there is no match; it never returns true.
http://php.net/manual/en/function.strpos.php
So, following the manual, compare against false:
if(strpos($search, 'near') !== false)

Change it to:
if(strpos($search, 'near') !== false){
$location = explode("near ", $search);
echo $location[1];
exit();
}
Just take a look at the documentation, where this behaviour is explained:
It is easy to mistake the return values for "character found at
position 0" and "character not found". Here's how to detect the
difference:
<?php
$pos = strrpos($mystring, "b");
if ($pos === false) { // note: three equal signs
// not found...
}
?>

Yes, strpos needs cumbersome handling of its result. You probably want to use stristr instead, which is also case-insensitive:
if (stristr($search, "Near")) {
And since you are extracting text anyway, why not use a regex? (People are using the awful explode workaround way too often.)
if (preg_match("'Near (\S+)'i", $search, $match)) {
echo $match[1];
}

use this to get a result:
if(strpos($search, 'near') !== false){
$location = explode("near ", $search);
print_r($location);
exit();
}

$search = "Near Tokyo";
if(strpos($search, 'Near') === 0){
$location = explode("Near ", $search);
echo $location[1];
exit();
}

<?php
$search = "Near Tokyo";
if(preg_match("/near ([a-z]+)/i", $search, $match))
{
$location = $match[1];
echo $location;
}
?>
EDIT:
Fixed some bugs in your code.
if(stripos($search, 'near') !== false){
$location = preg_split('/near /i', $search); // case-insensitive split, unlike explode
echo $location[1];
exit();
}

Filter some words

I want to filter some reserved words out of the title field of my form.
$adtitle = sanitize($_POST['title']);
$ignore = array('sale','buy','rent');
if(in_array($adtitle, $ignore)) {
    $_SESSION['ignore_error'] = '<strong>'.$adtitle.'</strong> cannot be used as your title';
    header('Location:/submit/');
    exit;
}
How can I make it work like this: if the user types Car for sale, then sale is detected as a reserved keyword? My current code only detects a title that consists of a single keyword.

You're probably looking for a regular expression:
foreach($ignore as $keyword) {
    if(preg_match("/\b$keyword\b/i", $adtitle)) {
        // Uh-oh, the user used a reserved word!
    }
}
This also prevents some false positives, such as 'torrent' being flagged as a reserved word because it contains 'rent'.

You could also try something like this:
$ignore = array('sale','rent','buy');
$invalid = array_intersect($ignore, preg_split('{\W+}', $adtitle));
Then $invalid will contain a list of all the reserved words used in the title. This could be useful if you wanted to explain why the title cannot be used.
EDIT: if you want case-insensitive matching, use
$invalid = array_intersect($ignore, preg_split('{\W+}', strtolower($adtitle)));

$adtitle = sanitize($_POST['title']);
$ignoreArr = array('sale','buy','rent');
foreach($ignoreArr as $ignore){
    // strpos(haystack, needle): search the title for the reserved word
    if(strpos($adtitle, $ignore) !== false){
        $_SESSION['ignore_error'] = '<strong>'.$adtitle.'</strong> cannot be used as your title';
        break;
    }
}
header('Location:/submit/');
exit;
This should work. Not tested though.

function isValidTitle($str) {
// these may want to be placed in a config file
$badWords = array('sale','buy','rent');
foreach($badWords as $word) {
if (strstr($str, $word)) return false; // found a word!
}
// no bad word found
return true;
}
If you'd like to match whole words only (and not partial matches inside other words), try the modified version below:
function isValidTitle($str) {
$badWords = array('sale','buy','rent');
foreach($badWords as $word) {
if (preg_match('/\b' . trim($word) . '\b/i', $str)) return false;
}
return true;
}

How about something as simple as this:
if ( preg_match("/\b(" . implode("|", $ignore) . ")\b/i", $adtitle) ) {
// No good
}
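A hedged sketch that combines the ideas from the answers above: one regex for the whole list, word boundaries around a grouped alternation, and a result that says which words were found. The function name findReservedWords is hypothetical.

```php
<?php
// Hypothetical helper: returns the reserved words that occur in $title as whole words.
function findReservedWords(string $title, array $reserved): array
{
    if (count($reserved) === 0) {
        return [];
    }
    // Quote each word so regex metacharacters in the list are matched literally.
    $quoted = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $reserved);
    // (?: ... ) groups the alternatives so both \b anchors apply to every word.
    $pattern = '/\b(?:' . implode('|', $quoted) . ')\b/i';
    preg_match_all($pattern, $title, $matches);
    return array_values(array_unique(array_map('strtolower', $matches[0])));
}

print_r(findReservedWords('Car for SALE', ['sale', 'buy', 'rent']));
print_r(findReservedWords('torrent downloads', ['sale', 'buy', 'rent']));
```

Without the group, the pattern would read as "\bsale", "buy", or "rent\b", so a word such as "buyer" would still be flagged.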
