Filter some words - php

I want to filter some reserved words out of the title field of my form.
$adtitle = sanitize($_POST['title']);
$ignore = array('sale','buy','rent');
if(in_array($adtitle, $ignore)) {
$_SESSION['ignore_error'] = '<strong>'.$adtitle.'</strong> cannot be used as your title';
header('Location:/submit/');
exit;
}
How can I make this work so that if the user types "Car for sale", the word "sale" is detected as a reserved keyword?
My current code only catches a title that consists of exactly one reserved keyword.

You're probably looking for a regular expression:
foreach($ignore as $keyword) {
if(preg_match("/\b$keyword\b/i", $adtitle)) {
// Uhoh, the user used a bad word!!
}
}
This also avoids some false positives: 'torrent', for example, is not flagged as a reserved word just because it contains 'rent'.
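To make the word-boundary behaviour concrete, here is a minimal sketch. The function name findReservedWords is made up for illustration, and preg_quote() is added on the assumption that keywords might one day contain regex metacharacters.

```php
<?php
// Hypothetical helper: returns the reserved words that occur as whole words in $title.
function findReservedWords(string $title, array $reserved): array
{
    $found = [];
    foreach ($reserved as $keyword) {
        // preg_quote() escapes any regex metacharacters in the keyword;
        // \b anchors the match at word boundaries, /i ignores case.
        $pattern = '/\b' . preg_quote($keyword, '/') . '\b/i';
        if (preg_match($pattern, $title) === 1) {
            $found[] = $keyword;
        }
    }
    return $found;
}

$reserved = ['sale', 'buy', 'rent'];
print_r(findReservedWords('Car for Sale', $reserved));        // contains 'sale'
print_r(findReservedWords('Fast torrent client', $reserved)); // empty array
```

Returning the list of hits instead of a boolean makes it easy to tell the user which word was rejected.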

You could also try something like this:
$ignore = array('sale','rent','buy');
$invalid = array_intersect($ignore, preg_split('{\W+}', $adtitle));
Then $invalid will contain a list of all the reserved words used in the title. This could be useful if you wanted to explain why the title cannot be used.
EDIT:
$invalid = array_intersect($ignore, preg_split('{\W+}', strtolower($adtitle)));
if you want case-insensitive matching.
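A small worked example of the array_intersect approach, as a sketch. The sample title is invented, and PREG_SPLIT_NO_EMPTY and array_values() are my additions to tidy the result.

```php
<?php
$ignore  = ['sale', 'rent', 'buy'];
$adtitle = 'House for RENT or Sale!';

// Split on runs of non-word characters; PREG_SPLIT_NO_EMPTY drops the
// empty string that the trailing "!" would otherwise produce.
$words = preg_split('/\W+/', strtolower($adtitle), -1, PREG_SPLIT_NO_EMPTY);

// array_intersect() keeps the keys and order of $ignore, so reindex the result.
$invalid = array_values(array_intersect($ignore, $words));

echo implode(', ', $invalid); // sale, rent
```

Note that the result follows the order of $ignore, not the order in which the words appear in the title.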

$adtitle = sanitize($_POST['title']);
$ignoreArr = array('sale','buy','rent');
foreach($ignoreArr as $ignore){
// strpos() takes the haystack first, then the needle
if(strpos($adtitle, $ignore)!==false){
$_SESSION['ignore_error'] = '<strong>'.$adtitle.'</strong> cannot be used as your title';
break;
}
}
header('Location:/submit/');
exit;
This should work. Not tested though.

function isValidTitle($str) {
// these may want to be placed in a config file
$badWords = array('sale','buy','rent');
foreach($badWords as $word) {
if (strstr($str, $word)) return false; // found a word!
}
// no bad word found
return true;
}
If you'd like to match whole words only (not partial matches inside other words), try this modified version:
function isValidTitle($str) {
$badWords = array('sale','buy','rent');
foreach($badWords as $word) {
if (preg_match('/\b' . trim($word) . '\b/i', $str)) return false;
}
return true;
}

How about something as simple as this:
if ( preg_match("/\b(" . implode("|", $ignore) . ")\b/i", $adtitle) ) {
// No good
}
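One thing to watch with a single combined pattern: alternation binds more loosely than \b, so the alternatives have to be wrapped in a group or the boundaries only apply to the first and last word. A quick sketch of the difference (the variable names and test strings are mine):

```php
<?php
$ignore = ['sale', 'buy', 'rent'];
$quoted = array_map(function ($w) { return preg_quote($w, '/'); }, $ignore);

// Without a group this reads as: "\bsale" OR "buy" OR "rent\b".
$ungrouped = '/\b' . implode('|', $quoted) . '\b/i';
// With a non-capturing group both boundaries apply to every alternative.
$grouped   = '/\b(?:' . implode('|', $quoted) . ')\b/i';

var_dump(preg_match($ungrouped, 'buyer wanted')); // int(1): bare "buy" matches inside "buyer"
var_dump(preg_match($grouped, 'buyer wanted'));   // int(0)
var_dump(preg_match($grouped, 'Car for sale'));   // int(1)
```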

Related

PHP code to create a negative word dictionary and search if a post has negative words

I'm trying to develop a PHP application that takes comments from users and checks each comment to decide whether it is positive or negative. I have a list of negative words in a negative.txt file. If a word from the list is matched, I want a simple integer counter to be incremented by 1. I tried some links and wrote code to check whether the comment is negative or positive, but it only matches the last word of the file. Here's the code I have so far.
<?php
function teststringforbadwords($comment)
{
$file="BadWords.txt";
$fopen = fopen($file, "r");
$fread = fread($fopen,filesize("$file"));
fclose($fopen);
$newline_ele = "\n";
$data_split = explode($newline_ele, $fread);
$new_tab = "\t";
$outoutArr = array();
//process uploaded file data and push in output array
foreach ($data_split as $string)
{
$row = explode($new_tab, $string);
if(isset($row['0']) && $row['0'] != ""){
$outoutArr[] = trim($row['0']," ");
}
}
//---------------------------------------------------------------
foreach($outoutArr as $word) {
if(stristr($comment,$word)){
return false;
}
}
return true;
}
if(isset($_REQUEST["submit"]))
{
$comments = $_REQUEST["comments"];
if (teststringforbadwords($comments))
{
echo 'string is clean';
}
else
{
echo 'string contains banned words';
}
}
?>
Link tried: Check a string for bad words?
I added the strtolower function around both your $comments and your input from the file. That way, if someone writes STUPID instead of stupid, the code will still detect the bad word.
I also added trim to remove unnecessary and disruptive whitespace (like newlines).
Finally, I changed how the words are checked. I used preg_split to split the comment on whitespace, so that only full words are compared and we don't accidentally ban innocent strings.
<?php
function teststringforbadwords($comment)
{
$comment = strtolower($comment);
$file="BadWords.txt";
$fopen = fopen($file, "r");
$fread = strtolower(fread($fopen,filesize("$file")));
fclose($fopen);
$newline_ele = "\n";
$data_split = explode($newline_ele, $fread);
$new_tab = "\t";
$outoutArr = array();
//process uploaded file data and push in output array
foreach ($data_split as $bannedWord)
{
foreach (preg_split('/\s+/',$comment) as $commentWord) {
if (trim($bannedWord) === trim($commentWord)) {
return false;
}
}
}
return true;
}
1) You are storing only $row['0'] and not the words at the other indexes, so some of the words in the text file are ignored.
Some suggestions:
1) Put each word on its own line in the text file, like this, so you can simply explode on newlines and avoid multiple explodes and loops.
Example: sss.txt
...
bad
stupid
...
...
2) Apply the trim and lowercase functions to both the comment and the bad word.
Hope it works as expected:
function teststringforbadwords($comment)
{
$file="sss.txt";
$fopen = fopen($file, "r");
$fread = fread($fopen,filesize("$file"));
fclose($fopen);
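foreach(explode("\n",$fread) as $word)
{
$word = strtolower(trim($word));
// skip blank lines: on PHP 8 an empty needle matches any non-empty comment
if($word !== '' && stristr(strtolower(trim($comment)),$word))
{
return false;
}
}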
return true;
}
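Since the question asks for a counter rather than a yes/no answer, here is a hedged sketch of that. The function name countNegativeWords and the sample list are made up. One plausible reason only the last word matched is that the file has Windows line endings, so every entry except the last still carries a "\r" after exploding on "\n"; trim() without a second argument removes it.

```php
<?php
// Hypothetical helper: counts how many words of $comment appear in $negativeWords.
function countNegativeWords(string $comment, array $negativeWords): int
{
    // Normalise the list: trim whitespace (including "\r"), lowercase, drop empty entries.
    $list = array_filter(
        array_map(function ($w) { return strtolower(trim($w)); }, $negativeWords),
        'strlen'
    );
    $lookup = array_flip($list); // word => index, for fast isset() checks

    $count = 0;
    foreach (preg_split('/\W+/', strtolower($comment), -1, PREG_SPLIT_NO_EMPTY) as $word) {
        if (isset($lookup[$word])) {
            $count++;
        }
    }
    return $count;
}

// In the real application the list would come from the file, for example:
// $negativeWords = file('negative.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$negativeWords = ["bad\r", "stupid\r", "awful", ""];
echo countNegativeWords('This is a STUPID, bad idea. Really bad.', $negativeWords); // 3
```

This counts every occurrence, so "bad" appearing twice adds 2. Counting distinct words instead would only need an array_unique() on the split comment.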

Match one or more keywords defined in array [duplicate]

Let's say I have an array of bad words:
$badwords = array("one", "two", "three");
And random string:
$string = "some variable text";
How can I write this check:
if (one or more items from the $badwords array is found in $string)
echo "sorry bad word found";
else
echo "string contains no bad words";
Example:
If $string = "one fine day" or "one fine day two of us did something", the user should see the "sorry bad word found" message.
If $string = "fine day", the user should see the "string contains no bad words" message.
As far as I know, you can't preg_match against an array. Any advice?
How about this:
$badWords = array('one', 'two', 'three');
$stringToCheck = 'some stringy thing';
// $stringToCheck = 'one stringy thing';
$noBadWordsFound = true;
foreach ($badWords as $badWord) {
if (preg_match("/\b$badWord\b/", $stringToCheck)) {
$noBadWordsFound = false;
break;
}
}
if ($noBadWordsFound) { ... } else { ... }
Why do you want to use preg_match() here?
What about this:
$found = false;
foreach($badwords as $badword)
{
if (strpos($string, $badword) !== false) {
$found = true;
break; // one hit is enough
}
}
echo $found ? "sorry bad word found" : "string contains no bad words";
If you need preg_match() for some reason, you can generate the regex pattern dynamically. Something like this:
$pattern = '/(' . implode('|', $badwords) . ')/'; // $pattern = /(one|two|three)/
$result = preg_match($pattern, $string);
HTH
If you want to check each word by exploding the string into words, you can use this:
$badwordsfound = count(array_filter(
explode(" ",$string),
function ($element) use ($badwords) {
return in_array($element,$badwords);
}
)) > 0;
if($badwordsfound){
echo "Bad words found";
}else{
echo "String clean";
}
Another idea that came to mind: how about removing all the bad words in the array from the string and checking whether the string stays the same?
$badwords_replace = array_fill(0,count($badwords),"");
$string_clean = str_replace($badwords,$badwords_replace,$string);
if($string_clean == $string) {
echo "no bad words found";
}else{
echo "bad words found";
}
Here is the bad word filter I use and it works great:
private static $bad_name = array("word1", "word2", "word3");
// This will check for exact words only. so "ass" will be found and flagged
// but not "classic"
$badFound = preg_match("/\b(" . implode("|", self::$bad_name) . ")\b/i", $name_in);
Then I have another variable with select strings to match:
// This will match "ass" as well as "classic" and flag it
private static $forbidden_name = array("word1", "word2", "word3");
$forbiddenFound = preg_match("/(" . implode("|", self::$forbidden_name) . ")/i", $name_in);
Then I run an if on it:
if ($badFound) {
return FALSE;
} elseif ($forbiddenFound) {
return FALSE;
} else {
return TRUE;
}
Hope this helps. Ask if you need me to clarify anything.

filtering bad words from text

This function filters emails out of the text and returns the matched patterns:
function parse($text, $words)
{
$resultSet = array();
foreach ($words as $word){
$pattern = 'regex to match emails';
preg_match_all($pattern, $text, $matches, PREG_OFFSET_CAPTURE );
$this->pushToResultSet($matches);
}
return $resultSet;
}
In a similar way, I want to match bad words in the text and return them as $resultSet.
Here is the code that filters bad words:
$badwords = array('shit', 'fuck'); // Here we can use all bad words from database
$text = 'Man, I shot this f*ck, sh/t! fucking fu*ker sh!t f*cking sh\t ;)';
echo "filtered words <br>";
echo $text."<br/>";
$words = explode(' ', $text);
foreach ($words as $word)
{
$bad= false;
foreach ($badwords as $badword)
{
if (strlen($word) >= strlen($badword))
{
$wordOk = false;
for ($i = 0; $i < strlen($badword); $i++)
{
if ($badword[$i] !== $word[$i] && ctype_alpha($word[$i]))
{
$wordOk = true;
break;
}
}
if (!$wordOk)
{
$bad= true;
break;
}
}
}
echo $bad ? 'beep ' : ($word . ' '); // Here $bad words can be returned and replace with *.
}
This replaces bad words with beep.
But I want to push the matched bad words to $this->pushToResultSet() and return them, as in the email-filtering code above.
Can I do this with my bad-word filtering code?
Roughly converting David Atchley's answer to PHP, does this work as you want it to?
$blocked = array('fuck','shit','damn','hell','ass');
$text = 'Man, I shot this f*ck, damn sh/t! fucking fu*ker sh!t f*cking sh\t ;)';
$matched = preg_match_all("/(".implode('|', $blocked).")/i", $text, $matches);
$filter = preg_replace("/(".implode('|', $blocked).")/i", 'beep', $text);
var_dump($filter);
var_dump($matches);
JSFiddle for working example.
Yes, you can match bad words (saving for later), replace them in the text and build the regex dynamically based on an array of bad words you're trying to filter (you might store it in DB, load from JSON, etc.). Here's the main portion of the working example:
var blocked = ['fuck','shit','damn','hell','ass'],
matchBlocked = new RegExp("("+blocked.join('|')+")", 'gi'),
text = $('.unfiltered').text(),
matched = text.match(matchBlocked),
filtered = text.replace(matchBlocked, 'beep');
Please see the JSFiddle link above for the full working example.
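For the PHP side of the question (returning the matches the way the email parser does), a minimal sketch might look like this. findBadWords is a hypothetical stand-in for the parse() method, and it returns the matches directly because pushToResultSet() is not shown in the question.

```php
<?php
// Hypothetical stand-in for parse(): returns [matched text, byte offset] pairs.
function findBadWords(string $text, array $words): array
{
    if ($words === []) {
        return []; // an empty alternation would match the empty string everywhere
    }
    $quoted  = array_map(function ($w) { return preg_quote($w, '/'); }, $words);
    $pattern = '/(?:' . implode('|', $quoted) . ')/i';

    // PREG_OFFSET_CAPTURE makes each match an array of [text, offset].
    preg_match_all($pattern, $text, $matches, PREG_OFFSET_CAPTURE);
    return $matches[0];
}

$resultSet = findBadWords('What the hell, damn it', ['damn', 'hell']);
foreach ($resultSet as [$word, $offset]) {
    echo "$word at $offset\n";
}
// hell at 9
// damn at 15
```

Unlike the character-by-character loop in the question, this only finds literal spellings. Catching masked spellings with * or ! in place of letters would need a looser pattern per word.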

How do I return a part of text with a certain word in the middle?

If this is the input string:
$input = 'In biology (botany), a "fruit" is a part of a flowering
plant that derives from specific tissues of the flower, mainly one or
more ovaries. Taken strictly, this definition excludes many structures
that are "fruits" in the common sense of the term, such as those
produced by non-flowering plants';
And now I want to perform a search on the word tissues and consequently return only a part of the string, defined by where the result is, like this:
$output = '... of a flowering plant that derives from specific tissues of the flower, mainly one or more ovaries ...';
The search term may be in the middle.
How do I perform the aforementioned?
An alternative to my other answer using preg_match:
$word = 'tissues';
$matches = array();
$found = preg_match("/\b(.{0,30}$word.{0,30})\b/i", $string, $matches);
if ($found == 0) {
// string not found
} else {
$output = $matches[1];
}
This may be better as it uses word boundaries.
EDIT: To surround the search term with a tag, you'll need to slightly alter the regex. This should do it:
$word = 'tissues';
$matches = array();
$found = preg_match("/\b(.{0,30})$word(.{0,30})\b/i", $string, $matches);
if ($found == 0) {
// string not found
} else {
$output = $matches[1] . "<strong>$word</strong>" . $matches[2];
}
Use strpos to find the location of the word and substr to extract the quote. For example:
$word = 'tissues';
$pos = strpos($string, $word);
if ($pos === FALSE) {
// string not found
} else {
$start = $pos - 30;
if ($start < 0)
$start = 0;
$output = substr($string, $start, 70);
}
Use stripos for case insensitive search.
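Putting the strpos/substr idea into a reusable function, as a rough sketch. The function name excerpt, the $radius parameter and the "..." markers are my own choices, and it works on bytes, so for UTF-8 text the mb_* equivalents would be the safer choice.

```php
<?php
// Hypothetical helper: returns up to $radius characters on either side of the
// first case-insensitive match of $word, or null when the word is not found.
function excerpt(string $text, string $word, int $radius = 30): ?string
{
    $pos = stripos($text, $word);
    if ($pos === false) {
        return null;
    }
    $start   = max(0, $pos - $radius);
    $end     = min(strlen($text), $pos + strlen($word) + $radius);
    $snippet = trim(substr($text, $start, $end - $start));

    // Add "..." only on the sides where text was actually cut off.
    return ($start > 0 ? '... ' : '') . $snippet . ($end < strlen($text) ? ' ...' : '');
}

echo excerpt('one two three four five', 'three', 5); // ... two three four ...
```

The cut points fall on a fixed character count, so a snippet can still begin or end in the middle of a word. The preg_match version with \b avoids that.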
