bad words filter integration - php

I am trying to integrate a PHP bad words filter.
The input is taken through $_REQUEST['qtitle'] and $_REQUEST['question'].
So far I have failed to get it working.
$USERID = intval($_SESSION['USERID']);
if ($USERID > 0)
{
$sess_ver = intval($_SESSION[VERIFIED]);
$verify_asker = intval($config['verify_asker']);
if($verify_asker == "1" && $sess_ver == "0")
{
$error = $lang['225'];
$theme = "error.tpl";
}
else
{
$theme = "ask.tpl";
STemplate::assign('qtitle',htmlentities(strip_tags($_REQUEST['qtitle']), ENT_COMPAT, "UTF-8"));
STemplate::assign('question',htmlentities(strip_tags($_REQUEST['question']), ENT_COMPAT, "UTF-8"));
if($_REQUEST['subform'] != "")
{
$qtitle = htmlentities(strip_tags($_REQUEST['qtitle']), ENT_COMPAT, "UTF-8");
$question = htmlentities(strip_tags($_REQUEST['question']), ENT_COMPAT, "UTF-8");
$category = intval($_REQUEST['category']);
if($qtitle == "")
{
$error = $lang['3'];
}
elseif($category <= "0")
{
$error = $lang['4'];
}
else
{
if($config['approve_stories'] == "1")
{
$addtosql = ", active='0'";
}
$query="INSERT INTO posts SET USERID='".mysql_real_escape_string($USERID)."', title='".mysql_real_escape_string($qtitle)."',question='".mysql_real_escape_string($question)."', tags='".mysql_real_escape_string($qtitle)."', category='".mysql_real_escape_string($category)."', time_added='".time()."', date_added='".date("Y-m-d")."' $addtosql";
$result=$conn->execute($query);
$userid = mysql_insert_id();
$message = $lang['5'];
}
}
}
}
else
{
$question = htmlentities(strip_tags($_REQUEST['qtitle']), ENT_COMPAT, "UTF-8");
$redirect = base64_encode($thebaseurl."/ask?qtitle=".$question);
header("Location:$config[baseurl]/login?redirect=$redirect");exit;
}
I am trying the following code, but it replaces every word, even words that are not included in the array.
FUNCTION BadWordFilter(&$text, $replace){
$bads = ARRAY (
ARRAY("butt","b***"),
ARRAY("poop","p***"),
ARRAY("crap","c***")
);
IF($replace==1) { //we are replacing
$remember = $text;
FOR($i=0;$i<sizeof($bads);$i++) { //go through each bad word
$text = EREGI_REPLACE($bads[$i][0],$bads[$i][1],$text); //replace it
}
IF($remember!=$text) RETURN 1; //if there are any changes, return 1
} ELSE { //we are just checking
FOR($i=0;$i<sizeof($bads);$i++) { //go through each bad word
IF(EREGI($bads[$i][0],$text)) RETURN 1; //if we find any, return 1
}
}
}
$qtitle = BadWordFilter($wordsToFilter,0);
$qtitle = BadWordFilter($wordsToFilter,1);
What am I missing here?

I agree with @Gordon that this is reinventing the wheel, but if you really want to do it, here's a better start:
function badWordFilter(&$text, $replace)
{
$patterns = array(
'/butt/i',
'/poop/i',
'/crap/i'
);
$replaces = array(
'b***',
'p***',
'c***'
);
$count = 0;
if($replace){
$text = preg_replace($patterns, $replaces, $text, -1, $count);
} else {
foreach($patterns as $pattern){
$count = preg_match($pattern, $text);
if($count > 0){
break;
}
}
}
return $count;
}
There are lots of inherent issues, though. For instance, run the filter on the text How do you like my buttons? ... You'll end up with How do you like my b***ons?
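One way to reduce false positives like the buttons case is to anchor each pattern on word boundaries. The following is only a minimal sketch under that assumption, not a complete profanity filter; the function name maskBadWords and the word list are made up for illustration.

```php
<?php
// Hypothetical helper: masks whole-word matches only and reports how many were replaced.
function maskBadWords($text, array $badWords, &$count = 0)
{
    $patterns = array();
    foreach ($badWords as $word) {
        // preg_quote escapes regex metacharacters; \b requires a word boundary on both sides
        $patterns[] = '/\b' . preg_quote($word, '/') . '\b/i';
    }
    return preg_replace_callback($patterns, function ($m) {
        // keep the first letter of the matched word and mask the rest
        return $m[0][0] . str_repeat('*', strlen($m[0]) - 1);
    }, $text, -1, $count);
}

$words = array('butt', 'poop', 'crap');
echo maskBadWords('How do you like my buttons?', $words), "\n"; // left unchanged
echo maskBadWords('Kick his butt', $words, $n), "\n";           // Kick his b***
```

Word boundaries will still miss deliberate misspellings and spaced-out letters, so this narrows the problem rather than solving it.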

I think you should use this kind of function:
function badWordsFilter(&$text){
$excluded_words = array( 'butt', 'poop', 'crap' );
$replacements = array();
foreach($excluded_words as $word){
// keep the first letter and mask the remaining letters with asterisks
$replacements[] = $word[0] . str_repeat('*', strlen($word) - 1);
}
// str_replace returns the new string, so assign it back to the referenced variable
$text = str_replace($excluded_words, $replacements, $text);
}

Related

How does this code "know how to escape special characters"?

Hello Stack Overflow Experts. I found this post and the author states, "... function knows how to escape special characters". I am still learning and I'm having difficulty seeing where the actual "handling" of special characters takes place and how they are "handled." Could someone point me in the right direction? Thank you. Here is the listed code:
$dataArray = csvstring_to_array( file_get_contents('Address.csv'));
function csvstring_to_array($string, $separatorChar = ',', $enclosureChar = '"', $newlineChar = "\n") {
// @author: Klemen Nagode
$array = array();
$size = strlen($string);
$columnIndex = 0;
$rowIndex = 0;
$fieldValue="";
$isEnclosured = false;
for($i=0; $i<$size;$i++) {
$char = $string[$i]; // square brackets: the curly-brace offset syntax was removed in PHP 8
$addChar = "";
if($isEnclosured) {
if($char==$enclosureChar) {
if($i+1<$size && $string[$i+1]==$enclosureChar){
// escaped char
$addChar=$char;
$i++; // dont check next char
}else{
$isEnclosured = false;
}
}else {
$addChar=$char;
}
}else {
if($char==$enclosureChar) {
$isEnclosured = true;
}else {
if($char==$separatorChar) {
$array[$rowIndex][$columnIndex] = $fieldValue;
$fieldValue="";
$columnIndex++;
}elseif($char==$newlineChar) {
echo $char;
$array[$rowIndex][$columnIndex] = $fieldValue;
$fieldValue="";
$columnIndex=0;
$rowIndex++;
}else {
$addChar=$char;
}
}
}
if($addChar!=""){
$fieldValue.=$addChar;
}
}
if($fieldValue) { // save last field
$array[$rowIndex][$columnIndex] = $fieldValue;
}
return $array;
}
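For what it's worth, the special-character handling appears to be the branch commented "escaped char": inside a quoted field, two consecutive enclosure characters are stored as one literal character, and separators or newlines inside the quotes are copied as ordinary text. PHP's built-in str_getcsv follows the same doubled-quote convention, which gives a quick way to see the expected result. The sample line below is made up.

```php
<?php
// Made-up sample line: the middle field contains a comma and doubled quotes.
$line = 'a,"say ""hi"", please",b';

// Arguments spelled out: separator, enclosure, escape character.
$fields = str_getcsv($line, ',', '"', '\\');

print_r($fields);
// The middle field comes back as: say "hi", please
```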

How do I check whether there are bad words in a string?

This is my PHP code for replacing bad words in a string with *; it works well.
But I want to check whether the string contains bad words or not. How can I do this?
index.php
<?php
include("badwords.php");
$content = "cat bad_word dog";
$badword = new badword();
echo $badword->word_fliter("$content");
?>
badword.php
<?php
$bad_words = array (
// an array of bad words here...
);
class badword {
function word_fliter($content) {
global $bad_words, $wordreplace;
$count = count($bad_words);
for ($n = 0; $n < $count; ++$n, next ($bad_words)) {
$filter = "*";
//Search for bad_words in content
$search = "$bad_words[$n]";
$content = preg_replace("'$search'i","<i>$filter</i>",$content);
}
return $content;
}
}
?>
Edit: since you wanted the full code...
Please note that I changed the function name from word_fliter() to word_filter().
index.php
<?php
include("badwords.php");
$content = "this is an example string with bad words in it";
$badword = new badword();
echo $badword->word_filter("$content");
if($badword->usedBadWords()){
// do whatever you want to do if bad words were used
}
?>
badwords.php
<?php
$bad_words = array (
// insert your naughty list here
);
class badword {
private $usedBadWords = false;
function word_filter($content) {
global $bad_words; // the list is defined outside the class
foreach($bad_words as $bad_word){
if(strpos($content, $bad_word) !== false){
$this->usedBadWords = true; // property names are case-sensitive
}
$content = str_replace($bad_word, '***', $content);
}
return $content;
}
function usedBadWords(){
return $this->usedBadWords;
}
}
?>
This should work: it checks whether the content matches any bad words and returns that flag together with the filtered content.
index.php
<?php
include("badwords.php");
$content = "cat bad_word dog";
$badword = new badword();
$return_val = $badword->word_fliter("$content");
echo $return_val['content'];
if($return_val['has_bad_words']){
// do stuff
}
?>
badword.php
<?php
$bad_words = array (
// an array of bad words here...
);
class badword {
function word_fliter($content) {
global $bad_words, $wordreplace;
$count = count($bad_words);
$has_bad_words = false;
for ($n = 0; $n < $count; ++$n, next ($bad_words)) {
$filter = "*";
//Search for bad_words in content
$search = "$bad_words[$n]";
// test before replacing, otherwise the word is already gone; never reset the flag once set
if(preg_match("'$search'i", $content)){
$has_bad_words = true;
}
$content = preg_replace("'$search'i","<i>$filter</i>",$content);
}
return array('content' => $content, 'has_bad_words' => $has_bad_words);
}
}
<?php
$bad_words = array (
// an array of bad words here...
);
$bad_word_check = "";
$bad_word_detect = "";
class badword {
function word_fliter($content) {
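global $bad_words, $wordreplace;
// initialize locally: the variables set outside the class are not visible in this method
$bad_word_check = "";
$bad_word_detect = "";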
$count = count($bad_words);
for ($n = 0; $n < $count; ++$n, next ($bad_words)) {
$filter = "*";
//Search for bad_words in content
$search = "$bad_words[$n]";
$xx = "%".$search."%";
if(preg_match($xx, $content))
{
$bad_word_check = "1";
$bad_word_detect = $bad_word_detect."".$search.",";
}
$content = preg_replace("'$search'i","<i>$filter</i>",$content);
}
return array('content' => $content, 'bad_word_check' => $bad_word_check, 'bad_word_detect' => $bad_word_detect);
}
}
$content = "cat bad_word bad_word dog association";
$badword = new badword();
echo $badword->word_fliter("$content")['content'];
echo "<BR>";
echo $badword->word_fliter("$content")['bad_word_check'];
echo "<BR>";
echo $badword->word_fliter("$content")['bad_word_detect'];
?>

PHP not executing the same function a second time

I have a function that I call using Config::save('key', $value);
public static function save($params, $value)
{
$parts = explode('.', $params);
$count = count($parts);
$mainFile = PANEL_PATH.'/conf.php';
$mainConfig = include $mainFile;
if($count == 1)
{
if(isset($mainConfig[$parts[0]]))
{
$mainConfig[$parts[0]] = $value;
}
}
elseif($count == 2)
{
if(isset($mainConfig[$parts[0]][$parts[1]]))
{
$mainConfig[$parts[0]][$parts[1]] = $value;
}
}
elseif($count == 3)
{
if(isset($mainConfig[$parts[0]][$parts[1]][$parts[2]]))
{
$mainConfig[$parts[0]][$parts[1]][$parts[2]] = $value;
}
}
elseif($count == 4)
{
if(isset($mainConfig[$parts[0]][$parts[1]][$parts[2]][$parts[3]]))
{
$mainConfig[$parts[0]][$parts[1]][$parts[2]][$parts[3]] = $value;
}
}
ob_start();
echo var_export($mainConfig);
$content = ob_get_contents();
ob_end_clean();
$content = str_replace(" ", "\t", $content);
$content = str_replace("\n\tarray (", "array(", $content);
$content = str_replace("\n\t\tarray (", "array(", $content);
$content = str_replace("\n\t\t\tarray (", "array(", $content);
$mainFileHandler = fopen($mainFile, 'w+');
$mainFileWrite = fwrite($mainFileHandler, "<?php\n\nreturn " . $content . ";");
if($mainFileWrite > 0)
{
return true;
}
else
{
return false;
}
fclose($mainFileHandler);
}
And the conf.php file looks like this:
<?php
return array (
'name' => '<NAME>',
'license' => '<LICENSE>',
'url' => '<URL>',
'usage_id' => '<USAGE_ID>',
'installed' => '<INSTALLED>'
);
So when I do this statement
if (Config::save('usage_id', 'USAGE_ID') && Config::save('license', 'LICENSE'))
{
echo "Reset License";
}
It only resets the license variable, it seems like it skips over the first one. Is there something wrong with the code that's making it act like this?
If any more code is needed, let me know and I'll be glad to provide it.

How to increase the performance of this code in PHP?

How can I increase the performance of this code in PHP?
Or is there an alternative method to parse the comment: if a word starts with @, call at(); if it starts with "#", call hash().
Here the sample comment is "#hash #at #####tag";
//this is the comment with mention, tag
function getCommentWithLinks($comment="#name #tag ###nam1 test", $pid, $img='', $savedid='', $source='', $post_facebook='', $fbCmntInfo='') {
//assign to facebook facebookComment bcz it is used to post into the fb wall
$this->facebookComment = $comment;
//split the comment based on the space
$comment = explode(" ", $comment);
//get the length of the split array
$cmnt_length = count($comment);
$store_cmnt = $tagid = '';
$this->img = $img;
$this->saveid = $savedid;//this is uspid in product saved table primary key
//$this->params = "&product=".base_url()."product/".$this->saveid;
$this->params['product'] = base_url()."product/".$this->saveid;
//$this->params['tags']='';
foreach($comment as $word) {
//check it is tag or not
//the first character must be a # and the rest all alphanumeric; if any # or @ exists after that, it is part of the comment
//find the length of the tag #mention
$len = strlen($word);
$cmt = $c = $tag_name = '';
$j = 0;
$istag = false;
for($i=0; $i<$len; $i++) {
$j = $i-1;
//check if the starting letter is # or not
if($word[$i] == '#') {
//insert tagname
if($istag) {
//insert $tag_name
$this->save_tag($tag_name, $pid);
$istag = false;
$tag_name = '';
}
//check for comment
if($i >= 1 && $word[$j]=='#') {
$this->store_cmnt .= $word[$i];
}else{
//append to the store_coment if the i is 1 or -1 or $word[j]!=#
$this->store_cmnt .= $word[$i];//23,#
}
}else if($word[$i]=='@') {
//insert tagname
if($istag) {
//insert $tag_name
$this->save_mention($tag_name, $pid, $fbCmntInfo);
$istag = false;
$tag_name = '';
}
//check for comment
if($i >= 1 && $word[$j]=='#') {
$this->store_cmnt .= $word[$i];
}else{
$this->store_cmnt .= $word[$i];//23,#
}
}else if( $this->alphas($word[$i]) && $i!=0){
if($tag_name=='') {
//check the length of the string
$strln=strlen($this->store_cmnt);//4
if($strln != 0) {
$c = substr($this->store_cmnt, $strln-1, $strln);//#
if($c=='#' || $c=='@') {
$this->store_cmnt = substr($this->store_cmnt, 0, $strln-1);//23,
$tag_name = $c;
}
}
//$tag_name='';
}
//check that the previous character is # or @
if($c=='#' || $c=='@') {
$tag_name .= $word[$i];
$istag = true;
//check if lenis == i then add anchor tag her
if($i == $len-1) {
$istag =false;
//check if it is # or @
if($c=='#')
$this->save_tag($tag_name,$pid);
else
$this->save_mention($tag_name,$pid,$fbCmntInfo);
//$this->store_cmnt .= '<a >'. $tag_name.'</a>';
}
}else{
$this->store_cmnt .= $word[$i];
}
}else{
if($istag) {
//insert $tag_name
$this->save_tag($tag_name,$pid);
$istag = false;
$tag_name = '';
}
$this->store_cmnt .= $word[$i];
}
}
$this->store_cmnt .=" ";
}
}
Try this; it may be helpful.
function getResultStr($data, $param1, $param2){
return $param1 != $param2?(''.$data.''):$data;
}
function parseWord($word, $symbols){
$result = $word;
$status = FALSE;
foreach($symbols as $symbol){
if(($pos = strpos($word, $symbol)) !== FALSE){
$status = TRUE;
break;
}
}
if($status){
$temp = $symFlag = '';
$result = '';
foreach(str_split($word) as $char){
//Checking whether chars are symbols (@,#)
if(in_array($char, $symbols)){
if($symFlag != ''){
$result .= getResultStr($temp, $symFlag, $temp);
}
$symFlag = $temp = $char;
} else if(ctype_alnum($char) or $char == '_'){
//accepts [0-9][A-Z][a-z] and underscore (_)
//Checking whether Symbol already started
if($symFlag != ''){
$temp .= $char;
} else {
//Just appending the char to $result
$result .= $char;
}
} else {
//accepts all special symbols except @, # and _
if($symFlag != ''){
$result .= getResultStr($temp, $symFlag, $temp);
$temp = $symFlag = '';
}
$result .= $char;
}
}
$result .= getResultStr($temp, $symFlag, '');
}
return $result;
}
function parseComment($comment){
$str = '';
$symbols = array('@', '#');
foreach(explode(' ', $comment) as $word){
$result = parseWord($word, $symbols);
$str .= $result.' ';
}
return $str;
}
$str = "#Richard, #McClintock, a Latin professor at $%#Hampden_Sydney #College-in #Virginia, looked up one of the ######more obscure Latin words, #######%%#%##consectetur, from a Lorem Ipsum passage, and #going#through the cites of the word in classical literature";
echo "<br />Before Parsing : <br />".$str;
echo "<br /><br />After Parsing : <br />".parseComment($str);
Use strpos, preg_match or strstr.
Please refer to the string functions in PHP. You can do it in a line or two with those built-in functions.
If they do not fit, it is better to write a regex.
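As a rough illustration of the regex route, a single preg_replace_callback pass can find mentions (@name) and tags (#name) and dispatch on the leading symbol. This is a sketch under simplifying assumptions: handleMention and handleTag are hypothetical stand-ins for the question's at() and hash() methods, and runs of several symbols such as ##tag are simply left untouched, which may differ from what the original loop does.

```php
<?php
// Hypothetical handlers standing in for the question's at() and hash() methods.
function handleMention($name) { return '[mention:' . $name . ']'; }
function handleTag($name) { return '[tag:' . $name . ']'; }

function linkComment($comment)
{
    // A single @ or # that is not preceded by a word character or another symbol,
    // followed by one or more letters, digits or underscores.
    $pattern = '/(?<![\w@#])([@#])(\w+)/';
    return preg_replace_callback($pattern, function ($m) {
        return $m[1] === '@' ? handleMention($m[2]) : handleTag($m[2]);
    }, $comment);
}

echo linkComment('@name #tag plain a@b.com'), "\n";
// [mention:name] [tag:tag] plain a@b.com
```

Because the regex engine scans the string once, there is no need to split on spaces or walk each word character by character.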

PHP Remove JavaScript

I am trying to remove JavaScript from the HTML.
I can't get the regular expression to work with PHP; it's giving me a null array. Why?
<?php
$var = '
<script type="text/javascript">
function selectCode(a)
{
var e = a.parentNode.parentNode.getElementsByTagName(PRE)[0];
if (window.getSelection)
{
var s = window.getSelection();
if (s.setBaseAndExtent)
{
s.setBaseAndExtent(e, 0, e, e.innerText.length - 1);
}
else
{
var r = document.createRange();
r.selectNodeContents(e);
s.removeAllRanges();
s.addRange(r);
}
}
else if (document.getSelection)
{
var s = document.getSelection();
var r = document.createRange();
r.selectNodeContents(e);
s.removeAllRanges();
s.addRange(r);
}
else if (document.selection)
{
var r = document.body.createTextRange();
r.moveToElementText(e);
r.select();
}
}
</script>
';
function remove_javascript($java){
echo preg_replace('/<script\b[^>]*>(.*?)<\/script>/i', "", $java);
}
?>
This should do it:
echo preg_replace('/<script\b[^>]*>(.*?)<\/script>/is', "", $var);
The /s modifier is so that the dot . matches newlines too.
Just a warning: you should not use this type of regexp to sanitize user input for a website. There are just too many ways to get around it. For sanitizing, use something like the http://htmlpurifier.org/ library.
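The effect of the s modifier can be checked with a small, made-up multi-line input. This sketch only demonstrates the modifier; it is not meant as a sanitizer.

```php
<?php
// Made-up input: the script body spans several lines.
$html = "<p>before</p>\n<script type=\"text/javascript\">\nalert(1);\n</script>\n<p>after</p>";

// Without s, the dot stops at newlines, so the pattern never reaches </script>
$withoutS = preg_replace('/<script\b[^>]*>(.*?)<\/script>/i', '', $html);

// With s, the dot also matches newlines and the whole element is removed
$withS = preg_replace('/<script\b[^>]*>(.*?)<\/script>/is', '', $html);

var_dump($withoutS === $html);                 // bool(true): nothing was removed
var_dump(strpos($withS, '<script') === false); // bool(true): the script element is gone
```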
This might do more than you want, but depending on your situation you might want to look at strip_tags.
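To make the trade-off concrete: strip_tags removes every tag, not only script, and it keeps the text that was between the tags, including the script body. A minimal sketch with a made-up input:

```php
<?php
// Made-up input with ordinary markup and a script element.
$html = '<p>Hello</p><script>alert(1);</script>';

$plain = strip_tags($html);

echo $plain, "\n"; // Helloalert(1);
```

So the markup is gone, but the JavaScript source remains as plain text in the output.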
Here's an idea
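while (true) {
// strpos/stripos can return 0, so compare against false explicitly
$beginning = stripos($var, "<script");
$end = ($beginning === false) ? false : stripos($var, "</script>", $beginning);
if ($beginning === false || $end === false) {
break;
}
$stringLength = ($end + strlen("</script>")) - $beginning;
// substr_replace returns the new string; it does not modify $var in place
$var = substr_replace($var, "", $beginning, $stringLength);
}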
In your case you could regard the string as a list of newline delimited strings and remove the lines containing the script tags(first & second to last) and you wouldn't even need regular expressions.
Though if what you are trying to do is preventing XSS it might not be sufficient to only remove script tags.
function clean_jscode($script_str) {
$script_str = htmlspecialchars_decode($script_str);
$search_arr = array('<script', '</script>');
$script_str = str_ireplace($search_arr, $search_arr, $script_str);
$split_arr = explode('<script', $script_str);
$remove_jscode_arr = array();
foreach($split_arr as $key => $val) {
$newarr = explode('</script>', $split_arr[$key]);
$remove_jscode_arr[] = ($key == 0) ? $newarr[0] : $newarr[1];
}
return implode('', $remove_jscode_arr);
}
You can remove JavaScript code from an HTML string with the help of the following PHP function.
You can read more about it here:
https://mradeveloper.com/blog/remove-javascript-from-html-with-php
function sanitizeInput($inputP)
{
$spaceDelimiter = "#BLANKSPACE#";
$newLineDelimiter = "#NEWLNE#";
$inputArray = [];
$minifiedSanitized = '';
$unMinifiedSanitized = '';
$sanitizedInput = [];
$returnData = [];
$returnType = "string";
if($inputP === null) return null;
if($inputP === false) return false;
if(is_array($inputP) && sizeof($inputP) <= 0) return [];
if(is_array($inputP))
{
$inputArray = $inputP;
$returnType = "array";
}
else
{
$inputArray[] = $inputP;
$returnType = "string";
}
foreach($inputArray as $input)
{
$minified = str_replace(" ",$spaceDelimiter,$input);
$minified = str_replace("\n",$newLineDelimiter,$minified);
//removing <script> tags
$minifiedSanitized = preg_replace("/[<][^<]*script.*[>].*[<].*[\/].*script*[>]/i","",$minified);
$unMinifiedSanitized = str_replace($spaceDelimiter," ",$minifiedSanitized);
$unMinifiedSanitized = str_replace($newLineDelimiter,"\n",$unMinifiedSanitized);
//removing inline js events
$unMinifiedSanitized = preg_replace("/([ ]on[a-zA-Z0-9_-]{1,}=\".*\")|([ ]on[a-zA-Z0-9_-]{1,}='.*')|([ ]on[a-zA-Z0-9_-]{1,}=.*[.].*)/","",$unMinifiedSanitized);
//removing inline js
$unMinifiedSanitized = preg_replace("/([ ]href.*=\".*javascript:.*\")|([ ]href.*='.*javascript:.*')|([ ]href.*=.*javascript:.*)/i","",$unMinifiedSanitized);
$sanitizedInput[] = $unMinifiedSanitized;
}
if($returnType == "string" && sizeof($sanitizedInput) > 0)
{
$returnData = $sanitizedInput[0];
}
else
{
$returnData = $sanitizedInput;
}
return $returnData;
}
This was very useful for me. Try this code.
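while(($pos = stripos($content,"<script"))!==false){
// look for the closing tag after the opening one; stop if it is missing
$end_pos = stripos($content,"</script>",$pos);
if($end_pos === false) break;
$start = substr($content, 0, $pos);
$end = substr($content, $end_pos+strlen("</script>"));
$content = $start.$end;
}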
$text = strip_tags($content);
I use this:
function clear_text($s) {
$do = true;
while ($do) {
$start = stripos($s,'<script');
$stop = stripos($s,'</script>');
if ((is_numeric($start))&&(is_numeric($stop))) {
$s = substr($s,0,$start).substr($s,($stop+strlen('</script>')));
} else {
$do = false;
}
}
return trim($s);
}
