This function is part of a system I have created that generates semi-unique content. However, it needs to run on 350 * 200 words and takes an awfully long time (300+ seconds). Are there any obvious slowdowns in the code?
<?php
set_time_limit(0);
function spinthis($article)
{
//words to be replaced
$words = file_get_contents('words.txt', FILE_USE_INCLUDE_PATH);
$checker = "2";
//Explode each word of the article into $word
foreach (explode(" ", $article) as $word) {
$checker = "4";
$checker2 = "0";
if($word != '') {
if (strpos($words, $word) == true) {
//Explode each line of words.txt into $spinLine
foreach (explode("\n", $words) as $spinLine) {
//Explode each word from the chosen line
foreach (explode("|", trim($spinLine, '{}')) as $spinword) {
$stage2count = count(explode("|", $spinLine));
//if word matches grab the rest of the lines
if ($spinword == $word) {
$replaceWords = explode("|", trim($spinLine, '{}'));
shuffle($replaceWords);
//Add replacement word
$newword = str_replace('}','',$replaceWords[0]);
$newword = str_replace('}','',$newword);
$newword = lcfirst($newword);
$earticle = $earticle.' '.$newword;
$checker = "1";
break 2;
}else{
$checker2++;
if ($checker2 == $stage2count) {
$checker = "0";
break 1;
}
}
}
}
}else{
$checker = "0";
$checker2++;
}
}
//CHECK IF IT IS A WORD
if ($checker == '0'){
//Word not found
$earticle = $earticle . ' ' . $word;
}elseif($checker == '1'){
//Word found
//Do Nothing
}elseif($checker == '2'){
//First time
$earticle = $word;
}
}
return $earticle;
}
?>
the words.txt file contains lots of the following:
{address|Tackle|Handle|Target}
{add|Include|Incorporate|Increase|Put}
{adequate|Sufficient|Satisfactory|Ample}
{adjustment|Realignment|Adjusting|Modification|Change}
{adjust|Alter|Change|Modify|Regulate}
{administer|Give|Provide|Dispense|Render}
{administration|Management|Supervision|Government}
{administrator|Manager|Supervisor|Officer|Owner}
{admire|Appreciate|Enjoy|Respect|Adore}
{admission|Entrance|Entry|Programs|Everyone}
{admit|Acknowledge|Confess|Disclose|Declare}
{adolescent|Teenage}
{adoption|Ownership|Usage|Use}
{adopt|Follow|Embrace|Undertake}
{adult|Grownup|Grown-up|Person}
{advanced|Advanced level|High level|Higher level|Sophisticated}
{advance|Progress}
{advantage|Benefit|Edge|Gain}
{adventure|Journey|Experience|Venture|Voyage}
{advertising|Marketing|Promotion}
{advice|Guidance|Assistance}
{adviser|Agent|Advisor|Mechanic|Coordinator}
{advise|Recommend|Suggest|Guide|Encourage}
{advocate|Suggest|Supporter}
{ad|Advert|Advertisement|Advertising|Offer}
{aesthetic|Visual|Cosmetic|Artistic|Functional}
{affair|Event|Matter|Occasion}
{affect|Impact|Influence}
{afford|Manage}
{afraid|Scared|Frightened|Reluctant|Fearful}
{afternoon|Mid-day|Morning|Day|Evening}
{agency|Company|Organization|Firm|Bureau}
{agenda|Plan|Goal|Schedule|Intention}
{agent|Broker|Realtor|Adviser|Representative}
{age|Era}
{aggression|Hostility|Violence}
{aggressive|Intense|Hostile|Ambitious|Extreme}
{ago|Previously|Before}
{agreement|Contract|Arrangement|Settlement|Deal}
{agree|Concur|Consent|Acknowledge|Recognize}
{agricultural|Farming}
{agriculture|Farming}
{ahead|Forward|Onward}
{ah|Oh}
{aide|Assist|Help|Guide|Benefit}
{aid|Help|Support|Assist|Assistance}
{aim|Goal|Purpose|Intention}
{aircraft|Plane|Airplane}
{airline|Flight}
{airplane|Plane|Aircraft|Airline|Jet}
{air|Atmosphere|Oxygen}
{aisle|Section|Fence}
{alarm|Alert}
{album|Recording|Record|Lp|Cd}
{alcohol|Booze|Liquor}
{alien|Unfamiliar|Noncitizen|Nonresident|Strange}
{alike|Likewise|Equally}
{alive|Living|Well}
{allegation|Claims|Claim|Accusation}
{allegedly|Presumably|Apparently|Purportedly|Theoretically}
{alleged|So-called|Supposed|Claimed|Assumed}
{alley|Street}
{alliance|Connections|Coalition}
{allow|Permit|Enable|Let}
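One likely slowdown is that the function above splits the whole of words.txt again for every word of the article. A minimal sketch of an alternative, assuming the file format shown above: build a lookup table once, then do one array lookup per word. The names `buildSpinMap` and `spinFast` are hypothetical, and the sketch has not been measured against the original.

```php
<?php
// Hypothetical helper: parse the contents of words.txt once into a lookup table
// that maps every variant to its full group of alternatives.
function buildSpinMap(string $wordsFileContents): array
{
    $map = [];
    foreach (explode("\n", $wordsFileContents) as $line) {
        // Drop whitespace (including "\r") first, then the surrounding braces
        $line = trim(trim($line), '{}');
        if ($line === '') {
            continue;
        }
        $group = explode('|', $line);
        foreach ($group as $variant) {
            // Keep the first group that mentions a variant, as the original loop does
            if (!isset($map[$variant])) {
                $map[$variant] = $group;
            }
        }
    }
    return $map;
}

// Hypothetical replacement for spinthis(): one array lookup per word.
function spinFast(string $article, array $map): string
{
    $out = [];
    foreach (explode(' ', $article) as $word) {
        if ($word === '') {
            continue; // skip the empty strings produced by repeated spaces
        }
        if (isset($map[$word])) {
            $group = $map[$word];
            $out[] = lcfirst($group[array_rand($group)]);
        } else {
            $out[] = $word;
        }
    }
    return implode(' ', $out);
}

$map = buildSpinMap("{add|Include|Incorporate}\n{age|Era}");
echo spinFast('we add age', $map), PHP_EOL;
```

The map should be built once, outside the loop that processes the articles, so that the cost of parsing the file is not paid per article.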
My code so far:
$text = 'Herman Archer LIVEs in neW YORK';
$oldWords = explode(' ', $text);
$newWords = array();
$counter = 0;
foreach ($oldWords as $word) {
for($k=0;$k<strlen($word);$k++)
$counter = 0;
if ($word[k] == strtoupper($word[$k]))
$counter=$counter+1;
if($counter>1)
$word = strtolower($word);
if($counter == 1)
$word = ucfirst(strtolower($word));
else $word = strtolower($word);
echo $word."<br>";
}
Result:
Herman
Archer
Lives
In
New
York
Expected output:
Herman Archer lives in new york
If you want to use the counter approach, you could use something like the following:
<?php
$text = 'Herman Archer LIVEs in A neW YORK';
$words = explode(' ', $text);
foreach ($words as &$word) {
    $counter = 0;
    // Count the uppercase letters after the first character
    for ($i = 1; $i < strlen($word); $i++) {
        if (ctype_upper($word[$i])) $counter++;
    }
    // Any uppercase letter past the first position means the word is not properly cased
    if ($counter > 0) $word = strtolower($word);
}
unset($word); // break the reference left over from the loop
echo implode(' ', $words);
Let's do it in a simple manner. Let's loop $oldWords, compare the strings from the second character to the end with their lower-case version and replace if the result is different.
for ($index = 0; $index < count($oldWords); $index++) {
//Skip one-lettered words, such as a or A
if (strlen($oldWords[$index]) > 1) {
$lower = strtolower($oldWords[$index]);
if (substr($oldWords[$index], 1) !== substr($lower, 1)) {
$oldWords[$index] = $lower;
}
}
}
If your text is not only in English, you might want to switch to mb_strtolower.
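A minimal sketch of the same comparison with the multibyte functions, assuming the mbstring extension is loaded and the text is UTF-8 encoded. The function name `normalizeWordsMb` is hypothetical.

```php
<?php
// Assumption: mbstring is available and $text is UTF-8.
function normalizeWordsMb(string $text): string
{
    $words = explode(' ', $text);
    foreach ($words as $i => $word) {
        // Skip one-lettered words, such as "a" or "A"
        if (mb_strlen($word, 'UTF-8') > 1) {
            $lower = mb_strtolower($word, 'UTF-8');
            // Lowercase the whole word when anything after the first character changes
            if (mb_substr($word, 1, null, 'UTF-8') !== mb_substr($lower, 1, null, 'UTF-8')) {
                $words[$i] = $lower;
            }
        }
    }
    return implode(' ', $words);
}

echo normalizeWordsMb('Élodie ÉCRIT à PARIS'), PHP_EOL; // Élodie écrit à paris
```

The byte-based `strtolower` and `substr` can leave or cut multibyte characters incorrectly, which is why both calls are swapped for their `mb_` counterparts here.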
<?php
$text = 'Herman Archer LIVEs in neW YORK';
function normalizeText($text)
{
$words = explode(" ", $text);
$normalizedWords = array_map(function ($word) {
$loweredWord = strtolower($word);
if (ucfirst($loweredWord) === $word) {
return $word;
}
return $loweredWord;
}, $words);
return join(" ", $normalizedWords);
}
echo normalizeText($text) . PHP_EOL; // Herman Archer lives in new york
You can combine ctype_upper for the first character and ctype_lower for the rest:
$text = 'Herman Archer LIVEs in neW YORK';
$oldWords = explode(' ', $text);
$newWords = '';
foreach ($oldWords as $word) {
if(ctype_upper($word[0])&&ctype_lower(substr($word,1))){
$newWords .= $word.' ';
}else{
$newWords .= strtolower($word).' ';
}
}
echo $newWords;
Meanwhile, I've found out that this can be done in an easier way:
if (isset($_POST["sumbit"])) {
    $string = $_POST["string"];
    if (!empty($string)) {
        $words = explode(" ", $string);
        foreach ($words as $word) {
            // Cut off the first letter and check whether the rest is lowercase.
            // If it is, leave the word as it is.
            // If not, turn the whole word lowercase.
            $wordCut = substr($word, 1);
            if (ctype_lower($wordCut)) {
                echo $word . " ";
            } else {
                echo strtolower($word) . " ";
            }
        }
    }
}
I have a working function that strips profanity.
The word list is composed of 1700 bad words.
My problem is that it censors
'badwords '
but not
'badwords.', 'badwords' and the like.
If I remove the space after the word, using
$badword[$key] = $word;
instead of
$badword[$key] = $word." ";
then I would have a bigger problem, because if the bad word is CON it will also strip part of the word CONSTANT.
My question is: how can I strip a word that is followed by a special character other than a space?
badword. badword# badword,
function badWordFilter($data)
{
$wordlist = file_get_contents("badwordsnew.txt");
$words = explode(",", $wordlist);
$badword = array();
$replacementword = array();
foreach ($words as $key => $word)
{
$badword[$key] = $word." ";
$replacementword[$key] = addStars($word);
}
return str_ireplace($badword,$replacementword,$data);
}
function addStars($word)
{
$length = strlen($word);
return "*" . substr($word, 1, 1) . str_repeat("*", $length - 2)." " ;
}
Assuming that $data is a text that needs to be censored, badWordFilter() will return the text with bad words as *.
function badWordFilter($data)
{
$wordlist = file_get_contents("badwordsnew.txt");
$words = explode(",", $wordlist);
$specialCharacters = ["!","@","#","$","%","^","&","*","(",")","_","+",".",",",""];
$dataList = explode(" ", $data);
$output = "";
foreach ($dataList as $check)
{
$temp = $check;
$doesContain = contains($check, $words);
if($doesContain != false){
foreach($specialCharacters as $character){
if($check == $doesContain . $character || $check == $character . $doesContain ){
$temp = addStars($doesContain);
}
}
}
$output .= $temp . " ";
}
return $output;
}
function contains($str, array $arr)
{
foreach($arr as $a) {
if (stripos($str,$a) !== false) return $a;
}
return false;
}
function addStars($word)
{
$length = strlen($word);
return "*" . substr($word, 1, 1) . str_repeat("*", $length - 2)." " ;
}
I was able to answer my own question with the help of @maxchehab's answer, but I can't accept his answer because it is faulty in some areas. I am posting this answer so others can use this code when they need a bad word filter.
function badWordFinder($data)
{
$data = " " . $data . " "; // adding whitespace at the beginning and end of $data helps strip bad words located at the beginning and/or end.
$badwordlist = "bad,words,here,comma separated,no space before and after the word(s),multiple word is allowed"; //file_get_contents("badwordsnew.txt"); //
$badwords = explode(",", $badwordlist);
$capturedBadwords = array();
foreach ($badwords as $bad)
{
if(stripos($data, $bad))
{
array_push($capturedBadwords, $bad);
}
}
return badWordFilter($data, $capturedBadwords);
}
function badWordFilter($data, array $capturedBadwords)
{
$specialCharacters = ["!","@","#","$","%","^","&","*","(",")","_","+",".",","," "];
foreach ($specialCharacters as $endingAt)
{
foreach ($capturedBadwords as $bad)
{
$data = str_ireplace($bad.$endingAt, addStars($bad), $data);
}
}
return trim($data);
}
function addStars($bad)
{
$length = strlen($bad);
return "*" . substr($bad, 1, 1) . str_repeat("*", $length - 2)." ";
}
$str = 'i am bad words but i cant post it here because it is not allowed by the website some bad words# here with bad. ending in specia character but my code is badly strong so i can captured and striped those bad words.';
echo "$str<br><br>";
echo badWordFinder($str);
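An alternative to listing every possible trailing character is a single regular expression with word boundaries (`\b`). This is a different technique from the `str_ireplace` code above, not a drop-in copy of it. The following is a minimal sketch; the function name `censorWithBoundaries` is hypothetical, and it assumes every bad word starts and ends with a letter, digit or underscore, since `\b` only works next to such characters.

```php
<?php
// Hypothetical alternative: one case-insensitive regex with word boundaries.
function censorWithBoundaries(string $data, array $badwords): string
{
    $badwords = array_filter($badwords, function ($w) { return $w !== ''; });
    if (count($badwords) === 0) {
        return $data;
    }
    // Longest entries first, so multi-word entries win over their shorter parts
    usort($badwords, function ($a, $b) { return strlen($b) - strlen($a); });
    // preg_quote() escapes characters that have a special meaning in a regex
    $quoted = array_map(function ($w) { return preg_quote($w, '/'); }, $badwords);
    $pattern = '/\b(?:' . implode('|', $quoted) . ')\b/i';

    return preg_replace_callback($pattern, function ($m) {
        $word = $m[0];
        $len = strlen($word);
        if ($len < 3) {
            return str_repeat('*', $len);
        }
        // Same masking idea as addStars(): only the second character stays visible
        return '*' . $word[1] . str_repeat('*', $len - 2);
    }, $data);
}

echo censorWithBoundaries('con. CONSTANT con# con', ['con']), PHP_EOL; // *o*. CONSTANT *o*# *o*
```

Because the boundary check is part of the pattern, CONSTANT is left alone while `con.`, `con#` and a bare `con` at the end of the text are all masked, and the surrounding punctuation is preserved.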
function PigLatin($sentence)
{
$vowelSufix = "way";
$consonantSufix = "ay";
$vowelArray = array('a','e','o','u','i');
$finalword;
$wordArray = explode(' ', $sentence);
foreach ($wordArray as $value)
{
$word = $value;
$consonant = $word[0];
if (in_array($word[0], $vowelArray))
{
$finalword = substr($word, 1). $word[0]. $vowelSufix. "<br />";
}
else
{
for ($i=1; $i <strlen($word) ; $i++)
{
if (in_array($word[$i], $vowelArray))
{
$finalword = substr($word, $i). $consonant. $consonantSufix . "<br />";
}
else
{
$consonant .= $word[$i];
}
}
}
if ($finalword[0] == $finalword[1])
{
return substr($finalword, 1);
}
$finalword .= $finalword;
}
var_dump($wordArray);
}
So basically it is giving me the following error: "Uninitialized string offset". I know this error comes up because I am not using the arrays properly, but I am stuck. Can someone please help me?
Your script doesn't handle the case where $word is empty, which will happen if you have two spaces in a row in the sentence. If $word is an empty string, $word[0] will get the error you reported, because there is no such character in the string.
Change the loop to:
foreach ($wordArray as $word)
{
if ($word === '') {
continue;
}
This will skip empty words. Note also that you don't need separate variables $value and $word.
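Another option is to avoid producing empty words in the first place. A minimal sketch, assuming words are separated by any run of whitespace; the helper name `splitWords` is hypothetical, while `preg_split` and the `PREG_SPLIT_NO_EMPTY` flag are standard PHP.

```php
<?php
// Hypothetical helper: split on runs of whitespace and drop empty pieces.
function splitWords(string $sentence): array
{
    return preg_split('/\s+/', $sentence, -1, PREG_SPLIT_NO_EMPTY);
}

print_r(splitWords(' hello  big   world '));
```

With this in place of `explode(' ', $sentence)`, `$word[0]` always exists inside the loop, so the explicit empty-string check becomes unnecessary.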
How can I increase the performance of this code in PHP?
Or is there any alternative method to parse the comment: if a string starts with @ then call at(),
or if a string starts with "#" then call hash()?
Here the sample comment is "#hash #at #####tag";
//this is the comment with mention,tag
function getCommentWithLinks($comment="#name #tag ###nam1 test", $pid, $img='', $savedid='', $source='', $post_facebook='', $fbCmntInfo='') {
//assign to facebook facebookComment bcz it is used to post into the fb wall
$this->facebookComment = $comment;
//split the comment based on the space
$comment = explode(" ", $comment);
//get the lenght of the splitted array
$cmnt_length = count($comment);
$store_cmnt = $tagid = '';
$this->img = $img;
$this->saveid = $savedid;//this is uspid in product saved table primary key
//$this->params = "&product=".base_url()."product/".$this->saveid;
$this->params['product'] = base_url()."product/".$this->saveid;
//$this->params['tags']='';
foreach($comment as $word) {
//check it is tag or not
//the first character must be a # and remaining all alphanumeric if any # or # is exist then it is comment
//find the length of the tag #mention
$len = strlen($word);
$cmt = $c = $tag_name = '';
$j = 0;
$istag = false;
for($i=0; $i<$len; $i++) {
$j = $i-1;
//check if the starting letter is # or not
if($word[$i] == '#') {
//insert tagname
if($istag) {
//insert $tag_name
$this->save_tag($tag_name, $pid);
$istag = false;
$tag_name = '';
}
//check for comment
if($i >= 1 && $word[$j]=='#') {
$this->store_cmnt .= $word[$i];
}else{
//append to the store_coment if the i is 1 or -1 or $word[j]!=#
$this->store_cmnt .= $word[$i];//23,#
}
}else if($word[$i]=='@') {
//insert tagname
if($istag) {
//insert $tag_name
$this->save_mention($tag_name, $pid, $fbCmntInfo);
$istag = false;
$tag_name = '';
}
//check for comment
if($i >= 1 && $word[$j]=='#') {
$this->store_cmnt .= $word[$i];
}else{
$this->store_cmnt .= $word[$i];//23,#
}
}else if( $this->alphas($word[$i]) && $i!=0){
if($tag_name=='') {
//check the length of the string
$strln=strlen($this->store_cmnt);//4
if($strln != 0) {
$c = substr($this->store_cmnt, $strln-1, $strln);//#
if($c=='#' || $c=='@') {
$this->store_cmnt = substr($this->store_cmnt, 0, $strln-1);//23,
$tag_name = $c;
}
}
//$tag_name='';
}
//check that previous is # or # other wise it is
if($c=='#' || $c=='@') {
$tag_name .= $word[$i];
$istag = true;
//check if lenis == i then add anchor tag her
if($i == $len-1) {
$istag =false;
//check if it is # or #
if($c=='#')
$this->save_tag($tag_name,$pid);
else
$this->save_mention($tag_name,$pid,$fbCmntInfo);
//$this->store_cmnt .= '<a >'. $tag_name.'</a>';
}
}else{
$this->store_cmnt .= $word[$i];
}
}else{
if($istag) {
//insert $tag_name
$this->save_tag($tag_name,$pid);
$istag = false;
$tag_name = '';
}
$this->store_cmnt .= $word[$i];
}
}
$this->store_cmnt .=" ";
}
}
Try this; it may be helpful.
function getResultStr($data, $param1, $param2){
return $param1 != $param2?(''.$data.''):$data;
}
function parseWord($word, $symbols){
$result = $word;
$status = FALSE;
foreach($symbols as $symbol){
if(($pos = strpos($word, $symbol)) !== FALSE){
$status = TRUE;
break;
}
}
if($status){
$temp = $symFlag = '';
$result = '';
foreach(str_split($word) as $char){
//Checking whether chars are symbols(#,#)
if(in_array($char, $symbols)){
if($symFlag != ''){
$result .= getResultStr($temp, $symFlag, $temp);
}
$symFlag = $temp = $char;
} else if(ctype_alnum($char) or $char == '_'){
//accepts[0-9][A-Z][a-z] and unserscore (_)
//Checking whether Symbol already started
if($symFlag != ''){
$temp .= $char;
} else {
//Just appending the char to $result
$result .= $char;
}
} else {
//accepts all special symbols excepts #,# and _
if($symFlag != ''){
$result .= getResultStr($temp, $symFlag, $temp);
$temp = $symFlag = '';
}
$result .= $char;
}
}
$result .= getResultStr($temp, $symFlag, '');
}
return $result;
}
function parseComment($comment){
$str = '';
$symbols = array('#', '@');
foreach(explode(' ', $comment) as $word){
$result = parseWord($word, $symbols);
$str .= $result.' ';
}
return $str;
}
$str = "#Richard, #McClintock, a Latin professor at $%#Hampden_Sydney #College-in #Virginia, looked up one of the ######more obscure Latin words, #######%%#%##consectetur, from a Lorem Ipsum passage, and #going#through the cites of the word in classical literature";
echo "<br />Before Parsing : <br />".$str;
echo "<br /><br />After Parsing : <br />".parseComment($str);
Use strpos, preg_match or strstr.
Please refer to the string functions in PHP. You can do it in a line or two with those built-in functions.
If they do not fit your case, it is better to write a regex.
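A minimal sketch of the regex route with `preg_match_all`. The rule used here is an assumption, not taken from the code above: a tag or mention is a single `#` or `@` that is not preceded by a word character or by another `#` or `@`, followed by letters, digits or underscores. The function name `extractTagsAndMentions` is hypothetical.

```php
<?php
// Hypothetical helper: collect #tags and @mentions in one pass over the comment.
function extractTagsAndMentions(string $comment): array
{
    $found = ['tags' => [], 'mentions' => []];
    // (?<![\w#@]) rejects a symbol glued to a word or to another symbol, e.g. "##tag"
    if (preg_match_all('/(?<![\w#@])([#@])(\w+)/', $comment, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $key = $m[1] === '#' ? 'tags' : 'mentions';
            $found[$key][] = $m[2];
        }
    }
    return $found;
}

print_r(extractTagsAndMentions('#hash @at #####tag'));
```

Under that assumed rule, `#hash` is reported as a tag and `@at` as a mention, while `#####tag` is treated as plain comment text. The returned names could then be passed to whatever saves tags and mentions.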
I use PHP's preg_match to check that a variable starts and ends with specific given words,
example:
$first_word = 't'; // I want to force 'this'
$last_word = 'ne'; // I want to force 'done'
$str = 'this function can be done';
if(preg_match('/^' . $first_word . '(.*)' . $last_word .'$/' , $str))
{
echo 'true';
}
But the problem is that I want to force a match on the whole word (at the start and at the end), not just on the first or last characters.
Use \b as a word boundary in the search:
$first_word = 't'; // I want to force 'this'
$last_word = 'ne'; // I want to force 'done'
$str = 'this function can be done';
if(preg_match('/^' . $first_word . '\b(.*)\b' . $last_word .'$/' , $str))
{
echo 'true';
}
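To see the effect of the word boundary on both a full and a partial word, the check can be wrapped in a small function. This is a minimal sketch; the function name `startsAndEndsWithWords` is hypothetical, and `preg_quote` is added as an assumption that the words may contain characters with a special meaning in a regex.

```php
<?php
// Hypothetical wrapper around the \b pattern: true only for whole-word matches.
function startsAndEndsWithWords(string $str, string $first, string $last): bool
{
    $pattern = '/^' . preg_quote($first, '/') . '\b.*\b' . preg_quote($last, '/') . '$/s';
    return preg_match($pattern, $str) === 1;
}

var_dump(startsAndEndsWithWords('this function can be done', 'this', 'done')); // bool(true)
var_dump(startsAndEndsWithWords('this function can be done', 't', 'ne'));      // bool(false)
```

The second call fails because there is no word boundary between the "t" and the "h" of "this", which is exactly the partial match the question wants to reject.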
I would go about this in a slightly different way:
$firstword = 't';
$lastword = 'ne';
$string = 'this function can be done';
$words = explode(' ', $string);
if (preg_match("/^{$firstword}/i", reset($words)) && preg_match("/{$lastword}$/i", end($words)))
{
echo 'true';
}
==========================================
Here's another way to achieve the same thing
$firstword = 'this';
$lastword = 'done';
$string = 'this can be done';
$words = explode(' ', $string);
if (reset($words) === $firstword && end($words) === $lastword)
{
echo 'true';
}
This will always echo true, because we know that firstword and lastword are correct. Try changing them to something else and it will not echo true.
I wrote a function to get the words that match the start, but it does not use any regex.
You can write one for the end in the same way. I haven't added the function for the end because it is long...
<?php
function StartSearch($start, $sentence)
{
    $ret = array();
    foreach (explode(" ", $sentence) as $val)
    {
        // Compare character by character and stop at the first mismatch
        $j = 0;
        while ($j < strlen($start) && $j < strlen($val) && $val[$j] == $start[$j])
        {
            $j++;
        }
        // Every character of $start matched the beginning of $val
        if ($j == strlen($start))
        {
            $ret[] = $val;
        }
    }
    return $ret;
}
$str = 'this function can be done';
print_r(StartSearch("th", $str));
?>