PHP finding out if words are contained in a large array - php

I need to check if an array of words ($words) is present in a larger array ($dictionary).
If all the words are there, no errors.
If one or more are not included in $dictionary, I want to send out an error message.
So far, I have come up with this:
<?php
// first I select a column from a MySQL table and retrieve
//all the words contained in that field.
$spell = "SELECT * FROM eventi WHERE utente='{$_SESSION['username']}'";
$qspell = mysql_query($spell) or die ("Error Query [".$spell."]");
while ($risu = mysql_fetch_array($qspell)){
$risu = mysql_fetch_array($qspell);
// the following lines remove parentheses, digits and multiple spaces
$desc = strtolower($risu["descrizione"]);
$words = explode(" ",$desc);
$words = str_replace("(","",$words);
$words = str_replace(")","",$words);
$words = preg_replace('/[0-9]+/','',$words);
$words = preg_replace('/\s+/',' ',$words);
// the array $dictionary is generated taking a long list
//of words from a txt file
$dictionary = file('./docs/dizionario.txt',FILE_IGNORE_NEW_LINES);
foreach($words as $k => $v){
if (in_array($v, $dictionary)){
//Do something?
} else {
$error = "error";
echo "The word ".$v." can't be found in the dictionary.";
}
}
}
if (!isset($error)){
echo "All the words are in the dictionary.";
} else {
echo "There are some unknown words. See above.";
}
?>
This code always returns one single error message, without reporting which word can't be found.
On top of that, words which are actually missing are not detected.
What am I doing wrong?

Apparently, the problem lies at the line:
$words = preg_replace('/[0-9]+/','',$words);
Removing digits somehow messes the whole matching procedure.
Without removing digits, my code works.
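A likely explanation, offered as a guess rather than a confirmed diagnosis: the digits are stripped after explode(), so a token made only of digits becomes an empty string, and that empty string is then looked up in the dictionary and reported as missing (with nothing visible between "The word" and "can't be found"). Separately, the second mysql_fetch_array() call inside the while loop discards every other row. The sketch below cleans the whole string before splitting and drops empty tokens. The helper name find_unknown_words and the small inline dictionary (standing in for dizionario.txt) are assumptions made for illustration.

```php
<?php
// Hypothetical helper: returns the words of $text that are missing from $dictionary.
function find_unknown_words(string $text, array $dictionary): array {
    // Clean the whole string BEFORE splitting, so no empty tokens are created.
    $text = strtolower($text);
    $text = str_replace(array('(', ')'), '', $text);
    $text = preg_replace('/[0-9]+/', '', $text);
    // Split on any run of whitespace and discard empty pieces.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    // Flip the dictionary so each lookup is a key check instead of a linear scan.
    $lookup = array_flip($dictionary);
    $unknown = array();
    foreach ($words as $word) {
        if (!isset($lookup[$word])) {
            $unknown[] = $word;
        }
    }
    return array_values(array_unique($unknown));
}

// Stand-in for file('./docs/dizionario.txt', FILE_IGNORE_NEW_LINES).
$dictionary = array('riunione', 'con', 'il', 'cliente');
$unknown = find_unknown_words('Riunione (con) il cliente 2014 xyzzy', $dictionary);
echo $unknown
    ? 'Unknown words: ' . implode(', ', $unknown)
    : 'All the words are in the dictionary.';
```

Collecting the unknown words in an array, instead of setting a flag, also makes it easy to report exactly which words failed.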

Related

PHP - Search, put letters written in bold

I'm trying to do a search engine where I write in a textbox, for example, "Mi" and it selects and shows "Mike Ross". However it's not working with spaces. I write "Mike" and I get "Mike Ross", but when I write "Mike " I get "Mike Ross" (no bold).
The same is happening with accents.
So I write "Jo" and the result is "João Carlos". If I write "Joa", the result is "João Carlos" (without any bold part). I want to ignore the accents while writing but still display them in the results.
So this is my script after the SELECT:
while($row = $result->fetch_array()) {
$name = $row['name'];
$array = explode(' ',trim($name));
$array_length = count($array);
for ($i=0; $i<$array_length; $i++ ) {
$letters = substr($array[$i], 0, $q_length);
if (strtoupper($letters) == strtoupper($q)) {
$bold_name = '<strong>'.$letters.'</strong>';
$final_name = preg_replace('~'.$letters.'~i', $bold_name, $array[$i], 1);
$array[$i] = $final_name;
}
$array[$i] = $array[$i]." ";
}
foreach ($array as $t_name) {
echo $t_name;
}
}
Thank you for your help!
if (strtoupper($letters) == strtoupper($q))
This will never evaluate to true when $q contains a space, because explode(' ', trim($name)) removes the spaces from the pieces you compare against, so any value of $q with a space in it can never equal $letters.
Here's a quick example that does what I think you're looking for
<?php
$q = "Mike "; // User query
$name = "Mike Ross"; // Database row value
if (stripos($name, $q) !== false) { // Case-insensitive match
    // Escape regex metacharacters in the user's query before using it as a pattern
    $pattern = '/(' . preg_quote($q, '/') . ')/i';
    // Case-insensitive replace of match with match enclosed in strong tag
    $result = preg_replace($pattern, '<strong>$1</strong>', $name);
    print_r($result);
}
// Result is
// <strong>Mike </strong>Ross
From what I can tell (a quick google for "replace accented characters PHP"), you're kind of out of luck with that one. This question provides a quick solution using strtr, and this tip uses a similar method with str_replace.
Unfortunately, these rely on predefined character sets, so incoming accents you haven't prepared for will fail. You may be better off relying on users to enter the special characters when they search, or create a new column with a "searchable" name with the accented characters replaced as best as you can, and return the real name as the "matched" display field.
One more Note
I found another solution that can do most of what you want, except the returned name will not have the accent. It will, however, match the accented value in the DB with a non-accented search. Modified code is:
<?php
$q = "Joa";
$name = "João Carlos";
$searchable_name = replace_accents($name);
if(stripos($searchable_name, $q) !== false)
{
$result = preg_replace('/(' . preg_quote($q, '/') . ')/i', '<strong>$1</strong>', $searchable_name);
print_r($result);
}
function replace_accents($str) {
$str = htmlentities($str, ENT_COMPAT, "UTF-8");
$str = preg_replace('/&([a-zA-Z])(uml|acute|grave|circ|tilde);/','$1',$str);
return html_entity_decode($str);
}
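To keep the accent in the displayed name, one option is to search in the accent-stripped copy but cut the bold range out of the original string at the same character positions. The sketch below is an illustration, not a tested production routine. It assumes UTF-8 input and relies on the fact that the replace_accents() idea above swaps each accented letter for exactly one plain letter, so positions line up. The names strip_accents, utf8_chars and bold_match are hypothetical, and the output is not HTML-escaped here.

```php
<?php
// Same idea as replace_accents() above: accented letter -> entity -> base letter.
function strip_accents(string $str): string {
    $str = htmlentities($str, ENT_COMPAT, 'UTF-8');
    $str = preg_replace('/&([a-zA-Z])(uml|acute|grave|circ|tilde|cedil);/', '$1', $str);
    return html_entity_decode($str, ENT_COMPAT, 'UTF-8');
}

// Split a UTF-8 string into an array of characters (the /u flag makes PCRE UTF-8 aware).
function utf8_chars(string $str): array {
    return preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
}

// Bold the first accent-insensitive, case-insensitive match of $q inside $name.
function bold_match(string $name, string $q): string {
    $chars  = utf8_chars($name);                              // original characters
    $plain  = utf8_chars(strtolower(strip_accents($name)));   // searchable copy
    $needle = utf8_chars(strtolower(strip_accents($q)));
    $n = count($needle);
    if ($n === 0) {
        return $name;
    }
    for ($i = 0; $i + $n <= count($plain); $i++) {
        if (array_slice($plain, $i, $n) === $needle) {
            // Cut the same character range out of the ORIGINAL string.
            return implode('', array_slice($chars, 0, $i))
                . '<strong>' . implode('', array_slice($chars, $i, $n)) . '</strong>'
                . implode('', array_slice($chars, $i + $n));
        }
    }
    return $name; // no match: return the name unchanged
}

echo bold_match('João Carlos', 'Joa'), "\n";
echo bold_match('Mike Ross', 'Mike '), "\n";
```

Accented characters outside the entity list in strip_accents() are left as they are, so they only match when the user types them.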

Trying to split a sentence into an array of words

I am trying to split a sentence into an array with words, one word as each element, in PHP if there is more than one word in the sentence. If there is only one word in the sentence, then I just print that one word.
My issue is when I split the sentence into words delimited by a space and put the contents into an array. I do this all using explode. But when I run through the array that explode apparently makes, it says there is nothing in the array when I try to print each item.
Here is my code:
if(isset($_GET['check'])){
$input = trim($_GET['check']);
$sentence='';
if(stripos($input, ' ')!==false){
$sentence = explode(' ', $input);
foreach($sentence as $item){
echo $item;
}
}
else{
echo $input;
}
}
Why is echo $item; printing nothing? Why isn't there anything in the array $sentence?
Your code seems to be working fine. Make sure you're getting the variable. You can do a print_r($_GET);
You can also just do this:
<?php
$_GET['check'] = 'Hey how are you?';
if (isset($_GET['check'])) {
// each word is now an element in the array
$arr = explode(' ', trim($_GET['check']));
}
// piece each word back together with a space
echo implode(' ', $arr);
?>
It doesn't matter if it's a single word or multiple words.
UPDATE: If you really want to check if the user has entered one word or more than one word you can do this:
<?php
$_GET['check'] = 'Hey how are you?';
if (str_word_count($_GET['check']) > 1) {
echo 'More than one word';
} else {
echo 'Only one word';
}
?>
Check out str_word_count
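One thing explode(' ', ...) does not handle is repeated spaces or tabs, which produce empty elements in the array. As a small sketch (the sample input is made up), preg_split() with the PREG_SPLIT_NO_EMPTY flag splits on any run of whitespace and skips the empty pieces:

```php
<?php
$input = "  Hey   how are\tyou? ";
// Split on one or more whitespace characters; drop empty results.
$words = preg_split('/\s+/', $input, -1, PREG_SPLIT_NO_EMPTY);
// More than one word: print them separated by " | "; otherwise print the input as-is.
echo count($words) > 1 ? implode(' | ', $words) : $input;
```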

Word counter: Doesn't seem to give the output I need (PHP)

Here's the code that I came up with:
function Count($text)
{
$WordCount = str_word_count($text);
$TextToArray = explode(" ", $text);
$TextToArray2 = explode(" ", $text);
for($i=0; $i<$WordCount; $i++)
{
$count = substr_count($TextToArray2[$i], $text);
}
echo "Number of {$TextToArray2[$i]} is {$count}";
}
So, what's going to happen here is that the user will enter a text, sentence or paragraph. By using substr_count, I would like to know the number of occurrences of each word inside the array. Unfortunately, the output is not what I really need. Any suggestions?
I assume that you want an array with the word frequencies.
First off, convert the string to lowercase and remove all punctuation from the text. This way you won't get entries for "But", "but", and "but," but rather just "but" with 3 or more uses.
Second, use str_word_count with a second argument of 2 as Mark Baker says to get a list of words in the text. This will probably be more efficient than my suggestion of preg_split.
Then walk the array and increment the value of the word by one.
foreach($words as $word)
$output[$word] = isset($output[$word]) ? $output[$word] + 1 : 1;
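The three steps above can be put together roughly as follows. This is a sketch under a few assumptions: the text is plain ASCII English, word_frequencies is a hypothetical helper name, and str_word_count is called with format 1 (a plain list of words) rather than 2, since the word positions are not needed here.

```php
<?php
// Hypothetical helper implementing the three steps described above.
function word_frequencies(string $text): array {
    // Step 1: lowercase, then replace anything that is not a letter,
    // an apostrophe or whitespace with a space.
    $clean = preg_replace("/[^a-z'\s]+/", ' ', strtolower($text));
    // Step 2: get the list of words.
    $words = str_word_count($clean, 1);
    // Step 3: walk the list and increment each word's counter.
    $output = array();
    foreach ($words as $word) {
        $output[$word] = isset($output[$word]) ? $output[$word] + 1 : 1;
    }
    return $output;
}

$counts = word_frequencies('But I tried, but nothing worked. But why?');
print_r($counts);
```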
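If I have understood your question correctly, this should also solve your problem:
function countWords($text) { // renamed: Count() would clash with PHP's built-in count()
    // get each distinct, non-empty, space-separated word
    $TextToArray = array_unique(array_filter(explode(" ", $text), 'strlen'));
    foreach ($TextToArray as $needle) {
        // occurrences of the word in the whole text (matches inside longer words are counted too)
        $count = substr_count($text, $needle);
        echo "$needle has occurred $count times in the text\n";
    }
}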
$WordCounts = array_count_values(str_word_count(strtolower($text),2));
var_dump($WordCounts);

preg_replace suddenly stops making distinctions

Confounded. I've been using the preg_match check below to distinguish between entire words and words that are parts of other words. It has suddenly stopped working in this script, and in every other script that depends on it.
The result is it finds parts of words, although you can see it is explicitly told to find only entire words.
$word = preg_replace("/[^a-zA-Z 0-9]+/", " ", $word);
if (preg_match('#\b'.$word.'\b#',$goodfile) && (trim($word) != "")) {
$fate = strpos($goodfile,$word);
print $word ." ";
print $fate ."<br>";
}
If you only want to read the first word of a line of a text file, like your title suggests, try another method:
// Get the file as an array, each element being a line
$lines = file("/path/to/file");
// Break up the first line by spaces
$words = explode(" ", $lines[0]);
// Get the first word
$firstWord = $words[0];
This would be faster and cleaner than explode, and no intermediate array of words is created (note that strstr returns false if the line contains no space):
$first_word = strstr($lines[0], ' ', true);
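Returning to the original question: one possible cause, assuming the reported position is the symptom, is that preg_match() with \b confirms a whole-word match somewhere in $goodfile, while strpos() then reports the first occurrence of the same characters, which may sit inside a longer word. The sketch below takes the offset from the regex match itself using PREG_OFFSET_CAPTURE. The helper name whole_word_pos and the sample text are made up for illustration.

```php
<?php
// Hypothetical helper: byte offset of the first whole-word match, or false.
function whole_word_pos(string $word, string $haystack) {
    // Same cleanup as in the question, plus trimming.
    $word = trim(preg_replace('/[^a-zA-Z 0-9]+/', ' ', $word));
    if ($word === '') {
        return false;
    }
    // preg_quote() keeps the word from being interpreted as regex syntax.
    $pattern = '#\b' . preg_quote($word, '#') . '\b#';
    if (preg_match($pattern, $haystack, $m, PREG_OFFSET_CAPTURE)) {
        return $m[0][1]; // offset of the match that satisfied \b...\b
    }
    return false;
}

$goodfile = 'category theory and a cat';
var_dump(strpos($goodfile, 'cat'));          // int(0): the "cat" inside "category"
var_dump(whole_word_pos('cat', $goodfile));  // int(22): the standalone word
```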

determine if a string contains one of a set of words in an array

I need a simple word filter that will kill a script if it detects a filtered word in a string.
say my words are as below
$showstopper = array('badword1', 'badword2', 'badword3', 'badword4');
$yourmouth = "im gonna badword3 you up";
if(something($yourmouth, $showstopper)){
//stop the show
}
You could implode the array of badwords into a regular expression, and see if it matches against the haystack. Or you could simply cycle through the array, and check each word individually.
From the comments:
$re = "/(" . implode("|", $showstopper) . ")/"; // '/(badword1|badword2)/'
if (preg_match($re, $yourmouth) > 0) { die("foulmouth"); }
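If the bad words might contain regex metacharacters, or if a match inside a longer word should not count, the pattern can be built a little more defensively. This is a sketch, with contains_bad_word as a hypothetical function name. It escapes each word with preg_quote() and wraps the alternation in \b word boundaries.

```php
<?php
// Hypothetical helper: true if $text contains any of $badWords as a whole word.
function contains_bad_word(string $text, array $badWords): bool {
    if (!$badWords) {
        return false; // an empty alternation would match everything
    }
    // Escape each word so characters like "." or "+" are matched literally.
    $escaped = array_map(function ($w) {
        return preg_quote($w, '/');
    }, $badWords);
    $re = '/\b(?:' . implode('|', $escaped) . ')\b/i';
    return preg_match($re, $text) === 1;
}

$showstopper = array('badword1', 'badword2', 'badword3', 'badword4');
var_dump(contains_bad_word('im gonna badword3 you up', $showstopper));     // bool(true)
var_dump(contains_bad_word('im gonna notbadword3x you up', $showstopper)); // bool(false)
```

Whether whole-word matching is wanted depends on the filter: without the \b anchors the second call would also be flagged.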
in_array() is your friend
$yourmouth_array = explode(' ',$yourmouth);
foreach($yourmouth_array as $key=>$w){
if (in_array($w,$showstopper)){
// stop the show, like, replace that element with '***'
$yourmouth_array[$key]= '***';
}
}
$yourmouth = implode(' ',$yourmouth_array);
You might want to benchmark this vs the foreach and preg_match approaches.
$showstopper = array('badword1', 'badword2', 'badword3', 'badword4');
$yourmouth = "im gonna badword3 you up";
$check = str_replace($showstopper, '****', $yourmouth, $count);
if($count > 0) {
//stop the show
}
A fast solution involves checking the key as this does not need to iterate over the array. It would require a modification of your bad words list, however.
$showstopper = array('badword1' => 1, 'badword2' => 1, 'badword3' => 1, 'badword4' => 1);
$yourmouth = "im gonna badword3 you up";
// split words on space
$words = explode(' ', $yourmouth);
foreach($words as $word) {
// filter extraneous characters out of the word
$word = preg_replace('/[^A-Za-z0-9]*/', '', $word);
// check for bad word match
if (isset($showstopper[$word])) {
die('game over');
}
}
The preg_replace ensures users don't abuse your filter by typing something like bad_word3. It also ensures the array key check doesn't bomb.
Not sure why you would need to do this, but here's a way to check for and get the bad words that were used:
$showstopper = array('badword1', 'badword2', 'badword3', 'badword4');
$yourmouth = "im gonna badword3 you up badword1";
function badWordCheck( $var ) {
    global $yourmouth;
    // strpos() returns 0 for a match at the very start, so compare against false explicitly
    return strpos($yourmouth, $var) !== false;
}
print_r(array_filter($showstopper, 'badWordCheck'));
array_filter() returns an array of the bad words that were found, so if its count() is 0, nothing bad was said.
