This PHP function retrieves a list of the most common words in a string, excluding a blacklist of words.
Array1: a,b,c
Although a default blacklist is useful, I needed to add words to the blacklist from a database.
Array2: d,e,f
I added a MySQL query that gets an additional list from a field in our services table.
I explode the words on \n into an array and merge the two arrays at the beginning of the function, so that the blacklist is now:
Array3: a,b,c,d,e,f
To test I used print_r to display the array and it does merge successfully.
The problem is this...
If I manually add d,e,f to the default array the script returns a clean list of words.
If I merge the two arrays into one, it's returning the list of words with the blacklist words still in it.
Why would the merged array be any different than just adding to the default array?
Here is the function
function extractCommonWords($string,$init_blacklist){
/// the default blacklist words
$stopWords = array('a','b','c');
/// select the additional blacklist words from the database
$gettingblack_sql = "SELECT g_serv_blacklist FROM services WHERE g_serv_id='".$init_blacklist."' LIMIT 1";
$gettingblack_result = mysql_query($gettingblack_sql) or die(mysql_error());
$gettingblack_row = mysql_fetch_array($gettingblack_result);
$removingblack_array = explode("\n", $gettingblack_row["g_serv_blacklist"]);
// this adds the d,e,f array from the database to the default a,b,c blacklist
$stopWords = array_merge($stopWords,$removingblack_array);
// replace whitespace
$string = preg_replace('/\s\s+/i', '', $string);
$string = trim($string);
// only take alphanumerical chars, but keep the spaces and dashes too
$string = preg_replace('/[^a-zA-Z0-9 -]/', '', $string);
// make it lowercase
$string = strtolower($string);
preg_match_all('/\b.*?\b/i', $string, $matchWords);
$matchWords = $matchWords[0];
foreach ($matchWords as $key => $item) {
if ($item == '' || in_array(strtolower($item), $stopWords) || strlen($item) <= 3){
unset($matchWords[$key]);}}
$wordCountArr = array();
if (is_array($matchWords)) {
foreach ($matchWords as $key => $val) {
$val = strtolower($val);
if (isset($wordCountArr[$val])) {
$wordCountArr[$val]++;
} else {
$wordCountArr[$val] = 1;
}
}
}
arsort($wordCountArr);
$wordCountArr = array_slice($wordCountArr, 0, 30);
return $wordCountArr;
}
/// end of function
/// posted string = a b c d e f g
$generate = $_POST["generate"];
/// the unique id of the row to retrieve additional blacklist keywords from
$generate_id = $_POST["generate_id"];
/// run the function by passing the text string and the id
$generate = extractCommonWords($generate, $generate_id);
/// update the database with the result
$update_data = "UPDATE services SET
g_serv_tags='".implode(',', array_keys($generate))."'
WHERE g_serv_acct='".$_SESSION["session_id"]."'
AND g_serv_id='".$generate_id."' LIMIT 1";
$update_result = mysql_query($update_data);
if(!$update_result){die('Invalid query:' . mysql_error());}
else{echo str_replace(",",", ",implode(',', array_keys($generate)));}
/// end of database update
If the extra blacklist in the database was populated in an admin panel from a Windows client, there is likely to be a stray \r at the end of each word. Thus, your list would be a,b,c,d\r,e\r,f\r.
Try replacing this line:
$removingblack_array = explode("\n", $gettingblack_row["g_serv_blacklist"]);
with this:
$removingblack_array = preg_split('/\r\n|\r|\n/', $gettingblack_row["g_serv_blacklist"]);
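A minimal sketch of the same idea, assuming the blacklist field holds one word per line with either Unix or Windows line endings. The helper name splitBlacklist is made up for illustration and is not part of the original function.

```php
<?php
// Hypothetical helper: split a multi-line blacklist field into clean words.
function splitBlacklist(string $field): array
{
    // Try \r\n first so a Windows line ending counts as a single separator.
    $lines = preg_split('/\r\n|\r|\n/', $field);
    // trim() also strips stray spaces and tabs around each word.
    $words = array_map('trim', $lines);
    // Drop empty entries left by blank lines, then reindex.
    return array_values(array_filter($words, 'strlen'));
}

$fromDb = "d\r\ne\r\nf\r\n"; // what a Windows client might have saved
$stopWords = array_merge(array('a', 'b', 'c'), splitBlacklist($fromDb));

var_dump(in_array('d', $stopWords));             // bool(true)
var_dump(in_array('d', explode("\n", $fromDb))); // bool(false): the entry is "d\r"
```

This would also explain why print_r seemed to show a correct merge: a trailing \r is invisible in the printed output, but in_array still sees "d\r" and "d" as different strings.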
I've researched all sorts of ways, but I haven't found a solution for this case.
Basically I have to see if the word repeats and just remove the first occurrence of it in the array. For example:
$array_words = ['harmony', 'Acrobat', 'harmony', 'harmony'];
How do I check the repeated word, just once, leaving the array like this:
$array_final = ['Acrobat', 'harmony', 'harmony'];
I threw together this simple loop, and explained it with comments
$array_words = ['harmony', 'Acrobat', 'harmony', 'harmony'];
//get a count of each word in the array
$counted_values = array_count_values($array_words);
//hold the words we have already checked
$checked_words = [];
//variable to hold our output after filtering
$output = [];
//loop over words in array
foreach($array_words as $word) {
//if word has not been checked, and appears more than once
if(!in_array($word, $checked_words) && $counted_values[$word] > 1) {
//add word to checked list, continue to next word in array
$checked_words[] = $word;
continue;
}
//add word to output
$output[] = $word;
}
The resulting $output value:
Array
(
[0] => Acrobat
[1] => harmony
[2] => harmony
)
GrumpyCrouton's solution is probably neater, but here's another way. Basically you put all the values into a single string, and then use string functions to do the work.
Code is commented with explanatory notes:
<?php
$array_words = ['harmony', 'Acrobat', 'harmony', 'harmony'];
$array_words_unique = array_unique($array_words); //get a list of unique words from the original array
$array_str = implode(",", $array_words);
foreach ($array_words_unique as $word) {
//count how many times the word occurs
$count = substr_count($array_str, $word);
//if it occurs more than once, remove the first occurrence
if ($count > 1) {
//find the first position of the word in the string, then replace that with nothing
$pos = strpos($array_str, $word);
$array_str = substr_replace($array_str, "", $pos, strlen($word));
}
}
//convert back to an array, and filter any blank entries caused by commas with nothing between them
$array_final = array_filter(explode(",", $array_str));
var_dump($array_final);
Demo: https://3v4l.org/i1WKI
Credit to the question "Using str_replace so that it only acts on the first match?" for the code that replaces only the first occurrence of a string inside another string.
We can use a second array to keep track of each item that has already been removed once. The loop uses array_shift to take items off the front of the array and pushes back the ones to keep; the count is fixed up front so the loop cannot overrun.
<?php
$record = ['harmony','harmony', 'Acrobat', 'harmony', 'harmony','last'];
for($i=0,$count=count($record),$stack=array();$i<$count;$i++){
$item = array_shift($record);
in_array($item,$record) && !in_array($item,$stack)
? array_push($stack,$item)
: array_push($record,$item);
}
var_dump($record);
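For comparison, here is a shorter sketch of the same task built on array_count_values and array_search. It assumes every element is a string, and the function name removeFirstOfRepeated is made up for this example.

```php
<?php
// Remove the first occurrence of every word that appears more than once.
function removeFirstOfRepeated(array $words): array
{
    foreach (array_count_values($words) as $word => $count) {
        if ($count > 1) {
            // array_count_values() turns numeric strings into integer keys,
            // so cast back to string before the strict search.
            $firstKey = array_search((string) $word, $words, true);
            unset($words[$firstKey]);
        }
    }
    // Reindex so the keys run from 0 again.
    return array_values($words);
}

print_r(removeFirstOfRepeated(['harmony', 'Acrobat', 'harmony', 'harmony']));
// Acrobat, harmony, harmony
```

Because it works on array keys rather than on a joined string, a word that happens to be a substring of another word is not affected.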
I have this text:
$a = "I wanna eat apple , and banana .";
I want to get every word and punctuation mark of that sentence:
$b = explode(' ', strtolower(trim($a)));
The result of explode is an array.
I have a words table in the database with the fields id, word and typewords, all in lowercase. Punctuation marks do not exist in the table.
I want to look up every word in the database to get its word type, so the final result I want is:
words/typeofwords = I/n wanna/v eat/v apple/n ,/, and/p banana/n ./.
here's the code :
function getWord ($word){
$i = 0 ;
$query = mysql_query("SELECT typewords FROM words WHERE word = '$word' ");
while ($row = mysql_fetch_array($query)) {
$word[$i] = $row['typewords'];
$i++;
}
return $word;
}
echo $b.'/'.getWord($b);
But it doesn't work. Please help me, thanks!
Try with this:
function getWord($words){
$association = array();
foreach($words as $word)
{
$query = mysql_query("SELECT typewords FROM words WHERE word = '$word' ");
if($row = mysql_fetch_array($query))
$association[$word] = $row['typewords'];
elseif(preg_match('/[\.\,\:\;\?\!]/',$word)==1)
$association[$word] = $word;
}
return $association;
}
$typewords = getWord($b);
foreach($b as $w)
echo $w.'/'.$typewords[$w];
function getWord($word){
// concat the word your searching for with the result
// http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_concat
$query = mysql_query("SELECT CONCAT(word,'/',typewords) as this_result FROM words WHERE word = '$word' ");
$row = mysql_fetch_array($query);
return $row['this_result']." "; // added a space at the end.
}
// loop through the $b array and send each to the function
foreach($b as $searchword) {
echo getWord($searchword);
}
You assume the function parameter to be an array, but it is a string.
Edit: In your function you treat $word as an array as well as a string. Decide what you want and recode your function.
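The approaches above can be tried out without a database. In this sketch the $types array stands in for the words table (an assumption made only to keep the example self-contained), tagTokens is a made-up name, and unknown words get a "?" tag.

```php
<?php
// Tag each token with its word type; punctuation is tagged with itself.
function tagTokens(array $tokens, array $types): string
{
    $tagged = array();
    foreach ($tokens as $token) {
        if (isset($types[$token])) {
            $tagged[] = $token . '/' . $types[$token];
        } elseif (preg_match('/^[.,:;?!]+$/', $token)) {
            $tagged[] = $token . '/' . $token;
        } else {
            $tagged[] = $token . '/?'; // word not found in the lookup table
        }
    }
    return implode(' ', $tagged);
}

// Stand-in for the rows of the words table (word => typewords).
$types = array('i' => 'n', 'wanna' => 'v', 'eat' => 'v',
               'apple' => 'n', 'and' => 'p', 'banana' => 'n');

$b = explode(' ', strtolower(trim("I wanna eat apple , and banana .")));
echo tagTokens($b, $types);
// i/n wanna/v eat/v apple/n ,/, and/p banana/n ./.
```

The function takes the whole token array and returns one string, which avoids the array/string mix-up in the original getWord.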
I have a string that looks like this
$storelist = "‘F Mart (6)’, ‘ACME (5)’, 'J/M Store (17)'";
I want to break out selected companies and the number of locations by comparing the first string to a second string like
$selectedstores = "‘F Mart’, 'J/M Store'";
And output a string like
$selectedwithnumber = "‘F Mart (6)’, 'J/M Store (17)'";
There could be 1 to 15 companies in a string and the location number varies, but the apostrophes and parentheses are standard. I hope there is an easy way to do this, as I have no idea where to start. Thanks in advance.
You can use the explode function to split the strings into parts, and the preg_replace function to remove the location count (with its brackets) from each part of the first string. Below is a working example:
$storelist = "‘F Mart (6)’, ‘ACME (5)’, 'J/M Store (17)'";
$selectedstores = "‘F Mart’, 'J/M Store'";
//split second array
$selectedArray = explode(', ', $selectedstores);
$resultArray = array();
//split first array
foreach(explode(', ', $storelist) as $storeWithNumber) {
//remove " (number)" from each part
$store = preg_replace('/\s+\(\d+\)/', '', $storeWithNumber);
//check if current part is on selected list
if (in_array($store, $selectedArray)) {
$resultArray[] = $storeWithNumber;
}
}
$selectedwithnumber = implode(', ', $resultArray);
echo $selectedwithnumber.PHP_EOL;
result is:
‘F Mart (6)’, 'J/M Store (17)'
This will get what you need based on your description. It breaks up your strings into arrays and then uses a nested foreach loop to do the comparisons. I used string functions over regular expression functions in case speed becomes an issue. It does however require that your main string of stores follows the conventions you described.
<?php
$storelist = "'F Mart (6)', 'ACME (5)', 'J/M Store (17)'";
$selectedstores = "'F Mart', 'J/M Store'";
$stores = explode(",", $storelist);
$selected = explode(",", $selectedstores);
$newStoreList = array();
foreach($selected as $selectedStore) {
foreach($stores as $store) {
$s = trim( $selectedStore, "' ");
if(strstr($store, $s)) {
$newStoreList[] = $store;
}
}
}
$newStoreList = implode(",", $newStoreList);
echo $newStoreList;
?>
This will output: 'F Mart (6)', 'J/M Store (17)'
Hope that helps!
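A third variation, sketched under the assumption that store names never contain commas or quote characters. It strips both straight and curly quotes before comparing and matches names exactly, so a selection of 'Mart' would not also pick up 'F Mart'. The function names stripQuotes and selectStores are made up for this example.

```php
<?php
// Strip straight and curly single quotes plus surrounding whitespace.
function stripQuotes(string $s): string
{
    return trim(str_replace(array("'", "‘", "’"), '', $s));
}

// Return the selected stores with their location counts, in list order.
function selectStores(string $storeList, string $selected): string
{
    $wanted = array_map('stripQuotes', explode(',', $selected));
    $result = array();
    foreach (explode(',', $storeList) as $entry) {
        $entry = stripQuotes($entry);
        // Separate "F Mart (6)" into the name and the location count.
        if (preg_match('/^(.+?)\s*\((\d+)\)$/', $entry, $m)
            && in_array($m[1], $wanted, true)) {
            $result[] = "'" . $m[1] . ' (' . $m[2] . ")'";
        }
    }
    return implode(', ', $result);
}

echo selectStores("‘F Mart (6)’, ‘ACME (5)’, 'J/M Store (17)'", "‘F Mart’, 'J/M Store'");
// 'F Mart (6)', 'J/M Store (17)'
```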
I have a small personal project I am trying to complete. I need to take a string of characters and try to "create" words from variations of said string; checking against a text file with a list of known words (words are separated by new lines).
In summary:
user provides string $chars_provided (e.g. "jdlwhfushfmgh"),
$chars_provided is then exploded
exploded $chars_provided are randomly arranged in an attempt to create words from said string
created words are then checked/verified against the dictionary text file to ensure they exist
results are displayed by the character count of the created words, with a limit of 100 words.
I have the concept in my head just not sure how it should be done, I'm just looking for someone who can explain the process to me.
<?php
// list of words, one per line
$dictionary = file_get_contents('dictionary.txt');
// provided characters, in the end by user
$chars_provided = "a,t,w,q,u,i,f,d,s,b,v,x,o";
// count the total # of characters
$chars_count = strlen($chars_provided);
// display given information so far
echo "The letters '$chars_provided' were entered, totaling $chars_count letters.";
// explode the characters by using the comma designator
$break_chars = explode(",", $chars_provided);
foreach ($break_chars as $letter) {
echo "$letter[0]";
}
This is easier if you get the letter counts for each word in the dictionary, hold onto it, and then match against the user input character counts.
For example, with 'aaab', any word with at most 3 'a's, at most 1 'b', and no other characters will match.
//// 1. Grab letter counts for your user input.
$user_input_chars = 'abcdefg'; // for example
$user_in_letter_counts = get_letter_counts($user_input_chars);
// $letters[$char][$num] will contain all words that have exactly $num number of $char characters
$letters = array('a' => array(), 'b' => array(), /* ...,*/ 'z' => array());
//// 2. Generate list of words with at least $number_of quantity of $letter characters
// (only have to be done once for any amount of user input if you keep this in memory)
foreach ($words as $word){
// get letter counts for each type of character for this word
$letter_counts = get_letter_counts($word);
// store in array of letters and count
foreach($letter_counts as $letter => $number_of){
// we already have a word that had $number_of $letter characters; add word to existing array
if (isset($letters[$letter][$number_of])){
$letters[$letter][$number_of][] = $word;
} // make array to record that this word has $number_of $letter characters
else {
$letters[$letter][$number_of] = array($word);
}
$number_of--;
}
}
//// 3. Find matching words.
$potential_words = array();
foreach ($letters as $letter => $arr){
foreach($arr as $num => $words){
// if this array has less than or equal to the number of $letter characters that the user input has,
// add the words to the potential match list for that character
if ($num <= $user_in_letter_counts[$letter]){
$potential_words[$letter] = array_merge(isset($potential_words[$letter]) ? $potential_words[$letter] : array(), $words);
}
}
}
// the words must have met the requirement for each character, so only grab words that satisfy all conditions
$all_matching_words = array_intersect($potential_words['a'], $potential_words['b'], /* ..., */ $potential_words['z']);
// (It should be trivial to just grab 100 of these.)
function get_letter_counts($word){
$result = array();
$result['a'] = substr_count($word, 'a');
$result['b'] = substr_count($word, 'b');
// ...
$result['z'] = substr_count($word, 'z');
return $result;
}
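The letter-count idea can also be sketched more compactly with count_chars, checking one dictionary word at a time instead of building a per-letter index. This is a simplified variant rather than the indexed approach above; it assumes PHP 7.4+ (arrow functions), single-byte characters, and a dictionary already loaded into an array. The names canBuildWord and findWords are made up for this example.

```php
<?php
// True if $word can be spelled using the letters in $letters,
// using each provided letter at most once.
function canBuildWord(string $word, string $letters): bool
{
    $need = count_chars($word, 1);    // byte value => count, only bytes that occur
    $have = count_chars($letters, 1);
    foreach ($need as $byte => $count) {
        if (($have[$byte] ?? 0) < $count) {
            return false;
        }
    }
    return true;
}

// Return up to $limit matching words, longest first (ties alphabetical).
function findWords(array $dictionary, string $letters, int $limit = 100): array
{
    $matches = array_filter(
        $dictionary,
        fn($w) => $w !== '' && canBuildWord($w, $letters)
    );
    usort($matches, fn($a, $b) => strlen($b) <=> strlen($a) ?: strcmp($a, $b));
    return array_slice($matches, 0, $limit);
}

print_r(findWords(array('tab', 'bat', 'at', 'zoo', 'a'), 'tab'));
// bat, tab, at, a
```

For a real word list, something like file('dictionary.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) would produce the $dictionary array, assuming one word per line.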
Hope you can use this.
$file = file_get_contents("dictionary.txt");
$SearchString = "jdlwhfushfmgh/maybeasencondword";
$breakstrings = explode('/',$SearchString);
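foreach ($breakstrings as $values)
{
// strpos() returns 0 for a match at the very start, so compare against false explicitly
if(strpos($file, $values) === false)
{
echo $values." string not found!\n";
}
else
{
echo $values." string Found!\n";
}
}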
I hope you can help me.
I have a string like the following
Luke 1:26-38
And I would like to be able to break it up into tokens or individual variables so that I can use the variables in an SQL query.
I've tried using explode, however I've only been able to make it explode on one character such as : or -
My string has : and - and also a space between the name and the first number.
My goal is to have:
$name = 'Luke';
$book = 1;
$from = 26;
$to = 38;
Is anyone able to help please.
Many thanks
You can do that with a simple string scanning (Demo):
$r = sscanf("Luke 1:26-38", "%s %d:%d-%d", $name, $book, $from, $to);
The variables then contain the information. %s represents a string (without spaces), %d a decimal integer. See sscanf.
To make this "bible safe", it needs some additional modifications:
$r = sscanf($string, "%[ a-zA-Z] %d:%d-%d", $name, $book, $from, $to);
$name = trim($name);
(Second demo).
list( $name, $book, $from, $to ) = preg_split( '/[ :-]/', 'Luke 1:26-38' );
echo $name; //"Luke"
/* Split results in an Array
(
[0] => Luke
[1] => 1
[2] => 26
[3] => 38
)
*/
$string = "Luke 1:26-38";
preg_match('#^(\w+)\s(\d+):(\d+)-(\d+)$#', $string, $result);
print_r($result);
A regex is hard to configure for this because of the many forms that Bible book names, chapter numbers and verse numbers can take: some books begin with a number, and some book names contain several spaces.
I came up with this for building an SQL query. It works for these passage search types:
(John), (John 3), (Joh 3:16), (1 Thes 1:1)
Book names can be 3-letter abbreviations.
It also supports unlimited individual word searches and exact-phrase search.
$string = $_GET['sstring'];
$type = $_GET['stype'];
switch ($type){
case "passage":
$book = "";
$chap = "";
$stringarray = explode(':', $string); // Split string at verse reference(s), if present.
$vref = isset($stringarray[1]) ? $stringarray[1] : ""; // Verse part is empty when there is no colon.
$vrefsplit = explode('-', $vref); // Split verse reference range, if present.
$minv = $vrefsplit[0];
$maxv = isset($vrefsplit[1]) ? $vrefsplit[1] : ""; // Assign min/max verses.
$bc = explode(" ", $stringarray[0]); // Split book string into array with space as delimiter.
if(is_numeric($bc[count($bc)-1])){ // If last book array element is numeric?
$chap = array_pop($bc); // Remove it from array and assign it to chapter.
$book = implode(" ", $bc); // Put remaining elements back into string and assign to book.
}else{
$book = implode(" ", $bc); // Else book array is just book, convert back to string.
}
// Build the sql query.
$query_rs1 = "SELECT * FROM kjvbible WHERE bookname LIKE '$book%'";
if($chap != ""){
$query_rs1.= " AND chapternum='$chap'";
}
if($maxv != ""){
$query_rs1.= " AND versenum BETWEEN '$minv' AND '$maxv'";
}else if($minv != ""){
$query_rs1.= " AND versenum='$minv'";
}
break;
case "words":
$stringarray = explode(" ", $string); // Split string into array.
// Build the sql query.
$query_rs1 = "SELECT * FROM kjvbible WHERE versetext REGEXP '[[:<:]]". $stringarray[0] ."[[:>:]]'";
if(count($stringarray)>1){
for($i=1;$i<count($stringarray);$i++){
$query_rs1.= " AND versetext REGEXP '[[:<:]]". $stringarray[$i] ."[[:>:]]'";
}
}
break;
case "xphrase":
// Build the sql query.
$query_rs1 = "SELECT * FROM kjvbible WHERE versetext REGEXP '[[:<:]]". $string ."[[:>:]]'";
break;
default :
break;
}
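The passage formats listed above, (John), (John 3), (Joh 3:16) and (1 Thes 1:1), can also be covered by a single anchored pattern, at least as a rough sketch. It assumes book names consist of ASCII letters with an optional leading digit; the function name parseReference and the array keys are made up for this example.

```php
<?php
// Parse "Book [chapter[:from[-to]]]" into its parts, or return null.
function parseReference(string $ref): ?array
{
    $pattern = '/^\s*((?:\d\s*)?[A-Za-z]+(?:\s+[A-Za-z]+)*)' // book, e.g. "1 Thes"
             . '(?:\s+(\d+)(?::(\d+)(?:-(\d+))?)?)?\s*$/';   // chapter[:from[-to]]
    if (!preg_match($pattern, $ref, $m)) {
        return null;
    }
    // Unmatched optional groups are either missing or empty strings.
    $num = function ($i) use ($m) {
        return isset($m[$i]) && $m[$i] !== '' ? (int) $m[$i] : null;
    };
    return array(
        'book'    => $m[1],
        'chapter' => $num(2),
        'from'    => $num(3),
        'to'      => $num(4),
    );
}

print_r(parseReference('Luke 1:26-38'));
// book => Luke, chapter => 1, from => 26, to => 38
```

Since the numeric parts come back as integers or null, they are straightforward to bind as parameters in a prepared statement rather than concatenating them into the SQL string.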