I'm struggling with searching an array, for example:
string1; string2; string3;
string4; string5; string6;
If I use preg_match, I can search the array and it will return a result if the search pattern is exactly the same as an item in the array e.g. if the search term is “string1”.
My question is, is there a way to return a positive result if the search string doesn’t have an exact match, e.g. if the search term is “my string” it would return all 6 as suggested results.
Thanks!
You can explode the list on ;, loop over the items, and use the similar_text() function to check how similar the keyword is to each string, then decide whether you want it or not.
$percent = 0;
similar_text($keyword, $string, $percent);
if ($percent > 85) {
    // match
}
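Putting the explode, loop and similar_text() steps together, a minimal sketch could look like the following. The `$raw` list, the `$keyword` and the 70% `$threshold` are assumptions chosen for illustration (with the 85% cut-off above, "my string" would not be close enough to "string1"), so tune the threshold against your own data.

```php
<?php
// Assumed input: the six items from the question, separated by semicolons.
$raw = "string1; string2; string3; string4; string5; string6;";
$keyword = "my string";
$threshold = 70.0; // assumed cut-off, adjust to taste

// Split on ";", trim whitespace, and drop the empty item after the last ";".
$items = array_filter(array_map('trim', explode(';', $raw)), 'strlen');

$suggestions = [];
foreach ($items as $item) {
    // similar_text() writes the similarity percentage into $percent.
    similar_text($keyword, $item, $percent);
    if ($percent >= $threshold) {
        $suggestions[] = $item;
    }
}

print_r($suggestions);
```

Because similar_text() is called once per item, this is fine for short lists but gets slow for very large ones.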
I'm working on a search/advertising system that matches a given ad group with keywords. $query is the search string. I'm looking for the best and most efficient way to enhance the simple 'contains' script below so that it searches the query for keyword matches on an AND (&&) basis after exploding a keyword list. With such a function one could build either a chain of 'IF' statements or a 'CASE'. Below is the pseudocode:
$query = "apple berry tomato potato";
if (contains($query, "tomato")) { }
elseif (contains($query, "potato,berry")) { }
elseif (contains($query, "apple,berry")) { }
else { /* none of the above matched */ }
The contains function would use strpos, but it would also use some combination of explode to separate keywords that are delimited by commas. So "apple,berry" would be a string that contains a list of keywords separated by commas.
What would be the best way to write a contains script that searches through the query string and matches against the comma-separated values in the second parameter? Love your ideas. Thanks.
Here is the classic simple 'contains' function, but it doesn't handle the comma-separated AND explosion; it only works with single words or phrases:
function contains($haystack, $needle)
{
    return strpos($haystack, $needle) !== false;
}
Note: the enhanced contains function should match on an AND basis. If commas exist in $needle, the query needs to include all of the keywords to count as a match. The simple contains script is explained in the post "Check if String contains a given word". What I'm looking for is an expanded function by the same name that also searches for multiple keywords, not just a single word.
The $query string will always be space delimited.
The $needle string will normally be comma delimited, but it could also be delimited by spaces.
The main thing is that the function works regardless of word order.
Suppose the $query = 'business plan template'
or $query = 'templates for business plan'
if you ran contains ($query,"business plan")
or contains ($query,"business,plan") both tests would show a match. The sequence of the words should not matter.
Here's a simple way. Just compare the count of $needle with the count of $needle(s) that are in $haystack using array_intersect():
function contains($haystack, $needle) {
    $h = explode(' ', $haystack);
    $n = explode(',', $needle);
    // Intersect from the needle side so repeated words in the haystack
    // cannot inflate the count.
    return count(array_intersect($n, $h)) == count($n);
}
You could optionally pass $needle in as an array, and then there is no need for that explode().
If you need it case-insensitive:
$h = explode(' ', strtolower($haystack));
$n = explode(',', strtolower($needle));
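Since the question also allows $needle to be space delimited, one possible variation splits both strings on commas or whitespace and checks that no needle word is missing. This is only a sketch: the name `containsAll` is hypothetical (the question asks for `contains`), and returning false for an empty needle is an assumption.

```php
<?php
// Hypothetical helper: true when every keyword in $needle occurs as a whole
// word in $haystack, regardless of order or letter case.
function containsAll(string $haystack, string $needle): bool
{
    // Split on any run of commas and/or whitespace, skipping empty pieces.
    $haystackWords = preg_split('/[\s,]+/', strtolower($haystack), -1, PREG_SPLIT_NO_EMPTY);
    $needleWords = preg_split('/[\s,]+/', strtolower($needle), -1, PREG_SPLIT_NO_EMPTY);

    if ($needleWords === []) {
        return false; // assumption: an empty needle never matches
    }

    // array_diff() returns the needle words that are absent from the haystack.
    return array_diff($needleWords, $haystackWords) === [];
}

var_dump(containsAll('business plan template', 'business,plan'));
var_dump(containsAll('templates for business plan', 'business plan'));
var_dump(containsAll('apple berry tomato potato', 'apple,cherry'));
```

Note that this compares whole words, so "templates" in the query would not satisfy a "template" keyword.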
I have a string like the one below
20Nov 18:14:xxxxxxxxxx has given 10 points to xxxxx. New bitcoin collection Balance:XXXXXXXX. Ref:675743957424
I will explode it and it will then be turned into an array.
But I want to check if the array has Ref:675743957424 and then place it inside a variable like for example $a.
I want to do this since the string might change from one point to another so the position of Ref is not fixed.
How can I do this?
Thanks.
Edited
I tried not exploding it and grabbing the data directly instead; see the code below.
<?php
$line = "20Nov 18:14:xxxxxxxxxx has given 10 points to xxxxx. New bitcoin collection Balance:XXXXXXXX. Ref:675743957424";
// perform a case-insensitive search for the word "Ref"
if (preg_match("/\bRef\b/i", $line, $match)) :
    print "Match found!";
    // how can I grab the Ref part?
endif;
?>
You have to use:
preg_match ('/Ref:[\d]*/', $line, $matches);
The matches will be saved to the variable $matches, and you can then work with them.
As for the regular expression, you just need to look for the string Ref: followed by any number of digits (\d matches any digit and * matches zero or more occurrences of the preceding element, digits in this case).
If you know the exact number of digits you need to find and it does not vary, you could use the {NUMBER} quantifier, like:
preg_match ('/Ref:[\d]{12}/', $line, $matches);
In this case, you are looking for 12 digits after Ref:.
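To get the matched text into a variable such as $a, a small sketch along these lines may help. It uses + instead of * so that at least one digit is required, and adds a capture group for the number alone; the `$refNumber` name is an assumption for illustration.

```php
<?php
$line = "20Nov 18:14:xxxxxxxxxx has given 10 points to xxxxx. New bitcoin collection Balance:XXXXXXXX. Ref:675743957424";

$a = null;
$refNumber = null;

// (\d+) captures one or more digits that follow "Ref:".
if (preg_match('/Ref:(\d+)/', $line, $matches)) {
    $a = $matches[0];         // the whole match, e.g. "Ref:675743957424"
    $refNumber = $matches[1]; // only the captured digits
}

var_dump($a, $refNumber);
```

Because the pattern is not tied to a position, it still works when the Ref part moves around within the line.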
You can use strpos() to check whether the substring is present in the string. If it is, you can assign it to your variable. Please see the code below; it may help you.
$line = "20Nov 18:14:xxxxxxxxxx has given 10 points to xxxxx. New bitcoin collection Balance:XXXXXXXX. Ref:675743957424";
$string_to_check = 'Ref:675743957424';
if (strpos($line, $string_to_check) !== false) { // Ref is present
    $a = $string_to_check;
}
I am new to php and trying out some different things.
I have a problem with printing one random value from a string that contains multiple values.
$list = "the weather is beautiful tonight";
$random = one random value from $list, for example "beautiful" or "is"
Is there any simple way to get this done?
Thanks!
Well, as @Dagon suggested, you can use explode() to get an array of strings, then use rand($min, $max) to get an integer between 0 and the length of your array minus 1, and then read the string at that randomly generated position in the array.
// First, split the string on spaces to get individual words
$arg = explode(' ',"the weather is beautiful tonight");
// Shuffle the order of the words
shuffle($arg);
// Display the first word of the shuffled array
print $arg[0];
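If you would rather not reorder the array, the index-based approach described above can be sketched with array_rand(), which returns one random key of the array. The variable names follow the question; `$words` is an assumed helper name.

```php
<?php
$list = "the weather is beautiful tonight";

// Split the sentence into words.
$words = explode(' ', $list);

// array_rand() picks one random key; use it to read the word.
$random = $words[array_rand($words)];

echo $random, PHP_EOL;
```

The equivalent with rand() would be `$words[rand(0, count($words) - 1)]`.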
I am using partial matching in PHP, but the problem is that a huge list of matches comes back every time. We want to limit it so that a partial match is only shown when at least 40% matches (that is, 4 characters out of 10).
Try something like this:
function fuzzyMatch ($source, $term, $percentRequired){
    $matches = array_filter($source, function($test) use ($term, $percentRequired){
        $matchPer = null;
        similar_text($term, $test, $matchPer);
        return $matchPer >= $percentRequired;
    });
    return $matches;
}
This will take an array of terms, the term you want to match against, and the percentage required for a match, and return an array of the matching values.
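If the filtered list is still too long, one option is to rank the survivors by similarity and keep only the best few. The sketch below is a variation on the idea above, not the answer's own code: the function name `rankedMatches` and the `$limit` parameter are assumptions, and candidates are used as array keys, so duplicate candidates collapse into one entry.

```php
<?php
// Hypothetical helper: keep candidates at or above $percentRequired,
// best match first, and return at most $limit of them.
function rankedMatches(array $source, string $term, float $percentRequired, int $limit = 10): array
{
    $scores = [];
    foreach ($source as $candidate) {
        similar_text($term, $candidate, $percent);
        if ($percent >= $percentRequired) {
            $scores[$candidate] = $percent;
        }
    }

    // Sort by score, highest first, keeping the candidate keys.
    arsort($scores);

    return array_slice(array_keys($scores), 0, $limit);
}

print_r(rankedMatches(['banana', 'string12', 'string1'], 'string', 40, 2));
```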
I'm looking for a routine or an approach for error-tolerant string comparison.
Let's say, we have test string Čakánka - yes, it contains CE characters.
Now, I want to accept any of following strings as OK:
cakanka
cákanká
ČaKaNKA
CAKANKA
CAAKNKA
CKAANKA
cakakNa
The problem is that I often switch letters in a word, and I want to minimize the user's frustration at not being able to type a word correctly (e.g. when in a rush).
So, I know how to make a case-insensitive comparison (just make it lowercase :]) and I can strip the CE characters; I just can't wrap my head around tolerating a few switched characters.
Also, you often put a character not only in the neighbouring wrong place (character=>cahracter), but sometimes shift it by multiple places (character=>carahcter), just because one finger was lazy while typing.
Thank you :]
Not sure (especially about the accents / special characters stuff, which you might have to deal with first), but for characters that are in the wrong place or missing, the levenshtein function, that calculates Levenshtein distance between two strings, might help you (quoting) :
int levenshtein ( string $str1 , string $str2 )
int levenshtein ( string $str1 , string $str2 , int $cost_ins , int $cost_rep , int $cost_del )
The Levenshtein distance is defined as
the minimal number of characters you
have to replace, insert or delete to
transform str1 into str2
Other possibly useful functions could be soundex, similar_text, or metaphone.
And some of the user notes on the manual pages of those functions, especially the manual page of levenshtein might bring you some useful stuff too ;-)
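As a rough illustration of the levenshtein() suggestion, the sketch below accepts an input when its edit distance to the target is small. The helper name `isCloseEnough` and the default limit of 2 are assumptions; the inputs are assumed to be already stripped of accents, since levenshtein() works on bytes. A swap of two neighbouring letters counts as two edits, which is why a limit of 2 is used here.

```php
<?php
// Hypothetical helper: case-insensitive comparison that tolerates up to
// $maxDistance single-character edits (insert, replace, delete).
function isCloseEnough(string $input, string $target, int $maxDistance = 2): bool
{
    return levenshtein(strtolower($input), strtolower($target)) <= $maxDistance;
}

$target = 'cakanka'; // assumed accent-free form of the test string
foreach (['CAKANKA', 'CAAKNKA', 'CKAANKA', 'cakakNa', 'banana'] as $input) {
    echo $input, ': ', isCloseEnough($input, $target) ? 'ok' : 'rejected', PHP_EOL;
}
```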
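You could transliterate the words to Latin characters and use a phonetic algorithm like Soundex to get the essence of your word and compare it to the ones you have. In your case that would be C252 for most of your words (for example cakanka and CAAKNKA); variants whose consonants end up in a different order, such as CKAANKA or cakakNa, produce a different code and would not match on Soundex alone.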
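To check which spellings actually share a Soundex code, a quick sketch like the following can be used. It sticks to ASCII spellings from the question so the result does not depend on how iconv transliterates accented letters on a given system; the `$sameCode` array is an assumed name.

```php
<?php
$target = 'cakanka';
$variants = ['CAKANKA', 'CAAKNKA', 'CKAANKA', 'cakakNa'];

// Collect the variants whose Soundex code equals the target's code.
$sameCode = [];
foreach ($variants as $variant) {
    if (soundex($variant) === soundex($target)) {
        $sameCode[] = $variant;
    }
}

echo soundex($target), PHP_EOL;
print_r($sameCode);
```

Variants that fall outside this set would need a second pass with a comparative function such as levenshtein() or similar_text().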
Edit: The problem with comparative functions like levenshtein or similar_text is that you need to call them for each pair of input value and possible matching value. That means if you have a database with 1 million entries, you will need to call these functions 1 million times.
But functions like soundex or metaphone, which calculate some kind of digest, can help to reduce the number of actual comparisons. If you store the soundex or metaphone value for each known word in your database, you can narrow down the possible matches very quickly. Once the set of possible matching values has been reduced, you can use the comparative functions to get the best match.
Here’s an example:
// building the index that represents your database
$knownWords = array('Čakánka', 'Cakaka');
$index = array();
foreach ($knownWords as $key => $word) {
    $code = soundex(iconv('utf-8', 'us-ascii//TRANSLIT', $word));
    if (!isset($index[$code])) {
        $index[$code] = array();
    }
    $index[$code][] = $key;
}
// test words
$testWords = array('cakanka', 'cákanká', 'ČaKaNKA', 'CAKANKA', 'CAAKNKA', 'CKAANKA', 'cakakNa');
echo '<ul>';
foreach ($testWords as $word) {
    $code = soundex(iconv('utf-8', 'us-ascii//TRANSLIT', $word));
    if (isset($index[$code])) {
        echo '<li> '.$word.' is similar to: ';
        $matches = array();
        foreach ($index[$code] as $key) {
            similar_text(strtolower($word), strtolower($knownWords[$key]), $percentage);
            $matches[$knownWords[$key]] = $percentage;
        }
        arsort($matches);
        echo '<ul>';
        foreach ($matches as $match => $percentage) {
            echo '<li>'.$match.' ('.$percentage.'%)</li>';
        }
        echo '</ul></li>';
    } else {
        echo '<li>no match found for '.$word.'</li>';
    }
}
echo '</ul>';
Spelling checkers do something like fuzzy string comparison. Perhaps you can adapt an algorithm based on that reference. Or grab the spell checker guessing code from an open source project like Firefox.