Matching best similar array element - php

I have an array of keywords. I run a foreach loop over it and match each element against a specific search term. For example, I have an array like:
Array(
[0] => polka dresses
[1] => polka clothes
[2] => polka dots dress
[3] => polka dots bottoms
)
and I search for the term polka in my array. It returns results when I use strpos or stristr (I also tried similar_text, but got no results).
Issue
If I search for polka it works, but if I accidentally type p0lka it does not return any results.
Is there any way to achieve this?

If you want the results most similar to a typed word, you can calculate the Levenshtein distance between the searched word and each stored word and return the results with the smallest distance.
You can use PHP's levenshtein function for this.
PHP Snippet:
<?php
$data = array(
'polka dresses',
'polka clothes',
'polka dots dress',
'polka dots bottoms',
'dummy dummy'
);
function getSimilarMatches($sentences,$search_str){
$min_distance = -1;
$closest_matches = [];
foreach($sentences as $sentence){
$min_levenshtein_dist = -1;
foreach(explode(" ",$sentence) as $word){
$levenshtein_dist = levenshtein($word,$search_str);
if($min_levenshtein_dist == -1 || $min_levenshtein_dist > $levenshtein_dist){
$min_levenshtein_dist = $levenshtein_dist;
}
}
if($min_distance == -1 || $min_distance > $min_levenshtein_dist){
$min_distance = $min_levenshtein_dist;
$closest_matches = [];
$closest_matches[] = $sentence;
}else if($min_distance === $min_levenshtein_dist){
$closest_matches[] = $sentence;
}
}
return $closest_matches;
}
print_r(getSimilarMatches($data,'polka'));
print_r(getSimilarMatches($data,'p0lka'));
Demo: https://3v4l.org/E9gea
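One thing to keep in mind with the snippet above: it always returns the closest sentences, even when the closest one is still far from the search term. A possible variation is to keep only sentences that contain a word within a fixed number of edits. This is just a sketch; the function name getMatchesWithin and the default threshold of 1 are my own assumptions, not part of the original answer.

```php
<?php
// Hypothetical helper: keep every sentence that has at least one word
// within $maxDistance edits of the search term (case-insensitive).
function getMatchesWithin(array $sentences, string $search, int $maxDistance = 1): array
{
    $search = strtolower($search);
    $matches = [];
    foreach ($sentences as $sentence) {
        $words = preg_split('/\s+/', strtolower($sentence), -1, PREG_SPLIT_NO_EMPTY);
        foreach ($words as $word) {
            if (levenshtein($word, $search) <= $maxDistance) {
                $matches[] = $sentence;
                break; // one close word is enough for this sentence
            }
        }
    }
    return $matches;
}

$data = [
    'polka dresses',
    'polka clothes',
    'polka dots dress',
    'polka dots bottoms',
    'dummy dummy',
];
print_r(getMatchesWithin($data, 'p0lka'));
```

With a threshold of 1, p0lka still matches every polka entry, while a term that is nowhere near any stored word returns an empty array instead of the least bad match.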

Related

Find all substrings within a string with overlap

Hi, I'm trying to find all overlapping substrings in a string. Here is my code; it only finds the non-overlapping occurrences of ACA.
$haystack = "ACAAGACACATGCCACATTGTCC";
$needle = "ACA";
echo preg_match_all("/$needle/", $haystack, $matches);
You're using echo to print the return value of preg_match_all. That is, you're displaying only the number of matches found. What you probably wanted to do was something like print_r($matches);, like this:
$haystack = "ACAAGACACATGCCACATTGTCC";
$needle = "ACA";
preg_match_all("/$needle/", $haystack, $matches);
print_r($matches);
Output:
Array
(
[0] => Array
(
[0] => ACA
[1] => ACA
[2] => ACA
)
)
If your real concern is that it counted ACACA only once, there are three things to say about that:
1. With a plain pattern this is unavoidable, because each match consumes the characters it matched and the next match starts after them.
2. You arguably shouldn't count this twice, as the occurrences overlap. It's not a true recurrence of the pattern.
3. That said, if you want to count it twice, you can use a lookahead, which matches without consuming any characters:
echo preg_match_all("/(?=$needle)/", $haystack, $matches);
Output:
4
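If you need the positions rather than just the count, the same lookahead can be combined with the PREG_OFFSET_CAPTURE flag. A minimal sketch, assuming the needle is a literal string (hence the preg_quote call, which I added):

```php
<?php
$haystack = "ACAAGACACATGCCACATTGTCC";
$needle = "ACA";

// preg_quote() guards against regex metacharacters in the needle.
$pattern = '/(?=' . preg_quote($needle, '/') . ')/';
preg_match_all($pattern, $haystack, $matches, PREG_OFFSET_CAPTURE);

// Each entry of $matches[0] is [matched text, offset]. The matched text is
// empty because a lookahead consumes nothing, so only the offset is useful.
$positions = array_column($matches[0], 1);
print_r($positions);
```

For this haystack the offsets are 0, 5, 7 and 14.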
Here is a script to find all occurrences of a substring, including the overlapping ones.
$haystack = "ACAAGACACATGCCACATTGTCC";
$needle = "ACA";
$positions = [];
$needle_len = strlen($needle);
$haystack_len = strlen($haystack);
for ($i = 0; $i <= $haystack_len - $needle_len; $i++) {
// compare the needle-length slice starting at $i with the needle
if (substr($haystack, $i, $needle_len) === $needle) {
$positions[]=$i;
}
}
print_r($positions);
Output: Array ( [0] => 0 [1] => 5 [2] => 7 [3] => 14 )
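The same idea can also be written with strpos and a moving offset, which avoids slicing the haystack at every index. This is only a sketch; the function name findAllOverlapping is made up for illustration.

```php
<?php
// Hypothetical helper: return the start offsets of every occurrence of
// $needle in $haystack, including occurrences that overlap.
function findAllOverlapping(string $haystack, string $needle): array
{
    $positions = [];
    if ($needle === '') {
        return $positions; // nothing sensible to search for
    }
    $offset = 0;
    while (($pos = strpos($haystack, $needle, $offset)) !== false) {
        $positions[] = $pos;
        $offset = $pos + 1; // advance by one character only, so overlaps are found
    }
    return $positions;
}

print_r(findAllOverlapping("ACAAGACACATGCCACATTGTCC", "ACA"));
```

Advancing the offset by one instead of by the needle length is what allows the match at 7 to be found right after the one at 5.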

PHP Get first array value that ends with ".jpg"

I have some arrays and they might all be formatted like so:
Array (
[0] => .
[1] => ..
[2] => 151108-some_image-006.jpg
[3] => high
[4] => low
)
I know they have 5 values, but I can not be certain where each value is being placed.
I am trying to get only the image out of this array
$pos = array_search('*.jpg', $main_photo_directory);
echo $main_photo_directory[$pos];
But as we all know, it's looking for a literal *.jpg which it can't find. I'm not so super at regex and wouldn't know how to format an appropriate string.
What is the best way to get the first image (assuming there may be more than one) out of this array?
ADDITION
One of the reasons I am asking is to find a simple way to search through an array. I do not know regex, though I'd like to use it. I wrote the question looking for a regex to find '.jpg' at the end of a string which no search results had yielded.
This should work:
echo current(preg_grep('/\.jpg$/', $array));
You can loop over the array until you find a string ending in jpg:
function endsWith($haystack, $needle) {
return $needle === "" || (($temp = strlen($haystack) - strlen($needle)) >= 0 && strpos($haystack, $needle, $temp) !== FALSE);
}
$array = []; // your array of file names goes here
$result = false;
foreach($array as $img) {
if(endsWith($img, '.jpg')) {
$result = $img;
break;
}
}
echo $result;
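If the file names might use a different case or the .jpeg extension, the preg_grep approach can be loosened a little. A small sketch under those assumptions; the helper name firstJpeg and the upper-case file name in the sample array are mine, not from the question.

```php
<?php
// Hypothetical helper: return the first entry that ends in .jpg or .jpeg
// (case-insensitive), or null when there is none.
function firstJpeg(array $files): ?string
{
    $images = preg_grep('/\.jpe?g$/i', $files);
    // preg_grep() preserves the original keys, so reset() fetches the first hit.
    $first = reset($images);
    return $first === false ? null : $first;
}

$main_photo_directory = ['.', '..', 'high', '151108-some_image-006.JPG', 'low'];
echo firstJpeg($main_photo_directory), PHP_EOL;
```

Returning null for the "no image" case makes it easier to tell an empty directory apart from a real file name.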

similar substring in other string PHP

How can I check for substrings in PHP by prefix or postfix?
For example, I have a search string named $to_search:
$to_search = "abcdef";
And these cases to check against $to_search:
$cases = ["abc def", "def", "deff", ... Other values ...];
Now I have to detect the first three cases using the substr() function.
How can I detect "abc def", "def", and "deff" as substrings of "abcdef" in PHP?
You might find the Levenshtein distance between the two words useful - it'll have a value of 1 for abc def. However your problem is not well defined - matching strings that are "similar" doesn't mean anything concrete.
Edit - If you set the deletion cost to 0 then this very closely models the problem you are proposing. Just check that the levenshtein distance is less than 1 for everything in the array.
This will find if any of the strings inside $cases are a substring of $to_search.
foreach($cases as $someString){
if(strpos($to_search, $someString) !== false){
// $someString is found inside $to_search
}
}
Only "def" matches, though, since none of the other strings is actually a substring of "abcdef".
Also, on a side note: the terms are prefix and suffix, not postfix.
To find any of the cases that either begin with or end with either the beginning or ending of the search string, I don't know of another way to do it than to just step through all of the possible beginning and ending combinations and check them. There's probably a better way to do this, but this should do it.
$to_search = "abcdef";
$cases = ["abc def", "def", "deff", "otherabc", "noabcmatch", "nodefmatch"];
$matches = array();
$len = strlen($to_search);
for ($i=1; $i <= $len; $i++) {
// get the beginning and end of the search string of length $i
$pre_post = array();
$pre_post[] = substr($to_search, 0, $i);
$pre_post[] = substr($to_search, -$i);
foreach ($cases as $case) {
// get the beginning and end of each case of length $i
$pre = substr($case, 0, $i);
$post = substr($case, -$i);
// check if any of them match
if (in_array($pre, $pre_post) || in_array($post, $pre_post)) {
// using the case as the array key for $matches will keep it distinct
$matches[$case] = true;
}
}
}
// use array_keys() to get the keys back to values
var_dump(array_keys($matches));
You can use array_filter function like this:
$cases = ["cake", "cakes", "flowers", "chocolate", "chocolates"];
$to_search = "chocolatecake";
$search = strtolower($to_search);
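$arr = array_filter($cases, function($val) use ($search) {
// lowercase the value, strip a trailing "s" and any spaces, then look for it in the search string
$needle = str_replace(' ', '', preg_replace('/s$/', '', strtolower($val)));
return strpos($search, $needle) !== FALSE;
});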
print_r($arr);
Output:
Array
(
[0] => cake
[1] => cakes
[3] => chocolate
[4] => chocolates
)
As you can see, it prints all the values you expected, apart from deff, which is not part of the search string abcdef, as I commented above.

Match part of string with part of other string

I am working on a simple search function where I want to match part of a string with part of another string.
Example:
The search term is: fruitbag
I want to match the product: fruit applebag
I want to create something so that the system matches:
fruitbag
fruit applebag
Or even "fruit" and "bag".
In summary; parts inside a string need to match with parts inside the search term. Is this possible?
$products = array(
'fruit applebag',
'pinapple',
'carrots',
'bananas',
'coconut oil',
'cabbage',
);
if( ! empty( $_POST['search'] ) ) {
$s = $_POST['search'];
$results = array();
foreach( $products as $index => $product ) {
if( preg_match( '/' . $s . '.*/i', $product, $matched ) ) {
$results[] = $matched[0];
}
}
print_r($results);
// This only returns fruit applebag if the search term is something like "fruit ap"
}
Use something like this (split the search term into two parts and look for a match that has characters between those two parts):
$products = array(
'fruit applebag',
'pinapple',
'carrots',
'bananas',
'coconut oil',
'cabbage',
);
$s = 'fruitbag';
$results = array();
foreach( $products as $index => $product ) {
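// try every split point: the first $i characters, anything in between, then the rest
for($i=1;$i<strlen($s);$i++){
$head = preg_quote(substr($s,0,$i), '/');
$tail = preg_quote(substr($s,$i), '/');
if(preg_match( '/' . $head . '.*' . $tail . '/i', $product, $matched ) ) {
$results[] = $matched[0];
}
}
}
print_r($results);
Output:
Array ( [0] => fruit applebag )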
It is possible, but could be very costly. As stated, your requirement is that any substring of the search term is potentially relevant. So take fruitbag and generate a list of all substrings:
f,r,u,i,t,b,a,g,fr,ru,ui,it,tb,ba,ag,fru,rui,uit,...,bag,...,fruit,...,fruitbag
But you probably don't want every word containing the letter a to match. A first approach could be to require a minimum number of letters (e.g. 3), which significantly limits the potential matches. But even then, does it make sense to match fru or rui?
A better approach would be to use a dictionary, to extract actual words or syllables from your search string. (In your case, extract fruit and bag from fruitbag).
You can find an English dictionary fairly easily.
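To make the dictionary idea a bit more concrete, here is a rough sketch. The tiny $dictionary array stands in for a real word list, and the function names extractWords and matchProducts are invented for this example; a real implementation would need a proper dictionary and probably smarter word splitting.

```php
<?php
// Hypothetical helper: return the dictionary words that occur inside the search term.
function extractWords(string $term, array $dictionary): array
{
    $found = [];
    foreach ($dictionary as $word) {
        if (stripos($term, $word) !== false) {
            $found[] = $word;
        }
    }
    return $found;
}

// Hypothetical helper: keep the products that contain every extracted word.
function matchProducts(array $products, array $words): array
{
    if ($words === []) {
        return []; // no recognizable words, so nothing to match on
    }
    return array_values(array_filter($products, function ($product) use ($words) {
        foreach ($words as $word) {
            if (stripos($product, $word) === false) {
                return false;
            }
        }
        return true;
    }));
}

// Assumed stand-in for a real English word list.
$dictionary = ['fruit', 'bag', 'apple', 'oil'];
$products = ['fruit applebag', 'pinapple', 'carrots', 'bananas', 'coconut oil', 'cabbage'];

$words = extractWords('fruitbag', $dictionary);
print_r(matchProducts($products, $words));
```

Requiring every extracted word keeps cabbage out of the results even though it happens to contain bag. Whether you want "all words" or "any word" is a design choice that depends on how strict the search should be.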

Full text search PHP alone

I have an InnoDB table from which values are retrieved and stored in an array in PHP.
Now I want to sort the array by relevance to the matches in the search string.
eg: If I search "hai how are you", it will split the string into separate words as "hai" "how" "are" "you" and the results after search must be as follows:
[0] hai how are all people there
[1] how are things going
[2] are you coming
[3] how is sam
...
Is there any way I can sort the array by relevance in basic PHP functions alone?
Maybe something like this:
$arrayToSort=array(); //define your array here
$query="hai how are you";
// similar_text() returns the number of matching characters; higher means more relevant
usort($arrayToSort, function($arrayMember1, $arrayMember2) use ($query) {
$a = similar_text($arrayMember1, $query);
$b = similar_text($arrayMember2, $query);
return $b <=> $a; // most similar first
});
Look in the php manual for clarification on what similar_text and usort do.
$searchText = "hai how are you"; // e.g. there may be multiple spaces between words
// split() was removed in PHP 7; preg_split() also collapses repeated whitespace
$searchArray = preg_split('/\s+/', trim($searchText));
$matches = array(); // keys of the matching entries
$text = array(0 => 'hai how are all people there',
1 => 'how are things going ',
2 => 'are you coming',
3 => 'how is sam',
4 => 'testing ggg');
foreach($text as $key=>$elt){
foreach($searchArray as $searchelt){
if(strpos($elt,$searchelt)!== FALSE){
$matches[] = $key; //just storing key to avoid memory wastage
break;
}
}
}
//print the matched string with help of stored keys
echo '<pre>matched string are as follows: ';
foreach ($matches as $key){
echo "<br>{$text[$key]}";
}
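The snippet above only filters the entries; it does not order them. One possible way to sort by relevance with plain PHP is to score each entry by how many distinct search words it contains and sort on that score. This is a sketch of that idea only; the function name rankByMatchedWords and the scoring rule are my assumptions, and a real full-text search would weigh words differently.

```php
<?php
// Hypothetical helper: rank rows by the number of distinct query words they
// contain (whole words, case-insensitive). Rows with no match are dropped;
// ties keep their original order.
function rankByMatchedWords(array $rows, string $query): array
{
    $queryWords = array_unique(
        preg_split('/\s+/', strtolower(trim($query)), -1, PREG_SPLIT_NO_EMPTY)
    );

    $ranked = [];
    foreach (array_values($rows) as $index => $row) {
        $rowWords = preg_split('/\W+/', strtolower($row), -1, PREG_SPLIT_NO_EMPTY);
        $score = count(array_intersect($queryWords, $rowWords));
        if ($score > 0) {
            $ranked[] = ['row' => $row, 'score' => $score, 'index' => $index];
        }
    }

    usort($ranked, function ($a, $b) {
        // higher score first, then the original position as a tie-breaker
        return [$b['score'], $a['index']] <=> [$a['score'], $b['index']];
    });

    return array_column($ranked, 'row');
}

$text = [
    'how is sam',
    'testing ggg',
    'are you coming',
    'hai how are all people there',
    'how are things going',
];
print_r(rankByMatchedWords($text, 'hai how are you'));
```

The entry with three matching words comes first, the two entries with two matches follow in their original order, and the entry with no matching word is left out.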
