PHP: find biggest overlap between multiple strings

I have this array:
$array = array('abc123', 'ac123', 'tbc123', '1ac123');
I want to compare each string to each other and find the longest common substring. In the example above the result would be c123.

Update
I've completely misunderstood the question; the aim was to find the biggest overlap between an array of strings:
$array = array('abc123', 'ac123', 'tbc123', '1ac123');
function overlap($a, $b)
{
    if (!strlen($b)) {
        return '';
    }
    if (strpos($a, $b) !== false) {
        return $b;
    }
    $left = overlap($a, substr($b, 1));
    $right = overlap($a, substr($b, 0, -1));
    return strlen($left) > strlen($right) ? $left : $right;
}
$biggest = null;
foreach ($array as $item) {
    if ($biggest === null) {
        $biggest = $item;
    }
    if (($biggest = overlap($biggest, $item)) === '') {
        break;
    }
}
echo "Biggest match = $biggest\n";
I'm not great at recursion, but I believe this should work ;-)
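One caveat with folding overlap() pairwise across the array: it can miss the answer when the longest match of the first two strings is not the part shared with the rest. For example, 'xxabYcd', 'xxabZcd', 'cd' folds to an empty string although 'cd' is common to all three. The recursion can also get slow for long strings. Below is a minimal sketch of an alternative, not the original answer's method: it enumerates substrings of the shortest string from longest to shortest and returns the first one found in every other string. The function name longest_common_substring is made up for illustration.

```php
<?php
// Hypothetical helper (not from the original answer): longest substring
// shared by every string in $strings, or '' if there is none.
function longest_common_substring(array $strings): string
{
    if (!$strings) {
        return '';
    }
    // Any common substring must fit inside the shortest string.
    usort($strings, function ($x, $y) {
        return strlen($x) - strlen($y);
    });
    $shortest = array_shift($strings);
    $n = strlen($shortest);

    // Try candidate lengths from longest to shortest; the first hit wins.
    for ($len = $n; $len > 0; $len--) {
        for ($start = 0; $start + $len <= $n; $start++) {
            $candidate = substr($shortest, $start, $len);
            foreach ($strings as $s) {
                if (strpos($s, $candidate) === false) {
                    continue 2; // missing from one string: try the next candidate
                }
            }
            return $candidate;
        }
    }
    return '';
}

echo longest_common_substring(['abc123', 'ac123', 'tbc123', '1ac123']), "\n"; // c123
```

This is a brute-force approach, which should be fine for short strings like the ones in the question; for long inputs a suffix-based structure would likely be a better fit.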
Old answer
I would probably use preg_grep() for that; it returns an array with the matches it found based on your search string:
$matches = preg_grep('/' . preg_quote($find, '/') . '/', $array);
Alternatively, you could use array_filter():
$matches = array_filter($array, function($item) use ($find) {
return strpos($item, $find) !== false;
});
I need to extract the value "c123", since it is the biggest match across all strings in the array
I think what you would want to do here is then sort the above output based on string length (i.e. smallest string length first) and then take the first item:
if ($matches) {
usort($matches, function($a, $b) {
return strlen($a) - strlen($b);
});
echo current($matches); // take first one: ac123
}
Let me know if I'm wrong about that.
If you're just after knowing whether $find matches an element exactly:
$matching_keys = array_keys($array, $find, true); // could be empty array
Or:
$matching_key = array_search($find, $array, true); // could be false
Or even:
$have_value = in_array($find, $array, true);

in_array($find, $array);
returns true if the value is in the array, but it has to be an exact match, so in your case it won't find 'ac123'.
If you want to see whether an element contains the string, you need to loop through the array and use preg_match() or similar.

You could use array_filter with a callback.
$output = array_filter ($input, function ($elem) { return false !== strpos ($elem, 'c123'); });

<?php
$array1 = array('abc123', 'ac123', 'tbc123', '1ac123');
if (in_array("c123", $array1)) {
echo "Got c123";
}
?>

You can use in_array as used here http://codepad.org/nOdaajNe
or you can use array_search as used here http://codepad.org/DAC1bVCi
See if it helps.
Documentation link : http://php.net/manual/en/function.array-search.php and http://www.php.net/manual/en/function.in-array.php

Related

Check if all string values of one array are either similar or sub-string of another array values

OK, I'm posting this question after going through many similar ones, but they only considered numeric values. My issue is:
I have 2 arrays -
$a = ["XYZ1250H100",
"XYZ1280H130",
"XYZ1250H150",
"XYZ3300H200",
"XYZ3350H200",
"XYZ33350H280Ada08",
"XYZ33450H300Ada08",
"XYZ33508H406Ada08"];
and
$b = ["XYZ0L200H150A4c00",
"350L610H457Ada08",
"XYZ33762H610Ada08",
"350L914H610Ada08",
"3700L250H200A410b",
"XYZ33457H305Ada08",
"XYZ4550H100MMQOJ",
"XYZ4580H130Ada08",
"XYZ4550H150A69b5",
"3101L280H356A8b83",
"XYZ4550H1501FC5Z",
"3116L150H15074QFR",
"XYZ1250H200A21ac",
"3101L750H500A8b83",
"350L356H279Ada08",
"XYZ1250H200A3f1c",
"3700L153H102A8d96",
"XYZ4550H150Ada08",
"XYZ4580H130A69b5",
"350L1830H610Ada08",
"3700L153H102A4c00",
"XYZ4550H150STD9J",
"3800L200H1505CZJI",
"XYZ4550H100A69b5",
"XYZ331370H450Ada08"
];
I need to check whether the values of array $a are present in array $b. I tried array_diff() and array_intersect(), but they didn't do what I need. What I want is a result of true or false depending on whether the two arrays match. For example, the value "XYZ1280H130" in $a and the value "XYZ1280H130A69b5" in $b should count as a match (because the former is a substring of the latter), so true should be returned, and the same goes for all similar values.
I tried something like this, but it gives back all values of array $b every time:
$result = array_filter($a, function($item) use($b) {
foreach($b as $substring)
if(strpos($item, $substring) !== FALSE) return TRUE;
return FALSE;
});
Please let me know if I missed something. Thanks in advance.
Use regex:
foreach ($b as $search) {
    foreach ($a as $sub) {
        // preg_quote() escapes any regex metacharacters in $sub
        if (preg_match('/' . preg_quote($sub, '/') . '/', $search)) {
            echo "found " . $sub . " in " . $search;
        }
    }
}
You could use array_map.
But this should give you the result you want as well:
function test($a, $b) {
    foreach ($a as $first) {
        foreach ($b as $second) {
            // look for the $a value inside the $b value
            if (strpos($second, $first) !== false) return true;
        }
    }
    return false;
}
Modified one of the answers above by #clearshot66 to get the desired result:
function match_them($a, $b){
foreach($b as $search){
foreach($a as $key => $val){
if(preg_match("/$val/",$search)){unset($a[$key]);}
}
}
return count($a);
}
var_dump(match_them($a, $b));
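Since only plain substring checks are needed, the same test can be written without regular expressions. The following is a sketch under the assumption that "matched" means every value of $a occurs inside at least one value of $b; the name all_contained is hypothetical. Note that the attempt in the question passes the arguments to strpos() the other way round (it looks for $b values inside $a values).

```php
<?php
// Hypothetical helper: true when every string in $needles occurs inside
// at least one string in $haystacks.
function all_contained(array $needles, array $haystacks): bool
{
    foreach ($needles as $needle) {
        $found = false;
        foreach ($haystacks as $haystack) {
            // strpos(haystack, needle): is $needle a substring of $haystack?
            if (strpos($haystack, $needle) !== false) {
                $found = true;
                break; // one containing string is enough for this needle
            }
        }
        if (!$found) {
            return false; // this needle appears nowhere
        }
    }
    return true;
}

var_dump(all_contained(['XYZ1250H200'], ['XYZ1250H200A21ac', '350L610H457Ada08'])); // bool(true)
var_dump(all_contained(['XYZ1280H130'], ['XYZ1250H200A21ac']));                     // bool(false)
```

If a single match is enough instead of all of them, the inner loop can simply return true on the first hit.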

Group the same array element in PHP

I have this function to check for word sequences:
function sequence($arr_scheme = [], $arr_input = [])
{
$sequence_need = array_values(array_intersect($arr_scheme, $arr_input));
if(!empty($arr_input) && ($sequence_need == $arr_input)):
return true;
else:
return false;
endif;
}
There were my sample and scheme variables:
$sample = "branch of science";
$scheme = "The branch of science concerned of nature and property of matter and energy";
I have converted to array:
$arr_sample = explode(" ",trim(rtrim(rtrim($sample,".")," ")));
echo 'Sample:';
var_dump($arr_sample);
$arr_scheme = explode(" ",trim(rtrim(rtrim($scheme,".")," ")));
echo '<br/>Scheme:';
var_dump($arr_scheme);
Now, I check the sequences:
$result = sequence($arr_scheme, $arr_sample);
The result:
echo '<br/>Result:';
var_dump($result);
When I set the variable $sample to "branch science", the result is true, which is fine.
However, when I set $sample to "branch of science", the result is false.
The reason is that the word "of" occurs more than once in the scheme. How can I solve this problem?
Find the first input word in the scheme (it can occur multiple times).
Then recurse on the rest of both arrays.
function sequence($arr_scheme = [], $arr_input = [])
{
    if (!$arr_input) return true;
    $first = array_shift($arr_input);
    $occurrences = array_keys($arr_scheme, $first);
    if (!$occurrences) return false;
    foreach ($occurrences as $o) { // loop over occurrences of the first word
        // continue after the matched word so it cannot be matched twice
        $found = sequence(array_slice($arr_scheme, $o + 1), $arr_input);
        if ($found) return true;
    }
    return false;
}
Later occurrences of the first word cannot affect whether a match exists.
So this tail-recursive function will work even better:
function sequence($arr_scheme = [], $arr_input = [])
{
    if (!$arr_input) return true;
    $first = array_shift($arr_input);
    $index = array_search($first, $arr_scheme); // needle first, then the array
    if ($index === false) return false; // not found
    return sequence(array_slice($arr_scheme, $index + 1), $arr_input);
}
You can read more in the array_intersect() documentation. Note: "Returns an array containing all of the values in array1 whose values exist in all of the parameters." Then look at your result: when you call var_dump($arr_scheme);, you see that "of" appears 3 times, so the array returned by array_intersect() has 5 elements, whereas $arr_sample has only 3. That is why the comparison returns false.
As a solution for this case, why don't you try a regular expression or the strpos function?
$sequence_need = array_unique($sequence_need);
array_unique removes any duplicate values in your array, so the duplicate 'of' will be removed. Hope it helps.
I think you should go with array_diff(). It computes the difference of arrays and returns the values in $arr_sample that are not present in $arr_scheme.
So,
array_diff($arr_sample, $arr_scheme)
will return an empty array if all the values in $arr_sample are present in $arr_scheme
The next step would be to count the length of the array returned by array_diff(). If it equals 0, then we should return true
return count(array_diff($arr_sample, $arr_scheme)) === 0;
The above return statement could be presented as:
$diff = array_diff($arr_sample, $arr_scheme);
if (count($diff) === 0) {
return true;
} else {
return false;
}
From your comments it became clear that your function should return true
if all the elements of $arr_input are present in $arr_scheme in the same order
that they appear in $arr_scheme. Otherwise it should return false.
So,
sequence(['branch', 'of', 'science', 'and', 'energy'], ['branch', 'of', 'energy'])
should return true
and
sequence(['branch', 'of', 'science', 'and', 'energy'], ['science', 'of', 'branch'])
should return false
In this case the function sequence() could be defined as follows:
function sequence($arr_scheme = [], $arr_input = [])
{
//test if all elements of $arr_input are present in $arr_scheme
$diff = array_diff($arr_input, $arr_scheme);
if ($diff) {
return false;
}
foreach ($arr_input as $value) {
$pos = array_search($value, $arr_scheme);
if (false !== $pos ) {
$arr_scheme = array_slice($arr_scheme, $pos + 1);
continue;
}
return false;
}
return true;
}
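The same "in order, gaps allowed" check can also be done in a single pass without slicing the scheme array. This is only a sketch: the name is_subsequence is made up, and unlike the function in the question it returns true for an empty input.

```php
<?php
// Hypothetical helper: true when the words of $input appear in $scheme
// in the same order (other words may sit in between).
function is_subsequence(array $scheme, array $input): bool
{
    $input = array_values($input); // make sure indexes run 0..n-1
    $n = count($input);
    $i = 0;                        // index of the next input word to match

    foreach ($scheme as $word) {
        if ($i < $n && $word === $input[$i]) {
            $i++;                  // matched; move on to the next input word
        }
    }
    return $i === $n;              // true only if every input word was matched
}

$scheme = explode(' ', 'The branch of science concerned of nature and property of matter and energy');
var_dump(is_subsequence($scheme, ['branch', 'of', 'science'])); // bool(true)
var_dump(is_subsequence($scheme, ['science', 'of', 'branch'])); // bool(false)
```

Matching each input word at its earliest possible position is sufficient here, because taking an earlier occurrence never rules out a later match.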

Is there a version of array_keys that works on partial matches?

I want to return all keys in a PHP array where the corresponding value contains a search element.
array_keys will work if the value matches the search term exactly, but not if the search term occurs somewhere in the value but does not match it exactly.
How can I achieve this?
A combination of array_keys() and array_filter() can achieve what you want:
$myArray = ['knitting needle', 'haystack', 'needlepoint'];
$search = 'needle';
$keys = array_keys(
array_filter(
$myArray,
function ($value) use ($search) {
return (strpos($value, $search) !== false);
}
)
);
Try this custom function:
function array_keys_partial(array $haystack, $needle) {
$keys = array();
foreach ($haystack as $key => $value) {
if (false !== stripos($value, $needle)) {
array_push($keys,$key);
}
}
if(empty($keys)){
return false;
}
else {
return $keys;
}
}
This will return the keys on partial matches, and is also case insensitive. If you want to make it case sensitive, change stripos to strpos.
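The two answers above can be combined into one function with a switch for case sensitivity. The sketch below is an illustration only (the name keys_containing and the $caseSensitive parameter are assumptions, not part of either answer); it always returns an array, so callers get an empty array instead of false when nothing matches.

```php
<?php
// Hypothetical helper: keys of all elements whose value contains $needle.
function keys_containing(array $haystack, string $needle, bool $caseSensitive = false): array
{
    // Pick the built-in search function by name: strpos or stripos.
    $find = $caseSensitive ? 'strpos' : 'stripos';

    $matches = array_filter($haystack, function ($value) use ($find, $needle) {
        return $find((string) $value, $needle) !== false;
    });
    return array_keys($matches); // array_filter() preserves the original keys
}

$data = ['a' => 'Knitting Needle', 'b' => 'haystack', 'c' => 'needlepoint'];
print_r(keys_containing($data, 'needle'));       // keys a and c
print_r(keys_containing($data, 'needle', true)); // key c only
```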

Which member of array does the string contain in PHP?

How can I check if a string contains a member of an array, and return the index (integer) of the relevant member?
Let's say my string is this :
$string1 = "stackoverflow.com";
$string2 = "superuser.com";
$r = array("queue" , "stack" , "heap");
get_index($string1 , $r); // returns 1
get_index($string2 , $r); // returns -1 since string2 does not contain any element of array
How can I write this function in an elegant (short) and efficient way ?
I found a function (expression ? ) that checks if the string contains a member of an array :
(0 < count(array_intersect(array_map('strtolower', explode(' ', $string)), $array)))
but this is a boolean. does the count() function return what I want in this statement ?
Thanks for any help !
function get_index($str, $arr){
foreach($arr as $key => $val){
if(strpos($str, $val) !== false)
return $key;
}
return -1;
}
Demo: https://eval.in/95398
This will find the number of matching elements in your array, if you want all matching keys, use the commented lines instead:
function findMatchingItems($string, $items){
    $foundItems = 0; // start counter
    // $foundItems = array(); // start array to save ALL keys
    foreach($items as $key=>$value){ // loop through all items
        if( strpos($string, $value)!==false){ // is this item inside the string?
            ++$foundItems; // if found, increase counter
            // $foundItems[] = $key; // Add the key to the array
        }
    }
    return $foundItems; // return found items
}
findMatchingItems($string1 , $r);
findMatchingItems($string2 , $r);
If you want to return all matching keys, just change $foundItems to an array and add the keys in the if-statement (switch to the commented lines).
If you only want to know whether anything matches:
function findMatchingItems($string, $items){
    foreach($items as $value){
        if( strpos($string, $value)!==false){
            return true; // <- Returning here stops the loop early, saving time ;)
        }
    }
    return false; // failsafe: reached only if no item matched
}
I would do a function like this:
function getIndex($string, $array) {
    $index = -1;
    $i = 0;
    foreach($array as $array_elem) {
        // strpos(haystack, needle): look for the array element inside the string
        if(strpos($string, $array_elem) !== false) {
            $index = $i;
        }
        $i++;
    }
    return $index;
}

Find needle in haystack, where needle is array of needles

I have a function that takes a string (the haystack) and an array of strings (the needles) and returns true if at least one needle is a substring of the haystack. It didn't take much time or effort to write it, but I'm wondering if there's a PHP function that already does this.
function strstr_array_needle($haystack, $arrayNeedles){
foreach($arrayNeedles as $needle){
if(strstr($haystack, $needle)) return true;
}
return false;
}
just a suggestion...
function array_strpos($haystack, $needles)
{
foreach($needles as $needle)
if(strpos($haystack, $needle) !== false) return true;
return false;
}
I think the closest function would be array_walk_recursive(), but that requires a callback. So using it would probably be more complicated than what you already have.
I'm not exactly sure what you're wanting to do but I think in_array() could help you do what you're looking for.
$needleArray = array(1, 2, 3); // the values we want to get from
$outputArray = array( ... ); // the values to search for
foreach ($outputArray as $value) {
if (in_array($value, $needleArray)) {
// do what you want to do...the $value exists in $needleArray
}
}
If you are just trying to determine which needles exist in the haystack, I suggest the array_intersect function.
Documentation from the PHP.net website
<?php
$array1 = array("a" => "green", "red", "blue");
$array2 = array("b" => "green", "yellow", "red");
$result = array_intersect($array1, $array2);
print_r($result);
?>
The above example will output:
Array
(
[a] => green
[0] => red
)
Basically, this produces an array of all values that appear in both arrays. In your case, your code returns true if any needle is found. The following code does this using the array_intersect function, though whether it is any simpler than Charles's answer is debatable. Note that array_intersect() compares whole array values, so $haystack has to be an array here and substrings are not matched.
if(sizeof(array_intersect($haystack, $arrayNeedles)) > 0)
return true;
else
return false;
Again, I am not sure exactly what your code is trying to do, other than return true if any needle exists. If you can provide some context on what you want to achieve, there may be a better way.
Hope this helps.
There's no single function that behaves as strstr_array_needle (the name is misleading; I'd expect it to return a substring of $haystack). There are other functions that could be used instead of a loop, but they don't have benefits and take more time. For example:
# iterates over entire array, though stops checking once a match is found
array_reduce($needles,
function($found, $needle) use ($haystack) {
return $found || (strpos($haystack, $needle) !== false);
},
false);
# iterates over entire array and checks each needle, even if one is already found
(bool)array_filter($needles,
function($needle) use ($haystack) {
return strpos($haystack, $needle) !== false;
});
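Another loop-free option, offered only as a sketch, is to build one regular expression that alternates over all needles and let preg_match() stop at the first hit. The function name contains_any is hypothetical, and whether this beats the plain loop would need measuring.

```php
<?php
// Hypothetical helper: true when $haystack contains at least one needle.
function contains_any(string $haystack, array $needles): bool
{
    if (!$needles) {
        return false; // an empty pattern would match everything
    }
    // Escape each needle so it is matched literally, then join with "|".
    $quoted = array_map(function ($needle) {
        return preg_quote($needle, '/');
    }, $needles);

    return preg_match('/' . implode('|', $quoted) . '/', $haystack) === 1;
}

var_dump(contains_any('stackoverflow.com', ['queue', 'stack', 'heap'])); // bool(true)
var_dump(contains_any('superuser.com', ['queue', 'stack', 'heap']));     // bool(false)
```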
Here is a tested and working function:
<?php
function strpos_array($haystack, $needles, $offset = 0) {
if (is_array($needles)) {
foreach ($needles as $needle) {
$pos = strpos_array($haystack, $needle, $offset); // pass $offset through so it is honoured for arrays too
if ($pos !== false) {
return $pos;
}
}
return false;
} else {
return strpos($haystack, $needles, $offset);
}
}
