Fastest way to find relevant results in array from an input array


As a mostly front-end developer, this is in the realm of computer science that I don't often delve into, but here's my scenario:
I've got an input string, split on spaces, say "pinto beans".
I've got an array of results to search, containing entries like:
["beans, mung","beans, pinto","beans, yellow","beans, fava"]
What might be the quickest way (preferably in JavaScript or PHP) to find the most "relevant" results, i.e. the ones with the most matches? In the above case, I would like to sort the returned array so that "beans, pinto" is at the top, the other "beans" results come below it, and any remaining results go below those.
My first attempt at this would be to do something like matching each result item against each input item, and incrementing matches on each one, then sorting by most matches to least.
This approach would require me to iterate through the entire result array a ton of times though, and I feel that my lack of CS knowledge is leaving me without the best solution here.
/* EDIT: Here's how I ended up dealing with the problem: */
Based on crazedfred's suggestion and the blog post he mentioned (which was VERY helpful), I wrote some PHP that basically uses a combination of the trie method and the Boyer-Moore method, except that it searches from the beginning of the string (as I don't want to match "bean" in "superbean").
I chose PHP for the ranking because I'm using JS libraries on the front end: benchmarks taken through convenience functions and library overhead wouldn't give me the testable results I'm after, and I can't guarantee that the code won't explode in one browser or another.
Here's the test data:
Search String: lima beans
Result array (from db): ["Beans, kidney","Beans, lima","Beans, navy","Beans, pinto","Beans, shellie","Beans, snap","Beans, mung","Beans, fava","Beans, adzuki","Beans, baked","Beans, black","Beans, black turtle soup","Beans, cranberry (roman)","Beans, french","Beans, great northern","Beans, pink","Beans, small white","Beans, yellow","Beans, white","Beans, chili","Beans, liquid from stewed kidney beans","Stew, pinto bean and hominy"]
First, I drop both the search string and the result array into PHP variables, after explode()ing the string on spaces.
Then, I precompile my patterns to compare the results to:
$max = max(array_map('strlen', $input));
$reg = array();
for ($m = 0; $m < $max; $m++) {
    $reg[$m] = "";
    for ($ia = 0; $ia < count($input); $ia++) {
        // shorter words have no character at this position
        if (isset($input[$ia][$m])) {
            $reg[$m] .= $input[$ia][$m];
        }
    }
}
This gives me something like: ["lb","ie","ma","an","s"]
Then I take each result string (split on spaces) and, character by character, match it against a case-insensitive character class built from the pattern at the same position. If at any point during that comparison I don't get a match, I skip the word. This means that if only 1 result starts with "b" or "l", I'll only run one comparison per WORD, which is really fast. Basically I'm taking the part of a trie that compiles the searches together, and the constant speedup of the Boyer-Moore approach.
Here's the php - I tried whiles, but got SIGNIFICANTLY better results with foreaches:
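$sort = array();
foreach ($results as $result) {
    $matches = 0;
    $resultStrs = explode(' ', $result);
    foreach ($resultStrs as $r) {
        $strlen = strlen($r);
        for ($p = 0; $p < $strlen; $p++) {
            // no input word is this long, so stop scoring this word
            if (!isset($reg[$p])) {
                break;
            }
            // preg_match() returns 1 on a match; preg_quote() keeps characters
            // such as "]" or "^" in the input from breaking the character class
            if (preg_match('/^[' . preg_quote($reg[$p], '/') . ']/i', $r[$p])) {
                $matches++;
            } else {
                break 2;
            }
        }
    }
    $sort[$result] = $matches;
}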
That outputs an array with the results on the keys, and how many character matches we got in total on the values.
The reason I put it that way is to avoid key collisions that would ruin my data, and more importantly, so I can do a quick asort and get my results in order.
That order is in reverse, and on the keys, so after the above code block, I run:
asort($sort);
$sort = array_reverse(array_keys($sort));
That gives me a properly indexed array of results, sorted most to least relevant. I can now just drop that in my autocomplete box.
Because speed is the whole point of this experiment, here are my results. Obviously, they depend partially on my computer.
2 input words, 40 results: ~5ms
2 input words, (one single character, one whole) 126 results: ~9ms
Obviously there are too many variables at stake for these results to mean much to YOU, but as an example, I think it's pretty impressive.
If anyone sees something wrong with the above example, or can think of a better way than that, I'd love to hear about it. The only thing I can think of maybe causing problems right now, is if I were to search for the term lean bimas, I would get the same result score as lima beans, because the pattern isn't conditional based on the previous matches. Because the results I'm looking for and the input strings I'm expecting shouldn't make this happen very often, I've decided to leave it how it is for now, to avoid adding any overhead to this quick little script. However, if I end up feeling like my results are being skewed by it, I'll come back here and post about how I sorted that part.
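One possible way around the "lean bimas" collision is to compare each result word against whole input words instead of per-position character classes. The following is only a sketch in JavaScript (the other language the question allows), not the code used above; `sharedPrefixLength`, `scoreResult` and `rankResults` are hypothetical names. It still matches from the start of each word only, so "bean" would not match inside "superbean".

```javascript
// Length of the common prefix of two lowercase strings.
function sharedPrefixLength(a, b) {
  let n = 0;
  while (n < a.length && n < b.length && a[n] === b[n]) n++;
  return n;
}

// Score a result: for each word in it, add the longest prefix that word
// shares with any single input word.
function scoreResult(result, inputWords) {
  const words = result.toLowerCase().split(/\W+/).filter(Boolean);
  let score = 0;
  for (const word of words) {
    let best = 0;
    for (const input of inputWords) {
      best = Math.max(best, sharedPrefixLength(word, input));
    }
    score += best;
  }
  return score;
}

// Return the result strings ordered from most to least relevant.
function rankResults(results, query) {
  const inputWords = query.toLowerCase().split(/\s+/).filter(Boolean);
  return results
    .map((text) => ({ text, score: scoreResult(text, inputWords) }))
    .sort((a, b) => b.score - a.score)
    .map((entry) => entry.text);
}
```

With this scoring, "Beans, lima" gets 9 points for the query "lima beans" but only 2 for "lean bimas", at the cost of one prefix comparison per (result word, input word) pair.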

I'll suggest a solution in JavaScript, but it can work in PHP too.
This way you don't need nested loops; you can use just a sorting function and a regular expression.
Something like this:
var query = 'pinto beans';
var results = [ 'beans, mung','beans, pinto','beans, yellow','beans, fava' ];
// Build a regular expression for your
// query like /pinto|beans/g by joining each
// query item with the alternation operator
var pattern = new RegExp( query.split( ' ' ).join( '|' ), 'g' );
// Define a function for custom sorting
function regsort( a, b ) {
var ra = a.match( pattern );
var rb = b.match( pattern );
// return the indexing based on
// any single query item matched
return ( rb ? rb.length : 0 ) - ( ra ? ra.length : 0 );
}
// Just launch the sorting
var sorted = results.sort( regsort );
// Should output the right sort:
// ["beans, pinto", "beans, mung", "beans, yellow", "beans, fava"]
console.log( sorted );
I'm not sure if this is the fastest way to handle your request, but it could be a nice way to avoid a nested loop plus string comparison.
Hope this helps!
Ciao

Since you specifically noted that it could be in several languages, I'll leave my answer in pseudocode so you can adapt to the language of your choice.
Since you are matching array-to-array, performance is going to vary a lot based on your implementation, so trying several ways and considering exactly when, how, and how often this will be used is a good idea.
The simple way is to leave the data as-is and run an O(n^2) search:
for (every element in array X)
    for (every element in array Y)
        if (current X == current Y)
            add current X to results
return results
If you sort the arrays first (a sorting algorithm such as quicksort is implemented for you in many languages, check your documentation!), then the actual matching is quicker. Use whatever string comparison your language has:
Sort array X
Sort array Y
Let A = first element of X
Let B = first element of Y
while (A and B are in array)
    if (A > B)
        Next B
    else if (A < B)
        Next A
    else //match!
        Add A to results
        Next A
        Next B
//handle case where one array is larger (trivial loop)
return results
Now the important question for the above solution is whether sorting the arrays saves time versus just running the ordinary O(n^2) search. Usually, moving elements around in arrays is fast whereas string comparisons are not, so it may be worth it. Again, try both.
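As a rough illustration, the sorted walk from the pseudocode could look like this in JavaScript. `sortedIntersection` is a hypothetical name, and the sketch assumes both arrays contain plain strings. For a pure intersection nothing is left to handle once either array runs out, so the loop simply stops there.

```javascript
// Return the strings present in both arrays, using sort + a two-pointer walk.
function sortedIntersection(x, y) {
  // Copy before sorting so the caller's arrays are left untouched.
  const a = [...x].sort();
  const b = [...y].sort();
  const results = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] > b[j]) {
      j++; // Next B
    } else if (a[i] < b[j]) {
      i++; // Next A
    } else {
      results.push(a[i]); // match
      i++;
      j++;
    }
  }
  return results;
}
```

The default `sort()` and the `<` / `>` operators both compare strings by UTF-16 code units, so the ordering used for sorting and for the walk is consistent.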
Finally, there's this crazy algorithm the Mailinator guy dreamed up for doing huge string comparisons in constant time using some awesome data structures. Never tried it myself, but it must work since he runs his whole site on very low-end hardware. He blogged about it here if you're interested. (Note: the blog post is about filtering spam, so some words in the post may be slightly NSFW.)

VERY PRIMITIVE
Just wanted to get that tidbit out of the way up front. But here is a simple implementation of sorting based on the terms used. I will also mention up front that word variations (e.g. searching for "seasoned" when the result contains "seasonal") will have no effect on sorting (in fact, these words will, for all intents and purposes, be thought of as being just as different as "apple" and "orange" because of their suffixes).
With that out of the way, here is one method to do what you're looking for. I'm using this more as an introduction to how you can implement it; how you implement the "scoreWord" function is up to you. I also use jQuery's .each() method just because I'm lazy, but this can be replaced very easily with a for statement.
Anyways, here's one version, I hope it helps.
var search = "seasoned pinto beans";
var results = ["beans, mung","beans, pinto","beans, yellow","beans, pinto, seasonal","beans, fava"];
// scoreWord
// returns a numeric value representing how much of a match this is to the search term.
// the higher the number, the more definite a match.
function scoreWord(result){
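// split on runs of non-word characters so that ", " does not yield empty
// strings, which would count as a partial match for every term
var terms = search.toLowerCase().split(/\W+/),
words = result.toLowerCase().split(/\W+/),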
score = 0;
// go through each word found in the result
$.each(words,function(w,word){
// and then through each word found in the search term
$.each(terms,function(t,term){
// exact word matches score higher (a whole point)
if (term==word)
score++;
// a word found in a word should be considered a partial
// match and carry less weight (1/2 point in this case)
else if (term.indexOf(word)!=-1||word.indexOf(term)!=-1)
score+=0.5;
});
});
// return the score
return score;
}
// go through and sort the array.
results.sort(function(a,b){
// grab each result's "score", then compare them to see which rates higher in
// the array
var aScore = scoreWord(a), bScore = scoreWord(b);
return (aScore>bScore?-1:(aScore<bScore?1:0));
});
// they are officially "sorted" by relevance at this point.

You can precompute a map which maps a word to many results.
var results = ["beans, mung","beans, pinto","beans, yellow","beans, fava"];
var index = {};
for (var i = 0; i < results.length; i ++) {
results[i].replace(/\w+/g, function(a) {
if (!index[a]) {
index[a] = [i];
} else {
index[a].push (i);
}
});
}
When searching, you can split the query into words.
function doSearch(searchString) {
var words = [];
var searchResults = [];
var currentIndex;
searchString.replace(/\w+/g, function(a) {
words.push (a);
});
Build the search result as a copy of the results array, but put each entry in an object so that it can hold both the text and the score.
for (var i = 0; i < results.length; i ++) {
searchResults.push ({
text: results[i],
score: 0
});
}
Then, for each search word, increase the score in the search results.
for (var i = 0; i < words.length; i ++) {
if ((currentIndex = index[words[i]])) {
for (var j = 0; j < currentIndex.length; j ++) {
searchResults[currentIndex[j]].score ++;
}
}
}
Finally, sort it by score.
searchResults.sort (function(a, b) {
return b.score - a.score;
});
return searchResults;
}
Calling doSearch("pinto beans") returns an array of search results along with their scores.
[{text:"beans, pinto", score:2}, {text:"beans, mung", score:1}, {text:"beans, yellow", score:1}, {text:"beans, fava", score:1}]

Related

How to (quickly) discover if two arrays have at least one common item

I am writing a script that will repeatedly search a large group of arrays (40,000) and merge all of the arrays that have at least one common element. I have tried array_intersect(), but I found that it was too slow for this application. Is there another function that is faster and simply returns true if at least one element is shared between two arrays?
It is my assumption that array_intersect() is slowed down by the fact that both arrays are reviewed completely and the common values are grouped together and returned. It would be faster to exit when a single match is found.
To clarify: All arrays are held within another master array (which is a 2D array). If it is discovered that the arrays stored at $master[231] and $master[353] both contain the element 124352354, they should be merged into a new array and the result stored in a different 2D array designed to store merged results.
Current code:
$test = array_intersect($entry, $entry2);
if($test){
...
}
A better method is:
$test = false;
foreach($entry as $item){
    if(in_array($item, $entry2)){
        $test = true;
        break;
    }
}
if($test){
    ...
}
and another improvement is using isset() and array_flip() instead of in_array():
$flipped = array_flip($entry2);
$test = false;
foreach($entry as $item){
    if(isset($flipped[$item])){
        $test = true;
        break;
    }
}
if($test){
    ...
}
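For comparison, the same flip-and-probe idea can be sketched in JavaScript with a `Set`. `haveCommonItem` is a hypothetical helper name, not an existing function. Note that a `Set` compares values strictly, so `1` and `"1"` count as different items, unlike PHP array keys.

```javascript
// Return true as soon as one item of either array is found in the other.
function haveCommonItem(first, second) {
  // Index the smaller array so the Set is as cheap as possible to build.
  const [small, large] =
    first.length <= second.length ? [first, second] : [second, first];
  const seen = new Set(small);
  // Array.prototype.some stops at the first item for which the callback is true.
  return large.some((item) => seen.has(item));
}
```

Building the `Set` costs one pass over the smaller array, after which each probe is a constant-time lookup on average, so the check stops at the first shared element.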

Assuming you want to just discover if two arrays have a common element, you could create your own getIntersect function which would be faster than using array_intersect since it would return instantly on first match.
function getIntersect($arr1, $arr2)
{
foreach($arr1 as $val1)
{
foreach($arr2 as $val2)
{
if($val1 == $val2)
{ return true; }
}
}
return false;
}
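Assuming that what you really want to find is arrays in which at least one element occurs more than once, you could easily have:
function hasCommonElements($arr)
{
    // work on a re-indexed copy and fix the length up front,
    // because unset() shrinks count($arr) while we loop
    $arr = array_values($arr);
    $n = count($arr);
    for($i = 0; $i < $n; $i++)
    {
        $val = $arr[$i];
        unset($arr[$i]);
        if(in_array($val, $arr))
        {
            return true;
        }
    }
    return false;
}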
And you could easily get an array of all arrays containing common elements using array_filter:
array_filter($my40k, "hasCommonElements");
Assuming that what you really want to do is to find all arrays which have at least one value in common, you have to do a higher level array filter.
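$mybigarray; //your big array
function hasIntersects($arr, $key)
{
    // the outer array is not visible inside a function without "global"
    global $mybigarray;
    foreach($mybigarray as $otherKey => $other)
    {
        // skip the array itself, which would always intersect
        if($otherKey !== $key && getIntersect($arr, $other))
        {
            return true;
        }
    }
    return false;
}
Then call our filter monster, passing both value and key to the callback (ARRAY_FILTER_USE_BOTH, PHP 5.6+):
array_filter($mybigarray, "hasIntersects", ARRAY_FILTER_USE_BOTH);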
Disclaimer: None of this stuff is tested. Check for typos

If your arrays contain only values that are also valid keys (integers, ...), it could be better to flip the arrays (swap keys and values), which technically means building an index, and to search on the keys. Examples:
function haveCommonValues1($a1, $a2) {
$a1_flipped = array_flip($a1);
foreach ($a2 as $val) {
if (isset($a1_flipped[$val])) {
return true;
}
}
return false;
}
or if you need the intersection:
function haveCommonValues2($a1, $a2) {
$a1_flipped = array_flip($a1);
$a2_flipped = array_flip($a2);
return array_intersect_key($a1_flipped, $a2_flipped);
}
On some test arrays I got the results below. They depend heavily on the array structures, however, so you need to test with your own data and compare the times.
array_intersect : 0m1.175s
haveCommonValues1 : 0m0.454s
haveCommonValues2 : 0m0.492s

Busting-up pre-conceived notions all around
I both proved my point, and undermined it, in one answer. TL;DR = Do this in SQL.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
PHP built-ins are code-level cleaner, but surprisingly inefficient.
If you are going by key, your best bet would be the tried and true for-loop:
$array1 = array( ... );
$array2 = array( ... );
$match = FALSE;
foreach($array1 as $key => $value) {
if(isset($array2[$key]) && $array2[$key] == $value) {
$match = TRUE;
break;
}
}
It may seem counter-intuitive, but at the execution level, no matter what, you have to iterate over every item in at least one array. You can keep this shorter by only doing it on the shorter array.
array_intersect() keeps going for every key in both arrays, so, although it looks crazy, you just have to do it the "dirty" way.
If the data comes from a database, it would actually be faster to have the SQL engine do the lifting. A simple join with a limit 1 will give you a flag to know if there are duplicates, and then you can execute another query to get the merged data (dynamically generate the query against multiple tables or source queries if you need to do this on more than one pair).
SQL will be orders of magnitude faster than any higher-level language, like PHP, for doing this. I don't care if you already have the arrays in memory: executing the query and loading the new array from the DB will be faster than trying to do the compare and then merge in the resident memory of the app...
Again, with counter-intuitive things...
So, this is interesting:
I made a test script for this at http://pastebin.com/rzeQnyu2
With the matcher (phone number) at 5 digits, the foreach loop consistently executed in 1/100 the time of the other option. HOWEVER, up that to 10 digits (removing all possibility of collision) and the foreach jumps to 36x slower than the other option.
# 5 digit key (phone number)
Match found
For Each completed in 0.0001
Intersect Key completed in 0.0113
# 10 digit key
Match not found
For Each completed in 0.2342
Intersect Key completed in 0.0064
I wonder why the second option (which likely had larger arrays) was faster for Intersect than the smaller one... WEIRD...
This is because, while the intersect always iterates over all items, the foreach loop wins when it can exit early, but looks to be VERY slow if it doesn't get that opportunity. I would be interested in the deeper technical reasons for this.
Either way, in the end, just do it in SQL.

array_intersect has a runtime of O(n * log(n)) because it sorts its input before the comparison itself. Depending on your input, you can improve on that in many different ways (e.g. if you have integers from a small range, you may implement the algorithm using counting sort).
You can find examples of such optimizations right here or here.
A possible solution that does not need sorting is posted in this thread. It also runs in linear time, so I guess this is what you are looking for.

SELECT *
FROM contacts t1 INNER JOIN contacts t2
ON t1.phone = t2.phone
AND t1.AccountID < t2.AccountID
Also, if your system may ever grow to include international phone numbers you should store them as a string type. There are countries, I believe in Europe, that use leading zeroes in their phone numbers and you cannot properly store them with a numeric type.
edit
The below query will return all instances of phone numbers used multiple times with no duplicate rows no matter how many accounts are sharing a phone number:
SELECT DISTINCT t1.AccountID, t1.phone
FROM contacts t1 INNER JOIN contacts t2
ON t1.phone = t2.phone
AND t1.AccountID != t2.AccountID
ORDER by t1.phone
I'd include a SQLfiddle, but it seems to be broken atm. This is the schema/data I used as a test:
CREATE TABLE IF NOT EXISTS `contacts` (
`AccountID` int(11) NOT NULL,
`phone` varchar(32) NOT NULL,
KEY `aid` (`AccountID`),
KEY `phn` (`phone`)
);
INSERT INTO `contacts` (`AccountID`, `phone`) VALUES
(6, 'b'),
(1, 'b'),
(1, 'c'),
(2, 'd'),
(2, 'e'),
(3, 'f'),
(3, 'a'),
(4, 'a'),
(5, 'a'),
(1, 'a');

Find index of value in associative array in php?

If you have an array $p that you populated in a loop like so:
$p[] = array( "id"=>$id, "Name"=>$name);
What's the fastest way to search for John in the Name key, and if found, return the $p index? Is there a way other than looping through $p?
I have up to 5000 names to find in $p, and $p can also potentially contain 5000 rows. Currently I loop through $p looking for each name, and if found, parse it (and add it to another array), splice the row out of $p, and break 1, ready to start searching for the next of the 5000 names.
I was wondering if there is a faster way to get the index than looping through $p, e.g. an isset-type approach?
Thanks for taking a look guys.

Okay so as I see this problem, you have unique ids, but the names may not be unique.
You could initialize the array as:
array($id=>$name);
And your searches can be like:
array_search($name,$arr);
This will work very well, as the native method of finding a needle in a haystack will have a better implementation than your own.
e.g.
$id = 2;
$name= 'Sunny';
$arr = array($id=>$name);
echo array_search($name,$arr);
Echoes 2
The major advantage in this method would be code readability.

If you know that you are going to need to perform many of these types of search within the same request then you can create an index array from them. This will loop through the array once per index you need to create.
$piName = array();
foreach ($p as $k=>$v)
{
$piName[$v['Name']] = $k;
}
If you only need to perform one or two searches per page then consider moving the array into an external database, and creating the index there.
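To make the index idea concrete, here is a minimal JavaScript sketch of the same technique. `buildNameIndex` is a hypothetical helper and the sample rows are made up for illustration. Unlike the PHP loop above, which keeps only the last position for a repeated name, this version keeps every position, since names may not be unique.

```javascript
// Build a Map from each Name to the list of row positions holding it.
function buildNameIndex(rows) {
  const index = new Map();
  rows.forEach((row, position) => {
    const positions = index.get(row.Name);
    if (positions) {
      positions.push(position);
    } else {
      index.set(row.Name, [position]);
    }
  });
  return index;
}

// Made-up sample data in the shape used by the question.
const p = [
  { id: 1, Name: "Sunny" },
  { id: 2, Name: "John" },
  { id: 3, Name: "John" },
];
const byName = buildNameIndex(p);
console.log(byName.get("John")); // positions 1 and 2
console.log(byName.has("Alice")); // false
```

The index costs one pass over the rows to build; after that, each of the 5000 lookups is a single `Map` access instead of another loop over `$p`.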

$index = 0;
$search_for = 'John';
$result = array_reduce($p, function($r, $v) use (&$index, $search_for) {
    if($v['Name'] == $search_for) {
        $r[] = $index;
    }
    ++$index;
    return $r;
}, array()); // start from an empty array so $result is never null
$result will contain all the indices of elements in $p where the element with key Name had the value John. (This of course only works for an array that is indexed numerically beginning with 0 and has no “holes” in the index.)
Edit: It is possibly even easier to just use array_filter. That will not return only the indices but all array elements where Name equals John; the indices will be preserved, though:
$result2 = array_filter($p, function($elem) {
return $elem["Name"] == "John" ? true : false;
});
var_dump($result2);
Which one suits your needs better, and which one is maybe faster, is for you to figure out.

Improving algorithm using MySQL

The code below is written mainly using PHP, but I am hoping to speed up the process, and parsing strings in PHP is slow.
Assume the following where I get a string from the database, and converted it into an array.
$data['options_list'] = array(
"Colours" => array('red','blue','green','purple'),
"Length" => array('3','4','5','6'),
"Voltage" => array('6v','12v','15v'),
);
These subarrays will each be a dropdown select list, and the end user can select exactly 1 option from each of the select lists.
When the user hits submit, I will want to match the submitted values against a "price table" pre-defined by the admins. Potentially "red" and "6v" would cost $5, but "red" and "5"(length) and "6v" would cost $6.
The question is, how to do so?
Currently the approach I have taken is such:
Upon submission of the form (of the 3 select lists), I get the relevant price rules set by the admin from the database. I've made an example of results.
$data['price_table'] =
array(
'red;4'=>'2',
'red;5'=>'3',
'red;6'=>'4',
'blue;3'=>'5',
'blue;4'=>'6',
'blue;5'=>'7',
'blue;6'=>'8',
'green;3'=>'9',
'green;4'=>'10',
'green;5'=>'11',
'green;6'=>'12',
'purple;3'=>'13',
'purple;4'=>'14',
'purple;5'=>'15',
'purple;6'=>'16',
'red;3'=>'1',
'red;3;12v'=>'17',
'blue;6;15v'=>'18',
);
Note : The order of the above example can be of any order, and the algorithm should work.
I then explode each of the above keys into an array, and take the result with the best matching score.
$option_choices = $this->input->post('select');
$score = 0;
foreach($data['price_table'] as $key=>$value)
{
$temp = 0;
$keys = explode(';',$key);
foreach($keys as $k)
{
if(in_array($k, $option_choices))
{
$temp++;
}else{
$temp--;
}
}
if($temp > $score)
{
$score = $temp;
$result = $value;
}
}
echo "Result : ".$result;
Examples of expected results:
Selected options: "red","5"
Result: 3
Selected Options: "3", "red"
Result: 1
Selected Options: "red", "3", "12v"
Result: 17
The current method works as expected. However, handling these using PHP is slow. I've thought of using JSON, but that would mean that I would be giving the users my whole price table, which isn't really what I am looking for. I have also thought of using another language, (e.g python) but it wouldn't particularly be practical considering the costs. That leaves me with MySQL.
If someone can suggest a cheap and cost-efficient way to do this, please provide an example. Better still if you could provide a faster PHP solution.
Thank you!

It looks like you did work to make the results read faster but you're still parsing and testing every array part against the full list? This would probably run faster moving the search to MySQL and having extra columns there.
Since you can control the array (or test string) perhaps try fixed length strings:
$results = explode("\n", "
1 Red v1
22 Blue v2
333 Green v3");
$i = 0;
while($i < count($results)) {
$a = substr($results[$i], 0, 10);
$b = substr($results[$i], 10, 20);
$c = substr($results[$i], strrpos($results[$i], ' ') + 1); // text after the last space
if(stripos($userInput, $a . $b . $c) !== false) {
// parse more...
}
$i++;
}
Supposedly JavaScript is good at this with memoization:
http://addyosmani.com/blog/faster-javascript-memoization/
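As one possible direction, the scoring loop from the question can be reshaped so that the rule keys are split only once and each selected option is checked with a constant-time lookup. This is a JavaScript sketch under assumptions, not a tested drop-in: `findPrice` and `rules` are hypothetical names, and `priceTable` holds just a few of the entries from the question. The same shape carries over to PHP with `array_flip()` and `isset()`.

```javascript
// A few entries copied from the price table in the question.
const priceTable = {
  "red;5": "3",
  "red;3": "1",
  "red;3;12v": "17",
  "blue;6;15v": "18",
};

// Split every rule key once, up front, instead of on every request.
const rules = Object.entries(priceTable).map(([key, price]) => ({
  options: key.split(";"),
  price,
}));

// Same scoring as the question: +1 for each rule option that was selected,
// -1 for each one that was not; the highest positive score wins.
function findPrice(selected) {
  const chosen = new Set(selected);
  let bestScore = 0;
  let bestPrice; // stays undefined if no rule scores above zero
  for (const rule of rules) {
    let score = 0;
    for (const option of rule.options) {
      score += chosen.has(option) ? 1 : -1;
    }
    if (score > bestScore) {
      bestScore = score;
      bestPrice = rule.price;
    }
  }
  return bestPrice;
}
```

The `Set` replaces the repeated `in_array()` scans, and pre-splitting removes the `explode()` call from the per-request path.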

javascript equivalent to php max for arrays

I have a function in PHP that selects the array that contains the most elements.
$firstArray = array('firstArray','blah','blah','blah');
$secondArray = array('secondArray','blah','blah');
$thirdArray = array('thirdArray','blah','blah','blah','blah');
then I get the name of the variable with the highest length like this:
$highest = max($firstArray, $secondArray, $thirdArray)[0];
but I am developing an application and I want to avoid using php and I have tried javascript's Math.max() to achieve the same results but it doesn't work the same way unless I do
Math.max(firstArray.length, secondArray.length, thirdArray.length)
But this is useless since I need to know the name of the array that contains the most elements. Is there any other way to achieve this?

This function takes as input an array of arrays, and returns the largest one.
function largestArray(arrays){
var largest;
for(var i = 0; i < arrays.length; i++){
if(!largest || arrays[i].length > largest.length){
largest = arrays[i];
}
}
return largest;
}
We can test it out with your example:
firstArray = ['firstArray','blah','blah','blah'];
secondArray = ['secondArray','blah','blah'];
thirdArray = ['thirdArray','blah','blah','blah','blah'];
// should print the third array
console.log(largestArray([firstArray, secondArray, thirdArray]));
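If the goal is the name rather than the array itself, one option is to keep the arrays in an object keyed by name. The sketch below assumes that layout; `nameOfLargest` and `candidates` are hypothetical names. It also avoids storing the name as the first element of each array, as the PHP version did.

```javascript
// Return the key of the longest array in an object of named arrays.
function nameOfLargest(arraysByName) {
  let bestName; // stays undefined when the object is empty
  let bestLength = -1;
  for (const [name, items] of Object.entries(arraysByName)) {
    if (items.length > bestLength) {
      bestLength = items.length;
      bestName = name;
    }
  }
  return bestName;
}

const candidates = {
  firstArray: ["blah", "blah", "blah"],
  secondArray: ["blah", "blah"],
  thirdArray: ["blah", "blah", "blah", "blah"],
};
console.log(nameOfLargest(candidates)); // "thirdArray"
```

On a tie, the first name encountered wins, because the comparison is strictly greater-than.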

The following URL has a max() equivalent. It supports more than just numbers, just like in PHP:
js max equivalent of php

If you feel ok with including 3rd-party libs, maybe http://underscorejs.org/#max does what you want:
var aList = [firstArray, secondArray, thirdArray];
_.max(aList, function(each) { return each.length; });

What's quicker, an array lookup (including array build) or an IF stack?

I was wondering which was better:
$lookup = array( "a" => 1, "b" => 2, "c" => 3 );
return $lookup[$key];
or
if ( $key == "a" ) return 1;
else if ( $key == "b" ) return 2;
else if ( $key == "c" ) return 3;
or maybe just a nice switch...
switch($key){
case "a": return 1;
case "b": return 2;
case "c": return 3;
}
I always prefer the first method because it lets me separate the data from the code. At this scale it looks quite silly, but what about a larger scale, with thousands of lines of lookup entries? How much longer is PHP going to take to build an array when only 1 or 2 entries are checked per request?
I think it'd have to be tested and clocked, but I'd say the bigger and more complicated the array, the slower it's going to become.
PHP should be able to handle lookups natively faster than I can in PHP code, but building the array in the first place surely takes up a lot of time.

For anything with measurable performance (not only 3 entries), a lookup is the fastest way. That's what hash tables are for.

First, it's easy to test it yourself.
Second, and more importantly, which is most appropriate for the code you're using? The amount of time you'll save is negligible in any case.

There will be a tipping point; you will just have to test to find it. My guess is that with 3 items you are better off with if/then/else. This is a nice article on bit counting which compared computing the number of bits and using lookups. Spoiler: lookups won!

Are you building the array every time, or can you build it once and cache it?
If you are building it every time, I cannot see how that could be faster. Building the array by itself should take longer than the chained if()s (adding one item to the array would be close in time to one if(), but you'd have to add every item, whereas you could exit from the if() chain early).
If you can use a cached array, then I think that would be the clear winner.
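The "build once and cache" variant can be sketched as follows, here in JavaScript, where a module-level constant lives for the life of the process. `LOOKUP` and `lookupValue` are hypothetical names. In PHP the equivalent would depend on how the array is kept alive between calls or requests, which this sketch does not cover.

```javascript
// Built once when the module loads, then reused by every call.
const LOOKUP = new Map([
  ["a", 1],
  ["b", 2],
  ["c", 3],
]);

function lookupValue(key) {
  // Map.prototype.get returns undefined for a missing key.
  return LOOKUP.get(key);
}
```

Each call then pays only for the lookup itself, not for constructing the table.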

So I did a bit of testing with this example and got the following results:
emptyfunction: 0.00000087601416110992430969503855231472755349386716
lookuparray: 0.00000136602194309234629100648257538086483009465155
makearrayonly: 0.00000156002373695373539708814922266633118397294311
makearray: 0.00000174602739810943597796187489595842734502184612
ifblock: 0.00000127001986503601083772739543942265072473674081
switchblock: 0.00000131001937389373773757957151314679222764425504
Each was inside a method, so I also included the time for an empty method. They were each run 1,000,000 times and the times were then averaged.
Just doing a lookup (without the building of the array) is actually slower than an if block (uses a global lookup the same as my code) and just by a fraction slower than a switch block.
I can't be bothered scaling this up to hundreds of if statements but it just shows that the if statement is faster even at this level against a single lookup.

If you've got thousands of entries, an array lookup will win hands down. The associative array might be a bit slow, but finding an array key is much faster than doing thousands of if() blocks (not to mention the time it takes to type it all out!)

You could test how long it takes to see if the value is not there as well. My guess is the array key lookup will be faster. And if you have to query it twice or more, caching the array should make it faster.
But speed is not the most important thing here. The array key is better for the long term, where you want to add and remove data. You can also get the data from another source in the future.
If you just want to look up a value
you use an array.
If you want to take an action then
if and switch both have their uses.

This is a little test for array manipulations
{
    $x = 0;
    foreach ($test as $k => $v) {
        $x = sprintf("%s=>%s\n", $k, $v);
    }
}
{
    $x = 0;
    reset($test);
    while (list($k, $v) = each($test)) {
        $x = sprintf("%s=>%s\n", $k, $v);
    }
}
{
    $x = 0;
    $k = array_keys($test);
    $co = sizeof($k);
    for ($it = 0; $it < $co; $it++) {
        $x = sprintf("%s=>%s\n", $k[$it], $test[$k[$it]]);
    }
}
{
    $x = 0;
    reset($test);
    while ($k = key($test)) {
        $x = sprintf("%s=>%s\n", $k, current($test));
        next($test);
    }
}
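access time (ms), in the same order as the loops above:
foreach: 8.1222
while with list()/each(): 10.3221
for over array_keys(): 9.7921
while with key()/current()/next(): 8.9711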
