I have an multidim. array with >2000 Values.
Which simple way is the most efficient to search a value within.
What i am especially curious about is if array_search() uses alphabetic narrowing down to be more efficient (there is no parameter though to indicate if alphanumeric).
$_array = ["abidabi", "beda", "cedi", "zamibula"];
$_target = "zamibula";
foreach($_array as $val)
{
if ($val == $_target)
{
echo $val;
}
vs.
echo $_array[array_search('zamibula', $_array)];
for sure my actual code is very more complicated (json, multidimensional array with massive data to populate an INPUT->SELECT->OPTIONS), so it already lags browserside.
What are your thoughts on the alphabetical search? Can i structure the data somehow to make it less consuming :)?
thanks
If all the values in the array are unique, than the fastest way is to switch values to keys and then call isset like so:
$_array = ["abidabi", "beda", "cedi", "zamibula"];
$_target = "zamibula";
$indexed = array_flip($_array);
var_dump(isset($indexed[$_target]));
This is almost constantly fast no matter the size of an array.
Related
I need to determine the keys of values, that have duplicates from an array.
What I came up with is:
$duplicates_keys = array();
$unique = array_unique($in);
$duplicates = array_diff_assoc($in, $unique);
foreach ($in as $key => $val){
if (in_array($val,$duplicates)){
$duplicates_keys[]=$key;
}
}
Which works, but that's pretty resource intensive, is there a faster way to do this?
As per my comment, i doubt this is a bottleneck. However you can reduce your iterations to once through the array, as follows:
$temp=[];
$dup=[];
foreach ($in as $key=>$val) {
if(isset($temp[$val])){
$dup[]=$key;
}else{
$temp[$val]=0;
}
}
Note that the value is set as an array key in temp, so you can use O(1) isset rather than in_array, which must search the full array until the value is found.
This is theoretically faster than your example, but you would need to profile it to be sure (as you should have already done to ascertain your current code is slow).
Probably you can do something else that has a far greater impact, like caching or a better database query
Use array_intersect() for this.
$duplicates_keys = array_intersect($in, $duplicates);
array_intersect()
I need to check some input string against a huge (and growing) list of strings coming from a CSV file (1000000+). I currently load every string in an array and check against it via in_array(). Code looks like this:
$filter = array();
$filter = ReadFromCSV();
$input = array("foo","bar" /* more elements... */);
foreach($input as $i){
if(in_array($i,$filter)){
// do stuff
}
}
It already takes some time and I was wondering is there is a faster way to do this?
in_array() checks every element in the array until it finds a match. The average complexity is O(n).
Since you are comparing strings, you might store your input as array keys instead of values and look them up via array_key_exists(); which requires a constant time O(1).
Some code:
$filter = array();
$filter = ReadFromCSV();
$filter = array_flip($filter); // switch key <=> value
$input = array("foo","bar" /* more elements... */);
foreach($input as $i){
if(array_key_exists($i,$filter)){ // array_key_exists();
// do stuff
}
}
That's what indexes were invented for.
It's not a matter of in_array() speed, as the data grows, you should probably consider using indexes by loading data into a real DBMS.
If you have any array $p that you populated in a loop like so:
$p[] = array( "id"=>$id, "Name"=>$name);
What's the fastest way to search for John in the Name key, and if found, return the $p index? Is there a way other than looping through $p?
I have up to 5000 names to find in $p, and $p can also potentially contain 5000 rows. Currently I loop through $p looking for each name, and if found, parse it (and add it to another array), splice the row out of $p, and break 1, ready to start searching for the next of the 5000 names.
I was wondering if there if a faster way to get the index rather than looping through $p eg an isset type way?
Thanks for taking a look guys.
Okay so as I see this problem, you have unique ids, but the names may not be unique.
You could initialize the array as:
array($id=>$name);
And your searches can be like:
array_search($name,$arr);
This will work very well as native method of finding a needle in a haystack will have a better implementation than your own implementation.
e.g.
$id = 2;
$name= 'Sunny';
$arr = array($id=>$name);
echo array_search($name,$arr);
Echoes 2
The major advantage in this method would be code readability.
If you know that you are going to need to perform many of these types of search within the same request then you can create an index array from them. This will loop through the array once per index you need to create.
$piName = array();
foreach ($p as $k=>$v)
{
$piName[$v['Name']] = $k;
}
If you only need to perform one or two searches per page then consider moving the array into an external database, and creating the index there.
$index = 0;
$search_for = 'John';
$result = array_reduce($p, function($r, $v) use (&$index, $search_for) {
if($v['Name'] == $search_for) {
$r[] = $index;
}
++$index;
return $r;
});
$result will contain all the indices of elements in $p where the element with key Name had the value John. (This of course only works for an array that is indexed numerically beginning with 0 and has no “holes” in the index.)
Edit: Possibly even easier to just use array_filter, but that will not return the indices only, but all array element where Name equals John – but indices will be preserved:
$result2 = array_filter($p, function($elem) {
return $elem["Name"] == "John" ? true : false;
});
var_dump($result2);
What suits your needs better, resp. which one is maybe faster, is for you to figure out.
I have a working script, but I'm sure that my method of managing arrays could be better. I've searched for a solution and haven't found one, but I'm sure that I should be using the functionality of associative arrays to do things more efficiently.
I have two arrays, one from a CSV file and one from a DB. I've created the CSV array as numeric and the DB array as associative (although I'm aware that the difference is blurry in PHP).
I'm trying to find a record in the DB array where the value in one field matches a value in the CSV array. Both arrays are multi-dimensional.
Within each record in each array there is a reference number. It appears once in the CSV array and may appear in the DB array. If it does, I need to take action.
I'm currently doing this (simplified):
$CSVarray:
('reference01', 'blue', 'small' ),
('reference02', 'red', 'large' ),
('reference03', 'pink', 'medium' )
$Dbarray:
(0 => array(ref=>'reference01',name=>"tom",type=>"mouse"),
(1 => array(ref=>'reference02',name=>"jerry",type=>"cat"),
(2 => array(ref=>'reference03',name=>"butch",type=>"dog"),
foreach ($CSVarray as $CSVrecord) {
foreach ($Dbarray as $DBrecord) {
if ($CSVarray[$numerickey] == $DBrecord['key'] {
do something with the various values in the $DBrecord
}
}
}
This is horrible, as the arrays are each thousands of lines.
I don't just want to know if matching values exist, I want to retrieve data from the matching record, so functions like 'array_search ' don't do what I want and array_walk doesn't seem any better than my current approach.
What I really need is something like this (gibberish code):
foreach ($CSVarray as $CSVrecord) {
WHERE $Dbarray['key']['key'] == $CSVrecord[$numerickey] {
do something with the other values in $Dbarray['key']
}
}
I'm looking for a way to match the values using the keys (either numeric or associative) rather than walking the arrays. Can anyone offer any help please?
use a hash map - take one array and map each key of the record it belongs to, to that record. Then take the second array and simply iterate over it, checking for each record key if the hashmap has anything set for it.
Regarding your example:
foreach ($DBarray as $DBrecord){
$Hash[$record[$key]] = $DBrecord;
}
foreach ($CSVarray as $record){
if (isset($Hash[$record[$CSVkey]])){
$DBrecord = $Hash[$record[$CSVkey]];
//do stuff with $DBrecord and $CSVrecord
}
}
this solution works at O(n) while yours at O(n^2)...
You can use foreach loops like this too:
foreach ($record as $key => $value) {
switch($key)
{
case 'asd':
// do something
break;
default:
// Default
break;
}
}
A switch may be what you are looking for also :)
Load CSV into the db, and use db (not db array) if possible for retrieval. Index the referenceid field.
I have 2 sets of arrays:
$dates1 = array('9/12','9/13','9/14','9/15','9/16','9/17');
$data1 = array('5','3','7','7','22','18');
// for this dataset, the value on 9/12 is 5
$dates2 = array('9/14','9/15');
$data2 = array('12','1');
As you can see the 2nd dataset has fewer dates, so I need to "autofill" the reset of the array to match the largest dataset.
$dates2 = array('9/12','9/13','9/14','9/15','9/16','9/17');
$data2 = array('','','12','1','','');
There will be more than 2 datasets, so I would have to find the largest dataset, and run a function for each smaller dataset to properly format it.
The function I'd create is the problem for me. Not even sure where to start at this point. Also, I can format the date and data arrays differently (multidimensional arrays?) if for some reason that is better.
You can do this in a pretty straightforward manner using some array functions. Try something like this:
//make an empty array matching your maximum-sized data set
$empty = array_fill_keys($dates1,'');
//for each array you wish to pad, do this:
//make key/value array
$new = array_combine($dates2,$data2);
//merge, overwriting empty keys with data values
$new = array_merge($empty,$new);
//if you want just the data values again
$data2 = array_values($new);
print_r($data2);
It would be pretty easy to turn that into a function or put it into a for loop to operate on your array sets. Turning them into associative arrays of key/value pairs would make them easier to work with too I would think.
If datas are related will be painful to scatter them on several array.
The best solution would be model an object with obvious property names
and use it with related accessor.
From your question I haven't a lot of hint of what data are and then I have to guess a bit:
I pretend you need to keep a daily log on access on a website with downloads. Instead of using dates/data1/data2 array I would model a data structure similar to this:
$log = array(
array('date'=>'2011-09-12','accessCount'=>7,'downloadCount'=>3),
array('date'=>'2011-09-13','accessCount'=>9), /* better downloadsCount=>0 though */
array('date'=>'2011-09-15','accessCount'=>7,'downloadCount'=>3)
...
)
Using this data structure I would model a dayCollection class with methods add,remove,get,set, search with all methods returning a day instance (yes, the remove too) and according signature. The day Class would have the standard getter/setter for every property (you can resolve to magic methods).
Depending on the amount of data you have to manipulate you can opt to maintain into the collection just the object data (serialize on store/unserialize on retrieve) or the whole object.
It is difficult to show you some code as the question is lacking of details on your data model.
If you still want to pad your array than this code would be a good start:
$temp = array($dates, $data1, $data2);
$max = max(array_map('count',$temp));
$result = array_map( function($x) use($max) {
return array_pad($x,$max,0);
}, $temp);
in $result you have your padded arrays. if you want to substitute your arrays do a simple
list($dates, $data1, $data2) = array_map(....
You should use hashmaps instead of arrays to associate each date to a data.
Then, find the largest one, cycle through its keys with a foreach, and test the existence of the same key in the small one.
If it doesn't exist, create it with an empty value.
EDIT with code (for completeness, other answers seem definitely better):
$dates_data1 = array('9/12'=>'5', '9/13'=>'3', '9/14'=>'7' /* continued */);
$dates_data2 = array('9/14'=>'12', '9/15'=>'1');
#cycle through each key (date) of the longest array
foreach($dates_data1 as $key => $value){
#check if the key exists in the smallest and add '' value if it does not
if(!isset( $date_data2[$key] )){ $date_data2[$key]=''; }
}