Will foreach become inefficient when the number of array elements grows big? - php

Currently I'm using foreach to find the key to use with array_replace:
$grades = array(
    0 => array('id' => 1,  'grade' => 4),
    1 => array('id' => 5,  'grade' => 2),
    2 => array('id' => 17, 'grade' => 1),
);
$replacement = array('id' => 17, 'grade' => 3);
foreach ($grades as $key => $grade) {
    if ($grade['id'] == $replacement['id']) { // each $grade is an array, so ['id'], not ->id
        $found_key = $key;
    }
}
$new_grades = array_replace($grades, array($found_key => $replacement));
I wonder if this will become inefficient when the number of elements in the $grades array grows large. Is there a better way to do the search-and-replace job?

The execution time grows linearly with the number of elements in the array (O(n)). Use a better data structure, i.e. use the array associatively with the ID as the index:
$grades = array(
    1  => array('grade' => 4),
    5  => array('grade' => 2),
    17 => array('grade' => 1),
);
Then the lookup cost is constant (O(1)). You can do:
$grades[$replacement['id']] = array('grade' => $replacement['grade']);
or something similar, depending on your data.
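If your data already arrives in the numerically indexed shape from the question, the conversion is a one-liner; a minimal sketch, assuming PHP 5.5+ for the native array_column():
$grades = array_column($grades, null, 'id'); // re-index the full rows by their 'id' column
$grades[$replacement['id']] = $replacement;  // now an O(1) insert-or-replace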

Yeah, that can be done vastly more efficiently.
$grades = array(
    1  => 4,
    5  => 2,
    17 => 1,
);
$replacement = array(
    17 => 3,
);
$grades = array_replace($grades, $replacement); // not array_merge(), which would renumber the integer keys and lose the IDs
If you need more information associated with the ID than just the grade, then you'll still need a more involved data structure like the one in Felix Kling's answer. But no such requirement is present in your question, so I'm not assuming it.
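For what it's worth, if the rows do start out in the nested id/grade shape from the question, flattening them into this id => grade map is also a one-liner; a sketch, again assuming PHP 5.5+:
$grades = array_column($grades, 'grade', 'id'); // array(1 => 4, 5 => 2, 17 => 1)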


Using str_replace() With Array Values Giving Unexpected Results

I'm using str_replace() to replace values in a couple of paragraphs of text. It seems to do so, but in an odd order. The values to be replaced are in a hard-coded array, while the replacements come from a query returned by a custom function called DBConnect().
I used print_r() on both arrays to verify that they are correct, and they are: both have the same number of entries and are in the same order, yet the on-screen results are mismatched. I expected this to be straightforward and didn't think it needed any looping for this simple task, since str_replace() itself usually handles that, but did I miss something?
$replace = array('[MyLocation]','[CustLocation]','[MilesInc]','[ExtraDoc]');
$replacements = DBConnect($sqlPrices,"select",$siteDB);
$PageText = str_replace($replace,$replacements,$PageText);
and $replacements is:
Array
(
    [0] => 25
    [MyLocation] => 25
    [1] => 45
    [CustLocation] => 45
    [2] => 10
    [MilesInc] => 10
    [3] => 10
    [ExtraDoc] => 10
)
Once I saw what the $replacements array actually looked like, I was able to fix it by filtering out the numeric keys.
$replace = array('[MyLocation]','[CustLocation]','[MilesInc]','[ExtraDoc]');
$replacements = DBConnect($sqlPrices,"select",$siteDB);
$newArray = array(); // initialize so the filtered copy starts clean
foreach ($replacements as $key => $value) :
    if (!is_numeric($key)) $newArray[$key] = $value;
endforeach;
$PageText = str_replace($replace, $newArray, $PageText);
The former $replacements array, filtered to $newArray, looks like this:
Array
(
    [MyLocation] => 25
    [CustLocation] => 45
    [MilesInc] => 10
    [ExtraDoc] => 10
)
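The same numeric-key filtering can be done in a single call with array_filter() in key mode; a sketch, assuming PHP 7.4+ for the arrow function. (The root cause is presumably that DBConnect() fetches rows in "both" mode, returning each column under a numeric and a string key, so fetching associatively inside DBConnect() would avoid the problem at the source.)
$replace = array('[MyLocation]','[CustLocation]','[MilesInc]','[ExtraDoc]');
$replacements = DBConnect($sqlPrices, "select", $siteDB);
// Keep only the string-keyed entries, dropping the duplicate numeric ones.
$newArray = array_filter($replacements, fn($key) => !is_numeric($key), ARRAY_FILTER_USE_KEY);
$PageText = str_replace($replace, $newArray, $PageText);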
@DonP, what you are trying to do is possible.
In my opinion, the strtr() function would serve you better here. You only need to make a few adjustments to your code, like this ...
<?php
$replacements = DBConnect($sqlPrices, "select", $siteDB);
$PageText = strtr($PageText, [
    '[MyLocation]'   => $replacements['MyLocation'],
    '[CustLocation]' => $replacements['CustLocation'],
    '[MilesInc]'     => $replacements['MilesInc'],
    '[ExtraDoc]'     => $replacements['ExtraDoc'],
]);
?>
This code is somewhat verbose and requires writing repetitive strings. Once you understand how it works, you can use a loop or array functions to refactor it, for example into the following more compact version ...
<?php
// Reference fields.
$fields = ['MyLocation', 'CustLocation', 'MilesInc', 'ExtraDoc'];
// Create the replacement pairs.
$replacementPairs = [];
foreach ($fields as $field) {
    $replacementPairs["[{$field}]"] = $replacements[$field];
}
// Perform the replacements.
$PageText = strtr($PageText, $replacementPairs);
?>

How to use a file instead of a database?

I have two arrays. The first contains correct word variants and the second contains incorrect variants. I want to combine them into a single array, with the incorrect version of each word as the key and the correct version as the value, then write that array to a file and use it.
I saved the contents of the array to a file, but every new write clears the file: only the new entries are written and the old ones are deleted. Before writing, I want the new data checked against what is already stored, and anything not yet present appended to the file without wiping its previous contents.
More generally, which should I pick for storing more than a billion words, a file or a database? Which one is faster?
Example of the first array:
$uncorrect = array(
    0  => "мувосокори",
    1  => "мунаггас",
    2  => "мангит",
    3  => "мангития",
    4  => "мунфачир",
    5  => "мунфачира",
    6  => "манфиатпарасти",
    7  => "манфиатчу",
    8  => "манфиатчуи",
    9  => "манфиатхох",
    10 => "манфи",
    .....................
);
Example of the second array:
$correct = array(
    0  => "мувосокорӣ",
    1  => "мунағғас",
    2  => "манғит",
    3  => "манғития",
    4  => "мунфаҷир",
    5  => "мунфаҷира",
    6  => "манфиатпарастӣ",
    7  => "манфиатҷӯ",
    8  => "манфиатҷӯӣ",
    9  => "манфиатхоҳ",
    10 => "манфӣ",
    .....................
);
I combined two arrays with this code:
$dict = array_combine($uncorrect, $correct);
Example of the resulting array:
$dict = array(
    "мувосокори" => "мувосокорӣ",
    "мунаггас" => "мунағғас",
    "мангит" => "манғит",
    "мангития" => "манғития",
    "мунфачир" => "мунфаҷир",
    "мунфачира" => "мунфаҷира",
    "манфиатпарасти" => "манфиатпарастӣ",
    "манфиатчу" => "манфиатҷӯ",
    "манфиатчуи" => "манфиатҷӯӣ",
    "манфиатхох" => "манфиатхоҳ",
    "манфи" => "манфӣ",
    "минкор" => "минқор",
    .....................................
);
I am writing to a file with this code:
file_put_contents("data.json", json_encode($dict));
I read the array back with this code:
$array = json_decode(file_get_contents("data.json"), true); // true => decode as an array, not stdClass
You would be better off using a database for this task.
To address your issue if you decide to stick with file storage: the reason you are losing old entries is that you forgot to load them before adding new values.
// more or less something like below
$array = json_decode(file_get_contents("data.json"), true); // decode as an array so array_merge() works
$dict = array_combine($uncorrect, $correct);
$newArray = array_merge($array, $dict);
file_put_contents("data.json", json_encode($newArray));
This will not be efficient for billions of rows, especially if it's something that gets loaded on every page view.
Any time you want to add new items, load the file first, then merge the new items in before saving it again. file_put_contents() overwrites whatever was there, so you need to read the existing data before calling that function. Something like this:
$array = json_decode(file_get_contents("data.json"), true);
$newArray = array(
    "wrongWord" => "rightWord" // incorrect spelling as the key, correct one as the value, matching $dict
);
$finalArray = array_merge($newArray, $array);
file_put_contents("data.json", json_encode($finalArray));
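Putting those pieces together, here is a minimal sketch of an append-without-duplicates write, assuming the file holds one JSON object mapping incorrect => correct words (JSON_UNESCAPED_UNICODE just keeps the Cyrillic readable in the file):
$file = "data.json";
$existing = file_exists($file)
    ? json_decode(file_get_contents($file), true)
    : array();
$new = array_combine($uncorrect, $correct);
// The + operator keeps the left-hand value on key collisions,
// so entries already in the file are neither duplicated nor overwritten.
$merged = $existing + $new;
file_put_contents($file, json_encode($merged, JSON_UNESCAPED_UNICODE));
For anything approaching a billion words, though, re-decoding and re-encoding one giant JSON blob on every write will not scale; an indexed database (even a single-file one like SQLite) is the better fit, as the answer above says.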

Building a new array using key IDS

Using the numbers from $ids, I want to pull the data from $nuts.
So for example:
$ids = [0, 3, 5]; // 0 calories, 3 sugar, 5 fat
$nuts = [
    'calories' => 'cal',
    'protein' => 'pro',
    'carbohydrate' => 'car',
    'sugar' => 'sug',
    'fiber' => 'fib',
    'fat' => 'fat',
];
$returnData = [
    'calories' => 'cal',
    'sugar' => 'sug',
    'fat' => 'fat',
];
I could loop through each $ids number with a foreach(), but I'm curious to see whether there is a better method than this:
$newNuts = array_values(array_flip($nuts)); // equivalent to array_keys($nuts): position => key name
foreach ($ids as $i) {
    $returnData[$newNuts[$i]] = $nuts[$newNuts[$i]];
}
I did some work and realized you don't need array_flip(); array_values() is enough.
$num_nuts = array_values($nuts);
for ($z = 0; $z < sizeof($ids); $z++) {
    echo $num_nuts[$ids[$z]];
}
It's one more line of code, but I think it does the job, and I think mine will be faster: array_flip() exchanges all keys with their associated values across the whole array, which is work you don't need here. I'm simply converting the original array to a positionally indexed one and looping over it. It's not the most elegant use of the array power PHP gives us, but it works just fine; array_flip() is O(n), so better not to use it on larger data sets. (Note that this echoes only the values, 'cal', 'sug', 'fat', without the keys that $returnData asks for.)
How about a simple array_slice?
$result = array();
foreach ($ids as $i) {
$result += array_slice($nuts, $i, 1, true);
}
No need to create a copy of the array.
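Another loop-light option, sketched under the assumption of PHP 7.4+ for the arrow function: translate the positional ids into key names once with array_keys(), then let the native array_intersect_key() pick the rows.
$keys = array_keys($nuts); // position => key name, e.g. 0 => 'calories'
$wanted = array_flip(array_map(fn($i) => $keys[$i], $ids));
$returnData = array_intersect_key($nuts, $wanted); // preserves $nuts order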

How to randomize a PHP array of records, giving more weight to more recent items?

I have an array of records from a database (although the database is irrelevant to this question -- it eventually becomes an array of "rows", each row being an array with string keys corresponding to the field names). For example:
$items = array(
    1 => array('id' => 1,  'name' => 'John',  'created' => '2011-08-14 8:47:39'),
    2 => array('id' => 2,  'name' => 'Mike',  'created' => '2011-08-30 16:00:12'),
    3 => array('id' => 5,  'name' => 'Jane',  'created' => '2011-09-12 2:30:00'),
    4 => array('id' => 7,  'name' => 'Mary',  'created' => '2011-09-14 1:18:40'),
    5 => array('id' => 16, 'name' => 'Steve', 'created' => '2011-09-14 3:10:30'),
    //etc...
);
What I want to do is shuffle this array, but somehow give more "weight" to items with a more recent "created" timestamp. The randomness does not have to be perfect, and the exact weight does not really matter to me. In other words, if there's some fast and simple technique that kinda-sorta seems random to humans but isn't mathematically random, I'm okay with that. Also, if this is not easy to do with an "infinite continuum" of timestamps, it would be fine with me to assign each record to a day or a week, and just do the weighting based on which day or week they're in.
A relatively fast/efficient technique is preferable since this randomization will occur on every page load of a certain page in my website (but if it's not possible to do efficiently, I'm okay with running it periodically and caching the result).
You can use, e.g., this comparison function:
function cmp($a, $b) {
    $share_of_a = $a['id'];
    $share_of_b = $b['id'];
    return rand(0, $share_of_a + $share_of_b) > $share_of_a ? 1 : -1;
}
and then use it like this:
usort($items, 'cmp');
It compares two elements of the array based on their IDs (this is easier, and the IDs are assigned based on the creation date: newer elements have bigger IDs). The comparison is done randomly, with a different chance of success for each element, giving more chances to the newer ones. The bigger the ID (the newer the element), the more likely it is to appear near the beginning.
For example, the element with id=16 has 16x more chances than the element with id=1 to appear earlier in the resulting list.
What about splitting it up into chunks by date, randomizing each chunk, and then putting them back together as one list?
//$array is your array
$mother = array();
foreach ($array as $k => $v) $mother[rand(0, count($array))][$k] = $v;
ksort($mother);
$child = array();
foreach ($mother as $ak => $av)
    foreach ($av as $k => $v) $child[$k] = $v;
$array = $child;
or you can use shuffle()
After being partially inspired by the response from @Tadeck, I came up with a solution. It's kind of long-winded; if anyone could simplify it, that would be great. But it seems to work just fine:
//Determine lowest and highest timestamps
$first_item = array_slice($items, 0, 1);
$first_item = $first_item[0];
$min_ts = strtotime($first_item['created']);
$max_ts = strtotime($first_item['created']);
foreach ($items as $item) {
    $ts = strtotime($item['created']);
    if ($ts < $min_ts) {
        $min_ts = $ts;
    }
    if ($ts > $max_ts) {
        $max_ts = $ts;
    }
}
//bring down the min/max to more reasonable numbers
$min_rand = 0;
$max_rand = $max_ts - $min_ts;
//Create an array of weighted random numbers for each item's timestamp
$weighted_randoms = array();
foreach ($items as $key => $item) {
    $random_value = mt_rand($min_rand, $max_rand); // use mt_rand() for a higher max value (plain old rand() maxes out at 32,767)
    $ts = strtotime($item['created']);
    $ts = $ts - $min_ts; // bring this down just like we did with $min_rand and $max_rand
    $random_value = $random_value + $ts;
    $weighted_randoms[$key] = $random_value;
}
//Sort by our weighted random value (the array value), with highest first.
arsort($weighted_randoms, SORT_NUMERIC);
$randomized_items = array();
foreach ($weighted_randoms as $item_key => $val) {
    $randomized_items[$item_key] = $items[$item_key];
}
print_r($randomized_items);
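For anyone who wants the shorter version: the scheme above boils down to "score = timestamp offset + uniform random jitter, then sort by score descending". A more compact sketch of the same idea (PHP 7.4+ for the arrow function):
$timestamps = array_map(fn($row) => strtotime($row['created']), $items); // keys are preserved
$min_ts = min($timestamps);
$jitter = max($timestamps) - $min_ts;
$scores = array();
foreach ($timestamps as $key => $ts) {
    $scores[$key] = ($ts - $min_ts) + mt_rand(0, $jitter); // newer rows get a head start
}
arsort($scores, SORT_NUMERIC); // highest score first
$randomized_items = array();
foreach (array_keys($scores) as $key) {
    $randomized_items[$key] = $items[$key];
}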

Find intersecting rows between two 2d arrays comparing differently keyed columns

I have two arrays,
The $first has 5000 arrays inside it and looks like:
array(
    array('number' => 1),
    array('number' => 2),
    array('number' => 3),
    array('number' => 4),
    ...
    array('number' => 5000)
);
and the $second has 16000 rows and looks like:
array(
    array('key' => 1, 'val' => 'something'),
    array('key' => 2, 'val' => 'something'),
    array('key' => 3, 'val' => 'something'),
    ...
    array('key' => 16000, 'val' => 'something'),
)
I want to create a third array that contains $second[$i]['val'] if $second[$i]['key'] appears in the 'number' column of $first.
Currently I am doing:
$third = array();
foreach ($first as &$f)
    $f = $f['number'];
foreach ($second as $s) {
    if (in_array($s['key'], $first))
        $third[] = $s['val'];
}
but, unless I use PHP's set_time_limit(0), it times out. Is there a more efficient way?
$third = array();
$ftemp = array();
foreach ($first as $f)
    $ftemp[$f['number']] = true;
foreach ($second as $s) {
    if (isset($ftemp[$s['key']]))
        $third[] = $s['val'];
}
should be waaay faster.
Don't try to build the lookup dictionary in more convoluted ways like the ones below, because they are actually slower than the straightforward loop above:
$third = array();
$ftemp = array_flip(reset(call_user_func_array('array_map', array_merge(array(null), $first))));
// $ftemp = array_flip(array_map('reset', $first)); // this is also slower
// array_unshift($first, null); $ftemp = array_flip(reset(call_user_func_array('array_map', $first))); // and this is even slower and modifies $first
foreach ($second as $s) {
    if (isset($ftemp[$s['key']]))
        $third[] = $s['val'];
}
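For completeness: since PHP 5.5 the idiomatic way to build that lookup is array_column() plus array_flip(). It is at least readable; whether it beats the plain foreach at building the dictionary is worth benchmarking rather than assuming. A sketch:
$ftemp = array_flip(array_column($first, 'number')); // number => original index
$third = array();
foreach ($second as $s) {
    if (isset($ftemp[$s['key']]))
        $third[] = $s['val'];
}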
It's probably serious nitpicking, but you could replace the foreach with a for, which is a little faster; I doubt that will make a big difference, though. You are working on a big data set, which may simply not be fast to process on a web server.
You are asking for "intersections" between the two arrays, but on specific columns whose keys are not identical. Not to worry: PHP has a native function that is optimized under the hood for this task, array_uintersect(), with no special data preparation needed. Within the custom callback, null-coalesce to the opposite array's key name. This fallback is needed because $a and $b do not reliably represent array1 and array2: the intersect/diff family of native array functions sort while they filter, so there are instances where two values from the same array are compared against each other. Again, this is part of the source-code optimization.
Code: (Demo)
var_export(
    array_uintersect(
        $keyValues, // the $second array (key/val rows)
        $numbers,   // the $first array (number rows)
        fn($a, $b) => ($a['number'] ?? $a['key']) <=> ($b['number'] ?? $b['key'])
    )
);
As a general rule, though, if you are going to make a lot of array comparisons and speed matters, it is better to make key-based comparisons instead of value-based comparisons. Because of the way that PHP handles arrays as hashmaps, key-comparison functions/processes always outpace their value-comparing equivalent.
If you need to isolate the val column data after filtering, array_column() will fix this up for you quickly.
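A one-line sketch, assuming the array_uintersect() result above was captured in a hypothetical $matches variable:
$vals = array_column($matches, 'val'); // just the 'val' strings, reindexed from 0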
