I'm using data pulled from an SQL query to build two charts later in my code. An example query would be something like:
SELECT purchase_location, purchase_item, SUM(purchase_amount) as totalPurchase
FROM purchases
GROUP BY purchase_item_id, purchase_location
Not an exact example, but the idea is there. I then iterate through my results to build the two data sets.
$locationData = [];
$itemData = [];
foreach($queryResults as $result) {
$locationData[$result['purchase_location']] += $result['totalPurchase'];
$itemData[$result['purchase_item']] += $result['totalPurchase'];
}
Since I want the data from two different points of view, I have to use += to get the correct totals. My question is this: doing the += operator on an unset index of an array is incredibly slow. I've found that if I do the following:
if (isset($locationData['purchase_location'])) {
$locationData['purchase_location'] += $result['totalPurchase'];
} else {
$locationData['purchase_location'] = $result['totalPurchase'];
}
That is, I use = the first time an index is seen and += after that. This speeds up the code significantly (as an example, my code went from an 8-10 second run time down to less than half a second). My question is: is that the correct/cleanest way to handle this?
And before anyone mentions, I know I could write the query to handle all of this in this simple case, this was just a really easy example to show the issue, that of using += on an, as of yet, undefined array index.
I would suggest initializing the array with that index already, avoiding the need for isset checks and making the code cleaner:
$locationData = ['purchase_location' => 0];
$itemData = ['purchase_item' => 0];
foreach($queryResults as $result) {
$locationData['purchase_location'] += $result['totalPurchase'];
$itemData['purchase_item'] += $result['totalPurchase'];
}
For my projects I usually try to limit the usage of isset to the validation of received data from outside sources (ex: GET, POST) that I can't fully control.
Since you've updated your question, it now makes sense to use isset in this case, to avoid needing another loop just to construct the array.
$locationData = [];
$itemData = [];
foreach($queryResults as $result) {
if (!isset($locationData[$result['purchase_location']])) {
$locationData[$result['purchase_location']] = 0;
}
if (!isset($itemData[$result['purchase_item']])) {
$itemData[$result['purchase_item']] = 0;
}
$locationData[$result['purchase_location']] += $result['totalPurchase'];
$itemData[$result['purchase_item']] += $result['totalPurchase'];
}
If you're using PHP 7+, you can use the null coalescing operator ?? to simplify your code like this:
$locationData = [];
$itemData = [];
foreach($queryResults as $result) {
$locationData[$result['purchase_location']] = $result['totalPurchase'] + ($locationData[$result['purchase_location']] ?? 0);
$itemData[$result['purchase_item']] = $result['totalPurchase'] + ($itemData[$result['purchase_item']] ?? 0);
}
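To make the ?? version easier to try out, here is a minimal self-contained sketch. The three sample rows are made up and only stand in for $queryResults; the column names are the ones from the question.

```php
<?php
// Hypothetical sample rows standing in for the real query results.
$queryResults = [
    ['purchase_location' => 'north', 'purchase_item' => 'apple', 'totalPurchase' => 10],
    ['purchase_location' => 'north', 'purchase_item' => 'pear',  'totalPurchase' => 5],
    ['purchase_location' => 'south', 'purchase_item' => 'apple', 'totalPurchase' => 7],
];

$locationData = [];
$itemData = [];
foreach ($queryResults as $result) {
    $location = $result['purchase_location'];
    $item = $result['purchase_item'];
    // ?? supplies 0 the first time a key is seen, so no notice is raised.
    $locationData[$location] = ($locationData[$location] ?? 0) + $result['totalPurchase'];
    $itemData[$item] = ($itemData[$item] ?? 0) + $result['totalPurchase'];
}

print_r($locationData); // north => 15, south => 7
print_r($itemData);     // apple => 17, pear => 5
```

Pulling the keys into $location and $item first is only for readability; the accumulation itself is the same as above.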
You could try the following:
$locationData = [];
$itemData = [];
foreach($queryResults as $result) {
$locationData[$result['purchase_location']] = ($locationData[$result['purchase_location']]??0) + $result['totalPurchase'];
$itemData[$result['purchase_item']] = ($itemData[$result['purchase_item']]??0) + $result['totalPurchase'];
}
But I think it's cleaner and more obvious just to initialise everything to zero separately first:
foreach($queryResults as $result) {
$locationData[$result['purchase_location']] = 0;
$itemData[$result['purchase_item']] = 0;
}
and then do the work of addition in another loop.
To answer your question: yes, the index needs to exist before you use += on it.
Instead of checking for it on every iteration, set it up before the foreach loop. The index will then exist and you can simply add to it with +=:
$locationData = [
'purchase_location' => 0
];
$itemData = [
'purchase_item' => 0
];
foreach($queryResults as $result) {
$locationData['purchase_location'] += $result['totalPurchase'];
$itemData['purchase_item'] += $result['totalPurchase'];
}
My question is, is that the correct/cleanest way to handle this?
Without example data and expected results, it's hard to know what results you are after.
An educated guess would be to use MySQL's WITH ROLLUP modifier for GROUP BY to get an added total record.
SELECT purchase_location, purchase_item, SUM(purchase_amount) as totalPurchase
FROM purchases
GROUP BY purchase_item_id, purchase_location WITH ROLLUP
I have a multidimensional array defined as follows
$SquadList = array('name', 'position', 'dob', 'nation', 'games', 'goals', 'assists');
I'm looping through several foreach loops and storing data from JSON
foreach ($season as $key => $squad){
$SquadList[0] = $squad['full_name'];
$SquadList[1] = $squad['position'];
$SquadList[2] = gmdate("d-m-y", $birthday);
$SquadList[3] = $squad['nationality'];
$SquadList[4] = $squad['appearances_overall'];
$SquadList[5] = $squad['goals_overall'];
$SquadList[6] = $squad['assists_overall'];
}
foreach ($season1 as $key => $squad){
$SquadList[0] = $squad['full_name'];
$SquadList[1] = $squad['position'];
$SquadList[2] = gmdate("d-m-y", $birthday);
$SquadList[3] = $squad['nationality'];
$SquadList[4] = $squad['appearances_overall'];
$SquadList[5] = $squad['goals_overall'];
$SquadList[6] = $squad['assists_overall'];
}
The code is messy. The output is only 2 elements when it should be 30+
I've tried array_push as follows
array_push($SquadList['name'], $squad['full_name']);
I'm not sure I understand the question correctly, but I imagine you want it to be structured something like this:
$SquadList = []; // define it as an array
$ctr = 0; // define a counter that would be used in the two iterations
foreach ($season as $key => $squad){
$SquadList[$ctr][0] = $squad['full_name'];
$SquadList[$ctr][1] = $squad['position'];
$SquadList[$ctr][2] = gmdate("d-m-y", $birthday);
$SquadList[$ctr][3] = $squad['nationality'];
$SquadList[$ctr][4] = $squad['appearances_overall'];
$SquadList[$ctr][5] = $squad['goals_overall'];
$SquadList[$ctr][6] = $squad['assists_overall'];
$ctr++; // increase counter
}
foreach ($season1 as $key => $squad){
$SquadList[$ctr][0] = $squad['full_name'];
$SquadList[$ctr][1] = $squad['position'];
$SquadList[$ctr][2] = gmdate("d-m-y", $birthday);
$SquadList[$ctr][3] = $squad['nationality'];
$SquadList[$ctr][4] = $squad['appearances_overall'];
$SquadList[$ctr][5] = $squad['goals_overall'];
$SquadList[$ctr][6] = $squad['assists_overall'];
$ctr++; // increase counter
}
The reason you ended up with so few results is that each pass through a loop wrote to the same indexes of $SquadList, so every squad member overwrote the previous one and only the last one survived.
To solve this problem, $SquadList must be an array of arrays. Build each member's row in one assignment; if you append the fields one at a time, every field becomes its own element of the outer array.
Populating an array of arrays
foreach ($season as $key => $squad) {
$squadList[] = [
$squad['full_name'],
$squad['position'],
gmdate("d-m-y", $squad['birthday']),
$squad['nationality'],
$squad['appearances_overall'],
$squad['goals_overall'],
$squad['assists_overall']
];
}
Note a couple of changes I made:
I removed the capitalization on $squadList because, by convention, a leading capital indicates a class, not a plain old variable
$birthday was undefined, so I made an educated guess
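Since the question originally listed names like 'name', 'position' and 'dob', here is a small sketch of the same loop with those names used as keys for each row. The two sample players are made up, and treating 'birthday' as a Unix timestamp is an assumption, just like the guess above.

```php
<?php
// Hypothetical sample data; 'birthday' is assumed to be a Unix timestamp.
$season = [
    ['full_name' => 'Player One', 'position' => 'Forward', 'birthday' => 0,
     'nationality' => 'Exampleland', 'appearances_overall' => 10,
     'goals_overall' => 4, 'assists_overall' => 2],
    ['full_name' => 'Player Two', 'position' => 'Goalkeeper', 'birthday' => 86400,
     'nationality' => 'Exampleland', 'appearances_overall' => 12,
     'goals_overall' => 0, 'assists_overall' => 1],
];

$squadList = [];
foreach ($season as $squad) {
    // One append per player: the whole row is added as a single element.
    $squadList[] = [
        'name'    => $squad['full_name'],
        'position' => $squad['position'],
        'dob'     => gmdate('d-m-y', $squad['birthday']),
        'nation'  => $squad['nationality'],
        'games'   => $squad['appearances_overall'],
        'goals'   => $squad['goals_overall'],
        'assists' => $squad['assists_overall'],
    ];
}

echo count($squadList), "\n";     // 2
echo $squadList[0]['dob'], "\n";  // 01-01-70
```

Named keys are a matter of taste here; the numeric version above works the same way.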
Cleaning up the code
You mentioned that “the code is messy”. That is a very healthy observation to make.
What you are noticing is the result of two things:
Your code is repeating itself (a violation of DRY - Don’t Repeat Yourself)
Your code does not follow the PSR-12 coding style convention
So let’s get rid of the code duplication
Refactoring
When you start repeating yourself, that’s a signal to pull things into a function
function buildSquad(array $season)
{
$squadList = []; // start empty so an empty season still returns an array
foreach ($season as $key => $squad) {
$squadList[] = [
$squad['full_name'],
$squad['position'],
gmdate("d-m-y", $squad['birthday']),
$squad['nationality'],
$squad['appearances_overall'],
$squad['goals_overall'],
$squad['assists_overall']
];
}
return $squadList;
}
// if you just want to lump them all together into one flat list
$squadList = array_merge(buildSquad($season), buildSquad($season1));
// etc
I am looking for a way to create new arrays in a loop. Not the values, but the array variables. So far, it looks like it's impossible or complicated, or maybe I just haven't found the right way to do it.
For example, I have a dynamic amount of values I need to append to arrays. Let's say it will be 200 000 values. I cannot assign all of these values to one array because of memory limits on the server; please just take that part as given.
I can assign a maximum amount of 50 000 values per one array. This means, I will need to create 4 arrays to fit all the values in different arrays. But next time, I will not know how many values I need to process.
Is there a way to generate a required amount of arrays based on fixed capacity of each array and an amount of values? Or an array must be declared manually and there is no workaround?
What I am trying to achieve is this:
$required_number_of_arrays = ceil(count($data)/50000);
for ($i = 1;$i <= $required_number_of_arrays;$i++) {
$new_array$i = array();
foreach ($data as $val) {
$new_array$i[] = $val;
}
}
// Created arrays: $new_array1, $new_array2, $new_array3
One possible way to do this is to extend ArrayObject. You can build in a limit on how many values may be assigned, which means you need to write a class instead of $new_array$i = array();
However it might be better to look into generators, but Scuzzy beat me to that punchline.
The concept of generators is that with each yield, the previous value is no longer accessible unless you loop over the generator again. It is, in a way, overwritten, unlike in arrays, where you can always go back to previous indexes using $data[4].
This means you need to process the data directly. Storing the yielded data into a new array will negate its effects.
Fetching huge amounts of data is no issue with generators but one should know the concept of them before using them.
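As a rough sketch of that idea: readValues() below is a made-up function that stands in for whatever really produces your values (a file, a query, and so on). Each value is handled as it is yielded and nothing is collected into an array.

```php
<?php
// Hypothetical data source: yields one value at a time instead of
// building a 200 000 element array.
function readValues(int $count): Generator
{
    for ($i = 1; $i <= $count; $i++) {
        yield $i;
    }
}

$seen = 0;
$total = 0;
foreach (readValues(200000) as $value) {
    // Process the value directly; only the running results are kept.
    $seen++;
    $total += $value;
}

echo "$seen values processed, total $total\n";
```

If you copied every yielded value into an array inside the loop, you would be back to holding everything in memory at once.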
Based on your comments, it sounds like you don't need separate array variables. You can reuse the same one. When it gets to the max size, do your processing and reinitialize it:
$max_array_size = 50000;
$n = 1;
$new_array = [];
foreach ($data as $val) {
$new_array[] = $val;
if ($max_array_size == $n++) {
// process $new_array however you need to, then empty it
$new_array = [];
$n = 1;
}
}
if ($new_array) {
// process the remainder if the last bit is less than max size
}
You could create an array and use extract() to get variables from this array:
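$required_number_of_arrays = ceil(count($data)/50000);
$new_arrays = array();
for ($i = 1;$i <= $required_number_of_arrays;$i++) {
// each generated variable gets its own slice of up to 50000 values
$new_arrays["new_array$i"] = array_slice($data, ($i - 1) * 50000, 50000);
}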
extract($new_arrays);
print_r($new_array1);
print_r($new_array2);
//...
I think in your case you have to create an array that holds all your generated arrays inside it.
So first declare a variable before the loop.
$global_array = [];
Inside the loop you can generate the name and fill that array.
$global_array["new_array$i"][] = $val;
After the loop you can work with that array. But I think in the end that won't fix your memory limit problem. If you fill several arrays with 200k entries in total, it should be the same as filling one array with 200k entries, because the amount of data is the same. So it's possible that you run over the memory limit either way. If you can't raise the limit, that could be a problem. Where you can, this removes it:
ini_set('memory_limit', '-1');
So you can only avoid that problem by processing your values directly, without saving them in an array. For example, run a DB query, process each value as it arrives, and save only the result.
You can try something like this:
foreach ($data as $key => $val) {
${"new_array$i"}[] = $val;
unset($data[$key]);
}
Then your value is stored in a new array and you delete the value of the original data array. After 50k you have to create a new one.
An easier way is to use array_chunk to split your array into parts.
https://secure.php.net/manual/en/function.array-chunk.php
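A tiny sketch of what array_chunk gives you, using seven made-up values and a chunk size of 3 in place of 50000:

```php
<?php
$data = range(1, 7);  // hypothetical small data set
$chunkSize = 3;       // stands in for 50000

// Each chunk holds at most $chunkSize values; the last one may be shorter.
$chunks = array_chunk($data, $chunkSize);

foreach ($chunks as $index => $chunk) {
    echo "chunk $index: " . implode(', ', $chunk) . "\n";
}
// chunk 0: 1, 2, 3
// chunk 1: 4, 5, 6
// chunk 2: 7
```

Keep in mind that this builds all the chunks while $data still exists, so as far as I can tell it organises the data rather than lowering peak memory use.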
There's no need for multiple variables. If you want to process your data in chunks, so that you don't fill up memory, reuse the same variable. The previous contents of the variable will be garbage collected when you reassign it.
$chunk_size = 50000;
$number_of_chunks = ceil($data_size/$chunk_size);
for ($i = 0; $i < $data_size; $i += $chunk_size) {
$new_array = array();
for ($j = $i; $j < min($i + $chunk_size, $data_size); $j++) {
$new_array[] = get_data_item($j);
}
}
$new_array[$i] serves the same purpose as your proposed $new_array$i.
You could do something like this:
$required_number_of_arrays = ceil(count($data)/50000);
for ($i = 1;$i <= $required_number_of_arrays;$i++) {
$array_name = "new_array_$i";
$$array_name = [];
foreach ($data as $val) {
${$array_name}[] = $val;
}
}
I have a PHP array that I'd like to duplicate but only copy elements from the array whose keys appear in another array.
Here are my arrays:
$data[123] = 'aaa';
$data[423] = 'bbb';
$data[543] = 'ccc';
$data[231] = 'ddd';
$data[642] = 'eee';
$data[643] = 'fff';
$data[712] = 'ggg';
$data[777] = 'hhh';
$keys_to_copy[] = '123';
$keys_to_copy[] = '231';
$keys_to_copy[] = '643';
$keys_to_copy[] = '712';
$keys_to_copy[] = '777';
$copied_data[123] = 'aaa';
$copied_data[231] = 'ddd';
$copied_data[643] = 'fff';
$copied_data[712] = 'ggg';
$copied_data[777] = 'hhh';
I could just loop through the data array like this:
foreach ($data as $key => $value) {
if ( in_array($key, $keys_to_copy)) {
$copied_data[$key] = $value;
}
}
But this will be happening inside a loop which is retrieving data from a MySQL result set. So it would be a loop nested within a MySQL data loop.
I normally try and avoid nested loops unless there's no way of using PHP's built-in array functions to get the result I'm looking for.
But I'm also wary of having a nested loop within a MySQL data loop, as I don't want to keep MySQL hanging around.
I'm probably worrying about nested loop performance unnecessarily as I'll never be doing this for more than a couple of hundred rows of data and maybe 10 keys.
But I'd like to know if there's a way of doing this with built-in PHP functions.
I had a look at array_intersect_key() but that doesn't quite do it, because my $keys_to_copy array has my desired keys as array values rather than keys.
Anyone got any ideas?
Cheers, B
I worked it out - I almost had it above. I thought I'd post the answer anyway for completeness. Hope this helps someone out!
array_intersect_key($data, array_flip($keys_to_copy))
Use array_flip() to turn the values of $keys_to_copy into keys so the result can be used with array_intersect_key()
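Here is the one-liner run against the arrays from the question, as a quick self-contained check:

```php
<?php
$data = [
    123 => 'aaa', 423 => 'bbb', 543 => 'ccc', 231 => 'ddd',
    642 => 'eee', 643 => 'fff', 712 => 'ggg', 777 => 'hhh',
];
$keys_to_copy = ['123', '231', '643', '712', '777'];

// array_flip() turns ['123', '231', ...] into [123 => 0, 231 => 1, ...],
// so the wanted keys are now real keys that array_intersect_key() can match.
$copied_data = array_intersect_key($data, array_flip($keys_to_copy));

print_r($copied_data);
// 123 => aaa, 231 => ddd, 643 => fff, 712 => ggg, 777 => hhh
```

The result keeps the keys and the order of $data, which is what the desired $copied_data in the question looks like.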
I'll run some tests to compare performance between the manual loop above, to this answer. I would expect the built-in functions to be faster but they might be pretty equal. I know arrays are heavily optimised so I'm sure it will be close.
EDIT:
I have run some benchmarks using PHP CLI to compare the foreach() code in my question with the code in my answer above. The results are quite astounding.
Here's the code I used to benchmark, which I think is valid:
<?php
ini_set('max_execution_time', 0);//NOT NEEDED FOR CLI
// BUILD RANDOM DATA ARRAY
$data = array();
while ( count($data) <= 200000) {
$data[rand(0, 500000)] = rand(0, 500000);
}
$keys_to_copy = array_rand($data, 100000);
// FOREACH
$timer_start = microtime(TRUE);
foreach ($data as $key => $value) {
if ( in_array($key, $keys_to_copy)) {
$copied_data[$key] = $value;
}
}
echo 'foreach: '.(microtime(TRUE) - $timer_start)."s\r\n";
// BUILT-IN ARRAY FUNCTIONS
$timer_start = microtime(TRUE);
$copied_data = array_intersect_key($data, array_flip($keys_to_copy));
echo 'built-in: '.(microtime(TRUE) - $timer_start)."s\r\n";
?>
And the results...
foreach: 662.217s
array_intersect_key: 0.099s
So it's much faster over loads of array elements to use the PHP array functions rather than foreach. I thought it would be faster but not by that much!
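My understanding of the gap is that in_array() scans $keys_to_copy from the start on every iteration, while a key lookup is a single hash lookup. If you still prefer a loop, a sketch like the following (with a few made-up sample values) keeps the foreach but swaps in_array() for isset() on the flipped keys. I haven't benchmarked this variant, so treat it as something to try rather than a measured result.

```php
<?php
// Small hypothetical sample; the real arrays would be much larger.
$data = [123 => 'aaa', 423 => 'bbb', 231 => 'ddd', 777 => 'hhh'];
$keys_to_copy = ['123', '231', '777'];

// Flip once, outside the loop, so each check is a key lookup
// instead of a scan through $keys_to_copy.
$wanted = array_flip($keys_to_copy);

$copied_data = [];
foreach ($data as $key => $value) {
    if (isset($wanted[$key])) {
        $copied_data[$key] = $value;
    }
}

print_r($copied_data); // 123 => aaa, 231 => ddd, 777 => hhh
```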
Why not load the entire result set into an array, then begin processing with nested loops?
$query_result = mysql_query($my_query) or die(mysql_error());
$query_rows = mysql_num_rows($query_result);
for ($i = 0; $i < $query_rows; $i++)
{
$row = mysql_fetch_assoc($query_result);
// 'key' is the name of the column containing the data key (123)
// 'value' is the name of the column containing the value (aaa)
$data[$row['key']] = $row['value'];
}
foreach ($data as $key => $value)
{
if ( in_array($key, $keys_to_copy))
{
$copied_data[$key] = $value;
}
}
I have a large array.
In this array I have got (among many other things) a list of products:
$data['product_name_0'] = '';
$data['product_desc_0'] = '';
$data['product_name_1'] = '';
$data['product_desc_1'] = '';
This array is provided by a third party (so I have no control over this).
It is not known how many products there will be in the array.
What would be a clean way to loop through all the products?
I don't want to use a foreach loop since it will also go through all the other items in the (large) array.
I cannot use a for loop because I don't know (yet) how many products the array contains.
I can do a while loop:
$i = 0;
while(true) { // doing this feels wrong, although it WILL end at some time (if there are no other products)
if (!array_key_exists('product_name_'.$i, $data)) {
break;
}
// do stuff with the current product
$i++;
}
Is there a cleaner way of doing the above?
Doing a while(true) looks stupid to me, or is there nothing wrong with this approach?
Or perhaps there is another approach?
Your method works, as long as the numeric portions are guaranteed to be sequential. If there are gaps, it'll miss anything that comes after the first gap.
You could use something like:
$names = preg_grep('/^product_name_\d+$/', array_keys($data));
which'll return all of the 'name' keys from your array. You'd extract the digit portion from the key name, and then can use that to refer to the 'desc' section as well.
foreach($names as $name_field) {
$id = substr($name_field, 13); // strip the 13-character 'product_name_' prefix
$name_val = $data["product_name_{$id}"];
$desc_val = $data["product_desc_{$id}"];
}
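Putting that together as a runnable sketch. The sample array is made up, including the unrelated 'order_id' entry and the deliberate gap at index 1, to show that gaps are not a problem with this approach.

```php
<?php
// Hypothetical third-party array with other entries and a gap at index 1.
$data = [
    'order_id'       => 42,
    'product_name_0' => 'Widget',
    'product_desc_0' => 'A small widget',
    'product_name_2' => 'Gadget',
    'product_desc_2' => 'A large gadget',
];

$products = [];
foreach (preg_grep('/^product_name_\d+$/', array_keys($data)) as $nameField) {
    // Everything after the 'product_name_' prefix is the product number.
    $id = substr($nameField, strlen('product_name_'));
    $products[$id] = [
        'name' => $data[$nameField],
        'desc' => $data["product_desc_{$id}"] ?? null, // desc may be missing
    ];
}

print_r(array_keys($products)); // 0 and 2
```

Using strlen('product_name_') instead of a hard-coded offset is just a way to avoid counting characters by hand.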
How about this
$i = 0;
while(array_key_exists('product_name_'.$i, $data)) {
// loop body
$i++;
}
I think you're close. Just put the test in the while condition.
$i = 0;
while(array_key_exists('product_name_'.$i, $data)) {
// do stuff with the current product
$i++;
}
You might also consider:
$i = 0;
while(isset($data['product_name_'.$i])) {
// do stuff with the current product
$i++;
}
isset is slightly faster than array_key_exists but behaves a little differently, so it may or may not work for you:
What's quicker and better to determine if an array key exists in PHP?
Difference between isset and array_key_exists
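The difference that matters most here is how the two treat a key whose value is null. A short sketch with made-up data:

```php
<?php
// Hypothetical data: product 1 exists but its name is null.
$data = ['product_name_0' => 'Widget', 'product_name_1' => null];

$issetResult = isset($data['product_name_1']);
$keyExistsResult = array_key_exists('product_name_1', $data);

var_dump($issetResult);     // bool(false): the key is there but its value is null
var_dump($keyExistsResult); // bool(true): only the key's presence is checked
```

So with the isset version the loop would stop at the first product whose name is null, while the array_key_exists version would carry on.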
I have a PHP script which reads a large CSV and performs certain actions, but only if the "username" field is unique. The CSV is used in more than one script, so changing the input from the CSV to only contain unique usernames is not an option.
The very basic program flow (which I'm wondering about) goes like this:
$allUsernames = array();
while($row = fgetcsv($fp)) {
$username = $row[0];
if (in_array($username, $allUsernames)) continue;
$allUsernames[] = $username;
// process this row
}
Since this CSV could actually be quite large, it's that in_array bit which has got me thinking. The most ideal situation when searching through an array for a member is if it is already sorted, so how would you build up an array from scratch, keeping it in order? Once it is in order, would there be a more efficient way to search it than using in_array(), considering that it probably doesn't know the array is sorted?
Not keeping the array in order, but how about this kind of optimization? I'm guessing isset() for an array key should be faster than in_array() search.
$allUsernames = array();
while($row = fgetcsv($fp)) {
$username = $row[0];
if (isset($allUsernames[$username])) {
continue;
} else {
$allUsernames[$username] = true;
// do stuff
}
}
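Here is the same idea as a self-contained sketch. The $rows array is made up and stands in for what fgetcsv() would return line by line, so it can be run without a CSV file.

```php
<?php
// Hypothetical rows standing in for successive fgetcsv($fp) results.
$rows = [
    ['alice', 'alice@example.com'],
    ['bob',   'bob@example.com'],
    ['alice', 'alice-again@example.com'],
];

$allUsernames = [];  // usernames are stored as keys, not values
$processed = [];
foreach ($rows as $row) {
    $username = $row[0];
    if (isset($allUsernames[$username])) {
        continue; // already seen: a key lookup, not a scan of the array
    }
    $allUsernames[$username] = true;
    $processed[] = $username; // "do stuff" placeholder
}

print_r($processed); // alice, bob
```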
The way to build up an array from scratch in sorted order is an insertion sort. In PHP, using the binary_search function shown further down:
$list = [];
foreach ($elems_to_insert as $element) {
// find the position where $element belongs, then insert it there
$index = binary_search($list, $element);
array_splice($list, $index, 0, [$element]);
}
Although, it might actually turn out to be faster to just create the array in unsorted order and then use quicksort (PHP's builtin sort functions use quicksort)
And to find an element in a sorted list:
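function binary_search($list, $element) {
// $start is always left of the answer, $end is always a valid answer
$start = -1;
$end = count($list);
while ($end - $start > 1) {
$mid = intdiv($start + $end, 2); // integer midpoint, always a valid index
if ($list[$mid] < $element){
$start = $mid;
}
else{
$end = $mid;
}
}
// first index whose value is >= $element, or count($list) if none
return $end;
}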
With this implementation you'd have to test $list[$end] to see if it is the element you want (after checking that $end is still inside the array), since if the element isn't in the array, this will find the point where it should be inserted. I did it that way so it'd be consistent with the previous code sample. If you want, you could check $list[$end] === $element in the function itself.
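To show the two pieces working together, here is a self-contained sketch that keeps a list of usernames sorted and unique as they arrive. lowerBound() is my own name for a lower-bound binary search written for this example, and the usernames are made up.

```php
<?php
// Hypothetical helper: index of the first value >= $element,
// or count($list) when every value is smaller.
function lowerBound(array $list, string $element): int
{
    $start = -1;
    $end = count($list);
    while ($end - $start > 1) {
        $mid = intdiv($start + $end, 2);
        if ($list[$mid] < $element) {
            $start = $mid;
        } else {
            $end = $mid;
        }
    }
    return $end;
}

$sorted = [];
foreach (['mike', 'alice', 'zoe', 'bob', 'alice'] as $username) {
    $index = lowerBound($sorted, $username);
    // Bounds check first: $index can equal count($sorted).
    if ($index < count($sorted) && $sorted[$index] === $username) {
        continue; // duplicate, skip it
    }
    array_splice($sorted, $index, 0, [$username]); // insert in order
}

print_r($sorted); // alice, bob, mike, zoe
```

Each array_splice() call still has to shift the later elements along, so for plain duplicate detection I would expect the key-based approach with isset() to remain the simpler option.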
The array type in php is an ordered map (php array type). If you pass in either ints or strings as keys, you will have an ordered map...
Please review item #6 in the above link.
in_array() does not benefit from having a sorted array. PHP just walks along the whole array as if it were a linked list.