I'm wondering whether anyone has any good ideas for optimizing the following code. I have a multi-dimensional array ($List) as follows:
Array
(
[0] => Array
(
[id] => 1
[title] => A good read
[priority] => 10
)
[1] => Array
(
[id] => 2
[title] => A bad read
[priority] => 20
)
[2] => Array
(
[id] => 3
[title] => A good read
[priority] => 10
)
)
First I'm removing any entries that share the same title (no matter what the other values are) as follows:
$List_new = array();
foreach ($List as $val) {
$List_new[$val['title']] = $val;
}
$List = array_values($List_new);
Perfect. Then I'm reordering the array, first by the priority field and then id:
$sort_id = array();
$sort_priority = array();
foreach ($List as $key => $row) {
$sort_id[$key] = $row['id'];
$sort_priority[$key] = $row['priority'];
}
array_multisort($sort_priority, SORT_DESC, $sort_id, SORT_DESC, $List);
Both code blocks appear in a loop, hence the clearing of $sort_id and $sort_priority before reordering.
Is there a better way to do this - i.e. use the sorting process to remove duplicate title entries? This code block is being executed in a loop of up to 500,000 records and so any improvement would be welcome!
One loop, but a few extra function calls, so I can't tell you how the Big O changes. One thing to note: the padding around the numbers must be wide enough to prevent overflow, i.e. 2 = max 99 priorities and 6 = max 999,999 items.
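$list_titles = array();
$List_new = array();
foreach ($List as $val) {
    // Skip titles that have already been seen (keeps the first occurrence)
    if (isset($list_titles[$val['title']])) continue;
    $list_titles[$val['title']] = true;
    // Build a zero-padded "priority + id" key so a single key sort orders by both fields
    $List_new[str_pad($val['priority'], 2, '0', STR_PAD_LEFT) . str_pad($val['id'], 6, '0', STR_PAD_LEFT)] = $val;
}
// Sort by key, descending: highest priority first, then highest id
krsort($List_new);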
Edit: made some minor modifications.
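For comparison, here is a minimal sketch that avoids the padded string keys: it de-duplicates by title first and then sorts with usort() and the spaceship operator (this assumes PHP 7 or later). Note that it keeps the first entry seen for each title, whereas the code in the question keeps the last one. The sample data is copied from the question; whether this is faster on 500,000 records is something you would have to measure.

```php
<?php
$List = [
    ['id' => 1, 'title' => 'A good read', 'priority' => 10],
    ['id' => 2, 'title' => 'A bad read',  'priority' => 20],
    ['id' => 3, 'title' => 'A good read', 'priority' => 10],
];

// Keep the first entry seen for each title.
$unique = [];
foreach ($List as $row) {
    if (!isset($unique[$row['title']])) {
        $unique[$row['title']] = $row;
    }
}
$List = array_values($unique);

// Sort by priority descending, then by id descending.
// Comparing two-element arrays with <=> compares element by element.
usort($List, function ($a, $b) {
    return [$b['priority'], $b['id']] <=> [$a['priority'], $a['id']];
});

print_r(array_column($List, 'id')); // ids in sorted order: 2, then 1
```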
I will try to explain the data I'm working with first, then I'll explain what I hope to do with the data, then I'll explain what I've tried so far. Hopefully someone can point me in the right direction.
What I'm working with:
I have an array containing survey responses. The first two items are the two answers for the first question and responses contains the number of people who selected those answers. The last three items are the three answers for the other question we asked.
Array
(
[0] => Array
(
[survey_id] => 123456789
[question_text] => Have you made any changes in how you use our product this year?
[answer_text] => No
[responses] => 92
)
[1] => Array
(
[survey_id] => 123456789
[question_text] => Have you made any changes in how you use our product this year?
[answer_text] => Yes
[responses] => 30
)
[2] => Array
(
[survey_id] => 123456789
[question_text] => How would you describe your interaction with our staff compared to prior years?
[answer_text] => Less Positive
[responses] => 14
)
[3] => Array
(
[survey_id] => 123456789
[question_text] => How would you describe your interaction with our staff compared to prior years?
[answer_text] => More Positive
[responses] => 35
)
[4] => Array
(
[survey_id] => 123456789
[question_text] => How would you describe your interaction with our staff compared to prior years?
[answer_text] => No Change
[responses] => 72
)
)
What I want to achieve:
I want to create an array where the question_text is used as the key (or I might grab the question_id and use it instead), use the answer_text as a key, with the responses as the value. It would look something like this:
Array
(
[Have you made any changes in how you use our product this year?] => Array
(
[No] => 92
[Yes] => 30
)
[How would you describe your interaction with our staff compared to prior years?] => Array
(
[Less Positive] => 14
[More Positive] => 35
[No Change] => 72
)
)
Here's what I've tried:
$responses_array = array();
foreach($result_array as $value){
//$responses_array['Our question'] = array('answer 1'=>responses,'answer 2'=>responses);
$responses_array[$value['question_text']] = array($value['answer_text']=>$value['responses']);
}
This does not work because each loop will overwrite the value for $responses_array[$question]. This makes sense to me and I understand why it won't work.
My next thought was to try using array_merge().
$responses_array = array();
foreach($result as $value){
$question_text = $value['question_text'];
$answer_text = $value['answer_text'];
$responses = $value['responses'];
$responses_array[$question_text] = array_merge(array($responses_array[$question_text],$answer_text=>$responses));
}
I guess my logic was wrong because it looks like the array is nesting too much.
Array
(
[Have you made any changes in how you use our product this year?] => Array
(
[0] => Array
(
[0] =>
[No] => 92
)
[Yes] => 30
)
My problem with array_merge is that I don't have access to all answers for the question in each iteration of the foreach loop.
I want to design this in a way that allows it to scale up if we introduce more questions with different numbers of answers. How can this be solved?
Sounds like a reduce job
$response_array = array_reduce($result_array, function($carry, $item) {
$carry[$item['question_text']][$item['answer_text']] = $item['responses'];
return $carry;
}, []);
Demo ~ https://eval.in/687264
Update
Removed the condition (see @Phil's comment)
I think you are looking for something like this:
$output = [];
for($i = 0; $i < count($array); $i++) {
$output[$array[$i]['question_text']] [$array[$i]['answer_text']]= $array[$i]['responses'];
}
print_r($output);
A slightly different approach than the answer already posted, more in tune with what you've already tried. Try this:
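$responses_array = array();
$sub_array = array();
$index = $result[0]['question_text'];
foreach ($result as $value) {
    $question_text = $value['question_text'];
    $answer_text = $value['answer_text'];
    $responses = $value['responses'];
    if (strcmp($index, $question_text) != 0) {
        // The question changed: store the finished group under the previous question
        $responses_array[$index] = $sub_array;
        $sub_array = array();
        $index = $question_text;
    }
    // Always record the current answer, including the first one of a new question
    $sub_array[$answer_text] = $responses;
}
// Store the last group, which the loop never flushes on its own
$responses_array[$index] = $sub_array;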
Edit: Found my mistake, updated my answer slightly, hopefully this will work.
Edit 2: Working with example here: https://eval.in/687275
I have converted a CSV to a two dimensional array where the following array structure stores the column and row data
$table['status'] = ['active', 'active', 'inactive'];
$table['plan'] = ['annual', 'weekly', 'weekly'];
$table['spend'] = ['12,000', '19,000', '0' ];
print_r($table);
would appear as follows:
( [status] => Array ( [0] => active [1] => active [2] => inactive )
[plan] => Array ( [0] => annual [1] => weekly [2] => weekly )
[spend] => Array ( [0] => 12,000 [1] => 19,000 [2] => 0 ) )
I want to use native PHP array functions to query the arrays without having to write loops with nested conditions. If this was a MySQL database and I wanted to find the sum of spend from accounts with active status and weekly plans I would simply run the following query
SELECT SUM(spend) FROM `table` WHERE status = 'active' AND plan = 'weekly';
But instead I have to take the following approach using a for loop
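$spend = array();
// Start at row 0 and iterate over the rows of one column, not over the number of columns
for ($index = 0; $index < count($table['status']); $index++) {
    if (($table['status'][$index] == 'active') && ($table['plan'][$index] == 'weekly')) {
        $spend[$index] = $table['spend'][$index];
    }
}
echo array_sum($spend);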
This approach gives me a headache. Is there an obvious solution for refactoring this into php's native array functions or is a mess of explicit loops inevitable?
There aren't any native functions to do what you want. What's wrong with storing the information from the CSV files into a database?
If that's simply not an option, then try foreach loops; they're much cleaner.
$spend = array();
foreach ($table['spend'] as $key => $amount)
{
if ($table['status'][$key] == 'active' && $table['plan'][$key] == 'weekly')
{
$spend[] = $amount;
}
}
Whilst it doesn't solve the loops issue it does help clean them up so you don't lose your mind so much.
Use array_keys() to get all "active" keys, then loop through only those to find the matches.
$spend = array();
$keys = array_keys($table['status'], 'active');
foreach($keys as $key)
{
if($table['plan'][$key] == 'weekly')
{
$spend[] = $table['spend'][$key];
}
}
print_r($spend);
Spend:
Array
(
[0] => 19,000
)
This is pretty messy, but I found one way of doing it:
$keys = array_intersect(array_keys($table['status'], 'active'), array_keys($table['plan'], 'weekly'));
$subs['total'] = array_intersect_key($table['spend'], array_flip($keys));
print_r(array_sum($subs['total']));
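Another way to express the "WHERE" part is array_filter() with the ARRAY_FILTER_USE_KEY flag (available since PHP 5.6), so the callback receives the row index. The sketch below also strips the thousands separators before summing, because as far as I know array_sum() reads a string such as '19,000' as 19 rather than 19000. The $table data is copied from the question; $matching and $total are just illustrative names.

```php
<?php
$table = [
    'status' => ['active', 'active', 'inactive'],
    'plan'   => ['annual', 'weekly', 'weekly'],
    'spend'  => ['12,000', '19,000', '0'],
];

// Keep only the spend entries whose row index satisfies both conditions.
$matching = array_filter(
    $table['spend'],
    function ($index) use ($table) {
        return $table['status'][$index] === 'active'
            && $table['plan'][$index] === 'weekly';
    },
    ARRAY_FILTER_USE_KEY
);

// Remove thousands separators and cast to int before summing.
$total = array_sum(array_map(function ($amount) {
    return (int) str_replace(',', '', $amount);
}, $matching));

echo $total, "\n"; // 19000
```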
I start with two arrays. The first is long and consists of potential ids, but the ids can show up multiple times in the $potential array as a way to increase the probability of that id being selected later.
The second array contains the ids of persons who need to be paired with somebody from the $potential array. However, the persons needing a partner show up in both arrays, so I need to temporarily remove the elements containing the user's own id before assigning pairs, in order to avoid pairing a person with himself.
$potential = array('105', '105', '105', '2105', '1051');
$users = array('105', '1051');
From this I need to end up with:
$arr1 = Array ( [0] => 105 [1] => 105 [2] => 105 )
$arr2 = Array ( [3] => 2105 [4] => 1051 )
so that I can assign a partner to 105 from $arr2, then recombine the arrays and in the next iteration be able to assign a partner to 1051:
$arr1 = Array ( [4] => 1051 )
$arr2 = Array ( [0] => 105 [1] => 105 [2] => 105 [3] => 2105 )
I've been messing around, but this is the best I've managed to do:
function differs ($v) { global $users; return ($v == current($users)) === true; }
foreach ($users as $value) {
$arr1 = array_filter($potential, differs);
$arr2 = array_diff($potential, $arr1);
}
Of course, the above does not work. Any ideas? Am I going about this all wrong? Thanks.
Let me see if I've got this straight: you need to loop over the users and, on each iteration, have an array with the ids from the $potential array, except the current user's id. Is that right?
I was about to ask you this in a comment, but I don't have enough reputation :(
Maybe this code will help, if that's what you mean :)
$potential = array('105', '105', '105', '2105', '1051');
$users = array('105', '1051');
foreach ($users as $user) {
$available = array_filter($potential, function($id) use ($user){
return ($id != $user);
});
}
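If you also need the removed entries (the $arr1 / $arr2 split from the question, with the original keys preserved), a minimal sketch could combine array_filter() with array_diff_key(). The helper name splitForUser() is made up for this example, and picking the partner with array_rand() is only an assumption about how the pairing would be done.

```php
<?php
// Hypothetical helper: returns [entries equal to $user, all other entries],
// both keeping their original keys from $potential.
function splitForUser(array $potential, string $user): array
{
    $own = array_filter($potential, function ($id) use ($user) {
        return $id === $user;
    });
    // Everything whose key is not in $own is a valid partner candidate.
    $candidates = array_diff_key($potential, $own);
    return [$own, $candidates];
}

$potential = ['105', '105', '105', '2105', '1051'];
$users = ['105', '1051'];

foreach ($users as $user) {
    [$own, $candidates] = splitForUser($potential, $user);
    // Ids that appear more often in $candidates are more likely to be picked.
    $partner = $candidates[array_rand($candidates)];
    echo $user, ' can be paired with one of: ', implode(',', $candidates), "\n";
}
```

Because $potential itself is never modified, there is nothing to recombine before the next iteration.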
I have two arrays (in PHP):
ArrayA
(
[0] => 9
[1] => 1
[2] => 2
[3] => 7
)
ArrayB
(
[0] => 1
[1] => 1
[3] => 8
)
I want to make two new arrays that contain only the elements whose keys are set in both arrays, like the following:
ArrayA
(
[0] => 9
[1] => 1
[3] => 7
)
ArrayB
(
[0] => 1
[1] => 1
[3] => 8
)
In this example ArrayB[2] doesn't exist, so ArrayA[2] has been unset.
I wrote this for loop:
for ($i = 0; $i <= 99999; $i++) {
    // Drop the index from both arrays unless it is set in both
    if (!isset($ArrayA[$i]) || !isset($ArrayB[$i])) {
        unset($ArrayA[$i], $ArrayB[$i]);
    }
}
But it's not great because it tries every index between 0 and a very big number (99999 in this case). How can I improve my code?
The function you're looking for is array_intersect_key:
array_intersect_key() returns an array containing all the entries of array1 which have keys that are present in all the arguments.
Since you want both arrays, you'll have to run it twice, with the parameters in opposite orders, as it only keeps keys from the first array. An example:
$arrayA_filtered = array_intersect_key($arrayA, $arrayB);
$arrayB_filtered = array_intersect_key($arrayB, $arrayA);
Also, although a for loop wasn't ideal in this case, in other cases where you find yourself needing to loop through a sparse array (one where not every index is set), you can use a foreach loop:
foreach($array as $key => $value) {
//Do stuff
}
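As a quick check, here is the array_intersect_key() approach applied to the data from the question. The variable names follow the example above; the comments describe what I would expect print_r() to show.

```php
<?php
$arrayA = [0 => 9, 1 => 1, 2 => 2, 3 => 7];
$arrayB = [0 => 1, 1 => 1, 3 => 8];

// Each call keeps the entries of its first argument whose keys exist in the second.
$arrayA_filtered = array_intersect_key($arrayA, $arrayB);
$arrayB_filtered = array_intersect_key($arrayB, $arrayA);

print_r($arrayA_filtered); // keys 0, 1 and 3 remain; key 2 is dropped
print_r($arrayB_filtered); // unchanged, since all of its keys exist in $arrayA
```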
One very important thing to note about PHP arrays is that they are associative. You can't simply use a for loop, as the indices are not necessarily a range of integers. Consider what would happen if you applied this algorithm twice! You'd get out of bounds errors as $arrayA[2] and $arrayB[2] no longer exist!
I would iterate through the arrays using nested foreach statements. I.e.
$outputA = array();
$outputB = array();
foreach ($arrayA as $keyA => $itemA) {
    foreach ($arrayB as $keyB => $itemB) {
        // Copy the pair only when the same key exists in both arrays
        if ($keyA == $keyB) {
            $outputA[$keyA] = $itemA;
            $outputB[$keyB] = $itemB;
        }
    }
}
This should give you two arrays, $outputA and $outputB, which look just like $arrayA and $arrayB, except they only include key=>value pairs if the key was present in both original arrays.
foreach ($arrayA as $k => $a) {
    // Remove entries of $arrayA whose key is missing from $arrayB
    if (!isset($arrayB[$k])) {
        unset($arrayA[$k]);
    }
}
Take a look at PHP's array_diff:
http://docs.php.net/manual/fr/function.array-diff.php
I want to check only the value [id] for duplicates, and remove all keys where this "field" [id] is a duplicate.
Example: if I have the numbers 1, 2, 1, I want the result to be 2, not 1, 2. The criterion for duplicates is determined only by checking [id], not any other "field".
Original array:
Array
(
[0] => Array
(
[name] => John
[id] => 123
[color] => red
)
[1] => Array
(
[name] => Paul
[id] => 958
[color] => red
)
[2] => Array
(
[name] => Jennifer
[id] => 123
[color] => yellow
)
)
The result I want:
Array
(
[0] => Array
(
[name] => Paul
[id] => 958
[color] => red
)
)
I agree with everyone above, you should give us more information about what you've tried, but I like to code golf, so here's a completely unreadable solution:
$new_array = array_filter($array, function($item) use (&$array){
return count(array_filter($array, function($node) use (&$item){
return $node['id'] == $item['id'];
})) < 2;
});
This should be fairly easy to accomplish with a couple of simple loops:
set_time_limit(0); // Disable time limit to allow enough time to process a large dataset
// $items contains your data
$id_counts = array();
foreach ($items as $item) {
if (array_key_exists($item['id'], $id_counts)) {
$id_counts[$item['id']]++;
} else {
$id_counts[$item['id']] = 1;
}
}
for ($i = count($items) - 1; $i >= 0; $i--) {
if ($id_counts[$items[$i]['id']] > 1) {
array_splice($items, $i, 1);
}
}
Result:
Array
(
[0] => Array
(
[name] => Paul
[id] => 958
[color] => red
)
)
While there are neater ways to do it, one advantage of this method is you're only creating new arrays for the list of ids and duplicate ids and the array_splice is removing the duplicates from the original array, so memory usage is kept to a minimum.
Edit: Fixed a bug that meant it sometimes left one behind
This is a very basic approach to the answer and I am sure there are much better answers however I would probably start by doing it the way I would on paper.
I look at the first index, check its value. Then I go through every other index making note of their index if the value is the same as my originally noted value. Once I have gone through the list if I have more than one index with that particular value I remove them all (starting with the highest index, so as to not affect indexes of the others while deleting).
Do this for all other indexes till you reach the end of the list.
It is long-winded, but it will make sure it removes all values which have duplicates and leaves only those which originally had no duplicates.
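The on-paper steps above could be sketched roughly as follows. The function name removeAllDuplicatedIds() is made up for this example, and the sketch assumes every row has an 'id' field. It is quadratic in the worst case, so it is meant to illustrate the idea rather than to be fast.

```php
<?php
// Hypothetical helper implementing the "on paper" procedure described above.
function removeAllDuplicatedIds(array $items): array
{
    $items = array_values($items); // work on a 0-based list
    $i = 0;
    while ($i < count($items)) {
        // Note every index whose id equals the id at position $i.
        $matches = array_keys(array_column($items, 'id'), $items[$i]['id']);
        if (count($matches) > 1) {
            // Remove them all, highest index first, so the lower indexes stay valid.
            foreach (array_reverse($matches) as $index) {
                array_splice($items, $index, 1);
            }
            // Do not advance: position $i now holds the next unchecked item.
        } else {
            $i++;
        }
    }
    return $items;
}

$items = [
    ['name' => 'John',     'id' => 123, 'color' => 'red'],
    ['name' => 'Paul',     'id' => 958, 'color' => 'red'],
    ['name' => 'Jennifer', 'id' => 123, 'color' => 'yellow'],
];
print_r(removeAllDuplicatedIds($items)); // only Paul's row remains
```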
function PickUniques(array $items){
// Quick way out
if(empty($items)) return array();
$counters = array();
// Count occurences
foreach($items as $item){
$item['id'] = intval($item['id']);
if(!isset($counters[$item['id']])){
$counters[$item['id']] = 0;
}
$counters[$item['id']]++;
}
// Pop multiples occurence ones
foreach($counters as $id => $occurences){
if($occurences > 1){
unset($counters[$id]);
}
}
// Keep only those that occur once (in $counters)
$valids = array();
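foreach($items as $item){
$id = intval($item['id']);
// Look the id up in $counters (not $items): only ids that occurred once are still there
if(!isset($counters[$id])) continue;
$valids[$id] = $item;
}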
return $valids;
}
Try this one :)