Speeding up PHP foreach loop with unset()? - php

Based on what I have read about the internals of php arrays this may not be possible, but I wanted to run this past you and see if there's a chance...
I am pulling two result sets from a SQL query. One is an array of customers (an array of arrays, actually, with each customer having an ID and some other personal data), and the second is an array of customer orders (also an array of arrays). Basically, my foreach loop matches the customer ID values from the customer array with all of the orders that customer made in the second array and pushes them into a new third data structure.
So let's say I make my SQL query and I pull an array of 500 customers and 3,000 orders. On average then, one customer will have six orders to match. For each iteration through the customer array I need to also iterate through the entire orders array to get all the matches. Then I will push these into a different data structure which is ultimately what I will use.
I wanted to know if unsetting the matched rows from both of the original arrays would speed up the foreach loop since it would in principle have less to iterate over with each cycle. Based on how PHP uses hashes and buckets for its arrays, and the way it makes a copy of arrays in foreach loops, I am wondering if a performance increase would actually occur. I plan to test this with some dummy data but I wanted to know if anyone here has run across a similar situation.
Thanks!!
EDIT - thanks for your answers. Yes, joining the tables is probably the best way, but I am asking this question to gain a better understanding of how PHP handles arrays, etc.
I wrote a test script and I can now see that using unset() doesn't help me:
$customers = Array('1' =>'Jim', '2' => 'Bill', '3' => 'John', '4' => 'Ed', '5' => 'Greg');
$orders = Array();
$final = Array();
for ($x = 0; $x < 1000000; $x++) {
$orders[$x] = rand(1,5);
}
$start = microtime(true);
$counter = 0;
foreach ($customers as $key=>$customer) {
$final[$customer] = Array();
foreach ($orders as $key=>$order) {
$counter++;
if ($order == $key) {
$final[$customer][] = $order;
unset($orders[$key]);
}
}
}
echo $counter; // I usually get the same number of iterations whether unset() is used or not
$finish = microtime(true) - $start; // similar or worse performance if unset() is used
What I see is no performance increase with unset(), but rather a decrease. On top of that, unset() doesn't actually remove the row from the array. When I do a count() of the $orders array it is the same at the end as it is at the beginning.
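One thing worth checking in the script above: the inner foreach reuses $key, so $order == $key compares each order value with that order's own index rather than with the customer ID, and very few rows ever match or get unset. Here is a minimal sketch of the same test with distinct variable names and a tiny fixed data set. The function name matchOrders and the sample values are made up for illustration, and the behaviour described assumes PHP 7 or later.

```php
<?php
// Customer ID => name; each order value is the ID of the customer who placed it.
$customers = [1 => 'Jim', 2 => 'Bill', 3 => 'John'];
$orders = [1, 2, 3, 1, 2, 1];

// Hypothetical helper: groups order keys by customer and counts inner-loop iterations.
function matchOrders(array $customers, array $orders, bool $useUnset): array
{
    $final = [];
    $iterations = 0;
    foreach ($customers as $customerId => $name) {
        $final[$name] = [];
        // Distinct variable names, so the customer ID is not overwritten.
        foreach ($orders as $orderKey => $orderCustomerId) {
            $iterations++;
            if ($orderCustomerId == $customerId) {
                $final[$name][] = $orderKey;
                if ($useUnset) {
                    // Shrinks $orders for the NEXT outer pass; the current
                    // inner foreach keeps iterating over its own snapshot.
                    unset($orders[$orderKey]);
                }
            }
        }
    }
    return [
        'final' => $final,
        'iterations' => $iterations,
        'remaining' => count($orders),
    ];
}

$without = matchOrders($customers, $orders, false);
$with = matchOrders($customers, $orders, true);
```

With these six orders the inner loop body runs 18 times without unset() and 10 times with it, and the function's local $orders array ends up empty. Whether that translates into a wall-clock saving on a large array is a separate question that still needs measuring.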

As per my understanding, you are fetching the customers and the orders separately from the DB.
Customer
ID Name Add..
1 one ...
2 two ...
3 three ....
Orders
ID CID ord
1 1 ...
2 3 ....
3 3 ....
4 2 ....
Join the tables while retrieving the data and get the required result directly, for example:
SELECT c.Name, c.ID, o.ord FROM customer c
LEFT JOIN orders o ON c.ID = o.CID;
You can add a WHERE condition as required.

This is just off the top of my head:
Appending data to another array using the 'arrayxyz[]' syntax could be faster than 'unset' because of the extra logic that has to run to keep the 'foreach' in a usable state.
Also, unsetting an array item removes the key/value pair, and if you are using numeric indexes this leaves a 'hole' in the key sequence. count() does reflect the removal, but code that assumes consecutive indexes will no longer work.
You may need to use 'array_values' to reindex the array.
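A quick way to see the 'hole' and the effect of array_values, as a small sketch with made-up values:

```php
<?php
$orders = [10, 20, 30, 40];

// Remove the element at index 1; the other keys are left untouched.
unset($orders[1]);

$countAfter = count($orders);       // number of remaining elements
$keysAfter = array_keys($orders);   // keys are no longer consecutive

// array_values() returns a copy with keys renumbered from 0.
$reindexed = array_values($orders);
```

After the unset, count() reports 3 and the keys are 0, 2, 3; the reindexed copy has keys 0, 1, 2.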

Related

PHP Foreach loop too slow when applied on large arrays

So, basically, I have to loop through an array of 25000 items, compare each item with another array's IDs and, if the IDs from the first array and the second match, create another array of the matched items. It looks something like this:
foreach ($all_games as $game) {
foreach ($user_games_ids as $user_game_id) {
if ($user_game_id == $game["appid"]) {
$game_details['title'][] = $game["title"];
$game_details['price'][] = $game["price"];
$game_details['image'][] = $game["image_url"];
$game_details['appid'][] = $game["appid"];
}
}
}
I tested this loop with only 2500 records from the first array ($all_games) and about 2000 records from the second array ($user_games_ids), and as far as I can tell it takes about 10 seconds to execute that chunk of code (the loops alone). Is that normal? Should it take that long, or am I approaching the issue from the wrong side? Is there a way to reduce that time? When I apply that code to 25000 records, the time will increase significantly.
Any help is appreciated,
Thanks.
EDIT: So there is no confusion, I can't use a DB query to improve the performance. Although I have added all 25000 games to the database, I can't do the same for the user game IDs. There is no way that I know of to get all of the users through the API I'm accessing, and even if there is, that would be a lot of users. I get the user's game IDs on the fly when a user enters their ID in the form; I use file_get_contents to obtain those IDs and then cross-reference them with the database that stores all games. Again, that might not be the best way, but it's the only one I could think of at this point.
If you re-index the $all_games array by appid using array_column(), you can reduce it to one loop and just check whether the entry isset:
$games_by_id = array_column($all_games, null, "appid");
foreach ($user_games_ids as $user_game_id) {
    // A keyed lookup replaces the inner loop over all games.
    if (isset($games_by_id[$user_game_id])) {
        $game = $games_by_id[$user_game_id];
        $game_details['title'][] = $game["title"];
        $game_details['price'][] = $game["price"];
        $game_details['image'][] = $game["image_url"];
        $game_details['appid'][] = $game["appid"];
    }
}
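To make the idea concrete, here is a small self-contained sketch of the re-index-then-lookup approach. The sample games and IDs are invented for illustration, and the variable name $games_by_id is just a convenient label.

```php
<?php
// Made-up sample data in the shape described in the question.
$all_games = [
    ['appid' => 10, 'title' => 'Alpha', 'price' => 5, 'image_url' => 'a.png'],
    ['appid' => 20, 'title' => 'Beta',  'price' => 9, 'image_url' => 'b.png'],
    ['appid' => 30, 'title' => 'Gamma', 'price' => 0, 'image_url' => 'c.png'],
];
$user_games_ids = [30, 10, 99]; // 99 has no matching game

// Re-key the list by appid so each lookup is a single hash access.
$games_by_id = array_column($all_games, null, 'appid');

$game_details = ['title' => [], 'price' => [], 'image' => [], 'appid' => []];
foreach ($user_games_ids as $user_game_id) {
    if (isset($games_by_id[$user_game_id])) {
        $game = $games_by_id[$user_game_id];
        $game_details['title'][] = $game['title'];
        $game_details['price'][] = $game['price'];
        $game_details['image'][] = $game['image_url'];
        $game_details['appid'][] = $game['appid'];
    }
}
```

The work drops from (games x user IDs) comparisons to one pass over the games plus one pass over the user IDs.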

How to limit the number of page results based on one table in INNER JOIN SQL query or in php?

I am trying to get 5 questions per page with their answers (a one-to-many relation between the questions and answers tables), but the LIMIT is applied to the rows of the joined table. Is there any way to limit the results based on the questions table for pagination?
<?php
$topic_id = $_GET['topic_id'];
$answers_data = [];
$questions_data = [];
if (isset($_GET["page"])) { $page = (int) $_GET["page"]; } else { $page = 1; }
$num_rec_from_page = 5;
$start_from = ($page - 1) * $num_rec_from_page;
$sql = "SELECT questions.q_id,questions.question,answers.answers,answers.answer_id FROM questions INNER JOIN answers ON questions.q_id = answers.q_id WHERE topic_id='$topic_id' LIMIT $start_from, $num_rec_from_page";
$result = $connection->query($sql);
while($row = mysqli_fetch_assoc($result)) {
$data[] = $row;
}//While loop
foreach($data as $key => $item) {
$answers_data[$item['q_id']][$item['answer_id']] = $item['answers'];
}
foreach($data as $key => $item) {
$questions_data[$item['q_id']] = $item['question'];
}
?>
I display the results of the query above using two foreach loops, as below.
<?php
$question_count= 0;
foreach ($answers_data as $question_id => $answers_array) {
$question_count++;
$q_above_class = "<div class='uk-card-default uk-margin-bottom'><div class='uk-padding-small'><span class='uk-article-meta'>Question :".$question_count."</span><p>";
$q_below_class = "</p><div class='uk-padding-small'>";
echo $q_above_class.$questions_data[$question_id].$q_below_class;
$answer_count = 0;
foreach($answers_array as $key => $answer_options) {
$answer_count++;
$answer_options = strip_tags($answer_options, '<img>');
$ans_above_class="<a class='ansck'><p class='bdr_lite uk-padding-small'><span class='circle'>".$answer_count."</span>";
$ans_below_class = "</p></a>";
echo $ans_above_class.$answer_options.$ans_below_class;
}
echo "</div></div></div>";
}
?>
Is there any way I can limit the results per page based on the questions table?
Something like this:
SELECT
q.q_id,
q.question,
a.answers,
a.answer_id
FROM
(
SELECT
q_id, question
FROM
questions
WHERE
topic_id=:topic_id
LIMIT
$start_from, $num_rec_from_page
) AS q
JOIN
answers AS a ON q.q_id = a.q_id
A few questions/thoughts/notes:
You had question.q_id and question.question_id, which seems like an error, so I just went with q_id (the other one is more typing, which I don't like). I figured I had a 50-50 chance.
You had just topic_id, so I can't be sure which table it's from; I'm assuming it's from the questions table. It makes a big difference, as we really need the WHERE condition on the sub-query, where the LIMIT is.
INNER JOIN is the same thing as JOIN, so I just wrote JOIN because I'm lazy. There is a previous post on SO that talks about it.
:topic_id: I parameterized your query; I don't do variable concatenation and SQL injection vulnerability stuff (aka please use prepared statements). Named placeholders are for PDO, which is what I like using; you can pretty much just replace it with a ? for mysqli.
As I said with INNER JOIN, I'm lazy, so I like aliasing my tables with just one character, and that is what I did. (I think you don't even need the AS part, but I'm not "that" lazy.) Sometimes I have to use two, which really irritates me, but whatever.
With a sub-query, you can limit the rows from just that table, then join the results of that query back to the main query like normal. This way you pull 5 (or what have you) rows from the questions table, and then {n} rows from answers based only on the join to the results of the inner query.
I can't really test it, but in theory it should work. You'll have to go through the results and group them by question, because you will get {n} rows that repeat the same 5 questions. With PDO you could use PDO::FETCH_GROUP; I don't think mysqli has an equivalent, so you'll have to do it manually, but it's pretty trivial.
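For what it's worth, here is a rough sketch of the sub-query plus PDO::FETCH_GROUP idea that can be run without a MySQL server. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite database as a stand-in; the table contents, the page size of 2 and the join column a.q_id are assumptions made for the demo, not the asker's real schema.

```php
<?php
// In-memory SQLite stands in for the real MySQL database (assumption for the demo).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE questions (q_id INTEGER PRIMARY KEY, topic_id INTEGER, question TEXT)');
$pdo->exec('CREATE TABLE answers (answer_id INTEGER PRIMARY KEY, q_id INTEGER, answers TEXT)');

// Three questions in topic 1, two answers each (made-up data).
for ($q = 1; $q <= 3; $q++) {
    $pdo->exec("INSERT INTO questions (q_id, topic_id, question) VALUES ($q, 1, 'Question $q')");
    for ($a = 1; $a <= 2; $a++) {
        $pdo->exec("INSERT INTO answers (q_id, answers) VALUES ($q, 'Answer $q.$a')");
    }
}

// The LIMIT lives in the sub-query, so it counts questions, not joined rows.
$sql = 'SELECT q.q_id AS q_id, q.question AS question,
               a.answer_id AS answer_id, a.answers AS answers
        FROM (SELECT q_id, question FROM questions
              WHERE topic_id = :topic_id
              ORDER BY q_id
              LIMIT :limit OFFSET :offset) AS q
        JOIN answers AS a ON q.q_id = a.q_id
        ORDER BY q.q_id, a.answer_id';

$stmt = $pdo->prepare($sql);
$stmt->bindValue(':topic_id', 1, PDO::PARAM_INT);
$stmt->bindValue(':limit', 2, PDO::PARAM_INT);   // questions per page
$stmt->bindValue(':offset', 0, PDO::PARAM_INT);  // (page - 1) * questions per page
$stmt->execute();

// FETCH_GROUP keys the result by the first selected column (q_id).
$grouped = $stmt->fetchAll(PDO::FETCH_GROUP | PDO::FETCH_ASSOC);
```

Here $grouped holds two questions, each with its own list of answer rows, even though the join itself returned four rows.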
UPDATE
Here is a DB fiddle I put together; you can see it does exactly what you need:
https://www.db-fiddle.com/f/393uFotgJVPYxVgdF2Gy2V/3
Also I put a non-subquery below it to show the difference.
As for things like small syntax errors and table/column names, well I don't have access to your DB, you're going to have to put some effort in to adapt it to your setup. The only information I have is what you put in the question and so your question had the wrong table/column name. I already pointed several of these issues out before. I'm not saying that to be mean or condescending, it's just a blunt fact.
UPDATE1
Based on your comment: "in 1st query the question is redundant"
This is just the way the database works. To explain it very simply: in my example I have 5 questions that match 8 answers. In any relational database (so not NoSQL stores like MongoDB) you can't realistically nest data. In other words, you can't pull it like this:
question1
    answer1
    answer2
    answer3
You have to pull it flat, and the way that happens is like this:
question1 answer1
question1 answer2
question1 answer3
This is just a natural consequence of how the database works when joining data. Now that that is out of the way, what do we do about it, given that we want the data nested like in the first example?
1. We can pull the questions, iterate (loop) over the result, run a query for each question and add the data to a sub-element.
Advantage: it's easy to do.
Disadvantage: while this works, it's undesirable because we will be making 6 queries against the database (1 to get the 5 questions, then 1 per question to get its answers), it requires 2 while loops to process the results, and it is actually more code.
Pseudo code instructions (I don't feel like coding this):
init data variable
query for our 5 questions
while each questions as question
- add question to data
- query for answers that belong to question
- while each answers as answer
-- add answer to nested array in data[question]
return data
2. We can process the results and build the structure we want.
Advantage: we can pull the data in one request.
Disadvantage: we have to write some code. But in #1 we still have to write code, and in fact more of it, because we have to process 6 DB results (2 while loops), whereas here we need 1 while loop.
Pseudo code instructions (for comparison):
init data variable
query for our 5 questions and their answers
while each questions&answers as row
- check if question is in data
-- if no, add question with a key we can match to it
- remove data specific to question (redundant data)
- add answers to data[question]
return data
As you can see, the basic instructions for the second one are no more complex than the first (same number of instructions). This assumes each line has the same complexity; obviously a SQL query or a while loop is more complex than an if condition. You'll see below how I convert this pseudo code to real code. I actually often write pseudo code when planning a project.
Anyway, this is what we need to do. (using the previous SQL or the first one in the fiddle). Here is your normal "standard" loop to pull data from the DB
$data = [];
while($row = mysqli_fetch_assoc($result)) {
$data[] = $row;
}//While loop
We will modify this just a bit (it's very easy)
//query for our 5 questions and their answers (using the SQL explained above)
//init data variable
$data = [];
//while each questions&answers as row
while($row = mysqli_fetch_assoc($result)) {
    // - create a key based on the question id, for convenience
    $key = 'question_'.$row['q_id'];
    // - check if question is in data
    if(!isset( $data[$key] ) ){
        //--if no, add question with a key we can match to it
        $data[$key] = [
            'q_id' => $row['q_id'],
            'question' => $row['question'],
            'children' => [] //you can call this whatever you want, I chose "children" because there is a field named "answers"
        ];
    }
    //- remove data specific to question (redundant data) from the row [optional]
    unset($row['q_id'], $row['question']);
    //- add answers to data[question]
    $data[$key]['children'][] = $row;
}
//return data
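If you want to try the grouping step without a database, the following sketch runs the same logic over a hard-coded array. The $rows values are made up and simply stand in for what mysqli_fetch_assoc() would return one row at a time.

```php
<?php
// Flat rows as a join would return them (sample values, not real data).
$rows = [
    ['q_id' => '4', 'question' => 'four', 'answers' => '4', 'answer_id' => '4'],
    ['q_id' => '5', 'question' => 'five', 'answers' => '5', 'answer_id' => '5'],
    ['q_id' => '5', 'question' => 'five', 'answers' => '5', 'answer_id' => '6'],
];

$data = [];
foreach ($rows as $row) {
    // Prefixed key so the question id survives functions that renumber integer keys.
    $key = 'question_' . $row['q_id'];

    // First time this question is seen: create its entry.
    if (!isset($data[$key])) {
        $data[$key] = [
            'q_id' => $row['q_id'],
            'question' => $row['question'],
            'children' => [],
        ];
    }

    // Drop the question columns from the row, leaving only answer data.
    unset($row['q_id'], $row['question']);
    $data[$key]['children'][] = $row;
}
```

Three flat rows collapse into two question entries, the second of which carries two answers under 'children'.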
So what does this look like? For the first while loop, the standard one, we get this, with what you called redundant data:
[
["q_id" => "4", "question" => "four", "answers"=>"4", "answer_id"=>"4"],
["q_id" => "5", "question" => "five", "answers"=>"5", "answer_id"=>"5"],
["q_id" => "5", "question" => "five", "answers"=>"5", "answer_id"=>"6"],
]
For the second one, with our harder code (that's not really hard), we get this:
[
    "question_4" => ["q_id" => "4", "question" => "four", "children" => [["answers"=>"4", "answer_id"=>"4"]]],
    "question_5" => [
        "q_id" => "5",
        "question" => "five",
        "children" => [
            ["answers"=>"5", "answer_id"=>"5"],
            ["answers"=>"5", "answer_id"=>"6"]
        ]
    ],
]
I expanded the second question so you can see the nesting. This is also a good example of why it had redundant data, and of what is happening in general. As you can see, there is no way to represent 8 rows with 5 shared questions without having some redundant data (unless you nest them).
The last thing I would like to mention is my choice of $key. We could have used just q_id with no question_ bit added on. I do this for 2 reasons.
It's easier to read when printing it out.
There are several array_* (and other) functions in PHP that will reset numeric keys. Because we are storing important data here, we don't want to lose that information. The way to prevent this is to use strings. You can cast it to a string with (string)$row['q_id'], but PHP converts numeric string keys back to integers, so the keys can still get reset. Another example is JSON encoding: there is a flag, JSON_FORCE_OBJECT, that forces numerically keyed arrays to be encoded as objects ({"0":"value"}), but it acts globally. In any case, you can lose the keys if they are just numbers, and I don't much care for that happening, so I prefix them to prevent it.
It's not hard to do something like preg_match('/question_([0-9]+)/', $key, $match) or $id = substr($key, 9); to pull the ID back off of there. We also have the q_id in the array. And it's no harder to check isset($data['question_1']) than isset($data['1']), and it looks better.
So for minimal difficulty we can be sure we won't lose our IDs to some code oversight (unless we use usort instead of uasort), but I digress.
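A short illustration of the key-resetting point, using array_merge as one example of a function that renumbers integer keys (the arrays are made up for the demo):

```php
<?php
$numeric = [4 => 'four', 5 => 'five'];
$prefixed = ['question_4' => 'four', 'question_5' => 'five'];

// array_merge() renumbers integer keys from 0 but keeps string keys as they are.
$mergedNumeric = array_merge($numeric, [9 => 'nine']);
$mergedPrefixed = array_merge($prefixed, ['question_9' => 'nine']);

// 'question_' is 9 characters long, so the id starts at offset 9.
$id = (int) substr('question_5', 9);
```

The numeric version comes back keyed 0, 1, 2, so the question ids are gone; the prefixed version keeps question_4, question_5 and question_9, and the id can be recovered with substr.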

mysql | PHP | Join within own table

I don't know if I am doing this right or wrong, so please don't judge me...
What I am trying to do is this: if a record belongs to a parent, it will have the parent's ID associated with it. Let me show you my table schema below.
I have two columns:
ItemCategoryID &
ItemParentCategoryID
Suppose the record with ItemCategoryID = 4 belongs to ItemCategoryID = 2; then the ItemParentCategoryID column of record 4 will hold the parent's ItemCategoryID (2).
I mean a loop within its own table.
But the problem is how to run the select query :P
I mean, show all the parents and the children belonging to each parent.
This is often a lazy design choice. Ideally you want a separate table for these relations and/or a fixed number of depth levels. If a parent_id's parent can have its own parent_id, this means a potentially infinite depth.
MySQL isn't a big fan of infinite nesting depths, but PHP doesn't mind. Either run multiple queries in a loop, as in Nil'z's answer, or consider fetching all rows and sorting them out in arrays in PHP. The latter solution is nice if you pretty much always fetch all rows anyway, which makes filtering in MySQL unnecessary.
Lastly, consider whether your database structure could take a better approach to this. Don't be afraid to use more than one table.
This can be a serious performance thief in the future. An uncontrolled number of MySQL queries on each page load can easily get out of hand.
Try this:
function all_categories(){
$data = array();
$first = $this->db->select('itemParentCategoryId')->group_by('itemParentCategoryId')->get('table')->result_array();
if( isset( $first ) && is_array( $first ) && count( $first ) > 0 ){
foreach( $first as $key => $each ){
$second = $this->db->select('itemCategoryId, categoryName')->where_in('itemParentCategoryId', $each['itemParentCategoryId'])->get('table')->result_array();
$data[$key]['itemParentCategoryId'] = $each['itemParentCategoryId'];
$data[$key]['subs'] = $second;
}
}
print_r( $data );
}
I don't think you want to (or can) do this in your query, since the nesting can go a long way down.
You should make a getChilds-style function that calls itself whenever you retrieve a category. This way you can nest more than 2 levels.
function getCategory($categoryId)
{
    // Retrieve the category
    // Get its children
    $children = $this->getCategoriesByParent($categoryId);
}
function getCategoriesByParent($parentId)
{
    // Get the categories whose parent is $parentId
    // For each of them, call getCategoriesByParent() again to get its children.
}
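As a rough sketch of the "fetch all rows and sort them out in PHP" variant of this recursion: the function below builds a nested tree from a flat list of rows. The function name buildCategoryTree, the convention that top-level categories have parent ID 0, and the sample rows are all assumptions for illustration; it also assumes the data contains no cycles.

```php
<?php
// Hypothetical helper: returns the categories under $parentId, each with a 'children' list.
function buildCategoryTree(array $rows, int $parentId = 0): array
{
    $branch = [];
    foreach ($rows as $row) {
        if ($row['ItemParentCategoryID'] === $parentId) {
            // Recurse: this category's children are the rows that point back at it.
            $row['children'] = buildCategoryTree($rows, $row['ItemCategoryID']);
            $branch[] = $row;
        }
    }
    return $branch;
}

// Flat rows as a single SELECT over the table might return them (made-up data).
$rows = [
    ['ItemCategoryID' => 1, 'ItemParentCategoryID' => 0],
    ['ItemCategoryID' => 2, 'ItemParentCategoryID' => 0],
    ['ItemCategoryID' => 4, 'ItemParentCategoryID' => 2],
    ['ItemCategoryID' => 5, 'ItemParentCategoryID' => 4],
];

$tree = buildCategoryTree($rows);
```

This needs only one query regardless of depth, at the cost of rescanning the row list for every category; for large tables, indexing the rows by parent ID first would avoid the repeated scans.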
MySQL versions before 8.0 do not support recursive queries (8.0 added recursive common table expressions via WITH RECURSIVE). On older versions it is possible to emulate recursive queries through recursive calls to a stored procedure, but this is hackish and sub-optimal.
There are other ways to organise your data, and these structures allow very efficient querying.
This question comes up so often I can't even be bothered to complain about your inability to use Google or SO search, or to offer a wordy explanation.
Here - use this library I made: http://codebyjeff.com/blog/2012/10/nested-data-with-mahana-hierarchy-library so you don't bring down your database

PHP and MySQL - efficiently handling multiple one to many relationships

I am seeking some advice on the best way to retrieve and display my data using MySQL and PHP.
I have 3 tables, all 1 to many relationships as follows:
Each SCHEDULE has many OVERRIDES and each override has many LOCATIONS. I would like to retrieve this data so that it can all be displayed on a single PHP page e.g. list out my SCHEDULES. Within each schedule list the OVERRIDES, and within each override list the LOCATIONS.
Option1 - Is the best way to do this make 3 separate SQL queries and then write these to a PHP object? I could then iterate through each array and check for a match on the parent array.
Option 2 - I have thought quite a bit about joins; however, doing two right joins will return a row for every entry in all 3 tables.
Any thoughts and comments would be appreciated.
Best regards, Ben.
If you really want every piece of data, you're going to be retrieving the same number of rows, no matter how you do it. Best to get it all in one query.
SELECT schedule.id, overrides.id, locations.id, locations.name
FROM schedule
JOIN overrides ON overrides.schedule_id = schedule.id
JOIN locations ON locations.override_id = overrides.id
ORDER BY schedule.id, overrides.id, locations.id
By ordering the results like this, you can iterate through the result set and move on to the next schedule whenever the schedule id changes, and to the next override whenever the override id changes.
Edit: a possible example of how to turn this data into a 3-dimensional array -
$last_schedule = 0;
$last_override = 0;
$schedules = array();
while ($row = mysqli_fetch_row($query_result)) // the old mysql_* functions were removed in PHP 7
{
$schedule_id = $row[0];
$override_id = $row[1];
$location_id = $row[2];
$location_name = $row[3];
if ($schedule_id != $last_schedule)
{
$schedules[$schedule_id] = array();
}
if ($override_id != $last_override)
{
$schedules[$schedule_id][$override_id] = array();
}
$schedules[$schedule_id][$override_id][$location_id] = $location_name;
$last_schedule = $schedule_id;
$last_override = $override_id;
}
Quite primitive, I imagine your code will look different, but hopefully it makes some sense.
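For comparison, a shorter sketch of the same grouping that relies on PHP creating nested arrays on first assignment, so no "last id" tracking is needed. The rows are hard-coded stand-ins for the query result (column order: schedule id, override id, location id, location name) and the values are made up; the array destructuring in foreach needs PHP 7.1 or later.

```php
<?php
// Sample rows in the column order of the SELECT above (made-up data).
$rows = [
    [1, 10, 100, 'Lobby'],
    [1, 10, 101, 'Garage'],
    [1, 11, 102, 'Roof'],
    [2, 12, 103, 'Gate'],
];

$schedules = [];
foreach ($rows as [$scheduleId, $overrideId, $locationId, $locationName]) {
    // Intermediate arrays are created automatically the first time a key is used.
    $schedules[$scheduleId][$overrideId][$locationId] = $locationName;
}
```

Note that with plain JOINs a schedule without overrides, or an override without locations, produces no rows and so will not appear in $schedules at all.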

get max value in php (instead of mysql)

I have two MySQL tables, Badges and Events. I use a join to find all the events and return the badge info for that event (title & description) using the following code:
SELECT COUNT(Badges.badge_ID) AS badge_count, title, Badges.description
FROM Badges JOIN Events ON Badges.badge_id = Events.badge_id
GROUP BY title ASC
In addition to the counts, I need to know the value of the event with the most entries. I thought I'd do this in PHP with the max() function, but I had trouble getting that to work correctly. So I decided I could get the same result by modifying the above query with "ORDER BY badge_count DESC LIMIT 1", which returns an array of a single element whose value is the highest count total of all the events.
While this solution works well for me, I'm curious if it is taking more resources to make 2 calls to the server (b/c I'm now using two queries) instead of working it out in php. If I did do it in php, how could I get the max value of a particular item in an associative array (it would be nice to be able to return the key and the value, if possible)?
EDIT:
OK, it's amazing what a few hours of rest will do for the mind. I opened up my code this morning, and made a simple modification to the code, which worked out for me. I simply created a variable on the count field and, if the new one was greater than the old one, changed it to the new value (see the "if" statement in the following code):
if ( $c > $highestCount ) {
$highestCount = $c; }
This might again lead to a "religious war", but I would go with the two queries version. To me it is cleaner to have data handling in the database as much as possible. In the long run, query caching, etc.. would even out the overhead caused by the extra query.
Anyway, to get the max in PHP, you simply need to iterate over your $results array:
function getMax($results) {
    if (count($results) == 0) {
        return NULL;
    }
    // Start with the first element, then keep whichever is larger.
    $max = reset($results);
    foreach ($results as $elem) {
        if ($max < $elem) { // need to do specific comparison here
            $max = $elem;
        }
    }
    return $max;
}
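Since the question also asks for the key, here is a small sketch using built-in functions on an associative array of title => count. The array contents are invented for the example; in practice it would be built from the query rows.

```php
<?php
// Badge title => number of events (made-up values).
$badgeCounts = ['Gold' => 3, 'Silver' => 7, 'Bronze' => 5];

// max() gives the largest value; array_search() finds the first key holding it.
$highestCount = max($badgeCounts);
$highestTitle = array_search($highestCount, $badgeCounts, true);
```

max() does not accept an empty array, so check for that case first if the query can return no rows.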
