I'm currently struggling with an issue that is overloading my database, which delays all page requests significantly.
Current scenario
- A certain Artisan command is scheduled to run every 8 minutes
- This command has to update a whole table with more than 30000 rows
- Every row will have a new value, which means 30000 queries will have to be executed
- For about 14 seconds the server doesn't respond, due to database overload (I guess)
Here's the handle() method of the command:
public function handle()
{
    $thingies = /* Insert big query here */

    foreach ($thingies as $thing)
    {
        $resource = Resource::find($thing->id);
        if (!$resource)
        {
            continue;
        }
        $resource->update(['column' => $thing->value]);
    }
}
Is there any other approach to do this without delaying my page requests?
Your process is really inefficient and I'm not surprised it takes a long time to complete. To process 30,000 rows, you're making 60,000 queries (half to find out if the id exists, and the other half to update the row). You could be making just 1.
I have no experience with Laravel, so I'll leave it up to you to find out what functions in Laravel can be used to apply my recommendation. I just want to get you to understand the concepts.
MySQL allows you to submit a multi query: one command that executes many queries. It is drastically faster than executing individual queries in a loop. Here is an example that uses MySQLi directly (no 3rd-party framework such as Laravel):
// The 30,000 new values and the record IDs they belong to. These values
// MUST be escaped or known to be safe.
$values = [
    ['id' => 145, 'fieldName' => 'a'], ['id' => 2, 'fieldName' => 'b'], ...
];

// %s and %d will be replaced with column value and id to look for
$qry_template = "UPDATE myTable SET fieldName = '%s' WHERE id = %d";

$queries = []; // array of all queries to be run
foreach ($values as $row) { // build and add queries
    $q = sprintf($qry_template, $row['fieldName'], $row['id']);
    array_push($queries, $q);
}

// combine all into one query
$combined = implode("; ", $queries);

// execute all queries at once
$mysqli->multi_query($combined);
I would look into how Laravel does multi queries and start there. The last time I implemented something like this, it took about 7 milliseconds to insert 3,000 rows. So updating 30,000 will definitely not take 14 seconds.
As an added bonus, there is no need to first run a query to figure out whether the ID exists. If it doesn't, nothing will be updated.
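If it helps, here is a rough sketch of how that could look from inside handle(). I don't use Laravel, so the facade calls (DB::getPdo(), DB::unprepared()) and the resources table/column names are assumptions to verify rather than the framework's blessed API:
$thingies = [/* Insert big query here */];

$pdo = DB::getPdo();   // underlying PDO connection, used only for quoting
$statements = [];

foreach ($thingies as $thing) {
    $statements[] = sprintf(
        "UPDATE resources SET `column` = %s WHERE id = %d",
        $pdo->quote($thing->value),   // escapes and quotes the value
        (int) $thing->id              // forces the id to an integer
    );
}

// One round trip instead of 30,000. Whether several statements are accepted
// in a single call depends on how the underlying connection is configured.
DB::unprepared(implode('; ', $statements));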
Thanks to #cyclone's comment I was able to update all the values in a single query.
It's not a perfect solution, but the query now takes roughly 8 seconds to execute and only one connection is required, which means page requests are still handled while the query runs.
I'm not marking this as the definitive answer, since there might still be improvements to make.
$ids = [];
$caseQuery = '';

foreach ($thingies as $thing)
{
    if (strlen($caseQuery) == 0)
    {
        $caseQuery = '(CASE WHEN id = ' . $thing->id . ' THEN \'' . $thing->rank . '\' ';
    }
    else
    {
        $caseQuery .= ' WHEN id = ' . $thing->id . ' THEN \'' . $thing->rank . '\' ';
    }
    array_push($ids, $thing->id);
}

$caseQuery .= ' END)';

// Execute query
DB::update('UPDATE <table> SET <value> = ' . $caseQuery . ' WHERE id IN (' . implode(',', $ids) . ')');
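For anyone who wants to avoid concatenating the values into the SQL, the same query can be sketched with bindings; DB::update() takes the bindings array as its second argument (the rank property and the <table>/<value> placeholders are carried over from above):
$ids = [];
$bindings = [];
$caseQuery = 'CASE';

foreach ($thingies as $thing) {
    $caseQuery .= ' WHEN id = ? THEN ?';
    $bindings[] = $thing->id;
    $bindings[] = $thing->rank;
    $ids[] = $thing->id;
}

$caseQuery .= ' END';

// one placeholder per id for the IN (...) list
$placeholders = implode(',', array_fill(0, count($ids), '?'));

DB::update(
    'UPDATE <table> SET <value> = ' . $caseQuery . ' WHERE id IN (' . $placeholders . ')',
    array_merge($bindings, $ids)
);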
I am trying to make a PHP loop work for me in MySQL. Currently all visits to a website via a specific URL parameter are logged into a table along with the date and time of the visit. I am rebuilding the logging procedure to only count the visits via one specific parameter on one day, but I'll have to convert the old data first.
So here's what I'm trying to do: The MySQL table (let's call it my_visits) has 3 columns: parameter, visit_id and time.
In my PHP code, I've created the following loop to gather the data I need (all visits made via one parameter on one day, for all parameters):
foreach (range(2008, 2014) as $year) {
    $visit_data = array();
    $date_ts = strtotime($year . '-01-01');

    while ($date_ts <= strtotime($year . '-12-31')) {
        $date = date('Y-m-d', $date_ts);
        $date_ts += 86400;

        // count visit data for this day
        $sql = 'SELECT parameter, COUNT(parameter) AS total ' .
               'FROM my_visits ' .
               'WHERE time BETWEEN ? AND ? ' .
               'GROUP BY parameter ORDER BY total DESC';
        $stmt = $db->prepare($sql);
        $stmt->execute(array($date . ' 00:00', $date . ' 23:59'));

        while ($row = $stmt->fetch()) {
            $visit_data[] = array(
                'param' => $row['parameter'],
                'visit_count' => $row['total'],
                'date' => $date);
        }
        $stmt->closeCursor();
    }
}
Later on, the gathered data is inserted into a new table (basically eliminating visit_id) using a multiple INSERT (thanks to SO! :)).
The above code works, but due to the size of the table (roughly 3.4 million rows) it is very slow. Using 7 * 365 SQL queries just to gather the data seems wrong to me, and I fear that running the script will slow everything down substantially.
Is there a way to make this loop work in MySQL, like an equivalent query or something (on a yearly basis perhaps)? I've already tried a solution using GROUP BY, but since this eliminates either the specific dates or the parameters, I can't get it to work.
You can GROUP further.
SELECT `parameter`, COUNT(`parameter`) AS `total`, DATE(`time`) AS `date`
FROM `my_visits`
GROUP BY `parameter`, DATE(`time`)
You can then execute it once (instead of in a loop) and use $row['date'] instead of $date.
This also means you don't have to update your code when we reach 2015 ;)
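Roughly, the gathering code from the question then collapses to a single statement. This is only a sketch, keeping the ORDER BY from the original query and assuming $db is the same PDO connection:
$sql = 'SELECT parameter, COUNT(parameter) AS total, DATE(time) AS date ' .
       'FROM my_visits ' .
       'GROUP BY parameter, DATE(time) ' .
       'ORDER BY total DESC';

$stmt = $db->prepare($sql);
$stmt->execute();

$visit_data = array();
while ($row = $stmt->fetch()) {
    $visit_data[] = array(
        'param' => $row['parameter'],
        'visit_count' => $row['total'],
        'date' => $row['date']);   // comes straight from the query now
}
$stmt->closeCursor();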
I have a bunch of photos on a page and using jQuery UI's Sortable plugin, to allow for them to be reordered.
When my sortable function fires, it writes a new order sequence:
1030:0,1031:1,1032:2,1040:3,1033:4
Each item of the comma-delimited string consists of the photo ID and the order position, separated by a colon. When the user has completely finished their reordering, I'm posting this order sequence to a PHP page via AJAX, to store the changes in the database. Here's where I get into trouble.
I have no problem getting my script to work, but I'm pretty sure it's the incorrect way to achieve what I want, and will suffer hugely in performance and resources - I'm hoping somebody could advise me as to what would be the best approach.
This is my PHP script that deals with the sequence:
if ($sorted_order) {
    $exploded_order = explode(',', $sorted_order);

    foreach ($exploded_order as $order_part) {
        $exploded_part = explode(':', $order_part);

        $part_count = 0;
        foreach ($exploded_part as $part) {
            $part_count++;
            if ($part_count == 1) {
                $photo_id = $part;
            } elseif ($part_count == 2) {
                $order = $part;
            }
        }

        $SQL = "UPDATE article_photos ";
        $SQL .= "SET order_pos = :order_pos ";
        $SQL .= "WHERE photo_id = :photo_id;";

        ... rest of PDO stuff ...
    }
}
My concerns arise from the nested foreach functions and also running so many database updates. If a given sequence contained 150 items, would this script cry for help? If it will, how could I improve it?
** This is for an admin page, so it won't be heavily abused **
You can use one UPDATE, with some clever code like so:
Create the array $data['order'] in the loop, then:
$q = "UPDATE article_photos SET order_pos = (CASE photo_id ";
foreach($data['order'] as $sort => $id){
$q .= " WHEN {$id} THEN {$sort}";
}
$q .= " END ) WHERE photo_id IN (".implode(",",$data['order']).")";
A little clearer, perhaps:
UPDATE article_photos SET order_pos = (CASE photo_id
    WHEN 1 THEN 999
    WHEN 2 THEN 1000
    WHEN 3 THEN 1001
END)
WHERE photo_id IN (1,2,3)
I use this approach for exactly what you're doing: updating sort orders.
No need for the second foreach: you know it's going to be two parts if your data passes validation (I'm assuming you validated this. If not: you should =) so just do:
if (count($exploded_part) == 2) {
    $id  = $exploded_part[0];
    $seq = $exploded_part[1];
    /* rest of code */
} else {
    /* error - data does not conform despite validation */
}
As for update hammering: do your DB updates in a transaction. Your db will queue the ops, but not commit them to the main DB until you commit the transaction, at which point it'll happily do the update "for real" at lightning speed.
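A minimal sketch of that, assuming $pdo is the PDO connection behind the "... rest of PDO stuff ..." part:
$pdo->beginTransaction();
try {
    $stmt = $pdo->prepare(
        'UPDATE article_photos SET order_pos = :order_pos WHERE photo_id = :photo_id'
    );

    foreach (explode(',', $sorted_order) as $order_part) {
        list($photo_id, $order) = explode(':', $order_part);
        $stmt->execute(array(':photo_id' => $photo_id, ':order_pos' => $order));
    }

    $pdo->commit();   // all updates become visible at once
} catch (Exception $e) {
    $pdo->rollBack(); // nothing is half-applied if one update fails
    throw $e;
}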
I suggest making your script even simpler and changing the names of the variables, so the code is much more readable.
$parts = explode(',', $sorted_order);
foreach ($parts as $part) {
    list($id, $position) = explode(':', $part);
    // Now you can work with $id and $position
}
More info about list: http://php.net/manual/en/function.list.php
Also, about performance and your data structure:
The way you store your data is not perfect, but you will not suffer any performance issues from it: you send less data, so there is less overhead overall.
However, the drawback of your data structure is that you will most probably be unable to establish relationships between tables, make joins, or alter the table structure in a clean way.
The following code is a mock-up of my real code. I'm getting a big performance hit when myFunction is called. myTable is no more than a few hundred rows, but calling myFunction adds ~10 seconds of execution time. Is there something inherently wrong with trying to access a row of a table inside a loop that is already accessing that table?
<select>
<?php
$stmt = SQLout ("SELECT ID,Title FROM myTable WHERE LEFT(Title,2) = ? ORDER BY Title DESC",
    array ('s', $co), array (&$id, &$co_title));

while ($stmt->fetch()) {
    if (myFunction($id)) // skip this function call and save 10 seconds
        echo '<option value="' . $co_title . '">' . $co_title . '</option>';
}
$stmt->close();

function myFunction ($id) {
    $stmt = SQLout ("SELECT Info FROM myTable WHERE ID = ?",
        array ('i', $id), array (&$info));
    if ($stmt->fetch()) {
        $stmt->close();
        if ($info == $something)
            return true;
    }
    return false;
}
?>
SQLout is basically:
$stmt = $sqli_db->prepare($query);
$stmt->bind_param(...);   // binds the passed-in parameters
$stmt->execute();
$stmt->bind_result(...);  // binds the passed-in result variables
return $stmt;
What you're doing is sometimes called the "N+1 queries" problem. You run the first (outer) query once, and it returns N rows. Then you run N subordinate queries, one for each row returned by the first query. Thus N+1 queries. It causes a lot of overhead.
This would have far better performance if you could apply the "something" condition in SQL:
$stmt = SQLout ("SELECT ID,Title FROM myTable
WHERE LEFT(Title,2) = ? AND Info = ... ORDER BY Title DESC",
array ('s', $co), array (&$id, &$co_title));
In general, it's not a good idea to run queries in a loop that depends on how many rows match the outer query. What if the outer query matches 1000000 rows? That means a million queries inside the loop will hit your database for this single PHP request.
Even if today the outer query only matches 3 rows, the fact that you've architected the code in this way means that six months from now, at some unpredictable time, there will be some search that results in a vast overhead, even if your code does not change. The number of queries is driven by the data, not the code.
Sometimes it's necessary to do what you're doing, for instance if the "something" condition is complex and can't be represented by an SQL expression. But in all other cases you should try to avoid this pattern of N+1 queries.
So, if you have a "few hundred rows" in the table, you might be calling myFunction a few hundred times, depending on how many rows are returned in the first query.
Check the number of rows that first query is returning to make sure it meets your expectations.
After that, make sure you have an index on myTable.ID.
After that, I would start looking into system/server level issues. On slower systems, say a laptop hard drive, 10 queries per second might be all you can get.
Try something like this:
$stmt = SQLout ("SELECT ID,Title, Info FROM myTable WHERE LEFT(Title,2) = ? ORDER BY Title DESC",
array ('s', $co), array (&$id, &$co_title, &$info));
while ($stmt->fetch()) {
if (myFunction($info)) // skip this function call and save 10 seconds
echo '<option value="' . $co_title . '">' . $co_title . '</option>';
}
$stmt->close();
function myFunction ($info) {
if ($info == $something)
return true;
}
return false;
}
I have a list of users which needs to be iterated over with a foreach loop, inserting a new row into a db table for each user.
$data['entity_classid'] = $classid;
$data['notification_context_id'] = $context_id;
$data['entity_id'] = $entity_id;
$data['notification_by'] = $userid;
$data['actionid'] = $actionid;
$data['is_read'] = 0;
$data['createdtime'] = time();
foreach ($classassocusers as $users) {
    $data['notification_to'] = $users->userid;
    $DB->insert_record('homework.comments', $data, false);
}
So, regarding the insert query as given above:
- Is it a good practice or a bad practice?
- Shall I place any delay after every insert query execution?
- What are the pros and cons of doing so?
Thanks
Using the query like that is a good practice in your case. You will have to insert a list of users anyway, so you will have to process many queries. No way around this!
I have no idea why you would want to place a delay after each insert. These methods are synchronous calls, so your code will be "paused" anyway during the execution of your query. So delaying it will just delay your code while nothing is progressing.
Your loop will not continue while a query is executing, so don't delay your code even more on purpose.
Another way to do this, though, is to execute just one query.
$user_data = "";
foreach($classassocusers as $users) {
$user_data .= "('" . $users->userid . "', '" . $users->name . "'), ";
}
$user_data = substr($user_data, 0, strlen($user_data) - 2);
$query = "INSERT INTO `homework.comments` ( `id`, `name` )
VALUES " . $user_data;
That's supposed to make a query like:
INSERT INTO `homework.comments` ( `id`, `name` )
VALUES ('1', 'John'),
('2', 'Jeffrey'),
('3', 'Kate');
(By the way, I made some assumptions regarding your $users object and your table structure. But I'm sure you catch the idea)
It all depends on your requirements.
If you run 500,000 of these inserts in 5 minutes, every 15 minutes, your database will have a hard time. If you only do this for 1,000 users every 15 minutes, this is a great approach.
When performance is demanded, consider the following:
- Combine the INSERTs using the VALUES syntax and process them in batches of 500/1,000 rows (a sketch follows below).
- Add a small timeout after each query.
Otherwise, this is an excellent approach!
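For completeness, here is a sketch of the batching idea with array_chunk. The column names and the plain $pdo handle are assumptions on my part, since I don't know your exact schema or DB layer:
foreach (array_chunk($classassocusers, 500) as $chunk) {
    $rows = array();
    $bindings = array();

    foreach ($chunk as $users) {
        $rows[] = '(?, ?, ?)';
        $bindings[] = $classid;
        $bindings[] = $users->userid;
        $bindings[] = time();
    }

    $sql = 'INSERT INTO comments (entity_classid, notification_to, createdtime) '
         . 'VALUES ' . implode(', ', $rows);

    $stmt = $pdo->prepare($sql);
    $stmt->execute($bindings);   // one query per 500 users instead of 500 queries
}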
When I run my script I receive the following error before processing all rows of data.
maximum execution time of 30 seconds exceeded
After researching the problem, it seems I should be able to extend max_execution_time, which should resolve the problem.
But being in my PHP programming infancy, I would like to know if there is a more optimal way of writing my script below, so I do not have to rely on "get out of jail" cards.
The script is:
1. Taking a CSV file
2. Cherry-picking some columns
3. Trying to insert 10k rows of CSV data into a MySQL table
In my head I think I should be able to insert in chunks, but that is so far beyond my skillset I do not even know how to write one line :\
Many thanks in advance
<?php
function processCSV()
{
    global $uploadFile;
    include 'dbConnection.inc.php';
    dbConnection("xx", "xx", "xx");

    $rowCounter = 0;
    $loadLocationCsvUrl = fopen($uploadFile, "r");
    if ($loadLocationCsvUrl <> false)
    {
        while ($locationFile = fgetcsv($loadLocationCsvUrl, ','))
        {
            $officeId = $locationFile[2];

            $country = $locationFile[9];
            $country = trim($country);
            $country = htmlspecialchars($country);

            $open = $locationFile[4];
            $open = trim($open);
            $open = htmlspecialchars($open);

            $insString = "insert into countrytable set officeId='$officeId', countryname='$country', status='$open'";
            switch ($country)
            {
                case $country <> 'Country':
                    if (!mysql_query($insString))
                    {
                        echo "<p>error " . mysql_error() . "</p>";
                    }
                    break;
            }
            $rowCounter++;
        }
        echo "$rowCounter inserted.";
    }
    fclose($loadLocationCsvUrl);
}
processCSV();
?>
First, in 2011 you do not use mysql_query. You use mysqli or PDO and prepared statements. Then you do not need to figure out how to escape strings for SQL. You used htmlspecialchars, which is totally wrong for this purpose. Next, you could use a transaction to speed up many inserts. MySQL also supports multi-row inserts.
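To make that concrete, here is a sketch of the prepared-statement-plus-transaction route, assuming a PDO connection in $pdo instead of dbConnection()/mysql_query, with the same column positions as your script:
$pdo->beginTransaction();

$stmt = $pdo->prepare(
    'INSERT INTO countrytable (officeId, countryname, status) VALUES (?, ?, ?)'
);

while ($locationFile = fgetcsv($loadLocationCsvUrl)) {
    $country = trim($locationFile[9]);
    if ($country == 'Country') {   // skip the header row, as your switch does
        continue;
    }
    $stmt->execute(array($locationFile[2], $country, trim($locationFile[4])));
}

$pdo->commit();   // all rows are committed in one go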
But the best bet would be to use the CSV storage engine: http://dev.mysql.com/doc/refman/5.0/en/csv-storage-engine.html. You can instantly load everything into SQL and then manipulate it there as you wish. The article also shows the LOAD DATA INFILE command.
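And a sketch of the LOAD DATA INFILE route through PDO: the file path is a placeholder, the column mapping mirrors the positions your script reads (3rd, 5th and 10th fields), the first line is assumed to be a header (the 'Country' row), and LOCAL has to be allowed on the connection (the PDO::MYSQL_ATTR_LOCAL_INFILE option):
$pdo->exec("
    LOAD DATA LOCAL INFILE '/path/to/upload.csv'
    INTO TABLE countrytable
    FIELDS TERMINATED BY ','
    IGNORE 1 LINES
    (@skip1, @skip2, officeId, @skip4, status, @skip6, @skip7, @skip8, @skip9, countryname)
");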
Well, you could create a single query like this.
$query = "INSERT INTO countrytable (officeId, countryname, status) VALUES ";
$entries = array();
while ($locationFile = fgetcsv($loadLocationCsvUrl, ',')) {
// your code
$entries[] = "('$officeId', '$country', '$open')";
}
$query .= implode(', ', $enties);
mysql_query($query);
But this depends on how long your query ends up being and what the server's limit (max_allowed_packet) is set to.
But as you can read in other posts, there are better ways for your requirements. Still, I thought I should share the approach you were thinking about.
You can try calling the following function before inserting. This will set the time limit to unlimited instead of the 30-second default.
set_time_limit( 0 );