I have to insert 1000 rows of data at a time into MySQL. At the moment, I use PDO and a for-loop to insert the rows one by one. Is there a more efficient way to achieve better performance? Right now I have to set max_execution_time to 5 minutes.
function save()
{
    return $this->insert("
        INSERT INTO gadata (landing_page, page_title, page_views, visits, visitors, bounce_rate, pageviews_per_visit, time_on_page, avg_time_on_page, day, month, year, hour)
        VALUES (:landing_page, :page_title, :page_views, :visits, :visitors, :bounce_rate, :pageviews_per_visit, :time_on_page, :avg_time_on_page, :day, :month, :year, :hour)", $this->data);
}
And
protected function insert($sql, array $data) {
    $q = $this->_db_handler->prepare($sql);
    foreach ($data as $k => $v)
    {
        $q->bindValue(':' . $k, $v);
    }
    return $q->execute(); // return the result so save() actually returns something
}
It is neither PDO nor the way you are inserting that makes the inserts so slow, but the InnoDB engine. So you have 3 choices:
Wrap all inserts into a single transaction (see the sketch below).
Using root privileges, set the innodb_flush_log_at_trx_commit variable to 2, so that InnoDB writes the log to the OS file cache at commit and flushes it to disk only about once per second - it will make your inserts blazingly fast.
Run all the inserts in one query, as suggested by Manu.
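For option 1, a minimal sketch, assuming the insert() method above and a hypothetical $rows array holding the 1000 row arrays that are currently fed to save() one at a time:
// Sketch: wrap all row-by-row inserts in one transaction so InnoDB flushes once.
$this->_db_handler->beginTransaction();
try {
    foreach ($rows as $row) { // $rows: hypothetical array of 1000 associative row arrays
        $this->insert("
            INSERT INTO gadata (landing_page, page_title, page_views, visits, visitors, bounce_rate, pageviews_per_visit, time_on_page, avg_time_on_page, day, month, year, hour)
            VALUES (:landing_page, :page_title, :page_views, :visits, :visitors, :bounce_rate, :pageviews_per_visit, :time_on_page, :avg_time_on_page, :day, :month, :year, :hour)", $row);
    }
    $this->_db_handler->commit();   // a single flush to disk instead of 1000
} catch (Exception $e) {
    $this->_db_handler->rollBack(); // undo the whole batch on any failure
    throw $e;
}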
It might not be the best solution, but you can try constructing a query string like INSERT INTO [table] VALUES (r1c1,r1c2,r1c3),(r2c1,r2c2,r2c3) ... and executing it as one mysql_query (or, say, one query per few hundred rows). You might even verify the data programmatically while constructing the SQL query if it does not come from a trusted source.
Parameterized queries, by design, trade flexibility in the number of data items for execution safety.
You have at least two ways to mitigate this:
Build up the SQL, then execute it at once:
Something like:
$sql="INSERT INTO gadata (landing_page, page_title, page_views, visits, visitors, bounce_rate, pageviews_per_visit, time_on_page, avg_time_on_page, day, month, year, hour) VALUES ";
foreach ($all_data_rows as $data) {
if ($i==0) $value=""; else $value=",";
$sql.=$value."(:landing_page$i, :page_title$i, :page_views$i, :visits$i, :visitors$i, :bounce_rate$i, :pageviews_per_visit$i, :time_on_page$i, :avg_time_on_page$i, :day$i, :month$i, :year$i, :hour$i)";
$i++;
}
$i=0;
$q=$db_handler->prepare($sql);
foreach ($all_data_rows as $data) {
foreach ($data as $k => $v) {
$q->bindValue(":$k$i", $v);
}
$i++;
}
$q->execute();
Use a temporary table to avoid locking and disk overhead
First create a temporary table using the MEMORY (formerly HEAP) engine with the same structure as your target table, then insert into it. This will be much faster, as no locking and no disk I/O happens. Then run
INSERT INTO final_table SELECT * FROM temporary_table
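Putting it together, a minimal sketch of that approach, assuming a PDO connection in $db_handler and that gadata has no TEXT/BLOB columns (which the MEMORY engine does not support):
// Sketch: clone the target table's structure into an in-memory temporary table.
$db_handler->exec("CREATE TEMPORARY TABLE gadata_tmp LIKE gadata");
$db_handler->exec("ALTER TABLE gadata_tmp ENGINE=MEMORY");

// ... insert the 1000 rows into gadata_tmp here (cheap: no locking, no disk I/O) ...

// flush everything into the real table in a single statement, then clean up
$db_handler->exec("INSERT INTO gadata SELECT * FROM gadata_tmp");
$db_handler->exec("DROP TEMPORARY TABLE gadata_tmp");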
If these mitigations don't suffice, you will need to consider using non-parameterized queries for this use case. The usual caveats about escaping apply.
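For completeness, a rough sketch of that, assuming the same $all_data_rows and $db_handler as above and using PDO::quote() for escaping:
// Sketch: build one multi-row INSERT with every value escaped via PDO::quote().
// Assumes each row's values are already in the gadata column order; quote()
// wraps everything as strings, which MySQL coerces for numeric columns.
$tuples = array();
foreach ($all_data_rows as $data) {
    $quoted = array_map(array($db_handler, 'quote'), array_values($data));
    $tuples[] = '(' . implode(',', $quoted) . ')';
}
$db_handler->exec(
    "INSERT INTO gadata (landing_page, page_title, page_views, visits, visitors, bounce_rate, pageviews_per_visit, time_on_page, avg_time_on_page, day, month, year, hour) VALUES "
    . implode(',', $tuples)
);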
Related
I'm currently struggling with an issue that is overloading my database and delaying all page requests significantly.
Current scenario
- A certain Artisan command is scheduled to run every 8 minutes
- This command has to update a whole table with more than 30000 rows
- Every row will get a new value, which means 30000 queries have to be executed
- For about 14 seconds the server doesn't respond, presumably due to database overload
Here's the handle() method of the command:
public function handle()
{
    $thingies = /* Insert big query here */
    foreach ($thingies as $thing)
    {
        $resource = Resource::find($thing->id);
        if (!$resource)
        {
            continue;
        }
        $resource->update(['column' => $thing->value]);
    }
}
Is there any other approach to do this without making my page requests being delayed?
Your process is really inefficient and I'm not surprised it takes a long time to complete. To process 30,000 rows, you're making 60,000 queries (half to find out if the id exists, and the other half to update the row). You could be making just 1.
I have no experience with Laravel, so I'll leave it up to you to find out what functions in Laravel can be used to apply my recommendation. I just want to get you to understand the concepts.
MySQL allows you to submit a multi query: one command that executes many queries. It is drastically faster than executing individual queries in a loop. Here is an example that uses MySQLi directly (no 3rd-party framework such as Laravel):
// the 30,000 new values and the record IDs they belong to. These values
// MUST be escaped or known to be safe
$values = [
    ['id' => 145, 'fieldName' => 'a'], ['id' => 2, 'fieldName' => 'b']...
];

// %s and %d will be replaced with the column value and the id to look for
$qry_template = "UPDATE myTable SET fieldName = '%s' WHERE id = %d";

$queries = []; // array of all queries to be run
foreach ($values as $row) { // build and add queries
    $q = sprintf($qry_template, $row['fieldName'], $row['id']);
    array_push($queries, $q);
}

// combine all into one query
$combined = implode("; ", $queries);

// execute all queries at once
$mysqli->multi_query($combined);
I would look into how Laravel does multi queries and start there. The last time I implemented something like this, it took about 7 milliseconds to insert 3,000 rows. So updating 30,000 will definitely not take 14 seconds.
As an added bonus, there is no need to first run a query to figure out whether the ID exists. If it doesn't, nothing will be updated.
Thanks to @cyclone's comment I was able to update all the values in a single query.
It's not a perfect solution, but the query execution now takes roughly 8 seconds and only one connection is required, which means page requests are still being handled while the query executes.
I'm not marking this as the definitive answer since there might be improvements to make.
$ids = [];
$caseQuery = '';
foreach ($thingies as $thing)
{
    if (strlen($caseQuery) == 0)
    {
        $caseQuery = '(CASE WHEN id = '. $thing->id . ' THEN \''. $thing->rank .'\' ';
    }
    else
    {
        $caseQuery .= ' WHEN id = '. $thing->id . ' THEN \''. $thing->rank .'\' ';
    }
    array_push($ids, $thing->id);
}
$caseQuery .= ' END)';

// Execute query
DB::update('UPDATE <table> SET <value> = '. $caseQuery . ' WHERE id IN ('. implode(',', $ids) .')');
There is a huge two-dimensional array containing 500k one-dimensional sub-arrays; each sub-array contains 5 elements.
Now it is my job to insert all the data into an SQLite database.
function insert_data($array) {
    global $db;
    $dbh = new PDO("sqlite:{$db}");
    $sql = "INSERT INTO quote (f1,f2,f3,f4,f5) VALUES (?,?,?,?,?)";
    $query = $dbh->prepare($sql);
    foreach ($array as $item) {
        $query->execute(array_values($item));
    }
    $dbh = null;
}
I want to optimize the insert process: the execute() call currently runs 500k times. How can I cut that down, ideally to a single call?
The idea is to avoid running a separate transaction for each insert, because that would be very slow indeed. So start and commit a transaction, say, for every 10k records.
// $query is the prepared statement from the question's insert_data()
$dbh->beginTransaction();
$counter = 0;
foreach ($array as $item) {
    $query->execute(array_values($item));
    // commit and reopen the transaction every 10k rows
    if (++$counter % 10000 == 0) {
        $dbh->commit();
        $dbh->beginTransaction();
    }
}
$dbh->commit();
Another solution: you can write the array to a CSV file and then just import it.
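A rough sketch of that CSV route, assuming the sqlite3 command-line tool is installed and $db/$array are the variables from the question:
// Sketch: dump the array to a CSV file, then bulk-import it with the sqlite3 CLI.
$csv = tempnam(sys_get_temp_dir(), 'quote_');
$fh = fopen($csv, 'w');
foreach ($array as $item) {
    fputcsv($fh, array_values($item)); // one CSV line per 5-element sub-array
}
fclose($fh);

// Every argument after the database file is run as if typed at the sqlite3 prompt.
// $db and $csv are trusted, space-free paths here (tempnam() gives a safe name).
exec("sqlite3 {$db} '.mode csv' '.import {$csv} quote'");
unlink($csv);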
If you are using a newer version of SQLite (3.7.11+), it supports batch inserts:
INSERT INTO quote (f1,f2,f3,f4,f5) VALUES
(?,?,?,?,?),
(?,?,?,?,?),
(?,?,?,?,?);
You can use this to chunk your array into groups and do batch inserts this way (see the sketch below).
As pointed out by Axalix, you should also wrap the whole operation in a transaction.
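A sketch of that chunking, assuming $dbh and $array from the question; 100 rows per statement keeps the number of bound parameters (100 x 5 = 500) under SQLite's default limit of 999:
// Sketch: insert the 500k rows in multi-row batches inside one transaction.
$dbh->beginTransaction();
foreach (array_chunk($array, 100) as $chunk) {
    // one "(?,?,?,?,?)" group per row in this chunk
    $placeholders = implode(',', array_fill(0, count($chunk), '(?,?,?,?,?)'));
    $stmt = $dbh->prepare("INSERT INTO quote (f1,f2,f3,f4,f5) VALUES $placeholders");

    // flatten the chunk's rows into one flat parameter list
    $params = array();
    foreach ($chunk as $item) {
        foreach (array_values($item) as $v) {
            $params[] = $v;
        }
    }
    $stmt->execute($params);
}
$dbh->commit();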
I have an array stored in a variable $contactid. I need to run this query to insert a row for each contact_id in the array. What is the best way to do this? Here is the query I need to run...
$contactid = $_POST['contact_id'];
$eventid   = $_POST['event_id'];
$groupid   = $_POST['group_id'];

$query = "INSERT INTO attendance (event_id,contact_id,group_id) VALUES ('$eventid','$contactid','$groupid')";
mysql_query($query);
Use a foreach loop.
$query = "INSERT INTO attendance (event_id,contact_id,group_id) VALUES ";
foreach($contactid as $value)
{
$query .= "('{$eventid}','{$value}','{$groupid}'),";
}
mysql_query(substr($query, 0, -1));
The idea here is to concatenate your query string and make only one query to the database; each value set is separated by a comma.
Since no one has stated this yet: you actually cannot do this:
$query = '
INSERT INTO [Table] ([Column List])
VALUES ([Value List 1]);
INSERT INTO [Table] ([Column List])
VALUES ([Value List 2]);
';
mysql_query($query);
because this is blocked in the mysql_query code to prevent SQL injection: you cannot have a semicolon within the query parameter given to mysql_query. With the following exception, taken from the manual comments:
The documentation claims that "multiple queries are not supported". However, multiple queries seem to be supported. You just have to pass flag 65536 as mysql_connect's 5th parameter (client_flags). This value is defined in /usr/include/mysql/mysql_com.h:
#define CLIENT_MULTI_STATEMENTS (1UL << 16) /* Enable/disable multi-stmt support */
Executed with multiple queries at once, the mysql_query function will return a result only for the first query. The other queries will be executed as well, but you won't have a result for them.
That is undocumented and unsupported behaviour, however, and easily opens your code to SQL injections. What you can do with mysql_query, instead, is
$query = '
INSERT INTO [Table] ([Column List])
VALUES ([Value List 1])
, ([Value List 2])
[...]
, ([Value List N])
';
mysql_query($query);
so you can actually insert multiple rows with one query and one INSERT statement. In this answer there's a code example for it which doesn't concatenate to a string in a loop, which is better than what's suggested in this thread.
However, disregarding all the above, you're probably better of still to use a prepared statement, like
$stmt->prepare("INSERT INTO mytbl (fld1, fld2, fld3, fld4) VALUES(?, ?, ?, ?)");
foreach($myarray as $row)
{
$stmt->bind_param('idsb', $row['fld1'], $row['fld2'], $row['fld3'], $row['fld4']);
$stmt->execute();
}
$stmt->close();
Use something like the following. Please note that you shouldn't be using mysql_* functions anymore, and that your code is susceptible to injection.
for ($i = 0; $i < count($contactid); $i++) {
$query="INSERT INTO attendance (event_id,contact_id,group_id) VALUES ('$eventid','$contactid[$i]','$groupid')";
mysql_query($query);
}
I'm not sure running multiple queries is the best thing to do, so I won't recommend a for loop that runs one query per element of the array. I would rather build a loop that appends the new elements to a string, which then gets passed to a single query. If you can give us a short example of your DB structure and how you'd like it to look (i.e. how the array should map onto the table), I could give you an example loop.
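For illustration, a rough sketch of that string-building loop, assuming the $contactid, $eventid and $groupid variables from the question, with values that are already validated:
// Build one value tuple per contact, then send a single INSERT.
$tuples = array();
foreach ($contactid as $id) {
    $tuples[] = "('{$eventid}','{$id}','{$groupid}')";
}
$query = "INSERT INTO attendance (event_id,contact_id,group_id) VALUES "
       . implode(',', $tuples);
mysql_query($query);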
Cheers!
What about:
$contactIds = $_POST['contact_id'];
$eventIds = $_POST['event_id'];
$groupIds = $_POST['group_id'];
foreach($contactIds as $key => $value)
{
$currentContactId = $value;
$currentEventId = $eventIds[$key];
$currentGroupId = $groupIds[$key];
$query="INSERT INTO attendance (event_id,contact_id,group_id) VALUES ('$currentEventId','$currentContactId','$currentGroupId')";
mysql_query($query);
}
Well, you could refactor that to insert everything in a single query, but you get the idea.
I have a list of users that needs to be iterated over with a foreach loop, inserting a new row into a DB table for each user.
$data['entity_classid'] = $classid;
$data['notification_context_id'] = $context_id;
$data['entity_id'] = $entity_id;
$data['notification_by'] = $userid;
$data['actionid'] = $actionid;
$data['is_read'] = 0;
$data['createdtime'] = time();
foreach($classassocusers as $users){
$data['notification_to'] = $users->userid;
$DB->insert_record('homework.comments',$data,false);
}
So, regarding the insert query as given above:
Is it good practice or bad practice?
Shall I place any delay after every insert query execution?
What are the pros and cons of doing so?
Thanks
Using the query like that is a good practice in your case. You will have to insert a list of users anyway, so you will have to process many queries. No way around this!
I have no idea why you would want to place a delay after each insert. These calls are synchronous, so your code is "paused" anyway while the query executes; your loop will not continue until the query finishes. Delaying it on purpose just slows your code down further while nothing is progressing.
That said, another way to do this is by executing just one query:
$user_data = "";
foreach($classassocusers as $users) {
$user_data .= "('" . $users->userid . "', '" . $users->name . "'), ";
}
$user_data = substr($user_data, 0, strlen($user_data) - 2);
$query = "INSERT INTO `homework.comments` ( `id`, `name` )
VALUES " . $user_data;
That's supposed to make a query like:
INSERT INTO `homework.comments` ( `id`, `name` )
VALUES ('1', 'John'),
('2', 'Jeffrey'),
('3', 'Kate');
(By the way, I made some assumptions regarding your $users object and your table structure. But I'm sure you catch the idea)
It all depends on your requirements.
If you run 500,000 of these updates, taking 5 minutes each time, every 15 minutes, your database will have a hard time. If you do this for 1,000 users every 15 minutes, this is a great approach.
When performance is demanded, consider the following (see the sketch below):
Combine the INSERTs using the multi-row VALUES syntax, processing every 500-1000 rows as one statement.
Add a small pause after each query.
Otherwise, this is an excellent approach!
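A rough sketch of that batching, assuming the $classassocusers and $data variables from the question and that your Moodle version provides the bulk $DB->insert_records() helper:
// Sketch: insert the notifications in batches of 500 with a short pause in between.
foreach (array_chunk($classassocusers, 500) as $batch) {
    $records = array();
    foreach ($batch as $users) {
        $record = $data;                             // copy the shared fields
        $record['notification_to'] = $users->userid;
        $records[] = $record;
    }
    $DB->insert_records('homework.comments', $records); // one bulk insert per batch
    usleep(100000); // 100 ms breather so other queries get a turn
}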
I have about 14000 rows of comma-separated values that I am trying to insert into an SQLite table using PHP PDO, like so:
<?php
// create a PDO object
$dbh = new PDO('sqlite:mydb.sdb');
$lines = file('/csv/file.txt'); // import lines as array
foreach ($lines as $line) {
    $line_array = explode(',', $line); // create an array of comma-separated values in each line
    $values = '';
    foreach ($line_array as $l) {
        $values .= "'$l', ";
    }
    $values = substr($values, 0, -2); // get rid of the last comma and whitespace
    $query = "insert into sqlite_table values ($values)"; // plug the values into a query statement
    $dbh->query($query); // run the query
}
?>
This query takes a long time, and to run it without interruption, I would have to use PHP-CLI.
Is there a better (faster) way to do this?
You will see a good performance gain by wrapping your inserts in a single transaction. If you don't do this, SQLite treats each insert as its own transaction.
<?php
// create a PDO object
$dbh = new PDO('sqlite:mydb.sdb');

// Start transaction
$dbh->beginTransaction();

$lines = file('/csv/file.txt'); // import lines as array
foreach ($lines as $line) {
    $line_array = explode(',', $line); // create an array of comma-separated values in each line
    $values = '';
    foreach ($line_array as $l) {
        $values .= "'$l', ";
    }
    $values = substr($values, 0, -2); // get rid of the last comma and whitespace
    $query = "insert into sqlite_table values ($values)"; // plug the values into a query statement
    $dbh->query($query); // run the query
}

// commit transaction
$dbh->commit();
?>
Start a transaction before the loop and commit it after the loop.
The way your code works now, it starts a transaction on every insert.
If you're looking for a bit more speed, use prepare/bind/execute, so the SQL engine doesn't have to parse the text string each time.
$name = $age = '';
$insert_stmt = $db->prepare("insert into table (name, age) values (:name, :age)");
// bindParam binds by reference, so updating $name/$age before each execute() is enough
$insert_stmt->bindParam(':name', $name);
$insert_stmt->bindParam(':age', $age);
// do your loop here, e.g. with fgetcsv ($fh: an open handle on your CSV file)
while (($row = fgetcsv($fh)) !== false) {
    list($name, $age) = $row;
    $insert_stmt->execute();
}
It's counter-intuitive that you do the binding outside the loop, but this is one reason why this method is so fast: you're basically saying "execute this pre-compiled query using data from these variables" (bindParam binds by reference), so it doesn't even need to move the data around internally. You also want to avoid re-parsing the query, which is the problem with something like "insert into table (name) values ('$name')": every such query sends the entire text string to the database to be re-parsed.
One more thing to speed it up -- wrap the whole loop in a transaction, then commit the transaction when the loop is finished.
From the SQLite FAQ:
Transaction speed is limited by disk drive speed because (by default) SQLite actually waits until the data really is safely stored on the disk surface before the transaction is complete. That way, if you suddenly lose power or if your OS crashes, your data is still safe. For details, read about atomic commit in SQLite.
[...]
Another option is to run PRAGMA synchronous=OFF. This command will cause SQLite to not wait on data to reach the disk surface, which will make write operations appear to be much faster. But if you lose power in the middle of a transaction, your database file might go corrupt.
I'd say this last paragraph is what you need.
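If you want to try that from PDO, a minimal sketch (it trades durability for speed, exactly as the FAQ warns):
// Sketch: turn off synchronous writes for this connection before the bulk insert.
// As the FAQ warns, losing power mid-write can corrupt the database file.
$dbh = new PDO('sqlite:mydb.sdb');
$dbh->exec('PRAGMA synchronous = OFF');
// ... run the inserts here (ideally still wrapped in one transaction) ...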
EDIT: Not sure about this, but I believe using sqlite_unbuffered_query() should do the trick.