Select a big number of rows in MySQL - PHP

I have a problem: my project has 10 databases, and each database has a members table. Each members table holds about 2 million rows, so across the 10 databases there are roughly 20 million rows. I tried this:
foreach ($aDataBases as $database) {
    $sSql = sprintf('SELECT nom, prenom, naiss FROM `%s`', $sTableName);
    $rResult = Mysqli::query($sSql, $database);
    while ($aRecord = $rResult->fetch_array(MYSQLI_ASSOC)) {
        $aUsers['lastName']  = $aRecord['nom'];
        $aUsers['firstName'] = $aRecord['prenom'];
        $aUsers['birthDate'] = $aRecord['naiss'];
        $aTotalUsers[] = $aUsers;
    }
}
When I run it I get the error "Allowed memory size of 134217728 bytes exhausted". If, for example, I add LIMIT 100 to the SELECT, it works perfectly. Can you help me please?

Just put your code in a loop and make SQL calls of, say, 1000 rows each. Loop until all rows have been printed. Some people will tell you just to raise your memory limit, but there's always a physical limit you can't get past.
I won't code that for you, because you're a PHP programmer and you get the idea. Here's the pseudocode, though:
base = 0
while (rows = getrows(base, 1000))
    foreach row in rows
        print row
    base = base + 1000
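For completeness, here is a minimal PHP sketch of that idea in the style of the snippet above (the Mysqli::query() wrapper, table name, and column names are taken from the question; the 1000-row batch size is an arbitrary choice):
$iBatchSize = 1000;

foreach ($aDataBases as $database) {
    $iOffset = 0;
    do {
        // Fetch one batch of rows instead of the whole table at once
        $sSql = sprintf(
            'SELECT nom, prenom, naiss FROM `%s` LIMIT %d, %d',
            $sTableName,
            $iOffset,
            $iBatchSize
        );
        $rResult = Mysqli::query($sSql, $database);

        $iRows = 0;
        while ($aRecord = $rResult->fetch_array(MYSQLI_ASSOC)) {
            // Process or print the row here instead of collecting everything
            // in $aTotalUsers; accumulating ~20 million rows in one array
            // will exhaust the memory limit no matter how you fetch them.
            $iRows++;
        }

        $iOffset += $iBatchSize;
    } while ($iRows === $iBatchSize);
}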

Related

PHP process MySQL query row-by-row

I have put together a PHP script that counts the words in a MySQL text field and updates another field accordingly.
It works well with relatively small tables, but when I tried it on a really big table (10M records) I of course got "PHP Fatal error: Allowed memory size of 134217728 bytes exhausted".
Could somebody hint how to modify the script below to process the data row by row?
<?php
$con1 = mysqli_connect('localhost', 'USERNAME', 'PASSWORD', 'DATABASE');
if (!$con1) {
    die('Could not connect: ' . mysqli_connect_error());
}

$sql = "SELECT id FROM TableName";
$result = mysqli_query($con1, $sql);

while ($row = mysqli_fetch_assoc($result)) {
    $to = $row["id"];
    $sql1 = "SELECT textfield FROM TableName WHERE id = '$to'";
    $result1 = mysqli_query($con1, $sql1);
    $row1 = mysqli_fetch_assoc($result1);
    $words = str_word_count($row1['textfield'], 0);
    $sql2 = "UPDATE TableName SET wordcount = '$words' WHERE id = '$to'";
    $result2 = mysqli_query($con1, $sql2);
}
mysqli_close($con1);
?>
MySQL queries support the clause LIMIT o, n, so you can run
SELECT id FROM TableName LIMIT 0, 10
for example to get only 10 rows from the start. The first number is the offset (the index you start from) and the second is the number of rows you expect to get. These are the ideas you need in order to make this work (a minimal sketch follows the list):
- you will need to write a loop
- in the loop you always fetch n rows (n could be 1, as you wanted, or more)
- at each step you increase o by n, so the new offset starts where the previous results ended
- you should enforce an order, for example ORDER BY id, so the paging is deterministic
- you can wrap this loop around most of your existing code
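A hedged sketch of that loop applied to the word-count script, reusing the $con1 connection and the table/column names from the question (the 1000-row batch size is arbitrary, and textfield is pulled in the same query to avoid the extra per-row SELECT):
$batchSize = 1000;
$offset = 0;

do {
    // Process the table one batch at a time, in a stable order
    $sql = "SELECT id, textfield FROM TableName ORDER BY id LIMIT $offset, $batchSize";
    $result = mysqli_query($con1, $sql);

    $rows = 0;
    while ($row = mysqli_fetch_assoc($result)) {
        $words = str_word_count($row['textfield'], 0);
        $id = (int) $row['id'];
        mysqli_query($con1, "UPDATE TableName SET wordcount = '$words' WHERE id = '$id'");
        $rows++;
    }

    mysqli_free_result($result);
    $offset += $batchSize;
} while ($rows === $batchSize);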

PHP - Exceeding allowed memory when processing large dataset

I have a list of data with 999,000 records.
I have a select query and a while loop to get the data, and I use array_push to add each retrieved row to one array.
I then want to process this array 1000 values at a time.
My problem is that when I use array_push with this much data I get the error:
Fatal Error: Allowed Memory Size of 134217728 Bytes
How can I optimize my code to resolve my problem?
My code is below:
$sql  = "select customer_id";
$sql .= " from";
$sql .= " t_customer t1";
$sql .= " inner join t_mail_address t2 using(mid, customer_id)";

$result = $conn->query($sql);
$customerArray = array();
while ($row = $result->fetch(PDO::FETCH_ASSOC)) {
    array_push($customerArray, $row);
}

// Execute every 1000 records
foreach (array_chunk($customerArray, 1000) as $execCustomerArray) {
    // My code to execute for every record.
    // ....
}
I'm unsure whether it will fix anything, but one thing I will say is that your way of pushing all the records into an array is silly.
You're using fetch to fetch them one by one and then adding them all to an array; why on earth aren't you just using PDOStatement::fetchAll()?
Example:
$sql  = "select customer_id";
$sql .= " from";
$sql .= " t_customer t1";
$sql .= " inner join t_mail_address t2 using(mid, customer_id)";

$result = $conn->query($sql);
$customerArray = $result->fetchAll(PDO::FETCH_ASSOC);

// Execute every 1000 records
foreach (array_chunk($customerArray, 1000) as $execCustomerArray) {
    // My code to execute for every record.
    // ....
}
This may not fix your memory issue, because we can't see what the heavy lifting is for every customer record, but I will say that the while loop you had, silly as it was, is most likely not the cause of your memory issue.
Depending on whether this is a CLI script or a web page, you could also use an incremental loop of sorts together with MySQL's LIMIT clause to implement basic paging for your data, preventing it from all coming into memory at once; a sketch of that approach follows.
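For the paging idea, a minimal PDO sketch reusing the $conn connection and the query from the question (the ORDER BY and the 1000-row page size are assumptions added so that the paging is deterministic):
$pageSize = 1000;
$offset = 0;

do {
    // Pull one page of customer ids at a time instead of the whole result set
    $sql  = "select customer_id";
    $sql .= " from";
    $sql .= " t_customer t1";
    $sql .= " inner join t_mail_address t2 using(mid, customer_id)";
    $sql .= " order by customer_id";
    $sql .= " limit $offset, $pageSize";

    $execCustomerArray = $conn->query($sql)->fetchAll(PDO::FETCH_ASSOC);

    // My code to execute for every record.
    // ....

    $offset += $pageSize;
} while (count($execCustomerArray) === $pageSize);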

SQL query execution time dramatically increasing

My problem is explained below.
This is my PHP code running on my server right now :
$limit = 10000;
$annee = '2017';

// Counting the lines I need to delete
$sql = " SELECT COUNT(*) FROM historisation.cdr_".$annee." a
         INNER JOIN transatel.cdr_transatel_v2 b ON a.id_cdr = b.id_cdr ";
$t = $db_transatel->selectAll($sql);

// The number of lines I have to delete
$i = $t[0][0];

do {
    if ($i < $limit) {
        $limit = $i;
    }

    // The problem is coming from this delete
    $selectFromHistoryAndDelete = " DELETE FROM transatel.cdr_transatel_v2
        WHERE id_cdr IN (
            SELECT a.id_cdr FROM historisation.cdr_".$annee." a
            INNER JOIN (SELECT id_cdr FROM historisation.cdr_transatel_v2) b ON a.id_cdr = b.id_cdr
        )
        LIMIT " . $limit;
    $delete = $db_transatel->exec($selectFromHistoryAndDelete, $params);

    $i = $i - $limit;
} while ($i > 0);
Here are the execution times of the query: in the first 195 loops, each delete took between 13 and 17 seconds.
The time increased to 73 seconds on the 195th loop and to 1305 seconds on the 196th loop.
The query has now been running for 2000 seconds.
The query is deleting rows in a test table that no one else is using right now.
I'm deleting rows 10,000 at a time so that each query stays quick and does not overload the server.
I am wondering why the execution time increases like that. I thought it would get quicker towards the end, because I assumed the inner join would be faster as there are fewer and fewer rows left in the table.
Does anyone have an idea?
Edit: The table engine is MyISAM.
Based on your latest comment, the inner join is redundant, since you're deleting from the table that contains the values you're joining on. In essence you're having to process b.id_cdr = a.id_cdr twice, since the number of values compared on cdr_2017 is not changed by the inner join, just the number of values queried to be deleted.
As for the cause of the incremental slowness, it is because you are manually performing the same function as SELECT cdr_id FROM cdr_2017 LIMIT 10000 OFFSET x.
That is to say, your query has to perform a full-table scan on cdr_2017 to determine the id values to delete. As you delete the values, the SQL optimizer has to move further through the cdr_2017 table to retrieve the values.
Resulting in
DELETE FROM IN(1,2,3,...10000)
DELETE FROM IN(1,2,3,...20000)
...
DELETE FROM IN(1,2,3,...1000000)
Assuming cdr_id is the incremental primary key, to resolve the issue you could use the last index retrieved from cdr_2017 to filter the selected values.
This will be much faster, as a full-table scan is no longer required to validate the joined records, since you're now utilizing an indexed value on both sides of the query.
$sql = " SELECT COUNT(a.id_cdr) FROM historisation.cdr_".$annee." a
         INNER JOIN transatel.cdr_transatel_v2 b ON a.id_cdr = b.id_cdr ";
$t = $db_transatel->selectAll($sql);

// The number of lines I have to delete
$i = $t[0][0];

// Set the starting index
$previous = 0;

do {
    if ($i < $limit) {
        $limit = $i;
    }

    $selectFromHistoryAndDelete = 'DELETE d
        FROM transatel.cdr_transatel_v2 AS d
        JOIN (
            SELECT @previous := cdr_id AS cdr_id
            FROM historisation.cdr_2017
            WHERE cdr_id > ' . $previous . '
            ORDER BY cdr_id
            LIMIT 10000
        ) AS a
        ON a.cdr_id = d.cdr_id';
    $db_transatel->exec($selectFromHistoryAndDelete, $params);

    // Retrieve the last id selected in cdr_2017 to use in the next iteration
    $v = $db_transatel->selectAll('SELECT @previous'); // prefer fetchColumn
    $previous = $v[0][0];

    $i = $i - $limit;
} while ($i > 0);

// Optionally reclaim table space
$db_transatel->exec('OPTIMIZE TABLE transatel.cdr_transatel_v2', $params);
You could also refactor to use cdr_id > $previous AND cdr_id < $last to remove the ORDER BY and LIMIT clauses, which should also improve performance (see the sketch below).
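Sketching that range-based variant (the fixed $step width and the MAX(cdr_id) lookup are assumptions; gaps in the id sequence simply mean some batches delete fewer rows):
$step = 10000;
$previous = 0;

// Fetch the upper bound of the id range once, up front
$v = $db_transatel->selectAll('SELECT MAX(cdr_id) FROM historisation.cdr_2017');
$max = (int) $v[0][0];

while ($previous < $max) {
    $last = $previous + $step;

    // Both sides of the range comparison are indexed,
    // so no ORDER BY ... LIMIT scan is needed
    $sql = 'DELETE d
        FROM transatel.cdr_transatel_v2 AS d
        JOIN historisation.cdr_2017 AS a
          ON a.cdr_id = d.cdr_id
        WHERE a.cdr_id > ' . $previous . ' AND a.cdr_id <= ' . $last;
    $db_transatel->exec($sql, $params);

    $previous = $last;
}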
Though I would like to note that the MyISAM engine takes a table-level lock on cdr_transatel_v2 for the duration of each delete. Due to the way MySQL handles concurrent sessions and queries, there is not much to gain from a batched delete like this; it is really only beneficial with InnoDB and transactions, especially when using PHP with FastCGI as opposed to Apache mod_php. Other queries that do not touch cdr_transatel_v2 will still be executed, while write operations on cdr_transatel_v2 will be queued. If using mod_php, I would reduce the limit to 1,000 records to keep queue times down.
For more information see https://dev.mysql.com/doc/refman/5.7/en/internal-locking.html#internal-table-level-locking
Alternative approach.
Considering the large number of records to be deleted (when the deleted records outnumber those being kept), it is more beneficial to invert the operation and use INSERT instead of DELETE.
# Ensure the storage table doesn't exist already
DROP TABLE IF EXISTS transatel.cdr_transatel_temp;

# Duplicate the structure of the original table
CREATE TABLE transatel.cdr_transatel_temp
LIKE transatel.cdr_transatel_v2;

# Copy the records that are NOT to be deleted into the storage table
INSERT INTO transatel.cdr_transatel_temp
SELECT d.*
FROM transatel.cdr_transatel_v2 AS d
LEFT JOIN historisation.cdr_2017 AS b
    ON b.cdr_id = d.cdr_id
WHERE b.cdr_id IS NULL;

# Swap the storage table in for the original table
RENAME TABLE transatel.cdr_transatel_v2 TO transatel.backup,
             transatel.cdr_transatel_temp TO transatel.cdr_transatel_v2;

# Remove the original table
DROP TABLE transatel.backup;

MySQL query eating a large amount of disk space when using LIMIT

I'm trying to query a very large table (some 35+ million rows) to process each row one by one. Because I can't pull the full table into PHP at once (out of memory), I'm using LIMIT in a loop, but every time it tries to query past the 700K mark it throws an out-of-disk-space error (error 28):
select * from dbm_new order by id asc limit 700000,10000
I'm pulling 10K rows at a time into PHP, and even if I make it pull 100K rows it still throws the same error when trying to start at row 700K. I can see it's eating a huge amount of disk space.
In PHP I'm freeing the result set after each loop:
mysql_free_result($res);
But it's not a PHP-related issue; I've run the query in MySQL alone and it gives the same error.
Why does starting the LIMIT at the 700K mark eat up so much disk space (I'm talking over 47 GB here)? Surely it doesn't need that much space. What other options do I have?
Here's the code:
$start = 0;
$increment = 10000;
$hasResults = true;

while ($hasResults) {
    $sql = "select * from dbm_new order by id asc limit $start,$increment ";
    ....
}
You can use the PK instead of OFFSET to get chunks of data:
$start = 0;
while (1) {
    $sql = "SELECT * FROM table WHERE id > $start ORDER BY id ASC LIMIT 10000";
    // get records...
    if (empty($rows)) break;
    foreach ($rows as $row) {
        // do stuff...
        $start = $row['id'];
    }
}
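A filled-in version of that idea, sketched with the legacy mysql_* functions the question uses (the table and column names come from the question; the connection is assumed to already exist):
$start = 0;

while (1) {
    // Keyset pagination: seek by primary key instead of scanning past an ever-growing OFFSET
    $sql = "SELECT * FROM dbm_new WHERE id > $start ORDER BY id ASC LIMIT 10000";
    $res = mysql_query($sql);

    if (mysql_num_rows($res) == 0) {
        mysql_free_result($res);
        break;
    }

    while ($row = mysql_fetch_assoc($res)) {
        // ... process the row ...
        $start = $row['id'];
    }

    mysql_free_result($res);
}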

PHP how to run sql query one part at a time?

I have a table with roughly 1 million rows. I'm doing a simple program that prints out one field from each row. However, when I started using mysql_pconnect and mysql_query, the query would take a long time; I am assuming the query needs to finish before I can print out even the first row. Is there a way to process the data a bit at a time?
--Edited--
I am not looking to retrieve a small subset of the data; I'm looking for a way to process the data a chunk at a time (say, fetch 10 rows, print 10 rows, fetch 10 rows, print 10 rows, and so on) rather than waiting for the query to retrieve 1 million rows (who knows how long that takes) and only then start printing.
Printing one million fields will take some time. Retrieving one million records will take some time. Time adds up.
Have you profiled your code? I'm not sure using limit would make such a drastic difference in this case.
Doing something like this
while ($row = mysql_fetch_object($res)) {
    echo $row->field."\n";
}
outputs one record at a time as it is fetched, so it does not wait for the whole result set to be printed before producing output. (Note that with a plain buffered mysql_query the complete result set is still transferred into PHP's memory first; mysql_unbuffered_query avoids that.)
If you are dealing with a browser, you will need something more, such as this:
ob_start();
$i = 0;
while ($row = mysql_fetch_object($res)) {
    echo $row->field."\n";
    if (($i++ % 1000) == 0) {
        ob_flush();
    }
}
ob_end_flush();
Do you really want to print one million fields?
The customary solution is to use some kind of output pagination in your web application, showing only part of the result. On SELECT queries you can use the LIMIT keyword to return only part of the data. This is basic SQL stuff, really. Example:
SELECT * FROM table WHERE (some conditions) LIMIT 40,20
shows 20 entries, starting from the 40th (off-by-one mistakes on my part are possible).
It may be necessary to use ORDER BY along with LIMIT to prevent the ordering from randomly changing under your feet between requests.
This is commonly needed for pagination. You can use the LIMIT keyword in your SELECT query. From the MySQL documentation on SELECT:
The LIMIT clause can be used to constrain the number of rows returned by the SELECT statement. LIMIT takes one or two numeric arguments, which must both be nonnegative integer constants (except when using prepared statements).
With two arguments, the first argument specifies the offset of the first row to return, and the second specifies the maximum number of rows to return. The offset of the initial row is 0 (not 1):
SELECT * FROM tbl LIMIT 5,10; # Retrieve rows 6-15
To retrieve all rows from a certain offset up to the end of the result set, you can use some large number for the second parameter. This statement retrieves all rows from the 96th row to the last:
SELECT * FROM tbl LIMIT 95,18446744073709551615;
With one argument, the value specifies the number of rows to return from the beginning of the result set:
SELECT * FROM tbl LIMIT 5; # Retrieve first 5 rows
In other words, LIMIT row_count is equivalent to LIMIT 0, row_count.
You might be able to use mysqli::use_result combined with a flush to output the data set to the browser incrementally. I know flush can be used to send data to the browser a piece at a time, as I have used it before to do just that; however, I am not sure whether mysqli::use_result is the right function for retrieving a result set incrementally.
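A minimal sketch of that combination, assuming a $mysqli connection and the single-field query from the question (MYSQLI_USE_RESULT makes the client fetch rows from the server as it iterates instead of buffering the whole result set first):
$result = $mysqli->query('SELECT `field` FROM `table`', MYSQLI_USE_RESULT);

$i = 0;
while ($row = $result->fetch_assoc()) {
    echo $row['field'] . "\n";
    // Push the output to the browser every 1000 rows
    if ((++$i % 1000) == 0) {
        flush();
    }
}

// The unbuffered result must be fully fetched and freed
// before another query can run on this connection
$result->free();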
This is how I do something like that in Oracle. I'm not sure how it would cross over:
declare
    my_counter integer := 0;
begin
    for cur in (
        select id from table
    ) loop
        begin
            -- do whatever you're trying to do
            update table set name = 'steve' where id = cur.id;

            my_counter := my_counter + 1;
            if my_counter > 500 then
                my_counter := 0;
                commit;
            end if;
        end;
    end loop;
    commit;
end;
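For MySQL, a rough PHP equivalent of that pattern might look like the sketch below, using PDO transactions to commit every 500 updates (the $pdo connection and the table/column names are placeholders; batching commits like this only pays off with a transactional engine such as InnoDB):
$counter = 0;
$pdo->beginTransaction();

$select = $pdo->query('SELECT id FROM `table`');
$update = $pdo->prepare('UPDATE `table` SET name = ? WHERE id = ?');

while ($row = $select->fetch(PDO::FETCH_ASSOC)) {
    // Do whatever you're trying to do
    $update->execute(array('steve', $row['id']));

    if (++$counter > 500) {
        $counter = 0;
        $pdo->commit();           // flush this batch of updates
        $pdo->beginTransaction(); // start the next batch
    }
}

$pdo->commit();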
An example using the basic mysql driver.
define( 'CHUNK_SIZE', 500 );
$result = mysql_query( 'select count(*) as num from `table`' );
$row = mysql_fetch_assoc( $result );
$totalRecords = (int)$row['num'];
$offsets = ceil( $totalRecords / CHUNK_SIZE );
for ( $i = 0; $i < $offsets; $i++ )
{
    $result = mysql_query( "select * from `table` limit " . CHUNK_SIZE . " offset " . ( $i * CHUNK_SIZE ) );
    while ( $row = mysql_fetch_assoc( $result ) )
    {
        // your per-row operations here
    }
    unset( $result, $row );
}
This will iterate over your entire row volume, but do so only 500 rows at a time to keep memory usage down.
It sounds like you're hitting the limits of various buffer sizes within the MySQL server... Some things you could do are to select only the field you need in the SQL statement to reduce the buffer size, or to play around with the various server settings.
Or you can use a pagination-like method, but have it output everything on one page:
(pseudocode)
function q($part) {
    $off = $part * SIZE_OF_PARTITIONS;
    $size = SIZE_OF_PARTITIONS;
    return execute_and_return_sql("SELECT `field` FROM `table` LIMIT $off, $size");
}

$ii = 0;
while ($elements = q($ii)) {
    print_fields($elements);
    $ii++;
}
Use mysql_unbuffered_query(), or if you are using PDO, make sure PDO::MYSQL_ATTR_USE_BUFFERED_QUERY is set to false.
Edit: and as others have said, you may wish to combine this with flushing your output buffer after each batch of processing, depending on your circumstances.
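A minimal sketch of the PDO variant (connection details are placeholders; with buffering disabled, rows stream from the server as you fetch them, and no other query can run on the same connection until the statement is fully fetched or closed):
$pdo = new PDO('mysql:host=localhost;dbname=DATABASE', 'USERNAME', 'PASSWORD');
$pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);

$stmt = $pdo->query('SELECT `field` FROM `table`');
$i = 0;
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    echo $row['field'] . "\n";
    if ((++$i % 1000) == 0) {
        flush(); // push output to the browser in batches
    }
}
$stmt->closeCursor();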
