Looping through large data array in PHP

I have an array containing 100,000 users' personal info (ID, name, email, etc.). I need to loop through each row of the array and insert a MySQL record into a table based on the row data. My problem is that I am running out of memory after about 70,000 rows.
My code:
if (!empty($users)) {
    $c = 0;
    foreach ($users as $user) {
        $message = // Some code to create custom email
        queue_mail_to_send($user->user_email, $subject, $message, $db_options, $mail_options, $mail_queue);
    }
}
Background:
I am building an email system which sends out an email to the users of my site. The code above loops through the array of users and executes the function queue_mail_to_send, which inserts a MySQL row into an email-queue table. (I am using a PEAR library to stagger the email sending.)
Question:
I know that I am simply exhausting the memory here by trying to do too much in one execution. So does anybody know a better approach to this rather than trying to execute everything in one big loop?
Thanks

I think reducing the payload of the script will be cumbersome and will not give you a satisfying result. If you have any possibility to do so, I would advise you to log which rows you have already processed, and have a script run the next x rows. If you can use a cronjob, you can stage the mailing and let the cronjob add mails to the queue every 5 minutes, until all users are processed.
The easiest way would be to store, somewhere, the highest user id you have processed. I would not advise you to store the number of users processed, because between batches a user can be added or removed, resulting in users not receiving the e-mail. But if you order by user id (assuming you use an auto-incrementing column for the id!), you can be sure every user gets processed.
So your user query would be something like:
SELECT * FROM users WHERE user_id > [highest_processed_user_id] ORDER BY user_id LIMIT 1000
Then process your loop, and store the last user id:
if (!empty($users)) {
    $last_processed_id = null;
    foreach ($users as $user) {
        $message = // Message creation magic
        queue_mail_to_send( /** parameters **/ );
        $last_processed_id = $user->id;
    }

    // Batch done! Store the last processed user id
    $query = 'UPDATE mail_table SET last_processed_user_id = ' . $last_processed_id; // please use parameterized statements here
    // execute the query
}
And on the next execution, do it again until all users have received the mail.
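Put together, one cron run of that idea might look like the sketch below. This is only an illustration: PDO, the `mail_table` checkpoint column, and the `run_batch` / `build_batch_query` helper names are assumptions layered on top of the answer, and `build_batch_query` just wraps the SELECT shown above.

```php
<?php
// Build the batch SELECT: everything after the last processed id,
// ordered by id so no user is skipped between runs.
function build_batch_query(int $lastProcessedId, int $batchSize): string
{
    return "SELECT * FROM users WHERE user_id > $lastProcessedId "
         . "ORDER BY user_id LIMIT $batchSize";
}

// One cron run: read the checkpoint, process one batch, store the new checkpoint.
function run_batch(PDO $pdo, int $batchSize = 1000): void
{
    $lastId = (int) $pdo->query('SELECT last_processed_user_id FROM mail_table')->fetchColumn();
    $users  = $pdo->query(build_batch_query($lastId, $batchSize))->fetchAll(PDO::FETCH_OBJ);

    foreach ($users as $user) {
        // $message = ... message creation magic ...
        // queue_mail_to_send(/* parameters */);
        $lastId = $user->user_id;
    }

    $stmt = $pdo->prepare('UPDATE mail_table SET last_processed_user_id = ?');
    $stmt->execute([$lastId]);
}
```

Each run touches at most one batch of rows, so memory stays flat no matter how many users there are.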

I had exactly the same problem. Anyway, the answer from @giorgio is the best solution.
But like Java or Python, we have "yield" in PHP. See the generators documentation: http://php.net/manual/en/language.generators.syntax.php
Here is my sample code; my case was 50,000 records, and I also tested it successfully with 370,000 records. But it takes time.
public static function allCustomers()
{
    $items = CustomerService::findAll();
    foreach ($items as $item) {
        // yield is only valid inside a function body; it turns this
        // method into a generator, hydrating one row at a time
        yield (new self())->loadFromResource($item);
    }
}
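A generator can also wrap the batching itself, so calling code iterates over all users while only one batch is ever in memory. This is a sketch: the `users_lazily` name is mine, and `$fetchBatch` is a stand-in for your own query code.

```php
<?php
// Lazily yield users batch by batch. $fetchBatch($lastId, $batchSize)
// must return at most $batchSize rows of the form ['id' => ...] with
// id > $lastId, ordered by id.
function users_lazily(callable $fetchBatch, int $batchSize)
{
    $lastId = 0;
    while (true) {
        $batch = $fetchBatch($lastId, $batchSize);
        if (empty($batch)) {
            return; // no rows left: the generator finishes
        }
        foreach ($batch as $user) {
            yield $user;
            $lastId = $user['id'];
        }
    }
}
```

Calling code can then simply `foreach (users_lazily($fetch, 1000) as $user) { ... }` without ever materialising the full 100,000-row array.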

You may split that operation into multiple operations, separated in time.
For instance, only allow your routine to process 40 emails per minute, or use an array of arrays to create "pages" of records (use the SQL LIMIT clause).
And set the arrays to null and unset() them when you no longer need that information.

I think you can use the MySQL IN clause rather than doing a foreach for every user.
Like:
$user_ids = array(1, 2, 3, 4);
// Do something WHERE user_id IN ($user_ids);
And for sending mails you can use the PHPMailer class by supplying comma-separated email addresses in $to.
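If you go the IN route with a prepared statement, a small helper can build the placeholder list. A minimal sketch, assuming a mysqli connection `$con`; the `in_placeholders` helper name is mine:

```php
<?php
// Build "?,?,?" for an IN (...) clause: one placeholder per id.
function in_placeholders(int $count): string
{
    return implode(',', array_fill(0, $count, '?'));
}

// Usage sketch (mysqli connection $con assumed):
// $ids  = [1, 2, 3, 4];
// $stmt = $con->prepare('SELECT * FROM users WHERE user_id IN (' . in_placeholders(count($ids)) . ')');
// $stmt->bind_param(str_repeat('i', count($ids)), ...$ids);
// $stmt->execute();
```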

Use just one query, like:
INSERT INTO table_name (COL1, Col2,...) SELECT COL1, COL2 FROM other_table;

Related

PHP/MySQL: How to insert a large amount of rows in several tables, in 2 queries only

In my app, I have a job that computes something for each user every day at exactly midnight UTC.
I have a row to insert into table A for each user. For this I first build a string that I concatenate in a foreach loop, and then I send the whole string to mysqli_query.
$q = 'INSERT INTO mb_thing_to_compute VALUES ';
foreach ($user as $u) {
    $q .= "(NULL, {$u['ID']}, $computedStuff),";
}
$q = rtrim($q, ','); // drop the trailing comma
mysqli_query($link, $q);
This way, it seems there is no overhead, compared to sending the inserts one by one.
Now the problem is that in a second table I need to insert update notifications for users, and each notification row has its params in another table. Only MySQL knows, after the insert, which IDs it assigned to the rows, so I can't use the first method, because I won't know the inserted notification IDs, which are needed to insert the notification params (each notification has 3 params).
I thought about giving the notifications a PHP-generated ID, with PHP's uniqid function, but there is probably a cleaner way to achieve that?
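One commonly used trick, hedged here because it relies on how InnoDB allocates auto-increment values for a single multi-row INSERT (a "simple insert" gets a consecutive block of ids): mysqli_insert_id() returns the id of the first inserted row, and the remaining ids can be derived from the row count. The `inserted_ids` helper is mine, for illustration.

```php
<?php
// Given the first auto-increment id of a multi-row INSERT and the
// number of rows inserted, derive every generated id.
function inserted_ids(int $firstId, int $rowCount): array
{
    return range($firstId, $firstId + $rowCount - 1);
}

// Usage sketch (mysqli link $link and $rowCount assumed):
// mysqli_query($link, $q);                  // the multi-row INSERT
// $first = mysqli_insert_id($link);         // id of the FIRST inserted row
// $ids   = inserted_ids($first, $rowCount); // ids for the params table
```

If you cannot rely on the server's auto-increment lock mode, fall back to inserting the notifications individually inside a transaction.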

Correct way to pass between 5,000 and 100,000 values in a MySQL WHERE clause

I am getting 45,000 values from one query result, and need to use these values to make a second query, but because of the large size of the array it takes more than 30 seconds to execute, so I get the error:
Error: Maximum execution time of 30 seconds exceeded
Is there any other way to do this database calculation quickly, or should I calculate the data and save it in another table to show at any time?
Queries :
$query = $em->createQuery('SELECT DISTINCT(u.username) as mobile FROM sessionTable u WHERE u.accesspoint IN (?1)');
$query->setParameter(1, $accesspoint);
$result = $query->getResult();

$numbers = array();
foreach ($result as $value) {
    $numbers[] = $value['mobile'];
}
dump(count($numbers)); // ----> Output is 48567 -- successful

$Users = $this->getDoctrine()
    ->getRepository('AcmeDataBundle:User')
    ->findByNumber($numbers);
// ---- Error occurs here ----
dump(count($Users));
die();
I am using the Symfony 2.0 framework with Doctrine 2.0.
UPDATE:
Consider that I have 5 tables in the same database,
viz. 1) user 2) googleData 3) facebookData 4) yahooData 5) sessions
When users log in to my application, I collect their gender info from one or more of the social profile tables and save it in that particular table.
Now I want to calculate the male:female ratio of all users who have used multiple sessions.
In this scenario it is getting too tough to calculate the male:female ratio across multiple tables.
I feel one easy solution would be to add a gender column directly to the sessions table, but is there a better way, using a FK or anything else?
If you are having to pass 1,000 values, never mind 100,000, that probably means you have a problem in the design of your queries. The data has to come from somewhere; if it's from the database, it is a simple matter to use a join or a subquery. If it's from an external source, it can go into a temporary table.
So in short: yes, a temporary or permanent table is better.
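The temporary-table route could be sketched as below. Everything here is illustrative: the `tmp_numbers` table, the `values_clause` and `load_numbers_into_temp` helpers, and the join columns are all assumptions, and real code should use the driver's own escaping rather than addslashes.

```php
<?php
// Build "('a'),('b'),..." for one batch of values. addslashes is a
// stand-in; use proper escaping/binding for your driver in real code.
function values_clause(array $chunk)
{
    $quoted = array_map(function ($v) { return "('" . addslashes($v) . "')"; }, $chunk);
    return implode(',', $quoted);
}

// Usage sketch (mysqli connection $con and table/column names assumed):
function load_numbers_into_temp(mysqli $con, array $numbers)
{
    $con->query('CREATE TEMPORARY TABLE tmp_numbers (mobile VARCHAR(32) PRIMARY KEY)');
    foreach (array_chunk($numbers, 1000) as $chunk) {
        $con->query('INSERT INTO tmp_numbers (mobile) VALUES ' . values_clause($chunk));
    }
    // Now let MySQL do the matching instead of passing 45k values back in:
    // SELECT u.* FROM user u JOIN tmp_numbers t ON t.mobile = u.number
}
```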

Multiple queries with PHP code in mysqli, one person at a time

This code works, but the problem is that if several people use it simultaneously, some people won't be registered. So I need to rewrite it in a way that all queries for one person are executed and finished before the queries of the next person start.
First, the code reads from the database to get the string of all the people registered so far.
$sql_s = $con->query("SELECT * FROM schedule WHERE date='$date'");
$row_schedule = $sql_s->fetch_array(MYSQLI_BOTH);
$participants = $row_schedule['participants'];
$participants is a string that looks something like "'Sara':'Richard':'Greg'"
Now the current user (Fredrik) wants to add his name to the string like this:
$current_user = 'Fredrik';
$participants_new = add_participant($participants, $current_user);
add_participant is a PHP function that adds 'Fredrik' to the participant string. Then I want to replace the old participant string with the new one in the database like this:
$sql = $con->query("UPDATE schedule SET participants='{$participants_new}' where date='{$date}'");
The specific problem is that if another person (Linda) reads the database before Fredrik executes
$sql = $con->query("UPDATE schedule SET participants='{$participants_new}' where date='{$date}'");
Linda won't get a string that includes Fredrik; she will get "'Sara':'Richard':'Greg'". When she has added her name it will look like "'Sara':'Richard':'Greg':'Linda'", and when she updates the database like this
$sql = $con->query("UPDATE schedule SET participants='{$participants_new}' where date='{$date}'");
the string including Fredrik ("'Sara':'Richard':'Greg':'Fredrik'") will be overwritten with "'Sara':'Richard':'Greg':'Linda'", and no one will ever know that Fredrik registered for the class.
Thus, how can I rewrite this code so that all of Fredrik's queries are executed before Linda's queries start?
Your question is a very good example of why one should always learn database design basics and always follow them.
A separator-delimited string in a database column is a deadly sin. For many reasons, but we are interested in this particular case.
Had you designed your database properly, storing participants in separate rows, there would not be a single problem.
So, just change your design by adding a table for participants, and there will be not a single problem adding or removing any number of them.
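A sketch of what that normalized design could look like; the table and column names here are illustrative, not from the question:

```sql
CREATE TABLE schedule_participant (
    schedule_date DATE        NOT NULL,
    participant   VARCHAR(64) NOT NULL,
    PRIMARY KEY (schedule_date, participant)
);

-- Registering is now one atomic INSERT; concurrent users cannot
-- overwrite each other's registrations:
INSERT INTO schedule_participant (schedule_date, participant)
VALUES ('2016-05-01', 'Fredrik');

-- And the participant list is just a query away:
SELECT participant FROM schedule_participant WHERE schedule_date = '2016-05-01';
```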
Here is an approach to do it:
Theoretical explanation: something like this could work. Every time a user executes the update, the code first checks the time at which the last update request was made, so there has to be a time difference between users' update requests.
Note: it is still not guaranteed to work, because if the first user has connection problems, his update query may be delayed, while a second user who submitted later but has no connection problems gets his update through first; in that case I think the first user's query will fail.
Here is the Code :
<?php
// You need to add another column for saving the time of the last query execution
$current_time = time();
$current_date = date("Y-m-d", $current_time);
$query_execution_new_time = $current_time . ":" . $current_date;

if (empty($row_schedule['query_execution_time'])) {
    $sql = $con->query("UPDATE schedule SET query_execution_time='{$query_execution_new_time}' where date='{$date}'");
} else {
    $query_execution_time = explode(":", $row_schedule['query_execution_time']);
    if ($query_execution_time[0] < $current_time) {
        $con->query("UPDATE schedule SET participants='{$participants_new}' where date='{$date}'");
        $sql = $con->query("UPDATE schedule SET query_execution_time='{$query_execution_new_time}' where date='{$date}'");
    }
}
?>
Try this:
There is no need to fetch all participants first and then update.
Only append the new participant:
you can CONCAT onto the value already saved in the database column.
UPDATE schedule
SET participants = CASE
        WHEN participants IS NULL OR participants = ''
            THEN 'Fredrik' -- assume Fredrik is the new participant
        ELSE CONCAT(participants, ':', 'Fredrik')
    END
WHERE date = '$date';
That way, even if multiple participants register at the same moment, each UPDATE is applied atomically by the database, so you'll get the correct list at the end.
You don't need to worry about multiple users clicking at once unless you've got millions of users.

Multiple Long Concurrent MySQL Queries

I have a MySQL database that has around 600,000 records in the largest table. The other tables are fairly small in comparison. The data is somewhat normalized but there is some duplication because I'm using it for personal use and when I tried fully normalizing it, I found the queries to be unnecessarily complex and slow from all of the joins. I am using PHP to execute the queries.
Now, for example, say that the 600,000 record table contains email addresses. Imagine I have about 10 applications/clients that need to retrieve an email address from this table based on conditions and joins and no two clients should get the same email address. So, I created a query that selects an email address and then another query that uses the selected email address to update a flag field to mark the email address as "in use" and so another client cannot take the same email address. The problem is the query to select the email address takes about 25 seconds to execute and when two clients execute at the same time, they receive the same email address. The speed is not an issue because the clients will only be executing this query once every few hours but I need the clients to get unique email addresses.
I'm kind of new to MySQL so I don't know if selecting the field and then setting a flag is the proper way to go about this. Is there a way to set the flag before I select the field? Also, I don't know much about transactions but could this be solved using them?
Thanks!
START TRANSACTION;
SELECT email FROM myemails WHERE flag = 0 LIMIT 1 FOR UPDATE;
UPDATE myemails SET flag = 1 WHERE email = '$email';
COMMIT;
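A minimal PHP sketch of that transaction, assuming a mysqli connection `$con`; the `claim_email` function name is mine. The point is the FOR UPDATE row lock: a concurrent client blocks on the locked row until COMMIT and then picks a different one.

```php
<?php
// Claim one unused email address atomically. SELECT ... FOR UPDATE
// locks the chosen row until COMMIT, so two clients can never claim
// the same address. (Connection setup and error handling assumed.)
function claim_email(mysqli $con): ?string
{
    $con->begin_transaction();
    $res = $con->query("SELECT email FROM myemails WHERE flag = 0 LIMIT 1 FOR UPDATE");
    $row = $res ? $res->fetch_assoc() : null;
    if ($row === null) {
        $con->rollback();
        return null; // no free address left
    }
    $stmt = $con->prepare("UPDATE myemails SET flag = 1 WHERE email = ?");
    $stmt->bind_param('s', $row['email']);
    $stmt->execute();
    $con->commit();
    return $row['email'];
}
```

Note the 25-second SELECT still runs inside the transaction, so the lock is held that long; indexing the flag column would shorten the window considerably.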
Another possible approach is to generate a unique flag in PHP and update first, i.e.:
$flag = uniqid();
UPDATE myemails SET flag = '$flag' WHERE flag IS NULL LIMIT 1;
SELECT email FROM myemails WHERE flag = '$flag';

Doing a database search and sending an email

This is my first time here, and I am quite new to coding.
I have a form that takes the values from a customer table when a customer wants to do a return. These values are then inserted into a return_product table. However, I need to run another script that does a search in this return_product table.
The other script needs to count the total number of transactions in the return_product table, grouped by ID number. When a customer has 3 or more transactions, an email should be sent to the manager. However, the email part should not stop the form from being submitted to the database.
The if-condition part is to be done by my friend and she's not done with it yet. What if I need to send mail for all those who exceed 3 transactions? Do I use a while loop?
I require assistance with making the two scripts run concurrently when the form is submitted.
Any help is greatly appreciated. Thanks in advance. :)
If you want to count the number of records, you can use SQL's COUNT function.
You can check your condition before inserting values into the DB and then return the respective data, like:
// Get the number of records of the particular user with a COUNT query
// Check the limit
if ($num_of_records < 3) {
    // Insert data
} else {
    // Send mail
}
With this you don't need to write 2 scripts.
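To keep the mail from ever blocking the insert (as the question requires), the insert can run first and the count check afterwards. A sketch under stated assumptions: the table layout, the manager address, and the `should_notify` / `handle_return` helpers are all hypothetical.

```php
<?php
// Decide whether the manager should be alerted for a given
// transaction count: 3 or more returns triggers a mail.
function should_notify(int $transactionCount): bool
{
    return $transactionCount >= 3;
}

// Sketch of the flow: insert the return first, then count and mail,
// so a mail failure never blocks the insert. (DB access assumed.)
function handle_return(mysqli $con, int $customerId): void
{
    $stmt = $con->prepare("INSERT INTO return_product (customer_id) VALUES (?)");
    $stmt->bind_param('i', $customerId);
    $stmt->execute();

    $res = $con->query("SELECT COUNT(*) AS n FROM return_product WHERE customer_id = $customerId");
    $n = (int) $res->fetch_assoc()['n'];
    if (should_notify($n)) {
        mail('manager@example.com', 'Frequent returns', "Customer $customerId has $n returns.");
    }
}
```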
