For certain reasons (which I don't think are the point of my question, but ask me if it helps and I can describe why), I need to check MySQL tables continuously for new records. When a new record arrives, I want to perform some related actions that are not important here.
The question is: how should I continuously check the database so that I use the fewest resources while getting results as close to real time as possible?
For now, I have this:
$new_record_come = false;
while (!$new_record_come) {
    $sql = "SELECT id FROM Notifications WHERE insert_date > (NOW() - INTERVAL 5 SECOND)";
    $result = $conn->query($sql);
    if ($result && $result->num_rows > 0) {
        // doing some related actions...
        $new_record_come = true;
    } else {
        sleep(5); // 5 second delay
    }
}
But I am worried that if I get thousands of users, this will bring the server down, even an expensive one!
Do you have any advice for improving its performance, changing the approach completely, changing the type of query, or any other suggestion?
Polling a database is costly, so you're right to be wary of that solution.
If you need to scale this application up to handle thousands of concurrent users, you probably should consider additional technology that complements the RDBMS.
For this, I'd suggest using a message queue. After an app inserts a new notification to the database, the app will also post an item to a topic on the message queue. Typically the primary key (id) is the item you post.
Meanwhile, other apps are listening to the topic. They don't need to do polling. The way message queues work is that the client just waits until there's a new item in the queue. The wait will return the item.
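To make the message-queue idea concrete, here is a minimal sketch using the php-amqplib client for RabbitMQ; the queue name 'notifications', the connection details and the insertNotification() helper are assumptions for illustration, not part of your code:

<?php
// Minimal sketch of the message-queue idea (php-amqplib / RabbitMQ).
// Queue name, credentials and insertNotification() are illustrative assumptions.
require_once __DIR__ . '/vendor/autoload.php';

use PhpAmqpLib\Connection\AMQPStreamConnection;
use PhpAmqpLib\Message\AMQPMessage;

$connection = new AMQPStreamConnection('localhost', 5672, 'guest', 'guest');
$channel = $connection->channel();
$channel->queue_declare('notifications', false, true, false, false);

// Producer side: right after the INSERT succeeds, publish the new primary key.
$newId = insertNotification($conn); // hypothetical helper that performs the INSERT
$channel->basic_publish(new AMQPMessage((string) $newId), '', 'notifications');

// Consumer side (a separate long-running worker): no polling, the call blocks
// until an item arrives in the queue.
$channel->basic_consume('notifications', '', false, true, false, false, function ($msg) {
    $id = (int) $msg->body;
    // ...load the row by $id and perform the related actions...
});
while ($channel->is_consuming()) {
    $channel->wait();
}

In practice the producer and consumer would live in two different scripts; they are shown together here only to keep the sketch short.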
A comment suggested using a trigger to invoke a PHP script. This won't work, because triggers execute while the transaction that spawned them is not yet committed. Suppose the trigger runs a PHP script that needs to read the record from the database: an uncommitted record is not visible to any other database session, so the PHP script can never read the data it was notified about.
Another angle (much simpler than message queue I think):
I once implemented this on a website by letting the clients poll AND compare against the latest id they received.
For example: you have a table with a primary key, and want to watch whether new items are added.
But you don't want to set up a database connection and query the table if there is nothing new in it.
Let's say the primary key is named 'postid'.
I had a file containing the latest postid.
I updated it with each new entry in tblposts, so it always contains the latest postid.
The polling scripts on the client side simply retrieved that file (do not use PHP, just let Apache serve it, much faster: name it lastpostid.txt or something).
The client compares it to its own latest postid. If the new one is bigger, the client requests the entries after its last one. This step DOES include a query.
The advantage is that you only query the database when something new is in it, and you can also tell the PHP script what your latest postid was, so PHP only fetches the later ones.
(Not sure if this will work in your situation because it assumes an increasing number means 'newer'.)
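A minimal sketch of that file-based approach, assuming mysqli and a web-readable lastpostid.txt (the INSERT's columns and the file path are made up for illustration):

<?php
// After every insert into tblposts, write the new postid to a static file
// that Apache can serve directly. Column names and file path are assumptions.
$db = new mysqli('host', 'username', 'password', 'dbname');

$db->query("INSERT INTO tblposts (title, body) VALUES ('...', '...')");
$lastPostId = $db->insert_id;
file_put_contents('/var/www/html/lastpostid.txt', $lastPostId, LOCK_EX);

// Clients poll lastpostid.txt (no PHP, no DB hit). Only when the id has grown
// do they call a PHP endpoint, e.g. getposts.php?after=<their last postid>,
// which runs something like:
//   SELECT * FROM tblposts WHERE postid > ? ORDER BY postid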
This might not be possible with your current system design, but how about this: instead of using triggers or a heartbeat to poll the database continuously, go to where the updates happen and execute your other code from there. That way you avoid polling the database continuously, and the code fires ONLY IF somebody initiates a request.
Related
The user keys in search parameters, then we make a request to a data provider and redirect the user to a loading page. The response from the data provider hits a callback URL, at which point we parse the results and store about 200 rows in the db. Meanwhile the loading page uses AJAX to query the db every second, and when the results are all there we display them to the user.
The issue is that the insert into the MySQL db is too slow. We know the response from the data provider comes back within seconds, but the processing in the script and the inserting of rows into the db is very slow. We do use multi-row inserts.
Any suggestions for improvement? FYI, the code is hugely long... that's why I'm not posting it right now.
There are a multitude of factors affecting your insertions:
1) Slow hardware and bad server speeds.
Solution: Contact your server administrator.
2) Use something other than InnoDB.
3) Use a surrogate key, other than your primary key, that is numeric and sequential, along with your natural primary key.
OR
4) Try this: https://stackoverflow.com/a/2223062/3391466.
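If you are not already doing so, wrapping the ~200 inserts in a single transaction with a prepared statement usually helps a lot, because the server then flushes to disk once instead of once per statement. A minimal sketch with PDO (table and column names and the $rows structure are assumptions):

<?php
// Sketch: batch the rows from the provider callback into one transaction.
// Table/column names and the shape of $rows are assumptions.
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass', [
    PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
]);

$pdo->beginTransaction();
$stmt = $pdo->prepare('INSERT INTO results (search_id, payload) VALUES (?, ?)');
foreach ($rows as $row) {          // $rows = the ~200 parsed provider rows
    $stmt->execute([$row['search_id'], $row['payload']]);
}
$pdo->commit();                    // one flush to disk instead of ~200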
Suggestion: Instead of running the code on one page and having the user wait for the whole process, why not have the PHP page store the instructions in a queue? The instructions would then be executed by a separate PHP script (for instance via a cron job) and the user wouldn't have to wait for the whole process to take place.
However, in this situation it would be ideal to let the user know that the changes made can take a bit of time to update.
Cron jobs are very easy to implement. In cPanel there is an option for Cron Jobs where you specify which script you want to run and at which intervals. You can set your script to run once every minute (or more or less often, depending on how much demand there is). From there your script would check the queue and could keep running until the queue is empty again.
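As a rough sketch of what such a cron-driven worker could look like (the job_queue table and its columns are assumptions for illustration):

<?php
// worker.php - run from cron, e.g. "* * * * * php /path/to/worker.php".
// The job_queue table and its columns are made up for this sketch.
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');

while ($job = $pdo->query(
    "SELECT id, payload FROM job_queue WHERE status = 'pending' ORDER BY id LIMIT 1"
)->fetch(PDO::FETCH_ASSOC)) {
    // ...perform the slow work described by $job['payload']...
    $done = $pdo->prepare("UPDATE job_queue SET status = 'done' WHERE id = ?");
    $done->execute([$job['id']]);
}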
Let me know if that helped!
I am building a turn-based multiplayer game with Flash and PHP. Sometimes two users may call on the same PHP script at the same time. This script is designed to write some information to the database. But that script should not run if that information has already been written by another user, or else the game will break. If PHP processes these scripts sequentially (similar to how MySQL queues up multiple queries), then only one script should run in total and everything should be fine.
However, I find that around 10% of the time, BOTH users' scripts are executed. My theory is that the server sometimes receives both user requests to run the script at exactly the same time, and both run because neither detected that anything had been written to the database yet. Is it possible that both scripts were executed at the same time? If so, what are the possible solutions to this problem?
This is indeed possible. You can try locking and unlocking tables at the beginning and end of your scripts.
This will slow down some requests, though, since they have to wait for the locked tables to be unlocked.
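A minimal sketch of that idea with mysqli and LOCK TABLES (the game_state table and its columns are assumptions):

<?php
// While game_state is locked for WRITE, the second request waits at LOCK TABLES,
// so only one script sees processed = 0 and performs the write.
// Table and column names are assumptions.
$db = new mysqli('host', 'username', 'password', 'dbname');

$db->query("LOCK TABLES game_state WRITE");

$result = $db->query("SELECT processed FROM game_state WHERE game_id = 42");
$row = $result->fetch_assoc();
if (!$row['processed']) {
    // Only the first script to get here does the write.
    $db->query("UPDATE game_state SET processed = 1 WHERE game_id = 42");
}

$db->query("UNLOCK TABLES");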
It doesn't matter whether it is PHP, C, Java or whatever. At any one time, at most as many processes can run as you have CPUs (and cores). There can be, let's say, 100 processes alive at the same time, but if you have only 2 cores, only 2 are running and the rest are waiting.
Now it depends on what you count as "running": only the active processes, or the waiting ones as well. Secondly, it depends on your system configuration, how many processes can wait, and on your system specs.
Sounds, at first glance, like whatever keeps a 2nd instance of the script from running just does not happen fast enough, 10% of the time... I understand that you already have some kind of 'lock' like someone told you to add, which is great; as someone mentioned above, always take this lock FIRST THING in your script, if not even before calling the script (i.e. in the parent script). The same goes for competing functions / objects etc...
Just a note though: I was directed here by Google, and what I wanted to find out is whether script B will run IN AN IFRAME (so in a 'different window', if you wish) while script A has not finished running; basically your title is a bit vague. Thank you very much.
Funnily enough we're in the same boat: I'm programming a Hearthstone-like card game using PHP (which, I know, isn't suited for this at all, but I just like challenging tasks (and okay, it's the only language I'm familiar with)). Basically I have to keep multiple 'instants', or actions if you prefer, from triggering while another chain of global event/instant/sub-instant is resolving. This means NEVER calling a function that contains an event from within the same running snippet, EXCEPT if I loop a while on a $_SESSION variable that only does sleep(1) (that happens in script A); the loop runs while $_SESSION["phase"] == "EndOfTurnEffects" and keeps going until $_SESSION["phase"] == "StandBy" (other player's turn), and I want script B to modify $_SESSION["phase"]. Basically, if script B does not run before script A is done executing, I'm caught in an endless loop in that while statement...
It's very plausible that they do. Look into database transactions.
Briefly, database transactions are used to control concurrency between programs that access the database at the same time. You start a transaction, then execute multiple queries and finally commit the transaction. If two scripts overlap each other, one of them will fail.
Note that isolation levels can further give fine-grained control of how much the two (or more) competing scripts may share. Typically all are allowed to read from the database, but only one is allowed to write. So the error will happen at the final commit. This is fine as long as all side effects are happening in the database, but not sufficient if you have external side effects (such as deleting a file or sending an email). In those cases you may want to lock a table or row for the duration of the transaction, or set the isolation level.
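A minimal sketch of that pattern with PDO and InnoDB row locking (table and column names are assumptions); SELECT ... FOR UPDATE makes the second script wait until the first one commits, so only one of them performs the action:

<?php
// Sketch of the transaction approach (requires InnoDB). Names are assumptions.
$pdo = new PDO('mysql:host=localhost;dbname=game', 'user', 'pass', [
    PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
]);

$pdo->beginTransaction();

// Lock the row for this game; a concurrent script blocks here until we commit.
$stmt = $pdo->prepare('SELECT processed FROM moves WHERE game_id = ? FOR UPDATE');
$stmt->execute([42]);
$alreadyDone = (bool) $stmt->fetchColumn();

if (!$alreadyDone) {
    $pdo->prepare('UPDATE moves SET processed = 1 WHERE game_id = ?')->execute([42]);
    // keep external side effects (emails, file writes) after the commit
}

$pdo->commit();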
Here is an example of MySQL locking that you can use so that the first PHP thread which reaches the DB acquires a named lock ("awesome_lock_row") and holds it until it finally releases it. Note that GET_LOCK() takes an advisory (named) lock rather than a true table lock: because the second thread requests the same lock name, "awesome_lock_row", it keeps waiting until the first PHP thread has released the lock.
For this example, you can try running the same script perhaps 100 times concurrently as a cron job and you should see the "update_this_data" number field increment to 100. Without the lock, all 100 concurrent threads would probably first see "update_this_data" as 0 at the same time, and the end result would have been just 1 instead of 100.
<?php
$db = new mysqli('host', 'username', 'password', 'dbname');

// Acquire a named (advisory) lock; a second script asking for the same
// lock name will wait here until it is released. Timeout 30 seconds.
$db->query("DO GET_LOCK('awesome_lock_row', 30)");

$result = $db->query("SELECT * FROM table_name");
if ($result) {
    if ($row = $result->fetch_object()) {
        $output = $row;
    }
    $result->close();
}

$update_id = $output->some_id;
$db->query("UPDATE table_name SET update_this_data = update_this_data + 1 WHERE id = {$update_id}");

// Release the lock so the next waiting script can proceed
$db->query("DO RELEASE_LOCK('awesome_lock_row')");
?>
Hope this helps.
I'm looking for a technique to do the following and I need your advice.
I have a huge (really huge) table with registration ids and I need to send messages to the owners of those IDs. I can't send the message to many recipients at once; this needs to be processed one by one. So I would like to have a PHP script which can run in many parallel instances (processes), each taking some amount of data from the db and processing it. In other words, every process needs to work with a particular range of data. I would also like to be able to stop each process and later continue sending messages from the user where it stopped, moving on to the users who haven't got the message yet.
Is this possible? Any tips and advice are welcome.
You may wish to set up a cron job, which is typically one of the best approaches to running large batch operations with PHP scripts:
http://www.developertutorials.com/tutorials/php/running-php-cron-jobs-regular-scheduled-tasks-in-php-172/
Your cron job will need to point to a PHP script which does the following:
1) Selects a subset of recipients from your large DB table, based on the flag set at step 3 (below), identifying the next batch to process
2) Sends email to those selected recipients
3) Saves a note of the current job position and success/fail (i.e. you could set a flag next to each recipient in the DB who is successfully mailed; these are then not selected when the job is rerun)
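As a rough sketch of such a script (the recipients table, its columns and the batch size are assumptions):

<?php
// Sketch of the cron-driven mailer described above; names are assumptions.
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');

// 1) Select the next batch that has not been mailed yet.
$batch = $pdo->query("SELECT id, email FROM recipients WHERE sent = 0 LIMIT 500")
             ->fetchAll(PDO::FETCH_ASSOC);

$mark = $pdo->prepare("UPDATE recipients SET sent = 1 WHERE id = ?");
foreach ($batch as $recipient) {
    // 2) Send the email.
    if (mail($recipient['email'], 'Subject', 'Message body')) {
        // 3) Flag success so the row is skipped when the job is rerun.
        $mark->execute([$recipient['id']]);
    }
}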
Parallel processing is possible only to the extent that your server configuration allows. Many servers can serve pages in parallel, but then again, it is limited to a few at a time. Instead, the rule of thumb is to be as fast as possible and jump to the next request.
Regarding processing a really large list of data in your database: you will first of all need a list of ids for the mailing you are doing:
INSERT INTO `mymailinglisttable` (mailing_id, recipient_id, senton) SELECT 123 AS mailing_id, mycontacttable.recipient_id, NULL FROM mycontacttable WHERE [insert your criterias for your contacts]
Next you will need to use either InnoDB or some clever logic for your parallel processing:
With InnoDB, you can do some row-level locking, but don't ask me how; search it yourself, I don't use InnoDB at all, but I know it is possible. So read the docs on that, select and lock some rows, send the emails, mark them as sent and wash-rinse-repeat the operation by calling back your own script (either with AJAX or with a PHP socket).
Without InnoDB, you can simply add 2 fields to your table: a mypid field and a lockedon field. When you want to lock some addresses for your processing, do:
$mypid = getmypid().rand(1111,9999);
$now = date('Y-m-d G:i:s');
// Claim 3 unlocked rows for this process (assuming unlocked rows have mypid = NULL)
mysql_query('UPDATE mymailinglisttable SET mypid = '.$mypid.', lockedon = "'.$now.'" WHERE mypid IS NULL LIMIT 3');
This will lock 3 rows for your pid at the current time. Select the rows you just locked using:
mysql_query('SELECT * FROM mymailinglisttable WHERE mypid = '.$mypid.' AND lockedon = "'.$now.'"');
You will correctly retrieve the 3 rows that you locked for processing. I tend to use this version more than the InnoDB one because I was raised on this method, not because it performs better; actually, I'm sure the InnoDB version is much better, I've just never tried it.
If you're comfortable with using PEAR modules, I'd recommend having a look at the pear Mail_Queue module.
http://pear.php.net/package/Mail_Queue
Well documented and with a nice tutorial. I've used a modified version of this before to send out thousands of emails to customers and it hasn't given me a problem yet:
http://pear.php.net/manual/en/package.mail.mail-queue.mail-queue.tutorial.php
Our company deals with sales. We receive orders and our PHP application allows our CSRs to process these orders.
There is a record in the database that is constantly changing depending on which order is currently being processed by a specific CSR - there is one of these fields for every CSR.
Currently, a completely separate page polls the database every second using an XMLHttpRequest and receives the response. If the response is not blank (only when the value has changed in the database) it performs an action.
As you can imagine, this amounts to one database query per second as well as an HTTP request every second.
My question is: is there a better way to do this? Possibly a listener using sockets? Something that would ping my script when a change has been made, without forcing me to poll the database and/or send an HTTP request.
Thanks in advance
First off, 1 query/second, and 1 request/second really isn't much. Especially since this number won't change as you get more CSRs or sales. If you were executing 1 query/order/second or something you might have to worry, but as it stands, if it works well I probably wouldn't change it. It may be worth running some metrics on the query to ensure that it runs quickly, selecting on an indexed column and the like. Most databases offer a way to check how a query is executing, like the EXPLAIN syntax in MySQL.
That said, there are a few options.
Use database triggers to either perform the required updates when an edit is made, or to call an external script. Some reference materials for MySQL: http://dev.mysql.com/doc/refman/5.0/en/create-trigger.html
Have whatever software the CSRs are using call a second script directly when making an update.
Reduce polling frequency.
You could use an asynchronous architecture based on a message queue. When a CSR starts to handle an order, and the record in the database is changed, a message is added to the queue. Your script can either block on requests for the latest queue item or you could implement a queue that will automatically notify your script on the addition of messages.
Unless you have millions of these events happening simultaneously, this kind of setup will cause the action to be executed within milliseconds of the event occurring, and you won't be constantly making useless polling requests to your database.
I have a PHP script that grabs data from an external service and saves data to my database. I need this script to run once every minute for every user in the system (of which I expect to be thousands). My question is, what's the most efficient way to run this per user, per minute? At first I thought I would have a function that grabs all the user Ids from my database, iterate over the ids and perform the task for each one, but I think that as the number of users grow, this will take longer, and no longer fall within 1 minute intervals. Perhaps I should queue the user Ids, and perform the task individually for each one? In which case, I'm actually unsure of how to proceed.
Thanks in advance for any advice.
Edit
To answer Oddthinking's question:
I would like to start the processes for each user at the same time. When the process for each user completes, I want to wait 1 minute, then begin the process again. So I suppose each process for each user should be asynchronous - the process for user 1 shouldn't care about the process for user 2.
To answer sims' question:
I have no control over the external service, and the users of the external service are not the same as the users in my database. I'm afraid I don't know any other scripting languages, so I need to use PHP to do this.
Am I summarising correctly?
You want to do thousands of tasks per minute, but you are not sure if you can finish them all in time?
You need to decide what do when you start running over your schedule.
Do you keep going until you finish, and then immediately start over?
Do you keep going until you finish, then wait one minute, and then start over?
Do you abort the process, wherever it got to, and then start over?
Do you slow down the frequency (e.g. from now on, just every 2 minutes)?
Do you have two processes running at the same time, and hope that the next run will be faster (this might work if you are clearing up a backlog the first time, so the second run will run quickly)?
The answers to these questions depend on the application. Cron might not be the right tool for you depending on the answer. You might be better having a process permanently running and scheduling itself.
So, let me get this straight: You are querying an external service (what? SOAP? MySQL?) every minute for every user in the database and storing the results in the same database. Is that correct?
It seems like a design problem.
If the users on the external service are the same as the users in your database, perhaps the two should be more closely configured. I don't know if PHP is the way to go for syncing this data. If you give more detail, we could think about another solution. If you are in control of the external service, you may want to have that service dump its data or even write directly to the database. Some other syncing mechanism might be better.
EDIT
It seems that you are making an application that stores data for a user that can then be viewed chronologically. Otherwise you may as well just fetch the data when the user requests it.
Fetch all the user IDs in one go.
Iterate over them one by one (assuming that the data being fetched is unique to each user) and, since you want them all to execute at the same time and not be delayed if one user returns no data, spawn a separate process for each request (you'll have to be creative here, as PHP threads do not exist AFAIK).
Said process should insert the data returned into the db as soon as it is returned.
As for cron being right for the job: As long as you have a powerful enough server that can handle thousands of the above cron jobs running simultaneously, you should be fine.
You could get creative with several PHP scripts. I'm not sure, but if every CLI call to PHP starts a new PHP process, then you could do it like that.
foreach ($users as $user)
{
    // Launch each fetch in the background (trailing &) so the calls run in
    // parallel; escapeshellarg() keeps the user id safe to pass on the shell.
    shell_exec("php fetchdata.php " . escapeshellarg($user) . " > /dev/null 2>&1 &");
}
This is all very heavy and you should not expect to get it done snappy with PHP. Do some tests. Don't take my word for it.
Databases are made to process BULK loads of records at once. If you're processing them one-by-one, you're looking for trouble. You need to find a way to batch up your "every minute" task, so that by executing a SINGLE (complicated) query, all of the affected users' info is retrieved; then, you would do the PHP processing on the result; then, in another single query, you'd PUSH the results back into the DB.
Based on your big-picture description it sounds like you have a dead-end design. If you are able to get it working right now, it'll most likely be very fragile and it won't scale at all.
I'm guessing that if you have no control over the external service, then that external service might not be happy about getting hammered by your script like this. Have you approached them with your general plan?
Do you really need to do all users every time? Is there any sort of timestamp you can use to be more selective about which users need "updates"? Perhaps if you could describe the goal a little better we might be able to give more specific advice.
Given your clarification of wanting to run the processing of users simultaneously...
The simplest solution that jumps to mind is to have one thread per user. On Windows, threads are significantly cheaper than processes.
However, whether you use threads or processes, having thousands running at the same time is almost certainly unworkable.
Instead, have a pool of threads. The size of the pool is determined by how many threads your machine can comfortably handle at a time. I would expect numbers like 30-150 to be about as far as you might want to go, but it depends very much on the hardware's capacity, and I might be out by another order of magnitude.
Each thread would grab the next user due to be processed from a shared queue, process it, and put it back at the end of the queue, perhaps with a date before which it shouldn't be processed.
(Depending on the amount and type of processing, this might be done on a separate box to the database, to ensure the database isn't overloaded by non-database-related processing.)
This solution ensures that you are always processing as many users as you can, without overloading the machine. As the number of users increases, they are processed less frequently, but always as quickly as the hardware will allow.
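Since PHP has no built-in threads, one way to approximate that pool is to run a fixed number of worker processes, each of which repeatedly claims the next due user from a shared table. A sketch under those assumptions (the users table, its next_run column, the processUser() helper, and MySQL 8.0's SKIP LOCKED are all assumptions):

<?php
// worker.php - start N copies of this script to form the pool.
// Table, column names and processUser() are made up for this sketch;
// FOR UPDATE SKIP LOCKED needs MySQL 8.0+.
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass', [
    PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
]);

while (true) {
    $pdo->beginTransaction();
    // Claim the next user who is due; rows locked by other workers are skipped.
    $stmt = $pdo->query(
        "SELECT id FROM users WHERE next_run <= NOW()
         ORDER BY next_run LIMIT 1 FOR UPDATE SKIP LOCKED"
    );
    $userId = $stmt->fetchColumn();
    if ($userId === false) {
        $pdo->commit();
        sleep(1);               // nothing due yet
        continue;
    }
    // Push the user to the back of the queue before doing the slow work.
    $pdo->prepare("UPDATE users SET next_run = NOW() + INTERVAL 1 MINUTE WHERE id = ?")
        ->execute([$userId]);
    $pdo->commit();

    processUser($userId);       // hypothetical: calls the external service and saves data
}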