I have user subscriptions in my DB, and the SUBSCRIPTIONS table has UID (that I use with the CC company to charge the Card), and DATE_EXPIRED, DATE_PAID, VALID and LENGTH.
I am going to need some kind of function to process a batch of subscriptions from the DB, send them to the CC company for processing, and, upon return, handle each subscription accordingly.
If the charge went OK, I mark VALID = 1, set DATE_PAID = NOW() and DATE_EXPIRED = NOW() + INTERVAL LENGTH MONTH,
but if the result is not OK, I need to set VALID = ERROR_NO and take some actions, like sending an email, bringing up messages, etc.
My question is about the approach to this situation.
I will be getting a list of subscriptions that need to be updated, so:
What do you think is the best way to process a batch like that?
My internal processing function uses cURL to contact the CC server and gets a cURL response, from which I know what to do next.
Should I write all the necessary subscriptions to a file? Should I cURL to my own server so the batch runs independently? How do I send a batch of results to myself for processing? What do you think is the best approach?
How do I keep the SELECT result set so I can process it all in one go?
I think I should clarify. Say I theoretically have 100,000 results. I think I should separate my SELECT into portions with LIMIT X,Y and hold each portion in memory to save on memory, but the next SELECT might run against what is effectively a different table (as it might have been updated by then). I'd like to run through all 100,000 results and have them be the same 100,000 I had in the first SELECT. Doing the whole process in the page would of course be ineffective, as it would stop as soon as you close the page.
Use LOCK TABLES subscriptions WRITE to solve your second problem, and UNLOCK TABLES when you are done. Although really, unless you're running this on a server with no RAM at all, keeping these 100,000 rows in memory and going through them in a while loop using something like mysql_fetch_assoc or mysqli_fetch_assoc shouldn't be a problem.
As to your first question, why don't you simply process the cURL response when you get it, and save the results in the database on a per-subscription basis? I don't really see why you would want to put this into a separate file first, and certainly not why you would then need to use cURL to contact your own server?
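For what it's worth, here is a minimal sketch of that per-subscription approach, assuming a mysqli connection in $db, a hypothetical gateway endpoint and response format, and an invented error code of 2 for failed charges:

<?php
// Minimal sketch: process each due subscription and update it right away.
// The gateway URL, its response format and the "due" criteria are assumptions.
$due = $db->query("SELECT UID, LENGTH FROM SUBSCRIPTIONS WHERE DATE_EXPIRED <= NOW()");

while ($sub = $due->fetch_assoc()) {
    // Charge this card via the CC company's API (hypothetical endpoint).
    $ch = curl_init('https://gateway.example.com/charge');
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => ['uid' => $sub['UID']],
        CURLOPT_RETURNTRANSFER => true,
    ]);
    $response = curl_exec($ch);
    curl_close($ch);

    if ($response === 'OK') { // whatever "success" looks like for your gateway
        $ok = $db->prepare("UPDATE SUBSCRIPTIONS
                               SET VALID = 1,
                                   DATE_PAID = NOW(),
                                   DATE_EXPIRED = NOW() + INTERVAL LENGTH MONTH
                             WHERE UID = ?");
        $ok->bind_param('s', $sub['UID']);
        $ok->execute();
    } else {
        // Record the failure (error code 2 is invented), then handle notifications.
        $fail = $db->prepare("UPDATE SUBSCRIPTIONS SET VALID = 2 WHERE UID = ?");
        $fail->bind_param('s', $sub['UID']);
        $fail->execute();
        // send email, queue a user-facing message, ...
    }
}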
For some reasons (which I think are not the point of my question, but if it helps, ask and I can describe why), I need to check MySQL tables continuously for new records. When any new records come in, I want to perform some related actions that are not important right now.
The question is how I should continuously check the database so that I use the lowest resources while getting results as close to real time as possible.
For now, I have this:
$new_record_come = false;
while (!$new_record_come) {
    // NOTE: the MySQL interval unit is SECOND (singular)
    $sql = "SELECT id FROM Notifications WHERE insert_date > (NOW() - INTERVAL 5 SECOND)";
    $result = $conn->query($sql);
    if ($result && $result->num_rows > 0)
    {
        // doing some related actions...
        $new_record_come = true;
    }
    else
    {
        sleep(5); // 5-second delay
    }
}
But I am worried that if I get thousands of users, it will bring the server down, even if the server is an expensive one!
Do you have any advice on how to improve the performance, change the approach completely, change the type of query, or any other suggestion?
Polling a database is costly, so you're right to be wary of that solution.
If you need to scale this application up to handle thousands of concurrent users, you probably should consider additional technology that complements the RDBMS.
For this, I'd suggest using a message queue. After an app inserts a new notification to the database, the app will also post an item to a topic on the message queue. Typically the primary key (id) is the item you post.
Meanwhile, other apps are listening to the topic. They don't need to do polling. The way message queues work is that the client just waits until there's a new item in the queue. The wait will return the item.
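As a minimal sketch of that pattern, here is one way it could look using a Redis list as the broker (Redis is just one option; RabbitMQ, beanstalkd, etc. work similarly, and the queue name and wiring here are assumptions):

<?php
// Minimal sketch using a Redis list as the queue (assumes the phpredis extension).
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

// Producer side: run this right after the INSERT that creates the notification.
$redis->lPush('notifications', (string) $newNotificationId);

// Consumer side: a long-running worker that blocks until an item arrives,
// instead of polling the database.
while (true) {
    $item = $redis->brPop(['notifications'], 0); // 0 = wait forever
    if ($item) {
        $notificationId = $item[1];
        // ...load the (committed) row by id and perform the related actions...
    }
}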
A comment suggested using a trigger to invoke a PHP script. This won't work, because triggers execute while the transaction that spawned them is not yet committed. If the trigger runs a PHP script, that script probably needs to read the record from the database; but an uncommitted record is not visible to any other database session, so the PHP script can never read the data it was notified about.
Another angle (much simpler than a message queue, I think):
I once implemented this on a website by letting the clients poll AND compare against the latest id they had received.
For example: You have a table with primary key, and want to watch if new items are added.
But you don't want to set up a database connection and query the table if there is nothing new in it.
Let's say the primary key is named 'postid'.
I had a file containing the latest postid.
I updated it with each new entry in tblposts, so it always contains the latest postid.
The polling scripts on the clientside simply retrieved that file (do not use PHP, just let Apache serve it, much faster: name it lastpostid.txt or something).
Client compares to its internal latest postid. If it is bigger, the client requests the ones after the last one. This step DOES include a query.
Advantage is that you only query the database when something new is in, and you can also tell the PHP script what your latest postid was, so PHP can only fetch the later ones.
(Not sure if this will work in your situation because it assumes an increasing number meaning 'newer'.)
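A minimal sketch of the server side of this idea, with the file path, table and column names chosen purely for illustration:

<?php
// Minimal sketch of the "latest id in a static file" approach.

// 1) Writer side: run this right after inserting a new row into tblposts.
function updateLastPostIdFile(PDO $db): void
{
    $lastId = (int) $db->query("SELECT MAX(postid) FROM tblposts")->fetchColumn();
    // Apache can serve this file directly; no PHP involved in the routine poll.
    file_put_contents('/var/www/html/lastpostid.txt', (string) $lastId);
}

// 2) Fetch endpoint: only called by a client whose own latest id is smaller
//    than the one it saw in lastpostid.txt.
function fetchPostsAfter(PDO $db, int $clientLastId): array
{
    $stmt = $db->prepare("SELECT postid, title FROM tblposts WHERE postid > ? ORDER BY postid");
    $stmt->execute([$clientLastId]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}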
This might not be possible with your current system design, but instead of using triggers or a heartbeat to poll the database continuously, how about hooking into the place where the updates happen and executing the other code from there? That way you avoid polling the database continuously, and code fires ONLY IF somebody initiates a request.
The user keys in search parameters, then we make a request to a data provider and redirect the user to a loading page. The response from the data provider hits a callback URL, at which point we parse the results and store about 200 rows in the DB. Meanwhile the loading page uses AJAX to query the DB every second, and when the results are all there we display them to the user.
The issue is that inserting into the MySQL DB is too slow. We know the response from the data provider comes back within seconds, but the processing of the script and the insertion of rows into the DB are very slow. We do use multi-row inserts.
Any suggestions for improvement? FYI, the code is hugely long... that's why I'm not posting it right now.
There are a multitude of factors affecting your insertion speed:
1) Slow hardware and bad server speeds.
Sol: Contact your server administrator.
2) Use something other than InnoDB.
3) Use a surrogate key, other than your primary key, that is numeric and sequential, along with your natural primary key.
OR
4) Try this: https://stackoverflow.com/a/2223062/3391466.
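One other thing worth checking: if the ~200 rows are currently inserted with autocommit on, wrapping them in a single transaction usually helps a lot, since MySQL then flushes to disk once rather than once per row. A minimal PDO sketch, with table and column names invented for illustration:

<?php
// Minimal PDO sketch: insert many rows inside one transaction.
// Table and column names are invented for illustration.
function insertResults(PDO $db, array $rows): void
{
    $db->beginTransaction();
    try {
        $stmt = $db->prepare(
            "INSERT INTO provider_results (search_id, item_name, price) VALUES (?, ?, ?)"
        );
        foreach ($rows as $row) {
            $stmt->execute([$row['search_id'], $row['item_name'], $row['price']]);
        }
        $db->commit(); // one flush for the whole batch
    } catch (Exception $e) {
        $db->rollBack();
        throw $e;
    }
}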
Suggestion: Instead of running the code on one page and making the user wait for the whole process, why not have the PHP page store the instructions in a queue? The instructions would then be executed by a separate PHP script (for instance via a cron job), and the user wouldn't have to wait for the whole process to take place.
However, in this situation it would be ideal to let the user know that the changes made can take a bit of time to update.
Cron jobs are very easy to implement. In cPanel there is an option for Cron Jobs where you specify which script you want to run and at which intervals. You can set your script to run once every minute (or more or less often, depending on how much demand there is). From there your script would check the queue and could keep running until the queue is empty again.
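As a rough sketch of what such a worker could look like (the job_queue table, its columns and the DSN are all invented for illustration), with a crontab entry like `* * * * * php /path/to/worker.php`:

<?php
// worker.php - minimal sketch of a cron-driven queue worker.
$db = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

// Grab a batch of pending jobs, oldest first.
$jobs = $db->query(
    "SELECT id, payload FROM job_queue WHERE status = 'pending' ORDER BY id LIMIT 50"
)->fetchAll(PDO::FETCH_ASSOC);

foreach ($jobs as $job) {
    // ...perform the actual work described by $job['payload']...
    $done = $db->prepare("UPDATE job_queue SET status = 'done' WHERE id = ?");
    $done->execute([$job['id']]);
}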
Let me know if that helped!
I am putting together an interface for our employees to upload a list of products for which they need industry stats (currently they do them manually, one at a time).
Each product will then be served up to our stats engine via a web-service API.
I will be the one replying: the stats engine will be requesting the "next victim" from my API.
Each list the users upload will have between 50 and 1000 products, and will be its own queue.
For now, queues/lists will likely be added (and removed via completion) approximately 10-20 times per day.
If successful, traffic will probably rev up after a few months to something like 700-900 lists per day.
We're just planning to go with a simple round-robin approach to direct the traffic evenly across queues.
The multiplexer would grab the top item off of List A, then List B, then List C and so on until looping back around to List A again ... keeping in mind that lists/queues can be added/removed at any time.
The issue I'm facing is just conceptualizing the management of this.
I thought about storing each queue as a flat file and managing the rotation via relational DB (MySQL). Thought about doing it the reverse. Thought about going either completely flat-file or completely relational DB ... bottom line, I'm flexible.
Regardless, my brain is just vapor locking when I try to statelessly meld a variable list of participants with a circular rotation (I just got back from a quick holiday, and I don't think my brain's made it home yet ;)
Has anyone done something like this?
How did you handle it?
What would you improve if you had to do it again?
Any & all tips/suggestions/advice are welcome.
NOTE: Since each request from our stats engine/tool will be separated by many seconds, if not a couple of minutes, I need to keep this stateless.
List data should be stored in a database, for sure. Your PHP side should have a view giving the status of the system, and the form to add lists.
Since each request becomes its own queue, and all the request-queues are considered equal in priority, the ideal number of tables is probably three: one to list requests, their priority relative to one another (to determine who goes next in the round-robin) and their processing status; another to list the contents (list items) of each request that are yet to be processed; and a third table to list the processed items from each queue.
You will also need a script that does the actual processing, that is not driven by a user request, but instead by a system-scheduled job that executes periodically (throttled to whatever you desire). This can of course also be in PHP. This is where you would set up your 10-at-a-time list checks and updates.
The processing would be something like:
Select the next set of at most 10 items from the highest-priority queue.
Process them, updating their DB status as they complete.
Update the priority of the above queue so that it is now the lowest priority.
And if new queues are added, they would be added with lowest priority.
Priority could be represented with an integer.
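A minimal sketch of that rotation, using PDO and illustrative table/column names (here a lower priority number means "goes sooner"):

<?php
// Minimal sketch of the integer-priority round-robin.
function processNextQueue(PDO $db): void
{
    // 1) Pick the queue that is currently "next" in the rotation.
    $queue = $db->query(
        "SELECT id FROM queues WHERE status = 'active' ORDER BY priority ASC LIMIT 1"
    )->fetch(PDO::FETCH_ASSOC);
    if (!$queue) {
        return; // nothing to do
    }

    // 2) Take up to 10 unprocessed items from that queue.
    $items = $db->prepare(
        "SELECT id, product FROM queue_items WHERE queue_id = ? AND processed = 0 LIMIT 10"
    );
    $items->execute([$queue['id']]);
    foreach ($items->fetchAll(PDO::FETCH_ASSOC) as $item) {
        // ...hand the product to the stats engine, then mark it processed...
        $db->prepare("UPDATE queue_items SET processed = 1 WHERE id = ?")
           ->execute([$item['id']]);
    }

    // 3) Send this queue to the back of the rotation.
    $max = (int) $db->query("SELECT MAX(priority) FROM queues")->fetchColumn();
    $db->prepare("UPDATE queues SET priority = ? WHERE id = ?")
       ->execute([$max + 1, $queue['id']]);
}

New queues would simply be inserted with a priority of max + 1 as well, which puts them at the end of the rotation.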
Your users would need to wait patiently for their list to be processed and then view or download the result. You might set up an auto-refresh script for this on your view page.
It sounds like you're trying to implement something that Gearman already does very well. For each upload / request, you can simply send off a job to the Gearman server to be queued.
Gearman can be configured to be persistent (just in case things go to hell), which should eliminate the need for you logging requests in a relational database.
Then, you can start as many workers as you'd like. I know you suggest running all jobs serially, which you can still do, but you can also parallelize the work, so that your user isn't sitting around quite as long as they would've been if all jobs had been processed in a serial fashion.
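To give a feel for how little code that takes, here is a minimal sketch using the pecl gearman extension with a gearmand server on localhost (function name and payload format are illustrative):

<?php
// Client side: queue a job for one uploaded list and return immediately.
$client = new GearmanClient();
$client->addServer(); // 127.0.0.1:4730 by default
$client->doBackground('process_list', json_encode(['list_id' => 42]));

// Worker side (separate long-running process; start as many as you like):
$worker = new GearmanWorker();
$worker->addServer();
$worker->addFunction('process_list', function (GearmanJob $job) {
    $payload = json_decode($job->workload(), true);
    // ...fetch the list items and feed them to the stats engine...
});
while ($worker->work());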
After a good night's sleep, I now have my wits about me (I hope :).
A simple solution is a flat file for the priorities.
Have a text file simply with one List/Queue ID on each line.
Feed from one end of the list, and add to the other ... simple.
Criticisms are welcome ;o)
Thanks #Trylobot and #Chris_Henry for the feedback.
I'm looking for a technique to do the following, and I need your advice.
I have a huge (really) table with registration IDs, and I need to send messages to the owners of these IDs. I can't send the message to many recipients at once; it needs to be processed one by one. So I would like to have a script (PHP) which can run in many parallel instances (processes), each getting some amount of rows from the DB and processing them. In other words, every process needs to work on a particular range of data. I would also like to be able to stop each process and later continue sending from where it stopped, on to the users who haven't received the message yet.
Is this possible? Any tips and advice are welcome.
You may wish to set up a cron job, which is typically one of the best approaches for running large batch operations with PHP scripts:
http://www.developertutorials.com/tutorials/php/running-php-cron-jobs-regular-scheduled-tasks-in-php-172/
Your cron job will need to point to a PHP script which does the following:
1. Selects a subset of recipients from your large DB table, based on a flag set at #3 (below), identifying the next batch to process
2. Sends email to those selected recipients
3. Saves a note of the current job position and success/fail (i.e. you could set a flag next to each recipient in the DB who is successfully mailed; these are then not selected when the job is rerun)
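Roughly, the cron-driven script could look like this (table, column names, batch size and DSN are all illustrative; the "sent" column is the flag from step 3):

<?php
// mailer.php - minimal sketch of the cron-driven batch mailer.
$db = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

// 1) Select the next batch of recipients that have not been mailed yet.
$recipients = $db->query(
    "SELECT id, email FROM recipients WHERE sent = 0 LIMIT 500"
)->fetchAll(PDO::FETCH_ASSOC);

foreach ($recipients as $r) {
    // 2) Send the email.
    $ok = mail($r['email'], 'Subject', 'Message body');

    // 3) Flag success/failure so reruns skip already-mailed recipients.
    $db->prepare("UPDATE recipients SET sent = ? WHERE id = ?")
       ->execute([$ok ? 1 : -1, $r['id']]);
}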
Parallel processing is possible only to the extent that your server's configuration allows it. Many servers can serve pages in parallel, but even then it is limited to a few processes. Instead, the rule of thumb is to be as fast as possible and move on to the next request.
Regarding your processing of a really large list of data in your database: you will first of all need a list of IDs for the mailing you are doing:
INSERT INTO `mymailinglisttable` (mailing_id, recipient_id, senton) SELECT 123 AS mailing_id, mycontacttable.recipient_id, NULL FROM mycontacttable WHERE [insert your criteria for your contacts]
Next you will need to use either InnoDB or some clever logic for your parallel processing:
With InnoDB, you can do some row-level locking, but don't ask me how; search it yourself, I don't use InnoDB at all, but I know it is possible. So read the docs on that, select and lock some rows, send the emails, mark them as sent, and wash-rinse-repeat the operation by calling your own script back. (Either with AJAX or with a PHP socket.)
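For reference, the InnoDB variant alluded to above boils down to SELECT ... FOR UPDATE inside a transaction; a minimal PDO sketch reusing the mailing table from the example, with the rest of the names assumed:

<?php
// Minimal sketch of the InnoDB row-locking variant.
$db->beginTransaction();
$rows = $db->query(
    "SELECT recipient_id FROM mymailinglisttable
      WHERE senton IS NULL LIMIT 3 FOR UPDATE"   // locks these rows until commit
)->fetchAll(PDO::FETCH_ASSOC);

foreach ($rows as $row) {
    // ...send the email to the recipient identified by $row['recipient_id']...
    $db->prepare("UPDATE mymailinglisttable SET senton = NOW() WHERE recipient_id = ?")
       ->execute([$row['recipient_id']]);
}
$db->commit(); // releases the row locks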
Without InnoDB, you can simply add two fields to your table: a process-id field (mypid below) and a lockedon field. When you want to lock some addresses for your processing, do:
$mypid = getmypid().rand(1111,9999); // unique-ish id for this worker process
$now = date('Y-m-d H:i:s');
mysql_query('UPDATE mymailinglisttable SET mypid = '.$mypid.', lockedon = "'.$now.'" WHERE mypid IS NULL LIMIT 3');
This will lock 3 unclaimed rows to your pid at the current time. Select the rows that were locked using:
$result = mysql_query('SELECT * FROM mymailinglisttable WHERE mypid = '.$mypid.' AND lockedon = "'.$now.'"');
You will retrieve the 3 rows that you locked, ready for processing. I tend to use this version more than the InnoDB one because I was raised on this method, not because it is more performant; actually, I'm sure InnoDB's version is much better, I've just never tried it.
If you're comfortable with using PEAR modules, I'd recommend having a look at the PEAR Mail_Queue module.
http://pear.php.net/package/Mail_Queue
Well documented and with a nice tutorial. I've used a modified version of this before to send out thousands of emails to customers and it hasn't given me a problem yet:
http://pear.php.net/manual/en/package.mail.mail-queue.mail-queue.tutorial.php
Our company deals with sales. We receive orders and our PHP application allows our CSRs to process these orders.
There is a record in the database that is constantly changing depending on which order is currently being processed by a specific CSR - there is one of these fields for every CSR.
Currently, a completely separate page polls the database every second using an XMLHttpRequest and receives the response. If the response is not blank (only when the value has changed in the database) it performs an action.
As you can imagine, this amounts to one database query per second as well as an HTTP request every second.
My question is, is there a better way to do this? Possibly a listener using sockets? Something that would ping my script when a change has been performed, without forcing me to poll the database and/or send an HTTP request.
Thanks in advance
First off, 1 query/second and 1 request/second really isn't much, especially since this number won't change as you get more CSRs or sales. If you were executing 1 query/order/second or something you might have to worry, but as it stands, if it works well I probably wouldn't change it. It may be worth running some metrics on the query to ensure that it runs quickly, selecting on an indexed column and the like. Most databases offer a way to check how a query is executing, like the EXPLAIN syntax in MySQL.
That said, there are a few options.
Use database triggers to either perform the required updates when an edit is made, or to call an external script. Some reference materials for MySQL: http://dev.mysql.com/doc/refman/5.0/en/create-trigger.html
Have whatever software the CSRs are using call a second script directly when making an update.
Reduce polling frequency.
You could use an asynchronous architecture based on a message queue. When a CSR starts to handle an order, and the record in the database is changed, a message is added to the queue. Your script can either block on requests for the latest queue item or you could implement a queue that will automatically notify your script on the addition of messages.
Unless you have millions of these events happening simultaneously, this kind of setup will cause the action to be executed within milliseconds of the event occurring, and you won't be constantly making useless polling requests to your database.