Only one call from concurrent request with 60 sec timeout - php

I have a function callUpdate() that needs to be executed after every update in the webpage admin.
callUpdate() does some caching and can take up to 30 seconds, so it is not important to execute it immediately, just within a reasonable amount of time after the last update, let's say 60 seconds.
The goal is to skip processing if the user (or users) makes several consecutive changes in a short amount of time.
Here is my current code:
// This runs in a separate, stand-alone script that is called in an asynchronous way,
// so hanging for 1 min does not block the app.
function afterUpdate(){
    $time = time();
    file_put_contents('timer.txt', $time);
    sleep(60);
    // Only run the update if no newer save has overwritten the timestamp in the meantime.
    if (file_get_contents('timer.txt') == $time) {
        callUpdate();
    }
}
My concerns here are about the sleep() function and whether it takes too many resources
(if I make 10 quick saves, this will start 10 PHP processes, each running for almost 60 seconds),
and what will happen if 2 users call file_put_contents() on the same file simultaneously.
Please tell me if there is a better approach and whether there are any major issues in mine.
NOTE: data between sessions can be stored only in a file,
and I have limited access to the server setup (APC settings and such).
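(For reference on the simultaneous-write worry: file_put_contents() accepts a LOCK_EX flag, which takes an exclusive advisory lock for the duration of the write, so two concurrent saves cannot interleave their writes. A minimal variant of the same function, renamed only to distinguish it:)
// Same debounce as above, but with an exclusive lock on the write.
function afterUpdateLocked(){
    $time = time();
    file_put_contents('timer.txt', $time, LOCK_EX);
    sleep(60);
    if ((int) file_get_contents('timer.txt') === $time) {
        callUpdate();
    }
}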

Related

What is the most efficient way to record JSON data per second

Reason
I've been building a system that pulls data from multiple JSON sources. The data being pulled is constantly changing and I'm recording what the changes are to a SQL database via a PHP script. 9 times out of 10 the data is different and therefore needs recording.
The JSON needs to be checked every single second. I've been successfully using a cron task every minute with a PHP function that loops 60 times over.
The problem I'm now having is that the more JSON sources I want to check, the slower the PHP file runs, meaning the next cron gets triggered before the previous one has finished. It's all starting to feel way too unstable and hacky.
Question
Assuming the PHP script is already the most efficient it can be, what else can be done?
Should I be using multiple cron tasks?
Should something else other than PHP be used?
Are cron tasks even suitable for this sort of problem?
Any experience, best practices or just plain old help will be very much appreciated.
Overview
I'm monitoring for active race sessions and recording each driver and then each lap a driver completes. Laps are recorded only once a driver crosses the start/finish line and I do not know when race sessions may or may not be active or when a driver crosses the line. Therefore I have been checking every second for new data to record.
Each venue where a race session may be active has a separate URL to receive JSON data from. The more venues I add to my system to monitor, the slower the script runs.
I currently have 19 venues and the script takes circa 12 seconds to complete. Since I'm running a cron job every minute and looping the script every second, I'm assuming I have at the very least 12 scripts running at any given second. It just doesn't seem like the most efficient way to do it to me. Of course, it worked a charm back when I was only checking 1 single venue.
There's a cycle to your operations. It goes like this (a sketch in code follows the list):
1. Start your process by reading the time with $starttime = time();.
2. Compute the next scheduled time by taking the start time plus 60 seconds: $nexttime = $starttime + 60;.
3. Do the operations you must do (read a mess of JSON feeds).
4. Compute how long is left in the minute: $timeleft = $nexttime - time();.
5. Sleep until the next scheduled time: if ($timeleft > 0) sleep($timeleft);.
6. Set $starttime = $nexttime.
7. Jump back to step 2.
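A minimal sketch of that loop in PHP, assuming a hypothetical fetchAllFeeds() that stands in for the per-minute work of reading and recording your JSON sources:
$starttime = time();
while (true) {
    $nexttime = $starttime + 60;      // step 2: next scheduled run
    fetchAllFeeds();                  // step 3: do the work
    $timeleft = $nexttime - time();   // step 4: time left in the minute
    if ($timeleft > 0) {
        sleep($timeleft);             // step 5: wait out the remainder
    }
    // A negative $timeleft here means you are not keeping up (see below).
    $starttime = $nexttime;           // step 6; the loop is step 7
}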
Obviously, if $timeleft is ever negative, you're not keeping up with your measurements. If $timeleft is always negative, you will get further and further behind.
The use of cron every minute is probably wasteful, because it takes resources to fire up a new process and get it going. You probably want to make your process run forever, and use a shell script that monitors it and restarts it if it crashes.
This is all pretty obvious. What's not so obvious is that you should keep track of your individual $timeleft values for each minute over your cycle of measurements. If they vary daily, you should track for a whole day. If they vary weekly you should track for a week.
Then you should look at the worst (smallest) values of $timeleft. If your 95th percentile is less than about 15 seconds, you're running out of resources and you need to take action. You need a margin like 15 seconds so your system doesn't move into overload.
If your system has zero tolerance for late sampling of data, you should look at the single worst value of $timeleft, not the 95th percentile. You should give yourself a more generous margin than 15 seconds.
So-called hard real time systems allocate a time slot to each operation, and crash if the operation exceeds the time slot. In your case the time slot is 60 seconds and the operation is reading a certain number of feeds. Crashing is pretty drastic, but measuring is mandatory.
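One way to keep that record, sketched under the assumption that you append each cycle's $timeleft to an array (the helper name is illustrative):
$samples = [];   // inside the loop: $samples[] = $timeleft;

function slackPercentile(array $values, float $p) {
    sort($values);                                   // smallest (worst) slack first
    return $values[(int) floor($p * (count($values) - 1))];
}

// The 5th percentile of slack is the "95th percentile worst case" described above.
$worstTypical  = slackPercentile($samples, 0.05);
$absoluteWorst = min($samples);
if ($worstTypical < 15) {
    // Less than ~15 seconds of margin: the system is heading into overload.
}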
The simplest action to take is to start running multiple worker processes. Give some of your feeds to each process. PHP runs single-threaded, so multiple processes will probably help, at least until you get to three or four of them.
Then you will need to add another computer, and divide your feeds among worker processes on those multiple computers.
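A simple way to hand feeds out to workers, assuming each worker is launched as php worker.php <workerIndex> <workerCount> (the file and variable names are illustrative):
$workerIndex = (int) ($argv[1] ?? 0);
$workerCount = (int) ($argv[2] ?? 1);
$feedUrls    = require __DIR__ . '/feeds.php';   // returns the full array of venue URLs

foreach ($feedUrls as $i => $url) {
    if ($i % $workerCount !== $workerIndex) {
        continue;                                // this feed belongs to another worker
    }
    // fetch and record $url exactly as before
}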
A language environment that parses JSON faster than php does might help, but only if the time it takes to parse the JSON is more important than the time it takes to wait for it to arrive.

Should I avoid a one hour sleep()?

I'm using the DigitalOcean API to create droplets when my web application needs the extra resources. Because of the way DigitalOcean charges (min. one hour increments), I'd like to keep a created server on for an hour so it's available for new tasks after the initial task is completed.
I'm thinking about formatting my script this way:
<?php
createDroplet($dropletid);
$time = time();
// run resource heavy task
sleep($time + 3599);
deleteDroplet($dropletid);
Is this the best way to achieve this?
It doesn't look like a good idea, but the code is so simple, nothing can compete with that. You would need to make sure your script can run, at least, for that long.
Note that sleep() should not have $time as an argument. It sleeps for the given number of seconds. $time contains many, many seconds.
I am worried that the script might get interrupted, and you will never delete the droplet. Not much you can do about that, given this script.
Also, the sleep() itself might get interrupted, causing it to sleep much shorter than you want. A better sleep would be:
$remainingTime = 3590;
do {
    $remainingTime = sleep($remainingTime);
} while ($remainingTime > 0);
This will catch an interrupt of sleep(). If sleep() fails and returns FALSE instead of the number of seconds left, the loop also ends. See the manual on sleep(): http://php.net/manual/en/function.sleep.php
Then there's the problem that you want to sleep for exactly 3599 seconds, so that you're only charged one hour. I wouldn't make it so close to one hour. You have to leave some time for DigitalOcean to execute stuff and log the time. I would start with 3590 seconds and make sure that always works.
Finally: What are the alternatives? Clearly this could be a cron job. How would that work? Suppose you execute a PHP script every minute, and you have a database entry that tells you which resource to allocate at a certain start time and deallocate at a certain expire time. Then that script could do this for you with an accuracy of about a minute, which should be enough. Even if the server crashes and restarts, as long as the database is intact and the script runs again, everything should go as planned. I know, this is far more work to implement, but it is the better way to do it.
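A rough sketch of that cron-driven script, run from a per-minute crontab entry such as * * * * * php /path/to/droplet-janitor.php. The table, columns, and PDO connection details are illustrative assumptions; createDroplet()/deleteDroplet() are the functions from the question:
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

// Deallocate droplets whose paid hour has expired and that are still marked active.
$expired = $pdo->query(
    "SELECT droplet_id FROM droplet_schedule
     WHERE expires_at <= NOW() AND deleted = 0"
);
foreach ($expired as $row) {
    deleteDroplet($row['droplet_id']);
    $pdo->prepare("UPDATE droplet_schedule SET deleted = 1 WHERE droplet_id = ?")
        ->execute([$row['droplet_id']]);
}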

How do I observe a rate Limit in PHP?

So, I'm requesting data from an API.
So far, my API key is limited to:
10 requests every 10 seconds
500 requests every 10 minutes
Basically, I want to request a specific value from every game the user has played.
That's, for example, about 300 games.
So I have to make 300 requests with my PHP. How can I slow them down to observe the rate limit?
(It can take time, site does not have to be fast)
I tried sleep(), which resulted in my script crashing.. Any other ways to do this?
I suggest setting up a cron job that executes every minute, or even better use Laravel scheduling rather than using sleep or usleep to imitate a cron.
Here is some information on both:
https://laravel.com/docs/5.1/scheduling
http://www.cyberciti.biz/faq/how-do-i-add-jobs-to-cron-under-linux-or-unix-oses/
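For instance, a per-minute crontab entry such as * * * * * php /path/to/fetch-next-batch.php could run a script that works through a queue in small batches. The file name, queue format, and request helper below are illustrative; 10 requests once per minute stays inside both of the stated limits:
$queueFile = __DIR__ . '/pending-games.json';
$pending   = json_decode(file_get_contents($queueFile), true);

$batch = array_splice($pending, 0, 10);   // at most 10 requests per run
foreach ($batch as $gameId) {
    // requestGameValue($gameId);          // your existing single-request code
}

file_put_contents($queueFile, json_encode($pending));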
This sounds like a perfect use for the set_time_limit() function. This function allows you to specify how long your script may execute, in seconds. For example, if you say set_time_limit(45); at the beginning of your script, then the script can run for at most 45 seconds. One great feature of this function is that you can allow your script to execute indefinitely (no time limit) by saying set_time_limit(0);.
You may want to write your script using the following general structure:
<?php
// Ignore user aborts and allow the script
// to run forever
ignore_user_abort(true);
set_time_limit(0);
// Define constant for how much time must pass between batches of connections:
define('TIME_LIMIT', 10); // Seconds between batches of API requests
$tLast = 0;
while( /* Some condition to check if there are still API connections that need to be made */ ){
    if( time() >= ($tLast + TIME_LIMIT) ){ // Have TIME_LIMIT seconds passed since the last connection batch?
        // TIME_LIMIT seconds have passed since the last batch of connections
        /* Use cURL multi to make 10 asynchronous connections to the API */
        // Once all of those connections are made and processed, save the current time:
        $tLast = time();
    }else{
        // TIME_LIMIT seconds have not yet passed
        // Calculate the number of seconds remaining until TIME_LIMIT seconds have passed:
        $timeDifference = $tLast + TIME_LIMIT - time();
        sleep( $timeDifference ); // Sleep for the calculated number of seconds
    }
} // END WHILE-LOOP
/* Do any additional processing, computing, and output */
?>
Note: In this code snippet, I am also using the ignore_user_abort() function. As noted in the comment on the code, this function just allows the script to ignore a user abort, so if the user closes the browser (or connection) while your script is still executing, the script will continue retrieving and processing the data from the API anyway. You may want to disable that in your implementation, but I will leave that up to you.
Obviously this code is very incomplete, but it should give you a decent understanding of how you could possibly implement a solution for this problem.
Don't slow the individual requests down.
Instead, you'd typically use something like Redis to keep track of requests per-IP or per-user. Once the limit is hit for a time period, reject (with a HTTP 429 status code, perhaps) until the count resets.
http://redis.io/commands/INCR coupled with http://redis.io/commands/expire would easily do the trick.
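A sketch of that pattern with the phpredis extension, counting requests in 10-second windows (the key scheme and limits are illustrative):
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

$window = floor(time() / 10);                     // current 10-second window
$key    = 'ratelimit:' . $userId . ':' . $window; // $userId: however you identify the caller
$count  = $redis->incr($key);
if ($count === 1) {
    $redis->expire($key, 10);                     // let the counter die with the window
}

if ($count > 10) {
    http_response_code(429);                      // Too Many Requests
    exit;
}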

Call a function multiple times without waiting for it to complete

I have a cron job that calls a script that iterates through some items and submits them as posts to Facebook Graph API every minute. The issue is, each call takes a few seconds. If there is more than 10 posts to be sent to the API in a given minute, the script runs longer than a minute and then starts causing issues when the script starts running again at the next minute.
The general process is like this:
1. Each facebook profile posts every hour
2. Each of these profiles have a 'posting minute', which is the minute of the hour that they are posted at
3. A cron job runs every minute to see which profiles should be posted to in any given minute, and then posts to them
My question: Is it possible to continue the script immediately after calling the $facebook->api(...) method below, rather than waiting for it to complete before continuing? So that it can ensure to post to all profiles within the given minute, rather than potentially risk having too many profiles to post to and the script exceeding 60 seconds.
$profilesThatNeedToBePostedTo = getProfilesToPostTo(date('i'));
foreach ($profilesThatNeedToBePostedTo as $profile)
{
    $facebook->api($endPoint, 'POST', $postData); // $postData and $endPoint omitted for brevity
}
What you are asking is whether PHP can make asynchronous calls. There are some solutions here: Asynchronous PHP calls.
Just to point out, those are not the most reliable options.
Wouldn't limiting the number of posts to 9 per run reduce how often you exceed the 60 seconds? But what happens if the Facebook Graph is slow? What do you do then?
Or run your code for 60 seconds and, once the 60 seconds are up, stop the function and let it run again the next time.
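For the asynchronous route, one built-in option is PHP's curl_multi API, which starts all the Graph requests in parallel instead of waiting on each one. This sketch bypasses the SDK and POSTs to the endpoint directly, reusing the question's $endPoint/$postData names (in real code they would vary per profile):
$mh      = curl_multi_init();
$handles = [];

foreach ($profilesThatNeedToBePostedTo as $i => $profile) {
    $ch = curl_init('https://graph.facebook.com/' . $endPoint);
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($postData),
        CURLOPT_RETURNTRANSFER => true,
    ]);
    curl_multi_add_handle($mh, $ch);
    $handles[$i] = $ch;
}

// Run all transfers in parallel.
do {
    $status = curl_multi_exec($mh, $running);
    if ($running) {
        curl_multi_select($mh);   // wait for activity instead of busy-looping
    }
} while ($running && $status === CURLM_OK);

foreach ($handles as $ch) {
    // curl_multi_getcontent($ch) holds each response if you need to inspect it
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);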
I am not sure I understand your question one hundred percent clearly. You mean that you want to run the cron job every minute, but it takes longer than 1 minute to call the Facebook API, so the foreach loop may delay your cron job?
If so, I wonder whether your cron job script is set up correctly; it would be better to show it, because the PHP script should not affect the sh script (the cron job) itself.

php - how to determine execution time?

I have to process many large images (about 10 MB each).
How do I determine the execution time needed to process all of the images in the queue?
Determine the time base from the data that we have.
Set the time limit.
Run process.
And what does the execution time depend on? (Server, internet speed, type of data...?)
#All:
I have changed my approach to this issue: I now send 1 request per image,
so with 40 images we will have 40 requests. No need to care about execution time :D
Thanks
You can test your setup with this code:
// first image
$start = time();
// ... work on image ...
$elapsed = time() - $start;
// Warn if the projected total (per-image time * image count) exceeds the configured limit.
$limit = (int) ini_get('max_execution_time');
if ($limit > 0 && $limit < $elapsed * $imageCount)
    trigger_error("may not be able to finish in time", E_USER_WARNING);
Please note two things: in the CLI version of PHP, the max_execution_time is hardcoded to 0 / infinity (according to this comment). Also, you may reset the timer by calling set_time_limit() again, like so:
foreach ($imagelist as $image) {
    // ... do image work ....
    // reset timer to 30
    set_time_limit(30);
}
That way, you can let your script run forever, or at least until you're finished with your image processing. If you instead raise the limit through an .htaccess override, you must enable the appropriate override rules in the Apache configuration (AllowOverride All).
I would suggest (given the limited info in the question) that you use trial and error: run your process and see how long it takes, increase the time limit until it completes, and see whether you can shorten the process itself.
Be aware that server processing time can vary a LOT depending on the current load from other processes. If it's a shared server, another user may be running some script at that exact time, making your script perform only half as well.
I think it's going to be hard to determine the execution time BEFORE the script is run.
I would upload batches (small groups) of images. The number of images would depend on some testing.
For example, run your script several times simultaneously from different pages to see if they all still complete without breaking. If it works with 5 images in the queue, write your script to process 5 images. After the first five images have been processed, store them (write to the database or whatever you need), wait a little bit, then take the next 5 images.
If it works when you run three scripts with 5 images each at the same time, you should be safe running it once regardless of what other users on the server are doing.
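A rough sketch of that batching idea (the queue variable, batch size, and processOne() helper are illustrative):
$batchSize = 5;
foreach (array_chunk($imageQueue, $batchSize) as $batch) {
    foreach ($batch as $image) {
        processOne($image);   // your existing per-image work
    }
    // store/record the results for this batch, then pause briefly
    sleep(2);
}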
You can change the execution time limit in php.ini, or if you don't have access to that file you can set it on the fly with set_time_limit(600) for 600 seconds. I would, however, write smarter code instead of relying on the time limit.
My five cents. Good luck!
