I'm currently working on a long polling script where I have to check a database for new changes.
I'm wondering whether running the query in a while loop with no delay at all would consume too many resources, or whether I should add a small delay, like one second.
When I look at Facebook, for example, new changes seem to appear within a second, so I guess they either have no delay while checking the database or the delay is really short, like half a second.
I don't expect a definitive answer, just advice on best practices for this.
Thanks
Yeah, if the while loop runs as fast as it can (it will), you'll not only poll at a rate that might overwhelm the database connection, you'll also not get a very 'even' rate of polling.
Try something like
$polllength = 1; // desired seconds between polls
while (true) {
    $polltime = microtime(true);
    // poll function call;
    $endtime = microtime(true);
    $sleeptime = $polllength - ($endtime - $polltime);
    if ($sleeptime > 0) { // skip sleeping if the poll already overran the interval
        usleep((int)($sleeptime * 1000000)); // usleep() takes microseconds; sleep() only accepts whole seconds
    }
}
This way your polls are about $polllength seconds apart no matter how the polling function's duration varies (it will, due to INTARNETS).
EDIT: also, make sure there's a way out of that while loop, but everyone should know that, y'all.
For load balancing, you want to be able to tweak the polllength value somehow rather than hardcoding it completely. Whether that's in a configuration file or whatnot is up to you; it might even be a value that increases as load increases. That's up to how it actually 'feels' to the end user and how the server is really faring. A good rule of thumb is that a maximum of n threads supports about n * (poll length / avg response time) users; for example, 10 threads with a 1-second poll length and a 100 ms average response time gives roughly 10 * (1 / 0.1) = 100 users. Reaching beyond that limit would doubtlessly increase response times as users wait for responses from the overwhelmed server.
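As a minimal sketch of the configurable approach (the file name and key here are assumptions, not anything standard):

// Hypothetical config-file approach: tune the poll length without code changes
$config = parse_ini_file('poller.ini'); // e.g. the file contains: polllength = 1
$polllength = isset($config['polllength']) ? (float)$config['polllength'] : 1.0; // default to one second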
I'm using the DigitalOcean API to create droplets when my web application needs the extra resources. Because of the way DigitalOcean charges (min. one hour increments), I'd like to keep a created server on for an hour so it's available for new tasks after the initial task is completed.
I'm thinking about formatting my script this way:
<?php
createDroplet($dropletid);
$time = time();
// run resource heavy task
sleep($time + 3599);
deleteDroplet($dropletid);
Is this the best way to achieve this?
It doesn't look like a good idea, but the code is so simple that nothing can compete with that. You would need to make sure your script can actually run for that long.
Note that sleep() should not have $time as an argument. It sleeps for the given number of seconds. $time contains many, many seconds.
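Presumably the intended call is just the duration itself:

sleep(3599); // sleep for just under an hour, not until a timestamp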
I am worried that the script might get interrupted, and you will never delete the droplet. Not much you can do about that, given this script.
Also, the sleep() itself might get interrupted, causing it to sleep much shorter than you want. A better sleep would be:
$remainingTime = 3590;
do {
    $remainingTime = sleep($remainingTime);
} while ($remainingTime > 0);
This will catch an interrupt of sleep(), because an interrupted sleep() returns the number of seconds left to sleep. Note that on error sleep() returns FALSE, which does not compare as greater than zero, so the loop still terminates. See the manual on sleep(): http://php.net/manual/en/function.sleep.php
Then there's the problem that you want to sleep for exactly 3599 seconds, so that you're only charged one hour. I wouldn't make it so close to one hour. You have to leave some time for DigitalOcean to execute stuff and log the time. I would start with 3590 seconds and make sure that always works.
Finally: what are the alternatives? Clearly this could be a cron job. How would that work? Suppose you execute a PHP script every minute, and you have a database entry that tells you which resource to allocate at a certain start time and deallocate at a certain expiry time. Then that script could do this for you with an accuracy of about a minute, which should be enough. Even if the server crashes and restarts, as long as the database is intact and the script runs again, everything should go as planned. I know this is far more work to implement, but it is the better way to do it.
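A rough sketch of that cron script, under the assumption of a droplets table with expires_at and deleted columns (all names here are made up for illustration):

<?php
// Hypothetical cron script, run every minute: deallocate droplets whose
// expiry time has passed.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$stmt = $pdo->query('SELECT droplet_id FROM droplets WHERE expires_at <= NOW() AND deleted = 0');
foreach ($stmt->fetchAll(PDO::FETCH_COLUMN) as $dropletId) {
    deleteDroplet($dropletId); // same helper as in the question
    $pdo->prepare('UPDATE droplets SET deleted = 1 WHERE droplet_id = ?')
        ->execute([$dropletId]);
}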
I wanted to know if there is a way to prevent a PHP timeout from occurring once a part of the code has started being processed.
Let me explain:
I have a script that takes way too long to execute, long enough that it isn't practical to just use
ini_set('max_execution_time', 0);
set_time_limit(0);
The code is built to allow it to time out and restart where it left off, but I have two lines of code that need to be executed together for that to happen:
$email->save();
$this->_setDoneWebsiteId($user['id'], $websiteId);
Is there a way in PHP to tell it that it has to finish executing both of them even if the timeout is hit?
I got an idea as I was writing this: I could use a timeout of 120 seconds, start a timer, and stop if there are fewer than 20 seconds left before the timeout. I just wanted to know if I was missing something.
Thank you for your inputs.
If your code is not synchronous and some task takes more than 100 seconds, you'll not be able to check the execution time.
I see only one true HACK (be careful; test it with php -f in the console so you're able to kill the process):
<?php
// Any preparations here
register_shutdown_function(function () {
    if (error_get_last()) { // there was a timeout-exceeded error
        // Call the rest of your system
        // But note: you have no stack, no context, no valid previous frame - nothing!
    }
});
One thing you could do is use the date/time functions to monitor your average execution time (the average builds up with each iteration, assuming you have a loop).
If that average is longer than however much time you have left (counting the time already taken against your maximum execution time), you would trigger a restart and let the script pick up from where it left off.
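A minimal sketch of that idea, assuming a 120-second limit and hypothetical doOneUnitOfWork()/saveProgress() helpers:

$limit = 120; // assumed max_execution_time in seconds
$start = microtime(true);
$totalTime = 0.0;
$iterations = 0;
while ($workRemaining) { // hypothetical loop condition
    $t0 = microtime(true);
    doOneUnitOfWork(); // hypothetical helper
    $totalTime += microtime(true) - $t0;
    $iterations++;
    $average = $totalTime / $iterations;
    // stop early if the next iteration might not fit, keeping a 20-second margin
    if ((microtime(true) - $start) + $average > $limit - 20) {
        saveProgress(); // hypothetical helper; the restart picks up from here
        break;
    }
}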
However, if you are experiencing timeouts, you might also want to look at ways to make your code more efficient.
No, you can't abort the timeout handler, but I'd say that 20 seconds is quite a lot of time if you're not parsing something huge. However, you can do the following:
Get the time of the execution start ($start = $_SERVER['REQUEST_TIME'] or just $start = microtime(true); at the beginning of your controller).
Assert that the execution time is less than 100 seconds before running $email->save(), and halt/skip the code if necessary. This is as easy as if (microtime(true) - $start < 100) { $email->save(); ... }. You would probably want more abstraction on this, however.
(If necessary) check the execution time after both methods have run and halt execution if it has timed out.
This will require time_limit set to a big value or even turned off, plus tedious work to prevent overly long execution. In most cases the timeout is your friend: it just tells you your work is taking too much time and you should rethink your architecture and inner processes; if you're running past 120 seconds, you'll probably want to put that work in a daemon.
Thank you for your input; as I thought, the timer solution is the best way.
What I ended up doing was the following. This is not the actual code, as it's too long to make a good answer, but it's the general idea:
ini_set('max_execution_time', 180);
set_time_limit(180);

$startTime = date('U'); // gives me the current timestamp

while (true) {
    // gather all the data I need for the email

    // I know it's overkill to break the loop with 1 min remaining,
    // but I really don't want to take chances
    if (date('U') < ($startTime + 120)) {
        $email->save();
        $this->_setDoneWebsiteId($user['id'], $websiteId);
    } else {
        return false;
    }
}
I could not use the idea of measuring the average time of each cycle, as it varies too much.
I could have made the code more efficient, but the number of cycles is based on the number of users and websites in the framework. It will grow big enough to need multiple runs to complete anyway.
I'll have to do some research to understand register_shutdown_function, but I will look into it.
Again Thank you!
Is there a way to query for a set time from a PHP script?
I want to write a PHP script that takes an id and then queries the MySQL database to see if there is a match. Another user may not yet have uploaded their match, so I am aiming to query until I find a match or until 5 seconds have passed, at which point I will return 0.
In pseudocode this is what I was thinking, but it doesn't seem like a good method, since I've read that looping queries isn't good practice.
$id_in = 123;
$time_c = time();
$time_stop = $time_c + 5; // seconds

while ($time_c < $time_stop) {
    $time_c = time();
    $result = mysql_query("SELECT * WHERE id=$id_in");
}
It sounds like your requirement is to poll some table until a row with a particular ID shows up. You'll need a query like this to do that:
SELECT some-column, another-column FROM some-table WHERE id=$id_in
(Pro tip: don't use SELECT * in software.)
It seems that you want to poll for five seconds and then give up. So let's work through this.
One choice is to simply sleep(5), then poll the table using your query. The advantage of this is that it's very simple.
Another choice is what you have. This will make your php program hammer away at the table as fast as it can, over and over, until the poll succeeds or until your five seconds run out. The advantage of this approach is that your php program won't be asleep when the other program hits the table. In other words, it will pick up the change to the table with minimum latency. This choice, however, has an enormous disadvantage. By hammering away at the table as fast as you can, you'll tie up resources on the MySQL server. This is generally wasteful. It will prevent your application from scaling up efficiently (what if you have ten thousand users all doing this?) Specifically, it may slow down the other program trying to hit the table, so it can't get the update done in five seconds.
There's middle ground, however. Try doing a half-second wait
usleep(500000);
right before each time you poll the table. That won't waste MySQL resources as badly. Even if your php program is asleep when the other program hits the table, it won't be asleep for long.
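Putting it together, a minimal sketch of the five-second poll with the half-second wait (the table and column names, and the mysqli connection, are assumptions):

$deadline = microtime(true) + 5;
$row = null;
while (microtime(true) < $deadline) {
    $result = $mysqli->query("SELECT some_column, another_column FROM some_table WHERE id = $id_in");
    if ($result && $result->num_rows > 0) {
        $row = $result->fetch_assoc(); // found the match
        break;
    }
    usleep(500000); // wait half a second before polling again
}
// $row is still null here if nothing appeared within five seconds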
There is no need to do simple polling and sleeping. I don't know your exact requirements, but in general your question asks for GET_LOCK() or PHP's Semaphore support.
Assuming your uploading process starts with
SELECT GET_LOCK("id-123", 0);
Your Query thread can then wait on that lock:
SELECT GET_LOCK("id-123", 5) as locked, entities.*
FROM entities WHERE id = 123;
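For completeness, a hypothetical sketch of the uploader side, which should release the lock once the row is written so waiting readers wake up immediately:

$mysqli->query("SELECT GET_LOCK('id-123', 0)");  // acquire before writing
// ... insert the entity row here ...
$mysqli->query("SELECT RELEASE_LOCK('id-123')"); // wakes up the waiting query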
You might eventually find that a TRIGGER is the thing you were looking for.
I have quite a long, memory-intensive loop. I can't run it in one go because my server places a time limit on execution and/or I run out of memory.
I want to split up this loop into smaller chunks.
I had an idea to split the loop into smaller chunks and then set a location header to reload the script with new starting conditions.
MY OLD SCRIPT (Pseudocode. I'm aware of the shortcomings below)
for($i=0;$i<1000;$i++)
{
//FUNCTION
}
MY NEW SCRIPT
$start = (int)$_GET['start'];
$end = $start + 10;
for ($i = $start; $i < $end; $i++)
{
    //FUNCTION
}
header("Location: script.php?start=$end");
exit; // make sure nothing runs after the redirect header is sent
However, my new script runs successfully for a few iterations and then I get a server error "Too many redirects"
Is there a way around this? Can someone suggest a better strategy?
I'm on a shared server so I can't increase memory allocation or script execution time.
I'd like a PHP solution.
Thanks.
"Too many redirects" is a browser error, so a PHP solution would be to use cURL or standard streams to load the initial page and let it follow all redirects. You would have to run this from a machine without time-out limitations though (e.g. using CLI)
Another thing to consider is to use AJAX. A piece of JavaScript on your page can call your script, gather its output, and determine whether to stop (end of computation) or continue (start from X). This way you can create a nifty progress meter too ;-)
You probably want to look into forking child processes to do the work. These child processes can do the work in smaller chunks in their own memory space, while the parent process fires off multiple children. This is commonly handled by Gearman, but can be done without.
Take a look at Forking PHP on Dealnews' Developers site. It has a library and some sample code to help manage code that needs to spawn child processes.
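A rough sketch of plain pcntl_fork (CLI only; the ten chunks of 100 are assumptions standing in for the question's 1000 iterations):

for ($chunk = 0; $chunk < 10; $chunk++) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        die("fork failed\n");
    } elseif ($pid === 0) { // child: work on its own slice in its own memory space
        for ($i = $chunk * 100; $i < ($chunk + 1) * 100; $i++) {
            //FUNCTION
        }
        exit(0); // child must exit or it will keep forking
    }
    pcntl_waitpid($pid, $status); // parent: run the chunks one at a time
}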
Generally, if I have to iterate over something many, many times and it has a decent amount of data, I use a "lazy load" type approach like:
for($i=$start;$i<$end;$i++;)
{
$data_holder[] = "adding my big data chunks!";
if($i % 5 == 1){
//function to process data
process_data($data_holder); // process that data like a boss!
unset($data_holder); // This frees up the memory
}
}
// Now pick up the stragglers of whatever is left in the data chunk
if(count($data_holder) > 0){
process_data($data_holder);
}
That way you can continue to iterate through your data, but you don't stuff up your memory. You can work in chunks, then clear the data, work in chunks, clear the data, etc., to help prevent memory problems. As far as execution time goes, that depends on how much you have to do and how efficiently your script is written.
The basic premise -- "Process your data in smaller chunks to avoid memory issues. Keep your design simple to keep it fast."
How about you put a conditional inside your loop to sleep every 100 iterations?
for ($i = 0; $i < 1000; $i++)
{
if ($i % 100 == 0)
sleep(1800) //Sleep for half an hour
}
First off, without knowing what you're doing inside the loop, it's hard to tell you the best approach to actually solving your issue. However, if you want to execute something that takes a really long time, my suggestion would be to set up a cron job and let it knock out little portions at a time. The script would log where it stops, and the next time it starts up, it could read the log to know where to start.
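A minimal sketch of that cron-driven pattern (the progress-file path and chunk size are assumptions):

$progressFile = '/tmp/loop_progress.txt';
$start = is_file($progressFile) ? (int)file_get_contents($progressFile) : 0;
$end = min($start + 50, 1000); // chunk of 50 per cron run; 1000 total as in the question

for ($i = $start; $i < $end; $i++) {
    //FUNCTION
}

file_put_contents($progressFile, $end); // log where to resume on the next run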
Edit: If you are dead set against cron, and you aren't too concerned about user experience, you could do this:
Let the page load similarly to the cron job above, except after so many seconds or iterations, stop the script. Display a meta refresh tag or a JavaScript refresh. Do this until the task is done.
With the limitations you have, I think the approach you are using could work. It may be that your browser is trying to be smart and not let you redirect back the page you were just on. It might be trying to prevent an endless loop.
You could try
Redirecting back and forth between two scripts that are identical (or aliases).
A different browser.
Having your script output an HTML page with a refresh tag, e.g.
<meta http-equiv="refresh" content="1; url=http://example.com/script.php?start=xxx">
I am developing an image bank site that will hold royalty-free images for download. I want to slow down anyone using a bot or downloading too often, so I have a daily file limit and have incorporated a variable sleep into the script that delivers the files. I do that by writing the completion time of the last download to a database, then checking the elapsed time when the next download begins. If that is less than N seconds, I delay the download by M seconds, doubling M on successive infractions. That works fine until the script hits the server's execution time limit.
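For reference, a rough sketch of that throttle (the table/column names, the $pdo connection and $ip value, and the N/M numbers are all assumptions):

$minGap   = 5; // N: required seconds between downloads
$baseWait = 2; // M: initial penalty in seconds

$stmt = $pdo->prepare('SELECT last_download, penalty FROM downloaders WHERE ip = ?');
$stmt->execute([$ip]);
$row = $stmt->fetch();

if ($row && (time() - $row['last_download']) < $minGap) {
    $wait = max((int)$row['penalty'], $baseWait);
    sleep($wait);            // note: this sleep counts toward execution time
    $newPenalty = $wait * 2; // double on successive infractions
} else {
    $newPenalty = $baseWait;
}
// ... serve the file, then record the completion time and $newPenalty ...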
My hosting company confirms that sleep time counts towards execution time.
Am I being over-cautious at the development stage?
Any suggestions about how to detect and slow down users who are abusing the site without using PHP's sleep()?
I don't think you're being over-cautious, but I do think that this is a bad way to be cautious. If sleep time counts toward execution time, aren't you paying for that? It probably also counts toward CPU usage and a bunch of other cost factors too. Additionally, slowly choking off service doesn't give your user any indication that they are doing something wrong, it just makes your service seem slow.
You'd probably be better off serving a friendly message-image letting the person know what's going on so they can modify their behavior (this is particularly good given that some people might trigger it by accident while performing completely innocent activities). If they trigger your message-image more than five or ten times, then it's definitely a script, so just stop answering their requests entirely.
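A hypothetical sketch of that cutoff ($strikes would come from your own per-client tracking, which is an assumption):

if ($strikes > 10) {
    http_response_code(429); // Too Many Requests: stop serving this client
    exit;
}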
Why don't you simply make the user aware of what he/she is doing "wrong" and display an error?
This way, the user will know what is going on and might decide to correct the behavior. With random delays, I would suspect something is wrong with your server and might just look for a competing offering that works more reliably.
Use a div with a countdown timer and implement the delay mechanism in JavaScript (example: www.rapidshare.com). If sleep time is counted as execution time, you have a pretty high chance of crossing the execution time limit.
If any one delay is much longer than the script execution timeout, you might want to block that user entirely for some period of time (24 hours?).
How are you deciding exactly who is aggressively downloading? The IP address is not 100% reliable, as you might have a number of people behind NAT that all appear to come from the same IP address.