Until recently I wasn't even aware it was possible for PHP to abort a script due to user disconnect.
Anyways, it could cause some real trouble for my database if the script could just abort midway through. Like if I'm inserting rows into multiple tables that partially depend on each other and only half of it gets done, I'd have to get real defensive with my programming.
Oddly enough, I found that ignore_user_abort defaults to false (at least on my installation), which seems like the sort of thing that could confuse the hell out of developers not aware of this possibility when something goes wrong because of it.
So to make things easier, shouldn't I just always set it to true? Or is there a good reason why it defaults to false?
Passing true to ignore_user_abort() as its only parameter will instruct PHP that the script is not to be terminated even if your end-user closes their browser, has navigated away to another site, or has clicked Stop. This is useful if you have some important processing to do and you do not want to stop it even if your users click cancel, such as running a payment through on a credit card. You can of course also pass false to ignore_user_abort(), thereby making PHP exit when the user closes the connection.
For handling shutdown tasks, register_shutdown_function() is perfect, as it allows you to register with PHP a function to be run when script execution ends. So it depends on your project.
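A minimal sketch of how the two calls could be combined. The helper name shutdown_message() is made up for illustration; ignore_user_abort(), register_shutdown_function() and connection_status() are the built-in functions.

```php
<?php
// Keep running even if the client disconnects; returns the previous setting.
$previous = ignore_user_abort(true);

// Hypothetical helper: describes how the request ended, given the
// bitfield returned by connection_status().
function shutdown_message(int $status): string
{
    return ($status & CONNECTION_ABORTED) ? "client disconnected" : "finished normally";
}

// Runs when script execution ends, whether or not the user aborted.
register_shutdown_function(function (): void {
    error_log(shutdown_message(connection_status()));
});
```

With ignore_user_abort(true) in place, the script keeps going after a disconnect and the shutdown function can still record that the client went away.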
Anyways, it could cause some real trouble for my database if the script could just abort midway through. Like if I'm inserting rows into multiple tables that partially depend on each other and only half of it gets done, I'd have to get real defensive with my programming.
This can happen with or without ignore_user_abort, and should be addressed using database transactions.
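As a rough sketch of the transaction approach, assuming PDO with an in-memory SQLite database and two made-up tables (orders, order_items); the helper name insert_order() is hypothetical:

```php
<?php
// Hypothetical helper: inserts an order and its items as one atomic unit.
function insert_order(PDO $db, string $customer, array $items): int
{
    $db->beginTransaction();
    try {
        $stmt = $db->prepare("INSERT INTO orders (customer) VALUES (?)");
        $stmt->execute([$customer]);
        $orderId = (int) $db->lastInsertId();

        $stmt = $db->prepare("INSERT INTO order_items (order_id, name) VALUES (?, ?)");
        foreach ($items as $name) {
            $stmt->execute([$orderId, $name]);
        }
        $db->commit();
        return $orderId;
    } catch (Throwable $e) {
        // Undo every insert made since beginTransaction().
        $db->rollBack();
        throw $e;
    }
}

// Assumed demo schema, for illustration only.
$db = new PDO("sqlite::memory:");
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT NOT NULL)");
$db->exec("CREATE TABLE order_items (order_id INTEGER NOT NULL, name TEXT NOT NULL)");
```

If anything fails between beginTransaction() and commit(), none of the rows from that order remain, so the tables never end up half-filled.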
So to make things easier, shouldn't I just always set it to true? Or is there a good reason why it defaults to false?
Since people are typically writing PHP code for the web, ignoring a user abort means your server would be sitting around doing useless work that's never going to be of value. Enough of them and you might find your server bogged down on abandoned, long-running HTTP requests.
If you've got lots of long-running requests that should ignore a user abort, a queue is a much better approach.
Hi, I have had an issue where two visitors hit a PHP function within a second of each other. This function sends each visitor a one-time-use code from a pool of codes, and it sent both people the same code.
What methods can I use in my script to check if someone else is already being processed and either delay or wait for the other person to finish?
I know this seems like a really general question; it's hard to explain what I mean! Hopefully someone can help!
What methods can I use in my script to check if someone else is already being processed and either delay or wait for the other person to finish?
That would be what we call a "mutex", short for mutual exclusion.
Notice that without knowing how your PHP is run on your server, it's hard to know whether PHP's built-in mutex routines will work. PHP is a bad language when it comes to multithreading.
If your pool of codes lives in the database, you could use transactions and lock tables for reading when one of the requests is trying to obtain the code. Wherever the data are, you will have to introduce some way of locking or queuing to deal with concurrent requests.
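If the pool happened to live in a plain file, a lock-based version might look like the sketch below. It assumes a newline-separated pool file and a made-up helper name claim_code(); flock() is advisory, so it only helps if every request goes through the same function and the same filesystem.

```php
<?php
// Hypothetical helper: removes and returns the first code in the pool file
// while holding an exclusive lock, so two requests cannot get the same code.
function claim_code(string $poolFile): ?string
{
    $fh = fopen($poolFile, "c+");
    if ($fh === false) {
        return null;
    }
    try {
        // Blocks until any other request holding the lock releases it.
        if (!flock($fh, LOCK_EX)) {
            return null;
        }
        $lines = array_map("trim", explode("\n", stream_get_contents($fh)));
        $codes = array_values(array_filter($lines, "strlen"));
        if ($codes === []) {
            return null; // pool is empty
        }
        $code = array_shift($codes);
        // Rewrite the file without the claimed code before unlocking.
        ftruncate($fh, 0);
        rewind($fh);
        fwrite($fh, implode("\n", $codes));
        fflush($fh);
        return $code;
    } finally {
        flock($fh, LOCK_UN);
        fclose($fh);
    }
}
```

A second request arriving while the first holds the lock simply waits at flock() and then sees the pool with the first code already removed.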
This will be a newbie question, but I'm learning PHP for one sole purpose (atm), to implement a solution; everything I've learned about PHP was learned in the last 18 hours.
The goal is adding indirection to my javascript get requests to allow for cross-domain accesses of another website. I also don't wish to throttle said website and want to put safeguards in place. I can't rely on them being in javascript because that can't account for other peers sending their requests.
So right now I have the following makeshift code, without any throttling measures:
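<?php
// Seconds before a cached copy is considered stale.
$expires = 15;

// Assumed guard: only proxy http(s) URLs, never local paths.
$target = $_GET["target"] ?? "";
if (!preg_match('#^https?://#i', $target))
    exit();

// Hashing keeps the request parameter out of the file path.
$file = "cache/" . md5($target);

if (empty($_GET["cache"])) {
    // Assumed intent: serve the cached copy while it is fresh,
    // otherwise fetch the target again.
    if (is_file($file) && time() - filemtime($file) <= $expires)
        echo file_get_contents($file);
    else
        echo file_get_contents($target);
}
else if (isset($_GET["data"])) {
    // Store the data supplied by the client under the hashed name.
    file_put_contents($file, $_GET["data"]);
}
?>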
It works perfectly, as far as I can tell (doesn't account for the improbable checksum clash). Now what I want to know is, and what my search queries in google refuse to procure for me, is how php actually launches and when it ends.
Obviously if I was running my own web server I'd have a bit more insight into this: I'm not, I have no shell access either.
Basically I'm trying to figure out whether I can control for when the script ends in the code, and whether every 'get' request to the php file would launch a new instance of the script or whether it can 'wake up' the same script. The reason being I wish to track whether, say, it already sent a request to 'target' within the last n milliseconds, and it seems a bit wasteful to dump the value to a savefile and then recover it, over and over, for something that doesn't need to be kept in memory for very long.
Every HTTP request starts a new instance of the interpreter; it's basically an implementation detail whether this is a whole new process, or a reuse of an existing one.
This generally pushes you towards good simple and scalable designs: you can run multiple server processes and threads and you won't get varying behaviour depending whether the request goes back to the same instance or not.
Loading a recently-touched file will be very fast on Linux, since it will come right from the cache. Don't worry about it.
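For the "did we already hit the target in the last n milliseconds" check, a small timestamp file is probably enough. A sketch, assuming a made-up helper name may_request() and a writable path for the stamp file:

```php
<?php
// Hypothetical helper: returns true if at least $minIntervalMs have passed
// since the last allowed request, and records the new time when it does.
function may_request(string $stampFile, int $minIntervalMs, ?float $now = null): bool
{
    $now = $now ?? microtime(true);
    $fh = fopen($stampFile, "c+");
    if ($fh === false) {
        return false;
    }
    try {
        // Exclusive lock so concurrent requests check and update in turn.
        flock($fh, LOCK_EX);
        // An empty or missing stamp reads as 0.0, i.e. "long ago".
        $last = (float) stream_get_contents($fh);
        if (($now - $last) * 1000 < $minIntervalMs) {
            return false;
        }
        ftruncate($fh, 0);
        rewind($fh);
        fwrite($fh, sprintf("%.6F", $now));
        return true;
    } finally {
        flock($fh, LOCK_UN);
        fclose($fh);
    }
}
```

Each request would call something like may_request("cache/last_request", 500) before contacting the target and fall back to the cached copy when it returns false.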
Do worry about the fact that directly appending request parameters to a path is a serious security hole: people can send data=../../../etc/passwd and so on. Read http://www.php.net/manual/en/security.variables.php and so on. (In this particular example you're hashing the inputs before putting them in the path, so it's not a practical problem, but it is something to watch for.)
More generally, if you want to hold a cache across multiple requests the typical thing these days is to use memcached.
PHP runs on a per-request basis, i.e. each request for a PHP file is handled as a new instance. Each instance generally ends when the connection is closed. You can, however, use sessions to save data between requests for a specific user.
For basic use of sessions look into:
session_start()
$_SESSION
session_destroy()
I have a PHP function that I want to make available publicly on the web, but it uses a lot of server resources each time it is called.
What I'd like to happen is that a user who calls this function is forced to wait for some time, before the function is called (or, at the least, before they can call it a second time).
I'd greatly prefer this 'wait' to be enforced on the server-side, so that it can't be overridden by dubious clients.
I plan to insist that users log into an online account.
Is there an efficient way I can make the user wait, without using server resources?
Would 'sleep()' be an appropriate way to do this?
Are there any suggested problems with using sleep()?
Is there a better solution to this?
Excuse my ignorance, and thanks!
sleep would be fine if you were using PHP as a command line tool for example. For a website though, your sleep will hold the connection open. Your webserver will only have a finite number of concurrent connections, so this could be used to DOS your site.
A better - but more involved - way would be to use a job queue. Add the task to a queue which is processed by a scheduled script and update the web page using AJAX or a meta-refresh.
sleep() is a bad idea in almost all possible situations. In your case, it's bad because it keeps the connection to the client open, and most webservers have a limit of open connections.
sleep() will not help you at all. The user could just load the page twice at the same time, and the command would be executed twice right after each other.
Instead, you could save a timestamp in your database for when your function was last invoked. Then, before invoking it, you should check the database to see if a suitable amount of time has passed. If it has, invoke the function and update the timestamp in the database.
If you're planning on enforcing a user login, then the problem just got a whole lot simpler.
Have a record in the database listing users and the last time they used your resource-consuming service, and measure the time difference between then and now. If the time difference is too small, deny access and display an error message.
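A sketch of that check, assuming PDO and a made-up users table with a last_used column holding a Unix timestamp; the helper name try_use_service() is hypothetical. Folding the check into the UPDATE's WHERE clause means two simultaneous requests from the same user cannot both pass.

```php
<?php
// Hypothetical helper: grants access only if the user's last use is at
// least $cooldown seconds old, and records the new time when it does.
function try_use_service(PDO $db, int $userId, int $cooldown, int $now): bool
{
    $stmt = $db->prepare(
        "UPDATE users SET last_used = :now
         WHERE id = :id AND (last_used IS NULL OR last_used <= :threshold)"
    );
    $stmt->bindValue(":now", $now, PDO::PARAM_INT);
    $stmt->bindValue(":id", $userId, PDO::PARAM_INT);
    $stmt->bindValue(":threshold", $now - $cooldown, PDO::PARAM_INT);
    $stmt->execute();
    // One updated row means the cooldown had expired for this user.
    return $stmt->rowCount() === 1;
}

// Assumed demo schema, for illustration only.
$db = new PDO("sqlite::memory:");
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec("CREATE TABLE users (id INTEGER PRIMARY KEY, last_used INTEGER)");
$db->exec("INSERT INTO users (id, last_used) VALUES (1, NULL)");
```

In a real page you would call try_use_service($db, $userId, 60, time()) and show the error message when it returns false.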
This is best handled at the server level. No reason to even invoke PHP for repeat requests.
Like many sites, I use Nginx, and you can use its rate limiting to block repeat requests over a certain number, say three requests per IP per minute.
I have a PHP script something like:
$i = 0;
for (; $i < 500; ++$i) {
    // Do some operation with files numbered 0 to 499
}
The thing is, the script works and displays the end results, but the operation takes a while and watching a blank screen can be frustrating. I was thinking if there is some way I can continuously update the page at the client's end, detailing which file is currently being worked upon. That is, can I display and continuously update what is the current value of $i?
The Solution
Thanks everyone! The output buffering is working as suggested. However, David has offered valuable insight and I am considering that approach as well.
You can buffer and control the output from the PHP script.
However, you may want to consider the scalability of this design. In general, heavy processes shouldn't be done online. Your particular case may be an edge case in that the wait is acceptable, but consider something like this as an alternative for an improved user experience:
The user kicks off a process. This can be as simple as setting a flag on a record in the database or inserting some "to be processed" records into the database.
The user is immediately directed to a page indicating that the process has been queued.
An offline process (either kicked off by the PHP script on the server or scheduled to run regularly) checks the data and does the heavy processing.
In the meantime, the user can refresh the page (manually, by navigating elsewhere and coming back to check, or even via an AJAX polling mechanism that updates the page) to check the status of the processing. In this case, it sounds like you'd have several hundred records in a database table queued for processing. As each one finishes, it can be flagged as done. The page can just check how many are left, which one is current, etc. from the data.
When the processing is completed, the page shows the result.
In general this is a better user experience because it doesn't force the user to wait. The user can navigate around the site and check back on progress as desired. Additionally, this approach scales better. If your heavy processing is done directly on the page, what happens when you have many users or the data processing load increases? Will the page start to time out? Will users have to wait longer? By making the process happen outside of the scope of the website you can offload it to better hardware if needed, ensure that records are processed in serial/parallel as business rules demand (avoid race conditions), save processing for off-peak hours, etc.
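The flow above could be sketched roughly like this, assuming PDO, a made-up jobs table and a single offline worker; the helper names enqueue_files(), process_next() and remaining() are hypothetical.

```php
<?php
// Called by the web page: queue one row per file to be processed.
function enqueue_files(PDO $db, int $count): void
{
    $stmt = $db->prepare("INSERT INTO jobs (file_no, done) VALUES (?, 0)");
    for ($i = 0; $i < $count; ++$i) {
        $stmt->execute([$i]);
    }
}

// Called by the offline worker: processes the lowest pending file, if any.
// Assumes only one worker runs at a time.
function process_next(PDO $db, callable $work): bool
{
    $fileNo = $db->query(
        "SELECT file_no FROM jobs WHERE done = 0 ORDER BY file_no LIMIT 1"
    )->fetchColumn();
    if ($fileNo === false) {
        return false; // queue is empty
    }
    $work((int) $fileNo);
    $db->prepare("UPDATE jobs SET done = 1 WHERE file_no = ?")->execute([$fileNo]);
    return true;
}

// Called by the status page (or an AJAX endpoint) to report progress.
function remaining(PDO $db): int
{
    return (int) $db->query("SELECT COUNT(*) FROM jobs WHERE done = 0")->fetchColumn();
}

// Assumed demo schema, for illustration only.
$db = new PDO("sqlite::memory:");
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec("CREATE TABLE jobs (file_no INTEGER PRIMARY KEY, done INTEGER NOT NULL)");
```

The worker would loop on process_next() from cron or the command line, while the page the user sees only ever calls remaining().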
Check out PHP's Output Buffering.
Try to use:
flush();
http://php.net/manual/ru/function.flush.php
Try the flush() function. Calling this function forces PHP to send whatever output it has so far to the client, instead of waiting for the script to end.
However, some web servers will only send the output once the entire page is done being built, so calling flush() would have no effect in this case.
Also, browsers themselves buffer input, so you may run into problems there. For example, certain versions of IE won't start displaying the page until 256 bytes have been received.
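Applied to the loop from the question, a sketch might look like this. The helper names progress_line() and run_with_progress() are made up, and whether the browser actually shows each line as it is sent still depends on the server and browser buffering described above.

```php
<?php
// Hypothetical helper: formats one progress line for file index $i.
function progress_line(int $i, int $total): string
{
    return sprintf("Processing file %d of %d<br>\n", $i + 1, $total);
}

// Hypothetical helper: runs $work for each index and pushes a progress
// line towards the client before each unit of work.
function run_with_progress(int $total, callable $work): void
{
    for ($i = 0; $i < $total; ++$i) {
        echo progress_line($i, $total);
        // Push the innermost output buffer (if any) down a level,
        // then ask PHP to hand what it has to the web server.
        if (ob_get_level() > 0) {
            ob_flush();
        }
        flush();
        $work($i);
    }
}
```

Usage would be something like run_with_progress(500, function ($i) { /* operate on file $i */ });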
What is the best way to break up a recursive function that is using a ton of resources?
For example:
function do_a_lot() {
    // a lot of code and processing is done here
    // it takes a lot of execution time
    if ($true) {
        // if true we have to do all of that processing again
        do_a_lot();
    }
}
Is there any way to make the server only have to take the brunt of the first execution and then break up the recursion into separate processes? Or am I dreaming?
Honestly, if a function were using up that much of the system's resources, I'd most likely refactor the code. That said, although it's not truly multithreading, you could perhaps look at using popen to run part of the work in a separate process.
One of the rules of PHP is "share nothing". That means every PHP process is independent and shares nothing with the others. So if you want to split your execution across several PHP processes, you'll have to store the data somewhere. It can be memcached, a database, or the session, as you like.
Then you'll need to 'fork' your PHP process. There are solutions available to get this done on the server side. IMHO these are all hacks: dangerous and not in keeping with the PHP/web way of doing things. The exception is 'work queue' tools.
I think the nicest way is to break up your task with AJAX. This gives you a clean user interface and avoids any long response timeout in the web process, i.e. show a 'working zone' to your user, then request the first step of the job via AJAX, get the response (storing it on the server side), then request the next step, store the new response and respond, and so on. You can even add a 'stop that stuff' function on the client side.
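On the server side, the step-by-step idea could be sketched like this. The helper names run_step() and handle_step_request(), the state layout and the fixed total of 3 steps are all made up for illustration; the state is kept in the session between AJAX calls.

```php
<?php
// Hypothetical helper: performs one unit of work and returns the new state.
function run_step(array $state): array
{
    $state["step"] += 1;
    $state["results"][] = "result of step " . $state["step"];
    $state["done"] = $state["step"] >= $state["total"];
    return $state;
}

// Hypothetical endpoint body: each AJAX call advances the job by one step
// and stores the intermediate state in the session array it is given.
function handle_step_request(array &$session): string
{
    $state = $session["job"]
        ?? ["step" => 0, "total" => 3, "results" => [], "done" => false];
    if (!$state["done"]) {
        $state = run_step($state);
    }
    $session["job"] = $state;
    // The client keeps asking for the next step until "done" is true.
    return json_encode(["step" => $state["step"], "done" => $state["done"]]);
}
```

In the actual endpoint you would call session_start() and then echo handle_step_request($_SESSION); a 'stop that stuff' button just means the client stops sending requests.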
You can also search for 'php work queue' on Google.
If it's a long-running task, divide and conquer with Gearman.