How to force a timeout on an HTTP request from PHP

I have a website which spambots visit a lot. I would like to stop these spambots from skewing my website statistics with non-human requests.
I run IIS 7 (FastCGI), mainly with PHP as the server-side language. I have PHP code in place that detects spambots; when it detects one, it doesn't log the entry to my request-history log file. This solves the problem.
However I would like to punish the spambots further, and hopefully slow down/deter their requests.
Is there any way I can NOT send an HTTP response to the spambot's HTTP request?
In PHP I figured I would use a sleep(30) timeout before calling exit(), but I hear this still consumes some server resources during the sleep, i.e. keeping memory allocated to the script, etc. Is there any way I can exit the PHP script and essentially send nothing to the spambot? That would hopefully lock its request thread until the client-side request timeout is reached.

Related

configure Apache and MySQL for parallel and cancelable service to clients

We have a client-server architecture with Angular on the client side and Apache2, PHP, PDO, and MySQL on the server side. The server side exposes an API to clients that gives them data to show.
Some observations:
Some API calls can take a very long time to compute and return a response.
The server side seems to handle a single request per client at any given time (I'm seeing only one corresponding query being executed in MySQL). That limit comes either from Apache or from MySQL, since the front end is definitely sending requests in parallel.
The front end cancels requests that are no longer relevant (the data being fetched will not be visible).
It seems that requests cancelled by the front end are not cancelled on the server side and continue to run anyway; I think even queued requests will still run when their turn arrives, despite having been cancelled on the client side.
Need help to understand:
What exactly is the cause of not having all requests (or at least more than one request) run in parallel? Can it be changed?
What configuration should I change in either Apache or MySQL to overcome this?
Is there a way to make Apache drop cancelled requests, at least those that are still queued and not yet started?
Thanks!
EDIT
Following @Markus AO's comment (thanks Markus!!!), this turned out to be session-locking related... wish I knew about that before!
OP has a number of tangled problems on the table. However I feel these are worthwhile concerns (having wrestled with them myself), so let's take this apart. For great justice; main screen turn on:
Solving Concurrent Request Problems
There are several possible problems and solutions with concurrent connections in a (L)AMP stack. Before looking at tuning Apache and MySQL, however, let me gloss a common "mystery" issue that creates concurrency problems; namely, a necessary evil called "PHP Session Locking".
PHP Session Blocking & Concurrent Requests
In a nutshell: When you use sessions in your application, after calling session_start(), PHP locks the session file stored at your session.save_path directory. This file lock will remain in place until the script ends, or session_write_close() is called. Result: Any subsequent calls by the same user will be queued, rather than concurrently processed, to ensure there's no session data corruption. (Imagine parallel scripts writing into the same $_SESSION!)
An easy way to demonstrate this is to create a long-running script; then call it in your browser; and then open a new tab, and call it again (or in fact, call any script sharing the same session cookie/ID). You'll see that the second call won't execute until the first one is concluded. This is a common cause of strange AJAX lags, especially with parallel AJAX requests from a single page. Processing will be consecutive instead of concurrent. Then, 10 calls at 0.3 sec each will take a total of 3 sec to conclude, and so on. We don't want that, do we!
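For instance, a minimal two-tab demo (the file name is illustrative):

    <?php
    // slow.php -- a minimal session-lock demonstration
    session_start();                    // acquires the lock on the session file
    sleep(10);                          // hold the lock for 10 seconds
    echo 'done at ' . date('H:i:s');

Open slow.php in two browser tabs: the second tab won't print until roughly 10 seconds after the first, because it is waiting on the same session lock.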
You can remedy request blocking caused by PHP session lock by ensuring that:
1. Scripts using sessions should call session_write_close() once done storing session data. The session lock will be immediately released.
2. Scripts that don't require sessions shouldn't start sessions to begin with.
3. Scripts that only need to read session data: using session_start() with the ['read_and_close' => true] option will give you a read-only (non-persistent) $_SESSION variable without session locking. (Available since PHP 7.)
Options 1 and 3 will leave you with read access for the $_SESSION variable and release/avoid the session lock. Any changes made to $_SESSION after the session is closed will be silently discarded; no warnings/errors are displayed.
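In code, remedies 1 and 3 look like this (a minimal sketch using only core session functions; the session keys are placeholders):

    // Remedy 1: store what you need, then release the lock early.
    session_start();
    $_SESSION['last_seen'] = time();
    session_write_close();      // lock released; later $_SESSION writes are discarded
    // ... long-running work continues without blocking the user's other requests ...

    // Remedy 3 (PHP 7+), in a separate read-only script:
    session_start(['read_and_close' => true]);
    $userId = $_SESSION['user_id'] ?? null;   // reading is fine; writes won't persist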
The session lock request blocking issue is only consequential for a single user (using the same session). It has no impact on multi-user concurrence. For further reading, please see:
SO: Session (Auto)-Start, Performance & Session Locking
SO: PHP & Sessions: Is there any way to disable PHP session locking?
In-Depth: PHP Session Locking: How To Prevent Sessions Blocking in PHP requests.
Apache & MySQL Concurrent Requests
Once upon a time, before realizing PHP was the culprit behind blocking/queuing my concurrent calls, I spent a small aeon in tweaking Apache and MySQL and wondering, what happen?
Apache 2.4 supports 150 concurrent requests by default; any further requests will queue up. There are several settings under the MPM/Multi-Processing Module that you can tune to support the desired level of concurrent connections. Please see:
MPM Docs
Worker Docs
Overview at Oxpedia
MySQL has options for max_connections (default 151) and max_user_connections (default unlimited). If your application sends a lot of concurrent requests per user, you'll want to ensure the global max connections is high enough to ensure a handful of users don't hog the entire DBMS.
Obviously, you'll further want to tune these settings in light of your server CPU/RAM specs. (The calculations for which are beyond this answer.) Your concurrency issues probably aren't caused by too many open TCP sockets, but hey, you never know...
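For orientation, a hedged sketch of the directives in question; the numbers are illustrative, not recommendations:

    # httpd.conf (Apache 2.4, mpm_event/worker): raise the concurrency ceiling
    <IfModule mpm_event_module>
        ServerLimit         16
        ThreadsPerChild     25
        MaxRequestWorkers   400    # must be <= ServerLimit * ThreadsPerChild
    </IfModule>

    # my.cnf, [mysqld] section: keep MySQL's ceilings in line with Apache's
    max_connections      = 400
    max_user_connections = 50     # cap any single account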
Canceling Requests to Apache/PHP/MySQL
We don't have much to go on as far as your application's specific wiring goes, but I understand from the comments that, as it stands, a user can cancel a request at the front-end, but no back-end action is taken. (I.e. any back-end response is simply ignored/discarded.)
"Is there a way to make Apache drop cancelled requests?" I'm assuming that your front-end sends the requests directly and without delay to Apache; and onward to PHP > MySQL > PHP > Apache. In that case, no, you can't really have Apache cancel the request that it's already received; or you could hit "stop", but chances are PHP and MySQL are already munching it away...
Holding a "Cancel Window"
However, you could program a "cancel window" lag into your front-end, where requests are only passed on to Apache after e.g. a 0.5-second sleep waiting for a possible cancel. This may or may not have a negative impact on the UX; it may be worth implementing to save server resources if a significant portion of requests are canceled. This assumes a UI with Javascript. If you're getting direct HTTP calls to the API, you could have a "sleepy proxy receiver" instead.
Using a "Cancel Controller"
How would one cancel PHP/MySQL processes? This is obviously only feasible/doable if calls to your API result in a processing time of any significant duration. If the back-end takes 0.28 sec to process and the user cancels after 0.3 seconds, then there isn't much left to cancel, is there.
However, if you do have scripts that may run longer, say into a couple of seconds, you could find relevant break-points in your code where you have a "not-canceled" check or a kill/rollback routine. Basically, you'd have the following flow:
Front-end sends request with unique ID to main script
PHP script begins the long march for building a response
On cancel: Front-end re-sends the ID to a light-weight cancel controller
Cancel controller logs ID to temporary file/database/wherever
PHP checks at break-points if there's a cancel request for current process
On cancel, PHP executes a kill/rollback routine instead of further processing
This sort of "cancel watch" will obviously create some overhead, and as such you may want to only incorporate this into heavier scripts, to ensure you actually save some processing time in the big picture. Further, you'd only want at most a couple of breakpoints at significant junctions. For read requests, you could just kill the process; but for write requests, you'd probably want to have a graceful rollback to ensure data integrity in your system.
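A rough PHP sketch of steps 4 through 6 above; the flag-file location and the rollback helper are hypothetical:

    // cancel.php -- the light-weight cancel controller (step 4)
    file_put_contents('/tmp/cancel-' . basename($_POST['id']), '1');

    // main script -- cancel check at a break-point (steps 5 and 6)
    function isCancelled(string $id): bool
    {
        return file_exists('/tmp/cancel-' . basename($id));  // basename() blocks path tricks
    }

    if (isCancelled($requestId)) {
        rollbackWork();     // hypothetical: graceful rollback for write requests
        exit;
    }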
You can also cancel/kill a long-running MySQL thread, already initiated by PHP, with mysqli::kill. For this to make sense, you'd want to run it as MYSQLI_ASYNC, so PHP's around to pull the plug. PDO doesn't seem to have a native equivalent for either async queries or kill. Came across $pdo->query('KILL CONNECTION_ID()'); and PHP Asynchronous MySQL Query (see answer for PDO). Haven't tested these myself. Also see: Kill MySQL query on user abort
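Untested here as well, but per the mysqli documentation the async-plus-kill combination might look like this sketch (the credentials and query are placeholders):

    // fire the long query without blocking PHP (MYSQLI_ASYNC requires mysqlnd)
    $worker = new mysqli('localhost', 'user', 'pass', 'db');
    $worker->query('SELECT SLEEP(30)', MYSQLI_ASYNC);
    $threadId = $worker->thread_id;              // the server thread running the query

    // on cancel: pull the plug from a second connection
    $control = new mysqli('localhost', 'user', 'pass', 'db');
    $control->kill($threadId);                   // asks the server to kill that thread
    $control->close();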
PHP Connection Handling
As an alternative to a controller that passes the cancel signal "from the side", you could look into PHP Connection Handling and poll for aborted connection status at your cancel check-points with connection_aborted(). (See "MySQL kill" link above for a code example.)
A CONNECTION_ABORTED state follows if a user clicks the "stop" button in their browser. PHP has an ignore_user_abort() setting, default "Off", under which a script is aborted on user abort. (In my experience, though, if I have a rogue script and the session lock is on, I can't do anything until it times out, even when I hit "stop" in the browser. Go figure.)
If you have "ignore user abort" set to false, i.e. the PHP script terminates on user abort, be aware that this will be a wholly uncontrolled termination unless you have register_shutdown_function() implemented. Even so, you'd have to flag check-points in your code for your shutdown function to be able to "rewind the clock" from the termination point onward. Also note this caveat:
PHP will not detect that the user has aborted the connection until an attempt is made to send information to the client. Simply using an echo statement does not guarantee that information is sent, see flush(). ~ PHP Manual on ignore_user_abort
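Pulling those pieces together, a hedged sketch of an abort check-point; processChunk() and rollbackPartialWork() are hypothetical stand-ins:

    ignore_user_abort(true);         // keep control: don't die mid-write on abort

    foreach ($chunks as $chunk) {
        processChunk($chunk);        // hypothetical unit of work
        echo ' ';                    // per the manual, output must be attempted...
        flush();                     // ...and flushed before the abort is detectable
        if (connection_aborted()) {
            rollbackPartialWork();   // hypothetical graceful exit
            exit;
        }
    }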
I have no experience with implementing "user abort" over AJAX/JS. For a starting point, see: Abort AJAX Request and Cancel an HTTP fetch() request. Not sure how/if they register with PHP. If you decide to travel down this road, please return and update us with your code / research!

Does PHP know when a connection has been closed?

Suppose a page takes a long time to generate, some large report for example, and the user closes the browser, or maybe they press refresh, does the PHP engine stop generating the page from the original request?
And if not, what can one do to cope with users repeatedly refreshing a page, each refresh causing an expensive report to be generated?
I have tried this and it seems that it does not stop any running query on the database. But that could be an engine problem, not PHP.
Extra info:
IIS7
MS SQL Server via ODBC
When you send a request to the server, it is executed on the server without any communication with the browser until information is sent back. If the connection has been closed, PHP's attempt to send data back to the browser will fail, and the script will then exit.
However, if you have a lot of code executing before any headers are sent, this will continue to execute until the headers are sent and a failed response is received.
PHP knows when a connection has been closed when it tries to output some data (and fails). echo, print, flush, etc. Aside from this, no, it doesn't; everything else is happening on the server end.
There is little in the way of passing back information about the browser state once a request has been made (or, in your case, while it is in progress).
To know if a user is still connected to your site, you will need to implement a long poll / comet or perhaps a web socket.
Alternatively - you may want to run the long query initiated via an ajax call - while keeping the main browser responsive (not white-screened). This allows you to detect if the browser is closed during the long query with the Javascript onbeforeunload() event, notifying your backend that the user has left. (I'm not sure how you would interrupt a query in progress from another HTTP request though.)
PHP has two functions to control this. set_time_limit(num) can increase the limit before a page's execution "dies". If you don't extend that limit, a page running "too long" will die, which is bad for a long process. You also need ignore_user_abort(TRUE) so the server doesn't close the PHP process if it detects that the page has been closed on the client side.
You may also need to check for memory leaks if you are writing something that uses a lot of memory and runs for several hours.
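In code, that's simply (0 removes the time limit entirely):

    set_time_limit(0);         // no execution-time ceiling for the long job
    ignore_user_abort(true);   // keep running even if the client disconnects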
When you send a request to the server, the server will go away and perform the appropriate actions. IIS/SQL Server does not know if the browser has been closed (and it is not IIS/SQL Server's responsibility to know this), so it will execute the commands as told by the PHP engine, until it has finished or until the engine kills any transactions. Since your report could be dynamic, IIS will not cache page requests; SQL Server, however, can cache the most recently run queries, so you may see some performance gain from the database backend.

What happens if a user exits the browser or changes page before an AJAX request is over

I am calling a php script over ajax to do some database maintenance. If the user closes the page, hits back, or clicks a link, will the php script be fully executed? Is there a way to do it?
Maybe the PHP script could call the exec() function or something similar, which would in turn run the script via the console, as such:
$ php /var/www/httpdocs/maintenance.php
?
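Something like this fire-and-forget call, perhaps (assuming a Unix-like host; the redirects and the trailing & detach the process so the web request can return immediately):

    exec('php /var/www/httpdocs/maintenance.php > /dev/null 2>&1 &');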
It's a race condition. PHP will detect at some point (usually upon attempting to do output) that Apache is yelling in its face that the remote user has closed the connection. Whether everything you wanted to do is done at that point depends on how your code's structured.
If you want to ensure that all operations are complete before the script shuts itself down, use ignore_user_abort(TRUE), which keeps PHP running after the connection is severed. It's still subject to the usual max_execution_time limits and whatnot, but it will not shut down because you disconnected.
As long as the user agent (browser, etc.) has fully sent the request, the server has all it needs and will complete the request and try to send back a response.
In fact, this sort of "pinging" behavior is often used for "heartbeat"-like processes that keep a service warm or perform periodic maintenance.
Once the web request makes it to your server, it really doesn't matter if the user closes their browser or navigates away. Your server will still respond, but no one will be listening for the response.
It varies with the settings, web server, operating system, and so on.
Usually the request will be processed as usual, and the response will just never be read. Occasionally, a write might fail earlier, and the request fails while processing.
Once the ajax call is kicked off, the user is free to do whatever they want. If they close the page they simply won't get the feedback (if any) from the ajax call that was made.
If the php starts executing then it will continue to execute regardless of whether the user closes the window or navigates away from the page.
The php script will complete, regardless of browser state. The php is parsed on the server, and that doesn't care about whether the client is still open or not.
If the HTTP request was completed, then yes, the PHP script will be executed fully even if the client's browser is closed.

Can you tell a php server to abort the execution of a previously running script?

My web page uses an ajax call to return data from a very long php script, so if I exit the page early and reload the page, that php script is still being carried out, which will cause me problems.
Is there a way I could tell the server to abort the execution of the previous ajax request, if there is one that's still running?
thanks
Not directly. You will need to set up a scheme where the work is offloaded to an external (to the web server) process, and that process has a communication channel with the web server set up that enables it to check if it should drop what it's doing every so often (e.g. a simple but not ideal scheme would be checking for the last-modified time of a "lock file"; if it's more than X seconds in the past, abort the task).
Your web page would then make a call to a script that would then "keep alive" the background task appropriately (e.g. by touching the lock file of the previous example).
This way, when the task is initiated through an AJAX request, the client begins making "keep-alive" requests to the server and the server forwards the "keep-alive" message to the external process. If the user reloads the page the "keep-alive" requests stop and the worker process will abort when the keep-alive threshold elapses. If all goes well and the work completes, your server would detect this through the communication channel it has with the worker process and report this back to the client on their next keep-alive "ping".
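A rough sketch of the worker side of that scheme; the lock path, the 10-second threshold, and the work/abort helpers are all hypothetical:

    // worker process, running outside the web server
    $lock = '/tmp/job-' . $jobId . '.lock';      // created by the initiating request

    while (hasWorkLeft()) {                      // hypothetical
        clearstatcache(true, $lock);             // don't trust a cached mtime
        if (time() - filemtime($lock) > 10) {    // no keep-alive ping for 10 seconds
            abortAndCleanUp();                   // hypothetical
            exit;
        }
        doNextChunk();                           // hypothetical
    }

    // keep-alive endpoint hit by the client's periodic pings
    touch('/tmp/job-' . basename($_GET['id']) . '.lock');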
Maybe try using the set_time_limit() function for this script.
Or create a few PHP scripts and randomly generate a URL for each.
Did you try setting the XMLHttpRequest object to null when the page reloads?

Does php execution stop after a user leaves the page?

I want to run a relatively time consuming script based on some form input, but I'd rather not resort to cron, so I'm wondering if a php page requested through ajax will continue to execute until completion or if it will halt if the user leaves the page.
It doesn't actually output to the browser until a json_encode at the end of the file, so would everything before that still execute?
It depends.
From http://us3.php.net/manual/en/features.connection-handling.php:
When a PHP script is running normally the NORMAL state is active. If the remote client disconnects the ABORTED state flag is turned on. A remote client disconnect is usually caused by the user hitting his STOP button.

You can decide whether or not you want a client disconnect to cause your script to be aborted. Sometimes it is handy to always have your scripts run to completion even if there is no remote browser receiving the output. The default behaviour is however for your script to be aborted when the remote client disconnects. This behaviour can be set via the ignore_user_abort php.ini directive as well as through the corresponding php_value ignore_user_abort Apache httpd.conf directive or with the ignore_user_abort() function.
That would seem to say the answer to your question is "Yes, the script will terminate if the user leaves the page".
However, realize that depending on the backend SAPI being used (e.g. mod_php), PHP cannot detect that the client has aborted the connection until an attempt is made to send information to the client. If your long-running script does not issue a flush(), the script may keep on running even though the user has closed the connection.
Complicating things further: even if you do issue periodic calls to flush(), having output buffering on will trap those calls, and nothing is sent down to the client until the script completes anyway!
Further complications arise if you have installed Apache handlers that buffer the response (for example mod_gzip): once again PHP will not detect that the connection is closed, and the script will keep on trucking.
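One way to fight the PHP-side buffering (it can't help against downstream filters like mod_gzip) is a sketch like:

    while (ob_get_level() > 0) {   // drain any stacked output buffers
        ob_end_flush();
    }
    ob_implicit_flush(true);       // push output to the SAPI after every echo
    echo ' ';
    flush();                       // now a closed connection has a chance to be noticed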
Phew.
It depends on your settings - usually it will stop but you can use ignore_user_abort() to make it carry on.
Depending on the configuration of the web server and/or PHP, the PHP process may, or may not, kill the thread when the user terminates the HTTP connection. If an AJAX request is pending when the user walks away from the page, it is dependent on the browser killing the request (not guaranteed) on top of your server config (not guaranteed). Not the answer you want to hear!
I would recommend creating a work queue in a flat file or database that a constantly-running PHP daemon can poll for jobs. It doesn't suffer from cron delay but keeps CPU/memory usage to a usable level. Once the job is complete, place the results in the flat file/database for AJAX fetch. Or promise to e-mail the user once the job is finished (my preferred method).
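A bare-bones sketch of such a daemon loop, assuming a jobs table; every name in it is hypothetical:

    $pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

    while (true) {
        $job = $pdo->query("SELECT * FROM jobs WHERE status = 'pending' LIMIT 1")->fetch();
        if ($job === false) {
            sleep(1);              // queue is empty; poll again shortly
            continue;
        }
        $result = runJob($job);    // hypothetical worker routine
        $stmt = $pdo->prepare("UPDATE jobs SET status = 'done', result = ? WHERE id = ?");
        $stmt->execute([$result, $job['id']]);   // the AJAX poller fetches this later
    }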
Hope that helps
If the client/user/downloader/viewer aborts or disconnects, the script will keep running until something tries to flush new data to the client. Unless you have used ignore_user_abort(), the script will die there.
By the same token, PHP is unable to determine whether the client is still there without trying to flush data to the httpd.
Found the actual solution for my case of it not terminating the connection: the SESSION on my Apache/PHP server needed to close before the next one could start.
Browser waits for ajax call to complete after abort.
