Keep Elastic Load Balancer connection alive during long AJAX request - php

I am running into the following problem:
I am sending a request to the server using AJAX; it takes some parameters, and on the server side a PDF is generated.
The generation of the PDF can take a lot of time depending on the data used.
The Elastic Load Balancer of AWS, after 60 seconds of "idle" connection, decides to drop the socket, and therefore my request fails in that case.
I know it's possible to increase the timeout in the ELB settings, but not only is my sysadmin against it, it's also a false solution (and bad practice).
I understand the best way to solve the problem would be to send data through the socket to sort of "tell the ELB" that I am still active. Sending a dummy request to the server every 30 seconds doesn't work because of our architecture and the fact that the session is locked (i.e. we cannot have concurrent AJAX requests from the same session, otherwise one is pending until the other one finishes).
I tried just doing a GET request to files on the server, but it doesn't make a difference; I assume the "socket" being timed out is the one used by the original AJAX call.
The function on the server is pretty linear and almost impossible to divide into multiple calls, and the idea of letting it run in the background and checking every 5 seconds until it's finished makes me uncomfortable in terms of resource control.
TL;DR: is there any elegant and efficient solution to keep a socket active while an AJAX request is pending?
Many thanks if anyone can help with this. I have found a couple of similar questions on SO, but both are answered with "call the Amazon team to ask them to increase the timeout in your settings", which sounds very bad to me.

Another approach is to divide the whole operation into two services:
The first service accepts an HTTP request for generating a PDF document. This service finishes immediately after the request is accepted, and it returns a UUID or URL for checking the result.
The second service accepts the UUID and returns the PDF document if it's ready. If the PDF document is not ready, this service can return an error code, such as HTTP 404.
Since you are using AJAX to call the server side, it will be easy for you to change your JavaScript and call the 2nd service when the 1st service finishes successfully. Will this work for your scenario?
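A minimal sketch of this pattern in PHP, purely illustrative: the endpoint names (create_pdf.php, pdf_status.php), the file paths, and the background worker are all assumptions.

```php
<?php
// create_pdf.php -- accepts the request, records the job, and returns a UUID immediately.
// A separate worker (cron job, queue consumer, ...) is assumed to pick the job up and render the PDF.
$jobId = bin2hex(random_bytes(16));
file_put_contents("/var/pdf-jobs/{$jobId}.json", json_encode($_POST));

header('Content-Type: application/json');
echo json_encode(['job_id' => $jobId]);
```

```php
<?php
// pdf_status.php -- returns the finished PDF, or 404 so the client knows to poll again.
$jobId = preg_replace('/[^a-f0-9]/', '', $_GET['job_id'] ?? '');
$path  = "/var/pdf-output/{$jobId}.pdf";

if (is_file($path)) {
    header('Content-Type: application/pdf');
    readfile($path);
} else {
    http_response_code(404);   // not ready yet; the client retries in a few seconds
}
```

On the client side you would then poll pdf_status.php every few seconds until it stops returning 404.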

Have you tried following the troubleshooting guide for ELB? The relevant part is quoted below:
HTTP 504: Gateway Timeout
Description: Indicates that the load balancer closed a connection because a request did not complete within the idle timeout period.
Cause 1: The application takes longer to respond than the configured idle timeout.
Solution 1: Monitor the HTTPCode_ELB_5XX and Latency metrics. If there is an increase in these metrics, it could be due to the application not responding within the idle timeout period. For details about the requests that are timing out, enable access logs on the load balancer and review the 504 response codes in the logs that are generated by Elastic Load Balancing. If necessary, you can increase your capacity or increase the configured idle timeout so that lengthy operations (such as uploading a large file) can complete.
Cause 2: Registered instances closing the connection to Elastic Load Balancing.
Solution 2: Enable keep-alive settings on your EC2 instances and set the keep-alive timeout to greater than or equal to the idle timeout settings of your load balancer.

Related

configure Apache and MySQL for parallel and cancelable service to clients

We have a client-server architecture with Angular on the client side and Apache2, PHP (PDO) and MySQL on the server side. The server side exposes an API to clients that gives them data to show.
Some observations:
Some API calls can take a very long time to compute and return a response.
The server side seems to handle a single request per client at any given time (I'm seeing only one corresponding query being executed in MySQL). That limit comes either from Apache or from MySQL, since the front end is definitely sending requests in parallel.
The front end cancels requests that are no longer relevant (the data being fetched will not be visible).
Requests cancelled by the front end do not seem to be cancelled on the server side and continue to run anyway; I think even if they are queued they will still run when their turn arrives (even though they were cancelled on the client side).
Need help to understand:
What exactly is the cause of not having all requests (or at least X > 1 requests) run in parallel? Can it be changed?
What configuration should I change in either Apache or MySQL to overcome this?
Is there a way to make Apache drop cancelled requests? At least those that are still queued and not yet started?
Thanks!
EDIT
Following @Markus AO's comment (thanks Markus!!!) this turned out to be session-blocking related... wish I had known about that before!
OP has a number of tangled problems on the table. However I feel these are worthwhile concerns (having wrestled with them myself), so let's take this apart. For great justice; main screen turn on:
Solving Concurrent Request Problems
There are several possible problems and solutions with concurrent connections in a (L)AMP stack. Before looking at tuning Apache and MySQL, however, let me gloss a common "mystery" issue that creates concurrency problems; namely, a necessary evil called "PHP Session Locking".
PHP Session Blocking & Concurrent Requests
In a nutshell: When you use sessions in your application, after calling session_start(), PHP locks the session file stored at your session.save_path directory. This file lock will remain in place until the script ends, or session_write_close() is called. Result: Any subsequent calls by the same user will be queued, rather than concurrently processed, to ensure there's no session data corruption. (Imagine parallel scripts writing into the same $_SESSION!)
An easy way to demonstrate this is to create a long-running script; then call it in your browser; and then open a new tab, and call it again (or in fact, call any script sharing the same session cookie/ID). You'll see that the second call won't execute until the first one is concluded. This is a common cause of strange AJAX lags, especially with parallel AJAX requests from a single page. Processing will be consecutive instead of concurrent. Then, 10 calls at 0.3 sec each will take a total of 3 sec to conclude, and so on. We don't want that, do we!
You can remedy request blocking caused by PHP session lock by ensuring that:
Scripts using sessions should call session_write_close() once done storing session data. The session lock will be immediately released.
Scripts that don't require sessions shouldn't start sessions to begin with.
Scripts that need to only read session data: Using session_start() with ['read_and_close' => true] option will give you a read-only (non-persistent) $_SESSION variable without session locking. (Available since PHP 7.)
Options 1 and 3 will leave you with read access for the $_SESSION variable and release/avoid the session lock. Any changes made to $_SESSION after the session is closed will be silently discarded; no warnings/errors are displayed.
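A minimal sketch of options 1 and 3 (the session keys are made up):

```php
<?php
// Option 1: store what you need, then release the session lock before the slow work starts.
session_start();
$_SESSION['pdf_requested_at'] = time();
session_write_close();            // lock released; later $_SESSION writes are silently discarded

// ... long-running work here no longer blocks other requests from the same user ...
```

```php
<?php
// Option 3 (PHP >= 7.0): read-only session access, no lock is taken at all.
session_start(['read_and_close' => true]);
$userId = $_SESSION['user_id'] ?? null;   // readable, but any changes will not persist
```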
The session lock request blocking issue is only consequential for a single user (using the same session). It has no impact on multi-user concurrence. For further reading, please see:
SO: Session (Auto)-Start, Performance & Session Locking
SO: PHP & Sessions: Is there any way to disable PHP session locking?
In-Depth: PHP Session Locking: How To Prevent Sessions Blocking in PHP requests.
Apache & MySQL Concurrent Requests
Once upon a time, before realizing PHP was the culprit behind blocking/queuing my concurrent calls, I spent a small aeon in tweaking Apache and MySQL and wondering, what happen?
Apache 2.4 supports 150 concurrent requests by default; any further requests will queue up. There are several settings under the MPM/Multi-Processing Module that you can tune to support the desired level of concurrent connections. Please see:
MPM Docs
Worker Docs
Overview at Oxpedia
MySQL has options for max_connections (default 151) and max_user_connections (default unlimited). If your application sends a lot of concurrent requests per user, you'll want to ensure the global max connections is high enough to ensure a handful of users don't hog the entire DBMS.
Obviously, you'll further want to tune these settings in light of your server CPU/RAM specs. (The calculations for which are beyond this answer.) Your concurrency issues probably aren't caused by too many open TCP sockets, but hey, you never know...
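If you want to see what your MySQL server is actually configured with before tuning, here's a quick check via PDO (connection details are placeholders):

```php
<?php
// Print the current connection limits; 151 / 0 (unlimited) are the MySQL defaults.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'password');
$sql = "SHOW VARIABLES WHERE Variable_name IN ('max_connections', 'max_user_connections')";
foreach ($pdo->query($sql) as $row) {
    echo $row['Variable_name'], ' = ', $row['Value'], PHP_EOL;
}
```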
Canceling Requests to Apache/PHP/MySQL
We don't have much to go on as far as your application's specific wiring, but I understand from the comments that as it stands, a user can cancel a request at the front-end, but no back-end action is taken. (I.e. any back-end response is simply ignored/discarded.)
"Is there a way to make Apache drop cancelled requests?" I'm assuming that your front-end sends the requests directly and without delay to Apache, and onward to PHP > MySQL > PHP > Apache. In that case, no, you can't really have Apache cancel a request that it has already received; you could hit "stop", but chances are PHP and MySQL are already munching away at it...
Holding a "Cancel Window"
However, you could program a "cancel window" lag into your front-end, where requests are only passed on to Apache after e.g. a 0.5-second sleep waiting for a possible cancel. This may or may not have a negative impact on the UX; it may be worth implementing to save server resources if a significant portion of requests are cancelled. This assumes a UI with JavaScript. If you're getting direct HTTP calls to the API, you could have a "sleepy proxy receiver" instead.
Using a "Cancel Controller"
How would one cancel PHP/MySQL processes? This is obviously only feasible/doable if calls to your API result in a processing time of any significant duration. If the back-end takes 0.28 seconds to process, and the user cancels after 0.3 seconds, then there isn't much left to cancel, is there.
However, if you do have scripts that may run for longer, say into a couple of seconds, you could always find relevant break-points in your code where you have a "not-cancelled" check or a kill/rollback routine. Basically, you'd have the following flow:
Front-end sends request with unique ID to main script
PHP script begins the long march for building a response
On cancel: Front-end re-sends the ID to a light-weight cancel controller
Cancel controller logs ID to temporary file/database/wherever
PHP checks at break-points if there's a cancel request for current process
On cancel, PHP executes a kill/rollback routine instead of further processing
This sort of "cancel watch" will obviously create some overhead, and as such you may want to only incorporate this into heavier scripts, to ensure you actually save some processing time in the big picture. Further, you'd only want at most a couple of breakpoints at significant junctions. For read requests, you could just kill the process; but for write requests, you'd probably want to have a graceful rollback to ensure data integrity in your system.
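A minimal sketch of such a cancel controller and the corresponding break-point check; the cancelled_jobs table and all names are hypothetical:

```php
<?php
// cancel.php -- the light-weight cancel controller: record the job ID and return immediately.
$jobId = $_POST['job_id'] ?? '';
$pdo   = new PDO('mysql:host=localhost;dbname=app', 'user', 'password');
$pdo->prepare('INSERT IGNORE INTO cancelled_jobs (job_id) VALUES (?)')->execute([$jobId]);
http_response_code(204);
```

```php
<?php
// In the long-running worker: call this at significant break-points.
function isCancelled(PDO $pdo, string $jobId): bool
{
    $stmt = $pdo->prepare('SELECT 1 FROM cancelled_jobs WHERE job_id = ?');
    $stmt->execute([$jobId]);
    return (bool) $stmt->fetchColumn();
}

// ... heavy step ...
if (isCancelled($pdo, $jobId)) {
    $pdo->rollBack();   // rollback (assumes a transaction was opened for the write work)
    exit;
}
// ... next heavy step ...
```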
You can also cancel/kill a long-running MySQL thread, already initiated by PHP, with mysqli::kill. For this to make sense, you'd want to run the query as MYSQLI_ASYNC, so PHP is around to pull the plug. PDO doesn't seem to have a native equivalent for either async queries or kill. I came across $pdo->query('KILL CONNECTION_ID()'); and PHP Asynchronous MySQL Query (see the answer for PDO); I haven't tested these myself. Also see: Kill MySQL query on user abort
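A rough sketch of the MYSQLI_ASYNC route (this requires the mysqlnd driver; isCancelled() is the hypothetical helper from the sketch above):

```php
<?php
// Run a heavy query asynchronously so PHP stays free to pull the plug on cancel.
$worker = new mysqli('localhost', 'user', 'password', 'app');
$worker->query('SELECT SLEEP(30)', MYSQLI_ASYNC);       // stand-in for the real heavy query

do {
    $links = $errors = $rejects = [$worker];
    $ready = mysqli_poll($links, $errors, $rejects, 1);  // wait up to 1 second

    if (isCancelled($pdo, $jobId)) {
        // Kill the server-side thread from a second connection, then bail out.
        $killer = new mysqli('localhost', 'user', 'password', 'app');
        $killer->kill($worker->thread_id);
        exit;
    }
} while ($ready === 0);

$result = $worker->reap_async_query();
```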
PHP Connection Handling
As an alternative to a controller that passes the cancel signal "from the side", you could look into PHP Connection Handling and poll for aborted connection status at your cancel check-points with connection_aborted(). (See "MySQL kill" link above for a code example.)
A CONNECTION_ABORTED state follows if a user clicks the "stop" button in their browser. PHP has an ignore_user_abort() setting, default "Off", meaning the script should be aborted on user abort. (In my experience though, if I have a rogue script and the session lock is on, I can't do anything until it times out, even when I hit "stop" in the browser. Go figure.)
If you have "ignore user abort" on false, i.e. the PHP script terminates on user abort, be aware that this will be a wholly uncontrolled termination unless you have register_shutdown_function() implemented. Even so, you'd have to flag check-points in your code for your shutdown function to be able to "rewind the clock" from the termination point onward. Also note this caveat:
PHP will not detect that the user has aborted the connection until an attempt is made to send information to the client. Simply using an echo statement does not guarantee that information is sent, see flush(). ~ PHP Manual on ignore_user_abort
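A minimal sketch of a break-point check using this approach; note the echo/flush before the check, as per the caveat above:

```php
<?php
// Keep running on user abort so we can decide ourselves how to wind down.
ignore_user_abort(true);

// ... heavy step ...

echo ' ';                       // PHP only notices the abort when it tries to send output
flush();
if (connection_aborted()) {
    // graceful rollback / cleanup instead of further processing
    exit;
}
// ... next heavy step ...
```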
I have no experience with implementing "user abort" over AJAX/JS. For a starting point, see: Abort AJAX Request and Cancel an HTTP fetch() request. Not sure how/if they register with PHP. If you decide to travel down this road, please return and update us with your code / research!

Increase idle timeout

I have an App Service in Azure: a PHP script that performs a migration from one database (server1) to another database (an Azure DB in a virtual machine).
This script makes a lot of queries and requests, so it takes a lot of time, and the server (App Service) returns:
"500 - The request timed out. The web server failed to respond within
the specified time."
I found that it's something about "idle timeout." I would like to know how to increase this time.
In my test, I have tried the following so far:
Add ini_set('max_execution_time', 300); at the top of my PHP script.
App settings on portal: SCM_COMMAND_IDLE_TIMEOUT = 3600.
But nothing seems to work.
After some searching, I found a post by David Ebbo, in which he said:
There is a 230 second (i.e. a little less than 4 mins) timeout for
requests that are not sending any data back. After that, the client
gets the 500 you saw, even though in reality the request is allowed to
continue server side.
There is also a similar thread on SO that you can refer to here.
The suggestion for migration is that you can leverage Web Jobs to run PHP scripts as background processes on App Service Web Apps.
For more details, you can refer to https://learn.microsoft.com/en-us/azure/app-service-web/web-sites-create-web-jobs.

Timeout issue in amazon with PHP

I have a PHP site in which I make an AJAX call; in that AJAX call I call an API that returns XML, which I then parse. The problem is that sometimes the XML is so huge that it takes a lot of time. The load balancer in EC2 has a timeout value of 20 minutes, so if my call takes longer than this I get a 504 error. How can I solve this issue? I know it's a server issue, but how can I solve it? I don't think php.ini is helpful here.
HTTP is a stateless protocol. It works best when responses to requests are made within a few seconds of the request. When you don't respond quickly, timeouts start coming into play. This might be a timeout you can control (fcgi process timeout) or one you can't control (third party proxy, client browser).
So what do you do when you have work that will take longer than a few seconds? Use a message queue of course.
The cheap way to do this is store the job in a db table and have cron read from the table and process the work. This can work on a small scale, but it has some issues when you try to get larger.
The proper way to do this is to use a real message queue system. Amazon has SQS, but you could just as well use Gearman, ZeroMQ, RabbitMQ, or others to handle this.
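A minimal sketch of the cheap "db table plus cron" variant described above; the jobs table and its columns are made up:

```php
<?php
// enqueue.php -- the AJAX endpoint only records the job and returns at once.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'password');
$pdo->prepare("INSERT INTO jobs (payload, status) VALUES (?, 'pending')")
    ->execute([json_encode($_POST)]);

header('Content-Type: application/json');
echo json_encode(['queued' => true]);
```

```php
<?php
// worker.php -- run from cron (e.g. every minute); processes one pending job per run.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'password');
$job = $pdo->query("SELECT id, payload FROM jobs WHERE status = 'pending' ORDER BY id LIMIT 1")->fetch();

if ($job) {
    $pdo->prepare("UPDATE jobs SET status = 'running' WHERE id = ?")->execute([$job['id']]);
    // ... fetch the huge XML, parse it, store the result ...
    $pdo->prepare("UPDATE jobs SET status = 'done' WHERE id = ?")->execute([$job['id']]);
}
```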

push and pull technologies using Ajax or Socket

I have a website that needs to send notifications to the online clients in real time, like Facebook. After some googling, I found a lot of documentation about push and pull technology, and ways of implementing them using AJAX or sockets. I need to know what is best to use in my case and how it would be coded using JavaScript or jQuery and PHP.
I cannot tell you what's best to use in your case without knowing your case in detail.
In most cases it is enough to have the clients check with the server every one or two seconds, asking if something new has happened. I prefer this over sockets most of the time because it works on every web server without any configuration changes and in any browser supporting AJAX, even old ones.
If you have few clients (because every client requires an open socket on the server) and you want real realtime, you can use websockets. There are several PHP implementations, for example this one: http://code.google.com/p/phpwebsocket/
If you can ensure that there will be only a single browser open per logged-in user, then you can apply this long-polling technique easily.
Policy for the AJAX call:
Do not make a request every 2 seconds.
Instead, wait and make a request only 2 seconds after getting the response from the previous request.
If a request does not get a response within 12 seconds, do not keep waiting; send a fresh request. This is the connection-lost case.
Policy for server response:
If there is an update, respond immediately. To check whether there is an update, rely on the session (better if you can send some hint from the client side, like the latest message received; this second update-checking mechanism will eliminate the single-browser restriction mentioned above).
Otherwise sleep() for 1 second (do not use a busy loop; use sleep), and then check whether there is an update; if there is, respond; if not, sleep again for 1 second; repeat this until a total of 10 seconds has elapsed, and then respond back with "no update".
If you apply this policy (commonly known as long polling), you will find processor usage reduced from 95% to 4% under heavy load.
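A minimal PHP sketch of the server-side loop just described; hasUpdate() is a placeholder for your own check against a database or cache (rather than the locked session), and last_id is the hint sent from the client:

```php
<?php
// Long-poll endpoint: answer immediately on update, otherwise sleep-and-check
// for up to 10 seconds before answering "no update".
session_start();
$lastSeen = (int) ($_GET['last_id'] ?? 0);   // hint sent from the client
session_write_close();                       // release the session lock while we wait

header('Content-Type: application/json');
for ($i = 0; $i < 10; $i++) {
    $update = hasUpdate($lastSeen);          // placeholder: look for news in DB/cache
    if ($update !== null) {
        echo json_encode(['update' => $update]);
        exit;
    }
    sleep(1);                                // wait, don't busy-loop
}
echo json_encode(['update' => null]);        // nothing after 10 seconds; the client re-polls
```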
Hope this explains. Best of luck.
Just apply the long-polling technique using jQuery.
Sockets are not yet supported everywhere and also you would need to open a listening socket on the server for this to work.

How can a background service running in a webserver inform client browser the progress of the service?

First of all, I am completely new to a lot of this, so I will welcome any input, including suggestions, existing projects, existing models, etc.
My current problems are:
The background service maintains a queue of tasks. The background service is written in C++ or python.
When a client clicks "Create Task" button in browser, the information will be sent to web server and the web server script (written in PHP) will initiate an RPC call to the background service to append the task to the internal queue.
The client browser will initiate an AJAX request to wait for the completion of the task. The AJAX request will hold until the task is completed (or failed) or the client cancels the request.
Thus, I need a low-cost way to get the progress of a task that is run by the background service process.
I can think of two ways:
The background service can inform the server-side AJAX script about the progress proactively. This is low cost, but I actually do not know how to do it. Does any RPC framework provide such an asynchronous callback? Currently the RPC framework I have decided to use is Thrift, because of its multi-language support.
The AJAX script on the server side will make an RPC call to get the current progress every few seconds, and sleep in between. Upon completion, the AJAX script will return; otherwise it will just let the client browser wait by not returning. This is actually simpler, but I am not sure about its cost. Note that delay isn't an issue to me here, because I suppose that the clients are okay with waiting a few more seconds.
Is there any common way/model to deal with this problem?
Thanks for the help.
It depends on how you code it. The common way to do it is to make a JavaScript AJAX request every 1-3 seconds or so and poll the progress from the server.
This closes the connection in between requests and is gentler on the server. If you use a persistent connection (WebSockets also fall into this category), you will keep the server busy. Besides, a "sleep" keeps a server process occupied, which is something I would try to avoid if I were you. On the other hand, if you've got the resources for that...
I can only repeat myself: it depends on how you code it and what you expect of it in the end.
If you want the client to do some more work and treat the server gently, choose your 1st option; if you think your server can handle it, choose the 2nd option, go "persistent", and even use WebSockets (which are persistent connections to your server; remember that they aren't widely supported by web-browsing clients yet either).
Although I think that, in the end, the trade-off of a simple progress value compared to hogging your server with constant sleeps (and some persistent connections on top of that) will make you choose your 1st option: poll the server script for the progress value every x seconds from the client side. By the way, it's what Twitter does, and their servers have survived until today! ;)
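A minimal sketch of the polling endpoint on the PHP side, assuming the background service publishes per-task progress somewhere PHP can read it (here a JSON file; the path and the field names are made up):

```php
<?php
// progress.php -- called by the browser every 1-3 seconds with ?task_id=...
$taskId = preg_replace('/[^A-Za-z0-9_-]/', '', $_GET['task_id'] ?? '');
$path   = "/var/run/tasks/{$taskId}.json";   // written/updated by the background service

header('Content-Type: application/json');
if ($taskId !== '' && is_file($path)) {
    readfile($path);                          // e.g. {"progress": 42, "done": false}
} else {
    echo json_encode(['progress' => 0, 'done' => false]);
}
```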
I think you can use WebSockets for that.
You can use WebSockets.
Establish a WebSockets connection between the client and a web service that has access to the information you need to pass to the client.
With WebSockets, you don't need to poll the server asking it for progress, but rather have the server notify the client whenever it's ready.
A backwards compatible implementation would be long polling.
Cheers
