AWS load balancer times out - PHP backend

We have a PHP backend which connects over an API to our Java backend for some heavy duty number crunching.
Unfortunately this number crunching sometimes takes longer than 1 minute, and the AWS load balancer times out.
Do you know of a way to prevent this?
I was thinking of getting PHP to keep pinging, or jQuery to keep pinging, or increasing the timeout of the load balancer, but I haven't been able to get any of those working.

By default, the ELB will time out if no data is received for 1 minute.
Ideally this would be designed as a job, and you would just send status reports with ajax. If you can't do that, there are a couple of other options.
Send data, even if it's just empty spaces (see the sketch below). Keep in mind that PHP may use output buffering, and may not send any data until the buffer reaches a certain size.
Contact AWS support to have the timeout for your ELB increased.
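For the "send data" option, a minimal PHP sketch of the flush-whitespace approach. The headers, the 20-second heartbeat interval, and the simulated slow work are assumptions about the stack, not a recipe that fits every setup:

```php
<?php
// Sketch only: trickle bytes to the client so the ELB's 60-second idle
// timeout never fires while slow work is in progress.

set_time_limit(0);                  // PHP itself should not time out first
header('Content-Type: text/plain');
header('X-Accel-Buffering: no');    // hint for buffering proxies, if any

// Drop any output-buffering layers so small writes actually reach the socket.
while (ob_get_level() > 0) {
    ob_end_flush();
}
ob_implicit_flush(true);

// Stand-in for the long Java call; in practice you would poll its status or
// run it in the background while this loop emits heartbeats.
for ($i = 0; $i < 6; $i++) {
    sleep(20);                      // well under the 60 s idle timeout
    echo ' ';                       // whitespace the client can ignore
    flush();                        // push it past PHP/FastCGI buffers
}

echo "done\n";                      // the real payload would go here
```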

Related

Keep Elastic Load Balancer connection alive during long AJAX request

I am running into this problem:
I am sending a request to the server using AJAX, which takes some parameters in and, on the server side, will generate a PDF.
The generation of the PDF can take a lot of time depending on the data used.
The Elastic Load Balancer of AWS, after 60s of "idle" connection decides to drop the socket, and therefore my request fails in that case.
I know it's possible to increase the timeout in the ELB settings, but not only is my sysadmin against it, it's also a false solution and bad practice.
I understand the best way to solve the problem would be to send data through the socket to sort of "tell ELB" that I am still active. Sending a dummy request to the server every 30s doesn't work because of our architecture and the fact that the session is locked (i.e. we cannot have concurrent AJAX requests from the same session, otherwise one is pending until the other one finishes).
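For context, that lock comes from PHP's default file-based session handling: the session is locked from session_start() until the script ends, so a second request with the same session cookie blocks. A minimal illustration (session_write_close() is a standard PHP function; the session key name is hypothetical, and whether releasing the lock early fits this architecture is a separate question):

```php
<?php
// PHP holds an exclusive lock on the session from session_start() until the
// script finishes. Releasing it early lets parallel requests through.

session_start();
$userId = isset($_SESSION['user_id']) ? $_SESSION['user_id'] : null;

session_write_close();   // read-only from here on; the lock is released and
                         // other AJAX requests from this session can proceed

// ... the long-running PDF generation would continue here ...
```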
I tried just doing a GET request to files on the server, but it doesn't make a difference; I assume the "socket" is the one used by the original AJAX call.
The function on the server is pretty linear and almost impossible to divide into multiple calls, and the idea of letting it run in the background and checking every 5 seconds until it's finished makes me uncomfortable in terms of resource control.
TL;DR: is there any elegant and efficient solution to maintain a socket active while an AJAX request is pending?
Many thanks if anyone can help with this. I have found a couple of similar questions on SO, but both are answered by "call the Amazon team to ask them to increase the timeout in your settings", which sounds very bad to me.
Another approach is to divide the whole operation into two services:
The first service accepts an HTTP request for generating a PDF document. This service finishes immediately after the request is accepted, and it returns a UUID or URL for checking the result.
The second service accepts the UUID and returns the PDF document if it's ready. If the PDF document is not ready, this service can return an error code, such as HTTP 404.
Since you are using AJAX to call the server side, it will be easy for you to change your JavaScript to call the 2nd service when the 1st service finishes successfully. Will this work for your scenario?
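A rough PHP sketch of those two services (the file names, the /tmp layout, and the worker hand-off are assumptions, not a finished design). The AJAX side calls the first URL once and then polls the second until it stops returning 404:

```php
<?php
// --- service 1 (e.g. start.php): accept the request, return immediately ----
$jobId = md5(uniqid(mt_rand(), true));                    // acts as the UUID
@mkdir('/tmp/pdf-jobs', 0700, true);
file_put_contents("/tmp/pdf-jobs/{$jobId}.json", json_encode($_POST));
// A background worker (cron, queue, etc.) would pick the job up from here.
header('Content-Type: application/json');
echo json_encode(array(
    'job'        => $jobId,
    'status_url' => "/status.php?job={$jobId}",
));

// --- service 2 (e.g. status.php): return the PDF, or 404 if not ready ------
$jobId = preg_replace('/[^a-f0-9]/', '', isset($_GET['job']) ? $_GET['job'] : '');
$pdf   = "/tmp/pdf-output/{$jobId}.pdf";

if ($jobId !== '' && is_file($pdf)) {
    header('Content-Type: application/pdf');
    readfile($pdf);
} else {
    header('HTTP/1.1 404 Not Found');                     // "not ready yet"
}
```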
Have you tried following the troubleshooting guide for ELB? The relevant part is quoted below:
HTTP 504: Gateway Timeout
Description: Indicates that the load balancer closed a connection because a request did not complete within the idle timeout period.
Cause 1: The application takes longer to respond than the configured idle timeout.
Solution 1: Monitor the HTTPCode_ELB_5XX and Latency metrics. If there is an increase in these metrics, it could be due to the application not responding within the idle timeout period. For details about the requests that are timing out, enable access logs on the load balancer and review the 504 response codes in the logs that are generated by Elastic Load Balancing. If necessary, you can increase your capacity or increase the configured idle timeout so that lengthy operations (such as uploading a large file) can complete.
Cause 2: Registered instances closing the connection to Elastic Load Balancing.
Solution 2: Enable keep-alive settings on your EC2 instances and set the keep-alive timeout to greater than or equal to the idle timeout settings of your load balancer.

PHP Script dies due to expired time while getting from memcached

I have a situation where I need rapid and very frequent updates from a website's API. (I've asked them about how fast I can hammer them and they've said as fast as you like.)
So my design architecture is to create several small, fast-running PHP scripts that each do a very specific action, save the result to memcache, and repeat. The first script grabs a single piece of data via their API, stores it in memcache, and then asks again. A second script processes the data the first script stored in memcache and requests another piece of data from the API based on the results of that processing. The third uses the result from the second, does something with that data, asks for more data via the API, and so on up the chain until a decision is made to execute via their API.
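A rough sketch of one worker in such a chain (the key names, API URL, and payload shape are hypothetical). Each worker loops forever under supervisor: read its input key, call the API, write its output key:

```php
<?php
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

while (true) {
    $input = $mc->get('step1_result');        // written by the previous worker
    if ($input === false) {
        usleep(100000);                        // nothing yet; wait 100 ms
        continue;
    }

    // Hypothetical API endpoint; the previous worker's output drives the call.
    $raw  = file_get_contents('https://api.example.com/detail?id=' . urlencode($input['id']));
    $data = json_decode($raw, true);

    if (is_array($data)) {
        $mc->set('step2_result', $data, 60);   // consumed by the next worker
    }
}
```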
I am running these scripts in parallel on a machine with 24 GB RAM and 8 cores. I am also using supervisor in order to manage them.
When I run every PHP script manually via CLI or browser, they work fine. They don't die except where I've told them to for the browser, so I can get some feedback. The logic is fine, the scripts run fine, etc.
However, when I leave them running indefinitely via supervisor, the logs fill up with "Maximum execution time reached" errors, and the line they point to is in one of my classes that gets the data from memcache. Sometimes it bombs on a check to see if the data is JSON (which it should always be), sometimes it bombs elsewhere in the same function/method. The timeout for the supervisor-managed scripts is set to 5 sec because the data is stale by then.
I have considered upping the execution time, but:
the data will be stale by then,
memcache typically returns in less than 1 msec, so 5 sec is an eternity,
none of the scripts have ever failed due to timeout when run manually (CLI or browser).
Environment:
Ubuntu 12.04 Server
PHP 5.3.10-1ubuntu3.9 with Suhosin-Patch
Memcached 1.4.13
Supervisor ??
Memcache Stats (from phpMemcachedAdmin):
Size: 1 GB
Uptime: 17 hrs, 38 min
Hit Rate: 76.5%
Used: 18.9 MB
Wasted: 18.9 MB
Bytes Written: 307.8 GB
Bytes Read: 7.2 GB
--------------- Additional Thoughts/Questions ----------------
I don't think it was clear in my original post that in order to get rapid updates I am running multiple copies in parallel of the scripts that grab API data. So if one script is grabbing basic account data looking for a change to trigger another event, then I actually have at least 2 instances running concurrently. This is because my biggest risk factor is stale data causing a delayed decision combined with a 1+ sec response time from the API.
So it occurred to me that the issue may stem from write conflicts where 2 instances of the same script are attempting to write to the same cache key. My initial Googling didn't lead to any good material on possible write conflicts/collisions in memcache. However, a little deeper dive provided a page where a user with 2 bookmarking sites powered by Elgg off of 1 memcache instance ran into what he described as collisions.
My initial assumption when deciding to kick multiple instances off in parallel was that Supervisor would kick them off in a sequential and therefore slightly staggered manner (maybe a bad assumption, I'm new to using Supervisor). Additionally, the API would respond at different rates to each call. Thus with a write time in the sub-millisecond time frame and an update from each once every 1-2 seconds the chances of write conflicts/collisions seemed pretty low.
I'm considering using some form of prefix/postfix with the keys. Each instance already has its own instance ID created from an md5 hash. So I could prefix or postfix and then have each instance write to its own key. But then I need another key that holds all of those prefixed/postfixed keys. So now I'm doing multiple cache fetches, a loop through all the stored data, and a discard of all but one of those results. I bet there's a better/faster architecture out there...
I am now adding the code to do the timing Aziz asked for. It will take some time to add the code and gather the data.
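For reference, a minimal version of that timing wrapper might look like this (the key name is hypothetical); it logs how long the fetch itself takes so you can tell whether the get is slow or the script is stalling elsewhere before the 5-second limit:

```php
<?php
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

$start   = microtime(true);
$value   = $mc->get('account_data');          // hypothetical key
$elapsed = (microtime(true) - $start) * 1000;

error_log(sprintf(
    'memcached get took %.3f ms (result code %d)',
    $elapsed,
    $mc->getResultCode()
));
```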
Recommendations welcome

Timeout issue in amazon with PHP

I have a PHP site in which I make an AJAX call; in that AJAX call I call an API that returns XML, and I parse it. The problem is that sometimes the XML is so huge that it takes a long time. The load balancer in EC2 has a timeout value of 20 minutes, so if my call takes longer than this I get a 504 error. How can I solve this issue? I know it's a server issue, but how can I solve it? I don't think php.ini is helpful here.
HTTP is a stateless protocol. It works best when responses to requests are made within a few seconds of the request. When you don't respond quickly, timeouts start coming into play. This might be a timeout you can control (fcgi process timeout) or one you can't control (third party proxy, client browser).
So what do you do when you have work that will take longer than a few seconds? Use a message queue of course.
The cheap way to do this is to store the job in a db table and have cron read from the table and process the work. This can work on a small scale, but it has some issues when you try to grow larger.
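A bare-bones sketch of that table-plus-cron pattern (the table layout and the processXmlFeed() helper are hypothetical; a real worker would also handle failures and retries):

```php
<?php
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Claim one pending job inside a transaction so two workers don't grab it.
$pdo->beginTransaction();
$job = $pdo->query(
    "SELECT id, payload FROM jobs WHERE status = 'pending' ORDER BY id LIMIT 1 FOR UPDATE"
)->fetch(PDO::FETCH_ASSOC);

if ($job) {
    $claim = $pdo->prepare("UPDATE jobs SET status = 'running' WHERE id = ?");
    $claim->execute(array($job['id']));
    $pdo->commit();

    $result = processXmlFeed($job['payload']);   // hypothetical: the slow XML/API work

    $done = $pdo->prepare("UPDATE jobs SET status = 'done', result = ? WHERE id = ?");
    $done->execute(array($result, $job['id']));
} else {
    $pdo->commit();                              // nothing to do this run
}
```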
The proper way to do this is to use a real message queue system. Amazon has SQS, but you could just as well use Gearman, ZeroMQ, RabbitMQ, or others to handle this.

Browser Timing out waiting for Soap Client

I'm working on an application which gets some data from a web service using a PHP SOAP client. The web service accesses the client's SQL server, which has very slow performance (some requests will take several minutes to run).
Everything works fine for the smaller requests, but if the browser is waiting for 2 minutes, it prompts me to download a blank file.
I've increased the PHP max_execution_time, memory_limit and default_socket_timeout, but the browser always seems to stop waiting at exactly 2 minutes.
Any ideas on how to get the browser to hang around indefinitely?
You could change your architecture from pull to push. Then the user can carry on using your web application and be notified when the data is ready.
Or, as a simple workaround (not ideal), if you are able to modify the SOAP server you could add another web service that checks whether the data is ready; then on the client you could call this every 30 seconds to keep checking if data is available rather than waiting.
The web server was timing out - in my case, Apache. I initially thought it was something else as I increased the timeout value in httpd.conf, and it was still stopping after two minutes. However, I'm using Zend Server, which has an additional configuration file which was setting the timeout to 120 seconds - I increased this and the browser no longer stops after two minutes.
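For completeness, the PHP-side limits discussed above look roughly like this (values, the WSDL URL, and the operation name are illustrative); as the answer notes, they make no difference until the web server's own timeout (Apache's Timeout directive, or the extra Zend Server config) is raised as well:

```php
<?php
set_time_limit(600);                        // max_execution_time
ini_set('default_socket_timeout', '600');   // read timeout for the SOAP call
ini_set('memory_limit', '512M');

$client = new SoapClient('https://example.com/service?wsdl', array(
    'connection_timeout' => 30,             // connect timeout only, not read
    'exceptions'         => true,
));

// Hypothetical slow operation on the remote service.
$report = $client->__soapCall('GenerateReport', array(array('year' => 2013)));
```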

Using memcached as a database buffer for chat messages

I am playing around with building a chat application using PHP and CodeIgniter.
For this, I am implementing a cache 'buffer' with memcached to hold the most recent chat messages in memory, reducing load on the database. What I want to do is this:
When a message arrives, I save it in memcached using the current minute (YYYY-MM-DD-HH-MM) as the key. No database I/O involved. The idea being that all messages from the same minute are collected under the same key.
Users receive new chat messages also fetched from memcached (for now I'm using long-polling, but this will move to WebSockets under Node.js for obvious performance reasons). Again, no database I/O involved.
An automated server script (cronjob) will run once every 5 minutes, collecting the memcached data from the last 5 minutes and inserting the messages into the database.
The memcached objects are set to go stale after 6 minutes, so we never need to keep more than 6 minutes' worth of message data in memory.
This comes to a total of one database write operation per 5 minutes and zero database read operations.
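A minimal sketch of that buffering scheme with the PHP Memcached extension (CodeIgniter glue omitted; the key format follows the scheme above, while the get/modify/set update, the table layout, and the message array shape are assumptions; note the read-modify-write is not atomic under concurrent writers):

```php
<?php
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

// On each incoming chat message:
function bufferMessage(Memcached $mc, array $message)
{
    $key    = 'chat-' . date('Y-m-d-H-i');     // e.g. chat-2013-05-07-14-32
    $bucket = $mc->get($key);
    if (!is_array($bucket)) {
        $bucket = array();
    }
    $bucket[] = $message;
    $mc->set($key, $bucket, 6 * 60);           // stale after 6 minutes
}

// Cron job, every 5 minutes: flush the last 5 completed minutes to the db.
function flushToDatabase(Memcached $mc, PDO $pdo)
{
    $insert = $pdo->prepare(
        'INSERT INTO messages (user_id, body, created_at) VALUES (?, ?, ?)'
    );
    for ($i = 1; $i <= 5; $i++) {
        $bucket = $mc->get('chat-' . date('Y-m-d-H-i', time() - $i * 60));
        if (is_array($bucket)) {
            foreach ($bucket as $m) {
                $insert->execute(array($m['user_id'], $m['body'], $m['time']));
            }
        }
    }
}
```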
Does this sound feasible? Is there a better (maybe even built-in?) way to use memcached for this purpose?
Update: I have been experimenting a little now, and I have an idea for a shortcut (read: hack). I can 'buffer' the messages temporarily in the Node.js server script until I'm ready to store them. A Javascript object/array of messages in the Node.js server is basically a memory cache - kind of.
So: Every N messages/seconds, I can pass the buffered messages (the contents of the JS array) to my database, using whatever method I want, since it won't be called very often.
However, I'm worried this might cripple the Node.js server process, since it probably won't enjoy carrying around that 200 KB array.
Any thoughts on this strategy? Is it completely crazy?
Have you looked into HTML5 socket connections? With a socket server, you do not need to store anything. The server receives a message from one subscriber and immediately sends it back out to the correct subscribers. I have not done this myself using HTML5, but I know the functionality now exists. I have done this before using Flash, which also supports socket connections.
Why not use INSERT DELAYED? It offers you almost the same functionality you are trying to achieve without the need for memcached.
Anyway, your solution looks good too.
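A quick sketch of the INSERT DELAYED suggestion: MySQL queues the row and returns immediately instead of waiting for the write. Worth noting that it only works with MyISAM/MEMORY/ARCHIVE-style engines and was later deprecated in MySQL 5.6, so it only makes sense on a stack of this era; the table name below is an assumption:

```php
<?php
$mysqli = new mysqli('localhost', 'user', 'pass', 'chat');

$body = $mysqli->real_escape_string('hello world');
$mysqli->query(
    "INSERT DELAYED INTO messages (user_id, body, created_at)
     VALUES (42, '{$body}', NOW())"
);   // returns as soon as MySQL has queued the row
```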
