Long Polling with Ajax and PHP - Apache freezes

We tried to implement a long-polling-based notification service in our company's ERP, similar to Facebook notifications.
Technologies used:
PHP with the timeout set to 60 seconds and a 1-second sleep in each iteration of the loop (see the sketch after this list).
jQuery for AJAX handling.
Apache as web server.
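For illustration, here is a minimal sketch of that kind of endpoint. The get_notifications() helper is hypothetical; the 60-second window and 1-second sleep match the setup above.
<?php
// Minimal long-polling endpoint (sketch): check for new data once per
// second; give up after 60 seconds and let the client reconnect.
set_time_limit(70);       // a little headroom over the 60-second loop
session_write_close();    // if sessions are used, release the lock early

$start = time();
while (time() - $start < 60) {
    $data = get_notifications();   // hypothetical data-fetching helper
    if ($data !== null) {
        header('Content-Type: application/json');
        echo json_encode($data);
        exit;
    }
    sleep(1);
}

// Nothing new within 60 seconds: tell the client to poll again.
header('HTTP/1.1 204 No Content');
?>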
After nearly a month of coding, we went to production. A few minutes after deployment we had to roll everything back: it turned out that our server (8 cores) couldn't handle the long requests from 20 employees using ~5 browser tabs each.
For example: a user opened 3 tabs with our ERP, each with one long-polling AJAX request. Opening a 4th tab was impossible - it hung until one of the previous 3 was killed (and its AJAX request stopped with it).
'Apache limitations', we thought, so we went googling. I found some info about Apache's MPM modules and configs, so I gave it a try. Our server uses the prefork MPM, as apachectl -l showed us. So I changed a few lines in the config to look something like this:
<IfModule mpm_prefork_module>
    StartServers 1
    MinSpareServers 16
    MaxSpareServers 32
    ServerLimit 150
    MaxClients 150
    MaxRequestsPerChild 0
</IfModule>
Funny thing is, it works on my local machine with a similar config. On the server, it looks like Apache ignores the config: with MinSpareServers set to 16, it launches 8 processes after a restart. We have no idea what to do.

Passerby, in the first comment on the previous post, pointed me in the right direction: check whether we were hitting the browser's limit on concurrent connections to a single server.
As it turns out, each browser has such a limit, and you can't change it (as far as I know).
We made a workaround to get it working.
Let's assume that I was getting AJAX data from
http://domain.com/ajax
To avoid hitting the browser's per-host connection limit, each long-polling AJAX request connects to a random subdomain, like:
http://31289.domain.com/ajax
http://43289.domain.com/ajax
and so on. There's a wildcard record on the DNS server pointing *.domain.com to domain.com, and the subdomain is a unique random number generated by JS in each tab.
For more information, check out this thread.
There were also some problems with AJAX same-origin security, but we managed to work them out using appropriate headers on both the JS and PHP sides.
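On the PHP side, that kind of cross-origin setup usually boils down to something like this sketch (not our exact code; the origin value is illustrative):
<?php
// Allow AJAX calls from the main domain to reach the random subdomains.
header('Access-Control-Allow-Origin: http://domain.com');
header('Access-Control-Allow-Credentials: true');

// Answer CORS preflight requests early, without running the poll loop.
if ($_SERVER['REQUEST_METHOD'] === 'OPTIONS') {
    header('Access-Control-Allow-Methods: GET, POST, OPTIONS');
    header('Access-Control-Allow-Headers: Content-Type, X-Requested-With');
    exit;
}
?>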
If you want to know more about the headers, check it out here on Stack Overflow and here on the Mozilla Developer Network. Thanks!

I have successfully implemented a LAMP setup with long polling. Two things to keep in mind. First, PHP's internal execution clock on Linux is not advanced by the usleep() function, since max_execution_time counts CPU time there, not wall-clock time. Therefore, raising the maximum execution time should only be needed for rare edge cases where obtaining the data takes longer than normal, or possibly on a Windows setup. In addition, with long polling, bear in mind that once you go over 20+ seconds you are vulnerable to browser timeouts.
Secondly, you will need to verify that your sessions aren't locking up (if sessions are being used).
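PHP's default file-based session handler holds a lock on the session file for the entire request, so a single long-poll can block every other request from the same session (e.g. other tabs). A common pattern, sketched here, is to release the lock before the long wait:
<?php
session_start();                  // acquire the session (and its file lock)
$userId = $_SESSION['user_id'];   // copy out whatever the poll loop needs
session_write_close();            // release the lock before the long wait

// From here on, other requests from the same session are no longer
// blocked while this request sleeps and polls for data.
?>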
Apache really shouldn't have any issue with what you are trying to do, though I will admit that web servers like nginx, or an AJAX-specific web server, may handle the concurrent connections better. If you could post your code for the AJAX handler, we might be able to figure out where the problem is.
If you utilize subdomains or, as other threads have suggested, multiple web servers on separate ports, remember that you may encounter JavaScript same-origin security issues.
I say don't change the Apache config until you encounter an issue and have exhausted all other options; be careful with the PHP sessions, and make sure each AJAX request waits for a response before sending the next one ;)

Related

What is the PHP apache_child_terminate equivalent in Apache 2?

I really need to be able to do what I believe apache_child_terminate does, but apparently it doesn't work when using Apache 2 (and that is what I am using... Apache 2 in prefork mode).
Is there an equivalent way I can "get rid of" or kill the currently executing Apache process after I finish fulfilling the current request?
Background:
There is one particular type of request that I get over 100 times each second. A very small response is returned (just a few characters), but Apache keeps the process around for 4 seconds to allow subsequent requests to be handled by the same Apache process. I can't lower the 4-second keep-alive much more because it affects the performance of all the other PHP pages, which have a lot of content and requests. So the idea is that for that one type of request only, I don't want the process to linger for 4 seconds.
Thanks in advance for any suggestions!
OK, so this code basically does the apache_child_terminate() equivalent for Apache 2 by nicely telling the Apache process to die after completing the request:
<?php
// Terminate the Apache 2 child process after the request has been
// completed by sending it a "friendly" SIGWINCH POSIX signal (28).
function kill_on_exit() {
    posix_kill( getmypid(), 28 );
}
register_shutdown_function( 'kill_on_exit' );
?>
This did not, however, fix my problem. I ended up getting the best performance by turning Apache KeepAlive off completely. I added this to my apache2.conf file:
KeepAlive Off
Now the hundreds of requests per second that I'm getting on that specific page do not tie up resources for 4 seconds. They complete very fast, and the process can be reused for another connection. The performance I'm getting from the normal pages with lots of content is actually still pretty good, even though every request goes over a different connection to a different process on the server.

Apache VERY high page load time

My Drupal 6 site has been running smoothly for years but has recently experienced intermittent periods of extreme slowness (10-60 second page loads): several hours of slowness followed by hours of normal (4-6 second) page loads. The page always loads with no error; it just sometimes takes forever.
My setup:
Windows Server 2003
Apache/2.2.15 (Win32) Jrun/4.0
PHP 5
MySql 5.1
Drupal 6
ColdFusion 9
Vmware virtual environment
DMZ behind a corporate firewall
Traffic: 1-3 hits/sec peak
Troubleshooting:
No applicable errors in the Apache error log
No errors in the Drupal event log
Drupal's devel module shows 242 queries in 366.23 milliseconds, page execution time 2069.62 ms (so queries and PHP scripts don't look like the problem)
No unusually high CPU, memory, or disk I/O
ColdFusion apps and other static pages outside of Drupal also load slowly
A webpagetest.org test shows a very high time-to-first-byte
The problem seems to be with Apache responding to requests, but previously I've only seen this behavior under 100% CPU load. Judging solely by resource monitoring, it looks as though very little is going on.
Here is the kicker: roughly half of the site's traffic comes from our LAN, and if I disable the firewall rule and block access from outside our network, internal (LAN) access (1000+ devices) is speedy. But as soon as outside access is restored, the site is crippled.
Apache config? Crawlers/bots? Attackers? I'm at the end of my rope; where should I be looking to determine where the problem lies?
------Edit:-----
Attached is a waterfall chart from webpagetest.org showing a 15-second load time. I've seen times as high as several minutes, and again, the server runs fine much of the time. The green areas indicate that the browser has sent a request and is waiting to receive the first byte of data back from the server. This is certainly a back-end delay, but it is puzzling that the CPU is barely used during this slowness.
(Not enough rep to post an image; see https://webmasters.stackexchange.com/questions/54658/apache-very-high-page-load-time)
------Edit------
On the Apache side of things: is this possibly a ThreadsPerChild issue?
After much research, I may have found the solution. If I'm correct, it was an Apache config problem, specifically the ThreadsPerChild directive. See http://httpd.apache.org/docs/2.2/platform/windows.html
Because Apache for Windows is multithreaded, it does not use a
separate process for each request, as Apache can on Unix. Instead
there are usually only two Apache processes running: a parent process,
and a child which handles the requests. Within the child process each
request is handled by a separate thread.
ThreadsPerChild: This directive is new. It tells the server how many
threads it should use. This is the maximum number of connections the
server can handle at once, so be sure to set this number high enough
for your site if you get a lot of hits. The recommended default is
ThreadsPerChild 150, but this must be adjusted to reflect the greatest
anticipated number of simultaneous connections to accept.
It turns out this directive was not set at all in my config and thus defaulted to 64. I confirmed this by viewing the number of threads for the second httpd.exe process in Task Manager. When the server hit more than 64 connections, the excess requests simply had to wait for a thread to open up. I added ThreadsPerChild 150 to my httpd.conf.
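On Windows (mpm_winnt) that section of httpd.conf looks something like this sketch (the MaxRequestsPerChild line is the usual companion in the default config):
<IfModule mpm_winnt_module>
    ThreadsPerChild 150
    MaxRequestsPerChild 0
</IfModule>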
Additionally, I enabled the apache status module
http://httpd.apache.org/docs/2.2/mod/mod_status.html
...which, among other things, lets you see the total number of active requests on the server at any given moment. Right away, I could see spikes of up to 80 active requests. Time will tell, but I'm confident that this will resolve my issue. So far, 30 hours without a hiccup.
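For reference, a typical Apache 2.2 mod_status setup looks something like this sketch (restrict access before enabling it on a production box):
LoadModule status_module modules/mod_status.so
ExtendedStatus On
<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>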
Apache is too bulky and clumsy for "1-3 hits/sec avg".
I once had a similar problem with a much lighter (almost static-HTML, no DB) site and similar hits per second.
No errors, no high network/CPU/memory/disk load. Apache on Windows XP.
I put nginx in front of Apache to serve the static files, and it started working like a charm.
Caching. The solution is caching.
Drupal (in common with most other large CMS platforms) has a tendency toward this kind of thing due to its nature: every page is built on the fly, constructed from a whole stack of database tables and code modules. The more you've got in there, the slower it will be, but even fairly simple pages can become horribly slow if your site gets a bit of traffic.
Drupal has a page-cache mechanism built in which will cut your load dramatically. As long as your pages are static (i.e. no dynamic content), you can simply switch on caching and watch the performance go right back up.
If you have dynamic content, you can still enable caching for the static parts of the page. It is a bit more complex (and beyond the scope of this answer), but it is worth the effort.
If that's still not enough, a server-based caching solution such as Varnish will definitely help.

php5-fpm children and requests

I have a question.
I own a 128 MB VPS with a simple blog that gets just a hundred hits per day.
I have nginx + php5-fpm installed. Considering the low traffic and the RAM, I decided to set fpm to static with 1 server running. While doing some random tests, like running PHP scripts over HTTP that last over 30 minutes, I tried to open the blog on the same machine and noticed that the site was basically unreachable. So I went to the configuration and read this:
; The number of child processes to be created when pm is set to 'static' and the
; maximum number of child processes to be created when pm is set to 'dynamic'.
; This value sets the limit on the number of simultaneous requests that will be
; served.
What shocked me the most is that I didn't know this, because I always assumed that a PHP child would handle hundreds of requests at the same time, like an HTTP server does!
Did I get it right?
If, for example, I launch 2 php-fpm children and start 2 "long scripts" at the same time, all the sites using the same PHP backend will be unreachable?? How is this usable?
You may think: "duh! a PHP script (web page) is usually processed in 100 ms". No doubt about that, but what happens if you have pages that run for about 10 seconds each, and I have 10 visitors, with php-fpm set to 5 servers, so accepting only 5 requests at a time? Will they all be queued, or will they experience timeouts?
I'm honestly used to running sites on Windows with Apache and mod_php, and I never experienced these issues, because apparently those limits don't apply there, it being a different way of using PHP.
This also raises another question. If I have file_1.php with sleep(20) and file_2.php with just an echo, and I run file_1 and then file_2, with the FastCGI setup the second file will require the creation of another server to handle the PHP request, using 4 MB more RAM. If I do the same with apache/mod_php, the second file will only use about 30 KB more RAM (in the Apache server). Considering this, why is mod_php considered the "bad guy" if the RAM used is actually less? I know I'm missing the big picture here.
You've basically got it right. You configured a static number of workers (and that number was "one") -- so that's exactly what you got.
But you don't understand quite how things typically work, since you say:
I always assumed that a PHP child would handle hundreds of requests at the same time, like an HTTP server does!
I'm not really familiar with nginx, but consider the typical mod_php setup in Apache. If you're using mod_php, then you're using the prefork MPM, so every concurrent HTTP request is handled by a distinct httpd process (no threads). If you're tuning your apache/mod_php server for low memory, you're going to have to tweak Apache settings to limit the number of processes it will spawn (in particular, MaxClients).
Failing to tune this stuff means that when you get a large traffic spike, apache starts spawning a huge number of heavy processes (remember, it's mod_php, so you have the whole PHP interpreter embedded in each httpd process), and you run out of memory, and then everything starts swapping, and your server starts emitting smoke.
Tuned properly (meaning: tuned so that you ignore requests instead of allocating memory you don't have for more processes), clients will time out, but when traffic subsides, things go back to normal.
Compare that with fpm and a smarter web server architecture like apache-worker or nginx. Now you have some much larger pool of threads (still configurable!) to handle HTTP requests, and a separate pool of php-fpm processes to handle just the requests that require PHP. It's basically the same thing: if you don't set limits on how many processes/threads can be created, you are asking for trouble. But if you do tune, you come out ahead, since only a fraction of your requests use PHP. So, essentially, the average amount of memory needed per HTTP request is lower, and thus you can handle more requests with the same amount of memory.
But setting the number to "1" is too extreme. At "1", it doesn't even matter if you choose static or dynamic, since either way you'll just have one php-fpm process.
So, to try to give explicit answers to particular questions:
You may think: "duh! a PHP script (web page) is usually processed in 100 ms". No doubt about that, but what happens if you have pages that run for about 10 seconds each, and I have 10 visitors, with php-fpm set to 5 servers, so accepting only 5 requests at a time? Will they all be queued, or will they experience timeouts?
Yes, they'll all queue, and eventually timeout. The fact that you regularly have scripts that take 10 seconds to run is the real culprit here, though. There are lots of ways to architect around that (caching, work queues, etc), but the right solution depends entirely on what you're trying to do.
I'm honestly used to running sites on Windows with Apache and mod_php, and I never experienced these issues, because apparently those limits don't apply there, it being a different way of using PHP.
They do apply. You can set up an apache/mod_php server the same way as you have with nginx/php-fpm -- just set apache's MaxClients to 1!
This also raises another question. If I have file_1.php with sleep(20) and file_2.php with just an echo, and I run file_1 and then file_2, with the FastCGI setup the second file will require the creation of another server to handle the PHP request, using 4 MB more RAM. If I do the same with apache/mod_php, the second file will only use about 30 KB more RAM (in the Apache server). Considering this, why is mod_php considered the "bad guy" if the RAM used is actually less? I know I'm missing the big picture here.
Especially on Linux, lots of things that report memory usage can be very misleading. But think about it this way: that 30 KB is negligible, because most of PHP's memory was already allocated when the httpd process started.
A 128 MB VPS is pretty tight, but it should be able to handle more than one PHP process.
If you want to optimize, do something like this:
For PHP (a fuller pool-file sketch follows below):
pm = static
pm.max_children = 4
For nginx, figure out how to control the process and thread counts (whatever the equivalents of Apache's MaxClients, StartServers, MinSpareServers, and MaxSpareServers are).
Then figure out how to generate some realistic load (apachebench, siege, JMeter, etc.). Use vmstat, free, and top to watch your memory usage. Adjust pm.max_children and the nginx settings to be as high as possible without causing any significant swapping (according to vmstat).
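A fuller sketch of the pool file with the settings above (the path and socket are illustrative and vary by distro):
; /etc/php5/fpm/pool.d/www.conf
[www]
listen = /var/run/php5-fpm.sock
pm = static
pm.max_children = 4

; If memory allows, dynamic mode scales workers up and down instead:
; pm = dynamic
; pm.max_children = 4
; pm.start_servers = 2
; pm.min_spare_servers = 1
; pm.max_spare_servers = 3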

4 processes created for each request?

We are doing some load testing on a PHP (Kohana) application. One funny thing we noticed is that each request seems to create 4 processes, increasing the load on the server by 4 times. So when there are, for example, 500 users per second hitting it, it acts like 500*4.
I really don't understand what could be creating all these processes. My understanding is that each PHP request creates one thread; it shouldn't be creating processes, especially not 4. Could it be an Apache issue? Or a PHP issue?
I didn't find any information about this on Google. Any suggestions on what could be causing this would be appreciated.
My first guess is that you are simply seeing the effect of Apache's MinSpareServers setting. Rather than spin up a process when a request comes in, Apache keeps some ready and waiting. So if this is set to 4, Apache will always try to have (active processes + 4) running.
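For illustration, the relevant prefork settings look like this (values illustrative, not the asker's actual config):
<IfModule mpm_prefork_module>
    # Keep at least 4 idle workers ready; kill idle ones beyond 8.
    StartServers 4
    MinSpareServers 4
    MaxSpareServers 8
</IfModule>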
It could also be the ThreadsPerChild setting, depending on how you have Apache configured. In that case, each child always spins up the specified number of threads so they are ready.
Lots of processes or threads isn't necessarily an issue; they may not be doing anything except waiting to handle incoming traffic.

What different settings affect PHP and/or Apache timeouts?

I was asked to help troubleshoot someone's website. It is written in PHP, on a Linux box, using an Apache server and MySQL, and I have never worked with any of these before (except maybe Linux in school).
I got most of the issues fixed (most code is really the same no matter what language it is); however, there is still one page that times out when processing huge files. I'm fairly sure the problem is a timeout somewhere, but I have no idea where all the PHP timeouts would be.
I have adjusted max_execution_time, max_input_time, mysql.connect_timeout, default_socket_timeout, and realpath_cache_ttl in php.ini, but it still times out after about 10 minutes. What other settings might exist that I could increase to try and fix this?
As a side note, I'm aware that a 10-minute run is generally not desirable when processing a file; however, this section of the site is only used by one person once or twice a week, and she doesn't mind, provided the process finishes as expected (and I really don't want to go rewrite someone else's bad code in a language I don't understand, for a process I don't understand).
EDIT: The SQL process finishes in the background; it's just the webpage itself that times out.
Per Frank Farmer's suggestion, I added flush() to the code and it works now. Definitely a browser timeout. Thanks, Frank!
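A sketch of that kind of fix; the $rows loop and process_row() are hypothetical stand-ins for the long-running job:
<?php
// Keep the browser from timing out during a long-running job by
// periodically emitting a byte of output.
foreach ($rows as $row) {
    process_row($row);   // one unit of the long-running work
    echo ' ';            // any output resets the browser's idle timer
    flush();             // push it to the client instead of buffering
                         // (call ob_flush() first if output buffering is on)
}
?>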
You can use set_time_limit(); if you set it to zero, the script should not time out at all.
This would be placed within your script, not in any config, etc.
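For example, at the top of the script:
<?php
set_time_limit(0);   // 0 = no execution-time limit for this script
?>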
Edit: Try changing Apache's timeout setting. In the config, look for the TimeOut directive (it should be the same for Apache 2.x and Apache 1.3.x); once changed, restart Apache and check it.
Edit 3:
Did you go to the link I provided? It lists the default there, which is 300 seconds (5 minutes). Also, if the setting IS NOT in the config file, you CAN add it.
According to the docs:
The TimeOut directive currently defines the amount of time Apache will wait for three things:
The total amount of time it takes to receive a GET request.
The amount of time between receipt of TCP packets on a POST or PUT request.
The amount of time between ACKs on transmissions of TCP packets in responses.
So it is possible it doesn't relate, but try it and see.
