We run a medium-size site that gets a few hundred thousand pageviews a day. Up until last weekend we ran with a load usually below 0.2 on a virtual machine. The OS is Ubuntu.
When deploying the latest version of our application, we also did an apt-get dist-upgrade before deploying. After we had deployed we noticed that the load on the CPU had spiked dramatically (sometimes reaching 10 and stopping to respond to page requests).
We tried dumping a full minute of Xdebug profiling data from PHP, but looking through it revealed only a few somewhat slow parts, but nothing to explain the huge jump.
We are now pretty sure that nothing in the new version of our website is triggering the problem, but we have no way to be sure. We have rolled back a lot of the changes, but the problem still persists.
When look at processes, we see that single Apache processes use quite a bit of CPU over a longer period of time than strictly necessary. However, when using strace on the affected process, we never see anything but
accept(3,
and it hangs for a while before receiving a new connection, so we can't actually see what is causing the problem.
The stack is PHP 5, Apache 2 (prefork), MySQL 5.1. Most things run through Memcached. We've tried APC and eAccelerator.
So, what should be our next step? Are there any profiling methods we overlooked/don't know about?
The answer ended up being not-Apache related. As mentioned, we were on a virtual machine. Our user sessions are pretty big (think 500kB per active user), so we had a lot of disk IO. The disk was nearly full, meaning that Ubuntu spent a lot of time moving things around (or so we think). There was no easy way to extend the disk (because it was not set up properly for VMWare). This completely killed performance, and Apache and MySQL would occasionally use 100% CPU (for a very short time), and the system would be so slow to update the CPU usage meters that it seemed to be stuck there.
We ended up setting up a new VM (which also gave us the opportunity to thoroughly document everything on the server). On the new VM we allocated plenty of disk space, and moved sessions into memory (using memcached). Our load dropped to 0.2 on off-peak use and around 1 near peak use (on a 2-CPU VM). Moving the sessions into memcached took a lot of disk IO away (we were constantly using about 2MB/s of disk IO, which is very bad).
Conclusion; sometimes you just have to start over... :)
Seeing an accept() call from your Apache process isn't at all unusual - that's the webserver waiting for a new request.
First of all, you want to establish what the parameters of the load are. Something like
vmstat 1
will show you what your system is up to. Look in the 'swap' and 'io' columns. If you see anything other than '0' in the 'si' and 'so' columns, your system is swapping because of a low memory condition. Consider reducing the number of running Apache children, or throwing more RAM in your server.
If RAM isn't an issue, look at the 'cpu' columns. You're interested in the 'us' and 'sy' columns. These show you the percentage of CPU time spent in either user processes or system. A high 'us' number points the finger at Apache or your scripts - or potentially something else on the server.
Running
top
will show you which processes are the most active.
Have you ruled out your database? The most common cause of unexpectedly high load I've seen on production LAMP stacks come down to database queries. You may have deployed new code with an expensive query in it; or got to the point where there are enough rows in your dataset to cause previously cheap queries to become expensive.
During periods of high load, do
echo "show full processlist" | mysql | grep -v Sleep
to see if there are either long-running queries, or huge numbers of the same query operating at once. Other mysql tools will help you optimise these.
You may find it useful to configure and use mod_status for Apache, which will allow you to see what request each Apache child is serving and for how long it has been doing so.
Finally, get some long-term statistical monitoring set up. Something like zabbix is straightforward to configure, and will let you monitor resource usage over time, such that if things get slow, you have historical baselines to compare against, and a better ieda of when problems started.
Perhaps you where using worker MPM before and now you are not?
I know PHP5 does not work with the Worker MPM. On my Ubuntu server, PHP5 can only be installed with the Prefork MPM. It seems that PHP5 module is not compatible with multithreading version of Apache.
I found a link here that will show you how to get better performance with mod_fcgid
To see what worker MPM is see here.
I'd use dTrace to solve this mystery... if it was running on Solaris or Mac... but since Linux doesn't have it you might want to try their Systemtap, however I can't say anything about its usability since I haven't used it.
With dTrace you could easily sniff out the culprits within a day, and would hope with Systemtap it would be similiar
Another option that I can't assure you will do any good, but it's more than worth the effort. Is to read the detailed changelog for the new version, and review what might have changed that could remotely affect you.
Going through the changelogs has saved me more than once. Especially when some config options have changed and when something got deprecated. Worst case is it'll give you some clues as to where to look next
Related
I am on a VPS server (Quad core with 768MB of RAM. I am running Apache and using APC as my default caching engine) . I get around 2000 visitors per day with a maximum of 100 concurrent visitors.
According to CakePHP DebugKit, my most memory intensive pages use around 16MB and the other pages average at around 10MB.
Is this memory usage normal?
Is it normal for my memory to bottleneck at 2000 visitors per page?
Should I consider upgrading my plan to 2GB RAM?
I also noticed that the View rendering is taking up most of the memory, around 70% on most pages.
I am monitoring my resource usage, when I have around 50 or more concurrent users, I am getting 0 free MB left.
Thank you
You should also check your other processes. From my experience, MySQL takes up more memory than anything else on any stack that I run. You should also implement better page caching so that PHP doesn't need to be touched when it isn't absolutely necessary. But Apache is also a memory hog that needs to be fine tuned. If you want to stick with Apache, then run Varnish in front of it.
Also, keep APC, but also add Memcached. It's much faster.
If your site has a spike-load that brings it to zero memory, then, if you can, consider launching extra instances of the server and doing a sort of round-robin (if the VPS is cloud-hosted and this is possible). If the load is constant, then definitely upgrade.
#burzum is completely right, however, that you should just switch to nginx. It's far, far better than Apache at this point. But, just to put you on the right track, quite a few people run nginx as a reverse proxy in front of Apache, and while that does speed up the server, it's entirely unnecessary because nginx can do pretty much anything you need it to do without Apache. Don't bother using Varnish in front of nginx either because nginx can act as its own reverse proxy.
Your best bet is to implement nginx with apcu (upgrade php to 5.5 if possible for better performance) and use memcached, and implement nginx's native microcaching abilities. If your site is heavy on read and light on write, then you might notice that everything is taken care of by nginx just retrieving a cached copy from memcache. While this helps with memory, it also helps with processing. My servers' CPUs usually have a 3-5% usage during peaks.
Is this memory usage normal?
Yes, it doesn't seem to be high for a normal CakePHP application.
Is it normal for my memory to bottleneck at 2000 visitors per page?
I guess yes, I'm not a "server guy".
Should I consider upgrading my plan to 2GB RAM?
I would try switching to Nginx first. Apache is a memory eating monster compared to Nginx, just search for a few banchmarks, a random one I've picked by a quick search is from this page.
Overall Nginx should provide you a much better performance. At some stage I would consider updating the memory but try Nginx first.
We try to deploy APC user-cache in a high load environment as local 2nd-tier cache on each server for our central caching service (redis), for caching database queries with rarely changing results, and configuration. We basically looked at what Facebook did (years ago):
http://www.slideshare.net/guoqing75/4069180-caching-performance-lessons-from-facebook
http://www.slideshare.net/shire/php-tek-2007-apc-facebook
It works pretty well for some time, but after some hours under high load, APC runs into problems, so the whole mod_php does not execute any PHP anymore.
Even a simple PHP script with only does not answer anymore, while static resources are still delivered by Apache. It does not really crash, there is no segfault. We tried the latest stable and latest beta of APC, we tried pthreads, spin locks, every time the same problem. We provided APC with far more memory it can ever consume, 1 minute before a crash we have 2% fragmentation and about 90% of the memory is free. When it „crashes“ we don’t find nothing in error logs, only restarting Apache helps. Only with spin locks we get an php error which is:
PHP Fatal error: Unknown: Stuck spinlock (0x7fcbae9fe068) detected in
Unknown on line 0
This seems to be a kind of timeout, which does not occur with pthreads, because those don’t use timeouts.
What’s happening is probably something like that:
http://notmysock.org/blog/php/user-cache-timebomb.html
Some numbers: A server has about 400 APC user-cache hits per second and about 30 inserts per second (which is a lot I think), one request has about 20-100 user-cache requests. There are about 300.000 variables in the user-cache, all with ttl (we store without ttl only in our central redis).
Our APC-settings are:
apc.shm_segments=1
apc.shm_size=4096M
apc.num_files_hint=1000
apc.user_entries_hint=500000
apc.max_file_size=2M
apc.stat=0
Currently we are using version 3.1.13-beta compiled with spin locks, used with an old PHP 5.2.6 (it’s a legacy app, I’ve heard that this PHP version could be a problem too?), Linux 64bit.
It's really hard to debug, we have written monitoring scripts which collect as much data as we could get every minute from apc, system etc., but we cannot see anything uncommon - even 1 minute before a crash.
I’ve seen a lot of similar problems here, but by now we couldn’t find a solution which solves our problem yet. And when I read something like that:
http://webadvent.org/2010/share-and-enjoy-by-gopal-vijayaraghavan
I’m not sure if going with APC for a local user-cache is the best idea in high load environments. We already worked with memcached here, but APC is a lot faster. But how to get it stable?
best regards,
Andreas
Lesson 1: https://www.kernel.org/doc/Documentation/spinlocks.txt
The single spin-lock primitives above are by no means the only ones. They
are the most safe ones, and the ones that work under all circumstances,
but partly because they are safe they are also fairly slow. They are slower
than they'd need to be, because they do have to disable interrupts
(which is just a single instruction on a x86, but it's an expensive one -
and on other architectures it can be worse).
That's written by Linus ...
Spin locks are slow; that assertion is not based on some article I read online by facebook, but upon the actual facts of the matter.
It's also, an incidental fact, that spinlocks are deployed at levels higher than the kernel because of the very problems you speak of; untraceable deadlocks because of a bad implementation.
They are used by the kernel efficiently, because that's where they were designed to be used, locking tiny tiny tiny sections, not sitting around and waiting for you to copy your amazon soap responses into apc and back out a billion times a second.
The most suitable kind of locking (for the web, not the kernel) available in APC is definitely rwlocks, you have to enable rwlocks with a configure option in legacy APC and it is the default in APCu.
The best advice that can be given, and I already gave it, is don't use spinlocks, if mutex are causing your stack to deadlock then try rwlocks.
Before I continue, your main problem is you are using a version of PHP from antiquity, which nobody even remembers how to support, in general you should look to upgrade, I'm aware of the constraints on the OP, but it would be irresponsible to negate to mention that this is a real problem, you do not want to deploy on unsupported software. Additionally APC is all but unmaintained, it is destined to die. O+ and APCu are it's replacement in modern versions of PHP.
Anyway, I digress ...
Synchronization is a headache when you are programming at the level of the kernel, with spinlocks, or whatever. When you are several layers removed from the kernel, when you rely on 6 or 7 bits of complicated software underneath you synchronizing properly in order that your code can synchronize properly synchronization becomes, not only a headache for the programmer, but for the executor too; it can easily become the bottleneck of your shiny web application even if there are no bugs in your implementation.
Happily, this is the year 2013, and Yahoo aren't the only people able to implement user caches in PHP :)
http://pecl.php.net/package/yac
This is an, extremely clever, lockless cache for userland PHP, it's marked as experimental, but once you are finished, have a play with it, maybe in another 7 years we won't be thinking about synchronization issues :)
I hope you get to the bottom of it :)
Unless you are on a freebsd derived operating system it is not a good idea to use spinlocks, they are the worst kind of synchronization on the face of the earth. The only reason you must use them in freebsd is because the implementer negated to include PTHREAD_PROCESS_SHARED support for mutex and rwlocks, so you have little choice but to use the pg-sql inspired spin lock in that case.
i have some problem with my VPS.
We are using a VPS to run our CMS and our websites, for now we have 300MB memory limit, and now we are close to reach the limit.
To maintain low cost(i know, increase memory is not to much expensive), but if i find a solution to optimize what we have, will be better.
What can i do?
Thanks!
I would suggest to increase memory - more memory, faster website :-)
But if speed is not important, reduce all cache sizes, set php memory_limit to 8M, disable opcode caching (APC, eAccelerator)
or try Raspberry Pi as server, now comes with 512MB :-)
I have a small 256MB vps which uses
apache + php + mariadb (mysqld)
I found memory was being gobbled by apache at every request 20MB at time for a simple wordpress page. It would settle but over time gobble into swap space and slow everything to a crawl. I suspect there are ways to fine tune mpm_event and mpm_worker to stop entering swap but I didn't figure out how.
When working in such tight environments it is important to know what is using what memory and reduce everything, so that swapping is minimized.
A summary of what I did follows, I managed to get 100MB physical headroom for tweaking, this is not a heavily loaded server but needs to be accessible (and cheap):
Using slackware (the os and default services seem to use less memory than debian, perhaps there is less loaded by default on my providers image)
Switch mysqld innodb tables off and use myisam
configure mysqld to reduce cache sizes if relevant
install cms fresh so they are definitely using myisam or alter all the tables
in apache choose to use the mod_mpm_prefork rather than mpm_event and mpm_worker ("apachectl -V" will tell you which is being used)
set sensible values for maxserver(start from low values for number of servers and requests and work up)
test the server under load whilst doing 'watch free' or 'top' through ssh
you should be able to see memory & the processes being created and destroyed as the load increases
adjust your httpd server settings and retest until happy
I like to see the memory max out at no more than 90% and reduce back down when the load is gone before I am convinced it will stay up without grinding to a halt (start using swap).
check your php.ini memory settings as stated in another answer.
I have also set a cron job to send me an email when swap starts to get used significantly, so I can restart something or even restart the whole server if I can't find what has used the memory, this should happen less and less the more you fine tune.
As stated this is not an environment that will perform well under heavy load but the cost may be more important to you.
Just my two pence worth...
I would look at Nginx like concerto49 recommended, if you have just one website on it also consider Litespeed (www.litespeedtech.com) they have a free version which may be sufficient to power your site.
If its PHP based, then strip out everything you're not using. Use APC/XCache to processing every request. Nginx also has a caching module, that could help so you can avoid hitting PHP for every request if its still fresh.
What type of VPS is it? OpenVZ? Xen? KVM? If it is OpenVZ does it have VSwap or Burst memory?
What type of CMS / Website are you running? Is it PHP based? Are you using Apache? If so have you tried nginx? I would look at optimizing the web server component and removing unused processes/applications to reduce memory and increase performance.
Running WordPress on IIS 7 (Windows Server 2008) with WP-SuperCache as per IIS.net's guide.
Was running great but recently we changed the permissions on some folders and the administrator password and we're getting huge spikes in our CPU usage as a result of the PHP-cgi.exe processes.
This leads me to believe it's not caching however the pages themselves have the "Cached with WP-SuperCache" comments at the bottom, and the caching seems to be working correctly.
What else could be the issue here?
I think I may have found a solution or at least a work round to this problem, at least it seems to be working for me reliably.
Try setting the Max Instances setting, under IIS Server --> FastCGI Settings, to 1.
It seemed to me that only certain requests were causing a php-cgi.exe process to go rogue and hog the cpu, usually when updating a post. When reading other posts on this issue one of them mentioned the Max Instances setting and that it is set to default at 0 or automatic. I wondered if this might not have a good effect when things aren't as they should be. I'm guessing (but this isn't quite my field of expertise) if a certain request(s) is causing the process to lock-up, so FastCGI just creates another, whilst leaving the first in place. Somehow it seems only having a single instance allows PHP to move on from the lock-up and the cpu stays under control.
For servers with high-levels of requests setting FastCGI to only a single instance may not be ideal, but it certainly beats the delays I was getting before. Used in combination with WP-SuperCache and WinCache, things seem to nipping along nicely now.
Looking at that task mgr looks like its missing the cache on every request. Plus that article dates to 2008 so difficult to say whether the directions as written would still work. Something with WP-SuperCache could have changed.
I would recommend using W3 Total Cache. I've done extensive testing with it on Windows Server 2008 and IIS 7 and it works great. It is also compatible with and leverages the WinCache extension for PHP. Has some other great features too if you're interested, minification, CDN support, etc. It's a really great performance plugin for WordPress. You can get the plugin here, http://wordpress.org/extend/plugins/w3-total-cache/
some other things to check...
What size is the app pool? (# of processes?)
Make sure you are using PHP 5.3.
Make sure you are using WinCache.
Make sure to set MaxInstanceRequests to something less than PHP_FCGI_MAX_REQUESTS. Definitely do not allow PHP to handle recyling the app pool. The default is 10K requests. If you are seeing these results during a load test then this might be the cause. Increase MaxInstanceRequests and keep it one less than PHP_FCGI_MAX_REQUESTS.
Hope that helps.
I am running wampserver on my windows vista machine. I have been doing this for a long time and it has been working great. I have completed loads of projects with this setup.
However, today, without me changing anything (no configuration etc) only PHP code changes, I find that every time I load pages of my site (those with user sessions or access the database) are really slow to load - Over 30 seconds, they use to take 1 or 2 seconds.
When I have a look at the task manager, I can see on page loads the httpd process jumps from 10mb to 30mb, 90mb, 120mb, 250mb and then back down again.
I have tested previous php code projects and they seem to all be slow as well!
What is going on?
Thanks all for any help on this confusion issue!
Check the following:
Check if you your data-access library to access your database has been changed/updated lately (if you use one).
Just a guess, but did you changed your antivirus/firewall (or settings) since last time you checked those previous projects? A more aggresive security can slow things a lot.
Did you changed the Apache/PHP/MySQL version in the WAMPSERVER menu?
Maybe you can try to reinstall WAMPSERVER (do this last and if it's not an hassle for you because I really doubt this will help but it can in some really really weird cases).
But from experience and the memory usage you explain in your question it seems that your SQL queries are long to execute and/or return a really large data set.
Try to optimise your queries, it can help for speed but not really memory usage (at least if the result set is the same). For the memory, maybe you can use LIMIT to reduce your returning data set (if your design allows it - but it should).
Since we don't really know what you do with your data, take note than "playing" (like parsing large XML documents) with large data sets can take much time/memory (again it depends much on what you do with all this data).
Bottom line, if nothing in this post helps, try to post more information on your setup and what exactly you do (with even code samples) when it's slow.
Try checking the size of your wamp log files.
i.e.
C:\wamp\logs
Sometimes, when they get really big, they can cause Apache to slow down.
Have you recently changed your network configuration or upgraded your system? That may be causing this issue through your network config or anti-virus/security software. People have had issues with zonealarm causing this in the past, for example.
Also, if you've recently switched from typing "127.0.0.1" to "localhost" or moved around networks, you may benefit from removing the IPV6 localhost setting from C:\Windows\System32\drivers\etc\hosts if you have one:
change the line with
::1 localhost
to
# ::1 localhost
I am surprised that no one has suggested this. You should always try to see why things are really slowing down:
Use Performance monitor to see where the bottleneck is, first. (perfmon.exe)
Is hard page fault actually the bottleneck? Is your hard drive busy reading/writing to the pagefile? Check the length of IO queue for the hard disk.
Are CPUs busy?
If nothing looks busy, use procmon to see if your php process is blocked on some system calls.
Hope this helps.
though not related to finding database bottlenecks, XDebug in combination with a cachegrind viewer (e.g. WebGrind, WinCacheGrind) can help you find the part of your PHP code, that takes longest to execute.