I'm asked by a customer to deliver a TYPO3 based website with the following parameters:
- small amount of content (about 50 pages)
- very little change frequency
- average availabilty about 95%/day
- 20% of pages are restricted, only available after login
- No requirements for fancy typo3 extensions or something else (only Typo3 core)
- Medium sized pages
- Only limited digital assets (images etc.) included
I have the requirements to build an infrastructure to serve up to 1000 concurrent users. With the assumption of having an average think time of 30 sec. this would result in 33 Requests per second.
How could an infrastructure look like?
I know that system scaling is a highly individual task depending on the implementation of the system and needs testing, but I need a first indication where to start (single server, separating components to different servers,...).
Any idea?
Easier solution is EXT:nc_staticfilecache. This saves the static pages as HTML and your web server automatically delivers them through rewrite rules (in case of Apache through mod_rewrite). This works very well for static content and should already enable you to do >100req/s.
The even more fancier way is to use Varnish Cache. Varnish is a reverse proxy server that holds your web site content in memory and can run on a dedicated host. If you configure it correctly (send correct cache headers!), it serves you line speed (some million req/s). There is also a TYPO3 Extension moc_varnish, which e.g. purges the varnish cache, when a page is changed in TYPO3. Also support for edge side includes exists to e.g. only retrieve the user-specific data from TYPO3 and use the static parts of a page from varnish cache (everything except the "Welcome user Foo Bar".. ;)).
As mentioned: Don't forget to configure correct cache headers (Expires etc) for your assets. This already removes some load from your web server.
It's quite possible, already made something like this. You need at least one dedicated server with >= 8GB of RAM.
If we are speaking about infrastructure, the minimal combination is :
nginx/Varnish for front/load balancing
Apache HTTP Server
MySQL could be on standalone server, could be clustered
Performance optimization is very important in such cases.
Some links for further reading :
http://techblog.evo.pl/en/how-to-boost-speed-up-your-typo3-website-with-nginx/
http://www.fabrizio-branca.de/nginx-varnish-apache-magento-typo3.html
http://wiki.typo3.org/Performance_tuning
I'd put this on a single dedicated server (or well specified VPS) but maybe keep all the static assets on a third party CDN so you can focus on the dynamic stuff. I don't know Typo3 but can't see any reason why you couldn't have your db on the same server for this level of usage - there is sure to be caching options of various kinds. Or perhaps consider a cloud server, so if you need more oomph, just add more resources.
Edit: I don't think it is a good idea to build a scalable architecture just yet e.g. proxy servers and all that stuff. If it is slow and you find you really can't cope with one machine, scale up at that point. I'm of the view you can make do with a much simpler architecture given your expected traffic.
I would look into a virtual sserver or a ksm and a good mysql and php configuration. When I have a ksm I would tweak Linux and use iptables for traffic shaping. A dedicated root server would be nice but it's expensive. Then I would think about using a nginx or lighttpd webserver with eaccellerator and memcache. If that doesn't help I would try to compile php and mysql with optimize flags or I would try to compile it with the Intel C Compiler. ICC can optimize C code better then gcc. If the server has many ram I would use ramdisk.
Related
I am currently tasked with finding a solution for a serious PHP bottleneck which is apparently caused by server-side minification of CSS and JS when our sites are under high load.
Some details and what I have found out so far
I inherited a web application running on Wordpress and which uses a complex constellation of Doctrine, Memcached and W3 Total Cache for minification and caching. When under heavy load our application begins to slow down rapidly. So far we have narrowed part of the problem down to the server-side minification process. Preliminary analysis has shown that the number PHP processes start to stack up under load, and when reaching the process limit of 500 processes, start to slow everything down. Something which is also mentioned by the author of the minify library.
Solutions I have evaluated so far
Pre-minification
The most logical solution would be to pre-minify any of the files before going live. Unfortunately our workflow demands that non-developers should be able to edit said files on our production servers (i.e. after the web app has gone live). Therefore I think that pre-processing is out of the question, as it limits the editability of minified files.
Serving unminified files
75% of our users are accessing our web application with their mobile devices, especially smartphones. Unminified JS and CSS amounts to 432KB and is reduced by 60-80% in size when minified. Therefore serving unminified files, while solving the performance and editability problem, is for the sake of mobile users out of the question.
I understand that this is as much a technical problem as it is a workflow problem and I guess we are open to working on both as long as we end up with a better overall performance.
My questions
Is there a reasonable compromise which solves the PHP bottleneck
problem, allows for non-devs to make changes to live CSS/JS and
still serves reasonably sized files to clients.
If there is no such one-size-fits-all solution, what can I do to
better our workflow and / or server-side behaviour?
EDIT: Because there were some questions / comments regarding the server configuration, our servers run Debian and are equipped with 32GB of RAM and 24 core CPUs.
You can run a css/javascript compilation service like Gulp or Grunt via Node.js that minifies all your js and css assets on change.
This service can run in production but that is not recommended without some architectural setup ( having multiple versioned compiled files and auto-checking them via gulp or another extension ).
I emphasize that patching features into production and directly
editing it is strongly discouraged as it can present live issues to
your visitors reducing your credibility.
http://gulpjs.com/
Using Gulp/Grunt would require you to change how you write your css/javascript files.
I would solve this with 2 solutions - first, removing any WP-CRON operation that runs every time a user runs the application and move that to actual CRON on the server. Second I would use load balancing so that a single server is not taking the load of the work. That is your real problem and even if you fix your perceived code issues you are still faced with the load issue.
I don't believe you need to change your workflow at all or go down the road of major modification to your existing system.
The WP-CRON tasks that runs each time a page is loaded causes significant load and slowness. You can shift this from the users visiting running that process to your server just running it at the server level. This reduces load. It is also most likely running these processes that you believe are slowing down the site.
See this guide:
http://www.inmotionhosting.com/support/website/wordpress/disabling-the-wp-cronphp-in-wordpress
Next - load balancing. Having a single server supplying all your users when you have a lot of traffic is a terrible idea. You need to split the webservers load.
I'm not sure where or how you are hosted but I would move things to AWS. Setup my WordPress site for load balancing # AWS: http://www.mornin.org/blog/scalable-wordpress-amazon-web-services/
This will involve:
Load Balancer
EC2 Instances running PHP/Apache
RDS for your database storage for all EC2 instances
S3 Storage for the sites media
For user sessions I suggest you just setup stickiness on the load balancer so users are continually served the same node they arrived on.
You can get a detailed guide on how to do this here:
http://www.mornin.org/blog/scalable-wordpress-amazon-web-services/
Or at server fault another approach:
https://serverfault.com/questions/571658/load-balancing-wordpress-on-amazon-web-services-managing-changes
The assumption here is that if you are high traffic you are making revenue from this high traffic so anytime your service responds slowly it will turn away users or possibly discourage them from returning. Changing the software could help - but you're treating the symptom not the illness. The illness is that your server comes under heavy load. This isn't uncommon with WordPress and high traffic, so you need to spread the load instead of try and micro-optimize. The difference is the optimizations will be small gains while the load balancing and spread of load actually solves the problem.
Finally - consider using a CDN for serving all of your media. This loads media faster and it removes load from your system by reducing the amount of requests to the server and it's output to the clients. It also loads pages faster consistently for people wherever they are visiting from by supplying media from nodes closest to them. At AWS this is called CloudFront. WordPress also offers this service free via Jetpack (I believe) but it does not handle all media from my understanding.
I like the idea of using GulpJS. One thing you might consider is to have a wp-cron or even just a system cron that runs every 5 minutes or so and then runs a gulp task to minify and concatenate your css and js files.
Another option that doesn't require scheduling but is based off of watching the file system for changes and then triggering a Gulp build to happen is to use incron (inotify cron). Check out the incron man page. Incron is great in that it triggers actions based on file system events such as file changes. You could use this to trigger a gulp build when any css file changes on the file system.
One caveat is that this is a Linux solution so if you're hosting on Windows you might have to look for something similar.
Edit:
Incron Documentation
By no means, NewRelic is taking the world by storm with many successful deployments.
But what are the cons of using it in production?
PHP monitoring agent works as a .so extension. If I understand correctly, it connects to another system aggregation service, which filters data out and pushes them into the NewRelic cloud.
This simply means that it works transparently under the hood. However, is this actually true?
Any monitoring, profiling or api service adds some overhead to the entire stack.
The extension itself is 0.6 MB, which adds up to each php process, this isn't much so my concern is rather CPU and IO.
The image shows CPU Utilization on a production EC2 t1.micro instances with NewRelic agent (top blue one) and w/o the agent (other lines)
What does NewRelic really do what cause the additional overhead?
What are other negative sides when using it?
Your mileage may vary based on the settings, your particular site's code base, etc...
The additional overhead you're seeing is less the memory used, but the tracing and profiling of your php code and gathering analytic data on it as well as DB request profiling. Basically some additional overhead hooked into every php function call. You see similar overhead if you left Xdebug or ZendDebugger running on a machine or profiling. Any module will use some resources, ones that hook deep in for profiling can be the costliest, but I've seen new relic has config settings to dial back how intensively it profiles, so you might be able to lighten it's hit more than say Xdebug.
All that being said, with the newrelic shared PHP module loaded with the default setup and config from their site my company's website overall server response latency went up about 15-20% across the board when we turned it on for all our production machines. I'm only talking about the time it takes for php-fpm to generate an initial response. Our site is http://www.nara.me. The newrelic-daemon and newrelic-sysmon services running as well, but I doubt they have any impact on response time.
Don't get me wrong, I love new relic, but the perfomance hit in my specific situation hit doesn't make me want to keep the PHP module running on all our live load balanced machines. We'll probably keep it running on one machine all the time. We do plan to keep the sysmon stuff going 100% and keep the module disabled in case we need it for troubleshooting.
My advice is this:
Wrap any calls to new relic functions in if(function_exists ( $function_name )) blocks so your code can run without error if the new relic module isn't loaded
If you've multiple identical servers behind a loadbalancer sharing the same code, only enable the php module on one image to save performance. You can keep the sysmon stuff running if you use new relic for this.
If you've just one server, only enable the shared php module when you need it--when you're actually profiling your code or mysql unless a 10-20% performance hit isn't a problem.
One other thing to remember if your main source of info is the new relic website: they get paid by the number of machines you're monitoring, so don't expect them to convince you to not use it on anything less than 100% of your machines even if it not needed. I think one of their FAQ's or blogs state basically you should expect some performance impact, but if you use it as intended and fix the issues you see from it, you should recoup the latency lost. I agree, but I think once you fix the issues, limit the exposure to the smallest needed number of servers.
The agent shouldn't be adding much overhead the way it is designed. Because of the level of detail required to adequately troubleshoot the problem, this seems like a good question to ask at https://support.newrelic.com
Consider a web app in which a call to the app consists of PHP script running several MySQL queries, some of them memcached.
The PHP does not do very complex job. It is mainly serving the MySQL data with some formatting.
In the past it used to be recommended to put MySQL and the app engine (PHP/Apache) on separate boxes.
However, when the data can be divided horizontally (for example when there are ten different customers using the service and it is possible to divide the data per customer) and when Nginx +FastCGI is used instead of heavier Apache, doesn't it make sense to put Nginx Memcache and MySQL on the same box? Then when more customers come, add similar boxes?
Background: We are moving to Amazon Ec2. And a separate box for MySQL and app server means double EBS volumes (needed on app servers to keep the code persistent as it changes often). Also if something happens to the database box, more customers will fail.
Clarification: Currently the app is running with LAMP on a single server (before moving to EC2).
If your application architecture is already designed to support Nginx and MySQL on separate instances, you may want to host all your services on the same instance until you receive enough traffic that justifies the separation.
In general, creating new identical instances with the full stack (Nginx + Your Application + MySQL) will make your setup much more difficult to maintain. Think about taking backups, releasing application updates, patching the database engine, updating the database schema, generating reports on all your clients, etc. If you opt for this method, you would really need to find some big advantages in order to offset all the disadvantages.
You need to measure carefully how much memory overhead everything has - I can't see enginex vs Apache making much difference, it's PHP which will use all the RAM (this in turn depends on how many processes the web server chooses to run, but that's more of a tuning issue).
Personally I'd stay away from enginex on the grounds that it is too risky to run such a weird server in production.
Databases always need lots of ram, and the only way you can sensibly tune the memory buffers is to have them on dedicated servers. This is assuming you have big data.
If you have very small data, you could keep it on the same box.
Likewise, memcached makes almost no sense if you're not running it on dedicated boxes. Taking memory from MySQL to give to memcached is really robbing Peter to pay Paul. MySQL can cache stuff in its innodb_buffer_pool quite efficiently (This saves IO, but may end up using more CPU as you won't cache presentation logic etc, which may be possible with memcached).
Memcached is only sensible if you're running it on dedicated boxes with lots of ram; it is also only sensible if you don't have enough grunt in your db servers to serve the read-workload of your app. Think about this before deploying it.
If your application is able to work with PHP and MySQL on different servers (I don't see why this wouldn't work, actually), then, it'll also work with PHP and MySQL on the same server.
The real question is : will your servers be able to handle the load of both Apache/nginx/PHP, MySQL, and memcached ?
And there is only one way to answer that question : you have to test in a "real" "production" configuration, to determine own loaded your servers are -- or use some tool like ab, siege, or OpenSTA to "simulate" that load.
If there is not too much load with everything on the same server... Well, go with it, if it makes the hosting of your application cheapier ;-)
I recently experienced a flood of traffic on a Facebook app I created (mostly for the sake of education, not with any intention of marketing)
Needless to say, I did not think about scalability when I created the app. I'm now in a position where my meager virtual server hosted by MediaTemple isn't cutting it at all, and it's really coming down to raw I/O of the machine. Since this project has been so educating to me so far, I figured I'd take this as an opportunity to understand the Amazon EC2 platform.
The app itself is created in PHP (using Zend Framework) with a MySQL backend. I use application caching wherever possible with memcached. I've spent the weekend playing around with EC2, spinning up instances, installing the packages I want, and mounting an EBS volume to an instance.
But what's the next logical step that is going to yield good results for scalability? Do I fire up an AMI instance for the MySQL and one for the Apache service? Or do I just replicate the instances out as many times as I need them and then do some sort of load balancing on the front end? Ideally, I'd like to have a centralized database because I do aggregate statistics across all database rows, however, this is not a hard requirement (there are probably some application specific solutions I could come up with to work around this)
I know this is probably not a straight forward answer, so opinions and suggestions are welcome.
So many questions - all of them good though.
In terms of scaling, you've a few options.
The first is to start with a single box. You can scale upwards - with a more powerful box. EC2 have various sized instances. This involves a server migration each time you want a bigger box.
Easier is to add servers. You can start with a single instance for Apache & MySQL. Then when traffic increases, create a separate instance for MySQL and point your application to this new instance. This creates a nice layer between application and database. It sounds like this is a good starting point based on your traffic.
Next you'll probably need more application power (web servers) or more database power (MySQL cluster etc.). You can have your DNS records pointing to a couple of front boxes running some load balancing software (try Pound). These load balancing servers distribute requests to your webservers. EC2 has Elastic Load Balancing which is an alternative to managing this yourself, and is probably easier - I haven't used it personally.
Something else to be aware of - EC2 has no persistent storage. You have to manage persistent data yourself using the Elastic Block Store. This guide is an excellent tutorial on how to do this, with automated backups.
I recommend that you purchase some reserved instances if you decide EC2 is the way forward. You'll save yourself about 50% over 3 years!
Finally, you may be interested in services like RightScale which offer management services at a cost. There are other providers available.
First step is to separate concerns. I'd split off with a separate MySQL server and possibly a dedicated memcached box, depending on how high your load is there. Then I'd monitor memory and CPU usage on each box and see where you can optimize where possible. This can be done with spinning off new Media Temple boxes. I'd also suggest Slicehost for a cheaper, more developer-friendly alternative.
Some more low-budget PHP deployment optimizations:
Using a more efficient web server like nginx to handle static file serving and then reverse proxy app requests to a separate Apache instance
Implement PHP with FastCGI on top of nginx using something like PHP-FPM, getting rid of Apache entirely. This may be a great alternative if your Apache needs don't extend far beyond mod_rewrite and simpler Apache modules.
If you prefer a more high-level, do-it-yourself approach, you may want to check out Scalr (code at Google Code). It's worth watching the video on their web site. It facilities a scalable hosting environment using Amazon EC2. The technology is open source, so you can download it and implement it yourself on your own management server. (Your Media Temple box, perhaps?) Scalr has pre-built AMIs (EC2 appliances) available for some common use cases.
web: Utilizes nginx and its many capabilities: software load balancing, static file serving, etc. You'd probably only have one of these, and it would probably implement some sort of connection to Amazon's EBS, or persistent storage solution, as mentioned by dcaunt.
app: An application server with Apache and PHP. You'd probably have many of these, and they'd get created automatically if more load needed to be handled. This type of server would hold copies of your ZF app.
db: A database server with MySQL. Again, you'd probably have many of these, and more slave instances would get created automatically if more load needed to be handled.
memcached: A dedicated memcached server you can use to have centralized caching, session management, et cetera across all your app instances.
The Scalr option will probably take some more configuration changes, but if you feel your scaling needs accelerating quickly it may be worth the time and effort.
I have a simple question and wish to hear others' experiences regarding which is the best way to replicate images across multiple hosts.
I have determined that storing images in the database and then using database replication over multiple hosts would result in maximum availability.
The worry I have with the filesystem is the difficulty synchronising the images (e.g I don't want 5 servers all hitting the same server for images!).
Now, the only concerns I have with storing images in the database is the extra queries hitting the database and the extra handling i'd have to put in place in apache if I wanted 'virtual' image links to point to database entries. (e.g AddHandler)
As far as my understanding goes:
If you have a script serving up the
images: Each image would require a
database call.
If you display the images inline as
binary data: Which could be done in
a single database call.
To provide external / linkable
images you would have to add a
addHandler for the extension you
wish to 'fake' and point it to your
scripting language (e.g php, asp).
I might have missed something, but I'm curious if anyone has any better ideas?
Edit:
Tom has suggested using mod_rewrite to save using an AddHandler, I have accepted as a proposed solution to the AddHandler issue; however I don't yet feel like I have a complete solution yet so please, please, keep answering ;)
A few have suggested using lighttpd over Apache. How different are the ISAPI modules for lighttpd?
If you store images in the database, you take an extra database hit plus you lose the innate caching/file serving optimizations in your web server. Apache will serve a static image much faster than PHP can manage it.
In our large app environments, we use up to 4 clusters:
App server cluster
Web service/data service cluster
Static resource (image, documents, multi-media) cluster
Database cluster
You'd be surprised how much traffic a static resource server can handle. Since it's not really computing (no app logic), a response can be optimized like crazy. If you go with a separate static resource cluster, you also leave yourself open to change just that portion of your architecture. For instance, in some benchmarks lighttpd is even faster at serving static resources than apache. If you have a separate cluster, you can change your http server there without changing anything else in your app environment.
I'd start with a 2-machine static resource cluster and see how that performs. That's another benefit of separating functions - you can scale out only where you need it. As far as synchronizing files, take a look at existing file synchronization tools versus rolling your own. You may find something that does what you need without having to write a line of code.
Serving the images from wherever you decide to store them is a trivial problem; I won't discuss how to solve it.
Deciding where to store them is the real decision you need to make. You need to think about what your goals are:
Redundancy of hardware
Lots of cheap storage
Read-scaling
Write-scaling
The last two are not the same and will definitely cause problems.
If you are confident that the size of this image library will not exceed the disc you're happy to put on your web servers (say, 200G at the time of writing, as being the largest high speed server-grade discs that can be obtained; I assume you want to use 1U web servers so you won't be able to store more than that in raid1, depending on your vendor), then you can get very good read-scaling by placing a copy of all the images on every web server.
Of course you might want to keep a master copy somewhere too, and have a daemon or process which syncs them from time to time, and have monitoring to check that they remain in sync and this daemon works, but these are details. Keeping a copy on every web server will make read-scaling pretty much perfect.
But keeping a copy everywhere will ruin write-scalability, as every single web server will have to write every changed / new file. Therefore your total write throughput will be limited to the slowest single web server in the cluster.
"Sharding" your image data between many servers will give good read/write scalability, but is a nontrivial exercise. It may also allow you to use cheap(ish) storage.
Having a single central server (or active/passive pair or something) with expensive IO hardware will give better write-throughput than using "cheap" IO hardware everywhere, but you'll then be limited by read-scalability.
Having your images in a database doesn't necessarily mean a database call for each one; you could cache these separately on each host (e.g. in temporary files) when they are retrieved. The source images would still be in the database and easy to synchronise across servers.
You also don't really need to add Apache handlers to serve an image through a PHP script whilst maintaining nice urls- you can make urls like http://server/image.php/param1/param2/param3.JPG and read the parameters through $_SERVER['PATH_INFO'] . You could also remove the 'image.php' portion of the URL (if you needed to) using mod_rewrite.
What you are looking for already exists and is called MogileFS
Target setup involves mogilefsd, replicated mysql databases and lighttd/perlbal for serving files; It will bring you failover, fine grained file replication (for exemple, you can decide to duplicate end-user images on several physical devices, and to keep only one physical instance of thumbnails). Load balancing can also be achieved quite easily.