My client currently has only one server, with both MySQL and Apache running on it, and at busy times of the year they occasionally see Apache fall over because it has so many connections.
They run two applications: their busy, public, PHP-based ecommerce website, and their internal order-processing application, which is busy during working hours only and has 15-20 concurrent users.
I've managed to get them to increase their budget enough to get two servers. I'm considering either:
A) one server running Apache/PHP and the other as a dedicated MySQL server, or
B) one running their public website only, and the other running MySQL and the internal application.
The benefit I see of A) is that MySQL's my.cnf can be tuned to use all of the resources of that server, but it has the drawback of only having one Apache instance running.
B) would spread the Apache load across both servers, but would limit MySQL's resources on its server, even outside working hours when the internal application isn't being used.
I just can't decide which way to go with this and would be grateful for any feedback you may have.
Both approaches are wrong.
You have two goals here: availability and performance (I'm considering capacity to be an aspect of performance in this context).
To improve availability, you should be ensuring that there is no single point of failure in your architecture. But with the models you propose, you're actually creating multiple single points of failure - hence your two-server models are less available than your single server.
From a performance point of view, you want to spread the workload across the available resources. You can't move CPU and memory between the servers but you can move the traffic.
Hence the optimal solution is to run both applications on both servers. Setting up MySQL clustering is a bit more complex, but the out-of-the-box asynchronous replication will probably be adequate, with the nodes configured as master-master (but writes from the two applications targeted sensibly).
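For illustration, a master-master pair is normally given distinct server IDs and staggered auto-increment settings so the two nodes never generate clashing primary keys. A minimal my.cnf sketch (server names and values are placeholders, not a tested configuration):

    # on server A
    [mysqld]
    server-id                = 1
    log_bin                  = mysql-bin
    auto_increment_increment = 2   # two nodes writing
    auto_increment_offset    = 1   # this node hands out odd IDs

    # on server B
    [mysqld]
    server-id                = 2
    log_bin                  = mysql-bin
    auto_increment_increment = 2
    auto_increment_offset    = 2   # this node hands out even IDs

Each node is then pointed at the other with CHANGE MASTER TO, and you'd direct the public site's writes at one node and the internal application's writes at the other.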
There's probably a lot of scope for increasing the capacity of the system further, but without a lot more detail (more than is appropriate in this forum, and possibly more than your client is comfortable paying for) it is hard to advise.
Google offers different tiers of their google-cloud-sql service.
I don't understand when someone would need to upgrade from the very basic D0 tier.
My questions are:
1) If you are connecting GAE to Cloud SQL, will the SQL concurrent-connection limit cap the scalability of your GAE app at 250 concurrent requests? I mean, will GAE create a new connection to Cloud SQL on every request?
1bis) Can a heavily requested GAE app use only one SQL connection?
2) Could you give some example scenarios where a Dx tier would be recommended?
I don't understand when someone would need to upgrade from the very basic D0 tier.
When its performance proves insufficient for your workload (number and size of queries), resulting in too-slow responses to user queries (or back-end tasks). https://cloud.google.com/sql/docs/instance-info explains how to view all the info about a given Cloud SQL instance.
1) If you are connecting GAE to Cloud SQL, will the SQL concurrent-connection limit cap the scalability of your GAE app at 250 concurrent requests? I mean, will GAE create a new connection to Cloud SQL on every request?
Actually, your PHP code will do that, e.g. with a call such as
    $sql = new mysqli($host, $user, $password, $dbname);  // opens a fresh connection for this request
if and when it needs a Cloud SQL connection to serve a request. I do not believe there is any way to share a single connection among different servers (and multiple concurrent requests are typically served by different servers). If your code is thread-safe, a single server might be responding to a few requests concurrently, and you could try to share a single connection among threads with locking, though that might hurt latency and would only buy you a small amount of connection reuse anyway.
1bis) Can a heavily requested GAE app use only one SQL connection?
A heavily requested GAE app is no doubt going to be using multiple servers at once, and there is no way separate servers can share one MySQL connection.
2) Could you give some example scenarios where a Dx tier would be recommended?
You'll just want larger instances in proportion to how big/demanding your workload is -- larger databases and indices, big/heavy requests including ones processing or returning lots of data, many concurrent requests, heavy background "data mining" going on at the same time, and so forth.
I would recommend using the calculator at https://cloud.google.com/products/calculator/ -- click on the Cloud SQL icon if that's specifically what you want to explore -- to determine expected monthly costs for an instance.
As for the performance you can expect in return, that is so dependent on your data, indices, workloads, etc. that there's really no shortcut for it: rather, I recommend building a minimal meaningful sample of your app's needs and a stress-load test for it, tuning it first on a local MySQL installation, then deploying experimentally to Cloud SQL in different configurations to measure the effects.
Once you've gone to the trouble of building and calibrating such benchmarks, you may of course also want to try out other competing providers of "mysql in the cloud" services, to know for sure exactly what performance you're getting for your money -- I'm unfortunately not very knowledgeable about what all is available on the market, but my key message is to use your own benchmarks, built to be meaningful for your app, rather than relying on "canned" benchmarks...
I'm wrapping up development on an iPhone game right now that uses data from a PHP/MySQL backend. I'm currently (pre-release) hosting all the data on a non-dedicated web hosting service, but I have no idea how that will scale once the game goes live. I'm a bit worried it will crumble to its knees if the game is moderately popular.
The game doesn't pull in a lot of data. The average user will ping the database 3-4 times a minute just to grab a tiny amount of data (a few text strings). Everything works fine with just a couple of people using it, but I don't understand MySQL well enough to know how it will scale to potentially hundreds of simultaneous connections.
I'm hesitant to move it to a dedicated server because they're damn expensive and I have no idea whether the game will tank out of the gate or whether it even needs a dedicated server.
Any advice? And sorry if anything I've said here is just plain stupid. This isn't my area of expertise.
I would stay away from shared hosting for any real application like this. Dedicated servers are expensive, but you can get reliable and relatively inexpensive service from a virtual private server. I use a VPS from linode.com for all my dev work; the basic plan costs $20 a month, and you can upgrade your plan very quickly (a matter of minutes) if needed.
Load test it first!
You didn't indicate how the data is pulled from the MySQL database to the iPhones, so I am going to assume it's using HTTP requests in some form. This means you can use a load-testing tool, such as Apache's benchmarking tool ab, to generate many concurrent requests against your server-side application and see whether it handles the load.
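For example (the URL below is just a placeholder for whichever endpoint your game actually hits), the following fires 1,000 requests at a concurrency of 50 and reports throughput and latency percentiles:

    ab -n 1000 -c 50 "http://your-host.example.com/api/get_strings.php?user=test"

Crank -c up gradually until response times degrade; that gives you a rough idea of how many simultaneous players your current hosting can absorb.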
If the application is just reading small amounts of data and you have indexed your tables properly you may be fine. But, as others have noted, a VPS is probably your best bet.
What are the usual bottlenecks (and what tends to break first) for LAMP-based sites on EC2 when your number of users increases?
Assuming:
- Decent DB design
- Some RAM- and CPU-intensive processes on cron, but no RAM/CPU-intensive work during normal use
Good question - we replaced the A (Apache) with Nginx, and our PHP is FPM'd now. That allows us to set up more app balancers to handle traffic spikes and the like. We also moved the main database to CouchDB (BigCouch), but generally there is no recipe for avoiding disaster without knowing what your application does.
EC2 bottlenecks
EC2 bottlenecks or issues are easier to generalize and pin down.
Disk I/O
E.g., a very common bottleneck is disk I/O.
Even though EBS is faster than instance storage and is also persistent, it's still slow. There are ways to get more performance out of EBS using RAID setups, but they'll never get you near the speed of local SAS disks.
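For instance (the device names are placeholders and vary by instance type), striping several EBS volumes together with mdadm is the usual way to squeeze out more throughput; note that RAID 0 trades redundancy for speed, so keep snapshots:

    mdadm --create /dev/md0 --level=0 --raid-devices=4 /dev/xvdf /dev/xvdg /dev/xvdh /dev/xvdi
    mkfs.ext4 /dev/md0
    mount /dev/md0 /data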
Network latency
Another bottleneck is internal network latency. You shouldn't rely on anything being instant, and I guess that's the general rule of thumb with cloud computing. It really is eventually consistent, which also requires your app to adjust and behave differently.
Capacity
Last but not least - capacity errors. They happen - e.g., sometimes you can't start another instance in the same zone. I've also had instances reboot themselves or disappear. All these things happen in the cloud and need to be dealt with.
Automate, automate!
The biggest change when moving to EC2 is letting go of actual servers and automating instance bootstrapping. Before, I would go to the DC for half a day to rack new hardware, install servers, and so on.
Being able to start up and terminate application servers, load balancers, etc. is the biggest change and also the greatest advantage of the cloud. It helps you deal with many, many issues easily.
You really need to tell us more about your application. What breaks depends entirely on how it uses resources.
Since you've switched to lighttpd, the web server itself is going to use fewer resources than Apache would, but Apache is rarely the bottleneck unless you've run out of RAM or seriously misconfigured it.
Have you tried actually testing your application using ab? Load it up and see what happens.
Consider a web app in which a call to the app consists of a PHP script running several MySQL queries, some of them memcached.
The PHP does not do a very complex job. It mainly serves the MySQL data with some formatting.
In the past it used to be recommended to put MySQL and the app engine (PHP/Apache) on separate boxes.
However, when the data can be divided horizontally (for example, when there are ten different customers using the service and it is possible to divide the data per customer), and when Nginx + FastCGI is used instead of the heavier Apache, doesn't it make sense to put Nginx, memcached, and MySQL on the same box? Then, when more customers come, add similar boxes?
Background: We are moving to Amazon EC2. A separate box for MySQL and the app server means double the EBS volumes (needed on the app servers to keep the code persistent, as it changes often). Also, if something happens to the database box, more customers will be affected.
Clarification: Currently the app is running with LAMP on a single server (before moving to EC2).
If your application architecture is already designed to support Nginx and MySQL on separate instances, you may want to host all your services on the same instance until you receive enough traffic that justifies the separation.
In general, creating new identical instances with the full stack (Nginx + Your Application + MySQL) will make your setup much more difficult to maintain. Think about taking backups, releasing application updates, patching the database engine, updating the database schema, generating reports on all your clients, etc. If you opt for this method, you would really need to find some big advantages in order to offset all the disadvantages.
You need to measure carefully how much memory overhead everything has - I can't see nginx vs Apache making much difference; it's PHP which will use all the RAM (this in turn depends on how many processes the web server chooses to run, but that's more of a tuning issue).
Personally, I'd stay away from nginx on the grounds that it is too risky to run such a weird server in production.
Databases always need lots of ram, and the only way you can sensibly tune the memory buffers is to have them on dedicated servers. This is assuming you have big data.
If you have very small data, you could keep it on the same box.
Likewise, memcached makes almost no sense if you're not running it on dedicated boxes. Taking memory from MySQL to give to memcached is really robbing Peter to pay Paul. MySQL can cache data in its innodb_buffer_pool quite efficiently (this saves I/O, but may end up using more CPU, as you won't be caching presentation logic etc., which is possible with memcached).
Memcached is only sensible if you're running it on dedicated boxes with lots of ram; it is also only sensible if you don't have enough grunt in your db servers to serve the read-workload of your app. Think about this before deploying it.
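If you do end up with a dedicated database box, the buffer pool is the main knob to turn. A rough my.cnf fragment (the values are purely illustrative; on a dedicated server the buffer pool is commonly sized to around 70-80% of RAM):

    [mysqld]
    innodb_buffer_pool_size = 12G
    innodb_log_file_size    = 512M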
If your application is able to work with PHP and MySQL on different servers (and I don't see why it wouldn't, actually), then it'll also work with PHP and MySQL on the same server.
The real question is: will your servers be able to handle the load of Apache/nginx/PHP, MySQL, and memcached combined?
And there is only one way to answer that question: you have to test in a "real" "production" configuration to determine how loaded your servers are -- or use a tool like ab, siege, or OpenSTA to simulate that load.
If there is not too much load with everything on the same server... well, go with it, if it makes the hosting of your application cheaper ;-)
I recently experienced a flood of traffic on a Facebook app I created (mostly for the sake of education, not with any intention of marketing).
Needless to say, I did not think about scalability when I created the app. I'm now in a position where my meager virtual server hosted by MediaTemple isn't cutting it at all, and it really comes down to the raw I/O of the machine. Since this project has been so educational for me so far, I figured I'd take it as an opportunity to understand the Amazon EC2 platform.
The app itself is created in PHP (using Zend Framework) with a MySQL backend. I use application caching wherever possible with memcached. I've spent the weekend playing around with EC2, spinning up instances, installing the packages I want, and mounting an EBS volume to an instance.
But what's the next logical step that is going to yield good results for scalability? Do I fire up one AMI instance for MySQL and one for the Apache service? Or do I just replicate the instances out as many times as I need them and then do some sort of load balancing on the front end? Ideally, I'd like to have a centralized database because I aggregate statistics across all database rows; however, this is not a hard requirement (there are probably some application-specific solutions I could come up with to work around this).
I know there is probably no straightforward answer to this, so opinions and suggestions are welcome.
So many questions - all of them good though.
In terms of scaling, you've a few options.
The first is to start with a single box. You can scale upwards - with a more powerful box. EC2 has various instance sizes. This involves a server migration each time you want a bigger box.
Easier is to add servers. You can start with a single instance for Apache & MySQL. Then when traffic increases, create a separate instance for MySQL and point your application to this new instance. This creates a nice layer between application and database. It sounds like this is a good starting point based on your traffic.
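That repointing is usually just a one-line configuration change in the application. A minimal PHP sketch (the constant names and hostname are made up; use whatever config mechanism your app already has):

    // config.php -- the host below is a placeholder for the new dedicated MySQL instance
    define('DB_HOST', 'db.internal.example.com');   // was 'localhost' on the single-box setup
    define('DB_USER', 'appuser');
    define('DB_PASS', 'secret');
    define('DB_NAME', 'appdb');

    $db = new mysqli(DB_HOST, DB_USER, DB_PASS, DB_NAME);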
Next you'll probably need more application power (web servers) or more database power (a MySQL cluster, etc.). You can have your DNS records pointing to a couple of front boxes running some load-balancing software (try Pound). These load-balancing servers distribute requests to your web servers. EC2 has Elastic Load Balancing, which is an alternative to managing this yourself and is probably easier - I haven't used it personally.
Something else to be aware of - EC2 instance storage is not persistent. You have to manage persistent data yourself using the Elastic Block Store. This guide is an excellent tutorial on how to do that, with automated backups.
I recommend that you purchase some reserved instances if you decide EC2 is the way forward. You'll save yourself about 50% over 3 years!
Finally, you may be interested in services like RightScale which offer management services at a cost. There are other providers available.
The first step is to separate concerns. I'd split off a separate MySQL server and possibly a dedicated memcached box, depending on how high your load is there. Then I'd monitor memory and CPU usage on each box and see where you can optimize where possible. This can be done by spinning up new Media Temple boxes. I'd also suggest Slicehost as a cheaper, more developer-friendly alternative.
Some more low-budget PHP deployment optimizations:
Using a more efficient web server like nginx to handle static file serving, then reverse-proxying app requests to a separate Apache instance
Implementing PHP with FastCGI on top of nginx using something like PHP-FPM, getting rid of Apache entirely. This may be a great alternative if your Apache needs don't extend far beyond mod_rewrite and simpler Apache modules (a minimal config sketch follows below).
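Here is a rough sketch of that second option; the document root and the PHP-FPM listen address are assumptions, so adjust them to your own setup:

    server {
        listen 80;
        root /var/www/app/public;    # placeholder document root
        index index.php;

        location / {
            # serve static files directly, fall back to the PHP front controller
            try_files $uri $uri/ /index.php?$query_string;
        }

        location ~ \.php$ {
            include fastcgi_params;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
            fastcgi_pass 127.0.0.1:9000;    # PHP-FPM listen address (assumption)
        }
    }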
If you prefer a more high-level, do-it-yourself approach, you may want to check out Scalr (code at Google Code). It's worth watching the video on their web site. It facilitates a scalable hosting environment using Amazon EC2. The technology is open source, so you can download it and implement it yourself on your own management server. (Your Media Temple box, perhaps?) Scalr has pre-built AMIs (EC2 appliances) available for some common use cases.
web: Utilizes nginx and its many capabilities: software load balancing, static file serving, etc. You'd probably only have one of these, and it would probably implement some sort of connection to Amazon's EBS, or persistent storage solution, as mentioned by dcaunt.
app: An application server with Apache and PHP. You'd probably have many of these, and they'd get created automatically if more load needed to be handled. This type of server would hold copies of your ZF app.
db: A database server with MySQL. Again, you'd probably have many of these, and more slave instances would get created automatically if more load needed to be handled.
memcached: A dedicated memcached server you can use to have centralized caching, session management, et cetera across all your app instances.
The Scalr option will probably take some more configuration changes, but if you feel your scaling needs are accelerating quickly, it may be worth the time and effort.