What is the optimal hardware configuration for a heavy-load LAMP application - php

I need to run a Linux-Apache-PHP-MySQL application (the Moodle e-learning platform) for a large number of concurrent users - I am aiming at 5,000 users. By concurrent I mean that 5,000 people should be able to work with the application at the same time. "Work" means not only database reads but writes as well.
The application is not very typical: it does a lot of inserts/updates on the database, so caching techniques do not help much. We are using the InnoDB storage engine. In addition, the application is not written with performance in mind. For instance, one Apache thread usually occupies about 30-50 MB of RAM.
I would be grateful for information on what hardware is needed to build a scalable configuration that can handle this kind of load.
Right now we are using two HP DL380s with two quad-core processors each, which can handle a much lower load (typically 300-500 concurrent users). Is it reasonable to invest in this kind of box and build a cluster out of them, or is it better to go with more high-end hardware?
I am particularly curious about:
how many and how powerful servers are needed (number of processors/cores, size of RAM)
what network equipment should be used (what kind of switches, network cards)
any other hardware, like particular disk storage solutions, etc., that is needed
Another thing is how to put everything together - that is, what the optimal architecture is. Clustering with MySQL is rather hard (people are complaining about MySQL Cluster, even here on Stack Overflow).

Once a couple of physical machines stop giving you the peak load you need, you probably want to start virtualising.
EC2 is probably the most flexible solution at the moment for the LAMP stack. You can set up their VMs as if they were physical machines, cluster them, spin them up as you need more compute-time, switch them off during off-peak times, create machine images so it's easy to system test...
There are various solutions available for load-balancing and automated spin-up.
If you can make your app fit, you can get use out of their non-relational database engine as well. At very high loads, relational databases (and MySQL in particular) don't scale effectively. The peak load of SimpleDB, BigTable and similar non-relational databases can scale almost linearly as you add hardware.
Moving away from a relational database is a huge step though, I can't say I've ever needed to do it myself.

I'm not so sure about hardware, but from a software point-of-view:
With an efficient data layer that will cache objects and collections returned from the database then I'd say a standard master-slave configuration would work fine. Route all writes to a beefy master and all reads to slaves, adding more slaves as required.
Cache data as objects returned from your data-mapper/ORM, not as HTML, and use Memcached as your caching layer. If you update an object, write to the DB and update it in Memcached; it's best to use the IdentityMap pattern for this. You'll probably need quite a few Memcached instances, although you could get away with running these on your web servers.
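The pattern described above can be sketched roughly as follows. This is a minimal illustration, not production code: the UserMapper class, the users table, and the cache keys are all hypothetical names invented for the example.

```php
<?php
// Sketch of read-through caching plus write-through updates with an
// identity map. Assumes the php-memcached extension and PDO.
class UserMapper
{
    private $memcached;
    private $pdo;
    private $identityMap = []; // at most one copy of each row per request

    public function __construct(Memcached $memcached, PDO $pdo)
    {
        $this->memcached = $memcached;
        $this->pdo = $pdo;
    }

    public function find(int $id): ?array
    {
        if (isset($this->identityMap[$id])) {       // 1. identity map
            return $this->identityMap[$id];
        }
        $user = $this->memcached->get("user:$id");  // 2. memcached
        if ($user === false) {                      // 3. fall back to the DB
            $stmt = $this->pdo->prepare('SELECT * FROM users WHERE id = ?');
            $stmt->execute([$id]);
            $user = $stmt->fetch(PDO::FETCH_ASSOC) ?: null;
            if ($user !== null) {
                $this->memcached->set("user:$id", $user, 300);
            }
        }
        return $this->identityMap[$id] = $user;
    }

    public function save(array $user): void
    {
        // Write-through: update the DB and the cache together so reads
        // never see stale data from memcached.
        $stmt = $this->pdo->prepare('UPDATE users SET name = ? WHERE id = ?');
        $stmt->execute([$user['name'], $user['id']]);
        $this->memcached->set("user:{$user['id']}", $user, 300);
        $this->identityMap[$user['id']] = $user;
    }
}
```

The identity map guarantees each request works with a single instance of a given row, which is what makes the write-through update safe within a request.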
We could never get MySQL clustering to work properly.
Be careful with the SQL queries you write and you should be fine.

Piotr, have you tried asking this question on moodle.org yet? There are a couple of similarly scoped installations whose staff members currently answer questions there.
Also, depending on your timeframe for deployment, you might want to check out the Moodle 2.0 line rather than the Moodle 1.9 line; it looks like it contains a bunch of good fixes for some of the issues with Moodle's architecture.
Also: memcached rocks for this. PHP acceleration rocks for this. Server Fault is probably the better Stack Exchange site for this question, though.

Related

Can Laravel 5 handle 1000 concurrent users without lagging badly?

I want to know: if 1,000 users are concurrently using a website built with Laravel 5 and querying the database regularly, how does Laravel 5 perform?
I know it will be slow, but will it be so slow that it is unbearable?
Note that I am also going to use AJAX a lot.
And let's assume I am using the DigitalOcean cloud service with the following configuration:
2GB memory
2 vCPU
40GB SSD
I don't expect exact figures, as that is impossible, but please provide at least some indication of whether I should go with Laravel and expect acceptable performance.
Please also suggest some tools I can use to check the speed of my Laravel 5 application, and to test how it performs under actual load.
It would be great to hear from someone with real experience of using Laravel, especially Laravel 5.
And what about Lumen: does it really make the application faster than Laravel, and by how much?
In short, yes. At least newer versions of Laravel are capable (Laravel 7.*).
That being said, this is really a three part conundrum.
1. Laravel (Php)
High Concurrency Architecture And Laravel Performance Tuning
Honestly, I wouldn't be able to provide half the details as this amazing article provides. He's got everything in there from the definition of concurrency all the way to pre-optimization times vs. after-optimization times.
2. Reading, Writing, & Partitioning Persisted Data (Databases)
MySQL vs. MongoDB
I'd be curious whether the real concern is PHP's Laravel, or more of a database read/write speed bottleneck. Non-relational databases are an incredible technology that benefits big data much more than traditional relational databases.
Non-relational databases (Mongo) have a much faster read speed than MySQL (something like 60% faster, if I'm remembering correctly)
Non-relational databases (Mongo) do have a slower write speed, but this usually is not an inhibitor to the user experience
Unlike relational databases (MySQL), MongoDB can truly be partitioned, spread out across multiple servers.
Mongo DB has collections of documents, collections are pretty synonymous to tables and documents are pretty synonymous to rows.
The difference being, MongoDB has a very JSON like feel to it. (Collections of documents, where each document looks like a JSON object).
The huge difference, and benefit, is that each document - AKA row - does not have to have the same keys. When using MongoDB on a Fortune 500 project, my mentor and lead at the time, Logan, had a phenomenal quote.
"Mongo Don't Care"
This means that you can shape the data how you're wanting to retrieve it, so not only is your read speed faster, you're usually not being slowed by having to retrieve data from multiple tables.
Here's a package, personally tested and loved, to set up MongoDB within Laravel
Jenssegers ~ MongoDB In Laravel
If you are concerned about immense amounts of users and transferring data, MongoDB may be what you're looking for. With that, let's move on to the 3rd, and most important point.
3. Serverless Architecture (Aka Horizontal scaling)
AWS, Google Cloud, Microsoft Azure, etc... I'm sure you've heard of The Cloud.
This, ultimately, is what you're looking for if you're having concurrency issues and want to stay within the realm of Laravel.
It's a whole new world of incredible tools one can hammer away at; they're awesome. It's also a whole new, quite large, world of tools and thought to learn.
First, let's dive into a few serverless concepts.
Infrastructure as Code Terraform
"Use Infrastructure as Code to provision and manage any cloud, infrastructure, or service"
Horizontal Scaling Example via The Cloud
"Create a Laravel application. It's a single application, monolithic. Then you dive into the Cloud. You discover Terraform. Aha: first you use Terraform to define how many instances of your app will run at once. You decide you want 8 instances of your application. Next, you of course define a Load Balancer. The Load Balancer simply balances the traffic load between your 8 application instances. Each application is connected to the same database, ultimately sharing the same data source. You're simply spreading out the traffic across multiple instances of the same application."
We could of course top that very simplified picture of the cloud and dive into lambdas, the what-not-to-dos of serverless, setting up your internal virtual cloud network...
Or...we can thank the Laravel team in advance for simplifying Serverless Architecture
Yep, Laravel Simplified Serverless (Shout out Laravel team)
Serverless Via Laravel Vapor
Laravel Vapor Opening Paragraph
"Laravel Vapor is an auto-scaling, serverless deployment platform for Laravel, powered by AWS Lambda. Manage your Laravel infrastructure on Vapor and fall in love with the scalability and simplicity of serverless."
Coming to a close, let's summarize.
Original Concern
Ability to handle a certain amount of traffic in a set amount of time
Potential bottlenecks, with potential solutions:
Laravel & PHP: High Concurrency Architecture And Laravel Performance Tuning
Database & Persisting/Retrieving Data Efficiently: Jenssegers ~ MongoDB In Laravel
Serverless Architecture For Horizontal Scaling: Serverless Via Laravel Vapor
I'll try to answer this based on my experience as a software developer. To be honest, I would definitely ask for an upgrade whenever it hits 1,000 concurrent users, because I won't risk server failure or data loss.
But let's break down how to engineer this.
It could handle those users if the data fetched is not complex and there are not many operations in the Laravel code. If data just passes through from the database with almost no modification, it'll be fast.
The data fetched by the users is not unique. Say you have a news site without user personalization: the news the users fetch will mostly be the same. If you cache that data in memory (Redis, which I recommend) or at the web server (Nginx, which should be avoided), your Laravel program will run fast enough.
Querying the database directly is faster than using the Laravel ORM. You might consider it if needed, but I myself will always try to use the ORM, because it helps the code stay more readable and secure.
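To illustrate the trade-off, here is a minimal sketch of the two styles. Both calls are part of Laravel's query API; the posts table, its columns, and the App\Post model are hypothetical names for the example (in Laravel 5 models live in the App namespace by default).

```php
<?php
use Illuminate\Support\Facades\DB;
use App\Post;

// Raw query: one round trip, bound parameters, no model hydration.
$posts = DB::select(
    'SELECT id, title FROM posts WHERE published = ? ORDER BY id DESC LIMIT 20',
    [1]
);

// Eloquent ORM: more readable, parameters are still bound safely,
// but every row is hydrated into a model object, which costs extra
// time and memory per row.
$posts = Post::where('published', 1)
    ->orderBy('id', 'desc')
    ->limit(20)
    ->get();
```

For hot paths returning many rows, the raw form usually wins; for everything else, the readability of the ORM tends to be worth the overhead.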
Splitting the database, web server, CDN, and cache server obviously makes it easier to monitor server usage.
Try to upgrade to the latest version. I used to work at a company that was using version 5, and its performance was not really good.
Opcode caching: this caches the compiled PHP code. I myself have never used it.
Split the app into a backend and a frontend. Use state management in the frontend app to reduce data requests to the server.
Now let's answer your questions.
Are there any tools for checking performance? You can check out the Laravel Debugbar, which provides simple performance reports. I also encourage you to write a test for each feature you create; you can generate a report from those unit tests to find which features take the longest to finish.
Is Lumen faster than Laravel? Yes, Lumen is faster because it disables some Laravel features. But please be aware that Taylor seems to have stopped developing Lumen; you should consider this for the future.
If you care about performance, you might not choose Laravel for development, because:
There is a delay each time a connection is opened between servers. Whenever a user makes a request, they open a connection to the server, and the server opens connections to the cache server, database server, SMTP server, and probably other third parties as well. This is the real bottleneck in Laravel. You could reduce the delay with keep-alive connections to the database and cache server, but you won't find them in Laravel, because it disposes of its connections when the request finishes.
Typed, mostly compiled languages are faster than PHP.
In the end, you might not be able to use Laravel with that server resources unless you're creating a really simple application.
A question like this needs an answer with real numbers. Luckily, this guy has already done it under conditions similar to what you want to try, using Laravel Forge.
With this PHP config:
opcache.enable=1
opcache.memory_consumption=512
opcache.interned_strings_buffer=64
opcache.max_accelerated_files=20000
opcache.validate_timestamps=0
opcache.save_comments=1
opcache.fast_shutdown=1
the results:
Without Sessions:
Laravel: 609.03 requests per second
With Sessions:
Laravel: 521.64 requests per second
So, answering your question: with that amount of memory you would struggle to serve 1,000 users making requests; use 4 GB of memory and you will be in better shape.
I don't think you can do that with Laravel.
I tried benchmarking Laravel on an 8-core CPU, 8 GB RAM and a 120 GB HDD, and I only got 200-400 requests per second.

What performance improvements can I make to Symfony + Doctrine?

I've recently built a reasonably complex database application for a university, handling almost 200 tables. Some tables (e.g. Publications) can hold 30 or more fields and store 10 one-to-one FK relations and up to 2 or 3 many-to-many FK relations (via cross-reference tables). I use integer IDs throughout, and normalisation has been key every step of the way. AJAX is minimal and most pages are standard CRUD forms/processes.
I used Symfony 1.4, and Doctrine ORM 1.2, MySQL, PHP.
While the benefits to development time and ease of maintenance have been enormous (in using an MVC and ORM), we've been having problems with speed. That is, when we have more than a few users logged in and active at any one time, the application slows considerably (up to 20 seconds to save or edit a record).
We're currently in discussions with our sysadmin, but they say we should have more than enough power. With 6 or more users engaged in activity, we end up saturating 4 CPUs in a virtual server environment, while memory usage stays low (no leaks).
Of course, we're considering multi-threading our MySQL usage (if this will help), refining our code (though much of it is generated by the MVC), and refining our cache usage (this could be better, though the majority of each screen is login-specific and dynamic). We've installed APC and extra memory, de-fragmented our database, tried unsetting all recordsets (though I understand this is now automatic within the ORM), and instigated manual garbage collection...
But the question I'm asking is whether MySQL, PHP, and the Symfony MVC were actually a poor choice for developing an application of this size in the first place. If so, what do people normally use/recommend for a web-based database interface application of this size/complexity?
I don't have any experience with Symfony or Doctrine. However, there are certainly larger sites than yours that are built on these projects. DailyMotion for instance uses both, and it serves a heck of a lot more than a few simultaneous users.
Likewise, PHP and MySQL are used on sites as large as Wikipedia, so scalability should not be a problem. (BTW, MySQL is multithreaded by default: one thread per connection.)
Of course, reasonable performance is relative. If you're trying to run a site serving 200 concurrent requests off a beige box sitting in a closet, then you may need to optimize your code a lot more than a Fortune 500 company with their own data centers.
The obvious thing to do is actually profile your application and see where the bottleneck is. Profiling MySQL queries is pretty simple, and there are also assorted PHP profiling tools like Xdebug. From there you can figure out whether you should switch frameworks, ORMs (or ditch ORM altogether), databases, or refactor some code, or just invest in more processing power.
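On the MySQL side, the profiling mentioned above usually starts with the slow query log and EXPLAIN. A sketch (the publications tables here are hypothetical stand-ins for the application's own schema):

```sql
-- Log every query slower than one second (server-wide setting)
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;

-- Then inspect a suspect query's execution plan. A missing index
-- typically shows up as type = ALL (a full table scan) in the output.
EXPLAIN
SELECT p.*
FROM publications p
JOIN publication_author pa ON pa.publication_id = p.id
WHERE pa.author_id = 42;
```

Fixing the handful of queries the slow log surfaces is often enough to cure a "20 seconds to save a record" symptom without changing frameworks at all.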
The best way to speed up complex database operations is not calling them :).
So analyze which parts of your application you can cache.
Symfony has a pretty good caching system (even better in Symfony2) that can be used at a fairly granular level.
On the database side, another option is to use views or nested sets to store aggregated data.
Try to find parts where this is appropriate.

Database topology design confusion

Background
I run (read: inherited) a network that is setup very similar to a shared hosting provider. There are between 300-400 sites running on the infrastructure. Over the years the database topology has become extremely fragmented, in that it's a 1 to 1 relationship from webserver->database.
Problems
The applications are 9 times out of 10 designed by third party design firms that have implemented wordpress/joomla/drupal etc.
The databases are sort of haphazardly spread across 6 database servers. They are not replicated anywhere.
The applications have no concept of separate database handles to separate INSERT to a master and a SELECT to a slave.
Using single-master built-in MySQL replication creates a huge bottleneck. The volume of inserts will bring down the master DB very quickly.
Question
My question becomes, how can I make my database topology as flat as possible while leaving room for future scalability?
In the future I'd like to add more geographic locations to my network that can replicate the same databases across a 'backnet'.
In the past I've looked into multi-master replication but saw a lot of issues with things like auto_increment column collisions.
I'm open to enterprise solutions. Something similar to the Shareplex product for Oracle replication.
Whatever the solution is, it's not reasonable to expect the applications to change to accommodate this new design. So things like auto_increment columns need to remain the same and gel across the entire cluster.
Goal
My goal is to have an internally load-balanced hostname for each cluster that I can point all the applications at.
This would also give me fault tolerance that I don't presently have. At present, removing a database from rotation is not possible.
Applications like Cassandra and Hadoop look amazingly similar to what I want to achieve but NoSQL isn't an option for these applications.
Any tips/pointers/tutorials/documentation/product recommendations are greatly appreciated. Thank you.
In the past I've looked into multi-master replication but saw a lot of issues with things like auto_increment column collisions.
We use multi-master in production at work. The auto-increment conundrum was fixed a while ago with auto_increment_increment and auto_increment_offset, which allow each server to have its own pattern of increment IDs. As long as the application isn't designed blindly assuming that all the IDs will be sequential, it should work fine.
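In configuration terms, the interleaving scheme looks like this for a two-master setup (the offsets are the standard pattern; adjust the increment to the number of masters):

```ini
# master 1 (my.cnf, [mysqld] section)
auto_increment_increment = 2   # step between generated IDs
auto_increment_offset    = 1   # this server generates 1, 3, 5, ...

# master 2 (my.cnf, [mysqld] section)
auto_increment_increment = 2
auto_increment_offset    = 2   # this server generates 2, 4, 6, ...
```

Because the two ID sequences never overlap, inserts on either master cannot collide, and the applications keep using plain auto_increment columns unchanged.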
The real problem with multi-master is that MySQL still occasionally corrupts the binary log. This is mainly a problem over unreliable connections, so it won't be a problem if all the instances are local.
Another problem with multi-master is that it simply doesn't scale with writes, as you've already experienced or assumed, given a point in your question. All of the writes on one master have to be replicated by the others. Even if you spread out the read load correctly, you will eventually hit an I/O bottleneck that can only be resolved by more hardware, an application redesign, or sharding (read: application redesign). It's slightly better now that MySQL has row-based replication available.
If you need geographic diversity, multi-master could work.
Also look into DRBD, a disk-block-level replication system that's now built into modern Linux kernels. It's been used by others to replicate MySQL and PostgreSQL, though I don't have any personal experience with it.
I can't tell from your question whether you're looking for high availability, or simply for (manual or automatic) fail-over. If you just need fail-over and can tolerate a bit of downtime, then traditional master/slave replication might be enough for you. The trouble is turning a slave that became a master back into a slave.

How to optimize this website. Need general advice

This is my first question here, and it's about optimizing a specific website.
A few months ago, we launched [site] for one of our clients; it is a kind of community website.
Everything works great, but now the website is getting bigger and shows some slowness when pages are loading.
The server specs:
PHP 5.2.1 (I think we need to upgrade to 5.3 to make use of the new garbage collector)
Apache 2.2
Quad-core Xeon processor @ 2.8 GHz and 4 GB DDR3 RAM
XCACHE 1.3 (we added this a few months ago)
MySQL 5.1 (we are using InnoDB as the engine)
Codeigniter framework
Here is what we have done so far, and what we intend to do next:
Besides XCache, we don't really use a caching mechanism, because most of the content is live; also, we didn't want to optimize prematurely, because we didn't know what to expect in terms of traffic.
On the other hand, we have installed memcached and we want to implement a cache system based on memcached.
Regarding the database structure, we have reached 3NF with most of our tables. Yes, we have some slow queries (which we plan to optimize), but I think that's because the tables producing them are the ones for blog comments (~44,408 rows), user log tracking (~725,837 rows), user comments (~698,964 rows), etc., which are quite big. The entire database is 697.4 MB in size for now.
Also, here are some stats for January 2011:
Monthly unique visitors: 127,124
Monthly unique views: 4,829,252
Monthly unique visits: 242,708
Daily averages:
Unique new visitors: 7,533
Unique new views: 179,680
Just let me know if you need more details.
Any advice is highly appreciated.
Thank you.
When it comes to performance issues, there is no golden rule or labelled sticky note saying that the problem is related to the database. What I can suggest is performance profiling; there are many free and paid tools on the Internet that allow you to do so.
First, start with the web server layer and make sure everything is configured correctly and optimized as far as possible.
Then move on to the next layer (which I assume is your database). From a layman's perspective, whenever someone mentions InnoDB MySQL, we assume indexes have been created to optimize search operations. Index usage is quite important, because you don't want to index the wrong thing and make matters worse. My advice is to get a DBA-equivalent person to troubleshoot this in a staging environment.
Another thing you could possibly look at is the content, from web page content to database data: make sure you show/keep data only where it is needed, don't store unnecessary information in the database, and use a smart layout on the web page. Cutting a second or two can make a big difference in terms of usability and response time.
It is very hard to go into detail here without in-depth information about your application, its architecture and your environment, but the above are some common directions people take when troubleshooting such incidents.
Good luck!
This site has excellent resources http://www.websiteoptimization.com/
The books mentioned there are excellent. There are just too many techniques to list here, and we do not know what you have tried so far.
Sorry for the delay, guys; I have been busy tracking down the issue, and I found it.
Well, the problem was mostly Apache's fault: I had an access log of almost 300 GB, which was parsed at midnight to generate Webalizer stats. While this was happening, the website was very, very slow. I disabled Webalizer for the domain, cleared the logs, and lo and behold, it is very fast again, no matter what hour you access it.
I now have just a few slow queries left, which I intend to fix today.
I also updated to CI 2.0 Reactor as suggested and started using the memcached driver.
Who would have known that Apache logs could be so problematic...
Based on the stats, I don't think you are hitting load problems... on a hunch, I would look to the database first. Database partitioning might be a good place to start.
But you should really do some profiling of your application first. How much time is spent in the application versus database. Are there application methods that are using lots of time and just need some tweaking? Are database queries not written efficiently? Do you need more or better database indices?
Everything looks pretty good. If upgrading CodeIgniter is an option, the new CodeIgniter 2.0 (Reactor) adds support for memcache (a new Cache driver with file system, APC and memcache support). Granted, you're already using XCache, but these new additions may be worth looking at.
When cached objects weren't enough for our multi-domain platform, which saw huge traffic, we went the route of throwing more hardware at it: RAM, servers, databases. Then we moved to database clustering to handle forecasted heavy load on single accounts. And now we're switching from Apache to nginx... It's a never-ending battle, but what worked for us was being smart about what we cached, increasing server memory, and then distributing the load across servers...
Cache as many database calls as you can. In my CI application I have a settings table that rarely changes, so I cache all calls made to it as I am constantly querying the settings table.
Cache your views and even your controllers as well. I tend to cache basically as much as I can in my CI applications and then refresh the cache when a file changes.
Only autoload important libraries, models and helpers. I've seen people autoload up to 10 libraries, and on top of that a few helpers and a model. You only really need to autoload the database and session libraries if you are using them.
Regarding point number 3: are you autoloading many things in your config/autoload.php file, by any chance? It might speed things up to load only what you need in your controllers, as you need it, with the exception of course of the session and database libraries.
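A trimmed autoload config along those lines might look like the sketch below; everything else gets loaded on demand with $this->load->library(...) inside the controllers that actually need it (the url helper here is just an example entry, not a requirement).

```php
<?php
// application/config/autoload.php (CodeIgniter) - keep this list minimal
$autoload['libraries'] = array('database', 'session');
$autoload['helper']    = array('url');   // example: only if used on most pages
$autoload['model']     = array();        // load models per controller instead
```

Every entry here is constructed on every single request, so each item you remove from these arrays is work saved on every page load.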

Developing a Scalable Website using Php

I am going to develop a social + professional networking website using PHP (the Zend or Yii framework). We are targeting over 5,000 requests per minute. I have experience developing advanced websites using MVC frameworks.
But this is the first time I am going to develop something with scalability in mind. So I would really appreciate it if someone could tell me about the technologies I should be looking at.
I have read about memcache and APC. Which one should I look at? Also, should I use a single MySQL server or a master/slave combination (and if the latter, then why and how)?
Thanks !
You'll probably want to architect your site to use, at minimum, a master/slave replication system. You don't necessarily need to set up replicating MySQL boxes to begin with, but you want to design your application so that database reads use a different connection than writes (even if, in the beginning, both connections connect to the same DB server).
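One way to bake in that separation from day one is a small wrapper that hands out distinct read and write connections, even while both DSNs point at the same server. This is a sketch; the host names, database name, and credentials are placeholders.

```php
<?php
// Sketch: route reads and writes through separate handles so a slave
// can be swapped in later without touching application code.
class DbRouter
{
    private $read;
    private $write;

    public function __construct(string $readDsn, string $writeDsn,
                                string $user, string $pass)
    {
        // Initially both DSNs can point at the same MySQL server.
        $this->read  = new PDO($readDsn, $user, $pass);
        $this->write = new PDO($writeDsn, $user, $pass);
    }

    public function read(): PDO  { return $this->read; }   // SELECTs
    public function write(): PDO { return $this->write; }  // INSERT/UPDATE/DELETE
}

$db = new DbRouter(
    'mysql:host=slave.example.internal;dbname=app',   // later: a real slave
    'mysql:host=master.example.internal;dbname=app',
    'app', 'secret'
);
$rows = $db->read()->query('SELECT id, name FROM users')->fetchAll(PDO::FETCH_ASSOC);
$db->write()->prepare('INSERT INTO users (name) VALUES (?)')->execute(['anna']);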
You'll also want to think very carefully about what your caching strategy is going to be. I'd be looking at memcache, though with Zend_Cache you could use a file-based cache early on, and swap in memcache if/when you need it. In addition to record caching, you also want to think about (partial) page-level caching, and what kind of strategies you want to plan/implement there.
You'll also want to plan carefully how you'll handle the storage and retrieval of user-generated media. You'll want to be able to easily move that stuff off the main server onto a dedicated box to serve static content, or some kind of CDN (content distribution network).
Also, think about how you're going to handle session management, and make sure you don't do anything that will prevent you from using a non-file-based session storage ((dedicated) database, or memcache) in the future.
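For example, with the php-memcached extension installed, moving sessions off the local filesystem later becomes a two-line configuration change, provided the application never touches session files directly (the server address is a placeholder):

```ini
; php.ini: store sessions in memcached instead of local files,
; so any web server in the pool can serve any user's session
session.save_handler = memcached
session.save_path    = "memcache1.example.internal:11211"
```

This is exactly the kind of decision that costs nothing up front but saves a painful migration once a second web server appears behind the load balancer.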
If you think carefully, and abstract data storage/retrieval, you'll be heading in a good direction.
Memcached is a distributed caching system, whereas APC is non-distributed and mainly an opcode cache.
If (and only if) your website has to live on different webservers (loadbalancing), you have to use memcache for distributed caching. If not, just stick to APC and its cache.
Regarding the MySQL database, I would advise grid hosting, which can autoscale according to requirements.
Depending on the requirements of your site, it's more likely the database will be your bottleneck.
MVC frameworks tend to sacrifice performance for ease of coding, especially in the case of ORM. Don't rely on the ORM; instead, benchmark different ways of querying the database and see which suits. You want to minimise the number of database queries: fetch a chunk of data at once instead of doing multiple small queries.
If you find that your PHP code is a bottleneck (profile it before optimizing), you might find Facebook's HipHop useful.
