I am working on a website that needs to serve multiple requests from the same table simultaneously. We made a simple index page in CakePHP which draws some data from the database (10 rows, to be precise), and a colleague executed a test simulating 1000 users viewing the same page at the same time, meaning that 1000 identical requests would be issued to the database. The thing is that at around 500 requests, the database stopped being responsive, everything just froze and we had to kill the processes.
What comes to mind is that each and every request is executed on its own connection, and this would explain why the MySQL server was overwhelmed. From a few searches online, and on SO, I can see that PHP does not support connection pooling natively, as can be done in a Java application, for instance. Having based our app on CakePHP 2.5.3, however, I would like to think that there is some underlying mechanism that overcomes these limitations. Perhaps I am not doing something right?
Any suggestion is welcome, I just want to make sure to exhaust every possible solution.
If the results are going to be the same for each query, you can cache the query result so that repeated requests do not hit the database.
Try this plugin:
https://github.com/ndejong/CakephpAutocachePlugin
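If you would rather not add a plugin, a minimal sketch of the same idea using CakePHP 2.5's core Cache class might look like this (the cache key, model name, and cache config are made up for illustration):
App::uses('Cache', 'Cache');

// Return cached rows if present; on a cache miss, run the find and store the result.
$rows = Cache::remember('index_rows', function () {
    $model = ClassRegistry::init('Article'); // hypothetical model name
    return $model->find('all', array('limit' => 10));
}, 'default');
With something like this, 1000 identical page views become one database query plus 999 cache reads.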
I want to know something really important for my website. Imagine I have thousands of users accessing the site; at that moment there could be thousands of queries running at the same time. Wouldn't that fill up the DB connection pool? To overcome this, how can I run the MySQL queries in a queue?
Also, I am using InnoDB as the MySQL storage engine. Any other suggestion is also appreciated.
The problem does not exist.
Properly written queries can come and go in the blink of an eye.
You won't have "thousands" of users actually executing queries at the same time. The real metric is "pages per minute".
If you were to "queue" the queries, latency would be bad, and the users would move to some competing site.
If you do have performance problems, it is almost always because of a lack of indexes, poorly phrased queries, and/or improper schema design.
Another, more appropriate, approach for scaling is to use Replication to allow for multiple read-only Slaves, and have multiple clients (on other servers) hitting those Slaves through a load-balancer. This is what Yahoo, for example, does to handle its traffic. And it needs only a handful of servers for each property (news, weather, sports, finance, etc.). Are you at that scale yet?
The RDBMS handles query queuing internally. You can use caching systems to improve response time, and NoSQL databases to scale your solution. On the other hand, if your intent is to throttle visitors, you have to handle that on the application side.
I suggest you monitor a couple of status variables in MySQL:
SHOW GLOBAL STATUS LIKE 'Threads_%';
Threads_connected is the number of clients that are currently connected and occupying a thread in the MySQL server. Typically this is one per concurrent PHP request (if you're using PHP).
Threads_running is the number of clients that are connected and actually executing a query at that moment.
You might have hundreds of threads connected, but only a handful of threads running at any given time. This is because a PHP script runs code in between the SQL queries. During this in-between time, the MySQL connection is still open, but not doing anything in the MySQL server. This is normal.
You might have thousands of users, but since human beings typically view a web page for at least a few seconds before they click on something to request the next web page, there is some time in between page views where neither your web server nor your database server needs to do anything for a given user.
So it's typically true that the MySQL server can handle the traffic even when you have thousands of users, because only a tiny fraction of them are running a query simultaneously.
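If you want to watch those two counters from PHP rather than the mysql client, a quick sketch with PDO could look like this (host and credentials are placeholders):
$pdo = new PDO('mysql:host=localhost', 'user', 'pass'); // placeholder credentials

// SHOW GLOBAL STATUS returns Variable_name / Value rows.
foreach ($pdo->query("SHOW GLOBAL STATUS LIKE 'Threads_%'") as $row) {
    echo $row['Variable_name'] . ': ' . $row['Value'] . PHP_EOL;
}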
I've just set up my first remote connection with FileMaker Server using the PHP API and something a bit strange is happening.
The first connection and response takes around 5 seconds; if I hit reload immediately afterwards, I get a response within 0.5 seconds.
I can get a quick response for around 60 seconds or so (I haven't timed it yet, but it seems to be at least a minute and less than 5 minutes), and then it goes back to taking 5 seconds to get a response (after that it's quick again).
Is there any way of ensuring that it's always a quick response?
I can't give you an exact answer on where the speed difference may be coming from, but I'd agree with NATH's notion on caching. It's likely due to how FileMaker Server handles caching the results on the server side and when it clears that cache out.
In addition to that, a couple of things that are helpful to know when using custom web publishing with FileMaker when it comes to speed:
The fields on your layout will determine how much data is pulled
When you perform a find in the PHP API on a specific layout, e.g.:
$request = $fm->newFindCommand('myLayout');
$request->addFindCriterion('name', $myname);
$result = $request->execute();
What's being returned is data from all of the fields available on the myLayout layout.
In SQL terms, the above query is equivalent to:
SELECT * FROM myLayout WHERE `name` = ?; -- the $myname variable is bound to ?
The FileMaker find will return every field/column available. You designate the returned columns by placing the fields you want on the layout. To get a true * select all from your table, you would include every field from the table on your layout.
All of that said, you can speed up your requests by only including fields on the layout that you want returned in the queries. If you only need data from 3 fields returned to your php to get the job done, only include those 3 fields on the layout the requests use.
Once you have the records, hold on to them so you can edit them
Taking the example from above, if you know you need to make changes to those records somewhere down the line in your php, store the records in a variable and use the setField and commit methods to edit them. e.g.:
$request = $fm->newFindCommand('myLayout');
$request->addFindCriterion('name', $myname);
$result = $request->execute();
$records = $result->getRecords();
...
// say we want to update a flag on each of the records down the line in our php code
foreach ($records as $record) {
    $record->setField('active', true);
    $record->commit();
}
Since you have the records already, you can act on them and commit them when needed.
I say this as opposed to grabbing them once for one purpose and then grabbing them again from the database later to make updates to the records.
It's not really an answer to your original question, but since FileMaker's API is a bit different from others and it doesn't have the greatest documentation, I thought I'd mention it.
There are some delays that you can remove.
Ensure that the layouts you are accessing via PHP are very simple, no unnecessary or slow calculations, few layout objects etc. When the PHP engine first accesses that layout it needs to load it up.
Also check for layout and file script triggers that may be run; IIRC, the OnFirstWindowOpen script trigger is called when a connection is made.
I don't think that it's related to caching. Also, it's the same when accessing via XML. Haven't tested ODBC, but am assuming that it is an issue with this too.
Once the connection is established between FileMaker Server and your machine, FileMaker Server keeps this connection alive for about 3 minutes. You can see the connection in the client list in the FM Server Admin Console. The initial connection takes a few seconds to set up (depending on how many others are connected), and then ANY further queries are lightning fast. If you run your app again, it'll reuse that connection and give results in very little time.
You can do completely different queries (on different tables) in a different application, but as long as you execute the second one on the same machine and use the same credentials, FileMaker Server will reuse the existing connection and provide results instantly. This means that it is not due to caching, but it's just the time that it takes FMServer to initially establish a connection.
In our case, we're using a web server to make FileMaker PHP API calls. We have set up a cron every 2 minutes to keep that connection alive, which has pretty much eliminated all delays.
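For reference, the keep-alive itself can be as simple as a trivial find run on a schedule. A sketch of such a script, where the database, host, credentials, and layout names are all placeholders:
// keepalive.php - run from cron, e.g.: */2 * * * * php /path/to/keepalive.php
require_once 'FileMaker.php';
$fm = new FileMaker('MyDatabase', 'fms.example.com', 'webuser', 'secret');

// Any lightweight request keeps the connection alive in the FMS client list.
$result = $fm->newFindAnyCommand('myLayout')->execute();
if (FileMaker::isError($result)) {
    error_log('keepalive failed: ' . $result->getMessage());
}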
This is probably way too late to answer this, but I'm posting here in case anyone else sees it.
I've seen this happen when using external authentication with FileMaker Server. The first query establishes a connection to Active Directory, which takes some time, and then subsequent queries are fast as FMS has got the authentication figured out. If you can, use local authentication in your FileMaker file for your PHP access and make sure it sits above any external authentication in your accounts list. FileMaker runs through the auth list from top to bottom, so this will make sure that FMS successfully authenticates your web query before it gets to attempt an external authentication request, making the authentication process very fast.
If I have a single page whose contents are generated from a MySQL database (a simple query to display the contents of a cell, containing HTML), how many users can hit that page at once without crashing the database?
I have to display the same data across multiple domains, so instead of duplicating the page on three domains, I'm going to load it into MySQL and use this route to display it. However, I want to find out how many concurrent connections I could handle without crashing the database.
Can anyone point me in the right direction for finding this out? I'm assuming this small query shouldn't be a huge load, but if 10,000 visitors hit it at once, then what?
You need to check your setting for max_connections; you can find more information about this at the following link: http://www.electrictoolbox.com/update-max-connections-mysql/
Are you using any framework? If so, most have caching built in, which should solve the issue, assuming that the information isn't being updated on a moment-to-moment basis. Even if it is, try to cache whatever parts of the page you assume are not going to change that often.
how many users can hit that page at once without crashing the database?
All of them.
It's a bad question.
The database should not crash because users are trying to connect. There are lots of limits on how many PHP clients can open concurrent connections to the database. Really you should have the DBMS correctly configured so it handles the situation gracefully - i.e. max_connections should be what limits the number of clients, not the amount of memory available to the DBMS.
If you mean how many concurrent connections can you support without connections being rejected / queued, that's something VERY different. And it's nearly a sensible question.
In that case, it's going to depend on the amount of memory you've got, how you've configured the DBMS, how fast the CPU is, what size the content is, how many content items there are, how PHP connects to the DBMS, how far away the PHP is from the DBMS....
You're more likely to run into connection limits on the webserver before you hit them on the DBMS - but the method for establishing what those limits are is the same for both: generate a controlled number of connections and measure how much memory it uses; repeat for different numbers to get lots of data. Draw a graph and see where the line crosses your resource limits.
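A crude sketch of that measurement from the PHP side, holding a fixed number of idle connections open while you take memory readings on the server (DSN and credentials are placeholders):
$n = isset($argv[1]) ? (int)$argv[1] : 50; // number of connections to hold open
$conns = array();
for ($i = 0; $i < $n; $i++) {
    $conns[] = new PDO('mysql:host=dbhost;dbname=test', 'user', 'pass');
}
echo "$n connections open; take your memory readings now\n";
sleep(60); // hold the connections while you measure
Run it several times with different values of $n and plot memory used against connection count.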
so instead of duplicating the page on three domains, I'm going to load it into MySQL
The number of domains is pretty much irrelevant - unless you're explicitly looking for a method of avoiding ESI.
but if 10,000 visitors
Really? On a typical site that would mean perhaps a million concurrent HTTP sessions.
I have a working live search system that on the whole works very well. However it often runs into the problem that many versions of the search query on the server are running simultaneously, if users are typing faster than the results can be returned.
I am aborting the ajax request on receipt of a new one, but that of course does not affect the query already in progress on the server, and you end up with a severe bottleneck and a long wait to get your final results. I am using MySQL with MyISAM tables for this, and there does not seem to be any advantage in converting to InnoDB, as the result sets will be the same rows.
I tried using a session variable to make php wait if this session already has a query in progress but that seems to stop it working altogether.
The problem is solved if I make the ajax requests synchronous, but that would rather defeat the object here.
I was wondering if anyone had any suggestions as to how to make this work properly.
Best regards
John
Before doing anything more complicated, have you considered not sending the request until the user has stopped typing for at least a certain time interval (say, 1 second)? That should dramatically cut the number of requests being made with little effort on your part.
Right now the setup for my JavaScript chat works like this:
function getNewMessages()
{
    // code would go here to fetch new messages via a jQuery GET
    setTimeout(getNewMessages, 1000); // schedule the next poll
}
getNewMessages();
And within the function I would use jQuery to make a GET request to a PHP script, which would:
1. Start SQL connection
2. Validate that it's a legit user through SQL
3. Retrieve only new messages since the user's last visit
4. Close the SQL connection
This works great and the chat works perfectly. My concern is that this is opening and closing a LOT of SQL connections. It's quite fast, but I'd like to make a small JavaScript multiplayer game now, and transferring user coordinates as well as tens of other variables three times a second, opening and closing the SQL connection each time and pulling information from numerous tables each time, might not be efficient enough to run smoothly and might be too much strain on the server, too.
Is there any better more efficient way of communicating all these variables that I should know about which isn't so hard on my server/database?
Don't use persistent connections unless it's the only solution available to you!
When MySQL detects the connection has been dropped, any temporary tables are dropped, any active transaction is rolled back, and any locked tables are unlocked. Persistent connections only drop when the Apache child exits, not when your script ends, even if the script crashes! You could inherit a connection in the middle of a transaction. Worse, other requests could block, waiting for those tables to unlock, which may take quite a long time.
Unless you have measured how long it takes to connect and identified it as a very large percentage of your script's run time, you should not consider using persistent connections. In fact, that should be what you do here, if you're worried about performance. Check out xhprof or xdebug, profile your code, then start optimizing.
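As a first-pass measurement before reaching for a full profiler, something like this sketch times the connect against the whole request (DSN and credentials are placeholders):
$t0 = microtime(true);
$pdo = new PDO('mysql:host=localhost;dbname=chat', 'user', 'pass');
$connected = microtime(true);

// ... the real work of the script goes here ...

$done = microtime(true);
error_log(sprintf('connect: %.1f ms of %.1f ms total',
    ($connected - $t0) * 1000, ($done - $t0) * 1000));
If the connect is a few milliseconds out of a much larger total, persistent connections will buy you nothing.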
Maybe try to use a different approach to get the new messages from the server: Comet.
Using this technique you do not have to open as many new connections.
http://www.php.net/manual/en/features.persistent-connections.php
and
http://www.php.net/manual/en/function.mysql-pconnect.php
A couple of dozen players at the same time won't hurt the database or cause noticeable lag if you have efficient SQL statements. Likely your database will be hosted on the same server or at least the same network as your game or site, so no worries. If your DB happens to be hosted on a separate server running an 8-bit 16MHz board loaded with MS-DOS, located in the remote Amazon, connected by radio waves hooked up to a crank-powered generator operated by a drunk monkey, you're on your own with this one.
Otherwise, really you should be more worried about exactly how much data you're passing back and forth to your players. If you're passing back and forth coordinates for all objects in an entire world, page load could take a painfully long time, even though the DB query takes a fraction of a second. This is sometimes overcome in games by a "fog of war" feature which doesn't bother notifying the user of every single object in the entire map, only those which are in immediate range of the player. This can easily be done with a single SQL query where object coordinates are in proximity to a player. Though, if you have a stingy host, they will care about the number of connects and queries.
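A sketch of what such a proximity query might look like with PDO, where the table, columns, and radius are invented for illustration:
$r = 50; // visibility radius in world units (arbitrary for the sketch)
$stmt = $pdo->prepare(
    'SELECT id, x, y FROM game_objects
     WHERE x BETWEEN ? AND ? AND y BETWEEN ? AND ?'
);
$stmt->execute(array($playerX - $r, $playerX + $r, $playerY - $r, $playerY + $r));
$nearby = $stmt->fetchAll(PDO::FETCH_ASSOC); // only objects near this player
With indexes on x and y, this stays cheap even as the world fills up with objects.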
If you're concerned about attracting even more players than that, consider exploring cache methods like pre-building short files storing commonly fetched records or values using fopen(), fgets(), fclose(), etc. Or, use PHP extensions like APC to store values in memory which persist from page load to page load. memcache or memcached also act similarly, but in a way which acts like a separate server you can connect to, store values which can be shared with other page hits, and query.
To update cached pages or values when you think they might become stale, you can run a cron job every so often to update these files or values. If your host doesn't allow cron jobs, consider making your guests do that legwork: a line of script on a certain page can refresh the cache with new values from a database query after a certain number of page hits. Or cache a date value to check against on every page hit, and if enough time has passed, refresh the cache.
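The refresh-when-stale variant might look something like this with a plain file cache (the path, query, and lifetime are placeholders):
$cacheFile = '/tmp/page.cache'; // placeholder path
$maxAge = 300; // seconds before the cached copy counts as stale

if (!file_exists($cacheFile) || time() - filemtime($cacheFile) > $maxAge) {
    // Stale or missing: rebuild from the database and rewrite the cache.
    $rows = $pdo->query('SELECT name, body FROM content')->fetchAll(PDO::FETCH_ASSOC);
    file_put_contents($cacheFile, serialize($rows), LOCK_EX);
} else {
    $rows = unserialize(file_get_contents($cacheFile));
}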
Again, unless you're under the oppressive thumb of a stingy host, or unless you're getting a hundred or more page hits at a time, no need to even be concerned about your database. Databases are not that fragile. If they crashed in a hysterical fit of tears anytime more than one query came their way, the engineers who made it wouldn't have a job for very long.
I know this is quite an annoying "answer", but perhaps you should be thinking about this a different way; after all, this is really not the strongest use of a relational database. Have you considered an XMPP solution? IMO this would be the best tool for the job, and both ejabberd and openfire are trivial to set up these days. The excellent Strophe library can make the front-end story easy, and as an added bonus you get HTTP binding (like Comet) so you won't need to poll the server, your latency will go down, and you'll be generating less HTTP traffic.
I know it's highly unlikely you're going to change your whole approach just cos I said so, but wanted to provide an alternative perspective.
http://www.ejabberd.im/
http://code.stanziq.com/strophe/