We're currently using Code Igniter's file cache to optimize a heavy DB query which is called often in our multi-user environment, but doesn't change often. The cache is set to expire every 60 seconds, so that would suggest the query would be called about 60 times an hour. In the DB monitor, we're seeing the query invoked 340 ± 150 times per hour, or almost six times as frequently as expected.
The code is nothing special, all in one function:
// Cache identifier
$cache = 'cache-identifier';
// Try to pull from the cache first
if ($working = $this->cache->file->get($cache))
return $working;
// Cache didn't work out, query directly
$sql = '<<heavy-DB-query>>';
$working = $this->db->query($sql)->result_array();
// Save the cache (note essentially redundant check: $working is 99.95% non-null)
if ($working)
$this->cache->file->save($cache, $working, 60);
return $working;
We do see a significant reduction compared to non-caching but would like to optimize more. Any ideas why would it be missing so often?
Some more details:
Prior to caching, we were getting the DB query invoked about 3,000 calls/hour (≈ 50 calls/minute). So 340 calls/hour (≈ 5½ calls/minute) is a significant improvement. We tried outputting to a log file whenever there's a miss, but the act of doing that seems to change the dynamics and the statistics aren't the same.
Testing in isolation, single user, from browser to server and back, a miss takes about 1.0 s, and a hit about 0.6 s. The query itself is about 0.1 s, but with overhead, server to DB and back is about 0.5 s.
I think what might be going on is that side effects from other activities are causing the users to synchronize, so that we get the queries bunched up and not random. The first one has a miss, makes the query, same with second, third and so on, until the first gets the results back. Then the remainder for that minute get hits. Given the times involved, seeing 5½ calls a minute would fit this hypothesis more-or-less. But how to test?
Even more:
Turns out the server access logs had the answer. They were indeed showing groups of multiple queries with the same time stamp. Turns out the users often have multiple windows open in the same browser, and in this scenario, the queries do tend to synchronize. I've added a random element to the timer to break up this synchronization. So far, seems to be better. Now that I know what's going on, I'll be happy with ≈2 calls/minute!
Related
Short:
Is there a way to get the amount of queries that were executed within a certain timespan (via PHP) in an efficient way?
Full:
I'm currently running an API for a frontend web application that will be used by a great amount of users.
I use my own custom framework that uses models to do all the data magic and they execute mostly INSERTs and SELECTs. One function of a model can execute 5 to 10 queries on a request and another function can maybe execute 50 or more per request.
Currently, I don't have a way to check if I'm "killing" my server by executing (for example) 500 queries every second.
I also don't want to have surprises when the amount of users increases to 200, 500, 1000, .. within the first week and maybe 10.000 by the end of the month.
I want to pull some sort of statistics, per hour, so that I have an idea about an average and that I can maybe work on performance and efficiency before everything fails. Merge some queries into one "bigger" one or stuff like that.
Posts I've read suggested to just keep a counter within my code, but that would require more queries, just to have a number. The preferred way would be to add a selector within my hourly statistics script that returns me the amount of queries that have been executed for the x-amount of processed requests.
To conclude.
Are there any other options to keep track of this amount?
Extra. Should I be worried and concerned about the amount of queries? They are all small ones, just for fast execution without bottlenecks or heavy calculations and I'm currently quite impressed by how blazingly fast everything is running!
Extra extra. It's on our own VPS server, so I have full access and I'm not limited to "basic" functions or commands or anything like that.
Short Answer: Use the slowlog.
Full Answer:
At the start and end of the time period, perform
SELECT VARIABLE_VALUE AS Questions
FROM information_schema.GLOBAL_STATUS
WHERE VARIABLE_NAME = 'Questions';
Then take the difference.
If the timing is not precise, also get ... WHERE VARIABLE_NAME = 'Uptime' in order to get the time (to the second)
But the problem... 500 very fast queries may not be as problematic as 5 very slow and complex queries. I suggest that elapsed time might be a better metric for deciding whether to kill someone.
And... Killing the process may lead to a puzzling situation wherein the naughty statement remains in "Killing" State for a long time. (See SHOW PROCESSLIST.) The reason why this may happen is that the statement needs to be undone to preserve the integrity of the data. An example is a single UPDATE statement that modifies all rows of a million-row table.
If you do a Kill in such a situation, it is probably best to let it finish.
In a different direction, if you have, say, a one-row UPDATE that does not use an index, but needs a table scan, then the query will take a long time and possible be more burden on the system than "500 queries". The 'cure' is likely to be adding an INDEX.
What to do about all this? Use the slowlog. Set long_query_time to some small value. The default is 10 (seconds); this is almost useless. Change it to 1 or even something smaller. Then keep an eye on the slowlog. I find it to be the best way to watch out for the system getting out of hand and to tell you what to work on fixing. More discussion: http://mysql.rjweb.org/doc.php/mysql_analysis#slow_queries_and_slowlog
Note that the best metric in the slowlog is neither the number of times a query is run, nor how long it runs, but the product of the two. This is the default for pt-query-digest. For mysqlslowdump, adding -s t gets the results sorted in that order.
I am doing a pagination system for about 100 items.
My question is:
Should I just load all 100 of them and then use jQuery to switch pages without reloading? Or should I use a MySQL query with "LIMIT 5" and then, each time user presses on Next Page or Previous Page, another Mysql query with LIMIT 5 is initiated?
For every item, I would have to load a thumbnail picture but I could keep it in the cache to avoid using my server bandwidth.
Which one is the best option from a server resource perspective?
Thanks in advance. Regards
Try connecting directly to your MySql instance via the command line interface. Execute the query with 100 at at time, and then with LIMIT 5. Look at the msec results. This will tell you which is more efficient or less resource-demanding.
100 records at a time from MySql (depending on dataset) really is nothing. The performance hit wouldn't be noticeable for a properly written query/database schema.
That said, I vote for calling only the results you need at a time. Use the LIMIT clause and your jquery pagination method to make it efficient.
For the server, the most efficient way would be to grab all 100 items once, send them to the client once, and have the client page through them locally. That's one possibly expensive query, but that's cheaper overall than having the client go back and forth for each additional five items.
Having said that, whether that's feasible is a different topic. You do not want to be pushing a huge amount of data to your client at once, since it'll slow down page loads and client-side processing. In fact, it's usually desirable to keep the bandwidth consumed by the client to a minimum. From that POV, making small AJAX requests with five results at a time when and only when necessary is much preferable. Unless even 100 results are so small overall that it doesn't make much of a difference.
Which one works best for you, you need to figure out.
Depends significantly on your query. If it is a simple SELECT from a well-designed table (indexes set etc.) then unlee you're running on a very underpowered server, there will be no noticeable difference between requesting 100 rows and 5 rows. If it is complicated query, then you should probably limit the number of queries.
Other considerations to take into account are how long it takes to load a page, as in the actual round trip time to the server to receive the data by the client. I'm going to make the wild guess that you are in America or Europe, where internet speeds are nice a fast, not the entire world is that lucky. Limiting the number of times your site has to request data from the server is a much better metric than how much load your server has.
This is moving rapidly into UX here, but your users don't care about your server load, they don't care if this way means your load average is 0.01 instead of 0.02. They will care if you have almost instantaneous transitions between sections of your site.
Personally, I'd go with the "load all data, then page locally" method. Also remember that Ajax is your friend, if you have to, load the results page, then request the data. You can split the request into two: first page and rest of pages. There's alot of behind-the-scenes tweaks you can do to make your site seem incredibly fast, and that is something people notice.
I'd say, load 5 at a time and paginate. My considerations:
It is indeed much lighter to load 5 at a time
Not all of your users will navigate through all 100, so those loaded might not even be used
A slight load time between 5 records are something expected (i.e. most users won't complain just because they have to wait 500ms - 1s)
You can also give user options to display x number of items per page, and put all options as well to let users see all items in the page. Over time, you can also monitor what most of your users preference in terms of x number of items to display per page are then go with that for the default LIMIT
I made a logger that shows me all the db queries onscreen. Average is about 30 queries a page. Is that a lot? Each call takes about 0.001 seconds to complete, some longer, some shorter. Here are some totals for a few pages: 0.9 secs, 0.09 secs, 0.8 secs. (Note: these are ONLY the times for the database queries and not image loading etc).
Are these acceptable tiems? What is ideal? What is the industry standard?
If you'd ask me, I would say a page will have to be able to load within 0.5 seconds. Almost 1 second for Queries is way to long.
But, if it is a huge page, which loads of information, a user will probably want to wait for it.
You should probably take a look at the queries, and find out why it takes so long (0.9/0.8)
Add this to your query: EXPLAIN EXTENDED and see if any indexes are used.
That depends on the query, and the level of interactivity of your application. If you have to provide a web application, you are not going to accept a query that completes in 10 seconds. If you can't avoid it, you may have to use some tricks to make it faster, or to build and return the data progressively as new results are found.
depends on the type of the query and ORM you use
My site has a PHP process running, for each window/tab open, that runs in a maximum of 1 minute, and it returns notifications/chat messages/people online or offline. When JavaScript gets the output, it calls the same PHP process again and so on.
This is like Facebook chat.
But, seems it is taking too much CPU when it is running. Have you something in mind how Facebook handles this problem? What do they do so their processes don't take too much CPU and put their servers down?
My process has a "while(true)", with a "sleep(1)" at the end. Inside the cycle, it checks for notifications, checks if one of the current online people got offline/changed status, reads unread messages, etc.
Let me know if you need more info about how my process works.
Does calling other PHPs from "system()" (and wait for its output) alleviate this?
I ask this because it makes other processes to check notifications, and flushes when finished, while the main PHP is just collecting the results.
Thank you.
I think your main problem here is the parallelism. Apache and PHP do not excell at tasks like this where 100+ Users have an open HTTP-Request.
If in your while(true) you spend 0.1 second on CPU-bound workload (checking change status or other useful things) and 1 second on the sleep, this would result in a CPU load of 100% as soon as you have 10 users online in the chat. So in order so serve more users with THIS model of a chat you would have to optimize the workload in your while(true) cycle and/or bring the sleep interval from 1 second to 3 or higher.
I had the same problem in a http-based chat system I wrote many years ago where at some point too many parallel mysql-selects where slowing down the chat, creating havy load on the system.
What I did is implement a fast "ring-buffer" for messages and status information in shared memory (sysv back in the day - today I would probably use APC or memcached). All operations write and read in the buffer and the buffer itself gets periodicaly "flushed" into the database to persist it (but alot less often than once per second per user). If no persistance is needed you can omit a backend of course.
I was able to increase the number of user I could serve by roughly 500% that way.
BUT as soon as you solved this isse you will be faced with another: Available System Memory (100+ apache processes a ~5MB each - fun) and process context switching overhead. The more active processes you have the more your operating system will spend on the overhead involved with assigning "fair enough" CPU-slots AFAIK.
You'll see it is very hard to scale efficently with apache and PHP alone for your usecase. There are open source tools, client and serverbased to help though. One I remember places a server before the apache and queues messages internally while having a very efficent multi-socket communication with javascript clients making real "push" events possible. Unfortunatly I do not remember any names so you'll have to research or hope on the stackoverflow-community to bring in what my brain discarded allready ;)
Edit:
Hi Nuno,
the comment field has too few characters so I reply here.
Lets get to the 10 users in parallel again:
10*0.1 second CPU time per cycle (assumed) is roughly 1s combined CPU-time over a period of 1.1 second (1 second sleep + 0.1 second execute). This 1 / 1.1 which I would boldly round to 100% cpu utilization even though it is "only" %90.9
If there is 10*0.1s CPU time "stretched" over a period of not 1.1 seconds but 3.1 (3 seconds sleep + 0.1 seconds execute) the calculation is 1 / 3.1 = %32
And it is logical. If your checking-cycle queries your backend three times slower you have only a third of the load on your system.
Regarding the shared memory: The name might imply it but if you use good IDs for your cache-areas, like one ID per conversation or user, you will have private areas within the shared memory. Database tables also rely on you providing good IDs to seperate private data from public information so those should be arround allready :)
I would also not "split" any more. The fewer PHP-processes you have to "juggle" in parallel the easier it is for your systems and for you. Unless you see it makes absolutly sense because one type of notification takes alot more querying ressources than another and you want to have different refresh-times or something like that. But even this can be decided in the whyile cycle. users "away"-status could be checked every 30 seconds while the messages he might have written could get checked every 3. No reason to create more cycles. Just different counter variables or using the right divisor in a modulo operation.
The inventor of PHP said that he believes man is too limited to controll parallel processes :)
Edit 2
ok lets build a formula. We have these variables:
duration of execution (e)
duration of sleep (s)
duration of one cycle (C)
number of concurrent users (u)
CPU load (l)
c=e+s
l=ue / c #expresses "how often" the available time-slot c fits into the CPU load generated by 30 CONCURRENT users.
l=ue / (e+s)
for 30 users ASSUMING that you have 0.1s execution time and 1 second sleep
l=30*0.1 / (0.1 + 1)
l=2.73
l= %273 CPU utilization (aka you need 3 cores :P)
exceeding capab. of your CPU measn that cycles will run longer than you intend. the overal response time will increase (and cpu runs hot)
PHP blocks all sleep() and system() calls. What you really need is to research pcntl_fork(). Fortunately, I had these problems over a decade ago and you can look at most of my code.
I had the need for a PHP application that could connect to multiple IRC servers, sit in unlimited IRC chatrooms, moderate, interact with, and receive commands from people. All this and more was done in a process efficient way.
You can check out the entire project at http://sourceforge.net/projects/phpegg/ The code you want is in source/connect.inc.
I have developed an AJAX based game where there is a bug caused (very remote, but in volume it happens at least once per hour) where for some reason two requests get sent to the processing page almost simultaneously (the last one I tracked, the requests were a difference of .0001 ms). There is a check right before the query is executed to make sure that it doesn't get executed twice, but since the difference is so small, the check hasn't finished before the next query gets executed. I'm stumped, how can I prevent this as it is causing serious problems in the game.
Just to be more clear, the query is starting a new round in the game, so when it executes twice, it starts 2 rounds at the same time which breaks the game, so I need to be able to stop the script from executing if the previous round isn't over, even if that previous round started .0001 ms ago.
The trick is to use an atomic update statement to do something which cannot succeed for both threads. They must not both succeed, so you can simply do a query like:
UPDATE gameinfo SET round=round+1 WHERE round=1234
Where 1234 is the current round that was in progress. Then check the number of rows affected. If the thread affects zero rows, it has failed and someone else did it before hand. I am assuming that the above will be executed in its own transaction as autocommit is on.
So all you really need is an application wide mutex. flock() sem_acquire() provide this - but only at the system level - if the application is spread across mutliple servers you'd need to use memcached or implement your own socket server to coordinate nodes.
Alternatively you could use a database as a common storage area - e.g. with MySQL acquire a lock on a table, check when the round was last started, if necessary update the row to say a new round is starting (and remember this - then unlock the table. Carry on....
C.
Locks are one way of doing this, but a simpler way in some cases is to make your request idempotent. This just means that calling it repeatedly has exactly the same effect as calling it once.
For example, your call at the moment effectively does $round++; Clearly repeated calls to this will cause trouble.
But you could instead do $round=$newround; Here repeated calls won't have any effect, because the the round has already been set, and the second call just sets it to the same value.