Why is this query abnormally long? - php

I'm timing various parts of the site's "initialisation" code (including such things as verifying the user is logged in, connecting to the database, and importing functions).
This query is currently taking up above half the total initialisation time all by itself:
$sql = "update `users` set `lastclick`=now(),"
    . (substr($_SERVER['PHP_SELF'], 0, 6) == "/ajax/" ? "" : " `lastactive`=now(),")
    . " `lastip`='" . addslashes($_SERVER['REMOTE_ADDR']) . "' where `id`=" . $userdata['id'];
Generating the query takes no time at all, it's the running that's the problem. Example result query:
update `users` set `lastclick`=now(), `lastactive`=now(), `lastip`='192.168.0.1' where `id`=1
Simple enough query, right? I'm the only user on the server right now; there is literally nothing else running. So why does a simple UPDATE take more time than connecting to the database, SELECTing the user data in the first place, validating the cookies, and defining a bunch of functions all combined?
(I just tried replacing now() with a literal value, but that made no difference - in fact it ended up taking 13ms the first time instead of 4...)
EDIT: As requested:
explain select * from `users` where `id`=1
1 row returned
id  select_type  table  type   possible_keys  key      key_len  ref    rows  Extra
1   SIMPLE       users  const  PRIMARY        PRIMARY  4        const  1

Solved my own mystery: it turns out one of the fields being updated (lastactive) was part of an index, and the slowness was coming from rebuilding that index.
Since the only time that index might be used is when updating the list of users who are online, and that only happens by cron at a set interval, I've dropped the index and now the query runs a heck of a lot faster.
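For anyone hitting the same symptom, the check-and-drop can be sketched like this (the index name idx_lastactive is an assumption; use SHOW INDEX to find the real one):

```sql
-- Find which indexes cover the column being updated
SHOW INDEX FROM `users`;

-- Drop the index that includes `lastactive` (the name here is hypothetical)
ALTER TABLE `users` DROP INDEX `idx_lastactive`;
```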
Thanks to those who tried to help - you did help me find the problem, indirectly!

Related

Select query takes too long

These 2 queries take too long to produce a result (sometimes over a minute, or they even end in an error) and put a really heavy load on the server:
("SELECT SUM(`rate`) AS `today_earned` FROM `".PREFIX."traffic_stats` WHERE `userid` = ?i AND from_unixtime(created) > CURRENT_DATE ORDER BY created DESC", $user->data->userid)
("SELECT COUNT(`userid`) AS `total_clicks` FROM `".PREFIX."traffic_stats` WHERE `userid` = ?i", $user->data->userid)
The table has about 4 million rows.
This is the table structure:
I have one index on traffic_id:
If you select anything from the traffic_stats table it takes forever; inserting into the table, however, is normal.
Is it possible to reduce the time spent on executing this query? I use PDO and I am new to all this.
The ORDER BY will take a lot of time, and since you only need aggregate data (summing and counting are commutative), it does a lot of useless sorting, costing you time and server power.
You will need to make sure your indexing is right; you will probably need an index on userid and on (userid, created).
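A sketch of those indexes (using the table name traffic_stats without its PREFIX, which is an assumption):

```sql
-- Single-column index for the plain COUNT(...) per user
CREATE INDEX idx_userid ON traffic_stats (userid);

-- Composite index for filtering on userid plus a created range
CREATE INDEX idx_userid_created ON traffic_stats (userid, created);
```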
Is userid numeric? If not, you might consider converting it to a numeric type, int for example.
These suggestions improve your query and structure, but let's improve the concept as well. Are insertions and modifications very frequent? Do you absolutely need real-time data, or can you live with quasi-real-time data?
If insertions/modifications are infrequent, or older data is acceptable, or the problem is causing serious trouble, then you could run a periodic cron job that calculates these values and caches them. The application would read them from the cache.
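One way to sketch that cache (the table and column names here are assumptions):

```sql
-- Pre-aggregated per-user stats, rebuilt periodically by cron
CREATE TABLE traffic_stats_cache (
  userid        INT UNSIGNED NOT NULL PRIMARY KEY,
  total_clicks  INT UNSIGNED NOT NULL,
  today_earned  DECIMAL(10,2) NOT NULL,
  refreshed_at  TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);

-- The cron job rebuilds the cache in one pass over the big table
REPLACE INTO traffic_stats_cache (userid, total_clicks, today_earned)
SELECT userid,
       COUNT(*),
       SUM(CASE WHEN created > UNIX_TIMESTAMP(CURRENT_DATE) THEN rate ELSE 0 END)
FROM traffic_stats
GROUP BY userid;
```

The application then reads from traffic_stats_cache instead of aggregating 4 million rows on every request.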
I'm not sure why you accepted an answer when you really didn't get to the heart of your problem.
I also want to clarify that this is a MySQL question; the fact that you are using PDO, or PHP for that matter, is not important.
People advised you to use EXPLAIN. I would go one further and tell you to use EXPLAIN EXTENDED, possibly with FORMAT=JSON, to get a full picture of what is going on. Looking at your screenshot of the EXPLAIN, what should jump out at you is that the query looked at over 1 million rows to get an answer. This is why your queries are taking so long!
At the end of the day, if you have properly indexed your tables, your goal in a large table like this should be for the number of rows examined to be fairly close to the final result set.
So let's look at the 2nd query, which is quite simple:
("SELECT COUNT(`userid`) AS `total_clicks` FROM `".PREFIX."traffic_stats` WHERE `userid` = ?i", $user->data->userid)
In this case the only thing that is really important is that you have an index on traffic_stats.userid.
I would recommend that, if you are uncertain at this point, you drop all indexes other than the original primary key (traffic_id) and start with only an index on the userid column. Run your query: what is the result, and how long does it take? Look at the EXPLAIN EXTENDED output. Given the simplicity of the query, you should see that only the index is used and the rows examined should match the result.
Now to your first query:
("SELECT SUM(`rate`) AS `today_earned` FROM `".PREFIX."traffic_stats` WHERE `userid` = ?i AND from_unixtime(created) > CURRENT_DATE ORDER BY created DESC", $user->data->userid)
Looking at the WHERE clause there are these criteria:
userid =
from_unixtime(created) > CURRENT_DATE
You already have an index on userid. Despite the advice given previously, an index on (userid, created) is not necessarily correct, and in your case it is of no value whatsoever.
The reason for this is that you are using a MySQL function, from_unixtime(created), to transform the raw value of the created column.
Whenever you wrap a column in a function like this, an index on that column can't be used. You would have no such concern comparing against CURRENT_DATE if the column were a native TIMESTAMP type; here, to handle the mismatch, you simply need to convert CURRENT_DATE rather than the created column.
You can do this by passing CURRENT_DATE as a parameter to UNIX_TIMESTAMP.
mysql> SELECT UNIX_TIMESTAMP(), UNIX_TIMESTAMP(CURRENT_DATE);
+------------------+------------------------------+
| UNIX_TIMESTAMP() | UNIX_TIMESTAMP(CURRENT_DATE) |
+------------------+------------------------------+
|       1490059767 |                   1490054400 |
+------------------+------------------------------+
1 row in set (0.00 sec)
As you can see from this quick example, UNIX_TIMESTAMP by itself is going to be the current time, but CURRENT_DATE is essentially the start of day, which is apparently what you are looking for.
I'm willing to bet that the rows for the current date are fewer in number than the total rows for a user over the history of the system, which is why you would not want an index on (userid, created) as previously advised in the accepted answer. You might benefit from an index on (created, userid).
My advice would be to start with an individual index on each of the columns separately.
("SELECT SUM(`rate`) AS `today_earned` FROM `".PREFIX."traffic_stats` WHERE `userid` = ?i AND created > UNIX_TIMESTAMP(CURRENT_DATE)", $user->data->userid)
And with your re-written query, again assuming that the result set is relatively small, you should see a clean EXPLAIN with rows matching your final result set.
As for whether or not you should apply an ORDER BY, this shouldn't be something you eliminate for performance reasons, but rather because it isn't relevant to your desired result. If you need or want the results ordered by user, then leave it. Unless you are producing a large result set, it shouldn't be a major problem.
In the case of that particular query, since you are doing a SUM(), there is no value in ordering the data, because you are only going to get one row back; in that case I agree with Lajos. But there are many times when you might use a GROUP BY, and then you might want the final results ordered.

How to stop row ID from getting bigger

I have this code, which parses JSON and is being run every 5 minutes via the command php file.php.
foreach ($data->chatters->staff as $viewers) {
    $sql = "INSERT INTO viewers (user, points)
            VALUES ('$viewers', '10')
            ON DUPLICATE KEY UPDATE
            points = points + 5";
    if ($conn->query($sql) === TRUE) {
        //echo "Staff: Done";
    } else {
        echo "Error: " . $sql . "<br>" . $conn->error;
    }
}
For some reason, every cycle it increases the ID count.
Now my database looks like this:
id   user    points
1    uname1  123
2    uname2  123
18   uname3  123
256  uname4  123
And the ID value just keeps increasing. What am I missing?
Thanks.
Before explaining: what you are experiencing is normal.
The id number gets generated for every INSERT query. It's an atomic operation, and the feature is designed to work in a concurrent environment (many clients hitting one database with lots of requests).
To be able to serve them concurrently, MySQL isolates transactions from each other, and each transaction lives in its own "bubble". That means that every time an insert happens and an auto_increment value is generated, it can't produce the same value for different queries.
The only way for this to work is for MySQL to always increase the number and never re-use it. That's the only possible solution, and the fastest one as well.
What does this mean? Every time your query hits the unique constraint, the value that was generated for auto_increment is "wasted": the insert doesn't happen and the UPDATE part is executed instead.
You can't fix it, you can just break it. Leave it as it is, you've done it right.
The job of auto_increment is only to provide unique numbers. If they got spent somehow (by a failed insert) then that's fine and not something you need to worry about.
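A quick sketch of the effect with InnoDB's default settings (the table is based on the question; the exact ids are illustrative):

```sql
CREATE TABLE viewers (
  id     INT AUTO_INCREMENT PRIMARY KEY,
  user   VARCHAR(64) NOT NULL UNIQUE,
  points INT NOT NULL
);

INSERT INTO viewers (user, points) VALUES ('alice', 10)
  ON DUPLICATE KEY UPDATE points = points + 5;  -- inserts a row, id = 1

INSERT INTO viewers (user, points) VALUES ('alice', 10)
  ON DUPLICATE KEY UPDATE points = points + 5;  -- only updates, but still reserves an id

INSERT INTO viewers (user, points) VALUES ('bob', 10)
  ON DUPLICATE KEY UPDATE points = points + 5;  -- typically gets id 3, not 2
```

The gap between alice's and bob's ids is exactly the "wasted" auto_increment value described above.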
ON DUPLICATE KEY UPDATE still reserves a fresh auto_increment value each time, even when it only updates the existing row.
If you don't want it to be like that, you'll have to make two queries: one for the INSERT, another one for the UPDATE.
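A sketch of the two-query version (table and column names are taken from the question; the username is illustrative):

```sql
-- Try to update first; only existing rows are touched
UPDATE viewers SET points = points + 5 WHERE user = 'uname1';

-- If ROW_COUNT() returned 0, the user is new, so insert without
-- burning an auto_increment value on a duplicate:
INSERT INTO viewers (user, points) VALUES ('uname1', 10);
```

This keeps the id sequence gap-free at the cost of an extra round trip per user.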

Long polling with PHP and jQuery - issue with update and delete

I wrote a small script which uses the concept of long polling.
It works as follows:
1. jQuery sends the request with some parameters (say, lastId) to PHP.
2. PHP gets the latest id from the database and compares it with lastId.
3. If lastId is smaller than the newly fetched id, the script exits and echoes the new records.
4. jQuery then displays this output.
I have taken care of all security checks. The problem is that when a record is deleted or updated, there is no way to know it.
The nearest solution I can get is to count the number of rows and match it against a saved row count. But then, if I have 1000 records, I have to echo all 1000, which can be a big performance issue.
The CRUD functionality of this application is completely separate and runs on a different server, so I don't get to know which record was deleted.
I don't need any help coding-wise, but I am looking for suggestions to make this work for updates and deletes.
Please note: WebSockets (my favourite) and Node.js are not options for me.
Instead of using a certain ID from your table, you could also check when the table itself was modified the last time.
SQL:
SELECT UPDATE_TIME
FROM information_schema.tables
WHERE TABLE_SCHEMA = 'yourdb'
AND TABLE_NAME = 'yourtable';
If successful, the statement should return something like
UPDATE_TIME
2014-04-02 11:12:15
Then use the resulting timestamp instead of the lastid. I am using a very similar technique to display and auto-refresh logs, works like a charm.
You have to adjust the statement to your needs, and replace yourdb and yourtable with the values needed for your application. It also requires you to have access to information_schema.tables, so check if this is available, too.
Two alternative solutions:
If the solution described above is too imprecise for your purpose (it might lead to issues when the table is changed multiple times per second), you might combine that timestamp with your current mechanism with lastid to cover new inserts.
Another way would be to implement a table in which the current state is logged; this is where your ajax requests check the current state. Then create triggers on your data tables which update this table.
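A minimal sketch of that state table and one such trigger (all names here are assumptions; you would add matching INSERT and UPDATE triggers as well):

```sql
-- One row per watched table, touched on every change
CREATE TABLE table_state (
  tbl        VARCHAR(64) PRIMARY KEY,
  changed_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);

INSERT INTO table_state (tbl) VALUES ('yourtable');

-- Bump the timestamp whenever a row is deleted from the data table
CREATE TRIGGER yourtable_after_delete AFTER DELETE ON yourtable
FOR EACH ROW
  UPDATE table_state SET changed_at = NOW() WHERE tbl = 'yourtable';
```

The long-polling script then compares changed_at against the client's last-seen timestamp instead of lastId.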
You can get the highest ID by
SELECT id FROM table ORDER BY id DESC LIMIT 1
but this is not reliable in my opinion, because you can have IDs of 1, 2, 3, 7 and then a new row inserted with the ID 5.
Keep in mind: the highest ID, is not necessarily the most recent row.
The current auto increment value can be obtained by
SELECT AUTO_INCREMENT FROM information_schema.tables
WHERE TABLE_SCHEMA = 'yourdb'
AND TABLE_NAME = 'yourtable';
Maybe a timestamp plus microtime() is an option for you?

PHP MySQL slow query access

I have a table in my database :
And I'm using MySQL to insert/update/create info; however, the time it takes to execute queries is slow. For example:
First when a user runs my app it verifies they can use it by retrieving all IPs from the db with
SELECT IP FROM user_info
Then, using a while loop, if the IP is in the db it runs another query:
"SELECT Valid FROM user_info WHERE IP='" . $_SERVER['REMOTE_ADDR'] . "'"
If Valid is 1 they can use it; otherwise they can't. However, if their IP is not found in the db, it creates a new entry for them using
"INSERT INTO user_info (IP, First) VALUES ('" . $_SERVER['REMOTE_ADDR'] . "', '" . date("Y/m/d") . "')"
Once this first script has finished, it accesses another. That one was supposed to update the db every minute, though I don't think I can accomplish that now; the update query is this:
"UPDATE user_info SET Name='" . $Name . "', Status='" . $Status . "', Last='" . $Last . "', Address='" . $Address . "' WHERE IP='" . $_SERVER['REMOTE_ADDR'] . "'"
Altogether it takes on average 2.2 seconds, and there's only 1 row in the table at the moment.
My question is: how do I speed up MySQL queries? I've read about indexes and how they can improve performance, but I do not fully understand how to use them. Any light shed on this topic would help.
Nubcake
Indexes will become very important as your site grows, but they won't help when you have only one row in your table, and their absence cannot be the reason why your queries take so long. You seem to have some sort of fundamental problem with your database; maybe it's a latency issue.
Try starting with some simpler queries like SELECT * FROM users LIMIT 1 or even just SELECT 1 and see if you still get bad performance.
The fewer the queries, the lower the latency of the system. Try merging queries made on the same table. For example, your second and third queries could be merged into something like
INSERT INTO user_info ...... WHERE Valid=1 AND IP= ...
Check the number of rows affected to know whether a new row was added.
Also, do not open and close your SQL connection at any point in between; the overhead of establishing a new connection can be high.
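As a concrete, valid-MySQL version of the merge idea above, an upsert keyed on the IP avoids the separate existence check; the Valid lookup then takes one further SELECT. Column names come from the question, but the unique index on IP is an assumption:

```sql
-- Requires a unique key: ALTER TABLE user_info ADD UNIQUE (IP);
INSERT INTO user_info (IP, First)
VALUES ('192.0.2.1', CURDATE())
ON DUPLICATE KEY UPDATE IP = IP;   -- no-op when the visitor already exists

SELECT Valid FROM user_info WHERE IP = '192.0.2.1';
```

This replaces the SELECT-all-IPs loop plus the conditional INSERT with two fixed queries per request.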
You could make IP the primary key, which also indexes it.

Best way to update user rankings without killing the server

I have a website where user ranking is a central part, but the user count has grown to over 50,000, and looping through all of them to update the rank every 5 minutes is putting a strain on the server. Is there a better method that can update the ranks at least every 5 minutes? It doesn't have to be PHP; it could be run as a Perl script or something if that would do the job better (though I'm not sure why it would; just leaving my options open here).
This is what I currently do to update ranks:
$get_users = mysql_query("SELECT id FROM users WHERE status = '1' ORDER BY month_score DESC");
$i = 0;
while ($a = mysql_fetch_array($get_users)) {
    $i++;
    mysql_query("UPDATE users SET month_rank = '$i' WHERE id = '{$a['id']}'");
}
UPDATE (solution):
Here is the solution code, which takes less than half a second to execute and update all 50,000 rows (make rank the auto-incrementing primary key, as suggested by Tom Haigh):
mysql_query("TRUNCATE TABLE userRanks");
mysql_query("INSERT INTO userRanks (userid) SELECT id FROM users WHERE status = '1' ORDER BY month_score DESC");
mysql_query("UPDATE users, userRanks SET users.month_rank = userRanks.rank WHERE users.id = userRanks.userid");
Make userRanks.rank an auto-incrementing primary key. If you then insert userids into userRanks in descending rank order, it will increment the rank column on every row. This should be extremely fast.
TRUNCATE TABLE userRanks;
INSERT INTO userRanks (userid) SELECT id FROM users WHERE status = '1' ORDER BY month_score DESC;
UPDATE users, userRanks SET users.month_rank = userRanks.rank WHERE users.id = userRanks.userid;
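For completeness, the helper table could look something like this (a sketch; the column types are assumptions, and note that rank is a reserved word in MySQL 8+, hence the backticks):

```sql
CREATE TABLE userRanks (
  `rank` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  userid INT UNSIGNED NOT NULL
);
```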
My first question would be: why are you doing this polling-type operation every five minutes?
Surely rank changes will be in response to some event and you can localize the changes to a few rows in the database at the time when that event occurs. I'm pretty certain the entire user base of 50,000 doesn't change rankings every five minutes.
I'm assuming the "status = '1'" indicates that a user's rank has changed so, rather than setting this when the user triggers a rank change, why don't you calculate the rank at that time?
That would seem to be a better solution as the cost of re-ranking would be amortized over all the operations.
Now I may have misunderstood what you meant by ranking in which case feel free to set me straight.
A simple alternative for bulk update might be something like:
SET @rnk := 0;
UPDATE users
SET month_rank = (@rnk := @rnk + 1)
ORDER BY month_score DESC;
This code uses a user variable (@rnk) that is incremented on each update. Because the update runs over the ordered rows, the month_rank column is set to the incremented value for each row.
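On MySQL 8+ (newer than this thread), the same bulk ranking can be written with a window function; a sketch, assuming the schema from the question:

```sql
UPDATE users u
JOIN (
  SELECT id, ROW_NUMBER() OVER (ORDER BY month_score DESC) AS rnk
  FROM users
  WHERE status = '1'
) ranked ON ranked.id = u.id
SET u.month_rank = ranked.rnk;
```

This avoids both the user variable and the helper table, at the cost of requiring a newer server.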
Updating the users table row by row will be a time consuming task. It would be better if you could re-organise your query so that row by row updates are not required.
I'm not 100% sure of the syntax (as I've never used MySQL before) but here's a sample of the syntax used in MS SQL Server 2000
DECLARE @tmp TABLE
(
    [MonthRank] [INT] IDENTITY(1,1) NOT NULL,
    [UserId] [INT] NOT NULL
)

INSERT INTO @tmp ([UserId])
SELECT [id]
FROM [users]
WHERE [status] = '1'
ORDER BY [month_score] DESC

UPDATE [users]
SET [month_rank] = [tmp].[MonthRank]
FROM @tmp AS [tmp], [users]
WHERE [users].[Id] = [tmp].[UserId]
In MS SQL Server 2005/2008 you would probably use a CTE.
Any time you have a loop of any significant size that executes queries inside, you've got a very likely antipattern. We could look at the schema and processing requirement with more info, and see if we can do the whole job without a loop.
How much time does it spend calculating the scores, compared with assigning the rankings?
Your problem can be handled in a number of ways; honestly, more details from your server may point you in a totally different direction. But doing it that way, you are causing 50,000 little locks on a heavily read table. You might get better performance with a staging table and then some sort of transition; inserts into a table no one is reading from are probably going to be faster.
Consider
mysql_query("delete from month_rank_staging;");
while (bla) {
    mysql_query("insert into month_rank_staging values ('$id', '$i');");
}
mysql_query("update month_rank_staging src, users set users.month_rank = src.month_rank where src.id = users.id;");
That'll cause one (bigger) lock on the table, but might improve your situation. But again, that may be way off base depending on the true source of your performance problem. You should probably look deeper at your logs, mysql config, database connections, etc.
Possibly you could use shards by time or other category. But read this carefully before...
You can split up the rank processing and the updating execution. So, run through all the data and process the query. Add each update statement to a cache. When the processing is complete, run the updates. You should have the WHERE portion of the UPDATE reference a primary key set to auto_increment, as mentioned in other posts. This will prevent the updates from interfering with the performance of the processing. It will also prevent users later in the processing queue from wrongfully taking advantage of the values from the users who were processed before them (if one user's rank affects that of another). It also prevents the database from clearing out its table caches from the SELECTS your processing code does.
