How to echo random rows from database? - php

I have a database table with about 160 million rows in it.
The table has two columns: id and listing.
I simply need to use PHP to display 1000 random rows from the listing column and put them into <span> tags. Like this:
<span>Row 1</span>
<span>Row 2</span>
<span>Row 3</span>
I've been trying to do it with ORDER BY RAND() but that takes so long to load on such a large database and I haven't been able to find any other solutions.
I'm hoping that there is a fast/easy way to do this. I can't imagine that it'd be impossible to simply echo 1000 random rows... Thanks!

Two solutions are presented here. Both are MySQL-only and can be consumed by any programming language. Generating the randomness in PHP itself would be far too slow, but PHP works fine as the consumer.
Faster solution: with the more involved technique below, I can fetch 1000 random rows from a table of 19 million rows in about two tenths of a second.
Slower solution: the naive approach takes about 15 seconds.
By the way, both use the test-data generation I wrote HERE. That is my little schema; I take it, run two more of the self-inserts shown over there, and end up with 19M rows. I won't repeat it here: to reproduce the 19M rows, follow that answer and do two more of those inserts.
Slower version first
First, the slower method.
select id,thing from ratings order by rand() limit 1000;
That returns 1000 rows in 15 seconds.
For anyone new to mysql, don't even read the following.
Faster solution
This is a little more complicated to describe. The gist of it is that you pre-compute your random numbers and store them as the tail end of an IN clause: the numbers separated by commas and wrapped in a pair of parentheses.
It will look like (1,2,3,4) but it will have 1000 numbers in it.
And you store them, and use them once. Like a one time pad for cryptography. Ok, not a great analogy, but you get the point I hope.
Think of it as an ending for an in clause, and stored in a TEXT column (like a blob).
Why in the world would one want to do this? Because random number generators (RNGs) are prohibitively slow at query time. But generating the lists ahead of time, even with a few machines, can crank out thousands of them relatively quickly. By the way (and you will see this in the structure of my so-called appendices), I capture how long it takes to generate one row: about 1 second with MySQL. But C#, PHP, Java, anything can put that together. The point is not how you put it together, but that you have it when you want it.
The long and short of this strategy: fetch a row whose random list has not yet been used, mark it as used, and issue a call such as
select id,thing from ratings where id in (a,b,c,d,e, ... )
where the IN clause has 1000 numbers in it. The results are available in less than half a second, effectively employing the MySQL CBO (cost-based optimizer), which treats it like a join on a PK index.
I leave this in summary form because it is a bit complicated in practice, but it potentially includes the following pieces:
a table holding the precomputed random numbers (Appendix A)
a mysql create event strategy (Appendix B)
a stored procedure that employs a Prepared Statement (Appendix C)
a mysql-only stored proc to demonstrate RNG in clause for kicks (Appendix D)
Appendix A
A table holding the precomputed random numbers
create table randomsToUse
( -- create a table of 1000 random numbers to use
-- format will be like a long "(a,b,c,d,e, ...)" string
-- pre-computed random numbers, fetched upon needed for use
id int auto_increment primary key,
used int not null, -- 0 = not used yet, 1= used
dtStartCreate datetime not null, -- next two lines to eyeball time spent generating this row
dtEndCreate datetime not null,
dtUsed datetime null, -- when was it used
txtInString text not null -- here is your in clause ending like (a,b,c,d,e, ... )
-- this may only have about 5000 rows and garbage cleaned
-- so maybe choose one or two more indexes, such as composites
);
Appendix B
In the interest of not turning this into a book, see my answer HERE for a mechanism for running a recurring mysql Event. It will drive the maintenance of the table seen in Appendix A using techniques seen in Appendix D and other thoughts you want to dream up. Such as re-use of rows, archiving, deleting, whatever.
Appendix C
A stored procedure to simply get me 1000 random rows.
DROP PROCEDURE if exists showARandomChunk;
DELIMITER $$
CREATE PROCEDURE showARandomChunk
(
)
BEGIN
DECLARE i int;
DECLARE txtInClause text;
-- select now() into dtBegin;
select id,txtInString into i,txtInClause from randomsToUse where used=0 order by id limit 1;
-- select txtInClause as sOut; -- used for debugging
-- if I run this following statement, it is 19.9 seconds on my Dell laptop
-- with 19M rows
-- select * from ratings order by rand() limit 1000; -- 19 seconds
-- however, if I run the following "Prepared Statement", it takes 2 tenths of a second
-- for 1000 rows
set @s1=concat("select * from ratings where id in ",txtInClause);
PREPARE stmt1 FROM @s1;
EXECUTE stmt1; -- execute the puppy and give me 1000 rows
DEALLOCATE PREPARE stmt1;
END
$$
DELIMITER ;
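From PHP, a consumer of this could look roughly like the following. This is a minimal sketch only: the DSN and credentials are placeholders, and it assumes the showARandomChunk procedure and the ratings table from above.
<?php
// Minimal consumer sketch: call the stored proc and wrap each row in a <span>.
// DSN/credentials are placeholders.
$pdo = new PDO('mysql:host=localhost;dbname=test;charset=utf8mb4', 'user', 'pass');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$stmt = $pdo->query('CALL showARandomChunk()');
foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
    // `thing` is the column name used in the ratings example table
    echo '<span>' . htmlspecialchars($row['thing']) . "</span>\n";
}
$stmt->closeCursor(); // required before running further queries after a CALL
?>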
Appendix D
This can be intertwined with the Appendix B concept, however you want to do it. It leaves you with something to show how MySQL could do it all by itself on the RNG side of things. By the way, for parameters 1 and 2 being 1000 and 19M respectively, it takes 800 ms on my machine.
This routine could be written in any language as mentioned in the beginning.
drop procedure if exists createARandomInString;
DELIMITER $$
create procedure createARandomInString
( nHowMany int, -- how many numbers do you want
nMaxNum int -- max of any one number
)
BEGIN
DECLARE dtBegin datetime;
DECLARE dtEnd datetime;
DECLARE i int;
DECLARE txtInClause text;
select now() into dtBegin;
set i=1;
set txtInClause="(";
WHILE i<nHowMany DO
set txtInClause=concat(txtInClause,floor(rand()*nMaxNum)+1,", "); -- extra space good due to viewing in text editor
set i=i+1;
END WHILE;
set txtInClause=concat(txtInClause,floor(rand()*nMaxNum)+1,")");
-- select txtInClause as myOutput; -- used for debugging
select now() into dtEnd;
-- insert a row, that has not been used yet
insert randomsToUse(used,dtStartCreate,dtEndCreate,dtUsed,txtInString) values
(0,dtBegin,dtEnd,null,txtInClause);
END
$$
DELIMITER ;
How to call the above stored proc:
call createARandomInString(1000,18000000);
That generates and saves one row of 1000 numbers, wrapped as described above. Big numbers, from 1 to 18M.
As a quick illustration, if one were to modify the stored proc, un-comment the line near the bottom that says "used for debugging" so it becomes the last statement that runs, and then call this:
call createARandomInString(4,18000000);
... to generate 4 random numbers up to 18M, the results might look like
+-------------------------------------+
| myOutput                            |
+-------------------------------------+
| (2857561,5076608,16810360,14821977) |
+-------------------------------------+
Appendix E
Reality check: these are somewhat advanced techniques and I can't tutor anyone on them here, but I wanted to share them anyway. Over and out.

ORDER BY RAND() is a MySQL function that works fine with small databases, but if you run it on anything larger than about 10k rows, you should build the functionality inside your program instead of using MySQL's premade functions, or organise your data in special ways.
My suggestion: keep your MySQL data indexed by an auto-increment id, or add another incremental, unique column.
Then build a select function:
<?php
//get total number of rows
$result = mysql_query('SELECT `id` FROM `table_name`', $link);
$num_rows = mysql_num_rows($result);

//pick 1000 random ids
$randomlySelected = [];
for ($a = 0; $a < 1000; $a++) {
    $randomlySelected[$a] = rand(1, $num_rows);
}

//then select data by random ids
$where = "";
$control = 0;
foreach ($randomlySelected as $selectedID) {
    if ($control == 0) {
        $where .= "`id` = '" . $selectedID . "' ";
    } else {
        $where .= "OR `id` = '" . $selectedID . "' ";
    }
    $control++;
}
$final_query = "SELECT * FROM `table_name` WHERE " . $where . ";";
$final_results = mysql_query($final_query, $link);
?>
If some of your incremental IDs out of that 160-million-row table are missing, you can easily add a function (a while loop, probably) that draws additional random IDs whenever the array of randomly selected ids yields fewer rows than required, as sketched below.
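A rough sketch of that backfill loop, keeping the same (now long-deprecated) mysql_* calls as the code above; $link, $num_rows and the table/column names are the placeholders already used there:
<?php
// Keep drawing random ids and re-querying until 1000 existing rows are
// collected; missing ids simply get replaced on the next pass.
$collected = []; // id => row
while (count($collected) < 1000) {
    $missing = 1000 - count($collected);
    $ids = [];
    for ($i = 0; $i < $missing; $i++) {
        $ids[] = rand(1, $num_rows);
    }
    $query = "SELECT * FROM `table_name` WHERE `id` IN (" . implode(',', $ids) . ")";
    $result = mysql_query($query, $link);
    while ($row = mysql_fetch_assoc($result)) {
        $collected[$row['id']] = $row; // keyed by id, so duplicates collapse
    }
}
?>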
Let me know if you need some further help.

If your RAND() function is too slow, and you only need quasi-random records (for a test sample) and not truly random ones, you can always make a fast, effectively-random group by sorting by middle characters (using SUBSTRING) in indexed fields. For example, sorting by the 7th digit of a phone number...in descending order...and then by the 6th digit...in ascending order...that's already quasi-random. You could do the same with character columns: the 6th character in a person's name is going to be meaningless/random, etc.
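For example, something along these lines (a sketch only; the table, columns and character positions are invented for illustration):
<?php
// Quasi-random sample by ordering on "meaningless" middle characters.
// Table/column names and offsets are made-up examples.
$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');

$sql = "SELECT listing
        FROM listings
        ORDER BY SUBSTRING(phone, 7, 1) DESC,  -- 7th digit: effectively random
                 SUBSTRING(phone, 6, 1) ASC    -- then the 6th digit
        LIMIT 1000";

foreach ($pdo->query($sql) as $row) {
    echo '<span>' . htmlspecialchars($row['listing']) . "</span>\n";
}
?>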

You want to use the rand function in php. The signature is
rand(min, max);
so get the number of rows in your table into a variable and use that as your max.
A way to do this with SQL is
SELECT COUNT(*) FROM table_name;
then simply run a loop to generate 1000 rands with the above function and use them to get specific rows.
If the IDs are not sequential but are close together, you can simply test each random ID to see if there is a hit. If they are far apart, you could pull the entire ID space into PHP and then randomly sample from that distribution via something like
$random = rand(0, count($rows)-1);
for an array of IDs in $rows.
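Put together, the approach might look roughly like this (a PDO sketch; the table and column names are assumptions, and it presumes the ids run from 1 up to the row count with few gaps):
<?php
// Sketch: pick 1000 random ids up to the row count, then fetch them in one
// indexed IN() query. Table/column names are placeholders.
$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');

$max = (int) $pdo->query('SELECT COUNT(*) FROM `table_name`')->fetchColumn();

$ids = [];
for ($i = 0; $i < 1000; $i++) {
    $ids[] = rand(1, $max);
}

$placeholders = implode(',', array_fill(0, count($ids), '?'));
$stmt = $pdo->prepare("SELECT `listing` FROM `table_name` WHERE `id` IN ($placeholders)");
$stmt->execute($ids);

foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
    echo '<span>' . htmlspecialchars($row['listing']) . "</span>\n";
}
?>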

Use MySQL's RAND() in your SELECT statement. Your query will look like
SELECT * FROM `table` ORDER BY RAND() LIMIT 0,1;


Select query takes too long

These two queries take too long to produce a result (sometimes a minute, or they even end up in an error) and put a really heavy load on the server:
("SELECT SUM(`rate`) AS `today_earned` FROM `".PREFIX."traffic_stats` WHERE `userid` = ?i AND from_unixtime(created) > CURRENT_DATE ORDER BY created DESC", $user->data->userid)
("SELECT COUNT(`userid`) AS `total_clicks` FROM `".PREFIX."traffic_stats` WHERE `userid` = ?i", $user->data->userid)
The table has about 4 million rows.
This is the table structure:
I have one index on traffic_id:
If you select anything from the traffic_stats table it takes forever; inserting into this table, however, is normal.
Is it possible to reduce the time spent on executing this query? I use PDO and I am new to all this.
ORDER BY takes a lot of time, and since you only need aggregate data (adding or counting numbers is commutative), the ORDER BY does a lot of useless sorting, costing you time and server power.
You will also need to make sure that your indexing is right; you will probably need an index on userid and on (userid, created).
Is userid numeric? If not, consider converting it to a numeric type, int for example.
These changes improve your query and structure, but let's improve the concept as well. Are insertions and modifications very frequent? Do you absolutely need real-time data, or can you live with quasi-real-time data?
If insertions/modifications are not very frequent, or you can live with slightly older data, or the problem is causing huge trouble, then you could periodically run a cron job that calculates these values and caches them; the application would then read them from the cache.
I'm not sure why you accepted an answer, when you really didn't get to the heart of your problem.
I also want to clarify that this is a mysql question, and the fact that you are using PDO or PHP for that matter is not important.
People advised you to utilize EXPLAIN. I would go one further and tell you that you need to use EXPLAIN EXTENDED possibly with the format=json option to get a full picture of what is going on. Looking at your screen shot of the explain, what should jump out at you is that the query looked at over 1m rows to get an answer. This is why your queries are taking so long!
At the end of the day, if you have properly indexed your tables, your goal in a large table like this should be for the number of rows examined to be fairly close to the final result set.
So let's look at the 2nd query, which is quite simple:
("SELECT COUNT(`userid`) AS `total_clicks` FROM `".PREFIX."traffic_stats` WHERE `userid` = ?i", $user->data->userid)
In this case the only thing that is really important is that you have an index on traffic_stats.userid.
I would recommend, that, if you are uncertain at this point, drop all indexes other than the original primary key (traffic_id) index, and start with only an index on the userid column. Run your query. What is the result, and how long does it take? Look at the EXPLAIN EXTENDED. Given the simplicity of the query, you should see that only the index is being used and the rows should match the result.
Now to your first query:
("SELECT SUM(`rate`) AS `today_earned` FROM `".PREFIX."traffic_stats` WHERE `userid` = ?i AND from_unixtime(created) > CURRENT_DATE ORDER BY created DESC", $user->data->userid)
Looking at the WHERE clause there are these criteria:
userid =
from_unixtime(created) > CURRENT_DATE
You already have an index on userid. Despite the advice given previously, it is not necessarily correct to have an index on userid, created, and in your case it is of no value whatsoever.
The reason for this is that you are utilizing a mysql function from_unixtime(created) to transform the raw value of the created column.
Whenever you do this, an index can't be used. You would not have any concerns in doing a comparison with the CURRENT_DATE if you were using the native TIMESTAMP type but in this case, to handle the mismatch, you simply need to convert CURRENT_DATE rather than the created column.
You can do this by passing CURRENT_DATE as a parameter to UNIX_TIMESTAMP.
mysql> select UNIX_TIMESTAMP(), UNIX_TIMESTAMP(CURRENT_DATE);
+------------------+------------------------------+
| UNIX_TIMESTAMP() | UNIX_TIMESTAMP(CURRENT_DATE) |
+------------------+------------------------------+
|       1490059767 |                   1490054400 |
+------------------+------------------------------+
1 row in set (0.00 sec)
As you can see from this quick example, UNIX_TIMESTAMP by itself is going to be the current time, but CURRENT_DATE is essentially the start of day, which is apparently what you are looking for.
I'm willing to bet that the number of rows for the current date is far smaller than the total rows for a user over the history of the system, so this is why you would not want an index on userid, created as previously advised in the accepted answer. You might benefit from an index on created, userid.
My advice would be to start with an individual index on each of the columns separately.
("SELECT SUM(`rate`) AS `today_earned` FROM `".PREFIX."traffic_stats` WHERE `userid` = ?i AND created > UNIX_TIMESTAMP(CURRENT_DATE)", $user->data->userid)
And with your re-written query, again assuming that the result set is relatively small, you should see a clean EXPLAIN with rows matching your final result set.
As for whether or not you should apply an ORDER BY, this shouldn't be something you eliminate for performance reasons, but rather because it isn't relevant to your desired result. If you need or want the results ordered by user, then leave it. Unless you are producing a large result set, it shouldn't be a major problem.
In the case of that particular query, since you are doing a SUM(), there is no value in ordering the data, because you are only going to get one row back, so in that case I agree with Lajos; but there are many times when you might be utilizing a GROUP BY, and in that case you might want the final results ordered.
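For reference, the rewritten query could be run from plain PDO roughly like this (a sketch; the original code uses a PREFIX constant and a custom ?i placeholder, so the literal table name and the positional parameter here are assumptions):
<?php
// Sketch: the "today_earned" query with a sargable filter on `created`.
$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');
$userId = 123; // stands in for $user->data->userid

$stmt = $pdo->prepare(
    'SELECT SUM(`rate`) AS `today_earned`
       FROM `traffic_stats`
      WHERE `userid` = ?
        AND `created` > UNIX_TIMESTAMP(CURRENT_DATE)'
);
$stmt->execute([$userId]);
$todayEarned = $stmt->fetchColumn();
?>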

Is it better to handle this Query on MySQL side or PHP side?

Explanation:
I am writing an extension for a closed-source PHP program, and I need to add some numbers based on these example queries:
SELECT sum(value) AS sum1 FROM table WHERE user_id=X AND text='TEXT_HERE_A'
SELECT sum(value) AS sum2 FROM table WHERE user_id=X AND text='TEXT_HERE_B'
I add these numbers in PHP and then update a field in the database:
$update = $sum1 + $sum2 - $php_sum;
UPDATE table SET value=$update WHERE user_id=X
Question:
As you can see, I'm searching based on a TEXT column in MySQL.
Do you think this approach is okay, or should I do the following instead:
SELECT value,text FROM table WHERE user_id=X
and then do the sums and calculations on the PHP side through loops (the difference here is that I select based on the user_id key (INT) only, and the sums are calculated on the PHP side).
Which one has better performance in large tables?
Question: which one is better in this situation, calculating the SUMs on the PHP side or on the MySQL side?
It's usually wrong to return lots of rows and do the filtering and summing in the client -- you generally want to minimize the amount of data transferred between the client and server. So if you can do the filtering with a WHERE clause and aggregation with things like SUM and COUNT, it's usually preferable.
I would try to do the whole thing on the server.
UPDATE table AS t1
JOIN (SELECT SUM(value) AS total_value
FROM table
WHERE user_id = X AND text in ('TEXT_HERE_A', 'TEXT_HERE_B')) AS t2
SET value = total_value - $php_sum
WHERE user_id = X
If you do not need the values on your PHP-side code, then simply do the entire operation in one query.
Since you currently have multiple queries in your example, your code is prone to concurrency anomalies: interleaved execution can leave the final value written by the UPDATE statement inconsistent.
Since your question also asked about querying the TEXT type: it really depends on what sort of data you are putting into the TEXT column. If you can use a CHAR or VARCHAR instead and put an index on that column, the query will be faster. An unindexed TEXT search takes rather long, and it is expensive to index a TEXT column.
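A sketch of the single-statement approach from PHP (PDO assumed; `table`, the column names and the TEXT_HERE values are the placeholders from the question, and the same user id is bound twice because some PDO configurations reject a reused named placeholder):
<?php
// Do the whole read-sum-write in one statement so no intermediate value has
// to round-trip through PHP.
$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');
$userId = 123;   // placeholder user id
$phpSum = 45.6;  // the value computed on the PHP side

$sql = 'UPDATE `table` AS t1
          JOIN (SELECT SUM(`value`) AS total_value
                  FROM `table`
                 WHERE `user_id` = :uid1
                   AND `text` IN (:a, :b)) AS t2
           SET t1.`value` = t2.total_value - :php_sum
         WHERE t1.`user_id` = :uid2';

$stmt = $pdo->prepare($sql);
$stmt->execute([
    ':uid1'    => $userId,
    ':uid2'    => $userId,
    ':a'       => 'TEXT_HERE_A',
    ':b'       => 'TEXT_HERE_B',
    ':php_sum' => $phpSum,
]);
?>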

Obtain a unique sequence order number concurrently from PostgreSQL

We are designing an order management system where the order id is a bigint in PostgreSQL, and its digit layout is as follows:
Take 2015072201000010001 as an example order id: the first eight digits are the date (20150722 here), the next seven digits are the region code (0100001 here), and the last four digits are the sequence number within that region and date.
So every time a new order is created, the PHP application layer queries PostgreSQL with an SQL statement like the following:
select id from orders where id between 2015072201000010000 and 2015072201000019999 order by id desc limit 1 offset 0
then increments the id for the new order and inserts the order into the PostgreSQL database.
This is fine if only one order is being generated at a time. But with hundreds of concurrent order generation requests, there is a high chance that the order ids will collide, given how PostgreSQL handles concurrent reads and writes.
Let's say there are two order requests, A and B. A reads the latest order id from the database, then B reads the latest order id too, then A writes to the database; finally B's write to the database fails because the order id primary key collides.
Any thoughts on how to make this order generation action concurrently feasible?
In the case of many concurrent operations your only option is to work with sequences. In this scenario you would need to create a sequence for every date and region. That sounds like a lot of work, but most of it can be automated.
Creating the sequences
You can name your sequences after the date and the region. So do something like:
CREATE SEQUENCE seq_201507220100001;
You should create a sequence for every combination of day and region. Do this in a function to avoid repetition. Run this function once for every day. You can do this ahead of time or - even better - do this in a scheduled job on a daily basis to create tomorrow's sequences. Assuming you do not need to back-date orders to previous days, you can drop yesterday's sequences in the same function.
CREATE FUNCTION make_and_drop_sequences() RETURNS void AS $$
DECLARE
    region    text;
    tomorrow  text;
    yesterday text;
BEGIN
    tomorrow  := to_char((CURRENT_DATE + 1)::date, 'YYYYMMDD');
    yesterday := to_char((CURRENT_DATE - 1)::date, 'YYYYMMDD');
    -- loop over query results with FOR (FOREACH is only for arrays);
    -- the column is qualified so it does not clash with the "region" variable
    FOR region IN
        SELECT DISTINCT t.region FROM table_with_regions AS t
    LOOP
        EXECUTE format('CREATE SEQUENCE %I', 'seq_' || tomorrow  || region);
        EXECUTE format('DROP SEQUENCE %I',   'seq_' || yesterday || region);
    END LOOP;
    RETURN;
END;
$$ LANGUAGE plpgsql;
Using the sequences
In your PHP code you obviously know the date and the region you need to enter a new order id for. Make another function that generates a new value from the right sequence on the basis of the date and the region:
CREATE FUNCTION new_date_region_id (region text) RETURNS bigint AS $$
DECLARE
    dt_reg text;
    new_id bigint;
BEGIN
    dt_reg := to_char(CURRENT_DATE, 'YYYYMMDD') || region;
    -- shift the date+region part left by four digits and add the next value
    -- from that day/region's sequence (created by make_and_drop_sequences)
    SELECT dt_reg::bigint * 10000 + nextval('seq_' || dt_reg) INTO new_id;
    RETURN new_id;
END;
$$ LANGUAGE plpgsql STRICT;
In PHP you then call:
SELECT new_date_region_id('0100001');
which will give the next available id for the specified region for today.
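From PHP that call might look like this (a sketch using the pgsql PDO driver; connection details are placeholders):
<?php
// Sketch: get the next order id for a region from the plpgsql function above.
$pdo = new PDO('pgsql:host=localhost;dbname=orders', 'user', 'pass');

$stmt = $pdo->prepare('SELECT new_date_region_id(?) AS id');
$stmt->execute(['0100001']);
$orderId = $stmt->fetchColumn();

// $orderId can now be used as the primary key when inserting the order row
?>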
The usual way to avoid locking ids in Postgres is through the sequences.
You could use Postgresql sequences for each region. Something like
create sequence seq_0100001;
then you can get a number from that using:
select nextval('seq_'||regioncode) % 10000 as order_seq
That does mean the order numbers will not reset to 0001 each day, but you do have the same 0000 -> 9999 range for order numbers. It will wrap around.
So you may end up with:
2015072201000010001 -> 2015072201000017500
2015072301000017501 -> 2015072301000019983
2015072401000019984 -> 2015072401000010293
Alternatively you could just generate a sequence for each day/region combination, but you'd need to be on top of dropping the previous day's sequences at the start of the next day.
Try the UUIDv1 type, which is a combination of a timestamp and a MAC address. You can have it auto-generated on the server side if the order of inserts is important to you; otherwise, the IDs can be generated by any of your clients before inserting (you might need their clocks synchronized). Just be aware that UUIDv1 can disclose the MAC address of the host where the UUID was generated; in that case, you may want to spoof the MAC address.
For your case, you can do something like
CREATE TABLE orders (
id uuid PRIMARY KEY DEFAULT uuid_generate_v1(),
created_at timestamp NOT NULL DEFAULT now(),
region_code text NOT NULL REFERENCES...
...
);
Read more at http://www.postgresql.org/docs/9.4/static/uuid-ossp.html

GROUP BY and ORDER BY too slow. How to make faster?

I've been trying to create some stats for my table, but it has over 3 million rows so it is really slow.
I'm trying to find the most popular values in the name column and also show how many times they pop up.
I'm using this at the moment, but it doesn't work because it's too slow and I just get errors.
$total = mysql_query("SELECT `name`, COUNT(*) as b FROM `people` GROUP BY `name` ORDER BY `b` DESC LIMIT 0,5;")or die(mysql_error());
As you can see, I'm trying to get all the names and how many times each has been used, but only show the top 5 to hopefully speed it up.
I would then like to be able to get the values like this:
while($row = mysql_fetch_array($result)){
echo $row['name'].': '.$row['b']."\r\n";
}
And it will show things like this;
Bob: 215
Steve: 120
Sophie: 118
RandomGuy: 50
RandomGirl: 50
I don't care much about the ordering of ties afterwards, like RandomGirl and RandomGuy being the wrong way round.
I think I have provided enough information. :) I would like the names to be case-insensitive if possible, though: Bob should be the same as BoB, bOb, BOB and so on.
Thank-you for your time
Paul
Limiting the results to the top 5 won't give you much of a speed-up; you'll gain time in the result retrieval, but on the MySQL side the whole table still needs to be scanned (to count).
You will speed up your count query by having an index on the name column, since only the index will be scanned and not the table.
Now if you really want to speed up the result and avoid scanning the name index every time you need it (which will still be quite slow if you really have millions of rows), then the only other solution is computing the stats when inserting, deleting or updating rows on this table: that is, using triggers on this table to maintain a statistics table next to it. Then you really only have a simple select query on that statistics table, with only 5 rows read. But you will slow down your insert, delete and update operations (which are already quite slow, especially if you maintain indexes), so if the stats are important you should study this solution.
Do you have an index on name? It might help.
Since you are doing the counting/grouping and then sorting, an index on name doesn't help at all: MySQL has to go through all the rows every time, and there is no way to optimize this. You need to have a separate stats table like this:
CREATE TABLE name_stats( name VARCHAR(n), cnt INT, UNIQUE( name ), INDEX( cnt ) )
and you should update this table whenever you add a new row to 'people' table like this:
INSERT INTO name_stats VALUES( 'Bob', 1 ) ON DUPLICATE KEY UPDATE cnt = cnt + 1;
Querying this table for the list of top names should give you the results instantaneously.
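With that in place, the read side is a trivial indexed lookup; a sketch (PDO assumed, and with a case-insensitive collation such as utf8mb4_general_ci on `name`, 'Bob', 'BOB' and 'bOb' already collapse into one counter row):
<?php
// Sketch: read the precomputed top 5 from name_stats.
$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');

$rows = $pdo->query(
    'SELECT `name`, `cnt` FROM `name_stats` ORDER BY `cnt` DESC LIMIT 5'
)->fetchAll(PDO::FETCH_ASSOC);

foreach ($rows as $row) {
    echo $row['name'] . ': ' . $row['cnt'] . "\r\n";
}
?>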

How can I optimise this MySQL query?

I am using the following MySQL query in a PHP script on a database that contains over 300,000,000 (yes, three hundred million) rows. I know that it is extremely resource intensive and it takes ages to run this one query. Does anyone know how I can either optimise the query or get the information in another way that's quicker?
I need to be able to use any integer between 1 and 15 in place of the 14 in MID(). I also need to be able to match strings of lengths within the same range in the LIKE clause.
Table Info:
games | longint, unsigned, Primary Key
win | bit(1)
loss | bit(1)
Example Query:
SELECT MID(`game`,14,1) AS `move`,
COUNT(*) AS `games`,
SUM(`win`) AS `wins`,
SUM(`loss`) AS `losses`
FROM `games`
WHERE `game` LIKE '1112223334%'
GROUP BY MID(`game`,1,14)
Thanks in advance for your help!
First, have an index on the game field... :)
The query seems simple and straightforward, but it hides the fact that a database design change is probably required.
In such cases I always prefer to maintain a field that holds aggregated data, either per day, per user, or per any other axis. This way you can have a daily task that aggregates the relevant data and saves it in the database.
If you indeed call this query often, you should use the principle of decreasing the efficiency of insertion to increase the efficiency of retrieval.
It looks like the game column is storing two (or possibly more) different things that this query is using:
Filtering by the start of game (first 10 characters)
Grouping by and returning MID(game,1,14) (I'm assuming one of the MID expressions is a typo.)
I'd split that up so that you don't have to use string operations on the game column, and also put indexes on the new columns so you can filter and group them properly.
This query is doing a lot of conversions (long to string) and string manipulations that wouldn't be necessary if the table were normalized (as in one piece of information per column instead of multiple like it is now).
Leave the game column the way it is, and create a game_filter string column based on it to use in your WHERE clause. Then set up a game_group column and populate it with the MID expression on insert. Set up these two columns as your clustered index, first game_filter, then game_group.
The query is simple and, aside from making sure there are all the necessary indexes ("game" field obviously), there may be no obvious way to make it faster by rewriting the query only.
Some modification of data structures will probably be necessary.
One way: precalculate the sums. Each of these records will most likely have a create_date or an auto-incremented key field. Precalculate the sums for all records where this field is ≤ some X, put the results in a side table, and then you only need to calculate over the records > X and summarize those partial results with your precalculated ones.
You could precompute the MID(game,14,1) and MID(game,1,14) and store the first ten digits of the game in a separate gameid column which is indexed.
It might also be an idea to investigate if you could just store an aggregate table of the precomputed values so you increment the count and wins or losses column on insert instead.
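A rough sketch of that aggregate-on-insert idea (the table and column names here are invented for illustration, not from the original schema):
<?php
// Sketch: maintain per-(prefix, move) counters at insert time so the report
// becomes a cheap primary-key lookup instead of a 300M-row scan.
$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');

$pdo->exec(
    'CREATE TABLE IF NOT EXISTS game_stats (
         game_prefix CHAR(13)     NOT NULL,  -- first 13 characters of game
         move        CHAR(1)      NOT NULL,  -- the 14th character
         games       INT UNSIGNED NOT NULL DEFAULT 0,
         wins        INT UNSIGNED NOT NULL DEFAULT 0,
         losses      INT UNSIGNED NOT NULL DEFAULT 0,
         PRIMARY KEY (game_prefix, move)
     )'
);

// Run once per inserted game row, passing that row's win/loss flags:
$stmt = $pdo->prepare(
    'INSERT INTO game_stats (game_prefix, move, games, wins, losses)
     VALUES (?, ?, 1, ?, ?)
     ON DUPLICATE KEY UPDATE
         games  = games  + 1,
         wins   = wins   + VALUES(wins),
         losses = losses + VALUES(losses)'
);
$game = '11122233341234'; // placeholder game string
$stmt->execute([substr($game, 0, 13), substr($game, 13, 1), 1, 0]);
?>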
SELECT MID(`game`,14,1) AS `move`,
COUNT(*) AS `games`,
SUM(`win`) AS `wins`,
SUM(`loss`) AS `losses`
FROM `games`
WHERE `game` LIKE '1112223334%'
Create an index on game:
CREATE INDEX ix_games_game ON games (game)
and rewrite your query as this:
SELECT move,
       (
       SELECT COUNT(*)
       FROM   games
       WHERE  game >= move
              AND game < CONCAT(SUBSTRING(move, 1, 13), CHAR(ASCII(SUBSTRING(move, 14, 1)) + 1))
       ) AS games,
       (
       SELECT SUM(win)
       FROM   games
       WHERE  game >= move
              AND game < CONCAT(SUBSTRING(move, 1, 13), CHAR(ASCII(SUBSTRING(move, 14, 1)) + 1))
       ) AS wins,
       (
       SELECT SUM(loss)
       FROM   games
       WHERE  game >= move
              AND game < CONCAT(SUBSTRING(move, 1, 13), CHAR(ASCII(SUBSTRING(move, 14, 1)) + 1))
       ) AS losses
FROM   (
       SELECT DISTINCT SUBSTRING(game, 1, 14) AS move
       FROM   games
       WHERE  game LIKE '1112223334%'
       ) q
This will help to use the index on game more efficiently.
Can you cache the result set with Memcache or something similar? That would help with repeated hits. Even if you only cache the result set for a few seconds, you might be able to avoid a lot of DB reads.
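For example, with the pecl Memcached extension (a sketch; the server address, cache key and 30-second TTL are arbitrary placeholders):
<?php
// Sketch: cache the aggregated result set briefly to absorb repeated hits.
$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');

$memcached = new Memcached();
$memcached->addServer('127.0.0.1', 11211);

$cacheKey = 'game_stats_1112223334';
$rows = $memcached->get($cacheKey);

if ($rows === false) { // cache miss: hit the database and store the result
    $rows = $pdo->query(
        "SELECT MID(`game`,14,1) AS `move`,
                COUNT(*)    AS `games`,
                SUM(`win`)  AS `wins`,
                SUM(`loss`) AS `losses`
           FROM `games`
          WHERE `game` LIKE '1112223334%'
          GROUP BY MID(`game`,1,14)"
    )->fetchAll(PDO::FETCH_ASSOC);
    $memcached->set($cacheKey, $rows, 30);
}
?>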
