Paginate items from different sources - php

Here's a problem I'm facing: I need to list some items. Those items come from different sources (let's say table A, table B, table C), with different attributes and natures (although some are common).
How can I merge them together in a list that is paginated?
The options I've considered:
Get them all first, then sort and paginate them afterwards in the code. This doesn't work well because there are too many items (thousands) and performance is a mess.
Join them in a SQL view with their shared attributes, once the SQL query is done, reload only the paginated items to get the rest of their attributes. This works so far, but might become difficult to maintain if the sources change/increase.
Do you know any other option? Basically, what is the most used/recommended way to paginate items from two data sources (either in SQL or directly in the code)?
Thanks.

If UNION solves the problem, here are some syntax and optimization tips.
This will provide page 21 of 10-row pages:
(
( SELECT ... LIMIT 210 )
UNION [ALL|DISTINCT]
( SELECT ... LIMIT 210 )
) ORDER BY ... LIMIT 10 OFFSET 200
Note that 210 = 200+10. You can't trust using OFFSET in the inner SELECTs.
Use UNION ALL for speed, but if there could be repeated rows between the SELECTs, then explicitly say UNION DISTINCT.
If you take away too many parentheses, you will get either syntax errors or the 'wrong' results.
If you end up with a subquery, repeat the ORDER BY but not the LIMIT:
SELECT ...
FROM (
( SELECT ... LIMIT 210 )
UNION [ALL|DISTINCT]
( SELECT ... LIMIT 210 )
ORDER BY ... LIMIT 10 OFFSET 200
) AS u
JOIN something_else ON ...
ORDER BY ...
One reason that might include a JOIN is for performance -- The subquery u has boiled the resultset down to only 10 rows, hence the JOIN will have only 10 things to look up. Putting the JOIN inside would lead to lots of joining before whittling down to only 10.
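To make the skeleton concrete, here is a minimal sketch with two hypothetical source tables item_a and item_b that share a created_at column to sort on (names invented for illustration, not from the question):
( SELECT id, title, created_at FROM item_a ORDER BY created_at DESC LIMIT 210 )
UNION ALL
( SELECT id, title, created_at FROM item_b ORDER BY created_at DESC LIMIT 210 )
ORDER BY created_at DESC
LIMIT 10 OFFSET 200
Each inner SELECT keeps at most OFFSET + LIMIT = 210 rows, and the outer ORDER BY ... LIMIT picks the final page of 10.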

I actually had to handle a similar situation very recently, specifically reporting across two large tables and paginating across both of them. The answer I came to was to use subqueries, like so:
SELECT
t1.id as 't1_id',
t1.name as 't1_name',
t1.attribute as 't1_attribute',
t2.id as 't2_id',
t2.name as 't2_name',
t2.attribute as 't2_attribute',
l.attribute as 'l_attribute'
FROM (
SELECT
id, name, attribute
FROM
table1
/* You can perform joins in here if you want, just make sure you're using your aliases right */
/* You can also put where statements here */
ORDER BY
name DESC, id ASC
LIMIT 0,50
) as t1
INNER JOIN (
SELECT
id,
name,
attribute
FROM
table2
ORDER BY
attribute ASC
LIMIT 250,50
) as t2
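/* The next condition is always true (id is never NULL), so it effectively cross-joins the 50-row page from t1 with the 50-row page from t2 */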
ON t2.id IS NOT NULL
LEFT JOIN
linkingTable as l
ON l.t1Id = t1.id
AND l.t2Id = t2.id
/* Do your wheres and stuff here */
/* You shouldn't need to do any additional ordering or limiting */

Related

yii2 data provider query takes very long time

I am using the Yii2 data provider to extract data from the database. The raw query looks like this:
SELECT `client_money_operation`.* FROM `client_money_operation`
LEFT JOIN `user` ON `client_money_operation`.`user_id` = `user`.`id`
LEFT JOIN `client` ON `client_money_operation`.`client_id` = `client`.`id`
LEFT JOIN `client_bonus_operation` ON `client_money_operation`.`id` = `client_bonus_operation`.`money_operation_id`
WHERE (`client_money_operation`.`status`=0) AND (`client_money_operation`.`created_at` BETWEEN 1 AND 1539723600)
GROUP BY `operation_code` ORDER BY `created_at` DESC LIMIT 10
This query takes 107 seconds to execute.
The table client_money_operation contains 132,000 rows. What do I need to do to optimise this query, or to set up my database properly?
Try pagination. But if you must show a large set of records in one go, remove as many left joins as you can. You can duplicate some data in the client_money_operation table if it is genuinely required in the one-go result set.
SELECT mo.*
FROM `client_money_operation` AS mo
LEFT JOIN `user` AS u ON mo.`user_id` = u.`id`
LEFT JOIN `client` AS c ON mo.`client_id` = c.`id`
LEFT JOIN `client_bonus_operation` AS bo ON mo.`id` = bo.`money_operation_id`
WHERE (mo.`status`=0)
AND (mo.`created_at` BETWEEN 1 AND 1539723600)
GROUP BY `operation_code`
ORDER BY `created_at` DESC
LIMIT 10
That is a rather confusing use of GROUP BY. First, it is improper to group by one column while having lots of non-aggregated columns in the SELECT list. And the use of created_at in the ORDER BY does not make sense, since it is unclear which date will be associated with each operation_code. Perhaps you want MIN(created_at)?
Optimization...
There will be a full scan of mo and (hopefully) PRIMARY KEY lookups into the other tables. Please provide EXPLAIN SELECT ... so we can check this.
The only useful index on mo is INDEX(status, created_at), and it may or may not be useful, depending on how big that date range is.
bo needs some index starting with money_operation_id.
What table(s) are operation_code and created_at in? It makes a big difference to the Optimizer.
But there is a pattern that can probably be used to greatly speed up the query. (I can't give you details without knowing what table those columns are in, nor whether it can be made to work.)
SELECT mo.*
FROM ( SELECT mo.id FROM .. WHERE .. GROUP BY .. ORDER BY .. LIMIT .. ) AS x
JOIN client_money_operation AS mo ON x.id = mo.id
ORDER BY .. -- yes, repeated
That is, first do (in a derived table) the minimal work to find the ids of the 10 rows desired, then use JOIN(s) to fetch the other columns needed.
(If yii2 cannot be made to generate such, then it is in the way.)
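As a sketch only (it assumes operation_code and created_at live in client_money_operation, which the question doesn't confirm), the rewrite could look like:
SELECT mo.*
FROM (
    SELECT id
    FROM client_money_operation
    WHERE status = 0
      AND created_at BETWEEN 1 AND 1539723600
    GROUP BY operation_code   -- inherits the questionable GROUP BY discussed above
    ORDER BY created_at DESC
    LIMIT 10
) AS x
JOIN client_money_operation AS mo ON mo.id = x.id
ORDER BY mo.created_at DESC   -- yes, repeated
The LEFT JOINs are dropped here because none of their columns appear in the result; re-attach them in the outer query if you actually need their columns.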

Query Performance effects

I just want to ask: with the database design and the query below, will this have a big effect on query performance, or should I break the query down instead of using multiple subqueries? Though this subquery works for me, I want to make sure it will not hurt performance someday. My goal with this query is to generate dynamic queries across all those tables; an example is to view the list of participants who attended a specific commodity with a start date between 2 given dates. Thank you in advance.
Below is my sample query, which lists participants between 1 and 3 years old.
SELECT TT.title, TTP.*, TS.*, TP.lastname, TP.firstname, (DATE_FORMAT(NOW(), '%Y') - DATE_FORMAT(TP.birthday, '%Y') - (DATE_FORMAT(NOW(), '00-%m-%d') < DATE_FORMAT(TP.birthday, '00-%m-%d')) )AS age, TP.birthday
FROM tbl_trainingparticipant AS TTP
LEFT JOIN (
SELECT t1.*
FROM tbl_schedules t1
WHERE t1.sched_id = (
SELECT t2.sched_id
FROM tbl_schedules t2
WHERE t2.training_id = t1.training_id
ORDER BY t2.sched_id DESC LIMIT 1
)
)AS TS ON TTP.training_id = TS.training_id
LEFT JOIN tbl_participants AS TP
ON TTP.participant_id = TP.participant_id
LEFT JOIN tbl_trainings AS TT
ON TT.training_id = TS.training_id
HAVING age BETWEEN 1 AND 3
SQL has the ability to nest queries within one another. A subquery is a SELECT statement that is nested within another SELECT statement and which returns intermediate results. SQL executes the innermost subquery first, then the next level.
So if you use a nested select, you assign MySQL the task of executing two queries, which are sent to it together. Usually it's going to be faster to do one trip than several.
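For instance, the per-row correlated subquery over tbl_schedules above could be collapsed into a single groupwise-max join that MySQL evaluates in one pass (a sketch, not tested against your schema):
SELECT t1.*
FROM tbl_schedules AS t1
INNER JOIN (
    SELECT training_id, MAX(sched_id) AS sched_id
    FROM tbl_schedules
    GROUP BY training_id
) AS latest
    ON latest.training_id = t1.training_id
   AND latest.sched_id = t1.sched_id
This returns the same "latest schedule per training" rows without re-running a subquery for every row of the outer table.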

Displaying a large amount of data in paging table without heavily impacting DB

The current implementation is a single complex query with multiple joins and temporary tables, but it is putting too much stress on my MySQL server and taking upwards of 30 seconds to load the table. The data is retrieved by PHP via a JavaScript Ajax call and displayed on a webpage. Here are the tables involved:
Table: table_companies
Columns: company_id, ...
Table: table_manufacture_line
Columns: line_id, line_name, ...
Table: table_product_stereo
Columns: product_id, line_id, company_id, assembly_datetime, serial_number, ...
Table: table_product_television
Columns: product_id, line_id, company_id, assembly_datetime, serial_number, warranty_expiry, ...
A single company can have 100k+ items split between the two product tables. The product tables are unioned and filtered by the line_name, then ordered by assembly_datetime and limited depending on the paging. The datetime value is also reliant on timezone and this is applied as part of the query (another JOIN + temp table). line_name is also one of the returned columns.
I was thinking of splitting the line_name filter out from the product union query. Essentially I'd determine the ids of the lines that correspond to the filter, then do a UNION query with a WHERE condition WHERE line_id IN (<results from previous query>). This would cut out the need for joins and temp tables, and I can apply the line_name to line_id and timezone modification in PHP, but I'm not sure this is the best way to go about things.
I have also looked at potentially using Redis, but the large number of individual products is leading to a similarly long wait time when pushing all of the data to Redis via PHP (20-30 seconds), even if it is just pulled in directly from the product tables.
Is it possible to tweak the existing queries to increase the efficiency?
Can I push some of the handling to PHP to decrease the load on the SQL server? What about Redis?
Is there a way to architect the tables better?
What other solution(s) would you suggest?
I appreciate any input you can provide.
Edit:
Existing query:
SELECT line_name,CONVERT_TZ(datetime,'UTC',timezone) datetime,... FROM (SELECT line_name,datetime,... FROM ((SELECT line_id,assembly_datetime datetime,... FROM table_product_stereos WHERE company_id=# ) UNION (SELECT line_id,assembly_datetime datetime,... FROM table_product_televisions WHERE company_id=# )) AS union_products INNER JOIN table_manufacture_line USING (line_id)) AS products INNER JOIN (SELECT timezone FROM table_companies WHERE company_id=# ) AS tz ORDER BY datetime DESC LIMIT 0,100
Here it is formatted for some readability.
SELECT line_name,CONVERT_TZ(datetime,'UTC',tz.timezone) datetime,...
FROM (SELECT line_name,datetime,...
FROM (SELECT line_id,assembly_datetime datetime,...
FROM table_product_stereos WHERE company_id=#
UNION
SELECT line_id,assembly_datetime datetime,...
FROM table_product_televisions
WHERE company_id=#
) AS union_products
INNER JOIN table_manufacture_line USING (line_id)
) AS products
INNER JOIN (SELECT timezone
FROM table_companies
WHERE company_id=#
) AS tz
ORDER BY datetime DESC LIMIT 0,100
IDs are indexed; primary keys are the first key of each table.
Let's build this query up from its component parts to see what we can optimize.
Observation: you're fetching the 100 most recent rows from the union of two large product tables.
So, let's start by trying to optimize the subqueries fetching stuff from the product tables. Here is one of them.
SELECT line_id,assembly_datetime datetime,...
FROM table_product_stereos
WHERE company_id=#
But look, you only need the 100 newest entries here. So, let's add
ORDER BY assembly_datetime DESC
LIMIT 100
to this query. Also, you should put a compound index on this table as follows. This will allow both the WHERE and ORDER BY lookups to be satisfied by the index.
CREATE INDEX id_date ON table_product_stereos (company_id, assembly_datetime)
All the same considerations apply to the query from table_product_televisions. Order it by the time, limit it to 100, and index it.
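Presumably the matching index (same column order) would be:
CREATE INDEX id_date ON table_product_televisions (company_id, assembly_datetime)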
If you need to apply other selection criteria, you can put them in these inner queries. For example, in a comment you mentioned a selection based on a substring search. You could do this as follows
SELECT t.line_id,t.assembly_datetime datetime,...
FROM table_product_stereos AS t
JOIN table_manufacture_line AS m ON m.line_id = t.line_id
AND m.line_name LIKE '%test'
WHERE company_id=#
ORDER BY assembly_datetime DESC
LIMIT 100
Next, you are using UNION to combine those two query result sets into one. UNION has the function of eliminating duplicates, which is time-consuming. (You know you don't have duplicates, but MySQL doesn't.) Use UNION ALL instead.
Putting this all together, the innermost sub query becomes this. We have to wrap up the subqueries because SQL is confused by UNION and ORDER BY clauses at the same query level.
SELECT * FROM (
SELECT line_id,assembly_datetime datetime,...
FROM table_product_stereos
WHERE company_id=#
ORDER BY assembly_datetime DESC
LIMIT 100
) AS st
UNION ALL
SELECT * FROM (
SELECT line_id,assembly_datetime datetime,...
FROM table_product_televisions
WHERE company_id=#
ORDER BY assembly_datetime DESC
LIMIT 100
) AS tv
That gets you 200 rows. It should get those rows fairly quickly.
200 rows are guaranteed to be enough to give you the 100 most recent items later on after you do your outer ORDER BY ... LIMIT operation. But that operation only has to crunch 200 rows, not 100K+, so it will be far faster.
Finally wrap up this query in your outer query material. Join the table_manufacture_line information, and fix up the timezone.
If you do the indexing and the ORDER BY ... LIMIT operation earlier, this query should become very fast.
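Assembled, the wrapped-up query might look something like this (a sketch; the ... column lists are left abbreviated as in the question):
SELECT ml.line_name, CONVERT_TZ(u.datetime,'UTC',tz.timezone) datetime, ...
FROM (
  SELECT * FROM (
    SELECT line_id, assembly_datetime datetime, ...
    FROM table_product_stereos
    WHERE company_id=#
    ORDER BY assembly_datetime DESC
    LIMIT 100
  ) AS st
  UNION ALL
  SELECT * FROM (
    SELECT line_id, assembly_datetime datetime, ...
    FROM table_product_televisions
    WHERE company_id=#
    ORDER BY assembly_datetime DESC
    LIMIT 100
  ) AS tv
) AS u
INNER JOIN table_manufacture_line AS ml USING (line_id)
INNER JOIN (SELECT timezone FROM table_companies WHERE company_id=#) AS tz
ORDER BY u.datetime DESC
LIMIT 0,100
The outer sort now crunches only 200 pre-limited rows instead of the full product tables.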
The comment dialog in your question indicates to me that you may have multiple product types, not just two, and that you have complex selection criteria for your paged display. Using UNION ALL on large numbers of rows slams performance: it converts multiple indexed tables into an internal list of rows that simply can't be searched efficiently.
You really should consider putting your two kinds of product data in a single table instead of having to UNION ALL multiple product tables. The setup you have now is inflexible and won't scale up easily. If you structure your schema with a master product table and perhaps some attribute tables for product-specific information, you will find yourself much happier two years from now. Seriously. Please consider making the change.
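One possible shape for that refactoring, with a master product table plus a narrow attribute table for type-specific columns (all names and types here are assumptions, not from the question):
CREATE TABLE table_product (
  product_id        INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  product_type      ENUM('stereo','television') NOT NULL,
  line_id           INT NOT NULL,
  company_id        INT NOT NULL,
  assembly_datetime DATETIME NOT NULL,
  serial_number     VARCHAR(64),
  INDEX id_date (company_id, assembly_datetime)   -- supports the WHERE + ORDER BY above
);

CREATE TABLE table_product_television_attr (
  product_id      INT NOT NULL PRIMARY KEY,
  warranty_expiry DATE,
  FOREIGN KEY (product_id) REFERENCES table_product (product_id)
);
With this shape, the paging query becomes a single indexed scan with no UNION at all.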
Remember: index fast, data slow. Use joins over nested queries. Nested queries return all of the data fields, whereas joins just consider the filters (which should all be indexed; make sure there's a unique index on table_product_*.line_id). It's been a while, but I'm pretty sure you can join ON company_id=#, which should cut down the results early on.
In this case, all of the results refer to the same company (or a much smaller subset) so it makes sense to run that query separately (and it makes the query more maintainable).
So your data source would be:
(SELECT prod.line_id, prod.assembly_datetime, ml.line_name
 FROM table_product_stereos AS prod
 INNER JOIN table_manufacture_line AS ml ON prod.line_id = ml.line_id AND prod.company_id=#
 UNION
 SELECT prod.line_id, prod.assembly_datetime, ml.line_name
 FROM table_product_televisions AS prod
 INNER JOIN table_manufacture_line AS ml ON prod.line_id = ml.line_id AND prod.company_id=#)
Add further prod or ml fields to the two select lists as required.
PHP is not a solution at all...
Redis can be a solution.
But the main thing I would change is the index creation for the tables (add the missing indexes). If you're running into temp tables, you didn't create your indexes well. And 100k rows is not much at all.
But I can't help you without the table creation statements as well as the queries you run.
Make sure your WHERE clause matches your B-tree index from left to right.
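As an illustration of that left-to-right rule (the index name and literal values here are mine):
CREATE INDEX idx_co_date ON table_product_stereos (company_id, assembly_datetime);

-- Can use the index (the leading column is constrained):
SELECT * FROM table_product_stereos
WHERE company_id = 42 AND assembly_datetime >= '2018-01-01';

-- Cannot use it on its own (the leading company_id column is absent):
SELECT * FROM table_product_stereos
WHERE assembly_datetime >= '2018-01-01';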

How to quickly SELECT 3 random records from a 30k MySQL table with a where filter by a single query?

Well, this is a very old question that never got a real solution. We want 3 random rows from a table with about 30k records. The table is not so big from MySQL's point of view, but if it represents the products of a store, it's representative. The random selection is useful when one presents 3 random products on a webpage, for example. We would like a single SQL-string solution that meets these conditions:
In PHP, the recordset returned by PDO or MySQLi must have exactly 3 rows.
They have to be obtained by a single MySQL query, without stored procedures.
The solution must be quick, since on a busy Apache2 server the MySQL query is in many situations the bottleneck. So it has to avoid temporary table creation, etc.
The 3 records must not be contiguous, i.e., they must not be in the vicinity of one another.
The table has the following fields:
CREATE TABLE Products (
ID INT(8) NOT NULL AUTO_INCREMENT,
Name VARCHAR(255) default NULL,
HasImages INT default 0,
...
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
The WHERE constraint is Products.HasImages=1, permitting us to fetch only records that have images available to show on the webpage. About one-third of the records meet the HasImages=1 condition.
Searching for perfection, we first set aside the existing solutions that have drawbacks:
I. This basic solution using ORDER BY RAND() is too slow, but guarantees 3 really random records on each query:
SELECT ID, Name FROM Products WHERE HasImages=1 ORDER BY RAND() LIMIT 3;
*CPU about 0.10s, scanning 9690 rows because of the WHERE clause (Using where; Using temporary; Using filesort), on a Debian Squeeze dual-core Linux box. Not so bad, but not scalable to a bigger table, since a temporary table and a filesort are used, and it takes 8.52s for the first query on the test Windows7/MySQL system. With such poor performance it should be avoided for a webpage, shouldn't it?
II. The bright solution of riedsio using JOIN ... RAND(),
from MySQL select 10 random rows from 600K rows fast, adapted here, is only valid for a single random record, as the following query almost always results in contiguous records. In effect it gets only a random set of 3 records contiguous in ID:
SELECT Products.ID, Products.Name
FROM Products
INNER JOIN (SELECT (RAND() * (SELECT MAX(ID) FROM Products)) AS ID)
AS t ON Products.ID >= t.ID
WHERE (Products.HasImages=1)
ORDER BY Products.ID ASC
LIMIT 3;
*CPU about 0.01 - 0.19s, scanning 3200, 9690, 12000 rows or so randomly, but mostly 9690 records, Using where.
III. The best solution seems to be the following, with WHERE ... RAND(),
seen on MySQL select 10 random rows from 600K rows fast, proposed by bernardo-siu:
SELECT Products.ID, Products.Name FROM Products
WHERE ((Products.HasImages=1) AND RAND() < 16 * 3/30000) LIMIT 3;
*CPU about 0.01 - 0.03s, scanning 9690 rows, Using where.
Here 3 is the number of wished rows, 30000 is the record count of the table Products, and 16 is an experimental coefficient chosen to enlarge the selection enough to warrant getting 3 records. I don't know on what basis the factor 16 is an acceptable approximation.
In the majority of cases we thus get 3 random records, and it's very quick, but it's not guaranteed: sometimes the query returns only 2 rows, sometimes even no records at all.
The three methods above scan all records of the table meeting the WHERE clause, here 9690 rows.
A better SQL String?
Ugly, but quick and random. It can become very ugly very fast, especially with the tuning described below, so make sure you really want it this way.
(SELECT Products.ID, Products.Name
FROM Products
INNER JOIN (SELECT RAND()*(SELECT MAX(ID) FROM Products) AS ID) AS t ON Products.ID >= t.ID
WHERE Products.HasImages=1
ORDER BY Products.ID
LIMIT 1)
UNION ALL
(SELECT Products.ID, Products.Name
FROM Products
INNER JOIN (SELECT RAND()*(SELECT MAX(ID) FROM Products) AS ID) AS t ON Products.ID >= t.ID
WHERE Products.HasImages=1
ORDER BY Products.ID
LIMIT 1)
UNION ALL
(SELECT Products.ID, Products.Name
FROM Products
INNER JOIN (SELECT RAND()*(SELECT MAX(ID) FROM Products) AS ID) AS t ON Products.ID >= t.ID
WHERE Products.HasImages=1
ORDER BY Products.ID
LIMIT 1)
First row appears more often than it should
If you have big gaps between IDs in your table, rows right after such gaps will have a bigger chance of being fetched by this query. In some cases, they will appear significantly more often than they should. This cannot be solved in general, but there's a fix for a common particular case: when there's a gap between 0 and the first existing ID in the table.
Instead of subquery (SELECT RAND()*<max_id> AS ID) use something like (SELECT <min_id> + RAND()*(<max_id> - <min_id>) AS ID)
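Spelled out against the Products table, one way to write that fix (a sketch) lets MySQL compute the bounds itself:
SELECT Products.ID, Products.Name
FROM Products
INNER JOIN (
  SELECT mn + RAND() * (mx - mn) AS ID
  FROM (SELECT MIN(ID) AS mn, MAX(ID) AS mx FROM Products) AS bounds
) AS t ON Products.ID >= t.ID
WHERE Products.HasImages = 1
ORDER BY Products.ID
LIMIT 1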
Remove duplicates
The query, if used as is, may return duplicate rows. It is possible to avoid that by using UNION instead of UNION ALL. This way duplicates will be merged, but the query no longer guarantees to return exactly 3 rows. You can work around that too, by fetching more rows than you need and limiting the outer result like this:
(SELECT ... LIMIT 1)
UNION (SELECT ... LIMIT 1)
UNION (SELECT ... LIMIT 1)
...
UNION (SELECT ... LIMIT 1)
LIMIT 3
There's still no guarantee that 3 rows will be fetched, though. It just makes it more likely.
SELECT Products.ID, Products.Name
FROM Products
INNER JOIN (SELECT (RAND() * (SELECT MAX(ID) FROM Products)) AS ID) AS t ON Products.ID >= t.ID
WHERE (Products.HasImages=1)
ORDER BY Products.ID ASC
LIMIT 3;
Of course the above gives "near"-contiguous records, since you are feeding it the same ID every time without much regard to the seed of the rand function.
This should give more "randomness":
SELECT Products.ID, Products.Name
FROM Products
INNER JOIN (SELECT (ROUND((RAND() * (max-min))+min)) AS ID) AS t ON Products.ID >= t.ID
WHERE (Products.HasImages=1)
ORDER BY Products.ID ASC
LIMIT 3;
Where max and min are two values you choose; let's say, for example's sake:
max = SELECT MAX(ID) FROM Products
min = 225
This statement executes really fast (19 ms on a 30k-record table):
$db = new PDO('mysql:host=localhost;dbname=database;charset=utf8', 'username', 'password');
$stmt = $db->query("SELECT p.ID, p.Name, p.HasImages
FROM (SELECT @count := COUNT(*) + 1, @limit := 3 FROM Products WHERE HasImages = 1) vars
STRAIGHT_JOIN (SELECT t.*, @limit := @limit - 1 FROM Products t WHERE t.HasImages = 1 AND (@count := @count - 1) AND RAND() < @limit / @count) p");
$products = $stmt->fetchAll(PDO::FETCH_ASSOC);
The idea is to "inject" a new column with randomized values and then sort by this column. The generation of, and sorting by, this injected column is much faster than the ORDER BY RAND() command.
There "might" be one caveat: you have to include the WHERE condition twice.
What about creating another table containing only the items that have images? This table will be much lighter, as it will contain only one-third of the items the original table has!
------------------------------------------
|ID | Item ID (on the original table)|
------------------------------------------
|0 | 0 |
------------------------------------------
|1 | 123 |
------------------------------------------
.
.
.
------------------------------------------
|10 000 | 30 000 |
------------------------------------------
You can then generate three random IDs in the PHP part of the code and just fetch them from the database.
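A sketch of that auxiliary table and how to fill it (names are mine):
CREATE TABLE ProductsWithImages (
  ID     INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  ItemID INT NOT NULL   -- ID of the row in the original Products table
);

INSERT INTO ProductsWithImages (ItemID)
SELECT ID FROM Products WHERE HasImages = 1;
Because the IDs here are dense, three random integers between 1 and COUNT(*) map directly to rows, which you then join back to Products via ItemID. Note the table has to be rebuilt (or maintained with triggers) whenever Products changes.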
I've been testing the following bunch of SQLs on a 10M-record, poorly designed database.
SELECT COUNT(ID)
INTO @count
FROM Products
WHERE HasImages = 1;
PREPARE random_records FROM
'(
SELECT * FROM Products WHERE HasImages = 1 LIMIT ?, 1
) UNION (
SELECT * FROM Products WHERE HasImages = 1 LIMIT ?, 1
) UNION (
SELECT * FROM Products WHERE HasImages = 1 LIMIT ?, 1
)';
-- FLOOR keeps each random offset within 0 .. @count - 1
SET @l1 = FLOOR(RAND() * @count);
SET @l2 = FLOOR(RAND() * @count);
SET @l3 = FLOOR(RAND() * @count);
EXECUTE random_records USING @l1, @l2, @l3;
DEALLOCATE PREPARE random_records;
It took almost 7 minutes to get the three results. But I'm sure its performance will be much better in your case. Still, if you are looking for better performance, I suggest the following, which took less than 30 seconds to get the job done (on the same database).
SELECT COUNT(ID)
INTO @count
FROM Products
WHERE HasImages = 1;
PREPARE random_records FROM
'SELECT * FROM Products WHERE HasImages = 1 LIMIT ?, 1';
SET @l1 = FLOOR(RAND() * @count);
SET @l2 = FLOOR(RAND() * @count);
SET @l3 = FLOOR(RAND() * @count);
EXECUTE random_records USING @l1;
EXECUTE random_records USING @l2;
EXECUTE random_records USING @l3;
DEALLOCATE PREPARE random_records;
Bear in mind that both of these commands require the MySQLi driver in PHP if you want to execute them in one go. Their only difference is that the latter requires calling MySQLi's next_result method to retrieve all three results.
My personal belief is that this is the fastest way to do this.
On the off-chance that you're willing to accept an 'outside the box' type of answer, I'm going to repeat what I said in some of the comments.
The best way to approach your problem is to cache your data in advance (be that in an external JSON or XML file, or in a separate database table, possibly even an in-memory table).
This way you can schedule the performance hit on the products table for times when you know the server will be quiet, and stop worrying about creating a performance hit at "random" times when a visitor arrives at your site.
I'm not going to suggest an explicit solution, because there are far too many possibilities for how to build one. However, the answer suggested by @ahmed is not silly. If you don't want to create a join in your query, then simply load more of the data that you require into the new table instead.

inner join large table with small table, how to speed up

Dear PHP and MySQL experts,
I have two tables: one large table for post articles with 200,000 records (indexed column: sid), and one small table for topics with 20 records (indexed column: topicid). They share topicid.
Currently I'm using the following (it takes around 0.4s):
+ get the last 50 records from the table:
SELECT sid, aid, title, time, topic, informant, ihome, alanguage, counter, type, images, chainid FROM veryzoo_stories ORDER BY sid DESC LIMIT 0,50
+ then do a while loop over the records to find the matching topic name for each post:
while ( .. ) {
SELECT topicname FROM veryzoo_topics WHERE topicid='$topic'
....
}
+ Now I'm going to use an INNER JOIN to speed up the process, but in my tests it took much longer: from 1.5s up to 3.5s:
SELECT a.sid, a.aid, a.title, a.time, a.topic, a.informant, a.ihome, a.alanguage, a.counter, a.type, a.images, a.chainid, t.topicname FROM veryzoo_stories a INNER JOIN veryzoo_topics t ON a.topic = t.topicid ORDER BY sid DESC LIMIT 0,50
It looks like the inner join joins all 200k records from the two tables first and then limits the result to 50, which takes a long time.
Please help point me to the right way of doing this,
e.g. take the last 50 records from table one, then join them to table 2, etc.
Do not use an inner join unless the two tables share the same primary key, or you'll get duplicate values (and of course a slower query).
Please try this :
SELECT *
FROM (
SELECT a.sid, a.aid, a.title, a.time, a.topic, a.informant, a.ihome, a.alanguage, a.counter, a.type, a.images, a.chainid
FROM veryzoo_stories a
ORDER BY sid DESC
LIMIT 0, 50
) AS b
INNER JOIN veryzoo_topics t ON b.topic = t.topicid
I made a small test and it seems to be faster. It uses a subquery (nested query) to first select the 50 records and then join.
Also make sure that veryzoo_stories.sid, veryzoo_stories.topic and veryzoo_topics.topicid are indexes (and that the relation exists if you use InnoDB). It should improve the performance.
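If any of those indexes are missing, they can be added along these lines (a sketch; sid is presumably already the primary key of veryzoo_stories):
ALTER TABLE veryzoo_stories ADD INDEX idx_topic (topic);
ALTER TABLE veryzoo_topics ADD INDEX idx_topicid (topicid);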
Now it leaves the problem of the ORDER BY LIMIT. It is heavy because it orders the 200,000 records before selecting. I guess it's necessary. The indexes are very important when using ORDER BY.
Here is an article on the problem : ORDER BY … LIMIT Performance Optimization
I just gave the nested query + inner join a test and was surprised that performance increased so much: it now takes only 0.22s. Here is my query:
SELECT a.*, t.topicname
FROM (SELECT sid, aid, title, time, topic, informant, ihome, alanguage, counter, type, images, chainid
FROM veryzoo_stories
ORDER BY sid DESC
LIMIT 0, 50) a
INNER JOIN veryzoo_topics t ON a.topic = t.topicid
If no more solutions come up, I may use this one. Thanks to anyone who looks at this post.
