Mysql fetch from last to first - [many records] - php

I want to fetch records from MySQL starting from the last to the first, with LIMIT 20. My database has over 1M records. I am aware of ORDER BY, but when I use it, loading 20 records takes forever and I have no idea why. I think MySQL fetches all the records before ordering them.
SELECT bookings.created_at, bookings.total_amount,
passengers.name, passengers.id_number, payments.amount,
passengers.ticket_no,bookings.phone,bookings.source,
bookings.destination,bookings.date_of_travel FROM bookings
INNER JOIN passengers ON bookings.booking_id = passengers.booking_id
INNER JOIN payments on payments.booking_id = bookings.booking_id
ORDER BY bookings.booking_id DESC LIMIT 10

I suppose the time is satisfactory if you execute the query without the ORDER BY?
You might try creating an index on the column you are ordering by:
create index idx_bookings_booking_id on bookings(booking_id)

You can find out how the query is executed using
EXPLAIN SELECT bookings.created_at, bookings.total_amount,
passengers.name, passengers.id_number, payments.amount,
passengers.ticket_no,bookings.phone,bookings.source,
bookings.destination,bookings.date_of_travel FROM bookings
INNER JOIN passengers ON bookings.booking_id = passengers.booking_id
INNER JOIN payments on payments.booking_id = bookings.booking_id
ORDER BY bookings.booking_id DESC LIMIT 10
Then check that the proper indexes have been created on each table:
SHOW INDEX FROM `db_name`.`table_name`;
If an index is not there, create the proper index on each table involved.
Please add anything that is missing.

The index needs to be able to reside in memory, if I'm not mistaken (a filesort is much slower than an in-memory lookup).
Use a small index / column size.
To double the positive range, use UNSIGNED columns if you need no negative values.
Tune sort_buffer_size and read_rnd_buffer_size (probably better at the connection level than globally).
See https://dev.mysql.com/doc/refman/5.7/en/order-by-optimization.html , particularly regarding using EXPLAIN and maybe trying another execution plan strategy.

You seem to need another workaround, something like a materialized view.
Tell me if this sounds like it:
Create another table like the booking table, e.g. CREATE TABLE booking_short LIKE booking, though you only need the booking_id column.
Then check your code for where exactly you create bookings, i.e. where you first insert into booking. There, run SELECT COUNT(*) FROM booking_short; if the count is greater than 20, delete the oldest record. Then insert the new booking_id.
You can select the IDs from this small table and join from there to the rest of the tables for more details.
You won't need LIMIT or sorting.
Of course, this needs heavy documentation to avoid maintenance problems.
Either that or https://stackoverflow.com/a/5912827/6288442
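The "select the ID and join from there" idea can also be tried without an extra table, by picking the newest IDs in a subquery and joining only those rows. Below is a minimal sketch of that idea. It uses Python with SQLite as a stand-in so it runs anywhere; the table and column names follow the question, while the sample rows and the latest_bookings helper are made up for illustration. Whether this helps on your MySQL server is something to confirm with EXPLAIN.

```python
import sqlite3

# SQLite stands in for MySQL here; the sample data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bookings (booking_id INTEGER PRIMARY KEY, phone TEXT);
    CREATE TABLE passengers (booking_id INTEGER, name TEXT);
""")
conn.executemany("INSERT INTO bookings VALUES (?, ?)",
                 [(i, f"phone-{i}") for i in range(1, 101)])
conn.executemany("INSERT INTO passengers VALUES (?, ?)",
                 [(i, f"passenger-{i}") for i in range(1, 101)])


def latest_bookings(conn, limit=10):
    """Pick the newest booking ids first, then join only those rows."""
    sql = """
        SELECT b.booking_id, b.phone, p.name
        FROM (SELECT booking_id FROM bookings
              ORDER BY booking_id DESC LIMIT ?) AS latest
        JOIN bookings b ON b.booking_id = latest.booking_id
        JOIN passengers p ON p.booking_id = b.booking_id
        ORDER BY b.booking_id DESC
    """
    return conn.execute(sql, (limit,)).fetchall()


rows = latest_bookings(conn, limit=10)
```

The inner query only touches the booking_id index, so the sort works on IDs instead of on the full joined rows.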

Related

How to reduce subquery execution time...?

I want the per-day count of items sold, so I created the query below, but it takes too long, around 55.585s.
Query:
SELECT
td.db_date,
(
select count(*) from order as order where DATE(order.created_on) = td.db_date
)as day_contribute
FROM time_dimension as td
Can anyone please let me know how I can optimize this query and reduce the execution time?
You can rewrite your query as a join:
SELECT
td.db_date, count(`order`.id) as day_contribute
FROM time_dimension as td
LEFT JOIN `order` ON DATE(`order`.created_on) = td.db_date
GROUP BY td.db_date;
I do not know the primary key of the order table, so I used "order.id"; replace it with yours. Note that order is a reserved word in MySQL, so the table name has to be quoted with backticks.
Also, very importantly, check that you have an index on the td.db_date field.
One more important thing: it is better to avoid DATE(order.created_on), because DATE() is then called for every row the database compares, and an ordinary index on created_on cannot be used for that comparison. If possible, convert order.created_on to the same format as td.db_date, or join on other fields. That will add speed too.
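One way to avoid wrapping the column in DATE() is to compare created_on against a half-open range covering the day. Here is a small sketch of that, written in Python with SQLite as a stand-in for MySQL. The table is named orders here (an assumption, to sidestep the reserved word), and day_bounds and count_orders_on are hypothetical helpers, not anything from the question.

```python
import sqlite3
from datetime import date, datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"


def day_bounds(day):
    """Return (start of day, start of next day) as datetimes."""
    start = datetime.combine(day, datetime.min.time())
    return start, start + timedelta(days=1)


# SQLite stands in for MySQL; the rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, created_on TEXT)")
conn.execute("CREATE INDEX idx_orders_created_on ON orders (created_on)")
conn.executemany("INSERT INTO orders (created_on) VALUES (?)", [
    ("2014-11-13 23:59:59",),
    ("2014-11-14 00:00:00",),
    ("2014-11-14 18:30:00",),
    ("2014-11-15 00:00:00",),
])


def count_orders_on(conn, day):
    """Count orders on one day without applying a function to the column."""
    start, end = day_bounds(day)
    row = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE created_on >= ? AND created_on < ?",
        (start.strftime(FMT), end.strftime(FMT)),
    ).fetchone()
    return row[0]


count = count_orders_on(conn, date(2014, 11, 14))
```

Because the bare column is compared against constants, the database is free to use the index on created_on for the range.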
First, make sure you have an index on the created_on column in the order table.
However, if you have many records in time_dimension and many records in the order table, the query might be hard to optimize, because for each record from time_dimension you need to search the order table.
You can also change count(*) into count(order_id) (assuming the primary key of the order table is order_id), or add an extra date-only column to the order table (created_on_date, with an index on it), so your query could look like this:
SELECT
td.db_date,
(
select count(order_id) from order where order.created_on_date = td.db_date
)as day_contribute
FROM time_dimension as td
However, the execution time might still be too high if you have many records in both tables, so it might be necessary to create one extra table that holds the number of orders for each day and update it from cron or whenever records in the order table are added, updated, or deleted.
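The extra per-day table could look roughly like the sketch below. It is written in Python with SQLite standing in for MySQL; daily_order_counts, refresh_daily_counts and sales_per_day are hypothetical names, the table is called orders to avoid the reserved word, and the full rebuild is only the simplest possible refresh strategy (a cron job could just as well update recent days only).

```python
import sqlite3

# SQLite stands in for MySQL; all rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE time_dimension (db_date TEXT PRIMARY KEY);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, created_on TEXT);
    CREATE TABLE daily_order_counts (
        order_date TEXT PRIMARY KEY,
        order_count INTEGER NOT NULL
    );
    INSERT INTO time_dimension VALUES
        ('2014-11-13'), ('2014-11-14'), ('2014-11-15');
    INSERT INTO orders (created_on) VALUES
        ('2014-11-13 09:00:00'), ('2014-11-13 17:45:00'),
        ('2014-11-15 12:00:00');
""")


def refresh_daily_counts(conn):
    """Rebuild the summary table; meant to run from cron, not per request."""
    with conn:  # one transaction for the delete plus the insert
        conn.execute("DELETE FROM daily_order_counts")
        conn.execute("""
            INSERT INTO daily_order_counts (order_date, order_count)
            SELECT date(created_on), COUNT(*)
            FROM orders
            GROUP BY date(created_on)
        """)


def sales_per_day(conn):
    """Report every calendar day, with 0 for days that have no orders."""
    return conn.execute("""
        SELECT td.db_date, COALESCE(d.order_count, 0)
        FROM time_dimension td
        LEFT JOIN daily_order_counts d ON d.order_date = td.db_date
        ORDER BY td.db_date
    """).fetchall()


refresh_daily_counts(conn)
report = sales_per_day(conn)
```

The report query then joins two small tables on an indexed key, and the expensive scan of the order table happens only during the refresh.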

How to improve Mysql database performance without changing the db structure

I have a database that is already in use and I have to improve the performance of the system that's using this database.
There are 2 major queries that run about 1000 times in a loop, and each of these queries has inner joins to 3 other tables. This in turn is making the system very slow.
I tried removing the queries from the loop, fetching all the data only once, and processing it in PHP. But this puts too much load on memory (RAM), and the system hangs if 2 or more clients try to use it.
There is a lot of data in the tables even after removing the expired data.
I have attached the query below.
Can anyone help me with this issue ?
select * from inventory
where (region_id = 38 or region_id = -1)
and (tour_opp_id = 410 or tour_opp_id = -1)
and room_plan_id = 141 and meal_plan_id = 1 and bed_type_id = 1 and hotel_id = 1059
and FIND_IN_SET(supplier_code, 'QOA,QTE,QM,TEST,TEST1,MQE1,MQE3,PERR,QKT')
and ( ('2014-11-14' between from_date and to_date) )
order by hotel_id desc ,supplier_code desc, region_id desc,tour_opp_id desc,inventory.inventory_id desc
SELECT * ,pinfo.fri as pi_day_fri,pinfoadd.fri as pa_day_fri,pinfochld.fri as pc_day_fri
FROM `profit_markup`
inner join profit_markup_info as pinfo on pinfo.profit_id = profit_markup.profit_markup_id
inner join profit_markup_add_info as pinfoadd on pinfoadd.profit_id = profit_markup.profit_markup_id
inner join profit_markup_child_info as pinfochld on pinfochld.profit_id = profit_markup.profit_markup_id
where profit_markup.hotel_id = 1059 and (`booking_channel` = 1 or `booking_channel` = 2)
and (`rate_region` = -1 or `rate_region` = 128)
and ( ( period_from <= '2014-11-14' and period_to >= '2014-11-14' ) )
ORDER BY profit_markup.hotel_id DESC,supplier_code desc, rate_region desc,operators_list desc, profit_markup_id DESC
Since we have not seen your SHOW CREATE TABLE output and EXPLAIN EXTENDED plan, it is hard to give you one definitive answer.
But generally speaking, regarding your first query (which I rewrote below):
SELECT
hotel_id, supplier_code, region_id, tour_opp_id, inventory_id
FROM
inventory
WHERE
region_id IN (38, -1)
AND tour_opp_id IN (410, -1)
AND room_plan_id = 141
AND meal_plan_id = 1
AND bed_type_id = 1
AND hotel_id = 1059
AND supplier_code IN ('QOA', 'QTE', 'QM', 'TEST', 'TEST1', 'MQE1', 'MQE3', 'PERR', 'QKT')
AND ('2014-11-14' BETWEEN from_date AND to_date )
ORDER BY
hotel_id DESC, supplier_code DESC, region_id DESC, tour_opp_id DESC, inventory_id DESC
1. Do not use * to get all the columns; list the columns that you really need. Using * is just a lazy way of writing a query, and limiting the columns limits the amount of data being selected.
2. How often are the records in the inventory table updated, inserted, or deleted? If not too often, you can consider using SQL_CACHE. However, caching the query will cause you problems if the inventory table is updated very often. In addition, to use the query cache you must check the value of query_cache_type on your server: SHOW GLOBAL VARIABLES LIKE 'query_cache_type';. If it is set to 0 (OFF), the cache feature is disabled and SQL_CACHE will be ignored. If it is set to 1 (ON), the server will cache all queries unless you tell it not to using SQL_NO_CACHE. If it is set to 2 (DEMAND), MySQL will cache a query only where the SQL_CACHE clause is used. Note that the query cache was removed in MySQL 8.0, so this applies only to older servers. See the documentation for query_cache_type.
3. An index on the following columns, in this order, will help you: (hotel_id, supplier_code, region_id, tour_opp_id, inventory_id)
ALTER TABLE inventory
ADD INDEX (hotel_id, supplier_code, region_id, tour_opp_id, inventory_id);
4. If possible, increase sort_buffer_size on your server, as most likely your issue here is that you are doing too much sorting.
As for the second query (which I rewrote below):
SELECT
*, pinfo.fri as pi_day_fri,
pinfoadd.fri as pa_day_fri,
pinfochld.fri as pc_day_fri
FROM
profit_markup
INNER JOIN
profit_markup_info AS pinfo ON pinfo.profit_id = profit_markup.profit_markup_id
INNER JOIN
profit_markup_add_info AS pinfoadd ON pinfoadd.profit_id = profit_markup.profit_markup_id
INNER JOIN
profit_markup_child_info AS pinfochld ON pinfochld.profit_id = profit_markup.profit_markup_id
WHERE
profit_markup.hotel_id = 1059
AND booking_channel IN (1, 2)
AND rate_region IN (-1, 128)
AND period_from <= '2014-11-14'
AND period_to >= '2014-11-14'
ORDER BY
profit_markup.hotel_id DESC, supplier_code DESC, rate_region DESC,
operators_list DESC, profit_markup_id DESC
1. Again, eliminate the use of * from your query.
2. Make sure that the following columns have the same type, collation, and size: pinfo.profit_id, profit_markup.profit_markup_id, pinfoadd.profit_id, and pinfochld.profit_id, and that each one is indexed in its table. If the columns have different types, MySQL has to convert the data every time it joins the records, which is slower even with an index. Also, if those columns are character types (i.e. VARCHAR()), consider making them CHAR() with a collation of latin1_general_ci, as this will be faster for looking up IDs; if you are using INT(), even better.
3. Use the 3rd and 4th trick I listed for the previous query.
4. Try using STRAIGHT_JOIN (you must know what you're doing here or it will bite you!). There is a good thread about this: "When to use STRAIGHT_JOIN with MySQL".
I hope this helps.
For the first query, I am not sure if you can do much (assuming you have already indexed the fields you are ordering by) apart from replacing the * with column names (Don't expect this to increase the performance drastically).
For the second query, before you go through the loop and put in selection arguments, you could create a view with all the tables joined and ordered then make a prepared statement to select from the view and bind arguments in the loop.
Also, if your php server and the database server are in two different places, it is better if you did the selection through a stored procedure in the database.
(If nothing works out, then memcache is the way to go... Although I have personally never done this)
Here you need to improve query performance, not database performance.
For both queries, first check that indexes are available on the columns used in the WHERE and ON (join) clauses; if an index is missing, add it to improve query performance.
Check the EXPLAIN plan before creating an index.
If possible, show me the EXPLAIN plan of both queries; that will help us.
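One more option that none of the answers above spell out, so treat it as an assumption to test: a middle ground between 1000 single queries and one giant fetch is to query in chunks, binding a batch of IDs per statement and streaming the rows. A rough sketch in Python follows, with SQLite standing in for MySQL. The simplified profit_markup table, the sample rows, and the chunks and fetch_markups helpers are all hypothetical.

```python
import sqlite3


def chunks(items, size):
    """Yield consecutive slices of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def fetch_markups(conn, hotel_ids, chunk_size=100):
    """Yield rows for many hotels using one query per chunk, not per hotel."""
    for chunk in chunks(hotel_ids, chunk_size):
        placeholders = ", ".join("?" for _ in chunk)
        sql = (
            "SELECT hotel_id, profit_markup_id FROM profit_markup "
            f"WHERE hotel_id IN ({placeholders}) "
            "ORDER BY hotel_id, profit_markup_id"
        )
        # Rows are yielded as they arrive, so only one chunk's result
        # set is in flight at a time.
        yield from conn.execute(sql, chunk)


# SQLite stands in for MySQL; one invented markup row per hotel.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE profit_markup ("
    "profit_markup_id INTEGER PRIMARY KEY, hotel_id INTEGER)"
)
conn.execute("CREATE INDEX idx_pm_hotel ON profit_markup (hotel_id)")
conn.executemany("INSERT INTO profit_markup (hotel_id) VALUES (?)",
                 [(h,) for h in range(1, 251)])

rows = list(fetch_markups(conn, list(range(1, 251)), chunk_size=100))
```

With 250 hotels and a chunk size of 100 this issues 3 queries instead of 250; the chunk size is the knob for trading round trips against memory.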

Mysql query execution time optimization for large data table

I am executing a MySQL query to search car information in a table with 530399 records.
Executing the query takes a very long time:
SELECT c.* FROM CarInfo as c WHERE (c.Vehicle_Year<='2014') and c.Vehicle_Age_Type='USED' limit 0,15 .
I need all the fields from the table, so I am using *.
My table has 36 columns. Is there any way to optimize this query?
After adding an index it loads fast with LIMIT, but it still takes time when I try to get the total count:
SELECT count(*) as total FROM CarInfo as c WHERE (c.Vehicle_Year<='2014') and c.Vehicle_Dealer_Zip in(85320,85354,85541) and (c.Vehicle_age_type='New' or c.Vehicle_age_type='Used' or c.Vehicle_age_type='Certified Used')
Dealer_Zip may contain many values.
Thanks in advance.
It looks like your table is missing indexes; if so, you need to add them first.
Before adding an index, check whether it is already there using the following command:
show indexes from CarInfo
In the output, see whether Vehicle_Year and Vehicle_Age_Type are indexed. Since you mentioned the query needs optimization, I guess the indexes are missing.
As the next step, add the index:
alter table CarInfo add index type_year_idx (Vehicle_Age_Type,Vehicle_Year);
NOTE: Take a backup of the table before adding the index.
Then reframe the query as
SELECT c.*
FROM CarInfo as c
WHERE c.Vehicle_Age_Type='USED'
AND c.Vehicle_Year<='2014' limit 0,15 ;
In addition, whenever you feel a query is taking a long time, use EXPLAIN to see what the query is doing so you can plan the optimization. The syntax for your current query looks like this:
EXPLAIN
SELECT c.*
FROM CarInfo as c
WHERE
c.Vehicle_Year<='2014'
AND c.Vehicle_Age_Type='USED'
limit 0,15 ;

MySQL Query Optimization - Random Record

I'm having a terrible time with a MySQL query. I've spent most of my weekend and most of my day today attempting to make this query run a bit faster. I've made it considerably faster, but I know I can make it better.
SELECT m.id,other_fields,C.contacts_count FROM marketingDatabase AS m
LEFT OUTER JOIN
(SELECT COUNT(*) as contacts_count, rid
FROM contacts
WHERE status = 'Active' AND install_id = 'XXXX' GROUP BY rid) as C
ON C.rid = m.id
WHERE (RAND()*2612<50)
AND do_not_call != 'true'
AND `ACTUAL SALES VOLUME` >= '800000'
AND `ACTUAL SALES VOLUME` <= '1200000'
AND status = 'Pending'
AND install_id = 'XXXXX'
ORDER BY RAND()
I have an index on 'install_id', 'category' and 'status' but the EXPLAIN shows it was sorting based on 9100 rows.
My Explain is here:
https://s3.amazonaws.com/jas-so-question/Screen+Shot+2012-03-13+at+12.34.04+AM.png
Anybody have any suggestions on what I can do to make this a bit faster? The entire point of the query is to select a random record from an account's records (install_id) that matches certain criteria like sales volume, status and do_not_call. I'm currently gathering 25 records and caching it (using PHP) so I only have to run this query once every 25 requests, but I'm already dealing with thousands of requests per day. It currently takes 0.2 seconds to run. I realize that by using ORDER BY RAND() I'm already taking a major performance hit, but it's just sorting 25 rows.
Thanks in advance for the help.
**EDIT: I forgot to mention that the 'contact_sort' index is on the 'contacts' table, and indexes install_id, status, and rid. (rid references Record ID in marketingDatabase so it knows which record a contact belongs to.)
**EDIT 2: The 2612 number in the query represents the number of rows in marketingDatabase that match the criteria (install_id, status, actual sales volume, etc.)
Since I do not see your index definitions, I am not sure they are correct. The query would benefit from the following indexes:
a composite index (install_id, status, rid) on the contacts
a composite index (install_id, status, `ACTUAL SALES VOLUME`) on marketingDatabase
I played around with a few queries, and I don't think you'll ever be able to get an indexed query to work with RAND(), especially when you're using it in both a WHERE clause and an ORDER BY clause. If at all possible, I'd introduce the random element in my PHP logic, and probably look at whether two simple queries made more sense than one fairly complex one. Added to that, you have a LEFT OUTER JOIN on a random result set, which may also considerably increase the amount of work that has to be done.
In summary, my guess would be - rewrite to exclude RAND, see if you can get rid of the LEFT OUTER JOIN. Two straightforward indexed queries with a bit of PHP in between may be a lot better.
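Here is roughly what "two straightforward queries with a bit of application logic in between" might look like. The sketch is in Python rather than PHP, with SQLite standing in for MySQL. The sales_volume column is a renamed stand-in for `ACTUAL SALES VOLUME`, the sample rows are invented, and random_records is a hypothetical helper; the same shape should carry over to PHP with array_rand or similar.

```python
import random
import sqlite3

# SQLite stands in for MySQL; the data below is invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE marketingDatabase ("
    "id INTEGER PRIMARY KEY, install_id TEXT, status TEXT, "
    "do_not_call TEXT, sales_volume INTEGER)"
)
conn.execute(
    "CREATE INDEX idx_install_status_volume "
    "ON marketingDatabase (install_id, status, sales_volume)"
)
conn.executemany(
    "INSERT INTO marketingDatabase "
    "(install_id, status, do_not_call, sales_volume) VALUES (?, ?, ?, ?)",
    [("XXXXX", "Pending" if i % 2 else "Done", "false", 800000 + i * 1000)
     for i in range(200)],
)


def random_records(conn, install_id, how_many=25, rng=random):
    """Query 1 finds matching ids, the app picks at random, query 2 loads rows."""
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM marketingDatabase "
        "WHERE install_id = ? AND status = 'Pending' "
        "AND do_not_call != 'true' "
        "AND sales_volume BETWEEN 800000 AND 1200000",
        (install_id,),
    )]
    picked = rng.sample(ids, min(how_many, len(ids)))
    if not picked:
        return []
    placeholders = ", ".join("?" for _ in picked)
    return conn.execute(
        "SELECT id, status, sales_volume FROM marketingDatabase "
        f"WHERE id IN ({placeholders})",
        picked,
    ).fetchall()


records = random_records(conn, "XXXXX")
```

Neither query uses RAND(), so both can be served from indexes, and the contacts count could be fetched afterwards for just the picked IDs.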

MYSQL query optimization

I'm trying to optimize a report query run on an ecommerce site. I'm pretty sure that I'm doing something stupid, since this query shouldn't be taking nearly as long to run as it does.
The query in question is:
SELECT inventories_name, inventories_code, SUM(shop_orders_inventories_qty) AS qty,
SUM(shop_orders_inventories_price) AS tot_price, inventories_categories_name,
inventories_price_list, inventories_id
FROM shop_orders
LEFT JOIN shop_orders_inventories ON (shop_orders_id = join_shop_orders_id)
LEFT JOIN inventories ON (join_inventories_id = inventories_id)
WHERE {$date_type} BETWEEN '{$start_date}' AND '{$end_date}'
AND shop_orders_x_response_code = 1
GROUP BY join_inventories_id, join_shop_categories_id
{$order}
{$limit}
It's basically trying to get total sales per item over a period of time; values in curly brackets are filled in via a form. It works fine for a period of a couple days, but querying a time interval of a week or more can take 30 seconds+.
I feel like it's joining way too many rows in order to calculate the aggregate values and sucking up huge amounts of memory, but I'm not sure how to limit it.
Note - I realize that I'm selecting fields which aren't in the group by, but they correspond 1-1 with inventory ID, which is in the group by.
Any suggestions?
-- Edit --
The current indices are:
inventories:
join_categories - BTREE
inventories_name, inventories_code, inventories_description - FULLTEXT
shop_orders_inventories:
shop_orders_inventories_id - BTREE
shop_orders:
shop_orders_id - BTREE
Two sequential left joins will take quite long on a big table. Try to use "join" instead of "left join" (unless you have records in shop_orders with no matching records in shop_orders_inventories or inventories), or split this query into a couple of small ones. Also, by using "sum" and "group by" you are forcing MySQL to create temp tables; you might want to increase tmp_table_size and max_heap_table_size so those tables fit into memory (otherwise MySQL will dump them to disk, which also increases the execution time).
The first and foremost rule to indexing is... index the columns that you will search on!
For each possible value of {$date_type}, create an index for that date column.
Once you have lots of data in the table (say 2 years or 100 weeks), a single week's data is 1% of the index, so it becomes a good starting point.
Even though MySQL allows non-aggregated columns in the SELECT clause that are not in the GROUP BY, I personally would sync the two:
SELECT inventories_name, inventories_code,
SUM(shop_orders_inventories_qty) AS qty,
SUM(shop_orders_inventories_price) AS tot_price,
inventories_categories_name, inventories_price_list, inventories_id
FROM ...
GROUP BY inventories_id, join_shop_categories_id, inventories_name,
inventories_code, inventories_categories_name, inventories_price_list
...
