Speed up slow 3 table MYSQL query

I'm querying 3 tables in an eCommerce site.
orders Table:
id
order_number
name
etc...
order_lines table:
id
order_number
sku
quantity
etc.
products Table
id
sku
title
ship_by (INT)
etc.
order_number links orders table to order_lines table. SKU links order_lines table to products table.
Note the ship_by column in the products table: it denotes which supplier will ship the item. The query needs to pull orders for a particular supplier. Orders may contain items sold by different suppliers.
This is the query I managed to cobble together:
SELECT
orders.`order_number` as `orderId`,
orders.`shipping` as `fPostageCost`,
FROM_UNIXTIME(orders.`time`) as `dReceievedDate`,
(
CASE orders.`dob`
WHEN 'COURIER 3 DAY' THEN 'UK_SellersStandardRate'
WHEN 'COURIER 24 HOUR' THEN 'UK_OtherCourier24'
ELSE orders.`dob`
END
)
...(plus a number more)
FROM orders
INNER JOIN order_lines
ON orders.order_number = order_lines.order_number
WHERE
(
(SELECT COUNT(order_lines.sku)
FROM order_lines, products
WHERE order_lines.sku = products.sku
AND products.ship_by = 1
AND order_lines.order_number = orders.order_number) > 0
)
AND ( orders.`printed` = 'N' OR orders.`printed` IS NULL )
AND orders.status = 'Awaiting Despatch'
AND ( orders.payment_status = 'Success' OR orders.payment_status = 'Paypal Paid' OR orders.payment_status = 'Manual Payment' )
GROUP BY orders.`order_number`
ORDER BY orders.order_number ASC
It is taking about 7 seconds to execute the query.
If I remove the line 'AND order_lines.order_number = orders.order_number' from the second SELECT query it is executed almost instantly, but without this line it doesn't work as I need it to.
Aside from adding the 'ship_by' column to the order_lines table (which I'd rather not do, since I'd have to change a lot of PHP code), is there any way of modifying this query to speed it up?
The query is being run from outside of PHP, so it has to be a pure MySQL query.
Here is the EXPLAIN on the query:
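id | select_type        | table       | type   | possible_keys   | key       | key_len | ref                       | rows | Extra
---+--------------------+-------------+--------+-----------------+-----------+---------+---------------------------+------+--------------------------------
1  | PRIMARY            | order_lines | ALL    | NULL            | NULL      | NULL    | NULL                      | 9627 | Using temporary; Using filesort
1  | PRIMARY            | orders      | eq_ref | order_idx       | order_idx | 17      | .order_lines.order_number | 1    | Using where
2  | DEPENDENT SUBQUERY | order_lines | ALL    | NULL            | NULL      | NULL    | NULL                      | 9627 | Using where
2  | DEPENDENT SUBQUERY | products    | ref    | sku_2,sku,sku_3 | sku_2     | 63      | order_lines.prod_code     | 11   | Using where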
Thanks

Your order_lines table needs an index on order_number at the very least. We would need to see the schema to best choose one. Perhaps a composite index would pick up speed in other areas.
But in choosing index changes, it must be carefully weighed against other queries in your system, and the impact on insert and update speeds.
The goal shouldn't be to make 10% of your app fast at the expense of the other 90%.
To give us the most useful information, post the output of SHOW CREATE TABLE tableName for each relevant table (rather than describing it free-hand, like "my table has this and that"). We need to see the schema for orders, order_lines, and products.
See the manual page on CREATE INDEX.

I don't know how large the table in your inline view is, but if it is large,
it may help to use a UNION instead of having to query the view for every record.
Something like:
SELECT L1.*
FROM
(SELECT
orders.`order_number` as `orderId`,
0 AS SKU_CNT,
orders.`shipping` as `fPostageCost`,
FROM_UNIXTIME(orders.`time`) as `dReceievedDate`,
(
CASE orders.`dob`
WHEN 'COURIER 3 DAY' THEN 'UK_SellersStandardRate'
WHEN 'COURIER 24 HOUR' THEN 'UK_OtherCourier24'
ELSE orders.`dob`
END
)
...(plus a number more)
FROM orders
INNER JOIN order_lines
ON orders.order_number = order_lines.order_number
WHERE ( orders.`printed` = 'N' OR orders.`printed` IS NULL )
AND orders.status = 'Awaiting Despatch'
AND ( orders.payment_status = 'Success' OR orders.payment_status = 'Paypal Paid' OR orders.payment_status = 'Manual Payment' )
UNION ALL
Select
0 AS ORDERID,
COUNT(OL.SKU) AS SKU_CNT,
0 as `fPostageCost`,
NULL as `dReceievedDate`,
...
FROM ORDER_LINES OL,
PRODUCTS PR
WHERE order_lines.sku = products.sku
AND products.ship_by = 1
AND order_lines.order_number = orders.order_number
GROUP BY ORDERID, ORDER_LINES_CNT, fPostageCost`, dReceievedDate`,
)L1
WHERE L1.SKU_CNT > 0
GROUP BY L1.`order_number`
ORDER BY L1.order_number ASC

You can replace the subquery with EXISTS:
WHERE EXISTS (SELECT 1
FROM order_lines ol2 JOIN
products p2
ON ol2.sku = p2.sku AND
p2.ship_by = 1
WHERE ol2.order_number = o.order_number
) AND
. . .
But your snippet of the query is not using order_lines in the outer query. I suspect you can just get rid of it, along with the aggregation:
FROM orders o
WHERE EXISTS (SELECT 1
FROM order_lines ol2 JOIN
products p2
ON ol2.sku = p2.sku AND
p2.ship_by = 1
WHERE ol2.order_number = o.order_number
) AND
. . .
This will at least return all the orders information, and the lack of aggregation may simplify other parts of the query. If some versions of the query are running fast, then the indexes are probably set up reasonably.
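To make the EXISTS form concrete, here is a minimal sketch that runs it against an in-memory SQLite database from Python. SQLite is only a stand-in for MySQL here, the sample rows and the helper name `orders_for_supplier` are invented for illustration, and only a few of the question's columns are modelled.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows below are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_number TEXT PRIMARY KEY, status TEXT);
    CREATE TABLE order_lines (order_number TEXT, sku TEXT);
    CREATE TABLE products (sku TEXT PRIMARY KEY, ship_by INTEGER);
    -- An index on order_number lets the correlated subquery probe
    -- order_lines instead of scanning it for every order.
    CREATE INDEX idx_order_lines_order ON order_lines (order_number, sku);

    INSERT INTO orders VALUES
        ('A1', 'Awaiting Despatch'),
        ('A2', 'Awaiting Despatch'),
        ('A3', 'Despatched');
    INSERT INTO products VALUES ('S1', 1), ('S2', 2);
    INSERT INTO order_lines VALUES
        ('A1', 'S1'), ('A1', 'S2'), ('A2', 'S2'), ('A3', 'S1');
""")


def orders_for_supplier(conn, supplier_id):
    """Return open orders that contain at least one item shipped by supplier_id."""
    sql = """
        SELECT o.order_number
        FROM orders o
        WHERE o.status = 'Awaiting Despatch'
          AND EXISTS (SELECT 1
                      FROM order_lines ol2
                      JOIN products p2 ON p2.sku = ol2.sku
                      WHERE ol2.order_number = o.order_number
                        AND p2.ship_by = ?)
        ORDER BY o.order_number
    """
    return [row[0] for row in conn.execute(sql, (supplier_id,))]


print(orders_for_supplier(conn, 1))
```

Because the outer query reads only from orders, each order appears once and no GROUP BY is needed.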

Related

How to create new table for the output of DISTINCT COUNT to be distributed in rows, not in column?

My query displays the DISTINCT count of buyers for each ticket serial number. I need to calculate the SOLD and BALANCE columns automatically and save them to the database, in the rows of the existing table (table1) that correspond to each ticket serial. I have searched many times but can't figure it out. As another option, I tried to create a new table in the database for the output of the DISTINCT COUNT, so that I could INNER JOIN that new table with table1; PRINTED and SOLD would then be in the same table, and I could subtract them to obtain the BALANCE column. However, I didn't find any sample query to follow.
Existing table1 & table2 are records in the database via html form:
Table1
Ticket Serial | Printed Copies | SOLD(sold)       | Balance
--------------+----------------+------------------+--------
TS#1234       | 50             | ?(should be auto | ?
TS#5678       | 80             | ?(should be auto | ?
(so on and so forth...)
Table2
Buyer | Ticket Serial
------+--------------
Adam  | TS#1234
Kathy | TS#1234
Sam   | TS#5678
(so on and so forth...)
The COUNT DISTINCT outputs the qty. of sold tickets:
<td> <?php print '<div align="center">'.$row['COUNT(ticketserial)'];?></td>
...
$query = "SELECT *, COUNT(ticketserial) FROM buyers WHERE ticketsold != 'blank' GROUP BY
ticketserial ";
Its COUNT output looks like this:
Ticket Serial | Distinct Count
--------------+---------------
TS#1234       | 7
TS#5678       | 25
(so on and so forth...)
I tried to update the SOLD and BALANCE columns with UPDATE or INSERT in a foreach loop, but only the first row in the table was updated.
Table1
Ticket Serial | Printed Copies | Sold  | Balance
--------------+----------------+-------+--------
TS#1234       | 50             | **7** | 0
TS#5678       | 80             | **0** | 0
TS#8911       | 40             | **0** | 0
(so on and so forth...)
Note: The field name "sold" in table1 is not the same as the field name "ticketsold" in table2; the former is a quantity and the latter holds ticket serials.

Your question is a bit hard to follow. However, this looks like a left join on an aggregate query:
select
t1.ticket_serial,
t1.printed_copies,
coalesce(t2.sold, 0) sold,
t1.printed_copies - coalesce(t2.sold, 0) balance
from table1 t1
left join (
select ticket_serial, count(*) sold
from table2
group by ticket_serial
) t2 on t2.ticket_serial = t1.ticket_serial
If you are looking to update the main table:
update table1 t1
left join (
select ticket_serial, count(*) sold
from table2
group by ticket_serial
) t2 on t2.ticket_serial = t1.ticket_serial
set
t1.sold = coalesce(t2.sold, 0),
t1.balance = t1.printed_copies - coalesce(t2.sold, 0)
I would not actually recommend storing sold and balance in the main table: this is derived information that can easily be computed when needed, and it would be tedious to maintain. If needed, you could create a view from the first SQL statement above, which will give you an always up-to-date view of your data.
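As a rough illustration of the view idea, the sketch below builds the left-join-on-aggregate query as a view in an in-memory SQLite database and reads it from Python. SQLite stands in for MySQL, the view name `ticket_balance` and the helper `balances` are hypothetical, and the rows are a small sample based on the question's tables.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; column names follow the answer's query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (ticket_serial TEXT PRIMARY KEY, printed_copies INTEGER);
    CREATE TABLE table2 (buyer TEXT, ticket_serial TEXT);

    INSERT INTO table1 VALUES ('TS#1234', 50), ('TS#5678', 80), ('TS#8911', 40);
    INSERT INTO table2 VALUES
        ('Adam', 'TS#1234'), ('Kathy', 'TS#1234'), ('Sam', 'TS#5678');

    -- Hypothetical view name; sold and balance are derived on every read.
    CREATE VIEW ticket_balance AS
    SELECT t1.ticket_serial,
           t1.printed_copies,
           COALESCE(t2.sold, 0) AS sold,
           t1.printed_copies - COALESCE(t2.sold, 0) AS balance
    FROM table1 t1
    LEFT JOIN (SELECT ticket_serial, COUNT(*) AS sold
               FROM table2
               GROUP BY ticket_serial) t2
           ON t2.ticket_serial = t1.ticket_serial;
""")


def balances(conn):
    """Return (ticket_serial, printed_copies, sold, balance) for every serial."""
    return conn.execute(
        "SELECT * FROM ticket_balance ORDER BY ticket_serial"
    ).fetchall()


for row in balances(conn):
    print(row)
```

A serial with no buyers still appears, with sold = 0, because of the LEFT JOIN and COALESCE. Inserting a new row into table2 changes the view's output on the next read without any UPDATE.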

how to batch update the unique column based on different condition in a single query mysql

======= products table ======
CREATE TABLE `products` (
`id` BigInt( 11 ) UNSIGNED AUTO_INCREMENT NOT NULL,
`item` VarChar( 50 ) CHARACTER SET utf8 COLLATE utf8_general_ci NULL,
`category` VarChar( 20 ) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
PRIMARY KEY ( `id` ),
CONSTRAINT `unique_item` UNIQUE( `item` ) )
CHARACTER SET = utf8
COLLATE = utf8_general_ci
ENGINE = InnoDB
AUTO_INCREMENT = 1;
====== existing data in products table ======
INSERT INTO `products` ( `category`)
VALUES ( '2' ),( '2' ),( '3' ),( '5' );
====== update batch query ======
UPDATE products
SET item = CASE
WHEN category = 2
THEN 222
WHEN category = 2
THEN 211
WHEN category = 3
THEN 333
WHEN category = 5
THEN 555
END
WHERE category IN (2, 3, 5) AND item IS NULL
I have 3 columns (id, category, item) in products table.
id is primary key and item is unique key.
I want to use a single query to update item at once based on different category. The above update batch query is only working for category 3 and 5. It does not work for category 2.
The error message:
Error( 1062 ) 23000: "Duplicate entry '222' for key 'unique_item'"
I can use SELECT and UNION to get the list of id and put the id in CASE WHEN for update. But I would like to reduce one step. Is it possible to update all items in one query?
Thanks in advance!
===== EDITED =====
Current data in the product table
Expected result after run update query

It is possible to create a single query and update all data in one step.
Generate the following subquery in PHP from your update data:
select 0 as position, 0 as category, 0 as item from dual having item > 0
union all select 1, 2, 222
union all select 2, 2, 221
union all select 1, 3, 333
union all select 1, 5, 555
The first line here is only to define the column names and will be removed due to the impossible HAVING condition, so you can easily add the data rows in a loop without repeating the column names.
Note that you also need to define the position, which must be sequential for every category.
Now you can join it with your products table, but first you need to determine the position of a row within the category. Since your server probably doesn't support window functions (ROW_NUMBER()), you will need another way. One is to count all products from the same category with a smaller id. So you would need a subquery like
select count(*)
from products p2
where p2.category = p1.category
and p2.item is null
and p2.id <= p1.id
The final update query would be:
update products p
join (
select p1.id, ud.item
from (
select 0 as position, 0 as category, 0 as item from dual having item > 0
union all select 1, 2, 222
union all select 2, 2, 221
union all select 1, 3, 333
union all select 1, 5, 555
) as ud -- update data
join products p1
on p1.category = ud.category
and ud.position = (
select count(*)
from products p2
where p2.category = p1.category
and p2.item is null
and p2.id <= p1.id
)
where p1.item is null
) sub using (id)
set p.item = sub.item;
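To see what the position subquery produces, here is a small sketch against an in-memory SQLite database. SQLite stands in for MySQL, and the final assignment is done row by row from Python rather than with the single UPDATE ... JOIN above, so it illustrates the ranking idea only; `update_data` is a hypothetical mapping of (category, position) to item.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; same starting rows as the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        item TEXT UNIQUE,
        category TEXT NOT NULL
    );
    INSERT INTO products (category) VALUES ('2'), ('2'), ('3'), ('5');
""")

# Position of each unfilled row within its category: the number of unfilled
# rows in the same category whose id is less than or equal to this row's id.
POSITION_SQL = """
    SELECT p1.id,
           p1.category,
           (SELECT COUNT(*)
            FROM products p2
            WHERE p2.category = p1.category
              AND p2.item IS NULL
              AND p2.id <= p1.id) AS position
    FROM products p1
    WHERE p1.item IS NULL
    ORDER BY p1.id
"""
positions = conn.execute(POSITION_SQL).fetchall()

# Hypothetical update data keyed by (category, position).
update_data = {("2", 1): "222", ("2", 2): "221", ("3", 1): "333", ("5", 1): "555"}

with conn:  # one transaction for all row updates
    for row_id, category, position in positions:
        item = update_data.get((category, position))
        if item is not None:
            conn.execute("UPDATE products SET item = ? WHERE id = ?", (item, row_id))

final_rows = conn.execute(
    "SELECT id, item, category FROM products ORDER BY id"
).fetchall()
print(positions)
print(final_rows)
```

The two category-2 rows get positions 1 and 2, which is what lets each of them receive a different item and avoids the duplicate-key error.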
However, I suggest first trying the simple way and executing one statement per row. It could be something like:
$stmt = $db->prepare("
update products
set item = ?
where category = ?
and item is null
order by id asc
limit 1
");
$db->beginTransaction();
foreach ($updateData as $category => $items) {
foreach($items as $item) {
$stmt->execute([$item, $category]);
if ($stmt->rowCount() == 0) { // rowCount() belongs to the statement, not the connection
break; // no more rows to update for this category
}
}
}
$db->commit();
I think the performance should be OK for up to about 1000 rows, given an index on (category, item). Note that the single-query solution might also be slow because of the expensive subquery that determines the position.

You cannot do what you want, because category is not unique in the table.
You can remove the current rows and insert the values that you really want:
delete p from products p
where category in (2, 3, 5) and item is null;
insert into products (item, category)
values ('222', '2' ), ('211', '2' ), ('333', '3' ), ('555', '5' );

MySQL can't make sense of your CASE expression.
You wrote
CASE
WHEN category = 2 THEN 222
WHEN category = 2 THEN 211
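What do you want to happen when category is 2? You have provided two alternatives, but a CASE expression returns the result of the first WHEN that matches, so the second branch is never reached. Both category-2 rows are therefore set to 222, and that is your duplicate entry.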
The error message is admittedly a little confusing.

Need SQL query with good performance to select data that does NOT match criteria

I have a database with
a company table
a country table
a company_country n:n table which defines which company is available in which country
a product table (each product belongs to one specific categoryId)
and a company_product_country n:n:n table that defines which company offers which product in which country.
The latter has the three primary key columns companyId, productId, countryId and the additional columns val and limitedAvailability. val is an ENUM with the values yes|no|n/a, and limitedAvailability is an ENUM with the values 0|1.
Products within categories 1 or 2 are available in all countries and therefore get countryId = 0. But at the same time, only these very products may have a limitedAvailability = 1.
An SQLFiddle with a test database can be found here: http://www.sqlfiddle.com/#!9/a065a/1/0
It contains five countries, products and companies.
Background information on what I need to select from the database:
A PHP script generates a search form where an arbitrary list of countries and products can be selected. The products are separated by categories (I did not add the category table in the sample database, because it is not needed in this case). For the first category, I can select whether to exclude products with limited availability.
Generating the desired result works fine:
It displays all companies that are available in the selected countries and have at least one of the selected products available. The result offers a column that defines how many of the selected products are available by company.
If the user defines that one or more categories should not contain products with limited availability, then the products within the corresponding categories will not count as a match if the company offers them with limited availability only.
I am pleased with the performance of this query. My original database has got around 15 countries, 100 companies and 150 products. Selecting everything in the search form occupies the MySQL server for around two seconds which is acceptable for me.
The problem:
After generating the result list of companies which matches as many product search criteria as possible, I use PHP to iterate through those companies and run another SQL query that should give me the list of products that the company does not offer corresponding to the search criteria. The following is an example query for companyId 1 to find out which products are not available when
the desired products have the productIds 2, 4 and 5
the product's country availability should be at least one of the countryIds 1, 2 or 3
the product should not have a limitedAvailability when it is from categoryId = 2:
SELECT DISTINCT p.name
FROM `product` p
LEFT JOIN `company_product_country` cpc ON `p`.`productId` = `cpc`.`productId` AND `cpc`.`companyId` = 1
WHERE NOT EXISTS(
SELECT *
FROM company_product_country cpcTmp
WHERE `cpcTmp`.`companyId` = 1
AND cpcTmp.val = 'yes'
AND (
cpcTmp.limitedAvailability = 0
OR p.categoryId NOT IN(2)
)
AND cpcTmp.productId = p.productId
)
AND p.`productId` IN (2,4,5)
AND countryId IN(0,1,2,3);
The database along with this query can be found on the SQLFiddle linked above.
The query generates the correct result, but its performance decreases dramatically with the number of products. My local SQL server needs about 4 seconds per company when searching for 150 products in 15 countries. This is unacceptable when iterating through 100 companies. Is there any way to improve this query, such as avoiding the IN(...) list containing up to 150 products? Or should I maybe split the query into two, like so:
First fetch the unmatched products that do not have country Id 0 and are IN the desired countryIds
Then fetch the unmatched products in countryId = 0 and if applicable filter limitedAvailability = 0
?
Your help is gladly appreciated!

I would suggest writing the query like this:
SELECT p.name
FROM product p
WHERE EXISTS (select 1
from company_product_country cpc
where p.productid = cpc.productid and
cpc.companyid = 1 and
cpc.countryid in (1, 2, 3)
) and
NOT EXISTS (select 1
from company_product_country cpcTmp
where cpcTmp.productId = p.productId and
cpcTmp.companyId = 1 and
cpcTmp.val = 'yes' and
cpcTmp.limitedAvailability = 0
) AND
NOT EXISTS (select 1
from company_product_country cpcTmp
where cpcTmp.productId = p.productId and
cpcTmp.companyId = 1 and
cpcTmp.val = 'yes' and
p.categoryId NOT IN (2)
) AND
p.`productId` IN (2, 4, 5) ;
Then, you want the following indexes:
product(productid, categoryid, name)
company_product_country(productid, companyid, countryid)
company_product_country(productid, companyid, val, limitedavailability)
company_product_country(productid, companyid, val, category)
Note: these indexes completely "cover" the query, meaning that all columns in the query come from the indexes. For most purposes, it is probably sufficient to have a single index on company_product_country. Any of the three would do.

Take the query that identifies the products that match the user selection. Subquery it and outer join it to the products table. Exclude the matches.
SQL Fiddle
SELECT p.name
FROM
product p LEFT JOIN
(
SELECT productId
FROM company_product_country cpcTmp
WHERE companyId = 1 AND
countryId IN (0,1,2,3) AND
(
productId IN (4, 5) OR
(productId = 2 AND limitedAvailability = 0)
)
) t
ON p.productId = t.productId
WHERE
t.productId IS NULL AND
p.productId IN (2,4,5)
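The anti-join pattern can be tried out with a small sketch like the one below, which uses an in-memory SQLite database as a stand-in for MySQL. The rows and the helper name `missing_products` are invented for illustration, and the `val = 'yes'` filter is an assumption carried over from the question's own query rather than from the answer above.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (productId INTEGER PRIMARY KEY, name TEXT, categoryId INTEGER);
    CREATE TABLE company_product_country (
        companyId INTEGER,
        productId INTEGER,
        countryId INTEGER,
        val TEXT,
        limitedAvailability INTEGER,
        PRIMARY KEY (companyId, productId, countryId)
    );

    INSERT INTO product VALUES (2, 'P2', 2), (4, 'P4', 3), (5, 'P5', 3);
    INSERT INTO company_product_country VALUES
        (1, 2, 0, 'yes', 1),  -- offered, but with limited availability only
        (1, 4, 1, 'yes', 0),  -- offered in country 1
        (1, 5, 3, 'no',  0);  -- listed but not offered
""")


def missing_products(conn, company_id):
    """Names of the wanted products (2, 4, 5) the company does not offer."""
    sql = """
        SELECT p.name
        FROM product p
        LEFT JOIN (SELECT productId
                   FROM company_product_country
                   WHERE companyId = ?
                     AND val = 'yes'              -- assumption: from the question
                     AND countryId IN (0, 1, 2, 3)
                     AND (productId IN (4, 5)
                          OR (productId = 2 AND limitedAvailability = 0))) t
               ON t.productId = p.productId
        WHERE t.productId IS NULL
          AND p.productId IN (2, 4, 5)
        ORDER BY p.name
    """
    return [row[0] for row in conn.execute(sql, (company_id,))]


print(missing_products(conn, 1))
```

The derived table is evaluated once per company, and the outer WHERE keeps only the products that found no partner row in it.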

condition update cells within the same column

I need to update cells within a specific column based upon ids in another column. The column names are Prod_ID, Lang_ID and Descr:
Prod_ID | Lang_ID | Descr
--------+---------+------
A101 | 1 | TextA
A101 | 2 | TextB
A101 | 3 | TextC
For a group of rows with the same Prod_ID, I need to replace all subsequent descriptions (Descr column) with the description of the first row. The row with the correct description has always Lang_ID = 1. Also, the table may not be sorted by Lang_ID.
Example: TextA (Lang_ID = 1) should replace TextB and TextC because the Prod_IDs of the rows match.

You mentioned in a comment elsewhere that the "master" lang_id is always 1. That simplifies things greatly, and you can do this with a simple self-join (no subqueries :-)
This query selects all lang_1 rows, then joins them with all non-lang_1 rows of the same prod_id and updates those.
If Lang_ID=1 is always the "first"
UPDATE products
LEFT JOIN products as duplicates
ON products.Prod_ID=duplicates.Prod_ID
AND duplicates.Lang_ID != 1
SET duplicates.Descr = products.Descr
WHERE products.Lang_ID = 1
edit: If Lang_ID=1 may not be the "first"
you can join the table to itself via an intermediate join which finds the lowest Lang_ID for that row. I have called the intermediate join "lang_finder".
UPDATE products
LEFT JOIN (SELECT Prod_ID, MIN(Lang_ID) as Lang_ID FROM products GROUP BY Prod_ID) as lang_finder
ON products.prod_id=lang_finder.prod_id
LEFT JOIN products as cannonical_lang
ON products.Prod_ID = cannonical_lang.Prod_ID
AND lang_finder.Lang_ID = cannonical_lang.Lang_ID
SET products.Descr = cannonical_lang.Descr
Note that while it does use a subquery, it does not nest them. The subquery essentially just adds a column to the products table (virtually) with the value of the lowest Lang_ID, which then allows a self-join to match on that. So if there were a product with Lang_ID 3, 4, & 5, this would set the Descr on all of them to whatever was set for Lang_ID 3.
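For experimenting with the Lang_ID = 1 case outside MySQL, here is a minimal sketch on an in-memory SQLite database. SQLite has no UPDATE ... JOIN, so this swaps in a correlated subquery; MySQL generally rejects a subquery on the table being updated, which is why the join forms above are used there. The sample rows beyond A101 are invented.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; rows are deliberately unsorted.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (Prod_ID TEXT, Lang_ID INTEGER, Descr TEXT);
    INSERT INTO products VALUES
        ('A101', 2, 'TextB'), ('A101', 1, 'TextA'), ('A101', 3, 'TextC'),
        ('B202', 1, 'TextX'), ('B202', 2, 'TextY'),
        ('C303', 2, 'TextZ');

    -- Copy the Lang_ID = 1 description onto the other rows of the same product.
    -- The EXISTS guard leaves products without a Lang_ID = 1 row untouched.
    UPDATE products
    SET Descr = (SELECT m.Descr
                 FROM products m
                 WHERE m.Prod_ID = products.Prod_ID
                   AND m.Lang_ID = 1)
    WHERE Lang_ID <> 1
      AND EXISTS (SELECT 1
                  FROM products m
                  WHERE m.Prod_ID = products.Prod_ID
                    AND m.Lang_ID = 1);
""")

rows = conn.execute(
    "SELECT Prod_ID, Lang_ID, Descr FROM products ORDER BY Prod_ID, Lang_ID"
).fetchall()
for row in rows:
    print(row)
```

Only rows with Lang_ID other than 1 are written, so the source rows never change while the statement runs.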

How about this?
UPDATE myTable dt1, myTable dt2
SET dt1.Descr = dt2.Descr
WHERE dt1.Prod_ID=dt2.Prod_ID
AND dt2.Lang_ID = 1;
Demo at sqlfiddle

Assuming that the correct description is always in the row of a group of rows with the same Prod_ID where Lang_ID has the smallest value, this MySQL query should work:
UPDATE your_table AS t1
JOIN (
SELECT Prod_ID, Descr
FROM (
SELECT *
FROM your_table
ORDER BY Lang_ID
) AS t3
GROUP BY Prod_ID
) AS t2
ON t1.Prod_ID = t2.Prod_ID
SET t1.Descr = t2.Descr;
The above can be used e.g. if Lang_ID is a primary or unique key. It also works if the corresponding Lang_ID has always the same minimum value (e.g. = 1) but in that case much less complex queries like this one are possible.

Getting random record from database with group by

Hello, I have a question about picking random entries from a database. I have 4 tables: products, bids, autobids, and users.
Products
-------
id 20,21,22,23,24(prime_key)
price...........
etc...........
users
-------
id(prim_key)
name user1,user2,user3
etc
bids
-------
product_id
user_id
created
autobids
--------
user_id
product_id
Multiple users can have an autobid on a product. So for the next bidder I want to select a random user from the autobids table.
Example of the query in plain language:
for each product in the autobids table I want a random user who is not the last bidder.
Product 20 has autobids from user1, user2 and user3.
Product 21 has autobids from user1, user2 and user3.
Then I want a resultset that looks for example like this
20 – user2
21 – user3
Just a random user. I tried mixing GROUP BY (product_id) with RAND(), but I just can't get the right values from it. I am getting a random user now, but the other values in the row don't match that user.
Can someone please help me construct this query? I am using PHP and MySQL.

The first part of the solution is concerned with identifying the latest bid for each product: these eventually wind up in the temporary table "latest_bid".
Then, we assign random rank values to each autobid for each product, excluding the latest bidder for each product. We then choose the highest rank value for each product, and output the user_id and product_id of the autobids with those highest rank values.
create temporary table lastbids (product_id int not null,
created datetime not null,
primary key( product_id, created ) );
insert into lastbids
select product_id, max(created)
from bids
group by product_id;
create temporary table latest_bid ( user_id int not null,
product_id int not null,
primary key( user_id, product_id) );
insert into latest_bid
select b.user_id, b.product_id
from bids b
join lastbids lb on lb.product_id = b.product_id and lb.created = b.created;
create temporary table rank ( user_id int not null,
product_id int not null,
rank float not null,
primary key( product_id, rank ));
# "ignore" duplicates - it should not matter
# left join on latest_bid to exclude latest_bid for each product
insert ignore into rank
select user_id, product_id, rand()
from autobids a
left join latest_bid lb on a.user_id = lb.user_id and a.product_id = lb.product_id
where lb.user_id is null;
create temporary table choice
as select product_id,max(rank) choice
from rank group by product_id;
select user_id, res.product_id from rank res
join choice on res.product_id = choice.product_id and res.rank = choice.choice;
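The same idea can be sketched more compactly when the random pick is allowed to happen in the application. The version below is an assumption-laden variant, not the temporary-table method above: it uses an in-memory SQLite database as a stand-in for MySQL, finds the eligible autobidders with one query, and lets Python's `random` module choose among them. The sample rows and the name `pick_next_bidders` are invented.

```python
import random
import sqlite3

# In-memory SQLite stands in for MySQL; rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bids (product_id INTEGER, user_id INTEGER, created TEXT);
    CREATE TABLE autobids (user_id INTEGER, product_id INTEGER);

    INSERT INTO autobids VALUES (1, 20), (2, 20), (3, 20), (1, 21), (2, 21), (3, 21);
    INSERT INTO bids VALUES
        (20, 1, '2024-01-01 10:00:00'),
        (20, 3, '2024-01-01 10:05:00'),  -- user 3 is the last bidder on 20
        (21, 2, '2024-01-01 10:01:00');  -- user 2 is the last bidder on 21
""")

# Autobidders per product, minus whoever placed the most recent bid.
CANDIDATES_SQL = """
    SELECT a.product_id, a.user_id
    FROM autobids a
    LEFT JOIN (SELECT b.product_id, b.user_id
               FROM bids b
               JOIN (SELECT product_id, MAX(created) AS created
                     FROM bids
                     GROUP BY product_id) lb
                 ON lb.product_id = b.product_id AND lb.created = b.created
              ) lastbid
           ON lastbid.product_id = a.product_id AND lastbid.user_id = a.user_id
    WHERE lastbid.user_id IS NULL
"""


def pick_next_bidders(conn, rng):
    """Return {product_id: randomly chosen eligible user_id}."""
    candidates = {}
    for product_id, user_id in conn.execute(CANDIDATES_SQL):
        candidates.setdefault(product_id, []).append(user_id)
    return {pid: rng.choice(users) for pid, users in candidates.items()}


picks = pick_next_bidders(conn, random.Random())
print(picks)
```

Choosing in the application keeps every returned user_id paired with its own product_id, which avoids the mismatched-values problem described in the question.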

You can use the LIMIT clause in conjunction with a server-side PREPARE.
Here is an example that selects a random row from the table mysql.help_category:
select @choice:= (rand() * count(*)) from mysql.help_category;
prepare rand_msg from 'select * from mysql.help_category limit ?,1';
execute rand_msg using @choice;
deallocate prepare rand_msg;
This will need refining to prevent @choice becoming zero, but the general idea works.
Alternatively, your application can construct the count itself by running the first select, and constructing the second select with a hard-coded limit value:
select count(*) from mysql.help_category;
# application then calculates limit value and constructs the select statement:
select * from mysql.help_category limit 5,1;
