MySQL Multi Table Delete - PHP

I have 2 tables setup like this:
Items:
id (int)
name (varchar)
category (int)
last_update (timestamp)
Categories
id (int)
name (varchar)
tree_id (int)
I want to delete all the records from Items whose last_update is NOT today and whose corresponding categories.tree_id is equal to 1. However, I don't want to delete anything from the Categories table, just the Items table. I tried this:
$query = "DELETE FROM items USING categories, items
WHERE items.category = categories.id
AND categories.tree_id = 1
AND items.last_update != '".date('Y-m-d')."'";
However, this just seems to delete EVERY record whose tree_id is 1. It should keep items with a tree_id of 1 as long as their last_update field is today.
What am I missing?

If last_update is a timestamp field, and you are only passing a date (with no time component) in your WHERE clause, you are, in essence, actually doing this (if passing in 2012-10-24, for example):
AND items.last_update != '2012-10-24 00:00:00'
This means every row whose timestamp is not exactly that second would be deleted. You are much better off doing something like
AND items.last_update NOT LIKE '".date('Y-m-d')."%'";
Of course you want to make sure you have an index on last_update.
Or, if you don't care about index performance on the last_update field (i.e. you are just doing this as a one-off query and don't want to index this field), you could do this, which may make more logical sense to some:
AND DATE(items.last_update) <> '".date('Y-m-d')."'"
The bottom line is that you need to compare only the date component of the last_update field in some manner.
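Another equivalent that keeps the comparison index-friendly is a plain range test on the raw column; this is only a sketch, and it swaps the PHP date() call for MySQL's own CURDATE():
AND (items.last_update <  CURDATE()                     # before today
  OR items.last_update >= CURDATE() + INTERVAL 1 DAY)   # or from tomorrow onwards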

Sounds like you need a subquery
DELETE FROM items USING categories, items
WHERE items.category = categories.id
AND items.last_update != '".date('Y-m-d')."'
AND items.id IN
(
SELECT items.id FROM items INNER JOIN categories ON items.category = categories.id
WHERE categories.tree_id = 1
)

You say that last_update contains time-stamps - I assume a UNIX time-stamp. You will never find a record whose time-stamp matches the exact moment the query is executed, and you cannot compare a raw time-stamp with a formatted date; they will never correspond. So you would need to store the data in the last_update column in a date (Y-m-d) format so that you can compare which ones are not equal.

Related

Pulling data from MySQL database and pulling only 1 value of each in a column and then show latest first

What I'm trying to do.
To pull data from a database, with the 'serial_no' column returning only one row per value so each is unique; any other rows with the same 'serial_no' value should not show if another one of that value exists. So the 'serial_no' column could hold 35k values, but only 35 rows would show if there are a total of 35 unique serial numbers. Once I have them, I need them ordered latest first by the 'datetime' column.
Current outcome.
I have the data pulling through, and it's only showing each 'serial_no' once; however, it's not showing the latest first, as if it is ignoring the ordering or just pulling through the first one it sees rather than the latest.
These are the 2 PHP queries I have used; they work, but not 100% as they should. In the first one I only want 'serial_no' to be distinct, not all columns, so maybe that's why it's not working.
$sql = "SELECT DISTINCT serial_no, datetime FROM wp_clicker_data ORDER BY datetime DESC";
The other one works fine, apart from not showing the latest value of a specific serial_no:
$sql = "SELECT * FROM wp_clicker_data GROUP BY serial_no ORDER BY datetime DESC";
Any ideas how each unique value of the 'serial_no' column can pull through its latest entry based on the 'datetime' column?
Thanks!
Use MAX and GROUP BY to get your desired output, as below:
SELECT serial_no,
MAX(datetime)
FROM wp_clicker_data
GROUP BY serial_no
If you want the latest row for each serial number, then use filtering:
select cd.*
from wp_clicker_data cd
where cd.datetime = (SELECT MAX(cd2.datetime)
FROM wp_clicker_data cd2
WHERE cd2.serial_no = cd.serial_no
);
GROUP BY is not appropriate when you want to retrieve entire rows. Using SELECT * with GROUP BY doesn't make sense, because there are columns in the SELECT that are not in the GROUP BY. And this construct generally won't work (with the default settings) in the more recent versions of MySQL.
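If the correlated subquery turns out to be slow on a large table, a common equivalent is to join against a derived table holding the per-serial maximum; this is a sketch that assumes the same wp_clicker_data columns as above:
SELECT cd.*
FROM wp_clicker_data cd
INNER JOIN (SELECT serial_no, MAX(datetime) AS max_dt
            FROM wp_clicker_data
            GROUP BY serial_no) latest
        ON latest.serial_no = cd.serial_no
       AND latest.max_dt = cd.datetime;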

Retail inventory MySQL query optimization

Given the following tables for a retail administration system:
STORES: store_id, name
PRODUCTS: product_id, name, cost
PRODUCT_ENTRIES: key, store_id, date
PRODUCT_ENTRIES_CONTENT: product_entries_key, product_id, quantity
PRODUCT_EXITS: key, store_id, product_id, quantity, status, date
SALES: key, store_id, date
SALES_CONTENT: sales_key, product_id, quantity
RETURNS: key, store_id, date
RETURNS_CONTENT: returns_key, product_id, quantity
In order to calculate stock values I run through the contents of the products table and for each product_id:
Sum quantities of product_entries_content as well as returns_content
Subtract quantities of product_exits (where status = 2 or 3) as well as sales_content
To calculate the cost of the inventory of each store, I'm running the following query through a PHP loop for each distinct store and outputting the result:
SELECT
SUM((((
(SELECT COALESCE(SUM(product_entries_content.quantity), 0)
FROM product_entries
INNER JOIN product_entries_content ON
product_entries_content.product_entries_key = product_entries.key
WHERE product_entries_content.product_id = products.id
AND product_entries.store_id = '.$row['id'].'
AND DATE(product_entries.date) <= DATE(NOW()))
-
(SELECT COALESCE(SUM(quantity), 0)
FROM sales_content
INNER JOIN sales ON sales.key = sales_content.sales_key
WHERE product_id = products.product_id AND sales.store_id = '.$row['id'].'
AND DATE(sales_content.date) <= DATE(NOW()))
+
(SELECT COALESCE(SUM(quantity), 0)
FROM returns_content
INNER JOIN returns ON returns.key = returns_content.returns_key
WHERE product_id = products.product_id AND returns.store_id = '.$row['id'].'
AND DATE(returns.date) <= DATE(NOW()))
-
(SELECT COALESCE(SUM(quantity), 0)
FROM product_exits
WHERE product_id = products.product_id AND (status = 2 OR status = 3)
AND product_exits.store_id = '.$row['id'].' #store_id
AND DATE(product_exits.date) <= DATE(NOW()))
) * products.cost) / 100) ) AS "'.$row['key'].'" #store_name
FROM products WHERE 1
All foreign keys and indexes are properly set. The problem is that, because of the large number of stores and movements in each store, the query is becoming increasingly heavy, and because inventory is calculated from the beginning of each store's history, it only gets slower over time.
What could I do to optimize this scheme?
Ideally, SHOW CREATE TABLE tablename for each table would really help a lot in any optimization question. The data type of each column is EXTREMELY important to performance.
That said, from the information you've given the following should be helpful, assuming the column data types are all appropriate.
Add the following indexes, if they do not exist. IMPORTANT: Single column indexes are NOT valid replacements for the following composite indexes. You stated that
All foreign keys and indexes are properly set.
but that tells us nothing about what they are, and if they are "proper" for optimization.
New indexes
ALTER TABLE sales
ADD INDEX `aaaa` (`store_id`,`key`);
ALTER TABLE sales_content
ADD INDEX `bbbb` (`product_id`,`sales_key`,`date`,`quantity`);
ALTER TABLE returns
ADD INDEX `cccc` (`store_id`,`date`,`key`);
ALTER TABLE returns_content
ADD INDEX `dddd` (`product_id`,`returns_key`,`quantity`);
ALTER TABLE product_exits
ADD INDEX `eeee` (`product_id`,`status`,`store_id`,`date`,`quantity`);
ALTER TABLE product_entries
ADD INDEX `ffff` (`store_id`,`date`,`key`);
ALTER TABLE product_entries_content
ADD INDEX `gggg` (`product_id`,`product_entries_key`,`quantity`);
(Use more appropriate names than aaaa. I just used those to save time.)
Each of the above indexes will allow the database to read only the index for each table. Most performance issues involving joins come from what is known as a double lookup.
Understanding indexes and double lookups
An index is just a copy of the table data. Each column listed in the index is copied from the table, in the order listed in the index, and then the primary key is appended to that row in the index. When the database uses an index to look up a value, if not all the information is contained in the index, the primary key will be used to access the clustered index of the table to obtain the rest of the information. This is what a double look up is, and it is VERY bad for performance.
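A quick way to confirm an index is covering is EXPLAIN: if the Extra column shows "Using index", the query was answered from the index alone, with no double lookup. A sketch against the sales subquery, with placeholder literals standing in for the PHP variables:
EXPLAIN
SELECT COALESCE(SUM(sc.quantity), 0)
FROM sales_content sc
INNER JOIN sales s ON s.key = sc.sales_key
WHERE sc.product_id = 123   # placeholder product_id
  AND s.store_id = 45       # placeholder store_id
  AND sc.date < DATE_ADD(DATE(NOW()), INTERVAL 1 DAY);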
Example
All the above indexes are designed to avoid double lookups. Let's look at the second subquery to see how the indexes related to that query will work.
ALTER TABLE sales
ADD INDEX `aaaa` (`store_id`,`key`);
ALTER TABLE sales_content
ADD INDEX `bbbb` (`product_id`,`sales_key`,`date`,`quantity`);
Subquery (I added aliases and adjusted how the date column is accessed, but otherwise it is unchanged):
SELECT COALESCE(SUM(sc.quantity), 0)
FROM sales_content sc
INNER JOIN sales s
ON s.key = sc.sales_key
WHERE sc.product_id = p.product_id
AND s.store_id = '.$row['id'].'
AND sc.date < DATE_ADD(DATE(NOW()), INTERVAL 1 DAY)
Using the aaaa index, the database will be able to look up only those rows in the sales table that match the store_id, since that is listed first in the index. Think of this in the same way as a phone book, where store_id is the last name, and key is the first name. If you have the last name, then it is EXTREMELY easy to flip to that point of the phone book, and quickly get all the first names that go with that last name. Likewise, the database is able to very quickly "flip" to the part of the index that contains the given store_id value, and find all the key values. In this case, we do not need the primary key at all (which would be the phone number, in the phone book example.)
So, done with the sales table, and we have all the key values we need from there.
Next, the database moves onto the bbbb index. We already have product_id from the main query, and we have the sales_key from the aaaa index. That is like having both first and last name in the phone book. The only thing left to compare is the date, which could be like the address in a phone book. The database will store all the dates in order, and so by giving it a cutoff value, it can just look at all the dates up to a certain point.
The last part of the bbbb index is the quantity, which is there so that the database can quickly sum up all those quantities. To see why this is fast, consider again the phone book. Imagine in addition to last name, first name, and address information, that there is also a quantity column (of something, it doesn't matter what). If you wanted the sum of the quantities for a specific last name, first name, and for all addresses that start with the number 5 or less, that is easy, isn't it? Just find the first one, and add them up in order until you reach the first address that starts with a number greater than 5. The database benefits the same way when using the date column in this way (date is like the address column, in this example.)
The date columns
Finally, I noted earlier, I changed how the date column was accessed. You never want to run a function on a database column that you are comparing to another value. The reason is this: What would happen if you had to convert all the addresses into roman numerals, before you did any comparison? You wouldn't be able to just go down the list like we did earlier. You'd have to convert ALL the values, and THEN check each one to make sure it was within the limit, since we no longer know if the values are sorted correctly to just be able to do the "read them all and then stop at a certain value" shortcut I described above.
You and I may know that converting a datetime value to a date isn't going to change the order, but the database will not know (it might be possible it optimizes this conversion, but that's not something I want to assume.) So, keep the columns pure. The change I made was to just take the NOW() date, and add one day, and then make it a < instead of a <=. After all, comparing two values and saying the date must be equal to or less than today's date is equivalent to saying the datetime must be less than tomorrow's date.
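Concretely, the change to each date filter is just this (the same condition, written two ways):
# function-wrapped column: the index order on date can no longer be used for a range scan
AND DATE(pe.date) <= DATE(NOW())
# bare column against a constant boundary: an ordinary index range scan works
AND pe.date < DATE_ADD(DATE(NOW()), INTERVAL 1 DAY)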
The query
Below is my final query for you. As stated, not much has changed other than the date change and aliases. However, you had a typo in the first subquery where you accessed products.id. I corrected the id to be product_id, given that that matches what you stated were the columns for the products table.
SELECT
SUM(
(
(
(
(
SELECT COALESCE(SUM(pec.quantity), 0)
FROM product_entries pe
INNER JOIN product_entries_content pec
ON pec.product_entries_key = pe.key
WHERE pec.product_id = p.product_id
AND pe.store_id = '.$row['id'].'
AND pe.date < DATE_ADD(DATE(NOW()), INTERVAL 1 DAY)
)
-
(
SELECT COALESCE(SUM(sc.quantity), 0)
FROM sales_content sc
INNER JOIN sales s
ON s.key = sc.sales_key
WHERE sc.product_id = p.product_id
AND s.store_id = '.$row['id'].'
AND sc.date < DATE_ADD(DATE(NOW()), INTERVAL 1 DAY)
)
+
(
SELECT COALESCE(SUM(rc.quantity), 0)
FROM returns_content rc
INNER JOIN returns r
ON r.key = rc.returns_key
WHERE rc.product_id = p.product_id
AND r.store_id = '.$row['id'].'
AND r.date < DATE_ADD(DATE(NOW()), INTERVAL 1 DAY)
)
-
(
SELECT COALESCE(SUM(pex.quantity), 0)
FROM product_exits pex
WHERE pex.product_id = p.product_id
AND (pex.status = 2 OR pex.status = 3)
AND pex.store_id = '.$row['id'].' #store_id
AND pex.date < DATE_ADD(DATE(NOW()), INTERVAL 1 DAY)
)
)
* p.cost)
/ 100)
) AS "'.$row['key'].'" #store_name
FROM products p WHERE 1
You may be able to further optimize this by splitting the subquery on the product_exits table into 2 separate subqueries, rather than using an OR, which many times will perform poorly. Ultimately, you'll have to benchmark that to see how well the database optimizes the OR on its own.
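If you try that split, each branch becomes a simple equality on status, so the eeee index can be used from start to finish. A sketch of what would replace the last subtracted subquery (only the status filter changes; both sums are subtracted):
-
(
SELECT COALESCE(SUM(pex.quantity), 0)
FROM product_exits pex
WHERE pex.product_id = p.product_id
AND pex.status = 2
AND pex.store_id = '.$row['id'].'
AND pex.date < DATE_ADD(DATE(NOW()), INTERVAL 1 DAY)
)
-
(
SELECT COALESCE(SUM(pex.quantity), 0)
FROM product_exits pex
WHERE pex.product_id = p.product_id
AND pex.status = 3
AND pex.store_id = '.$row['id'].'
AND pex.date < DATE_ADD(DATE(NOW()), INTERVAL 1 DAY)
)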

Adding a Row into an alphabetically ordered SQL table

I have a SQL table with two columns:
'id' int Auto_Increment
instancename varchar
The current 114 rows are ordered alphabetically by instancename.
Now I want to insert a new row that fits into that order.
So say it starts with a 'B': it would belong at around id 14 and would therefore have to 'push down' all of the rows after id 14. How do I do this?
An SQL table is not inherently ordered! (It is just a set.) You would simply add the new row and view it using something like:
select instancename
from thetable
order by instancename;
I think you're going about this the wrong way. IDs shouldn't be changed. If you have tables that reference these IDs as foreign keys then the DBMS wouldn't let you change them, anyway.
Instead, if you need results from a specific query to be ordered alphabetically, tell SQL to order it for you:
SELECT * FROM table ORDER BY instancename
As an aside, sometimes you want something that can seemingly be a key (read: it needs to be unique for each row) but does have to change from time to time (such as a SKU in a product table). This should not be the primary key for the same reason (there are undoubtedly other tables that may refer to these entries, each of which would also need to be updated).
Keeping this information distinct will help keep you and everyone else working on the project from going insane.
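A sketch of that separation, using a hypothetical product table rather than anything from the question: the surrogate id stays the primary key and never changes, while the business-facing SKU gets its own UNIQUE constraint and can be updated freely.
CREATE TABLE product (
    id   INT AUTO_INCREMENT PRIMARY KEY,   # stable surrogate key, safe to reference elsewhere
    sku  VARCHAR(32) NOT NULL UNIQUE,      # unique, but allowed to change over time
    name VARCHAR(255) NOT NULL
);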
Try using a window function (OVER) and joining the table to itself:
Update thetable
Set ID = r.ID
From thetable c Join
    ( Select instancename, Row_Number() Over(Order By instancename) As ID
      From thetable ) r On c.instancename = r.instancename
This should update the id column to the ordered number. You may have to disable its identity first.
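The UPDATE ... FROM form above is SQL Server syntax. On MySQL 8+ the same idea would look roughly like this; it is only a sketch, it still assumes no other tables reference id as a foreign key, and it can hit duplicate-key errors where old and new id values overlap, which is one more reason to prefer the ORDER BY approach from the other answers:
UPDATE thetable c
JOIN (SELECT instancename,
             ROW_NUMBER() OVER (ORDER BY instancename) AS new_id
      FROM thetable) r ON r.instancename = c.instancename
SET c.id = r.new_id;   # renumbers id in alphabetical order of instancename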

MySQL Join table and get results that don't exist in one

I have two tables, one that has a foreign key from the other. I want to get all records that don't exist in the foreign table, based on certain criteria.
Here are the tables I have:
item_setting
setting_id
category_id
item
item_id
setting_id
name
expired_dt
Here's the query I'm using now:
SELECT
iset.setting_id
FROM
item_settings iset
LEFT OUTER JOIN
item i ON i.setting_id = iset.setting_id
WHERE
iset.category_id = '5' AND i.setting_id is null
This query works in providing any setting_ids that do not have a record in the item table within a specific category.
However, now I want to include cases where the expired_dt is less than time() (meaning it's past expired). In other words, I would think to add this:
WHERE
iset.category_id = '5' AND (i.setting_id is null OR i.expired_dt < '".time()."')
However, this doesn't work; it returns all the records.
Any suggestions? Maybe I'm completely overcomplicating this... I just want to return the setting_ids from the item_settings table where the associated expired_dt in the item table is expired, or where no row exists in the item table at all.
Thank you!
Try moving the timestamp condition into the join clause. Something like:
item_settings iset
LEFT OUTER JOIN
item i ON i.setting_id = iset.setting_id AND i.expired_dt > '".time()."'
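Put back into the full query from the question (a sketch, keeping its PHP string style), that would be:
SELECT
    iset.setting_id
FROM
    item_settings iset
LEFT OUTER JOIN
    item i ON i.setting_id = iset.setting_id
          AND i.expired_dt > '".time()."'   # only non-expired items survive the join
WHERE
    iset.category_id = '5' AND i.setting_id IS NULL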

MySQL: GROUP BY on all values except 0 and NULL?

I have a simple SQL Query:
SELECT tid,
COUNT(*) AS bpn
FROM mark_list
WHERE userid = $userid
GROUP BY tid
Now the column tid is basically a category list associated with each entry. The categories are unique numeric values.
What I am trying to do is get an overall count of how many records there are per userid, but I only want to count an entire category one time (meaning if category 3 has 10000 records, it should only receive a count of 1).
The caveat is that sometimes the category is listed as null and sometimes as 0. If an item has either a 0 or a null, it has no category, and I want those counted as their own separate entities and not lumped into a single large category.
Wheeee!
SELECT SUM(`tid` IS NULL) AS `total_null`,
SUM(`tid` = 0) AS `total_zero`,
COUNT(DISTINCT `tid`) AS `other`
FROM `mark_list`
WHERE `userid` = $userid
Edit: note that if total_zero is greater than 0, you will have to subtract one from the "other" result (because tid=0 will get counted in that column)
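If you want a single total that already avoids that double count, NULLIF can hide the zero tid from the DISTINCT count; a sketch, keeping the question's userid column:
SELECT SUM(`tid` IS NULL)                           # each NULL row counted on its own
     + SUM(`tid` = 0)                               # each zero row counted on its own
     + COUNT(DISTINCT NULLIF(`tid`, 0)) AS `total`  # one per real category
FROM `mark_list`
WHERE `userid` = $userid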
You can alter the query to not take into account those particular values (via the WHERE clause), and then perhaps run a separate query that ONLY takes into account those values.
There may be a way to combine it into only one query, but this way should work, too.
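A sketch of that two-query version, using the same mark_list columns as the question:
# one count per real category, ignoring the "no category" markers
SELECT COUNT(DISTINCT tid) AS categories
FROM mark_list
WHERE userid = $userid AND tid IS NOT NULL AND tid <> 0;
# the uncategorised rows, each counted on its own
SELECT COUNT(*) AS uncategorised
FROM mark_list
WHERE userid = $userid AND (tid IS NULL OR tid = 0);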
