How to delete duplicate values in a MySQL table

I have a column in a MySQL table (my_sale_time) of type timestamp. The row values look like this:
2010-12-01 14:38:07
2010-12-01 17:14:18
...
I need a MySQL query that deletes rows whose date is repeated in the table. As the sample values show, only the date part is the same; the time differs. So I want to find all rows whose date occurs more than once and delete the duplicates.

The basic principle of deleting duplicate rows:
CREATE TEMPORARY TABLE tmptbl AS SELECT DISTINCT * FROM my_sale_time;
DELETE FROM my_sale_time;
INSERT INTO my_sale_time SELECT * FROM tmptbl;
You may have to specify columns and WHERE clauses (I didn't really understand your criteria).
And of course you should test it on a development server first, and remember to run it as a single transaction with the tables locked.

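If you have an auto_increment field, use this:
DELETE FROM
`mytable`
WHERE
`my_auto_increment_field` NOT IN (
-- MySQL will not let a DELETE read its own target table directly in a subquery,
-- so the ids to keep are wrapped in a derived table.
SELECT `keep_id` FROM (
SELECT
MAX(`my_auto_increment_field`) AS `keep_id`
FROM
`mytable`
GROUP BY
DATE(`my_sale_time`) -- group on the date part only, since the times differ
) AS `keepers`
);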
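
A minimal sketch of the same keep-one-row-per-date idea, written in Python against an in-memory SQLite database so it can be run without a MySQL server. The table name `sales`, the `id` column and the helper `delete_duplicate_dates` are assumptions made for illustration; SQLite's `date()` stands in for MySQL's `DATE()`.

```python
import sqlite3


def delete_duplicate_dates(conn):
    """Keep only the highest-id row for each calendar date; return rows deleted."""
    cur = conn.execute(
        """
        DELETE FROM sales
        WHERE id NOT IN (
            SELECT MAX(id) FROM sales GROUP BY date(my_sale_time)
        )
        """
    )
    conn.commit()
    return cur.rowcount


# Hypothetical sample data modelled on the timestamps in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY AUTOINCREMENT, my_sale_time TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO sales (my_sale_time) VALUES (?)",
    [("2010-12-01 14:38:07",), ("2010-12-01 17:14:18",), ("2010-12-02 09:00:00",)],
)
deleted = delete_duplicate_dates(conn)
remaining = [row[0] for row in conn.execute("SELECT my_sale_time FROM sales ORDER BY id")]
print(deleted, remaining)
```

Keeping `MAX(id)` retains the most recently inserted row for each date; switching to `MIN(id)` would keep the earliest one instead.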

Related

Mysql Insert .... select, obtain last insert ID

I have this query in php. It's an insert select copying from table2, but I need to get the IDs of the newly created rows and store them into an array. Here is my code:
$sql = "INSERT INTO table1 SELECT distinct * from table2";
$db->query($sql);
I could reverse the flow, starting with a SELECT on table2 and then doing individual INSERTs, but that would slow the script down on a big table. Any ideas?

You could lock the table, insert the rows, and get the ID of the last item inserted, and then unlock; that way you know that the IDs will be contiguous as no other concurrent user could have changed them. Locking and unlocking is something you want to use with caution though.
An alternative approach could be to use one of the columns in the table - either an 'updated' datetime column, or an insert-id column (for which you put in a value that will be the same across all of your rows.)
That way you can do a subsequent SELECT of the IDs back out of the database matching either the updated time or your chosen insert ID.
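
The insert-id column approach could look roughly like the sketch below. It uses Python with an in-memory SQLite database purely so it is runnable as-is; the `name` and `batch_id` columns and the helper `copy_with_batch_tag` are hypothetical, and the same two statements would apply to MySQL from PHP.

```python
import sqlite3
import uuid


def copy_with_batch_tag(conn):
    """Copy distinct rows from table2 into table1 and return the new ids."""
    batch = uuid.uuid4().hex  # one value shared by every row of this insert
    conn.execute(
        "INSERT INTO table1 (name, batch_id) SELECT DISTINCT name, ? FROM table2",
        (batch,),
    )
    conn.commit()
    # Read the ids back by matching on the tag written above.
    rows = conn.execute(
        "SELECT id FROM table1 WHERE batch_id = ? ORDER BY id", (batch,)
    )
    return [row[0] for row in rows]


# Hypothetical schema: table1 gains a batch_id column for this purpose.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table2 (name TEXT)")
conn.execute(
    "CREATE TABLE table1 (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, batch_id TEXT)"
)
conn.executemany("INSERT INTO table2 (name) VALUES (?)", [("a",), ("b",), ("a",)])
new_ids = copy_with_batch_tag(conn)
print(new_ids)
```

Because the tag is unique per call, the lookup does not depend on the IDs being contiguous, so no table lock is needed.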

SQL INSERT INTO SELECT and Return the SELECT data to Create Row View Counts

So I'm creating a system that will be pulling 50-150 records at a time from a table and display them to the user, and I'm trying to keep a view count for each record.
I figured the most efficient way would be to create a MEMORY table, use an INSERT INTO to pull the IDs of the rows into it, and then have a cron job that runs regularly to aggregate the view counts per ID and clear out the memory table, updating the original table with the latest view counts. This avoids constantly updating the table that will likely be accessed the most, so I'm not locking 150 rows at a time with each query (or the whole table if I'm using MyISAM).
Basically, the method explained here.
However, I would of course like to do this at the same time as I pull the records information for viewing, and I'd like to avoid running a second, separate query just to get the same set of data for its counts.
Is there any way to SELECT a dataset, return that dataset, and simultaneously insert a single column from that dataset into another table?
It looks like PostgreSQL might have something similar to what I want with the RETURNING keyword, but I'm using MySQL.

First of all, I would not add a counter column to the Main table. I would create a separate Audit table that would hold ID of the item from the Main table plus at least timestamp when that ID was requested. In essence, Audit table would store a history of requests. In this approach you can easily generate much more interesting reports. You can always calculate grand totals per item and also you can calculate summaries by day, week, month, etc per item or across all items. Depending on the volume of data you can periodically delete Audit entries older than some threshold (a month, a year, etc).
Also, you can easily store more information in Audit table as needed, for example, user ID to calculate stats per user.
To populate Audit table "automatically" I would create a stored procedure. The client code would call this stored procedure instead of performing the original SELECT. Stored procedure would return exactly the same result as original SELECT does, but would also add necessary details to the Audit table transparently to the client code.
So, let's assume that Audit table looks like this:
CREATE TABLE AuditTable
(
ID int
IDENTITY -- SQL Server
SERIAL -- Postgres
AUTO_INCREMENT -- MySQL
NOT NULL,
ItemID int NOT NULL,
RequestDateTime datetime NOT NULL
)
and your main SELECT looks like this:
SELECT ItemID, Col1, Col2, ...
FROM MainTable
WHERE <complex criteria>
To perform both INSERT and SELECT in one statement in SQL Server I'd use OUTPUT clause, in Postgres - RETURNING clause, in MySQL - ??? I don't think it has anything like this. So, MySQL procedure would have several separate statements.
MySQL
At first do your SELECT and insert results into a temporary (possibly memory) table. Then copy item IDs from temporary table into Audit table. Then SELECT from temporary table to return result to the client.
CREATE TEMPORARY TABLE TempTable
(
ItemID int NOT NULL,
Col1 ...,
Col2 ...,
...
)
ENGINE = MEMORY
SELECT ItemID, Col1, Col2, ...
FROM MainTable
WHERE <complex criteria>
;
INSERT INTO AuditTable (ItemID, RequestDateTime)
SELECT ItemID, NOW()
FROM TempTable;
SELECT ItemID, Col1, Col2, ...
FROM TempTable
ORDER BY ...;
SQL Server (just to tease you: this single statement does both the INSERT and the SELECT)
MERGE INTO AuditTable
USING
(
SELECT ItemID, Col1, Col2, ...
FROM MainTable
WHERE <complex criteria>
) AS Src
ON 1 = 0
WHEN NOT MATCHED BY TARGET THEN
INSERT
(ItemID, RequestDateTime)
VALUES
(Src.ItemID, GETDATE())
OUTPUT
Src.ItemID, Src.Col1, Src.Col2, ...
;
You can leave Audit table as it is, or you can set up cron to summarize it periodically. It really depends on the volume of data. In our system we store individual rows for a week, plus we summarize stats per hour and keep it for 6 weeks, plus we keep daily summary for 18 months. But, important part, all these summaries are separate Audit tables, we don't keep auditing information in the Main table, so we don't need to update it.
Joe Celko explained it very well in SQL Style Habits: Attack of the Skeuomorphs:
Now go to any SQL Forum text search the postings. You will find
thousands of postings with DDL that include columns named createdby,
createddate, modifiedby and modifieddate with that particular
meta data on the end of the row declaration. It is the old mag tape
header label written in a new language! Deja Vu!
The header records appeared only once on a tape. But these meta data
values appear over and over on every row in the table. One of the main
reasons for using databases (not just SQL) was to remove redundancy
from the data; this just adds more redundancy. But now think about
what happens to the audit trail when a row is deleted? What happens to
the audit trail when a row is updated? The trail is destroyed. The
audit data should be separated from the schema. Would you put the log
file on the same disk drive as the database? Would an accountant let
the same person approve and receive a payment?

You're kind of asking if MySQL supports a SELECT trigger. It doesn't. You'll need to do this as two queries, however you can stick those inside a stored procedure - then you can pass in the range you're fetching, have it both return the results AND do the INSERT into the other table.
Updated answer with skeleton example for stored procedure:
DELIMITER $$
CREATE PROCEDURE `FetchRows`(IN StartID INT, IN EndID INT)
BEGIN
UPDATE Blah SET ViewCount = ViewCount+1 WHERE id >= StartID AND id <= EndID;
# ^ Assumes counts are stored in the same table. If they're in a separate table, do an INSERT INTO ... ON DUPLICATE KEY UPDATE ViewCount = ViewCount+1 instead.
SELECT * FROM Blah WHERE id >= StartID AND id <= EndID;
END$$
DELIMITER ;
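
For comparison, here is a rough sketch of the "return the rows and log the views in one call" idea done on the client side, using Python and an in-memory SQLite database so it runs without a server. It records views in a separate audit table rather than a ViewCount column; the function name `fetch_rows_and_log` and the two-column MainTable are assumptions for illustration only.

```python
import sqlite3


def fetch_rows_and_log(conn, start_id, end_id, now):
    """Return rows in [start_id, end_id] and record one audit row per item returned."""
    with conn:  # commits on success, rolls back on error
        rows = conn.execute(
            "SELECT ItemID, Col1 FROM MainTable WHERE ItemID BETWEEN ? AND ? ORDER BY ItemID",
            (start_id, end_id),
        ).fetchall()
        conn.executemany(
            "INSERT INTO AuditTable (ItemID, RequestDateTime) VALUES (?, ?)",
            [(item_id, now) for item_id, _ in rows],
        )
    return rows


# Hypothetical tables and data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MainTable (ItemID INTEGER PRIMARY KEY, Col1 TEXT)")
conn.execute(
    "CREATE TABLE AuditTable (ID INTEGER PRIMARY KEY AUTOINCREMENT, "
    "ItemID INTEGER NOT NULL, RequestDateTime TEXT NOT NULL)"
)
conn.executemany("INSERT INTO MainTable VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])
rows = fetch_rows_and_log(conn, 1, 2, "2015-01-01 00:00:00")
views = dict(conn.execute("SELECT ItemID, COUNT(*) FROM AuditTable GROUP BY ItemID"))
print(rows, views)
```

The caller still makes a single function call, and the view counts can be aggregated later from the audit rows without touching the main table.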

prevent a value repeating in the same day

I am trying to develop a simple PHP script for entering records like this (MySQL is my database engine):
id (auto increment, primary key)
StudentName
StudentNumber
ClassName
Grade
ScreeningDate (mm/dd/yyyy)
etc.
What I need is to prevent the same StudentNumber from being entered twice on the same day. For example, if another user has already entered it, a message should say: this number was already added for today.
In other words, before the insert I need to check whether the StudentNumber is already there for today. If so, show the message; otherwise, add the row normally. The next day it is fine to add the same StudentNumber again. Something like a primary key, but only for today. How is this possible?

Create a unique compound index on (ScreeningDate, StudentNumber). Then INSERT operations that would place duplicates into your table will fail with duplicate-key errors.
You will need to detect this duplicate-key situation in your PHP code and return the appropriate message to the user who attempted to insert the dup. The INSERT statement will return an error.
Or, you can use INSERT ... ON DUPLICATE KEY UPDATE if you want to update the row if it already exists. Read this. http://dev.mysql.com/doc/refman/5.6/en/insert.html
To create the index, do this:
ALTER TABLE whatever ADD UNIQUE INDEX NoDailyDups (ScreeningDate, StudentNumber)
This approach has the advantage that it works correctly, without explicit table locking, even on a very busy system. If two users happen to be racing to insert the first item for a particular student and day, one of them will win and the other will get the duplicate-key error.
Notice also that your table is slightly denormalized -- it contains slightly redundant data. You might want to create another table containing the columns StudentNumber and StudentName, and move the student names out of this table.
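
A minimal sketch of detecting the duplicate-key failure in application code. It uses Python with an in-memory SQLite database so it can be run directly; the table name `screenings`, the helper `add_screening` and the ISO date strings are assumptions for illustration. In PHP against MySQL the equivalent check would be on the error raised by the INSERT (MySQL reports duplicate keys as error 1062).

```python
import sqlite3


def add_screening(conn, student_number, screening_date):
    """Insert a screening; return False if that student already has one that day."""
    try:
        conn.execute(
            "INSERT INTO screenings (StudentNumber, ScreeningDate) VALUES (?, ?)",
            (student_number, screening_date),
        )
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        # The unique index rejected the row: same student, same day.
        conn.rollback()
        return False


# Hypothetical table with the compound unique index described above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE screenings (id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "StudentNumber INTEGER NOT NULL, ScreeningDate TEXT NOT NULL)"
)
conn.execute("CREATE UNIQUE INDEX NoDailyDups ON screenings (ScreeningDate, StudentNumber)")
first = add_screening(conn, 123, "2014-05-01")
second = add_screening(conn, 123, "2014-05-01")
next_day = add_screening(conn, 123, "2014-05-02")
print(first, second, next_day)
```

When `add_screening` returns False, the application can show the "this number was already added for today" message.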

Do a select to get the specified student number on today's date like this:
SELECT * FROM `table` WHERE StudentNumber = 123 AND DATE(ScreeningDate) = CURDATE()
If any rows are returned, there is already a record for this student number on today's date; otherwise you are free to insert. The DATE function extracts the date part of a datetime value, i.e. the day, month and year.

This may not be efficient but you can fetch all the rows of current date first:
SELECT StudentNumber, ScreeningDate FROM tablename WHERE StudentNumber = '22' AND ScreeningDate = 'mm/dd/yyyy';
Then skip the INSERT if the query above returns any rows (it will generally be one).

How can i order by last update row

I have one SQL query to get all the information from my table.
I build a list from the result using a foreach loop.
I want to order this list by the most recently updated row.
Like this
$query = "SELECT * FROM table ORDER BY last_updated_row";
//call Query here
When I update a certain row, I want that row to move to the top of the list.
I have heard about timestamps. Can I use a timestamp for that?
How can I do that?
Thanks

Assuming you're using MySQL, your table needs a column like this:
CREATE TABLE `table` (
last_updated_row TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);
That gives each row a timestamp when it is created and refreshes it on every UPDATE statement that affects the row. Sort with ORDER BY last_updated_row DESC to put the most recently updated row first.
http://dev.mysql.com/doc/refman/5.0/en/timestamp-initialization.html

You can use just about any date/datetime/timestamp column in a table to sort by if needed. The only catch is you need to actually have it in the table.
Any of the above will allow sorts by ascending/descending order, but need to be maintained when inserting/updating a row.
Assuming you have the following structure:
table - someTable
id someValue updateTime
1 54634 ......
2 65138 ......
3 94141 ......
4 84351 ......
It doesn't matter what type of column updateTime is - whether it is a date, a datetime, a timestamp, a simple order by updateTime will work.
But you need to make sure that each insert/update you make to that row updates the column so that the sort will be true.
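
A small sketch of maintaining the column by hand, using Python and an in-memory SQLite database (which has no ON UPDATE clause, so the timestamp has to be written explicitly). The helper `touch`, the column names and the sample timestamps are assumptions for illustration.

```python
import sqlite3


def touch(conn, row_id, new_value, now):
    """Update a row and refresh its updateTime in the same statement."""
    conn.execute(
        "UPDATE someTable SET someValue = ?, updateTime = ? WHERE id = ?",
        (new_value, now, row_id),
    )
    conn.commit()


# Hypothetical data; timestamps are fixed strings so the result is reproducible.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE someTable (id INTEGER PRIMARY KEY, someValue INTEGER, updateTime TEXT)"
)
conn.executemany(
    "INSERT INTO someTable VALUES (?, ?, ?)",
    [
        (1, 54634, "2014-01-01 10:00:00"),
        (2, 65138, "2014-01-01 11:00:00"),
        (3, 94141, "2014-01-01 12:00:00"),
    ],
)
touch(conn, 1, 11111, "2014-01-02 09:00:00")
ordered_ids = [
    row[0] for row in conn.execute("SELECT id FROM someTable ORDER BY updateTime DESC")
]
print(ordered_ids)
```

Row 1 moves to the front because its updateTime is now the latest; the DESC keyword is what puts the newest update at the top.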

group by mysql option

I am writing a converter to transfer data from old systems to new systems. I am using php+mysql.
I have one table that contains millions of records, including duplicate entries. I want to transfer that data into a new table and remove all the duplicates. I am using the following queries and pseudocode to perform this task:
select *
from table1
insert into table2
ON DUPLICATE KEY UPDATE customer_information = concat('$firstName',',','$lastName')
It takes ages to process one table :(
I am wondering whether it is possible to use GROUP BY and get all the grouped records automatically,
rather than going through each record and checking for duplicates.
For example
select *
from table1
group by firstName, lastName
insert into table 2 only one record and add all users'
first last name into column ALL_NAMES with comma
EDIT
There are different records for each customers with different information. Each row is called duplicated if first and last name of user is same. In new table, we will just add one customer and their bought product in different columns (we have only 4 products).

I don't know what you are trying to do with customer_information, but if you just want to transfer the non-duplicated set of data from one table to another, this will work:
INSERT IGNORE INTO table2(field1, field2, ... fieldx)
SELECT DISTINCT field1, field2, ... fieldx
FROM table1;
DISTINCT will take care of rows that are exact duplicates. But if you have rows that are only partial duplicates (like the same last and first names but a different email) then IGNORE can help. If you put a unique index on table2(lastname,firstname) then IGNORE will make sure that only the first record with lastnameX, firstnameY from table1 is inserted. Of course, you might not like which record of a pair of partial duplicates is chosen.
ETA
Now that you've updated your question, it appears that you want to put the values of multiple rows into one field. This is, generally speaking, a bad idea, because when you denormalize your data this way you make it much less accessible. Also, if you are grouping by (lastname, firstname), each group contains only one name, so an allnames column would have nothing to combine. Because of this, my example uses allemails instead. In any event, if you really need to do this, here's how:
INSERT INTO table2(lastname, firstname, allemails)
SELECT lastname, firstname, GROUP_CONCAT(email) as allemails
FROM table1
GROUP BY lastname, firstname;
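
To see what the grouped insert produces, here is a short runnable sketch using Python and an in-memory SQLite database, which also provides GROUP_CONCAT. The sample names and email addresses are made up for illustration.

```python
import sqlite3

# Hypothetical source and target tables mirroring the query above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (lastname TEXT, firstname TEXT, email TEXT)")
conn.execute("CREATE TABLE table2 (lastname TEXT, firstname TEXT, allemails TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        ("Doe", "Jane", "jane@example.com"),
        ("Doe", "Jane", "jd@example.org"),
        ("Roe", "Rick", "rick@example.com"),
    ],
)
conn.execute(
    """
    INSERT INTO table2 (lastname, firstname, allemails)
    SELECT lastname, firstname, GROUP_CONCAT(email)
    FROM table1
    GROUP BY lastname, firstname
    """
)
conn.commit()

# Split the combined field back apart; sorting avoids relying on concat order.
merged = {
    (last, first): sorted(emails.split(","))
    for last, first, emails in conn.execute(
        "SELECT lastname, firstname, allemails FROM table2"
    )
}
print(merged)
```

Having to split the field again just to inspect it illustrates why the combined column is less accessible than separate rows.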

If they are really duplicate rows (every field is the same) then you can use:
select DISTINCT * from table1
instead of :
select * from table1
