How can I avoid creating duplicate rows? - php

Everything I have searched for and found has yet to work, because I am accessing the table through a PHP script and doing it differently from everything I see. Anyway,
I am importing feeds from a website into a MySQL table. My table was created like this...
$query2 = <<<EOQ
CREATE TABLE IF NOT EXISTS `Entries` (
`feed_id` int(11) NOT NULL,
`item_title` varchar(200) COLLATE utf8_unicode_ci NOT NULL,
`item_link` varchar(200) COLLATE utf8_unicode_ci NOT NULL,
`item_date` varchar(40) COLLATE utf8_unicode_ci NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
EOQ;
$result = $db_obj->query($query2);
I enter the data like so....
foreach($rss->channel->item as $Item){
$query5 = <<<EOQ
INSERT INTO Entries (feed_id, item_title, item_link, item_date)
VALUES ('$get_id','$Item->title','$Item->link','$Item->pubDate')
EOQ;
$result = $db_obj->query($query5);
}
Now, every time I import new feeds from the site I want to make sure I delete any duplicates that might already be there. Everything I have tried, especially DISTINCT, has not worked for me. Does anyone know what type of query I could use to create a temp table, copy over any distinct rows (ENTIRE ROWS; if a title is the same but the date is different I want to keep that), drop the old table, then rename the temp table to what I want... or something similar?

Avoid creating the duplicate rows in the first place. Make any unique values into keys. When adding new values to your database, use
REPLACE INTO Entries (feed_id, item_title, item_link, item_date)
VALUES ('$get_id','$Item->title','$Item->link','$Item->pubDate')
The duplicates will be automatically overwritten. REPLACE is handy because it works like an INSERT when there is no conflict in the keys; when there is a conflict, it deletes the old row and inserts the new one, which also bumps up any auto-incrementing keys.
EDIT
I've been mulling this over for a while. Here's what I came up with.
The problem with making a multi-column key on (feed_id, item_title, item_link, item_date) is that it will exceed the 1000-byte key length limit that MySQL imposes on MyISAM tables. So instead alter your schema like so:
CREATE TABLE IF NOT EXISTS `Entries` (
`hash` varchar(32),
`feed_id` int(11) NOT NULL,
`item_title` varchar(200) COLLATE utf8_unicode_ci NOT NULL,
`item_link` varchar(200) COLLATE utf8_unicode_ci NOT NULL,
`item_date` varchar(40) COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (hash)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
Now when you store a new value, get a hash of the values together:
$hash = md5($get_id . $Item->title . $Item->link . $Item->pubDate);
And for your insert statements use the following:
REPLACE INTO Entries (hash, feed_id, item_title, item_link, item_date)
VALUES ('$hash', '$get_id','$Item->title','$Item->link','$Item->pubDate')
The hash will be a unique representation of the record in its entirety, and will be easy to compare in order to avoid duplicates. Now when you attempt to add the same record more than once, it will just replace the existing entry, and your query will not fail. As an alternative, you could continue to use INSERT, and the query will return an error, which you could handle however you want to.
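One caveat with hashing the plain concatenation is that different records can produce the same input string (for example, feed id 1 with title "2a" and feed id 12 with title "a"). A minimal sketch of one way around that, assuming a separator character that does not occur in the feed data; the helper name entryHash() is hypothetical and not part of the answer above:

```php
<?php
// Hypothetical helper: hashes the four fields with a separator between them
// so that field boundaries cannot shift without changing the hash.
function entryHash($feedId, $title, $link, $pubDate): string
{
    // "\x1F" (ASCII unit separator) is an assumption: it should not appear
    // in normal feed titles, links, or dates.
    $parts = [(string) $feedId, (string) $title, (string) $link, (string) $pubDate];
    return md5(implode("\x1F", $parts));
}

// Without a separator both records would concatenate to "12ab".
var_dump(entryHash(1, '2a', 'b', '') === entryHash(12, 'a', 'b', '')); // bool(false)
```

The casts to string also let you pass the SimpleXML elements ($Item->title and so on) directly.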

The fastest and easiest way to delete duplicate records is by issuing a very simple command.
ALTER IGNORE TABLE [TABLENAME] ADD UNIQUE INDEX UNIQUE_INDEX ([FIELDNAME])
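What this does is create a unique index on the field that you do not want to have any duplicates. The IGNORE keyword instructs MySQL not to stop and display an error when it hits a duplicate; the extra rows are silently dropped instead. This is much easier than dumping and reloading a table. It will also add a unique index so that no new duplicates will be added. Just change your INSERT to INSERT IGNORE. Note that ALTER IGNORE TABLE was removed in MySQL 5.7.4, so this only works on older servers.
This will also work, but is not as elegant. Be aware that it deletes every row whose value occurs more than once, including the copy you probably want to keep: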
delete from [tablename] where fieldname in (select a.[fieldname] from
(select [fieldname] from [tablename] group by [fieldname] having count(*) > 1 ) a )

Perhaps do something like this:
// Create an empty copy of the existing table
$query2 = 'CREATE TABLE entries_new LIKE Entries';
$result = $db_obj->query($query2);
// Build one multi-row INSERT for all feed items
$query5 = 'INSERT INTO entries_new (feed_id, item_title, item_link, item_date) VALUES ';
foreach($rss->channel->item as $Item){
$query5 .= "('$get_id','$Item->title','$Item->link','$Item->pubDate'),";
}
$query5 = rtrim($query5, ',');
$result = $db_obj->query($query5);
// Swap the tables in a single statement
$query6 = "RENAME TABLE Entries TO entries_backup, entries_new TO Entries";
$result = $db_obj->query($query6);
This will create a table called entries_new like your Entries table, make a single insert of data into entries_new, and then rename the old table to entries_backup and the new table to Entries.
You might also want to consider wrapping this whole sequence up in a transaction.

Related

update table in mysql get strange results

I try to update an existing table in mysql, but I get strange results, I explain my problem:
My table looks like this:
TABLE `myTable` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`photoName` varchar(255) COLLATE latin1_general_ci NOT NULL,
`vote` int(11) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `photoName_2` (`photoName`)
)
and I'm trying to use saveVote.php, which looks like this:
$namePhoto = $_POST['name'];
$likePhoto = $_POST['like'];
mysql_connect("host","dbUser","psw");
mysql_select_db("db_is");
mysql_query("INSERT INTO `myTable` (`photoName`,`vote`) VALUES('$namePhoto','$likePhoto') ON DUPLICATE KEY UPDATE vote = vote + 1");
The 'vote' value is updated, but the first time "saveVote.php" is called it creates an empty entry in my table that has only the vote value set. After that, each time "saveVote.php" is called,
the vote value is updated for the right photoName, but the vote value of the empty entry is updated as well.
Why does my request create this empty entry?
Thanks for your help.
It seems like your $namePhoto = $_POST['name']; is also returning an empty value. Try this:
if(!empty($_POST['name'])){
mysql_query("INSERT INTO `myTable` (`photoName`,`vote`) VALUES('$namePhoto','$likePhoto') ON DUPLICATE KEY UPDATE vote = vote + 1");
}
Keep in mind that this is just a test, not a fix. You need to figure out why you are sending an empty value.

MySQL INSERT IGNORE Adding 1 to Non-Indexed column

I'm building a small report in a PHP while loop.
The query I'm running inside the while() loop is this:
INSERT IGNORE INTO `tbl_reporting` SET datesubmitted = '2015-05-26', submissiontype = 'email', outcome = 0, totalcount = totalcount+1
I'm expecting the totalcount column to increment every time the query is run.
But the number stays at 1.
The UNIQUE index covers the first 3 columns.
Here's the Table Schema:
CREATE TABLE `tbl_reporting` (
`datesubmitted` date NOT NULL,
`submissiontype` varchar(20) COLLATE utf8mb4_unicode_ci NOT NULL,
`outcome` tinyint(1) unsigned NOT NULL DEFAULT '0',
`totalcount` mediumint(5) unsigned NOT NULL DEFAULT '0',
UNIQUE KEY `datesubmitted` (`datesubmitted`,`submissiontype`,`outcome`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
When I modify the query into a regular UPDATE statement:
UPDATE `tbl_reporting` SET totalcount = totalcount+1 WHERE datesubmitted = '2015-05-26' AND submissiontype = 'email' AND outcome = 1
...it works.
Does INSERT IGNORE not allow adding numbers? Or is my original query malformed?
I'd like to use the INSERT IGNORE, otherwise I'll have to query for the original record first, then insert, then eventually update.
Think of what you're doing:
INSERT .... totalcount=totalcount+1
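To calculate totalcount+1, the DB has to retrieve the current value of totalcount... which doesn't exist yet, because you're CREATING a new record, and there is NO existing data to retrieve the "old" value from. Inside an INSERT the column simply evaluates to its default (0), so the first row gets 1; on every later run the unique key already exists and INSERT IGNORE silently skips the row, so the count never changes.
In other words, you're trying to eat your cake before you ever went to the store to buy the ingredients, let alone mix/bake them.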
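Since the table already has a UNIQUE key on (datesubmitted, submissiontype, outcome), one way to get the "insert or increment" behaviour in a single statement is INSERT ... ON DUPLICATE KEY UPDATE. A minimal sketch, assuming an open mysqli connection; the constant REPORT_UPSERT_SQL and the function incrementReportCount() are hypothetical names used only for illustration:

```php
<?php
// Inserts the row with totalcount = 1, or adds 1 to the existing row
// when the unique key (datesubmitted, submissiontype, outcome) already exists.
const REPORT_UPSERT_SQL =
    "INSERT INTO `tbl_reporting` (datesubmitted, submissiontype, outcome, totalcount)
     VALUES (?, ?, ?, 1)
     ON DUPLICATE KEY UPDATE totalcount = totalcount + 1";

// Hypothetical helper; $db is assumed to be a connected mysqli instance.
function incrementReportCount(mysqli $db, string $date, string $type, int $outcome): bool
{
    $stmt = $db->prepare(REPORT_UPSERT_SQL);
    // 'ssi' = string, string, integer
    $stmt->bind_param('ssi', $date, $type, $outcome);
    return $stmt->execute();
}
```

Inside the while() loop this would be called as, for example, incrementReportCount($db, '2015-05-26', 'email', 0);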

Extract column names from SHOW CREATE TABLE

I am using
$row = mysqli_fetch_row(mysqli_query($conx, "SHOW CREATE TABLE $table"));
in a loop to grab my schema data, that works fine.
I also need to put the column names in an array.
Is there a way to pull them from that $row array?
Or do I need to run a separate SHOW COLUMNS to do that?
If you take the create table result, which will look like this:
CREATE TABLE `TableName` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(50) NOT NULL,
`message` varchar(250) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
You could create a regex that looks for "`[a-zA-Z0-9_-]*`". Apart from the first match (the table name) and any names repeated inside key definitions such as PRIMARY KEY (`id`), the matches will be column names.
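A minimal sketch of that idea, assuming each column definition starts its own line with a backticked name (which is how SHOW CREATE TABLE normally formats its output). The function name columnNamesFromCreateTable() is hypothetical; in the question's code the statement text would be $row[1].

```php
<?php
// Hypothetical helper: returns the column names found in a CREATE TABLE statement.
function columnNamesFromCreateTable(string $createSql): array
{
    // Only lines that BEGIN with a backticked identifier are column definitions;
    // the CREATE TABLE line and key lines start with a keyword and are skipped.
    preg_match_all('/^\s*`([^`]+)`/m', $createSql, $matches);
    return $matches[1];
}

$createSql = <<<SQL
CREATE TABLE `TableName` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(50) NOT NULL,
`message` varchar(250) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
SQL;

print_r(columnNamesFromCreateTable($createSql)); // id, name, message
```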
Why not use the information_schema?
"SELECT COLUMN_NAME FROM information_schema.COLUMNS WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = '$table' ORDER BY ORDINAL_POSITION"

How to select specific entries in MySQL/PHP

I'm dealing with this problem. There is a table orders(oid,datetime,quantity,title,username,mid).
The table orders is updated from PHP code as far as the columns oid,datetime,quantity,title,username are concerned. The problem is that I want to classify each entry based on both datetime and username, so as to gather these entries under one order code and present them as a single order. (I can't think of anything else at the moment.)
The question is how I can select those entries that correspond to the same username and the same datetime.
For example, if I have 3 entries (freddo espresso, latte, freddoccino) that belong to the same order procedure (they were posted by the same username at the exact same datetime), I need to present them to my user as one completed order.
Here is the structure of table orders:
CREATE TABLE IF NOT EXISTS `orders` (
`oid` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`datetime` DATETIME NOT NULL,
`quantity` INT NOT NULL,
`sum` FLOAT(4,2) NOT NULL,
`title` VARCHAR(30) COLLATE utf8_unicode_ci NOT NULL,
`username` VARCHAR(30) COLLATE utf8_unicode_ci NOT NULL,
`mid` VARCHAR(30) COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`oid`),
KEY `username`(`username`,`mid`,`title`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=10000;
The feature title is foreign key from table products:
CREATE TABLE IF NOT EXISTS `products`(
`title` VARCHAR(30) COLLATE utf8_unicode_ci NOT NULL,
`descr` TEXT(255),
`price` FLOAT(4,2) NOT NULL,
`popularity` INT NOT NULL,
`cname` VARCHAR(20) COLLATE utf8_unicode_ci NOT NULL,
`mid` VARCHAR(30) COLLATE utf8_unicode_ci NOT NULL ,
PRIMARY KEY(`title`),
KEY `cname` (`cname`, `mid`)
)ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=10000;
Sorry if I'm not being very clear, but I really need some help to come to a conclusion. Any ideas?
Thanks in advance!
If you know what the datetime value and the username values are then you can simply use:
SELECT * FROM orders WHERE username = '$username' AND datetime = '$datetime'
However, what you would be better off doing is splitting this into two separate tables; something like:
Orders
OrderID
OrderTime
UserName
Items
ItemID
OrderID
Title
Then you would search in the following way:
SELECT Orders.OrderID, Orders.UserName, Items.Title
FROM Orders
INNER JOIN Items ON Orders.OrderID = Items.OrderID
WHERE
Orders.UserName = '$username'
AND
Orders.OrderTime = '$datetime'
When adding orders you add a record to Orders first, and then use that OrderID and add it to each item inserted in Items...
Insert Example
$mysqli; //Assuming your connection to the database...
$items; //Assuming an array of items for the order like: array('Coffee', 'Tea')
$username; //Assuming the user name to be inserted for the order
$mysqli->query("INSERT INTO Orders(`OrderTime`, `UserName`) VALUES(NOW(), '$username')");
$orderid = $mysqli->insert_id;
foreach($items as $item){
$mysqli->query("INSERT INTO Items (`OrderID`, `Title`) VALUES($orderid, '$item')");
}
NOTE: You should make sure to sanitize data before inserting to database...
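One common way to handle that is to use prepared statements instead of building the SQL string by hand. A minimal sketch of the same insert, assuming the Orders and Items tables suggested above and a connected mysqli instance; placeOrder() is a hypothetical helper name:

```php
<?php
// Hypothetical helper: stores one order and its items, returns the new OrderID.
// $mysqli is assumed to be an open connection; $items is e.g. array('Coffee', 'Tea').
function placeOrder(mysqli $mysqli, string $username, array $items): int
{
    // Insert the order row; the user name is bound, not interpolated.
    $order = $mysqli->prepare("INSERT INTO Orders (`OrderTime`, `UserName`) VALUES (NOW(), ?)");
    $order->bind_param('s', $username);
    $order->execute();
    $orderId = (int) $mysqli->insert_id;

    // Prepare the item insert once and re-execute it for every title.
    $item = $mysqli->prepare("INSERT INTO Items (`OrderID`, `Title`) VALUES (?, ?)");
    $title = '';
    $item->bind_param('is', $orderId, $title); // bound by reference
    foreach ($items as $value) {
        $title = (string) $value;
        $item->execute();
    }
    return $orderId;
}
```

Because the values are sent separately from the SQL text, quotes in a title or user name cannot break the query.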
Storing JSON
Storing JSON in a database is going to require you to make sure that you use a field data type that is an appropriate length (e.g. a blob).
You mentioned that you retrieve the titles as an array from a form so I'm now going to refer to that as $titles.
Saving to database
$username = '...'; // Username or id to store in database with order
$titles = array(.....); // Array of titles from form
$encodedTitles = json_encode($titles); // Convert to JSON
$mysqli->query("INSERT INTO table_name (titles_field, username_field, date_field) VALUES ('$encodedTitles', '$username', NOW())"); // Save the JSON string to the database (assuming an already open connection)
Retrieve from database
$result = $mysqli->query("SELECT titles_field FROM table_name WHERE username_field = 'username_value' AND date_field = 'date_value'"); // Run query to get row
$row = $result->fetch_assoc(); // Fetch row
$titles = json_decode($row['titles_field'], true); // This is the same as the `titles` array from the form above!
SELECT quantity,title
FROM orders
WHERE username = ? and datetime = ?
Would return the quantity of items for a specific user on specific date. Instead of a date you could use an order id, which might be a bit safer. If you use order id, then username becomes irrelevant as well, since order ids should be unique.
The answer posted with the query will help you, but you should also consider changing your table structure. It looks like you could have a table named orders and another one named orders_items. Then you could list all the items from orders_items matching a single order.
I think this query will return the kind of data you want, where records with the same username and datetime share the same unique_id string.
SELECT MD5(a.unique_id), b.* FROM (
SELECT
GROUP_CONCAT(oid) unique_id, `datetime`, username
FROM `orders` GROUP BY username, `datetime`
) a
RIGHT JOIN `orders` b
ON a.`datetime` = b.`datetime` AND a.username = b.username
ORDER BY unique_id, oid;
I also have another answer, about 3 thousand characters long, but I think this variant will help you more than my long tutorial on how to split the table into two tables and migrate the data into them, plus PHP code samples. So I decided not to publish it. )))
Edit: I think you can even run this one query, which is simpler and works faster:
SELECT *, MD5( CONCAT( `username` , `datetime` ) ) unique_id
FROM `orders`
ORDER BY unique_id, oid;
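Whichever query is used, the rows still have to be collected into orders on the PHP side before they can be shown to the user. A minimal sketch of that grouping step, assuming each fetched row is an associative array with username and datetime keys; groupOrders() and the sample rows are hypothetical:

```php
<?php
// Hypothetical helper: groups order rows that share username and datetime.
function groupOrders(array $rows): array
{
    $orders = [];
    foreach ($rows as $row) {
        // Rows with the same user and timestamp belong to the same order.
        $key = $row['username'] . '|' . $row['datetime'];
        $orders[$key][] = $row;
    }
    return $orders;
}

// Made-up sample data in the shape of the orders table.
$rows = [
    ['oid' => 10000, 'datetime' => '2015-05-26 10:00:00', 'username' => 'alice', 'title' => 'freddo espresso', 'quantity' => 1],
    ['oid' => 10001, 'datetime' => '2015-05-26 10:00:00', 'username' => 'alice', 'title' => 'latte', 'quantity' => 2],
    ['oid' => 10002, 'datetime' => '2015-05-26 10:05:00', 'username' => 'bob', 'title' => 'freddoccino', 'quantity' => 1],
];

foreach (groupOrders($rows) as $key => $items) {
    echo $key . ': ' . implode(', ', array_column($items, 'title')) . "\n";
}
```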

query takes 70 ms to execute

I have a MySQL table named i_visited structured like: userid,tid,dateline
And I run this condition in view_thread.php page:
if (db('count','SELECT userid FROM i_visited
WHERE tid = '.intval($_GET['id']).'
AND userid = '.$user['id']))
mysql_query('UPDATE i_visited
SET dateline = unix_timestamp(now())
WHERE userid = '.$user['id'].'
AND tid = '.intval($_GET['id']));
else
mysql_query('INSERT INTO i_visited (userid,tid,dateline) VALUES
('.$user['id'].','.intval($_GET['id']).',unix_timestamp(now()))');
The problem is that it executes in 80/100 ms (on Windows) 40/60 (on Linux)
1 row affected. (query executed in 0.0707 sec)
The mysql_num_rows() aka db('count',sql) uses 2 / 3 ms, so the problem is at the update and the insert.
P.S. i_visited is an utf8_unicode_ci (InnoDB), has anyone seen this problem?
Other queries run normal (2 / 3 milliseconds)
CREATE TABLE i_visited (
userid int(10) NOT NULL,
tid int(10) unsigned NOT NULL,
dateline int(10) NOT NULL,
KEY userid (userid,tid),
KEY userid_2 (userid),
KEY tid (tid) )
ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
You do not need to do a select to check existence and then choose either Update or Insert.
You can use MySQL's ON DUPLICATE KEY UPDATE Feature like this.
$query = 'INSERT INTO
i_visited (userid,tid,dateline)
VALUES (' .
$user['id'] . ',' .
intval($_GET['id']) . ',
unix_timestamp(now()))
ON DUPLICATE KEY UPDATE
dateline = unix_timestamp(now())';
mysql_query($query);
This query will insert a new row if there is no key conflict; if a duplicate key is being inserted, it will execute the update part instead.
Note that this only kicks in for a PRIMARY KEY or UNIQUE index. The KEY userid (userid,tid) in your CREATE statement is a plain index, so make it UNIQUE (or the primary key) first; then the above query is equivalent to your if...else block.
Try this and see if there are any gains
You can also use REPLACE INTO, as there are only the specified 3 columns, like this
$query = 'REPLACE INTO
i_visited (userid,tid,dateline)
VALUES (' .
$user['id'] . ',' .
intval($_GET['id']) . ',
unix_timestamp(now()))';
mysql_query($query);
But I would suggest looking at ON DUPLICATE KEY UPDATE, as it is more flexible: it can be used on a table with any number of columns, whereas REPLACE INTO deletes the conflicting row and inserts a new one, so the values of all other columns would have to be supplied again in the REPLACE INTO statement.
I think part of the problem is that your table does not have an explicit primary key.
You've only declared secondary keys.
Change the definition to:
CREATE TABLE i_visited (
userid int(10) NOT NULL,
tid int(10) unsigned NOT NULL,
dateline int(10) NOT NULL,
PRIMARY KEY (userid,tid), -- changed from KEY userid (userid,tid)
KEY userid_2 (userid),
KEY tid (tid) )
ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
InnoDB does not work well without an explicit primary key defined.
