Scenario
Say I have a list of voucher codes that I am giving away. I need to ensure that if two people place an order at exactly the same time, they do not get the same voucher.
Tables
CREATE TABLE IF NOT EXISTS `order` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`voucher_id` bigint(20) unsigned NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `voucher_id` (`voucher_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE IF NOT EXISTS `voucher` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`code` varchar(10) COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
ALTER TABLE `order` ADD CONSTRAINT `order_fk` FOREIGN KEY (`voucher_id`) REFERENCES `voucher` (`id`) ON DELETE CASCADE ON UPDATE CASCADE;
Sample data
INSERT INTO `voucher` (`code`) VALUES ('A'), ('B'), ('C');
Sample Query
SELECT @voucher_id := v.id FROM `voucher` v LEFT JOIN `order` o ON o.voucher_id = v.id WHERE o.id IS NULL LIMIT 1;
INSERT INTO `order` (`voucher_id`) VALUES (@voucher_id);
Question
I believe the UNIQUE KEY on voucher_id in the order table will prevent two orders from having the same voucher_id, producing an error / throwing an exception if the same voucher id is inserted twice. This would require a while loop to retry upon failure.
The alternative is read locking the vouchers table before the SELECT and releasing that lock after the INSERT, ensuring the same voucher isn't picked twice.
My question is therefore:
Which is faster?
A while loop in PHP code.
Read locking the vouchers table.
Is there another way?
Edits
ALTER TABLE `order` CHANGE `voucher_id` `voucher_id` BIGINT(20) UNSIGNED NOT NULL
will cause the INSERT to fail if @voucher_id is NULL (as desired, as there would be no vouchers left).
The "correct" and by that I mean best way to do what you're looking to do is to generate the voucher at the time you place the order. Look at the documentation for the sha1() function in php. You can seed it with unique information to prevent duplicates and use that for your voucher along with an auto_increment field for the unique ID.
When the order is placed, PHP generates a new voucher, saves it to the database, and sends it to the user. This way you're only storing valid vouchers and you're also preventing duplicates from being created.
You can use START TRANSACTION, COMMIT, and ROLLBACK to prevent race conditions in your SQL. http://dev.mysql.com/doc/refman/4.1/en/commit.html
In your case, I would just put the SELECT and INSERT into a critical section bounded by those statements.
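For illustration, here is a minimal sketch of that transactional approach using SELECT ... FOR UPDATE against the tables above (assuming InnoDB; the LIMIT 1 and the variable name are just for the example):
START TRANSACTION;
-- Lock one unclaimed voucher row so a concurrent transaction cannot claim it
-- until this transaction commits or rolls back.
SELECT @voucher_id := v.id
FROM `voucher` v
LEFT JOIN `order` o ON o.voucher_id = v.id
WHERE o.id IS NULL
LIMIT 1
FOR UPDATE;
-- With voucher_id declared NOT NULL, this fails if no voucher was available;
-- the UNIQUE KEY still guards against double allocation.
INSERT INTO `order` (`voucher_id`) VALUES (@voucher_id);
COMMIT;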
Related
I have a database containing information on over 1,000 items, and I am now developing a system that will check the API source via a regular cron job, adding new entries as they come. Usually, though not always, when a new item is released it will have limited information, e.g. image and name only; more details, like the description, can sometimes be initially withheld.
With this system, I am creating a bulletin to let everyone know new items have been released. Like most announcements, they get submitted to a database. However, instead of submitting static content to the database for the bulletin, is it possible to submit something which will be executed when a person loads that page, so that the bulletin data is fetched first and then the code within it is run?
For example, the entry in the database could read something like the following:
<p>Today new items were released!</p>
<?php $item_ids = "545, 546, 547, 548"; ?>
And then on the page, it would fetch the latest known information from the other database table for items "545, 546, 547, 548".
Therefore, there would be no need to go back and edit any past entries; this page would stay somewhat up-to-date dynamically.
Typically you would do something like have a date field on your items, so you can show which items were released on a given date. Or if you need to have the items associated with some sort of announcement record, create a lookup table that joins your items and announcements. Do not insert executable code in the DB and then pull it out and execute it.
CREATE TABLE `announcements` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`publish_date` DATETIME NOT NULL,
`content` text,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `items` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`title` VARCHAR(128) NOT NULL,
`description` text,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `announcement_item_lkp` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`announcement_id` int(11) unsigned NOT NULL,
`item_id` int(11) unsigned NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `announcement_item_lkp_uk1` (`announcement_id`,`item_id`),
KEY `announcement_item_lkp_fk_1` (`announcement_id`),
KEY `announcement_item_lkp_fk_2` (`item_id`),
CONSTRAINT `announcement_item_lkp_fk_1` FOREIGN KEY (`announcement_id`) REFERENCES `announcements` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT `announcement_item_lkp_fk_2` FOREIGN KEY (`item_id`) REFERENCES `items` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
With the announcement_item_lkp table, you can associate as many items with your announcement as you like. And since you have cascading deletes, if an item gets deleted, its lookup records are deleted as well, so you don't have to worry about orphaned references in your announcements, as you would if you just stuffed a string of IDs somewhere.
You're already using a relational database, let it do its job.
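As a sketch of how that reads at display time (the announcement id 1 is just an example), fetching the current item details for a bulletin is a plain join:
SELECT a.publish_date, a.content, i.id, i.title, i.description
FROM announcements a
JOIN announcement_item_lkp l ON l.announcement_id = a.id
JOIN items i ON i.id = l.item_id
WHERE a.id = 1
ORDER BY i.id;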
I have a MySQL (5.6.26) database with a large amount of data and I have a problem with a COUNT select on a table join.
This query takes about 23 seconds to execute:
SELECT COUNT(0) FROM user
LEFT JOIN blog_user ON blog_user.id_user = user.id
WHERE email IS NOT NULL
AND blog_user.id_blog = 1
Table user is MyISAM and contains user data like id, email, name, etc...
CREATE TABLE `user` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`username` varchar(50) DEFAULT NULL,
`email` varchar(100) DEFAULT '',
`hash` varchar(100) DEFAULT NULL,
`last_login` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`created` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`id`),
UNIQUE KEY `id` (`id`) USING BTREE,
UNIQUE KEY `email` (`email`) USING BTREE,
UNIQUE KEY `hash` (`hash`) USING BTREE,
FULLTEXT KEY `email_full_text` (`email`)
) ENGINE=MyISAM AUTO_INCREMENT=5728203 DEFAULT CHARSET=utf8
Table blog_user is InnoDB and contains only id, id_user and id_blog (user can have access to more than one blog). id is PRIMARY KEY and there are indexes on id_blog, id_user and id_blog-id_user.
CREATE TABLE `blog_user` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`id_blog` int(11) NOT NULL DEFAULT '0',
`id_user` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `id_blog_user` (`id_blog`,`id_user`) USING BTREE,
KEY `id_user` (`id_user`) USING BTREE,
KEY `id_blog` (`id_blog`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=5250695 DEFAULT CHARSET=utf8
I deleted all other tables and there are no other connections to the MySQL server (testing environment).
What I've found so far:
When I delete some columns from the user table, the query gets faster (roughly 2 seconds per deleted column).
When I delete all columns from the user table (except id and email), the query takes 0.6 seconds.
When I change the blog_user table to MyISAM as well, the query takes 46 seconds.
When I change the user table to InnoDB, the query takes 0.1 seconds.
The question is why is MyISAM so slow executing the command?
First, some comments on your query (after fixing it up a bit):
SELECT COUNT(*)
FROM user u LEFT JOIN
blog_user bu
ON bu.id_user = u.id
WHERE u.email IS NOT NULL AND bu.id_blog = 1;
Table aliases make a query easier both to write and to read. More importantly, you have a LEFT JOIN, but your WHERE clause turns it into an INNER JOIN. So, write it that way:
SELECT COUNT(*)
FROM user u INNER JOIN
blog_user bu
ON bu.id_user = u.id
WHERE u.email IS NOT NULL AND bu.id_blog = 1;
The difference is important because it affects choices that the optimizer can make.
Next, indexes will help this query. I am guessing that blog_user(id_blog, id_user) and user(id, email) are the best indexes.
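The composite index on blog_user(id_blog, id_user) already exists in the schema above as the unique key id_blog_user; the one on user could be added with something like this (the index name is arbitrary):
ALTER TABLE `user` ADD INDEX `idx_id_email` (`id`, `email`);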
The reason the number of columns affects your original query is that it is doing a lot of I/O. The fewer the columns, the fewer pages are needed to store the records, and the faster the query runs. Proper indexes should work better and more consistently.
To answer the real question (why MyISAM is slower than InnoDB here), I can't give an authoritative answer.
But it is certainly related to one of the more important differences between the two storage engines: InnoDB supports foreign keys and MyISAM doesn't. Foreign keys are important for joining tables.
I don't know if defining a foreign key constraint will improve speed further, but for sure, it will guarantee data consistency.
Another note: you observe that the time decreases as you delete columns. This indicates that the query requires a full table scan. This can be avoided by creating an index on the email column. user.id and blog_user.id_user hopefully already have an index; if they don't, this is an error. Columns that participate in a foreign key, explicit or not, should always have an index.
This is a long time after the event to be of much use to the OP, and all the foregoing suggestions for speeding up the query are entirely appropriate, but I wonder why no one has remarked on the output of EXPLAIN. Specifically, why the index on email was chosen and how that relates to the definition of the email column in the user table.
The optimizer has selected the index on the email column, presumably because it's included in the WHERE clause. key_len for this index is comparatively long, and it's a reasonably large table given the AUTO_INCREMENT value, so the memory requirements for this index would be considerably greater than if it had chosen the id column (4 bytes against 303 bytes). The email column is nullable but has a default of the empty string, so, unless the application explicitly sets a NULL, you are not going to find any NULLs in this column anyway. Neither will you find more than one record with the default, given the UNIQUE constraint. The column DEFAULT and the UNIQUE constraint appear to be completely at odds with each other.
Given the above, and the fact that we only want the count in this query, I'd wonder whether the email part of the WHERE clause serves any purpose other than slowing the query down as each value is compared to NULL. Without it, the optimizer would probably pick the primary key and do a much better job. Better yet would be a query which ignored the user table entirely and took the count based on the covering index on blog_user that Gordon Linoff highlighted.
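For example, if the count of users attached to blog 1 is all that's needed, a query like this can be answered from the id_blog_user index alone, without touching the user table at all (assuming blog_user rows always reference existing users):
SELECT COUNT(*) FROM blog_user WHERE id_blog = 1;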
There's another indexing issue here worth mentioning:
On the user table
UNIQUE KEY `id` (`id`) USING BTREE,
is redundant since id is the PRIMARY KEY and therefore UNIQUE by definition.
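If you want to drop the redundant index:
ALTER TABLE `user` DROP INDEX `id`;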
To answer your last question,
The question is why is MyISAM so slow executing the command?
MyISAM depends on the speed of your hard drive; with InnoDB, once the data has been read into memory, queries run at the speed of RAM. The first time the query is run it may be loading data from disk; the second and later runs avoid the hard drive until the data ages out of RAM.
I want to know if it's possible to INSERT records from a SELECT statement on a source table into a destination table, get the INSERT IDs, and UPDATE a field on all the corresponding records in the source table.
Take for example, the destination table 'payments':
CREATE TABLE `payments` (
`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`txid` TEXT NULL,
`amount` DECIMAL(16,8) NOT NULL DEFAULT '0.00000000',
`worker` INT(10) UNSIGNED NOT NULL,
PRIMARY KEY (`id`)
)
The source table 'log':
CREATE TABLE `log` (
`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`solution` VARCHAR(80) NOT NULL,
`worker` INT(11) NOT NULL,
`amount` DECIMAL(16,8) NOT NULL DEFAULT '0.00000000',
`pstatus` VARCHAR(50) NOT NULL DEFAULT 'pending',
`payment_id` INT(10) UNSIGNED NULL DEFAULT NULL,
PRIMARY KEY (`id`)
)
The "log" table contains multiple "micro-payments" for a completed task. The purpose of the "payments" table is to consolidate the micro-payments into one larger payment:
INSERT INTO payments (amount, worker)
SELECT SUM(l.amount) AS total, l.worker FROM log l
WHERE l.pstatus = "ready"
AND l.payment_id IS NULL
AND l.amount > 0
GROUP BY l.worker
I'm not sure if it's clear from the code above, but I would like the field "payment_id" to be given the value of the insert id so that it's possible to trace a micro-payment back to the larger consolidated payment.
I could do it all client side (PHP), but I was wondering if there was some magical SQL query that would do it for me? Or maybe I am going about it all wrong.
You can use mysql_insert_id() to get the id of the inserted record.
See mysql_insert_id()
But the above function is deprecated.
If you're using PDO, use PDO::lastInsertId.
If you're using Mysqli, use mysqli::$insert_id.
Well, the linking column between the tables is the column worker. After you inserted your values, just do
UPDATE log l
INNER JOIN payments p ON l.worker = p.worker
SET l.payment_id = p.id;
and that's it. Or did I get the question wrong? Note that the worker columns differ in the signed/unsigned attribute; you might want to change that.
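As a sketch, the INSERT ... SELECT and the UPDATE can be run back to back in one transaction; this assumes each worker has at most one open payment, otherwise the join needs to be narrowed further:
START TRANSACTION;
INSERT INTO payments (amount, worker)
SELECT SUM(l.amount) AS total, l.worker
FROM log l
WHERE l.pstatus = 'ready'
  AND l.payment_id IS NULL
  AND l.amount > 0
GROUP BY l.worker;
-- Link the pending log rows to the payment just created for their worker.
UPDATE log l
INNER JOIN payments p ON l.worker = p.worker
SET l.payment_id = p.id
WHERE l.payment_id IS NULL
  AND l.pstatus = 'ready';
COMMIT;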
I think you should use an ORM in PHP:
Look into Doctrine.
Doctrine 1.2 implements Active Record. Doctrine 2+ is a DataMapper ORM.
Also, check out Xyster. It's based on the Data Mapper pattern.
Also, take a look at DataMapper vs. Active Record.
I have a table containing the balances of about 100 accounts (the number varies), one record per account. The balances are continually updated, but I would like to find the best way to archive the current balance each day.
I'm looking for the most efficient way to do this.
Table schemas:
-- --------------------------------------------------------
--
-- Table structure for table `acc_bals`
--
-- Tracks the balances of all COAs and bank accounts
CREATE TABLE IF NOT EXISTS `acc_bals` (
`id` INT(11) NOT NULL auto_increment,
`acc_type` TINYINT(4) NOT NULL comment '1 - coa; 2 - bank accounts',
`acc_id` SMALLINT(5) NOT NULL,
`acct_balance` VARBINARY(40) NOT NULL,
PRIMARY KEY (id)
) engine=InnoDB DEFAULT charset=utf8 auto_increment=1;
--
-- Table structure for table `balance_archive`
--
CREATE TABLE IF NOT EXISTS `balance_archive` (
`id` mediumint(6) unsigned NOT NULL AUTO_INCREMENT,
`date` date NOT NULL COMMENT 'Beginning of the day for which this value was archived for..',
`coa_id` smallint(4) unsigned NOT NULL COMMENT 'Foreign ID of COA.',
`bal` varbinary(27) NOT NULL COMMENT 'Archived COA balance at beginning of specified date.',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 ;
The reason for the varbinary columns is because the balances are encrypted.
I was originally thinking to query acc_bals, put all the account ids and values into an array (having decrypted the values), and then run a second query copying each item in the array into the archive table.
It then occurred to me that I probably do not need to decrypt the values at all, which would save a lot of processing, and furthermore it might be possible to do this in a single query.
If my approach seems right, perhaps someone can suggest how that query might look please?
I'm using MySQL PDO.
A simple INSERT ... SELECT inside a recurring event scheduled for each day will do.
Something like :
insert into balance_archive (`date`, coa_id, bal)
select curdate(), ab.acc_id, ab.acct_balance
from acc_bals ab;
That way you don't need to use PHP at all; you can do it with MySQL only.
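A rough sketch of that recurring event (the event name and schedule are illustrative, and the event scheduler has to be enabled):
SET GLOBAL event_scheduler = ON;
CREATE EVENT archive_balances_daily
ON SCHEDULE EVERY 1 DAY
STARTS CURRENT_DATE + INTERVAL 1 DAY  -- first run at the next midnight
DO
  INSERT INTO balance_archive (`date`, coa_id, bal)
  SELECT CURDATE(), ab.acc_id, ab.acct_balance
  FROM acc_bals ab;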
I have been over this issue for the last year or so, changing what I am doing and trying different things. The issue is with the schema: I still want to be able to order nicely in player/clan ladders, but if we want to add a stat later, a one-stat-per-column design means the change locks our table while it rewrites every row.
I see two options for how to do this, but neither seems quite right. One is one stat per column. There would be 4 tables: user_stat_summary (for the basic stats shown on ladders), user_stat_beast (teams are human vs beast), user_stat_human and user_stat_overall. Stats shown everywhere cover the last 30 days. A cron job will take any aged stats, by querying the matches older than 30 days, subtract those stats from the 3 main tables and put them into the overall one. Matches will have blobs for the stats each player got for that match. The issue I see here is that once we have a lot of rows we can't easily add more stats when, say, the game changes a little. What I was thinking was an extra_stats blob column on each table, but any stats we add later simply wouldn't be sortable on the ladders.
The other option is an EAV model, which is what I have been playing around with but can't seem to get right. I would be getting many more rows per query and then grouping them by user; the ordering would work for the most part, but I couldn't get limits right for pagination since there was generally an unknown number of rows selected.
What I was thinking is the EAV model with a table that stores ranks per stat, which could be used for ordering. The EAV tables are currently as follows...
CREATE TABLE `user_stat` (
`user_id` int(10) unsigned NOT NULL,
`stat_id` varchar(50) NOT NULL,
`value` int(10) unsigned NOT NULL,
PRIMARY KEY (`user_id`,`stat_id`),
CONSTRAINT `user` FOREIGN KEY (`user_id`) REFERENCES `xf_user` (`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
CREATE TABLE `user_human_stat` (
`user_id` int(10) unsigned NOT NULL,
`stat_id` varchar(50) NOT NULL,
`value` int(10) unsigned NOT NULL,
PRIMARY KEY (`user_id`,`stat_id`),
CONSTRAINT `human_user` FOREIGN KEY (`user_id`) REFERENCES `xf_user` (`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
CREATE TABLE `user_beast_stat` (
`user_id` int(10) unsigned NOT NULL,
`stat_id` varchar(50) NOT NULL,
`value` int(10) unsigned NOT NULL,
PRIMARY KEY (`user_id`,`stat_id`),
CONSTRAINT `beast_user` FOREIGN KEY (`user_id`) REFERENCES `xf_user` (`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
CREATE TABLE `user_stat_overall` (
`user_id` int(10) unsigned NOT NULL,
`human` blob NOT NULL,
`beast` blob NOT NULL,
`total` blob NOT NULL,
PRIMARY KEY (`user_id`),
CONSTRAINT `user_overall` FOREIGN KEY (`user_id`) REFERENCES `xf_user` (`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
So I was thinking I could add a user_stat_rank table with columns user_id, stat_id and rank. Then, say I want to get the first page of the ladder ordered by the 'kills' stat, I could get all the user_ids ordered by rank where stat_id is 'kills', then make a second query to populate all the users' stats.
After writing all this out it seems like it would work fine, but I might not be seeing something. I also understand this question is all over the place, so if you would like me to edit in details anywhere, just say so.
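A rough sketch of what I mean, with illustrative names ('kills' and the page size are placeholders):
CREATE TABLE `user_stat_rank` (
`user_id` int(10) unsigned NOT NULL,
`stat_id` varchar(50) NOT NULL,
`rank` int(10) unsigned NOT NULL,
PRIMARY KEY (`user_id`,`stat_id`),
KEY `stat_rank` (`stat_id`,`rank`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
-- First query: one page of the ladder ordered by the 'kills' rank.
SELECT user_id FROM user_stat_rank WHERE stat_id = 'kills' ORDER BY `rank` LIMIT 0, 25;
-- Second query: populate the stats for the user ids returned above.
SELECT user_id, stat_id, value FROM user_stat WHERE user_id IN (/* ids from the first query */);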
For the sake of manageability, I would stick to adding a column for every stat. In the long run, this will probably be the easiest way to manage it without painting yourself into a corner due to the limitations that, for instance, the EAV model would impose on you.
If you're worried about the stats table growing too large, you could consider implementing some form of table partitioning, where you regularly move data older than 4 weeks to (a) historic table(s). The historic table(s) can be indexed to the extreme, as they won't require constant updating.
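A rough sketch of that kind of roll-off, purely illustrative since it assumes a recorded_on date column and a user_stat_history table with the same columns, neither of which exists in the schemas above:
-- Copy stats older than 4 weeks into the historic table, then remove them
-- from the live table.
INSERT INTO user_stat_history (user_id, stat_id, value, recorded_on)
SELECT user_id, stat_id, value, recorded_on
FROM user_stat
WHERE recorded_on < CURDATE() - INTERVAL 28 DAY;
DELETE FROM user_stat
WHERE recorded_on < CURDATE() - INTERVAL 28 DAY;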