MySQL InnoDB insert and select lock - php

I created a ticketing system that in its simplest form just records a user joining the queue, and prints out a ticket with the queue number.
When the user requests a ticket, the following happens in the database:
INSERT details INTO All_Transactions_Table
SELECT COUNT(*) as ticketNum FROM All_Transactions_Table WHERE date is TODAY
This serves me well in most cases. However, I recently started to see some duplicate ticket numbers. I can't seem to replicate the issue, even after running the web service multiple times myself.
My guess is that in some scenarios the INSERT happens only AFTER the SELECT COUNT. But this is an InnoDB table and I am not using INSERT DELAYED. Does InnoDB have any such implicit mechanism?

I think your problem is that you have a race condition. Imagine that you have two people that come in to get tickets. Here's person one:
INSERT details INTO All_Transactions_Table
Then, before the SELECT COUNT(*) can happen, person two comes along and does:
INSERT details INTO All_Transactions_Table
Now both users get the same ticket number. This can be very hard to replicate using your existing code because it depends on the exact scheduling of threads within MySQL, which is totally beyond your control.
The best solution to this would be to use some kind of AUTO_INCREMENT column to provide the ticket number, but failing that, you can probably use transactions to achieve what you want:
START TRANSACTION
SELECT COUNT(*) + 1 as ticketNum FROM All_Transactions_Table WHERE date is TODAY FOR UPDATE
INSERT details INTO All_Transactions_Table
COMMIT
However, whether or not this works will depend on what transaction isolation level you have set, and it will not be very efficient.
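If an AUTO_INCREMENT column does not fit (for example because the numbering must restart each day), one common alternative is a dedicated counter table that the transaction locks while handing out the next number. A minimal sketch, with assumed table and column names that are not from the question:
-- Assumed schema; adjust names to your application.
CREATE TABLE daily_ticket_counter (
  counter_date DATE PRIMARY KEY,
  last_ticket  INT NOT NULL
) ENGINE=InnoDB;

START TRANSACTION;

-- Creates today's row on first use, otherwise increments it; the row stays
-- locked until COMMIT, so concurrent requests queue up behind this statement.
INSERT INTO daily_ticket_counter (counter_date, last_ticket)
VALUES (CURDATE(), 1)
ON DUPLICATE KEY UPDATE last_ticket = last_ticket + 1;

SELECT last_ticket FROM daily_ticket_counter
 WHERE counter_date = CURDATE() FOR UPDATE;

-- INSERT the ticket row into All_Transactions_Table using that number, then:
COMMIT;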

Related

Mysql - Prevent duplicate entry for incremental value in a field

My site is hosted on shared hosting. It's a POS application (PHP, CodeIgniter) with several users, all of whom generate invoices. Invoice numbers are incremental: when a user submits an invoice form, the application fetches the last invoice number, increments it by one, and creates a new row with the new number. Very rarely, this process generates duplicate invoice numbers when users submit the form at almost the same time.
One possible fix is to make the invoice number unique. But if a collision happens again, the user will see an exception or a formatted error message.
I don't want to show an error to my users, because the submitted invoice form contains sales information they have typed in; if they lose it because of this warning, it is disruptive. AJAX will not work here; the invoice is submitted with a direct form post.
Can an SQL lock be applied in this situation? I have no experience with SQL locking.
If performance is not your main concern, an inefficient but simple approach would be to insert the row with a NULL invoice number and have a second query fill it in, like this:
INSERT INTO invoices (id, invoice_number) VALUES (10001, null);
UPDATE invoices SET invoice_number = id WHERE invoice_number IS NULL;
For locks you can look into SELECT ... FOR UPDATE, which locks the rows it reads and can also block inserts from other connections, but it's best to try it on your own database, since the behaviour depends on your MySQL version and the isolation level in use.
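For illustration, a FOR UPDATE version might look roughly like the sketch below. The invoices columns other than invoice_number are assumptions, and the pattern serialises invoice creation, so throughput suffers under load:
START TRANSACTION;

-- Lock the scanned rows (and the gap after them) so no other connection
-- can grab the same number until we COMMIT.
SELECT COALESCE(MAX(invoice_number), 0) + 1 INTO @next_no
  FROM invoices
   FOR UPDATE;

-- customer and total are illustrative columns, not from the question.
INSERT INTO invoices (invoice_number, customer, total)
VALUES (@next_no, 'ACME Ltd', 99.50);

COMMIT;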

mysql historical data and record id

I am setting up a new part of an application with historical-data requirements for the transactions table in MySQL. Originally, in the old version, transactions were not historical, with a structure like this:
id|buyerid|prodid|price|status
Plus other fields, with the id being referenced in links to the Transaction Details page, as well as used as a foreign key in other tables across the application to reference particular transactions for various purposes.
Now the requirement is to answer reporting questions like "Show all transactions that had a particular status in Feb 2014" and "What did a transaction look like in Feb 2014?".
The new design I'm testing at the moment is below:
id|buyerid|prodid|price|status|active|start_date|end_date
Here active indicates the latest record and start_date is when it was created. Records are never modified; instead, end_date is populated and a new record is created with the same details plus the modification.
Now the question is: what to do about the transaction id field? In this new design it is really a history id and cannot be used as a foreign key across the application, since it changes with every update.
I can think of two options:
1. Create a separate table, transaction_ids, with just one column: an auto-increment primary key tid. Add a foreign key column tid to the main transactions table. Every time a brand new transaction is created, insert into the ids table and use that id as the tid to trace this particular transaction across the system.
2. Rely on the buyerid and prodid combination, which is always unique in my application (no buyer can get the same product twice).
Is the second solution better? Does anyone know of a better way to handle this?
What you are trying to achieve is called Event Sourcing.
Think in terms of events changing the status of your transaction, rather than tracing the status itself in time.
You still have your transaction with its own primary key, and you rebuild the current (or past) status applying each event.
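As an illustration of the idea, an events table and a point-in-time query might look like the sketch below; all table and column names here are assumptions, not from the question:
-- Illustrative schema for an event-sourcing style history.
CREATE TABLE transaction_events (
  event_id       BIGINT AUTO_INCREMENT PRIMARY KEY,
  transaction_id INT NOT NULL,          -- stable id, safe to reference elsewhere
  event_type     VARCHAR(32) NOT NULL,  -- e.g. 'created', 'status_changed'
  new_status     VARCHAR(32),
  occurred_at    DATETIME NOT NULL
) ENGINE=InnoDB;

-- "What did transaction 42 look like in Feb 2014?" = its last event up to then.
SELECT new_status
  FROM transaction_events
 WHERE transaction_id = 42
   AND occurred_at < '2014-03-01'
 ORDER BY occurred_at DESC, event_id DESC
 LIMIT 1;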
I would also suggest that you start by coding your business models, and only after that think about persistence and the best way to map them to a database.
The second solution looks better, although I will say there is a lot of ambiguity in your question.
I say the second solution is better because the transaction_ids table you describe in solution 1 is basically redundant: it serves no purpose. Even if the transaction id repeats itself in the transactions table, that does not mean you need a separate table to generate the ids and tie them together with a PK-FK relation. Most probably you will still be querying the data by user-id and prod-id, not by transaction-id.
Basically what you need is some kind of audit history table where you insert a record for every operation/transaction/modification and capture some basic details: username, date/time, old value, new value, etc. You do not need status or start/end date columns. Once a record is inserted into this audit history table, it is never touched again.
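A minimal sketch of such an audit history table; all names here are assumptions rather than anything prescribed by the question:
-- Illustrative audit history table; adjust names and types to your schema.
CREATE TABLE transaction_audit (
  audit_id    BIGINT AUTO_INCREMENT PRIMARY KEY,
  buyerid     INT NOT NULL,
  prodid      INT NOT NULL,
  field_name  VARCHAR(64) NOT NULL,   -- e.g. 'status', 'price'
  old_value   VARCHAR(255),
  new_value   VARCHAR(255),
  changed_by  VARCHAR(64) NOT NULL,
  changed_at  DATETIME NOT NULL,
  KEY idx_audit_txn (buyerid, prodid, changed_at)
) ENGINE=InnoDB;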
You will have to design your report carefully.
Taking the two previous answers into consideration, here is the solution I will go with: all of the data updates in my application come through one single function that is already set up to audit particular fields of my choosing, so I will mark the transaction status to be audited among the others. The structure of the audit table is similar to this:
|id|table|table_id|column|old_val|new_val|who|when|
The only difference is that there is somewhat more advanced object mapping via object ids instead of a simple table name. I can then join this data to the main, normal (non-historical) transactions table to provide the reporting required.
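For illustration, the reporting join could look roughly like this; the audit_log table name and the date range are assumptions, and the columns follow the structure shown above:
-- audit_log is an assumed name; `table`, `column` and `when` need backticks
-- because they are reserved words in MySQL.
SELECT t.id, t.buyerid, t.prodid, a.old_val, a.new_val, a.`when`
  FROM transactions t
  JOIN audit_log a
    ON a.table_id = t.id
   AND a.`table`  = 'transactions'
   AND a.`column` = 'status'
 WHERE a.`when` >= '2014-02-01'
   AND a.`when` <  '2014-03-01';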

PHP MySQL Task API, Prevent Duplicate Records

I am building a PHP RESTful API for remote "worker" machines to self-assign tasks. The MySQL InnoDB table on the API host holds pending records that the workers can pick up from the API whenever they are ready to work on a record. How do I prevent concurrently requesting worker systems from ever getting the same record?
My initial plan to prevent this is to UPDATE a single record with a uniquely generated ID in a default NULL field, and then poll for the details of the record where the unique ID field matches.
For example:
UPDATE mytable SET status = 'Assigned', uniqueidfield = '3kj29slsad'
WHERE uniqueidfield IS NULL LIMIT 1
And in the same PHP instance, the next query:
SELECT id, status, etc FROM mytable WHERE uniqueidfield = '3kj29slsad'
The resulting record from the SELECT statement above is then given to the worker. Would this prevent simultaneously requesting workers from being shown the same record? I am not exactly sure how MySQL handles the lookup within an UPDATE query, and whether two UPDATEs could "find" the same record and then update it sequentially. If this works, is there a more elegant or standardized way of doing this (not sure whether FOR UPDATE would need to be applied here)? Thanks!
Nevermind my previous answer. I believe I understand what you are asking. I'll reword it so maybe it is clearer to others.
"If I issue two of the above update statements at the same time, what would happen?"
According to http://dev.mysql.com/doc/refman/5.0/en/lock-tables-restrictions.html, the second statement would not interfere with the first one.
Normally, you do not need to lock tables, because all single UPDATE statements are atomic; no other session can interfere with any other currently executing SQL statement.
A more elegant way is probably opinion based, but I don't see anything wrong with what you're doing.
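If you ever want an explicit-lock variant of the same idea, a rough sketch using a transaction and SELECT ... FOR UPDATE is shown below. The row id 123 stands in for the value returned by the SELECT, and on MySQL 8.0+ you can also add SKIP LOCKED so waiting workers skip rows another worker has already locked:
START TRANSACTION;

SELECT id, status
  FROM mytable
 WHERE uniqueidfield IS NULL
 ORDER BY id
 LIMIT 1
   FOR UPDATE;              -- add SKIP LOCKED here on MySQL 8.0+

UPDATE mytable
   SET status = 'Assigned', uniqueidfield = '3kj29slsad'
 WHERE id = 123;             -- 123 is a placeholder for the id returned above

COMMIT;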

Database schema for a live chat project with rooms

For my university project, I'm developing a dynamic live chat website with rooms, user registration, etc. I've got the entire system planned out bar one aspect. The rooms. I'm confused as to how to design the database for rooms.
To put it in perspective, a room is created by a user who is then an operator of that room. Users can join the room and talk within it. The system has to be scalable, accounting for hundreds of thousands if not millions of messages being sent a day.
Originally, I was going to create one table in my database called messages, with fields like this:
| r_id | u_id | message | timestamp |
r_id and u_id would be foreign keys to the room ID and user ID respectively. Doing it this way means I would need to insert a new record whenever a user sends a message, and periodically run a SELECT statement for every client (say every 3 seconds or so) to get the recent messages. My worry is that, because the table will be huge, running these statements might create a lot of overhead and take a long time.
The other way I thought of implementing this would be to create a new database table for every room. Say a user creates 3 rooms called General, Programming and Gaming, the database tables would look like: room_general, room_programming, room_gaming, each with fields like:
| u_id | message | timestamp |
This would drastically cut down on the amount of queries for each table, but may introduce problems when I come to program it.
So, I'm stuck on what the best way to do this is. If it makes a difference, the technology I'm using will be MySQL with PHP, and a whole lotta AJAX.
Thanks for any help!
It is a bad idea to create a table per room: it is hard to implement and hard to support.
Don't worry about the performance of the selects, because they will be very simple:
SELECT * FROM messages WHERE r_id=X ORDER BY timestamp DESC LIMIT X,Y
Just make sure (r_id, timestamp) are indexed together, in this order, so that this select uses the index:
ALTER TABLE `messages` ADD KEY `IN_messages_room_time` (`r_id`, `timestamp`);
If you still have problems with performance (you probably will not), just add a 1-3 second in-memory cache (using memcache) and fetch the messages from the DB once per 1-3 seconds.
Also look at Apollo Clark's answer (https://stackoverflow.com/a/8673165/436932) to avoid storing a huge amount of unnecessary old messages: you can move them into a MyISAM archive table or simply delete them.
Look into creating a "transaction table" for storing the messages. Basically, you need to decide: do I really want to log all of the messages ever posted to the room, or just the messages posted in the past month / week / day / hour? If you really want to keep a history of every message ever written, then you would create two databases. If you don't want to keep a history of every message, then you just need one table.
Using a transaction table, here's how it would flow:
user enters chat room
user types a message, which is saved to the transaction table.
every 500msec or 3sec, every user in the room would query the transaction table to get the latest updates from the past 500msec or 3sec
SELECT * FROM message_transactions WHERE timestamp > 123456789
a CRON job runs every 5 min or 1 hour and deletes all entries older than 5 min, or however long you want the history to be (a sketch of this cleanup query is shown below this list).
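A minimal sketch of that cleanup query, assuming timestamp is stored as a Unix timestamp as in the SELECT above:
-- Run from cron every few minutes; this keeps one hour of history
-- (table and column names follow the example above and are assumptions).
DELETE FROM message_transactions
 WHERE timestamp < UNIX_TIMESTAMP() - 3600;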
Be sure to synchronize and round the time that each user queries the transaction table, so that the MySQL query result caching will kick in. For example, round the timestamp to once every 1sec or every 500msec.
What'll happen now is the users only get the newest messages, and your database won't explode in size over time, or slow down. Doing this, you'll need to cache the history of messages on the client-side in JS.
On the flip side, you could just get a PHP to IRC library, and call it a day. Also, if you're curious about it, look into how Facebook implements their AJAX-based chat system.
To speed up your database, have a look at indexing your tables: http://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html
In your case I assume that you'd be SELECTing messages by r_id while doing a JOIN on the user table through u_id. I would index the r_id and u_id columns. I am by no means an expert on this subject as I've only done "what works" for my own projects. I don't understand every pro and con of indexing, just that indexing those columns that are typically used as, well, indexes, speeds things up. Google "mysql index tutorial", you'll find plenty more information.
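A minimal sketch of those indexes, using the column names from the question (the index names are made up):
-- Index names are illustrative; pick whatever naming convention you use.
ALTER TABLE messages
  ADD KEY idx_messages_room (r_id),
  ADD KEY idx_messages_user (u_id);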
Don't go nuts and index every column, you'll slow down your INSERTs and UPDATEs.
I also suggest that you purge the chat logs every few days / weeks, or move them to another server for archival purposes if that's what you want / need to do.
You could potentially use memcached to hold recent chat messages in memory and do your database writes in bulk.
Using memcached as a database buffer for chat messages
What you can do is this:
Whenever a user posts, you save the message to a cache specific to that room, with a timestamp of when the message came in, while also saving it to the database. When a client requests new messages, if the user is not new to the chat room, you check the last time the user was served and load the newer messages from the cache; if the user is new, you serve them from the database.
To improve scalability in this scenario, set an expiration on the cached messages so they expire after that time, or implement an asynchronous job that deletes old messages based on their timestamp.

Total Registered Users Count

What's the most efficient way of counting the total number of registered users on a website?
I was thinking of using the following query, but if this table contained thousands of users, the execution time would be very long.
mysql_query("SELECT COUNT(*) FROM users");
Instead, I thought of creating a separate table that will hold this value. Each time a new user registers, or a current one deleted, this value gets updated.
My Question:
Is it possible to carry out an INSERT and an UPDATE in one query? The INSERT would store the new user's details, and the UPDATE would increment the total-users value.
I'm very interested in your thoughts on this.
If there is a better and faster way to find out the total number of registered users, I'm very interested to know.
Cheers ;)
You can use triggers to update the value every time you make an INSERT, UPDATE or DELETE.
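A minimal sketch of the trigger approach, assuming a one-row summary table called user_stats (the names are illustrative, not from the question):
-- user_stats is an assumed summary table holding a single row.
CREATE TABLE user_stats (
  total_users INT NOT NULL
) ENGINE=InnoDB;
INSERT INTO user_stats (total_users) SELECT COUNT(*) FROM users;

CREATE TRIGGER users_after_insert AFTER INSERT ON users
FOR EACH ROW UPDATE user_stats SET total_users = total_users + 1;

CREATE TRIGGER users_after_delete AFTER DELETE ON users
FOR EACH ROW UPDATE user_stats SET total_users = total_users - 1;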
if this table contained 1000's of users, the execution time will be very long.
I doubt that it would be that slow for thousands of users. If you had millions of users then it would probably be too slow.
And does your count need to be 100% accurate?
If an approximate row count is sufficient, SHOW TABLE STATUS can be used.
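For example (note that for InnoDB the row-count figure is an estimate, not an exact count):
SHOW TABLE STATUS LIKE 'users';

-- or, equivalently, via information_schema:
SELECT TABLE_ROWS
  FROM information_schema.TABLES
 WHERE TABLE_SCHEMA = DATABASE()
   AND TABLE_NAME = 'users';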
By the way, if you are using MyISAM then your original query will be close to instant, because that storage engine already stores the exact row count (this only applies to COUNT(*) without a WHERE clause).
You don't do an INSERT and UPDATE in one query; rather, you do them in one "transaction".
Transactions have a concept of "atomicity", which means that other processes cannot see "part" of the transaction - it is all or nothing.
If this concept is not familiar to you, you may wish to look it up.
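A minimal sketch of what that looks like in MySQL; site_stats is an assumed summary table, and the column names are illustrative:
START TRANSACTION;

INSERT INTO users (username, email)
VALUES ('alice', 'alice@example.com');

-- site_stats is an assumed one-row counter table.
UPDATE site_stats SET total_users = total_users + 1;

COMMIT;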
