I have a table field of type varchar(36) and I want to generate its value dynamically in MySQL, so I used this code:
$sql_code = "INSERT INTO table1 (id, text) VALUES (UUID(), 'some text');";
mysql_query($sql_code);
How can I retrieve the generated UUID immediately after inserting the record?
char(36) is better
You cannot. The only solution is to perform two separate queries:
SELECT UUID()
INSERT INTO table1 (id, text) VALUES ('$uuid', 'text')
where $uuid is the value retrieved in the first step.
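A minimal sketch of the two-step approach (the UUID value shown is only an example of what step 1 might return):

```sql
-- step 1: generate the UUID and read it back into your application
SELECT UUID();
-- suppose this returns '6ccd780c-baba-1026-9564-5b8c656024db'

-- step 2: use that value in the INSERT
INSERT INTO table1 (id, text)
VALUES ('6ccd780c-baba-1026-9564-5b8c656024db', 'some text');
```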
You can do everything you need with SQL triggers. The following SQL sets up triggers on tablename.table_id to automatically create the primary key UUID when inserting, then stores the newly created ID in a session variable for retrieval later:
DELIMITER $$

CREATE TRIGGER `tablename_newid`
BEFORE INSERT ON `tablename`
FOR EACH ROW
BEGIN
    IF ASCII(NEW.table_id) = 0 THEN
        SET NEW.table_id = UNHEX(REPLACE(UUID(), '-', ''));
    END IF;
END$$

CREATE TRIGGER `tablename_lastid`
AFTER INSERT ON `tablename`
FOR EACH ROW
    SET @last_uuid = NEW.table_id$$

DELIMITER ;
As a bonus, it inserts the UUID in binary form to a binary(16) field to save storage space and greatly increase query speed.
edit: the trigger should check for an existing column value before inserting its own UUID in order to mimic the ability to provide values for table primary keys in MySQL - without this, any values passed in will always be overridden by the trigger. The example has been updated to use ASCII() = 0 to check for the existence of the primary key value in the INSERT, which will detect empty string values for a binary field.
edit 2: after a comment here it has since been pointed out to me that setting the variable in a BEFORE INSERT trigger has the effect of setting @last_uuid even if the row insert fails. I have updated my answer so that @last_uuid is assigned in an AFTER INSERT trigger - whilst I feel this is a totally fine approach under general circumstances, it may have issues with row replication under clustered or replicated databases. If anyone knows, I would love to know as well!
To read the new row's insert ID back out, just run SELECT @last_uuid.
When querying and reading such binary values, the MySQL functions HEX() and UNHEX() will be very helpful, as will writing your query values in hex notation (preceded by 0x). The PHP-side code for your original question, given this type of trigger applied to table1, would be:
// insert row
$sql = "INSERT INTO table1 (text) VALUES ('some text')";
mysql_query($sql);
// get last inserted UUID
$sql = "SELECT HEX(@last_uuid)";
$result = mysql_query($sql);
$row = mysql_fetch_row($result);
$id = $row[0];
// perform a query using said ID
mysql_query("SELECT * FROM table1 WHERE id = 0x" . $id);
Following up in response to @ina's comment:
A UUID is not a string, even if MySQL chooses to represent it as such. It's binary data in its raw form, and those dashes are just MySQL's friendly way of representing it to you.
The most efficient storage for a UUID is to create it as UNHEX(REPLACE(UUID(),'-','')) - this will remove that formatting and convert it back to binary data. Those functions will make the original insertion slower, but all following comparisons you do on that key or column will be much faster on a 16-byte binary field than a 36-character string.
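As a sketch of the above (the table, column names, and the key value in the final query are hypothetical), creating and using such a binary(16) key might look like:

```sql
CREATE TABLE example (
  id   BINARY(16) NOT NULL PRIMARY KEY,
  text VARCHAR(255)
);

-- strip the dashes and convert the 32 hex digits to 16 raw bytes
INSERT INTO example (id, text)
VALUES (UNHEX(REPLACE(UUID(), '-', '')), 'some text');

-- query back using hex notation; HEX() makes the key readable again
SELECT HEX(id), text
FROM example
WHERE id = 0x2C9E81B385E211EE8F6A0242AC120002;  -- hypothetical key value
```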
For one thing, character data requires parsing and collation. Any strings coming into the query engine are generally collated automatically against the character set of the database, and some APIs (WordPress comes to mind) even run CONVERT() on all string data before querying. Binary data doesn't have this overhead. For another, your char(36) actually allocates 36 characters, which means (if your database is UTF-8) each character can be as long as 3 or 4 bytes depending on the version of MySQL you are using. So a char(36) can range anywhere from 36 bytes (if it consists entirely of low-ASCII characters) to 144 bytes if it consists entirely of high-order UTF-8 characters. This is much larger than the 16 bytes we have allocated for our binary field.
Any logic performed on this data can be done with UNHEX(), but is better accomplished by simply escaping data in queries as hex, prefixed with 0x. This is just as fast as reading a string, gets converted to binary on the fly and directly assigned to the query or cell in question. Very fast.
Reading data out is slightly slower - you have to call HEX() on all binary data read out of a query to get it in a useful format if your client API doesn't deal well with binary data (PHP in particular will often mangle binary strings if they are manipulated without first calling bin2hex(), base64_encode() or similar) - but this overhead is about as minimal as character collation and, more importantly, is only incurred on the actual cells SELECTed, not on all cells involved in the internal computations of a query result.
Of course, each of these small speed increases is minimal, and some other areas see small decreases - but when you add them all up, binary still comes out on top, and when you consider use cases and the general 'reads > writes' principle, it really shines.
... and that's why binary(16) is better than char(36).
It's pretty easy, actually - you can pass this to MySQL and it will return the inserted ID:
set @id = UUID();
insert into <table> (<col1>, <col2>) values (@id, 'another value');
select @id;
Depending on how the uuid() function is implemented, this is very bad programming practice - if you try to do this with binary logging enabled (i.e. in a cluster) then the insert will most likely fail. Ivan's suggestion looks like it might solve the immediate problem - however, I thought that only returned the value generated for an auto-increment field - indeed, that's what the manual says.
Also, what's the benefit of using uuid()? It's computationally expensive to generate, requires a lot of storage, increases the cost of querying the data, and is not cryptographically secure. Use a sequence generator or auto-increment instead.
Regardless if you use a sequence generator or uuid, if you must use this as the only unique key on the database, then you'll need to assign the value first, read it back into phpland and embed/bind the value as a literal to the subsequent insert query.
Related
So in this app, we have a user ID which is a simple auto-increment primary key. Since we do not want to expose this on the client side, we are going to use a simple hash (encryption is not important, only obfuscation).
So when a user is added to the table, we compute `uniqid() . $user_id`. This guarantees that the user hash is random enough and always unique.
The question I have is: while inserting the record, we do not know the user ID at that point (we cannot assume max(user_id) + 1, since there might be inserts being committed concurrently). So we are doing an insert, then getting the `last_insert_id`, then using that for the `user_id`, which adds an additional db query. Is there a better way to do this?
A few things before the actual answer: with the latest versions of MySQL, which use InnoDB as the default storage engine, you always want an integer PK (the famous auto_increment). The reasons are mostly performance; for more information, you can research how InnoDB clusters records by PK and why that's so important. With that out of the way, let's consider our options for creating a unique surrogate key.
Option 1
You calculate it yourself, using PHP and information you obtained back from MySQL (the last_insert_id()), then you update the database back.
Pros: easy to understand by even novice programmers, produces short surrogate key.
Cons: extremely bad for concurrent access, you'll probably get clashes, and you never want to use PHP to calculate unique indices required by the database.
You don't want that option
Option 2
Supply the uniqid() to your query, create an AFTER INSERT trigger that will concatenate uniqid() with the auto_increment.
Pros: easy to understand, produces short surrogate key.
Cons: requires you to create the trigger, and implements magic that's not visible from the code directly, which will definitely confuse a developer who inherits the project at some point - and from experience I would bet that bad things will happen.
Option 3
Use universally unique identifiers or UUIDs (also known as GUIDs). Simply supply your query with surrogate_key = UUID() and MySQL does the rest.
Pros: always unique, no magic required, easy to understand.
Cons: none, unless the fact that it occupies 36 chars bothers you.
You want the option 3.
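A minimal sketch of option 3 (the table and column names are hypothetical):

```sql
-- MySQL generates the surrogate key itself; no PHP round-trip needed
INSERT INTO users (surrogate_key, username)
VALUES (UUID(), 'example_user');
```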
Since we do not want to expose this at the client side
Simply don't.
In a well-designed database, users never need to see a primary-key value. In fact, a user need never know the primary key even exists.
From your question it seems you actually replace your normal auto-increment ID column with a surrogate ID (if not, skip to the last paragraph).
Try creating a column with another unique surrogate ID and use that on your frontend, while keeping your normal primary IDs for relationships etc.
Remember one of the basic rules for primary keys:
The primary key must be compact and contain the fewest possible attributes.
Also, integer serials have the advantage of being simple to use and implement. Depending on the specific implementation of the serial generator, they also have the advantage of being quickly derivable, as most databases just store the serial number in a fixed location - meaning instead of max(id)+1, the db already has it stored, which makes auto-increment fast.
So we are doing an insert then getting the `last_insert_id` then using that for the `user_id`, which adds an additional db query.
`last_insert_id` isn't actually a query - it's a variable saved on your db connection when you perform an insert query.
If you already have a second column for your surrogate ID ignore all the above:
So we are doing an insert then getting the `last_insert_id` then using that for the `user_id`, which adds an additional db query. So is there a better way to do this?
No, you can only retrieve that uniqid by doing a query.
$res = mysql_query('SELECT LAST_INSERT_ID() AS surrogate_id');
$row = mysql_fetch_array($res);
$lastsurrogateid = $row['surrogate_id'];
Anything else is making it more complicated than necessary.
In my database (MySQL) I have a table (MyISAM) containing a field called number. Each value of this field is either 0 or a positive number. The non-zero values must be unique. Finally, the value of the field is generated in my PHP code according to the value of another field (called isNew) in this table. The code follows.
$maxNumber = $db->selectField('select max(number)+1 m from confirmed where isNew = ?', array($isNew), 'm');
$db->query('update confirmed set number = ? where dataid = ?', array($maxNumber, $id));
The first line of code selects the maximum value of the number field and increments it. The second line updates the record by setting its freshly generated number.
This code is used concurrently by hundreds of clients, and I have noticed that duplicates of the number field sometimes occur. As I understand it, this happens when two clients read the value of the number field almost simultaneously, which leads to the duplicate.
I have read about the SELECT ... FOR UPDATE statement but I'm not quite sure it is applicable in my case.
So the question is should I just append FOR UPDATE to my SELECT statement? Or create a stored procedure to do the job? Or maybe completely change the way the numbers are being generated?
This is definitely possible to do. MyISAM doesn't offer transaction locking so forget about stuff like FOR UPDATE. There's definitely room for a race condition between the two statements in your example code. The way you've implemented it, this one is like the talking burro. It's amazing it works at all, not that it works badly! :-)
I don't understand what you're doing with this SQL:
select max(number)+1 m from confirmed where isNew = ?
Are the values of number unique throughout the table, or only within sets where isNew has a certain value? Would it work if the values of number were unique throughout the table? That would be easier to create, debug, and maintain.
You need a multi-connection-safe way of getting a number.
You could try this SQL, which sets the new number in a single statement (the extra derived table is needed because MySQL will not let an UPDATE select directly from the table it is updating):
UPDATE confirmed
SET number = (SELECT 1 + MAX(number)
              FROM (SELECT number, isNew FROM confirmed) AS t
              WHERE t.isNew = ?)
WHERE dataid = ?
This will perform badly. Without a compound index on (isNew, number), and without both those columns declared NOT NULL it will perform very very badly.
If you can use numbers that are unique throughout the table I suggest you create for yourself a sequence setup, which will return a unique number each time you use it. You need to use a series of consecutive SQL statements to do that. Here's how it goes.
First, when you create your tables create yourself a table to use called sequence (or whatever name you like). This is a one-column table.
CREATE TABLE sequence (
sequence_id INT NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`sequence_id`)
) AUTO_INCREMENT = 990000
This will make the sequence table start issuing numbers at 990,000.
Second, when you need a unique number in your application, do the following things.
INSERT INTO sequence () VALUES ();
DELETE FROM sequence WHERE sequence_id < LAST_INSERT_ID();
UPDATE confirmed
SET number = LAST_INSERT_ID()
WHERE dataid = ?
What's going on here? The MySQL function LAST_INSERT_ID() returns the value of the most recent autoincrement-generated ID number. Because you inserted a row into that sequence table, it gives you back that generated ID number. The DELETE FROM command keeps that table from snarfing up disk space; we don't care about old ID numbers.
LAST_INSERT_ID() is connection-safe. If software on different connections to your database uses it, they all get their own values.
If you need to know the last inserted ID number, you can issue this SQL:
SELECT LAST_INSERT_ID() AS sequence_id
and you'll get it returned.
If you were using Oracle or PostgreSQL, instead of MySQL, you'd find they provide SEQUENCE objects that basically do this.
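For comparison, a minimal sketch of such a sequence in PostgreSQL (this syntax is not valid in stock MySQL, although MariaDB 10.3+ supports it; the sequence name is a placeholder):

```sql
CREATE SEQUENCE confirmed_number_seq START 990000;

-- each call hands out the next unique number, safely across connections
SELECT nextval('confirmed_number_seq');
```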
Here's the answer to another similar question.
Fastest way to generate 11,000,000 unique ids
Using a PHP form to insert text strings from users into the table, and another form to pull them later on another page, what would be the best method of creating a table in MySQL for strings of text, and what options when creating the table would likely be necessary to handle them best?
The complicating factor, I suppose, is that the text that will exist in the table doesn't exist yet (as it still needs to be input through the form), and I am unsure if this is why I've had trouble (along with my relative inexperience, which leaves me unsure of what, precisely, would be an ideal table configuration).
Since I don't want to store any other data beyond this user input (like I said, just strings of text, i.e. a sentence), I assumed I only needed one column when creating the table, but I was unsure of this as well; it seems possible that I am just overlooking something about how SQL works.
I'll put my comments into an answer now:
Consider the estimated maximum length of such strings to decide whether to use varchar fields or text fields in MySQL.
Quoting from the MySQL-Manual (BTW a good read for your purpose):
Values in VARCHAR columns are variable-length strings. The length can be specified as a value from 0 to 255 before MySQL 5.0.3, and 0 to 65,535 in 5.0.3 and later versions.
http://dev.mysql.com/doc/refman/5.0/en/char.html
It is said that varchar is faster; for a good summary, see MySQL: Large VARCHAR vs. TEXT?
Consider having at least a second field called id (int, primary key, auto-increment) for when you need to reference those strings later. Consider a field referencing the author of each string. A field storing the date and time when the string was put into the database would be a good idea as well.
use mysqli or PDO instead of mysql, which is deprecated.
See here, there are links to good tutorials in the 1st answer: How do I migrate my site from mysql to mysqli?
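Putting the field suggestions above together, a sketch of such a table (all names are placeholders):

```sql
CREATE TABLE user_strings (
  id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  author_id  INT UNSIGNED NULL,      -- who submitted the string, if tracked
  created_at DATETIME     NOT NULL,  -- when it was submitted
  body       VARCHAR(500) NOT NULL   -- the string itself; use TEXT if longer
);
```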
I was wondering where the cut off point is to creating a field in a database to hold your data or generating the data yourself in your code. For example, I need to know certain value that is generated from two different columns in the database. Column1-Column2 = Column3. So here is the question will it be better to generate that data in the code or should I create a Column3 and put the data there while populating the DB and then retrieve it later. In my case the data is a two digit integer or a single character string, basically small data.
I am using the latest mysql the programming language is php with the mysqli library. Also this website should not get too much traffic and the size of the db will be 200k rows at the most.
This type of attribute (column) is called a derived attribute. You should not put it in the database, as it would increase redundancy. Just store column1 and column2 and calculate the difference while fetching, for example like this:
`Column1` - `Column2` as `Column3`
If you don't want to write the query that way every time, create a view with the derived attribute added.
Note: if the calculation is CPU-intensive, you should consider using a cache - and then you must decide how and when that cache will be invalidated.
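A sketch of such a view, using the column names from the question (the table and view names are placeholders):

```sql
CREATE VIEW my_table_with_derived AS
SELECT Column1,
       Column2,
       Column1 - Column2 AS Column3
FROM   my_table;
```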
It depends on how resource-hungry the calculation is and how often it's going to be done. In your case it's pretty simple, so storing the difference in a separate column would be overkill. You can do the calculation in the SQL query like this:
SELECT col1, col2, col1-col2 AS col3...
So, imagine a mysql table with a few simple columns, an auto increment, and a hash (varchar, UNIQUE).
Is it possible to give MySQL a query that will add a row and generate a unique hash, without multiple queries?
Currently, the only way I can think of to achieve this is with a while loop, which I worry would become more and more processor-intensive the more entries there were in the db.
Here's some pseudo-php, obviously untested, but gets the general idea across:
while (!query("INSERT INTO table (hash) VALUES ('" . generate_hash() . "');")) {
    // found conflict, try again
}
In the above example, the hash column would be UNIQUE, and so the query would fail on a collision. The problem is, say there are 500,000 entries in the db and I'm working off a base36 hash generator with 4 characters: the likelihood of a conflict would be almost 1 in 3, and I definitely can't be running 160,000 queries. In fact, any more than 5 I would consider unacceptable.
So, can I do this with pure SQL? I would need to generate a base62, 6 char string (like: "j8Du7X", chars a-z, A-Z, and 0-9), and either update the last_insert_id with it, or even better, generate it during the insert.
I can handle basic CRUD with MySQL, but even JOINs are a little outside of my MySQL comfort zone, so excuse my ignorance if this is cake.
Any ideas? I'd prefer to use either pure MySQL or PHP & MySQL, but hell, if another language can get this done cleanly, I'd build a script and AJAX it too.
Thanks!
This is our approach for a similar project, where we wanted to generate unique coupon codes.
First, we used an AUTO_INCREMENT primary key. This ensures uniqueness and query speed.
Then, we created a base24 numbering system, using A,B,C, etc, without using O and I, because someone might have thought that they were 0 or 1.
Then we converted the auto-increment integer to our base24 number. For example, 0=A, 1=B, 28=BE, 1458965=EKNYF. We used base24 because long base10 numbers come out as shorter strings in base24.
Then we created a separate column in our table, coupon_code. This was not indexed.
We took the base24 and added 3 random numbers, or I and O (which were not used in our base24), and inserted them into our number. For example, EKNYF could turn into 1EKON6F or EK2NY3F9. This was our coupon code and we inserted it into our coupon_code column. It's unique and random.
So, when the user uses code EK2NY3F9, all we have to do is remove the unused characters (2, 3 and 9) to get EKNYF, which we convert back to 1458965. We just select the primary key 1458965 and then compare the coupon_code column with EK2NY3F9.
I hope this helps.
If your heart is set on using base-36 4 character hashes (hashspace is only 1679616), you could probably pre-generate a table of hashes that aren't already in the other table. Then finding a unique hash would be as simple as moving it from the "unused table" to the "used table" which is O(1).
If your table is conceivably 1/3 full you might want to consider expanding your hashspace since it will probably fill up in your lifetime. Once the space is full you will no longer be able to find unique hashes no matter what algorithm you use.
What is this hash a hash of? It seems like you just want a randomly generated unique VARCHAR column? What's wrong with the auto increment?
Anyway, you should just use a bigger hash - find an MD5 function (if you're actually hashing something), or a UUID generator with more than 4 characters. And yes, you could use a while loop, but just generate a big enough value that conflicts are incredibly unlikely.
As others have suggested, what's wrong with an auto-increment field? If you want an alphanumeric value, you could simply convert the int to an alphanumeric string in base 36. This could be implemented in almost any language.
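In MySQL itself this conversion is built in via the CONV() function; a sketch (the id value is just an example):

```sql
-- convert an auto-increment id to a base-36 string and back
SELECT CONV(1458965, 10, 36);  -- 'V9QT'
SELECT CONV('V9QT', 36, 10);   -- '1458965'
```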
Going along with zneak's comment, why don't you use an auto-increment column? Save the hash in another (non-unique) field and concatenate the ID to it (dynamically), so you give the user [hash][id]. You can parse it back out in pure SQL using the substring functions.
Since you have to have the hash, the user can't look at other records by incrementing the id.
So, just in case someone runs across a similar issue: I'm using a UNIQUE field and a PHP hash function to insert the hashes; if the insert comes back with an error, I'll try again.
Hopefully, because of the low likelihood of conflict, it won't get slow.
You could also check the MySQL functions UUID() and UUID_SHORT(). Those functions generate UUIDs that are globally unique by definition. You won't have to double-check if your PHP-generated hash string already exists.
I think in several cases these functions can also fit your project's requirements. :-)
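A quick sketch of what each function returns (the values shown are examples only):

```sql
SELECT UUID();        -- e.g. '6ccd780c-baba-1026-9564-5b8c656024db' (36 chars)
SELECT UUID_SHORT();  -- e.g. 92395783831158784 (a 64-bit unsigned integer)
```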
If you already have the table filled with some content, you can alter it with the following:
ALTER TABLE `page` ADD COLUMN `hash` char(64) AS (SHA2(`content`, 256)) AFTER `content`
This solution adds the hash column right after the content one, and generates the hash for existing and new records alike, with no need to change your INSERT statements.
If you add a UNIQUE index to the column (after removing duplicates), your inserts will only succeed if the content is not already in the table. This prevents duplicates.
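A sketch of adding that unique index and then inserting with duplicate tolerance (table and column names as in the answer above; the index name is a placeholder, and INSERT IGNORE is one way to skip duplicates):

```sql
-- reject duplicate hashes at the index level (deduplicate the table first)
ALTER TABLE `page` ADD UNIQUE INDEX `uniq_hash` (`hash`);

-- the insert is silently skipped if the generated hash already exists
INSERT IGNORE INTO `page` (`content`) VALUES ('some content');
```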