I need to get a time-based unique ID that consists only of numbers. I was thinking of simply using something like this:
str_replace(".", "", microtime(true))
I would use this ID for many things, for example posting comments, and I'm expecting quite high traffic, so I would like to know how high the chances of a collision are with the microtime function.
Perhaps there is a better way to get a numeric unique ID? Just remember that it has to be based on time so that sorting can be done.
You can use this to get a unique ID, but it is not numeric:
$id = uniqid(); //58aea16085f6b
Use this to convert it to digits only:
$id = hexdec( uniqid() ); //1560112880181100
If that is too long, use this, but you may get negative numbers (on 32-bit builds of PHP):
$id = crc32( uniqid() ); //-1551585806
If you need only positive numbers, you can do this:
$id = abs( crc32( uniqid() ) ); //1551585806
If it must be based on time, you could use microtime. But I guess you are storing things in a database, so my vote would go to a primary key column id with auto_increment and a second column of type timestamp that defaults to the current time.
Then you can sort on the timestamp and also have a 100% unique identifier even with extremely high traffic. But if only ordering is needed (not searching between dates), then the ID doesn't need to be based on time.
1, 2, 3, 4 will be in the same order as microtime keys, which are just a lot larger and further apart.
UPDATE
If it must be a unique key and can't come from the database, try the following. The chances of a duplicate key are so slim that they can be ignored.
$key = (int) floor(microtime(true) * 1000000) + mt_rand(0, 9999); // microsecond timestamp plus a random offset
If you need to use a pure time based ID you need to ensure that you avoid collisions so a) use a primary key in the database and then b) use a recursive check to ensure that your time based ID has not already been taken:
$id = getTimeId();
function getTimeId() {
// Generate microtime
// Query to ensure it does not exist
// Return microtime
}
I shall leave it up to you, good sir, to enjoy making it recursive.
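One way to fill in that skeleton without recursion is a bounded retry loop. The sketch below is only an illustration under stated assumptions, not the answerer's code: nextTimeId and the $isTaken callback are hypothetical names, and the callback stands in for whatever database query checks the primary key. Checking first and inserting afterwards is still racy under concurrency, so the PRIMARY KEY or UNIQUE constraint has to remain the final authority.

```php
<?php
// Hypothetical helper: returns a microsecond timestamp that $isTaken reports as free.
function nextTimeId(callable $isTaken, int $maxAttempts = 100): int
{
    for ($attempt = 0; $attempt < $maxAttempts; $attempt++) {
        // Microseconds since the Unix epoch, as an integer.
        $id = (int) floor(microtime(true) * 1000000);
        if (!$isTaken($id)) {
            return $id;
        }
        usleep(1); // let the clock advance before retrying
    }
    throw new RuntimeException('Could not find a free time-based ID');
}

// Usage with an in-memory set standing in for the database table.
$used = [];
$isTaken = function (int $id) use (&$used): bool {
    return isset($used[$id]);
};

$first = nextTimeId($isTaken);
$used[$first] = true;
$second = nextTimeId($isTaken);
$used[$second] = true;
```

The loop gives up after a fixed number of attempts instead of recursing without limit, which keeps a misbehaving clock or a broken query from hanging the request.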
If you have to avoid collisions, a time-based ID won't work: eventually you'll get a collision. Why not use a counter with a semaphore?
http://php.net/manual/en/book.sem.php
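As a rough single-process illustration of the counter idea, the sketch below keeps the last issued value and bumps it by one whenever the clock has not moved on, so IDs stay time-ordered and never repeat. MonotonicTimeId is a hypothetical class name. It does not coordinate between PHP processes; for that the counter would have to live in shared storage guarded by a semaphore, as suggested above, or in the database.

```php
<?php
// Hypothetical class: hands out strictly increasing microsecond-based IDs
// within a single PHP process.
final class MonotonicTimeId
{
    private $last = 0;

    public function next(): int
    {
        $now = (int) floor(microtime(true) * 1000000);
        // If the clock has not advanced past the last ID, bump by one instead.
        $this->last = max($now, $this->last + 1);
        return $this->last;
    }
}

$generator = new MonotonicTimeId();
$ids = [];
for ($i = 0; $i < 1000; $i++) {
    $ids[] = $generator->next();
}
```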
Related
I am designing a shopping project and am thinking about using the Unix timestamp as the order ID. I know I can use MySQL auto-increment numbers as order IDs, like 10001, 10002, etc., but I don't want everybody to know how many orders there really are.
Is this a safe way to do it? I obviously don't expect more than 1 order per second, so I should be safe, right?
function check_number(){
    $unique_number = time();
    $exists = $this->count_rows('orders', "WHERE order_id='" . $unique_number . "'");
    if ($exists > 0){
        // Timestamp already used: try again and pass the result back up
        return $this->check_number();
    }
    return $unique_number;
}
A much more robust solution would be to use PHP's uniqid() function. It hides the number of orders in your system, but it won't go belly-up if two or more orders happen to come in during the same second.
As for the actual question: no, it's not safe to use unix timestamps as unique ids for semi-random events, like users making orders. However unlikely it may be, why leave in the possibility of a collision when it can be easily avoided?
I wouldn't recommend it, as there is a possibility that two orders are made in the same second. Use the default auto-increment and add a fixed offset to the ID (subtracting it again on the way in) so the user sees a larger number.
$ID = $_GET['ID'];
$ID -= 183727; // subtract offset
// Do mysql stuff
// Build an ID for an URL
$ID += 183727;
echo 'http://../test.php?order=' .$ID;
You can set up a cron job that runs
ALTER TABLE orders AUTO_INCREMENT = max_id + <something>; to move the counter forward randomly every day or hour (not the best thing to do, though), OR you can insert random ids (greater than the current maximum id) yourself.
If you want to keep integers in the DB, you can encode them for display with the base64_ family of functions plus a salt.
You can use unique slugs in URLs.
You can use com_create_guid or similar functions, or even generate random strings or numbers, but be sure you have a UNIQUE constraint in the database.
The problem with a timestamp is that under lots of concurrent operations it won't be unique within a given time frame; using microseconds and appending some random digits can help.
I'm generating a 5 letter uniqid in PHP this way:
$id = substr(uniqid(),0,5);
And every single time I call it, I get the value 5004b. Why is that happening?
If I remove the substr, the 5004b part stays constant while the remaining changes. Isn't this severely reducing the entropy of the GUID being generated?
uniqid() is based on microtime(), so the beginning is going to stay the same for a long time.
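To see why the prefix barely moves, it can help to split the value apart. The sketch below assumes the default 13-character output on common platforms, where the first 8 hex digits are the seconds and the last 5 are the microseconds; the variable names are just for illustration.

```php
<?php
$id = uniqid();

// Default uniqid() output: 8 hex digits of seconds followed by 5 hex digits of microseconds.
$seconds      = hexdec(substr($id, 0, 8));
$microseconds = hexdec(substr($id, 8, 5));

echo $id, "\n";
echo date('Y-m-d H:i:s', $seconds), ' + ', $microseconds, " microseconds\n";

// The first five characters leave out the lowest three hex digits of the seconds,
// so they only change once every 16^3 = 4096 seconds (about 68 minutes).
$prefixLifetimeSeconds = 16 ** 3;
```

So substr(uniqid(), 0, 5) is effectively a clock that ticks roughly once an hour, which matches the constant 5004b seen in the question.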
My suggestion is that you just increment every time or something if you need a 5 digit long uniqid.
uniqid() only works if you take the full value. It would make more sense to take the last five characters rather than the first:
$id = substr(uniqid(),-5);
However, after just one second you'll get repeating values. You really should just take the full uniqid().
To increase the entropy of the output, you can use uniqid('', true);. According to the doc, this "will add additional entropy (using the combined linear congruential generator) at the end of the return value, which increases the likelihood that the result will be unique."
If you need a unique ID, don't use uniqid(), as it only returns an ID based on the current time. If your computer is fast enough, it will produce the same values again and again.
Enlarging the entropy (by passing "true" as the second argument) helps, but it cannot hide the fact that this function is flawed, and should not be used more than once in a script.
You could use something like substr(uniqid(rand(),1),0,7), which generates IDs with a random prefix. Then again, if you are saving them to a database, you may need to check whether they already exist and generate again if so. Do something like:
function idExists($id){
//query database and check if the id already exists
}
//then use this to generate
$id=null;
do{
$id=substr(uniqid(rand(),1),0,7);
}while(idExists($id));
//then you get your unique id here
To produce a unique ID, I suppose I must use the uniqid function in PHP.
But uniqid produces a 13-digit hexadecimal number by default.
4f66835b507db
I would like to reduce this to a 7-digit numeric ID while preserving uniqueness. Is that possible?
4974012
This number will be used as a user ID. Authentication will be done with this ID and a password.
Some people say uniqid is not unique! Is it a bad choice?
Any "unique" number will eventually have a collision after generating enough records. To ensure uniqueness, you need to store the values you generated into a database and when generating next one, you need to check if there is no collision.
However, in practice, applications usually generate IDs as a simple sequence 1,2,3,... That way you know you won't get a collision until you run out of the datatype (UINT is usually 32 bits long, which gives you 4 billion unique ids).
Uniqid is not guaranteed to be unique, even in its full length.
Furthermore, uniqid is intended to be unique only locally. This means that if you create users simultaneously on two or more servers, you may end up with one ID for two different users, even if you use full-length uniqid.
My recommendations:
If you are really looking for globally unique identifiers (i.e. your application is running on multiple servers with separate databases), you should use UUIDs. These are even longer than the ones returned by uniqid, but there is no practical chance of collisions.
If you need only locally unique identifiers, stick with AUTO_INCREMENT in your database. This is (a little) faster and (a little) safer than checking if a short random ID already exists in your database.
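For the UUID option, a version-4 (random) UUID can be put together in a few lines. This is a minimal sketch assuming PHP 7 or later for random_bytes; uuidV4 is a hypothetical helper name, and a maintained library would be the safer choice in production.

```php
<?php
// Sketch of a version-4 (random) UUID as described in RFC 4122.
function uuidV4(): string
{
    $bytes = random_bytes(16);
    // Set the version field to 4 (upper four bits of byte 6).
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40);
    // Set the variant field to binary 10 (upper two bits of byte 8).
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80);

    // 32 hex digits split into eight groups of four, joined as 8-4-4-4-12.
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));
}

$uuid = uuidV4();
```

Note that a random UUID carries no time ordering, so it suits the "globally unique" case rather than the "sortable" one.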
EDIT: As it turns out in the comments below, you are looking not only for an ID for the user, but rather you are forced to provide your users with a random login name... Which is weird, but okay. In such case, you may try to use rand in a loop, until you get one that does not exist in your database.
Pseudocode:
$min = 1;
do {
$username = "user" . rand($min, $min * 10);
$min = $min * 10;
} while (user_exists($username));
// Create your user here.
Write a while loop that generates random letters and numbers of a desired length, which loops until it creates an ID that is not already in use.
Well, by reducing it to 7 characters and only numeric, you are reducing the 'uniqueness' by a lot.
I suggest using an auto increment of the user ID and start at 1000000 if it has to be 7 digits long.
If you really must generate it without auto increment, you can use mt_rand() to generate a random number 7 digits long:
$random = mt_rand(1000000, 9999999);
This is not ideal because you will need to check if the number is already in use by another user.
If you are using a database, define an id column as unique and auto-incremented, and then let the database manage your ids.
It's safer.
Read more : mysql-doc
Take a look at this article:
Create short IDs with PHP - Like Youtube or TinyURL
It explains how to generate short unique ids, like youtube does.
Actually, the function in the article is very related to php function base_convert which converts a number from a base to another (but is only up to base 36).
I have just found this great tutorial as it is something that I need.
However, after having a look, it seems that this might be inefficient. The way it works is, first generate a unique key then check if it exists in the database to make sure it really is unique. However, the larger the database gets the slower the function gets, right?
Instead, I was thinking, is there a way to add ordering to this function? So all that has to be done is check the previous entry in the DB and increment the key. So it will always be unique?
function generate_chars()
{
    $num_chars = 4; // length of the random string
    $i = 0;
    $my_keys = "123456789abcdefghijklmnopqrstuvwxyz"; // characters to choose from
    $keys_length = strlen($my_keys);
    $url = "";
    while ($i < $num_chars)
    {
        // Start at index 0 so the first character ("1") can be picked too
        $rand_num = mt_rand(0, $keys_length - 1);
        $url .= $my_keys[$rand_num];
        $i++;
    }
    return $url;
}
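function isUnique($chars)
{
    // Check the uniqueness of the chars.
    // Assumes $link is a mysqli connection; the old mysql_* API was removed in PHP 7.
    global $link;
    $stmt = mysqli_prepare($link, "SELECT 1 FROM `urls` WHERE `unique_chars` = ? LIMIT 1");
    mysqli_stmt_bind_param($stmt, "s", $chars);
    mysqli_stmt_execute($stmt);
    mysqli_stmt_store_result($stmt);
    $is_taken = mysqli_stmt_num_rows($stmt) > 0;
    mysqli_stmt_close($stmt);
    return !$is_taken;
}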
The tiny url people like to use random tokens because then you can't just troll the tiny url links. "Where does #2 go?" "Oh, cool!" "Where does #3 go?" "Even cooler!" You can type in random characters but it's unlikely you'll hit a valid value.
Since the key is rather sparse (4 values each having 36* possibilities gives you 1,679,616 unique values, 5 gives you 60,466,176) the chance of collisions is small (indeed, it's a desired part of the design) and a good SQL index will make the lookup be trivial (indeed, it's the primary lookup for the url so they optimize around it).
If you really want to avoid the lookup and just use auto-increment, you can create a function that turns an integer into a string of seemingly random characters with the ability to convert back. So "1" becomes "54jcdn" and "2" becomes "pqmw21". Similar to Base64 encoding, but not using consecutive characters.
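One possible way to build such a reversible mapping, offered here as a sketch rather than as what any particular URL shortener does, is to multiply the ID by a constant modulo the size of the key space and undo it with the modular inverse. KEY_SPACE, MULTIPLIER, encodeId and decodeCode are hypothetical names, the multiplier is an arbitrary choice, and the code assumes 64-bit PHP integers. It only obscures the sequence; anyone who collects a few codes can work out the pattern, so it is not a security measure.

```php
<?php
const KEY_SPACE  = 2176782336; // 36^6, so every code fits in six base-36 characters
const MULTIPLIER = 1000000007; // arbitrary choice; must share no factor with KEY_SPACE

// Extended Euclidean algorithm: finds x with ($a * x) % $m === 1.
function modInverse(int $a, int $m): int
{
    [$oldR, $r] = [$a, $m];
    [$oldS, $s] = [1, 0];
    while ($r !== 0) {
        $q = intdiv($oldR, $r);
        [$oldR, $r] = [$r, $oldR - $q * $r];
        [$oldS, $s] = [$s, $oldS - $q * $s];
    }
    if ($oldR !== 1) {
        throw new InvalidArgumentException('Multiplier and modulus must be coprime');
    }
    return (($oldS % $m) + $m) % $m;
}

// Valid for 0 <= $id < KEY_SPACE.
function encodeId(int $id): string
{
    $scrambled = ($id * MULTIPLIER) % KEY_SPACE;
    return str_pad(base_convert((string) $scrambled, 10, 36), 6, '0', STR_PAD_LEFT);
}

function decodeCode(string $code): int
{
    $scrambled = intval($code, 36);
    return ($scrambled * modInverse(MULTIPLIER, KEY_SPACE)) % KEY_SPACE;
}

$code = encodeId(10001);
$back = decodeCode($code);
```

Because the multiplication is a bijection on the key space, two different IDs can never map to the same code, so no uniqueness lookup is needed.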
(*) I actually like using fewer than 36 characters: single-cased, no vowels, and no similar characters (1, l, I). This prevents accidental swear words and also makes it easier for someone to speak the value to someone else. I even map similar characters to each other, accepting "0" for "O". If you're entirely machine-based, you could use upper and lower case and all digits for even greater possibilities.
In the database table, there is an index on the unique_chars field, so I don't see why that would be slow or inefficient.
UNIQUE KEY `unique_chars` (`unique_chars`)
Don't rush to do premature optimization on something that you think might be slow.
Also, there may be some benefit in a url shortening service that generates random urls instead of sequential urls.
I don't know why you'd bother. The premise of the tutorial is to create a "random" URL. If the random space is large enough, then you can simply rely on pure, dumb luck. If your random character space is 62 characters (A-Za-z0-9), then with the 4 characters they use, and given a reasonable random number generator, the chance of hitting any particular key is 1 in 62^4, which is 1 in 14,776,336. Five characters is 1 in 916,132,832. So, a conflict is, literally, "1 in a billion".
Obviously, as the documents fill, your odds increase for the chance of a collision.
With 10,000 documents, it's 1 in 91,613, almost 1 in 100,000 (for round numbers).
That means, for every new document, you have a 1 in 91,613 chance of hitting the DB again for another pull on the slot machine.
It is not deterministic. It's random. It's luck. In theory, you can hit a string of really, really, bad luck and just get collision after collision after collision. Also, it WILL, eventually, fill up. How many URLs do you plan on hashing?
But if 1 in 91,613 odds isn't good enough, boosting it to 6 chars makes it more than 1 in 5M for 10,000 documents. We're talking almost LOTTO odds here.
Simply put, make the key big enough (7 characters? 8?) and the problem pretty much "wishes" itself out of existence.
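The figures in this answer can be reproduced with a few lines of arithmetic. The sketch below uses hypothetical helper names (keySpace, oneInOdds) and the same simplification as the text: the odds that one new random key hits one of the keys already stored.

```php
<?php
// Number of distinct keys for a given alphabet size and key length.
function keySpace(int $alphabetSize, int $length): int
{
    return $alphabetSize ** $length;
}

// "1 in N" odds that one new random key hits one of $existing stored keys.
function oneInOdds(int $alphabetSize, int $length, int $existing): int
{
    return intdiv(keySpace($alphabetSize, $length), $existing);
}

foreach ([4, 5, 6] as $length) {
    printf(
        "%d chars: %d keys, about 1 in %d with 10,000 stored\n",
        $length,
        keySpace(62, $length),
        oneInOdds(62, $length, 10000)
    );
}
```

Each extra character multiplies the key space by 62, which is why going from 5 to 6 characters moves the odds from about 1 in 91,613 to more than 1 in 5 million.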
Couldn't you encode the URL as Base36 when it's generated, and then decode it when visited - that would allow you to remove the database completely?
A snippet from Channel9:
The formula is simple, just turn the Entry ID of our post, which is a long into a short string by Base-36 encoding it and then stick 'http://ch9.ms/' onto the front of it. This produces reasonably short URLs, and can be computed at either end without any need for a database look up. The result, a URL like http://ch9.ms/A49H is then used in creating the twitter link.
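In PHP, the scheme described in that quote can be sketched with base_convert and intval. The function names idToSlug and slugToId are made up for illustration, and the upper-casing is only there to match the look of the quoted URL.

```php
<?php
// Turn a numeric entry ID into a short base-36 slug and back.
function idToSlug(int $id): string
{
    return strtoupper(base_convert((string) $id, 10, 36));
}

function slugToId(string $slug): int
{
    return intval($slug, 36); // intval() accepts upper- and lower-case digits
}

$slug = idToSlug(46655);          // 46655 = 36^3 - 1, i.e. "ZZZ"
$url  = 'http://ch9.ms/' . $slug; // prefix taken from the quote above
```

Since the slug is just the ID in another base, it is fully sequential and guessable; it removes the lookup for the short code but not the need to store the long URL under that ID.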
I solved a similar problem by implementing an algorithm that generates serial numbers one by one in base 36. I used my own ordering of the base-36 characters, each of which is unique. Since it generates numbers serially, I did not have to worry about duplicates. How complex and random the number looks depends on the ordering of the base-36 characters, and only to the public at that, because to my application they are just serial numbers :)
Check out this guy's functions - http://www.pgregg.com/projects/php/base_conversion/base_conversion.php source - http://www.pgregg.com/projects/php/base_conversion/base_conversion.inc.phps
You can use any base you like, for example to convert 554512 to base 62, call
$tiny = base_base2base(554512, 10, 62); and that evaluates to $tiny = '2KFk'.
So, just pass in the unique id of the database record.
In a project I used this in, I removed a few characters from the $sChars string and am using base 58. You can also rearrange the characters in the string if you want the values to be less easy to guess.
You could of course add ordering by simply numbering the urls:
http://mytinyfier.com/1
http://mytinyfier.com/2
and so on. But if the hash key is indexed in the database (which it obviously should be), the performance boost would be minimal at best.
I wouldn't bother doing ordered enumeration for two reasons:
1) SQL servers are very effective at checking such hash collisions (given correct indexes)
2) That might hurt privacy, as users would be able to easily figure out what other users are tinyurl-ing.
Use autoincrement on the database, and get the latest id as described by http://www.acuras.co.uk/articles/24-php-use-mysqlinsertid-to-get-the-last-entered-auto-increment-value
Perhaps this is a bit off-topic, but my general rule for creating unique keys is simple: md5( time() * 100 + rand( 0, 100 ) ). Note that rand( 0, 100 ) has only 101 possible values, so two people using the service within the same second have roughly a 1 in 100 chance of getting the same result; widen the random range if that is too likely for your traffic.
That said, md5( rand( 0, n ) ) works too.
That might work, but the easiest way to accomplish the problem would probably be with hashing. Theoretically speaking, hashing runs in O(1) time, as in, it only has to perform the hash, and then does only one actual hit to the database to retrieve the value. Then, you would introduce complications for checking for hash collisions, but it seems like this is probably what most of the tinyurl providers do. And, a good hash function isn't terribly hard to write.
I have also created small tinyurl service.
I wrote a script in Python that generates keys and stores them in a MySQL table named tokens with status U (unused).
It runs offline: I have a cron job on my VPS that runs the script every 10 minutes. The script checks whether there are fewer than 1000 keys in the table; if so, it keeps generating keys and inserting those that do not already exist in the table until the count is back up to 1000.
For my service, 1000 keys per 10 minutes is more than enough; you can adjust the interval or the number of keys to your needs.
Now when a tiny URL needs to be created on my website, my PHP script just fetches any unused key from the table and marks its status as T (taken). The PHP script does not have to worry about uniqueness, as my Python script has already populated the table with unique keys only.
Couldn't you just trim the hash to the length you wish?
$tinyURL = substr(md5($longURL . time()),0,4);
Granted, this may not provide as much pseudo randomness as using the entire string length. But, if you hash the long URL concatenated with the time(), wouldn't this be sufficient? Thoughts on using this method? Thanks!
I've seen lots of examples of how to use uniqid() in PHP to create a unique string, but need to create a unique order number (integers only, no letters).
I liked the idea of uniqid() because from what I understand it uses date/time, so the chances of having another id created that is identical is nil.... (if I'm understanding the function correctly)
mt_rand should do the trick.
It generates a random number between its first parameter and its second parameter. For example, to generate a random number between 500 and 1000, you'd do:
$number = mt_rand(500,1000);
But if you're using it as an order number, you should just use an autoincrement column. Not only is that what it's there for, but what would you do in the event where the same number was generated more than once? Assuming you're using MySQL, you can read about autoincrement columns here.
Use hexdec to convert the hex string to a number. http://us.php.net/manual/en/function.hexdec.php
hexdec(uniqid())
uniqid() does what you think it does, but if you're plugging this value into a database, you're better off using an auto-incrementing field for IDs. It really depends on what you're using the IDs for.
I personally use date('U') to generate a string based on the number of seconds since the UNIX EPOCH. If this isn't random enough (if you think you're going to have two orders being placed within the same exact second) simply add another layer with mt_rand(0,9):
$uniqid = date('U') . mt_rand(0,9);
This will, in almost all cases, give you an incremental ID except for the case of having orders created at exactly the same second, in which case the second order could precede the first.