I'm trying to send a random number to the database for a user/article ID. It is currently using auto increment as a counting system. However, I'd like for the number to be random and unpredictable.
The mt_rand() function in PHP does exactly what I need. My question, though, is what happens when the function returns a number that is already in use. Of course I can just check whether the number already exists, but if it keeps picking numbers that are in use, I imagine that would slow the operation down.
Any thoughts on what I might be able to do to get around this? Perhaps I'm going at this all wrong.
Also if there's a function that gives letters and numbers that would also help greatly (like Youtube's).
Thanks for reading!
Here is a simple function to create a 10 character long string. The string is built using upper/lowercase text and numbers. Auto increment is definitely the way to go, however, if you are dead set, the function below should help.
<?php
function randomID()
{
$ID = substr(str_shuffle(str_repeat('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789',5)),0,10);
echo $ID;
}
randomID();
?>
To make the string longer, change 10 to whatever you like. To ensure the ID does not already exist, I would suggest generating the new ID and then searching the database for it before inserting. Granted, this is an extra step in the chain, but unfortunately it is what needs to be done.
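A minimal sketch of the same idea using random_int() (PHP 7 and later), which draws from the operating system's CSPRNG rather than relying on str_shuffle(). The function name randomIdSecure() and the default alphabet are assumptions made for illustration. It returns the ID instead of echoing it, so the caller can check it against the database before inserting.

```php
<?php
// Hypothetical helper: builds an ID by picking each character independently
// with random_int(), so characters may repeat within one ID.
function randomIdSecure(int $length = 10): string
{
    $alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
    $max = strlen($alphabet) - 1;
    $id = '';
    for ($i = 0; $i < $length; $i++) {
        $id .= $alphabet[random_int(0, $max)];
    }
    return $id;
}

echo randomIdSecure(), "\n";
```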
Hope this helps
You should always use an auto_increment field as the primary key of your database. Not doing that costs you a great deal in performance. You can certainly create a secondary ID field with your random ID. I'd probably use a hashing function to get the best chance of a random string:
<?php $key = md5(rand(0,999).time().$myItemTitle); // ex. ce4075a3d3f6fd757eb6dd44810cbe14
You should always (in normal use cases) use an auto incremented ID for performance reasons. If your purpose is to hide the next post because someone could be guessing its ID, then you had better add some kind of hashed unique field to your database.
Always random-looking (just encrypting the current timestamp; note that time() returns seconds, not milliseconds):
<?php
$value = time();
$key = "543yretghf436436";
$encrypted = mcrypt_encrypt(MCRYPT_RIJNDAEL_128, $key, $value, MCRYPT_MODE_ECB);
// if you want an even longer string, change 128 to 256
$encrypted = base64_encode($encrypted);
$encrypted = rtrim($encrypted, '=');
echo $encrypted;
?>
e.g.
Egttu2XhRGdAiXVfszscWg
XlttfR3XaL6pym1uSNY7Kg
YvoKCweUnN8gZyodRYysLA
What you actually want is some "random" key to use as an identifier for the article. I would keep the auto_increment and either:
add a column with a "hashkey" or "random key" to identify the article. This poses the "I already have this key" issue (which should not be that large unless you have billions of articles). See some code examples already posted.
create an extra table with pregenerated keys (i.e. 10000 id -> key values) where you can lookup the id by key. If the table runs out you can easily generate new values. This way you don't have to worry about getting "slow" generation speed.
I have a file sharing website, and every file has a random id. Example for an id: G4t68MgW7
On every upload I create a random id and check whether it already exists (in a loop). There are some issues with this approach:
I have to check whether the id exists (a MySQL query)
It's a limited range
So how can I create a unique id without limitation and without checking whether it already exists?
Note: I don't use auto increment because I want to prevent bots from reaching every file on my website. Example of how it looks in the browser: http://www.example.com/file/G4t68MgW7
You can assign a timestamp value, i.e. time(), as the id. It will be unique as long as no two uploads happen within the same second.
Well, you more or less gave the answer yourself.
Illustrated with the following pseudocode:
while (true) {
hash = generate_hash();
SQL -> Check if hash found
if (!found) {
break;
}
}
It is pretty easy to implement this. The generate hash could be a simple md5 or it could be a function that builds a random string based on an array of letters. For example something as simple as:
function generate_hash() {
return '$2y$' . substr(md5(time() . 'foo' . rand(0, 1000000) . 'bar'), 0, 15) . 'ydfdf';
}
In 99.999% of all cases, the hash would be unique, so performance should not be an issue here. This also creates more "randomness" than uniqid().
echo substr(uniqid(rand(10,1000),false),rand(0,10),6);
You can have a table of pre-defined identifiers, so you make sure that they are unique in creation time (you don't have to query if they exist; simply insert and don't do anything if the insert fails). When you want a file to be uploaded, get an unused code and mark it as used so it's not used again.
You can also have a cron to check if you're running out of codes, and run the generation script again (increasing the number of characters makes the number of codes virtually unlimited). As this is asynchronous, it won't affect performance.
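The generation script mentioned above could look roughly like the sketch below. The function name generateCodePool(), the alphabet, and the code length are assumptions. Duplicates are dropped while building the pool, so every code handed to the table is distinct at creation time.

```php
<?php
// Hypothetical helper: returns $count distinct random codes of $length chars.
// $count must not exceed the number of possible codes, or the loop never ends.
function generateCodePool(int $count, int $length = 8): array
{
    $alphabet = 'abcdefghijklmnopqrstuvwxyz0123456789';
    $max = strlen($alphabet) - 1;
    $pool = [];
    while (count($pool) < $count) {
        $code = '';
        for ($i = 0; $i < $length; $i++) {
            $code .= $alphabet[random_int(0, $max)];
        }
        // Using the code as the array key silently drops duplicates.
        $pool[$code] = true;
    }
    // All-digit keys are stored as integers by PHP, so cast back to strings.
    return array_map('strval', array_keys($pool));
}

$codes = generateCodePool(1000, 8);
echo count($codes), " codes generated\n";
```

The cron job would then bulk-insert these into the codes table. A unique index on the column is still a sensible safety net, because a newly generated batch can overlap with codes from earlier batches.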
I want to generate a unique code for each project being created. I have an HTML5 webpage that allows a user to create a new project, and each project, once created successfully, should be assigned a unique code.
I am making an Ajax call to a PHP file on the web server, which in turn saves the project details in a MySQL database. I have a column in the table that stores the unique code for each project created.
I am confused about how to create this code. Should I do it in PHP or in MySQL? I want it to be a unique code which will be used by the client to distribute to their customers.
I haven't decided on the length of the key yet, but it should be around 8 characters (a combination of letters and digits is fine). I know I could use a Hashtable in Java to create this code based on the inputs from the user, but I am new to PHP/MySQL.
Any advice?
Note: My aim is that the key should never be repeated
You can use PHP's uniqid() to generate a unique ID. However, this should not be used for security purposes, as explicitly stated in the PHP manual. For more info, see the uniqid() page in the PHP manual.
Example:
$unique_key = uniqid();
echo $unique_key; // Outputs unique alphanumeric key, like 5369adb278516
Generate Code:
// $length is the length of code you want to return
function generate_code($length) {
$charset = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ123456789012345678900987654321234567890";
return substr(str_shuffle($charset), 0, $length);
}
To get the verification code, generate a candidate with generate_code(50) and pass it to user_code_exists().
That method checks the database for a row with the same value; despite its name, it returns true when no such row is found (the code doesn't exist yet).
// Generate codes until one is found that is not in the database yet
$verification_code = "";
do {
$code = generate_code(50);
} while ($this->user_code_exists($code) == false);
$verification_code = $code;
public function user_code_exists($code) {
$query = $this->db->prepare("SELECT verification_code FROM accounts WHERE verification_code = :verification_code");
$query->execute(array(':verification_code' => $code));
return ($query->rowCount() == 0) ? true : false;
}
Once the loop ends, the variable $verification_code holds the unique generated code.
This is just an overview, I hope this helps.
See the answers given for this question:
What is the best way to create a random hash/string?
In particular, if you want a purely random value (as opposed to, say, a hash of the project name) then see the answer by @Gajus Kuizinas, except that using base64_encode rather than bin2hex will give a shorter but still readable value:
base64_encode(mcrypt_create_iv(8, MCRYPT_DEV_URANDOM));
will give you 11 characters: NTM2OWI0YzR
Or if you don't have the mcrypt library installed, try:
base64_encode(hex2bin(uniqid()."0")); // Derived from microtime (the "0" is needed since uniqid() gives an odd number of characters)
gives 10 characters: U2m5vF8FAA after discarding the trailing '=='
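The mcrypt extension was removed from PHP's core in 7.2, so on current versions a comparable value can be produced with random_bytes(). The sketch below is one way to do it; the function name randomToken() is an assumption, and the swap to the URL-safe base64 alphabet is a choice made here so the value can go into a link unescaped.

```php
<?php
// Hypothetical helper: $bytes random bytes from the OS CSPRNG, encoded with
// the URL-safe base64 alphabet and with the '=' padding removed.
function randomToken(int $bytes = 8): string
{
    $b64 = base64_encode(random_bytes($bytes));
    return rtrim(strtr($b64, '+/', '-_'), '=');
}

echo randomToken(), "\n"; // 11 characters for 8 bytes
```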
If you want to be paranoid about the project code never repeating, add a unique index to the column in your MySql table that stores the unique code for each project created, and repeat the number generation if your insert into the table fails.
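The retry-on-failed-insert idea can be sketched independently of the database layer. Everything named here (insertWithUniqueCode(), the two callbacks, the attempt limit) is hypothetical; in a real application the second callback would run the INSERT and report whether the unique index rejected it.

```php
<?php
// Hypothetical helper. $generate returns a candidate code; $tryInsert attempts
// the INSERT and returns false when the unique index rejects the code.
function insertWithUniqueCode(callable $generate, callable $tryInsert, int $maxAttempts = 5): string
{
    for ($attempt = 0; $attempt < $maxAttempts; $attempt++) {
        $code = $generate();
        if ($tryInsert($code)) {
            return $code;
        }
    }
    throw new RuntimeException("No unique code found after $maxAttempts attempts");
}

// Usage with an array standing in for the table and its unique index.
$table = ['aaaa' => true];
$candidates = ['aaaa', 'bbbb'];
$code = insertWithUniqueCode(
    function () use (&$candidates) { return array_shift($candidates); },
    function (string $c) use (&$table) {
        if (isset($table[$c])) { return false; } // duplicate: "insert" fails
        $table[$c] = true;
        return true;
    }
);
echo $code, "\n"; // bbbb
```

Letting the unique index do the check avoids a separate SELECT and closes the gap between checking and inserting.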
As noted by @Mark M above, if you are concerned about security or someone masquerading as an existing project code, see @Anthony Forloney's answer in the related question link above. In particular:
Numbers used once (NONCE) - They are used on requests to prevent unauthorized access, they send a secret key and check the key each time your code is used.
You can check out more at PHP NONCE Library from FullThrottle Development
I needed to do something similar, a way to keep ids unique, and I ended up using the PHP function time() like this: $reference_number = 'BFF-' . time(); You can change the BFF to something that makes more sense for your business logic. This way I don't have to worry about whether the new id being generated was taken before.
I hope this helps
I have written a function to generate a random string of 7 alphanumeric characters which I am then inserting in a mysql database.
Here is the code :
function getRandomID(){
$tmp ="";
$characters=array("A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z","1","2","3","4","5","6","7","8","9");
for($i=0;$i<7;$i++)
$tmp.=$characters[rand(0,count($characters)-1)];
return $tmp;
}
I am not checking for duplicates atm because I anticipate there will be no more than 1000 entries in the database and I've calculated that this function can return (35)^7 = 64,339,296,875 possible values.
I am testing it out locally as well as on a live server.
The problem is that, just in the last hour, this function generated duplicate values twice.
I came upon 3 entries in the database all of which had the same random string.
I do not know what could have caused this as I tried numerous times afterwards and the problem wasn't reproducible.
Does anybody have any idea what could be going on here?
Many thanks in advance
Designing your code with the mindset of "meh, that's not going to happen" is a very risky game. Just do it properly once, so you don't have to come back to your code multiple times to quick-fix minor things like these.
Do the duplicate check and you'll be solid.
You can create a function like
function stringExists($string)
{
...
return $boolValue;
}
And you can easily create a while loop that keeps generating a new string for as long as the generated one already exists.
$duplicate = true;
while($duplicate)
{
$newString = getRandomID();
// keep looping while the freshly generated string is already taken
$duplicate = stringExists($newString);
}
// Work with the newest string that is not a duplicate.
If you really want to get into it
You can then take a look at the documentation for rand() if you want to find out what might be causing your problem. Besides, 3 duplicate entries don't mean much if we don't know how many entries there are in total. Also, "random" functions are sometimes not as random as one might think: in some programming languages they are always usable but need to be seeded before they produce properly random-looking output.
The timing of the inserts might also be part of the problem; there are plenty of threads on the internet, like this one on Stack Overflow, with interesting points about what can affect your "random"ness.
Whether or not that is the cause here (which has been questioned in the comments), you can be pretty sure to find an answer to your question in related threads and topics.
Short answer: Don't think about it and do a duplicate check, it's easy.
Note that you should, of course, put a UNIQUE constraint on your ID column in the database to begin with.
1. Random != unique. Collisions happen. Check that the value is unique before you insert into the database, and/or put an integrity constraint in your DB to enforce uniqueness.
2. If you're using a very old version of PHP [e.g. pre-4.2] you have to seed the random number generator with srand().
3. Aside from #2, it's probably not your getRandomID() function but something else in your code that's re-using previous values.
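To put a rough number on how likely a chance collision is with the question's figures (35^7 possible strings, about 1000 rows), the usual birthday-bound approximation p ≈ 1 - e^(-n(n-1)/(2N)) can be evaluated directly. The function name collisionProbability() is an assumption, and the estimate presumes a uniform, properly seeded generator.

```php
<?php
// Approximate probability of at least one collision among $n uniformly random
// draws from $space possible values (birthday-bound approximation).
function collisionProbability(int $n, float $space): float
{
    $x = $n * ($n - 1) / (2 * $space);
    return -expm1(-$x); // 1 - e^(-x), computed accurately for tiny x
}

$p = collisionProbability(1000, 35 ** 7);
printf("%.2e\n", $p);
```

The result is a little under one in 100,000, which supports the suspicion that two duplicates within an hour come from seeding or from re-used values elsewhere in the code rather than from bad luck.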
If you need to enter unique data in the DB, you may use the PHP function uniqid(). (http://ca3.php.net/uniqid)
The function generates a more or less random string based on the current time in microseconds. So in theory it is unique.
But still, it's always good to check before inserting. Or at least put a UNIQUE index on the field.
You could do something like this:
function randomString($length, $chars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789") {
$string = "";
$charsLength = strlen($chars);
for ($i = 0; $i < intval($length); $i++) {
$string .= $chars[rand(0, $charsLength - 1)];
}
return $string;
}
The function above will generate a random string of the given length from the given characters. This makes it a little more flexible than your implementation, in case you need to use it in another context later.
Then you could do a check like this:
$id = null;
do {
$id = randomString(7);
} while (!isUnique($id));
// do your insert here. You need to write your isUnique, so that it checks if
// the given string is unique or not.
I'm interested in creating tiny url like links. My idea was to simply store an incrementing identifier for every long url posted and then convert this id to its base 36 variant, like the following in PHP:
$tinyurl = base_convert($id, 10, 36);
The problem here is that the result is guessable, while it has to be hard to guess what the next url is going to be, while still being short (tiny). Eg. atm if my last tinyurl was a1, the next one will be a2. This is a bad thing for me.
So, how would I make sure that the resulting tiny url is not as guessable but still short?
What you are asking for is a balance between reduction of information (URLs to their indexes in your database), and artificial increase of information (to create holes in your sequence).
You have to decide how important each of these is for you. Another question is whether you just do not want sequential URLs to be guessable, or want them sufficiently random to make guessing any valid URL difficult.
Basically, you want to declare n out of N valid ids. Choose N smaller to make the URLs shorter, and make n smaller to generate URLs that are difficult to guess. Make n and N larger to generate more URLs when the shorter ones are taken.
To assign the ids, you can just take any kind of random generator or hash function and cap this to your target range N. If you detect a collision, choose the next random value. If you have reached a count of n unique ids, you must increase the range of your ID set (n and N).
I would simply crc32 url
$url = 'http://www.google.com';
$tinyurl = hash('crc32', $url ); // db85f073
cons: constant 8 character long identifier
This is really cheap, and it only works as long as the user doesn't know it's happening, but you could prefix and postfix the actual id with 2 or 3 random numbers/letters.
If I saw 9d2a1me3 I wouldn't guess that dm2a2dq2 was the next in the series.
Try Xor'ing the $id with some value, e.g. $id ^ 46418 - and to convert back to your original id you just perform the same Xor again i.e. $mungedId ^ 46418. Stack this together with your base_convert and perhaps some swapping of chars in the resultant string and it'll get quite tricky to guess a URL.
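A minimal sketch of the XOR-plus-base_convert combination, using the 46418 mask from the answer. The function names obfuscateId() and deobfuscateId() are assumptions, and the character-swapping step is left out. This is obfuscation only: anyone who learns or guesses the mask can reverse it.

```php
<?php
// Hypothetical helpers: XOR the numeric id with a fixed mask, then base-36 encode.
function obfuscateId(int $id, int $mask = 46418): string
{
    return base_convert((string) ($id ^ $mask), 10, 36);
}

// Reverse: decode from base 36, then apply the same XOR again.
function deobfuscateId(string $code, int $mask = 46418): int
{
    return ((int) base_convert($code, 36, 10)) ^ $mask;
}

foreach ([1, 2, 3] as $id) {
    $code = obfuscateId($id);
    echo $id, ' -> ', $code, ' -> ', deobfuscateId($code), "\n";
}
```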
Another way would be to set the maximum number of characters for the URL (let's say it's n). You could then choose a random number between 1 and n!, which would be your permutation number.
For each new URL, you would increment the id and use the permutation number to derive the actual id that would be used. Finally, you would base 32 (or whatever) encode your URL. The result would look random and be perfectly reversible.
If you want an injective function, you can use any form of encryption. For instance:
<?php
$key = "my secret";
$enc = mcrypt_ecb (MCRYPT_3DES, $key, "42", MCRYPT_ENCRYPT);
$f = unpack("H*", $enc);
$value = reset($f);
var_dump($value); //string(16) "1399e6a37a6e9870"
To reverse:
$rf = pack("H*", $value);
$dec = rtrim(mcrypt_ecb (MCRYPT_3DES, $key, $rf, MCRYPT_DECRYPT), "\x00");
var_dump($dec); //string(2) "42"
This will not give you a number in base 32; it will give you the encrypted data with each byte converted to base 16 (i.e., the conversion is global). If you really need, you can trivially convert this to base 10 and then to base 32 with any library that supports big integers.
You can pre-define the 4-character codes in advance (all possible combinations), then randomize that list and store it in this random order in a data table. When you want a new value, just grab the first one off the top and remove it from the list. It's fast, no on-the-fly calculation, and guarantees pseudo-randomness to the end-user.
Hashids is an open-source library that generates short, unique, non-sequential, YouTube-like ids from one or many numbers. You can think of it as an algorithm to obfuscate numbers.
It converts numbers like 347 into strings like "yr8", or array like [27, 986] into "3kTMd". You can also decode those ids back. This is useful in bundling several parameters into one or simply using them as short UIDs.
Use it when you don't want to expose your database ids to the user.
It allows custom alphabet as well as salt, so ids are unique only to you.
Incremental input is mangled to stay unguessable.
There are no collisions because the method is based on integer to hex conversion.
It was written with the intent of placing created ids in visible places, like the URL. Therefore, the algorithm avoids generating most common English curse words.
Code example
$hashids = new Hashids();
$id = $hashids->encode(1, 2, 3); // o2fXhV
$numbers = $hashids->decode($id); // [1, 2, 3]
I ended up creating a md5 sum of the identifier, use the first 4 alphanumerics of it and if this is a duplicate simply increment the length until it is no longer a duplicate.
function idToTinyurl($id) {
$md5 = md5($id);
for ($i = 4; $i < strlen($md5); $i++) {
$possibleTinyurl = substr($md5, 0, $i);
$res = mysql_query("SELECT id FROM tabke WHERE tinyurl='".$possibleTinyurl."' LIMIT 1");
if (mysql_num_rows($res) == 0) return $possibleTinyurl;
}
return $md5;
}
Accepted relet's answer as it led me to this strategy.
I have built a PHP script to host a large number of images uploaded by users. What is the best way to generate random numbers for the image filenames so that there will be no filename conflicts in the future? Something like what ImageShack does. Thanks.
$better_token = uniqid(md5(mt_rand()), true);
Easiest way would be a new GUID for each file.
http://www.php.net/manual/en/function.uniqid.php#65879
Here's how I implemented your solution
This example assumes i want to
Get a list, containing 50 numbers that is unique and random, and
This list of # to come from the number range of 0 to 1000
Code:
//developed by www.fatphuc.com
$array = array(); //define the array
//set random # range
$minNum = 0;
$maxNum = 1000;
// i just created this function, since we’ll be generating
// # in various sections, and i just want to make sure that
// if we need to change how we generate random #, we don’t
// have to make multiple changes to the codes everywhere.
// (basically, to prevent mistakes)
function GenerateRandomNumber($minNum, $maxNum){
return round(rand($minNum, $maxNum));
}
//generate 50 unique random #s (the array starts out empty)
for($i = 0; $i < 50; $i++){
$num1 = GenerateRandomNumber($minNum, $maxNum);
while(in_array($num1, $array)){
$num1 = GenerateRandomNumber($minNum, $maxNum);
}
$array[$i] = $num1;
}
asort($array); //just want to sort the array
//this simply prints the list of #s in list style
echo '<ol>';
foreach ($array as $var){
echo '<li>';
echo $var;
echo '</li>';
}
echo '</ol>';
Keep a persistent list of all the previous numbers you've generated(in a database table or in a file) and check that a newly generated number is not amongst the ones on the list. If you find this to be prohibitively expensive, generate random numbers on a sufficient number of bits to guarantee a very low probability of collision.
You can also use an incremental approach of assigning these numbers, like a concatenation of a timestamp_part based on the current time and a random_part, just to make sure you don't get collisions if multiple users upload files at the same time.
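One way the timestamp_part plus random_part concatenation could look is sketched below. The function name timestampedName(), the separator, and the 6 random bytes are assumptions. The time part keeps names roughly sortable, and the random part separates uploads that arrive within the same second.

```php
<?php
// Hypothetical helper: UTC timestamp part, random part, and file extension.
function timestampedName(string $ext = 'jpg'): string
{
    $timePart = gmdate('YmdHis');            // 14 digits, second resolution
    $randomPart = bin2hex(random_bytes(6));  // 12 hex characters
    return $timePart . '_' . $randomPart . '.' . $ext;
}

echo timestampedName(), "\n";
```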
You could use microtime() as suggested above and then append a hash of the original filename to further avoid collisions in the (rare) case of exactly simultaneous uploads.
There are several flaws in your postulate that random values will be unique - regardless of how good the random number generator is. Also, the better the random number generator, the longer it takes to calculate results.
Wouldn't it be better to use a hash of the data file? That way you get the added benefit of detecting duplicate submissions.
If detecting duplicates is known to be a non-issue, then I'd still recommend this approach but modify the output based on detected collisions (but using a MUCH cheaper computation method than that proposed by Lo'oris) e.g.
$candidate_name=generate_hash_of_file($input_file);
$offset=0;
while (file_exists($candidate_name . strrev($offset)) && ($offset<50)) {
$offset++;
}
if ($offset<50) {
rename($input_file, $candidate_name . strrev($offset));
} else {
print "Congratulations - you've got the biggest storage network in the world by far!";
}
this would give you the capacity to store approx 25*2^63 files using a sha1 hash.
As to how to generate the hash, reading the entire file into PHP might be slow (particularly if you try to read it all into a single string to hash it). Most Linux/Posix/Unix systems come with tools like 'md5sum' which will generate a hash from a stream very efficiently.
C.
1. forge a filename
2. try to open that file
3. if it exists, goto 1
4. create the file
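The forge, check, and create steps can be collapsed into one atomic operation with fopen()'s 'x' mode, which creates the file and fails if it already exists. A minimal sketch, where the function name createUniqueFile(), the 8 random bytes, and the attempt limit are assumptions:

```php
<?php
// Hypothetical helper: keeps forging names until an exclusive create succeeds.
function createUniqueFile(string $dir, string $ext = '', int $maxAttempts = 10): string
{
    for ($i = 0; $i < $maxAttempts; $i++) {
        $path = $dir . DIRECTORY_SEPARATOR . bin2hex(random_bytes(8)) . $ext;
        // 'x' creates the file and fails (returns false) if it already exists,
        // so checking and creating happen in a single step.
        $handle = @fopen($path, 'x');
        if ($handle !== false) {
            fclose($handle);
            return $path;
        }
    }
    throw new RuntimeException("Could not create a unique file in $dir");
}

$path = createUniqueFile(sys_get_temp_dir(), '.jpg');
echo $path, "\n";
```

Because the existence check and the creation are a single call, two simultaneous uploads cannot both claim the same name.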
Using something based on a timestamp maybe. See the microtime function for details. Alternatively uniqid to generate a unique ID based on the current time.
Guaranteed unique cannot be random. Random cannot be guaranteed unique. If you want unique (without the random) then just use the integers: 0, 1, 2, ... 1235, 1236, 1237, ... Definitely unique, but not random.
If that doesn't suit, then you can have definitely unique with the appearance of random. You use encryption on the integers to make them appear random. Using DES will give you 64 bit numbers (its block size), while using AES will give you 128 bit numbers. Use either to encrypt 0, 1, 2, ... in order with the same key. All you need to store is the key and the next number to encrypt. Because encryption is reversible, the encrypted numbers are guaranteed unique.
If 128 bit or 64 bit numbers are too large (64 bits is 16 hex digits) then look at a format preserving encryption which will give you a smaller size range at some cost in time.
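To illustrate why a reversible keyed mapping of a counter cannot collide, here is a toy 4-round Feistel permutation over 32-bit integers. It is a swapped-in teaching construction, not DES, AES, or a vetted format-preserving cipher, and the function names and the HMAC-based round function are assumptions.

```php
<?php
// Keyed round function returning a 16-bit value (illustrative choice).
function feistelRound(int $half, int $round, string $key): int
{
    return (int) hexdec(substr(hash_hmac('sha256', $round . ':' . $half, $key), 0, 4));
}

// Maps 0..2^32-1 onto itself one-to-one, so distinct counters never collide.
function permute32(int $n, string $key): int
{
    $left = ($n >> 16) & 0xFFFF;
    $right = $n & 0xFFFF;
    for ($round = 0; $round < 4; $round++) {
        [$left, $right] = [$right, $left ^ feistelRound($right, $round, $key)];
    }
    return ($left << 16) | $right;
}

// Inverse: run the rounds backwards to recover the original counter.
function unpermute32(int $n, string $key): int
{
    $left = ($n >> 16) & 0xFFFF;
    $right = $n & 0xFFFF;
    for ($round = 3; $round >= 0; $round--) {
        [$left, $right] = [$right ^ feistelRound($left, $round, $key), $left];
    }
    return ($left << 16) | $right;
}

foreach ([0, 1, 2] as $counter) {
    printf("%d -> %08x\n", $counter, permute32($counter, 'my secret'));
}
```

Only the key and the next counter value need to be stored, as described above.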
My solution is usually a hash (MD5/SHA1/...) of the image contents. This has the added advantage that if people upload the same image twice you still only have one image on the hard disk, saving some space (of course, you have to make sure that the image is not deleted when one user deletes it while another user still has the same image in use).
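A sketch of content-based naming with hash_file(), which hashes the file from disk without the caller loading it into a string. SHA-256 is used here instead of the MD5/SHA1 mentioned above as a choice made for this example, and the function name contentFilename() is an assumption.

```php
<?php
// Hypothetical helper: derive the stored filename from the file's contents.
function contentFilename(string $path, string $ext): string
{
    return hash_file('sha256', $path) . '.' . $ext;
}

// Usage: two uploads with identical bytes map to the same stored name.
$a = tempnam(sys_get_temp_dir(), 'img');
$b = tempnam(sys_get_temp_dir(), 'img');
file_put_contents($a, 'same image bytes');
file_put_contents($b, 'same image bytes');
var_dump(contentFilename($a, 'jpg') === contentFilename($b, 'jpg')); // bool(true)
unlink($a);
unlink($b);
```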