I have created a function to generate a unique referral code for a user when they sign up. To ensure uniqueness, I check whether the code already exists and, if it does, I call the function again recursively:
public function generateUniqueReferralCode()
{
$referral_code = str_random(8);
if(User::where('referral_code', $referral_code)->exists()) {
$referral_code = $this->generateUniqueReferralCode();
}
return $referral_code;
}
My question is: is this computationally expensive? Can it be done more efficiently, given that it has to scan the user table? Let's say we have 1 million users; it will check against 1 million user records to see whether the key already exists.
PHP function calls are fairly costly, so I think the following loop is a little faster than the recursive version (I didn't benchmark it):
public function generateUniqueReferralCode() {
$referral_code = str_random(8);
while (User::where('referral_code', $referral_code)->exists()) {
$referral_code = str_random(8);
}
return $referral_code;
}
My approach would be a little simpler. Instead of checking all those records for uniqueness, I would generate a random key and append the primary key of the last record, or of the record about to be created.
For instance, here's my flow of thoughts:
Generate a random key - 1234abc
Fetch the primary key of the last record. Result - 3
Append it to the key - 1234abc3 (will always be unique, as long as the random part has a fixed length)
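A minimal sketch of that flow, assuming the new row's auto-increment ID is already known when the code is built. The name codeFromId and the alphabet are hypothetical choices for illustration, not part of the original code:

```php
<?php
// Hypothetical helper: fixed-length random prefix followed by the row's primary key.
function codeFromId(int $id, int $prefixLength = 7): string
{
    $alphabet = 'abcdefghijklmnopqrstuvwxyz0123456789';
    $maxIndex = strlen($alphabet) - 1;
    $prefix = '';
    for ($i = 0; $i < $prefixLength; $i++) {
        // random_int() draws from the system's cryptographically secure source
        $prefix .= $alphabet[random_int(0, $maxIndex)];
    }
    // Distinct IDs give distinct suffixes, so two codes cannot be equal
    // as long as every prefix has the same length.
    return $prefix . $id;
}

echo codeFromId(3), "\n"; // 7 random characters followed by "3"
```

Note that the code grows longer as the ID grows, and the ID itself stays visible at the end of the code.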
No, provided the referral_code column is indexed: a database uses efficient index structures (search trees or hash tables) for lookups, so the number of records is virtually immaterial.
But why don't you just increment a counter to implicitly guarantee uniqueness? (And add random salt if you want.)
I am looking for an efficient way to generate 5 million unique codes with 7 characters (letters, numbers, special chars).
Basically, my idea was to generate a table with a unique constraint. Then to generate a code, insert it into the database, see if it is "accepted" (meaning a new code) until we have 5 million unique codes.
Alternatively, the idea was to generate an array of 5 million codes and insert them into the database all at once afterwards, to see how many of them make it into the database (i.e. are unique).
The third option was to create one code, check whether it already exists and, if not, insert it into the database.
My question now is which method I should use; there might be a problem I am overlooking. Or is there a better way?
Thanks a lot!
Pick an appropriate function to generate one random code; for illustration purposes I'll be using this:
function generateCode() {
return substr(bin2hex(random_bytes(4)), 0, 7);
}
See https://stackoverflow.com/a/22829048/476 and other answers in there to pick something that works for you. The important point is that it uses a good source of randomness, either random_bytes, random_int, openssl_random_pseudo_bytes or /dev/urandom. This minimises the chance of two calls to this function producing the same output.
From there, simply use array keys to deduplicate the values:
$codes = [];
while (count($codes) < 5000000) {
$codes[generateCode()] = null;
}
$codes = array_keys($codes);
If generateCode is sufficiently random, there should be few collisions and not much overhead in generating codes this way. Even if there is some overhead, this is presumably a one-time operation, so efficiency isn't paramount. 5 million short strings should certainly fit into memory without much problem. You can then insert them all into the database in a batch.
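To put a rough number on "few collisions", the expected count of duplicate draws can be estimated. This is only a sketch that assumes the codes are uniformly distributed; expectedDuplicates is a hypothetical helper name. For small ratios the result is close to n^2 / (2N) for n draws from N possible codes, which is in the tens of thousands for 5 million draws from 16^7 codes, i.e. under 1% of the total:

```php
<?php
// Expected number of duplicate draws when $draws values are picked uniformly
// at random from $space possibilities.
function expectedDuplicates(int $draws, int $space): float
{
    // Expected count of distinct values is space * (1 - (1 - 1/space)^draws),
    // computed with log1p/expm1 to limit floating-point cancellation.
    $distinct = -$space * expm1($draws * log1p(-1.0 / $space));
    return $draws - $distinct;
}

// generateCode() above yields 7 hex characters, i.e. 16^7 possible codes.
printf("%.0f\n", expectedDuplicates(5000000, 16 ** 7));
```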
function generateRandomString($length = 7) {
// you can update these with new chars
$characters = '!@#$%^&*()_+0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
$charactersLength = strlen($characters);
$randomString = '';
// loop $length times (not $charactersLength) so the code has the requested length
for ($i = 0; $i < $length; $i++) {
$randomString .= $characters[rand(0, $charactersLength - 1)];
}
return $randomString;
}
Now use an array to store the codes:
$codes = array();
while(count($codes)!=5000000){
$code = generateRandomString();
$codes[$code] = $code;
}
In $codes, each key and its value hold the same code; because array keys cannot repeat, every code is unique.
Given the purpose for which you're generating unique identifiers (as hard-to-guess coupon codes), I want to say that you should generate a unique identifier that combines a "unique" part and a "random" part.
The "unique" part can be a monotonically increasing counter (such as an auto-incremented row number in supporting databases), which can optionally serve as the seed of a full-period linear congruential generator (which cycles pseudorandomly through all possible values in its period before repeating).
The "random" part is simply a random number generated with a cryptographic random number generator (which for PHP is random_int). In general, the longer the random part is, the less predictable it will be.
Moreover, for the purposes of generating unique coupon codes, there is little reason to limit yourself to 7-character codes, especially if end users won't be required to enter those codes directly. See also this question.
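The "unique part plus random part" structure described above could be sketched as follows. This is a simplified illustration, not a definitive implementation: it applies a single step of a linear congruential map to the counter instead of running a full generator, and the constants, part lengths and function names are all assumptions made for the example:

```php
<?php
const UNIQUE_SPACE = 36 ** 4; // 1,679,616 values fit in 4 base-36 characters

// Scramble a counter with one step of a linear congruential map.
// The multiplier 421 shares no factor with UNIQUE_SPACE (= 2^8 * 3^8),
// so the map is a bijection: distinct counters give distinct outputs.
function uniquePart(int $counter): string
{
    if ($counter < 0 || $counter >= UNIQUE_SPACE) {
        throw new InvalidArgumentException('counter out of range');
    }
    $scrambled = (421 * $counter + 54773) % UNIQUE_SPACE; // illustrative constants
    return str_pad(base_convert((string) $scrambled, 10, 36), 4, '0', STR_PAD_LEFT);
}

// Random part drawn from the cryptographic generator via random_int().
function randomPart(int $length = 6): string
{
    $alphabet = '0123456789abcdefghijklmnopqrstuvwxyz';
    $out = '';
    for ($i = 0; $i < $length; $i++) {
        $out .= $alphabet[random_int(0, 35)];
    }
    return $out;
}

function couponCode(int $counter): string
{
    return uniquePart($counter) . randomPart();
}

echo couponCode(1), "\n"; // 4 scrambled characters followed by 6 random ones
```

The unique part guarantees that no two counters produce the same code; the unpredictability comes from the random part alone, so lengthening it is what makes codes harder to guess.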
Do the codes need to be inserted into the database as they are generated?
It would be better not to query the database constantly to check whether each code is unique.
You can store the codes in an array first, before putting them into the database.
Pseudo-code:
Generate 5 million unique codes and insert them into a hash table or an array. // as you insert a new one, check the hash table to see whether it already exists.
Then insert the contents of this hash table or array into the database.
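The pseudo-code above might look roughly like this in PHP. It is a sketch under assumptions: the table name codes, the column name code and both helper names are placeholders, and the actual database call is left as a comment:

```php
<?php
// Build one multi-row INSERT per chunk; table and column names are placeholders.
function buildBatchInsert(array $chunk): string
{
    $placeholders = implode(', ', array_fill(0, count($chunk), '(?)'));
    return "INSERT INTO codes (code) VALUES $placeholders";
}

// Generate unique codes in memory first: array keys cannot repeat.
function generateUniqueCodes(int $count, callable $generate): array
{
    $codes = [];
    while (count($codes) < $count) {
        $codes[$generate()] = true;
    }
    // PHP turns integer-like string keys into ints, so cast them back to strings.
    return array_map('strval', array_keys($codes));
}

$codes = generateUniqueCodes(10, fn () => bin2hex(random_bytes(4)));
foreach (array_chunk($codes, 4) as $chunk) {
    $sql = buildBatchInsert($chunk);
    // With a PDO connection this would be: $pdo->prepare($sql)->execute($chunk);
    echo $sql, ' -- ', count($chunk), " values\n";
}
```

Chunking keeps each statement to a manageable number of placeholders; the chunk size of 4 is only for the demonstration.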
I have written a function to generate a random string of 7 alphanumeric characters, which I am then inserting into a MySQL database.
Here is the code :
function getRandomID(){
$tmp ="";
$characters=array("A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z","1","2","3","4","5","6","7","8","9");
for($i=0;$i<7;$i++)
$tmp.=$characters[rand(0,count($characters)-1)];
return $tmp;
}
I am not checking for duplicates atm because I anticipate there will be no more than 1000 entries in the database and I've calculated that this function can return (35)^7 = 64,339,296,875 possible values.
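As a rough check on those odds, the birthday approximation can be evaluated directly. This is a sketch that assumes the generator is uniform and independent, which rand() may not be in practice; collisionProbability is a hypothetical helper name. For 1000 entries the result is below one in 100,000:

```php
<?php
// Approximate probability that at least two of $n uniformly random values
// collide when there are $space possible values (birthday approximation).
function collisionProbability(int $n, float $space): float
{
    return -expm1(-$n * ($n - 1) / (2.0 * $space));
}

$space = 35 ** 7; // 64,339,296,875 possible IDs
printf("%.2e\n", collisionProbability(1000, $space));
```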
I am testing it out locally as well as on a live server.
The problem is that, just in the last hour, this function generated duplicate values twice.
I came upon 3 entries in the database all of which had the same random string.
I do not know what could have caused this as I tried numerous times afterwards and the problem wasn't reproducible.
Does anybody have any idea what could be going on here?
Many thanks in advance
Designing your code with the mindset of "meh, that's not going to happen" is a very risky game. Do it properly once so you don't have to come back to your code multiple times to quick-fix minor things like these.
Do the duplicate check and you'll be solid.
You can create a function like
function stringExists($string)
{
...
return $boolValue;
}
And you can easily create a while loop that keeps generating a new string for as long as the generated one already exists.
$duplicate = true;
while($duplicate)
{
$newString = getRandomID();
// keep looping while the freshly generated string already exists
$duplicate = stringExists($newString);
}
// Work with the newest string that is not a duplicate.
If you really want to get into it
You can then take a look at the documentation for rand if you want to find out what might be causing your problem. Besides, 3 duplicate entries don't mean much if we don't know how many entries there are in total. Also, "random" functions are sometimes not as random as one might think: in some programming languages they can always be called but need to be seeded before they produce properly "random" output.
The timing of the inserts might also be part of the problem; there are plenty of threads on the internet, like this one on Stack Overflow, with interesting points about what can affect your "random"ness.
Whether or not that is the cause, as has been pointed out in the comments, you can be pretty sure to find an answer to your question in related threads and topics.
Short answer: don't think about it and do a duplicate check; it's easy.
Note that you should, of course, put a UNIQUE constraint on your ID column in the database to begin with.
1. Random != unique. Collisions happen. Check that the value is unique before you insert it into the database, and/or put an integrity constraint in your DB to enforce uniqueness.
2. If you're using a very old version of PHP [e.g. pre-4.2], you have to seed the random number generator with srand().
3. Aside from #2, it's probably not your getRandomID() function but something else in your code that's re-using previous values.
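One way to let the integrity constraint do the checking is to attempt the insert and retry when the database rejects a duplicate. The following is a sketch under assumptions: the table entries with a UNIQUE column code, the function name and the $generate callable are placeholders, and the PDO connection is assumed to use PDO::ERRMODE_EXCEPTION:

```php
<?php
// Insert a row with a freshly generated code, retrying when the UNIQUE
// constraint rejects it. Table/column names and $generate are placeholders.
function insertWithUniqueCode(PDO $pdo, callable $generate, int $maxAttempts = 5): string
{
    $stmt = $pdo->prepare('INSERT INTO entries (code) VALUES (?)');
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $code = $generate();
        try {
            $stmt->execute([$code]);
            return $code; // the database accepted it, so it is unique
        } catch (PDOException $e) {
            // SQLSTATE class 23 means an integrity constraint violation
            if (substr((string) $e->getCode(), 0, 2) !== '23') {
                throw $e; // some other failure: do not hide it
            }
        }
    }
    throw new RuntimeException("No unique code after $maxAttempts attempts");
}
```

Because the check and the insert are a single statement, this also avoids the race in which two requests both pass a "does it exist?" query before either has inserted.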
If you need to enter unique data in the DB, you may use the PHP function uniqid() (http://ca3.php.net/uniqid).
The function generates a more or less random string based on the current time in microseconds, so in theory it is unique.
But still, it's always good to check before inserting, or at least to put a UNIQUE index on the field.
You could do something like this:
function randomString($length, $chars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789") {
$string = "";
$charsLength = strlen($chars);
for ($i = 0; $i < intval($length); $i++) {
$string .= $chars[rand(0, $charsLength - 1)];
}
return $string;
}
The function above generates a random string of the given length from the given characters. This makes it a little more flexible than your implementation, in case you need to use it in another context later.
Then you could do a check like this:
$id = null;
do {
$id = randomString(7);
} while (!isUnique($id));
// do your insert here. You need to write your isUnique, so that it checks if
// the given string is unique or not.
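The same loop can be wrapped in a helper with an upper bound on retries, so that a broken generator or lookup cannot spin forever. This is a sketch: generateUnique and both callables are hypothetical names, and the in-memory array only stands in for a real database check:

```php
<?php
// Retry wrapper: $generate produces a candidate, $exists reports whether it is taken.
// Both callables are placeholders for your own generator and database lookup.
function generateUnique(callable $generate, callable $exists, int $maxAttempts = 100): string
{
    for ($attempt = 0; $attempt < $maxAttempts; $attempt++) {
        $candidate = $generate();
        if (!$exists($candidate)) {
            return $candidate;
        }
    }
    throw new RuntimeException('Could not find an unused value');
}

// Usage with an in-memory stand-in for the database check.
$taken = ['aaa' => true, 'bbb' => true];
$candidates = ['aaa', 'bbb', 'ccc'];
$id = generateUnique(
    function () use (&$candidates) { return array_shift($candidates); },
    function (string $value) use ($taken) { return isset($taken[$value]); }
);
echo $id, "\n"; // prints "ccc"
```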
I'm trying to send a random number to the database for a user/article ID. It is currently using auto increment as a counting system. However, I'd like for the number to be random and unpredictable.
The mt_rand() function in PHP does exactly what I need. My question, though, is what happens when the function returns a number that is already in use. Of course I can just use is_null() to check, but if it keeps picking numbers that are in use, I imagine that would slow the operation down.
Any thoughts on what I might be able to do to get around this? Perhaps I'm going at this all wrong.
Also if there's a function that gives letters and numbers that would also help greatly (like Youtube's).
Thanks for reading!
Here is a simple function to create a 10 character long string. The string is built using upper/lowercase text and numbers. Auto increment is definitely the way to go, however, if you are dead set, the function below should help.
<?php
function randomID()
{
$ID = substr(str_shuffle(str_repeat('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789',5)),0,10);
echo $ID;
}
randomID();
?>
To make the string longer, change 10 to whatever you like. As for ensuring it does not already exist, I would suggest you generate the new ID and then search the database to make sure it is not there before inserting. Granted, this is an extra step in the chain, but unfortunately it is what needs to be done.
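If the ID really has to be unpredictable, note that str_shuffle() is not meant for security-sensitive values. A variant built on random_int() might look like the sketch below; secureRandomID is a hypothetical name, and it returns the ID instead of echoing it so the caller can check it against the database:

```php
<?php
// Returns (rather than echoes) a random alphanumeric ID built with random_int().
function secureRandomID(int $length = 10): string
{
    $alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
    $maxIndex = strlen($alphabet) - 1;
    $id = '';
    for ($i = 0; $i < $length; $i++) {
        // each character is drawn independently, so repeats are allowed
        $id .= $alphabet[random_int(0, $maxIndex)];
    }
    return $id;
}

echo secureRandomID(), "\n";
```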
Hope this helps
You should always use an auto_increment field as the primary key of your database. Not doing that costs you a great deal in performance. You can certainly create a secondary ID field with your random ID. I'd probably use a hashing function to get the best chance of a random string:
<?php $key = md5(rand(0,999).time().$myItemTitle); // ex. ce4075a3d3f6fd757eb6dd44810cbe14
You should always (in normal use cases) use an auto-incremented ID for performance reasons. If your purpose is to hide the next post so that nobody can guess its ID, then you had better add some kind of hashed unique field to your database.
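One possible shape for such a hashed field is a keyed hash of the auto-increment ID. This is a sketch, not the only option: publicToken is a hypothetical name, $secret stands for an application-wide key you keep private, and the 16-character length is an arbitrary choice:

```php
<?php
// Derive a hard-to-guess public token from the auto-increment ID.
// $secret is an application-wide key you keep private (placeholder here).
function publicToken(int $id, string $secret, int $length = 16): string
{
    // HMAC-SHA256 yields 64 hex characters; truncating shortens the token
    // but makes collisions possible, so keep a UNIQUE index on the column.
    return substr(hash_hmac('sha256', (string) $id, $secret), 0, $length);
}

echo publicToken(42, 'replace-with-your-own-secret'), "\n";
```

The token is deterministic for a given ID and secret, so it can be recomputed instead of being generated and checked in a loop.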
Always different from one second to the next (just encrypting the current Unix timestamp):
<?php
$value = time();
$key = "543yretghf436436";
// note: mcrypt was deprecated in PHP 7.1 and removed from core in PHP 7.2
$encrypted = mcrypt_encrypt(MCRYPT_RIJNDAEL_128, $key, $value, MCRYPT_MODE_ECB);
// if you want an even longer string, change 128 to 256
$encrypted = base64_encode($encrypted);
$encrypted = rtrim($encrypted, '=');
echo $encrypted;
?>
e.g.
Egttu2XhRGdAiXVfszscWg
XlttfR3XaL6pym1uSNY7Kg
YvoKCweUnN8gZyodRYysLA
What you actually want is some "random" key to use as an identifier for the article. I would keep the auto_increment and either:
add a column with a "hash key" or "random key" to identify the article. This raises the "I already have this key" issue (which should not be a big problem unless you have billions of articles). See the code examples already posted.
create an extra table with pregenerated keys (e.g. 10000 id -> key values) where you can look up the id by key. If the table runs out, you can easily generate new values. This way you don't have to worry about "slow" generation speed.
I have an array $used_logins with a set of logins (it can be large), and I need to generate an array of three unique logins like $login+rand(1, 1000); that are not in the $used_logins array.
How can I do this quickly?
If you can't do it in a database or such: Use the keys of $used_logins to store the data. And then check whether an element with the key exists.
$k_used_logins = array_flip($used_logins); // Complexity is O(n)
$logins = array();
do {
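$l = $login . rand(1, 1000); // concatenate; "+" would attempt numeric addition
if (!isset($k_used_logins[$l])) { // Complexity O(1)
$logins[] = $l;
$k_used_logins[$l] = true; // mark as taken so the same login is not picked twice
}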
} while (sizeof($logins) != 3);
Depending on the size of the array, this can be faster than the naive approach of calling array_search each time. (It creates a copy of the array, but array_search is slower than a key lookup.)
If these logins are in a database, that might be the most efficient way: do a left join of 'available logins' to 'used logins' and filter on rows with NULLs on the right-hand side to get the 'unused logins'.
If the logins are always generated by you, you may use a unique number in it (e.g. a timestamp). Edit: you may also append the size of the $used_logins array ($login + count($used_logins)). Of course, this only works if there are no concurrent requests (you may add a random number to avoid that).
If other users may create the login, you may just check if the login already exists with the in_array() function.
I have an email address and I want to create a unique ID based on it, so that, say, the email me@email.com turns into 66wyy7eu.
I've found a close solution, http://www.php.net/manual/en/function.uniqid.php#96898, but it needs the input to be numeric.
Emails are already unique.
You can't guarantee that a hash of the email will always be unique either.
If you're using a DB, an auto-increment field will be unique.
Check out hash(). This should allow you to generate a sufficiently unique ID based on a string input.
Have you read about md5?
PHP md5 function
Personally, I would use something like md5() or sha1(). PHP does have a hash() function that allows you to specify the algorithm used: http://php.net/manual/en/function.hash.php
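A minimal sketch of the hash() approach, with assumptions stated: emailToId is a hypothetical name, and the lower-casing, trimming and 8-character length are illustrative choices rather than requirements:

```php
<?php
// Short ID derived from an email address. The normalisation and the
// 8-character length are illustrative choices, not requirements.
function emailToId(string $email, int $length = 8): string
{
    $normalised = strtolower(trim($email));
    // The same input always yields the same ID; different inputs can still
    // collide once the hash is truncated, so this is not a uniqueness guarantee.
    return substr(hash('sha256', $normalised), 0, $length);
}

echo emailToId('me@email.com'), "\n"; // 8 hexadecimal characters
```

The shorter the ID, the more likely two addresses map to the same value, so a UNIQUE index on the stored ID is still worth having.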
Please see my answer to another question of the same nature; the function can be modified to suit your needs:
PHP random URL names (short URL)
As stated above, email addresses are unique, and if you store them in a database you will get a unique identification number from the auto-increment column.
With that ID you can then use the above function to create a unique hash for it and store that in the same row. You then have two identifiers for your email address: the ID to use internally and the encrypted key to use in a short URL.
Alternatively, there is a simpler approach in which you repeatedly create a random string and check whether it is in your database; if the key is already there, you generate another and check again until you have a unique ID.
Here's a quick example:
function createRandomID($length = 9)
{
$random = '';
for ($i = 0; $i < $length; $i++)
{
$random .= chr(rand(ord('a'), ord('z')));
}
return $random;
}
and then simply do:
do
{
$id = createRandomID();
}while(idExists($id)); // repeat while the generated ID is already taken
//Insert $id into our DB along with the email!
Note: Limiting the character set affects the number of unique strings that can be produced. The more strings you have in your database, the more often the loop repeats, which could increase the load on your DB and result in slower pages for the user.