Create mutually exclusive arrays on PHP - php

I have an array $used_logins with a set of logins (it can be large), and I need to generate an array of three unique logins like $login+rand(1, 1000); that are not in the $used_logins array.
How can I do this quickly?

If you can't do it in a database or such: Use the keys of $used_logins to store the data. And then check whether an element with the key exists.
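$k_used_logins = array_flip($used_logins); // Complexity is O(n)
$logins = array();
do {
    $l = $login . rand(1, 1000); // '.' concatenates; '+' would be numeric addition
    if (!isset($k_used_logins[$l])) { // Complexity O(1)
        $logins[$l] = true; // keyed by login, so a repeated draw is not counted twice
    }
} while (count($logins) != 3);
$logins = array_keys($logins);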
Depending on the size of the array, this can be faster than the naive way of calling array_search() each time. (array_flip() creates a copy of the array, but array_search() is slower than a key access.)

If these logins are in a database, that might be the most efficient way - do a left join on 'available logins' to 'used logins' and you can filter on rows with NULLs in the right hand side to get 'unused logins'

If the logins are always generated by you, you may use a unique number in them (e.g. a timestamp). Edit: you may also append the size of the $used_logins array ($login . count($used_logins)). Of course, this only works if there are no concurrent requests (you may add a random number to guard against that).
If other users may create the login, you may just check if the login already exists with the in_array() function.

Related

Is this recursive random string generation computationally expensive?

I have created a function to generate a unique referral code for a user when they sign up. I want to ensure uniqueness, so I check whether the code already exists; if it does, I call the function again recursively:
public function generateUniqueReferralCode()
{
$referral_code = str_random(8);
if(User::where('referral_code', $referral_code)->exists()) {
$referral_code = $this->generateUniqueReferralCode();
}
return $referral_code;
}
My question is, is this computationally expensive? Can it be done in a more efficient way, as it has to scan the user table? Let's say we have 1 million users: it will check against 1 million user records to see whether the key already exists.
PHP function calls are fairly costly, so I think the following loop is a little faster than the recursion (I didn't benchmark it):
public function generateUniqueReferralCode() {
$referral_code = str_random(8);
while (User::where('referral_code', $referral_code)->exists()) {
$referral_code = str_random(8);
}
return $referral_code;
}
My approach would be a little simpler. Instead of checking all those records for uniqueness, I would rather generate a random key and append the primary key of the last record, or of the record about to be created.
For instance, here's my flow of thoughts:
1. Generate a random key: 1234abc
2. Fetch the primary key of the last record. Result: 3
3. Append it to the key: 1234abc3 (will always be unique)
No, a database uses efficient indexing (search trees or hash tables) for lookups, so as long as the column is indexed the number of records is virtually immaterial.
But why don't you just increment a counter to implicitly guarantee uniqueness? (And add random salt if you want.)
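A minimal sketch of the counter-plus-salt idea, assuming the counter comes from something like an auto-increment ID. The function name referralCode() and the code format are made up for illustration; the counter part guarantees uniqueness, and the salt only makes the code harder to guess.

```php
<?php
// Hypothetical helper: sequential counter (unique) + random salt (unpredictable).
function referralCode(int $counter): string
{
    $salt = bin2hex(random_bytes(3)); // always 6 hex characters (PHP 7+)
    // The counter is encoded in base 36 to keep the code short.
    return $salt . '-' . base_convert((string) $counter, 10, 36);
}

echo referralCode(1000000), "\n"; // the part after the dash is "lfls"
```

Because the salt has a fixed length, two different counters can never produce the same code, so no lookup against the table is needed.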

php function to generate random string returned duplicate values consecutively

I have written a function to generate a random string of 7 alphanumeric characters which I am then inserting in a mysql database.
Here is the code :
function getRandomID(){
$tmp ="";
$characters=array("A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z","1","2","3","4","5","6","7","8","9");
for($i=0;$i<7;$i++)
$tmp.=$characters[rand(0,count($characters)-1)];
return $tmp;
}
I am not checking for duplicates atm because I anticipate there will be no more than 1000 entries in the database and I've calculated that this function can return (35)^7 = 64,339,296,875 possible values.
I am testing it out locally as well as on a live server.
The problem is that just in the last hour, this function generated duplicate values twice.
I came upon 3 entries in the database all of which had the same random string.
I do not know what could have caused this as I tried numerous times afterwards and the problem wasn't reproducible.
Does anybody have any idea what could be going on here?
Many thanks in advance
Designing your code with the mindset of "meh, that's not going to happen" is a very risky game. Just do it properly once, so you don't have to come back to your code multiple times to quick-fix minor things like these.
Do the duplicate check and you'll be solid.
You can create a function like
function stringExists($string)
{
...
return $boolValue;
}
And you can easily create a while loop that keeps generating a new string as long as the generated one already exists.
$duplicate = true;
while($duplicate)
{
$newString = getRandomID();
$duplicate = stringExists($newString); // true means the string is already taken, so loop again
}
// Work with the newest string that is not a duplicate.
If you really want to get into it:
You can take a look at the documentation for rand() to find out what might be causing your problem. Besides, 3 duplicate entries don't mean much if we don't know how many entries there are in total. Also, "random" functions are sometimes not as random as one might think; in some programming languages they are usable right away but need to be seeded before their output stops repeating.
The timing of the inserts might also be part of the problem. There are plenty of threads on the internet, like this one on stackoverflow, with interesting points about what can affect your "random"ness.
Whether or not that is the cause here (see the comment), you can be pretty sure to find an answer to your question in related threads and topics.
Short answer: don't think about it and do a duplicate check, it's easy.
Note that you should, of course, put a UNIQUE constraint on your ID column in the database to begin with.
1. Random != unique. Collisions happen. Check that the value is unique before you insert it into the database, and/or put an integrity constraint in your DB to enforce uniqueness.
2. If you're using a very old version of PHP [e.g. pre-4.2] you have to seed the random number generator with srand().
3. Aside from #2, it's probably not your getRandomID() function but something else in your code that's re-using previous values.
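To put a number on "collisions happen", here is a rough sketch using the standard birthday approximation p ≈ 1 - exp(-n(n-1)/(2N)) with the figures from the question (n = 1000 rows, N = 35^7 possible IDs). It assumes an ideal uniform generator; the function name collisionProbability() is made up for illustration.

```php
<?php
// Birthday-bound approximation for at least one duplicate among $n random draws
// from $space equally likely values.
function collisionProbability(int $n, float $space): float
{
    return 1.0 - exp(-($n * ($n - 1)) / (2.0 * $space));
}

$space = pow(35, 7); // 64,339,296,875 possible 7-character IDs
printf("%.2e\n", collisionProbability(1000, $space)); // roughly 8 in a million
```

With odds that low, three identical rows within an hour point to reused values elsewhere in the code rather than to bad luck, but the UNIQUE constraint is still worth having.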
If you need to enter unique data in the DB, you may use the PHP function uniqid() (http://ca3.php.net/uniqid).
The function generates a more or less random string based on the current time in microseconds, so in theory it is unique.
But still, it's always good to check before the insert, or at least put a UNIQUE index on the field.
You could do something like this:
function randomString($length, $chars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789") {
$string = "";
$charsLength = strlen($chars);
for ($i = 0; $i < intval($length); $i++) {
$string .= $chars[rand(0, $charsLength - 1)];
}
return $string;
}
The function above will generate a random string of the given length from the given characters. This makes it a little more flexible than your implementation if you need to use it in another context later.
Then you could do a check like this:
$id = null;
do {
$id = randomString(7);
} while (!isUnique($id));
// do your insert here. You need to write your isUnique, so that it checks if
// the given string is unique or not.
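For completeness, a self-contained sketch of that generate-and-check loop. Two assumptions here are not from the answer above: the $existing array stands in for the database table, and random_int() (PHP 7+) is swapped in for rand(). In a real application isUnique() would query the database instead.

```php
<?php
// Assumption: $existing stands in for the database table (id => true).
$existing = ['AAAAAAA' => true];

function randomId(int $length = 7): string
{
    $chars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ123456789';
    $id = '';
    for ($i = 0; $i < $length; $i++) {
        $id .= $chars[random_int(0, strlen($chars) - 1)];
    }
    return $id;
}

// Hypothetical check; replace with a SELECT against your table.
function isUnique(string $id, array $existing): bool
{
    return !isset($existing[$id]);
}

do {
    $id = randomId();
} while (!isUnique($id, $existing));

$existing[$id] = true; // the "insert"
```

Even with this check, a UNIQUE index is still needed, because two concurrent requests can both pass the check before either of them inserts.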

Sending Random numbers to a database as it's ID

I'm trying to send a random number to the database for a user/article ID. It is currently using auto increment as a counting system. However, I'd like for the number to be random and unpredictable.
The mt_rand() function in PHP does exactly what I need. Although, my question is what happens when the function returns a number already in use. Of course I can just use a is_null() to check. But if it keeps on picking a number in use I could imagine that that'd slow the operation down.
Any thoughts on what I might be able to do to get around this? Perhaps I'm going at this all wrong.
Also if there's a function that gives letters and numbers that would also help greatly (like Youtube's).
Thanks for reading!
Here is a simple function to create a 10 character long string. The string is built using upper/lowercase text and numbers. Auto increment is definitely the way to go, however, if you are dead set, the function below should help.
<?php
function randomID()
{
$ID = substr(str_shuffle(str_repeat('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789',5)),0,10);
echo $ID;
}
randomID();
?>
To make the string longer, change 10 to whatever you like. To ensure the ID does not already exist, I would suggest you generate the new ID and then search the database for it before inserting. Granted, this is an extra step in the chain, but unfortunately it needs to be done.
Hope this helps
You should always use an auto_increment field as the primary key of your database. Not doing that costs you a great deal in performance. You can certainly create a secondary ID field with your random ID. I'd probably use a hashing function to get the best chance of a random string:
<?php $key = md5(rand(0,999).time().$myItemTitle); // ex. ce4075a3d3f6fd757eb6dd44810cbe14
You should always (in normal use cases) use an auto-incremented ID for performance reasons. If your purpose is to hide the next post because someone could guess its ID, then you had better add some kind of hashed unique field to your database.
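A small sketch of such a secondary field, assuming PHP 7+ so that random_bytes() is available. The function name publicKey() is hypothetical; the idea is that the auto-increment column stays the primary key and this value goes into a separate column with a UNIQUE index.

```php
<?php
// Hypothetical helper: unpredictable public identifier for a row.
function publicKey(int $bytes = 8): string
{
    return bin2hex(random_bytes($bytes)); // 2 hex characters per byte
}

echo publicKey(), "\n"; // 16 hex characters, different on every call
```

Unlike md5(rand() . time()), this does not depend on the clock or on rand(), so it cannot be guessed from the insert time.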
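Random-looking (just encrypting the current timestamp in seconds; note that the mcrypt extension was removed in PHP 7.2, so this only runs on older versions):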
<?php
$value = time();
$key = "543yretghf436436";
$encrypted = mcrypt_encrypt(MCRYPT_RIJNDAEL_128, $key, $value, MCRYPT_MODE_ECB);
//if you want an even longer string, change 128 to 256
$encrypted = base64_encode($encrypted);
$encrypted = rtrim($encrypted, '=');
echo $encrypted;
?>
e.g.
Egttu2XhRGdAiXVfszscWg
XlttfR3XaL6pym1uSNY7Kg
YvoKCweUnN8gZyodRYysLA
What you actually want is some "random" key to use as an identifier for the article. I would keep the auto_increment and either:
- add a column with a "hashkey" or "random key" to identify the article. This poses the "I already have this key" issue (which should not be that large unless you have billions of articles). See some code examples already posted.
- create an extra table with pregenerated keys (e.g. 10000 id -> key values) where you can look up the id by key. If the table runs out you can easily generate new values. This way you don't have to worry about "slow" generation speed.

PHP unique ID based on username/email

I have an email address and I want to create a unique ID based on it, so say the email is me#email.com, and that turns into 66wyy7eu.
I've found a close solution, http://www.php.net/manual/en/function.uniqid.php#96898, but it needs the input to be numeric.
Emails are already unique.
You can't guarantee that a hash of the email will always be unique either.
If you're using a DB, an auto-increment field will be unique.
Check out hash(). This should allow you to generate a sufficiently unique ID based on a string input.
have you read about md5 ?
PHP md5 function
Personally, I would use something like md5() or sha1(). PHP does have a hash() function that allows you to specify the algorithm used: http://php.net/manual/en/function.hash.php
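A minimal sketch of the hash() route, assuming an 8-character ID like the one in the question is wanted. The function name emailId() and the lowercase/trim normalization are my own assumptions. Truncating the hash makes collisions more likely, so the result still needs a UNIQUE check.

```php
<?php
// Hypothetical helper: short, deterministic ID derived from an email address.
function emailId(string $email, int $length = 8): string
{
    $normalized = strtolower(trim($email)); // assumption: treat case and padding as irrelevant
    return substr(hash('sha256', $normalized), 0, $length);
}

echo emailId('me@email.com'), "\n"; // 8 hex characters, same on every run
```

The same email always maps to the same ID, which is the main difference from the random-string approaches.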
Please see my answer to another question of the same nature; the function can be modified to suit your needs:
PHP random URL names (short URL)
As stated above, email addresses are unique, and if you store them in a database you will get a unique identification number from the auto-increment column.
With that id you can then use the above function to create a unique hash for that id and store it in the same row. Then you have 2 identifiers for your email address: the ID to use internally and the encoded key to use in a short URL.
Alternatively, there is a simpler approach where you keep creating a random string and checking whether it is in your database; if the key is already there, you generate another and check again until you have a unique id.
Here's a quick example:
function createRandomID($length = 9)
{
$random = '';
for ($i = 0; $i < $length; $i++)
{
$random .= chr(rand(ord('a'), ord('z')));
}
return $random;
}
and then simply do:
do
{
$id = createRandomID();
}while(idExists($id)); // keep generating while the ID is already taken
//Insert $id into our DB along with the email!
Note: The limited character set affects the number of unique strings it can produce. The more strings you have in your database, the more often the loop repeats, which could increase the load on your DB and result in slower pages for the user.

truly unique random number generate by php?

I have built a PHP script to host a large number of images uploaded by users. What is the best way to generate random numbers for image filenames so that there will be no filename conflicts in the future? Something like Imageshack does. Thanks.
$better_token = uniqid(md5(mt_rand()), true);
Easiest way would be a new GUID for each file.
http://www.php.net/manual/en/function.uniqid.php#65879
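If no GUID function is available, a random (version 4) UUID can be sketched in a few lines. This is not the code from the linked comment; it assumes PHP 7+ for random_bytes(), and the function name uuidV4() is made up.

```php
<?php
// Hypothetical helper: random version-4 UUID built from 16 random bytes.
function uuidV4(): string
{
    $b = random_bytes(16);
    $b[6] = chr((ord($b[6]) & 0x0f) | 0x40); // set the version nibble to 4
    $b[8] = chr((ord($b[8]) & 0x3f) | 0x80); // set the variant bits to 10xx
    // Split the 32 hex characters into 8 groups of 4 and add the dashes.
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($b), 4));
}

echo uuidV4(), '.jpg', "\n";
```

With 122 random bits a collision is extremely unlikely, though strictly speaking it is still not guaranteed unique.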
Here's how I implemented your solution
This example assumes I want to:
1. Get a list containing 50 numbers that are unique and random, and
2. Have those numbers come from the range 0 to 1000
Code:
//developed by www.fatphuc.com
$array = array(); //define the array
//set random # range
$minNum = 0;
$maxNum = 1000;
// i just created this function, since we’ll be generating
// # in various sections, and i just want to make sure that
// if we need to change how we generate random #, we don’t
// have to make multiple changes to the codes everywhere.
// (basically, to prevent mistakes)
function GenerateRandomNumber($minNum, $maxNum){
return round(rand($minNum, $maxNum));
}
//generate 50 unique random #s (the array starts out empty)
for($i = 1; $i <= 50; $i++){
$num1 = GenerateRandomNumber($minNum, $maxNum);
while(in_array($num1, $array)){
$num1 = GenerateRandomNumber($minNum, $maxNum);
}
$array[$i] = $num1;
}
asort($array); //just want to sort the array
//this simply prints the list of #s in list style
echo '<ol>';
foreach ($array as $var){
echo '<li>';
echo $var;
echo '</li>';
}
echo '</ol>';
Keep a persistent list of all the previous numbers you've generated (in a database table or in a file) and check that a newly generated number is not among the ones on the list. If you find this to be prohibitively expensive, generate random numbers with enough bits to guarantee a very low probability of collision.
You can also use an incremental approach of assigning these numbers, like a concatenation of a timestamp_part based on the current time and a random_part, just to make sure you don't get collisions if multiple users upload files at the same time.
You could use microtime() as suggested above and then append a hash of the original filename to further avoid collisions in the (rare) case of exactly simultaneous uploads.
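A rough sketch of the timestamp-plus-random naming described above. The function name uploadName() and the exact format are assumptions, and a random part is used in place of a hash of the original filename.

```php
<?php
// Hypothetical helper: "<microseconds>_<random>.<ext>" style upload name.
function uploadName(string $originalName): string
{
    $ext = strtolower(pathinfo($originalName, PATHINFO_EXTENSION));
    $stamp = sprintf('%.0f', microtime(true) * 1000000); // microseconds since the epoch
    $rand = bin2hex(random_bytes(4));                    // 8 hex characters
    return $stamp . '_' . $rand . ($ext !== '' ? '.' . $ext : '');
}

echo uploadName('Holiday Photo.JPG'), "\n";
```

Names built this way also sort roughly by upload time, which can be handy when browsing the storage directory.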
There are several flaws in your postulate that random values will be unique - regardless of how good the random number generator is. Also, the better the random number generator, the longer it takes to calculate results.
Wouldn't it be better to use a hash of the datafile? That way you get the added benefit of detecting duplicate submissions.
If detecting duplicates is known to be a non-issue, then I'd still recommend this approach but modify the output based on detected collisions (but using a MUCH cheaper computation method than that proposed by Lo'oris) e.g.
$candidate_name=generate_hash_of_file($input_file);
$offset=0;
while (file_exists($candidate_name . strrev($offset)) && ($offset<50)) {
$offset++;
}
if ($offset<50) {
rename($input_file, $candidate_name . strrev($offset));
} else {
print "Congratulations - you've got the biggest storage network in the world by far!";
}
this would give you the capacity to store approx 25*2^63 files using a sha1 hash.
As to how to generate the hash, reading the entire file into PHP might be slow (particularly if you try to read it all into a single string to hash it). Most Linux/Posix/Unix systems come with tools like 'md5sum' which will generate a hash from a stream very efficiently.
1. forge a filename
2. try to open that file
3. if it exists, goto 1
4. create the file
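Those steps can be sketched with fopen() in 'x' mode, which creates the file and fails if it already exists, so the existence check and the creation happen in one call. The function name createUniqueFile() and the retry limit are assumptions for illustration.

```php
<?php
// Hypothetical helper: create a new, empty, uniquely named file in $dir.
function createUniqueFile(string $dir, string $ext = 'jpg'): string
{
    for ($attempt = 0; $attempt < 10; $attempt++) {
        $path = $dir . '/' . bin2hex(random_bytes(8)) . '.' . $ext; // forge a filename
        $handle = @fopen($path, 'x'); // fails (returns false) if the file already exists
        if ($handle !== false) {
            fclose($handle);
            return $path; // the file now exists and is reserved for us
        }
    }
    throw new RuntimeException('Could not create a unique file in ' . $dir);
}

$path = createUniqueFile(sys_get_temp_dir());
echo $path, "\n";
```

The retry limit is there so that an unwritable directory produces an error instead of an endless loop.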
Using something based on a timestamp maybe. See the microtime function for details. Alternatively uniqid to generate a unique ID based on the current time.
Guaranteed unique cannot be random. Random cannot be guaranteed unique. If you want unique (without the random) then just use the integers: 0, 1, 2, ... 1235, 1236, 1237, ... Definitely unique, but not random.
If that doesn't suit, then you can have definitely unique with the appearance of random. You use encryption on the integers to make them appear random. DES has a 64-bit block, so it will give you 64-bit numbers, while AES has a 128-bit block and will give you 128-bit numbers. Use either to encrypt 0, 1, 2, ... in order with the same key. All you need to store is the key and the next number to encrypt. Because encryption is reversible, the encrypted numbers are guaranteed unique.
If 128-bit or 64-bit numbers are too large (64 bits is 16 hex digits), then look at format-preserving encryption, which will give you a smaller size range at some cost in time.
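A sketch of the encrypt-a-counter idea, assuming the openssl extension is loaded. The function name encryptedId() and the sample key are hypothetical; in practice the key must be kept secret and the counter must be stored persistently. AES works on 16-byte blocks, so each ID comes out as 32 hex characters.

```php
<?php
// Hypothetical helper: encrypt a sequential counter so it looks random.
function encryptedId(int $counter, string $key): string
{
    // Pack the counter into exactly one 16-byte AES block (big-endian, zero-padded).
    $block = str_pad(pack('J', $counter), 16, "\0", STR_PAD_LEFT);
    // A single block in ECB mode is a plain one-to-one mapping of inputs to outputs.
    $cipher = openssl_encrypt($block, 'aes-128-ecb', $key,
        OPENSSL_RAW_DATA | OPENSSL_ZERO_PADDING);
    return bin2hex($cipher);
}

$key = 'sixteen byte key'; // assumption: a 16-byte secret key
echo encryptedId(0, $key), "\n";
echo encryptedId(1, $key), "\n";
```

Since decryption recovers the counter, two different counters can never map to the same ID under the same key.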
My solution is usually a hash (MD5/SHA1/...) of the image contents. This has the added advantage that if people upload the same image twice you still only have one image on the hard disk, saving some space (of course, you have to make sure that the image is not deleted if one user deletes it while another user still has the same image in use).
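A minimal sketch of naming by content hash. SHA-256 is used here as an assumption in place of MD5/SHA1, and storeByContent() is a made-up name. hash_file() hashes the file directly, so the contents never need to be loaded into a PHP string.

```php
<?php
// Hypothetical helper: store an upload under the hash of its contents.
function storeByContent(string $uploadedPath, string $storageDir): string
{
    $hash = hash_file('sha256', $uploadedPath);
    $target = $storageDir . '/' . $hash;
    if (!file_exists($target)) {
        copy($uploadedPath, $target); // first time this content is seen
    }
    return $target; // identical uploads end up at the same path
}
```

Called on a temporary upload path, it returns the same target for byte-identical files, so the database can keep one row per user pointing at a shared file.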
