Generate unique names? - php

I am working on a PHP site where users upload images. I have to rename each uploaded file to prevent conflicts between image names. My plan is to use
uniqid(rand(), true);
and append a large random number to it.
Will this work reliably? Any suggestions?
It is about generating unique names for the images.

Function tempnam() creates a file with a unique name.

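Take an md5 of the file and use that. MD5 produces a 128-bit digest, so an accidental collision between two different files is astronomically unlikely (far rarer than 1 in 64M). Keep in mind that two byte-identical files get the same md5. If that's not enough, prefix it with the timestamp expressed in seconds or milliseconds. That way, even if a duplicate md5 is generated, the files would have to come in during the same second/millisecond for a collision.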

You can use Base36 on the AutoIncrement value from a SQL Table (hoping that you do use a SQL table).
$filename = base_convert($last_insert_id, 10, 36);
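A minimal sketch of the Base36 idea, assuming the ID comes from an auto-increment column. The helper name idToFilename and the sample ID are made up for illustration; in a real script the ID would come from something like the database driver's last-insert-id call.

```php
<?php
// Hypothetical helper: turn a numeric row ID into a short base-36 file name.
function idToFilename(int $id, string $extension): string
{
    // base_convert() takes the number as a string and returns lowercase digits 0-9a-z.
    return base_convert((string) $id, 10, 36) . '.' . $extension;
}

echo idToFilename(1000000, 'jpg'), "\n"; // lfls.jpg
```

Because every row ID is unique, the resulting names are unique too. Note that base_convert() can lose precision on very large numbers, which should not matter for typical auto-increment values.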

There are two approaches, depending on how big your image library can get:
1. For a modest number of files I do this:
<?php
$file_extension = '.' . pathinfo($file, PATHINFO_EXTENSION); // grab the extension before sanitizing
$file = sanitize_file($file); // remove all characters outside [a-z0-9_] for safe URL linking
$file_md5 = md5($file);
// since I assume the file belongs to someone, you can do this
$file_name = $user_id . $file_md5 . $file_extension;
// then save the file
?>
2. Another option is CacheMogul. Here you need to use your imagination, but for a huge number of files it does nice sharding, so you don't need to worry about a folder's maximum file count or size.

Related

Is it safe to use PHP uniqid for filenames? Will they always be unique?

I want to use PHP's uniqid() function to create a filename for an image upload.
There will be only one person uploading images and they can only upload one at a time. I don't care that the filename can be reverse engineered to get the time because it's just a filename.
Is uniqid 100% unique for my use case?
Try something like the following:
md5(rand(1,1000000).uniqid(mt_rand(), true))
This adds extra randomness on top of uniqid().
You can also use the time() function in this case:
$uniqueFilename=time().uniqid(rand());
Or like this:
$originalFileName = 'myimage.png';
$uniqueFilename=time().$originalFileName;
Yes, uniqid() will always be unique in your case, because you allow only one person and one upload at a time. This works because uniqid() is generated from the current system time in microseconds.
You could also use uniqid(rand(), true) to generate a more random filename, or md5() to hash your filename.

PHP generating same random number at same time (in seconds)

I'm using a random number function in a PHP script while uploading files, because I want to avoid overwriting files with the same name. The following portion of the script is used while uploading the file.
$filename = rand(0,100000).strtolower($_FILES['file']['name']);
$dir="/file/upload/directory/".$filename;
move_uploaded_file($_FILES['file']['tmp_name'], $dir);
This application is expected to have a large number of concurrent users, so QA is testing it with different automated tools under highly concurrent load. At that point the random number function seems to generate the same value within the same second.
We then tested the random number function separately, and the same random number at the same time was clearly identified.
While searching the web, some posts suggested mt_rand(), but it still gives the same value at the millisecond level.
Is there any way of generating a random number in a time-independent way in PHP?
Random numbers are generated based on time. For this particular issue we need to write a few lines of code: if we check whether the file exists and prepend an incrementing number to the file name, the problem is fixed. The code can look as follows.
$filename = strtolower($_FILES['file']['name']);
$dir="/file/upload/directory/";
$i = 1;
while(is_file($dir . $i . $filename))
{
    $i++;
}
move_uploaded_file($_FILES['file']['tmp_name'], $dir . $i . $filename);
Even though the loop is inefficient, it makes sure an existing file is not overwritten, unless two requests check the same name at the same instant.
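One way to close the gap between checking and writing is to let the file system do both in a single step. The sketch below is not from the answer above: it assumes PHP 7 or later (for random_bytes(), which does not depend on the clock), and the helper name reserveUniquePath is hypothetical.

```php
<?php
// Hypothetical helper: reserve a name that no other request can take.
// fopen() mode 'x' creates the file and fails if it already exists,
// so the existence check and the creation happen in one step.
function reserveUniquePath(string $dir, string $extension): string
{
    do {
        $path = $dir . '/' . bin2hex(random_bytes(16)) . '.' . $extension;
        $handle = @fopen($path, 'x');
        // Retry only when the failure was caused by a name that is already taken.
    } while ($handle === false && file_exists($path));

    if ($handle === false) {
        throw new RuntimeException("Cannot create a file in $dir");
    }
    fclose($handle);

    return $path; // an empty placeholder now exists at this path
}
```

The upload can then be moved over the placeholder, for example with move_uploaded_file($_FILES['file']['tmp_name'], reserveUniquePath($dir, 'jpg')), since move_uploaded_file() overwrites an existing destination.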

algorithm to name files with no probability of repetition

Can someone suggest an algorithm in PHP for naming uploaded files so that a name never repeats? I wonder how YouTube, which has millions of videos, does it.
Right now I use a random number, take its 16 character sha1 hash, and name the file with that, but I'm pretty sure it will eventually repeat and generate an error, as the file will not be able to be saved in the file system.
Something like:
$name = sha1(substr(sha1(md5($randomnumber)),0,10));
Somebody once told me that it's impossible to break the hash generated by this code, or at least that it'll take 100 years to break it.
You could do:
$uniq = md5(uniqid(rand(), true));
You could also append the user id of the user uploading the file, like:
$uniq = $user_id_of_uploader."_".md5(uniqid(rand(), true));
Generate a GUID (sometimes called UUID) using a pre-existing implementation. GUIDs are unique per computer, timestamp, GUID generated during that timestamp and so on, so they will never repeat.
If making a GUID isn't available, using sha1 on the entire input and using the entire output of it is second best.
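If no GUID library is at hand, a random (version 4) UUID can be put together from random_bytes(). This is a sketch under that assumption, not the timestamp-and-machine based kind of GUID described above; the function name uuidV4 is hypothetical and PHP 7 or later is assumed.

```php
<?php
// Sketch of a random (version 4) UUID built from random_bytes().
function uuidV4(): string
{
    $bytes = random_bytes(16);
    // Set the version (4) and variant (10xx) bits as RFC 4122 requires.
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40);
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80);

    // 32 hex characters split into 8 groups of 4, joined as 8-4-4-4-12.
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));
}

$name = uuidV4() . '.jpg';
```

With 122 random bits per name, a repeat is not strictly impossible, but it is unlikely enough to ignore for file naming.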
$name = 'filename' . $user_id . md5(microtime(true)) . '.extension'; // include $user_id if available
Try to remove special characters and white space from the file name.
If you are saving the name in a database, a recursive function can be helpful.
Do the following with proper methods:
1. First split the name into its extension and file name.
2. Trim the file name.
3. Collapse multiple spaces into a single space.
4. Replace special characters and white space with _.
5. Prefix with the current timestamp using strtotime and a salt using md5(uniqid(rand(), true)), separated by _ (thanks to #Sudhir).
6. Suffix with a special signature using str_pad and limit the text length of the file name.
7. Finally, join the formatted file name and the extension again.
I hope it makes sense.
Thanks
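The steps above could be sketched roughly as follows. The helper name buildUploadName and the 50 character limit are assumptions, time() stands in for the strtotime call, and the str_pad signature step is left out because the answer does not say what the signature should contain.

```php
<?php
// Hypothetical helper following the steps above; name and length limit are assumptions.
function buildUploadName(string $original, int $maxBase = 50): string
{
    $extension = strtolower(pathinfo($original, PATHINFO_EXTENSION)); // split off the extension
    $base = trim(pathinfo($original, PATHINFO_FILENAME));             // trim the file name
    $base = preg_replace('/\s+/', ' ', $base);                        // collapse runs of white space
    $base = preg_replace('/[^A-Za-z0-9]+/', '_', $base);              // special characters and spaces become _
    $base = substr($base, 0, $maxBase);                               // limit the length
    $salt = md5(uniqid((string) mt_rand(), true));                    // salt as suggested earlier in the thread

    return time() . '_' . $salt . '_' . $base . ($extension !== '' ? '.' . $extension : '');
}

echo buildUploadName(' my   photo (1).JPG'), "\n";
```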
I usually just generate a string for the filename (implementation is not incredibly important), then check if a file already exists with that name. If so, append a counter to it. If you somehow have a lot of files with the same base filename, this could be inefficient, but assuming your string is unique enough, it shouldn't happen very often. There's also the overhead of checking that the file exists.
$base_name = generate_some_random_string(); // use whatever method you like
$extension = '.jpg'; // Change as necessary
$file_name = $base_name . $extension;
$i = 0;
while (file_exists($file_name)) {
$file_name = $base_name . $i++ . $extension;
}
/* insert code to save the file as $file_name */

Upload image evenly into a directory structure

I'm sure this question has been asked thousands of times, so here goes my version...
I have a form that uploads images...
Every image gets a unique id. I use the following function to generate my unique id:
function generateUnid($key) {
    $name = $_FILES[$key]['name']; //get image name from global variable $_FILES
    $ext = pathinfo($name, PATHINFO_EXTENSION); //get image extension
    $prefix = 'fc'; //prefix for unid
    do {
        $unid = uniqid($prefix, true); //generate a unid
        $filename = $unid . '.' . $ext; //replace image name with unid
        $path = PATH_UPLOAD_ARTWORK . $filename; // image path
    } while (file_exists($path)); // check if the image name exists
    return $filename;
}
A sample of return values is:
fc4e7801523a04e6.06876802.jpg
So far so good. Now I want to create some sort of directory structure for my images, something like:
0
  0
  1
  2
    fc4e7801523a04e6.06876802.jpg
    ...
  3
  ...
1
  0
  1
  2
  3
  ...
2
  0
  1
  ...
I could probably use the last 2 digits of my unique id to file the image in the correct directory, but I'm not too sure that is the correct strategy...
How can I make sure that the images are filed evenly in the folders? I don't want to find myself with one folder that contains 12 000 images and another with 1 500 images...
Am I doing it the correct way by extracting the last 2 digits of my unique id? Are there better ways of filing the images evenly?
Thanks
Assuming the unique id is uniformly (pseudo)random, which I think it is, this strategy will work pretty well. There will inevitably be a few folders with many more or many fewer files than the average, as predicted by the normal distribution.
A slightly better technique for "binning" the images is to use the modulo (%) of many digits from the uid, rather than using the last two digits, in case the digits you have picked have some kind of pattern.
My advice would be to give it a go and see how it works for you. Ideally, you could create a "test harness" which calls the algorithm hundreds of thousands of times, after which you could assess whether the distribution of files in the directory structure is appropriate for your purposes.
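A test harness along those lines might look like the sketch below. It assumes PHP 7.4 or later on a 64-bit build, 100 folders, and the same uniqid('fc', true) call as in the question. The crc32() variant is only one possible way of mixing in more of the id and is an assumption, not something the answer prescribes.

```php
<?php
// Hypothetical test harness: count how many generated IDs land in each of 100 bins.
function binCounts(int $samples, callable $binOf): array
{
    $counts = array_fill(0, 100, 0);
    for ($i = 0; $i < $samples; $i++) {
        $counts[$binOf(uniqid('fc', true))]++;
    }
    return $counts;
}

// Strategy 1: the last two digits of the ID.
$lastTwo = fn(string $id): int => (int) substr($id, -2);
// Strategy 2: a checksum of the whole ID, modulo the number of bins.
$crcMod = fn(string $id): int => crc32($id) % 100;

foreach (['last two digits' => $lastTwo, 'crc32 % 100' => $crcMod] as $label => $binOf) {
    $counts = binCounts(100000, $binOf);
    printf("%s: min %d, max %d per folder\n", $label, min($counts), max($counts));
}
```

If the minimum and maximum stay close to the average of 1000 per folder, the binning is even enough for most purposes.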

Create Unique Image Names

What's a good way to create a unique name for an image that my user is uploading?
I don't want to have any duplicates so something like MD5($filename) isn't suitable.
Any ideas?
As was mentioned, I think the best way to create a unique file name is to simply add time(). That would look like
$image_name = time()."_".$image_name;
Grab the file extension from uploaded file:
$ext = pathinfo($uploaded_filename, PATHINFO_EXTENSION);
Grab the time to the second: time()
Grab some randomness: md5(microtime())
Convert time to base 36: base_convert (time(), 10, 36) - base 36 compresses a 10 byte string down to about 6 bytes to allow for more of the random string to be used
Send the whole lot out as a 16 char string, followed by the extension:
$unique_id = substr( base_convert( time(), 10, 36 ) . md5( microtime() ), 0, 16 ) . '.' . $ext;
I doubt that will ever collide - you could even not truncate it if you don't mind very long file names.
If you actually need a filename (it's not entirely clear from your question) I would use tempnam(), which:
Creates a file with a unique filename, with access permission set to 0600, in the specified directory.
...and let PHP do the heavy lifting of working out uniqueness. Note that as well as returning the filename, tempnam() actually creates the file; you can just overwrite it when you drop the image file there.
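A short sketch of the tempnam() approach, with a hypothetical helper name. It uses copy() so that it can be tried outside a web request; for a real upload, move_uploaded_file() would take its place.

```php
<?php
// Hypothetical helper: let tempnam() pick the name, then write the image over the placeholder.
function storeWithTempnam(string $sourcePath, string $dir, string $prefix = 'img'): string
{
    $target = tempnam($dir, $prefix); // creates an empty file with a unique name
    if ($target === false) {
        throw new RuntimeException("tempnam() failed for $dir");
    }
    // For a real upload, move_uploaded_file($sourcePath, $target) would go here.
    if (!copy($sourcePath, $target)) {
        unlink($target);
        throw new RuntimeException("Could not write $target");
    }
    return $target;
}
```

The generated name has no file extension, so the extension would have to be stored elsewhere or added with a rename afterwards.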
You could take a hash (e.g., md5, sha) of the image data itself. That would help identify duplicate images too (if it was byte-for-byte, the same). But any sufficiently long string of random characters would work.
You can always rig it up in a way that the file name looks like:
/image/0/1/012345678/original-name.jpg
That way the file name looks normal, but it's still unique.
I'd recommend sha1_file() over md5_file(). It's less prone to collisions.
You could also use hash_file('sha256', $filePath) to get even better results.
http://php.net/manual/en/function.uniqid.php maybe?
You can prefix it with the user id to avoid collisions between 2 users (uploading within the same microsecond).
For short names:
$i = 0;
while(file_exists($name . '_' . $i)){
$i++;
}
WARNING: this might fail on a multi-threaded server if two users upload an image with the same name at the same time.
In that case you should include the md5 of the username.
MD5 can produce around 3.4 × 10^38 (that is, 2^128) possible values.
Plus, just to be on the safe side, you could use
$newfilename = md5(time().'image');
while(file_exists('./images/'.$newfilename)){
    $newfilename = md5(time().$newfilename);
}
//uploadimage
How big is the probability of two users uploading an image with the same name in the same microsecond?
Try
$currTime = microtime(true);
$finalFileName = cleanTheInput($fileName)."_".$currTime;
// you can also append "_" . rand(0,1000) at the end to make name collisions even less likely
function cleanTheInput($input)
{
// do some formatting here ...
}
This would also help you track the upload time of the file for analysis, or sort and manage the files.
For good performance and uniqueness you can use an approach like this:
Files are stored on the server with names like md5_file($file).jpg.
The directory to store a file in is derived from its md5 file name, by taking the first two characters (first level) and the next two (second level), like this:
uploaded_files\30\c5\30c567139b64ee14c80cc5f5006d8081.pdf
Create a record in the database with the file_id, original file name, uploading user's id, and path to the file on the server.
On the server side, create a script that handles downloads: it looks up the file by id in the db and outputs its content with the original file name provided by the user (see the PHP example in the CodeIgniter download_helper). The URL to a file will then look like this:
http://site.com/download.php?file=id
Pros:
minimal risk of collisions
good performance at file lookup (not many files in one directory, not many directories at the same level)
original file names are preserved
you can control access to files in the server-side script (check session or cookies)
Cons:
Only good for small files, because the server has to read the whole file into memory before the user can download it
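The path derivation described above can be sketched as follows. The helper name shardedPath and the default root directory are assumptions, and forward slashes are used because PHP accepts them on every platform.

```php
<?php
// Hypothetical helper: derive a two-level sharded path from the file's MD5.
function shardedPath(string $sourcePath, string $extension, string $root = 'uploaded_files'): string
{
    $hash = md5_file($sourcePath);
    if ($hash === false) {
        throw new RuntimeException("Cannot read $sourcePath");
    }
    // First two hex characters pick the first level, the next two the second level.
    return sprintf('%s/%s/%s/%s.%s', $root, substr($hash, 0, 2), substr($hash, 2, 2), $hash, $extension);
}
```

Before moving the file into place, the directories can be created with mkdir(dirname($path), 0775, true). With two hex characters per level there are at most 256 directories at each level.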
try this file format:
$filename = microtime(true) . $username . '.jpg';
I think it would be good for you.
<?php
$name=uniqid(mt_rand()).$image_name;
?>
You should try to meet two goals: Uniqueness, and usefulness.
Using a GUID guarantees uniqueness, but one day the files may become detached from their original source, and then you will be in trouble.
My typical solution is to embed crucial information into the filename, such as the userID (if it belongs to a user) or the date and time uploaded (if this is significant), or the filename used when uploading it.
This may really save your skin one day, when the information embedded in the filename allows you to, for example, recover from a bug, or the accidental deletion of records. If all you have is GUIDs, and you lose the catalogue, you will have a heck of a job cleaning that up.
For example, if a file "My Holiday: Florida 23.jpg" is uploaded, by userID 98765, on 2013/04/04 at 12:51:23 I would name it something like this, adding a random string ad8a7dsf9:
20130404125123-ad8a7dsf9-98765-my-holiday-florida-23.jpg
Uniqueness is ensured by the date and time, and the random string (provided it is properly random, from /dev/urandom or CryptGenRandom).
If the file is ever detached, you can identify the user, the date and time, and the title.
Everything is folded to lower case and anything non-alphanumeric is removed and replaced by dashes, which makes the filename easy to handle using simple tools (e.g. no spaces which can confuse badly written scripts, no colons or other characters which are forbidden on some filesystems, and so on).
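A rough sketch of that naming scheme, assuming PHP 7 or later. The helper name descriptiveName is hypothetical, and random_bytes() is used here as one way to get a properly random string from the operating system.

```php
<?php
// Hypothetical helper reproducing the naming scheme described above.
function descriptiveName(string $original, int $userId, DateTimeInterface $uploadedAt): string
{
    $extension = strtolower(pathinfo($original, PATHINFO_EXTENSION));
    $title = strtolower(pathinfo($original, PATHINFO_FILENAME));
    // Replace every run of non-alphanumeric characters with a single dash.
    $slug = trim(preg_replace('/[^a-z0-9]+/', '-', $title), '-');
    $random = bin2hex(random_bytes(5)); // 10 hex characters from the OS random source

    return sprintf('%s-%s-%d-%s.%s', $uploadedAt->format('YmdHis'), $random, $userId, $slug, $extension);
}

echo descriptiveName('My Holiday: Florida 23.jpg', 98765, new DateTimeImmutable('2013-04-04 12:51:23')), "\n";
// e.g. 20130404125123-<10 hex characters>-98765-my-holiday-florida-23.jpg
```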
Something like this could work for you:
while (file_exists('/uploads/' . $filename . '.jpeg')) {
$filename .= rand(10, 99);
}
Ready-to-use code:
$file_ext = substr($file['name'], -4); // e.g.'.jpg', '.gif', '.png', 'jpeg' (note the missing leading point in 'jpeg')
$new_name = sha1($file['name'] . uniqid('',true)); // this will generate a 40-character-long random name
$new_name .= ((substr($file_ext, 0, 1) != '.') ? ".{$file_ext}" : $file_ext); //the original extension is appended (accounting for the point, see comment above)
