Directory structure for large number of files - php

I built a site where I store user-uploaded files in a separate directory per user. For example, for user_id = 1:
img/upload_docs/1/1324026061_1.txt
img/upload_docs/1/1324026056_1.txt
Likewise, for user_id = 2:
img/upload_docs/2/1324026061_2.txt
img/upload_docs/2/1324026056_2.txt
and so on for every user.
So if I eventually get 100000 users, my upload_docs folder will contain 100000 folders.
There is also no restriction on uploads, so one user may have 10 files and another 1000.
Is this a proper way to do it?
If not, can anyone suggest how to store files in this kind of structure?

What I would do is name the images UUIDs and create subfolders based on the names of the files. You can do this pretty easily with chunk_split. For example, if you create a folder every 4 characters you would end up with a structure like this:
img/upload_docs/1/1324/0260/61_1.txt
img/upload_docs/1/1324/0260/56_1.txt
By storing the image name 1324026056_1.txt you could then very easily determine where it belongs or where to fetch it using chunk_split.
This is a similar method to how git stores objects.
As code, it could look something like this.
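// Pass the filename as stored in the DB, e.g. '123456789.txt'
function get_path_from_filename($filename) {
    $path = 'img/upload_docs';
    $ext = pathinfo($filename, PATHINFO_EXTENSION);  // 'txt'
    $name = pathinfo($filename, PATHINFO_FILENAME);  // '123456789'
    // chunk_split appends the separator after every chunk, including the last, so trim it
    $folders = rtrim(chunk_split($name, 4, '/'), '/'); // '1234/5678/9'
    // The last chunk plus the extension is the file itself
    $newpath = $path.'/'.$folders.'.'.$ext; // 'img/upload_docs/1234/5678/9.txt'
    return $newpath;
}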
When you need to store a file or deliver it to the user, call get_path_from_filename with the name stored in the DB (still '123456789.txt') to recreate where the file lives on disk.
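For comparison, git's loose-object layout uses the first two hex characters of an object's SHA-1 as the directory and the remaining 38 as the file name. Here is a minimal sketch of the same idea for uploads, assuming you name files by a SHA-1 of their contents. The function name and base directory are hypothetical, not part of the answer above.

```php
<?php
// Hypothetical helper: shard a file by the SHA-1 of its contents, git-style.
function sharded_path_for_contents(string $contents, string $base = 'img/upload_docs'): string
{
    $hash = sha1($contents);      // 40 hex characters
    $dir  = substr($hash, 0, 2);  // first 2 characters: at most 256 directories
    $file = substr($hash, 2);     // remaining 38 characters: the file name
    return $base . '/' . $dir . '/' . $file;
}

echo sharded_path_for_contents('hello'), "\n";
```

Because the directory is derived from a hash, files spread evenly over the 256 folders regardless of upload order.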

img/upload_docs/1/0/10000/1324026056_2.txt
img/upload_docs/9/7/97555/1324026056_2.txt
img/upload_docs/2/3/23/1324026056_2.txt
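The three paths above look like they shard on the first two digits of the user ID (10000 goes under 1/0, 97555 under 9/7, 23 under 2/3), so the top two levels never have more than ten children each. Here is a sketch under that reading. The function name is hypothetical, and left-padding IDs below 10 with a zero is my assumption, since the examples don't cover that case.

```php
<?php
// Hypothetical helper: two single-digit levels taken from the start of the user ID.
// IDs below 10 are left-padded with a zero here; that padding rule is an assumption.
function user_upload_dir(int $userId, string $base = 'img/upload_docs'): string
{
    $id = (string) $userId;
    $padded = str_pad($id, 2, '0', STR_PAD_LEFT);
    return $base . '/' . $padded[0] . '/' . $padded[1] . '/' . $id;
}

echo user_upload_dir(97555), "\n";
```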

Related

Check if image exists - database or file lookup?

I hash a file's name based on its contents and store this name reference in a database, and store the file on a server.
Would it be more efficient (quicker) to check for duplicate files (and therefore not re-upload) via checking its name in a database or by checking if the file exists on the server?
There would be 1000s of files.
I had the same issue. We have roughly 40k images, and the duplicates put a heavy load on our server, especially with image license management, as the same license had to be added to the same image multiple times.
I recommend a database lookup. It's much faster as your collection of files grows. A 40k table scan takes something like 20 milliseconds. A 40k file search on disk runs in a few seconds, which gets annoying fast.
To solve this we changed how images were uploaded, so we don't get duplicate files but multiple database records that reference the same physical file on disk. This gives us fast lookups of the file data without touching the file itself or even knowing where it actually is.
We also don't store the file under its original filename but under a hexadecimal hash based on date and time, so we don't get conflicting filenames and have no delivery issues due to special characters, spaces, etc. The original filename is stored in a database field for lookup purposes.
Our images have their "metadata" stored in the database, with a hexadecimal file name and an "original" filename. It's really fast to check against this database and then retrieve the file link when there's a match/relation. This also allows checking whether a file has already been uploaded, as you don't need to scan the entire directory structure with all the images, which can take a significant amount of time.
This is the code I use; you can use something similar. Note that it uses Laravel's Eloquent, but it's fairly easy to replicate in plain MySQL.
First you get an instance to query the file model table.
Then you check for a file where the original filename, file size, content type and other metadata that shouldn't change are the same.
If they are the same, make your file entry a duplicate of the original entry. (In my case this allows modifying image titles and descriptions for each reference.)
// $FILE is assumed to hold the class name of the file model
$file = new $FILE();
// Look for an existing record with the same name, size, type and visibility
$existingFile = $file->newQuery()
    ->where('file_name', $uploadedFile->getClientOriginalName())
    ->where('file_size', $uploadedFile->getSize())
    ->where('content_type', $uploadedFile->getMimeType())
    ->where('is_public', $fileRelation->isPublic())
    ->limit(1)->get()->first();
if ($existingFile) {
    // Duplicate: point the new record at the file already on disk
    $file->disk_name = $existingFile->disk_name;
    $file->file_size = $existingFile->file_size;
    $file->file_name = $existingFile->file_name;
    $file->content_type = $existingFile->content_type;
    $file->is_public = $existingFile->is_public;
}
else {
    // New file: store the upload itself
    $file->data = $uploadedFile;
    $file->is_public = $fileRelation->isPublic();
}
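Then, when a file record is deleted, you need to check whether it was the "last one" referencing the physical file:
public function afterDelete()
{
    try {
        // Count the remaining records that point at the same physical file
        $count = $this->newQuery()->where('disk_name', '=', $this->disk_name)
            ->where('file_name', '=', $this->file_name)
            ->where('file_size', '=', $this->file_size)
            ->count();
        // Only remove the file from disk when no other record references it
        if (!$count) {
            $this->deleteThumbs();
            $this->deleteFile();
        }
    }
    catch (Exception $ex) {
        // Double quotes so that \n is an actual newline
        traceLog($ex->getMessage() . "\n" . $ex->getTraceAsString());
    }
}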
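The question mentions naming files by a hash of their contents. Here is a minimal sketch of that lookup, using a plain PHP array as a stand-in for the database table. The function and field names are hypothetical. In a real setup the isset check would be a SELECT on an indexed hash column.

```php
<?php
// Sketch: detect duplicate uploads by content hash.
// $index stands in for a database table with a unique index on the hash column.
function register_upload(array &$index, string $tmpPath, string $originalName): array
{
    $hash = hash_file('sha256', $tmpPath); // same bytes always give the same hash
    $isDuplicate = isset($index[$hash]);
    if (!$isDuplicate) {
        // First time this content is seen: this is where the file would be moved to disk.
        $index[$hash] = ['disk_name' => $hash, 'references' => []];
    }
    // Every upload gets its own reference, even when the bytes already exist.
    $index[$hash]['references'][] = $originalName;
    return ['disk_name' => $hash, 'duplicate' => $isDuplicate];
}
```

Unlike matching on name, size and content type, a content hash also catches the same file uploaded under a different name.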

PHP Sequentially naming game ids in a dynamic list of game names

Every time I make a game of a certain type, say cricket, I have to name it BTAG-n, where n is the number of games already made for that type,
e.g. cricket BTAG-0, cricket BTAG-1, hockey BTAG-0, soccer BTAG-0.
I cannot use a database for this, as the number of different types of games will change with time. So I tried using files.
$filename = '/game_data/'.$name.'.txt';
$count = '0';
if (!file_exists($filename)) {
    file_put_contents($filename, $count);
} else {
    $count = ((int)file_get_contents($filename)) + 1;
    file_put_contents($filename, $count);
}
$randNumber = "BTAG-".$count;
But $count is always 0, I assume because file_put_contents and file_get_contents don't work. I can't find how to enable errors or change permissions, as there is no php.ini file in my cPanel (I inherited this project from another person I have no contact with; maybe he deleted it).
Any help appreciated.
You're trying to write to the root folder of your server, which is most likely not writable by your PHP process. Try removing the "/" at the beginning of your file path; PHP will then attempt to write the file relative to the location of your script.
You might also need to create the "game_data" folder there.
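Putting that advice together, here is a sketch of the counter with a script-relative directory, error reporting switched on from code (so no php.ini is needed), and the return values checked. The function name is hypothetical. Note that the read-then-write is still not safe if two requests arrive at the same moment.

```php
<?php
// Show errors while debugging (remove on a production site).
error_reporting(E_ALL);
ini_set('display_errors', '1');

// Hypothetical helper: return the next "BTAG-n" id for a game type.
function next_game_id(string $name, string $dataDir): string
{
    // Create the data directory if it is missing.
    if (!is_dir($dataDir) && !mkdir($dataDir, 0755, true)) {
        throw new RuntimeException("Cannot create $dataDir");
    }
    $filename = $dataDir . '/' . $name . '.txt';
    // The first game of a type is 0; afterwards increment the stored value.
    $count = file_exists($filename) ? ((int) file_get_contents($filename)) + 1 : 0;
    if (file_put_contents($filename, (string) $count, LOCK_EX) === false) {
        throw new RuntimeException("Cannot write $filename");
    }
    return 'BTAG-' . $count;
}

// Usage, with the folder next to the script:
// echo next_game_id('cricket', __DIR__ . '/game_data');
```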

How to create increment folder name in php

I have an HTML form with three inputs:
name
consultant id (number)
picture upload
After the user submits the form, a PHP script would:
Create a folder with the submitted name
Inside the folder, create a txt file with: name + consultant id (given number)
Inside the folder, store the image uploaded by the user
The most important thing I want is that the folders created by the PHP script are numbered sequentially, each one higher by 1 than the last: folder1 (txt file + image), folder2 (txt file + image), folder3 (txt file + image) and so on...
There are a few different methods for accomplishing what you describe. One option would be to look at all existing folders (directories) when you attempt to create a new one and determine the next highest number.
You can accomplish this by using scandir on your parent output directory to find the existing entries.
Example:
$max = 0;
$files = scandir("/path/to/your/output-directory");
$matches = [];
foreach ($files as $file) {
    // Match only names of the exact form "folder<number>"
    if (preg_match("/^folder(\d+)$/", $file, $matches)) {
        $number = intval($matches[1]);
        if ($number > $max)
            $max = $number;
    }
}
$newNumber = $max + 1;
That is a simple example to get you the next number. There are many other factors to consider. For instance, what happens if two users submit the form concurrently? You would need some synchronization mechanism (such as a semaphore or file lock) to ensure only one insert can occur at a time.
You could use a separate lock file to store the current number and function as a synchronization method.
I would highly encourage finding a different way to store the data. Using a database to store this data may be a better option.
If you need to store the files on disk, locally, you may consider other options for generating the directory name. You could use a timestamp, a hash of the data, or a combination thereof, for instance. You may also be able to get by with something like uniqid. Any filesystem option will require some form of synchronization to address race conditions.
Here is a more complete example for sequentially creating directories using a lock file for the sequence and synchronization. This omits some error handling that should be added for production code, but should provide the core functionality.
define("LOCK_FILE", "/some/file/path"); // A file for synchronization and to store the counter
define("OUTPUT_DIRECTORY", "/some/directory"); // The directory where you want to write your folders
// Open the lock file; "c+" creates it if missing and does not truncate it
$file = fopen(LOCK_FILE, "c+");
if ($file === false) {
    die("Unable to open lock file");
}
if (flock($file, LOCK_EX)) {
    // Read the current value of the file; if empty, default to 0
    $last = (int) fgets($file);
    // Increment to get the current ID
    $current = $last + 1;
    // Write over the existing value (a larger number will always completely overwrite a smaller number written from the same position)
    rewind($file);
    fwrite($file, (string) $current);
    fflush($file);
    // Determine the path for the next directory
    $dir = OUTPUT_DIRECTORY . "/folder$current";
    if (file_exists($dir)) {
        die("Directory $dir already exists. Lock may have been reset");
    }
    // Create the next directory
    mkdir($dir);
    // TODO: Write your content to $dir (You'll need to provide this piece)
    // Release the lock
    flock($file, LOCK_UN);
}
else {
    die("Unable to acquire lock");
}
// Always close the file handle
fclose($file);
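If the names don't have to be sequential, the uniqid option mentioned above can be sketched without a counter or lock file. mkdir fails when the directory already exists, so the mkdir call itself acts as the synchronization point: a successful call means this request owns the new name. The function name and retry count are hypothetical.

```php
<?php
// Sketch: create a uniquely named directory without a counter or lock file.
function create_unique_dir(string $parent, int $maxAttempts = 5): string
{
    for ($i = 0; $i < $maxAttempts; $i++) {
        // uniqid() with more_entropy=true makes collisions very unlikely
        $dir = $parent . '/' . uniqid('folder_', true);
        // @ silences the warning mkdir raises when the name is already taken
        if (@mkdir($dir, 0755)) {
            return $dir;
        }
    }
    throw new RuntimeException("Could not create a unique directory in $parent");
}
```

The trade-off is that the folders are no longer named folder1, folder2, ..., so this only fits if the numbering itself is not a requirement.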

Organise & Name Uploaded Files

What's the best way to organise and name uploaded files?
I may be wrong on this, but I feel it's not good practice to save all files in one directory, as the filesystem will get slow if that directory ever needs to be browsed.
All thoughts and ideas welcome.
Cheers
I use a function like this to generate a 3-level folder tree for uploaded files:
function get_1000_path($id)
{
    // Left-pad the ID with zeros to 9 digits, e.g. 1234 becomes "000001234"
    $id = str_pad(strval($id), 9, "0", STR_PAD_LEFT);
    // Split into three 3-digit levels: "000/001/234/"
    $path = substr($id, 0, 3)."/".substr($id, 3, 3)."/".substr($id, 6, 3)."/";
    return $path;
}
Each level then holds at most 1000 subfolders, so the filesystem keeps working well.
It depends on how your application works.
If it's something like user galleries, I would simply create one folder for every user, then a year -> month structure inside it.
If files are commonly shared by all the users, I would organize them in different directories based on time (year -> month -> week -> day). If your site is not expecting a lot of traffic, a year -> month division should be enough.
I would use the uniqid() function to rename the uploaded files, so that there is no need to check for overwrites or duplicates.
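A minimal sketch of the per-user year -> month layout with uniqid() file names. The function name, the "uploads" base and the optional timestamp parameter are assumptions for illustration.

```php
<?php
// Hypothetical helper: build "<base>/<userId>/<year>/<month>/<uniqid>.<ext>".
function gallery_upload_path(int $userId, string $originalName, string $base = 'uploads', ?int $now = null): string
{
    $now = $now ?? time();
    // Keep only the (lower-cased) extension of the original name
    $ext = strtolower(pathinfo($originalName, PATHINFO_EXTENSION));
    $name = uniqid() . ($ext !== '' ? '.' . $ext : '');
    return sprintf('%s/%d/%s/%s', $base, $userId, date('Y/m', $now), $name);
}

echo gallery_upload_path(42, 'Holiday.JPG'), "\n";
```

The original file name is dropped from the path on purpose; if you need it later, store it in the database next to the generated path.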
I think it depends on your system/site and your needs. For a user portal, a layout like user_images/USER-ID/filename-unid_id.jpg could work better. For a news site, something like uploads/news_images/YYYY/MM/DD/filename-unid_id.jpg could be useful. In that case you don't need a long unid_id such as an md5 hash; a short PHP hash is enough, e.g. $unid_id = hash('crc32b', microtime()), because collisions are very unlikely there.
// in your upload file
$dir = "uploads/news_images/" . date("Y/m/d"); // e.g. uploads/news_images/2012/12/12
if (!is_dir($dir)) {
    mkdir($dir, 0777, true); // recursive: creates the year, month and day folders
    if (!is_writable($dir)) chmod($dir, 0777); // try again changing mod
}
In both examples you still need to store the file name/path of each file. You can store that data like this (personally I use JSON when I need to store several values together and don't need to search on them):
// table "news_images"
id | file_name | file_data
---+-----------+----------------------------------------------------------------
1  | blue fish | {"path":"2012/12/12", "slug":"blue-fish-9834a12b", "ext":"jpg" ...
Good luck.
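To make the JSON column concrete, here is a small sketch that packs the three keys shown in the example row and rebuilds the file path from them. The function name and the base directory are assumptions. Any further keys in the truncated row are not reproduced here.

```php
<?php
// Sketch: pack per-file details into one JSON column and rebuild the path later.
// Only the keys visible in the example row are used.
$fileData = json_encode(['path' => '2012/12/12', 'slug' => 'blue-fish-9834a12b', 'ext' => 'jpg']);

function news_image_path(string $fileDataJson, string $base = 'uploads/news_images'): string
{
    $data = json_decode($fileDataJson, true);
    if (!is_array($data)) {
        throw new InvalidArgumentException('Invalid file_data JSON');
    }
    return sprintf('%s/%s/%s.%s', $base, $data['path'], $data['slug'], $data['ext']);
}

echo news_image_path($fileData), "\n";
```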

Auto create directories after uploading with PHP

I'm creating a PHP site where a company will upload a lot of images. I'd like each folder to contain up to 500-1000 files, and PHP to automatically create a new one once the previous folder contains more than 1000 files.
For example, px300 has a folder dir1 which stores 500 files; then a new one, dir2, will be created.
Are there any existing solutions?
This task is simple enough not to require an existing solution. You can make use of scandir to count the number of files in a directory, and then mkdir to make a directory.
// Make sure we don't count the . and .. entries
if (count(scandir("dir")) - 2 > 1000) {
    mkdir("newdir");
}
A common approach is to create one-letter directories based on the file name. This works particularly well if you assign random names to files (and random names are good to avoid name conflicts in user uploads):
/files/a/c/acbd18db4cc2f85cedef654fccc4a4d8
/files/3/7/37b51d194a7513e45b56f6524f2d51f2
In this way:
if ($h = opendir('/dir')) {
    $files = 0;
    while (false !== ($file = readdir($h))) {
        // skip the . and .. entries
        if ($file !== '.' && $file !== '..') {
            $files++;
        }
    }
    closedir($h);
    if ($files > 1000) {
        // create dir
        mkdir('/newdir');
    }
}
You could use the glob function. It returns an array of the paths matching your pattern, which you can count to get the number of files.
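A sketch of the glob approach applied to the dir1, dir2, ... scheme from the question: find the highest-numbered folder, and start the next one once it holds the limit. The function name is hypothetical. Note that glob's * pattern does not match dotfiles, and that concurrent uploads would need a lock around this check.

```php
<?php
// Sketch: pick the directory a new upload should go into, starting a new
// "dirN" once the current one holds $limit files.
function current_upload_dir(string $parent, int $limit = 1000): string
{
    $dirs = glob($parent . '/dir*', GLOB_ONLYDIR) ?: [];
    // Highest existing number, or 0 if there are no directories yet.
    $n = 0;
    foreach ($dirs as $d) {
        if (preg_match('/^dir(\d+)$/', basename($d), $m)) {
            $n = max($n, (int) $m[1]);
        }
    }
    $current = $parent . '/dir' . max($n, 1);
    // Count the entries in the current directory; roll over when it is full.
    $count = is_dir($current) ? count(glob($current . '/*') ?: []) : 0;
    if ($count >= $limit) {
        $current = $parent . '/dir' . ($n + 1);
    }
    if (!is_dir($current)) {
        mkdir($current, 0755, true);
    }
    return $current;
}
```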
