I have hundreds of mp3 files on my server. Each file's modified date is important because it is read by PHP's filemtime to represent its upload date (since there's no way to determine an upload time without storing values in a database).
I have come across an audio issue in which all the files need to be normalized and re-uploaded to the server. This would, of course, change the modified date of each file to "today". I need each file to retain its original modified date.
I'm not sure if this is a software-recommendation question or a programming question, so I apologize if this is the wrong .SE site. Is this even possible?
You should be able to set the modified time with touch: http://php.net/manual/en/function.touch.php
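touch() has been part of PHP since PHP 4, so no special version is needed. The user running the script (probably your web user, unless you run it from the CLI) needs write permission on the file. On POSIX systems, setting an explicit timestamp generally also requires that user to own the file; write permission alone is only enough to set the time to "now".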
You have two options for implementation:
Store the filenames and their mtimes in temporary storage (either a file or a database table). When you finish the upload, run through all of the files and use touch to reset the mtime.
As you upload the files, check to see if the file already exists. If it does, grab the mtime in a temporary variable, overwrite the file, then touch it with the correct mtime.
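The second option can be sketched roughly as follows. This is only a minimal illustration: replace_preserving_mtime() is a hypothetical helper name, and it assumes the normalized file has already been uploaded to some temporary path on the server.

```php
<?php
// Hypothetical helper: overwrite $target with $source but keep $target's old mtime.
function replace_preserving_mtime(string $source, string $target): bool
{
    clearstatcache(true, $target);
    // Remember the old modification time if the target already exists.
    $oldMtime = is_file($target) ? filemtime($target) : false;

    if (!copy($source, $target)) {
        return false;
    }
    if ($oldMtime === false) {
        return true; // New file: there is no earlier mtime to restore.
    }
    // touch() sets the modification time (and access time) to the saved value.
    return touch($target, $oldMtime);
}
```

For the first option you would collect filename => filemtime() pairs before the upload and call touch() on each file afterwards.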
I know this isn't the answer you're looking for, but it would make far more sense to start storing this information in a database than relying on the last-modified date. This way you can show your users the date that they need to know and retain the true date of modification.
An approach like this also gives you much more flexibility.
As requested by #Snailer - for the sake of closing the question.
Related
I am trying to convert our uploaded filenames from an unreadable pile of files to an organized and human-readable structure of files. I am wondering if there are any additional security measures I need to take to secure this type of system.
To give a brief overview, the current system uploads files, generates a random filename, and allows the files to be accessed only through a download script (I have no need to serve the files directly to the browser).
In short, I'd like to implement a WebDav system and would think the easiest solution would be to store uploaded files with their original name (separated into different folders).
Thank you
Edit: To clarify, I'd like to retain the filename as much as possible, but I'd obviously need to at least sanitize the filenames first. I've considered chmod-ing the containing folder to prevent execution (a folder located outside of the web directory). What else am I not considering?
In short, I'd like to implement a WebDav system and would think the easiest solution would be to store uploaded files with their original name
This is a pretty broad question, but to keep the answer short: NEVER TRUST USER PROVIDED DATA. You must always do server-side validation and sanitization, otherwise you will be hacked sooner or later.
The original file name is sent by the client, so it can be anything. Here are some ideas of what I'd try to send you as "original" file names, knowing you are so carefree: ../../../../etc/passwd or ../../config/db.php. Handle as-it-comes. Enjoy :)
EDIT
I should have mentioned things I've considered -- sanitizing filenames
A sanitized file name is not an original file name any more. However, there is an approach you could consider here to meet your goal and still stay safe. You could validate/sanitize the original file name, and if after that it is still the same as it came from the user, you can keep the file and retain the original name. If it is not, then you should reject the file upload as a whole. At the end of the day you will have only files that you can allow to be accessed with their original file names via other APIs/interfaces.
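A rough sketch of that "accept only if sanitizing changes nothing" check. The function name and the character whitelist (letters, digits, dot, underscore, hyphen, no leading dot) are assumptions for illustration; pick whatever rules suit your system.

```php
<?php
// Hypothetical check: the name is acceptable only if sanitizing leaves it unchanged.
function is_acceptable_filename(string $name): bool
{
    // basename() drops any directory part such as "../../".
    $clean = basename($name);
    // Replace every character outside the assumed whitelist.
    $clean = preg_replace('/[^A-Za-z0-9._-]/', '_', $clean);
    // Refuse hidden files such as ".htaccess".
    $clean = ltrim($clean, '.');

    return $clean !== '' && $clean === $name;
}
```

With this, a name like song_01.mp3 passes, while ../../../../etc/passwd, .htaccess and "my file.doc" are rejected rather than silently renamed.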
EDIT
I've considered chmod-ing the containing folder to prevent execution
This is weak security on its own. You should rather keep the files in a folder that is not directly accessible.
From How to Securely Allow Users to Upload Files:
Always Store Uploaded Files Outside of the Document Root
If your website is example.com and when a visitor accesses this website in their browser, the script located at /home/example/public_html/index.php is executed, then you should not be storing the files that users have uploaded in /home/example/public_html/ or any of its subdirectories. A good candidate, instead, would be /home/example/uploaded/.
...
Instead of storing the file at /home/example/uploaded/some/directories/user_provided.file, store all relevant metadata in a database record (while taking care to prevent SQL injection vulnerabilities) and use a random filename for the actual filesystem storage.
This does three things:
It guarantees that your user's file will never be executed as a script. They get read-only access whether they like it or not. (No reverse shells!)
It prevents the user from controlling the filename, to prevent security-critical files from being overwritten.
It allows you to retain as much metadata about each file as you'd like without sacrificing security.
If you need a real implementation to reference, here are two from a CMS that I'm developing:
Uploading files
Serving files
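The random-filename idea from the quoted article could look something like this. It is only a sketch: build_upload_record() and the array keys are made up for illustration, and the actual database insert (with a prepared statement) is left out.

```php
<?php
// Hypothetical helper: builds the metadata you would store for one upload.
function build_upload_record(string $originalName, string $storageDir): array
{
    // 16 random bytes -> 32 hex characters; never derived from user input.
    $storedName = bin2hex(random_bytes(16));

    return [
        'original_name' => $originalName, // display only; escape on output
        'stored_path'   => rtrim($storageDir, '/') . '/' . $storedName,
    ];
}

$record = build_upload_record('../../etc/passwd', '/home/example/uploaded');
// The user-supplied name never touches the filesystem; encode it before showing it in HTML.
echo htmlspecialchars($record['original_name'], ENT_QUOTES, 'UTF-8'), "\n";
```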
I want to allow registered users of a website (PHP) to upload files (documents), which are going to be publicly available for download.
In this context, is the fact that I keep the file's original name a vulnerability?
If it is one, I would like to know why, and how to get rid of it.
While this is an old question, it's surprisingly high on the list of search results when looking for 'security file names', so I'd like to expand on the existing answers:
Yes, it's almost surely a vulnerability.
There are several possible problems you might encounter if you try to store a file using its original filename:
the filename could be a reserved or special file name. What happens if a user uploads a file called .htaccess that tells the webserver to parse all .gif files as PHP, then uploads a .gif file with a GIF comment of <?php /* ... */ ?>?
the filename could contain ../. What happens if a user uploads a file with the 'name' ../../../../../etc/cron.d/foo? (This particular example should be caught by system permissions, but do you know all locations that your system reads configuration files from?)
if the user the web server runs as (let's call it www-data) is misconfigured and has a shell, how about ../../../../../home/www-data/.ssh/authorized_keys? (Again, this particular example should be guarded against by SSH itself (and possibly the folder not existing), since the authorized_keys file needs very particular file permissions; but if your system is set up to give restrictive file permissions by default (tricky!), then that won't be the problem.)
the filename could contain the \x00 byte, or control characters. System programs may not respond to these as expected - e.g. a simple ls -al | cat (not that I know why you'd want to execute that, but a more complex script might contain a sequence that ultimately boils down to this) might execute commands.
the filename could end in .php and be executed once someone tries to download the file. (Don't try blacklisting extensions.)
The way to handle this is to roll the filenames yourself (e.g. md5() on the file contents or the original filename). If you absolutely must preserve the original filename to the best of your ability, whitelist the file extension, mime-type check the file, and whitelist what characters can be used in the filename.
Alternatively, you can roll the filename yourself when you store the file and for use in the URL that people use to download the file (although if this is a file-serving script, you should avoid letting people specify filenames here, anyway, so no one downloads your ../../../../../etc/passwd or other files of interest), but keep the original filename stored in the database for display somewhere. In this case, you only have SQL injection and XSS to worry about, which is ground that the other answers have already covered.
That depends on where you store the filename. If you store the name in a database, in a strictly typed variable, and then HTML-encode it before you display it on a web page, there won't be any issues.
The name of the files could reveal potentially sensitive information. Some companies/people use different naming conventions for documents, so you might end up with:
Author name ( court-order-john.smith.doc )
Company name ( sensitive-information-enterprisename.doc )
File creation date ( letter.2012-03-29.pdf )
I think you get the point, you can probably think of some other information people use in their filenames.
Depending on what your site is about this could become an issue (consider if wikileaks published leaked documents that had the original source somewhere inside the filename).
If you decide to hide the filename, you must consider the problem of somebody submitting an executable as a document, and how you make sure people know what they are downloading.
I have a question about the upload filename standards, if any, that you are using. Imagine you have an application that allows many types of documents to be uploaded to your server and placed into a directory. Perhaps the same document could even be uploaded twice. Usually, you have to make some kind of unique filename adjustment when saving the document. Assume it is saved in a directory, not saved directly into a database. Of course, the metadata would probably need to be saved into the database. Perhaps the typical PHP upload methods could be used; simple enough to do.
Possible Filenaming Standard:
1.) Append the document filename with a unique id: image.png changed to image_20110924_ahd74vdjd3.png
2.) Perhaps use a UUID/GUID and store the actual file type (meta) in a database: 2dea72e0-a341-11e0-bdc3-721d3cd780fb
3.) Perhaps a combination: image_2dea72e0-a341-11e0-bdc3-721d3cd780fb.png
Can you recommend a good standard approach?
Thanks, Jeff
I always just hash the file using md5() or sha1() and use that as a filename.
E.g.
3059e384f1edbacc3a66e35d8a4b88e5.ext
And I would save the original filename in the database in case I ever need it.
This will make the filename unique AND it makes sure you don't have the same file multiple times on your server (since they would have the same hash).
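A small sketch of the hashing approach. content_hash_name() is a hypothetical helper, and it uses hash_file() with SHA-1 rather than calling md5() or sha1() on a string, so the file does not have to be read into memory first.

```php
<?php
// Hypothetical helper: name a stored file after the SHA-1 of its contents.
function content_hash_name(string $path, string $extension): string
{
    // hash_file() streams the file, so large uploads are not loaded into memory at once.
    return hash_file('sha1', $path) . '.' . $extension;
}
```

Two uploads with identical contents produce the same name, which is what gives you the de-duplication described above.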
EDIT
As you can see I had some discussion with zerkms about my solution and he raised some valid points.
I would always serve the file through PHP instead of letting user download them directly.
This has some advantages:
I would add a record to the database whenever a user uploads a file. This would contain the user who uploaded the file, the original filename and the hash of the file.
If a user wants to delete a file you just delete the record of the user with that file.
If no other users have the file after the delete, you can delete the file itself (or keep it anyway).
You should not keep the files somewhere in the document root, but rather somewhere else where it isn't accessible by the public and serve the file using PHP to the user.
A disadvantage as zerkms has pointed out is that serving files through PHP is more resource consuming, although I find the advantages to be worth the extra resources.
Another thing zerkms has pointed out is that the extension isn't really needed when saving the file under its hash (since it is already in the database), but I always like to know what kind of files are in the directory by simply doing an ls -la, for example. Again, though, it isn't really necessary.
I will be sending new files over from one computer to another computer. How do I make PHP auto detect new/updated files in the folders and enter the information inside the files into mysql database?
Get all files you already know from the database
loop through the directory with http://www.php.net/manual/de/function.readdir.php
if the file is known, do nothing
if the file is not known, add it to the database
In the end, delete all files no longer in the directory
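The steps above can be sketched like this. It is only an outline: diff_directory() is a made-up name, it uses scandir() instead of a readdir() loop for brevity, and the list of known names is assumed to have been loaded from the database already.

```php
<?php
// Compare the files in $dir with the names already recorded (e.g. loaded from the database).
// Returns which names should be inserted and which records should be deleted.
function diff_directory(string $dir, array $knownNames): array
{
    $present = [];
    foreach (scandir($dir) as $entry) {
        if (is_file($dir . '/' . $entry)) { // skips ".", ".." and subdirectories
            $present[] = $entry;
        }
    }

    return [
        'added'   => array_values(array_diff($present, $knownNames)),
        'removed' => array_values(array_diff($knownNames, $present)),
    ];
}
```

You would then run one INSERT per 'added' name and one DELETE per 'removed' name.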
I would pick a set-up where new files and old files are in separate directories.
But if you have no choice, you could check the modification date and match it with your last directory iteration. (Use filemtime for this).
Don't forget to do some database checking when you process an image though.
Save the timestamp of the last check, and on the next check look at each file's info and compare its creation date. Better yet, because you store the file contents in a database, check the time it was last modified using filemtime().
You can't. PHP works as a preprocessor and it also has an execution time limit (set in the configuration). If you need to do this with PHP, then make a PHP script that outputs a web page that uses a meta redirect to itself. Inside the script, loop over the files and query the database for each file's name and modification time. If both match, there is nothing to do; if only the file name exists, it's an update; otherwise it's a new file.
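For example, a rough sketch of the "modified since the last check" test. files_modified_since() is a hypothetical name, and where you persist the last-check timestamp (a file, a database row) is up to you.

```php
<?php
// Return the names of regular files in $dir whose mtime is newer than $lastCheck.
function files_modified_since(string $dir, int $lastCheck): array
{
    clearstatcache(); // make sure filemtime() does not return cached values
    $changed = [];
    foreach (glob(rtrim($dir, '/') . '/*') ?: [] as $path) {
        if (is_file($path) && filemtime($path) > $lastCheck) {
            $changed[] = basename($path);
        }
    }
    return $changed;
}
```

After processing the returned files, store the time at which the scan started as the new last-check value.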
I'm generating a unique filename for uploaded files with the following code
$date = date( 'U' );
$user = $_SERVER['REMOTE_ADDR'];
$filename = md5($date.$user);
The problem is that I want to use this filename again later on in the script, but if the script takes a second to run, I'm going to get a different filename the second time I try to use this variable.
For instance, I'm using an upload/resize/save image upload script. The first operation of the script is to copy and save the resized image, which I use a date function to assign a unique name to. Then the script processes the save, saves the whole upload, and assigns it a name. At the end of the script ($thumb and $full are the variables), I need to insert the filenames I used when I saved the uploads into a MySQL database.
The problem is that sometimes, on large images, it takes more than a second (or the second ticks over during the process), resulting in a different filename being put into the database than the one the file is actually saved under.
Is this just not a good idea to use this method of naming?
AFAIK it's a great way to name the files, although I would check file_exists() and maybe tack on a random number.
You need to store that filename in a variable and reference it again later, instead of relying on the algorithm each time. This could be stored in the user $_SESSION, a cookie, a GET variable, etc between pageloads.
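Within a single request that boils down to something like the sketch below. The '_thumb' suffix and the .jpg extension are just placeholders; the point is that md5() is called once and everything else reuses $filename.

```php
<?php
// Compute the name exactly once, at the top of the script.
$user     = $_SERVER['REMOTE_ADDR'] ?? '127.0.0.1'; // fallback only for CLI runs
$filename = md5(date('U') . $user);

// Derive every later name from the stored value instead of calling date() again.
$thumb = $filename . '_thumb.jpg';
$full  = $filename . '.jpg';

// ... resize and save the thumbnail as $thumb, save the upload as $full ...
// ... then insert $thumb and $full into the database ...
```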
Hope that helps
I just want to add that PHP has a function to create identifiers: uniqid. You can also prefix the identifier with a string (a date, maybe?).
Always validate your user's input, and the server headers!
I would recommend storing the file name in the session (as per AI). If you store it in one of the other variables, it is more likely for the end user to be able to attack the system through it. MD5 of user concatenated with rand() would be a nice way to get a long list of unique values. Just using rand() would probably have a higher percentage of conflicts.
I am not sure about the process that you are following for uploading files, but another way to handle file uploads is with PHP's built-in handlers. You can upload the file and then use the "secure" methods for pulling uploaded files out of the temporary space. (The temporary space in this instance can be safely located outside of the open_basedir directive to prevent tampering.) is_uploaded_file() and move_uploaded_file(), as shown in example 2 at http://php.net/manual/en/features.file-upload.post-method.php, might handle the problem you are encountering.
Definitely check for an existing file in that location if you are choosing a filename on the fly. If user input is allowed in any way shape or form, validate and filter the argument to make sure it is safe. Also, if the storage folder is web accessible, make sure you munge the name and probably the extension as well. You do not want someone to be able to upload code and then be able to execute it. That officially leads to BAD activities.
I just discovered that PHP has a built-in function for this, called tempnam. It even avoids race conditions. See http://php.net/manual/en/function.tempnam.php.
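A minimal usage sketch, assuming the system temp directory as the target and 'up_' as an arbitrary prefix. tempnam() creates the empty file itself and returns its path, which is why two requests cannot end up with the same name.

```php
<?php
// tempnam() creates a uniquely named empty file and returns its full path.
$path = tempnam(sys_get_temp_dir(), 'up_');
if ($path === false) {
    throw new RuntimeException('Could not create a temporary file');
}

// The file already exists at this point, so writing to it cannot clash with another request.
file_put_contents($path, 'uploaded data goes here');
```

Note that the generated name has no extension, so keep the real file type in your database if you need it.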
Why not use
$filename = md5(rand());
This will be pretty much unique in every case. And if you find that $filename already exists you can just call it again.
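The "call it again if it exists" part could be written as a loop, roughly like this. unused_random_name() is a made-up helper, and it swaps md5(rand()) for random_bytes(), which draws from a much larger space. There is still a small window between the check and the moment you actually create the file.

```php
<?php
// Hypothetical helper: keep generating names until one is not taken in $dir.
function unused_random_name(string $dir): string
{
    do {
        $name = bin2hex(random_bytes(16)); // 32 hex characters
    } while (file_exists(rtrim($dir, '/') . '/' . $name));

    return $name;
}
```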
Using an ID that depends on time is not a good idea: if two images are uploaded at the same time, the later one can overwrite the earlier. You should look at a function such as uniqid(). However, if this upload/resize/save script is meant to be "single-user", then this is not such a big problem.
As for the problem itself: if I were you, I would just save the computed filename to a variable and use that variable from that point on. Recomputing something you have already computed is a waste of time. And when uploading some really big images, or several images at once, the script can take as long as 20 seconds. You cannot depend on everything finishing within one second.