I am in the middle of writing a script to upload files via PHP. What I would like to know is how to display the files already uploaded and, when one is clicked, offer it for download. Should I store the names and paths in a database, or just list the contents of a directory with PHP?
Check out handling file uploads in PHP. A few points:
Ideally you want to allow the user to upload multiple files at the same time. Just create extra file inputs dynamically with JavaScript for this;
When you get an upload, make sure you check that it is an upload with is_uploaded_file;
Use move_uploaded_file() to move the file to wherever you're going to store it;
Don't rely on what the client tells you the MIME type is;
Sending them back to the client can be done trivially with a PHP script but you need to know the right MIME type;
Try to verify that what you get is what you expect (e.g. if it is supposed to be a PDF file, use a library to verify that it is), particularly if you use the file for anything or send it to anyone else; and
I would recommend you store the file name of the file from the client's computer and display that to them regardless of what you store it as. The user is just more likely to recognise this than anything else.
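The checks in the list above could be combined roughly as follows. This is a minimal sketch, assuming PHP 7.1+ with the fileinfo extension; the function name store_upload() and its parameters are hypothetical, not part of PHP.

```php
<?php
// Hypothetical helper: validates one entry of $_FILES and moves it into $destDir.
// Returns the stored path on success, or null if the entry is rejected.
function store_upload(array $entry, string $destDir, array $allowedTypes): ?string
{
    // Reject failed uploads and anything that did not arrive via HTTP POST.
    if (($entry['error'] ?? UPLOAD_ERR_NO_FILE) !== UPLOAD_ERR_OK
        || !is_uploaded_file($entry['tmp_name'] ?? '')) {
        return null;
    }

    // Detect the type from the file contents, not from the client's $entry['type'].
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $mime = $finfo->file($entry['tmp_name']);
    if (!in_array($mime, $allowedTypes, true)) {
        return null;
    }

    // Store under a generated name; the client's name is kept only for display.
    $target = rtrim($destDir, '/') . '/' . bin2hex(random_bytes(16));
    return move_uploaded_file($entry['tmp_name'], $target) ? $target : null;
}
```

It might be called as store_upload($_FILES['fileField'], '/var/uploads', ['application/pdf']), with the client's original name saved separately for display.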
Storing paths in the database might be okay, depending on your specific application, but consider storing only the filenames in the database and constructing the paths to those files in PHP, in a single place. That way, if you end up moving all uploaded files later, there is only one place in your code where path generation needs to change, and you avoid a large data transformation on a "path" field in the database.
For example, for the file 1234.txt, you might store it in:
/your_web_directory/uploaded_files/1/2/3/1234.txt
You can use a configuration file or, if you prefer, a global somewhere to define the path where your uploads are stored (/your_web_directory/uploaded_files/) and then split characters from the filename (in the database) to figure out which subdirectory the file actually resides in.
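A path-generating function along these lines might look like the sketch below. The constant UPLOAD_ROOT and the function upload_path() are illustrative names, and the sketch assumes every stored filename has at least one character before its extension.

```php
<?php
// Assumed configuration value; in practice this might live in a config file.
const UPLOAD_ROOT = '/your_web_directory/uploaded_files';

// Builds the storage path for a filename by using its first $depth
// characters as nested subdirectories, e.g. 1234.txt -> 1/2/3/1234.txt.
function upload_path(string $filename, int $depth = 3): string
{
    // basename() drops any directory components that slipped into the name.
    $filename = basename($filename);
    $stem = pathinfo($filename, PATHINFO_FILENAME);
    $dirs = str_split(substr($stem, 0, $depth));
    return UPLOAD_ROOT . '/' . implode('/', $dirs) . '/' . $filename;
}

echo upload_path('1234.txt'), "\n";
// prints /your_web_directory/uploaded_files/1/2/3/1234.txt
```

If the uploads move later, only UPLOAD_ROOT (or this one function) has to change.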
As for displaying your files, you can simply load your list of files from the database and use a path-generating function to get download paths for each one based on their filenames. If you want to paginate the list of files, use a LIMIT clause in MySQL, for example LIMIT 0, 50 (equivalently LIMIT 50 OFFSET 0) for the first page. Just pass in a new start offset for each successive page of upload results.
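As a rough illustration of the pagination, the offset can be derived from a page number and bound into a prepared statement. The table name uploads and its columns id and original_name are assumptions made for this sketch, as are the function names.

```php
<?php
// Converts a 1-based page number into the row offset for LIMIT/OFFSET.
function page_offset(int $page, int $perPage = 50): int
{
    return (max(1, $page) - 1) * $perPage;
}

// Hypothetical table "uploads" with columns id and original_name.
function fetch_page(PDO $db, int $page, int $perPage = 50): array
{
    $stmt = $db->prepare(
        'SELECT id, original_name FROM uploads ORDER BY id LIMIT :limit OFFSET :offset'
    );
    // Bind as integers so the values are not sent as quoted strings.
    $stmt->bindValue(':limit', $perPage, PDO::PARAM_INT);
    $stmt->bindValue(':offset', page_offset($page, $perPage), PDO::PARAM_INT);
    $stmt->execute();
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

Each row returned can then be turned into a download link by whatever path-generating function the application uses.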
Maybe you could use a plain text file as the index, like this:
myfile.txt
My Uploaded File||my_upload_dir/my_uploaded_file.pdf
Other Uploaded File||my_upload_dir/other_uploaded.html
and go through them like this:
<?php
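// Each line of myfile.txt has the form "Display Name||path/to/file"
$file = "myfile.txt";
$lines = file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$files = array();
for ($i = 0; $i < count($lines); $i++) {
    // Split the line into display name and stored path
    $parts = explode("||", $lines[$i]);
    $name = $parts[0];
    $filename = $parts[1];
    $files[$i][0] = $name;
    $files[$i][1] = $filename;
}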
print_r($files);
?>
hope this helps. :)
What I always did (past tense, I haven't written an upload script for ages) is, I'd link up an upload script (any upload script) to a simple database.
This offers some advantages:
You do not give your users direct insight into your file system (what if there is a leak in your 'browse' script and you expose your whole hard drive?)
You can store extra information and meta-data in an easy and efficient way
You can actually query for files / meta-data instead of just looping through all the files
You can enable a 'safe-delete', where you delete the row, but keep the file (for example)
You can enable logging way more easily
Showing files in pages is easier
You can 'mask' files. Using a database enables you to store a 'masked' filename, and a 'real' filename.
Obviously, there are some disadvantages as well:
It is a little harder to migrate, since your file system and database have to be kept in sync
If an operation fails (on either end) you are left with either a 'corrupt' database or a 'corrupt' file system
As mentioned before (but we can not mention enough, I'm afraid); _Keep your uploading safe!_
The MIME type / extension issue has been going on for ages. I think most of the web is solid nowadays, but there used to be a time when developers would check either the MIME type or the extension, but never both (why bother?). This resulted in websites being very, very leaky.
If not written properly, upload scripts are a big hole in your security. A great example of that is a website I 'hacked' a while back (at their request, of course). They supported uploading images to a photo album, but they only checked the file extension. So I uploaded a GIF with a directory scanner inside. This allowed me to scan through their whole system (and since it wasn't a dedicated server, I could see a little more than that).
Hope I helped ;)
I'm programming a file converter. The user uploads a file, e.g. test.txt, which is then converted, and a download link is sent back to the user. For security purposes I change the names of the files as soon as they are uploaded, as is also suggested here.
Instead create files and folders with randomly generated names like fg3754jk3h
The problem starts when it comes to download. For a better UX I want the downloadable files to have the same name as the user supplied files, not a random string. At the moment I also get an error in Chrome:
<Filename> is an unusual download and may be harmful. [translated]
I think this could also be a result of the cryptic file names.
So my question: what is the best way to change the file names back to the original ones without introducing security issues, or should I rather do a strict validation of the file names? And will this get rid of the error message shown?
You can provide the original filename when returning the file to the user. (see Downloading a file with a different name to the stored name for a few ways of doing it)
The point of not storing the file under its original name is to prevent a malicious user from uploading a script to your server that they can then execute. You should do that, but you should also put the files in a directory that your web server does not serve.
For example:
Your web server is pointing to /var/www
When you receive the uploaded file, store it in /var/uploads instead of /var/www/uploads. This way, the file is never directly accessible to users (at least not from the web)
You save the original filename on your database
You should still generate a random filename; this avoids filename collisions (many people will upload their cute-cat.jpg images). There's no problem with keeping the file extension, e.g. kr3242sd93fdsh.jpg
You provide an endpoint that lets the user download the file by some random string (I suggest not reusing the random string that you used to name the file): https://youserver.com/download?id=uoqq41jsak
In your download endpoint, you set the original filename in the filename parameter of the Content-Disposition header.
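A download endpoint following these steps might be sketched as below. The function names, the $row array keys and the default directory are assumptions for illustration; the lookup of $row by the public id is left out.

```php
<?php
// Builds a Content-Disposition value that offers $originalName as the
// download name. Quotes, backslashes and non-printable or non-ASCII bytes
// are replaced in the plain "filename" fallback; "filename*" carries the
// percent-encoded UTF-8 form (RFC 6266 / RFC 5987 syntax).
function content_disposition(string $originalName): string
{
    $fallback = preg_replace('/[^\x20-\x7E]|["\\\\]/', '_', $originalName);
    return sprintf(
        "attachment; filename=\"%s\"; filename*=UTF-8''%s",
        $fallback,
        rawurlencode($originalName)
    );
}

// Hypothetical endpoint body: $row would come from a database lookup by the
// public id, e.g. ['stored' => 'kr3242sd93fdsh.jpg', 'original' => 'cute-cat.jpg'].
function send_download(array $row, string $uploadDir = '/var/uploads'): void
{
    // basename() guards against directory components in the stored name.
    $path = $uploadDir . '/' . basename($row['stored']);
    header('Content-Type: application/octet-stream');
    header('Content-Length: ' . filesize($path));
    header('Content-Disposition: ' . content_disposition($row['original']));
    readfile($path);
}
```

Because the browser receives the original name through the header, the random name on disk never has to be exposed to the user.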
I used echo $_FILES["fileField"]["tmp_name"]; but the result looks like this:
C:\xampp\tmp\phpA9EE.tmp
How can I get the exact file path?
An uploaded file does not have a "full path", other than the temporary location where PHP has stored it during the upload process.
For the security of users, the browser sends only a filename of where it came from on the remote computer; for your security, you should not blindly use this (security rule of thumb: anything sent by the user is suspect and could be used to attack your system). You might want to filter it through a whitelist (e.g. remove anything other than letters and numbers) and use it as a "friendly" upload name, or you might want to ignore it completely. The browser also sends a file type (e.g. image/jpeg); again, this should not be trusted - the only way to know the type of a file is to use a command that looks at the content and validates it.
As far as PHP is concerned, what has been uploaded is a chunk of binary data; it saves this to a randomly named file, which is the path you have echoed there. The PHP manual has an introduction to how this works.
With that path you can do one of two things:
validate with is_uploaded_file(), and read the data with file_get_contents() or similar
use move_uploaded_file() to put it in a permanent location of your choice
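The whitelist filtering of the client-supplied name mentioned above could look something like this. It is only a sketch: the function name friendly_name() is made up, and allowing '.', '-' and '_' in addition to letters and digits is a choice made here, not something the answer prescribes.

```php
<?php
// Reduces a client-supplied name to letters, digits, '-', '_' and '.',
// for use as a display label only (never as a storage path).
function friendly_name(string $clientName, int $maxLength = 100): string
{
    // Some browsers have sent full Windows paths; keep only the last component.
    $name = basename(str_replace('\\', '/', $clientName));
    $name = preg_replace('/[^A-Za-z0-9._-]/', '_', $name);
    $name = ltrim($name, '.');           // no hidden files such as .htaccess
    return $name === '' ? 'upload' : substr($name, 0, $maxLength);
}
```

The result would typically be stored alongside the generated storage name, for example after move_uploaded_file() has placed the data in its permanent location.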
I want to allow registered users of a website (PHP) to upload files (documents), which are going to be publicly available for download.
In this context, is the fact that I keep the file's original name a vulnerability ?
If it is one, I would like to know why, and how to get rid of it.
While this is an old question, it's surprisingly high on the list of search results when looking for 'security file names', so I'd like to expand on the existing answers:
Yes, it's almost surely a vulnerability.
There are several possible problems you might encounter if you try to store a file using its original filename:
the filename could be a reserved or special file name. What happens if a user uploads a file called .htaccess that tells the webserver to parse all .gif files as PHP, then uploads a .gif file with a GIF comment of <?php /* ... */ ?>?
the filename could contain ../. What happens if a user uploads a file with the 'name' ../../../../../etc/cron.d/foo? (This particular example should be caught by system permissions, but do you know all locations that your system reads configuration files from?)
if the user the web server runs as (let's call it www-data) is misconfigured and has a shell, how about ../../../../../home/www-data/.ssh/authorized_keys? (Again, this particular example should be guarded against by SSH itself (and possibly the folder not existing), since the authorized_keys file needs very particular file permissions; but if your system is set up to give restrictive file permissions by default (tricky!), then that won't be the problem.)
the filename could contain the \x00 (NUL) byte, or control characters. System programs may not respond to these as expected - e.g. a simple ls -al | cat (not that I know why you'd want to execute that, but a more complex script might contain a sequence that ultimately boils down to this) might execute commands.
the filename could end in .php and be executed once someone tries to download the file. (Don't try blacklisting extensions.)
The way to handle this is to roll the filenames yourself (e.g. md5() on the file contents or the original filename). If you absolutely must preserve the original filename as far as possible, whitelist the file extension, check the file's MIME type, and whitelist the characters that can be used in the filename.
Alternatively, you can roll the filename yourself when you store the file and for use in the URL that people use to download the file (although if this is a file-serving script, you should avoid letting people specify filenames here, anyway, so no one downloads your ../../../../../etc/passwd or other files of interest), but keep the original filename stored in the database for display somewhere. In this case, you only have SQL injection and XSS to worry about, which is ground that the other answers have already covered.
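One way to roll the stored filename while still whitelisting the extension is sketched below. It uses random bytes rather than md5(), which is a substitution made here; the whitelist contents and the function name storage_name() are assumptions (PHP 7.1+).

```php
<?php
// Assumed whitelist; adjust to the document types the site actually accepts.
const ALLOWED_EXTENSIONS = ['pdf', 'doc', 'docx', 'txt'];

// Returns a generated storage name such as "<32 hex chars>.pdf", or null when
// the original extension is not whitelisted. The original name is never reused.
function storage_name(string $originalName): ?string
{
    $ext = strtolower(pathinfo($originalName, PATHINFO_EXTENSION));
    if (!in_array($ext, ALLOWED_EXTENSIONS, true)) {
        return null;
    }
    return bin2hex(random_bytes(16)) . '.' . $ext;
}
```

Names such as .htaccess, ../../etc/passwd or shell.php are rejected because their extensions are not on the list, and nothing from the original name other than a whitelisted extension reaches the file system.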
That depends where you store the filename. If you store the name in a database, in strictly typed variable, then HTML encode before you display it on a web page, there won't be any issues.
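For the display side, HTML encoding of a stored name can be as small as the following sketch (the wrapper name html_filename() is hypothetical; htmlspecialchars() is the standard PHP function).

```php
<?php
// Escapes a stored filename for safe inclusion in HTML text or attributes.
function html_filename(string $name): string
{
    return htmlspecialchars($name, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

echo html_filename('<script>alert(1)</script>.pdf'), "\n";
// prints &lt;script&gt;alert(1)&lt;/script&gt;.pdf
```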
The name of the files could reveal potentially sensitive information. Some companies/people use different naming conventions for documents, so you might end up with :
Author name ( court-order-john.smith.doc )
Company name ( sensitive-information-enterprisename.doc )
File creation date ( letter.2012-03-29.pdf )
I think you get the point, you can probably think of some other information people use in their filenames.
Depending on what your site is about, this could become an issue (consider WikiLeaks publishing leaked documents that had the original source somewhere inside the filename).
If you decide to hide the filename, you must consider the problem of somebody submitting an executable as a document, and how you make sure people know what they are downloading.
I have a topic/question concerning your upload filename standards, if any, that you are using. Imagine you have an application that allows many types of documents to be uploaded to your server and placed into a directory. Perhaps the same document could even be uploaded twice. Usually, you have to make some kind of unique filename adjustment when saving the document. Assume it is saved in a directory, not saved directly into a database. Of course, the Meta Data would probably need to be saved into the database. Perhaps the typical PHP upload methods could be the application used; simple enough to do.
Possible Filenaming Standard:
1.) Append the document filename with a unique id: image.png changed to image_20110924_ahd74vdjd3.png
2.) Perhaps use a UUID/GUID and store the actual file type (meta) in a database: 2dea72e0-a341-11e0-bdc3-721d3cd780fb
3.) Perhaps a combination: image_2dea72e0-a341-11e0-bdc3-721d3cd780fb.png
Can you recommend a good standard approach?
Thanks, Jeff
I always just hash the file using md5() or sha1() and use that as a filename.
E.g.
3059e384f1edbacc3a66e35d8a4b88e5.ext
And I would save the original filename in the database in case I ever need it.
This will make the filename unique AND it makes sure you don't have the same file multiple times on your server (since they would have the same hash).
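A sketch of this hash-based naming, assuming the file has already been validated and sits at some temporary path. The function name store_by_hash() is made up for illustration, and sha1_file() is used so the hash covers the file contents.

```php
<?php
// Moves $sourcePath into $storeDir under a name derived from its SHA-1 hash.
// Identical contents map to the same name, so a duplicate is stored only once.
function store_by_hash(string $sourcePath, string $storeDir, string $ext): string
{
    $target = rtrim($storeDir, '/') . '/' . sha1_file($sourcePath) . '.' . $ext;
    if (file_exists($target)) {
        unlink($sourcePath);            // same content is already stored
    } else {
        rename($sourcePath, $target);   // for real uploads use move_uploaded_file()
    }
    return $target;
}
```

The returned name, together with the uploader and the original filename, is what would go into the database record.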
EDIT
As you can see I had some discussion with zerkms about my solution and he raised some valid points.
I would always serve the file through PHP instead of letting user download them directly.
This has some advantages:
I would add a record to the database whenever a user uploads a file. This would contain the user who uploaded the file, the original filename and the hash of the file.
If a user wants to delete a file you just delete the record of the user with that file.
If no other user still has the file after the delete, you can delete the file itself (or keep it anyway).
You should not keep the files somewhere in the document root, but rather somewhere else where it isn't accessible by the public and serve the file using PHP to the user.
A disadvantage as zerkms has pointed out is that serving files through PHP is more resource consuming, although I find the advantages to be worth the extra resources.
Another thing zerkms has pointed out is that the extension isn't really needed when saving the file under its hash (since it is already in the database), but I always like to know what kind of files are in the directory by simply doing an ls -la, for example. Again, however, it isn't strictly necessary.
I'm making a really simple "backend" (PHP5) for two Flash/AIR applications. One of them will upload a photo, the backend will save it to a folder, and the second app will poll the backend for new photos and show them.
I don't have access to a database, so the backend has to be pure PHP5 and nothing more. That's why I chose to save the images to a folder (with a timestamp in their names) and use readdir() to get them back.
This all works like a charm. Nevertheless, I would really like to make sure the backend only returns photos that are completely uploaded, preventing the second app from trying to load an unfinished image. Are there any methods/tricks that I can use to validate a file?
You could check the filesize a couple hundred milliseconds apart and see if it changes:
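$first = filesize($file);
// wait 100ms
usleep(100000);
// filesize() results are cached, so clear the stat cache before re-reading
clearstatcache(true, $file);
$second = filesize($file);
if ($first === $second) {
    // size is stable; the file is most likely no longer being uploaded
}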
The usual trick for atomic filesystem operations is to write into a temporary file that is not matched by the reader (e.g. XXX.jpg.tmp) and, once it is completely uploaded, rename it to its target name. Renames on the same volume are atomic, so there is no moment at which a reader can see the file incomplete or missing.
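The write-then-rename pattern can be sketched as follows; write_atomically() is a hypothetical helper name, and for real uploads the first step would be move_uploaded_file() to the .tmp name rather than file_put_contents().

```php
<?php
// Writes $data to "<target>.tmp" first and renames it into place afterwards,
// so a reader that only looks for "*.jpg" never sees a partial file.
function write_atomically(string $target, string $data): bool
{
    $tmp = $target . '.tmp';
    if (file_put_contents($tmp, $data) !== strlen($data)) {
        @unlink($tmp);              // clean up a failed or short write
        return false;
    }
    return rename($tmp, $target);   // atomic when both names are on the same volume
}
```

The reader side needs no changes beyond ignoring anything that ends in .tmp.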
A really easy and common way to do so would be to create a trigger file based on the files name, so that you get something like
123.jpg
123.rdy
or
123.jpg
123.jpg.rdy
You create that file (just an empty stub) as soon as the upload is complete. The application that grabs files to load only cares about files with a trigger file and then processes those. Alternatively, you could also save the uploaded file as e.g. 123.bsy or 123.jpg.bsy while it is still being uploaded and then rename it to the final name 123.jpg after the upload is done. Since renames within the same directory are usually really cheap operations in terms of processing time, the chances of running into a race condition should be pretty low. (This might or might not depend on the OS used, though ...)
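For the reading side of the trigger-file convention, a minimal sketch could look like this. It assumes the 123.jpg / 123.jpg.rdy naming variant, and ready_files() is an illustrative name.

```php
<?php
// Returns the files in $dir that have a matching "<name>.rdy" marker,
// following the 123.jpg / 123.jpg.rdy convention.
function ready_files(string $dir): array
{
    $ready = [];
    foreach (glob(rtrim($dir, '/') . '/*.rdy') ?: [] as $marker) {
        $file = substr($marker, 0, -4);      // strip the ".rdy" suffix
        if (is_file($file)) {
            $ready[] = basename($file);
        }
    }
    sort($ready);
    return $ready;
}
```

The uploading side would call touch() on the marker name once the upload has been stored completely.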
If you need to keep the files in that place, you could, of course, use a database where you add a record for each file, as the upload is complete. The other app could then just provide files with a matching database record.
After writing all this down I figured it out myself. What I did was add the exact number of bytes to the filename as well, and validate that while outputting the list of images. The .tmp/.bsy solution is nice too, but I read it a bit too late :)
Upside to my solution is that no more renaming is required after the upload is done. Thanks everybody for your fast answers!
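The byte-count check might be implemented along these lines. The naming scheme <timestamp>_<bytes>.<ext> is an assumption, since the post does not say how the size is embedded, and is_complete() is a made-up name.

```php
<?php
// Assumed naming scheme: "<timestamp>_<expectedBytes>.jpg", e.g. 1316870400_20480.jpg.
// A file counts as complete once its size on disk equals the size in its name.
function is_complete(string $path): bool
{
    if (!preg_match('/_(\d+)\.[A-Za-z0-9]+$/', basename($path), $m)) {
        return false;                    // name does not carry a byte count
    }
    clearstatcache(true, $path);         // filesize() results are cached per request
    return is_file($path) && filesize($path) === (int) $m[1];
}
```

The listing script would then skip every file for which is_complete() returns false.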