My application keeps watch on a set of folders where users can upload files. When a file upload is finished I have to apply a treatment, but I don't know how to detect that a file has not finished uploading.
Is there any way to detect whether a file has not yet been released by the FTP server?
There's no generic solution to this problem.
Some FTP servers lock the file while it is being uploaded, preventing you from accessing it. The IIS FTP server does that, for example. Most other FTP servers do not. See my answer at Prevent file from being accessed as it's being uploaded.
There are some common workarounds to the problem (originally posted in SFTP file lock mechanism, but relevant for FTP too):
You can have the client upload a "done" file once the upload finishes. Make your automated system wait for the "done" file to appear.
You can have a dedicated "upload" folder and have the client (atomically) move the uploaded file to a "done" folder. Make your automated system look to the "done" folder only.
Have a file naming convention for files being uploaded (".filepart") and have the client (atomically) rename the file to its final name after the upload. Make your automated system ignore the ".filepart" files (a sketch of this convention appears after this list).
See (my) article Locking files while uploading / Upload to temporary file name for an example of implementing this approach.
Also, some FTP servers have this functionality built-in. For example ProFTPD with its HiddenStores directive.
A gross hack is to periodically check the file attributes (size and time) and consider the upload finished if the attributes have not changed for some time interval.
You can also make use of the fact that some file formats have a clear end-of-file marker (like XML or ZIP), so you can tell that such a file is still incomplete.
Some FTP servers allow you to configure a hook to be called when an upload is finished. You can make use of that. For example, ProFTPD has a mod_exec module (see the ExecOnCommand directive).
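For illustration, here is a minimal PHP sketch of a watcher that honors the ".filepart" convention (and, commented out, the "done"-marker variant). The directory path and the processUpload() helper are hypothetical:

    <?php
    // Minimal sketch of a watcher honoring the ".filepart" convention.
    // The directory path and processUpload() are assumptions for the example.

    $incomingDir = '/var/ftp/incoming';

    function processUpload(string $path): void
    {
        // Placeholder for your actual treatment of a finished upload.
        echo "Processing $path\n";
    }

    foreach (glob($incomingDir . '/*') ?: [] as $path) {
        if (!is_file($path)) {
            continue;
        }
        // Skip files still being uploaded under their temporary name.
        if (substr($path, -9) === '.filepart') {
            continue;
        }
        // Variant: wait for a "done" marker the client uploads last.
        // if (!file_exists($path . '.done')) { continue; }

        processUpload($path);
    }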
I use ftputil to implement this work-around:
connect to ftp server
list all files of the directory
call stat() on each file
wait N seconds
For each file: call stat() again. If the result is different, then skip this file, since it was modified in the meantime.
If the stat() result is unchanged, then download the file.
This whole FTP fetching is old and obsolete technology. I hope that the customer will use a modern HTTP API next time :-)
If you are reading files of particular extensions, then use WinSCP for the file transfer. It creates a temporary file with the extension .filepart and renames it to the actual file name once the file has fully transferred.
I hope it helps someone.
This is a classic problem with FTP transfers. The only mostly reliable method I've found is to send a file, then send a second short "marker" file just to tell the recipient the transfer of the first is complete. You can use a file naming convention and just check for existence of the second file.
You might get fancy and make the content of the second file a checksum of the first file. Then you could verify the first file. (You don't have the problem with the second file because you just wait until file size = checksum size).
And of course this only works if you can get the sender to send a second file.
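If you can, here is a sketch of that verification step, assuming the marker file contains just an MD5 hash of the data file (the file names and the choice of MD5 are assumptions):

    <?php
    // Sketch: verify a data file against the checksum shipped in a marker file.
    // The file names and the choice of MD5 are assumptions for this example.

    $dataFile   = 'report.csv';
    $markerFile = 'report.csv.md5'; // sender uploads this after the data file

    if (file_exists($markerFile)) {
        $expected = trim(file_get_contents($markerFile));
        $actual   = md5_file($dataFile);

        if (hash_equals($expected, $actual)) {
            echo "Transfer complete and verified.\n";
        } else {
            echo "Checksum mismatch: file incomplete or corrupted.\n";
        }
    }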
Related
I'm reading about security for PHP, and my biggest concern right now is the user file-upload form. I've read a lot that some users may upload files that seem to be something else, by changing the extension or even manipulating the header and the MIME type. I understand this.
But my question is: how will this be an issue if I rename any uploaded file and move it to a directory they do not know?
Please let me know if this will be enough, and if not, give me an outline of what extra security checks I should perform.
Thanks a lot
It really depends on what your application is looking to achieve. If you wish to limit direct access to the uploaded files, then set the folder permissions on the parent folder of the upload area to block user access. Then record the path in your database and only serve the files through an HTTP response. This ensures that no potentially harmful files are accessed directly, while users can still upload what they like. As an extra step, you could add an erroneous file extension to each file while it is stored and remove it when the file is served.
You might run an antivirus scanning daemon in the background, like avscand, configured to scan and move infected files to a quarantine directory. This ought to prevent infected files from being delivered back to people later. Configure automatic virus-database updating. It has been a while since I did such things, so investigate.
Simply renaming the file to a name with safe characters should be sufficient; separated per user, of course.
To have a more secure site the following needs to happen:
Due to the nature of security, this list will need to be updated every so often.
Set the upload_max_filesize to something sensible
Install an Antivirus on the server
Set the upload_tmp_dir to something sensible, that the user may not access. See Setting PHP tmp dir - PHP upload not working
Have a form through which you upload files (which you already have)
Your form handler should (see the sketch after this list):
Run a file command to get the type of the data without executing it
Reject files that are not of an expected type
The PHP interpreter will validate the file size
Run the virus scanner on the file
Do a file rename to ensure the filename is clean (if you need to reference things, it is convenient to rename the file to the primary key of your attachments table)
Move the file to a location that isn't accessible by the client (but move it, so if a later upload comes in with the same name nothing happens)
When you move the files, ensure they don't have execute permissions
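A rough sketch of such a handler follows. The paths, the allowed-type list, and the form field name "upload" are assumptions, and the virus-scan step is left as a comment:

    <?php
    // Sketch of the form-handler steps above. Paths, the allowed-type list,
    // and the field name "upload" are assumptions for this example.

    $allowed  = ['image/png', 'image/jpeg', 'application/pdf'];
    $storeDir = '/var/app/uploads'; // not reachable by the web server

    if (!isset($_FILES['upload']) || $_FILES['upload']['error'] !== UPLOAD_ERR_OK) {
        die('Upload failed.');
    }

    // "Run a file command": inspect the content, not the client-supplied type.
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $type  = $finfo->file($_FILES['upload']['tmp_name']);
    if (!in_array($type, $allowed, true)) {
        die('Rejected: unexpected file type.');
    }

    // (Run your virus scanner on $_FILES['upload']['tmp_name'] here.)

    // Rename to a clean, generated name (e.g. the attachments-table primary key).
    $id   = bin2hex(random_bytes(16)); // stand-in for a database primary key
    $dest = $storeDir . '/' . $id;

    // Move, don't copy, so a later upload with the same name cannot collide;
    // move_uploaded_file() also verifies the source really was uploaded.
    if (!move_uploaded_file($_FILES['upload']['tmp_name'], $dest)) {
        die('Could not store file.');
    }

    // Make sure the stored file is not executable.
    chmod($dest, 0640);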
We have a FreeBSD server with samba that employees copy image files on to, which then get uploaded to our web servers (this way they don't have to mess with ftp). Sometimes, if the upload script is running at the same time as files are being copied, it can upload an incomplete file.
We fixed this by getting the list of files along with the file sizes, then waiting 5 seconds and rechecking the file sizes. If the sizes match, then it's safe to upload; if they don't match, it checks again in another 5 seconds.
This seems like an odd way to check whether the files are being written to. Is there a better, simpler way of doing this?
Use the flock function (http://php.net/flock): when writing a file, obtain an exclusive lock with flock($handle, LOCK_EX); after it is written, release the lock with flock($handle, LOCK_UN).
The upload script could try to obtain the exclusive write lock too; if it succeeds, it is okay to move the file, otherwise it is not.
EDIT: Sorry, I forgot about the users copying the files to the server through samba... So there is no way to use flock on the writing side while copying... But the upload script can still use flock($handle, LOCK_EX) and see whether it succeeds.
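A sketch of that probe using a non-blocking lock, so the script doesn't hang while the file is busy (the path is an assumption):

    <?php
    // Sketch: probe whether anyone else holds a lock on the file.
    // Uses a non-blocking exclusive lock; the path is an assumption.

    $path   = '/samba/share/incoming/image.jpg';
    $handle = @fopen($path, 'r+');

    if ($handle && flock($handle, LOCK_EX | LOCK_NB)) {
        // Nobody else holds a lock: safe to move/upload the file.
        flock($handle, LOCK_UN);
        fclose($handle);
        // ... proceed with the upload ...
    } else {
        // Still locked (or unreadable): try again later.
        if ($handle) {
            fclose($handle);
        }
    }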
I recommend running smbstatus(1) through shell_exec(), e.g. smbstatus -LB, to check for locked files.
Write a script to copy the files to a temp folder on the Samba server and then, when they are fully copied and flushed, move them (i.e., unlink/link, not copy again) to the upload folder.
How can I use PHP (or any other language) to read a file that is still being uploaded, so that it can be downloaded while the upload is in progress?
Example sites that does this are:
http://www.filesovermiles.com/
http://host03.pipebytes.com/
Use this: http://www.php.net/manual/en/apc.configuration.php#ini.apc.rfc1867
In that array the file name is included as temp_filename, so you can pass it to your other program, which can read from the file and stream it live. The array also includes the file size, so that program can make sure not to read beyond the end of the file.
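On the reading side, here is a sketch of polling that progress entry, assuming apc.rfc1867 is enabled and the upload form includes a hidden APC_UPLOAD_PROGRESS field whose value is the key (verify the array keys against your APC version):

    <?php
    // Sketch: poll APC's RFC 1867 upload-progress entry for a given key.
    // Requires apc.rfc1867 = 1 and a hidden APC_UPLOAD_PROGRESS form field.

    $key    = $_GET['key'];
    $status = apc_fetch('upload_' . $key);

    if ($status !== false) {
        // temp_filename: where the partial upload lives on disk.
        // current/total: bytes received so far vs. expected size.
        header('Content-Type: application/json');
        echo json_encode([
            'file'    => $status['temp_filename'],
            'current' => $status['current'],
            'total'   => $status['total'],
            'done'    => $status['done'],
        ]);
    }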
I don't think this is possible in PHP, because PHP takes care of receiving the upload and only hands over control when it has the complete file. When writing CGI programs or Java servlets, you read the upload from the socket yourself, so you are in control while receiving the file; you can track whether it is still uploading and how much has been received, and another process could read that data and start sending what is already there.
One of the sites you've given as an example just downloads a file from a URL or from the client computer, stores it temporarily, and assigns a code to that file to make it identifiable.
After the upload, any other user who has the code can then download that file again.
This is more a question of how you operate a server system than of writing code.
You can download files to the local system by making use of file_get_contents and file_put_contents.
If you want to stream file data from the server to the browser, you can make use of readfile().
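A minimal sketch, with a hypothetical URL and path:

    <?php
    // Sketch: copy a remote file to the local system.
    // The URL and local path are assumptions for this example.
    $data = file_get_contents('http://www.example.com/some-file.bin');
    file_put_contents('/tmp/some-file.bin', $data);

    // To stream it back out, readfile('/tmp/some-file.bin') would then
    // send the stored bytes to the browser.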
I am making a feature to my site so that users can upload files (any type).
In order to secure the upload form, I made a blacklist of non-accepted file types. But in order to protect my server (in case malicious scripts are uploaded in some way), I thought to tar the uploaded files (using the tar class) so that they are stored as .tar archives.
So if the user wants to download it, he will receive a .tar file.
My question is: is this secure enough (since the files cannot be executed then)?
[I have this reservation because I can see fread() calls in the tar class's code.]
Thanks!
Two points here:
Using a blacklist is a bad idea: you will never think of all possible evil file types.
Do not store the uploaded files into a public directory of your server :
Store those files to a directory that is not served by Apache, outside of your DocumentRoot.
And use a PHP script to send those files' contents to the user who wants to download them (even if Apache cannot serve the files through HTTP, PHP can read them).
This will make sure that those uploaded files are never executed.
Of course, make sure your PHP script that sends the content of a file doesn't allow anyone to download any possible file that's on the server...
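A sketch of such a download script; the storage directory and the id scheme are assumptions, and the strict id check is what keeps arbitrary files off-limits:

    <?php
    // Sketch: serve an uploaded file stored outside the DocumentRoot.
    // The storage directory and id scheme are assumptions for this example.

    $storeDir = '/var/app/uploads'; // not served by Apache

    // Never build the path from raw user input: validate the id strictly,
    // so nobody can download arbitrary files from the server.
    $id = isset($_GET['id']) ? $_GET['id'] : '';
    if (!preg_match('/^[a-f0-9]{32}$/', $id)) {
        http_response_code(404);
        exit;
    }

    $path = $storeDir . '/' . $id;
    if (!is_file($path)) {
        http_response_code(404);
        exit;
    }

    header('Content-Type: application/octet-stream');
    header('Content-Disposition: attachment; filename="download"');
    header('Content-Length: ' . filesize($path));
    readfile($path);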
You can upload the files to a non-web-accessible location (outside your webroot) and then use a download script to download the file.
The best way of handling uploaded files, in my opinion, is to place them in a folder that's not reachable through HTTP. Then, when a file is requested, use a PHP file to send the download headers, then use readfile() to send the file to the user. This way, files are never executed.
That might work, assuming that the users who download the files can untar them (most non-UNIX systems just have zip; I'd give them the option to download either format).
Also, I think it's better to create a list of allowed files rather than banned files. It's easy to forget to ban a specific type, whereas you will probably have a better idea of what users should be able to upload.
Don't block/allow files based on extension. Make sure you are using the MIME type that the server identifies the file as; this way it's hard to fake.
Also, store the files in a non-web-accessible directory and download them through a script.
Even if it's a bad file, they won't be able to exploit it if they can't directly access it.
When saving the files make sure you use these functions:
http://php.net/manual/en/function.is-uploaded-file.php
http://php.net/manual/en/function.move-uploaded-file.php
Dan
I have a PHP script that opens a local directory in order to copy and process some files. But these files may be incomplete, because they are being uploaded by a slow FTP process, and I do not want to copy or process any files which have not been completely uploaded yet.
Is it possible in PHP to find out whether a file is still being copied (that is, read from) or written to?
I need my script to process only those files that have been completely uploaded.
The FTP process now uploads files in parallel, and it takes more than 1 second for each file size to change, so this trick is not working for me now. Any other methods to suggest?
Do you have script control over the FTP process? If so, have the script that's doing the uploading upload a [FILENAME].complete file (a blank text file) after the primary upload completes, so the processing script knows the file is complete if there's a matching *.complete file there too.
+1 to #MidnightLightning for his excellent suggestion. If you don't have control over the process you have a couple of options:
If you know what the final size of the file should be then use filesize() to compare the current size to the known size. Keep checking until they match.
If you don't know what the final size should be, it gets a little trickier. You could use filesize() to check the size of the file, wait a second or two, and check it again. If the size hasn't changed, the upload should be complete. The problem with this second method is that if the upload stalls for whatever reason, it could give you a false positive. So the time to wait is key.
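A sketch of that second method; the path and interval are assumptions, and clearstatcache() matters because PHP caches stat() results:

    <?php
    // Sketch: consider a file "complete" once its size stops changing.
    // The path and the wait interval are assumptions for this example.

    function uploadLooksFinished(string $path, int $waitSeconds = 5): bool
    {
        clearstatcache(true, $path);   // PHP caches stat results
        $before = filesize($path);

        sleep($waitSeconds);

        clearstatcache(true, $path);
        $after = filesize($path);

        // Caveat: a stalled upload also stops growing, so this can
        // produce a false positive; pick the interval carefully.
        return $before === $after;
    }

    if (uploadLooksFinished('/var/ftp/incoming/data.zip')) {
        // safe (enough) to process the file
    }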
You don't specify what kind of OS you're on, but if it's a Unix-type box, you should have fuser and/or lsof available. fuser will report who's using a particular file, and lsof will list all open files (including sockets, FIFOs, .so's, etc.). Either of those could most likely be used to monitor your directory.
On the Windows end, there are a few free tools from Sysinternals that do the same thing; handle might do the trick.
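For example, a PHP sketch that shells out to lsof (as far as I know lsof exits 0 when it finds open-file entries, but verify on your system; the path is hypothetical):

    <?php
    // Sketch: ask lsof whether any process still has the file open.
    // lsof exits 0 when it found open-file entries, non-zero otherwise
    // (behavior worth verifying on your system). The path is an assumption.

    function fileIsOpen(string $path): bool
    {
        $output = [];
        $status = 1;
        exec('lsof ' . escapeshellarg($path) . ' 2>/dev/null', $output, $status);
        return $status === 0;
    }

    if (!fileIsOpen('/uploads/incoming/photo.jpg')) {
        // no process is writing it; safe to copy/upload
    }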