I'm writing a PHP process that will run on a Unix machine that will need to monitor a remote SMB server and detect new files that are being uploaded to that box via FTP. It's unlikely I'll be able to
It will need to detect:
New files being created
File upload completing
Files being deleted
If it was an NFS share, I'd try using FAM to detect the events, but I can't see a way of doing anything equivalent?
Doesn't sound like something I would use in production. But you could try something like this:
mount the SMB share with Samba on
the machine that is running a PHP
daemon
use SPL
RecursiveIteratorIterator with
DirectoryIterator to collect and
maintain a list of all the files and
folders on the shared drive
once in
a while refresh the folder list and
compare it with the current state,
if the file does not exist any more
you know it has been deleted, if
there is a new file put it in the
queue and mark it as "being
uploaded"
in the next "refresh run"
check the queued file, it the file
size did not change the file upload
probably completed, if the file size
changed put it in the queue again
and mark it as "being uploaded"
Related
How can I set my uploader to save uplodaded files into folder on my computer? I'm hosting a free server on 000webhost.com and I have a simple uploader script ^^
So I'd like to do something like this:
//specify folder for file upload
$folder = "C:\Users\Tepa\Desktop\Ohjelmat\Uploads";
In order to have a website save files on your local computer, you will need to install a ftp or other fileserver on your home computer and send the file by IP address, or DNS address.
I would suggest instead to tackle this from the opposite direction and write a script to pull these files to your local computer rather than open it up to the outside world. You can set it up as either a cron job / scheduled task so that the files are pulled in automatically.
If the files you are interested in are stored in a publicly accessible folder on your host, actually accomplishing this task is fairly trivial (see: cURL or file_get_contents)
You loose the real-time synchronicity, but gain some peace of mind.
When uploading an image PHP stores the temp image in a local on the server.
Is it possible to change this temp location so its off the local server.
Reason: using loading balancing without sticky sessions and I don't want files to be uploaded to one server and then not avaliable on another server. Note: I don't necessaryly complete the file upload and work on the file in the one go.
Preferred temp location would be AWS S3 - also just interested to know if this is possible.
If its not possible I could make the file upload a complete process that also puts the finished file in the final location.
just interested to know if the PHP temp image/file location can be off the the local server?
thankyou
You can mount S3 bucket with s3fs on your Instances which are under ELB, so that all your uploads are shared between application Servers. About /tmp, don't touch it as destination is S3 and it is shared - you don't have to worry.
If you have a lot of uploads, S3 might be bottleneck. In this case, I suggest to setup NAS. Personally, I use GlusterFS because it scales well and very easy to set up. It has replication issues, but you might not use replicated volumes at all and you are fine.
Another alternatives are Ceph, Sector/Sphere, XtreemFS, Tahoe-LAFS, POHMELFS and many others...
You can directly upload a file from a client to S3 with some newer technologies as detailed in this post:
http://www.ioncannon.net/programming/1539/direct-browser-uploading-amazon-s3-cors-fileapi-xhr2-and-signed-puts/
Otherwise, I personally would suggest using each server's tmp folder for exactly that-- temporary storage. After the file is on your server, you can always upload to S3, which would then be accessible across all of your load balanced servers.
Maybe it would be best to start by describing the scenario.
We have a Debian server with ffmpeg that we use to covert various video files into FLV.
The files are supplied by a number of different people via FTP and are kept in the "uploads" folder.
I need to write a PHP script that would go through all the files in the uploads folder, select the ones which are complete (i.e. not currently being uploaded or without any uploading errors) and then convert them to FLV using ffmpeg.
I can do the conversion and everything else, but how do I determine whether a file is complete and fully uploaded?
Many thanks!
Afaik you can't just figure out if a file is still being uploaded. You could run a cronjob everyminute getting the filesizes and store these in a database or file. Then if you run the cronjob the second time and the filesize is the same: convert them, if not.. wait another minute and then try again.
I don't believe there's a filesize stored with a file that contains the size it should be after the upload is done.
There is another way to go about this, which we have done for years.
Most ftp servers (proftpd does) will output a log which will tell you when a file upload has completed successfully. You can set this logging to go to a unix named pipe / fifo, and then have a daemonized script read this to determine which files to process. This works great, and only processes files after they are uploaded completely and successfully.
Files are being pushed to my server via FTP. I process them with PHP code in a Drupal module. O/S is Ubuntu and the FTP server is vsftp.
At regular intervals I will check for new files, process them with SimpleXML and move them to a "Done" folder. How do I avoid processing a partially uploaded file?
vsftp has lock_upload_files defaulted to yes. I thought of attempting to move the files first, expecting the move to fail on a currently uploading file. That doesn't seem to happen, at least on the command line. If I start uploading a large file and move, it just keeps growing in the new location. I guess the directory entry is not locked.
Should I try fopen with mode 'a' or 'r+' just to see if it succeeds before attempting to load into SimpleXML or is there a better way to do this? I guess I could just detect SimpleXML load failing but... that seems messy.
I don't have control of the sender. They won't do an upload and rename.
Thanks
Using the lock_upload_files configuration option of vsftpd leads to locking files with the fcntl() function. This places advisory lock(s) on uploaded file(s) which are in progress. Other programs don't need to consider advisory locks, and mv for example does not. Advisory locks are in general just an advice for programs that care about such locks.
You need another command line tool like lockrun which respects advisory locks.
Note: lockrun must be compiled with the WAIT_AND_LOCK(fd) macro to use the lockf() and not the flock() function in order to work with locks that are set by fcntl() under Linux. So when lockrun is compiled with using lockf() then it will cooperate with the locks set by vsftpd.
With such features (lockrun, mv, lock_upload_files) you can build a shell script or similar that moves files one by one, checking if the file is locked beforehand and holding an advisory lock on it as long as the file is moved. If the file is locked by vsftpd then lockrun can skip the call to mv so that running uploads are skipped.
If locking doesn't work, I don't know of a solution as clean/simple as you'd like. You could make an educated guess by not processing files whose last modified time (which you can get with filemtime()) is within the past x minutes.
If you want a higher degree of confidence than that, you could check and store each file's size (using filesize()) in a simple database, and every x minutes check new size against its old size. If the size hasn't changed in x minutes, you can assume nothing more is being sent.
The lsof linux command lists opened files on your system. I suggest executing it with shell_exec() from PHP and parsing the output to see what files are still being used by your FTP server.
Picking up on the previous answer, you could copy the file over and then compare the sizes of the copied file and the original file at a fixed interval.
If the sizes match, the upload is done, delete the copy, work with the file.
If the sizes do not match, copy the file again.
repeat.
Here's another idea: create a super (but hopefully not root) FTP user that can access some or all of the upload directories. Instead of your PHP code reading uploaded files right off the disk, make it connect to the local FTP server and download files. This way vsftpd handles the locking for you (assuming you leave lock_upload_files enabled). You'll only be able to download a file once vsftp releases the exclusive/write lock (once writing is complete).
You mentioned trying flock in your comment (and how it fails). It does indeed seem painful to try to match whatever locking vsftpd is doing, but dio_fcntl might be worth a shot.
I guess you've solved your problem years ago but still.
If you use some pattern to find the files you need you can ask the party uploading the file to use different name and rename the file once the upload has completed.
You should check the Hidden Stores in proftp, more info here:
http://www.proftpd.org/docs/directives/linked/config_ref_HiddenStores.html
I'd like to have my PHP script upload a file with a certain filename in a directory of my choosing. However, the catch is that I need it to exist there immediately upon upload so I can moniter it on my server. I don't want to use a PHP extension or something - this should be very easy to transfer to any PHP setup.
So basically: Is there a way to guarantee that, from the very beginning of the file upload process, the file has a certain name and location on the server?
Not that I'm aware of.
PHP will use the php.ini-defined tmp folder to store uploads until you copy them to their correct location with move_uploaded_file(). So it's very easy to know its location, but the file name is random and I don't think you can define it.
If you're not going to have multiple concurrent uploads (for example if only you are going to upload files and you know you won't upload 2 files at the same time), you could check the most recent upload file in the tmp directory.
The common solution for monitoring uploads is apc.rfc1867
I know of three options:
RFC1867 (as mentioned by others) which allows you to poll upload progress using ajax
Flash-based uploaders like SWFUpload which allow you to poll upload progress using JavaScript
Create a PHP command line daemon listening on port 80 that accepts file uploads, and used shared memory (or some other mechanism) to communicate upload progress. Wish I could find the link, but I read a great article about a site that allowed users to upload their iTunes library XML file, and it was processed live by the server as it was being uploaded. Very cool, but obviously more involved than the previous options.
I have had decent luck with SWFUpload in the past.
I don't think you can configure the name, as it will be a random name in the temporary folder. You should be able to change the directory, but I can't seem to find the answer on Google (check out php.ini).
As far as I know, this isn't possible with PHP, as a file upload request submits the entire file to the system in one request. So there is no way for the PHP server to know what is happening until it receives the whole request.
There is not a way to monitor file upload progress using PHP only, as PHP does not dispatch progress events during the upload. This is possible to do using a Flash uploader even if Flash is uploading via a PHP script. Flash polls the temporary file on the server during the upload to dispatch progress events. Some of the javascript frameworks like YUI use a SWF to manage uploads. Check out YUI's Uploader widget.
http://developer.yahoo.com/yui/uploader/