I have an AJAX image uploader that sends images to a PHP script, which validates them, resizes them, and saves them into a directory. The uploader allows multiple files to be uploaded at the same time. Since it allows multiple files, there can be timeouts, so I thought of increasing the execution time using set_time_limit. But I am having trouble determining how much time to set, since the default is 30 seconds. Will 1 minute be enough? The images upload properly on my local machine, but I am concerned there will be timeout errors on a shared hosting service. Any ideas and thoughts on how others have implemented this would be valuable. Thanks.
You can set it to 5 minutes if you need to. But for obvious reasons you don't really want it that high, especially for HTTP requests.
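A raised limit only needs one call at the top of the upload script. A minimal sketch (the 120-second value is just an example; note that set_time_limit() has no effect under safe_mode, which some shared hosts enable):

```php
<?php
// Raise the per-request execution limit for this upload script only.
// Has no effect if PHP runs in safe mode (common on cheap shared hosting).
set_time_limit(120);   // allow up to 2 minutes instead of the default 30s

// The change is reflected in the runtime configuration:
echo ini_get('max_execution_time');
```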
So... if I had the energy, this is what I'd do...
I need:
process_initializer.php
process_checker.php
client.html
the_process.php (runs as background)
...
client uploads files to process_initializer.
initializer creates a unique ID, maybe based on the time with milliseconds or some more robust scheme
initializer starts a background process, sending it necessary arguments like filenames along with the ID
initializer responds to the client with ID
client then polls process_checker to see what's going on with ID (maybe 20 second intervals - setTimeout(), whatever)
process_checker may check whether the file output_ID.txt exists (the_process should create it when it's done). If it doesn't exist, respond to the client that the job isn't ready; if it does, maybe send the output to the client, which can then do whatever it needs with it.
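A minimal sketch of the initializer under these assumptions (the script names come from the list above; the argument format and JSON response are my own illustration):

```php
<?php
// process_initializer.php -- starts the background job and returns its ID.
$id    = uniqid('', true);                    // unique ID, time-based
$files = escapeshellarg(implode(',', $_POST['files'] ?? []));

// Launch the_process.php detached so this request returns immediately.
// the_process.php should write output_<ID>.txt when it finishes.
exec('php the_process.php ' . escapeshellarg($id) . " $files > /dev/null 2>&1 &");

echo json_encode(['id' => $id]);   // client polls process_checker.php?id=...
```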
When Apache runs PHP it uses one php.ini configuration, but when you run PHP from the command line or from another script, e.g. exec('php the_process.php arg1 arg2'), it uses a different php.ini, referred to as PHP CLI, unless you have configured PHP CLI to use the same php.ini that Apache does. The important thing is that they can use different settings, so you can let CLI scripts take more time than your HTTP-called scripts.
Related
In this system, SFTP will be used to upload large video files to a specific directory on the server. I am writing a PHP script that will be used to copy completely uploaded files to another directory.
My question is, what is a good way to tell if a file is completely uploaded or not? I considered taking note of the file size, sleeping the script for a few seconds, reading the size again, and comparing the two numbers. But this seems very kludgy.
Simple
If you just want a simple technique that handles most use cases: write a process that checks the last-modified date or (as indicated in the question) the file size, and if it hasn't changed in a set time, say 1 minute, assume the upload is finished and process it. In fact, your question is a near duplicate of another question showing exactly this.
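A sketch of that check in PHP (the 1-minute quiet period is the example value from above):

```php
<?php
// Consider an upload finished once the file has not been modified
// for $quietSecs seconds. Suitable for a cron-run checker script.
function upload_finished(string $path, int $quietSecs = 60): bool
{
    clearstatcache(true, $path);   // bypass PHP's cached stat() results
    return (time() - filemtime($path)) >= $quietSecs;
}
```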
Robust
An alternative to polling the size/last modified time of a file (which with a slow or intermittent connection could cause an open upload to be considered complete too early) is to listen for when the file has finished being written with inotify.
By listening for IN_CLOSE_WRITE, you can tell when an FTP upload has finished writing a specific file.
I've never used it in PHP, but from the spartan docs it looks like it ought to work in exactly the same way as the underlying library.
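For reference, a sketch of how this might look with the PECL inotify extension in PHP; I haven't run this against a live FTP server, so treat the directory and the usage as illustrative:

```php
<?php
// Block until some file in $dir has its write handle closed
// (IN_CLOSE_WRITE), i.e. an upload to it has finished, and return its name.
// Requires the PECL inotify extension.
function wait_for_upload(string $dir): string
{
    $fd = inotify_init();
    inotify_add_watch($fd, $dir, IN_CLOSE_WRITE);
    $events = inotify_read($fd);      // blocks until an event arrives
    fclose($fd);
    return $events[0]['name'];        // file that finished writing
}

// Usage (needs the extension and a live upload):
// $name = wait_for_upload('/var/uploads');
```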
Make FTP clients work for you
Take into account how some ftp programs work, from this comment:
it's worth mentioning that some FTP servers create a hidden temporary file to receive data until the transfer is finished, e.g. .foo-ftpupload-2837948723 in the same upload path as the target file
If applicable, from the finished filename you can therefore tell if the ftp upload has actually finished or was stopped/disconnected mid-upload.
In theory, if you are allowed to run system commands from PHP, you can try to get the PIDs of your SFTP server processes with the pidof command and then use lsof output to check whether open file descriptors exist for the file you are currently checking.
During a transfer, the only thing that knows the real size of the file is your SFTP client and maybe the SFTP server (don't know protocol specifics so the server may know too.) Unless you can talk to the SFTP server directly, then you'll have to wait for the file to finish. Checking at some interval seems kludgy but under the circumstances is pretty good. It's what we do on my project.
If you use a very short poll interval, it's not terribly resilient against uploads that stall for a few seconds, so just set the interval to once a minute or so. Assuming you're on a *nix platform, you could set up a cron job to run your script every minute (or whatever your polling interval is).
As @jcolebrand stated, if you have no control over the upload process there is actually not much you can do except guess (from the file's size/date). I would look for server software that allows you to execute hooks/scripts before/after some server action (here: file transfer complete). I am not sure, though, whether such software exists. As a last resort you could adapt an open-source SFTP server to your requirements by adding such a hook yourself.
An interesting approach was mentioned in a comment under the original question (for some unknown reason the comment was deleted): you can try parsing the SFTP server's logs to find files that were uploaded completely.
I've seen many web apps that implement progress bars, however, my question is related to the non-uploading variety.
Many PHP web applications (phpBB, Joomla, etc.) implement a "smart" installer to not only guide you through the installation of the software, but also keep you informed of what it's currently doing. For instance, if the installer was creating SQL tables or writing configuration files, it would report this without asking you to click. (Basically, sit-back-and-relax installation.)
Another good example is with Joomla's Akeeba Backup (formerly Joomla Pack). When you perform a backup of your Joomla installation, it makes a full archive of the installation directory. This, however, takes a long time, and hence requires updates on the progress. However, the server itself has a limit on PHP script execution time, and so it seems that either
1. The backup script is able to bypass it.
2. Some temp data is stored so that the archive is appended to (if archive appending is possible).
3. Client scripts call the server's PHP every so often to perform actions.
My general guess (not specific to Akeeba) is with #3, that is:
Web page JS -> POST foo/installer.php?doaction=1 SESSID=foo2
Server -> ERRCODE SUCCESS
Web page JS -> POST foo/installer.php?doaction=2 SESSID=foo2
Server -> ERRCODE SUCCESS
Web page JS -> POST foo/installer.php?doaction=3 SESSID=foo2
Server -> ERRCODE SUCCESS
Web page JS -> POST foo/installer.php?doaction=4 SESSID=foo2
Server -> ERRCODE FAIL Reason: Configuration.php not writable!
Web page JS -> Show error to user
I'm 99% sure this isn't the case, since that would create a very nasty dependency on the user to have Javascript enabled.
I guess my question boils down to the following:
How are long-running PHP scripts (on web servers, of course) handled so that they can "stay alive" past the PHP maximum execution time? If they don't "cheat", how are they able to split up the task at hand? (I notice that Akeeba Backup does acknowledge the PHP maximum execution time limit, but I don't want to dig too deep to find such code.)
How is the progress displayed via AJAX+PHP? I've read that people use a file to indicate progress, but to me that seems "dirty" and puts a bit of strain on I/O, especially for live servers with 10,000+ visitors running the aforementioned script.
The environment for this script is where safe_mode is enabled, and the limit is generally 30 seconds. (Basically, a restrictive, free $0 host.) This script is aimed at all audiences (will be made public), so I have no power over what host it will be on. (And this assumes that I'm not going to blame the end user for having a bad host.)
I don't necessarily need code examples (although they are very much appreciated!), I just need to know the logic flow for implementing this.
Generally, this sort of thing is stored in the $_SESSION variable. As far as the execution timeout goes, what I typically do is have a JavaScript timer that polls a PHP script every x seconds and sets the innerHTML of a status div with the response. When this script executes, it doesn't "wait" or anything like that. It merely grabs the current status from the session (which is updated by the script(s) actually performing the installation) and outputs it in whatever fancy form I see fit (status bar, etc.).
I wouldn't recommend any direct I/O for status updates. You're correct in that it is messy and inefficient. I'd say $_SESSION is definitely the way to go here.
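A minimal sketch of that pattern (the script name and the 'progress' key are illustrative): the installer steps write their status into the session, and a tiny endpoint the JavaScript polls just reads it back.

```php
<?php
// progress.php -- polled by the JavaScript timer described above.
// The installer script(s) update the same session between steps:
//   session_start();
//   $_SESSION['progress'] = 42;   // percent done
//   session_write_close();        // release the session lock promptly,
//                                 // or this endpoint will block on it
session_start();
echo json_encode(['progress' => $_SESSION['progress'] ?? 0]);
```

The session_write_close() call matters: PHP serializes requests that share a session, so the worker must not hold the lock while it runs a long step.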
I want to set up an automated backup via PHP so that, using HTTP/S, I can POST a zip-file request to another server and send over a large .zip file. Basically, I want to back up an entire site (and its database) and have a cron job periodically transmit the file via HTTP/S, something like
wget "http://www.thissite.com/cron_backup.php?dest=www.othersite.com&file=backup.zip"
The appropriate authentication/security can be added afterwards...
I prefer HTTP/S because the other site has limited use of FTP and is on a Windows box, so I figure the sure way to communicate with it is via HTTP/S. The other end would have a corresponding PHP script that would store the file.
This process needs to be completely programmatic (i.e. Flash uploaders will not work, since they need a browser; this script will run from a shell session).
Are there any generalized PHP libraries or functions that help with this sort of thing? I'm aware of the PHP script timeout issues, but I can typically alter php.ini to minimize this.
I'd personally not use wget, and just run from the shell directly for this.
Your php script would be called from cron like this:
/usr/local/bin/php /your/script/location.php args here if you want
This way you don't have to worry about yet another program (wget) to handle things. If your settings are the same on each run, just put them in a config file or directly into the PHP script.
The timeout can be handled with this, which lets the PHP script run an unlimited amount of time:
set_time_limit(0);
Not sure what libraries you're using, but look into cURL to do the POST; it should work fine.
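A sketch of that cURL POST in PHP; the URL, field name, and paths are placeholders, not a known API on the receiving end:

```php
<?php
// Send the backup zip to the receiving PHP script via an HTTP POST.
function send_backup(string $url, string $zipPath)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => ['file' => new CURLFile($zipPath)],
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CONNECTTIMEOUT => 30,
        CURLOPT_TIMEOUT        => 0,   // no overall time limit for big files
    ]);
    $response = curl_exec($ch);
    curl_close($ch);
    return $response;                  // false on failure
}

// Usage:
// send_backup('https://www.othersite.com/receive_backup.php', '/backups/backup.zip');
```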
I think the biggest issues that would come up would be server-related rather than PHP/script-related, i.e. make sure you have the bandwidth for it, and that your PHP script can connect to the outside server.
If it's at all possible, I'd stay away from doing large transfers over HTTP. FTP is far from ideal too, but for very different reasons.
Yes, it is possible to do this via FTP, HTTP, and HTTPS using cURL, but it does not really solve any of the problems. HTTP is optimized around sending relatively small files in relatively short periods of time; when you stray from that, you end up undermining a lot of the optimization applied to web servers (e.g. with a MaxRequestsPerChild setting you could be artificially extending the life of processes that should have stopped, and there are interactions between the LimitRequest* settings and max_file_size, not to mention various timeouts and the other limit settings in Apache).
A far more sensible solution is to use rsync over ssh for content/code backups and the appropriate database replication method for the DBMS you are using - e.g. mysql replication.
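If the backup job must remain a PHP cron script, it can simply compose and shell out to rsync; a dry-run sketch (the host and paths are made up):

```php
<?php
// Build the rsync-over-ssh command a cron-run PHP wrapper could exec().
$src  = '/var/www/site/';
$dest = 'backup@othersite.example:/backups/site/';
$cmd  = sprintf('rsync -az -e ssh %s %s', escapeshellarg($src), escapeshellarg($dest));
echo $cmd, "\n";                 // shown instead of run, for illustration
// exec($cmd, $output, $status); // uncomment to actually run it
```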
I'm trying to create an AJAX push implementation in PHP using a Comet long-polling method. My code uses file_get_contents() to read a file repeatedly, checking for any messages to send to the user. To reduce server load, I'm using two text files: one containing the actual command and one acting as a "change notifier", which is either incremented through 0-9 or contains a UNIX timestamp. My question is: how often can I access and read from a small (only a few bytes) file without overloading the server? The push implementation means I can poll for changes much more often than requesting a file every few seconds, but there still must be a limit.
If it helps, I'm using the 1&1 Home (Linux) hosting plan, which is shared hosting.
Assuming you're running a sane OS that will cache the "change notifier" file in RAM, the operation is so cheap as to be insignificant. PHP would become a bottleneck way before then.
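A sketch of the server side of such a long-poll under those assumptions (the file names, the 250 ms check interval, and the timeout are illustrative):

```php
<?php
// Block until the "change notifier" file differs from what the client
// last saw, or a timeout elapses. Returns the new stamp, or null on timeout.
function poll(string $notify, string $last, int $tries = 120): ?string
{
    for ($i = 0; $i < $tries; $i++) {
        clearstatcache(true, $notify);         // bypass PHP's stat cache
        $stamp = @file_get_contents($notify);  // tiny file, served from OS cache
        if ($stamp !== false && $stamp !== $last) {
            return $stamp;                     // change seen: client fetches commands
        }
        usleep(250000);                        // 4 cheap checks per second
    }
    return null;                               // timeout: the client simply re-polls
}
```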
I need to upload potentially big (tens to hundreds of megabytes) files from a desktop application to a server. The server code is written in PHP, the desktop application in C++/MFC. I want to be able to resume file uploads when the upload fails halfway through, because this software will be used over unreliable connections. What are my options?

I've found a number of HTTP upload components for C++, such as http://www.chilkatsoft.com/refdoc/vcCkUploadRef.html which looks excellent, but it doesn't seem to handle resuming half-done uploads (I assume this is because HTTP 1.1 doesn't support it). I've also looked at the BITS service, but for uploads it requires an IIS server.

So far my only option seems to be to cut the file I want to upload into smaller pieces (say 1 MB each), upload them all to the server, reassemble them with PHP, and run a checksum to see if everything went OK. To resume, I'd need some form of "handshake" at the beginning of the upload to find out which pieces are already on the server.

Will I have to code this by hand, or does anyone know of a library that does all this for me, or maybe even a completely different solution? I'd rather not switch to another protocol that supports resume natively, for maintenance reasons (potential problems with firewalls, etc.).
I'm eight months late, but I just stumbled upon this question and was surprised that webDAV wasn't mentioned. You could use the HTTP PUT method to upload, and include a Content-Range header to handle resuming and such. A HEAD request would tell you if the file already exists and how big it is. So perhaps something like this:
1) HEAD the remote file
2) If it exists and size == local size, upload is already done
3) If size < local size, add a Content-Range header to the request and seek to the appropriate location in the local file.
4) Make PUT request to upload the file (or portion of the file, if resuming)
5) If connection fails during PUT request, start over with step 1
You can also list (PROPFIND) and rename (MOVE) files, and create directories (MKCOL) with dav.
I believe both Apache and Lighttpd have dav extensions.
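Sketched with PHP's cURL extension; the URL, and the assumption that the server honors Content-Range on PUT, are illustrative:

```php
<?php
// Steps 1-5 above: HEAD the remote file, then PUT the missing tail
// with a Content-Range header so the server can append it.
function resume_put(string $url, string $localPath): void
{
    $localSize = filesize($localPath);

    // 1) HEAD the remote file to learn how much is already there.
    $ch = curl_init($url);
    curl_setopt_array($ch, [CURLOPT_NOBODY => true, CURLOPT_RETURNTRANSFER => true]);
    curl_exec($ch);
    $remoteSize = max(0, (int) curl_getinfo($ch, CURLINFO_CONTENT_LENGTH_DOWNLOAD));
    curl_close($ch);

    if ($remoteSize >= $localSize) {
        return;                               // 2) upload is already done
    }

    // 3) seek past what the server has; 4) PUT the rest with Content-Range.
    $fp = fopen($localPath, 'rb');
    fseek($fp, $remoteSize);
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_UPLOAD     => true,
        CURLOPT_INFILE     => $fp,
        CURLOPT_INFILESIZE => $localSize - $remoteSize,
        CURLOPT_HTTPHEADER => [sprintf('Content-Range: bytes %d-%d/%d',
                                       $remoteSize, $localSize - 1, $localSize)],
    ]);
    curl_exec($ch);                           // 5) on failure, just call again
    curl_close($ch);
    fclose($fp);
}
```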
You need a standard chunk size (say 256 KB). If the file "abc.txt", uploaded by user x, is 78.3 MB, it would be 313 full chunks and one smaller chunk.
You send a request to upload stating filename and size, as well as number of initial threads.
Your PHP code will create a temp folder named after the IP address and filename.
Your app can then use multiple connections to send the data in different threads, so you could be sending chunks 1, 111, 212, and 313 at the same time (with separate checksums).
Your PHP code saves them to different files and confirms reception after validating the checksum, replying with the number of the next chunk to send, or telling that thread to stop.
After all threads are finished, you ask the PHP side to join all the files; if something is missing, go back to step 3.
You could increase or decrease the number of threads at will, since the app is controlling the sending.
You can easily show a progress indicator, either a simple progress bar, or something close to downthemall's detailed view of chunks.
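The join step (and the "something is missing" check) might be sketched like this; the <name>.partN naming is my own illustration:

```php
<?php
// Concatenate validated chunk files in order; return false if any chunk
// is missing, so the caller can re-request it (step 3 above).
function join_chunks(string $dir, string $name, int $count, string $dest): bool
{
    $out = fopen($dest, 'wb');
    for ($i = 0; $i < $count; $i++) {
        $part = "$dir/$name.part$i";
        if (!is_file($part)) {
            fclose($out);
            return false;                    // missing chunk: ask for it again
        }
        fwrite($out, file_get_contents($part));
    }
    fclose($out);
    return true;
}
```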
libcurl (C api) could be a viable option
-C/--continue-at
Continue/Resume a previous file transfer at the given offset. The given offset is the exact number of bytes that will be skipped, counting from the beginning of the source file before it is transferred to the destination. If used with uploads, the FTP server command SIZE will not be used by curl.
Use "-C -" to tell curl to automatically find out where/how to resume the transfer. It then uses the given output/input files to figure that out.
If this option is used several times, the last one will be used.
Google has created a resumable HTTP upload protocol. See https://developers.google.com/gdata/docs/resumable_upload
Is reversing the whole process an option? I mean, instead of pushing the file over to the server, make the server pull the file using a standard HTTP GET with all the bells and whistles (like Accept-Ranges, etc.).
Maybe the easiest method would be to create an upload page that would accept the filename and range in parameter, such as http://yourpage/.../upload.php?file=myfile&from=123456 and handle resumes in the client (maybe you could add a function to inspect which ranges the server has received)
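That hypothetical upload.php could be sketched like this; the directory, parameter names, and the idea of replying with the received size are illustrative:

```php
<?php
// Append the request body to $path starting at byte offset $from,
// and return the resulting file size so the client knows how far it got.
function append_at(string $path, int $from, $in): int
{
    $out = fopen($path, 'cb');           // create if missing, don't truncate
    fseek($out, $from);
    stream_copy_to_stream($in, $out);
    fclose($out);
    clearstatcache(true, $path);
    return filesize($path);
}

// In upload.php?file=...&from=... this would be called as:
// echo append_at('/var/uploads/' . basename($_GET['file']),
//                (int) $_GET['from'], fopen('php://input', 'rb'));
```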
@Anton Gogolev
Lol, I was just thinking about the same thing: reversing the whole setup, making the server a client and the client a server. Thanks to Roel, it's clearer to me now why it wouldn't work.
@Roel
I would suggest implementing a Java uploader (JumpLoader is good, with its JavaScript interface and even sample PHP server-side code). Flash uploaders suffer badly when it comes to BIG files, on a gigabyte scale that is.
F*EX can upload files up to the TB range via HTTP and is able to resume after link failures.
It does not exactly meet your needs, because it is written in Perl and needs a UNIX-based server, but the clients can be on any operating system. Maybe it is helpful for you nevertheless:
http://fex.rus.uni-stuttgart.de/
There is also a protocol called tus for resumable uploads, with some implementations in PHP and C++.