I created a simple web interface to allow various users to upload files. I set the upload limit to 100 MB, but it now turns out that the client occasionally wants to upload files of 500 MB+.
I know how to alter the PHP configuration to change the upload limit, but I was wondering: are there any serious disadvantages to uploading files of this size via PHP?
Obviously FTP would be preferable, but if possible I'd rather not have two different methods of uploading files.
Thanks
Firstly, FTP is never preferable. To anything.
I assume you mean that you're transferring the files via HTTP. While not quite as bad as FTP, it's not a good idea if you can find another way of solving the problem. HTTP (and hence the programs at either end) is optimized for transferring relatively small files around the internet.
While the protocol supports server-to-client range requests, it does not allow for the reverse operation. Even if the software at either end were unaffected by the volume, the more data you push across, the longer the window in which you could lose the connection. But the biggest problem is the caveat in that last sentence: the software at either end usually is affected by the volume.
Regardless of the server technology you use (PHP or something else), it's never a good idea to push a file that big in one sweep, synchronously.
There are lots of plugins for any technology/framework that will do asynchronous, chunked uploads for you.
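As a rough illustration of the receiving end (a sketch only; the field names chunk, chunkIndex, fileName and isLast are placeholders for whatever your chosen plugin actually sends, and the uploads/ directory is assumed to exist):

    <?php
    // Sketch of a chunk receiver: the client-side uploader posts the file in
    // small pieces and we append each piece to a partial file on disk.
    $fileName   = basename($_POST['fileName'] ?? 'upload.bin'); // never trust client paths
    $chunkIndex = (int) ($_POST['chunkIndex'] ?? 0);
    $partFile   = sys_get_temp_dir() . '/upload_' . md5($fileName) . '.part';

    if (!isset($_FILES['chunk']) || $_FILES['chunk']['error'] !== UPLOAD_ERR_OK) {
        http_response_code(400);
        exit('missing or failed chunk');
    }

    // Create the partial file on the first chunk, append on every later one.
    $out = fopen($partFile, $chunkIndex === 0 ? 'wb' : 'ab');
    $in  = fopen($_FILES['chunk']['tmp_name'], 'rb');
    stream_copy_to_stream($in, $out);
    fclose($in);
    fclose($out);

    // When the uploader flags the last chunk, move the finished file into place.
    if (!empty($_POST['isLast'])) {
        rename($partFile, __DIR__ . '/uploads/' . $fileName);
    }
    echo 'ok';

Because each request only carries a small piece, the per-request limits stay low and a dropped connection costs you one chunk, not the whole file.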
Besides the connection timing out, there is one more disadvantage: file uploads consume web server memory. You don't normally want that.
PHP will handle as many files, and as large, as you'll allow it to. But consider that it's basically impossible to resume an aborted upload in PHP, as your script is not fired up until AFTER the upload has completed. The larger the file gets, the greater the chance of a network glitch killing the upload and wasting a good chunk of time and bandwidth. As well, without extra work with APC or something like Uploadify, there's no progress report, and users are left staring at a browser showing no visible sign of work except the throbber chugging away.
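For reference, "allowing it" means touching a handful of related php.ini directives, not just one. The values below are illustrative for a roughly 500 MB ceiling; note that memory_limit does not need to match the file size, because PHP streams the upload to a temporary file on disk:

    ; php.ini - illustrative values for ~500 MB uploads
    upload_max_filesize = 600M
    post_max_size       = 650M   ; must be at least as large as upload_max_filesize
    max_input_time      = 1800   ; seconds allowed for reading/parsing the request
    max_execution_time  = 300    ; seconds the script may run after the upload arrives

The web server in front of PHP (Apache, nginx, IIS) usually has its own request-size and timeout limits that have to be raised as well.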
Related
I'm trying to convert a website to use S3 storage instead of local (expensive) disk storage. I solved the download problem using a stream wrapper interface on the S3Client. The upload problem is harder.
It seems to me that when I post to a PHP endpoint, the $_FILES object is already populated and copied to /tmp/ before I can even intercept it!
On top of that, the S3Client->upload() expects a file on the disk already!
Seems like a double-whammy against what I'm trying to do, and most advice I've found uses NodeJS or Java streaming so I don't know how to translate.
It would be better if I could intercept the code that populates $_FILES and then send up 5MB chunks from memory with the S3\ObjectUploader, but how do you crack open the PHP multipart handler?
Thoughts?
EDIT: It is a very low volume of files, 0-20 per day, mostly 1-5 MB, sometimes hitting 40-70 MB. Periodically (once every few weeks) a 1-2 GB file will be uploaded. Hence the desire to move off an EC2 instance and onto a Heroku/Beanstalk-type PaaS where I won't have much /tmp/ space.
It's hard to comment on your specific situation without knowing the performance requirements of the application and the volume of users who need to access it, so I'll answer assuming a basic web app uploading profile avatars.
There are some good reasons for this behaviour: the file is streamed to disk for several purposes, one of which is to conserve memory. If your file is not on disk, then it is in memory (think disk usage is expensive? bump up your memory usage and see how expensive that gets), which is fine for a single user uploading a small file, but not so great for a bunch of users uploading small files, or worse, large files. You'll likely see the best performance if you use the defaults of these libraries and let them stream to and from the disk.
But again, I don't know your use case, and you may actually need to avoid the disk at all costs for some reason I'm not aware of.
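In practice that means you don't have to intercept $_FILES at all: let PHP finish writing the temp file, then hand the SDK a stream to it rather than a string in memory. A minimal sketch, assuming the AWS SDK for PHP v3 and a form field named 'upload' (the field name, bucket and region are placeholders):

    <?php
    // Sketch: stream the already-written temp file to S3 instead of loading it into memory.
    require 'vendor/autoload.php';

    use Aws\S3\S3Client;
    use Aws\S3\ObjectUploader;

    $s3 = new S3Client([
        'region'  => 'us-east-1',   // placeholder
        'version' => 'latest',
    ]);

    $tmp = $_FILES['upload']['tmp_name'];        // PHP has already streamed the upload here
    $key = basename($_FILES['upload']['name']);  // untrusted; sanitize before real use

    $source   = fopen($tmp, 'rb');               // a stream, so the file is never read into memory at once
    $uploader = new ObjectUploader($s3, 'my-bucket', $key, $source);
    $result   = $uploader->upload();             // the SDK decides between a single PUT and a multipart upload

    // $result is an Aws\Result; on success the object is in S3 and PHP cleans up the temp file itself.
    echo 'uploaded';

The temp file still has to fit on the instance's disk, which matters for the occasional 1-2 GB upload on a small PaaS dyno, so you may need to cap the accepted size or route the largest files differently.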
My application requires downloading many images from the server (each image is about 10 KB). I'm simply downloading each of them with an independent AsyncTask, without any optimization.
Now I'm wondering what the common practice is for transferring these images. For example, I'm thinking about saving zipped images on the server, then sending the zip file for the user's device to unzip. In this case, is it better to combine the zip files into one big zip file for the user to download?
Or is there a better solution? Thanks in advance!
EDIT:
It seems combining the zip files is a good idea, but I feel it may take too long for the user to wait while all the images download and unzip. So I may put ten or twenty images in each zip file, so the user can see some downloaded ones while waiting for more to come. Firing multiple AsyncTasks together can be faster, right? But they won't finish at the same time, even given the same file size and the same download address?
Since latency is often the largest problem with mobile connections, reducing the number of connections you have to open is a great way to optimize loading times. Sending a zip file with all the images sounds like a very good idea, and is probably worth the time it takes to implement.
Images are probably already compressed (GIF, JPG, PNG). You will not reduce the file size, but you will reduce the number of connections, which is a good idea for mobile. If it is always the same set of images, you can use a sprite technique (send one bigger image file containing all the images at different x/y offsets; in HTML you can show the right image by using it as a background with an offset).
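If the backend happens to be PHP (an assumption, since the question doesn't say what runs on the server), bundling a batch of images into one archive is only a few lines; the directory paths here are made up:

    <?php
    // Sketch: bundle a batch of images into one zip so the client opens a single
    // connection instead of one per image.
    $dir    = __DIR__ . '/images/batch1';
    $images = array_merge(glob("$dir/*.jpg"), glob("$dir/*.png"), glob("$dir/*.gif"));

    $zip = new ZipArchive();
    if ($zip->open(__DIR__ . '/bundles/batch1.zip', ZipArchive::CREATE | ZipArchive::OVERWRITE) !== true) {
        exit('could not create archive');
    }
    foreach ($images as $path) {
        $name = basename($path);
        $zip->addFile($path, $name);
        // Store rather than deflate: jpg/png/gif are already compressed and won't shrink.
        $zip->setCompressionName($name, ZipArchive::CM_STORE);
    }
    $zip->close();

As noted above, the win is fewer connections rather than a smaller payload, which is why the entries are stored uncompressed.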
I found this topic from the sidebar, but judging by the comments you're really asking about patching resources into the app.
The best thing is to make sure the user knows what to do with it: you want the user to download file X and get output Y for a given purpose. That said, it appears to be common practice to download chunks of resources that are not native to the Android app and cannot fit in the APK.
A comparable example is the JDIC apps, which use a popular Japanese dictionary resource that is used in tandem for English translations. JDIC apps like WWWJDIC use online downloads for the extremely large reference files that would otherwise suffer the bad latency (mentioned before) from Google's servers. It also gives an app a bad reputation to exceed 200 MB on Google's store unless it is a 3D title, where that is justifiable. If your images cannot be compressed without extremely long loading times in the app itself, you may need to consider this option. The only downside is having to require an online connection (also mentioned before).
Also, you could use 7zip and program the Android app to self-extract the archive to a location: http://www.wikihow.com/Use-7Zip-to-Create-Self-Extracting-excutables
On another note, it would be ideal for the app to perform routine update checks while doing a one-time download on initial startup. You can then optionally put in an AsyncTask so that your files are downloaded to the app and used after a restart, or however you want it, so you really need only one AsyncTask. The benefit is that the user syncs with the app and may only need to check once. The downside is that the user may not always be able to update and may need to use 4G or LTE, but that is a minor concern if they can use WiFi whenever they want.
I have a page that uploads a file to my server, where it then gets copied to a permanent directory via move_uploaded_file. This all seems to work great, with the exception that in a real-life scenario I will be expecting much larger files than I have successfully sent up.
I have already tackled the timeout for the file upload by changing the connection timeout in my site settings in IIS, so the file can keep uploading for up to six hours ( -_- ), but this is where I run into my current problem: it might actually take six hours!
After getting the upload past 10% or so (on a 300 MB file), I noticed that the file continues to push up, but my upload rate seems to be 'falling off': I observed faster speeds when I started the transfer than I am seeing halfway through it. The numbers here aren't necessarily relevant, as I know my connection (still 2 Mbps while I'm uploading) is capable of pushing faster than it currently is, and the server on the other end is on fiber.
I wonder if anyone has encountered this before, and if so, whether you found a workaround. Any help appreciated. Thanks.
You should not be using HTTP for this task. You may have observed that all the "file locker" services (and others which involve uploading files, such as Apple's online-music service) provide you with an "uploader" program rather than making use of the browser. There are reasons for this.
First off, the overhead of the transfer encoding can be large. If your (presumably binary) data ends up Base64 encoded, that's 33% overhead. So if it would take four hours over HTTP, it would take only three with a binary protocol - and that's disregarding the chunked-transfer overhead, so the reality is probably more severe.
Second, there's no way to "resume" an upload in HTTP. So if your connection is broken, you'll either have to write application-specific code to handle the resumption, or start all over.
Third, HTTP servers are not designed for super-long-lived connections: they usually have a small, finite pool of workers servicing client requests that are expected to last seconds, and they sometimes have smallish limits on the size of request data (2 GB is common, and PHP by default allows only a few MB).
I strongly recommend using a file transfer protocol to transfer files (such as FTP). You don't have to give out a single username/password pair to everyone: you can have a gatekeeper which integrates with whatever authentication system you already have in place. FTP-over-TLS also exists and is relatively mature.
There is a fairly good summary of the differences between the two protocols here. Note that you gain nothing from any of the advantages of HTTP listed, due to your circumstances.
Don't feel limited to FTP - rsync is a great protocol for transferring files as well, especially if you only change part of the file (it even does binary deltas!). git can also efficiently transport large blobs over secure connections or even HTTP, if you insist on using that.
I am developing a website that involves uploading videos larger than 50 MB.
Which is the better (faster) way of uploading the files to the server:
uploading the video files via ftp
or
uploading the files via a form
Thanks
The best way would be with FTP.
FTP is much faster for larger files; for files below 1 MB the difference won't matter as much.
P.S. If you are not the one doing the uploading, think about which is easier for your users. A form is easier, but FTP is still faster.
For user experience you should go with the form file upload. The speed of both depends on the internet connection and on the load on the server and client, and won't differ that much. It might be a bit much for your web server if it's handling a lot of users, but you can use nginx, for example, to make that less of a problem.
edit:
Here is a comparison: http://daniel.haxx.se/docs/ftp-vs-http.html
I use Jupload
It splits the files into chunks and uploads them via HTTP. It's also good because you don't need to care about file upload limits in the server config. Speed depends mostly on the client's connection, for both HTTP and FTP; of course there are some differences between them, but not big ones.
Why not offer both? (Seriously - I wrote an app about ten years ago that did this.) Look up "MOVEit DMZ" or research various FTP servers with web portal integration to see how it's being done today.
There's also a third way you should consider, touched on by the Jupload comment: a local control (Flash, Java, ActiveX, Firefox plug-in, etc.) that optimizes the upload experience. If people are uploading multiple large files to your site, they may appreciate the speed/reliability boost.
I have a general question about this.
When you have a gallery, people sometimes need to upload thousands of images at once. Most likely, it would be done through a .zip file. What is the best way to go about uploading this sort of thing to a server? Servers often have timeouts etc. that need to be accounted for. I am wondering what kinds of things I should be looking out for and what the best way is to handle a large number of images being uploaded.
I'm guessing that you would allow a user to upload a zip file (assuming the timeout does not affect you), and this zip file is uploaded to a specific directory; let's assume in this case that a directory is created for each user in the system. You would then unzip the archive on the server, scan the user's folder for any directories containing .jpg, .png or .gif files (etc.), and import them into a table accordingly, presumably labeled by folder name.
What kind of server-side trouble could I run into?
I'm aware that there may be many issues. Even general ideas would be good, so I can research further. Thanks!
Also, I would be programming in Ruby on Rails, but I think this question applies across any language.
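To make that flow concrete, here is a rough sketch of the unzip-and-scan step described in the question (shown in PHP to match the rest of this page; in Rails the same steps map onto rubyzip and Dir.glob, and the field and directory names are made up):

    <?php
    // Sketch: extract an uploaded zip into the user's directory, then collect
    // every image found so it can be imported into a table.
    $userDir = __DIR__ . '/galleries/user_42';   // one directory per user, as described
    $zipPath = $_FILES['archive']['tmp_name'];   // 'archive' is a placeholder field name

    $zip = new ZipArchive();
    if ($zip->open($zipPath) !== true) {
        exit('not a valid zip');
    }
    $zip->extractTo($userDir);
    $zip->close();

    // Walk the extracted tree and group image paths by the folder they were found in.
    $found    = [];
    $iterator = new RecursiveIteratorIterator(new RecursiveDirectoryIterator($userDir));
    foreach ($iterator as $file) {
        if ($file->isFile() && in_array(strtolower($file->getExtension()), ['jpg', 'png', 'gif'], true)) {
            $found[basename($file->getPath())][] = $file->getPathname();
        }
    }
    // $found now maps folder name => list of image paths, ready to insert into a table.

Note that the upload of the zip itself is still subject to the size and timeout limits discussed elsewhere on this page, so the extraction step is the easy part.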
There's no reason why you couldn't handle this kind of thing with a web application. There are a couple of excellent components that would be useful for this:
Uploadify (based on jquery/flash)
plupload (from moxiecode, the tinymce people)
The reason they're useful is that Uploadify uses a Flash component to handle uploads, so you can select groups of files from the file browser window (assuming no one is going to select thousands of images individually..!), while plupload also supports drag and drop along with more platforms.
Once you've got your interface working, the server side stuff just needs to be able to handle individual uploads, associating them with some kind of user account, and from there it should be pretty straightforward.
With regard to server-side issues, that's really a big question: it depends on how many people will be using the application at the same time, the size of the images, and any processing that takes place afterwards. Remember that the files are kept in a temporary location while the script is processing them, and are either deleted upon completion or copied to a final storage location by your script, so space, memory overhead and timeouts could all be issues.
If the images are massive, say RAW or TIFF, then this kind of thing could still work with chunked uploads, but implementing some kind of FTP upload might be easier. It's a bit of a vague question, but there should be plenty here to get you going ;)
With that many images it has to be a serious app, which gives you the liberty to suggest a piece of software running on the client (something like Yahoo Mail or Picasa does) that will take care of 'managing' the upload of images (network interruptions, resume support, etc.).
On the server side, you could process these one at a time (assuming your client is sending them that way), thus keeping it simple.
Take a peek at http://gallery.menalto.com
They have a dozen methods for uploading pictures into galleries.
You can choose the one which suits you.
Either have a client app or some Ajax code that sends the images one by one, preventing timeouts. Alternatively, if this is not available to the public, FTP still works...
I'd suggest a client application (maybe written in AIR or Titanium) or telling your users what FTP is.
deviantArt.com, for example, offers FTP as an upload method for paying subscribers, and it works really well.
Flickr instead has its own app for this, the "Flickr Uploadr".