So I am simply seeking to upload files to the server for later upload to Azure Blob Storage.
I am aware you can POST data using an HTML form; however, this isn't a solution.
I would like to be able to upload files directly using POST such as through cURL or other means. As far as I'm aware PHP also requires the content type header to be specified as multipart/form-data for this.
I have tried to use file handles, but it appears that I am only able to interact with locally stored files. I have also played around with move_uploaded_file, but I am unable to POST files directly.
Streaming is important here, as files will likely be 5MB+, which wouldn't work for multiple concurrent uploads as PHP would quickly saturate its allocated memory; while I can change PHP settings on the staging server, I have no idea what the PHP memory allocation will be like on the Azure VM production server. It also wouldn't make sense to keep changing PHP settings with load increases.
However, if PHP does not offer such a solution out of the box, I'm open to alternatives.
To summarise, how do I stream data (e.g. videos) to a PHP file?
Thank you in advance.
Related
I'm trying to convert a website to use S3 storage instead of local (expensive) disk storage. I solved the download problem using a stream wrapper interface on the S3Client. The upload problem is harder.
It seems to me that when I post to a PHP endpoint, the $_FILES object is already populated and copied to /tmp/ before I can even intercept it!
On top of that, the S3Client->upload() expects a file on the disk already!
Seems like a double-whammy against what I'm trying to do, and most advice I've found uses NodeJS or Java streaming so I don't know how to translate.
It would be better if I could intercept the code that populates $_FILES and then send up 5MB chunks from memory with the S3\ObjectUploader, but how do you crack open the PHP multipart handler?
Thoughts?
EDIT: It is a very low quantity of files, 0-20 per day, mostly 1-5MB sometimes hitting 40~70MB. Periodically (once every few weeks) a 1-2GB file will be uploaded. Hence the desire to move off an EC2 instance and into heroku/beanstalk type PaaS where I won't have much /tmp/ space.
It's hard to comment on your specific situation without knowing the performance requirements of the application and the volume of users needed to access it so I'll try to answer assuming a basic web app uploading profile avatars.
There are some good reasons for this. The file is streamed to disk for multiple purposes, one of which is to conserve memory. If your file is not on disk, then it is in memory (think disk usage is expensive? Bump up your memory usage and see how expensive that gets), which is fine for a single user uploading a small file, but not so great for a bunch of users uploading small files or, worse, large files. You'll likely see the best performance if you use the defaults on these libraries and let them stream to and from the disk.
But again I don't know your use case and you may actually need to avoid the disk at all costs for some unknown reason.
I have numerous storage servers, and then more "cache" servers which are used to load balance downloads. At the moment I use RSYNC to copy the most popular files from the storage boxes to the cache boxes, then update the DB with the new server IDs, so my script can route the download requests to a random box which has the file.
I'm now looking at better ways to distribute the content, and wondering if it's possible to route requests to any box at random, and the download script then check if the file exists locally, if it doesn't, it would "get" the file contents from the remote storage box, and output the content in realtime to the browser, whilst keeping the file on the cache box so that the next time the same request is made, it can just serve the local copy, rather than connecting to the storage box again.
Hope that makes sense(!)
I've been playing around with RSYNC, wget and cURL commands, but I'm struggling to find a way to output the data to browser as it comes in.
I've also been reading up on reverse proxies with nginx, which sounds like the right route, but it still sounds like they require the entire file to be downloaded from the origin server to the cache server before anything can be output to the client(?). Some of my files are 100GB+ and each server has a 1Gbps bandwidth limit, so at best it would take around 800 seconds (100GB is 800 gigabits) to download a file of that size to the cache server before the client sees any data at all. There must be a way to "pipe" the data to the client as it streams?
Is what I'm trying to achieve possible?
You can pipe data without downloading the full file by using streams. One example for downloading a file as a stream is the Guzzle sink feature. One example for sending a file to the user as a stream is the Symfony StreamedResponse. Using those, the following can be done:
Server A has a file the user wants
Server B gets the user request for the file
Server B uses Guzzle to set up a download stream to server A
Server B outputs the StreamedResponse directly to the user
Doing so will serve the download in real time without having to wait for the entire file to be finished. However, I do not know if you can stream to the user and store the file on disk at the same time. There's a stream_copy_to_stream function in PHP which might allow this, but I don't know that for sure.
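On the open question of streaming to the user and storing on disk at the same time: a minimal sketch of one way it could work, using plain fread/fwrite on PHP stream resources rather than Guzzle or Symfony. The function name tee_stream, the URL and the cache path are all hypothetical, and the loop assumes blocking streams.

```php
<?php
// Read $source one chunk at a time and write each chunk to both $client and
// $cache, so the user starts receiving data before the transfer is complete.
// Returns the number of bytes read from $source.
function tee_stream($source, $client, $cache, int $chunkSize = 8192): int
{
    $total = 0;
    while (!feof($source)) {
        $chunk = fread($source, $chunkSize);
        if ($chunk === false) {
            break;      // read error: stop copying
        }
        if ($chunk === '') {
            continue;   // nothing read this round; feof() decides whether to go on
        }
        fwrite($client, $chunk); // goes out to the user right away
        fwrite($cache, $chunk);  // same bytes kept for the next request
        $total += strlen($chunk);
    }
    return $total;
}

// Hypothetical use on the cache server (names and paths are assumptions):
// $source = fopen('http://storage-box.example/files/big.iso', 'rb'); // needs allow_url_fopen
// $client = fopen('php://output', 'wb');
// $cache  = fopen('/var/cache/files/big.iso.part', 'wb');
// tee_stream($source, $client, $cache);
```

If something like this is used, it probably makes sense to write the cache copy under a temporary name and rename it only after the copy completes, so an aborted download does not leave a truncated file that looks complete.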
I am trying to process a user-uploaded file in real time on the web server,
but it seems Apache invokes PHP only once the complete file has been uploaded.
When I uploaded the file using cURL and set
Transfer-Encoding: chunked
I had some success, but I can't do the same thing via the browser.
I used Dropzone.js, but when I tried to set the same header, the browser refused to set it, saying Transfer-Encoding is an unsafe header.
This answer explains what is the issue there.
Can't set Transfer-Encoding: "Chunked" from Browser
In a nutshell, the problem is this: when a user uploads a file to the web server, I want the web server to start processing it as soon as the first byte is available.
By "process" I mean piping it to a named pipe.
I don't want 500MB to be uploaded to the server first and only then start processing it.
But with the current web server (Apache with PHP), I can't seem to accomplish this.
Could someone please explain what technology stack or workarounds to use so that I can upload a large file via the browser and start processing it as soon as the first byte is available?
It is possible to use NodeJS/Multiparty to do that. Here they have an example of a direct upload to Amazon S3. This is the form, which sets content type to multipart/form-data. And here is the function for form parts processing. part parameter is of type ReadableStream, which will allow per-chunk processing of the input using data event.
More on readable streams in node js is here.
If you really want that (sorry, I don't think that's a good idea), you should try looking for a FUSE filesystem which does the job.
Maybe there is already one: https://github.com/libfuse/libfuse/wiki/Filesystems
Or you could write your own.
But remember that as soon as the upload is completed and the POST script finishes its job, the temp file will be deleted.
You can upload the file with HTML5 resumable upload tools (like Resumable.js) and process the uploaded parts as soon as they are received.
Or, as a workaround, you may find the path of the uploaded file (usually in /tmp) and then write a background job to stream it to a third-party app. That may be harder.
There may be other solutions...
I have a standard HTML form, including a file input, allowing users of a web application to upload files (pictures, documents or videos).
It technically works, except when it comes with a large file...
I have a personal dedicated server on which I can change the PHP configuration to handle larger files;
But this is not a reliable solution as my client has a shared hosting.
I know HTTP has different limits and is definitely not the best protocol to handle files.
So my objective is to avoid HTTP upload.
I was wondering if there is any way to rely on FTP to upload the selected file.
Is there any solution to upload files through FTP, directly from the client's browser?
EDIT:
I've read some solutions about Java applets; however, this is not really something I can, or even want to, provide.
And as the files are confidential, using a third-party service is not possible either.
I have a web application that accepts file uploads of up to 4 MB. The server side script is PHP and web server is NGINX. Many users have requested to increase this limit drastically to allow upload of video etc.
However, there seems to be no easy solution for this problem with PHP. First, on the client side I am looking for something that would allow me to chunk files during transfer. SWFUpload does not seem to do that. I guess I can stream uploads using Java FX (http://blogs.oracle.com/rakeshmenonp/entry/javafx_upload_file) but I cannot find any equivalent of request.getInputStream in PHP.
Increasing browser client_post limits or php.ini upload or max_execution times is not really a solution for really large files (~1GB), because the browser may time out, and think of all those blobs stored in memory.
Is there any way to solve this problem using PHP on server side? I would appreciate your replies.
Plupload is a JavaScript/PHP library; it's quite easy to use and allows chunking.
It uses HTML5 though.
Take a look at the tus protocol, which is an HTTP-based protocol for resumable file uploads, so you can carry on where you left off without re-uploading all the data again after an interruption. This protocol has also been adopted by Vimeo since May 2017.
You can find various implementations of the protocol in different languages here. In your case, you can use its JavaScript client, called Uppy, together with a Go or PHP based server implementation on the server.
"but I can not find any equivalent of request.getInputStream in PHP. "
fopen('php://input'); perhaps?
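A minimal sketch of how that could look, assuming the client sends the file as the raw request body (for example with curl --data-binary @video.mp4) rather than as multipart/form-data, since php://input is not available for multipart/form-data POSTs while enable_post_data_reading is on. The function name save_stream_to_file and the destination path are hypothetical, and whether the web server hands the body to PHP before the upload has finished depends on the server setup.

```php
<?php
// Copy a readable stream to a file in fixed-size chunks, so memory use stays
// near $chunkSize no matter how large the upload is. Returns bytes written.
function save_stream_to_file($in, string $destPath, int $chunkSize = 8192): int
{
    $out = fopen($destPath, 'wb');
    if ($out === false) {
        throw new RuntimeException("Cannot open $destPath for writing");
    }
    $total = 0;
    while (!feof($in)) {
        $chunk = fread($in, $chunkSize);
        if ($chunk === false) {
            break; // read error: stop and report what was written so far
        }
        $written = fwrite($out, $chunk);
        if ($written === false) {
            fclose($out);
            throw new RuntimeException("Write to $destPath failed");
        }
        $total += $written;
    }
    fclose($out);
    return $total;
}

// Hypothetical use inside the receiving script (the path is an assumption):
// $bytes = save_stream_to_file(fopen('php://input', 'rb'), '/path/to/uploads/video.bin');
```

The built-in stream_copy_to_stream does the same copy in one call; the explicit loop is only there to show where per-chunk processing would go.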
I have created a JavaFX client to send large files in chunks of max post size (I am using 2 MB) and a PHP receiver script to assemble the chunks into original file. I am releasing the code under apache license here : http://code.google.com/p/gigaupload/
Feel free to use/modify/distribute.
Try using the bigupload script. It is very easy to integrate and can upload up to 2 Gb in chunks. The chunk size is customizable.
How about using a Java applet for the uploading and PHP for the processing?
You can find an example here for Jupload:
http://sourceforge.net/apps/mediawiki/jupload/index.php?title=PHP_Example
You can use this package.
It supports resumable chunk upload.
In the examples/js-examples/resumable-chunk-upload example, you can close and re-open the browser and then resume incomplete uploads.
You can definitely write a web app that will accept a block of data (even via a POST) then append that block of data to a file. It seems to me that you need some kind of client side app that will take a file and break it up into chunks, then send it to your web service one chunk at a time. However, it seems a lot easier to create an sftp dir, and let clients just sftp up files using some pre-existing client app.
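The server side of that idea can be sketched as follows, assuming the client sends one chunk per request and sends the chunks in order. The function name append_chunk, the $uploadId variable and the paths are hypothetical; a real receiver would also need to validate the upload identifier and detect missing or repeated chunks.

```php
<?php
// Append one uploaded chunk (any readable stream) to a partial file and
// return the size of the partial file afterwards.
function append_chunk($chunkStream, string $partialPath): int
{
    $out = fopen($partialPath, 'ab'); // 'a' = every write goes to the end
    if ($out === false) {
        throw new RuntimeException("Cannot open $partialPath for appending");
    }
    flock($out, LOCK_EX);             // keep concurrent requests from interleaving
    stream_copy_to_stream($chunkStream, $out);
    fflush($out);
    flock($out, LOCK_UN);
    fclose($out);
    clearstatcache(true, $partialPath);
    return filesize($partialPath);
}

// Hypothetical use in the script that receives each POSTed chunk:
// $size = append_chunk(fopen('php://input', 'rb'), '/path/to/uploads/' . $uploadId . '.part');
```

Returning the running size gives the client something to compare against the number of bytes it has sent so far.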