File upload script apparently not executing until upload has completed - php

I have an odd problem: I've got an AJAX file uploader (fineuploader, as it happens), with a server-side script (PHP) that handles the upload. Although the uploader has an "allowedExtensions" setting, I want the allowed extensions to be specified on the server side, so my upload handler checks the file extension and returns an error response if the file's extension is not allowed.
The problem I'm having is that in my dev environment the upload handler returns this response straight away (as expected) and the upload stops, but on another server the upload goes ahead and the response is only returned after the upload has completed, which potentially means a very long wait before receiving an error message.
Looking at the network info in developer tools, it seems the browser is waiting for a response the whole time, and (on the problematic server) only receives it after the upload has completed, which seems to suggest that on the server the upload handler script is actually not being executed until after all the data has been received.
It seems to me that the most likely culprit is some setting to do with how PHP handles uploads / multi-part form data, but I can't figure out what that might be. Would be grateful for any advice!
Update:
It seems the problem is the same on both servers (I just didn't notice the lag on one). So it seems my upload handler script is not executing until the file transfer has completed (this seems likely because the script checks the filename early on and throws an Exception if the extension is wrong, so it should respond quickly once it's started. Also, it always responds almost immediately when the upload has completed - however long that takes - suggesting that's when it's being executed).
Is this just a feature of how PHP handles multi-part form data? Is there any way of allowing the script to respond immediately if the filename is unsuitable?
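For context, here is a minimal sketch of the kind of handler being described; the question doesn't include the actual code, and the "qqfile" field name and extension whitelist are assumptions (qqfile is Fine Uploader's default upload field name):

```php
<?php
// upload.php - reject disallowed extensions before doing anything else.
// Note: by the time this runs, PHP has already received the entire request
// body, which is exactly the behaviour the question is asking about.
$allowed = ['jpg', 'jpeg', 'png', 'gif']; // assumed whitelist

$name = $_FILES['qqfile']['name'] ?? '';
$ext  = strtolower(pathinfo($name, PATHINFO_EXTENSION));

header('Content-Type: application/json');
if (!in_array($ext, $allowed, true)) {
    echo json_encode(['success' => false, 'error' => "Extension .$ext not allowed"]);
    exit;
}
move_uploaded_file($_FILES['qqfile']['tmp_name'], __DIR__ . '/uploads/' . basename($name));
echo json_encode(['success' => true]);
```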

It could be as simple as the dev server receiving the file, and therefore responding, much faster. If the dev server is on your own machine or the local network, a 100 Mbit connection makes even a pretty large file upload blazingly fast, while a production server is often outside the current network, so the upload takes much longer...
Sadly no, it's not possible for PHP to respond before the request is complete, because that's the very nature of HTTP. You can cut off the connection once the request is sent and not read the response, but you can't expect to receive a response until the whole request has been sent. Although I used to do this with my friends, that is, cut them off and answer before the end of the question, I can assure you that only humans are capable of that feat! Oh, and don't do that, it breaks friendships.
Also, multipart doesn't mean chunked upload: it means the message is separated into different parts, using a message boundary to delimit the different elements. If the process started before the whole request was sent, you'd need to integrate mechanisms to detect whether a certain part of your request had been completely uploaded, which would make the web even harder to program than it is now!
You can look at this for an example of what a multipart request looks like: htmlcodetutorial.com/forms/form_enctype.html
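To make that concrete, here is a trimmed illustration of what such a request looks like on the wire (the boundary, field name, and filename are made up):

```
POST /upload.php HTTP/1.1
Content-Type: multipart/form-data; boundary=----boundary123
Content-Length: 10485760

------boundary123
Content-Disposition: form-data; name="qqfile"; filename="photo.jpg"
Content-Type: image/jpeg

...megabytes of raw file data...
------boundary123--
```

The filename arrives near the top, but the handler is only given the request once the whole body, closing boundary included, has been received.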

Related

Process Uploaded file on web server without storing locally first?

I am trying to process a user-uploaded file in real time on the web server,
but it seems Apache invokes PHP only once the complete file has been uploaded.
When I uploaded the file using cURL and set
Transfer-Encoding: "chunked"
I had some success, but I can't do the same thing via a browser.
I used Dropzone.js, but when I tried to set the same header, it said Transfer-Encoding is an unsafe header, and refused to set it.
This answer explains the issue:
Can't set Transfer-Encoding: "Chunked" from browser
In a nutshell, the problem is: when a user uploads a file to the web server, I want the web server to start processing it as soon as the first byte is available.
By "process" I mean piping it to a named pipe.
I don't want 500 MB to first get uploaded to the server before processing starts.
But with the current web server stack (Apache + PHP), I can't seem to accomplish that.
Could someone please explain what technology stack or workarounds to use, so that I can upload a large file via the browser and start processing it as soon as the first byte is available?
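For reference, a cURL invocation along the lines described might look like this (the endpoint and filename are assumptions; curl switches to a chunked upload when that header is set):

```
curl -H "Transfer-Encoding: chunked" -T large_file.bin http://example.com/upload.php
```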
It is possible to use NodeJS/Multiparty to do that. Here they have an example of a direct upload to Amazon S3: this is the form, which sets the content type to multipart/form-data, and here is the function for processing the form parts. The part parameter is of type ReadableStream, which allows per-chunk processing of the input using its data event.
More on readable streams in Node.js is here.
If you really want that (sorry, I don't think it's a good idea), you should try looking for a FUSE filesystem that does the job.
Maybe there is already one: https://github.com/libfuse/libfuse/wiki/Filesystems
Or you could write your own.
But remember: as soon as the upload is completed and the POST script finishes its job, the temp file will be deleted.
You can upload the file with an HTML5 resumable-upload tool (like Resumable.js) and process the uploaded parts as soon as they are received, as sketched below.
Or, as a workaround, you could find the path of the uploaded file (usually in /tmp) and then write a background job to stream it to the third-party app. That may be harder.
There may be other solutions...
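A rough sketch of the chunk-receiving side of the first suggestion; the field names here ("fileId", "chunkIndex", "chunk") are hypothetical, not Resumable.js's actual parameter names:

```php
<?php
// chunk_receiver.php - append each uploaded chunk as it arrives, so that
// processing can start per-chunk instead of waiting for the whole file.
$fileId = preg_replace('/[^A-Za-z0-9_-]/', '', $_POST['fileId'] ?? '');
$index  = (int)($_POST['chunkIndex'] ?? 0);
$chunk  = $_FILES['chunk']['tmp_name'] ?? null;

if ($fileId === '' || $chunk === null) {
    http_response_code(400);
    exit('missing chunk');
}

// Append to the partial file; each chunk could just as well be written
// to a named pipe here to feed a downstream consumer.
$out = fopen("/tmp/{$fileId}.part", 'ab');
fwrite($out, file_get_contents($chunk));
fclose($out);

echo "chunk $index stored";
```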

Uploading a file to my PHP server using a POST and passing it to another external server

I've found similar posts to what I want to do, but nothing exact so please excuse me if I missed it and there is already an answer here. Also, I am a C++ engineer, but new to PHP and without much experience with networking and HTTP requests.
Here is what I am hoping to do. I have a Linux server that is running PHP that hosts a restful API for clients to access. Clients have their custom authentication to access the API and they can upload files using it. I then need to take those files send them to an external server using my private authentication credentials. I can easily set that up so when I receive the POST, I create a new HTTP request to post it to my private server and then return the results back to the client.
The issue is speed. The files can be quite large, which means the client has to wait for the file to be uploaded twice before receiving a response. One solution I have is to immediately send a response back to the client and then have the client ping the server every x seconds to check the status of the secondary upload. This gets a response back to the client faster, but it's not ideal. I was hoping for a more advanced solution: is there a way I can begin the secondary upload on my server as soon as I start receiving the upload, so that by the time the upload to my server is complete, the upload to the secondary server is almost complete as well? This all has to be accomplished with POSTs, so I don't know if that is a limiting factor in the equation.
Is something like that even possible? If so, how would you recommend doing it?
Another option might be to somehow have the client upload directly to the secondary server, but how would that be possible without giving the client my private authentication? Keep in mind that the secondary server is just a restful API that accepts POSTs using an API key and token for authentication.
That is a tough one. If you had FTP access on the secondary server, we could eventually squeeze out a bit more speed.
As far as I know, you can't start sending the secondary upload at the same time you receive the file. While the file is being uploaded, it resides in a temporary folder until the request is finished, and only then is it moved to some accessible folder. But hey, I am not 100% sure there is no possibility at all.
The current solution seems the best to me: get the file, inform the user, upload to the secondary server, inform the user when it is all complete, as sketched below. Communication between servers is usually much faster, so the secondary upload should take less time than the first.
The last option, for now, is a no-no. I can't think of a way to upload directly to the secondary server without exposing your credentials.
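A sketch of that two-step flow using PHP's cURL bindings; the endpoint URL, token, and field names are placeholders:

```php
<?php
// relay.php - accept the client's upload, then re-POST it to the secondary
// server with the private credentials. The client waits for both steps.
$tmp  = $_FILES['file']['tmp_name'];
$name = $_FILES['file']['name'];

$ch = curl_init('https://secondary.example.com/api/upload'); // placeholder URL
curl_setopt_array($ch, [
    CURLOPT_POST           => true,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_HTTPHEADER     => ['Authorization: Bearer YOUR_API_TOKEN'], // placeholder
    CURLOPT_POSTFIELDS     => [
        'file' => new CURLFile($tmp, $_FILES['file']['type'], $name),
    ],
]);
$response = curl_exec($ch);
curl_close($ch);

// Pass the secondary server's response straight back to the client.
header('Content-Type: application/json');
echo $response;
```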
Thank you guys. Looks like I am stuck with the option of returning a result to the client when the upload is complete, then starting the secondary upload while having the client ping the main server to check its status. I think that should be fine, since I am limited to PHP at the moment.
Cheers
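One refinement, if the stack happens to run PHP-FPM: the response can be flushed to the client before the secondary upload starts, within a single request, using fastcgi_finish_request(). A sketch under that assumption (the status-polling endpoint is left out):

```php
<?php
// Respond to the client immediately, then continue the secondary upload
// in the same process. Requires PHP-FPM; mod_php has no equivalent.
$saved = '/tmp/relay_' . uniqid();
move_uploaded_file($_FILES['file']['tmp_name'], $saved); // keep the file past request end

header('Content-Type: application/json');
echo json_encode(['status' => 'received', 'id' => basename($saved)]);
fastcgi_finish_request(); // the client gets its response here

// ...continue with the cURL re-upload sketched above, then record completion
// somewhere the client's status-polling requests can read it.
```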

Receiving as much of an HTTP request as possible that didn't complete

In short, if I am sending an HTTP POST with a large-ish (20-30 MB) payload and the connection drops halfway through sending the request to the server, can I recover the 10 MB+ that was sent before the connection dropped?
In my testing of PHP on NGINX, if the connection drops during the upload, my PHP never seems to start. I have ignore_user_abort(1) at the top of the script, but that only seems to be relevant once a complete request has been received.
Is there a configuration setting somewhere that will allow me to see all of the request that was received, even if it wasn't received in full?
I'm sending these files mostly over intermittent connections, so I'd like to send as much as I can per request and then just ask the server where to continue from. As things stand, I have to send the files in pieces, reducing the piece size when there are errors and increasing it when there haven't been any for a while. That's very slow and wasteful of bandwidth.
Update:
I should clarify that it's not so much uploading a large file in one go that I'm after; rather, if the connection breaks, can I pick up from where I left off? In all my testing, if the complete POST is not received, the whole request is junked and PHP is not notified, so I have to start from scratch.
I'll have to run some tests, but are you saying that if I used chunked transfer encoding for the request, PHP would get all the chunks received before disconnection? It's worth a try, and certainly better than making multiple smaller POSTs on the off chance that the connection will break.
Thanks for the suggestion.
Never process big file uploads through a scripting backend (Ruby, PHP); nginx has built-in direct-upload functionality via the client_body_in_file_only directive. See my in-depth overview of it here: https://coderwall.com/p/swgfvw/nginx-direct-file-upload-without-passing-them-through-backend
The only limitation is that it doesn't work with multipart form data, only with uploads sent via AJAX or as a direct POST from mobile, or server to server.
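The pattern that article describes looks roughly like this (the upstream name and paths are placeholders):

```
location /upload {
    client_body_in_file_only  clean;   # buffer the request body straight to a file
    client_body_temp_path     /tmp/nginx_upload;
    client_max_body_size      1g;

    proxy_pass_request_body   off;                        # don't replay the body
    proxy_set_header          X-FILE $request_body_file;  # hand the backend the path instead
    proxy_pass                http://backend;             # placeholder upstream
}
```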

How to deal with broken multipart uploads?

I have a PHP script that receives images from remote devices and saves them to the database. The script, launched from Apache, first receives a header defining what is being uploaded, and then the contents of the uploaded images, all as a single multipart transmission. After successfully adding the images, it replies with a confirmation for the device.
Now the problem: the connection isn't very reliable, and sometimes the transmission times out. That wouldn't be a problem in itself, as the device resends the data after some time if it doesn't receive the confirmation. Except that if the transmission is broken halfway through, Apache launches the script as usual, and the script happily saves the incomplete set of images to the database, with their creation timestamps as unique keys. The remote device then resends the data, which the script receives correctly this time, but it's unable to save it because the unique key is already taken by the corrupt data.
Is there some reliable way to either tell, from within a PHP script, that it's been launched on an incomplete multipart transmission, or to prevent Apache from starting it if the transmission didn't end successfully?
(we can't really change the database structure or the format received from the remote device.)
Check your device's application code for a send timeout. Maybe the connection is too slow and the device breaks it as the time limit is reached. In that case the Apache server would run the PHP script with the partially received data, in spite of it not matching the Content-Length header.
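On the PHP side, one cheap guard is worth adding before anything is saved: PHP records a per-file status code in $_FILES, and a part that was cut off mid-transfer is supposed to show up as UPLOAD_ERR_PARTIAL. A sketch (whether your devices' broken transmissions actually trigger it is something to verify):

```php
<?php
// Refuse to save anything unless every uploaded part arrived intact.
foreach ($_FILES as $field => $file) {
    if ($file['error'] !== UPLOAD_ERR_OK) {
        // UPLOAD_ERR_PARTIAL (3) flags a part that was cut off mid-transfer.
        http_response_code(400);
        exit("Upload of '$field' incomplete (code {$file['error']}); nothing saved.");
    }
}
// ...only now insert the images, so a resend can reuse the timestamp key.
```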

PHP redirect file post stream

I'm writing an API in PHP that wraps another website's functionality and returns everything as JSON/XML. I've been using cURL and so far it's working great.
The website has a standard file-upload POST that accepts files up to 1 GB.
So the problem is: how do I redirect the file-upload stream to the corresponding website?
I could download the file and then upload it, but I'm limited by my server to just 20 MB, and that seems a poor solution.
Is it even possible to control the stream and redirect it directly to the website?
I preserved the original answer at the bottom for posterity, but as it turns out, there is a way to do this.
What you need is a combination of the HTTP PUT method (which unfortunately isn't available in native browser forms), the PHP php://input wrapper, and a streaming PHP socket. This gets around several limitations: PHP disallows php://input for multipart POST data, but it does nothing of the sort for PUT file data - clever!
If you're going to attempt this with Apache, you're going to need mod_actions installed and activated. You're also going to need to specify a PUT script with the Script directive in your virtualhost/.htaccess.
http://blog.haraldkraft.de/2009/08/invalid-command-script-in-apache-configuration/
This allows PUT methods for only one URL endpoint. That script is the file that will open the socket and forward its data elsewhere. In my example case below, this is just index.php.
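The directive itself is a single line in the virtual host or .htaccess (the script path is whatever you choose; /index.php matches the example below):

```
# mod_actions must be enabled; routes all PUT requests to the forwarding script
Script PUT /index.php
```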
I've prepared a boilerplate example of this using the Python requests module as the client sending the PUT request with an image file. If you run remote_server.py, it will open a service that just listens on a port and awaits the forwarded message from PHP. put.py sends the actual PUT request to PHP. You're going to need to set the hosts in put.py and index.php to the ones you define in your virtual host, depending on your setup.
Running put.py will open the included image file and send it to your PHP virtual host, which will in turn open a socket and stream the received data to the Python pseudo-service, printing it to stdout in the terminal. A streaming PHP forwarder!
There's nothing stopping you from using any remote service that listens on a TCP port in the same way, in another language entirely. The client could be rewritten the same way, so long as it can send a PUT request.
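For the record, a minimal sketch of the PHP side of that relay; the host and port of the listening service are placeholders:

```php
<?php
// index.php - receives the PUT request and streams its body to a TCP listener.
$upstream = stream_socket_client('tcp://127.0.0.1:9999', $errno, $errstr, 30);
if ($upstream === false) {
    http_response_code(502);
    exit("Could not reach upstream: $errstr");
}

// php://input is readable for PUT bodies, so the request can be copied
// through without ever buffering the whole file in memory.
$input = fopen('php://input', 'rb');
$bytes = stream_copy_to_stream($input, $upstream);
fclose($input);
fclose($upstream);

header('Content-Type: application/json');
echo json_encode(['forwarded_bytes' => $bytes]);
```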
The complete example is here:
https://github.com/DeaconDesperado/php-streamer
I actually had a lot of fun with this problem. Please let me know how it works and we can patch it together.
Begin original answer
There is no native way in PHP to pass a file through asynchronously as it comes in with the request body, without saving its state to disk in some manner. This means you are hard-bound by the memory limit on your server (20 MB in your case). The manner in which the $_FILES superglobal is initialized after the request is received depends on this, as PHP will attempt to migrate that multipart data to a tmp directory.
Something similar can be achieved with the use of sockets, as this will circumvent the HTTP protocol at least, but if the file is passed in the HTTP request, PHP is still going to attempt to save it statefully in memory before it does anything at all with it. You'd have the tail end of the process set up with no practical way of getting that far.
There is the Stream library, which comes close, but it still relies on reading the file out of memory on the server side - it has to already be there.
What you are describing is a little bit outside of the HTTP protocol, especially since the request body is so large. HTTP is a request/response-based mechanism, and one depends upon the other... it's very difficult to accomplish an in-place, streaming upload at an intermediary point, since this would imply some protocol that uploads while the bits are streamed in.
One might argue this is more a limitation of HTTP than of PHP, and since PHP is designed expressly with the HTTP protocol in mind, you are moving outside its comfort zone.
Deployments like this are regularly attempted with high success using other scripting languages (Twisted in Python, for example; a lot of people are getting on board with NodeJS for its concurrent design pattern; there are alternatives in Ruby and Java that I know much less about).
