PHP POST and execute for large file uploads

I have the following PHP files:
fileUploadForm.php
handleUpload.php
fileUploadForm.php produces the following output:
outputs $_SESSION['errorMessage'] (if any)
outputs a file upload form that posts to handleUpload.php
handleUpload.php performs the following actions:
validates the session (redirects to login if validation fails)
validates the file (sets $_SESSION['errorMessage'] if validation fails)
scans the file for viruses
moves the file
updates the database
The script is having trouble with large file uploads. For testing purposes I have set all of the php.ini settings related to file uploads ridiculously high, so I don't believe this is a configuration issue.
The following behavior is confusing me:
1. When I watch the file grow in tmp, the upload continues well past the max_input_time that was set. My understanding was that once max_input_time is exceeded the script terminates, and with it the upload. Any thoughts on why this isn't happening?
2. If I stop the upload midstream and refresh fileUploadForm.php (not resubmit it), the script outputs the file-validation error messages that are set in handleUpload.php. This seems to indicate that even though the upload did not complete, lines of code in handleUpload.php were executed. Does PHP execute a script and receive the form data asynchronously? I would have thought that the script would wait until all form data was received before executing any code. But this assumption is contradicted by the behavior I am seeing. In what order do a data POST and script execution occur?
3. When max_input_time, along with the rest of the config values, is set ridiculously large for testing, very large uploads do complete. However, the rest of the script just seems to die: the virus scan and file move never happen, nor do the database updates. I have error handling set for each action in the script, but no errors are thrown. The page just seems to have died. Any thoughts on why this might happen, and how to catch such an error?
Thanks in advance.
Kate

This quote from your second question answers (at least partially) the other two:
I would have thought that the script would wait until all form data was received before executing any code.
In order for PHP to be able to handle all input data, Apache (or whatever HTTP server you are using) will first wait for the file upload to complete, and only after that will it hand the request over to the PHP script. So PHP's max_input_time check only comes into play after the upload itself has finished.
Now you'd probably ask why the virus scan, file move and other script steps still don't happen, since it's logical that any PHP time counter should start with the script's execution, which happens after all input data is received. Well, that SHOULD be the case and, to be honest, my thoughts on this are a shot in the dark: either some other limit is being exceeded, or the script is started with the request but suspended by the httpd until it is ready to proceed, so some of those counters might effectively expire during that time.
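If it is some counter expiring, a shutdown handler can usually tell you what killed the page. A minimal sketch (PHP 5.3+; it logs to wherever error_log points) that could go at the top of handleUpload.php:

<?php
// Log fatal errors (including "maximum execution time exceeded")
// that would otherwise make the page appear to just die.
register_shutdown_function(function () {
    $e = error_get_last();
    if ($e !== null && in_array($e['type'], array(E_ERROR, E_CORE_ERROR, E_COMPILE_ERROR), true)) {
        error_log(sprintf('handleUpload died: %s in %s:%d',
            $e['message'], $e['file'], $e['line']));
    }
});
set_time_limit(0); // lift the execution limit for the scan/move/DB work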
I don't know how to answer your second question, as a refresh would mean that all of the data is re-POSTed and should be re-processed. I doubt that you did the other thing - simply loading handleUpload.php without re-submitting the form - but it's a possibility I should mention. A second guess would be that if the first request was terminated unexpectedly, some garbage-collection and/or recovery process happens on the second request.
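One thing worth checking in that scenario: if PHP really did run handleUpload.php for the interrupted POST, the upload error code should say so. A minimal sketch of an early check (the field name userfile is an assumption; use whatever your form field is called):

<?php
session_start();
$err = isset($_FILES['userfile']) ? $_FILES['userfile']['error'] : UPLOAD_ERR_NO_FILE;
if ($err === UPLOAD_ERR_PARTIAL) {
    // the POST was cut off midstream
    $_SESSION['errorMessage'] = 'The file was only partially uploaded.';
} elseif ($err !== UPLOAD_ERR_OK) {
    $_SESSION['errorMessage'] = 'Upload failed with error code ' . $err;
}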
Hope that clears it up a bit.

Related

PHP: how to continue running after the connection is cancelled

I'm having a problem with a file upload feature.
The user uploads files through the site (large files, like WeTransfer).
The upload percentage is shown with Ajax.
When it completes, a notice is shown.
But here is where the problem starts.
Because the files are huge, it takes time to move them to the appropriate folder and zip them.
If the user closes the browser during this step, the process cannot complete.
How do I ensure that the operation continues even if the user closes the browser?
I tried ignore_user_abort, but I was not successful.
Send a response to the browser saying that you are moving the file, and either queue the work and execute it as a background job, or just keep doing it in your script. That should help: https://stackoverflow.com/a/5997140/4099089
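A minimal sketch of the queue variant (the staging directory, the jobs.txt queue file and the field name userfile are all assumptions): move the upload out of tmp, record a job, and respond right away; a separate worker does the slow zip/move later.

<?php
$staging = __DIR__ . '/staging/' . basename($_FILES['userfile']['name']);
move_uploaded_file($_FILES['userfile']['tmp_name'], $staging);

// record the job; a cron-driven worker picks it up later
file_put_contents(__DIR__ . '/jobs.txt', $staging . PHP_EOL, FILE_APPEND | LOCK_EX);

// respond immediately so the user can close the browser
echo json_encode(array('status' => 'queued', 'file' => basename($staging)));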

Ajax error 0 after a long PHP script

I am using jQuery Ajax to send the URL of a file (a CSV file) located on the server to my PHP script so it can be processed.
The CSV file contains telephone calls. If I have a file with even 10,000 calls, everything is OK. But if I try a big file with, for example, 20,000 calls, I get an Ajax error 0. I check for a server response with Firebug, but I get none.
This behavior occurs after about 40 minutes of waiting for the PHP script to end. So why do I get this error on big files only? Does it have to do with Apache, MySQL, or the server itself? Anyone able to help will be my personal hero, because this is driving me nuts.
I need a way to figure out what exactly is happening, but Firebug won't return a server response. Is there any other way to find out what is happening?
I checked the PHP error log and it reports nothing on the matter.
Thanks in advance.
The script may have timed out. Check your php.ini file:
max_execution_time
max_input_time ; the maximum time an input can be processed for
Where your php.ini lives depends on your environment; more information: http://php.net/manual/en/ini.php
Check:
max_input_time
This sets the maximum time in seconds a script is allowed to parse input data, like POST and GET. It is measured from the moment of receiving all data on the server to the start of script execution.
max_execution_time
This sets the maximum time in seconds a script is allowed to run before it is terminated by the parser. This helps prevent poorly written scripts from tying up the server. The default setting is 30. When running PHP from the command line the default setting is 0.
Also
Your web server can have other timeout configurations that may also interrupt PHP execution. Apache has a Timeout directive and IIS has a CGI timeout function. Both default to 300 seconds. See your web server documentation for specific details.
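If in doubt about which configuration is actually in effect (the php.ini you edited is not always the one that gets loaded), a quick diagnostic sketch:

<?php
// log the effective limits and which php.ini was loaded
foreach (array('max_execution_time', 'max_input_time',
         'upload_max_filesize', 'post_max_size', 'memory_limit') as $key) {
    error_log($key . ' = ' . ini_get($key));
}
error_log('loaded php.ini: ' . php_ini_loaded_file());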
First, enable PHP errors by placing the code below at the top of the PHP file.
error_reporting(E_ALL);
ini_set('display_errors', '1'); // E_ALL alone does not display errors if display_errors is off
Then, as Shamil explained in his answer, check the max_execution_time setting of your PHP.
To do that, open your php.ini file, search for max_execution_time, and change it to a larger value, say one hour (3600).
I hope this fixes your issue.
Thank you

MongoDB php driver, script ends when inserting data

I'm playing with MongoDB, trying to import .csv files into the DB, and I'm getting a strange error. Partway through the import the script just ends for no reason, and when I try to run it again nothing happens; the only solution is to restart Apache. I have already set an unlimited timeout in php.ini. Here is the script:
$dir = "tokens/";
$fileNames = array_diff( scandir("data/"), array(".", "..") );
foreach($fileNames as $filename)
if(file_exists($dir.$filename))
exec("d:\mongodb\bin\mongoimport.exe -d import -c ".$filename." -f Date,Open,Next,Amount,Type --type csv --file ".$dir.$filename."");
I have around 7,000 .csv files, and it manages to insert only about 200 before the script ends.
Can anyone help? I would appreciate any help.
You are missing back-end infrastructure. It is just insane to try to load 7,000 files into a database as part of a web request that is supposed to be short-lived and is expected, by some of the software components as well as the end user, to last only a few seconds or maybe a minute.
Instead, create a backend service and command and control for this procedure. In the web app, write each file name to be processed to a database table or even a plain text file on the server and then tell the end user that their request has been queued and will be processed within the next NN minutes. Then have a cron job that runs every 5 minutes (or even 1 minute) that looks in the right place for stuff to do and can create reports of success or failure and/or send emails to tell the original requestor that it is done.
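A minimal sketch of the worker side of that setup (the jobs.txt queue file and the per-file handling are assumptions; the cron entry might be */5 * * * * php /path/to/worker.php):

<?php
// worker.php - claim the queue atomically, then process each entry
$queue = __DIR__ . '/jobs.txt';
if (!file_exists($queue)) {
    exit; // nothing to do
}
$claimed = $queue . '.' . getmypid();
if (!rename($queue, $claimed)) {
    exit; // another worker beat us to it
}
foreach (file($claimed, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $path) {
    // import/scan/zip one file per iteration; report or email on failure
    error_log('processing ' . $path);
}
unlink($claimed);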
If this is intended as an import script and you are set on using PHP, it would be preferable to at least use the PHP CLI environment instead of performing this task through a web server. As it stands, it appears the CSV files are located on the server itself, so I see no reason to get HTTP involved. This would avoid an issue where the web request terminates and abruptly aborts the import process.
For processing the CSV, I'd start by looking at fgetcsv or str_getcsv. The mongoimport command really does very little in the way of validation and sanitization. Parsing the CSV yourself will allow you to skip records that are missing fields, provide default values where necessary, or take other appropriate action. As you iterate through records, you can collect documents to insert in an array and then pass the results on to MongoCollection::batchInsert() in batches. The driver will take care of splitting up large batches into chunks to actually send over the wire in 16MB messages (MongoDB's document size limit, which also applies to wire protocol communication).
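A minimal sketch of that approach using the legacy mongo extension the answer refers to (the CSV path is a placeholder; the field list matches the -f option passed to mongoimport above):

<?php
$mongo = new MongoClient();
$collection = $mongo->selectDB('import')->selectCollection('tokens');

$fields = array('Date', 'Open', 'Next', 'Amount', 'Type');
$batch = array();

$fh = fopen('tokens/somefile.csv', 'r'); // placeholder path
while (($row = fgetcsv($fh)) !== false) {
    if (count($row) !== count($fields)) {
        continue; // skip records with missing fields
    }
    $batch[] = array_combine($fields, $row);
    if (count($batch) === 1000) {
        $collection->batchInsert($batch); // insert in chunks
        $batch = array();
    }
}
if ($batch) {
    $collection->batchInsert($batch);
}
fclose($fh);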

Can PHP (with Apache or Nginx) check HTTP header before POST request finished?

Here is a simple HTML file upload form.
<form enctype="multipart/form-data" action="upload.php" method="POST">
Send this file: <input name="userfile" type="file" />
<input type="submit" value="Send File" />
</form>
And the PHP file is pretty simple too.
<?php
die();
As you can see, the PHP script does nothing on the server side. But when we upload a big file, the process still takes a long time.
I know my PHP code will be executed after the POST process has ended; PHP must prepare the $_POST and $_FILES arrays before the first line of code is parsed.
So my question is: Can PHP (with Apache or Nginx) check HTTP header before POST request finished?
For example, some PHP extensions or Apache modules.
I was told that Python or Node.js can solve this problem; I just want to know whether PHP can or not.
Thanks.
================ UPDATE 1 ================
My goal is to block unexpected file-upload requests. For example, we generate a unique token as the POST target URL (like http://some.com/upload.php?token=268dfd235ca2d64abd4cee42d61bde48&t=1366552126). On the server side, my code looks like:
<?php
define('MY_SALT', 'mysalt'); // the constant name must be a quoted string
// token check: reject missing parameters, stale timestamps and bad signatures
if (!isset($_GET['t']) || !isset($_GET['token'])
        || abs(time() - $_GET['t']) > 3600
        || md5(MY_SALT . $_GET['t']) != $_GET['token']) {
    die('token incorrect or timeout');
}
// process the uploaded file
/* ... */
The code seems to make sense :-P but it cannot save bandwidth as I expected. The reason is that the PHP code runs too late: we cannot check the token before the file upload has finished. If someone uploads a file without the correct token in the URL, my server's network and CPU are still wasted.
Any suggestion is welcome. Thanks a lot.
The answer is always yes because this is Open Source. But first, some background: (I'm only going to talk about nginx, but Apache is almost the same.)
The upload request isn't sent to your PHP backend right away -- nginx buffers the upload body so your PHP app isn't tying up 100MB of RAM waiting for some guy to upload via a 300 baud modem. The downside is that your app doesn't even find out about the upload until it's done or mostly done uploading (depending on client_body_buffer_size).
But you can write a module to hook into the different "phases" internal to nginx. One of the hooks is called when the headers are done. You can write modules in Lua, but it's still fairly complex. There may be a module that will send a "pre-upload" hook out to your script via HTTP, but that's not great for performance.
It's very likely you won't even need a module. The nginx.conf file can do what you need (i.e. route the request to different scripts based on headers, or return different error codes based on headers). See this page for examples of header checking (especially "WordPress w/ W3 Total Cache using Disk (Enhanced)"): http://kbeezie.com/nginx-configuration-examples/
Read the docs, because some common header-checking needs already have directives of their own (e.g. client_max_body_size will reject a request if the Content-Length header is too big).
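For the token scheme from the update, a small nginx.conf sketch along those lines (this only checks that a token argument is present; validating the hash itself would need Lua or a hop to a backend):

server {
    # refuse oversized uploads from the Content-Length header,
    # before the body is accepted
    client_max_body_size 200m;

    location = /upload.php {
        # reject early when the token query argument is missing
        if ($arg_token = "") {
            return 403;
        }
        # ... pass to PHP-FPM / fastcgi as usual ...
    }
}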
There is no solution at the HTTP level, but it is possible at the TCP level. See the answer I chose in another question:
Break HTTP file uploading from server side by PHP or Apache

set_time_limit() timing out

I have an upload form that uploads MP3s to my site. I have some intermittent issues with some users, which I suspect come down to slow upload connections.
Anyway, the first line of code is set_time_limit(0);, which did fix it for SOME users whose connections were taking a while to upload, but some are still getting timed out and I have no idea why.
It says the script has exceeded the maximum execution time of 60 seconds. The script has no loops, so it's not some kind of infinite loop.
The weird thing is that no matter what code is on the first line, it will always say "error on line one" even if that line is set_time_limit(0);. I tried erasing it, and the very first line of code always seems to be the error; it doesn't even give me a hint of why it can't execute the PHP page.
This is an issue only a few users are experiencing, and no one else seems to be affected. Could anyone throw out some ideas as to why this could be happening?
set_time_limit() will only affect the actual execution of the PHP code on the page. You want to set the PHP directive max_input_time, which controls how long the script will accept input (like files) for. The catch is that you need to set this in php.ini: if the default max_input_time is exceeded, execution never reaches the script that is attempting to change it with ini_set().
Sure - a couple of things are noted in the PHP manual.
Make sure PHP is not running in safe mode; set_time_limit() has no effect when safe_mode is enabled.
Second, and this is where I assume your problem lies.....
Note: The set_time_limit() function and the configuration directive max_execution_time only affect the execution time of the script itself. Any time spent on activity that happens outside the execution of the script such as system calls using system(), stream operations, database queries, etc. is not included when determining the maximum time that the script has been running. This is not true on Windows where the measured time is real.
So your stream may be the culprit.
Can you post a little of your upload script? Are you calling a separate file to handle the upload using headers?
Try ini_set('max_execution_time', 0); instead.
