Echo script progress and download CSV - PHP

I'm having problems sending an array to another PHP page. We send an array from one page to another to generate a CSV file that has been transformed from XML: we take an 800 MB XML file and transform it down to a 20 MB CSV file. There is a lot of information in it that we remove, and the process runs for 30 minutes.
Anyway, we periodically use a function to output the progress of the transformation in the browser:
function outputResults($message) {
    ob_start();
    echo $message . "<br>";
    ob_end_flush(); // flush the buffer's contents and close it
    flush();        // push PHP's output through to the browser
}
$masterArray contains all the information in an associative array we have parsed from the XML.
At the end we send the array ($masterArray) from index.php to another PHP file called create_CSV_file.php.
Originally we used include('create_CSV_file.php') within index.php, but because of the headers needed for the CSV download, it was giving us the message
Warning: Cannot modify header information - headers already sent
So we started looking at pushing the array as below.
echo "<a href='create_CSV_file.php?data=$masterArray'>**** Download CSV file ***</a>";
I keep getting this error message with the above echo:
Notice: Array to string conversion
What is the best way to show echo statements from the server while the script is running, and then let the user download the resulting CSV at the end?

Ok, so first of all, putting data in a URL (GET) has some severe limitations: older versions of IE capped URLs at around 2,083 characters, and some proxies and other software impose their own limits.
I'm sure you've heard this before, but if not: you should not be running a process that takes more than a couple of seconds (at most!) from a web server. Web servers are not optimised for it. You definitely don't want to be passing megabytes of data to the client just so they can send it back to the server!
How about something like this...
User makes a web request (and uploads the original data?) to the server
Server allocates an ID for the request (random? database?) and creates a file on disk using the ID as a name (tmp directory, or at least outside web root)
Server launches a new process (PHP?) to transform the data. As it runs, it can update the database with progress information
During this time, the user can check progress by making a sequence of AJAX requests (or just refreshing a page which shows latest status). Lots more control over appearance now
When the processing is complete, server-side process writes results to file, updates database to indicate completion.
Next time user checks status, redirect them to a PHP file that takes the ID and will read the file from disk / stream it to the user.
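To make the shape of this concrete, here is a minimal sketch of steps 2, 3 and 6 (not from the original answer; the file layout, script names and status-file format are all assumptions):
// start_job.php - allocate an ID and launch the transform in the background
$jobId = bin2hex(random_bytes(16));            // random ID for this job
$jobDir = __DIR__ . '/../jobs';                // outside the web root
file_put_contents("$jobDir/$jobId.status", "queued");
// transform.php is assumed to update the status file as it runs and to
// write the finished CSV next to it when done (Linux shell syntax).
exec('php ' . escapeshellarg(__DIR__ . '/transform.php') . ' ' . escapeshellarg($jobId) . ' > /dev/null 2>&1 &');
echo json_encode(['job' => $jobId]);           // page polls status.php?job=... via AJAX

// status.php - report progress, or hand over the finished file
$jobId = preg_replace('/[^a-f0-9]/', '', $_GET['job'] ?? '');
$status = @file_get_contents(__DIR__ . "/../jobs/$jobId.status");
if ($status === 'done') {
    header('Content-Type: text/csv');
    header('Content-Disposition: attachment; filename="result.csv"');
    readfile(__DIR__ . "/../jobs/$jobId.csv"); // stream the CSV to the user
} else {
    echo json_encode(['status' => $status !== false ? $status : 'unknown']);
}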
Benefits:
No long-running http requests
No data being passed back/forth to client in intermediate stage
Much more control over how users see progress
Depending on the transformation you're applying / the detail stored in the database, you may be able to recover interrupted jobs (server failure)
It does have one downside: you need to clean up after yourself. The files you created on disk need to be deleted; however, you've got a complete audit of all files in the database, and deleting anything over x days old would be trivial.
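For the cleanup, a sketch of the kind of cron-driven script that would do it (the path and the 7-day threshold are assumptions):
// cleanup.php - run from cron; deletes job files older than a chosen age
$cutoff = time() - 7 * 86400; // 7 days, as an example
foreach (glob('/var/jobs/*') as $file) {
    if (filemtime($file) < $cutoff) {
        unlink($file);
    }
}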

Related

How to get the echoed HTML div result of PHP code saved to a PNG file on the server (server-side only)?

Just as a log file is written by a PHP script via fwrite($fp, ---HTML---),
I need to save an HTML div as a PNG file on the server.
The client browser only starts the PHP script;
the PNG file should be saved on the server without any client interaction.
Is there a way to do this?
All the posts (over a thousand) I have been reading are about html2canvas,
which (as I understand it) operates client-side.
I know the rendering of HTML (and divs) is normally done by the browser [client-side],
but is there a way to do it in PHP on the server side?
Reason:
Until now the procedure has been to print the div via the browser on paper, twice:
one copy for the customer,
one to scan in again, save on the server as a picture, and throw in the wastepaper basket.
This happens more than 500 times a day...
For security reasons it needs to be a saved picture on the server.
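For what it's worth, one common server-side route (not from this post) is to shell out to a headless HTML renderer such as the wkhtmltoimage command-line tool, assuming it can be installed on the server. A minimal sketch, with hypothetical paths and variables:
// Render the div's HTML to a PNG on the server via wkhtmltoimage
$html = '<div style="width:600px">' . $divContent . '</div>'; // $divContent assumed built earlier
$tmpHtml = sys_get_temp_dir() . '/div-' . uniqid() . '.html';
file_put_contents($tmpHtml, $html);
$pngPath = __DIR__ . '/archive/' . date('Ymd-His') . '.png';
exec(sprintf('wkhtmltoimage %s %s', escapeshellarg($tmpHtml), escapeshellarg($pngPath)), $out, $code);
unlink($tmpHtml);
if ($code !== 0) {
    // rendering failed - log and handle as appropriate
}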

Least disruptive way to download a file using PHP, without disrupting ongoing updates to it

I am working on a website where it would be useful to allow a user the option of downloading the contents of a file, even when it is going to be updated by another user at the same time or later.
My problem is that the solution I've tried so far allows downloading, but disrupts any later updating of the file. I don't think I can present the updating code concisely (it is spread over multiple files), except to say that it works by AJAXing the data (and I'm not sure why that would cause this problem). In case it's relevant, this is a file that gets updated multiple times.
When I use FireFTP I can download the file without disrupting this process, which makes me optimistic there's a PHP solution. I am currently downloading the data by AJAXing the file contents to the page the "downloading user" is on. The code for this (within PHP) is:
$file_contents = file_get_contents($_POST['file']); // file address comes through the AJAX POST request
echo $file_contents; // to access the content client side
Is there another way to access the text/content within a file without any unintended consequences on other server processing of it?
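One approach worth trying (a sketch, not from the thread; it assumes the updating code can also be made to take an exclusive flock() while it writes) is to read under a shared lock, so a read never interleaves with a half-finished update:
$path = $_POST['file'];                       // NOTE: validate/whitelist this path in real code
$fp = fopen($path, 'rb');
if ($fp !== false && flock($fp, LOCK_SH)) {   // shared (read) lock
    $file_contents = stream_get_contents($fp);
    flock($fp, LOCK_UN);
    fclose($fp);
    echo $file_contents;                      // to access the content client side
}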

Progress bar for an application

I have implemented a SaaS scenario on my Windows server: a user uploads a file on a website, and Fetch.exe, a C# application hosted on the server, takes the uploaded file as input, runs, and generates an output for the user to download.
So in my PHP, I use exec() to wrap Fetch.exe:
exec("Fetch.exe " . $inputFile . " > " . $outputFile);
Uploading and executing (i.e., Fetch.exe) may take more than a few seconds, and I want to show the user that it is processing and everything is going fine.
I have found some threads that discuss how to show a progress bar for the upload, but does anyone know what I could do to show the approximate progress of Fetch.exe? Do I have to split it into smaller applications and use several exec() calls?
You could supply Fetch.exe with a randomly generated ID from PHP, for example from the uniqid() function. Fetch.exe will create a file called <uniqueid>.txt with a progress percentage. From the browser, you could call another script with that unique ID to get the contents of that .txt file. In order, it would be something like this:
User uploads the file to PHP
PHP:
handles the uploaded file
creates a uniqueID
starts Fetch.exe with the file and the uniqueID
returns a page with the uniqueID embedded
The following happens in parallel:
Fetch.exe creates a text file called /progress/<uniqueid>.txt and logs its progress into it.
The browser does an AJAX call to http://example.com/progress/uniqueid.txt and shows the progress to the user
And finally, when the progress reaches 100%, the browser downloads the file. The only thing you might want to add is pruning of the progress files after a while - say, deleting all files older than 10 minutes every hour.
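A sketch of the PHP side of this flow (the names and paths are hypothetical; "start /B" is the Windows way to detach the process so PHP can return immediately):
$id = uniqid('job', true); // unique ID embedded in the returned page
move_uploaded_file($_FILES['data']['tmp_name'], "uploads/$id.in");
// Detach Fetch.exe so this request can return at once; Fetch.exe is assumed
// to write its progress percentage to progress/$id.txt as it runs.
pclose(popen("start /B Fetch.exe uploads/$id.in progress/$id.txt", 'r'));
echo $id; // the browser then polls /progress/<id>.txt via AJAX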
Your PHP program needs a way to know the state of the subprocess (the Fetch.exe application), so Fetch.exe needs to send information about its processing state. The most natural way to do this is through standard output (the output a program provides when you run it from cmd).
Knowing this, you can run a subprocess and keep reading its output from PHP using popen().
And second, you can use PHP's ob_flush() and flush() with the onmessage JavaScript event to establish communication between your client page and your running PHP script; here you can find a good tutorial on how to do this.
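A minimal sketch of that combination, assuming Fetch.exe prints progress lines (e.g. "42") to standard output as it works; the onmessage event on the browser side would come from an EventSource pointed at this script:
header('Content-Type: text/event-stream'); // server-sent events, read via onmessage
header('Cache-Control: no-cache');
$handle = popen('Fetch.exe ' . escapeshellarg($inputFile), 'r');
if ($handle !== false) {
    while (($line = fgets($handle)) !== false) {
        echo 'data: ' . trim($line) . "\n\n"; // one SSE message per progress line
        ob_flush(); // flush PHP's output buffer...
        flush();    // ...and push it to the browser
    }
    pclose($handle);
}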

If a REST request can take 10 minutes

I'm about to implement a REST server (in ASP.NET, although I think that's irrelevant here), where a request is made and the result is returned. However, the result is an .XLSX file that could be a million rows.
If I'm generating a million-row spreadsheet, it's going to take about 10 minutes, and an HTTP request will time out. So what's the best way to handle this delay in the result?
Second, what's the best way to return a very large file as the REST result?
Update: The most common use case is that the REST server is an Azure cloud service web worker (basically IIS on Azure). The client is a PHP web app running on a different server in a different location. The PHP web app needs to send up a report template (generally 25K) and the data, which can be a connection string to a SQL database, or... could be a 500 MB XML file. So that is the request: an XML file containing the template and data source(s).
The response is a file - PDF, DOCX, XLSX, PPTX, or HTML. That can be a BLOB inside an XML file, or it can be the file itself. In the case of an error it must return XML with the error information. The big issue is that it can take 10 minutes to generate this file even when everything goes right: when it's a 1-million-row spreadsheet, it takes time to pull down all that data and populate the XLSX file. The second issue is that this is then a really large file.
So even if everything is perfect, there's a big delay and a large response.
I see two options:
Write the file to the response stream during its generation (from the client side this looks like downloading a large file);
Start the file generation task on the server side and return a task ID immediately. Add API methods that let the client retrieve the task status, cancel the task, or get the results (once the task has completed).
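A sketch of what the second option looks like from the PHP client's side (the URLs and JSON fields here are invented for illustration):
// Start the task, then poll until it finishes
$ch = curl_init('https://api.example.com/reports');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $requestXml); // template + datasource
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$task = json_decode(curl_exec($ch), true);         // e.g. {"id": "..."}
curl_close($ch);

do {
    sleep(15);
    $status = json_decode(file_get_contents("https://api.example.com/tasks/{$task['id']}"), true);
} while ($status['state'] === 'running');

if ($status['state'] === 'completed') {
    copy("https://api.example.com/tasks/{$task['id']}/result", 'report.xlsx');
}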
Interesting question - I sure hope you have a stable connection. Anyway, on the client side (in this case, PHP), set the timeouts to very high values. In PHP:
set_time_limit(3600 * 10);
curl_setopt($curlh, CURLOPT_TIMEOUT, 3600 * 10);
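Building on that, if the response itself is huge, you can also point curl at a file handle so the body streams to disk instead of into memory (a sketch; the URL and paths are placeholders):
set_time_limit(3600 * 10);
$out = fopen('/tmp/report.xlsx', 'wb');      // hypothetical destination
$curlh = curl_init('https://example.com/api/report');
curl_setopt($curlh, CURLOPT_TIMEOUT, 3600 * 10);
curl_setopt($curlh, CURLOPT_FILE, $out);     // write the body straight to the file
curl_exec($curlh);
curl_close($curlh);
fclose($out);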

PHP: how to POST a 3-gigabyte-long string via curl, or output it in the browser?

I'm building an online database dumping tool.
When I output the data, PHP waits until it can calculate the full length of the dump before sending anything, which may leave the user unsure whether the API is working or not.
How can I send the response in chunks?
I tried adding:
header('Transfer-Encoding: chunked');
But Chrome browser couldn't open the page with it.
What do I need to do?
Thanks!
Answer: you should encode the data before sending it.
function chunk_encoding($chunk) {
    // HTTP/1.1 chunk framing: hex length, CRLF, data, CRLF
    printf("%x\r\n%s\r\n", strlen($chunk), $chunk);
    ob_flush(); // flush PHP's output buffer first...
    flush();    // ...then push the output to the client
}
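A usage sketch for the function above (hedged: this only behaves as intended when PHP's output buffering and the web server's compression are configured not to re-chunk the output, which may be why Chrome choked; next_dump_row() is a made-up stand-in for the dump loop):
header('Transfer-Encoding: chunked');
ob_start();                       // chunk_encoding() flushes this buffer per chunk
while ($row = next_dump_row()) {  // hypothetical producer of dump data
    chunk_encoding($row);
}
printf("0\r\n\r\n");              // terminating zero-length chunk per HTTP/1.1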
It isn't the smartest approach to flood the user with 3 GB of data. Assuming the user has a DSL connection (let's say 6 Mbit), they would have to wait a horrible ~69 minutes until all the data can be used (e.g. copied). In addition, they can't close the tab in which the data is loaded, otherwise the data is lost. And finally, any browser will grow into a memory-consuming monster if it is forced to display this amount of data.
A better solution is to generate a file on the server and let the user download it via a link. This way the user can download the file in the background (perhaps with a download manager) and work with the data locally.
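A sketch of the download side of that (the storage path and file naming are assumptions):
// download.php?f=<id> - serve the previously generated dump file
$id = preg_replace('/[^a-zA-Z0-9]/', '', $_GET['f'] ?? '');
$path = "/var/dumps/$id.sql.gz";             // hypothetical storage location
header('Content-Type: application/gzip');
header('Content-Length: ' . filesize($path));
header('Content-Disposition: attachment; filename="dump.sql.gz"');
readfile($path);                             // streams the file without loading it all into memory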
