So I am consuming a web service and generating an XML file, and I need to send PDF files across as base64-encoded data instead of just passing the URL.
The XML covers around 500 requests, and I am encoding each PDF as follows:
// Path/URL of the PDF, taken from the property_pdf custom field
$filesrc = get_field('property_pdf');
// Read the whole file into memory, base64-encode it, and split it into 76-character lines
$b64Doc = chunk_split(base64_encode(file_get_contents($filesrc)));
However, when I do this the process just ends up timing out. Is there a better way of encoding a PDF, or am I not doing it correctly?
Note: it processed all 500 properties in less than 5 seconds before I added the above code to the XML generation.
Any help would be much appreciated.
Related
I'm about to implement a REST server (in ASP.NET, although I think that's irrelevant here) where a request is made and it returns the result. However, this result is an .XLSX file that could be a million rows.
If I'm generating a million-row spreadsheet, it's going to take about 10 minutes, and an HTTP request will time out. So what's the best way to handle this delay in the result?
Second, what's the best way to return a very large file as the REST result?
Update: The most common use case is that the REST server is an Azure cloud service web worker (basically IIS on Azure). The client is a PHP web app running on a different server in a different location. The PHP web app needs to send up a report template (generally 25K) and the data, which can be a connection string to a SQL database, or... could be a 500M XML file. So that is the request: an XML file containing the template and datasource(s).
The response is a file - PDF, DOCX, XLSX, PPTX, or HTML. That can be a BLOB inside an XML file, or it can be the file itself. In the case of an error, it must return XML with the error information. The big issue is that it can take 10 minutes to generate this file even if everything goes right: when it's a 1-million-row spreadsheet, it takes time to pull down all that data and populate the created XLSX file. The second issue is that the result is then a really large file.
So even if everything is perfect, there's a big delay and a large response.
I see two options:
Write the file to the response stream during its generation (from the client side this looks like downloading a large file);
Start a file-generation task on the server side and return a task id immediately. Add API methods that allow the client to retrieve the task's status, cancel it, or get its results once it has completed (a rough client-side sketch of this option follows).
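Here is a rough client-side sketch of the second option in PHP; the endpoint names (/tasks, /tasks/{id}, /tasks/{id}/result) and the JSON status payload are assumptions for illustration, not anything prescribed above:

// Submit the generation request and get a task id back (endpoints are hypothetical)
$ch = curl_init('https://reports.example.com/tasks');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, file_get_contents('request.xml'));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$taskId = json_decode(curl_exec($ch), true)['id'];
curl_close($ch);

// Poll until the server reports the task is no longer running
do {
    sleep(15);
    $status = json_decode(file_get_contents("https://reports.example.com/tasks/$taskId"), true);
} while ($status['state'] === 'running');

// Stream the (potentially huge) result straight to disk instead of buffering it in memory
if ($status['state'] === 'completed') {
    $fp = fopen('report.xlsx', 'wb');
    $ch = curl_init("https://reports.example.com/tasks/$taskId/result");
    curl_setopt($ch, CURLOPT_FILE, $fp);
    curl_exec($ch);
    curl_close($ch);
    fclose($fp);
}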
Interesting question. I sure hope you have a stable connection. Anyway, on the client side (in this case PHP), set the timeouts to very high values:
// Allow the PHP script itself to run for up to 10 hours
set_time_limit(3600 * 10);
// Give the cURL transfer ($curlh is the cURL handle) the same window
curl_setopt($curlh, CURLOPT_TIMEOUT, 3600 * 10);
I have a string containing a 2 page base64 encoded PDF file. The second page of the PDF is always garbage. (Terms and conditions from the web service that sent me the PDF.) I would like to be able to modify the PDF to drop the second junk page and re-encode it as base64 data, ideally without writing to the disk. Any suggestions?
Well, 24 hours and a lot of research later, I have come up with the answer. Brad is correct: this should probably have been split into two questions, although decoding the base64 data is very simple and has been covered multiple times on this site, so I will not go into detail on how to do that. The real kicker for me was finding a framework that would let you load the PDF from a string, not from disk. The answer is Zend Framework.
// Decode the base64 string into raw PDF bytes (no temp file needed)
$pdfString = base64_decode($b64Pdf);
// Load the PDF from the string
$pdf = Zend_Pdf::parse($pdfString);
// Remove the unwanted page from the pages array ($id = 1 for the second page; pages are zero-indexed)
unset($pdf->pages[$id]);
// Render the PDF document back to a string
$pdfString = $pdf->render();
// Re-encode the trimmed document as base64
$b64Pdf = base64_encode($pdfString);
I have a web app that builds an associative array of responses from a user as they take a survey. When completed, I want to allow them to press a button, and have their responses download to their browser as a plist.
Using file_put_contents, I am able to write the array to disk and then downloading it from there is pretty easy. I am wondering if it is possible to output the plist file without writing it to disk first? It seems like a lot of overhead and cleanup will be required if I am writing to disk every time.
Can I just take the array and output to a select filename and extension?
Thanks
If you have no other output, then it is possible; see:
serving pdf file using php header produces the pdf source instead of the file
Use the headers and just output your array in whatever format you wish.
Create the necessary headers (as per your other question :-)
And then simply echo the content out after the headers.
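For example, a minimal sketch, assuming $plistXml already holds the serialized plist string (building that string from the responses array is not shown here):

// Hypothetical: $plistXml is the plist XML built from the survey responses
header('Content-Type: application/x-plist'); // content type is an assumption; text/xml is also common
header('Content-Disposition: attachment; filename="survey-responses.plist"');
header('Content-Length: ' . strlen($plistXml));
echo $plistXml;
exit; // make sure nothing else is sent after the file body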
I have been trying for a long time to get the IIHF PDFs (example here: http://stats.iihf.com/Hydra/349/IHM349131_74_3_0.pdf) into a parseable form.
Now I've finally done it, because Google's cache stores an HTML version of them (http://webcache.googleusercontent.com/search?q=cache:http://stats.iihf.com/Hydra/349/IHM349131_74_3_0.pdf), and that can be parsed easily.
The only problem is that Google doesn't cache every PDF they have, and even when a file is cached, it can take days to appear there.
Is there any way to get those HTML versions via any API or even manually?
Edit: I forgot to say that these PDFs have somehow corrupted character maps, so normal PDF-to-HTML converters can't convert them.
I have a script that reads the contents of a remote CSV file, iterates over the lines, and adds the data items to a database. This file has on average about 3000 lines, and therefore 3000 products.
To make a few things clear:
I DO NOT have control over the data in the CSV file beforehand
I DO NOT have access to / control over the manner in which this CSV file is created
The CSV file is dynamically generated once a day, from data in a MySQL database
The problem:
My script only iterates over about 1300 lines, then stops - no errors, nothing. All text is enclosed in double quotes, and the CSV file generally seems correctly formatted. The weird thing is this: if I download the CSV file, open it in Notepad++, change the encoding to UTF-8 WITHOUT BOM, upload that to a test server, and run my script on THAT file, I get the FULL 3000 items and all is fine.
So I am assuming that the people generating this file need to write the data as UTF-8? Because I cannot control that process, I would like to know if there is a fairly simple way to apply the UTF-8 WITHOUT BOM encoding to that file, or at least to read the file contents into a variable and re-encode that.
Many thanks
You can use iconv to change the encoding directly from PHP before you process your file.
Edit: The PHP version of iconv can be used to process the data. If you want to re-encode the file before importing it, you'd have to use the Linux command iconv (assuming a LAMP server) via, for example, exec.
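A minimal sketch of the in-PHP route; the URL is a placeholder and the source encoding is an assumption (adjust it to whatever the feed actually uses):

// Read the remote CSV into a string (placeholder URL)
$contents = file_get_contents('http://example.com/products.csv');
// Strip a UTF-8 byte-order mark if one is present
if (substr($contents, 0, 3) === "\xEF\xBB\xBF") {
    $contents = substr($contents, 3);
}
// Re-encode to UTF-8 (ISO-8859-1 as the source encoding is an assumption)
$contents = iconv('ISO-8859-1', 'UTF-8//TRANSLIT', $contents);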
It sounds like you are trying to do this directly from the other server. Why don't you get the entire file, save it to your own server, do any manipulation to it there, and then do your processing? A rough sketch follows.
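Something along these lines (the URL and local path are placeholders):

// Copy the remote CSV to the local server first
copy('http://example.com/products.csv', '/tmp/products.csv');
// ...fix the encoding / strip the BOM here if needed, then parse locally...
$lines = file('/tmp/products.csv', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
foreach ($lines as $line) {
    $row = str_getcsv($line); // one product per line
    // insert $row into the database
}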