I am using DOMPDF to render HTML tables. However, large tables (say, greater than 150 rows) render extremely slowly. I know this is a known issue for DOMPDF.
Does anyone know of a good workaround?
Maybe using a different HTML-to-PDF converter?
Is there a way to write HTML that is formatted like a table without actually using a table?
Any ideas?
Big, unnecessary CSS files caused the slowness for me. I was using an 8000-line CSS file; an example 1-page PDF rendered in 5.5 seconds.
I removed the CSS links from my HTML template and made a custom CSS file containing only the things I need, i.e. Bootstrap's grid, table, and panel CSS.
Now rendering takes less than 1 second, an over-80% improvement in render time.
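For illustration, here is a minimal sketch of the trimmed-stylesheet approach, assuming the composer package dompdf/dompdf; the stylesheet path and table markup are hypothetical:

```php
<?php
// Minimal sketch: inline only the CSS rules the report actually uses
// instead of linking a full framework stylesheet. The file paths and
// markup here are hypothetical.
require 'vendor/autoload.php';

use Dompdf\Dompdf;

$css  = file_get_contents('css/report-minimal.css'); // hand-trimmed rules only
$rows = '<tr><td>Example</td><td>42</td></tr>';      // built from your data

$html = "<html><head><style>{$css}</style></head>"
      . "<body><table>{$rows}</table></body></html>";

$dompdf = new Dompdf();
$dompdf->loadHtml($html);
$dompdf->setPaper('A4', 'portrait');
$dompdf->render();
$dompdf->stream('report.pdf'); // send the PDF to the browser
```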
Our company allows its clients to view reports via our website. The pages are PHP-based and the data is collected from MySQL. These reports were written a long time ago and include inline CSS. The pages themselves look fine, but the print version is lacking. I want to take the reports and create visually appealing "printable" pages that contain our branding.
I have found three solutions so far.
@media print stylesheets
This is the easiest method, but it does not give me complete layout control. I want landscape mode and need to control where the page breaks occur, so this method has been eliminated from my list of possible solutions. The reports are built by looping through PHP data, so while I can always put a page break after a given element, I can't stop the page from breaking before it gets to the next set of data.
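For reference, a sketch of what this print-stylesheet attempt looks like; the class name is hypothetical. The break-after part works, but forcing landscape and keeping a whole data set together are the unreliable pieces:

```php
<?php
// Sketch of the @media print attempt described above; .data-set is a
// hypothetical wrapper class around each report section.
echo <<<'CSS'
<style>
@media print {
    .data-set {
        page-break-after: always;  /* break after each data set */
        page-break-inside: avoid;  /* best effort only; support is spotty */
    }
    @page { size: landscape; }     /* limited browser support */
}
</style>
CSS;
```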
TCPDF/FPDF
From what I have seen, these classes will give me all of the control I need to customize a PDF. The challenge is that this appears to be a little more advanced than my programming skills allow, and all of the inline CSS contained within the HTML tables may throw off the formatting.
FDF
I am leaning towards this method if I understand it correctly. First I would create a PDF form and define all of the fields to be populated by the MySQL data. Then I would create an FDF file that would populate the form template with the data from the database. It seems easier to me to create a visually pleasing form as a PDF and then populate that form using this method, rather than create the entire PDF from scratch using method 2.
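To make the method concrete, here is a minimal sketch of building an FDF file from a database row; the field names are hypothetical and must match the fields defined in the PDF form:

```php
<?php
// Minimal FDF-building sketch; field names are hypothetical and must
// match the field names defined in the PDF form.
function buildFdf(array $fields, $templatePdf)
{
    $entries = '';
    foreach ($fields as $name => $value) {
        // Escape the characters that are special inside FDF string literals.
        $value    = str_replace(['\\', '(', ')'],
                                ['\\\\', '\\(', '\\)'], $value);
        $entries .= "<< /T ({$name}) /V ({$value}) >>\n";
    }
    return "%FDF-1.2\n"
         . "1 0 obj\n<< /FDF << /Fields [\n{$entries}] /F ({$templatePdf}) >> >>\nendobj\n"
         . "trailer\n<< /Root 1 0 R >>\n%%EOF\n";
}

file_put_contents('row1.fdf', buildFdf(
    ['client_name' => 'Acme Corp', 'total' => '1,234.56'],
    'report-template.pdf'
));
```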
Does it sound like I am on the right track? Are any of these methods "easier" than the others?
Any help is greatly appreciated.
TCPDF gives the most control over each page, which is what I am looking for. It is extremely sensitive to how the HTML is written, but that is the only downside I have found so far.
There's this excellent answer on SO already.
If you're looking for easy, my money is on mPDF. I found it to be the easiest, and essentially an out-of-the-box solution (often with zero server configuration needed).
I think you should try out wkhtmltopdf.
https://code.google.com/p/wkhtmltopdf/
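A sketch of calling it from PHP, assuming the wkhtmltopdf binary is installed and on the PATH; the file paths are hypothetical:

```php
<?php
// Sketch: hand a rendered HTML file to the wkhtmltopdf binary.
// Paths are hypothetical; the binary must be installed on the server.
$in  = escapeshellarg('/tmp/report.html');
$out = escapeshellarg('/tmp/report.pdf');

// --orientation Landscape handles the landscape requirement; wkhtmltopdf
// uses a real WebKit engine, so CSS behaves much like in a browser.
exec("wkhtmltopdf --orientation Landscape {$in} {$out}", $lines, $status);

if ($status !== 0) {
    die('wkhtmltopdf failed: ' . implode("\n", $lines));
}
```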
As for the TCPDF/FPDF pagination issue, see this other question for the solution provided and use the flow in it to sort out your own:
TCPDF / FPDF - Page break issue
I just found this other solution as well and think you'll find it useful:
Convert HTML + CSS to PDF with PHP?
For me personally, FPDF works great to fetch data from my database, insert it into the FPDF class, and dynamically create PDFs for customers.
I see some people want to write HTML/CSS to create PDFs, but you will always have differences, as a browser parses HTML/CSS differently than a PDF library does.
Using FPDF's built-in methods, I have been able to get exactly what I wanted and haven't seen any issues (yet).
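As an illustration, a minimal sketch of that flow; the DSN, query, and column widths are hypothetical:

```php
<?php
// Minimal FPDF sketch: fetch rows and draw the table with Cell() calls
// instead of converting HTML. DSN, query, and widths are hypothetical.
require 'fpdf.php';

$pdo  = new PDO('mysql:host=localhost;dbname=reports', 'user', 'pass');
$stmt = $pdo->query('SELECT name, amount FROM invoices');

$pdf = new FPDF('L', 'mm', 'A4'); // 'L' = landscape
$pdf->AddPage();
$pdf->SetFont('Arial', 'B', 10);
$pdf->Cell(80, 8, 'Name', 1);     // 1 = draw a cell border
$pdf->Cell(40, 8, 'Amount', 1);
$pdf->Ln();

$pdf->SetFont('Arial', '', 10);
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    $pdf->Cell(80, 8, $row['name'], 1);
    $pdf->Cell(40, 8, $row['amount'], 1);
    $pdf->Ln();                    // move to the next table row
}
$pdf->Output('D', 'invoices.pdf'); // 'D' = force download
```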
TCPDF is an open source PDF generator I'm using in a web app of mine for generating reports. These reports are standard pre-made HTML tables, which TCPDF reads and converts into a PDF. Lately the reports have been running to multiple pages and taking much longer than I'd like to load.
The problems
TCPDF is currently generating the PDFs at 0.48 s per page.
TCPDF uses on average 3-4 MB of RAM per page during generation, often exceeding the half-gigabyte PHP memory limit I have set (512 / 4 = 128 pages, my effective page limit), even though the final file will be below 1 MB.
What I've tried
Initially I thought the long wait time might have been down to the database calls for the required info and my PHP script generating the HTML; however, timestamps on each page of the report, plus timestamps printed at the completion of the database calls and HTML generation, rule out that possibility.
The first thing I tried was updating TCPDF, because I was running a 2010 version; this actually increased the load time by a factor of 4 (although the new version was more memory-efficient)!
I've also tried rewriting the HTML, removing all images and putting all the CSS in a style tag at the top (previously it was contained in each HTML tag); however, this actually slowed the generation by 50%!
Question
Has anyone who's had a similar problem found other common edits that improve the load time?
What are my alternatives to HTML tables and TCPDF that would provide a more efficient way of generating the PDF?
Worst case, I think I might have to set up a separate PDF-generating machine that builds every report page by page, merges the pages together, and emails the result when it's done, but that sounds nasty.
I am doing bulk generation of PDF files based on templates, and I ran into big performance issues pretty quickly.
My current scenario is as follows (sketched in code after the list):
get data to be filled from db
create fdf based on single data row and pdf form
write .fdf file to disk
merge the pdf with fdf using pdftk (fill_form with flatten command)
continue iterating over rows until all .pdf's are generated
all the generated files are merged together in the end and the single pdf is given to the client
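In code, the loop looks roughly like this; it is a sketch, where buildFdf() is a hypothetical helper standing in for whatever assembles the FDF string in step 2, and all paths are hypothetical:

```php
<?php
// Sketch of the current per-row loop; buildFdf() is a hypothetical
// helper that turns one data row into an FDF string.
$parts = [];
foreach ($rows as $i => $row) {
    $fdfPath = "/tmp/row{$i}.fdf";
    $pdfPath = "/tmp/row{$i}.pdf";
    file_put_contents($fdfPath, buildFdf($row)); // steps 2-3

    // Step 4: fill the form and flatten so the fields become static text.
    exec('pdftk template.pdf fill_form ' . escapeshellarg($fdfPath)
       . ' output ' . escapeshellarg($pdfPath) . ' flatten');
    $parts[] = $pdfPath;
}

// Final step: merge every per-row PDF and stream the result straight
// to the client ('cat output -' writes to stdout; passthru relays it raw).
header('Content-Type: application/pdf');
passthru('pdftk ' . implode(' ', array_map('escapeshellarg', $parts))
       . ' cat output -');
```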
I use passthru to send the raw output to the client (this saves the time of writing a file), but it is only a small performance improvement. The total operation time is about 50 seconds for 200 records, and I would like to get it down to at most 10 seconds in some way.
The ideal scenario would be to handle all these PDFs in memory rather than writing every single one to a separate file, but then producing the output would be impossible, as I can't pass that kind of data to an external tool like pdftk.
One other idea was to generate one big .fdf file with all those rows, but it looks like that is not allowed.
Am I missing something very trivial here?
I'm thankful for any advice.
PS. I know I could use a good library like PDFlib, but I am only considering open-licensed libraries right now.
EDIT:
I am still trying to figure out the syntax to build an .fdf file with multiple pages using the same PDF as a template; I've spent a few hours on it and couldn't find any good documentation.
After being faced with the same problem for a long time (I wanted to generate my PDFs based on LaTeX), I finally decided to switch to another crude but effective technique:
I generate my PDFs in two steps: first I generate HTML with a template engine like Twig or Smarty; second, I use mPDF to generate PDFs out of it. I tried many other html2pdf frameworks and ended up using mPDF; it's very mature and has been in development for a long time (frequent updates, rich functionality). The benefit of this technique is that you can use CSS to design your documents (mPDF has extensive CSS support), which brings all the usual advantages of CSS (http://www.csszengarden.com) and makes generating dynamic tables very easy.
mPDF parses the HTML tables, looks for the thead and tfoot elements, and repeats them on each page if your tables are bigger than one page. You also have the possibility to define page header and page footer elements with dynamic entities like the page number and so on.
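A sketch of the two-step flow, assuming the composer packages twig/twig and mpdf/mpdf; the template name and data are hypothetical:

```php
<?php
// Two-step sketch: render HTML with Twig, then convert it with mPDF.
// Template name and $rows are hypothetical.
require 'vendor/autoload.php';

$twig = new \Twig\Environment(new \Twig\Loader\FilesystemLoader('templates'));
$html = $twig->render('report.html.twig', ['rows' => $rows]); // step 1

$mpdf = new \Mpdf\Mpdf();
$mpdf->SetHTMLFooter('Page {PAGENO} of {nbpg}'); // dynamic page numbers
$mpdf->WriteHTML($html);                         // step 2: thead/tfoot in the
                                                 // template repeat on each page
$mpdf->Output('report.pdf', \Mpdf\Output\Destination::DOWNLOAD);
```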
I know this detour seems like a workaround, but to be honest, no LaTeX or other PDF engine is as strong and simple as HTML!
Try a different, less complex library like FPDF (http://www.fpdf.org/).
I find it quite good and light.
Always pick libraries that are small and only do what you need them to do.
The bigger the library, the more resources it consumes.
This won't help your multiple-page problem, but I notice that pdftk accepts the - character to mean 'read from standard input'.
You may be able to send the .fdf to the pdftk process via its stdin, in order to avoid having to write the files to disk.
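A sketch of that idea with proc_open; pdftk's - argument works for both input and output. This is fine for small FDFs, though very large outputs would need non-blocking reads to avoid a pipe deadlock:

```php
<?php
// Sketch: pipe the in-memory FDF string to pdftk's stdin and read the
// filled PDF back from stdout, skipping the temporary .fdf file.
$spec = [
    0 => ['pipe', 'r'], // pdftk reads the FDF from stdin ('-')
    1 => ['pipe', 'w'], // the filled, flattened PDF arrives on stdout
];
$proc = proc_open('pdftk template.pdf fill_form - output - flatten', $spec, $pipes);

fwrite($pipes[0], $fdf); // $fdf: the FDF string already built in memory
fclose($pipes[0]);

$pdf = stream_get_contents($pipes[1]); // raw PDF bytes, ready to send
fclose($pipes[1]);
proc_close($proc);
```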
I'm using DomPDF to generate invoices in a project I'm working on.
I've done so several times before and never had any trouble, but since today it has become so slow that the maximum execution time is reached.
I've confirmed that this happens because of: $dompdf->render()
I'm generating some tables to display the data in, but it seems to find this table quite difficult. Does anyone know what the problem could be?
dompdf and my table can be found: http://pastebin.com/G585VQma
(figured I'd put it on pastebin to save some space)
dompdf doesn't support splitting table cells across pages, so you need to change the tables with cellspacing="10" into block elements like divs with the same styling, or at the very least not put the tables with many rows inside those tables.
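As a sketch, the change looks something like this; the inner content and styling are hypothetical:

```php
<?php
// Sketch of the markup change: the outer cellspacing table becomes a
// styled div, so dompdf can break the long inner table across pages.
// $innerTableHtml is a hypothetical placeholder for the many-row table.
$before = '<table cellspacing="10"><tr><td>' . $innerTableHtml . '</td></tr></table>';

$after  = '<div style="margin: 10px; padding: 10px;">' . $innerTableHtml . '</div>';
```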
I've got a report that can generate over 30,000 records if given a large enough date range. On the HTML side of things, a result set this large is not a problem, since I implement a pagination system that limits the viewable results to 100 at a time.
My real problem occurs once the user presses the "Get PDF" button. When this happens, I essentially re-run the portion of the report that prints the data (the results of the report itself are stored in a 'save' table, so there's no need to re-run the data-gathering logic) and store the results in a variable called $html. Keep in mind that this variable now contains 30,000 records of data plus the HTML needed to format it correctly in the PDF. Once I've got this HTML string created, I pass it to TCPDF to try and generate the PDF file for the user. However, instead of generating the PDF file, it just craps out without an error message (the 'Generating PDF...' dialog disappears and the system acts like you never asked it to do anything).
Through tests, I've discovered that the problem lies in the size of the $html variable being passed in. If the report is under 3K records, it works fine. If it's over that, the HTML side of the report will print, but not the PDF.
Helpful Info
PHP 5.3
TCPDF for PDF generation (also tried PS2PDF)
Script Memory Limit: 500 MB
How would you guys handle this scale of data when generating a PDF of this size?
Here is how I solved this issue: I noticed that some of the strings in my HTML output had slight encoding issues. I ran htmlentities on those particular strings as I queried them from the database, and that cleared up the problem.
I don't know if this was what was causing your problem, but my experience was very similar: when I was trying to output a large HTML table, with about 80,000 rows, TCPDF would display the page header but nothing table-related. The behaviour was the same with different sets of data and different table structures.
After many attempts I started adding my own pagination: every 15 table rows, I would break the page and add a new table on the following page. That's when I noticed that every once in a while I would get blank pages between a lot of full and correct ones, realised that there must be a problem with those particular subsets of data, and discovered the encoding issue. It may be that you have something similar and TCPDF is not making it clear what the problem is.
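A sketch of the fix; the column names are hypothetical, and $stmt stands for an already-executed PDO statement:

```php
<?php
// Sketch: sanitize every string from the database before it is placed
// in the HTML handed to TCPDF. Column names are hypothetical; $stmt is
// an already-executed PDO statement.
$cells = '';
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    $name  = htmlentities($row['name'],  ENT_QUOTES, 'UTF-8');
    $notes = htmlentities($row['notes'], ENT_QUOTES, 'UTF-8');
    $cells .= "<tr><td>{$name}</td><td>{$notes}</td></tr>";
}
$html = "<table>{$cells}</table>";
```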
Are you using the writeHTML method?
I went through the performance recommendations here: http://www.tcpdf.org/performances.php
It says "Split large HTML blocks in smaller pieces;".
I found that if my blocks of HTML went over 20,000 characters the PDF would take well over 2 minutes to generate.
I simply split my HTML into smaller blocks and called writeHTML for each block, and it improved dramatically. A file that wouldn't generate within 2 minutes before now takes 16 seconds.
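A sketch of that splitting, chunked here by rows rather than by raw character count for simplicity; the chunk size and markup are hypothetical:

```php
<?php
// Sketch: feed TCPDF one small table per chunk of rows instead of one
// giant writeHTML() call. Chunk size and markup are hypothetical.
require_once 'tcpdf.php';

$pdf = new TCPDF();
$pdf->AddPage();

foreach (array_chunk($rows, 50) as $chunk) {
    $html = '<table border="1">';
    foreach ($chunk as $row) {
        $html .= '<tr><td>' . $row['name'] . '</td></tr>';
    }
    $html .= '</table>';
    $pdf->writeHTML($html, true, false, true, false, ''); // one small block
}
$pdf->Output('report.pdf', 'D');
```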
TCPDF seems to be a native implementation of PDF generation in PHP. You may get better performance using a compiled library like PDFlib or a command-line app like htmldoc; the latter has the best chance of generating a large PDF.
Also, are you breaking the output PDF into multiple pages? I.e. does TCPDF know to take a single HTML document and cut it into multiple pages, or are you generating multiple HTML files for it to combine into a single PDF document? That may also help.
I would break the PDF into parts, just like the pagination.
1) Have a "Get PDF" button on every paginated HTML page and allow downloading of the records from that HTML page only (see the sketch below).
2) Limit the maximum number of records that can be downloaded. If the maximum is reached, split the PDF and let the user download multiple PDFs.
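A sketch of option 1; the table name and page size are hypothetical, and $pdo is an existing PDO connection. The "Get PDF" button passes the current page number, and only that page's rows are fetched and rendered:

```php
<?php
// Sketch of option 1: generate the PDF from the current HTML page's
// records only. Table name and page size are hypothetical; $pdo is an
// existing PDO connection.
$perPage = 100;
$page    = isset($_GET['page']) ? max(1, (int) $_GET['page']) : 1;
$offset  = ($page - 1) * $perPage;

$stmt = $pdo->prepare('SELECT * FROM report_rows LIMIT :lim OFFSET :off');
$stmt->bindValue(':lim', $perPage, PDO::PARAM_INT); // PARAM_INT keeps LIMIT unquoted
$stmt->bindValue(':off', $offset,  PDO::PARAM_INT);
$stmt->execute();
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
// ...build the HTML for just these rows and hand it to TCPDF as usual.
```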