How to render a dynamic HTML/PHP page in a PDF programmatically - php

I've got a PHP page that I generate, forming an expense report for clients. I've found that the clients end up file->saving the output, emailing it around, and printing it. Since emailing HTML and PHP isn't really ideal (e.g., images are lost, formatting is wonky), I'd like to render the page to a pdf and stream that to them.
Now, I've thought of good ol' "Print as PDF," but not all clients have that ability. I've looked into doing it myself with PHP PDFLib, but that gets pretty hairy. I've looked into DOMPDF and DocRaptor, but they attempt to parse the DOM and generate a pdf, which doesn't work well for more complex designs.
Here's the tantalizing thing: I use a Mac, and print->preview on the Mac does exactly what I want. It takes the pixels of the rendered page and generates a pdf out of it. If only I could harness that power! Is there a way? What can I do?

This library seems to have the right ingredients:
http://www.rustyparts.com/pdf.php
1) I've never used it, though, so I can't say much about it (it chains curl -> html2ps -> ps2pdf).
2) It may also be easier to write a shell script (curl, html2ps, ps2pdf) and execute it from PHP, if that's an option (though it's not the best practice, security-wise).
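For what it's worth, here is a rough sketch of option 2 done directly from PHP instead of a separate shell script. It assumes curl, html2ps and ps2pdf (Ghostscript) are installed and on the PATH, and that html2ps reads from standard input when given no file argument. The function names and the example URL are made up for illustration. Escaping every argument with escapeshellarg() addresses the worst of the security concern.

```php
<?php
// Sketch only: builds the curl -> html2ps -> ps2pdf pipeline as one shell command.
// Assumes curl, html2ps and ps2pdf (Ghostscript) are installed and on the PATH.

/**
 * Build the pipeline command. Every caller-supplied value goes through
 * escapeshellarg() so it cannot inject extra shell syntax.
 */
function build_html_to_pdf_pipeline(string $url, string $pdfPath): string
{
    return sprintf(
        'curl -s %s | html2ps | ps2pdf - %s',
        escapeshellarg($url),     // page to fetch
        escapeshellarg($pdfPath)  // the "-" before it makes ps2pdf read PostScript from stdin
    );
}

/** Run the pipeline; true means the last command in the pipe (ps2pdf) exited with 0. */
function render_url_to_pdf(string $url, string $pdfPath): bool
{
    exec(build_html_to_pdf_pipeline($url, $pdfPath), $output, $status);
    return $status === 0;
}

// Print the command that would be run (hypothetical URL and path).
echo build_html_to_pdf_pipeline('http://example.com/report.php?id=7', '/tmp/report.pdf'), PHP_EOL;
```

Note that the exit status only reflects ps2pdf, so a failed download can still yield an (empty or broken) PDF. Check the file size as well if that matters.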

Related

Bulk template based pdf generation in PHP using pdftk

I am doing a bulk generation of pdf files based on templates and I ran into big performance issues pretty fast.
My current scenario is as follows:
1) get the data to be filled from the db
2) create an fdf based on a single data row and the pdf form
3) write the .fdf file to disk
4) merge the pdf with the fdf using pdftk (fill_form with the flatten command)
5) continue iterating over rows until all .pdf's are generated
6) merge all the generated files together at the end and give the single pdf to the client
I use passthru to give the raw output to the client (which saves the time of writing a file), but this is just a small performance improvement. The total operation time is about 50 seconds for 200 records, and I would like to get it down to 10 seconds or less in some way.
The ideal scenario would be to handle all these pdfs in memory instead of writing every single one of them to a separate file, but then producing the output seems impossible, as I can't pass that kind of data to an external tool like pdftk.
One other idea was to generate one big .fdf file with all those rows, but it looks like that is not allowed.
Am I missing something very trivial here?
I'm thankful for any advice.
PS. I know I could use some good library like pdflib but I am considering only open licensed libraries now.
EDIT:
I am trying to figure out the syntax for building an .fdf file with multiple pages using the same pdf as a template. I've spent a few hours on it and couldn't find any good documentation.
After being faced with the same problem for a long time (I wanted to generate my PDFs with LaTeX), I finally decided to switch to another crude but effective technique:
I generate my PDFs in two steps: first I generate HTML with a template engine like Twig or Smarty; second, I use mPDF to generate PDFs out of it. I tried many other HTML-to-PDF frameworks and ended up using mPDF; it's very mature and has been developed for a long time (frequent updates, rich functionality). The benefit of this technique: you can use CSS to design your documents (mPDF supports a large part of CSS), which brings the usual CSS benefits (http://www.csszengarden.com), and you can generate dynamic tables very easily.
mPDF parses the HTML tables, looks for the thead and tfoot elements, and puts them on each page if a table is longer than one page. You can also define page header and page footer elements with dynamic entities like the page number and so on.
I know, this detour seems like a workaround, but to be honest, no LaTeX or PDF engine, whatever it is, is as strong and simple as HTML!
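A minimal sketch of the two-step approach, assuming mPDF was installed with Composer (composer require mpdf/mpdf) into ./vendor. The template step is reduced to a plain PHP function here rather than Twig or Smarty, and the row data and column names are invented. If mPDF isn't available, the script just prints the HTML.

```php
<?php
// Sketch only. Assumes mPDF was installed with Composer into ./vendor;
// the report rows and column names are made up for illustration.

/** Step 1: build the report HTML, with the header row inside <thead>. */
function build_report_html(array $rows): string
{
    $body = '';
    foreach ($rows as $row) {
        $body .= sprintf(
            "<tr><td>%s</td><td class=\"num\">%s</td></tr>\n",
            htmlspecialchars($row['item']),
            number_format($row['amount'], 2)
        );
    }
    return "<style>td.num { text-align: right; } th { background: #eee; }</style>\n"
        . "<table>\n<thead><tr><th>Item</th><th>Amount</th></tr></thead>\n"
        . "<tbody>\n" . $body . "</tbody>\n</table>";
}

$html = build_report_html([
    ['item' => 'Hotel', 'amount' => 420.5],
    ['item' => 'Taxi & tolls', 'amount' => 38],
]);

// Step 2: hand the HTML to mPDF, but only if the library is actually present.
if (is_file(__DIR__ . '/vendor/autoload.php')) {
    require __DIR__ . '/vendor/autoload.php';
}
if (class_exists('Mpdf\\Mpdf')) {
    $mpdf = new \Mpdf\Mpdf();
    // {PAGENO} and {nbpg} are mPDF placeholders for the current and total page number.
    $mpdf->SetHTMLFooter('<div style="text-align: center">Page {PAGENO} of {nbpg}</div>');
    $mpdf->WriteHTML($html);
    $mpdf->Output('report.pdf', 'F'); // 'F' saves to a file instead of sending to the browser
} else {
    echo $html, PHP_EOL;
}
```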
Try a different, less complex library like FPDF (http://www.fpdf.org/).
I find it quite good and lightweight.
Always find libraries that are small and only do what you need them to do.
The bigger the library the more resources it consumes.
This won't help your multiple-page problem, but I notice that pdftk accepts the - character to mean 'read from standard input'.
You may be able to send each .fdf to the pdftk process via its stdin, in order to avoid having to write them to disk.
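Something along these lines might work: build the FDF as a string, pipe it into pdftk's stdin, and read the filled PDF back from its stdout. This is a sketch under some assumptions: pdftk is on the PATH, the field names (invented here) match the ones in your template, and the values are plain ASCII/Latin-1 (other text would need extra encoding that isn't shown). All function names are hypothetical.

```php
<?php
// Sketch only. Assumes pdftk is on the PATH and that the field names
// match the form fields in the template PDF.

/** Escape a value for use inside an FDF/PDF literal string "( ... )". */
function fdf_escape(string $value): string
{
    return strtr($value, ['\\' => '\\\\', '(' => '\\(', ')' => '\\)']);
}

/** Build a minimal FDF document for one data row, entirely in memory. */
function build_fdf(array $fields): string
{
    $entries = '';
    foreach ($fields as $name => $value) {
        $entries .= sprintf(
            "<< /T (%s) /V (%s) >>\n",
            fdf_escape((string) $name),
            fdf_escape((string) $value)
        );
    }
    return "%FDF-1.2\n1 0 obj\n<< /FDF << /Fields [\n" . $entries
        . "] >> >>\nendobj\ntrailer\n<< /Root 1 0 R >>\n%%EOF\n";
}

/** "fill_form -" reads the FDF from stdin; "output -" writes the PDF to stdout. */
function build_pdftk_command(string $templatePdf): string
{
    return sprintf('pdftk %s fill_form - output - flatten', escapeshellarg($templatePdf));
}

/** Fill one form without writing the FDF to disk; returns PDF bytes, or null on failure. */
function fill_form_in_memory(string $templatePdf, array $fields): ?string
{
    // stdin and stdout are pipes; stderr is inherited from the PHP process.
    $spec = [0 => ['pipe', 'r'], 1 => ['pipe', 'w']];
    $proc = proc_open(build_pdftk_command($templatePdf), $spec, $pipes);
    if (!is_resource($proc)) {
        return null;
    }
    fwrite($pipes[0], build_fdf($fields));
    fclose($pipes[0]);                      // closing stdin signals end of input
    $pdf = stream_get_contents($pipes[1]);
    fclose($pipes[1]);
    return proc_close($proc) === 0 ? $pdf : null;
}
```

Usage would be something like `$pdf = fill_form_in_memory('form.pdf', ['name' => 'Ann']);` per row. This removes the .fdf disk writes but still starts one pdftk process per record, so it may not get you all the way down to your target time.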

PHP Is it possible to convert html to Images directly

Is it possible to convert HTML content, including styles, to an image with PHP? Please guide me.
Well, you need to render it first. For rendering you need something like a browser that can handle JS, CSS, etc. After rendering, you capture the image. PHP can't do this on its own, but you can achieve it by creating a PHP extension that uses a browser engine to do the task for you. The extension acts as a bridge.
There are many browser engines. Among them you can use WebKit. It renders quite fast, and I prefer it.
One more thing to know: this extension will take a lot more CPU and memory compared to a normal PHP script.

how mature is HTML+CSS now in relation to generating reports for printing?

I'm considering creating all the reports of a series of desktop business apps directly to html. Most of the reports are tables (maybe compound reports), headers, footers, etc. (no images, vector graphics, etc.).
After a search on SO, I've read lots of posts regarding problems with page breaks and things like that (I don't need pixel positioning at all, but I do need control over page breaks).
For example, let's say I have a big table with currency values and I need the last row of the table on each page to display the running totals at that point. Is that feasible to do easily, or will I run into lots of trouble?
What technologies can help me here?
HTML5
Javascript
CSS
PHP libraries
jQuery
Some notes:
The html will be displayed with the Chrome or Firefox engine embedded, so the differences between browsers are not a problem for me.
I can have the PHP preprocessor embedded if that helps to generate the reports more easily. I'm just looking for the best technology at hand to do the job well.
I'm tired of report generators with "WYSIWYG" designers (Crystal Report, FastReport, ReportBuilder, etc.)
Thanks!
We made the exact move you're thinking about almost a year ago and haven't looked back. Most communication with our client is over the web, so it's been a perfect fit. They can view html outputs easily on our website, and can generate pdf's of the page (server side) whenever necessary. The program we use for pdf conversion is a free, easy-to-use, open-source project called wkhtmltopdf.
Where we are is great, but getting here was difficult.
Deciding which pdf engine to use was a long, painful process. The short of it is that HTML is for viewing pages on the internet, not for viewing pages on paper. Page-breaks will be the bane of your existence in this game -- you literally have to measure each page and create your own clean-looking breaks for every single report (otherwise, html-to-pdf converters will just keep rendering the document onto the next page as if it encountered no page-break at all). Further complicating the matter is that every html-to-pdf engine out there handles this sh*t differently, and you'll have to write a tailored test for each one to see if it meets your individual needs.
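To give an idea of what "creating your own breaks" can look like for the running-totals case in the question, here is a rough sketch. It splits the rows into fixed-size pages, ends each page's table with a running-total row, and forces a break before every table after the first using the CSS2 page-break properties. The rows-per-page number is an assumption you would have to measure for your own fonts, margins and paper size, and the function name is made up.

```php
<?php
// Sketch only: $rowsPerPage must be a positive number that you have measured
// for your own layout; nothing here detects the page height automatically.

/**
 * Split amounts into fixed-size pages. Each page is its own table ending
 * with a running-total row; every table after the first starts a new page.
 */
function build_paged_report(array $amounts, int $rowsPerPage): string
{
    $html = "<style>\n"
        . "table.break { page-break-before: always; }\n"
        . "tr { page-break-inside: avoid; }\n"
        . "</style>\n";
    $running = 0.0;
    foreach (array_chunk($amounts, $rowsPerPage) as $index => $page) {
        $html .= ($index === 0)
            ? "<table class=\"page\">\n"
            : "<table class=\"page break\">\n";
        foreach ($page as $amount) {
            $running += $amount;
            $html .= sprintf("<tr><td>%s</td></tr>\n", number_format($amount, 2));
        }
        $html .= sprintf(
            "<tr class=\"total\"><td>Running total: %s</td></tr>\n</table>\n",
            number_format($running, 2)
        );
    }
    return $html;
}

// Four rows at three rows per page gives two tables (two printed pages).
echo build_paged_report([10, 20.5, 30, 40], 3);
```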
Now, the good news:
You can save yourself a lot of trouble by heeding my advice and going with wkhtmltopdf for your finalized reporting outputs. This little program is simply amazing -- it uses a webkit engine, renders CSS/javascript accurately, has header/footer control, optionally creates a table-of-contents page, and (most importantly) consistently produces excellent looking pdf's without having to customize your code base. It also has a variety of great command line switches, and it is very, very fast. I say again: it is very, very fast.
Best of all, it's a command line tool that can be used in batch processing. And did I mention that it's really, really fast?
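As a rough illustration of calling it from PHP, the sketch below builds a wkhtmltopdf command with a header and a page-numbered footer. It assumes the wkhtmltopdf binary is on the PATH. The URL, output path, margin and function name are placeholders, not anything from our setup.

```php
<?php
// Sketch only. Assumes the wkhtmltopdf binary is on the PATH; the URL and
// output path are placeholders.

/** Build a wkhtmltopdf command line with every argument shell-escaped. */
function build_wkhtmltopdf_command(string $source, string $pdfPath): string
{
    $args = [
        '--page-size', 'A4',
        '--margin-top', '15mm',
        '--header-left', '[title]',                    // replaced with the page's <title>
        '--footer-center', 'Page [page] of [topage]',  // replaced on every page
        $source,                                       // URL or local HTML file
        $pdfPath,                                      // where the PDF is written
    ];
    return 'wkhtmltopdf ' . implode(' ', array_map('escapeshellarg', $args));
}

$cmd = build_wkhtmltopdf_command('http://localhost/report.php?id=7', '/tmp/report.pdf');
echo $cmd, PHP_EOL;
// To actually run it: exec($cmd, $output, $status); a $status of 0 means success.
```

For batch processing you would loop over your report URLs and call the same function for each one.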
Browser support for printing is generally terrible. However, there are other tools, notably Prince (which is not free) and Flying Saucer (which is free) that can generate PDF output from XML/HTML plus CSS. Prince even supports JavaScript though I don't have any experience with it.
I've got a Java back end in my current application, so for me Flying Saucer works fine for simple reports. I pre-process an HTML template with FreeMarker and then run the result through Flying Saucer. It's got a surprisingly smart rendering engine.
The CSS3 Paged Media spec (well, proposed spec) has all sorts of cool stuff in it, but it's almost totally unimplemented in the browsers. Even the CSS2 paged media stuff is only supported half-heartedly.
Speaking of Prince, you might look into DocRaptor. DocRaptor is another HTML to PDF conversion application. It uses Prince XML, and handles CSS better than comparable programs.
It isn't free, but offers a free 30 day trial for all accounts, so there's no harm in trying it out, at least.

How can I create image from html using PHP?

Something like painty, but with advanced options such as div, font, font size, style, etc.
I would like to design a coupon in HTML and output it as an image, preferably JPG,
but painty doesn't support those.
You can find the painty code I am using right now here: http://www.rabuser.info/painty.php
Thanks, and waiting for your reply.
Creating this with pure PHP is a bad idea; it will be slow as hell.
As far as I know, in production this is achieved with an external screenshot app and a standard browser run by exec() or a similar function.
There's khtml2png, which renders the whole page and takes a screenshot; however, it's a standalone executable (and it needs an X server or xvfb), so you need to be able to run it on your server (so probably not on a shared hosting). This may be a bit of an overkill, but it gives you complete control over the final appearance.
You could also use one of the HTML to PDF converters and then use ImageMagick to convert the PDF to JPEG.
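The ImageMagick step could look roughly like this. It is a sketch that assumes the classic convert command is installed, with Ghostscript available for reading PDFs (newer ImageMagick 7 installs call the binary magick instead). The file names, function name and the 150 dpi default are placeholders.

```php
<?php
// Sketch only. Assumes ImageMagick's "convert" is installed, with Ghostscript
// available for PDF input; file names are placeholders.

/** Build the command that rasterizes the first page of a PDF to a JPEG. */
function build_pdf_to_jpeg_command(string $pdfPath, string $jpgPath, int $dpi = 150): string
{
    return sprintf(
        'convert -density %d %s -background white -flatten -quality 90 %s',
        $dpi,                              // rasterization resolution, must precede the input
        escapeshellarg($pdfPath . '[0]'),  // "[0]" selects the first page only
        escapeshellarg($jpgPath)           // JPEG has no transparency, hence the white background
    );
}

echo build_pdf_to_jpeg_command('coupon.pdf', 'coupon.jpg'), PHP_EOL;
// To actually run it: exec($command, $output, $status);
```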

Direct Print webpage in PDF file

On my site I'm fetching my MySQL data using PHP. I want to open that data in a PDF file when I click a PDF print button. Is that possible?
First of all, if you want a high-quality professional product to do that, you want Prince XML.
If you are looking for an open source tool to achieve something similar, you can look into this SO question.
You could prepare a static PDF form file, then just fill it in with values using PHP's FDF module.
It depends on which platform you are using. This would be an easy job if you are using Groovy on Grails. There are plugins which facilitate PDF reporting, like the jasper-plugin.
Luis
Check out jsPDF, an open-source library for generating PDF documents using nothing but JavaScript.
You can process the data with Apache FOP after transforming it to XML. (http://xmlgraphics.apache.org/fop/).
If your page is template based, you may create a template which produces XML output and process that. You'll have extremely good control over the PDF construction. The tradeoff is that it is not a "plug this in and it will work" solution, but I've done that, and once it's set up, it works like a charm.
I've used TCPDF in the past, it's a little kludgy but can definitely get the job done. (http://www.tecnick.com/public/code/cp_dpage.php?aiocp_dp=tcpdf)
The FPDF module in PHP is simple enough to get the data together. It is a safe option since you know what data you are passing out to the PDF engine. There are some streaming pdf options which can take in a bunch of html and then output that to pdf however they can get it quite wrong without you knowing.
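A minimal sketch of that approach, assuming fpdf.php from fpdf.org sits next to the script. The rows stand in for whatever your MySQL query returns, and the column widths and the draw_rows name are made up.

```php
<?php
// Sketch only. Assumes fpdf.php from fpdf.org sits next to this script; the
// rows stand in for the result of your own MySQL query.

/** Draw rows as a two-column table on any object exposing FPDF's Cell() and Ln(). */
function draw_rows($pdf, array $rows): void
{
    foreach ($rows as $row) {
        $pdf->Cell(120, 8, $row['name'], 1);          // width, height, text, border
        $pdf->Cell(40, 8, $row['total'], 1, 0, 'R');  // right-aligned amount
        $pdf->Ln();                                   // move to the start of the next line
    }
}

$rows = [
    ['name' => 'Hotel', 'total' => '420.50'],
    ['name' => 'Taxi',  'total' => '38.00'],
];

// Only build the PDF when the library file is actually present.
if (is_file(__DIR__ . '/fpdf.php')) {
    require __DIR__ . '/fpdf.php';
    $pdf = new FPDF();
    $pdf->AddPage();
    $pdf->SetFont('Arial', '', 12);
    draw_rows($pdf, $rows);
    $pdf->Output('D', 'report.pdf'); // 'D' sends the PDF to the browser as a download
}
```

Because you place every cell yourself, nothing ends up in the PDF that you didn't put there explicitly.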
I've used wkhtmltoimage/wkhtmltopdf on Linux machines a number of times, on many projects. It works like a charm and is easy to use: just a script that you run.
