I'm using DOMpdf to generate PDF files. I can generate 15 pages so far, but I need to be able to generate 130. I know this might consume a lot of memory, so I'm looking for a way to generate each batch of 10 pages in a separate process and have all of them write to the same file. Is that possible?
There are many tools that can join PDFs together. One that I have had a lot of luck with is PDFtk Server. If you generate many small PDFs (e.g. 001.pdf, 002.pdf, 003.pdf) you can generate a joined PDF (e.g. final.pdf) with something like
pdftk 001.pdf 002.pdf 003.pdf cat output final.pdf
The page ordering of final.pdf will respect the order of the input files.
This is a standalone command-line program, not a PHP library, so you'll have to kick the tool off with something like shell_exec or any number of other options.
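As a rough sketch of the merge step, assuming the chunk files already exist on disk: the function name merge_chunks and the PDFTK override variable are made up for illustration (the override only exists so the call can be dry-run with echo), while the pdftk syntax itself is the standard "inputs, then cat, then output" form.

```shell
#!/bin/sh
# Sketch only: merge chunk PDFs into a single file with pdftk.
# PDFTK is a hypothetical override variable (defaults to "pdftk") so the
# function can be dry-run, e.g. with PDFTK=echo.
merge_chunks() {
    out=$1
    shift
    # pdftk takes the input files first, then the "cat" operation,
    # then "output" followed by the name of the merged file.
    "${PDFTK:-pdftk}" "$@" cat output "$out"
}

# Example call (not executed here):
#   merge_chunks final.pdf chunks/*.pdf
```

The shell expands the glob in sorted order, so zero-padded names like 001.pdf, 002.pdf keep the pages in sequence. From PHP, the same command line could be passed to shell_exec.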
Related
I need to generate a large PDF, 2480 pages to be exact.
Currently I am using InDesign, and while the output is exactly what I want, I would rather not be involved in the document creation process.
It takes 31 minutes for InDesign to execute the data merge, generate the PDF, save the PDF, and save the pdf.indd file. (I don't really need the pdf.indd file, but I would rather not have to recreate the data merge if something were to happen to the PDF.)
I am hoping for a PHP or similar solution. Currently my data is stored in MySQL.
The majority of the PDF is static text, with 19 dynamically driven text fields.
There is one image on the PDF, 75x100px @ 72dpi.
The output needs to be exact: the PDF file is printed and cut in half at 4.25 inches.
I have tried TCPDF; while it is fast at generating up to 50 pages, after that it would rather die than give me an output. I have also played with mPDF and found it to be, ..., not as friendly. I have also considered generating many small files and using some utility to merge the smaller PDFs into one large PDF, though that seems like driving around the mountain.
Any thoughts would be helpful.
You certainly can create documents directly with PHP, but it can be difficult. One method is to use one of the various PDF classes to create the document, as you have found. Another is to create images (using ImageMagick, GD, etc.) and convert those to PDF. (This method is less efficient, as you are creating raster graphics, making the whole PDF page one giant graphic.)
However, I think you should consider simply scripting InDesign. InDesign has the capability to read data in via XML and create the document. This way, the design of your document isn't dependent on your programming abilities and you can still have the power of programmatically creating the document.
When it comes to a huge number of pages in a PDF, LaTeX is always the best answer. Nothing else really handles huge PDF generation as quickly, accurately, and elegantly as LaTeX.
Check this question to see how to retrieve your data from the database.
I am developing a shipping/receiving system which I plan to set up on an intranet. I have experience with HTML, CSS, PHP, MySQL, JavaScript, jQuery, and AJAX. My basic goal is to be able to scan barcodes and then generate and save PDFs for printing and storage that can be 1 page or 100 pages. I basically want to create a header with order information such as Order ID, Customer/Vendor, Date, Page Number, etc., with columns below containing information like Part Number, QTY, Description, etc.
I am not sure if the entire page can be created with CSS and 'foreach variables', or if perhaps a template where text is simply placed on top of a default PDF would work best. In the past I have been able to take a basic template and enter text onto a single-page PDF at a specific X and Y coordinate, but I am not sure about detecting page breaks and such.
Any advice on where to begin would be greatly appreciated!
Thanks in advance :)
Take a look at FPDF; it's quite a powerful PHP library for creating PDFs. I've used it quite a lot and I'm sure it would fit your needs.
I don't think it's necessary to have a template upon which you position text. You could, for example, use the table capability provided by third-party scripts for FPDF to present your data.
With a solution such as wkhtmltopdf, which is an HTML to PDF converter, you can generate the HTML for the invoice using PHP and output the PDF file using the tool. On an Ubuntu box you can simply install it using sudo apt-get install wkhtmltopdf. wkhtmltopdf is a command-line tool, so you might have a cron job running in the background which picks up HTML files within a folder and converts them to PDF, or you could use the PHP exec() or system() functions to execute the program. Hope that helps.
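A minimal sketch of the folder-conversion idea, which a cron job could call: the function name convert_folder and the WKHTMLTOPDF override variable are hypothetical (the override is only there so the loop can be dry-run with echo). The only wkhtmltopdf usage assumed is the basic "input file, then output file" form.

```shell
#!/bin/sh
# Sketch only: convert every .html file in a folder to a PDF next to it.
# WKHTMLTOPDF is a hypothetical override variable (defaults to "wkhtmltopdf").
convert_folder() {
    dir=$1
    for html in "$dir"/*.html; do
        # Skip the unexpanded pattern when the folder has no .html files.
        [ -e "$html" ] || continue
        # Basic usage: input HTML file first, output PDF file second.
        "${WKHTMLTOPDF:-wkhtmltopdf}" "$html" "${html%.html}.pdf"
    done
}

# Example call (not executed here):
#   convert_folder /path/to/invoices
```

Each invoice.html ends up with an invoice.pdf beside it. A real job would probably also move or delete the HTML files it has already processed so they are not converted twice.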
I generate PDFs with TCPDF using writeHTML. I write the whole document as HTML and then pass it to writeHTML to produce the PDF.
My problem is that it's very slow. Generating 5 pages of data table (5 cols x 12 rows per page) takes about 10 seconds.
I followed almost all instructions from here: http://www.tcpdf.org/performances.php .
I put
$pdf->setFontSubsetting(false) ;
Do you have other tips? Is it going to be faster if I generate the PDFs programmatically?
Generating HTML, letting TCPDF parse that HTML and rejiggle it into PDF drawing instructions, and then writing those instructions out is of course going to be way slower than writing them directly to begin with. Use the regular Ln, Cell, Write, etc. methods to directly generate the PDF if you want maximum performance. Yes, it's somewhat more complicated than writing HTML, but that's because they're different things. And the slow part is translating between those different things.
I am doing a bulk generation of pdf files based on templates and I ran into big performance issues pretty fast.
My current scenario is as follows:
get data to be filled from db
create fdf based on single data row and pdf form
write .fdf file to disk
merge the pdf with fdf using pdftk (fill_form with flatten command)
continue iterating over rows until all .pdf files are generated
all the generated files are merged together in the end and the single pdf is given to the client
I use passthru to give the raw output to the client (which saves the time of writing a file), but this is only a small performance improvement. The total operation time is about 50 seconds for 200 records, and I would like to get it down to 10 seconds or less.
The ideal scenario would be to handle all these PDFs in memory rather than writing every one of them to a separate file, but then producing the output would be impossible, as I can't pass that kind of data to an external tool like pdftk.
One other idea was to generate one big .fdf file with all those rows, but it looks like that is not allowed.
Am I missing something very trivial here?
I'm thankful for any advice.
PS. I know I could use some good library like pdflib but I am considering only open licensed libraries now.
EDIT:
I am trying to figure out the syntax for building an .fdf file with multiple pages using the same PDF as a template. I have spent a few hours on it and couldn't find any good documentation.
After being faced with the same problem for a long time (I wanted to generate my PDFs based on LaTeX), I finally decided to switch to another crude but effective technique:
I generate my PDFs in two steps. First, I generate HTML with a template engine like Twig or Smarty. Second, I use mPDF to generate PDFs out of it. I tried many other HTML-to-PDF frameworks and ended up using mPDF; it's very mature and has been in development for a long time (frequent updates, rich functionality). The benefit of this technique: you can use CSS to design your documents (mPDF supports CSS), which brings all the usual CSS benefits (http://www.csszengarden.com), and you can generate dynamic tables very easily.
mPDF parses the HTML tables, looks for the thead and tfoot elements, and repeats them on each page if your tables are bigger than one page. You can also define page header and page footer elements with dynamic entities like the page number and so on.
I know, this detour seems like a workaround, but to be honest, no LaTeX, PDF, or whatever engine is as strong and simple as HTML!
Try a different, less complex library like FPDF (http://www.fpdf.org/).
I find it quite good and lightweight.
Always find libraries that are small and only do what you need them to do.
The bigger the library the more resources it consumes.
This won't help your multiple-page problem, but I notice that pdftk accepts the - character to mean 'read from standard input'.
You may be able to send the .fdf to the pdftk process via its stdin, in order to avoid having to write them to disk.
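A hedged sketch of what that could look like: the function name fill_from_stdin and the PDFTK override variable are invented for illustration (the override just allows a dry run with echo). It relies on pdftk accepting - as the fill_form argument to read the FDF from stdin, and - as the output name to write the result to stdout.

```shell
#!/bin/sh
# Sketch only: fill a PDF form from FDF data on stdin and write the
# flattened result to stdout, so no intermediate .fdf file is written.
# PDFTK is a hypothetical override variable (defaults to "pdftk").
fill_from_stdin() {
    form=$1
    # "fill_form -" reads the FDF from stdin; "output -" writes to stdout.
    "${PDFTK:-pdftk}" "$form" fill_form - output - flatten
}

# Example call (not executed here; generate_fdf is a hypothetical producer):
#   generate_fdf | fill_from_stdin form.pdf > row_001.pdf
```

This still starts one pdftk process per row, so it removes the .fdf disk writes but not the per-process startup cost.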
I'm currently working on a task where I take forms from a local government body and convert them so that a PDF can be generated dynamically via FPDF based on passed parameters. Currently the only copies of these documents are read-only PDF files. What I'm wondering is whether there is a way to read these files so that the documents can be converted into FPDF format. Normally I'd just recreate them manually, but with 50 files to convert, some of them multi-page forms, that would probably take months, hence I'm looking for a quicker way.
The short answer is you're stuck. As far as I and my many hours of research know there's no such process. I would love to be proven wrong.
I recently went through a similar situation with insurance forms. I used the free trial of Adobe LiveCycle Designer to build out the forms. It basically turns the old PDF into a flat background image you can draw form fields over. Then I used PDF Toolkit and PDFTK-PHP to populate the fields.
The process wasn't ideal but it worked out well enough. I set up 20 forms consisting of about 50 pages, with filling code and some other operations, in a week.