Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 3 years ago.
Is there any PHP PDF library that can replace placeholder variables in an existing PDF, ODT or DOCX document, and generate a PDF file as the end result, without screwing up the layout?
Requirements:
Needs no 3rd party web service
Ability to run on shared web hosting would be ideal (no binary installations / packages required)
Mind you, a library that is able to load an existing PDF file and insert text programmatically at a specific position is not enough for my use case.
As far as my research shows, there is no library that can do this:
TCPDF can only generate documents from scratch
FPDI can read existing PDF templates, but can only add contents programmatically (no template variable replacement)
There are various DOCX/ODT template libraries out there but they don't output PDF
PHPDOCx claims to be able to do exactly what I need - but they don't offer a trial version and I'm not going to buy a cat in a bag, especially not when there seems to be no other product on the web that does this. I find it hard to believe they can do this without problems - if you have successfully done this using the product, please drop a line here.
Am I overlooking something?
Is there a way to do this using PDF forms? I am creating the source documents in OpenOffice 3.
I may be able to use standard Linux commands (pdftk is available for example, trying that out right now.)
Update: *Argh!* I was called out of the office and the bounty expired in the meantime. Starting a new bounty: As far as my testing shows, no solution works for me perfectly yet.
Update II: I will be looking into the pdftk approach soon, but I am also starting another bounty for one more round of collecting additional input. This question has now seen 1300 rep points in bounties, must be some kind of a record :)
This is not very practical, but for completeness: If you already have an ODT template, then you might very well retain that as template. Modifying the OpenDocument content.xml and replacing placeholders therein is pretty simple. If so, you could use unoconv or pyodconverter to transform the ODT into a final PDF.
unoconv -f pdf -o final.pdf template.odt
Very obviously this requires a full OpenOffice setup (UNO and Writer) on the webserver. And obviously not every webhoster would go with that! haha. Even if it's simple on any Debian or Fedora setup. The execution speed would probably not be stellar either. But then it might be the cleanest approach, since OOo governs both formats way better than any PHP class ever could.
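To make the placeholder step concrete, here is a minimal sketch of the idea. The `{{name}}` placeholder syntax and both helper functions are assumptions made for illustration, not part of any library. It also assumes content.xml has already been read out of the ODT archive (an ODT file is a ZIP container, so PHP's ZipArchive can do that), and that each placeholder sits in one uninterrupted run of text; Writer sometimes splits typed text across several XML spans, in which case a plain string replacement will not find it.

```php
<?php
// Hypothetical helper: replace {{name}}-style placeholders in the XML text of
// an OpenDocument content.xml. The {{...}} syntax is an assumption for this
// sketch, not an ODT feature.
function fillPlaceholders(string $contentXml, array $values): string
{
    $map = [];
    foreach ($values as $name => $value) {
        // Escape the value so characters such as & and < keep the XML well-formed.
        $map['{{' . $name . '}}'] = htmlspecialchars((string) $value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
    }
    return strtr($contentXml, $map);
}

// Build the unoconv call quoted above, with shell-escaped file names.
function unoconvCommand(string $odtPath, string $pdfPath): string
{
    return 'unoconv -f pdf -o ' . escapeshellarg($pdfPath) . ' ' . escapeshellarg($odtPath);
}

$xml = '<text:p>Dear {{name}}, your total is {{total}}.</text:p>';
echo fillPlaceholders($xml, ['name' => 'Smith & Co', 'total' => '42 EUR']), "\n";
echo unoconvCommand('filled.odt', 'final.pdf'), "\n";
```

The filled content.xml would then be written back into a copy of the ODT before the command is passed to exec() or a similar function.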
Pekka,
I looked into this previously. I think you can use pdftk (a command-line utility) to fill in a PDF form using FDF/XFDF data files, which you could easily generate from within PHP. That was the best option I've seen so far, though there may well be a native library.
pdftk is quite useful in general, worth having a look at.
Update: Have a look here: http://php.net/manual/en/book.fdf.php
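As a rough sketch of the "generate the data file from PHP" part: the code below builds an XFDF document (the XML flavour of FDF) from an array and assembles a pdftk fill_form call. The function names, file names and field names are made up for illustration; the field names would have to match the form fields defined in your own PDF.

```php
<?php
// Build an XFDF document from field name => value pairs.
function buildXfdf(array $fields): string
{
    $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
         . '<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">' . "\n"
         . "  <fields>\n";
    foreach ($fields as $name => $value) {
        // Escape names and values so the XML stays well-formed.
        $xml .= sprintf(
            "    <field name=\"%s\"><value>%s</value></field>\n",
            htmlspecialchars((string) $name, ENT_XML1 | ENT_QUOTES, 'UTF-8'),
            htmlspecialchars((string) $value, ENT_XML1 | ENT_QUOTES, 'UTF-8')
        );
    }
    return $xml . "  </fields>\n</xfdf>\n";
}

// pdftk's fill_form operation; "flatten" merges the filled-in fields into the
// page so they are no longer editable.
function pdftkFillCommand(string $formPdf, string $xfdfFile, string $outPdf): string
{
    return sprintf(
        'pdftk %s fill_form %s output %s flatten',
        escapeshellarg($formPdf),
        escapeshellarg($xfdfFile),
        escapeshellarg($outPdf)
    );
}

echo buildXfdf(['customer' => 'Smith & Co', 'date' => '2011-05-01']);
echo pdftkFillCommand('form.pdf', 'data.xfdf', 'filled.pdf'), "\n";
```

In practice you would write the XFDF string to a temporary file with file_put_contents() and run the command with exec(), checking the exit code.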
Have you considered using something like XSL Formatting Objects (XSL-FO)? Basically they're XML documents that are processed and turned into PDFs. Doing string - or better, DOM - replacements within that should be pretty simple. It supports embedding images, links, annotations, etc.
It's not PHP, but there are a number of PHP wrappers for it, along with ways of using it via exec, etc. Not ideal, but it takes care of the template portion completely. For some more info: http://techportal.inviqa.com/2009/12/16/transforming-xml-with-php-and-xsl/
There's an implementation available as an Apache project - http://xmlgraphics.apache.org/fop/
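A minimal sketch of the DOM replacement mentioned above, assuming an XSL-FO template that uses `{{name}}`-style placeholders (that syntax and the function name are inventions for this example). Only text nodes are rewritten, so markup and attributes stay untouched, and the DOM serializer escapes special characters when the document is saved.

```php
<?php
// Hypothetical helper: fill {{name}} placeholders in the text nodes of an
// XSL-FO document using PHP's DOM extension.
function fillFoTemplate(string $foXml, array $values): string
{
    $doc = new DOMDocument();
    if (!$doc->loadXML($foXml)) {
        throw new InvalidArgumentException('Template is not well-formed XML.');
    }

    $map = [];
    foreach ($values as $name => $value) {
        $map['{{' . $name . '}}'] = (string) $value;
    }

    // XPath text() only matches text nodes, never attribute values.
    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//text()') as $textNode) {
        $textNode->data = strtr($textNode->data, $map);
    }
    return $doc->saveXML();
}

$template = '<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format">Dear {{name}},</fo:block>';
echo fillFoTemplate($template, ['name' => 'Pekka']);

// The result could then be rendered with Apache FOP, for example:
//   fop -fo letter.fo -pdf letter.pdf
```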
There is fpdf, and there is another extension on top of it, whose name I can't remember, which allows you to import templates.
Your best bet would be to generate the entire document on the fly, with the template defined programmatically using fpdf or something similar. That way, your text will not be cut off by paragraphs or anything like that, and you can easily position images/other elements as required.
Late, but you can use the open-source template designer https://github.com/applicius/dhek/releases to define placeholders/areas over any existing PDF, then load the result in PHP (it is in JSON format) and write onto the original PDF accordingly using the fpdf lib, generating a custom PDF with the dynamic data written on it.
Although this is not exactly what you asked for, you may consider doing it in two steps: use some PHP templating system (Smarty, Dwoo) to generate an HTML page, and then use a tool like Html2Pdf to convert it to PDF. I am using it, and the results are good (no problems with page layout etc.)
Of course it depends on your input documents (whether you can use HTML instead of PDF/ODT as the source) and on the complexity of their layout.
OK, I'll try to help you solve the problem a little.
First, the answers to a couple of your questions.
Q - Am I overlooking something?
A - No. There is a PHP PDF library that can replace placeholder variables in an existing PDF and generate a PDF file as the end result, without screwing up the layout.
Q - Is there a way to do this using PDF forms?
A - Yes, absolutely. The trick is to use PDF forms.
For both, you can use Justin Koivisto's fill-PDF-form-fields PHP library.
For more detail, please go to http://koivi.com/fill-pdf-form-fields/tutorial.php.
Take a look there for additional information.
Credit to Justin Koivisto for his work
P.S.
As a workaround for displaying table-like output from a PDF form,
please consider reading the Oracle Business Intelligence Publisher User's Guide - Creating a PDF Template
I'll add this new answer since the FDF PHP extension is now dead.
I've just followed these instructions and ended up executing one perl script then the pdftk command
I'm pretty aware it's far from being a real PHP solution but it's reliable and fairly easy to implement on any *nix platform.
The tools described there are also available on Debian, just in case you were wondering.
It's a little bit late, but have a look at the PDFTemplate library; it does exactly what you want. You can create OpenDocument files (ODT) and add placeholders to them. The PDFTemplate library can fill in these placeholders (even with images) and create a PDF file.
ODT Files with placeholders to PDF
I hope you are doing well.
I am looking for a PHP library that converts a PDF file, including its images, into an HTML file and meets the following requirements:
The HTML file needs to be compatible with HTML 3.2
The images in the PDF file need to be saved with a .jpg extension
The correct fonts from the PDF need to be used in the HTML file
The result needs to be a single folder that contains the images and the HTML file
I have tried most of the PHP libraries, but most of them do NOT do what I need.
Please let me know about a library that meets all 4 of the above requirements (image attached for reference)
Waiting for your kind responses.
Thanks
I am not very sure, But here is a library in PHP I found.
Here
Try this:
http://www.pdfaid.com/pdf-to-html.aspx
Or this:
http://webdesign.about.com/od/pdf/tp/tools-for-converting-pdf-to-html.htm
Or this...
http://www.pdfconvertonline.com/pdf-to-html-online.html
There are plenty of options available to you; the secret is to use a newfangled thing called a Search Engine, such as a Bing or a Google.
You will also do well to research on Stack Overflow before asking your question:
1) HTML 3.2 was superseded in 1997, which is very nearly twenty years ago. Why on earth do you still need a comparatively ancient technology when there are far better alternatives available, such as XHTML, HTML 4.01 and HTML5?
2) Please read How can I extract embedded fonts from a PDF as valid font files?
3) Also to extract images you can use:
http://www.makeuseof.com/tag/extract-images-pdf-files-save-windows/
but again, there are several options available to you if you care to look for them.
You seem to imply a fundamental misunderstanding about HTML; there are several different ways of getting any desired result with HTML. You have a PDF file and you want it to look a certain way, this look depends on the browser you are looking at it on. For example if you use a PDF to HTML converter as linked above you will very probably find that the output will look different on Internet Explorer 7 versus on Firefox versus Internet Explorer 10. There is no one way of writing output on HTML or with CSS.
If you want a custom built library to do your specific task then you will need to employ a professional to do it, or you will need to code it yourself. This obviously should be charged to the client for requiring a technology that is extremely outdated. You can probably search github for a similar library (the one linked by CK Khan looks like what you're after) and then fork it and make your own variation for your needs. I very much doubt anyone is going to put time into developing a system to output HTML 3.2 from a PDF, and even less likely to develop this system for free and to your exact specifications.
It also appears that you can not directly incorporate font families into the <font> tag in HTML 3.2, only being able to edit size and colour of fonts. You can use CSS1 font-family to show font families. See here.
I have been scouting the web for a PDF editing tool for quite some time now, so it would be nice to get some suggestions/recommendations for the problem.
I have read a bunch of other topics around Stack Overflow, but haven't quite been able to find a full solution yet.
The case: I have an application written in PHP/JavaScript on a Linux server, and recently it has become a requirement from my customers that they are able to edit PDF documents directly in the browser. The functions I need are primarily the ability to make annotations and to draw on the PDF. That means they load an already uploaded PDF document, edit it, and then save it in PDF format again, all within the browser.
The second requirement is that the program must save the document in PDF format again, since I have an iPad app with all of this functionality, and I need them to play nicely together. It is therefore not an option to save an image or HTML or something like that.
I read a comment suggesting the Zend Framework, and it did sound quite useful. However, I have developed my own platform from the ground up, and therefore my second question is whether it is possible to embed only the PDF tool from Zend.
Thank you in advance.
P.S. If I missed a similar subject that answers exactly this, please let me know and I'll delete this again.
There are several libraries that can be used to edit PDF-files. To name a few: Tcpdf, fpdf and Zend_Pdf.
The latter can be used even without the complete Zend Framework. But be cautious: currently it only allows editing of PDF files up to version 1.4.
If you need speed, you should also have a look at PDFlib, which could be tweaked to support your use case.
I want to generate PDF from a PHP file that includes HTML controls like textbox, and textarea. I attached CSS in the same. I tried FPDF, DOMPDF and TCPDF, but still I don't get exactly what I want. How do I pass HTML controls with PHP variables and CSS to these libraries?
mpdf is another option that you could try.
EDIT :
Found another solution for it: TCPDF is a FLOSS PHP class for generating PDF documents. It looks like the more widely used library.
"PRINCEXML" is a good library (not completely free now).
Others:
If you mean creating a PDF file from PHP, pdflib will help you (as some others suggested).
Otherwise, if you want to convert an HTML page to PDF via PHP, you'll find a little trouble out there. For three years I have been trying to do it as best as I can.
So, the options I know are:
HTML2PS: same as DOMPDF, but this one converts first to .ps (Ghostscript), then to whatever format you need (PDF, JPEG, PNG). For me it is a little better than dompdf, but I have the same speed problem. Oh, it has better compatibility with CSS.
Those two are PHP classes, but if you can install some software on the server and access it through passthru() or system(), have a look at these too:
wkhtmltopdf: based on WebKit (Safari's rendering engine), it is really fast and powerful. It seems to be the best one (at the moment) for converting HTML pages to PDF on the fly, taking only two seconds for a three-page XHTML document with CSS 2. It is a recent project. Anyway, the Google Code page is often updated.
htmldoc: this one is a tank, it really never stops or crashes. The project seems to have died in 2007, but anyway, if you don't need CSS compatibility, this can be nice for you.
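For the "install some software and call it from PHP" route, a call to wkhtmltopdf could be wrapped roughly like this. It is a sketch under the assumption that the wkhtmltopdf binary is installed and on the PATH; the two function names are made up for this example.

```php
<?php
// Build a wkhtmltopdf call: the first argument is an HTML file or URL,
// the second is the PDF to write. Both are shell-escaped.
function wkhtmltopdfCommand(string $source, string $pdfPath): string
{
    return 'wkhtmltopdf ' . escapeshellarg($source) . ' ' . escapeshellarg($pdfPath);
}

// Run the conversion and report success based on the exit code and on
// whether the output file exists afterwards.
function htmlToPdf(string $source, string $pdfPath): bool
{
    exec(wkhtmltopdfCommand($source, $pdfPath) . ' 2>&1', $output, $exitCode);
    return $exitCode === 0 && is_file($pdfPath);
}

echo wkhtmltopdfCommand('invoice.html', 'invoice.pdf'), "\n";

// Usage, once the binary is available:
//   if (!htmlToPdf('invoice.html', 'invoice.pdf')) { /* handle the failure */ }
```

exec() is used here instead of passthru() or system() so that the tool's console output is captured rather than sent to the browser.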
Thumbs up for Strae.
If I understand your needs correctly I don't think any PHP-PDF class would do that.
Mostly you could insert only text and images to a PDF file, so if you would want something that looks like an HTML element you would need to insert it as an image.
Usually just putting HTML doesn't mean all your elements would stay intact in the PDF . (Different world, after all)
http://www.fpdf.org/ is the site of a great HTML-to-PDF class which works well. I am using it, but you have to study its functionality first before you start.
I'm trying to find a way to search inside PDF files. I came across the PHP PDF class, but I can't seem to find any function for reading/searching a file stream.
So, as naive as I am, I tried to simply get a stream using file_get_contents(); obviously the output looks encrypted ;)
So my question: is there any way to search through PDF files? I'm looking for script-only / free / open-source solutions, not for buying some expensive commercial library.
XPDF?
There is a blog post here that may be of help.
There seems to be some code here that could help - a simple class that reads a PDF into plaintext. Unsure if it supports decryption.
There are also a number of resources in PHP documentation that may help you. Click.
FPDF and FPDI may also help. Probably your best bet after some research.
A PHP search engine called Sphider has the option of adding PDF search via XPDF. You can then customise the result templates to fit in with the rest of your site (if applicable).
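If installing Xpdf on the server is an option, the search itself can be quite small. The sketch below assumes the pdftotext tool that ships with Xpdf is on the PATH; both function names are made up for this example. The text is extracted first and then searched line by line, because the raw bytes returned by file_get_contents() are mostly compressed streams rather than readable text.

```php
<?php
// Extract the text of a PDF by calling pdftotext; "-" as the output file
// makes it write to standard output.
function pdfToText(string $pdfPath): string
{
    $text = shell_exec('pdftotext ' . escapeshellarg($pdfPath) . ' -');
    return is_string($text) ? $text : '';
}

// Case-insensitive search: returns matching lines keyed by 1-based line number.
function findInText(string $text, string $needle): array
{
    $hits = [];
    foreach (preg_split('/\R/', $text) as $index => $line) {
        if (stripos($line, $needle) !== false) {
            $hits[$index + 1] = trim($line);
        }
    }
    return $hits;
}

print_r(findInText("Invoice 2011\nTotal due: 10\ninvoice copy", 'invoice'));

// Usage with a real file:
//   $hits = findInText(pdfToText('manual.pdf'), 'warranty');
```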
I recently worked on a project in which I need to convert a page into a Microsoft Word document (.doc file) and offer the document for download, all using PHP. But I can't solve this problem.
Please help me. Thank you very much, Arif
This is not easy to solve.
First off, if you want to write real Word documents, you will have to do it on Windows. You can use COM to talk to Word, and this is how you get good results. I've tried all the Unix/Linux-based solutions and the results were not so great.
Otherwise, I'd suggest you write RTF -- which is just as good. In the end, you can name the .rtf file .doc and no one will notice. RTF has a couple of limitations (formatting), but on the flip side -- it's all ASCII and the RTF standard is pretty comprehensive and well documented.
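To give an idea of how little is needed for a basic RTF file, here is a minimal sketch. The function names are made up for this example, and it assumes plain ASCII input; non-ASCII characters would need RTF's \u escapes, which are left out here.

```php
<?php
// Escape the three characters that have a special meaning in RTF.
// The backslash is handled first so that later escapes are not doubled.
function rtfEscape(string $text): string
{
    return str_replace(['\\', '{', '}'], ['\\\\', '\\{', '\\}'], $text);
}

// Build a minimal RTF document with one font and one \par per paragraph.
// Assumption: the input is plain ASCII text.
function buildRtf(array $paragraphs): string
{
    $body = '';
    foreach ($paragraphs as $paragraph) {
        $body .= rtfEscape($paragraph) . '\par' . "\n";
    }
    return '{\rtf1\ansi\deff0{\fonttbl{\f0 Times New Roman;}}' . "\n"
         . '\f0\fs24 ' . $body . '}';
}

echo buildRtf(['Hello Arif,', 'Braces {like these} are escaped.']), "\n";

// To offer the result for download, send headers before any other output:
//   header('Content-Type: application/rtf');
//   header('Content-Disposition: attachment; filename="page.rtf"');
```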
There's a class which does it pretty nicely -- phpLiveDocx (this is a great introduction). And this class also claims to write PDF and DOC -- but I haven't tried those yet. I use another solution for PDF.
I would recommend using the RTF format instead of the .doc - it's much simpler to write to, and all text editors understand it. Similar recommendation for .csv when you want to output an Excel file.
Perhaps not the answer you seek, but still interesting to note: there is an open-source word processor out there called AbiWord that has a CLI (command-line interface). You can use it to easily convert between document formats. I know that at least one website uses it to convert text files into various formats.
It is actively being developed and could easily be used as a third-party black-box solution for converting documents server-side.
Here is a blog from one of the developers on how to integrate it with PHP
Server-Side AbiWord
abiword home page