Convert PDF to HTML in PHP similar to DocuSign - php

We are developing a website that needs to convert PDF files into HTML, because some of the PDFs contain forms (not necessarily fillable PDFs; these are printed forms meant to be filled in by hand).
We want them to be filled in through our website instead of being printed and filled in by pen. We are going paperless.
DocuSign offers this: you upload a PDF, then customize it with textboxes and checkboxes. So we're using DocuSign as a reference, but we still haven't figured out how they do it (an almost perfect conversion from PDF to HTML and back).
So far I've tried several third-party tools for converting PDF to HTML: XPDF, Poppler, and ImageMagick.
ImageMagick converts a PDF to images, which is not suitable because the images produce a large file when converted back to a PDF for printing.
From my research, Poppler is a fork of XPDF. I tried it after XPDF to see if it was better. It basically does what XPDF does, but the HTML it produces uses larger pixel dimensions in the CSS. That's fine, but it loses the font family.
XPDF converts PDF to HTML with smaller pixel dimensions, so when I convert it back to PDF it does not fill the whole page, and I still have to manually adjust all the CSS to make it fit.
After using these tools, I convert the HTML files back into PDF using MPDF, and the converted files have many inconsistencies. Text is not aligned properly. The result is basically not the same as the original PDF.
Any help will be appreciated, thanks!

What you are trying to do is not as straightforward as it may seem. I have worked with Adobe Sign, formerly known as EchoSign, for years and I have a pretty good idea of how these services work. That being said, I strongly suggest looking into one of these eSign services instead of trying to roll your own. It will save you a lot of time.
This is how it all works:
The PDF must have a form itself with named fields. In other words, if you open such PDF in Adobe Reader or Chrome you should be able to fill in the fields. If your PDF does not have a PDF form you will need additional software like Acrobat PRO to create the form.
You must convert the PDF into a flat image that can be rendered in the browser.
You will need a tool to extract the PDF Form information, such as the field names, types, dimensions, and coordinates.
With all this information you can then render the PDF image(s) in the browser. Place absolute positioned HTML form elements over the image using the field type, dimensions, and coordinates from the previous step. Each HTML element needs to reference a PDF form field by name.
Once you have collected the information and a data map like field_name => field_value from your HTML widget, you will need to use additional software to programmatically fill in the PDF form in the original PDF. PDF form data is often stored in an FDF or XFDF file.
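The overlay step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the shape of the field array (name, type, and a rect given as [llx, lly, urx, ury] in PDF points with a bottom-left origin), the function names, and the scale factor are all hypothetical and do not come from any particular extraction tool.

```php
<?php
declare(strict_types=1);

/**
 * Convert a PDF field rectangle [llx, lly, urx, ury] (points, origin at the
 * bottom-left of the page) into CSS pixel offsets measured from the top-left
 * corner of the rendered page image.
 * $scale is rendered pixels per PDF point (for example 96 / 72 for a 96 DPI image).
 */
function pdfRectToCss(array $rect, float $pageHeightPt, float $scale): array
{
    [$llx, $lly, $urx, $ury] = $rect;

    return [
        'left'   => round($llx * $scale, 2),
        // PDF y grows upwards, CSS top grows downwards, so flip the axis.
        'top'    => round(($pageHeightPt - $ury) * $scale, 2),
        'width'  => round(($urx - $llx) * $scale, 2),
        'height' => round(($ury - $lly) * $scale, 2),
    ];
}

/**
 * Render one absolutely positioned input whose name matches the PDF field
 * name. The 'checkbox' type value is an assumed convention for this sketch.
 */
function renderOverlayField(array $field, float $pageHeightPt, float $scale): string
{
    $css = pdfRectToCss($field['rect'], $pageHeightPt, $scale);
    $style = sprintf(
        'position:absolute;left:%spx;top:%spx;width:%spx;height:%spx',
        $css['left'],
        $css['top'],
        $css['width'],
        $css['height']
    );
    $type = $field['type'] === 'checkbox' ? 'checkbox' : 'text';

    return sprintf(
        '<input type="%s" name="%s" style="%s">',
        $type,
        htmlspecialchars($field['name'], ENT_QUOTES, 'UTF-8'),
        $style
    );
}
```

The inputs would be placed inside a container with position:relative that also holds the page image, so that the offsets are measured from the image's top-left corner.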
I don't know of a single tool that will help you with everything outlined above, at least not in PHP. However, I can offer a couple of suggestions that may be helpful:
PDFtk Server - Can help you both extract the PDF form field information and fill in the form using an XFDF file. Unfortunately, the form field information that you can extract with this tool does not include dimensions and coordinates.
iText - A library available in .NET and Java that can be used to extract detailed information about the PDF form, including the dimensions and coordinates of the fields. You can create a microservice using this toolkit that communicates with PHP.
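As a rough sketch of the PDFtk route, the snippet below builds an XFDF document from a field_name => field_value map and assembles a pdftk fill_form command. The function names and file names are hypothetical; fill_form, output, and flatten are pdftk operations, but the exact workflow is an assumption and should be checked against your pdftk version.

```php
<?php
declare(strict_types=1);

/**
 * Build an XFDF document from a field_name => field_value map.
 * Names and values are XML-escaped, so quotes and ampersands are safe.
 */
function buildXfdf(array $values): string
{
    $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
         . '<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">' . "\n"
         . "  <fields>\n";

    foreach ($values as $name => $value) {
        $xml .= sprintf(
            "    <field name=\"%s\"><value>%s</value></field>\n",
            htmlspecialchars((string) $name, ENT_XML1 | ENT_QUOTES, 'UTF-8'),
            htmlspecialchars((string) $value, ENT_XML1 | ENT_QUOTES, 'UTF-8')
        );
    }

    return $xml . "  </fields>\n</xfdf>\n";
}

/**
 * Assemble the shell command that fills the form and flattens the result.
 * The command is only built here; running it (for example with exec())
 * is left to the caller.
 */
function pdftkFillCommand(string $formPdf, string $xfdfFile, string $outputPdf): string
{
    return sprintf(
        'pdftk %s fill_form %s output %s flatten',
        escapeshellarg($formPdf),
        escapeshellarg($xfdfFile),
        escapeshellarg($outputPdf)
    );
}
```

Flattening merges the values into the page so they can no longer be edited; leaving the flatten keyword out should keep the form fields interactive.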
There are definitely a lot more tools out there for the job. Hopefully, this information will guide you in the right direction or help you make a decision on how to move forward with your project.

Related

saving PDFs for later manipulation in PHP

I have a PDF with some text in it that I would like to modify dynamically using PHP. This is being done already with another PDF, and what happens is that PHP simply replaces a token in the form %token% with another value pulled from a database. If you open that PDF in a text editor, you can find the %token% in plain text. But with this other PDF that I want to do the same thing with, if you open it in a text editor, there are no tokens in plaintext (even though I explicitly created one using Adobe Acrobat Pro). Obviously, the PDF's string content in this PDF is either encrypted, compressed, or both. What I want to know is how can I save a PDF so that the string content remains as plaintext such that PHP can manipulate it.
Please note, I do not want to dynamically create the whole PDF from scratch using some PHP library. I know that is something that can be done, but the PDF I am working with already exists and I just want to modify it slightly in the manner described.
For things like that I like to use the free command line tool PDFtk, which can compress and decompress PDFs and do some other nice things. You may have a look at: https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
PS: I edit a special pdf calendar form from the internet. I decompress it and replace the awful pink color for the weekend with gray (Saturday) and blue for Sunday. Violet I use as small bar to mark vacation days.
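A minimal sketch of the decompress-then-replace idea, assuming the PDF has first been uncompressed with something like `pdftk in.pdf output in.unc.pdf uncompress`. The function names are hypothetical. It also assumes each %token% survives as one contiguous run of bytes in the content stream, which is not guaranteed, because PDF writers sometimes split text into several fragments.

```php
<?php
declare(strict_types=1);

/**
 * Escape a value so it can sit inside a PDF literal string "( ... )".
 */
function pdfLiteralEscape(string $text): string
{
    return strtr($text, ['\\' => '\\\\', '(' => '\\(', ')' => '\\)']);
}

/**
 * Replace %token% placeholders in the raw bytes of an uncompressed PDF.
 * $values maps token names (without the percent signs) to replacement text.
 */
function replacePdfTokens(string $pdfBytes, array $values): string
{
    $map = [];
    foreach ($values as $token => $replacement) {
        $map['%' . $token . '%'] = pdfLiteralEscape((string) $replacement);
    }

    return strtr($pdfBytes, $map);
}

// Assumed workflow (file names are placeholders):
//   $pdf = file_get_contents('in.unc.pdf');
//   file_put_contents('edited.pdf', replacePdfTokens($pdf, ['token' => 'value']));
// Replacing text changes byte offsets and stream lengths, so the edited file
// should be passed through pdftk once more to repair it, for example:
//   pdftk edited.pdf output final.pdf compress
```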

php - generate pdf with html table and save it on file server

I am working with a tool which lets user upload a .csv file.
That csv file contains an address column. I have to use the address from each row in another HTML template. After creating that template I then need to convert it into a PDF, store the PDF on a file server and give the user a link to the PDF.
I've finished the first two steps - csv upload and created complete template with address, but I'm stuck on how I can convert a template into a PDF.
I have looked into a few PHP PDF libraries like FPDF and mPDF, but I'm having trouble creating a PDF from an HTML template.
Here is a library which converts HTML to PDF and works pretty well.
First the link to the library:
HTML2PDF
Then some code* to create your PDF using your own generated HTML, where $content is your HTML string.
$html2pdf = new HTML2PDF('P','A4','fr');
$html2pdf->WriteHTML($content);
$html2pdf->Output('exemple.pdf');
*Code taken from the "example page" of the site.
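The $content string itself could be produced from the uploaded CSV along these lines. This is a sketch only: the {{address}} placeholder, the "address" header name, and the function name are assumptions made for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Read CSV rows from an open stream and return one HTML document per row,
 * with the row's address substituted into the template.
 *
 * @param resource $csvHandle open, readable stream positioned at the header row
 */
function renderAddressDocuments($csvHandle, string $template): array
{
    $header = fgetcsv($csvHandle, 0, ',', '"', '\\');
    if ($header === false) {
        return []; // empty file
    }

    $addressIndex = array_search('address', $header, true);
    if ($addressIndex === false) {
        throw new InvalidArgumentException('CSV has no "address" column');
    }

    $documents = [];
    while (($row = fgetcsv($csvHandle, 0, ',', '"', '\\')) !== false) {
        if (!isset($row[$addressIndex])) {
            continue; // skip blank or short lines
        }
        // Escape the value so an address cannot inject markup into the template.
        $address = htmlspecialchars($row[$addressIndex], ENT_QUOTES, 'UTF-8');
        $documents[] = str_replace('{{address}}', $address, $template);
    }

    return $documents;
}
```

Each returned string can then be handed to the HTML-to-PDF library as its HTML input, one PDF per row.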
I have used TCPDF in many cases: https://tcpdf.org/
It works well with tables; I have made receipts and accounting-related documents with it. It handles UTF-8 without problems, which is why it's my go-to choice.
The only downside is that the code is a bit long and complicated, and it doesn't keep tables as tables in the PDF but turns them into divs, so padding and other styles might be a bit trickier to get right.
One way is to use a WebKit-based HTML to PDF converter.
The pros are that it is easy to customize and style, and you can preview the result in the browser and expect the PDF to look much the same. You can use CSS and JavaScript to style and modify the output as well.
The con is that it is sometimes hard to install on the production server. But there are web services and APIs that have you covered.
For example, one such service is https://pdfapi.io. It is free to use. Only when your volume gets bigger does it start charging, about the price of a cup of coffee.
Hope that helps.

How would I draw a rectangle in PHP, and then convert to a .pdf file?

I'm just wondering about a small problem I have. I'm creating a simple website with PHP, HTML and CSS. The user enters the dimensions of a rectangle into a form, and this all works perfectly. However, I'm unsure how to proceed.
The system should then use these dimensions (both stored as separate variables) to draw a rectangle of that size (in cm). The rectangle should be drawn in a .pdf document, which the user can then download.
I'm a beginner, so sorry if there's anything wrong with this question.
You're going to want to start by looking into the HTML5 canvas element for drawing. There are plenty of tutorials online. As for saving to PDF, see this: Convert canvas to PDF. Good luck.
Take a look at the FPDF library available for PHP. Or, even better and more consistent: use its Unicode-capable variant 'tFPDF'.
It allows you to generate a pdf document on the server side in a step by step way. You can control all details of that document. When finished you can return the document to the requesting client.
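To illustrate what such a library does under the hood, here is a dependency-free sketch that writes a one-page A4 PDF containing a stroked rectangle whose size is given in centimetres. It is not FPDF's API; the function names, the 2 cm margin, and the fixed A4 page are assumptions, and there is no check that the rectangle fits on the page.

```php
<?php
declare(strict_types=1);

/** PDF user space is measured in points: 72 points per inch, 2.54 cm per inch. */
function cmToPoints(float $cm): float
{
    return $cm * 72.0 / 2.54;
}

/**
 * Return the bytes of a minimal one-page A4 PDF with a stroked rectangle.
 * The rectangle's top-left corner sits $marginCm from the page's top-left.
 */
function rectanglePdf(float $widthCm, float $heightCm, float $marginCm = 2.0): string
{
    $pageW = 595.28; // A4 width in points
    $pageH = 841.89; // A4 height in points
    $w = cmToPoints($widthCm);
    $h = cmToPoints($heightCm);
    $x = cmToPoints($marginCm);
    $y = $pageH - cmToPoints($marginCm) - $h; // PDF origin is bottom-left

    // "re" appends a rectangle to the path, "S" strokes it.
    $content = sprintf("%.2F %.2F %.2F %.2F re S\n", $x, $y, $w, $h);

    $objects = [
        '<< /Type /Catalog /Pages 2 0 R >>',
        '<< /Type /Pages /Kids [3 0 R] /Count 1 >>',
        sprintf(
            '<< /Type /Page /Parent 2 0 R /MediaBox [0 0 %.2F %.2F] /Resources << >> /Contents 4 0 R >>',
            $pageW,
            $pageH
        ),
        sprintf("<< /Length %d >>\nstream\n%sendstream", strlen($content), $content),
    ];

    $pdf = "%PDF-1.4\n";
    $offsets = [];
    foreach ($objects as $i => $body) {
        $offsets[] = strlen($pdf); // byte offset of each object for the xref table
        $pdf .= sprintf("%d 0 obj\n%s\nendobj\n", $i + 1, $body);
    }

    $xrefPos = strlen($pdf);
    $pdf .= sprintf("xref\n0 %d\n", count($objects) + 1);
    $pdf .= "0000000000 65535 f \n";
    foreach ($offsets as $offset) {
        $pdf .= sprintf("%010d 00000 n \n", $offset);
    }
    $pdf .= sprintf(
        "trailer\n<< /Size %d /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n",
        count($objects) + 1,
        $xrefPos
    );

    return $pdf;
}
```

To offer the result as a download, the script would send a Content-Type: application/pdf header and a Content-Disposition: attachment header, then echo the returned string.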

Submit HTML form to PDF

We have a high-resolution PDF (for printing) which has some form fields on it. We would like to have an HTML form which submits to the PDF, which is then placed into the respective fields.
I found a solution on google: http://koivi.com/fill-pdf-form-fields/
However, with that solution you only get an FDF file... And the demo does not work for me, opening the FDF file simply downloads another FDF file.
Since this PDF will be available to the public we would like to keep it as simple as possible. If we must open our original PDF and import this FDF file, we need a different solution (which I'm not sure is what the FDF file is for, since it didn't work).
A related post talking about .net framework had the same idea, but there were only paid commercial solutions: From HTML form to PDF
The PHP solutions I have found so far are for creating a new PDF, which is not what I need. Our PDF is created with Adobe Illustrator (or a similar adobe product) and is high-res with embedded fonts, svg and image content.
The form elements are in place, we just need to get the data to there.
Update April 11, 2013:
Since posting this question I have been utilizing FPDF on multiple projects where I needed to accomplish this goal. Although it cannot seem to "merge" template PDFs with the provided data, it can create the PDF from scratch.
In one example, I had a high-resolution PNG for printing (similar to the initial question) on which we had to write the customer's name and today's date clearly in the center. I simply made the PNG the background of the PDF using FPDF->Image() and wrote the text afterwards using FPDF->Text().
It was very simple after all: you will need to look up the paper sizes to determine the X, Y, W, H of the image, and then position your text fields relative to those numbers.
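The X, Y, W, H calculation can be sketched as below. The function name is hypothetical and the defaults assume an A4 portrait page measured in millimetres (210 x 297); the resulting numbers are the kind of values one would pass on to Image(), with Text() coordinates then chosen relative to them.

```php
<?php
declare(strict_types=1);

/**
 * Scale an image to fit a page while keeping its aspect ratio, and centre it.
 * Pixel dimensions in, page units (millimetres by default) out.
 */
function fitImageToPage(int $imgW, int $imgH, float $pageW = 210.0, float $pageH = 297.0): array
{
    if ($imgW <= 0 || $imgH <= 0) {
        throw new InvalidArgumentException('Image dimensions must be positive');
    }

    // Use the smaller ratio so the whole image stays on the page.
    $scale = min($pageW / $imgW, $pageH / $imgH);
    $w = round($imgW * $scale, 2);
    $h = round($imgH * $scale, 2);

    return [
        'x' => round(($pageW - $w) / 2, 2),
        'y' => round(($pageH - $h) / 2, 2),
        'w' => $w,
        'h' => $h,
    ];
}
```

For a background image with the same proportions as the page, this yields x = 0, y = 0 and the full page width and height.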
There was even a Form Filling extension, but I couldn't get it to work.
It seems as though I should answer my own question, although Vision's answer may be better (it seems to have been deleted?). I used Vasiliy Faronov's link, which was a comment on my main question: https://stackoverflow.com/a/1890835/200445
Here I found how to install pdftk and run a command to merge (flatten) my FDF and PDF files. I still used the "hacky" way to generate an FDF using Koivi's FDF Generator but it works for the most part.
One caveat is that some characters, like single and double quotes are not inserted correctly. It may be an issue of escaping the fields, but I could not find an answer.
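One possible cause, offered as a guess and not a confirmed diagnosis, is that the values contain characters that need escaping inside FDF strings, or non-ASCII characters such as typographic quotes. The sketch below builds an FDF file by hand, escaping backslashes and parentheses and switching to UTF-16BE with a byte-order mark for non-ASCII values. The function names are hypothetical, and mb_convert_encoding requires the mbstring extension.

```php
<?php
declare(strict_types=1);

/**
 * Encode a value as a PDF literal string "( ... )" for use in an FDF file.
 */
function fdfString(string $value): string
{
    if (preg_match('/[^\x20-\x7E]/', $value) === 1) {
        // Anything outside printable ASCII: use UTF-16BE with a BOM,
        // which PDF text strings allow.
        $value = "\xFE\xFF" . mb_convert_encoding($value, 'UTF-16BE', 'UTF-8');
    }

    // Backslashes, parentheses and carriage returns are special inside "( ... )".
    $escaped = strtr($value, ['\\' => '\\\\', '(' => '\\(', ')' => '\\)', "\r" => '\\r']);

    return '(' . $escaped . ')';
}

/**
 * Build a minimal FDF document from a field_name => field_value map.
 */
function buildFdf(array $fields): string
{
    $entries = '';
    foreach ($fields as $name => $value) {
        $entries .= sprintf(
            "<< /T %s /V %s >>\n",
            fdfString((string) $name),
            fdfString((string) $value)
        );
    }

    return "%FDF-1.2\n1 0 obj\n<< /FDF << /Fields [\n" . $entries . "] >> >>\nendobj\n"
        . "trailer\n<< /Root 1 0 R >>\n%%EOF\n";
}

// The resulting file can then be merged with the form, for example:
//   pdftk form.pdf fill_form data.fdf output filled.pdf flatten
```

Plain ASCII single and double quotes need no escaping in this format, so if they still come out wrong the form field's font or encoding may be the cause.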
Regardless, my PDF form generator is working, but anyone with a similar issue should look for a better solution.
There are a number of free tools, such as iTextSharp. Try the following: https://web.archive.org/web/20211020001747/https://www.4guysfromrolla.com/articles/030211-1.aspx I hope this code helps; I have tried it and it worked for me. If you can pay, there are also a number of commercial tools that convert HTML to PDF, such as ABCPDF. The example is in ASP.NET, but I am sure that if you port it to PHP it will work for you too.

PHP generate HTML convert to JPG then to PDF

I'm developing an app where the user adds items to a list. That list is stored in an array and passed to PHP with JSON.
The objective is to then create a PDF with all the values extracted from the user. The PDF is quite complicated. It includes images depending on what the user selects and the text varies depending on the images and the input data.
The first idea was to generate the pdf in php with one of those pdf libraries, but that's going to be a real hassle.
Then I thought of creating an html & css (much easier) and then convert it to PDF. But since the html & css are quite complex I don't think those pdf converters will work with this.
Then I thought I could convert the html to jpg and then to pdf.
It'll be much simpler if I could just use html but the output needs to be pdf.
What do you suggest?
Here's a post that discusses creating PDF files with PHP and the PDFLib extension.
Generate PDFs with PHP it's on sitepoint.
Or if you want to go from HTML to the PDF it looks like TCPDF might work.
You can try using FPDF
"Then I thought of creating an html & css (much easier) and then convert it to PDF. But since the html & css are quite complex I don't think those pdf converters will work with this."
wkhtmltopdf to the rescue! If you are on a VPS or dedicated machine, it's probably the best (open source) HTML-to-PDF engine out there. It leverages WebKit, the rendering engine used by Apple Safari and, at one time, Google Chrome.
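Calling wkhtmltopdf from PHP might look like the sketch below. It assumes the wkhtmltopdf binary is installed and on the PATH; the function names and the temporary file handling are illustrative only.

```php
<?php
declare(strict_types=1);

/**
 * Build the wkhtmltopdf command line for one input HTML file.
 */
function wkhtmltopdfCommand(string $htmlPath, string $pdfPath, string $pageSize = 'A4'): string
{
    return sprintf(
        'wkhtmltopdf --page-size %s %s %s',
        escapeshellarg($pageSize),
        escapeshellarg($htmlPath),
        escapeshellarg($pdfPath)
    );
}

/**
 * Write the HTML to a temporary file, convert it, and clean up.
 * Throws if the converter exits with a non-zero status.
 */
function htmlToPdf(string $html, string $pdfPath): void
{
    $base = tempnam(sys_get_temp_dir(), 'doc');
    if ($base === false) {
        throw new RuntimeException('Could not create a temporary file');
    }
    $htmlPath = $base . '.html'; // give the input an .html extension
    file_put_contents($htmlPath, $html);

    exec(wkhtmltopdfCommand($htmlPath, $pdfPath), $output, $exitCode);

    unlink($htmlPath);
    unlink($base);

    if ($exitCode !== 0) {
        throw new RuntimeException('wkhtmltopdf failed with exit code ' . $exitCode);
    }
}
```

Since the converter renders the page much like a browser would, the HTML and CSS can be checked in a browser first.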
Otherwise, your only other options are going to involve drawing every aspect of the PDF or image yourself, "by hand" in your code.
