Converting images to pdfs server side - php

I'm currently working on a project involving converting form results to a downloadable PDF, which is simple enough. I was recently asked, however, to add attachment functionality. I'm using dompdf to convert the form results to PDF, but is there a way to convert the attachments separately (can be jpg, png, doc/x, or pdf) to a PDF file and then append the attachment file to the dompdf output?
I can handle the implementation; are there any free libraries that will support anything like this? I found FPDF, which supports images, but it does not support Word files.

First of all, you will need to find a library for every kind of conversion you need (you mentioned jpg, png, and doc/x, but you didn't say if that was all of them.)
For common office formats, you can launch a headless instance of OpenOffice or LibreOffice (headless meaning it can run on a server without a graphical display). You can then interact with it from various programming languages, or use a ready-made command-line tool such as PyODConverter to ask it to convert between file formats. This is the best way to convert doc and docx files to PDF, by the way, short of spending money on Microsoft software.
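A minimal sketch of the headless conversion step, assuming LibreOffice is installed and its `soffice` binary is on the PATH. The function name `buildSofficeCommand` and the example paths are hypothetical; `--headless`, `--convert-to` and `--outdir` are LibreOffice command-line options. The sketch only builds the command, so the caller decides when to run it.

```php
<?php
// Build the shell command that asks a headless LibreOffice to convert
// one document to PDF. Nothing is executed here.
function buildSofficeCommand(string $inputFile, string $outputDir): string
{
    // escapeshellarg() keeps uploaded file names from being interpreted by the shell.
    return sprintf(
        'soffice --headless --convert-to pdf --outdir %s %s',
        escapeshellarg($outputDir),
        escapeshellarg($inputFile)
    );
}

// Hypothetical usage: convert an uploaded attachment, then check the exit code.
// exec(buildSofficeCommand('/tmp/uploads/report.docx', '/tmp/pdf'), $output, $exitCode);
// if ($exitCode !== 0) { /* conversion failed */ }
```

LibreOffice writes the result into the output directory under the input's base name with a `.pdf` extension.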
As for "appending the attachment file", by which I take it you mean concatenating a bunch of PDF files together, you can use the free tool pdftk.
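The concatenation step could look roughly like this, assuming `pdftk` is installed on the server. The helper name `buildPdftkCatCommand` and the file names are hypothetical; `cat output` is pdftk's own syntax for joining documents in the order given.

```php
<?php
// Build a pdftk command that concatenates the given PDFs, in order,
// into a single output file. Nothing is executed here.
function buildPdftkCatCommand(array $inputPdfs, string $outputPdf): string
{
    if (count($inputPdfs) === 0) {
        throw new InvalidArgumentException('At least one input PDF is required.');
    }
    // Escape every path separately so spaces and quotes in names are safe.
    $inputs = implode(' ', array_map('escapeshellarg', $inputPdfs));
    return sprintf('pdftk %s cat output %s', $inputs, escapeshellarg($outputPdf));
}

// Hypothetical usage: the dompdf output first, then the converted attachment.
// exec(buildPdftkCatCommand(['form.pdf', 'attachment.pdf'], 'combined.pdf'), $out, $exitCode);
```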

Related

Convert to PDF using PHP

I need a simple solution for an issue I'm having. I have an application that allows users to upload a document (mostly Excel files, but it can be .docx/.doc).
I also am using jsignature to "esign" the document and save that to an image, and all that is working fine.
What I need is a way to convert the uploaded document to PDF, and then merge that with the newly created signature image. Thoughts?
FPDF and FPDI are good PHP libraries for reading and writing PDFs.
I've only ever done the uploading and signing of PDFs but there is probably some way to utilize one of those tools to work with doc/docx files.
http://www.setasign.de/products/pdf-php-solutions/fpdi/
http://www.fpdf.org

.docx, .xlsx, .pdf to .pdf using PHP

I have relatively sensitive data in .docx, .xlsx and PDF files that all need to be converted to a single PDF file locally. Sending these files off to phpdocx or Google Docs or anything like this is not an option.
The only other option I am seeing is OpenOffice / LibreOffice but I am not satisfied with how they are converting the documents.
Is there any other alternative anyone is aware of? Thanks!
Definitely a difficult task. The very recent release of LibreOffice 3.6 has fixes to its docx processing, if that might help, but you haven't specified what problems you actually encountered when you tried OpenOffice.
If you have time to experiment (and can bring in any tools/languages you need to get the job done), you could try LibreOffice to produce PDFs, then use one of the many PDF libs to stitch the PDFs into the single file you require.
You could also look at ODFConverter, which has traditionally been much better with DOCX than either OpenOffice or LibreOffice. This would let you go docx -> odt -> pdf. I think it can handle xlsx as well. Then do the PDF stitching again.
I suggest testing the stages manually at first and if promising, try something like JODConverter (requires Java) to allow you to automate the process via scripts.
Good luck.

Best way to convert files into pdf files using php

What is, in your opinion, the best way to convert uploaded files of any kind (.doc, .docx, ...) into a PDF file using nothing but PHP? Is it even possible to do so?
I looked at FPDF, but this creates the pdf files from text.
Another solution given previously was to use the PDFlib library on your server, but unfortunately, my server doesn't support this library...
What is the best way to convert the files my users upload on my site to PDF files?
A simpler approach would be to restrict uploads to the .pdf format programmatically and require your users to upload only .pdf files. Provide a link on the upload page to a free PDF printer (e.g. CutePDF Writer) that the user can install to create .pdf documents from any file that can be printed.
Trying to do it through PHP will be problematic because the uploads could be generated from many different programs that would be impossible to cater for in their entirety. e.g. How would it handle Scribus or ABC Flowcharter or any other 'non-standard' application someone used to create a document?
Much better to filter the upload upfront.
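Filtering the upload up front can be sketched as a cheap first-pass check on the file header, since PDF files normally begin with the bytes `%PDF-`. The function name `looksLikePdf` and the form field name in the usage comment are hypothetical, and this check does not prove the file is a valid PDF; it only rejects obvious non-PDF uploads.

```php
<?php
// Return true when the file starts with the "%PDF-" signature.
// This is a quick sanity check, not a full validation of the document.
function looksLikePdf(string $path): bool
{
    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return false; // unreadable or missing file
    }
    $header = fread($handle, 5);
    fclose($handle);
    return $header === '%PDF-';
}

// Hypothetical usage inside an upload handler:
// if (!looksLikePdf($_FILES['attachment']['tmp_name'])) {
//     http_response_code(415);
//     exit('Please upload a PDF file.');
// }
```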
Of the server-side PDF generators I have tried so far, the best is wkhtmltopdf, a WebKit-based, self-contained invisible browser that can render any HTML+CSS and generate a PDF from it. It is reasonably fast and fairly reliable, and has some useful PDF options, such as page size, orientation, etc.
The second part of the job in your case is to convert documents to HTML prior to feeding them to wkhtmltopdf. If possible, have your users upload the docs in HTML (Word and Co. can export (crappy) HTML). If this is not an option, you will have to find a tool just for that, which, in my opinion, is much easier than finding a tool that converts Word docs directly into PDF.
Another good thing about wkhtmltopdf is that you can feed it the output of your PHP script using the ob_xxx() functions.
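A minimal sketch of that idea, assuming `wkhtmltopdf` is installed on the server: the script's output is captured with `ob_start()` and `ob_get_clean()`, written to a temporary HTML file, and handed to the tool. The helper names `captureHtml` and `buildWkhtmltopdfCommand` are hypothetical; `--page-size` and `--orientation` are wkhtmltopdf options.

```php
<?php
// Capture everything a callback echoes, using PHP's output buffering.
function captureHtml(callable $render): string
{
    ob_start();
    $render();
    return (string) ob_get_clean();
}

// Build the wkhtmltopdf command for an HTML file; nothing is executed here.
function buildWkhtmltopdfCommand(string $htmlFile, string $pdfFile): string
{
    return sprintf(
        'wkhtmltopdf --page-size A4 --orientation Portrait %s %s',
        escapeshellarg($htmlFile),
        escapeshellarg($pdfFile)
    );
}

// Hypothetical usage:
// $html = captureHtml(function () { include 'report_template.php'; });
// file_put_contents('/tmp/report.html', $html);
// exec(buildWkhtmltopdfCommand('/tmp/report.html', '/tmp/report.pdf'), $out, $exitCode);
```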
PHPExcel is a simple way to create xls, xlsx, and PDF files with PHP (its sibling project PHPWord covers docx). Its clear documentation makes it a lot easier.
Use Microsoft Office to render Microsoft Office documents, if you care about accuracy at all. This is easily done by invoking Office over COM.
Get access to your server, and install what you need. Doing so would be far easier than monkeying around with sub-par solutions.
Well... I can think of one way of doing it quite easily, but it doesn't involve using PHP.
Upload your documents to a folder on your server that is browsable by your users.
EG: http://mysite.com/docs/
Then get your users to install a virtual printer driver such as Primo PDF
http://www.primopdf.com/index.aspx
then they can load the document into their browser, and print to PDF for offline browsing.
If this is not an option, and you're dealing with office documents that conform to the Open XML standard, you could attempt to parse the XML doc into a PHP page for display in the browser, then use JavaScript to trigger a print.
Unfortunately, it does still depend on your user having a PDF printer installed.
Alternatively, you could just load the docs natively, print to your own PDF printer, and then upload the PDFs to the web server for download.
I can't think of any easy way of doing this otherwise, without installing all sorts of different document parser tool-kits and doing a huge amount of behind the scenes work.

Is it possible to output formats other than .docx and .odt with TinyButStrong and OpenTBS plugin

I have a module which merges a document from database records and .docx or .odt document model.
I have to output .docx, .odt or .pdf. For outputting to Microsoft and Open formats, there is no problem, all works properly.
But what I want to know is, can I output to a format (like XML or HTML) which I can use to subsequently build a PDF document?
If I can't, are there any libraries which provide a merge document capability like:
DOCX (or ODT) + database record => PDF
And I don't want to use phplivedocx.
I successfully put a portable version of LibreOffice on my host's webserver, which I call with PHP to do a command-line conversion from .docx, etc. to PDF on the fly. I do not have admin rights on my host's webserver. Here is my blog post about what I did:
http://geekswithblogs.net/robertphyatt/archive/2011/11/19/converting-.docx-to-pdf-or-.doc-to-pdf-or-.doc.aspx
Yay! Convert directly from .docx or .odt to .pdf using PHP with LibreOffice (OpenOffice's successor)!
I don't know of any PHP library that does DOCX => PDF. In fact, converting DOCX to something else in PHP is still an open problem today. This is independent of how you made the DOCX.
But as you said, there are PHP libraries for HTML => PDF.
Html2Pdf is a well reputed PHP library that does HTML => PDF.
There is also DomPdf.
So if you can find a PHP library for DOCX => HTML, then it would work.
Of course this has some limitations: even though both PDF and DOCX are open formats, each has very specific features, each needs a heavy rendering process, and the vendors keep some of their tricks to themselves.
Converting DOCX to HTML is theoretically possible. There is Windows software by EpingSoft that does it. If you need to do it in PHP, some web articles explain how, but since I cannot find any PHP code doing this, I guess it is more theoretical than practical.
http://www.quepublishing.com/articles/article.aspx?p=691502
"How complicated that process would be depends on how much of Word's native formatting you need to preserve during the conversion."
If you want to try this way, it's good to know that OpenTBS lets you read the XML before and after the merge. It is based on a PHP class named TbsZip that can read any XML file in the DOCX, since a DOCX is in fact a zip archive.
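Because a DOCX is a zip archive, its XML can also be read with PHP's built-in `ZipArchive` extension. This sketch deliberately uses `ZipArchive` rather than TbsZip, whose API is not shown here; the function name `readDocxPart` is hypothetical, and `word/document.xml` is the usual location of the main document body.

```php
<?php
// Read one XML part out of a DOCX (zip) file.
// Returns null when the archive or the part cannot be read.
function readDocxPart(string $docxPath, string $partName = 'word/document.xml'): ?string
{
    $zip = new ZipArchive();
    if ($zip->open($docxPath) !== true) {
        return null; // not a readable zip archive
    }
    $xml = $zip->getFromName($partName);
    $zip->close();
    return $xml === false ? null : $xml;
}

// Hypothetical usage:
// $xml = readDocxPart('/tmp/uploads/template.docx');
// if ($xml !== null) { /* inspect or transform the XML here */ }
```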
It is also possible to use PDF files directly in TBS after decompressing them:
qpdf --qdf --object-streams=disable in.pdf out.pdf

Alternatives to ImageMagick for PDF downsizing

Having an issue with some PDF files not displaying properly in our iPad app. I have come to the conclusion that we need to standardize by "converting" PDF to PDF. I have successfully processed this using ImageMagick to convert the PDF to PNG (resized), and then pushing the PNG(s) back into a PDF. However, something within ImageMagick is making photos within PDFs display wrong. Same issue just converting a JPG or other graphic to PDF in ImageMagick. I solved that by taking the output of the converted ImageMagick file and converting it again using GD to PNG, then pushing it through our PDF converter.
So my question is this: What other PHP workflows would work with this, other than using ImageMagick for the conversion back to PDF? We are not opposed to a paid solution, we just need something that works. Our server runs CentOS.
My gut instinct, other than yelling at whoever wrote the PDF reader that you're suffering with, would be to convert the PDF to PostScript using pdftops, then convert it back into a PDF using Ghostscript. You can enable a number of document compatibility options at that point, which may make it more digestible.
While this may have side effects, they should be minimal. PDF's imaging model is derived from PostScript, and it looks like pdftops manages to avoid doing anything utterly stupid during the conversion process.
This may break or simply not work with advanced PDF features, like digital signatures or forms.
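The round trip described above could be sketched as two commands, assuming `pdftops` (from Poppler or Xpdf) and Ghostscript's `gs` are installed. The function name `buildNormalizeCommands` and the paths are hypothetical, and the compatibility level of 1.4 is only an example value, not a recommendation from the answer.

```php
<?php
// Build the two commands for a PDF -> PostScript -> PDF round trip.
// Nothing is executed here; run them in order and check each exit code.
function buildNormalizeCommands(string $inputPdf, string $tempPs, string $outputPdf): array
{
    return [
        // Step 1: PDF to PostScript.
        sprintf('pdftops %s %s', escapeshellarg($inputPdf), escapeshellarg($tempPs)),
        // Step 2: PostScript back to PDF with an explicit compatibility level.
        sprintf(
            'gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -sOutputFile=%s %s',
            escapeshellarg($outputPdf),
            escapeshellarg($tempPs)
        ),
    ];
}

// Hypothetical usage:
// foreach (buildNormalizeCommands('in.pdf', '/tmp/in.ps', 'out.pdf') as $command) {
//     exec($command, $out, $exitCode);
//     if ($exitCode !== 0) { break; }
// }
```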
