Conversion of complex DOCXs to PDFs in a PHP web environment

I am working on a PHP web application that programmatically generates DOCX files.
I want to convert these files to PDF, but their layout is so complex that none of the PHP PDF generator libraries (Dompdf, TCPDF, etc.) handles them well. Each one produces a poorly formatted PDF.
Given this, I have decided to let Google Drive do the conversion. For this, I have to:
Upload the DOCX files to Google Drive
Export them as PDF
I have read through the Google Drive API documentation, but I find it very hard to follow. I only want to run a single PHP script which:
Uploads the file to Google Drive
Downloads its exported PDF version
Offers the PDF for download when the script has finished
I am looking for the best way to achieve this, with or without Google Drive. The LibreOffice/OpenOffice command line is not an option, because my site runs on web hosting where I cannot install any software.

Have you considered using a file conversion service to do this for you?
In the interest of full transparency: I work for Zamzar (an online file conversion website). We recently released a developer API, https://developers.zamzar.com/, that lets you convert your DOCX files to PDF with little or no loss of formatting.
This would remove the need to convert your files through the Google Drive API. See our fairly extensive docs at https://developers.zamzar.com/docs

Related

Convert scanned pdf files to text-searchable pdf files

I want to convert scanned PDF files to text-searchable PDF files.
The input is a scanned PDF and the expected output is a searchable PDF.
There are a few tools that extract the text from a scanned PDF, but I want a text-searchable PDF as output, not just the text.
I searched and found one solution here, but my production server runs Amazon CentOS and that tool only installs on Ubuntu, not on Amazon CentOS.
I am ready to pay for it if required. Please point me to any open source or paid web API service, or any tool, that can convert to a text-searchable PDF.
I am using PHP in my web application.
There are several commercial web API services that will convert scanned PDFs (or scanned images generally) to searchable PDF. Of these, I would recommend trying ABBYY's Cloud OCR SDK. They have been in the OCR space for decades and use their own OCR engine. Based on my own observations and what I have heard from others, it tends to give better OCR results than APIs built on other technologies (e.g. Tesseract).

How can a video template be customized with user-supplied images/text on the fly using PHP

I am a website developer (PHP) and I have been given the task of developing a website similar to
http://ivipid.com/.
I need to build an equivalent site and I am trying to figure out how this can be done,
especially the part where user-uploaded images and text are inserted into the video file on the fly.
I know how to convert user-uploaded video files to FLV on the fly using FFmpeg, but I'm not sure how they (ivipid.com) manage to do this.
You can use Templater Bot from Dataclay to achieve this. From your PHP script, launch Templater's command line interface using a function call like exec(). Check out Dataclay's GitHub cli-tools repository for more information on how to work with its command line interface.
Templater Bot is built for Adobe After Effects. There is a basic workflow. First, you prepare an AE project to be compatible with Templater. This involves marking certain layers as dynamic, whether they are text layers or footage layers. Then you create a data source: this could be a JSON object array, or something as simple as a Google Sheet. Next, you set up a configuration file for the CLI launch script. The config file lets you specify output location, data source location, footage location, etc. Finally, launch Templater and you'll have the versioned content that your app needs.
Full disclosure: I am the lead developer of Templater at Dataclay.

Best way to convert files into pdf files using php

What is, in your opinion, the best way to convert uploaded files of any kind (.doc, .docx, ...) into a PDF file using nothing but PHP? Is it even possible to do so?
I looked at FPDF, but it creates PDF files from text.
Another solution suggested previously was to use the PDFlib library on the server, but unfortunately my server doesn't support this library.
What is the best way to convert the files my users upload on my site to PDF files?
A simpler approach would be to restrict uploads to PDF programmatically and require your users to upload only .pdf files. Provide a link on the upload page to a free PDF printer (e.g. CutePDF) that the user can install to create .pdf documents from any file that can be printed.
Trying to do it through PHP will be problematic because the uploads could be generated from many different programs that would be impossible to cater for in their entirety. e.g. How would it handle Scribus or ABC Flowcharter or any other 'non-standard' application someone used to create a document?
Much better to filter the upload upfront.
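As a rough illustration of filtering the upload up front, here is a minimal sketch in PHP. The function name looksLikePdf() is hypothetical; it only checks the file extension and the "%PDF-" signature at the start of the file, which does not prove the file is a valid PDF.

```php
<?php
// Hypothetical helper (not from the answer above): accept an upload only if
// it has a .pdf extension AND starts with the "%PDF-" file signature.
function looksLikePdf(string $originalName, string $path): bool
{
    $extension = strtolower(pathinfo($originalName, PATHINFO_EXTENSION));
    if ($extension !== 'pdf') {
        return false;
    }

    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return false;
    }

    // Read only the first five bytes; every PDF begins with "%PDF-".
    $header = fread($handle, 5);
    fclose($handle);

    return $header === '%PDF-';
}
```

In an upload handler this could be called as looksLikePdf($_FILES['document']['name'], $_FILES['document']['tmp_name']), where the form field name 'document' is an assumption.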
Of the server-side PDF generators I have tried so far, the best is wkhtmltopdf, a WebKit-based, self-contained headless browser that can render any HTML+CSS and generate a PDF from it. It is reasonably fast and fairly reliable, and has some useful PDF options such as page size and orientation.
The second part of the job in your case is to convert documents to HTML before feeding them to wkhtmltopdf. If possible, have your users upload the docs in HTML (Word and Co. can export (crappy) HTML). If this is not an option, you will have to find a tool just for that, which, in my opinion, is much easier than finding a tool that converts Word docs directly into PDF.
Another good thing about wkhtmltopdf is that you can feed it the output of your PHP script using the ob_*() output-buffering functions.
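The output-buffering idea might look roughly like the sketch below. It assumes a wkhtmltopdf binary is available on the PATH; the function names captureHtml() and htmlToPdf() are hypothetical. wkhtmltopdf accepts "-" as the input to read HTML from stdin and "-" as the output to write the PDF to stdout.

```php
<?php
// Capture everything a callable echoes, using PHP's output buffering.
function captureHtml(callable $render): string
{
    ob_start();
    try {
        $render();
    } finally {
        $html = ob_get_clean();
    }
    return $html;
}

// Pipe HTML into wkhtmltopdf on stdin and read the PDF back from stdout.
// Assumption: the binary is installed; returns null if the conversion fails.
function htmlToPdf(string $html, string $binary = 'wkhtmltopdf'): ?string
{
    $command = escapeshellcmd($binary) . ' --quiet - -';
    $pipes = [];
    $process = proc_open($command, [
        0 => ['pipe', 'r'], // stdin: the HTML
        1 => ['pipe', 'w'], // stdout: the PDF bytes
        2 => ['pipe', 'w'], // stderr: diagnostics
    ], $pipes);
    if (!is_resource($process)) {
        return null;
    }

    fwrite($pipes[0], $html);
    fclose($pipes[0]);
    $pdf = stream_get_contents($pipes[1]);
    fclose($pipes[1]);
    stream_get_contents($pipes[2]);
    fclose($pipes[2]);

    $exitCode = proc_close($process);
    return ($exitCode === 0 && $pdf !== '' && $pdf !== false) ? $pdf : null;
}
```

A page script could then call captureHtml() around its normal template output, pass the result to htmlToPdf(), and send the returned bytes with a Content-Type: application/pdf header.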
PHPExcel is the best simple way to create doc, docx, xls, xlsx and pdf files with PHP. It is a lot easier to use and has clear documentation.
Use Microsoft Office to render Microsoft Office documents, if you care about accuracy at all. This is easily done by invoking Office over COM.
Get access to your server, and install what you need. Doing so would be far easier than monkeying around with sub-par solutions.
Well... I can think of one way of doing it quite easily, but it doesn't involve using PHP.
Upload your documents to a folder on your server that is browsable by your users,
e.g. http://mysite.com/docs/
Then get your users to install a virtual printer driver such as PrimoPDF
http://www.primopdf.com/index.aspx
They can then load the document into their browser and print it to PDF for offline browsing.
If this is not an option, and you're dealing with Office documents that conform to the Open XML standard, you could attempt to parse the XML document into a PHP page for display in the browser, then use JavaScript to trigger a print.
Unfortunately, it does still depend on your user having a PDF printer installed.
Alternatively, you could just load the docs natively, print them to your own PDF printer, and then upload the PDFs to the web server for download.
I can't think of any easy way of doing this otherwise, without installing all sorts of different document parser tool-kits and doing a huge amount of behind the scenes work.
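For the Open XML parsing idea mentioned above, a very rough sketch follows. It assumes PHP's dom and zip extensions are available, extracts plain paragraph text only (no layout, tables, images or styles), and uses the hypothetical function names docxXmlToParagraphs() and docxToHtml().

```php
<?php
// Extract the text of each paragraph (w:p) from the XML of a .docx
// main document part (word/document.xml). Formatting is ignored.
function docxXmlToParagraphs(string $documentXml): array
{
    if ($documentXml === '') {
        return [];
    }
    $ns = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';
    $dom = new DOMDocument();
    if (!@$dom->loadXML($documentXml)) {
        return [];
    }

    $paragraphs = [];
    foreach ($dom->getElementsByTagNameNS($ns, 'p') as $paragraph) {
        $text = '';
        // A paragraph's text is split across one or more w:t elements.
        foreach ($paragraph->getElementsByTagNameNS($ns, 't') as $run) {
            $text .= $run->textContent;
        }
        $paragraphs[] = $text;
    }
    return $paragraphs;
}

// A .docx file is a ZIP archive; read word/document.xml and emit simple HTML.
function docxToHtml(string $docxPath): string
{
    $zip = new ZipArchive();
    if ($zip->open($docxPath) !== true) {
        return '';
    }
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    if ($xml === false) {
        return '';
    }

    $html = '';
    foreach (docxXmlToParagraphs($xml) as $paragraph) {
        $html .= '<p>' . htmlspecialchars($paragraph) . "</p>\n";
    }
    return $html;
}
```

The resulting page could then be printed from the browser, for example by calling window.print() from JavaScript.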

Excel to PDF in PHP

I need to convert an Excel (.xls) spreadsheet containing an image to a PDF document in PHP. If there is a library available, please post the link.
Note - I have converted Excel (.xls) to PDF with the "PHPExcel" library, but the output has no image and no borders.
If PHPExcel can't do it, then you're stumped for a straight PHP solution, and might have to look at options like COM.
You don't mention what your problem is with the borders, and these have been a problem for some time in PHPExcel... the 1.7.6 version of PHPExcel resolved some of these issues, and there is a patch listed in the Issues section of the PHPExcel site that fixes some other problems with borders.
You can convert XLS files to PDF on Linux by installing OpenOffice
with a PDF writer as the default printer driver.
Then, you can call OpenOffice (from PHP) using the "-p" command-line
parameter, which will cause it to load a designated file and print it.
For example, if your file was "accounts.xls" you would call the
following command:
soffice -p accounts.xls
OpenOffice would load the "accounts.xls" file and "print" it to the
PDF writer, which would be configured to save the PDF document to the
desired filename.
GhostScript is a suitable PDF writer.
The OpenOffice setup guide describes how to install and configure
printer drivers using the "spadmin" utility, and discusses the use of
ghostscript as a PDF writer:
"Open Office Setup Guide - Appendix"
http://www.openoffice.org/docs/setup_guide/appendix.html
You can call OpenOffice from PHP by using the backtick execution
operator, or the "exec" function. You may also need to use PHP to move
and/or rename the resulting PDF files:
PHP: Program Execution Functions
http://www.php.net/manual/en/ref.exec.php
PHP: Filesystem: Rename
http://www.php.net/manual/en/function.rename.php
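As a sketch only, the call from PHP described above could look like this. It assumes soffice is on the PATH and that a PDF writer has already been configured as the default printer; the path where that printer drops its output ($producedPdf) depends entirely on that configuration and is an assumption, as are the function names.

```php
<?php
// Build the print command described above: soffice -p <file>.
// escapeshellarg() protects against spaces and shell metacharacters.
function buildPrintCommand(string $spreadsheet): string
{
    return 'soffice -p ' . escapeshellarg($spreadsheet);
}

// Run the command, then move the PDF that the pre-configured PDF writer
// produced to its final name. $producedPdf is an assumption about where
// the printer driver saves its output.
function printToPdf(string $spreadsheet, string $producedPdf, string $targetPdf): bool
{
    exec(buildPrintCommand($spreadsheet), $output, $exitCode);
    if ($exitCode !== 0 || !is_file($producedPdf)) {
        return false;
    }
    return rename($producedPdf, $targetPdf);
}
```

For the example file, buildPrintCommand('accounts.xls') yields the same soffice -p command as shown above, with the file name quoted for the shell.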
OpenOffice is pretty good at processing XLS files, but it may not
perfectly render every such file - so if you need the ultimate in
compatibility you will have to use Microsoft Excel on a Windows
Platform or emulator. "IT AsiaOne" looked at several alternatives to
Microsoft Office (including OpenOffice) and wrote that "while none of
the alternative suites promise ... full compatibility with Microsoft
Office-created documents, in general, they do a decent job of
translating Microsoft ".doc", ".ppt" and ".xls" file formats":
IT AsiaOne - Specials - Yours For The Picking
http://it.asia1.com.sg/specials/mmedia20020724_001.html
Additional links:
OpenOffice.org Home Page
http://www.openoffice.org/
Ghostscript Home Page
http://www.cs.wisc.edu/~ghost/
PHP Home Page
http://www.php.net/
Google search strategy:
openoffice scripting pdf linux
http://www.google.com/search?q=openoffice%20scripting%20pdf%20linux
openoffice print "command line"
http://www.google.com/search?q=openoffice%20print%20%22command%20line%22
followed by a search for "command line parameters" from the
openoffice.org home page.

converting a given file to PDF

Hey all,
Is there any way to convert a given file (of any type) into a PDF file in .NET or PHP?
E.g. suppose there is an upload link for files of any type (Word, Excel, AutoCAD, images, ...), and once the upload button is clicked, the uploaded file should be converted into a PDF.
I checked out FPDF, but as far as I know it cannot convert all file types. A module that plugs into the CMS would also be fine.
FPDF does support images. I know because I have used it recently.
If you want a pure PHP solution, you can use the PHP COM functions along with Word or Excel on the server to open those files and then copy the data out.
If I were you though, I would use Google. Load the doc into Google Docs then export it as a new format with the API.
There are a couple of 3rd party solutions available such as this one, which is optimised for use on the server and accessible from any web services capable environment, including .net. Supports loads of file types including MS-Office based documents.
Disclaimer, I worked on this product so consider me biased. Having said that, it works very well.
