openoffice document (odt) to PDF with command line on Linux? - php

We are building a PHP script at work that creates reports as PDFs.
The reports will be created using templates from PostgreSQL.
So far I have found that this can be done with PHP and ODT (OpenOffice) files [http://www.odtphp.com/] (do you have any other suggestions?).
Now, how can I convert the results to PDF so that teachers get the final reports as PDFs?
Any tips? The server has no GUI and I want to keep this as simple as possible.
We tried generating PDFs directly from PHP with FPDF [http://www.fpdf.org/], but it is a real CPU killer!

http://www.artofsolving.com/opensource/pyodconverter
This may help. It requires OpenOffice to be running as a service; the Python script merely uses its API, so you may be able to write an equivalent in PHP too.
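If running OpenOffice as a service is more setup than you want, a simpler route may be to call the office binary once per document. The sketch below is only an illustration under assumptions: it presumes LibreOffice is installed and that its `soffice` binary is on the PATH, and the function names and the `/tmp/reports` directory are made up for the example. It is written in Python, like pyodconverter; the same argument list could be passed to PHP's exec().

```python
import subprocess
from pathlib import Path


def build_convert_command(source, out_dir, binary="soffice"):
    """Return the argument list for a headless ODT-to-PDF conversion.

    Assumption: `binary` is a LibreOffice executable found on the PATH.
    """
    return [
        binary,
        "--headless",            # no GUI, suitable for a server without X
        "--convert-to", "pdf",   # target format
        "--outdir", str(out_dir),
        str(source),
    ]


def convert_to_pdf(source, out_dir, timeout=120):
    """Run the conversion and return the path where the PDF is expected."""
    subprocess.run(build_convert_command(source, out_dir), check=True, timeout=timeout)
    return Path(out_dir) / (Path(source).stem + ".pdf")


if __name__ == "__main__":
    # Only prints the command; call convert_to_pdf() on a machine with LibreOffice.
    print(build_convert_command("report.odt", "/tmp/reports"))
```

Starting the office suite for every file is slower than talking to a long-running service, so for large batches the service approach above may still be the better fit.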

Related

Automate XML parsing and converting docx to pdf

I have not programmed for many years, but I need to automate the following process.
A government medicines authority publishes an XML file on its website.
I need to download it, parse it, and pick out one of the fields, which contains a URL to a docx file.
I then need to store that file on our local filesystem as a PDF.
The process needs to be repeated every n days.
I used to know PHP quite well, but would that be OK for this task? Would Python be better?
I don't have a server at work, so I was thinking of getting a Raspberry Pi.
What would you suggest as a way to go about this?
I have a few ideas: use wget or curl from a cron job to get the XML file, then use PHP, Python or bash to parse it, fetch the docx with wget or curl, and then use a PDF command-line tool. If the results were to go on a website, should I load them into a SQL database or just list them as files in a directory?
I would appreciate any ideas.
Martin
Personally, I would go with node.js. It is easy to set up a node server on a Raspberry Pi, and node.js has a library for just about anything. There are a lot of simple setup tutorials out there, and SO has a lot of info on topics like XML parsing in node. JavaScript is pretty easy to code in.
For example, if you need a docx converter, here is one: mammoth.js
Good luck!
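If you go with Python instead, as the question considers, the parsing step needs nothing beyond the standard library. This is a minimal sketch, not a finished script: the `document_url` tag name, the sample XML, and the example.org URLs are all invented for illustration, since the real feed's structure isn't shown in the question. Downloading and the docx-to-PDF conversion are left out.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath
from urllib.parse import urlparse


def extract_docx_urls(xml_text, tag="document_url"):
    """Collect the text of every <tag> element that points at a .docx file.

    Assumption: the tag name is hypothetical; adjust it to the real feed.
    """
    root = ET.fromstring(xml_text)
    urls = []
    for element in root.iter(tag):
        url = (element.text or "").strip()
        if url.lower().endswith(".docx"):
            urls.append(url)
    return urls


def pdf_name_for(url):
    """Derive a local PDF file name from the last path segment of the URL."""
    return PurePosixPath(urlparse(url).path).stem + ".pdf"


# Invented sample standing in for the authority's XML file.
SAMPLE = """<medicines>
  <medicine><name>A</name><document_url>https://example.org/docs/a.docx</document_url></medicine>
  <medicine><name>B</name><document_url>https://example.org/docs/b.pdf</document_url></medicine>
</medicines>"""

if __name__ == "__main__":
    for url in extract_docx_urls(SAMPLE):
        print(url, "->", pdf_name_for(url))
```

A cron entry can then run the script every n days; keeping the PDFs as plain files in a directory is probably enough unless you need to search them.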

Convert docx to pdf in PHP

Right, first a little background that will help put this all in focus.
I have several indd (InDesign) files. I can convert these to PDF and then to docx.
Using the PHPWord library I can then effectively do a mail merge and replace several areas of my document with text and one image.
I then want to convert the result to a PDF, so that I can stitch several PDFs together for printing with Ghostscript.
I have a Word macro that I can execute just fine from the standard command line. If I try that same command line from PHP, it just hangs.
I've tried various forms of that, using system, exec, passthru and PsExec. They all either hang and then time out, or don't work and skip through.
I've seen other examples that use COM objects, like this one:
http://www.sitepoint.com/make-microsoft-word-documents-php/
They all either hang or give me problems with the COM object that I'm trying to create.
Am I trying for the impossible, or is there perhaps another way?
I've also given e-PDF Document Converter v2.1 a go, but without success.
Currently I'm thinking that there is some permissions issue going on, but I'm really at a loss as to how to get around it or what to do.
I would like to use either LibreOffice or OpenOffice, as they both seem to have command-line tools, but when I open the resulting PDF or doc file it displays very poorly.
Any help?
Thanks
Richard
Update
I'm thinking that maybe I'll just stitch the Word documents together and let the user download the result, and then they can print it.
Job done, easy!
But if there is a better way, I'm open to it.
Update 2
This is on a Windows platform.
Maybe something like the following?
sudo apt-get install unoconv
doc2pdf respondus-docx-sample-file.docx
In PHP:
exec('doc2pdf ' . escapeshellarg($yourDocxFile));
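Since the question describes commands that hang when launched from a script, it may help to run the converter with an explicit timeout. The sketch below is written in Python purely as an illustration, and it makes several assumptions: that `unoconv` and Ghostscript (`gs`) are installed under those names, as on a typical Linux box (on Windows the binaries are named differently), and that `unoconv -f pdf` is the long form of the `doc2pdf` shortcut above. The helper names are made up.

```python
import subprocess


def unoconv_command(source, fmt="pdf"):
    """Argument list for converting one document with unoconv."""
    return ["unoconv", "-f", fmt, str(source)]


def gs_merge_command(pdf_paths, output):
    """Argument list for stitching several PDFs into one with Ghostscript."""
    return [
        "gs", "-dBATCH", "-dNOPAUSE", "-q",
        "-sDEVICE=pdfwrite",
        "-sOutputFile=" + str(output),
        *map(str, pdf_paths),
    ]


def run(command, timeout=60):
    """Run a command without a shell.

    Raises subprocess.TimeoutExpired instead of hanging forever, and
    subprocess.CalledProcessError if the tool exits with a non-zero status.
    """
    return subprocess.run(command, check=True, timeout=timeout)


if __name__ == "__main__":
    # Only prints the commands; call run() on a machine with the tools installed.
    print(unoconv_command("letter.docx"))
    print(gs_merge_command(["letter.pdf", "cover.pdf"], "print-job.pdf"))
```

Passing the arguments as a list avoids quoting problems with file names that contain spaces, which is the same concern escapeshellarg() addresses in PHP.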

Exporting SVG to PDF in a offline TideSDK webapp

I have an offline HTML5/CSS/JS app built with TideSDK in which a bar chart is drawn with Highcharts as an SVG "tag" using data entered by the user. I need to export this chart to a PDF document, which will also contain text and tables.
As it is an offline app, I can't use the export module included in Highcharts (except the getSVG() method) or other solutions like DocRaptor.
I'm open to using another JS plugin for drawing the chart, but I really love the "look and feel" and the features of Highcharts graphs.
As you may know, with TideSDK I can embed Python, PHP or Perl scripts/modules in my app (I would prefer to avoid Perl as I've never used it).
The other limitation is that I cannot ask the end users to install any software other than mine, so I can't use wkhtmltopdf with PHP, unless I manage to install it through my app in a transparent process (I'm not sure that is particularly easy to do).
After searching for several days, my current idea is to use the CairoSVG Python module to export the graph to a first PDF. Then I will find a JS (jsPDF) or Python tool to include this PDF in the final PDF containing text and tables.
I will start testing this solution soon and let you know whether it worked. Nevertheless, if any of you have already dealt with a similar problem, I will be very happy to hear your solutions.
The app will initially run on Windows and should be adapted to MacOS, Linux, Android and iOS in later phases.
There is a simple command-line tool, wkhtmltopdf, that converts HTML pages to PDF files. It's not strictly a Python/JS solution, but you can call os.system() or something similar from Python to use it. It helped me with my Python program and it is very simple to use. The command is "wkhtmltopdf.exe inputname outputname" and you get what you want. And it's free software. Site: https://code.google.com/p/wkhtmltopdf/
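For the "something similar" part, subprocess is usually a bit more convenient than os.system() because it gives you the exit code and the error output without going through a shell. A rough sketch, assuming the wkhtmltopdf binary is bundled with the app or on the PATH; the function names and file names are only examples:

```python
import subprocess


def wkhtmltopdf_command(source, output, binary="wkhtmltopdf"):
    """Argument list matching the 'wkhtmltopdf inputname outputname' form."""
    return [binary, str(source), str(output)]


def html_to_pdf(source, output, binary="wkhtmltopdf", timeout=60):
    """Convert an HTML page to PDF and raise if the tool reports a failure."""
    completed = subprocess.run(
        wkhtmltopdf_command(source, output, binary),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if completed.returncode != 0:
        raise RuntimeError("wkhtmltopdf failed: " + completed.stderr.strip())
    return output


if __name__ == "__main__":
    # Only prints the command; html_to_pdf() needs the binary to be present.
    print(wkhtmltopdf_command("chart.html", "chart.pdf", binary="wkhtmltopdf.exe"))
```

On Windows, `binary` could point at a wkhtmltopdf.exe shipped inside the app's own folder, which might get around asking users to install anything separately.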

Converting doc, docx, pdf to HTML using PHP linux

I run a job search site, and I need to convert doc, docx and PDF files into HTML on a Linux CentOS server running PHP. People submit these files as resumes. So far, I have found PHPDocx to be great at converting docx to HTML, but I am stuck on doc and PDF. PDFTOHTML gives the error "bad color" when I run tests. As for doc, I have only found wvWare, which seems complex and bulky to install.
Does anyone have any ideas on how to easily convert doc/PDF to HTML?
The only thing I can think of is FPDF.
It is intended for creating PDF files in PHP, but it can also open PDF files.
Maybe you can use that as a base and develop some sort of toHTML function for it.
It is completely free to use and it already has some extensions.
It MIGHT help you.
http://www.fpdf.org
EDIT:
Thanks to Pierre for this addition to my post in the comments:
You can use FPDI: http://www.setasign.de/products/pdf-php-solutions/fpdi but the input PDF is treated just like an image.
I haven't taken a look at it myself so far, but this might help.
As far as .doc files go, how about trying OpenOffice/LibreOffice? Something like:
lowriter --convert-to html doc_file.doc
As far as PDF goes, if the PDF is a graphical representation of text then you're out of luck; the best you can do is try to convert it to an image with ImageMagick. If it contains proper text, it should convert easily.
There are various tools out there already that do this, such as http://dag.wieers.com/home-made/unoconv/ and http://www.phpdocx.com/ (which you've already tried).
http://www.phplivedocx.org/2009/08/13/convert-docx-doc-rtf-to-html-in-php/ looks promising.
Or you could install a portable version of LibreOffice on your server, which allows command-line conversion:
https://help.libreoffice.org/Common/Starting_the_Software_With_Parameters
I'm sure there will be tutorials out there (in the LibreOffice support area).
To easily convert PDF to HTML, I would suggest pdf2htmlEX, which produces outstanding HTML and is fast enough for converting at runtime. You should first put some effort into optimizing and building it for your system. There is a simple build how-to included on the project page.
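Putting the suggestions in this thread together, one way to handle mixed uploads is to pick the converter by file extension. The following is only a sketch in Python to show the idea (the same branching works in PHP). It assumes `lowriter` and `pdf2htmlEX` are installed under those names, and the function name and paths are invented for the example.

```python
from pathlib import Path


def html_conversion_command(source, out_dir):
    """Choose a command line for converting an uploaded resume to HTML.

    Assumption: LibreOffice handles .doc/.docx and pdf2htmlEX handles .pdf.
    """
    suffix = Path(source).suffix.lower()
    if suffix in (".doc", ".docx"):
        return [
            "lowriter", "--headless",
            "--convert-to", "html",
            "--outdir", str(out_dir),
            str(source),
        ]
    if suffix == ".pdf":
        return ["pdf2htmlEX", "--dest-dir", str(out_dir), str(source)]
    raise ValueError("unsupported resume format: " + (suffix or "(none)"))


if __name__ == "__main__":
    # Only prints the chosen commands; nothing is executed here.
    print(html_conversion_command("resume.doc", "/tmp/html"))
    print(html_conversion_command("resume.PDF", "/tmp/html"))
```

Rejecting unknown extensions up front keeps arbitrary uploaded files from ever being handed to the converters.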

Altering a PDF document in PHP

Is it possible to convert a PDF document to HTML or text and then edit some text of the html/text file and recreate the PDF, all in a PHP script?
Have a look at http://www.pdflib.com/download/
I have never seen or attempted to do this.
I find it better/cheaper to do the following:
Write the component that does the image-intensive processing on a platform like .NET or Java. The application has to be console-based and return/print the relevant info (whether there was an error in processing, etc.).
Call that command from your PHP web application.
PHP is a web language and is best used for that purpose.
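The calling side of that pattern is small: start the console tool, wait for it, then look at the exit code and what it printed. Here is a rough sketch in Python (PHP's exec() gives you the same two pieces of information). The two "tools" are stand-ins built from the Python interpreter itself so the example runs anywhere; a real setup would call the .NET or Java program instead.

```python
import subprocess
import sys


def run_console_tool(command, timeout=30):
    """Run a console program and return (succeeded, stdout, stderr)."""
    completed = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
    return completed.returncode == 0, completed.stdout.strip(), completed.stderr.strip()


# Hypothetical stand-ins for the external processing component.
OK_TOOL = [sys.executable, "-c", "print('processed 3 pages')"]
FAILING_TOOL = [
    sys.executable, "-c",
    "import sys; sys.stderr.write('bad input'); sys.exit(2)",
]

if __name__ == "__main__":
    print(run_console_tool(OK_TOOL))
    print(run_console_tool(FAILING_TOOL))
```

Having the tool signal failure through a non-zero exit code, with details on stderr, keeps the web application's error handling simple.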
