I want to generate a thumbnail (of the first page) for the following file formats:
PDF
DOC/DOCX [MS OFFICE]
PPT/PPTX [MS OFFICE]
For PDF I found many libraries, and ImageMagick and Ghostscript did the job for me.
But for the other formats (ppt, pptx, doc and docx) I can't find any lead to a solution.
The preferred language is PHP, but I'm open to any language that can run on Linux. Thanks a lot.
You can use a service like Post2Preview. It can generate thumbnails and OCR for hundreds of file types and doesn't require any third party libraries. Just a simple POST request. Disclaimer: I work for Post2Preview
I'm currently working on a project involving converting form results to a downloadable PDF, which is simple enough. I was recently asked, however, to add attachment functionality. I'm using dompdf to convert the form results to PDF, but is there a way to convert the attachments separately (can be jpg, png, doc/x, or pdf) to a PDF file and then append the attachment file to the dompdf output?
I can handle the implementation; are there any free libraries that will support anything like this? I found FPDF, which supports images, but it does not support Word files.
First of all, you will need to find a library for every kind of conversion you need (you mentioned jpg, png, and doc/x, but you didn't say if that was all of them.)
For common office formats, you can launch a headless instance of OpenOffice or LibreOffice (headless meaning it can run on a server without a graphical display). Then you can interact with it from various programming languages, or you can use a ready-made command-line tool such as pyodconverter to ask it to convert between various file formats. This is the best way to convert doc and docx files to PDF, by the way, short of spending money on Microsoft software.
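As a rough illustration of the headless conversion described above, here is a minimal Python sketch. It assumes a LibreOffice install whose `soffice` binary is on the PATH and accepts the `--headless`, `--convert-to` and `--outdir` options; the function names are made up for this example. The same argument list could be passed to `exec` from PHP.

```python
import subprocess
from pathlib import Path


def build_convert_command(source, out_dir, binary="soffice"):
    """Return the argv list for a headless LibreOffice conversion to PDF."""
    return [
        binary,
        "--headless",            # run without a graphical display
        "--convert-to", "pdf",   # target format
        "--outdir", str(out_dir),
        str(source),
    ]


def convert_to_pdf(source, out_dir, binary="soffice", timeout=120):
    """Run the conversion and return the path where the PDF is expected."""
    subprocess.run(
        build_convert_command(source, out_dir, binary),
        check=True,        # raise if soffice exits with an error
        timeout=timeout,   # assumption: two minutes is enough for one document
    )
    # LibreOffice names the output after the stem of the input file.
    return Path(out_dir) / (Path(source).stem + ".pdf")
```

Keeping the command construction separate from the call that runs it makes the command easy to log and to check without LibreOffice installed.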
As for "appending the attachment file", by which I take it you mean concatenating a bunch of PDF files together, you can use the free tool pdftk.
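A minimal sketch of the concatenation step, assuming `pdftk` is installed and on the PATH. The helper names and the example file names are hypothetical; the `cat output` syntax is pdftk's own.

```python
import subprocess


def build_concat_command(inputs, output, binary="pdftk"):
    """Return the argv list that joins the input PDFs, in order, into one file."""
    if not inputs:
        raise ValueError("need at least one input PDF")
    return [binary, *[str(p) for p in inputs], "cat", "output", str(output)]


def concat_pdfs(inputs, output):
    """Run pdftk; raises CalledProcessError if it fails."""
    subprocess.run(build_concat_command(inputs, output), check=True)
```

For the form-plus-attachment case, something like `concat_pdfs(["form.pdf", "attachment.pdf"], "combined.pdf")` would put the dompdf output first and the converted attachment after it.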
When users upload certain files to my site (such as .doc, .xls, .pdf, etc) I'd like to be able to generate a preview thumbnail (of the first page of the document). I'm working with PHP in a LAMP stack but would be happy with any library or command-line tool that can do the job (Linux highly preferred).
It's not easy to convert certain document formats to an image. PHP alone cannot do this.
The 'proper' way to do this is, first of all, to install a program on your server that can open documents in that format.
For example, for .doc documents you can use OpenOffice;
it can also open most other document formats.
You then need to set up OpenOffice to work in 'headless' mode, sending its output to a virtual display (Xvfb is what you are going to need on Linux).
Your PHP script will then call OpenOffice, passing the path to the uploaded doc. OpenOffice will actually open that doc. Then you need to create an image from the screen buffer. You can use ImageMagick for that.
Then, once you have the capture of your screen, you can resize it to a thumbnail.
Look at this link for more details:
http://www.mysql-apache-php.com/website_screenshot.htm
The best way is to have all your documents converted to PDF;
after that you can make a preview thumbnail.
How to do that is explained simply here:
How do I convert a PDF document to a preview image in PHP?
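Once a document is a PDF, rendering its first page is a single Ghostscript call. The sketch below is one possible way to do it, assuming the `gs` binary is on the PATH; the function name and the default resolution are choices made for this example, not taken from the linked answer.

```python
import subprocess


def build_first_page_command(pdf_path, png_path, dpi=72, binary="gs"):
    """Return the argv list that renders page 1 of a PDF to a PNG."""
    return [
        binary,
        "-dBATCH", "-dNOPAUSE", "-dQUIET",   # run non-interactively
        "-sDEVICE=png16m",                   # 24-bit colour PNG output
        "-dFirstPage=1", "-dLastPage=1",     # only the first page
        f"-r{dpi}",                          # lower resolution gives a smaller image
        f"-sOutputFile={png_path}",
        str(pdf_path),
    ]


def render_first_page(pdf_path, png_path, dpi=72):
    """Run Ghostscript; raises CalledProcessError if rendering fails."""
    subprocess.run(build_first_page_command(pdf_path, png_path, dpi), check=True)
```

The resulting PNG can then be scaled down to the final thumbnail size with ImageMagick or any image library.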
What is, according to you, the best way to convert uploaded files of any kind (.doc, .docx,...) into a pdf-file using nothing but php. Is it even possible to do so?
I looked at FPDF, but this creates the pdf files from text.
Another solution given previously was to use the PDFlib library on the server, but unfortunately my server doesn't support this library...
What is the best way to convert the files my users upload on my site to PDF files?
A simpler approach would be to restrict uploads to .PDF format programmatically and require your users to only upload .pdf files. Provide a link on the upload page to a free PDF printer (e.g. CutePDF Writer) that the user can install to create .pdf documents from any file that can be printed.
Trying to do it through PHP will be problematic because the uploads could be generated from many different programs that would be impossible to cater for in their entirety. e.g. How would it handle Scribus or ABC Flowcharter or any other 'non-standard' application someone used to create a document?
Much better to filter the upload upfront.
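One way to filter uploads up front is to check both the file name and the first bytes of the content, since a PDF file begins with the marker `%PDF-`. This is a minimal sketch; the function name is hypothetical, and a renamed non-PDF file would pass an extension check alone, which is why the content is inspected too.

```python
PDF_MAGIC = b"%PDF-"


def is_pdf_upload(filename: str, head: bytes) -> bool:
    """Accept an upload only if its name and its leading bytes both say PDF.

    `head` is the first few bytes of the uploaded file.
    """
    has_pdf_name = filename.lower().endswith(".pdf")
    has_pdf_magic = head.startswith(PDF_MAGIC)
    return has_pdf_name and has_pdf_magic
```

This only rejects obvious mismatches; it does not validate that the rest of the file is a well-formed PDF.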
The best server-side PDF generator from those I tried was, so far, wkhtmltopdf, a WebKit-based, self-contained invisible browser that can render any HTML+CSS and generate a PDF from it. Reasonably fast and fairly reliable, has some useful PDF options, such as page size, orientation, etc.
The second part of the job in your case is to convert documents to HTML prior to feeding them to wkhtmltopdf. If possible, have your users upload the docs in HTML (Word and Co. can export (crappy) HTML). If this is not an option, you will have to find a tool just for that, which, in my opinion, is much easier than finding a tool that converts Word docs directly into PDF.
Another good thing about wkhtmltopdf is that you can feed the output of your PHP script to it using the ob_xxx() functions.
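A sketch of feeding generated HTML straight to wkhtmltopdf, assuming the `wkhtmltopdf` binary is on the PATH. Passing `-` as the input makes it read the HTML from standard input, which is the same idea as piping a buffered PHP page into it. The function names are made up for this example; `--page-size` and `--orientation` are the page options mentioned above.

```python
import subprocess


def build_wkhtmltopdf_command(output, page_size="A4", orientation="Portrait",
                              binary="wkhtmltopdf"):
    """Return the argv list; '-' tells wkhtmltopdf to read HTML from stdin."""
    return [
        binary,
        "--page-size", page_size,
        "--orientation", orientation,
        "-",              # input: standard input
        str(output),      # output: PDF file path
    ]


def html_to_pdf(html, output, **options):
    """Render an HTML string to a PDF file."""
    subprocess.run(
        build_wkhtmltopdf_command(output, **options),
        input=html.encode("utf-8"),
        check=True,
    )
```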
PHPExcel is the simplest way to create xls, xlsx and PDF files with PHP (for doc and docx files there is the related PHPWord library). It's a lot easier thanks to its clear documentation.
Use Microsoft Office to render Microsoft Office documents, if you care about accuracy at all. This is easily done by invoking Office over COM.
Get access to your server, and install what you need. Doing so would be far easier than monkeying around with sub-par solutions.
Well... I can think of one way of doing it quite easily, but it doesn't involve using PHP.
Upload your documents to a folder on your server that is browsable by your users.
E.g.: http://mysite.com/docs/
Then get your users to install a virtual printer driver such as Primo PDF
http://www.primopdf.com/index.aspx
then they can load the document into their browser, and print to PDF for offline browsing.
If this is not an option, and you're dealing with office documents that conform to the OpenXML standard, you could attempt to parse the XML doc into a PHP page for display in the browser, then use JavaScript to trigger a print.
Unfortunately, it does still depend on your user having a PDF printer installed.
Alternatively, you could just load the docs natively, print them to your own PDF printer, then upload the PDFs to the web server for download.
I can't think of any easy way of doing this otherwise, without installing all sorts of different document parser tool-kits and doing a huge amount of behind the scenes work.
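To give an idea of what the OpenXML parsing mentioned above involves: a .docx file is a ZIP archive whose main text lives in `word/document.xml`. The following sketch pulls out the plain text of each paragraph and ignores all formatting, images and tables; the function name is hypothetical and this is only a starting point, not a renderer.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(path_or_file):
    """Return the text of each paragraph in a .docx file, in document order."""
    with zipfile.ZipFile(path_or_file) as archive:
        xml_bytes = archive.read("word/document.xml")
    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph is split into runs; join the text nodes of all runs.
        text = "".join(node.text or "" for node in para.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return paragraphs
```

Each returned string could be written into a page as a paragraph; reproducing the document's layout would take considerably more work, as noted above.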
I need to convert an Excel(.xls) spreadsheet to a PDF document with an image in PHP. If there is a library available please put the link.
Note - I have created excel(.xls) to PDF with "PHPExcel" library but my output is without image and border.
If PHPExcel can't do it, then you're stumped for a straight PHP solution, and might have to look at options like COM.
You don't mention what your problem is with the borders, and these have been a problem for some time in PHPExcel... the 1.7.6 version of PHPExcel resolved some of these issues, and there is a patch listed in the Issues section of the PHPExcel site that fixes some other problems with borders.
You can convert XLS files to PDF on Linux by installing OpenOffice
with a PDF writer as the default printer driver.
Then, you can call OpenOffice (from PHP) using the "-p" command-line
parameter, which will cause it to load a designated file and print it.
For example, if your file was "accounts.xls" you would call the
following command:
soffice -p accounts.xls
OpenOffice would load the "accounts.xls" file and "print" it to the
PDF writer, which would be configured to save the PDF document to the
desired filename.
GhostScript is a suitable PDF writer.
The OpenOffice setup guide describes how to install and configure
printer drivers using the "spadmin" utility, and discusses the use of
ghostscript as a PDF writer:
"Open Office Setup Guide - Appendix"
http://www.openoffice.org/docs/setup_guide/appendix.html
You can call OpenOffice from PHP by using the backtick execution
operator, or the "exec" function. You may also need to use PHP to move
and/or rename the resulting PDF files:
PHP: Program Execution Functions
http://www.php.net/manual/en/ref.exec.php
PHP: Filesystem: Rename
http://www.php.net/manual/en/function.rename.php
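The call-then-rename sequence described above can be sketched as follows, in Python rather than PHP. It assumes `soffice` is on the PATH and that the PDF writer has been configured to save to a known file; that spool path is an assumption of this example and depends entirely on the printer setup.

```python
import shutil
import subprocess
from pathlib import Path


def build_print_command(document, binary="soffice"):
    """Return the argv list that loads a file and prints it to the default printer."""
    return [binary, "-p", str(document)]


def print_and_collect(document, spool_pdf, destination, runner=subprocess.run):
    """Print the document, then move the PDF the printer wrote to `destination`.

    `spool_pdf` is where the PDF writer is configured to save its output
    (hypothetical; set by the printer configuration, not by this script).
    """
    runner(build_print_command(document), check=True)
    spool_pdf = Path(spool_pdf)
    if not spool_pdf.exists():
        raise FileNotFoundError(f"printer did not produce {spool_pdf}")
    destination = Path(destination)
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(spool_pdf), str(destination))
    return destination
```

The `runner` parameter exists only so the function can be exercised without OpenOffice installed.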
OpenOffice is pretty good at processing XLS files, but it may not
perfectly render every such file - so if you need the ultimate in
compatibility you will have to use Microsoft Excel on a Windows
Platform or emulator. "IT AsiaOne" looked at several alternatives to
Microsoft Office (including OpenOffice) and wrote that "while none of
the alternative suites promise ... full compatibility with Microsoft
Office-created documents, in general, they do a decent job of
translating Microsoft ".doc", ".ppt" and ".xls" file formats":
IT AsiaOne - Specials - Yours For The Picking
http://it.asia1.com.sg/specials/mmedia20020724_001.html
Additional links:
OpenOffice.org Home Page
http://www.openoffice.org/
Ghostscript Home Page
http://www.cs.wisc.edu/~ghost/
PHP Home Page
http://www.php.net/
Google search strategy:
openoffice scripting pdf linux
://www.google.com/search?q=openoffice%20scripting%20pdf%20linux
openoffice print "command line"
://www.google.com/search?q=openoffice%20scripting%20pdf%20linux
followed by a search for "command line parameters" from the
openoffice.org home page.
hey all,
Is there any way to convert a given file (which could be of any type) into a PDF file in .NET or PHP?
E.g.: suppose there is an upload link to upload a file of any type (Word, Excel, AutoCAD, images...), and once the upload button is clicked the uploaded file should be converted into a PDF.
I checked out FPDF, but as far as I know it cannot convert all file types. A module that plugs into the CMS would also be fine.
FPDF does support images. I know because I have used it recently.
If you are wanting a pure PHP solution, you can use the PHP COM functions along with Word or Excel on the server to open up those files then copy the data out.
If I were you though, I would use Google. Load the doc into Google Docs then export it as a new format with the API.
There are a couple of 3rd party solutions available such as this one, which is optimised for use on the server and accessible from any web services capable environment, including .net. Supports loads of file types including MS-Office based documents.
Disclaimer, I worked on this product so consider me biased. Having said that, it works very well.