Displaying Word documents, Excel sheets, and PowerPoint files in the browser - PHP

Is there any way to display Word documents, Excel sheets, and PowerPoint presentations in the browser without downloading them?

I assume that you are going to use PHP for this, so you can try checking some libraries, such as PHPWord (an open-source library from the PHPOffice project), for example.

If you only wish to display the document content, that is possible with a scripting language such as PHP. Basically, Office 2007+ formats are zipped XML documents with a changed extension. Make a simple Word 2007+ document, save it, and change the extension from .docx to .zip; then you can extract it and see what it's made of. You can find a lot of details here. Displaying the content, however, may be a little tricky. As mentioned, there are libraries out there to handle this, but I am not really sure how well they handle real documents. Most of them are abandoned, and PHPWord has been in beta since 2011.
There are some indications that Apache is working on a cloud version of OpenOffice, but there is no release date yet. Once that is done, you will have a full-featured office suite as a web app.
If you feel really creative, you could use a cron job (or a scheduled task if you like Windows) to open a document, take a screenshot, and basically make a .jpg or .png version of the document, which can then be displayed in a browser without much complication. This works fine with short documents; longer ones may be problematic. It is also possible to schedule an export to .pdf, since most browsers can display PDFs, either natively or through a plugin.
To sum up, using PHP to parse simple documents should be fine, but getting complex documents to display properly may be a much more difficult task and possibly not worth your time. I would go for a cron export to PDF, to preserve most if not all of the document's structure.
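To illustrate the "zipped XML" point above, here is a minimal sketch that pulls plain text out of a .docx. It assumes PHP's zip extension (ZipArchive) is available and that the body text sits in word/document.xml inside the archive; the function names docxXmlToText and extractDocxText are made up for this example. It ignores tables, images, headers, and formatting, so treat it as a starting point rather than a real renderer.

```php
<?php
// Sketch: read the text of a .docx, which is a ZIP archive of XML parts.

/**
 * Turn the XML of word/document.xml into plain text.
 * Paragraph ends (</w:p>) become newlines; every other tag is dropped.
 */
function docxXmlToText(string $xml): string
{
    // Mark paragraph boundaries before removing the markup.
    $xml = str_replace('</w:p>', "\n", $xml);
    // Remove all tags, including the leading XML declaration.
    $text = preg_replace('/<[^>]+>/', '', $xml);
    // Decode XML entities such as &amp; back into characters.
    return trim(html_entity_decode($text, ENT_QUOTES | ENT_XML1, 'UTF-8'));
}

/**
 * Open a .docx as a ZIP archive and return the text of its main part.
 * Hypothetical helper; requires the zip extension.
 */
function extractDocxText(string $path): string
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        throw new RuntimeException("Cannot open $path as a ZIP archive");
    }
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    if ($xml === false) {
        throw new RuntimeException("$path has no word/document.xml");
    }
    return docxXmlToText($xml);
}
```

Something like `echo nl2br(htmlspecialchars(extractDocxText('report.docx')));` would then show the text in a page, where report.docx is a placeholder path.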

Related

Converting PPTX to PDF with PHP

I am developing an API in PHP, hosted on a Linux server, that needs to generate JPEG previews of .pptx PowerPoint presentations.
I first convert the file to PDF and then convert the PDF to JPEGs.
The second step is easy with Ghostscript; it's the first part that's proving difficult.
I have tried using the LibreOffice executable, but its .pptx support isn't complete. Certain backgrounds become invisible.
I have the same problem with many third-party APIs (which I suspect also use LibreOffice); the ones that do work are ridiculously expensive.
Installing Office on a Linux server and using COM functions seems impossible, or very tedious at best.
I have looked at Aspose.Slides, which also seems rather expensive, and its documentation is filled with errors.
I could use suggestions on how to tackle this problem.
I have tried to find the underlying problem of why LibreOffice and online conversion tools have a problem with the backgrounds of the presentations I need to convert.
The background is a .emf file, which has bad support.
My solution
I've unzipped the presentation, converted the .emf files to .png (using ghostscript), changed all mentions of .emf to .png in the XML, and rezipped the altered presentation.
When I now use LibreOffice in headless mode to convert to PDF, the background shows up.
It might be a bit hacky, but it works for the intent of my program.
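For anyone wanting to reproduce this, here is a rough sketch of the non-image parts of the workaround: rewriting the .emf references in the unzipped XML, and building the command lines for the headless LibreOffice conversion and the Ghostscript PDF-to-JPEG step. The function names, the 150 dpi default, and the file paths are assumptions for illustration. The actual EMF-to-PNG image conversion is not shown, and depending on the file, [Content_Types].xml may also need a default entry for the png extension.

```php
<?php
// Sketch: swap .emf references for .png, then build the conversion commands.

/** Point every quoted media reference ending in .emf at a .png instead. */
function rewriteEmfReferences(string $xml): string
{
    // Matches e.g. Target="../media/image1.emf" in a slide's .rels file.
    return preg_replace('/\.emf(?=")/i', '.png', $xml);
}

/** Command line for a headless LibreOffice conversion to PDF. */
function libreOfficePdfCommand(string $pptx, string $outDir): string
{
    return 'soffice --headless --convert-to pdf --outdir '
        . escapeshellarg($outDir) . ' ' . escapeshellarg($pptx);
}

/** Command line that renders each PDF page to a JPEG with Ghostscript. */
function ghostscriptJpegCommand(string $pdf, string $outPattern, int $dpi = 150): string
{
    // %03d in the output pattern is filled in by Ghostscript per page.
    return 'gs -dBATCH -dNOPAUSE -sDEVICE=jpeg -r' . $dpi
        . ' -sOutputFile=' . escapeshellarg($outPattern)
        . ' ' . escapeshellarg($pdf);
}

// Usage (placeholder paths):
// exec(libreOfficePdfCommand('/tmp/deck-fixed.pptx', '/tmp/out'));
// exec(ghostscriptJpegCommand('/tmp/out/deck-fixed.pdf', '/tmp/out/slide-%03d.jpg'));
```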
ps. I see that my question has gathered a few downvotes. In my opinion it was a valid question, and listed the various solutions that had worked for others, but not for me. If anyone has insights or ways to improve it, feel free to comment.

Write multiple sheet on OpenOffice spreadsheet file

I'm trying to work out how to write PHP that generates a report with multiple sheets in an OpenOffice spreadsheet file (AKA .ods). Right now I use this code to generate the OpenOffice spreadsheet report, but it can display only one sheet:
<?php
// Export Calc SpreadSheet
header("Content-Type: application/vnd.ms-excel");
header('Content-Disposition: attachment; filename="Report.ods"');
?>
How can I solve this problem?
There are many libraries out there that are able to be used within PHP to create, edit, and serve up spreadsheet files, or Workbooks (which are a collection of sheets...actually the spreadsheet file IS a workbook, even if it is just a single sheet). There are very well known ones, and some not-so-well known ones out there.
Most people will point to these:
PHPExcel - https://phpexcel.codeplex.com/
Spreadsheet_Excel_Writer - http://pear.php.net/manual/en/package.fileformats.spreadsheet-excel-writer.intro.php
ods-php - http://sourceforge.net/projects/ods-php/
openTBS - http://www.tinybutstrong.com/plugins/opentbs/tbs_plugin_opentbs.html
There are a few more answers on questions like this one, but there is a daunting number of libraries and options.
My personal choice for a simple, small, openDocument format editor in PHP was ods-php, which is a single php file you include in your application, and instantiate. My use is not going to be creating ODS documents, but rather editing template files and serving up the edited document. You will have to write your own headers and echo the file contents in your own function in your PHP application, but that is not hard at all.
There is a [very] basic example php file included in the ods-php download that shows some of the functions, but if you can follow basic PHP logic, you can look through the library source and figure out its available functions. I'd say it would do just fine for what you need.
On the other hand, if you would rather have a bigger API at your disposal, and your server is decent enough to handle it, I'd recommend any of the other three. Keep in mind, the other three are rather large by comparison, and each has its own strong and weak points:
PHPExcel is probably the most used by the free community and is constantly maintained on GitHub (last updated 6 days ago), but it is quite large. Documentation is available on the GitHub site (I provided their old link, which points to their GitHub repository).
Spreadsheet_Excel_Writer is a tool from PEAR and is just as large, although it is no longer maintained, so what is there is 'as-is'. The PEAR team is looking for someone to take over its maintenance, but as far as I can tell, what is there works.
openTBS is a single class library extension of the TBS engine (TinyButStrong). You have to install TBS in order to use openTBS, and it is a very good idea to enable zlib for compression capabilities of your files. If you go with openTBS, you not only get the ability to make xml based documents, but you get the functionality of the entire TBS engine at your disposal, which is quite nice if you would like to merge html source templates with your php scripts (check out the site, it might open some new frontiers for you).
There are definitely more libraries and tools out there, but these are the most notable ones I found in my search. My choice was guided by the need to keep my server tiny and standalone (it runs on a Raspberry Pi). If I were to choose a bulky, production-environment API, I would probably choose PHPExcel because it has the support it needs to keep up to date with M$' ever-adapting formats.
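Independent of the libraries above, the "a workbook is a collection of sheets" idea can be shown without any library by writing Flat OpenDocument XML (a single-file .fods variant of the format that LibreOffice can open). This is only a sketch under those assumptions: buildFlatOds is a made-up name, every cell is written as a string, and a real .ods is a ZIP package rather than one XML file.

```php
<?php
// Sketch: build a multi-sheet spreadsheet as Flat OpenDocument XML (.fods).

/**
 * @param array $sheets Sheet name => list of rows, each row a list of cell values.
 */
function buildFlatOds(array $sheets): string
{
    $esc = function ($value): string {
        return htmlspecialchars((string) $value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
    };

    $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
        . '<office:document'
        . ' xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"'
        . ' xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0"'
        . ' xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"'
        . ' office:version="1.2"'
        . ' office:mimetype="application/vnd.oasis.opendocument.spreadsheet">'
        . '<office:body><office:spreadsheet>';

    foreach ($sheets as $name => $rows) {
        // Widest row decides how many columns the sheet declares.
        $cols = max(array_merge([1], array_map('count', $rows)));
        // One <table:table> element per sheet.
        $xml .= '<table:table table:name="' . $esc($name) . '">'
            . '<table:table-column table:number-columns-repeated="' . $cols . '"/>';
        foreach ($rows as $row) {
            $xml .= '<table:table-row>';
            foreach ($row as $cell) {
                $xml .= '<table:table-cell office:value-type="string">'
                    . '<text:p>' . $esc($cell) . '</text:p>'
                    . '</table:table-cell>';
            }
            $xml .= '</table:table-row>';
        }
        $xml .= '</table:table>';
    }

    return $xml . '</office:spreadsheet></office:body></office:document>';
}

// Usage (placeholder data and file name):
// file_put_contents('Report.fods', buildFlatOds([
//     'Sales' => [['Quarter', 'Total'], ['Q1', 10]],
//     'Costs' => [['Quarter', 'Total'], ['Q1', 7]],
// ]));
```

For anything beyond plain text cells (numbers, formulas, styles), one of the libraries listed above is likely the better route.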

Best way to convert files into pdf files using php

What is, according to you, the best way to convert uploaded files of any kind (.doc, .docx, ...) into a PDF file using nothing but PHP? Is it even possible to do so?
I looked at FPDF, but this creates the pdf files from text.
Another solution given previously was to use the PDFlib library on the server, but unfortunately, my server doesn't support this library...
What is the best way to convert the files my users upload on my site to PDF files?
A simpler approach would be to restrict uploads to the .pdf format programmatically and require your users to upload only .pdf files. Provide a link on the upload page to a free PDF printer (e.g. CutePDF Writer) that the user can install to create .pdf documents from any file that can be printed.
Trying to do it through PHP will be problematic because the uploads could be generated from many different programs that would be impossible to cater for in their entirety. e.g. How would it handle Scribus or ABC Flowcharter or any other 'non-standard' application someone used to create a document?
Much better to filter the upload upfront.
The best server-side PDF generator from those I tried was, so far, wkhtmltopdf, a WebKit-based, self-contained invisible browser that can render any HTML+CSS and generate a PDF from it. Reasonably fast and fairly reliable, has some useful PDF options, such as page size, orientation, etc.
The second part of the job in your case is to convert documents to HTML prior to feeding them to wkhtmltopdf. If possible, have your users upload the docs in HTML (Word and Co. can export (crappy) HTML). If this is not an option, you will have to find a tool just for that, which, in my opinion, is much easier than finding a tool that converts Word docs directly into PDF.
Good thing about wkhtmltopdf is also that you can feed the output of your PHP script to it using the ob_xxx() functions.
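A rough sketch of that last point, assuming the wkhtmltopdf binary is installed and on the PATH: output buffering captures what the script echoes, the HTML is written to a temporary file, and wkhtmltopdf turns it into a PDF. The function names captureHtml and htmlToPdf and the chosen page options are just for this example.

```php
<?php
// Sketch: capture a script's HTML output and hand it to wkhtmltopdf.

/** Run $render and return everything it echoed as a string. */
function captureHtml(callable $render): string
{
    ob_start();
    $render();
    return (string) ob_get_clean();
}

/** Convert an HTML string to a PDF file; returns true if wkhtmltopdf exits with 0. */
function htmlToPdf(string $html, string $pdfPath): bool
{
    // tempnam() creates a unique file; the .html copy keeps the type unambiguous.
    $tmp = tempnam(sys_get_temp_dir(), 'wk');
    $htmlPath = $tmp . '.html';
    file_put_contents($htmlPath, $html);

    $cmd = 'wkhtmltopdf --page-size A4 --orientation Portrait '
        . escapeshellarg($htmlPath) . ' ' . escapeshellarg($pdfPath);
    exec($cmd, $output, $exitCode);

    // Clean up both temporary files.
    unlink($htmlPath);
    unlink($tmp);
    return $exitCode === 0;
}

// Usage (placeholder template and output path):
// $html = captureHtml(function () { include 'invoice_template.php'; });
// htmlToPdf($html, '/tmp/invoice.pdf');
```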
PHPExcel is a simple way to create xls, xlsx, and pdf files with PHP (its sibling project PHPWord covers docx). It's a lot easier thanks to its clear documentation.
Use Microsoft Office to render Microsoft Office documents, if you care about accuracy at all. This is easily done by invoking Office over COM.
Get access to your server, and install what you need. Doing so would be far easier than monkeying around with sub-par solutions.
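As a sketch of what "invoking Office over COM" can look like from PHP: this assumes a Windows server with Microsoft Word installed and PHP's COM extension (com_dotnet) enabled, and that both paths are absolute. The function name wordToPdf is made up; 17 is Word's wdFormatPDF constant.

```php
<?php
// Sketch: let Word itself render a document to PDF via COM (Windows only).

function wordToPdf(string $docPath, string $pdfPath): void
{
    // Start a hidden Word instance.
    $word = new COM('Word.Application');
    $word->Visible = false;

    try {
        $doc = $word->Documents->Open($docPath);
        // 17 = wdFormatPDF
        $doc->SaveAs($pdfPath, 17);
        // false = do not prompt to save changes
        $doc->Close(false);
    } finally {
        // Always shut Word down, or instances pile up on the server.
        $word->Quit();
    }
}

// Usage (placeholder paths):
// wordToPdf('C:\\uploads\\report.docx', 'C:\\uploads\\report.pdf');
```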
Well... I can think of one way of doing it quite easily, but it doesn't involve using PHP.
Upload your documents to a folder on your server that is browsable by your users.
EG: http://mysite.com/docs/
Then get your users to install a virtual printer driver such as Primo PDF (http://www.primopdf.com/index.aspx);
then they can load the document into their browser, and print to PDF for offline browsing.
If this is not an option, and you're dealing with Office documents that conform to the Open XML standard, you could attempt to parse the XML document into a PHP page for display in the browser, then use JavaScript to trigger a print.
Unfortunately, it does still depend on your user having a PDF printer installed.
Alternatively, you could just load the docs natively, print to your own PDF printer, and then upload the PDFs to the web server for download.
I can't think of any easy way of doing this otherwise, without installing all sorts of different document parser tool-kits and doing a huge amount of behind the scenes work.

turn web page into an image on the fly?

I was wondering if there was any way of turning an entire HTML page into a png (or other kind of image?) I'm trying to create PDFs on the fly, but it's pulling across my styles as text, but I want the styles to stay the same as the page (cufon and all). Any help would be appreciated! :)
This doesn't look straightforward. The backend (PHP etc.) doesn't do rendering or layout. It merely generates content.
The layout and visual aspects of the website are done by your client (browser) and the backend has no way of accessing this.
However, given an HTML file, there are libraries that can render it into a PDF like Prince XML that seem to be capable of this.
The only way to generate an image identical, or even near, what a visitor sees in their browser when viewing your site is to launch a browser and take a screenshot. You need the browser's rendering engine to render the page. All the libraries you find to do it without a browser create something much different than what the visitor sees, and won't render cufon or other fancy things at all.
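One way to "launch a browser and take a screenshot" from a script is a headless Chromium/Chrome run. The sketch below only builds the command line; it assumes a binary named chromium is installed on the server (the name varies by system), and screenshotCommand and the default window size are made up for this example.

```php
<?php
// Sketch: command line for a headless-browser screenshot of a page.

function screenshotCommand(string $url, string $pngPath, int $width = 1280, int $height = 1024): string
{
    return 'chromium --headless --disable-gpu'
        . ' --window-size=' . $width . ',' . $height
        . ' --screenshot=' . escapeshellarg($pngPath)
        . ' ' . escapeshellarg($url);
}

// Usage (placeholder URL and path):
// exec(screenshotCommand('http://example.com/', '/tmp/page.png'), $out, $exitCode);
```

Because a real rendering engine does the work, fonts and script-driven effects should come out much closer to what a visitor sees than with a PHP-only renderer.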
Companies that offer screenshot previews of a webpage now run many servers, each running many virtual PCs, each running a full operating system and real web browser. They have all those systems pulling jobs, opening the webpages in real browsers, taking screenshots and saving images. You won't replicate that with a little PHP script.
http://ipinfo.info/html/rendering_services.php
Turning web pages into images and PDFs is a royal pain using PHP. Solutions often require OS-level scripting, fake printer drivers, or screen capturing, which can make for a rather fragile setup. I ran into the same issue a few years ago and started working on a native PHP extension that leveraged the Gecko engine to render HTML to PDF, but never finished it.
The best answer I've seen doesn't quite turn a full web page into a PDF, but instead does XML to PDF. XEP by RenderX is the commercial tool Apple uses to produce developer documentation in many formats, including HTML and beautifully rendered PDFs, from an XML source. The great thing about using the XEP tool in conjunction with PHP is that PHP deals with XML very well, so you can pass generated XML to the XEP binary, let it do the conversion to PDF, then deal with the resulting PDF file in PHP.
Consider building a regular PDF file that resembles your web page:
PHP::PDF - constructing using php.
PDF Reference - file structure.

What do sites like Google Docs and Zoho Writer use to generate MS Office documents

I realise this may just be speculation, but I'd appreciate comments from anyone who has some insight into this.
Something like MS Word COM add-in, or an OO bridge, or a custom implementation.
The reason I want to know is that I want to provide basic online document editing (really basic, just rich text at this point) for a PHP web app. I'm guessing I will store the markup in HTML format and then convert it to rtf/doc etc. for user convenience.
The Apache POI project (written in Java) offers an interface to many file types from the MS Office suite.
You can run the Java code from within PHP using the PHP/Java bridge.
I used this once for an application where MS Word documents had to be indexed in a web application. I remember that setting everything up was quite a hassle, but then it worked very well and reasonably fast. (Unfortunately, the code was written in PHP4 and I don't own it, so I cannot help you out with any snippets here.)
P.S. I cannot post links since I'm a new user, so google for "Apache POI" and "PHP/Java bridge" to get to the respective project's homepage.
This class might help you. I've never used it but here are some links:
Reading from a Word Document with COM in PHP
create a word document
Create Word Document using PHP in Linux
They have probably written their own, maybe starting from wvWare or something similar. I have noticed that Google Desktop on Linux seems to use wvWare to parse MS Word documents.
The documentation for the Word file formats is available, but reading through it makes you realize that it would not be an easy task.
Automating Word or OpenOffice would be the easiest, but there might be licensing issues with using Word like that, and possible concurrency issues with using either of them on a web server.
A popular way to do it is to generate RTF with the file extension .doc. It works fine with Word and other editors, and users remain happy that it is "a DOC file".
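A minimal sketch of the RTF-as-.doc trick, assuming plain ASCII text as input (non-ASCII characters would need RTF's \u escapes, which are left out here). The function name textToRtf, the font, and the file name in the usage comment are placeholders.

```php
<?php
// Sketch: wrap plain text in a minimal RTF document.

function textToRtf(string $text): string
{
    // Escape RTF control characters and turn newlines into paragraph breaks.
    $body = strtr($text, [
        '\\' => '\\\\',
        '{'  => '\\{',
        '}'  => '\\}',
        "\n" => "\\par\n",
    ]);
    // \fs24 is 12pt (RTF font sizes are in half-points).
    return '{\rtf1\ansi\deff0{\fonttbl{\f0 Arial;}}\f0\fs24 ' . $body . '}';
}

// Usage: serve the RTF under a .doc name so it opens in Word.
// header('Content-Type: application/msword');
// header('Content-Disposition: attachment; filename="letter.doc"');
// echo textToRtf("Dear customer,\nthank you for your order.");
```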
