I was wondering if there is any way of turning an entire HTML page into a PNG (or another kind of image)? I'm trying to create PDFs on the fly, but my styles are being pulled across as text, and I want the styling to stay the same as on the page (cufon and all). Any help would be appreciated! :)
This doesn't look straightforward. The backend (PHP etc.) doesn't do rendering or layout; it merely generates content.
The layout and visual aspects of the website are done by your client (browser) and the backend has no way of accessing this.
However, given an HTML file, there are libraries, such as Prince XML, that seem to be capable of rendering it into a PDF.
The only way to generate an image identical, or even near, what a visitor sees in their browser when viewing your site is to launch a browser and take a screenshot. You need the browser's rendering engine to render the page. All the libraries you find to do it without a browser create something much different than what the visitor sees, and won't render cufon or other fancy things at all.
Companies that offer screenshot previews of a webpage now run many servers, each running many virtual PCs, each running a full operating system and real web browser. They have all those systems pulling jobs, opening the webpages in real browsers, taking screenshots and saving images. You won't replicate that with a little PHP script.
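As a rough sketch of the "launch a browser and take a screenshot" approach: current Chromium builds can run headless and write a screenshot from the command line. This is an assumption that postdates the answers here, not a description of what the screenshot services run. The binary name `chromium`, the function names, and the URL are hypothetical placeholders. Python is used for illustration; the same command line could be run from PHP with exec().

```python
import shutil
import subprocess

def screenshot_command(url, out_png, width=1280, height=1024, browser="chromium"):
    """Build a headless-Chromium command line that saves a PNG of `url`."""
    return [
        browser,
        "--headless",
        f"--window-size={width},{height}",  # viewport used for the capture
        f"--screenshot={out_png}",          # where the PNG is written
        url,
    ]

def take_screenshot(url, out_png, browser="chromium"):
    """Run the browser; raises if the binary is missing or exits with an error."""
    if shutil.which(browser) is None:
        raise FileNotFoundError(f"{browser} not found on PATH")
    cmd = screenshot_command(url, out_png, browser=browser)
    subprocess.run(cmd, check=True, timeout=60)
```

Because a real rendering engine draws the page, script-driven effects such as cufon have a chance of showing up, which the browserless libraries cannot offer.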
http://ipinfo.info/html/rendering_services.php
Turning web pages into images and PDFs is a royal pain using PHP. Solutions often require OS-level scripting, fake printer drivers, or screen capturing, which can make for a rather fragile setup. I ran into the same issue a few years ago and started working on a native PHP extension that leveraged the Gecko engine to render HTML to PDF, but never finished it.
The best answer I've seen doesn't quite turn a full web page into a PDF, but instead does XML to PDF. XEP by RenderX is the commercial tool Apple uses to produce developer documentation in many formats, including HTML and beautifully rendered PDFs, from an XML source. The great thing about using the XEP tool in conjunction with PHP is that PHP deals with XML very well, so you can pass generated XML to the XEP binary, let it do the conversion to PDF, then deal with the resulting PDF file in PHP.
Consider building a regular PDF file that resembles your web page:
PHP::PDF - constructing using php.
PDF Reference - file structure.
Related
I have read a lot of topics about scripts that take HTML and output PDF; I tried lots of them, and I am always disappointed with the results. Lots of them ignore external CSS, lots of them can't be run on shared hosting (they need to be installed in some inaccessible place, like DOMPDF), etc. Also, many of the threads on this question are pretty old (most of them were asked in 2010).
Question: Is there a simple way to cURL (from a php script) a remote web page and simply save a pdf of the "print" (like in css media print) version of the page, or even a jpeg, or a docx, or anything that "contains" the images and the styling for offline viewing? And more important, can it be free/open source?
All the web browsers do that with no effort. Once on the page, just press ctrl-p and there it goes (almost). Why is it so hard to find a good script that can do this? Is there a way to emulate a browser, or what...?
Isn't it possible to cURL and force css media print, then take a snapshot of this?
The difficulty of finding this seems very strange to me... I feel like it's quite a simple task.
Try to call wkhtmltox from PHP.
wkhtmltox/bin/wkhtmltopdf www.stackoverflow.com stackoverflow.pdf
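The command above can be scripted. A minimal sketch, assuming wkhtmltopdf is installed and on the PATH: its `--print-media-type` option asks it to apply the print stylesheet instead of the screen one, which is what the question was after. The function names are hypothetical. Python is used for illustration; in PHP the same arguments could be passed to exec().

```python
import shutil
import subprocess

def wkhtmltopdf_command(url, out_pdf, print_media=True, binary="wkhtmltopdf"):
    """Build the wkhtmltopdf command line for rendering `url` into `out_pdf`."""
    cmd = [binary]
    if print_media:
        # Render with the CSS "print" media type rather than "screen".
        cmd.append("--print-media-type")
    cmd += [url, out_pdf]
    return cmd

def url_to_pdf(url, out_pdf, binary="wkhtmltopdf"):
    """Run wkhtmltopdf; raises if the binary is missing or exits with an error."""
    if shutil.which(binary) is None:
        raise FileNotFoundError(f"{binary} not found on PATH")
    subprocess.run(wkhtmltopdf_command(url, out_pdf, binary=binary),
                   check=True, timeout=120)
```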
This PHP library seems to work with wkhtmltox:
http://thejoyofcoding.org/php-wkhtmltox/
This might help:
http://davidbomba.com/php-wkhtmltox/
Is there any way to display Word documents, Excel sheets and PowerPoint files in the browser without downloading them?
I assume that you are going to use PHP for this, so you can try checking some libraries, such as PHPWord (an open source library, not a Microsoft product).
If you only wish to display the document content, it is possible to do so using a scripting language such as PHP. Basically, Office 2007+ formats are zipped XML documents with a changed extension. Make a simple Word 2007+ document, save it and change the extension from .docx to .zip; then you can extract it and see what it's made of. You can find a lot of details here. Now, displaying the content may be a little tricky. As mentioned, there are libraries out there to handle this, but how well they handle the documents, I am not really sure. Most of them are abandoned, and PHPWord has been in beta since 2011.
There are some indications that Apache is working on cloud version of Open office, but there is no release date yet. Once done, you will have a full featured office suite web app.
If you feel really creative, you could use a cron job (or a scheduled task if you like Windows) to open a document, take a screenshot and basically make a .jpg or .png version of the document (this works fine with short documents; longer ones may be problematic), then display it in a browser without much complication. It is also possible to schedule an export to .pdf, since browsers can display PDFs through a PDF plugin such as Adobe's.
To sum up, using PHP for parsing simple documents should be fine, but getting complex docs to display properly may be a much more difficult task and possibly not worth your time. I would go for a cron export to PDF, to preserve most if not all of the document's structure.
What is, according to you, the best way to convert uploaded files of any kind (.doc, .docx,...) into a PDF file using nothing but PHP? Is it even possible to do so?
I looked at FPDF, but this creates the pdf files from text.
Another solution previously given was to use the PDFlib library on your server, but unfortunately, my server doesn't support this library...
What is the best way to convert the files my users upload to my site into PDF files?
A simpler approach would be to restrict uploads to the PDF format programmatically and require your users to upload only .pdf files. Provide a link on the upload page to a free PDF printer (e.g. CutePDF) that the user can install to create .pdf documents from any file that can be printed.
Trying to do it through PHP will be problematic because the uploads could be generated from many different programs that would be impossible to cater for in their entirety. e.g. How would it handle Scribus or ABC Flowcharter or any other 'non-standard' application someone used to create a document?
Much better to filter the upload upfront.
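One way to read "restrict uploads programmatically" is to check both the file extension and the file's first bytes, since a PDF starts with the marker `%PDF-`. The sketch below is an assumption about how such a filter might look, not a complete validator: it does not prove the rest of the file is a well-formed PDF. The function name is hypothetical. Python is used for illustration; in PHP the same check could be made on the uploaded temp file.

```python
import os

PDF_MAGIC = b"%PDF-"  # every PDF file begins with this header marker

def is_pdf_upload(filename, head):
    """Accept an upload only if it is named *.pdf and starts with the PDF header.

    `head` is the first few bytes of the uploaded file.
    """
    has_pdf_ext = os.path.splitext(filename)[1].lower() == ".pdf"
    return has_pdf_ext and head.startswith(PDF_MAGIC)
```

Checking the content as well as the name stops a renamed .docx from slipping through.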
The best server-side PDF generator from those I tried was, so far, wkhtmltopdf, a WebKit-based, self-contained invisible browser that can render any HTML+CSS and generate a PDF from it. Reasonably fast and fairly reliable, has some useful PDF options, such as page size, orientation, etc.
The second part of the job in your case is to convert documents to HTML prior to feeding them to wkhtmltopdf. If possible, have your users upload the docs in HTML (Word and Co. can export (crappy) HTML). If this is not an option, you will have to find a tool just for that, which, in my opinion, is much easier than finding a tool that converts Word docs directly into PDF.
Another good thing about wkhtmltopdf is that you can feed the output of your PHP script to it using the ob_xxx() functions.
PHPExcel is the best simple way to create doc, docx, xls, xlsx and pdf files with PHP. It's a lot easier thanks to its clear documentation.
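Feeding generated HTML straight to wkhtmltopdf can be sketched as follows, assuming the binary is on the PATH. Passing `-` as the input reads the HTML from standard input and `-` as the output writes the PDF to standard output; `--page-size` and `--orientation` are the page options mentioned above. The function names are hypothetical. Python is used for illustration; in PHP the buffered output from ob_get_clean() could be piped in with proc_open().

```python
import shutil
import subprocess

def stdin_pdf_command(page_size="A4", orientation="Portrait", binary="wkhtmltopdf"):
    """Command line that reads HTML on stdin and writes the PDF to stdout."""
    return [binary, "--page-size", page_size,
            "--orientation", orientation, "-", "-"]

def html_to_pdf(html, **options):
    """Convert an HTML string to PDF bytes without touching the filesystem."""
    cmd = stdin_pdf_command(**options)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} not found on PATH")
    result = subprocess.run(cmd, input=html.encode("utf-8"),
                            stdout=subprocess.PIPE, check=True, timeout=120)
    return result.stdout
```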
Use Microsoft Office to render Microsoft Office documents, if you care about accuracy at all. This is easily done by invoking Office over COM.
Get access to your server, and install what you need. Doing so would be far easier than monkeying around with sub-par solutions.
Well... I can think of one way of doing it quite easily, but it doesn't involve using PHP.
Upload your documents to a folder on your server that is browsable by your users.
EG: http://mysite.com/docs/
Then get your users to install a virtual printer driver such as Primo PDF
http://www.primopdf.com/index.aspx
then they can load the document into their browser, and print to PDF for offline browsing.
If this is not an option, and you're dealing with Office documents that conform to the Open XML standard, you could attempt to parse the XML doc into a PHP page for display in the browser, then use JavaScript to trigger a print.
Unfortunately, it does still depend on your user having a PDF printer installed.
Alternatively, you could just load the docs natively, print them to your own PDF printer, and then upload the PDFs to the web server for download.
I can't think of any easy way of doing this otherwise, without installing all sorts of different document parser tool-kits and doing a huge amount of behind the scenes work.
Anyone know of any Flash components that would do the job of displaying an external PowerPoint file (e.g. .PPT, .PPTX) file in a Flash movie on a web page? Or a way of automatically parsing uploaded PowerPoint docs from a PHP-based CMS and displaying them on a web page.
Our client needs to be able to upload PowerPoint documents to their site without any intervention (if possible).
I know about Slideshare and the like, but the content needs to live on the client's web server due to security restrictions. Also, Adobe Presenter seems to require Adobe software/plugins on the client's machine, which wouldn't be ideal.
You could use Google Docs. It supports previewing PDF/PPT/DOC (Don't know about XLS) files.
I use it on one of my projects and it behaves very well.
You can call it using http://docs.google.com/gview?url=<your absolute file url here>
You could also use the embedded=true parameter to embed it into your site, using an iframe or that sort of thing.
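Building the viewer link by hand is error-prone because the file URL has to be percent-encoded. A small sketch, with hypothetical function names and file URL; the embed flag is written as `embedded=true`, which is how the viewer is commonly called, so treat that spelling as an assumption to check. Python is used for illustration; in PHP, urlencode() and htmlspecialchars() play the same roles.

```python
from html import escape
from urllib.parse import urlencode

VIEWER = "http://docs.google.com/gview"

def viewer_url(file_url, embed=False):
    """Return the viewer URL for a publicly reachable document."""
    params = {"url": file_url}          # urlencode() percent-encodes the value
    if embed:
        params["embedded"] = "true"     # chrome-less view meant for iframes
    return VIEWER + "?" + urlencode(params)

def viewer_iframe(file_url, width=600, height=450):
    """Return an <iframe> tag that embeds the viewer in a page."""
    src = escape(viewer_url(file_url, embed=True), quote=True)
    return f'<iframe src="{src}" width="{width}" height="{height}"></iframe>'
```

Note that the document must be reachable by Google's servers, which conflicts with the requirement that the content stay private on the client's server.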
Hope this helps.
Using COM you can save a PPT as a PDF (see this question), and then use swftools' great (and free) pdf2swf utility to get nice swfs.
You can use http://ajaxdocumentviewer.com/. I use it on a large site with good success. It will require a flash or pdf viewer plugin.
Hmm, have you looked at the new SharePoint 2010? It has some sort of integration with MS Office 2010. The demo I saw allows collaboration as well as uploading and viewing PowerPoint slides (with all the nice transitions preserved) and even some minor edit functions.
For the browser, I think they said that there is no need to install any plugins like Flash or Silverlight and that it works in all browsers.
Is there any open source web PDF viewer?
One with a good API through which I can modify the look of the viewer?
I have tried Scribd, Google Docs, FlexPaper, and this one also.
But none of them gives me what I want.
Then I downloaded Shadowbox, but it gave me no information about how to use it.
So does anyone know a good web PDF viewer? It would be great if it offers customization.
And it would be great if it is in PHP. Thanks in advance...
I don't think you're going to find a PDF viewer that's in PHP. The decoding of the PDF format happens on the client, which means your only options are either relying on the client to do the decoding work for you (Adobe Reader, Google Chrome's built-in reader, OS X's Preview app, etc.), rendering it with Javascript, or figuring out some way to convert the PDF into HTML.
PDFs are so ubiquitous these days, that it doesn't make a whole lot of sense to me to want to try to render it for a client; rather, simply tell them that the file they're downloading is a PDF and offer links to either Chrome or Adobe Reader, and let the user view the PDF in whatever app they please.
There is a wonderful PDF viewer which is open source too. The UI implementation is basic, so you will have to work on it. But it's awesome.
http://view.samurajdata.se/
There is also a project from Mozilla called PDF.js. They're hoping to get it to a point where it's available as part of Firefox.
Get it at: http://mozilla.github.com/pdf.js/
I've tried it out myself and it works very well. The only issue is that the source JS files are about 1.4 MB, which is rather large, and I couldn't minify them due to some weird coding standards.