I'm looking for a hassle-free script that can convert an HTML file to Microsoft Word or PDF format. My download script already works, but when I open the downloaded document in MS Word, the images appear as X placeholders and Word prompts for missing files.
Can you suggest any alternative?
Have a look at PrinceXML.
It's definitely the best HTML/CSS to PDF converter out there, although it's not free (But hey, your programming is not free either, so if it saves you 10 hours of work, you're home free.)
Oh yeah, did I mention that this is the first (and probably only) HTML2PDF solution that fully passes Acid2!?
http://princexml.com/samples/
This answer copied from here; originally written by user SchizoDuckie
Related
I hope you are doing well.
I'm looking for a PHP library that converts a PDF file, including its images, to an HTML file and meets the following requirements:
The HTML file needs to be HTML 3.2 compatible
The images in the PDF file need to be saved with a .jpg extension
The correct fonts from the PDF need to be used in the HTML file
The images and the HTML file need to end up together in one result folder
I have tried most of the PHP libraries, but none of them does everything I need.
Please let me know about a library that meets all four of the requirements above (image attached for reference).
Waiting for your kind responses.
Thanks
I'm not sure, but here is a PHP library I found.
Here
Try this:
http://www.pdfaid.com/pdf-to-html.aspx
Or this:
http://webdesign.about.com/od/pdf/tp/tools-for-converting-pdf-to-html.htm
Or this...
http://www.pdfconvertonline.com/pdf-to-html-online.html
There are plenty of options available to you; the secret is to use a newfangled thing called a search engine, such as Bing or Google.
You would also do well to research on Stack Overflow before asking your question:
1) HTML 3.2 was superseded in 1997, which is very nearly twenty years ago. Why on earth do you still need a comparatively ancient technology when there are far better options available, such as XHTML, HTML 4.01 and HTML5?
2) Please read How can I extract embedded fonts from a PDF as valid font files?
3) Also to extract images you can use:
http://www.makeuseof.com/tag/extract-images-pdf-files-save-windows/
but again, there are several options available to you if you care to look for them.
You seem to have a fundamental misunderstanding about HTML: there are several different ways of getting any desired result with it. You have a PDF file and you want the HTML to look a certain way, but that look depends on the browser it is viewed in. For example, if you use one of the PDF to HTML converters linked above, you will very probably find that the output looks different in Internet Explorer 7, Firefox, and Internet Explorer 10. There is no single way of writing output in HTML or CSS.
If you want a custom built library to do your specific task then you will need to employ a professional to do it, or you will need to code it yourself. This obviously should be charged to the client for requiring a technology that is extremely outdated. You can probably search github for a similar library (the one linked by CK Khan looks like what you're after) and then fork it and make your own variation for your needs. I very much doubt anyone is going to put time into developing a system to output HTML 3.2 from a PDF, and even less likely to develop this system for free and to your exact specifications.
It also appears that you cannot specify font families in the <font> tag in HTML 3.2; you can only set the size and colour of fonts. You can use the CSS1 font-family property to specify font families. See here.
A client has given me the task of creating a site with the ability to convert their file uploads into html or pdf for storage on the web server. I want them to be able to upload (.doc, .tiff, .jpg, etc) and have it convert these files on the fly, again... into either html or pdf.
I am open to software and APIs that do the trick, but the file MUST BE STORED ON THE CLIENT'S WEB SERVER after conversion. The client is using GoDaddy with SSL, if that helps. Any input is greatly appreciated, as I have been looking for a long-term solution to this problem that I will be able to use in future projects.
Things I have looked into but have had trouble using this way: Scribd, the OpenOffice API
Places I've found the most help so far here
Well, as for storing the file as HTML: when an image is uploaded, all you need to do is store the file somewhere and then create an HTML file that looks something like this:
<html>
<body>
<img src="path/to/the/image/file.png" />
</body>
</html>
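The wrapper page above can be generated at upload time. Here is a minimal sketch in Python (the same logic ports directly to PHP). The function name `build_image_page` and the example path are assumptions for illustration, not part of any library:

```python
from html import escape


def build_image_page(image_path: str, alt_text: str = "") -> str:
    """Return a minimal HTML page that displays one stored image."""
    # Escape both values so a file name containing quotes or angle
    # brackets cannot break out of the attribute.
    src = escape(image_path, quote=True)
    alt = escape(alt_text, quote=True)
    return (
        "<html>\n"
        "<body>\n"
        f'<img src="{src}" alt="{alt}" />\n'
        "</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # Hypothetical path; in practice this is wherever the upload was stored.
    print(build_image_page("path/to/the/image/file.png", "Uploaded image"))
```

Escaping the path matters because uploaded file names are user input and may contain characters that would otherwise break the markup.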
It might be worth converting the large files (especially TIFF) to another format. Converting .doc files might be a little trickier. Have a look here: Convert .doc to html in php
Maybe also take a look at the Document zetaComponent, which is able to convert between different document types, although not all of those you mentioned are supported so far.
Creating a PDF should be almost as easy, as there are several libraries for PHP that can help you. Just poke around on SO: Convert HTML + CSS to PDF with PHP?
Overall you will have to mix up a whole lot of stuff to get that job done. There is no "simple" solution to this.
I need to upload PDF files and display them on my website in a book-style reader, like a presentation. Please show me possible ways to achieve this.
Thank you
I've been using FlexPaper. I use pdf2swf to convert the PDF to SWF because I use the Flash version, but there is a JavaScript version too.
One possible solution would be to use Scribd. You simply upload your document to their website and embed their reader on your website. This is the easiest way, and you get things like searchability. Their reader also works like Adobe's Acrobat Reader.
The downside is that you are uploading your documents onto a public website, so everyone will be able to view it. Perhaps they might have settings where you can lock your documents so that only certain people can see them.
The next solution is to roll your own. You can use turn.js. In this case, you will need to find a way to convert your PDF files to HTML files or perhaps image files. With images, your text won't be selectable, and they won't be discoverable by search engines. Again, converting PDF to HTML can also be difficult as you might lose formatting in the process.
But it depends entirely on your use case. Personally, I would go with Scribd, as their platform works very well, and you won't have to worry about implementing your own system.
I want to overlap pictures, but it is not working and I need some help.
Here's the link to the page I'd like to convert:
http://9m9.com/innovative/sample/two.html
I want to convert this page to a PDF. You can see the small image overlapping the bigger one.
This is the page where you can click on a link that will convert the page to PDF.
http://citysoftsolutions.com/eclients/virtualtour/view_property_images.php?pid=9&uid=67
As you can see the image is placed behind the big image.
I'm using this converter script: http://mpdf.bpm1.com/
When I printed the page using the PrimoPDF driver, it came out just fine: the last image was correctly laid over the big one. So there must be a bug in the script you're using.
What do I suggest?
If you'd like to convert your pages to PDFs "on the fly" I suggest you either
contact the script's creator and inform them of the bug
use a different script (I'd check out this question, which can help you)
If you'd like to just provide PDFs of your page I suggest you install a PDF printer driver (like PrimoPDF that I'm using) and print those pages yourself and use those.
I'm not working for Nitro PDF Software company nor am I related to them in any way. So this is not me advertising their products/services.
On a sidenote
Something tells me that what you'd actually like to do is create a PDF flyer or other promo material. If that's what you're after, I suggest you use software that's meant for the job. Microsoft Office Word will do, but you'll be better off using something else. If it's a one-page leaflet, you could use Adobe Illustrator or CorelDraw. But if it's going to be an actual multipage document, use something like Word or Adobe InDesign.
Word is probably something you can easily master. So go with that one.
I've recently been working on a project in which I need to convert a page into a Microsoft Word document (.doc file) and offer the document for download, all using PHP. But I can't solve this problem.
Please help me. Thank You very much, Arif
This is not easy to solve.
First off, if you want to write real Word documents, you will have to do it on Windows. You can use COM to talk to Word, and this is how you get good results. I've tried all the Unix/Linux-based solutions and the results were not so great.
Otherwise, I'd suggest you write RTF, which is just as good. In the end, you can give the .rtf file a .doc extension and no one will notice. RTF has a couple of limitations (formatting), but on the flip side it's all ASCII, and the RTF standard is pretty comprehensive and well documented.
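To show how little is needed for a valid RTF file, here is a minimal sketch in Python (it ports easily to PHP). The helper names `rtf_escape` and `build_rtf` and the font choice are assumptions for illustration; the control words used (`\rtf1`, `\fonttbl`, `\fs`, `\par`, `\line`, `\uN`) are from the RTF specification:

```python
def rtf_escape(text: str) -> str:
    """Escape plain text for use inside an RTF document body."""
    out = []
    for ch in text:
        if ch in "\\{}":
            # Backslash and braces are RTF syntax and must be escaped.
            out.append("\\" + ch)
        elif ch == "\n":
            # Line break within a paragraph.
            out.append("\\line\n")
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Non-ASCII: emit \uN per UTF-16 code unit as a signed
            # 16-bit number, followed by "?" as the fallback character.
            units = ch.encode("utf-16-le")
            for i in range(0, len(units), 2):
                unit = int.from_bytes(units[i:i + 2], "little")
                if unit > 32767:
                    unit -= 65536
                out.append(f"\\u{unit}?")
    return "".join(out)


def build_rtf(paragraphs) -> str:
    """Build a small RTF document with one font and 12 pt text."""
    body = "\\par\n".join(rtf_escape(p) for p in paragraphs)
    header = "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Times New Roman;}}\n"
    # \fs takes half-points, so \fs24 is 12 pt.
    return header + "\\f0\\fs24 " + body + "\\par\n}"


if __name__ == "__main__":
    document = build_rtf(["Hello, Word.", "Braces {like these} are escaped."])
    # The output is pure ASCII, so it can be written as-is.
    with open("example.rtf", "w", encoding="ascii") as handle:
        handle.write(document)
```

The resulting file should open in Word and most other word processors; anything beyond basic text (tables, images, styles) needs more of the RTF specification than this sketch covers.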
There's a class which does it pretty nicely -- phpLiveDocx (this is a great introduction). And this class also claims to write PDF and DOC -- but I haven't tried those yet. I use another solution for PDF.
I would recommend using the RTF format instead of .doc. It's much simpler to write, and most word processors understand it. I'd make a similar recommendation for .csv when you want to output an Excel file.
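For the CSV case, a minimal sketch in Python using the standard `csv` module; the function name `rows_to_csv` and the sample rows are assumptions for illustration:

```python
import csv
import io


def rows_to_csv(rows) -> str:
    """Serialize rows (lists of values) to CSV text that Excel can open."""
    buffer = io.StringIO(newline="")
    # The default "excel" dialect uses commas, \r\n line endings, and
    # quotes only the fields that need it.
    writer = csv.writer(buffer)
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    text = rows_to_csv([["name", "note"], ["Ann", "likes commas, a lot"]])
    # "utf-8-sig" adds a byte order mark, which helps Excel detect UTF-8.
    with open("example.csv", "w", encoding="utf-8-sig", newline="") as handle:
        handle.write(text)
```

Using a CSV writer rather than joining strings by hand avoids broken files when a value contains a comma, a quote, or a line break.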
Perhaps not the answer you seek, but still interesting to note: there is an open source word processor called AbiWord that has a CLI (command line interface). You can use it to easily convert between document formats. I know that at least one website uses it to convert text files into various formats.
It is actively developed and could easily be used as a third-party black box solution for converting documents server side.
Here is a blog from one of the developers on how to integrate it with PHP
Server-Side AbiWord
abiword home page
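A server-side call to AbiWord might look like the following Python sketch (from PHP the equivalent would be a shell call). It assumes `abiword` is installed and on the PATH, and that its `--to=FORMAT` option writes the converted file next to the input with the new extension; check `abiword --help` on your system before relying on this. The function names are hypothetical:

```python
import subprocess
from pathlib import Path


def abiword_command(source: str, target_format: str) -> list:
    """Build the argument list for an AbiWord command-line conversion."""
    # Passing a list (not a shell string) avoids shell-injection issues
    # when the file name comes from a user upload.
    return ["abiword", f"--to={target_format}", str(source)]


def convert(source: str, target_format: str) -> Path:
    """Run the conversion and return the path where output is expected."""
    subprocess.run(abiword_command(source, target_format), check=True)
    # Assumption: AbiWord places the output beside the input file.
    return Path(source).with_suffix("." + target_format)
```

For example, `convert("report.doc", "pdf")` would run `abiword --to=pdf report.doc` and return `report.pdf` as the expected output path.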