Convert a specific PDF file to HTML in PHP - php

Is there any way to convert a PDF to HTML? I need the text from the file, but when I tried the PDFtoText library I got the text unsorted and without any structure I could parse.
I noticed that some online PDFtoHTML services work great with the file. So, any tips, please? Here is the PDF file; I need only one specific row in the right column.

Try integrating the PDFtoHTML from the poppler project; that should support table recognition.

pdftohtml works fine: it is fast and stable, but the HTML result is ugly at best. I have used it for quite some time on a website that holds many job resumes.
It is a good solution for extracting textual content, however.
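As a rough sketch of the pdftohtml route from PHP: the snippet below assumes Poppler's pdftohtml is installed and on the PATH, and that its -xml mode writes positioned text fragments (with left/top attributes) to "<output base>.xml". The function names and the idea of filtering by a minimum "left" value to isolate the right column are illustrative assumptions, not something from the answers above; the threshold has to be found by inspecting the XML for the actual file.

```php
<?php
// Sketch only: shells out to Poppler's pdftohtml, which must be installed.

/** Build the shell command; -xml emits positioned <text> fragments, -i skips images. */
function buildPdfToHtmlCommand(string $pdfPath, string $outputBase): string
{
    return 'pdftohtml -xml -i '
        . escapeshellarg($pdfPath) . ' '
        . escapeshellarg($outputBase);
}

/** Run the conversion and return the XML expected at "<outputBase>.xml". */
function convertPdfToXml(string $pdfPath, string $outputBase): string
{
    exec(buildPdfToHtmlCommand($pdfPath, $outputBase), $output, $exitCode);
    if ($exitCode !== 0) {
        throw new RuntimeException("pdftohtml failed with exit code $exitCode");
    }
    $xml = file_get_contents($outputBase . '.xml');
    if ($xml === false) {
        throw new RuntimeException('pdftohtml produced no XML file');
    }
    return $xml;
}

/**
 * Return the text of every fragment whose "left" coordinate is at least
 * $minLeft (hypothetical threshold separating the right column).
 */
function extractColumnText(string $xml, int $minLeft): array
{
    $dom = new DOMDocument();
    if (!@$dom->loadXML($xml)) {
        throw new RuntimeException('Could not parse pdftohtml XML output');
    }
    $fragments = [];
    foreach ((new DOMXPath($dom))->query('//text') as $node) {
        if ((int) $node->getAttribute('left') >= $minLeft) {
            // textContent drops inline markup such as <b> or <i>
            $fragments[] = trim($node->textContent);
        }
    }
    return $fragments;
}
```

Usage would look like `extractColumnText(convertPdfToXml('file.pdf', '/tmp/out'), 300)`, after which the wanted row can be picked from the returned array, for example by matching on its text or its position.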
I would give the Scribd API a try:
http://www.scribd.com/developers/api
or the Google Apps document API. Google does a great job of displaying and converting PDF files.

Related

Pdf book reader in php?

I have to upload PDF files and let visitors read them on my website in a book reader, like a presentation. Please show me the possible ways to achieve this.
Thank you
I've been using FlexPaper. I use pdf2swf to convert the PDF to SWF because I use the Flash version, but there is a JavaScript version too.
One possible solution would be to use scribd. You simply upload your document to their website and embed their reader on your website. This is the easiest way, and you get things like searchability. Their reader also works like Adobe's Acrobat Reader.
The downside is that you are uploading your documents onto a public website, so everyone will be able to view it. Perhaps they might have settings where you can lock your documents so that only certain people can see them.
The next solution is to roll your own. You can use turn.js. In this case, you will need to find a way to convert your PDF files to HTML files or perhaps image files. With images, your text won't be selectable, and they won't be discoverable by search engines. Again, converting PDF to HTML can also be difficult as you might lose formatting in the process.
But it is entirely up to your use case. Personally, I would go with scribd, as their platform works very well, and you won't have to worry about implementing your own system.

PHP HTML to PDF free convertor Resources

What is the best free PHP HTML-to-PDF converter around, not just in terms of functionality but also in terms of resource usage and speed?
Thanks
Have a look at the open-source FPDF library.
Check dompdf, an HTML to PDF converter written in PHP. No external dependencies, it supports complex tables, images and even external style sheets.
http://www.digitaljunkies.ca/dompdf/
If you want to be really clever about it, you could programmatically create a new Google doc containing your HTML and CSS, then programmatically export it as a PDF. No resource usage on your part, and it works very well.
Start here:
http://www.tecnick.com/public/code/cp_dpage.php?aiocp_dp=tcpdf
We recently used this on a project with quite a bit of luck. I don't know that it will go straight from HTML to PDF like you are looking for, but it is a good set of tools.

How do I convert a PDF file to HTML in PHP?

How do I convert a PDF file to HTML in PHP? Is there any lib or web service? I mean free, thanks!
Google pdf2html; pdftohtml looks to be the only viable one, and it's a command-line program, not PHP, so it may not be useful to you. Google is capable of converting, so there may be a way to do it with GDocs as well, though I'm not sure of that. At any rate, I hope this gets you on the proper path at least.
I've tried Poppler's pdftohtml command to convert PDF files to HTML files. The HTML output of Poppler is lighter, but it is not very accurate.
If you want accurate output, you should use pdf2htmlEX. I've converted complicated PDF files with it and got the best HTML output.
You can't.
PDFs are complex documents containing embedded fonts, vector graphics and layout information that cannot be represented in HTML in an automated way. You may be able to extract the TEXT of the document, but that's about it.

How can I convert a PHP page into a .doc file with PHP

Recently I worked on a project in which I need to convert a page into a Microsoft Word document (.doc file) and offer the document for download, all using PHP. But I can't solve this problem.
Please help me. Thank you very much, Arif
This is not easy to solve.
First off, if you want to write real Word documents, you will have to do it on Windows. You can use COM to talk to Word, and this is how you get good results. I've tried all the Unix/Linux-based solutions and the results were not so great.
Otherwise, I'd suggest you write RTF -- which is just as good. And in the end, you can give the .rtf file a .doc extension and no one will notice. RTF has a couple of limitations (formatting), but on the flip side -- it's all ASCII, and the RTF standard is pretty comprehensive and well documented.
There's a class which does it pretty nicely -- phpLiveDocx (this is a great introduction). And this class also claims to write PDF and DOC -- but I haven't tried those yet. I use another solution for PDF.
I would recommend using the RTF format instead of the .doc - it's much simpler to write to, and all text editors understand it. Similar recommendation for .csv when you want to output an Excel file.
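To illustrate the RTF suggestion, here is a minimal sketch of writing an RTF document by hand in PHP. The function names are made up for this example, the font and size are arbitrary choices, and it only handles plain ASCII text (non-ASCII characters would need RTF's \u escapes, which are left out here).

```php
<?php
// Minimal hand-written RTF, ASCII text only (assumption for this sketch).

/** Escape the three characters that have special meaning in RTF. */
function escapeRtf(string $text): string
{
    // Backslash must be replaced first so added backslashes are not doubled
    return str_replace(['\\', '{', '}'], ['\\\\', '\\{', '\\}'], $text);
}

/** Wrap a list of paragraphs in a bare-bones RTF document. */
function buildRtfDocument(array $paragraphs): string
{
    // \fs24 is the font size in half-points, i.e. 12pt
    $rtf = '{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Arial;}}\\f0\\fs24' . "\n";
    foreach ($paragraphs as $paragraph) {
        $rtf .= escapeRtf($paragraph) . '\\par' . "\n";
    }
    return $rtf . '}';
}

/** Offer the document as a download; call before any other output is sent. */
function sendRtfDownload(string $rtf, string $filename): void
{
    header('Content-Type: application/rtf');
    header('Content-Disposition: attachment; filename="' . $filename . '"');
    header('Content-Length: ' . strlen($rtf));
    echo $rtf;
}
```

Something like `sendRtfDownload(buildRtfDocument(['First paragraph', 'Second paragraph']), 'page.rtf')` would then trigger the download; following the answer above, the filename could also end in .doc.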
Perhaps not the answer you seek, but still interesting to note: there is an open-source word processor out there called AbiWord that has a CLI (command-line interface). You can use it to easily convert between document formats. I know that at least one website uses it to convert text files into various formats.
It is actively developed and could easily be used as a third-party black-box solution for converting documents server-side.
Here is a blog from one of the developers on how to integrate it with PHP
Server-Side AbiWord
abiword home page

Direct Print webpage in PDF file

On my site I'm fetching MySQL data using PHP. I want to open that data in a PDF file when I click a PDF print button. Is it possible?
First of all, if you want a high-quality professional product to do that, you want Prince XML.
If you are looking for an open-source tool to achieve something similar, you can look into this SO question.
You could prepare a static PDF form file, then just fill it in with values using PHP's FDF module.
It depends on which platform you are using. This would be an easy job if you are using Groovy on Grails. There are plugins which facilitate PDF reporting, like the jasper-plugin.
Luis
Check out jsPDF, an open-source library for generating PDF documents using nothing but JavaScript.
You can process the data with Apache FOP after transforming it to XML. (http://xmlgraphics.apache.org/fop/).
If your page is template-based, you can create a template which produces XML output and process that. You'll have extremely fine control over the PDF construction. The tradeoff is that it is not a "plug this in and it will work" solution, but I've done that, and once it's set up, it works like a charm.
I've used TCPDF in the past, it's a little kludgy but can definitely get the job done. (http://www.tecnick.com/public/code/cp_dpage.php?aiocp_dp=tcpdf)
The FPDF module in PHP is simple enough to get the data together. It is a safe option, since you know what data you are passing to the PDF engine. There are some streaming PDF options which can take in a bunch of HTML and output it as PDF; however, they can get it quite wrong without you knowing.
I have used wkhtmltoimage/wkhtmltopdf on Linux machines a number of times, on many projects. It works like a charm and is easy to use: just a script that you run.
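For the original question (MySQL rows into a PDF on a button click), the wkhtmltopdf approach could be sketched roughly as follows. It assumes wkhtmltopdf is installed and on the PATH and that the rows were already fetched as associative arrays (for example with PDO's fetchAll); the function names and the bare table markup are only illustrative.

```php
<?php
// Sketch only: requires the wkhtmltopdf binary to be installed.

/** Render rows (associative arrays, e.g. from a MySQL query) as an HTML table. */
function rowsToHtmlTable(array $rows): string
{
    if ($rows === []) {
        return '<p>No data</p>';
    }
    $html = '<table border="1"><tr>';
    foreach (array_keys($rows[0]) as $column) {
        $html .= '<th>' . htmlspecialchars((string) $column) . '</th>';
    }
    $html .= '</tr>';
    foreach ($rows as $row) {
        $html .= '<tr>';
        foreach ($row as $value) {
            $html .= '<td>' . htmlspecialchars((string) $value) . '</td>';
        }
        $html .= '</tr>';
    }
    return $html . '</table>';
}

/** Write the HTML to a temporary file and convert it with wkhtmltopdf. */
function htmlToPdf(string $html, string $pdfPath): void
{
    $tmpBase = tempnam(sys_get_temp_dir(), 'report');
    $htmlPath = $tmpBase . '.html'; // use an .html extension for the input file
    file_put_contents(
        $htmlPath,
        '<!DOCTYPE html><html><head><meta charset="utf-8"></head><body>'
        . $html . '</body></html>'
    );
    $command = 'wkhtmltopdf ' . escapeshellarg($htmlPath) . ' ' . escapeshellarg($pdfPath);
    exec($command, $output, $exitCode);
    unlink($htmlPath);
    unlink($tmpBase);
    if ($exitCode !== 0) {
        throw new RuntimeException("wkhtmltopdf failed with exit code $exitCode");
    }
}
```

The button handler would call `htmlToPdf(rowsToHtmlTable($rows), $pdfPath)` and then send the resulting file to the browser with a `Content-Type: application/pdf` header.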
