I want to convert any PDF, DOCX, or DOC file into HTML using PHP, keeping the same styling as the original document. I have not found a proper solution.
// change pdftohtml bin location
\Gufy\PdfToHtml\Config::set('pdftohtml.bin', 'C:/poppler-0.37/bin/pdftohtml.exe');
// change pdfinfo bin location
\Gufy\PdfToHtml\Config::set('pdfinfo.bin', 'C:/poppler-0.37/bin/pdfinfo.exe');
// initiate
$pdf = new Gufy\PdfToHtml\Pdf($item);
// convert to html and return it as [Dom Object](https://github.com/paquettg/php-html-parser)
$html = $pdf->html();
This is not working for me.
I had a similar problem and found a GitHub project that I used with Word documents. It worked fairly well at the time, but I haven't tested it lately. Try it.
https://github.com/benbalter/Convert-Word-Documents-to-HTML
I think this post could help you as a first step. With it, you will be able to convert any PDF into HTML using PHP.
After that, you can use the help provided by this post to convert .doc and .docx files to PDF using PHP.
You can then build a function for each document extension that you want to convert into HTML.
Good luck.
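The "one function per extension" idea can be sketched roughly as below. This is only an outline: every function name here is hypothetical, and the converter bodies are placeholders standing in for whichever PDF and Word tools you end up using. Word files go through PDF first, as suggested above.

```php
<?php
// Placeholder: a real version would call a PDF-to-HTML tool here.
function convertPdfToHtml(string $pdfPath): string
{
    return '<!-- HTML for ' . basename($pdfPath) . ' -->';
}

// Placeholder: a real version would convert the Word file to PDF
// and return the path of the PDF it wrote.
function convertWordToPdf(string $wordPath): string
{
    return preg_replace('/\.docx?$/i', '.pdf', $wordPath);
}

// Word documents are converted to PDF first, then handled like any PDF.
function convertWordToHtml(string $wordPath): string
{
    return convertPdfToHtml(convertWordToPdf($wordPath));
}

// Pick the converter from the file extension (case-insensitive).
function convertToHtml(string $path): string
{
    $converters = [
        'pdf'  => 'convertPdfToHtml',
        'doc'  => 'convertWordToHtml',
        'docx' => 'convertWordToHtml',
    ];
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    if (!isset($converters[$ext])) {
        throw new InvalidArgumentException("Unsupported extension: $ext");
    }
    return $converters[$ext]($path);
}
```

Adding another format then only means writing one more converter and adding one entry to the map.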
I've come across a web service which presents an API for converting documents. I haven't tested it very thoroughly but it does seem to produce decent results at converting Word to HTML:
https://cloudconvert.org/
Related
Good morning.
I have a test.html file. I would like to convert it into test.xlsx using PHP.
So far I can create a test.xls file using PHP, but it contains only HTML tags, so Excel cannot open it directly and shows an extension error. If the file were in real test.xlsx format, it would open smoothly.
I do not know how to proceed to get the expected result.
Please help if possible.
Thanks in advance.
Best regards,
Balraj
Use the PHPExcel_Reader_HTML reader class of PHPExcel.
I am using PDFjam to manipulate PDFs. I need to add a line of text at the bottom of the generated file. I tried, but I could not get it to work.
Can anybody guide me on how to do it?
In my PHP code I did:
$command = '-----------------';
exec($command);
As you know, PDFjam is for manipulating PDFs. It is a small collection of shell scripts which provide a simple interface to much of the functionality of the excellent pdfpages package. See the Ubuntu manual:
pdfjam - A shell script for manipulating PDF files
You should create your sheet as you are doing (5x6), create a separate sheet of minimal page size with the required information, and then merge both files into one.
Otherwise, create your sheet as a first step and use PDFlib to add the text as a second step. It is a very good tool. I hope this solves your problem.
I love pdftk and so wanted to find a solution using that. The following worked for me.
pdfjam --preamble '\usepackage{fancyhdr} \topmargin 85pt \oddsidemargin 140pt \cfoot{\thepage}' --pagecommand '\thispagestyle{plain}' --landscape --nup 2x1 --frame false --clip true --trim ".5in 0.5in 0.5in .65in" --delta '-0.25in 0' tmp.pdf
I cribbed it from: Page Numbering with the "{page} of {pages}", removing the "of pages" part.
Command converts pdf to 2x1, trims margins, and crops. Output is landscape.
\topmargin and \oddsidemargin seem to tell pdflatex where to put the numbers.
I can convert an HTML page to PDF and send it via email without any problem, but I am having trouble converting a PHP file to PDF.
Is it possible to convert a PHP file to PDF using mPDF, or do I need to use some other PHP class for this?
Thanks!
One option is to first convert the output of the PHP file to an HTML file, save it, and pass that file to mPDF.
Though my solution may seem the other way round, it worked well for me.
Or otherwise this Link
<?php
$file = '/home/user/Desktop/myfile.html';
$result = file_get_contents("url/of/ur/page");
echo $result; //view source now
file_put_contents($file, $result);
?>
Now you can pass this file to mPDF. Sorry, but I haven't used mPDF so far. Maybe this solution works for you.
Another option is cURL.
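If the PHP file is on the same server, the rendered HTML can also be captured without an HTTP request by using output buffering. This is a different technique from the file_get_contents() call above, shown only as a minimal sketch; the function name renderPhpToHtml is made up for illustration.

```php
<?php
// Run a PHP script and return everything it prints as one string.
function renderPhpToHtml(string $scriptPath): string
{
    if (!is_file($scriptPath)) {
        throw new RuntimeException("Script not found: $scriptPath");
    }
    ob_start();                      // start capturing output
    try {
        include $scriptPath;         // the script echoes into the buffer
        return (string) ob_get_contents();
    } finally {
        ob_end_clean();              // always discard and close the buffer
    }
}
```

The returned string can be saved with file_put_contents() as in the snippet above, or handed straight to mPDF, for example through its WriteHTML method.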
I have a website now and I want to create a button on it to convert this page to PDF.
Is there any code to make this happen? I cannot find it on the internet.
So I want to have a button and when I press on it it converts the page to a .PDF file.
I do not want to use a third-party website to generate the PDFs. I want to use it for internal purposes, to generate files with PHP. So I need code that can make a PDF for each page.
I use wkhtmltopdf (http://code.google.com/p/wkhtmltopdf/). It works very well, and there is a PHP wrapper.
Updated based on comments below on usage :
How to use the integration class:
require_once('wkhtmltopdf/wkhtmltopdf.php'); // Ensure this path is correct !
$html = file_get_contents("http://www.google.com");
$pdf = new WKPDF();
$pdf->set_html($html);
$pdf->render();
$pdf->output(WKPDF::$PDF_EMBEDDED,'sample.pdf');
Use FPDF. It's a well-respected PDF-generating library for PHP that is written in pure PHP (so installing it should be dead simple for you).
Try this:
http://www.macronimous.com/resources/Converting_HTML2PDF_using_PHP.asp
It will convert HTML to a PDF using FPDF and HTML2PDF class.
Also found this:
http://www.phpclasses.org/package/3168-PHP-Generate-PDF-documents-from-HTML-pages.html
I want to add a Word import function to our CMS. The only problem is that I cannot seem to find a good library for reading docx files (Word 2007).
Does anyone have recommendations? The library should be able to extract the content of the document and basic styling like italic, bold, and superscript.
Thanks for your help
docx files are actually just containers for the document's XML. You should be able to unzip the docx file and then go to the word folder inside, then to the document.xml. This has the actual text. But things like the fonts and styles are in other xml files in the docx container, so you'll probably want to mess around a bit and figure out what is what and how to match it up (start by using namespaces, I bet).
But yea, unzip the file, then use simplexml to convert it into something you can actually mess around with.
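The unzip-and-parse approach described above might look roughly like this. It is a sketch only, assuming the zip extension is enabled; the function name readDocxParagraphs is made up, and it extracts plain text per paragraph without styling. Run-level formatting such as bold (w:b) and italic (w:i) sits in the w:rPr element of each run and would need extra handling.

```php
<?php
// Namespace used by the main WordprocessingML elements (w:p, w:r, w:t).
const W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';

// Return the text of each paragraph in a .docx file as an array of strings.
function readDocxParagraphs(string $docxPath): array
{
    $zip = new ZipArchive();
    if ($zip->open($docxPath) !== true) {
        throw new RuntimeException("Cannot open $docxPath as a zip archive");
    }
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    if ($xml === false) {
        throw new RuntimeException('word/document.xml not found in archive');
    }

    $doc = simplexml_load_string($xml);
    if ($doc === false) {
        throw new RuntimeException('word/document.xml is not valid XML');
    }
    $doc->registerXPathNamespace('w', W_NS);

    $paragraphs = [];
    foreach ($doc->xpath('//w:p') as $p) {
        // Namespaces must be registered again on each node used for XPath.
        $p->registerXPathNamespace('w', W_NS);
        $text = '';
        foreach ($p->xpath('.//w:t') as $t) {
            $text .= (string) $t;   // concatenate the text of every run
        }
        $paragraphs[] = $text;
    }
    return $paragraphs;
}
```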
PHPDocX PRO includes a TransformDoc class that can read .docx (zip) files and generate XHTML (or PDF) from it:
...
require_once 'phpdocx_pro/classes/TransformDoc.inc';
$doc = new TransformDoc();
$doc->setStrFile($file->filepath);
$doc->generateXHTML();
$html = $doc->getStrXHTML();
There is a library that does this, but it works with the Zend Framework. Maybe it will help you.
It is called phpLiveDocx: http://www.phplivedocx.org/downloads/
The library is licensed under the New BSD license.
I have also just found a library that has both reading and writing support. Check it out on CodePlex: http://openxmlapi.codeplex.com. It is licensed under GPLv2.
Or, since you requested a library, you may want to look into something like Docvert. I was just looking around based on your question, and it's my favorite so far for PHP. You input the word file location, it transforms it into something simple with the attributes and all that good stuff.
Convert the docx document to ODT using OpenOffice, then use eZ Components to do the parsing and import. They actually use this import in their CMS, eZ Publish.
Here is a simple working solution I found:
http://webcheatsheet.com/php/reading_the_clean_text_from_docx_odt.php