How to convert a .docx / .doc file to .pdf using PHP only

I want to convert an MS Word file (.doc, .docx) to a .pdf or .html file without losing the styles and images in the Word file.
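// pathinfo() keeps the full base name even when the file name contains extra dots.
$basename = pathinfo($filename, PATHINFO_FILENAME);
$doc = new Docx_reader();
$doc->setFile($item);
if (!$doc->get_errors()) {
    $html = $doc->to_html();
    $plain_text = $doc->to_plain_text();
    // Append the generated HTML to "<basename>.html", locking the file while writing.
    file_put_contents("$basename.html", $html . PHP_EOL, FILE_APPEND | LOCK_EX);
} else {
    echo implode(', ', $doc->get_errors());
}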
I tried the solution above, but it only gives me HTML without the styles and images. I want an .html or .pdf file that looks the same as the .doc/.docx file.

This is not an easy task, and your rate of success will largely depend on the complexity of your Word document. If only basic style elements are used (bold, italic, underline, color), you could use the HTML add-on for FPDF. Any more complex elements would require a function that translates them to FPDF calls.
I assume you need this for template purposes, so may I suggest programming your page layout directly in FPDF? You can create custom tags for layout which can be used in a WYSIWYG editor.
You will lose some of the flexibility compared to a Word document, but that loss does not outweigh the ease of setting up standard FPDF. But again, your best way to go will depend on the functionality you need and your end goal. I suggest taking a look at the possibilities with FPDF. It's a great script and can generate PDFs lightning fast!

Related

DOMPDF change the html elements places

I am working with Laravel. I have an HTML view with integrated CSS that I want to convert into a PDF. If I open the view (it doesn't matter whether I open it from my Documents folder or via a link in my app), it works fine and everything looks OK. But if I take that file and generate a PDF with dompdf, the CSS background and the images are in their places, but the text moves to different positions and has a different size.
Here is how I convert it to a PDF
$file = file_get_contents("../resources/views/panel/historial/pdfs/otros.html");
$dompdf = new DOMPDF();
$dompdf->loadHtml($file);
//$dompdf->load_html_file($html);
$dompdf->render();
$dompdf->stream("otros.pdf", array("Attachment" => 0));
return $dompdf;
I think it does not work as expected because the dompdf renderer does not support the full range of CSS.
https://github.com/dompdf/dompdf/wiki/CSSCompatibility
That page lists the elements that are supported. In your case I would suggest creating a separate template with simpler styling for the PDF.
Another good option is wkhtmltopdf, which has better CSS support. It is a command-line tool, so you have to call it from PHP, or, if you don't need PHP, run it directly from your command line.
https://wkhtmltopdf.org/
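Calling the tool from PHP could look roughly like the sketch below. It assumes the wkhtmltopdf binary is installed and on the PATH, and uses only the basic "input file, output file" form of the command. The helper names build_wkhtmltopdf_command() and html_file_to_pdf() are made up for this illustration.

```php
<?php
// Sketch only: assumes the wkhtmltopdf binary is installed and on the PATH.
// build_wkhtmltopdf_command() and html_file_to_pdf() are hypothetical helpers.

function build_wkhtmltopdf_command(string $htmlPath, string $pdfPath): string
{
    // escapeshellarg() keeps spaces and quotes in paths from breaking the command.
    return 'wkhtmltopdf ' . escapeshellarg($htmlPath) . ' ' . escapeshellarg($pdfPath);
}

function html_file_to_pdf(string $htmlPath, string $pdfPath): bool
{
    // Run the tool and capture its exit code; 0 means it reported success.
    exec(build_wkhtmltopdf_command($htmlPath, $pdfPath) . ' 2>&1', $output, $exitCode);
    return $exitCode === 0 && is_file($pdfPath);
}
```

With this in place, html_file_to_pdf('otros.html', 'otros.pdf') would return true only if the tool ran successfully and the PDF file exists afterwards.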

Improve dompdf rendering

Here is the dompdf rendering of my SO profile page....
https://docs.google.com/file/d/0B-I8istAg8Z6UmJDVHEtUkZOUDQ/edit
here's the code
<?php
require_once("dompdf/dompdf_config.inc.php");
$html = file_get_contents('http://stackoverflow.com/users/1461078/samidh-t');
$base_path = 'http://'.$_SERVER['HTTP_HOST'] ;
$dompdf = new DOMPDF();
if ( isset($base_path) ) {
$dompdf->set_base_path($base_path);
}
$dompdf->load_html($html);
$dompdf->render();
$dompdf->stream('file.pdf' , array("Attachment" => 0));
?>
The rendering is not even close to the profile page.
What can be done to improve it?
From memory, dompdf renders pages fine when the CSS and HTML are simple, but layouts with lots of floats do not work. You will find rendering arbitrary web pages as PDFs with dompdf difficult unless you are writing the HTML and CSS yourself (or parsing the input files first and modifying them).
What is your overall goal?
I have used dompdf successfully to create PDF reports from database data, but without using 'float' or any other advanced CSS rules.
The dompdf README describes which CSS rules it can handle:
https://github.com/dompdf/dompdf/blob/master/README.md
The relevant text:
Limitations (Known Issues)
- not particularly tolerant to poorly-formed HTML input (using Tidy first may help).
- large files or large tables can take a while to render
- CSS float is not supported (but is in the works).

Creating a personalization engine with php

I am new to PHP and I want to create a PHP engine that changes the content of a webpage based on data in MySQL (for example, reordering the navigation links on a page so that the links with the highest click count come first). I am not sure how PHP will read the HTML file and change the elements in the HTML file and also output the HTML file with the changes. Is this possible?
I am not quite sure why you would want to generate the html, read it, change it and then output it. It seems to be a lot easier to just generate it the way you want to in the first place.
I am not sure how PHP will read the HTML file and change the elements in the HTML file and also output the HTML file with the changes. Is this possible?
You could use file_get_contents:
$html = file_get_contents($url);
Then use a html-parser like Simple HTML DOM Parser, change what you want to do and output it.
If you want to modify HTML structure, use ganon - HTML DOM parser for PHP
include('path/ganon.php');
// Parse the google code website into a DOM
$html = file_get_dom('http://code.google.com/');
foreach($html('p[class]') as $element) {
echo $element->class, "<br>\n";
}
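If a third-party parser is not wanted, PHP's bundled DOM extension can do the same kind of edit. The sketch below reorders the items of a navigation list by click count, as in the question's example. The function name reorder_nav_by_clicks(), the id "nav" and the shape of the $clicks array are assumptions made for illustration. In practice the counts would come from a MySQL query.

```php
<?php
// Sketch: reorder the <li> items of <ul id="nav"> by click count.
// reorder_nav_by_clicks(), the "nav" id and $clicks (href => count) are assumptions.

function reorder_nav_by_clicks(string $html, array $clicks): string
{
    $dom = new DOMDocument();
    // Suppress the warnings libxml raises for HTML5 or sloppy markup.
    @$dom->loadHTML($html);

    $xpath = new DOMXPath($dom);
    $list = $xpath->query('//ul[@id="nav"]')->item(0);
    if ($list === null) {
        return $html; // no navigation list found, leave the page untouched
    }

    // Collect each <li> together with the click count of its first link.
    $items = [];
    foreach ($xpath->query('li', $list) as $li) {
        $link = $xpath->query('.//a', $li)->item(0);
        $href = $link !== null ? $link->getAttribute('href') : '';
        $items[] = ['node' => $li, 'clicks' => $clicks[$href] ?? 0];
    }

    // Highest click count first.
    usort($items, function ($a, $b) {
        return $b['clicks'] <=> $a['clicks'];
    });

    // appendChild() moves an existing node, so appending in sorted order reorders the list.
    foreach ($items as $item) {
        $list->appendChild($item['node']);
    }

    return $dom->saveHTML();
}
```

As the first answer points out, generating the links in the desired order in the first place is usually simpler. This approach mainly makes sense when the HTML comes from a source you do not control.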

How to generate a PDF from a portion of a Wordpress post?

A client of mine has a food blog hosted on WordPress. Each post entry contains some text and a div called "recipes" with some more text inside it. They would like to add to this div a link that generates a PDF of the recipe, dynamically, for saving or printing, as the user sees fit.
I have seen quite a few Wordpress plugins that offer the conversion of entire posts to PDF but not anything that's customizable enough to select a given portion of a post, the way we'd like to.
Any suggestions on how to do this? I'm comfortable with PHP, Javascript, CSS but am new to the various PDF libraries.
Take a look at dompdf. It's pretty easy to work with :) This is from the documentation:
<?php
require_once("dompdf_config.inc.php");
$html =
'<html><body>'.
'<p>Put your html here, or generate it with your favourite '.
'templating system.</p>'.
'</body></html>';
$dompdf = new DOMPDF();
$dompdf->load_html($html);
$dompdf->render();
$dompdf->stream("sample.pdf");
If you need more control than dompdf you could always use PHP's XML/XSL methods to convert the HTML to XSL:FO and use FOP on the commandline to generate the PDF. It's a little long-winded but you get complete control of the output styling/structure, the ability to "lock" the PDF, provide metadata, etc.

How can LiveDocx in PHP be used to read .doc & .docx files and read text inside it and save to HTML?

Let's say we have .doc and .docx files. I want to use LiveDocx in PHP to load the files, read their content and extract the text. Then save it to an HTML string.
Can this be done?
I've searched the documentation, and it seems that LiveDocx loads .doc and .docx template files only!
You can avoid external libraries altogether and simply grab the text from the XML inside the files:
http://www.webcheatsheet.com/PHP/reading_the_clean_text_from_docx_odt.php
I think you can find what you need in this example.
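A minimal sketch of that idea for .docx files follows. It relies on the fact that a .docx file is a ZIP archive whose body text is stored in word/document.xml, and it needs PHP's zip extension. The helper names docx_xml_to_text() and read_docx_text() are made up here, and the code is not taken from the linked page.

```php
<?php
// Sketch: pull plain text out of a .docx without an external library.
// docx_xml_to_text() and read_docx_text() are hypothetical helper names.

function docx_xml_to_text(string $xml): string
{
    // Turn paragraph ends into newlines before the tags are stripped.
    $xml = str_replace('</w:p>', "\n", $xml);
    // Strip all remaining XML tags and decode entities such as &amp;.
    return trim(html_entity_decode(strip_tags($xml), ENT_QUOTES | ENT_XML1, 'UTF-8'));
}

function read_docx_text(string $path): ?string
{
    $zip = new ZipArchive(); // requires the zip extension
    if ($zip->open($path) !== true) {
        return null; // not a readable ZIP archive
    }
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();

    return $xml === false ? null : docx_xml_to_text($xml);
}
```

This only recovers the text. Styles, images and tables are lost, and the old binary .doc format is not a ZIP archive, so it cannot be read this way.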
I might be wrong, but I think they call them "template" files because they act like a template but are still normal .doc/.docx documents. I suggest you simply try to run that example.
I think you can use TextControl, which improves on phpLiveDocx.
Using this you can also import PDF, .doc and .docx files.
When you do document conversion on LiveDocX, you need to do a mailmerge and then retrieve the document. Even though you aren't inserting any new content, you need to do a mailmerge that replaces a dummy placeholder with dummy content.
So, the process I'd suggest is:
1) Set your source document as local template
2) Merge a dummy field with dummy content
3) Retrieve your document as HTML
4) Use a script server side to remove the html and leave only the content (Something like, remove everything between the HEAD tags, then strip_tags on the rest)
5) You should be left with your content as a simple string - I'm not sure it'll be too meaningful, but might be useful for building something like search indices.
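Steps 4 and 5 can be sketched in plain PHP as follows. The function name html_to_plain_text() is hypothetical, and the input is assumed to be the HTML string retrieved in step 3.

```php
<?php
// Sketch of steps 4 and 5; html_to_plain_text() is a hypothetical helper name.

function html_to_plain_text(string $html): string
{
    // Drop the whole <head> element (title, styles, meta tags).
    $body = preg_replace('#<head\b[^>]*>.*?</head>#is', '', $html);
    // Drop <style> and <script> blocks so their contents do not leak into the text.
    $body = preg_replace('#<(style|script)\b[^>]*>.*?</\1>#is', '', $body);
    // Remove the remaining tags and decode entities such as &amp;.
    $text = html_entity_decode(strip_tags($body), ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // Collapse runs of whitespace into single spaces.
    return trim(preg_replace('/\s+/u', ' ', $text));
}
```

The result is a single line of text, which suits uses like search indexing but discards all paragraph structure.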
