I am trying to convert a .doc or .docx file to PDF, but the result doesn't match the original file and the PDF has no styling. I don't know why; I'm using TCPDF and PHPWord.
This is my conversion code:
$filetarget = FileHelper::normalizePath($pathdirectory.'/'.$filename);
$objReader = \PhpOffice\PhpWord\IOFactory::createReader('Word2007');
$contents = $objReader->load($filetarget);
$tcpdfPath = Yii::getAlias('@baseApp') . '/vendor/tecnickcom/tcpdf';
\PhpOffice\PhpWord\Settings::setPdfRendererPath($tcpdfPath);
\PhpOffice\PhpWord\Settings::setPdfRendererName('TCPDF');
$objWriter = \PhpOffice\PhpWord\IOFactory::createWriter($contents,'PDF');
$fileresult = str_replace('.docx', '.pdf', $filetarget);
$objWriter->save($fileresult);
$toPdf = FileHelper::normalizePath($fileresult);
This is part of the result after converting from docx to PDF,
and this is part of the original docx file.
What's wrong with my code?
Unfortunately PHPWord's PDF writer is very basic: for DOCX to PDF output there is no support for preserve text, page breaks, lists or images.
For the current list of features, see
https://phpword.readthedocs.io/en/latest/intro.html#writers
If OpenOffice or LibreOffice is available as a converter, you could try other PHP methods to run the conversion directly.
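One way to run the conversion directly is to call LibreOffice in headless mode from PHP. This is only a sketch: it assumes a LibreOffice binary named soffice is on the PATH and that the host allows exec(), neither of which is stated in the question. The function names buildSofficeCommand and convertWithSoffice are made up for illustration.

```php
<?php
// Sketch only: assumes LibreOffice ("soffice") is installed and exec() is permitted.

/** Build the headless LibreOffice command line for a conversion to PDF. */
function buildSofficeCommand(string $inputFile, string $outputDir, string $binary = 'soffice'): string
{
    return sprintf(
        '%s --headless --convert-to pdf --outdir %s %s',
        escapeshellcmd($binary),
        escapeshellarg($outputDir),
        escapeshellarg($inputFile)
    );
}

/** Run the conversion and return the path of the PDF, or null if it was not produced. */
function convertWithSoffice(string $inputFile, string $outputDir): ?string
{
    exec(buildSofficeCommand($inputFile, $outputDir) . ' 2>&1', $output, $exitCode);

    // LibreOffice names the result after the input file, with a .pdf extension.
    $pdf = rtrim($outputDir, '/\\') . DIRECTORY_SEPARATOR
         . pathinfo($inputFile, PATHINFO_FILENAME) . '.pdf';

    return ($exitCode === 0 && is_file($pdf)) ? $pdf : null;
}
```

Because LibreOffice does the layout itself, styles, lists and images usually survive much better than with the PHPWord PDF writer, at the cost of needing the binary on the server.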
Related
I am converting a docx file to PDF using the PHPOffice PHPWord library, with TCPDF as the PDF writer.
My code is below:
include_once 'Sample_Header.php';
include_once '../vendor/tecnickcom/tcpdf/tcpdf.php';
\PhpOffice\PhpWord\Settings::setPdfRendererPath('../vendor/tecnickcom/tcpdf');
\PhpOffice\PhpWord\Settings::setPdfRendererName('TCPDF');
$temp = \PhpOffice\PhpWord\IOFactory::load('files/sampledocument.docx');
$xmlWriter = \PhpOffice\PhpWord\IOFactory::createWriter($temp , 'PDF');
$xmlWriter->save('results/sampledocument.pdf', TRUE);
It works fine and generates a correct PDF only if the docx file contains plain text without any style changes (e.g. color, bold text). If the docx file contains styling, each styled piece of text starts on a new line in the PDF.
For example, if the docx file contains
Hello World
it shows correctly in the PDF. But if the docx file contains the following ("H" and "W" are bold here)
**H**ello **W**orld
it shows in the PDF as below (on multiple lines instead of one):
H
ello
W
orld
Please let me know if anyone has a solution for this. Thanks in advance.
I have a PDF file with a QR code. I uploaded the file to a server folder named "tmp" and now I need to scan and decode this QR code via PHP.
I found this library:
include_once('lib/QrReader.php');
$qrcode = new QrReader('/var/tmp/qrsample.png');
$text = $qrcode->text(); //return decoded text from QR Code
print $text;
But this works only for png/jpeg files.
Is there any way to scan a PDF? Or is there any way to convert the PDF to PNG only at the moment I need it?
Thank you!
First, transform your PDF into an image with Imagick, then use your library to decode the QR code from it:
include_once('lib/QrReader.php');
//the PDF file
$pdf = 'mypdf.pdf';
//retrieve the first page of the PDF as image
$image = new Imagick($pdf.'[0]');
//pass it to the QRreader library
$qrcode = new QrReader($image, QrReader::SOURCE_TYPE_RESOURCE);
$text = $qrcode->text(); //return decoded text from QR Code
print $text;
//clear the resources used by Imagick
$image->clear();
You would want to convert the PDF to a supported image type, OR find a QR code reading library that supports PDFs. IMO the first option is easier. A quick search leads me to http://www.phpgang.com/how-to-convert-pdf-to-jpeg-in-php_498.html for a PDF -> img converter.
Presumably the QR code is embedded in the PDF as an image. You could use a command-line tool such as pdfimages to extract the image first, then run your QRReader library on the extracted image. You might need a bit of trial and error to establish which image is the QR code if there is more than one image in the PDF.
See extract images from PDF with PHP for more detail.
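The pdfimages route could be sketched as follows. It assumes the poppler-utils pdfimages tool is installed and that exec() is allowed on the server; the function names buildPdfimagesCommand and extractPdfImages are hypothetical.

```php
<?php
// Sketch only: assumes the "pdfimages" tool from poppler-utils is installed.

/** Build the pdfimages command that writes every embedded image as a PNG file. */
function buildPdfimagesCommand(string $pdfFile, string $outputPrefix): string
{
    // -png makes pdfimages write <prefix>-000.png, <prefix>-001.png, ...
    return sprintf(
        'pdfimages -png %s %s',
        escapeshellarg($pdfFile),
        escapeshellarg($outputPrefix)
    );
}

/** Extract the images and return the list of PNG files that were written. */
function extractPdfImages(string $pdfFile, string $outputPrefix): array
{
    exec(buildPdfimagesCommand($pdfFile, $outputPrefix), $output, $exitCode);
    if ($exitCode !== 0) {
        return [];
    }
    $files = glob($outputPrefix . '-*.png') ?: [];
    sort($files);
    return $files;
}
```

Each returned file can then be passed to QrReader as in the question, keeping the first one that decodes to a non-empty text. That covers the trial and error needed when the PDF contains more than one image.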
I am creating .docx files from a template using PHPWord. It works fine but now I want to convert the generated file to PDF.
First I tried using TCPDF in combination with PHPWord:
$wordPdf = \PhpOffice\PhpWord\IOFactory::load($filename.".docx");
\PhpOffice\PhpWord\Settings::setPdfRendererPath(dirname(__FILE__)."/../../Office/tcpdf");
\PhpOffice\PhpWord\Settings::setPdfRendererName('TCPDF');
$pdfWriter = \PhpOffice\PhpWord\IOFactory::createWriter($wordPdf , 'PDF');
if (file_exists($filename.".pdf")) unlink($filename.".pdf");
$pdfWriter->save($filename.".pdf");
But when I load the file to convert it to PDF, I get the following exception:
Fatal error: Uncaught exception 'BadMethodCallException' with message 'Cannot add PreserveText in Section.'
After some research I found that others have run into this bug as well (phpWord - Cannot add PreserveText in Section).
EDIT
After experimenting some more, I found that the exception only occurs when the document contains mail merge fields. Once I removed them the exception no longer came up, but the converted PDF files look horrible. All style information is gone and I can't use the result, so I still need an alternative.
I thought about using another way to generate the PDF, but I could only find 4 ways:
Using OpenOffice - Impossible, as I cannot install any software on the server. The approach mentioned here did not work either, as my hoster (Strato) uses SunOS as the OS and this needs Linux
Using phpdocx - I do not have any budget to pay for it and the demo cannot create PDF
Using PHPLiveDocx - This works, but has a limit of 250 documents per day and 20 per hour, and I have to convert around 300 documents at once, maybe even multiple times a day
Using PHP-Digital-Format-Convert - The output looks better than with PHPWord and tcpdf, but still not usable as images are missing, and most (not all!) of the styles
Is there a 5th way to generate the PDF? Or is there any solution to make the generated PDF documents look nice?
I used Gears/pdf to convert the docx file generated by phpword to PDF:
$success = Gears\Pdf::convert(
'file_path/file_name.docx',
'file_path/file_name.pdf');
You're unlinking the PDF file before saving it. You should unlink the DOCX document instead, after the PDF has been saved.
Try this:
$pdfWriter = \PhpOffice\PhpWord\IOFactory::createWriter($wordPdf , 'PDF');
$pdfWriter->save($filename.".pdf");
// unlink() needs a file path, not the PhpWord object
unlink($filename.".docx");
I don't think I'm correct..
First, save the document as HTML content:
$objWriter = \PhpOffice\PhpWord\IOFactory::createWriter($phpWord, 'HTML');
After that, read the HTML file content and write it to a PDF file with the help of mPDF, TCPDF or FPDF.
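Put together, the two steps might look like the sketch below. It assumes PHPWord and TCPDF are installed through Composer and that vendor/autoload.php has already been required; the function name docxToPdfViaHtml is made up for illustration.

```php
<?php
// Sketch only: assumes phpoffice/phpword and tecnickcom/tcpdf are autoloadable.

function docxToPdfViaHtml(string $docxPath, string $pdfPath): void
{
    // Step 1: load the DOCX and write it out as HTML to a temporary file.
    $phpWord  = \PhpOffice\PhpWord\IOFactory::load($docxPath);
    $htmlFile = tempnam(sys_get_temp_dir(), 'docx_html_');
    \PhpOffice\PhpWord\IOFactory::createWriter($phpWord, 'HTML')->save($htmlFile);
    $html = file_get_contents($htmlFile);
    unlink($htmlFile);

    // Step 2: hand the HTML to TCPDF and save the result.
    $pdf = new \TCPDF();
    $pdf->AddPage();
    $pdf->writeHTML($html);
    // 'F' saves to a file; recent TCPDF versions expect an absolute path here.
    $pdf->Output($pdfPath, 'F');
}
```

The result can only be as faithful as the intermediate HTML, so styles that the HTML writer drops will still be missing in the PDF. The benefit is that the HTML can be inspected or adjusted before it is rendered.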
Try this:
// name of the input MS Word file
$inputFile = "C:\\PHP\\Test1.docx";
// name of the output PDF file
$outputFile = "C:\\PHP\\Test1.pdf";
try
{
$oLoader = new COM("easyPDF.Loader.8");
$oPrinter = $oLoader->LoadObject("easyPDF.Printer.8");
$oPrintJob = $oPrinter->PrintJob;
$oPrintJob->PrintOut ($inputFile, $outputFile);
print "Success";
}
catch(com_exception $e)
{
print "error code ".$e->getCode(). "\n";
print $e->getMessage();
}
I want to read the content of a .doc file or convert a .doc file into .docx.
I have used a COM object, but it doesn't work because I have a Linux-based server.
I have also tried the shell_exec command, but it doesn't work because that feature isn't available on the shared server.
Is there any API I can use to convert a .doc file to .docx?
If you want to use PHP, try out PHPWord.
There's an example on how to convert from .doc to .docx on this page:
require_once '../PHPWord.php';
$PHPWord = new PHPWord();
$document = $PHPWord->loadTemplate('Excel2003.doc');
// Save File
$objWriter = PHPWord_IOFactory::createWriter($PHPWord, 'Word2007');
$objWriter->save('Excel2007.docx');
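With a current PHPWord release (the phpoffice/phpword Composer package), the same idea might look like the sketch below. It assumes the Composer autoloader is already loaded and relies on the library's MsDoc reader, whose support for the old binary format is limited, so complex formatting may be lost. The helper names docxNameFor and convertDocToDocx are hypothetical.

```php
<?php
// Sketch only: assumes phpoffice/phpword is autoloadable via Composer.

/** Derive the .docx file name from a .doc file name. */
function docxNameFor(string $docPath): string
{
    return preg_replace('/\.doc$/i', '.docx', $docPath);
}

/** Read a binary .doc file and write it back as .docx; returns the new path. */
function convertDocToDocx(string $docPath): string
{
    // 'MsDoc' selects the reader for the old binary Word format.
    $phpWord  = \PhpOffice\PhpWord\IOFactory::load($docPath, 'MsDoc');
    $docxPath = docxNameFor($docPath);
    \PhpOffice\PhpWord\IOFactory::createWriter($phpWord, 'Word2007')->save($docxPath);
    return $docxPath;
}
```

This runs in pure PHP, so it needs neither COM nor shell_exec, which matches the constraints of a shared Linux host.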
Somebody has asked me to make an app in PHP that will generate a .doc file with an image and a few tables in it. My first approach was:
<?php
function data_uri($file, $mime)
{
$contents = file_get_contents($file);
$base64 = base64_encode($contents);
return ('data:' . $mime . ';base64,' . $base64);
}
$file = 'new.doc';
$fh = fopen($file,'w');
$uri = data_uri('pic.png','image/png');
fwrite($fh,'<table border="1"><tr><td><b>something</b></td><td>something else</td></tr><tr><td></td><td></td></tr></table>
<br/><img src="'.$uri.'" alt="some text" />
<br/>
<table border="1"><tr><td><b>ceva</b></td><td>altceva</td></tr><tr><td></td><td></td></tr></table>');
fclose($fh);
?>
This uses the data URI technique for embedding an image.
This generates an HTML file that renders fine in web browsers, but the image is missing in Microsoft Office Word, at least in the standard setup. Then, while editing the file with Word, I replaced the image with an image from file, and Word changed the contents of the file into Open XML and added a folder, new_files, where it put the imported image (which was a .png), a .gif version of the image and an XML file:
<xml xmlns:o="urn:schemas-microsoft-com:office:office">
<o:MainFile HRef="../new.doc" />
<o:File HRef="image001.jpg" />
<o:File HRef="filelist.xml" />
</xml>
Now this isn't good enough either, since I want everything kept in a single .doc file.
Is there a way to embed an image in an OpenXML-formatted .doc file?
Look here: http://www.tkachenko.com/blog/archives/000106.html
<w:pict>
<v:shapetype id="_x0000_t75" ...>
... VML shape template definition ...
</v:shapetype>
<w:binData w:name="wordml://02000001.jpg">
... Base64 encoded image goes here ...
</w:binData>
<v:shape id="_x0000_i1025" type="#_x0000_t75"
style="width:212.4pt;height:159pt">
<v:imagedata src="wordml://02000001.jpg"
o:title="Image title"/>
</v:shape>
</w:pict>
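Generating that fragment from PHP could be sketched as below. The function name wordmlPicture and its parameters are made up for illustration. The shape id and type are copied from the fragment above, and the elided v:shapetype definition is left as a placeholder comment rather than guessed.

```php
<?php
// Sketch only: builds the w:binData and v:shape parts of the WordML picture fragment.

function wordmlPicture(string $imageBytes, string $name, string $title, float $widthPt, float $heightPt): string
{
    // Escape values that end up inside XML attributes.
    $esc = static function (string $s): string {
        return htmlspecialchars($s, ENT_QUOTES | ENT_XML1, 'UTF-8');
    };
    $src = $esc('wordml://' . $name);

    return "<w:pict>\n"
        . "  <!-- the v:shapetype definition (elided in the answer) goes here -->\n"
        . '  <w:binData w:name="' . $src . '">' . base64_encode($imageBytes) . "</w:binData>\n"
        . '  <v:shape id="_x0000_i1025" type="#_x0000_t75" style="'
        . sprintf('width:%spt;height:%spt', $widthPt, $heightPt) . "\">\n"
        . '    <v:imagedata src="' . $src . '" o:title="' . $esc($title) . "\"/>\n"
        . "  </v:shape>\n"
        . "</w:pict>\n";
}
```

A call such as wordmlPicture(file_get_contents('pic.png'), '02000001.png', 'Image title', 212.4, 159) returns a string to place inside the document body. The image then travels inside the single file as Base64 text, which is what the question asks for.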
There is PHPWord project to manipulate MS Word from within PHP.
PHPWord is a library written in PHP
that create word documents. No Windows
operating system is needed for usage
because the result are docx files
(Office Open XML) that can be opened
by all major office software.
PHPWord can write them: http://phpword.codeplex.com/ (note: it's still in beta. I've used PHPExcel by the same guy a lot, but never tried the Word version).
Have a look at the phpdocx library for generating real .docx files rather than html files with a .doc extension
PS the extension should strictly be .docx rather than .doc for Open XML Word 2007 files
OpenTBS can create DOCX (and other OpenXML files) dynamic documents in PHP using the technique of templates.
No temporary files needed, no command lines, all in PHP.
It can add or delete pictures. The created document can be produced as an HTTP download, a file saved on the server, or as binary contents in PHP.
It can also merge OpenDocument files (ODT, ODS, ODF, ...)
http://www.tinybutstrong.com/opentbs.php
I would use PHPExcel. It can work with OpenXML too.
Here's the link: http://phpexcel.codeplex.com/