I am converting a docx file to PDF using the PHPOffice PHPWord library, with TCPDF as the PDF writer.
My code is below:
include_once 'Sample_Header.php';
include_once '../vendor/tecnickcom/tcpdf/tcpdf.php';
\PhpOffice\PhpWord\Settings::setPdfRendererPath('../vendor/tecnickcom/tcpdf');
\PhpOffice\PhpWord\Settings::setPdfRendererName('TCPDF');
$temp = \PhpOffice\PhpWord\IOFactory::load('files/sampledocument.docx');
$xmlWriter = \PhpOffice\PhpWord\IOFactory::createWriter($temp , 'PDF');
$xmlWriter->save('results/sampledocument.pdf', TRUE);
It works fine and generates a correct PDF only if the docx file contains plain text without any style changes (i.e. color, bold text). If the docx file contains styling, each styled run starts on a new line in the PDF.
For example, if the docx file contains
Hello World
this shows correctly in the PDF. But if the docx file contains the text below ("H" and "W" are bold here)
**H**ello **W**orld
the PDF shows it as below (on multiple lines instead of one line)
H
ello
W
orld
Please let me know if anyone has a solution for this. Thanks in advance.
Related
I am trying to convert a doc or docx file to PDF, but the result doesn't match the original doc/docx file and the PDF has no styling at all. I don't know why; I'm using TCPDF and PHPWord.
This is my code for the conversion:
$filetarget = FileHelper::normalizePath($pathdirectory.'/'.$filename);
$objReader = \PhpOffice\PhpWord\IOFactory::createReader('Word2007');
$contents = $objReader->load($filetarget);
$tcpdfPath = Yii::getAlias('@baseApp') . '/vendor/tecnickcom/tcpdf';
\PhpOffice\PhpWord\Settings::setPdfRendererPath($tcpdfPath);
\PhpOffice\PhpWord\Settings::setPdfRendererName('TCPDF');
$objWriter = \PhpOffice\PhpWord\IOFactory::createWriter($contents,'PDF');
$fileresult = str_replace('.docx', '.pdf', $filetarget);
$objWriter->save($fileresult);
$toPdf = FileHelper::normalizePath($fileresult);
This is part of the result after converting from docx to PDF
and this is part of the original docx file.
What's wrong with my code?
Unfortunately PHPWord is very basic, so for DocX to PDF output there is no ability to preserve text or page breaks, and no support for lists or for exporting images.
For the current list of features see
https://phpword.readthedocs.io/en/latest/intro.html#writers
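PHPWord builds the PDF from its own HTML rendering through the configured renderer (TCPDF here), which is where the styling gets lost, so you could instead try running a converter such as OpenOffice directly from PHP.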
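To illustrate what running a converter directly could look like, here is a minimal sketch that shells out to LibreOffice in headless mode. It assumes a `soffice` binary is installed and on the PATH, which is not stated in the question; the function names `buildSofficeCommand` and `convertDocxToPdf` are made up for this example.

```php
<?php
// Sketch only: assumes LibreOffice is installed and `soffice` is on the PATH.
// Both function names are hypothetical helpers, not part of PHPWord or TCPDF.

// Build the shell command for a headless docx -> pdf conversion.
function buildSofficeCommand(string $docxPath, string $outDir, string $binary = 'soffice'): string
{
    return sprintf(
        '%s --headless --convert-to pdf --outdir %s %s',
        escapeshellcmd($binary),
        escapeshellarg($outDir),
        escapeshellarg($docxPath)
    );
}

// Run the conversion and return the path where the PDF should appear.
function convertDocxToPdf(string $docxPath, string $outDir): string
{
    exec(buildSofficeCommand($docxPath, $outDir) . ' 2>&1', $output, $exitCode);
    if ($exitCode !== 0) {
        throw new RuntimeException("Conversion failed:\n" . implode("\n", $output));
    }
    // LibreOffice names the result after the input file, with a .pdf extension.
    return rtrim($outDir, '/') . '/' . pathinfo($docxPath, PATHINFO_FILENAME) . '.pdf';
}

// Usage (not executed here):
// $pdf = convertDocxToPdf('files/sampledocument.docx', 'results');
```

Because the layout is done by the office suite itself, bold runs, lists and images should survive far better than with an HTML-based renderer, at the cost of needing the binary on the server.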
I have a PDF file with a QR code. I uploaded the file to a server folder named "tmp" and now I need to scan and decode this QR code via PHP.
I found this library:
include_once('lib/QrReader.php');
$qrcode = new QrReader('/var/tmp/qrsample.png');
$text = $qrcode->text(); //return decoded text from QR Code
print $text;
But this works only for PNG/JPEG files.
Is there any way to scan a PDF? Or is there a way to convert the PDF to PNG only at the moment I need it?
Thank you!
First, transform your PDF into an image with Imagick, then use your library to decode the QRcode from it:
include_once('lib/QrReader.php');
//the PDF file
$pdf = 'mypdf.pdf';
//retrieve the first page of the PDF as image
$image = new imagick($pdf.'[0]');
//pass it to the QRreader library
$qrcode = new QrReader($image, QrReader::SOURCE_TYPE_RESOURCE);
$text = $qrcode->text(); //return decoded text from QR Code
print $text;
//clear the resources used by Imagick
$image->clear();
You would want to convert the PDF to a supported image type, OR find a QR code reading library that supports PDFs. IMO the first option is easier. A quick search leads me to http://www.phpgang.com/how-to-convert-pdf-to-jpeg-in-php_498.html for a PDF -> img converter.
Presumably the QR code is embedded in the PDF as an image. You could use a command-line tool such as pdfimages to extract the image first, then run your QRReader library on the extracted image. You might need a bit of trial and error to establish which image is the QR code if there is more than one image in the PDF.
See extract images from PDF with PHP for more detail.
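A rough sketch of that approach, assuming the poppler `pdfimages` tool is installed on the server (the question does not say so). The helper names `buildPdfImagesCommand` and `extractPdfImages` are invented for illustration; `QrReader` is the library from the question.

```php
<?php
// Sketch only: assumes poppler's `pdfimages` is installed and on the PATH.
// Both function names are hypothetical helpers.

// Build the command that dumps every embedded image as PNG,
// written as <prefix>-000.png, <prefix>-001.png, ...
function buildPdfImagesCommand(string $pdfPath, string $outPrefix): string
{
    return sprintf(
        'pdfimages -png %s %s',
        escapeshellarg($pdfPath),
        escapeshellarg($outPrefix)
    );
}

// Run the extraction and return the sorted list of PNG files it produced.
function extractPdfImages(string $pdfPath, string $outPrefix): array
{
    exec(buildPdfImagesCommand($pdfPath, $outPrefix) . ' 2>&1', $output, $exitCode);
    if ($exitCode !== 0) {
        throw new RuntimeException("pdfimages failed:\n" . implode("\n", $output));
    }
    $files = glob($outPrefix . '-*.png') ?: [];
    sort($files);
    return $files;
}

// Usage (not executed here): try each extracted image until one decodes.
// foreach (extractPdfImages('/var/tmp/qrsample.pdf', '/var/tmp/qr') as $png) {
//     $text = (new QrReader($png))->text();
//     if ($text) { print $text; break; }
// }
```

Looping over every extracted file is the "trial and error" step: the first image that decodes to a non-empty string is taken to be the QR code.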
I am creating .docx files from a template using PHPWord. It works fine but now I want to convert the generated file to PDF.
First I tried using tcpdf in combination with PHPWord
$wordPdf = \PhpOffice\PhpWord\IOFactory::load($filename.".docx");
\PhpOffice\PhpWord\Settings::setPdfRendererPath(dirname(__FILE__)."/../../Office/tcpdf");
\PhpOffice\PhpWord\Settings::setPdfRendererName('TCPDF');
$pdfWriter = \PhpOffice\PhpWord\IOFactory::createWriter($wordPdf , 'PDF');
if (file_exists($filename.".pdf")) unlink($filename.".pdf");
$pdfWriter->save($filename.".pdf");
but when I try to load the file to convert it to PDF I get the following exception while loading the file
Fatal error: Uncaught exception 'BadMethodCallException' with message 'Cannot add PreserveText in Section.'
After some research I found that some others also have this bug (phpWord - Cannot add PreserveText in Section)
EDIT
After trying around some more I found out that the exception only occurs when I have mail merge fields in my document. Once I removed them the exception no longer comes up, but the converted PDF files look horrible. All style information is gone and I can't use the result, so the need for an alternative remains.
I thought about using another way to generate the PDF, but I could only find 4 ways:
1. Using OpenOffice - Impossible, as I cannot install any software on the server. The approach mentioned here did not work either, as my hoster (Strato) uses SunOS as the OS and this needs Linux
2. Using phpdocx - I do not have any budget to pay for it and the demo cannot create PDF
3. Using PHPLiveDocx - This works, but has a limit of 250 documents per day and 20 per hour, and I have to convert around 300 documents at once, maybe even multiple times a day
4. Using PHP-Digital-Format-Convert - The output looks better than with PHPWord and tcpdf, but it is still not usable, as images and most (not all!) of the styles are missing
Is there a 5th way to generate the PDF? Or is there any solution to make the generated PDF documents look nice?
I used Gears/pdf to convert the docx file generated by phpword to PDF:
$success = Gears\Pdf::convert(
'file_path/file_name.docx',
'file_path/file_name.pdf');
You're unlinking the PDF file before saving it; you also need to unlink the DOCX document, not the PDF one.
Try this:
$pdfWriter = \PhpOffice\PhpWord\IOFactory::createWriter($wordPdf , 'PDF');
$pdfWriter->save($filename.".pdf");
// unlink() expects a file path, not the PhpWord object
unlink($filename.".docx");
I don't think I'm correct..
First save the document as HTML content:
$objWriter = \PhpOffice\PhpWord\IOFactory::createWriter($phpWord, 'HTML');
After that, read the HTML file content and write it out as a PDF file with the help of mPDF, TCPDF or FPDF.
Try this:
// path of the input MS Word file
$inputFile = "C:\\PHP\\Test1.docx";
// path of the output PDF file
$outputFile = "C:\\PHP\\Test1.pdf";
try
{
    $oLoader = new COM("easyPDF.Loader.8");
    $oPrinter = $oLoader->LoadObject("easyPDF.Printer.8");
    $oPrintJob = $oPrinter->PrintJob;
    $oPrintJob->PrintOut($inputFile, $outputFile);
    print "Success";
}
catch (com_exception $e)
{
    print "error code " . $e->getCode() . "\n";
    print $e->getMessage();
}
I'm using PHPWord to read and write files as doc, docx or HTML.
But I have a problem getting the whole content of a docx file, for example.
Is there an example of how to get the content of a docx or HTML file? (I tried the PHPWord samples, but they didn't help me.)
What I want is to extract all the text from a file (docx, HTML or other) and save it in a database as plain text.
Can anyone help me, please?
We can use file_get_contents php function to get content of files
$file = file_get_contents('./people.txt', true);
Hope this will help!
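Note that this only gives readable text for plain-text files. A .docx is a ZIP archive, so file_get_contents on it returns compressed bytes. A minimal sketch for that case, assuming the zip extension is enabled and that only the main body text in word/document.xml is wanted (headers, footers and footnotes are ignored); the function name `docxToPlainText` is made up for this example:

```php
<?php
// Sketch only: requires ext-zip. Reads the main document part of a .docx
// and reduces it to plain text. The function name is a hypothetical helper.
function docxToPlainText(string $docxPath): string
{
    $zip = new ZipArchive();
    if ($zip->open($docxPath) !== true) {
        throw new RuntimeException("Cannot open $docxPath as a ZIP archive");
    }
    // The body of a Word document lives in this entry of the archive.
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    if ($xml === false) {
        throw new RuntimeException('word/document.xml not found in archive');
    }
    // Put a newline after each paragraph so paragraphs stay separated
    // once the tags are stripped.
    $xml = str_replace('</w:p>', "</w:p>\n", $xml);
    $text = strip_tags($xml);
    // Turn XML entities such as &amp; back into plain characters.
    return trim(html_entity_decode($text, ENT_QUOTES | ENT_XML1, 'UTF-8'));
}

// Usage (not executed here):
// $text = docxToPlainText('files/sampledocument.docx');
```

The returned string can then be stored in the database like any other text. Tables and text boxes come out as flat text, which is usually acceptable when the goal is a searchable plain-text copy.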
I am having a problem creating a PDF using PHPExcel. I can create the Excel and PDF files fine, but the characters in the PDF are not rendered properly. Image below.
If I copy any of the text and paste it somewhere else, it reads as normal text and not gibberish, which is very strange. My code is below.
$newpdf = "reports/risk-bordereaux-".$time.".pdf";
$objPHPExcel = PHPExcel_IOFactory::load($newxlsx);
$objPHPExcel->getActiveSheet(0)->getPageSetup()->setOrientation(PHPExcel_Worksheet_PageSetup::ORIENTATION_LANDSCAPE);
$objPHPExcel->getActiveSheet(0)->getPageSetup()->setPaperSize(PHPExcel_Worksheet_PageSetup::PAPERSIZE_A4);
$objPHPExcel->getActiveSheet(0)->getPageSetup()->setFitToWidth(true);
$objPHPExcel->getActiveSheet(0)->getPageSetup()->setFitToHeight(true);
$objPHPExcel->getActiveSheet(0)->setShowGridlines(false);
$objWriter = PHPExcel_IOFactory::createWriter($objPHPExcel, 'PDF');
$objWriter->save($newpdf);
I'm wondering is there any way I could set the character encoding or something to stop this happening? Any help would be greatly appreciated!
EDIT
Link to bigger image
I am using TCPDF to generate the PDFs.