I want to read the contents of a .doc file, or convert a .doc file to .docx.
I tried using a COM object, but it doesn't work because my server is Linux-based.
I also tried the shell_exec command, but it doesn't work because that feature isn't available on my shared server.
Is there an API I can use to convert a .doc file to .docx?
If you want to use PHP, try out PHPWord.
There's an example on how to convert from .doc to .docx on this page:
require_once '../PHPWord.php';
$PHPWord = new PHPWord();
$document = $PHPWord->loadTemplate('Excel2003.doc');
// Save File
$objWriter = PHPWord_IOFactory::createWriter($PHPWord, 'Word2007');
$objWriter->save('Excel2007.docx');
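The snippet above uses the legacy PHPWord class names. With the current Composer package (phpoffice/phpword), a similar conversion might look like the sketch below. It assumes Composer's autoloader has already been loaded, and that the library's 'MsDoc' reader can parse your file; that reader covers only part of the .doc format, so complex documents may lose formatting. The function name convertDocToDocx is made up for this example.

```php
<?php
use PhpOffice\PhpWord\IOFactory;

/**
 * Hypothetical helper: read a Word 97-2003 .doc file and write it as .docx.
 * Assumes vendor/autoload.php has been required by the calling script.
 */
function convertDocToDocx(string $docPath, string $docxPath): void
{
    // Fail early with a clear message instead of a reader error.
    if (!is_readable($docPath)) {
        throw new InvalidArgumentException("Cannot read input file: {$docPath}");
    }

    // 'MsDoc' selects PHPWord's reader for the binary .doc format.
    $phpWord = IOFactory::load($docPath, 'MsDoc');

    // 'Word2007' selects the .docx writer.
    $writer = IOFactory::createWriter($phpWord, 'Word2007');
    $writer->save($docxPath);
}
```

Usage would be something like convertDocToDocx('Excel2003.doc', 'Excel2007.docx'); this runs in pure PHP, so it needs neither COM nor shell_exec.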
Related
I know how to use the Google Translate API. What I want to know is whether there is a way to translate a PDF file without losing its formatting. I tried converting the PDF to DOCX, translating that file, and converting it back to PDF, but the PDF-to-DOCX conversion failed because of many PHPWord bugs.
Here are the two code samples I was given for converting the PDF to DOCX.
<?php
require_once 'vendor/autoload.php';
// Create a new PDF reader
$reader = \PhpOffice\PhpWord\IOFactory::createReader('PDF');
$reader->setReadDataOnly(true);
// Load the PDF file
$phpWord = $reader->load('example.pdf');
// Save the DOCX file
$writer = \PhpOffice\PhpWord\IOFactory::createWriter($phpWord, 'Word2007');
$writer->save('example.docx');
echo 'PDF file converted to DOCX successfully!';
<?php
require_once 'vendor/autoload.php';
use PhpOffice\PhpWord\IOFactory as WordIOFactory;
// Convert PDF to text
exec('pdftotext -layout input.pdf output.txt');
// Load text file
$text = file_get_contents('output.txt');
// Create new DOCX file
$phpWord = new \PhpOffice\PhpWord\PhpWord();
$section = $phpWord->addSection();
$textrun = $section->addTextRun();
$textrun->addText($text);
// Save DOCX file
$objWriter = WordIOFactory::createWriter($phpWord, 'Word2007');
$objWriter->save('output.docx');
Do you have any idea how I can translate the PDF file without going through the conversion to other formats?
I am trying to convert a .doc or .docx file to PDF, but the result doesn't match the original file, and the PDF has no styling at all. I don't know why; I'm using TCPDF and PHPWord.
This is my conversion code:
$filetarget = FileHelper::normalizePath($pathdirectory.'/'.$filename);
$objReader = \PhpOffice\PhpWord\IOFactory::createReader('Word2007');
$contents = $objReader->load($filetarget);
$tcpdfPath = Yii::getAlias('@baseApp') . '/vendor/tecnickcom/tcpdf';
\PhpOffice\PhpWord\Settings::setPdfRendererPath($tcpdfPath);
\PhpOffice\PhpWord\Settings::setPdfRendererName('TCPDF');
$objWriter = \PhpOffice\PhpWord\IOFactory::createWriter($contents,'PDF');
$fileresult = str_replace('.docx', '.pdf', $filetarget);
$objWriter->save($fileresult);
$toPdf = FileHelper::normalizePath($fileresult);
This is part of the result after converting from .docx to PDF,
and this is the corresponding part of the original .docx file.
What's wrong with my code?
Unfortunately, PHPWord is very basic: its DOCX-to-PDF output cannot preserve text breaks or page breaks, and it does not support lists or exporting images.
For the current list of features see
https://phpword.readthedocs.io/en/latest/intro.html#writers
Since it runs OpenOffice as the converter, you could try other PHP methods to run the conversion directly.
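As one way of running the conversion directly, here is a minimal sketch that calls LibreOffice in headless mode from PHP. It assumes LibreOffice is installed with the soffice binary on the PATH, that exec() is allowed on the host, and that the server is Linux-like (the quoting from escapeshellarg differs on Windows). The function names buildSofficeCommand and convertToPdf are made up for this example.

```php
<?php
/**
 * Hypothetical helper: build the LibreOffice command line for a PDF conversion.
 * Both paths are shell-escaped so spaces and quotes in file names are safe.
 */
function buildSofficeCommand(string $inputPath, string $outputDir): string
{
    return sprintf(
        'soffice --headless --convert-to pdf --outdir %s %s',
        escapeshellarg($outputDir),
        escapeshellarg($inputPath)
    );
}

/**
 * Hypothetical helper: run the conversion and return the expected PDF path.
 * LibreOffice names the output after the input file, with a .pdf extension.
 */
function convertToPdf(string $inputPath, string $outputDir): string
{
    exec(buildSofficeCommand($inputPath, $outputDir), $outputLines, $exitCode);

    if ($exitCode !== 0) {
        throw new RuntimeException("soffice exited with code {$exitCode}");
    }

    return rtrim($outputDir, '/') . '/' . pathinfo($inputPath, PATHINFO_FILENAME) . '.pdf';
}
```

Because LibreOffice does the rendering itself, styles, lists and images usually come through much better than with PHPWord's PDF writer, but the host must permit shell commands.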
I can successfully read a .docx file with PHPWord, but I also need to read a .pdf file with it.
Any idea how I can do that?
This is the code I have tried:
// Read contents
$source = "./example.pdf";
$phpWord = \PhpOffice\PhpWord\IOFactory::load($source, 'Word2007');
$data = '';
$section = $phpWord->addSection();
$section->addText($data);
$name = "officefile";
$source = __DIR__ . "/results/{$name}.html";
// Saving the document as HTML file...
$objWriter = \PhpOffice\PhpWord\IOFactory::createWriter($phpWord, 'HTML');
$objWriter->save($source);
PHPWord cannot read PDF files. Please see this GitHub issue, which explains that reading PDFs is out of scope for the project.
Some alternative methods of reading PDF files with PHP can be found in this Stack Overflow question.
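To avoid handing an unsupported file to IOFactory::load(), you could check the extension first. The sketch below maps file extensions to PHPWord reader names and returns null for anything else, including .pdf. The function name pickPhpWordReader is made up for this example, and the check trusts the extension rather than inspecting the file contents.

```php
<?php
/**
 * Hypothetical helper: choose a PHPWord reader name from a file extension.
 * Returns null when PHPWord has no reader for the format (for example PDF).
 */
function pickPhpWordReader(string $path): ?string
{
    // Reader names as accepted by \PhpOffice\PhpWord\IOFactory::load().
    $readers = [
        'docx' => 'Word2007',
        'doc'  => 'MsDoc',
        'odt'  => 'ODText',
        'rtf'  => 'RTF',
        'html' => 'HTML',
        'htm'  => 'HTML',
    ];

    $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));

    return $readers[$extension] ?? null;
}
```

For './example.pdf' this returns null, so the caller can route the file to a dedicated PDF library instead of PHPWord.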
I am converting a .docx file to PDF using the PHPOffice PHPWord library, with TCPDF as the PDF writer.
My code is below:
include_once 'Sample_Header.php';
include_once '../vendor/tecnickcom/tcpdf/tcpdf.php';
\PhpOffice\PhpWord\Settings::setPdfRendererPath('../vendor/tecnickcom/tcpdf');
\PhpOffice\PhpWord\Settings::setPdfRendererName('TCPDF');
$temp = \PhpOffice\PhpWord\IOFactory::load('files/sampledocument.docx');
$xmlWriter = \PhpOffice\PhpWord\IOFactory::createWriter($temp , 'PDF');
$xmlWriter->save('results/sampledocument.pdf', TRUE);
It works fine and generates a correct PDF only if the .docx file contains plain text without any style changes (e.g. color or bold text). If the .docx file contains styled text, each styled run starts on a new line in the PDF.
For example, suppose the .docx file contains:
Hello World
This is rendered correctly in the PDF. But if the .docx file contains the following ("H" and "W" are bold here):
**H**ello **W**orld
the PDF shows it as below (on multiple lines instead of one):
H
ello
W
orld
Please let me know if anyone has a solution for this. Thanks in advance.
I am trying to embed a .docx file in a Word document using PHPWord's addObject() function. The file is attached, but clicking on the attachment does not open it. If I do the same with a .doc file, the attached file opens. I am using the PHPWord library.
<?php
require_once '../PHPWord.php';
// New Word Document
$PHPWord = new PHPWord();
// New portrait section
$section = $PHPWord->createSection();
// Add text elements
$section->addText('You can open this OLE object by double clicking on the icon:');
$section->addTextBreak(2);
// Add object
$section->addObject('Test.docx');
// With $section->addObject('Test.doc'); the attached file opens (Test.doc is in Word 97-2003 format).
// Save File
$objWriter = PHPWord_IOFactory::createWriter($PHPWord, 'Word2007');
$objWriter->save('Object.docx');
?>
Multiple issues have already been logged for this, but no solution has been proposed yet :-(
addObject() function is not working with .docx file, it's working with .doc (Word 97-2003) file
Add Object Issue
Unable to open embedded object