How to add text on pdf using PDFJam - php

I am using PDF Jam for manipulating pdf. I need to add a text line at the bottom of generated file. I tried it but not able to made it.
Can anybody guide me how to do it?
I did in my php code as
$command = '-----------------';
exec($command);

As you know, PDFJAM is for manipulating pds. It is a small collection of shell scripts which provide a simple interface to much of the functionality of the excellent pdf pages. See the Ubuntu Manual
pdfjam - A shell script for manipulating PDF files
You should create your sheet as your doing (5x6) and create a separate sheet of minimal page size with required information than merge both the file into one.
Else in first step create your sheet and use pdflib to add text as second step. It very good tool. I hope its a good solution of your problem.

I love pdftk and so wanted to find a solution using that. The following worked for me.
pdfjam --preamble '\usepackage{fancyhdr} \topmargin 85pt \oddsidemargin 140pt \cfoot{\thepage}' --pagecommand '\thispagestyle{plain}' --landscape --nup 2x1 --frame false --clip true --trim ".5in 0.5in 0.5in .65in" --delta '-0.25in 0' tmp.pdf
I cribbed it from: Page Numbering with the "{page} of {pages}", removing the "of pages" part.
Command converts pdf to 2x1, trims margins, and crops. Output is landscape.
\topmargin and \oddsidemargin seem to tell pdflatex where to put the numbers.

Related

No styling when converting DOCX into PDF with PHPWord

I am trying to convert a DOCX file to PDF with PHPWord. When I execute the script it looks like that some style elements are not converted. In the DOCX file I have one image, two tables with border 1px and hidden borders and I am using Tabs.
When I execute the script I get a PDF file without the image, all the Tabs are replaced with Space and all the tables have a border 3px.
Does someone know why I am missing these styles?
Here is my script:
while ($data2 = mysql_fetch_array($rsSql)){
$countLines=$countLines+1;
$templateProcessor->setValue('quantity#'.$countLines, $data2['quantity']);
$templateProcessor->setValue('name#'.$countLines, $data2['name']);
$templateProcessor->setValue('price#'.$countLines, "€ " .$data2['price'] ."");
}
\PhpOffice\PhpWord\Settings::setPdfRenderer('./dompdf');
\PhpOffice\PhpWord\Settings::setPdfRendererPath('./dompdf');
\PhpOffice\PhpWord\Settings::setPdfRendererName('DOMPDF');
$temp_file = tempnam(sys_get_temp_dir(), 'Word');
\$templateProcessor->saveAS($temp_file);
$phpWord = \PhpOffice\PhpWord\IOFactory::load($temp_file);
$xmlWriter = \PhpOffice\PhpWord\IOFactory::createWriter($phpWord , 'PDF');
$xmlWriter->save('result.pdf');
header("Content-type:application/pdf");
header("Content-Disposition:attachment;filename='result.pdf'");
readfile("result.pdf");
After a look on the source code, it seems that PHPWord previously converts the document into an HTML representation before letting it be saved it into PDF by dompdf, another converter.
That's what the opened issue #1139 confirms, moreover it deals with styles missing:
The PDF writers being used are taking in the HTML output, which also lacks the styling. The classes are being defined in the <style> tag, but they are just not being used.
Also the last message adds:
This still seems to be an issue. html and pdf outputs do not replicate the some styles in docx (header / footers).
Concerning your border problem, another SO question shows a similar issue in a conversion HTML -> PDF. A solution was to edit the CSS style, which you obviously cannot perform in your sample code, unless you proceed to pre-convert into HTML.
In conclusion, you may not solve your problem in the short term. If you won't be a part of the dev team, you could submit bug reports to them (and not to dompdf, since it's an HTML-to-PDF converter and they are outside the scope). Github lets you to add DOCX files to the issue report.
Alternatives
You could check out a SO question 204860 about server sides PDF editing library. Below two alternatives, one is free software, the other is closed source and priced.
LibreOffice
Another way is to use LibreOffice in headless mode (command line execution without interface):
libreoffice --headless --convert-to pdf <filename_to_convert>
A PHP wrapper for LibreOffice, Office Converter is also available here if you don't want to bother using libreoffice through exec().
Check if LibreOffice conversion will suit your needs (it may not cover all cases, but be satisfying your scope).
Aspose
The best converter I ever used at work is Aspose, an API covering Documents with Aspose.Words package, Worksheets with Aspose.Cells, Presentations with Aspose.Slides and so on. But it's closed-source and pretty expensive (and you'll pay for updates if you want them after your license expiration).
There is a way to use it in PHP through Java (Aspose.Words and Aspose.Cells) or .NET (Aspose.Words same seems to go with Aspose.Cells).

how to get html code from pdf,docx,doc using php

I want to convert any pdf,docx,doc file into html code using php. with same style as in pdf. I am not getting proper solution.
Config::set('pdftohtml.bin', 'C:/poppler-0.37/bin/pdftohtml.exe');
// change pdfinfo bin location
Config::set('pdfinfo.bin', 'C:/poppler-0.37/bin/pdfinfo.exe');
// initiate
$pdf = new Gufy\PdfToHtml\Pdf($item);
// convert to html and return it as [Dom Object](https://github.com/paquettg/php-html-parser)
$html = $pdf->html();
Not working for me.
I had a similar problem and i found a github that i used with word docs. It worked fairly good then but i havent tested it of late. try it.
https://github.com/benbalter/Convert-Word-Documents-to-HTML
I think that this post could help you in a first time. With this one, you'll be able to convert any pdf into HTML code using PHP.
After this, you can use the help provided by this post to convert .doc and .docx to PDF using PHP.
I think that you can now built a function for each document extension that you want to convert into HTML.
Good luck.
I've come across a web service which presents an API for converting documents. I haven't tested it very thoroughly but it does seem to produce decent results at converting Word to HTML:
https://cloudconvert.org/

PDF manipulation - images are distorted after few consecutive operations on PDF file

I've run into this weird issue with PDF file handling. Not sure if SO is the right place to ask this, but I couldn't find any specific sites for this. I hope that someone can shed some light on the issue.
This happens with the following specific process, if some of steps are omitted - the issue is not observed.
I have a PHP application that serves PDF files to users. These files are created by authors in MS Word 2007, then printed to protected PDF (using pdf995, most likely, I can confirm if needed).
I'll call this initial PDF file as 'source' hereinafter.
Upon request, the source file is processed in PHP the following way:
we decrypt it using qpdf:
qpdf --decrypt "source.pdf" "tmp_output.pdf"
Then we add security label / wartermark to it, encrypt and output to browser using mPDF 6.0:
$mpdf = new mPDF();
$mpdf->SetImportUse();
$pagecount = $mpdf->SetSourceFile($fpath);
if ($pagecount) {
for ($i=1;$i<=$pagecount;$i++){
$tplId = $mpdf->ImportPage($i);
$mpdf->UseTemplate($tplId);
$html = '[security label / watermark contents...]';
$mpdf->WriteHTML($html);
}
}
$mpdf->SetProtection(array('copy','print'), '', 'password',128);
$mpdf->Output('final_output.pdf','I');
With the exact steps described above, images in the output that were pasted in the Word doc appear as follows:
In the source PDF, tmp_output (qpdf decrypted file) the pasted images look correct:
The distortion doesn't take place if any of the following occurs:
Word doc printed to PDF without protection
mPDF output is not protected.
As you can see there too many factors, so I don't know where to look for a bug.
Each component works correctly on it's own and I cannot find any info on the issue. Any insights are greatly appreciated.
EDIT 1
After some more testing, it appears that this only happens to screenshots taken from web browser, Windows explorer, MS Word. Cannot reproduce this with screenshots from Gimp.
It appears that something along the way attempts to convert white to alpha and fails.
The current version (6.1) of Mpdf has a bug which does not handle escaped PDF strings (imported via FPDI) correct if they should be encrypted.
A pull request, which fixes this issue is available here.

Generate OpenOffice calc files using PHP

I have been trying to find a simple way to create OpenOffice calc files with no success.
I have tried:
openTBS - Seems to work writing an xml and a template file but can't find anything about how the xml file format.
Ods php generator - I tried this one as it provides clear examples, but when I copy the files to my server I always get corrupted files
Php doc writer - Tried an example and got an sxw file. I don't even know what that is
ODS-PHP - No documentation, only one example for creating 4 cells
Everything looks old, stalled and undocumented. ¿Any suggestion?
I have used opentbs successfully.
You can generate both excel and calc files. It also nice that you can "reuse" your html implementation so to speak.
Maybe this thread could get you going http://www.tinybutstrong.com/forum.php?thr=3069
Do the html version first.. then edit for calc/excel
Spout from Box works well enough for me. There are some missing features but it is simple to use, has a fluent API, and has no dependencies (it supports composer but you can use it standalone and its dependency graph has zero depth 😉 ).
Here's my "array of objects to ODS" pipeline, using Spout:
(I'm not using their recommended use import because all this code fits in a much larger file that I didn't want to contaminate and the $factory pattern looks cleaner to me anyway)
$factory = 'Box\Spout\Writer\Common\Creator\WriterEntityFactory';
$factory::createODSWriter()
->openToBrowser('filename.ods')
->addRow($factory::createRow([
$factory::createCell(__('Heading 1')),
$factory::createCell(__('Heading 2')),
$factory::createCell(__('Heading 3')),
]))
->addRows(array_map(function($row) use ($factory) {
return $factory::createRow([
$factory::createCell($row->first_val),
$factory::createCell($row->second_val),
$factory::createCell($row->third_val),
]);
}, loadDataFromSomewhere()))
->close();

HTML2PDF in PHP - convert utilities & scripts - examples & demos

I have a quite complicated HTML/CSS layout which I would like to convert to PDF on my server. I already have tryed DOMPDF, unfortunately it did not convert the HTML with correct layout. I have considered HTMLDOC but I have heard that it ignores CSS to a large extent, so I suppose the layout would break apart with that tool too.
My question therefor is - are there any online demos for other tools (like wkhtmltopdf i.e.) that I could use to verify how my HTML is converted? Before spending the rest of my life installing & testing one by one?
Unfortunately, I can't change the HTML layout to fit those tools. Or better said - I could, if any of them would get close to an acceptable result...
Not really an answer but for the question above, but I'll try to provide some of my experience, maybe it will help someone somwhere in the future.
wkthmltopdf is really THE ONLY solution that worked for me that could produce what I call acceptable results. Still, some minor modifications to the CSS had to be made, however, it worked really well when it comes to rendering the content. All the other packages are really only suitable if you have a rather simply document with one basic table etc. No chance to get them to produce fair results on complex docs with design elements, css, multiple overlapping images etc. If complex documents are in game - do not spend the time (like I did) - go straight to wkhtmltopdf.
Beware - the wkhtmltopdf installation is tricky. It was not so easy for me as the guys said in their comments (one of the reasons might be that I am not too familiar with Linux). The static binary did not work for me for some reason I can't explain. I suspect that there were problems with the version - apparently there is a difference between versions for different OS and processors, maybe I have the vrong version. For installing the non-static version first of all you have to have root access to the server, that's obvious. I installed it with apt-get using PuTTy, went quite well. I was lucky that my server already had all the predispositions to install wkhtmltopdf. So this was the easy part for me :) (btw, you don't have to care for symbolic links or wrappers as many tutorials tell you - I spent hours trying to figure out how to do that part, in the end I gave it up and everything works well though)
After the install I got the quite famous Cannot connect to X server error. This is due to the fact that we need to run wkhtmltopdf headless on a 'virtual' x server. Getting around this was also quite simple (if one does not care for the symbolic links). I installed it with apt-get install xvfb. This also went quite well for me, no problems.
After completing this I was able to run wkhtmltopdf. Beware - it took me some time to figure out that trying to run xvfb was the wrong way - instead you have to run xvfb-run. My PHP code now looks like this exec("xvfb-run wkhtmltopdf --margin-left 16 /data/web/example.com/source.html /data/web/example.com/target.pdf"); (notice the --margin-left 16 command line option for wkhtmltopdf - it makes my content more centered; I left it in place to demonstrate how you can use command line options).
I also wanted to protect the generated PDF files from editing (in my case, print protect is also possible). After doing some research I found this class from ID Security Suite. First of all I have to say - IT'S OLD (I am running PHP 5+). However, I made some improvements to it. First of all - it's a wrapper around the FPDF library, so there is a file called fpdf.php in the package. I replaced this file from the latest FPDF version I got from here. It made my PHP warnings look more sustainable. I also changed the $pdf =& new FPDI_Protection(); and removed the & sign as I was getting an deprecated warning for it. However, there are more of those to come. Instead of searching and modifying the code I just turned the error reporting lvl to 0 with error_reporting(0); (although turning off the warnings only should be sufficient). Now someone will say that this is not "good practice". I am using this whole stuff on an internal system, so I do not really have to care. For sure the scripts could be modifiyed to match latest requirements. For me I didn't want to spend another hours working on it. Be careful where the script says $pdf->SetProtection(array('print'), '', $password); (I allowed printing my documents as you can see). It took me a while to figure out that the first argument is the permissions. The second is the USER PASSWORD - if you provide this then the docs will require a password to open (I left this blank). The third is the OWNER PASSWORD - this is what you need to make the docs "secured" against editing, copying etc.
My whole code now looks like:
// get the HTML content of the file we want to convert
$invoice = file_get_contents("http://www.example.com/index.php?s=invoices-print&invoice_no=".$_GET['invoice_no'];
// replace the CSS style from a print version to a specially modified PDF version
$invoice = str_replace('href="design/css/base.print.css"','href="design/css/base.pdf.css"',$invoice);
// write the modified file to disk
file_put_contents("docs/invoices/tmp/".$_GET['invoice_no'].".html", $invoice);
// do the PDF magic
exec("xvfb-run wkhtmltopdf --margin-left 16 /data/web/domain.com/web/docs/invoices/tmp/".$_GET['invoice_no'].".html /data/web/domain.com/web/docs/invoices/".$_GET['invoice_no'].".pdf");
// delete the temporary HTML data - we do not need that anymore since our PDF is created
unlink("docs/invoices/tmp/".$_GET['invoice_no'].".html");
// workaround the warnings
error_reporting(0);
// script from ID Security Suite
function pdfEncrypt ($origFile, $password, $destFile){
require_once('libraries/fpdf/FPDI_Protection.php');
$pdf = new FPDI_Protection();
$pdf->FPDF('P', 'in');
//Calculate the number of pages from the original document.
$pagecount = $pdf->setSourceFile($origFile);
//Copy all pages from the old unprotected pdf in the new one.
for ($loop = 1; $loop <= $pagecount; $loop++) {
$tplidx = $pdf->importPage($loop);
$pdf->addPage();
$pdf->useTemplate($tplidx);
}
//Protect the new pdf file, and allow no printing, copy, etc. and
//leave only reading allowed.
$pdf->SetProtection(array('print'), '', $password);
$pdf->Output($destFile, 'F');
return $destFile;
}
//Password for the PDF file (I suggest using the email adress of the purchaser).
$password = md5(date("Ymd")).md5(date("Ymd"));
//Name of the original file (unprotected).
$origFile = "docs/invoices/".$_GET['invoice_no'].".pdf";
//Name of the destination file (password protected and printing rights removed).
$destFile = "docs/invoices/".$_GET['invoice_no'].".pdf";
//Encrypt the book and create the protected file.
pdfEncrypt($origFile, $password, $destFile );
Hope this helps someone to save some time in the future. This whole solution took me like 12 hours to implement into our invoicing system. If there was better info on wkhtmltopdf for users like me, who are not that familiar with Linux/UNIX, I could have saved some of the hours spent on this.
However - what doesn't kill you makes you stronger :) So I am a bit more perfect now that I made this run :)

Categories