We have a project where we merge different pdfs to create a catalog.
It currently uses myokyawhtun/pdfmerger, which works fine except that it does not keep links set in Acrobat.
We have tried several other libraries. They have to be pure PHP: on this webspace we cannot install applications or call them from the command line via shell_exec or similar, so Ghostscript is not an option. Even if we just import the PDF files via FPDI and resave them, the hyperlinks get lost.
Is there any (pure PHP) library out there which can retain links inside the files? Or are there some special settings that we missed?
We have tried:
setasign/fpdi
iio/libmergepdf
jurosh/pdf-merge
Example code for the current lib (myokyawhtun/pdfmerger):
require('vendor/myokyawhtun/pdfmerger/tcpdf/tcpdf.php');
require('vendor/myokyawhtun/pdfmerger/tcpdf/tcpdi.php');
require('vendor/myokyawhtun/pdfmerger/PDFMerger.php');
$pdf = new \PDFMerger\PDFMerger;
foreach($sourcePdfs as $file)
{
$pdf->addPDF($pdfDir.'/source/'.$file);
}
$pdf->merge('download', 'Download.pdf');
All the mentioned libraries use FPDI under the hood, which simply does not support content outside of a page's content stream, such as links or any other annotation type.
We (the authors of FPDI) also offer non-free products which work on another level and which allow you to keep all annotations, including links and forms, when you concatenate the documents. This is possible with the SetaPDF-Merger component:
$merger = new SetaPDF_Merger();
foreach($sourcePdfs as $file) {
$merger->addFile($pdfDir . '/source/' . $file);
}
$merger->merge();
$document = $merger->getDocument();
$document->setWriter(new SetaPDF_Core_Writer_Http('Download.pdf'));
$document->save()->finish();
Related
I have several PDF files created dynamically using TCPDF.
I have to merge those PDFs created by TCPDF into one, and from what I have read, the best practice is to do that with the FPDI library.
All PDFs that have to be merged are stored in the same directory.
To merge them, I'm using the following code:
require( MY_APP_PATH . 'fpdf/fpdf.php');
require( MY_APP_PATH . 'fpdi/fpdi.php');
$fpdi = new FPDI();
// iterate over array of files and merge
foreach ($filesToMerge as $file) {
$fpdi->setSourceFile(MY_APP_PATH . 'pdf/' . $file);
$tpl = $fpdi->importPage(1, '/MediaBox');
$fpdi->addPage();
$fpdi->useTemplate($tpl);
}
$fpdi->Output('F', 'merged.pdf');
The error I'm getting is:
TCPDF ERROR: Incorrect output destination: /VAR/WWW/HTML/MYAPP/PDF/MERGED.PDF
It looks like there is a collision between the TCPDF and FPDI libraries (or even FPDF?), since they both have the same method, Output().
Also, it works fine if I run it in separate code (without including the TCPDF class).
Can you give me some idea how to avoid this and merge my PDFs?
Just change the order of the Output() parameters. The order was changed in the latest FPDF version, but FPDF supports both orders internally, whereas TCPDF only supports $name followed by $dest.
FPDI will extend the TCPDF class if it is available. If it's not available it will extend FPDF.
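For the code in the question this means calling Output('merged.pdf', 'F') instead of Output('F', 'merged.pdf'). If the same script has to run both with and without TCPDF loaded, the argument order can be normalised before the call. The following is only a sketch: toTcpdfOutputArgs() is a hypothetical helper that is not part of FPDF, FPDI or TCPDF, and it assumes the destination is a single letter such as 'F' or 'D'.

```php
<?php
/**
 * Returns [$name, $dest] (the order TCPDF expects) regardless of the
 * order in which the two arguments were passed.
 * Hypothetical helper, not part of FPDF, FPDI or TCPDF.
 */
function toTcpdfOutputArgs(string $first, string $second): array
{
    // Assumption: a destination is a single letter ('I', 'D', 'F', 'S'),
    // while a file name is longer than one character.
    if (strlen($first) === 1 && strlen($second) !== 1) {
        return [$second, $first]; // arguments came as ($dest, $name): swap
    }
    return [$first, $second];     // already ($name, $dest)
}

list($name, $dest) = toTcpdfOutputArgs('F', 'merged.pdf');
// With the question's object this would then be:
// $fpdi->Output($name, $dest); // same as $fpdi->Output('merged.pdf', 'F');
```

Two-letter TCPDF destinations such as 'FI' are not covered by this heuristic and would need an explicit list of allowed values.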
I have PDF form files that I fill out dynamically with PHP using FPDM (the FPDF script). I can save them on my server no problem, and the text all looks fine in the PDF when I download and view in Acrobat.
My problem is: I'm trying to merge multiple PDF files together on the server so the user can download a single PDF document with several pages. I downloaded PDF Merger (http://pdfmerger.codeplex.com/) and got it merging the files together, but this causes the PDF form text to disappear.
Anyone know of a form-friendly PHP-based PDF merger that doesn't require installing anything (other than uploading libraries) to my server?
Code that works for merging but kills text in form boxes:
$pdfCombined= new PDFMerger;
$pdfCombined->addPDF('../forms/generated/16.pdf', 'all')
->addPDF('../forms/generated/19.pdf', 'all')
->merge('browser', 'mergedDoc.pdf');
The linked "PDF Merger" simply uses FPDI under the hood. FPDI is not able to handle dynamic content as described here.
A pure PHP solution for merging PDF forms is the SetaPDF-Merger component (not free). An evaluation requires the installation of a loader (ionCube or Zend Guard). License owners get access to the source code, so no external library is needed. Usage is just as simple:
require_once("library/SetaPDF/Autoload.php");
// create a file writer
$writer = new SetaPDF_Core_Writer_Http("mergedDoc.pdf");
// create a new merger instance
$merger = new SetaPDF_Merger();
// add the files
$merger->addFile('../forms/generated/16.pdf');
$merger->addFile('../forms/generated/19.pdf');
// merge all files
$merger->merge();
// get the resulting document and set the writer instance
$document = $merger->getDocument();
$document->setWriter($writer);
// save the file and finish the writer
$document->save()->finish();
I am working on a Symfony 1.4 project. I need to make a PDF download link for a (yet to be) generated voucher and I have to say, I am a bit confused. I already have the HTML/CSS for the voucher, I created the download button in the right view, but I don't know where to go from there.
Use Mpdf to create the pdf file
http://www.mpdf1.com/
+1 with wkhtmltopdf
I'd even recommend the snappy library.
If you use composer, you can even get the wkhtmltopdf binaries automatically
Having used wkhtmltopdf for a while I've moved off it as 1) it has some serious bugs and 2) ongoing development has slowed down. I moved over to PhantomJS which is proving to be much better in terms of functionality and effectiveness.
Once you've got something like wkhtmltopdf or PhantomJS on your machine you need to generate the HTML page and pass that along to it. I'll give you an example assuming you use PhantomJS.
First, set whatever request parameters the template needs.
$this->getRequest()->setParameter([some parameter], [some value]);
Then call the function getPresentationFor() to generate the HTML from a template. This will return the resulting HTML for a specific module and action.
$html = sfContext::getInstance()->getController()->getPresentationFor([module], [action]);
You'll need to replace the relative CSS paths with an absolute CSS path in the HTML file, for example by running preg_replace.
$html_replaced = preg_replace('/"\/css/', '"'.sfConfig::get('sf_web_dir').'/css', $html);
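To see what this replacement does outside of symfony, here is a small standalone sketch. absolutizeCssPaths() is a hypothetical helper name, and the literal '/var/www/web' merely stands in for the configured sf_web_dir value. It assumes the directory contains no `$` or backslash characters, which preg_replace would interpret inside the replacement string.

```php
<?php
/**
 * Rewrites root-relative "/css..." references to absolute file paths.
 * Hypothetical helper; $webDir stands in for the sf_web_dir setting.
 */
function absolutizeCssPaths(string $html, string $webDir): string
{
    // Match a double quote directly followed by /css and emit it again
    // with the web directory inserted in between.
    return preg_replace('#"/css#', '"' . $webDir . '/css', $html);
}

echo absolutizeCssPaths('<link rel="stylesheet" href="/css/main.css">', '/var/www/web');
// prints: <link rel="stylesheet" href="/var/www/web/css/main.css">
```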
Now write the HTML page to file and convert to a PDF.
$fp = fopen('export.html','w+');
fwrite($fp,$html_replaced);
fclose($fp);
exec('/path/to/phantomjs/bin/phantomjs /path/to/phantomjs/examples/rasterize.js /path/to/export.html /path/to/export.pdf "A3"');
Now send the PDF to the user:
$this->getResponse()->clearHttpHeaders();
$this->getResponse()->setHttpHeader('Content-Description','File Transfer');
$this->getResponse()->setHttpHeader('Cache-Control','public, must-revalidate, max-age=0');
$this->getResponse()->setHttpHeader('Pragma','public');
$this->getResponse()->setHttpHeader('Content-Transfer-Encoding','binary');
$this->getResponse()->setHttpHeader('Content-length',filesize('/path/to/export.pdf'));
$this->getResponse()->setContentType('application/pdf');
$this->getResponse()->setHttpHeader('Content-Disposition','attachment; filename=export.pdf');
$this->getResponse()->setContent(file_get_contents('/path/to/export.pdf'));
$this->getResponse()->sendContent();
You do need to set the headers otherwise the browser does odd things. The filename for the generated HTML file and export should be unique to avoid the situation of two people generating PDF vouchers at the same time clashing. You can use something like sha1(time()) to add a randomised hash to a standard name e.g. 'export_'.sha1(time());
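One caveat: sha1(time()) produces the same value for two requests that arrive within the same second, so it does not fully rule out clashes. A minimal sketch of a random alternative follows, assuming PHP 7 or later for random_bytes(); on older versions something like uniqid('', true) could serve as a substitute. uniqueExportPath() is a hypothetical helper, and '/path/to' is the same placeholder directory used above.

```php
<?php
/**
 * Builds a collision-resistant file path for a temporary export file.
 * Hypothetical helper; directory and "export_" prefix are placeholders.
 */
function uniqueExportPath(string $dir, string $extension): string
{
    // 16 random bytes become 32 hexadecimal characters.
    $token = bin2hex(random_bytes(16));

    return rtrim($dir, '/') . '/export_' . $token . '.' . $extension;
}

$htmlPath = uniqueExportPath('/path/to', 'html');
// Reuse the same token for the PDF so both files belong together.
$pdfPath = substr($htmlPath, 0, -strlen('html')) . 'pdf';
```

Both paths can then be passed to the fopen() and exec() calls in place of the fixed export.html and export.pdf names.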
Use wkhtmltopdf, if possible. It is by far the best html2pdf converter a php coder can use.
And then do something like this (not tested, but should be pretty close):
public function executeGeneratePdf(sfWebRequest $request)
{
$this->getContext()->getResponse()->clearHttpHeaders();
$html = '*your html content*';
$pdf = new WKPDF();
$pdf->set_html($html);
$pdf->render();
$pdf->output(WKPDF::$PDF_EMBEDDED, 'whatever_name.pdf');
throw new sfStopException();
}
I would like to merge multiple doc or rtf files into a single file in the same format as the source files.
What I mean is: if a user selects multiple rtf template files from a list box and clicks a button on a web page, the output should be a single rtf file that combines those template files. I have to use PHP for this.
I haven't decided on the format of the template files, but it will be either rtf or doc, and I assume the template files will contain some images as well.
I have spent many hours looking for a library for this, but still can't find one.
Please help me out here!! :(
Thanks in advance.
If you are looking for a solution that handles RTF documents only, you can find a PHP package to merge multiple RTF documents here:
www.rtftools.com
Here is a short example of how to merge multiple documents:
include ( 'path/to/RtfMerger.phpclass' ) ;
$merger = new RtfMerger ( 'sample1.rtf', 'sample2.rtf' ) ; // You can specify docs to be merged to the class constructor...
$merger -> Add ( 'sample3.rtf' ) ; // or by using the Add() method
$merger [] = 'sample4.rtf' ; // or by using the array access methods
$merger -> SaveTo ( 'output.rtf' ) ; // Will save files 'sample1' to 'sample4' into 'output.rtf'
This package allows you to handle documents that are bigger than the available memory.
I've been working on a similar project and haven't managed to find any PHP (or any other open source language) libraries for manipulating MSWord files. The way I approach it is kind of complicated, but works. Here's how I would do it (assuming you have a Linux server):
Setup:
1. Install JODConverter and OpenOffice.
2. Start OpenOffice as a server (see http://www.artofsolving.com/node/10).
Approach (i.e. what to do in your PHP code):
1. Convert your MSWord or RTF files into ODT format by calling JODConverter via backticks or exec().
2. Unzip each file into a temporary directory of its own.
3. Read the content.xml file from each unzipped document using a DOM parser.
4. Extract the <office:text> contents from each, and concatenate them.
5. Put this concatenated XML back into the right spot in one of the content.xml files.
6. Re-zip the contents of that temporary directory and give it an .odt extension.
7. Use JODConverter to convert this file back to MSWord again.
As I said, it's not pretty, but it does the job.
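The DOM part of this approach (reading each content.xml, extracting the <office:text> contents and putting them into one document) could look roughly like the sketch below. It works on XML strings only; unzipping, re-zipping and the JODConverter calls are left out. mergeOfficeText() is a hypothetical helper, and the sketch ignores styles, images and other parts of the ODT package, so real documents will likely need more work.

```php
<?php
// Namespace URI that OpenDocument binds to the "office:" prefix.
const OFFICE_NS = 'urn:oasis:names:tc:opendocument:xmlns:office:1.0';

/**
 * Appends the children of <office:text> from each additional content.xml
 * to the <office:text> element of the base content.xml.
 * Hypothetical helper for illustration only.
 */
function mergeOfficeText(string $baseXml, array $otherXmls): string
{
    $base = new DOMDocument();
    $base->loadXML($baseXml);
    $target = $base->getElementsByTagNameNS(OFFICE_NS, 'text')->item(0);
    if ($target === null) {
        throw new RuntimeException('Base document has no <office:text> element.');
    }

    foreach ($otherXmls as $xml) {
        $doc = new DOMDocument();
        $doc->loadXML($xml);
        $source = $doc->getElementsByTagNameNS(OFFICE_NS, 'text')->item(0);
        if ($source === null) {
            continue; // nothing to copy from this document
        }
        foreach ($source->childNodes as $child) {
            // importNode() makes a deep copy owned by the base document.
            $target->appendChild($base->importNode($child, true));
        }
    }

    return $base->saveXML();
}
```

The returned string would replace the content.xml of the base document before re-zipping. Elements that should appear only once per document (for example declarations at the start of <office:text>) are copied as-is here and may have to be filtered out.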
If you're looking to go down the RTF route, this question may also help: Concatenate RTF files in PHP (REGEX)
I have a folder with 100s of PDF versions of PPT presentations. I also have a one page PDF file that I want to add to the beginning of each PDF file. Is there a way I can do this with PHP? Could I maybe use the Zend Framework?
It certainly can be done with Zend Framework!
$pdf = Zend_Pdf::load($fileName);
$frontPdf = Zend_Pdf::load('/path/to/template.pdf');
$frontPage = $frontPdf->pages[0];
//prepend our template front page to PDF
array_unshift($pdf->pages, $frontPage);
//update original document
$pdf->save($fileName, true);
I haven't tested the code here but we have an application working on the same principle.
Check the documentation for pages within Zend_Pdf if you have any problems.