Merging PDF files with PHP

I have several PDF files created dynamically using TCPDF.
I have to merge those PDFs into one, and from what I have read, the best practice is to do that with the FPDI library.
All PDFs that have to be merged are stored in the same directory.
To merge them, I'm using the following code:
require( MY_APP_PATH . 'fpdf/fpdf.php');
require( MY_APP_PATH . 'fpdi/fpdi.php');
$fpdi = new FPDI();
// iterate over array of files and merge
foreach ($filesToMerge as $file) {
$fpdi->setSourceFile(MY_APP_PATH . 'pdf/' . $file);
$tpl = $fpdi->importPage(1, '/MediaBox');
$fpdi->addPage();
$fpdi->useTemplate($tpl);
}
$fpdi->Output('F', 'merged.pdf');
The error I'm getting is:
TCPDF ERROR: Incorrect output destination: /VAR/WWW/HTML/MYAPP/PDF/MERGED.PDF
It looks like there is a collision between the TCPDF and FPDI libraries (or even FPDF?), since they both have the same Output method.
It also works fine if I run it in separate code (without including the TCPDF class).
Can you give me some idea how to avoid this and merge my PDFs?

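Just change the order of the Output() parameters: pass the file name first and the destination second, i.e. Output('merged.pdf', 'F') instead of Output('F', 'merged.pdf'). The order was changed in the latest FPDF version, but internally FPDF supports both orders, while TCPDF only supports $name followed by $dest.
FPDI extends the TCPDF class if it is available; otherwise it extends FPDF. That is why the error is reported by TCPDF even though you call Output() on the FPDI object.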
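With the arguments swapped, the merge loop from the question can be sketched as follows. The helper name mergeFirstPages and its $pdf parameter are illustrative only and not part of FPDI or TCPDF; the helper uses just the calls already shown in the question and, like the question, imports only page 1 of each file.

```php
<?php
// Hypothetical helper (not part of FPDI or TCPDF): imports page 1 of every
// file into $pdf and writes the result to $outputPath.
function mergeFirstPages($pdf, array $files, $outputPath)
{
    foreach ($files as $file) {
        $pdf->setSourceFile($file);
        $tpl = $pdf->importPage(1, '/MediaBox');
        $pdf->AddPage();
        $pdf->useTemplate($tpl);
    }

    // File name first, destination second: the order TCPDF expects,
    // and one that FPDF accepts as well.
    return $pdf->Output($outputPath, 'F');
}

// Usage, assuming the legacy FPDI setup and constants from the question:
// $files = array();
// foreach ($filesToMerge as $file) {
//     $files[] = MY_APP_PATH . 'pdf/' . $file;
// }
// mergeFirstPages(new FPDI(), $files, MY_APP_PATH . 'pdf/merged.pdf');
```

Passing the PDF object in as a parameter keeps the sketch independent of whether FPDI ended up extending TCPDF or FPDF.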

Related

Trying to Import PDF into FPDF/FPDI Adds Blank Page

I'm new to using FPDF / FPDI. My predecessor here extended the FPDF class and I'm adding onto his code to edit his PDFs and make new ones. I've been able to create the first 3 pages of a PDF I want to make by writing values from my database into blank pages, using FPDF. All good!
Now I want to import an existing single page PDF as page 4 and write some stuff on it, and it looks like I need FPDI to do that.
I used the example here: https://www.setasign.com/products/fpdi/about as suggested by another post. However, it's just adding a blank page to the end of my PDF.
I've written the path of the PDF I want to import into my log and ensured that it is correct (I'm not getting any errors in the php log either, which I would expect if the pdf was not found).
I'm initializing the pdf as FPDF, not FPDI, so that may be the issue? But if I initialize it as FPDI then I get an error that my methods are undefined, because they are defined extending the FPDF class. So I'm not sure how to do what I want...do I need to redefine my classes to extend FPDI? I'm just worried this will break the PDFs already being created using some of the same methods. I'm also not getting any errors about using FPDI methods like useImportedPage...so I feel like maybe that's not the issue?
Sorry if I'm not explaining this well, let me know if you have questions. Here is the relevant code:
require_once(APPPATH.'libraries/fpdf/fpdf.php');
require_once(APPPATH.'libraries/fpdi/autoload.php');
require_once(APPPATH.'libraries/fpdi/Fpdi.php');
public function make_fieldpacketMA(){
$plotID=$this->input->get('PlotID');
$year=$this->input->get('Year');
$pdf = new PDF('P','in','Letter');
// Make additional info first page- this works!
$additional_info=$this->fhm_model->get_additionalplotinfo($plotID, "MA");
$pdf->AdditionalInfoSheet($additional_info, $pdf);
//make plot info pages (same as subplot info pages on VT style plots)- this works!
$plotInfo=$this->fhm_model->get_sheetinfoMA($plotID,$year);
$pdf->SubplotSheet($plotInfo, "MA", $pdf);
//make seedling sheet from existing template- this does not work
$seedlingInfo=$this->fhm_model->get_seedlingInfo($plotID, $year);
$pdf->SeedlingSheet($seedlingInfo, $pdf);
//Write the output
$rand=uniqid();
$filename='./fhm_sheets/PlotPacket_'.$rand.'.pdf';
$pdf->Output($filename,"F");
$this->load->helper('download');
$data = file_get_contents($filename);
force_download($plotID.'_'.($year+1).'_FHMPlotPacket.pdf',$data);
}
//Create PDF creation class
class PDF extends PDF_Rotate{
function SeedlingSheet($seedlingInfo=array()) {
$pageCount = $this->setSourceFile(FCPATH.'fhm_template/MAFHM_Microplot_SeedlingForm.pdf');
$pageId = $this->importPage(1);
$this->AddPage();
// $this->useTemplate($pageId);
$this->useImportedPage($pageId, 10, 10, 90);
}
This works! It looks like I needed a second argument for importPage to define the bounding box:
function SeedlingSheet($seedlingInfo=array()) {
$pageCount = $this->setSourceFile(FCPATH.'fhm_template/MAFHM_Microplot_SeedlingForm.pdf');
$pageId = $this->importPage(1, \setasign\Fpdi\PdfReader\PageBoundaries::MEDIA_BOX);
$this->AddPage();
$this->useTemplate($pageId, 0, 0);
}

Merge pdf files in PHP and keep weblinks inside

We have a project where we merge different pdfs to create a catalog.
Right now it's running on myokyawhtun/pdfmerger, which runs fine, but it does not keep links set in acrobat.
We have tried the different libraries we could find. They have to be pure PHP: on this webspace we cannot install applications or call them from the command line via shell_exec or similar, so Ghostscript is not an option. Even if we just import the PDF files via FPDI and save them again, the hyperlinks get lost.
Is there any (pure PHP) library out there which can retain links inside the files? Or are there some special settings that we missed?
We have tried:
setasign/fpdi
iio/libmergepdf
jurosh/pdf-merge
Example code for the current lib (myokyawhtun/pdfmerger):
require('vendor/myokyawhtun/pdfmerger/tcpdf/tcpdf.php');
require('vendor/myokyawhtun/pdfmerger/tcpdf/tcpdi.php');
require('vendor/myokyawhtun/pdfmerger/PDFMerger.php');
$pdf = new \PDFMerger\PDFMerger;
foreach($sourcePdfs as $file)
{
$pdf->addPDF($pdfDir.'/source/'.$file);
}
$pdf->merge('download', 'Download.pdf');
All the mentioned libraries use FPDI under the hood, which simply does not support content outside of a page's content stream, such as links or any other annotation type.
We (the authors of FPDI) also offer non-free products which work on another level and allow you to keep all annotations, including links and forms, when you concatenate documents. This is possible with the SetaPDF-Merger component:
$merger = new SetaPDF_Merger();
foreach($sourcePdfs as $file) {
$merger->addFile($pdfDir . '/source/' . $file);
}
$merger->merge();
$document = $merger->getDocument();
$document->setWriter(new SetaPDF_Core_Writer_Http('Download.pdf'));
$document->save()->finish();

Is it possible to efficiently split a PDF into individual pages (using FPDI)?

I am trying to split large files into individual pages, using PHP's FPDI library.
For some reason, splitting the file does not do much to reduce the file size. For example, the following script applied to a 30-page 1MB file results in 30 files of around 0.9MB each, i.e. a total of around 26MB!
It suggests to me that a big portion of original file is retained, even though it is not required.
Questions:
Is this avoidable?
Is this a bug in FPDI?
Is there an alternative PHP library that is more efficient at splitting?
More detail
I've reproduced this issue in a variety of configurations:
FPDI version 1 (no longer supported) and FPDI version 2
Using FPDF and TCPDF
PHP 5.4 and PHP 5.6
Various PDF files, including files generated using FPDF and TCPDF
Here is some PHP code to illustrate the issue:
<?php
testPdfSplit();
function testPdfSplit()
{
echo phpversion();
//Load a file
$contentPath = "/path/to/local/files/original_file.pdf";
copy("https://file-examples.com/wp-content/uploads/2017/10/file-example_PDF_1MB.pdf", $contentPath);
$numpages = 30;
//Get the original file size
$fileSize = round(filesize($contentPath) / (1024 * 1024), 3);
echo "<p>Original file is $fileSize MB</p>";
for($i=1; $i<=$numpages; $i++)
{
echo "<p>Creating file with $i pages</p>";
$filePath = "/path/to/local/files/test.$i.pdf";
try
{
selectOnePage($contentPath, $i, $filePath);
}
catch (Exception $e)
{
die ("<pre>ERROR: $e</pre>");
}
$fileSize = round(filesize($filePath) / (1024 * 1024),3);
echo "<p>$filePath is $fileSize MB</p>";
}
}
function selectOnePage($filePathIn, $pageNo, $filePathOut)
{
require_once('fpdf/fpdf.php');
require_once('fpdi/src/autoload.php');
// initiate FPDI
$pdf = new \setasign\Fpdi\Fpdi();
// get the page count
$pageCount = $pdf->setSourceFile($filePathIn);
echo "<p>Selecting page $pageNo / $pageCount</p>";
// import a page
$pdf->AddPage();
$templateId = $pdf->importPage($pageNo);
$pdf->useImportedPage($templateId);
//output the file
$pdf->Output($filePathOut, 'F');
}
FPDI does not analyze which resources an imported page actually uses; it copies all resources the page references.
If, for example, a document has only a single resource dictionary shared by all pages (a common structure), all resources are copied for every page.
We also offer a commercial (non-free) tool for merging and splitting PDF documents: the SetaPDF-Merger component. By default this tool has the same problem, but we have prepared a demo with some code that removes unused resources after the split process. You can find the demo and code here.
This appears to be a general problem with most PDF tools. It is also a problem with pdftk and cpdf, as described in "pdftk split pdf with multiple pages".
Most PDFs I have come across have a single resource dictionary, so it can't be done easily (thanks to @Jan Slabon for the explanation).

How to insert an image from PHP into PDF 1.7

I'm creating a web app that allows a form to insert an image from an HTML canvas into a particular position in multiple PDF files. I had this working with Python Flask as a back end, but the people I'm making it for only want it in PHP. I have tried using libraries like FPDI, but they only work with PDF versions up to 1.4, while the PDF files we are using are version 1.7.
Does anyone know of any libraries that can help me solve this issue? I would prefer not to convert the PDF files if possible.
Cheers
With TCPDF you can insert images into a PDF (v.1.7) file:
Requirements
composer require tecnickcom/tcpdf
Example
<?php
require_once __DIR__ . '/vendor/autoload.php';
$pdf = new TCPDF();
$pdf->setPDFVersion('1.7');
$pdf->setAutoPageBreak(true);
$pdf->setPrintHeader(false);
$pdf->setPrintFooter(false);
$pdf->AddPage();
// Insert image
$pdf->setJPEGQuality(100);
$pdf->image(__DIR__ . '/example.jpg', 10, 10);
// Close and output PDF document
//$pdf->output('doc.pdf', 'I');
// Save the pdf file
$content = $pdf->output('', 'S');
file_put_contents('example.pdf', $content);
We (Setasign, creator of FPDI) offer a commercial add-on that lets you import PDFs which use a compression technique that was introduced in PDF 1.5.
You may also try to downgrade these documents with an external program. I'm aware of some people using Ghostscript for this.
Generally, you should know that you do not insert an image into the existing PDF. Instead, you create a completely new PDF: you import a single page into a reusable structure, place that structure onto a newly created page, and then place the image on top of it.
With FPDI you cannot edit a PDF document.
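The import-then-draw workflow described above could look roughly like this with FPDI 2. This is a minimal sketch: the helper name stampImageOnPage, its parameters, and the file names in the usage comment are placeholders, and it only works for source files that FPDI is able to read in the first place.

```php
<?php
// Hypothetical helper: $pdf is expected to be an FPDI 2 instance,
// for example new \setasign\Fpdi\Fpdi(). Coordinates use the document's unit.
function stampImageOnPage($pdf, $sourcePdf, $pageNo, $imagePath, $x, $y, $width)
{
    // Import the requested page into a reusable template.
    $pdf->setSourceFile($sourcePdf);
    $tpl  = $pdf->importPage($pageNo);
    $size = $pdf->getTemplateSize($tpl);

    // Create a new page with the same size and orientation as the original.
    $pdf->AddPage($size['orientation'], array($size['width'], $size['height']));

    // Place the imported page first, then draw the image on top of it.
    $pdf->useTemplate($tpl);
    $pdf->Image($imagePath, $x, $y, $width);

    return $pdf;
}

// Usage (file names are placeholders):
// $pdf = stampImageOnPage(new \setasign\Fpdi\Fpdi(), 'form.pdf', 1, 'canvas.png', 20, 30, 60);
// $pdf->Output('stamped.pdf', 'F');
```

The result is a new document; the original file is never modified.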

Creating a new PDF by Merging PDF documents using TCPDF

How can I create a new document using other PDFs that I'm generating?
I have methods to create some documents, and I want to merge them all in a big PDF, how can I do that with TCPDF?
I do not want to use other libs.
TCPDF has a tcpdf_import class, added in 2011, but it is still "under development". If you don't want to use anything outside of TCPDF, you're out of luck!
But FPDI is an excellent addition to TCPDF: it's like an addon. It's as simple as this:
require_once('tcpdf/tcpdf.php');
require_once('fpdi/fpdi.php'); // the addon
// FPDI extends the TCPDF class, so you keep all TCPDF functionality
$pdf = new FPDI();
$pdf->setSourceFile("document.pdf"); // must be pdf version 1.4 or below
// a page must exist before a template can be placed on it
$pdf->AddPage();
// FPDI's importPage returns a template ID that you can place with useTemplate
$pdf->useTemplate($pdf->importPage(1));
Done!
See also this question:
TCPDF and FPDI with multiple pages
Why don't you use Zend_Pdf? It's really a very good way to merge files.
<?php
require_once 'Zend/Pdf.php';
$pdf1 = Zend_Pdf::load("1.pdf");
$pdf2 = Zend_Pdf::load("2.pdf");
foreach ($pdf2->pages as $page){
$pdf1->pages[] = $page;
}
$pdf1->save('3.pdf');
?>
Hi, I think TCPDF is not able to merge PDF files.
You can try it with a shell command and the PDFtk toolkit,
so you don't have to use another PDF library.
This thread is from 2009, but using existing PDFs in PHP is still an issue in 2020.
Since Zend_PDF has been abandoned and TCPDI does not support PHP 7, FPDI currently seems to be one of the few working solutions left in 2020. It can be used with TCPDF and FPDF, so existing code keeps working, and it currently seems well maintained.
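As a rough sketch of what merging whole documents looks like with FPDI 2: the helper name appendAllPages and the file names are made up for illustration, and the usage comment assumes FPDI 2 and TCPDF are installed via Composer.

```php
<?php
// Hypothetical helper: appends every page of every file to $pdf, which is
// expected to be an FPDI 2 instance (FPDF-based or TCPDF-based).
function appendAllPages($pdf, array $files)
{
    foreach ($files as $file) {
        // setSourceFile() returns the number of pages in the source document.
        $pageCount = $pdf->setSourceFile($file);

        for ($pageNo = 1; $pageNo <= $pageCount; $pageNo++) {
            $tpl  = $pdf->importPage($pageNo);
            $size = $pdf->getTemplateSize($tpl);

            // Keep the size and orientation of each imported page.
            $pdf->AddPage($size['orientation'], array($size['width'], $size['height']));
            $pdf->useTemplate($tpl);
        }
    }

    return $pdf;
}

// Usage with TCPDF as the base class:
// require_once __DIR__ . '/vendor/autoload.php';
// $pdf = new \setasign\Fpdi\Tcpdf\Fpdi();
// $pdf->setPrintHeader(false); // avoid TCPDF's default header line on merged pages
// $pdf->setPrintFooter(false);
// appendAllPages($pdf, array('first.pdf', 'second.pdf'));
// $pdf->Output(__DIR__ . '/merged.pdf', 'F');
```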
Check out FPDI and FPDF_TPL. This isn't a perfect solution, but you can basically use FPDF_TPL to create a template of your PDF file and then insert it into your new PDF file.
