Merge PDF files with PHP [closed] - php

My concept is this: there are 10 PDF files on a website. The user can select some of the PDF files and then choose merge to create a single PDF file that contains the selected pages. How can I do this with PHP?

Below is the php PDF merge command.
$fileArray = array("name1.pdf","name2.pdf","name3.pdf","name4.pdf");
$datadir = "save_path/";
$outputName = $datadir."merged.pdf";
$cmd = "gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=$outputName ";
// Add each pdf file to the end of the command
foreach($fileArray as $file) {
    $cmd .= $file." ";
}
$result = shell_exec($cmd);
I forgot where I found it, but it works fine.
Note: You need gs (on Linux and probably Mac) or Ghostscript (on Windows) installed for this to work.
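The command above concatenates file names into the shell string as they are, which breaks as soon as a name contains a space or a quote. A minimal sketch of the same Ghostscript call with every path escaped follows; the function name buildGsMergeCommand is hypothetical, the Ghostscript flags are the ones from the answer, and PHP 7.1 or later is assumed.

```php
<?php
// Hypothetical helper: build the Ghostscript merge command with every
// path shell-escaped, so spaces or quotes in a file name cannot break it.
function buildGsMergeCommand(array $inputFiles, string $outputFile): string
{
    if ($inputFiles === []) {
        throw new InvalidArgumentException('At least one input PDF is required.');
    }

    // Same flags as in the answer; only the quoting differs.
    $cmd = 'gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile='
         . escapeshellarg($outputFile);

    foreach ($inputFiles as $file) {
        $cmd .= ' ' . escapeshellarg($file);
    }

    return $cmd;
}

// Usage: nothing runs until the string is passed to shell_exec().
$cmd = buildGsMergeCommand(['name1.pdf', 'my report.pdf'], 'save_path/merged.pdf');
echo $cmd, PHP_EOL;
```

Since the user picks the files on the website, it is probably also worth checking each selected name against the list of known PDFs before building the command.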

I suggest PDFMerger from github.com; it is as easy as:
include 'PDFMerger.php';
$pdf = new PDFMerger;
$pdf->addPDF('samplepdfs/one.pdf', '1, 3, 4')
    ->addPDF('samplepdfs/two.pdf', '1-2')
    ->addPDF('samplepdfs/three.pdf', 'all')
    ->merge('file', 'samplepdfs/TEST2.pdf'); // REPLACE 'file' WITH 'browser', 'download', 'string', or 'file' for output options
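The second argument of addPDF in the example takes page specifications such as '1, 3, 4', '1-2' or 'all'. To illustrate how such a string maps to page numbers, here is a rough sketch of a parser. It is not PDFMerger's own code: the function name expandPageSpec and the validation rules are assumptions made for this example, and PHP 7.1 or later is assumed.

```php
<?php
// Hypothetical helper, not PDFMerger's implementation: expand a page
// specification such as '1, 3, 4', '1-2' or 'all' into page numbers.
function expandPageSpec(string $spec, int $pageCount): array
{
    $spec = strtolower(trim($spec));
    if ($spec === 'all') {
        return $pageCount > 0 ? range(1, $pageCount) : [];
    }

    $pages = [];
    foreach (explode(',', $spec) as $part) {
        $part = trim($part);
        if (preg_match('/^(\d+)\s*-\s*(\d+)$/', $part, $m)) {
            // A range like "2-5": both ends are included.
            $from = (int) $m[1];
            $to = (int) $m[2];
            if ($from > $to) {
                throw new InvalidArgumentException("Descending range: '$part'");
            }
            foreach (range($from, $to) as $page) {
                $pages[] = $page;
            }
        } elseif (preg_match('/^\d+$/', $part)) {
            $pages[] = (int) $part;
        } else {
            throw new InvalidArgumentException("Invalid page specification: '$part'");
        }
    }

    // Reject pages that the document does not have.
    foreach ($pages as $page) {
        if ($page < 1 || $page > $pageCount) {
            throw new OutOfRangeException("Page $page is outside 1..$pageCount");
        }
    }

    return $pages;
}

print_r(expandPageSpec('1, 3, 4', 5));
```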

I've done this before. I had a PDF that I generated with FPDF, and I needed to append a variable number of PDFs to it.
So I already had an FPDF object and page set up (http://www.fpdf.org/),
and I used FPDI to import the files (http://www.setasign.de/products/pdf-php-solutions/fpdi/).
FPDI is added by extending the PDF class:
class PDF extends FPDI
{
}
$pdffile = "Filename.pdf";
$pagecount = $pdf->setSourceFile($pdffile);
for($i=0; $i<$pagecount; $i++){
$pdf->AddPage();
$tplidx = $pdf->importPage($i+1, '/MediaBox');
$pdf->useTemplate($tplidx, 10, 10, 200);
}
This basically turns each page of the imported PDF into a template (much like an image) that is placed into your other PDF. It worked amazingly well for what I needed it for.

$cmd = "gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=".$new." ".implode(" ", $files);
shell_exec($cmd);
A simplified version of Chauhan's answer

Both the accepted answer and even the FPDI homepage seem to give botched or incomplete examples. Here's mine, which works and is easy to implement. As expected, it requires the FPDF and FPDI libraries:
FPDF: http://www.fpdf.org/en/download.php
FPDI: https://www.setasign.com/products/fpdi/downloads
require('fpdf.php');
require('fpdi.php');
$files = ['doc1.pdf', 'doc2.pdf', 'doc3.pdf'];
$pdf = new FPDI();
// iterate over array of files and merge
foreach ($files as $file) {
    $pageCount = $pdf->setSourceFile($file);
    for ($i = 0; $i < $pageCount; $i++) {
        $tpl = $pdf->importPage($i + 1, '/MediaBox');
        $pdf->addPage();
        $pdf->useTemplate($tpl);
    }
}
// output the pdf as a file (http://www.fpdf.org/en/doc/output.htm)
$pdf->Output('F','merged.pdf');

I had a similar problem in my software. We wanted to merge several PDF files into one PDF file and submit it to an external service. We had been using the FPDI solution shown in Christa's answer.
However, the input PDFs could be in a version higher than 1.7, so we decided to evaluate the commercial FPDI add-on. It turned out that some of the documents scanned by our office copier had malformed indexes, which crashed the commercial FPDI add-on. So we decided to use the Ghostscript solution from Chauhan's answer.
But then we got some strange metadata in the output PDF properties.
Finally, we decided to combine the two solutions: the PDFs are merged and downgraded by Ghostscript, and the metadata is set by FPDI. We don't know yet how this would work with PDFs that use advanced formatting, but for the scans we use it works just fine. Here's an excerpt of our class:
class MergedPDF extends \FPDI
{
    private $documentsPaths = array();
    public function Render()
    {
        $outputFileName = tempnam(sys_get_temp_dir(), 'merged');
        // merge files and save resulting file as PDF version 1.4 for FPDI compatibility
        $cmd = "/usr/bin/gs -q -dNOPAUSE -dBATCH -dCompatibilityLevel=1.4 -sDEVICE=pdfwrite -sOutputFile=" . escapeshellarg($outputFileName);
        foreach ($this->getDocumentsPaths() as $pdfpath) {
            // escape each path so spaces or shell metacharacters cannot break the command
            $cmd .= ' ' . escapeshellarg($pdfpath);
        }
        $result = shell_exec($cmd);
        $this->SetCreator('Your Software Name');
        $this->setPrintHeader(false);
        // re-import the merged file page by page so FPDI writes the metadata
        $numPages = $this->setSourceFile($outputFileName);
        for ($i = 1; $i <= $numPages; $i++) {
            $tplIdx = $this->importPage($i);
            $this->AddPage();
            $this->useTemplate($tplIdx);
        }
        unlink($outputFileName);
        $content = $this->Output(null, 'S');
        return $content;
    }
    public function getDocumentsPaths()
    {
        return $this->documentsPaths;
    }
    public function setDocumentsPaths($documentsPaths)
    {
        $this->documentsPaths = $documentsPaths;
    }
    public function addDocumentPath($documentPath)
    {
        $this->documentsPaths[] = $documentPath;
    }
}
The usage of this class is as follows:
$pdf = new MergedPDF();
$pdf->setTitle($pdfTitle);
$pdf->addDocumentPath($absolutePath1);
$pdf->addDocumentPath($absolutePath2);
$pdf->addDocumentPath($absolutePath3);
$tempFileName = tempnam(sys_get_temp_dir(), 'merged');
$content = $pdf->Render();
file_put_contents($tempFileName, $content);

I tried this for a similar issue and it works fine; give it a try. It can handle different orientations between PDFs.
// array to hold list of PDF files to be merged
$files = array("a.pdf", "b.pdf", "c.pdf");
$pageCount = 0;
// initiate FPDI
$pdf = new FPDI();
// iterate through the files
foreach ($files AS $file) {
    // get the page count
    $pageCount = $pdf->setSourceFile($file);
    // iterate through all pages
    for ($pageNo = 1; $pageNo <= $pageCount; $pageNo++) {
        // import a page
        $templateId = $pdf->importPage($pageNo);
        // get the size of the imported page
        $size = $pdf->getTemplateSize($templateId);
        // create a page (landscape or portrait depending on the imported page size)
        if ($size['w'] > $size['h']) {
            $pdf->AddPage('L', array($size['w'], $size['h']));
        } else {
            $pdf->AddPage('P', array($size['w'], $size['h']));
        }
        // use the imported page
        $pdf->useTemplate($templateId);
        $pdf->SetFont('Helvetica');
        $pdf->SetXY(5, 5);
        $pdf->Write(8, 'Generated by FPDI');
    }
}

This worked for me on Windows:
Download PDFtk (free) from https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
Drop the PDFtk folder into the root of c:
Add the following to your PHP code, where $file1 is the location and name of the first PDF file, $file2 is the location and name of the second, and $newfile is the location and name of the destination file:
$file1 = 'c:\\www\\folder1\\folder2\\file1.pdf';
$file2 = 'c:\\www\\folder1\\folder2\\file2.pdf';
$newfile = 'c:\\www\\folder1\\folder2\\file3.pdf';
// pdftk expects "cat output" between the input files and the destination file
$command = 'cmd /c C:\\pdftk\\bin\\pdftk.exe ' . $file1 . ' ' . $file2 . ' cat output ' . $newfile;
$result = exec($command);

I created an abstraction layer over FPDI (might accommodate other engines).
I published it as a Symfony2 bundle depending on a library, and as the library itself.
The bundle
The Library
usage:
public function handlePdfChanges(Document $document, array $formRawData)
{
$oldPath = $document->getUploadRootDir($this->kernel) . $document->getOldPath();
$newTmpPath = $document->getFile()->getRealPath();
switch ($formRawData['insertOptions']['insertPosition']) {
case PdfInsertType::POSITION_BEGINNING:
// prepend
$newPdf = $this->pdfManager->insert($oldPath, $newTmpPath);
break;
case PdfInsertType::POSITION_END:
// Append
$newPdf = $this->pdfManager->append($oldPath, $newTmpPath);
break;
case PdfInsertType::POSITION_PAGE:
// insert at page n: PdfA={p1; p2; p3}, PdfB={pA; pB; pC}
// insert(PdfA, PdfB, 2) will render {p1; pA; pB; pC; p2; p3}
$newPdf = $this->pdfManager->insert(
$oldPath, $newTmpPath, $formRawData['insertOptions']['pageNumber']
);
break;
case PdfInsertType::POSITION_REPLACE:
// does nothing. overrides old file.
return;
break;
}
$pageCount = $newPdf->getPageCount();
$newPdf->renderFile($mergedPdfPath = "$newTmpPath.merged");
$document->setFile(new File($mergedPdfPath, true));
return $pageCount;
}

myokyawhtun's solution worked best for me (using PHP 5.4)
You will still get an error though - I resolved using the following:
Line 269 of fpdf_tpl.php - changed the function parameters to:
function Image($file, $x=null, $y=null, $w=0, $h=0, $type='', $link='',$align='', $resize=false, $dpi=300, $palign='', $ismask=false, $imgmask=false, $border=0) {
I also made this same change on line 898 of fpdf.php

Related

Merging PDFs with Codeigniter

I've written the following code for merging PDFs using this answer
function merge_pdfs() {
$pdfs_array = array('1.pdf', '2.pdf');
$pdf = new FPDI_Protection();
for ($i = 0; $i < count($pdfs_array); $i++ ) {
$pagecount = $pdf->setSourceFile($pdfs_array[$i]);
for($j = 0; $j < $pagecount ; $j++) {
$tplidx = $pdf->importPage(($j +1), '/MediaBox');
$pdf->addPage('P','A4');
$pdf->useTemplate($tplidx, 0, 0, 0, 0, TRUE);
}
}
$dt = new DateTime(NULL, new DateTimeZone($data->user->timezone));
$pdf->SetTitle('PDF, created: '.$dt->format(MYHMRS_DATETIME_FRIENDLY));
$pdf->SetSubject('PDF subject !');
$output = $pdf->Output('', 'S');
$name = "PDF".'-'.$dt->format('ymd').'.pdf';
$this->output
->set_header("Content-Disposition: filename=$name;")
->set_content_type('Application/pdf')
->set_output($output);
}
So, after this I'm getting the following error message:
This document (1.pdf) probably uses a compression technique which is not supported by the free parser shipped with FPDI. (See https://www.setasign.com/fpdi-pdf-parser for more details)
I've checked the link, and it suggests using a different PDF parser (if I understand correctly).
But I'm not sure how to make it work with CodeIgniter and my example.
Should I create a library and try to use it?
Or maybe you know another solution for merging PDFs?
The issue was related to PDF versions.
Edit
In case you didn't know, PDFs have versions. Yes, I was surprised as well. Please check them here: PDF versions
So, the problem was that I was trying to merge a PDF version 1.5 file with a PDF version 1.6 file.
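Since the fix hinged on knowing which PDF version each input file uses, a quick check before merging can help. The sketch below reads the version from the file header, which normally looks like "%PDF-1.5" at the start of the file. The function name detectPdfVersion is hypothetical and PHP 7.1 or later is assumed; also note that a document can declare a newer version elsewhere in the file, so treat the result as a hint rather than a guarantee.

```php
<?php
// Hypothetical helper: read the version from a PDF's header line, which
// looks like "%PDF-1.5". Returns null if no such header is found.
function detectPdfVersion(string $path): ?string
{
    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return null;
    }

    // The header normally sits at the very start, so a small prefix is enough.
    $head = fread($handle, 1024);
    fclose($handle);

    if ($head !== false && preg_match('/%PDF-(\d+\.\d+)/', $head, $m)) {
        return $m[1];
    }

    return null;
}

// Usage: list the versions of the files that are about to be merged.
foreach (['1.pdf', '2.pdf'] as $file) {
    $version = detectPdfVersion($file);
    echo $file, ': ', $version === null ? 'unknown' : $version, PHP_EOL;
}
```

Files that report 1.5 or later are the likely candidates for the compression error quoted in the question, and converting them to 1.4 first (for example with Ghostscript and -dCompatibilityLevel=1.4) is one way around it.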
An example. It is simple.
<?php
require_once __DIR__ . '/vendor/autoload.php';
$mpdf = new \Mpdf\Mpdf();
$mpdf->WriteHTML('<h1>Hello world!</h1>');
$mpdf->AddPage('P');
$mpdf->WriteHTML('<h1>More</h1>');
$mpdf->Output();
?>

links (hyperlink) inside pdf is removed when splitting multiple page pdf into different single page

I was splitting a PDF into separate single-page files using FPDF and FPDI. Everything works fine, but the links inside the PDF no longer work: they are removed from the split single pages.
split_pdf("test.pdf", 'splitedpdf/');
function split_pdf($filename, $end_directory = false)
{
require_once('fpdf/fpdf.php');
require_once('fpdi/fpdi.php');
$end_directory = $end_directory ? $end_directory : './';
$new_path = preg_replace('/[\/]+/', '/', $end_directory.'/'.substr($filename, 0, strrpos($filename, '/')));
if (!is_dir($new_path))
{
// Will make directories under end directory that don't exist
// Provided that end directory exists and has the right permissions
mkdir($new_path, 0777, true);
}
$pdf = new FPDI();
$pagecount = $pdf->setSourceFile($filename); // How many pages?
// Split each page into a new PDF
for ($i = 1; $i <= $pagecount; $i++) {
$new_pdf = new FPDI();
$new_pdf->AddPage();
$new_pdf->setSourceFile($filename);
$new_pdf->useTemplate($new_pdf->importPage($i));
try {
$new_filename = $end_directory.str_replace('.pdf', '', $filename).'_'.$i.".pdf";
$new_pdf->Output($new_filename, "F");
echo "Page ".$i." split into ".$new_filename."<br />\n";
} catch (Exception $e) {
echo 'Caught exception: ', $e->getMessage(), "\n";
}
}
// $pdf->close();
}
FPDI is not able to handle any dynamic content like links, form fields or any other annotation type. There's an extension which supports at least links (only compatible with FPDI 1.4.4 + FPDF_TPL 1.2.3).
If you need to extract the pages including all attached annotations, you may check out the SetaPDF-Merger component (not free!).

How to generate PDF with low memory on the server?

I'm converting a table from my DB to PDF using TCPDF.
First I have to convert my table to HTML, and only then can I convert it to PDF. This uses a lot of memory, and I have few resources on the server (256M max for PHP).
How can I convert a table that may have thousands of records to PDF within PHP's 256M memory limit?
Can I create the PDF page by page and concatenate all the pages at the end?
I have found a way to concatenate the PDF pages from this link.
require_once("tcpdf/tcpdf.php"); // adjust the paths to your workspace
require_once("fpdi/fpdi.php");
class concat_pdf extends FPDI {
var $files = array();
function setFiles($files) {
$this->files = $files;
}
function concat() {
foreach($this->files AS $file) {
$pagecount = $this->setSourceFile($file);
for ($i = 1; $i <= $pagecount; $i++) {
$tplidx = $this->ImportPage($i);
$s = $this->getTemplatesize($tplidx);
$this->AddPage('P', array($s['w'], $s['h']));
$this->useTemplate($tplidx);
}
}
}
}
Did you try FPDF (http://www.fpdf.org) or the related mPDF (http://www.mpdf1.com) as an alternative? They may use fewer resources, so they might run on your server. They do a good job of creating PDF output from HTML.
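The idea raised in the question, producing the PDF in pieces and concatenating them at the end, depends on never holding all rows in memory at once. Here is a minimal sketch of the batching half only: a generator that hands out fixed-size groups of rows. The name batchRows and the batch size are assumptions for illustration, PHP 7.1 or later is assumed, and rendering each batch to its own small PDF is left to whichever library is in use.

```php
<?php
// Hypothetical helper: yield rows in fixed-size batches so that only one
// batch needs to be held in memory (and rendered) at a time.
function batchRows(iterable $rows, int $batchSize): Generator
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1.');
    }

    $batch = [];
    foreach ($rows as $row) {
        $batch[] = $row;
        if (count($batch) === $batchSize) {
            yield $batch;
            $batch = [];
        }
    }

    // Hand out the final, possibly smaller, batch.
    if ($batch !== []) {
        yield $batch;
    }
}

// Usage: each batch would be rendered to its own part file, and the part
// files concatenated afterwards (for example with the concat_pdf class).
foreach (batchRows([1, 2, 3, 4, 5], 2) as $index => $batch) {
    echo "part_$index.pdf gets ", count($batch), " row(s)", PHP_EOL;
}
```

For this to save memory, $rows itself should be lazy too, for instance an unbuffered database cursor rather than an array of all records.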

How to skip over corrupt files with PHP libraries TCPDF and FPDI by modifying error handling?

I am using the PHP libraries TCPDF and FPDI to combine PDF documents, and am getting the following error:
TCPDF ERROR: Unable to find object (10, 0) at expected location
I have the commercial version of FPDI.
It appears that the issue is only happening with PDF Version 1.3 (Acrobat 4.x) files. Here is a screenshot of a file's document properties that is creating the error. http://imagebin.org/215041
I'd like to skip over any files with errors instead of letting the script die. I have modified the error handling with a new class ErrorIgnoringTCPDF, however, it is not working.
Any ideas?
require_once('../../libraries/tcpdf/tcpdf.php');
require_once('../../libraries/fpdi/fpdi.php');
class ErrorIgnoringTCPDF extends FPDI {
public function Error($msg) {
// unset all class variables
$this->_destroy(true);
// exit program and print error
//die('<strong>TCPDF ERROR: </strong>'.$msg);
}
}
$pdf = new ErrorIgnoringTCPDF();
$pdf->setPrintHeader(false);
$prows = fetch_data($id);
foreach ($prows AS $row) {
$irows = get_imaged_docs($row['pat_id']);
foreach($irows AS $irow){
if ($irow['type'] === 'application/pdf'){
$doc_id = $irow['id'];
$content = get_pdf_imaged_docs($doc_id);
$pagecount = $pdf->setSourceFile($content);
for ($i = 1; $i <= $pagecount; $i++) {
$tplidx = $pdf->ImportPage($i);
$s = $pdf->getTemplatesize($tplidx);
$pdf->AddPage('P', array($s['w'], $s['h']));
$pdf->useTemplate($tplidx);
}
} else {
$pdf->AddPage();
$doc = fetch_document_content($irow['id'], $irow['filename']);
$img = base64_encode($doc);
$imgdata = base64_decode($img);
$pdf->Image('@'.$imgdata);
}
}
}
$pdf->Output('documents.pdf', 'D');
If you are using Linux you can use shell_exec to combine files
function combine_pdf($outputName,$fileArray)
{
$cmd = "gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=$outputName ";
foreach($fileArray as $file)
{
$cmd .= $file." ";
}
$result = shell_exec($cmd);
}
Have you tried just suppressing the error?
$pagecount = @$pdf->setSourceFile($content);
if (empty($pagecount))
continue; // or whatever you want to do, maybe set $is_invalid = true;
This simply indicates that the PDF document is erroneous: the parser looks at a specific byte offset and does not find the expected object there.
I won't say this is an appropriate or the best fix, but it may resolve your problem.
In pdf_parser.php, comment out the line:
$this->error("Unable to find object ({$obj_spec[1]}, {$obj_spec[2]}) at expected location");
It should be near line 544.
You'll also likely need to replace:
if (!is_array($kids))
$this->error('Cannot find /Kids in current /Page-Dictionary');
with:
if (!is_array($kids)){
// $this->error('Cannot find /Kids in current /Page-Dictionary');
return;
}
in the fpdi_pdf_parser.php file
Hope that helps. It worked for me.
I had the same problem, and I am using this code to fix it.
class convertPDF extends FPDI {
    public function error($msg) {
        throw new Exception($msg);
    }
    ...other stuff...
}
try {
    $convertPdf = new convertPDF();
} catch(Exception $e) {
    die($e->getMessage());
}
This answer is for people who search for this problem. Good luck!
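Once the error handler throws instead of dying, the original goal of skipping corrupt files comes down to a try/catch around each file. The sketch below shows only that control flow: importAllSkippingFailures is a hypothetical name, and the $importFile callable stands in for the per-file setSourceFile/importPage loop, so no FPDI call is made here. PHP 7.1 or later is assumed.

```php
<?php
// Hypothetical helper: run $importFile for each path and collect failures
// instead of aborting. $importFile stands in for the per-file FPDI import
// loop and is expected to throw when a file cannot be parsed.
function importAllSkippingFailures(array $paths, callable $importFile): array
{
    $skipped = [];
    foreach ($paths as $path) {
        try {
            $importFile($path);
        } catch (Exception $e) {
            // Remember why the file was skipped and move on to the next one.
            $skipped[$path] = $e->getMessage();
        }
    }

    return $skipped;
}

// Usage with a stand-in importer that rejects one file.
$skipped = importAllSkippingFailures(
    ['good.pdf', 'broken.pdf'],
    function (string $path): void {
        if ($path === 'broken.pdf') {
            throw new Exception('Unable to find object (10, 0) at expected location');
        }
    }
);
print_r($skipped);
```

One caveat: if a file fails halfway through, any pages already added from it stay in the output document, so it may be safer to test each file against a throwaway instance first.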

How to create a password protected pdf file

I'm using html2fpdf for creating PDF documents. Now once I have created that, I want to make sure that the PDF file is password protected. How can this be done in PHP?
Download the library I am using from a blog post on the ID Security Suite site:
<?php
function pdfEncrypt ($origFile, $password, $destFile){
require_once('FPDI_Protection.php');
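// "=& new" is a parse error since PHP 7; pass orientation and unit to the constructor instead of calling FPDF() separately
$pdf = new FPDI_Protection('P', 'in');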
//Calculate the number of pages from the original document.
$pagecount = $pdf->setSourceFile($origFile);
//Copy all pages from the old unprotected pdf in the new one.
for ($loop = 1; $loop <= $pagecount; $loop++) {
$tplidx = $pdf->importPage($loop);
$pdf->addPage();
$pdf->useTemplate($tplidx);
}
//Protect the new pdf file, and allow no printing, copy, etc. and
//leave only reading allowed.
$pdf->SetProtection(array(), $password);
$pdf->Output($destFile, 'F');
return $destFile;
}
//Password for the PDF file (I suggest using the email address of the purchaser).
$password = "testpassword";
//Name of the original file (unprotected).
$origFile = "sample.pdf";
//Name of the destination file (password protected and printing rights removed).
$destFile ="sample_protected.pdf";
//Encrypt the book and create the protected file.
pdfEncrypt($origFile, $password, $destFile );
?>
I was never able to find a direct php solution to this problem. I ended up using pdftk and using shell_exec() to call the binary once the pdf file was generated/uploaded.
It accepts a syntax like this:
pdftk 'inputfile.pdf' output 'outputfile.pdf' user_pw pass1234 owner_pw pass4321
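For completeness, here is a minimal sketch of building that pdftk call from PHP. The function name buildPdftkEncryptCommand is hypothetical; the pdftk keywords (output, user_pw, owner_pw) are the ones from the command line above. It assumes pdftk is on the PATH and PHP 7.1 or later.

```php
<?php
// Hypothetical helper: build the pdftk command that writes a
// password-protected copy of $inputFile to $outputFile.
function buildPdftkEncryptCommand(
    string $inputFile,
    string $outputFile,
    string $userPassword,
    string $ownerPassword
): string {
    // Every value is shell-escaped, including the passwords.
    return 'pdftk ' . escapeshellarg($inputFile)
         . ' output ' . escapeshellarg($outputFile)
         . ' user_pw ' . escapeshellarg($userPassword)
         . ' owner_pw ' . escapeshellarg($ownerPassword);
}

// Usage: nothing runs until the string is passed to shell_exec().
$cmd = buildPdftkEncryptCommand('inputfile.pdf', 'outputfile.pdf', 'pass1234', 'pass4321');
echo $cmd, PHP_EOL;
```

Keep in mind that passwords passed on a command line can be visible to other users of the same machine through the process list.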
