Convert PDF to HTML in PHP

I want to read PDF documents and display their content in the browser, without allowing users to save a copy of the PDF.
How can I use FPDF for this purpose? So far I could not figure out a way of reading a PDF document with FPDF, only of creating a new one. Can anyone suggest an example of reading a PDF file and, if possible, how to disable the "save as PDF" option?

FPDF can't read PDFs. Take a look at its FAQ: items 16 and 17 sound interesting, and it looks like there are add-ons to do this.
What you can never prevent is the user saving that PDF: it has to be sent to the browser on the client's machine to be displayed, so there will always be a way to save it. One option is to convert every page of the PDF to an image (using ImageMagick, for example) and display only those images, so the user can't copy the text and can't get the original PDF document. But that will only annoy people.
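A minimal sketch of the page-to-image idea, assuming a Unix-like server where ImageMagick's convert command (with Ghostscript for PDF support) is installed and allowed to read PDFs. The function name, the 150 dpi default and the page-%03d.png naming are illustrative choices, not anything FPDF or ImageMagick prescribes:

```php
<?php
// Hypothetical helper: builds an ImageMagick command line that renders each
// page of a PDF to its own PNG file (page-000.png, page-001.png, ...).
// Assumes a Unix-like shell; escapeshellarg() behaves differently on Windows.
function buildPdfToImagesCommand(string $pdfPath, string $outputDir, int $dpi = 150): string
{
    // %03d is expanded by ImageMagick itself into the zero-padded page index.
    $pattern = rtrim($outputDir, '/') . '/page-%03d.png';

    return sprintf(
        'convert -density %d %s %s',
        $dpi,
        escapeshellarg($pdfPath),
        escapeshellarg($pattern)
    );
}

// Usage (runs the external program; check $exitCode before using the images):
// exec(buildPdfToImagesCommand('doc.pdf', '/tmp/pages'), $output, $exitCode);
```

The generated images can then be served with ordinary img tags. As the answer says, this only makes copying harder; the images themselves can still be saved.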

If you have an existing PDF and want to display it inline rather than prompt the user to download it:
// Path to PDF file
$file = "blah.pdf";
// Show in browser
header('Content-type: application/pdf');
header('Content-Disposition: inline; filename="'.basename($file).'"');
header('Expires: 0');
header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
header('Pragma: public');
header('Content-Length: ' . filesize($file));
readfile($file);
exit();
You don't need FPDF for this; it is only a class that helps you create PDFs from scratch.
Also bear in mind that nothing stops a user from saving a PDF, even when it is displayed inline.
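The only difference between showing a PDF inline and forcing a download is the Content-Disposition value. A small sketch of that choice follows; buildPdfHeaders is a hypothetical helper name, and stripping quotes and line breaks from the file name is an added precaution rather than something the answer above requires:

```php
<?php
// Hypothetical helper: returns the header lines for serving a PDF.
// $inline = true asks the browser to display it; false asks for a download.
function buildPdfHeaders(string $filename, int $size, bool $inline = true): array
{
    // Keep only the base name and drop characters that would break the quoted value.
    $safeName    = str_replace(['"', "\r", "\n"], '', basename($filename));
    $disposition = $inline ? 'inline' : 'attachment';

    return [
        'Content-Type: application/pdf',
        'Content-Disposition: ' . $disposition . '; filename="' . $safeName . '"',
        'Content-Length: ' . $size,
    ];
}

// Usage in a web request (must run before any other output):
// foreach (buildPdfHeaders($file, filesize($file)) as $line) { header($line); }
// readfile($file);
```

Either way the browser receives the complete file, so the user can still save it.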

I'm not sure whether this is possible; check the FPDI extension for FPDF here: http://www.setasign.de/products/pdf-php-solutions/fpdi/

To convert a PDF to HTML there is a Linux program called pdftohtml. Be aware, however, that the result will not look like the original PDF, and in many cases (locked PDFs, etc.) it will fail. A possible solution is to generate an image of each page using a program like ImageMagick, then place the HTML over it on an invisible layer to allow for interaction. I'd still go for displaying the PDF inline if I were you, though.
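For anyone who wants to try pdftohtml from PHP anyway, here is a rough sketch. It assumes the Poppler build of pdftohtml is on the server's PATH and a Unix-like shell; the helper name is made up, and the -c (complex layout) and -noframes (single HTML file) switches are just one plausible combination:

```php
<?php
// Hypothetical helper: builds a pdftohtml command line.
// $outputBase is the output name without extension; with -noframes the tool
// is expected to write a single "<outputBase>.html" file.
function buildPdfToHtmlCommand(string $pdfPath, string $outputBase): string
{
    return 'pdftohtml -c -noframes '
        . escapeshellarg($pdfPath) . ' '
        . escapeshellarg($outputBase);
}

// Usage: a non-zero exit code usually means the conversion failed,
// for example on a protected PDF.
// exec(buildPdfToHtmlCommand('input.pdf', '/tmp/output'), $lines, $exitCode);
// if ($exitCode !== 0) { /* fall back to serving the PDF inline */ }
```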

Related

Do we need to set HTTP Header when using mpdf?

I am using mpdf to generate the pdf files using PHP. I am successfully able to output the pdf as inline browser as well as force download using the mpdf options.
My question is: do we need to send any HTTP header information ourselves, or does mpdf handle that part automatically? I am asking because some browsers may require certain headers for PDF files to work properly.
Please note that we are asking about the headers related to PDF file only.
E.g.
header("Content-type: application/pdf");
header("Content-Description: PHP Generated Data");
header("Content-Transfer-Encoding: binary");
header('Content-Length:' . filesize($file));
Thanks
mPDF sets the appropriate headers automatically when you use the Destination::INLINE or Destination::DOWNLOAD output options. You can see the exact headers that are set in the code.
Feel free to set additional custom headers if you need to; however, they are not needed to view or download the generated PDF document correctly.

Open PDF from FPDF in new tab

I have a process where customer clicks if he wants a report to be generated and downloaded while creating a business unit. This report will be in pdf format.
Currently I am using Codeigniter and I use FPDF to generate pdf files.
The pdf opens well when it is requested. But
Problem:
1) The PDF opens in the same tab. I would like it to open in a new tab, and I am still working out how.
target="_blank" would open the PDF in a new tab if it were a plain link to a PDF file. But here the PDF is generated server side, so that will not work directly and I am looking for an alternative.
2) After the PDF is generated, the next line of code is not run. Execution actually stops here. I would like to know how I can make the process continue even after outputting the PDF file.
Example
// $exampleArray carries all the data for the PDF; 'D' forces FPDF to download
// the PDF rather than open it. 'I' would also work, but it outputs the PDF in the same tab.
$pdf->Output($exampleArray, 'D');
$this->continueNextFunction(); // This function should run and open the views in it.
From the above example I would love to see either PDF downloaded or 'opened in new tab' followed by next line executed helping the page to redirect and open required views.
Also please let me know if further explanation is required. I tried my best to explain the situation here. I had looked on this over google but I have not really found any solution on this.
Any help regarding this will be greatly appreciated.
You should create the new tab before running the FPDF code.
Alternatively, you can save the PDF as a file and open a new tab with the correct header.
See this question: Show a PDF files in users browser via PHP/Perl
The code terminates with output by design, unless you save it to file or string.
$pdf->Output($filename,'F');
If you could elaborate on what you want to do after the output, I might be able to help more.
Here is what we are doing and some thoughts:
The Output() method takes 2 arguments, name and dest. You are sending an array in for the name parameter, probably not what you want. The second, dest, will use the "D" as you have specified.
Output() sends headers and the data depending on the value you specify for dest. See below.
What that means is that if you want to continue executing code, you will likely need to move the logic that generates the PDF into a separate page and open that page in a new tab using target="_blank", as you were thinking. That page can then either prompt the user to download the file or, with the "I" value, open it within the browser.
Output() from fpdf.php [lines 999-1036]:
switch($dest)
{
    case 'I':
        // Send to standard output
        $this->_checkoutput();
        if(PHP_SAPI!='cli')
        {
            // We send to a browser
            header('Content-Type: application/pdf');
            header('Content-Disposition: inline; filename="'.$name.'"');
            header('Cache-Control: private, max-age=0, must-revalidate');
            header('Pragma: public');
        }
        echo $this->buffer;
        break;
    case 'D':
        // Download file
        $this->_checkoutput();
        header('Content-Type: application/x-download');
        header('Content-Disposition: attachment; filename="'.$name.'"');
        header('Cache-Control: private, max-age=0, must-revalidate');
        header('Pragma: public');
        echo $this->buffer;
        break;
    case 'F':
        // Save to local file
        $f = fopen($name,'wb');
        if(!$f)
            $this->Error('Unable to create output file: '.$name);
        fwrite($f,$this->buffer,strlen($this->buffer));
        fclose($f);
        break;
    case 'S':
        // Return as a string
        return $this->buffer;
    default:
        $this->Error('Incorrect output destination: '.$dest);
}
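One way to combine the 'F'/'S' destinations with the "separate page in a new tab" advice is to store the generated PDF first, continue with the normal page, and let a second URL stream the stored file. The sketch below is only an illustration: both function names, the token scheme and the storage directory are assumptions, not part of FPDF or CodeIgniter.

```php
<?php
// Hypothetical helper: stores already-generated PDF bytes under a random
// token and returns the token, so a separate request can stream the file.
function storePdfForLater(string $pdfBytes, string $dir): string
{
    $token = bin2hex(random_bytes(16)); // 32 hex characters
    $path  = rtrim($dir, '/') . '/' . $token . '.pdf';
    if (file_put_contents($path, $pdfBytes) === false) {
        throw new RuntimeException('Unable to write ' . $path);
    }
    return $token;
}

// Hypothetical helper: maps a token back to a file path. Anything that is
// not a 32-character hex token is rejected so user input cannot escape $dir.
function resolvePdfToken(string $token, string $dir): ?string
{
    if (!preg_match('/^[0-9a-f]{32}$/', $token)) {
        return null;
    }
    $path = rtrim($dir, '/') . '/' . $token . '.pdf';
    return is_file($path) ? $path : null;
}

// Usage in the controller: 'S' returns the PDF as a string, so nothing is
// sent to the browser and the code after it keeps running.
// $token = storePdfForLater($pdf->Output('', 'S'), '/path/to/storage');
// $this->continueNextFunction(); // the view can link to the PDF page with $token
```

The page that serves the PDF would call resolvePdfToken() and, if it returns a path, send the inline or attachment headers followed by readfile().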

Protecting / serving a file via readfile(), force download?

I'm trying to make a simple script that does two things:
Serves up a file and hides its location
Has a download counter
Now, I'm doing this in the wordpress environment, but this question isn't completely wordpress-related so I figured I would ask here.
Basically, the way I currently have it set up, I have a link that sets a $_GET parameter when you click it, and the code then checks whether that parameter is set. If it is set, the download file is served.
the link: Click here!
the $_GET code: http://pastebin.com/93nD43gA
There is a bit of wordpress jargon in the code, but basically it's checking a download count user_meta and if it's > 0, serveFile() is called.
The main problem I'm having here is, if I click the link, readfile() loads the actual file contents INTO the window (garbled text). If I add a target=_blank to the <a> it opens a new browser window and loads the contents INTO the window.
This approach seemed to work perfectly fine when I was doing it as stand-alone php files. My main issue is that I need to keep the wordpress space so I can call functions, etc. associated with it.
I have tried using the $_GET check on the page itself, on another page with a custom template (the code in the pastebin above), and in a stand-alone PHP file. The first two options load the file INTO the window. The third doesn't preserve wordpress functions, even if I include blog-header.php.
Can anyone point me in the right direction on how to force the file to download rather than load INTO the window?
You need to set the appropriate headers for whatever the file type is. For example, if readfile always serves PDFs, it should be done like this:
// cache-related headers -- the server may be setting these on its own
header("Pragma: public");
header("Expires: 0");
header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
// only one Content-Type can be sent; a later one would replace this value
header('Content-Type: application/pdf');
// Content-Disposition: attachment is what forces the download
header('Content-Disposition: attachment; filename="filename.pdf"');
readfile($file);
Keep in mind that header() only works if you have not yet sent any output in the response, including whitespace.
The 'garbled' text is the file you want; what is missing is the MIME type. You can set it with a header, e.g. header("Content-Type: image/png");
If the MIME types will vary (e.g. PDF, DOC, PNG), you should look into the finfo extension. With it you can get the full and correct MIME type of the file:
<?php
$finfo = new \finfo(FILEINFO_MIME);
$mime = $finfo->file('path/to/file', FILEINFO_MIME_TYPE);
header("Content-Type: $mime");
As noted, headers can be set only if nothing has been written to the output yet (no echo, print, etc.). Output buffering could help you here.
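Here is a slightly more defensive sketch of the finfo lookup, assuming the fileinfo extension is enabled. The helper name and the application/octet-stream fallback are my own choices:

```php
<?php
// Hypothetical helper: detects a file's MIME type with the finfo extension,
// falling back to a generic binary type when the file is missing or
// detection fails.
function detectMimeType(string $path): string
{
    if (!is_file($path)) {
        return 'application/octet-stream';
    }
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $mime  = $finfo->file($path);

    return (is_string($mime) && $mime !== '') ? $mime : 'application/octet-stream';
}

// Usage with output buffering, so stray output from a theme or plugin
// does not prevent the headers from being sent:
// ob_start();                       // as early as possible in the request
// ...
// ob_end_clean();                   // discard anything echoed so far
// header('Content-Type: ' . detectMimeType($file));
// header('Content-Disposition: attachment; filename="' . basename($file) . '"');
// readfile($file);
// exit;
```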

Dynamic creation of a doc/docx document on the users desktop

My site is HTML/Javascript with AJAX calling server-side PHP. I want to allow the user to click an icon and create a report from MySQL data and then save this on the client's desktop without doing a page reload.
Options for creating a doc, as I can gather it, appear to be as follows. (I gather it needs to be done server-side, rather than with Javascript.) I'm not sure where the file ends up in each case. Please feel free to correct my misunderstandings :)
Method 1 - this appears only to create a .doc file. I'm not sure where the file gets put.
$fp = fopen("method1.doc", 'w+');
$str = "<B>This is the text for the word file created through php programming</B>";
fwrite($fp, $str);
fclose($fp);
Method 2 - this also appears to create a .doc file.
$word = new COM("word.application") or die ("couldnt create an instance of word");
echo "loaded , word version{$word->version}";
$word->visible = 1;
$word->Documents->Add();
$word->Selection->TypeText("Sample text.");
$word->Documents[1]->SaveAs("method2.doc");
$word->Quit();
$word->Release();
$word = null;
Method 3 - also a .doc file, I think.
header('Content-type: application/vnd.ms-word');
header("Content-Disposition: attachment;Filename=method3.doc");
echo "<html>";
echo "<body>";
echo "<b>My first document</b>";
echo "</body>";
echo "</html>";
Method 4 - PHPWord
Method 5 - PHPDocx
I've tested 1 & 2 in my home dev environment, but I can't find the files! What's the best way forward, please?
Thanks :)
BTW, I know there are relevant posts here, here and here, but none really answers the question.
If you want an icon that starts a download when clicked, without a page reload, just make the icon a link to a script that starts the download using the appropriate headers.
Example :
header ('Pragma: no-cache');
header('Content-Disposition: attachment; filename="'.$File.'"');
header('Expires: 0');
header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
header('Cache-Control: public');
header('Content-Description: File Transfer');
header('Content-Transfer-Encoding: binary');
header('Content-Length: '.$Len);
Doing this, the download will start, but the page where the user clicked will be neither changed nor reloaded.
If you want to generate dynamic DOCX files to be downloaded, I recommend using OpenTBS. This library can generate a DOCX (and XLSX, PPTX, ODT, ODS, ...) from templates. It has a function that lets you send the result directly as a download, without temporary files, or save it on the server side.
Methods 1 & 2 create the document on the server side, somewhere in the filesystem (after that you need to transfer it to the client).
Method 3 creates the document as the response to the client's request; depending on its settings, the browser will either save it or open it in a window (or ask 'Save/Open/Cancel?').
Personally, I would have made a Java applet or Flash application, which would have access to the local filesystem. It can load the document from the server and save it to the local file system without page reloads.

How can I view/open a Word document in my browser using PHP or HTML

How can I open and view a .doc file extension in my browser? The file is located on my server.
Two options: First is to just link to it, e.g. My Word Document, the second is to use an iframe and point it to the document. For this to work, however, most browsers require that the server sends a Content-disposition: inline header with the document. If you cannot configure your web server to do this, you can wrap the document in a bit of php:
<?php
header('Content-disposition: inline');
header('Content-type: application/msword'); // MIME type for legacy .doc files
readfile('MyWordDocument.doc');
exit;
And then link to that script instead of your word document.
This isn't guaranteed to work though; the content-disposition header is just a hint, and any browser may choose to treat it as an attachment anyway.
Also, note that .doc isn't exactly portable; basically, you need Word to display it properly (Open Office and a few other Open Source applications do kind of a decent job, but they're not quite there yet), and the browser must support opening Word as a plugin.
If the .doc file format requirement isn't set in stone, PDF would be a better choice (the conversion is usually as simple as printing it on a PDF printer, say, CutePDF, from inside Word), or maybe you can even convert the document to HTML (mileage may vary though).
…
You will need a browser with a plugin for Office documents installed. I believe Microsoft Office will install one for at least Internet Explorer by default.
If you want to work without a plugin, then you will need to convert the document to another format — HTML for maximum compatibility. This isn't a trivial operation, especially for complex documents (or even those which just contain images).
$file = "$file_name.doc";
$len = filesize($file); // Calculate file size
ob_clean();
header("Pragma: public");
header("Expires: 0");
header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
header("Cache-Control: public");
header("Content-Description: File Transfer");
header("Content-Type: application/msword"); // Send type of file
$header = "Content-Disposition: attachment; filename=\"$file_name.doc\""; // Send file name
header($header);
header("Content-Transfer-Encoding: binary");
header("Content-Length: " . $len); // Send file size
readfile($file);
You can use Google Docs instead, as it is free and reliable.
You can assign your file path to an iframe.
e.g. iframe1.Attributes.Add("Src", "http://docs.google.com/gview?url=http://YOUR_FILE_PATH&embedded=true");
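In PHP the same viewer URL can be built with the document address properly URL-encoded. This is a sketch only: buildViewerUrl is a hypothetical name, the viewer endpoint is copied from the line above, and whether the service accepts a given document is up to Google.

```php
<?php
// Hypothetical helper: builds the viewer URL for a publicly reachable document.
// The document URL is percent-encoded so that its own "?" and "&" characters
// do not get mixed up with the viewer's query string.
function buildViewerUrl(string $documentUrl): string
{
    $query = http_build_query(
        ['url' => $documentUrl, 'embedded' => 'true'],
        '',
        '&',
        PHP_QUERY_RFC3986
    );

    return 'http://docs.google.com/gview?' . $query;
}

// Usage inside a template:
// echo '<iframe src="' . htmlspecialchars(buildViewerUrl($docUrl)) . '"></iframe>';
```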
If your .doc file is accessible online, you can try the Office Web Viewer service.
If your documents are stored on an intranet, you can use Microsoft Office Web Apps Server. It lets users view Word, PowerPoint, and Excel documents in the browser.
//Edit
$header="Content-Disposition: attachment; filename=$file_name.doc;"; // Send File Name
