I am looking for something to edit the PDF files on my server and replace text / links within them.
I cannot modify the files as they are generated, so I need a script that will modify them afterwards.
Check this:
http://www.fpdf.org/
http://www.setasign.de/products/pdf-php-solutions/fpdi/
You need to read the PDF file first, then change the text and regenerate the file.
I am extracting text from PDF files. This is the code:
<?php
require("PdfToText.php");
$file = 'SamplePF' ;
$pdf = new PdfToText ( "$file.pdf" ) ;
echo ( $pdf -> Text ) ;
?>
This class works fine for some PDF files.
The problems with this class are:
For some PDF files it takes text from random pages/lines, not in page order.
For some PDF files it does not show any result.
For some PDF files it extracts only one or two lines.
Please suggest a solution. Thank you!
I am not sure this is the exact reason you cannot extract the text, but I ran into something similar when extracting data from PDFs. Some PDF files are locked with an owner password, which places restrictions on the document (no changes, no content copying or extraction, and so on) to protect its copyright. Check this link for more info on owner passwords.
So you can first try to remove the owner password and then extract text from such PDFs. A number of tools for removing owner passwords are available online; choose whichever fits you best.
So I'm making a notepad app in PHP, but I want to add the ability to share the file amongst your peers or something.
It's based on AJAX and saves the file automatically. The file is named after the MD5 hash of your IP address.
What I want to do is maybe go to /view/837ec5754f503cfaaee0929fd48974e7, while the actual text file is located at /notes/837ec5754f503cfaaee0929fd48974e7.txt
I know I'll have to use file_get_contents(), but I don't know how to display it on a page.
I could just have it link to the .txt file, but I don't want it raw. I want it to have some style.
How would I go about doing this? Where can I start?
First you need a way to pass a variable (the file name) in the URL. The easiest way to do this is the query string.
So the link to a file for your user to see would be '/view/?file=MYFILENAME'
Your PHP (this could also be wrapped in AJAXy goodness) then turns this into a path to retrieve the text file from.
view/index.php
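//Fetch the file based on the GET variable
//basename() strips directory parts, so a value like ../../secret cannot escape notes/
//Note the relative path
$file = file_get_contents('../notes/'.basename($_GET['file']).'.txt');
//Print the file, escaped for HTML. You can also dress it up or wrap it in HTML tags
echo htmlspecialchars($file);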
When displaying the text file, there are some built-in functions that will help, most notably nl2br(), which inserts HTML <br> tags before the newline characters in the text.
More reading on the GET array can be found here
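The approach above can be sketched as a small function. This is only a sketch under assumptions: renderNote() is a hypothetical helper name, and it assumes file names are always 32-character md5 hashes, as in the question. It validates the name, reads the file, escapes it for HTML and then applies nl2br().

```php
<?php
// Hypothetical helper: returns the note as HTML, or null if the
// name is not a valid md5-style hash or the file cannot be read.
function renderNote(string $notesDir, string $name): ?string
{
    // Accept only 32 lowercase hex characters (the shape of an md5 hash).
    // This also rejects path tricks such as "../".
    if (!preg_match('/^[a-f0-9]{32}$/', $name)) {
        return null;
    }

    $path = rtrim($notesDir, '/') . '/' . $name . '.txt';
    if (!is_file($path)) {
        return null;
    }

    $text = file_get_contents($path);
    if ($text === false) {
        return null;
    }

    // Escape HTML first, then insert <br /> tags before newlines.
    return nl2br(htmlspecialchars($text, ENT_QUOTES, 'UTF-8'));
}
```

In view/index.php this could be called as renderNote(__DIR__ . '/../notes', $_GET['file'] ?? '') and the result echoed inside whatever styled HTML page you like.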
I am using PHPRtfLite library (http://sigma-scripts.de/phprtflite/docs/index.html) to produce an RTF file using PHP and Yii.
So far, I've made a simple "Hello world" function.
Yii::import('ext.phprtf.PHPRtfLite');
Yii::registerAutoloader(array('PHPRtfLite','registerAutoloader'), true);
$rtf = new PHPRtfLite();
$sect = $rtf->addSection();
$sect->writeText('Hello world!', new PHPRtfLite_Font(), new PHPRtfLite_ParFormat());
//send rtf document to the browser
$rtf->sendRtf('takis.rtf');
The file is created successfully, but when I open it (in either WordPad or MS Word) I do not see the actual content of the file, only the raw RTF code:
{\rtf\ansi\deff0\fs20
{\fonttbl{\f0 Times New Roman;}}
{\colortbl;\red0\green0\blue0;}
{\info
}
\paperw11907 \paperh16840 \deftab1298 \margl1701 \margr1701 \margt567 \margb1134 \pgnstart1\ftnnar \aftnnrlc \ftnstart1 \aftnstart1
\pard \ql {\fs20 Hello world!}
}
Do you have any idea on how to solve this?
Thank you very much in advance.
To answer my own question, in case someone has the same issue in the future...
It seems to be a problem with the sendRtf function. Now I save the created file locally:
$rtf->save('takis.rtf');
and then generate a link for the user to download the file. This works pretty well.
I have experienced the same thing myself. I'm not sure whether you had the same cause, but in my case there was an extra newline at the beginning of the PHP file, before the <?php tag. When I used sendRtf to download the file from the browser, that newline also ended up in the RTF file, making it invalid, and as a result the raw RTF code was displayed. When using save, such extra characters do not reach the file.
So one thing to check in similar situations: open the RTF file in Notepad and examine the beginning of the file.
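The Notepad check can also be done in code. A minimal sketch, assuming a hypothetical helper named rtfStartsCleanly(): it reads the first five bytes of the file and compares them with the RTF signature {\rtf, so a stray newline or space in front of it is detected.

```php
<?php
// Hypothetical helper: true only if the file begins with the RTF
// signature "{\rtf" and nothing (e.g. a stray newline) precedes it.
function rtfStartsCleanly(string $path): bool
{
    if (!is_readable($path)) {
        return false;
    }

    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return false;
    }

    $start = fread($handle, 5);
    fclose($handle);

    // In a single-quoted PHP string, \r is a literal backslash plus "r".
    return $start === '{\rtf';
}
```

If this returns false for a downloaded file, look for whitespace or other output before the <?php tag in every included PHP file.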
I'm generating a pdf file with html2fpdf.
$pdf = new HTML2FPDF();
$pdf->HTML2FPDF("P","mm","A4");
$pdf->AddPage();
$pdf->WriteHTML($html);
$pdf->output('sample.pdf');
This sample works great. But:
How do I delete the PDF after the output? I just want to have links in my tool: the users can download the PDF, and after that it should be deleted from the server.
How can I 'clean up' after generating the pdf?
You can use PHP's file deletion function, unlink().
Call this function with the full path to the generated PDF file (or any file for that matter) and PHP will delete that file.
http://php.net/manual/en/function.unlink.php
You don't necessarily have to delete the file immediately after the user has downloaded it. You can just as easily place all the generated files in one central folder and have a cron job run a more general clean-up script that simply removes the older files.
One method could be:
Scan the contents of the folder using scandir().
Iterate over its files in a foreach loop.
Inspect the last-modified time of each file using filemtime() (for a generated file that is never edited, this is effectively its creation time).
If the file was last modified over an hour ago, delete it using unlink().
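The steps above can be sketched as follows. The function name deleteOldFiles() and the one-hour default are assumptions for illustration; the script would be run from cron against the folder that holds the generated PDFs.

```php
<?php
// Hypothetical clean-up helper: deletes regular files in $dir whose
// last modification is older than $maxAgeSeconds.
// Returns the names of the files that were deleted.
function deleteOldFiles(string $dir, int $maxAgeSeconds = 3600): array
{
    $deleted = [];
    $now = time();

    $entries = scandir($dir);
    if ($entries === false) {
        return $deleted;
    }

    clearstatcache(); // make sure filemtime() sees fresh values

    foreach ($entries as $entry) {
        $path = $dir . DIRECTORY_SEPARATOR . $entry;

        // Skips "." and ".." as well as any subdirectories.
        if (!is_file($path)) {
            continue;
        }

        $modified = filemtime($path);
        if ($modified !== false
            && ($now - $modified) > $maxAgeSeconds
            && unlink($path)) {
            $deleted[] = $entry;
        }
    }

    return $deleted;
}
```

Keeping the generated files in a folder of their own matters here, because the function removes every old file it finds, not only PDFs.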
Because you are generating the PDF file yourself within your PHP code, I didn't mention the permissions consideration. Here would be a good place to mention that your PHP must have the correct file system permissions in order to perform any action on the file system. You are creating a PDF file so it's safe to assume that you have the correct permissions to make changes to the file system but if you plan on using this unlink() function in other scripts make sure that the files you are dealing with have the correct permissions set.
If you don't add the 'F' flag to the output function there will be no pdf files stored on the server at all:
$pdf->output('sample.pdf', 'F'); //stores PDF on server
In your case the script itself behaves like an actual pdf file. So, creating a link to the script is just like a link to the pdf, except that the PDF is created every time the script is requested. To tell the browser it's a PDF the content-type response header must be set to application/pdf:
content-type: application/pdf
This way the browser knows that it's a PDF even if the URL ends in .php. You can use a rewrite engine to make it end in .pdf or whatever else.
Sending the headers is done by FPDF/TCPDF. In short: you don't have to do any cleanup, because no PDF file is stored on the server.
If you wonder what the name is for then, try saving the PDF file. The suggested name when saving will be sample.pdf.
Reference:
PHP header() function, at the examples there is one for sending pdf
FPDF::Output()
TCPDF::Output()
I need to find a certain key in a PDF file. As far as I know, the only way to do that is to interpret the PDF as a text file. I want to do this in PHP without installing an add-on, framework, etc.
Thanks
You can certainly open a PDF file as text. The PDF file format is actually a collection of objects. There is a header in the first line that tells you the version. You would then go to the bottom of the file to find the offset of the start of the xref table, which tells you where all the objects are located. The contents of individual objects in the file, like graphics, are often binary and compressed. The 1.7 specification can be found here.
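As a rough illustration of that layout, the sketch below pulls the version out of the %PDF- header and the number that follows the last startxref keyword near the end of the file. The function name readPdfLayout() is made up for this example, and the code only reads these two values; it does not parse the xref table or any objects.

```php
<?php
// Hypothetical helper: extracts the header version and the xref offset
// from raw PDF bytes. Returns null if either part cannot be found.
function readPdfLayout(string $data): ?array
{
    // The first line looks like "%PDF-1.7".
    if (!preg_match('/^%PDF-(\d+\.\d+)/', $data, $header)) {
        return null;
    }

    // The last "startxref" keyword is followed by the byte offset
    // of the cross-reference section.
    $pos = strrpos($data, 'startxref');
    if ($pos === false) {
        return null;
    }
    if (!preg_match('/^startxref\s+(\d+)/', substr($data, $pos), $match)) {
        return null;
    }

    return [
        'version'    => $header[1],
        'xrefOffset' => (int) $match[1],
    ];
}
```

With a real file you would pass in file_get_contents($path) and then seek to the returned offset to read the cross-reference data.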
I found this function, hope it helps.
http://community.livejournal.com/php/295413.html
You can't just open the file, as it is a binary dump of the objects used to create the PDF display, including encoding, fonts, text and images. I wrote a blog post explaining how text is stored at http://pdf.jpedal.org/java-pdf-blog/bid/27187/Understanding-the-PDF-file-format-text-streams
Thank you all for your help. I owe you this piece of code:
// Proceed if file exists
if(file_exists($sourcePath)){
    // Read the whole file in binary mode
    $pdfFile = fopen($sourcePath,"rb");
    $data = fread($pdfFile, filesize($sourcePath));
    fclose($pdfFile);
    // Check if file is encrypted or not
    // stripos() returns a position or false, so compare against false explicitly
    if(stripos($data,$searchFor) !== false){ // $searchFor = "/Encrypt"
        $counterEncrypted++;
    }else{
        $counterNotEncrpyted++;
    }
}else{
    $counterNotExisting++;
}
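The same check can be wrapped in a reusable function. This is a sketch with a hypothetical name, pdfLooksEncrypted(). It uses the case-sensitive strpos() instead of stripos(), because PDF names such as /Encrypt are case-sensitive, and it remains a heuristic: the string could in principle also appear in unrelated content.

```php
<?php
// Hypothetical helper: null if the file is missing or unreadable,
// otherwise true/false depending on whether "/Encrypt" occurs in it.
function pdfLooksEncrypted(string $path): ?bool
{
    if (!is_file($path)) {
        return null;
    }

    $data = file_get_contents($path);
    if ($data === false) {
        return null;
    }

    // strpos() returns a position (possibly 0) or false,
    // so compare against false explicitly.
    return strpos($data, '/Encrypt') !== false;
}
```

The three return values map onto the three counters in the snippet above: true for encrypted, false for not encrypted, and null for a file that does not exist.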