Best way to split a base64 encoded PDF in PHP

I have a string containing a 2 page base64 encoded PDF file. The second page of the PDF is always garbage. (Terms and conditions from the web service that sent me the PDF.) I would like to be able to modify the PDF to drop the second junk page and re-encode it as base64 data, ideally without writing to the disk. Any suggestions?

Well, 24 hours and a lot of research later, I have come up with the answer. Brad is correct that this should probably have been split into two questions, although decoding the base64 data is very simple and has been covered multiple times on this site, so I will not go into detail on how to do that. The real kicker for me was finding a framework that would let you load the PDF from a string rather than from the disk. The answer is the Zend Framework (Zend_Pdf).
// Load the PDF from a string (already base64-decoded)
$pdf = Zend_Pdf::parse($pdfString);
// $id is the zero-based index of the page to drop (1 for the second page)
$id = 1;
// Remove the page from the pages array
unset($pdf->pages[$id]);
// Return the PDF document as a string, ready to be base64-encoded again
$pdfString = $pdf->render();

Related

saving PDFs for later manipulation in PHP

I have a PDF with some text in it that I would like to modify dynamically using PHP. This is being done already with another PDF, and what happens is that PHP simply replaces a token in the form %token% with another value pulled from a database. If you open that PDF in a text editor, you can find the %token% in plain text. But with this other PDF that I want to do the same thing with, if you open it in a text editor, there are no tokens in plaintext (even though I explicitly created one using Adobe Acrobat Pro). Obviously, the PDF's string content in this PDF is either encrypted, compressed, or both. What I want to know is how can I save a PDF so that the string content remains as plaintext such that PHP can manipulate it.
Please note, I do not want to dynamically create the whole PDF from scratch using some PHP library. I know that is something that can be done, but the PDF I am working with already exists and I just want to modify it slightly in the manner described.
For things like that I like to use the free command-line tool PDFtk, which can compress and decompress PDFs and do some other nice things. Have a look at: https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
PS: I edit a special pdf calendar form from the internet. I decompress it and replace the awful pink color for the weekend with gray (Saturday) and blue for Sunday. Violet I use as small bar to mark vacation days.
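A minimal sketch of that workflow in Python, assuming the pdftk binary is on the PATH. The helper names (pdftk_command, replace_token, fill_template) are made up for illustration. It also assumes the %token% appears as one contiguous run of bytes in the uncompressed file, which is not guaranteed, and that PDFtk copes with the changed stream lengths when it rewrites the file.

```python
import os
import subprocess
import tempfile
from pathlib import Path


def pdftk_command(src, dst, mode):
    """Build the pdftk call that rewrites src to dst, compressed or uncompressed."""
    if mode not in ("uncompress", "compress"):
        raise ValueError("mode must be 'uncompress' or 'compress'")
    return ["pdftk", src, "output", dst, mode]


def replace_token(pdf_bytes, token, value):
    """Replace every %token% in the raw (uncompressed) PDF bytes with value."""
    return pdf_bytes.replace(f"%{token}%".encode("latin-1"), value.encode("latin-1"))


def fill_template(src, dst, values):
    """Uncompress src, substitute the tokens, and write a re-compressed dst."""
    with tempfile.TemporaryDirectory() as tmp:
        plain = os.path.join(tmp, "plain.pdf")
        edited = os.path.join(tmp, "edited.pdf")
        subprocess.run(pdftk_command(src, plain, "uncompress"), check=True)
        data = Path(plain).read_bytes()
        for token, value in values.items():
            data = replace_token(data, token, value)
        Path(edited).write_bytes(data)
        subprocess.run(pdftk_command(edited, dst, "compress"), check=True)
```

Calling fill_template("form.pdf", "filled.pdf", {"name": "Ann"}) would run the whole round trip; the file names here are placeholders.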

Converting .pdf to .zpl

I need to convert a .pdf file to a .zpl label file for printing with Zebra printers, but is this even possible?
The PDF comes in as a base64 encoded string, and somehow I need to output that as a .zpl file.
I use PHP in my project and would prefer a PHP method, but basically any programming language is fine, as long as it gets the job done.
I was thinking about converting the PDF to an image (which seems to be possible, from some quick googling), and then from the image (PNG, JPG, etc.) to ZPL (which also seems to be possible), but does anyone have any knowledge about this kind of operation, or any insights, before I start? I'm on a tight schedule here, and I cannot afford any fruitless work.
Update 4.8.2016
I went the other way and created the ZPL from scratch, because that keeps our service faster than doing conversions. So I don't have any more info on this than what Google already offers, if someone comes wondering about this same thing. PS. ZPL isn't that hard of a language. ;)
I had the same problem to solve: take a PDF file, convert it into ZPL code somehow and print it using a Zebra printer.
Thanks to stackoverflow and the ZPL Programming Guide, I learned about embedding bitmaps with the Graphic Field command (^GF).
Basically, you have to do these steps:
1. Render a PDF file as a bitmap
2. Make it monochrome
3. Convert the bitmap into ASCII hexadecimal (as defined by ZPL)
4. Compress the ASCII hexadecimal (otherwise our printers struggled with megabytes of bitmap data)
5. Put the data into this code template: ^XA^GFA{some parameters and tons of bitmap data}^XZ
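The hexadecimal conversion and the code template can be sketched in Python as below. This is only an illustration under stated assumptions: the bitmap is already rendered and monochrome, given as rows of 0/1 values where 1 is a black dot. The function name bitmap_to_zpl is made up, the ^FO0,0 and ^FS commands are added here to position and close the field, and the compression step is left out.

```python
def bitmap_to_zpl(rows):
    """Convert rows of 0/1 pixels (1 = black dot) into a ZPL label using ^GFA."""
    width = len(rows[0])
    bytes_per_row = (width + 7) // 8  # each row is padded to a whole byte
    hex_rows = []
    for row in rows:
        if len(row) != width:
            raise ValueError("all rows must have the same width")
        # Pad the row with white pixels up to the byte boundary.
        padded = list(row) + [0] * (bytes_per_row * 8 - width)
        row_bytes = bytearray()
        for i in range(0, len(padded), 8):
            byte = 0
            for bit in padded[i:i + 8]:
                byte = (byte << 1) | (1 if bit else 0)  # most significant bit first
            row_bytes.append(byte)
        hex_rows.append(row_bytes.hex().upper())
    total_bytes = bytes_per_row * len(rows)
    data = "".join(hex_rows)
    # ^GFA parameters: byte count, field count, bytes per row, then the hex data.
    return f"^XA^FO0,0^GFA,{total_bytes},{total_bytes},{bytes_per_row},{data}^FS^XZ"


label = bitmap_to_zpl([
    [1, 0, 1, 0, 1, 0, 1, 0, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
])
print(label)  # ^XA^FO0,0^GFA,4,4,2,AAC00000^FS^XZ
```

Without compression, a full-page bitmap at 203 dpi produces a very large string, which is why the compression step matters in practice.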
Our services were running on ASP.NET so I wrote a C# library to do just that. It wraps the native calls to Google's PDFium for rendering and returns a string with valid ZPL code for printing.
You could convert base64 encoded PDF files like this:
public static string GetZplCode(string pdfBase64, int page = 0, int dpi = 203)
{
    // Render the requested page at the given resolution and return it as ZPL code
    return PDFtoZPL.Conversion.ConvertPdfPage(pdfBase64, page: page, dpi: dpi);
}
I hope this NuGet package will prevent others from spending weeks searching for how to make this ^GF command work.

How do I create a large pdf, semi quickly

I need to generate a large PDF, 2480 pages to be exact.
Currently I am using InDesign, and while the output is exactly what I want, I would rather not be involved in the document creation process.
It takes 31 minutes for InDesign to execute the data merge, generate the PDF, save the PDF, and save the pdf.indd file. (I don't really need the pdf.indd file, but I would rather not have to recreate the data merge if something were to happen to the PDF.)
I am hoping for a php, or similar solution. Currently my data is stored in MySQL.
The majority of the pdf is static text, with 19 dynamically driven text fields.
There is one image on the pdf, 75x100px @ 72dpi.
The output needs to be exact, the pdf file is printed and cut in half at 4.25 inches.
I have tried TCPDF; while it is fast at generating up to 50 pages, after that it would rather die than give me an output. I have also played with mPDF, and found it to be, ..., not as friendly. I have also considered generating many small files and using some utility to merge the smaller PDFs into one large PDF, though that seems like driving around the mountain.
Any thoughts would be helpful.
You certainly can create documents directly with PHP, but it can be difficult. One method is to use one of the various PDF classes to create the document, as you have found. Another is to create images (using ImageMagick, GD, etc.) and convert those to PDF. (This method is less efficient, because you are creating raster graphics, which makes each PDF page one giant graphic.)
However, I think you should consider simply scripting InDesign. InDesign has the capability to read data in via XML and create the document. This way, the design of your document isn't dependent on your programming abilities and you can still have the power of programmatically creating the document.
When it comes to a huge number of pages in a PDF, LaTeX is always the best answer. Nothing can really handle huge PDF generation as quickly, accurately and elegantly as LaTeX.
Check this question to see how to retrieve your data from the database.
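As a rough illustration of the LaTeX route, the sketch below turns database rows into a single .tex source with one page per record. Everything specific in it is an assumption: the field names "name" and "account", the page wording, and the helper names escape_latex and build_document. The rows would come from MySQL in practice, and the resulting file would then be compiled with a tool such as pdflatex.

```python
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def escape_latex(text):
    """Escape characters that have a special meaning in LaTeX."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in str(text))


def build_document(records):
    """Return LaTeX source with one page per record (hypothetical fields)."""
    pages = []
    for record in records:
        pages.append(
            "Dear " + escape_latex(record["name"]) + ",\n\n"
            "Your account number is " + escape_latex(record["account"]) + ".\n"
        )
    # Static text stays in the template; only the dynamic fields change per page.
    body = "\n\\newpage\n".join(pages)
    return "\\documentclass{article}\n\\begin{document}\n" + body + "\\end{document}\n"


source = build_document([
    {"name": "Ann & Co", "account": "A_100"},
    {"name": "Bob", "account": "B#2"},
])
```

Escaping character by character avoids double-escaping the backslashes that the replacements themselves introduce.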

PHP / Python script that base64encodes images in css

I find myself manually encoding background images in the css in base64 often.
By manually, I mean that I encode the image, copy the resulting string, paste it into the css file and so on. This is stupid!
I came to the conclusion that writing a script in PHP or Python that does it automatically would not be difficult. It's just a matter of parsing the css, finding the image on the HD, encoding it in base64, replacing the original string in the css file with the result, and saving a new file.
Then I thought: "how come nobody has already done this? Maybe it would be better to ask before doing it."
So here I am, does a similar solution exist?
Thanks
Well, Chris Coyier @ CSS-Tricks published an article about Data URIs, where he explains how to use them and why they are useful. Near the end, he states that it's very easy to generate them on the fly with PHP, like so:
If you are using PHP (or PHP as CSS), you could create data URIs on the fly like this
<?php echo base64_encode(file_get_contents("../images/folder16.gif")) ?>
However, take note that you shouldn't use base64_encode on all images on a website. The string generated by base64_encode is about 33% larger than the original image. Data URIs are great when you have small pictures and you don't want to waste requests on them.
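The script described in the question could look roughly like the Python sketch below. It is an illustration, not an existing tool: the function name inline_images, the max_bytes threshold and the simple url(...) regular expression are all assumptions, and a real stylesheet may need a proper CSS parser. Only small local images are inlined, in line with the size caveat above.

```python
import base64
import mimetypes
import re
from pathlib import Path

# Matches url(...) with optional quotes; a simplification of real CSS syntax.
URL_PATTERN = re.compile(r"url\(\s*['\"]?([^'\")]+?)['\"]?\s*\)")


def inline_images(css_text, base_dir, max_bytes=4096):
    """Replace small local url(...) references in css_text with base64 data URIs."""
    base_dir = Path(base_dir)

    def replace(match):
        ref = match.group(1)
        # Leave existing data URIs and remote images untouched.
        if ref.startswith(("data:", "http:", "https:", "//")):
            return match.group(0)
        path = base_dir / ref
        mime, _ = mimetypes.guess_type(path.name)
        # Skip unknown types, missing files, and images above the size limit.
        if mime is None or not path.is_file() or path.stat().st_size > max_bytes:
            return match.group(0)
        encoded = base64.b64encode(path.read_bytes()).decode("ascii")
        return f"url(data:{mime};base64,{encoded})"

    return URL_PATTERN.sub(replace, css_text)
```

Reading a stylesheet, passing its text through inline_images together with the directory the image paths are relative to, and writing the result to a new file completes the workflow. For scale: base64 turns every 3 bytes into 4 characters, so a 3000-byte image becomes a 4000-character string.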

Is it possible to save a word file in MYSQL database and to view the content "AS IT LOOKS" in the Browser

For example, I am uploading a Word file with some FORMATTED contents into the database. The content in the Word document is aligned.
I have got this far. My issue is how I can view the CONTENTS EXACTLY AS THEY LOOK (meaning with the exact formatting) IN A BROWSER.
Kindly help me out of this issue.
Thanks in Advance
Fero
You may stream the content in the body of the response as an attachment, setting the correct MIME type. If the user's client is configured to handle that content type, it will display it (after asking for permission).
PHP MIME Content Type
Word is a format for word processing, whereas the browser is a client for displaying web pages. So no, you can't. There are some similarities between the two formats, so you can transform between them, but usually at a loss. Since Word is a proprietary format, transforming it to HTML can be tricky, but you can generally use OpenOffice for the job.
Another alternative is, instead of uploading the file, to upload the content of the file through a JavaScript WYSIWYG editor like TinyMCE. Since you will be storing the HTML markup that the editor generates from the formatted contents you paste into it, displaying the contents will be very straightforward.
If the content doesn't need to be edited, why not convert it to a .PDF/.JPG on the fly, or do it once upon upload and cache the result?
