How to convert base64 encoded attachment to file (PDF, msg, eml) - php

We have an XML file exported from ServiceNow which we are trying to import into our custom PHP app.
Each attachment (<sys_attachment>) is split into chunks (<sys_attachment_doc>), which are ordered by the <position> element.
<sys_attachment>
<chunk_size_bytes>734003</chunk_size_bytes>
<compressed>true</compressed>
<content_type>application/pdf</content_type>
<encryption_context display_value="" />
<file_name>Filename.pdf</file_name>
</sys_attachment>
<sys_attachment_doc>
<data>[BASE64 ENCODED STRING HERE]</data>
<length>[STRING LENGTH]</length>
<position>0</position>
</sys_attachment_doc>
<sys_attachment_doc>
<data>[BASE64 ENCODED STRING HERE]</data>
<length>[STRING LENGTH]</length>
<position>1</position>
</sys_attachment_doc>
We have tried combining the string and base64_decoding it but to no avail.
<?php
header('Content-type: application/pdf');
header('Content-Disposition: attachment; filename="servicenow.pdf"');
//echo base64_decode($chunk0.$chunk1);
echo base64_decode($chunk0).base64_decode($chunk1);
?>
We are unable to find any documentation on how to convert these attachments to files outside of ServiceNow (in PHP). Is there an extra step that needs to be done before decoding the string and converting it to a file (PDF)?
Edit: I managed to solve it using @Joey's answer. I base64_decode each chunk and then combine the results. The combined string is actually gzip compressed, so we used gzdecode() to generate the PDF.
$attachment = base64_decode($chunk0).base64_decode($chunk1);
echo gzdecode($attachment);

One thing that may be tripping you up is the <compressed> flag. Since it reads true, the data is also gzipped: each attachment starts as a byte[], which is gzipped, broken into chunks, and then base64 encoded (per chunk!).
I don't know how to do this in PHP specifically, but this strategy should work:
Base64 decode of each chunk will give you a byte[] per chunk.
Combine those chunks in order of position to give you one big byte stream
gunzip that stream into another big byte stream which should be your file.
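The three steps above can be sketched in PHP roughly as follows. This is a minimal sketch, not ServiceNow's documented procedure: rebuild_attachment() and the shape of $chunks (position => base64 text) are hypothetical, and it assumes the XML has already been parsed into that array.

```php
<?php
// Hypothetical helper: rebuilds one attachment from its <sys_attachment_doc> chunks.
// $chunks maps each <position> value to the base64 text of the matching <data> element.
function rebuild_attachment(array $chunks, bool $compressed): string
{
    ksort($chunks, SORT_NUMERIC); // order the chunks by <position>

    $binary = '';
    foreach ($chunks as $position => $base64) {
        // Decode every chunk on its own: each one was encoded separately.
        $decoded = base64_decode($base64, true);
        if ($decoded === false) {
            throw new RuntimeException("Chunk $position is not valid base64");
        }
        $binary .= $decoded;
    }

    if (!$compressed) {
        return $binary;
    }

    // <compressed>true</compressed>: the joined bytes form one gzip stream.
    $file = gzdecode($binary);
    if ($file === false) {
        throw new RuntimeException('Joined chunks are not a valid gzip stream');
    }
    return $file;
}
```

The result can then be written with file_put_contents() or echoed after the Content-Type header, as in the question.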

Related

Remove 'E' output headers from base64 string in TCPDF

I'm using TCPDF using
$base64String = $pdf->Output('file.pdf', 'E');
So I can send the data via AJAX
The only problem is that it comes with header information in addition to the Base64 string
Content-Type: application/pdf;
name="FILE-31154d59f28c63efae86e4f3d6a00e13.pdf"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
filename="FILE-31154d59f28c63efae86e4f3d6a00e13.pdf"
So if I pass the resulting string to base64_decode(), or use it with PHPMailer as in my case, it errors. Is it possible to remove the headers so I only have the base64 string?
(The error is that the pdf can't be read by any PDF reader when opened)
I thought I'd be able to find something that solves this but I haven't found anything!!
UPDATE
This is what I've put in place to solve the issue
$base64String = preg_replace('/Content-[\s\S]+?;/', '', $base64String);
$base64String = preg_replace('/name=[\s\S]+?pdf"/', '', $base64String);
$base64String = preg_replace('/filename=[\s\S]+?"/', '', $base64String);
However it's not very elegant! So if anyone has a better solution please post it below :)
TCPDF docs are huge but unusable – it's easier to read the source code directly. It has those extra headers because you're asking for them by using the E output mode, which is intended for generating email messages.
For sending the PDF data as a PHPMailer attachment, you want the straight binary PDF data as a string, as provided by the S output mode, which you can pass straight into addStringAttachment(), and PHPMailer will handle all the encoding for you. All you have to do is this:
$mail->addStringAttachment($pdf->Output('file.pdf', 'S'), 'file.pdf');
To convert the PDF binary into base64, for example to use it in a JSON string, simply pass it through base64_encode:
$base64String = base64_encode($pdf->Output('file.pdf', 'S'));
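If the E output is all you have (for example, it was stored earlier), the headers can be dropped without a regex per header. The sketch below assumes the string follows the usual MIME layout, where the first blank line separates the headers from the base64 body; mime_part_to_binary() is a hypothetical name, not part of TCPDF.

```php
<?php
// Hypothetical helper: returns the decoded body of a single base64 MIME part.
// Assumption: headers and body are separated by the first blank line.
function mime_part_to_binary(string $mimePart): string
{
    // Split once at the first blank line (CRLF or LF line endings).
    $parts = preg_split('/\r?\n\r?\n/', $mimePart, 2);
    if ($parts === false || count($parts) !== 2) {
        throw new InvalidArgumentException('No blank line between headers and body');
    }

    // Non-strict decoding ignores the line breaks inside the body.
    $binary = base64_decode($parts[1]);
    if ($binary === false) {
        throw new InvalidArgumentException('Body is not valid base64');
    }
    return $binary;
}
```

Passing the result through base64_encode() again gives a clean, single-line base64 string.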

How to convert base64 string to video in PHP?

I have a base64 encoded string which my frontend team has provided me with. The string is a video which was encoded using base64. I want to convert that back into a video file using PHP.
I am currently just using the following to decode the string but I don't know how to proceed further.
$decoded = base64_decode ($encoded_string);
There seems to be a way to convert images from string using imagecreatefromstring() function, but I could not find a way to convert it into a video.
Thank you
You need to know the video file type. Then you can decode the string back to the original format:
$bytesWritten = file_put_contents('sample.mp4', base64_decode($encoded_string, true));
Video streams tend to be very large, so it isn't a good idea to convert them to plain text in the first place. We'd also need to know the exact mechanism (protocol, format...) used to deliver the base64 string. In any case, once there you can do something like this (error checking omitted for brevity):
$chunk_size = 8192; // Bytes (must be multiple of 4)
$input = fopen('php://input', 'rb');
$output = fopen('/tmp/foo.avi', 'wb');
while ($chunk = fread($input, $chunk_size)) {
    fwrite($output, base64_decode($chunk));
}
fclose($output);
fclose($input);
Smaller chunks reduce RAM usage and larger chunks improve I/O performance. You'll need to find a balance that works best for you.
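A read from a network stream may return fewer bytes than requested, and the text may contain line breaks, either of which would shift the 4-character base64 groups. A minimal sketch that carries leftover characters over to the next read is shown here; base64_decode_stream() is a hypothetical helper, and it assumes the input is otherwise valid base64.

```php
<?php
// Hypothetical helper: copies base64 text from stream $in to stream $out as decoded bytes.
// Returns the number of decoded bytes written.
function base64_decode_stream($in, $out, int $chunkSize = 8192): int
{
    $pending = ''; // characters not yet forming a complete 4-character group
    $written = 0;

    while (!feof($in)) {
        $read = fread($in, $chunkSize);
        if ($read === false) {
            throw new RuntimeException('Read failed');
        }
        // Drop whitespace so line breaks do not shift the 4-character groups.
        $pending .= preg_replace('/\s+/', '', $read);

        // Decode only the largest prefix whose length is a multiple of 4.
        $usable = strlen($pending) - (strlen($pending) % 4);
        if ($usable > 0) {
            $written += fwrite($out, base64_decode(substr($pending, 0, $usable)));
            $pending = (string) substr($pending, $usable);
        }
    }

    // Whatever is left only exists if the input length was not a multiple of 4.
    if ($pending !== '') {
        $written += fwrite($out, base64_decode($pending));
    }
    return $written;
}
```

With this carry-over, the read size no longer has to be a multiple of 4, so it can be tuned purely for memory use and I/O performance.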

Get compressed byte size after zlib_decode()?

I'm trying to use PHP to parse a custom gzip archive file format that was created in Delphi (not my code!). The format is basically:
4-byte integer: count of files in archive
for each compressed file:
4-byte integer: filename length [n]
[n] bytes: filename
4-byte integer: uncompressed file length [m]
[????] bytes: gzipped content
I can read the file and actually decode the first compressed file correctly by using zlib_decode() with a max uncompressed length of [m] bytes on the remainder of the file after I know the length ([m]), but then I'm stuck because I don't know how far into the substring I should go to find the next filename -- zlib_decode() doesn't return the number of compressed bytes that it processed before stopping. Since this is a custom format, it doesn't seem like I can use the normal gzopen()/gzread() functions because the entire file isn't compressed (I tried, it doesn't work).
This code works in Delphi because apparently you can pass a file handle back and forth between normal file reading functions and the System.ZLib decoding functions -- you can read [m] uncompressed bytes and the pointer will remain at the last compressed byte -- but PHP doesn't seem to support switching between read-as-normal and read-as-gzip on the fly that way.
Am I missing an obvious way in PHP to deal with a mixed-content file format like this, where metadata and compressed data are stacked together this way? Or am I out of luck without knowing the compressed data length?
A dirty workaround is to recompress the content of each file as I am able to parse it, use that to calculate the compressed length, and adjust the file pointer in the original file manually as follows:
$current_pos = ftell($handle);
$skip_length = strlen(gzencode($uncompressed_text,9,FORCE_DEFLATE));
fseek($handle, $skip_length+$current_pos);
This works, but feels very hack-ish. I'd still be open to any better approaches.
EDIT:
Just a note that this eventually failed. However, I was fortunate enough to know in advance the list of expected filenames and I was able to do the following (more reliable since zlib_decode() will decode as much as it can and discard the rest anyway):
foreach ($filenames as $thisFilename) {
    $thisPos = strpos($rawData, $thisFilename);
    // skip 8 bytes for filename size and uncompressed data size, which are useless info.
    $gzresult = zlib_decode(substr($rawData, $thisPos + strlen($table) + 8));
}
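Another possibility, on PHP 7.2 or later, is the incremental inflate API: after inflate_add() has decoded a stream, inflate_get_read_len() reports how many compressed bytes were consumed, which is the missing offset. The sketch below is untested against the real Delphi files and rests on two assumptions that are not stated in the question: the integers are little-endian, and each entry is a zlib-wrapped (RFC 1950) stream. parse_archive() is a hypothetical name.

```php
<?php
// Hypothetical parser for the layout described in the question.
// Assumptions: little-endian 32-bit integers, zlib-wrapped (RFC 1950) streams.
function parse_archive(string $raw): array
{
    $offset = 0;
    $count = unpack('V', $raw, $offset)[1]; // count of files in archive
    $offset += 4;

    $files = [];
    for ($i = 0; $i < $count; $i++) {
        $nameLength = unpack('V', $raw, $offset)[1]; // [n]
        $offset += 4;
        $name = substr($raw, $offset, $nameLength);
        $offset += $nameLength;
        $size = unpack('V', $raw, $offset)[1]; // uncompressed length [m]
        $offset += 4;

        // Inflate from the current offset; decoding stops at the end of the stream.
        $context = inflate_init(ZLIB_ENCODING_DEFLATE);
        $data = inflate_add($context, substr($raw, $offset));
        if ($data === false || strlen($data) !== $size) {
            throw new RuntimeException("Could not decode entry '$name'");
        }

        // Advance by the number of compressed bytes actually consumed.
        $offset += inflate_get_read_len($context);
        $files[$name] = $data;
    }
    return $files;
}
```

If the entries turn out to be gzip or raw deflate streams instead, ZLIB_ENCODING_GZIP or ZLIB_ENCODING_RAW would be the constant to try.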

PHP generate Excel/CSV file and send as UTF-8

I'm retrieving data from my Postgres DB in UTF-8. The db and the client_connection settings are in UTF-8.
Then I send 2 headers to the visitor:
header("Content-Type: application/msexcel");
header("Content-Disposition: $mode; filename=export.xls");
and start outputting plain text data in a CSV manner. This will open as a simple Excel file on the visitor's desktop.
$cols = array ("col1", "col2", "col3");
echo implode("\t", $cols)."\r\n";
Works fine, until special characters like é, è etc. are encountered.
I tried changing my client_encoding while retrieving the data from the db to latin-1, which works in most cases but not for all chars. So that is not a solution.
How could I send the outputted file as UTF-8? I don't think converting the data from the db to latin-1 is possible, since the char seems unknown in latin-1 ... so I need Excel to treat the file as UTF-8
I'd look into using the PHPExcel engine. It uses UTF-8 as default and it can generate a whole list of spreadsheet file types (Excel, OpenOffice, CSV, etc.).
I would recommend not sending plain-text and masquerading it as Excel. XLS files are typically binary, and while binary isn't required, the official Excel method of using non-binary data is to format it as XML.
You mention "CSV" in the title, but nothing about your problem includes anything related to CSV. I bring this up because I believe that you should actually change your tabs to commas, and then you could simply output a standard .csv file, which is read by Excel still but doesn't rely on undocumented or unstable functionality.
If you truly want to send application/msexcel, then you should use a real Excel library, because currently, you are not creating a real Excel file.
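A minimal sketch of the plain CSV route follows. It prepends a UTF-8 byte order mark, which Excel generally uses to detect the encoding; write_utf8_csv() is a hypothetical helper, and the exact behaviour may vary between Excel versions and locales.

```php
<?php
// Hypothetical helper: writes rows as comma-separated UTF-8 text to an open stream.
function write_utf8_csv($out, array $rows): void
{
    // UTF-8 byte order mark, so that Excel treats the file as UTF-8.
    fwrite($out, "\xEF\xBB\xBF");

    foreach ($rows as $row) {
        // fputcsv() handles quoting of delimiters and quotes inside fields.
        fputcsv($out, $row, ',', '"', '\\');
    }
}

// Possible use in a download script (headers must be sent before any output):
// header('Content-Type: text/csv; charset=UTF-8');
// header('Content-Disposition: attachment; filename="export.csv"');
// write_utf8_csv(fopen('php://output', 'wb'), $rows);
```

Some Excel locales expect a semicolon as the field separator, in which case the delimiter argument would need to change.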
Add ; charset=UTF-8 after application/xxxxxx. I use:
header("Content-Type: application/vnd.ms-excel; charset=UTF-8");
// header("Content-Length: " . strlen($thecontent)); // this is not mandatory
header('Content-Disposition: attachment; filename="file.xls"');
Try mb_convert_encoding function.
Try to use iconv, for converting string into required charset.
Have you tried applying utf8_encode() to the strings?
So something like: echo implode("\t", array_map('utf8_encode', $cols)) . "\r\n";
Not sure if that would work, but give it a go.

How can I convert a large string of base64 image data back to an image with PHP?

I have a large string of base64 image data (about 200K). When I try to convert that data by outputting the decoded data with the correct header, the script dies, as if there isn't enough memory. I get no error in my Apache logs. The example code I have below works with small images. How can I decode a large image?
<?php
// function to display the image
function display_img($imgcode,$type) {
    header('Content-type: image/'.$type);
    header('Content-length: '.strlen($imgcode));
    echo base64_decode($imgcode);
}
$imgcode = file_get_contents("image.txt");
// show the image directly
display_img($imgcode,'jpg');
?>
Since base64-encoded data is cleanly separated every 4 bytes (i.e. 3 bytes of plaintext are encoded into 4 bytes of base64-encoded text), you could split your b64 string into multiples of 4 bytes, and process them separately:
while (not at end of string) {
    take next 4096 bytes // for example - 4096 is 2^12, therefore a multiple of 4
                         // you could use much larger blocks, depends on your memory limits
    base64-decode them
    append the decoded result to a file, or a string, or send it to the output
}
If you have a valid base64 string, this will work identically to decoding it all at once.
OK, here is a closer resolution. While this seems to decode the base64 data in smaller chunks, I still don't get an image in the browser. If I echo the data before I place a header, I get output. Again, this works with a small image but not a large one. Thoughts?
<?php
// function to display the image
function display_img($file,$type) {
    $src = fopen($file, 'r');
    $data = "";
    while(!feof($src)) {
        $data .= base64_decode(fread($src, 4096));
    }
    $length = strlen($data);
    header('Content-type: image/'.$type);
    header('Content-length: '.$length);
    echo $data;
}
// show the image directly
display_img('image.txt','jpg');
?>
Content-length must specify the actual (decoded) content length, not the length of the base64 encoded data.
Though I'm not sure that fixing it would solve this problem...
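The decoded length can be computed from the encoded text without decoding it first: every 4 base64 characters yield 3 bytes, minus one byte per trailing = sign. A small sketch, assuming valid, padded base64 (base64_decoded_length() is a hypothetical name):

```php
<?php
// Hypothetical helper: size in bytes of the data a padded base64 string decodes to.
function base64_decoded_length(string $base64): int
{
    // Line breaks and spaces are not part of the encoded data.
    $clean = preg_replace('/\s+/', '', $base64);

    // Each trailing '=' stands for one byte that is absent from the last group.
    $padding = strlen($clean) - strlen(rtrim($clean, '='));

    return intdiv(strlen($clean), 4) * 3 - $padding;
}
```

In the first script, this value could be sent as the Content-length instead of strlen($imgcode).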
Save the base64 string to an image file using imagejpeg() or the correct function for the different formats, and then display the image with a simple <img> tag.
