Could someone please explain how this works?
data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAEAAAABACAYAAACqaXHeAAACDUlEQVR4Xu2Yz6/BQBDHpxoEcfTjVBVx4yjEv+/EQdwa14pTE04OBO+92WSavqoXOuFp+u1JY3d29rvfmQ9r7Xa7L8rxY0EAOAAlgB6Q4x5IaIKgACgACoACoECOFQAGgUFgEBgEBnMMAfwZAgaBQWAQGAQGgcEcK6DG4Pl8ptlsRpfLxcjYarVoOBz+knSz2dB6vU78Lkn7V8S8d8YqAa7XK83ncyoUCjQej2m5XNIPVmkwGFC73TZrypjD4fCQAK+I+ZfBVQLwZlerFXU6Her1eonreJ5HQRAQn2qj0TDukHm1Ws0Ix2O2260RrlQqpYqZtopVAoi1y+UyHY9Hk0O32w3FkI06jkO+74cC8Dh2y36/p8lkQovFgqrVqhFDEzONCCoB5OSk7qMl0Gw2w/Lo9/vmVMUBnGi0zi3Loul0SpVKJXRDmphvF0BOS049+n46nW5sHRVAXMAuiTZObcxnRVA5IN4DJHnXdU3dc+OLP/V63Vhd5haLRVM+0jg1MZ/dPI9XCZDUsbmuxc6SkGxKHCDzGJ2j0cj0A/7Mwti2fUOWR2Km2bxagHgt83sUgfcEkN4RLx0phfjvgEdi/psAaRf+lHmqEviUTWjygAC4EcKNEG6EcCOk6aJZnwsKgAKgACgACmS9k2vyBwVAAVAAFAAFNF0063NBAVAAFAAFQIGsd3JN/qBA3inwDTUHcp+19ttaAAAAAElFTkSuQmCC
And how does this generate an image, and how do I create one myself? I have seen this many times in HTML.
Follow-up questions
How does this differ from using a URL as the src, in terms of loading time and HTTP requests? Does it make loading faster, and how would it scale if I were to use, say, 50 images?
Also: for uploads, would converting images to base64 and saving that in the database, rather than saving a URL, make a site better?
You can use it like this:
<img alt="Embedded Image" src="data:image/png;base64,{base64 encoding}" />
It's used to embed images inline in HTML or CSS, or to store images as plain text. You can read more about base64 encoding on Wikipedia:
http://nl.wikipedia.org/wiki/Base64
How does it work?
The input bytes are converted to binary (8 bits each)
The bits are taken in groups of 6
Each 6-bit group is converted to a decimal value
Each decimal value n selects the character at position n+1 in the base64 character table; the values range from 0 to 63
The number of bits is not always a multiple of 6. In that case, 2 or 4 zero bits are appended to complete the last group, and one or two = characters are added at the end as padding
Base64 character table
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
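To make those steps concrete, here is a rough sketch in PHP that follows the recipe above literally. The function name toBase64 is made up for illustration; real code should simply call base64_encode().
<?php
// Illustrative only: implements the 6-bit recipe above step by step.
function toBase64($input)
{
    $alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
    // 1. Convert every byte to its 8-bit binary representation
    $bits = '';
    foreach (str_split($input) as $char) {
        $bits .= str_pad(decbin(ord($char)), 8, '0', STR_PAD_LEFT);
    }
    // 2. Append zero bits until the length is a multiple of 6
    $bits = str_pad($bits, (int)(ceil(strlen($bits) / 6) * 6), '0');
    // 3. Map each 6-bit group (a value 0-63) onto the character table
    $out = '';
    foreach (str_split($bits, 6) as $group) {
        $out .= $alphabet[bindec($group)];
    }
    // 4. Append '=' until the length is a multiple of 4
    return str_pad($out, (int)(ceil(strlen($out) / 4) * 4), '=');
}
echo toBase64('Man'); // TWFu, identical to base64_encode('Man')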
Different languages and usage
PHP
<?php
base64_encode($source);
// Or decode:
base64_decode($source);
Python
>>> import base64
>>> encoded = base64.b64encode(b'data to be encoded')  # Python 3: bytes in, bytes out
>>> encoded
b'ZGF0YSB0byBiZSBlbmNvZGVk'
>>> data = base64.b64decode(encoded)
>>> data
b'data to be encoded'
Objective C
// Encoding
NSData *plainData = [plainString dataUsingEncoding:NSUTF8StringEncoding];
NSString *base64String = [plainData base64EncodedStringWithOptions:0];
NSLog(#"%#", base64String); // Zm9v
// Decoding
NSData *decodedData = [[NSData alloc] initWithBase64EncodedString:base64String options:0];
NSString *decodedString = [[NSString alloc] initWithData:decodedData encoding:NSUTF8StringEncoding];
NSLog(#"%#", decodedString); // foo
The part after "base64," is a base64-encoded version of the binary PNG data. Since your question is tagged PHP, here's how you would generate it in PHP:
<?php
$img = file_get_contents('img.png');
echo "data:image/png;base64,".base64_encode($img);
How does it generate an image?
First, the browser recognizes the src of the image as a Data URI (Chromium, for example, has a dedicated parser for this). The parser parses the URI, finds that it is a base64-encoded image, and decodes it with a base64 decoder into a binary object, equivalent to any normal image file. That binary object is then used when rendering the page.
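As a rough PHP sketch of that parsing step (simplified: real browsers also handle URL-encoded payloads, charset parameters, and malformed input):
<?php
// Round trip: build a data URI, then take it apart the way a browser would.
$uri = 'data:image/png;base64,' . base64_encode(file_get_contents('img.png'));
list($meta, $payload) = explode(',', $uri, 2);   // header vs. data
$binary = substr($meta, -7) === ';base64'
    ? base64_decode($payload)                    // base64 branch
    : rawurldecode($payload);                    // plain/URL-encoded branch
// $binary now holds the same bytes as img.png, ready for the renderer.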
How does this differ on a URL as a src in terms of loading time and HTTP request?
Since no separate HTTP request is made and the image data is already in memory, data URIs should display faster.
Does this make loading time faster? How would it scale if i am to use, say 50 images?
The page loading time? It depends. A base64-encoded string is about a third larger than the original binary (4 output bytes for every 3 input bytes), so more data is transferred with the page load. Also, data URI images are not cached independently by the browser. That is a clear disadvantage if you show the same image on different pages, because you serve the base64 content every time. With a normal image URL you could instead set cache headers, serve the file once, and let the browser take the image from cache on subsequent page loads. It really depends on your specific usage, but now you know the intricacies of base64-encoded data URIs.
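You can check the overhead figure yourself with a quick sanity check in PHP (assumes PHP 7+ for random_bytes()):
<?php
// 3 input bytes become 4 output characters, so the ratio tends to 4/3.
$raw = random_bytes(30000);
echo strlen(base64_encode($raw)) / strlen($raw); // ~1.33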
Summing it up
+
Easier to generate/store
Uses a fixed, text-safe character set
Smaller perceived loading times
-
More data transferred
Requires decoding by the browser
No independent caching
Format: data:[<MIME-type>][;charset=<encoding>][;base64],<data>
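For example, encoding the plain-text string "Hello" with every field filled in gives:
data:text/plain;charset=UTF-8;base64,SGVsbG8=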
This method is called the data URI scheme. It is a URI scheme (Uniform Resource Identifier scheme) that provides a way to include data inline in web pages as if it were an external resource. It is a form of file literal or here document (an inline input-stream literal). This technique allows normally separate elements such as images and style sheets to be fetched in a single HTTP request rather than multiple requests, which can be more efficient.
Ref: http://en.wikipedia.org/wiki/Data_URI_scheme
Ref: http://en.wikipedia.org/wiki/Here_document
Related
The goal is to make an (empty) HTTP request from Angular 7 to PHP and receive binary data in Angular for use with protobuf3.
More specifically, the binary data in PHP (the source, encoded as described at https://developers.google.com/protocol-buffers/docs/encoding) is encapsulated in a string, while the goal in Angular is a Uint8Array.
Therefore, I currently have the following working code:
PHP Code (a simple ProcessWire root template):
header('Content-Type: application/b64-protobuf');
…
echo base64_encode($response->serializeToString());
Angular:
let res = this.httpClient.get(`${this.API_URL}`, { responseType: 'text' });
res.subscribe((data) => {
    // Undo the base64 step, then copy the binary string byte-for-byte into a Uint8Array
    let binary_string = atob(data);
    let len = binary_string.length;
    let bytes = new Uint8Array(len);
    for (let i = 0; i < len; i++) {
        bytes[i] = binary_string.charCodeAt(i);
    }
    let parsedResponse = pb.Response.deserializeBinary(bytes);
});
As you can see, I encode the data as base64 before sending it, so it is not as efficient as it could be, because base64 reduces the amount of information per character. I have already tried quite a lot to get plain binary transmission working, but the data always ends up corrupted, i.e. the variable bytes is not identical to the argument of base64_encode().
Still, according to some sources (e.g. PHP write binary response, Binary data corrupted from php to AS3 via http; nobody says it would not be possible), it should be possible.
So my question is: what must change to transfer the binary data directly? Is it even possible?
What have I tried?
using different headers, such as header('Content-Type: binary/octet-stream;'), and using Blob in Angular
removing base64_encode() from the PHP code and atob() from the Angular code; the result: the content is modified somewhere between serializeToString() and deserializeBinary(bytes), which is not desired
checking for stray characters before the opening <?php tag
Specifications:
PHP 7.2.11
Apache 2.4.35
Angular 7.0.2
If further information is needed, just let me know in the comments. I am eager to provide it. Thanks.
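For comparison, here is a minimal, untested sketch of what the direct-binary route could look like on the PHP side. It assumes the Angular client requests { responseType: 'arraybuffer' } and wraps the result in a Uint8Array instead of calling atob(); it is a sketch of the idea, not a verified fix for this setup.
<?php
// Hypothetical raw-binary variant of the template above.
$payload = $response->serializeToString();        // protobuf bytes, as in the question
header('Content-Type: application/octet-stream');
header('Content-Length: ' . strlen($payload));
echo $payload;
exit; // stop the framework from appending anything after the bytes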
I seem to be stuck sending compressed messages from PHP to NodeJS over Amazon SQS.
Over on the PHP side I have:
$SQS->sendMessage(Array(
    'QueueUrl' => $queueUrl,
    'MessageBody' => 'article',
    'MessageAttributes' => Array(
        'json' => Array(
            'BinaryValue' => bzcompress(json_encode(Array('type'=>'article','data'=>$vijest))),
            'DataType' => 'Binary'
        )
    )
));
NOTE 1: I also tried putting the compressed data directly in the message body, but the library gave me an error about invalid byte data.
On the Node side, I have:
body = decodeBzip(message.MessageAttributes.json.BinaryValue);
Where message comes from an sqs.receiveMessage() call; that part works, since it worked for raw (uncompressed) messages.
What I am getting is TypeError: improper format.
I also tried using:
PHP - NODE
gzcompress() - zlib.inflateraw()
gzdeflate() - zlib.inflate()
gzencode() - zlib.gunzip()
Each of those pairs gave me its own version of the same error (essentially, that the input data is malformed). Given all that, I began to suspect that the error is somewhere in the message transmission.
What am I doing wrong?
EDIT 1: It seems the error happens in transmission, since bin2hex() in PHP and .toString('hex') in Node return totally different values. Apparently the PHP SDK for Amazon SQS transfers a BinaryValue using base64, but Node fails to decode it. I managed to partially decode it by turning off automatic conversion in the AWS config file and then manually decoding the base64 in Node, but it still could not be decoded fully.
EDIT 2: I managed to accomplish the same thing by using base64_encode() on the PHP side and sending the base64 string as the MessageBody (not using MessageAttributes). On the Node side I used new Buffer(messageBody, 'base64') and then decodeBzip on that. It all works, but I would still like to know why MessageAttributes is not working as it should. The current base64 adds overhead, and I would like to use the service as intended, not through workarounds.
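In code, the working approach from EDIT 2 looks roughly like this (sketched from the description above; untested):
<?php
// Compress, then base64-encode, and ship the text as the plain message body.
$SQS->sendMessage(Array(
    'QueueUrl'    => $queueUrl,
    'MessageBody' => base64_encode(bzcompress(json_encode(Array('type' => 'article', 'data' => $vijest)))),
));
// Node side: decodeBzip(Buffer.from(message.Body, 'base64'))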
This is what all the SQS libraries do under the hood; you can read the PHP source of the SQS library and see for yourself. Binary data will always be base64 encoded (whether you use MessageAttributes or not) to satisfy the API requirement of form-url-encoded messages.
I do not know how large the data in your $vijest is, but I am willing to bet that after zipping and then base64 encoding it, it will be bigger than before.
So my answer to you comes in two parts (plus a third if you are really stubborn):
Looking at the underlying raw API, it is absolutely clear that not using MessageAttributes does NOT add base64 overhead. If anything, using MessageAttributes adds slight overhead because of the data structure the SQS PHP library enforces. So putting the base64 in the message body is clearly NOT a workaround, and you should do it that way if you want to zip the data yourself and you got it working.
Because of the nature of an HTTP POST request, it is a bad idea to compress your data inside the application: the base64 overhead will likely cancel out the compression advantage, and you are probably better off sending plain text.
If you absolutely do not believe me or the API spec or the HTTP spec and want to proceed, then I would advise to send a simple short string 'teststring' in the BinaryValue parameter and compare what you sent with what you got. That will make it very easy to understand the transformations the SQS library is doing on the BinaryValue parameter.
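For instance, a sketch of that experiment, reusing names from the question:
<?php
// Send a known value, then hex-dump both ends to see what the library did.
$SQS->sendMessage(Array(
    'QueueUrl' => $queueUrl,
    'MessageBody' => 'probe',
    'MessageAttributes' => Array(
        'test' => Array('BinaryValue' => 'teststring', 'DataType' => 'Binary'),
    ),
));
echo bin2hex('teststring'); // compare with .toString('hex') on the Node side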
gzcompress() is decoded by zlib.inflate(), gzdeflate() by zlib.inflateRaw(), and gzencode() by zlib.gunzip(). So out of the three pairs you listed, two are wrong, but one should work.
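Side by side, the corrected pairing (PHP shown; the matching Node decoder for each format is in the comments):
<?php
$data = 'teststring';
$zlibWrapped = gzcompress($data); // zlib format  -> zlib.inflate()    in Node
$rawDeflate  = gzdeflate($data);  // raw DEFLATE  -> zlib.inflateRaw()
$gzipped     = gzencode($data);   // gzip format  -> zlib.gunzip()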
Creating bzip2-archived data in PHP is very easy thanks to bzcompress. In my present application I cannot reasonably read the input file into a string and then call bzcompress or bzwrite in one go. The PHP documentation does not make it clear whether successive calls to bzwrite with relatively small amounts of data will yield the same result as compressing the whole file in one single swoop. I mean something along the lines of
$data = file_get_contents('/path/to/bigfile');
$cdata = bzcompress($data);
I tried out a piecemeal bzcompression using the routines shown below
function makeBZFile($infile,$outfile)
{
    $fp = fopen($infile,'r');
    $bz = bzopen($outfile,'w');
    // Compress the input in 10240-byte slices
    while (!feof($fp))
    {
        $bytes = fread($fp,10240);
        bzwrite($bz,$bytes);
    }
    bzclose($bz);
    fclose($fp);
}
function unmakeBZFile($infile,$outfile)
{
    // Assumes $outfile does not already exist, since each chunk is appended
    $bz = bzopen($infile,'r');
    while (!feof($bz))
    {
        $str = bzread($bz,10240); // returns up to 10240 bytes of UNCOMPRESSED data
        file_put_contents($outfile,$str,FILE_APPEND);
    }
    bzclose($bz); // was missing: release the decompressor's handle
}
set_time_limit(1200);
makeBZFile('/tmp/test.rnd','/tmp/test.bz');
unmakeBZFile('/tmp/test.bz','/tmp/btest.rnd');
To test this code I did two things:
I used makeBZFile and unmakeBZFile to compress and then decompress an SQLite database, which is what I eventually need to do.
I created a 50 MB file filled with random data: dd if=/dev/urandom of=/tmp/test.rnd bs=50M count=1
In both cases I ran diff original.file decompressed.file and found that the two were identical.
All very nice, but it is not clear to me why this works. The PHP docs state that bzread(bzpointer, length) reads a maximum of length bytes of UNCOMPRESSED data. If my code above is working, it is because I am forcing the bzwrite and bzread sizes to 10240 bytes.
What I cannot see is how bzread knows how to fetch length bytes of UNCOMPRESSED data. I checked the format of a bzip2 file, and I cannot see anything there that would easily establish the uncompressed data length for a chunk of the .bz file.
I suspect there is a gap in my understanding of how this works, or else the fact that my code appears to perform a correct piecemeal compression is purely accidental.
I'd much appreciate a few explanations here.
To understand how decompression knows how many bytes to return, you first have to understand how the compression works.
BZIP2
The crucial algorithm in BZIP2 is the Burrows-Wheeler transform (BWT), which converts the original data into a form suitable for the subsequent coding stage; the current version applies Huffman coding. The compression algorithm processes the data in blocks that are totally independent of each other. The block size can be set in a range from 1 to 9 (100,000 to 900,000 bytes).
BZIP2 Data Structure
The compressed stream starts with the two characters 'BZ', followed by one byte identifying the algorithm used. The block size indicator follows immediately and is valid for the entire file (h1, h2, h3 up to h9); the parameter gives the block size in units of 100,000 bytes (1-9).
The original data are stored in blocks of the selected size, each individually protected with a CRC32 checksum, and a 48-bit identifier introduces each block. This block structure allows partial reconstruction of damaged files, and it is also why your piecemeal bzread works: each block carries everything needed to decompress it.
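You can see this header for yourself on the file your test produced (path taken from the question):
<?php
$fh = fopen('/tmp/test.bz', 'r');
echo fread($fh, 4); // prints "BZh" followed by the block-size digit (1-9)
fclose($fh);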
GZIP/BZIP
Gzip and bzip2 are functionally similar. One advantage of gzip is that it can compress a stream, a sequence where you can't look behind; this makes it the standard compressor for HTTP streams. The gzip formats are published documents: RFC 1951 (DEFLATE Compressed Data Format Specification) and RFC 1952 (gzip File Format Specification).
I am trying to decompress blocks of data that were compressed with zlib, and the author remarks that to decompress them I must use inflateInit and inflate with Z_SYNC_FLUSH. I am sure this must work, because it works in PHP this way:
$temp = substr($temp, 2, -4);        // strip the 2-byte zlib header and 4-byte Adler-32 trailer
$temp[0] = chr(ord($temp[0]) | 1);   // force the last-block (BFINAL) bit on the first deflate block
$temp = gzinflate($temp);            // gzinflate() expects raw deflate data
but I have checked many methods to decompress this in C++, and every time they fail. Here is one of them:
char compressedblockbuffer[3371];
char uncompressedblockbuffer[8192];
is.read(compressedblockbuffer, 3371);
z_stream strm;
strm.zalloc = Z_NULL;
strm.zfree = Z_NULL;
strm.opaque = Z_NULL;
strm.avail_in = 3371;
strm.next_in = (Bytef *)compressedblockbuffer;
strm.avail_out = 8192;
strm.next_out = (Bytef *)uncompressedblockbuffer;
inflateInit(&strm);
inflate(&strm, Z_SYNC_FLUSH);
inflateEnd(&strm);
This is not the full code, just an example to show the problem; that is why I specified the already-known sizes.
I am using the latest zlib release, so maybe something changed in zlib's inflate since 2003-2004?
The result: uncompressedblockbuffer contains '\0' at indexes 2, 3 and 4 and at many other positions, so if I print it to the console I only see the first two elements.
If gzinflate() in PHP works on the data, then your code won't. gzinflate() expects raw deflate data. Your code is looking for zlib-wrapped deflate data. If you want to decode raw deflate data, you need to use inflateInit2(&strm, -15) instead.
Your call to inflate() is likely returning an error that you are not checking for. You need to always check the return codes of the zlib routines, or for that matter any function that has the potential to return an error.
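The PHP counterparts of those two formats make the distinction easy to see (a small sketch):
<?php
$raw  = gzdeflate('foo');   // raw deflate  -> decode with gzinflate()
$zlib = gzcompress('foo');  // zlib-wrapped -> decode with gzuncompress()
// The substr($temp, 2, -4) in the question strips the zlib header and
// trailer precisely so that gzinflate() will accept the rest; in C,
// inflateInit2(&strm, -15) handles raw deflate natively.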
What kind of data are you decompressing? Many binary formats are perfectly accepting of NUL bytes in their data, since it just reads as a value of 0. For example, inside of image data in many formats, it'd just represent a value of 0 in either that channel or pixel (depending on data size). Not to mention, binary formats don't necessarily read as bytes. A NUL byte may actually be a part of a 2- or 4-byte value.
This is the problem with trying to read binary data as a character string. Binary data needn't follow the rules of text. This is why usually the data boundary is a separate size value, because it can't terminate on NUL values like text.
If you have the original uncompressed data for comparison, either load that data into memory and compare the data, or save the decompressed data to a file and use a diff tool to do a binary comparison of the files.
I have a PHP webservice which currently returns a zip archive as its only output. I'm reading the zip archive from disk using file_get_contents and sending it back as the body of the response.
I'd like it to return some additional metadata, in a JSON format:
{
    "generatedDate": "2012-11-28 12:00:00",
    "status": "unchanged",
    "rawData": <zip file in raw form>
}
The iOS app which talks to this service will receive this response, parse the JSON, and then store the zip file locally for its own use.
However, if I try to stuff the result of file_get_contents into json_encode, it rightfully complains that the string is not in UTF-8 format. If I UTF-8-encode it using mb_convert_encoding($rawData, 'UTF-8', mb_detect_encoding($rawData, 'UTF-8, ISO-8859-1', true)), it encodes happily, but I can't find a way to reverse the operation on the client: calling [dataString dataUsingEncoding:NSUTF8StringEncoding] and then treating the result as a zip file fails with "could not extract archive: Couldn't read pkzip local header".
Can anyone suggest a good way to insert a blob of raw data as one field in a JSON response?
Surely if you successfully included the raw data in the JSON, you would just have the opposite problem at the other end: whatever you use to decode the JSON won't be able to handle the raw data.
Instead, I would suggest that you send the raw data only in the response body, and use headers to send the metadata.
Strike this question.
It turns out that UTF-8-encoding raw data like this is nonstandard at best; the standard solution is to base64-encode it and then use a base64 decoder to recover it on the client:
$this->response(200, array('rawData' => base64_encode($rawData)));
...
NSString *rawDataString = [[response responseJSON] objectForKey:@"rawData"];
NSData *rawData = [Base64 decode:rawDataString];
ZIP archives are not text, they are binary files! Trying to convert your archive from ISO-8859-1 to UTF-8 makes as much sense as trying to rotate it.
There are several algorithms to serialize binary streams as text, but they all increase the file size. If that's not an issue, have a look at:
base64_encode()
bin2hex()
unpack()
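For completeness, the base64 route from the workaround above, sketched on the PHP side (archive.zip is a placeholder path):
<?php
$rawData = file_get_contents('archive.zip');
echo json_encode(array(
    'generatedDate' => '2012-11-28 12:00:00',
    'status'        => 'unchanged',
    'rawData'       => base64_encode($rawData), // decode with a base64 decoder on the client
));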