I am developing a PhoneGap app; one part of it is about 10 images that are base64 encoded and downloaded roughly once per week per user (100 users now, hopefully growing a lot).
My server is slow, which I am also working on, so delivery of these images is slow.
My question is:
Would it be faster, PHP- and server-wise, to base64 encode the images once and save them to a DB, fetching them from the DB on request, OR to base64 encode the image every time it is requested?
Thanks for your help.
It would definitely be faster to base64 encode the images and store the encoding.
This is a classic memory vs. speed tradeoff: you pay a lower computation cost for a higher memory cost. In this case, that means storing more data (about 8/6, i.e. roughly 1.33x, as much if you keep only the encoded version, and a little more than 2x if you keep the original too).
The best thing you could do is keep the images in memory, since this would avoid the cost of accessing the disk. You can do this with shared memory functions, or by abusing the session variables and assigning a fixed session id to retrieve content.
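For example, with the APCu extension (assuming it is installed; the cache key prefix and the path below are just illustrative), a sketch of this in-memory caching could look like:

<?php
// Minimal sketch: keep the base64-encoded image in shared memory via APCu,
// so repeat requests skip both the disk read and the encoding step.
function getEncodedImage(string $path): string {
    $key = 'b64img:' . md5($path);
    $cached = apcu_fetch($key, $hit);
    if ($hit) {
        return $cached;                       // served straight from memory
    }
    $encoded = base64_encode(file_get_contents($path));
    apcu_store($key, $encoded, 604800);       // cache for one week
    return $encoded;
}

echo getEncodedImage('/var/app/images/photo1.jpg');  // hypothetical path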
Without knowing the details of your app, it seems to me that having a db for just 10 images is overkill. The added overhead of running the db on your slow server will probably kill any benefits you may get from saving on base64 encoding.
I would store the base64 encoded images as files instead of a db, so that they can be served directly to the clients by your web server.
I would also make sure you can deliver data gzip compressed if the client can handle it, since base64 data compresses really well. This will reduce the traffic to your server considerably. See this.
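If PHP generates the response itself, one simple way to get that compression is ob_gzhandler, which only gzips when the client's Accept-Encoding allows it; a minimal sketch (the path is illustrative):

<?php
// Sketch: gzip the base64 payload when the client supports it.
ob_start('ob_gzhandler');                     // falls back to plain output otherwise
header('Content-Type: application/json');
echo json_encode([
    'image' => base64_encode(file_get_contents('/var/app/images/photo1.jpg')),
]);
ob_end_flush();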
You'll most likely be bandwidth-bound before your server becomes processor-bound. My thoughts:
Don't send base64-encoded images. Instead, send properly compressed binary data.
Don't have the client update unless it needs to (i.e. don't grab the image if there's no newer image to grab). Use 304 Not Modified responses and the Last-Modified/If-Modified-Since headers to keep track (see the sketch after this list).
Once things start to hit hard, use memcache/Redis instead of a database to store the "pre-digested" image data.
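As a rough sketch of the second point (the file path is illustrative, not from the question), a PHP endpoint can honour If-Modified-Since like this:

<?php
// Answer 304 Not Modified when the client's cached copy is still current.
$file  = '/var/app/images/photo1.jpg';
$mtime = filemtime($file);

if (isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) &&
    strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE']) >= $mtime) {
    http_response_code(304);                  // no body needed
    exit;
}

header('Content-Type: image/jpeg');
header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $mtime) . ' GMT');
readfile($file);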
Related
I am in the process of putting together a REST API for an image application to be consumed by an Angular frontend. The API is being put together using PHP. All of the images are securely stored outside of the webroot.
The problem is that I am converting all my images to base64, which increases the payload; in some cases I have 40 images displayed on a page, and it is not uncommon to wait 30-40 seconds due to the huge payload.
What is the best practice for serving images through a REST API? I have searched around, and there is nothing that exactly addresses the problem. Code below. The base64 images bloat the payload by an incredible amount. Any pointers please.
// Create presentation array
$presentation_arr = array();
$presentation_arr["records"] = array();

$LargeImageName = $slideName;
$LargefileDir   = $largefolder . $fileid . '/';
$Largefile      = $LargefileDir . $LargeImageName;

if (file_exists($Largefile)) {
    $b64largeImage = base64_encode(file_get_contents($Largefile));
    // Double quotes so the variable interpolates, and use the matching variable name
    $datafullpath = "data:image/jpeg;base64,$b64largeImage";
}

$presentation_item = array(
    "id"         => $id,
    "smallimage" => $b64image,
    "largeimage" => $b64largeImage
);

// Push into the same key that was initialised above
array_push($presentation_arr["records"], $presentation_item);
Two approaches:
Create a "wrapper" endpoint that is just a proxy to the final image itself (e.g. does a readfile() internally, see this: https://stackoverflow.com/a/1353867/1364793)
Host the images in a static, web-accessible folder (or even consider S3 as storage for static assets). Then your main endpoint just returns publicly accessible URLs to those.
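A rough sketch of the first approach (the directory and the userCanSee() authorisation check are placeholders, not part of the original code):

<?php
// Proxy endpoint: stream the image stored outside the webroot as plain binary,
// instead of base64-encoding it into the JSON payload.
$id   = basename($_GET['id'] ?? '');          // crude sanitising for the example
$file = '/srv/app/private-images/' . $id . '.jpg';

if (!userCanSee($id) || !is_file($file)) {    // hypothetical access check
    http_response_code(404);
    exit;
}

header('Content-Type: image/jpeg');
header('Content-Length: ' . filesize($file));
readfile($file);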
You write that you are serving images as base64 encoded blobs due to security concerns, including scraping.
To meet this security requirement, you are incurring a significant performance penalty: in server-side encoding work, in file transfer, and in rendering time on the client.
To improve server-side performance, you can cache the encoded version; you could write $b64largeImage to disk in the same directory, check whether it exists and send it to the client.
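A minimal sketch of that disk cache, reusing $Largefile from the question's code (the .b64 suffix is just a convention for this example):

<?php
// Pay the base64_encode() cost once per image instead of once per request.
$cacheFile = $Largefile . '.b64';

if (is_file($cacheFile) && filemtime($cacheFile) >= filemtime($Largefile)) {
    $b64largeImage = file_get_contents($cacheFile);
} else {
    $b64largeImage = base64_encode(file_get_contents($Largefile));
    file_put_contents($cacheFile, $b64largeImage);
}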
To improve transfer time, make sure you've got GZIP enabled on the server; this should compress your data.
However, client side performance will remain a problem - your images will most likely not be cached on the client, and decoding the images (especially if there are 40 on each page) can consume a decent amount of CPU (especially on mobile devices).
You then get the problem that if a browser can decode the image, an attacker/scraper can too, and they can store a copy of that image. So all that effort doesn't really buy you very much secrecy.
Of course, you may want to avoid having a 3rd party embedding your images in their pages, or you may want to avoid having them scrape your images easily.
In that case, you may want to focus on having URLs that are hard or impossible to guess, or that expire. This will hurt your SEO, so it's a trade-off. S3 has expiring URLs; you could also create a service that checks the referrer for each request and only honours image requests from whitelisted domains, or build your own expiring-image-URL service. In each case you'd serve JPEG/GIF/PNG images, so you get small file sizes and limited decoding time.
I am working in the Ionic framework, currently designing a posts page with text and images. Users can post their data and images, and everything is secure.
So I use base64 encoding and save the image in the database.
encodeURIComponent($scope.image)
Each time a user makes a request, I select rows from the table, display them along with the text, and decode them.
decodeURIComponent($scope.image)
with HTML "data:image/jpeg;base64,_______" conversion.
It works fine, but takes much more time than I expected. Also, the images are 33% bigger, and the payload looks bloated.
Then I decided to move to the Cordova file upload plugin. But I realised that maintaining files that way is risky and complicated. I also tried to save binary data into the database, but failed.
Selecting the text without the base64 data dramatically reduces the time. Would it be possible to select each image individually in a separate HTTP call, after selecting and displaying the other columns? Is that the right mechanism for handling secure images?
As a rule of thumb, don't save files in the database.
What does the MySQL manual have to say about it?
http://dev.mysql.com/doc/refman/5.7/en/miscellaneous-optimization-tips.html
With Web servers, store images and other binary assets as files, with
the path name stored in the database rather than the file itself. Most
Web servers are better at caching files than database contents, so
using files is generally faster. (Although you must handle backups and
storage issues yourself in this case.)
Don't save base64-encoded files in a database at all
It works fine, but takes much more time than I expected. Also, the images
are 33% bigger, and the payload looks bloated.
As you discovered, there is unwanted overhead in encoding/decoding, plus extra space used up, which means extra data transferred back and forth as well.
As @mike-m has mentioned, Base64 encoding is not a compression method. What Base64 encoding is used for is also answered by the link that @mike-m posted: What is base 64 encoding used for?
In short, there is nothing to gain and much to lose by base64 encoding images before storing them on the file system, be it S3 or otherwise.
What about gzip or other forms of compression without involving base64? Again, the answer is that there is nothing to gain and much to lose. For example, I just gzipped a 1,941,980-byte JPEG image and saved 4,000 bytes; that's a 0.2% saving.
The reason is that images are already in compressed formats. They cannot be compressed any further.
When you store images without compression they can be delivered directly to browsers and other clients and they can be cached. If they are compressed (or base64 encoded) they need to be decompressed by your app.
Modern browsers are able to display base64 images embedded in the HTML, but then they cannot be cached, and the data is about 30% larger than it needs to be.
Is this an exception to the norm?
Users can post their data and images, and everything is secure.
I presume that you mean a user can download images that belong to him or are shared with him. This can be easily achieved by saving the files outside the web root on the file system and storing only the path in the database. Then the file is sent to the client (after doing the required checks) with fpassthru, as sketched below.
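A minimal sketch of that (the column name and the ownership check are placeholders for your own schema and authorisation logic):

<?php
// Serve a file stored outside the web root, after checking the user may see it.
$path = $row['image_path'];                    // path previously saved in the DB
if (!userOwnsImage($currentUserId, $row)) {    // hypothetical access check
    http_response_code(403);
    exit;
}

header('Content-Type: image/jpeg');
header('Content-Length: ' . filesize($path));
$fp = fopen($path, 'rb');
fpassthru($fp);
fclose($fp);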
What about when I grow to 100,000 users?
How are the image files handled then? Performance-wise, when a large number
of users is involved, it seems to me I would need 100,000 folders for
100,000 users, plus their subfolders. When a large number of users browse
the same root folder, how does the file system handle each unique folder?
Use a CDN, or use a file system that's specially suited for this, like Btrfs.
A database has good search facilities, good thread-safe connections, and good session management. Does this change when large operations are involved?
Yes indeed. Use it to the fullest by saving all the information about the file, including its path, in the database. Then save the file itself on the file system. You get the best of both worlds.
Since they're just personal files, you could store them in S3.
To be safe about file uploads, just check the file's MIME type before uploading to whatever storage you choose.
http://php.net/manual/en/function.mime-content-type.php
just run a quick check on the uploaded file:
$mime = mime_content_type($file_path);
if ($mime == 'image/jpeg') return true; // accept only JPEGs; extend the check for other allowed types
no big deal!
Keeping files in the database is bad practice; it should be your last resort. S3 is great for many use cases, but it's expensive at high usage, and local files should be used only for intranets and non-publicly-available apps.
In my opinion, go S3.
Amazon's SDK is easy to use and you get 1 GB of free storage for testing.
You could also use your own server, just keep it out of your database.
Solution for storing images on filesystem
Let's say you have 100,000 users and each one of them has 10 pics. How do you handle storing them locally?
Problem: a Linux filesystem slows down badly once a single directory holds tens of thousands of files, so your file structure should avoid that.
Solution:
Make the folder path be 'floor(userID/1000)*1000'/userID
That way, when you have the user with id 989787, their images will be stored in the folder
989000/989787/img1.jpeg
989000/989787/img2.jpeg
989000/989787/img3.jpeg
and there you have it, a way of storing images for a million users that doesn't break the unix filesystem.
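A small sketch of that bucketing in PHP (the root path is illustrative); note it is an integer division rounded down, which is what produces 989000 for user 989787:

<?php
// Bucket users into parent folders holding at most 1000 user directories each.
function imageDir(int $userId, string $root = '/var/app/uploads'): string {
    $bucket = intdiv($userId, 1000) * 1000;    // 989787 -> 989000
    return sprintf('%s/%d/%d', $root, $bucket, $userId);
}

echo imageDir(989787);                         // /var/app/uploads/989000/989787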
How about storage sizes?
Last month I had to compress 1.3 million JPEGs for the e-commerce site I work on. When uploading images, compress them using Imagick, stripping metadata and re-encoding at 80% quality. That removes data the user never sees and optimizes your storage. Since our images vary from 40x40 (thumbnails) to 1500x1500 (zoom images), with an average around 700x700, the 1.3 million images filled around 120 GB of storage.
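A hedged sketch of that re-compression step with the Imagick extension (the exact flags were not given above, so these are one reasonable choice; the path is illustrative):

<?php
// Re-encode an uploaded JPEG: strip metadata and save at 80% quality.
$path = '/var/app/uploads/989000/989787/img1.jpeg';
$img  = new Imagick($path);
$img->stripImage();                            // drop EXIF and other metadata
$img->setImageCompression(Imagick::COMPRESSION_JPEG);
$img->setImageCompressionQuality(80);
$img->writeImage($path);
$img->clear();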
So yeah, it's possible to store it all on your filesystem.
When things start to get slow, you hire a CDN.
How will that work?
The CDN sits in front of your image server. Whenever the CDN is asked for a file and doesn't find it in its storage (a cache miss), it copies it from your image server. Later, when the CDN is asked again, it delivers the image from its own cache.
This way no code changes are needed to migrate to CDN image delivery; all you need to do is change the URLs in your site and hire a CDN. The same works for an S3 bucket.
It's not a cheap service, but it's way cheaper than CloudFront, and when you get to the point of needing it, you can probably afford it.
I would suggest you continue with the base64 string, but use the LZ-string compression technique to reduce its size. I've been using it and it's working pretty well.
I don't know how close this is to your question, but I hope it helps you out.
Here is LZ compression technique : https://github.com/pieroxy/lz-string/
Hi, I am developing a small school app using Angular.js and PHP that captures images from a canvas and then stores them in a MySQL DB.
The images are stored as image/png;base64 strings.
My problem is that when I load these images (300 images of 11.2 KB each) it takes a lot of time to display them on screen (approx. 1 minute for ~3 MB of HTML).
Are there any libraries or angular directives that I can use to cache these images when displaying?
Well, first of all, storing images in the DB is quite a bad practice. I would recommend saving images as binary files and storing the upload paths.
Second, storing large images in base64 takes more memory (base64 images are usually ~33% larger than binaries), and, as claimed here, rendering them might take significantly more time.
base64 encoded data may possibly take longer to process than binary
data (again, this might be exceptionally painful for mobile devices,
which have more limited CPU and memory)
BTW if I misunderstood you and the bottleneck is reading images from the db, then storing images as files would also help ;)
Let me first establish what I want to do:
My user is able to record voicenotes on my website, add tags to said notes for indexing as well as a title. When the note is saved I save the path of the note along with the other info in my DB.
Now, I have 2 choices to do the recording, both involve a .swf embedded in my site:
1) I could use Red5 server to stream the audio to my server and save the file and return the path to said file to my app to do the DB saving, seems rather complicated since I would have to convert the audio and move it to the appropriate folder that belongs to the user in a server side Red5 app, which I'm not very aware of how to build.
2) I could simply record the audio and grab its byte array, do a Base64 encoding on it and send it to PHP along with the rest of the data that is necessary (be it by a simple POST or an AJAX call), decode it on the server and make the file with the appropriate extension, audio conversion would also occur here using ffmpeg, this option seems simpler but I do not know how viable it is.
What option would you say is more viable and easier to develop? Thanks in advance
Depending on the planned duration of the recording, you may very well be able to use option number two. I recently used a similar approach successfully for a project, but recordings were only up to 30 seconds or so. Here's what I did differently from what you're suggesting though, and why I think it's better:
To capture the sound from the microphone and store it to a ByteArray, use the SAMPLE_DATA event which is dispatched whenever more sound data comes in from the microphone. There's an example in the documentation that should explain this well enough.
Because most users would be on normal home computers without any special recording equipment, it was safe to assume that the full fidelity of the recording is not necessary. I used just 2 bytes per sample, and only mono, instead of using the full 64 bit floats (AS3 Number) that you get from the microphone on the SAMPLE_DATA event. Simply read the Number and do myFloatSample * 0x7fff to convert to 16 bit signed integer.
Don't use the native 44.1kHz sampling rate if you're just recording speech or something else in that frequency range. You will likely get away just fine with 22.05kHz, which will cut the amount of data in half straight away. Just set the Microphone.rate property accordingly.
Don't use Base64 to encode your data. Send it as binary data, which will be significantly smaller. You can send it as raw POST data, or using something like AMF. Also, before you send it, use the native compress() or deflate() methods on the ByteArray to compress it. On the server, decompress using the ZLIB or raw DEFLATE (inflate) algorithms respectively, which PHP supports.
Once decompressed on the server, what you have is essentially what is called a raw 16-bit mono PCM stream. Incidentally, that should be one of the very input formats that ffmpeg (or lame) supports, so you should be able to encode it to mp3 without having to do any manual decoding first.
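On the PHP side, a rough sketch of that pipeline might look like the following (paths, sample rate, and endianness are assumptions; use gzinflate() instead of gzuncompress() if the ByteArray was compressed with deflate() rather than compress(), and s16be if the samples were written big-endian):

<?php
// Receive the compressed PCM, decompress it, and hand it to ffmpeg as raw
// 16-bit mono samples at 22.05 kHz.
$compressed = file_get_contents('php://input');
$pcm        = gzuncompress($compressed);       // zlib container from ByteArray::compress()

$raw = '/tmp/recording.raw';
file_put_contents($raw, $pcm);

exec(sprintf(
    'ffmpeg -f s16le -ar 22050 -ac 1 -i %s %s',
    escapeshellarg($raw),
    escapeshellarg('/tmp/recording.mp3')
));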
Obviously the Red5 solution will likely be better, because it's more tailored for the task. But if you don't have the resources to set up a Red5 server, or don't want to use Java, the above solution is proven to work well as long as you stay away from too long recordings.
To take a simple example, a 30 second recording at 22,050 samples per second, 2 bytes per sample will be ~1.3MB. Even once deflated, the transfer to the server will likely still be almost a megabyte for 30 seconds of audio. This may or may not be acceptable for your application.
So I am working on something in PHP where I have to get my images from a SQL database where they are stored base64 encoded. The speed of displaying these images is critical, so I am trying to figure out whether it would be faster to turn the database data into an image file and then load that in the browser, or just echo the raw base64 data and use:
<img src="data:image/jpeg;base64,/9j/4AAQ..." />
Which is supported in FireFox and other Gecko browsers.
So to recap, would it be faster to transfer an actual image file or the base64 code? Would it require fewer HTTP requests when using AJAX to load the images?
The images would be no more than 100 pixels total.
Base64 encoding makes the file bigger and therefore slower to transfer.
By including the image in the page, it has to be downloaded every time. External images are normally only downloaded once and then cached by the browser.
It isn't compatible with all browsers
Well, I don't agree with any of you. There are cases where you have to load more and more images. Not every page contains only 3 images. I'm currently working on a site where you have to load more than 200 images. What happens when 100,000 users request those 200 images on a heavily loaded site? The disks of the server returning the images could collapse. Even worse, you have to make that many requests to the server instead of one with base64. For that many thumbnails I'd prefer the base64 representation, pre-saved in the database. I found the solution and a strong argument at http://www.stoimen.com/2009/04/23/when-you-should-use-base64-for-images/. The author is really in that situation and ran some tests. I was impressed and ran my own tests as well. The reality is as he says: for that many images loaded in one page, a single response from the server is really helpful.
Why regenerate the image again and again if it will not be modified? Hypothetically, even if there are 1000 different possible images to be shown based on 1000 different conditions, I still think that 1000 images on disk are better. Remember, disk-based images can be cached by the browser and save bandwidth, etc.
It's a very fast and easy solution. Although the image size will increase by about 33%, using base64 will significantly reduce the number of HTTP requests.
Google images and Yahoo images are using base64 and serving images inline. Check source code and you'll see it.
Of course there are drawbacks to this approach, but I believe the benefits outweigh the costs.
A con I have found is with slow devices. For example, on the iPhone 3GS the images served by Google Images are very slow to render, since the images come gzipped from the server and must be uncompressed in the browser. So, if the customer has a slow device, he will suffer a little when rendering the images.
To answer the initial question, I ran a test measuring a 400x300 px JPEG image at 96 ppi:
base64ImageData.Length: 177732
bitmap.Length: 129882
I have used base64 images once or twice for icons (10x10 pixels or so).
Base64 images pros:
compact - you have a single file. Also, if the response is served gzip-compressed, the base64 image compresses back to almost the size of the normal image.
the page is retrieved in a single request.
Base64 images cons:
to be realistic, you probably need to use a scripting engine (such as PHP) on all pages that contain the image.
if the image is changed, all cached pages must be re-downloaded.
because the image is inline, you cannot use a CDN or a static-content web server.
Normal images pros:
if you use the SPDY protocol, at least in theory, the page + images + CSS will load in a single request too.
you can set an expiration on the image, so content will be cached by browsers.
Note that data: URIs don't work in IE7 or below.
When an image is requested you could save it to the filesystem and then serve that from then on. If the image data in the database changes, just delete the file. Serve it from another domain too, like img.domain.com. You get all the benefits of Last-Modified or ETags for free from your web server without having to start up PHP unless you need to.
If you're using apache:
# If the file doesn't exist:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^/(image123)\.jpg$ makeimage.php?image=$1
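A sketch of what makeimage.php could do on a cache miss (the table and column names are made up; the idea is to write the decoded image next to the rewrite target so Apache serves it as a plain static file on every later request):

<?php
// makeimage.php: pull the base64 data from the DB, materialise it as a real
// file on disk, and stream it for this first request.
$name = basename($_GET['image']);              // e.g. "image123"
$pdo  = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

$stmt = $pdo->prepare('SELECT data_base64 FROM images WHERE name = ?');
$stmt->execute([$name]);
$b64 = $stmt->fetchColumn();

if ($b64 === false) {
    http_response_code(404);
    exit;
}

$binary = base64_decode($b64);
file_put_contents(__DIR__ . '/' . $name . '.jpg', $binary);  // next request skips PHP

header('Content-Type: image/jpeg');
echo $binary;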
Generally, using base64 encoding is going to increase the byte size by about 1/3. Because of that, you are going to have to move about 1/3 more bytes from the database into the server, and then move those same extra bytes over the wire to the browser.
Of course, as the size of the image grows, the overhead mentioned will increase proportionately.
That being said, I think it is a good idea to store the files as their raw byte representations in the DB and transmit those.
To answer the OP's question:
Serve them as static files, directly from disk through the web server.
At only 100px they are ideally suited to in-memory caching by the web server.
There is a plethora of info, caching strategies, configs, and how-tos for just about every web server out there.
In fact, the best option in terms of user experience (the image speed you refer to) is to use a CDN-capable object store. Period.
Using the DB as static storage is simply expensive: in processing overhead, in the burden on the DB, financially, and in terms of technical debt.
A few things, addressing several other answers:
Google images and Yahoo images are using base64 and serving images
inline. Check source code and you'll see it.
No. They absolutely do NOT. Images are mostly served from a static file web server, specifically gstatic.com:
e.g. https://ssl.gstatic.com/gb/images/p1_2446527d.png
compact - you have single file. also if file is compressed, base64
image is compressed almost to the size of normal image.
So actually, no advantage at all, plus the processing needed to compress?
page is retrieved in single request.
Again, multiple parallel requests as opposed to a single larger load.
What happens when 100,000 users request those 200 images on a heavily
loaded site? The disks of the server returning the images could
collapse.
You will still be sending the same amount of data, but with longer connection times, as well as stressing your database. Secondly, consider the odds of a run-of-the-mill site having 100,000 concurrent connections... and even if so, if you are running this all off a single server you are a foolish admin.
By storing the images, whether binary blobs or base64, in the DB, all you are doing is adding huge overhead to the DB. Either you have masses and masses of RAM, or your query via the DB will come off the disk anyway.
And if you DID have such unlimited RAM, then serving the binary images off a RAM disk, ideally via an alternative dedicated lightweight web server optimised for static files and caching, configured on a subdomain, would be the fastest, lightest load possible!
Forward planning? You can only scale up so far, and scaling a DB is expensive (relatively speaking). Again the disks you say will "sp
In such a case, where you are serving hundreds of images to 100,000 concurrent users, serving your images should be the domain of a CDN object store.
If you want the fastest speed, then you should write them to disk when they are uploaded/modified and let the web server serve static files. Rojoca's suggestions are good, too, since they minimize the invocation of PHP. An additional benefit of serving from another domain is that (most) browsers will issue the requests in parallel.
Barring all that, when you query for the data, check if it was last modified, then write it to disk and serve from there. You'll want to make sure you respect the If-Modified-Since header so you don't transfer data needlessly.
If you can't write to disk, or some other cache, then it would be fastest to store it as binary data in the database and stream it out. Adjusting buffer sizes will help at that point.