Having an image stripped of metadata upon upload in PHP - php

A certain site I know recently upgraded their bandwidth allowance from 2.5 TB per month to 3.5 TB.
The reason is that they went over the 2.5 TB limit recently. They're complaining that they don't know how to bring their bandwidth usage down.
One thing I haven't seen them consider is the fact that JPEG and other images displayed on the site (and it is an image-heavy site) can contain metadata: where the picture was taken and such.
The fact of the matter is, this information is of no importance whatsoever on that site. It's never going to be used. Yet it still adds to the bandwidth, since it increases the file size of every image by anywhere from a few bytes to a few kilobytes.
On a site that uses more than 2.5 TB per month, stripping the metadata from the several thousand images should, I think, decrease bandwidth usage by at least a few gigabytes per month, if not more.
So is there a way to do this in PHP? And also, for the already existing files, does anybody know a good automatic metadata remover? I know of JPEG & PNG Stripper, but it's not very good... It might be useful for an initial cleaning, though...

It's trivial with GD:
$quality = 85; // pick whatever quality level suits you
$img = imagecreatefromjpeg("myimg.jpg"); // decode the original image
imagejpeg($img, "newimg.jpg", $quality); // re-encode; EXIF/metadata is dropped
imagedestroy($img); // free the image resource
This won't transfer the EXIF data. I don't know how much bandwidth it will actually save, but you could also use the code above to increase the compression of the images. That would save a lot of bandwidth, although it possibly won't be very popular.
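For the already existing files mentioned in the question, a minimal sketch could loop over them with the same GD calls. The directory path and quality value here are just placeholders, not anything specific to your setup:
$quality = 85;
foreach (glob('/path/to/images/*.jpg') as $file) {
    $img = imagecreatefromjpeg($file);
    if ($img === false) {
        continue; // skip files GD can't read
    }
    imagejpeg($img, $file, $quality); // overwrite in place; metadata is dropped
    imagedestroy($img);
}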

I seriously doubt image metadata is the root of all evil here.
Some questions to take into consideration:
How is the webserver configured?
Does it issue HTTP 304 responses properly?
Is there some kind of hand-made caching/streaming of data through PHP scripting that prevents said data from being cached by the browser? (In which case URL rewriting and HTTP redirections should be considered; a rough sketch of conditional-request handling follows below.)
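If images do turn out to be streamed through PHP, a rough sketch of honouring conditional requests might look like this. The file path is an assumption; a real script would derive it from the request:
// Sketch: serve an image through PHP while still allowing browser caching.
$file = '/path/to/images/photo.jpg'; // assumed local path
$mtime = filemtime($file);
$lastModified = gmdate('D, d M Y H:i:s', $mtime) . ' GMT';

header('Cache-Control: public, max-age=86400');
header('Last-Modified: ' . $lastModified);

// Answer with 304 if the browser's cached copy is still current.
if (isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) &&
    strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE']) >= $mtime) {
    header('HTTP/1.1 304 Not Modified');
    exit;
}

header('Content-Type: image/jpeg');
header('Content-Length: ' . filesize($file));
readfile($file);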

Check out Smush.it! It will strip all unnecessary info from an image. They have an API you can use to crunch the images.
Note: by design, it may change the file type on you. This is on purpose: if another file type can display the same image at the same quality in fewer bytes, it will give you a new file.

I think you need to profile this. You might be right about it saving a few GB, but that's relatively little out of 2.5 TB of bandwidth. You need real data about what is being served most and work on that. If you do find it is images that push your bandwidth usage so high, you should first check your caching headers and 304 responses; you might also want to investigate using something like Amazon S3 to serve your images. I have managed to reduce bandwidth costs a lot by doing this.
That said, if the EXIF data really is making that much of a difference, then you can use the GD library to copy a JPEG image using the imagejpeg function. This won't copy the EXIF data.

Emil H's answer probably addresses the question best.
But I wanted to add that this will almost certainly not save you as much as you may think. This type of metadata takes up very little space; I would think that
Re-compressing the images to a smaller file size, and
Cropping or resizing to reduce the resolution of the images
are both going to have a much greater effect. With point one alone you could probably cut bandwidth by 50%, and with both you could cut it by 80% - that is, if you are willing to sacrifice some image quality or dimensions.
If not, you could always have the default view at a smaller size, with an 'enlarge' link. Most people just browsing will see the smaller image, and only those who want the largest size will click to enlarge it, so you'll still get almost all of the bandwidth saving. This is what Flickr does, for example.
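A quick sketch of generating that smaller default-view copy with GD; the 800-pixel target width and the file names are assumptions:
// Sketch: produce a scaled-down copy of an image for the default view.
$src = imagecreatefromjpeg('original.jpg');
$w = imagesx($src);
$h = imagesy($src);
$newW = 800; // assumed target width
$newH = (int) round($h * $newW / $w); // keep the aspect ratio
$dst = imagecreatetruecolor($newW, $newH);
imagecopyresampled($dst, $src, 0, 0, 0, 0, $newW, $newH, $w, $h);
imagejpeg($dst, 'original_small.jpg', 80);
imagedestroy($src);
imagedestroy($dst);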

Maybe some sort of hex data manipulation would help here. I'm facing the same problem and investigating some sort of automated solution.
Just wondering if it can be done; if so, I'll write a PHP class for it.

It might be smart to do all the image manipulation on the client side (using a Java applet, as Facebook does). Then, once the image is compressed, resized and fully stripped of unnecessary pixels and content, it can be uploaded at its optimal size, saving you bandwidth and server-side processing (at the cost of initial development).

Related

How do I compress only images that haven't been compressed already, using just PHP?

I'm currently trying to speed up the websites we develop. The part I'm working on now is optimising the images so that they are as small as possible (in file size, not dimensions) without losing quality.
Our customers can upload their own images to the website through our custom CMS, but the images aren't being compressed or optimised at all. My superior explained this is because the customers can upload their own images, and those images could already have been optimised beforehand through Photoshop or similar tools. If you optimise already-optimised images, the quality gets worse. ...right?
We're trying to find a solution that won't require us to install a module or anything. We already use imagejpeg(), imagepng() and imagegif(), but we don't use the $quality parameter, for the reasons explained above. I've seen some snippets, but they all use imagejpeg() and the like.
That being said, is there a sure-fire way of optimising images without the risk of optimising previously optimised images? Or would it be no problem at all to use imagejpeg(), imagepng() and imagegif(), even if it means optimising already-optimised images?
Thank you!
"If you optimise already optimised images, the quality would get worse. "
Not if you use a lossless method.
I don't know of a method directly in PHP, but if you are on a Linux server you can use jpegtran or jpegoptim (with --strip-all) for JPEG, and OptiPNG or PNGOUT for PNG.
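If you want to trigger those tools from PHP after an upload, a small sketch might look like this, assuming jpegoptim and optipng are installed on the server:
// Sketch: run a lossless command-line optimiser on an uploaded file.
$file = '/path/to/upload.jpg'; // assumed path to the uploaded image
$ext = strtolower(pathinfo($file, PATHINFO_EXTENSION));
if ($ext === 'jpg' || $ext === 'jpeg') {
    exec('jpegoptim --strip-all ' . escapeshellarg($file));
} elseif ($ext === 'png') {
    exec('optipng -o2 ' . escapeshellarg($file));
}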
Going from your title, I am going to assume you mean compression.
So, let's say a normal 800x600 JPG is uploaded by your customers.
The customer's JPG is 272 KB because it has full details and everything.
You need to set thresholds for what file size is acceptable at given dimensions.
Like:
list($w, $h) = getimagesize($image);
if ($w == 800 && $h == 600 && mime_content_type($image) === 'image/jpeg' && filesize($image) > 68 * 1024) {
    schedule_for_compression($image); // your own queueing function
}
That way you set up parameters for what is acceptable as an upper limit of file size. If the dimensions match and the file size is bigger, then it's not optimised.
But without knowing more about what exactly is meant by optimising, this is the only thing I can think of.
If you only have a low number of images to compress, you might find that an external service such as https://tinypng.com/developers is of assistance.
I've used their online tools for reducing the file size of both JPG and PNG files manually, but they do appear to offer a free API service for the first 500 images per month.
Apologies if this would be better as a comment than an answer; I'm fairly new to Stack Overflow and haven't got enough points yet, but I felt this may be a handy alternative solution.
Saving a JPEG with the same or a higher quality setting will not result in a noticeable loss of quality. Just re-save with your desired quality setting. If the file ends up larger, discard it and keep the original. Remove metadata using jpegtran or jpegoptim before you optimize, so it doesn't affect the file size when you compare against the original.
PNG and GIF won't lose any quality unless you reduce the number of colors. Just use one of the optimizers Gyncoca mentioned.
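A minimal sketch of the "re-save and keep whichever file is smaller" idea above, using GD; the quality of 85 and the file names are assumptions:
// Sketch: re-save a JPEG and only keep the result if it's actually smaller.
$original  = 'photo.jpg';
$candidate = 'photo_resaved.jpg';
$img = imagecreatefromjpeg($original);
imagejpeg($img, $candidate, 85); // assumed quality setting
imagedestroy($img);
if (filesize($candidate) >= filesize($original)) {
    unlink($candidate); // re-saving didn't help, keep the original
} else {
    rename($candidate, $original); // smaller file wins
}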

PHP, manipulating very big image files without loading them into RAM

I've been searching about this for a while but I didn't find what I wanted, so here is my problem:
using PHP,
I want to create a very big image file, let's say 20000 gigapixels; then I want to add a small image at a specific location on this big image. My computer doesn't have enough RAM to load the entire image and manipulate pixels that way, so I think I need to access the image data on the hard disk and manipulate it there somehow. Does anyone know how to do this?
thanks for helping me out :)
ImageMagick supports operations on very large files. I don't see support for this in the PHP/ImageMagick API, but you could call out (exec) to the command-line program and use one of its disk-caching or streaming options.
There is some documentation for dealing with large files at www.imagemagick.org.
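A rough sketch of shelling out to ImageMagick from PHP with its memory capped, so that it falls back to disk-based pixel caching for huge images. The file names, the offset and the limit values are all assumptions:
// Sketch: composite a small image onto a huge one via the ImageMagick CLI.
$cmd = 'convert -limit memory 256MiB -limit map 512MiB '
     . escapeshellarg('big.tif') . ' '
     . escapeshellarg('small.png')
     . ' -geometry +10000+20000 -composite '
     . escapeshellarg('result.tif');
exec($cmd, $output, $status); // check $status for errors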
What would you do with an image that size? You couldn't serve it to a browser, and even if you did manage to load it on the server, it would take up all the server's resources, so you couldn't use the server for anything else in the meantime.
The short answer is that handling an image of that kind of scale as a single file in RAM is out of the question unless you've got an extremely powerful machine dedicated to it and nothing else. At 20k x 20k pixels, even a simple monochrome image is going to take 400 MB. Scale that up to any useful colour depth and you're talking about gigabytes of RAM just to hold the graphic, and that's before we even start thinking about actually doing anything with it.
I guess the solution is to look at what other people do, given the same problem.
Real applications that use images of that scale (e.g. mapping apps or panorama photos like this one) store their image as a series of much smaller blocks. Each block is a smaller image in its own right. They usually have separate sets of blocks for each zoom level as well. Handling a single massive image file is implausible for any realistic server environment, but smaller chunks are easy to handle for both browser and server. The server just sends the user the blocks that fall within the current view; when the user scrolls or zooms, it sends more blocks.
Your question mentions adding a smaller image at a specific location on the big one. Again, looking at how others do this, Google Maps and similar sites handle this kind of thing with a layering system. The layers are built up and sent to the browser separately.
I know that doesn't directly answer the question, but I hope it gives you some options to think about.
Just keep a plain file, not an image, and store the pixel data in it in any custom format. PHP has an fseek function, which allows you to jump to any location in the file, so you can calculate the needed location and perform a read/write on it. If you have an image of size W x H, and each pixel takes 3 bytes, then the address of pixel (X, Y) in the file will be (W * Y + X) * 3.
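A small sketch of that raw-file idea; the width, coordinates and file name are example values, and the file would need to be pre-allocated to W * H * 3 bytes:
// Sketch: write one RGB pixel into a pre-allocated raw pixel file.
$width = 100000;            // assumed image width in pixels
$fp = fopen('canvas.raw', 'r+b'); // file must already exist at full size
$x = 12345;
$y = 67890;
fseek($fp, ($width * $y + $x) * 3); // offset = (W * Y + X) * 3
fwrite($fp, pack('C3', 255, 0, 0)); // write a red pixel (R, G, B)
fclose($fp);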

How to get quality of image in PHP

I've been doing some speed optimization on my site using Page Speed and it gives recommendations like so:
Optimizing the following images could reduce their size by 35.3KiB (21% reduction).
Losslessly compressing http://example.com/.../some_image.jpg could save 25.3KiB (55% reduction).
How are the calculations done to get the file size reduction numbers? How can I determine if an image is optimized in PHP?
As I understand it, they seem to base this on the image quality (so saving at 60% quality in Photoshop or so is considered optimized?).
What I would like to do is, once an image is uploaded, check whether the image is fully optimized; if not, optimize it using a PHP image library such as GD or ImageMagick. If I'm right about the number being based on quality, then I will just reduce the quality as needed.
How can I determine if an image is fully optimized in the standards that Page Speed uses?
Chances are they are simply applying a standard level of compression, or working from some very simple rules to estimate image compression/quality. It isn't exactly what you were after, but what I often use on uploaded images and other dynamic content is a class called SimpleImage:
http://www.white-hat-web-design.co.uk/blog/resizing-images-with-php/
This will give you options to resize and adjust compression, and I think even to change the image type (by which I mean .jpg to .png or .gif, anything you like). I worked in SEO, and page optimization was a huge part of my job; I generally tried to make the images exactly the size they needed to be, no smaller or bigger. Compress the JS & CSS, and that's really all most people need to worry about.
It looks like you could use the PageSpeed Insights API to get the compression scores: https://developers.google.com/speed/docs/insights/v1/reference, though I'm guessing you'd want to run the quality check/compression locally rather than sending everything through this API.
It also looks like they've got a standalone image optimizer available at http://code.google.com/p/page-speed/wiki/DownloadPageSpeed?tm=2, though this appears to be a Windows binary. They've got an SDK there as well, though I haven't looked into what it entails, or how easy it would be to integrate into an existing site.

Question about fonts for php captcha

I am working on my captcha image script. I know the fonts I use are stored on the server side, but I am wondering, for a high-traffic site, does the font file size (in KB or MB, not the glyph dimensions) make a difference to the server? Some fonts I was messing with are 4-5 MB and some are under 1 MB. Is it better to use the smallest fonts, or does it not make a difference?
The fonts are not being sent to the visitors of your site if they are only being used to generate a captcha image. In that case, they are only read internally by your image generation libraries, so the only thing you have to worry about is disk/memory access time.
I don't think reading the font files is going to give you any performance problems. The image generation is likely a lot more time-consuming, and would be a better place to optimize if you needed to. However, unless you're having problems with it, premature optimization is generally not a good use of your time.

PHP/JS - Create thumbnails on the fly or store as files

For an image hosting web application:
For my stored images, is it feasible to create thumbnails on the fly using PHP (or whatever), or should I save one or more different-sized thumbnails to disk and just load those?
Any help is appreciated.
Save thumbnails to disk. Image processing takes a lot of resources and, depending on the size of the image, might exceed the default allowed memory limit for PHP. It is less of a concern if you have your own server with only your application running, but it still takes a lot of CPU power and memory to resize images. If you're considering creating thumbnails on the fly anyway, you don't have to change much: upon the first request, create the thumbnail from the source file, save it to disk, and upon subsequent requests just read it off the disk.
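A sketch of that "create on first request, then serve from disk" pattern; the paths, the 200-pixel width and the query parameter are assumptions:
// Sketch: serve a cached thumbnail, generating it on the first request.
$source = '/images/full/' . basename($_GET['img']);   // basename() guards against path traversal
$thumb  = '/images/thumbs/' . basename($_GET['img']); // a real script would validate further

if (!file_exists($thumb)) {
    $src = imagecreatefromjpeg($source);
    $w = imagesx($src);
    $h = imagesy($src);
    $tw = 200; // assumed thumbnail width
    $th = (int) round($h * $tw / $w);
    $dst = imagecreatetruecolor($tw, $th);
    imagecopyresampled($dst, $src, 0, 0, 0, 0, $tw, $th, $w, $h);
    imagejpeg($dst, $thumb, 85);
    imagedestroy($src);
    imagedestroy($dst);
}

header('Content-Type: image/jpeg');
readfile($thumb);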
I use phpThumb, as it's the best of both worlds. You can create thumbnails on the fly, but it automatically caches the images to speed up future requests. It creates a nice wrapper around the GD and ImageMagick libraries. Worth a look!
It would be much better to cache the thumbnails. Generating them on the fly would be very taxing on the system.
It depends on the usage pattern of the site, but, basically, how many times do you expect each image to be viewed?
In the case of thumbnails, they're most likely to be around for quite a while (the image is uploaded once and never changed, so the thumbnail doesn't change either), so it's generally worthwhile to generate when the full image is uploaded and store them for later. Unless the site is completely dead, they'll be viewed many (hundreds or thousands of) times over their lifetime and disk is a lot cheaper than latency these days. This also becomes more significant as load on the server increases, of course.
Conversely, for something like stock charts that get updated every hour (if not more frequently), that would be a situation where you'd do better to create them on the fly, so as to avoid wasting CPU time on constantly generating images which no user will ever see.
Or, if you want to get fancy, you can optimize to handle either access pattern by generating the images on the fly the first time they're needed and then showing the pre-generated one afterwards, up until the data it's generated from changes, at which point you delete it so that it will be regenerated the next time it's needed. But that would be overkill for something as static as thumbnails, IMO.
Check out the GD library and ImageMagick.
