I'm building a web-based system which will host loads and loads of high-res images, and they will be available for sale. Of course I will never display the high-res image; when browsing, people will only see a low-resolution, watermarked image. Currently the workflow is as follows:
A PHP script handles the high-res image upload. When an image is uploaded, it is automatically resized to a low-res image and to a thumbnail image as well, and both files are saved on the server (no watermark is added).
When people are browsing, the page displays the thumbnail of the image; on click, it enlarges and displays the low-res image with a watermark. At the moment I apply the watermark on the fly whenever the low-res image is opened.
My question is: which is the correct way?
1) Should I save a second, watermarked copy of the low-res image, but only when it is accessed for the first time? I mean: if somebody accesses the image, I add the watermark on the fly, display the image and store it on the server. The next time the same image is accessed, if a watermarked copy exists I just display that copy, otherwise I apply the watermark on the fly. (In case watermark.png is changed, I just delete the watermarked images and they will be recreated as they are accessed.)
2) Should I keep applying watermarks on the fly like I'm doing now?
My biggest question is how big the difference is between a PHP file_exists() call and adding a watermark to an image, something like:
// Load the low-res image
$image = new Imagick();
$image->readImage($workfolder.$event . DIRECTORY_SEPARATOR . $cat . DIRECTORY_SEPARATOR . $mit);
// Load the watermark overlay
$watermark = new Imagick();
$watermark->readImage($workfolder.$event . DIRECTORY_SEPARATOR . "hires" . DIRECTORY_SEPARATOR . "WATERMARK.PNG");
// Composite the watermark over the image at the top-left corner
$image->compositeImage($watermark, Imagick::COMPOSITE_OVER, 0, 0);
All low-res images are 1024x1024 JPGs with a quality setting of 45% and all unnecessary filters removed, so the file size of a low-res image is about 40 KB-80 KB.
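If you want actual numbers rather than guesses, both operations are easy to time on your own hardware; here is a rough, self-contained sketch (the two paths are placeholders):
<?php
// Rough micro-benchmark: file_exists() vs. watermark compositing.
// $lowres and $wm are placeholder paths - adjust to your setup.
$lowres = '/path/to/lowres.jpg';
$wm     = '/path/to/WATERMARK.PNG';

$t = microtime(true);
for ($i = 0; $i < 1000; $i++) {
    clearstatcache(); // force a real stat() on each iteration
    file_exists($lowres);
}
printf("file_exists: %.6f s/call\n", (microtime(true) - $t) / 1000);

$t = microtime(true);
for ($i = 0; $i < 10; $i++) {
    $image = new Imagick($lowres);
    $watermark = new Imagick($wm);
    $image->compositeImage($watermark, Imagick::COMPOSITE_OVER, 0, 0);
    $image->destroy();
    $watermark->destroy();
}
printf("watermark:   %.6f s/call\n", (microtime(true) - $t) / 10);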
It is somewhat related to this question, just the scale and the scenarios are a bit different.
I'm on a dedicated server (Xeon E3-1245v2 CPU, 32 GB RAM, 2 TB storage). The site does not have big traffic overall, but it has HUGE spikes from time to time. When images are released we get a few thousand hits per hour, with people browsing through the images, downloading, purchasing, etc. So while under normal usage I'm sure that generating on the fly is the right approach, I'm a bit worried about the spike periods.
I should mention that I'm using the ImageMagick library for image processing, not GD.
Thanks for your input.
UPDATE
None of the answers was a full, complete solution, but that is fine since I never looked for one. It was a hard decision which one to accept and to whom to award the bounty.
@Ambroise-Maupate's solution is good, but it still relies on PHP to do the job.
@Hugo Delsing proposes using the web server for serving cached files, lowering the number of calls to the PHP script, which means fewer resources used; on the other hand, it's not really storage friendly.
I will use a mixed solution merging the two answers, relying on a cron job to remove the garbage.
Thanks for the directions.
Personally I would create a static/cookieless subdomain, in a CDN kind of way, to handle these kinds of images. The main reasons are:
Images are only created once
Only accessed images are created
Once created, an image is served from cache and is a lot faster.
The first step would be to create a website on a subdomain that points to an empty folder. Use the settings of IIS/Apache or whatever to disable sessions for this new website. Also set some long caching headers on the site, because the content shouldn't change.
The second step would be to create an .htaccess file containing the following.
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*) /create.php?path=$1 [L]
This will make sure that if somebody accesses an existing image, the web server will serve it directly without PHP interfering. Every request for a non-existing file will be handled by the create.php script, which is the next thing you should add.
<?php
function NotFound()
{
    if (!headers_sent()) {
        $protocol = (isset($_SERVER['SERVER_PROTOCOL']) ? $_SERVER['SERVER_PROTOCOL'] : 'HTTP/1.0');
        header($protocol . ' 404 Not Found');
        echo '<h1>Not Found</h1>';
        exit;
    }
}

$p = isset($_GET['path']) ? $_GET['path'] : '';
//has path
if (strlen($p) <= 1)
    NotFound();

$clean = explode('?', $p);
$clean = explode('#', $clean[0]);
$params = explode('/', substr($clean[0], 1)); //drop first /

//I use a check for two, because I don't allow images in the root folder
//I also use the path to determine how it should look
//EG: thumb/125/90/imagecode.jpg
if (count($params) < 2)
    NotFound();

$type = $params[0];
//I use the type to handle different methods. For this example I only used the full-sized image
//You could use the same to handle thumbnails or cropped/watermarked versions
switch ($type) {
    //case "crop":  if (Crop($params))  return; else break;
    //case "thumb": if (Thumb($params)) return; else break;
    case "image": if (Image($params)) return; else break;
}
NotFound();
?>
<?php
/*
Just an example to show how you could create a response.
Since you already know how to create thumbs, I'm not going into details.
Array
(
    [0] => image
    [1] => imagecode.JPG
)
*/
function Image($params) {
    $tmp = explode('.', $params[1]);
    if (count($tmp) != 2)
        return false;
    $code = $tmp[0];

    //WARNING!! SQL INJECTION
    //USE PROPER DB METHODS (e.g. a prepared statement) TO GET REALPATH, THIS IS JUST AN EXAMPLE
    $query = "SELECT realpath FROM images WHERE Code='".$code."'";
    //exec query here to $row
    $realpath = $row['realpath'];

    $f = file_get_contents($realpath);
    if ($f === false || strlen($f) <= 0)
        return false;

    //create the folder structure (@ suppresses the warning if it already exists)
    @mkdir($params[0]);
    //if you had more folders, continue creating the structure
    //@mkdir($params[0].'/'.$params[1]);

    //store the image, so a second request won't reach this script
    file_put_contents($params[0].'/'.$params[1], $f);
    //you could directly optimize the image for the web to make it even better
    //optimizeImage($params[0].'/'.$params[1]);

    //now serve the file to the browser, because even the first request needs to show the image
    $finfo = finfo_open(FILEINFO_MIME_TYPE);
    header('Content-Type: '.finfo_file($finfo, $params[0].'/'.$params[1]));
    echo $f;
    return true;
}
?>
I would suggest you create watermarked images on the fly and cache them at the same time, as everybody has suggested.
Then you could create a garbage-collector PHP script that is executed every day (using cron). This script would browse your cache folder and read every image's last access time. This can be done using the fileatime() PHP function. Then, when a cached watermarked image has not been accessed within 24 or 48 hours, just delete it.
With this method, you can handle spike periods, as images are cached at the first request. AND you will save HDD space, as your garbage-collector script will delete unused images for you.
Note that this method will only work if your server partition has atime updates enabled.
See http://php.net/manual/en/function.fileatime.php
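A minimal sketch of such a garbage collector, assuming a flat cache folder (the path and the file pattern are placeholders):
<?php
// gc.php - run daily from cron, e.g.: 0 3 * * * php /path/to/gc.php
// Deletes cached watermarked images not accessed for 48 hours.
$cacheDir = '/path/to/cache'; // placeholder - adjust to your setup
$maxAge   = 48 * 3600;
$now      = time();
foreach (glob($cacheDir . '/*.jpg') as $file) {
    if ($now - fileatime($file) > $maxAge) {
        unlink($file);
    }
}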
For most scenarios, lazily applying the watermark would probably make the most sense: generate the watermarked image on the fly when requested, then cache the result. However, if you have big spikes in demand, you are creating a mechanism to DoS yourself; in that case, create the watermarked version on upload.
Considering your HDD storage capacity and the spikes:
I would only create a watermarked image when it is viewed (so yes, on the fly). That way you don't use too much space on a bunch of files that may or may not ever be viewed.
I would not watermark thumbnails; I would rather make a filter that fakes a watermark and protects the image from being saved. That filter would apply to all thumbnails without creating a second image.
This way all your thumbnails are 'watermarked' (faked, with another element on top).
Then, when one of these thumbnails is viewed, you generate the watermarked image (only once), since after it is generated you load the new watermarked image.
This would be the most efficient way to deal with your HDD storage and the spikes.
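A rough sketch of that fake watermark, for what it's worth; the thumbnail URL and /watermark.png are placeholders of mine, and PHP simply emits a transparent element layered over the img, which discourages casual right-click saving without a second file on disk:
<?php
// $thumbUrl is a placeholder for the real thumbnail location.
$thumbUrl = '/thumbs/imagecode.jpg';
echo '<div style="position:relative;display:inline-block">'
   . '<img src="' . htmlspecialchars($thumbUrl) . '" alt="thumbnail">'
   . '<div style="position:absolute;top:0;left:0;width:100%;height:100%;'
   . 'background:url(/watermark.png) no-repeat center center"></div>'
   . '</div>';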
The other option would be to upgrade your hosting services. GoDaddy offers unlimited storage and bandwidth for about $50 a year.
I have a website with image upload/show functionality. All images are saved to the filesystem under a specific path.
I use the Yii2 framework in the project. There is no direct path to the images; all of them are requested through a specific URL. ImageController processes the URL and decides about image resizing, ImageModel does the job, and the user gets the image content.
Here is the code snippet:
$file = ... // full path to image
...
$ext = pathinfo($file)['extension'];
if (file_exists($file)) {
    // return original
    return Imagine::getImagine()
        ->open($file)
        ->show($ext, []);
}
preg_match("/(.*)_(\d+)x(\d+)\.{$ext}/", $file, $matches);
if (is_array($matches) && count($matches)) {
    if (!file_exists("{$matches[1]}.{$ext}")) {
        throw new NotFoundHttpException("Image doesn't exist!");
    }
    $options = array(
        'resolution-units' => ImageInterface::RESOLUTION_PIXELSPERINCH,
        'resolution-x' => $matches[2],
        'resolution-y' => $matches[3],
        'jpeg_quality' => 100,
    );
    return Imagine::resize("{$matches[1]}.{$ext}", $matches[2], $matches[3])
        ->show($ext, $options);
} else {
    throw new NotFoundHttpException('Wrong URL params!');
}
We don't discuss data caching in this topic.
So, I wonder about the efficiency of this approach. Is it OK to return all images through PHP even if they haven't changed at all? Will it increase the server load?
Or should I perhaps save images to another public directory and redirect the browser to it? How much time would so many redirects add on a single page (there can be plenty of images)? What about SEO?
I need advice. What is the best practice for solving such tasks?
You should consider using sendFile() or xSendFile() for sending files - it should be much faster than loading the image with Imagine and displaying it via show(). But for that you need to have the final image saved on disk, so we're back to:
We don't discuss data caching in this topic.
Well, this is actually the first thing that you should care about. Sending an image through PHP will be significantly less efficient than letting the web server do it (though still pretty fast, depending on your server configuration). Involving the framework will be much slower still (bootstrapping a framework takes time). But this is all irrelevant if you resize the image on every request - that will be the main bottleneck here.
As long as you don't have some requirement that makes it impossible (like needing to check whether the user has the right to see an image before displaying it), I would recommend saving images to a public directory and linking to them directly (without any redirection). It will save you much pain handling stuff the web server already does for static files (cache headers, 304 responses, etc.) and it will be the most efficient solution.
If this is not possible, create a simple PHP file which will only send the file to the user, without bootstrapping the whole framework.
If you really need the whole framework, use sendFile() or xSendFile() for sending the file.
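For instance, a minimal sketch from inside a Yii2 controller action, assuming $file already holds the full path of the final image on disk:
// Let the framework stream the file; 'inline' makes the browser
// render the image instead of offering it as a download.
return Yii::$app->response->sendFile($file, null, ['inline' => true]);

// Or, if the web server has X-Sendfile / X-Accel-Redirect support:
// return Yii::$app->response->xSendFile($file);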
The most important things are:
Do not use Imagine for anything other than generating the image thumbnail (which should be generated only once and cached).
Do not link to a PHP page that only redirects to the real image served by the web server. It will not reduce server load compared to serving the image from PHP (you have already paid the price of handling the request in PHP), and your website will work slower for clients (which may affect SEO) due to the additional request required to get the actual image.
If you need to serve images from PHP, make sure you set cache headers so it works well with the browser cache - you don't want clients to re-download the same images on every page refresh.
We all know this is a very important issue for many web developers: they want to protect their confidential images from direct access or direct readability. The folder that contains all the images is open and anyone can visit it, but I want to do something to protect the image contents. That is, if an unauthorised visitor looks for an image, he may get the file by visiting the appropriate folder, but its contents will be invisible or difficult to understand. I think if I get a solution here, many people will be helped by this question. Writing .htaccess rules isn't always a stable choice. So, after brainstorming, I found some ways I could protect image contents from direct access. I want to use Imagick with PHP to perform any kind of image editing.
Adding and removing a layer: after uploading, add a layer to make the contents of the image invisible. Then anyone who reaches the folder where you've stored the images will see something meaningless: the layer, not the image content. Remove the layer to show the image to those who have the proper rights.
Converting the image to another format: convert the image to a format like .txt, .exe, .bin, .avi or any other, so that without editing, the image won't be viewable. Convert it back to show it to the authorised user.
Image grid: divide the image into a number of grid cells - say 100 for a medium-sized image - and shuffle their positions to make the contents unclear. To do this, number each cell 1, 2, 3 and so on, then shift each position by -20: the cell at position 25 goes to 5, 100 goes to 80, 1 wraps around to 81, and so on. Reverse the shift to display the image to authorised users.
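As a rough illustration of the grid idea only (the filenames and the 10x10 grid are my own choices), here is an Imagick sketch that moves every tile's index by -20 modulo 100; running it with the opposite shift restores the original:
<?php
// scramble.php - split the image into a 10x10 grid and move each
// tile from index $i to ($i - 20) mod 100. Use +20 to unscramble.
$src   = new Imagick('original.jpg');
$cols  = 10; $rows = 10; $shift = 20;
$total = $cols * $rows;
$tw = (int)($src->getImageWidth() / $cols);
$th = (int)($src->getImageHeight() / $rows);

$out = new Imagick();
$out->newImage($tw * $cols, $th * $rows, new ImagickPixel('white'));
$out->setImageFormat('jpeg');

for ($i = 0; $i < $total; $i++) {
    $tile = clone $src;
    $tile->cropImage($tw, $th, ($i % $cols) * $tw, (int)($i / $cols) * $th);
    $tile->setImagePage(0, 0, 0, 0); // reset the virtual canvas after cropping
    $j = ($i - $shift + $total) % $total; // destination index, wrapping around
    $out->compositeImage($tile, Imagick::COMPOSITE_OVER,
        ($j % $cols) * $tw, (int)($j / $cols) * $th);
    $tile->destroy();
}
$out->writeImage('scrambled.jpg');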
It is never possible to protect something completely, but we can make it harder. I don't know which of the three is possible with Imagick and which is not. Please tell me if you know. Thanks in advance.
You can put these images in a folder outside of public_html (so nobody can access them directly). Then, via a script, if a user is logged in, you read the image file's contents and send it with the right headers. If the user is not logged in, you can display a random or default image instead.
For example, if the public HTML folder is /var/www, your image folder can be /registered_user/images/.
Then in your PHP script you can write:
<?php
if (!userLogged() || !isset($_GET['image'])) {
    header('Location: /');
    die();
}

$path = '/registered_user/images/';
$file = clean($_GET['image']); // you can create a clean function that only keeps valid characters for file names
$filename = $path . $file;
if (!file_exists($filename)) {
    $filename = '/var/www/images/bogus.jpg';
}

$imageInfo = getimagesize($filename);
header('Content-Length: ' . filesize($filename));
header('Content-Type: ' . $imageInfo['mime']);
readfile($filename);
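The clean() function is left to the reader above; one possible sketch (my own assumption, not from the original answer) strips directory components and whitelists file-name characters:
function clean($name) {
    // drop any directory component, then allow only a conservative
    // whitelist of characters (and reject "..")
    $name = basename($name);
    if ($name === '..' || !preg_match('/^[A-Za-z0-9._-]+$/', $name)) {
        return '';
    }
    return $name;
}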
After doing research, I found that it is generally recommended to save the image name in the database and the actual image in a file directory. Two of the reasons given are that it is safer and that the pictures load a lot quicker. But I don't really get the point of this procedure, because every time I retrieve a picture I can find out its path in the file directory with the Firebug tool, which could lead to a potential breach.
Am I doing this correctly, or is it not supposed to show the complete file directory path of the image?
PHP for saving the image names into the database:
$images = retrieve_images();
insert_images_into_database($images);

function retrieve_images()
{
    $images = explode(',', $_GET['i']);
    return $images;
}

function insert_images_into_database($images)
{
    if (!$images) //There were no images to insert
        return false;

    $pdo = get_database_connection();
    $path = Configuration::getUploadUrlPath('medium', 'target');
    // prepare once, execute once per image
    $sql = "INSERT INTO `urlImage` (`image_name`) VALUES ( ? )";
    $prepared = $pdo->prepare($sql);
    foreach ($images as $image)
    {
        $prepared->execute(array($image));
        echo ('<div><img src="' . $path . $image . '" /></div>');
    }
}
One way to achieve what you originally intended by storing images in the database is to keep serving the images via a PHP script. This way you are:
Shielding your users from knowing the actual path of an image.
Able to store images outside of your DocumentRoot (you can, and should), so they cannot be served directly by the web server.
Here's one way you can achieve that through readfile():
<?php
// image.php
// Translating file_id to the image path and filename
// (getPathFromFileID(), getImageNameFromFileID() and userHasPermission()
// are placeholders for your own lookup and authorisation logic)
$path = getPathFromFileID($_GET['file_id']);
$image = getImageNameFromFileID($_GET['file_id']);

// Actual full path to the image file
// Hopefully outside of DocumentRoot
$file = $path.$image;

if (userHasPermission()) {
    header('Content-Type: image/jpeg'); // adjust to the actual image type
    readfile($file);
}
else {
    // Better if you are actually outputting an image instead of echoing text
    // So that the MIME type remains compatible
    echo "You do not have the permission to load the image";
}
exit;
You can then serve the image by using standard HTML:
<img src="image.php?file_id=XXXXX">
You can use .htaccess to protect your images.
See here:
http://michael.theirwinfamily.net/articles/csshtml/protecting-images-using-php-and-htaccess
I'm also working on a project which stores the URL paths of images in the database (Amazon RDS) and the actual images in a cloud-managed file system (Amazon S3).
The decision to do so came primarily from concerns about price, scalability and ease of implementation.
Cheaper: Firstly, it is cheaper to store data in a file system (Amazon S3) compared to a database (Amazon EC2 / RDS).
Scalable: Since the repository of images may grow pretty big in the future, you also need to ensure that you have adequate capacity to serve them. On this point, it is easier to scale up a filesystem than a database. In fact, if you use cloud storage (like Amazon S3), you don't even need to worry about running out of space, as it is managed for you by Amazon; you just pay for what you use.
Ease of implementation: In terms of implementation, storing images in a file system is much easier. If you were to serve images directly from the database, you would probably need additional logic to convert BLOBs into strings for the img src attribute, and from the look of it, this might take up quite substantial processing power and slow your web server down.
On the other hand, if you use a filesystem, all you need is to put the image's URL path from the database into the src attribute of the image, and it's all done!
Security: As for the security of the images, I have changed each image name to a timestamp concatenated with a random string, so that it will be really difficult for someone to browse for pictures without knowing the file name.
ie. 1342772480UexbblEY7Xj3Q4VtZ.png
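Generating such a name is only a few lines; a quick sketch (my own variant, any equivalent source of randomness would do):
// Unix timestamp + 20 random alphanumeric characters, as in the example above.
$alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
$random = '';
for ($i = 0; $i < 20; $i++) {
    $random .= $alphabet[mt_rand(0, strlen($alphabet) - 1)];
}
$filename = time() . $random . '.png';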
Hope this helps!
NB: Please edit my post if you find anything wrong here! this is just my opinion and everyone is welcome to edit!
I have a site where users can upload images. I process these images directly and resize them into 5 additional formats using the CodeIgniter Image Manipulation class. I do this quite efficiently as follows:
I always resize from the previous format, instead of from the original
I resize using an image quality of 90% which about halves the file size of jpegs
I implemented the above way of doing things after advice I got from another question I asked. My test case is a 1.6 MB JPEG in RGB mode with a high resolution of 3872 x 2592. For that image, which is kind of a borderline case, the resize process in total takes about 2 seconds, which is acceptable to me.
Now only one challenge remains: I want the original file to be compressed using that same 90% quality, but without resizing it. The idea is that this file, too, will take half the file size. I figured I could simply resize it to its current dimensions, but that doesn't seem to do anything to the file or its size. Here's my code, somewhat simplified:
$sourceimage = "test.jpg";
$resize_settings['image_library'] = 'gd2';
$resize_settings['source_image'] = $sourceimage;
$resize_settings['maintain_ratio'] = false;
$resize_settings['quality'] = '90%';
$this->load->library('image_lib', $resize_settings);
$resize_settings['width'] = $imagefile['width'];
$resize_settings['height'] = $imagefile['height'];
$resize_settings['new_image'] = $filename;
$this->image_lib->initialize($resize_settings);
$this->image_lib->resize();
The above code works fine for all formats except the original. I tried debugging the CI class to see why nothing happens, and I noticed that the script detects that the dimensions did not change; next, it simply makes a copy of the file without processing it at all. I commented out that piece of code to force a resize, but still nothing happens.
Does anybody know how to compress an image (any image, not just JPEGs) to 90% quality using the CI class without changing its dimensions?
I guess you could do something like this:
$original_size = getimagesize('/path/to/original.jpg');
And then set the following options like this:
$resize_settings['width'] = $original_size[0];
$resize_settings['height'] = $original_size[1];
OK, so that doesn't work because CI is trying to be smart. The way I see it, you have three possible options:
Rotate the Image by 360º
Watermark the Image (with a 1x1 Transparent Image)
Do It Yourself
The DIY approach is really simple. I know you don't want to use "custom" functions, but take a look:
ImageJPEG(ImageCreateFromString(file_get_contents('/path/to/original.jpg')), '/where/to/save/optimized.jpg', 90);
As you can see, it's even simpler than using CI.
PS: The snippet above can open any type of image (GIF, PNG and JPEG) and it always saves the image as a JPEG with 90% quality, which I believe is what you're trying to achieve.
What would be the best-practice way to handle the caching of images using PHP?
The filename is currently stored in a MySQL database; the file is renamed to a GUID on upload, and the original filename and alt tag are stored alongside it.
When the image is put into the HTML pages, it is done using a URL such as '/images/get/200x200/{guid}.jpg', which is rewritten to a PHP script. This allows my designers to specify (roughly - the source image may be smaller) the file size.
The PHP script then creates a hash of the size (200x200 in the URL) and the GUID filename. If a file with the hash as its name already exists in the TMP directory, the script sends that file from the application's TMP directory. If the hashed filename does not exist, it is created, written to disk and served up in the same manner.
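In condensed form, the flow looks roughly like this (the names and the TMP location are placeholders of mine):
<?php
// Map a size + GUID pair to a deterministic cache filename.
$hash   = md5($size . '-' . $guid); // e.g. $size = '200x200'
$cached = sys_get_temp_dir() . '/' . $hash . '.jpg';

if (!file_exists($cached)) {
    // resize the source image (and apply the watermark) here,
    // then write the result to $cached
}

header('Content-Type: image/jpeg');
readfile($cached);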
Is this as efficient as it could be? (It also supports watermarking the images, and the watermarking settings are stored in the hash as well, but that's out of scope here.)
I would do it in a different manner.
Problems:
1. Having PHP serve the files out is less efficient than it could be.
2. PHP has to check the existence of files every time an image is requested
3. Apache is far better at this than PHP will ever be.
There are a few solutions here.
You can use mod_rewrite on Apache. It's possible to use mod_rewrite to test to see if a file exists, and if so, serve that file instead. This bypasses PHP entirely, and makes things far faster. The real way to do this, though, would be to generate a specific URL schema that should always exist, and then redirect to PHP if not.
For example:
RewriteCond %{REQUEST_URI} ^/images/cached/
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
RewriteRule (.*) /images/generate.php?$1 [L]
So if a client requests /images/cached/<something> and that file doesn't exist already, Apache will redirect the request to /images/generate.php?/images/cached/<something>. This script can then generate the image, write it to the cache, and then send it to the client. In the future, the PHP script is never called except for new images.
Use caching. As another poster said, use things like mod_expires, Last-Modified headers, etc. to respond to conditional GET requests. If the client doesn't have to re-request images, page loads will speed dramatically, and load on the server will decrease.
For cases where you do have to send an image from PHP, you can use mod_xsendfile to do it with less overhead. See the excellent blog post from Arnold Daniels on the issue, but note that his example is for downloads. To serve images inline, take out the Content-Disposition header (the third header() call).
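A minimal inline-image sketch along those lines, assuming mod_xsendfile is installed and enabled (the path is a placeholder):
<?php
// Apache (mod_xsendfile) replaces the response body with the file's
// contents, so PHP never has to read the image itself.
header('Content-Type: image/jpeg');
header('X-Sendfile: /var/app/images/cached/example.jpg');
exit;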
Hope this helps - more after my migraine clears up.
There are two typos in Dan Udey's rewrite example (and I can't comment on it); it should rather be:
RewriteCond %{REQUEST_URI} ^/images/cached/
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
RewriteRule (.*) /images/generate.php?$1 [L]
Regards.
One note worth adding: make sure your code does not generate "unauthorized" sizes of these images.
The following URL will create a 200x200 version of image 1234 if one doesn't already exist. I'd highly suggest you make sure that the requested URL contains only image dimensions you support.
/images/get/200x200/1234.jpg
A malicious person could start requesting random URLs, always altering the height and width of the image. This would cause your server some serious issues, because it would sit there, essentially under attack, generating images of sizes you do not support.
/images/get/0x1/1234.jpg
/images/get/0x2/1234.jpg
...
/images/get/0x9999999/1234.jpg
/images/get/1x1/1234.jpg
...
etc
Here's a random snip of code illustrating this:
<?php
$pathOnDisk = getImageDiskPath($_SERVER['REQUEST_URI']);

if (file_exists($pathOnDisk)) {
    // send header with image mime type
    echo file_get_contents($pathOnDisk);
    exit;
} else {
    $matches = array();
    $ok = preg_match(
        '/\/images\/get\/(\d+)x(\d+)\/(\w+)\.jpg/',
        $_SERVER['REQUEST_URI'], $matches);
    if (!$ok) {
        // invalid url
        handleInvalidRequest();
    } else {
        list(, $width, $height, $guid) = $matches;
        // you should do this!
        if (isSupportedSize($width, $height)) {
            // size is supported. all good
            // generate the resized image, save it & output it
        } else {
            // invalid size requested!!!
            handleInvalidRequest();
        }
    }
}

// snip

function handleInvalidRequest() {
    // do something w/ invalid request
    // show a default graphic, log it etc
}
?>
This seems like a great post, but my problem still remains unsolved. I don't have access to .htaccess at my hosting provider, so Apache tweaking is out of the question. Is there really a way to set a Cache-Control header for images?
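For what it's worth, if the image is already passing through a PHP script, Cache-Control does not need .htaccess at all; the headers can be sent from PHP itself (a small sketch, with the path as a placeholder):
<?php
// Cache for one year; adjust max-age to taste.
header('Content-Type: image/jpeg');
header('Cache-Control: public, max-age=31536000');
header('Expires: ' . gmdate('D, d M Y H:i:s', time() + 31536000) . ' GMT');
readfile($pathToImage); // $pathToImage is a placeholder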
Your approach seems quite reasonable. I would add that some mechanism should be put in place to check that the cached version was generated after the last-modified timestamp of the original (source) image file, and to regenerate the cached/resized version if not. This will ensure that when an image is changed by the designers, the cache is updated appropriately.
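A minimal version of that staleness check (the variable names are mine):
// Regenerate the cached file whenever the source is newer.
if (!file_exists($cached) || filemtime($source) > filemtime($cached)) {
    // recreate $cached from $source here
    clearstatcache(); // so later filemtime() calls see the fresh file
}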
That sounds like a solid way to do it. The next step may be to go beyond PHP/MySQL.
Perhaps tweak your headers:
If you're using PHP to send MIME types, you might also use 'Keep-Alive' and 'Cache-Control' headers to extend the life of your images on the server and take some of the load off of PHP/MySQL.
Also, consider Apache caching plugins, such as mod_expires.
Oh, one more thing, how much control do you have over your server? Should we limit this conversation to just PHP/MySQL?
I've managed to do this simply using a redirect header in PHP:
if (!file_exists($filename)) {
    // *** Insert code that generates the image ***

    // Content type
    header('Content-Type: image/jpeg');

    // Output
    readfile($filename);
} else {
    // Redirect to the static file so the web server serves it
    $host = $_SERVER['HTTP_HOST'];
    $uri = rtrim(dirname($_SERVER['PHP_SELF']), '/\\');
    $extra = $filename;
    header("Location: http://$host$uri/$extra");
}
Instead of keeping the file address in the DB, I prefer adding a random number to the file name whenever the user logs in. Something like this for user 1234: image/picture_1234.png?rnd=6534122341
If the user submits a new picture during the session, I just refresh the random number.
A GUID tackles the cache problem 100%. However, it sort of makes it harder to keep track of the picture files. With this method there is a chance the user might see the same cached picture again at a future login, but the odds are low if you generate your random number from a billion possibilities.
phpThumb is a framework that generates resized images/thumbnails on the fly. It also implements caching, and it's very easy to integrate.
The code to show a resized image is:
<img src="/phpThumb.php?src=/path/to/image.jpg&w=200&h=200" alt="thumbnail"/>
This will give you a thumbnail of 200 x 200.
It also supports watermarking.
Check it out at:
http://phpthumb.sourceforge.net/