I'm currently rewriting a website that needs a lot of different sizes for each image. In the past I did this by creating thumbnail images for all sizes at upload time. But now I have doubts about its performance, because I have to change my design and half of my images are no longer the right size. So I'm thinking of 2 solutions:
Keep doing this and add a button to the backend to re-generate all the images. The problem is that I always need to know every size needed by every part of the site.
Only upload the full-size image, and when displaying it, put something like src="thumbs.php?img=my-image-path/image.jpg&width=120&height=120" in the src attribute. Then create the thumb and display it. My script would also check whether the thumb already exists; if it does, it doesn't need to recreate it and just displays it. Every 5 days, launch a script with a cron task to delete all the thumbs (to be sure to keep only the useful ones).
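To make it concrete, the thumbs.php I have in mind would be something like this rough sketch (assuming GD and JPEG uploads; the paths are just examples):

<?php
// thumbs.php - serve a cached thumbnail, creating it on the first request
$src    = 'uploads/' . basename($_GET['img']);  // basename() to avoid path tricks
$width  = (int) $_GET['width'];
$height = (int) $_GET['height'];
$thumb  = sprintf('thumbs/%dx%d_%s', $width, $height, basename($src));

if (!file_exists($thumb)) {                     // only generate on a cache miss
    [$w, $h] = getimagesize($src);
    $in  = imagecreatefromjpeg($src);           // assumes JPEG input
    $out = imagecreatetruecolor($width, $height);
    imagecopyresampled($out, $in, 0, 0, 0, 0, $width, $height, $w, $h);
    imagejpeg($out, $thumb, 85);
}
header('Content-Type: image/jpeg');
readfile($thumb);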
I think the second solution is better, but I'm a little concerned by the fact that I need to call PHP every time an image is shown; even if the thumb has already been created, it's PHP that serves it...
Thanks for your advice.
Based on the original question and subsequent comments, it sounds like on-demand generation would be suitable for you, as it doesn't sound like you have a demanding environment in terms of absolutely minimizing download time for the end client.
It seems you already have a grasp of the option of giving your <img> tags a src value that points to a PHP script, with that script either serving up a cached thumbnail if it exists, or generating it on the fly, caching it, and then serving it up. So let me give you another option.
Generally speaking, using PHP to serve up static resources is not a great idea as you begin to scale your site, because:
It requires the additional overhead of invoking PHP for these requests, something a basic web server like Apache or Nginx is much more optimized for. Your site will be able to handle less traffic per server, because it is spending extra memory, CPU, etc. to serve up this static content.
It makes it hard to move those static resources into a single repository outside the servers (such as a CDN). You end up having to duplicate your files on each and every web server powering the site.
As such, my suggestion would be to still serve the images as static files via the web server, but generate thumbnails on the fly when they are missing. To achieve this, you can create a custom rewrite rule or 404 handler on the web server, so that requests in your thumbnail directory which do not match an existing thumbnail image are redirected to a PHP script that generates the thumbnail and serves up the image (without the browser even knowing). Future requests for that thumbnail will be served as a static image.
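Here is a minimal sketch of what such a handler could look like, assuming nginx with a rule along the lines of try_files $uri /generate_thumb.php; for the /thumbs/ location, GD, and JPEG sources (all names here are illustrative, not a drop-in config):

<?php
// generate_thumb.php - hypothetical 404/rewrite handler for /thumbs/WxH/name.jpg
// nginx (assumed): location /thumbs/ { try_files $uri /generate_thumb.php; }
$uri = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
if (!preg_match('#^/thumbs/(\d+)x(\d+)/([\w.-]+\.jpg)$#', $uri, $m)) {
    http_response_code(404);
    exit;
}
[, $w, $h, $name] = $m;
$source = __DIR__ . '/uploads/' . $name;        // original, full-size image
$target = __DIR__ . "/thumbs/{$w}x{$h}/" . $name;

if (!is_dir(dirname($target))) {
    mkdir(dirname($target), 0755, true);
}
[$sw, $sh] = getimagesize($source);
$in  = imagecreatefromjpeg($source);            // assumes JPEG input
$out = imagecreatetruecolor((int) $w, (int) $h);
imagecopyresampled($out, $in, 0, 0, 0, 0, (int) $w, (int) $h, $sw, $sh);
imagejpeg($out, $target, 85);                   // write the static file for next time

header('Content-Type: image/jpeg');
readfile($target);

Because the file is written into the public thumbs directory, the web server serves it directly on every subsequent request; PHP is only ever hit on a cache miss.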
This scales quite nicely: if in the future you need to move your static images to a single server (or CDN), you can just use an origin-pull mechanism that tries to fetch the content from your main servers, which will auto-generate the images via the same mechanism I just mentioned.
Use the second option if you don't have much storage, and the first if you don't have much CPU.
Or you can combine these: generate and store the image on the first request to the PHP thumbnail generator, and on subsequent requests just serve back the cached image.
With this solution you'll have only the necessary images, and if you want, you can periodically delete the older ones.
I've been working on a project for the past few days and hit a problem displaying large quantities of images (20+ GB total, ~1-2 GB per directory) in a gallery on one area of the site. The site is built on the Bootstrap framework. I've been trying to make massive carousels, but they ultimately don't function fluidly due to the combined size of /images. Question A: In this situation do I need I/O from a database, i.e. should I store the images there? Is this faster than an /images folder on the front end?
And B) in my PHP script I need to assign the directories to variables, iterate through them, and display the images in <li> elements. But how do I go about putting controls on memory usage so as not to overload the browser? Any additions, suggestions, or alternatives would be greatly appreciated. I'm looking for the most direct means to an end here.
Though the question is a little generic, here are some thoughts regarding your two questions:
A) No, performance pulling images from a database would most likely be worse than pulling straight from the file system. In general, it is not a good idea to store images or other binary data in databases unless you absolutely have to, because databases can't do much with this information and you are just adding an extra layer on top of the file system that doesn't need to be there. You would, however, want to store paths to images in your database, potentially along with other characteristics such as image dimensions, thumbnail paths, keywords, etc. Then your application would read the entries for the images to return the correct paths to the images.
B) You will almost certainly want to implement some sort of paging if you are displaying many hundreds or thousands of photos. If the final display must be a carousel, you will want to investigate the JavaScript that drives it to determine how you could hook in a function that retrieves more results from your PHP application via an AJAX call when it reaches, or nears, the end of the current listing of images. If the browser is crashing due to too many images, you will also want to remove images from the first part of the list of <li>s when you load new ones, so that the DOM stays under control.
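As a sketch of what the server side of that could look like (table and column names here are assumptions, not from your project), a paged endpoint the carousel's JavaScript could call via AJAX:

<?php
// images.php?page=0 - return one page of image paths as JSON
$pdo     = new PDO('mysql:host=localhost;dbname=gallery', 'user', 'pass');
$perPage = 50;
$offset  = max(0, (int) ($_GET['page'] ?? 0)) * $perPage;

$stmt = $pdo->prepare('SELECT id, path FROM images ORDER BY id LIMIT ? OFFSET ?');
$stmt->bindValue(1, $perPage, PDO::PARAM_INT);
$stmt->bindValue(2, $offset, PDO::PARAM_INT);
$stmt->execute();

header('Content-Type: application/json');
echo json_encode($stmt->fetchAll(PDO::FETCH_ASSOC));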
A) It's a bad idea to store that much binary data in a database. Even if the DB allows it, you shouldn't do it: it also consumes much more memory, since all your data is stored in the database's memory space and then copied into PHP's memory space for you to handle, which eats up twice the memory, plus the overhead of running a database server, querying it, and so on. So no, using a database is slower; accessing the filesystem directly is faster. If you also use Varnish or another front-end caching system, you'll be able to serve content much faster still.
What I would do is store the files on the filesystem. The best servers for handling static serving like that are G-WAN or NGINX, but do your own reading and decide for yourself what suits you best. The point is: stay away from Apache, and ideally host all those static files on a separate server running a lightweight HTTP server.
Pro tip: save multiple copies of the same image at scaled-down sizes, for example one version at 50% and another at 25% of the original image size. This way you can send the thumbnails first for quick browsing, and then when a user decides to view an image you serve up the 50% or 100% size, depending on their screen size. You save yourself bandwidth and memory, and you also save mobile users a big 3G bill.
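A quick sketch of that tip, assuming GD and JPEG sources (the naming scheme is arbitrary):

<?php
// Save 50% and 25% copies alongside the original, e.g. photo.jpg -> photo_50.jpg
function saveScaledCopies(string $path, array $factors = [0.5, 0.25]): void
{
    [$w, $h] = getimagesize($path);
    $src = imagecreatefromjpeg($path);          // assumes JPEG input
    foreach ($factors as $f) {
        $nw  = (int) round($w * $f);
        $nh  = (int) round($h * $f);
        $dst = imagecreatetruecolor($nw, $nh);
        imagecopyresampled($dst, $src, 0, 0, 0, 0, $nw, $nh, $w, $h);
        $out = preg_replace('/\.jpe?g$/i', sprintf('_%d.jpg', $f * 100), $path);
        imagejpeg($dst, $out, 85);
    }
}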
B) This is where it makes sense to use a database: you can index all the directories into a database and use it to store the location of each image in the FS, and perhaps some tags, and maybe even the number of views, etc.
On the frontend, implement a script that fetches, for example, 50 thumbnails per page; the user can scroll around using some fancy jQuery, and when you need to fetch more, simply get a new result set with 50 more thumbs, and so on.
This way you save memory and bandwidth, and your users will thank you for such a lightweight browsing experience!
Another tip:
If you want to be able to handle bigger traffic, you might want to consider using a CDN; there are many CDN services that aren't as expensive as Amazon S3, and a simple search will give you tons of resources!
Happy hacking!
I am making a web application that needs to show 3 types of thumbnails to users. Now I might end up with a lot of thumbnail files on the server for a lot of users.
This makes me wonder: is generating thumbnails on the fly a better option than storing them?
Speed vs. storage vs. logic: which one should I go for?
Has anyone here ever faced such a dilemma? Let me know!
I am using CodeIgniter and its inbuilt Image Library for generating thumbnails.
I would go with: generate when needed, store afterwards.
Link to the image using a URL like /img/42/400x300.jpg. Through rewrite rules, you can fire up a PHP script should the image not exist. That script can then generate the requested image in the requested size and store it in the public web folder, where the web server can serve it directly the next time.
That gives you the best of both worlds: the image is not generated until needed, it is only generated once and it even makes it very flexible to work with different image sizes on the fly.
If you're worried about storage space, you can add a regular clean-up job which removes old images, or perhaps analyses your access logs and removes images which were not accessed for some time.
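Such a clean-up job could be as small as this sketch, run from cron (it assumes the filesystem records access times; many mounts use noatime, in which case fall back to mtime):

<?php
// cleanup.php - delete generated thumbnails not accessed for 30 days
$cutoff = time() - 30 * 24 * 3600;
foreach (glob(__DIR__ . '/thumbs/*/*.jpg') as $file) {
    if (fileatime($file) < $cutoff) {           // use filemtime() on noatime mounts
        unlink($file);
    }
}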
My comment as an answer: (why not :)
My personal thoughts on this: if you're anticipating a lot of users, go with storage, as the load of creating dynamic thumbnails for every one of these users on every page load is going to hurt the server. Perhaps create each thumbnail dynamically the first time it's ever viewed, and then store it.
You may also take advantage of browser caching to save load and bandwidth. (marginal but every little helps)
I'm using an image resizer to output images in my website. Is it possible to use a CDN in the current situation?
The image resizer takes the file path of the image and outputs the desired image. Now even if I use a CDN, the image resizer script is hosted on my server. So every image request is going through my server. Will using a CDN benefit me in any way?
The cached objects on CDNs are keyed on the request URI, so you should benefit from a CDN provided your application isn't generating any randomness in the URLs. If your image requests look like this:
/resizer/200x100/thing.jpg
# ...or...
/resizer/thing.jpg?size=200x100
Then the CDN will cache it for all subsequent requests.
If your server and script are quick enough, I would use your own server-side code. That way you can play around with the script if you need to add custom functions. If you run into serious problems, or want many more options of the kind a CDN may provide, then switch.
Short answer: No. If your server resizes images on-the-fly, then it still serves as a bottleneck, and the benefits of the CDN are essentially lost.
Three solutions you might be comfortable with:
Have the image resizer script run once upon image upload, creating all necessary image sizes, then store everything in a logical manner on the CDN. (Reasonable solution. Downside: very rigid; adding new image sizes requires considerable work.)
Have the image resizer script run upon request (i.e. resize a single image and upload it to the CDN), but only if the image does not already exist on the CDN. You can either keep a list of created images in a database (a rough sketch of this variant follows the list) or, preferably, if at all possible, use the <object> tag default-image technique. (Cool solution. Downside: old browsers don't like the <object> tag; non-standard, albeit valid, markup.)
Switch CDNs, use a more mature CDN service that allows you to manipulate media files via API. ex: http://cloudinary.com/ (Smooth sailing solution. Downside: Not as cheap as the non-intelligent CDNs out there, but in most cases you should hardly feel it, and it will save you a ton of coding).
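For reference, the database variant of option 2 might look roughly like this; resizeImage() and uploadToCdn() are hypothetical placeholders for your own resizer and your CDN's actual upload API, and the hostname is an example:

<?php
// Return the CDN URL for a given size, pushing the image up on first request.
function cdnUrl(PDO $pdo, string $image, int $w, int $h): string
{
    $key  = sprintf('%dx%d/%s', $w, $h, $image);
    $stmt = $pdo->prepare('SELECT 1 FROM cdn_images WHERE cdn_key = ?');
    $stmt->execute([$key]);

    if (!$stmt->fetchColumn()) {                // not on the CDN yet
        $tmp = resizeImage($image, $w, $h);     // hypothetical resizer
        uploadToCdn($tmp, $key);                // hypothetical CDN upload call
        $pdo->prepare('INSERT INTO cdn_images (cdn_key) VALUES (?)')
            ->execute([$key]);
    }
    return 'https://cdn.example.com/' . $key;   // example CDN hostname
}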
Hope this helps, I'd love to hear what solution you chose.
I am creating a social network where users upload their profile image.
This image will be used on their profile page at 150 × 150 px.
On the home page, i.e. the user activity feed, I need the same image at 75 × 75 px.
What would be the best practice to do this?
Resize the image on the fly (timthumb).
Resize and save the image on the server.
While uploading a photo, create the required set of thumbnails and save them as [image_name]_thumb_[size].jpg or similar:
uploaded: file.jpg
medium: file_thumb_150x150.jpg
small: file_thumb_75x75.jpg
The naming convention is up to you, but this way you get fast, easy access to the data you need. There's no need to have the server generate images on the fly or to scale them in the browser.
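An upload-time sketch using that convention, assuming GD and JPEG uploads:

<?php
// file.jpg -> file_thumb_150x150.jpg, file_thumb_75x75.jpg
function makeThumb(string $file, int $w, int $h): string
{
    [$sw, $sh] = getimagesize($file);
    $src = imagecreatefromjpeg($file);          // assumes JPEG input
    $dst = imagecreatetruecolor($w, $h);
    imagecopyresampled($dst, $src, 0, 0, 0, 0, $w, $h, $sw, $sh);
    $name = preg_replace('/\.jpe?g$/i', "_thumb_{$w}x{$h}.jpg", $file);
    imagejpeg($dst, $name, 85);
    return $name;
}

makeThumb('file.jpg', 150, 150);                // medium, profile page
makeThumb('file.jpg', 75, 75);                  // small, activity feed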
I've been working on this problem for a little while and have come across 3 main ways of doing this:
Generate thumbnail images at the point of upload as a background process.
Generate images on demand through the main application.
Generate images on demand using the URL as an API.
Each approach has its pros and cons.
The first approach is the most restrictive: you have to know all the uses and sizes of thumbnails up front so that you can generate them immediately after upload. The main advantage is that the images can be served efficiently by a server like nginx, just like any other static resources.
For the second approach, Django has a library called sorl-thumbnail which provides a template tag for generating thumbnails of all kinds as and when they are needed. It uses a fast key/value store to keep track of what thumbnails have been generated, and automatically invalidates stale generated images if it detects that the source image has changed. The template tag then returns the URL of the generated image, which can be served directly by nginx without going through a scripting layer. This is more flexible than option 1, but you can't (for example) generate an image URL in JavaScript and expect it to exist; it has to be done by the website's backend code or templates.
The third approach is fully dynamic and flexible: you can get whatever version of the image you want just by tweaking the URL. Amazon uses this method, as do all those placeholder-image generation websites. You can generate URLs in JavaScript and do whatever fancy things you want, and the website itself doesn't need any knowledge of the thumbnailing layer beyond maybe a few helper methods to generate the URLs for you. BUT, this is obviously the most resource-intensive way of doing things, and you need to make sure your architecture can handle the load: you need every trick in the book to invalidate caches in a timely manner, avoid unnecessary hits on the resizing script, and so on.
I'm a big fan of the third way of doing things: I like image generation to be completely isolated from my main website's functionality, and I like the considerable flexibility. But you need to know what you're doing when it comes to configuring your servers to handle it.
I'll tell you what I do. I always store the full-size image, but rename it using the DB ID with leading zeros. On first use I create the thumbnail and store it in another folder, then use that cached copy on subsequent calls.
If server space and bandwidth are an issue, you should consider using a CDN. Amazon has a good service.
Here's a bit of history first: I recently finished an application that allows me to upload images and store them in a directory; it also stores each file's information in a database. The database stores the location and name, and gives each file an ID (auto_increment).
Okay, so what I'm doing now is allowing people to insert images into posts. I'm throwing a few ideas around on the best way to do this, since the application I designed allows people to move files around, and I don't want images in posts to break when an image is moved to a different directory (hence the storing of IDs).
What I'm thinking of doing is when linking to images, instead of linking to the file directly, I link it like so:
<img src="/path/to/functions.php?method=media&id=<IMG_ID_HERE>" alt="" />
So it takes the ID, searches the database, and from there determines the MIME type and whatnot, then spits out the image.
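Roughly, the serving part would be something like this sketch (table and column names are just how I have it in mind):

<?php
// functions.php?method=media&id=123 - look up the file and stream it out
$pdo  = new PDO('mysql:host=localhost;dbname=site', 'user', 'pass');
$stmt = $pdo->prepare('SELECT path, mime FROM media WHERE id = ?');
$stmt->execute([(int) $_GET['id']]);

if (!$row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    http_response_code(404);
    exit;
}
header('Content-Type: ' . $row['mime']);
header('Content-Length: ' . filesize($row['path']));
readfile($row['path']);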
So really, my question is: Is this the most efficient way?
Note that on a single page there could be from 3 to 30 images, all making a call to this function.
Doing that should be fine as long as you are aware of the memory limits configured in both PHP and the web server. (Though you'll run into those limits merely by receiving the file in the first place.)
Otherwise, if you're strict about this being just for images, it could prove more efficient to go with Mike B's approach: set aside a static area, just drop the images in there, and record those locations in the records for their associated posts. It's less work and less to worry about... and I'm willing to bet your web server is better at serving files than most developers' custom application code will be.
Normally I would recommend keeping the src of an image static (instead of a PHP script), but if you're allowing users to move images around the filesystem, you do need a way to track them.
Some form of caching would help reduce the number of database calls required to fetch the filesystem location of each image. It should be pretty easy to put an indefinite TTL on the cache and invalidate the entry when an image is moved.
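For example, with the APCu extension (assuming a media table like the question describes):

<?php
// Cache the id -> path lookup indefinitely; drop the entry when a file moves.
function imagePath(PDO $pdo, int $id): ?string
{
    $path = apcu_fetch("img_path_$id", $hit);
    if (!$hit) {
        $stmt = $pdo->prepare('SELECT path FROM media WHERE id = ?');
        $stmt->execute([$id]);
        $path = $stmt->fetchColumn() ?: null;
        apcu_store("img_path_$id", $path);      // no TTL: indefinite
    }
    return $path;
}

// Call this from the "move file" code to invalidate the stale entry.
function invalidateImagePath(int $id): void
{
    apcu_delete("img_path_$id");
}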
I don't think you should worry about that, what you have planned sounds fine.
But if you want to go out of your way to minimise requests, you could instead do the following: when someone embeds an image in a post, replace the image tag with a special character sequence, like [MYIMAGE=1234] or something. Then, when a page with one or more posts is viewed, search through the posts for all the [MYIMAGE=...] sequences, query the database once to get all the image locations, and output the posts with the [MYIMAGE=...] sequences replaced by the appropriate image tags. You might or might not want to make sure users cannot directly add [MYIMAGE=...] tags to their submitted content.
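A sketch of that replacement pass, assuming a media(id, path) table (the names are illustrative):

<?php
// Replace [MYIMAGE=1234] with <img> tags using one query per page, not per image.
function renderPost(PDO $pdo, string $post): string
{
    preg_match_all('/\[MYIMAGE=(\d+)\]/', $post, $m);
    if (!$m[1]) {
        return $post;
    }
    $in    = implode(',', array_fill(0, count($m[1]), '?'));
    $stmt  = $pdo->prepare("SELECT id, path FROM media WHERE id IN ($in)");
    $stmt->execute($m[1]);
    $paths = $stmt->fetchAll(PDO::FETCH_KEY_PAIR);

    return preg_replace_callback('/\[MYIMAGE=(\d+)\]/', function ($t) use ($paths) {
        return isset($paths[$t[1]])
            ? '<img src="' . htmlspecialchars($paths[$t[1]]) . '" alt="" />'
            : '';                               // unknown ID: drop the tag
    }, $post);
}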
The way you have suggested will work, and it's arguably the nicest solution, but I should warn you that I've tried something similar before and it completely fell apart under load. The database seemed to be keeping up, but the script would start to time out and the image wouldn't arrive. That was probably down to some particular server configuration, but it's worth bearing in mind.
Depending on how much access you have to the server it's running on, you could just create a symlink whenever the user moves a file. It's a little messy but it'll be fast and reliable, and will also handle collisions if a user moves a file to where another one used to be.
Use the format proposed by Hammerite: [MYIMAGE=1234] tags (or something similar).
You can then fetch the ID-to-path mappings before display and replace the [MYIMAGE] tags with proper tags that link directly to the images. This will yield much better performance than serving the images through PHP.
You could even bypass the database completely, and simply use image paths like (for example) /images/hash(IMAGEID).jpg.
(If there are different file formats, use [MYIMAGE=1234.png], so you can append png/jpg/whatever without a database call)
If the need arises to change the image locations, output method, or anything else, you only need to change the method where [MYIMAGE] tags are converted to full file paths.