I have a large SVG file (approx. 60 MB, 10000x10000 pixels, but with the potential to get much larger), and I want to create many tiled 256x256 PNG images from it (in that example there would be 1600 images: ceil(10000/256)^2).
Does anyone have any idea of how to do this on a web server (running PHP amongst other things)? I thought about rsvg, but it doesn't seem to have any functionality to modify the bounding box (and I'd rather avoid doing it manually for each section). ImageMagick might be able to do it, but I've not been having much luck with getting it to work. Using rsvg to create a large PNG and then using a tool dedicated to tiling very large images might work, but I've not had any luck finding such a thing! Speed isn't really an issue, although it is desirable, so if the worst comes to the worst, I might look into modifying the SVG's bounding box per section. I could see the generation taking forever, though!
Anyone know of any methods to do this?
Edit 2016-03-02:
I recently came back to needing an answer for this question again, and Inkscape appears to be the only tool which can render SVGs for a given area at given sizes (svgexport almost meets these requirements, but it doesn't let you change the aspect ratio).
My aim was to tile an SVG into 256x256 tiles, and I've now successfully made a script which can tile an arbitrarily large SVG by doing repeated renderings in inkscape of about 16,000 x 16,000 and tiling the resulting images. I've successfully rendered SVGs where the dimensions are over 500,000 x 500,000 pixels—no problems with memory usage (it just takes a long time!)
Inkscape has a command-line mode for exporting PNGs, with an optional argument that selects the area to export:
inkscape vector.svg --export-png=raster.png --export-area=0:0:100:100
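To turn that single command into a full set of tiles, something along these lines might work. It is only a sketch: tile_commands() and the tile_COL_ROW.png naming are made up for illustration, the flags are just the two from the command above, and newer Inkscape releases may name them differently. The origin and units that --export-area uses can also depend on the Inkscape version, so it is worth checking one tile by eye before running the whole batch.

```php
<?php
// Sketch: build one Inkscape export command per tile.
// tile_commands() and the tile file naming are hypothetical; the flags
// (--export-png, --export-area) are the ones from the command above.
function tile_commands(string $svg, int $width, int $height, int $tile = 256): array
{
    $cols = (int) ceil($width / $tile);   // partial tiles at the edges still count
    $rows = (int) ceil($height / $tile);
    $commands = [];
    for ($row = 0; $row < $rows; $row++) {
        for ($col = 0; $col < $cols; $col++) {
            $x0 = $col * $tile;
            $y0 = $row * $tile;
            $x1 = min($x0 + $tile, $width);   // clamp the last column
            $y1 = min($y0 + $tile, $height);  // clamp the last row
            $png = sprintf('tile_%d_%d.png', $col, $row);
            $commands[] = sprintf(
                'inkscape %s --export-png=%s --export-area=%d:%d:%d:%d',
                escapeshellarg($svg),
                escapeshellarg($png),
                $x0, $y0, $x1, $y1
            );
        }
    }
    return $commands;
}
```

Each returned string can be passed to exec(). For a 10000x10000 image this produces 40 x 40 commands, with the last row and column covering only the remaining 16 pixels.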
I'd look at Apache Batik. In particular, their SVG Rasterizer looks like just what you need.
I've never used it for giant SVG files, though, so I don't know if it's optimized for that case or not.
Check out this question I posted earlier and got working.
If the image is only 10000x10000, the script I have in the question works best.
If, however, you want to use much bigger images, check out the script in my answer.
ImageMagick crop huge image
PanoJS seems to do what you're asking about. You need to convert the SVG to a large PNG first though (e.g. using inkscape on the command line), and then use PanoJS's tilemaker to make the tiles. It is a very memory intensive beast, but if you can get it to run successfully, you can then use the PanoJS Javascript code to point to your webserver. XKCD used it for a large image describing money.
You might want to edit the root element's attributes in a copy of your SVG so that it renders only a certain area. Set the "width" and "height" attributes to your desired tile size (256) and the "viewBox" to the desired tile area. The viewBox takes the form "min-x min-y width height", so the 3rd tile in the second row is 'viewBox="512 256 256 256"'.
You could do something like this in a loop:
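// Substitute the tile size and area; the first two viewBox numbers (x and y) change for each tile.
$sed = "sed 's/width=\"10000\"/width=\"256\"/' ".escapeshellarg($sourcefile);
$sed .= " | sed 's/height=\"10000\"/height=\"256\"/'";
$sed .= " | sed 's/viewBox=\"0 0 10000 10000\"/viewBox=\"0 0 256 256\"/'";
exec($sed." > ".escapeshellarg($tmpfile));
exec('rsvg '.escapeshellarg($tmpfile).' > '.escapeshellarg($tilefile));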
I don't know how this behaves on very large files though.
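For the loop itself, the per-tile viewBox can be computed rather than typed by hand. The sketch below does the substitution with preg_replace instead of sed, which is a swap on my part, and it assumes the root element literally contains width="10000" height="10000" viewBox="0 0 10000 10000". Both function names are hypothetical. Note that it holds the whole SVG in a PHP string, which may or may not be acceptable for a 60 MB file.

```php
<?php
// viewBox is "min-x min-y width height", so only the first two values
// change from tile to tile. tile_viewbox() is a hypothetical helper.
function tile_viewbox(int $col, int $row, int $tile = 256): string
{
    return sprintf('%d %d %d %d', $col * $tile, $row * $tile, $tile, $tile);
}

// Rewrite the first matching width/height/viewBox attributes for one tile.
// Assumes the source SVG declares exactly the 10000x10000 values below.
function svg_for_tile(string $svg, int $col, int $row, int $tile = 256): string
{
    // The lookbehind avoids touching attributes such as stroke-width.
    $svg = preg_replace('/(?<![\w-])width="10000"/', "width=\"$tile\"", $svg, 1);
    $svg = preg_replace('/(?<![\w-])height="10000"/', "height=\"$tile\"", $svg, 1);
    return preg_replace(
        '/viewBox="0 0 10000 10000"/',
        'viewBox="' . tile_viewbox($col, $row, $tile) . '"',
        $svg,
        1
    );
}
```

The result of svg_for_tile() would be written to the temporary file and handed to rsvg as in the snippet above.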
I am currently working on an issue where we are seeing high CPU usage on a particular host when converting images using Imagick. The issue is pretty much perfectly described here:
https://github.com/ResponsiveImagesCG/wp-tevko-responsive-images/issues/150 (I don't use that particular library, but I DO use the same responsive images classes they do, and I am timing out on that particular line, only for some images).
They seem to suggest that removing the call to ->posterizeImage() will fix their issue, and in my tests it does, I can't even tell any difference in the converted images. But this worries me because I wonder if there is a difference that I am not seeing, or one that only comes up in certain scenarios (I mean if posterizing an image didn't do anything there wouldn't be a method for it, right?). I see online that it 'Reduces the image to a limited number of color level' (136 levels in the case causing an issue for me, for what it's worth). I'm having some difficulty parsing that though, which I think is related to a poor grasp of the way various image formats store data (really it doesn't go past the idea that an image is broken up into pixels, which are broken up into proportions of red green and blue).
What actual visual differences could I expect to see if we stop posterizing images? Is it something that I would only expect in certain types of image (like, would it be more visible in transparent over non-transparent, or warmer coloured images)? Or that would be more evident in certain display styles (like print, or the warmer colour temp in iPhone displays)?
Basically I am looking for the info to make an informed choice on whether it's safe to comment out. I'm not worried if it means some images might be x Kb larger, but if it will make them look poor quality, or distort them in some way (even in corner cases) then I need to consider other options.
From the ImageMagick command line documentation:
-posterize levels
reduce the image to a limited number of color levels per channel.
Very low values of levels, e.g., 2, 3, 4, have the most visible effect.
There is a bit more info in the Color Quantization examples - it also has some example images:
The operators original purpose (using an argument of '2') is to re-color images using just 8 basic colors, as if the image was generated using a simple and cheap poster printing method using just the basic colors. Thus the operator gets its name.
An argument of '3' will map image colors based on a colormap of 27 colors, including mid-tone colors. While an argument of '4' will generate a 64 color colortable, and '5' generates a 125 color colormap.
Essentially, it reduces the number of colors used in the image and, by extension, the file size. Using a level of 136 would not have much visible effect, as this translates to a colortable of 2,515,456 colors (136^3).
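To make the arithmetic concrete, here is a small sketch. palette_size() just cubes the level count, and posterize_channel() snaps one 8-bit channel value to the nearest of N evenly spaced levels. That second function is the usual textbook form of posterization and is my assumption for illustration; it is not taken from ImageMagick's source, so the exact rounding may differ.

```php
<?php
// Number of distinct RGB colours available with $levels levels per channel.
function palette_size(int $levels): int
{
    return $levels ** 3;
}

// Snap one 8-bit channel value (0-255) to the nearest of $levels evenly
// spaced values. Textbook posterization, assumed here for illustration.
function posterize_channel(int $value, int $levels): int
{
    $step = 255 / ($levels - 1);
    return (int) round(round($value / $step) * $step);
}

foreach ([2, 3, 4, 5, 136] as $levels) {
    printf("%3d levels -> %d colours\n", $levels, palette_size($levels));
}
```

With 2 levels every channel collapses to 0 or 255, which is the poster look. With 136 levels, under this formula, no channel value moves by more than 1 out of 255, which fits the observation that the converted images look the same.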
It is also worth noting, from the commit for the issue you linked, that this isn't even always an effective way of reducing image size:
... it turns out that posterization only improves file sizes
for PNGs and can actually lead to slightly larger file sizes for
JPG images.
Posterisation is a reduction of the amount of colour information stored in an image - as such, it is really a decrease in quality. It's hard to imagine how stopping doing this could be detrimental. And, if it turns out later that there is/was a legitimate reason for doing it, you can always do it later because if you stop doing it now, you will still have all the original information.
If it was the other way around, and you started to introduce posterisation and later found out it was undesirable for some reason, you would no longer be able to get the original information back.
So, I would see no harm in stopping posterising. And the fact that I have written that, kind of challenges anyone who knows better to speak up and tell me I am wrong :-)
I've been searching about this for a while but I didn't find what I wanted, so here is my problem:
using PHP,
I want to create a very big image file, let's say 20000 gigapixels, and then add a small image at a specific location on this big image. My computer doesn't have enough RAM to load the entire image and manipulate pixels that way, so I think I need to access the image data on the hard disk and manipulate it there somehow. Does anyone know how to do this?
thanks for helping me out :)
ImageMagick supports operations on very large files. I don't see support for this in the PHP ImageMagick API, but you could call out (exec) to the command-line program and use one of its disk caching or streaming options.
There is some documentation for dealing with large files here: www.imagemagick.org.
What would you do with an image that size? You couldn't serve it to a browser, and even if you did manage to load it into the server, it would take up all the server resources, so you wouldn't be able to use the server for anything else in the meantime.
The short answer is that handling an image of that kind of scale as a single file in RAM is out of the question unless you've got an extremely powerful machine dedicated to it, and nothing else. At 20k x 20k pixels, even a simple monochrome image at one byte per pixel is going to take 400 MB. Scale that up to any useful colour depth, and you're talking about gigabytes of RAM just to hold the graphic, and that's before we even start thinking about actually doing stuff with it.
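The estimate is easy to reproduce. This is a back-of-the-envelope sketch, assuming uncompressed pixels, 64-bit PHP integers, and a hypothetical helper name:

```php
<?php
// Uncompressed size in bytes of a $width x $height image.
// raw_image_bytes() is a hypothetical helper for illustration.
function raw_image_bytes(int $width, int $height, int $bytesPerPixel): int
{
    return $width * $height * $bytesPerPixel;
}

// 8-bit greyscale: one byte per pixel.
printf("%d bytes\n", raw_image_bytes(20000, 20000, 1));
// 8-bit RGBA: four bytes per pixel.
printf("%d bytes\n", raw_image_bytes(20000, 20000, 4));
```

Image libraries often use more than 8 bits per channel internally, so the real footprint can be higher than these figures.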
I guess the solution is to look at what other people do, given the same problem.
Real applications that use images of that scale (eg mapping apps or panorama photos like this one) store their image as a series of much smaller blocks. Each block is a smaller image in its own right. They'd also usually have separate sets of blocks for each zoom level too. Handling a single massive image file is implausible for any realistic server environment, but smaller chunks make it easy to handle for both browser and server. The server just sends the blocks to the user that are in the current view; when the user scrolls or zooms, they get sent more blocks.
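As a rough illustration of the block idea, this is how a server could work out which tiles a given view needs. visible_tiles() is a hypothetical helper; it assumes square tiles indexed from (0, 0) at the top-left corner and non-negative pixel coordinates.

```php
<?php
// Which tiles does a viewport touch? Returns [col, row] pairs.
// Hypothetical helper; assumes non-negative coordinates.
function visible_tiles(int $x, int $y, int $w, int $h, int $tile = 256): array
{
    $firstCol = intdiv($x, $tile);
    $lastCol  = intdiv($x + $w - 1, $tile);   // last pixel column in view
    $firstRow = intdiv($y, $tile);
    $lastRow  = intdiv($y + $h - 1, $tile);   // last pixel row in view
    $tiles = [];
    for ($row = $firstRow; $row <= $lastRow; $row++) {
        for ($col = $firstCol; $col <= $lastCol; $col++) {
            $tiles[] = [$col, $row];
        }
    }
    return $tiles;
}
```

A 100x100 view starting at pixel (200, 200) straddles a tile boundary in both directions, so it needs four tiles, while the same view at (0, 0) needs only one.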
Your question mentions adding a smaller image to a specific location on the big one. Again, looking at how others do this, Google Maps and others handle this kind of thing using a layering system. The layers are built up and sent to the browser separately.
I know that doesn't directly answer the question, but I hope it gives you some options to think about.
Just keep a plain file, not an image, and store the pixel data in it in a custom format. PHP has an fseek function, which lets you jump to any location in the file, so you can calculate the needed location and read or write there. If the image is W x H pixels, stored row by row, and each pixel takes 3 bytes, then the address of pixel (X, Y) in the file is (W * Y + X) * 3.
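A minimal sketch of that idea follows. The function names and the "array of row strings" format for the small image are my own assumptions; the offset formula is the one given above. Only one row of the small image is written at a time, so the big canvas never has to fit in memory.

```php
<?php
// Byte offset of pixel (x, y) in a raw file of $bpp-byte pixels, row by row.
function pixel_offset(int $x, int $y, int $width, int $bpp = 3): int
{
    return ($width * $y + $x) * $bpp;
}

// Create a blank raw canvas without holding it in memory: ftruncate()
// extends the file to the requested size and the new bytes read as zero.
function create_canvas(string $path, int $width, int $height, int $bpp = 3): void
{
    $fh = fopen($path, 'wb');
    ftruncate($fh, $width * $height * $bpp);
    fclose($fh);
}

// Paste a small raw image (an array of rows, each a binary string of
// $bpp-byte pixels) at ($left, $top), seeking once per row.
function paste_rows(string $path, array $rows, int $left, int $top, int $width, int $bpp = 3): void
{
    $fh = fopen($path, 'r+b');
    foreach ($rows as $i => $row) {
        fseek($fh, pixel_offset($left, $top + $i, $width, $bpp));
        fwrite($fh, $row);
    }
    fclose($fh);
}
```

The file is not a viewable image on its own; to look at it you would still have to convert it (or a cropped region of it) into a real image format afterwards.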
I use imagemagick to create thumbnails from images on my website using convert like so: convert -size 220x220 %s -resize 220 -profile '*' %s", $image, $thumb and this has worked great for a long time. Thousands of images have been processed and all the thumbnails look great ... except for one. For some reason this image produces a very ugly thumbnail and I can't figure out why.
Original image: http://i.imgur.com/fCbAN.jpg
Generated Thumbnail: http://i.imgur.com/MdLCs.jpg
Does anyone have any insight as to why this might happen with my convert code?
The thumbnail has been saved with very low quality (approximately 10-15, 99 being close to lossless). I think the question is, "why did that happen".
I can think of some reasons, but you will have to experiment. I assume the images you posted are the real images (not copies done converting e.g. PNG to JPG, I mean), and the command line is complete and describes the complete image workflow.
1. Your ImageMagick setup attempts to keep the estimated image quality. You do not set a quality explicitly (e.g. -quality 75), so the thumbnail gets the same quantizer setting as the source image. Suppose the source has a low quantizer, but you do not see it due to the high-frequency component (the image is "noisy" due to scanning). When resampling, the background loses its noise and becomes a smooth gradient, which was absent in the source. And a smooth gradient is hell on low quantizers. Try explicitly setting a quality factor (40 to 99; 40 is better compressed but chunkier, 99 is very high quality but gives a bigger file).
2. There is some kind of interference between the resampler and the Moiré pattern that the scanner creates in the acquired image. This is less likely, because I see a "wavelength" of about 8 pixels, which isn't at all uncommon, and I doubt that among the many images you acquired, none had approximately the same size and aspect ratio as this one; in this scenario those ought to have triggered the same behaviour. You say it didn't happen, so unless this image is unusual in size, aspect ratio, or source (e.g. one of the very few images in the batch scanned with a Scan-o-matic 600 scanner), this scenario is pretty unlikely. But if it is correct, then adding a Gaussian blur before resizing ought to fix things: e.g. -blur 2x2.
3. There is bad juju in the file name, and for some reason this gets the ImageMagick wrapper to interpret a command of "set quantizer to its crappiest value". REALLY unlikely: if the interpreter interprets a part of the filename as an option, it shouldn't interpret it as a filename, and the rest of the filename is no longer the true filename, resulting in a "File not found" error which we don't observe. All the same, if the original file name is something like "--progressive-swedish-music.jpg", try renaming it before thumbnailing.
I'm putting my money on option #1, anyway.
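Testing option #1 only takes one extra argument. A sketch, where thumbnail_command() is a hypothetical wrapper and the default of 85 is just a starting point I picked, not a recommendation from the ImageMagick docs; everything else is the command from the question:

```php
<?php
// Build the thumbnail command with an explicit JPEG quality.
// thumbnail_command() is a hypothetical wrapper; the convert options are
// the ones from the question plus -quality, as suggested in option #1.
function thumbnail_command(string $image, string $thumb, int $quality = 85): string
{
    return sprintf(
        "convert -size 220x220 %s -resize 220 -quality %d -profile '*' %s",
        escapeshellarg($image),
        $quality,
        escapeshellarg($thumb)
    );
}
```

Run the result through exec() for the problem image at a few quality values and compare the thumbnails. If the blockiness disappears, the inherited quality setting was the culprit.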
Another test which you could attempt is to run the same command from ImageMagick (command line) and not from PHP.
I am currently working on a PHP application which is run from the command line to optimize a folder of images.
The PHP application is more of a wrapper for other image optimizers: it simply iterates over the directory, grabs all the images, and runs each image through the appropriate program to get the best result.
Below are the Programs that I will be using and what each will be used for...
imagemagick to determine file type and convert non-animated gif's to png
gifsicle to optimize Animated Gif images
jpegtran to optimize jpg images
pngcrush to optimize png images
pngquant to optimize png images to png8 format
pngout to optimize png images to png8 format
My problem: With 1-10 images, everything runs smooth and fairly fast however, once I run on a larger folder with 10 or more images, it becomes really slow. I do not really see a good solution around this but one thing that would help is to avoid re-processing images that have already been Optimized. So if I have a folder with 100 images and I optimize that folder and then add 5 new images, re-run the optimizer. It then has to optimize 105 images, my goal is to have it only optimize the 5 newer images since the previous 100 would have already been optimized. This alone would greatly improve performance when new images are added to the image folder.
I realize the simple solution would be to simply copy or move the images to a new folder after processing them, my problem with that simple solution is that these images are used for the web and websites, so the images are generally hard-linked into a websites source code and changing the path to the images would complicate that and possibly break it sometimes.
Some ideas I have had: first, write some kind of text file database to the image folders that lists all the images that have already been processed, so when the application is run, it only works on images that are not in that file already. A second idea was to change the file name to include some kind of identifier showing that it has been optimized. A third idea is to move each optimized file to a final destination folder once it is optimized. Ideas 2 and 3 are not good, though, because they would break all image path links in the websites' source code.
So if you can think of a decent solution to this problem, please share.
Meta data
You could put a flag in the meta info of each image after it is optimized. First check for that flag and only proceed if it's not there. You can use exif_read_data() to read the data. Writing it maybe like this.
The above is for JPGs. Metadata for PNGs is also possible; take a look at this question, and this one.
I'm not sure about GIFs, but you could definitely convert them to PNGs and then add metadata... although I'm pretty sure they have their own meta info, since meta data extraction tools allow GIFs.
Database Support
Another solution would be to store information about the images in a MySQL database. This way, as you tweak your optimizations you could keep track of when and which optimization was tried on which image. You could pick which images to optimize according to any parameters of your choosing. You could build an admin panel for this. This method would allow easy experimentation.
You could also combine the above two methods.
Maximum File Size
Since this is for saving space, you could have the program only work on images that are larger than a certain file size. Ideally, after running the compressor once, all the images would be below this file size, and after that only newly added images that are too big would be touched. I don't know how practical this is in terms of implementation, since it would require that the compressor gets any image below some arbitrary files size. You could make the maximum file size dependent on image size.....
The easiest way would most likely be to look at the time of the last change for each image. If an image was changed after the last run of your script, you have to run it on this particular image.
The timestamp of when the script was last run could easily be saved in a short text file.
A thought that comes to mind is to mix the simple solution with a more complicated one. When you optimize an image, move it to a separate folder. When a request comes in for the original image folder, have your .htaccess file capture it and route it to a script that checks whether the same image exists in the optimized folder; if not, optimize, move, then proceed.
I know I said simple solution, and this is a slightly complicated one, but the nice part is that it provides a scalable approach to your issue.
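The timestamp approach could look roughly like this. It is a sketch under a few assumptions: the stamp file name (.last_optimized) and both function names are made up, and mark_run() has to be called after the optimizers finish, because optimizing an image changes its modification time.

```php
<?php
// Return the images in $dir that were modified after the previous run.
// The stamp file name (.last_optimized) is an assumption of this sketch.
function images_needing_work(string $dir, string $stampFile = '.last_optimized'): array
{
    $stampPath = $dir . '/' . $stampFile;
    $lastRun = is_file($stampPath) ? (int) file_get_contents($stampPath) : 0;
    $todo = [];
    foreach (scandir($dir) as $name) {
        $ext = strtolower(pathinfo($name, PATHINFO_EXTENSION));
        $path = $dir . '/' . $name;
        if (in_array($ext, ['jpg', 'jpeg', 'png', 'gif'], true)
            && filemtime($path) > $lastRun) {
            $todo[] = $path;
        }
    }
    return $todo;
}

// Record the time of this run; call it after the optimizers have finished.
function mark_run(string $dir, string $stampFile = '.last_optimized'): void
{
    file_put_contents($dir . '/' . $stampFile, (string) time());
}
```

One edge case: an image added within the same second that mark_run() executes would be skipped, since timestamps here only have one-second resolution.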
Edit: One more thing
I like the idea of a MySQL database because you can add a level of security (not all images can be viewed by everyone), if that's a need, of course. It also makes your hard-coded links problem less of a problem, since all links point to a single file which retrieves the images from the db, and the only things that change are the generated GET variables. This way your project becomes significantly more scalable, and a design change becomes easier.
Sorry this is late, but since there is a way to address this issue without creating any files, storing any data of any kind or keeping track of anything. I thought I'd share my solution of how I address things like this.
Setup an idempotent solution that efficiently optimizes images without dependencies that require keeping track of its current status.
This allows for a truly portable solution that can work in a new environment, an environment that somehow lost its tracker, or an environment that is sensitive as to what files you can actually save in there.
Although metadata might be the first source you'd think to check for this information, it's true that in some cases it will not be available and the nature of metadata itself is arbitrary, like comments, they can come and go and not affect the image in any way. We want something more concrete, something that is a definite descriptor of the asset at hand. Ideally you would want to "identify" if one has been optimized or not, and the way to do that is to review the image to see if it has been based on its characteristics.
When you optimize an image, you are providing different options of all sorts in order to reach the final state of optimization. These are the very traits you will also check to come to the conclusion of whether or not it had been in fact optimized.
Let's say we have a function in our script called optimize(path = ''), and let's assume that part of our optimization does the following:
$ convert /path/to/image.jpg -depth 8 -quality 87 -colors 255 -colorspace sRGB ...
Note that these options are ones that you choose to specify, they will be applied to the image and are properties that can be reviewed later...
$ identify -verbose /path/to/image.jpg
Image: /path/to/image.jpg
Format: JPEG (Joint Photographic Experts Group JFIF format)
Mime type: image/jpeg
Geometry: 1250x703+0+0
Colorspace: sRGB <<<<<<
Depth: 8-bit <<<<<<
Channel depth:
Red: 8-bit
Green: 8-bit
Blue: 8-bit
Channel statistics:
Pixels: 878750
Image statistics:
Rendering intent: Perceptual
Gamma: 0.454545
Transparent color: none
Interlace: JPEG
Compose: Over
Page geometry: 1250x703+0+0
Dispose: Undefined
Iterations: 0
Compression: JPEG
Quality: 87 <<<<<<
Number pixels: 878750
As you can see here, the output quite literally has everything I would want to know to determine whether or not I should optimize this image or not, and it costs nothing in terms of a performance hit.
When you are iterating through a list of files in a folder, you can do so as many times as you like without worrying about over-optimizing the images or keeping track of anything. You would simply keep only the extensions you want to optimize (e.g. .bmp, .jpg, .png), then check their stats to see if they already possess the attributes your function would apply to the image in the first place. If an image has the same values, skip it; if not, optimize it.
If you want to get extremely efficient, you would check each attribute of the image that you plan on optimizing, and in your optimization command you would only apply the options that have not yet been applied to the image.
This technique is obviously meant to show an example of how you can accurately determine whether or not an image needs to be optimized. The actual options I have listed above are not the complete scope of elements that can be chosen. There are a variety of available options to choose from, and you can apply and check for as many as you want.
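In PHP, the check could be sketched as follows. Both function names are hypothetical, and the parser is deliberately naive: it keeps every "Key: value" line and lets later duplicates overwrite earlier ones. The target values mirror the convert example above.

```php
<?php
// Pull "Key: value" pairs out of `identify -verbose` output.
// Naive by design: section headers without a value are skipped and
// repeated keys overwrite each other.
function parse_identify(string $output): array
{
    $props = [];
    foreach (explode("\n", $output) as $line) {
        if (preg_match('/^\s*([^:]+):\s*(\S.*)$/', $line, $m)) {
            $props[trim($m[1])] = trim($m[2]);
        }
    }
    return $props;
}

// True when at least one target setting is missing or different.
function needs_optimizing(array $props, array $target): bool
{
    foreach ($target as $key => $want) {
        if (($props[$key] ?? null) !== $want) {
            return true;
        }
    }
    return false;
}

// The settings our optimize() step applies, as identify reports them.
$target = ['Colorspace' => 'sRGB', 'Depth' => '8-bit', 'Quality' => '87'];
```

In a real script the output would come from something like shell_exec('identify -verbose ' . escapeshellarg($path)), and optimize() would only be called when needs_optimizing() returns true.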
I'm new to PHP, but I'm fairly certain this is possible to do. I have a bunch of images on my server, and I'd like to give them all a thick black border. I know I could use CSS, but I'd rather the border was real. My images are all the same size, so it's nothing more than centering the server image onto this black box image, then merging them together and re-saving the server image.
I could technically do this in Photoshop too, but there's a ton of images...
If I could shrink the image after I'm done, that'd be nice too. They are a bit larger than I need.
Take a look at ImageMagick; see some examples here: http://www.imagemagick.org/script/examples.php
You can call it via exec() from PHP
e.g. aligning two images next to each other, adding a border around each of them:
$cmd = 'montage image1.jpg image2.jpg -tile x1 -border 5 -geometry +5+5 result.jpg';
The GD2 library also gives generally good results if ImageMagick isn't installed on your server.
See some tutorials here: http://www.roseindia.net/tutorial/php/phpgd/
Some say that ImageMagick gives better results in many cases, but GD2 may suit what you need just fine and is fairly easy to use. Hopefully, it matches the use case you're describing.
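With GD, the "centre it on a black box" approach from the question might look like the sketch below. Both function names are hypothetical, and the code needs the GD extension to be loaded.

```php
<?php
// Size of the bordered canvas and where the original image lands on it.
function bordered_geometry(int $w, int $h, int $border): array
{
    return [
        'width'  => $w + 2 * $border,
        'height' => $h + 2 * $border,
        'x'      => $border,
        'y'      => $border,
    ];
}

// Draw $src centred on a black canvas using GD, then optionally shrink.
// add_black_border() is a hypothetical helper; it needs the GD extension.
function add_black_border($src, int $border, float $scale = 1.0)
{
    $g = bordered_geometry(imagesx($src), imagesy($src), $border);
    $dst = imagecreatetruecolor($g['width'], $g['height']);
    imagefill($dst, 0, 0, imagecolorallocate($dst, 0, 0, 0));
    imagecopy($dst, $src, $g['x'], $g['y'], 0, 0, imagesx($src), imagesy($src));
    if ($scale !== 1.0) {
        // imagescale() keeps the aspect ratio when only a width is given.
        $dst = imagescale($dst, (int) round($g['width'] * $scale));
    }
    return $dst;
}
```

For each file you would load it with imagecreatefromjpeg(), pass it through add_black_border(), and write it back with imagejpeg(). It is probably wise to try this on copies first, since re-saving a JPEG recompresses it.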