Is there a reliable way of detecting noise or artifacts in an image consisting of text and pictures (a page from a PDF file), without harming the text or the "real" pictures, i.e. removing only the noise, specks, blotches, etc.?
In general, there is no reliable, non-destructive way -- that would mean asking a computer program to "magically" know what is noise and what is not. However, there are methods that get close in practice.
One commonly applied method which is reasonably simple and often not very destructive is a small-radius (3-5) median filter. A median filter is good at removing scratches and "wrong pixel" (salt-and-pepper) noise.
Another noise-reducing method would be a bilateral filter, which in layman's terms is basically a blur that respects edges and features.
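A minimal sketch of both filters with OpenCV in Python (the kernel size and sigma values are just starting points, and the file name is a placeholder):

    import cv2

    img = cv2.imread("page.png")  # placeholder input

    # Small-radius median filter: removes isolated specks / "wrong pixel" noise
    median = cv2.medianBlur(img, 3)  # kernel size must be odd, e.g. 3 or 5

    # Bilateral filter: blurs flat regions while preserving strong edges (text strokes)
    bilateral = cv2.bilateralFilter(img, 5, 50, 50)  # (diameter, sigmaColor, sigmaSpace)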
Yet another method to detect and filter noise would be akin to the technique Pixar used in their "wavelet noise" algorithm:
downsample the image (e.g. by one mip level, a.k.a. 1/2 in every direction)
subtract the downsampled image from the original (implicitly upsampling again)
what remains, the difference, is the detail that couldn't be represented in the lower-resolution image, and is thus largely noise (a sketch of this follows below)
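A rough sketch of that downsample-and-difference idea in Python with OpenCV (the 1/2 factor and the threshold of 40 are arbitrary choices, and the file name is a placeholder):

    import cv2

    img = cv2.imread("page.png")  # placeholder input
    h, w = img.shape[:2]

    small = cv2.resize(img, (w // 2, h // 2), interpolation=cv2.INTER_AREA)  # downsample
    up = cv2.resize(small, (w, h), interpolation=cv2.INTER_LINEAR)           # upsample again
    residual = cv2.absdiff(img, up)   # detail that the half-resolution image lost

    # Strong residuals are candidate noise; thin text strokes will also show up,
    # so in practice you would combine this with size/shape filtering.
    noise_mask = cv2.cvtColor(residual, cv2.COLOR_BGR2GRAY) > 40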
There's no statistical method that will remove noise exactly. You could, however, use super-resolution GANs: create synthetic data and train your model to map noisy images to clean ones.
I am working at the moment on an issue where we are seeing CPU usage problems on a particular host when converting images using Imagick. The issue is pretty much perfectly described here:
https://github.com/ResponsiveImagesCG/wp-tevko-responsive-images/issues/150 (I don't use that particular library, but I DO use the same responsive images classes they do, and I am timing out on that particular line, only for some images).
They seem to suggest that removing the call to ->posterizeImage() will fix their issue, and in my tests it does; I can't even tell any difference in the converted images. But this worries me, because I wonder if there is a difference that I am not seeing, or one that only comes up in certain scenarios (I mean, if posterizing an image didn't do anything, there wouldn't be a method for it, right?). I see online that it 'reduces the image to a limited number of color levels' (136 levels in the case causing an issue for me, for what it's worth). I'm having some difficulty parsing that, though, which I think is related to a poor grasp of the way various image formats store data (really, it doesn't go past the idea that an image is broken up into pixels, which are broken up into proportions of red, green and blue).
What actual visual differences could I expect to see if we stop posterizing images? Is it something that I would only expect in certain types of image (like, would it be more visible in transparent over non-transparent, or warmer coloured images)? Or that would be more evident in certain display styles (like print, or the warmer colour temp in iPhone displays)?
Basically I am looking for the info to make an informed choice on whether it's safe to comment out. I'm not worried if it means some images might be x Kb larger, but if it will make them look poor quality, or distort them in some way (even in corner cases) then I need to consider other options.
From the ImageMagick command line documentation:
-posterize levels
reduce the image to a limited number of color levels per channel.
Very low values of levels, e.g., 2, 3, 4, have the most visible effect.
There is a bit more info in the Color Quantization examples - it also has some example images:
The operator's original purpose (using an argument of '2') is to re-color images using just 8 basic colors, as if the image was generated using a simple and cheap poster printing method using just the basic colors. Thus the operator gets its name.
...
An argument of '3' will map image colors based on a colormap of 27 colors, including mid-tone colors. While an argument of '4' will generate a 64 color colortable, and '5' generates a 125 color colormap.
Essentially it reduces the number of colors used in the image - and by extension the size. Using a level of 136 would not have much visible effect, as this translates to a 2,515,456-color colortable (136^3).
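To make the arithmetic concrete, posterization snaps each channel value to the nearest of "levels" evenly spaced values; a rough Python/NumPy sketch (not ImageMagick's exact rounding) might look like this:

    import numpy as np

    def posterize(img, levels):
        """Snap each channel value (0-255) to the nearest of `levels` evenly spaced values."""
        step = 255.0 / (levels - 1)
        return np.uint8(np.round(np.round(np.asarray(img, dtype=float) / step) * step))

    # levels=2 leaves 2^3 = 8 possible colors (the classic "poster" look);
    # levels=136 leaves 136^3 = 2,515,456 colors and is visually almost indistinguishable.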
It is also worth noting, from the commit for the issue you linked, that this isn't even always an effective way of reducing image size:
... it turns out that posterization only improves file sizes
for PNGs and can actually lead to slightly larger file sizes for
JPG images.
Posterisation is a reduction of the amount of colour information stored in an image - as such, it is really a decrease in quality. It's hard to imagine how stopping doing this could be detrimental. And, if it turns out later that there is/was a legitimate reason for doing it, you can always do it later because if you stop doing it now, you will still have all the original information.
If it was the other way around, and you started to introduce posterisation and later found out it was undesirable for some reason, you would no longer be able to get the original information back.
So, I would see no harm in stopping posterising. And the fact that I have written that, kind of challenges anyone who knows better to speak up and tell me I am wrong :-)
I'm trying to work on an algorithm that will morph one "shape" into another "shape". Both shapes are arbitrary, and may even have smaller, disjointed shapes too.
The basic idea I have so far is as follows: locate the edges of the shape, place points all along those edges, then do the same with the target image, then move the points to their targets.
Here's an illustration:
I just don't know where to start. The image above is a simplification, actual use case has more complex shapes/outlines. My main problem is: How do I handle disjoint shapes? The best I can come up with is to figure out the closest point between the two pieces, and join them together as part of the path. But how would I implement this?
I don't have any code yet, I'm still at the planning phase for this. I guess what I'm asking for is if anyone can link me to any resources that may help, or give any pointers. Searching Google has yielded some interesting morph algorithms, but they all deal with full images and involve breaking the image into pieces to reshape them, which is not what I'm looking for.
Note that this will be used in JavaScript, but could be precomputed in PHP instead if it's easier.
It's best to break the problem into multiple smaller problems which can be solved independently. That way you also end up with independent, reusable pieces of functionality after solving this problem, which can be added to some global module collection.
First we need to figure out which pixel in the from_shape goes to which pixel in the to_shape. We can figure that out with the following method:
Place to_shape over from_shape.
For every pixel in from_shape, find its closest to_shape pixel.
Every pixel in a shape must have a unique id; that id can be, for instance, its x,y location.
Now you can record each unique pixel in from_shape, and which unique pixel it goes to in to_shape.
Delete the overlaid shapes and go back to the original ones; now each pixel in from_shape knows its destination in to_shape.
We also need to know which 'siblings' each pixel has.
A sibling is a pixel that lies right next to another pixel.
To find them, go to a given pixel and collect all pixels within a radius of one; those that are black are the from-pixel's siblings. This information is necessary to keep the shape as a single unit when the pixels travel to their destination. Skipping the siblings would substantially speed up and simplify the morph, but without them the shape might become fragmented during the morph. You might want to begin with a sibling-less version and see how that goes.
And finally we implement the morph:
Pick a morph_time_duration.
For each pixel in from_shape, find the distance to its destination in to_shape.
That distance, divided by morph_time_duration, is the speed of the pixel during the morph.
Also, the angle towards destination is the angle to travel in.
So now you have speed and angle.
So at each frame in the morphing procedure, a given from-pixel knows which direction to travel in, its speed, and its siblings. In each frame, just draw the pixel in its new location, after it has traveled at its speed in its direction, and then draw a line to all of that pixel's siblings.
And that will display your morph.
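A minimal sketch of the mapping and interpolation steps in Python (NumPy plus SciPy's KD-tree; the sibling bookkeeping is left out, and the shapes are assumed to be boolean masks):

    import numpy as np
    from scipy.spatial import cKDTree

    def shape_pixels(mask):
        """Return an (N, 2) array of (x, y) coordinates of the 'on' pixels of a shape mask."""
        ys, xs = np.nonzero(mask)
        return np.column_stack([xs, ys])

    def build_mapping(from_mask, to_mask):
        """For every pixel in from_shape, find its nearest pixel in to_shape."""
        src = shape_pixels(from_mask)
        dst = shape_pixels(to_mask)
        _, idx = cKDTree(dst).query(src)   # index of the closest to_shape pixel
        return src, dst[idx]

    def morph_frames(from_mask, to_mask, n_frames=30):
        """Yield one point cloud per frame, moving each pixel linearly toward its destination."""
        src, dst = build_mapping(from_mask, to_mask)
        for f in range(n_frames + 1):
            t = f / n_frames
            yield (1.0 - t) * src + t * dst   # round to ints when drawing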
I've found a demonstration (using Raphael.js) of outline morphing and motion tweening in JavaScript, showing how Raphael.js can be used to morph one curve into another curve.
Also, this related question (about shape tweening in JavaScript) may contain some answers that are relevant to this question.
The MorpherJS library may also be suitable for this purpose. Some demonstrations of outline morphing with MorpherJS can be found here.
Doing that won't be very easy, but I can give you a couple of starting points. If you want a plain JavaScript implementation, a great starting point would be:
http://raphaeljs.com/animation.html
which is doing exactly what you want. So you can check what methods are invoked and browse through the library source for those methods to see the implementation.
If you instead need to morph 2 images in PHP, I would suggest you use some sort of an extension and not do that in plain PHP. Here is an example using ImageMagick to do it:
http://www.fmwconcepts.com/imagemagick/shapemorph2/index.php
If you want to know more about the internals of it:
http://web.mit.edu/manoli/www/ecimorph/ecimorph.html#algo
Hope one of those helps.
The short answer, if you're trying to roll your own: it's not a straightforward task. There's plenty of math out there on these topics that performs these very transformations (the most common you'll find deal with the most common shapes, obviously), but that may or may not be accessible to you, and it won't be as easy to figure out how to do the non-standard transformations.
If you're just looking for a logical approach, here's where I'd start (not having done the math in years, and not having studied the inner workings of the graphics libraries linked):
Choose a distance, measured in whatever units make sense, pixels perhaps.
Identify each continuous edge in each shape. Pick an arbitrary point on one edge for each shape (say, on a plane where (0,0) represents the upper left corner, the edge point on each shape closest to (0,0)), and align your separate shapes on that point. For the purposes of your transformation, that point will remain static and all other points will conform to it.
If your shape has two or more distinct edges, order them by perimeter length. Consider the shorter lengths to be subordinate to the longer lengths. Use a similar process as in step 2 to pick an arbitrary point to connect these two edges together.
Starting at each of your chosen points, place points along your edges at the interval of the distance you selected in step 1.
(left as an exercise for the reader) conform your points on your disparate edges together and into the target shape, aligning, reducing or adding points on the edges as necessary.
Alternatively, you could select an arbitrary number of points instead of an arbitrary distance, and just spread them appropriately along the edges at whatever distance they will fit, and then conform those points together.
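For placing those points, either at a fixed interval or as a fixed count spread along the edge, the usual trick is to walk the outline by arc length; here is a small sketch in Python/NumPy, assuming the edge has already been traced into an ordered list of vertices:

    import numpy as np

    def resample_outline(points, n_points):
        """Resample a closed outline (ordered (x, y) vertices) into n_points evenly spaced points."""
        pts = np.asarray(points, dtype=float)
        closed = np.vstack([pts, pts[:1]])   # close the loop
        seg = np.diff(closed, axis=0)
        cum = np.concatenate([[0.0], np.cumsum(np.hypot(seg[:, 0], seg[:, 1]))])
        targets = np.linspace(0.0, cum[-1], n_points, endpoint=False)
        return np.column_stack([
            np.interp(targets, cum, closed[:, 0]),
            np.interp(targets, cum, closed[:, 1]),
        ])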
Just throwing some ideas out there, I don't honestly know how deep the problem goes.
I run a website with thousands of user-contributed photos on it. What I'd like is a script to help me weed out poor photos from good photos. Obviously this isn't 100% possible, but it should be possible to determine if an image has no discernable focussed area? I think?
I did a bit of googling and couldn't find much on the subject.
I've written a very simple script that iterates over the pixels, and sums the difference in brightness between neighbouring pixels. This gives a high value for sharp contrasty images, and a low value for blurred/out of focus images. It's far from ideal though, as if there's a perfectly focussed small subject in the frame, and a nice bokeh background, it'll give a low value.
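Roughly, the script boils down to something like this (a simplified Python/NumPy equivalent; the file name is a placeholder):

    import numpy as np
    from PIL import Image

    gray = np.asarray(Image.open("photo.jpg").convert("L"), dtype=float)

    # Sum of absolute brightness differences between horizontally and vertically
    # adjacent pixels: high for sharp, contrasty images, low for blurred ones.
    score = np.abs(np.diff(gray, axis=1)).sum() + np.abs(np.diff(gray, axis=0)).sum()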
So I think what I want is a script that can determine if a part of an image is well-focussed, and if none is then to alert me?
Any bright ideas? Am I wasting my time?
I'd be interested in any code that can determine other sorts of "bad" photos too - too dark, too light, too flat, that sort of thing.
Too dark and too light are easy - calculate a colour average as you iterate through every pixel.
For your focus issue, I think you're going to run into a lot of problems with this one. I would strongly recommend looking up kernel convolution, as I have a sinking feeling that you'll need it. This allows you to perform more complex operations on pixels based on neighbors - and is how most Photoshop filters are done!
Once you've got the maths background to do it, what I would do is convert your image to an array of single brightness values (i.e. greyscale) instead of RGB. From there, use an edge-finding kernel (the Sobel operator should do the trick) and find the edges. Once that is done, iterate over it again, mapping the areas with no edges, and calculate the largest square area without an edge from this. It is probably the least CPU-intensive solution, though not the most esoteric.
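A minimal sketch of that pipeline in Python (Pillow plus SciPy; the block size is arbitrary, and the thresholds you would compare these numbers against have to be tuned on your own photos):

    import numpy as np
    from PIL import Image
    from scipy import ndimage

    def analyze(path, block=64):
        """Rough quality heuristics: brightness, contrast, and a per-block sharpness score."""
        gray = np.asarray(Image.open(path).convert("L"), dtype=float)

        brightness = gray.mean()   # too dark / too light check
        contrast = gray.std()      # "too flat" check

        # Sobel gradient magnitude as an edge/sharpness map
        gx = ndimage.sobel(gray, axis=1)
        gy = ndimage.sobel(gray, axis=0)
        edges = np.hypot(gx, gy)

        # A photo with at least one well-focused region should have some block
        # with strong edges, even if the rest is bokeh.
        h, w = gray.shape
        best = 0.0
        for y in range(0, h - block + 1, block):
            for x in range(0, w - block + 1, block):
                best = max(best, edges[y:y + block, x:x + block].mean())

        return {"brightness": brightness, "contrast": contrast, "sharpest_block": best}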
What statistical averaging method should I use when I have an image with an N-sized sample of selections?
I have a unique problem for which I was hoping to get some advice, so that I don't miss out on anything.
The Problem: To find the most favored/liked/important area on an image based on user selection of areas in different selection ratios.
Scenario: Consider an image of a dog, and hundreds of users selecting areas over this image at various resolutions; the obvious area of focus in most selections will be the area containing the dog. I can record the x1,x2,y1,y2 coordinates and put them into a DB. Now, if I want to automatically generate versions of this image in a set of resolutions, I should be able to recognize the area with the maximum attraction for users.
The methods I think could work are:
Find the average center point of all selections and base the crop on that. Very simple, but it would not be as accurate.
Use some algorithm like K-Means or EM clustering, but I don't know which one would be best suited.
Looking forward to some brilliant solution to my problem
More info on the problem:
The Actual image will be most probably be a 1024x768 image, and the selections made on it will be of the most common mobile phone resolutions. The objective is to automatically generate mobile phone wallpapers by intelligent learning based on user selections.
I believe that you have two distinct problems identified above:
ONE: Identification of Points
For this, you will need to develop some sort of heuristic for identifying whether a point should be considered or not.
I believe you mentioned that hundreds of users will be selecting locations over this image? Hundreds may be a lot of points to cluster. Consider excluding outliers by removing points which do not have a certain number of neighbors within a particular distance.
Anything you can do to reduce your dataset will be helpful.
TWO: Clustering of Points
I believe that K Means Clustering would be best suited for this particular problem.
Your particular problem seems to closely mirror the standard Cartesian coordinate clustering examples used in explaining this algorithm.
What you're trying to do (finding an optimal clustering) is NP-hard in general, but the classical approximations should satisfy your needs.
Once clustered, you can take an average of the points within that cluster for a rather accurate approximation.
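A small sketch with scikit-learn's KMeans on the selection centers (the coordinates below are made up, and the number of clusters is a guess you would tune):

    import numpy as np
    from sklearn.cluster import KMeans

    # Hypothetical rows of (x1, y1, x2, y2) selections pulled from the database
    selections = np.array([
        [100, 80, 420, 320],
        [120, 90, 400, 310],
        [110, 70, 430, 330],
        [600, 400, 700, 500],   # an outlier selection
    ])

    # Cluster the centers of the selections
    centers = np.column_stack([
        (selections[:, 0] + selections[:, 2]) / 2.0,
        (selections[:, 1] + selections[:, 3]) / 2.0,
    ])
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(centers)

    # The most favored region: the mean center of the largest cluster
    labels, counts = np.unique(km.labels_, return_counts=True)
    best_center = km.cluster_centers_[labels[np.argmax(counts)]]
    print("crop around", best_center)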
In Addition:
Your dataset sounds like it will already be tightly clustered (i.e. most people will pick the dog's face, not the side of its torso), but you need to be aware of local minima. These can really throw a wrench into your algorithm, especially with a small number of clusters. Be aware that you may need a bit of dynamic programming to combat this; you can usually introduce some variance into your algorithm, allowing the average points to "pop out" of these local minima.
Hope this helps!
I think you might be able to approach your problem in a different way. If you have not heard of Seam Carving then I suggest you check it out, because the data you have available to use is perfectly suited to it. The idea is that instead of cropping an image to resize it, you can instead remove paths of pixels that are not necessarily in a straight line. This allows you to resize an image while retaining more of the 'interesting' information.
Ordinarily you choose paths of least energy, where energy here is some measurement of how much the hue/intensity changes along the path. This will fail when you have regions of an image that are very important (like a dog's face), but where the energy of those regions is not necessarily very high. Since you have user data indicating what parts of the image are very important you can make sure to carve around those regions of the image by explicitly adding a little energy to a pixel every time someone selects a region with that pixel.
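A sketch of such an energy map in Python (NumPy plus SciPy; the selection format and the boost value are assumptions):

    import numpy as np
    from scipy import ndimage

    def energy_map(gray, selections, boost=50.0):
        """Gradient-magnitude energy with extra energy inside user-selected boxes."""
        gx = ndimage.sobel(gray, axis=1)
        gy = ndimage.sobel(gray, axis=0)
        energy = np.hypot(gx, gy)
        for x1, y1, x2, y2 in selections:   # assumed (x1, y1, x2, y2) box format
            energy[y1:y2, x1:x2] += boost   # protect popular regions from being carved
        return energy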
This video shows seam carving in action, it's cool to watch even if you don't think you'll use this. I think it's worth trying, though, I've used it before for some interesting resizing applications, and it's actually pretty easy to implement.
I need to store in a database pieces of text and the size of the layer that will wrap the text in the browser screen (I cannot build the layer in the client side - it is a requirement of the project). The whole piece of text must fit into the square layer properly. Users will be able to use different font size and font families from a small group of options and we'll know the selection at the time of computation.
Right now, I compute the layer size based on the assumption that every character is font-size pixels high and 50% of the font-size wide. With the 50% value, I got the best approximation for the average cases, but it is still not a good solution because it cuts off pieces of text or leaves too much blank space at the end. And it's even worse with some wider font types.
Any idea on how to approach this problem?
Given how easy it is for users to override font choices in the browser, at best this would only be a "best guess" attempt, but you could use GD to draw the text and then compute a bounding box for it using imagettfbbox().
But this only supports simple text, so if you're doing "complicated" stuff with boldface/italics, variable sizes, etc.. then you're SOL for the most part.
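If PHP/GD isn't a hard requirement, the same measurement can be sketched with Python's Pillow (the font file, size and text are placeholders; the browser must render with the exact same font, e.g. served via @font-face, for the numbers to hold):

    from PIL import ImageFont

    font = ImageFont.truetype("DejaVuSans.ttf", size=16)   # placeholder font and size

    text = "The whole piece of text must fit into the layer."
    left, top, right, bottom = font.getbbox(text)
    print("width:", right - left, "height:", bottom - top)

    # For wrapped text you would measure word by word, breaking lines whenever the
    # accumulated width exceeds the layer width, and sum the line heights.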
Given the requirements, the only way to do this 100% guaranteed is to:
Render the image on the server -- this can be a normal raster image, a vector format such as SVG, or even (more extreme) Flash -- and use the generated output on the client in a form which guarantees precise rendering. Any CSS rendering or local-font behavior may differ from the server-generated values and is thus unreliable!
The exact dimensions of the box are thus known independently of local font rendering issues, as they depend only upon the server's "view".
Happy coding.