I have a problem to solve but couldn't find a solution.
I need to compare an original image with a photo of the same image. The function should return true if they show the same image and false if they don't.
The photo may have a different size than the original, and if the photo shows only part of the original, it should still be matched to the original.
Can I use a normal face detection library, or do you have a better solution to this problem?
Thanks
There are several ways you could approach this problem. If you want to know whether two images are EXACTLY the same, you could compare the files themselves: an md5 comparison tells you whether they are byte-for-byte the same file. This won't work for comparing what the images actually show, though.
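As a minimal sketch of the exact-file check described above (the function name `filesAreIdentical` is made up for illustration), something like this should do:

```php
<?php
/**
 * Returns true only when both files are byte-for-byte identical.
 * filesAreIdentical() is a hypothetical helper name, not a library function.
 */
function filesAreIdentical(string $pathA, string $pathB): bool
{
    // Files of different sizes can never match, so skip hashing in that case.
    if (filesize($pathA) !== filesize($pathB)) {
        return false;
    }
    return hash_file('md5', $pathA) === hash_file('md5', $pathB);
}
```

A resized or re-compressed copy of the same picture will return false here, which is why the content-based approaches are needed for photos.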
If you want to compare the actual contents of the pictures, I suggest taking a look at PHP's GD library.
After some googling around I found a nice blog entry here about comparing the similarity of images. It's a good read.
A good first step when comparing photos with GD is to make the images the same size. The size should be small enough to compare quickly, so I'd say somewhere around 16x16. You can then compare RGB values, shapes, etc.
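A rough sketch of that idea with GD, assuming both images are already loaded (for example with imagecreatefromjpeg()). The helper names `thumbnailPixels` and `meanColorDifference` are invented for this example, and the threshold you accept as "similar" is something you would have to tune yourself:

```php
<?php
/** Shrinks a GD image to $size x $size and returns its pixels as [r, g, b] triples. */
function thumbnailPixels($image, int $size = 16): array
{
    $thumb = imagecreatetruecolor($size, $size);
    imagecopyresampled($thumb, $image, 0, 0, 0, 0, $size, $size, imagesx($image), imagesy($image));

    $pixels = [];
    for ($y = 0; $y < $size; $y++) {
        for ($x = 0; $x < $size; $x++) {
            $rgb = imagecolorat($thumb, $x, $y);
            $pixels[] = [($rgb >> 16) & 0xFF, ($rgb >> 8) & 0xFF, $rgb & 0xFF];
        }
    }
    return $pixels;
}

/** Mean absolute difference per colour channel: 0 means identical, 255 is the maximum. */
function meanColorDifference($imageA, $imageB, int $size = 16): float
{
    $pixelsA = thumbnailPixels($imageA, $size);
    $pixelsB = thumbnailPixels($imageB, $size);

    $sum = 0;
    foreach ($pixelsA as $i => $pixel) {
        for ($channel = 0; $channel < 3; $channel++) {
            $sum += abs($pixel[$channel] - $pixelsB[$i][$channel]);
        }
    }
    return $sum / (count($pixelsA) * 3);
}
```

Because both images are reduced to the same small grid first, a resized copy should score close to 0. A photo that shows only part of the original will not, so this covers the "different size" case but not the "partial image" case.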
Some other libraries I should point you to are libpuzzle and ImageMagick, both of which make it pretty easy to compare images in PHP. The documentation is pretty bad, though, so it may require a lot more googling and actual testing. Good luck!
I want to search for an image in MySQL by uploading an image.
I have tried with hashes, RGB values, and hex values, but I am not getting results as accurate as Google's.
For example, if you search for Koala.jpg, you should see only photos related to Koala.jpg. With the approaches above I am getting other photos too.
Are there any suggestions for how I can search for an image in the database by uploading an image?
Thanks in advance.
Introduction
As I understand it, your question is: how can I search for similar images when a user uploads an image?
Before I go into details, I will give a rough introduction to the topic.
To search or analyze images on a computer, we first need to convert the image into a numeric representation. Once we have that, almost any kind of comparison becomes possible.
There are different similarity search algorithms, and it is a very hot topic: a lot of researchers are working on improving the techniques and developing better ones.
Depending on your requirements, there are a lot of factors involved, such as how big the files are, how fast you expect the results, whether the search runs while the person is uploading, how many images should be processed at a time, etc.
Google has a lot of powerful servers and a lot of machine learning, which lets it compare images with almost no delay.
IMO you need to gather some theoretical background, which will help you a lot in understanding the process.
Some links with information regarding my explanation:
https://en.wikipedia.org/wiki/Reverse_image_search
https://en.wikipedia.org/wiki/Google_Images#Search_by_image
scientific paper http://ai.stanford.edu/~gal/Research/OASIS/
I am pretty sure that with a bit more searching on Google you can find a lot of theoretical resources.
Now back to your question.
IMO the following libraries/classes will solve your problem.
Libpuzzle is a library, usable from PHP through an extension, for finding similar pictures
(https://www.pureftpd.org/project/libpuzzle)
PHP Compare Images Similarity is a PHP class
(http://www.phpclasses.org/package/8255-PHP-Compare-two-images-to-find-if-they-are-similar.html)
I will leave some links that might help you as well. That said, you have a few steps to reach your goal:
Start by testing these libraries to see which one fits you best
Then test it by uploading and comparing images
A few more links:
Image similarity comparison
Image comparison - fast algorithm
Good way to identify similar images?
Find similar images in (pure) PHP / MySQL
http://nekkidphpprogrammer.blogspot.dk/2014/01/not-all-bits-are-created-equal.html
Algorithm for finding visually similar photos from a database?
My users are uploading images to my website, and I would like to offer them already uploaded images first. My idea is to
1. create some kind of image "hash" of every existing image
2. create a hash of newly uploaded image and compare it with the other in the database
I have found some interesting solutions like http://www.pureftpd.org/project/libpuzzle or http://phash.org/ etc., but they have one or more problems:
they need some nonstandard extension to PHP (or are not in PHP at all) - it would be OK for me, but I would like to create it as a plugin to my popular CMS, which is used on many hosting environments that I do not control.
they compare two images, but I need to compare one to many (e.g. thousands), and doing it one by one would be very inefficient / slow ...
...
I would be OK with finding only VERY similar images (e.g. a different size, a resaved JPG, or a different JPG compression factor).
The only idea I have is to resize the image to e.g. 5px*5px with 256 colors, create a string representation of it, and then search for an identical string. But I guess resizing may create tiny color differences even between two copies of the same image at different sizes, so searching only for 100% identical strings would be useless.
So I would need a good string representation of the image that could then be used with some SQL function to find similar ones, or some other nice way. E.g. pHash creates perceptual hashes, so when two hashes are close, the images should be close as well, and I would just need to find the closest distances. But it is again an external library.
Is there any easy way?
I've had this exact same issue before.
Feel free to copy what I did, and hopefully it will help you / solve your problem.
How I solved it
My first idea, which failed and is similar to what you may be thinking, was to build a string for every single image at its full size. I quickly found that this fills your database super fast and wasn't effective.
The next option (which works) was a smaller image (like your 5px idea). I did exactly that, but with 10px*10px images. I created the 'hash' for each image with the imagecolorat() function.
See php.net here.
After reading the RGB colours of each pixel, I rounded them to the nearest 50 so that the colours were less specific. That number (50) is what you want to change depending on how specific you want your searches to be.
For example:
// Pixel RGB
rgb(105, 126, 225) // Original
rgb(100, 150, 250) // After rounding numbers to nearest 50
After doing this to every pixel (10px*10px gives you 100 rgb() values), I put them into an array and stored it in the database, encoded with serialize() and base64_encode().
When searching for similar images, I ran the exact same process on the image being uploaded, then pulled the image 'hashes' from the database and compared them all to see which had matching rounded RGB values.
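The process above could look roughly like the following. This is my reading of the description, not the original code: the function names are invented, the resize is done with GD's imagecopyresampled() rather than phpThumb, and scoring by the share of matching pixels is an assumption about how "matching rounded rgb's" were counted.

```php
<?php
/** Rounds one colour channel (0-255) to the nearest multiple of $step. */
function roundChannel(int $value, int $step = 50): int
{
    return (int) (round($value / $step) * $step);
}

/** Builds the list of rounded [r, g, b] values for a $size x $size thumbnail. */
function imageSignature($image, int $size = 10, int $step = 50): array
{
    $thumb = imagecreatetruecolor($size, $size);
    imagecopyresampled($thumb, $image, 0, 0, 0, 0, $size, $size, imagesx($image), imagesy($image));

    $signature = [];
    for ($y = 0; $y < $size; $y++) {
        for ($x = 0; $x < $size; $x++) {
            $rgb = imagecolorat($thumb, $x, $y);
            $signature[] = [
                roundChannel(($rgb >> 16) & 0xFF, $step),
                roundChannel(($rgb >> 8) & 0xFF, $step),
                roundChannel($rgb & 0xFF, $step),
            ];
        }
    }
    return $signature;
}

/** Share of pixels (0.0 to 1.0) whose rounded colours are equal in both signatures. */
function matchingPixelRatio(array $signatureA, array $signatureB): float
{
    $matches = 0;
    foreach ($signatureA as $i => $pixel) {
        if ($pixel === $signatureB[$i]) {
            $matches++;
        }
    }
    return $matches / count($signatureA);
}
```

The value to store would then be `base64_encode(serialize(imageSignature($image)))`, and a stored row can be turned back into an array with `unserialize(base64_decode($row))` before calling matchingPixelRatio().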
Tips
The bigger that 50 is in the RGB rounding, the less specific your search will be (and vice versa).
If you want your SQL to be more specific, it may be better to store extra info about the image in the database so that you can limit the rows you pull from it, e.g. if the aspect ratio is 4:3, only pull images around 4:3 from the database.
It can be difficult to resize images to exactly the target size (e.g. 5px*5px), so a suggestion is phpThumb. I used it with the syntax:
phpthumb.php?src=IMAGE_NAME_HERE.png&w=10&h=10&zc=1
// &w= width of your image
// &h= height of your image
// &zc= zoom-crop. 0: keep aspect ratio, 1: crop to fill your width+height
Good luck mate, hope I could help.
For an easy PHP implementation, check out: https://github.com/kennethrapp/phasher
However, I wonder if there is a native MySQL function for "compare" (see the PHP class above).
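On the MySQL side, one option is to compare perceptual hashes by Hamming distance (the number of differing bits), which MySQL can compute with BIT_COUNT() over an XOR. The sketch below assumes a 64-bit hash stored in a BIGINT UNSIGNED column; the table name `images`, the column name `phash`, and the helper `hammingDistance` are all hypothetical and not part of the phasher library.

```php
<?php
/** Counts the differing bits between two hex-encoded hashes of equal length. */
function hammingDistance(string $hexA, string $hexB): int
{
    if (strlen($hexA) !== strlen($hexB)) {
        throw new InvalidArgumentException('Hashes must have the same length.');
    }
    $distance = 0;
    for ($i = 0, $n = strlen($hexA); $i < $n; $i++) {
        // XOR one hex digit (4 bits) at a time and count the set bits.
        $xor = hexdec($hexA[$i]) ^ hexdec($hexB[$i]);
        $distance += substr_count(decbin($xor), '1');
    }
    return $distance;
}

// Hypothetical schema: table `images` with a BIGINT UNSIGNED column `phash`.
// :hash is the uploaded image's hash as a 16-character hex string.
$sql = 'SELECT id, BIT_COUNT(phash ^ CAST(CONV(:hash, 16, 10) AS UNSIGNED)) AS distance
        FROM images
        HAVING distance <= :max_distance
        ORDER BY distance';
```

This still scans every row, so for large tables you may want to narrow the candidates first (for example by aspect ratio) before computing distances.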
I scale the image down to 8x8, then convert RGB to 1-byte HSV, so the resulting hash is a 172-byte string.
HSVHSVHSVHSVHSVHSVHSVHSV... (from 8x8 block, 172 bytes long)
0fff0f3ffff4373f346fff00...
It's not 100% accurate (some duplicates aren't found), but it works nicely and there appear to be no false positives.
Put in academic terms, what you are looking for is a similarity function which takes two images and returns an indicator of how similar the two images are. This indicator could be a decimal number ranging from -1 to 1 (far apart to very close). Once you have this function, you can set one image as a reference and compare all the other images against it. Finding the images similar to a given one is then as simple as finding the closest similarity values, which can be done with a simple search over a double field in an RDBMS like MySQL.
Now all that remains is how to define the similarity function. To be honest, this is problem-specific: it depends on what you call similar. But covariance is usually a good starting point; it just needs your two images to be the same size, which I think is no big deal. You can find lots of other ideas by searching for 'similarity measures between two images'.
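To get a value in the -1 to 1 range mentioned above, the covariance can be divided by the two standard deviations, which gives the Pearson correlation. A minimal sketch, assuming each image has already been resized to the same dimensions and flattened into a list of grayscale values (the function name is made up):

```php
<?php
/**
 * Pearson correlation of two equally long lists of pixel intensities.
 * Returns a value between -1 and 1, or 0.0 if either list is constant.
 */
function pixelCorrelation(array $pixelsA, array $pixelsB): float
{
    $n = count($pixelsA);
    if ($n === 0 || $n !== count($pixelsB)) {
        throw new InvalidArgumentException('Both lists must be non-empty and equally long.');
    }
    $meanA = array_sum($pixelsA) / $n;
    $meanB = array_sum($pixelsB) / $n;

    $covariance = $varianceA = $varianceB = 0.0;
    for ($i = 0; $i < $n; $i++) {
        $deltaA = $pixelsA[$i] - $meanA;
        $deltaB = $pixelsB[$i] - $meanB;
        $covariance += $deltaA * $deltaB;
        $varianceA += $deltaA * $deltaA;
        $varianceB += $deltaB * $deltaB;
    }
    if ($varianceA == 0.0 || $varianceB == 0.0) {
        return 0.0; // A flat image has no variation, so the correlation is undefined.
    }
    return $covariance / sqrt($varianceA * $varianceB);
}
```

One useful property of this measure is that a uniformly brighter or darker copy of an image still correlates strongly with the original, because the mean is subtracted before comparing.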
Suppose there is an image on the web without a watermark, and someone downloads it and makes some edits to it, like adding a watermark. Is it possible to write a script in PHP to compare these two images? When I submit the two images to the script, it should be able to tell which is the original image and which is the manipulated one.
I read Google's webmaster page, which says:
Google often finds multiple copies of the same image online. We use many different signals to identify the original source of the image
This is the main concern of my question
One more question: are there any meta tags inside an image? If so, how can I read them, and is it possible to edit them? Is there any (non-visual) information inside an image which cannot be edited?
Anything within the image can be edited (it is, after all, just a collection of bytes), and it's definitely trivial for someone to add a watermark to an image, or simply change the contrast ever-so-slightly, to make it a very different file from the original. There are several other non-destructive changes that would make image files look completely different to a naive comparison algorithm (e.g., scaling, changing filetypes and compression, changing brightness, rotation, etc.).
Advanced image processing algorithms, however, can still often identify similarities between images that have been manipulated in ways like those above. There are many algorithms to do this, and honestly you could spend thousands of hours trying to roll an algorithm like this yourself. These sorts of algorithms are referred to as "content-based image retrieval."
You might be better off calling into an engine that has already been developed to do exactly this. Here are some possibilities:
TinEye has a RESTful API that you can use, described here.
You could scrape the response from Google's Search by Image results using this technique.
You could use any of the suggestions in this slightly older Stack Overflow post.
Good luck!
Photos taken by digital cameras usually have EXIF data embedded.
You can get the data with the exif_read_data function in PHP.
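A small sketch of reading that data, assuming the exif extension is enabled. The wrapper name `readExif` is made up, and which keys are present (such as `Model` or `Software`) depends entirely on the file:

```php
<?php
/**
 * Returns the EXIF data of an image file as an array,
 * or an empty array if nothing could be read.
 */
function readExif(string $path): array
{
    if (!function_exists('exif_read_data')) {
        return []; // The exif extension is not available on this server.
    }
    // exif_read_data() emits a warning and returns false for unsupported files.
    $data = @exif_read_data($path);
    return $data === false ? [] : $data;
}

// Example usage with a hypothetical file name:
// $exif = readExif('photo.jpg');
// echo $exif['Model'] ?? 'camera model not recorded';
```

Keep in mind that this metadata is easy to strip or rewrite, so it can hint at which copy is the original but cannot prove it.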
As for identifying similar images, here are some useful resources:
TinEye
SO Q on image similarity
The comments on Resig's article
You could submit both images to ImageEdited and see which one has been edited. Even if the EXIF data is missing, it can tell when an image was created with a program.
I am aware that there is another question like mine, but I think mine is a bit different after all.
I have to be able to establish if the images are very similar or entirely different...
Have a look at the following two images:
The first image is a bit lighter than the second image. You can see that on the black-striped fish in the middle.
So comparing the md5 hashes doesn't really help. Is there any other clever way to do it?
Thanks!
Try this function:
http://www.php.net/manual/en/function.imagick-compareimages.php
You will need to google for usage, since the documentation seems to be empty...
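Since the manual page gives little guidance, here is a minimal usage sketch, assuming the imagick extension is installed. The function name `imagickDifference` is made up, and the choice of mean squared error as the metric is mine; other Imagick::METRIC_* constants exist.

```php
<?php
/**
 * Returns the mean squared error between two image files.
 * 0.0 means the pixels are identical; larger values mean more difference.
 */
function imagickDifference(string $pathA, string $pathB): float
{
    $imageA = new Imagick($pathA);
    $imageB = new Imagick($pathB);

    // compareImages() expects equal dimensions, so scale the second image if needed.
    $width = $imageA->getImageWidth();
    $height = $imageA->getImageHeight();
    if ($imageB->getImageWidth() !== $width || $imageB->getImageHeight() !== $height) {
        $imageB->resizeImage($width, $height, Imagick::FILTER_LANCZOS, 1);
    }

    // The result is [difference image, numeric distortion]; only the number is used here.
    [, $error] = $imageA->compareImages($imageB, Imagick::METRIC_MEANSQUAREERROR);
    return (float) $error;
}
```

For two images that differ only slightly in brightness, the result should be small but not zero, so you would pick a threshold by testing on your own pictures.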
I am not sure if it would help, but I think running the images through GD image processing could help you here.
This way is useful when the files are byte-for-byte identical:
$img1 = md5(file_get_contents($image1));
...
if($img1 == $img2){
..
}
Try this. Someone wrote open source code for it.
http://compareimages.nikhazy-dizajn.hu/
Compare Images PHP Class:
This PHP class compares two images and returns a number representing how similar they are. It can tell whether two pictures are similar even if they have different sizes or aspect ratios. A smaller number means the images are more similar. A number greater than 10 means they are most likely not the same image.
Users are uploading photos to our PHP-based system. We mark some of them as forbidden because their content is not relevant. I'm looking to optimize an 'AUTO-COMPARE' algorithm that skips photos already marked as forbidden. Every upload needs to be compared against many forbidden ones.
Possible solutions:
1/ Store forbidden files and compare whole content - works well but is slow.
2/ Store image file checksum and compare the checksums - this is the idea to improve the speed.
3/ Any intelligent algorithm which is fast enough and can compare the similarity between photos. But I don't have any ideas about these in PHP.
What is the best solution?
Don't calculate checksums, calculate hashes!
I once created a simple application that had to look for duplicate images on my hard disk. It would only search for .JPG files, but for every file I would calculate a hash value over the first 1024 bytes, then append the width, height and size of the image to it to get a string like "875234:640:480:13286", which I would use as the key for the image.
As it turns out, I haven't seen any false duplicates with this algorithm, although there still is a chance of false duplicates.
However, this scheme will miss duplicates when someone just adds one byte to the file or makes very small adjustments to the image.
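In PHP, a key of that shape could be built roughly as follows. This is a sketch rather than a port of the Delphi tool: the function name is invented, and md5 is used simply because any hash method would do.

```php
<?php
/**
 * Builds a duplicate-detection key of the form "hash:width:height:filesize".
 * Only the first 1024 bytes are hashed, so this stays fast for large files.
 */
function duplicateKey(string $path): string
{
    $head = file_get_contents($path, false, null, 0, 1024);
    $info = getimagesize($path);
    if ($head === false || $info === false) {
        throw new RuntimeException("Could not read image: $path");
    }
    return sprintf('%s:%d:%d:%d', md5($head), $info[0], $info[1], filesize($path));
}
```

Storing this key in an indexed column lets the database reject an exact re-upload with a single lookup instead of comparing the upload against every forbidden file.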
Another trick is to reduce the size and number of colors of every image. If you resize every image to 128x128 pixels and reduce the number of colors to 16 (4 bits), you end up with reasonably unique patterns of 8192 bytes each. Calculate a hash value over this pattern and use the hash as the primary key. Once you get a hit, you might still have a false positive, so you would need to compare the pattern of the new image with the pattern stored in your system.
This pattern compare could be used if the first hash solution indicates that the new image is unique. It's something that I still need to work out for my own tool, though. But it's basically a kind of taking fingerprints of images and then comparing them.
My first solution will find exact matches. My second solution would find similar images. (Btw, I wrote my hash method in Delphi but technically, any hash method would be good enough.)
Image similarity comparison isn't exactly a trivial problem, so unless you really want to devote a lot of effort to image comparison algorithms, your idea of creating some sort of hash of the image data and comparing that will at least allow you to quickly detect exact duplicates. I'd go with your current plan, but make sure it's a decent (but fast) hash so that the likelihood of collisions is low.
The problem with hashes, as suggested, is that if someone changes 1 pixel the hash turns out completely different.
There are excellent frameworks out there that are able to compare the contents of files and return (as a percentage) how much they look alike. There is one in particular, a command-line app I once came across, which was built in a scientific environment and was open source, but I can't remember its name.
This kind of framework could definitely help you out, since they can be extremely fast, even with a large number of files.
Upload the image to IPFS and store the CID. Every CID is unique to the file's content. Store thumbnails locally.
To give a more relevant answer than my first, I suggest the Google Vision API for image recognition (google it haha), or writing a simple script to see what Google Lens says about an item.