I’m trying to extract some numbers ranging from 1-99 from a picture. I’ve tried several OCR methods using PHP, but eventually my script fails, because the numbers are occasionally rotated 5% to the left or right, which makes them unrecognizable.
I’ve now installed Ocropus http://code.google.com/p/ocropus/ as a test. Unfortunately this is not giving me the correct numbers every time. This leads me to think that my pictures are not optimized enough.
Does anyone have tips or ideas on how to improve the readability of the numbers? I would also be grateful for ideas on how to locate the numbers in the picture.
It seems that Tesseract / Ocropus are getting confused by the skew, and it could be that several skewed numbers on the same line make it worse.
Are you passing in the whole image as a grid of numbers? Have you tried sending each box (number) to the OCR engine individually, as a separate image? You may find you get better results.
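As a rough illustration of the "one box per image" idea, here is a minimal Python sketch that computes crop boxes for a regular grid and cuts out a single cell. It assumes the numbers sit in an evenly spaced grid and that the image is already available as a 2D list of pixel values; the function names and the `margin` parameter are made up for this example, and a real image library's crop call would replace `crop`.

```python
def cell_boxes(width, height, rows, cols, margin=0):
    """Return (left, top, right, bottom) boxes for an evenly spaced rows x cols grid.

    margin trims that many pixels from every side of each cell,
    which helps keep the grid lines out of the crop.
    """
    boxes = []
    for r in range(rows):
        for c in range(cols):
            left = c * width // cols + margin
            top = r * height // rows + margin
            right = (c + 1) * width // cols - margin
            bottom = (r + 1) * height // rows - margin
            boxes.append((left, top, right, bottom))
    return boxes


def crop(pixels, box):
    """Crop a 2D list of pixel values (list of rows) to the given box."""
    left, top, right, bottom = box
    return [row[left:right] for row in pixels[top:bottom]]
```

Each cropped cell would then be saved and passed to the OCR engine on its own, so a skewed neighbour cannot disturb the line detection.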
Have you tried any other OCR engines? Do you require it to be open source?
I ran the image through a cheaper commercial OCR engine and all numbers were recognised correctly. So another option is to wrap a commercial OCR engine with a little C# or C++ code and call that to get better results.
Is it acceptable to use an external (web-based) API for your solution?
If so, please consider http://www.wisetrend.com/wisetrend_ocr_cloud.shtml (a REST API for OCR)
It can automatically correct for image rotation; try tweaking the Deskew and AnalysisMode parameters described in http://www.wisetrend.com/WiseTREND_Online_OCR_API_v2.0.htm
(Also, when using the API, make sure that the image resolution is correctly set in the input image header - it can make all the difference in recognition quality).
I'm trying to read scanned QR codes from PHP, running zbarimg via exec. It works reasonably well.
The issue is that it seems to choke on scanning artifacts like these small dots:
I've been trying to get rid of the white dots syndrome by fiddling around with Imagick - changing brightness/contrast/sharpness seems to make them stand out less but some, like this one, are still unreadable.
Is there a way to remove the white dots / improve zbarimg's recognition?
Edit:
One thing I forgot to point out:
What strikes me as weird is that scanning the QR code with a smartphone camera reads it successfully in an instant, without a single issue, which leads me to think this "fixing up" shouldn't even be needed.
Am I just using zbar the wrong way?
Or do mobile OSes just use a different, better algorithm? I tried using a zxing wrapper for PHP as well, but it gave even fewer results than zbar.
In terms of cleaning up the image you have shown us, the obvious approach would be to use cellular automata, although for best results you would want to modify the behaviour to encompass the sharpening and thresholding you are already applying with other filters. You might consider setting the cell size to, say, one 25th of the QR code block resolution rather than 1:1 with the pixels in the underlying image. Ideally you would apply your thresholding via a histogram-based approach (assuming that you can isolate the QR code in the image).
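To make the cellular-automaton idea concrete, here is a minimal Python sketch of one possible rule: a majority vote over each cell's 3x3 neighbourhood, applied to an already thresholded grid of 0 (black) and 1 (white). The rule and the function name are assumptions chosen for illustration, not something prescribed above.

```python
def despeckle_step(grid):
    """Apply one majority-vote step to a binary grid (list of rows of 0/1).

    Each cell takes the value held by the majority of its 3x3
    neighbourhood (itself included); ties fall back to 0. Isolated
    dots disappear, while larger solid areas survive.
    """
    height, width = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for y in range(height):
        for x in range(width):
            ones = total = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        ones += grid[ny][nx]
                        total += 1
            out[y][x] = 1 if ones * 2 > total else 0
    return out
```

Note that a rule like this also rounds off the corners of solid areas by one cell, which is one reason to keep the cell size small relative to a QR module, as suggested above.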
I'm not aware of an implementation in PHP but there is at least one OpenCV interface for PHP
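For the histogram-based thresholding mentioned above, one common choice is Otsu's method, which picks the grey level that best separates the dark and light pixel populations. The answer does not name a specific method, so treat this Python sketch as one possible instance; it assumes 8-bit grey values in a 2D list and all function names are illustrative.

```python
def grey_histogram(pixels, levels=256):
    """Count how many pixels fall on each grey level."""
    hist = [0] * levels
    for row in pixels:
        for p in row:
            hist[p] += 1
    return hist


def otsu_threshold(hist):
    """Return the level t that maximises between-class variance.

    Pixels with value <= t form the dark class, the rest the light class.
    """
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_dark = sum_dark = 0
    for t, count in enumerate(hist):
        w_dark += count
        sum_dark += t * count
        w_light = total - w_dark
        if w_dark == 0 or w_light == 0:
            continue  # one class is empty, no split to evaluate
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        var = w_dark * w_light * (mean_dark - mean_light) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def binarize(pixels, t):
    """Map every pixel to 0 (dark) or 1 (light) using threshold t."""
    return [[1 if p > t else 0 for p in row] for row in pixels]
```

Because the threshold is derived from the image itself, it adapts to scans of different brightness, provided the histogram is computed over the isolated QR code rather than the whole page.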
Here, I want to search for an image in MySQL by uploading an image.
I have tried with hash, RGB and hex comparisons, but I am not getting matches as accurate as Google's.
For example, if you search for Koala.jpg, you should see only photos related to Koala.jpg. But with the approaches above I am getting other photos too.
Are there any suggestions for how I can search for an image in the database by uploading an image?
Thanks in advance.
Introduction
As I understand your question: how can I search for similar images when I upload an image?
Before I dig into the details, I will give a brief introduction to the topic.
To be able to search or analyse images on a computer, we need to process the image and convert it into a numeric representation. Once an image is represented as numbers, almost any kind of comparison becomes possible.
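As a small illustration of turning an image into numbers, here is a Python sketch of an "average hash": each pixel of a heavily downscaled grey image becomes one bit, and two images are compared by counting differing bits. This is only one simple technique picked for the example (it is not claimed to be what any library mentioned here does), and it assumes the downscaling to a small grey grid has already happened elsewhere.

```python
def average_hash(grey):
    """Hash a small 2D list of grey values (e.g. an 8x8 thumbnail).

    Each pixel contributes one bit: 1 if it is brighter than the
    mean of the thumbnail, 0 otherwise.
    """
    flat = [p for row in grey for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")
```

A small Hamming distance suggests visually similar images, so the hash can be stored next to each row in the database and compared against the hash of the uploaded file. The cut-off distance that counts as "similar" has to be tuned on your own pictures.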
There are different similarity search algorithms, and it is a very hot topic: a lot of researchers are working to improve the techniques and develop better ones.
Depending on your requirements, there are a lot of things involved, such as how big the files are, how fast you expect the results, whether the search runs while the person is uploading, how many images should be processed at a time, etc.
Google has a lot of powerful servers and a lot of machine learning, which lets it compare images with almost no delay.
IMO you need to gather some theoretical background; it will help you a lot in understanding the process.
Some links with information regarding my explanation:
https://en.wikipedia.org/wiki/Reverse_image_search
https://en.wikipedia.org/wiki/Google_Images#Search_by_image
scientific paper http://ai.stanford.edu/~gal/Research/OASIS/
I am pretty sure that with a bit more searching you can find a lot of theoretical resources.
Now back to your question.
IMO one of the following libraries/classes will solve your problem.
Libpuzzle is a library with a PHP extension for finding similar pictures
(https://www.pureftpd.org/project/libpuzzle)
PHP Compare Images Similarity is also a PHP class
(http://www.phpclasses.org/package/8255-PHP-Compare-two-images-to-find-if-they-are-similar.html)
I will leave some links that might enlighten you as well. That said, you have a few steps to reach your goal:
Start by testing these libraries to see which one fits you best
Then test uploading an image and comparing it against the stored ones
A few more links:
Image similarity comparison
Image comparison - fast algorithm
Good way to identify similar images?
Find similar images in (pure) PHP / MySQL
http://nekkidphpprogrammer.blogspot.dk/2014/01/not-all-bits-are-created-equal.html
Algorithm for finding visually similar photos from a database?
I need to find the common colors used in a particular website. In most cases these will be the body background, header background, etc. The problem is that some classes or IDs override others, so we cannot get the exact color patterns. Is there any way to find the exact color patterns of a website that the browser ends up picking?
As Havelock pointed out, the idea that does come to mind is transforming the page into an image, and then getting the color-palette from that. It does however have a few problems:
There is no guarantee that what the library returns is what the user sees in a particular browser, let alone all of them.
The processing needed could be implemented far more easily in languages other than PHP. I don't mean that it can't be done, but PHP is just not well suited to this task.
If you do however continue along this path, I would recommend trying something with an API, to get your screenshots, and then just using some PHP to parse them. Example for such a service - http://browsershots.org/xmlrpc/
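Once a screenshot is available as a list of pixels, getting a palette from it can be as simple as counting colors. The Python sketch below groups similar shades into coarse buckets before counting, so that antialiasing and gradients do not split one visual color into hundreds of entries. The bucket size and function name are assumptions for illustration, and decoding the screenshot into `(r, g, b)` tuples is left to whatever image library you use.

```python
from collections import Counter


def dominant_colors(pixels, top=3, step=32):
    """Return the `top` most frequent colors as hex strings.

    pixels is an iterable of (r, g, b) tuples with 0-255 channels.
    Each channel is snapped to the centre of a bucket `step` wide,
    so near-identical shades are counted together.
    """
    def bucket(value):
        return min(255, (value // step) * step + step // 2)

    counts = Counter((bucket(r), bucket(g), bucket(b)) for r, g, b in pixels)
    return ["#%02x%02x%02x" % rgb for rgb, _ in counts.most_common(top)]
```

The returned values are bucket centres, not the exact CSS colors, so they are best read as "roughly this color dominates the page".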
There are several online services that extract colors from websites, including colors in images:
http://www.colorcombos.com/grabcolors.html
http://www.hextractor.com/
more...
A PHP class to extract colors from images can be found here. See also How do I get the Hex Code of a color on my webpage
A Firefox plugin also exists.
I am creating an app whose main functionality is that a user can submit an image containing a person's face. I need to compare the face in that input image against a list of images I already have and check whether it matches any of them. The list may contain more than 100000 images.
How can we do this in PHP?
OpenCV does face recognition. You can call OpenCV directly from PHP or you can try the OpenCV for PHP project.
There is also a long list of other face recognition libraries on this question: Face recognition Library
Though it might actually be possible to do that in PHP, that language is certainly a bad choice for the problem. What you are probably really looking for is:
1.) some pattern matching software that can recognize similarities in images. Since that involves graphics processing, such software is typically not coded in a scripting language like PHP but in something more efficient like C or C++.
2.) some simple PHP code controlling the face recognition, with the comparison running in another process.
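The split described in the two points above (a thin controller script plus a separate matching process) might look like the following sketch. It is written in Python to keep it short; the same shape applies to PHP with `exec` or `proc_open`. The matcher command is entirely hypothetical, as is the assumption that it takes the image path as its last argument and prints a JSON result on standard output.

```python
import json
import subprocess


def run_matcher(command, image_path, timeout=30):
    """Run an external face matcher and return its parsed JSON reply.

    command is the matcher's argument list (hypothetical here); the
    image path is appended as the final argument. A non-zero exit
    status or a timeout raises an exception instead of returning.
    """
    result = subprocess.run(
        list(command) + [image_path],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return json.loads(result.stdout)
```

Passing the arguments as a list rather than a shell string avoids quoting problems with user-supplied file names.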
I am looking to generate transport maps in a style similar to the iconic London underground [tube] map.
These maps will change from time to time and many will be required so instead of drawing them up manually in inkscape [or similar] I am hoping to have them generated dynamically from a db or dataset.
Does anyone know of any libraries, APIs etc. out there that would help with this task, or have any suggestions in general of how [or how not] to go about this?
I am thinking SVGs would be the best way to go with this, plus there may be a need for basic interactivity down the line.
I am working in PHP, so otherwise it's GDlib or ImageMagick?
Thanks in advance.
.k
Well, the answer really isn't in how to use GD or ImageMagick; there are manuals for that. As for helper libraries, most focus on graphing, so anything else you will have to write yourself. Your best bet would be to have your admin interface generate the images when data in the backend changes and cache them, since there's no reason to build the image every time someone accesses it.
For generating maps, I think your best bet would be defining stations with one or more 'lines', each with some indicator of its relationship to the surrounding stations, and an x,y position. You'd probably only need to record a 'parent' station, since you're just drawing lines from a to b. That way you can position them in the same manner as they're typically rendered on the actual trains, and use the lines and surrounding stations to draw the map.
Doesn't sound like too difficult a problem. 3 tables:
stations [stationid,name,x,y,meta1,meta2],
placements [placementid,stationid,lineid,parentstationid],
lines [lineid,name,meta1,meta2,colour,etc].
SVG would be pretty good at this sort of thing, and you would avoid the whole image building and caching process, but be wary of browser support issues.
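To show how little is needed to get from that three-table layout to an SVG, here is a minimal Python sketch. The dictionaries stand in for rows from the stations, lines and placements tables; the sample stations, coordinates and colour are invented for the example, and the styling values are arbitrary.

```python
from xml.sax.saxutils import escape

# Hypothetical sample rows mirroring the three tables.
stations = {
    1: {"name": "Alpha", "x": 40, "y": 40},
    2: {"name": "Bravo", "x": 140, "y": 40},
    3: {"name": "Charlie", "x": 240, "y": 140},
}
lines = {1: {"name": "Red", "colour": "#d22"}}
# (stationid, lineid, parentstationid); None marks the first stop of a line.
placements = [(1, 1, None), (2, 1, 1), (3, 1, 2)]


def render_svg(stations, lines, placements, width=300, height=200):
    """Draw one segment per placement (parent -> station), then the stations."""
    parts = [
        '<svg xmlns="http://www.w3.org/2000/svg" width="%d" height="%d">'
        % (width, height)
    ]
    for station_id, line_id, parent_id in placements:
        if parent_id is None:
            continue  # first stop on the line, nothing to connect to
        a, b = stations[parent_id], stations[station_id]
        parts.append(
            '<line x1="%d" y1="%d" x2="%d" y2="%d" stroke="%s" stroke-width="6"/>'
            % (a["x"], a["y"], b["x"], b["y"], lines[line_id]["colour"])
        )
    # Stations are drawn last so their markers sit on top of the lines.
    for s in stations.values():
        parts.append(
            '<circle cx="%d" cy="%d" r="5" fill="#fff" stroke="#000"/>'
            % (s["x"], s["y"])
        )
        parts.append(
            '<text x="%d" y="%d" font-size="10">%s</text>'
            % (s["x"] + 8, s["y"] - 8, escape(s["name"]))
        )
    parts.append("</svg>")
    return "\n".join(parts)
```

Calling `render_svg(stations, lines, placements)` returns the SVG as a string, which can be written to a file or sent straight to the browser.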
Sounds like a pretty interesting project though, good luck :)
One strategy I use when I need to generate graphs from data in a db is to extract the data in some kind of XML way (e.g. Oracle SQLX or Cocoon XSP/ESQL or eXist-db XQuery) and process it through an XSLT to generate SVG. Good old Cocoon is fine for this kind of job if you don't want to write any code (except the XSL of course ;-).
The SVG itself can be loaded in some graphic tools to reprocess.
"These maps will change from time to time and many will be required so instead of drawing them up manually in inkscape [or similar] I am hoping to have them generated dynamically from a db or dataset."
If I were in your shoes, the very first thing I'd do is try to prove that the Google Maps API won't work for your application. Then, maybe, prove that ArcGIS won't work. (Even if they don't work, they're widely used, and you get to add lines to your CV.)