I'm currently working on a personal project to recognize text, or any object, in an image.
I'm using the GD library to process the image. I have access to every pixel of the image and its RGB color.
My question is not about coding; I'm just looking for an algorithm to decode the image, or any advice on how to do that. I don't want to use any API; I want to do it myself.
I know that PHP has a face detection library, but it only recognizes faces in an image, and I don't know how it does that.
To start, I assume that the object is white and the background is black (or any two distinct colors).
Summary: How can I define an object or a word for a PHP program and train it to recognize it in a picture?
There are APIs that decode simple CAPTCHAs like this.
Check this link: Captcha Decoded
And try this API: http://www.opendecoder.com/api. There are many such APIs if you search on Google.
The process you are trying to implement is called “optical character recognition” (OCR), and there is free software available that does this. Searching for this term should turn up more information.
You did not specify the kind of software component you are looking for, so it is hard to be more specific.
This is usually an error-prone process, but you might get better results if you can make regularity assumptions on your input, especially if you already know which character types are used in your input.
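As a minimal sketch of the kind of regularity assumption mentioned above (a light object on a dark background, as the question assumes), the image can be reduced to a binary mask and the object located by its bounding box. This is plain Python on a list of (r, g, b) tuples rather than GD; the function names and the threshold of 128 are assumptions for illustration only.

```python
def binarize(pixels, threshold=128):
    """Turn rows of (r, g, b) tuples into rows of 1 (object) / 0 (background).

    Assumes a light object on a dark background; the threshold is a guess
    that would need tuning for real images.
    """
    return [
        [1 if (r + g + b) / 3 >= threshold else 0 for (r, g, b) in row]
        for row in pixels
    ]


def bounding_box(mask):
    """Return (left, top, right, bottom) of all object pixels, or None."""
    xs = [x for row in mask for x, value in enumerate(row) if value]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


BLACK, WHITE = (0, 0, 0), (255, 255, 255)
image = [
    [BLACK, BLACK, BLACK, BLACK],
    [BLACK, WHITE, WHITE, BLACK],
    [BLACK, WHITE, BLACK, BLACK],
]
mask = binarize(image)
print(bounding_box(mask))
```

The same loop could be written over whatever per-pixel access your image library gives you; only the thresholding rule and the min/max bookkeeping matter here.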
Useful starting points could be:
http://jwilk.net/software/ocrodjvu
http://unpaper.berlios.de/
If converting to DjVu and using Python on a UNIX system is an option for you, you might consider the first link as a solution. Otherwise, you can use the various tools supported by ocrodjvu to start your research. The second link is more about the pre-processing you might want to do before OCR, but it might still be useful if you want to implement your own procedure.
Related
I am trying to integrate Adobe Signature into a PDF so that the end user can sign it in the browser itself; I want his/her handwritten signature on it. The end user will use his/her mouse to draw the signature. The PDF creation is written in PHP, and the application uses Adobe APIs.
I referred to the Handwritten Adobe page and Adobe tags.
I have also looked at Stack 1 and Stack 2, but they do not match my requirement.
I was able to sign the custom runtime-generated PDF document using {{Sig_es_:signer1:signature}}
I checked several places, including Stack Overflow, but I can't find any reference document that explains how to code handwritten signatures. I also need to understand whether handwritten signatures have any limitations, drawbacks, or privacy/security issues.
Let me know if anyone knows how to proceed on this.
Draw a signature with a mouse? That will not work. I can't do that. A finger on a phone would work better. Still clumsy, but better.
Drawn signatures are old-fashioned in the digital world, and require complex verifiable encryption. You would have to prove that the digital copy you have was indeed drawn within the exact document it appears in. Digital things can, after all, be copied easily. Whenever there's a dispute, you would have to prove that the signature is an inherent part of the unchangeable digital document. This is far more difficult than it seems at first. That's why it is usually quite expensive.
I would strongly advise against going down this road. Find another solution.
You haven't explained what you want to use the signature for, which makes it difficult for me to suggest another solution, so I won't.
Re:
I also need to understand if handwritten signatures have any limitations or drawbacks or any privacy/security issues.
Yes, there are lots of limitations and drawbacks. You need to consider the issues of forgery (someone else signing as me, Larry) and non-repudiation (I signed it but later claim that it wasn't me. How do you prove that it was Larry who signed it?)
There's also the overall context of the signature: what is the value of the agreement? What are the consequences of not being able to prove that the right person did sign the document?
Adobe Sign (and their competitors) have answers to all of the above. eSignatures are far more complicated than just getting something that looks like the person's signature on the PDF.
Pro-tip: how the signature looks on the PDF is the least important part of the process.
I'm working on a PHP project where I create a more readable version of a text transcript for a judicial inquiry, and one thing I'd really like to do is have photos depicting each speaker.
Some of them are public figures (i.e., well-known UK judges and lawyers; UK politicians), others are journalists, some are celebrities.
It seems like Wikipedia is the best thing to use for this (though I may be wrong), but I'm really unfamiliar with the MediaWiki API.
So, my questions:
Is Wikipedia the best thing to use for this task? Or is there a database of headshots somewhere with a very wide variety of subjects? If the latter, where's its API documentation?
If Wikipedia, what API call would I use for fetching an article's main image URL?
Lastly, how would I translate a string like "SIR PAUL STEPHENSON" to how it's listed in Wikipedia, i.e., "Paul_Stephenson_(police_officer)"
Note that I'm aware special cases will come up where no photo on Wikipedia exists or there needs to be disambiguation -- I'm quite aware I'll have to deal with those on a per-case basis.
Thanks!
Google images has a face filter:
https://www.google.com/search?tbm=isch&q=SIR+PAUL+STEPHENSON&tbs=itp:face
I'm not sure if you are allowed to use their API for this kind of stuff though; you need to read their TOS.
You can use the search API to find the most likely article for a name. AFAIK there is no sane API, though, to find the first image in the article (the images API returns the images in alphabetical order, and includes images from templates), so your best bet is to parse the HTML (the portrait is usually the first large image) or the wikitext (most infoboxes use a parameter called image). You can use the imageinfo API to get the image URL from the image page name.
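A rough sketch of the two calls described above, plus a naive cleanup of a transcript name before searching. It only builds the request URLs (no network access). The `HONORIFICS` set, the helper names, and `File:Example.jpg` are made up for illustration; the `list=search` and `prop=imageinfo` modules are part of the MediaWiki action API, but check the parameters against its documentation before relying on them.

```python
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"

# Illustrative only: a real transcript would need a longer, curated list.
HONORIFICS = {"SIR", "DAME", "LORD", "LADY", "MR", "MRS", "MS", "DR"}


def clean_name(raw):
    """'SIR PAUL STEPHENSON' -> 'Paul Stephenson' (naive; breaks on e.g. 'McDonald')."""
    words = [w for w in raw.split() if w.upper().rstrip(".") not in HONORIFICS]
    return " ".join(w.capitalize() for w in words)


def search_url(name):
    """URL asking the search API for the single most likely article."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": name,
        "srlimit": 1,
        "format": "json",
    }
    return API + "?" + urlencode(params)


def imageinfo_url(file_title):
    """URL asking imageinfo for the image URL of a file page such as 'File:Example.jpg'."""
    params = {
        "action": "query",
        "titles": file_title,
        "prop": "imageinfo",
        "iiprop": "url",
        "format": "json",
    }
    return API + "?" + urlencode(params)


name = clean_name("SIR PAUL STEPHENSON")
print(search_url(name))
print(imageinfo_url("File:Example.jpg"))
```

Disambiguation (for example picking "Paul_Stephenson_(police_officer)" over another Paul Stephenson) still has to be handled per case, as the question already expects.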
All in all, you are probably better off with Flickr.
I wanted to know if this is possible?
I would like to create a cross-platform Flash projector and its files, and then create an ISO from them for the user to download.
Google did not help me much so far...
Of course it is possible to do directly from PHP.
However, as one of the comments to your question states, it's probably going to be easier to call an external binary to do the work for you (Although not all hosts may have mkisofs installed).
If you really must do this from PHP, here are some useful references for you.
ISO 9660 specification (ECMA-119) - This is the file format for "ISO" image files.
PHP pack() and unpack() - These will help you manipulate binary data in PHP.
Once you're familiar with the file structure, you may be able to create some pre-compiled segments, and just patch them at various offsets as well as inserting the payload.
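To make the "patch at various offsets" idea concrete, here is a small sketch using Python's `struct`, which plays the same role as PHP's pack(). It fills in only the first few fields of a primary volume descriptor. The offsets used (type, identifier and version at 0 to 6, volume space size at 80) are my reading of ECMA-119 and should be checked against the specification; the value 20 is arbitrary, and this is nowhere near a complete ISO image.

```python
import struct

SECTOR_SIZE = 2048  # logical sector size used by ISO 9660


def both_endian_u32(value):
    """ISO 9660 stores many numbers twice: little-endian, then big-endian."""
    return struct.pack("<I", value) + struct.pack(">I", value)


def patch(buffer, offset, data):
    """Overwrite len(data) bytes of a pre-built segment at the given offset."""
    buffer[offset:offset + len(data)] = data


# A blank, pre-built descriptor sector that we patch in place.
descriptor = bytearray(SECTOR_SIZE)

# Offset 0: descriptor type 1 (primary), identifier "CD001", version 1.
patch(descriptor, 0, struct.pack("<B5sB", 1, b"CD001", 1))

# Offset 80 (assumed from the spec): volume space size in logical blocks.
patch(descriptor, 80, both_endian_u32(20))

print(descriptor[:7], descriptor[80:88].hex())
```

In PHP the two halves of `both_endian_u32` would correspond to pack() with the 'V' and 'N' format codes.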
Good luck!
I have seen some CAPTCHAs being decoded using JavaScript, PHP, etc. How do they do it?
For example, very popular megaupload site's captcha has also been decoded.
I'm an image processing specialist and CAPTCHA decoder, I've done many CAPTCHA resolving projects before.
OK, let's start CAPTCHA resolving steps!
Decoding any kind of CAPTCHA has 3 main steps:
1- Removing background
Clear the CAPTCHA from any noise (using any image processing methods).
Note for captcha decoding fighter: If you want to have a good CAPTCHA, you should add stronger noise. Use a randomly noised background with a color similar to that of the characters.
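As one very simple illustration of step 1, the sketch below removes isolated "speck" pixels from a binary image (1 = ink, 0 = background). It is only one of many possible noise filters, the `min_neighbors` value is an arbitrary assumption, and it will not cope with the stronger background noise recommended in the note above.

```python
def remove_specks(mask, min_neighbors=2):
    """Clear ink pixels that have fewer than min_neighbors ink pixels around them."""
    height, width = len(mask), len(mask[0])
    cleaned = [row[:] for row in mask]  # work on a copy, read from the original
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            # Count ink pixels in the 8-neighborhood, staying inside the image.
            neighbors = sum(
                mask[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if neighbors < min_neighbors:
                cleaned[y][x] = 0
    return cleaned


noisy = [
    [1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
for row in remove_specks(noisy):
    print(row)
```

The lone pixel in the top-left corner is dropped, while the 2x2 block survives because each of its pixels has three ink neighbors.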
2- Splitting characters
Easy step when they are separate and very hard when they're not.
Note for captcha decoding fighter: If you want to have a good CAPTCHA, don't leave the characters separate! Make them overlap, and do NOT use different colors for the characters, because decoders can then split them very easily! (Most developers are unaware of this and think it's better to use a colorful CAPTCHA!) The best choice is an overlapping string in black. For an experienced CAPTCHA decoder, it's not a problem to decode a colorful CAPTCHA! It's just beautiful and not useful! :) Use random curved lines which connect all the characters to each other.
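For the easy case in step 2, where characters do not touch, a common approach is a vertical projection: any column with no ink separates two characters. A minimal sketch, assuming a binary image as rows of 0/1 (the function name is made up):

```python
def split_columns(mask):
    """Return (start, end) column ranges, end exclusive, separated by empty columns."""
    width = len(mask[0])
    # ink[x] is True when column x contains at least one ink pixel.
    ink = [any(row[x] for row in mask) for x in range(width)]
    segments, start = [], None
    for x, has_ink in enumerate(ink):
        if has_ink and start is None:
            start = x                      # a character begins
        elif not has_ink and start is not None:
            segments.append((start, x))    # a character ends at a blank column
            start = None
    if start is not None:
        segments.append((start, width))    # character touching the right edge
    return segments


mask = [
    [1, 1, 0, 0, 1, 0, 1],
    [1, 0, 0, 0, 1, 0, 1],
]
print(split_columns(mask))
```

This is exactly what overlapping characters or connecting lines defeat: with no blank column between glyphs, the whole string comes back as one segment.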
3- Converting separate images into character
After separation, we have a set of character images (we don't have a string yet, just images and pixels). We need to convert the character images into a string, but how?!
There are several ways. If the characters are not rotated and have a fixed font and size (such as the freeglobes CAPTCHA), you can define a pattern set; your program should loop through the patterns to find the best match for each image. If the characters vary a lot and would need a large pattern set, you should use a neural network to recognize them. A neural network for CAPTCHA solving takes a character image, and we tell the network what that character is. For example, we give it an image of "A" and tell the NN: it's "A"! It then "learns" this character and saves what it learned into a database. This procedure is called "training". So, when we later give a trained network a new character, it returns the best match from its learning database.
Usually decoder specialists use the CAPTCHA itself to train the neural network. Be careful! Using appropriate data for training can make or break your results.
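The fixed-font pattern-matching case described above can be sketched as follows: each unknown glyph is compared pixel by pixel with every stored pattern, and the label with the highest agreement wins. The tiny 3x3 patterns and the function names are invented for illustration; this is not a neural network and it assumes all glyphs have already been scaled to the same size.

```python
def similarity(a, b):
    """Fraction of pixels that agree between two equally sized binary glyphs."""
    total = sum(len(row) for row in a)
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


def best_match(glyph, patterns):
    """Loop through the pattern set and return the label of the closest pattern."""
    return max(patterns, key=lambda label: similarity(glyph, patterns[label]))


# Hypothetical pattern set for a two-letter alphabet.
PATTERNS = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}

unknown = [[1, 0, 0], [1, 0, 0], [1, 1, 0]]  # an "L" with one pixel missing
glyphs = [unknown, PATTERNS["I"]]

# Recognize each glyph, then join the single characters into one string.
decoded = "".join(best_match(g, PATTERNS) for g in glyphs)
print(decoded)
```

Rotation, deformation and mixed fonts make this kind of direct comparison unreliable, which is where a trained network comes in.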
Note for captcha decoding fighter: If you want to have a good CAPTCHA, use any method which stops a decoder from recognizing the characters, even with a neural network. Deform the characters randomly, use many fonts instead of one, rotate the characters as well, etc.
Finally, we concatenate all single characters into one and return it as result.
Unfortunately, there is no fixed algorithm for solving every CAPTCHA; a new CAPTCHA needs new analysis and training. You can't make one CAPTCHA decoder that decodes all CAPTCHAs.
What should you know before starting:
1- Image processing fundamentals
2- General understanding of a Neural Network
3- Simple image processing functions (in any language)
For PHP:
imagecreate()
imagecreatetruecolor()
imagecolorat()
imagecolorsforindex()
imagesetpixel()
...
For .NET:
Bitmap type
GetPixel()
SetPixel()
...
For JavaScript and HTML5:
You should know the Canvas very well.
Lastly:
Note for captcha decoding fighter: If you wonder how someone can decode a CAPTCHA and want to prevent it, you should first become a CAPTCHA decoder yourself, or hire someone who knows the weaknesses and attack algorithms very well!
Hope this helps! ;)
See:
OCR and Neural Nets in JavaScript
Here John Resig (creator of the jQuery JavaScript library) explains exactly how it is done.
Take a look at PWNtcha
You can also read Breaking a Visual CAPTCHA
I was involved in a project to circumvent Captcha images on the TicketMaster website about 8-9 years ago for a third-party ticket seller. When an event went on-sale, like a concert, our network of machines would use multiple credit cards and mailing addresses to buy any and every seat possible in the first 10 rows.
Rather than generating new CAPTCHAs each time, TM had a limited pool of images they would re-use. We'd create a unique digital fingerprint (checksum) for each image, then simply attack it with some imaging tools (LEADTOOLS.com) to remove extraneous elements, enhance contrast, etc., and then use OCR tools. It was surprisingly effective.
We were able to crack a great number programmatically, and we'd store the ones we couldn't crack for human processing. Sometimes they'd have a pool of 20K images, so at first we'd get maybe 60-70% automatically, but eventually we'd get 100% success because we could identify the images our humans processed (offline) based on looking up their hash in our database. (That is, we could check a captcha image against our database based on the hash we created and if we already had the solution we could just submit the answer immediately.)
Occasionally, they'd flush and replace their pool of captcha images with a new set, but again, it would just take us a bit of time to get back up to a 100% rate. The fatal flaw with this particular system was that they recycled images, rather than programmatically generating new captcha images each time.
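To show why recycling images is such a weakness, here is a minimal sketch of the fingerprint lookup described above: once an image has been answered a single time, any byte-identical copy is answered by a dictionary lookup. SHA-256 and the in-memory dictionary are my assumptions; the original system's checksum and database are not specified.

```python
import hashlib

known_answers = {}  # fingerprint -> answer; stands in for the database


def fingerprint(image_bytes):
    """Checksum of the raw image bytes; identical files give identical fingerprints."""
    return hashlib.sha256(image_bytes).hexdigest()


def remember(image_bytes, answer):
    """Store an answer obtained once (by OCR or by a human)."""
    known_answers[fingerprint(image_bytes)] = answer


def lookup(image_bytes):
    """Return the stored answer for a recycled image, or None if it is new."""
    return known_answers.get(fingerprint(image_bytes))


remember(b"fake image bytes #1", "x7k2p")
print(lookup(b"fake image bytes #1"))
print(lookup(b"fake image bytes #2"))
```

A generator that renders a fresh image for every request never produces the same bytes twice, so a lookup like this never hits.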
But the fact is, if the financial incentive to crack the CAPTCHA is high enough, it doesn't take much to create a distributed platform where low-wage unskilled workers can sit around earning pocket change to crack them all day.
Inside India's CAPTCHA solving economy
http://www.zdnet.com/blog/security/inside-indias-captcha-solving-economy/1835
There are services for recognition, such as 2captcha. This is a PHP tool for solving them: https://github.com/jumper423/decaptcha/
I'd like to create a PHP image generator that takes several query string parameters and generates an image on the fly. I've done this before in ASP.NET via a handler, but now I'd like to do it in PHP. For example, I'd like a call to image.php?img1=foo.jpg&img2=bar.jpg to render foo.jpg placed horizontally next to bar.jpg. Later on I'll use mod_rewrite to turn it into something like foo_bar.jpg, but that's not my concern right now. Are there any samples out there that do what I want, or any code snippets to examine? This is more of a learning experience than a necessary step to build something.
Here's a sample application of it:
image.php?d=14&m=jan&y=2010
This would really be the graphic concatenation of 14.jpg + jan.jpg + 2010.jpg
And after mod_rewrite it could be used like:
14_jan_2010.jpg
The PHP GD library can certainly do this. You can work out how large the image needs to be, create a blank one and then use imagecopymerge() to copy the three images into it.
I can give you some sample code if it helps.
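The "work out how large the image needs to be" step can be sketched independently of GD. Given each source image's (width, height), this computes the canvas size and the x position at which each image would be copied. The function name and the sizes are invented for illustration; the actual pixel copying would still be done with GD as described above.

```python
def horizontal_layout(sizes):
    """sizes: list of (width, height). Returns (canvas_width, canvas_height, x_offsets)."""
    offsets, x = [], 0
    for width, _height in sizes:
        offsets.append(x)   # this image starts where the previous one ended
        x += width
    canvas_height = max(height for _width, height in sizes)
    return x, canvas_height, offsets


# Hypothetical sizes for 14.jpg, jan.jpg and 2010.jpg.
sizes = [(40, 30), (60, 30), (80, 32)]
print(horizontal_layout(sizes))
```

The canvas is as wide as all images together and as tall as the tallest one, so shorter images leave a blank strip unless they are vertically aligned or scaled first.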
http://www.psych0tik.net/images/sig.phps (Edit: the link is broken. I'll fix it shortly.)
It's very simple (and doesn't do exactly what you want) but should be able to fulfill your needs with some minor hackery.