
Faulty CAPTCHA implementations

19 October 2007


CAPTCHAs are pretty effective at keeping spammers away, but they have weaknesses of their own. Some CAPTCHAs are so easy to read that a simple OCR algorithm can solve them. Yet there are even simpler ways to defeat certain CAPTCHAs, with no OCR at all.

The problem usually lies in the fact that at least three different pages need to share the same code: the form page, the receiving page, and the page that generates the image. Some implementations simply don't make sure that the code isn't forged on the way.

Let's look at a very simple CAPTCHA system and understand what's wrong.

The three pages share a single encryption key known only to the server. The form page generates the code and encrypts it with that key. The resulting ciphertext appears on the form page twice: inside an <img> tag, as a parameter for the CAPTCHA generator, and inside an <input> tag, for the receiver. The generator decrypts it and displays the code as an image, while the receiver decrypts it and compares it to the code the client typed in.

This seems fine, and it doesn't require the server to remember any codes or sessions. Here is the problem, though: the server can't know whether the ciphertext it received is the one it just generated. An attacker who solves a single CAPTCHA by hand can resubmit that same known ciphertext/plaintext pair in every form, and the attack will work every time.
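The replay can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the names `SERVER_KEY`, `encrypt_code`, `decrypt_token` and `receiver_accepts` are hypothetical, and the XOR keystream is a toy stand-in for whatever cipher a real site would use. The point is only that a stateless receiver accepts the same pair forever.

```python
import hashlib
import secrets

# Hypothetical key shared by the form page, the generator and the receiver.
SERVER_KEY = secrets.token_bytes(32)


def _keystream(nonce: bytes, length: int) -> bytes:
    # Toy keystream for illustration only; this is NOT a real cipher.
    return hashlib.sha256(SERVER_KEY + nonce).digest()[:length]


def encrypt_code(code: str) -> str:
    """Form page: turn the code into an opaque token placed in the HTML."""
    nonce = secrets.token_bytes(8)
    data = code.encode()
    stream = _keystream(nonce, len(data))
    return (nonce + bytes(a ^ b for a, b in zip(data, stream))).hex()


def decrypt_token(token: str) -> str:
    """Generator and receiver: recover the code from the token."""
    raw = bytes.fromhex(token)
    nonce, data = raw[:8], raw[8:]
    stream = _keystream(nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, stream)).decode()


def receiver_accepts(token: str, typed: str) -> bool:
    """Receiving page: a stateless check, which is exactly the weakness."""
    return decrypt_token(token) == typed


# The attacker solves one CAPTCHA by hand and keeps the pair.
known_code = "X7K2QP"
known_token = encrypt_code(known_code)

# Every later submission replays that pair; the server cannot tell.
replays = [receiver_accepts(known_token, known_code) for _ in range(1000)]
```

Nothing in `receiver_accepts` ties the token to a particular form view, so all 1000 replays pass without the attacker ever reading another image.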

Now it's clear that the server must remember the code, which means it has to use some sort of session. Moreover, once the code is stored on the server, there is no need to encrypt it.

Let's look at a system that abides by this principle. The form page starts a session and places a new CAPTCHA value in it. The generator page reads the code from the session and displays it. The receiver just compares the user input to the code in the session.
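A minimal sketch of that flow, assuming an in-memory dictionary as the session store and hypothetical function names (`form_page`, `generator_page`, `receiver_page`); image rendering and cookie handling are left out. It covers only the basic flow, so the precautions listed next still apply.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits
SESSIONS = {}  # stand-in for a real server-side session store


def new_code(length: int = 6) -> str:
    """Generate a random CAPTCHA code."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def form_page() -> str:
    """Start a session, store a fresh code, return the session ID for the cookie."""
    session_id = secrets.token_urlsafe(32)
    SESSIONS[session_id] = {"captcha": new_code()}
    return session_id


def generator_page(session_id: str) -> str:
    """Return the code that would be rendered as an image (rendering omitted)."""
    return SESSIONS[session_id]["captcha"]


def receiver_page(session_id: str, typed: str) -> bool:
    """Compare the user input to the code stored in the session."""
    session = SESSIONS.get(session_id)
    if session is None:
        return False
    return secrets.compare_digest(session["captcha"].encode(), typed.encode())
```

The client only ever holds the session ID; the code itself never appears in the HTML, so there is nothing to forge or replay from the page source.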

Sounds nice and simple, but you must make sure of a few things before you can consider the technique secure.

1) Regenerate the code in the session after every submission, especially a successful one. Otherwise it will be possible to reuse the session many times over.

2) Make sure that your session system is secure. It should generate a single random key, and place it in the cookie. Do not trust the cookie to contain the CAPTCHA code (even in encrypted form, note the previous example).

3) If you must have multiple CAPTCHAs on a single page, you'll need to ID them in your HTML. Make sure the ID is not the md5 hash of the CAPTCHA code (or something of the sort), but a newly generated random number. Remember that the hash of a 6-character code can be cracked quickly by brute force: even with a 36-symbol alphabet of letters and digits there are only about 2.2 billion candidates to try.
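The three points above can be combined in one sketch. As before, the in-memory `SESSIONS` dictionary and the names `start_session`, `issue_captcha` and `verify` are assumptions for illustration, not a specific framework's API.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits
SESSIONS = {}  # session_id -> {captcha_id: code}, kept on the server only


def start_session() -> str:
    # Point 2: the cookie carries only this random key, never the code.
    session_id = secrets.token_urlsafe(32)
    SESSIONS[session_id] = {}
    return session_id


def issue_captcha(session_id: str) -> str:
    # Point 3: the ID is a fresh random value, unrelated to the code.
    captcha_id = secrets.token_hex(16)
    code = "".join(secrets.choice(ALPHABET) for _ in range(6))
    SESSIONS[session_id][captcha_id] = code
    return captcha_id


def verify(session_id: str, captcha_id: str, typed: str) -> bool:
    captchas = SESSIONS.get(session_id, {})
    # Point 1: the code is removed on every attempt, right or wrong,
    # so the same session entry can never be checked twice.
    code = captchas.pop(captcha_id, None)
    if code is None:
        return False
    return secrets.compare_digest(code.encode(), typed.encode())
```

After any call to `verify`, the form page has to call `issue_captcha` again to show a new image; a replayed submission finds no stored code and is rejected.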

Just one last thing to remember. A good implementation is far more important than excessive entropy in the CAPTCHA. As the first example shows, no matter how complex the images are, a vulnerable implementation makes them pretty much useless.

Posted by: kGen | In category: CAPTCHA | Comments (1)