29 Sep 09

You have your blog, contact form and application set up, and your content is dominating the search engines. You get tons of traffic every day, and tons of spam too. Spam about everything you can imagine: pills, drugs and links to casinos. Even the President of the United States came to your blog and commented on one of your posts, asking you to buy something… What can you do to stop this?

CAPTCHA

CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a way to stop computers from submitting forms. A form can be a contact form, a comment form or anything else that is posted to the backend of a web application. There are numerous types of CAPTCHAs. We have all been asked to type in a word displayed as an image while surfing, and some of us use them on our own sites.

The best-known CAPTCHA type is an image displaying a word that is distorted in some way. So we find CAPTCHAs like the one used on Yahoo!:

This type of CAPTCHA is used by Yahoo!

or like the ones previously used by Rapidshare:

Used by Rapidshare in the past

or a typical CAPTCHA with lines obscuring the words:

Another type of image CAPTCHA

There are some other CAPTCHA types that you should be aware of. For example, one asks the person to type in the sum of two numbers, like this:

Please add 6 plus 2 and fill in the result

This is something most bots can’t figure out, especially if it is displayed in an image like this one:

Obscured numbers addition

but it is not bulletproof, because a bot can be configured to read the numbers and add them. Another nice but still not bulletproof CAPTCHA style is the one used on some sites that ask the visitor to write a word backwards. It looks like this:

Please write the word “double” backwards

Still not quite secure, because a bot could grab the word surrounded by quotes and reverse it. That is pretty simple to do.
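To see how little effort such a bypass takes, here is a minimal sketch of what a bot might do with the question above. The function name solve_backwards_question is made up for this illustration, and it assumes the word is wrapped in straight or curly double quotes.

```php
<?php
// Hypothetical bot-side helper: finds a quoted word in the question text
// and returns it reversed, or null when no quoted word is found.
function solve_backwards_question(string $question): ?string
{
    // Accept straight or curly double quotes around a run of letters.
    if (preg_match('/["“]([A-Za-z]+)["”]/u', $question, $match) !== 1) {
        return null;
    }
    return strrev($match[1]);
}

echo solve_backwards_question('Please write the word “double” backwards'), "\n"; // prints "elbuod"
```

One regular expression and one call to strrev() are enough, which is why a fixed sentence pattern offers so little protection.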

So we see that there are a lot of options, but none of them really solves the problem. You might think that the CAPTCHA used by Rapidshare is pretty strong and would help against spam, but it is not user friendly at all. You don’t want your visitors to get headaches trying to solve your little enigma. You only want to keep bots away.

Another problem when using CAPTCHAs is accessibility. What will happen if a visually impaired person visits your site with a screen reader? You need a solution that can handle this case too. JavaScript is not a safe answer either, because a lot of screen readers don’t support it.

No Solution!!! What can I do?

Before we suggest our idea, we should first try to explain why most of the CAPTCHA types mentioned above fall short. A lightly distorted word in an image is very easy for a program to scan and read, while an image with heavily distorted text might make your visitors go away. On the other hand, a sentence asking someone to write a word or add some numbers is very user friendly, but bot friendly too. If I were a spammer, I would look for patterns in the sentence and configure my bot to bypass the test.

A suggestion

We need a test that is both strong and user friendly. So we recommend a solution that is text based but alters its pattern, so that it is harder to bypass. The solution that we suggest is very easy to implement. We will use PHP for this one. First we need to create some arrays:

//vowels array
$vowels = array('a','e','i','o','u','y');
//consonants array ('h' is included so that words like 'smashing' are stripped correctly)
$consonants = array('b','c','d','f','g','h','j','k','l','m','n','p','q','r','s','t','v','w','x','z');
//words array
$words = array('design','develop','program','tools','awesome','stunning','person','computer','jeez','smashing','source','distorted','example','lorem','ipsum');
//Patterns
$patterns = array('vowelsout','consout','backwards','firstlast');

The arrays above will be used to alter the pattern each time we load a new test for the user. So we see that we have 4 different patterns to choose from:

  • vowelsout – This will strip all vowels from the word
  • consout – This will do the opposite of vowelsout: strip all consonants from the word
  • backwards – This will reverse the word
  • firstlast – This will ask the user to fill in the first and the last character of the word

The full code is:

<?php
session_start();

function captcha(){
    //vowels array
    $vowels = array('a','e','i','o','u','y');
    //consonants array ('h' is included so that words like 'smashing' are stripped correctly)
    $consonants = array('b','c','d','f','g','h','j','k','l','m','n','p','q','r','s','t','v','w','x','z');
    //words array
    $words = array('design','develop','program','tools','awesome','stunning','person','computer','jeez','smashing','source','distorted','example','lorem','ipsum');
    //Patterns
    $patterns = array('vowelsout','consout','backwards','firstlast');

    //first we need to decide what pattern to use. So we pick one randomly
    $pattern = $patterns[array_rand($patterns)];
    //then we pick a random word from the words array
    $word = $words[array_rand($words)];
    //then we check what the pattern is and we go on using this technique
    switch($pattern){
        case "vowelsout":
            //This pattern strips vowels from the word
            $_SESSION['CAPTCHA'] = str_replace($vowels,"",$word);
            //a string to display to the visitor:
            $question = "Please write the word '$word' without vowels";
            break;
        case "consout":
            //This is the opposite of the vowelsout pattern
            $_SESSION['CAPTCHA'] = str_replace($consonants,"",$word);
            //a string to display to the visitor:
            $question = "Please write the word '$word' without consonants";
            break;
        case "backwards":
            //This one reverses the word
            $_SESSION['CAPTCHA'] = strrev($word);
            //a string to display to the visitor:
            $question = "Please write the word '$word' backwards";
            break;
        case "firstlast":
            //This one takes the first and the last char of the word
            //(the last char sits at index strlen($word) - 1, not strlen($word))
            $_SESSION['CAPTCHA'] = $word[0].$word[strlen($word) - 1];
            //a string to display to the visitor:
            $question = "Please write the first and the last characters of the word : '$word'";
            break;
    }
    echo $question;
}

captcha();
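The code above stores the expected answer in $_SESSION['CAPTCHA'] but does not show the check on submit. A minimal sketch of that check could look like the following. The helper name captcha_answer_matches and the form field name 'captcha' are assumptions made for this example and are not part of the code above.

```php
<?php
// Hypothetical helper: compares the stored answer with what the visitor typed.
function captcha_answer_matches(string $expected, string $submitted): bool
{
    // Be forgiving about surrounding whitespace and letter case in the input.
    $normalized = strtolower(trim($submitted));
    // Reject an empty expected answer, then compare with hash_equals().
    return $expected !== '' && hash_equals($expected, $normalized);
}

// Possible use in the form handler (assumes session_start() was already called
// and that the form has a text field named 'captcha'):
// $ok = isset($_SESSION['CAPTCHA'], $_POST['captcha'])
//     && captcha_answer_matches($_SESSION['CAPTCHA'], (string) $_POST['captcha']);
// unset($_SESSION['CAPTCHA']); // allow only one attempt per generated question
```

Clearing the stored answer after each attempt forces a fresh question for every try, so a bot cannot keep guessing against the same word.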

We can add more patterns to our CAPTCHA system. Each time, the answer the user needs to type is stored in a session variable, which can be used later to test whether the user posted the right answer. You can also make some improvements to the patterns. For example, you can replace all vowels with a number and ask the user something like this:

Please replace the vowels in the word ‘awesome’ with the number 1 (Solution: 1w1s1m1)
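This extra pattern can be sketched with the same str_replace() call used by the other patterns. The function name replace_vowels is made up for the example, and in the real captcha() function the result would go into $_SESSION['CAPTCHA'] like the other answers.

```php
<?php
// Hypothetical helper: replaces every vowel in $word with the given digit.
function replace_vowels(string $word, string $digit): string
{
    // Same vowels array as in the article, 'y' included.
    $vowels = array('a','e','i','o','u','y');
    return str_replace($vowels, $digit, $word);
}

$word = 'awesome';
$digit = '1';
$answer = replace_vowels($word, $digit);
$question = "Please replace the vowels in the word '$word' with the number $digit";

echo $question, "\n";
echo $answer, "\n"; // prints "1w1s1m1"
```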

Honeypots

A honeypot is a trap we set for bots in order to detect whether an action was generated by a human or a computer. Generally, spam bots look for standard form fields to fill in. Some of these fields include:

  • Name
  • Username
  • Email
  • Surname
  • Body
  • Message

So a honeypot would be a form that has one or all of the above fields but does not need the user to fill anything into them. In fact, a honeypot refuses to accept a form that has one of these fields filled with data. The implementation is very easy. You put one or more honeypot fields in a form and use CSS to hide them from the user (simple bots do not interpret CSS, so they fill the fields in anyway). When the form is submitted, you check whether the fields are empty and, if they are not, you refuse to accept the form contents.
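The server-side check described above can be sketched like this. The function name honeypot_triggered and the choice of 'name' and 'email' as trap fields are assumptions for the example. The real inputs the visitor fills in would need different, less obvious names.

```php
<?php
// Hypothetical helper: returns true when any hidden trap field arrived with data.
function honeypot_triggered(array $post, array $trapFields): bool
{
    foreach ($trapFields as $field) {
        $value = $post[$field] ?? '';
        // A human never sees the trap fields, so any content
        // (including an array-valued field) points to a bot.
        if (is_array($value) || trim((string) $value) !== '') {
            return true;
        }
    }
    return false;
}

// Possible use in the form handler:
// if (honeypot_triggered($_POST, array('name', 'email'))) {
//     exit; // silently drop the submission
// }
```

The trap inputs are ordinary text fields in the form markup, hidden with a CSS rule such as display: none.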

Dynamic Creation of Form Fields Names

Another cool and strong form protection against spam bots can be achieved by creating form field names dynamically. You create a hash for each field with some random salt, use that hash as the field name, and store it in a session variable so that you can find the field again when the form is posted. When the bot comes to your page, it does not know what to fill in or where. Even if the spammer configures the bot to put an email address into a given field, there is almost no chance (less than 1%) that the same field name will ever exist again. So the bot is lost.
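One possible way to build such names is sketched below. The function name make_field_names, the use of hash_hmac() with SHA-256 and the 16-character truncation are all choices made for this example, and other hashing schemes would work as well.

```php
<?php
// Hypothetical helper: maps each logical field to a salted, hashed form name.
function make_field_names(array $fields, string $salt): array
{
    $map = array();
    foreach ($fields as $field) {
        // Prefix with a letter so the generated name never starts with a digit.
        $map[$field] = 'f' . substr(hash_hmac('sha256', $field, $salt), 0, 16);
    }
    return $map;
}

// A fresh random salt per page load gives different names on every visit.
$salt = bin2hex(random_bytes(16));
$names = make_field_names(array('name', 'email', 'message'), $salt);

// In a real page the map would be kept in the session, for example:
// $_SESSION['FIELDS'] = $names;
// and on submit the value is read back through it:
// $email = $_POST[$_SESSION['FIELDS']['email']] ?? null;
```

The form is then rendered with $names['email'] as the name attribute of the email input, so the markup never contains a predictable field name.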

Some CAPTCHA implementations

During our research into CAPTCHA systems we ran across some interesting implementations. These are:

Ajax Fancy Captcha

This one asks you to drag and drop a specified item into a circle. It is a clever implementation because bots don’t support JavaScript. Or do they?

Captcha that wants you to know something

This one is a good solution for sites that need their visitors to have some knowledge of the topic they want to comment on. So, for example, on a topic about PHP, the question could be:

Which PHP function displays info about PHP? (Solution: phpinfo)

Other

I read somewhere on the net (maybe Slashdot, I can’t remember where) that a solution to spam bots could be the use of Flash-animated text, which is easy for a human to read but hard for a bot.

In Vain

The most disappointing thing of all is human labor used for spam purposes. A company can hire hundreds of people who really need the money to solve CAPTCHAs, which makes all your tests useless. Other spam techniques include sites that ask visitors to solve CAPTCHAs in order to see nude images, and many more that are mentioned on Wikipedia and are worth reading about.




7 Comments.

  • Jen says:

    Gahhh spell check is your friend! s/sollution/solution/g

  • admin says:

    Spelling error fixed! Thank you

  • CertPal says:

    Nice alternatives. I did not know about the honey pot. I recently wrote about other captcha flavors as well. Check out google’s rotating captcha research and the related PDF. Makes for a good read.

  • rc says:

    For god’s sake, pls use a spell-checker.. “purposses” “surounded”. When you put so much effort on writing good content, giving excellent illustrations, linking to appropriate sources.. not doing this one trivial step… aarg.. hope u get the point..

  • admin says:

    Sorry for the spelling mistakes. English is not my mother tongue :( Although i am very familiar with the English language some typos are still in there. I will follow your advice on spell-checker.
    Thanks!!!

  • Nathan says:

    So… trying to make something hard for programs to read/parse but still readable by humans? Why not easy for humans and very hard for programs… like a contact form written in flash ;-)

  • Felicia says:

    please… can you tell me why, when i log ino yahelite and the captcha comes on … why are there no numbers?
