reCAPTCHA

An example of a reCAPTCHA challenge

reCAPTCHA is a system that uses CAPTCHA tests to help digitize books. It takes scanned words that optical character recognition (OCR) software could not read and presents them to humans as CAPTCHA words, alongside words the computer has already recognized.

How it works

To verify that humans decipher these unrecognized words correctly, two words are displayed instead of the usual one. One of them is generated in the usual CAPTCHA form, so its answer is already known; the other is a word the OCR software could not read. If a user solves the test for the known word, the answer given for the unrecognized word is assumed to be correct as well. Even so, more than one user has to confirm the unrecognized word before it is considered solved.
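The two-word check described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the names `UnknownWord`, `check_challenge`, and `resolved_text` are hypothetical, and the threshold of two agreeing users is an assumption standing in for "more than one user".

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class UnknownWord:
    """A scanned word OCR could not read, plus the human guesses so far."""
    image_id: str
    guesses: Counter = field(default_factory=Counter)


def normalize(text: str) -> str:
    """Compare answers case-insensitively and ignore surrounding spaces."""
    return text.strip().lower()


def check_challenge(control_answer: str, control_typed: str,
                    unknown: UnknownWord, unknown_typed: str) -> bool:
    """Pass the user only if the known (control) word is typed correctly.

    A guess for the unknown word is recorded only when the control word
    was solved, since that is the sole evidence the user is reliable.
    """
    if normalize(control_typed) != normalize(control_answer):
        return False
    unknown.guesses[normalize(unknown_typed)] += 1
    return True


def resolved_text(unknown: UnknownWord, min_agreeing: int = 2):
    """Return the agreed reading once enough users gave the same answer.

    min_agreeing=2 is an assumed threshold, not a documented value.
    """
    if not unknown.guesses:
        return None
    text, votes = unknown.guesses.most_common(1)[0]
    return text if votes >= min_agreeing else None
```

In this sketch a single correct submission passes the user but leaves the unknown word unresolved; only a second, matching guess from another passing user yields a reading.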

How it is provided

reCAPTCHA tests are served from the central site of the reCAPTCHA project[1], which supplies the unrecognized words. This is done through a JavaScript API; once the request has been submitted, the website's server makes a callback to reCAPTCHA. The reCAPTCHA project provides libraries for various programming languages and applications to make this process easier. reCAPTCHA is a free service, except for users who would require a prohibitive amount of bandwidth.
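The server-side callback might look something like the sketch below. Everything specific in it is an assumption made for illustration: the URL, the field names, and the reply format (a first line reading "true" on success) are hypothetical placeholders rather than the real reCAPTCHA API, and the HTTP call is passed in as a function so the logic can be exercised without a network.

```python
from typing import Callable, Dict

# Hypothetical placeholder, NOT the real reCAPTCHA endpoint.
VERIFY_URL = "https://captcha-service.example/verify"

# A function that POSTs form fields to a URL and returns the reply body.
PostFn = Callable[[str, Dict[str, str]], str]


def verify_submission(post: PostFn, secret_key: str, challenge_id: str,
                      user_answer: str, client_ip: str) -> bool:
    """Ask the central service whether the user's answer was accepted.

    All field names below are assumed for illustration only.
    """
    payload = {
        "secret": secret_key,       # identifies the website to the service
        "challenge": challenge_id,  # which pair of words was shown
        "answer": user_answer,      # what the user typed
        "remote_ip": client_ip,     # address of the submitting client
    }
    reply = post(VERIFY_URL, payload)
    # Assumed reply format: first line is "true" or "false".
    lines = reply.strip().splitlines()
    return bool(lines) and lines[0].strip() == "true"
```

A website would call `verify_submission` from its form handler and reject the submission when it returns `False`; in practice one of the project's own client libraries would take the place of this hand-written function.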

Notes

  • reCAPTCHA has the same goal as Distributed Proofreaders (DP), though DP uses conventional proofreaders.
  • reCAPTCHA press release: "May 24: Carnegie Mellon Project Boosts Book Digitization Efforts - Carnegie Mellon University". Retrieved 2007-06-23.
  • "Spam weapon helps preserve books". BBC News report by Paul Rubens, 2007-10-02.
