

Published Online August 14, 2008
Science DOI: 10.1126/science.1160379

Reports

Submitted on May 12, 2008
Accepted on August 5, 2008

reCAPTCHA: Human-Based Character Recognition via Web Security Measures

Luis von Ahn 1*, Benjamin Maurer 1, Colin McMillen 1, David Abraham 1, Manuel Blum 1

1 Computer Science Department, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA.

* To whom correspondence should be addressed.
Luis von Ahn, E-mail: biglou{at}cs.cmu.edu

CAPTCHAs (Completely Automated Public Turing test to tell Computers and Humans Apart) are widespread security measures on the World Wide Web that prevent automated programs from abusing online services. They do so by asking humans to perform a task that computers cannot yet perform, such as deciphering distorted characters. Our research explored whether this human effort can be channeled into a useful purpose: helping to digitize old printed material by asking users to decipher scanned words from books that computerized optical character recognition (OCR) failed to recognize. We showed that this method can transcribe text with a word accuracy exceeding 99%, matching the guarantee of professional human transcribers. Our apparatus is deployed on more than 40,000 Web sites and has transcribed over 440 million words.







Science. ISSN 0036-8075 (print), 1095-9203 (online)