What reCAPTCHA Really Does

Another interesting item came from Ultan (@ultan), our sometime contributing author and UX dude.

This TEDxCMU talk by Luis von Ahn explains reCAPTCHA. Before you pass because you (and everyone else) hate captchas, know that reCAPTCHA is different. When you complete a reCAPTCHA, you’re helping Google (which acquired the company in September 2009) digitize the world’s printed material.

Instead of generating random characters, which frequently produces accidental real words and unintentional comedy/annoyance, reCAPTCHA uses words from scanned text that OCR cannot recognize. So when you’re verifying your humanity, you’re also contributing to the larger cause of Google’s growing collection of once-printed-now-digitized material.
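For the curious, here is a rough sketch of how that could work. As the original reCAPTCHA was publicly described, each challenge paired a word the system already knew (the control) with one OCR had failed on; if you got the control right, your reading of the unknown word counted as a vote. This is not Google's code. Every name and both thresholds below are my own assumptions, purely for illustration.

```python
from collections import Counter


def normalize(word):
    """Compare answers case-insensitively and ignore stray whitespace."""
    return word.strip().lower()


def submit(votes, control_truth, control_answer, unknown_id, unknown_guess):
    """Handle one solved challenge.

    The user passes only if the control word is typed correctly. Only then
    is their guess for the unknown word recorded as a vote.
    """
    if normalize(control_answer) != normalize(control_truth):
        return False  # failed the human check; discard the guess
    votes.setdefault(unknown_id, Counter())[normalize(unknown_guess)] += 1
    return True


def resolve(votes, unknown_id, min_votes=3, min_share=0.75):
    """Return the agreed reading of an unknown word, or None if undecided.

    min_votes and min_share are hypothetical thresholds, not real values.
    """
    counts = votes.get(unknown_id)
    if not counts:
        return None
    total = sum(counts.values())
    word, top = counts.most_common(1)[0]
    if total >= min_votes and top / total >= min_share:
        return word
    return None
```

Feed it a few answers for the same scanned word and `resolve` stays undecided until enough verified humans agree, at which point the word can be treated as digitized.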

As with a lot of Google’s efforts, you may not agree that this is good, or at least not evil, but it’s pretty cool from a technical perspective. The same goes for YouTube’s algorithm for catching copyright infringement and placating record labels, another fascinating technical feat.

Update: As Ultan points out in comments, the meat of this video concerns DuoLingo, Luis von Ahn’s next awesome project that aims to translate the web (you read that right) for free by offering users free language training.

It’s a great trade. Users get really good (or so he claims) language training, and DuoLingo translates portions of the web very cheaply, paying for little more than the computing power. Plus, users translate real content, not bogus sentences crafted to bore you to tears.

Super clever idea, which fits right in with reCAPTCHA and the ESP Game, both of which Google has picked up (it acquired reCAPTCHA and licensed the ESP Game for Google Image Labeler).

I love these projects because they’re both clever and employ game mechanics.

And that will teach me to rush a post out before watching the full video. My bad.

About Jake

a.k.a.: jkuramot

4 comments

  1. Crap, I didn’t get a chance to watch all of the video. Even so, most people don’t know what reCAPTCHAs do, so that’s helpful. Must watch remainder of video now, have a flickering memory that I read about DuoLingo at some point. Will update with thoughts.

  2. I totally did not get the reCAPTCHA thingy. So now instead of wasting 10s, they plan to waste 20s? Annoy them twice as much? The core of that reCAPTCHA thing sounds flawed to me. Basically Google wants to make money for free by vexing users. What am I missing here?

    On DuoLingo, I have my doubts whether this will really work. I mean, you are planning to translate some text. You obviously don’t have the answer. You are using learners to translate. How can you make sure quality is good? Even if you combine several learners, they are all still learners and will make the same common mistakes.

    Maybe I am wrong, need to know more details. But good stuff either way. At least it’s a start. So credit goes to them. They tried language processing based on grammar, then a statistical approach, and now this new user-based collaborative approach. Let’s see!

  3. So, captchas are a necessary evil or so we’re told. I guess reCAPTCHA helps you feel better about them. I’m not sure helping Google digitize books is on my list of important things to do, so yeah, I’m with you.

    DuoLingo is promising. They use algorithms to compare translations for accuracy, and it’s hard to argue with his metrics and examples. Looks like the system could work, and it’s win-win.
