Latest weapon against spam also enlists computer users to assist the Internet Archive


Luis von Ahn's office is a welcome burst of color and playfulness in an otherwise dull and stodgy building. Save for his computer chair, his room in CMU's Wean Hall offers only beanbag chairs and bouncy balls as visitor seating. Stuffed animals and Legos line the shelves on the walls.

The visual parade of color is just another manifestation of the computer science professor's creative spirit. At 28, Dr. von Ahn has already made a name for himself with a simple yet radical idea: capitalizing on tasks computers can't do, but human minds can, to enhance artificial intelligence.

As a graduate student at CMU in 2000, Dr. von Ahn worked with his adviser, Manuel Blum, to develop a test that foils Internet spambots. They called these tests Captchas -- Completely Automated Public Turing test to tell Computers and Humans Apart. Ever had to decipher a squiggly, distorted word to gain access to a site like Yahoo? That's a Captcha.


Those familiar squiggly letters are Captchas, an upgraded security measure used by many Web sites to distinguish humans from spambots.
Robin Rombach, Post-Gazette
Dr. Luis von Ahn, assistant professor of computer science at Carnegie Mellon University, in his colorful office in September. As a graduate student at CMU, he helped invent Captchas, which aimed to thwart hackers by using words or numbers online in weird script that one must identify and type into a space to access a Web site. In May, he launched an improved version called reCaptchas.

But resourceful hackers have found ways to solve Captchas, and Dr. von Ahn has had to rethink the program.

"It's an arms race," he said. "We come up with something that programs shouldn't be able to read. Then somebody comes up with a way to read it, so we have to come up with a better one."

So in late May, he launched reCaptchas, an improvement upon Captchas that he predicts will take years to break.

The reCaptchas are innovative not only because they effectively thwart spambots, but also because the very feature that frustrates computers benefits humanity -- by aiding the Internet Archive, a nonprofit project that aims to digitize books.

An estimated 6 million Captchas are completed each day, costing users 10 seconds each, wasting an overwhelming number of labor hours. Dr. von Ahn has made it his mission to harness that manpower and put it to use.
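To put that estimate in perspective, the arithmetic can be sketched in a few lines of Python. The two input figures come from the estimate above; the function name and the conversion to round-the-clock years are illustrative assumptions, not anything Dr. von Ahn published.

```python
# Figures quoted in the article (estimates, not measurements).
CAPTCHAS_PER_DAY = 6_000_000
SECONDS_PER_CAPTCHA = 10


def daily_hours(captchas_per_day: int, seconds_each: int) -> float:
    """Total human time spent on Captchas per day, in hours."""
    return captchas_per_day * seconds_each / 3600


hours = daily_hours(CAPTCHAS_PER_DAY, SECONDS_PER_CAPTCHA)
print(f"{hours:,.0f} hours per day")  # 16,667 hours per day

# Hypothetical comparison: the same time expressed as years of
# nonstop, 24-hours-a-day effort.
print(f"{hours / 24 / 365:.1f} round-the-clock years per day")  # 1.9
```

By that rough count, every day of Captcha solving consumes nearly two years of continuous human attention.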

Here's how reCaptchas work: Instead of generating random words, the program pulls words from scanned books that are too smudgy for computers to read, and gives them to human users to decipher.

"We only take words the computer can't read," said Dr. von Ahn. "That extra step makes it much more secure, because we just threw away everything a computer could read."

Eventually, after multiple users agree on a word, it is fed back into the Archive. Bit by bit, distorted scanned books are translated into text that is available to the public.
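The agreement step might look something like the sketch below. It is not the actual reCaptchas code: the class name, the word identifiers and the threshold of three matching answers are all assumptions made for illustration, since the process described here says only that "multiple users" must agree.

```python
from collections import Counter, defaultdict

# Assumed value; the article says only that "multiple users" must agree.
AGREEMENT_THRESHOLD = 3


class WordConsensus:
    """Hypothetical tally of human readings for words a computer could not read."""

    def __init__(self, threshold=AGREEMENT_THRESHOLD):
        self.threshold = threshold
        self.votes = defaultdict(Counter)  # word_id -> counts of each reading
        self.accepted = {}                 # word_id -> agreed-upon text

    def submit(self, word_id, typed):
        """Record one reading; return the accepted text once enough users agree, else None."""
        if word_id in self.accepted:
            return self.accepted[word_id]
        guess = typed.strip().lower()
        if not guess:
            return None
        self.votes[word_id][guess] += 1
        if self.votes[word_id][guess] >= self.threshold:
            self.accepted[word_id] = guess
            return guess
        return None


pool = WordConsensus()
pool.submit("scan-17", "morning")
pool.submit("scan-17", "mourning")   # a dissenting reading
pool.submit("scan-17", "Morning ")   # same reading after normalizing
result = pool.submit("scan-17", "morning")
print(result)  # morning
```

Requiring several matching answers, rather than trusting the first one, means a single typo or a lucky guess by a spambot is not enough to put a wrong word into the digitized text.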

Launched a little over a month ago, reCaptchas are already everywhere. Dr. von Ahn rattles off a long list of Web sites -- John Edwards for President, Ben and Jerry's, the Iowa correctional facilities program, among others.

That includes "OnlineBootyCall.com, which is surprisingly a dating site, not a porn site," he said. "Although there is a porn site that uses the service, too."

Maintaining and finalizing reCaptchas keeps Dr. von Ahn pretty busy -- he estimates that he won't be done for another six months.

But he's still found the time to further his goal of making humanity as efficient as possible, and he's done it with flair. The easiest way to get people to do work, Dr. von Ahn reasoned, was to make their tasks enjoyable. So he plans to release a pantheon of new online games within the month that use human minds to solve computing problems.

Matchin' gets players to identify attractiveness, thus creating an image archive searchable by, well, "prettiness." InTune, which is still being worked out, aims to make searching for audio clips possible by criteria other than the file name. Three others -- Babble, Squigl and Verbosity -- do much the same. Artificial intelligence, often billed as superior to the human brain, is now actually benefiting from uniquely human capabilities like common sense and aesthetic sensibility.

One of his old games -- the ESP Game, licensed by Google in 2006 -- has been massively successful in improving computer functions. Pairs of random players from around the world type in words to describe an image, hoping to hit upon the same description as their partner. When players agree on a word, the picture gets tagged as that word. The game is now at work making Google's image search more accurate.
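The matching rule at the heart of the game can be sketched in a few lines. This is a simplified illustration rather than the ESP Game's real code: the function name, the player numbering and the image identifier are hypothetical.

```python
def first_agreement(guesses):
    """Return the first word both players have typed, or None.

    guesses: iterable of (player, word) pairs in the order typed,
    where player is 0 or 1.
    """
    seen = (set(), set())  # words typed so far by player 0 and player 1
    for player, word in guesses:
        normalized = word.strip().lower()
        if normalized in seen[1 - player]:
            return normalized
        seen[player].add(normalized)
    return None


tags = {}  # image id -> list of agreed labels
round_guesses = [(0, "dog"), (1, "puppy"), (0, "grass"), (1, "Dog")]
label = first_agreement(round_guesses)
if label is not None:
    tags.setdefault("image-001", []).append(label)
print(tags)  # {'image-001': ['dog']}
```

Because neither player can see what the other is typing, a match is good evidence that the word really does describe the picture.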

"With all these games, we've gotten precisely the data we wanted," he said. "Every time, it works."

He came up with the idea of using games after thinking for a long time about how to make humans do the work that computers can't.

"One day, I was on a flight and there were three people next to me all playing crossword puzzles," he said. "At the time, I was like 'These people are doing things computers can't do, right here, of their own free will!'"

He admitted that he later found out that computers can do crosswords even better than humans. But at that point, it didn't matter -- he had found the answer he needed.

At CMU, Dr. von Ahn teaches a freshman lecture class called Great Theoretical Ideas in Computer Science. He chose to be a professor over a career at Microsoft or Google because, he said, "I'm selfish."

"The spellchecker in Google is an amazing piece of software because it doesn't use a dictionary," he said. "But do you know who made that? Nobody does.

"The problem with working for Google is that there are like 10,000 really smart people, but everything they make is collectively 'Google.' I'd rather have people know what I did."

And people are certainly taking note -- last September he won the MacArthur Fellowship, commonly known as the Genius Grant. He has won numerous other fellowships, and was featured in Wired magazine last week.

Dr. von Ahn has no idea where he wants to go after his games and the reCaptchas are up and running. He has mentioned using his system to digitize newspapers, or increase bank security. But for now, the future does not concern him; between perfecting his programs and planning his wedding, "I have enough to keep me busy for the next two years," he said.

He also has no idea yet what he wants to do with the $500,000 MacArthur prize money. No matter what, though, "I want to use it for something that's good for the world," he said. "I want to impact humanity."

For information on reCAPTCHA, or to try your hand at one of the puzzles, visit recaptcha.net. To learn more about the Internet Archive, see www.archive.org.


Laura Yao can be reached at lyao@post-gazette.com or 412-263-1878.

