Human champs of 'Jeopardy!' vs. Watson the IBM computer: a close match

As two game show champs compete with the supercomputer Watson, anyone -- or anything -- could win

In the battle of man vs. machine, Eric Nyberg got crushed.

More than a year ago, the Carnegie Mellon computer science professor battled wits with Watson, IBM's super-smart supercomputer that will compete on "Jeopardy!" starting Monday night. CMU was the first university to sign on to the Watson project, said Dr. Nyberg, and has been working with IBM for the past four years to develop it.

For the next three days, Watson will take on "Jeopardy!" uber-champions Ken Jennings and Brad Rutter in a $1 million televised competition.

But more than just a television spectacle, Watson represents real advances in the field of artificial intelligence.

"It really stretches the state of the art," said Dr. Nyberg, a professor in CMU's Language Technologies Institute within the school of computer science.

Watson was conceived in 2006 by an IBM employee who noticed patrons at a bar flocking to a television to watch Mr. Jennings' 74-game win streak. The company had previously entered man vs. machine chess contests in 1996 and 1997, when its Deep Blue computer took on world champion Garry Kasparov. (Deep Blue lost the first match, but came back to win the second the next year.)

But where chess has a defined set of rules, "Jeopardy!" has goofy, convoluted clues that confuse even human trivia machines. And where chess has a limited scope, the breadth of knowledge covered by "Jeopardy!" is enormous.

When Carnegie Mellon was brought into discussions about the project in 2007, it wasn't even clear that such a computer was viable.

"I didn't consider it to be impossible but it was far from certain," said CMU computer science doctoral student Nico Schlaefer, who worked extensively on Watson. "It would have been a huge disappointment if it hadn't worked out, but it wouldn't be surprising."

A computer that would be able to answer a "Jeopardy!" question, for example, has to first be able to understand the question.

Take a real question that was once asked in a "Decorating" category: "Though it sounds 'harsh,' it's just embroidery, often in a floral pattern, done with yarn on cotton cloth." From all those words (sent by text message, since Watson is not a voice-recognition system), Watson needs to figure out that the question is actually asking for a type of embroidery -- not a type of flower, or a yarn, or something cotton, or something harsh.

Watson works by then searching through its knowledge base -- Mr. Schlaefer estimated it at 30 to 50 gigabytes, or four to eight times the size of Wikipedia -- to generate hundreds of possible answers, using a quick sift to whittle those down to about 100.

For those 100 "candidate answers," Watson then performs about 300 pieces of analysis on each, using its past performance to weigh all of the evidence and generate a "confidence score" for each possible answer.

Finally (and all in the span of about three seconds), Watson must decide whether that confidence score is good enough to hit the buzzer -- depending on how conservative or risky Watson wants to be, given the situation in that particular game.

And in the embroidery example, Watson would have to come up with "What is crewel?" -- figuring out somehow that "harsh" is a synonym for "cruel," which is a homophone of the name of the needlework.
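The scoring-and-buzzing steps described above can be sketched in a few lines of Python. This is only an illustration, not IBM's implementation: the `Candidate` class, the weighted-average formula, the weights and the thresholds are all assumptions made for the example, and the real system weighs about 300 pieces of analysis per candidate rather than three.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Candidate:
    answer: str
    evidence: List[float]  # hypothetical per-analysis scores, each in [0, 1]


def confidence(candidate: Candidate, weights: List[float]) -> float:
    """Weighted average of the evidence scores.

    The weights stand in for what the system has learned from its
    past performance; a real system would fit them from training data.
    """
    weighted = sum(w * e for w, e in zip(weights, candidate.evidence))
    return weighted / sum(weights)


def decide(candidates: List[Candidate], weights: List[float],
           threshold: float) -> Tuple[Optional[str], float]:
    """Pick the top candidate and buzz only if it clears the threshold."""
    best = max(candidates, key=lambda c: confidence(c, weights))
    score = confidence(best, weights)
    return (best.answer if score >= threshold else None, score)


# Made-up evidence scores for the embroidery clue.
candidates = [
    Candidate("crewel", [0.9, 0.8, 0.7]),
    Candidate("needlepoint", [0.4, 0.5, 0.3]),
]
weights = [2.0, 1.0, 1.0]

risky = decide(candidates, weights, threshold=0.5)      # buzzes with "crewel"
cautious = decide(candidates, weights, threshold=0.9)   # stays silent
```

Raising or lowering `threshold` is one simple way to model how conservative or risky the player wants to be at a given point in the game.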

Early results were encouraging -- but nowhere near good enough for "Jeopardy!" An initial monthlong effort to adapt an existing computer system produced one that could answer about 13 percent of previous "Jeopardy!" questions correctly.

By the end of 2008, the system -- now known as Watson -- was attempting about 70 percent of the questions and getting about 70 percent of those right. But the system was taking up to two hours to answer a single question.

"I got very concerned about it at one point," said Mr. Schlaefer of the timing issue, which IBM eventually solved by running the system on about 2,500 parallel processors.
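A back-of-the-envelope calculation puts the late-2008 figures in perspective. The numbers come from the article; treating "up to two hours" and "about three seconds" as exact values is an assumption made only for the arithmetic.

```python
# Figures quoted in the article (rounded, approximate).
attempt_rate = 0.70          # share of questions the system attempted
accuracy_on_attempts = 0.70  # share of attempted questions answered correctly

# Share of all questions answered correctly.
overall_correct = attempt_rate * accuracy_on_attempts

# Rough speedup needed to go from the worst case to game speed.
slow_seconds = 2 * 60 * 60   # "up to two hours" per question
fast_seconds = 3             # "about three seconds" per question
speedup = slow_seconds / fast_seconds

print(f"Answered correctly overall: {overall_correct:.0%}")  # 49%
print(f"Speedup needed: {speedup:.0f}x")                     # 2400x
```

So the system was answering roughly half of all questions correctly, and the required speedup of about 2,400 times is of the same order as the roughly 2,500 parallel processors IBM used, although real parallel speedups are rarely that clean.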

When Dr. Nyberg took on Watson, he actually knew most of the answers. But his 49-year-old reflexes were unable to beat Watson to the buzzer.

"A very early version of Watson was able to dominate me," he said. "I probably could have won if I had a chance of buzzing in faster. As you get older, your reflexes slow down."

In addition to Dr. Nyberg, Watson took on IBM employees and former "Jeopardy!" players and champions in preparation for this week's $1 million matches.

One change made along the way was to send Watson the responses -- correct or incorrect -- of the other contestants. The practice-round host had been mocking Watson for ringing in right after a competitor's incorrect response and giving the exact same wrong answer.

Watching Watson compete -- and improve -- was "both inspirational and humbling," said IBM researcher David Gondek. Every time he thought Watson was ready to compete against a higher caliber of player, the humans would surprise him.

"Actual 'Jeopardy!' players were a whole quantum leap better" than the people used in early testing, he said. "They're like athletes, they know so much."

But eventually, Watson caught up. In 100 test matches against "Jeopardy!" winners, the computer won about 65 percent of the time.

The match against Mr. Jennings and Mr. Rutter has been taped, but those present in the IBM conference-room-turned-Jeopardy-set -- including Dr. Nyberg and Mr. Schlaefer from CMU -- are sworn to secrecy.

The multi-hour taping was "easily the highest level of tension that I have ever experienced in my life," said Dr. Nyberg. "It was quite a spectacle."

For CMU, the lessons of Watson will continue well past this week's television shows.

Mr. Schlaefer will use his contribution to Watson for his dissertation on "statistical source expansion." His work helped Watson acquire the best combination of Web sources for its enormous "corpus" of information. (Watson is not, however, connected to the Internet during matches.) Another Carnegie Mellon doctoral student, Hideki Shima, contributed one of the functions used to analyze text as Watson researches possible answers.

In addition to Carnegie Mellon, seven other universities worked on aspects of Watson.

The work on Watson not only advances the field of artificial intelligence, said Dr. Nyberg, but will likely have practical uses in fields such as health care and defense.

"At this point, win or lose, we feel like we've accomplished something," said Dr. Gondek of IBM. "We feel like we're competitive with humans."

Anya Sostek: 412-263-1308.
