EmailEmail
PrintPrint
The brilliant professor
CMU computer scientist creates problem-solving algorithm
Thursday, October 23, 2008

Like a locksmith with a master key, Carlos Guestrin has created a computer algorithm that can do everything from figuring out the best way to detect water contamination to revealing which political blogs do the best job of staying on top of the news.

The Carnegie Mellon University computer science professor's work landed him on this year's "Brilliant 10" list created by Popular Science magazine, showcasing some of the nation's top young researchers.

The key to Dr. Guestrin's algorithm, which is a recipe a computer uses to solve complex problems, is that it can assemble the maximum amount of information for the least effort.

Dr. Guestrin, 33, of Point Breeze, began work on the project four years ago when he was doing research for Intel in Berkeley, Calif.

Scientists there wanted to measure the climate variations under the soaring canopies of the redwood forests in northern California by installing a network of wireless sensors.

"I said to them, 'How will you decide where you're putting the sensors in the forest?' And they said, 'Oh, whatever looks good, we'll use our intuition.' "

But Dr. Guestrin thought he could come up with a more logical approach. "I thought maybe I'd do a project where I'd work with them a month on this, and four years later we're into something much bigger and more elaborate and interesting than I thought I could possibly get into."

In a simplistic sense, Dr. Guestrin's algorithm figures out the best place to put the first sensor in a complex network, then the best place for the second-best sensor, and so on down the line until there is not much incremental value in adding another sensor.

"If I've put out four sensors," he said, "putting out a fifth one might help me a lot, but if I've put out 100 sensors, the 101st will help me less, and if I put out 1,000, the 1,001st will help even less."

Knowing the minimum number of sensors needed to get optimal information can be especially important when the measurement devices are expensive.

That was a key issue when the federal Environmental Protection Agency wanted to know the best way to detect contamination in an urban water system. In the model his team was working with, Dr. Guestrin said, there were 12,000 possible locations to put sensors, but even the cheapest sensor cost $14,000.

Using his algorithm, he was able to show that officials could get a good answer on water contamination by using just 19 strategically placed sensors.

The algorithm achieved the same kind of efficiency in a project his team did with the Quality of Life Technology Center, a joint project of Carnegie Mellon and the University of Pittsburgh designed to help older people and those with disabilities live independently.

Dr. Guestrin's team was asked to figure out the best placement of sensors on the back and seat of a chair to measure people's postures when they were sitting.

There already are sheets available that have pressure sensors embedded every square centimeter, but they cost $16,000 each, he said. The team came up with a prototype chair with far fewer sensors that cost only about $100.

Dr. Guestrin's algorithm is part of a broader effort under way in computer science known as machine learning, which he defined as "computer programs that can improve their performance based on experience." Machine learning programs, in other words, can discover new information on their own.

That showed up clearly when Dr. Guestrin and graduate students Andreas Krause and Jure Leskovec developed their blog-rating program.

The three cooked up the idea in Dr. Guestrin's office after they started discussing the overwhelming amount of information that's available on the Web. With political blogs alone, he said, "There are millions of postings out there, the blogosphere is overwhelming, you have 10 minutes a day to spend reading up on what's going on, and the question is, 'What blogs should you be reading to stay on top of information?' "

They created a "cascades model" that showed which blogs' postings had the greatest spread through all the other blogs on the Web. "So a good blog," he said, "would be one that caught a big story as close to the source as possible and then did that for several stories."

When they ran the program in 2006, the top-rated blog was Instapundit.com, a site run by a University of Tennessee law professor.

But Instapundit has too many entries for someone who has only 10 minutes a day to spend reading blogs, Dr. Guestrin said. So the program also compiled the top blogs that average about 15 posts a day, and that revealed that there are individuals who are great "summarizers" of the top stories of the day.

It's a telling example of machine learning, he said, because he and his fellow researchers wouldn't have been able to guess who those bloggers were from the overall list.

Dr. Guestrin grew up in Rio de Janeiro, Brazil, and got his bachelor's degree in mechanical engineering from the University of Sao Paulo. He attended graduate school at Stanford University, worked for a year at Intel after getting his Ph.D. in computer science, and then joined the Carnegie Mellon faculty in 2004.

He is now working on long-range projects to develop algorithms that would allow people to make better decisions using information from the Web, and to allow computers to do a better job of divvying up tasks to solve large-scale problems.

Mark Roth can be reached at mroth@post-gazette.com or 412-263-1130.
First published on October 23, 2008 at 12:00 am
Featured Homes
Featured Rentals