Martha Rial, Post-Gazette
Carnegie Mellon University professor Tuomas Sandholm with the IBM super computer in the School of Computer Science Data Center that he uses to crunch numbers
Carnegie Mellon University researchers are raising the stakes now that their Texas Hold 'em computer program has beaten other competing programs and forced some expert human players to fold.
They are working on updated versions of their program that they predict, in time, will beat any poker ace, human or machine, in head-to-head competition.
Dr. Tuomas Sandholm, a CMU computer science professor, and Andrew Gilpin, a CMU doctoral student in computer science, have been working on the program for 2 1/2 years, beginning with their unbeatable program that played Rhode-Island Hold 'em, a three-card poker game.
Next they produced a program, known as GS1, that plays the more complex, seven-card Texas Hold 'em, and they had a paper published last summer for the National Conference of the American Association for Artificial Intelligence in Boston.
Recently they completed an upgraded version, GS2, that Dr. Sandholm said beat GS1 and should begin to make it more difficult for human poker masters to maintain their poker faces.
Dr. Sandholm is attending the national conference this week in Boston, where he and his GS2 program are competing in the poker competition.
Surprising aspects of the program include the fact it was developed by two men who are not expert poker players.
The second surprise? The program is not based on strategies employed by the world's best poker players. Instead Dr. Sandholm and Mr. Gilpin pursued a purely mathematical approach to poker by using game and linear theory, algorithms, computer ingenuity and some voluminous data-crunching.
Andrew Fuqua, who makes good money playing poker online, won $100 against GS1 in 250 hands, with each starting with $1,000. He said he exploited program weakness in GS1.
"I think it's an excellent start," said Mr. Fuqua, the director of software engineering at CombineNet Energy in Pittsburgh, where Dr. Sandholm is head of research. "If anyone can make a heads-up poker-bot to beat humans, Tuomas and Andrew will be the ones to do it."
In Texas Hold'em, two cards are dealt to each player, leading to a round of bidding. Next, three community cards known as the "flop" are dealt, leading to more bidding. At any point a player can fold.
A fourth community card, known as the "turn," is dealt, leading to another round of bidding. That is followed by the dealing of a final community card, known as the "river," that prompts the final round of bidding. A player's hand is based on the best five of those seven cards.
It's impossible for any player or program to win every hand because of the randomness of the deal, or what players refer to as luck. So the goal of the program is to win a majority of hands, especially those with the most chips on the table.
Dr. Sandholm explains their computer strategy with obvious glee.
Poker, unlike chess, where all information is available to both players, involves two players competing with incomplete information about what cards the opponent holds.
It also combines rational actions mixed with irrational ones. One example of irrational behavior is when people engage in bluffing -- the act of bidding aggressively while holding a mediocre or bad hand, hoping that players with better hands will fold.
The CMU program analyzes billions of possible hands that can be dealt. At each stage of the game, the program re-evaluates the possible outcomes before making a bid.
The program analyses its own hand in relationship to possible hands the opposing player might be holding.
Dr. Sandholm uses the game of "Rock, Paper, Scissors" to explain game theory, and how the poker games differ from others.
Rock, Paper, Scissors is random. If the computer randomly flashed one of the three choices, it would maximize its wins. If it began to emphasize one of the three over the other two, the opponent eventually could notice that trend, counter it and gain an edge.
But poker cannot be reduced to that simple of a choice, Dr. Sandholm said.
Using an IBM super computer, Dr. Sandholm and Mr. Gilpin grouped different hands into broad categories that should be played in identical fashion to maximize the probability of winning. The program considers all possibilities in subsequent rounds at each stage, as more cards are dealt, to determine what strategy to pursue in earlier rounds.
"We did it mathematically," Dr. Sandholm said. "We changed it many times before we got it right."
To test its GS1 version of the Texas Hold'em program, Dr. Sandholm and Mr. Gilpin had it play the best Texas Hold 'em computer programs on the market. GS1 beat them all when 20,000 hands were played.
It did not fair quite as well against seven intermediate and expert human players, such as Mr. Fuqua, but it was competitive and beat three of the seven.
The newer version, GS2, has yet to be tested against human players, but it soundly defeated GS1. Already there are plans for GS3.
"At some point we will have a program better than the best human players," Dr. Sandholm said, predicting a 60-40 success rate that takes into account the element of luck. "If we play enough hands, we'll be making money. We would lose some hands but win the big pots.
"Someday we will have a program with positive expectations against humans."
Not bad, considering Dr. Sandholm and Mr. Gilpin's skill level as poker players.
"Neither of us is a great poker player," he said, pausing to reconsider his statement. "Actually, both of us are pretty poor poker players."
David Templeton can be reached at firstname.lastname@example.org or 412-263-1578