Top 10 Stories of 2017, #10: Computer Beats Poker Pros in 'Brains vs. AI'
This year's Top Stories are brought to you by the VerStandig Law Firm, LLC. Combining a keen understanding of the gaming world and an equally keen understanding of the law, Mac VerStandig and his colleagues are devoted to fighting on behalf of the poker community and its members.
Who would have thought that 2017 would be the year that AIs surpass humans in a game near and dear to our hearts — no-limit Texas hold'em? Considering how strategically advanced human no-limit hold’em knowledge has gotten, it’s not all that surprising that computer programs have also advanced to a point beyond what human brains can achieve. After all, it was only a matter of time.
As a game of imperfect information in which different agents (players) have access to different information, with 10^161 decision points when starting stacks are $20,000 and bets are limited to dollar increments, Texas hold’em is complex enough to serve as "the main benchmark challenge" for designing AI that can be applied to other fields of imperfect information.
Road to Besting the Humans
AI's history with strategy games spans more than three decades, but only the most recent bots have overtaken humans in the popular game of NLH. A bot called Claudico went up against four top heads-up hold’em players in March 2015 but proved too exploitable to beat them, although the overall numbers may have been a little misleading.
The four players, Bjorn Li, Doug Polk, Dong Kim, and Jason Les, collectively won $732,713 over 80,000 hands (even though Les lost $80,842). However, more than $170 million was wagered during the challenge, which makes the humans’ profit less than 0.5 percent of the total bets, a margin judged to be statistically insignificant.
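The percentage above can be checked with a few lines of arithmetic. The sketch below uses only the two dollar figures quoted in the article; because the total wagered is given as "more than $170 million," the result should be read as an upper bound on the humans' share, and the function name is purely illustrative.

```python
# Figures quoted in the article.
human_profit = 732_713        # dollars won collectively by the four pros
total_wagered = 170_000_000   # "more than $170 million", so treated as a lower bound


def profit_share_percent(profit, wagered):
    """Return profit as a percentage of the total amount wagered."""
    return 100.0 * profit / wagered


share = profit_share_percent(human_profit, total_wagered)
print(f"Humans' profit as a share of total bets: at most {share:.2f}%")
```

Since the real total wagered was higher than the figure used here, the true share is slightly smaller than the number printed.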
The Carnegie Mellon University team did not stop with Claudico. In particular, Tuomas Sandholm and Noam Brown of the CMU Computer Science Department kept at it, taking feedback from the poker pros and incorporating it into more bot designs. They came up with Baby Tartanian 8 in 2016 and Libratus was born after that, ready to compete in 2017.
In January of 2017, the Brains vs. AI "rematch" was set against the new bot in town, and the human team changed personnel a bit with Kim and Les being joined by Jimmy Chou and Daniel McAulay.
The rematch had a different result from the first bout: a significant win for the bot, which finished $1,766,250 ahead in chips after 120,000 hands against its human opponents. Libratus’ performance also earned the team the HPCwire ‘Best Use of AI’ award in November.
Doug Polk said of Claudico, "It had some leaks, but those were fixed by the time they came out with their new program, 'Libratus.' It became clear to me at this point humans had fallen behind the curve in the game of heads up no-limit, and I imagine this will continue to happen in other formats."
By other formats, Polk refers to the fact that the bots' main weakness at this point is in their lack of applicability in formats other than heads up no-limit, like six-handed or nine-handed games or other poker variants. But as Polk indicates, we may not be far from that either.
So what has the poker AI advancement taught us both as poker aficionados and members of a rapidly advancing technological society?
Strategy From AI
Jason Les has been one of the most involved poker pros in the poker AI testing, getting to experience firsthand playing against various iterations of the bots. Les told PokerNews that although he hadn’t studied the subject since college, he’s always found AI fascinating and he got involved as a human competitor when the CMU team challenged top human heads-up players against Claudico. Les was even recently reunited with "his buddy" Libratus at the Neural Information Processing Systems conference.
Poker master Jason Les (@heyitscheet) will be at the #NIPS2017 demo session Tuesday night to have a reunion with hi… https://t.co/RLgZBGr9jB— Noam Brown (@polynoamial)
“Playing Libratus was a very intense and trying experience. It employed a strategy closer to Nash equilibrium than has ever been seen before, so by definition it did not have weaknesses that could be exploited. We initially hoped we could find an edge by using bet sizes that Libratus did not have in its abstraction. However, we later discovered that whenever we tried a bet size that it was unfamiliar with, the A.I. would run an algorithm overnight where it learned the new bet size and patched the vulnerability.”
The abstractions Les refers to are explained by the bot's builders in a paper entitled "Libratus: The Superhuman AI for No-Limit Poker," which elaborates on the AI's inner workings. It appears in the proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, where the team gave a demonstration, and was just published on Dec. 18.
Libratus did not have a pre-programmed strategy, as some may assume. Instead, it generated its strategy algorithmically, approximating Nash equilibrium before play began. The algorithm relied on “action abstractions,” or groups of bet sizes employed for different situations, and on “card abstractions,” which group similar poker hands together and treat them identically. These abstractions allowed Libratus to cut the game's decision points from 10^161 down to 10^12.
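To make the two kinds of abstraction concrete, here is a deliberately simplified sketch. It is not Libratus's actual method: the list of bet sizes, the nearest-size mapping, the equity-based bucketing, and every function name are assumptions chosen only to illustrate the idea of collapsing many real situations into a few abstract ones.

```python
# Hypothetical action abstraction: a handful of bet sizes, as fractions of the pot.
ABSTRACT_BET_FRACTIONS = [0.5, 1.0, 2.0]


def nearest_abstract_bet(bet, pot):
    """Map a real bet to the closest pot fraction in the (assumed) abstraction."""
    fraction = bet / pot
    return min(ABSTRACT_BET_FRACTIONS, key=lambda f: abs(f - fraction))


def card_bucket(equity, num_buckets=5):
    """Group hands by estimated win probability (0.0 to 1.0).

    Hands that land in the same bucket are treated identically,
    which is the core idea of a card abstraction.
    """
    if not 0.0 <= equity <= 1.0:
        raise ValueError("equity must be between 0 and 1")
    # The top edge (equity == 1.0) is folded into the last bucket.
    return min(int(equity * num_buckets), num_buckets - 1)


# A $130 bet into a $100 pot is treated as a pot-sized bet.
print(nearest_abstract_bet(130, 100))
# Two hands with similar strength share a bucket.
print(card_bucket(0.62), card_bucket(0.65))
```

In this toy version, any bet outside the list is simply rounded to the nearest known size. As Les describes, the real bot went further and learned unfamiliar bet sizes overnight rather than relying on rounding alone.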
Unlike its predecessors, Libratus comprises three main modules with algorithms for each, listed in the aforementioned paper as:
- Computing approximate Nash equilibrium strategies before the event.
- Subgame solving during play.
- Improving Libratus’s own strategy to play even closer to equilibrium based on what holes the opponents have been able to identify and exploit.
“Strength wise, Libratus played a very balanced strategy with the appropriate amount of bluffs in all spots. In addition, it had a perfectly executed mixed strategy which made putting Libratus on a range much more difficult. Humans are not capable of executing the mixed strategy as well, playing the same hands many different ways with different bet sizes and without bias. Libratus had no bias or tendencies, only the strategy it crafted itself over billions of hands.”
Les thinks humans can imitate Libratus’ use of a mixed strategy to some degree. Doing so helps players avoid the trap of playing certain hands or situations the same traditional way each time, which creates predictable patterns that savvy opponents can exploit with simple adjustments.
“From observing Libratus, I think the biggest takeaway for humans is that strategies should be mixed. That is, you should open yourself to playing hands many different ways even in times where it seems very unconventional. Don't let the perceived 'accepted' way to play perpetuated by other players influence you in determining what's right or wrong. Taking this approach, and opening your game up to a variety of bet sizes (both small and large) like Libratus is something that everyone can learn from and use playing against normal human opponents as well.”
Dong Kim mentioned similar ideas in a PokerNews interview following the matches with Libratus, saying that the "world-class" bot used mixed strategies and also overbet as a bluff more than any human he had seen.
In recent months, overbetting has become more common, with high-level players seeking to incorporate it into their games in a balanced way for both bluffs and value bets, much like Libratus. Overbetting strategy was discussed in a video from Matt Berkey and Andrew Brokos and in another article about ideal situations for employing the river overbet, including an example from a Poker Masters final table. In the past couple of years, there has also been increasing talk about, and a growing number of tools for, playing a Game Theory Optimal (GTO) strategy.
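One way to see why balance and bet size go together is a standard textbook toy model of a river bet, which is not taken from the Libratus paper. It assumes the bettor holds either a value hand that always wins when called or a bluff that always loses when called. If bluffs make up a fraction x of the betting range, the caller's expected value for calling is x * (pot + bet) - (1 - x) * bet, and setting that to zero gives x = bet / (pot + 2 * bet). The function name below is illustrative.

```python
from fractions import Fraction


def balanced_bluff_fraction(pot, bet):
    """Bluff share of a betting range that leaves a caller indifferent.

    Toy-model assumption: value bets always win and bluffs always lose
    when called. Solving x * (pot + bet) - (1 - x) * bet = 0 for x
    gives bet / (pot + 2 * bet).
    """
    return Fraction(bet, pot + 2 * bet)


pot = 100
for label, bet in [("half pot", 50), ("pot", 100), ("2x pot overbet", 200)]:
    x = balanced_bluff_fraction(pot, bet)
    print(f"{label}: bluffs can make up {x} of the betting range")
```

Under these assumptions, larger bets support a larger share of bluffs, which is one reason overbets and frequent bluffing tend to appear together in balanced strategies.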
One of the most important aspects we can learn from Libratus is the bot’s approach to learning. Libratus was able to defeat poker’s top NLH heads-up players not only because of its complex algorithm, but also due to its ability to continuously adjust its strategy based on new information and new hands it played.
While we as humans do not have the kind of computing power offered by the Pittsburgh Supercomputing Center (PSC) “Bridges” computer, which provided Libratus with millions of core hours of computation on which to base its strategies, top players do train themselves on potential situations in the game using AI simulation tools and programs such as PioSOLVER, a fast Game Theory Optimal solver for hold’em. Players can also work continuously to find holes in opponents' strategies and adjust by forming strategies that exploit those weaknesses.
As Les notes, people may be able to leverage poker AI's advancement to improve their own game.
"The advancement of A.I. gives people really strong tools for improving their poker game. Playing good players is always a great way to learn how to play poker and the only way you can do that today is playing for real money and if you really want to challenge yourself it has to be at stakes higher than you normally play. With A.I. training tools, players could have the opportunity to play an opponent playing their absolute best without risking any money whatsoever."
Future of the Game
Despite all of this advancement in the game of NLH, Brown, one of Libratus' creators, explained that the game won't be "solved," at least not any time soon. In a 2016 interview with PokerNews, he said, "There is no chance that no-limit Texas Hold'em is going to be solved within our lifetime, if ever. That said, there are ways to get good approximate solutions and I think that certainly in the next few years we will see a bot that can take down the very top pros, but there is a big difference between that and solved."
That prediction did indeed come to fruition less than a year later with Libratus, but as Brown articulated, the success Libratus had is due to "good approximate solutions," emanating primarily from the abstractions in the program.
But even as computing power gets more advanced and more cost-effective, most real-life poker situations will continue to involve humans playing other humans and as we know, humans are — well — only human. Humans make mistakes and have emotions. They also get distracted, have lives outside of poker and most of them have limited time to study the game because of "real" jobs. People are limited in their mental computing power and many of them play poker with other goals like entertainment and fun more than for profit. All of these factors combined render no-limit Hold'em still very much alive, at least for the time being.
Dan “Jungleman” Cates echoes this sentiment, saying in a recent Paul Phua Poker video that despite the fact that our beloved two-card game is relatively “solved” in a lot of ways, the norm at most levels of play is still suboptimal poker that savvy players can exploit for profit.
"Even though solutions come out, people are going to not play perfect."
While some concerns have arisen regarding the potential for bots to be used to cheat players out of money in online games, there are several precautions that both players and online sites can take to protect themselves online. Players can look out for identical patterns of bet sizing and timing, repeated uncommon betting lines or inability to respond in chat, and can report suspicious behaviors to poker sites.
In addition, the technology for preventing the use of bots and other forms of cheating, such as collusion, is constantly advancing, making online poker even safer in those countries and states where internet poker is regulated by the government. Whether protection against bots can keep pace with the bots themselves remains a major question, one that Les also brought up.
"The implications for AI development is unfortunately games will become less safe to play online. While Libratus only plays heads-up, as A.I. advances and begins encroaching on 6-max and full-ring as well, we will be faced with some difficult decisions about the safety of playing online. I know poker sites are working very hard to combat this, as their business relies on it, and I hope they are able to have success deploying countermeasures.
"In the long run, I think the viability of internet poker depends on constructing new games that are hardened against A.I. development. That may mean adding more cards to the deck, increasing stack sizes, or any other changes of the sort."
Whether or not it is driven primarily by online cheating prevention, the trend toward more variation in poker games that Les mentions seems to be growing of late, and for now, all signs point to one conclusion: poker isn't dead.
The VerStandig Law Firm, LLC represents poker professionals, sports bettors and advantage players across the United States. The firm assists clients in connection with legal issues including personal LLC formation and operation, tax planning that focuses on gaming deductions and exemptions, casino disputes, and personal matters spanning from divorce to criminal dust-ups.
Lead image courtesy of CMU.edu