Saturday, January 1, 2011

Optimal Poker AI -- I told you so!

A couple years ago, I had a long debate about Texas Hold'em with my brother, a long-time aficionado of the game. He's quite good and he's roomed with friends who actually payed the bills playing online poker. So you can see why he'd get defensive when I claimed that there is an optimal strategy for heads-up Texas Hold'em.

I've thought back to that conversation several times, mainly because it was really frustrating for both of us. I had a proof backing me up -- Von Neuman's 80-year-old minimax theorem* -- an absolutely airtight mathematical proof, but I still couldn't convince him. The problem is that it's an existence proof, so it demonstrates that the strategy must exist, but doesn't tell you what the strategy is. And of course, arguing that "my math says there's an optimal strategy" rings pretty hollow when you can't answer the question, "so what's the strategy?"

So I was very happy to run across this paper and demo presenting the optimal strategy for Rhode Island Hold'em. Rhode Island poker is a stripped down version of Texas Hold'em (get it?). You play with a smaller deck, only one hole card, a one-card flop, and a one-card turn, but no river. There are a few limits on betting, but the game is essentially the same. According the abstract, "Some features of the [optimal strategy] include poker techniques such as bluffing, slow-playing, check-raising, and semi-bluffing."

Very nifty proof of concept.

* A little bit more on the math. The minmax theorem states that every one-on-one winner-take-all game has an optimal strategy profile. A quick FAQ on what it means to be an optimal strategy.

So what does it mean that the strategy is optimal?
The technical definition is that it's an equilibrium strategy. If both players are playing the optimal strategy, neither has an incentive to deviate and play anything else. As one of my profs would say, it's prediction proof. Even if we both know each others' strategies, neither of us has an incentive to deviate.

So this strategy always wins?
No. There is some luck in the game. But if you played it repeatedly, there's no way you would win more than half the time. Almost any non-optimal strategy you'd play, the computer would beat you more than half the time.

Wouldn't an "optimal strategy" be really predictable?
No. In poker, it's often useful to bluff and keep your opponent guessing, so the optimal strategy is partly random. That means that when facing a given situation (e.g. After the flop, I have X cards, I did Y pre-flop, and my opponent did Z) the computer will flip a weighted coin to make its decision (e.g. 3/4 of the time I fold, 1/8 of the time I raise, and 1/8 of the time I check.) So the strategy is perfectly well-defined, but still unpredictable at times.

Does it know how to play short-stack and long-stack poker?
I don't think so. As far as I can tell, Rhode Island poker is played one hand at a time, with limits on betting. So the optimal strategy doesn't need to know anything about stack length, escalating blinds, going all in, etc.

If you introduced those elements, you'd be playing a different game. According to the min-max theorem, there would still be an optimal strategy, but it would probably be at least a little different.

No comments: