The Lowly Wonk: AI

Showing posts with label AI. Show all posts

Saturday, April 9, 2011

Prezi: AI and games

A few months ago I first tried out prezi. Since then, I've seen it trickling into presentations here and there. The novelty is great, and the usability is a little better than last time I played with it. So I went ahead and

These are slides for my (mock) TED talk from the Hill Street TED activity yesterday.

Tuesday, March 8, 2011

AI for rock, paper, scissors

A really simple and clever demonstration of another place AI is surprisingly good at beating people: a rock-paper-scissors game on NYTimes. Don't laugh. The computer analyzes your past play and looks for weaknesses to exploit. I played it 100 times and lost 23 to 35, with 42 ties. (I said don't laugh!)

Nifty application. Nice design. The "see what the computer is thinking" is a great way of explaining how the AI works. Much better than saying it's a 4th-order Markov process with backing off.

The small print on the side of the screen is revealing (emphasis mine):

Note: A truly random game of rock-paper-scissors would result in a statistical tie with each player winning, tying and losing one-third of the time. However, people are not truly random and thus can be studied and analyzed. While this computer won't win all rounds, over time it can exploit a person's tendencies and patterns to gain an advantage over its opponent.

This is nice evidence that people are surprisingly bad at being unpredictable.

PS. I ran the numbers. The odds of me doing so poorly against the computer in an even game (one third wins, one third losses, one third ties) are about one in 60, a p value of 0.015. Not impossible, but extremely unlikely. So the smart money is on the computer.

Tuesday, February 15, 2011

How to win the final wager in Jeopardy

Here's a fun sideline to the human-vs-AI Jeopardy showdown: optimal wagering strategies for final Jeopardy. The challenge: given what you know about yourself, your opponents, their bankrolls, and the question, choose the betting strategy most likely to secure a win or at least a tie. What do you do?

As it happens, some smart game theorists have already worked on this problem. Work in the 1990's shows that against your average human, it's not too complicated.

If you're winning, assume your opponents will bet everything, and react accordingly by betting enough to double the number two player's score, plus one dollar.
If you're in the middle it can be tricky, but a good bet is to triple your score and subtract double the first player's score.
If you're losing by a lot, bet everything.

Following these rules should increase your chances of winning, regardless of your starting position. (See pg 271 in the paper.)

Against a human (or computer) who knows some game theory, the ideal strategy is complicated, because your strategizing will prompt strategizing by your opponents, and vice versa, ad infinitum. The Gilbert and Hatcher paper gives a solution for the two-player scenario. This situation happens when one contestant has no money to wager in the final round -- it's actually not all that unusual.

The three-player scenario is unsolved, but it's almost certainly a mixed strategy. Sounds like a good problem for a different kind of AI: computational game theory. Given that simple versions of poker have been completely solved, a mixed strategy over wagers should be child's play.

With all that as background, here's IBM's page on how Watson wagers. They cite the ridiculously detailed Jeopardy folklore of betting strategies --- the entries for "two-thirds" and "Shore's conjecture" are particularly on point --- but evidently, they haven't worked through the game theory. Knowing this, Ken and Brad should be able to outflank Watson in the final round, giving them slightly better odds at cinching a win.

Punchline: in the world of AI, Watson is a verbal genius in need of some remedial math.

Saturday, January 1, 2011

Optimal Poker AI -- I told you so!

A couple years ago, I had a long debate about Texas Hold'em with my brother, a long-time aficionado of the game. He's quite good and he's roomed with friends who actually payed the bills playing online poker. So you can see why he'd get defensive when I claimed that there is an optimal strategy for heads-up Texas Hold'em.

I've thought back to that conversation several times, mainly because it was really frustrating for both of us. I had a proof backing me up -- Von Neuman's 80-year-old minimax theorem* -- an absolutely airtight mathematical proof, but I still couldn't convince him. The problem is that it's an existence proof, so it demonstrates that the strategy must exist, but doesn't tell you what the strategy is. And of course, arguing that "my math says there's an optimal strategy" rings pretty hollow when you can't answer the question, "so what's the strategy?"

So I was very happy to run across this paper and demo presenting the optimal strategy for Rhode Island Hold'em. Rhode Island poker is a stripped down version of Texas Hold'em (get it?). You play with a smaller deck, only one hole card, a one-card flop, and a one-card turn, but no river. There are a few limits on betting, but the game is essentially the same. According the abstract, "Some features of the [optimal strategy] include poker techniques such as bluffing, slow-playing, check-raising, and semi-bluffing."

Very nifty proof of concept.

* A little bit more on the math. The minmax theorem states that every one-on-one winner-take-all game has an optimal strategy profile. A quick FAQ on what it means to be an optimal strategy.

So what does it mean that the strategy is optimal?
The technical definition is that it's an equilibrium strategy. If both players are playing the optimal strategy, neither has an incentive to deviate and play anything else. As one of my profs would say, it's prediction proof. Even if we both know each others' strategies, neither of us has an incentive to deviate.

So this strategy always wins?
No. There is some luck in the game. But if you played it repeatedly, there's no way you would win more than half the time. Almost any non-optimal strategy you'd play, the computer would beat you more than half the time.

Wouldn't an "optimal strategy" be really predictable?
No. In poker, it's often useful to bluff and keep your opponent guessing, so the optimal strategy is partly random. That means that when facing a given situation (e.g. After the flop, I have X cards, I did Y pre-flop, and my opponent did Z) the computer will flip a weighted coin to make its decision (e.g. 3/4 of the time I fold, 1/8 of the time I raise, and 1/8 of the time I check.) So the strategy is perfectly well-defined, but still unpredictable at times.

Does it know how to play short-stack and long-stack poker?
I don't think so. As far as I can tell, Rhode Island poker is played one hand at a time, with limits on betting. So the optimal strategy doesn't need to know anything about stack length, escalating blinds, going all in, etc.

If you introduced those elements, you'd be playing a different game. According to the min-max theorem, there would still be an optimal strategy, but it would probably be at least a little different.