Thursday, August 13, 2009

On putting faith in models

Here's a conversation I had with my brother via g-chat the other day. It's a great prototype of a conversation I've had many times in the last few months. The basic question is "how much faith can we place in mathematical models?" Most people seem skeptical; I'm more of a believer.

This particular exchange was unusual because 1) it was conveniently recorded, and 2) it went in some interesting and productive directions at the end. I'm posting it unedited, except for some spelling fixes and external links. Comments welcome.


10:44 AM Sam: this is probably trivial compared to the analysis you usually look at, but i thought you might like it nonetheless
me: I'll check it out


6 minutes
10:51 AM me: Interesting stuff
I hadn't seen the wine article before
10:52 AM I'd read the paper on war, but I hadn't seen the TED talk
10:54 AM This is exactly the kind of stuff I'm interested in doing
Sam: have you read when genius failed?
me: no
Sam: i can't remember if we talked about it already
it's about a hedge fund
10:55 AM and their story
it's the classic cautionary tale of putting too much faith in models
me: oh, yes
I haven't read it, but I know the gist
10:56 AM I would change the interpretation a little bit and say putting too much faith in a theory.
A model is a theory that happens to be expressed mathematically
Like any theory, if your assumptions are bad your conclusions will be bad
10:57 AM Sam: it was all mathematical
they determined 'actual' risk spreads based on reams of historical data and market conditions
10:58 AM and then tried to beat the market by playing the spreads dictated by their systems
me: right
as I understand it, the mistake they made was in the way they calculated risk
10:59 AM Sam: this is an interesting meta-argument
because you're pointing to the specific problem of their models
where i'm saying their mistake was over reliance on models
me: yes
11:00 AM this is a live debate in social science
I'm a pretty strong advocate for the quant side
11:01 AM Sam: hm
me: I'd say the key question is "is there anything substantively different about theories expressed in math versus theories expressed in English?"
Sam: this is probably a classic
me: classic?
11:02 AM Sam: academics vs business
11:03 AM me: hm
maybe
it's not so clean cut, though, because there are academics who reject the quant stuff and businessmen who embrace it
11:04 AM I think it has to do with the way people think about math
Is it a set of fixed processes for getting answers, or is it a language for expressing ideas?
11:05 AM If it's a language, then the fault for bad models lies in the assumptions expressed, not the language for expressing them.

5 minutes
11:10 AM Sam: accepting that an omniscient agent could express all ideas as formulas and believing that you can are different, right?
11:11 AM it's that leap where you decide to stake a business and millions of dollars on the formula that puts you on one side of the line or the other, in my opinion
me: yes, fair points
11:13 AM It seems to me that you're introducing another aspect of theories, which is that they don't just have to be expressed, they also have to be acted upon
and strange things can happen when you act on a theory without fully understanding it
It's kind of a Jurassic Park idea
11:14 AM Sam: ha
i guess ultimately the moral to when genius is the same as jurassic park
11:15 AM but trust funds drying up is less dramatic than dino-carnage
me: so maybe the problem with models isn't that they are more likely to be wrong, but that they invite careless extrapolation
Sam: i don't think it's careless
it's hubris
me: "I'm going to leverage a billion dollars 30 times"
11:16 AM "I'm going to bring back dinosaurs from the dead, focusing mainly on intelligent top predators"
Sam: it's the opposite of being careless- it's spending so much time working on a model that you believe that you can and have thought of everything
11:17 AM it's validating that model again and again against the datasets you have without allowing for future events to be unknowable
11:18 AM me: I agree -- my only reservation would be that hubris is not specific to people who frame their models using math
Sam: of course
the opposite story is much more common
which is why when genius failed is a story worth telling
11:19 AM when hubris failed would be too common
11:21 AM me: so instead of "people fail because of math," it's "even people using math can fail because of hubris"
11:22 AM Sam: yeah, that's the gist
me: okay, I don't have to feel so defensive now
11:23 AM a lot of the people around me here are model-builders
some of the faculty are among the best in the world
I see both types
11:24 AM Some use models for the sake of transparency -- all the assumptions are laid out for criticism and improvement
Others use models for the sake of beating up on people who don't speak game theory
11:25 AM I'm pretty invested in the notion that models can help the process of collective learning, and it frustrates me when the arrogant ones give modeling a bad name

Thursday, August 6, 2009

Big news stories in the 2008 elections -- Looking for your input

I have a ~10 minute favor to ask. It has to do with timelines again*.

I've pulled a list of the top ~200 events from the run-up to last year's presidential elections. (Download it here as an .xlsx file, here as an .xls file, and here as a .txt file) Sometime when you aren't doing anything important anyway (e.g. facebook), skim through the list and pick the top 20 or so events that you think were the biggest stories* in the campaign. If there are important stories that you think are missing, you can add them.

When you're done, post your results in the comments section. Here are the rules:
  • By "big news stories," I mean events that did at least one of three things: 1) generated a lot of media buzz, 2) affected public opinion, or 3) evoked a strong response from the campaigns themselves. Any combination of these things qualifies as a news story.
  • Don't do any background research and don't ask anybody else for their opinion. I'm just looking for a gut check on which events were the most important.
  • Don't worry if you aren't a big-time pundit. I'm not either, and they're mostly bluffing anyway.
  • And don't don't don't read other people's responses before you post your own -- this will be much more useful and interesting if everyone's ideas are independent.
I'm going to use these responses (anonymously) at a conference coming up in a few weeks. Once the conference is over I'll post some diagnostics and results here so you can see how your intuition stacks up against the wisdom of the crowd.

Like I said, this should only take about 10 minutes or so. Thanks for being part of a convenience sample of the willing!

Some background:
I'm working on a research project using automated content analysis to identify big news stories in archives of media content. The 2008 presidential election is my test case. Basically, I'm throwing a lot of text at the computer and using tricks from computational linguistics to tease out the news stories. It would be neat to be able to do this because 1) news stories are a big part of the way we think about politics, 2) this would make it possible to identify news stories in a replicable way on a grand scale, and 3) complicated algorithms are cool.

Your answers will help me construct a baseline to check how well the algorithm is doing. If the computer finds events that are broadly consistent with human intuition, that's a good sign that it's working. Thanks again for your help.
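
For the curious, here is one way a baseline check like this could work. It is a minimal sketch in Python, not the actual analysis code: the function names, the made-up event labels, and the choice of precision and recall as the agreement measure are all assumptions for illustration. The idea is to count how many respondents picked each event, treat the most-picked events as the crowd baseline, and then see how much the computer's list overlaps with it.

```python
from collections import Counter


def crowd_baseline(responses, top_k=20):
    """Rank events by how many respondents picked them; return the top_k."""
    votes = Counter()
    for picks in responses:
        # Count each event at most once per respondent.
        votes.update(set(picks))
    # Sort by vote count (descending); break ties alphabetically so the
    # result is deterministic.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [event for event, _ in ranked[:top_k]]


def overlap(algorithm_events, baseline_events):
    """Precision and recall of the algorithm's events against the baseline."""
    algo, base = set(algorithm_events), set(baseline_events)
    hits = len(algo & base)
    precision = hits / len(algo) if algo else 0.0
    recall = hits / len(base) if base else 0.0
    return precision, recall


# Hypothetical data: three respondents and one algorithm run.
responses = [
    ["event_a", "event_b", "event_c"],
    ["event_a", "event_c", "event_d"],
    ["event_a", "event_b", "event_e"],
]
baseline = crowd_baseline(responses, top_k=3)
precision, recall = overlap(["event_a", "event_c", "event_f"], baseline)
print(baseline, round(precision, 2), round(recall, 2))
```

High precision would mean most of the computer's events are ones people also picked; high recall would mean it found most of the events people picked. Other agreement measures (rank correlation, for instance) would be just as reasonable here.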


*PS on my previous post: Chronologic turned out to be harder than I initially thought! I'm working on ways to weed out some of the ridiculously obscure cards.