Thursday, July 1, 2010

Reliability

I'm sold on the argument that Krippendorff's alpha is the way to go for most reliability calculations. Unfortunately, it's been hard to find code to calculate it. Here are some of the best links I've run across.

The NLTK package for Python has code for computing alpha. It looks like this does the basic nominal calculation; I don't know if or how it copes with missing data.

The concord package in R does nominal, ordinal, interval, and ratio versions of alpha. It looks like this might not be maintained anymore, but it works.
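For readers who want to see what these packages compute, here is a minimal Python sketch of alpha built from the coincidence-matrix definition, alpha = 1 - D_o / D_e. It is not the NLTK or concord code; the function names (`krippendorff_alpha`, `nominal`, `interval`) and the toy data are made up for illustration. Only the nominal and interval distances are shown; the ordinal version is left out because its distance depends on the category totals.

```python
from collections import Counter
from itertools import permutations


def nominal(a, b):
    """Nominal distance: two codes either match or they do not."""
    return 0.0 if a == b else 1.0


def interval(a, b):
    """Interval distance: squared difference between numeric codes."""
    return float(a - b) ** 2


def krippendorff_alpha(units, distance=nominal):
    """Return alpha = 1 - D_o / D_e.

    `units` is a list of lists; each inner list holds the codes one unit
    received. A coder who skipped a unit contributes nothing to its list,
    which is how missing data is represented in this sketch.
    """
    # Coincidence matrix: every ordered pair of codes within a unit,
    # weighted by 1 / (number of codes in that unit - 1).
    coincidences = Counter()
    for codes in units:
        m = len(codes)
        if m < 2:
            continue  # a unit coded only once cannot be paired
        for a, b in permutations(codes, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)
    if not coincidences:
        return float("nan")

    # Marginal totals per category and the total number of pairable codes.
    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    observed = sum(w * distance(a, b) for (a, b), w in coincidences.items()) / n
    expected = sum(
        totals[a] * totals[b] * distance(a, b) for a in totals for b in totals
    ) / (n * (n - 1))
    if expected == 0:
        return float("nan")  # only one category was ever used
    return 1.0 - observed / expected


# Toy data (invented): four units, the last one coded by three coders.
toy_units = [[1, 1], [2, 2], [1, 2], [1, 1, 1]]
print(round(krippendorff_alpha(toy_units), 4))  # 0.5556
```

Passing `distance=interval` switches the metric without touching the rest of the calculation, which is the main reason for keeping the distance function as a parameter.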

Here's a nice page of resources by computational linguists Artstein and Poesio. Unfortunately, what they show is mainly that there aren't very good resources out there. Their review article is very good -- the best treatment of reliability I've seen in the NLP community so far.

Deen Freelon has some links to reliability calculators and resources, including two nice online reliability calculators: ReCal OIR and ReCal3.

Krippendorff's oddly formatted, information-sparse web page. He invents the best measure for calculating reliability, then keeps a lid on it. Fewer animated bowtie dogs, and more software, please!

Matthew Lombard has a nice page on reliability statistics and the importance of reliability in content analysis in general.

Beg: Does anybody know how to compute K's alpha for a single coder?

I have data coded by several coders and need to know who's doing a good job, reliability-wise, and who's not. At a pinch, I'd be willing to use a different reliability statistic, or even an IRT model. It just needs to be statistically defensible and reasonably easy to code.

1 comment:

Abe said...

Reliability update: I spent all yesterday writing code. I've replicated the nominal and ordinal versions of R concord's kripp.alpha function.

I also wrote a function to compute an individual coder's alpha -- that is, the pairwise reliability of that coder's codes against those of the other coders. Useful for picking out lazy or fraudulent coders on your team.
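The code itself isn't posted here, so the following Python sketch is only one possible reading of the idea: compute nominal alpha between the target coder and each other coder on the units both of them coded, then average the results. The names `pairwise_alpha` and `coder_alpha`, the coder names, and the data are all hypothetical, and averaging pairwise alphas is an assumption rather than an established statistic.

```python
from collections import Counter
from math import isnan


def pairwise_alpha(pairs):
    """Nominal Krippendorff's alpha for two coders.

    `pairs` is a list of (code_a, code_b) tuples, one per shared unit.
    """
    if not pairs:
        return float("nan")
    # With exactly two codes per unit, each unit adds both ordered pairs
    # to the coincidence matrix with weight 1.
    totals = Counter()
    disagreements = 0
    for a, b in pairs:
        totals[a] += 1
        totals[b] += 1
        if a != b:
            disagreements += 2
    n = sum(totals.values())
    observed = disagreements / n
    expected = (n * n - sum(c * c for c in totals.values())) / (n * (n - 1))
    if expected == 0:
        return float("nan")  # both coders used a single category throughout
    return 1.0 - observed / expected


def coder_alpha(codings, coder):
    """Average pairwise alpha of `coder` against every other coder.

    `codings` maps coder name -> {unit id: code}; units a coder skipped
    are simply absent from that coder's dict.
    """
    mine = codings[coder]
    scores = []
    for other, theirs in codings.items():
        if other == coder:
            continue
        shared = [(mine[u], theirs[u]) for u in mine if u in theirs]
        score = pairwise_alpha(shared)
        if not isnan(score):
            scores.append(score)
    return sum(scores) / len(scores) if scores else float("nan")


# Invented example: "ann" and "bob" agree everywhere, "cat" does not.
codings = {
    "ann": {1: "a", 2: "b", 3: "a", 4: "b"},
    "bob": {1: "a", 2: "b", 3: "a", 4: "b"},
    "cat": {1: "a", 2: "a", 3: "b", 4: "b"},
}
for name in codings:
    print(name, round(coder_alpha(codings, name), 4))
```

In this toy data the low score for "cat" drags down everyone's average, so the scores are more useful for ranking coders than as absolute reliability figures.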

Once I've developed this a little more, I'm planning to wrap these functions into a package and make them available. That may take a few months, though, so let me know if you want to borrow code.