The Lowly Wonk: NLP


Monday, February 21, 2011

Glenn Beck conspiracy generator

A Glenn Beck conspiracy generator.

How does this thing work? I'm guessing Mechanical Turk or some mailing list. The phrases don't seem formulaic enough to come from a Markov chain generator or automated Mad Libs.

Thursday, November 4, 2010

Starting to get some dissertation results...

Apologies for the long delay between posts. Stock excuse: "Dissertation... blah blah blah..."

Actually, I'm starting to get some nifty results from my dissertation. I've spent a long summer writing surveys and software, and in the next few weeks I hope to have something to show for it. Exhibit A: a word cloud for an automated classifier of political content.

Orange words are positively associated with political content, and blue words are negatively associated. The size of a word shows the strength of the association: it corresponds to the absolute value of that word's coefficient (beta) in a logistic regression with "political-ness" as the dependent variable. The layout of the words is generated algorithmically to conserve space; it doesn't carry any information.
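As a rough illustration of that mapping, here is a minimal Python sketch. The coefficient values, the function name `style_words`, and the font-size range are made up for the example; they are not the numbers or the code behind the actual cloud.

```python
def style_words(betas, min_size=10, max_size=60):
    """Map each word's logistic-regression coefficient to a (color, size) pair.

    The sign of the coefficient picks the color (positive = orange,
    otherwise blue); its absolute value, scaled linearly against the
    largest absolute coefficient, picks the font size.
    """
    largest = max(abs(b) for b in betas.values())
    if largest == 0:
        largest = 1.0  # avoid dividing by zero when every coefficient is 0

    styled = {}
    for word, beta in betas.items():
        color = "orange" if beta > 0 else "blue"
        size = min_size + (max_size - min_size) * abs(beta) / largest
        styled[word] = (color, round(size))
    return styled


# Hypothetical coefficients, for illustration only.
example_betas = {"senate": 2.0, "election": 1.0, "recipe": -1.6, "movie": -0.4}
cloud = style_words(example_betas)
```

The linear scaling is just one choice; a square-root or log scale would keep a few very large coefficients from dwarfing everything else.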

I used Wordle for the layout. The classifier runs regularized logistic regression using the scikits.learn package for Python. The training data comes from a team of undergraduate research assistants.
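For anyone curious what such a classifier boils down to, below is a dependency-free sketch of L2-regularized logistic regression on bag-of-words counts, trained by plain batch gradient descent. It is a stand-in written for illustration, not the scikits.learn code used for the dissertation; the toy documents, labels, function names, and hyperparameters (`lam`, `lr`, `epochs`) are all assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real pipeline would do more cleanup.
    return text.lower().split()


def sigmoid(z):
    # Written in two branches so math.exp never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(docs, labels, lam=0.1, lr=0.5, epochs=500):
    """Fit L2-regularized logistic regression by batch gradient descent.

    Returns (betas, intercept), where betas maps each word to its
    coefficient. The intercept is not penalized.
    """
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted({w for c in counts for w in c})
    betas = {w: 0.0 for w in vocab}
    intercept = 0.0
    n = len(docs)

    for _ in range(epochs):
        grad = {w: 0.0 for w in vocab}
        grad_intercept = 0.0
        for c, y in zip(counts, labels):
            z = intercept + sum(betas[w] * k for w, k in c.items())
            err = sigmoid(z) - y  # gradient of the log loss w.r.t. z
            grad_intercept += err
            for w, k in c.items():
                grad[w] += err * k
        for w in vocab:
            # Average log-loss gradient plus the L2 penalty term.
            betas[w] -= lr * (grad[w] / n + lam * betas[w])
        intercept -= lr * grad_intercept / n
    return betas, intercept


def predict_proba(text, betas, intercept):
    # Words never seen in training contribute nothing.
    c = Counter(tokenize(text))
    z = intercept + sum(betas.get(w, 0.0) * k for w, k in c.items())
    return sigmoid(z)


# Toy training set, invented for this example (1 = political, 0 = not).
docs = [
    "senate votes on the budget bill",
    "governor signs election law",
    "best chocolate cake recipe",
    "new movie opens this weekend",
]
labels = [1, 1, 0, 0]
betas, intercept = train(docs, labels)
```

The `betas` dictionary is the kind of word-to-coefficient table a cloud like the one above is drawn from: words with positive coefficients push a document toward "political", and words with negative coefficients push it away.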