Archive for April 6th, 2010

correlograms are correlicious

Tuesday, April 6th, 2010

In the last year or so, I’ve been experimenting with different ways of displaying correlation matrices, and have gotten very fond of color-coded correlograms. Here’s one from a paper I wrote investigating the relationship between personality and word use among bloggers:

[Figure S2: Extraversion]

The rows reflect language categories from Jamie Pennebaker’s Linguistic Inquiry and Word Count (LIWC) dictionary; the columns reflect Extraversion scores (first column) or scores on the lower-order “facets” of Extraversion (as measured by the IPIP version of the NEO-PI-R). The plot was generated in R using code adapted from the corrgram package (R really does have contributed packages for everything). Positive correlations are in blue, negative ones are in red.

The thing I really like about these figures is that the colors instantly orient you to the most important features of the correlation matrix, so you don’t have to inspect every cell for the all-important ***magical***asterisks***of***statistical***significance***. For instance, a cursory glance tells you that even though Excitement-Seeking and Cheerfulness are both nominally facets of Extraversion, they’re associated with very different patterns of word use. And then a slightly less cursory glance tells you that that’s because people with high Excitement-Seeking scores like to swear a lot and use negative emotion words, while Cheerful people like to talk about friends and music and use positive emotional language. You’d get the same information without the color, of course, but it’d take much longer to extract, and then you’d have to struggle to keep all of the relevant numbers in mind while you mull them over. The colors do a lot to reduce cognitive load, and also have the secondary benefit of looking pretty.

If you’re interested in using correlograms, a good place to start is the Quick-R tutorial on correlograms in R. The documentation for the corrgram package is here, and there’s a nice discussion of the principles behind the visual display of correlation matrices in this article.
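For a rough sense of what the color coding involves, here is a minimal sketch in plain Python rather than R. It is not the corrgram code behind the figure above, just an illustration of the idea: compute a Pearson correlation for each row/column pair, then map it to a color that is white at zero, blue for positive values and red for negative ones. The function names (`pearson`, `corr_color`, `correlogram`) and the toy numbers are made up for this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def corr_color(r):
    """Map r in [-1, 1] to a hex color: white at 0, blue at +1, red at -1."""
    # Stronger correlations fade the other channels toward 0.
    level = round(255 * (1 - min(abs(r), 1.0)))
    if r >= 0:
        return f"#{level:02x}{level:02x}ff"  # shades of blue
    return f"#ff{level:02x}{level:02x}"      # shades of red


def correlogram(rows, cols):
    """Return {(row_name, col_name): (r, color)} for every row/column pair."""
    cells = {}
    for row_name, row_values in rows.items():
        for col_name, col_values in cols.items():
            r = pearson(row_values, col_values)
            cells[(row_name, col_name)] = (r, corr_color(r))
    return cells


# Toy data, invented purely for illustration (not from any real study).
word_categories = {
    "category_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "category_b": [5.0, 3.0, 4.0, 2.0, 1.0],
}
trait_scores = {
    "trait_x": [2.0, 4.0, 6.0, 8.0, 10.0],
    "trait_y": [3.0, 1.0, 4.0, 1.0, 5.0],
}

for (row, col), (r, color) in correlogram(word_categories, trait_scores).items():
    print(f"{row:12s} {col:8s} r = {r:+.2f}  color = {color}")
```

Each printed color could then be used to fill a cell in whatever plotting tool you prefer; in R, the corrgram package handles the shading and layout for you.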

p.s. I’m aware this post has the worst title ever; the sign-up sheet for copy editing duties is in the comment box (hint hint).

my new favorite blog

Tuesday, April 6th, 2010

…teaches you How To Write Badly Well. For instance, if you want to write badly well, you must Refuse to leave the present tense:

I sit at my desk and remember how, years ago, I wonder what my life will be like when I am fifty, which I am now. I’m imagining that I’m living in a big house, I remember as I sit in my one-bedroom apartment. Now I pour myself a drink and cast my mind back to a time when I’m full of hope and passion which is never to be extinguished, as it is now.

‘What am I doing?’ I mutter to myself, taking a sip of my drink. In my memory, I’m seven years old, sitting in the highest branches of a tree which is being planted a hundred years before I am born. Now, though, the tree is long dead. I’m chopping it down at the age of twenty and thinking about when it is supporting my weight at the age of seven. I look at my watch.

‘Late,’ I mutter to myself. It is eight; the retrospective is just starting, half an hour ago.