correlograms are correlicious

In the last year or so, I’ve been experimenting with different ways of displaying correlation matrices, and have grown very fond of color-coded correlograms. Here’s one from a paper I wrote investigating the relationship between personality and word use among bloggers:

Figure S2 Extraversion

The rows reflect language categories from Jamie Pennebaker’s Linguistic Inquiry and Word Count (LIWC) dictionary; the columns reflect Extraversion scores (first column) or scores on the lower-order “facets” of Extraversion (as measured by the IPIP version of the NEO-PI-R). The plot was generated in R using code adapted from the corrgram package (R really does have contributed packages for everything). Positive correlations are in blue, negative ones are in red.

The thing I really like about these figures is that the colors instantly orient you to the most important features of the correlation matrix, so you don’t have to inspect every cell for the all-important ***magical***asterisks***of***statistical***significance***. For instance, a cursory glance tells you that even though Excitement-Seeking and Cheerfulness are both nominally facets of Extraversion, they’re associated with very different patterns of word use. And then a slightly less cursory glance tells you that that’s because people with high Excitement-Seeking scores like to swear a lot and use negative emotion words, while Cheerful people like to talk about friends and music and to use positive emotional language. You’d get the same information without the color, of course, but it’d take much longer to extract, and then you’d have to struggle to keep all of the relevant numbers in mind while you mull them over. The colors do a lot to reduce cognitive load, and also have the secondary benefit of looking pretty.

If you’re interested in using correlograms, a good place to start is the Quick-R tutorial on correlograms in R. The documentation for the corrgram package is here, and there’s a nice discussion of the principles behind the visual display of correlation matrices in this article.
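
For anyone who wants to see the moving parts before diving into those resources, here is a minimal sketch of the general idea. It is written in Python with only the standard library, and it is not the R/corrgram code used for the figure above. It correlates a set of row variables with a set of column variables, maps each correlation to a color (blue for positive, red for negative, paler for weaker), and writes the grid out as SVG with the row labels in the left margin. The function names (`pearson`, `cell_color`, `correlogram_svg`), the layout parameters, and the tiny data set are all made up for illustration.

```python
import math
from xml.sax.saxutils import escape


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences.

    Assumes neither sequence is constant (otherwise this divides by zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cell_color(r):
    """Map r in [-1, 1] to a hex color: white at 0, blue if positive, red if negative."""
    r = max(-1.0, min(1.0, r))
    fade = round(255 * (1 - abs(r)))  # weaker correlations come out paler
    if r >= 0:
        return "#{0:02x}{0:02x}ff".format(fade)
    return "#ff{0:02x}{0:02x}".format(fade)


def correlogram_svg(row_vars, col_vars, cell=40, label_width=140):
    """Render a rows-by-columns correlogram as an SVG string.

    row_vars and col_vars are dicts mapping a label to a list of scores,
    one score per person, in the same order in every list.
    """
    header = 20  # vertical space reserved for the column labels
    width = label_width + cell * len(col_vars)
    height = header + cell * len(row_vars)
    parts = ['<svg xmlns="http://www.w3.org/2000/svg" width="%d" height="%d">'
             % (width, height)]
    # Column labels across the top.
    for j, name in enumerate(col_vars):
        x = label_width + j * cell + cell / 2
        parts.append('<text x="%g" y="14" font-size="8" text-anchor="middle">%s</text>'
                     % (x, escape(name)))
    for i, (row_name, row_vals) in enumerate(row_vars.items()):
        y = header + i * cell
        # Row label, right-aligned in the left margin.
        parts.append('<text x="%d" y="%g" font-size="12" text-anchor="end">%s</text>'
                     % (label_width - 6, y + cell / 2 + 4, escape(row_name)))
        # One colored cell per column variable; the tooltip shows the value of r.
        for j, col_vals in enumerate(col_vars.values()):
            r = pearson(row_vals, col_vals)
            parts.append('<rect x="%d" y="%d" width="%d" height="%d" fill="%s" '
                         'stroke="#cccccc"><title>r = %.2f</title></rect>'
                         % (label_width + j * cell, y, cell, cell, cell_color(r), r))
    parts.append("</svg>")
    return "\n".join(parts)


# Made-up scores for six hypothetical bloggers, purely for illustration.
demo_rows = {
    "Swear words": [5, 1, 4, 0, 3, 1],
    "Positive emotion": [1, 4, 2, 5, 2, 4],
}
demo_cols = {
    "Excitement-Seeking": [4.5, 2.0, 4.0, 1.5, 3.5, 2.5],
    "Cheerfulness": [2.0, 4.5, 2.5, 4.0, 3.0, 4.5],
}
svg = correlogram_svg(demo_rows, demo_cols)
```

Saving `svg` to a file with an `.svg` extension and opening it in a browser should show a small two-by-two grid. Widening `label_width` is the simplest way to make room for long category names in the margin.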

p.s. I’m aware this post has the worst title ever; the sign-up sheet for copy editing duties is in the comment box (hint hint).

9 thoughts on “correlograms are correlicious”

  1. Is there a link available to the R code for this? I’m interested in stealing the pattern to put labels as these graphs do on the margins, as I have long names for categories.

    Thanks!

  2. Hi,
    Any chance of sharing the code you used to generate the correlograms you used in this article?
    many thanks!
    Dee

  3. Hi!
    I am a student in China. I read your 2010 study, “Personality in 100,000 words: A large-scale analysis of personality and word use among bloggers”. According to the study, the full trait*word matrix is available on your personal website, but I couldn’t find it here. Could you send me the link? Thank you very much!
    Joy

  4. I wanted to thank you for this fantastic read!! I certainly loved every bit of
    it. I have you saved as a favorite to look at new things you post…

    Here is my webpage Bing

  5. As many other comments have already asked: any chance you can share the R code you used in this example?

    Or, at least tell us why you don’t answer any of these comments?

    Thanks!
    Bob
    SF
