Using R: correlation heatmap, take 2

Apparently, my earlier correlation heatmap post turned out to be my most popular post ever. Of course there are lots of things to say about the heatmap (or quilt, tile, guilt plot etc), but what I wrote was literally just a quick celebratory post to commemorate that I’d finally grasped how to combine reshape2 and ggplot2 to quickly make a colourful picture of a correlation matrix.

However, I realised there is one more thing that is really needed, even if just for the first quick plot one makes for oneself: a better scale. The default scale is not the best for correlations, which range from -1 to 1, because it’s hard to tell where zero is. We use the airquality dataset for illustration, as it actually has some negative correlations. In ggplot2, it’s very easy to get a scale that has a midpoint and a different colour in each direction. Since the tiles are coloured through the fill aesthetic, the scale we need is scale_fill_gradient2 (scale_colour_gradient2 is its counterpart for the colour aesthetic), and we just need to add it. I also set the limits to -1 and 1, which doesn’t change the colours here but extends the legend to the full range for completeness. Done!

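library(ggplot2)
library(reshape2)
# First four airquality columns: Ozone, Solar.R, Wind, Temp
data <- airquality[,1:4]
# melt() turns the correlation matrix into long format (Var1, Var2, value)
qplot(x=Var1, y=Var2, data=melt(cor(data, use="pairwise.complete.obs")), fill=value, geom="tile") +
   scale_fill_gradient2(limits=c(-1, 1))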

[Figure: correlation_heatmap2, the heatmap produced by the code above]
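
For anyone who prefers to spell things out, here is a rough sketch of the same plot written with ggplot() and geom_tile() instead of qplot(), which is deprecated in recent versions of ggplot2. The colours and the names cor_matrix, cor_long and p are illustrative assumptions, not anything the scale requires, and midpoint = 0 is written out only for clarity, since it is already the default of scale_fill_gradient2.

```r
library(ggplot2)
library(reshape2)

# Pairwise-complete correlations of the first four airquality columns
cor_matrix <- cor(airquality[, 1:4], use = "pairwise.complete.obs")

# Long format: one row per pair of variables (Var1, Var2, value)
cor_long <- melt(cor_matrix)

# Diverging fill scale centred on zero; the colours are arbitrary choices
p <- ggplot(cor_long, aes(x = Var1, y = Var2, fill = value)) +
  geom_tile() +
  scale_fill_gradient2(low = "blue", mid = "white", high = "red",
                       midpoint = 0, limits = c(-1, 1))
print(p)
```

With the limits fixed at -1 and 1 and the midpoint at zero, white always means no correlation, so heatmaps of different datasets drawn this way share the same colour key.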

3 responses to “Using R: correlation heatmap, take 2”

  1. Pingback: Momento R do Dia – Motéis, Cinemas, Jogos e Capitanias Hereditárias | De Gustibus Non Est Disputandum

  2. Fantastic. I love the conciseness of using melt and cor to calculate the correlation matrix. Here’s a version using geom_text to add labels:


    # Adaptation of https://martinsbioblogg.wordpress.com/2014/03/03/using-r-correlation-heatmap-take-2/ with labels
    data <- airquality[,1:4]
    library(ggplot2)
    library(reshape2)
    qplot(x=Var1, y=Var2, data=melt(cor(data, use="p")), fill=value, geom="tile") +
    scale_fill_gradient2(limits=c(-1, 1)) +
    geom_text(aes(label=round(value, 2)))  # reuse the melted value column for the labels

    • Thank you! I also really like the way plyr/reshape2/ggplot2 work together. (And I’m looking forward to playing with dplyr and ggvis.)
