There is grandeur in this view of life

martins bioblogg

Using R: Correlation heatmap with ggplot2

with 12 comments

Just a short post to celebrate that I learned today how incredibly easy it is to make a heatmap of correlations with ggplot2 (and reshape2, of course).

data(attitude)
library(ggplot2)
library(reshape2)
qplot(x=Var1, y=Var2, data=melt(cor(attitude)), fill=value, geom="tile")

[Figure: attitude_heatmap]

So, what is going on in that short passage? cor makes a correlation matrix with all the pairwise correlations between variables (each pair appears twice, because the matrix is symmetric, plus a diagonal of ones). melt takes the matrix and creates a data frame in long form, each row consisting of the id variables Var1 and Var2 and a single value. We then plot with the tile geometry, mapping the id variables to the x and y axes, and value (i.e. the correlations) to the fill colour.
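
To make the three steps described above easier to inspect, here is a minimal sketch that runs them one at a time. The names cor_matrix, cor_long and heatmap_plot are only illustrative, and the plot is written with ggplot() instead of qplot(), which should give an equivalent tile plot.

```r
library(ggplot2)
library(reshape2)

data(attitude)

# Step 1: symmetric correlation matrix (7 x 7 for attitude), ones on the diagonal
cor_matrix <- cor(attitude)

# Step 2: long format, one row per (Var1, Var2) pair with its correlation in value
cor_long <- melt(cor_matrix)
head(cor_long, 3)

# Step 3: tile plot with Var1 on x, Var2 on y and the correlation as fill colour
heatmap_plot <- ggplot(cor_long, aes(x = Var1, y = Var2, fill = value)) +
  geom_tile()
print(heatmap_plot)
```

Looking at cor_long before plotting is a quick way to check that melt produced the Var1, Var2 and value columns that the aesthetics refer to.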


Written by mrtnj

21 March, 2013 at 23:52

Posted in data analysis, english


12 replies


  1. Very nice. I knew there would be a quick way to get a correlation table out of ggplot2, but I hadn’t pursued it. Adding the value of each correlation is pretty simple, starting from the base you’ve provided:

    cor_melt = melt(cor(attitude))

    ggplot(cor_melt, aes(Var1, Var2, fill=value, label=round(value, 2))) +
    scale_fill_gradient(low="#FEE0D2", high="#FB6A4A") +
    geom_tile() +
    geom_text()

    Marius

    22 March, 2013 at 06:13

  2. I just started to think about how to plot correlations with ggplot, too. :-) An alternative approach might be points that indicate the correlation strength:

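    # add point size to the data frame, by multiplying the correlation value
    # (corr is assumed to be the melted correlation matrix, e.g. melt(cor(attitude)))
    corr <- cbind(corr, psize=c(exp(abs(corr$value))*20))
    # hide the diagonal 1-correlations
    corr$psize[which(corr$value>=0.999)] <- 0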

    ggplot(data=corr, aes(x=Var1, y=Var2, fill=value)) +
    geom_point(aes(fill=value), shape=21, size=corr$psize) +
    geom_text(aes(x=Var1, y=Var2), label=c(round(corr$value,2)), colour="white")

    A question still remains: how to deal with negative correlations? Would be nice to have, e.g., red to black for correlations from -1 to 0 and black to blue for positive correlations from 0 to 1. So, the darker the color, the weaker the correlation, and red/blue indicating negative or positive correlations.

    Daniel

    22 March, 2013 at 11:10

    • Seems like some lines of code were not accepted:

      # add point size to data frame, by multiplying the correlation value
      corr = cbind(corr, psize=c(exp(abs(corr$value))*20))
      # use this if you want to hide the diagonal 1-correlations
      corr$psize[which(corr$value>=0.999)] = 0

      Daniel

      22 March, 2013 at 11:12

    • Ok, got a solution for the negative value thing:

      ggplot(data=corr, aes(x=Var1, y=Var2, fill=value)) +
      geom_point(shape=21, size=corr$psize) +
      scale_fill_gradientn(colours=c("#ff9999", "#ff6666", "#cc4444", "black", "#3355cc", "#4488ff", "#6699ff"), limits=c(-1,1)) +
      geom_text(label=c(round(corr$value,2)), colour="white")

      The colour gradient is not optimal and could be better. The "limits" argument makes sure that the colour range always runs from -1 to +1, independent of the lowest and highest correlation coefficients.

      Daniel

      22 March, 2013 at 11:26

  3. Hi!

    Thank you for your contributions! In the above I didn’t think a lot about the presentation, so I haven’t changed any of the default theme settings. Adding the correlation in text is very useful though, even for the first exploratory graphs you make for yourself.

    In my opinion, mapping numbers to the area of something is often a bit iffy, so I think I prefer the heatmap style. But I’m no graphics whiz, and opinions differ :)

    Cheers,

    m.

    mrtnj

    22 March, 2013 at 17:18

  4. Hi, I really appreciate your code; it will be very helpful to my research. Quick question: do you know how to remove 'Var1' and 'Var2' from the plot, please?

    Dhaval A.

    22 March, 2013 at 18:59

  5. This is how you remove the Var1 and Var2 axis titles from the plot. Add this to your plot code:
    + theme(axis.title=element_blank())

    thank you again!

    Dhaval A.

    22 March, 2013 at 19:37

  6. [...] (via Using R: Correlation heatmap with ggplot2 | There is grandeur in this view of life) [...]

    Schaver.com

    6 April, 2013 at 20:03

  7. [...] correlations: sjPlotCorr.R A very quick way of plotting a correlation heat map can be found in this blog. I had a similar idea in mind for some time and decided to write a small function that allows some [...]

  8. [...] Here is how I could print a heatmap made of p-values. First of all, let’s construct some data (post inspired in this one): [...]

  9. […] this turned out to be my most popular post ever.  Of course there are lots of things to say about the […]
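
Several suggestions from the comments above (text labels, a colour scale fixed to the range -1 to 1, and no axis titles) can be combined in one plot. The following is only a sketch of one way to do that, not any commenter's exact code. It assumes scale_fill_gradient2 with a white midpoint, which differs from the black midpoint Daniel proposed, and the name labelled_heatmap is illustrative.

```r
library(ggplot2)
library(reshape2)

data(attitude)
cor_long <- melt(cor(attitude))

labelled_heatmap <- ggplot(cor_long, aes(x = Var1, y = Var2, fill = value)) +
  geom_tile() +
  # print the rounded correlation on each tile
  geom_text(aes(label = round(value, 2))) +
  # diverging scale centred on zero; limits fix the range to -1..1
  scale_fill_gradient2(low = "#cc4444", mid = "white", high = "#3355cc",
                       midpoint = 0, limits = c(-1, 1)) +
  # drop the Var1 / Var2 axis titles
  theme(axis.title = element_blank())
print(labelled_heatmap)
```

With a light midpoint, weak correlations fade towards white and the black text labels stay readable; swapping mid to "black" and the text colour to white would be closer to the scheme described in comment 2.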

