Using R: Correlation heatmap with ggplot2

Just a short post to celebrate that I learned today how incredibly easy it is to make a heatmap of correlations with ggplot2 (and reshape2, of course).

data(attitude)
library(ggplot2)
library(reshape2)
# melt the correlation matrix to long form and draw one tile per pair of variables
ggplot(melt(cor(attitude)), aes(x=Var1, y=Var2, fill=value)) + geom_tile()

[Figure: attitude_heatmap]

So, what is going on in that short passage? cor makes a correlation matrix with all the pairwise correlations between variables (each pair appears twice, because the matrix is symmetric, plus a diagonal of ones). melt takes the matrix and creates a data frame in long form, each row consisting of the id variables Var1 and Var2 and a single value. We then plot with the tile geometry, mapping the id variables Var1 and Var2 to the x and y axes, and value (i.e. the correlation) to the fill colour.
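To see what melt hands to the plot, it can help to look at the long-form data frame itself. This is only a minimal sketch, assuming reshape2 is installed; the names cor_matrix and cor_long are made up for illustration.

```r
library(reshape2)

data(attitude)

# 7 x 7 correlation matrix: symmetric, with ones on the diagonal
cor_matrix <- cor(attitude)

# Long form: one row per cell of the matrix
cor_long <- melt(cor_matrix)

dim(cor_matrix)   # 7 rows and 7 columns
names(cor_long)   # "Var1" "Var2" "value"
nrow(cor_long)    # 49, one row per cell of the matrix
head(cor_long)
```

Each pair of variables shows up in two rows (once in each order), which is why the heatmap is mirrored around its diagonal.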

About mrtnj

poet, student, coffee drinker

13 comments on Using R: Correlation heatmap with ggplot2

  1. Marius writes:

    Very nice. I knew there would be a quick way to get a correlation table out of ggplot2, but I hadn’t pursued it. Adding the value of each correlation is pretty simple, starting from the base you’ve provided:

    cor_melt = melt(cor(attitude))

    ggplot(cor_melt, aes(Var1, Var2, fill=value, label=round(value, 2))) +
    scale_fill_gradient(low="#FEE0D2", high="#FB6A4A") +
    geom_tile() +
    geom_text()

  2. Daniel writes:

    I just started to think about how to plot correlations with ggplot, too. :-) An alternative approach might be points that indicate the correlation strength:

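    # corr is assumed to be the melted correlation matrix from the post
    corr <- melt(cor(attitude))
    # add point size, by multiplying the correlation value
    corr <- cbind(corr, psize=c(exp(abs(corr$value))*20))
    # use this if you want to hide the diagonal 1-correlations
    corr$psize[which(corr$value>=0.999)] <- 0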

    ggplot(data=corr, aes(x=Var1, y=Var2, fill=value)) +
    geom_point(aes(fill=value), shape=21, size=corr$psize) +
    geom_text(aes(x=Var1, y=Var2), label=c(round(corr$value,2)), colour="white")

    A question still remains: how to deal with negative correlations? Would be nice to have, e.g., red to black for correlations from -1 to 0 and black to blue for positive correlations from 0 to 1. So, the darker the color, the weaker the correlation, and red/blue indicating negative or positive correlations.

    • Daniel writes:

      Seems like some lines of code were not accepted:

      # add point size to data frame, by multiplying the correlation value
      corr = cbind(corr, psize=c(exp(abs(corr$value))*20))
      # use this if you want to hide the diagonal 1-correlations
      corr$psize[which(corr$value>=0.999)] = 0

    • Daniel writes:

      Ok, got a solution for the negative value thing:

      ggplot(data=corr, aes(x=Var1, y=Var2, fill=value)) +
      geom_point(shape=21, size=corr$psize) +
      scale_fill_gradientn(colours=c("#ff9999", "#ff6666", "#cc4444", "black", "#3355cc", "#4488ff", "#6699ff"), limits=c(-1,1)) +
      geom_text(label=c(round(corr$value,2)), colour="white")

      The color gradient is not ideal and could be improved. The limits argument makes sure that the colour range always runs from -1 to +1, independent of the lowest and highest correlation coefficients.
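Another way to get one colour for negative and another for positive correlations might be ggplot2's diverging scale, scale_fill_gradient2, which takes a low, mid and high colour plus a midpoint. The sketch below applies it to the heatmap from the post rather than to the point plot, and it assumes corr is the melted correlation matrix of attitude; the red, black and blue colours simply follow the wish stated in the comment above.

```r
library(ggplot2)
library(reshape2)

data(attitude)

# Assumption: corr is the long-form correlation matrix used in the thread
corr <- melt(cor(attitude))

p <- ggplot(corr, aes(x = Var1, y = Var2, fill = value)) +
  geom_tile() +
  # Red for -1, black for 0, blue for +1; limits pin the range to [-1, 1]
  scale_fill_gradient2(low = "red", mid = "black", high = "blue",
                       midpoint = 0, limits = c(-1, 1)) +
  geom_text(aes(label = round(value, 2)), colour = "white")

print(p)
```

With the midpoint at 0, the darkest tiles are the weakest correlations, and the hue shows the sign.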

  3. mrtnj writes:

    Hi!

    Thank you for your contributions! In the above I didn’t think a lot about the presentation, so I haven’t changed any of the default theme settings. Adding the correlation in text is very useful though, even for the first exploratory graphs you make for yourself.

    In my opinion, mapping numbers to the area of something is often a bit iffy, so I think I prefer the heatmap style. But I’m no graphics whiz, and opinions differ :)

    Cheers,

    m.

  4. Dhaval A. writes:

    Hi, I really appreciate your code, it will be very helpful to my research. Quick question: do you know how to remove ‘var1’ and ‘var2’ from the plot please?

  5. Dhaval A. writes:

    This is how you remove Var1 and Var2 from the plot. Add this to your plot code:
    + theme(axis.title=element_blank())

    thank you again!

  6. Ping: Schaver.com

  7. Ping: Examples for sjPlotting functions, including correlations and proportional tables with ggplot #rstats | Strenge Jacke!

  8. Ping: … ridiculously photogenic factors (heatmap with p-values) | Tales of R

  9. Ping: Using R: correlation heatmap, take 2 | There is grandeur in this view of life

  10. Ping: How to create a simple heatmap in R | FYTRO SPORTS
