Using R: a function that adds multiple ggplot2 layers

Another interesting thing that an R course participant identified: Sometimes one wants to make a function that returns multiple layers to be added to a ggplot2 plot. One might think that adding the layers together inside the function and returning the result would work, but it doesn’t. I think it has to do with how + is evaluated. There are a few workarounds that achieve similar results and may save typing.

First, some data to play with, a built-in dataset of growing chickens:

library(ggplot2)

data(ChickWeight)
diet1 <- subset(ChickWeight, Diet == 1)
diet2 <- subset(ChickWeight, Diet == 2)

This is just an example that shows the phenomenon. The first two functions will work, but combining them won’t.

add_line <- function(df) {
  geom_line(aes(x = Time, y = weight, group = Chick), data = df)
}

add_points <- function(df) {
  geom_point(aes(x = Time, y = weight), data = df)
}

add_line_points <- function(df) {
  add_line(df) + add_points(df)
}

## works
(plot1 <- ggplot() + add_line(diet1) + add_points(diet1))

## won't work: non-numeric argument to binary operator
try((plot2 <- ggplot() + add_line_points(diet1)))
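To see why the order of evaluation matters, here is a small sketch in base R that only looks at how a chain of + is parsed. The names p, a and b are placeholders I made up (think plot, line layer, point layer); nothing is evaluated and ggplot2 is not needed. The exact wording of the error above may also differ between ggplot2 versions.

```r
## How R parses a chain of additions; p, a and b are placeholder names
chain <- quote(p + a + b)

chain[[2]]  # left operand of the outer `+`:  p + a
chain[[3]]  # right operand of the outer `+`: b

## Wrapping the layers in a function is like adding parentheses:
## a + b now has to be computed without any plot on the left
nested <- quote(p + (a + b))
nested[[3]]  # (a + b)
```

In the working version the plot is always on the left-hand side of each +, while in add_line_points() the two layers meet each other first.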

Update: In the comments, Eric Pedersen gave a neat solution: stick the layers in a list and add the list. Like so:

(plot2.5 <- ggplot() + list(add_line(diet1), add_points(diet1)))

Nice! I did not know that one.
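As far as I know, the list trick also copes with NULL elements, which are simply skipped when the list is added. That makes optional layers easy. The following is only a sketch: add_line_points_opt and its points argument are names I invented for illustration.

```r
library(ggplot2)

diet1 <- subset(ChickWeight, Diet == 1)

## Hypothetical helper: always returns the lines, and the points on request.
## When points is FALSE the if() yields NULL, which ggplot2 ignores.
add_line_points_opt <- function(df, points = TRUE) {
  list(
    geom_line(aes(x = Time, y = weight, group = Chick), data = df),
    if (points) geom_point(aes(x = Time, y = weight), data = df)
  )
}

plot_with_points <- ggplot() + add_line_points_opt(diet1)
plot_lines_only <- ggplot() + add_line_points_opt(diet1, points = FALSE)

## Number of layers that ended up in each plot
length(plot_with_points$layers)
length(plot_lines_only$layers)
```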

Also, you can get the same result by putting mappings and data in the ggplot function. This only works if all layers are going to plot the same data, but that is enough for some cases:

## bypasses the issue by putting mappings in ggplot()
(plot3 <- ggplot(aes(x = Time, y = weight, group = Chick), data = diet1) +
    geom_line() + geom_point())
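If the mappings and data live in ggplot(), a reusable function no longer needs to know about the data at all. Here is a minimal sketch that combines this with the list trick from above; line_points and base_aes are hypothetical names, and the dots are just passed on to both geoms.

```r
library(ggplot2)

diet1 <- subset(ChickWeight, Diet == 1)
diet2 <- subset(ChickWeight, Diet == 2)

## Hypothetical helper: both layers inherit data and mappings from ggplot()
line_points <- function(...) {
  list(geom_line(...), geom_point(...))
}

## The shared mapping, stored once and reused for both plots
base_aes <- aes(x = Time, y = weight, group = Chick)

plot3a <- ggplot(diet1, base_aes) + line_points()
plot3b <- ggplot(diet2, base_aes) + line_points(colour = "red")
```

The limitation is the same as before: each plot draws from a single data frame.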

Another way is to write a function that takes the plot object as input and returns a modified version of it. If we use the pipe operator %>%, found in the magrittr package, it even gets a ggplot2-like feel:

## bypasses the issue and gives a similar feel with pipes

library(magrittr)

add_line_points2 <- function(plot, df, ...) {
  plot +
    geom_line(aes(x = Time, y = weight, group = Chick), ..., data = df) +
    geom_point(aes(x = Time, y = weight), ..., data = df)
}

(plot4 <- ggplot() %>% add_line_points2(diet1) %>%
   add_line_points2(diet2, colour = "red"))
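The same idea should work without magrittr on R 4.1 or later, which has a native pipe, |>. This sketch repeats add_line_points2 from above so that it runs on its own; plot4_native is just a name I picked.

```r
library(ggplot2)

diet1 <- subset(ChickWeight, Diet == 1)
diet2 <- subset(ChickWeight, Diet == 2)

## Same function as above, repeated so that the block is self-contained
add_line_points2 <- function(plot, df, ...) {
  plot +
    geom_line(aes(x = Time, y = weight, group = Chick), ..., data = df) +
    geom_point(aes(x = Time, y = weight), ..., data = df)
}

## Native pipe instead of %>%; assumes R >= 4.1
plot4_native <- ggplot() |>
  add_line_points2(diet1) |>
  add_line_points2(diet2, colour = "red")
```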

Finally, in many cases, one can stick all the data in a combined data frame, and avoid building up the plot from different data frames altogether.

## plot the whole dataset at once
(plot5 <- ggplot(aes(x = Time, y = weight, group = Chick, colour = Diet),
                 data = ChickWeight) +
   geom_line() + geom_point())

Okay, maybe that plot is a bit too busy to be good. But note how the difference between plotting a single diet and all diets at the same time is just one more mapping in aes(). No looping or custom functions required.
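If the data start out as separate data frames, like diet1 and diet2 above, they can often be stacked first and then plotted in one go. A small sketch, assuming the data frames have the same columns; both and plot_both are names made up for this example.

```r
library(ggplot2)

diet1 <- subset(ChickWeight, Diet == 1)
diet2 <- subset(ChickWeight, Diet == 2)

## Stack the two data frames; the Diet column keeps track of the origin
both <- rbind(diet1, diet2)

## One extra mapping (colour = Diet) separates the two diets
plot_both <- ggplot(both, aes(x = Time, y = weight, group = Chick,
                              colour = Diet)) +
  geom_line() + geom_point()
```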

I hope that was of some use.

9 thoughts on “Using R: a function that adds multiple ggplot2 layers”

  1. Nice post, and a good illustration of the quirks of adding multiple objects to a plot at once!

    I’m not a ggplot expert, but from what I understand from working with it, ggplot2 basically assumes that the left-hand side of the `+` operator is always a ggplot object or a ggplot theme, and that the right-hand side is something you want to add to it. Sequential `+` operators are evaluated in order from left to right, so the left-hand side is always a valid ggplot object. The error you’re getting above arises because the body of “add_line_points” is evaluated on its own first, and ggplot2 doesn’t know how to add two geoms together in the absence of a ggplot object. You get the same error from just running

    try(add_line_points(diet1))

    The way to add multiple ggplot2 parts together in the absence of a ggplot() object is to collect them in a list. So this works as it should:

    add_line_points3 <- function(df) {
      list(add_line(df), add_points(df))
    }

    ## works
    try((plot2 <- ggplot() + add_line_points3(diet1)))

    I use the list trick all the time when I want to create plot templates to apply the same set of styles and geoms to different data sets.

  2. You are right. But mainly this is how ggplot2 works internally. Your last example is how it should work. You organise your data so that you can group it according to what you want to show. You don’t usually try to plot different data frames with one ggplot command. The philosophy behind ggplot2 is that you first prepare your data according to what you want to plot and then use aes() to group it correctly.

    • Yeah, for this example, the last one is the prettiest and my favoured solution. I hope one can tell that from the post.

      But there are plenty of legitimate reasons to have different data frames in different layers, e.g. showing data and model predictions in the same plot. ggplot2 can accommodate those too.

    • Cool. I always wanted a reason to make a geom, and you found a good one. 🙂

      (In the case of this post, lines and points were just an example. And not even a very good one, as the clutter in the last graph suggests. But I’ll need to try it for my lollipop plot needs!)
