There is grandeur in this view of life

martins bioblogg

Using R: Coloured sizeplot with ggplot2


Someone asked about this and I thought the solution with ggplot2 was pretty neat. Imagine that you have a scatterplot where some points share the exact same coordinates, and to reduce overplotting you want the size of each dot to indicate the number of data points that fall on it. At the same time you want to colour the points according to some categorical variable.

The sizeplot function in the plotrix package makes this type of scatterplot. However, it doesn’t do the colouring easily. I’m sure it’s quite possible with a better knowledge of base graphics, but I tend to prefer ggplot2. To construct the same type of plot we need to count the data points at each coordinate pair. For this, I use table(), then melt the contingency table into long format and remove the zeroes.

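library(ggplot2)
library(reshape2)
# Example data: several points share the same coordinates
data <- data.frame(x=c(0, 0, 0, 0, 1, 1, 2, 2, 3, 3, 4, 4),
                   y=c(0, 0, 0, 3, 1, 1, 1, 2, 2, 1, 4, 4),
                   group=c(rep(1, 6), rep(2, 4), rep(3, 2)))
# Cross-tabulate x and y, then melt the table to one row per (x, y) pair
counts <- melt(table(data[1:2]))
colnames(counts) <- c(colnames(data)[1:2], "count")
# Drop the coordinate pairs that never occur in the data
counts <- subset(counts, count != 0)
# Map count to dot size; the wider size range keeps single points visible
sizeplot <- qplot(x=x, y=y, size=count, data=counts) + scale_size(range=c(5, 10))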
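As an aside, the counting step can also be done without reshape2, and the plot can be written with ggplot() and geom_point() rather than qplot(), which recent ggplot2 releases mark as deprecated. What follows is only a sketch of that variant, not a replacement for the code above; the names counts.base and sizeplot.base are mine.

```r
library(ggplot2)

data <- data.frame(x = c(0, 0, 0, 0, 1, 1, 2, 2, 3, 3, 4, 4),
                   y = c(0, 0, 0, 3, 1, 1, 1, 2, 2, 1, 4, 4),
                   group = c(rep(1, 6), rep(2, 4), rep(3, 2)))

# Give every row a weight of 1 and sum the weights per (x, y) pair.
# aggregate() only returns combinations that occur, so there are
# no zero counts to remove afterwards.
counts.base <- aggregate(count ~ x + y,
                         data = transform(data, count = 1),
                         FUN = sum)

# The same sizeplot, written with ggplot() and geom_point().
sizeplot.base <- ggplot(counts.base, aes(x = x, y = y, size = count)) +
  geom_point() +
  scale_size(range = c(5, 10))
```

Typing sizeplot.base at the console (or calling print() on it) draws the plot.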

[Figure: scaleplot_1, the first sizeplot]

This is the first sizeplot. (The default size scale makes single points very tiny, hence the custom scale for size. Play with the range values to taste!) To add colour, we merge the counts with the original data to get back the group information and, in true ggplot2 fashion, map the group variable to colour.

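# unique(data) has one row per distinct (x, y, group);
# merge() joins it to the counts on the shared columns x and y
counts.and.groups <- merge(counts, unique(data))
# factor(group) makes ggplot2 treat the group as categorical
sizeplot.colour <- qplot(x=x, y=y, size=count,
                         colour=factor(group), data=counts.and.groups) +
                     scale_size(range=c(5, 10))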

[Figure: sizeplot_2, the coloured sizeplot]
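For what it's worth, ggplot2 releases newer than this post include geom_count(), which does the counting itself and maps the tally to point size. The sketch below assumes such a version is installed; the name sizeplot.count is mine, and I have not compared the result pixel for pixel with the plots above.

```r
library(ggplot2)

data <- data.frame(x = c(0, 0, 0, 0, 1, 1, 2, 2, 3, 3, 4, 4),
                   y = c(0, 0, 0, 3, 1, 1, 1, 2, 2, 1, 4, 4),
                   group = c(rep(1, 6), rep(2, 4), rep(3, 2)))

# geom_count() tallies identical combinations of x, y and colour and
# maps the tally to size, so table(), melt() and merge() are not needed.
sizeplot.count <- ggplot(data, aes(x = x, y = y, colour = factor(group))) +
  geom_count() +
  scale_size(range = c(5, 10))
```

Because colour is part of the tally, two groups sharing a coordinate would each get their own count here rather than the combined total.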

One thing this simple script does not handle well is points of different colours that happen to overlap. (As it stands, the code will plot two points in different colours on top of each other, each the size of the total number of overlapping points. That must be wrong in several ways.) However, I don’t know what the best behaviour would be in this case. Maybe count the overlaps separately for each group and plot both points with some transparency?
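Here is a rough sketch of that idea, not a tested solution: count per (x, y, group) rather than per (x, y), then draw the points semi-transparent with a small horizontal jitter so that both colours show. The data frame overlap is made up for illustration, and the alpha and jitter values are arbitrary choices.

```r
library(ggplot2)

# Made-up data: the point (1, 1) occurs in both group 1 and group 2.
overlap <- data.frame(x = c(0, 0, 1, 1, 1, 2),
                      y = c(0, 0, 1, 1, 1, 2),
                      group = c(1, 1, 1, 1, 2, 2))

# Count per (x, y, group), so each colour gets its own count
# and no merge with the original data is needed.
overlap$count <- 1
counts.by.group <- aggregate(count ~ x + y + group,
                             data = overlap, FUN = sum)

# Semi-transparent points, nudged sideways a little so that
# overlapping colours remain distinguishable.
sizeplot.overlap <- ggplot(counts.by.group,
                           aes(x = x, y = y, size = count,
                               colour = factor(group))) +
  geom_point(alpha = 0.7,
             position = position_jitter(width = 0.1, height = 0)) +
  scale_size(range = c(5, 10))
```

The jitter is random, so the exact placement changes between runs; whether that is acceptable depends on how exact the coordinates need to look.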


Written by mrtnj

17 November 2013 at 20:02

Posted in computer stuff, data analysis, english


8 replies


  1. You could use the hjust and vjust attributes and use the group number as the value ("vjust=0.5*group", for instance).

    Daniel

    18 November 2013 at 10:39

  2. Or you could probably use geom_jitter (not sure)…

    Daniel

    18 November 2013 at 11:33

  3. You can use jitter as Daniel says. If you are using qplot you only need to add:
    position = position_jitter(w = 0.1, h = 0)
    to your qplot statement.
    Additionally, I would set alpha to less than 1 (in qplot, for example, alpha=I(0.7)) so you can see each point through the one on top of it.

    Kristbjorn

    18 November 2013 at 13:00

  4. Why not split the counts based on class into two groups and then set your alpha = 0.5 (or around there), so you get some transparency to see the two?

    Joshua Aldrich

    18 November 2013 at 20:02

  5. Thank you for your comments! You seem to be thinking along the same lines as I do. (I don’t get the vjust thing though, since unless I’m mistaken, I think that aesthetic is only for text?)

    mrtnj

    18 November 2013 at 20:27

    • True, vjust/hjust is for text only… Did not recall that.

      Daniel

      18 November 2013 at 21:56

  6. You could probably also use "position_dodge". This is what I did to avoid overlapping: geom_point(position=position_dodge(0.8), size=dotSize, shape=21).

    Daniel

    19 November 2013 at 08:30

    • Ah, position_dodge! I didn’t think of that. After these comments, it’s quite obvious I need to play around a little and update the post :)

      mrtnj

      19 November 2013 at 18:51

