# Reviewing, postscript

Later the same day as the post on reviewing was published, I saw the paper by Kovanis and coworkers on the burden of peer review in biomedical literature. It’s silly of me that it didn’t occur to me to look for data on how many papers researchers review. Their first figure shows data on the number of reviews performed in 2015 by Publons users:

Figure 1B from Kovanis & al (2016) PLOS ONE (cc:by 4.0).

If we take these numbers at face value (but we probably shouldn’t, because Publons users seem likely to be a biased sample of researchers), my 4-6 reviews in a year fall somewhere in the middle: on the one hand, more than half of the researchers review fewer papers; on the other hand, it’s a lot less than those who review the most.

This paper estimates the supply and demand of reviews in biomedical literature. The conclusion is a lot like the above graph: reviewer effort is unevenly distributed. In their discussion, the authors write:

Besides, some researchers may be willing to contribute but are never invited. An automated method to improve the matching between submitted articles and the most appropriate candidate peer reviewers may be valuable to the scientific publication system. Such a system could track the number of reviews performed by each author to avoid overburdening them.

This seems right to me. There may be free riders who refuse to pull their weight. But there are probably a lot more people like me, who could and would review more if they were asked to. A way for editors to find them (us) more easily would probably be a good thing.

# Morning coffee: reviewing

(It has been a long time since I did one of these posts. I’d better get going!)

One fun thing that happened after I received my PhD is that I started getting requests to review papers, four so far. Four papers (plus re-reviews of revised versions) in about a year probably isn’t that much, but it is strictly greater than zero. I’m sure the entertainment value in reviewing wears off quite fast, but so far it’s been fun, and feels good to pay off some of the sizeable review debt I’ve accumulated while publishing papers from my PhD. Maybe I’m just too naïve and haven’t seen the worst parts of the system yet, but I don’t feel that I’ve had any upsetting revelations from seeing the process from the reviewer’s perspective.

Of course, peer review, like any human endeavour, has components of politics, ego and irrationality. Maybe one could do more to quell those tendencies. I note that different journals have quite different instructions to reviewers. Some provide detailed directions, laying out things that the reviewer should and shouldn’t do, while others just tell you how to use their web form. I’m sure editorial practices also differ.

One thing that did surprise me was when an editor changed the text of a review I wrote. It was nothing major, not a case of removing something inappropriate, but rewording a recommendation to make it stronger. I don’t mind, but I feel that the edit changed the tone of the review. I’ve also heard that this particular kind of comment (when a reviewer states that something is required for a paper to be acceptable for publication) rubs some people the wrong way, because that is up to the editor to decide. In this case, the editor must have felt that a more strongly worded review was the best way to get the author to pay attention, or something like that. I wonder how often this happens. That may be a reason to be even more apprehensive about signing reviews (I did not sign).

So far, I’ve never experienced anything other than single-blind review, but I would be curious to review double-blinded. I doubt the process would differ much: I haven’t reviewed any papers from people I know about, and I haven’t spent any time trying to learn more about them, except in some cases checking out previous work that they’ve referenced. I don’t expect that I’d feel any urge to undertake search engine detective work to figure out who the authors were.

Sometimes, there is the tendency among scientists and non-scientists alike to elevate review to something more than a couple of colleagues reading your paper and commenting on it. I’m pretty convinced peer review and editorial comments improve papers. And as such, the fact that a paper has been accepted by an editor after being reviewed is some evidence of quality. But peer review cannot be a guarantee of correctness. I’m sure I’ve missed and misunderstood things. But still, I promise that I’ll do my best, and I will not have the conscience to turn down a request for peer review for a long time. So if you need a reviewer for a paper on domestication, genetic mapping, chickens or related topics, keep me in mind.

# Paper: ”Feralisation targets different genomic loci to domestication in the chicken”

It is out: Feralisation targets different genomic loci to domestication in the chicken. This is the second of our papers on the Kauai feral and admixed chicken population, and came out a few days ago.

The Kauai chicken population is kind of famous: you can find them for instance on Flickr, or on YouTube. We’ve previously looked at their plumage, listened to the roosters’ crowings, and sequenced mitochondrial DNA to investigate their origins. Based on this, we concur with the common view that the chickens of Kauai probably are a mixture of feral birds of domestic origin and wild Junglefowl. The Kauai chickens look and sound like a mix of wild and domestic, and we found mitochondrial DNA of two haplogroups, one of which (called D) is typical in ancient chicken DNA from Pacific islands (Gering et al 2015).

In this paper, we looked at the rest of the genome of the same chickens — you didn’t think we sequenced the whole thing just to look at the mitochondrion plus a subset of markers, did you? We turn to population genomics, and a family of methods called selective sweep mapping, to search for regions of their genome that show signs of being affected by natural selection. This lets us: 1) draw pretty rainbow plots such as  this one …

(Figure 1a from the paper in question, Johnsson & al 2016, cc:by. The chromosomes have been laid out on the horizontal axis with different colours, and split into windows of 40 kb. Each dot represents the heterozygosity of that window. For all the details, see the paper.)

… 2) highlight regions of the genome that may have been selected during feralisation on Kauai (these are the icicles in the graph, highlighted by arrows); 3) conclude that the regions that look like they’ve been selected in feralisation overlap very little with the ones that look like they’ve been selected in chicken domestication. Hence the title.

That was the main result, but of course we also look at what genes are highlighted. Mostly we have no idea how they may contribute to feralisation, but a couple of regions overlap with those that we’ve previously found in genetic mapping of comb size and egg laying in our wild-by-domestic intercross. We also compare the potentially selected regions to domestic chicken sequences.

Last year, Ewen Callaway visited Dominic Wright, Eben Gering and Rie Henriksen on the last fieldtrip to Kauai. The article, When chickens go wild, was published in Nature News in January, and it explains a lot of the ideas nicely. This paper was submitted by then, so the samples they gathered on that trip do not feature in it. But, spoiler alert: there is more to come. (I don’t know what role I personally will play, but that is less important.)

As you may have guessed if you looked at the author list, this was a collaboration between quite a lot of people in Linköping, Michigan, London, and Victoria. Thanks to all involved! This was great fun, and for those of you who like this sort of thing, I hope the paper will be an interesting read.

Literature

M. Johnsson, E. Gering, P. Willis, S. Lopez, L. Van Dorp, G. Hellenthal, R. Henriksen, U. Friberg & D. Wright. (2016) Feralisation targets different genomic loci to domestication in the chicken. Nature Communications. doi:10.1038/ncomms12950

# Toying with models: The Game of Life with selection

Conway’s Game of life is probably the most famous cellular automaton, consisting of a grid of cells developing according to simple rules. Today, we’re going to add mutation and selection to the game, and let patterns evolve.

The fate of a cell depends on the number of cells that live in the eight neighbouring positions. A cell with fewer than two neighbours dies from starvation. A cell with more than three neighbours dies from overpopulation. A cell with two or three neighbours lives on. If a position is empty and has exactly three neighbours, it will be filled by a cell. These rules lead to some interesting patterns, such as still lifes that never change, oscillators that alternate between states, patterns that eventually die out but take a long time to do so, patterns that keep generating new cells, and so forth.

When I played with the Game of life when I was a child, I liked one pattern called ”virus”, that looked a bit like this. On its own, a grid of four-by-four blocks is a still life, but add one cell (the virus), and the whole pattern breaks. This is a version on a 30 x 30 cell board. It unfolds rather slowly, but in the end, a glider collides with a block, and you are left with some oscillators.

There are probably other interesting ways that evolution could be added to the game of life. We will take a hierarchical approach where the game is taken to describe development, and the unit of selection is the pattern. Each generation, we will create a variable population of patterns, allow them to develop and pick the fittest. So, here the term ”development” refers to what happens to a pattern when applying the rules of life, and the term ”evolution” refers to how the population of patterns changes over the generations. This differs slightly from Game of life terminology, where ”evolution” and ”generation” usually refer to the development of a pattern, but it is consistent with how biologists use the words: development takes place during the life of an organism, and evolution happens over the generations as organisms reproduce and pass on their genes to offspring. I don’t think there’s any deep analogy here, but we can think of the initial state of the board as the heritable material that is being passed on and occasionally mutated. We let the pattern develop, and at some point, we apply selection.

First, we need an implementation of the game of life in R. We will represent the board as a matrix of ones (live cells) and zeroes (empty positions). Here is a function that develops the board one tick in time. It deals with the corners and edges by padding the board with a border of empty positions. After that, it’s very short, but also slow as molasses. The next function does this for a given number of ticks.

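    ## Develop one tick. Return new board matrix.
    develop <- function(board_matrix) {
      ## Pad the board with a border of empty positions, so that cells
      ## on the corners and edges can be treated like all the others
      padded <- rbind(matrix(0, nrow = 1, ncol = ncol(board_matrix) + 2),
                      cbind(matrix(0, ncol = 1, nrow = nrow(board_matrix)),
                            board_matrix,
                            matrix(0, ncol = 1, nrow = nrow(board_matrix))),
                      matrix(0, nrow = 1, ncol = ncol(board_matrix) + 2))
      new_board <- padded
      for (i in 2:(nrow(padded) - 1)) {
        for (j in 2:(ncol(padded) - 1)) {
          ## Count live cells in the 3 x 3 neighbourhood, excluding the cell itself
          neighbours <- sum(padded[(i - 1):(i + 1), (j - 1):(j + 1)]) - padded[i, j]
          if (neighbours < 2 | neighbours > 3) {
            new_board[i, j] <- 0
          }
          if (neighbours == 3) {
            new_board[i, j] <- 1
          }
        }
      }
      ## Remove the padding again
      new_board[2:(nrow(padded) - 1), 2:(ncol(padded) - 1)]
    }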

    ## Develop a board a given number of ticks.
    tick <- function(board_matrix, ticks) {
      if (ticks > 0) {
        for (i in 1:ticks) {
          board_matrix <- develop(board_matrix)
        }
      }
      board_matrix
    }


We introduce random mutations to the board. We will use a mutation rate of 0.0011 per cell, which gives us a mean of about one mutation for a 30 x 30 board.

    ## Mutate a board
    mutate <- function(board_matrix, mutation_rate) {
      mutated <- as.vector(board_matrix)
      outcomes <- rbinom(n = length(mutated), size = 1, prob = mutation_rate)
      for (i in seq_along(outcomes)) {
        if (outcomes[i] == 1)
          mutated[i] <- ifelse(mutated[i] == 0, 1, 0)
      }
      matrix(mutated, ncol = ncol(board_matrix), nrow = nrow(board_matrix))
    }
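
As a quick check of the arithmetic behind the mutation rate, here is a minimal sketch. The names `n_cells`, `expected_mutations`, `p_unchanged` and `mutate_vectorised` are made up for this illustration and are not part of the original scripts. It computes the expected number of mutations per board and the chance that a board passes through unmutated, and shows one possible loop-free way to flip cells.

```r
## Expected number of mutations per board: number of cells times the rate
n_cells <- 30 * 30
mu <- 0.0011
expected_mutations <- n_cells * mu

## Probability that a board gets no mutation at all in one generation
p_unchanged <- (1 - mu)^n_cells

## Hypothetical loop-free variant of mutate(): flip each cell
## independently with probability mutation_rate
mutate_vectorised <- function(board_matrix, mutation_rate) {
  flips <- rbinom(n = length(board_matrix), size = 1, prob = mutation_rate)
  ## Where flips is 1, a 0 becomes 1 and a 1 becomes 0
  mutated <- ifelse(flips == 1, 1 - board_matrix, board_matrix)
  matrix(mutated, nrow = nrow(board_matrix), ncol = ncol(board_matrix))
}
```

So with this rate, a fair share of the offspring in each generation are exact copies of their parent, which means good patterns are not immediately lost to mutation.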


I was interested in the virus pattern, so I decided to apply a simple directional selection scheme for number of cells at tick 80, which is a while after the virus pattern has stabilized itself into oscillators. We will count the number of cells at tick 80 and call that ”fitness”, even if it actually isn’t (it is a trait that affects fitness by virtue of the fact that we select on it). We will allow the top half of the population to produce two offspring each, thus keeping the population size constant at 100 individuals.

    library(magrittr)  ## provides the pipe operator %>%

    ## Mutation rate per cell, used by next_generation() and evolve()
    mu <- 0.0011

    ## Calculates the fitness of an individual at a given time
    get_fitness <- function(board_matrix, time) {
      board_matrix %>% tick(time) %>% sum
    }

    ## Develop a generation and calculate fitness
    grow <- function(generation) {
      generation$fitness <- sapply(generation$board, get_fitness, time = 80)
      generation
    }

    ## Select a generation based on fitness, and create the next generation
    next_generation <- function(generation) {
      ## Keep the top half, and let each of them leave two offspring
      keep <- order(generation$fitness, decreasing = TRUE)[1:50]
      new_generation <- list(board = vector(mode = "list", length = 100),
                             fitness = numeric(100))
      ix <- rep(keep, each = 2)
      for (i in 1:100) new_generation$board[[i]] <- generation$board[[ix[i]]]
      new_generation$board <- lapply(new_generation$board, mutate, mutation_rate = mu)
      new_generation
    }

    ## Evolve a board, with mutation and selection, for a number of generations.
    evolve <- function(board, n_gen = 10) {
      generations <- vector(mode = "list", length = n_gen)
      generations[[1]] <- list(board = vector(mode = "list", length = 100),
                               fitness = numeric(100))
      for (i in 1:100) generations[[1]]$board[[i]] <- board
      generations[[1]]$board <- lapply(generations[[1]]$board, mutate, mutation_rate = mu)

      for (i in 1:(n_gen - 1)) {
        generations[[i]] <- grow(generations[[i]])
        generations[[i + 1]] <- next_generation(generations[[i]])
      }
      generations[[n_gen]] <- grow(generations[[n_gen]])
      generations
    }


Let me now tell you that I was almost completely wrong about what happens with this pattern once you apply selection. I thought that the initial pattern of nine stable blocks (36 cells) was pretty good, and that it would be preserved for long, and that virus-like patterns (like the first animation above) would mostly have degenerated around tick 80. As this plot of the evolution of the number of cells in one replicate shows, I grossly underestimated this pattern. The y-axis is the number of cells at time 80, and the x-axis shows individuals, with vertical lines separating generations. Already by generation five, most individuals do better than 36 cells in this case:

As one example, here is the starting position and the state at time 80 for a couple of individuals from generation 10 of one of my replicates:

Here is how the average cell number at time 80 evolves in five replicates. Clearly, things are still going on at generation 10, not only in the replicate shown above.

Here is the same plot for the virus pattern I showed above, i.e. the blocks but with one single added cell, fixed in the starting population. Prior genetic architecture matters. Even if the virus pattern has fewer cells than the blocks pattern at time 80, it is apparently a better starting point to quickly evolve more cells:

And finally, out of curiosity, what happens if we start with an empty 30 x 30 board?

Not much. The simple still life block evolves a lot. But in my replicate three, this creature emerged. ”Life, uh, finds a way.”

Unfortunately, many of the selected patterns extended to the edges of the board, making them play not precisely the game of life, but the game of life with edge effects. I’d like to use a much bigger board and see how far patterns extend. It would also be fun to follow them longer. To do that, I would need to implement a more efficient way to update the board (this is very possible, but I was lazy). It would also be fun to select for something more complex, with multiple fitness components, potentially in conflict, e.g. favouring patterns that grow large at a later time while being as small as possible at an earlier time.
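
One possible way to make the update faster, sketched here under my own assumptions and not what the code on github does, is to count neighbours for the whole board at once by adding up eight shifted copies of the padded matrix. The function name `develop_fast` is hypothetical. It follows the same rules and the same empty-border treatment of the edges as the loop version.

```r
## Hypothetical faster alternative to develop(): count neighbours by adding
## eight shifted copies of the zero-padded board.
develop_fast <- function(board_matrix) {
  n_row <- nrow(board_matrix)
  n_col <- ncol(board_matrix)
  padded <- matrix(0, nrow = n_row + 2, ncol = n_col + 2)
  padded[2:(n_row + 1), 2:(n_col + 1)] <- board_matrix

  neighbours <- matrix(0, nrow = n_row, ncol = n_col)
  for (d_row in -1:1) {
    for (d_col in -1:1) {
      ## Skip the cell itself
      if (d_row == 0 && d_col == 0) next
      neighbours <- neighbours +
        padded[(2:(n_row + 1)) + d_row, (2:(n_col + 1)) + d_col, drop = FALSE]
    }
  }

  ## A live cell survives with two or three neighbours;
  ## an empty position with exactly three neighbours is filled.
  survives <- board_matrix == 1 & (neighbours == 2 | neighbours == 3)
  born <- board_matrix == 0 & neighbours == 3
  matrix(as.numeric(survives | born), nrow = n_row, ncol = n_col)
}

## A blinker (three cells in a row) should alternate between
## a horizontal and a vertical state
blinker <- matrix(0, nrow = 5, ncol = 5)
blinker[3, 2:4] <- 1
after_one_tick <- develop_fast(blinker)
after_two_ticks <- develop_fast(after_one_tick)
```

The two remaining loops run over the eight neighbour directions only, not over the cells, so the cost per tick should grow much more gently with board size. I have not timed it against the original.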

Code is on github, including functions to display and animate boards with the animation package and ImageMagick, and code for the plots. Again, the blocks_selection.R script is slow, so leave it running and go do something else.

# Toying with models: The Luria–Delbrück fluctuation test

I hope that Genetics will continue running expository papers about their old classics, like this one by Philip Meneely about Luria & Delbrück (1943). Luria & Delbrück performed an experiment on bacteriophage resistance in Escherichia coli, growing bacterial cultures, exposing them to a phage, and then plating and counting the survivors, who have become resistant to the phage. They considered two hypotheses: either resistance occurs adaptively, in response to the phage, or it occurs by mutation some time during the growth of the culture but before the phages are added. They find the latter to be the case, and this is an example of how mutations happen irrespective of their effects on fitness, in a sense at random. Their analysis is based on a model of bacterial growth and mutation, and the aim of this exercise is to explore this model by simulating some data.

First, we assume that mutation happens with a fixed mutation rate $\mu = 2 \cdot 10^{-8}$, which is quite close to their estimated value, and that the mutation can’t reverse. We also assume that the bacteria grow by doubling each generation up to 30 generations. We start a culture from a single susceptible bacterium, and let it grow for a number of generations before the phage is added. (We’re going to use discrete generations, while Luria & Delbrück use a continuous function.) Then:

$n_{susceptible,i+1}= 2 (n_{susceptible,i} - n_{mutants,i})$

$n_{resistant,i+1} = 2 (n_{resistant,i} + n_{mutants,i})$

That is, every generation i, the mutants that occur move from the susceptible to the resistant category. The number of mutants that happen among the susceptible is binomially distributed:

$n_{mutants,i} \sim Binomial(n_{susceptible,i}, \mu)$.

This is an R function to simulate a culture:

    ## Simulate one culture grown for a number of generations
    culture <- function(generations, mu) {
      n_susceptible <- numeric(generations)
      n_resistant <- numeric(generations)
      n_mutants <- numeric(generations)
      n_susceptible[1] <- 1
      for (i in 1:(generations - 1)) {
        ## New mutants arise among the susceptible, then everyone doubles
        n_mutants[i] <- rbinom(n = 1, size = n_susceptible[i], prob = mu)
        n_susceptible[i + 1] <- 2 * (n_susceptible[i] - n_mutants[i])
        n_resistant[i + 1] <- 2 * (n_resistant[i] + n_mutants[i])
      }
      data.frame(generation = 1:generations,
                 n_susceptible,
                 n_resistant,
                 n_mutants)
    }
    cultures <- replicate(1000, culture(30, 2e-8), simplify = FALSE)


We run a few replicate cultures and plot the number of resistant bacteria. This graph shows the point pretty well: Because of random mutation and exponential growth, the cultures where mutations happen to arise relatively early will give rise to a lot more resistant bacteria than the ones where the first mutations are late. Therefore, there will be a lot of variation between the cultures because of their different histories.

    library(ggplot2)  ## provides qplot, theme_bw and facet_wrap

    combined <- Reduce(function (x, y) rbind(x, y), cultures)
    combined$culture <- rep(1:1000, each = 30)
    resistant_plot <- qplot(x = generation, y = n_resistant, group = culture,
                            data = combined, geom = "line",
                            alpha = I(1/10), size = I(1)) + theme_bw()

We compare this to what happens under the alternative hypothesis where resistance arises as a consequence of introduction of the phage with some resistance rate (this is not the same as the mutation rate above, even though we’re using the same value). Then the number of resistant cells in a culture will be:

$n_{acquired} \sim Binomial(2^{29}, \mu_{acquired})$.

    ## Final number of resistant cells in each culture under the mutation model
    resistant <- unlist(lapply(cultures, function(x) max(x$n_resistant)))

    ## Number of resistant cells under the acquired resistance model
    acquired_resistant <- rbinom(n = 1000, size = 2^29, prob = 2e-8)

    resistant_combined <- rbind(transform(data.frame(resistant = acquired_resistant), model = "acquired"),
                                transform(data.frame(resistant = resistant), model = "mutation"))

    resistant_histograms <- qplot(x = resistant, data = resistant_combined, bins = 10) +
      facet_wrap(~ model, scales = "free_x")


Here are two histograms side by side to compare the cases. The important thing is the shape. If the acquired resistance hypothesis holds, the number of resistant bacteria in replicate cultures is binomially distributed, and with this many trials and such a small probability per trial, it is very close to a Poisson distribution. The interesting thing about the Poisson distribution in this case is that its mean is equal to its variance. However, under the mutation model (as we’ve already illustrated), there is a lot of variation between cultures. These fluctuations make the variance much larger than the mean, which is also what Luria and Delbrück found in their data. Therefore, the results are inconsistent with acquired resistance, and hence the experiment is called the Luria–Delbrück fluctuation test.

    mean(resistant)
    var(resistant)
    mean(acquired_resistant)
    var(acquired_resistant)
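
To put the comparison in one number, one can look at the ratio of variance to mean, which should be close to one for Poisson-like counts. This is a minimal self-contained sketch under the same discrete-generation assumptions as above. The helper `simulate_resistant`, the seed and the smaller number of cultures are choices made for this illustration only.

```r
## Hypothetical helper: final number of resistant cells in one culture
## under the mutation model, with discrete doubling generations
simulate_resistant <- function(generations, mu) {
  n_susceptible <- 1
  n_resistant <- 0
  for (i in seq_len(generations - 1)) {
    n_mutants <- rbinom(n = 1, size = n_susceptible, prob = mu)
    n_resistant <- 2 * (n_resistant + n_mutants)
    n_susceptible <- 2 * (n_susceptible - n_mutants)
  }
  n_resistant
}

set.seed(1943)
n_cultures <- 200
mutation_counts <- replicate(n_cultures, simulate_resistant(30, 2e-8))
acquired_counts <- rbinom(n = n_cultures, size = 2^29, prob = 2e-8)

## Variance-to-mean ratio: close to 1 for a Poisson-like count,
## much larger than 1 when early mutations create jackpot cultures
ratio_mutation <- var(mutation_counts) / mean(mutation_counts)
ratio_acquired <- var(acquired_counts) / mean(acquired_counts)
```

Under the mutation model, even a mutant that arises in the last generation before plating leaves two resistant cells, and earlier ones leave many more, so the ratio should end up well above one.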


Literature

Luria, S. E., & Delbrück, M. (1943). Mutations of bacteria from virus sensitivity to virus resistance. Genetics, 28(6), 491.

Meneely, P. M. (2016). Pick Your Poisson: An Educational Primer for Luria and Delbrück’s Classic Paper. Genetics, 202(2), 371-375.

# Last year in Marseille and the EBM18 book

The EBM in Marseille was about a year ago (September 2014), but I don’t mind a bit of blog anachronism. I post this from the European Society for Evolutionary Biology conference in Lausanne. If you happen to be here, you can see me talk about signatures of selection in feralisation in Symposium 20 on Tuesday afternoon.

If you saw a bearded man carrying a pink bag scrambling towards Gare de Marseille Saint-Charles (pictured below; an incredibly beautiful train station) while eating boiled potatoes from a plastic bag, you may have witnessed my stylish departure from Marseille. This was during the Air France strike, and I had just learned that I could catch a train to Nice and go from there. In a moment of brilliance or cheapness, I also decided to spend the night at the airport Nice Côte d’Azur.

The conference itself was nothing short of wonderful. There were many interesting talks, but it was small enough that everything fit in one track, and there was plenty of time to meet people. The conference also ended with the nicest social activity at any conference I’ve been to: a group of participants went for a walk around Marseille with Pierre Pontarotti and his cute little dog. Myself, I presented our comb size work (2012 paper, 2014 paper and some new stuff). I felt like it went rather well. It seems someone else in the scientific committee thought so too, because I got invited to write a chapter for the book with meeting participants that they make each year. The invitation was to write an overview of the field we talked about, so I wrote about ”The genomics of sexual ornamentation, gene identification and pleiotropy”. One can have a look at the chapter on Google Books. The chapter goes through genomic studies (mostly QTL mapping and gene expression microarrays) on sexual ornaments, and some of the problems and promises.

I am not really sure when the book came out; I saw it popping up on Google Scholar the other day, but I haven’t seen the final version of my chapter. I assume a book is on its way or waiting for me when I get back from ESEB.

Pontarotti Pierre (Ed). (2015) Evolutionary Biology: Biodiversification from Genotype to Phenotype. Springer.

# @sweden recap

So, a couple of weeks ago I tweeted from the @sweden account. This is a short recap of some things that were said, and a few links that I promised people. Overall I think it went pretty well. I didn’t tweet as much as some other curators, but much much more than I usually do. This also meant I did spend my lunch and coffee breaks looking at my phone. My tweets are collected here, if for some reason you’d care to read them.

I talked quite a bit about my research. I spent more or less a full day on the chicken comb as a sexual ornament and genetics of comb mass. We discussed domestication as an evolutionary process, tonic immobility, and how to measure gene expression for eQTL mapping. I also wrote about Kauai feral chickens … And what I actually do in a day nowadays, that is: writing R code.

I got a question about what to say to your creationist friend. I think this depends on what the creationist friend believes and what their objections to evolution are. Unfortunately, I don’t think there is a simple knock-down argument against all forms of creationism, except that evolution works really well and has a lot of evidence going for it. I certainly don’t think it will do to rely on methodological naturalism and say that ”creation would be a supernatural event and outside the scope of science”. First, because I don’t think that is how science works. Say if unicorns, miraculous healing, and species popping into existence without relation to other species were actually part of the world, wouldn’t we want to study that? Second, that will never convince anyone, except of the irrelevance of science to their worldview.

But I think there are a handful of things that creationists often take issue with. First, some don’t believe in sequence variants creating new functions. This is often described with slogans about information, and how it cannot be created by random mutation. I don’t think ”sequence information” is a particularly useful concept, and would much prefer to talk about function and adaptation. That is what is important, after all: organisms acquiring new adaptations. It turns out that new functions arising can be observed, particularly in microorganisms. Some really fun and well-studied examples occur in the Long Term Evolution Experiment; see Richard Lenski’s blog, which has explanatory posts and links to papers.

Second, the formation of species comes up a lot in these discussions. This is a bit tricky, because it’s not always clear what constitutes different species. The definition most people have heard is probably that individuals belong to different species if they cannot have fertile offspring. But just think of asexually reproducing organisms. There, individuals belong to different species if they’re sufficiently different. So we already have what is needed to understand the formation of species in the evolution of new functions. When it comes to sexually reproducing organisms, there are examples of the evolution of reproductive isolation: cases where it seems to be ongoing or to have happened recently. (See for instance this paper on hybrid incompatibility in Mimulus guttatus; I have blogged about it, but only in Swedish.)

Third, there is the question of relatedness between species. In particular, some creationists really hate the idea that humans are apes. I think it is important to emphasize a couple of things that evolution does not say about humans and other apes. By the way, this isn’t just confusing for creationists, but for everyone. Evolution does not mean that humans descend from extant apes. Look at this phylogenetic tree from Perelman & al 2011. This is just like a family tree, but of populations: we see how chimps and humans have a recent common ancestor population. This is different than claiming that we would descend from extant chimps. Of course, chimps have also changed since the common ancestor, although not in the same ways as humans. (Again, I’ve written about this before in Swedish.)

Speaking of unicorns, I of course celebrated unicorn Friday:

Someone asked whether you can keep fruit flies for amateur genetics at home. That should be quite possible, and I don’t see any real problems with it either. The fruit fly community has a really strong culture of classical genetics with crosses and stocks. I don’t know if stock centres would deliver to private customers, but I don’t see why they wouldn’t, except for transgenic flies. It turned out, however, that transgenic flies were actually what the person asking was after. And of course, I can’t recommend that. I must say, I have mixed feelings about do-it-yourself biotechnology. On the one hand, some home molecular biology should be possible and rewarding. On the other hand, a lot of things routinely used in molecular labs are actually really dangerous if misused, and not just for the user. For example, when making any type of construct in transgenic bacteria, antibiotics and antibiotic resistance genes are the standard screening markers. They are used to pick out the bacteria that have incorporated the piece of DNA you care about. This is not the kind of stuff you want to use without proper containment. So, in the fly example, you would not only have to handle the flies, but also transgenic antibiotic resistant bacteria safely and legally. Then again, a lot of the genetics I care about does not involve any of that, and could very well be done in a basement.

The @sweden account caught me during a teaching week; otherwise, all of my photos would’ve been my computer, my pen and my coffee mug. Now I got to walk the followers through agarose gel electrophoresis and a little transformation of bacteria:

And, finally, Swedish spring: