Morning coffee: multilevel drift

There is an abstract account of natural selection (Lewontin 1970) where one observes that any population of entities, whatever they may be, will evolve through natural selection if (1) there is variation, that (2) affects reproductive success, and (3) is heritable.

I don’t know how I missed this before, but it recently occurred to me that there must be a similarly abstract account of drift, where a population will evolve through drift if there is (1) variation, (2) that is heritable, and (3) sampling due to finite population size.

Drift may not be negligible, especially since at a higher level of organization, the population size should be smaller, making natural selection relatively less efficient.
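
To make the three conditions concrete, here is a minimal sketch in R, assuming a Wright-Fisher style population with two heritable variants and no fitness differences at all. The function name `simulate_drift` and all parameter values are made up for illustration; the only point is that the same sampling process spreads frequencies much more when the population is small, as one would expect at a higher level of organization.

```r
## Minimal Wright-Fisher sketch: heritable variation plus finite sampling,
## and no selection. All names and parameter values are illustrative.
simulate_drift <- function(pop_size, n_gen, start_freq = 0.5) {
  freq <- numeric(n_gen + 1)
  freq[1] <- start_freq
  for (g in seq_len(n_gen)) {
    ## Each offspring inherits its variant from a randomly drawn parent
    n_carriers <- rbinom(n = 1, size = pop_size, prob = freq[g])
    freq[g + 1] <- n_carriers / pop_size
  }
  freq
}

set.seed(1)
## Final frequency after 20 generations in 200 replicate populations
small <- replicate(200, simulate_drift(pop_size = 10, n_gen = 20)[21])
large <- replicate(200, simulate_drift(pop_size = 1000, n_gen = 20)[21])

## Spread of the final frequencies across replicates
var(small)
var(large)
```

In the small populations, many replicates should end up having lost or fixed the variant, while the large populations stay close to the starting frequency.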

EBM 2016, Marseille

In September, I went to the 20th Evolutionary Biology Meeting in Marseille. This is a very nice little meeting. I listened to a lot of talks, had some very good conversations, met some people, and presented our effort to map domestication traits in the chicken with quantitative trait locus mapping and gene expression (Johnsson & al 2015, 2016, and some unpublished stuff).

Time for a little conference report. Late, but this time less than a year from the actual conference. Here are some of my highlights:

Richard Cordaux on pill bugs, Wolbachia and sex manipulation — I did not know that Wolbachia, the intracellular parasite superstar of arthropods, had feminization of hosts in its repertoire (Cordaux & al 2004). Not only that, but in some populations of pill bugs, a large chunk of the genome of the feminizing Wolbachia has inserted into the pill bug genome, thus forming a new W chromosome (Leclercq & al 2016, published since the conference). He also told me how this is an example of the importance of preserving genetic resources — the lines of pill bugs have been maintained for a long time, and now they’re able to return to them with genomics tools and continue old lines of research. I think that is seriously cool.

Olaya Rendueles Garcia on positive frequency-dependent selection maintaining diversity in social bacterium Myxococcus xanthus (Rendueles, Amherd & Velicer 2015) — In my opinion, this was the best talk of the conference. It had everything: an interesting phenomenon, a compelling study system, good visuals and presentation. In short: M. xanthus of the same genotype tend to cooperate, inhabit their own little turfs in the soil, and exclude other genotypes. So it seems positive frequency-dependent selection maintains diversity in this case — diversity across patches, that is.

A very nice thing about this kind of meeting is that one gets a look into the amazing diversity of organisms. Or as someone put it: the complete and utter mess. In this department, I was particularly struck by … Sally Leys — sponges; Marie-Claude Marsolier-Kergoat — bison; Richard Dorrell — stramenopile chloroplasts.

I am by no means a transposable elements person. In fact, one might believe I was actively avoiding transposable elements by my choice of study species. But transposable elements are really quite interesting, and seem quite important to genome evolution, both to neutrally evolving and occasionally adaptive sequences. This meeting had a good transposon session, with several interesting talks.

Anton Crombach presented models of the gap gene network in Drosophila melanogaster and Megaselia abdita, with some evolutionary perspectives (Crombach & al 2016). A couple of years ago, Marjoram, Zubair & Nuzhdin used the gap gene network as their example model to illustrate the suggestion to combine systems biology models with genetic mapping. I very much doubt (though I may be wrong; it happens a lot) that there is much meaningful variation within populations in the gap gene network. A between-species analysis seems much more fruitful, and leads to the interesting result where the outcome, in terms of gap gene expression along the embryo, is pretty similar but the way that the system gets there is quite different.

If you’ve had a beer with me and talked about the future of quantitative genetics, you’re pretty likely to have heard me talk about how in the bright future, we will not just map variation in phenotypes, but in the parameters of dynamical models. (I also think that the mapping will take place through fully Bayesian hierarchical models where the same posterior can be variously summarized for doing genomic prediction or for mapping the major quantitative trait genes, interactions etc. Of course, setting up and running whole-genome long read sequencing will be as convenient and cheap as an overnight PCR. And generally, there will be pie in the sky etc.) At any rate, what Anton Crombach showed was an example of combining systems biology modelling with variation (between clades). I thought it was exciting.

It was fun to hear Didier Raoult, one of the discoverers of giant viruses, speak. He was somewhat of a quotation machine.

”One of the major problems in biology is that people believe what they’ve learned.”

(About viruses being alive or not) ”People ask: are they alive, are they alive? I don’t care, and they don’t care either”

Very entertaining, and quite fascinating stuff about giant viruses.

If there are any readers out there who worry about social media ruining science by spilling the beans about unpublished results presented at meetings, do not worry. There were a few more cool unpublished things. Conference participants, you probably don’t know who you are, but I eagerly await your papers.

I think this will be the last evolution-themed conference for me in a while. The EBM definitely has a different range of themes than the others I’ve been to: ESEB, or rather: the subset of ESEB I see choosing my adventure through the multiple-session programme, and the Swedish evolution meetings. There was more molecular evolution, more microorganisms and even some origin of life research.

Toying with models: The Game of Life with selection

Conway’s Game of Life is probably the most famous cellular automaton, consisting of a grid of cells developing according to simple rules. Today, we’re going to add mutation and selection to the game, and let patterns evolve.

The fate of a cell depends on the number of cells that live in the neighbouring positions. A cell with fewer than two neighbours dies from starvation. A cell with more than three neighbours dies from overpopulation. If a position is empty and has three neighbours, it will be filled by a cell. These rules lead to some interesting patterns, such as still lives that never change, oscillators that alternate between states, patterns that eventually die out but take a long time to do so, patterns that keep generating new cells, and so forth.

oscillators still_life

When I played with the Game of life when I was a child, I liked one pattern called ”virus”, that looked a bit like this. On its own, a grid of four-by-four blocks is a still life, but add one cell (the virus), and the whole pattern breaks. This is a version on a 30 x 30 cell board. It unfolds rather slowly, but in the end, a glider collides with a block, and you are left with some oscillators.

blocks virus

There are probably other interesting ways that evolution could be added to the game of life. We will take a hierarchical approach where the game is taken to describe development, and the unit of selection is the pattern. Each generation, we will create a variable population of patterns, allow them to develop and pick the fittest. So, here the term ”development” refers to what happens to a pattern when applying the rules of life, and the term ”evolution” refers to how the population of patterns changes over the generations. This differs slightly from Game of life terminology, where ”evolution” and ”generation” usually refer to the development of a pattern, but it is consistent with how biologists use the words: development takes place during the life of an organism, and evolution happens over the generations as organisms reproduce and pass on their genes to offspring. I don’t think there’s any deep analogy here, but we can think of the initial state of the board as the heritable material that is being passed on and occasionally mutated. We let the pattern develop, and at some point, we apply selection.

First, we need an implementation of the game of life in R. We will represent the board as a matrix of ones (live cells) and zeroes (empty positions). Here is a function that develops the board one tick in time. After dealing with the corners and edges, it’s very short, but also slow as molasses. The next function does this for a given number of ticks.

## Develop one tick. Return new board matrix.
develop <- function(board_matrix) {
  padded <- rbind(matrix(0, nrow = 1, ncol = ncol(board_matrix) + 2),
                  cbind(matrix(0, ncol = 1, nrow = nrow(board_matrix)), 
                        board_matrix,
                        matrix(0, ncol = 1, nrow = nrow(board_matrix))),
                  matrix(0, nrow = 1, ncol = ncol(board_matrix) + 2))
  new_board <- padded
  for (i in 2:(nrow(padded) - 1)) {
    for (j in 2:(ncol(padded) - 1)) {
      neighbours <- sum(padded[(i-1):(i+1), (j-1):(j+1)]) - padded[i, j]
      if (neighbours < 2 | neighbours > 3) {
        new_board[i, j] <- 0
      }
      if (neighbours == 3) {
        new_board[i, j] <- 1
      }
    }
  }
  new_board[2:(nrow(padded) - 1), 2:(ncol(padded) - 1)]
}

## Develop a board a given number of ticks.
tick <- function(board_matrix, ticks) {
  if (ticks > 0) {
    for (i in 1:ticks) {
      board_matrix <- develop(board_matrix) 
    }
  }
  board_matrix
}

We introduce random mutations to the board. We will use a mutation rate of 0.0011 per cell, which gives us a mean of about one mutation for a 30 x 30 board.

## Mutate a board
mutate <- function(board_matrix, mutation_rate) {
  mutated <- as.vector(board_matrix)
  outcomes <- rbinom(n = length(mutated), size = 1, prob = mutation_rate)
  for (i in 1:length(outcomes)) {
    if (outcomes[i] == 1)
      mutated[i] <- ifelse(mutated[i] == 0, 1, 0)
  }
  matrix(mutated, ncol = ncol(board_matrix), nrow = nrow(board_matrix))
}
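
As a back-of-the-envelope check of that mutation rate, here is a small sketch. The variable names are made up for this example; the numbers are the 30 x 30 board and the per-cell rate of 0.0011 from the text. It computes the expected number of mutations per board, and the chance that a board gets through a generation with no mutation at all.

```r
## Board size and per-cell mutation rate as given in the text
board_cells <- 30 * 30
mutation_rate <- 0.0011

## Expected number of mutated cells per board per generation
expected_mutations <- board_cells * mutation_rate

## Probability that a board is passed on without any mutation,
## computed directly and from the binomial distribution
p_no_mutation <- (1 - mutation_rate)^board_cells
p_no_mutation_binom <- dbinom(0, size = board_cells, prob = mutation_rate)

expected_mutations
p_no_mutation
```

So while the mean is close to one mutation per board, a bit over a third of the offspring should be exact copies of their parent.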

I was interested in the virus pattern, so I decided to apply a simple directional selection scheme for number of cells at tick 80, which is a while after the virus pattern has stabilized itself into oscillators. We will count the number of cells at tick 80 and call that ”fitness”, even if it actually isn’t (it is a trait that affects fitness by virtue of the fact that we select on it). We will allow the top half of the population to produce two offspring each, thus keeping the population size constant at 100 individuals.

## The pipe operator %>% comes from the magrittr package
library(magrittr)

## Calculates the fitness of an individual at a given time
get_fitness <- function(board_matrix, time) {
  board_matrix %>% tick(time) %>% sum
}

## Develop a generation and calculate fitness
grow <- function(generation) {
  generation$fitness <- sapply(generation$board, get_fitness, time = 80)
  generation
}

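## Per-cell mutation rate (0.0011, as stated above). It is read as a global
## variable by next_generation() and evolve().
mu <- 0.0011

## Select a generation based on fitness, and create the next generation,
## adding mutation.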
next_generation <- function(generation) {
  keep <- order(generation$fitness, decreasing = TRUE)[1:50]
  new_generation <- list(board = vector(mode = "list", length = 100),
                         fitness = numeric(100))
  ix <- rep(keep, each = 2)
  for (i in 1:100) new_generation$board[[i]] <- generation$board[[ix[i]]]
  new_generation$board <- lapply(new_generation$board, mutate, mutation_rate = mu)
  new_generation
}

## Evolve a board, with mutation and selection for a number of generations.
evolve <- function(board, n_gen = 10) { 
  generations <- vector(mode = "list", length = n_gen)

  generations[[1]] <- list(board = vector(mode = "list", length = 100),
                           fitness = numeric(100))
  for (i in 1:100) generations[[1]]$board[[i]] <- board
  generations[[1]]$board <- lapply(generations[[1]]$board, mutate, mutation_rate = mu)

  for (i in seq_len(n_gen - 1)) {
    generations[[i]] <- grow(generations[[i]])
    generations[[i + 1]] <- next_generation(generations[[i]])
  }
  generations[[n_gen]] <- grow(generations[[n_gen]])
  generations
}

Let me now tell you that I was almost completely wrong about what happens with this pattern once you apply selection. I thought that the initial pattern of nine stable blocks (36 cells) was pretty good, and that it would be preserved for long, and that virus-like patterns (like the first animation above) would mostly have degenerated by around tick 80. As this plot of the evolution of the number of cells in one replicate shows, I grossly underestimated this pattern. The y-axis is the number of cells at time 80, and the x-axis is individuals, with vertical lines separating generations. Already by generation five, most individuals do better than 36 cells in this case:

blocks_trajectory_plot

As one example, here is the starting position and the state at time 80 for a couple of individuals from generation 10 of one of my replicates:

blocks_g10_1 blocks_g10_80

blocks_g10_1b blocks_g10_80b

Here is how the average cell number at time 80 evolves in five replicates. Clearly, things are still going on at generation 10, not only in the replicate shown above.

mean_fitness_blocks

Here is the same plot for the virus pattern I showed above, i.e. the blocks but with one single added cell, fixed in the starting population. Prior genetic architecture matters. Even if the virus pattern has fewer cells than the blocks pattern at time 80, it is apparently a better starting point to quickly evolve more cells:

mean_fitness_virus

And finally, out of curiosity, what happens if we start with an empty 30 x 30 board?

mean_fitness_blank

Not much. The simple still life block evolves a lot. But in my replicate three, this creature emerged. ”Life, uh, finds a way.”

blank_denovo

Unfortunately, many of the selected patterns extended to the edges of the board, making them play not precisely the game of life, but the game of life with edge effects. I’d like to use a much bigger board and see how far patterns extend. It would also be fun to follow them longer. To do that, I would need to implement a more efficient way to update the board (this is very possible, but I was lazy). It would also be fun to select for something more complex, with multiple fitness components, potentially in conflict, e.g. favouring patterns that grow large at a later time while being as small as possible at an earlier time.
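
For what it’s worth, here is one way the faster update could look. This is only a sketch under my own assumptions, not the code in the repository: `develop_fast` is a made-up name, and the idea is to count neighbours for all positions at once by summing eight shifted copies of the zero-padded board, instead of looping over every cell. It keeps the same empty border as `develop`, so the edge effects are still there.

```r
## Vectorised one-tick update. A sketch, not the code from the repository;
## develop_fast is a name made up for this example.
develop_fast <- function(board_matrix) {
  n_row <- nrow(board_matrix)
  n_col <- ncol(board_matrix)
  ## Pad with a border of empty positions, as develop() does
  padded <- matrix(0, nrow = n_row + 2, ncol = n_col + 2)
  padded[2:(n_row + 1), 2:(n_col + 1)] <- board_matrix

  ## Sum the eight shifted copies of the board to count neighbours
  neighbours <- matrix(0, nrow = n_row, ncol = n_col)
  for (di in -1:1) {
    for (dj in -1:1) {
      if (di == 0 && dj == 0) next
      neighbours <- neighbours +
        padded[(2:(n_row + 1)) + di, (2:(n_col + 1)) + dj, drop = FALSE]
    }
  }

  ## Birth with exactly three neighbours; survival with two or three
  new_board <- (neighbours == 3) | (board_matrix == 1 & neighbours == 2)
  matrix(as.numeric(new_board), nrow = n_row, ncol = n_col)
}

## Check on a blinker, an oscillator that flips between a horizontal
## and a vertical bar of three cells
blinker <- matrix(0, nrow = 5, ncol = 5)
blinker[3, 2:4] <- 1
after_one <- develop_fast(blinker)
after_two <- develop_fast(after_one)
```

After one tick the blinker should be vertical, and after two ticks it should be back where it started. The only loop left runs over the eight neighbour offsets rather than over all cells, which is where I would expect the time saving on a bigger board to come from.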

Code is on github, including functions to display and animate boards with the animation package and ImageMagick, and code for the plots. Again, the blocks_selection.R script is slow, so leave it running and go do something else.

We are not descended from chimpanzees

I have found it difficult, when looking at any two species, to avoid picturing to myself, forms directly intermediate between them. But this is a wholly false view; we should always look for forms intermediate between each species and a common but unknown progenitor; and the progenitor will generally have differed in some respects from all its modified descendants.

– Charles Darwin. ”On the Origin of Species…” Chap. IX. On the Imperfection of the Geological Record. (Quoted here.)

Creationism or intelligent design cannot be the notion that God created the world (there are many different ideas about how gods may have created the world that don’t make people refuse science), but rather the notion that living organisms of different species do not share a common origin, even though they are so similar, and do not change over time through evolution, even though it looks as if they have. Since evolution can be observed in real time in nature and in the lab, they are more or less forced to admit that evolution exists, only that it cannot explain the variation in nature. I don’t think there is any point whatsoever in spending lots of time debating creationists, especially when Swedish creationists are so rare. However, I do think it can be both fun and meaningful to write general-audience pieces about evolutionary biology that relate to the kind of things creationists tend to say.

Suppose we ask: If humans are descended from apes, why are there still apes? Or try this, I think, more interesting version: If multicellular organisms are descended from single-celled organisms, why are there so many bacteria? The crux, of course, is that we are not descended from currently living single-celled organisms or from other currently living species of apes. But we do share a common origin. Humans and other apes are fairly similar, so we can conclude that the last common forefathers and foremothers we had must have been something we would describe as apes. But that was a long time ago, and many genetic variants have flowed under the metaphorical bridges, both in the lineage that led to us and in the one that led to the chimpanzees. My dad came from Skåne, so I must be half descended from Scanians. Yet there are still Scanians.

Here is a family tree (from Polina Perelman et al., 2011, cc:by 3.0) with us and other living apes and monkeys. The technical term is phylogenetic tree, but family tree describes pretty well what it is about. The difference is that it is a tree of groups (in this case genera), and that there are no pictures or names for the common forefathers and foremothers that are the branching points of the tree. This tree is based on comparisons of DNA sequences between the currently living species. Sure enough, we see that humans (Homo) have chimpanzees (Pan) as their closest relatives, with the gorillas a little further away.

journal.pgen.1001342.g001

Today’s recommendation: viruses that live on bacteria that live on trees

As you know, I went to a conference this summer: ESEB2013, which was absolutely fantastic, like a festival of science. The talks were recorded, and now more of them are starting to appear on the conference’s YouTube channel. These are of course fairly technical presentations aimed at an expert audience, but many of the speakers are so good and entertaining that I think the talks are very much accessible to a slightly nerdy, interested general public: science that is fully serviceable as popular science!

Today’s talk: Britt Koskella is an evolutionary biologist who works on interactions between parasites and host organisms. Here, in a talk from the session on rapid evolution, she talks about bacteriophages that live on bacteria that live on chestnut trees, their interactions, and their evolutionary adaptation to their local environment and to each other. Not only can one move bacteria and viruses back and forth to test how they cope in a new environment, one can also freeze them and thaw them again, so that one can compare old organisms with new ones and see adaptation in real time!

Pig and chimp dna just won’t splice, or: take a look at Wikipedia and email me

Okay, the title refers to South Park, which really isn’t a particularly good series. But it turns out that the South Park character Dr. Alphonso Mephesto (sic!) has a kind of real-life counterpart; truth is stranger than fiction, and so on. Apparently there is a geneticist who pushes his own hypothesis about human evolution: that we come from hybrids between chimpanzees and pigs. As is well known, there is no opinion so strange that nobody holds it. And somehow The Daily Mail, which is rather notorious for its poor science journalism, has taken it into its head to write about it. And straight from the Daily Mail’s website, it comes to Aftonbladet.

Eugene McCarthy is the name of the man pushing the idea that humans are a chimpanzee–pig hybrid. He is probably fairly alone in believing that; why will probably become apparent if one reads his very long-winded website. I have only read the linked article, where so far he has concluded that there is nothing to suggest that chimpanzees and pigs can have fertile offspring, that there are other explanations for bodily similarities between humans and pigs, and that humans are genetically not particularly similar to pigs but are similar to chimpanzees. He has not yet got around to what speaks in favour of his hypothesis.

But it wasn’t the hypothesis as such I was going to write about, but science journalism. (If you really wonder about the pigs and the chimpanzees, see PZ Myers’ blog post. Myers is, by the way, a professor of biology, so his academic position clearly outranks mine, and he has chosen a different South Park clip as illustration.) It is of course very easy to scold those who wrote the articles in the Daily Mail and Aftonbladet; surely anyone ought to realise that the ”theory” in question is something a crank made up? Or? I actually don’t know! I do perhaps think that someone who writes a summary should be able to trace it back to the source and see that it is not a published research result, but a rather suspect personal website. But as for the claim that humans are a chimpanzee–pig hybrid? It sounds absurd, but so does a good deal of real research. Researchers and science communicators are also very fond of coming up with striking and sensational headlines and summaries. It may actually not be that easy to know what is reasonable and what isn’t.

There is no doubt that researchers, including little PhD students like yours truly, can sometimes be snotty on Twitter or write angry emails when they think someone has published something stupid. And they are, just like journalists and reporters, busy and under constant time pressure. But most of them ought to be used to, and interested in, explaining science. So if you are a reporter, sitting with an article in your lap and wondering what it means and whether it is credible: call someone! Send an email! Sharing knowledge is actually part of our job. Even if someone like me is by no means an expert on most things in biology, we have at least had a lot of practice in reading and evaluating scientific claims.

From Lisbon

Dear diary,

I’m at the Congress of the European Society for Evolutionary Biology in Lisbon. It’s great, of course, and I expected nothing less, but there is so much of it! Every session at ESEB has nine symposia running in parallel, so there are many paths through the conference programme. Mine contains a lot of genomics for obvious reasons.

Some highlights so far:

Juliette de Meaux’s plenary: while talking about the molecular basis of adaptations in Arabidopsis thaliana (one study based on a candidate gene and one on a large-effect QTL), de Meaux brought up two fun concepts that would recur in Thomas Mitchell-Olds’ talk and elsewhere:

1) The ‘mutational target’ and how many genes there are that could possibly be perturbed to change a trait in question. The size of the mutational target and the knowledge of the mechanisms underlying the trait of course affect whether it is fruitful to try any candidate gene approaches. My intuition is to be skeptical of candidate gene studies for complex traits, but as in the case of plant pathogen defense (or melanin synthesis for pigmentation, another example that got a lot of attention in several talks), if there is only one enzyme pathway to synthesise a compound and only one step that controls the rate of the reaction, there will be very few genes that can physically be altered to affect the trait.

2) Some of both de Meaux’s and Mitchell-Olds’ work exemplifies the mapping of intermediate molecular phenotypes to get at small-effect variants for organismal traits. The idea is that while there might be many loci and large environmental effects on the organismal traits, they will act through different molecular intermediates, and the intermediate traits will be simpler. The intermediate traits might be flagellin binding, flux through an enzymatic pathway or maybe transcript abundance. This is a similar line of thinking to the motivation for using genetical genomics and eQTL mapping.

The ”Do QTN generally exist?” symposium: my favourite symposium so far. (Note: QTN stands for Quantitative Trait Nucleotide, and it means nothing more than a known causal sequence variant for some quantitative trait. Very few actual QTN featured in the session, so maybe it should’ve been called ”Do QTG generally exist?” Whatever.) I’ve heard both Matt Rockman and Annalise Paaby present their RNA interference experiments revealing cryptic genetic variation in C. elegans before, but Rockman also talked about some conceptual points (”things we all know but sometimes forget” [I’m paraphrasing from memory]): adaptation does not require fixation; standing variation matters; effect-size is not an intrinsic feature of an allele. There was also a very memorable question at the end, asking whether the answer to the questions Rockman posed at the beginning, ”What number of loci contribute to adaptive evolution?” and ”What is the effect-size distribution?” should be ”any number of loci” and ”any distribution” … To which Rockman answered that those were pretty much his views.

In the same symposium, Luisa Pallares showed some really nice genome-wide association results for craniofacial morphology from natural hybrid mice. As someone who works on an experimental cross of animals, I found the idea very exciting, and of course I immediately started dreaming about hybrid genetical genomics.

Dieter Ebert’s plenary: how they with lots of work seem to have found actual live Red Queen dynamics with Daphnia magna and Pasteuria ramosa.

Larry Young and Hanna Kokko: Young and Kokko had two very different invited talks back to back in the sex role symposium, Young about the neurological basis of pair-bonding in the famous monogamous voles, and Kokko about models of evolution of some aspects of sex roles.

Susan Johnston’s talk: about how heterozygote advantage maintains variation at a horn locus in the Soay sheep of St Kilda. Simply awesome presentation and results. Published yesterday!

On to our stuff! Dominic Wright had a talk presenting our chicken comb work in the QTN session, and on Friday I will have a poster on display about the behaviour side of the project. There are actually quite a few of us from the AVIAN group here, most of them also presenting posters on Friday (Anna-Carin, Johan, Amir, Magnus, Hanne, Rie). And (though misspelled) my name is on the abstract of Per Jensen’s talk as well, making this my personal record for conference contributions.

The poster sessions are very crowded and a lot of the posters are hung facing the wall with very little space for walking past, and some of them behind pillars. In all probability there’s a greater than 0.5 chance that my poster will be in a horrible spot. But if you happen to be curious feel free to grab me anywhere you see me, or tweet at me.

I look like this when posing with statues or when I’m visibly troubled by the sunlight. If you’re into genetical genomics for QTG identification, domestication and that kind of stuff, this is the hairy beast you should talk to.

martin_eseb