Last time we started poking around in the fitness interactions between the mutations that arise and fix in these simulated populations. I wrote a little about why we expect to see epistasis in a gene network, though you may be starting to wonder how closely the model underlying these simulations resembles a gene network. I’ll post about that in a few weeks, but first I want to follow through on our hunt for any strongly interacting pairs of mutations.
But before even that, I’m realizing that I’m not doing a very good job of showing my research process–specifically, the missteps, dead ends, and circuitous routes that far outnumber the published successes. The urge to protect my ego is a minor factor in my resistance to this type of research blogging; mostly, I just can’t imagine spending the time it would take to really document the full process. But let me take a little step in that direction and tell you about a bug.
A computer program passes a dangerous threshold when its output becomes too complex to evaluate by simple inspection; such complexity forces you to code tools to help you see what the program is even doing. That extra layer of abstraction between what the code is doing and what you are seeing is a breeding ground for bugs; the one I most recently experienced sat right at the interface between some complicated R code and some even crazier C code. The only formula I know of for sniffing out these kinds of mistakes is to visualize results when and where you have a clear prediction: in this case, I attempted to verify that mutations that had rapidly fixed in populations were beneficial at the time they did so. When my code spit out strong negative selection coefficients for these mutations, suddenly every bit of my nice code, from the stuff I wrote last week to pieces I put together 18 months ago, was potentially suspect. Fortunately, the mistake turned out to be in the most recent thing I had done–not an uncommon outcome. I think that scientists know, at some level, that any day they could come into work and end up disproving months or years of their own work, but I think it helps to acknowledge that fear, get it out in the open, and make sure it’s not pushing you to sweep inconsistencies under the rug.
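For the curious, the sanity check described above boils down to something like the following. This is a minimal Python sketch, not the actual R and C code behind these simulations: the function names, the data layout, and the fitness numbers are all made up for illustration, and it assumes the selection coefficient is defined as the mutant's fitness relative to its background, minus one.

```python
def selection_coefficient(w_mutant, w_background):
    """Relative fitness advantage of a mutant over the background it arose in.

    Assumed definition: s = w_mutant / w_background - 1.
    """
    return w_mutant / w_background - 1.0


def suspicious_fixations(fixed_mutations, threshold=0.0):
    """Return the names of fixed mutations whose s falls below the threshold.

    fixed_mutations: iterable of (name, w_mutant, w_background) tuples,
    with both fitnesses measured at the time the mutation swept.
    """
    return [
        name
        for name, w_mut, w_bg in fixed_mutations
        if selection_coefficient(w_mut, w_bg) < threshold
    ]


# Invented numbers, purely for illustration.
fixed = [("m1", 1.05, 1.00), ("m2", 0.80, 1.00), ("m3", 1.21, 1.10)]
flagged = suspicious_fixations(fixed)
print(flagged)
```

Anything that lands in the flagged list swept through a population despite looking deleterious. Drift can occasionally do that, but for a mutation that fixed rapidly, a bug somewhere in the pipeline is the more likely explanation.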
OK, back to epistasis. We’re looking for one clear example of a substitution that happens late in our simulated experiment, where the cause of that late arrival can be assigned, at least in part, to interactions with other loci. Basically, we want to see if we can find a mutation that could not have been beneficial much earlier in the history of its population than when it actually arose and swept through. Finding one or two such mutations doesn’t really tell us anything too surprising, but would add to the stack of little results that might lead to an actual discovery.
Here’s another of the same ten populations we’ve been checking out– #7, in fact. The x-axis, once again, is time in generations; the skinny plots show allele frequencies over time, with the presence of each gene indicated by white space and the frequency of its deletion shown by the grey background. The plot at the bottom shows mean population fitness.
There are a few cool things that happen right toward the end of this simulation, including a late but strongly selected duplication, but let’s focus on the top row, R1. Here, an allele (in blue) that fixed toward the very beginning is suddenly threatened by a new mutation (purple), then displaced, along with the purple new arrival, by a third allele (orange). So, why all this sudden activity–could this orange allele have substituted earlier?
If we calculate the selective advantage of this orange allele in previous genotypes along the lineage, we see that this allele could have substituted anytime after about 32,000 generations, but before then, it was deleterious.
In fact, it was disastrously deleterious until about 20,000 generations, when something changed in the genetic background that made this possible mutation almost neutral. Looking back at our first figure, we can see that that permissive change was a duplication that seemed to increase fitness only a little by itself.
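The calculation behind that scan is simple enough to sketch: take each genotype along the lineage, drop the focal allele into it, and compare fitness with and without. Here is a toy Python version. Everything in it is hypothetical: `toy_fitness` is a stand-in I invented to mimic the qualitative pattern described here, not the model underlying the simulations, and the allele labels and fitness multipliers are made up. Only the 20,000 and 32,000 generation marks come from the results above.

```python
def toy_fitness(genotype):
    """Hypothetical fitness function with epistasis (not the simulation's model)."""
    w = 1.0
    if "dup" in genotype:
        w *= 1.01  # the duplication helps only a little by itself
    if "third" in genotype:
        w *= 1.02  # change at a third locus
    if "orange" in genotype:
        if "dup" not in genotype:
            w *= 0.10  # near-lethal without the duplicate
        elif "third" not in genotype:
            w *= 0.99  # almost neutral once the duplicate is present
        else:
            w *= 1.05  # beneficial after the third-locus change
    return w


def scan_lineage(lineage, allele, fitness):
    """Selection coefficient of `allele` in each background along a lineage.

    lineage: list of (generation, frozenset of alleles) pairs.
    Returns a list of (generation, s) with s = w_mutant / w_background - 1.
    """
    results = []
    for generation, background in lineage:
        w_bg = fitness(background)
        w_mut = fitness(background | {allele})
        results.append((generation, w_mut / w_bg - 1.0))
    return results


# Illustrative backgrounds; the 10,000 generation entry is an arbitrary early point.
lineage = [
    (10000, frozenset()),
    (20000, frozenset({"dup"})),
    (32000, frozenset({"dup", "third"})),
]
scan = scan_lineage(lineage, "orange", toy_fitness)
for generation, s in scan:
    print(generation, round(s, 3))
```

With these invented numbers, the selection coefficient climbs from strongly negative, to just below zero, to positive as the background changes. That sign flip across backgrounds is the signature of epistasis we are hunting for: the same allele, judged very differently depending on what else is in the genome.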
So, we have a mutation in R1 that is near-lethal for much of the population’s history, but becomes almost acceptable with the addition of a duplicate gene, and which then goes on to become outright beneficial with a change at a third locus. What if R4 is a duplication of the R1 locus, and the redundancy provided by a second copy reduced selection on R1, helping it evolve later on? Tune in next week to find out!