Coding vs. non-coding variants
Mutations produce novelty: the new variants on which selection acts. These variants fall into two broad categories depending on where they occur in the genome: protein-coding sequences and regulatory sequences. Mutations in coding sequences produce an altered form of a protein, which affects its function to a greater or lesser degree. Such mutations can be consequential and often lead to significant changes in the organisms that carry them. Changes in regulatory sequences, on the other hand, tend to be more subtle. They alter the activity of the gene, that is, how much of the protein is made and where, rather than the protein's structure. They are therefore thought to carry a lower fitness cost, and they can affect specific traits in particular populations, individuals, or even tissues. Because a larger portion of the genome is involved in regulation than in coding for proteins, many argue that regulatory evolution is the primary generator of novelty and of possibilities for selection.
Cis vs. trans-acting
Mutations in regulatory sequences can affect nearby genes directly, or distant genes through the products of the genes they regulate. The first, localized form of regulation is called cis-acting: the mutation changes the binding site of a transcription factor or repressor that turns transcription of the nearby gene on or off. The second form is called trans-acting: the mutation modifies the activity of a protein that in turn controls another gene. The two kinds of regulators have features in common but differ in others. Trans effects tend to be shared across species and tissues. Cis regulators are often more specific and vary considerably between species, individuals, and tissues.
Evidence for selection
It is often not enough to show that a genetic variant affects gene activity. The main challenge is to establish a neutral model, one that describes what to expect in the absence of selection, and then be able to reject it. In other words, if a variant is not adaptive and is merely carried along neutrally, there is no selection to speak of. But if the variant is more prevalent than the neutral model predicts, that is evidence that it has been subject to selection. One also has to show that the effect impacts the phenotype as a whole. If the impact is negative, the variant would be selected against; if it is positive, the variant would be selected for.
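The comparison against a neutral model can be sketched with a small simulation. This is only an illustration under stated assumptions, not the method used in the studies cited here: it assumes a simple Wright-Fisher model of genetic drift, and the population size, number of generations, starting frequency, and observed frequency are all hypothetical values chosen for the example.

```python
import random


def drift_final_frequency(start_freq, pop_size, generations, rng):
    """Simulate one neutral trajectory: every generation, all 2N allele
    copies are redrawn at random from the previous generation's frequency."""
    copies = 2 * pop_size
    freq = start_freq
    for _generation in range(generations):
        count = sum(1 for _copy in range(copies) if rng.random() < freq)
        freq = count / copies
    return freq


def neutral_p_value(observed_freq, start_freq, pop_size, generations,
                    replicates=300, seed=0):
    """Fraction of neutral simulations in which the variant ends up at least
    as common as observed (one-sided, with a +1 correction so the estimate
    is never exactly zero)."""
    rng = random.Random(seed)
    finals = [
        drift_final_frequency(start_freq, pop_size, generations, rng)
        for _replicate in range(replicates)
    ]
    as_extreme = sum(1 for f in finals if f >= observed_freq)
    return (as_extreme + 1) / (replicates + 1)


# Hypothetical numbers: a variant that started at 10% and is now at 90%.
p_value = neutral_p_value(observed_freq=0.9, start_freq=0.1,
                          pop_size=50, generations=20)
print(f"p = {p_value:.3f}")
```

A small p-value here only says that drift alone, under these assumed parameters, rarely produces a frequency this high. It does not by itself show that the variant affects the phenotype.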
Researchers exploit two features of regulatory variants to build evidence for selection. One is the direction of the effect a variant has on its target. The other is the aggregate effect that many variants have on a set of genes in a particular biological pathway. A variant that is consistently associated with higher expression of a gene across species is more likely to be subject to selection than a variant that is linked only by chance to a regulatory segment near the gene. Such consistency is what one would expect if the variant is causal and has been selected for in the species with higher expression. Similarly, a set of variants may have been selected for or against if they all shift expression in the same direction, whether activation or repression. Ultimately, this hinges on comparing the directionality of these effects with what is expected by chance alone.
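One simple way to compare directionality with chance is a sign test. The sketch below assumes that, under neutrality, each variant is equally likely to raise or lower expression, and asks how surprising the observed split is. The function name and the list of effects are hypothetical, made up for illustration.

```python
from math import comb


def sign_test_p_value(directions):
    """Two-sided binomial sign test.

    `directions` holds one number per variant: positive if the variant
    raises expression, negative if it lowers it. Zeros are ignored.
    The null hypothesis is that up and down are equally likely.
    """
    nonzero = [d for d in directions if d != 0]
    n = len(nonzero)
    ups = sum(1 for d in nonzero if d > 0)
    # Probability of every possible number of "up" effects under the null.
    probs = [comb(n, k) * 0.5 ** n for k in range(n + 1)]
    observed = probs[ups]
    # Sum the probabilities of all outcomes no more likely than the observed one.
    total = sum(p for p in probs if p <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical pathway: nine variants raise expression, one lowers it.
pathway_effects = [+1, +1, +1, +1, +1, +1, +1, +1, +1, -1]
p = sign_test_p_value(pathway_effects)
print(f"{p:.4f}")  # prints 0.0215
```

An even split, such as five up and five down, gives a p-value of 1, which is the outcome expected when the directions are unrelated to selection.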
Whenever a dataset has many variables, random associations are a problem. With genetic variants, prevalence in populations and impact on phenotypes may not be obvious. Selection does not act on individual genomic locations in isolation: multiple variants sit near one another, only a minority of which are adaptive, and the whole segment containing both adaptive and nonadaptive variants passes to the next generation. So another challenge is to figure out which of these variants are causative and which are merely linked, by virtue of being nearby or for some other reason.
So far, I have presented the view that mutations in regulatory regions affect the activity of genes. These mutations can be subject to selection and, when adaptive to the organism that carries them, contribute to its evolution. Many argue that this regulatory evolution is prevalent and is a source of great diversity in phenotypes and species. Identifying instances of regulatory evolution can be challenging, especially at the genome-wide level. First, one has to identify a location that impacts gene activity. Second, one has to show that this location is subject to selection, excluding all linked locations that are not. And finally, one has to identify the direction of selection, negative or positive, acting on the trait.
A handful of organisms can be studied for this purpose. Each provides unique opportunities and challenges. Factoring in ethical, financial, and environmental constraints, one ends up with limited studies tailored to the particular subject of the experiment. The totality of evidence is, however, what matters. Even if a finding is derived from yeast or mice, it is one step closer to being investigated in humans.
Scientists can experiment with yeast in almost any way they please. The relevant experiment here uses yeast hybrids, made by crossing two different strains of the same species into a single cell. The hybrid, and the cells descended from it by division, carry both parental genomes. Each parental copy of a gene keeps its own cis-acting regulators, but both copies sit in the same cell and therefore share the regulators that act in trans. This makes it possible to distinguish between the two types of regulators. Cis-acting effects persist in the hybrid as a difference in expression between the two parental copies of a gene, while trans-acting effects show up as expression differences between the parents that are not seen between the copies in the hybrid.
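The logic of the hybrid experiment can be written as a short calculation. This is a minimal sketch of one common way to frame it, not necessarily the procedure of the papers listed below: the expression difference between the two parents is taken as the total divergence, the difference between the two parental copies inside the hybrid as the cis part, and the remainder as the trans part. The function name and the expression values are hypothetical.

```python
from math import log2


def cis_trans_split(parent_a, parent_b, hybrid_a, hybrid_b):
    """Split an expression difference between two strains into cis and trans parts.

    parent_a, parent_b: expression of the gene in each parental strain.
    hybrid_a, hybrid_b: expression of each parental copy inside the hybrid.
    All values are on a linear scale and must be positive.
    Returns (cis, trans) as log2 fold changes of strain A relative to strain B.
    """
    total = log2(parent_a / parent_b)   # divergence seen between the parents
    cis = log2(hybrid_a / hybrid_b)     # divergence that survives a shared trans environment
    trans = total - cis                 # divergence that the shared environment removes
    return cis, trans


# Hypothetical gene: 4-fold higher in strain A, but only 2-fold between copies in the hybrid.
cis, trans = cis_trans_split(parent_a=400.0, parent_b=100.0,
                             hybrid_a=200.0, hybrid_b=100.0)
print(f"cis={cis:.2f}, trans={trans:.2f}")  # prints cis=1.00, trans=1.00
```

In this made-up case, half of the log-scale difference between the parents is attributed to cis-acting changes and half to trans-acting ones. Real measurements would also need replicates and a statistical test before either component is called nonzero.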
Mouse crosses can also be very informative. For example, mating two distinct strains produces progeny whose combination of genetic variants is essentially random, and in that sense neutral. The adaptive traits carried by the parents are passed to their progeny without preference. The trait distribution in the progeny therefore serves as a neutral baseline that can be contrasted with the distribution in the parents. If a trait differs between the parental strains by more, or less, than the random reshuffling in the progeny would predict, that points to selection having acted on the trait in the parents.
None of these experimental approaches applies to wild populations, or to humans. Scientists can, however, leverage natural variation between people to learn about regulatory evolution, and population genetics approaches have been adapted to this end. Recent adaptations in humans would be detected only in particular populations. Humans also live across diverse geographies, climates, and environments, and these differences can be associated with specific regulatory variants. In addition, the availability of genomes from archaic humans presents an opportunity to compare them directly with modern populations and to study more ancient adaptations.
- Fraser, H. B., Moses, A. M., & Schadt, E. E. (2010). Evidence for widespread adaptive evolution of gene expression in budding yeast. Proceedings of the National Academy of Sciences of the United States of America, 107(7), 2977–2982. https://doi.org/10.1073/pnas.0912245107
- Fraser, H. B. (2011). Genome-wide approaches to the study of adaptive gene expression evolution: Systematic studies of evolutionary adaptations involving gene expression will allow many fundamental questions in evolutionary biology to be addressed. BioEssays, 33(6), 469–477. https://doi.org/10.1002/bies.201000094
- Villarroel, C. A., Bastías, M., Canessa, P., & Cubillos, F. A. (2021). Uncovering Divergence in Gene Expression Regulation in the Adaptation of Yeast to Nitrogen Scarcity. https://doi.org/10.1128/mSystems