1
Saubin M, Stoeckel S, Tellier A, Halkett F. Neutral genetic structuring of pathogen populations during rapid adaptation. J Hered 2025; 116:62-77. [PMID: 39114995 DOI: 10.1093/jhered/esae036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2024] [Accepted: 08/05/2024] [Indexed: 01/07/2025] Open
Abstract
Pathogen species experience strong joint demographic and selective events, especially when they adapt to a new host, for example by overcoming plant resistance. Stochasticity in the founding event and the associated demographic variations hinder our understanding of the expected evolutionary trajectories and the genetic structure emerging at both neutral and selected loci. The typical genetic signatures of such a rapid adaptation event have not been elucidated. Here, we build a demogenetic model to monitor pathogen population dynamics and genetic evolution on two host compartments (susceptible and resistant). We design our model to fit two plant pathogen life cycles, "with" and "without" host alternation. Our aim is to draw a typology of eco-evolutionary dynamics. Using time-series clustering, we identify three main scenarios: 1) small variations in the pathogen population size and small changes in genetic structure, 2) a strong founder event on the resistant host that in turn leads to the emergence of genetic structure on the susceptible host, and 3) evolutionary rescue that results in a strong founder event on the resistant host, preceded by a bottleneck on the susceptible host. We pinpoint differences between life cycles, notably more evolutionary rescue "with" host alternation. Beyond the selective event itself, the demographic trajectory imposes specific changes in the genetic structure of the pathogen population. Most of these genetic changes are transient, with a signature of resistance overcoming that vanishes within only a few years. Considering time series is therefore of utmost importance to accurately decipher pathogen evolution.
Affiliation(s)
- Méline Saubin
- Université de Lorraine, INRAE, IAM, Nancy, France
- Professorship for Population Genetics, Department of Life Science Systems, School of Life Science, Technical University of Munich, Freising, Germany
- INRAE, Université de Bordeaux, BIOGECO, F-33610, Cestas, France
- Solenn Stoeckel
- INRAE, Agrocampus Ouest, Université de Rennes, IGEPP, Le Rheu, France
- DECOD (Ecosystem Dynamics and Sustainability), INRAE, Institut Agro, IFREMER, Rennes, France
- Aurélien Tellier
- Professorship for Population Genetics, Department of Life Science Systems, School of Life Science, Technical University of Munich, Freising, Germany
2
Saubin M, Tellier A, Stoeckel S, Andrieux A, Halkett F. Approximate Bayesian Computation applied to time series of population genetic data disentangles rapid genetic changes and demographic variations in a pathogen population. Mol Ecol 2024; 33:e16965. [PMID: 37150947 DOI: 10.1111/mec.16965] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Revised: 04/04/2023] [Accepted: 04/12/2023] [Indexed: 05/09/2023]
Abstract
Adaptation can occur at remarkably short timescales in natural populations, leading to drastic changes in phenotypes and genotype frequencies over a few generations only. The inference of demographic parameters can allow understanding how evolutionary forces interact and shape the genetic trajectories of populations during rapid adaptation. Here we propose a new Approximate Bayesian Computation (ABC) framework that couples a forward and individual-based model with temporal genetic data to disentangle genetic changes and demographic variations in a case of rapid adaptation. We test the accuracy of our inferential framework and evaluate the benefit of considering a dense versus sparse sampling. Theoretical investigations demonstrate high accuracy in both model and parameter estimations, even if a strong thinning is applied to time series data. Then, we apply our ABC inferential framework to empirical data describing the population genetic changes of the poplar rust pathogen following a major event of resistance overcoming. We successfully estimate key demographic and genetic parameters, including the proportion of resistant hosts deployed in the landscape and the level of standing genetic variation from which selection occurred. Inferred values are in accordance with our empirical knowledge of this biological system. This new inferential framework, which contrasts with coalescent-based ABC analyses, is promising for a better understanding of evolutionary trajectories of populations subjected to rapid adaptation.
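The general idea of coupling a forward simulator with thinned temporal data in an ABC rejection scheme can be sketched as follows. This is a minimal illustration only, not the authors' framework: the toy haploid Wright-Fisher simulator, the uniform prior on the selection coefficient `s`, the Euclidean distance, and all function names and default values are assumptions made for the example.

```python
import random

def simulate_trajectory(s, generations, pop_size=100, p0=0.05, rng=random):
    """Toy haploid Wright-Fisher model (assumed): selection, then binomial drift."""
    p = p0
    trajectory = [p]
    for _ in range(generations):
        p_sel = p * (1 + s) / (1 + s * p)  # expected frequency after selection
        p = sum(rng.random() < p_sel for _ in range(pop_size)) / pop_size
        trajectory.append(p)
    return trajectory

def thin(trajectory, step):
    """Keep every `step`-th time point to mimic sparse temporal sampling."""
    return trajectory[::step]

def abc_rejection(observed, generations, step, n_draws=500, n_keep=25, seed=1):
    """Draw s from a uniform prior, simulate, and keep the closest draws."""
    rng = random.Random(seed)
    scored = []
    for _ in range(n_draws):
        s = rng.uniform(0.0, 1.0)  # hypothetical prior on the selection coefficient
        sim = thin(simulate_trajectory(s, generations, rng=rng), step)
        dist = sum((a - b) ** 2 for a, b in zip(sim, observed)) ** 0.5
        scored.append((dist, s))
    scored.sort()
    return [s for _, s in scored[:n_keep]]  # approximate posterior sample

if __name__ == "__main__":
    # Pseudo-observed data generated under a known value of s, then thinned.
    truth = simulate_trajectory(0.4, 20, rng=random.Random(42))
    posterior = abc_rejection(thin(truth, 5), generations=20, step=5)
    print(min(posterior), max(posterior))
```

Changing `step` is one simple way to compare dense and sparse sampling of the same time series within this toy setting.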
Affiliation(s)
- Méline Saubin
- Université de Lorraine, INRAE, IAM, Nancy, France
- Department for Life Science Systems, Technical University of Munich, Freising, Germany
- Aurélien Tellier
- Department for Life Science Systems, Technical University of Munich, Freising, Germany
- Solenn Stoeckel
- INRAE, Agrocampus Ouest, Université de Rennes, IGEPP, Le Rheu, France
3
Harris M, Kim BY, Garud N. Enrichment of hard sweeps on the X chromosome compared to autosomes in six Drosophila species. Genetics 2024; 226:iyae019. [PMID: 38366786 PMCID: PMC10990427 DOI: 10.1093/genetics/iyae019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2023] [Revised: 01/17/2024] [Accepted: 01/18/2024] [Indexed: 02/18/2024] Open
Abstract
The X chromosome, being hemizygous in males, is exposed one-third of the time increasing the visibility of new mutations to natural selection, potentially leading to different evolutionary dynamics than autosomes. Recently, we found an enrichment of hard selective sweeps over soft selective sweeps on the X chromosome relative to the autosomes in a North American population of Drosophila melanogaster. To understand whether this enrichment is a universal feature of evolution on the X chromosome, we analyze diversity patterns across 6 commonly studied Drosophila species. We find an increased proportion of regions with steep reductions in diversity and elevated homozygosity on the X chromosome compared to autosomes. To assess if these signatures are consistent with positive selection, we simulate a wide variety of evolutionary scenarios spanning variations in demography, mutation rate, recombination rate, background selection, hard sweeps, and soft sweeps and find that the diversity patterns observed on the X are most consistent with hard sweeps. Our findings highlight the importance of sex chromosomes in driving evolutionary processes and suggest that hard sweeps have played a significant role in shaping diversity patterns on the X chromosome across multiple Drosophila species.
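One common way to quantify "elevated homozygosity" and to contrast hard with soft sweeps is through haplotype homozygosity statistics. The sketch below computes H1, H12, and H2/H1 from a list of haplotype labels; the abstract does not state which statistics were used, so the choice of these three and the function name are assumptions made for illustration.

```python
from collections import Counter

def haplotype_homozygosity(haplotypes):
    """Return H1, H12 and H2/H1 for a sample of haplotype labels."""
    n = len(haplotypes)
    freqs = sorted((c / n for c in Counter(haplotypes).values()), reverse=True)
    p1 = freqs[0]
    p2 = freqs[1] if len(freqs) > 1 else 0.0
    h1 = sum(p * p for p in freqs)   # expected haplotype homozygosity
    h12 = h1 + 2 * p1 * p2           # two most common haplotypes pooled
    h2 = h1 - p1 * p1                # homozygosity without the top haplotype
    return {"H1": h1, "H12": h12, "H2/H1": h2 / h1}

if __name__ == "__main__":
    hard_like = ["A"] * 9 + ["B"]              # one dominant haplotype
    soft_like = ["A"] * 5 + ["B"] * 4 + ["C"]  # two common haplotypes
    print(haplotype_homozygosity(hard_like))
    print(haplotype_homozygosity(soft_like))
```

In this toy sample, both windows show high H12, but H2/H1 is much lower when a single haplotype dominates, which is the pattern usually associated with a hard sweep.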
Affiliation(s)
- Mariana Harris
- Department of Computational Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
- Bernard Y Kim
- Department of Biology, Stanford University, Stanford, CA 94305, USA
- Nandita Garud
- Department of Ecology and Evolutionary Biology, University of California Los Angeles, Los Angeles, CA 90095, USA
- Department of Human Genetics, University of California Los Angeles, Los Angeles, CA 90095, USA
4
Romero EV, Feder AF. Elevated HIV Viral Load is Associated with Higher Recombination Rate In Vivo. Mol Biol Evol 2024; 41:msad260. [PMID: 38197289 PMCID: PMC10777272 DOI: 10.1093/molbev/msad260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Revised: 11/21/2023] [Accepted: 11/27/2023] [Indexed: 01/11/2024] Open
Abstract
HIV's exceptionally high recombination rate drives its intrahost diversification, enabling immune escape and multidrug resistance within people living with HIV. While we know that HIV's recombination rate varies by genomic position, we have little understanding of how recombination varies throughout infection or between individuals as a function of the rate of cellular coinfection. We hypothesize that denser intrahost populations may have higher rates of coinfection and therefore recombination. To test this hypothesis, we develop a new approach (recombination analysis via time series linkage decay or RATS-LD) to quantify recombination using autocorrelation of linkage between mutations across time points. We validate RATS-LD on simulated data under short read sequencing conditions and then apply it to longitudinal, high-throughput intrahost viral sequencing data, stratifying populations by viral load (a proxy for density). Among sampled viral populations with the lowest viral loads (<26,800 copies/mL), we estimate a recombination rate of 1.5×10⁻⁵ events/bp/generation (95% CI: 7×10⁻⁶ to 2.9×10⁻⁵), similar to existing estimates. However, among samples with the highest viral loads (>82,000 copies/mL), our median estimate is approximately 6 times higher. In addition to co-varying across individuals, we also find that recombination rate and viral load are associated within single individuals across different time points. Our findings suggest that rather than acting as a constant, uniform force, recombination can vary dynamically and drastically across intrahost viral populations and within them over time. More broadly, we hypothesize that this phenomenon may affect other facultatively asexual populations where spatial co-localization varies.
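The link between linkage decay over time and recombination rate can be illustrated with the textbook expectation that linkage disequilibrium between two loci decays as D_t = D_0 (1 - r)^t, where r is the per-generation recombination probability between them. The sketch below recovers r from a noiseless time series of D values. It is a simplified illustration under that assumption, not the RATS-LD method itself; the function names and example values are hypothetical.

```python
import math

def expected_ld(d0, r, generations):
    """Expected LD after `generations`, assuming D_t = D_0 * (1 - r)**t."""
    return d0 * (1.0 - r) ** generations

def estimate_recombination_rate(times, ld_values):
    """Fit -log(D_t / D_0) = -t * log(1 - r) by least squares through the origin."""
    t0, d0 = times[0], ld_values[0]
    xs = [t - t0 for t in times[1:]]
    ys = [-math.log(d / d0) for d in ld_values[1:]]
    slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return 1.0 - math.exp(-slope)

if __name__ == "__main__":
    # Hypothetical pair of sites with r = 1e-3 per generation, sampled four times.
    times = [0, 50, 100, 150]
    ld = [expected_ld(0.2, 1e-3, t) for t in times]
    print(estimate_recombination_rate(times, ld))
```

With real sequencing data the LD values would be noisy and would need to be pooled across many pairs of sites, so this fit should be read only as the idealized core of the calculation.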
Affiliation(s)
- Elena V Romero
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Alison F Feder
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Herbold Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA, USA
5
Harris M, Kim B, Garud N. Enrichment of hard sweeps on the X chromosome compared to autosomes in six Drosophila species. bioRxiv 2023:2023.06.21.545888. [PMID: 38106201 PMCID: PMC10723260 DOI: 10.1101/2023.06.21.545888] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
The X chromosome, being hemizygous in males, is exposed one third of the time increasing the visibility of new mutations to natural selection, potentially leading to different evolutionary dynamics than autosomes. Recently, we found an enrichment of hard selective sweeps over soft selective sweeps on the X chromosome relative to the autosomes in a North American population of Drosophila melanogaster. To understand whether this enrichment is a universal feature of evolution on the X chromosome, we analyze diversity patterns across six commonly studied Drosophila species. We find an increased proportion of regions with steep reductions in diversity and elevated homozygosity on the X chromosome compared to autosomes. To assess if these signatures are consistent with positive selection, we simulate a wide variety of evolutionary scenarios spanning variations in demography, mutation rate, recombination rate, background selection, hard sweeps, and soft sweeps, and find that the diversity patterns observed on the X are most consistent with hard sweeps. Our findings highlight the importance of sex chromosomes in driving evolutionary processes and suggest that hard sweeps have played a significant role in shaping diversity patterns on the X chromosome across multiple Drosophila species.
Affiliation(s)
- Mariana Harris
- Department of Computational Medicine, University of California Los Angeles, Los Angeles, California, United States of America
- Bernard Kim
- Department of Biology, Stanford University, Stanford, California, United States of America
- Nandita Garud
- Ecology and Evolutionary Biology, University of California Los Angeles, Los Angeles, California, United States of America
- Department of Human Genetics, University of California, Los Angeles, California, United States of America
6
Schrider DR. Allelic gene conversion softens selective sweeps. bioRxiv 2023:2023.12.05.570141. [PMID: 38106127 PMCID: PMC10723294 DOI: 10.1101/2023.12.05.570141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
The prominence of positive selection, in which beneficial mutations are favored by natural selection and rapidly increase in frequency, is a subject of intense debate. Positive selection can result in selective sweeps, in which the haplotype(s) bearing the adaptive allele "sweep" through the population, thereby removing much of the genetic diversity from the region surrounding the target of selection. Two models of selective sweeps have been proposed: classical sweeps, or "hard sweeps", in which a single copy of the adaptive allele sweeps to fixation, and "soft sweeps", in which multiple distinct copies of the adaptive allele leave descendants after the sweep. Soft sweeps can be the outcome of recurrent mutation to the adaptive allele, or the presence of standing genetic variation consisting of multiple copies of the adaptive allele prior to the onset of selection. Importantly, soft sweeps will be common when populations can rapidly adapt to novel selective pressures, either because of a high mutation rate or because adaptive alleles are already present. The prevalence of soft sweeps is especially controversial, and it has been noted that selection on standing variation or recurrent mutations may not always produce soft sweeps. Here, we show that the inverse is true: selection on single-origin de novo mutations may often result in an outcome that is indistinguishable from a soft sweep. This is made possible by allelic gene conversion, which "softens" hard sweeps by copying the adaptive allele onto multiple genetic backgrounds, a process we refer to as a "pseudo-soft" sweep. We carried out a simulation study examining the impact of gene conversion on sweeps from a single de novo variant in models of human, Drosophila, and Arabidopsis populations. The fraction of simulations in which gene conversion had produced multiple haplotypes with the adaptive allele upon fixation was appreciable. 
Indeed, under realistic demographic histories and gene conversion rates, even if selection always acts on a single-origin mutation, sweeps involving multiple haplotypes are more likely than hard sweeps in large populations, especially when selection is not extremely strong. Thus, even when the mutation rate is low or there is no standing variation, hard sweeps are expected to be the exception rather than the rule in large populations. These results also imply that the presence of signatures of soft sweeps does not necessarily mean that adaptation has been especially rapid or is not mutation limited.
Affiliation(s)
- Daniel R Schrider
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599
7
Amin MR, Hasan M, Arnab SP, DeGiorgio M. Tensor Decomposition-based Feature Extraction and Classification to Detect Natural Selection from Genomic Data. Mol Biol Evol 2023; 40:msad216. [PMID: 37772983 PMCID: PMC10581699 DOI: 10.1093/molbev/msad216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Revised: 08/10/2023] [Accepted: 09/14/2023] [Indexed: 09/30/2023] Open
Abstract
Inferences of adaptive events are important for learning about traits, such as human digestion of lactose after infancy and the rapid spread of viral variants. Early efforts toward identifying footprints of natural selection from genomic data involved development of summary statistic and likelihood methods. However, such techniques are grounded in simple patterns or theoretical models that limit the complexity of settings they can explore. Due to the renaissance in artificial intelligence, machine learning methods have taken center stage in recent efforts to detect natural selection, with strategies such as convolutional neural networks applied to images of haplotypes. Yet, limitations of such techniques include estimation of large numbers of model parameters under nonconvex settings and feature identification without regard to location within an image. An alternative approach is to use tensor decomposition to extract features from multidimensional data while preserving the latent structure of the data, and to feed these features to machine learning models. Here, we adopt this framework and present a novel approach termed T-REx, which extracts features from images of haplotypes across sampled individuals using tensor decomposition, and then makes predictions from these features using classical machine learning methods. As a proof of concept, we explore the performance of T-REx on simulated neutral and selective sweep scenarios and find that it has high power and accuracy to discriminate sweeps from neutrality, robustness to common technical hurdles, and easy visualization of feature importance. Therefore, T-REx is a powerful addition to the toolkit for detecting adaptive processes from genomic data.
Affiliation(s)
- Md Ruhul Amin
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Mahmudul Hasan
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Sandipan Paul Arnab
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Michael DeGiorgio
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
8
Whitehouse LS, Schrider DR. Timesweeper: accurately identifying selective sweeps using population genomic time series. Genetics 2023; 224:iyad084. [PMID: 37157914 PMCID: PMC10324941 DOI: 10.1093/genetics/iyad084] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/25/2022] [Revised: 07/25/2022] [Accepted: 04/25/2023] [Indexed: 05/10/2023] Open
Abstract
Despite decades of research, identifying selective sweeps, the genomic footprints of positive selection, remains a core problem in population genetics. Of the myriad methods that have been developed to tackle this task, few are designed to leverage the potential of genomic time-series data. This is because in most population genetic studies of natural populations, only a single period of time can be sampled. Recent advancements in sequencing technology, including improvements in extracting and sequencing ancient DNA, have made repeated samplings of a population possible, allowing for more direct analysis of recent evolutionary dynamics. Serial sampling of organisms with shorter generation times has also become more feasible due to improvements in the cost and throughput of sequencing. With these advances in mind, here we present Timesweeper, a fast and accurate convolutional neural network-based tool for identifying selective sweeps in data consisting of multiple genomic samplings of a population over time. Timesweeper analyzes population genomic time-series data by first simulating training data under a demographic model appropriate for the data of interest, training a one-dimensional convolutional neural network on said simulations, and inferring which polymorphisms in this serialized data set were the direct target of a completed or ongoing selective sweep. We show that Timesweeper is accurate under multiple simulated demographic and sampling scenarios, identifies selected variants with high resolution, and estimates selection coefficients more accurately than existing methods. 
In sum, we show that more accurate inferences about natural selection are possible when genomic time-series data are available; such data will continue to proliferate in coming years due to both the sequencing of ancient samples and repeated samplings of extant populations with faster generation times, as well as experimentally evolved populations where time-series data are often generated. Methodological advances such as Timesweeper thus have the potential to help resolve the controversy over the role of positive selection in the genome. We provide Timesweeper as a Python package for use by the community.
Affiliation(s)
- Logan S Whitehouse
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27514, USA
- Daniel R Schrider
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27514, USA
9
Popović M, Nuskern L, Peranić K, Vuković R, Katanić Z, Krstin L, Ćurković-Perica M, Leigh DM, Poljak I, Idžojtić M, Rigling D, Ježić M. Physiological variations in hypovirus-infected wild and model long-term laboratory strains of Cryphonectria parasitica. Front Microbiol 2023; 14:1192996. [PMID: 37426020 PMCID: PMC10324583 DOI: 10.3389/fmicb.2023.1192996] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/24/2023] [Accepted: 05/25/2023] [Indexed: 07/11/2023] Open
Abstract
Introduction: Forest ecosystems are highly threatened by the simultaneous effects of climate change and invasive pathogens. Chestnut blight, caused by the invasive phytopathogenic fungus Cryphonectria parasitica, has caused severe damage to European chestnut groves and catastrophic dieback of American chestnut in North America. Within Europe, the impacts of the fungus are widely mitigated through biological control that utilizes the RNA mycovirus Cryphonectria hypovirus 1 (CHV1). Viral infections, similarly to abiotic factors, can cause oxidative stress in their hosts leading to physiological attrition through stimulating ROS (reactive oxygen species) and NOx production.
Methods: To fully understand the interactions leading to the biocontrol of chestnut blight, it is vital to determine oxidative stress damage arising during CHV1 infection, especially considering that other abiotic factors, like long-term cultivation of model fungal strains, can also impact oxidative stress. Our study compared CHV1-infected C. parasitica isolates from two Croatian wild populations with CHV1-infected model strains (EP713, Euro7 and CR23) that have experienced long-term laboratory cultivation.
Results and Discussion: We determined the level of oxidative stress in the samples by measuring stress enzymes' activity and oxidative stress biomarkers. Furthermore, for the wild populations, we studied the activity of fungal laccases, expression of the laccase gene lac1, and a possible effect of CHV1 intra-host diversity on the observed biochemical responses. Relative to the wild isolates, the long-term model strains had lower enzymatic activities of superoxide dismutase (SOD) and glutathione S-transferase (GST), and higher content of malondialdehyde (MDA) and total non-protein thiols. This indicated generally higher oxidative stress, likely arising from their decades-long history of subculturing and freeze-thaw cycles. When comparing the two wild populations, differences between them in stress resilience and levels of oxidative stress were also observed, as evident from the different MDA content. The intra-host genetic diversity of the CHV1 had no discernible effect on the stress levels of the virus-infected fungal cultures. Our research indicated that an important determinant modulating both lac1 expression and laccase enzyme activity is intrinsic to the fungus itself, possibly related to the vc type of the fungus, i.e., vegetative incompatibility genotype.
Affiliation(s)
- Maja Popović
- Department of Biology, Faculty of Science, University of Zagreb, Zagreb, Croatia
- Lucija Nuskern
- Department of Biology, Faculty of Science, University of Zagreb, Zagreb, Croatia
- Karla Peranić
- Department of Biology, Faculty of Science, University of Zagreb, Zagreb, Croatia
- Rosemary Vuković
- Department of Biology, Josip Juraj Strossmayer University of Osijek, Osijek, Croatia
- Zorana Katanić
- Department of Biology, Josip Juraj Strossmayer University of Osijek, Osijek, Croatia
- Ljiljana Krstin
- Department of Biology, Josip Juraj Strossmayer University of Osijek, Osijek, Croatia
- Igor Poljak
- Faculty of Forestry and Wood Technology, University of Zagreb, Zagreb, Croatia
- Marilena Idžojtić
- Faculty of Forestry and Wood Technology, University of Zagreb, Zagreb, Croatia
- Daniel Rigling
- Swiss Federal Research Institute WSL, Birmensdorf, Switzerland
- Marin Ježić
- Department of Biology, Faculty of Science, University of Zagreb, Zagreb, Croatia
10
Caspi I, Meir M, Ben Nun N, Abu Rass R, Yakhini U, Stern A, Ram Y. Mutation rate, selection, and epistasis inferred from RNA virus haplotypes via neural posterior estimation. Virus Evol 2023; 9:vead033. [PMID: 37305706 PMCID: PMC10256221 DOI: 10.1093/ve/vead033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Revised: 04/30/2023] [Accepted: 05/16/2023] [Indexed: 06/13/2023] Open
Abstract
RNA viruses are particularly notorious for their high levels of genetic diversity, which is generated through the forces of mutation and natural selection. However, disentangling these two forces is a considerable challenge, and this may lead to widely divergent estimates of viral mutation rates, as well as difficulties in inferring the fitness effects of mutations. Here, we develop, test, and apply an approach aimed at inferring the mutation rate and key parameters that govern natural selection, from haplotype sequences covering full-length genomes of an evolving virus population. Our approach employs neural posterior estimation, a computational technique that applies simulation-based inference with neural networks to jointly infer multiple model parameters. We first tested our approach on synthetic data simulated using different mutation rates and selection parameters while accounting for sequencing errors. Reassuringly, the inferred parameter estimates were accurate and unbiased. We then applied our approach to haplotype sequencing data from a serial passaging experiment with the MS2 bacteriophage, a virus that parasitizes Escherichia coli. We estimated that the mutation rate of this phage is around 0.2 mutations per genome per replication cycle (95% highest density interval: 0.051-0.56). We validated this finding with two different approaches based on single-locus models that gave similar estimates but with much broader posterior distributions. Furthermore, we found evidence for reciprocal sign epistasis between four strongly beneficial mutations that all reside in an RNA stem loop that controls the expression of the viral lysis protein, responsible for lysing host cells and viral egress. We surmise that there is a fine balance between over- and underexpression of lysis that leads to this pattern of epistasis. To recap, we have developed an approach for joint inference of the mutation rate and selection parameters from full haplotype data with sequencing errors and used it to reveal features governing MS2 evolution.
Affiliation(s)
- Itamar Caspi
- Shmunis School of Biomedicine and Cancer Research, Faculty of Life Sciences, Tel Aviv University, P.O. Box 39040, Tel Aviv 6997801, Israel
- Moran Meir
- Shmunis School of Biomedicine and Cancer Research, Faculty of Life Sciences, Tel Aviv University, P.O. Box 39040, Tel Aviv 6997801, Israel
- Nadav Ben Nun
- Edmond J. Safra Center for Bioinformatics, Tel Aviv University, P.O. Box 39040, Tel Aviv 6997801, Israel
- School of Zoology, Faculty of Life Sciences, Tel Aviv University, P.O. Box 39040, Tel Aviv 6997801, Israel
- Uri Yakhini
- Shmunis School of Biomedicine and Cancer Research, Faculty of Life Sciences, Tel Aviv University, P.O. Box 39040, Tel Aviv 6997801, Israel
- Edmond J. Safra Center for Bioinformatics, Tel Aviv University, P.O. Box 39040, Tel Aviv 6997801, Israel
- Yoav Ram
11
Amin MR, Hasan M, Arnab SP, DeGiorgio M. Tensor decomposition based feature extraction and classification to detect natural selection from genomic data. bioRxiv 2023:2023.03.27.527731. [PMID: 37034767 PMCID: PMC10081272 DOI: 10.1101/2023.03.27.527731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Inferences of adaptive events are important for learning about traits, such as human digestion of lactose after infancy and the rapid spread of viral variants. Early efforts toward identifying footprints of natural selection from genomic data involved development of summary statistic and likelihood methods. However, such techniques are grounded in simple patterns or theoretical models that limit the complexity of settings they can explore. Due to the renaissance in artificial intelligence, machine learning methods have taken center stage in recent efforts to detect natural selection, with strategies such as convolutional neural networks applied to images of haplotypes. Yet, limitations of such techniques include estimation of large numbers of model parameters under non-convex settings and feature identification without regard to location within an image. An alternative approach is to use tensor decomposition to extract features from multidimensional data while preserving the latent structure of the data, and to feed these features to machine learning models. Here, we adopt this framework and present a novel approach termed T-REx, which extracts features from images of haplotypes across sampled individuals using tensor decomposition, and then makes predictions from these features using classical machine learning methods. As a proof of concept, we explore the performance of T-REx on simulated neutral and selective sweep scenarios and find that it has high power and accuracy to discriminate sweeps from neutrality, robustness to common technical hurdles, and easy visualization of feature importance. Therefore, T-REx is a powerful addition to the toolkit for detecting adaptive processes from genomic data.
12
Wortel MT, Agashe D, Bailey SF, Bank C, Bisschop K, Blankers T, Cairns J, Colizzi ES, Cusseddu D, Desai MM, van Dijk B, Egas M, Ellers J, Groot AT, Heckel DG, Johnson ML, Kraaijeveld K, Krug J, Laan L, Lässig M, Lind PA, Meijer J, Noble LM, Okasha S, Rainey PB, Rozen DE, Shitut S, Tans SJ, Tenaillon O, Teotónio H, de Visser JAGM, Visser ME, Vroomans RMA, Werner GDA, Wertheim B, Pennings PS. Towards evolutionary predictions: Current promises and challenges. Evol Appl 2023; 16:3-21. [PMID: 36699126 PMCID: PMC9850016 DOI: 10.1111/eva.13513] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 02/09/2022] [Revised: 11/11/2022] [Accepted: 11/14/2022] [Indexed: 12/14/2022] Open
Abstract
Evolution has traditionally been a historical and descriptive science, and predicting future evolutionary processes has long been considered impossible. However, evolutionary predictions are increasingly being developed and used in medicine, agriculture, biotechnology and conservation biology. Evolutionary predictions may be used for different purposes, such as to prepare for the future, to try and change the course of evolution or to determine how well we understand evolutionary processes. Similarly, the exact aspect of the evolved population that we want to predict may also differ. For example, we could try to predict which genotype will dominate, the fitness of the population or the extinction probability of a population. In addition, there are many uses of evolutionary predictions that may not always be recognized as such. The main goal of this review is to increase awareness of methods and data in different research fields by showing the breadth of situations in which evolutionary predictions are made. We describe how diverse evolutionary predictions share a common structure described by the predictive scope, time scale and precision. Then, by using examples ranging from SARS-CoV2 and influenza to CRISPR-based gene drives and sustainable product formation in biotechnology, we discuss the methods for predicting evolution, the factors that affect predictability and how predictions can be used to prevent evolution in undesirable directions or to promote beneficial evolution (i.e. evolutionary control). We hope that this review will stimulate collaboration between fields by establishing a common language for evolutionary predictions.
Affiliation(s)
- Meike T. Wortel
- Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, The Netherlands
- Deepa Agashe
- National Centre for Biological Sciences, Bangalore, India
- Claudia Bank
- Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Gulbenkian Science Institute, Oeiras, Portugal
- Karen Bisschop
- Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, The Netherlands
- Origins Center, Groningen, The Netherlands
- Laboratory of Aquatic Biology, KU Leuven Kulak, Kortrijk, Belgium
- Thomas Blankers
- Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, The Netherlands
- Origins Center, Groningen, The Netherlands
- Enrico Sandro Colizzi
- Origins Center, Groningen, The Netherlands
- Mathematical Institute, Leiden University, Leiden, The Netherlands
- Bram van Dijk
- Max Planck Institute for Evolutionary Biology, Plön, Germany
- Martijn Egas
- Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, The Netherlands
- Jacintha Ellers
- Department of Ecological Science, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Astrid T. Groot
- Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, The Netherlands
- Ken Kraaijeveld
- Leiden Centre for Applied Bioscience, University of Applied Sciences Leiden, Leiden, The Netherlands
- Joachim Krug
- Institute for Biological Physics, University of Cologne, Cologne, Germany
- Liedewij Laan
- Department of Bionanoscience, Kavli Institute of Nanoscience, TU Delft, Delft, The Netherlands
- Michael Lässig
- Institute for Biological Physics, University of Cologne, Cologne, Germany
- Peter A. Lind
- Department Molecular Biology, Umeå University, Umeå, Sweden
- Jeroen Meijer
- Theoretical Biology and Bioinformatics, Department of Biology, Utrecht University, Utrecht, The Netherlands
- Luke M. Noble
- Institute de Biologie, École Normale Supérieure, CNRS, Inserm, Paris, France
- Paul B. Rainey
- Department of Microbial Population Biology, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Laboratoire Biophysique et Évolution, CBI, ESPCI Paris, Université PSL, CNRS, Paris, France
- Daniel E. Rozen
- Institute of Biology, Leiden University, Leiden, The Netherlands
- Shraddha Shitut
- Origins Center, Groningen, The Netherlands
- Institute of Biology, Leiden University, Leiden, The Netherlands
- Marcel E. Visser
- Department of Animal Ecology, Netherlands Institute of Ecology (NIOO-KNAW), Wageningen, The Netherlands
- Renske M. A. Vroomans
- Origins Center, Groningen, The Netherlands
- Informatics Institute, University of Amsterdam, Amsterdam, The Netherlands
- Bregje Wertheim
- Groningen Institute for Evolutionary Life Sciences, University of Groningen, Groningen, The Netherlands
|
13
|
Declines in prevalence alter the optimal level of sexual investment for the malaria parasite Plasmodium falciparum. Proc Natl Acad Sci U S A 2022; 119:e2122165119. [PMID: 35867831 PMCID: PMC9335338 DOI: 10.1073/pnas.2122165119] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/30/2022] Open
Abstract
Like most human pathogens, the malaria parasite Plasmodium falciparum experiences strong selection pressure from public health interventions such as drug treatment. While most commonly studied in the context of drug targets and related pathways, parasite adaptation to control measures likely extends to phenotypes beyond drug resistance. Here, we use modeling to explore how control measures can reduce levels of within-host competition between P. falciparum genotypes and favor higher rates of sexual investment. We validate these predictions with longitudinally sampled genomic data from French Guiana during a period of malaria decline and find that the most strongly selected genes are enriched for transcription factors involved in commitment to and development of the parasite’s sexual gametocyte form.

Successful infectious disease interventions can result in large reductions in parasite prevalence. Such demographic change has fitness implications for individual parasites and may shift the parasite’s optimal life history strategy. Here, we explore whether declining infection rates can alter Plasmodium falciparum’s investment in sexual versus asexual growth. Using a multiscale mathematical model, we demonstrate how the proportion of polyclonal infections, which decreases as parasite prevalence declines, affects the optimal sexual development strategy: Within-host competition in multiclone infections favors a greater investment in asexual growth whereas single-clone infections benefit from higher conversion to sexual forms. At the same time, drug treatment also imposes selection pressure on sexual development by shortening infection length and reducing within-host competition. We assess these models using 148 P. falciparum parasite genomes sampled in French Guiana over an 18-y period of intensive intervention (1998 to 2015). During this time frame, multiple public health measures, including the introduction of new drugs and expanded rapid diagnostic testing, were implemented, reducing P. falciparum malaria cases by an order of magnitude. Consistent with this prevalence decline, we see an increase in the relatedness among parasites, but no single clonal background grew to dominate the population. Analyzing individual allele frequency trajectories, we identify genes that likely experienced selective sweeps. Supporting our model predictions, genes showing the strongest signatures of selection include transcription factors involved in the development of P. falciparum’s sexual gametocyte form. These results highlight how public health interventions impose wide-ranging selection pressures that affect basic parasite life history traits.
|
14
|
LaMont C, Otwinowski J, Vanshylla K, Gruell H, Klein F, Nourmohammad A. Design of an optimal combination therapy with broadly neutralizing antibodies to suppress HIV-1. eLife 2022; 11:76004. [PMID: 35852143 PMCID: PMC9467514 DOI: 10.7554/elife.76004] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 12/01/2021] [Accepted: 07/04/2022] [Indexed: 11/13/2022] Open
Abstract
Infusion of broadly neutralizing antibodies (bNAbs) has shown promise as an alternative to anti-retroviral therapy against HIV. A key challenge is to suppress viral escape, which is more effectively achieved with a combination of bNAbs. Here, we propose a computational approach to predict the efficacy of a bNAb therapy based on the population genetics of HIV escape, which we parametrize using high-throughput HIV sequence data from bNAb-naive patients. By quantifying the mutational target size and the fitness cost of HIV-1 escape from bNAbs, we predict the distribution of rebound times in three clinical trials. We show that a cocktail of three bNAbs is necessary to effectively suppress viral escape, and predict the optimal composition of such a bNAb cocktail. Our results offer a rational therapy design for HIV, and show how genetic data can be used to predict treatment outcomes and design new approaches to pathogenic control.
Affiliation(s)
- Colin LaMont
- Max Planck Institute for Dynamics and Self-Organization
|
15
|
Chen DW, Garud NR. Rapid evolution and strain turnover in the infant gut microbiome. Genome Res 2022; 32:1124-1136. [PMID: 35545448 PMCID: PMC9248880 DOI: 10.1101/gr.276306.121] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 10/19/2021] [Accepted: 05/06/2022] [Indexed: 11/25/2022]
Abstract
Although the ecological dynamics of the infant gut microbiome have been intensely studied, relatively little is known about evolutionary dynamics in the infant gut microbiome. Here we analyze longitudinal fecal metagenomic data from more than 700 infants and their mothers over the first year of life and find that the evolutionary dynamics in infant gut microbiomes are distinct from those of adults. We find evidence for more than a 10-fold increase in the rate of evolution and strain turnover in the infant gut compared with healthy adults, with the mother-infant transition at delivery being a particularly dynamic period in which gene loss dominates. Within a few months after birth, these dynamics stabilize, and gene gains become increasingly frequent as the microbiome matures. We furthermore find that evolutionary changes in infants show signatures of being seeded by a mixture of de novo mutations and transmissions of pre-evolved lineages from the broader family. Several of these evolutionary changes occur in parallel across infants, highlighting candidate genes that may play important roles in the development of the infant gut microbiome. Our results point to a picture of a volatile infant gut microbiome characterized by rapid evolutionary and ecological change in the early days of life.
Affiliation(s)
- Daisy W Chen
- Computational and Systems Biology, University of California, Los Angeles, California 90095-1606, USA
- Bioinformatics and Systems Biology Program, University of California, San Diego, California 92093, USA
- Nandita R Garud
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California 90095-1606, USA
- Department of Human Genetics, University of California, Los Angeles, California 90095-1606, USA
|
16
|
Johri P, Stephan W, Jensen JD. Soft selective sweeps: Addressing new definitions, evaluating competing models, and interpreting empirical outliers. PLoS Genet 2022; 18:e1010022. [PMID: 35202407 PMCID: PMC8870509 DOI: 10.1371/journal.pgen.1010022] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 02/07/2023] Open
Abstract
The ability to accurately identify and quantify genetic signatures associated with soft selective sweeps based on patterns of nucleotide variation has remained controversial. We here provide counter viewpoints to recent publications in PLOS Genetics that have argued not only for the statistical identifiability of soft selective sweeps, but also for their pervasive evolutionary role in both Drosophila and HIV populations. We present evidence that these claims owe to a lack of consideration of competing evolutionary models, unjustified interpretations of empirical outliers, as well as to new definitions of the processes themselves. Our results highlight the dangers of fitting evolutionary models based on hypothesized and episodic processes without properly first considering common processes and, more generally, of the tendency in certain research areas to view pervasive positive selection as a foregone conclusion.
Affiliation(s)
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America
- Jeffrey D. Jensen
- School of Life Sciences, Arizona State University, Tempe, Arizona, United States of America
|
17
|
Leigh DM, Peranić K, Prospero S, Cornejo C, Ćurković-Perica M, Kupper Q, Nuskern L, Rigling D, Ježić M. Long-read sequencing reveals the evolutionary drivers of intra-host diversity across natural RNA mycovirus infections. Virus Evol 2021; 7:veab101. [PMID: 35299787 PMCID: PMC8923234 DOI: 10.1093/ve/veab101] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/10/2021] [Revised: 11/23/2021] [Accepted: 12/01/2021] [Indexed: 01/05/2023] Open
Abstract
Intra-host dynamics are a core component of virus evolution but most intra-host data come from a narrow range of hosts or experimental infections. Gaining broader information on the intra-host diversity and dynamics of naturally occurring virus infections is essential to our understanding of evolution across the virosphere. Here we used PacBio long-read HiFi sequencing to characterize the intra-host populations of natural infections of the RNA mycovirus Cryphonectria hypovirus 1 (CHV1). CHV1 is a biocontrol agent for the chestnut blight fungus (Cryphonectria parasitica), which co-invaded Europe alongside the fungus. We characterized the mutational and haplotypic intra-host virus diversity of thirty-eight natural CHV1 infections spread across four locations in Croatia and Switzerland. Intra-host CHV1 diversity values were shaped by purifying selection and accumulation of mutations over time as well as epistatic interactions within the host genome at defense loci. Geographical landscape features impacted CHV1 inter-host relationships through restricting dispersal and causing founder effects. Interestingly, a small number of intra-host viral haplotypes showed high sequence similarity across large geographical distances unlikely to be linked by dispersal.
Affiliation(s)
- Deborah M Leigh
- Phytopathology, Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf CH-8903, Switzerland
- Karla Peranić
- Faculty of Science, University of Zagreb, Zagreb, Grad Zagreb 10000, Croatia
- Simone Prospero
- Phytopathology, Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf CH-8903, Switzerland
- Carolina Cornejo
- Phytopathology, Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf CH-8903, Switzerland
- Lucija Nuskern
- Faculty of Science, University of Zagreb, Zagreb, Grad Zagreb 10000, Croatia
- Daniel Rigling
- Phytopathology, Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf CH-8903, Switzerland
- Marin Ježić
- Faculty of Science, University of Zagreb, Zagreb, Grad Zagreb 10000, Croatia
|
18
|
Stephan W. The classical hitchhiking model with continuous mutational pressure and purifying selection. Ecol Evol 2021; 11:15896-15904. [PMID: 34824798 PMCID: PMC8601925 DOI: 10.1002/ece3.8259] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2021] [Revised: 08/24/2021] [Accepted: 10/08/2021] [Indexed: 11/14/2022] Open
Abstract
Detecting selective sweeps driven by strong positive selection and localizing the targets of selection in the genome play a major role in modern population genetics and genomics. Most of these analyses are based on the classical model of genetic hitchhiking proposed by Maynard Smith and Haigh (1974, Genetical Research, 23, 23). Here, we consider extensions of the classical two-locus model. Introducing mutation at the strongly selected site, we analyze the conditions under which soft sweeps may arise. We identify a new parameter (the ratio of the beneficial mutation rate to the selection coefficient) that characterizes the occurrence of multiple-origin soft sweeps. Furthermore, we quantify the hitchhiking effect when the polymorphism at the linked locus is not neutral but maintained in a mutation-selection balance. In this case, we find a smaller relative reduction of heterozygosity at the linked site than for a neutral polymorphism. In our analysis, we use a semi-deterministic approach; i.e., we analyze the frequency process of the beneficial allele in an infinitely large population when its frequency is above a certain threshold; however, for very small frequencies in the initial phase after the onset of selection we rely on diffusion theory.
Affiliation(s)
- Wolfgang Stephan
- Leibniz-Institute for Evolution and Biodiversity Science, Natural History Museum, Berlin, Germany
|
19
|
Feder AF, Harper KN, Brumme CJ, Pennings PS. Understanding patterns of HIV multi-drug resistance through models of temporal and spatial drug heterogeneity. eLife 2021; 10:e69032. [PMID: 34473060 PMCID: PMC8412921 DOI: 10.7554/elife.69032] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 04/01/2021] [Accepted: 08/03/2021] [Indexed: 01/09/2023] Open
Abstract
Triple-drug therapies have transformed HIV from a fatal condition to a chronic one. These therapies should prevent HIV drug resistance evolution, because one or more drugs suppress any partially resistant viruses. In practice, such therapies drastically reduced, but did not eliminate, resistance evolution. In this article, we reanalyze published data from an evolutionary perspective and demonstrate several intriguing patterns about HIV resistance evolution - resistance evolves (1) even after years on successful therapy, (2) sequentially, often via one mutation at a time and (3) in a partially predictable order. We describe how these observations might emerge under two models of HIV drugs varying in space or time. Despite decades of work in this area, much opportunity remains to create models with realistic parameters for three drugs, and to match model outcomes to resistance rates and genetic patterns from individuals on triple-drug therapy. Further, lessons from HIV may inform other systems.
Affiliation(s)
- Alison F Feder
- Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
- Department of Genome Sciences, University of Washington, Seattle, United States
- Kristin N Harper
- Harper Health and Science Communications, LLC, Seattle, United States
- Chanson J Brumme
- British Columbia Centre for Excellence in HIV/AIDS, Vancouver, Canada
- Department of Medicine, University of British Columbia, Vancouver, Canada
- Pleuni S Pennings
- Department of Biology, San Francisco State University, San Francisco, United States
|
20
|
Rodrigues MF, Cogni R. Genomic Responses to Climate Change: Making the Most of the Drosophila Model. Front Genet 2021; 12:676218. [PMID: 34326859 PMCID: PMC8314211 DOI: 10.3389/fgene.2021.676218] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/04/2021] [Accepted: 06/15/2021] [Indexed: 11/18/2022] Open
Abstract
It is pressing to understand how animal populations evolve in response to climate change. We argue that new sequencing technologies and the use of historical samples are opening unprecedented opportunities to investigate genome-wide responses to changing environments. However, there are important challenges in interpreting the emerging findings. First, it is essential to differentiate genetic adaptation from phenotypic plasticity. Second, it is extremely difficult to map genotype, phenotype, and fitness. Third, neutral demographic processes and natural selection affect genetic variation in similar ways. We argue that Drosophila melanogaster, a classical model organism with decades of climate adaptation research, is uniquely suited to overcome most of these challenges. In the near future, long-term time series genome-wide datasets of D. melanogaster natural populations will provide exciting opportunities to study adaptation to recent climate change and will lay the groundwork for related research in non-model systems.
Affiliation(s)
- Murillo F. Rodrigues
- Institute of Ecology and Evolution, University of Oregon, Eugene, OR, United States
- Rodrigo Cogni
- Department of Ecology, Institute of Biosciences, University of São Paulo, São Paulo, Brazil
|