1
|
Meijers M, Ruchnewitz D, Eberhardt J, Karmakar M, Łuksza M, Lässig M. Concepts and methods for predicting viral evolution. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.19.585703. [PMID: 38746108 PMCID: PMC11092427 DOI: 10.1101/2024.03.19.585703] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
The seasonal human influenza virus undergoes rapid evolution, leading to significant changes in circulating viral strains from year to year. These changes are typically driven by adaptive mutations, particularly in the antigenic epitopes, the regions of the viral surface protein haemagglutinin targeted by human antibodies. Here we describe a consistent set of methods for data-driven predictive analysis of viral evolution. Our pipeline integrates four types of data: (1) sequence data of viral isolates collected on a worldwide scale, (2) epidemiological data on incidences, (3) antigenic characterization of circulating viruses, and (4) intrinsic viral phenotypes. From the combined analysis of these data, we obtain estimates of relative fitness for circulating strains and predictions of clade frequencies for periods of up to one year. Furthermore, we obtain comparative estimates of protection against future viral populations for candidate vaccine strains, providing a basis for pre-emptive vaccine strain selection. Continuously updated predictions obtained from the prediction pipeline for influenza and SARS-CoV-2 are available on the website previr.app.
|
2
|
Meijers M, Ruchnewitz D, Eberhardt J, Karmakar M, Łuksza M, Lässig M. Concepts and methods for predicting viral evolution. ARXIV 2024:arXiv:2403.12684v2. [PMID: 38745695 PMCID: PMC11092678] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
The seasonal human influenza virus undergoes rapid evolution, leading to significant changes in circulating viral strains from year to year. These changes are typically driven by adaptive mutations, particularly in the antigenic epitopes, the regions of the viral surface protein haemagglutinin targeted by human antibodies. Here we describe a consistent set of methods for data-driven predictive analysis of viral evolution. Our pipeline integrates four types of data: (1) sequence data of viral isolates collected on a worldwide scale, (2) epidemiological data on incidences, (3) antigenic characterization of circulating viruses, and (4) intrinsic viral phenotypes. From the combined analysis of these data, we obtain estimates of relative fitness for circulating strains and predictions of clade frequencies for periods of up to one year. Furthermore, we obtain comparative estimates of protection against future viral populations for candidate vaccine strains, providing a basis for pre-emptive vaccine strain selection. Continuously updated predictions obtained from the prediction pipeline for influenza and SARS-CoV-2 are available on the website previr.app.
Affiliation(s)
- Matthijs Meijers
  - Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
- Denis Ruchnewitz
  - Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
- Jan Eberhardt
  - Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
- Malancha Karmakar
  - Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
- Marta Łuksza
  - Tisch Cancer Institute, Departments of Oncological Sciences and Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Michael Lässig
  - Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
|
3
|
Mwima R, Hui TYJ, Kayondo JK, Burt A. The population genetics of partial diapause, with applications to the aestivating malaria mosquito Anopheles coluzzii. Mol Ecol Resour 2024; 24:e13949. [PMID: 38511493 DOI: 10.1111/1755-0998.13949] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Revised: 02/27/2024] [Accepted: 03/08/2024] [Indexed: 03/22/2024]
Abstract
Diapause, a form of dormancy to delay or halt the reproductive development during unfavourable seasons, has evolved in many insect species. One example is aestivation, an adult-stage diapause enhancing malaria vectors' survival during the dry season (DS) and their re-establishment in the next rainy season (RS). This work develops a novel genetic approach to estimate the number or proportion of individuals undergoing diapause, as well as the breeding sizes of the two seasons, using signals from temporal allele frequency dynamics. Our modelling shows the magnitude of drift is dampened at early RS when previously aestivating individuals reappear. Aestivation severely biases the temporal effective population size ($N_e$), leading to overestimation of the DS breeding size by $1/(1-\alpha)^2$ across 1 year, where $\alpha$ is the aestivating proportion. We find sampling breeding individuals in three consecutive seasons starting from an RS is sufficient for parameter estimation, and perform extensive simulations to verify our derivations. This method does not require sampling individuals in the dormant state, the biggest challenge in most studies. We illustrate the method by applying it to a published data set for Anopheles coluzzii mosquitoes from Thierola, Mali. Our method and the expected evolutionary implications are applicable to any species in which a fraction of the population diapauses for more than one generation and is difficult or impossible to sample during that stage.
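To give a feel for the size of the bias quoted in the abstract above, the following minimal sketch simply evaluates the factor $1/(1-\alpha)^2$ for a few aestivating proportions. The function name `ds_overestimation_factor` and the example values of `alpha` are illustrative assumptions; the abstract itself does not report a value of $\alpha$, and this is not the authors' estimation method.

```python
def ds_overestimation_factor(alpha: float) -> float:
    """Return 1 / (1 - alpha)^2, the factor quoted in the abstract.

    alpha is the aestivating proportion and must lie in [0, 1).
    The name of this helper is hypothetical, chosen for illustration.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must satisfy 0 <= alpha < 1")
    return 1.0 / (1.0 - alpha) ** 2


if __name__ == "__main__":
    # Example proportions are arbitrary, not taken from the paper.
    for alpha in (0.0, 0.1, 0.5, 0.9):
        print(f"alpha = {alpha:.1f} -> factor = {ds_overestimation_factor(alpha):.2f}")
```

With no aestivation the factor is 1 (no bias), and it grows quickly as the aestivating proportion approaches 1.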
Affiliation(s)
- Rita Mwima
  - Department of Entomology, Uganda Virus Research Institute (UVRI), Entebbe, Uganda
  - Department of Biotechnical and Diagnostic Sciences, College of Veterinary Medicine, Animal Resources and Biosecurity (COVAB), Makerere University, Kampala, Uganda
- Tin-Yu J Hui
  - Department of Life Sciences, Imperial College London, Ascot, UK
- Jonathan K Kayondo
  - Department of Entomology, Uganda Virus Research Institute (UVRI), Entebbe, Uganda
- Austin Burt
  - Department of Life Sciences, Imperial College London, Ascot, UK
|
4
|
Khatri BS, Burt A. A theory of resistance to multiplexed gene drive demonstrates the significant role of weakly deleterious natural genetic variation. Proc Natl Acad Sci U S A 2022; 119:e2200567119. [PMID: 35914131 PMCID: PMC9371675 DOI: 10.1073/pnas.2200567119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2022] [Accepted: 06/28/2022] [Indexed: 11/18/2022]
Abstract
Evolution of resistance is a major barrier to successful deployment of gene-drive systems to suppress natural populations, which could greatly reduce the burden of many vector-borne diseases. Multiplexed guide RNAs (gRNAs) that require resistance mutations in all target cut sites are a promising antiresistance strategy since, in principle, resistance would only arise in unrealistically large populations. Using stochastic simulations that accurately model evolution at very large population sizes, we explore the probability of resistance due to three important mechanisms: 1) nonhomologous end-joining mutations, 2) single-nucleotide mutants arising de novo, or 3) single-nucleotide polymorphisms preexisting as standing variation. Our results explore the relative importance of these mechanisms and highlight a complexity of the mutation-selection-drift balance between haplotypes with complete resistance and those with an incomplete number of resistant alleles. We find that this leads to a phenomenon where weakly deleterious naturally occurring variants greatly amplify the probability of multisite resistance compared to de novo mutation. This key result provides a design criterion for antiresistance multiplexed systems, which, in general, will need a larger number of gRNAs compared to de novo expectations. This theory may have wider application to the evolution of resistance or evolutionary rescue when multiple changes are required before selection can act.
Affiliation(s)
- Bhavin S. Khatri
  - Department of Life Sciences, Imperial College London, Ascot SL5 7PY, United Kingdom
  - Chromosome Segregation Laboratory, and Mechanobiology and Biophysics Laboratory, The Francis Crick Institute, London, NW1 1AT, United Kingdom
- Austin Burt
  - Department of Life Sciences, Imperial College London, Ascot SL5 7PY, United Kingdom
|
5
|
Smith MR, Trofimova M, Weber A, Duport Y, Kühnert D, von Kleist M. Rapid incidence estimation from SARS-CoV-2 genomes reveals decreased case detection in Europe during summer 2020. Nat Commun 2021; 12:6009. [PMID: 34650062 PMCID: PMC8517019 DOI: 10.1038/s41467-021-26267-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 05/25/2021] [Accepted: 09/24/2021] [Indexed: 12/24/2022]
Abstract
By October 2021, 230 million SARS-CoV-2 diagnoses have been reported. Yet, a considerable proportion of cases remains undetected. Here, we propose GInPipe, a method that rapidly reconstructs SARS-CoV-2 incidence profiles solely from publicly available, time-stamped viral genomes. We validate GInPipe against simulated outbreaks and elaborate phylodynamic analyses. Using available sequence data, we reconstruct incidence histories for Denmark, Scotland, Switzerland, and Victoria (Australia) and demonstrate how to use the method to investigate the effects of changing testing policies on case ascertainment. Specifically, we find that under-reporting was highest during summer 2020 in Europe, coinciding with more liberal testing policies at times of low testing capacities. Due to the increased use of real-time sequencing, it is envisaged that GInPipe can complement established surveillance tools to monitor the SARS-CoV-2 pandemic. In post-pandemic times, when diagnostic efforts are decreasing, GInPipe may facilitate the detection of hidden infection dynamics.
Affiliation(s)
- Maureen Rebecca Smith
  - Systems Medicine of Infectious Disease (P5), Robert Koch Institute, Berlin, Germany
  - Bioinformatics (MF1), Robert Koch Institute, Berlin, Germany
- Maria Trofimova
  - Systems Medicine of Infectious Disease (P5), Robert Koch Institute, Berlin, Germany
  - Bioinformatics (MF1), Robert Koch Institute, Berlin, Germany
- Ariane Weber
  - Transmission, Infection, Diversification and Evolution Group, Max-Planck Institute for the Science of Human History, Jena, Germany
- Yannick Duport
  - Systems Medicine of Infectious Disease (P5), Robert Koch Institute, Berlin, Germany
  - Bioinformatics (MF1), Robert Koch Institute, Berlin, Germany
- Denise Kühnert
  - Transmission, Infection, Diversification and Evolution Group, Max-Planck Institute for the Science of Human History, Jena, Germany
  - German COVID Omics Initiative (deCOI), Bonn, Germany
- Max von Kleist
  - Systems Medicine of Infectious Disease (P5), Robert Koch Institute, Berlin, Germany
  - Bioinformatics (MF1), Robert Koch Institute, Berlin, Germany
  - German COVID Omics Initiative (deCOI), Bonn, Germany
|
6
|
Hui TYJ, Brenas JH, Burt A. Contemporary N e estimation using temporally spaced data with linked loci. Mol Ecol Resour 2021; 21:2221-2230. [PMID: 33950582 PMCID: PMC8518636 DOI: 10.1111/1755-0998.13412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2020] [Revised: 04/23/2021] [Accepted: 04/27/2021] [Indexed: 11/30/2022]
Abstract
The contemporary effective population size Ne is important in many disciplines including population genetics, conservation science and pest management. One of the most popular methods of estimating this quantity uses temporal changes in allele frequency due to genetic drift. A significant assumption of the existing methods is the independence among loci while constructing confidence intervals (CI), which restricts the types of species or genetic data applicable to the methods. Although genetic linkage does not bias point Ne estimates, applying these methods to linked loci can yield unreliable CI that are far too narrow. We extend the current methods to enable the use of many linked loci to produce precise contemporary Ne estimates, while preserving the targeted CI width and coverage. This is achieved by deriving the covariance of changes in allele frequency at linked loci in the face of recombination and sampling errors, such that the extra sampling variance due to between‐locus correlation is properly handled. Extensive simulations are used to verify the new method. We apply the method to two temporally spaced genomic data sets of Anopheles mosquitoes collected from a cluster of villages in Burkina Faso between 2012 and 2014. With over 33,000 linked loci considered, the Ne estimate for Anopheles coluzzii is 9,242 (95% CI 5,702–24,282), and for Anopheles gambiae it is 4,826 (95% CI 3,602–7,353).
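The temporal approach mentioned in the abstract above can be illustrated with a classical moment-based estimator. The sketch below is an assumption-laden stand-in, not the linked-loci method of the paper: it uses the Nei-Tajima standardized variance $F_c$ for biallelic loci with the usual sampling correction, $\hat{N}_e = t / \left(2\left[F_c - \tfrac{1}{2S_0} - \tfrac{1}{2S_t}\right]\right)$, and it treats loci as independent. The function name `temporal_ne` and its parameter names are hypothetical.

```python
from typing import Sequence


def temporal_ne(x: Sequence[float], y: Sequence[float],
                s0: int, st: int, t: float) -> float:
    """Moment-based temporal Ne estimate from biallelic allele frequencies.

    x, y   : allele frequencies per locus at the first and second sample
    s0, st : numbers of diploid individuals sampled at each time point
    t      : number of generations between the two samples
    Loci are treated as independent (an assumption; the paper above is
    precisely about relaxing it when building confidence intervals).
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")

    fc_values = []
    for xi, yi in zip(x, y):
        denom = (xi + yi) / 2.0 - xi * yi
        if denom <= 0.0:
            continue  # locus fixed for the same allele in both samples
        fc_values.append((xi - yi) ** 2 / denom)
    if not fc_values:
        raise ValueError("no informative loci")

    fc = sum(fc_values) / len(fc_values)
    # Subtract the variance expected from finite sampling alone.
    drift = fc - 1.0 / (2.0 * s0) - 1.0 / (2.0 * st)
    if drift <= 0.0:
        return float("inf")  # drift signal not distinguishable from sampling noise
    return t / (2.0 * drift)
```

A call such as `temporal_ne([0.5], [0.6], s0=50, st=50, t=10)` returns a point estimate only; it says nothing about interval width, which is where linkage between loci matters according to the abstract.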
Affiliation(s)
- Tin-Yu J Hui
  - Department of Life Sciences, Silwood Park Campus, Imperial College London, Ascot, UK
- Jon Haël Brenas
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK
  - Wellcome Sanger Institute, Wellcome Trust Genome Campus, Saffron Walden, UK
- Austin Burt
  - Department of Life Sciences, Silwood Park Campus, Imperial College London, Ascot, UK
|
7
|
Abstract
HIV can evolve remarkably quickly in response to antiretroviral therapies and the immune system. This evolution stymies treatment effectiveness and prevents the development of an HIV vaccine. Consequently, there has been a great interest in using population genetics to disentangle the forces that govern the HIV adaptive landscape (selection, drift, mutation, and recombination). Traditional population genetics approaches look at the current state of genetic variation and infer the processes that can generate it. However, because HIV evolves rapidly, we can also sample populations repeatedly over time and watch evolution in action. In this paper, we demonstrate how time series data can bound evolutionary parameters in a way that complements and informs traditional population genetic approaches. Specifically, we focus on our recent paper (Feder et al., 2016, eLife), in which we show that, as improved HIV drugs have led to fewer patients failing therapy due to resistance evolution, less genetic diversity has been maintained following the fixation of drug resistance mutations. Because soft sweeps of multiple drug resistance mutations spreading simultaneously have been previously documented in response to the less effective HIV therapies used early in the epidemic, we interpret the maintenance of post-sweep diversity in response to poor therapies as further evidence of soft sweeps and therefore a high population mutation rate (θ) in these intra-patient HIV populations. Because improved drugs resulted in rarer resistance evolution accompanied by lower post-sweep diversity, we suggest that both observations can be explained by decreased population mutation rates and a resultant transition to hard selective sweeps. 
A recent paper (Harris et al., 2018, PLOS Genetics) proposed an alternative interpretation: Diversity maintenance following drug resistance evolution in response to poor therapies may have been driven by recombination during slow, hard selective sweeps of single mutations. Then, if better drugs have led to faster hard selective sweeps of resistance, recombination will have less time to rescue diversity during the sweep, recapitulating the decrease in post-sweep diversity as drugs have improved. In this paper, we use time series data to show that drug resistance evolution during ineffective treatment is very fast, providing new evidence that soft sweeps drove early HIV treatment failure.
Affiliation(s)
- Alison F. Feder
  - Department of Integrative Biology, University of California, Berkeley, Berkeley, California, United States of America
- Pleuni S. Pennings
  - Department of Biology, San Francisco State University, San Francisco, California, United States of America
- Dmitri A. Petrov
  - Department of Biology, Stanford University, Stanford, California, United States of America
|
8
|
Lumby CK, Zhao L, Breuer J, Illingworth CJR. A large effective population size for established within-host influenza virus infection. eLife 2020; 9:e56915. [PMID: 32773034 PMCID: PMC7431133 DOI: 10.7554/elife.56915] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 03/13/2020] [Accepted: 07/30/2020] [Indexed: 12/13/2022]
Abstract
Strains of the influenza virus form coherent global populations, yet exist at the level of single infections in individual hosts. The relationship between these scales is a critical topic for understanding viral evolution. Here we investigate the within-host relationship between selection and the stochastic effects of genetic drift, estimating an effective population size of infection $N_e$ for influenza infection. Examining whole-genome sequence data describing a chronic case of influenza B in a severely immunocompromised child we infer an $N_e$ of $2.5 \times 10^7$ (95% confidence range $1.0 \times 10^7$ to $9.0 \times 10^7$) suggesting that genetic drift is of minimal importance during an established influenza infection. Our result, supported by data from influenza A infection, suggests that positive selection during within-host infection is primarily limited by the typically short period of infection. Atypically long infections may have a disproportionate influence upon global patterns of viral evolution.
Affiliation(s)
- Casper K Lumby
  - Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Lei Zhao
  - Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Judith Breuer
  - Great Ormond Street Hospital, London, United Kingdom
  - Division of Infection and Immunity, University College London, London, United Kingdom
- Christopher JR Illingworth
  - Department of Genetics, University of Cambridge, Cambridge, United Kingdom
  - Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, United Kingdom
  - Department of Computer Science, Institute of Biotechnology, University of Helsinki, Helsinki, Finland
|
9
|
Hazzouri KM, Sudalaimuthuasari N, Kundu B, Nelson D, Al-Deeb MA, Le Mansour A, Spencer JJ, Desplan C, Amiri KMA. The genome of pest Rhynchophorus ferrugineus reveals gene families important at the plant-beetle interface. Commun Biol 2020; 3:323. [PMID: 32581279 PMCID: PMC7314810 DOI: 10.1038/s42003-020-1060-8] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 03/02/2020] [Accepted: 06/08/2020] [Indexed: 11/17/2022]
Abstract
The red palm weevil, Rhynchophorus ferrugineus, infests palm plantations, leading to large financial losses and soil erosion. Pest-host interactions are poorly understood in R. ferrugineus, but the analysis of genetic diversity and pest origins will help advance efforts to eradicate this pest. We sequenced the genome of R. ferrugineus using a combination of paired-end Illumina sequencing (150 bp), Oxford Nanopore long reads, 10X Genomics and synteny analysis to produce an assembly with a scaffold N50 of ~60 Mb. Structural variations showed duplication of detoxifying and insecticide resistance genes (e.g., glutathione S-transferase, P450, Rdl). Furthermore, the evolution of gene families identified those under positive selection including one glycosyl hydrolase (GH16) gene family, which appears to result from horizontal gene transfer. This genome will be a valuable resource to understand insect evolution and behavior and to allow the genetic modification of key genes that will help control this pest.
Affiliation(s)
- Khaled Michel Hazzouri
  - Khalifa Center for Genetic Engineering and Biotechnology, United Arab Emirates University, PO Box 15551, Al Ain, UAE
- Biduth Kundu
  - Department of Biology, United Arab Emirates University, PO Box 15551, Al Ain, UAE
- David Nelson
  - Center for Genomics and Systems Biology, New York University Abu Dhabi, PO Box 129188, Abu Dhabi, UAE
- Mohammad Ali Al-Deeb
  - Department of Biology, United Arab Emirates University, PO Box 15551, Al Ain, UAE
- Alain Le Mansour
  - Date Palm Tissue Culture, United Arab Emirates University, PO Box 15551, Al Ain, UAE
- Johnston J Spencer
  - Department of Entomology, Texas A&M University, TAMU 2475, College Station, TX, USA
- Claude Desplan
  - Center for Genomics and Systems Biology, New York University Abu Dhabi, PO Box 129188, Abu Dhabi, UAE
- Khaled M A Amiri
  - Khalifa Center for Genetic Engineering and Biotechnology, United Arab Emirates University, PO Box 15551, Al Ain, UAE
  - Department of Biology, United Arab Emirates University, PO Box 15551, Al Ain, UAE
|
10
|
Deep-Time Demographic Inference Suggests Ecological Release as Driver of Neoavian Adaptive Radiation. DIVERSITY-BASEL 2020. [DOI: 10.3390/d12040164] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/13/2022]
Abstract
Assessing the applicability of theory to major adaptive radiations in deep time represents an extremely difficult problem in evolutionary biology. Neoaves, which includes 95% of living birds, is believed to have undergone a period of rapid diversification roughly coincident with the Cretaceous–Paleogene (K-Pg) boundary. We investigate whether basal neoavian lineages experienced an ecological release in response to ecological opportunity, as evidenced by density compensation. We estimated effective population sizes (Ne) of basal neoavian lineages by combining coalescent branch lengths (CBLs) and the numbers of generations between successive divergences. We used a modified version of Accurate Species TRee Algorithm (ASTRAL) to estimate CBLs directly from insertion–deletion (indel) data, as well as from gene trees using DNA sequence and/or indel data. We found that some divergences near the K-Pg boundary involved unexpectedly high gene tree discordance relative to the estimated number of generations between speciation events. The simplest explanation for this result is an increase in Ne, despite the caveats discussed herein. It appears that at least some early neoavian lineages, similar to the ancestor of the clade comprising doves, mesites, and sandgrouse, experienced ecological release near the time of the K-Pg mass extinction.
|