1
Li Y, Barton JP. Correlated Allele Frequency Changes Reveal Clonal Structure and Selection in Temporal Genetic Data. Mol Biol Evol 2024; 41:msae060. [PMID: 38507665 PMCID: PMC10986812 DOI: 10.1093/molbev/msae060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 02/02/2024] [Accepted: 03/15/2024] [Indexed: 03/22/2024]
Abstract
In evolving populations where the rate of beneficial mutations is large, subpopulations of individuals with competing beneficial mutations can be maintained over long times. Evolution with this kind of clonal structure is commonly observed in a wide range of microbial and viral populations. However, it can be difficult to completely resolve clonal dynamics in data. This is due to limited read lengths in high-throughput sequencing methods, which are often insufficient to directly measure linkage disequilibrium or determine clonal structure. Here, we develop a method to infer clonal structure using correlated allele frequency changes in time-series sequence data. Simulations show that our method recovers true, underlying clonal structures when they are known and accurately estimates linkage disequilibrium. This information can then be combined with other inference methods to improve estimates of the fitness effects of individual mutations. Applications to data suggest novel clonal structures in an E. coli long-term evolution experiment, and yield improved predictions of the effects of mutations on bacterial fitness and antibiotic resistance. Moreover, our method is computationally efficient, requiring orders of magnitude less run time for large data sets than existing methods. Overall, our method provides a powerful tool to infer clonal structures from data sets where only allele frequencies are available, which can also improve downstream analyses.
Affiliation(s)
- Yunxiao Li
- Department of Physics and Astronomy, University of California, Riverside, CA 92521, USA
- John P Barton
- Department of Physics and Astronomy, University of California, Riverside, CA 92521, USA
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, USA
2
Gao Y, Barton JP. A binary trait model reveals the fitness effects of HIV-1 escape from T cell responses. bioRxiv 2024:2024.03.03.583183. [PMID: 38464239 PMCID: PMC10925374 DOI: 10.1101/2024.03.03.583183] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
Natural selection often acts on multiple traits simultaneously. For example, the virus HIV-1 faces pressure to evade host immunity while also preserving replicative fitness. While past work has studied selection during HIV-1 evolution, it is challenging to quantitatively separate different contributions to fitness. This task is made more difficult because a single mutation can affect both immune escape and replication. Here, we develop an evolutionary model that disentangles the effects of escaping CD8+ T cell-mediated immunity, which we model as a binary trait, from other contributions to fitness. After validation in simulations, we applied this model to study within-host HIV-1 evolution in a clinical data set. We observed strong selection for immune escape, sometimes greatly exceeding past estimates, especially early in infection. Conservative estimates suggest that roughly half of HIV-1 fitness gains during the first months to years of infection can be attributed to T cell escape. Our approach is not limited to HIV-1 or viruses, and could be adapted to study the evolution of quantitative traits in other contexts.
Affiliation(s)
- Yirui Gao
- Department of Physics and Astronomy, University of California, Riverside, USA
- John P. Barton
- Department of Physics and Astronomy, University of California, Riverside, USA
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, USA
- Department of Physics and Astronomy, University of Pittsburgh, USA
3
Hong Z, Barton JP. popDMS infers mutation effects from deep mutational scanning data. bioRxiv 2024:2024.01.29.577759. [PMID: 38352383 PMCID: PMC10862717 DOI: 10.1101/2024.01.29.577759] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2024]
Abstract
Deep mutational scanning (DMS) experiments provide a powerful method to measure the functional effects of genetic mutations at massive scales. However, the data generated from these experiments can be difficult to analyze, with significant variation between experimental replicates. To overcome this challenge, we developed popDMS, a computational method based on population genetics theory, to infer the functional effects of mutations from DMS data. Through extensive tests, we found that the functional effects of single mutations and epistasis inferred by popDMS are highly consistent across replicates, comparing favorably with existing methods. Our approach is flexible and can be widely applied to DMS data that includes multiple time points, multiple replicates, and different experimental conditions.
Affiliation(s)
- Zhenchen Hong
- Department of Physics and Astronomy, University of California, Riverside, USA
- John P. Barton
- Department of Physics and Astronomy, University of California, Riverside, USA
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, USA
- Department of Physics and Astronomy, University of Pittsburgh, USA
4
Zhang H, Bull RA, Quadeer AA, McKay MR. HCV E1 influences the fitness landscape of E2 and may enhance escape from E2-specific antibodies. Virus Evol 2023; 9:vead068. [PMID: 38107333 PMCID: PMC10722114 DOI: 10.1093/ve/vead068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Revised: 09/27/2023] [Accepted: 11/16/2023] [Indexed: 12/19/2023]
Abstract
The Hepatitis C virus (HCV) envelope glycoprotein E1 forms a non-covalent heterodimer with E2, the main target of neutralizing antibodies. How E1-E2 interactions influence viral fitness and contribute to resistance to E2-specific antibodies remains largely unknown. We investigate this problem using a combination of fitness landscape and evolutionary modeling. Our analysis indicates that E1 and E2 proteins collectively mediate viral fitness and suggests that fitness-compensating E1 mutations may accelerate escape from E2-targeting antibodies. Our analysis also identifies a set of E2-specific human monoclonal antibodies that are predicted to be especially resilient to escape via genetic variation in both E1 and E2, providing directions for robust HCV vaccine development.
Affiliation(s)
- Hang Zhang
- Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, SAR, China
- Rowena A Bull
- School of Biomedical Sciences, Faculty of Medicine and Health, University of New South Wales, Sydney, NSW 2052, Australia
- The Kirby Institute for Infection and Immunity, Sydney, NSW 2052, Australia
- Ahmed Abdul Quadeer
- Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, SAR, China
- Department of Electrical and Electronic Engineering, University of Melbourne, Parkville, VIC 3010, Australia
- Matthew R McKay
- Department of Electrical and Electronic Engineering, University of Melbourne, Parkville, VIC 3010, Australia
- Department of Microbiology and Immunology, The Peter Doherty Institute for Infection and Immunity, University of Melbourne, Melbourne, VIC 3000, Australia
5
Strahan J, Finkel J, Dinner AR, Weare J. Predicting rare events using neural networks and short-trajectory data. Journal of Computational Physics 2023; 488:112152. [PMID: 37332834 PMCID: PMC10270692 DOI: 10.1016/j.jcp.2023.112152] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/20/2023]
Abstract
Estimating the likelihood, timing, and nature of events is a major goal of modeling stochastic dynamical systems. When the event is rare in comparison with the timescales of simulation and/or measurement needed to resolve the elemental dynamics, accurate prediction from direct observations becomes challenging. In such cases a more effective approach is to cast statistics of interest as solutions to Feynman-Kac equations (partial differential equations). Here, we develop an approach to solve Feynman-Kac equations by training neural networks on short-trajectory data. Our approach is based on a Markov approximation but otherwise avoids assumptions about the underlying model and dynamics. This makes it applicable to treating complex computational models and observational data. We illustrate the advantages of our method using a low-dimensional model that facilitates visualization, and this analysis motivates an adaptive sampling strategy that allows on-the-fly identification of and addition of data to regions important for predicting the statistics of interest. Finally, we demonstrate that we can compute accurate statistics for a 75-dimensional model of sudden stratospheric warming. This system provides a stringent test bed for our method.
Affiliation(s)
- John Strahan
- Department of Chemistry and James Franck Institute, the University of Chicago, Chicago, IL 60637
- Justin Finkel
- Department of Earth, Atmospheric, and Planetary Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- Aaron R. Dinner
- Department of Chemistry and James Franck Institute, the University of Chicago, Chicago, IL 60637
- Committee on Computational and Applied Mathematics, the University of Chicago, Chicago, IL 60637
- Jonathan Weare
- Courant Institute of Mathematical Sciences, New York University, New York, New York 10012
6
Chen H, Pelizzola M, Futschik A. Haplotype based testing for a better understanding of the selective architecture. BMC Bioinformatics 2023; 24:322. [PMID: 37633901 PMCID: PMC10463365 DOI: 10.1186/s12859-023-05437-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2022] [Accepted: 08/03/2023] [Indexed: 08/28/2023]
Abstract
BACKGROUND: The identification of genomic regions affected by selection is one of the most important goals in population genetics. If temporal data are available, allele frequency changes at SNP positions are often used for this purpose. Here we provide a new testing approach that uses haplotype frequencies instead of allele frequencies.
RESULTS: Using simulated data, we show that, compared to SNP-based tests, our approach has higher power, especially when the number of candidate haplotypes is small or moderate. To improve power when the number of haplotypes is large, we investigate methods to combine them with a moderate number of haplotype subsets. Haplotype frequencies can often be recovered with less noise than SNP frequencies, especially under pool sequencing, giving our test an additional advantage. Furthermore, spurious outlier SNPs may lead to false positives, a problem usually not encountered when working with haplotypes. Post hoc tests for the number of selected haplotypes and for differences between their selection coefficients are also provided for a better understanding of the underlying selection dynamics. An application to a real data set further illustrates the performance benefits.
CONCLUSIONS: Because it requires less multiple testing correction and reduces noise, haplotype-based testing is able to outperform SNP-based tests in terms of power in most scenarios.
Affiliation(s)
- Haoyu Chen
- University of Veterinary Medicine Vienna, Vienna, Austria
- Vienna Graduate School of Population Genetics, Vienna, Austria
7
Li Y, Barton JP. Estimating linkage disequilibrium and selection from allele frequency trajectories. Genetics 2023; 223:iyac189. [PMID: 36610715 PMCID: PMC9991507 DOI: 10.1093/genetics/iyac189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2022] [Revised: 10/14/2022] [Accepted: 12/11/2022] [Indexed: 01/09/2023]
Abstract
Genetic sequences collected over time provide an exciting opportunity to study natural selection. In such studies, it is important to account for linkage disequilibrium to accurately measure selection and to distinguish between selection and other effects that can cause changes in allele frequencies, such as genetic hitchhiking or clonal interference. However, most high-throughput sequencing methods cannot directly measure linkage due to short-read lengths. Here we develop a simple method to estimate linkage disequilibrium from time-series allele frequencies. This reconstructed linkage information can then be combined with other inference methods to infer the fitness effects of individual mutations. Simulations show that our approach reliably outperforms inference that ignores linkage disequilibrium and, with sufficient sampling, performs similarly to inference using the true linkage information. We also introduce two regularization methods derived from random matrix theory that help to preserve its performance under limited sampling effects. Overall, our method enables the use of linkage-aware inference methods even for data sets where only allele frequency time series are available.
Affiliation(s)
- Yunxiao Li
- Department of Physics and Astronomy, University of California, Riverside, CA 92521, USA
- John P Barton
- Department of Physics and Astronomy, University of California, Riverside, CA 92521, USA
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, USA
8
Shimagaki K, Barton JP. Bézier interpolation improves the inference of dynamical models from data. Phys Rev E 2023; 107:024116. [PMID: 36932614 PMCID: PMC10027371 DOI: 10.1103/physreve.107.024116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2022] [Accepted: 01/23/2023] [Indexed: 06/18/2023]
Abstract
Many dynamical systems, from quantum many-body systems to evolving populations to financial markets, are described by stochastic processes. Parameters characterizing such processes can often be inferred using information integrated over stochastic paths. However, estimating time-integrated quantities from real data with limited time resolution is challenging. Here, we propose a framework for accurately estimating time-integrated quantities using Bézier interpolation. We applied our approach to two dynamical inference problems: Determining fitness parameters for evolving populations and inferring forces driving Ornstein-Uhlenbeck processes. We found that Bézier interpolation reduces the estimation bias for both dynamical inference problems. This improvement was especially noticeable for data sets with limited time resolution. Our method could be broadly applied to improve accuracy for other dynamical inference problems using finitely sampled data.
Affiliation(s)
- Kai Shimagaki
- Department of Physics and Astronomy, University of California, Riverside, California 92521, USA
- John P. Barton
- Department of Physics and Astronomy, University of California, Riverside, California 92521, USA
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania 15213, USA
9
Jankowiak M, Obermeyer FH, Lemieux JE. Inferring selection effects in SARS-CoV-2 with Bayesian Viral Allele Selection. PLoS Genet 2022; 18:e1010540. [PMID: 36508459 PMCID: PMC9779722 DOI: 10.1371/journal.pgen.1010540] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/15/2022] [Revised: 12/22/2022] [Accepted: 11/23/2022] [Indexed: 12/14/2022]
Abstract
The global effort to sequence millions of SARS-CoV-2 genomes has provided an unprecedented view of viral evolution. Characterizing how selection acts on SARS-CoV-2 is critical to developing effective, long-lasting vaccines and other treatments, but the scale and complexity of genomic surveillance data make rigorous analysis challenging. To meet this challenge, we develop Bayesian Viral Allele Selection (BVAS), a principled and scalable probabilistic method for inferring the genetic determinants of differential viral fitness and the relative growth rates of viral lineages, including newly emergent lineages. After demonstrating the accuracy and efficacy of our method through simulation, we apply BVAS to 6.9 million SARS-CoV-2 genomes. We identify numerous mutations that increase fitness, including previously identified mutations in the SARS-CoV-2 Spike and Nucleocapsid proteins, as well as mutations in non-structural proteins whose contribution to fitness is less well characterized. In addition, we extend our baseline model to identify mutations whose fitness exhibits strong dependence on vaccination status as well as pairwise interaction effects, i.e. epistasis. Strikingly, both these analyses point to the pivotal role played by the N501 residue in the Spike protein. Our method, which couples Bayesian variable selection with a diffusion approximation in allele frequency space, lays a foundation for identifying fitness-associated mutations under the assumption that most alleles are neutral.
Affiliation(s)
- Martin Jankowiak
- Broad Institute of Harvard and MIT, Cambridge, Massachusetts, United States of America
- Fritz H. Obermeyer
- Broad Institute of Harvard and MIT, Cambridge, Massachusetts, United States of America
- Generate Biomedicines, Cambridge, Massachusetts, United States of America
- Jacob E. Lemieux
- Broad Institute of Harvard and MIT, Cambridge, Massachusetts, United States of America
- Division of Infectious Diseases, Massachusetts General Hospital, Cambridge, Massachusetts, United States of America
10
Sohail MS, Louie RHY, Hong Z, Barton JP, McKay MR. Inferring Epistasis from Genetic Time-series Data. Mol Biol Evol 2022; 39:6710201. [PMID: 36130322 PMCID: PMC9558069 DOI: 10.1093/molbev/msac199] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 02/01/2023]
Abstract
Epistasis refers to fitness or functional effects of mutations that depend on the sequence background in which these mutations arise. Epistasis is prevalent in nature, including populations of viruses, bacteria, and cancers, and can contribute to the evolution of drug resistance and immune escape. However, it is difficult to directly estimate epistatic effects from sampled observations of a population. At present, there are very few methods that can disentangle the effects of selection (including epistasis), mutation, recombination, genetic drift, and genetic linkage in evolving populations. Here we develop a method to infer epistasis, along with the fitness effects of individual mutations, from observed evolutionary histories. Simulations show that we can accurately infer pairwise epistatic interactions provided that there is sufficient genetic diversity in the data. Our method also allows us to identify which fitness parameters can be reliably inferred from a particular data set and which ones are unidentifiable. Our approach therefore allows for the inference of more complex models of selection from time-series genetic data, while also quantifying uncertainty in the inferred parameters.
Affiliation(s)
- Muhammad Saqib Sohail
- Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Hong Kong SAR, People’s Republic of China
- Raymond H Y Louie
- The Kirby Institute, University of New South Wales, Sydney, New South Wales, Australia
- Zhenchen Hong
- Department of Physics and Astronomy, University of California, Riverside, CA, USA
11
Vaccinia-Virus-Based Vaccines Are Expected to Elicit Highly Cross-Reactive Immunity to the 2022 Monkeypox Virus. Viruses 2022; 14:v14091960. [PMID: 36146766 PMCID: PMC9506226 DOI: 10.3390/v14091960] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 06/26/2022] [Revised: 08/03/2022] [Accepted: 09/01/2022] [Indexed: 11/17/2022]
Abstract
Beginning in May 2022, a novel cluster of monkeypox virus infections was detected in humans. This virus has spread rapidly to non-endemic countries, sparking global concern. Specific vaccines based on the vaccinia virus (VACV) have demonstrated high efficacy against monkeypox viruses in the past and are considered an important outbreak control measure. Viruses observed in the current outbreak carry distinct genetic variations that have the potential to affect vaccine-induced immune recognition. Here, by investigating genetic variation with respect to orthologous immunogenic vaccinia-virus proteins, we report data that anticipates immune responses induced by VACV-based vaccines, including the currently available MVA-BN and ACAM2000 vaccines, to remain highly cross-reactive against the newly observed monkeypox viruses.
12
Morales-Arce AY, Johri P, Jensen JD. Inferring the distribution of fitness effects in patient-sampled and experimental virus populations: two case studies. Heredity (Edinb) 2022; 128:79-87. [PMID: 34987185 PMCID: PMC8728706 DOI: 10.1038/s41437-021-00493-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/21/2021] [Revised: 12/12/2021] [Accepted: 12/13/2021] [Indexed: 11/19/2022]
Abstract
We here propose an analysis pipeline for inferring the distribution of fitness effects (DFE) from either patient-sampled or experimentally-evolved viral populations, that explicitly accounts for non-Wright-Fisher and non-equilibrium population dynamics inherent to pathogens. We examine the performance of this approach via extensive power and performance analyses, and highlight two illustrative applications - one from an experimentally-passaged RNA virus, and the other from a clinically-sampled DNA virus. Finally, we discuss how such DFE inference may shed light on major research questions in virus evolution, ranging from a quantification of the population genetic processes governing genome size, to the role of Hill-Robertson interference in dictating adaptive outcomes, to the potential design of novel therapeutic approaches to eradicate within-patient viral populations via induced mutational meltdown.
Affiliation(s)
- Ana Y. Morales-Arce
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Parul Johri
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Jeffrey D. Jensen
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
13
Doelger J, Kardar M, Chakraborty AK. Inferring the intrinsic mutational fitness landscape of influenzalike evolving antigens from temporally ordered sequence data. Phys Rev E 2022; 105:024401. [PMID: 35291059 DOI: 10.1103/physreve.105.024401] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2021] [Accepted: 01/19/2022] [Indexed: 06/14/2023]
Abstract
There still are no effective long-term protective vaccines against viruses that continuously evolve under immune pressure such as seasonal influenza, which has caused, and can cause, devastating epidemics in the human population. To find such a broadly protective immunization strategy, it is useful to know how easily the virus can escape via mutation from specific antibody responses. This information is encoded in the fitness landscape of the viral proteins (i.e., knowledge of the viral fitness as a function of sequence). Here we present a computational method to infer the intrinsic mutational fitness landscape of influenzalike evolving antigens from yearly sequence data. We test inference performance with computer-generated sequence data that are based on stochastic simulations mimicking basic features of immune-driven viral evolution. Although the numerically simulated model does create a phylogeny based on the allowed mutations, the inference scheme does not use this information. This provides a contrast to other methods that rely on reconstruction of phylogenetic trees. Our method just needs a sufficient number of samples over multiple years. With our method, we are able to infer single as well as pairwise mutational fitness effects from the simulated sequence time series for short antigenic proteins. Our fitness inference approach may have potential future use for the design of immunization protocols by identifying intrinsically vulnerable immune target combinations on antigens that evolve under immune-driven selection. In the future, this approach may be applied to influenza and other novel viruses such as SARS-CoV-2, which evolves and, like influenza, might continue to escape the natural and vaccine-mediated immune pressures.
Affiliation(s)
- Julia Doelger
- Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Mehran Kardar
- Department of Physics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Arup K Chakraborty
- Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA; Department of Physics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA; Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA; Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA; and Ragon Institute of MGH, MIT, and Harvard, Cambridge, Massachusetts 02139, USA
14
Evolutionary modeling reveals enhanced mutational flexibility of HCV subtype 1b compared with 1a. iScience 2022; 25:103569. [PMID: 34988406 PMCID: PMC8704487 DOI: 10.1016/j.isci.2021.103569] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2021] [Revised: 11/19/2021] [Accepted: 12/02/2021] [Indexed: 11/24/2022]
Abstract
Hepatitis C virus (HCV) is a leading cause of liver-associated disease and liver cancer. Of the major HCV subtypes, patients infected with subtype 1b have been associated with having a higher risk of developing chronic infection and hepatocellular carcinoma. However, underlying reasons for this increased disease severity remain unknown. Here, we provide an evolutionary rationale, based on a comparative study of fitness landscape and in-host evolutionary models of the E2 glycoprotein of HCV subtypes 1a and 1b. Our analysis demonstrates that a higher chronicity rate of 1b may be attributed to lower fitness constraints, enabling 1b viruses to more easily escape antibody responses. More generally, our results suggest that differences in evolutionary constraints between HCV subtypes may be an important factor in mediating distinct disease outcomes. Our analysis also identifies antibodies that appear escape-resistant against both subtypes 1a and 1b, providing directions for designing HCV vaccines having cross-subtype protection.
- Comparative analysis of the fitness landscapes of HCV subtypes 1a and 1b
- Subtype 1b evolution is subject to fewer constraints than 1a
- Subtype 1b appears to evade antibodies more easily compared with 1a
- Antibodies are identified that are difficult to escape for both subtypes 1a and 1b
15
Sesta L, Uguzzoni G, Fernandez-de-Cossio-Diaz J, Pagnani A. AMaLa: Analysis of Directed Evolution Experiments via Annealed Mutational Approximated Landscape. Int J Mol Sci 2021; 22:10908. [PMID: 34681569 PMCID: PMC8535593 DOI: 10.3390/ijms222010908] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/28/2021] [Revised: 09/24/2021] [Accepted: 09/27/2021] [Indexed: 01/12/2023]
Abstract
We present Annealed Mutational approximated Landscape (AMaLa), a new method to infer fitness landscapes from sequencing data of Directed Evolution experiments. Such experiments typically start from a single wild-type sequence, which undergoes Darwinian in vitro evolution via multiple rounds of mutation and selection for a target phenotype. In recent years, Directed Evolution has emerged as a powerful instrument to probe fitness landscapes under controlled experimental conditions and, thanks to high-throughput screening and sequencing, as a relevant testing ground to develop accurate statistical models and inference algorithms. Fitness landscape modeling either uses the enrichment of variant abundances as input, thus requiring the observation of the same variants at different rounds, or assumes that the last sequenced round is sampled from an equilibrium distribution. AMaLa aims to effectively leverage the information encoded in the whole time evolution. To do so, while assuming statistical sampling independence between sequenced rounds, the possible trajectories in sequence space are gauged with a time-dependent statistical weight consisting of two contributions: (i) an energy term accounting for the selection process and (ii) a generalized Jukes-Cantor model for the purely mutational step. This simple scheme accurately describes the Directed Evolution dynamics and infers a fitness landscape that correctly reproduces the measures of the phenotype under selection (e.g., antibiotic drug resistance), notably outperforming widely used inference strategies. In addition, we assess the reliability of AMaLa by showing how the inferred statistical model could be used to predict relevant structural properties of the wild-type sequence.
Affiliation(s)
- Luca Sesta
- Politecnico di Torino, Corso Duca degli Abruzzi 24, I-10129 Torino, Italy
- Guido Uguzzoni
- Politecnico di Torino, Corso Duca degli Abruzzi 24, I-10129 Torino, Italy
- Jorge Fernandez-de-Cossio-Diaz
- Laboratory of Physics of the Ecole Normale Supérieure, CNRS UMR 8023 & PSL Research, Sorbonne Université, 24 rue Lhomond, 75005 Paris, France
- Center of Molecular Immunology, Systems Biology Department, Playa, Havana CP 11600, Cuba
- Andrea Pagnani
- Politecnico di Torino, Corso Duca degli Abruzzi 24, I-10129 Torino, Italy
- Italian Institute for Genomic Medicine, IRCCS Candiolo, SP-142, I-10060 Candiolo, Italy
- INFN, Sezione di Torino, I-10125 Torino, Italy
16
Sohail MS, Louie RHY, McKay MR, Barton JP. MPL resolves genetic linkage in fitness inference from complex evolutionary histories. Nat Biotechnol 2021; 39:472-479. [PMID: 33257862 PMCID: PMC8044047 DOI: 10.1038/s41587-020-0737-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 09/16/2019] [Accepted: 10/14/2020] [Indexed: 12/13/2022]
Abstract
Genetic linkage causes the fate of new mutations in a population to be contingent on the genetic background on which they appear. This makes it challenging to identify how individual mutations affect fitness. To overcome this challenge, we developed marginal path likelihood (MPL), a method to infer selection from evolutionary histories that resolves genetic linkage. Validation on real and simulated data sets shows that MPL is fast and accurate, outperforming existing inference approaches. We found that resolving linkage is crucial for accurately quantifying selection in complex evolving populations, which we demonstrate through a quantitative analysis of intrahost HIV-1 evolution using multiple patient data sets. Linkage effects generated by variants that sweep rapidly through the population are particularly strong, extending far across the genome. Taken together, our results argue for the importance of resolving linkage in studies of natural selection.
Affiliation(s)
- Muhammad Saqib Sohail
- Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Hong Kong, China
- Raymond H Y Louie
- Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Hong Kong, China
- Institute for Advanced Study, Hong Kong University of Science and Technology, Hong Kong, China
- The Kirby Institute, University of New South Wales, Sydney, New South Wales, Australia
- School of Medical Sciences, University of New South Wales, Sydney, New South Wales, Australia
- Matthew R McKay
- Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Hong Kong, China.
- Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China.
- John P Barton
- Department of Physics and Astronomy, University of California, Riverside, Riverside, CA, USA.