1
Vila JA. Analysis of proteins in the light of mutations. EUROPEAN BIOPHYSICS JOURNAL : EBJ 2024:10.1007/s00249-024-01714-y. [PMID: 38955858 DOI: 10.1007/s00249-024-01714-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2023] [Revised: 05/23/2024] [Accepted: 06/18/2024] [Indexed: 07/04/2024]
Abstract
Proteins have evolved through mutations (amino acid substitutions) since life appeared on Earth, some 10^9 years ago. The study of these phenomena has been of particular significance because of their impact on protein stability, function, and structure. This study offers a new viewpoint on how the most recent findings in these areas can be used to explore the impact of mutations on protein sequence, stability, and evolvability. Preliminary results indicate that: (1) mutations can be viewed as sensitive probes to identify 'typos' in the amino-acid sequence, and also to assess the resistance of naturally occurring proteins to unwanted sequence alterations; (2) the presence of 'typos' in the amino-acid sequence, rather than being an evolutionary obstacle, could promote faster evolvability and, in turn, increase the likelihood of higher protein stability; (3) the mutation site is far more important than the substituted amino acid in terms of the marginal stability changes of the protein; and (4) the unpredictability of protein evolution at the molecular level (by mutations) exists even in the absence of epistasis effects. Finally, the Darwinian concept of evolution, "descent with modification," and experimental evidence endorse one of the results of this study, which suggests that some regions of any protein sequence are susceptible to mutations while others are not. This work contributes to our general understanding of protein responses to mutations and may spur significant progress in our efforts to develop methods to accurately forecast changes in protein stability, their propensity for metamorphism, and their ability to evolve.
Affiliation(s)
- Jorge A Vila
  - IMASL-CONICET, Universidad Nacional de San Luis, Ejército de los Andes 950, 5700, San Luis, Argentina.
2
Alpay BA, Desai MM. Effects of selection stringency on the outcomes of directed evolution. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.06.09.598029. [PMID: 38895455 PMCID: PMC11185767 DOI: 10.1101/2024.06.09.598029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2024]
Abstract
Directed evolution makes mutant lineages compete in climbing complicated sequence-function landscapes. Given this underlying complexity it is unclear how selection stringency, a ubiquitous parameter of directed evolution, impacts the outcome. Here we approach this question in terms of the fitnesses of the candidate variants at each round and the heterogeneity of their distributions of fitness effects. We show that even if the fittest mutant is most likely to yield the fittest mutants in the next round of selection, diversification can improve outcomes by sampling a larger variety of fitness effects. We find that heterogeneity in fitness effects between variants, larger population sizes, and evolution over a greater number of rounds all encourage diversification.
Affiliation(s)
- Berk A. Alpay
  - Systems, Synthetic, and Quantitative Biology Program, Harvard University, Cambridge, MA, USA
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Michael M. Desai
  - Systems, Synthetic, and Quantitative Biology Program, Harvard University, Cambridge, MA, USA
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
  - Department of Physics, Harvard University, Cambridge, MA, USA
3
Gizzio J, Thakur A, Haldane A, Levy RM. Evolutionary sequence and structural basis for the distinct conformational landscapes of Tyr and Ser/Thr kinases. RESEARCH SQUARE 2024:rs.3.rs-4048991. [PMID: 38746330 PMCID: PMC11092858 DOI: 10.21203/rs.3.rs-4048991/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Protein kinases are molecular machines with rich sequence variation that distinguishes the two main evolutionary branches - tyrosine kinases (TKs) from serine/threonine kinases (STKs). Using a sequence co-variation Potts statistical energy model we previously concluded that TK catalytic domains are more likely than STKs to adopt an inactive conformation with the activation loop in an autoinhibitory "folded" conformation, due to intrinsic sequence effects. Here we investigated the structural basis for this phenomenon by integrating the sequence-based model with structure-based molecular dynamics (MD) to determine the effects of mutations on the free energy difference between active and inactive conformations, using a novel thermodynamic cycle involving many (n=108) protein-mutation free energy perturbation (FEP) simulations in the active and inactive conformations. The sequence and structure-based results are consistent and support the hypothesis that the inactive conformation "DFG-out Activation Loop Folded", is a functional regulatory state that has been stabilized in TKs relative to STKs over the course of their evolution via the accumulation of residue substitutions in the activation loop and catalytic loop that facilitate distinct substrate binding modes in trans and additional modes of regulation in cis for TKs.
Affiliation(s)
- Joan Gizzio
  - Center for Biophysics and Computational Biology, Temple University, Philadelphia, Pennsylvania 19122
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122
- Abhishek Thakur
  - Center for Biophysics and Computational Biology, Temple University, Philadelphia, Pennsylvania 19122
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122
- Allan Haldane
  - Center for Biophysics and Computational Biology, Temple University, Philadelphia, Pennsylvania 19122
  - Department of Physics, Temple University, Philadelphia, Pennsylvania 19122
- Ronald M. Levy
  - Center for Biophysics and Computational Biology, Temple University, Philadelphia, Pennsylvania 19122
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122
4
Gizzio J, Thakur A, Haldane A, Post CB, Levy RM. Evolutionary sequence and structural basis for the distinct conformational landscapes of Tyr and Ser/Thr kinases. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.08.584161. [PMID: 38559238 PMCID: PMC10979876 DOI: 10.1101/2024.03.08.584161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Protein kinases are molecular machines with rich sequence variation that distinguishes the two main evolutionary branches - tyrosine kinases (TKs) from serine/threonine kinases (STKs). Using a sequence co-variation Potts statistical energy model we previously concluded that TK catalytic domains are more likely than STKs to adopt an inactive conformation with the activation loop in an autoinhibitory "folded" conformation, due to intrinsic sequence effects. Here we investigated the structural basis for this phenomenon by integrating the sequence-based model with structure-based molecular dynamics (MD) to determine the effects of mutations on the free energy difference between active and inactive conformations, using a novel thermodynamic cycle involving many (n=108) protein-mutation free energy perturbation (FEP) simulations in the active and inactive conformations. The sequence and structure-based results are consistent and support the hypothesis that the inactive conformation "DFG-out Activation Loop Folded", is a functional regulatory state that has been stabilized in TKs relative to STKs over the course of their evolution via the accumulation of residue substitutions in the activation loop and catalytic loop that facilitate distinct substrate binding modes in trans and additional modes of regulation in cis for TKs.
Affiliation(s)
- Joan Gizzio
  - Center for Biophysics and Computational Biology, Temple University, Philadelphia, Pennsylvania 19122
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122
- Abhishek Thakur
  - Center for Biophysics and Computational Biology, Temple University, Philadelphia, Pennsylvania 19122
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122
- Allan Haldane
  - Center for Biophysics and Computational Biology, Temple University, Philadelphia, Pennsylvania 19122
  - Department of Physics, Temple University, Philadelphia, Pennsylvania 19122
- Carol Beth Post
  - Borch Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, Indiana 47907
- Ronald M. Levy
  - Center for Biophysics and Computational Biology, Temple University, Philadelphia, Pennsylvania 19122
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122
5
Cisneros AF, Nielly-Thibault L, Mallik S, Levy ED, Landry CR. Mutational biases favor complexity increases in protein interaction networks after gene duplication. Mol Syst Biol 2024; 20:549-572. [PMID: 38499674 PMCID: PMC11066126 DOI: 10.1038/s44320-024-00030-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2024] [Revised: 02/27/2024] [Accepted: 02/28/2024] [Indexed: 03/20/2024] Open
Abstract
Biological systems can gain complexity over time. While some of these transitions are likely driven by natural selection, the extent to which they occur without providing an adaptive benefit is unknown. At the molecular level, one example is heteromeric complexes replacing homomeric ones following gene duplication. Here, we build a biophysical model and simulate the evolution of homodimers and heterodimers following gene duplication using distributions of mutational effects inferred from available protein structures. We keep the specific activity of each dimer identical, so their concentrations drift neutrally without new functions. We show that for more than 60% of tested dimer structures, the relative concentration of the heteromer increases over time due to mutational biases that favor the heterodimer. However, allowing mutational effects on synthesis rates and differences in the specific activity of homo- and heterodimers can limit or reverse the observed bias toward heterodimers. Our results show that the accumulation of more complex protein quaternary structures is likely under neutral evolution, and that natural selection would be needed to reverse this tendency.
Affiliation(s)
- Angel F Cisneros
  - Département de biochimie, de microbiologie et de bio-informatique, Faculté des sciences et de génie, Université Laval, G1V 0A6, Québec, Canada
  - Institut de biologie intégrative et des systèmes, Université Laval, G1V 0A6, Québec, Canada
  - PROTEO, Le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, G1V 0A6, Québec, Canada
  - Centre de recherche sur les données massives, Université Laval, G1V 0A6, Québec, Canada
  - Department of Chemical and Structural Biology, Weizmann Institute of Science, 7610001, Rehovot, Israel
- Lou Nielly-Thibault
  - Institut de biologie intégrative et des systèmes, Université Laval, G1V 0A6, Québec, Canada
  - PROTEO, Le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, G1V 0A6, Québec, Canada
  - Centre de recherche sur les données massives, Université Laval, G1V 0A6, Québec, Canada
  - Département de biologie, Faculté des sciences et de génie, Université Laval, G1V 0A6, Québec, Canada
- Saurav Mallik
  - Department of Chemical and Structural Biology, Weizmann Institute of Science, 7610001, Rehovot, Israel
- Emmanuel D Levy
  - Department of Chemical and Structural Biology, Weizmann Institute of Science, 7610001, Rehovot, Israel
- Christian R Landry
  - Département de biochimie, de microbiologie et de bio-informatique, Faculté des sciences et de génie, Université Laval, G1V 0A6, Québec, Canada.
  - Institut de biologie intégrative et des systèmes, Université Laval, G1V 0A6, Québec, Canada.
  - PROTEO, Le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, G1V 0A6, Québec, Canada.
  - Centre de recherche sur les données massives, Université Laval, G1V 0A6, Québec, Canada.
  - Département de biologie, Faculté des sciences et de génie, Université Laval, G1V 0A6, Québec, Canada.
6
Reddy KD, Rasool B, Akher FB, Kutlešić N, Pant S, Boudker O. Evolutionary analysis reveals the origin of sodium coupling in glutamate transporters. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.12.03.569786. [PMID: 38106174 PMCID: PMC10723334 DOI: 10.1101/2023.12.03.569786] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
Secondary active membrane transporters harness the energy of ion gradients to concentrate their substrates. Homologous transporters evolved to couple transport to different ions in response to changing environments and needs. The bases of such diversification, and thus principles of ion coupling, are unexplored. Employing phylogenetics and ancestral protein reconstruction, we investigated sodium-coupled transport in prokaryotic glutamate transporters, a mechanism ubiquitous across life domains and critical to neurotransmitter recycling in humans. We found that the evolutionary transition from sodium-dependent to independent substrate binding to the transporter preceded changes in the coupling mechanism. Structural and functional experiments suggest that the transition entailed allosteric mutations, making sodium binding dispensable without affecting ion-binding sites. Allosteric tuning of transporters' energy landscapes might be a widespread route of their functional diversification.
Affiliation(s)
- Krishna D. Reddy
  - Dept. of Physiology & Biophysics, Weill Cornell Medical College, 1300 York Ave, New York, NY 10021, USA
- Burha Rasool
  - Dept. of Physiology & Biophysics, Weill Cornell Medical College, 1300 York Ave, New York, NY 10021, USA
- Farideh Badichi Akher
  - Dept. of Physiology & Biophysics, Weill Cornell Medical College, 1300 York Ave, New York, NY 10021, USA
- Nemanja Kutlešić
  - Dept. of Physiology & Biophysics, Weill Cornell Medical College, 1300 York Ave, New York, NY 10021, USA
- Swati Pant
  - Dept. of Biochemistry, Weill Cornell Medical College, 1300 York Ave, New York, NY 10021, USA
- Olga Boudker
  - Dept. of Physiology & Biophysics, Weill Cornell Medical College, 1300 York Ave, New York, NY 10021, USA
  - Howard Hughes Medical Institute, Weill Cornell Medical College, 1300 York Ave, New York, NY 10021, USA
7
Park Y, Metzger BP, Thornton JW. The simplicity of protein sequence-function relationships. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.09.02.556057. [PMID: 37732229 PMCID: PMC10508729 DOI: 10.1101/2023.09.02.556057] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 09/22/2023]
Abstract
How complicated is the genetic architecture of proteins - the set of causal effects by which sequence determines function? High-order epistatic interactions among residues are thought to be pervasive, making a protein's function difficult to predict or understand from its sequence. Most studies, however, used methods that overestimate epistasis, because they analyze genetic architecture relative to a designated reference sequence - causing measurement noise and small local idiosyncrasies to propagate into pervasive high-order interactions - or have not effectively accounted for global nonlinearity in the sequence-function relationship. Here we present a new reference-free method that jointly estimates global nonlinearity and specific epistatic interactions across a protein's entire genotype-phenotype map. This method yields a maximally efficient explanation of a protein's genetic architecture and is more robust than existing methods to measurement noise, partial sampling, and model misspecification. We reanalyze 20 combinatorial mutagenesis experiments from a diverse set of proteins and find that additive and pairwise effects, along with a simple nonlinearity to account for limited dynamic range, explain a median of 96% of total variance in measured phenotypes (and >92% in every case). Only a tiny fraction of genotypes are strongly affected by third- or higher-order epistasis. Genetic architecture is also sparse: the number of terms required to explain the vast majority of variance is smaller than the number of genotypes by many orders of magnitude. The sequence-function relationship in most proteins is therefore far simpler than previously thought, opening the way for new and tractable approaches to characterize it.
Affiliation(s)
- Yeonwoo Park
  - Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL 60637
  - Current affiliation: Center for RNA Research, Institute for Basic Science, Seoul, Republic of Korea 08826
- Brian P.H. Metzger
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637
  - Current affiliation: Department of Biological Sciences, Purdue University, West Lafayette, IN 47907
- Joseph W. Thornton
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637
  - Department of Human Genetics, University of Chicago, Chicago, IL 60637
8
Dangat Y, Freindorf M, Kraka E. Mechanistic Insights into S-Depalmitolyse Activity of Cln5 Protein Linked to Neurodegeneration and Batten Disease: A QM/MM Study. J Am Chem Soc 2024; 146:145-158. [PMID: 38055807 DOI: 10.1021/jacs.3c06397] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/08/2023]
Abstract
Ceroid lipofuscinosis neuronal protein 5 (Cln5) is encoded by the CLN5 gene. The genetic variants of this gene are associated with the CLN5 form of Batten disease. Recently, the first crystal structure of Cln5 was reported. Cln5 shows cysteine palmitoyl thioesterase S-depalmitoylation activity, which was explored via fluorescent emission spectroscopy utilizing the fluorescent probe DDP-5. In this work, the mechanism of the reaction between Cln5 and DDP-5 was studied computationally by applying a QM/MM methodology at the ωB97X-D/6-31G(d,p):AMBER level. The results of our study clearly demonstrate the critical role of the catalytic triad Cys280-His166-Glu183 in S-depalmitoylation activity. This is evidenced through a comparison of the pathways catalyzed by the Cys280-His166-Glu183 triad and those with only Cys280 involved. The computed reaction barriers are in agreement with the catalytic efficiency. The calculated Gibbs free-energy profile suggests that S-depalmitoylation is a rate-limiting step compared to the preceding S-palmitoylation, with barriers of 26.1 and 25.3 kcal/mol, respectively. The energetics were complemented by monitoring the fluctuations in the electron density distribution through NBO charges and bond strength alterations via local mode stretching force constants during the catalytic pathways. This comprehensive protocol led to a more holistic picture of the reaction mechanism at the atomic level. It forms the foundation for future studies on the effects of gene mutations on both the S-palmitoylation and S-depalmitoylation steps, providing valuable data for the further development of enzyme replacement therapy, which is currently the only FDA-approved therapy for childhood neurodegenerative diseases, including Batten disease.
Affiliation(s)
- Yuvraj Dangat
  - Department of Chemistry, Southern Methodist University, 3215 Daniel Avenue, Dallas, Texas 75275-0314, United States
- Marek Freindorf
  - Department of Chemistry, Southern Methodist University, 3215 Daniel Avenue, Dallas, Texas 75275-0314, United States
- Elfi Kraka
  - Department of Chemistry, Southern Methodist University, 3215 Daniel Avenue, Dallas, Texas 75275-0314, United States
9
Vila JA. Protein folding rate evolution upon mutations. Biophys Rev 2023; 15:661-669. [PMID: 37681091 PMCID: PMC10480377 DOI: 10.1007/s12551-023-01088-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2023] [Accepted: 06/24/2023] [Indexed: 09/09/2023] Open
Abstract
Despite the spectacular success of cutting-edge protein fold prediction methods, many critical questions remain unanswered, including why proteins can reach their native state in a biologically reasonable time. A satisfactory answer to this simple question could shed light on the slowest folding rate of proteins as well as how mutations (amino-acid substitutions and/or post-translational modifications) might affect it. Preliminary results indicate that (i) Anfinsen's dogma validity ensures that proteins reach their native state on a reasonable timescale regardless of their sequence or length, and (ii) it is feasible to determine the evolution of protein folding rates without accounting for epistasis effects or the mutational trajectories between the starting and target sequences. These results have direct implications for evolutionary biology because they lay the groundwork for a better understanding of why, and to what extent, mutations (a crucial element of evolution and a factor influencing it) affect protein evolvability. Furthermore, they may spur significant progress in our efforts to solve crucial structural biology problems, such as how a sequence encodes its folding.
Affiliation(s)
- Jorge A. Vila
  - IMASL-CONICET, Universidad Nacional de San Luis, Ejército de Los Andes 950, 5700 San Luis, Argentina
10
Harman JL, Reardon PN, Costello SM, Warren GD, Phillips SR, Connor PJ, Marqusee S, Harms MJ. Evolution avoids a pathological stabilizing interaction in the immune protein S100A9. Proc Natl Acad Sci U S A 2022; 119:e2208029119. [PMID: 36194634 PMCID: PMC9565474 DOI: 10.1073/pnas.2208029119] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/09/2022] [Accepted: 09/07/2022] [Indexed: 01/03/2023] Open
Abstract
Stability constrains evolution. While much is known about constraints on destabilizing mutations, less is known about the constraints on stabilizing mutations. We recently identified a mutation in the innate immune protein S100A9 that provides insight into such constraints. When introduced into human S100A9, M63F simultaneously increases the stability of the protein and disrupts its natural ability to activate Toll-like receptor 4. Using chemical denaturation, we found that M63F stabilizes a calcium-bound conformation of hS100A9. We then used NMR to solve the structure of the mutant protein, revealing that the mutation distorts the hydrophobic binding surface of hS100A9, explaining its deleterious effect on function. Hydrogen-deuterium exchange (HDX) experiments revealed stabilization of the region around M63F in the structure, notably Phe37. In the structure of the M63F mutant, the Phe37 and Phe63 sidechains are in contact, plausibly forming an edge-face π-stack. Mutating Phe37 to Leu abolished the stabilizing effect of M63F as probed by both chemical denaturation and HDX. It also restored the biological activity of S100A9 disrupted by M63F. These findings reveal that Phe63 creates a molecular staple with Phe37 that stabilizes a nonfunctional conformation of the protein, thus disrupting function. Using a bioinformatic analysis, we found that S100A9 proteins from different organisms rarely have Phe at both positions 37 and 63, suggesting that avoiding a pathological stabilizing interaction indeed constrains S100A9 evolution. This work highlights an important evolutionary constraint on stabilizing mutations, namely, that they must avoid inappropriately stabilizing nonfunctional protein conformations.
Affiliation(s)
- Joseph L Harman
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR 97403
  - Institute of Molecular Biology, University of Oregon, Eugene, OR 97403
- Patrick N Reardon
  - College of Science, NMR Facility, Oregon State University, Corvallis, OR 97331
- Shawn M Costello
  - Biophysics Graduate Program, University of California, Berkeley, Berkeley, CA 94720
- Gus D Warren
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR 97403
  - Institute of Molecular Biology, University of Oregon, Eugene, OR 97403
- Sophia R Phillips
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR 97403
  - Institute of Molecular Biology, University of Oregon, Eugene, OR 97403
- Patrick J Connor
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR 97403
  - Institute of Molecular Biology, University of Oregon, Eugene, OR 97403
- Susan Marqusee
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720
  - Department of Chemistry, University of California, Berkeley, Berkeley, CA 94720
  - California Institute for Quantitative Biosciences, University of California, Berkeley, Berkeley, CA 94720
- Michael J Harms
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR 97403
  - Institute of Molecular Biology, University of Oregon, Eugene, OR 97403
11
Kim I, Dubrow A, Zuniga B, Zhao B, Sherer N, Bastiray A, Li P, Cho JH. Energy landscape reshaped by strain-specific mutations underlies epistasis in NS1 evolution of influenza A virus. Nat Commun 2022; 13:5775. [PMID: 36182933 PMCID: PMC9526705 DOI: 10.1038/s41467-022-33554-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2022] [Accepted: 09/22/2022] [Indexed: 11/24/2022] Open
Abstract
Elucidating how individual mutations affect the protein energy landscape is crucial for understanding how proteins evolve. However, predicting mutational effects remains challenging because of epistasis, the nonadditive interactions between mutations. Here, we investigate the biophysical mechanism of strain-specific epistasis in the nonstructural protein 1 (NS1) of influenza A viruses (IAVs). We integrate structural, kinetic, thermodynamic, and conformational dynamics analyses of four NS1s of influenza strains that emerged between 1918 and 2004. Although functionally near-neutral, strain-specific NS1 mutations exhibit long-range epistatic interactions with residues at the p85β-binding interface. We reveal that strain-specific mutations reshaped the NS1 energy landscape during evolution. Using NMR spin dynamics, we find that the strain-specific mutations altered the conformational dynamics of the hidden network of tightly packed residues, underlying the evolution of long-range epistasis. This work shows how near-neutral mutations silently alter the biophysical energy landscapes, resulting in diverse background effects during molecular evolution. Influenza A virus (IAV) nonstructural protein 1 (NS1) is a multifunctional virulence factor that interacts with several host factors such as phosphatidylinositol-3-kinase (PI3K). NS1 binds specifically to the p85β regulatory subunit of PI3K and subsequently activates PI3K signaling. Here, Kim et al. show that functionally near-neutral, strain-specific NS1 mutations lead to variations in binding kinetics to p85β and exhibit long-range epistatic interactions. Applying NMR, they provide evidence that the structural dynamics of the NS1 hydrophobic core have evolved over time and contributed to epistasis.
Affiliation(s)
- Iktae Kim
  - Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, 77843, USA
- Alyssa Dubrow
  - Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, 77843, USA
- Bryan Zuniga
  - Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, 77843, USA
- Baoyu Zhao
  - Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, 77843, USA
- Noah Sherer
  - Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, 77843, USA
- Abhishek Bastiray
  - Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, 77843, USA
- Pingwei Li
  - Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, 77843, USA
- Jae-Hyun Cho
  - Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, 77843, USA.
12
Smith CE, Smith ANH, Cooper TF, Moore FBG. Fitness of evolving bacterial populations is contingent on deep and shallow history but only shallow history creates predictable patterns. Proc Biol Sci 2022; 289:20221292. [PMID: 36100026 PMCID: PMC9470251 DOI: 10.1098/rspb.2022.1292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
Long-term evolution experiments have tested the importance of genetic and environmental factors in influencing evolutionary outcomes. Differences in phylogenetic history, recent adaptation to distinct environments and chance events, all influence the fitness of a population. However, the interplay of these factors on a population's evolutionary potential remains relatively unexplored. We tracked the outcome of 2000 generations of evolution of four natural isolates of Escherichia coli bacteria that were engineered to also create differences in shallow history by adding previously identified mutations selected in a separate long-term experiment. Replicate populations started from each progenitor evolved in four environments. We found that deep and shallow phylogenetic histories both contributed significantly to differences in evolved fitness, though by different amounts in different selection environments. With one exception, chance effects were not significant. Whereas the effect of deep history did not follow any detectable pattern, effects of shallow history followed a pattern of diminishing returns whereby fitter ancestors had smaller fitness increases. These results are consistent with adaptive evolution being contingent on the interaction of several evolutionary forces but demonstrate that the nature of these interactions is not fixed and may not be predictable even when the role of chance is small.
Affiliation(s)
- Chelsea E Smith
  - Department of Biological Sciences, Kent State University, Kent, OH 44242, USA
- Adam N H Smith
  - School of Mathematical and Computational Sciences, Massey University, Auckland 0634, New Zealand
- Tim F Cooper
  - School of Natural Sciences, Massey University, Auckland 0634, New Zealand
- Francisco B-G Moore
  - Department of Biological Sciences, Kent State University, Kent, OH 44242, USA
  - Department of Biology, University of Akron, Akron, OH 44325, USA
13
Three-dimensional structure-guided evolution of a ribosome with tethered subunits. Nat Chem Biol 2022; 18:990-998. [PMID: 35836020 PMCID: PMC9815830 DOI: 10.1038/s41589-022-01064-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 01/22/2021] [Accepted: 05/17/2022] [Indexed: 01/11/2023]
Abstract
RNA-based macromolecular machines, such as the ribosome, have functional parts reliant on structural interactions spanning sequence-distant regions. These features limit evolutionary exploration of mutant libraries and confound three-dimensional structure-guided design. To address these challenges, we describe Evolink (evolution and linkage), a method that enables high-throughput evolution of sequence-distant regions in large macromolecular machines, and library design guided by computational RNA modeling to enable exploration of structurally stable designs. Using Evolink, we evolved a tethered ribosome with a 58% increased activity in orthogonal protein translation and a 97% improvement in doubling times in SQ171 cells compared to a previously developed tethered ribosome, and reveal new permissible sequences in a pair of ribosomal helices with previously explored biological function. The Evolink approach may enable enhanced engineering of macromolecular machines for new and improved functions for synthetic biology.
14
Park Y, Metzger BPH, Thornton JW. Epistatic drift causes gradual decay of predictability in protein evolution. Science 2022; 376:823-830. [PMID: 35587978 DOI: 10.1126/science.abn6895] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Indexed: 12/11/2022]
Abstract
Epistatic interactions can make the outcomes of evolution unpredictable, but no comprehensive data are available on the extent and temporal dynamics of changes in the effects of mutations as protein sequences evolve. Here, we use phylogenetic deep mutational scanning to measure the functional effect of every possible amino acid mutation in a series of ancestral and extant steroid receptor DNA binding domains. Across 700 million years of evolution, epistatic interactions caused the effects of most mutations to become decorrelated from their initial effects and their windows of evolutionary accessibility to open and close transiently. Most effects changed gradually and without bias at rates that were largely constant across time, indicating a neutral process caused by many weak epistatic interactions. Our findings show that protein sequences drift inexorably into contingency and unpredictability, but that the process is statistically predictable, given sufficient phylogenetic and experimental data.
Affiliation(s)
- Yeonwoo Park: Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL, USA
- Brian P H Metzger: Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
- Joseph W Thornton: Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL, USA; Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA; Department of Human Genetics, University of Chicago, Chicago, IL, USA
15
Vila JA. Proteins' Evolution upon Point Mutations. ACS OMEGA 2022; 7:14371-14376. [PMID: 35573218 PMCID: PMC9089682 DOI: 10.1021/acsomega.2c01407] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/09/2022] [Accepted: 04/05/2022] [Indexed: 05/03/2023]
Abstract
As the reader must be already aware, state-of-the-art protein folding prediction methods have reached a smashing success in their goal of accurately determining the three-dimensional structures of proteins. Yet, a solution to simple problems such as the effects of protein point mutations on their (i) native conformation; (ii) marginal stability; (iii) ensemble of high-energy nativelike conformations; and (iv) metamorphism propensity and, hence, their evolvability, remains as an unsolved problem. As a plausible solution to the latter, some properties of the amide hydrogen-deuterium exchange, a highly sensitive probe of the structure, stability, and folding of proteins, are assessed from a new perspective. The preliminary results indicate that the protein marginal stability change upon point mutations provides the necessary and sufficient information to estimate, through a Boltzmann factor, the evolution of the amide hydrogen exchange protection factors and, consequently, that of the ensemble of folded conformations coexisting with the native state. This work contributes to our general understanding of the effects of point mutations on proteins and may spur significant progress in our efforts to develop methods to determine the appearance of new folds and functions accurately.
16
Abstract
Modern evolutionary theory gives a detailed quantitative description of microevolutionary processes that occur within evolving populations of organisms, but evolutionary transitions and emergence of multiple levels of complexity remain poorly understood. Here, we establish the correspondence among the key features of evolution, learning dynamics, and renormalizability of physical theories to outline a theory of evolution that strives to incorporate all evolutionary processes within a unified mathematical framework of the theory of learning. According to this theory, for example, replication of genetic material and natural selection readily emerge from the learning dynamics, and in sufficiently complex systems, the same learning phenomena occur on multiple levels or on different scales, similar to the case of renormalizable physical theories. We apply the theory of learning to physically renormalizable systems in an attempt to outline a theory of biological evolution, including the origin of life, as multilevel learning. We formulate seven fundamental principles of evolution that appear to be necessary and sufficient to render a universe observable and show that they entail the major features of biological evolution, including replication and natural selection. It is shown that these cornerstone phenomena of biology emerge from the fundamental features of learning dynamics such as the existence of a loss function, which is minimized during learning. We then sketch the theory of evolution using the mathematical framework of neural networks, which provides for detailed analysis of evolutionary phenomena. To demonstrate the potential of the proposed theoretical framework, we derive a generalized version of the Central Dogma of molecular biology by analyzing the flow of information during learning (back propagation) and predicting (forward propagation) the environment by evolving organisms. 
The more complex evolutionary phenomena, such as major transitions in evolution (in particular, the origin of life), have to be analyzed in the thermodynamic limit, which is described in detail in the paper by Vanchurin et al. [V. Vanchurin, Y. I. Wolf, E. V. Koonin, M. I. Katsnelson, Proc. Natl. Acad. Sci. U.S.A. 119, 10.1073/pnas.2120042119 (2022)].
17
Ogbunugafor CB. The mutation effect reaction norm (mu-rn) highlights environmentally dependent mutation effects and epistatic interactions. Evolution 2022; 76:37-48. [PMID: 34989399 DOI: 10.1111/evo.14428] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 09/23/2021] [Accepted: 12/23/2021] [Indexed: 11/27/2022]
Abstract
Since the modern synthesis, the fitness effects of mutations and epistasis have been central yet provocative concepts in evolutionary and population genetics. Studies of how the interactions between parcels of genetic information can change as a function of environmental context have added a layer of complexity to these discussions. Here I introduce the "mutation effect reaction norm" (Mu-RN), a new instrument through which one can analyze the phenotypic consequences of mutations and interactions across environmental contexts. It embodies the fusion of measurements of genetic interactions with the reaction norm, a classic depiction of the performance of genotypes across environments. I demonstrate the utility of the Mu-RN through the signature of a "compensatory ratchet" mutation that undermines reverse evolution of antimicrobial resistance. More broadly, I argue that the mutation effect reaction norm may help us resolve the dynamism and unpredictability of evolution, with implications for theoretical biology, genetic modification technology, and public health.
Affiliation(s)
- C Brandon Ogbunugafor: Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, 06520, USA
18
Mokhtari DA, Appel MJ, Fordyce PM, Herschlag D. High throughput and quantitative enzymology in the genomic era. Curr Opin Struct Biol 2021; 71:259-273. [PMID: 34592682 PMCID: PMC8648990 DOI: 10.1016/j.sbi.2021.07.010] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 05/07/2021] [Accepted: 07/23/2021] [Indexed: 12/28/2022]
Abstract
Accurate predictions from models based on physical principles are the ultimate metric of our biophysical understanding. Although there has been stunning progress toward structure prediction, quantitative prediction of enzyme function has remained challenging. Realizing this goal will require large numbers of quantitative measurements of rate and binding constants and the use of these ground-truth data sets to guide the development and testing of these quantitative models. Ground truth data more closely linked to the underlying physical forces are also desired. Here, we describe technological advances that enable both types of ground truth measurements. These advances allow classic models to be tested, provide novel mechanistic insights, and place us on the path toward a predictive understanding of enzyme structure and function.
Affiliation(s)
- D A Mokhtari: Department of Biochemistry, Stanford University, Stanford, CA, 94305, USA
- M J Appel: Department of Biochemistry, Stanford University, Stanford, CA, 94305, USA
- P M Fordyce: Department of Bioengineering, Stanford University, Stanford, CA, 94305, USA; ChEM-H Institute, Stanford University, Stanford, CA, 94305, USA; Department of Genetics, Stanford University, Stanford, CA, 94305, USA; Chan Zuckerberg Biohub San Francisco, CA, 94110, USA
- D Herschlag: Department of Biochemistry, Stanford University, Stanford, CA, 94305, USA; Department of Chemical Engineering, Stanford University, Stanford, CA, 94305, USA; ChEM-H Institute, Stanford University, Stanford, CA, 94305, USA
19
Castiglione GM, Zhou L, Xu Z, Neiman Z, Hung CF, Duh EJ. Evolutionary pathways to SARS-CoV-2 resistance are opened and closed by epistasis acting on ACE2. PLoS Biol 2021; 19:e3001510. [PMID: 34932561 PMCID: PMC8730403 DOI: 10.1371/journal.pbio.3001510] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 02/25/2021] [Revised: 01/05/2022] [Accepted: 12/08/2021] [Indexed: 02/06/2023]
Abstract
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) infects a broader range of mammalian species than previously predicted, binding a diversity of angiotensin converting enzyme 2 (ACE2) orthologs despite extensive sequence divergence. Within this sequence degeneracy, we identify a rare sequence combination capable of conferring SARS-CoV-2 resistance. We demonstrate that this sequence was likely unattainable during human evolution due to deleterious effects on ACE2 carboxypeptidase activity, which has vasodilatory and cardioprotective functions in vivo. Across the 25 ACE2 sites implicated in viral binding, we identify 6 amino acid substitutions unique to mouse, one of the only known mammalian species resistant to SARS-CoV-2. Substituting human variants at these positions is sufficient to confer binding of the SARS-CoV-2 S protein to mouse ACE2, facilitating cellular infection. Conversely, substituting mouse variants into either human or dog ACE2 abolishes viral binding, diminishing cellular infection. However, these same substitutions decrease human ACE2 activity by 50% and are predicted as pathogenic, consistent with the extreme rarity of human polymorphisms at these sites. This trade-off can be avoided, however, depending on genetic background; if substituted simultaneously, these same mutations have no deleterious effect on dog ACE2 nor that of the rodent ancestor estimated to exist 70 million years ago. This genetic contingency (epistasis) may have therefore opened the road to resistance for some species, while making humans susceptible to viruses that use these ACE2 surfaces for binding, as does SARS-CoV-2.
Affiliation(s)
- Gianni M. Castiglione: Department of Ophthalmology, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- Lingli Zhou: Department of Ophthalmology, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- Zhenhua Xu: Department of Ophthalmology, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- Zachary Neiman: Department of Ophthalmology, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- Chien-Fu Hung: Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- Elia J. Duh: Department of Ophthalmology, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
20
The generative capacity of probabilistic protein sequence models. Nat Commun 2021; 12:6302. [PMID: 34728624 PMCID: PMC8563988 DOI: 10.1038/s41467-021-26529-9] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 01/11/2021] [Accepted: 09/23/2021] [Indexed: 01/10/2023]
Abstract
Potts models and variational autoencoders (VAEs) have recently gained popularity as generative protein sequence models (GPSMs) to explore fitness landscapes and predict mutation effects. Despite encouraging results, current model evaluation metrics leave unclear whether GPSMs faithfully reproduce the complex multi-residue mutational patterns observed in natural sequences due to epistasis. Here, we develop a set of sequence statistics to assess the “generative capacity” of three current GPSMs: the pairwise Potts Hamiltonian, the VAE, and the site-independent model. We show that the Potts model’s generative capacity is largest, as the higher-order mutational statistics generated by the model agree with those observed for natural sequences, while the VAE’s lies between the Potts and site-independent models. Importantly, our work provides a new framework for evaluating and interpreting GPSM accuracy which emphasizes the role of higher-order covariation and epistasis, with broader implications for probabilistic sequence models in general. Generative models have become increasingly popular in protein design, yet rigorous metrics that allow the comparison of these models are lacking. Here, the authors propose a set of such metrics and use them to compare three popular models.
21
Morrison AJ, Wonderlick DR, Harms MJ. Ensemble epistasis: thermodynamic origins of nonadditivity between mutations. Genetics 2021; 219:iyab105. [PMID: 34849909 PMCID: PMC8633102 DOI: 10.1093/genetics/iyab105] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 04/06/2021] [Accepted: 06/19/2021] [Indexed: 01/02/2023]
Abstract
Epistasis, when mutations combine nonadditively, is a profoundly important aspect of biology. It is often difficult to understand its mechanistic origins. Here, we show that epistasis can arise from the thermodynamic ensemble, or the set of interchanging conformations a protein adopts. Ensemble epistasis occurs because mutations can have different effects on different conformations of the same protein, leading to nonadditive effects on its average, observable properties. Using a simple analytical model, we found that ensemble epistasis arises when two conditions are met: (1) a protein populates at least three conformations and (2) mutations have differential effects on at least two conformations. To explore the relative magnitude of ensemble epistasis, we performed a virtual deep-mutational scan of the allosteric Ca²⁺ signaling protein S100A4. We found that 47% of mutation pairs exhibited ensemble epistasis with a magnitude on the order of thermal fluctuations. We observed many forms of epistasis: magnitude, sign, and reciprocal sign epistasis. The same mutation pair could even exhibit different forms of epistasis under different environmental conditions. The ubiquity of thermodynamic ensembles in biology and the pervasiveness of ensemble epistasis in our dataset suggest that it may be a common mechanism of epistasis in proteins and other macromolecules.
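The two conditions named in this abstract can be illustrated with a small numerical sketch. This is not the authors' code or their exact model: the observable (the Boltzmann-weighted free energy of conformations 1..n relative to conformation 0), the energies, the mutational effects, and the function names are all assumptions chosen for illustration. Mutations are treated as strictly additive within each conformation, so any nonadditivity in the observable comes from the ensemble average alone.

```python
import math

RT = 0.593  # kcal/mol near 298 K


def observable(energies):
    """Apparent free energy of conformations 1..n relative to conformation 0."""
    boltzmann_sum = sum(math.exp(-g / RT) for g in energies[1:])
    return -RT * math.log(boltzmann_sum) - energies[0]


def shift(energies, effect):
    """Apply a mutation as an additive energy change to each conformation."""
    return [g + d for g, d in zip(energies, effect)]


def epistasis(wild_type, effect_a, effect_b):
    """Deviation of the double mutant from the sum of the single-mutant effects."""
    a = shift(wild_type, effect_a)
    b = shift(wild_type, effect_b)
    ab = shift(a, effect_b)
    return observable(ab) - observable(a) - observable(b) + observable(wild_type)


# Two conformations: the observable is linear in the energies, so no epistasis.
two_state = epistasis([0.0, 1.0], [0.5, 1.0], [0.2, -0.3])

# Three conformations, each mutation destabilizing a different one: epistasis appears.
three_state = epistasis([0.0, 1.0, 1.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0])

# Three conformations, but mutation A shifts conformations 1 and 2 equally: none.
uniform = epistasis([0.0, 1.0, 1.0], [0.0, 1.0, 1.0], [0.0, 0.0, 1.0])

print(two_state, three_state, uniform)
```

With these toy numbers, only the three-conformation case with differential effects gives a value clearly different from zero, which matches the qualitative statement in the abstract.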
Affiliation(s)
- Anneliese J Morrison: Institute of Molecular Biology, University of Oregon, Eugene, OR 97403, USA; Department of Chemistry and Biochemistry, University of Oregon, Eugene OR 97403, USA
- Daria R Wonderlick: Institute of Molecular Biology, University of Oregon, Eugene, OR 97403, USA; Department of Chemistry and Biochemistry, University of Oregon, Eugene OR 97403, USA
- Michael J Harms: Institute of Molecular Biology, University of Oregon, Eugene, OR 97403, USA; Department of Chemistry and Biochemistry, University of Oregon, Eugene OR 97403, USA
22
Ginsberg SD, Neubert TA, Sharma S, Digwal CS, Yan P, Timbus C, Wang T, Chiosis G. Disease-specific interactome alterations via epichaperomics: the case for Alzheimer's disease. FEBS J 2021; 289:2047-2066. [PMID: 34028172 DOI: 10.1111/febs.16031] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/18/2021] [Revised: 04/23/2021] [Accepted: 05/20/2021] [Indexed: 12/22/2022]
Abstract
The increasingly appreciated prevalence of complicated stressor-to-phenotype associations in human disease requires a greater understanding of how specific stressors affect systems or interactome properties. Many currently untreatable diseases arise due to variations in, and through a combination of, multiple stressors of genetic, epigenetic, and environmental nature. Unfortunately, how such stressors lead to a specific disease phenotype or inflict a vulnerability to some cells and tissues but not others remains largely unknown and unsatisfactorily addressed. Analysis of cell- and tissue-specific interactome networks may shed light on organization of biological systems and subsequently to disease vulnerabilities. However, deriving human interactomes across different cell and disease contexts remains a challenge. To this end, this opinion article links stressor-induced protein interactome network perturbations to the formation of pathologic scaffolds termed epichaperomes, revealing a viable and reproducible experimental solution to obtaining rigorous context-dependent interactomes. This article presents our views on how a specialized 'omics platform called epichaperomics may complement and enhance the currently available conventional approaches and aid the scientific community in defining, understanding, and ultimately controlling interactome networks of complex diseases such as Alzheimer's disease. Ultimately, this approach may aid the transition from a limited single-alteration perspective in disease to a comprehensive network-based mindset, which we posit will result in precision medicine paradigms for disease diagnosis and treatment.
Affiliation(s)
- Stephen D Ginsberg: Center for Dementia Research, Nathan Kline Institute, Orangeburg, NY, USA; Departments of Psychiatry, Neuroscience & Physiology, The NYU Neuroscience Institute, New York University Grossman School of Medicine, NY, USA
- Thomas A Neubert: Kimmel Center for Biology and Medicine at the Skirball Institute, NYU School of Medicine, New York, NY, USA
- Sahil Sharma: Program in Chemical Biology, Sloan Kettering Institute, New York, NY, USA
- Chander S Digwal: Program in Chemical Biology, Sloan Kettering Institute, New York, NY, USA
- Pengrong Yan: Program in Chemical Biology, Sloan Kettering Institute, New York, NY, USA
- Calin Timbus: Department of Mathematics, Technical University of Cluj-Napoca, CJ, Romania
- Tai Wang: Program in Chemical Biology, Sloan Kettering Institute, New York, NY, USA
- Gabriela Chiosis: Program in Chemical Biology, Sloan Kettering Institute, New York, NY, USA; Breast Cancer Medicine Service, Memorial Sloan Kettering Cancer Center, New York, NY, USA
23
Selection for cooperativity causes epistasis predominately between native contacts and enables epistasis-based structure reconstruction. Proc Natl Acad Sci U S A 2021; 118:2010057118. [PMID: 33879570 DOI: 10.1073/pnas.2010057118] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/18/2022]
Abstract
Epistasis and cooperativity of folding both result from networks of energetic interactions in proteins. Epistasis results from energetic interactions among mutants, whereas cooperativity results from energetic interactions during folding that reduce the presence of intermediate states. The two concepts seem intuitively related, but it is unknown how they are related, particularly in terms of selection. To investigate their relationship, we simulated protein evolution under selection for cooperativity and separately under selection for epistasis. Strong selection for cooperativity created strong epistasis between contacts in the native structure but weakened epistasis between nonnative contacts. In contrast, selection for epistasis increased epistasis in both native and nonnative contacts and reduced cooperativity. Because epistasis can be used to predict protein structure only if it preferentially occurs in native contacts, this result indicates that selection for cooperativity may be key for predicting structure using epistasis. To evaluate this inference, we simulated the evolution of guanine nucleotide-binding protein (GB1) with and without cooperativity. With cooperativity, strong epistatic interactions clearly map out the native GB1 structure, while allowing the presence of intermediate states (low cooperativity) obscured the structure. This indicates that using epistasis measurements to reconstruct protein structure may be inappropriate for proteins with stable intermediates.
24
Reiskind MOB, Moody ML, Bolnick DI, Hanifin CT, Farrior CE. Nothing in Evolution Makes Sense Except in the Light of Biology. Bioscience 2021; 71:370-382. [PMID: 33867868 PMCID: PMC8038875 DOI: 10.1093/biosci/biaa170] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/29/2022]
Abstract
A key question in biology is the predictability of the evolutionary process. If we can correctly predict the outcome of evolution, we may be better equipped to anticipate and manage species' adaptation to climate change, habitat loss, invasive species, or emerging infectious diseases, as well as improve our basic understanding of the history of life on Earth. In the present article, we ask the questions when, why, and if the outcome of future evolution is predictable. We first define predictable and then discuss two conflicting views: that evolution is inherently unpredictable and that evolution is predictable given the ability to collect the right data. We identify factors that generate unpredictability, the data that might be required to make predictions at some level of precision or at a specific timescale, and the intellectual and translational value of understanding when prediction is or is not possible.
Affiliation(s)
- Martha O Burford Reiskind: Department of Biological Sciences and the director of the Genetic and Genomic Scholars graduate program, North Carolina State University, Raleigh, North Carolina, United States
- Michael L Moody: Department of Biological Sciences and director of Herbarium UTEP, University of Texas, El Paso, El Paso, Texas, United States
- Daniel I Bolnick: University of Connecticut, Mansfield, Connecticut, United States, and editor-in-chief of The American Naturalist, Chicago, Illinois, United States
- Caroline E Farrior: University of Texas at Austin, Austin, Texas, United States
The author order was determined by a random number generator.
25
Kryazhimskiy S. Emergence and propagation of epistasis in metabolic networks. eLife 2021; 10:e60200. [PMID: 33527897 PMCID: PMC7924954 DOI: 10.7554/elife.60200] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 06/19/2020] [Accepted: 02/01/2021] [Indexed: 12/11/2022]
Abstract
Epistasis is often used to probe functional relationships between genes, and it plays an important role in evolution. However, we lack theory to understand how functional relationships at the molecular level translate into epistasis at the level of whole-organism phenotypes, such as fitness. Here, I derive two rules for how epistasis between mutations with small effects propagates from lower- to higher-level phenotypes in a hierarchical metabolic network with first-order kinetics and how such epistasis depends on topology. Most importantly, weak epistasis at a lower level may be distorted as it propagates to higher levels. Computational analyses show that epistasis in more realistic models likely follows similar, albeit more complex, patterns. These results suggest that pairwise inter-gene epistasis should be common, and it should generically depend on the genetic background and environment. Furthermore, the epistasis coefficients measured for high-level phenotypes may not be sufficient to fully infer the underlying functional relationships.
Affiliation(s)
- Sergey Kryazhimskiy: Division of Biological Sciences, University of California, San Diego, La Jolla, United States
26
Abstract
Living systems evolve one mutation at a time, but a single mutation can alter the effect of subsequent mutations. The underlying mechanistic determinants of such epistasis are unclear. Here, we demonstrate that the physical dynamics of a biological system can generically constrain epistasis. We analyze models and experimental data on proteins and regulatory networks. In each, we find that if the long-time physical dynamics is dominated by a slow, collective mode, then the dimensionality of mutational effects is reduced. Consequently, epistatic coefficients for different combinations of mutations are no longer independent, even if individually strong. Such epistasis can be summarized as resulting from a global nonlinearity applied to an underlying linear trait, that is, as global epistasis. This constraint, in turn, reduces the ruggedness of the sequence-to-function map. By providing a generic mechanistic origin for experimentally observed global epistasis, our work suggests that slow collective physical modes can make biological systems evolvable.
Affiliation(s)
- Kabir Husain: Department of Physics, University of Chicago, Chicago, IL
- Arvind Murugan: Department of Physics, University of Chicago, Chicago, IL
27
Sailer ZR, Shafik SH, Summers RL, Joule A, Patterson-Robert A, Martin RE, Harms MJ. Inferring a complete genotype-phenotype map from a small number of measured phenotypes. PLoS Comput Biol 2020; 16:e1008243. [PMID: 32991585 PMCID: PMC7546491 DOI: 10.1371/journal.pcbi.1008243] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/05/2019] [Revised: 10/09/2020] [Accepted: 08/13/2020] [Indexed: 01/02/2023]
Abstract
Understanding evolution requires detailed knowledge of genotype-phenotype maps; however, it can be a herculean task to measure every phenotype in a combinatorial map. We have developed a computational strategy to predict the missing phenotypes from an incomplete, combinatorial genotype-phenotype map. As a test case, we used an incomplete genotype-phenotype dataset previously generated for the malaria parasite’s ‘chloroquine resistance transporter’ (PfCRT). Wild-type PfCRT (PfCRT3D7) lacks significant chloroquine (CQ) transport activity, but the introduction of the eight mutations present in the ‘Dd2’ isoform of PfCRT (PfCRTDd2) enables the protein to transport CQ away from its site of antimalarial action. This gain of a transport function imparts CQ resistance to the parasite. A combinatorial map between PfCRT3D7 and PfCRTDd2 consists of 256 genotypes, of which only 52 have had their CQ transport activities measured through expression in the Xenopus laevis oocyte. We trained a statistical model with these 52 measurements to infer the CQ transport activity for the remaining 204 combinatorial genotypes between PfCRT3D7 and PfCRTDd2. Our best-performing model incorporated a binary classifier, a nonlinear scale, and additive effects for each mutation. The addition of specific pairwise- and high-order-epistatic coefficients decreased the predictive power of the model. We evaluated our predictions by experimentally measuring the CQ transport activities of 24 additional PfCRT genotypes. The R² value between our predicted and newly-measured phenotypes was 0.90. We then used the model to probe the accessibility of evolutionary trajectories through the map. Approximately 1% of the possible trajectories between PfCRT3D7 and PfCRTDd2 are accessible; however, none of the trajectories entailed eight successive increases in CQ transport activity. These results demonstrate that phenotypes can be inferred with known uncertainty from a partial genotype-phenotype dataset.
We also validated our approach against a collection of previously published genotype-phenotype maps. The model therefore appears general and should be applicable to a large number of genotype-phenotype maps.
Biological macromolecules are built from chains of building blocks. The function of a macromolecule depends on the specific chemical properties of the building blocks that make it up. Macromolecules evolve through mutations that swap one building block for another. Understanding how biomolecules work and evolve therefore requires knowledge of the effects of mutations. The effects of mutations can be measured experimentally; however, because there are a vast number of possible combinations of mutations, it is often difficult to make enough measurements to understand biomolecular function and evolution. In this paper, we describe a simple method to predict the effects of mutations on biomolecules from a small number of measurements. This method works by appropriately averaging the effects of mutations seen in different contexts. We test the method by predicting the effects of mutations on PfCRT, a macromolecule from the malarial parasite that confers drug resistance. We find that our method is fast and effective. Using a small number of measurements, we were able to gain insight into the evolutionary steps by which this macromolecule conferred drug resistance. To make this method accessible to other researchers, we have released it as an open-source software package: https://gpseer.readthedocs.io.
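The counts quoted in this abstract follow from treating each of the eight mutations as present or absent. The sketch below reproduces that arithmetic and shows one way to count accessible trajectories. It does not use the gpseer package or any PfCRT measurements: the accessibility rule (the phenotype never decreases along a path), the function `count_accessible`, and the toy phenotype are all assumptions made for illustration, as is the reading of "possible trajectories" as the 8! shortest mutational orderings.

```python
from itertools import permutations, product
from math import factorial

N_SITES = 8      # mutations separating PfCRT3D7 from PfCRTDd2 (from the abstract)
N_MEASURED = 52  # genotypes with measured CQ transport (from the abstract)

# Every genotype is a tuple of 0/1 flags, one per mutation site.
genotypes = list(product((0, 1), repeat=N_SITES))
n_unmeasured = len(genotypes) - N_MEASURED
n_direct_paths = factorial(N_SITES)  # orderings in which the 8 mutations can be added


def count_accessible(phenotype, n_sites):
    """Count mutational orderings along which the phenotype never decreases."""
    accessible = 0
    for order in permutations(range(n_sites)):
        genotype = [0] * n_sites
        previous = phenotype(tuple(genotype))
        for site in order:
            genotype[site] = 1
            current = phenotype(tuple(genotype))
            if current < previous:
                break  # this ordering passes through a decrease
            previous = current
        else:
            accessible += 1
    return accessible


# Hypothetical toy phenotype (not PfCRT data): additive, with one deleterious site.
toy_effects = (1.0, 0.5, -0.2)


def toy(genotype):
    return sum(e * bit for e, bit in zip(toy_effects, genotype))


print(len(genotypes), n_unmeasured, n_direct_paths)
print(count_accessible(toy, 3))
```

In the toy map every ordering must add the deleterious site at some step, so no trajectory is accessible under this rule. In practice, the phenotype function would be replaced by measured or model-predicted values.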
Affiliation(s)
- Zachary R. Sailer
  - Institute for Molecular Biology, University of Oregon, Eugene, OR, United States of America
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR, United States of America
- Sarah H. Shafik
  - Research School of Biology, Australian National University, Canberra, ACT, Australia
- Robert L. Summers
  - Research School of Biology, Australian National University, Canberra, ACT, Australia
- Alex Joule
  - Research School of Biology, Australian National University, Canberra, ACT, Australia
- Rowena E. Martin
  - Research School of Biology, Australian National University, Canberra, ACT, Australia
  - * E-mail: (REM); (MJH)
- Michael J. Harms
  - Institute for Molecular Biology, University of Oregon, Eugene, OR, United States of America
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR, United States of America
  - * E-mail: (REM); (MJH)
|
28
|
Abstract
Cells adapt to changing environments. Perturb a cell and it returns to a point of homeostasis. Perturb a population and it evolves toward a fitness peak. We review quantitative models of the forces of adaptation and their visualizations on landscapes. While some adaptations result from single mutations or few-gene effects, others are more cooperative, more delocalized in the genome, and more universal and physical. For example, homeostasis and evolution depend on protein folding and aggregation, energy and protein production, protein diffusion, molecular motor speeds and efficiencies, and protein expression levels. Models provide a way to learn about the fitness of cells and cell populations by making and testing hypotheses.
Affiliation(s)
- Luca Agozzino
  - The Louis and Beatrice Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, USA
  - Department of Physics and Astronomy, Stony Brook University, Stony Brook, New York 11794, USA
- Gábor Balázsi
  - The Louis and Beatrice Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, USA
  - Department of Biomedical Engineering, Stony Brook University, Stony Brook, New York 11794, USA
- Jin Wang
  - The Louis and Beatrice Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, USA
  - Department of Physics and Astronomy, Stony Brook University, Stony Brook, New York 11794, USA
  - Department of Chemistry, Stony Brook University, Stony Brook, New York 11790, USA
- Ken A Dill
  - The Louis and Beatrice Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, USA
  - Department of Physics and Astronomy, Stony Brook University, Stony Brook, New York 11794, USA
  - Department of Chemistry, Stony Brook University, Stony Brook, New York 11790, USA
|
29
|
Crona K, Luo M, Greene D. An uncertainty law for microbial evolution. J Theor Biol 2020; 489:110155. [PMID: 31926205 DOI: 10.1016/j.jtbi.2020.110155] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2019] [Revised: 01/05/2020] [Accepted: 01/07/2020] [Indexed: 11/28/2022]
Abstract
Medical practice would benefit from a thorough understanding of constraints and uncertainty in microbial evolution. Higher order epistasis refers to unexpected effects of multiple mutations even if both single mutations and pairwise effects have been accounted for. Recent studies show that higher order epistasis is abundant in nature, for bacteria as well as higher organisms. However, the importance of higher order effects has been debated. It has been suggested that such effects cannot be interpreted, and should not be considered. Here, we show conclusively that higher order epistasis changes the adaptive prospects for a population. The conclusion is based on an exhaustive search of 193,270,310 hypercube graphs and applications of graph theory. Our results are more precise, yet more universal, than related research since they depend on mathematical theory, rather than sampling or simulations. Moreover, the uncertainty we establish for microbial evolution, due to higher order epistasis, is not sensitive to detailed model assumptions, such as the baseline being additive or log-additive fitness.
Affiliation(s)
- Kristina Crona
  - Department of Mathematics and Statistics, 4400 Massachusetts Avenue NW, Washington, DC 20016-8050, United States.
- Mengming Luo
  - University of California at San Diego, CA, United States.
- Devin Greene
  - Department of Mathematics and Statistics, 4400 Massachusetts Avenue NW, Washington, DC 20016-8050, United States.
|
30
|
Bridging non-overlapping reads illuminates high-order epistasis between distal protein sites in a GPCR. Nat Commun 2020; 11:690. [PMID: 32019920 PMCID: PMC7000732 DOI: 10.1038/s41467-020-14495-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/05/2019] [Accepted: 01/06/2020] [Indexed: 11/09/2022] Open
Abstract
Epistasis emerges when the effects of an amino acid depend on the identities of interacting residues. This phenomenon shapes fitness landscapes, which have the power to reveal evolutionary paths and inform evolution of desired functions. However, there is a need for easily implemented, high-throughput methods to capture epistasis particularly at distal sites. Here, we combine deep mutational scanning (DMS) with a straightforward data processing step to bridge reads in distal sites within genes (BRIDGE). We use BRIDGE, which matches non-overlapping reads to their cognate templates, to uncover prevalent epistasis within the binding pocket of a human G protein-coupled receptor (GPCR) yielding variants with 4-fold greater affinity to a target ligand. The greatest functional improvements in our screen result from distal substitutions and substitutions that are deleterious alone. Our results corroborate findings of mutational tolerance in GPCRs, even in conserved motifs, but reveal inherent constraints restricting tolerated substitutions due to epistasis. Epistasis effects among amino acids at distal sites within binding pockets can have important impacts on protein fitness landscapes. Here the authors present BRIDGE, which matches non-overlapping sequence reads with their cognate DNA templates.
|
31
|
Esteban L, Lonishin LR, Bobrovskiy DM, Leleytner G, Bogatyreva NS, Kondrashov FA, Ivankov DN. HypercubeME: two hundred million combinatorially complete datasets from a single experiment. Bioinformatics 2019; 36:btz841. [PMID: 31742320 PMCID: PMC7703787 DOI: 10.1093/bioinformatics/btz841] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2019] [Revised: 11/01/2019] [Accepted: 11/07/2019] [Indexed: 11/17/2022] Open
Abstract
MOTIVATION: Epistasis, the context-dependence of the contribution of an amino acid substitution to fitness, is common in evolution. To detect epistasis, fitness must be measured for at least four genotypes: the reference genotype, two different single mutants and a double mutant with both of the single mutations. For higher-order epistasis of the order n, fitness has to be measured for all 2^n genotypes of an n-dimensional hypercube in genotype space forming a "combinatorially complete dataset". So far, only a handful of such datasets have been produced by manual curation. Concurrently, random mutagenesis experiments have produced measurements of fitness and other phenotypes in a high-throughput manner, potentially containing a number of combinatorially complete datasets. RESULTS: We present an effective recursive algorithm for finding all hypercube structures in random mutagenesis experimental data. To test the algorithm, we applied it to the data from a recent HIS3 protein dataset and found all 199,847,053 unique combinatorially complete genotype combinations of dimensionality ranging from two to twelve. The algorithm may be useful for researchers looking for higher-order epistasis in their high-throughput experimental data. AVAILABILITY: https://github.com/ivankovlab/HypercubeME.git. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
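The four-genotype requirement and the 2^n size of a combinatorially complete dataset can be made concrete with a short sketch. It is not the recursive HypercubeME algorithm, only a brute-force illustration; the function names, the binary genotype encoding, and the fitness values are hypothetical, and fitness is assumed to be on an additive scale.

```python
from itertools import product


def hypercube_genotypes(n):
    """All 2**n genotypes over n biallelic sites (0 = reference, 1 = mutant)."""
    return list(product((0, 1), repeat=n))


def is_combinatorially_complete(measured, n):
    """True if every one of the 2**n genotypes has a measured fitness."""
    return all(g in measured for g in hypercube_genotypes(n))


def pairwise_epistasis(f00, f10, f01, f11):
    """Deviation of the double mutant from the additive expectation:
    f11 - f10 - f01 + f00. Zero means no epistasis on this scale."""
    return f11 - f10 - f01 + f00


# Hypothetical fitness values for a two-site hypercube.
fitness = {(0, 0): 1.0, (1, 0): 1.5, (0, 1): 2.0, (1, 1): 3.0}

if is_combinatorially_complete(fitness, n=2):
    eps = pairwise_epistasis(
        fitness[(0, 0)], fitness[(1, 0)], fitness[(0, 1)], fitness[(1, 1)]
    )
    print("pairwise epistasis:", eps)
```

Enumerating every candidate hypercube this way grows exponentially with n, which is the motivation for a dedicated search algorithm on large mutagenesis datasets.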
Affiliation(s)
- Lyubov R Lonishin
  - Faculty of Medical Physics, Institute of Biomedical System and Technologies, Peter the Great Saint Petersburg Polytechnic University, Saint Petersburg 195251, Russia
- Daniil M Bobrovskiy
  - Faculty of Bioengineering and Bioinformatics, Moscow State University, Moscow 119234, Russia
- Gregory Leleytner
  - Department of Innovation and High Technology, Moscow Institute of Physics and Technology, Moscow 141701, Russia
- Natalya S Bogatyreva
  - Universitat Pompeu Fabra (UPF), Barcelona 08003, Spain
  - Bioinformatics and Genomics Programme, Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, 08003 Barcelona, Spain
  - Laboratory of Protein Physics, Institute of Protein Research of the Russian Academy of Sciences, Moscow 142290, Russia
- Dmitry N Ivankov
  - Center of Life Sciences, Skolkovo Institute of Science and Technology, Moscow 121205, Russia
|
32
|
Klein SA, Majumdar A, Barrick D. A Second Backbone: The Contribution of a Buried Asparagine Ladder to the Global and Local Stability of a Leucine-Rich Repeat Protein. Biochemistry 2019; 58:3480-3493. [PMID: 31347358 PMCID: PMC7184636 DOI: 10.1021/acs.biochem.9b00355] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
Parallel β-sheet-containing repeat proteins often display a structural motif in which conserved asparagines form a continuous ladder buried within the hydrophobic core. In such "asparagine ladders", the asparagine side-chain amides form a repetitive pattern of hydrogen bonds with neighboring main-chain NH and CO groups. Although asparagine ladders have been thought to be important for stability, there is little experimental evidence to support such speculation. Here we test the contribution of a minimal asparagine ladder from the leucine-rich repeat protein pp32 to stability and investigate lattice rigidity and hydrogen bond character using solution nuclear magnetic resonance (NMR) spectroscopy. Point substitutions of the two ladder asparagines of pp32 are strongly destabilizing and decrease the cooperativity of unfolding. The chemical shifts of the ladder side-chain HZ protons are shifted significantly downfield in the NMR spectrum and have low temperature coefficients, indicative of strong hydrogen bonding. In contrast, the HE protons are shifted upfield and have temperature coefficients close to zero, suggesting an asymmetry in hydrogen bond strength along the ladder. Ladder NH2 groups have weak 1H-15N cross-peak intensities; 1H-15N nuclear Overhauser effect and 15N CPMG experiments show this to be the result of high rigidity. Hydrogen exchange measurements demonstrate that the ladder NH2 groups exchange very slowly, with rates approaching the global exchange limit. Overall, these results show that the asparagine side chains are held in a very rigid, nondynamic structure, making a significant contribution to the overall stability. In this regard, buried asparagine ladders can be considered "second backbones" within the cores of their elongated β-sheet host proteins.
Affiliation(s)
- Sean A. Klein
  - T.C. Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, MD 21218 USA
- Ananya Majumdar
  - The Johns Hopkins University Biomolecular NMR Center, Johns Hopkins University, Baltimore, Maryland, 21218
- Doug Barrick
  - T.C. Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, MD 21218 USA
|
33
|
Diaz-Uriarte R, Vasallo C. Every which way? On predicting tumor evolution using cancer progression models. PLoS Comput Biol 2019; 15:e1007246. [PMID: 31374072 PMCID: PMC6693785 DOI: 10.1371/journal.pcbi.1007246] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/28/2018] [Revised: 08/14/2019] [Accepted: 07/05/2019] [Indexed: 11/18/2022] Open
Abstract
Successful prediction of the likely paths of tumor progression is valuable for diagnostic, prognostic, and treatment purposes. Cancer progression models (CPMs) use cross-sectional samples to identify restrictions in the order of accumulation of driver mutations and thus CPMs encode the paths of tumor progression. Here we analyze the performance of four CPMs to examine whether they can be used to predict the true distribution of paths of tumor progression and to estimate evolutionary unpredictability. Employing simulations we show that if fitness landscapes are single peaked (have a single fitness maximum) there is good agreement between true and predicted distributions of paths of tumor progression when sample sizes are large, but performance is poor with the currently common much smaller sample sizes. Under multi-peaked fitness landscapes (i.e., those with multiple fitness maxima), performance is poor and improves only slightly with sample size. In all cases, detection regime (when tumors are sampled) is a key determinant of performance. Estimates of evolutionary unpredictability from the best performing CPM, among the four examined, tend to overestimate the true unpredictability and the bias is affected by detection regime; CPMs could be useful for estimating upper bounds to the true evolutionary unpredictability. Analysis of twenty-two cancer data sets shows low evolutionary unpredictability for several of the data sets. But most of the predictions of paths of tumor progression are very unreliable, and unreliability increases with the number of features analyzed. Our results indicate that CPMs could be valuable tools for predicting cancer progression but that, currently, obtaining useful predictions of paths of tumor progression from CPMs is dubious, and emphasize the need for methodological work that can account for the probably multi-peaked fitness landscapes in cancer.
Affiliation(s)
- Ramon Diaz-Uriarte
  - Department of Biochemistry, Universidad Autónoma de Madrid, Madrid, Spain
  - Instituto de Investigaciones Biomédicas “Alberto Sols” (UAM-CSIC), Madrid, Spain
- Claudia Vasallo
  - Department of Biochemistry, Universidad Autónoma de Madrid, Madrid, Spain
  - Instituto de Investigaciones Biomédicas “Alberto Sols” (UAM-CSIC), Madrid, Spain
|
34
|
Guin D, Gruebele M. Weak Chemical Interactions That Drive Protein Evolution: Crowding, Sticking, and Quinary Structure in Folding and Function. Chem Rev 2019; 119:10691-10717. [PMID: 31356058 DOI: 10.1021/acs.chemrev.8b00753] [Citation(s) in RCA: 74] [Impact Index Per Article: 14.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/22/2023]
Abstract
In recent years, better instrumentation and greater computing power have enabled the imaging of elusive biomolecule dynamics in cells, driving many advances in understanding the chemical organization of biological systems. The focus of this Review is on interactions in the cell that affect both biomolecular stability and function and modulate them. The same protein or nucleic acid can behave differently depending on the time in the cell cycle, the location in a specific compartment, or the stresses acting on the cell. We describe in detail the crowding, sticking, and quinary structure in the cell and the current methods to quantify them both in vitro and in vivo. Finally, we discuss protein evolution in the cell in light of current biophysical evidence. We describe the factors that drive protein evolution and shape protein interaction networks. These interactions can significantly affect the free energy, ΔG, of marginally stable and low-population proteins and, due to epistasis, direct the evolutionary pathways in an organism. We finally conclude by providing an outlook on experiments to come and the possibility of collaborative evolutionary biology and biophysical efforts.
Affiliation(s)
- Drishti Guin
  - Department of Chemistry, University of Illinois, Urbana, Illinois 61801, United States
- Martin Gruebele
  - Department of Chemistry, University of Illinois, Urbana, Illinois 61801, United States
  - Department of Physics, University of Illinois, Urbana, Illinois 61801, United States
  - Center for Biophysics and Quantitative Biology, University of Illinois, Urbana, Illinois 61801, United States
|
35
|
Wencewicz TA. Crossroads of Antibiotic Resistance and Biosynthesis. J Mol Biol 2019; 431:3370-3399. [PMID: 31288031 DOI: 10.1016/j.jmb.2019.06.033] [Citation(s) in RCA: 75] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/01/2019] [Revised: 06/20/2019] [Accepted: 06/27/2019] [Indexed: 12/14/2022]
Abstract
The biosynthesis of antibiotics and self-protection mechanisms employed by antibiotic producers are an integral part of the growing antibiotic resistance threat. The origins of clinically relevant antibiotic resistance genes found in human pathogens have been traced to ancient microbial producers of antibiotics in natural environments. Widespread and frequent antibiotic use amplifies environmental pools of antibiotic resistance genes and increases the likelihood for the selection of a resistance event in human pathogens. This perspective will provide an overview of the origins of antibiotic resistance to highlight the crossroads of antibiotic biosynthesis and producer self-protection that result in clinically relevant resistance mechanisms. Some case studies of synergistic antibiotic combinations, adjuvants, and hybrid antibiotics will also be presented to show how native antibiotic producers manage the emergence of antibiotic resistance.
Affiliation(s)
- Timothy A Wencewicz
  - Department of Chemistry, Washington University in St. Louis, One Brookings Drive, St. Louis, MO 63130, USA.
|
36
|
Abstract
Classically, phenotype is what is observed, and genotype is the genetic makeup. Statistical studies aim to project phenotypic likelihoods of genotypic patterns. The traditional genotype-to-phenotype theory embraces the view that the encoded protein shape together with gene expression level largely determines the resulting phenotypic trait. Here, we point out that the molecular biology revolution at the turn of the century explained that the gene encodes not one but ensembles of conformations, which in turn spell all possible gene-associated phenotypes. The significance of a dynamic ensemble view is in understanding the linkage between genetic change and the gained observable physical or biochemical characteristics. Thus, despite the transformative shift in our understanding of the basis of protein structure and function, the literature still commonly relates to the classical genotype-phenotype paradigm. This is important because an ensemble view clarifies how even seemingly small genetic alterations can lead to pleiotropic traits in adaptive evolution and in disease, why cellular pathways can be modified in monogenic and polygenic traits, and how the environment may tweak protein function.
Affiliation(s)
- Ruth Nussinov
  - Cancer and Inflammation Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, National Cancer Institute at Frederick, Frederick, Maryland, United States of America
  - Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Chung-Jung Tsai
  - Cancer and Inflammation Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, National Cancer Institute at Frederick, Frederick, Maryland, United States of America
- Hyunbum Jang
  - Cancer and Inflammation Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, National Cancer Institute at Frederick, Frederick, Maryland, United States of America
|
37
|
Consensus sequence design as a general strategy to create hyperstable, biologically active proteins. Proc Natl Acad Sci U S A 2019; 116:11275-11284. [PMID: 31110018 DOI: 10.1073/pnas.1816707116] [Citation(s) in RCA: 84] [Impact Index Per Article: 16.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022] Open
Abstract
Consensus sequence design offers a promising strategy for designing proteins of high stability while retaining biological activity since it draws upon an evolutionary history in which residues important for both stability and function are likely to be conserved. Although there have been several reports of successful consensus design of individual targets, it is unclear from these anecdotal studies how often this approach succeeds and how often it fails. Here, we attempt to assess generality by designing consensus sequences for a set of six protein families with a range of chain lengths, structures, and activities. We characterize the resulting consensus proteins for stability, structure, and biological activities in an unbiased way. We find that all six consensus proteins adopt cooperatively folded structures in solution. Strikingly, four of six of these consensus proteins show increased thermodynamic stability over naturally occurring homologs. Each consensus protein tested for function maintained at least partial biological activity. Although peptide binding affinity by a consensus-designed SH3 is rather low, K_m values for consensus enzymes are similar to values from extant homologs. Although consensus enzymes are slower than extant homologs at low temperature, they are faster than some thermophilic enzymes at high temperature. An analysis of sequence properties shows consensus proteins to be enriched in charged residues, and rarified in uncharged polar residues. Sequence differences between consensus and extant homologs are predominantly located at weakly conserved surface residues, highlighting the importance of these residues in the success of the consensus strategy.
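At its core, a consensus sequence takes the most frequent residue at each position of a multiple sequence alignment. The sketch below illustrates only that step and is not the authors' design pipeline: it assumes a pre-aligned set of equal-length sequences, ignores sequence weighting and redundancy filtering, and drops columns that are entirely gaps. The function name and the toy alignment are hypothetical.

```python
from collections import Counter


def consensus_sequence(alignment, gap="-"):
    """Return the most common non-gap residue in each alignment column.

    Columns made up entirely of gaps are skipped. Ties are resolved in
    favour of the residue that appears first in the column.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    consensus = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != gap)
        if counts:
            consensus.append(counts.most_common(1)[0][0])
    return "".join(consensus)


# Hypothetical toy alignment of four homologs.
toy_alignment = [
    "MKVLA",
    "MKILA",
    "MRVLG",
    "MKVLA",
]
print(consensus_sequence(toy_alignment))
```

A real design would normally start from a curated alignment of many homologs, since a small or biased sample can skew the per-column frequencies.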
|
38
|
Horovitz A, Fleisher RC, Mondal T. Double-mutant cycles: new directions and applications. Curr Opin Struct Biol 2019; 58:10-17. [PMID: 31029859 DOI: 10.1016/j.sbi.2019.03.025] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/17/2019] [Accepted: 03/20/2019] [Indexed: 11/17/2022]
Abstract
Double-mutant cycle (DMC) analysis is a powerful approach for detecting and quantifying the energetics of both direct and long-range interactions in proteins and other chemical systems. It can also be used to unravel higher-order interactions (e.g. three-body effects) that lead to cooperativity in protein folding and function. In this review, we describe new applications of DMC analysis based on advances in native mass spectrometry and high-throughput methods such as next generation sequencing and protein complementation assays. These developments have facilitated carrying out high-throughput DMC analysis, which can be used to characterize increasingly higher-order interactions and very large interaction networks in proteins. Such studies have provided insights into the extent of cooperativity (epistasis) in protein structures. High-throughput DMC studies have also been used to validate correlated mutation analysis and can provide restraints for protein docking.
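The quantity at the heart of a double-mutant cycle is the coupling energy between two sites: the part of the double mutant's free-energy change that the two single mutants do not account for. A minimal sketch follows, assuming the common convention ΔΔG_int = ΔG(wt) - ΔG(m1) - ΔG(m2) + ΔG(m1,m2); sign conventions vary between papers. The function names, the tolerance, and the free-energy values (in kcal/mol) are hypothetical.

```python
def coupling_energy(dg_wt, dg_m1, dg_m2, dg_m1m2):
    """Coupling (interaction) energy from one double-mutant cycle.

    Zero means the two mutations act independently (additively); a
    non-zero value indicates that the two sites are energetically coupled.
    """
    return dg_wt - dg_m1 - dg_m2 + dg_m1m2


def is_coupled(dg_wt, dg_m1, dg_m2, dg_m1m2, tolerance=0.1):
    """Flag a cycle as coupled if |coupling energy| exceeds a tolerance
    chosen to reflect experimental error (hypothetical default)."""
    return abs(coupling_energy(dg_wt, dg_m1, dg_m2, dg_m1m2)) > tolerance


# Hypothetical folding free energies in kcal/mol.
ddg_int = coupling_energy(dg_wt=-8.0, dg_m1=-6.5, dg_m2=-7.0, dg_m1m2=-6.0)
print("coupling energy (kcal/mol):", ddg_int)
print("coupled:", is_coupled(-8.0, -6.5, -7.0, -6.0))
```

In practice the tolerance should come from the propagated measurement error of the four free energies, not from a fixed default.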
Affiliation(s)
- Amnon Horovitz
  - Department of Structural Biology, Weizmann Institute of Science, Rehovot 7610001, Israel.
- Rachel C Fleisher
  - Department of Structural Biology, Weizmann Institute of Science, Rehovot 7610001, Israel
- Tridib Mondal
  - Department of Structural Biology, Weizmann Institute of Science, Rehovot 7610001, Israel
|
39
|
Qiu C, Kaplan CD. Functional assays for transcription mechanisms in high-throughput. Methods 2019; 159-160:115-123. [PMID: 30797033 PMCID: PMC6589137 DOI: 10.1016/j.ymeth.2019.02.017] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2019] [Accepted: 02/18/2019] [Indexed: 01/12/2023] Open
Abstract
Dramatic increases in the scale of programmed synthesis of nucleic acid libraries coupled with deep sequencing have powered advances in understanding nucleic acid and protein biology. Biological systems centering on nucleic acids or encoded proteins greatly benefit from such high-throughput studies, given that large DNA variant pools can be synthesized and DNA, or RNA products of transcription, can be easily analyzed by deep sequencing. Here we review the scope of various high-throughput functional assays for studies of nucleic acids and proteins in general, followed by discussion of how these types of study have yielded insights into the RNA Polymerase II (Pol II) active site as an example. We discuss methodological considerations in the design and execution of these experiments that should be valuable to studies in any system.
Affiliation(s)
- Chenxi Qiu
  - Department of Medicine, Division of Translational Therapeutics, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA 02215, USA
  - Cancer Research Institute, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA 02215, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
- Craig D Kaplan
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA.
|
40
|
How Often Do Protein Genes Navigate Valleys of Low Fitness? Genes (Basel) 2019; 10:genes10040283. [PMID: 30965625 PMCID: PMC6523826 DOI: 10.3390/genes10040283] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2019] [Revised: 03/27/2019] [Accepted: 04/02/2019] [Indexed: 11/17/2022] Open
Abstract
To escape from local fitness peaks, a population must navigate across valleys of low fitness. How these transitions occur, and what role they play in adaptation, have been subjects of active interest in evolutionary genetics for almost a century. However, to our knowledge, this problem has never been addressed directly by considering the evolution of a gene, or group of genes, as a whole, including the complex effects of fitness interactions among multiple loci. Here, we use a precise model of protein fitness to compute the probability $P(s, \Delta t)$ that an allele, randomly sampled from a population at time $t$, has crossed a fitness valley of depth $s$ during an interval $[t - \Delta t, t]$ in the immediate past. We study populations of model genes evolving under equilibrium conditions consistent with those in mammalian mitochondria. From these data, we estimate that genes encoding small protein motifs navigate fitness valleys of depth $2Ns \gtrsim 30$ with probability $P \gtrsim 0.1$ on a time scale of human evolution, where $N$ is the (mitochondrial) effective population size. The results are consistent with recent findings for Watson-Crick switching in mammalian mitochondrial tRNA molecules.
|
41
|
Bolnick DI, Barrett RD, Oke KB, Rennison DJ, Stuart YE. (Non)Parallel Evolution. ANNUAL REVIEW OF ECOLOGY EVOLUTION AND SYSTEMATICS 2018. [DOI: 10.1146/annurev-ecolsys-110617-062240] [Citation(s) in RCA: 155] [Impact Index Per Article: 25.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
Parallel evolution across replicate populations has provided evolutionary biologists with iconic examples of adaptation. When multiple populations colonize seemingly similar habitats, they may evolve similar genes, traits, or functions. Yet, replicated evolution in nature or in the laboratory often yields inconsistent outcomes: Some replicate populations evolve along highly similar trajectories, whereas other replicate populations evolve to different extents or in distinct directions. To understand these heterogeneous outcomes, biologists are increasingly treating parallel evolution not as a binary phenomenon but rather as a quantitative continuum ranging from parallel to nonparallel. By measuring replicate populations’ positions along this (non)parallel continuum, we can test hypotheses about evolutionary and ecological factors that influence the extent of repeatable evolution. We review evidence regarding the manifestation of (non)parallel evolution in the laboratory, in natural populations, and in applied contexts such as cancer. We enumerate the many genetic, ecological, and evolutionary processes that contribute to variation in the extent of parallel evolution.
Affiliation(s)
- Daniel I. Bolnick
  - Department of Integrative Biology, University of Texas at Austin, Austin, Texas 78712, USA
  - Current affiliation: Department of Ecology and Evolution, University of Connecticut, Storrs, Connecticut 06268, USA
- Krista B. Oke
  - Redpath Museum, McGill University, Montreal, Quebec H3A 2K6, Canada
  - Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, California 95060, USA
- Diana J. Rennison
  - Institute of Ecology and Evolution, University of Bern, 3012 Bern, Switzerland
- Yoel E. Stuart
  - Department of Integrative Biology, University of Texas at Austin, Austin, Texas 78712, USA
|
42
|
Srivastava A, Nagai T, Srivastava A, Miyashita O, Tama F. Role of Computational Methods in Going beyond X-ray Crystallography to Explore Protein Structure and Dynamics. Int J Mol Sci 2018; 19:E3401. [PMID: 30380757 PMCID: PMC6274748 DOI: 10.3390/ijms19113401] [Citation(s) in RCA: 44] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/01/2018] [Revised: 10/20/2018] [Accepted: 10/27/2018] [Indexed: 12/13/2022] Open
Abstract
Protein structural biology has come a long way since the determination of the first three-dimensional structure of myoglobin about six decades ago. Across this period, X-ray crystallography was the most important experimental method for gaining atomic-resolution insight into protein structures. However, as the role of dynamics gained importance in the function of proteins, the inability of X-ray crystallography to capture dynamics came to the forefront. Computational methods proved to be immensely successful in understanding protein dynamics in solution, and they continue to improve in terms of both the scale and the types of systems that can be studied. In this review, we briefly discuss the limitations of X-ray crystallography in studying protein dynamics, and then provide an overview of different computational methods that are instrumental in understanding the dynamics of proteins and biomacromolecular complexes.
Affiliation(s)
- Ashutosh Srivastava
- Institute of Transformative Bio-Molecules (WPI), Nagoya University, Nagoya, Aichi 464-8601, Japan.
- Tetsuro Nagai
- Department of Physics, Graduate School of Science, Nagoya University, Nagoya, Aichi 464-8602, Japan.
- Arpita Srivastava
- Department of Physics, Graduate School of Science, Nagoya University, Nagoya, Aichi 464-8602, Japan.
- Osamu Miyashita
- RIKEN-Center for Computational Science, Kobe, Hyogo 650-0047, Japan.
- Florence Tama
- Institute of Transformative Bio-Molecules (WPI), Nagoya University, Nagoya, Aichi 464-8601, Japan.
- Department of Physics, Graduate School of Science, Nagoya University, Nagoya, Aichi 464-8602, Japan.
- RIKEN-Center for Computational Science, Kobe, Hyogo 650-0047, Japan.
|
43
|
Castiglione GM, Chang BS. Functional trade-offs and environmental variation shaped ancient trajectories in the evolution of dim-light vision. eLife 2018; 7:35957. [PMID: 30362942 PMCID: PMC6203435 DOI: 10.7554/elife.35957] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 02/14/2018] [Accepted: 09/09/2018] [Indexed: 12/11/2022] Open
Abstract
Trade-offs between protein stability and activity can restrict access to evolutionary trajectories, but widespread epistasis may facilitate indirect routes to adaptation. This may be enhanced by natural environmental variation, but in multicellular organisms this process is poorly understood. We investigated a paradoxical trajectory taken during the evolution of tetrapod dim-light vision, where in the rod visual pigment rhodopsin, E122 was fixed 350 million years ago, a residue associated with increased active-state (MII) stability but greatly diminished rod photosensitivity. Here, we demonstrate that high MII stability could have likely evolved without E122, but instead, selection appears to have entrenched E122 in tetrapods via epistatic interactions with nearby coevolving sites. In fishes by contrast, selection may have exploited these epistatic effects to explore alternative trajectories, but via indirect routes with low MII stability. Our results suggest that within tetrapods, E122 and high MII stability cannot be sacrificed, not even for improvements to rod photosensitivity.
Affiliation(s)
- Gianni M Castiglione
- Department of Cell and Systems Biology, University of Toronto, Toronto, Canada.
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Canada
- Belinda SW Chang
- Department of Cell and Systems Biology, University of Toronto, Toronto, Canada.
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Canada.
- Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, Canada
|
44
|
Nelson ED, Grishin NV. Inference of epistatic effects in a key mitochondrial protein. Phys Rev E 2018; 97:062404. [PMID: 30011480 DOI: 10.1103/physreve.97.062404] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 12/02/2017] [Indexed: 12/17/2022]
Abstract
We use Potts model inference to predict pair epistatic effects in a key mitochondrial protein, cytochrome c oxidase subunit 2, for ray-finned fishes. We examine the effect of phylogenetic correlations on our predictions using a simple exact fitness model, and we find that, although epistatic effects are underpredicted, they maintain a roughly linear relationship to their true (model) values. After accounting for this correction, epistatic effects in the protein are still relatively weak, leading to fitness valleys of depth 2Ns ≃ -5 in compensatory double mutants. Interestingly, positive epistasis is more pronounced than negative epistasis, and the strongest positive effects capture nearly all sites subject to positive selection in fishes, similar to virus proteins evolving under selection pressure in the context of drug therapy.
Affiliation(s)
- Erik D Nelson
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, 6001 Forest Park Blvd., Room ND10.124, Dallas, Texas 75235-9050, USA
- Nick V Grishin
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, 6001 Forest Park Blvd., Room ND10.124, Dallas, Texas 75235-9050, USA
|
45
|
Otwinowski J. Biophysical Inference of Epistasis and the Effects of Mutations on Protein Stability and Function. Mol Biol Evol 2018; 35:2345-2354. [PMID: 30085303 PMCID: PMC6188545 DOI: 10.1093/molbev/msy141] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Indexed: 12/14/2022] Open
Abstract
Understanding the relationship between protein sequence, function, and stability is a fundamental problem in biology. The essential function of many proteins that fold into a specific structure is their ability to bind to a ligand, which can be assayed for thousands of mutated variants. However, binding assays do not distinguish whether mutations affect the stability of the binding interface or the overall fold. Here, we introduce a statistical method to infer a detailed energy landscape of how a protein folds and binds to a ligand by combining information from many mutated variants. We fit a thermodynamic model describing the bound, unbound, and unfolded states to high-quality data of protein G domain B1 binding to IgG-Fc. We infer distinct folding and binding energies for each mutation, providing a detailed view of how mutations affect binding and stability across the protein. We accurately infer the folding energy of each variant in physical units, validated by independent data, whereas previous high-throughput methods could only measure indirect changes in stability. While we assume an additive sequence-energy relationship, the binding fraction is epistatic due to its nonlinear relation to energy. Despite having no epistasis in energy, our model explains much of the observed epistasis in binding fraction, with the remaining epistasis identifying conformationally dynamic regions.
Affiliation(s)
- Jakub Otwinowski
- Biology Department, University of Pennsylvania, Philadelphia, PA
|
46
|
Abstract
The deterministic force of natural selection and stochastic influence of drift shape RNA virus evolution. New deep-sequencing and microfluidics technologies allow us to quantify the effect of mutations and trace the evolution of viral populations with single-genome and single-nucleotide resolution. Such experiments can reveal the topography of the genotype-fitness landscapes that shape the path of viral evolution. By combining historical analyses, like phylogenetic approaches, with high-throughput and high-resolution evolutionary experiments, we can observe parallel patterns of evolution that drive important phenotypic transitions. These developments provide a framework for quantifying and anticipating potential evolutionary events. Here, we examine emerging technologies that can map the selective landscapes of viruses, focusing on their application to pathogenic viruses. We identify areas where these technologies can bolster our ability to study the evolution of viruses and to anticipate and possibly intervene in evolutionary events and prevent viral disease.
Affiliation(s)
- Patrick T Dolan
- Department of Biology, Stanford University, E200 Clark Center, 318 Campus Drive, Stanford, CA 94305, USA; Department of Microbiology and Immunology, University of California, San Francisco, 600 16th Street, GH-S572, UCSF Box 2280, San Francisco, CA 94143-2280, USA
- Zachary J Whitfield
- Department of Microbiology and Immunology, University of California, San Francisco, 600 16th Street, GH-S572, UCSF Box 2280, San Francisco, CA 94143-2280, USA
- Raul Andino
- Department of Microbiology and Immunology, University of California, San Francisco, 600 16th Street, GH-S572, UCSF Box 2280, San Francisco, CA 94143-2280, USA.
|
47
|
Hydrogen Bonds and Life in the Universe. Life (Basel) 2018; 8:life8010001. [PMID: 29301382 PMCID: PMC5871933 DOI: 10.3390/life8010001] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 11/22/2017] [Revised: 12/18/2017] [Accepted: 12/18/2017] [Indexed: 11/17/2022] Open
Abstract
The scientific community is allocating more and more resources to space missions and astronomical observations dedicated to the search for life beyond Earth. This experimental endeavor needs to be backed by a theoretical framework aimed at defining universal criteria for the existence of life. With this aim in mind, we have explored which chemical and physical properties should be expected for life possibly different from the terrestrial one, but similarly sustained by genetic and catalytic molecules. We show that functional molecules performing genetic and catalytic tasks must feature a hierarchy of chemical interactions operating in distinct energy bands. Of all known chemical bonds and forces, only hydrogen bonds are able to mediate the directional interactions of lower energy that are needed for the operation of genetic and catalytic tasks. For this reason and because of the unique quantum properties of hydrogen bonding, the functional molecules involved in life processes are predicted to have extensive hydrogen-bonding capabilities. A molecular medium generating a hydrogen-bond network is probably essential to support the activity of the functional molecules. These hydrogen-bond requirements constrain the viability of hypothetical biochemistries alternative to the terrestrial one, provide thermal limits to life molecular processes, and offer a conceptual framework to define a transition from a “covalent-bond stage” to a “hydrogen-bond stage” in prebiotic chemistry.
|