1
Meijers M, Ruchnewitz D, Eberhardt J, Karmakar M, Łuksza M, Lässig M. Concepts and methods for predicting viral evolution. bioRxiv 2024:2024.03.19.585703. [PMID: 38746108 PMCID: PMC11092427 DOI: 10.1101/2024.03.19.585703] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
The seasonal human influenza virus undergoes rapid evolution, leading to significant changes in circulating viral strains from year to year. These changes are typically driven by adaptive mutations, particularly in the antigenic epitopes, the regions of the viral surface protein haemagglutinin targeted by human antibodies. Here we describe a consistent set of methods for data-driven predictive analysis of viral evolution. Our pipeline integrates four types of data: (1) sequence data of viral isolates collected on a worldwide scale, (2) epidemiological data on incidences, (3) antigenic characterization of circulating viruses, and (4) intrinsic viral phenotypes. From the combined analysis of these data, we obtain estimates of relative fitness for circulating strains and predictions of clade frequencies for periods of up to one year. Furthermore, we obtain comparative estimates of protection against future viral populations for candidate vaccine strains, providing a basis for pre-emptive vaccine strain selection. Continuously updated predictions obtained from the prediction pipeline for influenza and SARS-CoV-2 are available on the website previr.app.
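The abstract above does not spell out how relative fitness estimates turn into clade frequency predictions. As a rough illustration only, the sketch below assumes the generic fitness-model form in which each clade grows exponentially at its relative fitness and the frequencies are then renormalized. The function name, clade labels, and fitness values are hypothetical and are not taken from the authors' pipeline.

```python
import math

def project_frequencies(freqs, fitness, years):
    """Project clade frequencies forward, assuming constant relative fitness.

    freqs:   dict clade -> current frequency (should sum to 1)
    fitness: dict clade -> relative fitness per year (hypothetical units)
    years:   length of the prediction window
    """
    # Each clade grows by exp(fitness * time); renormalize so frequencies sum to 1.
    weights = {c: x * math.exp(fitness[c] * years) for c, x in freqs.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

# Illustrative numbers only: clade B is rarer but fitter than clade A.
current = {"A": 0.7, "B": 0.3}
rel_fitness = {"A": 0.0, "B": 1.0}
projected = project_frequencies(current, rel_fitness, years=1.0)
```

Under these assumed values the fitter clade overtakes the other within the one-year window, which is the kind of frequency shift such predictions aim to capture.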
2
Meijers M, Ruchnewitz D, Eberhardt J, Karmakar M, Łuksza M, Lässig M. Concepts and methods for predicting viral evolution. arXiv 2024:arXiv:2403.12684v2. [PMID: 38745695 PMCID: PMC11092678] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
The seasonal human influenza virus undergoes rapid evolution, leading to significant changes in circulating viral strains from year to year. These changes are typically driven by adaptive mutations, particularly in the antigenic epitopes, the regions of the viral surface protein haemagglutinin targeted by human antibodies. Here we describe a consistent set of methods for data-driven predictive analysis of viral evolution. Our pipeline integrates four types of data: (1) sequence data of viral isolates collected on a worldwide scale, (2) epidemiological data on incidences, (3) antigenic characterization of circulating viruses, and (4) intrinsic viral phenotypes. From the combined analysis of these data, we obtain estimates of relative fitness for circulating strains and predictions of clade frequencies for periods of up to one year. Furthermore, we obtain comparative estimates of protection against future viral populations for candidate vaccine strains, providing a basis for pre-emptive vaccine strain selection. Continuously updated predictions obtained from the prediction pipeline for influenza and SARS-CoV-2 are available on the website previr.app.
Affiliation(s)
- Matthijs Meijers
  - Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
- Denis Ruchnewitz
  - Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
- Jan Eberhardt
  - Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
- Malancha Karmakar
  - Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
- Marta Łuksza
  - Tisch Cancer Institute, Departments of Oncological Sciences and Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Michael Lässig
  - Institute for Biological Physics, University of Cologne, Zülpicherstr. 77, 50937, Köln, Germany
3
Chisholm LO, Orlandi KN, Phillips SR, Shavlik MJ, Harms MJ. Ancestral Reconstruction and the Evolution of Protein Energy Landscapes. Annu Rev Biophys 2023; 53:10.1146/annurev-biophys-030722-125440. [PMID: 38134334 PMCID: PMC11192866 DOI: 10.1146/annurev-biophys-030722-125440] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2023]
Abstract
A protein's sequence determines its conformational energy landscape. This, in turn, determines the protein's function. Understanding the evolution of new protein functions therefore requires understanding how mutations alter the protein energy landscape. Ancestral sequence reconstruction (ASR) has proven a valuable tool for tackling this problem. In ASR, one phylogenetically infers the sequences of ancient proteins, allowing characterization of their properties. When coupled to biophysical, biochemical, and functional characterization, ASR can reveal how historical mutations altered the energy landscape of ancient proteins, allowing the evolution of enzyme activity, altered conformations, binding specificity, oligomerization, and many other protein features. In this article, we review how ASR studies have been used to dissect the evolution of energy landscapes. We also discuss ASR studies that reveal how energy landscapes have shaped protein evolution. Finally, we propose that thinking about evolution from the perspective of an energy landscape can improve how we approach and interpret ASR studies. Expected final online publication date for the Annual Review of Biophysics, Volume 53 is May 2024. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Affiliation(s)
- Lauren O Chisholm
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, Oregon, USA
  - Institute of Molecular Biology, University of Oregon, Eugene, Oregon, USA
- Kona N Orlandi
  - Institute of Molecular Biology, University of Oregon, Eugene, Oregon, USA
  - Department of Biology, University of Oregon, Eugene, Oregon, USA
- Sophia R Phillips
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, Oregon, USA
  - Institute of Molecular Biology, University of Oregon, Eugene, Oregon, USA
- Michael J Shavlik
  - Institute of Molecular Biology, University of Oregon, Eugene, Oregon, USA
  - Department of Biology, University of Oregon, Eugene, Oregon, USA
- Michael J Harms
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, Oregon, USA
  - Institute of Molecular Biology, University of Oregon, Eugene, Oregon, USA
4
Luzuriaga-Neira AR, Ritchie AM, Payne BL, Carrillo-Parramon O, Liberles DA, Alvarez-Ponce D. Highly Abundant Proteins Are Highly Thermostable. Genome Biol Evol 2023; 15:evad112. [PMID: 37399326 DOI: 10.1093/gbe/evad112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/08/2023] [Indexed: 07/05/2023] [Open Access]
Abstract
Highly abundant proteins tend to evolve slowly (a trend called E-R anticorrelation), and a number of hypotheses have been proposed to explain this phenomenon. The misfolding avoidance hypothesis attributes the E-R anticorrelation to the abundance-dependent toxic effects of protein misfolding. To avoid these toxic effects, protein sequences (particularly those of highly expressed proteins) would be under selection to fold properly. One prediction of the misfolding avoidance hypothesis is that highly abundant proteins should exhibit high thermostability (i.e., a highly negative free energy of folding, ΔG). Thus far, only a handful of analyses have tested for a relationship between protein abundance and thermostability, producing contradictory results. These analyses have been limited by 1) the scarcity of ΔG data, 2) the fact that these data have been obtained by different laboratories and under different experimental conditions, 3) the problems associated with using proteins' melting temperature (Tm) as a proxy for ΔG, and 4) the difficulty of controlling for potentially confounding variables. Here, we use computational methods to compare the free energy of folding of pairs of human-mouse orthologous proteins with different expression levels. Even though the effect size is limited, the most highly expressed ortholog is often the one with a more negative ΔG of folding, indicating that highly expressed proteins are often more thermostable.
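To make the link between a more negative ΔG of folding and a smaller misfolded population concrete, the sketch below uses the textbook two-state folding model, in which the folded fraction is 1 / (1 + exp(ΔG / RT)). This is an illustrative assumption, not the computational method used in the study; the function name and default temperature are hypothetical choices.

```python
import math

GAS_CONSTANT = 1.987e-3  # kcal / (mol * K)

def fraction_folded(delta_g, temperature=310.15):
    """Folded fraction in a two-state model.

    delta_g:     free energy of folding, G_folded - G_unfolded, in kcal/mol
                 (negative values favour the folded state)
    temperature: absolute temperature in kelvin (default is about 37 °C)
    """
    return 1.0 / (1.0 + math.exp(delta_g / (GAS_CONSTANT * temperature)))

# A more negative delta_g leaves a smaller unfolded fraction.
unfolded_at_minus_5 = 1.0 - fraction_folded(-5.0)
unfolded_at_minus_10 = 1.0 - fraction_folded(-10.0)
```

In this simplified picture, the absolute number of unfolded molecules scales with abundance times the unfolded fraction, which is why the misfolding avoidance hypothesis predicts stronger selection for stability in abundant proteins.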
Affiliation(s)
- Andrew M Ritchie
  - Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, Pennsylvania, USA
- David A Liberles
  - Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, Pennsylvania, USA
5
Del Amparo R, Arenas M. Influence of substitution model selection on protein phylogenetic tree reconstruction. Gene 2023; 865:147336. [PMID: 36871672 DOI: 10.1016/j.gene.2023.147336] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2023] [Revised: 02/22/2023] [Accepted: 02/28/2023] [Indexed: 03/06/2023]
Abstract
Probabilistic phylogenetic tree reconstruction is traditionally performed under a best-fitting substitution model of molecular evolution previously selected according to diverse statistical criteria. Interestingly, some recent studies proposed that this procedure is unnecessary for phylogenetic tree reconstruction, leading to a debate in the field. In contrast to DNA sequences, phylogenetic tree reconstruction from protein sequences is traditionally based on empirical exchangeability matrices that can differ among taxonomic groups and protein families. Considering this aspect, here we investigated the influence of selecting a substitution model of protein evolution on phylogenetic tree reconstruction through analyses of real and simulated data. We found that phylogenetic tree reconstructions based on a selected best-fitting substitution model of protein evolution are the most accurate, in terms of topology and branch lengths, compared with those derived from substitution models with amino acid replacement matrices far from the selected best-fitting model, especially when the data have high genetic diversity. Indeed, we found that substitution models with similar amino acid replacement matrices produce similar reconstructed phylogenetic trees, suggesting the use of substitution models as similar as possible to a selected best-fitting model when the latter cannot be used. Therefore, we recommend the use of the traditional protocol of selection among substitution models of evolution for protein phylogenetic tree reconstruction.
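The abstract refers to selecting a best-fitting substitution model "according to diverse statistical criteria" without naming one. A common choice is the Akaike information criterion, AIC = 2k - 2 ln L, where the candidate with the lowest score is preferred. The sketch below shows only that selection step; the model names, log-likelihoods, and parameter counts are hypothetical placeholders, not results from the study.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: lower values indicate a better trade-off
    between fit (log-likelihood) and complexity (number of free parameters)."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical candidates: name -> (maximized log-likelihood, free parameters).
candidates = {
    "model_A": (-10250.0, 40),
    "model_B": (-10200.0, 59),
    "model_C": (-10198.0, 80),
}

scores = {name: aic(lnl, k) for name, (lnl, k) in candidates.items()}
best_model = min(scores, key=scores.get)
```

With these placeholder values, model_B is selected: model_C fits slightly better but pays a larger penalty for its extra parameters.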
Affiliation(s)
- Roberto Del Amparo
  - CINBIO, Universidade de Vigo, 36310 Vigo, Spain; Department of Biochemistry, Genetics and Immunology, Universidade de Vigo, 36310 Vigo, Spain
- Miguel Arenas
  - CINBIO, Universidade de Vigo, 36310 Vigo, Spain; Department of Biochemistry, Genetics and Immunology, Universidade de Vigo, 36310 Vigo, Spain; Galicia Sur Health Research Institute (IIS Galicia Sur), 36310 Vigo, Spain
6
Page BM, Martin TA, Wright CL, Fenton LA, Villar MT, Tang Q, Artigues A, Lamb A, Fenton AW, Swint-Kruse L. Odd one out? Functional tuning of Zymomonas mobilis pyruvate kinase is narrower than its allosteric, human counterpart. Protein Sci 2022; 31:e4336. [PMID: 35762709 DOI: 10.1002/pro.4336] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/29/2022] [Revised: 04/29/2022] [Accepted: 05/03/2022] [Indexed: 11/08/2022]
Abstract
Various protein properties are often illuminated using sequence comparisons of protein homologs. For example, in analyses of the pyruvate kinase multiple sequence alignment, the set of positions that changed during speciation ("phylogenetic" positions) was enriched for "rheostat" positions in human liver pyruvate kinase (hLPYK). (Rheostat positions are those which, when substituted with various amino acids, yield a range of functional outcomes.) However, the correlation was moderate, which could result from multiple biophysical constraints acting on the same position during evolution and/or various sources of noise. To further examine this correlation, we here tested Zymomonas mobilis PYK (ZmPYK), which has <65% sequence identity to any other PYK sequence. Twenty-six ZmPYK positions were selected based on their phylogenetic scores, substituted with multiple amino acids, and assessed for changes in Kapp-PEP. Although we expected to identify multiple, strong rheostat positions, only one moderate rheostat position was detected. Instead, nearly half of the 271 ZmPYK variants were inactive and most others showed near wild-type function. Indeed, for the active ZmPYK variants, the total range of Kapp-PEP values ("tunability") was 40-fold less than that observed for hLPYK variants. The combined functional studies and sequence comparisons suggest that ZmPYK has evolved functional and/or structural attributes that differ from the rest of the family. We hypothesize that including such "orphan" sequences in MSA analyses obscures the correlations used to predict rheostat positions. Finally, the results raise the intriguing biophysical question as to how the same protein fold can support rheostat positions in one homolog but not another.
Affiliation(s)
- Braelyn M Page
  - Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
- Tyler A Martin
  - Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
- Collette L Wright
  - Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA; Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, USA
- Lauren A Fenton
  - Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
- Maite T Villar
  - Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
- Qingling Tang
  - Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
- Antonio Artigues
  - Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
- Audrey Lamb
  - Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, USA; Department of Chemistry, University of Texas at San Antonio, San Antonio, Texas, USA
- Aron W Fenton
  - Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
- Liskin Swint-Kruse
  - Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
7
Dyakin VV, Uversky VN. Arrow of Time, Entropy, and Protein Folding: Holistic View on Biochirality. Int J Mol Sci 2022; 23:ijms23073687. [PMID: 35409047 PMCID: PMC8998916 DOI: 10.3390/ijms23073687] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/08/2022] [Revised: 03/23/2022] [Accepted: 03/25/2022] [Indexed: 02/06/2023] [Open Access]
Abstract
Chirality is a universal phenomenon, embracing the space–time domains of non-organic and organic nature. The biological time arrow, evident in the aging of proteins and organisms, should be linked to the prevalent biomolecular chirality. This hypothesis drives our exploration of protein aging, in relation to the biological aging of an organism. Recent advances in the chirality discrimination methods and theoretical considerations of the non-equilibrium thermodynamics clarify the fundamental issues, concerning the biphasic, alternative, and stepwise changes in the conformational entropy associated with protein folding. Living cells represent open, non-equilibrium, self-organizing, and dissipative systems. The non-equilibrium thermodynamics of cell biology are determined by utilizing the energy stored, transferred, and released, via adenosine triphosphate (ATP). At the protein level, the synthesis of a homochiral polypeptide chain of L-amino acids (L-AAs) represents the first state in the evolution of the dynamic non-equilibrium state of the system. At the next step the non-equilibrium state of a protein-centric system is supported and amended by a broad set of posttranslational modifications (PTMs). The enzymatic phosphorylation, being the most abundant and ATP-driven form of PTMs, illustrates the principal significance of the energy-coupling, in maintaining and reshaping the system. However, the physiological functions of phosphorylation are under the permanent risk of being compromised by spontaneous racemization. Therefore, the major distinct steps in protein-centric aging include the biosynthesis of a polypeptide chain, protein folding assisted by the system of PTMs, and age-dependent spontaneous protein racemization and degradation. To the best of our knowledge, we are the first to pay attention to the biphasic, alternative, and stepwise changes in the conformational entropy of protein folding. 
The broader view on protein folding, including the impact of spontaneous racemization, will help in the goal-oriented experimental design in the field of chiral proteomics.
Affiliation(s)
- Victor V. Dyakin
  - Virtual Reality Perception Lab (VRPL), The Nathan S. Kline Institute for Psychiatric Research (NKI), 140 Old Orangeburg Road, Bldg. 35, Orangeburg, NY 10962, USA
- Vladimir N. Uversky
  - Department of Molecular Medicine, Byrd Alzheimer’s Research Institute, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Blvd., MDC07, Tampa, FL 33612, USA
8
Chu WT, Yan Z, Chu X, Zheng X, Liu Z, Xu L, Zhang K, Wang J. Physics of biomolecular recognition and conformational dynamics. Rep Prog Phys 2021; 84:126601. [PMID: 34753115 DOI: 10.1088/1361-6633/ac3800] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/26/2021] [Accepted: 11/09/2021] [Indexed: 06/13/2023]
Abstract
Biomolecular recognition usually leads to the formation of binding complexes, often accompanied by large-scale conformational changes. This process is fundamental to biological functions at the molecular and cellular levels. Uncovering the physical mechanisms of biomolecular recognition and quantifying the key biomolecular interactions are vital to understand these functions. The recently developed energy landscape theory has been successful in quantifying recognition processes and revealing the underlying mechanisms. Recent studies have shown that in addition to affinity, specificity is also crucial for biomolecular recognition. The proposed physical concept of intrinsic specificity based on the underlying energy landscape theory provides a practical way to quantify the specificity. Optimization of affinity and specificity can be adopted as a principle to guide the evolution and design of molecular recognition. This approach can also be used in practice for drug discovery using multidimensional screening to identify lead compounds. The energy landscape topography of molecular recognition is important for revealing the underlying flexible binding or binding-folding mechanisms. In this review, we first introduce the energy landscape theory for molecular recognition and then address four critical issues related to biomolecular recognition and conformational dynamics: (1) specificity quantification of molecular recognition; (2) evolution and design in molecular recognition; (3) flexible molecular recognition; (4) chromosome structural dynamics. The results described here and the discussions of the insights gained from the energy landscape topography can provide valuable guidance for further computational and experimental investigations of biomolecular recognition and conformational dynamics.
Affiliation(s)
- Wen-Ting Chu
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, People's Republic of China
- Zhiqiang Yan
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, People's Republic of China
- Xiakun Chu
  - Department of Chemistry & Physics, State University of New York at Stony Brook, Stony Brook, NY 11794, United States of America
- Xiliang Zheng
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, People's Republic of China
- Zuojia Liu
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, People's Republic of China
- Li Xu
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, People's Republic of China
- Kun Zhang
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, People's Republic of China
- Jin Wang
  - Department of Chemistry & Physics, State University of New York at Stony Brook, Stony Brook, NY 11794, United States of America
9
Evolutionary Processes and Biophysical Mechanisms: Revisiting Why Evolved Proteins Are Marginally Stable. J Mol Evol 2021; 88:415-417. [PMID: 32385626 DOI: 10.1007/s00239-020-09948-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 01/26/2023]
Abstract
Evolved proteins observed in natural organisms are found to be only marginally stable. Several mechanistic hypotheses have been presented to date to explain this observation. One idea that has been put forward is that active selection prevents proteins from becoming too stable to enable proper function. A second idea is that marginal stability reflects the point of mutation-selection-drift balance, where it is mutational pressure that generates marginal stability. A third idea explored in this issue of Journal of Molecular Evolution is that a physical limit prevents the evolution of more stable proteins rather than an evolutionary process. While the first two notions are based upon specific evolutionary processes, discussion here is aimed at reconciling evolutionary processes with the physics of protein folding, drawing upon the ideas that have been presented.
10
Swint-Kruse L, Martin TA, Page BM, Wu T, Gerhart PM, Dougherty LL, Tang Q, Parente DJ, Mosier BR, Bantis LE, Fenton AW. Rheostat functional outcomes occur when substitutions are introduced at nonconserved positions that diverge with speciation. Protein Sci 2021; 30:1833-1853. [PMID: 34076313 DOI: 10.1002/pro.4136] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/23/2021] [Revised: 05/25/2021] [Accepted: 05/28/2021] [Indexed: 12/14/2022]
Abstract
When amino acids vary during evolution, the outcome can be functionally neutral or biologically important. We previously found that substituting a subset of nonconserved positions, "rheostat" positions, can have surprising effects on protein function. Since changes at rheostat positions can facilitate functional evolution or cause disease, more examples are needed to understand their unique biophysical characteristics. Here, we explored whether "phylogenetic" patterns of change in multiple sequence alignments (such as positions with subfamily-specific conservation) predict the locations of functional rheostat positions. To that end, we experimentally tested eight phylogenetic positions in human liver pyruvate kinase (hLPYK), using 10-15 substitutions per position and biochemical assays that yielded five functional parameters. Five positions were strongly rheostatic and three were non-neutral. To test the corollary that positions with low phylogenetic scores were not rheostat positions, we combined these phylogenetic positions with previously identified hLPYK rheostat, "toggle" (most substitutions abolished function), and "neutral" (all substitutions were like wild-type) positions. Despite representing 428 variants, this set of 33 positions was poorly statistically powered. Thus, we turned to the in vivo phenotypic dataset for E. coli lactose repressor protein (LacI), which comprised 12-13 substitutions at 329 positions and could be used to identify rheostat, toggle, and neutral positions. Combined hLPYK and LacI results show that positions with strong phylogenetic patterns of change are more likely to exhibit rheostat substitution outcomes than neutral or toggle outcomes. Furthermore, phylogenetic patterns were more successful at identifying rheostat positions than were co-evolutionary or eigenvector centrality measures of evolutionary change.
Affiliation(s)
- Liskin Swint-Kruse
  - Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
- Tyler A Martin
  - Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
- Braelyn M Page
  - Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
- Tiffany Wu
  - Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
- Paige M Gerhart
  - Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
- Larissa L Dougherty
  - Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA; Department of Biochemistry and Cell Biology, Geisel School of Medicine at Dartmouth College, Hanover, New Hampshire, USA
- Qingling Tang
  - Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
- Daniel J Parente
  - Department of Family Medicine and Community Health, The University of Kansas Medical Center, Kansas City, Kansas, USA
- Brian R Mosier
  - Department of Biostatistics and Data Science, The University of Kansas Medical Center, Kansas City, Kansas, USA
- Leonidas E Bantis
  - Department of Biostatistics and Data Science, The University of Kansas Medical Center, Kansas City, Kansas, USA
- Aron W Fenton
  - Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
11
Ritchie AM, Stark TL, Liberles DA. Inferring the number and position of changes in selective regime in a non-equilibrium mutation-selection framework. BMC Ecol Evol 2021; 21:39. [PMID: 33691618 PMCID: PMC7944921 DOI: 10.1186/s12862-021-01770-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/30/2020] [Accepted: 02/25/2021] [Indexed: 11/24/2022] [Open Access]
Abstract
BACKGROUND: Recovering the historical patterns of selection acting on a protein coding sequence is a major goal of evolutionary biology. Mutation-selection models address this problem by explicitly modelling fixation rates as a function of site-specific amino acid fitness values. However, they are restricted in their utility for investigating directional evolution because they require prior knowledge of the locations of fitness changes in the lineages of a phylogeny. RESULTS: We apply a modified mutation-selection methodology that relaxes assumptions of equilibrium and time-reversibility. Our implementation allows us to identify branches where adaptive or compensatory shifts in the fitness landscape have taken place, signalled by a change in amino acid fitness profiles. Through simulation and analysis of an empirical data set of β-lactamase genes, we test our ability to recover the position of adaptive events within the tree and successfully reconstruct initial codon frequencies and fitness profile parameters generated under the non-stationary model. CONCLUSION: We demonstrate successful detection of selective shifts and identification of the affected branch on partitions of 300 codons or more. We successfully reconstruct fitness parameters and initial codon frequencies in simulated data and demonstrate that failing to account for non-equilibrium evolution can increase the error in fitness profile estimation. We also demonstrate reconstruction of plausible shifts in amino acid fitnesses in the bacterial β-lactamase family and discuss some caveats for interpretation.
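The phrase "modelling fixation rates as a function of site-specific amino acid fitness values" can be illustrated with the standard Halpern-Bruno form used in many mutation-selection models, where the fixation rate of a mutation relative to a neutral one is S / (1 - exp(-S)) for a scaled fitness difference S. The sketch below shows that standard equilibrium-style expression only; it is not the modified non-equilibrium implementation described in the abstract, and the function name is hypothetical.

```python
import math

def relative_fixation_rate(fitness_from, fitness_to):
    """Fixation rate of a substitution relative to a neutral change.

    Both arguments are scaled fitness values for the current and proposed
    amino acid at one site; their difference is the scaled selection
    coefficient S.
    """
    s = fitness_to - fitness_from
    if abs(s) < 1e-9:
        return 1.0  # neutral limit of S / (1 - exp(-S))
    # expm1 keeps the denominator accurate when S is small.
    return s / (-math.expm1(-s))

beneficial = relative_fixation_rate(0.0, 2.0)    # fixes faster than neutral
deleterious = relative_fixation_rate(2.0, 0.0)   # fixes slower than neutral
```

A shift in a site's fitness profile along a branch changes these relative rates, which is the signal such models look for when locating changes in selective regime.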
Affiliation(s)
- Andrew M Ritchie
  - Department of Biology, Temple University, 1900 North 12th Street, Philadelphia, PA, USA
- Tristan L Stark
  - Department of Biology, Temple University, 1900 North 12th Street, Philadelphia, PA, USA
- David A Liberles
  - Department of Biology, Temple University, 1900 North 12th Street, Philadelphia, PA, USA
12
Conformational Ensembles by NMR and MD Simulations in Model Heptapeptides with Select Tri-Peptide Motifs. Int J Mol Sci 2021; 22:ijms22031364. [PMID: 33573010 PMCID: PMC7866422 DOI: 10.3390/ijms22031364] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/01/2021] [Revised: 01/20/2021] [Accepted: 01/22/2021] [Indexed: 12/13/2022] [Open Access]
Abstract
Both nuclear magnetic resonance (NMR) and molecular dynamics (MD) simulations are routinely used in understanding the conformational space sampled by peptides in the solution state. To investigate the role of a single-residue change in the ensemble of conformations sampled by a set of heptapeptides, AEVXEVG with X = L, F, A, or G, comprehensive NMR experiments and MD simulations were performed. The rationale for selecting the particular model peptides is based on the high variability in the occurrence of the tri-peptide E*L between transmembrane β-barrel (TMB) and globular proteins. The ensemble of conformations sampled by E*L was compared between the three sets of ensembles derived from NMR spectroscopy, MD simulations with explicit solvent, and the random coil conformations. In addition to the estimation of global determinants such as the radius of gyration of a large sample of structures, the ensembles were analyzed using principal component analysis (PCA). In general, the results suggest that the -EVL- peptide indeed adopts a conformational preference that is distinctly different not only from a random distribution but also from other peptides studied here. The relatively straightforward approach presented herein could help understand the conformational preferences of small peptides in the solution state.
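The radius of gyration mentioned above as a global determinant can be computed per structure as the root-mean-square distance of the atoms from their centroid. The sketch below is a minimal, unweighted version (all atoms given equal mass) for a single conformer; the function name and the toy coordinates are hypothetical, and the study's own analysis may weight atoms differently.

```python
import math

def radius_of_gyration(coords):
    """Unweighted radius of gyration of one conformer.

    coords: list of (x, y, z) atom positions, in any consistent length unit.
    """
    n = len(coords)
    # Centroid of the structure (equal weights stand in for atomic masses).
    centroid = [sum(p[i] for p in coords) / n for i in range(3)]
    # Mean squared distance of the atoms from the centroid.
    mean_sq = sum(
        sum((p[i] - centroid[i]) ** 2 for i in range(3)) for p in coords
    ) / n
    return math.sqrt(mean_sq)

# Toy example: two atoms 2 units apart along x.
toy_rg = radius_of_gyration([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)])
```

Applying the function to every structure in an NMR- or MD-derived ensemble gives a distribution of values that can be compared between ensembles.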
|
13
|
Selberg AGA, Gaucher EA, Liberles DA. Ancestral Sequence Reconstruction: From Chemical Paleogenetics to Maximum Likelihood Algorithms and Beyond. J Mol Evol 2021; 89:157-164. [PMID: 33486547 PMCID: PMC7828096 DOI: 10.1007/s00239-021-09993-1] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/04/2020] [Accepted: 01/04/2021] [Indexed: 12/13/2022]
Abstract
As both a computational and an experimental endeavor, ancestral sequence reconstruction remains a timely and important technique. Modern approaches to conduct ancestral sequence reconstruction for proteins are built upon a conceptual framework from journal founder Emile Zuckerkandl. On top of this, work on maximum likelihood phylogenetics published in Journal of Molecular Evolution in 1996 was one of the first approaches for generating maximum likelihood ancestral sequences of proteins. From its computational history, future model development needs as well as potential applications in areas as diverse as computational systems biology, molecular community ecology, infectious disease therapeutics and other biomedical applications, and biotechnology are discussed. From its past in this journal, there is a bright future for ancestral sequence reconstruction in the field of evolutionary biology.
Affiliation(s)
- Avery G A Selberg
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA, 19122, USA
- Eric A Gaucher
- Department of Biology, Georgia State University, Atlanta, GA, 30303, USA
- David A Liberles
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA, 19122, USA.
|
14
|
Choi YJ, Takahashi T, Taki M, Sawada K, Takahashi K. Label-free attomolar protein detection using a MEMS optical interferometric surface-stress immunosensor with a freestanding PMMA/parylene-C nanosheet. Biosens Bioelectron 2021; 172:112778. [PMID: 33157412 DOI: 10.1016/j.bios.2020.112778] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/13/2020] [Revised: 10/26/2020] [Accepted: 10/27/2020] [Indexed: 11/26/2022]
Abstract
We demonstrated an optical interferometer-based surface-stress immunosensor using freestanding polymethyl methacrylate (PMMA)/parylene-C nanosheet with high sensitivity for detection of biomolecules. PMMA/parylene-C nanosheets were transferred onto a silicon substrate with microcavities to fabricate freestanding submicron-thick membrane with a sealed cavity structure. The adhesive force between the transferred parylene-C and binder parylene-C layer was measured to be 1.06-2.4 N/10 mm by tape test. Evading Debye shielding, these nanomechanical sensors allow detection of the adsorption on the membrane surface through changes in surface stress transduced by the electric charge. We optimized the density of receptors and mode of immobilization for high sensitivity. To evaluate the selectivity of the sensor, membrane deflections induced by various proteins were measured and the spectral shifts showed high selectivity only for the target antigen. The minimum limit of detection (LOD) of the sensor for human serum albumin antigen was 0.1-1 fg/mL (1.5-15 aM), which was 20,000 times lower than that of the conventional micro-cantilever sensor.
Affiliation(s)
- Yong-Joon Choi
- Department of Electrical and Electronic Information Engineering, Toyohashi University of Technology, 1-1 Hibarigaoka, Tempakucho, Toyohashi, Aichi, 441-8580, Japan.
- Toshiaki Takahashi
- Department of Electrical and Electronic Information Engineering, Toyohashi University of Technology, 1-1 Hibarigaoka, Tempakucho, Toyohashi, Aichi, 441-8580, Japan
- Miki Taki
- Department of Electrical and Electronic Information Engineering, Toyohashi University of Technology, 1-1 Hibarigaoka, Tempakucho, Toyohashi, Aichi, 441-8580, Japan
- Kazuaki Sawada
- Department of Electrical and Electronic Information Engineering, Toyohashi University of Technology, 1-1 Hibarigaoka, Tempakucho, Toyohashi, Aichi, 441-8580, Japan
- Kazuhiro Takahashi
- Department of Electrical and Electronic Information Engineering, Toyohashi University of Technology, 1-1 Hibarigaoka, Tempakucho, Toyohashi, Aichi, 441-8580, Japan.
|
15
|
Li J, Chen G, Guo Y, Wang H, Li H. Single molecule force spectroscopy reveals the context dependent folding pathway of the C-terminal fragment of Top7. Chem Sci 2020; 12:2876-2884. [PMID: 34164053 PMCID: PMC8179357 DOI: 10.1039/d0sc06344d] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/29/2022]
Abstract
Top7 is a de novo protein designed with atomic-level accuracy, and it has a folded structure not found in nature. Previous studies showed that the folding of Top7 is not cooperative and involves various folding intermediate states. In addition, various fragments of Top7 were found to fold on their own in isolation. These features displayed by Top7 are distinct from those of naturally occurring proteins of a similar size and suggest a rough folding energy landscape. However, it remains unknown if and how the intra-polypeptide chain interactions among the neighboring sequences of Top7 affect the folding of these Top7 fragments. Here we used single-molecule optical tweezers (OT) to investigate the folding–unfolding pathways of full-length Top7 as well as its C-terminal fragment (CFr) in different sequence environments. Our results showed that the mechanical folding of Top7 proceeds through an intermediate state that likely involves non-native interactions/structure. More importantly, we found that the folding of CFr depends entirely on the sequence context in which it is located. When in isolation, CFr indeed folds into a cooperative structure showing near-equilibrium unfolding–folding transitions at ∼6.5 pN in OT experiments. However, CFr loses its autonomous cooperative folding ability and displays a folding pathway that is dependent on its interactions with its neighboring sequence/structure. These context-dependent folding dynamics and pathway of CFr are distinct from those of naturally occurring proteins and highlight the critical importance of intra-chain interactions in shaping the overall energy landscape and the folding pathway of Top7. These new insights may have important implications for the de novo design of proteins. Optical tweezers experiments reveal that the folding of the C-terminal fragment of Top7 (CFr) is context-dependent. Depending on its neighboring sequence, CFr shows very different folding pathways and folding kinetics.
Affiliation(s)
- Jiayu Li
- Department of Chemistry, University of British Columbia Vancouver BC V6T 1Z1 Canada
- Guojun Chen
- Department of Chemistry, University of British Columbia Vancouver BC V6T 1Z1 Canada
- Yabin Guo
- Department of Chemistry, University of British Columbia Vancouver BC V6T 1Z1 Canada
- Han Wang
- Department of Chemistry, University of British Columbia Vancouver BC V6T 1Z1 Canada
- Hongbin Li
- Department of Chemistry, University of British Columbia Vancouver BC V6T 1Z1 Canada
|
16
|
Chi PB, Kosater WM, Liberles DA. Detecting Signatures of Positive Selection against a Backdrop of Compensatory Processes. Mol Biol Evol 2020; 37:3353-3362. [PMID: 32895716 DOI: 10.1093/molbev/msaa161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
There are known limitations in methods of detecting positive selection. Common methods do not enable differentiation between positive selection and compensatory covariation, a major limitation. Further, the traditional method of calculating the ratio of nonsynonymous to synonymous substitutions (dN/dS) does not take into account the 3D structure of biomacromolecules nor differences between amino acids. It also does not account for saturation of synonymous mutations (dS) over long evolutionary time that renders codon-based methods ineffective for older divergences. This work aims to address these shortcomings for detecting positive selection through the development of a statistical model that examines clusters of substitutions in clusters of variable radii. Additionally, it uses a parametric bootstrapping approach to differentiate positive selection from compensatory processes. A previously reported case of positive selection in the leptin protein of primates was reexamined using this methodology.
Affiliation(s)
- Peter B Chi
- Department of Mathematics and Statistics, Villanova University, Villanova, PA; Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA
- Westin M Kosater
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA
- David A Liberles
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA
|
17
|
Funneled energy landscape unifies principles of protein binding and evolution. Proc Natl Acad Sci U S A 2020; 117:27218-27223. [PMID: 33067388 DOI: 10.1073/pnas.2013822117] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 11/18/2022]
Abstract
Most proteins have evolved to spontaneously fold into native structure and specifically bind with their partners for the purpose of fulfilling biological functions. According to Darwin, protein sequences evolve through random mutations, and only the fittest survives. The understanding of how the evolutionary selection sculpts the interaction patterns for both biomolecular folding and binding is still challenging. In this study, we incorporated the constraint of functional binding into the selection fitness based on the principle of minimal frustration for the underlying biomolecular interactions. Thermodynamic stability and kinetic accessibility were derived and quantified from a global funneled energy landscape that satisfies the requirements of both the folding into the stable structure and binding with the specific partner. The evolution proceeds via a bowl-like evolution energy landscape in the sequence space with a closed-ring attractor at the bottom. The sequence space is increasingly reduced until this ring attractor is reached. The molecular-interaction patterns responsible for folding and binding are identified from the evolved sequences, respectively. The residual positions participating in the interactions responsible for folding are highly conserved and maintain the hydrophobic core under additional evolutionary constraints of functional binding. The positions responsible for binding constitute a distributed network via coupling conservations that determine the specificity of binding with the partner. This work unifies the principles of protein binding and evolution under minimal frustration and sheds light on the evolutionary design of proteins for functions.
|
18
|
Liberles DA, Chang B, Geiler-Samerotte K, Goldman A, Hey J, Kaçar B, Meyer M, Murphy W, Posada D, Storfer A. Emerging Frontiers in the Study of Molecular Evolution. J Mol Evol 2020; 88:211-226. [PMID: 32060574 PMCID: PMC7386396 DOI: 10.1007/s00239-020-09932-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 02/07/2023]
Abstract
A collection of the editors of Journal of Molecular Evolution have gotten together to pose a set of key challenges and future directions for the field of molecular evolution. Topics include challenges and new directions in prebiotic chemistry and the RNA world, reconstruction of early cellular genomes and proteins, macromolecular and functional evolution, evolutionary cell biology, genome evolution, molecular evolutionary ecology, viral phylodynamics, theoretical population genomics, somatic cell molecular evolution, and directed evolution. While our effort is not meant to be exhaustive, it reflects research questions and problems in the field of molecular evolution that are exciting to our editors.
Affiliation(s)
- David A Liberles
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA, 19122, USA.
- Belinda Chang
- Department of Ecology and Evolutionary Biology and Department of Cell and Systems Biology, University of Toronto, 25 Harbord Street, Toronto, ON, M5S 3G5, Canada
- Kerry Geiler-Samerotte
- Center for Mechanisms of Evolution, School of Life Sciences, Arizona State University, Tempe, AZ, 85287, USA
- Aaron Goldman
- Department of Biology, Oberlin College and Conservatory, K123 Science Center, 119 Woodland Street, Oberlin, OH, 44074, USA
- Jody Hey
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA, 19122, USA
- Betül Kaçar
- Department of Molecular and Cell Biology, University of Arizona, Tucson, AZ, 85721, USA
- Michelle Meyer
- Department of Biology, Boston College, Chestnut Hill, MA, 02467, USA
- William Murphy
- Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, 77843, USA
- David Posada
- Biomedical Research Center (CINBIO), University of Vigo, Vigo, Spain
- Andrew Storfer
- School of Biological Sciences, Washington State University, Pullman, WA, 99164, USA
|
19
|
Northover DE, Shank SD, Liberles DA. Characterizing lineage-specific evolution and the processes driving genomic diversification in chordates. BMC Evol Biol 2020; 20:24. [PMID: 32046633 PMCID: PMC7011509 DOI: 10.1186/s12862-020-1585-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2019] [Accepted: 01/16/2020] [Indexed: 11/21/2022]
Abstract
Background: Understanding the origins of genome content has long been a goal of molecular evolution and comparative genomics. By examining genome evolution through the lens of lineage-specific evolution, it is possible to make inferences about the evolutionary events that have given rise to species-specific diversification. Here we characterize the evolutionary trends found in chordate species using The Adaptive Evolution Database (TAED). TAED is a database of phylogenetically indexed gene families designed to detect episodes of directional or diversifying selection across chordates. Gene families within the database have been assessed for lineage-specific estimates of dN/dS and have been reconciled to the chordate species tree to identify retained duplicates. Gene families have also been mapped to functional pathways, and amino acid changes that occurred on high dN/dS lineages have been mapped to protein structures. Results: An analysis of this exhaustive database has enabled a characterization of the processes of lineage-specific diversification in chordates. A pathway-level enrichment analysis of TAED determined that pathways most commonly found to have elevated rates of evolution included those involved in metabolism, immunity, and cell signaling. An analysis of protein fold presence on proteins, after normalizing for frequency in the database, found that common folds such as Rossmann folds, Jelly Roll folds, and TIM barrels were overrepresented on proteins most likely to undergo directional selection. A set of gene families that experience increased numbers of duplications within short evolutionary times are associated with pathways involved in metabolism, olfactory reception, and signaling. An analysis of protein secondary structure indicated more relaxed constraint in β-sheets and stronger constraint on α-helices, amidst a general preference for substitutions at exposed sites. Lastly, a detailed analysis of the ornithine decarboxylase gene family, a key enzyme in the pathway for polyamine synthesis, revealed lineage-specific evolution along the lineage leading to Cetacea through rapid sequence evolution in a duplicate gene with amino acid substitutions causing active site rearrangement. Conclusion: Episodes of lineage-specific evolution are frequent throughout chordate species. Both duplication and directional selection have played large roles in the evolution of the phylum. TAED is a powerful tool for facilitating this understanding of lineage-specific evolution.
Affiliation(s)
- David E Northover
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA, 19122, USA
- Stephen D Shank
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA, 19122, USA
- David A Liberles
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA, 19122, USA; Department of Molecular Biology, University of Wyoming, Laramie, WY, 82071, USA.
|
20
|
Arenas M, Bastolla U. ProtASR2: Ancestral reconstruction of protein sequences accounting for folding stability. Methods Ecol Evol 2020. [DOI: 10.1111/2041-210x.13341] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 11/28/2022]
Affiliation(s)
- Miguel Arenas
- Department of Biochemistry, Genetics and Immunology, University of Vigo, Vigo, Spain
- Biomedical Research Center (CINBIO), University of Vigo, Vigo, Spain
- Ugo Bastolla
- Bioinformatics Unit, Centre for Molecular Biology Severo Ochoa (CSIC), Madrid, Spain
|
21
|
In-Silico Evaluation of a New Gene From Wheat Reveals the Divergent Evolution of the CAP160 Homologous Genes Into Monocots. J Mol Evol 2019; 88:151-163. [PMID: 31820048 DOI: 10.1007/s00239-019-09920-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 09/27/2019] [Accepted: 11/19/2019] [Indexed: 10/25/2022]
Abstract
This study reports the evolutionary history and in-silico functional characterization of a novel water-deficit and ABA-responsive gene in wheat. This gene has remote sequence similarity to known abiotic stress-related genes in different plants, including CAP160 in Spinacia oleracea, RD29B in Arabidopsis thaliana, and CDeT11-24 in Craterostigma plantagineum. The study investigated if these genes form a close homologous relationship or if they are a result of convergent evolutionary processes. The results indicated a closely shared homologous relationship between these genes. Bayesian phylogenetic analysis of the protein sequences of the remotely related CAP160 proteins from various plant species indicated the presence of three distinct clades. Further analyses indicated that CAP160 homologous genes have predominantly evolved through neutral processes, with multiple regions experiencing signatures of purifying selection, while others were indicated to be the result of episodic diversifying selection events. Functional predictions revealed that these genes might share at least two functions related to abiotic stress conditions: one similar to the cryoprotective function of LEA protein, and the other a signalling molecule with phosphatidic acid binding specificity. Studies focused on the identification of cold-responsive genes are essential for the development of cold-tolerant crop plants, if we are to increase agricultural productivity throughout temperate regions.
|
22
|
Berkut AA, Chugunov AO, Mineev KS, Peigneur S, Tabakmakher VM, Krylov NA, Oparin PB, Lihonosova AF, Novikova EV, Arseniev AS, Grishin EV, Tytgat J, Efremov RG, Vassilevski AA. Protein surface topography as a tool to enhance the selective activity of a potassium channel blocker. J Biol Chem 2019; 294:18349-18359. [PMID: 31533989 DOI: 10.1074/jbc.ra119.010494] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/05/2019] [Indexed: 01/24/2023]
Abstract
Tk-hefu is an artificial peptide designed based on the α-hairpinin scaffold, which selectively blocks voltage-gated potassium channels Kv1.3. Here we present its spatial structure resolved by NMR spectroscopy and analyze its interaction with channels using computer modeling. We apply protein surface topography to suggest mutations and increase Tk-hefu affinity to the Kv1.3 channel isoform. We redesign the functional surface of Tk-hefu to better match the respective surface of the channel pore vestibule. The resulting peptide Tk-hefu-2 retains Kv1.3 selectivity and displays ∼15 times greater activity compared with Tk-hefu. We verify the mode of Tk-hefu-2 binding to the channel outer vestibule experimentally by site-directed mutagenesis. We argue that scaffold engineering aided by protein surface topography represents a reliable tool for design and optimization of specific ion channel ligands.
Affiliation(s)
- Antonina A Berkut
- M.M. Shemyakin & Yu.A. Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 117997 Moscow, Russia
- Anton O Chugunov
- M.M. Shemyakin & Yu.A. Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 117997 Moscow, Russia; National Research University Higher School of Economics, 101000 Moscow, Russia; Moscow Institute of Physics and Technology (State University), 141700 Dolgoprudny, Russia
- Konstantin S Mineev
- M.M. Shemyakin & Yu.A. Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 117997 Moscow, Russia; Moscow Institute of Physics and Technology (State University), 141700 Dolgoprudny, Russia
- Steve Peigneur
- Toxicology and Pharmacology, University of Leuven, 3000 Leuven, Belgium
- Valentin M Tabakmakher
- M.M. Shemyakin & Yu.A. Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 117997 Moscow, Russia; School of Biomedicine, Far Eastern Federal University, 690950 Vladivostok, Russia
- Nikolay A Krylov
- M.M. Shemyakin & Yu.A. Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 117997 Moscow, Russia; National Research University Higher School of Economics, 101000 Moscow, Russia
- Peter B Oparin
- M.M. Shemyakin & Yu.A. Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 117997 Moscow, Russia
- Alyona F Lihonosova
- National Research University Higher School of Economics, 101000 Moscow, Russia
- Ekaterina V Novikova
- M.M. Shemyakin & Yu.A. Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 117997 Moscow, Russia; Moscow Institute of Physics and Technology (State University), 141700 Dolgoprudny, Russia
- Alexander S Arseniev
- M.M. Shemyakin & Yu.A. Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 117997 Moscow, Russia; Moscow Institute of Physics and Technology (State University), 141700 Dolgoprudny, Russia
- Eugene V Grishin
- M.M. Shemyakin & Yu.A. Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 117997 Moscow, Russia
- Jan Tytgat
- Toxicology and Pharmacology, University of Leuven, 3000 Leuven, Belgium
- Roman G Efremov
- M.M. Shemyakin & Yu.A. Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 117997 Moscow, Russia; National Research University Higher School of Economics, 101000 Moscow, Russia; Moscow Institute of Physics and Technology (State University), 141700 Dolgoprudny, Russia.
- Alexander A Vassilevski
- M.M. Shemyakin & Yu.A. Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 117997 Moscow, Russia; Moscow Institute of Physics and Technology (State University), 141700 Dolgoprudny, Russia.
|
23
|
Held T, Klemmer D, Lässig M. Survival of the simplest in microbial evolution. Nat Commun 2019; 10:2472. [PMID: 31171781 PMCID: PMC6554311 DOI: 10.1038/s41467-019-10413-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 10/02/2018] [Accepted: 05/10/2019] [Indexed: 01/09/2023]
Abstract
The evolution of microbial and viral organisms often generates clonal interference, a mode of competition between genetic clades within a population. Here we show how interference impacts systems biology by constraining genetic and phenotypic complexity. Our analysis uses biophysically grounded evolutionary models for molecular phenotypes, such as fold stability and enzymatic activity of genes. We find a generic mode of phenotypic interference that couples the function of individual genes and the population’s global evolutionary dynamics. Biological implications of phenotypic interference include rapid collateral system degradation in adaptation experiments and long-term selection against genome complexity: each additional gene carries a cost proportional to the total number of genes. Recombination above a threshold rate can eliminate this cost, which establishes a universal, biophysically grounded scenario for the evolution of sex. In a broader context, our analysis suggests that the systems biology of microbes is strongly intertwined with their mode of evolution. In asexual populations selection at different genomic loci can interfere with each other. Here, using a biophysical model of molecular evolution the authors show that interference results in long-term degradation of molecular function, an effect that strongly depends on genome size.
Affiliation(s)
- Torsten Held
- Institut für Biologische Physik, Universität zu Köln, Zülpicherstr. 77, 50937, Köln, Germany
- Daniel Klemmer
- Institut für Biologische Physik, Universität zu Köln, Zülpicherstr. 77, 50937, Köln, Germany
- Michael Lässig
- Institut für Biologische Physik, Universität zu Köln, Zülpicherstr. 77, 50937, Köln, Germany.
|
24
|
Kuzminkova AA, Sokol AD, Ushakova KE, Popadin KY, Gunbin KV. mtProtEvol: the resource presenting molecular evolution analysis of proteins involved in the function of Vertebrate mitochondria. BMC Evol Biol 2019; 19:47. [PMID: 30813887 PMCID: PMC6391778 DOI: 10.1186/s12862-019-1371-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/20/2022]
Abstract
BACKGROUND Heterotachy is the variation in the evolutionary rate of aligned sites in different parts of the phylogenetic tree. It occurs mainly due to epistatic interactions among the substitutions, which are highly complex and make it difficult to study protein evolution. The vast majority of computational evolutionary approaches for studying these epistatic interactions or their evolutionary consequences in proteins require high computational time. However, it has recently been shown that the evolution of residue solvent accessibility (RSA) is tightly linked with changes in protein fitness and intra-protein epistatic interactions. This provides a computationally fast alternative, based on comparing the evolutionary rates of amino acid replacements with the rates of RSA evolutionary changes in order to recognize any shifts in epistatic interaction. RESULTS Based on RSA information, data randomization and phylogenetic approaches, we constructed a software pipeline that can be used to analyze the evolutionary consequences of intra-protein epistatic interactions with relatively low computational time. We analyzed the evolution of 512 protein families tightly linked to mitochondrial function in Vertebrates and created "mtProtEvol", a web resource with data on protein evolution. In strict agreement with lifespan and metabolic rate data, we demonstrated that different functional categories of mitochondria-related proteins were subjected to selection on accelerated and decelerated RSA rates in rodents and primates. For example, accelerated RSA evolution in rodents has been shown for Krebs cycle enzymes, respiratory chain and reactive oxygen species metabolism, while in primates these functions are stress response, translation and mtDNA integrity. Decelerated RSA evolution in rodents has been demonstrated for translational machinery and oxidative stress response components. CONCLUSIONS mtProtEvol is an interactive resource focused on evolutionary analysis of epistatic interactions in protein families involved in Vertebrata mitochondria function and is available at http://bioinfodbs.kantiana.ru/mtProtEvol/. This resource and the devised software pipeline may be a useful tool for researchers in the area of protein evolution.
Affiliation(s)
- Anastasia A. Kuzminkova
- Center for Mitochondrial Functional Genomics, School of Life Science, Immanuel Kant Baltic Federal University, Kaliningrad, Russia
- Anastasia D. Sokol
- Center for Mitochondrial Functional Genomics, School of Life Science, Immanuel Kant Baltic Federal University, Kaliningrad, Russia
- Kristina E. Ushakova
- Center for Mitochondrial Functional Genomics, School of Life Science, Immanuel Kant Baltic Federal University, Kaliningrad, Russia
- Konstantin Yu. Popadin
- Center for Mitochondrial Functional Genomics, School of Life Science, Immanuel Kant Baltic Federal University, Kaliningrad, Russia
- Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
- Konstantin V. Gunbin
- Center for Mitochondrial Functional Genomics, School of Life Science, Immanuel Kant Baltic Federal University, Kaliningrad, Russia
- Center of Brain Neurobiology and Neurogenetics, Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
- Novosibirsk State University, Novosibirsk, Russia
|
25
|
Yan Z, Wang J. Superfunneled Energy Landscape of Protein Evolution Unifies the Principles of Protein Evolution, Folding, and Design. Phys Rev Lett 2019; 122:018103. [PMID: 31012725 DOI: 10.1103/physrevlett.122.018103] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 12/06/2017] [Revised: 11/08/2018] [Indexed: 06/09/2023]
Abstract
Evolution is essential for shaping the biological functions. Darwin proposed the selection as the driving force for evolution upon mutations. While mutations are clear, the quantification of the selection force is still challenging. In this study, we identified and quantified both thermodynamic stability and kinetic accessibility as the selection forces for protein evolution. The protein evolution can be viewed and quantified as a trajectory moving along a superfunneled energy landscape with a line attractor at the bottom. The resulting evolved sequences and structures show strong protein characteristics including the hydrophobic core, high designability, and fast folding. The evolution principle uncovered here is validated on real proteins and sheds light on the protein design.
Collapse
Affiliation(s)
- Zhiqiang Yan
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin 130022, China
| | - Jin Wang
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin 130022, China
- Department of Chemistry & Physics, State University of New York at Stony Brook, Stony Brook, New York 11790, USA
| |
|
26
|
Fragata I, Blanckaert A, Dias Louro MA, Liberles DA, Bank C. Evolution in the light of fitness landscape theory. Trends Ecol Evol 2019; 34:69-82. [DOI: 10.1016/j.tree.2018.10.009] [Citation(s) in RCA: 84] [Impact Index Per Article: 16.8] [Received: 06/07/2018] [Revised: 10/16/2018] [Accepted: 10/17/2018] [Indexed: 01/28/2023]
|
27
|
Liberles DA, Teufel AI. Evolution and Structure of Proteins and Proteomes. Genes (Basel) 2018; 9:E583. [PMID: 30487453 PMCID: PMC6315575 DOI: 10.3390/genes9120583] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 11/21/2018] [Accepted: 11/26/2018] [Indexed: 12/13/2022]
Abstract
This themed issue centered on the evolution and structure of proteins and proteomes is comprised of seven published manuscripts. [...].
Affiliation(s)
- David A Liberles
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA 19122, USA.
| | - Ashley I Teufel
- Department of Integrative Biology, Institute for Cellular and Molecular Biology, and Center for Computational Biology and Bioinformatics, The University of Texas at Austin, Austin, TX 78712, USA.
| |
|
28
|
Gupta M, Sharma R, Kumar A. Docking techniques in pharmacology: How much promising? Comput Biol Chem 2018; 76:210-217. [PMID: 30067954 DOI: 10.1016/j.compbiolchem.2018.06.005] [Citation(s) in RCA: 86] [Impact Index Per Article: 14.3] [Received: 09/30/2017] [Revised: 02/21/2018] [Accepted: 06/30/2018] [Indexed: 01/01/2023]
|
29
|
|
30
|
Using the Mutation-Selection Framework to Characterize Selection on Protein Sequences. Genes (Basel) 2018; 9:genes9080409. [PMID: 30104502 PMCID: PMC6115872 DOI: 10.3390/genes9080409] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 06/19/2018] [Revised: 08/02/2018] [Accepted: 08/09/2018] [Indexed: 12/13/2022]
Abstract
When mutational pressure is weak, the generative process of protein evolution involves explicit probabilities of mutations of different types coupled to their conditional probabilities of fixation dependent on selection. Establishing this mechanistic modeling framework for the detection of selection has been a goal in the field of molecular evolution. Building on a mathematical framework proposed more than a decade ago, numerous methods have been introduced in an attempt to detect and measure selection on protein sequences. In this review, we discuss the structure of the original model, subsequent advances, and the series of assumptions that these models operate under.
|
31
|
Platt A, Weber CC, Liberles DA. Protein evolution depends on multiple distinct population size parameters. BMC Evol Biol 2018; 18:17. [PMID: 29422024 PMCID: PMC5806465 DOI: 10.1186/s12862-017-1085-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 05/25/2017] [Accepted: 11/20/2017] [Indexed: 01/08/2023]
Abstract
That population size affects the fate of new mutations arising in genomes, modulating both how frequently they arise and how efficiently natural selection is able to filter them, is well established. It is therefore clear that these distinct roles for population size that characterize different processes should affect the evolution of proteins and need to be carefully defined. Empirical evidence is consistent with a role for demography in influencing protein evolution, supporting the idea that functional constraints alone do not determine the composition of coding sequences. Given that the relationship between population size, mutant fitness and fixation probability has been well characterized, estimating fitness from observed substitutions is well within reach with well-formulated models. Molecular evolution research has, therefore, increasingly begun to leverage concepts from population genetics to quantify the selective effects associated with different classes of mutation. However, in order for this type of analysis to provide meaningful information about the intra- and inter-specific evolution of coding sequences, a clear definition of concepts of population size, what they influence, and how they are best parameterized is essential. Here, we present an overview of the many distinct concepts that “population size” and “effective population size” may refer to, what they represent for studying proteins, and how this knowledge can be harnessed to produce better specified models of protein evolution.
Affiliation(s)
- Alexander Platt
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, 19121, USA
| | - Claudia C Weber
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, 19121, USA
| | - David A Liberles
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, 19121, USA.
| |
|
32
|
Reddy S, Kimball RT, Pandey A, Hosner PA, Braun MJ, Hackett SJ, Han KL, Harshman J, Huddleston CJ, Kingston S, Marks BD, Miglia KJ, Moore WS, Sheldon FH, Witt CC, Yuri T, Braun EL. Why Do Phylogenomic Data Sets Yield Conflicting Trees? Data Type Influences the Avian Tree of Life more than Taxon Sampling. Syst Biol 2018; 66:857-879. [PMID: 28369655 DOI: 10.1093/sysbio/syx041] [Citation(s) in RCA: 146] [Impact Index Per Article: 24.3] [Received: 10/15/2016] [Accepted: 03/22/2017] [Indexed: 01/27/2023]
Abstract
Phylogenomics, the use of large-scale data matrices in phylogenetic analyses, has been viewed as the ultimate solution to the problem of resolving difficult nodes in the tree of life. However, it has become clear that analyses of these large genomic data sets can also result in conflicting estimates of phylogeny. Here, we use the early divergences in Neoaves, the largest clade of extant birds, as a "model system" to understand the basis for incongruence among phylogenomic trees. We were motivated by the observation that trees from two recent avian phylogenomic studies exhibit conflicts. Those studies used different strategies: 1) collecting many characters [$\sim$ 42 mega base pairs (Mbp) of sequence data] from 48 birds, sometimes including only one taxon for each major clade; and 2) collecting fewer characters ($\sim$ 0.4 Mbp) from 198 birds, selected to subdivide long branches. However, the studies also used different data types: the taxon-poor data matrix comprised 68% non-coding sequences whereas coding exons dominated the taxon-rich data matrix. This difference raises the question of whether the primary reason for incongruence is the number of sites, the number of taxa, or the data type. To test among these alternative hypotheses we assembled a novel, large-scale data matrix comprising 90% non-coding sequences from 235 bird species. Although increased taxon sampling appeared to have a positive impact on phylogenetic analyses the most important variable was data type. Indeed, by analyzing different subsets of the taxa in our data matrix we found that increased taxon sampling actually resulted in increased congruence with the tree from the previous taxon-poor study (which had a majority of non-coding data) instead of the taxon-rich study (which largely used coding data). 
We suggest that the observed differences in the estimates of topology for these studies reflect data-type effects due to violations of the models used in phylogenetic analyses, some of which may be difficult to detect. If incongruence among trees estimated using phylogenomic methods largely reflects problems with model fit developing more "biologically-realistic" models is likely to be critical for efforts to reconstruct the tree of life. [Birds; coding exons; GTR model; model fit; Neoaves; non-coding DNA; phylogenomics; taxon sampling.].
Affiliation(s)
- Sushma Reddy
- Biology Department, Loyola University Chicago, 1032 West Sheridan Road, Chicago, IL 60660, USA
| | - Rebecca T Kimball
- Department of Biology, University of Florida, Gainesville, FL 32607, USA
| | - Akanksha Pandey
- Department of Biology, University of Florida, Gainesville, FL 32607, USA
| | - Peter A Hosner
- Department of Biology, University of Florida, Gainesville, FL 32607, USA
- Florida Museum of Natural History, University of Florida, Gainesville, FL 32607, USA
| | - Michael J Braun
- Behavior, Ecology, Evolution, and Systematics Program, University of Maryland, College Park, MD 20742, USA
- Department of Vertebrate Zoology, National Museum of Natural History, Smithsonian Institution-MRC 163, PO Box 37012, Washington, DC 20013-7012, USA
| | - Shannon J Hackett
- Zoology Department, Field Museum of Natural History, 1400 South Lake Shore Drive, Chicago, IL 60605, USA
| | - Kin-Lan Han
- Department of Biology, University of Florida, Gainesville, FL 32607, USA
| | | | - Christopher J Huddleston
- Collections Program, National Museum of Natural History, Smithsonian Institution, 4210 Silver Hill Road, Suitland, MD 20746, USA
| | - Sarah Kingston
- Behavior, Ecology, Evolution, and Systematics Program, University of Maryland, College Park, MD 20742, USA
- Department of Vertebrate Zoology, National Museum of Natural History, Smithsonian Institution-MRC 163, PO Box 37012, Washington, DC 20013-7012, USA
- Bowdoin College, Department of Biology and Coastal Studies Center, 6500 College Station, Brunswick, ME 04011, USA
| | - Ben D Marks
- Zoology Department, Field Museum of Natural History, 1400 South Lake Shore Drive, Chicago, IL 60605, USA
| | - Kathleen J Miglia
- Department of Biological Sciences, Wayne State University, 5047 Gullen Mall, Detroit, MI 48202, USA
| | - William S Moore
- Department of Biological Sciences, Wayne State University, 5047 Gullen Mall, Detroit, MI 48202, USA
| | - Frederick H Sheldon
- Museum of Natural Science and Department of Biological Sciences, Louisiana State University, 119 Foster Hall, Baton Rouge, LA 70803, USA
| | - Christopher C Witt
- Department of Biology and Museum of Southwestern Biology, University of New Mexico, Albuquerque, New Mexico 87131, USA
| | - Tamaki Yuri
- Department of Biology, University of Florida, Gainesville, FL 32607, USA
- Sam Noble Museum, University of Oklahoma, 2401 Chautauqua Avenue, Norman, OK 73072, USA
| | - Edward L Braun
- Department of Biology, University of Florida, Gainesville, FL 32607, USA
- Genetics Institute, University of Florida, Gainesville, FL 32607, USA
| |
|
33
|
Beyond Thermodynamic Constraints: Evolutionary Sampling Generates Realistic Protein Sequence Variation. Genetics 2018; 208:1387-1395. [PMID: 29382650 DOI: 10.1534/genetics.118.300699] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 01/04/2018] [Accepted: 01/25/2018] [Indexed: 01/01/2023]
Abstract
Biological evolution generates a surprising amount of site-specific variability in protein sequences. Yet, attempts at modeling this process have been only moderately successful, and current models based on protein structural metrics explain, at best, 60% of the observed variation. Surprisingly, simple measures of protein structure, such as solvent accessibility, are often better predictors of site-specific variability than more complex models employing all-atom energy functions and detailed structural modeling. We suggest here that these more complex models perform poorly because they lack consideration of the evolutionary process, which is, in part, captured by the simpler metrics. We compare protein sequences that are computationally designed to sequences that are computationally evolved using the same protein-design energy function and to homologous natural sequences. We find that, by a wide variety of metrics, evolved sequences are much more similar to natural sequences than are designed sequences. In particular, designed sequences are too conserved on the protein surface relative to natural sequences, whereas evolved sequences are not. Our results suggest that evolutionary simulation produces a realistic sampling of sequence space. By contrast, protein design, at least as currently implemented, does not. Existing energy functions seem to be sufficiently accurate to correctly describe the key thermodynamic constraints acting on protein sequences, but they need to be paired with realistic sampling schemes to generate realistic sequence alignments.
|
34
|
Chi PB, Kim D, Lai JK, Bykova N, Weber CC, Kubelka J, Liberles DA. A new parameter-rich structure-aware mechanistic model for amino acid substitution during evolution. Proteins 2017; 86:218-228. [PMID: 29178386 DOI: 10.1002/prot.25429] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 08/25/2017] [Revised: 11/14/2017] [Accepted: 11/22/2017] [Indexed: 02/06/2023]
Abstract
Improvements in the description of amino acid substitution are required to develop better pseudo-energy-based protein structure-aware models for use in phylogenetic studies. These models are used to characterize the probabilities of amino acid substitution and enable better simulation of protein sequences over a phylogeny. A better characterization of amino acid substitution probabilities in turn enables numerous downstream applications, like detecting positive selection, ancestral sequence reconstruction, and evolutionarily-motivated protein engineering. Many existing Markov models for amino acid substitution in molecular evolution disregard molecular structure and describe the amino acid substitution process over longer evolutionary periods poorly. Here, we present a new model upgraded with a site-specific parameterization of pseudo-energy terms in a coarse-grained force field, which describes local heterogeneity in physical constraints on amino acid substitution better than a previous pseudo-energy-based model with minimum cost in runtime. The importance of each weight term parameterization in characterizing underlying features of the site, including contact number, solvent accessibility, and secondary structural elements was evaluated, returning both expected and biologically reasonable relationships between model parameters. This results in the acceptance of proposed amino acid substitutions that more closely resemble those observed site-specific frequencies in gene family alignments. The modular site-specific pseudo-energy function is made available for download through the following website: https://liberles.cst.temple.edu/Software/CASS/index.html.
Affiliation(s)
- Peter B Chi
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, Pennsylvania, 19122
- Department of Mathematics and Computer Science, Ursinus College, Collegeville, Pennsylvania, 19426
| | - Dohyup Kim
- Department of Molecular Biology, University of Wyoming, Laramie, Wyoming, 82071
| | - Jason K Lai
- Department of Molecular Biology, University of Wyoming, Laramie, Wyoming, 82071
| | - Nadia Bykova
- Department of Molecular Biology, University of Wyoming, Laramie, Wyoming, 82071
- Faculty of Bioengineering and Bioinformatics, Moscow State University, Moscow, 119234, Russia
| | - Claudia C Weber
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, Pennsylvania, 19122
| | - Jan Kubelka
- Department of Chemistry, University of Wyoming, Laramie, Wyoming, 82071
| | - David A Liberles
- Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, Pennsylvania, 19122
- Department of Molecular Biology, University of Wyoming, Laramie, Wyoming, 82071
| |
|
35
|
Teufel AI, Wilke CO. Accelerated simulation of evolutionary trajectories in origin-fixation models. J R Soc Interface 2017; 14:20160906. [PMID: 28228542 PMCID: PMC5332577 DOI: 10.1098/rsif.2016.0906] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 11/10/2016] [Accepted: 01/31/2017] [Indexed: 11/12/2022]
Abstract
We present an accelerated algorithm to forward-simulate origin-fixation models. Our algorithm requires, on average, only about two fitness evaluations per fixed mutation, whereas traditional algorithms require, per one fixed mutation, a number of fitness evaluations of the order of the effective population size, Ne. Our accelerated algorithm yields the exact same steady state as the original algorithm but produces a different order of fixed mutations. By comparing several relevant evolutionary metrics, such as the distribution of fixed selection coefficients and the probability of reversion, we find that the two algorithms behave equivalently in many respects. However, the accelerated algorithm yields less variance in fixed selection coefficients. Notably, we are able to recover the expected amount of variance by rescaling population size, and we find a linear relationship between the rescaled population size and the population size used by the original algorithm. Considering the widespread usage of origin-fixation simulations across many areas of evolutionary biology, we introduce our accelerated algorithm as a useful tool for increasing the computational complexity of fitness functions without sacrificing much in terms of accuracy of the evolutionary simulation.
Affiliation(s)
- Ashley I Teufel
- Department of Integrative Biology, Institute for Cellular and Molecular Biology, and Center for Computational Biology and Bioinformatics, The University of Texas at Austin, Austin, TX 78712, USA
| | - Claus O Wilke
- Department of Integrative Biology, Institute for Cellular and Molecular Biology, and Center for Computational Biology and Bioinformatics, The University of Texas at Austin, Austin, TX 78712, USA
| |
|
36
|
Bastolla U, Dehouck Y, Echave J. What evolution tells us about protein physics, and protein physics tells us about evolution. Curr Opin Struct Biol 2017; 42:59-66. [DOI: 10.1016/j.sbi.2016.10.020] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 08/23/2016] [Revised: 10/19/2016] [Accepted: 10/24/2016] [Indexed: 12/21/2022]
|
37
|
Randall RN, Radford CE, Roof KA, Natarajan DK, Gaucher EA. An experimental phylogeny to benchmark ancestral sequence reconstruction. Nat Commun 2016; 7:12847. [PMID: 27628687 PMCID: PMC5027606 DOI: 10.1038/ncomms12847] [Citation(s) in RCA: 50] [Impact Index Per Article: 6.3] [Received: 04/01/2016] [Accepted: 08/05/2016] [Indexed: 12/15/2022]
Abstract
Ancestral sequence reconstruction (ASR) is a still-burgeoning method that has revealed many key mechanisms of molecular evolution. One criticism of the approach is an inability to validate its algorithms within a biological context as opposed to a computer simulation. Here we build an experimental phylogeny using the gene of a single red fluorescent protein to address this criticism. The evolved phylogeny consists of 19 operational taxonomic units (leaves) and 17 ancestral bifurcations (nodes) that display a wide variety of fluorescent phenotypes. The 19 leaves then serve as 'modern' sequences that we subject to ASR analyses using various algorithms and to benchmark against the known ancestral genotypes and ancestral phenotypes. We confirm computer simulations that show all algorithms infer ancient sequences with high accuracy, yet we also reveal wide variation in the phenotypes encoded by incorrectly inferred sequences. Specifically, Bayesian methods incorporating rate variation significantly outperform the maximum parsimony criterion in phenotypic accuracy. Subsampling of extant sequences had minor effect on the inference of ancestral sequences.
Affiliation(s)
- Ryan N. Randall
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
| | - Caelan E. Radford
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
| | - Kelsey A. Roof
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
| | - Divya K. Natarajan
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
| | - Eric A. Gaucher
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
- Institute for Bioengineering and Biosciences, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
| |
|