Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Morcos F, Schafer NP, Cheng RR, Onuchic JN, Wolynes PG. Coevolutionary information, protein folding landscapes, and the thermodynamics of natural selection. Proc Natl Acad Sci U S A 2014;111:12408-13. [PMID: 25114242 DOI: 10.1073/pnas.1413575111] [Citation(s) in RCA: 109] [Impact Index Per Article: 10.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Cited by Other Article(s)

Gizzio J, Thakur A, Haldane A, Post CB, Levy RM. Evolutionary sequence and structural basis for the distinct conformational landscapes of Tyr and Ser/Thr kinases. Nat Commun 2024;15:6545. [PMID: 39095350 PMCID: PMC11297160 DOI: 10.1038/s41467-024-50812-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/22/2024] [Accepted: 07/22/2024] [Indexed: 08/04/2024] Open

Kinshuk S, Li L, Meckes B, Chan CTY. Sequence-Based Protein Design: A Review of Using Statistical Models to Characterize Coevolutionary Traits for Developing Hybrid Proteins as Genetic Sensors. Int J Mol Sci 2024;25:8320. [PMID: 39125888 PMCID: PMC11312098 DOI: 10.3390/ijms25158320] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/02/2024] [Revised: 07/23/2024] [Accepted: 07/26/2024] [Indexed: 08/12/2024] Open

Martin J, Lequerica Mateos M, Onuchic JN, Coluzza I, Morcos F. Machine learning in biological physics: From biomolecular prediction to design. Proc Natl Acad Sci U S A 2024;121:e2311807121. [PMID: 38913893 PMCID: PMC11228481 DOI: 10.1073/pnas.2311807121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/26/2024] Open

Fram B, Su Y, Truebridge I, Riesselman AJ, Ingraham JB, Passera A, Napier E, Thadani NN, Lim S, Roberts K, Kaur G, Stiffler MA, Marks DS, Bahl CD, Khan AR, Sander C, Gauthier NP. Simultaneous enhancement of multiple functional properties using evolution-informed protein design. Nat Commun 2024;15:5141. [PMID: 38902262 PMCID: PMC11190266 DOI: 10.1038/s41467-024-49119-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/28/2023] [Accepted: 05/24/2024] [Indexed: 06/22/2024] Open

Affiliation(s)

Benjamin Fram: Department of Systems Biology, Harvard Medical School, Boston, MA, USA; Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA.
Yang Su: Department of Systems Biology, Harvard Medical School, Boston, MA, USA.
Ian Truebridge: Institute for Protein Innovation, Boston, MA, USA; Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA; AI Proteins, Boston, MA, USA.
Adam J Riesselman: Department of Systems Biology, Harvard Medical School, Boston, MA, USA; Program in Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
John B Ingraham: Department of Systems Biology, Harvard Medical School, Boston, MA, USA.
Alessandro Passera: Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA; Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Campus-Vienna-Biocenter 1, 1030, Vienna, Austria.
Eve Napier: School of Biochemistry and Immunology, Trinity College Dublin, Dublin 2, Ireland.
Nicole N Thadani: Department of Systems Biology, Harvard Medical School, Boston, MA, USA; Apriori Bio, Cambridge, MA, USA.
Samuel Lim: Department of Systems Biology, Harvard Medical School, Boston, MA, USA.
Kristen Roberts: Selux Diagnostics Inc., 56 Roland Street, Charlestown, MA, USA.
Gurleen Kaur: Selux Diagnostics Inc., 56 Roland Street, Charlestown, MA, USA.
Michael A Stiffler: Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA; Dyno Therapeutics, 343 Arsenal Street, Watertown, MA, USA.
Debora S Marks: Department of Systems Biology, Harvard Medical School, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA.
Christopher D Bahl: Institute for Protein Innovation, Boston, MA, USA; Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA; AI Proteins, Boston, MA, USA.
Amir R Khan: School of Biochemistry and Immunology, Trinity College Dublin, Dublin 2, Ireland; Division of Newborn Medicine, Boston Children's Hospital, Boston, MA, USA.
Chris Sander: Department of Systems Biology, Harvard Medical School, Boston, MA, USA; Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA.
Nicholas P Gauthier: Department of Systems Biology, Harvard Medical School, Boston, MA, USA; Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA.

He J, Wu W, Wang X. DIProT: A deep learning based interactive toolkit for efficient and effective Protein design. Synth Syst Biotechnol 2024;9:217-222. [PMID: 38385151 PMCID: PMC10876589 DOI: 10.1016/j.synbio.2024.01.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/31/2023] [Revised: 01/02/2024] [Accepted: 01/30/2024] [Indexed: 02/23/2024] Open

Sánchez IE, Galpern EA, Ferreiro DU. Solvent constraints for biopolymer folding and evolution in extraterrestrial environments. Proc Natl Acad Sci U S A 2024;121:e2318905121. [PMID: 38739787 PMCID: PMC11127021 DOI: 10.1073/pnas.2318905121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/29/2023] [Accepted: 04/16/2024] [Indexed: 05/16/2024] Open

Jaafari H, Bueno C, Schafer NP, Martin J, Morcos F, Wolynes PG. The physical and evolutionary energy landscapes of devolved protein sequences corresponding to pseudogenes. Proc Natl Acad Sci U S A 2024;121:e2322428121. [PMID: 38739795 PMCID: PMC11127006 DOI: 10.1073/pnas.2322428121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/20/2023] [Accepted: 03/26/2024] [Indexed: 05/16/2024] Open

Schwerdtfeger P, Wales DJ. 100 Years of the Lennard-Jones Potential. J Chem Theory Comput 2024;20:3379-3405. [PMID: 38669689 DOI: 10.1021/acs.jctc.4c00135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/28/2024]

Gizzio J, Thakur A, Haldane A, Levy RM. Evolutionary sequence and structural basis for the distinct conformational landscapes of Tyr and Ser/Thr kinases. RESEARCH SQUARE 2024:rs.3.rs-4048991. [PMID: 38746330 PMCID: PMC11092858 DOI: 10.21203/rs.3.rs-4048991/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/16/2024]

Gizzio J, Thakur A, Haldane A, Post CB, Levy RM. Evolutionary sequence and structural basis for the distinct conformational landscapes of Tyr and Ser/Thr kinases. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.08.584161. [PMID: 38559238 PMCID: PMC10979876 DOI: 10.1101/2024.03.08.584161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 04/04/2024]

Michael R, Kæstel-Hansen J, Mørch Groth P, Bartels S, Salomon J, Tian P, Hatzakis NS, Boomsma W. A systematic analysis of regression models for protein engineering. PLoS Comput Biol 2024;20:e1012061. [PMID: 38701099 PMCID: PMC11095727 DOI: 10.1371/journal.pcbi.1012061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/26/2023] [Revised: 05/15/2024] [Accepted: 04/10/2024] [Indexed: 05/05/2024] Open

Pereira de Araújo AF. Sequence-dependent and -independent information in a combined random energy model for protein folding and coding. Proteins 2024;92:679-687. [PMID: 38158239 DOI: 10.1002/prot.26658] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/07/2023] [Revised: 12/11/2023] [Accepted: 12/15/2023] [Indexed: 01/03/2024]

Abstract

Random energy models (REMs) provide a simple description of the energy landscapes that guide protein folding and evolution. The requirement of a large energy gap between the native structure and unfolded conformations, considered necessary for cooperative, protein-like, folding behavior, indicates that proteins differ markedly from random heteropolymers. It has been suggested, therefore, that natural selection might have acted to choose nonrandom amino acid sequences satisfying this particular condition, implying that a large fraction of possible, unselected random sequences, would not fold to any structure. From an informational perspective, however, this scenario could indicate that protein structures, regarded as messages to be transmitted through a communication channel, would not be efficiently encoded in amino acid sequences, regarded as the communication channel for this transmission, since a large fraction of possible channel states would not be used. Here, we use a combined REM for conformations and sequences, with previously estimated parameters for natural proteins, to explore an alternative possibility in which the appropriate shape of the landscape results mainly from the deviation from randomness of possible native structures instead of sequences. We observe that this situation emerges naturally if the distribution of conformational energies happens to arise from two independent contributions corresponding to sequence-dependent and -independent terms. This construction is consistent with the hypothesis of a protein burial folding code, with native structures being determined by a modest amount of sequence-dependent atomic burial information with sequence-independent constraints imposed by unspecific hydrogen bond formation. 
More generally, an appropriate combination of sequence-dependent and -independent information accommodates the possibility of an efficient structural encoding with the main physical requirement for folding, providing possible insight not only on the folding process but also on several aspects of sequence evolution such as neutral networks, conformational coverage, and de novo gene emergence.

Biswas A, Choudhuri I, Arnold E, Lyumkis D, Haldane A, Levy RM. Kinetic coevolutionary models predict the temporal emergence of HIV-1 resistance mutations under drug selection pressure. Proc Natl Acad Sci U S A 2024;121:e2316662121. [PMID: 38557187 PMCID: PMC11009627 DOI: 10.1073/pnas.2316662121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/25/2023] [Accepted: 02/23/2024] [Indexed: 04/04/2024] Open

Thakur A, Gizzio J, Levy RM. Potts Hamiltonian Models and Molecular Dynamics Free Energy Simulations for Predicting the Impact of Mutations on Protein Kinase Stability. J Phys Chem B 2024;128:1656-1667. [PMID: 38350894 PMCID: PMC10939730 DOI: 10.1021/acs.jpcb.3c08097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/15/2024]

Abstract

Single-point mutations in kinase proteins can affect their stability and fitness, and computational analysis of these effects can provide insights into the relationships among protein sequence, structure, and function for this enzyme family. To assess the impact of mutations on protein stability, we used a sequence-based Potts Hamiltonian model trained on a kinase family multiple-sequence alignment (MSA) to calculate the statistical energy (fitness) effects of mutations and compared these against relative folding free energies (ΔΔGs) calculated from all-atom molecular dynamics free energy perturbation (FEP) simulations in explicit solvent. The fitness effects of mutations in the Potts model (ΔEs) showed good agreement with experimental thermostability data (Pearson r = 0.68), similar to the correlation we observed with ΔΔGs predicted from structure-based relative FEP simulations. Recognizing the possible advantages of using Potts models to rapidly estimate protein stability effects of kinase mutations seen in cancer genomics data, we used the Potts statistical energy model to estimate the stability effects of 65 conservative and nonconservative mutations across three distinct kinases (Wee1, Abl1, and Cdc7) with somatic mutations reported in the Genomic Data Commons (GDC) database. The ΔEs of these mutations calculated from the Potts model are consistent with the corresponding ΔΔGs from FEP simulations (Pearson ratio of 0.72). The agreement between these methods suggests that the Potts model may be used as a sequence-based tool for high-throughput screening of mutational effects as part of a computational pipeline for predicting the stability effects of mutations. We also demonstrate how the scalability of the fitness-based Potts model calculations permits analyses that are not easily accessed using FEP simulations. 
To this end, we employed site-saturation mutagenesis in the Potts model in order to investigate the relative stability effects of mutations seen in different cancer evolutionary scenarios. We used this approach to analyze the effects of drug pressure in Abl kinase by contrasting the relative fitness penalties of somatic mutations seen in miscellaneous cancer types with those calculated for mutations associated with cancer drug resistance. We observed that, in contrast to somatic mutations of Abl seen in various tumors that appear to have evolved neutrally, cancer mutations that evolved under drug pressure in Abl-targeted therapies tend to preserve enzyme stability.
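
Illustrative note: the comparison described in this abstract (Potts statistical energy changes, ΔE, against free-energy differences, ΔΔG) can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the array layout of the fields `h` and couplings `J`, and the sign convention E(s) = -Σ h - Σ J are all hypothetical choices made here for illustration, and conventions differ between Potts implementations.

```python
import numpy as np


def potts_energy(seq, h, J):
    """Statistical energy of an integer-encoded sequence.

    Assumed convention (hypothetical): E(s) = -sum_i h[i, s_i]
    - sum_{i<j} J[i, j, s_i, s_j], with h of shape (L, q) and
    J of shape (L, L, q, q).
    """
    length = len(seq)
    energy = 0.0
    for i in range(length):
        energy -= float(h[i, seq[i]])
        for j in range(i + 1, length):
            energy -= float(J[i, j, seq[i], seq[j]])
    return energy


def delta_e(seq, pos, new_state, h, J):
    """Energy change for a single-point mutation at position `pos`."""
    mutant = list(seq)
    mutant[pos] = new_state
    return potts_energy(mutant, h, J) - potts_energy(seq, h, J)


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    return float(np.corrcoef(x, y)[0, 1])
```

Given a list of mutations, one would collect `delta_e` for each and pass the values to `pearson_r` together with the matching ΔΔG estimates. Site-saturation mutagenesis in this picture amounts to looping `new_state` over all q states at every position.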

Nartey C, Koo HJ, Laurendon C, Shaik HZ, O’maille P, Noel JP, Morcos F. Coevolutionary Information Captures Catalytic Functions and Reveals Divergent Roles of Terpene Synthase Interdomain Connections. Biochemistry 2024;63:355-366. [PMID: 38206111 PMCID: PMC10851433 DOI: 10.1021/acs.biochem.3c00578] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/24/2023] [Revised: 12/22/2023] [Accepted: 12/27/2023] [Indexed: 01/12/2024]

Alvarez S, Nartey CM, Mercado N, de la Paz JA, Huseinbegovic T, Morcos F. In vivo functional phenotypes from a computational epistatic model of evolution. Proc Natl Acad Sci U S A 2024;121:e2308895121. [PMID: 38285950 PMCID: PMC10861889 DOI: 10.1073/pnas.2308895121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2023] [Accepted: 12/19/2023] [Indexed: 01/31/2024] Open

Hayes RL, Nixon CF, Marqusee S, Brooks CL. Selection pressures on evolution of ribonuclease H explored with rigorous free-energy-based design. Proc Natl Acad Sci U S A 2024;121:e2312029121. [PMID: 38194446 PMCID: PMC10801872 DOI: 10.1073/pnas.2312029121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/14/2023] [Accepted: 11/22/2023] [Indexed: 01/11/2024] Open

Abstract

Understanding natural protein evolution and designing novel proteins are motivating interest in development of high-throughput methods to explore large sequence spaces. In this work, we demonstrate the application of multisite λ dynamics (MSλD), a rigorous free energy simulation method, and chemical denaturation experiments to quantify evolutionary selection pressure from sequence-stability relationships and to address questions of design. This study examines a mesophilic phylogenetic clade of ribonuclease H (RNase H), furthering its extensive characterization in earlier studies, focusing on E. coli RNase H (ecRNH) and a more stable consensus sequence (AncCcons) differing at 15 positions. The stabilities of 32,768 chimeras between these two sequences were computed using the MSλD framework. The most stable and least stable chimeras were predicted and tested along with several other sequences, revealing a designed chimera with approximately the same stability increase as AncCcons, but requiring only half the mutations. Comparing the computed stabilities with experiment for 12 sequences reveals a Pearson correlation of 0.86 and root mean squared error of 1.18 kcal/mol, an unprecedented level of accuracy well beyond less rigorous computational design methods. We then quantified selection pressure using a simple evolutionary model in which sequences are selected according to the Boltzmann factor of their stability. Selection temperatures from 110 to 168 K are estimated in three ways by comparing experimental and computational results to evolutionary models. These estimates indicate selection pressure is high, which has implications for evolutionary dynamics and for the accuracy required for design, and suggests accurate high-throughput computational methods like MSλD may enable more effective protein design.
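
Illustrative note: two quantities in this abstract follow from simple arithmetic. Two parent sequences differing at 15 positions give 2^15 = 32,768 chimeras, and selection by the Boltzmann factor of stability can be written as a normalized weight over sequences. The sketch below makes assumptions that are not in the abstract: stabilities are taken as folding free energies in kcal/mol (more negative means more stable), the weight is exp(-ΔG / (k_B T_sel)), and all function names are hypothetical.

```python
import math
from itertools import product

K_B = 0.0019872  # Boltzmann constant in kcal/(mol*K), molar units


def count_chimeras(n_positions):
    """Number of chimeras of two parents that differ at n_positions sites."""
    return 2 ** n_positions


def enumerate_chimeras(n_positions):
    """Yield each chimera as a tuple of 0/1 flags (0 = parent A, 1 = parent B)."""
    return product((0, 1), repeat=n_positions)


def boltzmann_weights(folding_free_energies, t_sel):
    """Normalized selection weights p_i proportional to exp(-dG_i / (k_B * t_sel)).

    Assumes dG_i is a folding free energy in kcal/mol, so lower is more stable.
    """
    scaled = [-g / (K_B * t_sel) for g in folding_free_energies]
    shift = max(scaled)  # subtract the maximum to avoid overflow in exp
    unnormalized = [math.exp(s - shift) for s in scaled]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]
```

Under this convention a lower selection temperature concentrates the weight on the most stable sequences, which is one way to read the statement that selection temperatures of 110 to 168 K indicate high selection pressure.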

García-Morales A, Balleza D. Exploring Flexibility and Folding Patterns Throughout Time in Voltage Sensors. J Mol Evol 2023;91:819-836. [PMID: 37955698 DOI: 10.1007/s00239-023-10140-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2023] [Accepted: 10/27/2023] [Indexed: 11/14/2023]

Abstract

The voltage-sensing domain (VSD) is a module capable of responding to changes in the membrane potential through conformational changes and facilitating electromechanical coupling to open a pore gate, activate proton permeation pathways, or promote enzymatic activity in some membrane-anchored phosphatases. To carry out these functions, this module acts cooperatively through conformational changes. The VSD is formed by four transmembrane segments (S1-S4) but the S4 segment is critical since it carries positively charged residues, mainly Arg or Lys, which require an aqueous environment for its proper function. The discovery of this module in voltage-gated ion channels (VGICs), proton channels (Hv1), and voltage sensor-containing phosphatases (VSPs) has expanded our understanding of the principle of modularity in the voltage-sensing mechanism of these proteins. Here, by sequence comparison and the evaluation of the relationship between sequence composition, intrinsic flexibility, and structural analysis in 14 selected representatives of these three major protein groups, we report five interesting differences in the folding patterns of the VSD both in prokaryotes and eukaryotes. 
Our main findings indicate that this module is highly conserved throughout the evolutionary scale, however: (1) segments S1 to S3 in eukaryotes are significantly more hydrophobic than those present in prokaryotes; (2) the S4 segment has retained its hydrophilic character; (3) in eukaryotes the extramembranous linkers are significantly larger and more flexible in comparison with those present in prokaryotes; (4) the sensors present in the kHv1 proton channel and the ciVSP phosphatase, both of eukaryotic origin, exhibit relationships of flexibility and folding patterns very close to the typical ones found in prokaryotic voltage sensors; and (5) archaeal channels KvAP and MVP have flexibility profiles which are clearly contrasting in the S3-S4 region, which could explain their divergent activation mechanisms. Finally, to elucidate the obscure origins of this module, we show further evidence for a possible connection between voltage sensors and TolQ proteins.

Gaudreault F, Corbeil CR, Sulea T. Enhanced antibody-antigen structure prediction from molecular docking using AlphaFold2. Sci Rep 2023;13:15107. [PMID: 37704686 PMCID: PMC10499836 DOI: 10.1038/s41598-023-42090-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2023] [Accepted: 09/05/2023] [Indexed: 09/15/2023] Open

Li Y, Peng HQ, Yang LQ. Structural determinants underlying high-temperature adaptation of thermophilic xylanase from hot-spring microorganisms. Front Microbiol 2023;14:1210420. [PMID: 37485531 PMCID: PMC10360402 DOI: 10.3389/fmicb.2023.1210420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/22/2023] [Accepted: 06/21/2023] [Indexed: 07/25/2023] Open

Alvarez S, Nartey CM, Mercado N, de la Paz A, Huseinbegovic T, Morcos F. In vivo functional phenotypes from a computational epistatic model of evolution. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.05.24.542176. [PMID: 37292895 PMCID: PMC10245989 DOI: 10.1101/2023.05.24.542176] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]

Gizzio J, Thakur A, Haldane A, Levy RM. Evolutionary divergence in the conformational landscapes of tyrosine vs serine/threonine kinases. eLife 2022;11:83368. [PMID: 36562610 PMCID: PMC9822262 DOI: 10.7554/elife.83368] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/09/2022] [Accepted: 12/22/2022] [Indexed: 12/24/2022] Open

Abstract

Inactive conformations of protein kinase catalytic domains where the DFG motif has a "DFG-out" orientation and the activation loop is folded present a druggable binding pocket that is targeted by FDA-approved 'type-II inhibitors' in the treatment of cancers. Tyrosine kinases (TKs) typically show strong binding affinity with a wide spectrum of type-II inhibitors while serine/threonine kinases (STKs) usually bind more weakly which we suggest here is due to differences in the folded to extended conformational equilibrium of the activation loop between TKs vs. STKs. To investigate this, we use sequence covariation analysis with a Potts Hamiltonian statistical energy model to guide absolute binding free-energy molecular dynamics simulations of 74 protein-ligand complexes. Using the calculated binding free energies together with experimental values, we estimated free-energy costs for the large-scale (~17-20 Å) conformational change of the activation loop by an indirect approach, circumventing the very challenging problem of simulating the conformational change directly. We also used the Potts statistical potential to thread large sequence ensembles over active and inactive kinase states. The structure-based and sequence-based analyses are consistent; together they suggest TKs evolved to have free-energy penalties for the classical 'folded activation loop' DFG-out conformation relative to the active conformation, that is, on average, 4-6 kcal/mol smaller than the corresponding values for STKs. Potts statistical energy analysis suggests a molecular basis for this observation, wherein the activation loops of TKs are more weakly 'anchored' against the catalytic loop motif in the active conformation and form more stable substrate-mimicking interactions in the inactive conformation. 
These results provide insights into the molecular basis for the divergent functional properties of TKs and STKs, and have pharmacological implications for the target selectivity of type-II inhibitors.

Sánchez IE, Galpern EA, Garibaldi MM, Ferreiro DU. Molecular Information Theory Meets Protein Folding. J Phys Chem B 2022;126:8655-8668. [PMID: 36282961 DOI: 10.1021/acs.jpcb.2c04532] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/11/2023]

Colberg M, Schofield J. Configurational entropy, transition rates, and optimal interactions for rapid folding in coarse-grained model proteins. J Chem Phys 2022;157:125101. [PMID: 36182418 DOI: 10.1063/5.0098612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

Gerardos A, Dietler N, Bitbol AF. Correlations from structure and phylogeny combine constructively in the inference of protein partners from sequences. PLoS Comput Biol 2022;18:e1010147. [PMID: 35576238 PMCID: PMC9135348 DOI: 10.1371/journal.pcbi.1010147] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2021] [Revised: 05/26/2022] [Accepted: 04/27/2022] [Indexed: 11/19/2022] Open

Chi H, Zhou Q, Tutol JN, Phelps SM, Lee J, Kapadia P, Morcos F, Dodani SC. Coupling a Live Cell Directed Evolution Assay with Coevolutionary Landscapes to Engineer an Improved Fluorescent Rhodopsin Chloride Sensor. ACS Synth Biol 2022;11:1627-1638. [PMID: 35389621 PMCID: PMC9184236 DOI: 10.1021/acssynbio.2c00033] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]

Hayes RL, Vilseck JZ, Brooks CL. Addressing Intersite Coupling Unlocks Large Combinatorial Chemical Spaces for Alchemical Free Energy Methods. J Chem Theory Comput 2022;18:2114-2123. [PMID: 35255214 PMCID: PMC9700482 DOI: 10.1021/acs.jctc.1c00948] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]

Enhancing computational enzyme design by a maximum entropy strategy. Proc Natl Acad Sci U S A 2022;119:2122355119. [PMID: 35135886 PMCID: PMC8851541 DOI: 10.1073/pnas.2122355119] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 01/03/2022] [Indexed: 01/16/2023] Open

Do HN, Haldane A, Levy RM, Miao Y. Unique features of different classes of G-protein-coupled receptors revealed from sequence coevolutionary and structural analysis. Proteins 2022;90:601-614. [PMID: 34599827 PMCID: PMC8738117 DOI: 10.1002/prot.26256] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2021] [Revised: 09/21/2021] [Accepted: 09/27/2021] [Indexed: 02/03/2023]

Röder K, Wales DJ. The Energy Landscape Perspective: Encoding Structure and Function for Biomolecules. Front Mol Biosci 2022;9:820792. [PMID: 35155579 PMCID: PMC8829389 DOI: 10.3389/fmolb.2022.820792] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/23/2021] [Accepted: 01/07/2022] [Indexed: 12/02/2022] Open

Kazan IC, Sharma P, Rahman MI, Bobkov A, Fromme R, Ghirlanda G, Ozkan SB. Design of novel cyanovirin-N variants by modulation of binding dynamics through distal mutations. eLife 2022;11:67474. [PMID: 36472898 PMCID: PMC9725752 DOI: 10.7554/elife.67474] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/12/2021] [Accepted: 11/28/2022] [Indexed: 12/07/2022] Open

Miyazawa S. Boltzmann Machine Learning and Regularization Methods for Inferring Evolutionary Fields and Couplings From a Multiple Sequence Alignment. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022;19:328-342. [PMID: 32396099 DOI: 10.1109/tcbb.2020.2993232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]

Chu WT, Yan Z, Chu X, Zheng X, Liu Z, Xu L, Zhang K, Wang J. Physics of biomolecular recognition and conformational dynamics. REPORTS ON PROGRESS IN PHYSICS. PHYSICAL SOCIETY (GREAT BRITAIN) 2021;84:126601. [PMID: 34753115 DOI: 10.1088/1361-6633/ac3800] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/26/2021] [Accepted: 11/09/2021] [Indexed: 06/13/2023]

Bisardi M, Rodriguez-Rivas J, Zamponi F, Weigt M. Modeling sequence-space exploration and emergence of epistatic signals in protein evolution. Mol Biol Evol 2021;39:6424001. [PMID: 34751386 PMCID: PMC8789065 DOI: 10.1093/molbev/msab321] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/07/2023] Open

Shen Y, Olson ER, Van Deelen TR. Spatially explicit modeling of community occupancy using Markov Random Field models with imperfect observation: Mesocarnivores in Apostle Islands National Lakeshore. Ecol Modell 2021. [DOI: 10.1016/j.ecolmodel.2021.109712] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]

Barrat-Charlaix P, Muntoni AP, Shimagaki K, Weigt M, Zamponi F. Sparse generative modeling via parameter reduction of Boltzmann machines: Application to protein-sequence families. Phys Rev E 2021;104:024407. [PMID: 34525554 DOI: 10.1103/physreve.104.024407] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/17/2021] [Accepted: 07/19/2021] [Indexed: 11/07/2022]

On the effect of phylogenetic correlations in coevolution-based contact prediction in proteins. PLoS Comput Biol 2021;17:e1008957. [PMID: 34029316 PMCID: PMC8177639 DOI: 10.1371/journal.pcbi.1008957] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/15/2020] [Revised: 06/04/2021] [Accepted: 04/09/2021] [Indexed: 12/04/2022] Open

Abstract

Coevolution-based contact prediction, either directly by coevolutionary couplings resulting from global statistical sequence models or using structural supervision and deep learning, has found widespread application in protein-structure prediction from sequence. However, one of the basic assumptions in global statistical modeling is that sequences form an at least approximately independent sample of an unknown probability distribution, which is to be learned from data. In the case of protein families, this assumption is obviously violated by phylogenetic relations between protein sequences. It has turned out to be notoriously difficult to take phylogenetic correlations into account in coevolutionary model learning. Here, we propose a complementary approach: we develop strategies to randomize or resample sequence data, such that conservation patterns and phylogenetic relations are preserved, while intrinsic (i.e. structure- or function-based) coevolutionary couplings are removed. A comparison between the results of Direct Coupling Analysis applied to real and to resampled data shows that the largest coevolutionary couplings, i.e. those used for contact prediction, are only weakly influenced by phylogeny. However, the phylogeny-induced spurious couplings in the resampled data are compatible in size with the first false-positive contact predictions from real data. Dissecting functional from phylogeny-induced couplings might therefore extend accurate contact predictions to the range of intermediate-size couplings.

Many homologous protein families contain thousands of highly diverged amino-acid sequences, which fold into close-to-identical three-dimensional structures and fulfill almost identical biological tasks. Global coevolutionary models, like those inferred by the Direct Coupling Analysis (DCA), assume that families can be considered as samples of some unknown statistical model, and that the parameters of these models represent evolutionary constraints acting on protein sequences. To learn these models from data, DCA and related approaches also have to assume that the distinct sequences in a protein family are close to independent, while in reality they are characterized by involved hierarchical phylogenetic relationships. Here we propose null models for sequence alignments, which maintain the patterns of amino-acid conservation and phylogeny contained in the data but destroy the coevolutionary couplings that are frequently used in protein structure prediction. We find that phylogeny actually induces spurious non-zero couplings. These are, however, significantly smaller than the largest couplings derived from natural sequences, and therefore have little influence on the first predicted contacts. However, in the range of intermediate couplings, they may lead to statistically significant effects. Dissecting phylogenetic from functional couplings might therefore extend the range of accurately predicted structural contacts down to smaller coupling strengths than those currently used.

Sequeiros-Borja CE, Surpeta B, Brezovsky J. Recent advances in user-friendly computational tools to engineer protein function. Brief Bioinform 2021;22:bbaa150. [PMID: 32743637 PMCID: PMC8138880 DOI: 10.1093/bib/bbaa150] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/24/2020] [Revised: 06/03/2020] [Accepted: 06/16/2020] [Indexed: 12/14/2022] Open

Zou T, Woodrum BW, Halloran N, Campitelli P, Bobkov AA, Ghirlanda G, Ozkan SB. Local Interactions That Contribute Minimal Frustration Determine Foldability. J Phys Chem B 2021;125:2617-2626. [PMID: 33687216 DOI: 10.1021/acs.jpcb.1c00364] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]

Gianni S, Freiberger MI, Jemth P, Ferreiro DU, Wolynes PG, Fuxreiter M. Fuzziness and Frustration in the Energy Landscape of Protein Folding, Function, and Assembly. Acc Chem Res 2021;54:1251-1259. [PMID: 33550810 PMCID: PMC8023570 DOI: 10.1021/acs.accounts.0c00813] [Citation(s) in RCA: 72] [Impact Index Per Article: 24.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/02/2020] [Indexed: 12/20/2022]

Abstract

Are all protein interactions fully optimized? Do suboptimal interactions compromise specificity? What is the functional impact of frustration? Why does evolution not optimize some contacts? Proteins and their complexes are best described as ensembles of states populating an energy landscape. These ensembles vary in breadth from narrow ensembles clustered around a single average X-ray structure to broader ensembles encompassing a few different functional "taxonomic" states on to near continua of rapidly interconverting conformations, which are called "fuzzy" or even "intrinsically disordered". Here we aim to provide a comprehensive framework for confronting the structural and dynamical continuum of protein assemblies by combining the concepts of energetic frustration and interaction fuzziness. The diversity of the protein structural ensemble arises from the frustrated conflicts between the interactions that create the energy landscape. When frustration is minimal after folding, it results in a narrow ensemble, but residual frustrated interactions result in fuzzy ensembles, and this fuzziness allows a versatile repertoire of biological interactions. Here we discuss how fuzziness and frustration play off each other as proteins fold and assemble, viewing their significance from energetic, functional, and evolutionary perspectives. We demonstrate, in particular, that the common physical origin of both concepts is related to the ruggedness of the energy landscapes, intramolecular in the case of frustration and intermolecular in the case of fuzziness. Within this framework, we show that alternative sets of suboptimal contacts may encode specificity without achieving a single structural optimum. Thus, we demonstrate that structured complexes may not be optimized, and energetic frustration is realized via different sets of contacts leading to multiplicity of specific complexes.
Furthermore, we propose that these suboptimal, frustrated, or fuzzy interactions are under evolutionary selection and expand the biological repertoire by providing a multiplicity of biological activities. In accord, we show that non-native interactions in folding or interaction landscapes can cooperate to generate diverse functional states, which are essential to facilitate adaptation to different cellular conditions. Thus, we propose that not fully optimized structures may actually be beneficial for biological activities of proteins via an alternative set of suboptimal interactions. The importance of such variability has not been recognized across different areas of biology. This account provides a modern view on folding, function, and assembly across the protein universe. The physical framework presented here is applicable to the structure and dynamics continuum of proteins and opens up new perspectives for drug design involving not fully structured, highly dynamic protein assemblies.

Crippa M, Andreghetti D, Capelli R, Tiana G. Evolution of frustrated and stabilising contacts in reconstructed ancient proteins. Eur Biophys J 2021;50:699-712. [PMID: 33569610 PMCID: PMC8260555 DOI: 10.1007/s00249-021-01500-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/17/2020] [Revised: 12/14/2020] [Accepted: 01/13/2021] [Indexed: 11/30/2022]

Thadani NN, Zhou Q, Reyes Gamas K, Butler S, Bueno C, Schafer NP, Morcos F, Wolynes PG, Suh J. Frustration and Direct-Coupling Analyses to Predict Formation and Function of Adeno-Associated Virus. Biophys J 2020;120:489-503. [PMID: 33359833 DOI: 10.1016/j.bpj.2020.12.018] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2020] [Revised: 11/08/2020] [Accepted: 12/08/2020] [Indexed: 01/03/2023] Open

Abstract

Adeno-associated virus (AAV) is a promising gene therapy vector because of its efficient gene delivery and relatively mild immunogenicity. To improve delivery target specificity, researchers use combinatorial and rational library design strategies to generate novel AAV capsid variants. These approaches frequently propose high proportions of nonforming or noninfective capsid protein sequences that reduce the effective depth of synthesized vector DNA libraries, thereby raising the discovery cost of novel vectors. We evaluated two computational techniques for their ability to estimate the impact of residue mutations on AAV capsid protein-protein interactions and thus predict changes in vector fitness, reasoning that these approaches might inform the design of functionally enriched AAV libraries and accelerate therapeutic candidate identification. The Frustratometer computes an energy function derived from the energy landscape theory of protein folding. Direct-coupling analysis (DCA) is a statistical framework that captures residue coevolution within proteins. We applied the Frustratometer to select candidate protein residues predicted to favor assembled or disassembled capsid states, then predicted mutation effects at these sites using the Frustratometer and DCA. Capsid mutants were experimentally assessed for changes in virus formation, stability, and transduction ability. The Frustratometer-based metric showed a counterintuitive correlation with viral stability, whereas a DCA-derived metric was highly correlated with virus transduction ability in the small population of residues studied. Our results suggest that coevolutionary models may be able to elucidate complex capsid residue-residue interaction networks essential for viral function, but further study is needed to understand the relationship between protein energy simulations and viral capsid metastability.

Hu L, Hu P, Luo X, Yuan X, You ZH. Incorporating the Coevolving Information of Substrates in Predicting HIV-1 Protease Cleavage Sites. IEEE/ACM Trans Comput Biol Bioinform 2020;17:2017-2028. [PMID: 31056514 DOI: 10.1109/tcbb.2019.2914208] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]

Tian P, Best RB. Exploring the sequence fitness landscape of a bridge between protein folds. PLoS Comput Biol 2020;16:e1008285. [PMID: 33048928 PMCID: PMC7553338 DOI: 10.1371/journal.pcbi.1008285] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/22/2020] [Accepted: 08/24/2020] [Indexed: 12/15/2022] Open

Molecular origins of folding rate differences in the thioredoxin family. Biochem J 2020;477:1083-1087. [DOI: 10.1042/bcj20190864] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/26/2019] [Revised: 02/19/2020] [Accepted: 02/20/2020] [Indexed: 12/13/2022]

Epistatic contributions promote the unification of incompatible models of neutral molecular evolution. Proc Natl Acad Sci U S A 2020;117:5873-5882. [PMID: 32123092 PMCID: PMC7084075 DOI: 10.1073/pnas.1913071117] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022] Open

Abstract

Mathematical models of evolution help us understand mechanisms driving protein-sequence change. Previous models recapitulate a disjoint subset of statistical features of natural sequences. We present a neutral evolution model that unifies features including extreme variance of the molecular clock’s tick rate and the observation of an evolutionary Stokes shift, an irreversible effect of mutations in the fitness landscape during sequence evolution. We show that interactions between amino acid sites, which inform our fitness metric, are required to observe these features. These interactions are inferred by using direct coupling analysis, which has been successfully utilized to predict protein structures, dynamics, and complexes from coevolutionary information. We anticipate our model will have applications in phylogenetics, ancestral reconstruction of sequences, and protein design.

We introduce a model of amino acid sequence evolution that accounts for the statistical behavior of real sequences induced by epistatic interactions. We base the model dynamics on parameters derived from multiple sequence alignments analyzed by using direct coupling analysis methodology. Known statistical properties such as overdispersion, heterotachy, and gamma-distributed rate-across-sites are shown to be emergent properties of this model while being consistent with neutral evolution theory, thereby unifying observations from previously disjointed evolutionary models of sequences. The relationship between site restriction and heterotachy is characterized by tracking the effective alphabet dynamics of sites. We also observe an evolutionary Stokes shift in the fitness of sequences that have undergone evolution under our simulation. By analyzing the structural information of some proteins, we corroborate that the strongest Stokes shifts derive from sites that physically interact in networks near biochemically important regions. Perspectives on the implementation of our model in the context of the molecular clock are discussed.

Rivoire O. Parsimonious evolutionary scenario for the origin of allostery and coevolution patterns in proteins. Phys Rev E 2020;100:032411. [PMID: 31640027 DOI: 10.1103/physreve.100.032411] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2018] [Indexed: 12/16/2022]

Rizzato F, Coucke A, de Leonardis E, Barton JP, Tubiana J, Monasson R, Cocco S. Inference of compressed Potts graphical models. Phys Rev E 2020;101:012309. [PMID: 32069678 DOI: 10.1103/physreve.101.012309] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/31/2019] [Indexed: 06/10/2023]

Hayes RL, Vilseck JZ, Brooks CL. Approaching protein design with multisite λ dynamics: Accurate and scalable mutational folding free energies in T4 lysozyme. Protein Sci 2019;27:1910-1922. [PMID: 30175503 DOI: 10.1002/pro.3500] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2018] [Revised: 08/06/2018] [Accepted: 08/15/2018] [Indexed: 12/14/2022]

Rodriguez Horta E, Barrat-Charlaix P, Weigt M. Toward Inferring Potts Models for Phylogenetically Correlated Sequence Data. Entropy 2019;21:1090. [PMCID: PMC7514434 DOI: 10.3390/e21111090] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/25/2019] [Accepted: 11/06/2019] [Indexed: 06/16/2023]