1
Cocco S, Posani L, Monasson R. Functional effects of mutations in proteins can be predicted and interpreted by guided selection of sequence covariation information. Proc Natl Acad Sci U S A 2024; 121:e2312335121. [PMID: 38889151 PMCID: PMC11214004 DOI: 10.1073/pnas.2312335121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Accepted: 04/21/2024] [Indexed: 06/20/2024] Open
Abstract
Predicting the effects of one or more mutations on the in vivo or in vitro properties of a wild-type protein is a major computational challenge because of epistasis, that is, interactions between amino acids in the sequence. We introduce a computationally efficient procedure to build minimal epistatic models that predict mutational effects by combining evolutionary (homologous sequence) data with a small amount of mutational-scan data. Mutagenesis measurements guide the selection of links in a sparse graphical model, while the parameters on the nodes and the edges are inferred from sequence data. We show, on 10 mutational scans, that our pipeline achieves performance comparable to state-of-the-art deep networks trained on much more data, while requiring far fewer parameters and hence being more interpretable. In particular, the identified interactions adapt to the wild-type protein and to the fitness or biochemical property experimentally measured, mostly focus on key functional sites, and are not necessarily related to structural contacts. Therefore, our method is able to extract information relevant for one mutational experiment from homologous sequence data reflecting the multitude of structural and functional constraints acting on proteins throughout evolution.
Affiliation(s)
- Simona Cocco
  - Laboratory of Physics of the Ecole Normale Supérieure, CNRS UMR8023 and Paris Sciences & Lettres (PSL) Research, Sorbonne Université, 75005 Paris, France
- Lorenzo Posani
  - Laboratory of Physics of the Ecole Normale Supérieure, CNRS UMR8023 and Paris Sciences & Lettres (PSL) Research, Sorbonne Université, 75005 Paris, France
- Rémi Monasson
  - Laboratory of Physics of the Ecole Normale Supérieure, CNRS UMR8023 and Paris Sciences & Lettres (PSL) Research, Sorbonne Université, 75005 Paris, France
2
Malbranke C, Bikard D, Cocco S, Monasson R, Tubiana J. Machine learning for evolutionary-based and physics-inspired protein design: Current and future synergies. Curr Opin Struct Biol 2023; 80:102571. [PMID: 36947951 DOI: 10.1016/j.sbi.2023.102571] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 08/01/2022] [Revised: 01/29/2023] [Accepted: 02/07/2023] [Indexed: 03/24/2023]
Abstract
Computational protein design facilitates the discovery of novel proteins with prescribed structure and functionality. Exciting designs were recently reported using novel data-driven methodologies that can be roughly divided into two categories: evolutionary-based and physics-inspired approaches. The former infer characteristic sequence features shared by sets of evolutionary-related proteins, such as conserved or coevolving positions, and recombine them to generate candidates with similar structure and function. The latter approaches estimate key biochemical properties, such as structure free energy, conformational entropy, or binding affinities using machine learning surrogates, and optimize them to yield improved designs. Here, we review recent progress along both tracks, discuss their strengths and weaknesses, and highlight opportunities for synergistic approaches.
Affiliation(s)
- Cyril Malbranke
  - Laboratory of Physics of the Ecole Normale Supérieure, PSL Research, CNRS UMR 8023, Sorbonne Université, Université de Paris, Paris, France
  - Institut Pasteur, Université Paris Cité, CNRS UMR 6047, Synthetic Biology, 75015 Paris, France
- David Bikard
  - Institut Pasteur, Université Paris Cité, CNRS UMR 6047, Synthetic Biology, 75015 Paris, France
- Simona Cocco
  - Laboratory of Physics of the Ecole Normale Supérieure, PSL Research, CNRS UMR 8023, Sorbonne Université, Université de Paris, Paris, France
- Rémi Monasson
  - Laboratory of Physics of the Ecole Normale Supérieure, PSL Research, CNRS UMR 8023, Sorbonne Université, Université de Paris, Paris, France
- Jérôme Tubiana
  - Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
3
Dietler N, Lupo U, Bitbol AF. Impact of phylogeny on structural contact inference from protein sequence data. J R Soc Interface 2023; 20:20220707. [PMID: 36751926 PMCID: PMC9905998 DOI: 10.1098/rsif.2022.0707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2023] Open
Abstract
Local and global inference methods have been developed to infer structural contacts from multiple sequence alignments of homologous proteins. They rely on correlations in amino acid usage at contacting sites. Because homologous proteins share a common ancestry, their sequences also feature phylogenetic correlations, which can impair contact inference. We investigate this effect by generating controlled synthetic data from a minimal model where the importance of contacts and of phylogeny can be tuned. We demonstrate that global inference methods, specifically Potts models, are more resilient to phylogenetic correlations than local methods, based on covariance or mutual information. This holds whether or not phylogenetic corrections are used, and may explain the success of global methods. We analyse the roles of selection strength and of phylogenetic relatedness. We show that sites that mutate early in the phylogeny yield false positive contacts. We consider natural data and realistic synthetic data, and our findings generalize to these cases. Our results highlight the impact of phylogeny on contact prediction from protein sequences and illustrate the interplay between the rich structure of biological data and inference.
Affiliation(s)
- Nicola Dietler
  - Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Umberto Lupo
  - Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Anne-Florence Bitbol
  - Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
4
Gerardos A, Dietler N, Bitbol AF. Correlations from structure and phylogeny combine constructively in the inference of protein partners from sequences. PLoS Comput Biol 2022; 18:e1010147. [PMID: 35576238 PMCID: PMC9135348 DOI: 10.1371/journal.pcbi.1010147] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/22/2021] [Revised: 05/26/2022] [Accepted: 04/27/2022] [Indexed: 11/19/2022] Open
Abstract
Inferring protein-protein interactions from sequences is an important task in computational biology. Recent methods based on Direct Coupling Analysis (DCA) or Mutual Information (MI) make it possible to find interaction partners among paralogs of two protein families. Does successful inference mainly rely on correlations from structural contacts or from phylogeny, or both? Do these two types of signal combine constructively or hinder each other? To address these questions, we generate and analyze synthetic data produced using a minimal model that allows us to control the amounts of structural constraints and phylogeny. We show that correlations from these two sources combine constructively to increase the performance of partner inference by DCA or MI. Furthermore, signal from phylogeny can rescue partner inference when signal from contacts becomes less informative, including in the realistic case where inter-protein contacts are restricted to a small subset of sites. We also demonstrate that DCA-inferred couplings between non-contact pairs of sites improve partner inference in the presence of strong phylogeny, while deteriorating it otherwise. Moreover, restricting to non-contact pairs of sites preserves inference performance in the presence of strong phylogeny. In a natural data set, as well as in realistic synthetic data based on it, we find that non-contact pairs of sites contribute positively to partner inference performance, and that restricting to them preserves performance, evidencing an important role of phylogeny.
Affiliation(s)
- Andonis Gerardos
  - Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Nicola Dietler
  - Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Anne-Florence Bitbol
  - Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
5
Extracting phylogenetic dimensions of coevolution reveals hidden functional signals. Sci Rep 2022; 12:820. [PMID: 35039514 PMCID: PMC8764114 DOI: 10.1038/s41598-021-04260-1] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 09/02/2021] [Accepted: 12/17/2021] [Indexed: 11/08/2022] Open
Abstract
Despite the structural and functional information contained in the statistical coupling between pairs of residues in a protein, coevolution associated with function is often obscured by artifactual signals such as genetic drift, which shapes a protein's phylogenetic history and gives rise to concurrent variation between protein sequences that is not driven by selection for function. Here, we introduce a background model for phylogenetic contributions of statistical coupling that separates the coevolution signal due to inter-clade and intra-clade sequence comparisons and demonstrate that coevolution can be measured on multiple phylogenetic timescales within a single protein. Our method, nested coevolution (NC), can be applied as an extension to any coevolution metric. We use NC to demonstrate that poorly conserved residues can nonetheless have important roles in protein function. Moreover, NC improved the structural-contact predictions of several coevolution-based methods, particularly in subsampled alignments with fewer sequences. NC also lowered the noise in detecting functional sectors of collectively coevolving residues. Sectors of coevolving residues identified after application of NC were more spatially compact and phylogenetically distinct from the rest of the protein, and strongly enriched for mutations that disrupt protein activity. Thus, our conceptualization of the phylogenetic separation of coevolution provides the potential to further elucidate relationships among protein evolution, function, and genetic diseases.
6
Mehrabiani KM, Cheng RR, Onuchic JN. Expanding Direct Coupling Analysis to Identify Heterodimeric Interfaces from Limited Protein Sequence Data. J Phys Chem B 2021; 125:11408-11417. [PMID: 34618469 DOI: 10.1021/acs.jpcb.1c07145] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Abstract
Direct coupling analysis (DCA) is a global statistical approach that uses information encoded in protein sequence data to predict spatial contacts in a three-dimensional structure of a folded protein. DCA has been widely used to predict the monomeric fold at amino acid resolution and to identify biologically relevant interaction sites within a folded protein. Going beyond single proteins, DCA has also been used to identify spatial contacts that stabilize the interaction in protein complex formation. However, extracting this higher order information necessary to predict dimer contacts presents a significant challenge. A DCA evolutionary signal is much stronger at the single protein level (intraprotein contacts) than at the protein-protein interface (interprotein contacts). Therefore, if DCA-derived information is to be used to predict the structure of these complexes, there is a need to identify statistically significant DCA predictions. We propose a simple Z-score measure that can filter good predictions despite noisy, limited data. This new methodology not only improves our prediction ability but also provides a quantitative measure for the validity of the prediction.
Affiliation(s)
- Kareem M Mehrabiani
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
  - Systems, Synthetic, and Physical Biology, Rice University, Houston, Texas 77005, United States
- Ryan R Cheng
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- José N Onuchic
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
  - Systems, Synthetic, and Physical Biology, Rice University, Houston, Texas 77005, United States
  - Department of Physics & Astronomy, Rice University, Houston, Texas 77005, United States
  - Department of Chemistry, Rice University, Houston, Texas 77005, United States
  - Department of Biosciences, Rice University, Houston, Texas 77005, United States
7
Wang W, Liu Q, Liu Q, Hendrickson WA. Conformational equilibria in allosteric control of Hsp70 chaperones. Mol Cell 2021; 81:3919-3933.e7. [PMID: 34453889 DOI: 10.1016/j.molcel.2021.07.039] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 02/16/2021] [Revised: 06/04/2021] [Accepted: 07/28/2021] [Indexed: 01/16/2023]
Abstract
Heat-shock proteins of 70 kDa (Hsp70s) are vital for all life and are notably important in protein folding. Hsp70s use ATP binding and hydrolysis at a nucleotide-binding domain (NBD) to control the binding and release of client polypeptides at a substrate-binding domain (SBD); however, the mechanistic basis for this allostery has been elusive. Here, we first characterize biochemical properties of selected domain-interface mutants in bacterial Hsp70 DnaK. We then develop a theoretical model for allosteric equilibria among Hsp70 conformational states to explain the observations: a restraining state, Hsp70R-ATP, restricts ATP hydrolysis and binds peptides poorly, whereas a stimulating state, Hsp70S-ATP, hydrolyzes ATP rapidly and has high intrinsic substrate affinity but rapid binding kinetics. We support this model for allosteric regulation with DnaK structures obtained in the postulated stimulating state S with biochemical tests of the S-state interface and with improved peptide-binding-site definition in an R-state structure.
Affiliation(s)
- Wei Wang
  - Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA
  - Department of Biological Sciences, Columbia University, New York, NY 10027, USA
- Qinglian Liu
  - Department of Physiology and Biophysics, Virginia Commonwealth University, Richmond, VA 23298, USA
- Qun Liu
  - Biology Department, Brookhaven National Laboratory, Upton, NY 11973, USA
- Wayne A Hendrickson
  - Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA
  - Department of Physiology and Cellular Biophysics, Columbia University, New York, NY 10032, USA
8
Malbranke C, Bikard D, Cocco S, Monasson R. Improving sequence-based modeling of protein families using secondary structure quality assessment. Bioinformatics 2021; 37:4083-4090. [PMID: 34117879 PMCID: PMC9502231 DOI: 10.1093/bioinformatics/btab442] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2021] [Revised: 06/03/2021] [Accepted: 06/16/2021] [Indexed: 12/03/2022] Open
Abstract
Motivation: Modeling of protein family sequence distribution from homologous sequence data has recently received considerable attention, in particular for structure and function predictions, as well as for protein design. In particular, direct coupling analysis, a method to infer effective pairwise interactions between residues, was shown to capture important structural constraints and to successfully generate functional protein sequences. Building on this and other graphical models, we introduce a new framework to assess the quality of the secondary structures of the generated sequences with respect to reference structures for the family.
Results: We introduce two scoring functions, called Dot Product and Pattern Matching, that characterize how likely the secondary structure of a protein sequence is to match a reference structure. We test these scores on published experimental protein mutagenesis and design datasets, and show improvement in the detection of nonfunctional sequences. We also show that use of these scores helps reject nonfunctional sequences generated by graphical models (Restricted Boltzmann Machines) learned from homologous sequence alignments.
Availability and implementation: Data and code available at https://github.com/CyrilMa/ssqa
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Cyril Malbranke
  - Laboratory of Physics of the Ecole Normale Superieure, PSL Research, CNRS UMR 8023, Sorbonne Université, Université de Paris, Paris, France
  - Synthetic Biology, Microbiology Department, Institut Pasteur, Paris, France
- David Bikard
  - Synthetic Biology, Microbiology Department, Institut Pasteur, Paris, France
- Simona Cocco
  - Laboratory of Physics of the Ecole Normale Superieure, PSL Research, CNRS UMR 8023, Sorbonne Université, Université de Paris, Paris, France
- Rémi Monasson
  - Laboratory of Physics of the Ecole Normale Superieure, PSL Research, CNRS UMR 8023, Sorbonne Université, Université de Paris, Paris, France
9
Mayer MP. The Hsp70-Chaperone Machines in Bacteria. Front Mol Biosci 2021; 8:694012. [PMID: 34164436 PMCID: PMC8215388 DOI: 10.3389/fmolb.2021.694012] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 04/12/2021] [Accepted: 05/20/2021] [Indexed: 12/02/2022] Open
Abstract
The ATP-dependent Hsp70s are evolutionarily conserved molecular chaperones that constitute central hubs of the cellular protein quality surveillance network. None of the other main chaperone families (Tig, GroELS, HtpG, IbpA/B, ClpB) has been assigned a comparable range of functions. Through a multitude of functions, Hsp70s are involved in many cellular control circuits for maintaining protein homeostasis and have been recognized as key factors for cell survival. Three mechanistic properties of Hsp70s are the basis for their high versatility. First, Hsp70s bind to short degenerate sequence motifs within their client proteins. Second, Hsp70 chaperones switch in a nucleotide-controlled manner between a state of low affinity for client proteins and a state of high affinity for clients. Third, Hsp70s are targeted to their clients by a large number of cochaperones of the J-domain protein (JDP) family, and the lifetime of the Hsp70-client complex is regulated by nucleotide exchange factors (NEF). In this review I will discuss advances in the understanding of the molecular mechanism of the Hsp70 chaperone machinery, focusing mostly on the bacterial Hsp70 DnaK, and will compare the two other prokaryotic Hsp70s, HscA and HscC, with DnaK.
Affiliation(s)
- Matthias P Mayer
  - Center for Molecular Biology of Heidelberg University (ZMBH), DKFZ-ZMBH-Alliance, Heidelberg, Germany
10
Gandarilla-Pérez CA, Mergny P, Weigt M, Bitbol AF. Statistical physics of interacting proteins: Impact of dataset size and quality assessed in synthetic sequences. Phys Rev E 2020; 101:032413. [PMID: 32290011 DOI: 10.1103/physreve.101.032413] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/23/2019] [Accepted: 03/04/2020] [Indexed: 11/07/2022]
Abstract
Identifying protein-protein interactions is crucial for a systems-level understanding of the cell. Recently, algorithms based on inverse statistical physics, e.g., direct coupling analysis (DCA), have made it possible to use evolutionarily related sequences to address two conceptually related inference tasks: finding pairs of interacting proteins and identifying pairs of residues which form contacts between interacting proteins. Here we address two underlying questions: How are the performances of both inference tasks related? How does performance depend on dataset size and quality? To this end, we formalize both tasks using Ising models defined over stochastic block models, with individual blocks representing single proteins and interblock couplings representing protein-protein interactions; controlled synthetic sequence data are generated by Monte Carlo simulations. We show that DCA is able to address both inference tasks accurately when sufficiently large training sets of known interaction partners are available and that an iterative pairing algorithm makes predictions possible even without a training set. Noise in the training data deteriorates performance. In both tasks we find a quadratic scaling relating dataset quality and size that is consistent with noise adding in square-root fashion and signal adding linearly when increasing the dataset. This implies that it is generally good to incorporate more data even if their quality is imperfect, thereby shedding light on the empirically observed performance of DCA applied to natural protein sequences.
Affiliation(s)
- Carlos A Gandarilla-Pérez
  - Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Laboratoire de Biologie Computationnelle et Quantitative (LCQB, UMR 7238), F-75005 Paris, France
  - Facultad de Física, Universidad de la Habana, San Lázaro y L, Vedado, Habana 4, CP-10400, Cuba
- Pierre Mergny
  - Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Laboratoire de Biologie Computationnelle et Quantitative (LCQB, UMR 7238), F-75005 Paris, France
  - Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Laboratoire Jean Perrin (LJP, UMR 8237), F-75005 Paris, France
- Martin Weigt
  - Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Laboratoire de Biologie Computationnelle et Quantitative (LCQB, UMR 7238), F-75005 Paris, France
- Anne-Florence Bitbol
  - Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Laboratoire Jean Perrin (LJP, UMR 8237), F-75005 Paris, France
  - Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland
11
Fantini M, Lisi S, De Los Rios P, Cattaneo A, Pastore A. Protein Structural Information and Evolutionary Landscape by In Vitro Evolution. Mol Biol Evol 2020; 37:1179-1192. [PMID: 31670785 PMCID: PMC7086169 DOI: 10.1093/molbev/msz256] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Indexed: 11/13/2022] Open
Abstract
Protein structure is tightly intertwined with function according to the laws of evolution. Understanding how structure determines function has been the aim of structural biology for decades. Here, we have wondered instead whether it is possible to exploit the function for which a protein was evolutionarily selected to gain information on protein structure and on the landscape explored during the early stages of molecular and natural evolution. To answer this question, we developed a new methodology, which we named CAMELS (Coupling Analysis by Molecular Evolution Library Sequencing), that is able to obtain the in vitro evolution of a protein from an artificial selection based on function. We were able to observe with CAMELS many features of the TEM-1 beta-lactamase local fold exclusively by generating and sequencing large libraries of mutational variants. We demonstrated that we can, whenever a functional phenotypic selection of a protein is available, sketch the structural and evolutionary landscape of a protein without utilizing purified proteins, collecting physical measurements, or relying on the pool of natural protein variants.
Affiliation(s)
- Marco Fantini
  - BioSNS Laboratory of Biology, Scuola Normale Superiore (SNS), Pisa, Italy
- Simonetta Lisi
  - BioSNS Laboratory of Biology, Scuola Normale Superiore (SNS), Pisa, Italy
- Paolo De Los Rios
  - Institute of Physics, School of Basic Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
  - Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Antonino Cattaneo
  - BioSNS Laboratory of Biology, Scuola Normale Superiore (SNS), Pisa, Italy
  - European Brain Research Institute, Rome, Italy
- Annalisa Pastore
  - Department of Clinical and Basic Neuroscience, Maurice Wohl Institute, King's College London, London, United Kingdom
  - Dementia Research Institute, King's College London, London, United Kingdom
12
Epistatic contributions promote the unification of incompatible models of neutral molecular evolution. Proc Natl Acad Sci U S A 2020; 117:5873-5882. [PMID: 32123092 PMCID: PMC7084075 DOI: 10.1073/pnas.1913071117] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Indexed: 12/27/2022] Open
Abstract
Mathematical models of evolution help us understand mechanisms driving protein-sequence change. Previous models recapitulate a disjoint subset of statistical features of natural sequences. We present a neutral evolution model that unifies features including extreme variance of the molecular clock's tick rate and the observation of an evolutionary Stokes shift, an irreversible effect of mutations in the fitness landscape during sequence evolution. We show that interactions between amino acid sites, which inform our fitness metric, are required to observe these features. These interactions are inferred by using direct coupling analysis, which has been successfully utilized to predict protein structures, dynamics, and complexes from coevolutionary information. We anticipate our model will have applications in phylogenetics, ancestral reconstruction of sequences, and protein design. We introduce a model of amino acid sequence evolution that accounts for the statistical behavior of real sequences induced by epistatic interactions. We base the model dynamics on parameters derived from multiple sequence alignments analyzed by using direct coupling analysis methodology. Known statistical properties such as overdispersion, heterotachy, and gamma-distributed rate-across-sites are shown to be emergent properties of this model while being consistent with neutral evolution theory, thereby unifying observations from previously disjointed evolutionary models of sequences. The relationship between site restriction and heterotachy is characterized by tracking the effective alphabet dynamics of sites. We also observe an evolutionary Stokes shift in the fitness of sequences that have undergone evolution under our simulation. By analyzing the structural information of some proteins, we corroborate that the strongest Stokes shifts derive from sites that physically interact in networks near biochemically important regions. Perspectives on the implementation of our model in the context of the molecular clock are discussed.
13
Malinverni D, Barducci A. Coevolutionary Analysis of Protein Subfamilies by Sequence Reweighting. Entropy (Basel, Switzerland) 2020; 21:1127. [PMID: 32002010 PMCID: PMC6992422 DOI: 10.3390/e21111127] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 10/17/2019] [Accepted: 11/14/2019] [Indexed: 01/07/2023]
Abstract
Extracting structural information from sequence co-variation has become a common computational biology practice in recent years, mainly due to the availability of large sequence alignments of protein families. However, identifying features that are specific to sub-classes and not shared by all members of the family using sequence-based approaches has remained an elusive problem. We here present a coevolutionary-based method to differentially analyze subfamily-specific structural features by a continuous sequence reweighting (SR) approach. We introduce the underlying principles and test its predictive capabilities on the Response Regulator family, whose subfamilies have been previously shown to display distinct, specific homo-dimerization patterns. Our results show that this reweighting scheme is effective in assigning structural features known a priori to subfamilies, even when sequence data is relatively scarce. Furthermore, sequence reweighting allows assessing whether individual structural contacts pertain to specific subfamilies, and it thus paves the way for the identification of specificity-determining contacts from sequence variation data.
Affiliation(s)
- Duccio Malinverni
  - Medical Research Council (MRC) Laboratory of Molecular Biology, Cambridge CB2 0QH, UK
- Alessandro Barducci
  - Centre de Biochimie Structurale (CBS), INSERM, CNRS, Université de Montpellier, 34090 Montpellier, France
14
Astl L, Verkhivker GM. Dynamic View of Allosteric Regulation in the Hsp70 Chaperones by J-Domain Cochaperone and Post-Translational Modifications: Computational Analysis of Hsp70 Mechanisms by Exploring Conformational Landscapes and Residue Interaction Networks. J Chem Inf Model 2020; 60:1614-1631. [DOI: 10.1021/acs.jcim.9b01045] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Indexed: 12/18/2022]
Affiliation(s)
- Lindy Astl
  - Graduate Program in Computational and Data Sciences, Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, One University Drive, Orange, California 92866, United States
- Gennady M. Verkhivker
  - Graduate Program in Computational and Data Sciences, Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, One University Drive, Orange, California 92866, United States
  - Department of Biomedical and Pharmaceutical Sciences, Chapman University School of Pharmacy, Irvine, California 92618, United States
15
|
Li Y, De la Paz JA, Jiang X, Liu R, Pokkulandra AP, Bleris L, Morcos F. Coevolutionary Couplings Unravel PAM-Proximal Constraints of CRISPR-SpCas9. Biophys J 2019; 117:1684-1691. [PMID: 31648792 DOI: 10.1016/j.bpj.2019.09.040] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 06/05/2019] [Revised: 09/25/2019] [Accepted: 09/30/2019] [Indexed: 01/07/2023] Open
Abstract
The clustered regularly interspaced short palindromic repeats (CRISPR) system, an immune system analog found in prokaryotes, allows a single-guide RNA to direct a CRISPR-associated protein (Cas) with combined helicase and nuclease activity to DNA. The presence of a specific protospacer adjacent motif (PAM) next to the DNA target site plays a crucial role in determining both efficacy and specificity of gene editing. Herein, we introduce a coevolutionary framework to computationally unveil nonobvious molecular interactions in CRISPR systems and experimentally probe their functional role. Specifically, we use direct coupling analysis, a statistical inference framework used to infer direct coevolutionary couplings, in the context of protein/nucleic acid interactions. Applied to Streptococcus pyogenes Cas9, a Hamiltonian metric obtained from coevolutionary relationships reveals, to our knowledge, novel PAM-proximal nucleotide preferences at the seventh position of S. pyogenes Cas9 PAM (5'-NGRNNNT-3'), which was experimentally confirmed by in vitro and functional assays in human cells. We show that coevolved and conserved interactions point to specific clues toward rationally engineering new generations of Cas9 systems and may eventually help decipher the diversity of this family of proteins.
Affiliation(s)
- Yi Li
- Department of Bioengineering, The University of Texas at Dallas, Richardson, Texas; Center for Systems Biology, The University of Texas at Dallas, Richardson, Texas
- José A De la Paz
- Department of Biological Sciences, The University of Texas at Dallas, Richardson, Texas
- Xianli Jiang
- Department of Biological Sciences, The University of Texas at Dallas, Richardson, Texas
- Richard Liu
- Department of Bioengineering, The University of Texas at Dallas, Richardson, Texas
- Adarsha P Pokkulandra
- School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, Texas
- Leonidas Bleris
- Department of Bioengineering, The University of Texas at Dallas, Richardson, Texas; Center for Systems Biology, The University of Texas at Dallas, Richardson, Texas; Department of Biological Sciences, The University of Texas at Dallas, Richardson, Texas
- Faruck Morcos
- Department of Bioengineering, The University of Texas at Dallas, Richardson, Texas; Center for Systems Biology, The University of Texas at Dallas, Richardson, Texas; Department of Biological Sciences, The University of Texas at Dallas, Richardson, Texas
16
Vandova V, Vankova P, Durech M, Houser J, Kavan D, Man P, Muller P, Trcka F. HSPA1A conformational mutants reveal a conserved structural unit in Hsp70 proteins. Biochim Biophys Acta Gen Subj 2019; 1864:129458. [PMID: 31676290 DOI: 10.1016/j.bbagen.2019.129458] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 05/11/2019] [Revised: 08/22/2019] [Accepted: 10/15/2019] [Indexed: 01/19/2023]
Abstract
BACKGROUND The Hsp70 proteins maintain proteome integrity through the capacity of their nucleotide- and substrate-binding domains (NBD and SBD) to allosterically regulate substrate affinity in a nucleotide-dependent manner. Crystallographic studies showed that Hsp70 allostery relies on formation of contacts between ATP-bound NBD and an interdomain linker, accompanied by SBD subdomains docking onto distinct sites of the NBD leading to substrate release. However, the mechanics of ATP-induced SBD subdomains detachment is largely unknown. METHODS Here, we investigated the structural and allosteric properties of human HSPA1A using hydrogen/deuterium exchange mass spectrometry, ATPase assays, surface plasmon resonance and fluorescence polarization-based substrate binding assays. RESULTS Analysis of HSPA1A proteins bearing mutations at the interface of SBD subdomains close to the interdomain linker (amino acids L399, L510, I515, and D529) revealed that this region forms a folding unit stabilizing the structure of both SBD subdomains in the nucleotide-free state. The introduced mutations modulate HSPA1A allostery as they localize to the NBD-SBD interfaces in the ATP-bound protein. CONCLUSIONS These findings show that residues forming the hydrophobic structural unit stabilizing the SBD structure are relocated during ATP-activated detachment of the SBD subdomains to different NBD-SBD docking interfaces enabling HSPA1A allostery. GENERAL SIGNIFICANCE Mutation-induced perturbations tuned HSPA1A sensitivity to peptide/protein substrates and to Hsp40 in a way that is common for other Hsp70 proteins. Our results provide an insight into structural rearrangements in the SBD of Hsp70 proteins and highlight HSPA1A-specific allostery features, which is a prerequisite for selective targeting in Hsp-related pathologies.
Affiliation(s)
- Veronika Vandova
- Regional Centre for Applied Molecular Oncology, Masaryk Memorial Cancer Institute, Zluty kopec 7, 656 53 Brno, Czech Republic
- Pavla Vankova
- BioCeV - Institute of Microbiology of the Czech Academy of Sciences, v.v.i., Prumyslova 595, 252 50 Vestec, Czech Republic; Department of Biochemistry, Faculty of Science, Charles University, Hlavova 8, 128 43 Prague, Czech Republic
- Michal Durech
- Regional Centre for Applied Molecular Oncology, Masaryk Memorial Cancer Institute, Zluty kopec 7, 656 53 Brno, Czech Republic
- Josef Houser
- Central European Institute of Technology, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic; National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Kotlarska 2, 611 37 Brno, Czech Republic
- Daniel Kavan
- BioCeV - Institute of Microbiology of the Czech Academy of Sciences, v.v.i., Prumyslova 595, 252 50 Vestec, Czech Republic; Department of Biochemistry, Faculty of Science, Charles University, Hlavova 8, 128 43 Prague, Czech Republic
- Petr Man
- BioCeV - Institute of Microbiology of the Czech Academy of Sciences, v.v.i., Prumyslova 595, 252 50 Vestec, Czech Republic; Department of Biochemistry, Faculty of Science, Charles University, Hlavova 8, 128 43 Prague, Czech Republic
- Petr Muller
- Regional Centre for Applied Molecular Oncology, Masaryk Memorial Cancer Institute, Zluty kopec 7, 656 53 Brno, Czech Republic
- Filip Trcka
- Regional Centre for Applied Molecular Oncology, Masaryk Memorial Cancer Institute, Zluty kopec 7, 656 53 Brno, Czech Republic
17
Phylogenetic correlations can suffice to infer protein partners from sequences. PLoS Comput Biol 2019; 15:e1007179. [PMID: 31609984 PMCID: PMC6812855 DOI: 10.1371/journal.pcbi.1007179] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 06/10/2019] [Revised: 10/24/2019] [Accepted: 09/25/2019] [Indexed: 12/30/2022] Open
Abstract
Determining which proteins interact together is crucial to a systems-level understanding of the cell. Recently, algorithms based on Direct Coupling Analysis (DCA) pairwise maximum-entropy models have allowed to identify interaction partners among paralogous proteins from sequence data. This success of DCA at predicting protein-protein interactions could be mainly based on its known ability to identify pairs of residues that are in contact in the three-dimensional structure of protein complexes and that coevolve to remain physicochemically complementary. However, interacting proteins possess similar evolutionary histories. What is the role of purely phylogenetic correlations in the performance of DCA-based methods to infer interaction partners? To address this question, we employ controlled synthetic data that only involve phylogeny and no interactions or contacts. We find that DCA accurately identifies the pairs of synthetic sequences that share evolutionary history. While phylogenetic correlations confound the identification of contacting residues by DCA, they are thus useful to predict interacting partners among paralogs. We find that DCA performs as well as phylogenetic methods to this end, and slightly better than them with large and accurate training sets. Employing DCA or phylogenetic methods within an Iterative Pairing Algorithm (IPA) allows to predict pairs of evolutionary partners without a training set. We further demonstrate the ability of these various methods to correctly predict pairings among real paralogous proteins with genome proximity but no known direct physical interaction, illustrating the importance of phylogenetic correlations in natural data. However, for physically interacting and strongly coevolving proteins, DCA and mutual information outperform phylogenetic methods. We finally discuss how to distinguish physically interacting proteins from proteins that only share a common evolutionary history. 
Many biologically important protein-protein interactions are conserved over evolutionary time scales. This leads to two different signals that can be used to computationally predict interactions between protein families and to identify specific interaction partners. First, the shared evolutionary history leads to highly similar phylogenetic relationships between interacting proteins of the two families. Second, the need to keep the interaction surfaces of partner proteins biophysically compatible causes a correlated amino-acid usage of interface residues. Employing simulated data, we show that the shared history alone can be used to detect partner proteins. Similar accuracies are achieved by algorithms comparing phylogenetic relationships and by methods based on Direct Coupling Analysis (DCA), which are primarily known for their ability to detect the second type of signal. Using natural sequence data, we show that in cases with shared evolutionary history but without known physical interactions, both methods work with similar accuracy, while for some physically interacting systems, DCA and mutual information outperform phylogenetic methods. We propose methods allowing both to predict interactions between protein families and to find interacting partners among paralogs.
18
Takakuwa JE, Nitika, Knighton LE, Truman AW. Oligomerization of Hsp70: Current Perspectives on Regulation and Function. Front Mol Biosci 2019; 6:81. [PMID: 31555664 PMCID: PMC6742908 DOI: 10.3389/fmolb.2019.00081] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 07/22/2019] [Accepted: 08/22/2019] [Indexed: 12/14/2022] Open
Abstract
The Hsp70 molecular chaperone in conjunction with Hsp90 and a suite of helper co-chaperones are required for the folding and subsequent refolding of a large proportion of the proteome. These proteins are critical for cell viability and play major roles in diseases of proteostasis which include neurodegenerative diseases and cancer. As a consequence, a large scientific effort has gone into understanding how chaperones such as Hsp70 function at the in vitro and in vivo level. Although many chaperones require constitutive self-interaction (dimerization and oligomerization) to function, Hsp70 has been thought to exist as a monomer, especially in eukaryotic cells. Recent studies have demonstrated that both bacterial and mammalian Hsp70 can exist as a dynamic pool of monomers, dimer, and oligomers. In this mini-review, we discuss the mechanisms and roles of Hsp70 oligomerization in Hsp70 function, as well as thoughts on how this integrates into well-established ideas of Hsp70 regulation.
Affiliation(s)
- Andrew W. Truman
- Department of Biological Sciences, The University of North Carolina at Charlotte, Charlotte, NC, United States
19
Astl L, Verkhivker GM. Data-driven computational analysis of allosteric proteins by exploring protein dynamics, residue coevolution and residue interaction networks. Biochim Biophys Acta Gen Subj 2019:S0304-4165(19)30179-5. [PMID: 31330173 DOI: 10.1016/j.bbagen.2019.07.008] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 03/17/2019] [Revised: 07/15/2019] [Accepted: 07/17/2019] [Indexed: 02/07/2023]
Abstract
BACKGROUND Computational studies of allosteric interactions have witnessed a recent renaissance fueled by the growing interest in modeling of the complex molecular assemblies and biological networks. Allosteric interactions in protein structures allow for molecular communication in signal transduction networks. METHODS In this work, we performed a large scale comprehensive and multi-faceted analysis of >300 diverse allosteric proteins and complexes with allosteric modulators. By modeling and exploring coarse-grained dynamics, residue coevolution, and residue interaction networks for allosteric proteins, we have determined unifying molecular signatures shared by allosteric systems. RESULTS The results of this study have suggested that allosteric inhibitors and allosteric activators may differentially affect global dynamics and network organization of protein systems, leading to diverse allosteric mechanisms. By using structural and functional data on protein kinases, we present a detailed case study that included atomic-level analysis of coevolutionary networks in kinases bound with allosteric inhibitors and activators. CONCLUSIONS We have found that coevolutionary networks can form direct communication pathways connecting functional regions and can recapitulate key regulatory sites and interactions responsible for allosteric signaling in the studied protein systems. The results of this computational investigation are compared with the experimental studies and reveal molecular signatures of known regulatory hotspots in protein kinases. GENERAL SIGNIFICANCE This study has shown that allosteric inhibitors and allosteric activators can have a different effect on residue interaction networks and can exploit distinct regulatory mechanisms, which could open up opportunities for probing allostery and new drug combinations with broad range of activities.
Affiliation(s)
- Lindy Astl
- Department of Biomedical and Pharmaceutical Sciences, Chapman University School of Pharmacy, Irvine, CA 92618, United States of America
- Gennady M Verkhivker
- Department of Biomedical and Pharmaceutical Sciences, Chapman University School of Pharmacy, Irvine, CA 92618, United States of America; Department of Pharmacology, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093, United States of America
20
The role of coevolutionary signatures in protein interaction dynamics, complex inference, molecular recognition, and mutational landscapes. Curr Opin Struct Biol 2019; 56:179-186. [PMID: 31029927 DOI: 10.1016/j.sbi.2019.03.024] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 03/10/2019] [Revised: 03/18/2019] [Accepted: 03/19/2019] [Indexed: 11/22/2022]
Abstract
Evolution imposes constraints at the interface of interacting biomolecules in order to preserve function or maintain fitness. This pressure may have a direct effect on the sequence composition of interacting biomolecules. As a result, statistical patterns of amino acid or nucleotide covariance that encode for physical and functional interactions are observed in sequences of extant organisms. In recent years, global pairwise models of amino acid and nucleotide coevolution from multiple sequence alignments have been developed and utilized to study molecular interactions in structural biology. In proteins, for which the energy landscape is funneled and minimally frustrated, a direct connection between the physical and sequence space landscapes can be established. Estimating coevolutionary information from sequences of interacting molecules has a broad impact in molecular biology. Applications include the accurate determination of 3D structures of molecular complexes, inference of protein interaction partners, models of protein-protein interaction specificity, the elucidation, and design of protein-nucleic acid recognition as well as the discovery of genome-wide epistatic effects. The current state of the art of coevolutionary analysis includes biomedical applications ranging from mutational landscapes and drug-design to vaccine development.
21
Wang SW, Bitbol AF, Wingreen NS. Revealing evolutionary constraints on proteins through sequence analysis. PLoS Comput Biol 2019; 15:e1007010. [PMID: 31017888 PMCID: PMC6502352 DOI: 10.1371/journal.pcbi.1007010] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 01/05/2019] [Revised: 05/06/2019] [Accepted: 04/06/2019] [Indexed: 02/03/2023] Open
Abstract
Statistical analysis of alignments of large numbers of protein sequences has revealed "sectors" of collectively coevolving amino acids in several protein families. Here, we show that selection acting on any functional property of a protein, represented by an additive trait, can give rise to such a sector. As an illustration of a selected trait, we consider the elastic energy of an important conformational change within an elastic network model, and we show that selection acting on this energy leads to correlations among residues. For this concrete example and more generally, we demonstrate that the main signature of functional sectors lies in the small-eigenvalue modes of the covariance matrix of the selected sequences. However, secondary signatures of these functional sectors also exist in the extensively-studied large-eigenvalue modes. Our simple, general model leads us to propose a principled method to identify functional sectors, along with the magnitudes of mutational effects, from sequence data. We further demonstrate the robustness of these functional sectors to various forms of selection, and the robustness of our approach to the identification of multiple selected traits.
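The claim in the abstract above, that selection on an additive trait leaves its main signature in the small-eigenvalue modes of the covariance matrix, can be illustrated with a toy numerical experiment. The sketch below is not the authors' code: the binary sequence encoding, the hard selection window, and every name and parameter (`effects`, `window`, `smallest_mode_overlap`) are assumptions made for illustration only.

```python
import numpy as np

def smallest_mode_overlap(n_sites=20, n_samples=100_000, window=0.3, seed=0):
    """Toy check: select on an additive trait, then inspect covariance modes."""
    rng = np.random.default_rng(seed)
    # Hypothetical per-site mutational effects defining the additive trait.
    effects = rng.normal(size=n_sites)
    # Random binary sequences: 0 = reference state, 1 = mutated state.
    seqs = rng.integers(0, 2, size=(n_samples, n_sites))
    trait = seqs @ effects
    # Hard selection (an assumption): keep sequences whose trait lies
    # within a narrow window around the ensemble mean.
    keep = np.abs(trait - trait.mean()) < window * trait.std()
    selected = seqs[keep]
    # Covariance of the selected sequences and its eigen-decomposition.
    cov = np.cov(selected, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    lowest_mode = eigvecs[:, 0]
    unit_effects = effects / np.linalg.norm(effects)
    # Absolute overlap between the lowest mode and the effect direction.
    return eigvals, abs(lowest_mode @ unit_effects)

eigvals, overlap = smallest_mode_overlap()
print(f"smallest eigenvalue: {eigvals[0]:.4f}, median: {np.median(eigvals):.4f}")
print(f"overlap of lowest mode with effect vector: {overlap:.3f}")
```

In this toy setting, selection suppresses variance along the direction of the mutational effects, so the eigenvector with the smallest eigenvalue should align closely with that direction while the remaining eigenvalues stay near their unselected value.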
Affiliation(s)
- Shou-Wen Wang
- Department of Engineering Physics, Tsinghua University, Beijing, China
- Beijing Computational Science Research Center, Beijing, China
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
- Anne-Florence Bitbol
- Sorbonne Université, CNRS, Laboratoire Jean Perrin (UMR 8237), F-75005 Paris, France
- Ned S. Wingreen
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
- Department of Molecular Biology, Princeton University, Princeton, New Jersey, United States of America
22
Tubiana J, Cocco S, Monasson R. Learning protein constitutive motifs from sequence data. eLife 2019; 8:e39397. [PMID: 30857591 PMCID: PMC6436896 DOI: 10.7554/elife.39397] [Citation(s) in RCA: 55] [Impact Index Per Article: 11.0] [Received: 10/03/2018] [Accepted: 02/24/2019] [Indexed: 12/11/2022] Open
Abstract
Statistical analysis of evolutionary-related protein sequences provides information about their structure, function, and history. We show that Restricted Boltzmann Machines (RBM), designed to learn complex high-dimensional data and their statistical features, can efficiently model protein families from sequence information. We here apply RBM to 20 protein families, and present detailed results for two short protein domains (Kunitz and WW), one long chaperone protein (Hsp70), and synthetic lattice proteins for benchmarking. The features inferred by the RBM are biologically interpretable: they are related to structure (residue-residue tertiary contacts, extended secondary motifs (α-helixes and β-sheets) and intrinsically disordered regions), to function (activity and ligand specificity), or to phylogenetic identity. In addition, we use RBM to design new protein sequences with putative properties by composing and 'turning up' or 'turning down' the different modes at will. Our work therefore shows that RBM are versatile and practical tools that can be used to unveil and exploit the genotype-phenotype relationship for protein families.
Affiliation(s)
- Jérôme Tubiana
- Laboratory of Physics of the Ecole Normale Supérieure, CNRS UMR 8023 & PSL Research, Paris, France
- Simona Cocco
- Laboratory of Physics of the Ecole Normale Supérieure, CNRS UMR 8023 & PSL Research, Paris, France
- Rémi Monasson
- Laboratory of Physics of the Ecole Normale Supérieure, CNRS UMR 8023 & PSL Research, Paris, France
23
Tiroli-Cepeda AO, Seraphim TV, Pinheiro GM, Souto DE, Kubota LT, Borges JC, Barbosa LR, Ramos CH. Studies on the effect of the J-domain on the substrate binding domain (SBD) of Hsp70 using a chimeric human J-SBD polypeptide. Int J Biol Macromol 2019; 124:111-120. [DOI: 10.1016/j.ijbiomac.2018.11.130] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 09/21/2018] [Revised: 11/13/2018] [Accepted: 11/14/2018] [Indexed: 10/27/2022]
24
Trcka F, Durech M, Vankova P, Chmelik J, Martinkova V, Hausner J, Kadek A, Marcoux J, Klumpler T, Vojtesek B, Muller P, Man P. Human Stress-inducible Hsp70 Has a High Propensity to Form ATP-dependent Antiparallel Dimers That Are Differentially Regulated by Cochaperone Binding. Mol Cell Proteomics 2019; 18:320-337. [PMID: 30459217 PMCID: PMC6356074 DOI: 10.1074/mcp.ra118.001044] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 09/04/2018] [Revised: 11/09/2018] [Indexed: 12/23/2022] Open
Abstract
Eukaryotic protein homeostasis (proteostasis) is largely dependent on the action of highly conserved Hsp70 molecular chaperones. Recent evidence indicates that, apart from conserved molecular allostery, Hsp70 proteins have retained and adapted the ability to assemble as functionally relevant ATP-bound dimers throughout evolution. Here, we have compared the ATP-dependent dimerization of DnaK, human stress-inducible Hsp70, Hsc70 and BiP Hsp70 proteins, showing that their dimerization propensities differ, with stress-inducible Hsp70 being predominantly dimeric in the presence of ATP. Structural analyses using hydrogen/deuterium exchange mass spectrometry, native electrospray ionization mass spectrometry and small-angle X-ray scattering revealed that stress-inducible Hsp70 assembles in solution as an antiparallel dimer with the intermolecular interface closely resembling the ATP-bound dimer interfaces captured in DnaK and BiP crystal structures. ATP-dependent dimerization of stress-inducible Hsp70 is necessary for its efficient interaction with Hsp40, as shown by experiments with dimerization-deficient mutants. Moreover, dimerization of ATP-bound Hsp70 is required for its participation in high molecular weight protein complexes detected ex vivo, supporting its functional role in vivo. As human cytosolic Hsp70 can interact with tetratricopeptide repeat (TPR) domain containing cochaperones, we tested the interaction of Hsp70 ATP-dependent dimers with Chip and Tomm34 cochaperones. Although Chip associates with intact Hsp70 dimers to form a larger complex, binding of Tomm34 disrupts the Hsp70 dimer and this event plays an important role in Hsp70 activity regulation. In summary, this study provides structural evidence of robust ATP-dependent antiparallel dimerization of human inducible Hsp70 protein and suggests a novel role of TPR domain cochaperones in multichaperone complexes involving Hsp70 ATP-bound dimers.
Affiliation(s)
- Filip Trcka
- Regional Centre for Applied Molecular Oncology, Masaryk Memorial Cancer Institute, Zluty kopec 7, 656 53 Brno, Czech Republic
- Michal Durech
- Regional Centre for Applied Molecular Oncology, Masaryk Memorial Cancer Institute, Zluty kopec 7, 656 53 Brno, Czech Republic
- Pavla Vankova
- BioCeV - Institute of Microbiology of the Czech Academy of Sciences, v.v.i., Prumyslova 595, 252 50 Vestec, Czech Republic; Department of Biochemistry, Faculty of Science, Charles University in Prague, Hlavova 8, 128 43 Prague, Czech Republic
- Josef Chmelik
- BioCeV - Institute of Microbiology of the Czech Academy of Sciences, v.v.i., Prumyslova 595, 252 50 Vestec, Czech Republic; Department of Biochemistry, Faculty of Science, Charles University in Prague, Hlavova 8, 128 43 Prague, Czech Republic
- Veronika Martinkova
- Regional Centre for Applied Molecular Oncology, Masaryk Memorial Cancer Institute, Zluty kopec 7, 656 53 Brno, Czech Republic
- Jiri Hausner
- BioCeV - Institute of Microbiology of the Czech Academy of Sciences, v.v.i., Prumyslova 595, 252 50 Vestec, Czech Republic; Department of Biochemistry, Faculty of Science, Charles University in Prague, Hlavova 8, 128 43 Prague, Czech Republic
- Alan Kadek
- BioCeV - Institute of Microbiology of the Czech Academy of Sciences, v.v.i., Prumyslova 595, 252 50 Vestec, Czech Republic; Department of Biochemistry, Faculty of Science, Charles University in Prague, Hlavova 8, 128 43 Prague, Czech Republic
- Julien Marcoux
- Institut de Pharmacologie et de Biologie Structurale, Université de Toulouse, CNRS, UPS, Toulouse, France
- Tomas Klumpler
- CEITEC-Central European Institute of Technology, Masaryk University, 625 00 Brno, Czech Republic
- Borivoj Vojtesek
- Regional Centre for Applied Molecular Oncology, Masaryk Memorial Cancer Institute, Zluty kopec 7, 656 53 Brno, Czech Republic
- Petr Muller
- Regional Centre for Applied Molecular Oncology, Masaryk Memorial Cancer Institute, Zluty kopec 7, 656 53 Brno, Czech Republic
- Petr Man
- BioCeV - Institute of Microbiology of the Czech Academy of Sciences, v.v.i., Prumyslova 595, 252 50 Vestec, Czech Republic; Department of Biochemistry, Faculty of Science, Charles University in Prague, Hlavova 8, 128 43 Prague, Czech Republic
25
Abstract
Thanks to the explosion of genomic sequencing, coevolutionary analysis of protein sequences has gained great and ever-increasing popularity in the last decade, and it is currently an important and well-established tool in structural bioinformatics and computational biology. This chapter concisely introduces the theoretical foundation and the practical aspects of coevolutionary analysis, as well as discusses the molecular modeling strategies to exploit its results in the study of protein structure, dynamics, and interactions. We present here a complete pipeline from sequence extraction to contact prediction through two examples, focusing on the predictions of inter-residue contacts in a single protein domain and on the analysis of a multi-domain protein that undergoes functional, large-scale conformational transitions.
Affiliation(s)
- Duccio Malinverni
- Laboratory of Statistical Biophysics, Institute of Physics, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Alessandro Barducci
- Centre de Biochimie Structurale (CBS), INSERM, CNRS, Université de Montpellier, Montpellier, France
26
Mayer MP, Gierasch LM. Recent advances in the structural and mechanistic aspects of Hsp70 molecular chaperones. J Biol Chem 2018; 294:2085-2097. [PMID: 30455352 DOI: 10.1074/jbc.rev118.002810] [Citation(s) in RCA: 169] [Impact Index Per Article: 28.2] [Indexed: 12/13/2022] Open
Abstract
Hsp70 chaperones are central hubs of the protein quality control network and collaborate with co-chaperones having a J-domain (an ∼70-residue-long helical hairpin with a flexible loop and a conserved His-Pro-Asp motif required for ATP hydrolysis by Hsp70s) and also with nucleotide exchange factors to facilitate many protein-folding processes that (re)establish protein homeostasis. The Hsp70s are highly dynamic nanomachines that modulate the conformation of their substrate polypeptides by transiently binding to short, mostly hydrophobic stretches. This interaction is regulated by an intricate allosteric mechanism. The J-domain co-chaperones target Hsp70 to their polypeptide substrates, and the nucleotide exchange factors regulate the lifetime of the Hsp70-substrate complexes. Significant advances in recent years are beginning to unravel the molecular mechanism of this chaperone machine and how they treat their substrate proteins.
Affiliation(s)
- Matthias P Mayer
- Center for Molecular Biology of Heidelberg University (ZMBH), DKFZ-ZMBH-Alliance, 69120 Heidelberg, Germany
- Lila M Gierasch
- Departments of Biochemistry and Molecular Biology and Chemistry, University of Massachusetts, Amherst, Massachusetts 01003
27
Bitbol AF. Inferring interaction partners from protein sequences using mutual information. PLoS Comput Biol 2018; 14:e1006401. [PMID: 30422978 PMCID: PMC6258550 DOI: 10.1371/journal.pcbi.1006401] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 07/23/2018] [Revised: 11/27/2018] [Accepted: 10/27/2018] [Indexed: 11/30/2022] Open
Abstract
Functional protein-protein interactions are crucial in most cellular processes. They enable multi-protein complexes to assemble and to remain stable, and they allow signal transduction in various pathways. Functional interactions between proteins result in coevolution between the interacting partners, and thus in correlations between their sequences. Pairwise maximum-entropy based models have enabled successful inference of pairs of amino-acid residues that are in contact in the three-dimensional structure of multi-protein complexes, starting from the correlations in the sequence data of known interaction partners. Recently, algorithms inspired by these methods have been developed to identify which proteins are functional interaction partners among the paralogous proteins of two families, starting from sequence data alone. Here, we demonstrate that a slightly higher performance for partner identification can be reached by an approximate maximization of the mutual information between the sequence alignments of the two protein families. Our mutual information-based method also provides signatures of the existence of interactions between protein families. These results stand in contrast with structure prediction of proteins and of multi-protein complexes from sequence data, where pairwise maximum-entropy based global statistical models substantially improve performance compared to mutual information. Our findings entail that the statistical dependences allowing interaction partner prediction from sequence data are not restricted to the residue pairs that are in direct contact at the interface between the partner proteins.
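The quantity at the heart of the abstract above, the mutual information between the alignments of two protein families under a given pairing, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the published method: it uses raw frequencies with no pseudocounts or sequence weighting, scores a pairing by summing mutual information over all inter-family column pairs, and the names `column_mutual_information`, `pairing_score`, and the toy alignments are hypothetical.

```python
from collections import Counter
from math import log

def column_mutual_information(col_a, col_b):
    """Mutual information (in nats) between two aligned columns of equal length."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must come from the same set of paired sequences")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # p_ab * log(p_ab / (p_a * p_b)), written with raw counts.
        mi += (n_ab / n) * log(n_ab * n / (count_a[a] * count_b[b]))
    return mi

def pairing_score(family_a, family_b):
    """Sum of column-pair MI over all inter-family column pairs for one pairing."""
    cols_a = list(zip(*family_a))
    cols_b = list(zip(*family_b))
    return sum(column_mutual_information(ca, cb) for ca in cols_a for cb in cols_b)

# Hypothetical toy alignments: row k of family_a is paired with row k of family_b.
family_a = ["AK", "AK", "DE", "DE"]
family_b = ["R", "R", "L", "L"]
matched = pairing_score(family_a, family_b)
# The same partner sequences, assigned in a scrambled order.
mismatched = pairing_score(family_a, ["R", "L", "R", "L"])
print(matched, mismatched)
```

In this toy case the matched pairing scores 2 ln 2 and the scrambled pairing scores zero, which is the kind of contrast a partner-assignment procedure would try to maximize over candidate pairings.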
Affiliation(s)
- Anne-Florence Bitbol
- Sorbonne Université, CNRS, Laboratoire Jean Perrin (UMR 8237), F-75005 Paris, France
|
28
|
Mileo E, Ilbert M, Barducci A, Bordes P, Castanié-Cornet MP, Garnier C, Genevaux P, Gillet R, Goloubinoff P, Ochsenbein F, Richarme G, Iobbi-Nivol C, Giudici-Orticoni MT, Gontero B, Genest O. Emerging fields in chaperone proteins: A French workshop. Biochimie 2018; 151:159-165. [PMID: 29890204 DOI: 10.1016/j.biochi.2018.06.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2018] [Accepted: 06/06/2018] [Indexed: 10/14/2022]
Abstract
The "Bioénergétique et Ingénierie des Protéines (BIP)" laboratory, CNRS (France), organized its first French workshop on molecular chaperone proteins and protein folding in November 2017. The goal of this workshop was to gather scientists working in France on chaperone proteins and protein folding. This initiative was a great success with excellent talks and fruitful discussions. The highlights were on the description of unexpected functions and post-translational regulation of known molecular chaperones (such as Hsp90, Hsp33, SecB, GroEL) and on state-of-the-art methods to tackle questions related to this theme, including Cryo-electron microscopy, Nuclear Magnetic Resonance (NMR), Electron Paramagnetic Resonance (EPR), simulation and modeling. We expect to organize a second workshop in two years that will include more scientists working in France in the chaperone field.
Affiliation(s)
- Elisabetta Mileo
- Aix Marseille Univ, CNRS, Laboratoire de Bioénergétique et Ingénierie des Protéines, Marseille, France
- Marianne Ilbert
- Aix Marseille Univ, CNRS, Laboratoire de Bioénergétique et Ingénierie des Protéines, Marseille, France
- Alessandro Barducci
- Centre de Biochimie Structurale (CBS), INSERM, CNRS, Université de Montpellier, Montpellier, France
- Patricia Bordes
- Laboratoire de Microbiologie et Génétique Moléculaires, Centre de Biologie Intégrative, CNRS, Université Paul-Sabatier, Toulouse, France
- Marie-Pierre Castanié-Cornet
- Laboratoire de Microbiologie et Génétique Moléculaires, Centre de Biologie Intégrative, CNRS, Université Paul-Sabatier, Toulouse, France
- Cyrille Garnier
- Mécanismes Moléculaires dans les Démences Neurodégénératives, Université de Montpellier, EPHE, INSERM, U1198, F-34095, Montpellier, France; Université de Rennes 1, France
- Pierre Genevaux
- Laboratoire de Microbiologie et Génétique Moléculaires, Centre de Biologie Intégrative, CNRS, Université Paul-Sabatier, Toulouse, France
- Reynald Gillet
- Univ. Rennes, CNRS, Institut de Génétique et Développement de Rennes (IGDR) UMR6290, Rennes, France
- Pierre Goloubinoff
- Département de Biologie Moléculaire Végétale, Université de Lausanne, 1015, Lausanne, Switzerland
- Françoise Ochsenbein
- Institute for Integrative Biology of the Cell (I2BC), Joliot, CEA, CNRS, Univ. Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette, France
- Gilbert Richarme
- UMR 8601 CNRS, Laboratoire de Chimie et Biochimie Pharmacologiques et Toxicologiques, Université Paris Descartes-Sorbonne Paris Cité, Paris, France
- Chantal Iobbi-Nivol
- Aix Marseille Univ, CNRS, Laboratoire de Bioénergétique et Ingénierie des Protéines, Marseille, France
- Brigitte Gontero
- Aix Marseille Univ, CNRS, Laboratoire de Bioénergétique et Ingénierie des Protéines, Marseille, France
- Olivier Genest
- Aix Marseille Univ, CNRS, Laboratoire de Bioénergétique et Ingénierie des Protéines, Marseille, France
|
29
|
Szurmant H, Weigt M. Inter-residue, inter-protein and inter-family coevolution: bridging the scales. Curr Opin Struct Biol 2018; 50:26-32. [PMID: 29101847 PMCID: PMC5940578 DOI: 10.1016/j.sbi.2017.10.014] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 09/05/2017] [Revised: 10/12/2017] [Accepted: 10/13/2017] [Indexed: 10/18/2022]
Abstract
Interacting proteins coevolve at multiple but interconnected scales, from the residue-residue level through the protein-protein level up to the family-family level. The recent accumulation of enormous amounts of sequence data allows for the development of novel, data-driven computational approaches. Notably, these approaches can bridge scales within a single statistical framework. Although currently applied mostly to isolated problems on single scales, their immense potential for an evolutionarily informed, structural systems biology is steadily emerging.
Affiliation(s)
- Hendrik Szurmant
- Department of Basic Medical Sciences, College of Osteopathic Medicine of the Pacific, Western University of Health Sciences, Pomona, CA 91766, USA
- Martin Weigt
- Sorbonne Universités, UPMC Université Paris 06, CNRS, Biologie Computationnelle et Quantitative - Institut de Biologie Paris Seine, 75005 Paris, France
|
30
|
Dib L, Salamin N, Gfeller D. Polymorphic sites preferentially avoid co-evolving residues in MHC class I proteins. PLoS Comput Biol 2018; 14:e1006188. [PMID: 29782520 PMCID: PMC5983860 DOI: 10.1371/journal.pcbi.1006188] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 11/13/2017] [Revised: 06/01/2018] [Accepted: 05/09/2018] [Indexed: 01/11/2023] Open
Abstract
Major histocompatibility complex class I (MHC-I) molecules are critical to adaptive immune defence mechanisms in vertebrate species and are encoded by highly polymorphic genes. Polymorphic sites are located close to the ligand-binding groove and entail MHC-I alleles with distinct binding specificities. Some efforts have been made to investigate the relationship between polymorphism and protein stability. However, less is known about the relationship between polymorphism and MHC-I co-evolutionary constraints. Using Direct Coupling Analysis (DCA) we found that co-evolution analysis accurately pinpoints structural contacts, although the protein family is restricted to vertebrates and comprises less than five hundred species, and that the co-evolutionary signal is mainly driven by inter-species changes, and not intra-species polymorphism. Moreover, we show that polymorphic sites in human preferentially avoid co-evolving residues, as well as residues involved in protein stability. These results suggest that sites displaying high polymorphism may have been selected during vertebrates’ evolution to avoid co-evolutionary constraints and thereby maximize their mutability.
Amino acid co-evolution represents cases of simultaneous substitution of amino acids at distinct positions in protein sequences. In the MHC-I protein family, such co-evolution could result from either amino acid changes across species or changes within species due to the high polymorphism of MHC-I molecules. Here we show that signals captured by global methods such as Direct Coupling Analysis (DCA) to estimate co-evolution primarily result from changes across species. Moreover, our results indicate that polymorphic sites in MHC-I molecules tend to be decoupled from co-evolving ones. This could suggest that they have been selected to maximize their mutability, which is known to be functionally important to entail MHC-I molecules with a wide repertoire of binding specificities for antigen presentation.
Affiliation(s)
- Linda Dib
- Department of Oncology, Ludwig Institute for Cancer Research, University of Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Quartier Sorge, Lausanne, Switzerland
- Nicolas Salamin
- Swiss Institute of Bioinformatics, Quartier Sorge, Lausanne, Switzerland
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- David Gfeller
- Department of Oncology, Ludwig Institute for Cancer Research, University of Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Quartier Sorge, Lausanne, Switzerland
|
31
|
Neuwald AF, Aravind L, Altschul SF. Inferring joint sequence-structural determinants of protein functional specificity. eLife 2018; 7. [PMID: 29336305 PMCID: PMC5770160 DOI: 10.7554/elife.29880] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 06/23/2017] [Accepted: 12/22/2017] [Indexed: 01/05/2023] Open
Abstract
Residues responsible for allostery, cooperativity, and other subtle but functionally important interactions remain difficult to detect. To aid such detection, we employ statistical inference based on the assumption that residues distinguishing a protein subgroup from evolutionarily divergent subgroups often constitute an interacting functional network. We identify such networks with the aid of two measures of statistical significance. One measure aids identification of divergent subgroups based on distinguishing residue patterns. For each subgroup, a second measure identifies structural interactions involving pattern residues. Such interactions are derived either from atomic coordinates or from Direct Coupling Analysis scores, used as surrogates for structural distances. Applying this approach to N-acetyltransferases, P-loop GTPases, RNA helicases, synaptojanin-superfamily phosphatases and nucleases, and thymine/uracil DNA glycosylases yielded results congruent with biochemical understanding of these proteins, and also revealed striking sequence-structural features overlooked by other methods. These and similar analyses can aid the design of drugs targeting allosteric sites.
Affiliation(s)
- Andrew F Neuwald
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, United States; Department of Biochemistry and Molecular Biology, University of Maryland School of Medicine, Baltimore, United States
- L Aravind
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, United States
- Stephen F Altschul
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, United States
|
32
|
Suplatov D, Sharapova Y, Timonina D, Kopylov K, Švedas V. The visualCMAT: A web-server to select and interpret correlated mutations/co-evolving residues in protein families. J Bioinform Comput Biol 2017; 16:1840005. [PMID: 29361894 DOI: 10.1142/s021972001840005x] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 11/18/2022]
Abstract
The visualCMAT web-server was designed to assist experimental research in the fields of protein/enzyme biochemistry, protein engineering, and drug discovery by providing an intuitive and easy-to-use interface to the analysis of correlated mutations/co-evolving residues. Sequence and structural information describing homologous proteins is used to predict correlated substitutions by the mutual information-based CMAT approach, to classify them into spatially close co-evolving pairs (which either form a direct physical contact or interact with the same ligand, e.g. a substrate or a crystallographic water molecule) and long-range correlations, and to annotate and rank binding sites on the protein surface by the presence of statistically significant co-evolving positions. The results of the visualCMAT are organized for a convenient visual analysis and can be downloaded to a local computer as a content-rich all-in-one PyMol session file with multiple layers of annotation corresponding to bioinformatic, statistical and structural analyses of the predicted co-evolution, or further studied online using the built-in interactive analysis tools. The online interactivity is implemented in HTML5 and therefore neither plugins nor Java are required. The visualCMAT web-server is integrated with the Mustguseal web-server capable of constructing large structure-guided sequence alignments of protein families and superfamilies using all available information about their structures and sequences in public databases. The visualCMAT web-server can be used to understand the relationship between structure and function in proteins, to select hotspots and compensatory mutations for rational design and directed evolution experiments to produce novel enzymes with improved properties, and to study the mechanism of selective ligand binding and allosteric communication between topologically independent sites in protein structures.
The web-server is freely available at https://biokinet.belozersky.msu.ru/visualcmat and there are no login requirements.
Affiliation(s)
- Dmitry Suplatov
- Belozersky Institute of Physicochemical Biology, Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Leninskiye Gory 1-73, Moscow 119991, Russia
- Yana Sharapova
- Belozersky Institute of Physicochemical Biology, Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Leninskiye Gory 1-73, Moscow 119991, Russia
- Daria Timonina
- Belozersky Institute of Physicochemical Biology, Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Leninskiye Gory 1-73, Moscow 119991, Russia
- Kirill Kopylov
- Belozersky Institute of Physicochemical Biology, Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Leninskiye Gory 1-73, Moscow 119991, Russia
- Vytas Švedas
- Belozersky Institute of Physicochemical Biology, Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Leninskiye Gory 1-73, Moscow 119991, Russia
|
33
|
Assessment of data-assisted prediction by inclusion of crosslinking/mass-spectrometry and small angle X-ray scattering data in the 12th Critical Assessment of protein Structure Prediction experiment. Proteins 2017; 86 Suppl 1:215-227. [DOI: 10.1002/prot.25442] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 08/21/2017] [Revised: 11/16/2017] [Accepted: 12/10/2017] [Indexed: 12/26/2022]
|
34
|
Lopez T, Dalton K, Tomlinson A, Pande V, Frydman J. An information theoretic framework reveals a tunable allosteric network in group II chaperonins. Nat Struct Mol Biol 2017; 24:726-733. [PMID: 28741612 PMCID: PMC5986071 DOI: 10.1038/nsmb.3440] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 10/05/2016] [Accepted: 06/22/2017] [Indexed: 12/19/2022]
Abstract
ATP-dependent allosteric regulation of the ring-shaped group II chaperonins remains ill defined, in part because their complex oligomeric topology has limited the success of structural techniques in suggesting allosteric determinants. Further, their high sequence conservation has hindered the prediction of allosteric networks using mathematical covariation approaches. Here, we develop an information theoretic strategy that is robust to residue conservation and apply it to group II chaperonins. We identify a contiguous network of covarying residues that connects all nucleotide-binding pockets within each chaperonin ring. An interfacial residue between the networks of neighboring subunits controls positive cooperativity by communicating nucleotide occupancy within each ring. Strikingly, chaperonin allostery is tunable through single mutations at this position. Naturally occurring variants at this position that double the extent of positive cooperativity are less prevalent in nature. We propose that being less cooperative than attainable allows chaperonins to support robust folding over a wider range of metabolic conditions.
Affiliation(s)
- Tom Lopez
- Department of Biology, Stanford University, Stanford, California, USA
- Kevin Dalton
- Biophysics Program, Stanford University, Stanford, California, USA
- Anthony Tomlinson
- Department of Biology, Stanford University, Stanford, California, USA
- Vijay Pande
- Biophysics Program, Stanford University, Stanford, California, USA
- Department of Chemistry, Stanford University, Stanford, California, USA
- Judith Frydman
- Department of Biology, Stanford University, Stanford, California, USA
- Biophysics Program, Stanford University, Stanford, California, USA
|
35
|
Abstract
Co-evolution techniques were originally conceived to assist in protein structure prediction by inferring pairs of residues that share spatial proximity. However, the functional relationships that can be extrapolated from co-evolution have also proven to be useful in a wide array of structural bioinformatics applications. These techniques are a powerful way to extract structural and functional information in a sequence-rich world.
|
36
|
Fantini M, Malinverni D, De Los Rios P, Pastore A. New Techniques for Ancient Proteins: Direct Coupling Analysis Applied on Proteins Involved in Iron Sulfur Cluster Biogenesis. Front Mol Biosci 2017; 4:40. [PMID: 28664160 PMCID: PMC5471300 DOI: 10.3389/fmolb.2017.00040] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 03/19/2017] [Accepted: 05/24/2017] [Indexed: 12/01/2022] Open
Abstract
Direct coupling analysis (DCA) is a powerful statistical inference tool used to study protein evolution. It was introduced to predict protein folds and protein-protein interactions, and has also been applied to the prediction of entire interactomes. Here, we have used it to analyze three proteins of the iron-sulfur biogenesis machine, an essential metabolic pathway conserved in all organisms. We show that DCA can correctly reproduce structural features of the CyaY/frataxin family (a protein involved in the human disease Friedreich's ataxia) despite being based on the relatively small number of sequences allowed by its genomic distribution. This result gives us confidence in the method. Its application to the iron-sulfur cluster scaffold protein IscU, which has been suggested to function both as an ordered and a disordered form, allows us to distinguish evolutionary traces of the structured species, suggesting that, if present in the cell, the disordered form has not left evolutionary imprinting. We observe instead, for the first time, direct indications of how the protein can dimerize head-to-head and bind 4Fe4S clusters. Analysis of the alternative scaffold protein IscA provides strong support to a coordination of the cluster by a dimeric form rather than a tetramer, as previously suggested. Our analysis also suggests the presence in solution of a mixture of monomeric and dimeric species, and guides us to the prevalent one. Finally, we used DCA to analyze interactions between some of these proteins, and discuss the potentials and limitations of the method.
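For orientation, DCA is commonly formulated as a pairwise Potts model fitted to a multiple sequence alignment. The expression below uses notation that is standard in the DCA literature rather than anything stated in this abstract: $a_i$ is the residue at aligned position $i$, $L$ the alignment length, $h_i$ the single-site fields, $J_{ij}$ the pairwise couplings, and $Z$ the normalizing partition function.

```latex
P(a_1,\dots,a_L) \;=\; \frac{1}{Z}\,
\exp\!\Big( \sum_{i=1}^{L} h_i(a_i) \;+\; \sum_{1 \le i < j \le L} J_{ij}(a_i,a_j) \Big)
```

Under this formulation, position pairs with strong inferred couplings $J_{ij}$ are the ones typically interpreted as candidate contacts or interaction sites.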
Affiliation(s)
- Marco Fantini
- BioSNS, Faculty of Mathematical and Natural Sciences, Scuola Normale Superiore, Pisa, Italy
- Duccio Malinverni
- Institute of Physics, School of Basic Sciences, and Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Paolo De Los Rios
- Institute of Physics, School of Basic Sciences, and Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Annalisa Pastore
- Maurice Wohl Institute, King's College, London, United Kingdom; Molecular Medicine Department, University of Pavia, Pavia, Italy
|
37
|
Suqueli García MF, Castellote MA, Feingold SE, Corva PM. Characterization of a deletion in the Hsp70 cluster in the bovine reference genome. Anim Genet 2017; 48:377-385. [PMID: 28568840 DOI: 10.1111/age.12561] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Accepted: 03/08/2017] [Indexed: 11/27/2022]
Abstract
The 70 kilodalton heat shock proteins (Hsp70) are highly conserved molecular chaperones which have a crucial role in the stress response of the cell. In mammals, the Hsp70 proteins are encoded by a cluster of three genes: HSPA1A, HSPA1B and HSPA1L. In bovines, this cluster is located on chromosome 23 downstream of the major histocompatibility complex (BoLA). We detected inconsistencies in the location of markers on the Hsp70 genes reported in the literature that pointed to a potential deletion in the bovine reference genome UMD 3.1.1. An in silico analysis of the bovine genomic region of the Hsp70 cluster, using available information from public databases, confirmed the existence of a deletion of 11.1-kb spanning the HSPA1B gene and the intergenic region between HSPA1B and HSPA1A. Although we originally considered this an assembly error, it is most likely a particular condition of L1 Dominette 01449, the cow sequenced in the Bovine Genome Project. Moreover, we suggest a new classification of bovine Hsp70 sequences reported in NCBI and a reassignment of the location of SNPs from dbSNP that map to the deletion on BTA23. We also compared the location of selected transcription factor binding sites on the promoters of HSPA1A and HSPA1B. The results generated in the present work could be helpful to refine the reference genome of an important livestock species and also to understand the role and the regulation of the bovine Hsp70 genes.
Affiliation(s)
- M F Suqueli García
- Facultad de Ciencias Agrarias, Universidad Nacional de Mar del Plata, Unidad Integrada Balcarce, C.C. 276, 7620, Balcarce, Argentina
- M A Castellote
- Laboratorio de Agrobiotecnología, EEA Balcarce, Instituto Nacional de Tecnología Agropecuaria, Unidad Integrada Balcarce, C.C. 276, 7620, Balcarce, Argentina
- S E Feingold
- Laboratorio de Agrobiotecnología, EEA Balcarce, Instituto Nacional de Tecnología Agropecuaria, Unidad Integrada Balcarce, C.C. 276, 7620, Balcarce, Argentina
- P M Corva
- Facultad de Ciencias Agrarias, Universidad Nacional de Mar del Plata, Unidad Integrada Balcarce, C.C. 276, 7620, Balcarce, Argentina
|
38
|
Malinverni D, Jost Lopez A, De Los Rios P, Hummer G, Barducci A. Modeling Hsp70/Hsp40 interaction by multi-scale molecular simulations and coevolutionary sequence analysis. eLife 2017; 6. [PMID: 28498104 PMCID: PMC5519331 DOI: 10.7554/elife.23471] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Received: 11/22/2016] [Accepted: 05/10/2017] [Indexed: 01/01/2023] Open
Abstract
The interaction between the Heat Shock Proteins 70 and 40 is at the core of the ATPase regulation of the chaperone machinery that maintains protein homeostasis. However, the structural details of the interaction remain elusive and contrasting models have been proposed for the transient Hsp70/Hsp40 complexes. Here we combine molecular simulations based on both coarse-grained and atomistic models with coevolutionary sequence analysis to shed light on this problem by focusing on the bacterial DnaK/DnaJ system. The integration of these complementary approaches resulted in a novel structural model that rationalizes previous experimental observations. We identify an evolutionarily conserved interaction surface formed by helix II of the DnaJ J-domain and a structurally contiguous region of DnaK, involving lobe IIA of the nucleotide binding domain, the inter-domain linker, and the β-basket of the substrate binding domain.
Affiliation(s)
- Duccio Malinverni
- Laboratoire de Biophysique Statistique, Faculté de Sciences de Base, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Paolo De Los Rios
- Laboratoire de Biophysique Statistique, Faculté de Sciences de Base, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Gerhard Hummer
- Max Planck Institute of Biophysics, Frankfurt am Main, Germany; Institut für Biophysik, Johann Wolfgang Goethe Universität Frankfurt, Frankfurt am Main, Germany
- Alessandro Barducci
- Inserm, U1054, Montpellier, France; Université de Montpellier, CNRS, UMR 5048, Centre de Biochimie Structurale, Montpellier, France
|
39
|
Uguzzoni G, John Lovis S, Oteri F, Schug A, Szurmant H, Weigt M. Large-scale identification of coevolution signals across homo-oligomeric protein interfaces by direct coupling analysis. Proc Natl Acad Sci U S A 2017; 114:E2662-E2671. [PMID: 28289198 PMCID: PMC5380090 DOI: 10.1073/pnas.1615068114] [Citation(s) in RCA: 62] [Impact Index Per Article: 8.9] [Indexed: 12/21/2022] Open
Abstract
Proteins have evolved to perform diverse cellular functions, from serving as reaction catalysts to coordinating cellular propagation and development. Frequently, proteins do not exert their full potential as monomers but rather undergo concerted interactions as either homo-oligomers or with other proteins as hetero-oligomers. The experimental study of such protein complexes and interactions has been arduous. Theoretical structure prediction methods are an attractive alternative. Here, we investigate homo-oligomeric interfaces by tracing residue coevolution via the global statistical direct coupling analysis (DCA). DCA can accurately infer spatial adjacencies between residues. These adjacencies can be included as constraints in structure prediction techniques to predict high-resolution models. By taking advantage of the ongoing exponential growth of sequence databases, we go significantly beyond anecdotal cases of a few protein families and apply DCA to a systematic large-scale study of nearly 2,000 Pfam protein families with sufficient sequence information and structurally resolved homo-oligomeric interfaces. We find that large interfaces are commonly identified by DCA. We further demonstrate that DCA can differentiate between subfamilies with different binding modes within one large Pfam family. Sequence-derived contact information for the subfamilies proves sufficient to assemble accurate structural models of the diverse protein-oligomers. Thus, we provide an approach to investigate oligomerization for arbitrary protein families leading to structural models complementary to often-difficult experimental methods. Combined with ever more abundant sequential data, we anticipate that this study will be instrumental to allow the structural description of many heteroprotein complexes in the future.
Affiliation(s)
- Guido Uguzzoni
- Sorbonne Universités, Université Pierre-et-Marie-Curie Université Paris 06, CNRS, Biologie Computationnelle et Quantitative-Institut de Biologie Paris Seine, 75005 Paris, France
- Shalini John Lovis
- Steinbuch Centre for Computing, Karlsruhe Institute of Technology, 76344 Eggenstein-Leopoldshafen, Germany
- Francesco Oteri
- Sorbonne Universités, Université Pierre-et-Marie-Curie Université Paris 06, CNRS, Biologie Computationnelle et Quantitative-Institut de Biologie Paris Seine, 75005 Paris, France
- Alexander Schug
- Steinbuch Centre for Computing, Karlsruhe Institute of Technology, 76344 Eggenstein-Leopoldshafen, Germany
- Hendrik Szurmant
- Department of Basic Medical Sciences, College of Osteopathic Medicine of the Pacific, Western University of Health Sciences, Pomona, CA 91766
- Martin Weigt
- Sorbonne Universités, Université Pierre-et-Marie-Curie Université Paris 06, CNRS, Biologie Computationnelle et Quantitative-Institut de Biologie Paris Seine, 75005 Paris, France
|
40
|
Stetz G, Verkhivker GM. Computational Analysis of Residue Interaction Networks and Coevolutionary Relationships in the Hsp70 Chaperones: A Community-Hopping Model of Allosteric Regulation and Communication. PLoS Comput Biol 2017; 13:e1005299. [PMID: 28095400 PMCID: PMC5240922 DOI: 10.1371/journal.pcbi.1005299] [Citation(s) in RCA: 70] [Impact Index Per Article: 10.0] [Received: 07/01/2016] [Accepted: 12/06/2016] [Indexed: 12/28/2022] Open
Abstract
Allosteric interactions in the Hsp70 proteins are linked with their regulatory mechanisms and cellular functions. Despite significant progress in structural and functional characterization of the Hsp70 proteins, fundamental questions concerning modularity of the allosteric interaction networks and hierarchy of signaling pathways in the Hsp70 chaperones remained largely unexplored and poorly understood. In this work, we proposed an integrated computational strategy that combined atomistic and coarse-grained simulations with coevolutionary analysis and network modeling of the residue interactions. A novel aspect of this work is the incorporation of dynamic residue correlations and coevolutionary residue dependencies in the construction of allosteric interaction networks and signaling pathways. We found that functional sites involved in allosteric regulation of Hsp70 may be characterized by structural stability, proximity to global hinge centers and local structural environment that is enriched by highly coevolving flexible residues. These specific characteristics may be necessary for regulation of allosteric structural transitions and could distinguish regulatory sites from nonfunctional conserved residues. The observed confluence of dynamics correlations and coevolutionary residue couplings with global networking features may determine modular organization of allosteric interactions and dictate localization of key mediating sites. Community analysis of the residue interaction networks revealed that concerted rearrangements of local interacting modules at the inter-domain interface may be responsible for global structural changes and a population shift in the DnaK chaperone. The inter-domain communities in the Hsp70 structures harbor the majority of regulatory residues involved in allosteric signaling, suggesting that these sites could be integral to the network organization and coordination of structural changes. Using a network-based formalism of allostery, we introduced a community-hopping model of allosteric communication. Atomistic reconstruction of signaling pathways in the DnaK structures captured a direction-specific mechanism and molecular details of signal transmission that are fully consistent with the mutagenesis experiments. The results of our study reconciled structural and functional experiments from a network-centric perspective by showing that global properties of the residue interaction networks and coevolutionary signatures may be linked with specificity and diversity of allosteric regulation mechanisms.
The diversity of allosteric mechanisms in the Hsp70 proteins could range from modulation of the inter-domain interactions and conformational dynamics to fine-tuning of the Hsp70 interactions with co-chaperones. The goal of this study is to present a systematic computational analysis of the dynamic and evolutionary factors underlying allosteric structural transformations of the Hsp70 proteins. We investigated the relationship between functional dynamics, residue coevolution, and network organization of residue interactions in the Hsp70 proteins. The results of this study revealed that conformational dynamics of the Hsp70 proteins may be linked with coevolutionary propensities and mutual information dependencies of the protein residues. Modularity and connectivity of allosteric interactions in the Hsp70 chaperones are coordinated by stable functional sites that feature unique coevolutionary signatures and high network centrality. The emergence of the inter-domain communities that are coordinated by functional centers and include highly coevolving residues could facilitate structural transitions through cooperative reorganization of the local interacting modules. We determined that the differences in the modularity of the residue interactions and organization of coevolutionary networks in DnaK may be associated with variations in their allosteric mechanisms. The network signatures of the DnaK structures are characteristic of a population-shift allostery that allows for coordinated structural rearrangements of local communities. A dislocation of mediating centers and insufficient coevolutionary coupling between functional regions may render a reduced cooperativity and promote a limited entropy-driven allostery in the Sse1 chaperone that occurs without structural changes. The results of this study showed that a network-centric framework and a community-hopping model of allosteric communication pathways may provide novel insights into molecular and evolutionary principles of allosteric regulation in the Hsp70 proteins.
Affiliation(s)
- Gabrielle Stetz
- Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, California, United States of America
- Gennady M. Verkhivker
- Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, California, United States of America
- Chapman University School of Pharmacy, Irvine, California, United States of America
|
41
|
Conservation of coevolving protein interfaces bridges prokaryote-eukaryote homologies in the twilight zone. Proc Natl Acad Sci U S A 2016; 113:15018-15023. [PMID: 27965389 DOI: 10.1073/pnas.1611861114] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.0] [Indexed: 11/18/2022]
Abstract
Protein-protein interactions are fundamental for the proper functioning of the cell. As a result, protein interaction surfaces are subject to strong evolutionary constraints. Recent developments have shown that residue coevolution provides accurate predictions of heterodimeric protein interfaces from sequence information. So far these approaches have been limited to the analysis of families of prokaryotic complexes for which large multiple sequence alignments of homologous sequences can be compiled. We explore the hypothesis that coevolution points to structurally conserved contacts at protein-protein interfaces, which can be reliably projected to homologous complexes with distantly related sequences. We introduce a domain-centered protocol to study the interplay between residue coevolution and structural conservation of protein-protein interfaces. We show that sequence-based coevolutionary analysis systematically identifies residue contacts at prokaryotic interfaces that are structurally conserved at the interface of their eukaryotic counterparts. In turn, this allows the prediction of conserved contacts at eukaryotic protein-protein interfaces with high confidence using solely mutational patterns extracted from prokaryotic genomes. Even in the context of high divergence in sequence (the twilight zone), where standard homology modeling of protein complexes is unreliable, our approach provides sequence-based accurate information about specific details of protein interactions at the residue level. Selected examples of the application of prokaryotic coevolutionary analysis to the prediction of eukaryotic interfaces further illustrate the potential of this approach.
|
42
|
Levy RM, Haldane A, Flynn WF. Potts Hamiltonian models of protein co-variation, free energy landscapes, and evolutionary fitness. Curr Opin Struct Biol 2016; 43:55-62. [PMID: 27870991 DOI: 10.1016/j.sbi.2016.11.004] [Citation(s) in RCA: 56] [Impact Index Per Article: 7.0] [Received: 08/19/2016] [Accepted: 11/03/2016] [Indexed: 11/17/2022]
Abstract
Potts Hamiltonian models of protein sequence co-variation are statistical models constructed from the pair correlations observed in a multiple sequence alignment (MSA) of a protein family. These models are powerful because they capture higher order correlations induced by mutations evolving under constraints and help quantify the connections between protein sequence, structure, and function maintained through evolution. We review recent work with Potts models to predict protein structure and sequence-dependent conformational free energy landscapes, to survey protein fitness landscapes and to explore the effects of epistasis on fitness. We also comment on the numerical methods used to infer these models for each application.
Affiliation(s)
- Ronald M Levy
- Center for Biophysics and Computational Biology, Department of Chemistry, and Institute for Computational Molecular Science, Temple University, Philadelphia, PA 19122, United States
- Allan Haldane
- Center for Biophysics and Computational Biology, Department of Chemistry, and Institute for Computational Molecular Science, Temple University, Philadelphia, PA 19122, United States
- William F Flynn
- Center for Biophysics and Computational Biology, Department of Chemistry, and Institute for Computational Molecular Science, Temple University, Philadelphia, PA 19122, United States; Department of Physics and Astronomy, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, United States
|
43
|
Cheng RR, Nordesjö O, Hayes RL, Levine H, Flores SC, Onuchic JN, Morcos F. Connecting the Sequence-Space of Bacterial Signaling Proteins to Phenotypes Using Coevolutionary Landscapes. Mol Biol Evol 2016; 33:3054-3064. [PMID: 27604223 PMCID: PMC5100047 DOI: 10.1093/molbev/msw188] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Indexed: 01/08/2023]
Abstract
Two-component signaling (TCS) is the primary means by which bacteria sense and respond to the environment. TCS involves two partner proteins working in tandem, which interact to perform cellular functions while limiting interactions with non-partners (i.e., cross-talk). We construct a Potts model for TCS that can quantitatively predict how mutating amino acid identities affect the interaction between TCS partners and non-partners. The parameters of this model are inferred directly from protein sequence data. This approach drastically reduces the computational complexity of exploring the sequence-space of TCS proteins. As a stringent test, we compare its predictions to a recent comprehensive mutational study, which characterized the functionality of 20^4 mutational variants of the PhoQ kinase in Escherichia coli. We find that our best predictions accurately reproduce the amino acid combinations found in experiment, which enable functional signaling with its partner PhoP. These predictions demonstrate the evolutionary pressure to preserve the interaction between TCS partners as well as prevent unwanted cross-talk. Further, we calculate the mutational change in the binding affinity between PhoQ and PhoP, providing an estimate of the amount of destabilization needed to disrupt TCS.
Affiliation(s)
- R R Cheng
- Center for Theoretical Biological Physics, Rice University, Houston, TX
- O Nordesjö
- Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
- R L Hayes
- Department of Biophysics, University of Michigan, Ann Arbor, MI
- H Levine
- Center for Theoretical Biological Physics, Rice University, Houston, TX; Department of Bioengineering, Rice University, Houston, TX
- S C Flores
- Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
- J N Onuchic
- Center for Theoretical Biological Physics, Rice University, Houston, TX; Department of Physics and Astronomy, Rice University, Houston, TX; Department of Chemistry, and Biosciences, Rice University, Houston, TX
- F Morcos
- Department of Biological Sciences and Center for Systems Biology, University of Texas at Dallas, Dallas, TX
|
44
|
Stetz G, Verkhivker GM. Probing Allosteric Inhibition Mechanisms of the Hsp70 Chaperone Proteins Using Molecular Dynamics Simulations and Analysis of the Residue Interaction Networks. J Chem Inf Model 2016; 56:1490-517. [DOI: 10.1021/acs.jcim.5b00755] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Indexed: 12/28/2022]
Affiliation(s)
- Gabrielle Stetz
- Graduate Program in Computational and Data Sciences, Department of Computational Sciences, Schmid College of Science and Technology, Chapman University, One University Drive, Orange, California 92866, United States
- Gennady M. Verkhivker
- Graduate Program in Computational and Data Sciences, Department of Computational Sciences, Schmid College of Science and Technology, Chapman University, One University Drive, Orange, California 92866, United States
- Chapman University School of Pharmacy, Irvine, California 92618, United States
|
45
|
Alderson TR, Kim JH, Markley JL. Dynamical Structures of Hsp70 and Hsp70-Hsp40 Complexes. Structure 2016; 24:1014-30. [PMID: 27345933 DOI: 10.1016/j.str.2016.05.011] [Citation(s) in RCA: 77] [Impact Index Per Article: 9.6] [Received: 02/01/2016] [Revised: 05/05/2016] [Accepted: 05/10/2016] [Indexed: 12/25/2022]
Abstract
Protein misfolding and aggregation are pathological events that place a significant amount of stress on the maintenance of protein homeostasis (proteostasis). For prevention and repair of protein misfolding and aggregation, cells are equipped with robust mechanisms that mainly rely on molecular chaperones. Two classes of molecular chaperones, heat shock protein 70 kDa (Hsp70) and Hsp40, recognize and bind to misfolded proteins, preventing their toxic biomolecular aggregation and enabling refolding or targeted degradation. Here, we review the current state of structural biology of Hsp70 and Hsp40-Hsp70 complexes and examine the link between their structures, dynamics, and functions. We highlight the power of nuclear magnetic resonance spectroscopy to untangle complex relationships behind molecular chaperones and their mechanism(s) of action.
Affiliation(s)
- Thomas Reid Alderson
- Department of Chemistry, University of Oxford, South Parks Road, Oxford OX1 3TA, UK; Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD 20892, USA
- Jin Hae Kim
- National Magnetic Resonance Facility at Madison, Biochemistry Department, University of Wisconsin-Madison, Madison, WI 53706, USA
- John Lute Markley
- National Magnetic Resonance Facility at Madison, Biochemistry Department, University of Wisconsin-Madison, Madison, WI 53706, USA
|
46
|
Neuwald AF. Gleaning structural and functional information from correlations in protein multiple sequence alignments. Curr Opin Struct Biol 2016; 38:1-8. [PMID: 27179293 DOI: 10.1016/j.sbi.2016.04.006] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 12/18/2015] [Revised: 04/28/2016] [Accepted: 04/29/2016] [Indexed: 10/24/2022]
Abstract
The availability of vast amounts of protein sequence data facilitates detection of subtle statistical correlations due to imposed structural and functional constraints. Recent breakthroughs using Direct Coupling Analysis (DCA) and related approaches have tapped into correlations believed to be due to compensatory mutations. This has yielded some remarkable results, including substantially improved prediction of protein intra- and inter-domain 3D contacts, of membrane and globular protein structures, of substrate binding sites, and of protein conformational heterogeneity. A complementary approach is Bayesian Partitioning with Pattern Selection (BPPS), which partitions related proteins into hierarchically-arranged subgroups based on correlated residue patterns. These correlated patterns are presumably due to structural and functional constraints associated with evolutionary divergence rather than to compensatory mutations. Hence joint application of DCA- and BPPS-based approaches should help sort out the structural and functional constraints contributing to sequence correlations.
Affiliation(s)
- Andrew F Neuwald
- Institute for Genome Sciences and Department of Biochemistry & Molecular Biology, University of Maryland School of Medicine, 801 West Baltimore St., BioPark II, Room 617, Baltimore, MD 21201, United States
|
47
|
Dual role of ribosome-associated chaperones in prion formation and propagation. Curr Genet 2016; 62:677-685. [PMID: 26968706 DOI: 10.1007/s00294-016-0586-2] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 02/05/2016] [Revised: 02/25/2016] [Accepted: 02/27/2016] [Indexed: 01/20/2023]
Abstract
Chaperones of the diverse ubiquitous Hsp70 family are involved in the regulation of ordered self-perpetuating protein aggregates (amyloids and prions), implicated in both devastating diseases and protein-based inheritance. Yeast ribosome-associated chaperone complex (RAC), composed of the Hsp40 protein Zuo1 and non-canonical Hsp70 protein Ssz1, mediates association of the Hsp70 chaperone Ssb with translating ribosomes. Ssb participates in co-translational protein folding, regulation of premature translation termination, and ribosome biogenesis. The loss of Ssb or disruption of RAC results in the increased formation of [PSI+], a prion form of the translation termination factor Sup35 (eRF3). This implicates co-translational protein misfolding in de novo prion formation. However, RAC disruption also destabilizes pre-existing [PSI+] prions, as Ssb, released from ribosomes to the cytosol in the absence of RAC, antagonizes the function of the major cytosolic chaperone, Ssa, in prion propagation. The mechanism of the Ssa/Ssb antagonism is currently under investigation and may include a competition for substrates and/or co-chaperones. Notably, yeast cells with wild-type RAC also release Ssb to the cytosol in certain unfavorable growth conditions, and Ssb contributes to increased prion loss in these conditions. This indicates that the circulation of Ssb between the ribosome and cytosol may serve as a physiological regulator of the formation and propagation of self-perpetuating protein aggregates. Indeed, RAC and Ssb modulate toxicity of some aggregating proteins in yeast. Mammalian cells lack the Ssb ortholog but contain a RAC counterpart, apparently recruiting other Hsp70 protein(s). Thus, amyloid modulation by ribosome-associated chaperones could be applicable beyond yeast.
|
48
|
McCallister C, Kdeiss B, Nikolaidis N. Biochemical characterization of the interaction between HspA1A and phospholipids. Cell Stress Chaperones 2016; 21:41-53. [PMID: 26342809 PMCID: PMC4679732 DOI: 10.1007/s12192-015-0636-6] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 03/11/2015] [Revised: 08/25/2015] [Accepted: 08/31/2015] [Indexed: 01/15/2023]
Abstract
Seventy-kilodalton heat shock proteins (Hsp70s) are molecular chaperones essential for maintaining cellular homeostasis. Apart from their indispensable roles in protein homeostasis, specific Hsp70s localize at the plasma membrane and bind to specific lipids. The interaction of Hsp70s with lipids has direct physiological outcomes including lysosomal rescue, microautophagy, and promotion of cell apoptosis. Despite these essential functions, the Hsp70-lipid interactions remain largely uncharacterized. In this study, we characterized the interaction of HspA1A, an inducible Hsp70, with five phospholipids. We first used high concentrations of potassium and established that HspA1A embeds in membranes when bound to all anionic lipids tested. Furthermore, we found that protein insertion is enhanced by increasing the saturation level of the lipids. Next, we determined that the nucleotide-binding domain (NBD) of the protein binds to lipids quantitatively more than the substrate-binding domain (SBD). However, for all lipids tested, the full-length protein is necessary for embedding. We also used calcium and reaction buffers equilibrated at different pH values and determined that electrostatic interactions alone may not fully explain the association of HspA1A with lipids. We then determined that lipid binding is inhibited by nucleotide-binding, but it is unaffected by protein-substrate binding. These results suggest that the HspA1A lipid-association is specific, depends on the physicochemical properties of the lipid, and is mediated by multiple molecular forces. These mechanistic details of the Hsp70-lipid interactions establish a framework of possible physiological functions as they relate to chaperone regulation and localization.
Affiliation(s)
- Chelsea McCallister
- Department of Biological Science, Center for Applied Biotechnology Studies, and Center for Computational and Applied Mathematics, California State University, Fullerton, Fullerton, CA, 92834, USA
- Brianna Kdeiss
- Department of Biological Science, Center for Applied Biotechnology Studies, and Center for Computational and Applied Mathematics, California State University, Fullerton, Fullerton, CA, 92834, USA
- Nikolas Nikolaidis
- Department of Biological Science, Center for Applied Biotechnology Studies, and Center for Computational and Applied Mathematics, California State University, Fullerton, Fullerton, CA, 92834, USA
|
49
|
Nillegoda NB, Bukau B. Metazoan Hsp70-based protein disaggregases: emergence and mechanisms. Front Mol Biosci 2015; 2:57. [PMID: 26501065 PMCID: PMC4598581 DOI: 10.3389/fmolb.2015.00057] [Citation(s) in RCA: 87] [Impact Index Per Article: 9.7] [Received: 07/13/2015] [Accepted: 09/22/2015] [Indexed: 11/13/2022]
Abstract
Proteotoxic stresses and aging cause breakdown of cellular protein homeostasis, allowing misfolded proteins to form aggregates, which dedicated molecular machines have evolved to solubilize. In bacteria, fungi, protozoa and plants protein disaggregation involves an Hsp70•J-protein chaperone system, which loads and activates a powerful AAA+ ATPase (Hsp100) disaggregase onto protein aggregate substrates. Metazoans lack cytosolic and nuclear Hsp100 disaggregases but still eliminate protein aggregates. This longstanding puzzle of protein quality control is now resolved. Robust protein disaggregation activity recently shown for the metazoan Hsp70-based disaggregases relies instead on a crucial cooperation between two J-protein classes and interaction with the Hsp110 co-chaperone. An expanding multiplicity of Hsp70 and J-protein family members in metazoan cells facilitates different configurations of this Hsp70-based disaggregase allowing unprecedented versatility and specificity in protein disaggregation. Here we review the architecture, operation, and adaptability of the emerging metazoan disaggregation system and discuss how this evolved.
Affiliation(s)
- Nadinath B Nillegoda
- Center for Molecular Biology (ZMBH) of the University of Heidelberg and German Cancer Research Center (DKFZ), DKFZ-ZMBH Alliance Heidelberg, Germany
- Bernd Bukau
- Center for Molecular Biology (ZMBH) of the University of Heidelberg and German Cancer Research Center (DKFZ), DKFZ-ZMBH Alliance Heidelberg, Germany
|