1
Rosignoli S, Lustrino E, Di Silverio I, Paiardini A. Making Use of Averaging Methods in MODELLER for Protein Structure Prediction. Int J Mol Sci 2024; 25:1731. [PMID: 38339009 PMCID: PMC10855553 DOI: 10.3390/ijms25031731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2023] [Revised: 01/23/2024] [Accepted: 01/29/2024] [Indexed: 02/12/2024] Open
Abstract
Recent advances in protein structure prediction, driven by AlphaFold 2 and machine learning, demonstrate proficiency in static structures but encounter challenges in capturing essential dynamic features crucial for understanding biological function. In this context, homology-based modeling emerges as a cost-effective and computationally efficient alternative. The MODELLER (version 10.5, accessed on 30 November 2023) algorithm can be harnessed for this purpose since it computes intermediate models during simulated annealing, enabling the exploration of attainable configurational states and energies while minimizing its objective function. There have been a few attempts to date to improve the models generated by its algorithm, and in particular, there is no literature regarding the implementation of an averaging procedure involving the intermediate models in the MODELLER algorithm. In this study, we examined MODELLER's output using 225 target-template pairs, extracting the best representatives of intermediate models. Applying an averaging procedure to the selected intermediate structures based on statistical potentials, we aimed to determine: (1) whether averaging improves the quality of structural models during the building phase; (2) if ranking by statistical potentials reliably selects the best models, leading to improved final model quality; (3) whether using a single template versus multiple templates affects the averaging approach; (4) whether the "ensemble" nature of the MODELLER building phase can be harnessed to capture low-energy conformations in holo structures modeling. Our findings indicate that while improvements typically fall short of a few decimal points in the model evaluation metric, a notable fraction of configurations exhibit slightly higher similarity to the native structure than MODELLER's proposed final model. 
The averaging-building procedure proves particularly beneficial in (1) regions of low sequence identity between the target and template(s), the most challenging aspect of homology modeling; (2) holo protein conformations generation, an area in which MODELLER and related tools usually fall short of the expected performance.
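As an illustration of the averaging idea described in this abstract, the following minimal Python sketch ranks a set of intermediate models by a score and averages the coordinates of the best ones. It is not the authors' implementation: the function name `average_top_models`, the convention that a lower score is better, and the requirement that all models are already superposed and share the same atom order are assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]
Model = List[Coord]  # one (x, y, z) per atom; same atom order in every model (assumption)


def average_top_models(models: Sequence[Model], scores: Sequence[float], k: int) -> Model:
    """Average the coordinates of the k best-scoring models.

    Assumes lower scores are better and that the models are already
    superposed in a common reference frame (hypothetical conventions).
    """
    if len(models) != len(scores) or not 0 < k <= len(models):
        raise ValueError("need one score per model and 0 < k <= number of models")

    # Rank model indices by score, best (lowest) first, and keep the top k.
    ranked = sorted(range(len(models)), key=lambda i: scores[i])[:k]

    n_atoms = len(models[ranked[0]])
    if any(len(models[i]) != n_atoms for i in ranked):
        raise ValueError("all selected models must have the same number of atoms")

    averaged: Model = []
    for atom in range(n_atoms):
        # Mean position of this atom across the selected models.
        averaged.append(tuple(
            sum(models[i][atom][axis] for i in ranked) / k for axis in range(3)
        ))
    return averaged


# Toy data: three two-atom "models" with made-up scores.
toy_models = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(2.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
    [(100.0, 0.0, 0.0), (100.0, 0.0, 0.0)],
]
toy_scores = [-10.0, -8.0, 5.0]
consensus = average_top_models(toy_models, toy_scores, k=2)
```

Plain coordinate averaging can distort bond lengths and angles, so in practice an averaged structure would likely need some geometric regularization before evaluation; that step is outside this sketch.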
Affiliation(s)
- Alessandro Paiardini
- Department of Biochemical Sciences, Sapienza University of Rome, 00185 Rome, Italy; (S.R.); (E.L.); (I.D.S.)
2
Zea DJ, Teppa E, Marino-Buslje C. Easy Not Easy: Comparative Modeling with High-Sequence Identity Templates. Methods Mol Biol 2023; 2627:83-100. [PMID: 36959443 DOI: 10.1007/978-1-0716-2974-1_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/25/2023]
Abstract
Homology modeling is the most common technique to build structural models of a target protein based on the structure of proteins with high-sequence identity and available high-resolution structures. This technique is based on the idea that protein structure shows fewer changes than sequence through evolution. While in this scenario single mutations would minimally perturb the structure, experimental evidence shows otherwise: proteins with high conformational diversity impose a limit on the paradigm of comparative modeling, as the same protein sequence can adopt dissimilar three-dimensional structures. These cases present challenges for modeling; at first glance, they may seem to be easy cases, but they have a complexity that is not evident at the sequence level. In this chapter, we address the following questions: Why should we care about conformational diversity? How can conformational diversity be considered in a practical way when doing template-based modeling?
Affiliation(s)
- Diego Javier Zea
- Laboratory of Computational and Quantitative Biology, LCQB, UMR 7238 CNRS, IBPS, Sorbonne Université, Paris, France
- Elin Teppa
- Toulouse Biotechnology Institute, TBI, Université de Toulouse, CNRS, INRA, INSA, Toulouse, France
3
Iyer M, Jaroszewski L, Sedova M, Godzik A. What the protein data bank tells us about the evolutionary conservation of protein conformational diversity. Protein Sci 2022; 31:e4325. [PMID: 35762711 PMCID: PMC9207624 DOI: 10.1002/pro.4325] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2022] [Revised: 03/29/2022] [Accepted: 04/06/2022] [Indexed: 11/09/2022]
Abstract
Proteins sample a multitude of different conformations by undergoing small- and large-scale conformational changes that are often intrinsic to their functions. Information about these changes is often captured in the Protein Data Bank by the apparently redundant deposition of independent structural solutions of identical proteins. Here, we mine these data to examine the conservation of large-scale conformational changes between homologous proteins. This is important for both practical reasons, such as predicting alternative conformations of a protein by comparative modeling, and conceptual reasons, such as understanding the extent of conservation of different features in evolution. To study this question, we introduce a novel approach to compare conformational changes between proteins by the comparison of their difference distance maps (DDMs). We found that proteins undergoing similar conformational changes have similar DDMs and that this similarity could be quantified by the correlation between the DDMs. By comparing the DDMs of homologous protein pairs, we found that large-scale conformational changes show a high level of conservation across a broad range of sequence identities. This shows that conformational space is usually conserved between homologs, even relatively distant ones.
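The comparison described in this abstract can be sketched in a few lines of Python. The example below builds a difference distance map (DDM) for two conformations of a protein and correlates the DDMs of two proteins. It is only a sketch under stated assumptions: the function names are hypothetical, each protein is reduced to one point per residue, the two proteins are assumed to be already aligned to the same set of equivalent positions, and Pearson correlation is assumed because the abstract does not say which correlation measure was used.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]
Matrix = List[List[float]]


def distance_map(coords: Sequence[Coord]) -> Matrix:
    """All-against-all Euclidean distances between residue positions."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def difference_distance_map(conf_a: Sequence[Coord], conf_b: Sequence[Coord]) -> Matrix:
    """DDM: element-wise difference between the distance maps of two conformations."""
    if len(conf_a) != len(conf_b):
        raise ValueError("both conformations must have the same number of residues")
    da, db = distance_map(conf_a), distance_map(conf_b)
    n = len(da)
    return [[da[i][j] - db[i][j] for j in range(n)] for i in range(n)]


def ddm_correlation(ddm1: Matrix, ddm2: Matrix) -> float:
    """Pearson correlation over the upper triangles of two equally sized DDMs."""
    n = len(ddm1)
    if n != len(ddm2):
        raise ValueError("DDMs must have the same size (aligned positions)")
    # The maps are symmetric with a zero diagonal, so the upper triangle suffices.
    x = [ddm1[i][j] for i in range(n) for j in range(i + 1, n)]
    y = [ddm2[i][j] for i in range(n) for j in range(i + 1, n)]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant DDM")
    return cov / math.sqrt(var_x * var_y)


# Toy data: protein 2 undergoes the same bending motion as protein 1, scaled by 2.
p1_open = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
p1_closed = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
p2_open = [(2 * x, 2 * y, 2 * z) for x, y, z in p1_open]
p2_closed = [(2 * x, 2 * y, 2 * z) for x, y, z in p1_closed]

similarity = ddm_correlation(
    difference_distance_map(p1_open, p1_closed),
    difference_distance_map(p2_open, p2_closed),
)
```

Because DDMs are built from internal distances, they do not depend on how the structures are superposed, which is one reason they are convenient for comparing conformational changes across different proteins.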
Affiliation(s)
- Mallika Iyer
- Graduate School of Biomedical Sciences, Sanford Burnham Prebys Medical Discovery Institute, La Jolla, California, USA
- Lukasz Jaroszewski
- Biosciences Division, University of California Riverside School of Medicine, Riverside, California, USA
- Mayya Sedova
- Biosciences Division, University of California Riverside School of Medicine, Riverside, California, USA
- Adam Godzik
- Biosciences Division, University of California Riverside School of Medicine, Riverside, California, USA
4
Masrati G, Landau M, Ben-Tal N, Lupas A, Kosloff M, Kosinski J. Integrative Structural Biology in the Era of Accurate Structure Prediction. J Mol Biol 2021; 433:167127. [PMID: 34224746 DOI: 10.1016/j.jmb.2021.167127] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 05/17/2021] [Revised: 06/28/2021] [Accepted: 06/28/2021] [Indexed: 11/16/2022]
Abstract
Characterizing the three-dimensional structure of macromolecules is central to understanding their function. Traditionally, structures of proteins and their complexes have been determined using experimental techniques such as X-ray crystallography, NMR, or cryo-electron microscopy, applied individually or in an integrative manner. Meanwhile, however, computational methods for protein structure prediction have been improving their accuracy, gradually, then suddenly, with the breakthrough advance by AlphaFold2, whose models of monomeric proteins are often as accurate as experimental structures. This breakthrough foreshadows a new era of computational methods that can build accurate models for most monomeric proteins. Here, we envision how such accurate modeling methods can combine with experimental structural biology techniques, enhancing integrative structural biology. We highlight the challenges that arise when considering multiple structural conformations, protein complexes, and polymorphic assemblies. These challenges will motivate further developments, both in modeling programs and in methods to solve experimental structures, towards better and quicker investigation of structure-function relationships.
Affiliation(s)
- Gal Masrati
- Department of Biochemistry and Molecular Biology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 6997801, Israel
- Meytal Landau
- Department of Biology, Technion-Israel Institute of Technology, Haifa 3200003, Israel; European Molecular Biology Laboratory (EMBL), Hamburg 22607, Germany
- Nir Ben-Tal
- Department of Biochemistry and Molecular Biology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 6997801, Israel
- Andrei Lupas
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
- Mickey Kosloff
- Department of Human Biology, Faculty of Natural Sciences, University of Haifa, 199 Aba Khoushy Ave., Mt. Carmel, 3498838 Haifa, Israel
- Jan Kosinski
- European Molecular Biology Laboratory (EMBL), Hamburg 22607, Germany; Centre for Structural Systems Biology (CSSB), Hamburg 22607, Germany; Structural and Computational Biology Unit, European Molecular Biology Laboratory, Meyerhofstrasse 1, 69117 Heidelberg, Germany
5
Searching protein space for ancient sub-domain segments. Curr Opin Struct Biol 2021; 68:105-112. [PMID: 33476896 DOI: 10.1016/j.sbi.2020.11.006] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 10/11/2020] [Accepted: 11/29/2020] [Indexed: 01/08/2023]
Abstract
Evolutionary processes that formed the current protein universe left their traces, among them homologous segments that recur, or are 'reused,' in multiple proteins. These reused segments, called 'themes,' can be found at various scales, the best known of which is the domain. Yet, recent studies have begun to focus on the evolutionary insights that can be derived from sub-domain-scale themes, which are candidates for traces of more ancient events. Characterizing these may provide clues to the emergence of domains. Particularly interesting are themes that are reused across dissimilar contexts, that is, where the rest of the protein domain differs. We survey computational studies identifying reused themes within different contexts at the sub-domain level.
6
Vetrivel I, de Brevern AG, Cadet F, Srinivasan N, Offmann B. Structural variations within proteins can be as large as variations observed across their homologues. Biochimie 2019; 167:162-170. [PMID: 31560932 DOI: 10.1016/j.biochi.2019.09.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2019] [Accepted: 09/18/2019] [Indexed: 10/26/2022]
Abstract
Understanding the structural plasticity of proteins is key to understanding the intricacies of their functions and mechanistic basis. In the current study, we analyzed the available multiple crystal structures of the same protein for structural differences. For this purpose we used a previously established abstraction of protein structures referred to as Protein Blocks (PBs). We also characterized the nature of the structural variations for a few proteins using molecular dynamics simulations. In both cases, the structural variations were summarized in the form of substitution matrices of PBs. We show that certain conformational states are preferably replaced by other specific conformational states. Interestingly, these structural variations are highly similar to those previously observed across structures of homologous proteins (r² = 0.923) or across the ensemble of conformations from NMR data (r² = 0.919). Thus our study quantitatively shows that overall trends of structural changes in a given protein are nearly identical to the trends of structural differences that occur in the topologically equivalent positions in homologous proteins. Specific case studies are used to illustrate the nature of these structural variations.
Affiliation(s)
- Iyanar Vetrivel
- Université de Nantes, UFIP UMR 6286 CNRS, UFR Sciences et Techniques, 2 Chemin de La Houssinière, Nantes, France
- Alexandre G de Brevern
- INSERM UMR_S 1134, DSIMB Team, Laboratory of Excellence, GR-Ex, Univ Paris Diderot, Univ Sorbonne Paris Cité, INTS, 6 Rue Alexandre Cabanel, Paris, France
- Frédéric Cadet
- University of Paris, UMR_S1134, BIGR, Inserm, F-75015, Paris, France; DSIMB, UMR_S1134, BIGR, Inserm, Laboratory of Excellence GR-Ex, Faculty of Sciences and Technology, University of La Reunion, F-97715, Saint-Denis, France; PEACCEL, Protein Engineering Accelerator, 6 Square Albin Cachot, Box 42, 75013, Paris, France
- Bernard Offmann
- Université de Nantes, UFIP UMR 6286 CNRS, UFR Sciences et Techniques, 2 Chemin de La Houssinière, Nantes, France
7
Developments in integrative modeling with dynamical interfaces. Curr Opin Struct Biol 2019; 56:11-17. [DOI: 10.1016/j.sbi.2018.10.007] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 09/21/2018] [Revised: 10/26/2018] [Accepted: 10/27/2018] [Indexed: 11/19/2022]
8
Navigating Among Known Structures in Protein Space. Methods Mol Biol 2018. [PMID: 30298400 DOI: 10.1007/978-1-4939-8736-8_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
Present-day protein space is the result of 3.7 billion years of evolution, constrained by the underlying physicochemical qualities of the proteins. It is difficult to differentiate between evolutionary traces and effects of physicochemical constraints. Nonetheless, as a rule of thumb, instances of structural reuse, or focusing on structural similarity, are likely attributable to physicochemical constraints, whereas sequence reuse, or focusing on sequence similarity, may be more indicative of evolutionary relationships. Both types of relationships have been studied and can provide meaningful insights to protein biophysics and evolution, which in turn can lead to better algorithms for protein search, annotation, and maybe even design.
In broad strokes, studies of protein space vary in the entities they represent, the similarity measure comparing these entities, and the representation used. The entities can be, for example, protein chains, domains, supra-domains, or smaller protein sub-parts denoted themes. The measures of similarity between the entities can be based on sequence, structure, function, or any combination of these. The representation can be global, encompassing the whole space, or local, focusing on a particular region surrounding protein(s) of interest. Global representations include lists of grouped proteins, protein networks, and maps. Networks are the abstraction that is derived most directly from the similarity data: each node is the protein entity (e.g., a domain), and edges connect similar domains. Selecting the entities, the similarity measure, and the abstraction are three intertwined decisions: the similarity measures allow us to identify the entities, and the selection of entities influences what is a meaningful similarity measure. Similarly, we seek entities that are related to each other in a way for which a simple representation describes their relationships succinctly and accurately.
This chapter will cover studies that rely on different entities, similarity measures, and a range of representations to better understand protein structure space. Scholars may use publicly available navigators offering a global representation, and in particular the hierarchical classifications SCOP, CATH, and ECOD, or a local representation, which encompass structural alignment algorithms. Alternatively, scholars can configure their own navigator using existing tools. To demonstrate this DIY (do it yourself) approach for navigating in protein space, we investigate substrate-binding proteins. By presenting sequence similarities among this large and diverse protein family as a network, we can infer that one member (pdb ID 4ntl; of yet unknown function) may bind methionine and suggest a putative binding mechanism.
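The network abstraction described in this abstract (nodes are protein entities, edges connect similar ones) can be illustrated with a short Python sketch. Everything specific here is an assumption made for the example: the entity labels, the similarity scores, the 0-to-1 score scale, the threshold value, and the function names are hypothetical and are not taken from the chapter.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple


def build_network(
    similarities: Iterable[Tuple[str, str, float]], threshold: float
) -> Dict[str, Set[str]]:
    """Adjacency sets: an edge joins two entities whose similarity >= threshold."""
    graph: Dict[str, Set[str]] = {}
    for a, b, score in similarities:
        # Register both entities as nodes even if the pair is below the threshold.
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph: Dict[str, Set[str]]) -> List[Set[str]]:
    """Group nodes that are linked directly or through intermediate neighbors."""
    seen: Set[str] = set()
    components: List[Set[str]] = []
    for node in graph:
        if node in seen:
            continue
        component: Set[str] = set()
        queue = deque([node])
        seen.add(node)
        while queue:  # breadth-first traversal from the unvisited node
            current = queue.popleft()
            component.add(current)
            for neighbor in graph[current]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


# Hypothetical pairwise similarity scores between five made-up entities.
pairs = [
    ("A", "B", 0.9),
    ("B", "C", 0.8),
    ("C", "D", 0.2),
    ("D", "E", 0.7),
]
network = build_network(pairs, threshold=0.5)
groups = connected_components(network)
```

In a network like this, an uncharacterized entity that falls in the same group as well-annotated ones becomes a candidate for sharing their function, which is the kind of inference the abstract describes for substrate-binding proteins. The choice of threshold strongly affects which groups appear, so it would need to be justified for real data.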
9
Sameach H, Narunsky A, Azoulay-Ginsburg S, Gevorkyan-Aiapetov L, Zehavi Y, Moskovitz Y, Juven-Gershon T, Ben-Tal N, Ruthstein S. Structural and Dynamics Characterization of the MerR Family Metalloregulator CueR in its Repression and Activation States. Structure 2017; 25:988-996.e3. [PMID: 28578875 DOI: 10.1016/j.str.2017.05.004] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 02/27/2017] [Revised: 04/12/2017] [Accepted: 05/05/2017] [Indexed: 10/19/2022]
Abstract
CueR (Cu export regulator) is a metalloregulator protein that "senses" Cu(I) ions with very high affinity, thereby stimulating DNA binding and the transcription activation of two other metalloregulator proteins. The crystal structures of CueR when unbound or bound to DNA and a metal ion are very similar to each other, and the role of CueR and Cu(I) in initiating the transcription has not been fully understood yet. Using double electron-electron resonance (DEER) measurements and structure modeling, we investigate the conformational changes that CueR undergoes upon binding Cu(I) and DNA in solution. We observe three distinct conformations, corresponding to apo-CueR, DNA-bound CueR in the absence of Cu(I) (the "repression" state), and CueR-Cu(I)-DNA (the "activation" state). We propose a detailed structural mechanism underlying CueR's regulation of the transcription process. The mechanism explicitly shows the dependence of CueR activity on copper, thereby revealing the important negative feedback mechanism essential for regulating the intracellular copper concentration.
Affiliation(s)
- Hila Sameach
- The Chemistry Department, Faculty of Exact Sciences, Bar Ilan University, Ramat-Gan 5290002, Israel
- Aya Narunsky
- Department of Biochemistry and Molecular Biochemistry, George S. Wise Faculty of Life sciences, Tel Aviv University, Ramat Aviv 69978, Israel
- Salome Azoulay-Ginsburg
- The Chemistry Department, Faculty of Exact Sciences, Bar Ilan University, Ramat-Gan 5290002, Israel
- Lada Gevorkyan-Aiapetov
- The Chemistry Department, Faculty of Exact Sciences, Bar Ilan University, Ramat-Gan 5290002, Israel
- Yonathan Zehavi
- The Mina and Everard Goodman Faculty of Life Sciences, Bar Ilan University, Ramat-Gan 5290002, Israel
- Yoni Moskovitz
- The Chemistry Department, Faculty of Exact Sciences, Bar Ilan University, Ramat-Gan 5290002, Israel
- Tamar Juven-Gershon
- The Mina and Everard Goodman Faculty of Life Sciences, Bar Ilan University, Ramat-Gan 5290002, Israel
- Nir Ben-Tal
- Department of Biochemistry and Molecular Biochemistry, George S. Wise Faculty of Life sciences, Tel Aviv University, Ramat Aviv 69978, Israel
- Sharon Ruthstein
- The Chemistry Department, Faculty of Exact Sciences, Bar Ilan University, Ramat-Gan 5290002, Israel
10
Guffy SL, Der BS, Kuhlman B. Probing the minimal determinants of zinc binding with computational protein design. Protein Eng Des Sel 2016; 29:327-338. [PMID: 27358168 PMCID: PMC4955873 DOI: 10.1093/protein/gzw026] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 06/01/2016] [Revised: 06/01/2016] [Accepted: 06/02/2016] [Indexed: 11/15/2022] Open
Abstract
Structure-based protein design tests our understanding of the minimal determinants of protein structure and function. Previous studies have demonstrated that placing zinc binding amino acids (His, Glu, Asp or Cys) near each other in a folded protein in an arrangement predicted to be tetrahedral is often sufficient to achieve binding to zinc. However, few designs have been characterized with high-resolution structures. Here, we use X-ray crystallography, binding studies and mutation analysis to evaluate three alternative strategies for designing zinc binding sites with the molecular modeling program Rosetta. While several of the designs were observed to bind zinc, crystal structures of two designs reveal binding configurations that differ from the design model. In both cases, the modeling did not accurately capture the presence or absence of second-shell hydrogen bonds critical in determining binding-site structure. Efforts to more explicitly design second-shell hydrogen bonds were largely unsuccessful as evidenced by mutation analysis and low expression of proteins engineered with extensive primary and secondary networks. Our results suggest that improved methods for designing interaction networks will be needed for creating metal binding sites with high accuracy.
Affiliation(s)
- Sharon L. Guffy
- Department of Biochemistry and Biophysics, University of North Carolina, Chapel Hill, NC 27599-7260, USA
- Bryan S. Der
- Department of Biochemistry and Biophysics, University of North Carolina, Chapel Hill, NC 27599-7260, USA
- Brian Kuhlman
- Department of Biochemistry and Biophysics, University of North Carolina, Chapel Hill, NC 27599-7260, USA
11
Cellier MFM. Evolutionary analysis of Slc11 mechanism of proton-coupled metal-ion transmembrane import. AIMS Biophysics 2016. [DOI: 10.3934/biophy.2016.2.286] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 01/01/2023] Open