1
Waterhouse AM, Studer G, Robin X, Bienert S, Tauriello G, Schwede T. The structure assessment web server: for proteins, complexes and more. Nucleic Acids Res 2024; 52:W318-W323. [PMID: 38634802 PMCID: PMC11223858 DOI: 10.1093/nar/gkae270] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2024] [Revised: 03/21/2024] [Accepted: 04/02/2024] [Indexed: 04/19/2024] Open
Abstract
The 'structure assessment' web server is a one-stop shop for interactive evaluation and benchmarking of structural models of macromolecular complexes including proteins and nucleic acids. A user-friendly web dashboard links sequence with structure information and results from a variety of state-of-the-art tools, which facilitates the visual exploration and evaluation of structure models. The dashboard integrates stereochemistry information, secondary structure information, global and local model quality assessment of the tertiary structure of comparative protein models, as well as prediction of membrane location. In addition, a benchmarking mode is available where a model can be compared to a reference structure, providing easy access to scores that have been used in recent CASP experiments and CAMEO. The structure assessment web server is available at https://swissmodel.expasy.org/assess.
Affiliation(s)
- Andrew M Waterhouse
- Biozentrum, University of Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Computational Structural Biology, Basel, Switzerland
- Gabriel Studer
- Biozentrum, University of Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Computational Structural Biology, Basel, Switzerland
- Xavier Robin
- Biozentrum, University of Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Computational Structural Biology, Basel, Switzerland
- Stefan Bienert
- Biozentrum, University of Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Computational Structural Biology, Basel, Switzerland
- Gerardo Tauriello
- Biozentrum, University of Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Computational Structural Biology, Basel, Switzerland
- Torsten Schwede
- Biozentrum, University of Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Computational Structural Biology, Basel, Switzerland
2
Scat S, Weissman KJ, Chagot B. Insights into docking in megasynthases from the investigation of the toblerol trans-AT polyketide synthase: many α-helical means to an end. RSC Chem Biol 2024; 5:669-683. [PMID: 38966669 PMCID: PMC11221535 DOI: 10.1039/d4cb00075g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2024] [Accepted: 05/16/2024] [Indexed: 07/06/2024] Open
Abstract
The fidelity of biosynthesis by modular polyketide synthases (PKSs) depends on specific moderate affinity interactions between successive polypeptide subunits mediated by docking domains (DDs). These sequence elements are notably portable, allowing their transplantation into alternative biosynthetic and metabolic contexts. Herein, we use integrative structural biology to characterize a pair of DDs from the toblerol trans-AT PKS. Both are intrinsically disordered regions (IDRs) that fold into a 3 α-helix docking complex of unprecedented topology. The C-terminal docking domain (CDD) resembles the 4 α-helix type (4HB) CDDs, which shows that the same type of DD can be redeployed to form complexes of distinct geometry. By carefully re-examining known DD structures, we further extend this observation to type 2 docking domains, establishing previously unsuspected structural relations between DD types. Taken together, these data illustrate the plasticity of α-helical DDs, which allow the formation of a diverse topological spectrum of docked complexes. The newly identified DDs should also find utility in modular PKS genetic engineering.
Affiliation(s)
- Serge Scat
- Université de Lorraine, CNRS, IMoPA F-54000 Nancy France
3
Huang YJ, Montelione GT. Hidden Structural States of Proteins Revealed by Conformer Selection with AlphaFold-NMR. bioRxiv 2024:2024.06.26.600902. [PMID: 38979209 PMCID: PMC11230435 DOI: 10.1101/2024.06.26.600902] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Recent advances in molecular modeling using deep learning can revolutionize our understanding of dynamic protein structures. NMR is particularly well-suited for determining dynamic features of biomolecular structures. The conventional process for determining biomolecular structures from experimental NMR data involves its representation as conformation-dependent restraints, followed by generation of structural models guided by these spatial restraints. Here we describe an alternative approach: generating a distribution of realistic protein conformational models using artificial intelligence (AI)-based methods and then selecting the sets of conformers that best explain the experimental data. We applied this conformational selection approach to redetermine the solution NMR structure of the enzyme Gaussia luciferase (Gluc). First, we generated a diverse set of conformer models using AlphaFold2 (AF2) with an enhanced sampling protocol. The models that best fit NOESY and chemical shift data were then selected with a Bayesian scoring metric. The resulting models include features of both the published NMR structure and the standard AF2 model generated without enhanced sampling. This "AlphaFold-NMR" protocol also generated an alternative "open" conformational state that fits nearly as well to the overall NMR data but accounts for some NOESY data that is not consistent with the first "closed" conformational state, while other NOESY data consistent with this second state are not consistent with the first conformational state. The structure of this "open" structural state differs from that of the "closed" state primarily by the position of a thumb-shaped loop between α-helices H5 and H6, revealing a cryptic surface pocket. These alternative conformational states of Gluc are supported by "double recall" analysis of NOESY data and AF2 models.
Additional structural states are also indicated by backbone chemical shift data indicating partially-disordered conformations for the C-terminal segment. Considered as a multistate ensemble, these multiple states of Gluc together fit the NOESY and chemical shift data better than the "restraint-based" NMR structure and provide novel insights into its structure-dynamic-function relationships. This study demonstrates the potential of AI-based modeling with enhanced sampling to generate conformational ensembles followed by conformer selection with experimental data as an alternative to conventional restraint satisfaction protocols for protein NMR structure determination.
Affiliation(s)
- Yuanpeng J. Huang
- Dept of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, 12180 USA
- Gaetano T. Montelione
- Dept of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, 12180 USA
4
Baskaran K, Ploskon E, Tejero R, Yokochi M, Harrus D, Liang Y, Peisach E, Persikova I, Ramelot TA, Sekharan M, Tolchard J, Westbrook JD, Bardiaux B, Schwieters CD, Patwardhan A, Velankar S, Burley SK, Kurisu G, Hoch JC, Montelione GT, Vuister GW, Young JY. Restraint validation of biomolecular structures determined by NMR in the Protein Data Bank. Structure 2024; 32:824-837.e1. [PMID: 38490206 PMCID: PMC11162339 DOI: 10.1016/j.str.2024.02.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 01/13/2024] [Accepted: 02/19/2024] [Indexed: 03/17/2024]
Abstract
Biomolecular structure analysis from experimental NMR studies generally relies on restraints derived from a combination of experimental and knowledge-based data. A challenge for the structural biology community has been a lack of standards for representing these restraints, preventing the establishment of uniform methods of model-vs-data structure validation against restraints and limiting interoperability between restraint-based structure modeling programs. The NEF and NMR-STAR formats provide a standardized approach for representing commonly used NMR restraints. Using these restraint formats, a standardized validation system for assessing structural models of biopolymers against restraints has been developed and implemented in the wwPDB OneDep data deposition-validation-biocuration system. The resulting wwPDB restraint violation report provides a model vs. data assessment of biomolecule structures determined using distance and dihedral restraints, with extensions to other restraint types currently being implemented. These tools are useful for assessing NMR models, as well as for assessing biomolecular structure predictions based on distance restraints.
Affiliation(s)
- Kumaran Baskaran
- Biological Magnetic Resonance Data Bank, Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT 06030-3305, USA.
- Eliza Ploskon
- Department of Molecular and Cell Biology, Leicester Institute of Structural and Chemical Biology, University of Leicester, Leicester LE1 7RH, UK
- Roberto Tejero
- Departamento de Química Física, Universidad de Valencia, Dr. Moliner, 50 46100 Burjassot, Valencia, Spain
- Masashi Yokochi
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan; Protein Data Bank Japan, Protein Research Foundation, Minoh, Osaka 562-8686, Japan
- Deborah Harrus
- Protein Data Bank in Europe, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Yuhe Liang
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Irina Persikova
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Theresa A Ramelot
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
- Monica Sekharan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- James Tolchard
- Protein Data Bank in Europe, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- John D Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Benjamin Bardiaux
- Department of Structural Biology and Chemistry, Institut Pasteur, Université Paris Cité, CNRS UMR3528, 75015 Paris, France
- Charles D Schwieters
- Computational Biomolecular Magnetic Resonance Core, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD 20892, USA
- Ardan Patwardhan
- The Electron Microscopy Data Bank, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sameer Velankar
- Protein Data Bank in Europe, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, La Jolla, CA, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Genji Kurisu
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan; Protein Data Bank Japan, Protein Research Foundation, Minoh, Osaka 562-8686, Japan
- Jeffrey C Hoch
- Biological Magnetic Resonance Data Bank, Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT 06030-3305, USA
- Gaetano T Montelione
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA.
- Geerten W Vuister
- Department of Molecular and Cell Biology, Leicester Institute of Structural and Chemical Biology, University of Leicester, Leicester LE1 7RH, UK.
- Jasmine Y Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.
5
Burley SK, Piehl DW, Vallat B, Zardecki C. RCSB Protein Data Bank: supporting research and education worldwide through explorations of experimentally determined and computationally predicted atomic level 3D biostructures. IUCrJ 2024; 11:279-286. [PMID: 38597878 PMCID: PMC11067742 DOI: 10.1107/s2052252524002604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Accepted: 03/19/2024] [Indexed: 04/11/2024]
Abstract
The Protein Data Bank (PDB) was established as the first open-access digital data resource in biology and medicine in 1971 with seven X-ray crystal structures of proteins. Today, the PDB houses >210 000 experimentally determined, atomic level, 3D structures of proteins and nucleic acids as well as their complexes with one another and small molecules (e.g. approved drugs, enzyme cofactors). These data provide insights into fundamental biology, biomedicine, bioenergy and biotechnology. They proved particularly important for understanding the SARS-CoV-2 global pandemic. The US-funded Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) and other members of the Worldwide Protein Data Bank (wwPDB) partnership jointly manage the PDB archive and support >60 000 'data depositors' (structural biologists) around the world. wwPDB ensures the quality and integrity of the data in the ever-expanding PDB archive and supports global open access without limitations on data usage. The RCSB PDB research-focused web portal at https://www.rcsb.org/ (RCSB.org) supports millions of users worldwide, representing a broad range of expertise and interests. In addition to retrieving 3D structure data, PDB 'data consumers' access comparative data and external annotations, such as information about disease-causing point mutations and genetic variations. RCSB.org also provides access to >1 000 000 computed structure models (CSMs) generated using artificial intelligence/machine-learning methods. To avoid doubt, the provenance and reliability of experimentally determined PDB structures and CSMs are identified. Related training materials are available to support users in their RCSB.org explorations.
Affiliation(s)
- Stephen K. Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08901, USA
- Dennis W. Piehl
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Brinda Vallat
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08901, USA
- Christine Zardecki
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
6
Caparotta M, Perez A. Advancing Molecular Dynamics: Toward Standardization, Integration, and Data Accessibility in Structural Biology. J Phys Chem B 2024; 128:2219-2227. [PMID: 38418288 DOI: 10.1021/acs.jpcb.3c04823] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/01/2024]
Abstract
Molecular dynamics (MD) simulations have become a valuable tool in structural biology, offering insights into complex biological systems that are difficult to obtain through experimental techniques alone. The lack of available data sets and structures in most published computational work has limited other researchers' use of these models. In recent years, the emergence of online sharing platforms and MD database initiatives has favored the deposition of ensembles and structures to accompany publications, encouraging reuse of the data sets. However, the lack of uniformity in metadata collection, formats, and the data deposited limits the impact of these data and their use by communities that are not necessarily experts in MD. This Perspective highlights the need for standardization and better resource sharing for processing and interpreting MD simulation results, akin to efforts in other areas of structural biology. As the field moves forward, we will see an increase in popularity and benefits of MD-based integrative approaches combining experimental data and simulations through probabilistic reasoning, but these too are limited by uniformity in experimental data availability and choices on how the data are modeled that are not trivial to decipher from papers. Other fields have addressed similar challenges comprehensively by establishing task forces with different degrees of success. The large scope and number of communities to represent the breadth of types of MD simulations complicates a parallel approach that would fit all. Thus, each group typically decides what data and which format to upload on servers like Zenodo. Uploading data with FAIR (findable, accessible, interoperable, reusable) principles in mind including optimal metadata collection will make the data more accessible and actionable by the community. Such a wealth of simulation data will foster method development and infrastructure advancements, thus propelling the field forward.
Affiliation(s)
- Marcelo Caparotta
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, Florida 32611, United States
- Alberto Perez
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, Florida 32611, United States
7
Kleywegt GJ, Adams PD, Butcher SJ, Lawson CL, Rohou A, Rosenthal PB, Subramaniam S, Topf M, Abbott S, Baldwin PR, Berrisford JM, Bricogne G, Choudhary P, Croll TI, Danev R, Ganesan SJ, Grant T, Gutmanas A, Henderson R, Heymann JB, Huiskonen JT, Istrate A, Kato T, Lander GC, Lok SM, Ludtke SJ, Murshudov GN, Pye R, Pintilie GD, Richardson JS, Sachse C, Salih O, Scheres SHW, Schroeder GF, Sorzano COS, Stagg SM, Wang Z, Warshamanage R, Westbrook JD, Winn MD, Young JY, Burley SK, Hoch JC, Kurisu G, Morris K, Patwardhan A, Velankar S. Community recommendations on cryoEM data archiving and validation. IUCrJ 2024; 11:140-151. [PMID: 38358351 PMCID: PMC10916293 DOI: 10.1107/s2052252524001246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Accepted: 02/06/2024] [Indexed: 02/16/2024]
Abstract
In January 2020, a workshop was held at EMBL-EBI (Hinxton, UK) to discuss data requirements for the deposition and validation of cryoEM structures, with a focus on single-particle analysis. The meeting was attended by 47 experts in data processing, model building and refinement, validation, and archiving of such structures. This report describes the workshop's motivation and history, the topics discussed, and the resulting consensus recommendations. Some challenges for future methods-development efforts in this area are also highlighted, as is the implementation to date of some of the recommendations.
Affiliation(s)
- Paul D. Adams
- Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- University of California, Berkeley, CA, USA
- Maya Topf
- Birkbeck, University of London, London, United Kingdom
- Sai J. Ganesan
- University of California at San Francisco, San Francisco, CA, USA
- Ryan Pye
- EMBL-EBI, Cambridge, United Kingdom
- Zhe Wang
- EMBL-EBI, Cambridge, United Kingdom
- Martyn D. Winn
- Science and Technology Facilities Council, Research Complex at Harwell, Oxon, United Kingdom
- Jasmine Y. Young
- RCSB Protein Data Bank, The State University of New Jersey, NJ, USA
8
Kiirikki AM, Antila HS, Bort LS, Buslaev P, Favela-Rosales F, Ferreira TM, Fuchs PFJ, Garcia-Fandino R, Gushchin I, Kav B, Kučerka N, Kula P, Kurki M, Kuzmin A, Lalitha A, Lolicato F, Madsen JJ, Miettinen MS, Mingham C, Monticelli L, Nencini R, Nesterenko AM, Piggot TJ, Piñeiro Á, Reuter N, Samantray S, Suárez-Lestón F, Talandashti R, Ollila OHS. Overlay databank unlocks data-driven analyses of biomolecules for all. Nat Commun 2024; 15:1136. [PMID: 38326316 PMCID: PMC10850068 DOI: 10.1038/s41467-024-45189-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 06/02/2023] [Accepted: 01/17/2024] [Indexed: 02/09/2024] Open
Abstract
Tools based on artificial intelligence (AI) are currently revolutionising many fields, yet their applications are often limited by the lack of suitable training data in a programmatically accessible format. Here we propose an effective solution to make data scattered in various locations and formats accessible for data-driven and machine learning applications using the overlay databank format. To demonstrate the practical relevance of such an approach, we present the NMRlipids Databank, a community-driven, open-for-all database featuring programmatic access to quality-evaluated atom-resolution molecular dynamics simulations of cellular membranes. Cellular membrane lipid composition is implicated in diseases and controls major biological functions, but membranes are difficult to study experimentally due to their intrinsic disorder and complex phase behaviour. While MD simulations have been useful in understanding membrane systems, they require significant computational resources and often suffer from inaccuracies in model parameters. Here, we demonstrate how a programmable interface for flexible implementation of data-driven and machine learning applications, and rapid access to simulation data through a graphical user interface, unlock possibilities beyond current MD simulation and experimental studies to understand cellular membranes. The proposed overlay databank concept can be further applied to other biomolecules, as well as in other fields where similar barriers hinder the AI revolution.
Affiliation(s)
- Anne M Kiirikki
- University of Helsinki, Institute of Biotechnology, Helsinki, Finland
- Hanne S Antila
- Department of Theory and Bio-Systems, Max Planck Institute of Colloids and Interfaces, 14424, Potsdam, Germany
- Department of Biomedicine, University of Bergen, 5020, Bergen, Norway
- Lara S Bort
- Department of Theory and Bio-Systems, Max Planck Institute of Colloids and Interfaces, 14424, Potsdam, Germany
- University of Potsdam, Institute of Physics and Astronomy, 14476, Potsdam-Golm, Germany
- Pavel Buslaev
- Nanoscience Center and Department of Chemistry, University of Jyväskylä, 40014, Jyväskylä, Finland
- Fernando Favela-Rosales
- Departamento de Ciencias Básicas, Tecnológico Nacional de México - ITS Zacatecas Occidente, Sombrerete, 99102, Zacatecas, Mexico
- Tiago Mendes Ferreira
- NMR group - Institute for Physics, Martin Luther University Halle-Wittenberg, 06120, Halle (Saale), Germany
- Patrick F J Fuchs
- Sorbonne Université, Ecole Normale Supérieure, PSL University, CNRS, Laboratoire des Biomolécules (LBM), F-75005, Paris, France
- Université Paris Cité, F-75006, Paris, France
- Rebeca Garcia-Fandino
- Center for Research in Biological Chemistry and Molecular Materials (CiQUS), Universidade de Santiago de Compostela, E-15782, Santiago de Compostela, Spain
- Batuhan Kav
- Institute of Biological Information Processing: Structural Biochemistry (IBI-7), Forschungszentrum Jülich, 52428, Jülich, Germany
- ariadne.ai GmbH (Germany), Häusserstraße 3, 69115, Heidelberg, Germany
- Norbert Kučerka
- Department of Physical Chemistry of Drugs, Faculty of Pharmacy, Comenius University Bratislava, 832 32, Bratislava, Slovakia
- Patrik Kula
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo nám. 542/2, CZ-16610, Prague, Czech Republic
- Milla Kurki
- School of Pharmacy, University of Eastern Finland, 70211, Kuopio, Finland
- Anusha Lalitha
- Institut Charles Gerhardt Montpellier (UMR CNRS 5253), Université Montpellier, Place Eugène Bataillon, 34095, Montpellier, Cedex 05, France
- Fabio Lolicato
- Heidelberg University Biochemistry Center, 69120, Heidelberg, Germany
- Department of Physics, University of Helsinki, FI-00014, Helsinki, Finland
- Jesper J Madsen
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, 33612, Tampa, FL, USA
- Center for Global Health and Infectious Diseases Research, Global and Planetary Health, College of Public Health, University of South Florida, 33612, Tampa, FL, USA
- Markus S Miettinen
- Department of Theory and Bio-Systems, Max Planck Institute of Colloids and Interfaces, 14424, Potsdam, Germany
- Department of Chemistry, University of Bergen, 5007, Bergen, Norway
- Department of Informatics, Computational Biology Unit, University of Bergen, 5008, Bergen, Norway
- Cedric Mingham
- Hochschule Mannheim, University of Applied Sciences, 68163, Mannheim, Germany
- Luca Monticelli
- University of Lyon, CNRS, Molecular Microbiology and Structural Biochemistry (MMSB, UMR 5086), F-69007, Lyon, France
- Institut National de la Santé et de la Recherche Médicale (INSERM), Lyon, France
- Ricky Nencini
- University of Helsinki, Institute of Biotechnology, Helsinki, Finland
- Division of Pharmaceutical Biosciences, Faculty of Pharmacy, University of Helsinki, 00014, Helsinki, Finland
- Alexey M Nesterenko
- Department of Chemistry, University of Bergen, 5007, Bergen, Norway
- Department of Informatics, Computational Biology Unit, University of Bergen, 5008, Bergen, Norway
- Thomas J Piggot
- Chemistry, University of Southampton, Highfield, SO17 1BJ, Southampton, UK
- Ángel Piñeiro
- Department of Applied Physics, Faculty of Physics, University of Santiago de Compostela, E-15782, Santiago de Compostela, Spain
- Nathalie Reuter
- Department of Chemistry, University of Bergen, 5007, Bergen, Norway
- Department of Informatics, Computational Biology Unit, University of Bergen, 5008, Bergen, Norway
- Suman Samantray
- Institute of Biological Information Processing: Structural Biochemistry (IBI-7), Forschungszentrum Jülich, 52428, Jülich, Germany
- Institute of Biotechnology, RWTH Aachen University, Worringerweg 3, 52074, Aachen, Germany
- Fabián Suárez-Lestón
- Center for Research in Biological Chemistry and Molecular Materials (CiQUS), Universidade de Santiago de Compostela, E-15782, Santiago de Compostela, Spain
- Department of Applied Physics, Faculty of Physics, University of Santiago de Compostela, E-15782, Santiago de Compostela, Spain
- MD.USE Innovations S.L., Edificio Emprendia, 15782, Santiago de Compostela, Spain
- Reza Talandashti
- Department of Chemistry, University of Bergen, 5007, Bergen, Norway
- Department of Informatics, Computational Biology Unit, University of Bergen, 5008, Bergen, Norway
- O H Samuli Ollila
- University of Helsinki, Institute of Biotechnology, Helsinki, Finland.
- VTT Technical Research Centre of Finland, Espoo, Finland.
9
Kleywegt GJ, Adams PD, Butcher SJ, Lawson CL, Rohou A, Rosenthal PB, Subramaniam S, Topf M, Abbott S, Baldwin PR, Berrisford JM, Bricogne G, Choudhary P, Croll TI, Danev R, Ganesan SJ, Grant T, Gutmanas A, Henderson R, Heymann JB, Huiskonen JT, Istrate A, Kato T, Lander GC, Lok SM, Ludtke SJ, Murshudov GN, Pye R, Pintilie GD, Richardson JS, Sachse C, Salih O, Scheres SHW, Schroeder GF, Sorzano COS, Stagg SM, Wang Z, Warshamanage R, Westbrook JD, Winn MD, Young JY, Burley SK, Hoch JC, Kurisu G, Morris K, Patwardhan A, Velankar S. Community recommendations on cryoEM data archiving and validation: Outcomes of a wwPDB/EMDB workshop on cryoEM data management, deposition and validation. arXiv 2024:arXiv:2311.17640v3. [PMID: 38076521 PMCID: PMC10705588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/21/2023]
Abstract
In January 2020, a workshop was held at EMBL-EBI (Hinxton, UK) to discuss data requirements for deposition and validation of cryoEM structures, with a focus on single-particle analysis. The meeting was attended by 47 experts in data processing, model building and refinement, validation, and archiving of such structures. This report describes the workshop's motivation and history, the topics discussed, and consensus recommendations resulting from the workshop. Some challenges for future methods-development efforts in this area are also highlighted, as is the implementation to date of some of the recommendations.
Affiliation(s)
- Paul D Adams
- Lawrence Berkeley Laboratory, Berkeley, CA, USA and University of California, Berkeley, CA, USA
- Catherine L Lawson
- RCSB Protein Data Bank, Rutgers, The State University of New Jersey, USA
- Maya Topf
- Birkbeck, University of London, London, UK
- Sai J Ganesan
- University of California at San Francisco, San Francisco, CA, USA
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - John D Westbrook
- RCSB Protein Data Bank, Rutgers, The State University of New Jersey, USA
| | - Martyn D Winn
- Science and Technology Facilities Council, Research Complex at Harwell, Oxon, UK
| | - Jasmine Y Young
- RCSB Protein Data Bank, Rutgers, The State University of New Jersey, USA
| | - Stephen K Burley
- RCSB Protein Data Bank, Rutgers, The State University of New Jersey, USA
|
10
|
Baskaran K, Ploskon E, Tejero R, Yokochi M, Harrus D, Liang Y, Peisach E, Persikova I, Ramelot TA, Sekharan M, Tolchard J, Westbrook JD, Bardiaux B, Schwieters CD, Patwardhan A, Velankar S, Burley SK, Kurisu G, Hoch JC, Montelione GT, Vuister GW, Young JY. Restraint Validation of Biomolecular Structures Determined by NMR in the Protein Data Bank. bioRxiv 2024:2024.01.15.575520. [PMID: 38328042 PMCID: PMC10849500 DOI: 10.1101/2024.01.15.575520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2024]
Abstract
Biomolecular structure analysis from experimental NMR studies generally relies on restraints derived from a combination of experimental and knowledge-based data. A challenge for the structural biology community has been a lack of standards for representing these restraints, preventing the establishment of uniform methods of model-vs-data structure validation against restraints and limiting interoperability between restraint-based structure modeling programs. The NMR exchange (NEF) and NMR-STAR formats provide a standardized approach for representing commonly used NMR restraints. Using these restraint formats, a standardized validation system for assessing structural models of biopolymers against restraints has been developed and implemented in the wwPDB OneDep data deposition-validation-biocuration system. The resulting wwPDB Restraint Violation Report provides a model vs. data assessment of biomolecule structures determined using distance and dihedral restraints, with extensions to other restraint types currently being implemented. These tools are useful for assessing NMR models, as well as for assessing biomolecular structure predictions based on distance restraints.
Affiliation(s)
- Kumaran Baskaran
- Biological Magnetic Resonance Data Bank, Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT 06030-3305, USA
| | - Eliza Ploskon
- Department of Molecular and Cell Biology, Leicester Institute of Structural and Chemical Biology, University of Leicester, Leicester LE1 7RH, United Kingdom
| | - Roberto Tejero
- Departamento de Química Física, Universidad de Valencia, Dr. Moliner, 50 46100-Burjassot, Valencia, Spain
| | - Masashi Yokochi
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan
- Protein Data Bank Japan, Protein Research Foundation, Minoh, Osaka 562-8686, Japan
| | - Deborah Harrus
- Protein Data Bank in Europe, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Yuhe Liang
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Irina Persikova
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Theresa A Ramelot
- Dept of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, 12180 USA
| | - Monica Sekharan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - James Tolchard
- Protein Data Bank in Europe, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - John D Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Benjamin Bardiaux
- Department of Structural Biology and Chemistry, Institut Pasteur, Université Paris Cité, CNRS UMR3528, 75015 Paris, France
| | - Charles D Schwieters
- Computational Biomolecular Magnetic Resonance Core, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD 20892, USA
| | - Ardan Patwardhan
- The Electron Microscopy Data Bank, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Sameer Velankar
- Protein Data Bank in Europe, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Genji Kurisu
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan
- Protein Data Bank Japan, Protein Research Foundation, Minoh, Osaka 562-8686, Japan
| | - Jeffrey C Hoch
- Biological Magnetic Resonance Data Bank, Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT 06030-3305, USA
| | - Gaetano T Montelione
- Dept of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, 12180 USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
| | - Geerten W Vuister
- Department of Molecular and Cell Biology, Leicester Institute of Structural and Chemical Biology, University of Leicester, Leicester LE1 7RH, United Kingdom
| | - Jasmine Y Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
|
11
|
Ramelot TA, Tejero R, Montelione GT. Representing structures of the multiple conformational states of proteins. Curr Opin Struct Biol 2023; 83:102703. [PMID: 37776602 PMCID: PMC10841472 DOI: 10.1016/j.sbi.2023.102703] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 07/04/2023] [Revised: 08/18/2023] [Accepted: 08/23/2023] [Indexed: 10/02/2023]
Abstract
Biomolecules exhibit dynamic behavior that single-state models of their structures cannot fully capture. We review some recent advances for investigating multiple conformations of biomolecules, including experimental methods, molecular dynamics simulations, and machine learning. We also address the challenges associated with representing single- and multiple-state models in data archives, with a particular focus on NMR structures. Establishing standardized representations and annotations will facilitate effective communication and understanding of these complex models to the broader scientific community.
Affiliation(s)
- Theresa A Ramelot
- Dept of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY, 12180, USA.
| | - Roberto Tejero
- Dept of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY, 12180, USA
| | - Gaetano T Montelione
- Dept of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY, 12180, USA.
|
12
|
Fowler NJ, Albalwi MF, Lee S, Hounslow AM, Williamson MP. Improved methodology for protein NMR structure calculation using hydrogen bond restraints and ANSURR validation: The SH2 domain of SH2B1. Structure 2023; 31:975-986.e3. [PMID: 37311460 DOI: 10.1016/j.str.2023.05.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2023] [Revised: 05/02/2023] [Accepted: 05/18/2023] [Indexed: 06/15/2023]
Abstract
Protein structures calculated using NMR data are less accurate and less well-defined than they could be. Here we use the program ANSURR to show that this deficiency is at least in part due to a lack of hydrogen bond restraints. We describe a protocol to introduce hydrogen bond restraints into the structure calculation of the SH2 domain from SH2B1 in a systematic and transparent way and show that the structures generated are more accurate and better defined as a result. We also show that ANSURR can be used as a guide to know when the structure calculation is good enough to stop.
Affiliation(s)
- Nicholas J Fowler
- School of Biosciences, University of Sheffield, S10 2TN Sheffield, UK.
| | - Marym F Albalwi
- School of Biosciences, University of Sheffield, S10 2TN Sheffield, UK
| | - Subin Lee
- School of Biosciences, University of Sheffield, S10 2TN Sheffield, UK
| | - Andrea M Hounslow
- School of Biosciences, University of Sheffield, S10 2TN Sheffield, UK
| | - Mike P Williamson
- School of Biosciences, University of Sheffield, S10 2TN Sheffield, UK.
|
13
|
Li EH, Spaman LE, Tejero R, Huang YJ, Ramelot TA, Fraga KJ, Prestegard JH, Kennedy MA, Montelione GT. Blind assessment of monomeric AlphaFold2 protein structure models with experimental NMR data. J Magn Reson 2023; 352:107481. [PMID: 37257257 PMCID: PMC10659763 DOI: 10.1016/j.jmr.2023.107481] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 01/22/2023] [Revised: 05/08/2023] [Accepted: 05/15/2023] [Indexed: 06/02/2023]
Abstract
Recent advances in molecular modeling of protein structures are changing the field of structural biology. AlphaFold-2 (AF2), an AI system developed by DeepMind, Inc., utilizes attention-based deep learning to predict models of protein structures with high accuracy relative to structures determined by X-ray crystallography and cryo-electron microscopy (cryoEM). Comparing AF2 models to structures determined using solution NMR data, both high similarities and distinct differences have been observed. Since AF2 was trained on X-ray crystal and cryoEM structures, we assessed how accurately AF2 can model small, monomeric, solution protein NMR structures which (i) were not used in the AF2 training data set, and (ii) did not have homologous structures in the Protein Data Bank at the time of AF2 training. We identified nine open-source protein NMR data sets for such "blind" targets, including chemical shift, raw NMR FID data, NOESY peak lists, and (for 1 case) 15N-1H residual dipolar coupling data. For these nine small (70-108 residues) monomeric proteins, we generated AF2 prediction models and assessed how well these models fit to these experimental NMR data, using several well-established NMR structure validation tools. In most of these cases, the AF2 models fit the NMR data nearly as well, or sometimes better than, the corresponding NMR structure models previously deposited in the Protein Data Bank. These results provide benchmark NMR data for assessing new NMR data analysis and protein structure prediction methods. They also document the potential for using AF2 as a guiding tool in protein NMR data analysis, and more generally for hypothesis generation in structural biology research.
Affiliation(s)
- Ethan H Li
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
| | - Laura E Spaman
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
| | - Roberto Tejero
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
| | - Yuanpeng Janet Huang
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
| | - Theresa A Ramelot
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
| | - Keith J Fraga
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
| | - James H Prestegard
- Complex Carbohydrate Research Center, University of Georgia, Athens, GA 30602, USA.
| | - Michael A Kennedy
- Department of Chemistry and Biochemistry, Miami University, Oxford, OH 45056, USA.
| | - Gaetano T Montelione
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
|
14
|
D'Arminio N, Giordano D, Scafuri B, Facchiano A, Marabotti A. Standardizing macromolecular structure files: further efforts are needed. Trends Biochem Sci 2023; 48:590-596. [PMID: 37031054 DOI: 10.1016/j.tibs.2023.03.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2022] [Revised: 02/25/2023] [Accepted: 03/14/2023] [Indexed: 04/10/2023]
Abstract
Investigating large datasets of biological information by automatic procedures may offer chances of progress in knowledge. Recently, tremendous improvements in structural biology have allowed the number of structures in the Protein Data Bank (PDB) archive to increase rapidly, in particular those for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)-associated proteins. However, their automatic analysis can be hampered by the nonuniform descriptors used by authors in some records of the PDB and PDBx/mmCIF files. In this opinion article we highlight the difficulties encountered in automating the analysis of hundreds of structures, suggesting that further standardization of the description of these molecular entities and of their attributes, generalized to the macromolecular structures contained in the PDB, might generate files more suitable for automatized analyses of a large number of structures.
Affiliation(s)
- Nancy D'Arminio
- Department of Chemistry and Biology 'A. Zambelli', University of Salerno, Fisciano, (SA), Italy
| | - Deborah Giordano
- National Research Council, Institute of Food Science, Avellino, Italy
| | - Bernardina Scafuri
- Department of Chemistry and Biology 'A. Zambelli', University of Salerno, Fisciano, (SA), Italy
| | - Angelo Facchiano
- National Research Council, Institute of Food Science, Avellino, Italy.
| | - Anna Marabotti
- Department of Chemistry and Biology 'A. Zambelli', University of Salerno, Fisciano, (SA), Italy.
|
15
|
A Comparison of Bonded and Nonbonded Zinc(II) Force Fields with NMR Data. Int J Mol Sci 2023; 24:ijms24065440. [PMID: 36982515 PMCID: PMC10055966 DOI: 10.3390/ijms24065440] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2022] [Revised: 02/16/2023] [Accepted: 02/22/2023] [Indexed: 03/18/2023] Open
Abstract
Classical molecular dynamics (MD) simulations are widely used to inspect the behavior of zinc(II)-proteins at the atomic level, hence the need to properly model the zinc(II) ion and the interaction with its ligands. Different approaches have been developed to represent zinc(II) sites, with the bonded and nonbonded models being the most used. In the present work, we tested the well-known zinc AMBER force field (ZAFF) and a recently developed nonbonded force field (NBFF) to assess how accurately they reproduce the dynamic behavior of zinc(II)-proteins. For this, we selected as benchmark six zinc-fingers. This superfamily is extremely heterogenous in terms of architecture, binding mode, function, and reactivity. From repeated MD simulations, we computed the order parameter (S2) of all backbone N-H bond vectors in each system. These data were superimposed to heteronuclear Overhauser effect measurements taken by NMR spectroscopy. This provides a quantitative estimate of the accuracy of the FFs in reproducing protein dynamics, leveraging the information about the protein backbone mobility contained in the NMR data. The correlation between the MD-computed S2 and the experimental data indicated that both tested FFs reproduce well the dynamic behavior of zinc(II)-proteins, with comparable accuracy. Thus, along with ZAFF, NBFF represents a useful tool to simulate metalloproteins with the advantage of being extensible to diverse systems such as those bearing dinuclear metal sites.
|
16
|
Li EH, Spaman L, Tejero R, Huang YJ, Ramelot TA, Fraga KJ, Prestegard JH, Kennedy MA, Montelione GT. Blind Assessment of Monomeric AlphaFold2 Protein Structure Models with Experimental NMR Data. bioRxiv 2023:2023.01.22.525096. [PMID: 36712039 PMCID: PMC9882346 DOI: 10.1101/2023.01.22.525096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2023]
Abstract
Recent advances in molecular modeling of protein structures are changing the field of structural biology. AlphaFold-2 (AF2), an AI system developed by DeepMind, Inc., utilizes attention-based deep learning to predict models of protein structures with high accuracy relative to structures determined by X-ray crystallography and cryo-electron microscopy (cryoEM). Comparing AF2 models to structures determined using solution NMR data, both high similarities and distinct differences have been observed. Since AF2 was trained on X-ray crystal and cryoEM structures, we assessed how accurately AF2 can model small, monomeric, solution protein NMR structures which (i) were not used in the AF2 training data set, and (ii) did not have homologous structures in the Protein Data Bank at the time of AF2 training. We identified nine open source protein NMR data sets for such "blind" targets, including chemical shift, raw NMR FID data, NOESY peak lists, and (for 1 case) 15N-1H residual dipolar coupling data. For these nine small (70-108 residues) monomeric proteins, we generated AF2 prediction models and assessed how well these models fit to these experimental NMR data, using several well-established NMR structure validation tools. In most of these cases, the AF2 models fit the NMR data nearly as well, or sometimes better than, the corresponding NMR structure models previously deposited in the Protein Data Bank. These results provide benchmark NMR data for assessing new NMR data analysis and protein structure prediction methods. They also document the potential for using AF2 as a guiding tool in protein NMR data analysis, and more generally for hypothesis generation in structural biology research.
Highlights: AF2 models assessed against NMR data for 9 monomeric proteins not used in training. AF2 models fit NMR data almost as well as the experimentally-determined structures. RPF-DP, PSVS, and PDBStat software provide structure quality and RDC assessment. RPF-DP analysis using AF2 models suggests multiple conformational states.
Affiliation(s)
- Ethan H. Li
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180 USA
| | - Laura Spaman
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180 USA
| | - Roberto Tejero
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180 USA
| | - Yuanpeng Janet Huang
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180 USA
| | - Theresa A. Ramelot
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180 USA
| | - Keith J. Fraga
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180 USA
| | - James H. Prestegard
- Complex Carbohydrate Research Center, University of Georgia, Athens, GA 30602 USA
| | - Michael A. Kennedy
- Department of Chemistry and Biochemistry, Miami University, Oxford, OH 45056 USA
| | - Gaetano T. Montelione
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180 USA
|
17
|
Dicks L, Wales DJ. Exploiting Sequence-Dependent Rotamer Information in Global Optimization of Proteins. J Phys Chem B 2022; 126:8381-8390. [PMID: 36257022 PMCID: PMC9623586 DOI: 10.1021/acs.jpcb.2c04647] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 01/11/2023]
Abstract
Rotamers, namely amino acid side chain conformations common to many different peptides, can be compiled into libraries. These rotamer libraries are used in protein modeling, where the limited conformational space occupied by amino acid side chains is exploited. Here, we construct a sequence-dependent rotamer library from simulations of all possible tripeptides, which provides rotameric states dependent on adjacent amino acids. We observe significant sensitivity of rotamer populations to sequence and find that the library is successful in locating side chain conformations present in crystal structures. The library is designed for applications with basin-hopping global optimization, where we use it to propose moves in conformational space. The addition of rotamer moves significantly increases the efficiency of protein structure prediction within this framework, and we determine parameters to optimize efficiency.
Affiliation(s)
- L. Dicks
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- IBM Research, The Hartree Centre STFC Laboratory, Sci-Tech Daresbury, Warrington WA4 4AD, United Kingdom
| | - D. J. Wales
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
|
18
|
Burley SK, Berman HM, Duarte JM, Feng Z, Flatt JW, Hudson BP, Lowe R, Peisach E, Piehl DW, Rose Y, Sali A, Sekharan M, Shao C, Vallat B, Voigt M, Westbrook JD, Young JY, Zardecki C. Protein Data Bank: A Comprehensive Review of 3D Structure Holdings and Worldwide Utilization by Researchers, Educators, and Students. Biomolecules 2022; 12:1425. [PMID: 36291635 PMCID: PMC9599165 DOI: 10.3390/biom12101425] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 08/30/2022] [Revised: 09/23/2022] [Accepted: 09/26/2022] [Indexed: 11/18/2022] Open
Abstract
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), funded by the United States National Science Foundation, National Institutes of Health, and Department of Energy, supports structural biologists and Protein Data Bank (PDB) data users around the world. The RCSB PDB, a founding member of the Worldwide Protein Data Bank (wwPDB) partnership, serves as the US data center for the global PDB archive housing experimentally-determined three-dimensional (3D) structure data for biological macromolecules. As the wwPDB-designated Archive Keeper, RCSB PDB is also responsible for the security of PDB data and weekly update of the archive. RCSB PDB serves tens of thousands of data depositors (using macromolecular crystallography, nuclear magnetic resonance spectroscopy, electron microscopy, and micro-electron diffraction) annually working on all permanently inhabited continents. RCSB PDB makes PDB data available from its research-focused web portal at no charge and without usage restrictions to many millions of PDB data consumers around the globe. It also provides educators, students, and the general public with an introduction to the PDB and related training materials through its outreach and education-focused web portal. This review article describes growth of the PDB, examines evolution of experimental methods for structure determination viewed through the lens of the PDB archive, and provides a detailed accounting of PDB archival holdings and their utilization by researchers, educators, and students worldwide.
Affiliation(s)
- Stephen K. Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Helen M. Berman
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Jose M. Duarte
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Zukang Feng
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Justin W. Flatt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Brian P. Hudson
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Robert Lowe
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Dennis W. Piehl
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Yana Rose
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Andrej Sali
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA 94158, USA
| | - Monica Sekharan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Chenghua Shao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Brinda Vallat
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
| | - Maria Voigt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - John D. Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
| | - Jasmine Y. Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Christine Zardecki
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| |
|
19
|
Fraga KJ, Huang YJ, Ramelot TA, Swapna GVT, Lashawn Anak Kendary A, Li E, Korf I, Montelione GT. SpecDB: A relational database for archiving biomolecular NMR spectral data. J Magn Reson 2022; 342:107268. [PMID: 35930941 PMCID: PMC9922030 DOI: 10.1016/j.jmr.2022.107268] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/25/2022] [Revised: 06/16/2022] [Accepted: 07/06/2022] [Indexed: 05/11/2023]
Abstract
NMR is a valuable experimental tool in the structural biologist's toolkit to elucidate the structures, functions, and motions of biomolecules. The progress of machine learning, particularly in structural biology, reveals the critical importance of large, diverse, and reliable datasets in developing new methods and understanding in structural biology and science more broadly. Biomolecular NMR research groups produce large amounts of data, and there is renewed interest in organizing these data to train new, sophisticated machine learning architectures and to improve biomolecular NMR analysis pipelines. The foundational data type in NMR is the free-induction decay (FID). There are opportunities to build sophisticated machine learning methods to tackle long-standing problems in NMR data processing, resonance assignment, dynamics analysis, and structure determination using NMR FIDs. Our goal in this study is to provide a lightweight, broadly available tool for archiving FID data as it is generated at the spectrometer, and grow a new resource of FID data and associated metadata. This study presents a relational schema for storing and organizing the metadata items that describe an NMR sample and FID data, which we call Spectral Database (SpecDB). SpecDB is implemented in SQLite and includes a Python software library providing a command-line application to create, organize, query, backup, share, and maintain the database. This set of software tools and database schema allow users to store, organize, share, and learn from NMR time domain data. SpecDB is freely available under an open source license at https://github.rpi.edu/RPIBioinformatics/SpecDB.
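The idea of a relational store for FID data and sample metadata in SQLite can be illustrated with a minimal Python sketch. The table names (`sample`, `fid`), columns, and helper functions below are hypothetical and chosen only for illustration; they are not taken from the actual SpecDB schema or library.

```python
import sqlite3

# Hypothetical two-table layout: one row per NMR sample, one row per FID.
# This is NOT the real SpecDB schema, only an illustration of the idea.
SCHEMA = """
CREATE TABLE sample (
    id     INTEGER PRIMARY KEY,
    label  TEXT NOT NULL UNIQUE
);
CREATE TABLE fid (
    id         INTEGER PRIMARY KEY,
    sample_id  INTEGER NOT NULL REFERENCES sample(id),
    experiment TEXT NOT NULL,
    field_mhz  REAL,
    data       BLOB NOT NULL
);
"""


def open_archive(path=":memory:"):
    """Create the illustrative schema and return a connection."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_fid(conn, sample_label, experiment, field_mhz, raw_bytes):
    """Store one FID, creating the sample row on first use."""
    conn.execute(
        "INSERT OR IGNORE INTO sample (label) VALUES (?)", (sample_label,)
    )
    (sample_id,) = conn.execute(
        "SELECT id FROM sample WHERE label = ?", (sample_label,)
    ).fetchone()
    cur = conn.execute(
        "INSERT INTO fid (sample_id, experiment, field_mhz, data) "
        "VALUES (?, ?, ?, ?)",
        (sample_id, experiment, field_mhz, raw_bytes),
    )
    conn.commit()
    return cur.lastrowid


def fids_for_sample(conn, sample_label):
    """Return (experiment, field_mhz, size in bytes) for each stored FID."""
    rows = conn.execute(
        "SELECT fid.experiment, fid.field_mhz, length(fid.data) "
        "FROM fid JOIN sample ON sample.id = fid.sample_id "
        "WHERE sample.label = ? ORDER BY fid.id",
        (sample_label,),
    )
    return rows.fetchall()
```

Keeping the sample metadata in its own table means several FIDs recorded on the same sample share one metadata row, which is the usual reason for choosing a relational layout over flat files.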
Affiliation(s)
- Keith J Fraga
- Department of Molecular and Cellular Biology, University of California, Davis, CA 95616, USA.
| | - Yuanpeng J Huang
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180 USA.
| | - Theresa A Ramelot
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180 USA.
| | - G V T Swapna
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180 USA; Department of Pharmacology, Robert Wood Johnson Medical School, Rutgers The State University of New Jersey, Piscataway, NJ 08854, USA.
| | | | - Ethan Li
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180 USA.
| | - Ian Korf
- Department of Molecular and Cellular Biology, University of California, Davis, CA 95616, USA.
| | - Gaetano T Montelione
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180 USA.
| |
|
20
|
Grigas AT, Liu Z, Regan L, O'Hern CS. Core packing of well-defined X-ray and NMR structures is the same. Protein Sci 2022; 31:e4373. [PMID: 35900019 PMCID: PMC9277709 DOI: 10.1002/pro.4373] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2022] [Revised: 05/06/2022] [Accepted: 06/02/2022] [Indexed: 11/10/2022]
Abstract
Numerous studies have investigated the differences and similarities between protein structures determined by solution NMR spectroscopy and those determined by X-ray crystallography. A fundamental question is whether any observed differences are due to differing methodologies or to differences in the behavior of proteins in solution versus in the crystalline state. Here, we compare the properties of the hydrophobic cores of high-resolution protein crystal structures and those in NMR structures, determined using increasing numbers and types of restraints. Prior studies have reported that many NMR structures have denser cores compared with those of high-resolution X-ray crystal structures. Our current work investigates this result in more detail and finds that these NMR structures tend to violate basic features of protein stereochemistry, such as small non-bonded atomic overlaps and few Ramachandran and sidechain dihedral angle outliers. We find that NMR structures solved with more restraints, and which do not significantly violate stereochemistry, have hydrophobic cores that have a similar size and packing fraction as their counterparts determined by X-ray crystallography at high resolution. These results lead us to conclude that, at least regarding the core packing properties, high-quality structures determined by NMR and X-ray crystallography are the same, and the differences reported earlier are most likely a consequence of methodology, rather than fundamental differences between the protein in the two different environments.
Affiliation(s)
- Alex T. Grigas
- Graduate Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, USA
- Integrated Graduate Program in Physical and Engineering Biology, Yale University, New Haven, Connecticut, USA
| | - Zhuoyi Liu
- Integrated Graduate Program in Physical and Engineering Biology, Yale University, New Haven, Connecticut, USA
- Department of Mechanical Engineering and Materials Science, Yale University, New Haven, Connecticut, USA
| | - Lynne Regan
- Institute of Quantitative Biology, Biochemistry and Biotechnology, Centre for Synthetic and Systems Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
| | - Corey S. O'Hern
- Graduate Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, USA
- Integrated Graduate Program in Physical and Engineering Biology, Yale University, New Haven, Connecticut, USA
- Department of Mechanical Engineering and Materials Science, Yale University, New Haven, Connecticut, USA
- Department of Physics, Yale University, New Haven, Connecticut, USA
- Department of Applied Physics, Yale University, New Haven, Connecticut, USA
| |
|
21
|
Tejero R, Huang YJ, Ramelot TA, Montelione GT. AlphaFold Models of Small Proteins Rival the Accuracy of Solution NMR Structures. Front Mol Biosci 2022; 9:877000. [PMID: 35769913 PMCID: PMC9234698 DOI: 10.3389/fmolb.2022.877000] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 02/16/2022] [Accepted: 04/25/2022] [Indexed: 11/13/2022] Open
Abstract
Recent advances in molecular modeling using deep learning have the potential to revolutionize the field of structural biology. In particular, AlphaFold has been observed to provide models of protein structures with accuracies rivaling medium-resolution X-ray crystal structures, and with excellent atomic coordinate matches to experimental protein NMR and cryo-electron microscopy structures. Here we assess the hypothesis that AlphaFold models of small, relatively rigid proteins have accuracies (based on comparison against experimental data) similar to experimental solution NMR structures. We selected six representative small proteins with structures determined by both NMR and X-ray crystallography, and modeled each of them using AlphaFold. Using several structure validation tools integrated under the Protein Structure Validation Software suite (PSVS), we then assessed how well these models fit to experimental NMR data, including NOESY peak lists (RPF-DP scores), comparisons between predicted rigidity and chemical shift data (ANSURR scores), and 15N-1H residual dipolar coupling data (RDC Q factors) analyzed by software tools integrated in the PSVS suite. Remarkably, the fits to NMR data for the protein structure models predicted with AlphaFold are generally similar, or better, than for the corresponding experimental NMR or X-ray crystal structures. Similar conclusions were reached in comparing AlphaFold2 predictions and NMR structures for three targets from the Critical Assessment of Protein Structure Prediction (CASP). These results contradict the widely held misperception that AlphaFold cannot accurately model solution NMR structures. They also document the value of PSVS for model vs. data assessment of protein NMR structures, and the potential for using AlphaFold models for guiding analysis of experimental NMR data and more generally in structural biology.
Affiliation(s)
- Roberto Tejero
- Departamento de Química Física, Universidad de Valencia, Valencia, Spain
| | - Yuanpeng Janet Huang
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY, United States
| | - Theresa A. Ramelot
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY, United States
| | - Gaetano T. Montelione
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY, United States
| |
|
22
|
Antila HS, Kav B, Miettinen MS, Martinez-Seara H, Jungwirth P, Ollila OHS. Emerging Era of Biomolecular Membrane Simulations: Automated Physically-Justified Force Field Development and Quality-Evaluated Databanks. J Phys Chem B 2022. [DOI: 10.1021/acs.jpcb.2c01954] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/12/2022]
Affiliation(s)
- Hanne S. Antila
- Department of Biomaterials, Max Planck Institute of Colloids and Interfaces, 14424 Potsdam, Germany
| | - Batuhan Kav
- Institute of Biological Information Processing, Structural Biochemistry (IBI-7), Forschungszentrum Jülich, Wilhelm-Johnen-Str., 52425 Jülich, Germany
| | - Markus S. Miettinen
- Computational Biology Unit, Department of Informatics, University of Bergen, 5008 Bergen, Norway
- Department of Chemistry, University of Bergen, 5020 Bergen, Norway
| | - Hector Martinez-Seara
- Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, Flemingovo nam. 2, 16000 Prague 6, Czech Republic
| | - Pavel Jungwirth
- Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, Flemingovo nam. 2, 16000 Prague 6, Czech Republic
| | - O. H. Samuli Ollila
- Institute of Biotechnology, University of Helsinki, Helsinki 00014, Finland
| |
|
23
|
Wang Z, Patwardhan A, Kleywegt GJ. Validation analysis of EMDB entries. Acta Crystallogr D Struct Biol 2022; 78:542-552. [PMID: 35503203 PMCID: PMC9063848 DOI: 10.1107/s205979832200328x] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 11/01/2021] [Accepted: 03/23/2022] [Indexed: 11/17/2022] Open
Abstract
The Electron Microscopy Data Bank (EMDB) is the central archive of the electron cryo-microscopy (cryo-EM) community for storing and disseminating volume maps and tomograms. With input from the community, EMDB has developed new resources for the validation of cryo-EM structures, focusing on the quality of the volume data alone and that of the fit of any models, themselves archived in the Protein Data Bank (PDB), to the volume data. Based on recommendations from community experts, the validation resources are developed in a three-tiered system. Tier 1 covers an extensive and evolving set of validation metrics, including tried and tested metrics as well as more experimental ones, which are calculated for all EMDB entries and presented in the Validation Analysis (VA) web resource. This system is particularly useful for cryo-EM experts, both to validate individual structures and to assess the utility of new validation metrics. Tier 2 comprises a subset of the validation metrics covered by the VA resource that have been subjected to extensive testing and are considered to be useful for specialists as well as nonspecialists. These metrics are presented on the entry-specific web pages for the entire archive on the EMDB website. As more experience is gained with the metrics included in the VA resource, it is expected that consensus will emerge in the community regarding a subset that is suitable for inclusion in the tier 2 system. Tier 3, finally, consists of the validation reports and servers that are produced by the Worldwide Protein Data Bank (wwPDB) Consortium. Successful metrics from tier 2 will be proposed for inclusion in the wwPDB validation pipeline and reports. The details of the new resource are described, with an emphasis on the tier 1 system. 
The output of all three tiers is publicly available, either through the EMDB website (tiers 1 and 2) or through the wwPDB ftp sites (tier 3), although the content of all three will evolve over time (fastest for tier 1 and slowest for tier 3). It is our hope that these validation resources will help the cryo-EM community to obtain a better understanding of the quality and of the best ways to assess the quality of cryo-EM structures in EMDB and PDB.
Affiliation(s)
- Zhe Wang
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
| | - Ardan Patwardhan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
| | - Gerard J. Kleywegt
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
| |
|
24
|
Negahdaripour M, Rahbar MR, Mosalanejad Z, Gholami A. Theta-Defensins to Counter COVID-19 as Furin Inhibitors: In Silico Efficiency Prediction and Novel Compound Design. Comput Math Methods Med 2022; 2022:9735626. [PMID: 35154362 PMCID: PMC8829439 DOI: 10.1155/2022/9735626] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/10/2021] [Revised: 12/28/2021] [Accepted: 01/21/2022] [Indexed: 12/13/2022]
Abstract
Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was characterized as a pandemic by the World Health Organization (WHO) in Dec. 2019. SARS-CoV-2 binds to the cell membrane through spike proteins on its surface and infects the cell. Furin, a host-cell enzyme, possesses a binding site for the spike protein. Thus, molecules that block furin could potentially be a therapeutic solution. Defensins are antimicrobial peptides that can hypothetically inhibit furin because of their arginine-rich structure. Theta-defensins, a subclass of defensins, have attracted attention as drug candidates due to their small size, unique structure, and involvement in several defense mechanisms. Theta-defensins could be a potential treatment for COVID-19 through furin inhibition and an anti-inflammatory mechanism. Note that inflammatory events are a significant and deadly condition that could happen at the later stages of COVID-19 infection. Here, the potential of theta-defensins against SARS-CoV-2 infection was investigated through in silico approaches. Based on docking analysis results, theta-defensins can function as furin inhibitors. Additionally, a novel candidate peptide against COVID-19 with optimal properties regarding antigenicity, stability, electrostatic potential, and binding strength was proposed. Further in vitro/in vivo investigations could verify the efficiency of the designed novel peptide.
Affiliation(s)
- Manica Negahdaripour
- Pharmaceutical Sciences Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Department of Pharmaceutical Biotechnology, School of Pharmacy, Shiraz University of Medical Sciences, Shiraz, Iran
| | - Mohammad Reza Rahbar
- Pharmaceutical Sciences Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
| | - Zahra Mosalanejad
- Department of Pharmaceutical Biotechnology, School of Pharmacy, Shiraz University of Medical Sciences, Shiraz, Iran
| | - Ahmad Gholami
- Biotechnology Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
| |
|
25
|
Omar H, Hein A, Cole CA, Valafar H. Concurrent Identification and Characterization of Protein Structure and Continuous Internal Dynamics with REDCRAFT. Front Mol Biosci 2022; 9:806584. [PMID: 35187082 PMCID: PMC8856112 DOI: 10.3389/fmolb.2022.806584] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2021] [Accepted: 01/10/2022] [Indexed: 11/13/2022] Open
Abstract
Internal dynamics of proteins can play a critical role in the biological function of some proteins. Several well documented instances have been reported such as MBP, DHFR, hTS, DGCR8, and NSP1 of the SARS-CoV family of viruses. Despite the importance of internal dynamics of proteins, there currently are very few approaches that allow for meaningful separation of internal dynamics from structural aspects using experimental data. Here we present a computational approach named REDCRAFT that allows for concurrent characterization of protein structure and dynamics. Here, we have subjected DHFR (PDB-ID 1RX2), a 159-residue protein, to a fictitious, mixed mode model of internal dynamics. In this simulation, DHFR was segmented into 7 regions where 4 of the fragments were fixed with respect to each other, two regions underwent rigid-body dynamics, and one region experienced uncorrelated and melting event. The two dynamical and rigid-body segments experienced an average orientational modification of 7° and 12° respectively. Observable RDC data for backbone C′-N, N-HN, and C′-HN were generated from 102 uniformly sampled frames that described the molecular trajectory. The structure calculation of DHFR with REDCRAFT by using traditional Ramachandran restraint produced a structure with 29 Å of structural difference measured over the backbone atoms (bb-rmsd) over the entire length of the protein and an average bb-rmsd of more than 4.7 Å over each of the dynamical fragments. The same exercise repeated with context-specific dihedral restraints generated by PDBMine produced a structure with bb-rmsd of 21 Å over the entire length of the protein but with bb-rmsd of less than 3 Å over each of the fragments. Finally, utilization of the Dynamic Profile generated by REDCRAFT allowed for the identification of different dynamical regions of the protein and the recovery of individual fragments with bb-rmsd of less than 1 Å. 
Following the recovery of the fragments, our assembly procedure of domains (larger segments consisting of multiple fragments with a common dynamical profile) correctly assembled the four fragments that are rigid with respect to each other, categorized the two domains that underwent rigid-body dynamics, and identified one dynamical region for which no conserved structure could be defined. In conclusion, our approach was successful in identifying the dynamical domains, recovery of structure where it is meaningful, and relative assembly of the domains when possible.
|
26
|
Baskaran K, Craft DL, Eghbalnia HR, Gryk MR, Hoch JC, Maciejewski MW, Schuyler AD, Wedell JR, Wilburn CW. Merging NMR Data and Computation Facilitates Data-Centered Research. Front Mol Biosci 2022; 8:817175. [PMID: 35111815 PMCID: PMC8802229 DOI: 10.3389/fmolb.2021.817175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2021] [Accepted: 12/23/2021] [Indexed: 12/01/2022] Open
Abstract
The Biological Magnetic Resonance Data Bank (BMRB) has served the NMR structural biology community for 40 years, and has been instrumental in the development of many widely-used tools. It fosters the reuse of data resources in structural biology by embodying the FAIR data principles (Findable, Accessible, Inter-operable, and Re-usable). NMRbox is less than a decade old, but complements BMRB by providing NMR software and high-performance computing resources, facilitating the reuse of software resources. BMRB and NMRbox both facilitate reproducible research. NMRbox also fosters the development and deployment of complex meta-software. Combining BMRB and NMRbox helps speed and simplify workflows that utilize BMRB, and enables facile federation of BMRB with other data repositories. Utilization of BMRB and NMRbox in tandem will enable additional advances, such as machine learning, that are poised to become increasingly powerful.
Affiliation(s)
| | | | - Hamid R. Eghbalnia
- Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT, United States
| | | | - Jeffrey C. Hoch
- Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT, United States
| | | | | | | | | |
|
27
|
Behzadi P, Gajdács M. Worldwide Protein Data Bank (wwPDB): A virtual treasure for research in biotechnology. Eur J Microbiol Immunol (Bp) 2021; 11:77-86. [PMID: 34908533 PMCID: PMC8830413 DOI: 10.1556/1886.2021.00020] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 11/06/2021] [Accepted: 11/23/2021] [Indexed: 12/25/2022] Open
Abstract
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) provides a wide range of digital data regarding biology and biomedicine. This huge internet resource involves a wide range of important biological data, obtained from experiments around the globe by different scientists. The Worldwide Protein Data Bank (wwPDB) represents a brilliant collection of 3D structure data associated with important and vital biomolecules including nucleic acids (RNAs and DNAs) and proteins. Moreover, this database accumulates knowledge regarding function and evolution of biomacromolecules which supports different disciplines such as biotechnology. 3D structure, functional characteristics and phylogenetic properties of biomacromolecules give a deep understanding of the biomolecules' characteristics. An important advantage of the wwPDB database is the data updating time, which is done every week. This updating process helps users to have the newest data and information for their projects. The data and information in wwPDB can be a great support to have an accurate imagination and illustrations of the biomacromolecules in biotechnology. As demonstrated by the SARS-CoV-2 pandemic, rapidly reliable and accessible biological data for microbiology, immunology, vaccinology, and drug development are critical to address many healthcare-related challenges that are facing humanity. The aim of this paper is to introduce the readers to wwPDB, and to highlight the importance of this database in biotechnology, with the expectation that the number of scientists interested in the utilization of Protein Data Bank's resources will increase substantially in the coming years.
Affiliation(s)
- Payam Behzadi
- Department of Microbiology, College of Basic Sciences, Shahr-e-Qods Branch, Islamic Azad University, Tehran, 37541-374, Iran
| | - Márió Gajdács
- Department of Oral Biology and Experimental Dental Research, Faculty of Dentistry, University of Szeged, 6720, Szeged, Hungary
| |
|
28
|
Wishart DS, Sayeeda Z, Budinski Z, Guo A, Lee BL, Berjanskii M, Rout M, Peters H, Dizon R, Mah R, Torres-Calzada C, Hiebert-Giesbrecht M, Varshavi D, Varshavi D, Oler E, Allen D, Cao X, Gautam V, Maras A, Poynton EF, Tavangar P, Yang V, van Santen JA, Ghosh R, Sarma S, Knutson E, Sullivan V, Jystad AM, Renslow R, Sumner LW, Linington RG, Cort JR. NP-MRD: the Natural Products Magnetic Resonance Database. Nucleic Acids Res 2021; 50:D665-D677. [PMID: 34791429 PMCID: PMC8728158 DOI: 10.1093/nar/gkab1052] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Received: 08/15/2021] [Revised: 10/15/2021] [Accepted: 10/19/2021] [Indexed: 11/15/2022] Open
Abstract
The Natural Products Magnetic Resonance Database (NP-MRD) is a comprehensive, freely available electronic resource for the deposition, distribution, searching and retrieval of nuclear magnetic resonance (NMR) data on natural products, metabolites and other biologically derived chemicals. NMR spectroscopy has long been viewed as the ‘gold standard’ for the structure determination of novel natural products and novel metabolites. NMR is also widely used in natural product dereplication and the characterization of biofluid mixtures (metabolomics). All of these NMR applications require large collections of high quality, well-annotated, referential NMR spectra of pure compounds. Unfortunately, referential NMR spectral collections for natural products are quite limited. It is because of the critical need for dedicated, open access natural product NMR resources that the NP-MRD was funded by the National Institute of Health (NIH). Since its launch in 2020, the NP-MRD has grown quickly to become the world's largest repository for NMR data on natural products and other biological substances. It currently contains both structural and NMR data for nearly 41,000 natural product compounds from >7400 different living species. All structural, spectroscopic and descriptive data in the NP-MRD is interactively viewable, searchable and fully downloadable in multiple formats. Extensive hyperlinks to other databases of relevance are also provided. The NP-MRD also supports community deposition of NMR assignments and NMR spectra (1D and 2D) of natural products and related meta-data. The deposition system performs extensive data enrichment, automated data format conversion and spectral/assignment evaluation. Details of these database features, how they are implemented and plans for future upgrades are also provided. The NP-MRD is available at https://np-mrd.org.
Affiliation(s)
- David S Wishart
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada; Department of Computing Science, University of Alberta, Edmonton, AB T6G 2E8, Canada; Department of Laboratory Medicine and Pathology, University of Alberta, Edmonton, AB T6G 2B7, Canada; Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta, Edmonton, AB T6G 2H7, Canada
| | - Zinat Sayeeda
- Department of Computing Science, University of Alberta, Edmonton, AB T6G 2E8, Canada
| | - Zachary Budinski
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - AnChi Guo
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Brian L Lee
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Mark Berjanskii
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Manoj Rout
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Harrison Peters
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Raynard Dizon
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Robert Mah
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | | | | | - Dorna Varshavi
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Dorsa Varshavi
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Eponine Oler
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Dana Allen
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Xuan Cao
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Vasuk Gautam
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Andrew Maras
- Department of Chemistry, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
| | - Ella F Poynton
- Department of Chemistry, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
| | - Pegah Tavangar
- Department of Chemistry, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
| | - Vera Yang
- Department of Chemistry, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
| | | | - Rajarshi Ghosh
- Department of Biochemistry, University of Missouri, Columbia, MO 65211, USA; MU Metabolomics Center, University of Missouri, Columbia, MO 65211, USA; Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
| | - Saurav Sarma
- Department of Biochemistry, University of Missouri, Columbia, MO 65211, USA; MU Metabolomics Center, University of Missouri, Columbia, MO 65211, USA; Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
| | - Eleanor Knutson
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
| | - Victoria Sullivan
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
| | - Amy M Jystad
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
| | - Ryan Renslow
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
| | - Lloyd W Sumner
- Department of Biochemistry, University of Missouri, Columbia, MO 65211, USA; MU Metabolomics Center, University of Missouri, Columbia, MO 65211, USA; Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
| | - Roger G Linington
- Department of Chemistry, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
| | - John R Cort
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
| |
|
29
|
Huang YJ, Zhang N, Bersch B, Fidelis K, Inouye M, Ishida Y, Kryshtafovych A, Kobayashi N, Kuroda Y, Liu G, LiWang A, Swapna GVT, Wu N, Yamazaki T, Montelione GT. Assessment of prediction methods for protein structures determined by NMR in CASP14: Impact of AlphaFold2. Proteins 2021; 89:1959-1976. [PMID: 34559429 PMCID: PMC8616817 DOI: 10.1002/prot.26246] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 07/24/2021] [Revised: 09/09/2021] [Accepted: 09/14/2021] [Indexed: 12/26/2022]
Abstract
NMR studies can provide unique information about protein conformations in solution. In CASP14, three reference structures provided by solution NMR methods were available (T1027, T1029, and T1055), as well as a fourth data set of NMR‐derived contacts for an integral membrane protein (T1088). For the three targets with NMR‐based structures, the best prediction results ranged from very good (GDT_TS = 0.90, for T1055) to poor (GDT_TS = 0.47, for T1029). We explored the basis of these results by comparing all CASP14 prediction models against experimental NMR data. For T1027, NMR data reveal extensive internal dynamics, presenting a unique challenge for protein structure prediction methods. The analysis of T1029 motivated exploration of a novel method of “inverse structure determination,” in which an AlphaFold2 model was used to guide NMR data analysis. NMR data provided to CASP predictor groups for target T1088, a 238‐residue integral membrane porin, was also used to assess several NMR‐assisted prediction methods. Most groups involved in this exercise generated similar beta‐barrel models, with good agreement with the experimental data. However, as was also observed in CASP13, some pure prediction groups that did not use any NMR data generated models for T1088 that better fit the NMR data than the models generated using these experimental data. These results demonstrate the remarkable power of modern methods to predict structures of proteins with accuracies rivaling solution NMR structures, and that it is now possible to reliably use prediction models to guide and complement experimental NMR data analysis.
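For readers unfamiliar with the GDT_TS values quoted above, the score is the mean, over distance cutoffs of 1, 2, 4 and 8 Å, of the fraction of residues whose C-alpha atom in the model lies within that cutoff of the reference. The Python sketch below assumes the per-residue distances have already been computed from a single superposition (the function name and input format are illustrative assumptions). The full GDT procedure searches many superpositions to maximize each fraction, so this simplified version can only underestimate the true score.

```python
def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS on a 0-1 scale.

    distances: per-residue C-alpha deviations (in angstroms) between a
    model and a reference after ONE fixed superposition. The real GDT
    algorithm optimizes the superposition separately for each cutoff;
    that search is omitted here.
    """
    if not distances:
        raise ValueError("at least one residue distance is required")
    n = len(distances)
    # Fraction of residues within each cutoff, then the mean of the fractions.
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    return sum(fractions) / len(fractions)
```

On this scale a value of 0.90, as reported for T1055, means that on average 90% of residues fall within the four cutoffs, while 0.47 for T1029 indicates that fewer than half do.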
Affiliation(s)
- Yuanpeng Janet Huang
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, USA
| | - Ning Zhang
- Department of Chemistry and Biochemistry, University of California, Merced, California, USA
| | - Beate Bersch
- Biomolecular NMR Spectroscopy Group, Institut de Biologie Structurale, UMD-5075, CNRS-CEA-UJF, Grenoble, France
| | | | - Masayori Inouye
- Department of Biochemistry, Robert Wood Johnson Medical School, Rutgers University, Piscataway, New Jersey, USA; Center for Advanced Biotechnology and Medicine, Rutgers University, Piscataway, New Jersey, USA
| | - Yojiro Ishida
- Department of Biochemistry, Robert Wood Johnson Medical School, Rutgers University, Piscataway, New Jersey, USA; Center for Advanced Biotechnology and Medicine, Rutgers University, Piscataway, New Jersey, USA
| | | | - Naohiro Kobayashi
- NMR Science and Development Division, RSC, RIKEN, Yokohama, Kanagawa, Japan
| | - Yutaka Kuroda
- Department of Biotechnology and Life Science, Graduate School of Engineering, Tokyo University of Agriculture and Technology (TUAT), Tokyo, Japan
| | - Gaohua Liu
- Nexomics Biosciences, Inc., Rocky Hill, New Jersey, USA
| | - Andy LiWang
- Department of Chemistry and Biochemistry, University of California, Merced, California, USA; Center for Cellular and Biomolecular Machines and Health Sciences Research Institute, University of California, Merced, California, USA
| | - G V T Swapna
- Department of Pharmacology, Robert Wood Johnson Medical School, Rutgers University, Piscataway, New Jersey, USA
| | - Nan Wu
- College of Food and Bioengineering, Zhengzhou University of Light Industry, Zhengzhou, China
| | - Toshio Yamazaki
- NMR Science and Development Division, RSC, RIKEN, Yokohama, Kanagawa, Japan
| | - Gaetano T Montelione
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, USA
| |
|
30
|
Fowler NJ, Sljoka A, Williamson MP. The accuracy of NMR protein structures in the Protein Data Bank. Structure 2021; 29:1430-1439.e2. [PMID: 34331857 DOI: 10.1016/j.str.2021.07.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/19/2021] [Revised: 06/18/2021] [Accepted: 07/14/2021] [Indexed: 11/18/2022]
Abstract
The program ANSURR measures the accuracy of NMR structures by comparing rigidity obtained from experimental backbone chemical shifts and from structures. We report on ANSURR analysis of 7,000 PDB NMR ensembles within the Protein Data Bank, which can be found at ansurr.com. The accuracy of NMR structures progressively improved up until 2005, but since then, it has plateaued. Most structures have accurate secondary structure, but are generally too floppy, particularly in loops. Thus, there is a need for more experimental restraints in loops. Currently, the best predictors of accuracy are Ramachandran distribution and the number of NOE restraints per residue. The precision of structures within the ensemble correlates well with accuracy, as does the number of hydrogen bond restraints per residue. Structure accuracy is improved when other components (such as additional polypeptide chains or ligands) are included.
Affiliation(s)
- Nicholas J Fowler
- Department of Molecular Biology and Biotechnology, University of Sheffield, Sheffield S10 2TN, UK
| | - Adnan Sljoka
- RIKEN Center for Advanced Intelligence Project, RIKEN, 1-4-1 Nihombashi, Chuo-ku, Tokyo 103-0027 Japan; Department of Chemistry, University of Toronto, UTM, 3359 Mississauga Road North, Mississauga, ON L5L 1C6, Canada.
| | - Mike P Williamson
- Department of Molecular Biology and Biotechnology, University of Sheffield, Sheffield S10 2TN, UK.
| |
|
31
|
Vincenzi M, Mercurio FA, Leone M. NMR Spectroscopy in the Conformational Analysis of Peptides: An Overview. Curr Med Chem 2021; 28:2729-2782. [PMID: 32614739 DOI: 10.2174/0929867327666200702131032] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/09/2020] [Revised: 05/21/2020] [Accepted: 05/28/2020] [Indexed: 11/22/2022]
Abstract
BACKGROUND: NMR spectroscopy is one of the most powerful tools to study the structure and interaction properties of peptides and proteins from a dynamic perspective. Knowing the bioactive conformations of peptides is crucial in the drug discovery field to design more efficient analogue ligands and inhibitors of protein-protein interactions targeting therapeutically relevant systems. OBJECTIVE: This review provides a toolkit to investigate peptide conformational properties by NMR. METHODS: Articles cited herein, related to NMR studies of peptides and proteins, were mainly found through PubMed and the web. Both recent and older books on NMR spectroscopy written by eminent scientists in the field were consulted as well. RESULTS: The review is mainly focused on NMR tools to obtain the 3D structure of small unlabeled peptides. It is application-oriented; delivering a profound theoretical background is beyond its scope. However, the basic principles of 2D homonuclear and heteronuclear experiments are briefly described. Protocols to obtain isotopically labeled peptides and the principal triple resonance experiments needed to study them are discussed as well. CONCLUSION: NMR is a leading technique in the study of conformational preferences of small flexible peptides whose structure can often only be described by an ensemble of conformations. Although NMR studies of peptides can be performed easily and quickly using canonical protocols established a few decades ago, more recently we have witnessed tremendous improvements in NMR spectroscopy that allow large systems to be investigated and its molecular weight limit to be overcome.
Affiliation(s)
- Marian Vincenzi
- Institute of Biostructures and Bioimaging, National Research Council of Italy, Via Mezzocannone 16, 80134, Naples, Italy
| | - Flavia Anna Mercurio
- Institute of Biostructures and Bioimaging, National Research Council of Italy, Via Mezzocannone 16, 80134, Naples, Italy
| | - Marilisa Leone
- Institute of Biostructures and Bioimaging, National Research Council of Italy, Via Mezzocannone 16, 80134, Naples, Italy
| |
|
32
|
Bozin TN, Chukhontseva KN, Lesovoy DM, Filatov VV, Kozlovskiy VI, Demidyuk IV, Bocharov EV. NMR assignments and secondary structure distribution of emfourin, a novel proteinaceous protease inhibitor. BIOMOLECULAR NMR ASSIGNMENTS 2021; 15:10.1007/s12104-021-10030-x. [PMID: 34091855 DOI: 10.1007/s12104-021-10030-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/05/2021] [Accepted: 06/01/2021] [Indexed: 06/12/2023]
Abstract
Emfourin (M4in) from Serratia proteamaculans is a new proteinaceous inhibitor of protealysin-like proteases (PLPs), a subgroup of the well-known and widely represented metallopeptidase M4 family. Although the biological role of PLPs is debatable, published data indicate their involvement in pathogenesis, including bacterial invasion into eukaryotic cells, suppression of immune defense of some animals, and destruction of plant cell walls. Gene colocalization into a bicistronic operon observed for some PLPs and their inhibitors (as in the case of M4in) implies a mutually consistent functioning of both entities. The originality of the amino acid sequence of M4in suggests it belongs to a previously unknown protein family and this encourages structural studies. In this work, we report a near-complete assignment of 1H, 13C, and 15N resonances of recombinant M4in and its structural-dynamic properties derived from the chemical shifts. According to the NMR data analysis, the M4in molecule comprises 3-5 helical elements and 4-6 β-strands, at least two of which are apparently antiparallel, ascribing this obviously globular protein to the α + β structural class. In addition, two disordered regions exist in the central loops between the regular secondary structural elements. The obtained data provide the basis for determining the high-resolution structure as well as the functioning mechanism of M4in, which can be used for development of new antibacterial therapeutic strategies.
Affiliation(s)
- Timur N Bozin
- Institute of Molecular Genetics of National Research Centre "Kurchatov Institute", 2, Kurchatov Sq, 123182, Moscow, Russia
- National Research Centre "Kurchatov Institute", Moscow, Russia
| | - Ksenia N Chukhontseva
- Institute of Molecular Genetics of National Research Centre "Kurchatov Institute", 2, Kurchatov Sq, 123182, Moscow, Russia
| | - Dmitry M Lesovoy
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow, Russia
| | - Vasily V Filatov
- Chernogolovka Branch of the Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Chernogolovka, Russia
| | - Viacheslav I Kozlovskiy
- Chernogolovka Branch of the Semenov Federal Research Center for Chemical Physics, Russian Academy of Sciences, Chernogolovka, Russia
| | - Ilya V Demidyuk
- Institute of Molecular Genetics of National Research Centre "Kurchatov Institute", 2, Kurchatov Sq, 123182, Moscow, Russia.
| | - Eduard V Bocharov
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow, Russia
| |
|
33
|
Burley SK, Berman HM. Open-access data: A cornerstone for artificial intelligence approaches to protein structure prediction. Structure 2021; 29:515-520. [PMID: 33984281 PMCID: PMC8178243 DOI: 10.1016/j.str.2021.04.010] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 03/16/2021] [Revised: 04/08/2021] [Accepted: 04/23/2021] [Indexed: 12/28/2022]
Abstract
The Protein Data Bank (PDB) was established in 1971 to archive three-dimensional (3D) structures of biological macromolecules as a public good. Fifty years later, the PDB is providing millions of data consumers around the world with open access to more than 175,000 experimentally determined structures of proteins and nucleic acids (DNA, RNA) and their complexes with one another and small-molecule ligands. PDB data users are working, teaching, and learning in fundamental biology, biomedicine, bioengineering, biotechnology, and energy sciences. They also represent the fields of agriculture, chemistry, physics and materials science, mathematics, statistics, computer science, and zoology, and even the social sciences. The enormous wealth of 3D structure data stored in the PDB has underpinned significant advances in our understanding of protein architecture, culminating in recent breakthroughs in protein structure prediction accelerated by artificial intelligence approaches and deep or machine learning methods.
Affiliation(s)
- Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08903, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA 92093, USA.
| | - Helen M Berman
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; The Bridge Institute, Michelson Center for Convergent Bioscience, University of Southern California, Los Angeles, CA 90089, USA.
| |
|
34
|
|
35
|
Feng Z, Westbrook JD, Sala R, Smart OS, Bricogne G, Matsubara M, Yamada I, Tsuchiya S, Aoki-Kinoshita KF, Hoch JC, Kurisu G, Velankar S, Burley SK, Young JY. Enhanced validation of small-molecule ligands and carbohydrates in the Protein Data Bank. Structure 2021; 29:393-400.e1. [PMID: 33657417 PMCID: PMC8026741 DOI: 10.1016/j.str.2021.02.004] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 12/15/2020] [Revised: 01/22/2021] [Accepted: 02/11/2021] [Indexed: 12/19/2022]
Abstract
The Worldwide Protein Data Bank (wwPDB) has provided validation reports based on recommendations from community Validation Task Forces for structures in the PDB since 2013. To further enhance validation of small molecules, as recommended by the 2016 Ligand Validation Workshop, wwPDB, Global Phasing Ltd., and the Noguchi Institute recently formed a public/private partnership to incorporate some of their software tools into the wwPDB validation package. Augmented wwPDB validation report features include: two-dimensional (2D) diagrams of small-molecule ligands and carbohydrates, highlighting geometric validation outcomes; 2D topological diagrams of oligosaccharides present in branched entities generated using 2D Symbol Nomenclature for Glycan representation; and views of 3D electron density maps for ligands and carbohydrates, illustrating the goodness-of-fit between the atomic structure and experimental data (X-ray crystallographic structures only). These improvements will impact confidence in ligand conformation and ligand-macromolecular interactions that will aid in understanding biochemical function and contribute to small-molecule drug discovery.
Affiliation(s)
- Zukang Feng
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, the State University of New Jersey, Piscataway, NJ 08854, USA
| | - John D Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, the State University of New Jersey, Piscataway, NJ 08854, USA
| | - Raul Sala
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, the State University of New Jersey, Piscataway, NJ 08854, USA
| | - Oliver S Smart
- Global Phasing Ltd., Sheraton House, Castle Park, Cambridge CB3 0AX, UK
| | - Gérard Bricogne
- Global Phasing Ltd., Sheraton House, Castle Park, Cambridge CB3 0AX, UK
| | - Masaaki Matsubara
- The Noguchi Institute, 1-9-7, Kaga, Itabashi-ku, Tokyo 173-0003, Japan
| | - Issaku Yamada
- The Noguchi Institute, 1-9-7, Kaga, Itabashi-ku, Tokyo 173-0003, Japan
| | | | - Kiyoko F Aoki-Kinoshita
- Faculty of Science and Engineering, Soka University, 1-236 Tangi-machi, Hachioji-shi, Tokyo 192-8577, Japan; Glycan & Life Systems Integration Center, Soka University, 1-236 Tangi-machi, Hachioji-shi, Tokyo 192-8577, Japan
| | - Jeffrey C Hoch
- Biological Magnetic Resonance Data Bank, Department of Molecular Biology and Biophysics, University of Connecticut, UConn Health, 263 Farmington Avenue, Farmington, CT 06030-3305, USA
| | - Genji Kurisu
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita-shi, Osaka 565-0871, Japan
| | - Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, the State University of New Jersey, Piscataway, NJ 08854, USA; Department of Chemistry and Chemical Biology, Rutgers, the State University of New Jersey, Piscataway, NJ 08854, USA; Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, NJ 08903, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA 92093, USA
| | - Jasmine Y Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, the State University of New Jersey, Piscataway, NJ 08854, USA.
| |
|
36
|
Wu XL, Hu H, Dong XQ, Zhang J, Wang J, Schwieters CD, Liu J, Wu GX, Li B, Lin JY, Wang HY, Lu JX. The amyloid structure of mouse RIPK3 (receptor interacting protein kinase 3) in cell necroptosis. Nat Commun 2021; 12:1627. [PMID: 33712586 PMCID: PMC7955032 DOI: 10.1038/s41467-021-21881-2] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 06/15/2020] [Accepted: 02/15/2021] [Indexed: 12/24/2022] Open
Abstract
The RIPK3 amyloid complex plays crucial roles during TNF-induced necroptosis and in response to immune defense in both human and mouse. Here, we have structurally characterized mouse RIPK3 homogeneous self-assembly using solid-state NMR, revealing a well-ordered N-shaped amyloid core structure featuring three parallel in-register β-sheets. This structure differs from the previously published human RIPK1/RIPK3 hetero-amyloid complex structure, which adopted a serpentine fold. Functional studies indicate both RIPK1-RIPK3 binding and RIPK3 amyloid formation are essential but not sufficient for TNF-induced necroptosis. The structural integrity of the RIPK3 fibril with three β-strands is necessary for signaling. Molecular dynamics simulations with a mouse RIPK1/RIPK3 model indicate that the hetero-amyloid is less stable when adopting the RIPK3 fibril conformation, suggesting a structural transformation of RIPK3 from RIPK1-RIPK3 binding to RIPK3 amyloid formation. This structural transformation would provide the missing link connecting RIPK1-RIPK3 binding to RIPK3 homo-oligomer formation in the signal transduction. Receptor Interacting Protein Kinase 3 (RIPK3) has a key role in TNF-induced necroptosis. Here, the authors combine solid-state NMR measurements, MD simulations and cell-based assays to characterize mouse RIPK3 and they present the structure of the RIPK3 amyloid core.
Affiliation(s)
- Xia-Lian Wu
- School of Life Science and Technology, ShanghaiTech University, Shanghai, PR China; CAS Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, PR China; University of Chinese Academy of Sciences, Beijing, PR China
| | - Hong Hu
- School of Life Science and Technology, ShanghaiTech University, Shanghai, PR China; CAS Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, PR China; University of Chinese Academy of Sciences, Beijing, PR China
| | - Xing-Qi Dong
- School of Life Science and Technology, ShanghaiTech University, Shanghai, PR China; CAS Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, PR China; University of Chinese Academy of Sciences, Beijing, PR China
| | - Jing Zhang
- School of Life Science and Technology, ShanghaiTech University, Shanghai, PR China; CAS Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, PR China; University of Chinese Academy of Sciences, Beijing, PR China
| | - Jian Wang
- School of Life Science and Technology, ShanghaiTech University, Shanghai, PR China
| | - Charles D Schwieters
- Laboratory of Imaging Sciences, Office of Intramural Research, Center for Information Technology, National Institutes of Health, Bethesda, MD, USA
| | - Jing Liu
- School of Life Science and Technology, ShanghaiTech University, Shanghai, PR China; CAS Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, PR China; University of Chinese Academy of Sciences, Beijing, PR China
| | - Guo-Xiang Wu
- School of Life Science and Technology, ShanghaiTech University, Shanghai, PR China; CAS Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, PR China; University of Chinese Academy of Sciences, Beijing, PR China
| | - Bing Li
- School of Life Science and Technology, ShanghaiTech University, Shanghai, PR China
| | - Jing-Yu Lin
- School of Life Science and Technology, ShanghaiTech University, Shanghai, PR China; CAS Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, PR China; University of Chinese Academy of Sciences, Beijing, PR China
| | - Hua-Yi Wang
- School of Life Science and Technology, ShanghaiTech University, Shanghai, PR China.
| | - Jun-Xia Lu
- School of Life Science and Technology, ShanghaiTech University, Shanghai, PR China.
| |
|
37
|
D'Andréa ÉD, Retel JS, Diehl A, Schmieder P, Oschkinat H, Pires JR. NMR structure and dynamics of Q4DY78, a conserved kinetoplasid-specific protein from Trypanosoma cruzi. J Struct Biol 2021; 213:107715. [PMID: 33705979 DOI: 10.1016/j.jsb.2021.107715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2020] [Revised: 03/01/2021] [Accepted: 03/03/2021] [Indexed: 10/21/2022]
Abstract
The 106-residue protein Q4DY78 (UniProt accession number) from Trypanosoma cruzi is highly conserved in the related kinetoplastid pathogens Trypanosoma brucei and Leishmania major. Given the essentiality of its orthologue in T. brucei, the high sequence conservation with other trypanosomatid proteins, and the low sequence similarity with mammalian proteins, Q4DY78 is an attractive protein for structural characterization. Here, we solved the structure of Q4DY78 by solution NMR and evaluated its backbone dynamics. Q4DY78 is composed of five α-helices and a small, two-stranded antiparallel β-sheet. The backbone RMSD is 0.22 ± 0.05 Å for the representative ensemble of the 20 lowest-energy structures. Q4DY78 is overall rigid, except for N-terminal residues (V8 to I10), residues at loop 4 (K57 to G65) and residues at the C-terminus (F89 to F112). Q4DY78 has a short motif FPCAP that could potentially mediate interactions with the host cytoskeleton via interaction with EVH1 (Drosophila Enabled (Ena)/Vasodilator-stimulated phosphoprotein (VASP) homology 1) domains. Although Q4DY78 lacks calcium-binding motifs, its fold resembles that of eukaryotic calcium-binding proteins such as calcitracin, calmodulin, and polcacin Bet V4. We characterized this novel protein with a calcium-binding fold without the capacity to bind calcium.
Affiliation(s)
- Éverton Dias D'Andréa
- Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Av. Carlos Chagas Filho, 373 - Bloco E, sala 32, Rio de Janeiro, RJ 21941-902, Brazil
| | - Joren Sebastian Retel
- Leibniz-Institut für Molekulare Pharmakologie, FMP, Robert-Rössle-Straβe 10, Berlin 13125, Germany
| | - Anne Diehl
- Leibniz-Institut für Molekulare Pharmakologie, FMP, Robert-Rössle-Straβe 10, Berlin 13125, Germany
| | - Peter Schmieder
- Leibniz-Institut für Molekulare Pharmakologie, FMP, Robert-Rössle-Straβe 10, Berlin 13125, Germany
| | - Hartmut Oschkinat
- Leibniz-Institut für Molekulare Pharmakologie, FMP, Robert-Rössle-Straβe 10, Berlin 13125, Germany; Freie Universität Berlin, Institut für Chemie und Biochemie, Takustrasse 3, Berlin 14195, Germany
| | - José Ricardo Pires
- Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Av. Carlos Chagas Filho, 373 - Bloco E, sala 32, Rio de Janeiro, RJ 21941-902, Brazil.
| |
|
38
|
Lawson CL, Kryshtafovych A, Adams PD, Afonine PV, Baker ML, Barad BA, Bond P, Burnley T, Cao R, Cheng J, Chojnowski G, Cowtan K, Dill KA, DiMaio F, Farrell DP, Fraser JS, Herzik MA, Hoh SW, Hou J, Hung LW, Igaev M, Joseph AP, Kihara D, Kumar D, Mittal S, Monastyrskyy B, Olek M, Palmer CM, Patwardhan A, Perez A, Pfab J, Pintilie GD, Richardson JS, Rosenthal PB, Sarkar D, Schäfer LU, Schmid MF, Schröder GF, Shekhar M, Si D, Singharoy A, Terashi G, Terwilliger TC, Vaiana A, Wang L, Wang Z, Wankowicz SA, Williams CJ, Winn M, Wu T, Yu X, Zhang K, Berman HM, Chiu W. Cryo-EM model validation recommendations based on outcomes of the 2019 EMDataResource challenge. Nat Methods 2021; 18:156-164. [PMID: 33542514 PMCID: PMC7864804 DOI: 10.1038/s41592-020-01051-w] [Citation(s) in RCA: 66] [Impact Index Per Article: 22.0] [Received: 06/11/2020] [Accepted: 12/21/2020] [Indexed: 01/30/2023]
Abstract
This paper describes outcomes of the 2019 Cryo-EM Model Challenge. The goals were to (1) assess the quality of models that can be produced from cryogenic electron microscopy (cryo-EM) maps using current modeling software, (2) evaluate reproducibility of modeling results from different software developers and users and (3) compare performance of current metrics used for model evaluation, particularly Fit-to-Map metrics, with focus on near-atomic resolution. Our findings demonstrate the relatively high accuracy and reproducibility of cryo-EM models derived by 13 participating teams from four benchmark maps, including three forming a resolution series (1.8 to 3.1 Å). The results permit specific recommendations to be made about validating near-atomic cryo-EM structures both in the context of individual experiments and structure data archives such as the Protein Data Bank. We recommend the adoption of multiple scoring parameters to provide full and objective annotation and assessment of the model, reflective of the observed cryo-EM map density.
Affiliation(s)
- Catherine L. Lawson
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ USA
| | - Andriy Kryshtafovych
- Genome Center, University of California, Davis, CA USA
| | - Paul D. Adams
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA USA; Department of Bioengineering, University of California Berkeley, Berkeley, CA USA
| | - Pavel V. Afonine
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA USA
| | - Matthew L. Baker
- Department of Biochemistry and Molecular Biology, The University of Texas Health Science Center at Houston, Houston, TX USA
| | - Benjamin A. Barad
- Department of Integrated Computational Structural Biology, The Scripps Research Institute, La Jolla, CA USA
| | - Paul Bond
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
| | - Tom Burnley
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
| | - Renzhi Cao
- Department of Computer Science, Pacific Lutheran University, Tacoma, WA USA
| | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO USA
| | - Grzegorz Chojnowski
- European Molecular Biology Laboratory, c/o DESY, Hamburg, Germany
| | - Kevin Cowtan
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
| | - Ken A. Dill
- Laufer Center, Stony Brook University, Stony Brook, NY USA
| | - Frank DiMaio
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, WA USA
| | - Daniel P. Farrell
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, WA USA
| | - James S. Fraser
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA USA
| | - Mark A. Herzik
- Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA USA
| | - Soon Wen Hoh
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
| | - Jie Hou
- Department of Computer Science, Saint Louis University, St. Louis, MO USA
| | - Li-Wei Hung
- Los Alamos National Laboratory, Los Alamos, NM USA
| | - Maxim Igaev
- Theoretical and Computational Biophysics, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
| | - Agnel P. Joseph
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
| | - Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, IN USA; Department of Computer Science, Purdue University, West Lafayette, IN USA
| | - Dilip Kumar
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX USA
| | - Sumit Mittal
- Biodesign Institute, Arizona State University, Tempe, AZ USA; School of Advanced Sciences and Languages, VIT Bhopal University, Bhopal, India
| | - Bohdan Monastyrskyy
- Genome Center, University of California, Davis, CA USA
| | - Mateusz Olek
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
| | - Colin M. Palmer
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
| | - Ardan Patwardhan
- The European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
| | - Alberto Perez
- Department of Chemistry, University of Florida, Gainesville, FL USA
| | - Jonas Pfab
- Division of Computing & Software Systems, University of Washington, Bothell, WA USA
| | - Grigore D. Pintilie
- Department of Bioengineering, Stanford University, Stanford, CA USA
| | - Jane S. Richardson
- Department of Biochemistry, Duke University, Durham, NC USA
| | - Peter B. Rosenthal
- Structural Biology of Cells and Viruses Laboratory, Francis Crick Institute, London, UK
| | - Daipayan Sarkar
- Department of Biological Sciences, Purdue University, West Lafayette, IN USA; Biodesign Institute, Arizona State University, Tempe, AZ USA
| | - Luisa U. Schäfer
- Institute of Biological Information Processing (IBI-7: Structural Biochemistry) and Jülich Centre for Structural Biology (JuStruct), Forschungszentrum Jülich, Jülich, Germany
| | - Michael F. Schmid
- Division of CryoEM and Bioimaging, SSRL, SLAC National Accelerator Laboratory, Stanford University, Menlo Park, CA USA
| | - Gunnar F. Schröder
- Institute of Biological Information Processing (IBI-7: Structural Biochemistry) and Jülich Centre for Structural Biology (JuStruct), Forschungszentrum Jülich, Jülich, Germany; Physics Department, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
| | - Mrinal Shekhar
- Biodesign Institute, Arizona State University, Tempe, AZ USA; Center for Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA USA
| | - Dong Si
- Division of Computing & Software Systems, University of Washington, Bothell, WA USA
| | - Abishek Singharoy
- Biodesign Institute, Arizona State University, Tempe, AZ USA
| | - Genki Terashi
- Theoretical and Computational Biophysics, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
| | | | - Andrea Vaiana
- Theoretical and Computational Biophysics, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
| | - Liguo Wang
- Department of Biological Structure, University of Washington, Seattle, WA USA
| | - Zhe Wang
- The European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
| | - Stephanie A. Wankowicz
- grid.266102.10000 0001 2297 6811Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA USA ,grid.266102.10000 0001 2297 6811Biophysics Graduate Program, University of California, San Francisco, CA USA
| | | | - Martyn Winn
- grid.465239.fScientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
| | - Tianqi Wu
- grid.134936.a0000 0001 2162 3504Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO USA
| | - Xiaodi Yu
- grid.497530.c0000 0004 0389 4927SMPS, Janssen Research and Development, Spring House, PA USA
| | - Kaiming Zhang
- grid.168010.e0000000419368956Department of Bioengineering, Stanford University, Stanford, CA USA
| | - Helen M. Berman
- grid.430387.b0000 0004 1936 8796Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ USA ,grid.42505.360000 0001 2156 6853Department of Biological Sciences and Bridge Institute, University of Southern California, Los Angeles, CA USA
| | - Wah Chiu
- grid.168010.e0000000419368956Department of Bioengineering, Stanford University, Stanford, CA USA ,grid.168010.e0000000419368956Division of CryoEM and Biomaging, SSRL, SLAC National Accelerator Laboratory, Stanford University, Menlo Park, CA USA
| |
|
39
|
Abstract
The Protein Data Bank (PDB) is the single worldwide archive of experimentally determined macromolecular structure data. Established in 1971 as the first open access data resource in biology, the PDB archive is managed by the worldwide Protein Data Bank (wwPDB) consortium, which has four partners: the RCSB Protein Data Bank (RCSB PDB; rcsb.org), the Protein Data Bank Japan (PDBj; pdbj.org), the Protein Data Bank in Europe (PDBe; pdbe.org), and BioMagResBank (BMRB; www.bmrb.wisc.edu). The PDB archive currently includes ~175,000 entries. The wwPDB has established a number of task forces and working groups that bring together experts from the community who provide recommendations on improving data standards and data validation to improve data quality and integrity. The wwPDB members continue to develop the joint deposition, biocuration, and validation system (OneDep) to improve data quality and accommodate new data from emerging techniques such as 3DEM. Each PDB entry contains a coordinate model and associated metadata for all experimentally determined atomic structures, experimental data for the traditional structure determination techniques (X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy), validation reports, and additional information on quaternary structures. The wwPDB partners are committed to following the FAIR (Findability, Accessibility, Interoperability, and Reproducibility) principles and have implemented a DOI resolution mechanism that provides access to all the relevant files for a given PDB entry. On average, >250 new entries are added to the archive every week and made available by each wwPDB partner via an FTP area. The wwPDB partner sites also develop data access and analysis tools and make these available via their websites. The wwPDB continues to work with experts in the community to establish a federation of archives for structures determined using integrative/hybrid methods, in which multiple experimental techniques are used.
Affiliation(s)
- Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory-European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK.
| | - Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ, USA; Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences and San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA, USA
| | - Genji Kurisu
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka, Japan
| | - Jeffrey C Hoch
- BioMagResBank, Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT, USA
| | - John L Markley
- BioMagResBank, Biochemistry Department, University of Wisconsin-Madison, Madison, WI, USA
| |
|
40
|
Burley SK. Impact of structural biologists and the Protein Data Bank on small-molecule drug discovery and development. J Biol Chem 2021; 296:100559. [PMID: 33744282 PMCID: PMC8059052 DOI: 10.1016/j.jbc.2021.100559] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 11/02/2020] [Revised: 02/02/2021] [Accepted: 03/16/2021] [Indexed: 12/12/2022] Open
Abstract
The Protein Data Bank (PDB) is an international core data resource central to fundamental biology, biomedicine, bioenergy, and biotechnology/bioengineering. Now celebrating its 50th anniversary, the PDB houses >175,000 experimentally determined atomic structures of proteins, nucleic acids, and their complexes with one another and small molecules and drugs. The importance of three-dimensional (3D) biostructure information for research and education obtains from the intimate link between molecular form and function evident throughout biology. Among the most prolific consumers of PDB data are biomedical researchers, who rely on the open access resource as the authoritative source of well-validated, expertly curated biostructures. This review recounts how the PDB grew from just seven protein structures to contain more than 49,000 structures of human proteins that have proven critical for understanding their roles in human health and disease. It then describes how these structures are used in academe and industry to validate drug targets, assess target druggability, characterize how tool compounds and other small-molecules bind to drug targets, guide medicinal chemistry optimization of binding affinity and selectivity, and overcome challenges during preclinical drug development. Three case studies drawn from oncology exemplify how structural biologists and open access to PDB structures impacted recent regulatory approvals of antineoplastic drugs.
Affiliation(s)
- Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA; Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, New Jersey, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, California, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California, USA.
| |
|
41
|
Sali A. From integrative structural biology to cell biology. J Biol Chem 2021; 296:100743. [PMID: 33957123 PMCID: PMC8203844 DOI: 10.1016/j.jbc.2021.100743] [Citation(s) in RCA: 49] [Impact Index Per Article: 16.3] [Received: 02/05/2021] [Revised: 04/09/2021] [Accepted: 04/30/2021] [Indexed: 12/16/2022] Open
Abstract
Integrative modeling is an increasingly important tool in structural biology, providing structures by combining data from varied experimental methods and prior information. As a result, molecular architectures of large, heterogeneous, and dynamic systems, such as the ∼52-MDa Nuclear Pore Complex, can be mapped with useful accuracy, precision, and completeness. Key challenges in improving integrative modeling include expanding model representations, increasing the variety of input data and prior information, quantifying a match between input information and a model in a Bayesian fashion, inventing more efficient structural sampling, as well as developing better model validation, analysis, and visualization. In addition, two community-level challenges in integrative modeling are being addressed under the auspices of the Worldwide Protein Data Bank (wwPDB). First, the impact of integrative structures is maximized by PDB-Development, a prototype wwPDB repository for archiving, validating, visualizing, and disseminating integrative structures. Second, the scope of structural biology is expanded by linking the wwPDB resource for integrative structures with archives of data that have not been generally used for structure determination but are increasingly important for computing integrative structures, such as data from various types of mass spectrometry, spectroscopy, optical microscopy, proteomics, and genetics. To address the largest of modeling problems, a type of integrative modeling called metamodeling is being developed; metamodeling combines different types of input models as opposed to different types of data to compute an output model. Collectively, these developments will facilitate the structural biology mindset in cell biology and underpin spatiotemporal mapping of the entire cell.
Affiliation(s)
- Andrej Sali
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, the Department of Bioengineering and Therapeutic Sciences, the Quantitative Biosciences Institute (QBI), and the Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California, USA.
| |
|
42
|
Fowler NJ, Sljoka A, Williamson MP. A method for validating the accuracy of NMR protein structures. Nat Commun 2020; 11:6321. [PMID: 33339822 PMCID: PMC7749147 DOI: 10.1038/s41467-020-20177-1] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 09/25/2020] [Accepted: 11/13/2020] [Indexed: 01/13/2023] Open
Abstract
We present a method that measures the accuracy of NMR protein structures. It compares random coil index [RCI] against local rigidity predicted by mathematical rigidity theory, calculated from NMR structures [FIRST], using a correlation score (which assesses secondary structure), and an RMSD score (which measures overall rigidity). We test its performance using: structures refined in explicit solvent, which are much better than unrefined structures; decoy structures generated for 89 NMR structures; and conventional predictors of accuracy such as number of restraints per residue, restraint violations, energy of structure, ensemble RMSD, Ramachandran distribution, and clashscore. Restraint violations and RMSD are poor measures of accuracy. Comparisons of NMR to crystal structures show that secondary structure is equally accurate, but crystal structures are typically too rigid in loops, whereas NMR structures are typically too floppy overall. We show that the method is a useful addition to existing measures of accuracy.
Affiliation(s)
- Nicholas J Fowler
- Dept of Molecular Biology and Biotechnology, University of Sheffield, Sheffield, UK
| | - Adnan Sljoka
- RIKEN Center for Advanced Intelligence Project, RIKEN, 1-4-1 Nihombashi, Chuo-ku, Tokyo, 103-0027, Japan.
- Dept of Chemistry, University of Toronto, UTM, 3359 Mississauga Road North, Mississauga, ON, L5L 1C6, Canada.
| | - Mike P Williamson
- Dept of Molecular Biology and Biotechnology, University of Sheffield, Sheffield, UK.
| |
|
43
|
Toward Increased Reliability, Transparency, and Accessibility in Cross-linking Mass Spectrometry. Structure 2020; 28:1259-1268. [PMID: 33065067 DOI: 10.1016/j.str.2020.09.011] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Received: 06/26/2020] [Revised: 09/02/2020] [Accepted: 09/24/2020] [Indexed: 01/09/2023]
Abstract
Cross-linking mass spectrometry (MS) has substantially matured as a method over the past 2 decades through parallel development in multiple labs, demonstrating its applicability to protein structure determination, conformation analysis, and mapping protein interactions in complex mixtures. Cross-linking MS has become a much-appreciated and routinely applied tool, especially in structural biology. Therefore, it is timely that the community commits to the development of methodological and reporting standards. This white paper builds on an open process comprising a number of events at community conferences since 2015 and identifies aspects of Cross-linking MS for which guidelines should be developed as part of a Cross-linking MS standards initiative.
|
44
|
Berman HM, Vallat B, Lawson CL. The data universe of structural biology. IUCRJ 2020; 7:630-638. [PMID: 32695409 PMCID: PMC7340255 DOI: 10.1107/s205225252000562x] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 03/20/2020] [Accepted: 04/21/2020] [Indexed: 05/05/2023]
Abstract
The Protein Data Bank (PDB) has grown from a small data resource for crystallographers to a worldwide resource serving structural biology. The history of the growth of the PDB and the role that the community has played in developing standards and policies are described. This article also illustrates how other biophysics communities are collaborating with the worldwide PDB to create a network of interoperating data resources. This network will expand the capabilities of structural biology and enable the determination and archiving of increasingly complex structures.
Affiliation(s)
- Helen M. Berman
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Department of Biological Sciences and Bridge Institute, University of Southern California, Los Angeles, CA 90089, USA
| | - Brinda Vallat
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Catherine L. Lawson
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| |
|
45
|
Kubatova N, Pyper DJ, Jonker HRA, Saxena K, Remmel L, Richter C, Brantl S, Evguenieva‐Hackenberg E, Hess WR, Klug G, Marchfelder A, Soppa J, Streit W, Mayzel M, Orekhov VY, Fuxreiter M, Schmitz RA, Schwalbe H. Rapid Biophysical Characterization and NMR Spectroscopy Structural Analysis of Small Proteins from Bacteria and Archaea. Chembiochem 2020; 21:1178-1187. [PMID: 31705614 PMCID: PMC7217052 DOI: 10.1002/cbic.201900677] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 11/05/2019] [Indexed: 01/08/2023]
Abstract
Proteins encoded by small open reading frames (sORFs) have a widespread occurrence in diverse microorganisms and can be of high functional importance. However, due to annotation biases and their technically challenging direct detection, these small proteins have been overlooked for a long time and were only recently rediscovered. The currently rapidly growing number of such proteins requires efficient methods to investigate their structure-function relationship. Herein, a method is presented for fast determination of the conformational properties of small proteins. Their small size makes them perfectly amenable for solution-state NMR spectroscopy. NMR spectroscopy can provide detailed information about their conformational states (folded, partially folded, and unstructured). In the context of the priority program on small proteins funded by the German research foundation (SPP2002), 27 small proteins from 9 different bacterial and archaeal organisms have been investigated. It is found that most of these small proteins are unstructured or partially folded. Bioinformatics tools predict that some of these unstructured proteins can potentially fold upon complex formation. A protocol for fast NMR spectroscopy structure elucidation is described for the small proteins that adopt a persistently folded structure by implementation of new NMR technologies, including automated resonance assignment and nonuniform sampling in combination with targeted acquisition.
Affiliation(s)
- Nina Kubatova
- Institute for Organic Chemistry and Chemical Biology, Center for Biomolecular Magnetic Resonance (BMRZ), Johann Wolfgang Goethe University, Max-von-Laue-Strasse 7, 60438 Frankfurt/Main, Germany
| | - Dennis J. Pyper
- Institute for Organic Chemistry and Chemical Biology, Center for Biomolecular Magnetic Resonance (BMRZ), Johann Wolfgang Goethe University, Max-von-Laue-Strasse 7, 60438 Frankfurt/Main, Germany
| | - Hendrik R. A. Jonker
- Institute for Organic Chemistry and Chemical Biology, Center for Biomolecular Magnetic Resonance (BMRZ), Johann Wolfgang Goethe University, Max-von-Laue-Strasse 7, 60438 Frankfurt/Main, Germany
| | - Krishna Saxena
- Institute for Organic Chemistry and Chemical Biology, Center for Biomolecular Magnetic Resonance (BMRZ), Johann Wolfgang Goethe University, Max-von-Laue-Strasse 7, 60438 Frankfurt/Main, Germany
| | - Laura Remmel
- Institute for Organic Chemistry and Chemical Biology, Center for Biomolecular Magnetic Resonance (BMRZ), Johann Wolfgang Goethe University, Max-von-Laue-Strasse 7, 60438 Frankfurt/Main, Germany
| | - Christian Richter
- Institute for Organic Chemistry and Chemical Biology, Center for Biomolecular Magnetic Resonance (BMRZ), Johann Wolfgang Goethe University, Max-von-Laue-Strasse 7, 60438 Frankfurt/Main, Germany
| | - Sabine Brantl
- AG Bakteriengenetik, Matthias-Schleiden-Institut, Philosophenweg 12, 07743 Jena, Germany
| | - Elena Evguenieva‐Hackenberg
- Institute for Microbiology and Molecular Biology, Justus Liebig University Giessen, Heinrich-Buff-Ring 26, 35392 Giessen, Germany
| | - Wolfgang R. Hess
- Faculty of Biology, Genetics and Experimental Bioinformatics, Albert Ludwigs University Freiburg, Schänzlestrasse 1, 79104 Freiburg, Germany
| | - Gabriele Klug
- Institute for Microbiology and Molecular Biology, Justus Liebig University Giessen, Heinrich-Buff-Ring 26, 35392 Giessen, Germany
| | | | - Jörg Soppa
- Institute for Molecular Biosciences, Johann Wolfgang Goethe University, Max-von-Laue-Strasse 9, 60438 Frankfurt am Main, Germany
| | - Wolfgang Streit
- Department of Microbiology and Biotechnology, University of Hamburg, Ohnhorststrasse 18, 22609 Hamburg, Germany
| | - Maxim Mayzel
- Swedish NMR Centre, University of Gothenburg, P. O. Box 465, 40530 Gothenburg, Sweden
| | - Vladislav Y. Orekhov
- Swedish NMR Centre, University of Gothenburg, P. O. Box 465, 40530 Gothenburg, Sweden
- Department of Chemistry and Molecular Biology, University of Gothenburg, Kemigården 4, 41296 Gothenburg, Sweden
| | - Monika Fuxreiter
- MTA-DE Laboratory of Protein Dynamics, Department of Biochemistry and Molecular Biology, University of Debrecen, Nagyerdei krt 98, 4032 Debrecen, Hungary
| | - Ruth A. Schmitz
- Institute for General Microbiology, Christian Albrechts University Kiel, Am Botanischen Garten 1–9, 24118 Kiel, Germany
| | - Harald Schwalbe
- Institute for Organic Chemistry and Chemical Biology, Center for Biomolecular Magnetic Resonance (BMRZ), Johann Wolfgang Goethe University, Max-von-Laue-Strasse 7, 60438 Frankfurt/Main, Germany
| |
|
46
|
Chen L, He J. Outlier Profiles of Atomic Structures Derived from X-ray Crystallography and from Cryo-Electron Microscopy. Molecules 2020; 25:E1540. [PMID: 32231015 PMCID: PMC7181022 DOI: 10.3390/molecules25071540] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/23/2020] [Revised: 03/16/2020] [Accepted: 03/24/2020] [Indexed: 11/19/2022] Open
Abstract
Background: As more protein atomic structures are determined from cryo-electron microscopy (cryo-EM) density maps, validation of such structures is an important task. Methods: We applied a histogram-based outlier score (HBOS) to six sets of cryo-EM atomic structures and five sets of X-ray atomic structures, including one derived from X-ray data with better than 1.5 Å resolution. Cryo-EM data sets contain structures released by December 2016 and those released between 2017 and 2019, derived from resolution ranges 0-4 Å and 4-6 Å respectively. Results: The distribution of HBOS values in five sets of X-ray structures shows that HBOS is sensitive in distinguishing sets of X-ray structures derived from different resolution ranges: higher than 1.5 Å, 1.5-2.0 Å, 2.0-2.5 Å, 2.5-3.0 Å, and 3.0-3.5 Å. The overall quality of cryo-EM structures is likely improved, as shown in a comparison of cryo-EM structures released before the end of 2016, those between 2017 and 2018, and those between 2018 and 2019. Our investigation shows that leucine (LEU) has a significantly higher rate of HBOS outliers than that of the reference data set (X-ray-1.5) and of other residue types in the cryo-EM data sets. HBOS was able to detect outliers for those residues that are currently marked as green in PDB validation reports. Conclusions: The HBOS profile of a dataset is a potential method to characterize the overall structural quality of the set. Residue LEU deserves special attention since it has a significantly higher HBOS outlier rate in sets of cryo-EM structures and those X-ray structures derived from X-ray data of lower than 2.5 Å resolution. Most HBOS outlier residues from the EM-0-4-2019 set are located on loops for most types of residues.
Affiliation(s)
- Lin Chen
- Department of Computer Science, Valdosta State University, 1500 N Patterson St, Valdosta, GA 31698, USA
| | - Jing He
- Department of Computer Science, Old Dominion University, 5115 Hampton Blvd, Norfolk, VA 23529, USA;
| |
|
47
|
Rout MP, Sali A. Principles for Integrative Structural Biology Studies. Cell 2019; 177:1384-1403. [PMID: 31150619 DOI: 10.1016/j.cell.2019.05.016] [Citation(s) in RCA: 177] [Impact Index Per Article: 44.3] [Received: 02/14/2019] [Revised: 04/24/2019] [Accepted: 05/06/2019] [Indexed: 12/22/2022]
Abstract
Integrative structure determination is a powerful approach to modeling the structures of biological systems based on data produced by multiple experimental and theoretical methods, with implications for our understanding of cellular biology and drug discovery. This Primer introduces the theory and methods of integrative approaches, emphasizing the kinds of data that can be effectively included in developing models and using the nuclear pore complex as an example to illustrate the practice and challenges involved. These guidelines are intended to aid the researcher in understanding and applying integrative structural methods to systems of their interest and thus take advantage of this rapidly evolving field.
Affiliation(s)
- Michael P Rout
- Laboratory of Cellular and Structural Biology, The Rockefeller University, New York, NY 10065, USA.
| | - Andrej Sali
- Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, California Institute for Quantitative Biosciences, Byers Hall, 1700 4th Street, Suite 503B, University of California, San Francisco, San Francisco, CA 94158, USA.
| |
|
48
|
Watzel J, Hacker C, Duchardt-Ferner E, Bode HB, Wöhnert J. A New Docking Domain Type in the Peptide-Antimicrobial-Xenorhabdus Peptide Producing Nonribosomal Peptide Synthetase from Xenorhabdus bovienii. ACS Chem Biol 2020; 15:982-989. [DOI: 10.1021/acschembio.9b01022] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 11/30/2022]
Affiliation(s)
- Jonas Watzel
- Molecular Biotechnology, Institute of Molecular Biosciences, Goethe University Frankfurt, 60438 Frankfurt am Main, Germany
| | - Carolin Hacker
- Institute of Molecular Biosciences and Center for Biomolecular Magnetic Resonance (BMRZ), Goethe University Frankfurt, 60438 Frankfurt am Main, Germany
| | - Elke Duchardt-Ferner
- Institute of Molecular Biosciences and Center for Biomolecular Magnetic Resonance (BMRZ), Goethe University Frankfurt, 60438 Frankfurt am Main, Germany
| | - Helge B. Bode
- Molecular Biotechnology, Institute of Molecular Biosciences, Goethe University Frankfurt, 60438 Frankfurt am Main, Germany
- Buchmann Institute for Molecular Life Sciences (BMLS), Goethe University Frankfurt, 60438 Frankfurt am Main, Germany
- Senckenberg Gesellschaft für Naturforschung, 60325 Frankfurt am Main, Germany
| | - Jens Wöhnert
- Institute of Molecular Biosciences and Center for Biomolecular Magnetic Resonance (BMRZ), Goethe University Frankfurt, 60438 Frankfurt am Main, Germany
| |
|
49
|
Hatti KS, McCoy AJ, Oeffner RD, Sammito MD, Read RJ. Factors influencing estimates of coordinate error for molecular replacement. Acta Crystallogr D Struct Biol 2020; 76:19-27. [PMID: 31909740 PMCID: PMC6939440 DOI: 10.1107/s2059798319015730] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/17/2019] [Accepted: 11/21/2019] [Indexed: 11/24/2022] Open
Abstract
Good prior estimates of the effective root-mean-square deviation (r.m.s.d.) between the atomic coordinates of the model and the target optimize the signal in molecular replacement, thereby increasing the success rate in difficult cases. Previous studies using protein structures solved by X-ray crystallography as models showed that optimal error estimates (refined after structure solution) were correlated with the sequence identity between the model and target, and with the number of residues in the model. Here, this work has been extended to find additional correlations between parameters of the model and the target and hence improved prior estimates of the coordinate error. Using a graph database, a curated set of 6030 molecular-replacement calculations using models that had been solved by X-ray crystallography was analysed to consider about 120 model and target parameters. Improved estimates were achieved by replacing the sequence identity with the Gonnet score for sequence similarity, as well as by considering the resolution of the target structure and the MolProbity score of the model. This approach was extended by analysing 12 610 additional molecular-replacement calculations where the model was determined by NMR. The median r.m.s.d. between pairs of models in an ensemble was found to be correlated with the estimated r.m.s.d. to the target. For models solved by NMR, the overall coordinate error estimates were larger than for structures determined by X-ray crystallography, and were more highly correlated with the number of residues.
Affiliation(s)
- Kaushik S. Hatti
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, England
| | - Airlie J. McCoy
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, England
| | - Robert D. Oeffner
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, England
| | - Massimo D. Sammito
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, England
| | - Randy J. Read
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, England
| |
|
50
|
Prisant MG, Williams CJ, Chen VB, Richardson JS, Richardson DC. New tools in MolProbity validation: CaBLAM for CryoEM backbone, UnDowser to rethink "waters," and NGL Viewer to recapture online 3D graphics. Protein Sci 2020; 29:315-329. [PMID: 31724275 PMCID: PMC6933861 DOI: 10.1002/pro.3786] [Citation(s) in RCA: 86] [Impact Index Per Article: 21.5] [Received: 10/02/2019] [Revised: 11/08/2019] [Accepted: 11/11/2019] [Indexed: 12/17/2022]
Abstract
The MolProbity web service provides macromolecular model validation to help correct local errors, for the structural biology community worldwide. Here we highlight new validation features, and also describe how we are fighting back against outside developments which compromise that mission. Our new tool called UnDowser analyzes the properties and context of clashing HOH "waters" to diagnose what they might actually represent; a dozen distinct scenarios are illustrated and described. We now treat alternate conformations more thoroughly, and switching to the Neo4j database (graphical rather than relational) enables cleaner, more comprehensive, and much larger reference datasets. A problematic outside change is that refinement software now increasingly restrains traditional validation criteria (geometry, clashes, rotamers, and even Ramachandran) in order to supplement the sparser experimental data at 3-4 Å resolutions typical of modern cryoEM. But unfortunately the broad density allows model optimization without fixing underlying problems, which means these structures often score much better on validation than they really are. CaBLAM, our tool designed for evaluating peptide orientations at lower resolutions, was described in the previous Tools issue, and here we demonstrate its effectiveness in diagnosing local errors even when other validation outliers have been artificially removed. Sophisticated hacking of the MolProbity server has required continual monitoring and various security measures short of restricting user access. The deprecation of Java applets now prevents KiNG interactive online display of outliers on the 3D model during a MolProbity run, but that important functionality has now been recaptured with a modified version of the Javascript NGL Viewer.
Affiliation(s)
- Michael G. Prisant
- Department of Biochemistry, Duke University Medical Center, Durham, North Carolina
| | | | - Vincent B. Chen
- Department of Biochemistry, Duke University Medical Center, Durham, North Carolina
| | - Jane S. Richardson
- Department of Biochemistry, Duke University Medical Center, Durham, North Carolina
| | - David C. Richardson
- Department of Biochemistry, Duke University Medical Center, Durham, North Carolina
| |
|