1
Bittrich S, Segura J, Duarte JM, Burley SK, Rose Y. RCSB Protein Data Bank: exploring protein 3D similarities via comprehensive structural alignments. Bioinformatics (Oxford, England) 2024; 40:btae370. [PMID: 38870521 DOI: 10.1093/bioinformatics/btae370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Revised: 05/15/2024] [Accepted: 06/11/2024] [Indexed: 06/15/2024]
Abstract
MOTIVATION: Tools for pairwise alignments between 3D structures of proteins are of fundamental importance for structural biology and bioinformatics, enabling visual exploration of evolutionary and functional relationships. However, the absence of a user-friendly, browser-based tool for creating alignments and visualizing them at both 1D sequence and 3D structural levels makes this process unnecessarily cumbersome.
RESULTS: We introduce a novel pairwise structure alignment tool (rcsb.org/alignment) that seamlessly integrates into the RCSB Protein Data Bank (RCSB PDB) research-focused RCSB.org web portal. Our tool and its underlying application programming interface (alignment.rcsb.org) empower users to align several protein chains with a reference structure by providing access to established alignment algorithms (FATCAT, CE, TM-align, or Smith-Waterman 3D). The user-friendly interface simplifies parameter setup and input selection. Within seconds, our tool enables visualization of results in both sequence (1D) and structural (3D) perspectives through the RCSB PDB RCSB.org Sequence Annotations viewer and Mol* 3D viewer, respectively. Users can effortlessly compare structures deposited in the PDB archive alongside more than a million incorporated Computed Structure Models coming from the ModelArchive and AlphaFold DB. Moreover, this tool can be used to align custom structure data by providing a link/URL or uploading atomic coordinate files directly. Importantly, alignment results can be bookmarked and shared with collaborators. By bridging the gap between 1D sequence and 3D structures of proteins, our tool facilitates deeper understanding of complex evolutionary relationships among proteins through comprehensive sequence and structural analyses.
AVAILABILITY AND IMPLEMENTATION: The alignment tool is part of the RCSB PDB research-focused RCSB.org web portal and available at rcsb.org/alignment. Programmatic access is available via alignment.rcsb.org. Frontend code has been published at github.com/rcsb/rcsb-pecos-app. Visualization is powered by the open-source Mol* viewer (github.com/molstar/molstar and github.com/molstar/rcsb-molstar) plus the Sequence Annotations in 3D Viewer (github.com/rcsb/rcsb-saguaro-3d).
Affiliation(s)
- Sebastian Bittrich
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, United States
- Joan Segura
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, United States
- Jose M Duarte
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, United States
- Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, United States
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, United States
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, United States
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, United States
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, United States
- Yana Rose
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, United States
2
Ellaway JIJ, Anyango S, Nair S, Zaki HA, Nadzirin N, Powell HR, Gutmanas A, Varadi M, Velankar S. Identifying protein conformational states in the Protein Data Bank: Toward unlocking the potential of integrative dynamics studies. Structural Dynamics (Melville, N.Y.) 2024; 11:034701. [PMID: 38774441 PMCID: PMC11106648 DOI: 10.1063/4.0000251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2024] [Accepted: 05/08/2024] [Indexed: 05/24/2024]
Abstract
Studying protein dynamics and conformational heterogeneity is crucial for understanding biomolecular systems and treating disease. Despite the deposition of over 215 000 macromolecular structures in the Protein Data Bank and the advent of AI-based structure prediction tools such as AlphaFold2, RoseTTAFold, and ESMFold, static representations are typically produced, which fail to fully capture macromolecular motion. Here, we discuss the importance of integrating experimental structures with computational clustering to explore the conformational landscapes that manifest protein function. We describe the method developed by the Protein Data Bank in Europe - Knowledge Base to identify distinct conformational states, demonstrate the resource's primary use cases, through examples, and discuss the need for further efforts to annotate protein conformations with functional information. Such initiatives will be crucial in unlocking the potential of protein dynamics data, expediting drug discovery research, and deepening our understanding of macromolecular mechanisms.
Affiliation(s)
- Joseph I. J. Ellaway
- Protein Data Bank in Europe, European Bioinformatics Institute, Hinxton, United Kingdom
- Stephen Anyango
- Protein Data Bank in Europe, European Bioinformatics Institute, Hinxton, United Kingdom
- Sreenath Nair
- Protein Data Bank in Europe, European Bioinformatics Institute, Hinxton, United Kingdom
- Hossam A. Zaki
- The Warren Alpert Medical School of Brown University, Providence, Rhode Island 02903, USA
- Nurul Nadzirin
- Protein Data Bank in Europe, European Bioinformatics Institute, Hinxton, United Kingdom
- Harold R. Powell
- Imperial College London, Department of Life Sciences, London, United Kingdom
- Aleksandras Gutmanas
- WaveBreak Therapeutics Ltd., Clarendon House, Clarendon Road, Cambridge, United Kingdom
- Mihaly Varadi
- Protein Data Bank in Europe, European Bioinformatics Institute, Hinxton, United Kingdom
- Sameer Velankar
- Protein Data Bank in Europe, European Bioinformatics Institute, Hinxton, United Kingdom
3
Mróz J, Pelc M, Mitusińska K, Chorostowska-Wynimko J, Jezela-Stanek A. Computational Tools to Assist in Analyzing Effects of the SERPINA1 Gene Variation on Alpha-1 Antitrypsin (AAT). Genes (Basel) 2024; 15:340. [PMID: 38540399 PMCID: PMC10970068 DOI: 10.3390/genes15030340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2024] [Revised: 02/28/2024] [Accepted: 03/04/2024] [Indexed: 06/14/2024] Open
Abstract
In the rapidly advancing field of bioinformatics, the development and application of computational tools to predict the effects of single nucleotide variants (SNVs) are shedding light on the molecular mechanisms underlying disorders. Also, they hold promise for guiding therapeutic interventions and personalized medicine strategies in the future. A comprehensive understanding of the impact of SNVs in the SERPINA1 gene on alpha-1 antitrypsin (AAT) protein structure and function requires integrating bioinformatic approaches. Here, we provide a guide for clinicians to navigate through the field of computational analyses which can be applied to describe a novel genetic variant. Predicting the clinical significance of SERPINA1 variation allows clinicians to tailor treatment options for individuals with alpha-1 antitrypsin deficiency (AATD) and related conditions, ultimately improving the patient's outcome and quality of life. This paper explores the various bioinformatic methodologies and cutting-edge approaches dedicated to the assessment of molecular variants of genes and their product proteins using SERPINA1 and AAT as an example.
Affiliation(s)
- Jakub Mróz
- Tunneling Group, Biotechnology Center, Silesian University of Technology, Krzywoustego St. 8, 44-100 Gliwice, Poland
- Magdalena Pelc
- Department of Genetics and Clinical Immunology, National Institute of Tuberculosis and Lung Diseases, 26 Plocka St., 01-138 Warsaw, Poland
- Karolina Mitusińska
- Tunneling Group, Biotechnology Center, Silesian University of Technology, Krzywoustego St. 8, 44-100 Gliwice, Poland
- Joanna Chorostowska-Wynimko
- Department of Genetics and Clinical Immunology, National Institute of Tuberculosis and Lung Diseases, 26 Plocka St., 01-138 Warsaw, Poland
- Aleksandra Jezela-Stanek
- Department of Genetics and Clinical Immunology, National Institute of Tuberculosis and Lung Diseases, 26 Plocka St., 01-138 Warsaw, Poland
4
Thakur M, Buniello A, Brooksbank C, Gurwitz KT, Hall M, Hartley M, Hulcoop DG, Leach AR, Marques D, Martin M, Mithani A, McDonagh EM, Mutasa-Gottgens E, Ochoa D, Perez-Riverol Y, Stephenson J, Varadi M, Velankar S, Vizcaino JA, Witham R, McEntyre J. EMBL's European Bioinformatics Institute (EMBL-EBI) in 2023. Nucleic Acids Res 2024; 52:D10-D17. [PMID: 38015445 PMCID: PMC10767983 DOI: 10.1093/nar/gkad1088] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/26/2023] [Revised: 10/23/2023] [Accepted: 10/30/2023] [Indexed: 11/29/2023] Open
Abstract
The European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI) is one of the world's leading sources of public biomolecular data. Based at the Wellcome Genome Campus in Hinxton, UK, EMBL-EBI is one of six sites of the European Molecular Biology Laboratory (EMBL), Europe's only intergovernmental life sciences organisation. This overview summarises the latest developments in the services provided by EMBL-EBI data resources to scientific communities globally. These developments aim to ensure EMBL-EBI resources meet the current and future needs of these scientific communities, accelerating the impact of open biological data for all.
Affiliation(s)
- Matthew Thakur
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Annalisa Buniello
- Open Targets, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Catherine Brooksbank
- Training Team, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Kim T Gurwitz
- Training Team, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Matthew Hall
- Industry Partnerships, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Matthew Hartley
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- David G Hulcoop
- Open Targets, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Andrew R Leach
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Industry Partnerships, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Diana Marques
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Maria Martin
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Aziz Mithani
- Training Team, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Ellen M McDonagh
- Open Targets, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Euphemia Mutasa-Gottgens
- Industry Partnerships, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- David Ochoa
- Open Targets, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Yasset Perez-Riverol
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- James Stephenson
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Mihaly Varadi
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Sameer Velankar
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Juan Antonio Vizcaino
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Rick Witham
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Johanna McEntyre
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
5
Segura J, Rose Y, Bi C, Duarte J, Burley SK, Bittrich S. RCSB Protein Data Bank: visualizing groups of experimentally determined PDB structures alongside computed structure models of proteins. Frontiers in Bioinformatics 2023; 3:1311287. [PMID: 38111685 PMCID: PMC10726007 DOI: 10.3389/fbinf.2023.1311287] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Accepted: 11/17/2023] [Indexed: 12/20/2023] Open
Abstract
Recent advances in Artificial Intelligence and Machine Learning (e.g., AlphaFold, RoseTTAFold, and ESMFold) enable prediction of three-dimensional (3D) protein structures from amino acid sequences alone at accuracies comparable to lower-resolution experimental methods. These tools have been employed to predict structures across entire proteomes and the results of large-scale metagenomic sequence studies, yielding an exponential increase in available biomolecular 3D structural information. Given the enormous volume of this newly computed biostructure data, there is an urgent need for robust tools to manage, search, cluster, and visualize large collections of structures. Equally important is the capability to efficiently summarize and visualize metadata, biological/biochemical annotations, and structural features, particularly when working with vast numbers of protein structures of both experimental origin from the Protein Data Bank (PDB) and computationally-predicted models. Moreover, researchers require advanced visualization techniques that support interactive exploration of multiple sequences and structural alignments. This paper introduces a suite of tools provided on the RCSB PDB research-focused web portal RCSB.org, tailor-made for efficient management, search, organization, and visualization of this burgeoning corpus of 3D macromolecular structure data.
Affiliation(s)
- Joan Segura
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, San Diego, CA, United States
- Yana Rose
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, San Diego, CA, United States
- Chunxiao Bi
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, San Diego, CA, United States
- Jose Duarte
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, San Diego, CA, United States
- Stephen K. Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, San Diego, CA, United States
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ, United States
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, United States
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ, United States
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ, United States
- Sebastian Bittrich
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, San Diego, CA, United States
6
Stahl K, Graziadei A, Dau T, Brock O, Rappsilber J. Protein structure prediction with in-cell photo-crosslinking mass spectrometry and deep learning. Nat Biotechnol 2023; 41:1810-1819. [PMID: 36941363 PMCID: PMC10713450 DOI: 10.1038/s41587-023-01704-z] [Citation(s) in RCA: 36] [Impact Index Per Article: 36.0] [Received: 09/26/2022] [Accepted: 02/06/2023] [Indexed: 03/23/2023]
Abstract
While AlphaFold2 can predict accurate protein structures from the primary sequence, challenges remain for proteins that undergo conformational changes or for which few homologous sequences are known. Here we introduce AlphaLink, a modified version of the AlphaFold2 algorithm that incorporates experimental distance restraint information into its network architecture. By employing sparse experimental contacts as anchor points, AlphaLink improves on the performance of AlphaFold2 in predicting challenging targets. We confirm this experimentally by using the noncanonical amino acid photo-leucine to obtain information on residue-residue contacts inside cells by crosslinking mass spectrometry. The program can predict distinct conformations of proteins on the basis of the distance restraints provided, demonstrating the value of experimental data in driving protein structure prediction. The noise-tolerant framework for integrating data in protein structure prediction presented here opens a path to accurate characterization of protein structures from in-cell data.
Affiliation(s)
- Kolja Stahl
- Robotics and Biology Laboratory, Technische Universität Berlin, Berlin, Germany
- Andrea Graziadei
- Technische Universität Berlin, Chair of Bioanalytics, Berlin, Germany
- Therese Dau
- Technische Universität Berlin, Chair of Bioanalytics, Berlin, Germany
- Fritz Lipmann Institute, Leibniz Institute on Aging, Jena, Germany
- Oliver Brock
- Robotics and Biology Laboratory, Technische Universität Berlin, Berlin, Germany
- Science of Intelligence, Research Cluster of Excellence, Berlin, Germany
- Juri Rappsilber
- Technische Universität Berlin, Chair of Bioanalytics, Berlin, Germany
- Si-M/'Der Simulierte Mensch', a Science Framework of Technische Universität Berlin and Charité - Universitätsmedizin Berlin, Berlin, Germany
- Wellcome Centre for Cell Biology, University of Edinburgh, Edinburgh, UK
7
Kulczyk AW. Artificial intelligence and the analysis of cryo-EM data provide structural insight into the molecular mechanisms underlying LN-lamininopathies. Sci Rep 2023; 13:17825. [PMID: 37857770 PMCID: PMC10587063 DOI: 10.1038/s41598-023-45200-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Accepted: 10/17/2023] [Indexed: 10/21/2023] Open
Abstract
Laminins (Lm) are major components of basement membranes (BM), which polymerize to form a planar lattice on the cell surface. Genetic alterations of Lm affect their oligomerization patterns and lead to failures in BM assembly, manifesting in a group of human disorders collectively defined as Lm N-terminal domain lamininopathies (LN-lamininopathies). We have employed a recently determined cryo-EM structure of the Lm polymer node, the basic repeating unit of the Lm lattice, along with structure prediction and modeling to systematically analyze structures of twenty-three pathogenic Lm polymer nodes implicated in human disease. Our analysis provides a detailed mechanistic explanation of how Lm mutations lead to the failures in Lm polymerization underlying LN-lamininopathies. We propose a new categorization scheme of LN-lamininopathies based on the insight gained from the structural analysis. Our results can help to facilitate rational drug design aimed at the treatment of Lm deficiencies.
Affiliation(s)
- Arkadiusz W Kulczyk
- Institute for Quantitative Biomedicine, Rutgers University, 174 Frelinghuysen Road, Piscataway, NJ, 08854, USA
- Department of Biochemistry & Microbiology, Rutgers University, 75 Lipman Drive, New Brunswick, NJ, 08901, USA
8
Varadi M, Tsenkov M, Velankar S. Challenges in bridging the gap between protein structure prediction and functional interpretation. Proteins 2023. [PMID: 37850517 DOI: 10.1002/prot.26614] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Revised: 09/26/2023] [Accepted: 10/04/2023] [Indexed: 10/19/2023]
Abstract
The rapid evolution of protein structure prediction tools has significantly broadened access to protein structural data. Although predicted structure models have the potential to accelerate and impact fundamental and translational research significantly, it is essential to note that they are not validated and cannot be considered the ground truth. Thus, challenges persist, particularly in capturing protein dynamics, predicting multi-chain structures, interpreting protein function, and assessing model quality. Interdisciplinary collaborations are crucial to overcoming these obstacles. Databases like the AlphaFold Protein Structure Database, the ESM Metagenomic Atlas, and initiatives like the 3D-Beacons Network provide FAIR access to these data, enabling their interpretation and application across a broader scientific community. Whilst substantial advancements have been made in protein structure prediction, further progress is required to address the remaining challenges. Developing training materials, nurturing collaborations, and ensuring open data sharing will be paramount in this pursuit. The continued evolution of these tools and methodologies will deepen our understanding of protein function and accelerate disease pathogenesis and drug development discoveries.
Affiliation(s)
- Mihaly Varadi
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
- Maxim Tsenkov
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
- Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
9
O'Reilly FJ, Graziadei A, Forbrig C, Bremenkamp R, Charles K, Lenz S, Elfmann C, Fischer L, Stülke J, Rappsilber J. Protein complexes in cells by AI-assisted structural proteomics. Mol Syst Biol 2023; 19:e11544. [PMID: 36815589 PMCID: PMC10090944 DOI: 10.15252/msb.202311544] [Citation(s) in RCA: 31] [Impact Index Per Article: 31.0] [Received: 01/16/2023] [Revised: 01/24/2023] [Accepted: 02/07/2023] [Indexed: 02/24/2023] Open
Abstract
Accurately modeling the structures of proteins and their complexes using artificial intelligence is revolutionizing molecular biology. Experimental data enable a candidate-based approach to systematically model novel protein assemblies. Here, we use a combination of in-cell crosslinking mass spectrometry and co-fractionation mass spectrometry (CoFrac-MS) to identify protein-protein interactions in the model Gram-positive bacterium Bacillus subtilis. We show that crosslinking interactions prior to cell lysis reveals protein interactions that are often lost upon cell lysis. We predict the structures of these protein interactions and others in the SubtiWiki database with AlphaFold-Multimer and, after controlling for the false-positive rate of the predictions, we propose novel structural models of 153 dimeric and 14 trimeric protein assemblies. Crosslinking MS data independently validates the AlphaFold predictions and scoring. We report and validate novel interactors of central cellular machineries that include the ribosome, RNA polymerase, and pyruvate dehydrogenase, assigning function to several uncharacterized proteins. Our approach uncovers protein-protein interactions inside intact cells, provides structural insight into their interaction interfaces, and is applicable to genetically intractable organisms, including pathogenic bacteria.
Affiliation(s)
- Francis J O'Reilly
- Chair of Bioanalytics, Technische Universität Berlin, Berlin, Germany
- Present address: Center for Structural Biology, Center for Cancer Research, National Cancer Institute (NCI), Frederick, MD, USA
- Rica Bremenkamp
- Department of General Microbiology, Institute of Microbiology and Genetics, Georg-August-University Göttingen, Göttingen, Germany
- Swantje Lenz
- Chair of Bioanalytics, Technische Universität Berlin, Berlin, Germany
- Christoph Elfmann
- Department of General Microbiology, Institute of Microbiology and Genetics, Georg-August-University Göttingen, Göttingen, Germany
- Lutz Fischer
- Chair of Bioanalytics, Technische Universität Berlin, Berlin, Germany
- Jörg Stülke
- Department of General Microbiology, Institute of Microbiology and Genetics, Georg-August-University Göttingen, Göttingen, Germany
- Juri Rappsilber
- Chair of Bioanalytics, Technische Universität Berlin, Berlin, Germany
- Wellcome Centre for Cell Biology, University of Edinburgh, Edinburgh, UK
10
Bittrich S, Bhikadiya C, Bi C, Chao H, Duarte JM, Dutta S, Fayazi M, Henry J, Khokhriakov I, Lowe R, Piehl DW, Segura J, Vallat B, Voigt M, Westbrook JD, Burley SK, Rose Y. RCSB Protein Data Bank: Efficient Searching and Simultaneous Access to One Million Computed Structure Models Alongside the PDB Structures Enabled by Architectural Advances. J Mol Biol 2023:167994. [PMID: 36738985 DOI: 10.1016/j.jmb.2023.167994] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 11/18/2022] [Revised: 01/27/2023] [Accepted: 01/28/2023] [Indexed: 02/05/2023]
Abstract
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) provides open access to experimentally-determined three-dimensional (3D) structures of biomolecules. The RCSB PDB RCSB.org research-focused web portal is used annually by many millions of users around the world. They access biostructure information, run complex queries utilizing various search services (e.g., full-text, structural and chemical attribute, chemical, sequence, and structure similarity searches), and visualize macromolecules in 3D, all at no charge and with no limitations on data usage. Notwithstanding more than 24,000-fold growth of the PDB over the past five decades, experimentally-determined structures are only available for a small subset of the millions of proteins of known sequence. Recently developed machine learning software tools can predict 3D structures of proteins at accuracies comparable to lower-resolution experimental methods. The RCSB PDB now provides access to ∼1,000,000 Computed Structure Models (CSMs) of proteins coming from AlphaFold DB and the ModelArchive alongside ∼200,000 experimentally-determined PDB structures. Both CSMs and PDB structures are available on RCSB.org and via well-established RCSB PDB Data, Search, and 1D-Coordinates application programming interfaces (APIs). Simultaneous delivery of PDB data and CSMs provides users with access to complementary structural information across the human proteome and those of model organisms and selected pathogens. API enhancements are backwards-compatible and programmatic users can "opt in" to access CSMs with minimal effort. Herein, we describe modifications to RCSB PDB cyberinfrastructure required to support sixfold scaling of 3D biostructure data delivery and lay the groundwork for scaling to accommodate hundreds of millions of CSMs.
Affiliation(s)
- Sebastian Bittrich
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
- Charmi Bhikadiya
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
- Chunxiao Bi
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
- Henry Chao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Jose M Duarte
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
- Shuchismita Dutta
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
- Maryam Fayazi
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Jeremy Henry
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
- Igor Khokhriakov
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
- Robert Lowe
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Dennis W Piehl
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Joan Segura
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Brinda Vallat
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
| | - Maria Voigt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - John D Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
| | - Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Yana Rose
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| |
|
11
|
Baltoumas FA, Karatzas E, Paez-Espino D, Venetsianou NK, Aplakidou E, Oulas A, Finn RD, Ovchinnikov S, Pafilis E, Kyrpides NC, Pavlopoulos GA. Exploring microbial functional biodiversity at the protein family level-From metagenomic sequence reads to annotated protein clusters. FRONTIERS IN BIOINFORMATICS 2023; 3:1157956. [PMID: 36959975 PMCID: PMC10029925 DOI: 10.3389/fbinf.2023.1157956] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2023] [Accepted: 02/21/2023] [Indexed: 03/06/2023] Open
Abstract
Metagenomics has enabled accessing the genetic repertoire of natural microbial communities. Metagenome shotgun sequencing has become the method of choice for studying and classifying microorganisms from various environments. To this end, several methods have been developed to process and analyze the sequence data from raw reads to end-products such as predicted protein sequences or families. In this article, we provide a thorough review to simplify such processes and discuss the alternative methodologies that can be followed in order to explore biodiversity at the protein family level. We provide details for analysis tools and we comment on their scalability as well as their advantages and disadvantages. Finally, we report the available data repositories and recommend various approaches for protein family annotation related to phylogenetic distribution, structure prediction and metadata enrichment.
Affiliation(s)
- Fotis A. Baltoumas
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
- *Correspondence: Fotis A. Baltoumas; Nikos C. Kyrpides; Georgios A. Pavlopoulos
| | - Evangelos Karatzas
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
| | - David Paez-Espino
- Lawrence Berkeley National Laboratory, DOE Joint Genome Institute, Berkeley, CA, United States
| | - Nefeli K. Venetsianou
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
| | - Eleni Aplakidou
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
| | - Anastasis Oulas
- The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
| | - Robert D. Finn
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, United Kingdom
| | - Sergey Ovchinnikov
- John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA, United States
| | - Evangelos Pafilis
- Institute of Marine Biology, Biotechnology and Aquaculture (IMBBC), Hellenic Centre for Marine Research (HCMR), Heraklion, Greece
| | - Nikos C. Kyrpides
- Lawrence Berkeley National Laboratory, DOE Joint Genome Institute, Berkeley, CA, United States
- *Correspondence: Fotis A. Baltoumas; Nikos C. Kyrpides; Georgios A. Pavlopoulos
| | - Georgios A. Pavlopoulos
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, Greece
- Center of New Biotechnologies and Precision Medicine, Department of Medicine, School of Health Sciences, National and Kapodistrian University of Athens, Athens, Greece
- Hellenic Army Academy, Vari, Greece
- *Correspondence: Fotis A. Baltoumas; Nikos C. Kyrpides; Georgios A. Pavlopoulos
| |
|
12
|
Ngashangva N, Mukherjee PK, Sharma C, Kalita MC, Sarangthem I. Integrated genomics and proteomics analysis of Paenibacillus peoriae IBSD35 and insights into its antimicrobial characteristics. Sci Rep 2022; 12:18861. [PMID: 36344671 PMCID: PMC9640621 DOI: 10.1038/s41598-022-23613-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2022] [Accepted: 11/02/2022] [Indexed: 11/09/2022] Open
Abstract
Antimicrobial resistance (AMR) has been developing fast and incurring a loss of human life, and there is a need for new antimicrobial agents. Naturally occurring antimicrobial peptides offer the characteristics to counter AMR because resistance to them develops slowly or not at all. Antimicrobial peptides from Paenibacillus peoriae IBSD35 cell-free supernatant were salted out and purified using chromatography and characterized with liquid chromatography-tandem-mass spectrometry. The extract has shown a high and broad spectrum of antimicrobial activity. Combining the strain IBSD35 genome sequence with its proteomic data enabled the prediction of biosynthetic gene clusters by connecting the peptides from LC-MS/MS data to the genes that encode them. Antimicrobial peptide databases offered a platform for the effective search, prediction, and design of AMPs and expanded the studies on their isolation, structure elucidation, biological evaluation, and pathway engineering. The genome-based taxonomy and comparisons have shown that P. peoriae IBSD35 is closely related to Paenibacillus peoriae FSL J3-0120. P. peoriae IBSD35 harbored endophytic trait genes and nonribosomal peptide synthases biosynthetic gene clusters. The comparative genomics revealed evolutionary insights and facilitated the discovery of novel SMs using proteomics from the extract of P. peoriae IBSD35. It will increase the potential to find novel bio-molecules to counter AMR.
Affiliation(s)
- Ng Ngashangva
- A National Institute of Department of Biotechnology, Institute of Bioresources and Sustainable Development (IBSD), Govt. of India, Takyelpat, Imphal, Manipur 795001, India
| | - Pulok K. Mukherjee
- A National Institute of Department of Biotechnology, Institute of Bioresources and Sustainable Development (IBSD), Govt. of India, Takyelpat, Imphal, Manipur 795001, India
| | - Chandradev Sharma
- A National Institute of Department of Biotechnology, Institute of Bioresources and Sustainable Development (IBSD), Govt. of India, Takyelpat, Imphal, Manipur 795001, India
| | - Mohan C. Kalita
- Department of Biotechnology, Gauhati University, Jalukbari, Guwahati, Assam 781014, India
| | - Indira Sarangthem
- A National Institute of Department of Biotechnology, Institute of Bioresources and Sustainable Development (IBSD), Govt. of India, Takyelpat, Imphal, Manipur 795001, India
| |
|
13
|
Shao C, Bittrich S, Wang S, Burley SK. Assessing PDB macromolecular crystal structure confidence at the individual amino acid residue level. Structure 2022; 30:1385-1394.e3. [PMID: 36049478 PMCID: PMC9547844 DOI: 10.1016/j.str.2022.08.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Revised: 06/24/2022] [Accepted: 08/05/2022] [Indexed: 11/22/2022]
Abstract
Approximately 87% of the more than 190,000 atomic-level three-dimensional (3D) biostructures in the PDB were determined using macromolecular crystallography (MX). Agreement between 3D atomic coordinates and experimental data for >100 million individual amino acid residues occurring within ∼150,000 PDB MX structures was analyzed in detail. The real-space correlation coefficient (RSCC) calculated using the 3D atomic coordinates for each residue and experimental-data-derived electron density enables outlier detection of unreliable atomic coordinates (particularly important for poorly resolved side-chain atoms) and ready evaluation of local structure quality by PDB users. For human protein MX structures in PDB, comparisons of the per-residue RSCC metric with AlphaFold2-computed structure model confidence (pLDDT, predicted local distance difference test) document (1) that RSCC values and pLDDT scores are correlated (median correlation coefficient ∼0.41), and (2) that experimentally determined MX structures (3.5 Å resolution or better) are more reliable than AlphaFold2-computed structure models and should be used preferentially whenever possible.
Affiliation(s)
- Chenghua Shao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.
| | - Sebastian Bittrich
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Sijian Wang
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Statistics, Rutgers, The State University of New Jersey, New Brunswick, NJ 08903, USA
| | - Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA; Rutgers Cancer Institute of New Jersey, Robert Wood Johnson Medical School, New Brunswick, NJ 08903, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.
| |
|
14
|
Paria P, Chakraborty HJ, Behera BK. Identification of novel salt tolerance-associated proteins from the secretome of Enterococcus faecalis. World J Microbiol Biotechnol 2022; 38:177. [PMID: 35934729 DOI: 10.1007/s11274-022-03354-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/01/2022] [Accepted: 07/04/2022] [Indexed: 11/30/2022]
Abstract
The ability of bacteria to adapt to the external environment is fundamental for their survival. A halotolerant microorganism, Enterococcus faecalis, able to grow under high salt stress conditions was isolated in the present study. The SDS-PAGE analysis of the secretome showed a protein band with a molecular weight of 28 kDa whose intensity gradually increased with an increase in salt concentration, and the highest intensity was observed at the 15% salt stress condition. LC-MS/MS analysis of this particular band identified fourteen different proteins, out of which nine proteins were uncharacterized. Further, the function of uncharacterized proteins was predicted based on structure-function relationship using a reverse template search approach deciphering uncharacterized protein into type III polyketide synthases, stress-induced protein-1, Eed-h3k79me3, ba42 protein, 3-methyladenine DNA glycosylase, Atxa protein, membrane-bound respiratory hydrogenase, type-i restriction-modification system methylation subunit and ManxA. STRING network analysis further showed a strong association among the proteins. The processes predicted involvement of these proteins in signal transduction, ion transport, synthesis of the protective layer, cellular homeostasis and regulation of gene expression and different metabolic pathways. Thus, the fourteen proteins identified in the secretome play an essential role in maintaining cellular homeostasis in E. faecalis under high-salinity stress. This may represent a novel and previously unreported strategy by E. faecalis to maintain its normal growth and physiology under high salinity conditions.
Affiliation(s)
- Prasenjit Paria
- Aquatic Environmental Biotechnology and Nanotechnology Division, ICAR-Central Inland Fisheries Research Institute, Barrackpore, Kolkata, 700120, India
| | - Hirak Jyoti Chakraborty
- Aquatic Environmental Biotechnology and Nanotechnology Division, ICAR-Central Inland Fisheries Research Institute, Barrackpore, Kolkata, 700120, India
| | - Bijay Kumar Behera
- Aquatic Environmental Biotechnology and Nanotechnology Division, ICAR-Central Inland Fisheries Research Institute, Barrackpore, Kolkata, 700120, India.
| |
|
15
|
Westbrook JD, Young JY, Shao C, Feng Z, Guranovic V, Lawson CL, Vallat B, Adams PD, Berrisford JM, Bricogne G, Diederichs K, Joosten RP, Keller P, Moriarty NW, Sobolev OV, Velankar S, Vonrhein C, Waterman DG, Kurisu G, Berman HM, Burley SK, Peisach E. PDBx/mmCIF Ecosystem: Foundational Semantic Tools for Structural Biology. J Mol Biol 2022; 434:167599. [PMID: 35460671 DOI: 10.1016/j.jmb.2022.167599] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Received: 11/30/2021] [Revised: 03/31/2022] [Accepted: 04/13/2022] [Indexed: 02/07/2023]
Abstract
PDBx/mmCIF, Protein Data Bank Exchange (PDBx) macromolecular Crystallographic Information Framework (mmCIF), has become the data standard for structural biology. With its early roots in the domain of small-molecule crystallography, PDBx/mmCIF provides an extensible data representation that is used for deposition, archiving, remediation, and public dissemination of experimentally determined three-dimensional (3D) structures of biological macromolecules by the Worldwide Protein Data Bank (wwPDB, wwpdb.org). Extensions of PDBx/mmCIF are similarly used for computed structure models by ModelArchive (modelarchive.org), integrative/hybrid structures by PDB-Dev (pdb-dev.wwpdb.org), small angle scattering data by Small Angle Scattering Biological Data Bank SASBDB (sasbdb.org), and for models generated with the AlphaFold 2.0 deep learning software suite (alphafold.ebi.ac.uk). Community-driven development of PDBx/mmCIF spans three decades, involving contributions from researchers, software and methods developers in structural sciences, data repository providers, scientific publishers, and professional societies. Having a semantically rich and extensible data framework for representing a wide range of structural biology experimental and computational results, combined with expertly curated 3D biostructure data sets in public repositories, accelerates the pace of scientific discovery. Herein, we describe the architecture of the PDBx/mmCIF data standard, tools used to maintain representations of the data standard, governance, and processes by which data content standards are extended, plus community tools/software libraries available for processing and checking the integrity of PDBx/mmCIF data. Use cases exemplify how the members of the Worldwide Protein Data Bank have used PDBx/mmCIF as the foundation for its pipeline for delivering Findable, Accessible, Interoperable, and Reusable (FAIR) data to many millions of users worldwide.
Affiliation(s)
- John D Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
| | - Jasmine Y Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Chenghua Shao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Zukang Feng
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Vladimir Guranovic
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Catherine L Lawson
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Brinda Vallat
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Paul D Adams
- Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA; Department of Bioengineering, University of California at Berkeley, Berkeley, CA 94720, USA
| | - John M Berrisford
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Gerard Bricogne
- Global Phasing Ltd, Sheraton House, Castle Park, Cambridge CB3 0AK, UK
| | | | - Robbie P Joosten
- Department of Biochemistry, Netherlands Cancer Institute, Amsterdam, the Netherlands; Oncode Institute, 3521 AL Utrecht, the Netherlands. https://www.twitter.com/Robbie_Joosten
| | - Peter Keller
- Global Phasing Ltd, Sheraton House, Castle Park, Cambridge CB3 0AK, UK
| | - Nigel W Moriarty
- Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Oleg V Sobolev
- Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Clemens Vonrhein
- Global Phasing Ltd, Sheraton House, Castle Park, Cambridge CB3 0AK, UK
| | - David G Waterman
- UKRI-STFC Rutherford Appleton Laboratory, Didcot OX11 0FA, UK; CCP4, Research Complex at Harwell, Rutherford Appleton Laboratory, Didcot OX11 0FA, UK. https://www.twitter.com/upintheair
| | - Genji Kurisu
- Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan
| | - Helen M Berman
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; The Bridge Institute, Michelson Center for Convergent Bioscience, University of Southern California, Los Angeles, CA, USA
| | - Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA.
| | - Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.
| |
|
16
|
Segura J, Rose Y, Bittrich S, Burley SK, Duarte JM. OUP accepted manuscript. Bioinformatics 2022; 38:3304-3305. [PMID: 35543462 PMCID: PMC9191206 DOI: 10.1093/bioinformatics/btac317] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/20/2021] [Accepted: 05/06/2022] [Indexed: 11/17/2022] Open
Abstract
Motivation: Mapping positional features from one-dimensional (1D) sequences onto three-dimensional (3D) structures of biological macromolecules is a powerful tool to show geometric patterns of biochemical annotations and provide a better understanding of the mechanisms underpinning protein and nucleic acid function at the atomic level. Results: We present a new library designed to display fully customizable interactive views between 1D positional features of protein and/or nucleic acid sequences and their 3D structures as isolated chains or components of macromolecular assemblies. Availability and implementation: https://github.com/rcsb/rcsb-saguaro-3d. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
| | - Yana Rose
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Sebastian Bittrich
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| | - Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
| | - Jose M Duarte
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA
| |
|
17
|
Kryshtafovych A, Moult J, Albrecht R, Chang GA, Chao K, Fraser A, Greenfield J, Hartmann MD, Herzberg O, Josts I, Leiman PG, Linden SB, Lupas AN, Nelson DC, Rees SD, Shang X, Sokolova ML, Tidow H. Computational models in the service of X-ray and cryo-electron microscopy structure determination. Proteins 2021; 89:1633-1646. [PMID: 34449113 PMCID: PMC8616789 DOI: 10.1002/prot.26223] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 05/07/2021] [Revised: 08/11/2021] [Accepted: 08/17/2021] [Indexed: 01/20/2023]
Abstract
Critical assessment of structure prediction (CASP) conducts community experiments to determine the state of the art in computing protein structure from amino acid sequence. The process relies on the experimental community providing information about not yet public or about to be solved structures, for use as targets. For some targets, the experimental structure is not solved in time for use in CASP. Calculated structure accuracy improved dramatically in this round, implying that models should now be much more useful for resolving many sorts of experimental difficulties. To test this, selected models for seven unsolved targets were provided to the experimental groups. These models were from the AlphaFold2 group, who overall submitted the most accurate predictions in CASP14. Four targets were solved with the aid of the models, and, additionally, the structure of an already solved target was improved. An a posteriori analysis showed that, in some cases, models from other groups would also be effective. This paper provides accounts of the successful application of models to structure determination, including molecular replacement for X-ray crystallography, backbone tracing and sequence positioning in a cryo-electron microscopy structure, and correction of local features. The results suggest that, in future, there will be greatly increased synergy between computational and experimental approaches to structure determination.
Affiliation(s)
| | - John Moult
- Institute for Bioscience and Biotechnology Research, Department of Cell Biology and Molecular genetics, University of Maryland, 9600 Gudelsky Drive, Rockville, MD 20850, USA
| | - Reinhard Albrecht
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
| | - Geoffrey A. Chang
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California-San Diego, La Jolla, CA, 92093, USA
- Department of Pharmacology, University of California-San Diego, La Jolla, CA, 92093, USA
| | - Kinlin Chao
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD 20850, USA
| | - Alec Fraser
- Department of Biochemistry and Molecular Biology, Sealy Center for Structural Biology and Molecular Biophysics (SCSB), The University of Texas Medical Branch at Galveston, TX 77555, USA
| | - Julia Greenfield
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD 20850, USA
| | - Marcus D. Hartmann
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
| | - Osnat Herzberg
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD 20850, USA
- Department of Chemistry and Biochemistry, University of Maryland, College Park, MD 20742, USA
| | - Inokentijs Josts
- The Hamburg Advanced Research Center for Bioorganic Chemistry (HARBOR) & Department of Chemistry, Institute for Biochemistry and Molecular Biology, University of Hamburg, Luruper Chaussee 149, 22761 Hamburg, Germany
| | - Petr G. Leiman
- Department of Biochemistry and Molecular Biology, Sealy Center for Structural Biology and Molecular Biophysics (SCSB), The University of Texas Medical Branch at Galveston, TX 77555, USA
| | - Sara B. Linden
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD 20850, USA
| | - Andrei N. Lupas
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
| | - Daniel C. Nelson
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD 20850, USA
- Department of Veterinary Medicine, University of Maryland, College Park, MD 20742, USA
| | - Steven D. Rees
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California-San Diego, La Jolla, CA, 92093, USA
| | - Xiaoran Shang
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD 20850, USA
| | - Maria L. Sokolova
- Center of Life Sciences, Skolkovo Institute of Science and Technology, Moscow, 121205, Russia
| | - Henning Tidow
- The Hamburg Advanced Research Center for Bioorganic Chemistry (HARBOR) & Department of Chemistry, Institute for Biochemistry and Molecular Biology, University of Hamburg, Luruper Chaussee 149, 22761 Hamburg, Germany
| | | |
|
18
|
Ngashangva N, Mukherjee P, Sharma KC, Kalita MC, Indira S. Analysis of Antimicrobial Peptide Metabolome of Bacterial Endophyte Isolated From Traditionally Used Medicinal Plant Millettia pachycarpa Benth. Front Microbiol 2021; 12:656896. [PMID: 34149644 PMCID: PMC8208310 DOI: 10.3389/fmicb.2021.656896] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/21/2021] [Accepted: 04/21/2021] [Indexed: 12/13/2022] Open
Abstract
Increasing prevalence of antimicrobial resistance (AMR) has posed a major health concern worldwide, and the addition of new antimicrobial agents is diminishing due to overexploitation of plants and microbial resources. Inevitably, alternative sources and new strategies are needed to find novel biomolecules to counter AMR and pandemic circumstances. The association of plants with microorganisms is one basic natural interaction that involves the exchange of biomolecules. Such a symbiotic relationship might affect the respective bio-chemical properties and production of secondary metabolites in the host and microbes. Furthermore, the discovery of taxol and taxane from an endophytic fungus, Taxomyces andreanae from Taxus wallachiana, has stimulated much research on endophytes from medicinal plants. A gram-positive endophytic bacterium, Paenibacillus peoriae IBSD35, was isolated from the stem of Millettia pachycarpa Benth. It is a rod-shaped, motile, gram-positive, and endospore-forming bacterium. It is neutralophilic as per Joint Genome Institute’s (JGI) IMG system analysis. The plant was selected based on its ethnobotany history of traditional uses and highly insecticidal properties. Bioactive molecules were purified from P. peoriae IBSD35 culture broth using 70% ammonium sulfate and column chromatography techniques. The biomolecule was enriched 151.72-fold and the yield percentage was 0.05. Peoriaerin II, a highly potent and broad-spectrum antimicrobial peptide against Staphylococcus aureus ATCC 25923, Escherichia coli ATCC 25922, and Candida albicans ATCC 10231, was isolated. LC-MS sequencing revealed that its N-terminal residue is methionine. It has four negatively charged residues (Asp + Glu) and two positively charged residues (Arg + Lys). Its molecular weight is 4,685.13 Da.
It is linked to an LC-MS/MS inferred biosynthetic gene cluster with accession number A0A2S6P0H9, and blastp has shown it is 82.4% similar to fusaricidin synthetase of Paenibacillus polymyxa SC2. The 3D structure conformation of the BGC and AMP were predicted using SWISS MODEL homology modeling. Therefore, combining both genomic and proteomic results obtained from P. peoriae IBSD35, associated with M. pachycarpa Benth., will substantially increase the understanding of antimicrobial peptides and assist to uncover novel biological agents.
Affiliation(s)
- Ng Ngashangva
- A National Institute of Department of Biotechnology, Institute of Bioresources and Sustainable Development (IBSD), Govt. of India, Imphal, India
| | - Pulok Mukherjee
- A National Institute of Department of Biotechnology, Institute of Bioresources and Sustainable Development (IBSD), Govt. of India, Imphal, India
| | - K Chandradev Sharma
- A National Institute of Department of Biotechnology, Institute of Bioresources and Sustainable Development (IBSD), Govt. of India, Imphal, India
| | - M C Kalita
- Department of Biotechnology, Gauhati University, Guwahati, India
| | - Sarangthem Indira
- A National Institute of Department of Biotechnology, Institute of Bioresources and Sustainable Development (IBSD), Govt. of India, Imphal, India
| |
|
19
|
Pizzagalli MD, Bensimon A, Superti‐Furga G. A guide to plasma membrane solute carrier proteins. FEBS J 2021; 288:2784-2835. [PMID: 32810346 PMCID: PMC8246967 DOI: 10.1111/febs.15531] [Citation(s) in RCA: 155] [Impact Index Per Article: 51.7] [Received: 03/27/2020] [Revised: 08/07/2020] [Accepted: 08/17/2020] [Indexed: 12/13/2022]
Abstract
This review aims to serve as an introduction to the solute carrier proteins (SLC) superfamily of transporter proteins and their roles in human cells. The SLC superfamily currently includes 458 transport proteins in 65 families that carry a wide variety of substances across cellular membranes. While members of this superfamily are found throughout cellular organelles, this review focuses on transporters expressed at the plasma membrane. At the cell surface, SLC proteins may be viewed as gatekeepers of the cellular milieu, dynamically responding to different metabolic states. With altered metabolism being one of the hallmarks of cancer, we also briefly review the roles that surface SLC proteins play in the development and progression of cancer through their influence on regulating metabolism and environmental conditions.
Affiliation(s)
- Mattia D. Pizzagalli
- CeMM, Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
| | - Ariel Bensimon
- CeMM, Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
| | - Giulio Superti‐Furga
- CeMM, Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Center for Physiology and Pharmacology, Medical University of Vienna, Austria
| |
|
20
|
de Araújo RSA, Mendonça FJ, Scotti MT, Scotti L. Protein modeling. Physical Sciences Reviews 2021. [DOI: 10.1515/psr-2018-0161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
Proteins are essential and versatile polymers consisting of sequenced amino acids that often possess an organized three-dimensional arrangement (a result of their monomeric composition), which determines their biological role in cellular function. Proteins are involved in enzymatic catalysis; they participate in genetic information decoding and transmission processes, in cell recognition, in signaling and transport of substances, in regulation of intra- and extracellular conditions, and in other functions.
Affiliation(s)
- Rodrigo S. A. de Araújo
- Biological Science Department, Laboratory of Synthesis and Drug Delivery, State University of Paraiba, 58070-450, João Pessoa, PB, Brazil
| | - Francisco J. B. Mendonça
- Biological Science Department, Laboratory of Synthesis and Drug Delivery, State University of Paraiba, 58070-450, João Pessoa, PB, Brazil
| | - Marcus T. Scotti
- Health Center, Federal University of Paraíba, 50670-910, João Pessoa, PB, Brazil
| | - Luciana Scotti
- Health Center, Federal University of Paraíba, 50670-910, João Pessoa, PB, Brazil
| |
|
21
|
Wang T, Cook I, Leyh TS. The molecular basis of OH-PCB estrogen receptor activation. J Biol Chem 2021; 296:100353. [PMID: 33524392 PMCID: PMC7949139 DOI: 10.1016/j.jbc.2021.100353] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/08/2020] [Revised: 01/25/2021] [Accepted: 01/27/2021] [Indexed: 11/16/2022] Open
Abstract
Polychlorinated biphenyls (PCBs) continue to contaminate food chains globally where they concentrate in tissues and disrupt the endocrine systems of species throughout the ecosphere. Hydroxylated PCBs (OH-PCBs) are major PCB metabolites and high-affinity inhibitors of human estrogen sulfotransferase (SULT1E1), which sulfonates estrogens and thus prevents them from binding to and activating their receptors. OH-PCB inhibition of SULT1E1 is believed to contribute significantly to PCB-based endocrine disruption. Here, for the first time, the molecular basis of OH-PCB inhibition of SULT1E1 is revealed in a structure of SULT1E1 in complex with OH-PCB1 (4ʹ-OH-2,6-dichlorobiphenol) and its substrates, estradiol (E2), and PAP (3’-phosphoadenosine-5-phosphosulfate). OH-PCB1 prevents catalysis by intercalating between E2 and catalytic residues and establishes a new E2-binding site whose E2 affinity and positioning are greater than and competitive with those of the reactive-binding pocket. Such complexes have not been observed previously and offer a novel template for the design of high-affinity inhibitors. Mutating residues in direct contact with OH-PCB weakens its affinity without compromising the enzyme’s catalytic parameters. These OH-PCB resistant mutants were used in stable transfectant studies to demonstrate that OH-PCBs regulate estrogen receptors in cultured human cell lines by binding the OH-PCB binding pocket of SULT1E1.
Affiliation(s)
- Ting Wang
- Department of Microbiology and Immunology, Albert Einstein College of Medicine, Bronx, New York, USA
| | - Ian Cook
- Department of Microbiology and Immunology, Albert Einstein College of Medicine, Bronx, New York, USA
| | - Thomas S Leyh
- Department of Microbiology and Immunology, Albert Einstein College of Medicine, Bronx, New York, USA.
| |
|
22
|
Abstract
Genome sequencing projects have resulted in a rapid increase in the number of known protein sequences. In contrast, only about one-hundredth of these sequences have been characterized at atomic resolution using experimental structure determination methods. Computational protein structure modeling techniques have the potential to bridge this sequence-structure gap. In the following chapter, we present an example that illustrates the use of MODELLER to construct a comparative model for a protein with unknown structure. Automation of a similar protocol has resulted in models of useful accuracy for domains in more than half of all known protein sequences.
|
23
|
Sali A. From integrative structural biology to cell biology. J Biol Chem 2021; 296:100743. [PMID: 33957123 PMCID: PMC8203844 DOI: 10.1016/j.jbc.2021.100743] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Received: 02/05/2021] [Revised: 04/09/2021] [Accepted: 04/30/2021] [Indexed: 12/16/2022] Open
Abstract
Integrative modeling is an increasingly important tool in structural biology, providing structures by combining data from varied experimental methods and prior information. As a result, molecular architectures of large, heterogeneous, and dynamic systems, such as the ∼52-MDa Nuclear Pore Complex, can be mapped with useful accuracy, precision, and completeness. Key challenges in improving integrative modeling include expanding model representations, increasing the variety of input data and prior information, quantifying a match between input information and a model in a Bayesian fashion, inventing more efficient structural sampling, as well as developing better model validation, analysis, and visualization. In addition, two community-level challenges in integrative modeling are being addressed under the auspices of the Worldwide Protein Data Bank (wwPDB). First, the impact of integrative structures is maximized by PDB-Development, a prototype wwPDB repository for archiving, validating, visualizing, and disseminating integrative structures. Second, the scope of structural biology is expanded by linking the wwPDB resource for integrative structures with archives of data that have not been generally used for structure determination but are increasingly important for computing integrative structures, such as data from various types of mass spectrometry, spectroscopy, optical microscopy, proteomics, and genetics. To address the largest of modeling problems, a type of integrative modeling called metamodeling is being developed; metamodeling combines different types of input models as opposed to different types of data to compute an output model. Collectively, these developments will facilitate the structural biology mindset in cell biology and underpin spatiotemporal mapping of the entire cell.
Affiliation(s)
- Andrej Sali
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, the Department of Bioengineering and Therapeutic Sciences, the Quantitative Biosciences Institute (QBI), and the Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California, USA.
| |
|
24
|
Toward Increased Reliability, Transparency, and Accessibility in Cross-linking Mass Spectrometry. Structure 2020; 28:1259-1268. [PMID: 33065067 DOI: 10.1016/j.str.2020.09.011] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 06/26/2020] [Revised: 09/02/2020] [Accepted: 09/24/2020] [Indexed: 01/09/2023]
Abstract
Cross-linking mass spectrometry (MS) has substantially matured as a method over the past 2 decades through parallel development in multiple labs, demonstrating its applicability to protein structure determination, conformation analysis, and mapping protein interactions in complex mixtures. Cross-linking MS has become a much-appreciated and routinely applied tool, especially in structural biology. Therefore, it is timely that the community commits to the development of methodological and reporting standards. This white paper builds on an open process comprising a number of events at community conferences since 2015 and identifies aspects of Cross-linking MS for which guidelines should be developed as part of a Cross-linking MS standards initiative.
|
25
|
Dokholyan NV. Experimentally-driven protein structure modeling. J Proteomics 2020; 220:103777. [PMID: 32268219 PMCID: PMC7214187 DOI: 10.1016/j.jprot.2020.103777] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 11/18/2019] [Revised: 03/17/2020] [Accepted: 04/02/2020] [Indexed: 11/25/2022]
Abstract
Revolutions in the natural and exact sciences that started at the dawn of the last century have led to an explosion of theoretical, experimental, and computational approaches to determine the structures of molecules and complexes, as well as their rich conformational dynamics. Since different experimental methods produce information that is attributed to specific time and length scales, corresponding computational methods have to be tailored to these scales and experiments. These methods can then be combined and integrated across scales, hence producing a fuller picture of molecular structure and motion from the "puzzle pieces" offered by various experiments. Here, we describe a number of computational approaches that utilize experimental data to glance into the structure of proteins and understand their dynamics. We will also discuss the limitations and the resolution of the constraints-based modeling approaches. SIGNIFICANCE: Experimentally-driven computational structure modeling and determination is a rapidly evolving alternative to traditional approaches for molecular structure determination. These new hybrid experimental-computational approaches are proving to be a powerful microscope to glance into the structural features of intrinsically or partially disordered proteins and the dynamics of molecules and complexes. In this review, we describe various approaches in the field of experimentally-driven computational structure modeling.
Affiliation(s)
- Nikolay V Dokholyan
- Department of Pharmacology, Penn State University College of Medicine, Hershey, PA 17033, USA; Department of Biochemistry & Molecular Biology, Penn State College of Medicine, Hershey, PA 17033, USA; Department of Chemistry, Pennsylvania State University, University Park, PA 16802, USA; Department of Biomedical Engineering, Pennsylvania State University, University Park, PA 16802, USA.
| |
|
26
|
Abriata LA, Lepore R, Dal Peraro M. About the need to make computational models of biological macromolecules available and discoverable. Bioinformatics 2020; 36:2952-2954. [DOI: 10.1093/bioinformatics/btaa086] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 11/22/2019] [Revised: 01/13/2020] [Accepted: 02/06/2020] [Indexed: 12/19/2022] Open
Affiliation(s)
- Luciano A Abriata
- Laboratory for Biomolecular Modeling
- Protein Production and Structure Core Facility, School of Life Sciences, École Polytechnique Fédérale de Lausanne and Swiss Institute of Bioinformatics, Lausanne CH-1015, Switzerland
| | - Rosalba Lepore
- BSC-CNS Barcelona Supercomputing Center, Barcelona, Spain
| | | |
|
27
|
Revisiting the "satisfaction of spatial restraints" approach of MODELLER for protein homology modeling. PLoS Comput Biol 2019; 15:e1007219. [PMID: 31846452 PMCID: PMC6938380 DOI: 10.1371/journal.pcbi.1007219] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/25/2019] [Revised: 12/31/2019] [Accepted: 11/13/2019] [Indexed: 01/02/2023] Open
Abstract
The most frequently used approach for protein structure prediction is currently homology modeling. The 3D model building phase of this methodology is critical for obtaining an accurate and biologically useful prediction. The most widely employed tool to perform this task is MODELLER. This program implements the “modeling by satisfaction of spatial restraints” strategy and its core algorithm has not been altered significantly since the early 1990s. In this work, we have explored the idea of modifying MODELLER with two effective, yet computationally light strategies to improve its 3D modeling performance. Firstly, we have investigated how the level of accuracy in the estimation of structural variability between a target protein and its templates in the form of σ values profoundly influences 3D modeling. We show that the σ values produced by MODELLER are on average weakly correlated to the true level of structural divergence between target-template pairs and that increasing this correlation greatly improves the program’s predictions, especially in multiple-template modeling. Secondly, we have inquired into how the incorporation of statistical potential terms (such as the DOPE potential) in MODELLER’s objective function positively impacts 3D modeling quality by providing a small but consistent improvement in metrics such as GDT-HA and lDDT and a large increase in stereochemical quality. Python modules to harness this second strategy are freely available at https://github.com/pymodproject/altmod. In summary, we show that there is considerable room for improving MODELLER in terms of 3D modeling quality and we propose strategies that could be pursued in order to further increase its performance.
Proteins are fundamental biological molecules that carry out countless activities in living beings. Since the function of proteins is dictated by their three-dimensional atomic structures, acquiring structural details of proteins provides deep insights into their function. Currently, the most frequently used computational approach for protein structure prediction is template-based modeling. In this approach, a target protein is modeled using the experimentally-derived structural information of a template protein assumed to have a similar structure to the target. MODELLER is the most frequently used program for template-based 3D model building. Despite its success, its predictions are not always accurate enough to be useful in biomedical research. Here, we show that it is possible to greatly increase the performance of MODELLER by modifying two aspects of its algorithm. First, we demonstrate that providing the program with accurate estimations of local target-template structural divergence greatly increases the quality of its predictions. Additionally, we show that modifying MODELLER’s scoring function with statistical potential energetic terms also helps to improve modeling quality. This work will be useful in future research, since it reports practical strategies to improve the performance of this core tool in structural bioinformatics.
|
28
|
Berman HM, Adams PD, Bonvin AA, Burley SK, Carragher B, Chiu W, DiMaio F, Ferrin TE, Gabanyi MJ, Goddard TD, Griffin PR, Haas J, Hanke CA, Hoch JC, Hummer G, Kurisu G, Lawson CL, Leitner A, Markley JL, Meiler J, Montelione GT, Phillips GN, Prisner T, Rappsilber J, Schriemer DC, Schwede T, Seidel CAM, Strutzenberg TS, Svergun DI, Tajkhorshid E, Trewhella J, Vallat B, Velankar S, Vuister GW, Webb B, Westbrook JD, White KL, Sali A. Federating Structural Models and Data: Outcomes from A Workshop on Archiving Integrative Structures. Structure 2019; 27:1745-1759. [PMID: 31780431 DOI: 10.1016/j.str.2019.11.002] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 09/20/2019] [Revised: 10/31/2019] [Accepted: 11/06/2019] [Indexed: 12/23/2022]
Abstract
Structures of biomolecular systems are increasingly computed by integrative modeling. In this approach, a structural model is constructed by combining information from multiple sources, including varied experimental methods and prior models. In 2019, a Workshop was held as a Biophysical Society Satellite Meeting to assess progress and discuss further requirements for archiving integrative structures. The primary goal of the Workshop was to build consensus for addressing the challenges involved in creating common data standards, building methods for federated data exchange, and developing mechanisms for validating integrative structures. The summary of the Workshop and the recommendations that emerged are presented here.
Affiliation(s)
- Helen M Berman
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA; Bridge Institute, Michelson Center, University of Southern California, Los Angeles, CA 90089, USA.
| | - Paul D Adams
- Physical Biosciences Division, Lawrence Berkeley Laboratory, Berkeley, CA 94720-8235, USA; Department of Bioengineering, University of California-Berkeley, Berkeley, CA 94720, USA
| | - Alexandre A Bonvin
- Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CH Utrecht, the Netherlands
| | - Stephen K Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences and San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA; Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08903, USA
| | - Bridget Carragher
- Simons Electron Microscopy Center, New York Structural Biology Center, New York, NY 10027, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA
| | - Wah Chiu
- Department of Bioengineering, Department of Microbiology and Immunology, Stanford University, Stanford, CA 94305-5447, USA; SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
| | - Frank DiMaio
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
| | - Thomas E Ferrin
- Department of Pharmaceutical Chemistry, University of California, San Francisco, CA 94158, USA
| | - Margaret J Gabanyi
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Thomas D Goddard
- Department of Pharmaceutical Chemistry, University of California, San Francisco, CA 94158, USA
| | | | - Juergen Haas
- Swiss Institute of Bioinformatics and Biozentrum, University of Basel, 4056 Basel, Switzerland
| | - Christian A Hanke
- Molecular Physical Chemistry, Heinrich Heine University Düsseldorf, 40225 Düsseldorf, Germany
| | - Jeffrey C Hoch
- Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT 06030, USA
| | - Gerhard Hummer
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany; Institute for Biophysics, Goethe University Frankfurt, 60438 Frankfurt am Main, Germany
| | - Genji Kurisu
- Protein Data Bank Japan (PDBj), Institute for Protein Research, Osaka University, Osaka 565-0871, Japan
| | - Catherine L Lawson
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Alexander Leitner
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, 8093 Zurich, Switzerland
| | - John L Markley
- BioMagResBank (BMRB), Biochemistry Department, University of Wisconsin-Madison, Madison, WI 53706, USA
| | - Jens Meiler
- Center for Structural Biology, Vanderbilt University, 465 21st Avenue South, Nashville, TN 37221, USA
| | - Gaetano T Montelione
- Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Biochemistry, Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytech Institute, Troy, NY 12180, USA
| | - George N Phillips
- BioSciences at Rice and Department of Chemistry, Rice University, Houston, TX 77251, USA
| | - Thomas Prisner
- Institute of Physical and Theoretical Chemistry and Center of Biomolecular Magnetic Resonance, Goethe University Frankfurt, 60438 Frankfurt am Main, Germany
| | - Juri Rappsilber
- Wellcome Trust Centre for Cell Biology, Edinburgh EH9 3JR, Scotland
| | - David C Schriemer
- Department of Biochemistry & Molecular Biology, Robson DNA Science Centre, University of Calgary, Calgary, AB T2N 4N1, Canada
| | - Torsten Schwede
- Swiss Institute of Bioinformatics and Biozentrum, University of Basel, 4056 Basel, Switzerland
| | - Claus A M Seidel
- Molecular Physical Chemistry, Heinrich Heine University Düsseldorf, 40225 Düsseldorf, Germany
| | | | - Dmitri I Svergun
- European Molecular Biology Laboratory (EMBL), Hamburg Outstation, Notkestrasse 85, 22607 Hamburg, Germany
| | - Emad Tajkhorshid
- Department of Biochemistry, NIH Center for Macromolecular Modeling and Bioinformatics, Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Jill Trewhella
- School of Life and Environmental Sciences, The University of Sydney, Sydney, NSW 2006, Australia; Department of Chemistry, University of Utah, Salt Lake City, UT 84112, USA
| | - Brinda Vallat
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Sameer Velankar
- Protein Data Bank in Europe (PDBe), European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire CB10 1SD, UK
| | - Geerten W Vuister
- Department of Molecular and Cell Biology, Leicester Institute of Structural and Chemical Biology, University of Leicester, Leicester LE1 9HN, UK
| | - Benjamin Webb
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - John D Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Kate L White
- Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA; Bridge Institute, Michelson Center, University of Southern California, Los Angeles, CA 90089, USA
| | - Andrej Sali
- Department of Pharmaceutical Chemistry, University of California, San Francisco, CA 94158, USA; Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA; California Institute for Quantitative Biosciences, University of California, San Francisco, San Francisco, CA 94158, USA.
| |
|
29
|
Haas J, Gumienny R, Barbato A, Ackermann F, Tauriello G, Bertoni M, Studer G, Smolinski A, Schwede T. Introducing "best single template" models as reference baseline for the Continuous Automated Model Evaluation (CAMEO). Proteins 2019; 87:1378-1387. [PMID: 31571280 DOI: 10.1002/prot.25815] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 07/07/2019] [Revised: 09/10/2019] [Accepted: 09/13/2019] [Indexed: 12/17/2022]
Abstract
Critical blind assessment of structure prediction techniques is crucial for the scientific community to establish the state of the art, identify bottlenecks, and guide future developments. In Critical Assessment of Techniques in Structure Prediction (CASP), human experts assess the performance of participating methods in relation to the difficulty of the prediction task in a biennial experiment on approximately 100 targets. Yet, the development of automated computational modeling methods requires more frequent evaluation cycles and larger sets of data. The "Continuous Automated Model EvaluatiOn (CAMEO)" platform complements CASP by conducting fully automated blind prediction evaluations based on the weekly pre-release of sequences of those structures that are going to be published in the next release of the Protein Data Bank (PDB). Each week, CAMEO publishes benchmarking results for predictions corresponding to a set of about 20 targets collected during a 4-day prediction window. CAMEO benchmarking data are generated consistently for all methods at the same point in time, enabling developers to cross-validate their methods' performance and to refer to their results in publications. Many successful participants of CASP have used CAMEO, either by directly benchmarking their methods within the system or by comparing their own performance to CAMEO reference data. CAMEO offers a variety of scores reflecting different aspects of structure modeling, for example, binding site accuracy, homo-oligomer interface quality, or accuracy of local model confidence estimates. By introducing the "bestSingleTemplate" method based on structure superpositions as a reference for the accuracy of 3D modeling predictions, CAMEO facilitates objective comparison of techniques and fosters the development of advanced methods.
Affiliation(s)
- Juergen Haas
- Computational Structural Biology, University of Basel, Switzerland
| | - Rafal Gumienny
- Computational Structural Biology, Swiss Institute of Bioinformatics, Switzerland
| | - Alessandro Barbato
- Computational Structural Biology, Universitat Basel Department Biozentrum, Switzerland
| | - Flavio Ackermann
- Computational Structural Biology, University of Basel, Switzerland
| | | | - Martino Bertoni
- Computational Structural Biology, Universitat Basel Department Biozentrum, Switzerland
| | - Gabriel Studer
- Computational Structural Biology, University of Basel, Switzerland
| | - Anna Smolinski
- Computational Structural Biology, University of Basel, Switzerland
| | - Torsten Schwede
- Computational Structural Biology, University of Basel, Switzerland
| |
|
30
|
Chen L, He J. A Histogram-based Outlier Profile for Atomic Structures Derived from Cryo-Electron Microscopy. ACM-BCB ... ... : THE ... ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND BIOMEDICINE. ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND BIOMEDICINE 2019; 2019:586-591. [PMID: 35838364 PMCID: PMC9279010 DOI: 10.1145/3307339.3343865] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
As more atomic structures are determined from cryo-electron microscopy (cryo-EM) density maps, validation of such structures is an important task. We report findings after analyzing the change of cryo-EM structures in a comparison between those released by December 2016 and those released between 2017 and 2019. The cryo-EM models created from density maps with resolution better than 6 Å were divided into six data sets. A histogram-based outlier score (HBOS) was implemented and validation reports were collected from the Protein Data Bank. The results suggest that the overall quality of EM structures released after December 2016 is better than that of structures released before 2017. The conformation qualities of most residue types might have been improved, except for Leucine, Phenylalanine, and Serine in high-resolution datasets (higher than 4 Å). We observe that structures solved from 0-4 Å resolution density maps have an HBOS profile almost identical to that of structures derived from density maps with 4-6 Å resolution.
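For readers unfamiliar with the histogram-based outlier score mentioned in this abstract, the general idea can be sketched as below. This is an illustrative sketch of the generic HBOS technique only, not the authors' implementation: the equal-width binning, the bin count, the function name `hbos_scores`, and the toy two-feature data are all assumptions made for the example.

```python
import math


def hbos_scores(rows, n_bins=10):
    """Return one histogram-based outlier score per row (higher = more outlying).

    Each feature gets its own equal-width histogram; a row's score is the sum,
    over features, of log(tallest_bin_count / count_of_the_row's_bin).
    """
    n_features = len(rows[0])
    scores = [0.0] * len(rows)
    for j in range(n_features):
        column = [row[j] for row in rows]
        lo, hi = min(column), max(column)
        # A constant feature has zero range; fall back to a single usable bin.
        width = (hi - lo) / n_bins or 1.0
        # Equal-width bin index per value; the maximum is clamped into the last bin.
        bins = [min(int((v - lo) / width), n_bins - 1) for v in column]
        counts = [0] * n_bins
        for b in bins:
            counts[b] += 1
        tallest = max(counts)
        for i, b in enumerate(bins):
            # Values in sparsely populated bins contribute large log terms.
            scores[i] += math.log(tallest / counts[b])
    return scores


# Hypothetical data: four similar rows and one row far from the rest.
toy_rows = [[1.0, 10.0], [1.1, 10.2], [0.9, 9.9], [1.0, 10.1], [5.0, 30.0]]
toy_scores = hbos_scores(toy_rows)
```

Because each feature is scored independently, the method is cheap to compute, at the cost of ignoring correlations between features.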
Affiliation(s)
| | - Jing He
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529
| |
|
31
|
Holt MC, Ho CS, Morano MI, Barrett SD, Stein AJ. Improved homology modeling of the human & rat EP4 prostanoid receptors. BMC Mol Cell Biol 2019; 20:37. [PMID: 31455205 PMCID: PMC6712885 DOI: 10.1186/s12860-019-0212-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2019] [Accepted: 07/11/2019] [Indexed: 12/02/2022] Open
Abstract
Background The EP4 prostanoid receptor is one of four GPCRs that mediate the diverse actions of prostaglandin E2 (PGE2). Novel selective EP4 receptor agonists would assist to further elucidate receptor sub-type function and promote development of therapeutics for bone healing, heart failure, and other receptor associated conditions. The rat EP4 (rEP4) receptor has been used as a surrogate for the human EP4 (hEP4) receptor in multiple SAR studies. To better understand the validity of this traditional approach, homology models were generated by threading for both receptors using the RaptorX server. These models were fit to an implicit membrane using the PPM server and OPM database with refinement of intra and extracellular loops by Prime (Schrödinger). To understand the interaction between the receptors and known agonists, induced-fit docking experiments were performed using Glide and Prime (Schrödinger), with both endogenous agonists and receptor sub-type selective, small-molecule agonists. The docking scores and observed interactions were compared with radioligand displacement experiments and receptor (rat & human) activation assays monitoring cAMP. Results Rank-ordering of in silico compound docking scores aligned well with in vitro activity assay EC50 and radioligand binding Ki. We observed variations between rat and human EP4 binding pockets that have implications in future small-molecule receptor-modulator design and SAR, specifically a S103G mutation within the rEP4 receptor. Additionally, these models helped identify key interactions between the EP4 receptor and ligands including PGE2 and several known sub-type selective agonists while serving as a marked improvement over the previously reported models. Conclusions This work has generated a set of novel homology models of the rEP4 and hEP4 receptors. The homology models provide an improvement upon the previously reported model, largely due to improved solvation. 
The hEP4 docking scores correlate best with the cAMP activation data, where both data sets rank order Rivenprost > CAY10684 > PGE1 ≈ PGE2 > 11-deoxy-PGE1 ≈ 11-deoxy-PGE2 > 8-aza-11-deoxy-PGE1. This rank ordering closely matches that of the rEP4 receptor as well. Species-specific differences were noted for the weak agonists Sulprostone and Misoprostol, which appear to dock more readily within the human receptor than within the rat receptor.
Affiliation(s)
- Melissa C Holt
  - Cayman Chemical Co, 1180 E. Ellsworth Rd, Ann Arbor, MI, 48108, USA
- Chi S Ho
  - Cayman Chemical Co, 1180 E. Ellsworth Rd, Ann Arbor, MI, 48108, USA
- M Inés Morano
  - Cayman Chemical Co, 1180 E. Ellsworth Rd, Ann Arbor, MI, 48108, USA
- Adam J Stein
  - Cayman Chemical Co, 1180 E. Ellsworth Rd, Ann Arbor, MI, 48108, USA

32
Waterhouse A, Bertoni M, Bienert S, Studer G, Tauriello G, Gumienny R, Heer FT, de Beer TAP, Rempfer C, Bordoli L, Lepore R, Schwede T. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res 2018; 46:W296-W303. [PMID: 29788355 PMCID: PMC6030848 DOI: 10.1093/nar/gky427] [Citation(s) in RCA: 6948] [Impact Index Per Article: 1389.6] [Received: 02/09/2018] [Accepted: 05/07/2018] [Indexed: 11/13/2022]
Abstract
Homology modelling has matured into an important technique in structural biology, significantly contributing to narrowing the gap between known protein sequences and experimentally determined structures. Fully automated workflows and servers simplify and streamline the homology modelling process, also allowing users without specific computational expertise to generate reliable protein models and have easy access to modelling results, their visualization and interpretation. Here, we present an update to the SWISS-MODEL server, which pioneered the field of automated modelling 25 years ago and has been continuously developed since. Recently, its functionality has been extended to the modelling of homo- and heteromeric complexes. Starting from the amino acid sequences of the interacting proteins, both the stoichiometry and the overall structure of the complex are inferred by homology modelling. Other major improvements include the implementation of a new modelling engine, ProMod3, and the introduction of a new local model quality estimation method, QMEANDisCo. SWISS-MODEL is freely available at https://swissmodel.expasy.org.
Affiliation(s)
- Andrew Waterhouse
  - Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- Martino Bertoni
  - Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- Stefan Bienert
  - Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- Gabriel Studer
  - Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- Gerardo Tauriello
  - Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- Rafal Gumienny
  - Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- Florian T Heer
  - Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- Tjaart A P de Beer
  - Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- Christine Rempfer
  - Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- Lorenza Bordoli
  - Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- Rosalba Lepore
  - Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- Torsten Schwede
  - Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland

33
Chen L, Baker B, Santos E, Sheep M, Daftarian D. A Visualization Tool for Cryo-EM Protein Validation with an Unsupervised Machine Learning Model in Chimera Platform. MEDICINES (BASEL, SWITZERLAND) 2019; 6:E86. [PMID: 31390767 PMCID: PMC6789601 DOI: 10.3390/medicines6030086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2019] [Revised: 07/31/2019] [Accepted: 08/02/2019] [Indexed: 11/22/2022]
Abstract
Background: Cryo-electron microscopy (cryo-EM) has become a major technique for protein structure determination. However, due to the low quality of cryo-EM density maps, many protein structures derived from cryo-EM contain outliers introduced during the modeling process. The current protein model validation system lacks identification features for cryo-EM proteins, which makes it insufficient for identifying outliers in them. Methods: This study introduces an efficient unsupervised outlier detection model for validating protein models built with the cryo-EM technique. The current model uses a high-resolution X-ray dataset (<1.5 Å) as the reference dataset. The distal block distance, side-chain length, phi, psi, and first chi angle of the residues in the reference dataset are collected and saved as a database of the histogram-based outlier score (HBOS). The HBOS value of the residues in target cryo-EM proteins can be read from this HBOS database. Results: Protein residues with an HBOS value greater than ten are labeled as outliers by default. Four datasets containing proteins derived from cryo-EM density maps were tested with this probabilistic anomaly detection model. Conclusions: Based on the proposed model, a visualization assistant tool was designed for Chimera, a protein visualization platform.
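The HBOS lookup described in the Methods can be sketched as follows. This is a minimal illustration and not the authors' implementation: the bin count, the floor applied to empty bins, the natural logarithm, and the names `build_histogram` and `hbos` are all assumptions made here, so scores from this sketch are not on the same scale as the default threshold of ten quoted above.

```python
import math


def _bin_index(value, lo, width, n_bins):
    """Map a value to a bin index; the upper edge falls into the last bin."""
    return min(int((value - lo) / width), n_bins - 1)


def build_histogram(values, n_bins=10):
    """Build one per-feature histogram, scaled so the tallest bin has height 1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid a zero width for constant features
    counts = [0] * n_bins
    for v in values:
        counts[_bin_index(v, lo, width, n_bins)] += 1
    peak = max(counts)
    return lo, hi, width, [c / peak for c in counts]


def hbos(sample, histograms, floor=1e-3):
    """Sum log(1 / height) over features; rare feature values raise the score.

    `floor` (an assumed value) replaces empty or out-of-range bins so that
    the logarithm stays finite.
    """
    score = 0.0
    for value, (lo, hi, width, heights) in zip(sample, histograms):
        if lo <= value <= hi:
            idx = _bin_index(value, lo, width, len(heights))
            height = max(heights[idx], floor)
        else:
            height = floor
        score += math.log(1.0 / height)
    return score


# Toy reference set with two features per residue (hypothetical phi/psi values).
reference = [[-60 + i % 5, -45 + i % 3] for i in range(100)]
histograms = [build_histogram(column) for column in zip(*reference)]

typical = hbos([-58, -44], histograms)   # values seen often in the reference
unusual = hbos([120, 170], histograms)   # values never seen in the reference
print(typical, unusual)
```

Because each feature contributes independently, the score can be read from precomputed histograms, which matches the idea of storing the reference statistics as a database and looking up target residues against it.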
Affiliation(s)
- Lin Chen
  - Department of Computer Science, Valdosta State University, Valdosta, GA 31693, USA
- Brandon Baker
  - Department of Natural Science, Elizabeth City State University, Elizabeth City, NC 27909, USA
- Eduardo Santos
  - Department of Natural Science, Elizabeth City State University, Elizabeth City, NC 27909, USA
- Michell Sheep
  - Department of Mathematics & Computer Science, Elizabeth City State University, Elizabeth City, NC 27909, USA
- Darius Daftarian
  - Department of Mathematics & Computer Science, Elizabeth City State University, Elizabeth City, NC 27909, USA

34
Sanyal T, Mittal J, Shell MS. A hybrid, bottom-up, structurally accurate, Gō-like coarse-grained protein model. J Chem Phys 2019; 151:044111. [PMID: 31370551 PMCID: PMC6663515 DOI: 10.1063/1.5108761] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 05/02/2019] [Accepted: 06/24/2019] [Indexed: 12/21/2022]
Abstract
Coarse-grained (CG) protein models in the structural biology literature have improved over the years from being simple tools to understand general folding and aggregation driving forces to capturing detailed structures achieved by actual folding sequences. Here, we ask whether such models can be developed systematically from recent advances in bottom-up coarse-graining methods without relying on bioinformatic data (e.g., protein data bank statistics). We use relative entropy coarse-graining to develop a hybrid CG but Gō-like CG peptide model, hypothesizing that the landscape of proteinlike folds is encoded by the backbone interactions, while the sidechain interactions define which of these structures globally minimizes the free energy in a unique native fold. To construct a model capable of capturing varied secondary structures, we use a new extended ensemble relative entropy method to coarse-grain based on multiple reference atomistic simulations of short polypeptides with varied α and β character. Subsequently, we assess the CG model as a putative protein backbone forcefield by combining it with sidechain interactions based on native contacts but not incorporating native distances explicitly, unlike standard Gō models. We test the model's ability to fold a range of proteins and find that it achieves high accuracy (∼2 Å root mean square deviation resolution for both short sequences and large globular proteins), suggesting the strong role that backbone conformational preferences play in defining the fold landscape. This model can be systematically extended to non-natural amino acids and nonprotein polymers and sets the stage for extensions to non-Gō models with sequence-specific sidechain interactions.
Affiliation(s)
- Tanmoy Sanyal
  - Department of Chemical Engineering, University of California Santa Barbara, Santa Barbara, California 93106, USA
- Jeetain Mittal
  - Department of Chemical and Biomolecular Engineering, Lehigh University, Bethlehem, Pennsylvania 18015, USA
- M. Scott Shell
  - Department of Chemical Engineering, University of California Santa Barbara, Santa Barbara, California 93106, USA

35
Studer G, Tauriello G, Bienert S, Waterhouse AM, Bertoni M, Bordoli L, Schwede T, Lepore R. Modeling of Protein Tertiary and Quaternary Structures Based on Evolutionary Information. Methods Mol Biol 2019; 1851:301-316. [PMID: 30298405 DOI: 10.1007/978-1-4939-8736-8_17] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/12/2022]
Abstract
Proteins are subject to evolutionary forces that shape their three-dimensional structure to meet specific functional demands. Knowledge of the structure of a protein is therefore instrumental in gaining information about the molecular basis of its function. However, experimental structure determination is inherently time consuming and expensive, making it impossible to keep pace with the explosion of sequence data deriving from genome-scale projects. As a consequence, computational structural modeling techniques have received much attention and established themselves as a valuable complement to experimental structural biology efforts. Among these, comparative modeling remains the method of choice to model the three-dimensional structure of a protein when homology to a protein of known structure can be detected. The general strategy consists of using experimentally determined structures of proteins as templates for the generation of three-dimensional models of related family members (targets) of which the structure is unknown. This chapter provides a description of the individual steps needed to obtain a comparative model using SWISS-MODEL, one of the most widely used automated servers for protein structure homology modeling.
Affiliation(s)
- Gabriel Studer
  - Biozentrum, University of Basel and SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Gerardo Tauriello
  - Biozentrum, University of Basel and SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Stefan Bienert
  - Biozentrum, University of Basel and SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Andrew Mark Waterhouse
  - Biozentrum, University of Basel and SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Martino Bertoni
  - Biozentrum, University of Basel and SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Lorenza Bordoli
  - Biozentrum, University of Basel and SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Torsten Schwede
  - Biozentrum, University of Basel and SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Rosalba Lepore
  - Biozentrum, University of Basel and SIB Swiss Institute of Bioinformatics, Basel, Switzerland

36
Mechanism of activating mutations and allosteric drug inhibition of the phosphatase SHP2. Nat Commun 2018; 9:4507. [PMID: 30375376 PMCID: PMC6207724 DOI: 10.1038/s41467-018-06814-w] [Citation(s) in RCA: 73] [Impact Index Per Article: 12.2] [Received: 03/13/2018] [Accepted: 09/20/2018] [Indexed: 01/01/2023]
Abstract
Protein tyrosine phosphatase SHP2 functions as a key regulator of cell cycle control, and activating mutations cause several cancers. Here, we dissect the energy landscape of wild-type SHP2 and the oncogenic mutation E76K. NMR spectroscopy and X-ray crystallography reveal that wild-type SHP2 exchanges between closed, inactive and open, active conformations. The E76K mutation shifts this equilibrium toward the open state. The previously unknown open conformation is characterized, including the active-site WPD loop in the inward and outward conformations. Binding of the allosteric inhibitor SHP099 to the E76K mutant, although much weaker, results in a structure identical to that of the wild-type complex. A conformational selection to the closed state reduces drug affinity which, combined with E76K’s much higher activity, demands significantly greater SHP099 concentrations to restore wild-type activity levels. The differences in structural ensembles and drug-binding kinetics of cancer-associated SHP2 forms may stimulate innovative ideas for developing more potent inhibitors for activated SHP2 mutants. The protein tyrosine phosphatase SHP2 is a key regulator of cell cycle control. Here the authors combine NMR measurements and X-ray crystallography and show that wild-type SHP2 dynamically exchanges between a closed inactive conformation and an open activated form and that the oncogenic E76K mutation shifts the equilibrium to the open state, which is reversed by binding of the allosteric inhibitor SHP099.
37
Role of solvent accessibility for aggregation-prone patches in protein folding. Sci Rep 2018; 8:12896. [PMID: 30150761 PMCID: PMC6110721 DOI: 10.1038/s41598-018-31289-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 02/05/2018] [Accepted: 08/15/2018] [Indexed: 11/21/2022]
Abstract
The arrangement of amino acids in a protein sequence encodes its native folding. However, the same arrangement in aggregation-prone regions may cause misfolding as a result of local environmental stress. Under normal physiological conditions, such regions congregate in the protein’s interior to avoid aggregation and attain the native fold. We have used solvent accessibility of aggregation patches (SAAPp) to determine the packing of aggregation-prone residues. Our results showed that SAAPp has low values for native crystal structures, consistent with protein folding as a mechanism to minimize the solvent accessibility of aggregation-prone residues. SAAPp also shows an average correlation of 0.76 with the global distance test (GDT) score on CASP12 template-based protein models. Using SAAPp scores and five structural features, a random forest machine learning quality assessment tool, SAAP-QA, showed 2.32 average GDT loss between best model predicted and actual best based on GDT score on independent CASP test data, with the ability to discriminate native-like folds having an AUC of 0.94. Overall, the Pearson correlation coefficient (PCC) between true and predicted GDT scores on independent CASP data was 0.86 while on the external CAMEO dataset, comprising high quality protein structures, PCC and average GDT loss were 0.71 and 4.46 respectively. SAAP-QA can be used to detect the quality of models and iteratively improve them to native or near-native structures.
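The two evaluation quantities quoted above, GDT loss and the Pearson correlation between true and predicted GDT scores, can be illustrated with a small sketch. The function names and the toy scores are hypothetical. The sketch assumes that GDT loss is the difference between the best true GDT in a pool of models and the true GDT of the model ranked first by the predictor, which is one natural reading of the sentence above rather than a definition confirmed by the source.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def gdt_loss(true_gdt, predicted_gdt):
    """True GDT of the actual best model minus true GDT of the top-ranked model."""
    picked = max(range(len(predicted_gdt)), key=predicted_gdt.__getitem__)
    return max(true_gdt) - true_gdt[picked]


# Toy pool of three models for one target (made-up numbers).
true_scores = [50.0, 72.5, 68.0]
predicted_scores = [55.0, 66.0, 70.0]

print(gdt_loss(true_scores, predicted_scores))  # predictor picks the third model
print(pearson(true_scores, predicted_scores))
```

A loss of zero means the predictor selected the actual best model for that target; averaging the loss over targets gives a figure comparable in kind to the averages reported in the abstract.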
38
Morris J, Na YJ, Zhu H, Lee JH, Giang H, Ulyanova AV, Baltuch GH, Brem S, Chen HI, Kung DK, Lucas TH, O'Rourke DM, Wolf JA, Grady MS, Sul JY, Kim J, Eberwine J. Pervasive within-Mitochondrion Single-Nucleotide Variant Heteroplasmy as Revealed by Single-Mitochondrion Sequencing. Cell Rep 2018; 21:2706-2713. [PMID: 29212019 DOI: 10.1016/j.celrep.2017.11.031] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 07/18/2017] [Revised: 10/05/2017] [Accepted: 11/08/2017] [Indexed: 11/18/2022]
Abstract
A number of mitochondrial diseases arise from single-nucleotide variant (SNV) accumulation in multiple mitochondria. Here, we present a method for identification of variants present at the single-mitochondrion level in individual mouse and human neuronal cells, allowing for extremely high-resolution study of mitochondrial mutation dynamics. We identified extensive heteroplasmy between individual mitochondria, along with three high-confidence variants in mouse and one in human that were present in multiple mitochondria across cells. The pattern of variation revealed by single-mitochondrion data shows surprisingly pervasive levels of heteroplasmy in inbred mice. Distribution of SNV loci suggests inheritance of variants across generations, resulting in Poisson jackpot lines with large SNV load. Comparison of human and mouse variants suggests that the two species might employ distinct modes of somatic segregation. Single-mitochondrion resolution revealed mitochondrial mutational dynamics that we hypothesize to affect risk probabilities for mutations reaching disease thresholds.
Affiliation(s)
- Jacqueline Morris
  - Department of Pharmacology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Young-Ji Na
  - Department of Biology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA 19104, USA
- Hua Zhu
  - Department of Pharmacology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Jae-Hee Lee
  - Department of Pharmacology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Hoa Giang
  - Department of Biology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA 19104, USA
- Alexandra V Ulyanova
  - Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Gordon H Baltuch
  - Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Steven Brem
  - Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- H Isaac Chen
  - Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- David K Kung
  - Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Timothy H Lucas
  - Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Donald M O'Rourke
  - Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- John A Wolf
  - Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- M Sean Grady
  - Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Jai-Yoon Sul
  - Department of Pharmacology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Junhyong Kim
  - Department of Biology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA 19104, USA
- James Eberwine
  - Department of Pharmacology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA

39
Waterhouse A, Bertoni M, Bienert S, Studer G, Tauriello G, Gumienny R, Heer FT, de Beer TAP, Rempfer C, Bordoli L, Lepore R, Schwede T. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res 2018. [PMID: 29788355 DOI: 10.1093/nar/gky427] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 04/25/2023]
Abstract
Homology modelling has matured into an important technique in structural biology, significantly contributing to narrowing the gap between known protein sequences and experimentally determined structures. Fully automated workflows and servers simplify and streamline the homology modelling process, also allowing users without specific computational expertise to generate reliable protein models and have easy access to modelling results, their visualization and interpretation. Here, we present an update to the SWISS-MODEL server, which pioneered the field of automated modelling 25 years ago and has been continuously developed since. Recently, its functionality has been extended to the modelling of homo- and heteromeric complexes. Starting from the amino acid sequences of the interacting proteins, both the stoichiometry and the overall structure of the complex are inferred by homology modelling. Other major improvements include the implementation of a new modelling engine, ProMod3, and the introduction of a new local model quality estimation method, QMEANDisCo. SWISS-MODEL is freely available at https://swissmodel.expasy.org.
Affiliation(s)
- Andrew Waterhouse
  - Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- Martino Bertoni
  - Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- Stefan Bienert
  - Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- Gabriel Studer
  - Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- Gerardo Tauriello
  - Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- Rafal Gumienny
  - Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- Florian T Heer
  - Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- Tjaart A P de Beer
  - Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- Christine Rempfer
  - Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- Lorenza Bordoli
  - Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- Rosalba Lepore
  - Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
- Torsten Schwede
  - Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH-4056 Basel, Switzerland

40
Assessing Exhaustiveness of Stochastic Sampling for Integrative Modeling of Macromolecular Structures. Biophys J 2018; 113:2344-2353. [PMID: 29211988 DOI: 10.1016/j.bpj.2017.10.005] [Citation(s) in RCA: 53] [Impact Index Per Article: 8.8] [Received: 06/15/2017] [Revised: 09/22/2017] [Accepted: 10/02/2017] [Indexed: 12/22/2022]
Abstract
Modeling of macromolecular structures involves structural sampling guided by a scoring function, resulting in an ensemble of good-scoring models. By necessity, the sampling is often stochastic, and must be exhaustive at a precision sufficient for accurate modeling and assessment of model uncertainty. Therefore, the very first step in analyzing the ensemble is an estimation of the highest precision at which the sampling is exhaustive. Here, we present an objective and automated method for this task. As a proxy for sampling exhaustiveness, we evaluate whether two independently and stochastically generated sets of models are sufficiently similar. The protocol includes testing 1) convergence of the model score, 2) whether model scores for the two samples were drawn from the same parent distribution, 3) whether each structural cluster includes models from each sample proportionally to its size, and 4) whether there is sufficient structural similarity between the two model samples in each cluster. The evaluation also provides the sampling precision, defined as the smallest clustering threshold that satisfies the third, most stringent test. We validate the protocol with the aid of enumerated good-scoring models for five illustrative cases of binary protein complexes. Passing the proposed four tests is necessary, but not sufficient for thorough sampling. The protocol is general in nature and can be applied to the stochastic sampling of any set of models, not just structural models. In addition, the tests can be used to stop stochastic sampling as soon as exhaustiveness at desired precision is reached, thereby improving sampling efficiency; they may also help in selecting a model representation that is sufficiently detailed to be informative, yet also sufficiently coarse for sampling to be exhaustive.
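Test 2 of the protocol asks whether the scores of the two independent samples were drawn from the same parent distribution. One common way to quantify this is the two-sample Kolmogorov-Smirnov statistic. The abstract does not name the test, so its use here, along with the function name `ks_statistic` and the toy scores, is an assumption made for illustration only.

```python
def ks_statistic(sample_a, sample_b):
    """Largest gap between the empirical CDFs of two score samples.

    0.0 means the empirical distributions coincide; 1.0 means the samples
    do not overlap at all.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    largest_gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both cursors past every value <= x so ties are handled together.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        largest_gap = max(largest_gap, abs(i / len(a) - j / len(b)))
    return largest_gap


# Toy model scores from two independent sampling runs (made-up numbers).
run_1 = [10.2, 11.0, 11.4, 12.1, 12.9]
run_2 = [10.4, 10.9, 11.6, 12.0, 13.1]
print(ks_statistic(run_1, run_2))
```

A small statistic is consistent with both runs sampling the same score distribution. Turning it into a pass/fail decision needs a significance threshold, which this sketch leaves out, and passing it would still be only one of the four tests described above.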
41
Haas J, Barbato A, Behringer D, Studer G, Roth S, Bertoni M, Mostaguir K, Gumienny R, Schwede T. Continuous Automated Model EvaluatiOn (CAMEO) complementing the critical assessment of structure prediction in CASP12. Proteins 2017; 86 Suppl 1:387-398. [PMID: 29178137 DOI: 10.1002/prot.25431] [Citation(s) in RCA: 88] [Impact Index Per Article: 12.6] [Received: 09/04/2017] [Revised: 11/10/2017] [Accepted: 11/22/2017] [Indexed: 12/22/2022]
Abstract
Every second year, the community experiment "Critical Assessment of Techniques for Structure Prediction" (CASP) conducts an independent blind assessment of structure prediction methods, providing a framework for comparing the performance of different approaches and discussing the latest developments in the field. Yet, developers of automated computational modeling methods clearly benefit from more frequent evaluations based on larger sets of data. The "Continuous Automated Model EvaluatiOn (CAMEO)" platform complements the CASP experiment by conducting fully automated blind prediction assessments based on the weekly pre-release of sequences of those structures that are going to be published in the next release of the Protein Data Bank (PDB). CAMEO publishes weekly benchmarking results based on models collected during a 4-day prediction window, on average assessing ca. 100 targets during a time frame of 5 weeks. CAMEO benchmarking data are generated consistently for all participating methods at the same point in time, enabling developers to benchmark and cross-validate their method's performance, and directly refer to the benchmarking results in publications. In order to facilitate server development and promote shorter release cycles, CAMEO sends weekly emails with submission statistics and low-performance warnings. Many participants of CASP have successfully employed CAMEO when preparing their methods for upcoming community experiments. CAMEO offers a variety of scores to allow benchmarking diverse aspects of structure prediction methods. By introducing new scoring schemes, CAMEO facilitates new development in areas of active research, for example, modeling quaternary structure, complexes, or ligand binding sites.
Affiliation(s)
- Jürgen Haas
  - Biozentrum, University of Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Computational Structural Biology, Basel, Switzerland
- Alessandro Barbato
  - Biozentrum, University of Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Computational Structural Biology, Basel, Switzerland
- Dario Behringer
  - Biozentrum, University of Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Computational Structural Biology, Basel, Switzerland
- Gabriel Studer
  - Biozentrum, University of Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Computational Structural Biology, Basel, Switzerland
- Steven Roth
  - Biozentrum, University of Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Computational Structural Biology, Basel, Switzerland
- Martino Bertoni
  - Biozentrum, University of Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Computational Structural Biology, Basel, Switzerland
- Khaled Mostaguir
  - Biozentrum, University of Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Computational Structural Biology, Basel, Switzerland
- Rafal Gumienny
  - Biozentrum, University of Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Computational Structural Biology, Basel, Switzerland
- Torsten Schwede
  - Biozentrum, University of Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Computational Structural Biology, Basel, Switzerland
42
Wang T, Cook I, Leyh TS. The NSAID allosteric site of human cytosolic sulfotransferases. J Biol Chem 2017; 292:20305-20312. [PMID: 29038294 DOI: 10.1074/jbc.m117.817387] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 09/12/2017] [Revised: 10/04/2017] [Indexed: 11/06/2022]
Abstract
Non-steroidal anti-inflammatory drugs (NSAIDs) are among the most commonly prescribed drugs worldwide; more than 111 million prescriptions were written in the United States in 2014. NSAIDs allosterically inhibit cytosolic sulfotransferases (SULTs) with high specificity and therapeutically relevant affinities. This study focuses on the interactions of SULT1A1 and mefenamic acid (MEF), a potent, highly specific NSAID inhibitor of SULT1A1. Here, the first structure of an NSAID allosteric site, the MEF-binding site of SULT1A1, is determined using spin-label triangulation NMR. The structure is confirmed by site-directed mutagenesis and provides a molecular framework for understanding NSAID binding and isoform specificity. The mechanism of NSAID inhibition is explored using molecular dynamics and equilibrium and pre-steady-state ligand-binding studies. MEF inhibits SULT1A1 turnover through an indirect (helix-mediated) stabilization of the closed form of the active-site cap of the enzyme, which traps the nucleotide and slows its release. Using the NSAID-binding site structure of SULT1A1 as a comparative model, it appears that 11 of the 13 human SULT isoforms harbor an NSAID-binding site. We hypothesize that these sites evolved to enable SULT isoforms to respond to metabolites that lie within their metabolic domains. Finally, the NSAID-binding site structure offers a template for developing isozyme-specific allosteric inhibitors that can be used to regulate specific areas of sulfuryl-transfer metabolism.
Affiliation(s)
- Ting Wang
  - Department of Microbiology and Immunology, Albert Einstein College of Medicine, Bronx, New York 10461-1926
- Ian Cook
  - Department of Microbiology and Immunology, Albert Einstein College of Medicine, Bronx, New York 10461-1926
- Thomas S Leyh
  - Department of Microbiology and Immunology, Albert Einstein College of Medicine, Bronx, New York 10461-1926
43
Solís-Calero C, Carvalho HF. KLK14 interactions with HAI-1 and HAI-2 serine protease inhibitors: A molecular dynamics and relative free-energy calculations study. Cell Biol Int 2017; 41:1246-1264. [PMID: 28817220 DOI: 10.1002/cbin.10839] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 01/25/2017] [Accepted: 08/12/2017] [Indexed: 01/13/2023]
Abstract
Kallikrein 14 (KLK14) is a serine protease linked to several pathologies including prostate cancer and positively correlates with Gleason score. Though KLK14 functioning in cancer is poorly understood, it has been implicated in HGF/Met signaling, given that KLK14 proteolytically inhibits HGF activator-inhibitor 1 (HAI-1), which strongly inhibits pro-HGF activators, thereby contributing to tumor progression. In this work, KLK14 binding to either hepatocyte growth factor activator inhibitor type-1 (HAI-1) or type-2 (HAI-2) was assayed using homology modeling, molecular dynamics simulations and free-energy calculations through MM/PBSA and MM/GBSA. KLK14 was successfully modeled. Calculated free energies suggested higher binding affinity for the KLK14/HAI-1 interaction than for KLK14/HAI-2. This difference in binding affinity is largely explained by the higher stability of the hydrogen-bond networks in KLK14/HAI-1 along the simulation trajectory. A key arginine residue in both HAI-1 and HAI-2 is responsible for their interaction with the S1 pocket in KLK14. Additionally, MM/GBSA free-energy decomposition suggests that KLK14 Asp174 and Trp196 are hotspots for binding HAI-1 and HAI-2.
Affiliation(s)
- Christian Solís-Calero
  - Department of Structural and Functional Biology, State University of Campinas, Campinas, São Paulo, Brazil
- Hernandes F Carvalho
  - Department of Structural and Functional Biology, State University of Campinas, Campinas, São Paulo, Brazil
44
Antúnez-Argüelles E, Rojo-Domínguez A, Arregui-Mena AL, Jacobo-Albavera L, Márquez MF, Iturralde-Torres P, Villarreal-Molina MT. Compound heterozygous KCNQ1 mutations (A300T/P535T) in a child with sudden unexplained death: Insights into possible molecular mechanisms based on protein modeling. Gene 2017; 627:40-48. [PMID: 28600177 DOI: 10.1016/j.gene.2017.06.011] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 04/06/2017] [Revised: 05/18/2017] [Accepted: 06/05/2017] [Indexed: 11/26/2022]
Abstract
Sudden death in a child is a devastating event with important medical implications for surviving relatives. Because it may be the first manifestation of unknown inherited cardiac disease, molecular autopsy can be helpful to determine the cause of death and identify at-risk family members. The aim of the study was to perform a molecular autopsy in a seven-year-old girl with sudden unexplained death, to find evidence supporting the possible pathogenicity of mutations identified in inherited cardiac disease genes, and to clinically and genetically assess first-degree relatives. DNA from the index case was extracted from umbilical cord cells stored at birth, and DNA of first-degree relatives from blood samples. Targeted sequencing was performed using a Haloplex design including 81 cardiogenes. Possible functional consequences of the mutations were analyzed using protein modeling and structural mobility analyses. The child was compound heterozygous for KCNQ1 variants p.Ala300Thr and p.Pro535Thr. Ala300Thr is known to cause long QT syndrome in the homozygous state, while Pro535Thr is novel and of unknown clinical significance. The father and sibling were Ala300Thr heterozygous, and had normal QTc intervals at rest and during exercise. The asymptomatic mother was heterozygous for Pro535Thr, and showed borderline QTc at rest, but prolonged QTc during exercise. Protein modeling predicted that Ala300Thr alters the mobility profile of the Kv7.1 tetramer and Thr535 disrupts a calmodulin-binding site, probably causing co-assembly or trafficking defects of the mutant monomer. Altogether, the evidence strongly suggests that this child was affected with a recessive form of Romano-Ward syndrome.
Affiliation(s)
- Erika Antúnez-Argüelles
  - Laboratorio de Genómica de Enfermedades Cardiovasculares, Instituto Nacional de Medicina Genómica, Mexico
- Arturo Rojo-Domínguez
  - Departamento de Ciencias Naturales, Universidad Autónoma Metropolitana Unidad Cuajimalpa, Mexico
- Ana Leticia Arregui-Mena
  - Departamento de Ciencias Naturales, Universidad Autónoma Metropolitana Unidad Cuajimalpa, Mexico
- Leonor Jacobo-Albavera
  - Laboratorio de Genómica de Enfermedades Cardiovasculares, Instituto Nacional de Medicina Genómica, Mexico
- Manlio Fabio Márquez
  - Departamento de Electrofisiología, Instituto Nacional de Cardiología "Ignacio Chávez", Mexico
- Pedro Iturralde-Torres
  - Departamento de Electrofisiología, Instituto Nacional de Cardiología "Ignacio Chávez", Mexico
45
Dakal TC, Kumar R, Ramotar D. Structural modeling of human organic cation transporters. Comput Biol Chem 2017; 68:153-163. [PMID: 28343125 DOI: 10.1016/j.compbiolchem.2017.03.007] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 09/16/2016] [Revised: 02/01/2017] [Accepted: 03/11/2017] [Indexed: 12/12/2022]
Abstract
Human organic cation transporters (hOCTs) belong to the solute carrier 22 (SLC22) family of membrane proteins, which play a central role in the transport of chemotherapeutic drugs for several clinical and pathological conditions, including cancer and diabetes. These transporters mediate drug transport; however, the precise mechanism of drug binding and transport is not yet fully understood, partly because no crystal structure is available. In this work, we used a multi-phase approach to compute 3D structural models of seven hOCTs starting from the primary protein sequence. Our structure modeling approach included 1) I-TASSER-based comparative sequence alignment, threading and ab initio protein modeling; 2) comparison of the models with PSIPRED secondary structure predictions; 3) loop modeling for incongruent secondary structure in Chimera 1.10.1; 4) high-resolution structure simulation, refinement, and energy minimization using ModRefiner; and 5) validation of the structure models using PROCHECK at SAVEs. Structurally, the computed 3D models of hOCTs show a typical major facilitator superfamily (MFS) fold of twelve α-helical transmembrane domains arranged so as to give hOCTs a barrel-shaped structure with a large cleft that opens to the cytoplasm. The modeled 3D structures of all hOCTs closely resemble the human SLC2A3 (GLUT3) transporter (PDB ID: 5c65) and display an outward-open conformation and putative cyclic C1 protein symmetry. In addition, hOCTs have a large (>100 amino acids), unique extracellular loop between TMH1 and TMH2 with potential glycosylation sites (Asn-Xaa-Ser/Thr) and cysteine residues, both features indicative of a putative role in drug binding and uptake. There is an intracellular three/four-helix loop between TMH6 and TMH7 containing putative phosphorylation sites for precise regulation of hOCT function as drug transporters. There are nine loops of 4 to 11 amino acids in length that protrude from the membrane, both intracellularly and extracellularly, and connect adjacent TMHs. The 2D structure prediction showed an Nin-Cin topology for all hOCTs. In the absence of crystal structures of hOCTs, the 3D structural models computed in silico and presented herein can be used for studying the mechanism of drug binding and transport by hOCTs.
Affiliation(s)
- Tikam Chand Dakal
  - Maisonneuve-Rosemont Hospital, Research Center, Université de Montréal, Department of Medicine, 5415 Boul. de L'Assomption, Montréal, Québec H1T 2M4, Canada
- Rajender Kumar
  - Architecture et Fonction des Macromolécules Biologiques (AFMB), Campus de Luminy, Aix-Marseille Université, Marseille, France; Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research (NIPER), Sector 67, S.A.S. Nagar, 160 062, Punjab, India
- Dindial Ramotar
  - Maisonneuve-Rosemont Hospital, Research Center, Université de Montréal, Department of Medicine, 5415 Boul. de L'Assomption, Montréal, Québec H1T 2M4, Canada
46
Kozakov D, Hall DR, Xia B, Porter KA, Padhorny D, Yueh C, Beglov D, Vajda S. The ClusPro web server for protein-protein docking. Nat Protoc 2017. [PMID: 28079879 DOI: 10.1038/nprot.2016.169] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/12/2023]
Abstract
The ClusPro server (https://cluspro.org) is a widely used tool for protein-protein docking. The server provides a simple home page for basic use, requiring only two files in Protein Data Bank (PDB) format. However, ClusPro also offers a number of advanced options to modify the search; these include the removal of unstructured protein regions, application of attraction or repulsion, accounting for pairwise distance restraints, construction of homo-multimers, consideration of small-angle X-ray scattering (SAXS) data, and location of heparin-binding sites. Six different energy functions can be used, depending on the type of protein. Docking with each energy parameter set results in ten models defined by centers of highly populated clusters of low-energy docked structures. This protocol describes the use of the various options, the construction of auxiliary restraints files, the selection of the energy parameters, and the analysis of the results. Although the server is heavily used, runs are generally completed in <4 h.
Affiliation(s)
- Dima Kozakov
  - Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA
  - Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, New York, USA
- Bing Xia
  - Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
- Kathryn A Porter
  - Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
- Dzmitry Padhorny
  - Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA
- Christine Yueh
  - Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
- Dmitri Beglov
  - Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
- Sandor Vajda
  - Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
48
Ingale AG. Prediction of Structural and Functional Aspects of Protein. PHARMACEUTICAL SCIENCES 2017. [DOI: 10.4018/978-1-5225-1762-7.ch021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
Predicting the structure of a protein from its primary amino acid sequence is computationally difficult. An investigation of the methods and algorithms used to predict protein structure and a thorough knowledge of the function and structure of proteins are critical for the advancement of biology and the life sciences as well as the development of better drugs, higher-yield crops, and even synthetic bio-fuels. To that end, this chapter sheds light on the methods used for protein structure prediction. This chapter covers the applications of modeled protein structures and unravels the relationship between pure sequence information and three-dimensional structure, which continues to be one of the greatest challenges in molecular biology. With this resource, it presents an all-encompassing examination of the problems, methods, tools, servers, databases, and applications of protein structure prediction, giving unique insight into the future applications of the modeled protein structures. In this chapter, current protein structure prediction methods are reviewed for a milieu on structure prediction, the prediction of structural fundamentals, tertiary structure prediction, and functional imminent. The basic ideas and advances of these directions are discussed in detail.
49
Bohnuud T, Luo L, Wodak SJ, Vajda S, Bonvin AM, Weng Z, Schueler-Furman O, Kozakov D. A benchmark testing ground for integrating homology modeling and protein docking. Proteins 2017; 85:10-16. [PMID: 27172383 PMCID: PMC5817996 DOI: 10.1002/prot.25063] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 07/10/2015] [Accepted: 05/08/2016] [Indexed: 12/20/2022]
Abstract
Protein docking procedures carry out the task of predicting the structure of a protein-protein complex starting from the known structures of the individual protein components. More often than not, however, the structure of one or both components is not known, but can be derived by homology modeling on the basis of known structures of related proteins deposited in the Protein Data Bank (PDB). Thus, the problem is to develop methods that optimally integrate homology modeling and docking with the goal of predicting the structure of a complex directly from the amino acid sequences of its component proteins. One possibility is to use the best available homology modeling and docking methods. However, the models built for the individual subunits often differ to a significant degree from the bound conformation in the complex, often much more so than the differences observed between free and bound structures of the same protein, and therefore additional conformational adjustments, both at the backbone and side chain levels need to be modeled to achieve an accurate docking prediction. In particular, even homology models of overall good accuracy frequently include localized errors that unfavorably impact docking results. The predicted reliability of the different regions in the model can also serve as a useful input for the docking calculations. Here we present a benchmark dataset that should help to explore and solve combined modeling and docking problems. This dataset comprises a subset of the experimentally solved 'target' complexes from the widely used Docking Benchmark from the Weng Lab (excluding antibody-antigen complexes). This subset is extended to include the structures from the PDB related to those of the individual components of each complex, and hence represent potential templates for investigating and benchmarking integrated homology modeling and docking approaches. 
Template sets can be dynamically customized by specifying ranges in sequence similarity and in PDB release dates, or using other filtering options, such as excluding sets of specific structures from the template list. Multiple sequence alignments, as well as structural alignments of the templates to their corresponding subunits in the target are also provided. The resource is accessible online or can be downloaded at http://cluspro.org/benchmark, and is updated on a weekly basis in synchrony with new PDB releases. Proteins 2016; 85:10-16. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Tanggis Bohnuud
  - Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
- Lingqi Luo
  - Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
- Shoshana J. Wodak
  - VIB Structural Biology Research Center, VUB Pleinlaan 2, 1050 Brussels
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
  - Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada
- Sandor Vajda
  - Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
  - Department of Chemistry, Boston University, Boston, MA 02215, USA
- Alexandre M.J.J. Bonvin
  - Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, 3584CH, the Netherlands
- Zhiping Weng
  - Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, MA, United States
- Ora Schueler-Furman
  - Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel-Canada, Hadassah Medical School, Hebrew University, Jerusalem, Israel
- Dima Kozakov
  - Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
  - Department of Applied Mathematics and Statistics, Stony Brook University, NY, USA
50
Abstract
Genome sequencing projects have resulted in a rapid increase in the number of known protein sequences. In contrast, only about one-hundredth of these sequences have been characterized at atomic resolution using experimental structure determination methods. Computational protein structure modeling techniques have the potential to bridge this sequence-structure gap. In the following chapter, we present an example that illustrates the use of MODELLER to construct a comparative model for a protein with unknown structure. Automation of a similar protocol has resulted in models of useful accuracy for domains in more than half of all known protein sequences.
Affiliation(s)
- Benjamin Webb
  - Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, and California Institute for Quantitative Biosciences (QB3), University of California San Francisco, San Francisco, CA, 94143, USA
- Andrej Sali
  - Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, and California Institute for Quantitative Biosciences (QB3), University of California San Francisco, San Francisco, CA, 94143, USA