1
|
Lin J, Yin Y, Cao J, Zou B, Han K, Chen Y, Li S, Huang C, Chen J, Lv Y, Xu S, Xie D, Wang F. Nudix Hydrolase 13 Impairs the Initiation of Colorectal Cancer by Inhibiting PKM1 ADP-Ribosylation. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2025; 12:e2410058. [PMID: 39921866 PMCID: PMC11967829 DOI: 10.1002/advs.202410058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2024] [Revised: 01/06/2025] [Indexed: 02/10/2025]
Abstract
Metabolic dysregulation has been implicated as a key factor in colorectal cancer (CRC) initiation; however, the underlying driving forces and mechanisms remain poorly understood. Herein, transcriptome profiling of paired early-stage CRCs and adenomas identifies Nudix hydrolase 13 (NUDT13) as a critical suppressor. Elevated NUDT13 expression impedes the proliferation of CRC cells under hypoxic conditions and markedly inhibits CRC initiation by upregulating PKM1. Mechanistically, NUDT13 directly binds and stabilizes PKM1 protein by reducing its poly ADP-ribosylation (PARylation), which is catalyzed by PARP1 at E275/D281/E282/E285/D296, thereby inducing an oxidative phosphorylation (OXPHOS) phenotype in CRC cells. Moreover, spatiotemporal knockout of Nudt13 enhances intestinal tumorigenesis in mice, which can be significantly suppressed by the PARP1 inhibitor Olaparib. Notably, residues E245/E248/E249 within the Nudix box motif of NUDT13 are essential for PKM1 PARylation, and a mimic peptide derived from this motif is sufficient to stabilize PKM1 protein and robustly inhibit CRC tumorigenesis. Collectively, this study reveals a previously unknown PARylation-dependent mechanism that regulates PKM1 protein stability and switches the metabolic pathway of CRC cells, providing a promising target for CRC treatment.
Affiliation(s)
- Jinlong Lin
- State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat‐sen University Cancer Center, Guangzhou 510060, P. R. China
- Department of Thoracic Surgery, Sun Yat‐sen University Cancer Center, Guangzhou 510060, P. R. China
| | - Yixin Yin
- State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat‐sen University Cancer Center, Guangzhou 510060, P. R. China
- Department of Anesthesiology, Sun Yat‐sen University Cancer Center, Guangzhou 510060, P. R. China
| | - Jinghua Cao
- State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat‐sen University Cancer Center, Guangzhou 510060, P. R. China
| | - Bingxu Zou
- State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat‐sen University Cancer Center, Guangzhou 510060, P. R. China
| | - Kai Han
- Department of Colorectal Surgery, Sun Yat‐sen University Cancer Center, Guangzhou 510060, P. R. China
| | - Yufan Chen
- Department of Endoscopy, Sun Yat‐sen University Cancer Center, Guangzhou 510060, P. R. China
| | - Siyu Li
- State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat‐sen University Cancer Center, Guangzhou 510060, P. R. China
| | - Cijun Huang
- State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat‐sen University Cancer Center, Guangzhou 510060, P. R. China
| | - Jiewei Chen
- Department of Pathology, Sun Yat‐sen University Cancer Center, Guangzhou 510060, P. R. China
| | - Yongrui Lv
- State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat‐sen University Cancer Center, Guangzhou 510060, P. R. China
| | - Shuidan Xu
- State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat‐sen University Cancer Center, Guangzhou 510060, P. R. China
| | - Dan Xie
- State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat‐sen University Cancer Center, Guangzhou 510060, P. R. China
- Department of Pathology, Sun Yat‐sen University Cancer Center, Guangzhou 510060, P. R. China
| | - Fengwei Wang
- State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat‐sen University Cancer Center, Guangzhou 510060, P. R. China
|
2
|
Notari E, Wood CW, Michel J. Assessment of the Topology and Oligomerisation States of Coiled Coils Using Metadynamics with Conformational Restraints. J Chem Theory Comput 2025; 21:3260-3276. [PMID: 40042175 PMCID: PMC11948332 DOI: 10.1021/acs.jctc.4c01695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2024] [Revised: 02/04/2025] [Accepted: 02/17/2025] [Indexed: 03/26/2025]
Abstract
Coiled-coil proteins provide an excellent scaffold for multistate de novo protein design due to their established sequence-to-structure relationships and ability to switch conformations in response to external stimuli, such as changes in pH or temperature. However, the computational design of multistate coiled-coil protein assemblies is challenging, as it requires accurate estimates of the free energy differences between multiple alternative coiled-coil conformations. Here, we demonstrate how this challenge can be tackled using metadynamics simulations with orientational, positional and conformational restraints. We show that, even for subtle sequence variations, our protocol can predict the preferred topology of coiled-coil dimers and trimers, the preferred oligomerization states of coiled-coil dimers, trimers, and tetramers, as well as the switching behavior of a pH-dependent multistate system. Our approach provides a method for predicting the stability of coiled-coil designs and offers a new framework for computing binding free energies in protein-protein and multiprotein complexes.
Affiliation(s)
- Evangelia Notari
- EaStCHEM School of Chemistry, University of Edinburgh, David Brewster Road, Edinburgh EH9 3FJ, U.K.
| | - Christopher W. Wood
- School of Biological Sciences, University of Edinburgh, Roger Land Building, Edinburgh EH9 3FF, U.K.
| | - Julien Michel
- EaStCHEM School of Chemistry, University of Edinburgh, David Brewster Road, Edinburgh EH9 3FJ, U.K.
|
3
|
Leung J, Qu L, Ye Q, Zhong Z. The immune duality of osteopontin and its therapeutic implications for kidney transplantation. Front Immunol 2025; 16:1520777. [PMID: 40093009 PMCID: PMC11906708 DOI: 10.3389/fimmu.2025.1520777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2024] [Accepted: 02/10/2025] [Indexed: 03/19/2025] Open
Abstract
Osteopontin (OPN) is a multifunctional glycoprotein with various structural domains that enable it to perform diverse functions in both physiological and pathological states. This review comprehensively examines OPN from multiple perspectives, including its protein structure, interactions with receptors, interactions with immune cells, and roles in kidney diseases and transplantation. This review explores the immunological duality of OPN and its significance and value as a biomarker and therapeutic target in kidney transplantation. In cancer, OPN typically promotes tumor evasion by suppressing the immune system. Conversely, in immune-related kidney diseases, particularly kidney transplantation, OPN activates the immune system by enhancing the migration and activation of immune cells, thereby exacerbating kidney damage. This immunological duality may stem from different OPN splice variants and the exposure, after cleavage, of different structural domains, which play distinct biological roles in cellular interactions. Additionally, OPN has a significant biological impact post-transplantation and on chronic kidney disease, highlighting its importance as a biomarker and potential therapeutic target. Future research should further explore the specific mechanisms of OPN in kidney transplantation to improve treatment strategies and enhance patient quality of life.
Affiliation(s)
- Junto Leung
- Zhongnan Hospital of Wuhan University, Institute of Hepatobiliary Diseases of Wuhan University, Transplant Center of Wuhan University, National Quality Control Center for Donated Organ Procurement, Hubei Key Laboratory of Medical Technology on Transplantation, Hubei Provincial Clinical Research Center for Natural Polymer Biological Liver, Wuhan, Hubei, China
| | - Lei Qu
- Zhongnan Hospital of Wuhan University, Institute of Hepatobiliary Diseases of Wuhan University, Transplant Center of Wuhan University, National Quality Control Center for Donated Organ Procurement, Hubei Key Laboratory of Medical Technology on Transplantation, Hubei Provincial Clinical Research Center for Natural Polymer Biological Liver, Wuhan, Hubei, China
| | - Qifa Ye
- Zhongnan Hospital of Wuhan University, Institute of Hepatobiliary Diseases of Wuhan University, Transplant Center of Wuhan University, National Quality Control Center for Donated Organ Procurement, Hubei Key Laboratory of Medical Technology on Transplantation, Hubei Provincial Clinical Research Center for Natural Polymer Biological Liver, Wuhan, Hubei, China
- The 3rd Xiangya Hospital of Central South University, NHC Key Laboratory of Translational Research on Transplantation Medicine, Changsha, China
| | - Zibiao Zhong
- Zhongnan Hospital of Wuhan University, Institute of Hepatobiliary Diseases of Wuhan University, Transplant Center of Wuhan University, National Quality Control Center for Donated Organ Procurement, Hubei Key Laboratory of Medical Technology on Transplantation, Hubei Provincial Clinical Research Center for Natural Polymer Biological Liver, Wuhan, Hubei, China
|
4
|
Popović ME, Stevanović M, Pantović Pavlović M. Biothermodynamics of Hemoglobin and Red Blood Cells: Analysis of Structure and Evolution of Hemoglobin and Red Blood Cells, Based on Molecular and Empirical Formulas, Biosynthesis Reactions, and Thermodynamic Properties of Formation and Biosynthesis. J Mol Evol 2024; 92:776-798. [PMID: 39516253 DOI: 10.1007/s00239-024-10205-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2024] [Accepted: 09/04/2024] [Indexed: 11/16/2024]
Abstract
Hemoglobin and red blood cells (erythrocytes) have been studied extensively from the perspective of life and biomedical sciences. However, no analysis of hemoglobin and red blood cells from the perspective of chemical thermodynamics has been reported in the literature. Such an analysis would provide insight into their structure and turnover from the aspect of biothermodynamics and bioenergetics. In this paper, a biothermodynamic analysis was made of hemoglobin and red blood cells. Molecular formulas, empirical formulas, biosynthesis reactions, and thermodynamic properties of formation and biosynthesis were determined for the alpha chain, beta chain, heme B, hemoglobin and red blood cells. Empirical formulas and thermodynamic properties of hemoglobin were compared to those of other biological macromolecules, which include proteins and nucleic acids. Moreover, the energetic requirements of biosynthesis of hemoglobin and red blood cells were analyzed. Based on this, a discussion was made of the specific structure of red blood cells (i.e., no nuclei or organelles) and its role as an evolutionary adaptation for more energetically efficient biosynthesis needed for the turnover of red blood cells.
Affiliation(s)
- Marko E Popović
- Institute of Chemistry, Technology and Metallurgy, University of Belgrade, Njegoševa 12, 11000, Belgrade, Serbia.
| | - Maja Stevanović
- Innovation Centre of the Faculty of Technology and Metallurgy, University of Belgrade, Karnegijeva 4, 11120, Belgrade, Serbia
| | - Marijana Pantović Pavlović
- Institute of Chemistry, Technology and Metallurgy, University of Belgrade, Njegoševa 12, 11000, Belgrade, Serbia
- Centre of Excellence in Chemistry and Environmental Engineering - ICTM, University of Belgrade, Belgrade, Serbia
|
5
|
Utgés JS, Barton GJ. Comparative evaluation of methods for the prediction of protein-ligand binding sites. J Cheminform 2024; 16:126. [PMID: 39529176 PMCID: PMC11552181 DOI: 10.1186/s13321-024-00923-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2024] [Accepted: 10/28/2024] [Indexed: 11/16/2024] Open
Abstract
The accurate identification of protein-ligand binding sites is of critical importance in understanding and modulating protein function. Accordingly, ligand binding site prediction has remained a research focus for over three decades with over 50 methods developed and a change of paradigm from geometry-based to machine learning. In this work, we collate 13 ligand binding site predictors, spanning 30 years, focusing on the latest machine learning-based methods such as VN-EGNN, IF-SitePred, GrASP, PUResNet, and DeepPocket and compare them to the established P2Rank, PRANK and fpocket and earlier methods like PocketFinder, Ligsite and Surfnet. We benchmark the methods against the human subset of our new curated reference dataset, LIGYSIS. LIGYSIS is a comprehensive protein-ligand complex dataset comprising 30,000 proteins with bound ligands which aggregates biologically relevant unique protein-ligand interfaces across biological units of multiple structures from the same protein. LIGYSIS is an improvement for testing methods over earlier datasets like sc-PDB, PDBbind, binding MOAD, COACH420 and HOLO4K which either include 1:1 protein-ligand complexes or consider asymmetric units. Re-scoring of fpocket predictions by PRANK and DeepPocket displays the highest recall (60%) whilst IF-SitePred presents the lowest recall (39%). We demonstrate the detrimental effect that redundant prediction of binding sites has on performance as well as the beneficial impact of stronger pocket scoring schemes, with improvements up to 14% in recall (IF-SitePred) and 30% in precision (Surfnet). Finally, we propose top-N+2 recall as the universal benchmark metric for ligand binding site prediction and urge authors to share not only the source code of their methods, but also of their benchmark.
Scientific contributions
This study conducts the largest benchmark of ligand binding site prediction methods to date, comparing 13 original methods and 15 variants using 10 informative metrics. The LIGYSIS dataset is introduced, which aggregates biologically relevant protein-ligand interfaces across multiple structures of the same protein. The study highlights the detrimental effect of redundant binding site prediction and demonstrates significant improvement in recall and precision through stronger scoring schemes. Finally, top-N+2 recall is proposed as a universal benchmark metric for ligand binding site prediction, with a recommendation for open-source sharing of both methods and benchmarks.
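As a rough illustration of the top-N+2 recall metric named in this abstract, the sketch below scores each protein's N observed sites against only its N+2 highest-ranked predictions. The abstract does not define the matching criterion, so everything concrete here is an assumption made for illustration: sites are represented by their centre coordinates, a site counts as recovered when a retained prediction lies within a distance threshold (12 Å is a placeholder value), and the function name `top_n_plus_2_recall` and its parameters are hypothetical rather than taken from the authors' benchmark code.

```python
from math import dist

def top_n_plus_2_recall(proteins, threshold=12.0, extra=2):
    """Fraction of observed sites recovered by the top N+extra predictions.

    proteins: list of (observed, predicted) pairs, where `observed` is a list
    of site centres (x, y, z) and `predicted` is a list of predicted centres
    already sorted from best to worst score.
    threshold: assumed centre-to-centre distance cutoff (placeholder value).
    """
    hits = 0
    total = 0
    for observed, predicted in proteins:
        # Keep only the N + extra best-ranked predictions for this protein.
        retained = predicted[: len(observed) + extra]
        for site in observed:
            total += 1
            if any(dist(site, p) <= threshold for p in retained):
                hits += 1
    return hits / total if total else 0.0

# Toy data (invented for illustration, not from the paper).
example = [
    # One observed site; the correct pocket is only ranked third.
    ([(0.0, 0.0, 0.0)], [(50.0, 0.0, 0.0), (40.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
    # Two observed sites; a single prediction that matches the first one.
    ([(0.0, 0.0, 0.0), (100.0, 0.0, 0.0)], [(0.0, 0.0, 1.0)]),
]

recall_n_plus_2 = top_n_plus_2_recall(example)       # 2 of 3 sites recovered
recall_n = top_n_plus_2_recall(example, extra=0)     # 1 of 3 sites recovered
print(recall_n_plus_2, recall_n)
```

In the toy data, allowing two extra predictions recovers the pocket that was ranked third for the first protein, which is the kind of tolerance to imperfect ranking that a top-N+2 cutoff provides over a strict top-N cutoff.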
Affiliation(s)
- Javier S Utgés
- Division of Computational Biology, School of Life Sciences, University of Dundee, Dow Street, Dundee, DD1 5EH, Scotland, UK
| | - Geoffrey J Barton
- Division of Computational Biology, School of Life Sciences, University of Dundee, Dow Street, Dundee, DD1 5EH, Scotland, UK.
|
6
|
Kwon S, Safer J, Nguyen DT, Hoksza D, May P, Arbesfeld JA, Rubin AF, Campbell AJ, Burgin A, Iqbal S. Genomics 2 Proteins portal: a resource and discovery tool for linking genetic screening outputs to protein sequences and structures. Nat Methods 2024; 21:1947-1957. [PMID: 39294369 PMCID: PMC11466821 DOI: 10.1038/s41592-024-02409-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Accepted: 08/09/2024] [Indexed: 09/20/2024]
Abstract
Recent advances in AI-based methods have revolutionized the field of structural biology. Concomitantly, high-throughput sequencing and functional genomics have generated genetic variants at an unprecedented scale. However, efficient tools and resources are needed to link disparate data types: to 'map' variants onto protein structures, to better understand how the variation causes disease, and thereby to design therapeutics. Here we present the Genomics 2 Proteins portal (https://g2p.broadinstitute.org/): a human proteome-wide resource that maps 20,076,998 genetic variants onto 42,413 protein sequences and 77,923 structures, with a comprehensive set of structural and functional features. Additionally, the Genomics 2 Proteins portal allows users to interactively upload protein residue-wise annotations (for example, variants and scores) as well as protein structures beyond those in databases, to establish the connection between genomics and proteins. The portal serves as an easy-to-use discovery tool for researchers and scientists to hypothesize the structure-function relationship between natural or synthetic variations and their molecular phenotypes.
Affiliation(s)
- Seulki Kwon
- Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Jordan Safer
- Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Duyen T Nguyen
- PATTERN, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - David Hoksza
- Department of Software Engineering, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
| | - Patrick May
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Jeremy A Arbesfeld
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
| | - Alan F Rubin
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia
- Department of Medical Biology, University of Melbourne, Parkville, Victoria, Australia
| | - Arthur J Campbell
- Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Alex Burgin
- Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Sumaiya Iqbal
- Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA.
- Cancer Data Sciences, Dana-Farber/Harvard Cancer Center, Boston, MA, USA.
|
7
|
Sotiropoulou AI, Hatzinikolaou DG, Chrysina ED. Structural studies of β-glucosidase from the thermophilic bacterium Caldicellulosiruptor saccharolyticus. Acta Crystallogr D Struct Biol 2024; 80:733-743. [PMID: 39361356 PMCID: PMC11448918 DOI: 10.1107/s2059798324009252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2024] [Accepted: 09/20/2024] [Indexed: 10/05/2024] Open
Abstract
β-Glucosidase from the thermophilic bacterium Caldicellulosiruptor saccharolyticus (Bgl1) has been denoted as having an attractive catalytic profile for various industrial applications. Bgl1 catalyses the final step in the decomposition of cellulose, an unbranched glucose polymer that has attracted the attention of researchers in recent years as it is the most abundant renewable source of reduced carbon in the biosphere. With the aim of enhancing the thermostability of Bgl1 for a broad spectrum of biotechnological processes, it has been subjected to structural studies. Crystal structures of Bgl1 and its complex with glucose were determined at 1.47 and 1.95 Å resolution, respectively. Bgl1 is a member of glycosyl hydrolase family 1 (GH1 superfamily, EC 3.2.1.21) and the results showed that the 3D structure of Bgl1 follows the overall architecture of the GH1 family, with a classical (β/α)8 TIM-barrel fold. Comparisons of Bgl1 with sequence or structural homologues of β-glucosidase reveal quite similar structures but also unique structural features in Bgl1 with plausible functional roles.
Affiliation(s)
- Anastasia I Sotiropoulou
- Institute of Chemical Biology, National Hellenic Research Foundation, 48 Vassileos Constantinou Avenue, 116 35 Athens, Greece
| | - Dimitris G Hatzinikolaou
- Enzyme and Microbial Biotechnology Unit, Department of Biology, National and Kapodistrian University of Athens, Panepistimiopolis Zografou, 157 72 Athens, Greece
| | - Evangelia D Chrysina
- Institute of Chemical Biology, National Hellenic Research Foundation, 48 Vassileos Constantinou Avenue, 116 35 Athens, Greece
|
8
|
Shen J, Su X, Wang S, Wang Z, Zhong C, Huang Y, Duan S. RhoJ: an emerging biomarker and target in cancer research and treatment. Cancer Gene Ther 2024; 31:1454-1464. [PMID: 38858534 DOI: 10.1038/s41417-024-00792-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Revised: 05/24/2024] [Accepted: 05/29/2024] [Indexed: 06/12/2024]
Abstract
RhoJ is a Rho GTPase that belongs to the Cdc42 subfamily and has a molecular weight of approximately 21 kDa. It can activate the p21-activated kinase family either directly or indirectly, influencing the activity of various downstream effectors and playing a role in regulating the cytoskeleton, cell movement, and cell cycle. RhoJ's expression and activity are controlled by multiple upstream factors at different levels, including expression, subcellular localization, and activation. High RhoJ expression is generally associated with a poor prognosis for cancer patients and is mainly due to an increased number of tumor blood vessels and abnormal expression in malignant cells. RhoJ promotes tumor progression through several pathways, particularly in tumor angiogenesis and drug resistance. Clinical data also indicates that high RhoJ expression is closely linked to the pathological features of tumor malignancy. There are various cancer treatment methods that target RhoJ signaling, such as direct binding to inhibit the RhoJ effector pocket, inhibiting RhoJ expression, blocking RhoJ upstream and downstream signals, and indirectly inhibiting RhoJ's effect. RhoJ is an emerging cancer biomarker and a significant target for future cancer clinical research and drug development.
Affiliation(s)
- Jinze Shen
- Key Laboratory of Novel Targets and Drug Study for Neural Repair of Zhejiang Province, School of Medicine, Hangzhou City University, Hangzhou, Zhejiang, China
| | - Xinming Su
- Key Laboratory of Novel Targets and Drug Study for Neural Repair of Zhejiang Province, School of Medicine, Hangzhou City University, Hangzhou, Zhejiang, China
| | - Shana Wang
- Department of Clinical Medicine, Hangzhou Medical College, Hangzhou, Zhejiang, China
| | - Zehua Wang
- Key Laboratory of Novel Targets and Drug Study for Neural Repair of Zhejiang Province, School of Medicine, Hangzhou City University, Hangzhou, Zhejiang, China
| | - Chenming Zhong
- Medical Genetics Center, School of Medicine, Ningbo University, Ningbo, Zhejiang, China
| | - Yi Huang
- Department of Neurosurgery, The First Affiliated Hospital of Ningbo University, Ningbo, Zhejiang, China.
| | - Shiwei Duan
- Key Laboratory of Novel Targets and Drug Study for Neural Repair of Zhejiang Province, School of Medicine, Hangzhou City University, Hangzhou, Zhejiang, China.
|
9
|
McCann H, Meade CD, Williams LD, Petrov AS, Johnson PZ, Simon AE, Hoksza D, Nawrocki EP, Chan PP, Lowe TM, Ribas CE, Sweeney BA, Madeira F, Anyango S, Appasamy SD, Deshpande M, Varadi M, Velankar S, Zirbel CL, Naiden A, Jossinet F, Petrov AI. R2DT: a comprehensive platform for visualising RNA secondary structure. bioRxiv 2024:2024.09.29.611006. [PMID: 39803519 PMCID: PMC11722224 DOI: 10.1101/2024.09.29.611006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2025]
Abstract
RNA secondary (2D) structure visualisation is an essential tool for understanding RNA function. R2DT is a software package designed to visualise RNA 2D structures in consistent, recognisable, and reproducible layouts. The latest release, R2DT 2.0, introduces multiple significant features, including the ability to display position-specific information, such as single nucleotide polymorphisms (SNPs) or SHAPE reactivities. It also offers a new template-free mode allowing visualisation of RNAs without pre-existing templates, alongside a constrained folding mode and support for animated visualisations. Users can interactively modify R2DT diagrams, either manually or using natural language prompts, to generate new templates or create publication-quality images. Additionally, R2DT features faster performance, an expanded template library, and a growing collection of compatible tools and utilities. Already integrated into multiple biological databases, R2DT has evolved into a comprehensive platform for RNA 2D visualisation, accessible at https://r2dt.bio.
Affiliation(s)
- Holly McCann
- NASA Center for Integration of the Origin of Life, Georgia Institute of Technology, Atlanta, GA 30332-0400, USA; School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332-0400, USA
| | - Caeden D. Meade
- NASA Center for Integration of the Origin of Life, Georgia Institute of Technology, Atlanta, GA 30332-0400, USA; School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332-0400, USA
| | - Loren Dean Williams
- NASA Center for Integration of the Origin of Life, Georgia Institute of Technology, Atlanta, GA 30332-0400, USA; School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332-0400, USA
| | - Anton S. Petrov
- NASA Center for Integration of the Origin of Life, Georgia Institute of Technology, Atlanta, GA 30332-0400, USA; School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332-0400, USA
| | - Philip Z. Johnson
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA
| | - Anne E. Simon
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA
| | - David Hoksza
- Department of Software Engineering, Charles University, Prague, 118 00, Czech Republic
| | - Eric P. Nawrocki
- National Center for Biotechnology Information, U.S. National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
| | - Patricia P. Chan
- Department of Biomolecular Engineering, Baskin School of Engineering, University of California, Santa Cruz, CA 95064, USA
| | - Todd M. Lowe
- Department of Biomolecular Engineering, Baskin School of Engineering, University of California, Santa Cruz, CA 95064, USA
| | - Carlos Eduardo Ribas
- European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
| | - Blake A. Sweeney
- European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
| | - Fábio Madeira
- European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
| | - Stephen Anyango
- European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
| | - Sri Devan Appasamy
- European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
| | - Mandar Deshpande
- European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
| | - Mihaly Varadi
- European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
| | - Sameer Velankar
- European Molecular Biology Laboratory, Wellcome Genome Campus, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
| | - Craig L. Zirbel
- Department of Mathematics and Statistics, Bowling Green State University, Bowling Green, OH 43403, USA
| | | | - Fabrice Jossinet
- Faculty of Life Sciences, University of Strasbourg, Strasbourg, 67000, France
|
10
|
Vollmar M, Tirunagari S, Harrus D, Armstrong D, Gáborová R, Gupta D, Afonso MQL, Evans G, Velankar S. Dataset from a human-in-the-loop approach to identify functionally important protein residues from literature. Sci Data 2024; 11:1032. [PMID: 39333508 PMCID: PMC11436914 DOI: 10.1038/s41597-024-03841-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Accepted: 08/29/2024] [Indexed: 09/29/2024] Open
Abstract
We present a novel system that leverages curators in the loop to develop a dataset and model for detecting structure features and functional annotations at residue-level from standard publication text. Our approach involves the integration of data from multiple resources, including PDBe, EuropePMC, PubMedCentral, and PubMed, combined with annotation guidelines from UniProt, and LitSuggest and HuggingFace models as tools in the annotation process. A team of seven annotators manually curated ten articles for named entities, which we utilized to train a starting PubmedBert model from HuggingFace. Using a human-in-the-loop annotation system, we iteratively developed the best model with commendable performance metrics of 0.90 for precision, 0.92 for recall, and 0.91 for F1-measure. Our proposed system showcases a successful synergy of machine learning techniques and human expertise in curating a dataset for residue-level functional annotations and protein structure features. The results demonstrate the potential for broader applications in protein research, bridging the gap between advanced machine learning models and the indispensable insights of domain experts.
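The three metrics quoted in this abstract can be checked against each other, since the F1-measure is conventionally the harmonic mean of precision and recall. The short sketch below assumes that standard definition (the abstract does not spell out how F1 was computed); the function name `f1_score` is chosen here for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the abstract.
f1 = f1_score(0.90, 0.92)
print(round(f1, 2))
```

Under that definition, 2 × 0.90 × 0.92 / (0.90 + 0.92) ≈ 0.9099, which rounds to the 0.91 reported.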
Affiliation(s)
- Melanie Vollmar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
| | - Santosh Tirunagari
- Literature Services, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Deborah Harrus
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - David Armstrong
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Romana Gáborová
- CEITEC - Central European Institute of Technology, Masaryk University, Kamenice 5, 62500, Brno, Czech Republic
| | - Deepti Gupta
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Marcelo Querino Lima Afonso
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Genevieve Evans
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| |
Collapse
|
11
|
Waman VP, Bordin N, Alcraft R, Vickerstaff R, Rauer C, Chan Q, Sillitoe I, Yamamori H, Orengo C. CATH 2024: CATH-AlphaFlow Doubles the Number of Structures in CATH and Reveals Nearly 200 New Folds. J Mol Biol 2024; 436:168551. [PMID: 38548261 DOI: 10.1016/j.jmb.2024.168551] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 03/20/2024] [Accepted: 03/22/2024] [Indexed: 04/07/2024]
Abstract
CATH (https://www.cathdb.info) classifies domain structures from experimental protein structures in the PDB and predicted structures in the AlphaFold Database (AFDB). To cope with the scale of the predicted data, a new Nextflow workflow (CATH-AlphaFlow) has been developed to classify high-quality domains into CATH superfamilies and identify novel fold groups and superfamilies. CATH-AlphaFlow uses a novel state-of-the-art structure-based domain boundary prediction method (ChainSaw) for identifying domains in multi-domain proteins. We applied CATH-AlphaFlow to process PDB structures not classified in CATH and AFDB structures from 21 model organisms, expanding CATH by over 100%. Domains not classified in existing CATH superfamilies or fold groups were used to seed novel folds, giving 253 new folds from PDB structures (September 2023 release) and 96 from AFDB structures of proteomes of 21 model organisms. Where possible, functional annotations were obtained using (i) predictions from publicly available methods and (ii) annotations from structural relatives in AFDB/UniProt50. We also predicted functional sites and highly conserved residues. Some folds are associated with important functions such as photosynthetic acclimation (in flowering plants), iron permease activity (in fungi) and post-natal spermatogenesis (in mice). CATH-AlphaFlow will allow us to identify many more CATH relatives in the AFDB, further characterising the protein structure landscape.
Affiliation(s)
- Vaishali P Waman
- Institute of Structural and Molecular Biology, University College London, London, United Kingdom
- Nicola Bordin
- Institute of Structural and Molecular Biology, University College London, London, United Kingdom
- Rachel Alcraft
- Advanced Research Computing Centre, University College London, London, United Kingdom
- Robert Vickerstaff
- Advanced Research Computing Centre, University College London, London, United Kingdom
- Clemens Rauer
- Institute of Structural and Molecular Biology, University College London, London, United Kingdom
- Qian Chan
- Institute of Structural and Molecular Biology, University College London, London, United Kingdom
- Ian Sillitoe
- Institute of Structural and Molecular Biology, University College London, London, United Kingdom
- Hazuki Yamamori
- Institute of Structural and Molecular Biology, University College London, London, United Kingdom
- Christine Orengo
- Institute of Structural and Molecular Biology, University College London, London, United Kingdom

12
Manfredi M, Savojardo C, Iardukhin G, Salomoni D, Costantini A, Martelli PL, Casadio R. Alpha&ESMhFolds: A Web Server for Comparing AlphaFold2 and ESMFold Models of the Human Reference Proteome. J Mol Biol 2024; 436:168593. [PMID: 38718922 DOI: 10.1016/j.jmb.2024.168593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2024] [Revised: 04/22/2024] [Accepted: 04/30/2024] [Indexed: 05/16/2024]
Abstract
We develop a novel database, Alpha&ESMhFolds, which allows the direct comparison of AlphaFold2 and ESMFold predicted models for 42,942 proteins of the Reference Human Proteome and, when available, their comparison with 2,900 directly associated PDB structures with a structure-to-sequence coverage of at least 70%. Statistics indicate that good quality models tend to overlap with a TM-score >0.6 as long as some PDB structural information is available. As expected, a direct model superimposition to the PDB structure highlights that AlphaFold2 models are slightly superior to ESMFold ones. However, some 55% of the database is endowed with models overlapping with TM-score <0.6. This highlights the different outputs of the two methods. The database is freely available for usage at https://alpha-esmhfolds.biocomp.unibo.it/.
Affiliation(s)
- Matteo Manfredi
- Biocomputing Group, Dept. of Pharmacy and Biotechnology, University of Bologna, Italy
- Castrense Savojardo
- Biocomputing Group, Dept. of Pharmacy and Biotechnology, University of Bologna, Italy
- Georgii Iardukhin
- Biocomputing Group, Dept. of Pharmacy and Biotechnology, University of Bologna, Italy
- Pier Luigi Martelli
- Biocomputing Group, Dept. of Pharmacy and Biotechnology, University of Bologna, Italy
- Rita Casadio
- Biocomputing Group, Dept. of Pharmacy and Biotechnology, University of Bologna, Italy

13
Tiemann JKS, Szczuka M, Bouarroudj L, Oussaren M, Garcia S, Howard RJ, Delemotte L, Lindahl E, Baaden M, Lindorff-Larsen K, Chavent M, Poulain P. MDverse, shedding light on the dark matter of molecular dynamics simulations. eLife 2024; 12:RP90061. [PMID: 39212001 PMCID: PMC11364437 DOI: 10.7554/elife.90061] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 09/04/2024] Open
Abstract
The rise of open science and the absence of a global dedicated data repository for molecular dynamics (MD) simulations have led to the accumulation of MD files in generalist data repositories, constituting the dark matter of MD - data that is technically accessible, but neither indexed, curated, nor easily searchable. Leveraging an original search strategy, we found and indexed about 250,000 files and 2,000 datasets from Zenodo, Figshare and Open Science Framework. With a focus on files produced by the Gromacs MD software, we illustrate the potential offered by the mining of publicly available MD data. We identified systems with specific molecular composition and were able to characterize essential parameters of MD simulation such as temperature and simulation length, and could identify model resolution, such as all-atom and coarse-grain. Based on this analysis, we inferred metadata to propose a search engine prototype to explore the MD data. To continue in this direction, we call on the community to pursue the effort of sharing MD data, and to report and standardize metadata to reuse this valuable matter.
Affiliation(s)
- Johanna KS Tiemann
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Magdalena Szczuka
- Institut de Pharmacologie et Biologie Structurale, CNRS, Université de Toulouse, Toulouse, France
- Lisa Bouarroudj
- Université Paris Cité, CNRS, Institut Jacques Monod, Paris, France
- Rebecca J Howard
- Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Stockholm, Sweden
- Lucie Delemotte
- Department of Applied Physics, Science for Life Laboratory, KTH Royal Institute of Technology, Stockholm, Sweden
- Erik Lindahl
- Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Stockholm, Sweden
- Department of Applied Physics, Science for Life Laboratory, KTH Royal Institute of Technology, Stockholm, Sweden
- Marc Baaden
- Laboratoire de Biochimie Théorique, CNRS, Université Paris Cité, Paris, France
- Kresten Lindorff-Larsen
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Matthieu Chavent
- Institut de Pharmacologie et Biologie Structurale, CNRS, Université de Toulouse, Toulouse, France
- Pierre Poulain
- Université Paris Cité, CNRS, Institut Jacques Monod, Paris, France

14
Visani GM, Pun MN, Galvin W, Daniel E, Borisiak K, Wagura U, Nourmohammad A. HERMES: Holographic Equivariant neuRal network model for Mutational Effect and Stability prediction. arXiv 2024:arXiv:2407.06703v1. [PMID: 39040640 PMCID: PMC11261993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/24/2024]
Abstract
Predicting the stability and fitness effects of amino acid mutations in proteins is a cornerstone of biological discovery and engineering. Various experimental techniques have been developed to measure mutational effects, providing us with extensive datasets across a diverse range of proteins. By training on these data, traditional computational modeling and more recent machine learning approaches have advanced significantly in predicting mutational effects. Here, we introduce HERMES, a 3D rotationally equivariant structure-based neural network model for mutational effect and stability prediction. Pre-trained to predict amino acid propensity from its surrounding 3D structure, HERMES can be fine-tuned for mutational effects using our open-source code. We present a suite of HERMES models, pre-trained with different strategies, and fine-tuned to predict the stability effect of mutations. Benchmarking against other models shows that HERMES often outperforms or matches their performance in predicting mutational effect on stability, binding, and fitness. HERMES offers versatile tools for evaluating mutational effects and can be fine-tuned for specific predictive objectives.
Affiliation(s)
- Gian Marco Visani
- Department of Computer Science and Engineering, University of Washington, Seattle, USA
- Michael N. Pun
- Department of Physics, University of Washington, 3910 15th Avenue Northeast, Seattle, WA 98195, USA
- William Galvin
- Department of Computer Science and Engineering, University of Washington, Seattle, USA
- Eric Daniel
- Department of Computer Science and Engineering, University of Washington, Seattle, USA
- Kevin Borisiak
- Department of Physics, University of Washington, 3910 15th Avenue Northeast, Seattle, WA 98195, USA
- Utheri Wagura
- Department of Physics, University of Washington, 3910 15th Avenue Northeast, Seattle, WA 98195, USA
- Department of Physics, Massachusetts Institute of Technology, 182 Memorial Dr, Cambridge, MA 02139, USA
- Armita Nourmohammad
- Department of Computer Science and Engineering, University of Washington, Seattle, USA
- Department of Physics, University of Washington, 3910 15th Avenue Northeast, Seattle, WA 98195, USA
- Department of Applied Mathematics, University of Washington, Seattle, USA
- Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, WA 98109, USA

15
Aguilera-Puga MDC, Plisson F. Structure-aware machine learning strategies for antimicrobial peptide discovery. Sci Rep 2024; 14:11995. [PMID: 38796582 PMCID: PMC11127937 DOI: 10.1038/s41598-024-62419-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Accepted: 05/16/2024] [Indexed: 05/28/2024] Open
Abstract
Machine learning models are revolutionizing our approaches to discovering and designing bioactive peptides. These models often need protein structure awareness, as they heavily rely on sequential data. The models excel at identifying sequences of a particular biological nature or activity, but they frequently fail to comprehend their intricate mechanism(s) of action. To solve two problems at once, we studied the mechanisms of action and structural landscape of antimicrobial peptides as (i) membrane-disrupting peptides, (ii) membrane-penetrating peptides, and (iii) protein-binding peptides. By analyzing critical features such as dipeptides and physicochemical descriptors, we developed models with high accuracy (86-88%) in predicting these categories. However, our initial models (1.0 and 2.0) exhibited a bias towards α-helical and coiled structures, influencing predictions. To address this structural bias, we implemented subset selection and data reduction strategies. The former gave three structure-specific models for peptides likely to fold into α-helices (models 1.1 and 2.1), coils (1.3 and 2.3), or mixed structures (1.4 and 2.4). The latter depleted over-represented structures, leading to structure-agnostic predictors 1.5 and 2.5. Additionally, our research highlights the sensitivity of important features to different structure classes across models.
Affiliation(s)
- Mariana D C Aguilera-Puga
- Department of Biotechnology and Biochemistry, Center for Research and Advanced Studies of the National Polytechnic Institute (CINVESTAV-IPN), Irapuato Unit, 36824, Irapuato, Guanajuato, Mexico
- Fabien Plisson
- Department of Biotechnology and Biochemistry, Center for Research and Advanced Studies of the National Polytechnic Institute (CINVESTAV-IPN), Irapuato Unit, 36824, Irapuato, Guanajuato, Mexico

16
Tiemann JKS, Szczuka M, Bouarroudj L, Oussaren M, Garcia S, Howard RJ, Delemotte L, Lindahl E, Baaden M, Lindorff-Larsen K, Chavent M, Poulain P. MDverse: Shedding Light on the Dark Matter of Molecular Dynamics Simulations. bioRxiv 2024:2023.05.02.538537. [PMID: 37205542 PMCID: PMC10187166 DOI: 10.1101/2023.05.02.538537] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023]
Abstract
The rise of open science and the absence of a global dedicated data repository for molecular dynamics (MD) simulations have led to the accumulation of MD files in generalist data repositories, constituting the dark matter of MD - data that is technically accessible, but neither indexed, curated, nor easily searchable. Leveraging an original search strategy, we found and indexed about 250,000 files and 2,000 datasets from Zenodo, Figshare and Open Science Framework. With a focus on files produced by the Gromacs MD software, we illustrate the potential offered by the mining of publicly available MD data. We identified systems with specific molecular composition and were able to characterize essential parameters of MD simulation such as temperature and simulation length, and could identify model resolution, such as all-atom and coarse-grain. Based on this analysis, we inferred metadata to propose a search engine prototype to explore the MD data. To continue in this direction, we call on the community to pursue the effort of sharing MD data, and to report and standardize metadata to reuse this valuable matter.

17
MacGowan SA, Madeira F, Britto-Borges T, Barton GJ. A unified analysis of evolutionary and population constraint in protein domains highlights structural features and pathogenic sites. Commun Biol 2024; 7:447. [PMID: 38605212 PMCID: PMC11009406 DOI: 10.1038/s42003-024-06117-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 03/27/2024] [Indexed: 04/13/2024] Open
Abstract
Protein evolution is constrained by structure and function, creating patterns in residue conservation that are routinely exploited to predict structure and other features. Similar constraints should affect variation across individuals, but it is only with the growth of human population sequencing that this has been tested at scale. Now, human population constraint has established applications in pathogenicity prediction, but it has not yet been explored for structural inference. Here, we map 2.4 million population variants to 5885 protein families and quantify residue-level constraint with a new Missense Enrichment Score (MES). Analysis of 61,214 structures from the PDB spanning 3661 families shows that missense depleted sites are enriched in buried residues or those involved in small-molecule or protein binding. MES is complementary to evolutionary conservation and a combined analysis allows a new classification of residues according to a conservation plane. This approach finds functional residues that are evolutionarily diverse, which can be related to specificity, as well as family-wide conserved sites that are critical for folding or function. We also find a possible contrast between lethal and non-lethal pathogenic sites, and a surprising clinical variant hot spot at a subset of missense enriched positions.
Affiliation(s)
- Stuart A MacGowan
- Division of Computational Biology, School of Life Sciences, University of Dundee, Dow Street, Dundee, DD1 5EH, Scotland, UK
- Fábio Madeira
- Division of Computational Biology, School of Life Sciences, University of Dundee, Dow Street, Dundee, DD1 5EH, Scotland, UK
- European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Thiago Britto-Borges
- Division of Computational Biology, School of Life Sciences, University of Dundee, Dow Street, Dundee, DD1 5EH, Scotland, UK
- Section of Bioinformatics and Systems Cardiology, Department of Internal Medicine III and Klaus Tschira Institute for Integrative Computational Cardiology, Heidelberg University Hospital, Heidelberg, Germany
- Geoffrey J Barton
- Division of Computational Biology, School of Life Sciences, University of Dundee, Dow Street, Dundee, DD1 5EH, Scotland, UK

18
Nikte SV, Joshi M, Sengupta D. State-dependent dynamics of extramembrane domains in the β2-adrenergic receptor. Proteins 2024; 92:317-328. [PMID: 37864328 DOI: 10.1002/prot.26613] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/03/2023] [Revised: 08/22/2023] [Accepted: 09/25/2023] [Indexed: 10/22/2023]
Abstract
G protein-coupled receptors (GPCRs) are membrane-bound signaling proteins that play an essential role in cellular signaling processes. Due to their intrinsic function of transmitting internal signals in response to external cues, these receptors are adapted to be highly dynamic in nature. The β2-adrenergic receptor (β2AR) is a representative member of the family that has been extensively analyzed in terms of its structure and activation. Although the structure of the transmembrane domain has been characterized in the different functional states of the receptor, the conformational dynamics of the extramembrane domains, especially the intrinsically disordered regions, are still emerging. In this study, we analyze the state-dependent dynamics of extramembrane domains of β2AR using atomistic molecular dynamics simulations. We introduce a parameter, the residue excess dynamics, that allows us to better quantify receptor dynamics. Using this measure, we show that the dynamics of the extramembrane domains are sensitive to the receptor state. Interestingly, the ligand-bound intermediate R′ state shows the maximal dynamics compared to either the active R*G or inactive R states. Ligand binding appears to be correlated with high residue excess dynamics that are dampened upon G protein coupling. The intracellular loop-3 (ICL3) domain has a tendency to flip towards the membrane upon ligand binding, which could contribute to receptor "priming." We highlight an important ICL1-helix-8 interplay that is broken in the ligand-bound state but is retained in the active state. Overall, our study highlights the importance of characterizing the functional dynamics of the GPCR loop domains.
Affiliation(s)
- Siddhanta V Nikte
- Physical and Materials Chemistry Division, National Chemical Laboratory, Pune, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- Manali Joshi
- Bioinformatics Center, Savitribai Phule Pune University, Pune, India
- Durba Sengupta
- Physical and Materials Chemistry Division, National Chemical Laboratory, Pune, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India

19
Ko S, Kim J, Lim J, Lee SM, Park JY, Woo J, Scott-Nevros ZK, Kim JR, Yoon H, Kim D. Blanket antimicrobial resistance gene database with structural information, BOARDS, provides insights on historical landscape of resistance prevalence and effects of mutations in enzyme structure. mSystems 2024; 9:e0094323. [PMID: 38085058 PMCID: PMC10871167 DOI: 10.1128/msystems.00943-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Accepted: 11/02/2023] [Indexed: 01/24/2024] Open
Abstract
Antimicrobial resistance (AMR) in pathogenic bacteria poses a significant threat to public health, yet tools for deeply understanding AMR genes based on genetic or structural information still need development. In this study, we present an interactive web database named Blanket Overarching Antimicrobial-Resistance gene Database with Structural information (BOARDS, sbml.unist.ac.kr), which comprehensively covers 3,943 reported AMR genes (1,997 extended-spectrum beta-lactamase (ESBL) genes and 1,946 other genes) as well as a total of 27,395 predicted protein structures. These structures, which include both wild-type AMR genes and their mutants, were derived from 80,094 publicly available whole-genome sequences. In addition, we developed the rapid analysis and detection tool of antimicrobial-resistance (RADAR), a one-stop analysis pipeline to detect AMR genes across whole-genome sequences (WGSs). By integrating BOARDS and RADAR, the AMR prevalence landscape for eight multi-drug resistant pathogens was reconstructed, leading to unexpected findings such as the pre-existence of the MCR genes before their official reports. Enzymatic structure prediction-based analysis revealed that the occurrence of mutations in some ESBL genes was closely related to the binding affinities with their antibiotic substrates. Overall, BOARDS can play a significant role in performing in-depth analysis on AMR. Importance: While increasing antimicrobial resistance (AMR) in pathogens has been a burden on public health, effective tools for deep understanding of AMR based on genetic or structural information remain limited. In this study, a blanket overarching antimicrobial-resistance gene database with structure information (BOARDS), a web-based database that comprehensively collects AMR gene data with predicted protein structural information, was constructed. Additionally, we report the development of a RADAR pipeline that can analyze whole-genome sequences as well. BOARDS, which includes sequence and structural information, has shown the historical landscape and prevalence of the AMR genes and can provide insight into single-nucleotide polymorphism effects on antibiotic-degrading enzymes within protein structures.
Affiliation(s)
- Seyoung Ko
- School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
- School of Life Sciences, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
- Jaehyung Kim
- School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
- Jaewon Lim
- School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
- Sang-Mok Lee
- School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
- Joon Young Park
- School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
- Jihoon Woo
- School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
- Zoe K. Scott-Nevros
- School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
- Jong R. Kim
- School of Engineering and Digital Sciences, Nazarbayev University, Astana, Kazakhstan
- Hyunjin Yoon
- Department of Molecular Science and Technology, Ajou University, Suwon, South Korea
- Donghyuk Kim
- School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
- School of Life Sciences, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea

20
Flachsenberg F, Ehrt C, Gutermuth T, Rarey M. Redocking the PDB. J Chem Inf Model 2024; 64:219-237. [PMID: 38108627 DOI: 10.1021/acs.jcim.3c01573] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
Molecular docking is a standard technique in structure-based drug design (SBDD). It aims to predict the 3D structure of a small molecule in the binding site of a receptor (often a protein). Despite being a common technique, it often necessitates multiple tools and involves manual steps. Here, we present the JAMDA preprocessing and docking workflow that is easy to use and allows fully automated docking. We evaluate the JAMDA docking workflow on binding sites extracted from the complete PDB and derive key factors determining JAMDA's docking performance. With that, we try to remove most of the bias due to manual intervention and provide a realistic estimate of the redocking performance of our JAMDA preprocessing and docking workflow for any PDB structure. On this large PDBScan22 data set, our JAMDA workflow finds a pose with an RMSD of at most 2 Å to the crystal ligand on the top rank for 30.1% of the structures. When applying objective structure quality filters to the PDBScan22 data set, the success rate increases to 61.8%. Given the prepared structures from the JAMDA preprocessing pipeline, both JAMDA and the widely used AutoDock Vina perform comparably on this filtered data set (the PDBScan22-HQ data set).
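The success criterion quoted above (a top-ranked pose within 2 Å RMSD of the crystal ligand) can be sketched as follows. This is a minimal illustration, not the JAMDA implementation: the function names are hypothetical, atoms are assumed to be matched one-to-one in the same order, and no symmetry correction or superposition is applied, which real redocking evaluations typically need to handle.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(pose: Sequence[Coord], reference: Sequence[Coord]) -> float:
    """Root-mean-square deviation between matched atom coordinates (in Å)."""
    if len(pose) != len(reference) or not pose:
        raise ValueError("pose and reference need the same non-zero atom count")
    squared = sum(
        (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
        for (px, py, pz), (rx, ry, rz) in zip(pose, reference)
    )
    return math.sqrt(squared / len(pose))


def top_rank_success_rate(top_rank_rmsds: Sequence[float], threshold: float = 2.0) -> float:
    """Fraction of structures whose top-ranked pose has RMSD <= threshold."""
    if not top_rank_rmsds:
        raise ValueError("no structures to evaluate")
    hits = sum(1 for value in top_rank_rmsds if value <= threshold)
    return hits / len(top_rank_rmsds)


# Toy data (made up for illustration): a pose shifted by 1 Å along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pose = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(pose, reference))

# Toy per-structure RMSDs of the top-ranked pose.
print(top_rank_success_rate([0.5, 1.9, 2.0, 3.5]))
```

Under this definition, the 30.1% and 61.8% figures in the abstract correspond to `top_rank_success_rate` evaluated over the full and quality-filtered data sets, respectively.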
Affiliation(s)
- Florian Flachsenberg
- Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
- Christiane Ehrt
- Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
- Torben Gutermuth
- Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
- Matthias Rarey
- Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany

21
Fang QY, Wang YP, Zhang RQ, Fan M, Feng LX, Guo XD, Cheng CR, Zhang XW, Liu X. Carnosol ameliorated cancer cachexia-associated myotube atrophy by targeting P5CS and its downstream pathways. Front Pharmacol 2024; 14:1291194. [PMID: 38249348 PMCID: PMC10799341 DOI: 10.3389/fphar.2023.1291194] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Accepted: 12/11/2023] [Indexed: 01/23/2024] Open
Abstract
Introduction: In our previous research, carnosol ameliorated muscle atrophy in mice that developed cancer cachexia. Method: Here, we observed the ameliorating effects of carnosol on C2C12 myotube atrophy resulting from simulated cancer cachexia injury (conditioned medium of C26 or LLC tumor cells). To clarify the mechanisms of carnosol, its possible direct target proteins were searched using the DARTS (drug affinity responsive target stability) assay and then confirmed using CETSA (cellular thermal shift assay). Furthermore, proteomic analysis was used to search for its possible indirect target proteins by comparing the protein expression profiles of C2C12 myotubes treated with C26 medium, with or without carnosol. The signal network between the direct and indirect target proteins of carnosol was then constructed. Results: Our results showed that Delta-1-pyrroline-5-carboxylate synthase (P5CS) might be the direct target protein of carnosol in myotubes. The influence of carnosol on amino acid metabolism downstream of P5CS was confirmed. Carnosol could upregulate the expression of proteins related to glutathione metabolism, the anti-oxidant system, and the heat shock response. Knockdown of P5CS could also ameliorate myotube atrophy and further enhance the ameliorating effects of carnosol. Discussion: These results suggested that carnosol might ameliorate cancer cachexia-associated myotube atrophy by targeting P5CS and its downstream pathways.
Affiliation(s)
- Qiao-Yu Fang
- Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Yue-Ping Wang
- Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Rui-Qin Zhang
- Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Meng Fan
- Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, China
- Li-Xing Feng
- Shanghai Majorbio Bio-Pharm Technology Co., Ltd., Shanghai, China
- Xiao-Dong Guo
- Department of Oncology, Yueyang Hospital of Integrated Traditional Chinese and Western Medicine, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Chun-Ru Cheng
- School of Chemical Engineering, Sichuan University of Science and Engineering, Zigong, Sichuan, China
- Xiong-Wen Zhang
- Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, China
- Xuan Liu
- Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China

22
Kwon S, Safer J, Nguyen DT, Hoksza D, May P, Arbesfeld JA, Rubin AF, Campbell AJ, Burgin A, Iqbal S. Genomics 2 Proteins portal: A resource and discovery tool for linking genetic screening outputs to protein sequences and structures. bioRxiv 2024:2024.01.02.573913. [PMID: 38260256 PMCID: PMC10802383 DOI: 10.1101/2024.01.02.573913] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 01/24/2024]
Abstract
Recent advances in AI-based methods have revolutionized the field of structural biology. Concomitantly, high-throughput sequencing and functional genomics technologies have enabled the detection and generation of variants at an unprecedented scale. However, efficient tools and resources are needed to link these two disparate data types - to "map" variants onto protein structures, to better understand how the variation causes disease and thereby design therapeutics. Here we present the Genomics 2 Proteins Portal (G2P; g2p.broadinstitute.org/): a human proteome-wide resource that maps 19,996,443 genetic variants onto 42,413 protein sequences and 77,923 structures, with a comprehensive set of structural and functional features. Additionally, the G2P portal generalizes the capability of linking genomics to proteins beyond databases by allowing users to interactively upload protein residue-wise annotations (variants, scores, etc.) as well as the protein structure to establish the connection. The portal serves as an easy-to-use discovery tool for researchers and scientists to hypothesize the structure-function relationship between natural or synthetic variations and their molecular phenotype.
|
23
|
Špačková A, Bazgier V, Raček T, Sehnal D, Svobodová R, Berka K. Analysis and Visualization of Protein Channels, Tunnels, and Pores with MOLEonline and ChannelsDB 2.0. Methods Mol Biol 2024; 2836:219-233. [PMID: 38995543 DOI: 10.1007/978-1-0716-4007-4_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/13/2024]
Abstract
Channels, tunnels, and pores serve as pathways for the transport of molecules and ions through protein structures, thus contributing to their function. MOLEonline (https://mole.upol.cz) is an interactive web-based tool with enhanced capabilities for detecting and characterizing channels, tunnels, and pores within protein structures. MOLEonline has two distinct calculation modes: one for the analysis of channels and tunnels, and one for transmembrane pores. This application gives researchers rich analytical insights into channel detection, structural characterization, and physicochemical properties. ChannelsDB 2.0 (https://channelsdb2.biodata.ceitec.cz/) is a comprehensive database that offers information on the location, geometry, and physicochemical characteristics of tunnels and pores within macromolecular structures deposited in the Protein Data Bank and AlphaFill databases. These tunnels are sourced from manual deposition from the literature and from automatic detection using the software tools MOLE and CAVER. MOLEonline and ChannelsDB visualization is powered by the LiteMol Viewer and the Mol* viewer, ensuring a user-friendly workspace. This chapter provides an overview of user applications and usage.
Affiliation(s)
- Anna Špačková
- Department of Physical Chemistry, Faculty of Science, Palacký University Olomouc, Olomouc, Czech Republic
| | - Václav Bazgier
- Department of Physical Chemistry, Faculty of Science, Palacký University Olomouc, Olomouc, Czech Republic
| | - Tomáš Raček
- Central European Institute of Technology (CEITEC), Masaryk University Brno, Brno, Czech Republic
- Faculty of Science, National Centre for Biomolecular Research, Brno, Czech Republic
| | - David Sehnal
- Central European Institute of Technology (CEITEC), Masaryk University Brno, Brno, Czech Republic
- Faculty of Science, National Centre for Biomolecular Research, Brno, Czech Republic
| | - Radka Svobodová
- Central European Institute of Technology (CEITEC), Masaryk University Brno, Brno, Czech Republic
- Faculty of Science, National Centre for Biomolecular Research, Brno, Czech Republic
| | - Karel Berka
- Department of Physical Chemistry, Faculty of Science, Palacký University Olomouc, Olomouc, Czech Republic.
|
24
|
Kunnakkattu IR, Choudhary P, Pravda L, Nadzirin N, Smart OS, Yuan Q, Anyango S, Nair S, Varadi M, Velankar S. PDBe CCDUtils: an RDKit-based toolkit for handling and analysing small molecules in the Protein Data Bank. J Cheminform 2023; 15:117. [PMID: 38042830 PMCID: PMC10693035 DOI: 10.1186/s13321-023-00786-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Accepted: 11/17/2023] [Indexed: 12/04/2023]
Abstract
While the Protein Data Bank (PDB) contains a wealth of structural information on ligands bound to macromolecules, their analysis can be challenging due to the large amount and diversity of data. Here, we present PDBe CCDUtils, a versatile toolkit for processing and analysing small molecules from the PDB in PDBx/mmCIF format. PDBe CCDUtils provides streamlined access to all the metadata for small molecules in the PDB and offers a set of convenient methods to compute various properties using RDKit, such as 2D depictions, 3D conformers, physicochemical properties, scaffolds, common fragments, and cross-references to small molecule databases using UniChem. The toolkit also provides methods for identifying all the covalently attached chemical components in a macromolecular structure and calculating similarity among small molecules. By providing a broad range of functionality, PDBe CCDUtils caters to the needs of researchers in cheminformatics, structural biology, bioinformatics and computational chemistry.
Affiliation(s)
- Ibrahim Roshan Kunnakkattu
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Preeti Choudhary
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Lukas Pravda
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Nurul Nadzirin
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Oliver S Smart
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Qi Yuan
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Stephen Anyango
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Sreenath Nair
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Mihaly Varadi
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
|
25
|
Ribeiro AJM, Riziotis IG, Tyzack JD, Borkakoti N, Thornton JM. EzMechanism: an automated tool to propose catalytic mechanisms of enzyme reactions. Nat Methods 2023; 20:1516-1522. [PMID: 37735566 PMCID: PMC10555830 DOI: 10.1038/s41592-023-02006-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2022] [Accepted: 08/15/2023] [Indexed: 09/23/2023]
Abstract
Over the years, hundreds of enzyme reaction mechanisms have been studied using experimental and simulation methods. This rich literature on biological catalysis is now ripe for use as the foundation of new knowledge-based approaches to investigate enzyme mechanisms. Here, we present a tool able to automatically infer mechanistic paths for a given three-dimensional active site and enzyme reaction, based on a set of catalytic rules compiled from the Mechanism and Catalytic Site Atlas, a database of enzyme mechanisms. EzMechanism (pronounced as 'Easy' Mechanism) is available to everyone through a web user interface. When studying a mechanism, EzMechanism facilitates and improves the generation of hypotheses, by making sure that relevant information is considered, as derived from the literature on both related and unrelated enzymes. We validated EzMechanism on a set of 62 enzymes and have identified paths for further improvement, including the need for additional and more generic catalytic rules.
Affiliation(s)
- Antonio J M Ribeiro
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.
| | - Ioannis G Riziotis
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | - Jonathan D Tyzack
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | - Neera Borkakoti
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | - Janet M Thornton
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.
|
26
|
Maiti S, Heyden M. Model-Dependent Solvation of the K-18 Domain of the Intrinsically Disordered Protein Tau. J Phys Chem B 2023; 127:7220-7230. [PMID: 37556237 DOI: 10.1021/acs.jpcb.3c01726] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/11/2023]
Abstract
A known imbalance between intra-protein and protein-water interactions in many empirical force fields results in collapsed conformational ensembles of intrinsically disordered proteins in explicit solvent simulations that disagree with experiments. Multiple strategies have been introduced in the literature to modify protein-water interactions, which improve agreement between experiments and simulations. In this work, we combine simulations with standard and modified force fields with a spatially resolved analysis of solvation free energy contributions and compare the consequences of each strategy. We find that enhanced Lennard-Jones (LJ) interactions between protein atoms and water oxygens primarily improve the solvation of nonpolar functional groups of the protein. In contrast, modified electrostatics in the water model or strengthened LJ interactions between the protein and water hydrogens mainly affect the hydration of polar functional groups. Modified electrostatics further impact the average orientation of water molecules in the hydration shell. As a result, protein-water interactions with the first hydration layers are strengthened, while interactions with water molecules in higher hydration shells are weakened. Hence, distinct strategies to balance intra-protein and protein-water interactions in simulations have qualitatively different effects on protein solvation. These differences are not necessarily captured by comparisons to experiments that report on global parameters describing protein conformational ensembles, e.g., the radius of gyration, but will influence the tendency of a protein to form aggregates or phase-separated droplets.
Affiliation(s)
- Sthitadhi Maiti
- School of Molecular Sciences, Arizona State University, Tempe, Arizona 85287, United States
| | - Matthias Heyden
- School of Molecular Sciences, Arizona State University, Tempe, Arizona 85287, United States
|
27
|
Zhang Z, Li C, Li Q, Su X, Li J, Zhu L, Lin XJ, Shen J. Structure prediction of novel isoforms from uveal melanoma by AlphaFold. Sci Data 2023; 10:513. [PMID: 37542084 PMCID: PMC10403560 DOI: 10.1038/s41597-023-02429-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Accepted: 07/28/2023] [Indexed: 08/06/2023]
Abstract
Alternative splicing is an important mechanism that enhances protein functional diversity. To date, our understanding of alternative splicing variants has been based on mRNA transcript data, but because protein structures have been difficult to predict, the tertiary structures of these variants have remained largely unexplored. However, with the release of AlphaFold, which predicts three-dimensional models of proteins, this challenge is rapidly being overcome. Here, we present a dataset of 315 predicted structures of abnormal isoforms in 18 uveal melanoma patients based on second- and third-generation transcriptome-sequencing data. This information comprises a high-quality set of structural data on recurrent aberrant isoforms that can be used in multiple types of studies, from those aimed at revealing potential therapeutic targets to those aimed at recognizing cancer neoantigens at the atomic level.
Affiliation(s)
- Zhe Zhang
- Department of Ophthalmology, Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China.
- Shanghai Key Laboratory of Orbital Diseases and Ocular Oncology, Shanghai, 200025, China.
- Institute of Translational Medicine, National Facility for Translational Medicine, Shanghai Jiao Tong University, Shanghai, 200240, China.
| | - Chen Li
- High Performance Computing Center, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Qian Li
- Department of Ophthalmology, Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
- Shanghai Key Laboratory of Orbital Diseases and Ocular Oncology, Shanghai, 200025, China
- Institute of Translational Medicine, National Facility for Translational Medicine, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Xiaoming Su
- High Performance Computing Center, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Jiayi Li
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences & Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Lili Zhu
- Songjiang Research Institute and Songjiang Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 201600, China
| | - Xinhua James Lin
- High Performance Computing Center, Shanghai Jiao Tong University, Shanghai, 200240, China.
| | - Jianfeng Shen
- Department of Ophthalmology, Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China.
- Shanghai Key Laboratory of Orbital Diseases and Ocular Oncology, Shanghai, 200025, China.
- Institute of Translational Medicine, National Facility for Translational Medicine, Shanghai Jiao Tong University, Shanghai, 200240, China.
|
28
|
Chojnowski G. Sequence-assignment validation in protein crystal structure models with checkMySequence. Acta Crystallogr D Struct Biol 2023; 79:559-568. [PMID: 37314404 PMCID: PMC10306063 DOI: 10.1107/s2059798323003765] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Accepted: 04/26/2023] [Indexed: 06/15/2023]
Abstract
Sequence-register shifts remain one of the most elusive errors in experimental macromolecular models. They may affect model interpretation and propagate to newly built models from older structures. In a recent publication, it was shown that register shifts in cryo-EM models of proteins can be detected using a systematic reassignment of short model fragments to the target sequence. Here, it is shown that the same approach can be used to detect register shifts in crystal structure models using standard, model-bias-corrected electron-density maps (2mFo - DFc). Five register-shift errors in models deposited in the PDB detected using this method are described in detail.
Affiliation(s)
- Grzegorz Chojnowski
- European Molecular Biology Laboratory, Hamburg Unit, Notkestrasse 85, 22607 Hamburg, Germany
|
29
|
Agirre J, Atanasova M, Bagdonas H, Ballard CB, Baslé A, Beilsten-Edmands J, Borges RJ, Brown DG, Burgos-Mármol JJ, Berrisford JM, Bond PS, Caballero I, Catapano L, Chojnowski G, Cook AG, Cowtan KD, Croll TI, Debreczeni JÉ, Devenish NE, Dodson EJ, Drevon TR, Emsley P, Evans G, Evans PR, Fando M, Foadi J, Fuentes-Montero L, Garman EF, Gerstel M, Gildea RJ, Hatti K, Hekkelman ML, Heuser P, Hoh SW, Hough MA, Jenkins HT, Jiménez E, Joosten RP, Keegan RM, Keep N, Krissinel EB, Kolenko P, Kovalevskiy O, Lamzin VS, Lawson DM, Lebedev AA, Leslie AGW, Lohkamp B, Long F, Malý M, McCoy AJ, McNicholas SJ, Medina A, Millán C, Murray JW, Murshudov GN, Nicholls RA, Noble MEM, Oeffner R, Pannu NS, Parkhurst JM, Pearce N, Pereira J, Perrakis A, Powell HR, Read RJ, Rigden DJ, Rochira W, Sammito M, Sánchez Rodríguez F, Sheldrick GM, Shelley KL, Simkovic F, Simpkin AJ, Skubak P, Sobolev E, Steiner RA, Stevenson K, Tews I, Thomas JMH, Thorn A, Valls JT, Uski V, Usón I, Vagin A, Velankar S, Vollmar M, Walden H, Waterman D, Wilson KS, Winn MD, Winter G, Wojdyr M, Yamashita K. The CCP4 suite: integrative software for macromolecular crystallography. Acta Crystallogr D Struct Biol 2023; 79:449-461. [PMID: 37259835 PMCID: PMC10233625 DOI: 10.1107/s2059798323003595] [Citation(s) in RCA: 334] [Impact Index Per Article: 167.0] [Received: 03/29/2023] [Accepted: 04/19/2023] [Indexed: 06/02/2023]
Abstract
The Collaborative Computational Project No. 4 (CCP4) is a UK-led international collective with a mission to develop, test, distribute and promote software for macromolecular crystallography. The CCP4 suite is a multiplatform collection of programs brought together by familiar execution routines, a set of common libraries and graphical interfaces. The CCP4 suite has experienced several considerable changes since its last reference article, involving new infrastructure, original programs and graphical interfaces. This article, which is intended as a general literature citation for the use of the CCP4 software suite in structure determination, will guide the reader through such transformations, offering a general overview of the new features and outlining future developments. As such, it aims to highlight the individual programs that comprise the suite and to provide the latest references to them for perusal by crystallographers around the world.
Affiliation(s)
- Jon Agirre
- York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, United Kingdom
| | - Mihaela Atanasova
- York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, United Kingdom
| | - Haroldas Bagdonas
- York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, United Kingdom
| | - Charles B. Ballard
- STFC, Rutherford Appleton Laboratory, Didcot OX11 0FA, United Kingdom
- CCP4, Research Complex at Harwell, Rutherford Appleton Laboratory, Didcot OX11 0FA, United Kingdom
| | - Arnaud Baslé
- Biosciences Institute, Newcastle University, Newcastle upon Tyne NE2 4HH, United Kingdom
| | - James Beilsten-Edmands
- Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, United Kingdom
| | - Rafael J. Borges
- The Center of Medicinal Chemistry (CQMED), Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Av. Dr. André Tosello 550, 13083-886 Campinas, Brazil
| | - David G. Brown
- Laboratoires Servier SAS Institut de Recherches, Croissy-sur-Seine, France
| | - J. Javier Burgos-Mármol
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
| | - John M. Berrisford
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Paul S. Bond
- York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, United Kingdom
| | - Iracema Caballero
- Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB–CSIC), Barcelona Science Park, Helix Building, Baldiri Reixac 15, 08028 Barcelona, Spain
| | - Lucrezia Catapano
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
- Randall Centre for Cell and Molecular Biophysics, Faculty of Life Sciences and Medicine, King’s College London, London SE1 9RT, United Kingdom
| | - Grzegorz Chojnowski
- European Molecular Biology Laboratory, Hamburg Unit, Notkestrasse 85, 22607 Hamburg, Germany
| | - Atlanta G. Cook
- The Wellcome Centre for Cell Biology, University of Edinburgh, Michael Swann Building, Max Born Crescent, The King’s Buildings, Edinburgh EH9 3BF, United Kingdom
| | - Kevin D. Cowtan
- York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, United Kingdom
| | - Tristan I. Croll
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
- Altos Labs, Portway Building, Granta Park, Great Abington, Cambridge CB21 6GP, United Kingdom
| | - Judit É. Debreczeni
- Discovery Sciences, R&D BioPharmaceuticals, AstraZeneca, Darwin Building, Cambridge Science Park, Milton Road, Cambridge CB4 0WG, United Kingdom
| | - Nicholas E. Devenish
- Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, United Kingdom
| | - Eleanor J. Dodson
- York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, United Kingdom
| | - Tarik R. Drevon
- STFC, Rutherford Appleton Laboratory, Didcot OX11 0FA, United Kingdom
- CCP4, Research Complex at Harwell, Rutherford Appleton Laboratory, Didcot OX11 0FA, United Kingdom
| | - Paul Emsley
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
| | - Gwyndaf Evans
- Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, United Kingdom
- Rosalind Franklin Institute, Harwell Science and Innovation Campus, Didcot OX11 0QS, United Kingdom
| | - Phil R. Evans
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
| | - Maria Fando
- STFC, Rutherford Appleton Laboratory, Didcot OX11 0FA, United Kingdom
- CCP4, Research Complex at Harwell, Rutherford Appleton Laboratory, Didcot OX11 0FA, United Kingdom
| | - James Foadi
- Department of Mathematical Sciences, University of Bath, Bath, United Kingdom
| | - Luis Fuentes-Montero
- Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, United Kingdom
| | - Elspeth F. Garman
- Department of Biochemistry, University of Oxford, Dorothy Crowfoot Hodgkin Building, Oxford OX1 3QU, United Kingdom
| | - Markus Gerstel
- Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, United Kingdom
| | - Richard J. Gildea
- Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, United Kingdom
| | - Kaushik Hatti
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
| | - Maarten L. Hekkelman
- Oncode Institute and Department of Biochemistry, Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Philipp Heuser
- European Molecular Biology Laboratory, c/o DESY, Notkestrasse 85, 22607 Hamburg, Germany
| | - Soon Wen Hoh
- York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, United Kingdom
| | - Michael A. Hough
- Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, United Kingdom
- School of Life Sciences, University of Essex, Wivenhoe Park, Colchester CO4 3SQ, United Kingdom
| | - Huw T. Jenkins
- York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, United Kingdom
| | - Elisabet Jiménez
- Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB–CSIC), Barcelona Science Park, Helix Building, Baldiri Reixac 15, 08028 Barcelona, Spain
| | - Robbie P. Joosten
- Oncode Institute and Department of Biochemistry, Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Ronan M. Keegan
- STFC, Rutherford Appleton Laboratory, Didcot OX11 0FA, United Kingdom
- CCP4, Research Complex at Harwell, Rutherford Appleton Laboratory, Didcot OX11 0FA, United Kingdom
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
| | - Nicholas Keep
- Department of Biological Sciences, Institute of Structural and Molecular Biology, Birkbeck College, London WC1E 7HX, United Kingdom
| | - Eugene B. Krissinel
- STFC, Rutherford Appleton Laboratory, Didcot OX11 0FA, United Kingdom
- CCP4, Research Complex at Harwell, Rutherford Appleton Laboratory, Didcot OX11 0FA, United Kingdom
| | - Petr Kolenko
- Faculty of Nuclear Sciences and Physical Engineering, Czech Technical University in Prague, Břehová 7, 115 19 Prague 1, Czech Republic
- Institute of Biotechnology of the Czech Academy of Sciences, BIOCEV, Průmyslová 55, 252 50 Vestec, Czech Republic
| | - Oleg Kovalevskiy
- STFC, Rutherford Appleton Laboratory, Didcot OX11 0FA, United Kingdom
- CCP4, Research Complex at Harwell, Rutherford Appleton Laboratory, Didcot OX11 0FA, United Kingdom
| | - Victor S. Lamzin
- European Molecular Biology Laboratory, Hamburg Unit, Notkestrasse 85, 22607 Hamburg, Germany
| | - David M. Lawson
- Department of Biochemistry and Metabolism, John Innes Centre, Norwich NR4 7UH, United Kingdom
| | - Andrey A. Lebedev
- STFC, Rutherford Appleton Laboratory, Didcot OX11 0FA, United Kingdom
- CCP4, Research Complex at Harwell, Rutherford Appleton Laboratory, Didcot OX11 0FA, United Kingdom
| | - Andrew G. W. Leslie
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
| | - Bernhard Lohkamp
- Department of Medical Biochemistry and Biophysics, Karolinska Institutet, SE-171 77 Stockholm, Sweden
| | - Fei Long
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
| | - Martin Malý
- Faculty of Nuclear Sciences and Physical Engineering, Czech Technical University in Prague, Břehová 7, 115 19 Prague 1, Czech Republic
- Institute of Biotechnology of the Czech Academy of Sciences, BIOCEV, Průmyslová 55, 252 50 Vestec, Czech Republic
- Biological Sciences, Institute for Life Sciences, University of Southampton, Southampton SO17 1BJ, United Kingdom
| | - Airlie J. McCoy
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
| | - Stuart J. McNicholas
- York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, United Kingdom
| | - Ana Medina
- Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB–CSIC), Barcelona Science Park, Helix Building, Baldiri Reixac 15, 08028 Barcelona, Spain
| | - Claudia Millán
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
| | - James W. Murray
- Department of Life Sciences, Imperial College London, South Kensington Campus, London SW7 2AZ, United Kingdom
| | - Garib N. Murshudov
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
| | - Robert A. Nicholls
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
| | - Martin E. M. Noble
- Translational and Clinical Research Institute, Newcastle University, Paul O’Gorman Building, Medical School, Framlington Place, Newcastle upon Tyne NE2 4HH, United Kingdom
| | - Robert Oeffner
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
| | - Navraj S. Pannu
- Department of Infectious Diseases, Leiden University Medical Center, PO Box 9600, 2300 RC Leiden, The Netherlands
| | - James M. Parkhurst
- Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, United Kingdom
- Rosalind Franklin Institute, Harwell Science and Innovation Campus, Didcot OX11 0QS, United Kingdom
| | - Nicholas Pearce
- Department of Physics, Chemistry and Biology (IFM), Linköping University, SE-581 83 Linköping, Sweden
| | - Joana Pereira
- Biozentrum and SIB Swiss Institute of Bioinformatics, University of Basel, 4056 Basel, Switzerland
| | - Anastassis Perrakis
- Oncode Institute and Department of Biochemistry, Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Harold R. Powell
- Department of Life Sciences, Imperial College London, South Kensington Campus, London SW7 2AZ, United Kingdom
| | - Randy J. Read
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
| | - Daniel J. Rigden
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
| | - William Rochira
- York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, United Kingdom
| | - Massimo Sammito
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 0XY, United Kingdom
- Discovery Centre, Biologics Engineering, AstraZeneca, Biomedical Campus, 1 Francis Crick Avenue, Trumpington, Cambridge CB2 0AA, United Kingdom
| | - Filomeno Sánchez Rodríguez
- York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, United Kingdom
- Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, United Kingdom
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
| | - George M. Sheldrick
- Department of Structural Chemistry, Georg-August-Universität Göttingen, Tammannstrasse 4, 37077 Göttingen, Germany
| | - Kathryn L. Shelley
- Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
| | - Felix Simkovic
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
| | - Adam J. Simpkin
- Laboratoires Servier SAS Institut de Recherches, Croissy-sur-Seine, France
| | - Pavol Skubak
- Department of Infectious Diseases, Leiden University Medical Center, PO Box 9600, 2300 RC Leiden, The Netherlands
| | - Egor Sobolev
- European Molecular Biology Laboratory, c/o DESY, Notkestrasse 85, 22607 Hamburg, Germany
| | - Roberto A. Steiner
- Randall Centre for Cell and Molecular Biophysics, Faculty of Life Sciences and Medicine, King’s College London, London SE1 9RT, United Kingdom
- Department of Biomedical Sciences, University of Padova, Italy
| | - Kyle Stevenson
- STFC, Rutherford Appleton Laboratory, Didcot OX11 0FA, United Kingdom
| | - Ivo Tews
- Biological Sciences, Institute for Life Sciences, University of Southampton, Southampton SO17 1BJ, United Kingdom
| | - Jens M. H. Thomas
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
| | - Andrea Thorn
- Institute for Nanostructure and Solid State Physics, Universität Hamburg, 22761 Hamburg, Germany
| | - Josep Triviño Valls
- Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB–CSIC), Barcelona Science Park, Helix Building, Baldiri Reixac 15, 08028 Barcelona, Spain
| | - Ville Uski
- STFC, Rutherford Appleton Laboratory, Didcot OX11 0FA, United Kingdom
- CCP4, Research Complex at Harwell, Rutherford Appleton Laboratory, Didcot OX11 0FA, United Kingdom
| | - Isabel Usón
- Crystallographic Methods, Institute of Molecular Biology of Barcelona (IBMB–CSIC), Barcelona Science Park, Helix Building, Baldiri Reixac 15, 08028 Barcelona, Spain
- ICREA, Institució Catalana de Recerca i Estudis Avançats, Passeig Lluís Companys 23, 08003 Barcelona, Spain
| | - Alexei Vagin
- York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, United Kingdom
| | - Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Melanie Vollmar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Helen Walden
- School of Molecular Biosciences, College of Medical Veterinary and Life Sciences, University of Glasgow, Glasgow, United Kingdom
| | - David Waterman
- STFC, Rutherford Appleton Laboratory, Didcot OX11 0FA, United Kingdom
- CCP4, Research Complex at Harwell, Rutherford Appleton Laboratory, Didcot OX11 0FA, United Kingdom
| | - Keith S. Wilson
- York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5DD, United Kingdom
| | - Martyn D. Winn
- Scientific Computing Department, Science and Technology Facilities Council, Didcot OX11 0FA, United Kingdom
| | - Graeme Winter
- Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, United Kingdom
| | - Marcin Wojdyr
- Global Phasing Limited (United Kingdom), Sheraton House, Castle Park, Cambridge CB3 0AX, United Kingdom
| | - Keitaro Yamashita
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
| |
|
30
|
Kou X, Shi P, Gao C, Ma P, Xing H, Ke Q, Zhang D. Data-Driven Elucidation of Flavor Chemistry. JOURNAL OF AGRICULTURAL AND FOOD CHEMISTRY 2023; 71:6789-6802. [PMID: 37102791 PMCID: PMC10176570 DOI: 10.1021/acs.jafc.3c00909] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 05/11/2023]
Abstract
Flavor molecules are commonly used in the food industry to enhance product quality and consumer experiences but are associated with potential human health risks, highlighting the need for safer alternatives. To address these health-associated challenges and promote reasonable application, several databases for flavor molecules have been constructed. However, no existing studies have comprehensively summarized these data resources according to quality, focused fields, and potential gaps. Here, we systematically summarized 25 flavor molecule databases published within the last 20 years and revealed that data inaccessibility, untimely updates, and nonstandard flavor descriptions are the main limitations of current studies. We examined the development of computational approaches (e.g., machine learning and molecular simulation) for the identification of novel flavor molecules and discussed their major challenges regarding throughput, model interpretability, and the lack of gold-standard data sets for equitable model evaluation. Additionally, we discussed future strategies for the mining and designing of novel flavor molecules based on multi-omics and artificial intelligence to provide a new foundation for flavor science research.
Affiliation(s)
- Xingran Kou
- Collaborative Innovation Center of Fragrance Flavour and Cosmetics, School of Perfume and Aroma Technology, Shanghai Institute of Technology, Shanghai 201418, China
| | - Peiqin Shi
- Collaborative Innovation Center of Fragrance Flavour and Cosmetics, School of Perfume and Aroma Technology, Shanghai Institute of Technology, Shanghai 201418, China
| | - Chukun Gao
- Laboratory for Physical Chemistry, ETH Zürich, 8093 Zürich, Switzerland
| | - Peihua Ma
- Department of Nutrition and Food Science, University of Maryland, College Park, Maryland 20742, United States
| | - Huadong Xing
- CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Qinfei Ke
- Collaborative Innovation Center of Fragrance Flavour and Cosmetics, School of Perfume and Aroma Technology, Shanghai Institute of Technology, Shanghai 201418, China
| | - Dachuan Zhang
- National Centre of Competence in Research (NCCR) Catalysis, Institute of Environmental Engineering, ETH Zürich, 8093 Zürich, Switzerland
| |
|
31
|
Choudhary P, Anyango S, Berrisford J, Tolchard J, Varadi M, Velankar S. Unified access to up-to-date residue-level annotations from UniProtKB and other biological databases for PDB data. Sci Data 2023; 10:204. [PMID: 37045837 PMCID: PMC10097656 DOI: 10.1038/s41597-023-02101-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2022] [Accepted: 03/23/2023] [Indexed: 04/14/2023] Open
Abstract
More than 61,000 proteins have up-to-date correspondence between their amino acid sequence (UniProtKB) and their 3D structures (PDB), enabled by the Structure Integration with Function, Taxonomy and Sequences (SIFTS) resource. SIFTS incorporates residue-level annotations from many other biological resources. SIFTS data are available in several formats (XML, CSV, and TSV) and are also accessible via the PDBe REST API, but they have always been maintained separately from the structure data (PDBx/mmCIF files) in the PDB archive. Here, we extended the wwPDB PDBx/mmCIF data dictionary with additional categories to accommodate SIFTS data and added the UniProtKB, Pfam, SCOP2, and CATH residue-level annotations directly into the PDBx/mmCIF files from the PDB archive. With the integrated UniProtKB annotations, these files now provide consistent numbering of residues in different PDB entries, allowing easy comparison of structure models. The extended dictionary yields a more consistent, standardised metadata description without altering the core PDB information. This development enables up-to-date cross-reference information at the residue level, resulting in better data interoperability and supporting improved data analysis and visualisation.
Grants
- BB/V004247/1, PI:Sameer Velankar RCUK | Biotechnology and Biological Sciences Research Council (BBSRC)
- DBI-2019297, PI: S.K. Burley National Science Foundation (NSF)
- DBI-2019297, PI: S.K. Burley NSF | National Science Board (NSB)
Affiliation(s)
- Preeti Choudhary
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
| | - Stephen Anyango
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - John Berrisford
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- AstraZeneca, Biomedical Campus, 1 Francis Crick Ave, Trumpington, Cambridge, CB2 0AA, UK
| | - James Tolchard
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Claude Bernard University, Villeurbanne, Lyon, 69100, France
| | - Mihaly Varadi
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| |
|
32
|
Rothfels K, Milacic M, Matthews L, Haw R, Sevilla C, Gillespie M, Stephan R, Gong C, Ragueneau E, May B, Shamovsky V, Wright A, Weiser J, Beavers D, Conley P, Tiwari K, Jassal B, Griss J, Senff-Ribeiro A, Brunson T, Petryszak R, Hermjakob H, D'Eustachio P, Wu G, Stein L. Using the Reactome Database. Curr Protoc 2023; 3:e722. [PMID: 37053306 PMCID: PMC11184634 DOI: 10.1002/cpz1.722] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Indexed: 04/15/2023]
Abstract
Pathway databases provide descriptions of the roles of proteins, nucleic acids, lipids, carbohydrates, and other molecular entities within their biological cellular contexts. Pathway-centric views of these roles may allow for the discovery of unexpected functional relationships in data such as gene expression profiles and somatic mutation catalogues from tumor cells. For this reason, there is a high demand for high-quality pathway databases and their associated tools. The Reactome project (a collaboration between the Ontario Institute for Cancer Research, New York University Langone Health, the European Bioinformatics Institute, and Oregon Health & Science University) is one such pathway database. Reactome collects detailed information on biological pathways and processes in humans from the primary literature. Reactome content is manually curated, expert-authored, and peer-reviewed and spans the gamut from simple intermediate metabolism to signaling pathways and complex cellular events. This information is supplemented with likely orthologous molecular reactions in mouse, rat, zebrafish, worm, and other model organisms. © 2023 The Authors. Current Protocols published by Wiley Periodicals LLC. Basic Protocol 1: Browsing a Reactome pathway; Basic Protocol 2: Exploring Reactome annotations of disease and drugs; Basic Protocol 3: Finding the pathways involving a gene or protein; Alternate Protocol 1: Finding the pathways involving a gene or protein using UniProtKB (SwissProt), Ensembl, or Entrez gene identifier; Alternate Protocol 2: Using advanced search; Basic Protocol 4: Using the Reactome pathway analysis tool to identify statistically overrepresented pathways; Basic Protocol 5: Using the Reactome pathway analysis tool to overlay expression data onto Reactome pathway diagrams; Basic Protocol 6: Comparing inferred model organism and human pathways using the Species Comparison tool; Basic Protocol 7: Comparing tissue-specific expression using the Tissue Distribution tool.
Affiliation(s)
- Karen Rothfels
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Marija Milacic
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | | | - Robin Haw
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Cristoffer Sevilla
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Marc Gillespie
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- College of Pharmacy and Health Sciences, St. John's University, Queens, New York
| | - Ralf Stephan
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Chuqiao Gong
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Eliot Ragueneau
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Bruce May
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | | | - Adam Wright
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Joel Weiser
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | | | | | - Krishna Tiwari
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Bijay Jassal
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Johannes Griss
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Department of Dermatology, Medical University of Vienna, Vienna, Austria
| | - Andrea Senff-Ribeiro
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- Universidade Federal do Paraná, Curitiba, Brazil
| | | | - Robert Petryszak
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Oregon Health and Science University, Portland, Oregon
| | - Henning Hermjakob
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | | | - Guanming Wu
- Oregon Health and Science University, Portland, Oregon
| | - Lincoln Stein
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
| |
|
33
|
Hatami S, Sirous H, Mahnam K, Najafipour A, Fassihi A. Preparing a database of corrected protein structures important in cell signaling pathways. Res Pharm Sci 2022; 18:67-77. [PMID: 36846730 PMCID: PMC9951780 DOI: 10.4103/1735-5362.363597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2022] [Revised: 03/06/2022] [Accepted: 11/27/2022] [Indexed: 12/25/2022] Open
Abstract
Background and purpose: Precise structures of macromolecules are important for structure-based drug design. Due to the limited resolution of some structures obtained from X-ray diffraction crystallography, differentiation between the NH and O atoms can be difficult. Sometimes a number of amino acids are missing from the protein structure. In this research, we introduce a small database that we have prepared to provide corrected 3D structure files of proteins frequently used in structure-based drug design protocols. Experimental approach: 3454 soluble proteins belonging to the cancer signaling pathways were collected from the PDB database, from which a dataset of 1001 was obtained. All were subjected to corrections in the protein preparation step. 896 protein structures out of 1001 were corrected successfully; of the remaining 105, twelve were selected for homology modeling to correct the missing residues. Three of them were subjected to molecular dynamics simulation for 30 ns. Findings / Results: The 896 corrected proteins were perfect, and homology modeling on 12 proteins with missing residues in the backbone resulted in acceptable models according to Ramachandran, z-score, and DOPE energy plots. RMSD, RMSF, and Rg values verified the stability of the models after 30 ns molecular dynamics simulation. Conclusion and implication: A collection of 1001 proteins was modified for defects such as adjustment of the bond orders and formal charges, and addition of missing side chains of residues. Homology modeling corrected the missing backbone amino acid residues. This database will be extended to many more water-soluble proteins and uploaded to the internet.
Affiliation(s)
- Samaneh Hatami
- Department of Medicinal Chemistry, School of Pharmacy and Pharmaceutical Sciences, Isfahan University of Medical Sciences, Isfahan, I.R. Iran
| | - Hajar Sirous
- Bioinformatics Research Center, School of Pharmacy and Pharmaceutical Sciences, Isfahan University of Medical Sciences, Isfahan, I.R. Iran
| | - Karim Mahnam
- Department of Biology, Faculty of Science, Shahrekord University, Shahrekord, I.R. Iran
| | - Aylar Najafipour
- Department of Medicinal Chemistry, School of Pharmacy and Pharmaceutical Sciences, Isfahan University of Medical Sciences, Isfahan, I.R. Iran
| | - Afshin Fassihi
- Department of Medicinal Chemistry, School of Pharmacy and Pharmaceutical Sciences, Isfahan University of Medical Sciences, Isfahan, I.R. Iran. Corresponding author: A. Fassihi, Tel: +98-3137927100, Fax: +98-3136680011
| |
|
34
|
Burley SK, Berman HM, Chiu W, Dai W, Flatt JW, Hudson BP, Kaelber JT, Khare SD, Kulczyk AW, Lawson CL, Pintilie GD, Sali A, Vallat B, Westbrook JD, Young JY, Zardecki C. Electron microscopy holdings of the Protein Data Bank: the impact of the resolution revolution, new validation tools, and implications for the future. Biophys Rev 2022; 14:1281-1301. [PMID: 36474933 PMCID: PMC9715422 DOI: 10.1007/s12551-022-01013-w] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 08/31/2022] [Accepted: 11/06/2022] [Indexed: 12/04/2022] Open
Abstract
As a discipline, structural biology has been transformed by the three-dimensional electron microscopy (3DEM) "Resolution Revolution" made possible by convergence of robust cryo-preservation of vitrified biological materials, sample handling systems, and measurement stages operating at liquid nitrogen temperature, improvements in electron optics that preserve phase information at the atomic level, direct electron detectors (DEDs), high-speed computing with graphics processing units, and rapid advances in data acquisition and processing software. 3DEM structure information (atomic coordinates and related metadata) is archived in the open-access Protein Data Bank (PDB), which currently holds more than 11,000 3DEM structures of proteins and nucleic acids, and their complexes with one another and small-molecule ligands (~6% of the archive). Underlying experimental data (3DEM density maps and related metadata) are stored in the Electron Microscopy Data Bank (EMDB), which currently holds more than 21,000 3DEM density maps. After describing the history of the PDB and the Worldwide Protein Data Bank (wwPDB) partnership, which jointly manages both the PDB and EMDB archives, this review examines the origins of the resolution revolution and analyzes its impact on structural biology viewed through the lens of PDB holdings. Six areas of focus exemplifying the impact of 3DEM across the biosciences are discussed in detail (icosahedral viruses, ribosomes, integral membrane proteins, SARS-CoV-2 spike proteins, cryogenic electron tomography, and integrative structure determination combining 3DEM with complementary biophysical measurement techniques), followed by a review of 3DEM structure validation by the wwPDB that underscores the importance of community engagement.
Affiliation(s)
- Stephen K. Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901 USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093 USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854 USA
| | - Helen M. Berman
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854 USA
| | - Wah Chiu
- Department of Bioengineering, Stanford University, Stanford, CA USA
- Division of CryoEM and Bioimaging, SSRL, SLAC National Accelerator Laboratory, Stanford University, Menlo Park, CA USA
| | - Wei Dai
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
- Department of Cell Biology and Neuroscience, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
| | - Justin W. Flatt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
| | - Brian P. Hudson
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
| | - Jason T. Kaelber
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
| | - Sagar D. Khare
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901 USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854 USA
| | - Arkadiusz W. Kulczyk
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
- Department of Biochemistry and Microbiology, Rutgers, The State University of New Jersey, Piscataway, NJ 08901 USA
| | - Catherine L. Lawson
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
| | | | - Andrej Sali
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA 94158 USA
| | - Brinda Vallat
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901 USA
| | - John D. Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901 USA
| | - Jasmine Y. Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
| | - Christine Zardecki
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854 USA
| |
|
35
|
Ribeiro AJM, Riziotis IG, Tyzack JD, Borkakoti N, Thornton JM. Using mechanism similarity to understand enzyme evolution. Biophys Rev 2022; 14:1273-1280. [PMID: 36659981 PMCID: PMC9842563 DOI: 10.1007/s12551-022-01022-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/14/2022] [Accepted: 11/21/2022] [Indexed: 12/04/2022] Open
Abstract
Enzyme reactions take place in the active site through a series of catalytic steps, which are collectively termed the enzyme mechanism. The catalytic step is thereby the individual unit to consider for the purposes of building new enzyme mechanisms, i.e. through the mix and match of individual catalytic steps, new enzyme mechanisms and reactions can be conceived. In the case of natural evolution, it has been shown that new enzyme functions have emerged through the tweaking of existing mechanisms by the addition, removal, or modification of some catalytic steps, while maintaining other steps of the mechanism intact. Recently, we have extracted and codified the information on the catalytic steps of hundreds of enzymes in a machine-readable way, with the aim of automating this kind of evolutionary analysis. In this paper, we illustrate how these data, which we called the "rules of enzyme catalysis", can be used to identify similar catalytic steps across enzymes that differ in their overall function and/or structural folds. A discussion on a set of three enzymes that share part of their mechanism is used as an exemplar to illustrate how this approach can reveal divergent and convergent evolution of enzymes at the mechanistic level. Supplementary Information: The online version contains supplementary material available at 10.1007/s12551-022-01022-9.
Affiliation(s)
- António J. M. Ribeiro
- European Bioinformatics Institute - European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK
| | - Ioannis G. Riziotis
- European Bioinformatics Institute - European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK
| | - Jonathan D. Tyzack
- European Bioinformatics Institute - European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK
| | - Neera Borkakoti
- European Bioinformatics Institute - European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK
| | - Janet M. Thornton
- European Bioinformatics Institute - European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK
| |
|
36
|
Burley SK, Bhikadiya C, Bi C, Bittrich S, Chao H, Chen L, Craig PA, Crichlow GV, Dalenberg K, Duarte JM, Dutta S, Fayazi M, Feng Z, Flatt JW, Ganesan SJ, Ghosh S, Goodsell DS, Green RK, Guranovic V, Henry J, Hudson BP, Khokhriakov I, Lawson CL, Liang Y, Lowe R, Peisach E, Persikova I, Piehl DW, Rose Y, Sali A, Segura J, Sekharan M, Shao C, Vallat B, Voigt M, Webb B, Westbrook JD, Whetstone S, Young JY, Zalevsky A, Zardecki C. RCSB Protein Data bank: Tools for visualizing and understanding biological macromolecules in 3D. Protein Sci 2022; 31:e4482. [PMID: 36281733 PMCID: PMC9667899 DOI: 10.1002/pro.4482] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 07/26/2022] [Revised: 10/17/2022] [Accepted: 10/19/2022] [Indexed: 12/14/2022]
Abstract
Now in its 52nd year of continuous operations, the Protein Data Bank (PDB) is the premier open-access global archive housing three-dimensional (3D) biomolecular structure data. It is jointly managed by the Worldwide Protein Data Bank (wwPDB) partnership. The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) is funded by the National Science Foundation, National Institutes of Health, and US Department of Energy and serves as the US data center for the wwPDB. RCSB PDB is also responsible for the security of PDB data in its role as wwPDB-designated Archive Keeper. Every year, RCSB PDB serves tens of thousands of depositors of 3D macromolecular structure data (coming from macromolecular crystallography, nuclear magnetic resonance spectroscopy, electron microscopy, and micro-electron diffraction). The RCSB PDB research-focused web portal (RCSB.org) makes PDB data available at no charge and without usage restrictions to many millions of PDB data consumers around the world. The RCSB PDB training, outreach, and education web portal (PDB101.RCSB.org) serves nearly 700,000 educators, students, and members of the public worldwide. This invited Tools Issue contribution describes how RCSB PDB (i) is organized; (ii) works with wwPDB partners to process new depositions; (iii) serves as the wwPDB-designated Archive Keeper; (iv) enables exploration and 3D visualization of PDB data via RCSB.org; and (v) supports training, outreach, and education via PDB101.RCSB.org. New tools and features at RCSB.org are presented using examples drawn from high-resolution structural studies of proteins relevant to treatment of human cancers by targeting immune checkpoints.
Affiliation(s)
- Stephen K. Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Charmi Bhikadiya
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Chunxiao Bi
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
| | - Sebastian Bittrich
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
| | - Henry Chao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Li Chen
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Paul A. Craig
- School of Chemistry and Materials Science, Rochester Institute of Technology, Rochester, New York, USA
| | - Gregg V. Crichlow
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Kenneth Dalenberg
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Jose M. Duarte
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
| | - Shuchismita Dutta
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
| | - Maryam Fayazi
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Zukang Feng
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Justin W. Flatt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Sai J. Ganesan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
| | - Sutapa Ghosh
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - David S. Goodsell
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, California, USA
| | - Rachel Kramer Green
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Vladimir Guranovic
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Jeremy Henry
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
| | - Brian P. Hudson
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Igor Khokhriakov
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
| | - Catherine L. Lawson
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Yuhe Liang
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Robert Lowe
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Irina Persikova
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Dennis W. Piehl
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Yana Rose
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
| | - Andrej Sali
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
| | - Joan Segura
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, California, USA
| | - Monica Sekharan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Chenghua Shao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Brinda Vallat
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Maria Voigt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Benjamin Webb
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
| | - John D. Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Shamara Whetstone
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Jasmine Y. Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| | - Arthur Zalevsky
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California, San Francisco, California, USA
| | - Christine Zardecki
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, New Jersey, USA
| |
|
37
|
Varadi M, Nair S, Sillitoe I, Tauriello G, Anyango S, Bienert S, Borges C, Deshpande M, Green T, Hassabis D, Hatos A, Hegedus T, Hekkelman ML, Joosten R, Jumper J, Laydon A, Molodenskiy D, Piovesan D, Salladini E, Salzberg SL, Sommer MJ, Steinegger M, Suhajda E, Svergun D, Tenorio-Ku L, Tosatto S, Tunyasuvunakool K, Waterhouse AM, Žídek A, Schwede T, Orengo C, Velankar S. 3D-Beacons: decreasing the gap between protein sequences and structures through a federated network of protein structure data resources. Gigascience 2022; 11:giac118. [PMID: 36448847 PMCID: PMC9709962 DOI: 10.1093/gigascience/giac118] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 08/11/2022] [Revised: 09/20/2022] [Accepted: 11/11/2022] [Indexed: 12/02/2022] Open
Abstract
While scientists can often infer the biological function of proteins from their 3-dimensional quaternary structures, the gap between the number of known protein sequences and their experimentally determined structures keeps increasing. A potential solution to this problem is presented by ever more sophisticated computational protein modeling approaches. While often powerful on their own, most methods have strengths and weaknesses. Therefore, it benefits researchers to examine models from various model providers and perform comparative analysis to identify what models can best address their specific use cases. To make data from a large array of model providers more easily accessible to the broader scientific community, we established 3D-Beacons, a collaborative initiative to create a federated network with unified data access mechanisms. The 3D-Beacons Network allows researchers to collate coordinate files and metadata for experimentally determined and theoretical protein models from state-of-the-art and specialist model providers and also from the Protein Data Bank.
Affiliation(s)
- Mihaly Varadi
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SA, UK
| | - Sreenath Nair
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SA, UK
| | - Ian Sillitoe
- Department of Structural and Molecular Biology, UCL, London WC1E 6BT, UK
| | - Gerardo Tauriello
- Biozentrum, University of Basel, Basel 4056, Switzerland
- Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Stephen Anyango
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SA, UK
| | - Stefan Bienert
- Biozentrum, University of Basel, Basel 4056, Switzerland
- Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Clemente Borges
- Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
- European Molecular Biology Laboratory, EMBL Hamburg, Hamburg 69117, Germany
| | - Mandar Deshpande
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SA, UK
| | | | | | - Andras Hatos
- Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
- Department of Oncology, Lausanne University Hospital, Lausanne 1015, Switzerland
- Department of Computational Biology, University of Lausanne, Lausanne 1015, Switzerland
- Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Swiss Cancer Center Leman, Lausanne 1005, Switzerland
| | - Tamas Hegedus
- Department of Biophysics and Radiation Biology, Semmelweis University, Budapest 1094, Hungary
| | | | - Robbie Joosten
- Netherlands Cancer Institute, Amsterdam 1066 CX, The Netherlands
| | | | | | - Dmitry Molodenskiy
- Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
- European Molecular Biology Laboratory, EMBL Hamburg, Hamburg 69117, Germany
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
| | - Edoardo Salladini
- Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
| | - Steven L Salzberg
- Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21205, USA
| | - Markus J Sommer
- Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21205, USA
| | - Martin Steinegger
- School of Biology, Seoul National University, Seoul 82-2-880-6971, 6977, South Korea
| | - Erzsebet Suhajda
- Department of Biophysics and Radiation Biology, Semmelweis University, Budapest 1094, Hungary
| | - Dmitri Svergun
- Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
- European Molecular Biology Laboratory, EMBL Hamburg, Hamburg 69117, Germany
| | - Luiggi Tenorio-Ku
- Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
| | - Silvio Tosatto
- Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
| | | | - Andrew Mark Waterhouse
- Biozentrum, University of Basel, Basel 4056, Switzerland
- Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | | | - Torsten Schwede
- Biozentrum, University of Basel, Basel 4056, Switzerland
- Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Christine Orengo
- Department of Structural and Molecular Biology, UCL, London WC1E 6BT, UK
| | - Sameer Velankar
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SA, UK
| |
|
38
|
Burley SK, Berman HM, Duarte JM, Feng Z, Flatt JW, Hudson BP, Lowe R, Peisach E, Piehl DW, Rose Y, Sali A, Sekharan M, Shao C, Vallat B, Voigt M, Westbrook JD, Young JY, Zardecki C. Protein Data Bank: A Comprehensive Review of 3D Structure Holdings and Worldwide Utilization by Researchers, Educators, and Students. Biomolecules 2022; 12:1425. [PMID: 36291635 PMCID: PMC9599165 DOI: 10.3390/biom12101425] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 08/30/2022] [Revised: 09/23/2022] [Accepted: 09/26/2022] [Indexed: 11/18/2022] Open
Abstract
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), funded by the United States National Science Foundation, National Institutes of Health, and Department of Energy, supports structural biologists and Protein Data Bank (PDB) data users around the world. The RCSB PDB, a founding member of the Worldwide Protein Data Bank (wwPDB) partnership, serves as the US data center for the global PDB archive housing experimentally-determined three-dimensional (3D) structure data for biological macromolecules. As the wwPDB-designated Archive Keeper, RCSB PDB is also responsible for the security of PDB data and weekly update of the archive. RCSB PDB serves tens of thousands of data depositors (using macromolecular crystallography, nuclear magnetic resonance spectroscopy, electron microscopy, and micro-electron diffraction) annually working on all permanently inhabited continents. RCSB PDB makes PDB data available from its research-focused web portal at no charge and without usage restrictions to many millions of PDB data consumers around the globe. It also provides educators, students, and the general public with an introduction to the PDB and related training materials through its outreach and education-focused web portal. This review article describes growth of the PDB, examines evolution of experimental methods for structure determination viewed through the lens of the PDB archive, and provides a detailed accounting of PDB archival holdings and their utilization by researchers, educators, and students worldwide.
Affiliation(s)
- Stephen K. Burley
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Helen M. Berman
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Jose M. Duarte
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Zukang Feng
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Justin W. Flatt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Brian P. Hudson
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Robert Lowe
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Ezra Peisach
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Dennis W. Piehl
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Yana Rose
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, USA
| | - Andrej Sali
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA 94158, USA
| | - Monica Sekharan
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Chenghua Shao
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Brinda Vallat
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
| | - Maria Voigt
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - John D. Westbrook
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
| | - Jasmine Y. Young
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Christine Zardecki
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| |
|
39
|
Varadi M, Anyango S, Appasamy SD, Armstrong D, Bage M, Berrisford J, Choudhary P, Bertoni D, Deshpande M, Leines GD, Ellaway J, Evans G, Gaborova R, Gupta D, Gutmanas A, Harrus D, Kleywegt GJ, Bueno WM, Nadzirin N, Nair S, Pravda L, Afonso MQL, Sehnal D, Tanweer A, Tolchard J, Abrams C, Dunlop R, Velankar S. PDBe and PDBe-KB: Providing high-quality, up-to-date and integrated resources of macromolecular structures to support basic and applied research and education. Protein Sci 2022; 31:e4439. [PMID: 36173162 PMCID: PMC9517934 DOI: 10.1002/pro.4439] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 08/05/2022] [Revised: 09/02/2022] [Accepted: 09/05/2022] [Indexed: 11/26/2022]
Abstract
The archiving and dissemination of protein and nucleic acid structures as well as their structural, functional and biophysical annotations is an essential task that enables the broader scientific community to conduct impactful research in multiple fields of the life sciences. The Protein Data Bank in Europe (PDBe; pdbe.org) team develops and maintains several databases and web services to address this fundamental need. From data archiving as a member of the Worldwide PDB consortium (wwPDB; wwpdb.org), to the PDBe Knowledge Base (PDBe-KB; pdbekb.org), we provide data, data-access mechanisms, and visualizations that facilitate basic and applied research and education across the life sciences. Here, we provide an overview of the structural data and annotations that we integrate and make freely available. We describe the web services and data visualization tools we offer, and provide information on how to effectively use or even further develop them. Finally, we discuss the direction of our data services, and how we aim to tackle new challenges that arise from the recent, unprecedented advances in the field of structure determination and protein structure modeling.
Affiliation(s)
- Mihaly Varadi
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Stephen Anyango
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Sri Devan Appasamy
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - David Armstrong
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Marcus Bage
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - John Berrisford
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Preeti Choudhary
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Damian Bertoni
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Mandar Deshpande
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Grisell Diaz Leines
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Joseph Ellaway
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Genevieve Evans
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Romana Gaborova
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Deepti Gupta
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Aleksandras Gutmanas
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Deborah Harrus
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Gerard J Kleywegt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | | | - Nurul Nadzirin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Sreenath Nair
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Lukas Pravda
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | | | - David Sehnal
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
- CEITEC - Central European Institute of Technology, Masaryk University, Brno, Czech Republic
- National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Brno, Czech Republic
| | - Ahsan Tanweer
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - James Tolchard
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Charlotte Abrams
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Roisin Dunlop
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Sameer Velankar
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| |
|
40
|
LeDuc RD, Deutsch EW, Binz PA, Fellers RT, Cesnik AJ, Klein JA, Van Den Bossche T, Gabriels R, Yalavarthi A, Perez-Riverol Y, Carver J, Bittremieux W, Kawano S, Pullman B, Bandeira N, Kelleher NL, Thomas PM, Vizcaíno JA. Proteomics Standards Initiative's ProForma 2.0: Unifying the Encoding of Proteoforms and Peptidoforms. J Proteome Res 2022; 21:1189-1195. [PMID: 35290070 PMCID: PMC7612572 DOI: 10.1021/acs.jproteome.1c00771] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Indexed: 11/30/2022]
Abstract
It is important for the proteomics community to have a standardized manner to represent all possible variations of a protein or peptide primary sequence, including natural, chemically induced, and artifactual modifications. The Human Proteome Organization Proteomics Standards Initiative in collaboration with several members of the Consortium for Top-Down Proteomics (CTDP) has developed a standard notation called ProForma 2.0, which is a substantial extension of the original ProForma notation developed by the CTDP. ProForma 2.0 aims to unify the representation of proteoforms and peptidoforms. ProForma 2.0 supports use cases needed for bottom-up and middle-/top-down proteomics approaches and allows the encoding of highly modified proteins and peptides using a human- and machine-readable string. ProForma 2.0 can be used to represent protein modifications in a specified or ambiguous location, designated by mass shifts, chemical formulas, or controlled vocabulary terms, including cross-links (natural and chemical) and atomic isotopes. Notational conventions are based on public controlled vocabularies and ontologies. The most up-to-date full specification document and information about software implementations are available at http://psidev.info/proforma.
Affiliation(s)
- Richard D LeDuc
- National Resource for Translational and Developmental Proteomics, Northwestern University, Evanston, Illinois 60611, United States
| | - Eric W Deutsch
- Institute for Systems Biology, Seattle, Washington 98109, United States
| | - Pierre-Alain Binz
- Clinical Chemistry Service, Lausanne University Hospital, 1011 Lausanne, Switzerland
| | - Ryan T Fellers
- National Resource for Translational and Developmental Proteomics, Northwestern University, Evanston, Illinois 60611, United States
| | - Anthony J Cesnik
- Department of Genetics, Stanford University, Stanford, California 94305, United States
- Chan Zuckerberg Biohub, 499 Illinois Street, San Francisco, California 94158, United States
- SciLifeLab, School of Engineering Sciences in Chemistry Biotechnology and Health, KTH-Royal Institute of Technology, SE-171 21 Solna, Stockholm, Sweden 113 51
| | - Joshua A Klein
- Program for Bioinformatics, Boston University, Boston, Massachusetts 02215, United States
| | - Tim Van Den Bossche
- VIB-UGent Center for Medical Biotechnology, VIB, Technologiepark 75-FSVM II, 9052 Ghent, Belgium
- Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, 9000 Ghent, Belgium
| | - Ralf Gabriels
- VIB-UGent Center for Medical Biotechnology, VIB, Technologiepark 75-FSVM II, 9052 Ghent, Belgium
- Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, 9000 Ghent, Belgium
| | - Arshika Yalavarthi
- National Resource for Translational and Developmental Proteomics, Northwestern University, Evanston, Illinois 60611, United States
| | - Yasset Perez-Riverol
- European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge CB10 1SD, United Kingdom
| | | | | | - Shin Kawano
- Toyama University of International Studies, Toyama, 930-1292 Toyama, Higashikuromaki, 6 5-1, Japan
- Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Kashiwa, Chiba 277-0871, Japan
| | | | | | - Neil L Kelleher
- National Resource for Translational and Developmental Proteomics, Northwestern University, Evanston, Illinois 60611, United States
| | - Paul M Thomas
- National Resource for Translational and Developmental Proteomics, Northwestern University, Evanston, Illinois 60611, United States
| | - Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge CB10 1SD, United Kingdom
| |
|
41
|
Goodsell DS, Burley SK. RCSB Protein Data Bank resources for structure-facilitated design of mRNA vaccines for existing and emerging viral pathogens. Structure 2022; 30:55-68.e2. [PMID: 34739839 PMCID: PMC8567414 DOI: 10.1016/j.str.2021.10.008] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 07/23/2021] [Revised: 09/17/2021] [Accepted: 10/14/2021] [Indexed: 01/11/2023]
Abstract
Structural biologists provide direct insights into the molecular bases of human health and disease. The open-access Protein Data Bank (PDB) stores and delivers three-dimensional (3D) biostructure data that facilitate discovery and development of therapeutic agents and diagnostic tools. We are in the midst of a revolution in vaccinology. Non-infectious mRNA vaccines have been proven during the coronavirus disease 2019 (COVID-19) pandemic. This new technology underpins nimble discovery and clinical development platforms that use knowledge of 3D viral protein structures for societal benefit. The RCSB PDB supports vaccine designers through expert biocuration and rigorous validation of 3D structures; open-access dissemination of structure information; and search, visualization, and analysis tools for structure-guided design efforts. This resource article examines the structural biology underpinning the success of severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) mRNA vaccines and enumerates some of the many protein structures in the PDB archive that could guide design of new countermeasures against existing and emerging viral pathogens.
Affiliation(s)
- David S Goodsell
- RCSB Protein Data Bank and Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08903, USA; Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA

- Stephen K Burley
- RCSB Protein Data Bank and Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08903, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, CA 92093, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
|
42
|
Dey S, Prilusky J, Levy ED. QSalignWeb: A Server to Predict and Analyze Protein Quaternary Structure. Front Mol Biosci 2022; 8:787510. [PMID: 35071324 PMCID: PMC8769216 DOI: 10.3389/fmolb.2021.787510] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2021] [Accepted: 12/02/2021] [Indexed: 11/16/2022] [Open Access]
Abstract
The identification of physiologically relevant quaternary structures (QSs) in crystal lattices is challenging. To predict the physiological relevance of a particular QS, QSalign searches for homologous structures in which subunits interact in the same geometry. This approach proved accurate but was limited to structures already present in the Protein Data Bank (PDB). Here, we introduce a webserver (www.QSalign.org) allowing users to submit homo-oligomeric structures of their choice to the QSalign pipeline. Given a user-uploaded structure, the sequence is extracted and used to search homologs based on sequence similarity and PFAM domain architecture. If structural conservation is detected between a homolog and the user-uploaded QS, physiological relevance is inferred. The web server also generates alternative QSs with PISA and processes them the same way as the query submitted to widen the predictions. The result page also shows representative QSs in the protein family of the query, which is informative if no QS conservation was detected or if the protein appears monomeric. These representative QSs can also serve as a starting point for homology modeling.
Affiliation(s)
- Sucharita Dey
- Department of Chemical and Structural Biology, Weizmann Institute of Science, Rehovot, Israel

- Jaime Prilusky
- Department of Life Sciences and Core Facilities, Weizmann Institute of Science, Rehovot, Israel

- Emmanuel D. Levy
- Department of Chemical and Structural Biology, Weizmann Institute of Science, Rehovot, Israel
|
43
|
Waman VP, Orengo C, Kleywegt GJ, Lesk AM. Three-dimensional Structure Databases of Biological Macromolecules. Methods Mol Biol 2022; 2449:43-91. [PMID: 35507259 DOI: 10.1007/978-1-0716-2095-3_3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/14/2023]
Abstract
Databases of three-dimensional structures of proteins (and their associated molecules) provide: (a) Curated repositories of coordinates of experimentally determined structures, including extensive metadata; for instance information about provenance, details about data collection and interpretation, and validation of results. (b) Information-retrieval tools to allow searching to identify entries of interest and provide access to them. (c) Links among databases, especially to databases of amino-acid and genetic sequences, and of protein function; and links to software for analysis of amino-acid sequence and protein structure, and for structure prediction. (d) Collections of predicted three-dimensional structures of proteins. These will become more and more important after the breakthrough in structure prediction achieved by AlphaFold2. The single global archive of experimentally determined biomacromolecular structures is the Protein Data Bank (PDB). It is managed by wwPDB, a consortium of five partner institutions: the Protein Data Bank in Europe (PDBe), the Research Collaboratory for Structural Bioinformatics (RCSB), the Protein Data Bank Japan (PDBj), the BioMagResBank (BMRB), and the Electron Microscopy Data Bank (EMDB). In addition to jointly managing the PDB repository, the individual wwPDB partners offer many tools for analysis of protein and nucleic acid structures and their complexes, including providing computer-graphic representations. Their collective and individual websites serve as hubs of the community of structural biologists, offering newsletters, reports from Task Forces, training courses, and "helpdesks," as well as links to external software. Many specialized projects are based on the information contained in the PDB. Especially important are SCOP, CATH, and ECOD, which present classifications of protein domains.
Affiliation(s)
- Vaishali P Waman
- Institute of Structural and Molecular Biology, University College London, London, UK

- Christine Orengo
- Institute of Structural and Molecular Biology, University College London, London, UK

- Gerard J Kleywegt
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK

- Arthur M Lesk
- Department of Biochemistry and Molecular Biology and Center for Computational Biology and Bioinformatics, The Pennsylvania State University, University Park, PA, USA
|
44
|
Perez MAS, Cuendet MA, Röhrig UF, Michielin O, Zoete V. Structural Prediction of Peptide-MHC Binding Modes. Methods Mol Biol 2022; 2405:245-282. [PMID: 35298818 DOI: 10.1007/978-1-0716-1855-4_13] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 06/14/2023]
Abstract
The immune system is constantly protecting its host from the invasion of pathogens and the development of cancer cells. The specific CD8+ T-cell immune response against virus-infected cells and tumor cells is based on the T-cell receptor recognition of antigenic peptides bound to class I major histocompatibility complexes (MHC) at the surface of antigen presenting cells. Consequently, the peptide binding specificities of the highly polymorphic MHC have important implications for the design of vaccines, for the treatment of autoimmune diseases, and for personalized cancer immunotherapy. Evidence-based machine-learning approaches have been successfully used for the prediction of peptide binders and are currently being developed for the prediction of peptide immunogenicity. However, understanding and modeling the structural details of peptide/MHC binding is crucial for a better understanding of the molecular mechanisms triggering the immunological processes, estimating peptide/MHC affinity using universal physics-based approaches, and driving the design of novel peptide ligands. Unfortunately, due to the large diversity of MHC allotypes and possible peptides, the growing number of 3D structures of peptide/MHC (pMHC) complexes in the Protein Data Bank only covers a small fraction of the possibilities. Consequently, there is a growing need for rapid and efficient approaches to predict 3D structures of pMHC complexes. Here, we review the key characteristics of the 3D structure of pMHC complexes before listing databases and other sources of information on pMHC structures and MHC specificities. Finally, we discuss some of the most prominent pMHC docking software.
Affiliation(s)
- Marta A S Perez
- Computer-aided Molecular Engineering Group, Department of Oncology UNIL-CHUV, Lausanne University, Lausanne, Switzerland
- Ludwig Institute for Cancer Research, Lausanne, Switzerland
- Molecular Modelling Group, SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland

- Michel A Cuendet
- Molecular Modelling Group, SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Oncology Department, Centre Hospitalier Universitaire Vaudois (CHUV), Precision Oncology Center, Lausanne, Switzerland

- Ute F Röhrig
- Molecular Modelling Group, SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland

- Olivier Michielin
- Molecular Modelling Group, SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Oncology Department, Centre Hospitalier Universitaire Vaudois (CHUV), Precision Oncology Center, Lausanne, Switzerland

- Vincent Zoete
- Computer-aided Molecular Engineering Group, Department of Oncology UNIL-CHUV, Lausanne University, Lausanne, Switzerland
- Ludwig Institute for Cancer Research, Lausanne, Switzerland
- Molecular Modelling Group, SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
|
45
|
Hunt SE, Moore B, Amode RM, Armean IM, Lemos D, Mushtaq A, Parton A, Schuilenburg H, Szpak M, Thormann A, Perry E, Trevanion SJ, Flicek P, Yates AD, Cunningham F. Annotating and prioritizing genomic variants using the Ensembl Variant Effect Predictor-A tutorial. Hum Mutat 2021; 43:986-997. [PMID: 34816521 PMCID: PMC7613081 DOI: 10.1002/humu.24298] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 05/03/2021] [Revised: 11/02/2021] [Accepted: 11/14/2021] [Indexed: 11/05/2022]
Abstract
The Ensembl Variant Effect Predictor (VEP) is a freely available, open-source tool for the annotation and filtering of genomic variants. It predicts variant molecular consequences using the Ensembl/GENCODE or RefSeq gene sets. It also reports phenotype associations from databases such as ClinVar, allele frequencies from studies including gnomAD, and predictions of deleteriousness from tools such as Sorting Intolerant From Tolerant and Combined Annotation Dependent Depletion. Ensembl VEP includes filtering options to customize variant prioritization. It is well supported and updated roughly quarterly to incorporate the latest gene, variant, and phenotype association information. Ensembl VEP analysis can be performed using a highly configurable, extensible command-line tool, a Representational State Transfer application programming interface, and a user-friendly web interface. These access methods are designed to suit different levels of bioinformatics experience and meet different needs in terms of data size, visualization, and flexibility. In this tutorial, we will describe performing variant annotation using the Ensembl VEP web tool, which enables sophisticated analysis through a simple interface.
Affiliation(s)
- Sarah E Hunt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK

- Benjamin Moore
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK

- Ridwan M Amode
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK

- Irina M Armean
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK

- Diana Lemos
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK

- Aleena Mushtaq
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK

- Andrew Parton
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK

- Helen Schuilenburg
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK

- Michał Szpak
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK

- Anja Thormann
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK

- Emily Perry
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK

- Stephen J Trevanion
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK

- Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK

- Andrew D Yates
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK

- Fiona Cunningham
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
|
46
|
Kooistra AJ, Munk C, Hauser AS, Gloriam DE. An online GPCR structure analysis platform. Nat Struct Mol Biol 2021; 28:875-878. [PMID: 34759374 DOI: 10.1038/s41594-021-00675-6] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 03/23/2021] [Accepted: 09/22/2021] [Indexed: 11/09/2022]
Abstract
We present an online, interactive platform for comparative analysis of all available G-protein coupled receptor (GPCR) structures while correlating to functional data. The comprehensive platform encompasses structure similarity, secondary structure, protein backbone packing and movement, residue-residue contact networks, amino acid properties and prospective design of experimental mutagenesis studies. This lets any researcher tap the potential of sophisticated structural analyses enabling a plethora of basic and applied receptor research studies.
Affiliation(s)
- Albert J Kooistra
- Department of Drug Design and Pharmacology, University of Copenhagen, Copenhagen, Denmark

- Christian Munk
- Department of Drug Design and Pharmacology, University of Copenhagen, Copenhagen, Denmark; Data Tools Department, Novozymes A/S, Copenhagen, Denmark

- Alexander S Hauser
- Department of Drug Design and Pharmacology, University of Copenhagen, Copenhagen, Denmark

- David E Gloriam
- Department of Drug Design and Pharmacology, University of Copenhagen, Copenhagen, Denmark
|
47
|
PDB-wide identification of physiological hetero-oligomeric assemblies based on conserved quaternary structure geometry. Structure 2021; 29:1303-1311.e3. [PMID: 34520740 PMCID: PMC8575123 DOI: 10.1016/j.str.2021.07.012] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 09/15/2020] [Revised: 03/22/2021] [Accepted: 07/23/2021] [Indexed: 11/21/2022]
Abstract
An accurate understanding of biomolecular mechanisms and diseases requires information on protein quaternary structure (QS). A critical challenge in inferring QS information from crystallography data is distinguishing biological interfaces from fortuitous crystal-packing contacts. Here, we employ QS conservation across homologs to infer the biological relevance of hetero-oligomers. We compare the structures and compositions of hetero-oligomers, which allow us to annotate 7,810 complexes as physiologically relevant, 1,060 as likely errors, and 1,432 with comparative information on subunit stoichiometry and composition. Excluding immunoglobulins, these annotations encompass over 51% of hetero-oligomers in the PDB. We curate a dataset of 577 hetero-oligomeric complexes to benchmark these annotations, which reveals an accuracy >94%. When homology information is not available, we compare QS across repositories (PDB, PISA, and EPPIC) to derive confidence estimates. This work provides high-quality annotations along with a large benchmark dataset of hetero-assemblies.
|
48
|
Bindu A, Lakshmidevi N. In vitro and in silico approach for characterization of antimicrobial peptides from potential probiotic cultures against Staphylococcus aureus and Escherichia coli. World J Microbiol Biotechnol 2021; 37:172. [PMID: 34518944 DOI: 10.1007/s11274-021-03135-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2021] [Accepted: 08/24/2021] [Indexed: 11/27/2022]
Abstract
The focus of the present study was to characterize the antimicrobial peptides produced by potential probiotic cultures of Enterococcus durans DB-1aa (MCC4243), Lactiplantibacillus plantarum Cu2-PM7 (MCC4246) and Limosilactobacillus fermentum Cu3-PM8 (MCC4233) against Staphylococcus aureus MTCC 96 and Escherichia coli MTCC 118. The growth kinetic assay revealed 24 h of incubation to be optimum for bacteriocin production. The partially purified compound from all three selected cultures after ion-exchange chromatography was found to be thermoresistant and stable over a wide range of pH. The compound was sensitive to proteinase K but resistant to trypsin, α-amylase and lipase. Comparatively, bacteriocins from L. fermentum Cu3-PM8 and L. plantarum Cu2-PM7 showed higher stability under the studied parameters and hence were taken up for further investigation. The apparent molecular weight of the bacteriocins from L. fermentum MCC4233 and L. plantarum MCC4246 was found to be 3.5 kDa. Further, the plantaricin gene from MCC4246 was characterized in silico. The translated partial amino acid sequence of the plnA gene in MCC4246 displayed 48 amino acids showing 100 % similarity with plantaricin A of Lactobacillus plantarum (WP_0036419). The sequence revealed 7 β sheets, 6 α sheets, 6 predicted coils and 9 predicted turns. The predicted properties of the peptide included an isoelectric point of 10.82 and a hydrophobicity of 48.6 %. The molecular approach of using Geneious Prime software and a protein prediction database for characterization of bacteriocin is novel and predicts "KSSAYSLQMGATAIKQVKKLFKKWGW" to be a peptide responsible for antimicrobial activity. The study provides information about a broad-spectrum bacteriocin in a native probiotic culture and paves the way towards its application in functional foods as a biopreservative agent.
Affiliation(s)
- Amrutha Bindu
- DOS in Microbiology, University of Mysore, Manasa Gangothri, Mysore, 570005, India

- N Lakshmidevi
- DOS in Microbiology, University of Mysore, Manasa Gangothri, Mysore, 570005, India
|
49
|
van Ginkel G, Pravda L, Dana JM, Varadi M, Keller P, Anyango S, Velankar S. PDBeCIF: an open-source mmCIF/CIF parsing and processing package. BMC Bioinformatics 2021; 22:383. [PMID: 34301175 PMCID: PMC8299628 DOI: 10.1186/s12859-021-04271-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2021] [Accepted: 06/15/2021] [Indexed: 11/26/2022] [Open Access]
Abstract
Background: Biomacromolecular structural data outgrew the legacy Protein Data Bank (PDB) format which the scientific community relied on for decades, yet the use of its successor PDBx/Macromolecular Crystallographic Information File format (PDBx/mmCIF) is still not widespread. Perhaps one of the reasons is the availability of easy to use tools that only support the legacy format, but also the inherent difficulties of processing mmCIF files correctly, given the number of edge cases that make efficient parsing problematic. Nevertheless, to fully exploit macromolecular structure data and their associated annotations such as multiscale structures from integrative/hybrid methods or large macromolecular complexes determined using traditional methods, it is necessary to fully adopt the new format as soon as possible. Results: To this end, we developed PDBeCIF, an open-source Python project for manipulating mmCIF and CIF files. It is part of the official list of mmCIF parsers recorded by the wwPDB and is heavily employed in the processes of the Protein Data Bank in Europe. The package is freely available both from the PyPI repository (http://pypi.org/project/pdbecif) and from GitHub (https://github.com/pdbeurope/pdbecif) along with rich documentation and many ready-to-use examples. Conclusions: PDBeCIF is an efficient and lightweight Python 2.6+/3+ package with no external dependencies. It can be readily integrated with 3rd party libraries as well as adopted for broad scientific analyses. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-021-04271-9.
Affiliation(s)
- Glen van Ginkel
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK

- Lukáš Pravda
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK

- José M Dana
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK

- Mihaly Varadi
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK

- Peter Keller
- Global Phasing Ltd., Sheraton House, Castle Park, Cambridge, CB3 0AX, UK

- Stephen Anyango
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK

- Sameer Velankar
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
|
50
|
PDBrenum: A webserver and program providing Protein Data Bank files renumbered according to their UniProt sequences. PLoS One 2021; 16:e0253411. [PMID: 34228733 PMCID: PMC8259974 DOI: 10.1371/journal.pone.0253411] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 05/05/2021] [Accepted: 06/05/2021] [Indexed: 11/19/2022] [Open Access]
Abstract
The Protein Data Bank (PDB) was established at Brookhaven National Laboratories in 1971 as an archive for biological macromolecular crystal structures. In mid 2021, the database has almost 180,000 structures solved by X-ray crystallography, nuclear magnetic resonance, cryo-electron microscopy, and other methods. Many proteins have been studied under different conditions, including binding partners such as ligands, nucleic acids, or other proteins; mutations, and post-translational modifications, thus enabling extensive comparative structure-function studies. However, these studies are made more difficult because authors are allowed by the PDB to number the amino acids in each protein sequence in any manner they wish. This results in the same protein being numbered differently in the available PDB entries. For instance, some authors may include N-terminal signal peptides or the N-terminal methionine in the sequence numbering and others may not. In addition to the coordinates, there are many fields that contain structural and functional information regarding specific residues numbered according to the author. Here we provide a webserver and Python3 application that fixes the PDB sequence numbering problem by replacing the author numbering with numbering derived from the corresponding UniProt sequences. We obtain this correspondence from the SIFTS database from PDBe. The server and program can take a list of PDB entries or a list of UniProt identifiers (e.g., "P04637" or "P53_HUMAN") and provide renumbered files in mmCIF format and the legacy PDB format for both asymmetric unit files and biological assembly files provided by PDBe.
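The renumbering idea in this abstract can be illustrated with a minimal Python sketch. It is not PDBrenum's code and does not read SIFTS files. It assumes the author-to-UniProt correspondence has already been reduced to a list of aligned segments, and the `Segment` layout, the function names, and the residue numbers are all hypothetical. Residues with no UniProt counterpart are marked `None` here; the abstract does not say how the real tool treats them.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical aligned segment for one chain:
# (author_start, author_end, uniprot_start), both ranges inclusive.
Segment = Tuple[int, int, int]


def build_number_map(segments: List[Segment]) -> Dict[int, int]:
    """Expand aligned segments into an author-number -> UniProt-number map."""
    mapping: Dict[int, int] = {}
    for author_start, author_end, uniprot_start in segments:
        for offset, author_num in enumerate(range(author_start, author_end + 1)):
            mapping[author_num] = uniprot_start + offset
    return mapping


def renumber(
    residues: List[Tuple[str, int]], mapping: Dict[int, int]
) -> List[Tuple[str, Optional[int]]]:
    """Swap author numbers for UniProt numbers; unmapped residues get None."""
    return [(name, mapping.get(author_num)) for name, author_num in residues]


# Made-up example: author residues 1..3 align to UniProt positions 94..96,
# and residue 0 (e.g. an expression-tag residue) has no UniProt counterpart.
segments = [(1, 3, 94)]
residues = [("GLY", 0), ("SER", 1), ("SER", 2), ("VAL", 3)]
renumbered = renumber(residues, build_number_map(segments))
print(renumbered)
```

Keeping the mapping as an explicit dictionary makes gaps visible: any author number absent from the segments is never silently shifted.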
|