1. Crauwels C, Heidig SL, Díaz A, Vranken WF. Large-scale structure-informed multiple sequence alignment of proteins with SIMSApiper. Bioinformatics 2024; 40:btae276. [PMID: 38648741 PMCID: PMC11099654 DOI: 10.1093/bioinformatics/btae276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Revised: 03/20/2024] [Accepted: 04/18/2024] [Indexed: 04/25/2024] Open
Abstract
SUMMARY: SIMSApiper is a Nextflow pipeline that creates reliable, structure-informed MSAs of thousands of protein sequences faster than standard structure-based alignment methods. Structural information can be provided by the user or collected by the pipeline from online resources. Parallelization with sequence identity-based subsets can be activated to significantly speed up the alignment process. Finally, the number of gaps in the final alignment can be reduced by leveraging the position of conserved secondary structure elements. AVAILABILITY AND IMPLEMENTATION: The pipeline is implemented using Nextflow, Python3, and Bash. It is publicly available at github.com/Bio2Byte/simsapiper.
Affiliation(s)
- Charlotte Crauwels
  - Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels, 1050, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, 1050, Belgium
  - AI Lab, Vrije Universiteit Brussel, Brussels, 1050, Belgium
- Sophie-Luise Heidig
  - Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels, 1050, Belgium
  - AI Lab, Vrije Universiteit Brussel, Brussels, 1050, Belgium
  - Evolutionary Biology & Ecology, Université libre de Bruxelles, Brussels, 1050, Belgium
- Adrián Díaz
  - Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels, 1050, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, 1050, Belgium
  - AI Lab, Vrije Universiteit Brussel, Brussels, 1050, Belgium
- Wim F Vranken
  - Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels, 1050, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, 1050, Belgium
  - AI Lab, Vrije Universiteit Brussel, Brussels, 1050, Belgium
2. Dolcemascolo R, Heras-Hernández M, Goiriz L, Montagud-Martínez R, Requena-Menéndez A, Ruiz R, Pérez-Ràfols A, Higuera-Rodríguez RA, Pérez-Ropero G, Vranken WF, Martelli T, Kaiser W, Buijs J, Rodrigo G. Repurposing the mammalian RNA-binding protein Musashi-1 as an allosteric translation repressor in bacteria. eLife 2024; 12:RP91777. [PMID: 38363283 PMCID: PMC10942595 DOI: 10.7554/elife.91777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/17/2024] Open
Abstract
The RNA recognition motif (RRM) is the most common RNA-binding protein domain identified in nature. However, RRM-containing proteins are only prevalent in eukaryotic phyla, in which they play central regulatory roles. Here, we engineered an orthogonal post-transcriptional control system of gene expression in the bacterium Escherichia coli with the mammalian RNA-binding protein Musashi-1, which is a stem cell marker with a neurodevelopmental role that contains two canonical RRMs. In the circuit, Musashi-1 is regulated transcriptionally and works as an allosteric translation repressor thanks to a specific interaction with the N-terminal coding region of a messenger RNA and its structural plasticity to respond to fatty acids. We fully characterized the genetic system at the population and single-cell levels, showing a significant fold change in reporter expression, and the underlying molecular mechanism by assessing the in vitro binding kinetics and in vivo functionality of a series of RNA mutants. The dynamic response of the system was well recapitulated by a bottom-up mathematical model. Moreover, we applied the post-transcriptional mechanism engineered with Musashi-1 to specifically regulate a gene within an operon, implement combinatorial regulation, and reduce protein expression noise. This work illustrates how RRM-based regulation can be adapted to simple organisms, thereby adding a new regulatory layer in prokaryotes for translation control.
Affiliation(s)
- Roswitha Dolcemascolo
  - Institute for Integrative Systems Biology (I2SysBio), CSIC – University of Valencia, Paterna, Spain
  - Department of Biotechnology, Polytechnic University of Valencia, Valencia, Spain
- María Heras-Hernández
  - Institute for Integrative Systems Biology (I2SysBio), CSIC – University of Valencia, Paterna, Spain
- Lucas Goiriz
  - Institute for Integrative Systems Biology (I2SysBio), CSIC – University of Valencia, Paterna, Spain
  - Department of Applied Mathematics, Polytechnic University of Valencia, Valencia, Spain
- Roser Montagud-Martínez
  - Institute for Integrative Systems Biology (I2SysBio), CSIC – University of Valencia, Paterna, Spain
  - Department of Biotechnology, Polytechnic University of Valencia, Valencia, Spain
- Raúl Ruiz
  - Institute for Integrative Systems Biology (I2SysBio), CSIC – University of Valencia, Paterna, Spain
- Anna Pérez-Ràfols
  - Giotto Biotech SRL, Sesto Fiorentino, Italy
  - Magnetic Resonance Center (CERM), Department of Chemistry Ugo Schiff, Consorzio Interuniversitario Risonanze Magnetiche di Metalloproteine (CIRMMP), University of Florence, Sesto Fiorentino, Italy
- R Anahí Higuera-Rodríguez
  - Dynamic Biosensors GmbH, Planegg, Germany
  - Department of Physics, Technical University of Munich, Garching, Germany
- Guillermo Pérez-Ropero
  - Ridgeview Instruments AB, Uppsala, Sweden
  - Department of Chemistry – BMC, Uppsala University, Uppsala, Sweden
- Wim F Vranken
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles – Vrije Universiteit Brussel, Brussels, Belgium
- Jos Buijs
  - Ridgeview Instruments AB, Uppsala, Sweden
  - Department of Immunology, Genetics, and Pathology, Uppsala University, Uppsala, Sweden
- Guillermo Rodrigo
  - Institute for Integrative Systems Biology (I2SysBio), CSIC – University of Valencia, Paterna, Spain
3. Roca-Martínez J, Dhondge H, Sattler M, Vranken WF. Deciphering the RRM-RNA recognition code: A computational analysis. PLoS Comput Biol 2023; 19:e1010859. [PMID: 36689472 PMCID: PMC9894542 DOI: 10.1371/journal.pcbi.1010859] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2022] [Revised: 02/02/2023] [Accepted: 01/07/2023] [Indexed: 01/24/2023] Open
Abstract
RNA recognition motifs (RRM) are the most prevalent class of RNA binding domains in eucaryotes. Their RNA binding preferences have been investigated for almost two decades, and even though some RRM domains are now very well described, their RNA recognition code has remained elusive. An increasing number of experimental structures of RRM-RNA complexes has become available in recent years. Here, we perform an in-depth computational analysis to derive an RNA recognition code for canonical RRMs. We present and validate a computational scoring method to estimate the binding between an RRM and a single stranded RNA, based on structural data from a carefully curated multiple sequence alignment, which can predict RRM binding RNA sequence motifs based on the RRM protein sequence. Given the importance and prevalence of RRMs in humans and other species, this tool could help design RNA binding motifs with uses in medical or synthetic biology applications, leading towards the de novo design of RRMs with specific RNA recognition.
Affiliation(s)
- Joel Roca-Martínez
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Michael Sattler
  - Institute of Structural Biology, Molecular Targets and Therapeutics Center, Helmholtz Munich, Neuherberg, Germany
  - Bavarian NMR Center, Department of Bioscience, School of Natural Sciences, Technical University of Munich, Garching, Germany
- Wim F. Vranken
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - * E-mail:
4. Damodaran K, Khan T, Bickel D, Jaya S, Vranken WF, Sudandiradoss C. New simulation insights on the structural transition mechanism of bovine rhodopsin activation. Proteins 2023; 91:771-780. [PMID: 36629258 DOI: 10.1002/prot.26465] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2022] [Revised: 12/02/2022] [Accepted: 01/02/2023] [Indexed: 01/12/2023]
Abstract
Inactive rhodopsin can absorb photons, which induces different structural transitions that finally activate rhodopsin. We have examined the change in spatial configurations and physicochemical factors that result during the transition mechanism from the inactive to the active rhodopsin state via intermediates. During the activation process, many existing atomic contacts are disrupted, and new ones are formed. This is related to the movement of Helix 5, which tilts away from Helix 3 in the intermediate state in lumirhodopsin and moves closer to Helix 3 again in the active state. Similar patterns of changing atomic contacts are observed between Helices 3 and 5 of the adenosine and neurotensin receptors. In addition, residues 220-238 of rhodopsin, which are disordered in the inactive state, fold in the active state before binding to the Gα, where it catalyzes GDP/GTP exchange on the Gα subunit. Finally, molecular dynamics simulations in the membrane environment revealed that the arrestin binding region adopts a more flexible extended conformation upon phosphorylation, likely promoting arrestin binding and inactivation. In summary, our results provide additional structural understanding of specific rhodopsin activation which might be relevant to other Class A G protein-coupled receptor proteins.
Affiliation(s)
- Kamalesh Damodaran
  - Department of Integrative Biology, School of Bioscience and Technology, Vellore Institute of Technology, Vellore, India
  - Interuniversity Institute of Bioinformatics in Brussels, ULB/VUB, Brussels, Belgium
- Taushif Khan
  - Interuniversity Institute of Bioinformatics in Brussels, ULB/VUB, Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- David Bickel
  - Interuniversity Institute of Bioinformatics in Brussels, ULB/VUB, Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Sreeshma Jaya
  - Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology, Vellore, India
- Wim F Vranken
  - Interuniversity Institute of Bioinformatics in Brussels, ULB/VUB, Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Chinnappan Sudandiradoss
  - Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology, Vellore, India
5. Roca-Martinez J, Lazar T, Gavalda-Garcia J, Bickel D, Pancsa R, Dixit B, Tzavella K, Ramasamy P, Sanchez-Fornaris M, Grau I, Vranken WF. Challenges in describing the conformation and dynamics of proteins with ambiguous behavior. Front Mol Biosci 2022; 9:959956. [PMID: 35992270 PMCID: PMC9382080 DOI: 10.3389/fmolb.2022.959956] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/02/2022] [Accepted: 06/27/2022] [Indexed: 11/13/2022] Open
Abstract
Traditionally, our understanding of how proteins operate and how evolution shapes them is based on two main data sources: the overall protein fold and the protein amino acid sequence. However, a significant part of the proteome shows highly dynamic and/or structurally ambiguous behavior, which cannot be correctly represented by the traditional fixed set of static coordinates. Representing such protein behaviors remains challenging and necessarily involves a complex interpretation of conformational states, including probabilistic descriptions. Relating protein dynamics and multiple conformations to their function as well as their physiological context (e.g., post-translational modifications and subcellular localization), therefore, remains elusive for much of the proteome, with studies to investigate the effect of protein dynamics relying heavily on computational models. We here investigate the possibility of delineating three classes of protein conformational behavior: order, disorder, and ambiguity. These definitions are explored based on three different datasets, using interpretable machine learning from a set of features, from AlphaFold2 to sequence-based predictions, to understand the overlap and differences between these datasets. This forms the basis for a discussion on the current limitations in describing the behavior of dynamic and ambiguous proteins.
Affiliation(s)
- Joel Roca-Martinez
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
- Tamas Lazar
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - VIB-VUB Center for Structural Biology, Brussels, Belgium
- Jose Gavalda-Garcia
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
- David Bickel
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
- Rita Pancsa
  - Research Centre for Natural Sciences, Institute of Enzymology, Budapest, Hungary
- Bhawna Dixit
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
  - IBiTech-Biommeda, Universiteit Gent, Gent, Belgium
- Konstantina Tzavella
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
- Pathmanaban Ramasamy
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
  - VIB-UGent Center for Medical Biotechnology, Universiteit Gent, Gent, Belgium
- Maite Sanchez-Fornaris
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
  - Department of Computer Sciences, University of Camagüey, Camagüey, Cuba
- Isel Grau
  - Information Systems, Eindhoven University of Technology, Eindhoven, Netherlands
- Wim F. Vranken
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, VUB/ULB, Brussels, Belgium
  - *Correspondence: Wim F. Vranken,
6. Ramasamy P, Vandermarliere E, Vranken WF, Martens L. Panoramic Perspective on Human Phosphosites. J Proteome Res 2022; 21:1894-1915. [PMID: 35793420 DOI: 10.1021/acs.jproteome.2c00164] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
Protein phosphorylation is the most common reversible post-translational modification of proteins and is key in the regulation of many cellular processes. Due to this importance, phosphorylation is extensively studied, resulting in the availability of a large amount of mass spectrometry-based phospho-proteomics data. Here, we leverage the information in these large-scale phospho-proteomics data sets, as contained in Scop3P, to analyze and characterize proteome-wide protein phosphorylation sites (P-sites). First, we set out to differentiate correctly observed P-sites from false-positive sites using five complementary site properties. We then describe the context of these P-sites in terms of the protein structure, solvent accessibility, structural transitions and disorder, and biophysical properties. We also investigate the relative prevalence of disease-linked mutations on and around P-sites. Moreover, we assess the structural dynamics of P-sites in their phosphorylated and unphosphorylated states. As a result, we show how large-scale reprocessing of available proteomics experiments can enable a more reliable view on proteome-wide P-sites. Furthermore, adding the structural context of proteins around P-sites helps uncover possible conformational switches upon phosphorylation. Moreover, by placing sites in different biophysical contexts, we show the differential preference in protein dynamics at phosphorylated sites when compared to the nonphosphorylated counterparts.
Affiliation(s)
- Pathmanaban Ramasamy
  - VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
  - Department of Biomolecular Medicine, Faculty of Health Sciences and Medicine, Ghent University, 9000 Ghent, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, 1050 Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, 1050 Brussels, Belgium
  - Centre for Structural Biology, VIB, 1050 Brussels, Belgium
- Wim F Vranken
  - Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, 1050 Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, 1050 Brussels, Belgium
  - Centre for Structural Biology, VIB, 1050 Brussels, Belgium
- Lennart Martens
  - VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
  - Department of Biomolecular Medicine, Faculty of Health Sciences and Medicine, Ghent University, 9000 Ghent, Belgium
7. Kagami L, Roca-Martínez J, Gavaldá-García J, Ramasamy P, Feenstra KA, Vranken WF. Online biophysical predictions for SARS-CoV-2 proteins. BMC Mol Cell Biol 2021; 22:23. [PMID: 33892639 PMCID: PMC8062939 DOI: 10.1186/s12860-021-00362-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/07/2021] [Accepted: 04/01/2021] [Indexed: 12/26/2022] Open
Abstract
BACKGROUND: The SARS-CoV-2 virus, the causative agent of COVID-19, consists of an assembly of proteins that determine its infectious and immunological behavior, as well as its response to therapeutics. Major structural biology efforts on these proteins have already provided essential insights into the mode of action of the virus, as well as avenues for structure-based drug design. However, not all of the SARS-CoV-2 proteins, or regions thereof, have a well-defined three-dimensional structure, and as such might exhibit ambiguous, dynamic behaviour that is not evident from static structure representations, nor from molecular dynamics simulations using these structures. MAIN: We present a website (https://bio2byte.be/sars2/) that provides protein sequence-based predictions of the backbone and side-chain dynamics and conformational propensities of these proteins, as well as derived early folding, disorder, β-sheet aggregation, protein-protein interaction and epitope propensities. These predictions attempt to capture the inherent biophysical propensities encoded in the sequence, rather than context-dependent behaviour such as the final folded state. In addition, we provide the biophysical variation that is observed in homologous proteins, which gives an indication of the limits of their functionally relevant biophysical behaviour. CONCLUSION: The https://bio2byte.be/sars2/ website provides a range of protein sequence-based predictions for 27 SARS-CoV-2 proteins, enabling researchers to form hypotheses about their possible functional modes of action.
Affiliation(s)
- Luciano Kagami
  - Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Triomflaan, 1050, Brussels, Belgium
- Joel Roca-Martínez
  - Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Triomflaan, 1050, Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050, Brussels, Belgium
  - VIB Structural Biology Research Centre, Pleinlaan 2, 1050, Brussels, Belgium
- Jose Gavaldá-García
  - Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Triomflaan, 1050, Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050, Brussels, Belgium
  - VIB Structural Biology Research Centre, Pleinlaan 2, 1050, Brussels, Belgium
- Pathmanaban Ramasamy
  - Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Triomflaan, 1050, Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050, Brussels, Belgium
  - VIB Structural Biology Research Centre, Pleinlaan 2, 1050, Brussels, Belgium
  - VIB-UGent Center for Medical Biotechnology, VIB, 9000, Ghent, Belgium
  - Department of Biomolecular Medicine, Faculty of Health Sciences and Medicine, Ghent University, 9000, Ghent, Belgium
- K Anton Feenstra
  - IBIVU - Center for Integrative Bioinformatics, Department of Computer Science, Vrije Universiteit Amsterdam, Amsterdam, 1081HV, The Netherlands
  - AIMMS - Amsterdam Institute for Molecules Medicines and Systems, Vrije Universiteit Amsterdam, Amsterdam, 1081HV, The Netherlands
- Wim F Vranken
  - Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Triomflaan, 1050, Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050, Brussels, Belgium
  - VIB Structural Biology Research Centre, Pleinlaan 2, 1050, Brussels, Belgium
8. Piovesan D, Necci M, Escobedo N, Monzon AM, Hatos A, Mičetić I, Quaglia F, Paladin L, Ramasamy P, Dosztányi Z, Vranken WF, Davey N, Parisi G, Fuxreiter M, Tosatto SCE. MobiDB: intrinsically disordered proteins in 2021. Nucleic Acids Res 2021; 49:D361-D367. [PMID: 33237329 PMCID: PMC7779018 DOI: 10.1093/nar/gkaa1058] [Citation(s) in RCA: 126] [Impact Index Per Article: 42.0] [Received: 09/15/2020] [Revised: 10/16/2020] [Accepted: 11/19/2020] [Indexed: 12/13/2022] Open
Abstract
The MobiDB database (URL: https://mobidb.org/) provides predictions and annotations for intrinsically disordered proteins. Here, we report recent developments implemented in MobiDB version 4, regarding the database format, with novel types of annotations and an improved update process. The new website includes a re-designed user interface, a more effective search engine and advanced API for programmatic access. The new database schema gives more flexibility for the users, as well as simplifying the maintenance and updates. In addition, the new entry page provides more visualisation tools including customizable feature viewer and graphs of the residue contact maps. MobiDB v4 annotates the binding modes of disordered proteins, whether they undergo disorder-to-order transitions or remain disordered in the bound state. In addition, disordered regions undergoing liquid-liquid phase separation or post-translational modifications are defined. The integrated information is presented in a simplified interface, which enables faster searches and allows large customized datasets to be downloaded in TSV, Fasta or JSON formats. An alternative advanced interface allows users to drill deeper into features of interest. A new statistics page provides information at database and proteome levels. The new MobiDB version presents state-of-the-art knowledge on disordered proteins and improves data accessibility for both computational and experimental users.
Affiliation(s)
- Damiano Piovesan
  - Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
- Marco Necci
  - Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
- Nahuel Escobedo
  - Dept. of Science and Technology, Universidad Nacional de Quilmes, Buenos Aires, Argentina
- András Hatos
  - Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
- Ivan Mičetić
  - Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
- Federica Quaglia
  - Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
- Lisanna Paladin
  - Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
- Pathmanaban Ramasamy
  - Interuniversity Institute of Bioinformatics in Brussels, ULB/VUB, Triomflaan, BC building, 6th floor, CP 263, 1050 Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium
  - Centre for Structural Biology, VIB, Pleinlaan 2, 1050 Brussels, Belgium
  - VIB-UGent Center for Medical Biotechnology, VIB, Ghent 9000, Belgium
  - Department of Biomolecular Medicine, Faculty of Health Sciences and Medicine, Ghent University, Ghent 9000, Belgium
- Wim F Vranken
  - Interuniversity Institute of Bioinformatics in Brussels, ULB/VUB, Triomflaan, BC building, 6th floor, CP 263, 1050 Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium
  - Centre for Structural Biology, VIB, Pleinlaan 2, 1050 Brussels, Belgium
- Norman E Davey
  - Division of Cancer Biology, The Institute of Cancer Research, 237 Fulham Road, London, SW3 6JB, UK
- Gustavo Parisi
  - Dept. of Science and Technology, Universidad Nacional de Quilmes, Buenos Aires, Argentina
- Monika Fuxreiter
  - Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
- Silvio C E Tosatto
  - Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
9. Orlando G, Raimondi D, Kagami LP, Vranken WF. ShiftCrypt: a web server to understand and biophysically align proteins through their NMR chemical shift values. Nucleic Acids Res 2020; 48:W36-W40. [PMID: 32459331 PMCID: PMC7319548 DOI: 10.1093/nar/gkaa391] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/29/2020] [Revised: 04/21/2020] [Accepted: 05/04/2020] [Indexed: 02/06/2023] Open
Abstract
Nuclear magnetic resonance (NMR) spectroscopy data provides valuable information on the behaviour of proteins in solution. The primary data to determine when studying proteins are the per-atom NMR chemical shifts, which reflect the local environment of atoms and provide insights into amino acid residue dynamics and conformation. Within an amino acid residue, chemical shifts present multi-dimensional and complexly cross-correlated information, making them difficult to analyse. The ShiftCrypt method, based on a neural network auto-encoder architecture, compresses the per-amino acid chemical shift information into a single, interpretable, amino acid-type independent value that reflects the biophysical state of a residue. We here present the ShiftCrypt web server, which makes the method readily available. The server accepts chemical shift input files in the NMR Exchange Format (NEF) or NMR-STAR format, executes ShiftCrypt and visualises the results, which are also accessible via an API. It also enables the “biophysically-based” pairwise alignment of two proteins based on their ShiftCrypt values. This approach uses Dynamic Time Warping, can optionally include amino acid code information, and has applications in, for example, the alignment of disordered regions. The server uses a token-based system to ensure the anonymity of the users and results. The web server is available at www.bio2byte.be/shiftcrypt.
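The abstract states that the pairwise alignment relies on Dynamic Time Warping over per-residue ShiftCrypt values. As a rough illustration only, the textbook form of that algorithm can be sketched in Python as below. The function name `dtw_align`, the absolute-difference cost, and the example profiles are assumptions made for this sketch; they are not taken from the ShiftCrypt server, whose actual cost function and optional amino acid term are not described here.

```python
from typing import List, Sequence, Tuple


def dtw_align(a: Sequence[float], b: Sequence[float]) -> Tuple[float, List[Tuple[int, int]]]:
    """Textbook dynamic time warping between two non-empty 1-D profiles.

    Returns the cumulative cost and the warping path as (index_in_a, index_in_b) pairs.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local cost (assumed: absolute difference)
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Trace back from the end, always moving to the cheapest predecessor cell.
    path: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # match both residues
            (cost[i - 1][j], i - 1, j),          # advance in a only
            (cost[i][j - 1], i, j - 1),          # advance in b only
        ]
        _, i, j = min(steps, key=lambda s: s[0])
    path.reverse()
    return cost[n][m], path


# Hypothetical per-residue values for two short proteins (made-up numbers).
distance, path = dtw_align([0.0, 1.0, 2.0], [0.0, 1.0, 1.0, 2.0])
print(distance, path)
```

Because one position may pair with several positions in the other profile, the path can stretch or compress regions, which is the property that makes this family of methods usable when the two profiles differ in length.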
Affiliation(s)
- Gabriele Orlando
  - Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Triomflaan, Brussels 1050, Belgium
  - Switch Laboratory, VIB, Leuven, Belgium
- Daniele Raimondi
  - ESAT-STADIUS, KU Leuven, Kasteelpark Arenberg 10, 3001 Leuven, Belgium
- Luciano Porto Kagami
  - Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Triomflaan, Brussels 1050, Belgium
- Wim F Vranken
  - Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Triomflaan, Brussels 1050, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, Brussels 1050, Belgium
  - VIB Structural Biology Research Centre, Pleinlaan 2, Brussels 1050, Belgium
10. Raimondi D, Orlando G, Vranken WF, Moreau Y. Exploring the limitations of biophysical propensity scales coupled with machine learning for protein sequence analysis. Sci Rep 2019; 9:16932. [PMID: 31729443 PMCID: PMC6858301 DOI: 10.1038/s41598-019-53324-w] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 07/05/2019] [Accepted: 10/25/2019] [Indexed: 11/21/2022] Open
Abstract
Machine learning (ML) is ubiquitous in bioinformatics, due to its versatility. One of the most crucial aspects to consider while training an ML model is to carefully select the optimal feature encoding for the problem at hand. Biophysical propensity scales are widely adopted in structural bioinformatics because they describe amino acid properties that are intuitively relevant for many structural and functional aspects of proteins, and are thus commonly used as input features for ML methods. In this paper we reproduce three classical structural bioinformatics prediction tasks to investigate the main assumptions about the use of propensity scales as input features for ML methods. We investigate their usefulness with different randomization experiments and we show that their effectiveness varies among the ML methods used and the tasks. We show that while linear methods are more dependent on the feature encoding, the specific biophysical meaning of the features is less relevant for non-linear methods. Moreover, we show that even among linear ML methods, the simpler one-hot encoding can surprisingly outperform the “biologically meaningful” scales. We also show that feature selection performed with non-linear ML methods may not be able to distinguish between randomized and “real” propensity scales by properly prioritizing the latter. Finally, we show that learning problem-specific embeddings could be a simple, assumptions-free and optimal way to perform feature learning/engineering for structural bioinformatics tasks.
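To make the two kinds of feature encoding contrasted in this abstract concrete, here is a minimal Python sketch that encodes a sequence either as one-hot vectors or through a single propensity scale. The function names and the choice of the Kyte-Doolittle hydropathy scale are assumptions for illustration; the abstract does not say which scales or which implementation the authors used.

```python
from typing import Dict, List

# The 20 standard amino acids, ordered alphabetically by one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
_INDEX: Dict[str, int] = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

# Kyte-Doolittle hydropathy values, used here as one example of a propensity scale.
KYTE_DOOLITTLE: Dict[str, float] = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def one_hot_encode(sequence: str) -> List[List[int]]:
    """Encode each residue as a 20-dimensional indicator vector.

    Symbols outside the 20 standard residues (e.g. X) become all-zero rows.
    """
    encoded: List[List[int]] = []
    for residue in sequence.upper():
        row = [0] * len(AMINO_ACIDS)
        idx = _INDEX.get(residue)
        if idx is not None:
            row[idx] = 1
        encoded.append(row)
    return encoded


def scale_encode(sequence: str, scale: Dict[str, float]) -> List[float]:
    """Encode each residue as one number looked up in a propensity scale.

    Residues missing from the scale are mapped to 0.0 (an arbitrary choice for this sketch).
    """
    return [scale.get(residue, 0.0) for residue in sequence.upper()]


print(one_hot_encode("AC"))
print(scale_encode("AI", KYTE_DOOLITTLE))
```

The one-hot form carries no biophysical prior and leaves all structure to be learned, while the scale form compresses each residue into a value that embeds a prior assumption; the abstract's randomization experiments probe how much that prior actually matters.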
Affiliation(s)
| | - Gabriele Orlando
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, 1050, Brussels, Belgium
| | - Wim F Vranken
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, 1050, Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, 1050, Belgium
| | - Yves Moreau
- ESAT-STADIUS, KU Leuven, 3001, Leuven, Belgium
|
11
|
Davey NE, Babu MM, Blackledge M, Bridge A, Capella-Gutierrez S, Dosztanyi Z, Drysdale R, Edwards RJ, Elofsson A, Felli IC, Gibson TJ, Gutmanas A, Hancock JM, Harrow J, Higgins D, Jeffries CM, Le Mercier P, Mészáros B, Necci M, Notredame C, Orchard S, Ouzounis CA, Pancsa R, Papaleo E, Pierattelli R, Piovesan D, Promponas VJ, Ruch P, Rustici G, Romero P, Sarntivijai S, Saunders G, Schuler B, Sharan M, Shields DC, Sussman JL, Tedds JA, Tompa P, Turewicz M, Vondrasek J, Vranken WF, Wallace BA, Wichapong K, Tosatto SCE. An intrinsically disordered proteins community for ELIXIR. F1000Res 2019; 8. [PMID: 31824649 PMCID: PMC6880265 DOI: 10.12688/f1000research.20136.1] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Accepted: 09/18/2019] [Indexed: 01/20/2023]
Abstract
Intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs) are now recognised as major determinants in cellular regulation. This white paper presents a roadmap for future e-infrastructure developments in the field of IDP research within the ELIXIR framework. The goal of these developments is to drive the creation of high-quality tools and resources to support the identification, analysis and functional characterisation of IDPs. The roadmap is the result of a workshop titled “An intrinsically disordered protein user community proposal for ELIXIR” held at the University of Padua. The workshop, and further consultation with the members of the wider IDP community, identified the key priority areas for the roadmap including the development of standards for data annotation, storage and dissemination; integration of IDP data into the ELIXIR Core Data Resources; and the creation of benchmarking criteria for IDP-related software. Here, we discuss these areas of priority, how they can be implemented in cooperation with the ELIXIR platforms, and their connections to existing ELIXIR Communities and international consortia. The article provides a preliminary blueprint for an IDP Community in ELIXIR and is an appeal to identify and involve new stakeholders.
Affiliation(s)
- Norman E Davey
- Division of Cancer Biology, Institute of Cancer Research, UK, London, SW3 6JB, UK
| | - M Madan Babu
- MRC Laboratory of Molecular Biology, Cambridge, CB2 0QH, UK
| | - Martin Blackledge
- Institut de Biologie Structurale, Université Grenoble Alpes, Grenoble, 38000, France
| | - Alan Bridge
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
| | | | - Zsuzsanna Dosztanyi
- Department of Biochemistry, Eötvös Loránd University, Budapest, H-1117, Hungary
| | | | - Richard J Edwards
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia
| | - Arne Elofsson
- Department of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Stockholm, Sweden
| | - Isabella C Felli
- Department of Chemistry and CERM "Ugo Schiff", University of Florence, Florence, Italy
| | - Toby J Gibson
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Aleksandras Gutmanas
- Protein Data Bank in Europe, European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Cambridge, CB10 1SD, UK
| | - John M Hancock
- ELIXIR Hub, Wellcome Genome Campus, Cambridge, CB10 1SD, UK
| | - Jen Harrow
- ELIXIR Hub, Wellcome Genome Campus, Cambridge, CB10 1SD, UK
| | - Desmond Higgins
- Conway Institute of Biomolecular & Biomedical Research, University College Dublin, Belfield, Dublin, D4, Ireland
| | - Cy M Jeffries
- European Molecular Biology Laboratory, Hamburg, Germany
| | - Philippe Le Mercier
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
| | - Balint Mészáros
- Department of Biochemistry, Eötvös Loránd University, Budapest, H-1117, Hungary
| | - Marco Necci
- Department of Biomedical Sciences, University of Padua, Padua, Italy
| | - Cedric Notredame
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, 08003, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Sandra Orchard
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Cambridge, CB10 1SD, UK
| | - Christos A Ouzounis
- BCPL-CPERI, Centre for Research & Technology Hellas (CERTH), Thessalonica, 57001, Greece
| | - Rita Pancsa
- Institute of Enzymology, Research Centre for Natural Sciences of the Hungarian Academy of Sciences, Budapest, H-1117, Hungary
| | - Elena Papaleo
- Computational Biology Laboratory, Danish Cancer Society Research Center, Copenhagen, 2100, Denmark
| | - Roberta Pierattelli
- Department of Chemistry and CERM "Ugo Schiff", University of Florence, Florence, Italy
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padua, Padua, Italy
| | - Vasilis J Promponas
- Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia, CY-1678, Cyprus
| | - Patrick Ruch
- HES-SO/HEG and SIB Text Mining, Swiss Institute of Bioinformatics, Geneva, Switzerland
| | - Gabriella Rustici
- Department of Genetics, University of Cambridge, Cambridge, CB2 3EH, UK
| | - Pedro Romero
- University of Wisconsin-Madison, Madison, WI, 53706-1544, USA
| | | | - Gary Saunders
- ELIXIR Hub, Wellcome Genome Campus, Cambridge, CB10 1SD, UK
| | - Benjamin Schuler
- Department of Biochemistry, University of Zurich, Zurich, Switzerland
| | - Malvika Sharan
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Denis C Shields
- Conway Institute of Biomolecular & Biomedical Research, University College Dublin, Belfield, Dublin, D4, Ireland
| | - Joel L Sussman
- Department of Structural Biology and the Israel Structural Proteomics Center (ISPC), Weizmann Institute of Science, Reḥovot, 7610001, Israel
| | | | - Peter Tompa
- VIB Center for Structural Biology (CSB), VIB Flemish Institute for Biotechnology, Brussels, 1050, Belgium
| | - Michael Turewicz
- Faculty of Medicine, Medizinisches Proteom-Center, Ruhr University Bochum, GesundheitsCampus 4, Bochum, 44801, Germany
| | - Jiri Vondrasek
- Institute of Organic Chemistry and Biochemistry, CAS, Prague, Czech Republic
| | - Wim F Vranken
- VUB/ULB Interuniversity Institute of Bioinformatics in Brussels and Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, B-1050, Belgium
| | - Bonnie Ann Wallace
- Institute of Structural and Molecular Biology, Birkbeck College, University of London, London, WC1H 0HA, UK
| | - Kanin Wichapong
- Department of Biochemistry, Cardiovascular Research Institute Maastricht (CARIM), Maastricht University, Maastricht, The Netherlands
|
12
|
Abstract
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
Affiliation(s)
- Daniele Raimondi
- Interuniversity Institute of Bioinformatics in Brussels, ULB/VUB, Triomflaan, BC building, 6th floor, CP 263, 1050, Brussels, Belgium
- Machine Learning Group, Université Libre de Bruxelles, Boulevard du Triomphe, CP 212, 1050, Brussels, Belgium
- Centre for Structural Biology, VIB, Pleinlaan 2, 1050, Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050, Brussels, Belgium
| | - Gabriele Orlando
- Interuniversity Institute of Bioinformatics in Brussels, ULB/VUB, Triomflaan, BC building, 6th floor, CP 263, 1050, Brussels, Belgium
- Machine Learning Group, Université Libre de Bruxelles, Boulevard du Triomphe, CP 212, 1050, Brussels, Belgium
- Centre for Structural Biology, VIB, Pleinlaan 2, 1050, Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050, Brussels, Belgium
| | - Rita Pancsa
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge, CB2 0QH, United Kingdom
| | - Taushif Khan
- Interuniversity Institute of Bioinformatics in Brussels, ULB/VUB, Triomflaan, BC building, 6th floor, CP 263, 1050, Brussels, Belgium
- Centre for Structural Biology, VIB, Pleinlaan 2, 1050, Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050, Brussels, Belgium
| | - Wim F Vranken
- Interuniversity Institute of Bioinformatics in Brussels, ULB/VUB, Triomflaan, BC building, 6th floor, CP 263, 1050, Brussels, Belgium
- Centre for Structural Biology, VIB, Pleinlaan 2, 1050, Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050, Brussels, Belgium
|
13
|
Piovesan D, Tabaro F, Paladin L, Necci M, Micetic I, Camilloni C, Davey N, Dosztányi Z, Mészáros B, Monzon AM, Parisi G, Schad E, Sormanni P, Tompa P, Vendruscolo M, Vranken WF, Tosatto SCE. MobiDB 3.0: more annotations for intrinsic disorder, conformational diversity and interactions in proteins. Nucleic Acids Res 2019; 46:D471-D476. [PMID: 29136219 PMCID: PMC5753340 DOI: 10.1093/nar/gkx1071] [Citation(s) in RCA: 156] [Impact Index Per Article: 31.2] [Received: 09/22/2017] [Accepted: 10/19/2017] [Indexed: 01/30/2023]
Abstract
The MobiDB (URL: mobidb.bio.unipd.it) database of protein disorder and mobility annotations has been significantly updated and upgraded since its last major renewal in 2014. Several curated datasets for intrinsic disorder and folding upon binding have been integrated from specialized databases. The indirect evidence has also been expanded to better capture information available in the PDB, such as high temperature residues in X-ray structures and overall conformational diversity. Novel nuclear magnetic resonance chemical shift data provides an additional experimental information layer on conformational dynamics. Predictions have been expanded to provide new types of annotation on backbone rigidity, secondary structure preference and disordered binding regions. MobiDB 3.0 contains information for the complete UniProt protein set and synchronization has been improved by covering all UniParc sequences. An advanced search function allows the creation of a wide array of custom-made datasets for download and further analysis. A large amount of information and cross-links to more specialized databases are intended to make MobiDB the central resource for the scientific community working on protein intrinsic disorder and mobility.
Affiliation(s)
- Damiano Piovesan
- Department of Biomedical Sciences, University of Padua, via U. Bassi 58/b, 35131 Padua, Italy
| | - Francesco Tabaro
- Department of Biomedical Sciences, University of Padua, via U. Bassi 58/b, 35131 Padua, Italy
- Institute of Biosciences and Medical Technology, Arvo Ylpön katu 34, 33520 Tampere, Finland
| | - Lisanna Paladin
- Department of Biomedical Sciences, University of Padua, via U. Bassi 58/b, 35131 Padua, Italy
| | - Marco Necci
- Department of Biomedical Sciences, University of Padua, via U. Bassi 58/b, 35131 Padua, Italy
- Department of Agricultural Sciences, University of Udine, via Palladio 8, 33100 Udine, Italy
- Fondazione Edmund Mach, Via E. Mach 1, 38010 S. Michele all'Adige, Italy
| | - Ivan Micetic
- Department of Biomedical Sciences, University of Padua, via U. Bassi 58/b, 35131 Padua, Italy
| | - Carlo Camilloni
- Department of Biosciences, University of Milan, 20133 Milano, Italy
| | - Norman Davey
- Conway Institute of Biomolecular & Biomedical Research, University College Dublin, Belfield, Dublin 4, Ireland
- UCD School of Medicine & Medical Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Zsuzsanna Dosztányi
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, 1/c Pázmány Péter sétány, H-1117, Budapest, Hungary
| | - Bálint Mészáros
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, 1/c Pázmány Péter sétány, H-1117, Budapest, Hungary
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, PO Box 7, H-1518 Budapest, Hungary
| | - Alexander M Monzon
- Structural Bioinformatics Group, Department of Science and Technology, National University of Quilmes, CONICET, Roque Saenz Pena 182, Bernal B1876BXD, Argentina
| | - Gustavo Parisi
- Structural Bioinformatics Group, Department of Science and Technology, National University of Quilmes, CONICET, Roque Saenz Pena 182, Bernal B1876BXD, Argentina
| | - Eva Schad
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, PO Box 7, H-1518 Budapest, Hungary
| | - Pietro Sormanni
- Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, UK
| | - Peter Tompa
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, PO Box 7, H-1518 Budapest, Hungary
- Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels 1050, Belgium
- VIB-VUB Center for Structural Biology, Flanders Institute for Biotechnology (VIB), Brussels 1050, Belgium
| | | | - Wim F Vranken
- Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels 1050, Belgium
- VIB-VUB Center for Structural Biology, Flanders Institute for Biotechnology (VIB), Brussels 1050, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, ULB/VUB, 1050 Brussels, Belgium
| | - Silvio C E Tosatto
- Department of Biomedical Sciences, University of Padua, via U. Bassi 58/b, 35131 Padua, Italy
- CNR Institute of Neuroscience, via U. Bassi 58/b, 35131 Padua, Italy
|
14
|
Varadi M, De Baets G, Vranken WF, Tompa P, Pancsa R. AmyPro: a database of proteins with validated amyloidogenic regions. Nucleic Acids Res 2019; 46:D387-D392. [PMID: 29040693 PMCID: PMC5753394 DOI: 10.1093/nar/gkx950] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Received: 08/15/2017] [Accepted: 10/10/2017] [Indexed: 01/05/2023]
Abstract
Soluble functional proteins may transform into insoluble amyloid fibrils that deposit in a variety of tissues. Amyloid formation is a hallmark of age-related degenerative disorders. Perhaps surprisingly, amyloid fibrils can also be beneficial and are frequently exploited for diverse functional roles in organisms. Here we introduce AmyPro, an open-access database providing a comprehensive, carefully curated collection of validated amyloid fibril-forming proteins from all kingdoms of life classified into broad functional categories (http://amypro.net). In particular, AmyPro provides the boundaries of experimentally validated amyloidogenic sequence regions, short descriptions of the functional relevance of the proteins and their amyloid state, a list of the experimental techniques applied to study the amyloid state, important structural/functional/variation/mutation data transferred from UniProt, a list of relevant PDB structures categorized according to protein states, database cross-references and literature references. AmyPro greatly improves on similar currently available resources by incorporating both prions and functional amyloids in addition to pathogenic amyloids, and allows users to screen their sequences against the entire collection of validated amyloidogenic sequence fragments. By enabling further elucidation of the sequential determinants of amyloid fibril formation, we hope AmyPro will enhance the development of new methods for the precise prediction of amyloidogenic regions within proteins.
Affiliation(s)
- Mihaly Varadi
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Greet De Baets
- MRC Laboratory of Molecular Biology, Cambridge Biomedical Campus, Cambridge CB2 0QH, UK
| | - Wim F Vranken
- Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels, 1050, Belgium
- Interuniversity Institute of Bioinformatics in Brussels (IB)², ULB-VUB, Brussels, 1050, Belgium
- VIB Center for Structural Biology, Vrije Universiteit Brussel (VUB), Brussels, 1050, Belgium
| | - Peter Tompa
- Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels, 1050, Belgium
- VIB Center for Structural Biology, Vrije Universiteit Brussel (VUB), Brussels, 1050, Belgium
- Institute of Enzymology, Research Centre for Natural Sciences of the HAS, Budapest, 1117, Hungary
| | - Rita Pancsa
- MRC Laboratory of Molecular Biology, Cambridge Biomedical Campus, Cambridge CB2 0QH, UK
|
15
|
Orlando G, Raimondi D, Tabaro F, Codicè F, Moreau Y, Vranken WF. Computational identification of prion-like RNA-binding proteins that form liquid phase-separated condensates. Bioinformatics 2019; 35:4617-4623. [DOI: 10.1093/bioinformatics/btz274] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 12/14/2018] [Revised: 04/06/2019] [Accepted: 04/12/2019] [Indexed: 11/13/2022]
Abstract
Motivation
Eukaryotic cells contain different membrane-delimited compartments, which are crucial for the biochemical reactions necessary to sustain cell life. Recent studies showed that cells can also trigger the formation of membraneless organelles composed of phase-separated proteins to respond to various stimuli. These condensates provide new ways to control the reactions, and phase-separation proteins (PSPs) are thus revolutionizing how cellular organization is conceived. The small number of experimentally validated proteins, and the difficulty in discovering them, remain bottlenecks in PSP research.
Results
Here we present PSPer, the first in-silico screening tool for prion-like RNA-binding PSPs. We show that it can prioritize PSPs among proteins containing similar RNA-binding domains, intrinsically disordered regions and prions. PSPer is thus suitable to screen proteomes, identifying the most likely PSPs for further experimental investigation. Moreover, its predictions are fully interpretable in the sense that it assigns specific functional regions to the predicted proteins, providing valuable information for experimental investigation of targeted mutations on these regions. Finally, we show that it can estimate the ability of artificially designed proteins to form condensates (r=−0.87), thus providing an in-silico screening tool for protein design experiments.
Availability and implementation
PSPer is available at bio2byte.com/psp.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Gabriele Orlando
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, 1050 Brussels, Belgium
- Structural Biology Brussels, Department of Bioengineering Sciences, Vrije Universiteit Brussel, 1050 Brussels, Belgium
| | | | - Francesco Tabaro
- Institute of Biosciences and Medical Technology, Tampere 33520, Finland
| | - Francesco Codicè
- Department of Computer Science and Engineering, University of Bologna, Bologna 40127, Italy
| | - Yves Moreau
- ESAT-STADIUS, KU Leuven, Leuven 3001, Belgium
| | - Wim F Vranken
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, 1050 Brussels, Belgium
- Structural Biology Brussels, Department of Bioengineering Sciences, Vrije Universiteit Brussel, 1050 Brussels, Belgium
- Center for Structural Biology, VIB, 1050 Brussels, Belgium
|
16
|
Raimondi D, Orlando G, Tabaro F, Lenaerts T, Rooman M, Moreau Y, Vranken WF. Large-scale in-silico statistical mutagenesis analysis sheds light on the deleteriousness landscape of the human proteome. Sci Rep 2018; 8:16980. [PMID: 30451933 PMCID: PMC6242909 DOI: 10.1038/s41598-018-34959-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 06/22/2018] [Accepted: 10/26/2018] [Indexed: 12/18/2022]
Abstract
Next generation sequencing technologies are providing increasing amounts of sequencing data, paving the way for improvements in clinical genetics and precision medicine. The interpretation of the observed genomic variants in the light of their phenotypic effects is thus emerging as a crucial task to solve in order to advance our understanding of how exomic variants affect proteins and how the proteins' functional changes affect human health. Since the experimental evaluation of the effects of every observed variant is infeasible, bioinformatics methods are being developed to address this challenge in-silico, by predicting the impact of millions of variants, thus providing insight into the deleteriousness landscape of entire proteomes. Here we show the feasibility of this approach by using the recently developed DEOGEN2 variant-effect predictor to perform the largest in-silico mutagenesis scan to date. We computed the deleteriousness score of 170 million variants over 15,000 human proteins and we analysed the results, investigating how the predicted deleteriousness landscape of the proteins relates to known functionally and structurally relevant protein regions and biophysical properties. Moreover, we qualitatively validated our results by comparing them with two mutagenesis studies targeting two specific proteins, showing the consistency of DEOGEN2 predictions with respect to experimental data.
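The enumeration step behind an in-silico mutagenesis scan can be sketched as follows. This is a minimal illustration that assumes the scan covers every single amino acid substitution (19 alternatives per position), which the abstract does not state explicitly. The function names are hypothetical, and no DEOGEN2 call is shown because its interface is not described here.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitutions(sequence):
    """Yield (position, wild_type, mutant) for each single substitution.

    Positions are 1-based, following the usual variant notation.
    """
    for position, wild_type in enumerate(sequence, start=1):
        for mutant in AMINO_ACIDS:
            if mutant != wild_type:
                yield position, wild_type, mutant


def variant_label(position, wild_type, mutant):
    """Format a substitution in the common 'M1A' style."""
    return f"{wild_type}{position}{mutant}"


# A toy three-residue sequence; a real scan would loop over whole proteomes
# and pass each variant to a predictor.
variants = [variant_label(*v) for v in single_substitutions("MKT")]
```

Under that assumption a protein of length L yields 19 × L variants, so 170 million variants over 15,000 proteins corresponds to roughly 11,000 variants per protein on average.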
Affiliation(s)
- Daniele Raimondi
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, La Plaine Campus, Triomflaan, 1050, Brussels, Belgium
- ESAT-STADIUS, KU Leuven, Kasteelpark Arenberg 10, 3001, Leuven, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050, Brussels, Belgium
| | - Gabriele Orlando
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, La Plaine Campus, Triomflaan, 1050, Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050, Brussels, Belgium
| | - Francesco Tabaro
- Institute of Biosciences and Medical Technology, Arvo Ylpőn katu 34, 33520, Tampere, Finland
| | - Tom Lenaerts
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, La Plaine Campus, Triomflaan, 1050, Brussels, Belgium
- Machine Learning Group, ULB, La Plaine Campus, 1050, Brussels, Belgium
| | - Marianne Rooman
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, La Plaine Campus, Triomflaan, 1050, Brussels, Belgium
- Department of BioModeling, BioInformatics & BioProcesses, Université Libre de Bruxelles, 1050, Brussels, Belgium
| | - Yves Moreau
- ESAT-STADIUS, KU Leuven, Kasteelpark Arenberg 10, 3001, Leuven, Belgium
- Imec, 3001, Leuven, Belgium
| | - Wim F Vranken
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, La Plaine Campus, Triomflaan, 1050, Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050, Brussels, Belgium
|
17
|
Orlando G, Raimondi D, Khan T, Lenaerts T, Vranken WF. SVM-dependent pairwise HMM: an application to protein pairwise alignments. Bioinformatics 2018; 33:3902-3908. [PMID: 28666322 DOI: 10.1093/bioinformatics/btx391] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 12/29/2016] [Accepted: 06/12/2017] [Indexed: 12/27/2022]
Abstract
Motivation
Methods able to provide reliable protein alignments are crucial for many bioinformatics applications. In recent years many different algorithms have been developed, and various kinds of information, from sequence conservation to secondary structure, have been used to improve alignment performance. This is especially relevant for proteins with highly divergent sequences. However, recent work suggests that different features may have different importance in diverse protein classes, and it would be an advantage to have more customizable approaches capable of dealing with different alignment definitions.
Results
Here we present Rigapollo, a highly flexible pairwise alignment method based on a pairwise HMM-SVM that can use any type of information to build alignments. Rigapollo lets the user decide the optimal features to align their protein class of interest. It outperforms current state-of-the-art methods on two well-known benchmark datasets when aligning highly divergent sequences.
Availability and implementation
A Python implementation of the algorithm is available at http://ibsquare.be/rigapollo.
Contact
wim.vranken@vub.be
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Gabriele Orlando
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, La Plaine Campus, Triomflaan
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2
- Structural Biology Research Center, VIB
- Structural Machine Learning Group, Université Libre de Bruxelles
| | - Daniele Raimondi
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, La Plaine Campus, Triomflaan
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2
- Structural Biology Research Center, VIB
- Structural Machine Learning Group, Université Libre de Bruxelles
| | - Taushif Khan
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, La Plaine Campus, Triomflaan
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2
| | - Tom Lenaerts
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, La Plaine Campus, Triomflaan
- Structural Machine Learning Group, Université Libre de Bruxelles
- Artificial Intelligence Lab, Vrije Universiteit Brussel, 1050 Brussels, Belgium
| | - Wim F Vranken
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, La Plaine Campus, Triomflaan
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2
- Structural Biology Research Center, VIB
|
18
|
Raimondi D, Orlando G, Moreau Y, Vranken WF. Ultra-fast global homology detection with Discrete Cosine Transform and Dynamic Time Warping. Bioinformatics 2018; 34:3118-3125. [DOI: 10.1093/bioinformatics/bty309] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 12/06/2017] [Accepted: 04/18/2018] [Indexed: 11/14/2022]
Affiliation(s)
- Daniele Raimondi
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- ESAT-STADIUS, KU Leuven, Leuven, Belgium
- Machine Learning Group, Université Libre De Bruxelles, Brussels, Belgium
| | - Gabriele Orlando
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Machine Learning Group, Université Libre De Bruxelles, Brussels, Belgium
| | - Yves Moreau
- ESAT-STADIUS, KU Leuven, Leuven, Belgium
- Imec, Leuven, Belgium
| | - Wim F Vranken
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
|
19
|
Hou Q, De Geest PFG, Vranken WF, Heringa J, Feenstra KA. Seeing the trees through the forest: sequence-based homo- and heteromeric protein-protein interaction sites prediction using random forest. Bioinformatics 2018; 33:1479-1487. [PMID: 28073761 DOI: 10.1093/bioinformatics/btx005] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 07/11/2016] [Accepted: 01/06/2017] [Indexed: 11/13/2022]
Abstract
Motivation
Genome sequencing is producing an ever-increasing amount of associated protein sequences. Few of these sequences have experimentally validated annotations, however, and computational predictions are becoming increasingly successful in producing such annotations. One key challenge remains the prediction of the amino acids in a given protein sequence that are involved in protein-protein interactions. Such predictions are typically based on machine learning methods that take advantage of the properties and sequence positions of amino acids that are known to be involved in interaction. In this paper, we evaluate the importance of various features using Random Forest (RF), and include as a novel feature backbone flexibility predicted from sequences to further optimise protein interface prediction.
Results
We observe that there is no single sequence feature that enables pinpointing interacting sites in our Random Forest models. However, combining different properties does increase the performance of interface prediction. Our homomeric-trained RF interface predictor is able to distinguish interface from non-interface residues with an area under the ROC curve of 0.72 in a homomeric test-set. The heteromeric-trained RF interface predictor performs better than existing predictors on an independent heteromeric test-set. We trained a more general predictor on the combined homomeric and heteromeric dataset, and show that in addition to predicting homomeric interfaces, it is also able to pinpoint interface residues in heterodimers. This suggests that our random forest model and the features included capture common properties of both homodimer and heterodimer interfaces.
Availability and Implementation
The predictors and test datasets used in our analyses are freely available (http://www.ibi.vu.nl/downloads/RF_PPI/).
Contact
k.a.feenstra@vu.nl
Supplementary information
Supplementary data are available at Bioinformatics online.
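The area under the ROC curve quoted in this abstract can be read as the probability that a randomly chosen interface residue receives a higher score than a randomly chosen non-interface residue. The sketch below computes that quantity directly from pairwise comparisons. It is an illustration on made-up labels and scores, not the evaluation code used by the authors, and `roc_auc` is a hypothetical helper name.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC as the fraction of positive/negative pairs
    ranked correctly, counting ties as half a correct pair."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Toy data: 1 marks an interface residue, 0 a non-interface residue.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.1]
auc = roc_auc(labels, scores)
```

In this toy case five of the six positive/negative pairs are ordered correctly, so the AUC is 5/6. A value of 0.5 corresponds to random ranking and 1.0 to perfect separation.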
Affiliation(s)
- Qingzhen Hou
- Center for Integrative Bioinformatics VU (IBIVU), Amsterdam, HV, The Netherlands
- Amsterdam Institute for Molecules Medicines and Systems (AIMMS), Vrije Universiteit Amsterdam, Amsterdam, HV, The Netherlands
| | - Paul F G De Geest
- Center for Integrative Bioinformatics VU (IBIVU), Amsterdam, HV, The Netherlands
- Amsterdam Institute for Molecules Medicines and Systems (AIMMS), Vrije Universiteit Amsterdam, Amsterdam, HV, The Netherlands
| | - Wim F Vranken
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Structural Biology Research Centre, VIB, Brussels, Belgium
| | - Jaap Heringa
- Center for Integrative Bioinformatics VU (IBIVU), Amsterdam, HV, The Netherlands
- Amsterdam Institute for Molecules Medicines and Systems (AIMMS), Vrije Universiteit Amsterdam, Amsterdam, HV, The Netherlands
| | - K Anton Feenstra
- Center for Integrative Bioinformatics VU (IBIVU), Amsterdam, HV, The Netherlands
- Amsterdam Institute for Molecules Medicines and Systems (AIMMS), Vrije Universiteit Amsterdam, Amsterdam, HV, The Netherlands
| |
|
20
|
Brysbaert G, Lorgouilloux K, Vranken WF, Lensink MF. RINspector: a Cytoscape app for centrality analyses and DynaMine flexibility prediction. Bioinformatics 2017; 34:294-296. [PMID: 29028877 PMCID: PMC5860209 DOI: 10.1093/bioinformatics/btx586] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 03/28/2017] [Revised: 09/04/2017] [Accepted: 09/18/2017] [Indexed: 12/17/2022] Open
Abstract
Motivation: Protein function is directly related to amino acid residue composition and the dynamics of these residues. Centrality analyses based on residue interaction networks make it possible to identify key residues in a protein that are important for its fold or function. Such central residues and their environment constitute suitable targets for mutagenesis experiments. Predicted flexibility and changes in flexibility upon mutation provide valuable additional information for the design of such experiments. Results: We combined centrality analyses with DynaMine flexibility predictions in a Cytoscape app called RINspector. The app performs centrality analyses and directly visualizes the results on a graph of predicted residue flexibility. In addition, the effect of mutations on local flexibility can be calculated. Availability and implementation: The app is publicly available in the Cytoscape app store. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Guillaume Brysbaert
- University of Lille, CNRS UMR8576 UGSF, F-59000 Lille, France
- To whom correspondence should be addressed.
| | | | - Wim F Vranken
- Interuniversity Institute of Bioinformatics in Brussels, ULB/VUB, B-1050 Brussels, Belgium
- Structural Biology Research Centre, VIB, B-1050 Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, B-1050 Brussels, Belgium
| | - Marc F Lensink
- University of Lille, CNRS UMR8576 UGSF, F-59000 Lille, France
| |
|
21
|
Pancsa R, Raimondi D, Cilia E, Vranken WF. Early Folding Events, Local Interactions, and Conservation of Protein Backbone Rigidity. Biophys J 2017; 110:572-583. [PMID: 26840723 DOI: 10.1016/j.bpj.2015.12.028] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 08/27/2015] [Revised: 12/21/2015] [Accepted: 12/29/2015] [Indexed: 01/20/2023] Open
Abstract
Protein folding is in its early stages largely determined by the protein sequence and complex local interactions between amino acids, resulting in lower energy conformations that provide the context for further folding into the native state. We compiled a comprehensive data set of early folding residues based on pulsed labeling hydrogen deuterium exchange experiments. These early folding residues have corresponding higher backbone rigidity as predicted by DynaMine from sequence, an effect also present when accounting for the secondary structures in the folded protein. We then show that the amino acids involved in early folding events are not more conserved than others, but rather, early folding fragments and the secondary structure elements they are part of show a clear trend toward conserving a rigid backbone. We therefore propose that backbone rigidity is a fundamental physical feature conserved by proteins that can provide important insights into their folding mechanisms and stability.
Affiliation(s)
- Rita Pancsa
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
| | - Daniele Raimondi
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
| | - Elisa Cilia
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
| | - Wim F Vranken
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium.
| |
|
22
|
Orlando G, Raimondi D, Vranken WF. Observation selection bias in contact prediction and its implications for structural bioinformatics. Sci Rep 2016; 6:36679. [PMID: 27857150 PMCID: PMC5114557 DOI: 10.1038/srep36679] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 07/27/2016] [Accepted: 10/18/2016] [Indexed: 01/14/2023] Open
Abstract
Next Generation Sequencing is dramatically increasing the number of known protein sequences, with related experimentally determined protein structures lagging behind. Structural bioinformatics is attempting to close this gap by developing approaches that predict structure-level characteristics for uncharacterized protein sequences, with most of the developed methods relying heavily on evolutionary information collected from homologous sequences. Here we show that there is a substantial observational selection bias in this approach: the predictions are validated on proteins with known structures from the PDB, but exactly for those proteins significantly more homologs are available compared to less studied sequences randomly extracted from UniProt. Structural bioinformatics methods that were developed this way are thus likely to have overestimated performance; we demonstrate this for two contact prediction methods, where performance drops by up to 60% when taking into account a more realistic amount of evolutionary information. We provide a bias-free dataset for the validation of contact prediction methods called NOUMENON.
Affiliation(s)
- G Orlando
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, La Plaine Campus, Triomflaan, Belgium; Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, Belgium; Structural Biology Research Center, VIB, 1050 Brussels, Belgium
| | - D Raimondi
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, La Plaine Campus, Triomflaan, Belgium; Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, Belgium; Structural Biology Research Center, VIB, 1050 Brussels, Belgium
| | - W F Vranken
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, La Plaine Campus, Triomflaan, Belgium; Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, Belgium; Structural Biology Research Center, VIB, 1050 Brussels, Belgium
| |
|
23
|
Raimondi D, Orlando G, Messens J, Vranken WF. Investigating the Molecular Mechanisms Behind Uncharacterized Cysteine Losses from Prediction of Their Oxidation State. Hum Mutat 2016; 38:86-94. [PMID: 27667481 DOI: 10.1002/humu.23129] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 03/16/2016] [Revised: 09/13/2016] [Accepted: 09/20/2016] [Indexed: 01/08/2023]
Abstract
Cysteines are among the rarest amino acids in nature, and are both functionally and structurally very important for proteins. The ability of cysteines to form disulfide bonds is especially relevant, both for constraining the folded state of the protein and for performing enzymatic duties. But how does the variation record of human proteins reflect their functional importance and structural role, especially with regard to deleterious mutations? We created HUMCYS, a manually curated dataset of single amino acid variants that (1) have a known disease/neutral phenotypic outcome and (2) cause the loss of a cysteine, in order to investigate how mutated cysteines relate to structural aspects such as surface accessibility and cysteine oxidation state. We also have developed a sequence-based in silico cysteine oxidation predictor to overcome the scarcity of experimentally derived oxidation annotations, and applied it to extend our analysis to classes of proteins for which the experimental determination of their structure is technically challenging, such as transmembrane proteins. Our investigation shows that we can gain insights into the reason behind the outcome of cysteine losses in otherwise uncharacterized proteins, and we discuss the possible molecular mechanisms leading to deleterious phenotypes, such as the involvement of the mutated cysteine in a structurally or enzymatically relevant disulfide bond.
Affiliation(s)
- Daniele Raimondi
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels, Belgium; Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium; Structural Biology Research Center (SBRC), VIB, Brussels, Belgium; Machine Learning Group, ULB, Brussels, Belgium
| | - Gabriele Orlando
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels, Belgium; Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium; Structural Biology Research Center (SBRC), VIB, Brussels, Belgium; Machine Learning Group, ULB, Brussels, Belgium
| | - Joris Messens
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium; Structural Biology Research Center (SBRC), VIB, Brussels, Belgium
| | - Wim F Vranken
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels, Belgium; Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium; Structural Biology Research Center (SBRC), VIB, Brussels, Belgium
| |
|
24
|
Raimondi D, Gazzo AM, Rooman M, Lenaerts T, Vranken WF. Multilevel biological characterization of exomic variants at the protein level significantly improves the identification of their deleterious effects. Bioinformatics 2016; 32:1797-804. [DOI: 10.1093/bioinformatics/btw094] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 09/24/2015] [Accepted: 02/15/2016] [Indexed: 11/14/2022] Open
|
25
|
Pancsa R, Varadi M, Tompa P, Vranken WF. Start2Fold: a database of hydrogen/deuterium exchange data on protein folding and stability. Nucleic Acids Res 2015; 44:D429-34. [PMID: 26582925 PMCID: PMC4702845 DOI: 10.1093/nar/gkv1185] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 08/15/2015] [Accepted: 10/24/2015] [Indexed: 01/02/2023] Open
Abstract
Proteins fulfil a wide range of tasks in cells; understanding how they fold into complex three-dimensional (3D) structures and how these structures remain stable while retaining sufficient dynamics for functionality is essential for the interpretation of overall protein behaviour. Since the 1950s, solvent exchange-based methods have been the most powerful experimental means to obtain information on the folding and stability of proteins. Considerable expertise and care were required to obtain the resulting datasets, which, despite their importance and intrinsic value, have never been collected, curated and classified. Start2Fold is an openly accessible database (http://start2fold.eu) of carefully curated hydrogen/deuterium exchange (HDX) data extracted from the literature that is open for new submissions from the community. The database entries contain (i) information on the proteins investigated and the underlying experimental procedures and (ii) the classification of the residues based on their exchange protection levels, also allowing for the instant visualization of the relevant residue groups on the 3D structures of the corresponding proteins. By providing a clear hierarchical framework for the easy sharing, comparison and (re-)interpretation of HDX data, Start2Fold intends to promote a better understanding of how the protein sequence encodes folding and structure as well as the development of new computational methods predicting protein folding and stability.
Affiliation(s)
- Rita Pancsa
- Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels 1050, Belgium; Structural Biology Research Center (IB), VIB, Brussels 1050, Belgium
| | - Mihaly Varadi
- Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels 1050, Belgium; Structural Biology Research Center (IB), VIB, Brussels 1050, Belgium
| | - Peter Tompa
- Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels 1050, Belgium; Structural Biology Research Center (IB), VIB, Brussels 1050, Belgium; Interuniversity Institute of Bioinformatics in Brussels (IB), ULB-VUB, Brussels 1050, Belgium; Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest 1113, Hungary
| | - Wim F Vranken
- Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels 1050, Belgium; Structural Biology Research Center (IB), VIB, Brussels 1050, Belgium; Interuniversity Institute of Bioinformatics in Brussels (IB), ULB-VUB, Brussels 1050, Belgium
| |
|
26
|
Gutmanas A, Adams PD, Bardiaux B, Berman HM, Case DA, Fogh RH, Güntert P, Hendrickx PMS, Herrmann T, Kleywegt GJ, Kobayashi N, Lange OF, Markley JL, Montelione GT, Nilges M, Ragan TJ, Schwieters CD, Tejero R, Ulrich EL, Velankar S, Vranken WF, Wedell JR, Westbrook J, Wishart DS, Vuister GW. NMR Exchange Format: a unified and open standard for representation of NMR restraint data. Nat Struct Mol Biol 2015; 22:433-4. [PMID: 26036565 PMCID: PMC4546829 DOI: 10.1038/nsmb.3041] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Indexed: 11/09/2022]
Affiliation(s)
- Aleksandras Gutmanas
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Paul D Adams
- Physical Biosciences Division, Lawrence Berkeley Laboratory, Berkeley, California, USA
| | - Benjamin Bardiaux
- [1] Département de Biologie Structurale et Chimie, Unité de Bioinformatique Structurale, Institut Pasteur, Paris, France. [2] Unité Mixte de Recherche 3528, Centre National de la Recherche Scientifique, Paris, France
| | - Helen M Berman
- Department of Chemistry and Chemical Biology, Center for Integrative Proteomics Research, Rutgers, the State University of New Jersey, Piscataway, New Jersey, USA
| | - David A Case
- Department of Chemistry and Chemical Biology, Center for Integrative Proteomics Research, Rutgers, the State University of New Jersey, Piscataway, New Jersey, USA
| | - Rasmus H Fogh
- Department of Biochemistry, University of Leicester, Leicester, UK
| | - Peter Güntert
- [1] Institute of Biophysical Chemistry, Frankfurt Institute of Advanced Studies, Goethe University Frankfurt am Main, Frankfurt am Main, Germany. [2] Graduate School of Science and Engineering, Tokyo Metropolitan University, Tokyo, Japan. [3] Physical Chemistry, Eidgenössische Technische Hochschule (ETH) Zürich, Zürich, Switzerland
| | - Pieter M S Hendrickx
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Torsten Herrmann
- [1] Centre de Résonance Magnétique Nucléaire à Très Hauts Champs, Ecole Normale Supérieure de Lyon, Villeurbanne, France. [2] Institut des Sciences Analytiques, Unité Mixte de Recherche 5280, Centre National de la Recherche Scientifique, Villeurbanne, France
| | - Gerard J Kleywegt
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | | | - Oliver F Lange
- Biomolecular NMR, Munich Center for Integrated Protein Science, Department Chemie, Technische Universität München, Garching, Germany
| | - John L Markley
- Department of Biochemistry, University of Wisconsin-Madison, Madison, Wisconsin, USA
| | - Gaetano T Montelione
- [1] Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, Rutgers, the State University of New Jersey, Piscataway, New Jersey, USA. [2] Department of Biochemistry and Molecular Biology, Robert Wood Johnson Medical School, Rutgers, the State University of New Jersey, Piscataway, New Jersey, USA
| | - Michael Nilges
- [1] Département de Biologie Structurale et Chimie, Unité de Bioinformatique Structurale, Institut Pasteur, Paris, France. [2] Unité Mixte de Recherche 3528, Centre National de la Recherche Scientifique, Paris, France
| | - Timothy J Ragan
- Department of Biochemistry, University of Leicester, Leicester, UK
| | - Charles D Schwieters
- Division of Computational Bioscience, Center for Information Technology, National Institutes of Health, Bethesda, Maryland, USA
| | - Roberto Tejero
- Departamento de Química Física, Universidad de Valencia, Valencia, Spain
| | - Eldon L Ulrich
- Department of Biochemistry, University of Wisconsin-Madison, Madison, Wisconsin, USA
| | - Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Wim F Vranken
- [1] Structural Biology Research Centre, Vlaams Instituut voor Biotechnologie, Brussels, Belgium. [2] Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium. [3] Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, Brussels, Belgium
| | - Jonathan R Wedell
- Department of Biochemistry, University of Wisconsin-Madison, Madison, Wisconsin, USA
| | - John Westbrook
- Department of Chemistry and Chemical Biology, Center for Integrative Proteomics Research, Rutgers, the State University of New Jersey, Piscataway, New Jersey, USA
| | - David S Wishart
- [1] Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada. [2] Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
| | | |
|
27
|
Raimondi D, Orlando G, Vranken WF. An Evolutionary View on Disulfide Bond Connectivities Prediction Using Phylogenetic Trees and a Simple Cysteine Mutation Model. PLoS One 2015; 10:e0131792. [PMID: 26161671 PMCID: PMC4498770 DOI: 10.1371/journal.pone.0131792] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 03/13/2015] [Accepted: 06/07/2015] [Indexed: 01/09/2023] Open
Abstract
Disulfide bonds are crucial for many structural and functional aspects of proteins. They have a stabilizing role during folding, can regulate enzymatic activity and can trigger allosteric changes in the protein structure. Moreover, knowledge of the topology of the disulfide connectivity can be relevant in genomic annotation tasks and can provide long-range constraints for ab initio protein structure predictors. In this paper we describe PhyloCys, a novel unsupervised predictor of disulfide bond connectivity from known cysteine oxidation states. For each query protein, PhyloCys retrieves and aligns homologs with HHblits and builds a phylogenetic tree using ClustalW. A simplified model of cysteine co-evolution is then applied to the tree in order to hypothesize the presence of oxidized cysteines in the inner nodes of the tree, which represent ancestral protein sequences. The tree is then traversed from the leaves to the root and the putative disulfide connectivity is inferred by observing repeated patterns of tandem mutations between a sequence and its ancestors. A final correction is applied using the Edmonds-Gabow maximum weight perfect matching algorithm. The evolutionary approach applied in PhyloCys results in disulfide bond predictions equivalent to Sephiroth, another approach that takes whole sequence information into account, and is 26-29% better than state-of-the-art methods based on cysteine covariance patterns in multiple sequence alignments, while requiring one order of magnitude fewer homologous sequences (10^3 instead of 10^4), thus extending its range of applicability. The software described in this article and the datasets used are available at http://ibsquare.be/phylocys.
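The final matching step mentioned in this abstract can be illustrated with a small sketch. It is not the PhyloCys implementation: it replaces the Edmonds-Gabow algorithm with an exhaustive search that is only practical for a handful of cysteines, and the function name `best_perfect_matching`, the residue positions, and the pair scores are all hypothetical values chosen for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

Pair = Tuple[int, int]


def best_perfect_matching(
    cysteines: List[int],
    weights: Dict[FrozenSet[int], float],
) -> Tuple[float, List[Pair]]:
    """Exhaustively find the maximum weight perfect matching.

    `cysteines` holds residue positions (an even number of them) and
    `weights` maps an unordered pair of positions to a pairing score.
    Pairs missing from `weights` score 0.0. This brute-force search is
    a stand-in for Edmonds-Gabow and grows factorially with input size.
    """
    if len(cysteines) % 2 != 0:
        raise ValueError("a perfect matching needs an even number of cysteines")
    if not cysteines:
        return 0.0, []

    first, rest = cysteines[0], cysteines[1:]
    best_score, best_pairs = float("-inf"), []
    # Pair the first cysteine with every possible partner, then solve the rest.
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        score, pairs = best_perfect_matching(remaining, weights)
        score += weights.get(frozenset((first, partner)), 0.0)
        if score > best_score:
            best_score, best_pairs = score, [(first, partner)] + pairs
    return best_score, best_pairs


# Hypothetical pair scores for four cysteines (positions are made up).
example_weights = {
    frozenset((5, 12)): 0.9,
    frozenset((30, 41)): 0.1,
    frozenset((5, 30)): 0.8,
    frozenset((12, 41)): 0.7,
    frozenset((5, 41)): 0.2,
    frozenset((12, 30)): 0.3,
}
total, bonds = best_perfect_matching([5, 12, 30, 41], example_weights)
print(bonds)
```

In this toy input the single highest-scoring pair (5, 12) is not part of the best overall assignment, which is why a global matching step can correct a greedy pair-by-pair choice.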
Affiliation(s)
- Daniele Raimondi
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Department of Structural Biology, VIB, Brussels, Belgium
- Machine Learning Group, ULB, Brussels, Belgium
| | - Gabriele Orlando
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Department of Structural Biology, VIB, Brussels, Belgium
- Machine Learning Group, ULB, Brussels, Belgium
| | - Wim F. Vranken
- Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Department of Structural Biology, VIB, Brussels, Belgium
| |
|
28
|
Abstract
NMR is a well-established method to characterize the structure and dynamics of biomolecules in solution. High-quality structures can now be produced thanks to both experimental advances and computational developments that incorporate new NMR parameters and improved protocols and force fields in the structure calculation and refinement process. In this chapter, we give a short overview of the various types of NMR data that can provide structural information, and then focus on the structure calculation methodology itself. We discuss and illustrate with tutorial examples "classical" structure calculation, refinement, and structure validation approaches.
Affiliation(s)
- Wim F Vranken
- Department of Structural Biology, VIB Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, B-1050, Brussels, Belgium
| | | | | |
|
29
|
Raimondi D, Orlando G, Vranken WF. Clustering-based model of cysteine co-evolution improves disulfide bond connectivity prediction and reduces homologous sequence requirements. Bioinformatics 2014; 31:1219-25. [DOI: 10.1093/bioinformatics/btu794] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 06/18/2014] [Accepted: 11/18/2014] [Indexed: 12/23/2022] Open
|
30
|
Vranken WF. NMR structure validation in relation to dynamics and structure determination. Prog Nucl Magn Reson Spectrosc 2014; 82:27-38. [PMID: 25444697 DOI: 10.1016/j.pnmrs.2014.08.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 07/01/2014] [Revised: 08/14/2014] [Accepted: 08/14/2014] [Indexed: 06/04/2023]
Abstract
NMR spectroscopy is a key technique for understanding the behaviour of proteins, especially highly dynamic proteins that adopt multiple conformations in solution. Overall, protein structures determined from NMR spectroscopy data constitute just over 10% of the Protein Data Bank archive. This review covers the validation of these NMR protein structures, but rather than describing currently available methodology, it focuses on concepts that are important for understanding where and how validation is most relevant. First, the inherent characteristics of the protein under study have an influence on quality and quantity of the distinct types of data that can be acquired from NMR experiments. Second, these NMR data are necessarily transformed into a model for use in a structure calculation protocol, and the protein structures that result from this reflect the types of NMR data used as well as the protein characteristics. The validation of NMR protein structures should therefore take account, wherever possible, of the inherent behavioural characteristics of the protein, the types of available NMR data, and the calculation protocol. These concepts are discussed in the context of 'knowledge based' and 'model versus data' validation, with suggestions for questions to ask and different validation categories to consider. The principal aim of this review is to stimulate discussion and to help the reader understand the relationships between the above elements in order to make informed decisions on which validation approaches are the most relevant in particular cases.
Affiliation(s)
- Wim F Vranken
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium; Department of Structural Biology, VIB, 1050 Brussels, Belgium; Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, La Plaine Campus, Triomflaan, BC Building, 6th Floor, CP 263, 1050 Brussels, Belgium.
| |
|
31
|
Cilia E, Pancsa R, Tompa P, Lenaerts T, Vranken WF. From protein sequence to dynamics and disorder with DynaMine. Nat Commun 2014; 4:2741. [PMID: 24225580 DOI: 10.1038/ncomms3741] [Citation(s) in RCA: 109] [Impact Index Per Article: 10.9] [Received: 05/13/2013] [Accepted: 10/10/2013] [Indexed: 11/09/2022] Open
Abstract
Protein function and dynamics are closely related; however, accurate dynamics information is difficult to obtain. Here, based on a carefully assembled data set derived from experimental data for proteins in solution, we quantify backbone dynamics properties on the amino-acid level and develop DynaMine, a fast, high-quality predictor of protein backbone dynamics. DynaMine uses only protein sequence information as input and shows great potential in distinguishing regions of different structural organization, such as folded domains, disordered linkers, molten globules and pre-structured binding motifs of different sizes. It also identifies disordered regions within proteins with an accuracy comparable to the most sophisticated existing predictors, without depending on prior disorder knowledge or three-dimensional structural information. DynaMine provides molecular biologists with an important new method that grasps the dynamical characteristics of any protein of interest, as we show here for human p53 and E1A from human adenovirus 5.
Affiliation(s)
- Elisa Cilia
- [1] MLG, Département d'Informatique, Université Libre de Bruxelles, Boulevard du Triomphe, CP 212, 1050 Brussels, Belgium [2] Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, La Plaine Campus, Triomflaan, BC building, 6th floor, CP 263, 1050 Brussels, Belgium
| | | | | | | | | |
|
32
|
Sterckx YGJ, Volkov AN, Vranken WF, Kragelj J, Jensen MR, Buts L, Garcia-Pino A, Jové T, Van Melderen L, Blackledge M, van Nuland NAJ, Loris R. Small-angle X-ray scattering- and nuclear magnetic resonance-derived conformational ensemble of the highly flexible antitoxin PaaA2. Structure 2014; 22:854-65. [PMID: 24768114 DOI: 10.1016/j.str.2014.03.012] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Received: 11/09/2012] [Revised: 03/14/2014] [Accepted: 03/15/2014] [Indexed: 11/26/2022]
Abstract
Antitoxins from prokaryotic type II toxin-antitoxin modules are characterized by a high degree of intrinsic disorder. The description of such highly flexible proteins is challenging because they cannot be represented by a single structure. Here, we present a combination of SAXS and NMR data to describe the conformational ensemble of the PaaA2 antitoxin from the human pathogen E. coli O157. The method encompasses the use of SAXS data to filter ensembles out of a pool of conformers generated by a custom NMR structure calculation protocol and the subsequent refinement by a block jackknife procedure. The final ensemble obtained through the method is validated by an established residual dipolar coupling analysis. We show that the conformational ensemble of PaaA2 is highly compact and that the protein exists in solution as two preformed helices, connected by a flexible linker, that probably act as molecular recognition elements for toxin inhibition.
Affiliation(s)
- Yann G J Sterckx
- Structural Biology Brussels, Department of Biotechnology, Vrije Universiteit Brussel, Pleinlaan 2, B-1050 Brussels, Belgium; Molecular Recognition Unit and Jean Jeener NMR Centre, Structural Biology Research Center, VIB, Pleinlaan 2, B-1050 Brussels, Belgium
| | - Alexander N Volkov
- Structural Biology Brussels, Department of Biotechnology, Vrije Universiteit Brussel, Pleinlaan 2, B-1050 Brussels, Belgium; Molecular Recognition Unit and Jean Jeener NMR Centre, Structural Biology Research Center, VIB, Pleinlaan 2, B-1050 Brussels, Belgium
| | - Wim F Vranken
- Structural Biology Brussels, Department of Biotechnology, Vrije Universiteit Brussel, Pleinlaan 2, B-1050 Brussels, Belgium; Molecular Recognition Unit and Jean Jeener NMR Centre, Structural Biology Research Center, VIB, Pleinlaan 2, B-1050 Brussels, Belgium
| | - Jaka Kragelj
- Protein Dynamics and Flexibility, Institut de Biologie Structurale Jean-Pierre Ebel CNRS-CEA-UJF UMR 5075, 41 Rue Jules Horowitz, 38027 Grenoble Cedex, France
| | - Malene Ringkjøbing Jensen
- Protein Dynamics and Flexibility, Institut de Biologie Structurale Jean-Pierre Ebel CNRS-CEA-UJF UMR 5075, 41 Rue Jules Horowitz, 38027 Grenoble Cedex, France
| | - Lieven Buts
- Structural Biology Brussels, Department of Biotechnology, Vrije Universiteit Brussel, Pleinlaan 2, B-1050 Brussels, Belgium; Molecular Recognition Unit and Jean Jeener NMR Centre, Structural Biology Research Center, VIB, Pleinlaan 2, B-1050 Brussels, Belgium
| | - Abel Garcia-Pino
- Structural Biology Brussels, Department of Biotechnology, Vrije Universiteit Brussel, Pleinlaan 2, B-1050 Brussels, Belgium; Molecular Recognition Unit and Jean Jeener NMR Centre, Structural Biology Research Center, VIB, Pleinlaan 2, B-1050 Brussels, Belgium
| | - Thomas Jové
- Laboratoire de Génétique et Physiologie Bactérienne, Institut de Biologie et de Médecine Moléculaires Faculté des Sciences, Université Libre de Bruxelles, 12 Rue des Professeurs Jeener et Brachet, B-6041 Gosselies, Belgium
| | - Laurence Van Melderen
- Laboratoire de Génétique et Physiologie Bactérienne, Institut de Biologie et de Médecine Moléculaires Faculté des Sciences, Université Libre de Bruxelles, 12 Rue des Professeurs Jeener et Brachet, B-6041 Gosselies, Belgium
| | - Martin Blackledge
- Protein Dynamics and Flexibility, Institut de Biologie Structurale Jean-Pierre Ebel CNRS-CEA-UJF UMR 5075, 41 Rue Jules Horowitz, 38027 Grenoble Cedex, France
| | - Nico A J van Nuland
- Structural Biology Brussels, Department of Biotechnology, Vrije Universiteit Brussel, Pleinlaan 2, B-1050 Brussels, Belgium; Molecular Recognition Unit and Jean Jeener NMR Centre, Structural Biology Research Center, VIB, Pleinlaan 2, B-1050 Brussels, Belgium.
| | - Remy Loris
- Structural Biology Brussels, Department of Biotechnology, Vrije Universiteit Brussel, Pleinlaan 2, B-1050 Brussels, Belgium; Molecular Recognition Unit and Jean Jeener NMR Centre, Structural Biology Research Center, VIB, Pleinlaan 2, B-1050 Brussels, Belgium.
|
33
|
Abstract
Protein dynamics are important for understanding protein function. Unfortunately, accurate protein dynamics information is difficult to obtain: here we present the DynaMine webserver, which provides predictions for the fast backbone movements of proteins directly from their amino-acid sequence. DynaMine rapidly produces a profile describing the statistical potential for such movements at residue-level resolution. The predicted values have meaning on an absolute scale and go beyond the traditional binary classification of residues as ordered or disordered, thus allowing for direct dynamics comparisons between protein regions. Through this webserver, we provide molecular biologists with an efficient and easy to use tool for predicting the dynamical characteristics of any protein of interest, even in the absence of experimental observations. The prediction results are visualized and can be directly downloaded. The DynaMine webserver, including instructive examples describing the meaning of the profiles, is available at http://dynamine.ibsquare.be.
Affiliation(s)
- Elisa Cilia
- MLG, Computer Science Department, Université Libre de Bruxelles (ULB), Brussels, Belgium; Interuniversity Institute of Bioinformatics in Brussels (IB), ULB-VUB, Brussels, Belgium
| | - Rita Pancsa
- Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels, Belgium; Department of Structural Biology, VIB, Brussels, Belgium
| | - Peter Tompa
- Interuniversity Institute of Bioinformatics in Brussels (IB), ULB-VUB, Brussels, Belgium; Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels, Belgium; Department of Structural Biology, VIB, Brussels, Belgium; Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary
| | - Tom Lenaerts
- MLG, Computer Science Department, Université Libre de Bruxelles (ULB), Brussels, Belgium; Interuniversity Institute of Bioinformatics in Brussels (IB), ULB-VUB, Brussels, Belgium; AI-Lab, Computer Science Department, Vrije Universiteit Brussel, Brussels, Belgium
| | - Wim F Vranken
- Interuniversity Institute of Bioinformatics in Brussels (IB), ULB-VUB, Brussels, Belgium; Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels, Belgium; Department of Structural Biology, VIB, Brussels, Belgium
|
34
|
Montelione GT, Nilges M, Bax A, Güntert P, Herrmann T, Richardson JS, Schwieters CD, Vranken WF, Vuister GW, Wishart DS, Berman HM, Kleywegt GJ, Markley JL. Recommendations of the wwPDB NMR Validation Task Force. Structure 2014; 21:1563-70. [PMID: 24010715 DOI: 10.1016/j.str.2013.07.021] [Citation(s) in RCA: 124] [Impact Index Per Article: 12.4] [Received: 02/19/2013] [Revised: 07/19/2013] [Accepted: 07/29/2013] [Indexed: 11/25/2022]
Abstract
As methods for analysis of biomolecular structure and dynamics using nuclear magnetic resonance spectroscopy (NMR) continue to advance, the resulting 3D structures, chemical shifts, and other NMR data are broadly impacting biology, chemistry, and medicine. Structure model assessment is a critical area of NMR methods development, and is an essential component of the process of making these structures accessible and useful to the wider scientific community. For these reasons, the Worldwide Protein Data Bank (wwPDB) has convened an NMR Validation Task Force (NMR-VTF) to work with wwPDB partners in developing metrics and policies for biomolecular NMR data harvesting, structure representation, and structure quality assessment. This paper summarizes the recommendations of the NMR-VTF, and lays the groundwork for future work in developing standards and metrics for biomolecular NMR structure quality assessment.
Affiliation(s)
- Gaetano T Montelione
- Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Biochemistry and Molecular Biology, Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.
|
35
|
Vanwetswinkel S, Volkov AN, Sterckx YGJ, Garcia-Pino A, Buts L, Vranken WF, Bouckaert J, Roy R, Wyns L, van Nuland NAJ. Study of the structural and dynamic effects in the FimH adhesin upon α-d-heptyl mannose binding. J Med Chem 2014; 57:1416-27. [PMID: 24476493 DOI: 10.1021/jm401666c] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.7] [Indexed: 11/30/2022]
Abstract
Uropathogenic Escherichia coli cause urinary tract infections by adhering to mannosylated receptors on the human urothelium via the carbohydrate-binding domain of the FimH adhesin (FimHL). Numerous α-d-mannopyranosides, including α-d-heptyl mannose (HM), inhibit this process by interacting with FimHL. To establish the molecular basis of the high-affinity HM binding, we solved the solution structure of the apo form and the crystal structure of the FimHL-HM complex. NMR relaxation analysis revealed that protein dynamics were not affected by the sugar binding, yet HM addition promoted protein dimerization, which was further confirmed by small-angle X-ray scattering. Finally, to address the role of Y48, part of the "tyrosine gate" believed to govern the affinity and specificity of mannoside binding, we characterized the FimHL Y48A mutant, whose conformational, dynamical, and HM binding properties were found to be very similar to those of the wild-type protein.
Affiliation(s)
- Sophie Vanwetswinkel
- Jean Jeener NMR Centre, Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium
|
36
|
van der Schot G, Zhang Z, Vernon R, Shen Y, Vranken WF, Baker D, Bonvin AMJJ, Lange OF. Improving 3D structure prediction from chemical shift data. J Biomol NMR 2013; 57:27-35. [PMID: 23912841 DOI: 10.1007/s10858-013-9762-6] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 05/23/2013] [Accepted: 07/16/2013] [Indexed: 05/22/2023]
Abstract
We report advances in the calculation of protein structures from chemical shift nuclear magnetic resonance data alone. Our previously developed method, CS-Rosetta, assembles structures from a library of short protein fragments picked from a large library of protein structures using chemical shifts and sequence information. Here we demonstrate that combination of a new and improved fragment picker and the iterative sampling algorithm RASREC yield significant improvements in convergence and accuracy. Moreover, we introduce improved criteria for assessing the accuracy of the models produced by the method. The method was tested on 39 proteins in the 50-100 residue size range and yields reliable structures in 70 % of the cases. All structures that passed the reliability filter were accurate (<2 Å RMSD from the reference).
Affiliation(s)
- Gijs van der Schot
- Computational Structural Biology, Bijvoet Center for Biomolecular Research, Faculty of Science-Chemistry, Utrecht University, Padualaan 8, 3584 CH, Utrecht, The Netherlands
|
37
|
Doreleijers JF, Sousa da Silva AW, Krieger E, Nabuurs SB, Spronk CAEM, Stevens TJ, Vranken WF, Vriend G, Vuister GW. CING: an integrated residue-based structure validation program suite. J Biomol NMR 2012; 54:267-83. [PMID: 22986687 PMCID: PMC3483101 DOI: 10.1007/s10858-012-9669-7] [Citation(s) in RCA: 77] [Impact Index Per Article: 6.4] [Received: 07/02/2012] [Accepted: 08/31/2012] [Indexed: 05/03/2023]
Abstract
We present a suite of programs, named CING for Common Interface for NMR Structure Generation, that provides for a residue-based, integrated validation of the structural NMR ensemble in conjunction with the experimental restraints and other input data. External validation programs and new internal validation routines compare the NMR-derived models with empirical data, measured chemical shifts, and distance and dihedral restraints, and the results are visualized in a dynamic Web 2.0 report. A red-orange-green score is used for residues and restraints to direct the user to those critiques that warrant further investigation. Overall green scores below ~20% accompanied by red scores over ~50% are strongly indicative of poorly modelled structures. The publicly accessible, secure iCing webserver (https://nmr.le.ac.uk) allows individual users to upload the NMR data and run a CING validation analysis.
Affiliation(s)
- Jurgen F. Doreleijers
- CMBI, Radboud University Medical Centre, Geert Grooteplein 26-28, 6525 GA Nijmegen, The Netherlands
| | | | - Elmar Krieger
- YASARA Biosciences GmbH, Wagramer Strasse 25/3/45, 1220 Vienna, Austria
| | - Sander B. Nabuurs
- CMBI, Radboud University Medical Centre, Geert Grooteplein 26-28, 6525 GA Nijmegen, The Netherlands
| | | | - Tim J. Stevens
- Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge, CB2 1GA UK
| | - Wim F. Vranken
- Department of Structural Biology, VIB, Building E, 4th Floor, Pleinlaan 2, 1050 Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, Building E, 4th Floor, Pleinlaan 2, 1050 Brussels, Belgium
| | - Gert Vriend
- CMBI, Radboud University Medical Centre, Geert Grooteplein 26-28, 6525 GA Nijmegen, The Netherlands
| | - Geerten W. Vuister
- Department of Biochemistry, University of Leicester, Henry Wellcome Building, Lancaster Road, Leicester, LE1 9HN UK
|
38
|
Abstract
Background: ACPYPE (or AnteChamber PYthon Parser interfacE) is a wrapper script around the ANTECHAMBER software that simplifies the generation of small molecule topologies and parameters for a variety of molecular dynamics programmes like GROMACS, CHARMM and CNS. It is written in the Python programming language and was developed as a tool for interfacing with other Python based applications such as the CCPN software suite (for NMR data analysis) and ARIA (for structure calculations from NMR data). ACPYPE is open source code, under GNU GPL v3, and is available as a stand-alone application at http://www.ccpn.ac.uk/acpype and as a web portal application at http://webapps.ccpn.ac.uk/acpype. Findings: We verified the topologies generated by ACPYPE in three ways: by comparing with default AMBER topologies for standard amino acids; by generating and verifying topologies for a large set of ligands from the PDB; and by recalculating the structures for 5 protein–ligand complexes from the PDB. Conclusions: ACPYPE is a tool that simplifies the automatic generation of topology and parameters in different formats for different molecular mechanics programmes, including calculation of partial charges, while being object oriented for integration with other applications.
Affiliation(s)
- Alan W Sousa da Silva
- Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge, CB2 1GA, UK.
|
39
|
Rosato A, Aramini JM, Arrowsmith C, Bagaria A, Baker D, Cavalli A, Doreleijers JF, Eletsky A, Giachetti A, Guerry P, Gutmanas A, Güntert P, He Y, Herrmann T, Huang YJ, Jaravine V, Jonker HRA, Kennedy MA, Lange OF, Liu G, Malliavin TE, Mani R, Mao B, Montelione GT, Nilges M, Rossi P, van der Schot G, Schwalbe H, Szyperski TA, Vendruscolo M, Vernon R, Vranken WF, Vries SD, Vuister GW, Wu B, Yang Y, Bonvin AMJJ. Blind testing of routine, fully automated determination of protein structures from NMR data. Structure 2012; 20:227-36. [PMID: 22325772 DOI: 10.1016/j.str.2012.01.002] [Citation(s) in RCA: 70] [Impact Index Per Article: 5.8] [Received: 10/18/2011] [Revised: 12/19/2011] [Accepted: 01/03/2012] [Indexed: 11/16/2022]
Abstract
The protocols currently used for protein structure determination by nuclear magnetic resonance (NMR) depend on the determination of a large number of upper distance limits for proton-proton pairs. Typically, this task is performed manually by an experienced researcher rather than automatically by using a specific computer program. To assess whether it is indeed possible to generate in a fully automated manner NMR structures adequate for deposition in the Protein Data Bank, we gathered 10 experimental data sets with unassigned nuclear Overhauser effect spectroscopy (NOESY) peak lists for various proteins of unknown structure, computed structures for each of them using different, fully automatic programs, and compared the results to each other and to the manually solved reference structures that were not available at the time the data were provided. This constitutes a stringent "blind" assessment similar to the CASP and CAPRI initiatives. This study demonstrates the feasibility of routine, fully automated protein structure determination by NMR.
Affiliation(s)
- Antonio Rosato
- Magnetic Resonance Center, University of Florence, 50019 Sesto Fiorentino, Italy.
|
40
|
Affiliation(s)
- Aleksandr B. Sahakyan
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
| | - Andrea Cavalli
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
| | - Wim F. Vranken
- Department of Structural Biology, VIB and Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium
| | - Michele Vendruscolo
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
|
41
|
Camilloni C, De Simone A, Vranken WF, Vendruscolo M. Determination of secondary structure populations in disordered states of proteins using nuclear magnetic resonance chemical shifts. Biochemistry 2012; 51:2224-31. [PMID: 22360139 DOI: 10.1021/bi3001825] [Citation(s) in RCA: 274] [Impact Index Per Article: 22.8] [Indexed: 01/04/2023]
Abstract
One of the major open challenges in structural biology is to achieve effective descriptions of disordered states of proteins. This problem is difficult because these states are conformationally highly heterogeneous and cannot be represented as single structures, and therefore it is necessary to characterize their conformational properties in terms of probability distributions. Here we show that it is possible to obtain highly quantitative information about particularly important types of probability distributions, the populations of secondary structure elements (α-helix, β-strand, random coil, and polyproline II), by using the information provided by backbone chemical shifts. The application of this approach to mammalian prions indicates that for these proteins a key role in molecular recognition is played by disordered regions characterized by highly conserved polyproline II populations. We also determine the secondary structure populations of a range of other disordered proteins that are medically relevant, including p53, α-synuclein, and the Aβ peptide, as well as an oligomeric form of αB-crystallin. Because chemical shifts are the nuclear magnetic resonance parameters that can be measured under the widest variety of conditions, our approach can be used to obtain detailed information about secondary structure populations for a vast range of different protein states.
Affiliation(s)
- Carlo Camilloni
- Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, UK
|
42
|
Doreleijers JF, Vranken WF, Schulte C, Markley JL, Ulrich EL, Vriend G, Vuister GW. NRG-CING: integrated validation reports of remediated experimental biomolecular NMR data and coordinates in wwPDB. Nucleic Acids Res 2011; 40:D519-24. [PMID: 22139937 PMCID: PMC3245154 DOI: 10.1093/nar/gkr1134] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Indexed: 01/25/2023] Open
Abstract
For many macromolecular NMR ensembles from the Protein Data Bank (PDB) the experiment-based restraint lists are available, while other experimental data, mainly chemical shift values, are often available from the BioMagResBank. The accuracy and precision of the coordinates in these macromolecular NMR ensembles can be improved by recalculation using the available experimental data and present-day software. Such efforts, however, generally fail on half of all NMR ensembles due to the syntactic and semantic heterogeneity of the underlying data and the wide variety of formats used for their deposition. We have combined the remediated restraint information from our NMR Restraints Grid (NRG) database with available chemical shifts from the BioMagResBank and the Common Interface for NMR structure Generation (CING) structure validation reports into the weekly updated NRG-CING database (http://nmr.cmbi.ru.nl/NRG-CING). Eleven programs have been included in the NRG-CING production pipeline to arrive at validation reports that list for each entry the potential inconsistencies between the coordinates and the available experimental NMR data. The longitudinal validation of these data in a publicly available relational database yields a set of indicators that can be used to judge the quality of every macromolecular structure solved with NMR. The remediated NMR experimental data sets and validation reports are freely available online.
Affiliation(s)
- Jurgen F Doreleijers
- IMM, Radboud University Nijmegen, Geert Grooteplein 26-28, 6525 GA Nijmegen, The Netherlands.
|
43
|
Velankar S, Alhroub Y, Best C, Caboche S, Conroy MJ, Dana JM, Fernandez Montecelo MA, van Ginkel G, Golovin A, Gore SP, Gutmanas A, Haslam P, Hendrickx PMS, Heuson E, Hirshberg M, John M, Lagerstedt I, Mir S, Newman LE, Oldfield TJ, Patwardhan A, Rinaldi L, Sahni G, Sanz-García E, Sen S, Slowley R, Suarez-Uruena A, Swaminathan GJ, Symmons MF, Vranken WF, Wainwright M, Kleywegt GJ. PDBe: Protein Data Bank in Europe. Nucleic Acids Res 2011; 40:D445-52. [PMID: 22110033 PMCID: PMC3245096 DOI: 10.1093/nar/gkr998] [Citation(s) in RCA: 74] [Impact Index Per Article: 5.7] [Indexed: 12/25/2022] Open
Abstract
The Protein Data Bank in Europe (PDBe; pdbe.org) is a partner in the Worldwide PDB organization (wwPDB; wwpdb.org) and as such actively involved in managing the single global archive of biomacromolecular structure data, the PDB. In addition, PDBe develops tools, services and resources to make structure-related data more accessible to the biomedical community. Here we describe recently developed, extended or improved services, including an animated structure-presentation widget (PDBportfolio), a widget to graphically display the coverage of any UniProt sequence in the PDB (UniPDB), chemistry- and taxonomy-based PDB-archive browsers (PDBeXplore), and a tool for interactive visualization of NMR structures, corresponding experimental data as well as validation and analysis results (Vivaldi).
Affiliation(s)
- S Velankar
- Protein Data Bank in Europe, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
44
|
Sahakyan AB, Vranken WF, Cavalli A, Vendruscolo M. Using Side-Chain Aromatic Proton Chemical Shifts for a Quantitative Analysis of Protein Structures. Angew Chem Int Ed Engl 2011. [DOI: 10.1002/ange.201101641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
|
45
|
Sahakyan AB, Vranken WF, Cavalli A, Vendruscolo M. Using side-chain aromatic proton chemical shifts for a quantitative analysis of protein structures. Angew Chem Int Ed Engl 2011; 50:9620-3. [PMID: 21887824 DOI: 10.1002/anie.201101641] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 03/07/2011] [Indexed: 12/14/2022]
Affiliation(s)
- Aleksandr B Sahakyan
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK
|
46
|
Sahakyan AB, Vranken WF, Cavalli A, Vendruscolo M. Structure-based prediction of methyl chemical shifts in proteins. J Biomol NMR 2011; 50:331-46. [PMID: 21748266 DOI: 10.1007/s10858-011-9524-2] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.1] [Received: 03/18/2011] [Accepted: 05/17/2011] [Indexed: 05/07/2023]
Abstract
Protein methyl groups have recently been the subject of much attention in NMR spectroscopy because of the opportunities that they provide to obtain information about the structure and dynamics of proteins and protein complexes. With the advent of selective labeling schemes, methyl groups are particularly interesting in the context of chemical shift based protein structure determination, an approach that to date has exploited primarily the mapping between protein structures and backbone chemical shifts. In order to extend the scope of chemical shifts for structure determination, we present here the CH3Shift method of performing structure-based predictions of methyl chemical shifts. The terms considered in the predictions take account of ring current, magnetic anisotropy, electric field, rotameric type, and dihedral angle effects, which are considered in conjunction with polynomial functions of interatomic distances. We show that the CH3Shift method achieves an accuracy in the predictions that ranges from 0.133 to 0.198 ppm for ¹H chemical shifts for Ala, Thr, Val, Leu and Ile methyl groups. We illustrate the use of the method by assessing the accuracy of side-chain structures in structural ensembles representing the dynamics of proteins.
Affiliation(s)
- Aleksandr B Sahakyan
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK
|
47
|
von der Lieth CW, Freire AA, Blank D, Campbell MP, Ceroni A, Damerell DR, Dell A, Dwek RA, Ernst B, Fogh R, Frank M, Geyer H, Geyer R, Harrison MJ, Henrick K, Herget S, Hull WE, Ionides J, Joshi HJ, Kamerling JP, Leeflang BR, Lütteke T, Lundborg M, Maass K, Merry A, Ranzinger R, Rosen J, Royle L, Rudd PM, Schloissnig S, Stenutz R, Vranken WF, Widmalm G, Haslam SM. EUROCarbDB: An open-access platform for glycoinformatics. Glycobiology 2011; 21:493-502. [PMID: 21106561 PMCID: PMC3055595 DOI: 10.1093/glycob/cwq188] [Citation(s) in RCA: 104] [Impact Index Per Article: 8.0] [Received: 08/19/2010] [Revised: 11/03/2010] [Accepted: 11/03/2010] [Indexed: 01/03/2023] Open
Abstract
The EUROCarbDB project is a design study for a technical framework, which provides sophisticated, freely accessible, open-source informatics tools and databases to support glycobiology and glycomic research. EUROCarbDB is a relational database containing glycan structures, their biological context and, when available, primary and interpreted analytical data from high-performance liquid chromatography, mass spectrometry and nuclear magnetic resonance experiments. Database content can be accessed via a web-based user interface. The database is complemented by a suite of glycoinformatics tools, specifically designed to assist the elucidation and submission of glycan structure and experimental data when used in conjunction with contemporary carbohydrate research workflows. All software tools and source code are licensed under the terms of the Lesser General Public License, and publicly contributed structures and data are freely accessible. The public test version of the web interface to the EUROCarbDB can be found at http://www.ebi.ac.uk/eurocarb.
Affiliation(s)
| | - Ana Ardá Freire
- Bijvoet-Center for Biomolecular Research, University of Utrecht, Utrecht, The Netherlands
| | - Dennis Blank
- Institute of Biochemistry, Faculty of Medicine, Justus Liebig University, Giessen, Germany
| | - Matthew P Campbell
- Dublin-Oxford Glycobiology Laboratory, National Institute for Bioprocessing Research and Training (NIBRT), Conway Institute, University College Dublin, Dublin, Ireland
- Department of Biochemistry, Oxford Glycobiology Institute, University of Oxford, UK
| | - Alessio Ceroni
- Division of Molecular Biosciences, Faculty of Natural Sciences, Biochemistry Building, Imperial College London, South Kensington Campus, London SW7 2AZ, UK
| | - David R Damerell
- Division of Molecular Biosciences, Faculty of Natural Sciences, Biochemistry Building, Imperial College London, South Kensington Campus, London SW7 2AZ, UK
| | - Anne Dell
- Division of Molecular Biosciences, Faculty of Natural Sciences, Biochemistry Building, Imperial College London, South Kensington Campus, London SW7 2AZ, UK
| | - Raymond A Dwek
- Department of Biochemistry, Oxford Glycobiology Institute, University of Oxford, UK
| | - Beat Ernst
- Department of Pharmaceutical Science, University of Basel, Basel, Switzerland
| | - Rasmus Fogh
- European Bioinformatics Institute, Hinxton, UK
| | - Martin Frank
- Core Facility, Molecular Structure Analysis, German Cancer Research Center, Heidelberg, Germany
| | - Hildegard Geyer
- Institute of Biochemistry, Faculty of Medicine, Justus Liebig University, Giessen, Germany
| | - Rudolf Geyer
- Institute of Biochemistry, Faculty of Medicine, Justus Liebig University, Giessen, Germany
| | | | - Kim Henrick
- European Bioinformatics Institute, Hinxton, UK
| | - Stefan Herget
- Core Facility, Molecular Structure Analysis, German Cancer Research Center, Heidelberg, Germany
| | - William E Hull
- Core Facility, Molecular Structure Analysis, German Cancer Research Center, Heidelberg, Germany
| | | | - Hiren J Joshi
- Core Facility, Molecular Structure Analysis, German Cancer Research Center, Heidelberg, Germany
- European Bioinformatics Institute, Hinxton, UK
| | - Johannis P Kamerling
- Bijvoet-Center for Biomolecular Research, University of Utrecht, Utrecht, The Netherlands
| | - Bas R Leeflang
- Bijvoet-Center for Biomolecular Research, University of Utrecht, Utrecht, The Netherlands
| | - Thomas Lütteke
- Bijvoet-Center for Biomolecular Research, University of Utrecht, Utrecht, The Netherlands
| | | | - Kai Maass
- Institute of Biochemistry, Faculty of Medicine, Justus Liebig University, Giessen, Germany
| | | | - René Ranzinger
- Core Facility, Molecular Structure Analysis, German Cancer Research Center, Heidelberg, Germany
| | - Jimmy Rosen
- Bijvoet-Center for Biomolecular Research, University of Utrecht, Utrecht, The Netherlands
| | - Louise Royle
- Dublin-Oxford Glycobiology Laboratory, National Institute for Bioprocessing Research and Training (NIBRT), Conway Institute, University College Dublin, Dublin, Ireland
- Department of Biochemistry, Oxford Glycobiology Institute, University of Oxford, UK
| | - Pauline M Rudd
- Dublin-Oxford Glycobiology Laboratory, National Institute for Bioprocessing Research and Training (NIBRT), Conway Institute, University College Dublin, Dublin, Ireland
- Department of Biochemistry, Oxford Glycobiology Institute, University of Oxford, UK
| | - Siegfried Schloissnig
- Core Facility, Molecular Structure Analysis, German Cancer Research Center, Heidelberg, Germany
| | - Roland Stenutz
- Organic Chemistry, Stockholm University, Stockholm, Sweden
| | | | - Göran Widmalm
- Organic Chemistry, Stockholm University, Stockholm, Sweden
| | - Stuart M Haslam
- Division of Molecular Biosciences, Faculty of Natural Sciences, Biochemistry Building, Imperial College London, South Kensington Campus, London SW7 2AZ, UK
|
48
|
Bernard A, Vranken WF, Bardiaux B, Nilges M, Malliavin TE. Bayesian estimation of NMR restraint potential and weight: a validation on a representative set of protein structures. Proteins 2011; 79:1525-37. [PMID: 21365680 DOI: 10.1002/prot.22980] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 09/16/2010] [Revised: 12/15/2010] [Accepted: 12/22/2010] [Indexed: 11/07/2022]
Abstract
The classical procedure for nuclear magnetic resonance structure calculation allocates empirical distance ranges and uses historical values for weighting factors. However, Bayesian analysis suggests that there are more optimal choices for potential shape (bounds-free log-harmonic shape) and restraints weights. We compare the classical protocol with the Bayesian approach for more than 300 protein structures. We analyze the conformation similarity to the corresponding X-ray crystal structure, the distribution of the conformations around their average, and independent validation criteria. On average, the log-harmonic potential reduces the difference to the X-ray crystal structure. If the log-harmonic potential is used, the constant weighting tightens the distribution around the average conformation, with respect to the distributions obtained with Bayesian weighting. Conversely, the structure quality is improved by the Bayesian weighting over the classical procedure, whereas constant weighting worsens some criteria. The quality improvement obtained with the log-harmonic potential coupled to Bayesian weighting validates this approach on a representative set of protein structures.
Affiliation(s)
- Aymeric Bernard
- Unité de Bioinformatique Structurale, CNRS URA 2185, Institut Pasteur, 25-28 rue du Dr. Roux, Paris 75724, France
|
49
|
Velankar S, Alhroub Y, Alili A, Best C, Boutselakis HC, Caboche S, Conroy MJ, Dana JM, van Ginkel G, Golovin A, Gore SP, Gutmanas A, Haslam P, Hirshberg M, John M, Lagerstedt I, Mir S, Newman LE, Oldfield TJ, Penkett CJ, Pineda-Castillo J, Rinaldi L, Sahni G, Sawka G, Sen S, Slowley R, Sousa da Silva AW, Suarez-Uruena A, Swaminathan GJ, Symmons MF, Vranken WF, Wainwright M, Kleywegt GJ. PDBe: Protein Data Bank in Europe. Nucleic Acids Res 2010; 39:D402-10. [PMID: 21045060 PMCID: PMC3013808 DOI: 10.1093/nar/gkq985] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.0] [Indexed: 01/11/2023] Open
Abstract
The Protein Data Bank in Europe (PDBe; pdbe.org) is actively involved in managing the international archive of biomacromolecular structure data as one of the partners in the Worldwide Protein Data Bank (wwPDB; wwpdb.org). PDBe also develops new tools to make structural data more widely and more easily available to the biomedical community. PDBe has developed a browser to access and analyze the structural archive using classification systems that are familiar to chemists and biologists. The PDBe web pages that describe individual PDB entries have been enhanced through the introduction of plain-English summary pages and iconic representations of the contents of an entry (PDBprints). In addition, the information available for structures determined by means of NMR spectroscopy has been expanded. Finally, the entire web site has been redesigned to make it substantially easier to use for expert and novice users alike. PDBe works closely with other teams at the European Bioinformatics Institute (EBI) and in the international scientific community to develop new resources with value-added information. The SIFTS initiative is an example of such a collaboration—it provides extensive mapping data between proteins whose structures are available from the PDB and a host of other biomedical databases. SIFTS is widely used by major bioinformatics resources.
Affiliation(s)
- Sameer Velankar
- Protein Data Bank in Europe, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
50
|
Abstract
The public archives containing protein information in the form of NMR chemical shift data at the BioMagResBank (BMRB) and of 3D structure coordinates at the Protein Data Bank are continuously expanding. The quality of the data contained in these archives, however, varies. The main issue for chemical shift values is that they are determined relative to a reference frequency. When this reference frequency is set incorrectly, all related chemical shift values are systematically offset. Such wrongly referenced chemical shift values, as well as other problems such as chemical shift values that are assigned to the wrong atom, are not easily distinguished from correct values and effectively reduce the usefulness of the archive. We describe a new method to correct and validate protein chemical shift values in relation to their 3D structure coordinates. This method classifies atoms using two parameters: the per-atom solvent accessible surface area (as calculated from the coordinates) and the secondary structure of the parent amino acid. Through the use of Gaussian statistics based on a large database of 3220 BMRB entries, we obtain per-entry chemical shift corrections as well as Z scores for the individual chemical shift values. In addition, information on the error of the correction value itself is available, and the method can retain only dependable correction values. We provide an online resource with chemical shift, atom exposure, and secondary structure information for all relevant BMRB entries (http://www.ebi.ac.uk/pdbe/nmr/vasco) and hope this data will aid the development of new chemical shift-based methods in NMR. Proteins 2010. © 2010 Wiley-Liss, Inc.
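The correction-and-validation idea in this abstract can be illustrated with a small sketch. Everything concrete here is an assumption for illustration: the reference table `REFERENCE`, its mean and standard deviation values, the inverse-variance weighting, and the `min_ratio` threshold are hypothetical and are not the statistics or the exact estimator used by the authors. The sketch handles a single nucleus type (carbon), since a referencing offset applies per nucleus.

```python
from math import sqrt

# Hypothetical Gaussian reference statistics (mean, standard deviation, in ppm)
# per atom class: (atom name, secondary structure, solvent exposure).
# Real values would come from a large set of BMRB entries.
REFERENCE = {
    ("CA", "helix", "buried"): (58.0, 1.5),
    ("CA", "sheet", "exposed"): (55.0, 2.0),
    ("CB", "helix", "buried"): (30.0, 2.0),
}


def entry_correction(observations):
    """Estimate a per-entry referencing offset and its standard error.

    observations: list of (atom_class, observed_shift) pairs.
    Uses an inverse-variance weighted mean of the deviations from the
    class means (an assumed estimator, chosen for simplicity).
    """
    total_weight = 0.0
    weighted_deviation = 0.0
    for atom_class, shift in observations:
        mu, sigma = REFERENCE[atom_class]
        weight = 1.0 / sigma**2
        total_weight += weight
        weighted_deviation += weight * (shift - mu)
    offset = weighted_deviation / total_weight
    error = sqrt(1.0 / total_weight)
    return offset, error


def z_scores(observations, offset):
    """Z score of each shift after subtracting the entry-wide offset."""
    return [
        (shift - offset - REFERENCE[atom_class][0]) / REFERENCE[atom_class][1]
        for atom_class, shift in observations
    ]


def is_dependable(offset, error, min_ratio=3.0):
    """Keep a correction only if it clearly exceeds its own uncertainty.

    min_ratio is a hypothetical cutoff, not a value from the paper.
    """
    return abs(offset) > min_ratio * error
```

With shifts that all sit a constant amount above their class means, `entry_correction` recovers that amount as the offset, and the Z scores computed after correction fall near zero; a large Z score after correction would instead flag an individual value, such as a shift assigned to the wrong atom.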
Affiliation(s)
- Wolfgang Rieping
- Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, United Kingdom
|