1
Del Conte A, Camagni GF, Clementel D, Minervini G, Monzon AM, Ferrari C, Piovesan D, Tosatto SCE. RING 4.0: faster residue interaction networks with novel interaction types across over 35,000 different chemical structures. Nucleic Acids Res 2024:gkae337. [PMID: 38686797 DOI: 10.1093/nar/gkae337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2024] [Revised: 04/09/2024] [Accepted: 04/19/2024] [Indexed: 05/02/2024] Open
Abstract
Residue interaction networks (RINs) are a valuable approach for representing contacts in protein structures. RINs have been widely used in various research areas, including the analysis of mutation effects, domain-domain communication, catalytic activity, and molecular dynamics simulations. The RING server is a powerful tool to calculate non-covalent molecular interactions based on geometrical parameters, providing high-quality and reliable results. Here, we introduce RING 4.0, which includes significant enhancements for identifying both covalent and non-covalent bonds in protein structures. It now encompasses seven different interaction types, with the addition of π-hydrogen, halogen bonds and metal ion coordination sites. The definitions of all available bond types have also been refined and RING can now process the complete PDB chemical component dictionary (over 35000 different molecules) which provides atom names and covalent connectivity information for all known ligands. Optimization of the software has improved execution time by an order of magnitude. The RING web server has been redesigned to provide a more engaging and interactive user experience, incorporating new visualization tools. Users can now visualize all types of interactions simultaneously in the structure viewer and network component. The web server, including extensive help and tutorials, is available from URL: https://ring.biocomputingup.it/.
Affiliation(s)
- Alessio Del Conte
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
- Giorgia F Camagni
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
- Damiano Clementel
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
- Carlo Ferrari
  - Department of Information Engineering, University of Padova, Padova, Italy
- Damiano Piovesan
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
2
Quaglia F, Chasapi A, Nugnes MV, Aspromonte MC, Leonardi E, Piovesan D, Tosatto SCE. Best practices for the manual curation of intrinsically disordered proteins in DisProt. Database (Oxford) 2024; 2024:baae009. [PMID: 38507044 PMCID: PMC10953794 DOI: 10.1093/database/baae009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Revised: 12/18/2023] [Accepted: 02/03/2024] [Indexed: 03/22/2024]
Abstract
The DisProt database is a resource containing manually curated data on experimentally validated intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs) from the literature. Developed in 2005, its primary goal was to collect structural and functional information on proteins that lack a fixed three-dimensional structure. Today, DisProt has evolved into a major repository that not only collects experimental data but also contributes to our understanding of the roles of IDPs/IDRs in various biological processes, such as autophagy or the life cycle mechanisms in viruses, and of their involvement in diseases (such as cancer and neurodevelopmental disorders). DisProt offers detailed information on the structural states of IDPs/IDRs, including state transitions, interactions and their functions, all provided as curated annotations. One of the central activities of DisProt is the meticulous curation of experimental data from the literature. For this reason, to ensure that every expert and volunteer curator possesses the requisite knowledge for data evaluation, collection and integration, training courses and curation materials are available. However, biocuration guidelines concur on the importance of developing robust guidelines that not only provide critical information about data consistency but also ensure data acquisition. This guideline aims to provide both biocurators and external users with best practices for manually curating IDPs and IDRs in DisProt. It describes every step of the literature curation process and provides use cases of IDP curation within DisProt. Database URL: https://disprot.org/.
Affiliation(s)
- Federica Quaglia
  - Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR-IBIOM), Via Giovanni Amendola, 122/O, Bari 70126, Italy
  - Department of Biomedical Sciences, University of Padova, Via Ugo Bassi, 58/B, Padova 35131, Italy
- Anastasia Chasapi
  - Biological Computation & Process Laboratory, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas, 6th km Harilaou - Thermis 57001 Thermi, Thessalonica 57001, Greece
- Maria Victoria Nugnes
  - Department of Biomedical Sciences, University of Padova, Via Ugo Bassi, 58/B, Padova 35131, Italy
- Emanuela Leonardi
  - Department of Biomedical Sciences, University of Padova, Via Ugo Bassi, 58/B, Padova 35131, Italy
- Damiano Piovesan
  - Department of Biomedical Sciences, University of Padova, Via Ugo Bassi, 58/B, Padova 35131, Italy
- Silvio C E Tosatto
  - Department of Biomedical Sciences, University of Padova, Via Ugo Bassi, 58/B, Padova 35131, Italy
3
Aspromonte MC, Nugnes MV, Quaglia F, Bouharoua A, Tosatto SCE, Piovesan D. DisProt in 2024: improving function annotation of intrinsically disordered proteins. Nucleic Acids Res 2024; 52:D434-D441. [PMID: 37904585 PMCID: PMC10767923 DOI: 10.1093/nar/gkad928] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Revised: 10/05/2023] [Accepted: 10/10/2023] [Indexed: 11/01/2023] Open
Abstract
DisProt (URL: https://disprot.org) is the gold standard database for intrinsically disordered proteins and regions, providing valuable information about their functions. The latest version of DisProt brings significant advancements, including a broader representation of functions and an enhanced curation process. These improvements aim to increase both the quality of annotations and their coverage at the sequence level. Higher coverage has been achieved by adopting additional evidence codes. Quality of annotations has been improved by systematically applying Minimum Information About Disorder Experiments (MIADE) principles and reporting all the details of the experimental setup that could potentially influence the structural state of a protein. The DisProt database now includes new thematic datasets and has expanded the adoption of Gene Ontology terms, resulting in an extensive functional repertoire which is automatically propagated to UniProtKB. Finally, we show that DisProt's curated annotations strongly correlate with disorder predictions inferred from AlphaFold2 pLDDT (predicted Local Distance Difference Test) confidence scores. This comparison highlights the utility of DisProt in explaining apparent uncertainty of certain well-defined predicted structures, which often correspond to folding-upon-binding fragments. Overall, DisProt serves as a comprehensive resource, combining experimental evidence of disorder information to enhance our understanding of intrinsically disordered proteins and their functional implications.
Affiliation(s)
- Federica Quaglia
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
  - Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR-IBIOM), Bari, Italy
- Adel Bouharoua
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
- Damiano Piovesan
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
4
Monzon AM, Arrías PN, Elofsson A, Mier P, Andrade-Navarro MA, Bevilacqua M, Clementel D, Bateman A, Hirsh L, Fornasari MS, Parisi G, Piovesan D, Kajava AV, Tosatto SCE. A STRP-ed definition of Structured Tandem Repeats in Proteins. J Struct Biol 2023; 215:108023. [PMID: 37652396 DOI: 10.1016/j.jsb.2023.108023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2023] [Revised: 07/31/2023] [Accepted: 08/28/2023] [Indexed: 09/02/2023]
Abstract
Tandem Repeat Proteins (TRPs) are a class of proteins with repetitive amino acid sequences that have been studied extensively for over two decades. Different features at the level of sequence, structure, function and evolution have been attributed to them by various authors. And yet many of their salient features appear only when looking at specific subclasses of protein tandem repeats. Here, we attempt to rationalize the existing knowledge on TRPs by pointing out several dichotomies. The emerging picture is more nuanced than generally assumed and allows us to draw some boundaries of what is not a "proper" TRP. We conclude with an operational definition of a specific subset, which we have denominated STRPs (Structural Tandem Repeat Proteins), which separates a subclass of tandem repeats with distinctive features from several other less well-defined types of repeats. We believe that this definition will help researchers in the field to better characterize the biological meaning of this large yet largely understudied group of proteins.
Affiliation(s)
- Alexander Miguel Monzon
  - Dept. of Information Engineering, University of Padova, via Giovanni Gradenigo 6/B, 35131 Padova, Italy
- Paula Nazarena Arrías
  - Dept. of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
- Arne Elofsson
  - Dept. of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Tomtebodavägen 23, 171 21 Solna, Sweden
- Pablo Mier
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University of Mainz, Hanns-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Miguel A Andrade-Navarro
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University of Mainz, Hanns-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Martina Bevilacqua
  - Dept. of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
- Damiano Clementel
  - Dept. of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
- Alex Bateman
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Layla Hirsh
  - Dept. of Engineering, Faculty of Science and Engineering, Pontifical Catholic University of Peru, Av. Universitaria 1801 San Miguel, Lima 32, Lima, Peru
- Maria Silvina Fornasari
  - Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, CONICET, Bernal, Buenos Aires, Argentina
- Gustavo Parisi
  - Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, CONICET, Bernal, Buenos Aires, Argentina
- Damiano Piovesan
  - Dept. of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
- Andrey V Kajava
  - Centre de Recherche en Biologie cellulaire de Montpellier (CRBM), UMR 5237 CNRS, Université Montpellier, 1919 Route de Mende, Cedex 5, 34293 Montpellier, France
- Silvio C E Tosatto
  - Dept. of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy.
5
Conte AD, Mehdiabadi M, Bouhraoua A, Miguel Monzon A, Tosatto SCE, Piovesan D. Critical assessment of protein intrinsic disorder prediction (CAID) - Results of round 2. Proteins 2023; 91:1925-1934. [PMID: 37621223 DOI: 10.1002/prot.26582] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 04/21/2023] [Revised: 06/22/2023] [Accepted: 08/08/2023] [Indexed: 08/26/2023]
Abstract
Protein intrinsic disorder (ID) is a complex and context-dependent phenomenon that covers a continuum between fully disordered states and folded states with long dynamic regions. The lack of a ground truth that fits all ID flavors and the potential for order-to-disorder transitions depending on specific conditions makes ID prediction challenging. The CAID2 challenge aimed to evaluate the performance of different prediction methods across different benchmarks, leveraging the annotation provided by the DisProt database, which stores the coordinates of ID regions when there is experimental evidence in the literature. The CAID2 challenge demonstrated varying performance of different prediction methods across different benchmarks, highlighting the need for continued development of more versatile and efficient prediction software. Depending on the application, researchers may need to balance performance with execution time when selecting a predictor. Methods based on AlphaFold2 seem to be good ID predictors but they are better at detecting absence of order rather than ID regions as defined in DisProt. The CAID2 predictors can be freely used through the CAID Prediction Portal, and CAID has been integrated into OpenEBench, which will become the official platform for running future CAID challenges.
Affiliation(s)
- Alessio Del Conte
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
- Mahta Mehdiabadi
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
- Adel Bouhraoua
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
- Damiano Piovesan
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
6
Arrías PN, Monzon AM, Clementel D, Mozaffari S, Piovesan D, Kajava AV, Tosatto SCE. The repetitive structure of DNA clamps: An overlooked protein tandem repeat. J Struct Biol 2023; 215:108001. [PMID: 37467824 DOI: 10.1016/j.jsb.2023.108001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Revised: 07/12/2023] [Accepted: 07/16/2023] [Indexed: 07/21/2023]
Abstract
Structured tandem repeats proteins (STRPs) are a specific kind of tandem repeat proteins characterized by a modular and repetitive three-dimensional structure arrangement. The majority of STRPs adopt solenoid structures, but with the increasing availability of experimental structures and high-quality predicted structural models, more STRP folds can be characterized. Here, we describe "Box repeats", an overlooked STRP fold present in the DNA sliding clamp processivity factors, which has eluded classification although structural data has been available since the late 1990s. Each Box repeat is a βαβββ module of about 60 residues, which forms a class V "beads-on-a-string" type STRP. The number of repeats present in processivity factors is organism dependent. Monomers of PCNA proteins in both Archaea and Eukarya have 4 repeats, while the monomers of bacterial beta-sliding clamps have 6 repeats. This new repeat fold has been added to the RepeatsDB database, which now provides structural annotation for 66 Box repeat proteins belonging to different organisms, including viruses.
Affiliation(s)
- Paula Nazarena Arrías
  - Department of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
- Alexander Miguel Monzon
  - Department of Information Engineering, University of Padova, via Giovanni Gradenigo 6/B, 35131 Padova, Italy
- Damiano Clementel
  - Department of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
- Soroush Mozaffari
  - Department of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
- Damiano Piovesan
  - Department of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
- Andrey V Kajava
  - Centre de Recherche en Biologie cellulaire de Montpellier (CRBM), UMR 5237 CNRS, Université Montpellier, 1919 Route de Mende, Cedex 5, 34293 Montpellier, France
- Silvio C E Tosatto
  - Department of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy.
7
Mészáros B, Hatos A, Palopoli N, Quaglia F, Salladini E, Van Roey K, Arthanari H, Dosztányi Z, Felli IC, Fischer PD, Hoch JC, Jeffries CM, Longhi S, Maiani E, Orchard S, Pancsa R, Papaleo E, Pierattelli R, Piovesan D, Pritisanac I, Tenorio L, Viennet T, Tompa P, Vranken W, Tosatto SCE, Davey NE. Minimum information guidelines for experiments structurally characterizing intrinsically disordered protein regions. Nat Methods 2023; 20:1291-1303. [PMID: 37400558 DOI: 10.1038/s41592-023-01915-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 07/12/2022] [Accepted: 05/18/2023] [Indexed: 07/05/2023]
Abstract
An unambiguous description of an experiment, and the subsequent biological observation, is vital for accurate data interpretation. Minimum information guidelines define the fundamental complement of data that can support an unambiguous conclusion based on experimental observations. We present the Minimum Information About Disorder Experiments (MIADE) guidelines to define the parameters required for the wider scientific community to understand the findings of an experiment studying the structural properties of intrinsically disordered regions (IDRs). MIADE guidelines provide recommendations for data producers to describe the results of their experiments at source, for curators to annotate experimental data to community resources and for database developers maintaining community resources to disseminate the data. The MIADE guidelines will improve the interpretability of experimental results for data consumers, facilitate direct data submission, simplify data curation, improve data exchange among repositories and standardize the dissemination of the key metadata on an IDR experiment by IDR data sources.
Affiliation(s)
- Bálint Mészáros
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
  - Department of Structural Biology and Center for Data Driven Discovery, St Jude Children's Research Hospital, Memphis, TN, USA
- András Hatos
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
  - Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Swiss Cancer Center Leman, Lausanne, Switzerland
- Nicolas Palopoli
  - Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes - CONICET, Bernal, Buenos Aires, Argentina
- Federica Quaglia
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
  - Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR-IBIOM), Bari, Italy
- Edoardo Salladini
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
- Kim Van Roey
  - Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Haribabu Arthanari
  - Harvard Medical School (HMS), Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute (DFCI), Boston, MA, USA
- Isabella C Felli
  - Department of Chemistry 'Ugo Schiff' and Magnetic Resonance Center, University of Florence, Sesto Fiorentino (Florence), Italy
- Patrick D Fischer
  - Harvard Medical School (HMS), Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute (DFCI), Boston, MA, USA
- Jeffrey C Hoch
  - Department of Molecular Biology and Biophysics, UConn Health, Farmington, CT, USA
- Cy M Jeffries
  - European Molecular Biology Laboratory (EMBL), Hamburg Unit, c/o Deutsches Elektronen-Synchrotron, Hamburg, Germany
- Sonia Longhi
  - Laboratory Architecture et Fonction des Macromolécules Biologiques (AFMB), UMR 7257, Aix Marseille University and Centre National de la Recherche Scientifique (CNRS), Marseille, France
- Emiliano Maiani
  - Cancer Structural Biology, Danish Cancer Society Research Center, Copenhagen, Denmark
  - UniCamillus - Saint Camillus International University of Health and Medical Sciences, Rome, Italy
- Sandra Orchard
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Hinxton, UK
- Rita Pancsa
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
- Elena Papaleo
  - Cancer Structural Biology, Danish Cancer Society Research Center, Copenhagen, Denmark
  - Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
- Roberta Pierattelli
  - Department of Chemistry 'Ugo Schiff' and Magnetic Resonance Center, University of Florence, Sesto Fiorentino (Florence), Italy
- Damiano Piovesan
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
- Iva Pritisanac
  - Hospital for Sick Children, Toronto, Ontario, Canada
  - Medical University of Graz, Graz, Austria
- Luiggi Tenorio
  - Department of Biomedical Sciences, University of Padova, Padova, Italy
- Thibault Viennet
  - Harvard Medical School (HMS), Boston, MA, USA
  - Department of Cancer Biology, Dana-Farber Cancer Institute (DFCI), Boston, MA, USA
- Peter Tompa
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
  - VIB-VUB Center for Structural Biology, Brussels, Belgium
  - Structural Biology Brussels, Department of Bioengineering Sciences, Vrije Universiteit Brussel, Brussels, Belgium
- Wim Vranken
  - Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels, Belgium
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
  - Structural Biology Brussels, Department of Bioengineering Sciences, Vrije Universiteit Brussel, Brussels, Belgium
- Norman E Davey
  - Division Of Cancer Biology, Institute of Cancer Research, Chester Beatty Laboratories, Chelsea, London, UK.
8
Aspromonte MC, Conte AD, Zhu S, Tan W, Shen Y, Zhang Y, Li Q, Wang MH, Babbi G, Bovo S, Martelli PL, Casadio R, Althagafi A, Toonsi S, Kulmanov M, Hoehndorf R, Katsonis P, Williams A, Lichtarge O, Xian S, Surento W, Pejaver V, Mooney SD, Sunderam U, Srinivasan R, Murgia A, Piovesan D, Tosatto SCE, Leonardi E. CAGI6 ID-Challenge: Assessment of phenotype and variant predictions in 415 children with Neurodevelopmental Disorders (NDDs). Res Sq 2023:rs.3.rs-3209168. [PMID: 37577579 PMCID: PMC10418555 DOI: 10.21203/rs.3.rs-3209168/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2023]
Abstract
In the context of the Critical Assessment of Genome Interpretation, 6th edition (CAGI6), the Genetics of Neurodevelopmental Disorders Lab in Padua proposed a new ID-challenge as an opportunity to develop computational methods for predicting patients' phenotypes and causal variants. Eight research teams and 30 models had access to the phenotype details and real genetic data, based on the sequences of 74 genes (VCF format) in 415 pediatric patients affected by Neurodevelopmental Disorders (NDDs). NDDs are clinically and genetically heterogeneous conditions with onset in infancy. In this study we evaluate the ability and accuracy of computational methods to predict comorbid phenotypes, based on the clinical features described for each patient, and causal variants. Finally, we asked participants to develop a method to find new possible genetic causes for patients without a genetic diagnosis. As already done for CAGI5, seven clinical features (ID, ASD, ataxia, epilepsy, microcephaly, macrocephaly, hypotonia) and variants (causative, putative pathogenic and contributing factors) were provided. Considering the overall clinical manifestation of our cohort, we provided the variant data and phenotypic traits of the 150 patients from the CAGI5 ID-Challenge as training and validation data for the development of prediction methods.
Affiliation(s)
- Shaowen Zhu
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843
- Wuwei Tan
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843
- Yang Shen
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843
- Qi Li
  - CUHK Shenzhen Research Institute, Shenzhen
- Giulia Babbi
  - Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna
- Samuele Bovo
  - Department of Agricultural and Food Sciences, University of Bologna
- Pier Luigi Martelli
  - Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna
- Rita Casadio
  - Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna
- Azza Althagafi
  - Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences & Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal 23
- Sumyyah Toonsi
  - Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences & Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal 23
- Maxat Kulmanov
  - Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences & Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal 23
- Robert Hoehndorf
  - Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences & Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal 23
- Panagiotis Katsonis
  - Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030
- Amanda Williams
  - Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030
- Olivier Lichtarge
  - Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030
- Su Xian
  - Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA 98195
- Wesley Surento
  - Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA 98195
- Vikas Pejaver
  - Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY 10029
- Sean D Mooney
  - Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA 98195
- Uma Sunderam
  - Innovation Labs, Tata Consultancy Services, Hyderabad
9
Del Conte A, Bouhraoua A, Mehdiabadi M, Clementel D, Monzon AM, Tosatto SCE, Piovesan D. CAID prediction portal: a comprehensive service for predicting intrinsic disorder and binding regions in proteins. Nucleic Acids Res 2023:7184153. [PMID: 37246642 DOI: 10.1093/nar/gkad430] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 03/22/2023] [Revised: 04/26/2023] [Accepted: 05/10/2023] [Indexed: 05/30/2023] Open
Abstract
Intrinsic disorder (ID) in proteins is well-established in structural biology, with increasing evidence for its involvement in essential biological processes. As measuring dynamic ID behavior experimentally on a large scale remains difficult, scores of published ID predictors have tried to fill this gap. Unfortunately, their heterogeneity makes it difficult to compare performance, confounding biologists wanting to make an informed choice. To address this issue, the Critical Assessment of protein Intrinsic Disorder (CAID) benchmarks predictors for ID and binding regions as a community blind-test in a standardized computing environment. Here we present the CAID Prediction Portal, a web server executing all CAID methods on user-defined sequences. The server generates standardized output and facilitates comparison between methods, producing a consensus prediction highlighting high-confidence ID regions. The website contains extensive documentation explaining the meaning of different CAID statistics and providing a brief description of all methods. Predictor output is visualized in an interactive feature viewer and made available for download in a single table, with the option to recover previous sessions via a private dashboard. The CAID Prediction Portal is a valuable resource for researchers interested in studying ID in proteins. The server is available at the URL: https://caid.idpcentral.org.
Affiliation(s)
- Alessio Del Conte
  - Department of Biomedical Sciences, University of Padova, via Ugo Bassi 58b, 35121 Padova, Italy
- Adel Bouhraoua
  - Department of Biomedical Sciences, University of Padova, via Ugo Bassi 58b, 35121 Padova, Italy
- Mahta Mehdiabadi
  - Department of Biomedical Sciences, University of Padova, via Ugo Bassi 58b, 35121 Padova, Italy
- Damiano Clementel
  - Department of Biomedical Sciences, University of Padova, via Ugo Bassi 58b, 35121 Padova, Italy
- Alexander Miguel Monzon
  - Department of Information Engineering, University of Padova, via Giovanni Gradenigo 6/B, 35131 Padova, Italy
- Silvio C E Tosatto
  - Department of Biomedical Sciences, University of Padova, via Ugo Bassi 58b, 35121 Padova, Italy
- Damiano Piovesan
  - Department of Biomedical Sciences, University of Padova, via Ugo Bassi 58b, 35121 Padova, Italy
10
Del Conte A, Monzon AM, Clementel D, Camagni GF, Minervini G, Tosatto SCE, Piovesan D. RING-PyMOL: residue interaction networks of structural ensembles and molecular dynamics. Bioinformatics 2023; 39:7133739. [PMID: 37079739 PMCID: PMC10159649 DOI: 10.1093/bioinformatics/btad260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2022] [Revised: 03/21/2023] [Accepted: 04/17/2023] [Indexed: 04/22/2023] Open
Abstract
RING-PyMOL is a plugin for PyMOL providing a set of analysis tools for structural ensembles and molecular dynamics (MD) simulations. RING-PyMOL combines residue interaction networks, as provided by the RING software, with structural clustering to enhance the analysis and visualization of the conformational complexity. It combines precise calculation of non-covalent interactions with the power of PyMOL to manipulate and visualize protein structures. The plugin identifies and highlights correlating contacts and interaction patterns that can explain structural allostery, active sites and structural heterogeneity connected with molecular function. It is easy to use and extremely fast, processing and rendering hundreds of models and long trajectories in seconds. RING-PyMOL generates a number of interactive plots and output files for use with external tools. The underlying RING software has been improved extensively. It is ten times faster, can process mmCIF files and also identifies typed interactions for nucleic acids. Availability and implementation: https://github.com/BioComputingUP/ring-pymol.
Collapse
Affiliation(s)
- Alessio Del Conte
- Department of Biomedical Sciences, University of Padua, Padova, 35121, Italy
| | | | - Damiano Clementel
- Department of Biomedical Sciences, University of Padua, Padova, 35121, Italy
| | - Giorgia F Camagni
- Department of Biomedical Sciences, University of Padua, Padova, 35121, Italy
| | - Giovanni Minervini
- Department of Biomedical Sciences, University of Padua, Padova, 35121, Italy
| | - Silvio C E Tosatto
- Department of Biomedical Sciences, University of Padua, Padova, 35121, Italy
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padua, Padova, 35121, Italy
| |
|
11
|
Gauthier K, Piovesan D, Ramakirshnan S, Cho S, Lawson K, Liao K, Foster P, Cheng T, Shah Y, Walters M. 56P Inhibition of HIF-2α-dependent transcription with small molecule inhibitors may provide therapeutic benefit beyond renal cell carcinoma. ESMO Open 2023. [DOI: 10.1016/j.esmoop.2023.100914] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/05/2023] Open
|
12
|
Varadi M, Nair S, Sillitoe I, Tauriello G, Anyango S, Bienert S, Borges C, Deshpande M, Green T, Hassabis D, Hatos A, Hegedus T, Hekkelman ML, Joosten R, Jumper J, Laydon A, Molodenskiy D, Piovesan D, Salladini E, Salzberg SL, Sommer MJ, Steinegger M, Suhajda E, Svergun D, Tenorio-Ku L, Tosatto S, Tunyasuvunakool K, Waterhouse AM, Žídek A, Schwede T, Orengo C, Velankar S. 3D-Beacons: decreasing the gap between protein sequences and structures through a federated network of protein structure data resources. Gigascience 2022; 11:6854872. [PMID: 36448847 PMCID: PMC9709962 DOI: 10.1093/gigascience/giac118] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/11/2022] [Revised: 09/20/2022] [Accepted: 11/11/2022] [Indexed: 12/02/2022] Open
Abstract
While scientists can often infer the biological function of proteins from their 3-dimensional quaternary structures, the gap between the number of known protein sequences and their experimentally determined structures keeps increasing. A potential solution to this problem is presented by ever more sophisticated computational protein modeling approaches. While often powerful on their own, most methods have strengths and weaknesses. Therefore, it benefits researchers to examine models from various model providers and perform comparative analysis to identify what models can best address their specific use cases. To make data from a large array of model providers more easily accessible to the broader scientific community, we established 3D-Beacons, a collaborative initiative to create a federated network with unified data access mechanisms. The 3D-Beacons Network allows researchers to collate coordinate files and metadata for experimentally determined and theoretical protein models from state-of-the-art and specialist model providers and also from the Protein Data Bank.
Affiliation(s)
- Mihaly Varadi
- Correspondence address. Mihaly Varadi, PDBe team, Wellcome Trust Genome Campus, Saffron Walden CB10 1SA, UK. E-mail:
| | | | | | | | - Stephen Anyango
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SA, UK
| | - Stefan Bienert
- Biozentrum, University of Basel, Basel 4056, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Clemente Borges
- Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland; European Molecular Biology Laboratory, EMBL Hamburg, Hamburg 69117, Germany
| | - Mandar Deshpande
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SA, UK
| | | | | | - Andras Hatos
- Department of Biomedical Sciences, University of Padova, Padova 35129, Italy; Department of Oncology, Lausanne University Hospital, Lausanne 1015, Switzerland; Department of Computational Biology, University of Lausanne, Lausanne 1015, Switzerland; Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland; Swiss Cancer Center Leman, Lausanne 1005, Switzerland
| | - Tamas Hegedus
- Department of Biophysics and Radiation Biology, Semmelweis University, Budapest 1094, Hungary
| | | | - Robbie Joosten
- Netherlands Cancer Institute, Amsterdam 1066 CX, The Netherlands
| | | | | | - Dmitry Molodenskiy
- Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland; European Molecular Biology Laboratory, EMBL Hamburg, Hamburg 69117, Germany
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
| | - Edoardo Salladini
- Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
| | - Steven L Salzberg
- Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21205, USA
| | - Markus J Sommer
- Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21205, USA
| | - Martin Steinegger
- School of Biology, Seoul National University, Seoul 82-2-880-6971, 6977, South Korea
| | - Erzsebet Suhajda
- Department of Biophysics and Radiation Biology, Semmelweis University, Budapest 1094, Hungary
| | - Dmitri Svergun
- Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland; European Molecular Biology Laboratory, EMBL Hamburg, Hamburg 69117, Germany
| | - Luiggi Tenorio-Ku
- Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
| | - Silvio Tosatto
- Department of Biomedical Sciences, University of Padova, Padova 35129, Italy
| | | | - Andrew Mark Waterhouse
- Biozentrum, University of Basel, Basel 4056, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | | | - Torsten Schwede
- Biozentrum, University of Basel, Basel 4056, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Christine Orengo
- Department of Structural and Molecular Biology, UCL, London WC1E 6BT, UK
| | - Sameer Velankar
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton CB10 1SA, UK
| |
|
13
|
Piovesan D, Del Conte A, Clementel D, Monzon A, Bevilacqua M, Aspromonte M, Iserte J, Orti FE, Marino-Buslje C, Tosatto SE. MobiDB: 10 years of intrinsically disordered proteins. Nucleic Acids Res 2022; 51:D438-D444. [PMID: 36416266 PMCID: PMC9825420 DOI: 10.1093/nar/gkac1065] [Citation(s) in RCA: 39] [Impact Index Per Article: 19.5] [Received: 09/22/2022] [Revised: 10/11/2022] [Accepted: 10/25/2022] [Indexed: 11/24/2022] Open
Abstract
The MobiDB database (URL: https://mobidb.org/) is a knowledge base of intrinsically disordered proteins. MobiDB aggregates disorder annotations derived from the literature and from experimental evidence along with predictions for all known protein sequences. MobiDB generates new knowledge and captures the functional significance of disordered regions by processing and combining complementary sources of information. Since its first release 10 years ago, the MobiDB database has evolved in order to improve the quality and coverage of protein disorder annotations and its accessibility. MobiDB has now reached its maturity in terms of data standardization and visualization. Here, we present a new release which focuses on the optimization of user experience and database content. The major advances compared to the previous version are the integration of AlphaFoldDB predictions and the re-implementation of the homology transfer pipeline, which expands manually curated annotations by two orders of magnitude. Finally, the entry page has been restyled in order to provide an overview of the available annotations along with two separate views that highlight structural disorder evidence and functions associated with different binding modes.
Affiliation(s)
- Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - Alessio Del Conte
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - Damiano Clementel
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | | | | | | | - Javier A Iserte
- Bioinformatics Unit, Fundación Instituto Leloir, Buenos Aires, Argentina
| | - Fernando E Orti
- Bioinformatics Unit, Fundación Instituto Leloir, Buenos Aires, Argentina
| | | | | |
|
14
|
Piovesan D, Monzon AM, Tosatto SCE. Intrinsic Protein Disorder and Conditional Folding in AlphaFoldDB. Protein Sci 2022; 31:e4466. [PMID: 36210722 PMCID: PMC9601767 DOI: 10.1002/pro.4466] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 03/15/2022] [Revised: 09/29/2022] [Accepted: 10/05/2022] [Indexed: 11/23/2022]
Abstract
Intrinsically disordered regions (IDRs) defying the traditional protein structure–function paradigm have been difficult to analyze. The availability of accurate structure predictions on a large scale in AlphaFoldDB offers a fresh perspective on IDR prediction. Here, we establish three baselines for IDR prediction from AlphaFoldDB models based on the recent CAID dataset. Surprisingly, AlphaFoldDB is highly competitive for predicting both IDRs and conditionally folded binding regions, demonstrating the plasticity of the disorder to structure continuum.
Affiliation(s)
| | - Alexander Miguel Monzon
- Dept. of Biomedical Sciences, University of Padova, Italy; Dept. of Information Engineering, University of Padova, Italy
| | | |
|
15
|
Quaglia F, Hatos A, Salladini E, Piovesan D, Tosatto SCE. Exploring Manually Curated Annotations of Intrinsically Disordered Proteins with DisProt. Curr Protoc 2022; 2:e484. [PMID: 35789137 DOI: 10.1002/cpz1.484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
DisProt is the major repository of manually curated data for intrinsically disordered proteins collected from the literature. Although lacking a stable three-dimensional structure under physiological conditions, intrinsically disordered proteins carry out a plethora of biological functions, some of them directly arising from their flexible nature. A growing number of scientific studies have been published during the last few decades to shed light on their unstructured state, their binding modes, and their functions. DisProt makes use of a team of expert biocurators to provide up-to-date annotations of intrinsically disordered proteins from the literature, making them available to the scientific community. Here we present a comprehensive description of how to use DisProt in different contexts and provide a detailed explanation of how to explore and interpret manually curated annotations of intrinsically disordered proteins. We describe how to search DisProt annotations, using both the web interface and the API for programmatic access. Finally, we explain how to visualize and interpret a DisProt entry, the SARS-CoV-2 Nucleoprotein, characterized by the presence of unstructured N-terminal and C-terminal regions and a flexible linker. © 2022 The Authors. Current Protocols published by Wiley Periodicals LLC. Basic Protocol 1: Performing a search in DisProt; Support Protocol 1: Downloading options; Support Protocol 2: Programmatic access with DisProt REST API; Basic Protocol 2: Exploring the DisProt Ontology page; Basic Protocol 3: Visualizing and interpreting DisProt entries: the SARS-CoV-2 Nucleoprotein use case.
Affiliation(s)
- Federica Quaglia
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR-IBIOM), Bari, Italy
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - András Hatos
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - Edoardo Salladini
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | | |
|
16
|
Piovesan D, Arbesú M, Fuxreiter M, Pons M. Editorial: Fuzzy Interactions: Many Facets of Protein Binding. Front Mol Biosci 2022; 9:947215. [PMID: 35795824 PMCID: PMC9251902 DOI: 10.3389/fmolb.2022.947215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2022] [Accepted: 05/31/2022] [Indexed: 11/13/2022] Open
Affiliation(s)
- Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - Miguel Arbesú
- Department of NMR-Supported Structural Biology, Leibniz-Forschungsinstitut für Molekulare Pharmakologie, Berlin, Germany
| | - Monika Fuxreiter
- Department of Biomedical Sciences, University of Padova, Padova, Italy
- *Correspondence: Monika Fuxreiter; Miquel Pons
| | - Miquel Pons
- Biomolecular NMR Lab, Department of Inorganic and Organic Chemistry, Universitat de Barcelona (UB), Barcelona, Spain
- *Correspondence: Monika Fuxreiter; Miquel Pons
| |
|
17
|
Clementel D, Del Conte A, Monzon AM, Camagni GF, Minervini G, Piovesan D, Tosatto SCE. RING 3.0: fast generation of probabilistic residue interaction networks from structural ensembles. Nucleic Acids Res 2022; 50:W651-W656. [PMID: 35554554 PMCID: PMC9252747 DOI: 10.1093/nar/gkac365] [Citation(s) in RCA: 60] [Impact Index Per Article: 30.0] [Received: 03/23/2022] [Revised: 04/15/2022] [Accepted: 04/30/2022] [Indexed: 12/18/2022] Open
Abstract
Residue interaction networks (RINs) are used to represent residue contacts in protein structures. Thanks to advances in network theory, RINs have proved effective as an alternative to coordinate data in the analysis of complex systems. The RING server calculates high-quality and reliable non-covalent molecular interactions based on geometrical parameters. Here, we present the new RING 3.0 version, which extends the previous functionality in several ways. The underlying software library has been re-engineered to improve speed by an order of magnitude. RING now also supports the mmCIF format and provides typed interactions for the entire PDB chemical component dictionary, including nucleic acids. Moreover, RING now employs probabilistic graphs, where multiple conformations (e.g. NMR or molecular dynamics ensembles) are mapped as weighted edges, opening up new ways to analyze structural data. The web interface has been expanded to include a simultaneous view of the RIN alongside a structure viewer, with both synchronized and clickable. Contact evolution across models (or time) is displayed as a heatmap and can help in the discovery of correlating interaction patterns. The web server, together with extensive help and a tutorial, is available from URL: https://ring.biocomputingup.it/.
Affiliation(s)
- Damiano Clementel
- Department of Biomedical Sciences, University of Padova, Padova 35131, Italy
| | - Alessio Del Conte
- Department of Biomedical Sciences, University of Padova, Padova 35131, Italy
| | | | - Giorgia F Camagni
- Department of Biomedical Sciences, University of Padova, Padova 35131, Italy
| | - Giovanni Minervini
- Department of Biomedical Sciences, University of Padova, Padova 35131, Italy
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova 35131, Italy
| | - Silvio C E Tosatto
- Department of Biomedical Sciences, University of Padova, Padova 35131, Italy
| |
|
18
|
Piovesan D, Monzon AM, Quaglia F, Tosatto SCE. Databases for intrinsically disordered proteins. Acta Crystallogr D Struct Biol 2022; 78:144-151. [PMID: 35102880 PMCID: PMC8805306 DOI: 10.1107/s2059798321012109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2021] [Accepted: 11/12/2021] [Indexed: 11/28/2022] Open
Abstract
Intrinsically disordered regions (IDRs) lacking a fixed three-dimensional protein structure are widespread and play a central role in cell regulation. Only a small fraction of IDRs have been functionally characterized, with heterogeneous experimental evidence that is largely buried in the literature. Predictions of IDRs are still difficult to estimate and are poorly characterized. Here, an overview of the publicly available knowledge about IDRs is reported, including manually curated resources, deposition databases and prediction repositories. The types, scopes and availability of the various resources are analyzed, and their complementarity and overlap are highlighted. The volume of information included and the relevance to the field of structural biology are compared.
Affiliation(s)
- Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | | | - Federica Quaglia
- Department of Biomedical Sciences, University of Padova, Padova, Italy
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR–IBIOM), Bari, Italy
| | | |
|
19
|
Bevilacqua M, Paladin L, Tosatto SCE, Piovesan D. ProSeqViewer: an interactive, responsive and efficient TypeScript library for visualization of sequences and alignments in web applications. Bioinformatics 2022; 38:1129-1130. [PMID: 34788797 DOI: 10.1093/bioinformatics/btab764] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/12/2021] [Revised: 10/13/2021] [Accepted: 11/10/2021] [Indexed: 02/03/2023] Open
Abstract
SUMMARY: Biological data is ever-increasing in amount and complexity. The mapping of this data to biological entities such as nucleotide and amino acid sequences supports biological data analysis, classification and prediction. Sequence alignment and comparison allow the transfer of knowledge to evolutionarily related entities, the mapping of functional domains, and the identification of binding and modification sites. To support these types of studies, we developed ProSeqViewer, a tool to visualize annotation on single sequences and multiple sequence alignments. This state-of-the-art multifunctional library was developed as a modular component to be integrated into static or dynamic web resources and to support intuitive visualization of sequence features. ProSeqViewer is extremely lightweight, fast, interactive, dynamic and responsive, and works at any screen size. It generates pure HTML, which is compatible with any browser and operating system. ProSeqViewer can exchange events with other visualization components and is already used by multiple biological databases. AVAILABILITY AND IMPLEMENTATION: ProSeqViewer is an open-source TypeScript library compatible with state-of-the-art website environments. The source code and extensive documentation, including use cases, are available from the URL: https://github.com/BioComputingUP/ProSeqViewer.
Affiliation(s)
- Martina Bevilacqua
- Department of Biomedical Sciences, University of Padua, 35121 Padova, Italy
| | - Lisanna Paladin
- Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), 69117 Heidelberg, Germany
| | - Silvio C E Tosatto
- Department of Biomedical Sciences, University of Padua, 35121 Padova, Italy
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padua, 35121 Padova, Italy
| |
|
20
|
Varadi M, Anyango S, Armstrong D, Berrisford J, Choudhary P, Deshpande M, Nadzirin N, Nair SS, Pravda L, Tanweer A, Al-Lazikani B, Andreini C, Barton GJ, Bednar D, Berka K, Blundell T, Brock KP, Carazo JM, Damborsky J, David A, Dey S, Dunbrack R, Recio JF, Fraternali F, Gibson T, Helmer-Citterich M, Hoksza D, Hopf T, Jakubec D, Kannan N, Krivak R, Kumar M, Levy ED, London N, Macias JR, Srivatsan MM, Marks DS, Martens L, McGowan SA, McGreig JE, Modi V, Parra RG, Pepe G, Piovesan D, Prilusky J, Putignano V, Radusky LG, Ramasamy P, Rausch AO, Reuter N, Rodriguez LA, Rollins NJ, Rosato A, Rubach P, Serrano L, Singh G, Skoda P, Sorzano COS, Stourac J, Sulkowska JI, Svobodova R, Tichshenko N, Tosatto SCE, Vranken W, Wass MN, Xue D, Zaidman D, Thornton J, Sternberg M, Orengo C, Velankar S. PDBe-KB: collaboratively defining the biological context of structural data. Nucleic Acids Res 2022; 50:D534-D542. [PMID: 34755867 PMCID: PMC8728252 DOI: 10.1093/nar/gkab988] [Citation(s) in RCA: 36] [Impact Index Per Article: 18.0] [Received: 09/14/2021] [Revised: 10/01/2021] [Accepted: 10/14/2021] [Indexed: 12/15/2022] Open
Abstract
The Protein Data Bank in Europe - Knowledge Base (PDBe-KB, https://pdbe-kb.org) is an open collaboration between world-leading specialist data resources contributing functional and biophysical annotations derived from or relevant to the Protein Data Bank (PDB). The goal of PDBe-KB is to place macromolecular structure data in their biological context by developing standardised data exchange formats and integrating functional annotations from the contributing partner resources into a knowledge graph that can provide valuable biological insights. Since we described PDBe-KB in 2019, there have been significant improvements in the variety of available annotation data sets and user functionality. Here, we provide an overview of the consortium, highlighting the addition of annotations such as predicted covalent binders, phosphorylation sites, effects of mutations on the protein structure and energetic local frustration. In addition, we describe a library of reusable web-based visualisation components and introduce new features such as a bulk download data service and a novel superposition service that generates clusters of superposed protein chains weekly for the whole PDB archive.
|
21
|
Quaglia F, Mészáros B, Salladini E, Hatos A, Pancsa R, Chemes LB, Pajkos M, Lazar T, Peña-Díaz S, Santos J, Ács V, Farahi N, Fichó E, Aspromonte M, Bassot C, Chasapi A, Davey N, Davidović R, Dobson L, Elofsson A, Erdős G, Gaudet P, Giglio M, Glavina J, Iserte J, Iglesias V, Kálmán Z, Lambrughi M, Leonardi E, Longhi S, Macedo-Ribeiro S, Maiani E, Marchetti J, Marino-Buslje C, Mészáros A, Monzon A, Minervini G, Nadendla S, Nilsson JF, Novotný M, Ouzounis C, Palopoli N, Papaleo E, Pereira P, Pozzati G, Promponas V, Pujols J, Rocha AS, Salas M, Sawicki LR, Schad E, Shenoy A, Szaniszló T, Tsirigos K, Veljkovic N, Parisi G, Ventura S, Dosztányi Z, Tompa P, Tosatto SCE, Piovesan D. DisProt in 2022: improved quality and accessibility of protein intrinsic disorder annotation. Nucleic Acids Res 2022; 50:D480-D487. [PMID: 34850135 PMCID: PMC8728214 DOI: 10.1093/nar/gkab1082] [Citation(s) in RCA: 79] [Impact Index Per Article: 39.5] [Received: 09/15/2021] [Revised: 10/15/2021] [Accepted: 10/20/2021] [Indexed: 02/03/2023] Open
Abstract
The Database of Intrinsically Disordered Proteins (DisProt, URL: https://disprot.org) is the major repository of manually curated annotations of intrinsically disordered proteins and regions from the literature. We report here recent updates of DisProt version 9, including a restyled web interface, refactored Intrinsically Disordered Proteins Ontology (IDPO), improvements in the curation process and significant content growth of around 30%. Higher quality and consistency of annotations is provided by a newly implemented reviewing process and training of curators. The increased curation capacity is fostered by the integration of DisProt with APICURON, a dedicated resource for the proper attribution and recognition of biocuration efforts. Better interoperability is provided through the adoption of the Minimum Information About Disorder (MIADE) standard, an active collaboration with the Gene Ontology (GO) and Evidence and Conclusion Ontology (ECO) consortia and the support of the ELIXIR infrastructure.
Affiliation(s)
- Federica Quaglia
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR-IBIOM), Bari, Italy
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - Bálint Mészáros
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany
| | - Edoardo Salladini
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - András Hatos
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - Rita Pancsa
- Institute of Enzymology, Research Centre for Natural Sciences, Budapest 1117, Hungary
| | - Lucía B Chemes
- Instituto de Investigaciones Biotecnológicas (IIBiO-CONICET), Universidad Nacional de San Martín, Av. 25 de Mayo y Francia, CP1650 Buenos Aires, Argentina
| | - Mátyás Pajkos
- Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary
| | - Tamas Lazar
- VIB-VUB Center for Structural Biology, Vlaams Instituut voor Biotechnology, Brussels, Belgium
- Structural Biology Brussels (SBB), Bioengineering Sciences Department, Vrije Universiteit Brussel (VUB), Brussels, Belgium
| | - Samuel Peña-Díaz
- Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, Barcelona, Spain
- Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Barcelona, Spain
| | - Jaime Santos
- Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, Barcelona, Spain
- Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Barcelona, Spain
| | - Veronika Ács
- Institute of Enzymology, Research Centre for Natural Sciences, Budapest 1117, Hungary
| | - Nazanin Farahi
- VIB-VUB Center for Structural Biology, Vlaams Instituut voor Biotechnology, Brussels, Belgium
- Structural Biology Brussels (SBB), Bioengineering Sciences Department, Vrije Universiteit Brussel (VUB), Brussels, Belgium
| | - Erzsébet Fichó
- Institute of Enzymology, Research Centre for Natural Sciences, Budapest 1117, Hungary
- Cytocast Kft., Vecsés, Hungary
| | - Maria Cristina Aspromonte
- Department of Woman and Child Health, University of Padova, Padova, Italy
- Pediatric Research Institute, Città della Speranza, Padova, Italy
| | - Claudio Bassot
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, 171 21 Solna, Sweden
| | - Anastasia Chasapi
- Biological Computation & Process Laboratory, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas, Thermi, Thessalonica 57001, Greece
| | - Norman E Davey
- Institute of Cancer Research, Chester Beatty Laboratories, 237 Fulham Rd, Chelsea, London, UK
| | - Radoslav Davidović
- Laboratory for Bioinformatics and Computational Chemistry, Vinča Institute of Nuclear Sciences, National Institute of the Republic of Serbia, University of Belgrade, 11000 Belgrade, Serbia
| | - Laszlo Dobson
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany
- Institute of Enzymology, Research Centre for Natural Sciences, Budapest 1117, Hungary
| | - Arne Elofsson
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, 171 21 Solna, Sweden
| | - Gábor Erdős
- Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary
| | - Pascale Gaudet
- Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
| | - Michelle Giglio
- Institute for Genome Sciences, University of Maryland School of Medicine 670 W. Baltimore St., Baltimore, MD 21201, USA
| | - Juliana Glavina
- Instituto de Investigaciones Biotecnológicas (IIBiO-CONICET), Universidad Nacional de San Martín, Av. 25 de Mayo y Francia, CP1650 Buenos Aires, Argentina
| | - Javier Iserte
- Bioinformatics Unit, Fundación Instituto Leloir, Buenos Aires, C1405BWE, Argentina
| | - Valentín Iglesias
- Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, Barcelona, Spain
- Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Barcelona, Spain
| | - Zsófia Kálmán
- Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Práter u. 50/A, 1083 Budapest, Hungary
| | - Matteo Lambrughi
- Cancer Structural Biology, Danish Cancer Society Research Center, Strandboulevarden 49, 2100 Copenhagen, Denmark
| | - Emanuela Leonardi
- Department of Woman and Child Health, University of Padova, Padova, Italy
- Pediatric Research Institute, Città della Speranza, Padova, Italy
| | - Sonia Longhi
- Lab. Architecture et Fonction des Macromolécules Biologiques (AFMB), UMR 7257, Aix Marseille University and Centre National de la Recherche Scientifique (CNRS), 163 Avenue de Luminy, Case 932, 13288, Marseille, France
| | - Sandra Macedo-Ribeiro
- Instituto de Biologia Molecular e Celular (IBMC), Universidade do Porto, 4200-135 Porto, Portugal
- Instituto de Investigação e Inovação em Saúde (i3S), Universidade do Porto, 4200-135 Porto, Portugal
| | - Emiliano Maiani
- Cancer Structural Biology, Danish Cancer Society Research Center, Strandboulevarden 49, 2100 Copenhagen, Denmark
| | - Julia Marchetti
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes - CONICET, Bernal, Buenos Aires B1876BXD, Argentina
| | | | - Attila Mészáros
- VIB-VUB Center for Structural Biology, Vlaams Instituut voor Biotechnology, Brussels, Belgium
- Structural Biology Brussels (SBB), Bioengineering Sciences Department, Vrije Universiteit Brussel (VUB), Brussels, Belgium
| | | | | | - Suvarna Nadendla
- Institute for Genome Sciences, University of Maryland School of Medicine 670 W. Baltimore St., Baltimore, MD 21201, USA
| | - Juliet F Nilsson
- Lab. Architecture et Fonction des Macromolécules Biologiques (AFMB), UMR 7257, Aix Marseille University and Centre National de la Recherche Scientifique (CNRS), 163 Avenue de Luminy, Case 932, 13288, Marseille, France
| | - Marian Novotný
- Dep. of Cell Biology, Faculty of Science, Vinicna 7, 128 43, Prague, Czech Republic
| | - Christos A Ouzounis
- Biological Computation & Process Laboratory, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas, Thermi, Thessalonica 57001, Greece
- Biological Computation & Computational Biology Group, Artificial Intelligence & Information Analysis Lab, Department of Computer Science, Aristotle University of Thessalonica, Thessalonica 54124, Greece
| | - Nicolás Palopoli
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes - CONICET, Bernal, Buenos Aires B1876BXD, Argentina
| | - Elena Papaleo
- Cancer Structural Biology, Danish Cancer Society Research Center, Strandboulevarden 49, 2100 Copenhagen, Denmark
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Pedro José Barbosa Pereira
- Instituto de Biologia Molecular e Celular (IBMC), Universidade do Porto, 4200-135 Porto, Portugal
- Instituto de Investigação e Inovação em Saúde (i3S), Universidade do Porto, 4200-135 Porto, Portugal
| | - Gabriele Pozzati
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, 171 21 Solna, Sweden
| | - Vasilis J Promponas
- Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
| | - Jordi Pujols
- Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, Barcelona, Spain
- Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Barcelona, Spain
| | | | - Martin Salas
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes - CONICET, Bernal, Buenos Aires B1876BXD, Argentina
| | - Luciana Rodriguez Sawicki
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes - CONICET, Bernal, Buenos Aires B1876BXD, Argentina
| | - Eva Schad
- Institute of Enzymology, Research Centre for Natural Sciences, Budapest 1117, Hungary
| | - Aditi Shenoy
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, 171 21 Solna, Sweden
| | - Tamás Szaniszló
- Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary
| | - Konstantinos D Tsirigos
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
| | - Nevena Veljkovic
- Laboratory for Bioinformatics and Computational Chemistry, Vinča Institute of Nuclear Sciences, National Institute of the Republic of Serbia, University of Belgrade, 11000 Belgrade, Serbia
| | - Gustavo Parisi
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes - CONICET, Bernal, Buenos Aires B1876BXD, Argentina
| | - Salvador Ventura
- Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, Barcelona, Spain
- Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Barcelona, Spain
- ICREA, Barcelona, Spain
| | - Zsuzsanna Dosztányi
- Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary
| | - Peter Tompa
- Institute of Enzymology, Research Centre for Natural Sciences, Budapest 1117, Hungary
- VIB-VUB Center for Structural Biology, Vlaams Instituut voor Biotechnology, Brussels, Belgium
- Structural Biology Brussels (SBB), Bioengineering Sciences Department, Vrije Universiteit Brussel (VUB), Brussels, Belgium
| | | | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova, Italy
22
Monzon AM, Piovesan D, Fuxreiter M. Molecular Determinants of Selectivity in Disordered Complexes May Shed Light on Specificity in Protein Condensates. Biomolecules 2022; 12:biom12010092. [PMID: 35053240 PMCID: PMC8773858 DOI: 10.3390/biom12010092] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/01/2021] [Revised: 12/22/2021] [Accepted: 12/25/2021] [Indexed: 02/01/2023] Open
Abstract
Biomolecular condensates challenge the classical concepts of molecular recognition. The variable composition and heterogeneous conformations of liquid-like protein droplets are bottlenecks for high-resolution structural studies. To obtain atomistic insights into the organization of these assemblies, here we have characterized the conformational ensembles of specific disordered complexes, including those of droplet-driving proteins. First, we found that these specific complexes exhibit a high degree of conformational heterogeneity. Second, we found that residues forming contacts at the interface also sample many conformations. Third, we found that different patterns of contacting residues form the specific interface. In addition, we observed a wide range of sequence motifs mediating disordered interactions, including charged, hydrophobic and polar contacts. These results demonstrate that selective recognition can be realized by variable patterns of weakly defined interaction motifs in many different binding configurations. We propose that these principles also play roles in determining the selectivity of biomolecular condensates.
Affiliation(s)
- Alexander Miguel Monzon
- Department of Biomedical Sciences, University of Padova, 35131 Padova, Italy; (A.M.M.); (D.P.)
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, 35131 Padova, Italy; (A.M.M.); (D.P.)
| | - Monika Fuxreiter
- Department of Biomedical Sciences, University of Padova, 35131 Padova, Italy; (A.M.M.); (D.P.)
- Department of Biochemistry and Molecular Biology, University of Debrecen, 4032 Debrecen, Hungary
- Correspondence:
23
Hatos A, Monzon AM, Tosatto SCE, Piovesan D, Fuxreiter M. FuzDB: a new phase in understanding fuzzy interactions. Nucleic Acids Res 2021; 50:D509-D517. [PMID: 34791357 PMCID: PMC8728163 DOI: 10.1093/nar/gkab1060] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 09/15/2021] [Revised: 10/13/2021] [Accepted: 10/27/2021] [Indexed: 11/14/2022] Open
Abstract
Fuzzy interactions are specific, variable contacts between proteins and other biomolecules (proteins, DNA, RNA, small molecules) formed in accordance with the cellular context. Fuzzy interactions have recently been demonstrated to regulate biomolecular condensates generated by liquid-liquid phase separation. The FuzDB v4.0 database (https://fuzdb.org) assembles experimentally identified examples of fuzzy interactions, where disordered regions mediate functionally important, context-dependent contacts between the partners in stoichiometric and higher-order assemblies. The new version of FuzDB establishes cross-links with databases on structure (PDB, BMRB, PED), function (ELM, UniProt) and biomolecular condensates (PhaSepDB, PhaSePro, LLPSDB). FuzDB v4.0 is a resource for deciphering the molecular basis of complex cellular interaction behaviors, including those in protein droplets.
Affiliation(s)
- Andras Hatos
- Department of Biomedical Sciences, University of Padova, via Ugo Bassi 58/B, 35131 Padova, Italy
| | - Alexander Miguel Monzon
- Department of Biomedical Sciences, University of Padova, via Ugo Bassi 58/B, 35131 Padova, Italy
| | - Silvio C E Tosatto
- Department of Biomedical Sciences, University of Padova, via Ugo Bassi 58/B, 35131 Padova, Italy
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, via Ugo Bassi 58/B, 35131 Padova, Italy
| | - Monika Fuxreiter
- Department of Biomedical Sciences, University of Padova, via Ugo Bassi 58/B, 35131 Padova, Italy; Department of Biochemistry and Molecular Biology, University of Debrecen, Nagyerdei krt 98, 4010 Debrecen, Hungary
24
Quaglia F, Lazar T, Hatos A, Tompa P, Piovesan D, Tosatto SCE. Exploring Curated Conformational Ensembles of Intrinsically Disordered Proteins in the Protein Ensemble Database. Curr Protoc 2021; 1:e192. [PMID: 34252246 DOI: 10.1002/cpz1.192] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/11/2022]
Abstract
The Protein Ensemble Database (PED; https://proteinensemble.org/) is the major repository of conformational ensembles of intrinsically disordered proteins (IDPs). Conformational ensembles of IDPs are primarily provided by their authors or occasionally collected from literature, and are subsequently deposited in PED along with the corresponding structured, manually curated metadata. The modeling of conformational ensembles usually relies on experimental data from small-angle X-ray scattering (SAXS), fluorescence resonance energy transfer (FRET), NMR spectroscopy, and molecular dynamics (MD) simulations, or a combination of these techniques. The growing number of scientific studies based on these data, along with the astounding and swift progress in the field of protein intrinsic disorder, has required a significant update and upgrade of PED, first published in 2014. To this end, the database was entirely renewed in 2020 and now has a dedicated team of biocurators who provide manually curated descriptions of the methods and conditions applied to generate the conformational ensembles and who check the consistency of the data. Here, we present a detailed description of how to explore PED with its protein pages and experimental pages, and how to interpret entries of conformational ensembles. We describe how to efficiently search conformational ensembles deposited in PED by means of its web interface and API. We demonstrate how to make sense of the PED protein page and its associated experimental entry pages with reference to the yeast Sic1 use case. © 2021 The Authors. Current Protocols published by Wiley Periodicals LLC.
- Basic Protocol 1: Performing a search in PED
- Support Protocol 1: Programmatic access with the PED API
- Basic Protocol 2: Interpreting the protein page and the experimental entry page (the Sic1 use case)
- Support Protocol 2: Downloading options
- Support Protocol 3: Understanding the validation report (the Sic1 use case)
- Basic Protocol 3: Submitting new conformational ensembles to PED
- Basic Protocol 4: Providing feedback in PED
Affiliation(s)
- Federica Quaglia
- Department of Biomedical Sciences, University of Padova, Padova, Italy; Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR-IBIOM), Bari, Italy
| | - Tamas Lazar
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium; VIB-VUB Center for Structural Biology, Brussels, Belgium
| | - András Hatos
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - Peter Tompa
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium; VIB-VUB Center for Structural Biology, Brussels, Belgium; Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova, Italy
25
Hatos A, Quaglia F, Piovesan D, Tosatto SCE. APICURON: a database to credit and acknowledge the work of biocurators. Database (Oxford) 2021; 2021:baab019. [PMID: 33882120 PMCID: PMC8060004 DOI: 10.1093/database/baab019] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/02/2021] [Revised: 03/12/2021] [Accepted: 04/12/2021] [Indexed: 11/14/2022]
Abstract
APICURON is an open and freely accessible resource that tracks and credits the work of biocurators across multiple participating knowledgebases. Biocuration is essential to extract knowledge from research data and make it available in a structured and standardized way to the scientific community. However, processing biological data, mainly from literature, requires a huge effort that is difficult to attribute and quantify. APICURON collects biocuration events from third-party resources and aggregates this information, spotlighting biocurator contributions. APICURON promotes biocurator engagement by implementing gamification concepts such as badges, medals and leaderboards, and at the same time provides a monitoring service for registered resources and for biocurators themselves. APICURON adopts a data model that is flexible enough to represent and track the majority of biocuration activities. Biocurators are identified through their Open Researcher and Contributor ID. The definitions of curation events, scoring systems and rules for assigning badges and medals are resource-specific and easily customizable. Registered resources can transfer curation activities on the fly through a secure and robust Application Programming Interface (API). Here, we show how simple and effective it is to connect a resource to APICURON, describing the DisProt database of intrinsically disordered proteins as a use case. We believe APICURON will provide biological knowledgebases with a service to recognize and credit the effort of their biocurators, monitor their activity and promote curator engagement. Database URL: https://apicuron.org.
Affiliation(s)
- András Hatos
- Department of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padova 35131, Italy
| | - Federica Quaglia
- Department of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padova 35131, Italy
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padova 35131, Italy
| | - Silvio C E Tosatto
- Department of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padova 35131, Italy
26
Monzon AM, Bonato P, Necci M, Tosatto SCE, Piovesan D. FLIPPER: Predicting and Characterizing Linear Interacting Peptides in the Protein Data Bank. J Mol Biol 2021; 433:166900. [PMID: 33647288 DOI: 10.1016/j.jmb.2021.166900] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/03/2020] [Revised: 02/22/2021] [Accepted: 02/22/2021] [Indexed: 12/31/2022]
Abstract
A large fraction of peptides or protein regions are disordered in isolation and fold upon binding. These regions, also called MoRFs, SLiMs or LIPs, are often associated with signaling and regulation processes. However, despite their importance, only a limited number of examples are available in public databases and their automatic detection at the proteome level is problematic. Here we present FLIPPER, an automatic method for the detection of structurally linear sub-regions or peptides that interact with another chain in a protein complex. FLIPPER is a random forest classifier that takes the protein structure as input and provides the propensity of each amino acid to be part of a LIP region. Models are built taking into consideration structural features such as intra- and inter-chain contacts, secondary structure, solvent accessibility in both bound and unbound state, structural linearity and chain length. FLIPPER is accurate when evaluated on non-redundant independent datasets, with 99% precision and 99% sensitivity on PixelDB-25 and 87% precision and 88% sensitivity on DIBS-25. Finally, we used FLIPPER to process the entire Protein Data Bank and identified different classes of LIPs based on different binding modes and partner molecules. We provide a detailed description of these LIP categories and show that a large fraction of these regions are not detected by disorder predictors. All FLIPPER predictions are integrated in the MobiDB 4.0 database.
Affiliation(s)
| | - Paolo Bonato
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| | - Marco Necci
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| | - Silvio C E Tosatto
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| | - Damiano Piovesan
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
27
Lazar T, Martínez-Pérez E, Quaglia F, Hatos A, Chemes L, Iserte JA, Méndez NA, Garrone NA, Saldaño T, Marchetti J, Rueda A, Bernadó P, Blackledge M, Cordeiro TN, Fagerberg E, Forman-Kay JD, Fornasari M, Gibson TJ, Gomes GNW, Gradinaru C, Head-Gordon T, Jensen MR, Lemke E, Longhi S, Marino-Buslje C, Minervini G, Mittag T, Monzon A, Pappu RV, Parisi G, Ricard-Blum S, Ruff KM, Salladini E, Skepö M, Svergun D, Vallet S, Varadi M, Tompa P, Tosatto SCE, Piovesan D. PED in 2021: a major update of the protein ensemble database for intrinsically disordered proteins. Nucleic Acids Res 2021; 49:D404-D411. [PMID: 33305318 PMCID: PMC7778965 DOI: 10.1093/nar/gkaa1021] [Citation(s) in RCA: 71] [Impact Index Per Article: 23.7] [Received: 09/14/2020] [Revised: 10/13/2020] [Accepted: 12/08/2020] [Indexed: 12/21/2022] Open
Abstract
The Protein Ensemble Database (PED) (https://proteinensemble.org), which holds structural ensembles of intrinsically disordered proteins (IDPs), has been significantly updated and upgraded since its last release in 2016. The new version, PED 4.0, has been completely redesigned and reimplemented with cutting-edge technology and now holds about six times more data (162 versus 24 entries and 242 versus 60 structural ensembles) and a broader representation of state-of-the-art ensemble generation methods than the previous version. The database has a completely renewed graphical interface with an interactive feature viewer for region-based annotations, and provides a series of descriptors of the qualitative and quantitative properties of the ensembles. High quality of the data is guaranteed by a new submission process, which combines both automatic and manual evaluation steps. A team of biocurators integrates structured metadata describing the ensemble generation methodology, experimental constraints and conditions. A new search engine allows the user to build advanced queries and search all entry fields including cross-references to IDP-related resources such as DisProt, MobiDB, BMRB and SASBDB. We expect that the renewed PED will be useful for researchers interested in the atomic-level understanding of IDP function, and promote the rational, structure-based design of IDP-targeting drugs.
Affiliation(s)
- Tamas Lazar
- VIB-VUB Center for Structural Biology, Flanders Institute for Biotechnology, Brussels 1050, Belgium
- Structural Biology Brussels, Bioengineering Sciences Department, Vrije Universiteit Brussel, Brussels 1050, Belgium
| | - Elizabeth Martínez-Pérez
- Bioinformatics Unit, Fundación Instituto Leloir, Buenos Aires, C1405BWE, Argentina
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany
| | - Federica Quaglia
- Dept. of Biomedical Sciences, University of Padua, Padova 35131, Italy
| | - András Hatos
- Dept. of Biomedical Sciences, University of Padua, Padova 35131, Italy
| | - Lucía B Chemes
- Instituto de Investigaciones Biotecnológicas “Dr. Rodolfo A. Ugalde”, IIB-UNSAM, IIBIO-CONICET, Universidad Nacional de San Martín, CP1650 San Martín, Buenos Aires, Argentina
| | - Javier A Iserte
- Bioinformatics Unit, Fundación Instituto Leloir, Buenos Aires, C1405BWE, Argentina
| | - Nicolás A Méndez
- Instituto de Investigaciones Biotecnológicas “Dr. Rodolfo A. Ugalde”, IIB-UNSAM, IIBIO-CONICET, Universidad Nacional de San Martín, CP1650 San Martín, Buenos Aires, Argentina
| | - Nicolás A Garrone
- Instituto de Investigaciones Biotecnológicas “Dr. Rodolfo A. Ugalde”, IIB-UNSAM, IIBIO-CONICET, Universidad Nacional de San Martín, CP1650 San Martín, Buenos Aires, Argentina
| | - Tadeo E Saldaño
- Laboratorio de Química y Biología Computacional, Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, Bernal B1876BXD, Buenos Aires, Argentina
| | - Julia Marchetti
- Laboratorio de Química y Biología Computacional, Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, Bernal B1876BXD, Buenos Aires, Argentina
| | - Ana Julia Velez Rueda
- Laboratorio de Química y Biología Computacional, Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, Bernal B1876BXD, Buenos Aires, Argentina
| | - Pau Bernadó
- Centre de Biochimie Structurale (CBS), CNRS, INSERM, University of Montpellier, Montpellier 34090, France
| | | | - Tiago N Cordeiro
- Centre de Biochimie Structurale (CBS), CNRS, INSERM, University of Montpellier, Montpellier 34090, France
- Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av. da República, Oeiras 2780-157, Portugal
| | - Eric Fagerberg
- Theoretical Chemistry, Lund University, Lund, POB 124, SE-221 00, Sweden
| | - Julie D Forman-Kay
- Molecular Medicine Program, Hospital for Sick Children, Toronto, M5G 1X8, Ontario, Canada
- Department of Biochemistry, University of Toronto, Toronto, M5S 1A8, Ontario, Canada
| | - Maria S Fornasari
- Laboratorio de Química y Biología Computacional, Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, Bernal B1876BXD, Buenos Aires, Argentina
| | - Toby J Gibson
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany
| | - Gregory-Neal W Gomes
- Department of Physics, University of Toronto, Toronto, M5S 1A7, Ontario, Canada
- Department of Chemical and Physical Sciences, University of Toronto Mississauga, Mississauga, L5L 1C6, Ontario, Canada
| | - Claudiu C Gradinaru
- Department of Physics, University of Toronto, Toronto, M5S 1A7, Ontario, Canada
- Department of Chemical and Physical Sciences, University of Toronto Mississauga, Mississauga, L5L 1C6, Ontario, Canada
| | - Teresa Head-Gordon
- Departments of Chemistry, Bioengineering, Chemical and Biomolecular Engineering University of California, Berkeley, CA 94720, USA
| | | | - Edward A Lemke
- Biocentre, Johannes Gutenberg-University Mainz, Mainz 55128, Germany
- Institute of Molecular Biology, Mainz 55128, Germany
| | - Sonia Longhi
- Aix-Marseille University, CNRS, Architecture et Fonction des Macromolécules Biologiques (AFMB), Marseille 13288, France
| | | | | | - Tanja Mittag
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | | | - Rohit V Pappu
- Department of Biomedical Engineering, Center for Science & Engineering of Living Systems (CSELS), Washington University in St. Louis, MO 63130, USA
| | - Gustavo Parisi
- Laboratorio de Química y Biología Computacional, Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, Bernal B1876BXD, Buenos Aires, Argentina
| | - Sylvie Ricard-Blum
- Univ Lyon, University Claude Bernard Lyon 1, CNRS, INSA Lyon, CPE, Institute of Molecular and Supramolecular Chemistry and Biochemistry (ICBMS), UMR 5246, Villeurbanne, 69629 Lyon Cedex 07, France
| | - Kiersten M Ruff
- Department of Biomedical Engineering, Center for Science & Engineering of Living Systems (CSELS), Washington University in St. Louis, MO 63130, USA
| | - Edoardo Salladini
- Aix-Marseille University, CNRS, Architecture et Fonction des Macromolécules Biologiques (AFMB), Marseille 13288, France
| | - Marie Skepö
- Theoretical Chemistry, Lund University, Lund, POB 124, SE-221 00, Sweden
- LINXS - Lund Institute of Advanced Neutron and X-ray Science, Lund 223 70, Sweden
| | - Dmitri Svergun
- European Molecular Biology Laboratory, Hamburg Unit, Hamburg 22607, Germany
| | - Sylvain D Vallet
- Univ Lyon, University Claude Bernard Lyon 1, CNRS, INSA Lyon, CPE, Institute of Molecular and Supramolecular Chemistry and Biochemistry (ICBMS), UMR 5246, Villeurbanne, 69629 Lyon Cedex 07, France
| | - Mihaly Varadi
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, CB10 1SD, UK
| | - Peter Tompa
- To whom correspondence should be addressed. Tel +32 473 785386;
| | - Silvio C E Tosatto
- Correspondence may also be addressed to Silvio C. E. Tosatto. Tel: +39 049 827 6269;
| | - Damiano Piovesan
- Dept. of Biomedical Sciences, University of Padua, Padova 35131, Italy
28
Paladin L, Bevilacqua M, Errigo S, Piovesan D, Mičetić I, Necci M, Monzon AM, Fabre ML, Lopez JL, Nilsson JF, Rios J, Menna PL, Cabrera M, Buitron MG, Kulik MG, Fernandez-Alberti S, Fornasari MS, Parisi G, Lagares A, Hirsh L, Andrade-Navarro MA, Kajava AV, Tosatto SCE. RepeatsDB in 2021: improved data and extended classification for protein tandem repeat structures. Nucleic Acids Res 2021; 49:D452-D457. [PMID: 33237313 PMCID: PMC7778985 DOI: 10.1093/nar/gkaa1097] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 09/15/2020] [Revised: 10/17/2020] [Accepted: 11/19/2020] [Indexed: 11/21/2022] Open
Abstract
The RepeatsDB database (URL: https://repeatsdb.org/) provides annotations and classification for protein tandem repeat structures from the Protein Data Bank (PDB). Protein tandem repeats are ubiquitous in all branches of the tree of life. The accumulation of solved repeat structures provides new possibilities for classification and detection, but also increases the need for annotation. Here we present RepeatsDB 3.0, which addresses these challenges and presents an extended classification scheme. The major conceptual change compared to the previous version is the hierarchical classification combining top levels based solely on structural similarity (Class > Topology > Fold) with two new levels (Clan > Family) requiring sequence similarity and describing repeat motifs in collaboration with Pfam. Data growth has been addressed with improved mechanisms for browsing the classification hierarchy. A new UniProt-centric view unifies the increasingly frequent annotation of structures from identical or similar sequences. This update of RepeatsDB aligns with our commitment to develop a resource that extracts, organizes and distributes specialized information on tandem repeat protein structures.
Affiliation(s)
- Lisanna Paladin
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| | - Martina Bevilacqua
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| | - Sara Errigo
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| | - Damiano Piovesan
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| | - Ivan Mičetić
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| | - Marco Necci
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| | | | - Maria Laura Fabre
- IBBM-CONICET, Dept. of Biological Sciences, La Plata National University, 49 y 115, 1900 La Plata, Argentina
| | - Jose Luis Lopez
- IBBM-CONICET, Dept. of Biological Sciences, La Plata National University, 49 y 115, 1900 La Plata, Argentina
| | - Juliet F Nilsson
- IBBM-CONICET, Dept. of Biological Sciences, La Plata National University, 49 y 115, 1900 La Plata, Argentina
| | - Javier Rios
- Dept. of Science and Technology, National University of Quilmes, Roque Sáenz Peña 352, Bernal, Buenos Aires, Argentina
| | - Pablo Lorenzano Menna
- Dept. of Science and Technology, National University of Quilmes, Roque Sáenz Peña 352, Bernal, Buenos Aires, Argentina
| | - Maia Cabrera
- Dept. of Science and Technology, National University of Quilmes, Roque Sáenz Peña 352, Bernal, Buenos Aires, Argentina
| | - Martin Gonzalez Buitron
- Dept. of Science and Technology, National University of Quilmes, Roque Sáenz Peña 352, Bernal, Buenos Aires, Argentina
| | - Mariane Gonçalves Kulik
- Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University of Mainz, Hans-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
| | - Sebastian Fernandez-Alberti
- Dept. of Science and Technology, National University of Quilmes, Roque Sáenz Peña 352, Bernal, Buenos Aires, Argentina
| | - Maria Silvina Fornasari
- Dept. of Science and Technology, National University of Quilmes, Roque Sáenz Peña 352, Bernal, Buenos Aires, Argentina
| | - Gustavo Parisi
- Dept. of Science and Technology, National University of Quilmes, Roque Sáenz Peña 352, Bernal, Buenos Aires, Argentina
| | - Antonio Lagares
- IBBM-CONICET, Dept. of Biological Sciences, La Plata National University, 49 y 115, 1900 La Plata, Argentina
| | - Layla Hirsh
- Dept. of Engineering, Faculty of Science and Engineering, Pontifical Catholic University of Peru, Av. Universitaria 1801 San Miguel, Lima 32, Lima, Peru
| | - Miguel A Andrade-Navarro
- Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University of Mainz, Hans-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
| | - Andrey V Kajava
- Centre de Recherche en Biologie cellulaire de Montpellier, UMR 5237, CNRS, Univ. Montpellier, Montpellier, France
| | - Silvio C E Tosatto
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
29
Piovesan D, Necci M, Escobedo N, Monzon AM, Hatos A, Mičetić I, Quaglia F, Paladin L, Ramasamy P, Dosztányi Z, Vranken WF, Davey N, Parisi G, Fuxreiter M, Tosatto SE. MobiDB: intrinsically disordered proteins in 2021. Nucleic Acids Res 2021; 49:D361-D367. [PMID: 33237329 PMCID: PMC7779018 DOI: 10.1093/nar/gkaa1058] [Citation(s) in RCA: 126] [Impact Index Per Article: 42.0] [Received: 09/15/2020] [Revised: 10/16/2020] [Accepted: 11/19/2020] [Indexed: 12/13/2022] Open
Abstract
The MobiDB database (URL: https://mobidb.org/) provides predictions and annotations for intrinsically disordered proteins. Here, we report recent developments implemented in MobiDB version 4, regarding the database format, with novel types of annotations and an improved update process. The new website includes a re-designed user interface, a more effective search engine and an advanced API for programmatic access. The new database schema gives more flexibility to users and simplifies maintenance and updates. In addition, the new entry page provides more visualisation tools, including a customizable feature viewer and graphs of the residue contact maps. MobiDB v4 annotates the binding modes of disordered proteins, whether they undergo disorder-to-order transitions or remain disordered in the bound state. In addition, disordered regions undergoing liquid-liquid phase separation or post-translational modifications are defined. The integrated information is presented in a simplified interface, which enables faster searches and allows large customized datasets to be downloaded in TSV, FASTA or JSON formats. An alternative advanced interface allows users to drill deeper into features of interest. A new statistics page provides information at database and proteome levels. The new MobiDB version presents state-of-the-art knowledge on disordered proteins and improves data accessibility for both computational and experimental users.
Affiliation(s)
- Damiano Piovesan
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| | - Marco Necci
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| | - Nahuel Escobedo
- Dept. of Science and Technology, Universidad Nacional de Quilmes, Buenos Aires, Argentina
| | | | - András Hatos
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| | - Ivan Mičetić
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| | - Federica Quaglia
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| | - Lisanna Paladin
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| | - Pathmanaban Ramasamy
- Interuniversity Institute of Bioinformatics in Brussels, ULB/VUB, Triomflaan, BC building, 6th floor, CP 263, 1050 Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium
- Centre for Structural Biology, VIB, Pleinlaan 2, 1050 Brussels, Belgium
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent 9000, Belgium
- Department of Biomolecular Medicine, Faculty of Health Sciences and Medicine, Ghent University, Ghent 9000, Belgium
| | | | - Wim F Vranken
- Interuniversity Institute of Bioinformatics in Brussels, ULB/VUB, Triomflaan, BC building, 6th floor, CP 263, 1050 Brussels, Belgium
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium
- Centre for Structural Biology, VIB, Pleinlaan 2, 1050 Brussels, Belgium
| | - Norman E Davey
- Division of Cancer Biology, The Institute of Cancer Research, 237 Fulham Road, London, SW3 6JB, UK
| | - Gustavo Parisi
- Dept. of Science and Technology, Universidad Nacional de Quilmes, Buenos Aires, Argentina
| | - Monika Fuxreiter
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| | - Silvio C E Tosatto
- Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy
| |
|
30
|
Coronado E, González A, Cárdenas A, Maya M, Chiovetto E, Piovesan D. Self-Tuning Extended Kalman Filter Parameters to Identify Ankle's Third-Order Mechanics. J Biomech Eng 2021; 143:1086083. [PMID: 32766749 DOI: 10.1115/1.4048042] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/26/2019] [Indexed: 11/08/2022]
Abstract
The estimation of the human ankle's mechanical impedance is an important tool for modeling human balance. This work presents the implementation of a parameter-estimation approach based on a state-augmented extended Kalman filter (AEKF) to infer the ankle's mechanical impedance during quiet standing. However, the AEKF is sensitive to the initialization of the noise covariance matrices. To avoid a time-consuming trial-and-error method and to obtain better estimation performance, a genetic algorithm (GA) is proposed to tune the measurement noise (Rk) and process noise (Q) covariances of the extended Kalman filter (EKF). Results using simulated data show the efficacy of the proposed algorithm for parameter estimation of a third-order biomechanical model. Experimental validation of these results is also presented. The results suggest that age is an influencing factor in human balance.
Affiliation(s)
- E Coronado
- Facultad de Ingeniería, Universidad Autónoma de San Luis Potosí, San Luis Potosí 78290, Mexico
| | - A González
- Facultad de Ingeniería, CONACYT-Universidad Autónoma de San Luis Potosí, San Luis Potosí 78290, Mexico
| | - A Cárdenas
- Facultad de Ingeniería, Universidad Autónoma de San Luis Potosí, San Luis Potosí 78290, Mexico
| | - M Maya
- Facultad de Ingeniería, Universidad Autónoma de San Luis Potosí, San Luis Potosí 78290, Mexico
| | - E Chiovetto
- Department of Cognitive Neurology, University of Tuebingen, Tübingen 72076, Germany
| | - D Piovesan
- Biomedical Engineering Program, Gannon University, Erie, PA 16541
| |
|
31
|
Necci M, Piovesan D, Clementel D, Dosztányi Z, Tosatto SCE. MobiDB-lite 3.0: fast consensus annotation of intrinsic disorder flavours in proteins. Bioinformatics 2020; 36:5533-5534. [PMID: 33325498 DOI: 10.1093/bioinformatics/btaa1045] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 08/05/2020] [Revised: 11/03/2020] [Accepted: 12/07/2020] [Indexed: 12/31/2022] Open
Abstract
MOTIVATION: The earlier version of MobiDB-lite is currently used in large-scale proteome annotation platforms to detect intrinsic disorder. However, new theoretical models allow for the classification of intrinsically disordered regions into subtypes from sequence features associated with specific polymeric properties or compositional bias. RESULTS: MobiDB-lite 3.0 maintains its previous speed and performance but also provides a finer classification of disorder by identifying regions with characteristics of polyampholytes, positive or negative polyelectrolytes, low-complexity regions, or regions enriched in cysteine, proline, glycine or polar residues. Sub-regions are abundantly detected in IDRs of the human proteome. The new version of MobiDB-lite represents a new step for the proteome-level analysis of protein disorder. AVAILABILITY: Both the MobiDB-lite 3.0 source code and a Docker container are available from the GitHub repository: https://github.com/BioComputingUP/MobiDB-lite.
Affiliation(s)
- Marco Necci
- Department of Biomedical Sciences, University of Padua, via U. Bassi 58/b, 35121 Padova, Italy
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padua, via U. Bassi 58/b, 35121 Padova, Italy
| | - Damiano Clementel
- Department of Biomedical Sciences, University of Padua, via U. Bassi 58/b, 35121 Padova, Italy
| | - Zsuzsanna Dosztányi
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, ELTE Eötvös Loránd University, Pázmány Péter sétány 1/c, Budapest, Hungary
| | - Silvio C E Tosatto
- Department of Biomedical Sciences, University of Padua, via U. Bassi 58/b, 35121 Padova, Italy
| |
|
32
|
Quaglia F, Hatos A, Piovesan D, Tosatto SCE. Exploring Manually Curated Annotations of Intrinsically Disordered Proteins with DisProt. Curr Protoc Bioinformatics 2020; 72:e107. [PMID: 33017101 DOI: 10.1002/cpbi.107] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/27/2022]
Abstract
DisProt is the major repository of manually curated data for intrinsically disordered proteins collected from the literature. Although lacking a stable tertiary structure under physiological conditions, intrinsically disordered proteins carry out a plethora of biological functions, some of them directly arising from their flexible nature. A growing number of scientific studies have been published during the last few decades in an effort to shed light on their unstructured state, their binding modes, and their functions. DisProt makes use of a team of expert biocurators to provide up-to-date annotations of intrinsically disordered proteins from the literature, making them available to the scientific community. Here we present a comprehensive description of how to use DisProt in different contexts and provide a detailed explanation of how to explore and interpret manually curated annotations of intrinsically disordered proteins. We describe how to search DisProt annotations, using both the web interface and the API for programmatic access. Finally, we explain how to visualize and interpret a DisProt entry, p53, a widely studied protein characterized by the presence of unstructured N-terminal and C-terminal regions. © 2020 Wiley Periodicals LLC. Basic Protocol 1: Performing a search in DisProt. Support Protocol 1: Downloading options. Support Protocol 2: Programmatic access with DisProt REST API. Basic Protocol 2: Visualizing and interpreting DisProt entries: the p53 use case. Basic Protocol 3: Providing feedback and submitting new intrinsic disorder-related data.
Affiliation(s)
- Federica Quaglia
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - András Hatos
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | | |
|
33
|
Lawson K, Gauthier KS, Piovesan D, Fournier J, Rosen B, Maliyan A, Beatty J, Jin L, Leleti M, Ginn E, Udyavar A, Ada C, Au J, Meleza C, Zhao S, Young S, Walters M, Powers J. Discovery and characterization of novel, potent, and selective hypoxia-inducible factor (HIF)-2α inhibitors. Eur J Cancer 2020. [DOI: 10.1016/s0959-8049(20)31106-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
|
34
|
Jarnot P, Ziemska-Legiecka J, Dobson L, Merski M, Mier P, Andrade-Navarro MA, Hancock JM, Dosztányi Z, Paladin L, Necci M, Piovesan D, Tosatto SCE, Promponas VJ, Grynberg M, Gruca A. PlaToLoCo: the first web meta-server for visualization and annotation of low complexity regions in proteins. Nucleic Acids Res 2020; 48:W77-W84. [PMID: 32421769 PMCID: PMC7319588 DOI: 10.1093/nar/gkaa339] [Citation(s) in RCA: 58] [Impact Index Per Article: 14.5] [Received: 03/07/2020] [Revised: 04/08/2020] [Accepted: 05/01/2020] [Indexed: 12/25/2022] Open
Abstract
Low complexity regions (LCRs) in protein sequences are characterized by a less diverse amino acid composition compared to typically observed sequence diversity. Recent studies have shown that LCRs may co-occur with intrinsically disordered regions, are highly conserved in many organisms, and often play important roles in protein functions and in diseases. In previous decades, several methods have been developed to identify regions with LCRs or amino acid bias, but most of them are stand-alone applications, and currently there is no web-based tool which allows users to explore LCRs in protein sequences with additional functional annotations. We aim to fill this gap by providing PlaToLoCo (PLAtform of TOols for LOw COmplexity), a meta-server that integrates and collects the output of five different state-of-the-art tools for discovering LCRs and provides functional annotations such as domain detection, transmembrane segment prediction, and calculation of amino acid frequencies. In addition, the union or intersection of the results of the search on a query sequence can be obtained. By developing the PlaToLoCo meta-server, we provide the community with a fast and easily accessible tool for the analysis of LCRs with additional information included to aid the interpretation of the results. The PlaToLoCo platform is available at: http://platoloco.aei.polsl.pl/.
Affiliation(s)
- Patryk Jarnot
- Department of Computer Networks and Systems, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
| | | | - Laszlo Dobson
- Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Práter u. 50/A, 1083 Budapest, Hungary.,Research Centre for Natural Sciences, Magyar Tudósok Körútja 2, 1117 Budapest, Hungary
| | - Matthew Merski
- Structural Biology Group, Biological and Chemical Research Centre, Department of Chemistry, University of Warsaw, Żwirki i Wigury 101, 02-089 Warsaw, Poland
| | - Pablo Mier
- Faculty of Biology, Johannes Gutenberg University Mainz, Hans-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
| | - Miguel A Andrade-Navarro
- Faculty of Biology, Johannes Gutenberg University Mainz, Hans-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
| | - John M Hancock
- ELIXIR, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Zsuzsanna Dosztányi
- Department of Biochemistry, ELTE Eötvös Loránd University, Budapest, Pázmány Péter stny 1/c 1117, Budapest, Hungary
| | - Lisanna Paladin
- Department of Biomedical Sciences, University of Padova, Via Ugo Bassi 58/B, 35131 Padova, Italy
| | - Marco Necci
- Department of Biomedical Sciences, University of Padova, Via Ugo Bassi 58/B, 35131 Padova, Italy
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Via Ugo Bassi 58/B, 35131 Padova, Italy
| | - Silvio C E Tosatto
- Department of Biomedical Sciences, University of Padova, Via Ugo Bassi 58/B, 35131 Padova, Italy
| | - Vasilis J Promponas
- Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, P.O. Box 20537, Nicosia, CY 1678, Cyprus
| | - Marcin Grynberg
- Institute of Biochemistry and Biophysics PAS, Pawinskiego 5A, 02-106 Warsaw, Poland
| | - Aleksandra Gruca
- Department of Computer Networks and Systems, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
| |
|
35
|
Paladin L, Necci M, Piovesan D, Mier P, Andrade-Navarro MA, Tosatto SCE. A novel approach to investigate the evolution of structured tandem repeat protein families by exon duplication. J Struct Biol 2020; 212:107608. [PMID: 32896658 DOI: 10.1016/j.jsb.2020.107608] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/21/2020] [Revised: 08/19/2020] [Accepted: 08/21/2020] [Indexed: 11/30/2022]
Abstract
Tandem Repeat Proteins (TRPs) are ubiquitous in cells and are enriched in eukaryotes. They contributed to the evolution of organism complexity, specializing for functions that require quick adaptability such as immunity-related functions. To investigate the hypothesis of repeat protein evolution through exon duplication and rearrangement, we designed a tool to analyze the relationships between exon/intron patterns and structural symmetries. The tool allows comparison of the structure fragments as defined by exon/intron boundaries from Ensembl against the structural element repetitions from RepeatsDB. The all-against-all pairwise structural alignment between fragments and comparison of the two definitions (structural units and exons) are visualized in a single matrix, the "repeat/exon plot". An analysis of different repeat protein families, including the solenoids Leucine-Rich, Ankyrin, Pumilio, HEAT repeats and the β propellers Kelch-like, WD40 and RCC1, shows different behaviors, illustrated here through examples. For each example, the analysis of the exon mapping in homologous proteins supports the conservation of their exon patterns. We propose that when a clear-cut relationship between exon and structural boundaries can be identified, it is possible to infer a specific "evolutionary pattern" which may improve TRPs detection and classification.
Affiliation(s)
| | - Marco Necci
- Dept. of Biomedical Sciences, University of Padova, Italy
| | | | - Pablo Mier
- Faculty of Biology, Johannes Gutenberg University of Mainz, Germany
| | | | | |
|
36
|
Monzon AM, Necci M, Quaglia F, Walsh I, Zanotti G, Piovesan D, Tosatto SCE. Experimentally Determined Long Intrinsically Disordered Protein Regions Are Now Abundant in the Protein Data Bank. Int J Mol Sci 2020; 21:ijms21124496. [PMID: 32599863 PMCID: PMC7349999 DOI: 10.3390/ijms21124496] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 05/25/2020] [Revised: 06/18/2020] [Accepted: 06/19/2020] [Indexed: 01/12/2023] Open
Abstract
Intrinsically disordered protein regions are commonly defined from missing electron density in X-ray structures. Experimental evidence for long disorder regions (LDRs) of at least 30 residues was so far limited to manually curated proteins. Here, we describe a comprehensive and large-scale analysis of experimental LDRs for 3133 unique proteins, demonstrating an increasing coverage of intrinsic disorder in the Protein Data Bank (PDB) in the last decade. The results suggest that long missing residue regions are a good-quality source for annotating intrinsically disordered regions and performing functional analysis in large data sets. The consensus approach used to define LDRs makes it possible to evaluate context-dependent disorder and provides a common definition at the protein level.
Affiliation(s)
- Alexander Miguel Monzon
- Department of Biomedical Sciences, University of Padua, 35131 Padua, Italy; (A.M.M.); (M.N.); (F.Q.); (G.Z.)
| | - Marco Necci
- Department of Biomedical Sciences, University of Padua, 35131 Padua, Italy; (A.M.M.); (M.N.); (F.Q.); (G.Z.)
| | - Federica Quaglia
- Department of Biomedical Sciences, University of Padua, 35131 Padua, Italy; (A.M.M.); (M.N.); (F.Q.); (G.Z.)
| | - Ian Walsh
- Bioprocessing Technology Institute, A*STAR, Singapore 138668, Singapore;
| | - Giuseppe Zanotti
- Department of Biomedical Sciences, University of Padua, 35131 Padua, Italy; (A.M.M.); (M.N.); (F.Q.); (G.Z.)
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padua, 35131 Padua, Italy; (A.M.M.); (M.N.); (F.Q.); (G.Z.)
- Correspondence: (D.P.); (S.C.E.T.)
| | - Silvio C. E. Tosatto
- Department of Biomedical Sciences, University of Padua, 35131 Padua, Italy; (A.M.M.); (M.N.); (F.Q.); (G.Z.)
- Correspondence: (D.P.); (S.C.E.T.)
| |
|
37
|
Piovesan D, Hatos A, Minervini G, Quaglia F, Monzon AM, Tosatto SCE. Assessing predictors for new post translational modification sites: A case study on hydroxylation. PLoS Comput Biol 2020; 16:e1007967. [PMID: 32569263 PMCID: PMC7332089 DOI: 10.1371/journal.pcbi.1007967] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 10/25/2019] [Revised: 07/02/2020] [Accepted: 05/19/2020] [Indexed: 12/15/2022] Open
Abstract
Post-translational modification (PTM) sites have become popular for predictor development. However, with the exception of phosphorylation and a handful of other examples, PTMs suffer from a limited number of available training examples and sparsity in protein sequences. Here, proline hydroxylation is taken as an example to compare different methods and evaluate their performance on new experimentally determined sites. As a guide for effective experimental design, predictors require both high specificity and sensitivity. However, the self-reported performance may often not be indicative of prediction quality and detection of new sites is not guaranteed. We have benchmarked seven published hydroxylation site predictors on two newly constructed independent datasets. The self-reported performance is found to widely overestimate the real accuracy measured on independent datasets. No predictor performs better than random on new examples, indicating the refined models do not sufficiently generalize to detect new sites. The number of false positives is high and precision low, in particular for non-collagen proteins whose motifs are not conserved. As hydroxylation site predictors do not generalize for new data, caution is advised when using PTM predictors in the absence of independent evaluations, in particular for highly specific sites involved in signalling.
Machine learning methods are extensively used by biologists to design and interpret experiments. Predictors which take only the sequence as input are of particular interest due to the large amount of available sequence data and high self-reported performance. In this work, we evaluated post-translational modification (PTM) predictors for hydroxylation sites and found that they perform no better than random, in strong contrast to performances reported in their original publications. PTMs are chemical amino acid alterations providing the cell with conditional mechanisms to fine-tune protein function, regulating complex biological processes such as signalling and cell cycle. Hydroxylation sites are a good PTM test case due to the availability of a range of predictors and an abundance of newly experimentally detected modification sites. Poor performances in our results highlight the overlooked problem of predicting PTMs when best practices are not followed and training data are likely incomplete. Experimentalists should be careful when using PTM predictors blindly and more independent assessments are needed to establish their usefulness in practice.
Affiliation(s)
- Damiano Piovesan
- Department of Biomedical Sciences, University of Padua, Padua, Italy
| | - Andras Hatos
- Department of Biomedical Sciences, University of Padua, Padua, Italy
| | | | - Federica Quaglia
- Department of Biomedical Sciences, University of Padua, Padua, Italy
| | | | | |
|
38
|
El-Gebali S, Mistry J, Bateman A, Eddy SR, Luciani A, Potter SC, Qureshi M, Richardson LJ, Salazar GA, Smart A, Sonnhammer ELL, Hirsh L, Paladin L, Piovesan D, Tosatto SCE, Finn RD. The Pfam protein families database in 2019. Nucleic Acids Res 2019; 47:D427-D432. [PMID: 30357350 PMCID: PMC6324024 DOI: 10.1093/nar/gky995] [Citation(s) in RCA: 2821] [Impact Index Per Article: 705.3] [Received: 09/28/2018] [Accepted: 10/09/2018] [Indexed: 12/11/2022] Open
Abstract
The last few years have witnessed significant changes in Pfam (https://pfam.xfam.org). The number of families has grown substantially to a total of 17,929 in release 32.0. New additions have been coupled with efforts to improve existing families, including refinement of domain boundaries, their classification into Pfam clans, as well as their functional annotation. We recently began to collaborate with the RepeatsDB resource to improve the definition of tandem repeat families within Pfam. We carried out a significant comparison to the structural classification database, namely the Evolutionary Classification of Protein Domains (ECOD) that led to the creation of 825 new families based on their set of uncharacterized families (EUFs). Furthermore, we also connected Pfam entries to the Sequence Ontology (SO) through mapping of the Pfam type definitions to SO terms. Since Pfam has many community contributors, we recently enabled the linking between authorship of all Pfam entries with the corresponding authors’ ORCID identifiers. This effectively permits authors to claim credit for their Pfam curation and link them to their ORCID record.
Affiliation(s)
- Sara El-Gebali
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Jaina Mistry
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Alex Bateman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Sean R Eddy
- HHMI, Harvard University, 16 Divinity Ave Cambridge, MA 02138 USA
| | - Aurélien Luciani
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Simon C Potter
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Matloob Qureshi
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Lorna J Richardson
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Gustavo A Salazar
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Alfredo Smart
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Erik L L Sonnhammer
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, 17121 Solna, Sweden
| | - Layla Hirsh
- Department of Biomedical Sciences, University of Padua, 35131 Padova, Italy.,Dept. of Engineering, Pontificia Universidad Católica del Perú 1801, San Miguel 15088, Lima, Perú
| | - Lisanna Paladin
- Department of Biomedical Sciences, University of Padua, 35131 Padova, Italy
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padua, 35131 Padova, Italy
| | - Silvio C E Tosatto
- Department of Biomedical Sciences, University of Padua, 35131 Padova, Italy
| | - Robert D Finn
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| |
|
39
|
Hatos A, Hajdu-Soltész B, Monzon AM, Palopoli N, Álvarez L, Aykac-Fas B, Bassot C, Benítez GI, Bevilacqua M, Chasapi A, Chemes L, Davey NE, Davidović R, Dunker AK, Elofsson A, Gobeill J, Foutel NSG, Sudha G, Guharoy M, Horvath T, Iglesias V, Kajava AV, Kovacs OP, Lamb J, Lambrughi M, Lazar T, Leclercq JY, Leonardi E, Macedo-Ribeiro S, Macossay-Castillo M, Maiani E, Manso JA, Marino-Buslje C, Martínez-Pérez E, Mészáros B, Mičetić I, Minervini G, Murvai N, Necci M, Ouzounis CA, Pajkos M, Paladin L, Pancsa R, Papaleo E, Parisi G, Pasche E, Barbosa Pereira PJ, Promponas VJ, Pujols J, Quaglia F, Ruch P, Salvatore M, Schad E, Szabo B, Szaniszló T, Tamana S, Tantos A, Veljkovic N, Ventura S, Vranken W, Dosztányi Z, Tompa P, Tosatto SCE, Piovesan D. DisProt: intrinsic protein disorder annotation in 2020. Nucleic Acids Res 2020; 48:D269-D276. [PMID: 31713636 PMCID: PMC7145575 DOI: 10.1093/nar/gkz975] [Citation(s) in RCA: 98] [Impact Index Per Article: 24.5] [Received: 09/15/2019] [Revised: 10/11/2019] [Accepted: 10/12/2019] [Indexed: 11/29/2022] Open
Abstract
The Database of Protein Disorder (DisProt, URL: https://disprot.org) provides manually curated annotations of intrinsically disordered proteins from the literature. Here we report recent developments with DisProt (version 8), including the doubling of protein entries, a new disorder ontology, improvements of the annotation format and a completely new website. The website includes a redesigned graphical interface, a better search engine, a clearer API for programmatic access and a new annotation interface that integrates text mining technologies. The new entry format provides greater flexibility, simplifies maintenance and allows the capture of more information from the literature. The new disorder ontology has been formalized and made interoperable by adopting the OWL format, and its structure and term definitions have been improved. The new annotation interface has made the curation process faster and more effective. We recently showed that new DisProt annotations can be effectively used to train and validate disorder predictors. We believe the growth of DisProt will accelerate, contributing to the improvement of function and disorder predictors and therefore helping to illuminate the ‘dark’ proteome.
Affiliation(s)
- András Hatos
- Department of Biomedical Sciences, University of Padova, Padova 35121, Italy
| | - Borbála Hajdu-Soltész
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest 1117, Hungary
| | - Alexander M Monzon
- Department of Biomedical Sciences, University of Padova, Padova 35121, Italy
| | - Nicolas Palopoli
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes - CONICET, Bernal, Buenos Aires B1876BXD, Argentina
| | - Lucía Álvarez
- Consejo Nacional de Investigaciones Científicas y Técnicas. Instituto de Investigaciones Biotecnológicas IIBIO, Universidad Nacional de San Martín, San Martín, Buenos Aires, Argentina
| | - Burcu Aykac-Fas
- Computational Biology Laboratory, Danish Cancer Society Research Center, Copenhagen DK-2100, Denmark
| | - Claudio Bassot
- Department of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Box 1031, Solna 17121, Sweden
| | - Guillermo I Benítez
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes - CONICET, Bernal, Buenos Aires B1876BXD, Argentina
| | - Martina Bevilacqua
- Department of Biomedical Sciences, University of Padova, Padova 35121, Italy
| | - Anastasia Chasapi
- Biological Computation & Process Laboratory, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas, Thessalonica GR-57500, Greece
| | - Lucia Chemes
- Consejo Nacional de Investigaciones Científicas y Técnicas. Instituto de Investigaciones Biotecnológicas IIBIO, Universidad Nacional de San Martín, San Martín, Buenos Aires, Argentina.,Departamento de Fisiología y Biología Molecular y Celular (DFBMC), Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina
| | - Norman E Davey
- Division of Cancer Biology, The Institute of Cancer Research, Chelsea, London SW3 6BJ, UK
| | - Radoslav Davidović
- Laboratory for Bioinformatics and Computational Chemistry, Institute of Nuclear Sciences Vinca, University of Belgrade, Belgrade 11001, Serbia
| | - A Keith Dunker
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, IN 46202, USA
| | - Arne Elofsson
- Department of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Box 1031, Solna 17121, Sweden
| | - Julien Gobeill
- Swiss Institute of Bioinformatics and HES-SO \ HEG, Geneva 1200, Switzerland
| | - Nicolás S González Foutel
- Consejo Nacional de Investigaciones Científicas y Técnicas. Instituto de Investigaciones Biotecnológicas IIBIO, Universidad Nacional de San Martín, San Martín, Buenos Aires, Argentina
| | - Govindarajan Sudha
- Department of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Box 1031, Solna 17121, Sweden
| | - Mainak Guharoy
- Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels 1050, Belgium.,VIB-VUB Center for Structural Biology, Flanders Institute for Biotechnology (VIB), Brussels 1050, Belgium
| | - Tamas Horvath
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest H-1117, Hungary
| | - Valentin Iglesias
- Departament de Bioquímica i Biologia Molecular and Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, Bellaterra 08193, Spain
| | - Andrey V Kajava
- Centre de Recherche en Biologie cellulaire de Montpellier (CRBM), UMR 5237 CNRS, Université Montpellier, Montpellier 34293, France.,Institut de Biologie Computationnelle(IBC), Montpellier 34095, France
| | - Orsolya P Kovacs
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest H-1117, Hungary
| | - John Lamb
- Department of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Box 1031, Solna 17121, Sweden
| | - Matteo Lambrughi
- Computational Biology Laboratory, Danish Cancer Society Research Center, Copenhagen DK-2100, Denmark
| | - Tamas Lazar
- Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels 1050, Belgium.,VIB-VUB Center for Structural Biology, Flanders Institute for Biotechnology (VIB), Brussels 1050, Belgium
| | - Jeremy Y Leclercq
- Centre de Recherche en Biologie cellulaire de Montpellier (CRBM), UMR 5237 CNRS, Université Montpellier, Montpellier 34293, France
| | - Emanuela Leonardi
- Department of Woman and Child Health, University of Padova, Padova 35127, Italy.,Fondazione Istituto di Ricerca Pediatrica (IRP), Città della Speranza, Padova 35127, Italy
| | - Sandra Macedo-Ribeiro
- Instituto de Biologia Molecular e Celular (IBMC) and Instituto de Investigação e Inovação em Saúde (i3S), Universidade do Porto, Porto 4200-135, Portugal
| | - Mauricio Macossay-Castillo
- Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels 1050, Belgium.,VIB-VUB Center for Structural Biology, Flanders Institute for Biotechnology (VIB), Brussels 1050, Belgium
| | - Emiliano Maiani
- Computational Biology Laboratory, Danish Cancer Society Research Center, Copenhagen DK-2100, Denmark
| | - José A Manso
- Instituto de Biologia Molecular e Celular (IBMC) and Instituto de Investigação e Inovação em Saúde (i3S), Universidade do Porto, Porto 4200-135, Portugal
| | - Cristina Marino-Buslje
- Bioinformatics Unit. Fundación Instituto Leloir, Ciudad de Buenos Aires C1405BWE, Argentina
| | | | - Bálint Mészáros
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest 1117, Hungary
| | - Ivan Mičetić
- Department of Biomedical Sciences, University of Padova, Padova 35121, Italy
| | - Giovanni Minervini
- Department of Biomedical Sciences, University of Padova, Padova 35121, Italy
| | - Nikoletta Murvai
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest H-1117, Hungary
| | - Marco Necci
- Department of Biomedical Sciences, University of Padova, Padova 35121, Italy
| | - Christos A Ouzounis
- Biological Computation & Process Laboratory, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas, Thessalonica GR-57500, Greece
| | - Mátyás Pajkos
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest 1117, Hungary
| | - Lisanna Paladin
- Department of Biomedical Sciences, University of Padova, Padova 35121, Italy
| | - Rita Pancsa
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest H-1117, Hungary
| | - Elena Papaleo
- Computational Biology Laboratory, Danish Cancer Society Research Center, Copenhagen DK-2100, Denmark.,Translational Disease Systems Biology, Faculty of Health and Medical Sciences, Novo Nordisk Foundation Center for Protein Research University of Copenhagen, Copenhagen DK-2200, Denmark
| | - Gustavo Parisi
- Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes - CONICET, Bernal, Buenos Aires B1876BXD, Argentina
| | - Emilie Pasche
- Swiss Institute of Bioinformatics and HES-SO \ HEG, Geneva 1200, Switzerland
| | - Pedro J Barbosa Pereira
- Instituto de Biologia Molecular e Celular (IBMC) and Instituto de Investigação e Inovação em Saúde (i3S), Universidade do Porto, Porto 4200-135, Portugal
| | - Vasilis J Promponas
- Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia, CY 1678, Cyprus
| | - Jordi Pujols
- Departament de Bioquímica i Biologia Molecular and Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, Bellaterra 08193, Spain
| | - Federica Quaglia
- Department of Biomedical Sciences, University of Padova, Padova 35121, Italy
| | - Patrick Ruch
- Swiss Institute of Bioinformatics and HES-SO \ HEG, Geneva 1200, Switzerland
| | - Marco Salvatore
- Department of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Box 1031, Solna 17121, Sweden
| | - Eva Schad
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest H-1117, Hungary
| | - Beata Szabo
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest H-1117, Hungary
| | - Tamás Szaniszló
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest 1117, Hungary
| | - Stella Tamana
- Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia, CY 1678, Cyprus
| | - Agnes Tantos
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest H-1117, Hungary
| | - Nevena Veljkovic
- Laboratory for Bioinformatics and Computational Chemistry, Institute of Nuclear Sciences Vinca, University of Belgrade, Belgrade 11001, Serbia
| | - Salvador Ventura
- Departament de Bioquímica i Biologia Molecular and Institut de Biotecnologia i Biomedicina, Universitat Autònoma de Barcelona, Bellaterra 08193, Spain
| | - Wim Vranken
- Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels 1050, Belgium.,VIB-VUB Center for Structural Biology, Flanders Institute for Biotechnology (VIB), Brussels 1050, Belgium.,Interuniversity Institute of Bioinformatics in Brussels (IB2), ULB-VUB, Brussels 1050, Belgium
| | - Zsuzsanna Dosztányi
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest 1117, Hungary
| | - Peter Tompa
- Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels 1050, Belgium.,VIB-VUB Center for Structural Biology, Flanders Institute for Biotechnology (VIB), Brussels 1050, Belgium.,Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest H-1117, Hungary
| | - Silvio C E Tosatto
- Department of Biomedical Sciences, University of Padova, Padova 35121, Italy.,CNR Institute of Neuroscience, Padova 35121, Italy
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova 35121, Italy
| |
|
40
|
Piovesan D, Tosatto SCE. INGA 2.0: improving protein function prediction for the dark proteome. Nucleic Acids Res 2020; 47:W373-W378. [PMID: 31073595 PMCID: PMC6602455 DOI: 10.1093/nar/gkz375] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 02/13/2019] [Revised: 04/29/2019] [Accepted: 04/30/2019] [Indexed: 12/21/2022] Open
Abstract
Our current knowledge of complex biological systems is stored in a computable form through the Gene Ontology (GO), which provides a comprehensive description of gene function. Prediction of GO terms from sequence remains, however, a challenging task, which is particularly critical for novel genomes. Here we present INGA 2.0, a new version of the INGA software for protein function prediction. INGA exploits homology, domain architecture, interaction networks and information from the ‘dark proteome’, such as transmembrane and intrinsically disordered regions, to generate a consensus prediction. INGA was ranked in the top ten methods on both CAFA2 and CAFA3 blind tests. The new algorithm can process entire genomes in a few hours, or even less when additional input files are provided. The new interface provides a better user experience by integrating filters and widgets to explore the graph structure of the predicted terms. The INGA web server, databases and benchmarking are available from URL: https://inga.bio.unipd.it/.
Affiliation(s)
- Damiano Piovesan
- Department of Biomedical Sciences, University of Padua, Padua, Italy
| | - Silvio C E Tosatto
- Department of Biomedical Sciences, University of Padua, Padua, Italy.,CNR Institute of Neuroscience, Padua, Italy
| |
|
41
|
Paladin L, Schaeffer M, Gaudet P, Zahn-Zabal M, Michel PA, Piovesan D, Tosatto SCE, Bairoch A. The Feature-Viewer: a visualization tool for positional annotations on a sequence. Bioinformatics 2020; 36:3244-3245. [DOI: 10.1093/bioinformatics/btaa055] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 09/26/2019] [Revised: 01/02/2020] [Accepted: 01/20/2020] [Indexed: 01/15/2023] Open
Abstract
Summary
The Feature-Viewer is a lightweight library for the visualization of biological data mapped to a protein or nucleotide sequence. It is designed for ease of use while allowing full customization. The library is already used by several biological data resources and allows intuitive visual mapping of a full spectrum of sequence features for different uses.
Availability and implementation
The Feature-Viewer is open source, compatible with state-of-the-art development technologies and responsive, also for mobile viewing. Documentation and usage examples are available online.
Affiliation(s)
- Lisanna Paladin
- Department of Biomedical Sciences, University of Padua, Padova 35121, Italy
| | - Mathieu Schaeffer
- CALIPHO Group, Swiss Institute of Bioinformatics, University of Geneva, Geneva 1206, Switzerland
| | - Pascale Gaudet
- CALIPHO Group, Swiss Institute of Bioinformatics, University of Geneva, Geneva 1206, Switzerland
| | - Monique Zahn-Zabal
- CALIPHO Group, Swiss Institute of Bioinformatics, University of Geneva, Geneva 1206, Switzerland
| | - Pierre-André Michel
- CALIPHO Group, Swiss Institute of Bioinformatics, University of Geneva, Geneva 1206, Switzerland
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padua, Padova 35121, Italy
| | - Silvio C E Tosatto
- Department of Biomedical Sciences, University of Padua, Padova 35121, Italy
- CNR Institute of Neuroscience, Padova 35121, Italy
| | - Amos Bairoch
- CALIPHO Group, Swiss Institute of Bioinformatics, University of Geneva, Geneva 1206, Switzerland
| |
|
42
|
Piovanelli E, Piovesan D, Shirafuji S, Ota J. A Simple Method to Estimate Muscle Currents from HD-sEMG and MRI using Electrical Network and Graph Theory. Annu Int Conf IEEE Eng Med Biol Soc 2020; 2019:2657-2662. [PMID: 31946442 DOI: 10.1109/embc.2019.8856616] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/08/2022]
Abstract
In recent years, the spread of hand prosthetics has fueled research in signal processing applied to physiological data. State-of-the-art algorithms allow precise estimation of hand movements, but most of them work only in the electrode space. Although some signal processing methods access single-muscle information, they are still too immature for real application in prosthetics. We present a novel method that exploits information extracted from a magnetic resonance image (MRI) and a single row of high-density surface electromyography (HD-sEMG) electrodes to estimate the muscle currents in the forearm, and we provide a first experimental application on two simple wrist movements to assess its performance. The results show that the proposed method identifies the correct muscle in a single-muscle contraction task, whereas for a two-muscle task it shows high variance in the results. The method models signal propagation from muscles to electrodes using a simple resistive electrical network and uses graph theory to calculate the muscle currents. It offers a considerably simpler way to estimate muscle currents, significantly decreasing the problem complexity, and is therefore a potentially effective approach for future prosthetics control.
|
43
|
Piovanelli E, Piovesan D, Shirafuji S, Ota J. Estimating Deep Muscles Activation from High Density Surface EMG Using Graph Theory. IEEE Int Conf Rehabil Robot 2020; 2019:405-410. [PMID: 31374663 DOI: 10.1109/icorr.2019.8779462] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
Abstract
In recent years, important steps forward have been made in signal processing of muscle signals for hand prosthetics control. State-of-the-art algorithms and techniques allow precise estimation of hand movements. However, they mostly work exclusively in the electrode space and do not seek any information about the currents in the contracted muscles. In this study we propose a novel simplified method to estimate the muscle currents in the forearm, along with a first experimental application on two simple movements to assess its performance. We modeled signal propagation from muscles to electrodes using a purely resistive electrical network and then applied graph theory to assess the muscle currents. The proposed method considerably simplifies the estimation of muscle currents, decreasing the problem complexity, and could therefore be a suitable approach for future prosthetics control.
|
44
|
Zhou N, Jiang Y, Bergquist TR, Lee AJ, Kacsoh BZ, Crocker AW, Lewis KA, Georghiou G, Nguyen HN, Hamid MN, Davis L, Dogan T, Atalay V, Rifaioglu AS, Dalkıran A, Cetin Atalay R, Zhang C, Hurto RL, Freddolino PL, Zhang Y, Bhat P, Supek F, Fernández JM, Gemovic B, Perovic VR, Davidović RS, Sumonja N, Veljkovic N, Asgari E, Mofrad MRK, Profiti G, Savojardo C, Martelli PL, Casadio R, Boecker F, Schoof H, Kahanda I, Thurlby N, McHardy AC, Renaux A, Saidi R, Gough J, Freitas AA, Antczak M, Fabris F, Wass MN, Hou J, Cheng J, Wang Z, Romero AE, Paccanaro A, Yang H, Goldberg T, Zhao C, Holm L, Törönen P, Medlar AJ, Zosa E, Borukhov I, Novikov I, Wilkins A, Lichtarge O, Chi PH, Tseng WC, Linial M, Rose PW, Dessimoz C, Vidulin V, Dzeroski S, Sillitoe I, Das S, Lees JG, Jones DT, Wan C, Cozzetto D, Fa R, Torres M, Warwick Vesztrocy A, Rodriguez JM, Tress ML, Frasca M, Notaro M, Grossi G, Petrini A, Re M, Valentini G, Mesiti M, Roche DB, Reeb J, Ritchie DW, Aridhi S, Alborzi SZ, Devignes MD, Koo DCE, Bonneau R, Gligorijević V, Barot M, Fang H, Toppo S, Lavezzo E, Falda M, Berselli M, Tosatto SCE, Carraro M, Piovesan D, Ur Rehman H, Mao Q, Zhang S, Vucetic S, Black GS, Jo D, Suh E, Dayton JB, Larsen DJ, Omdahl AR, McGuffin LJ, Brackenridge DA, Babbitt PC, Yunes JM, Fontana P, Zhang F, Zhu S, You R, Zhang Z, Dai S, Yao S, Tian W, Cao R, Chandler C, Amezola M, Johnson D, Chang JM, Liao WH, Liu YW, Pascarelli S, Frank Y, Hoehndorf R, Kulmanov M, Boudellioua I, Politano G, Di Carlo S, Benso A, Hakala K, Ginter F, Mehryary F, Kaewphan S, Björne J, Moen H, Tolvanen MEE, Salakoski T, Kihara D, Jain A, Šmuc T, Altenhoff A, Ben-Hur A, Rost B, Brenner SE, Orengo CA, Jeffery CJ, Bosco G, Hogan DA, Martin MJ, O'Donovan C, Mooney SD, Greene CS, Radivojac P, Friedberg I. The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens. Genome Biol 2019; 20:244. 
[PMID: 31744546 PMCID: PMC6864930 DOI: 10.1186/s13059-019-1835-8] [Citation(s) in RCA: 166] [Impact Index Per Article: 33.2] [Received: 05/16/2019] [Accepted: 09/24/2019] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function. RESULTS Here, we report on the results of the third CAFA challenge, CAFA3, that featured an expanded analysis over the previous CAFA rounds, both in terms of volume of data analyzed and the types of analysis performed. In a major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in Candida albicans and Pseudomonas aeruginosa genomes, which provided us with genome-wide experimental data for genes associated with biofilm formation and motility. We further performed targeted assays on selected genes in Drosophila melanogaster, which we suspected of being involved in long-term memory. CONCLUSION We conclude that while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than the expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. Finally, we report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bio-ontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens.
Affiliation(s)
- Naihui Zhou
- Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA, USA.,Program in Bioinformatics and Computational Biology, Ames, IA, USA
| | - Yuxiang Jiang
- Indiana University Bloomington, Bloomington, Indiana, USA
| | - Timothy R Bergquist
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, USA
| | - Alexandra J Lee
- Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
| | - Balint Z Kacsoh
- Geisel School of Medicine at Dartmouth, Hanover, NH, USA.,Department of Molecular and Systems Biology, Hanover, NH, USA
| | - Alex W Crocker
- Department of Microbiology and Immunology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
| | - Kimberley A Lewis
- Department of Microbiology and Immunology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
| | - George Georghiou
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, United Kingdom
| | - Huy N Nguyen
- Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA, USA.,Program in Computer Science, Ames, IA, USA
| | - Md Nafiz Hamid
- Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA, USA.,Program in Bioinformatics and Computational Biology, Ames, IA, USA
| | - Larry Davis
- Program in Bioinformatics and Computational Biology, Ames, IA, USA
| | - Tunca Dogan
- Department of Computer Engineering, Hacettepe University, Ankara, Turkey.,European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | - Volkan Atalay
- Department of Computer Engineering, Middle East Technical University (METU), Ankara, Turkey
| | - Ahmet S Rifaioglu
- Department of Computer Engineering, Middle East Technical University (METU), Ankara, Turkey.,Department of Computer Engineering, Iskenderun Technical University, Hatay, Turkey
| | - Alperen Dalkıran
- Department of Computer Engineering, Middle East Technical University (METU), Ankara, Turkey
| | - Rengul Cetin Atalay
- CanSyL, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
| | - Chengxin Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Rebecca L Hurto
- Department of Biological Chemistry, University of Michigan, Ann Arbor, MI, USA
| | - Peter L Freddolino
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.,Department of Biological Chemistry, University of Michigan, Ann Arbor, MI, USA
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.,Department of Biological Chemistry, University of Michigan, Ann Arbor, MI, USA
| | | | - Fran Supek
- Institute for Research in Biomedicine (IRB Barcelona), Barcelona, Spain.,Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
| | - José M Fernández
- INB Coordination Unit, Life Sciences Department, Barcelona Supercomputing Center, Barcelona, Catalonia, Spain.,(former) INB GN2, Structural and Computational Biology Programme, Spanish National Cancer Research Centre, Barcelona, Catalonia, Spain
| | - Branislava Gemovic
- Laboratory for Bioinformatics and Computational Chemistry, Institute of Nuclear Sciences VINCA, University of Belgrade, Belgrade, Serbia
| | - Vladimir R Perovic
- Laboratory for Bioinformatics and Computational Chemistry, Institute of Nuclear Sciences VINCA, University of Belgrade, Belgrade, Serbia
| | - Radoslav S Davidović
- Laboratory for Bioinformatics and Computational Chemistry, Institute of Nuclear Sciences VINCA, University of Belgrade, Belgrade, Serbia
| | - Neven Sumonja
- Laboratory for Bioinformatics and Computational Chemistry, Institute of Nuclear Sciences VINCA, University of Belgrade, Belgrade, Serbia
| | - Nevena Veljkovic
- Laboratory for Bioinformatics and Computational Chemistry, Institute of Nuclear Sciences VINCA, University of Belgrade, Belgrade, Serbia
| | - Ehsaneddin Asgari
- Molecular Cell Biomechanics Laboratory, Departments of Bioengineering, University of California Berkeley, Berkeley, CA, USA.,Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Berkeley, CA, USA
| | | | - Giuseppe Profiti
- Bologna Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy.,National Research Council, IBIOM, Bologna, Italy
| | - Castrense Savojardo
- Bologna Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
| | - Pier Luigi Martelli
- Bologna Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
| | - Rita Casadio
- Bologna Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
| | - Florian Boecker
- University of Bonn: INRES Crop Bioinformatics, Bonn, North Rhine-Westphalia, Germany
| | - Heiko Schoof
- INRES Crop Bioinformatics, University of Bonn, Bonn, Germany
| | - Indika Kahanda
- Gianforte School of Computing, Montana State University, Bozeman, Montana, USA
| | - Natalie Thurlby
- University of Bristol, Computer Science, Bristol, Bristol, United Kingdom
| | - Alice C McHardy
- Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Brunswick, Germany.,RESIST, DFG Cluster of Excellence 2155, Brunswick, Germany
| | - Alexandre Renaux
- Interuniversity Institute of Bioinformatics in Brussels, Université libre de Bruxelles - Vrije Universiteit Brussel, Brussels, Belgium.,Machine Learning Group, Université libre de Bruxelles, Brussels, Belgium.,Artificial Intelligence lab, Vrije Universiteit Brussel, Brussels, Belgium
| | - Rabie Saidi
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | - Julian Gough
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
| | - Alex A Freitas
- University of Kent, School of Computing, Canterbury, United Kingdom
| | - Magdalena Antczak
- School of Biosciences, University of Kent, Canterbury, Kent, United Kingdom
| | - Fabio Fabris
- University of Kent, School of Computing, Canterbury, United Kingdom
| | - Mark N Wass
- School of Biosciences, University of Kent, Canterbury, Kent, United Kingdom
| | - Jie Hou
- University of Missouri, Computer Science, Columbia, Missouri, USA.,Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
| | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
| | - Zheng Wang
- University of Miami, Coral Gables, Florida, USA
| | - Alfonso E Romero
- Centre for Systems and Synthetic Biology, Department of Computer Science, Royal Holloway, University of London, Egham, Surrey, United Kingdom
| | - Alberto Paccanaro
- Centre for Systems and Synthetic Biology, Department of Computer Science, Royal Holloway, University of London, Egham, Surrey, United Kingdom
| | - Haixuan Yang
- School of Mathematics, Statistics and Applied Mathematics, National University of Ireland, Galway, Galway, Ireland.,Technical University of Munich, Garching, Germany
| | - Tatyana Goldberg
- Department of Informatics, Bioinformatics & Computational Biology-i12, Technische Universitat Munchen, Munich, Germany
| | - Chenguang Zhao
- Faculty for Informatics, Garching, Germany.,Department for Bioinformatics and Computational Biology, Garching, Germany.,School of Computing Sciences and Computer Engineering, Hattiesburg, Mississippi, USA
| | - Liisa Holm
- Institute of Biotechnology, Helsinki Institute of Life Sciences, University of Helsinki, Finland, Helsinki, Finland
| | - Petri Törönen
- Institute of Biotechnology, Helsinki Institute of Life Sciences, University of Helsinki, Finland, Helsinki, Finland
| | - Alan J Medlar
- Institute of Biotechnology, Helsinki Institute of Life Sciences, University of Helsinki, Finland, Helsinki, Finland
| | - Elaine Zosa
- Institute of Biotechnology, University of Helsinki, Helsinki, Finland
| | | | - Ilya Novikov
- Baylor College of Medicine, Department of Biochemistry and Molecular Biology, Houston, TX, USA
| | - Angela Wilkins
- Baylor College of Medicine, Department of Molecular and Human Genetics, Houston, TX, USA
| | - Olivier Lichtarge
- Baylor College of Medicine, Department of Molecular and Human Genetics, Houston, TX, USA
| | - Po-Han Chi
- National Tsing Hua University, Hsinchu, Taiwan
| | - Wei-Cheng Tseng
- Department of Electrical Engineering in National Tsing Hua University, Hsinchu City, Taiwan
| | - Michal Linial
- The Hebrew University of Jerusalem, Jerusalem, Israel
| | - Peter W Rose
- University of California San Diego, San Diego Supercomputer Center, La Jolla, California, USA
| | - Christophe Dessimoz
- Department of Computational Biology and Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland.,Department of Genetics, Evolution & Environment, and Department of Computer Science, University College London, London, UK.,Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Vedrana Vidulin
- Department of Knowledge Technologies, Jozef Stefan Institute, Ljubljana, Slovenia
| | - Saso Dzeroski
- Jozef Stefan Institute, Ljubljana, Slovenia.,Jozef Stefan International Postgraduate School, Ljubljana, Slovenia
| | - Ian Sillitoe
- Research Department of Structural and Molecular Biology, University College London, London, England
| | - Sayoni Das
- Research Department of Structural and Molecular Biology, University College London, London, United Kingdom
| | - Jonathan Gill Lees
- Research Department of Structural and Molecular Biology, University College London, London, United Kingdom.,Department of Health and Life Sciences, Oxford Brookes University, London, UK
| | - David T Jones
- The Francis Crick Institute, Biomedical Data Science Laboratory, London, United Kingdom.,Department of Genetics, Evolution and Environment, University College London, Gower Street, London, WC1E 6BT, United Kingdom
| | - Cen Wan
- Department of Computer Science, University College London, London, United Kingdom.,The Francis Crick Institute, Biomedical Data Science Laboratory, London, United Kingdom
| | - Domenico Cozzetto
- Department of Computer Science, University College London, London, United Kingdom.,The Francis Crick Institute, Biomedical Data Science Laboratory, London, United Kingdom
| | - Rui Fa
- Department of Computer Science, University College London, London, United Kingdom.,The Francis Crick Institute, Biomedical Data Science Laboratory, London, United Kingdom
| | - Mateo Torres
- Centre for Systems and Synthetic Biology, Department of Computer Science, Royal Holloway, University of London, Egham, Surrey, United Kingdom
| | - Alex Warwick Vesztrocy
- Department of Genetics, Evolution and Environment, University College London, Gower Street, London, WC1E 6BT, United Kingdom.,SIB Swiss Institute of Bioinformatics, Lausanne, 1015, Switzerland
| | - Jose Manuel Rodriguez
- Cardiovascular Proteomics Laboratory, Centro Nacional de Investigaciones Cardiovasculares Carlos III (CNIC), Madrid, Spain
| | - Michael L Tress
- Spanish National Cancer Research Centre (CNIO), Madrid, Spain
| | - Marco Frasca
- Università degli Studi di Milano - Computer Science Department - AnacletoLab, Milan, Milan, Italy
| | - Marco Notaro
- Università degli Studi di Milano - Computer Science Department - AnacletoLab, Milan, Milan, Italy
| | - Giuliano Grossi
- Università degli Studi di Milano - Computer Science Department - AnacletoLab, Milan, Milan, Italy
| | - Alessandro Petrini
- Università degli Studi di Milano - Computer Science Department - AnacletoLab, Milan, Milan, Italy
| | - Matteo Re
- Università degli Studi di Milano - Computer Science Department - AnacletoLab, Milan, Milan, Italy
| | - Giorgio Valentini
- Università degli Studi di Milano - Computer Science Department - AnacletoLab, Milan, Milan, Italy
| | - Marco Mesiti
- Università degli Studi di Milano - Computer Science Department - AnacletoLab, Milan, Milan, Italy.,Institut de Biologie Computationnelle, LIRMM, CNRS-UMR 5506, Universite de Montpellier, Montpellier, France
| | - Daniel B Roche
- Department of Informatics, Bioinformatics and Computational Biology-i12, Technische Universitat Munchen, Munich, Germany
| | - Jonas Reeb
- Department of Informatics, Bioinformatics and Computational Biology-i12, Technische Universitat Munchen, Munich, Germany
| | - David W Ritchie
- University of Lorraine, CNRS, Inria, LORIA, Nancy, 54000, France
| | - Sabeur Aridhi
- University of Lorraine, CNRS, Inria, LORIA, Nancy, 54000, France
| | | | - Marie-Dominique Devignes
- University of Lorraine, CNRS, Inria, LORIA, Nancy, 54000, France.,University of Lorraine, Nancy, Lorraine, France.,Inria, Nancy, France
| | | | - Richard Bonneau
- NYU Center for Data Science, New York, 10010, NY, USA.,Flatiron Institute, CCB, New York, 10010, NY, USA
| | - Vladimir Gligorijević
- Center for Computational Biology (CCB), Flatiron Institute, Simons Foundation, New York, New York, USA
| | - Meet Barot
- Center for Data Science, New York University, New York, 10011, NY, USA
| | - Hai Fang
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
| | - Stefano Toppo
- Department of Molecular Medicine, University of Padova, Padova, Italy
| | - Enrico Lavezzo
- Department of Molecular Medicine, University of Padova, Padova, Italy
| | - Marco Falda
- Department of Biology, University of Padova, Padova, Italy
| | - Michele Berselli
- Department of Molecular Medicine, University of Padova, Padova, Italy
| | - Silvio C E Tosatto
- CNR Institute of Neuroscience, Padova, Italy.,Department of Biomedical Sciences, University of Padua, Padova, Italy
| | - Marco Carraro
- Department of Biomedical Sciences, University of Padua, Padova, Italy
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padua, Padova, Italy
| | - Hafeez Ur Rehman
- Department of Computer Science, National University of Computer and Emerging Sciences, Peshawar, Khyber Pakhtoonkhwa, Pakistan
| | - Qizhong Mao
- Department of Computer and Information Sciences, Temple University, Philadelphia, PA, USA.,University of California, Riverside, Philadelphia, PA, USA
| | - Shanshan Zhang
- Department of Computer and Information Sciences, Temple University, Philadelphia, PA, USA
| | - Slobodan Vucetic
- Department of Computer and Information Sciences, Temple University, Philadelphia, PA, USA
| | - Gage S Black
- Department of Biology, Brigham Young University, Provo, UT, USA.,Bioinformatics Research Group, Provo, UT, USA
| | - Dane Jo
- Department of Biology, Brigham Young University, Provo, UT, USA.,Bioinformatics Research Group, Provo, UT, USA
| | - Erica Suh
- Department of Biology, Brigham Young University, Provo, UT, USA
| | - Jonathan B Dayton
- Department of Biology, Brigham Young University, Provo, UT, USA.,Bioinformatics Research Group, Provo, UT, USA
| | - Dallas J Larsen
- Department of Biology, Brigham Young University, Provo, UT, USA.,Bioinformatics Research Group, Provo, UT, USA
| | - Ashton R Omdahl
- Department of Biology, Brigham Young University, Provo, UT, USA.,Bioinformatics Research Group, Provo, UT, USA
| | - Liam J McGuffin
- School of Biological Sciences, University of Reading, Reading, England, United Kingdom
| | | | - Patricia C Babbitt
- Department of Pharmaceutical Chemistry, San Francisco, CA, USA.,Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, 94158, CA, USA
| | - Jeffrey M Yunes
- UC Berkeley - UCSF Graduate Program in Bioengineering, University of California, San Francisco, 94158, CA, USA.,Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, 94158, CA, USA
| | - Paolo Fontana
- Research and Innovation Center, Edmund Mach Foundation, San Michele all'Adige, Italy
| | - Feng Zhang
- State Key Laboratory of Genetic Engineering and Collaborative Innovation Center for Genetics and Development, Fudan University, Shanghai, Shanghai, China.,Department of Biostatistics and Computational Biology, School of Life Sciences, Fudan University, Shanghai, Shanghai, China
| | - Shanfeng Zhu
- School of Computer Science and Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai, China.,Institute of Science and Technology for Brain-Inspired Intelligence and Shanghai Institute of Artificial Intelligence Algorithms, Fudan University, Shanghai, China.,Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, Shanghai, China
| | - Ronghui You
- School of Computer Science and Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai, China.,Institute of Science and Technology for Brain-Inspired Intelligence and Shanghai Institute of Artificial Intelligence Algorithms, Fudan University, Shanghai, China.,Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, Shanghai, China
| | - Zihan Zhang
- School of Computer Science and Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai, China.,Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, Shanghai, China
| | - Suyang Dai
- School of Computer Science and Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai, China.,Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, Shanghai, China
| | - Shuwei Yao
- School of Computer Science and Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai, China.,Institute of Science and Technology for Brain-Inspired Intelligence and Shanghai Institute of Artificial Intelligence Algorithms, Fudan University, Shanghai, China
| | - Weidong Tian
- State Key Laboratory of Genetic Engineering and Collaborative Innovation Center for Genetics and Development, Department of Biostatistics and Computational Biology, School of Life Sciences, Fudan University, Shanghai, Shanghai, China.,Department of Pediatrics, Brain Tumor Center, Division of Experimental Hematology and Cancer Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
| | - Renzhi Cao
- Department of Computer Science, Pacific Lutheran University, Tacoma, WA, USA
| | - Caleb Chandler
- Department of Computer Science, Pacific Lutheran University, Tacoma, WA, USA
| | - Miguel Amezola
- Department of Computer Science, Pacific Lutheran University, Tacoma, WA, USA
| | - Devon Johnson
- Department of Computer Science, Pacific Lutheran University, Tacoma, WA, USA
| | - Jia-Ming Chang
- Department of Computer Science, National Chengchi University, Taipei, Taiwan
| | - Wen-Hung Liao
- Department of Computer Science, National Chengchi University, Taipei, Taiwan
| | - Yi-Wei Liu
- Department of Computer Science, National Chengchi University, Taipei, Taiwan
| | | | | | - Robert Hoehndorf
- Computer, Electrical and Mathematical Sciences & Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Jeddah, Saudi Arabia
| | - Maxat Kulmanov
- Computer, Electrical and Mathematical Sciences & Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Jeddah, Saudi Arabia
| | - Imane Boudellioua
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, Thuwal, Saudi Arabia.,Computer, Electrical and Mathematical Sciences Engineering Division (CEMSE), King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
| | - Gianfranco Politano
- Control and Computer Engineering Department, Politecnico di Torino, Torino, TO, Italy
| | - Stefano Di Carlo
- Control and Computer Engineering Department, Politecnico di Torino, Torino, TO, Italy
| | - Alfredo Benso
- Control and Computer Engineering Department, Politecnico di Torino, Torino, TO, Italy
| | - Kai Hakala
- Department of Future Technologies, Turku NLP Group, University of Turku, Turku, Finland.,University of Turku Graduate School (UTUGS), Turku, Finland
| | - Filip Ginter
- Department of Future Technologies, Turku NLP Group, University of Turku, Turku, Finland.,University of Turku, Turku, Finland
| | - Farrokh Mehryary
- Department of Future Technologies, Turku NLP Group, University of Turku, Turku, Finland.,University of Turku Graduate School (UTUGS), Turku, Finland
| | - Suwisa Kaewphan
- Department of Future Technologies, Turku NLP Group, University of Turku, Turku, Finland.,University of Turku Graduate School (UTUGS), Turku, Finland.,Turku Centre for Computer Science (TUCS), Turku, Finland
| | - Jari Björne
- Department of Future Technologies, Faculty of Science and Engineering, University of Turku, Turku, FI-20014, Finland.,Turku Centre for Computer Science (TUCS), Agora, Vesilinnantie 3, Turku, FI-20500, Finland
| | | | | | - Tapio Salakoski
- Department of Future Technologies, Faculty of Science and Engineering, University of Turku, Turku, FI-20014, Finland.,Turku Centre for Computer Science (TUCS), Agora, Vesilinnantie 3, Turku, FI-20500, Finland
| | - Daisuke Kihara
- Department of Biological Sciences, Department of Computer Science, Purdue University, 47907, IN, USA.,Department of Pediatrics, University of Cincinnati, Cincinnati, 45229, OH, USA
| | - Aashish Jain
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
| | - Tomislav Šmuc
- Division of Electronics, Rudjer Boskovic Institute, Zagreb, Croatia
| | - Adrian Altenhoff
- Department of Computer Science, ETH Zurich, Zurich, Switzerland.,SIB Swiss Institute of Bioinformatics, Zurich, Switzerland
| | - Asa Ben-Hur
- Department of Computer Science, Colorado State University, Fort Collins, CO, USA
| | - Burkhard Rost
- Department of Informatics, Bioinformatics & Computational Biology-i12, Technische Universität München, Munich, Germany.,Institute for Food and Plant Sciences WZW, Technische Universität München, Freising, Germany
| | | | - Christine A Orengo
- Research Department of Structural and Molecular Biology, University College London, London, United Kingdom
| | - Constance J Jeffery
- Biological Sciences, University of Illinois at Chicago, Chicago, Illinois, USA
| | - Giovanni Bosco
- Department of Molecular and Systems Biology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
| | - Deborah A Hogan
- Geisel School of Medicine at Dartmouth, Hanover, NH, USA.,Department of Microbiology and Immunology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
| | - Maria J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, United Kingdom
| | - Claire O'Donovan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, United Kingdom
| | - Sean D Mooney
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, USA
| | - Casey S Greene
- Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA.,Childhood Cancer Data Lab, Alex's Lemonade Stand Foundation, Philadelphia, Pennsylvania, USA
| | - Predrag Radivojac
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
| | - Iddo Friedberg
- Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA, USA
|
45
|
Davey NE, Babu MM, Blackledge M, Bridge A, Capella-Gutierrez S, Dosztanyi Z, Drysdale R, Edwards RJ, Elofsson A, Felli IC, Gibson TJ, Gutmanas A, Hancock JM, Harrow J, Higgins D, Jeffries CM, Le Mercier P, Mészáros B, Necci M, Notredame C, Orchard S, Ouzounis CA, Pancsa R, Papaleo E, Pierattelli R, Piovesan D, Promponas VJ, Ruch P, Rustici G, Romero P, Sarntivijai S, Saunders G, Schuler B, Sharan M, Shields DC, Sussman JL, Tedds JA, Tompa P, Turewicz M, Vondrasek J, Vranken WF, Wallace BA, Wichapong K, Tosatto SCE. An intrinsically disordered proteins community for ELIXIR. F1000Res 2019; 8. [PMID: 31824649 PMCID: PMC6880265 DOI: 10.12688/f1000research.20136.1] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Accepted: 09/18/2019] [Indexed: 01/20/2023]
Abstract
Intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs) are now recognised as major determinants in cellular regulation. This white paper presents a roadmap for future e-infrastructure developments in the field of IDP research within the ELIXIR framework. The goal of these developments is to drive the creation of high-quality tools and resources to support the identification, analysis and functional characterisation of IDPs. The roadmap is the result of a workshop titled “An intrinsically disordered protein user community proposal for ELIXIR” held at the University of Padua. The workshop, and further consultation with the members of the wider IDP community, identified the key priority areas for the roadmap including the development of standards for data annotation, storage and dissemination; integration of IDP data into the ELIXIR Core Data Resources; and the creation of benchmarking criteria for IDP-related software. Here, we discuss these areas of priority, how they can be implemented in cooperation with the ELIXIR platforms, and their connections to existing ELIXIR Communities and international consortia. The article provides a preliminary blueprint for an IDP Community in ELIXIR and is an appeal to identify and involve new stakeholders.
Affiliation(s)
- Norman E Davey
- Division of Cancer Biology, Institute of Cancer Research, London, SW3 6JB, UK
| | - M Madan Babu
- MRC Laboratory of Molecular Biology, Cambridge, CB2 0QH, UK
| | - Martin Blackledge
- Institut de Biologie Structurale, Université Grenoble Alpes, Grenoble, 38000, France
| | - Alan Bridge
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
| | | | - Zsuzsanna Dosztanyi
- Department of Biochemistry, Eötvös Loránd University, Budapest, H-1117, Hungary
| | | | - Richard J Edwards
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia
| | - Arne Elofsson
- Department of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Stockholm, Sweden
| | - Isabella C Felli
- Department of Chemistry and CERM "Ugo Schiff", University of Florence, Florence, Italy
| | - Toby J Gibson
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Aleksandras Gutmanas
- Protein Data Bank in Europe, European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Cambridge, CB10 1SD, UK
| | - John M Hancock
- ELIXIR Hub, Wellcome Genome Campus, Cambridge, CB10 1SD, UK
| | - Jen Harrow
- ELIXIR Hub, Wellcome Genome Campus, Cambridge, CB10 1SD, UK
| | - Desmond Higgins
- Conway Institute of Biomolecular & Biomedical Research, University College Dublin, Belfield, Dublin, D4, Ireland
| | - Cy M Jeffries
- European Molecular Biology Laboratory, Hamburg, Germany
| | - Philippe Le Mercier
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
| | - Balint Mészáros
- Department of Biochemistry, Eötvös Loránd University, Budapest, H-1117, Hungary
| | - Marco Necci
- Department of Biomedical Sciences, University of Padua, Padua, Italy
| | - Cedric Notredame
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, 08003, Spain.,Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Sandra Orchard
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Cambridge, CB10 1SD, UK
| | - Christos A Ouzounis
- BCPL-CPERI, Centre for Research & Technology Hellas (CERTH), Thessalonica, 57001, Greece
| | - Rita Pancsa
- Institute of Enzymology, Research Centre for Natural Sciences of the Hungarian Academy of Sciences, Budapest, H-1117, Hungary
| | - Elena Papaleo
- Computational Biology Laboratory, Danish Cancer Society Research Center, Copenhagen, 2100, Denmark
| | - Roberta Pierattelli
- Department of Chemistry and CERM "Ugo Schiff", University of Florence, Florence, Italy
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padua, Padua, Italy
| | - Vasilis J Promponas
- Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia, CY-1678, Cyprus
| | - Patrick Ruch
- HES-SO/HEG and SIB Text Mining, Swiss Institute of Bioinformatics, Geneva, Switzerland
| | - Gabriella Rustici
- Department of Genetics, University of Cambridge, Cambridge, CB2 3EH, UK
| | - Pedro Romero
- University of Wisconsin-Madison, Madison, WI, 53706-1544, USA
| | | | - Gary Saunders
- ELIXIR Hub, Wellcome Genome Campus, Cambridge, CB10 1SD, UK
| | - Benjamin Schuler
- Department of Biochemistry, University of Zurich, Zurich, Switzerland
| | - Malvika Sharan
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Denis C Shields
- Conway Institute of Biomolecular & Biomedical Research, University College Dublin, Belfield, Dublin, D4, Ireland
| | - Joel L Sussman
- Department of Structural Biology and the Israel Structural Proteomics Center (ISPC), Weizmann Institute of Science, Reḥovot, 7610001, Israel
| | | | - Peter Tompa
- VIB Center for Structural Biology (CSB), VIB Flemish Institute for Biotechnology, Brussels, 1050, Belgium
| | - Michael Turewicz
- Faculty of Medicine, Medizinisches Proteom-Center, Ruhr University Bochum, GesundheitsCampus 4, Bochum, 44801, Germany
| | - Jiri Vondrasek
- Institute of Organic Chemistry and Biochemistry, CAS, Prague, Czech Republic
| | - Wim F Vranken
- VUB/ULB Interuniversity Institute of Bioinformatics in Brussels and Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, B-1050, Belgium
| | - Bonnie Ann Wallace
- Institute of Structural and Molecular Biology, Birkbeck College, University of London, London, WC1H 0HA, UK
| | - Kanin Wichapong
- Department of Biochemistry, Cardiovascular Research Institute Maastricht (CARIM), Maastricht University, Maastricht, The Netherlands
|
46
|
Piovesan D, Tabaro F, Paladin L, Necci M, Micetic I, Camilloni C, Davey N, Dosztányi Z, Mészáros B, Monzon AM, Parisi G, Schad E, Sormanni P, Tompa P, Vendruscolo M, Vranken WF, Tosatto SCE. MobiDB 3.0: more annotations for intrinsic disorder, conformational diversity and interactions in proteins. Nucleic Acids Res 2018; 46:D471-D476. [PMID: 29136219 PMCID: PMC5753340 DOI: 10.1093/nar/gkx1071] [Citation(s) in RCA: 156] [Impact Index Per Article: 31.2] [Received: 09/22/2017] [Accepted: 10/19/2017] [Indexed: 01/30/2023]
Abstract
The MobiDB (URL: mobidb.bio.unipd.it) database of protein disorder and mobility annotations has been significantly updated and upgraded since its last major renewal in 2014. Several curated datasets for intrinsic disorder and folding upon binding have been integrated from specialized databases. The indirect evidence has also been expanded to better capture information available in the PDB, such as high temperature residues in X-ray structures and overall conformational diversity. Novel nuclear magnetic resonance chemical shift data provides an additional experimental information layer on conformational dynamics. Predictions have been expanded to provide new types of annotation on backbone rigidity, secondary structure preference and disordered binding regions. MobiDB 3.0 contains information for the complete UniProt protein set and synchronization has been improved by covering all UniParc sequences. An advanced search function allows the creation of a wide array of custom-made datasets for download and further analysis. A large amount of information and cross-links to more specialized databases are intended to make MobiDB the central resource for the scientific community working on protein intrinsic disorder and mobility.
Affiliation(s)
- Damiano Piovesan
- Department of Biomedical Sciences, University of Padua, via U. Bassi 58/b, 35131 Padua, Italy
| | - Francesco Tabaro
- Department of Biomedical Sciences, University of Padua, via U. Bassi 58/b, 35131 Padua, Italy.,Institute of Biosciences and Medical Technology, Arvo Ylpön katu 34, 33520 Tampere, Finland
| | - Lisanna Paladin
- Department of Biomedical Sciences, University of Padua, via U. Bassi 58/b, 35131 Padua, Italy
| | - Marco Necci
- Department of Biomedical Sciences, University of Padua, via U. Bassi 58/b, 35131 Padua, Italy.,Department of Agricultural Sciences, University of Udine, via Palladio 8, 33100 Udine, Italy.,Fondazione Edmund Mach, Via E. Mach 1, 38010 S. Michele all'Adige, Italy
| | - Ivan Micetic
- Department of Biomedical Sciences, University of Padua, via U. Bassi 58/b, 35131 Padua, Italy
| | - Carlo Camilloni
- Department of Biosciences, University of Milan, 20133 Milano, Italy
| | - Norman Davey
- Conway Institute of Biomolecular & Biomedical Research, University College Dublin, Belfield, Dublin 4, Ireland.,UCD School of Medicine & Medical Science, University College Dublin, Belfield, Dublin 4, Ireland
| | - Zsuzsanna Dosztányi
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, 1/c Pázmány Péter sétány, H-1117, Budapest, Hungary
| | - Bálint Mészáros
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, 1/c Pázmány Péter sétány, H-1117, Budapest, Hungary.,Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, PO Box 7, H-1518 Budapest, Hungary
| | - Alexander M Monzon
- Structural Bioinformatics Group, Department of Science and Technology, National University of Quilmes, CONICET, Roque Saenz Pena 182, Bernal B1876BXD, Argentina
| | - Gustavo Parisi
- Structural Bioinformatics Group, Department of Science and Technology, National University of Quilmes, CONICET, Roque Saenz Pena 182, Bernal B1876BXD, Argentina
| | - Eva Schad
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, PO Box 7, H-1518 Budapest, Hungary
| | - Pietro Sormanni
- Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, UK
| | - Peter Tompa
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, PO Box 7, H-1518 Budapest, Hungary.,Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels 1050, Belgium.,VIB-VUB Center for Structural Biology, Flanders Institute for Biotechnology (VIB), Brussels 1050, Belgium
| | | | - Wim F Vranken
- Structural Biology Brussels, Vrije Universiteit Brussel (VUB), Brussels 1050, Belgium.,VIB-VUB Center for Structural Biology, Flanders Institute for Biotechnology (VIB), Brussels 1050, Belgium.,Interuniversity Institute of Bioinformatics in Brussels, ULB/VUB, 1050 Brussels, Belgium
| | - Silvio C E Tosatto
- Department of Biomedical Sciences, University of Padua, via U. Bassi 58/b, 35131 Padua, Italy.,CNR Institute of Neuroscience, via U. Bassi 58/b, 35131 Padua, Italy
|
47
|
Hirsh L, Paladin L, Piovesan D, Tosatto SCE. RepeatsDB-lite: a web server for unit annotation of tandem repeat proteins. Nucleic Acids Res 2018; 46:W402-W407. [PMID: 29746699 PMCID: PMC6031040 DOI: 10.1093/nar/gky360] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 03/06/2018] [Accepted: 04/24/2018] [Indexed: 11/15/2022]
Abstract
RepeatsDB-lite (http://protein.bio.unipd.it/repeatsdb-lite) is a web server for the prediction of repetitive structural elements and units in tandem repeat (TR) proteins. TRs are a widespread but poorly annotated class of non-globular proteins carrying heterogeneous functions. RepeatsDB-lite extends the prediction to all TR types and strongly improves performance over previous methods in both computational time and accuracy, with precision above 95% for solenoid structures. The algorithm exploits an improved TR unit library derived from the RepeatsDB database to perform an iterative structural search and assignment. The web interface provides tools for analyzing the evolutionary relationships between units and for manually refining the prediction by changing unit positions and protein classification. An all-against-all structure-based sequence similarity matrix is calculated and visualized in real time for every user edit. Reviewed predictions can be submitted to RepeatsDB for review and inclusion.
Affiliation(s)
- Layla Hirsh
- Dept. of Biomedical Sciences, University of Padua, Padua, Italy.,Dept. of Engineering, Pontificia Universidad Católica del Perú, Lima, Perú
| | - Lisanna Paladin
- Dept. of Biomedical Sciences, University of Padua, Padua, Italy
| | | | - Silvio C E Tosatto
- Dept. of Biomedical Sciences, University of Padua, Padua, Italy.,CNR Institute of Neurosciences, Padua, Italy
|
48
|
Paladin L, Piovesan D, Tosatto SCE. SODA: prediction of protein solubility from disorder and aggregation propensity. Nucleic Acids Res 2017; 45:W236-W240. [PMID: 28505312 PMCID: PMC7059794 DOI: 10.1093/nar/gkx412] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 03/07/2017] [Accepted: 05/09/2017] [Indexed: 01/08/2023]
Abstract
Solubility is an important, albeit not well understood, feature determining protein behavior. It is of paramount importance in protein engineering, where similar folded proteins may behave in very different ways in solution. Here we present SODA, a novel method to predict the changes of protein solubility based on several physico-chemical properties of the protein. SODA uses the propensity of the protein sequence to aggregate as well as intrinsic disorder, plus hydrophobicity and secondary structure preferences to estimate changes in solubility. It has been trained and benchmarked on two different datasets. The comparison to other recently published methods shows that SODA has state-of-the-art performance and is particularly well suited to predict mutations decreasing solubility. The method is fast, returning results for single mutations in seconds. A usage example estimating the full repertoire of mutations for a human germline antibody highlights several solubility hotspots on the surface. The web server, complete with RESTful interface and extensive help, can be accessed from URL: http://protein.bio.unipd.it/soda.
Affiliation(s)
- Lisanna Paladin
- Department of Biomedical Sciences, University of Padua, Viale G. Colombo 3, 35121 Padova, Italy
| | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padua, Viale G. Colombo 3, 35121 Padova, Italy
| | - Silvio C E Tosatto
- Department of Biomedical Sciences, University of Padua, Viale G. Colombo 3, 35121 Padova, Italy.,CNR Institute of Neuroscience, Viale G. Colombo 3, 35121 Padova, Italy
|
49
|
Seitz L, Rieger A, Berry W, Ashok D, Direnzo D, Jin L, Lee S, Park A, Piovesan D, Tan J, Walters M, Karakunnel J. Preliminary results from a phase 1 study of AB122, a programmed cell death-1 (PD-1) inhibitor, in patients with advanced solid malignancies. Ann Oncol 2018. [DOI: 10.1093/annonc/mdy487.008] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/12/2022]
|
50
|
Piovesan D, Tosatto SCE. Mobi 2.0: an improved method to define intrinsic disorder, mobility and linear binding regions in protein structures. Bioinformatics 2018; 34:122-123. [PMID: 28968795 DOI: 10.1093/bioinformatics/btx592] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 06/14/2017] [Accepted: 09/15/2017] [Indexed: 11/14/2022]
Abstract
Motivation: The structures contained in the Protein Data Bank (PDB) database are of paramount importance to define our knowledge of folded proteins. While providing mainly circumstantial evidence, PDB data is also increasingly used to define the lack of unique structure, represented by mobile regions and even intrinsic disorder (ID). However, alternative definitions are used by different authors and potentially limit the generality of the analyses being carried out.
Results: Here we present Mobi 2.0, a completely re-written version of the Mobi software for the determination of mobile and potentially disordered regions from PDB structures. Mobi 2.0 provides robust definitions of mobility based on four main sources of information: (i) missing residues, (ii) residues with high temperature factors, (iii) mobility between different models of the same structure and (iv) binding to another protein or nucleotide chain. Mobi 2.0 is well suited to aggregate information across different PDB structures for the same UniProt protein sequence, providing consensus annotations. The software is expected to standardize the treatment of mobility, allowing an easier comparison across different studies related to ID.
Availability: Mobi 2.0 provides the structure-based annotation for the MobiDB database. The software is available from URL http://protein.bio.unipd.it/mobi2/.
Contact: silvio.tosatto@unipd.it.
Affiliation(s)
- Damiano Piovesan
- Department of Biomedical Sciences, University of Padua, Viale G. Colombo 3, Padova 35121, Italy
| | - Silvio C E Tosatto
- Department of Biomedical Sciences, University of Padua, Viale G. Colombo 3, Padova 35121, Italy.,CNR Institute of Neuroscience, Viale G. Colombo 3, Padova 35121, Italy
|