1
|
Selby P, Abbeloos R, Backlund JE, Basterrechea Salido M, Bauchet G, Benites-Alfaro OE, Birkett C, Calaminos VC, Carceller P, Cornut G, Vasques Costa B, Edwards JD, Finkers R, Yanxin Gao S, Ghaffar M, Glaser P, Guignon V, Hok P, Kilian A, König P, Lagare JEB, Lange M, Laporte MA, Larmande P, LeBauer DS, Lyon DA, Marshall DS, Matthews D, Milne I, Mistry N, Morales N, Mueller LA, Neveu P, Papoutsoglou E, Pearce B, Perez-Masias I, Pommier C, Ramírez-González RH, Rathore A, Raquel AM, Raubach S, Rife T, Robbins K, Rouard M, Sarma C, Scholz U, Sempéré G, Shaw PD, Simon R, Soldevilla N, Stephen G, Sun Q, Tovar C, Uszynski G, Verouden M. BrAPI-an application programming interface for plant breeding applications. Bioinformatics 2019; 35:4147-4155. [PMID: 30903186 PMCID: PMC6792114 DOI: 10.1093/bioinformatics/btz190] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Received: 08/14/2018] [Revised: 11/23/2018] [Accepted: 03/20/2019] [Indexed: 12/04/2022] Open
Abstract
Motivation: Modern genomic breeding methods rely heavily on very large amounts of phenotyping and genotyping data, presenting new challenges in effective data management and integration. Recently, the size and complexity of datasets have increased significantly, with the result that data are often stored on multiple systems. As analyses of interest increasingly require aggregation of datasets from diverse sources, data exchange between disparate systems becomes a challenge. Results: To facilitate interoperability among breeding applications, we present the public plant Breeding Application Programming Interface (BrAPI). BrAPI is a standardized web service API specification. The development of BrAPI is a collaborative, community-based initiative involving a growing global community of over a hundred participants representing several dozen institutions and companies. Development of such a standard is recognized as critical to a number of important large breeding system initiatives as a foundational technology. The focus of the first version of the API is on providing services for connecting systems and retrieving basic breeding data including germplasm, study, observation, and marker data. A number of BrAPI-enabled applications, termed BrAPPs, have been written that take advantage of the emerging support of BrAPI by many databases. Availability and implementation: More information on BrAPI, including links to the specification, test suites, BrAPPs, and sample implementations, is available at https://brapi.org/. The BrAPI specification and the developer tools are provided as free and open source.
Affiliation(s)
- Peter Selby
- Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, New York, USA
- Omar E Benites-Alfaro
- International Potato Center (CIP), Lima, Peru; International Food Policy Research Institute (IFPRI), Washington DC, USA
- Viana C Calaminos
- International Rice Research Institute (IRRI), Los Baños, Laguna, The Philippines
- Pierre Carceller
- AGAP, Univ Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
- Richard Finkers
- Department of Plant Breeding, Wageningen University & Research, Wageningen, The Netherlands
- Star Yanxin Gao
- Institute of Biotechnology, Cornell University, Ithaca, New York, USA
- Mehmood Ghaffar
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- Philip Glaser
- Institute of Biotechnology, Cornell University, Ithaca, New York, USA
- Puthick Hok
- Diversity Arrays Technology, Bruce, Australia
- Patrick König
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- Matthias Lange
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- David S LeBauer
- College of Agricultural and Life Sciences, The University of Arizona, Tucson, AZ, USA
- David S Marshall
- Information & Computational Sciences, The James Hutton Institute, Dundee, UK; SRUC, Edinburgh, UK
- Iain Milne
- Information & Computational Sciences, The James Hutton Institute, Dundee, UK
- Pascal Neveu
- MISTEA, INRA, Montpellier SupAgro, Universite de Montpellier, Montpellier, France
- Evangelia Papoutsoglou
- Department of Plant Breeding, Wageningen University & Research, Wageningen, The Netherlands
- Cyril Pommier
- URGI, INRA, Université Paris-Saclay, Versailles, France
- Abhishek Rathore
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
- Angel Manica Raquel
- International Rice Research Institute (IRRI), Los Baños, Laguna, The Philippines
- Sebastian Raubach
- Information & Computational Sciences, The James Hutton Institute, Dundee, UK
- Trevor Rife
- Department of Plant Pathology, Kansas State University, Manhattan, KS, USA
- Kelly Robbins
- Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, New York, USA
- Chaitanya Sarma
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
- Uwe Scholz
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- Guilhem Sempéré
- AGAP, Univ Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France; INTERTRYP, Univ Montpellier, CIRAD, IRD, Montpellier, France
- Paul D Shaw
- Information & Computational Sciences, The James Hutton Institute, Dundee, UK
- Nahuel Soldevilla
- Integrated Breeding Program (IBP), CIMMYT, Texcoco, Mexico; LeafNode Technology, Buenos Aires, Argentina
- Gordon Stephen
- Information & Computational Sciences, The James Hutton Institute, Dundee, UK
- Qi Sun
- Institute of Biotechnology, Cornell University, Ithaca, New York, USA
- Clarysabel Tovar
- Integrated Breeding Program (IBP), CIMMYT, Texcoco, Mexico; LeafNode Technology, Buenos Aires, Argentina
- Maikel Verouden
- Wageningen University & Research, Biometris, Wageningen PB, The Netherlands
|
2
|
Bagnacani A, Wolfien M, Wolkenhauer O. Tools for Understanding miRNA-mRNA Interactions for Reproducible RNA Analysis. Methods Mol Biol 2019; 1912:199-214. [PMID: 30635895 DOI: 10.1007/978-1-4939-8982-9_8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 06/09/2023]
Abstract
MicroRNAs (miRNAs) are an integral part of gene regulation at the post-transcriptional level. The use of RNA data in gene expression analysis has become increasingly important to gain insights into the regulatory mechanisms behind miRNA-mRNA interactions. As a result, we are confronted with a growing landscape of tools, while standards for reproducibility and benchmarking lag behind. This work identifies the challenges for reproducible RNA analysis, and highlights best practices on the processing and dissemination of scientific results. We found that the success of a tool does not solely depend on its performance: equally important is how a tool is received, and then supported within a community. This leads us to a detailed presentation of the RNA workbench, a community effort for sharing workflows and processing tools, built on top of the Galaxy framework. Here, we follow the community guidelines to extend its portfolio of RNA tools with the integration of the TriplexRNA (https://triplexrna.org). Our findings provide the basis for the development of a recommendation system, to guide users in the choice of tools and workflows.
Affiliation(s)
- Andrea Bagnacani
- Department of Systems Biology and Bioinformatics, Institute of Computer Science, University of Rostock, Rostock, Germany
- Markus Wolfien
- Department of Systems Biology and Bioinformatics, Institute of Computer Science, University of Rostock, Rostock, Germany
- Olaf Wolkenhauer
- Department of Systems Biology and Bioinformatics, Institute of Computer Science, University of Rostock, Rostock, Germany
- Stellenbosch Institute for Advanced Study (STIAS), Wallenberg Research Centre, Stellenbosch University, Stellenbosch, South Africa
|
3
|
Banegas-Luna AJ, Imbernón B, Llanes Castro A, Pérez-Garrido A, Cerón-Carrasco JP, Gesing S, Merelli I, D'Agostino D, Pérez-Sánchez H. Advances in distributed computing with modern drug discovery. Expert Opin Drug Discov 2018; 14:9-22. [PMID: 30484337 DOI: 10.1080/17460441.2019.1552936] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/17/2022]
Abstract
Introduction: Computational chemistry dramatically accelerates the drug discovery process and high-performance computing (HPC) can be used to speed up the most expensive calculations. Supporting a local HPC infrastructure is both costly and time-consuming, and, therefore, many research groups are moving from in-house solutions to remote-distributed computing platforms. Areas covered: The authors focus on the use of distributed technologies, solutions, and infrastructures to gain access to HPC capabilities, software tools, and datasets to run the complex simulations required in computational drug discovery (CDD). Expert opinion: The use of computational tools can decrease the time to market of new drugs. HPC has a crucial role in handling the complex algorithms and large volumes of data required to achieve specificity and avoid undesirable side-effects. Distributed computing environments have clear advantages over in-house solutions in terms of cost and sustainability. The use of infrastructures relying on virtualization reduces set-up costs. Distributed computing resources can be difficult to access, although web-based solutions are becoming increasingly available. There is a trade-off between cost-effectiveness and accessibility in using on-demand computing resources rather than free/academic resources. Graphics processing unit computing, with its outstanding parallel computing power, is becoming increasingly important.
Affiliation(s)
- Antonio Jesús Banegas-Luna
- Bioinformatics and High Performance Computing Research Group (BIO-HPC), Universidad Católica de Murcia (UCAM), Murcia, Spain
- Baldomero Imbernón
- Bioinformatics and High Performance Computing Research Group (BIO-HPC), Universidad Católica de Murcia (UCAM), Murcia, Spain
- Antonio Llanes Castro
- Bioinformatics and High Performance Computing Research Group (BIO-HPC), Universidad Católica de Murcia (UCAM), Murcia, Spain
- Alfonso Pérez-Garrido
- Bioinformatics and High Performance Computing Research Group (BIO-HPC), Universidad Católica de Murcia (UCAM), Murcia, Spain
- José Pedro Cerón-Carrasco
- Bioinformatics and High Performance Computing Research Group (BIO-HPC), Universidad Católica de Murcia (UCAM), Murcia, Spain
- Sandra Gesing
- Center for Research Computing, University of Notre Dame, Notre Dame, IN, USA
- Ivan Merelli
- Institute for Biomedical Technologies, National Research Council of Italy, Segrate (Milan), Italy
- Daniele D'Agostino
- Institute for Applied Mathematics and Information Technologies "E. Magenes", National Research Council of Italy, Genoa, Italy
- Horacio Pérez-Sánchez
- Bioinformatics and High Performance Computing Research Group (BIO-HPC), Universidad Católica de Murcia (UCAM), Murcia, Spain
|
4
|
Urdidiales‐Nieto D, Navas‐Delgado I, Aldana‐Montes JF. Biological Web Service Repositories Review. Mol Inform 2017; 36:1600035. [PMID: 27783459 PMCID: PMC5434852 DOI: 10.1002/minf.201600035] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/07/2016] [Accepted: 09/27/2016] [Indexed: 12/26/2022]
Abstract
Web services play a key role in bioinformatics enabling the integration of database access and analysis of algorithms. However, Web service repositories do not usually publish information on the changes made to their registered Web services. Dynamism is directly related to the changes in the repositories (services registered or unregistered) and at service level (annotation changes). Thus, users, software clients or workflow based approaches lack enough relevant information to decide when they should review or re-execute a Web service or workflow to get updated or improved results. The dynamism of the repository could be a measure for workflow developers to re-check service availability and annotation changes in the services of interest to them. This paper presents a review on the most well-known Web service repositories in the life sciences including an analysis of their dynamism. Freshness is introduced in this paper, and has been used as the measure for the dynamism of these repositories.
Affiliation(s)
- David Urdidiales‐Nieto
- Department of Computer Languages and Computing Science, Higher Technical School of Computer Science Engineering, University of Malaga, Malaga 29071, Spain
- Ismael Navas‐Delgado
- Department of Computer Languages and Computing Science, Higher Technical School of Computer Science Engineering, University of Malaga, Malaga 29071, Spain
- José F. Aldana‐Montes
- Department of Computer Languages and Computing Science, Higher Technical School of Computer Science Engineering, University of Malaga, Malaga 29071, Spain
|
5
|
Guardia GD, Ferreira Pires L, da Silva EG, de Farias CR. SemanticSCo: A platform to support the semantic composition of services for gene expression analysis. J Biomed Inform 2017; 66:116-128. [DOI: 10.1016/j.jbi.2016.12.014] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/12/2016] [Revised: 11/27/2016] [Accepted: 12/31/2016] [Indexed: 10/20/2022]
|
6
|
|
7
|
Ison J, Rapacki K, Ménager H, Kalaš M, Rydza E, Chmura P, Anthon C, Beard N, Berka K, Bolser D, Booth T, Bretaudeau A, Brezovsky J, Casadio R, Cesareni G, Coppens F, Cornell M, Cuccuru G, Davidsen K, Vedova GD, Dogan T, Doppelt-Azeroual O, Emery L, Gasteiger E, Gatter T, Goldberg T, Grosjean M, Grüning B, Helmer-Citterich M, Ienasescu H, Ioannidis V, Jespersen MC, Jimenez R, Juty N, Juvan P, Koch M, Laibe C, Li JW, Licata L, Mareuil F, Mičetić I, Friborg RM, Moretti S, Morris C, Möller S, Nenadic A, Peterson H, Profiti G, Rice P, Romano P, Roncaglia P, Saidi R, Schafferhans A, Schwämmle V, Smith C, Sperotto MM, Stockinger H, Vařeková RS, Tosatto SCE, de la Torre V, Uva P, Via A, Yachdav G, Zambelli F, Vriend G, Rost B, Parkinson H, Løngreen P, Brunak S. Tools and data services registry: a community effort to document bioinformatics resources. Nucleic Acids Res 2015; 44:D38-47. [PMID: 26538599 PMCID: PMC4702812 DOI: 10.1093/nar/gkv1116] [Citation(s) in RCA: 94] [Impact Index Per Article: 10.4] [Received: 09/17/2015] [Accepted: 10/13/2015] [Indexed: 01/24/2023] Open
Abstract
Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is that scientists must often struggle to find, understand, compare and use the best resources for the task at hand. Here we present a community-driven curation effort, supported by ELIXIR—the European infrastructure for biological information—that aspires to a comprehensive and consistent registry of information about bioinformatics resources. The sustainable upkeep of this Tools and Data Services Registry is assured by a curation effort driven by and tailored to local needs, and shared amongst a network of engaged partners. As of November 2015, the registry includes 1785 resources, with depositions from 126 individual registrations including 52 institutional providers and 74 individuals. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools.
Affiliation(s)
- Jon Ison
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Denmark
- Kristoffer Rapacki
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Denmark
- Hervé Ménager
- Centre d'Informatique pour la Biologie, C3BI, Institut Pasteur, France
- Matúš Kalaš
- Computational Biology Unit, Department of Informatics, University of Bergen, Norway
- Emil Rydza
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Denmark
- Piotr Chmura
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Denmark
- Christian Anthon
- Department of Veterinary Clinical and Animal Sciences, Faculty for Health and Medical Sciences, University of Copenhagen, Denmark
- Niall Beard
- School of Computer Science, University of Manchester, UK
- Karel Berka
- Department of Physical Chemistry, RCPTM, Faculty of Science, Palacky University, Czech Republic
- Dan Bolser
- The European Bioinformatics Institute (EMBL-EBI), UK
- Tim Booth
- NEBC Wallingford, Centre for Ecology and Hydrology, UK
- Anthony Bretaudeau
- INRA, UMR Institut de Génétique, Environnement et Protection des Plantes (IGEPP), BioInformatics Platform for Agroecosystems Arthropods (BIPAA), France; INRIA, IRISA, GenOuest Core Facility, France
- Jan Brezovsky
- Loschmidt Laboratories, Department of Experimental Biology and Research Centre for Toxic Compounds in the Environment RECETOX, Masaryk University, Czech Republic
- Rita Casadio
- Bologna Biocomputing Group, University of Bologna, Italy
- Frederik Coppens
- Department of Plant Systems Biology, VIB, Belgium; Department of Plant Biotechnology and Bioinformatics, Ghent University, Belgium
- Kristian Davidsen
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Denmark
- Tunca Dogan
- UniProt, European Bioinformatics Institute (EMBL-EBI), UK
- Laura Emery
- The European Bioinformatics Institute (EMBL-EBI), UK
- Thomas Gatter
- Faculty of Technology and Center for Biotechnology, Universität Bielefeld, Germany
- Marie Grosjean
- Institut Français de Bioinformatique (French Institute of Bioinformatics), CNRS, UMS3601, France
- Björn Grüning
- Albert-Ludwigs-Universität Freiburg, Fahnenbergplatz, 79085 Freiburg
- Hans Ienasescu
- Bioinformatics Centre, Department of Biology, University of Copenhagen, Denmark
- Martin Closter Jespersen
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Denmark
- Nick Juty
- The European Bioinformatics Institute (EMBL-EBI), UK
- Peter Juvan
- Centre for Functional Genomics and Biochips, Faculty of Medicine, University of Ljubljana, Slovenia
- Camille Laibe
- The European Bioinformatics Institute (EMBL-EBI), UK
- Jing-Woei Li
- Faculty of Medicine, The Chinese University of Hong Kong, China; Hong Kong Bioinformatics Centre, School of Life Sciences, The Chinese University of Hong Kong, China
- Luana Licata
- Dept. of Biology, University of Rome Tor Vergata, Italy
- Fabien Mareuil
- Centre d'Informatique pour la Biologie, C3BI, Institut Pasteur, France
- Ivan Mičetić
- Department of Biomedical Sciences, University of Padua, Italy
- Sebastien Moretti
- SIB Swiss Institute of Bioinformatics, Switzerland; Department of Ecology and Evolution, Biophore, Evolutionary Bioinformatics group, University of Lausanne, Switzerland
- Steffen Möller
- Department of Dermatology, University of Lübeck, Germany; Institute for Biostatistics and Informatics in Medicine and Ageing Research, Rostock University Medical Center, Germany
- Hedi Peterson
- Institute of Computer Science, University of Tartu, Estonia
- Peter Rice
- Department of Computing, William Penney Laboratory, Imperial College London, UK
- Rabie Saidi
- UniProt, European Bioinformatics Institute (EMBL-EBI), UK
- Veit Schwämmle
- Protein Research Group, Department for Biochemistry and Molecular Biology, University of Southern Denmark, Denmark
- Maria Maddalena Sperotto
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Denmark
- Victor de la Torre
- National Bioinformatics Institute Unit (INB), Fundacion Centro Nacional de Investigaciones Oncologicas, Spain
- Allegra Via
- Dept. of Physics, Sapienza University, Italy
- Guy Yachdav
- Department of Informatics, Bioinformatics-I12, TUM, Germany
- Federico Zambelli
- Institute of Biomembranes and Bioenergetics, National Research Council (CNR), and Dept. of Biosciences, University of Milano, Italy
- Gert Vriend
- Radboud University Medical Centre, CMBI, Netherlands
- Burkhard Rost
- Department of Informatics, Bioinformatics-I12, TUM, Germany
- Peter Løngreen
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Denmark
- Søren Brunak
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Denmark; Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Denmark
|
8
|
Sfakianaki P, Koumakis L, Sfakianakis S, Iatraki G, Zacharioudakis G, Graf N, Marias K, Tsiknakis M. Semantic biomedical resource discovery: a Natural Language Processing framework. BMC Med Inform Decis Mak 2015; 15:77. [PMID: 26423616 PMCID: PMC4591066 DOI: 10.1186/s12911-015-0200-4] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 03/09/2015] [Accepted: 09/21/2015] [Indexed: 11/10/2022] Open
Abstract
Background: A plethora of publicly available biomedical resources currently exist and are increasing at a fast rate. In parallel, specialized repositories are being developed, indexing numerous clinical and biomedical tools. The main drawback of such repositories is the difficulty in locating appropriate resources for a clinical or biomedical decision task, especially for non-Information Technology expert users. In parallel, although NLP research in the clinical domain has been active since the 1960s, progress in the development of NLP applications has been slow and lags behind progress in the general NLP domain. The aim of the present study is to investigate the use of semantics for biomedical resources annotation with domain specific ontologies and exploit Natural Language Processing methods in empowering the non-Information Technology expert users to efficiently search for biomedical resources using natural language. Methods: A Natural Language Processing engine which can "translate" free text into targeted queries, automatically transforming a clinical research question into a request description that contains only terms of ontologies, has been implemented. The implementation is based on information extraction techniques for text in natural language, guided by integrated ontologies. Furthermore, knowledge from robust text mining methods has been incorporated to map descriptions into suitable domain ontologies in order to ensure that the biomedical resources descriptions are domain oriented and enhance the accuracy of services discovery. The framework is freely available as a web application at http://calchas.ics.forth.gr/. Results: For our experiments, a range of clinical questions were established based on descriptions of clinical trials from the ClinicalTrials.gov registry as well as recommendations from clinicians. Domain experts manually identified the available tools in a tools repository which are suitable for addressing the clinical questions at hand, either individually or as a set of tools forming a computational pipeline. The results were compared with those obtained from an automated discovery of candidate biomedical tools. For the evaluation of the results, precision and recall measurements were used. Our results indicate that the proposed framework has a high precision and low recall, implying that the system returns essentially more relevant results than irrelevant. Conclusions: There are adequate biomedical ontologies already available, sufficiency of existing NLP tools and quality of biomedical annotation systems for the implementation of a biomedical resources discovery framework, based on the semantic annotation of resources and the use of NLP techniques. The results of the present study demonstrate the clinical utility of the application of the proposed framework which aims to bridge the gap between clinical question in natural language and efficient dynamic biomedical resources discovery.
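The precision and recall measurements mentioned in the abstract above can be illustrated with a minimal sketch. The tool names and sets below are hypothetical and are not taken from the study; the function only shows the standard definitions, with an expert-selected set acting as the gold standard and a system-returned set as the automated discovery result.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) of a retrieved set against a gold-standard relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    # Precision: fraction of returned items that are relevant.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # Recall: fraction of relevant items that were returned.
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: tools chosen by domain experts vs. tools returned automatically.
expert_selected = {"tool_a", "tool_b", "tool_c", "tool_d", "tool_e"}
system_returned = {"tool_a", "tool_b", "tool_x"}

precision, recall = precision_recall(system_returned, expert_selected)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case two of the three returned tools are relevant, while only two of the five expert-selected tools are found, which is the "higher precision, lower recall" pattern the abstract describes.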
Affiliation(s)
- Pepi Sfakianaki
- Foundation for Research and Technology Hellas (FORTH), Institute of Computer Science, N. Plastira 100, Vassilika Vouton, Heraklion, Crete, Greece
- Lefteris Koumakis
- Foundation for Research and Technology Hellas (FORTH), Institute of Computer Science, N. Plastira 100, Vassilika Vouton, Heraklion, Crete, Greece
- Stelios Sfakianakis
- Foundation for Research and Technology Hellas (FORTH), Institute of Computer Science, N. Plastira 100, Vassilika Vouton, Heraklion, Crete, Greece
- Galatia Iatraki
- Foundation for Research and Technology Hellas (FORTH), Institute of Computer Science, N. Plastira 100, Vassilika Vouton, Heraklion, Crete, Greece
- Giorgos Zacharioudakis
- Foundation for Research and Technology Hellas (FORTH), Institute of Computer Science, N. Plastira 100, Vassilika Vouton, Heraklion, Crete, Greece
- Norbert Graf
- Paediatric Haematology and Oncology, Saarland University Hospital, Homburg, Germany
- Kostas Marias
- Foundation for Research and Technology Hellas (FORTH), Institute of Computer Science, N. Plastira 100, Vassilika Vouton, Heraklion, Crete, Greece
- Manolis Tsiknakis
- Foundation for Research and Technology Hellas (FORTH), Institute of Computer Science, N. Plastira 100, Vassilika Vouton, Heraklion, Crete, Greece
- Department of Informatics Engineering, Technological Educational Institute, Heraklion, Crete, Greece
|
9
|
Guardia GDA, Pires LF, Vêncio RZN, Malmegrim KCR, de Farias CRG. A Methodology for the Development of RESTful Semantic Web Services for Gene Expression Analysis. PLoS One 2015. [PMID: 26207740 PMCID: PMC4514690 DOI: 10.1371/journal.pone.0134011] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 01/08/2023] Open
Abstract
Gene expression studies are generally performed through multi-step analysis processes, which require the integrated use of a number of analysis tools. In order to facilitate tool/data integration, an increasing number of analysis tools have been developed as or adapted to semantic web services. In recent years, some approaches have been defined for the development and semantic annotation of web services created from legacy software tools, but these approaches still present many limitations. In addition, to the best of our knowledge, no suitable approach has been defined for the functional genomics domain. Therefore, this paper aims at defining an integrated methodology for the implementation of RESTful semantic web services created from gene expression analysis tools and the semantic annotation of such services. We have applied our methodology to the development of a number of services to support the analysis of different types of gene expression data, including microarray and RNA-Seq. All developed services are publicly available in the Gene Expression Analysis Services (GEAS) Repository at http://dcm.ffclrp.usp.br/lssb/geas. Additionally, we have used a number of the developed services to create different integrated analysis scenarios to reproduce parts of two gene expression studies documented in the literature. The first study involves the analysis of one-color microarray data obtained from multiple sclerosis patients and healthy donors. The second study comprises the analysis of RNA-Seq data obtained from melanoma cells to investigate the role of the remodeller BRG1 in the proliferation and morphology of these cells. Our methodology provides concrete guidelines and technical details in order to facilitate the systematic development of semantic web services. Moreover, it encourages the development and reuse of these services for the creation of semantically integrated solutions for gene expression analysis.
Affiliation(s)
- Gabriela D. A. Guardia
- Department of Computer Science and Mathematics, Faculty of Philosophy, Sciences and Letters of Ribeirão Preto (FFCLRP), University of São Paulo (USP), Ribeirão Preto, Brazil
- Luís Ferreira Pires
- Faculty of Electrical Engineering, Mathematics and Computer Science, University of Twente, Enschede, the Netherlands
- Ricardo Z. N. Vêncio
- Department of Computer Science and Mathematics, Faculty of Philosophy, Sciences and Letters of Ribeirão Preto (FFCLRP), University of São Paulo (USP), Ribeirão Preto, Brazil
- Kelen C. R. Malmegrim
- Department of Clinical, Toxicological and Bromatological Analysis, Faculty of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo (USP), Ribeirão Preto, Brazil
- Cléver R. G. de Farias
- Department of Computer Science and Mathematics, Faculty of Philosophy, Sciences and Letters of Ribeirão Preto (FFCLRP), University of São Paulo (USP), Ribeirão Preto, Brazil
|
10
|
Holliday GL, Bairoch A, Bagos PG, Chatonnet A, Craik DJ, Finn RD, Henrissat B, Landsman D, Manning G, Nagano N, O’Donovan C, Pruitt KD, Rawlings ND, Saier M, Sowdhamini R, Spedding M, Srinivasan N, Vriend G, Babbitt PC, Bateman A. Key challenges for the creation and maintenance of specialist protein resources. Proteins 2015; 83:1005-13. [PMID: 25820941 PMCID: PMC4446195 DOI: 10.1002/prot.24803] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 01/12/2015] [Revised: 03/06/2015] [Accepted: 03/20/2015] [Indexed: 11/12/2022]
Abstract
As the volume of data relating to proteins increases, researchers rely more and more on the analysis of published data, thus increasing the importance of good access to these data that vary from the supplemental material of individual articles, all the way to major reference databases with professional staff and long-term funding. Specialist protein resources fill an important middle ground, providing interactive web interfaces to their databases for a focused topic or family of proteins, using specialized approaches that are not feasible in the major reference databases. Many are labors of love, run by a single lab with little or no dedicated funding and there are many challenges to building and maintaining them. This perspective arose from a meeting of several specialist protein resources and major reference databases held at the Wellcome Trust Genome Campus (Cambridge, UK) on August 11 and 12, 2014. During this meeting some common key challenges involved in creating and maintaining such resources were discussed, along with various approaches to address them. In laying out these challenges, we aim to inform users about how these issues impact our resources and illustrate ways in which our working together could enhance their accuracy, currency, and overall value.
Affiliation(s)
- Gemma L Holliday
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, California, 94158
- Amos Bairoch
- SIB Swiss Institute of Bioinformatics, University of Geneva, Geneva, Switzerland
- Pantelis G Bagos
- Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, 35100, Greece
- Arnaud Chatonnet
- INRA, UMR866 Dynamique Musculaire et Métabolisme, Montpellier, F-34000, France
- Université Montpellier, Montpellier, F-34000, France
- David J Craik
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, 4072, Australia
- Robert D Finn
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Bernard Henrissat
- Architecture et Fonction des Macromolécules Biologiques, CNRS, Aix-Marseille Université, Marseille, 13288, France
- Department of Biological Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
- David Landsman
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, 20892
- Gerard Manning
- Department of Bioinformatics & Computational Biology, Genentech, 1 DNA Way, South San Francisco, California, 98010
- Nozomi Nagano
- Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, 135-0064, Japan
- Claire O’Donovan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Kim D Pruitt
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, 20892
- Neil D Rawlings
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Milton Saier
- Department of Molecular Biology, University of California at San Diego, La Jolla, California, 92093
- Ramanathan Sowdhamini
- National Centre for Biological Sciences, TIFR, GKVK Campus, Bellary Road, Bangalore, 560065, India
- Michael Spedding
- Chair NC-IUPHAR, Spedding Research Solutions SARL, 6 Rue Ampere, Le Vesinet, 78110, France
- Gert Vriend
- Centre for Molecular and Biomolecular Informatics (CMBI), Radboud University Medical Center, Geert Grooteplein Zuid 26-28, 6525 GA Nijmegen, The Netherlands
- Patricia C Babbitt
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, California, 94158
- Alex Bateman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
|
11
|
Repchevsky D, Gelpi JL. BioSWR--semantic web services registry for bioinformatics. PLoS One 2014; 9:e107889. [PMID: 25233118 PMCID: PMC4169436 DOI: 10.1371/journal.pone.0107889] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 05/06/2014] [Accepted: 08/21/2014] [Indexed: 11/28/2022] Open
Abstract
Despite the variety of available Web services registries specially aimed at Life Sciences, their scope is usually restricted to a limited set of well-defined types of services. While dedicated registries are generally tied to a particular format, general-purpose ones are more adherent to standards and usually rely on Web Service Definition Language (WSDL). Although WSDL is quite flexible to support common Web services types, its lack of semantic expressiveness led to various initiatives to describe Web services via ontology languages. Nevertheless, WSDL 2.0 descriptions gained a standard representation based on Web Ontology Language (OWL). BioSWR is a novel Web services registry that provides standard Resource Description Framework (RDF) based Web services descriptions along with the traditional WSDL based ones. The registry provides a Web-based interface for Web services registration, querying and annotation, and is also accessible programmatically via a Representational State Transfer (REST) API or via the SPARQL Protocol and RDF Query Language. The BioSWR server is located at http://inb.bsc.es/BioSWR/ and its code is available at https://sourceforge.net/projects/bioswr/ under the LGPL license.
Affiliation(s)
- Dmitry Repchevsky
- Barcelona Supercomputing Center, Life-Sciences Department, National Institute of Bioinformatics, Computational Bioinformatics Node, Barcelona, Spain
- Josep Ll. Gelpi
- Barcelona Supercomputing Center, Life-Sciences Department, National Institute of Bioinformatics, Computational Bioinformatics Node, Barcelona, Spain
- Department of Biochemistry and Molecular Biology, University of Barcelona, Barcelona, Spain
|
12
|
Relationship between genome and epigenome--challenges and requirements for future research. BMC Genomics 2014; 15:487. [PMID: 24942464 PMCID: PMC4073504 DOI: 10.1186/1471-2164-15-487] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 02/20/2014] [Accepted: 05/28/2014] [Indexed: 02/06/2023] Open
Abstract
Understanding the links between genetic, epigenetic and non-genetic factors throughout the lifespan and across generations, and their role in disease susceptibility and disease progression, offers entirely new avenues and solutions to major problems in our society. To overcome the numerous challenges, we have come up with nine major conclusions to set the vision for future policies and research agendas at the European level.
|
13
|
Masseroli M, Mons B, Bongcam-Rudloff E, Ceri S, Kel A, Rechenmann F, Lisacek F, Romano P. Integrated Bio-Search: challenges and trends for the integration, search and comprehensive processing of biological information. BMC Bioinformatics 2014; 15 Suppl 1:S2. [PMID: 24564249 PMCID: PMC4015876 DOI: 10.1186/1471-2105-15-s1-s2] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 01/05/2023] Open
Abstract
Many efforts exist to design and implement approaches and tools for data capture, integration and analysis in the life sciences. Challenges are not only the heterogeneity, size and distribution of information sources, but also the danger of producing too many solutions for the same problem. Methodological, technological, infrastructural and social aspects appear to be essential for the development of a new generation of best practices and tools. In this paper, we analyse and discuss these aspects from different perspectives, by extending some of the ideas that arose during the NETTAB 2012 Workshop, making reference especially to the European context. First, relevance of using data and software models for the management and analysis of biological data is stressed. Second, some of the most relevant community achievements of the recent years, which should be taken as a starting point for future efforts in this research domain, are presented. Third, some of the main outstanding issues, challenges and trends are analysed. The challenges related to the tendency to fund and create large scale international research infrastructures and public-private partnerships in order to address the complex challenges of data intensive science are especially discussed. The needs and opportunities of Genomic Computing (the integration, search and display of genomic information at a very specific level, e.g. at the level of a single DNA region) are then considered. In the current data and network-driven era, social aspects can become crucial bottlenecks. How these may best be tackled to unleash the technical abilities for effective data integration and validation efforts is then discussed. Especially the apparent lack of incentives for already overwhelmed researchers appears to be a limitation for sharing information and knowledge with other scientists. 
We point out as well how the bioinformatics market is growing at an unprecedented speed due to the impact that new powerful in silico analysis promises to have on better diagnosis, prognosis, drug discovery and treatment, towards personalized medicine. An open business model for bioinformatics, which appears to be able to reduce undue duplication of efforts and support the increased reuse of valuable data sets, tools and platforms, is finally discussed.
Affiliation(s)
- Marco Masseroli
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milano, 20133, Italy
- Barend Mons
- Leiden University Medical Center, Leiden, 2333 ZA, The Netherlands
- Netherlands Bioinformatics Center, Nijmegen, 6500 HB, The Netherlands
- Erik Bongcam-Rudloff
- Department of Animal Breeding and Genetics, SLU-Global Bioinformatics Centre, Swedish University of Agricultural Sciences, Uppsala, 75124, Sweden
- Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, 75108, Sweden
- Stefano Ceri
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milano, 20133, Italy
- Alexander Kel
- GeneXplain GmbH, Wolfenbüttel, 38302, Germany
- Institute of Chemical Biology and Fundamental Medicine SBRAS, Novosibirsk, 630090, Russia
- Frederique Lisacek
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, 1211 Geneva 4, Switzerland
- Section of Biology, University of Geneva, 1211 Geneva 4, Switzerland
- Paolo Romano
- Biopolymers and Proteomics, IRCCS AOU San Martino IST, Genoa, 16132, Italy
|
14
|
Ison J, Kalas M, Jonassen I, Bolser D, Uludag M, McWilliam H, Malone J, Lopez R, Pettifer S, Rice P. EDAM: an ontology of bioinformatics operations, types of data and identifiers, topics and formats. Bioinformatics 2013; 29:1325-32. [PMID: 23479348 PMCID: PMC3654706 DOI: 10.1093/bioinformatics/btt113] [Citation(s) in RCA: 126] [Impact Index Per Article: 11.5] [Received: 07/16/2012] [Revised: 02/28/2013] [Accepted: 03/01/2013] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION Advancing the search, publication and integration of bioinformatics tools and resources demands consistent machine-understandable descriptions. A comprehensive ontology allowing such descriptions is therefore required. RESULTS EDAM is an ontology of bioinformatics operations (tool or workflow functions), types of data and identifiers, application domains and data formats. EDAM supports semantic annotation of diverse entities such as Web services, databases, programmatic libraries, standalone tools, interactive applications, data schemas, datasets and publications within bioinformatics. EDAM applies to organizing and finding suitable tools and data and to automating their integration into complex applications or workflows. It includes over 2200 defined concepts and has successfully been used for annotations and implementations. AVAILABILITY The latest stable version of EDAM is available in OWL format from http://edamontology.org/EDAM.owl and in OBO format from http://edamontology.org/EDAM.obo. It can be viewed online at the NCBO BioPortal and the EBI Ontology Lookup Service. For documentation and license please refer to http://edamontology.org. This article describes version 1.2 available at http://edamontology.org/EDAM_1.2.owl. CONTACT jison@ebi.ac.uk.
Affiliation(s)
- Jon Ison
- EMBL European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK.
|
15
|
Pérez M, Berlanga R, Sanz I, Aramburu MJ. BioUSeR: a semantic-based tool for retrieving Life Science web resources driven by text-rich user requirements. J Biomed Semantics 2013; 4:12. [PMID: 23635042 PMCID: PMC3698192 DOI: 10.1186/2041-1480-4-12] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 04/02/2012] [Accepted: 04/18/2013] [Indexed: 12/05/2022] Open
Abstract
Background Open metadata registries are a fundamental tool for researchers in the Life Sciences trying to locate resources. While most current registries assume that resources are annotated with well-structured metadata, evidence shows that most resource annotations simply consist of informal free text. This reality must be taken into account in order to develop effective techniques for resource discovery in Life Sciences. Results BioUSeR is a semantic-based tool aimed at retrieving Life Sciences resources described in free text. The retrieval process is driven by the user requirements, which consist of a target task and a set of facets of interest, both expressed in free text. BioUSeR is able to effectively exploit the available textual descriptions to find relevant resources by using semantic-aware techniques. Conclusions BioUSeR overcomes the limitations of the current registries thanks to: (i) rich specification of user information needs, (ii) use of semantics to manage textual descriptions, (iii) retrieval and ranking of resources based on user requirements.
Affiliation(s)
- María Pérez
- Department of Computer Science and Engineering, Universitat Jaume I, Castellón, Spain.
|
16
|
Wollbrett J, Larmande P, de Lamotte F, Ruiz M. Clever generation of rich SPARQL queries from annotated relational schema: application to Semantic Web Service creation for biological databases. BMC Bioinformatics 2013; 14:126. [PMID: 23586394 PMCID: PMC3680174 DOI: 10.1186/1471-2105-14-126] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/23/2012] [Accepted: 03/25/2013] [Indexed: 11/10/2022] Open
Abstract
Background In recent years, a large amount of “-omics” data have been produced. However, these data are stored in many different species-specific databases that are managed by different institutes and laboratories. Biologists often need to find and assemble data from disparate sources to perform certain analyses. Searching for these data and assembling them is a time-consuming task. The Semantic Web helps to facilitate interoperability across databases. A common approach involves the development of wrapper systems that map a relational database schema onto existing domain ontologies. However, few attempts have been made to automate the creation of such wrappers. Results We developed a framework, named BioSemantic, for the creation of Semantic Web Services that are applicable to relational biological databases. This framework makes use of both Semantic Web and Web Services technologies and can be divided into two main parts: (i) the generation and semi-automatic annotation of an RDF view; and (ii) the automatic generation of SPARQL queries and their integration into Semantic Web Services backbones. We have used our framework to integrate genomic data from different plant databases. Conclusions BioSemantic is a framework that was designed to speed integration of relational databases. We present how it can be used to speed the development of Semantic Web Services for existing relational biological databases. Currently, it creates and annotates RDF views that enable the automatic generation of SPARQL queries. Web Services are also created and deployed automatically, and the semantic annotations of our Web Services are added automatically using SAWSDL attributes. BioSemantic is downloadable at http://southgreen.cirad.fr/?q=content/Biosemantic.
|
17
|
Klingström T, Soldatova L, Stevens R, Roos TE, Swertz MA, Müller KM, Kalaš M, Lambrix P, Taussig MJ, Litton JE, Landegren U, Bongcam-Rudloff E. Workshop on laboratory protocol standards for the Molecular Methods Database. N Biotechnol 2012; 30:109-13. [PMID: 22687389 DOI: 10.1016/j.nbt.2012.05.019] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 05/18/2012] [Revised: 05/26/2012] [Accepted: 05/26/2012] [Indexed: 10/28/2022]
Abstract
Management of data to produce scientific knowledge is a key challenge for biological research in the 21st century. Emerging high-throughput technologies allow life science researchers to produce big data at speeds and in amounts that were unthinkable just a few years ago. This places high demands on all aspects of the workflow: from data capture (including the experimental constraints of the experiment), analysis and preservation, to peer-reviewed publication of results. Failure to recognise the issues at each level can lead to serious conflicts and mistakes; research may then be compromised as a result of the publication of non-coherent protocols, or the misinterpretation of published data. In this report, we present the results from a workshop that was organised to create an ontological data-modelling framework for Laboratory Protocol Standards for the Molecular Methods Database (MolMeth). The workshop provided a set of short- and long-term goals for the MolMeth database, the most important being the decision to use the established EXACT description of biomedical ontologies as a starting point.
Affiliation(s)
- Tomas Klingström
- Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Uppsala, Sweden
|
18
|
Artimo P, Jonnalagedda M, Arnold K, Baratin D, Csardi G, de Castro E, Duvaud S, Flegel V, Fortier A, Gasteiger E, Grosdidier A, Hernandez C, Ioannidis V, Kuznetsov D, Liechti R, Moretti S, Mostaguir K, Redaschi N, Rossier G, Xenarios I, Stockinger H. ExPASy: SIB bioinformatics resource portal. Nucleic Acids Res 2012; 40:W597-603. [PMID: 22661580 PMCID: PMC3394269 DOI: 10.1093/nar/gks400] [Citation(s) in RCA: 1387] [Impact Index Per Article: 115.6] [Indexed: 11/14/2022] Open
Abstract
ExPASy (http://www.expasy.org) has a worldwide reputation as one of the main bioinformatics resources for proteomics. It has now evolved, becoming an extensible and integrative portal accessing many scientific resources, databases and software tools in different areas of life sciences. Scientists can henceforth seamlessly access a wide range of resources in many different domains, such as proteomics, genomics, phylogeny/evolution, systems biology, population genetics, transcriptomics, etc. The individual resources (databases, web-based and downloadable software tools) are hosted in a ‘decentralized’ way by different groups of the SIB Swiss Institute of Bioinformatics and partner institutions. Specifically, a single web portal provides a common entry point to a wide range of resources developed and operated by different SIB groups and external institutions. The portal features a search function across ‘selected’ resources. Additionally, the availability and usage of resources are monitored. The portal is aimed at both expert users and people who are not familiar with a specific domain in life sciences. The new web interface provides, in particular, visual guidance for newcomers to ExPASy.
Affiliation(s)
- Panu Artimo
- Vital-IT Group, SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
|
19
|
Attwood TK, Coletta A, Muirhead G, Pavlopoulou A, Philippou PB, Popov I, Romá-Mateo C, Theodosiou A, Mitchell AL. The PRINTS database: a fine-grained protein sequence annotation and analysis resource--its status in 2012. Database (Oxford) 2012; 2012:bas019. [PMID: 22508994 PMCID: PMC3326521 DOI: 10.1093/database/bas019] [Citation(s) in RCA: 113] [Impact Index Per Article: 9.4] [Received: 02/20/2012] [Accepted: 03/12/2012] [Indexed: 01/07/2023]
Abstract
The PRINTS database, now in its 21st year, houses a collection of diagnostic protein family 'fingerprints'. Fingerprints are groups of conserved motifs, evident in multiple sequence alignments, whose unique inter-relationships provide distinctive signatures for particular protein families and structural/functional domains. As such, they may be used to assign uncharacterized sequences to known families, and hence to infer tentative functional, structural and/or evolutionary relationships. The February 2012 release (version 42.0) includes 2156 fingerprints, encoding 12 444 individual motifs, covering a range of globular and membrane proteins, modular polypeptides and so on. Here, we report the current status of the database, and introduce a number of recent developments that help both to render a variety of our annotation and analysis tools easier to use and to make them more widely available. Database URL: www.bioinf.manchester.ac.uk/dbbrowser/PRINTS/.
Affiliation(s)
- Teresa K Attwood
- Faculty of Life Sciences, The University of Manchester, Manchester M13 9PT, UK.
|
20
|
Bolser DM, Chibon PY, Palopoli N, Gong S, Jacob D, Del Angel VD, Swan D, Bassi S, González V, Suravajhala P, Hwang S, Romano P, Edwards R, Bishop B, Eargle J, Shtatland T, Provart NJ, Clements D, Renfro DP, Bhak D, Bhak J. MetaBase--the wiki-database of biological databases. Nucleic Acids Res 2011; 40:D1250-4. [PMID: 22139927 PMCID: PMC3245051 DOI: 10.1093/nar/gkr1099] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Indexed: 01/19/2023] Open
Abstract
Biology is generating more data than ever. As a result, there is an ever increasing number of publicly available databases that analyse, integrate and summarize the available data, providing an invaluable resource for the biological community. As this trend continues, there is a pressing need to organize, catalogue and rate these resources, so that the information they contain can be most effectively exploited. MetaBase (MB) (http://MetaDatabase.Org) is a community-curated database containing more than 2000 commonly used biological databases. Each entry is structured using templates and can carry various user comments and annotations. Entries can be searched, listed, browsed or queried. The database was created using the same MediaWiki technology that powers Wikipedia, allowing users to contribute on many different levels. The initial release of MB was derived from the content of the 2007 Nucleic Acids Research (NAR) Database Issue. Since then, approximately 100 databases have been manually collected from the literature, and users have added information for over 240 databases. MB is synchronized annually with the static Molecular Biology Database Collection provided by NAR. To date, there have been 19 significant contributors to the project; each one is listed as an author here to highlight the community aspect of the project.
Affiliation(s)
- Dan M Bolser
- Personal Genomics Institute, Genome Research Foundation, Suwon 443-270, South Korea.
|
21
|
Kalas M, Puntervoll P, Joseph A, Bartaseviciūte E, Töpfer A, Venkataraman P, Pettifer S, Bryne JC, Ison J, Blanchet C, Rapacki K, Jonassen I. BioXSD: the common data-exchange format for everyday bioinformatics web services. Bioinformatics 2010; 26:i540-6. [PMID: 20823319 PMCID: PMC2935419 DOI: 10.1093/bioinformatics/btq391] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Indexed: 11/15/2022] Open
Abstract
Motivation: The world-wide community of life scientists has access to a large number of public bioinformatics databases and tools, which are developed and deployed using diverse technologies and designs. More and more of the resources offer a programmatic web-service interface. However, efficient use of the resources is hampered by the lack of widely used, standard data-exchange formats for the basic, everyday bioinformatics data types. Results: BioXSD has been developed as a candidate for a standard, canonical exchange format for basic bioinformatics data. BioXSD is represented by a dedicated XML Schema and defines syntax for biological sequences, sequence annotations, alignments and references to resources. We have adapted a set of web services to use BioXSD as the input and output format, and implemented a test-case workflow. This demonstrates that the approach is feasible and provides smooth interoperability. Semantics for BioXSD is provided by annotation with the EDAM ontology. We discuss in a separate section how BioXSD relates to other initiatives and approaches, including existing standards and the Semantic Web. Availability: The BioXSD 1.0 XML Schema is freely available at http://www.bioxsd.org/BioXSD-1.0.xsd under the Creative Commons BY-ND 3.0 license. The http://bioxsd.org web page offers documentation, examples of data in BioXSD format, example workflows with source code in common programming languages, an updated list of compatible web services and tools and a repository of feature requests from the community. Contact: matus.kalas@bccs.uib.no; developers@bioxsd.org; support@bioxsd.org
Affiliation(s)
- Matús Kalas
- Bergen Center for Computational Science, Uni Research, Bergen, Norway.
|