1
Vita R, Mahajan S, Overton JA, Dhanda SK, Martini S, Cantrell JR, Wheeler DK, Sette A, Peters B. The Immune Epitope Database (IEDB): 2018 update. Nucleic Acids Res 2019; 47:D339-D343. [PMID: 30357391 PMCID: PMC6324067 DOI: 10.1093/nar/gky1006] [Citation(s) in RCA: 1099] [Impact Index Per Article: 274.8] [Received: 09/14/2018] [Accepted: 10/11/2018] [Indexed: 12/18/2022]
Abstract
The Immune Epitope Database (IEDB, iedb.org) captures experimental data confined in figures, text and tables of the scientific literature, making it freely available and easily searchable to the public. The scope of the IEDB extends across immune epitope data related to all species studied and includes antibody, T cell, and MHC binding contexts associated with infectious, allergic, autoimmune, and transplant related diseases. Having been publicly accessible for >10 years, the recent focus of the IEDB has been improved query and reporting functionality to meet the needs of our users to access and summarize data that continues to grow in quantity and complexity. Here we present an update on our current efforts and future goals.
Affiliation(s)
- Randi Vita
  - La Jolla Institute for Allergy and Immunology, Division of Vaccine Discovery, La Jolla, CA 92037, USA
- Swapnil Mahajan
  - La Jolla Institute for Allergy and Immunology, Division of Vaccine Discovery, La Jolla, CA 92037, USA
- Sandeep Kumar Dhanda
  - La Jolla Institute for Allergy and Immunology, Division of Vaccine Discovery, La Jolla, CA 92037, USA
- Sheridan Martini
  - La Jolla Institute for Allergy and Immunology, Division of Vaccine Discovery, La Jolla, CA 92037, USA
- Alessandro Sette
  - La Jolla Institute for Allergy and Immunology, Division of Vaccine Discovery, La Jolla, CA 92037, USA; University of California San Diego, Department of Medicine, La Jolla, CA 92093, USA
- Bjoern Peters
  - La Jolla Institute for Allergy and Immunology, Division of Vaccine Discovery, La Jolla, CA 92037, USA; University of California San Diego, Department of Medicine, La Jolla, CA 92093, USA

2
Martini S, Nielsen M, Peters B, Sette A. The Immune Epitope Database and Analysis Resource Program 2003-2018: reflections and outlook. Immunogenetics 2019; 72:57-76. [PMID: 31761977 PMCID: PMC6970984 DOI: 10.1007/s00251-019-01137-6] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 08/27/2019] [Accepted: 10/12/2019] [Indexed: 12/12/2022]
Abstract
The Immune Epitope Database and Analysis Resource (IEDB) contains information related to antibodies and T cells across an expansive scope of research fields (infectious diseases, allergy, autoimmunity, and transplantation). Capture and representation of the data to reflect growing scientific standards and techniques have required continual refinement of our rigorous curation and query and reporting processes beginning with the automated classification of over 28 million PubMed abstracts, and resulting in easily searchable data from over 20,000 published manuscripts. Data related to MHC binding and elution, nonpeptidics, natural processing, receptors, and 3D structure is first captured through manual curation and subsequently maintained through recuration to reflect evolving scientific standards. Upon promotion to the free, public database, users can query and export records of specific relevance via the online web portal which undergoes iterative development to best enable efficient data access. In parallel, the companion Analysis Resource site hosts a variety of tools that assist in the bioinformatic analyses of epitopes and related structures, which can be applied to IEDB-derived and independent datasets alike. Available tools are classified into two categories: analysis and prediction. Analysis tools include epitope clustering, sequence conservancy, and more, while prediction tools cover T and B cell epitope binding, immunogenicity, and TCR/BCR structures. In addition to these tools, benchmarking servers which allow for unbiased performance comparison are also offered. In order to expand and support the user-base of both the database and Analysis Resource, the research team actively engages in community outreach through publication of ongoing work, conference attendance and presentations, hosting of user workshops, and the provision of online help. 
This review provides a description of the IEDB database infrastructure, curation and recuration processes, query and reporting capabilities, the Analysis Resource, and our Community Outreach efforts, including assessment of the impact of the IEDB across the research community.
Affiliation(s)
- Sheridan Martini
  - Division of Vaccine Discovery, La Jolla Institute for Immunology, 9420 Athena Circle, La Jolla, CA, 92037, USA
- Morten Nielsen
  - Department of Health Technology, Technical University of Denmark, Kgs. Lyngby, Denmark; Instituto de Investigaciones Biotecnológicas, Universidad Nacional de San Martín, Buenos Aires, Argentina
- Bjoern Peters
  - Division of Vaccine Discovery, La Jolla Institute for Immunology, 9420 Athena Circle, La Jolla, CA, 92037, USA; Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Alessandro Sette
  - Division of Vaccine Discovery, La Jolla Institute for Immunology, 9420 Athena Circle, La Jolla, CA, 92037, USA; Department of Medicine, University of California San Diego, La Jolla, CA, USA

3
Farrell B, Bengtson J. Scientist and data architect collaborate to curate and archive an inner ear electrophysiology data collection. PLoS One 2019; 14:e0223984. [PMID: 31626635 PMCID: PMC6799921 DOI: 10.1371/journal.pone.0223984] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 05/15/2019] [Accepted: 10/02/2019] [Indexed: 11/19/2022]
Abstract
In the past scientists reported summaries of their findings; they did not provide their original data collections. Many stakeholders (e.g., funding agencies) are now requesting that such data be made publicly available. This mandate is being adopted to facilitate further discovery, and to mitigate waste and deficits in the research process. At the same time, the necessary infrastructure for data curation (e.g., repositories) has been evolving. The current target is to make research products FAIR (Findable, Accessible, Interoperable, Reusable), resulting in data that are curated and archived to be both human and machine compatible. However, most scientists have little training in data curation. Specifically, they are ill-equipped to annotate their data collections at a level that facilitates discoverability, aggregation, and broad reuse in a context separate from their creation or sub-field. To circumvent these deficits data architects may collaborate with scientists to transform and curate data. This paper's example of a data collection describes the electrical properties of outer hair cells isolated from the mammalian cochlea. The data is expressed with a variant of The Ontology for Biomedical Investigations (OBI), mirrored to provide the metadata and nested data architecture used within the Hierarchical Data Format version 5 (HDF5) format. Each digital specimen is displayed in a tree configuration (like directories in a computer) and consists of six main branches based on the ontology classes. The data collections, scripts, and ontological OWL file (OBI based Inner Ear Electrophysiology (OBI_IEE)) are deposited in three repositories. We discuss the impediments to producing such data collections for public use, and the tools and processes required for effective implementation. 
This work illustrates the impact that small collaborations can have on the curation of our publicly-funded collections, and is particularly salient for fields where data is sparse, throughput is low, and sacrifice of animals is required for discovery.
Affiliation(s)
- Brenda Farrell
  - Bobby R Alford Department of Otolaryngology and Head & Neck Surgery, Baylor College of Medicine, Houston, Texas, United States of America
- Jason Bengtson
  - K-State Libraries, Kansas State University, Manhattan, Kansas, United States of America

4
Vita R, Overton JA, Mungall CJ, Sette A, Peters B. FAIR principles and the IEDB: short-term improvements and a long-term vision of OBO-foundry mediated machine-actionable interoperability. Database (Oxford) 2018; 2018:4877121. [PMID: 29688354 PMCID: PMC5819722 DOI: 10.1093/database/bax105] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 11/03/2017] [Accepted: 12/21/2017] [Indexed: 12/13/2022]
Abstract
The Immune Epitope Database (IEDB), at www.iedb.org, has the mission to make published experimental data relating to the recognition of immune epitopes easily available to the scientific public. By presenting curated data in a searchable database, we have liberated it from the tables and figures of journal articles, making it more accessible and usable by immunologists. Recently, the principles of Findability, Accessibility, Interoperability and Reusability have been formulated as goals that data repositories should meet to enhance the usefulness of their data holdings. We here examine how the IEDB complies with these principles and identify broad areas of success, but also areas for improvement. We describe short-term improvements to the IEDB that are being implemented now, as well as a long-term vision of true 'machine-actionable interoperability', which we believe will require community agreement on standardization of knowledge representation that can be built on top of the shared use of ontologies.
Affiliation(s)
- Randi Vita
  - La Jolla Institute for Allergy and Immunology, Division of Vaccine Discovery and Center for Emerging Diseases and Biodefense, 9420 Athena Circle, La Jolla, CA 92037, USA
- James A Overton
  - La Jolla Institute for Allergy and Immunology, Division of Vaccine Discovery and Center for Emerging Diseases and Biodefense, 9420 Athena Circle, La Jolla, CA 92037, USA
- Christopher J Mungall
  - Lawrence Berkeley National Laboratory, Division of Environmental Genomics and Systems Biology, 1 Cyclotron Rd, Berkeley, CA 94720, USA
- Alessandro Sette
  - La Jolla Institute for Allergy and Immunology, Division of Vaccine Discovery and Center for Emerging Diseases and Biodefense, 9420 Athena Circle, La Jolla, CA 92037, USA
- Bjoern Peters
  - La Jolla Institute for Allergy and Immunology, Division of Vaccine Discovery and Center for Emerging Diseases and Biodefense, 9420 Athena Circle, La Jolla, CA 92037, USA

5
Bukhari SAC, Martínez-Romero M, O'Connor MJ, Egyedi AL, Willrett D, Graybeal J, Musen MA, Cheung KH, Kleinstein SH. CEDAR OnDemand: a browser extension to generate ontology-based scientific metadata. BMC Bioinformatics 2018; 19:268. [PMID: 30012108 PMCID: PMC6048706 DOI: 10.1186/s12859-018-2247-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 09/05/2017] [Accepted: 06/14/2018] [Indexed: 12/17/2022]
Abstract
Background: Public biomedical data repositories often provide web-based interfaces to collect experimental metadata. However, these interfaces typically reflect the ad hoc metadata specification practices of the associated repositories, leading to a lack of standardization in the collected metadata. This lack of standardization limits the ability of the source datasets to be broadly discovered, reused, and integrated with other datasets. To increase reuse, discoverability, and reproducibility of the described experiments, datasets should be appropriately annotated by using agreed-upon terms, ideally from ontologies or other controlled term sources.
Results: This work presents “CEDAR OnDemand”, a browser extension powered by the NCBO (National Center for Biomedical Ontology) BioPortal that enables users to seamlessly enter ontology-based metadata through existing web forms native to individual repositories. CEDAR OnDemand analyzes the web page contents to identify the text input fields and associate them with relevant ontologies which are recommended automatically based upon input fields’ labels (using the NCBO ontology recommender) and a pre-defined list of ontologies. These field-specific ontologies are used for controlling metadata entry. CEDAR OnDemand works for any web form designed in the HTML format. We demonstrate how CEDAR OnDemand works through the NCBI (National Center for Biotechnology Information) BioSample web-based metadata entry.
Conclusion: CEDAR OnDemand helps lower the barrier of incorporating ontologies into standardized metadata entry for public data repositories. CEDAR OnDemand is available freely on the Google Chrome store https://chrome.google.com/webstore/search/CEDAROnDemand
Affiliation(s)
- Marcos Martínez-Romero
  - Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA
- Martin J O'Connor
  - Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA
- Attila L Egyedi
  - Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA
- Debra Willrett
  - Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA
- John Graybeal
  - Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA
- Mark A Musen
  - Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA
- Kei-Hoi Cheung
  - Interdepartmental Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA; Department of Emergency Medicine and Yale Center for Medical Informatics, Yale University School of Medicine, New Haven, CT, USA
- Steven H Kleinstein
  - Department of Pathology, Yale School of Medicine, New Haven, CT, USA; Interdepartmental Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA

6
He Y, Xiang Z, Zheng J, Lin Y, Overton JA, Ong E. The eXtensible ontology development (XOD) principles and tool implementation to support ontology interoperability. J Biomed Semantics 2018; 9:3. [PMID: 29329592 PMCID: PMC5765662 DOI: 10.1186/s13326-017-0169-2] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 05/01/2017] [Accepted: 12/07/2017] [Indexed: 11/13/2022]
Abstract
Ontologies are critical to data/metadata and knowledge standardization, sharing, and analysis. With hundreds of biological and biomedical ontologies developed, it has become critical to ensure ontology interoperability and the usage of interoperable ontologies for standardized data representation and integration. The suite of web-based Ontoanimal tools (e.g., Ontofox, Ontorat, and Ontobee) supports different aspects of extensible ontology development. By summarizing the common features of Ontoanimal and other similar tools, we identified and proposed an “eXtensible Ontology Development” (XOD) strategy and its associated four principles. These XOD principles reuse existing terms and semantic relations from reliable ontologies, develop and apply well-established ontology design patterns (ODPs), and involve community efforts to support new ontology development, promoting standardized and interoperable data and knowledge representation and integration. The adoption of the XOD strategy, together with robust XOD tool development, will greatly support ontology interoperability and robust ontology applications that help make data Findable, Accessible, Interoperable and Reusable (i.e., FAIR).
Affiliation(s)
- Yongqun He
  - Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, Center for Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, USA
- Zuoshuang Xiang
  - Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, Center for Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, USA
- Jie Zheng
  - Department of Genetics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, 19104, USA
- Yu Lin
  - Center for Computational Science, University of Miami, Coral Gables, FL, USA
- Edison Ong
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, USA

7
Vita R, Overton JA, Peters B. Identification of errors in the IEDB using ontologies. Database (Oxford) 2018; 2018:4904119. [PMID: 29688357 PMCID: PMC5824775 DOI: 10.1093/database/bay005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 10/31/2017] [Revised: 12/11/2017] [Accepted: 01/04/2018] [Indexed: 12/02/2022]
Abstract
The Immune Epitope Database (IEDB) is a free online resource that has manually curated over 18 500 references from the scientific literature. Our database presents experimental data relating to the recognition of immune epitopes by the adaptive immune system in a structured, searchable manner. In order to be consistent and accurate in our data representation across many different journals, authors and curators, we have implemented several quality control measures, such as curation rules, controlled vocabularies and links to external ontologies and other resources. Ontologies and other resources have greatly benefited the IEDB through improved search interfaces, easier curation practices, interoperability between the IEDB and other databases and the identification of errors within our dataset. Here, we will elaborate on how ontology mapping and usage can be used to find and correct errors in a manually curated database. Database URL: www.iedb.org.
Affiliation(s)
- Randi Vita
  - Center for Infectious Disease, La Jolla Institute for Allergy and Immunology, 9420 Athena Circle, La Jolla, CA 92037, USA
- James A Overton
  - Center for Infectious Disease, La Jolla Institute for Allergy and Immunology, 9420 Athena Circle, La Jolla, CA 92037, USA
- Bjoern Peters
  - Center for Infectious Disease, La Jolla Institute for Allergy and Immunology, 9420 Athena Circle, La Jolla, CA 92037, USA

8
Vita R, Overton JA, Sette A, Peters B. Better living through ontologies at the Immune Epitope Database. Database (Oxford) 2017; 2017:3074785. [PMID: 28365732 PMCID: PMC5467561 DOI: 10.1093/database/bax014] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 10/29/2016] [Accepted: 02/06/2017] [Indexed: 12/27/2022]
Abstract
The Immune Epitope Database (IEDB) project incorporates independently developed ontologies and controlled vocabularies into its curation and search interface. This simplifies curation practices, improves the user query experience and facilitates interoperability between the IEDB and other resources. While the use of independently developed ontologies has long been recommended as a best practice, there continues to be a significant number of projects that develop their own vocabularies instead, or that do not fully utilize the power of ontologies that they are using. We describe how we use ontologies in the IEDB, providing a concrete example of the benefits of ontologies in practice. Database URL: www.iedb.org
Affiliation(s)
- Randi Vita
  - La Jolla Institute for Allergy & Immunology, Center for Infectious Disease, La Jolla, CA 92037, USA
- James A Overton
  - La Jolla Institute for Allergy & Immunology, Center for Infectious Disease, La Jolla, CA 92037, USA
- Alessandro Sette
  - La Jolla Institute for Allergy & Immunology, Center for Infectious Disease, La Jolla, CA 92037, USA
- Bjoern Peters
  - La Jolla Institute for Allergy & Immunology, Center for Infectious Disease, La Jolla, CA 92037, USA

9
10
Ceusters W, Nasri-Heir C, Alnaas D, Cairns BE, Michelotti A, Ohrbach R. Perspectives on next steps in classification of oro-facial pain - Part 3: biomarkers of chronic oro-facial pain - from research to clinic. J Oral Rehabil 2015; 42:956-66. [PMID: 26200973 PMCID: PMC4715524 DOI: 10.1111/joor.12324] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Accepted: 05/31/2015] [Indexed: 11/28/2022]
Abstract
The purpose of this study was to review the current status of biomarkers used in oro-facial pain conditions. Specifically, we critically appraise their relative strengths and weaknesses for assessing mechanisms associated with the oro-facial pain conditions and interpret that information in the light of their current value for use in diagnosis. In the third section, we explore biomarkers through the perspective of ontological realism. We discuss ontological problems of biomarkers as currently widely conceptualised and implemented. This leads to recommendations for research practice aimed at a better understanding of the potential contribution that biomarkers might make to oro-facial pain diagnosis, and thereby at fulfilling our goal of an expanded multidimensional framework for oro-facial pain conditions that would include a third axis.
Affiliation(s)
- Werner Ceusters
  - Department of Biomedical Informatics, University at Buffalo, NY, USA
- Brian E Cairns
  - Faculty of Pharmaceutical Sciences, University of British Columbia, Vancouver, Canada
- Ambra Michelotti
  - Section of Orthodontics, School of Dentistry, University of Naples Federico II, Naples, Italy
- Richard Ohrbach
  - Department of Oral Diagnostic Sciences, University at Buffalo, NY, USA

11
Soldatova LN, Nadis D, King RD, Basu PS, Haddi E, Baumlé V, Saunders NJ, Marwan W, Rudkin BB. EXACT2: the semantics of biomedical protocols. BMC Bioinformatics 2014; 15 Suppl 14:S5. [PMID: 25472549 PMCID: PMC4255744 DOI: 10.1186/1471-2105-15-s14-s5] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Indexed: 11/13/2022]
Abstract
Background: The reliability and reproducibility of experimental procedures is a cornerstone of scientific practice. There is a pressing technological need for the better representation of biomedical protocols to enable other agents (human or machine) to better reproduce results. A framework that ensures that all information required for the replication of experimental protocols is captured is essential to achieve reproducibility.
Methods: We have developed the ontology EXACT2 (EXperimental ACTions) that is designed to capture the full semantics of biomedical protocols required for their reproducibility. To construct EXACT2 we manually inspected hundreds of published and commercial biomedical protocols from several areas of biomedicine. After establishing a clear pattern for extracting the required information we utilized text-mining tools to translate the protocols into a machine-amenable format. We have verified the utility of EXACT2 through the successful processing of previously 'unseen' (not used for the construction of EXACT2) protocols.
Results: The paper reports on a fundamentally new version, EXACT2, that supports the semantically-defined representation of biomedical protocols. The ability of EXACT2 to capture the semantics of biomedical procedures was verified through a text mining use case. In this use case, EXACT2 is used as a reference model for text mining tools to identify terms pertinent to experimental actions, and their properties, in biomedical protocols expressed in natural language. An EXACT2-based framework for the translation of biomedical protocols to a machine-amenable format is proposed.
Conclusions: The EXACT2 ontology is sufficient to record, in a machine-processable form, the essential information about biomedical protocols. EXACT2 defines explicit semantics of experimental actions, and can be used by various computer applications. It can serve as a reference model for the translation of biomedical protocols in natural language into a semantically-defined format.
12
Soldatova LN, Sansone SA, Dumontier M, Shah NH. Selected papers from the 15th Annual Bio-Ontologies Special Interest Group Meeting. J Biomed Semantics 2013; 4 Suppl 1:I1. [PMID: 23735191 PMCID: PMC3633002 DOI: 10.1186/2041-1480-4-s1-i1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Over the past 15 years, the Bio-Ontologies SIG at ISMB has provided a forum for discussion of the latest and most innovative research in bio-ontologies development, its applications to biomedicine and, more generally, the organisation, presentation and dissemination of knowledge in biomedicine and the life sciences. The seven papers and the commentary selected for this supplement span a wide range of topics including: web-based querying over multiple ontologies, integration of data, annotating patent records, NCBO Web services, ontology developments for probabilistic reasoning and for physiological processes, and analysis of the progress of annotation and structural GO changes.