1
Bernabé CH, Queralt-Rosinach N, Silva Souza VE, Bonino da Silva Santos LO, Mons B, Jacobsen A, Roos M. The use of foundational ontologies in biomedical research. J Biomed Semantics 2023; 14:21. [PMID: 38082345 PMCID: PMC10712036 DOI: 10.1186/s13326-023-00300-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2022] [Accepted: 11/29/2023] [Indexed: 12/18/2023] Open
Abstract
BACKGROUND The FAIR principles recommend the use of controlled vocabularies, such as ontologies, to define data and metadata concepts. Ontologies are currently modelled following different approaches, sometimes describing conflicting definitions of the same concepts, which can affect interoperability. To cope with that, prior literature suggests organising ontologies in levels, where domain specific (low-level) ontologies are grounded in domain independent high-level ontologies (i.e., foundational ontologies). In this level-based organisation, foundational ontologies work as translators of intended meaning, thus improving interoperability. Despite their considerable acceptance in biomedical research, there are very few studies testing foundational ontologies. This paper describes a systematic literature mapping that was conducted to understand how foundational ontologies are used in biomedical research and to find empirical evidence supporting their claimed (dis)advantages. RESULTS From a set of 79 selected papers, we identified that foundational ontologies are used for several purposes: ontology construction, repair, mapping, and ontology-based data analysis. Foundational ontologies are claimed to improve interoperability, enhance reasoning, speed up ontology development and facilitate maintainability. The complexity of using foundational ontologies is the most commonly cited downside. Despite being used for several purposes, there were hardly any experiments (1 paper) testing the claims for or against the use of foundational ontologies. In the subset of 49 papers that describe the development of an ontology, we observed low adherence to formal methods for ontology construction (16 papers) and ontology evaluation (4 papers). CONCLUSION Our findings have two main implications. First, the lack of empirical evidence about the use of foundational ontologies indicates a need for evaluating the use of such artefacts in biomedical research. Second, the low adherence to formal methods illustrates how the field could benefit from a more systematic approach when dealing with the development and evaluation of ontologies. The understanding of how foundational ontologies are used in the biomedical field can drive future research towards the improvement of ontologies and, consequently, data FAIRness. The adoption of formal methods can impact the quality and sustainability of ontologies, and reusing these methods from other fields is encouraged.
Affiliation(s)
- César H Bernabé
  - Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
- Luiz Olavo Bonino da Silva Santos
  - Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
  - University of Twente, Enschede, The Netherlands
- Barend Mons
  - Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
- Annika Jacobsen
  - Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
- Marco Roos
  - Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
2
Oshni Alvandi A, Burstein F, Bain C. A digital health ecosystem ontology from the perspective of Australian consumers: a mixed-method literature analysis. Inform Health Soc Care 2023; 48:13-29. [PMID: 35298327 DOI: 10.1080/17538157.2022.2049273] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
Abstract
This study presents an ontology that scopes the digital health ecosystem from a consumer-centered perspective. We used a mixed-method analysis on a set of papers collected for a comprehensive review to identify common themes, components, and patterns that repeatedly emerge within Australian-based digital health studies. Three major and four child themes were identified as the foundational aspects of the proposed ontology. The child themes have more precise concept definitions as well as inherited and distinguishing attributes. Out of 179 recognized concepts, 33 were related to the Healthcare theme; 23 concepts formed a cluster of employed devices under the Technology theme; 40 concepts were associated with Use and Usability factors. A further 60 concepts formed the cluster of the consumer-user theme. The theme of Digital Health was seen as being connected to 2 independent clusters. The main cluster embodied 21 extracted concepts, semantically related to "data, information, and knowledge," whilst the second cluster embodied concepts related to "healthcare." Different stakeholders can utilize this ontology to define their landscape of digitally enabled healthcare. The novelty of this work resides in capturing a consumer-centered perspective and in the method we used to derive the ontology: formalizing the results of a systematic review based on data-driven analysis methods.
Affiliation(s)
- Abraham Oshni Alvandi
  - Digital Health Theme, Department of Human-Centered Computing, Faculty of Information Technology, Monash University, Melbourne, Victoria, Australia
- Frada Burstein
  - Digital Health Theme, Department of Human-Centered Computing, Faculty of Information Technology, Monash University, Melbourne, Victoria, Australia
- Chris Bain
  - Digital Health Theme, Department of Human-Centered Computing, Faculty of Information Technology, Monash University, Melbourne, Victoria, Australia
3
He Y, Yu H, Huffman A, Lin AY, Natale DA, Beverley J, Zheng L, Perl Y, Wang Z, Liu Y, Ong E, Wang Y, Huang P, Tran L, Du J, Shah Z, Shah E, Desai R, Huang HH, Tian Y, Merrell E, Duncan WD, Arabandi S, Schriml LM, Zheng J, Masci AM, Wang L, Liu H, Smaili FZ, Hoehndorf R, Pendlington ZM, Roncaglia P, Ye X, Xie J, Tang YW, Yang X, Peng S, Zhang L, Chen L, Hur J, Omenn GS, Athey B, Smith B. A comprehensive update on CIDO: the community-based coronavirus infectious disease ontology. J Biomed Semantics 2022; 13:25. [PMID: 36271389 PMCID: PMC9585694 DOI: 10.1186/s13326-022-00279-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/17/2022] [Accepted: 09/13/2022] [Indexed: 11/24/2022] Open
Abstract
BACKGROUND The current COVID-19 pandemic and the previous SARS/MERS outbreaks of 2003 and 2012 have resulted in a series of major global public health crises. We argue that in the interest of developing effective and safe vaccines and drugs and to better understand coronaviruses and associated disease mechanisms it is necessary to integrate the large and exponentially growing body of heterogeneous coronavirus data. Ontologies play an important role in standard-based knowledge and data representation, integration, sharing, and analysis. Accordingly, we initiated the development of the community-based Coronavirus Infectious Disease Ontology (CIDO) in early 2020. RESULTS As an Open Biomedical Ontology (OBO) library ontology, CIDO is open source and interoperable with other existing OBO ontologies. CIDO is aligned with the Basic Formal Ontology and Viral Infectious Disease Ontology. CIDO has imported terms from over 30 OBO ontologies. For example, CIDO imports all SARS-CoV-2 protein terms from the Protein Ontology, COVID-19-related phenotype terms from the Human Phenotype Ontology, and over 100 COVID-19 terms for vaccines (both authorized and in clinical trial) from the Vaccine Ontology. CIDO systematically represents variants of SARS-CoV-2 viruses and over 300 amino acid substitutions therein, along with over 300 diagnostic kits and methods. CIDO also describes hundreds of host-coronavirus protein-protein interactions (PPIs) and the drugs that target proteins in these PPIs. CIDO has been used to model COVID-19 related phenomena in areas such as epidemiology. The scope of CIDO was evaluated by visual analysis supported by a summarization network method. CIDO has been used in various applications such as term standardization, inference, natural language processing (NLP) and clinical data integration. We have applied the amino acid variant knowledge present in CIDO to analyze differences between SARS-CoV-2 Delta and Omicron variants. CIDO's integrative host-coronavirus PPIs and drug-target knowledge has also been used to support drug repurposing for COVID-19 treatment. CONCLUSION CIDO represents entities and relations in the domain of coronavirus diseases with a special focus on COVID-19. It supports shared knowledge representation, data and metadata standardization and integration, and has been used in a range of applications.
Affiliation(s)
- Yongqun He
  - University of Michigan Medical School, Ann Arbor, MI USA
- Hong Yu
  - People’s Hospital of Guizhou Province, Guiyang, Guizhou China
- Asiyah Yu Lin
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD USA
  - National Center for Ontological Research, Buffalo, NY USA
- John Beverley
  - National Center for Ontological Research, Buffalo, NY USA
  - The Johns Hopkins University Applied Physics Laboratory, Laurel, MD USA
- Ling Zheng
  - Computer Science and Software Engineering Department, Monmouth University, West Long Branch, NJ USA
- Yehoshua Perl
  - Department of Computer Science, New Jersey Institute of Technology, Newark, NJ USA
- Zhigang Wang
  - Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & School of Basic Medicine, Peking Union Medical College, Beijing, China
- Yingtong Liu
  - University of Michigan Medical School, Ann Arbor, MI USA
- Edison Ong
  - University of Michigan Medical School, Ann Arbor, MI USA
- Yang Wang
  - University of Michigan Medical School, Ann Arbor, MI USA
  - People’s Hospital of Guizhou Province, Guiyang, Guizhou China
- Philip Huang
  - University of Michigan Medical School, Ann Arbor, MI USA
- Long Tran
  - University of Michigan Medical School, Ann Arbor, MI USA
- Jinyang Du
  - University of Michigan Medical School, Ann Arbor, MI USA
- Zalan Shah
  - University of Michigan Medical School, Ann Arbor, MI USA
- Easheta Shah
  - University of Michigan Medical School, Ann Arbor, MI USA
- Roshan Desai
  - University of Michigan Medical School, Ann Arbor, MI USA
- Hsin-hui Huang
  - University of Michigan Medical School, Ann Arbor, MI USA
  - National Yang-Ming University, Taipei, Taiwan
- Yujia Tian
  - Rutgers University, New Brunswick, NJ USA
- Lynn M. Schriml
  - University of Maryland School of Medicine, Baltimore, MD USA
- Jie Zheng
  - Department of Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA USA
- Anna Maria Masci
  - Office of Data Science, National Institute of Environmental Health Sciences, Research Triangle Park, NC USA
- Robert Hoehndorf
  - King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Zoë May Pendlington
  - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Paola Roncaglia
  - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Xianwei Ye
  - People’s Hospital of Guizhou Province, Guiyang, Guizhou China
- Jiangan Xie
  - School of Bioinformatics, Chongqing University of Posts and Telecommunications, Chongqing, China
- Yi-Wei Tang
  - Cepheid, Danaher Diagnostic Platform, Shanghai, China
- Xiaolin Yang
  - Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & School of Basic Medicine, Peking Union Medical College, Beijing, China
- Suyuan Peng
  - National Institute of Health Data Science, Peking University, Beijing, China
- Luxia Zhang
  - National Institute of Health Data Science, Peking University, Beijing, China
- Luonan Chen
  - Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, China
- Junguk Hur
  - University of North Dakota School of Medicine and Health Sciences, Grand Forks, ND USA
- Brian Athey
  - University of Michigan Medical School, Ann Arbor, MI USA
- Barry Smith
  - National Center for Ontological Research, Buffalo, NY USA
  - University at Buffalo, Buffalo, NY 14260 USA
4
Yagahara A, Yokoi H. Terminology integration and inconsistency identification of adverse event terminology for Japanese medical devices using SPARQL. BMC Med Inform Decis Mak 2022; 22:16. [PMID: 35042480 PMCID: PMC8767687 DOI: 10.1186/s12911-022-01748-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2021] [Accepted: 01/05/2022] [Indexed: 11/10/2022] Open
Abstract
Background
For standardization of terms in the reports of medical device adverse events, 89 Japanese medical device adverse event terminologies were published in March 2015. The 89 terminologies were developed independently by 13 industry associations, suggesting that there may be inconsistencies among the terms proposed. The purpose of this study was to integrate the 89 sets of terminologies and evaluate inconsistencies among them using SPARQL.
Methods
To evaluate the inconsistencies in the integrated terminology, the following six items were checked: (1) whether the two-layer structure between category term and preferred term is consistent; (2) whether synonyms of a preferred term are involved (matching was also performed with the layer-category order reversed); (3) whether each preferred term is subordinate to only one category term; (4) whether the definitions of terms are uniquely determined; (5) whether CDRH-NCIt terms corresponding to preferred terms are uniquely determined; (6) whether a term in a medical device problem is used for patient problems.
Results
Duplicated terms accounted for about 60% of the total number of terms. This is because industry associations that created multiple terminologies adopted the same terms in terminologies of similar medical device groups. In the case that all terms with the same spelling have the same concept, efficient integration can be achieved automatically using RDF. Furthermore, across the six inconsistency items evaluated in this study, terms that need to be reviewed accounted for about 10% or less in each item.
Conclusions
RDF and SPARQL were useful tools to explore inconsistencies of hierarchies, definition statements, and synonyms when integrating terminology by term notation, and these had the advantage of reducing the physical and time burden.
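As a rough illustration of items (3) and (4) in the Methods above, the sketch below flags preferred terms that fall under more than one category term or that carry more than one definition. It is plain Python over invented toy records, not the authors' RDF data or SPARQL queries; the record layout, the sample rows, and the function name `find_inconsistencies` are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical toy records: (terminology_id, category_term, preferred_term, definition).
# These rows are invented for illustration and are not taken from the study's data.
RECORDS = [
    ("T01", "Breakage", "Crack", "A partial fracture of a component."),
    ("T02", "Breakage", "Crack", "A partial fracture of a component."),
    ("T03", "Deformation", "Crack", "A split in the surface of a part."),
    ("T03", "Leakage", "Fluid leak", "Unintended escape of fluid."),
]


def find_inconsistencies(records):
    """Return preferred terms with more than one category term or definition."""
    categories = defaultdict(set)
    definitions = defaultdict(set)
    for _terminology, category, preferred, definition in records:
        categories[preferred].add(category)
        definitions[preferred].add(definition)
    # Item (3): a preferred term should sit under exactly one category term.
    multi_category = {t: sorted(c) for t, c in categories.items() if len(c) > 1}
    # Item (4): a preferred term should have exactly one definition.
    multi_definition = {t: sorted(d) for t, d in definitions.items() if len(d) > 1}
    return multi_category, multi_definition


multi_category, multi_definition = find_inconsistencies(RECORDS)
print(multi_category)
print(multi_definition)
```

In an RDF setting the same grouping would typically be expressed as a query that groups by preferred term and keeps groups with more than one distinct category or definition; the dictionary-of-sets form here is only meant to make the logic of the two checks concrete.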
5
Slater LT, Gkoutos GV, Hoehndorf R. Towards semantic interoperability: finding and repairing hidden contradictions in biomedical ontologies. BMC Med Inform Decis Mak 2020; 20:311. [PMID: 33319712 PMCID: PMC7736131 DOI: 10.1186/s12911-020-01336-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 11/11/2020] [Accepted: 11/16/2020] [Indexed: 12/25/2022] Open
Abstract
Background Ontologies are widely used throughout the biomedical domain. These ontologies formally represent the classes and relations assumed to exist within a domain. As scientific domains are deeply interlinked, so too are their representations. While individual ontologies can be tested for consistency and coherency using automated reasoning methods, systematically combining ontologies of multiple domains together may reveal previously hidden contradictions. Methods We developed a method that tests for hidden unsatisfiabilities in an ontology that arise when combined with other ontologies. For this purpose, we combined sets of ontologies and used automated reasoning to determine whether unsatisfiable classes are present. In addition, we designed and implemented a novel algorithm that can determine justifications for contradictions across extremely large and complicated ontologies, and use these justifications to semi-automatically repair ontologies by identifying a small set of axioms that, when removed, result in a consistent and coherent set of ontologies.
Results We tested the mutual consistency of the OBO Foundry and the OBO ontologies and found that the combined OBO Foundry gives rise to at least 636 unsatisfiable classes, while the OBO ontologies give rise to more than 300,000 unsatisfiable classes. We also applied our semi-automatic repair algorithm to each combination of OBO ontologies that resulted in unsatisfiable classes, finding that removing only 117 axioms accounts for all cases of unsatisfiability across all OBO ontologies. Conclusions We identified a large set of hidden unsatisfiability across a broad range of biomedical ontologies, and we find that this large set of unsatisfiable classes is the result of a relatively small number of axiomatic disagreements. Our results show that hidden unsatisfiability is a serious problem in ontology interoperability; however, our results also provide a way towards more consistent ontologies by addressing the issues we identified.
Affiliation(s)
- Luke T Slater
  - College of Medical and Dental Sciences, Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham, B15 2TT, UK
  - Institute of Translational Medicine, University Hospitals Birmingham, NHS Foundation Trust, Birmingham, B15 2TT, UK
- Georgios V Gkoutos
  - College of Medical and Dental Sciences, Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham, B15 2TT, UK
  - Institute of Translational Medicine, University Hospitals Birmingham, NHS Foundation Trust, Birmingham, B15 2TT, UK
  - NIHR Experimental Cancer Medicine Centre, Birmingham, B15 2TT, UK
  - NIHR Surgical Reconstruction and Microbiology Research Centre, Birmingham, B15 2TT, UK
  - NIHR Biomedical Research Centre, Birmingham, B15 2TT, UK
  - MRC Health Data Research UK (HDR UK) Midlands, Birmingham, B15 2TT, UK
- Robert Hoehndorf
  - Computer, Electrical and Mathematical Sciences and Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, 23955, Saudi Arabia
6
Wikidata: A large-scale collaborative ontological medical database. J Biomed Inform 2019; 99:103292. [DOI: 10.1016/j.jbi.2019.103292] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 02/16/2019] [Revised: 08/10/2019] [Accepted: 09/18/2019] [Indexed: 01/09/2023]
7
McDaniel M, Storey VC, Sugumaran V. Assessing the quality of domain ontologies: Metrics and an automated ranking system. DATA KNOWL ENG 2018. [DOI: 10.1016/j.datak.2018.02.001] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Indexed: 11/25/2022]
8
Santana da Silva F, Jansen L, Freitas F, Schulz S. Ontological interpretation of biomedical database content. J Biomed Semantics 2017. [PMID: 28651575 PMCID: PMC5485580 DOI: 10.1186/s13326-017-0127-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 02/01/2023] Open
Abstract
Background Biological databases store data about laboratory experiments, together with semantic annotations, in order to support data aggregation and retrieval. The exact meaning of such annotations in the context of a database record is often ambiguous. We address this problem by grounding implicit and explicit database content in a formal-ontological framework. Methods By using a typical extract from the databases UniProt and Ensembl, annotated with content from GO, PR, ChEBI and NCBI Taxonomy, we created four ontological models (in OWL), which generate explicit, distinct interpretations under the BioTopLite2 (BTL2) upper-level ontology. The first three models interpret database entries as individuals (IND), defined classes (SUBC), and classes with dispositions (DISP), respectively; the fourth model (HYBR) is a combination of SUBC and DISP. For the evaluation of these four models, we consider (i) database content retrieval, using ontologies as query vocabulary; (ii) information completeness; and, (iii) DL complexity and decidability. The models were tested under these criteria against four competency questions (CQs). Results IND does not raise any ontological claim, besides asserting the existence of sample individuals and relations among them. Modelling patterns have to be created for each type of annotation referent. SUBC is interpreted regarding maximally fine-grained defined subclasses under the classes referred to by the data. DISP attempts to extract truly ontological statements from the database records, claiming the existence of dispositions. HYBR is a hybrid of SUBC and DISP and is more parsimonious regarding expressiveness and query answering complexity. For each of the four models, the four CQs were submitted as DL queries. This shows the ability to retrieve individuals with IND, and classes in SUBC and HYBR. DISP does not retrieve anything because the axioms with disposition are embedded in General Class Inclusion (GCI) statements. 
Conclusion Ambiguity of biological database content is addressed by a method that identifies implicit knowledge behind semantic annotations in biological databases and grounds it in an expressive upper-level ontology. The result is a seamless representation of database structure, content and annotations as OWL models. Electronic supplementary material The online version of this article (doi:10.1186/s13326-017-0127-z) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Filipe Santana da Silva
  - Centro de Informática, Universidade Federal de Pernambuco, Av. Jornalista Anibal Fernandes, 50.740-560, Recife, Brazil
  - Núcleo de Telessaúde, Universidade Federal de Pernambuco, Av. Prof. Moraes Rego, 50670-420, Recife, Brazil
- Ludger Jansen
  - Institut für Philosophie, Universität Rostock, D-18051, Rostock, Germany
- Fred Freitas
  - Centro de Informática, Universidade Federal de Pernambuco, Av. Jornalista Anibal Fernandes, 50.740-560, Recife, Brazil
- Stefan Schulz
  - Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Auenbruggerplatz 2/V, Graz, 8036, Austria
9
Pozza G, Borgo S, Oltramari A, Contalbrigo L, Marangon S. Information and organization in public health institutes: an ontology-based modeling of the entities in the reception-analysis-report phases. J Biomed Semantics 2016; 7:51. [PMID: 27608917 PMCID: PMC5017037 DOI: 10.1186/s13326-016-0095-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/09/2016] [Accepted: 08/24/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Ontologies are widely used both in the life sciences and in the management of public and private companies. Typically, the different offices in an organization develop their own models and related ontologies to capture specific tasks and goals. Although there might be an overall coordination, the use of distinct ontologies can jeopardize the integration of data across the organization since data sharing and reusability are sensitive to modeling choices. RESULTS The paper provides a study of the entities that are typically found at the reception, analysis and report phases in public institutes in the life science domain. Ontological considerations and techniques are introduced and their implementation exemplified by studying the Istituto Zooprofilattico Sperimentale delle Venezie (IZSVe), a public veterinary institute with different geographical locations and several laboratories. Several modeling issues are discussed, such as the identification and characterization of the main entities in these phases, the classification of the (types of) data, and the clarification of the contexts and roles of the entities involved. The study is based on a foundational ontology and shows how it can be extended to a comprehensive and coherent framework comprising the institute's different roles, processes and data. In particular, it shows how to use notions lying at the borderline between ontology and applications, like that of knowledge object. The paper aims to help the modeler to understand the core viewpoint of the organization and to improve data transparency. CONCLUSIONS The study shows that the entities at play can be analyzed within a single ontological perspective allowing us to isolate a single ontological framework for the whole organization. This facilitates the development of coherent representations of the entities and related data, and fosters the use of integrated software for data management and reasoning across the company.
Affiliation(s)
- Giandomenico Pozza
  - Istituto Zooprofilattico Sperimentale delle Venezie, Viale dell'Universitá, Legnaro (PD), 10, 35020, Italy
- Stefano Borgo
  - Laboratory for Applied Ontology (LOA), ISTC CNR, Via alla Cascata 56/C - Povo, Trento, 38100, Italy
- Alessandro Oltramari
  - Bosch Research and Technology Center, 2555 Smallman Street, Pittsburgh, 15222, Pennsylvania, USA
- Laura Contalbrigo
  - Istituto Zooprofilattico Sperimentale delle Venezie, Viale dell'Universitá, Legnaro (PD), 10, 35020, Italy
- Stefano Marangon
  - Istituto Zooprofilattico Sperimentale delle Venezie, Viale dell'Universitá, Legnaro (PD), 10, 35020, Italy
10
Elayavilli RK, Liu H. Ion Channel ElectroPhysiology Ontology (ICEPO) - a case study of text mining assisted ontology development. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE PROCEEDINGS. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE 2016; 2016:42-51. [PMID: 27570648 PMCID: PMC5001744] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2022]
Abstract
BACKGROUND Computational modeling of biological cascades is of great interest to quantitative biologists. Biomedical text has been a rich source for quantitative information. Gathering quantitative parameters and values from biomedical text is one significant challenge in the early steps of computational modeling as it involves huge manual effort. While automatically extracting such quantitative information from biomedical text may offer some relief, the lack of an ontological representation for a subdomain is an impediment to normalizing textual extractions to a standard representation. This may render textual extractions less meaningful to the domain experts. METHODS In this work, we propose a rule-based approach to automatically extract relations involving quantitative data from biomedical text describing ion channel electrophysiology. We further translated the quantitative assertions extracted through text mining to a formal representation that may help in constructing an ontology for ion channel events using a rule-based approach. We have developed the Ion Channel ElectroPhysiology Ontology (ICEPO) by integrating the information represented in closely related ontologies, such as the Cell Physiology Ontology (CPO) and the Cardiac Electro Physiology Ontology (CPEO), and the knowledge provided by domain experts. RESULTS The rule-based system achieved an overall F-measure of 68.93% in extracting the quantitative data assertions on an independently annotated blind data set. We further made an initial attempt at formalizing the quantitative data assertions extracted from the biomedical text into a formal representation that offers potential to facilitate the integration of text mining into the ontological workflow, a novel aspect of this study. CONCLUSIONS This work is a case study where we created a platform that provides formal interaction between ontology development and text mining. We have achieved partial success in extracting quantitative assertions from the biomedical text and formalizing them in an ontological framework. AVAILABILITY The ICEPO ontology is available for download at http://openbionlp.org/mutd/supplementarydata/ICEPO/ICEPO.owl.
11
Arguello Casteleiro M, Klein J, Stevens R. The Proteasix Ontology. J Biomed Semantics 2016; 7:33. [PMID: 27259807 PMCID: PMC4893253 DOI: 10.1186/s13326-016-0078-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 12/14/2015] [Accepted: 05/19/2016] [Indexed: 11/10/2022] Open
Abstract
Background The Proteasix Ontology (PxO) is an ontology that supports the Proteasix tool, an open-source peptide-centric tool that can be used to predict automatically and in a large-scale fashion in silico the proteases involved in the generation of proteolytic cleavage fragments (peptides). Methods The PxO re-uses parts of the Protein Ontology, the three Gene Ontology sub-ontologies, the Chemical Entities of Biological Interest Ontology, the Sequence Ontology and bespoke extensions to the PxO in support of a series of roles: 1. To describe the known proteases and their target cleavage sites. 2. To enable the description of proteolytic cleavage fragments as the outputs of observed and predicted proteolysis. 3. To use knowledge about the function, species and cellular location of a protease and protein substrate to support the prioritisation of proteases in observed and predicted proteolysis. Results The PxO is designed to describe the biological underpinnings of the generation of peptides. The peptide-centric PxO seeks to support the Proteasix tool by separating domain knowledge from the operational knowledge used in protease prediction by Proteasix and to support the confirmation of its analyses and results. Availability The Proteasix Ontology may be found at: http://bioportal.bioontology.org/ontologies/PXO. This ontology is free and open for use by everyone. Electronic supplementary material The online version of this article (doi:10.1186/s13326-016-0078-9) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Julie Klein
  - Institut National de la Sante et de la Recherche Medicale (INSERM), U1048, Toulouse, 24105, France
- Robert Stevens
  - School of Computer Science, University of Manchester, Oxford Road, Manchester, M13 9PL, UK
12
Hoehndorf R, Schofield PN, Gkoutos GV. The role of ontologies in biological and biomedical research: a functional perspective. Brief Bioinform 2015; 16:1069-80. [PMID: 25863278 PMCID: PMC4652617 DOI: 10.1093/bib/bbv011] [Citation(s) in RCA: 119] [Impact Index Per Article: 13.2] [Received: 11/26/2014] [Revised: 01/20/2015] [Indexed: 12/19/2022] Open
Abstract
Ontologies are widely used in biological and biomedical research. Their success lies in their combination of four main features present in almost all ontologies: provision of standard identifiers for classes and relations that represent the phenomena within a domain; provision of a vocabulary for a domain; provision of metadata that describes the intended meaning of the classes and relations in ontologies; and the provision of machine-readable axioms and definitions that enable computational access to some aspects of the meaning of classes and relations. While each of these features enables applications that facilitate data integration, data access and analysis, a great potential lies in the possibility of combining these four features to support integrative analysis and interpretation of multimodal data. Here, we provide a functional perspective on ontologies in biology and biomedicine, focusing on what ontologies can do and describing how they can be used in support of integrative research. We also outline perspectives for using ontologies in data-driven science, in particular their application in structured data mining and machine learning applications.
|
13
|
Liu J, Wang Z. Diverse array-designed modes of combination therapies in Fangjiomics. Acta Pharmacol Sin 2015; 36:680-8. [PMID: 25864646 PMCID: PMC4594182 DOI: 10.1038/aps.2014.125] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 08/30/2014] [Accepted: 10/30/2014] [Indexed: 12/11/2022] Open
Abstract
In line with the complexity of disease networks, diverse combination therapies have demonstrated potential for treating different patients with complex diseases according to a personal combination profile. However, the identification of rational, compatible and effective drug combinations remains an ongoing challenge. Based on a holistic theory integrated with reductionism, Fangjiomics systematically develops multiple modes of array-designed combination therapies. We define diverse "magic shotgun" vertical, horizontal, focusing, siege and dynamic arrays according to different spatiotemporal distributions of hits on targets, pathways and networks. Through these multiple adaptive modes for treating complex diseases, Fangjiomics may help to identify rational drug combinations with synergistic or additive efficacy but reduced adverse side effects that reverse complex diseases by reconstructing or rewiring multiple targets, pathways and networks. Such a novel paradigm for combination therapies may allow us to achieve more precise treatments by developing phenotype-driven quantitative multi-scale modeling for rational drug combinations.
Affiliation(s)
- Jun Liu
- Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China
- Zhong Wang
- Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China
|
14
|
Hoehndorf R, Slater L, Schofield PN, Gkoutos GV. Aber-OWL: a framework for ontology-based data access in biology. BMC Bioinformatics 2015; 16:26. [PMID: 25627673 PMCID: PMC4384359 DOI: 10.1186/s12859-015-0456-9] [Citation(s) in RCA: 58] [Impact Index Per Article: 6.4] [Received: 08/18/2014] [Accepted: 01/09/2015] [Indexed: 11/10/2022] Open
Abstract
Background Many ontologies have been developed in biology and these ontologies increasingly contain large volumes of formalized knowledge commonly expressed in the Web Ontology Language (OWL). Computational access to the knowledge contained within these ontologies relies on the use of automated reasoning. Results We have developed the Aber-OWL infrastructure that provides reasoning services for bio-ontologies. Aber-OWL consists of an ontology repository, a set of web services and web interfaces that enable ontology-based semantic access to biological data and literature. Aber-OWL is freely available at http://aber-owl.net. Conclusions Aber-OWL provides a framework for automatically accessing information that is annotated with ontologies or contains terms used to label classes in ontologies. When using Aber-OWL, access to ontologies and data annotated with them is not merely based on class names or identifiers but rather on the knowledge the ontologies contain and the inferences that can be drawn from it.
Affiliation(s)
- Robert Hoehndorf
- Computational Bioscience Research Center, King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, 23955-6900, Saudi Arabia; Computer, Electrical and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, 23955-6900, Saudi Arabia.
- Luke Slater
- Computational Bioscience Research Center, King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, 23955-6900, Saudi Arabia; Computer, Electrical and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, 23955-6900, Saudi Arabia; Department of Computer Science, Aberystwyth University, Llandinam Building, Aberystwyth, SY23 3DB, UK.
- Paul N Schofield
- Department of Physiology, Development & Neuroscience, University of Cambridge, Downing Street, Cambridge, CB2 3EG, UK.
- Georgios V Gkoutos
- Department of Computer Science, Aberystwyth University, Llandinam Building, Aberystwyth, SY23 3DB, UK.
|
15
|
Cardiff RD, Miller CH, Munn RJ. Analysis of mouse model pathology: a primer for studying the anatomic pathology of genetically engineered mice. Cold Spring Harb Protoc 2014; 2014:561-80. [PMID: 24890215 DOI: 10.1101/pdb.top069922] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/24/2022]
Abstract
This primer of pathology is intended to introduce investigators to the structure (morphology) of cancer with an emphasis on genetically engineered mouse (GEM) models (GEMMs). We emphasize the necessity of using the entire biological context for the interpretation of anatomic pathology. Because the primary investigator is responsible for almost all of the information and procedures leading up to microscopic examination, they should also be responsible for documentation of experiments so that the microscopic interpretation can be rendered in context of the biology. The steps involved in this process are outlined, discussed, and illustrated. Because GEMMs are unique experimental subjects, some of the more common pitfalls are discussed. Many of these errors can be avoided with attention to detail and continuous quality assurance.
Affiliation(s)
- Robert D Cardiff
- Center for Comparative Medicine and Center for Genomic Pathology, University of California, Davis, Davis, California 95616
- Claramae H Miller
- Center for Comparative Medicine and Center for Genomic Pathology, University of California, Davis, Davis, California 95616
- Robert J Munn
- Center for Comparative Medicine and Center for Genomic Pathology, University of California, Davis, Davis, California 95616
|
16
|
Cardiff RD, Miller CH, Munn RJ, Galvez JJ. Structured reporting in anatomic pathology for coclinical trials: the caELMIR model. Cold Spring Harb Protoc 2014; 2014:32-43. [PMID: 24173313 DOI: 10.1101/pdb.top078790] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 01/20/2023]
Abstract
Electronic media, with their tremendous potential for storing, retrieving, and integrating data, are an essential part of modern collaborative multidisciplinary science. Structured reporting is a fundamental aspect of keeping accurate, searchable electronic records. This discussion on structured reporting in anatomic pathology for pre- and coclinical trials in animal models provides background information for scientists who are not familiar with structured reporting. Practical examples are provided using a working database system for preclinical research, caELMIR (Cancer Electronic Laboratory Management Information and Retrieval), developed by the U.S. National Cancer Institute's (NCI's) Mouse Models of Human Cancers Consortium (MMHCC).
Affiliation(s)
- Robert D Cardiff
- Center for Comparative Medicine and Center for Genomic Pathology, University of California, Davis, Davis, California 95616
|
17
|
Hoehndorf R, Hiebert T, Hardy NW, Schofield PN, Gkoutos GV, Dumontier M. Mouse model phenotypes provide information about human drug targets. Bioinformatics 2013; 30:719-25. [PMID: 24158600 PMCID: PMC3933875 DOI: 10.1093/bioinformatics/btt613] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Indexed: 12/19/2022]
Abstract
Motivation: Methods for computational drug target identification use information from diverse information sources to predict or prioritize drug targets for known drugs. One set of resources that has been relatively neglected for drug repurposing is animal model phenotype. Results: We investigate the use of mouse model phenotypes for drug target identification. To achieve this goal, we first integrate mouse model phenotypes and drug effects, and then systematically compare the phenotypic similarity between mouse models and drug effect profiles. We find a high similarity between phenotypes resulting from loss-of-function mutations and drug effects resulting from the inhibition of a protein through a drug action, and demonstrate how this approach can be used to suggest candidate drug targets. Availability and implementation: Analysis code and supplementary data files are available on the project Web site at https://drugeffects.googlecode.com. Contact: leechuck@leechuck.de or roh25@aber.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
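The comparison step described in this abstract can be illustrated with a minimal sketch. It assumes phenotypes and drug effects have already been mapped to a shared set of term labels, and it uses a plain Jaccard index as a stand-in for the ontology-based similarity measure of the paper, which is not specified in the abstract. The function names and the toy gene and phenotype labels are hypothetical.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard index of two annotation sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_candidate_targets(
    drug_effects: set[str], model_phenotypes: dict[str, set[str]]
) -> list[tuple[str, float]]:
    """Rank knockout models by similarity of their phenotypes to a drug's effects."""
    scores = {
        gene: jaccard(drug_effects, phenotypes)
        for gene, phenotypes in model_phenotypes.items()
    }
    # Highest similarity first: the top genes are the suggested candidate targets.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data, not taken from the paper.
drug_effects = {"hypotension", "bradycardia", "fatigue"}
models = {
    "GeneA": {"hypotension", "bradycardia"},
    "GeneB": {"obesity", "fatigue"},
    "GeneC": {"abnormal coat color"},
}
ranking = rank_candidate_targets(drug_effects, models)
```

A set-overlap score ignores the ontology hierarchy, so two closely related but non-identical terms count as a mismatch; a measure that exploits subclass relations would be needed to approximate the approach the abstract describes.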
Affiliation(s)
- Robert Hoehndorf
- Department of Computer Science, University of Aberystwyth, Old College, King Street, Aberystwyth SY23 2AX, Department of Biology, Institute of Biochemistry and School of Computer Science, Carleton University, 1125 Colonel By Drive, Ottawa, Ontario K1S 5B6, Canada and Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge CB2 3EG, UK
|
18
|
Osumi-Sutherland D, Marygold SJ, Millburn GH, McQuilton PA, Ponting L, Stefancsik R, Falls K, Brown NH, Gkoutos GV. The Drosophila phenotype ontology. J Biomed Semantics 2013; 4:30. [PMID: 24138933 PMCID: PMC3816596 DOI: 10.1186/2041-1480-4-30] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 07/01/2013] [Accepted: 10/11/2013] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND Phenotype ontologies are queryable classifications of phenotypes. They provide a widely-used means for annotating phenotypes in a form that is human-readable, programmatically accessible and that can be used to group annotations in biologically meaningful ways. Accurate manual annotation requires clear textual definitions for terms. Accurate grouping and fruitful programmatic usage require high-quality formal definitions that can be used to automate classification. The Drosophila phenotype ontology (DPO) has been used to annotate over 159,000 phenotypes in FlyBase to date, but until recently lacked textual or formal definitions. RESULTS We have composed textual definitions for all DPO terms and formal definitions for 77% of them. Formal definitions reference terms from a range of widely-used ontologies including the Phenotype and Trait Ontology (PATO), the Gene Ontology (GO) and the Cell Ontology (CL). We also describe a generally applicable system, devised for the DPO, for recording and reasoning about the timing of death in populations. As a result of the new formalisations, 85% of classifications in the DPO are now inferred rather than asserted, with much of this classification leveraging the structure of the GO. This work has significantly improved the accuracy and completeness of classification and made further development of the DPO more sustainable. CONCLUSIONS The DPO provides a set of well-defined terms for annotating Drosophila phenotypes and for grouping and querying the resulting annotation sets in biologically meaningful ways. Such queries have already resulted in successful function predictions from phenotype annotation. Moreover, such formalisations make extended queries possible, including cross-species queries via the external ontologies used in formal definitions. The DPO is openly available under an open source license in both OBO and OWL formats. There is good potential for it to be used more broadly by the Drosophila community, which may ultimately result in its extension to cover a broader range of phenotypes.
Affiliation(s)
- Steven J Marygold
- FlyBase, Department of Genetics, University of Cambridge, Downing Street, Cambridge, UK
- Gillian H Millburn
- FlyBase, Department of Genetics, University of Cambridge, Downing Street, Cambridge, UK
- Peter A McQuilton
- FlyBase, Department of Genetics, University of Cambridge, Downing Street, Cambridge, UK
- Laura Ponting
- FlyBase, Department of Genetics, University of Cambridge, Downing Street, Cambridge, UK
- Raymund Stefancsik
- FlyBase, Department of Genetics, University of Cambridge, Downing Street, Cambridge, UK
- Kathleen Falls
- The Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA, USA
- Nicholas H Brown
- FlyBase, Department of Genetics, University of Cambridge, Downing Street, Cambridge, UK
- Gurdon Institute & Department of Physiology, Development and Neuroscience, University of Cambridge, Tennis Court Road, Cambridge, UK
- Georgios V Gkoutos
- Department of Genetics, University of Cambridge, Downing Street, Cambridge, UK
|
19
|
Fuellen G, Jansen L, Leser U, Kurtz A. Using ontologies to study cell transitions. J Biomed Semantics 2013; 4:25. [PMID: 24103098 PMCID: PMC4128511 DOI: 10.1186/2041-1480-4-25] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 01/21/2013] [Accepted: 08/19/2013] [Indexed: 11/29/2022] Open
Abstract
Background Understanding, modelling and influencing the transition between different states of cells, be it reprogramming of somatic cells to pluripotency or trans-differentiation between cells, is a hot topic in current biomedical and cell-biological research. Nevertheless, the large body of published knowledge in this area is underused, as most results are only represented in natural language, impeding their finding, comparison, aggregation, and usage. Scientific understanding of the complex molecular mechanisms underlying cell transitions could be improved by making essential pieces of knowledge available in a formal (and thus computable) manner. Results We describe the outline of two ontologies for cell phenotypes and for cellular mechanisms which together enable the representation of data curated from the literature or obtained by bioinformatics analyses and thus for building a knowledge base on mechanisms involved in cellular reprogramming. In particular, we discuss how comprehensive ontologies of cell phenotypes and of changes in mechanisms can be designed using the entity-quality (EQ) model. Conclusions We show that the principles for building cellular ontologies published in this work allow deeper insights into the relations between the continuants (cell phenotypes) and the occurrents (cell mechanism changes) involved in cellular reprogramming, although implementation remains for future work. Further, our design principles lead to ontologies that allow the meaningful application of similarity searches in the spaces of cell phenotypes and of mechanisms, and, especially, of changes of mechanisms during cellular transitions.
Affiliation(s)
- Georg Fuellen
- Institute for Biostatistics and Informatics in Medicine and Ageing Research, Rostock Medical School, Ernst-Heydemann-Str. 8, 18057 Rostock, Germany.
|
20
|
Ross KE, Arighi CN, Ren J, Natale DA, Huang H, Wu CH. Use of the protein ontology for multi-faceted analysis of biological processes: a case study of the spindle checkpoint. Front Genet 2013; 4:62. [PMID: 23637705 PMCID: PMC3636526 DOI: 10.3389/fgene.2013.00062] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 02/25/2013] [Accepted: 04/05/2013] [Indexed: 11/13/2022] Open
Abstract
As a member of the Open Biomedical Ontologies (OBO) foundry, the Protein Ontology (PRO) provides an ontological representation of protein forms and complexes and their relationships. Annotations in PRO can be assigned to individual protein forms and complexes, each distinguishable down to the level of post-translational modification, thereby allowing for a more precise depiction of protein function than is possible with annotations to the gene as a whole. Moreover, PRO is fully interoperable with other OBO ontologies and integrates knowledge from other protein-centric resources such as UniProt and Reactome. Here we demonstrate the value of the PRO framework in the investigation of the spindle checkpoint, a highly conserved biological process that relies extensively on protein modification and protein complex formation. The spindle checkpoint maintains genomic integrity by monitoring the attachment of chromosomes to spindle microtubules and delaying cell cycle progression until the spindle is fully assembled. Using PRO in conjunction with other bioinformatics tools, we explored the cross-species conservation of spindle checkpoint proteins, including phosphorylated forms and complexes; studied the impact of phosphorylation on spindle checkpoint function; and examined the interactions of spindle checkpoint proteins with the kinetochore, the site of checkpoint activation. Our approach can be generalized to any biological process of interest.
Affiliation(s)
- Karen E Ross
- Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE, USA
|
21
|
Callahan A, Cruz-Toledo J, Dumontier M. Ontology-Based Querying with Bio2RDF's Linked Open Data. J Biomed Semantics 2013; 4 Suppl 1:S1. [PMID: 23735196 PMCID: PMC3632999 DOI: 10.1186/2041-1480-4-s1-s1] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Indexed: 11/25/2022] Open
Abstract
BACKGROUND A key activity for life scientists in this post "-omics" age involves searching for and integrating biological data from a multitude of independent databases. However, our ability to find relevant data is hampered by non-standard web and database interfaces backed by an enormous variety of data formats. This heterogeneity presents an overwhelming barrier to the discovery and reuse of resources which have been developed at great public expense. To address this issue, the open-source Bio2RDF project promotes a simple convention to integrate diverse biological data using Semantic Web technologies. However, querying Bio2RDF remains difficult due to the lack of uniformity in the representation of Bio2RDF datasets. RESULTS We describe an update to Bio2RDF that includes tighter integration across 19 new and updated RDF datasets. All available open-source scripts were first consolidated to a single GitHub repository and then redeveloped using a common API that generates normalized IRIs using a centralized dataset registry. We then mapped dataset specific types and relations to the Semanticscience Integrated Ontology (SIO) and demonstrate simplified federated queries across multiple Bio2RDF endpoints. CONCLUSIONS This coordinated release marks an important milestone for the Bio2RDF open source linked data framework. Principally, it improves the quality of linked data in the Bio2RDF network and makes it easier to access or recreate the linked data locally. We hope to continue improving the Bio2RDF network of linked data by identifying priority databases and increasing the vocabulary coverage to additional dataset vocabularies beyond SIO.
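The idea of generating normalized IRIs from a centralized dataset registry can be sketched as follows. This is an illustration only, not the Bio2RDF API: the registry contents, the function name and the `http://bio2rdf.org/<prefix>:<identifier>` pattern used here are assumptions made for the example.

```python
# Assumed IRI pattern for illustration: <base><preferred prefix>:<identifier>
BASE_IRI = "http://bio2rdf.org/"

# Hypothetical registry mapping prefix variants seen in source data
# to one preferred, lower-case dataset prefix.
REGISTRY = {
    "drugbank": "drugbank",
    "DrugBank": "drugbank",
    "uniprot": "uniprot",
    "UniProtKB": "uniprot",
}


def normalized_iri(prefix: str, identifier: str) -> str:
    """Build one canonical IRI for a record, whatever prefix variant the source used."""
    try:
        preferred = REGISTRY[prefix]
    except KeyError:
        # Refuse to mint IRIs for datasets the registry does not know about.
        raise ValueError(f"unregistered dataset prefix: {prefix!r}") from None
    return f"{BASE_IRI}{preferred}:{identifier.strip()}"
```

Because every script resolves prefixes through the same lookup, two datasets that refer to the same record produce the same IRI string, which is what makes joins across endpoints possible without per-dataset special cases.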
Affiliation(s)
- Alison Callahan
- Department of Biology, Carleton University, 1125 Colonel By Drive, Ottawa, ON, Canada
- José Cruz-Toledo
- Department of Biology, Carleton University, 1125 Colonel By Drive, Ottawa, ON, Canada
- Michel Dumontier
- Department of Biology, Carleton University, 1125 Colonel By Drive, Ottawa, ON, Canada
- Institute of Biochemistry, Carleton University, 1125 Colonel By Drive, Ottawa, ON, Canada
- School of Computer Science, Carleton University, 1125 Colonel By Drive, Ottawa, ON, Canada
|
22
|
Callahan A, Cruz-Toledo J, Ansell P, Dumontier M. Bio2RDF Release 2: Improved Coverage, Interoperability and Provenance of Life Science Linked Data. The Semantic Web: Semantics and Big Data 2013. [DOI: 10.1007/978-3-642-38288-8_14] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.0] [Indexed: 12/05/2022]
|
23
|
Gkoutos GV, Hoehndorf R. Ontology-based cross-species integration and analysis of Saccharomyces cerevisiae phenotypes. J Biomed Semantics 2012; 3 Suppl 2:S6. [PMID: 23046642 PMCID: PMC3448529 DOI: 10.1186/2041-1480-3-s2-s6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 01/27/2023] Open
Abstract
Ontologies are widely used in the biomedical community for annotation and integration of databases. Formal definitions can relate classes from different ontologies and thereby integrate data across different levels of granularity, domains and species. We have applied this methodology to the Ascomycete Phenotype Ontology (APO), enabling the reuse of various orthogonal ontologies and we have converted the phenotype associated data found in the SGD following our proposed patterns. We have integrated the resulting data in the cross-species phenotype network PhenomeNET, and we make both the cross-species integration of yeast phenotypes and a similarity-based comparison of yeast phenotypes across species available in the PhenomeBrowser. Furthermore, we utilize our definitions and the yeast phenotype annotations to suggest novel functional annotations of gene products in yeast.
Affiliation(s)
- Georgios V Gkoutos
- Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK.
|
24
|
Loebe F, Stumpf F, Hoehndorf R, Herre H. Towards improving phenotype representation in OWL. J Biomed Semantics 2012; 3 Suppl 2:S5. [PMID: 23046625 PMCID: PMC3448528 DOI: 10.1186/2041-1480-3-s2-s5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Phenotype ontologies are used in species-specific databases for the annotation of mutagenesis experiments and to characterize human diseases. The Entity-Quality (EQ) formalism is a means to describe complex phenotypes based on one or more affected entities and a quality. EQ-based definitions have been developed for many phenotype ontologies, including the Human and Mammalian Phenotype ontologies. METHODS We analyze formalizations of complex phenotype descriptions in the Web Ontology Language (OWL) that are based on the EQ model, identify several representational challenges and analyze potential solutions to address these challenges. RESULTS In particular, we suggest a novel, role-based approach to represent relational qualities such as concentration of iron in spleen, discuss its ontological foundation in the General Formal Ontology (GFO) and evaluate its representation in OWL and the benefits it can bring to the representation of phenotype annotations. CONCLUSION Our analysis of OWL-based representations of phenotypes can contribute to improving consistency and expressiveness of formal phenotype descriptions.
Affiliation(s)
- Frank Loebe
- Department of Computer Science, University of Leipzig, 04103 Leipzig, Germany.
|
25
|
Abstract
Ontologies are now pervasive in biomedicine, where they serve as a means to standardize terminology, to enable access to domain knowledge, to verify data consistency and to facilitate integrative analyses over heterogeneous biomedical data. For this purpose, research on biomedical ontologies applies theories and methods from diverse disciplines such as information management, knowledge representation, cognitive science, linguistics and philosophy. Depending on the desired applications in which ontologies are being applied, the evaluation of research in biomedical ontologies must follow different strategies. Here, we provide a classification of research problems in which ontologies are being applied, focusing on the use of ontologies in basic and translational research, and we demonstrate how research results in biomedical ontologies can be evaluated. The evaluation strategies depend on the desired application and measure the success of using an ontology for a particular biomedical problem. For many applications, the success can be quantified, thereby facilitating the objective evaluation and comparison of research in biomedical ontology. The objective, quantifiable comparison of research results based on scientific applications opens up the possibility for systematically improving the utility of ontologies in biomedical research.
Affiliation(s)
- Robert Hoehndorf
- Department of Computer Science, Aberystwyth University, Aberystwyth, Ceredigion, SY23 3DB, UK.
|
26
|
Hoehndorf R, Dumontier M, Gkoutos GV. Identifying aberrant pathways through integrated analysis of knowledge in pharmacogenomics. Bioinformatics 2012; 28:2169-75. [PMID: 22711793 PMCID: PMC3493115 DOI: 10.1093/bioinformatics/bts350] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Received: 02/08/2012] [Revised: 06/11/2012] [Accepted: 06/12/2012] [Indexed: 01/22/2023] Open
Abstract
MOTIVATION Many complex diseases are the result of abnormal pathway functions instead of single abnormalities. Disease diagnosis and intervention strategies must target these pathways while minimizing the interference with normal physiological processes. Large-scale identification of disease pathways and chemicals that may be used to perturb them requires the integration of information about drugs, genes, diseases and pathways. This information is currently distributed over several pharmacogenomics databases. An integrated analysis of the information in these databases can reveal disease pathways and facilitate novel biomedical analyses. RESULTS We demonstrate how to integrate pharmacogenomics databases through integration of the biomedical ontologies that are used as meta-data in these databases. The additional background knowledge in these ontologies can then be used to enable novel analyses. We identify disease pathways using a novel multi-ontology enrichment analysis over the Human Disease Ontology, and we identify significant associations between chemicals and pathways using an enrichment analysis over a chemical ontology. The drug-pathway and disease-pathway associations are a valuable resource for research in disease and drug mechanisms and can be used to improve computational drug repurposing. AVAILABILITY http://pharmgkb-owl.googlecode.com
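An enrichment analysis of the kind mentioned in this abstract is commonly implemented as a one-sided hypergeometric test for over-representation. The sketch below shows that generic test using only the standard library; whether the paper uses exactly this statistic is an assumption, and the function name and the numbers in the usage line are illustrative.

```python
from math import comb


def enrichment_p_value(k: int, population: int, annotated: int, sample: int) -> float:
    """One-sided hypergeometric p-value P(X >= k).

    population: all genes considered
    annotated:  genes in the population annotated to the ontology class
    sample:     genes in the set under study (e.g. a pathway)
    k:          genes in the sample that carry the annotation
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability of every outcome at least as extreme as k.
    tail = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(k, upper + 1)
    )
    return tail / total


# Illustrative numbers: 3 of 3 sampled genes annotated, 3 annotated among 10 overall.
p = enrichment_p_value(k=3, population=10, annotated=3, sample=3)
```

In an ontology setting the test is typically repeated for every class, with annotations propagated to superclasses first, so a multiple-testing correction would be needed before interpreting the resulting p-values.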
Affiliation(s)
- Robert Hoehndorf
- Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK.
|
27
|
Schofield PN, Hoehndorf R, Gkoutos GV. Mouse genetic and phenotypic resources for human genetics. Hum Mutat 2012; 33:826-36. [PMID: 22422677 DOI: 10.1002/humu.22077] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.8] [Indexed: 12/15/2022]
Abstract
The use of model organisms to provide information on gene function has proved to be a powerful approach to our understanding of both human disease and fundamental mammalian biology. Large-scale community projects using mice, based on forward and reverse genetics, and now the pan-genomic phenotyping efforts of the International Mouse Phenotyping Consortium, are generating resources on an unprecedented scale, which will be extremely valuable to human genetics and medicine. We discuss the nature and availability of data, mice and embryonic stem cells from these large-scale programmes, the use of these resources to help prioritize and validate candidate genes in human genetic association studies, and how they can improve our understanding of the underlying pathobiology of human disease.
Affiliation(s)
- Paul N Schofield
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, United Kingdom.
|
28
|
Gkoutos GV, Schofield PN, Hoehndorf R. Computational tools for comparative phenomics: the role and promise of ontologies. Mamm Genome 2012; 23:669-79. [PMID: 22814867 DOI: 10.1007/s00335-012-9404-4] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 03/06/2012] [Accepted: 05/21/2012] [Indexed: 11/28/2022]
Abstract
A major aim of the biological sciences is to gain an understanding of human physiology and disease. One important step towards such a goal is the discovery of the function of genes that will lead to a better understanding of the physiology and pathophysiology of organisms, which will ultimately lead to better diagnosis and therapy. Our increasing ability to phenotypically characterise genetic variants of model organisms coupled with systematic and hypothesis-driven mutagenesis is resulting in a wealth of information that could potentially provide insight into the functions of all genes in an organism. The challenge we are now facing is to develop computational methods that can integrate and analyse such data. The introduction of formal ontologies that make their semantics explicit and accessible to automated reasoning provides the tantalizing possibility of standardizing biomedical knowledge allowing for novel, powerful queries that bridge multiple domains, disciplines, species, and levels of granularity. We review recent computational approaches that facilitate the integration of experimental data from model organisms with clinical observations in humans. These methods foster novel cross-species analysis approaches, thereby enabling comparative phenomics and leading to the potential of translating basic discoveries from the model systems into diagnostic and therapeutic advances at the clinical level.
Affiliation(s)
- Georgios V Gkoutos
- Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK.
|
29
|
Jupp S, Stevens R, Hoehndorf R. Logical Gene Ontology Annotations (GOAL): exploring gene ontology annotations with OWL. J Biomed Semantics 2012; 3 Suppl 1:S3. [PMID: 22541594 PMCID: PMC3337258 DOI: 10.1186/2041-1480-3-s1-s3] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 01/21/2023] Open
Abstract
MOTIVATION Ontologies such as the Gene Ontology (GO) and their use in annotations make cross-species comparisons of genes possible, along with a wide range of other analytical activities. The bio-ontologies community, in particular the Open Biomedical Ontologies (OBO) community, have provided many other ontologies and an increasingly large volume of annotations of gene products that can be exploited in query and analysis. As many annotations with different ontologies centre upon gene products, there is a possibility to explore gene products through multiple ontological perspectives at the same time. Questions could be asked that link a gene product's function, process, cellular location, phenotype and disease. Current tools, such as AmiGO, allow exploration of genes based on their GO annotations, but not through multiple ontological perspectives. In addition, the semantics of these ontologies' representations should, through automated reasoning, afford richer query opportunities over the gene product annotations than are currently possible. RESULTS To support this multi-perspective, richer querying of gene product annotations, we have created the Logical Gene Ontology, or GOAL ontology, in OWL that combines the Gene Ontology, Human Disease Ontology and the Mammalian Phenotype Ontology, together with classes that represent the annotations with these ontologies for mouse gene products. Each mouse gene product is represented as a class, with the appropriate relationships to the GO aspects, phenotype and disease with which it has been annotated. We then use defined classes to query these protein classes through automated reasoning, and to build a complex hierarchy of gene products. We have presented this through a Web interface that allows arbitrary queries to be constructed and the results displayed.
CONCLUSION This standard use of OWL affords a rich interaction with Gene Ontology, Human Disease Ontology and Mammalian Phenotype Ontology annotations for the mouse, to give a fine partitioning of the gene products in the GOAL ontology. OWL in combination with automated reasoning can be effectively used to query across ontologies to ask biologically rich questions. We have demonstrated that automated reasoning can be used to deliver practical on-line querying support for the ontology annotations available for the mouse. AVAILABILITY The GOAL Web page is to be found at http://owl.cs.manchester.ac.uk/goal.
Affiliation(s)
- Simon Jupp
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SD, UK
- Robert Stevens
- School of Computer Science, University of Manchester, Oxford Road, Manchester, M13 9PL, UK
- Robert Hoehndorf
- Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK
|
30
|
The neurobehavior ontology: an ontology for annotation and integration of behavior and behavioral phenotypes. Int Rev Neurobiol 2012. [PMID: 23195121 DOI: 10.1016/b978-0-12-388408-4.00004-6] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Indexed: 12/23/2022]
Abstract
In recent years, considerable advances have been made toward our understanding of the genetic architecture of behavior and the physical, mental, and environmental influences that underpin behavioral processes. The provision of a method for recording behavior-related phenomena is necessary to enable integrative and comparative analyses of data and knowledge about behavior. The neurobehavior ontology facilitates the systematic representation of behavior and behavioral phenotypes, thereby improving the unification and integration of behavioral data in neuroscience research.
|
31
|
Schofield PN, Sundberg JP, Hoehndorf R, Gkoutos GV. New approaches to the representation and analysis of phenotype knowledge in human diseases and their animal models. Brief Funct Genomics 2011; 10:258-65. [PMID: 21987712 PMCID: PMC3189694 DOI: 10.1093/bfgp/elr031] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Indexed: 12/13/2022] Open
Abstract
The systematic investigation of the phenotypes associated with genotypes in model organisms holds the promise of revealing genotype-phenotype relations directly and without additional, intermediate inferences. Large-scale projects are now underway to catalog the complete phenome of a species, notably the mouse. With the increasing amount of phenotype information becoming available, a major challenge that biology faces today is the systematic analysis of this information and the translation of research results across species and into an improved understanding of human disease. The challenge is to integrate and combine phenotype descriptions within a species and to systematically relate them to phenotype descriptions in other species, in order to form a comprehensive understanding of the relations between those phenotypes and the genotypes involved in human disease. We distinguish between two major approaches for comparative phenotype analyses: the first relies on evolutionary relations to bridge the species gap, while the other approach compares phenotypes directly. In particular, the direct comparison of phenotypes relies heavily on the quality and coherence of phenotype and disease databases. We discuss major achievements and future challenges for these databases in light of their potential to contribute to the understanding of the molecular mechanisms underlying human disease. In particular, we discuss how the use of ontologies and automated reasoning can significantly contribute to the analysis of phenotypes and demonstrate their potential for enabling translational research.
|
32
|
Hoehndorf R, Dumontier M, Gennari JH, Wimalaratne S, de Bono B, Cook DL, Gkoutos GV. Integrating systems biology models and biomedical ontologies. BMC Syst Biol 2011; 5:124. [PMID: 21835028 PMCID: PMC3170340 DOI: 10.1186/1752-0509-5-124] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.9] [Received: 04/06/2011] [Accepted: 08/11/2011] [Indexed: 01/30/2023]
Abstract
BACKGROUND Systems biology is an approach to biology that emphasizes the structure and dynamic behavior of biological systems and the interactions that occur within them. To succeed, systems biology crucially depends on the accessibility and integration of data across domains and levels of granularity. Biomedical ontologies were developed to facilitate such an integration of data and are often used to annotate biosimulation models in systems biology. RESULTS We provide a framework to integrate representations of in silico systems biology with those of in vivo biology as described by biomedical ontologies and demonstrate this framework using the Systems Biology Markup Language. We developed the SBML Harvester software that automatically converts annotated SBML models into OWL and we apply our software to those biosimulation models that are contained in the BioModels Database. We utilize the resulting knowledge base for complex biological queries that can bridge levels of granularity, verify models based on the biological phenomenon they represent and provide a means to establish a basic qualitative layer on which to express the semantics of biosimulation models. CONCLUSIONS We establish an information flow between biomedical ontologies and biosimulation models and we demonstrate that the integration of annotated biosimulation models and biomedical ontologies enables the verification of models as well as expressive queries. Establishing a bi-directional information flow between systems biology and biomedical ontologies has the potential to enable large-scale analyses of biological systems that span levels of granularity from molecules to organisms.
Affiliation(s)
- Robert Hoehndorf
- Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK
- Michel Dumontier
- Department of Biology, Carleton University, 1125 Colonel By Drive, Ottawa, K1S 5B6, Canada
- School of Computer Science, Carleton University, 1125 Colonel By Drive, Ottawa, K1S 5B6, Canada
- John H Gennari
- Biomedical & Health Informatics, Department of Medical Education and Biomedical Informatics, University of Washington, 1959 NE Pacific Street, Box 357420, Seattle, Washington 98195, USA
- Sarala Wimalaratne
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Bernard de Bono
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Daniel L Cook
- Department of Physiology & Biophysics, University of Washington, 1705 NE Pacific Street, Box 357290, Seattle, Washington 98195, USA
- Department of Biological Structure, University of Washington, 1959 NE Pacific Street, Box 357420, Seattle, Washington 98195, USA
- Georgios V Gkoutos
- Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK
|