1
Azzi R, Bordea G, Griffier R, Nikiema JN, Mougin F. Enriching the FIDEO ontology with food-drug interactions from online knowledge sources. J Biomed Semantics 2024; 15:1. [PMID: 38438913 PMCID: PMC10913206 DOI: 10.1186/s13326-024-00302-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Accepted: 02/05/2024] [Indexed: 03/06/2024] Open
Abstract
The increasing number of articles on adverse interactions that may occur when specific foods are consumed with certain drugs makes it difficult to keep up with the latest findings. Conflicting information is available in the scientific literature and specialized knowledge bases because interactions are described in an unstructured or semi-structured format. The FIDEO ontology aims to integrate and represent information about food-drug interactions in a structured way. This article reports on the new version of this ontology in which more than 1700 interactions are integrated from two online resources: DrugBank and Hedrine. These food-drug interactions have been represented in FIDEO in the form of precompiled concepts, each of which specifies both the food and the drug involved. Additionally, competency questions that can be answered are reviewed, and avenues for further enrichment are discussed.
Affiliation(s)
- Rabia Azzi
  - Univ. Bordeaux, Inserm, BPH, U1219, F-33000, Bordeaux, France
  - CHU de Bordeaux, Service d'information médicale, F-33000, Bordeaux, France
- Georgeta Bordea
  - Univ. Bordeaux, Inserm, BPH, U1219, F-33000, Bordeaux, France
  - Univ. La Rochelle, L3i, F-17000, La Rochelle, France
- Romain Griffier
  - Univ. Bordeaux, Inserm, BPH, U1219, F-33000, Bordeaux, France
  - CHU de Bordeaux, Service d'information médicale, F-33000, Bordeaux, France
- Jean Noël Nikiema
  - Department of Management, Evaluation and Health Policy, School of Public Health, Université de Montréal, Québec, Canada
- Fleur Mougin
  - Univ. Bordeaux, Inserm, BPH, U1219, F-33000, Bordeaux, France
2
Zayas CE, Whorton JM, Sexton KW, Mabry CD, Dowland SC, Brochhausen M. Development and validation of the early warning system scores ontology. J Biomed Semantics 2023; 14:14. [PMID: 37730667 PMCID: PMC10510162 DOI: 10.1186/s13326-023-00296-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Accepted: 09/09/2023] [Indexed: 09/22/2023] Open
Abstract
BACKGROUND Clinical early warning scoring systems have improved patient outcomes in a range of specializations and global contexts. These systems are used to predict patient deterioration. A multitude of patient-level physiological decompensation data has been made available through the widespread integration of early warning scoring systems within EHRs across national and international health care organizations. These data can be used to promote secondary research. The diversity of early warning scoring systems and EHR systems is one barrier to secondary analysis of early warning score data. Because early warning score parameters vary, it is difficult to query across providers and EHR systems. Moreover, mapping and merging the parameters is challenging. To overcome these problems, we develop and validate the Early Warning System Scores Ontology (EWSSO), representing three commonly used early warning scores: the National Early Warning Score (NEWS), the six-item modified Early Warning Score (MEWS), and the quick Sequential Organ Failure Assessment (qSOFA). METHODS We apply the Software Development Lifecycle Framework, conceived by Winston Royce in 1970, to model the activities involved in organizing, producing, and evaluating the EWSSO. We also follow OBO Foundry Principles and the principles of best practice for domain ontology design, terms, definitions, and classifications to meet BFO requirements for ontology building. RESULTS We developed twenty-nine new classes and reused four classes and four object properties to create the EWSSO. When we queried the data, our ontology-based process could differentiate between necessary and unnecessary features for score calculation 100% of the time. Further, our process applied the proper temperature conversions for the early warning score calculator 100% of the time.
CONCLUSIONS Using synthetic datasets, we demonstrate the EWSSO can be used to generate and query health system data on vital signs and provide input to calculate the NEWS, six-item MEWS, and qSOFA. Future work includes extending the EWSSO by introducing additional early warning scores for adult and pediatric patient populations and creating patient profiles that contain clinical, demographic, and outcomes data regarding the patient.
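As a rough illustration of the kind of calculation the queried vital-sign data is meant to feed, the sketch below computes a qSOFA score and normalizes temperature units. The thresholds are the standard qSOFA criteria (respiratory rate of at least 22 per minute, systolic blood pressure of at most 100 mmHg, altered mentation, taken here as a Glasgow Coma Scale below 15). The `Vitals` record, its field names, and the unit flag are assumptions made for this example and are not taken from the EWSSO.

```python
from dataclasses import dataclass


def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


@dataclass
class Vitals:
    # Hypothetical record layout, for illustration only.
    respiratory_rate: int    # breaths per minute
    systolic_bp: int         # mmHg
    gcs: int                 # Glasgow Coma Scale, 3 to 15
    temperature: float       # not a qSOFA input; used by NEWS and MEWS
    temperature_unit: str    # "C" or "F"


def temperature_celsius(v: Vitals) -> float:
    """Return the temperature in Celsius regardless of the recorded unit."""
    if v.temperature_unit == "F":
        return fahrenheit_to_celsius(v.temperature)
    if v.temperature_unit == "C":
        return v.temperature
    raise ValueError(f"unknown temperature unit: {v.temperature_unit!r}")


def qsofa(v: Vitals) -> int:
    """Score one point per met criterion; temperature is deliberately ignored."""
    score = 0
    if v.respiratory_rate >= 22:
        score += 1
    if v.systolic_bp <= 100:
        score += 1
    if v.gcs < 15:
        score += 1
    return score


patient = Vitals(respiratory_rate=24, systolic_bp=95, gcs=15,
                 temperature=98.6, temperature_unit="F")
print(qsofa(patient), round(temperature_celsius(patient), 1))
```

Here temperature is a necessary feature for NEWS and MEWS but an unnecessary one for qSOFA, which is the distinction the abstract reports the ontology-based process making.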
Affiliation(s)
- Cilia E Zayas
  - Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, USA
- Justin M Whorton
  - Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, USA
- Kevin W Sexton
  - Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, USA
  - Department of Surgery, College of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, USA
  - University of Arkansas for Medical Sciences, Institute for Digital Health & Innovation, 4301 West Markham Street, Slot 781, Little Rock, AR, 72205, USA
- Charles D Mabry
  - Department of Surgery, College of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, USA
- S Clint Dowland
  - Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, USA
- Mathias Brochhausen
  - Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, USA
  - Department of Medical Humanities and Bioethics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, USA
3
Taneja SB, Callahan TJ, Paine MF, Kane-Gill SL, Kilicoglu H, Joachimiak MP, Boyce RD. Developing a Knowledge Graph for Pharmacokinetic Natural Product-Drug Interactions. J Biomed Inform 2023; 140:104341. [PMID: 36933632 PMCID: PMC10150409 DOI: 10.1016/j.jbi.2023.104341] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/24/2022] [Revised: 01/09/2023] [Accepted: 03/13/2023] [Indexed: 03/17/2023]
Abstract
BACKGROUND Pharmacokinetic natural product-drug interactions (NPDIs) occur when botanical or other natural products are co-consumed with pharmaceutical drugs. With the growing use of natural products, the risk for potential NPDIs and consequent adverse events has increased. Understanding mechanisms of NPDIs is key to preventing or minimizing adverse events. Although biomedical knowledge graphs (KGs) have been widely used for drug-drug interaction applications, computational investigation of NPDIs is novel. We constructed NP-KG as a first step toward computational discovery of plausible mechanistic explanations for pharmacokinetic NPDIs that can be used to guide scientific research. METHODS We developed a large-scale, heterogeneous KG with biomedical ontologies, linked data, and full texts of the scientific literature. To construct the KG, biomedical ontologies and drug databases were integrated with the Phenotype Knowledge Translator framework. The semantic relation extraction systems, SemRep and Integrated Network and Dynamic Reasoning Assembler, were used to extract semantic predications (subject-relation-object triples) from full texts of the scientific literature related to the exemplar natural products green tea and kratom. A literature-based graph constructed from the predications was integrated into the ontology-grounded KG to create NP-KG. NP-KG was evaluated with case studies of pharmacokinetic green tea- and kratom-drug interactions through KG path searches and meta-path discovery to determine congruent and contradictory information in NP-KG compared to ground truth data. We also conducted an error analysis to identify knowledge gaps and incorrect predications in the KG. RESULTS The fully integrated NP-KG consisted of 745,512 nodes and 7,249,576 edges. 
Evaluation of NP-KG resulted in congruent (38.98% for green tea, 50% for kratom), contradictory (15.25% for green tea, 21.43% for kratom), and both congruent and contradictory (15.25% for green tea, 21.43% for kratom) information compared to ground truth data. Potential pharmacokinetic mechanisms for several purported NPDIs, including the green tea-raloxifene, green tea-nadolol, kratom-midazolam, kratom-quetiapine, and kratom-venlafaxine interactions were congruent with the published literature. CONCLUSION NP-KG is the first KG to integrate biomedical ontologies with full texts of the scientific literature focused on natural products. We demonstrate the application of NP-KG to identify known pharmacokinetic interactions between natural products and pharmaceutical drugs mediated by drug metabolizing enzymes and transporters. Future work will incorporate context, contradiction analysis, and embedding-based methods to enrich NP-KG. NP-KG is publicly available at https://doi.org/10.5281/zenodo.6814507. The code for relation extraction, KG construction, and hypothesis generation is available at https://github.com/sanyabt/np-kg.
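The KG path searches mentioned above can be pictured with a small breadth-first search over subject-relation-object triples. This is only a sketch of the general technique, not the NP-KG code (which is in the repository cited in the abstract); the node names, relation labels, and the `shortest_path` helper are placeholders invented for the example.

```python
from collections import deque

# Placeholder triples (subject, relation, object); not real NP-KG content.
EDGES = [
    ("natural_product_N", "inhibits", "enzyme_E"),
    ("enzyme_E", "metabolizes", "drug_D"),
    ("natural_product_N", "mentioned_with", "drug_Z"),
]


def shortest_path(edges, source, target):
    """Return the shortest directed path from source to target, or None."""
    adjacency = {}
    for subj, _relation, obj in edges:
        adjacency.setdefault(subj, []).append(obj)

    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for neighbor in adjacency.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


print(shortest_path(EDGES, "natural_product_N", "drug_D"))
```

A path that runs through an enzyme or transporter node is the kind of result that could then be compared against ground truth as a candidate mechanistic explanation.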
Affiliation(s)
- Sanya B Taneja
  - Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA 15206, USA
- Tiffany J Callahan
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA
- Mary F Paine
  - Department of Pharmaceutical Sciences, College of Pharmacy and Pharmaceutical Sciences, Washington State University, Spokane, WA 99202, USA
- Halil Kilicoglu
  - School of Information Sciences, University of Illinois at Urbana-Champaign, Champaign, IL 61820, USA
- Marcin P Joachimiak
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Richard D Boyce
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15206, USA
4
Chan LE, Thessen AE, Duncan WD, Matentzoglu N, Schmitt C, Grondin CJ, Vasilevsky N, McMurry JA, Robinson PN, Mungall CJ, Haendel MA. The Environmental Conditions, Treatments, and Exposures Ontology (ECTO): connecting toxicology and exposure to human health and beyond. J Biomed Semantics 2023; 14:3. [PMID: 36823605 PMCID: PMC9951428 DOI: 10.1186/s13326-023-00283-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 04/15/2022] [Accepted: 02/03/2023] [Indexed: 02/25/2023] Open
Abstract
BACKGROUND Evaluating the impact of environmental exposures on organism health is a key goal of modern biomedicine and is critically important in an age of greater pollution and chemicals in our environment. Environmental health utilizes many different research methods and generates a variety of data types. However, to date, no comprehensive database represents the full spectrum of environmental health data. Due to a lack of interoperability between databases, tools for integrating these resources are needed. In this manuscript we present the Environmental Conditions, Treatments, and Exposures Ontology (ECTO), a species-agnostic ontology focused on exposure events that occur as a result of natural and experimental processes, such as diet, work, or research activities. ECTO is intended for use in harmonizing environmental health data resources to support cross-study integration and inference for mechanism discovery. METHODS AND FINDINGS ECTO is an ontology designed for describing organismal exposures such as toxicological research, environmental variables, dietary features, and patient-reported data from surveys. ECTO utilizes the base model established within the Exposure Ontology (ExO). ECTO is developed using a combination of manual curation and Dead Simple OWL Design Patterns (DOSDP), and contains over 2700 environmental exposure terms, and incorporates chemical and environmental ontologies. ECTO is an Open Biological and Biomedical Ontology (OBO) Foundry ontology that is designed for interoperability, reuse, and axiomatization with other ontologies. ECTO terms have been utilized in axioms within the Mondo Disease Ontology to represent diseases caused or influenced by environmental factors, as well as for survey encoding for the Personalized Environment and Genes Study (PEGS). 
CONCLUSIONS We constructed ECTO to meet Open Biological and Biomedical Ontology (OBO) Foundry principles to increase translation opportunities between environmental health and other areas of biology. ECTO has a growing community of contributors consisting of toxicologists, public health epidemiologists, and health care providers to provide the necessary expertise for areas that have been identified previously as gaps.
Affiliation(s)
- Lauren E. Chan
  - Oregon State University, Corvallis, OR 97331, USA (grid.4391.f; ISNI 0000 0001 2112 1969)
- Anne E. Thessen
  - Oregon State University, Corvallis, OR 97331, USA (grid.4391.f; ISNI 0000 0001 2112 1969)
  - University of Colorado Anschutz Medical Campus, Aurora, CO 80054, USA (grid.430503.1; ISNI 0000 0001 0703 675X)
- William D. Duncan
  - University of Florida, Gainesville, FL 32610, USA (grid.15276.37; ISNI 0000 0004 1936 8091)
- Charles Schmitt
  - National Institute of Environmental Health Sciences, Durham, NC 27709, USA (grid.280664.e; ISNI 0000 0001 2110 5790)
- Cynthia J. Grondin
  - North Carolina State University, Raleigh, NC 27965, USA (grid.40803.3f; ISNI 0000 0001 2173 6074)
- Nicole Vasilevsky
  - University of Colorado Anschutz Medical Campus, Aurora, CO 80054, USA (grid.430503.1; ISNI 0000 0001 0703 675X)
- Julie A. McMurry
  - University of Colorado Anschutz Medical Campus, Aurora, CO 80054, USA (grid.430503.1; ISNI 0000 0001 0703 675X)
- Peter N. Robinson
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA (grid.249880.f; ISNI 0000 0004 0374 0039)
- Christopher J. Mungall
  - Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA (grid.184769.5; ISNI 0000 0001 2231 4551)
- Melissa A. Haendel
  - Oregon State University, Corvallis, OR 97331, USA (grid.4391.f; ISNI 0000 0001 2112 1969)
  - University of Colorado Anschutz Medical Campus, Aurora, CO 80054, USA (grid.430503.1; ISNI 0000 0001 0703 675X)
5
Vogt L, Mikó I, Bartolomaeus T. Anatomy and the type concept in biology show that ontologies must be adapted to the diagnostic needs of research. J Biomed Semantics 2022; 13:18. [PMID: 35761389 PMCID: PMC9235205 DOI: 10.1186/s13326-022-00268-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2021] [Accepted: 04/12/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND In times of exponential data growth in the life sciences, machine-supported approaches are becoming increasingly important and with them the need for FAIR (Findable, Accessible, Interoperable, Reusable) and eScience-compliant data and metadata standards. Ontologies, with their queryable knowledge resources, play an essential role in providing these standards. Unfortunately, biomedical ontologies only provide ontological definitions that answer "What is it?" questions, but no method-dependent empirical recognition criteria that answer "How does it look?" questions. Consequently, biomedical ontologies contain knowledge of the underlying ontological nature of structural kinds, but often lack sufficient diagnostic knowledge to unambiguously determine the reference of a term. RESULTS We argue that this is because ontology terms are usually textually defined and conceived as essentialistic classes, while recognition criteria often require perception-based definitions because perception-based contents more efficiently document and communicate spatial and temporal information (a picture is worth a thousand words). Therefore, diagnostic knowledge often must be conceived as cluster classes or fuzzy sets. Using several examples from anatomy, we point out the importance of diagnostic knowledge in anatomical research and discuss the role of cluster classes and fuzzy sets as concepts of grouping needed in anatomy ontologies in addition to essentialistic classes. In this context, we evaluate the role of the biological type concept and discuss its function as a general container concept for groupings not covered by the essentialistic class concept. CONCLUSIONS We conclude that many recognition criteria can be conceptualized as text-based cluster classes that use terms that are in turn based on perception-based fuzzy set concepts.
Finally, we point out that only if biomedical ontologies also model relevant diagnostic knowledge in addition to ontological knowledge will they fully realize their potential and contribute even more substantially to the establishment of FAIR and eScience-compliant data and metadata standards in the life sciences.
Affiliation(s)
- Lars Vogt
  - TIB Leibniz Information Centre for Science and Technology, Welfengarten 1B, 30167, Hannover, Germany
- István Mikó
  - Don Chandler Entomological Collection, University of New Hampshire, Durham, NH, USA
- Thomas Bartolomaeus
  - Institut für Evolutionsbiologie und Ökologie, Universität Bonn, An der Immenburg 1, 53121, Bonn, Germany
6
Liu T, Pan X, Wang X, Feenstra KA, Heringa J, Huang Z. Predicting the relationships between gut microbiota and mental disorders with knowledge graphs. Health Inf Sci Syst 2020; 9:3. [PMID: 33262885 PMCID: PMC7686388 DOI: 10.1007/s13755-020-00128-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/24/2020] [Accepted: 09/30/2020] [Indexed: 01/14/2023] Open
Abstract
Gut microbiota produce and modulate the production of neurotransmitters, which have been implicated in mental disorders. Neurotransmitters may act as a 'matchmaker' between gut microbiota imbalance and mental disorders. Most of the relevant research effort goes into the relationship between gut microbiota and neurotransmitters, and into that between neurotransmitters and mental disorders, while few studies collect and analyze the dispersed research results in systematic ways. We therefore gather the dispersed results of existing studies into a structured knowledge base for identifying and predicting the potential relationships between gut microbiota and mental disorders. In this study, we propose to construct a gut microbiota knowledge graph for mental disorders, named MiKG4MD. It can be extended by linking to future ontologies, simply by adding new relationships between existing information and new entities. This extensibility is emphasized for integration with existing popular ontologies/terminologies, e.g. UMLS, MeSH, and KEGG. We demonstrate the performance of MiKG4MD with three SPARQL query test cases. Results show that the MiKG4MD knowledge graph is an effective method to predict the relationships between gut microbiota and mental disorders.
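The 'matchmaker' idea amounts to a two-hop join: microbe to neurotransmitter, then neurotransmitter to disorder. The sketch below shows that join over an in-memory list of triples in plain Python rather than SPARQL. The entity names, the predicate names `modulates` and `implicated_in`, and the `candidate_links` helper are placeholders assumed for illustration; they are not the MiKG4MD schema or data.

```python
from collections import defaultdict

# Placeholder triples (subject, predicate, object); not MiKG4MD content.
TRIPLES = [
    ("microbe_A", "modulates", "neurotransmitter_X"),
    ("microbe_B", "modulates", "neurotransmitter_Y"),
    ("neurotransmitter_X", "implicated_in", "disorder_1"),
    ("neurotransmitter_X", "implicated_in", "disorder_2"),
]


def candidate_links(triples):
    """Join microbe->neurotransmitter with neurotransmitter->disorder edges."""
    disorders_by_transmitter = defaultdict(set)
    for subj, pred, obj in triples:
        if pred == "implicated_in":
            disorders_by_transmitter[subj].add(obj)

    links = set()
    for subj, pred, obj in triples:
        if pred == "modulates":
            for disorder in disorders_by_transmitter.get(obj, ()):
                links.add((subj, obj, disorder))
    return sorted(links)


for microbe, transmitter, disorder in candidate_links(TRIPLES):
    print(f"{microbe} -> {transmitter} -> {disorder}")
```

A microbe with no shared neurotransmitter (here `microbe_B`) yields no candidate link, which is what a corresponding graph-pattern query would also return.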
Affiliation(s)
- Ting Liu
  - Knowledge Representation and Reasoning (KR&R) Group, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
  - Center for Integrative Bioinformatics VU (IBIVU), Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Xueli Pan
  - Knowledge Representation and Reasoning (KR&R) Group, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Xu Wang
  - Knowledge Representation and Reasoning (KR&R) Group, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- K Anton Feenstra
  - Center for Integrative Bioinformatics VU (IBIVU), Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Jaap Heringa
  - Center for Integrative Bioinformatics VU (IBIVU), Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Zhisheng Huang
  - Knowledge Representation and Reasoning (KR&R) Group, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
  - Brain Protection Innovation Center, Capital Medical University, Beijing, China
7
Pan H, Deutsch GH, Wert SE; Ontology Subcommittee; NHLBI Molecular Atlas of Lung Development Program Consortium. Comprehensive anatomic ontologies for lung development: A comparison of alveolar formation and maturation within mouse and human lung. J Biomed Semantics 2019; 10:18. [PMID: 31651362 DOI: 10.1186/s13326-019-0209-1] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 10/19/2018] [Accepted: 09/09/2019] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND Although the mouse is widely used to model human lung development, function, and disease, our understanding of the molecular mechanisms involved in alveolarization of the peripheral lung is incomplete. Recently, the Molecular Atlas of Lung Development Program (LungMAP) was funded by the National Heart, Lung, and Blood Institute to develop an integrated open access database (known as BREATH) to characterize the molecular and cellular anatomy of the developing lung. To support this effort, we designed detailed anatomic and cellular ontologies describing alveolar formation and maturation in both mouse and human lung. DESCRIPTION While the general anatomic organization of the lung is similar for these two species, there are significant variations in the lung's architectural organization, distribution of connective tissue, and cellular composition along the respiratory tract. Anatomic ontologies for both species were constructed as partonomic hierarchies and organized along the lung's proximal-distal axis into respiratory, vascular, neural, and immunologic components. Terms for developmental and adult lung structures, tissues, and cells were included, providing comprehensive ontologies for application at varying levels of resolution. Using established scientific resources, multiple rounds of comparison were performed to identify common, analogous, and unique terms that describe the lungs of these two species. Existing biological and biomedical ontologies were examined and cross-referenced to facilitate integration at a later time, while additional terms were drawn from the scientific literature as needed. This comparative approach eliminated redundancy and inconsistent terminology, enabling us to differentiate true anatomic variations between mouse and human lungs. As a result, approximately 300 terms for fetal and postnatal lung structures, tissues, and cells were identified for each species. 
CONCLUSION These ontologies standardize and expand current terminology for fetal and adult lungs, providing a qualitative framework for data annotation, retrieval, and integration across a wide variety of datasets in the BREATH database. To our knowledge, these are the first ontologies designed to include terminology specific for developmental structures in the lung, as well as to compare common anatomic features and variations between mouse and human lungs. These ontologies provide a unique resource for the LungMAP, as well as for the broader scientific community.
8
He Z, Keloth VK, Chen Y, Geller J. Extended Analysis of Topological-Pattern-Based Ontology Enrichment. Proceedings (IEEE Int Conf Bioinformatics Biomed) 2019; 2018:1641-1648. [PMID: 30854243 DOI: 10.1109/bibm.2018.8621564] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/08/2022]
Abstract
Maintenance of biomedical ontologies is difficult. We have previously developed a topological-pattern-based method to deal with the problem of identifying concepts in a reference ontology that could be of interest for insertion into a target ontology. Assuming that both ontologies are parts of the Unified Medical Language System (UMLS), the method suggests approximate locations where the target ontology could be extended with new concepts from the reference ontology. However, the final decision about each concept has to be made by a human expert. In this paper, we describe the universe of cross-ontology topological patterns in quantitative terms. We then present a theoretical analysis of the number of potential placements of reference concepts in a path in a target ontology, allowing for new cross-ontology synonyms. This provides a rough estimate of what expert resources need to be allocated for the task. One insight in previous work on this topic was the large percentage of cases where importing concepts was impossible, due to a configuration called "alternative classification." In this paper, we confirm this observation. Our target ontology is the National Cancer Institute thesaurus (NCIt). However, the methods can be applied to other pairs of ontologies with hierarchical relationships from the UMLS.
Affiliation(s)
- Zhe He
  - School of Information, Florida State University, Tallahassee, Florida, USA
- Yan Chen
  - Department of Computer Information Systems, BMCC, CUNY, New York, NY, USA
- James Geller
  - Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, USA
9
Abstract
Diagnosing rare diseases can be challenging for clinicians. This article gives an overview of novel approaches that enable automated phenotype-driven analyses of differential diagnoses for rare diseases as well as of genomic variation data of affected individuals. The focus lies on reliable methods for collating clinical phenotypic data and new algorithms for precise and robust assessment of the similarity between phenotypic profiles. The Human Phenotype Ontology project (HPO; www.human-phenotype-ontology.org) provides an ontology for collating symptoms and clinical phenotypic abnormalities. Using ontologies makes it possible to capture these data in a precise and comprehensive fashion as well as to apply reliable and robust automated analyses. Tools such as the Phenomizer enable the algorithmic calculation of similarity values amongst patients or between patients and disease descriptions. Such digital tools represent a solid foundation for differential diagnostic applications. Many rare diseases have a strong genetic component, but the analysis of the coding DNA variants in rare disease patients is an enormously complex procedure, which often impedes successful molecular diagnostics. In this situation a combined analysis of the patient's HPO-coded phenotypic features and the genomic characteristics of the variants can be of substantial help. Here the HPO project and its associated algorithms are helpful, making them an important component of phenotype-driven translational research and of the prioritization of disease-relevant genomic variations.
Affiliation(s)
- S Köhler
  - Berlin Institute of Health (BIH), Anna-Louisa-Karsch-Str. 2, 10178, Berlin, Germany
  - Einstein Center Digital Future, Wilhelmstr. 67, 10117, Berlin, Germany
  - NeuroCure Clinical Research Center, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353, Berlin, Germany
10
Zhang GQ, Xing G, Cui L. An efficient, large-scale, non-lattice-detection algorithm for exhaustive structural auditing of biomedical ontologies. J Biomed Inform 2018; 80:106-119. [PMID: 29548711 DOI: 10.1016/j.jbi.2018.03.004] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 10/15/2017] [Revised: 03/10/2018] [Accepted: 03/12/2018] [Indexed: 11/17/2022]
Abstract
One of the basic challenges in developing structural methods for systematically auditing the quality of biomedical ontologies is the computational cost usually involved in exhaustive sub-graph analysis. We introduce ANT-LCA, a new algorithm for computing all non-trivial lowest common ancestors (LCA) of each pair of concepts in the hierarchical order induced by an ontology. The computation of LCA is a fundamental step in the non-lattice approach to ontology quality assurance. Distinct from existing approaches, ANT-LCA only computes LCAs for non-trivial pairs, those having at least one common ancestor. To skip all trivial pairs that may be of no practical interest, ANT-LCA employs a simple but innovative algorithmic strategy combining topological order and dynamic programming to keep track of non-trivial pairs. We provide correctness proofs and demonstrate a substantial reduction in computational time for two of the largest biomedical ontologies: SNOMED CT and Gene Ontology (GO). ANT-LCA achieved an average computation time of 30 and 3 sec per version for SNOMED CT and GO, respectively, about 2 orders of magnitude faster than the best known approaches. Our algorithm overcomes a fundamental computational barrier in sub-graph based structural analysis of large ontological systems. It enables the implementation of a new breed of structural auditing methods that not only identify potentially problematic areas, but also automatically suggest changes to fix the issues. Such structural auditing methods can lead to more effective tools supporting ontology quality assurance work.
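To make the underlying notion concrete, the sketch below computes the lowest common ancestors of a single pair of concepts in a small is-a hierarchy by brute force. It is emphatically not ANT-LCA: it uses neither topological ordering nor dynamic programming and would be far too slow for SNOMED CT. The toy hierarchy and the helper names are assumptions made for illustration; a concept is treated as its own ancestor for the purpose of the intersection.

```python
def ancestors(parents, node):
    """Return the set of proper ancestors of node in a DAG given as child -> parents."""
    seen = set()
    stack = list(parents.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def lowest_common_ancestors(parents, a, b):
    """Return common ancestors of a and b that have no common ancestor below them."""
    common = (ancestors(parents, a) | {a}) & (ancestors(parents, b) | {b})
    return {
        c for c in common
        if not any(c in ancestors(parents, d) for d in common if d != c)
    }


# Toy hierarchy: a and b each have two parents, x and y, which share a root.
PARENTS = {
    "a": ["x", "y"],
    "b": ["x", "y"],
    "x": ["root"],
    "y": ["root"],
}

print(sorted(lowest_common_ancestors(PARENTS, "a", "b")))
```

A pair with more than one lowest common ancestor, such as `a` and `b` here, is the kind of non-lattice pair that structural auditing flags for review.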
Affiliation(s)
- Guo-Qiang Zhang
  - Department of Computer Science, University of Kentucky, Lexington, KY, USA
  - Institute for Biomedical Informatics, University of Kentucky, Lexington, KY, USA
  - Department of Internal Medicine, University of Kentucky, Lexington, KY, USA
- Guangming Xing
  - Department of Computer Science, Western Kentucky University, Bowling Green, KY, USA
- Licong Cui
  - Department of Computer Science, University of Kentucky, Lexington, KY, USA
  - Institute for Biomedical Informatics, University of Kentucky, Lexington, KY, USA
11
Agibetov A, Jiménez-Ruiz E, Ondrésik M, Solimando A, Banerjee I, Guerrini G, Catalano CE, Oliveira JM, Patanè G, Reis RL, Spagnuolo M. Supporting shared hypothesis testing in the biomedical domain. J Biomed Semantics 2018; 9:9. [PMID: 29422110 PMCID: PMC5804102 DOI: 10.1186/s13326-018-0177-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 06/01/2016] [Accepted: 01/18/2018] [Indexed: 02/01/2023] Open
Abstract
Background Pathogenesis of inflammatory diseases can be tracked by studying the causality relationships among the factors contributing to their development. We could, for instance, hypothesize on the connections of the pathogenesis outcomes to the observed conditions. To prove such causal hypotheses, we would need a full understanding of the causal relationships and would have to provide all the necessary evidence to support our claims. In practice, however, we might not possess all the background knowledge on the causality relationships, and we might be unable to collect all the evidence to prove our hypotheses. Results In this work we propose a methodology for the translation of biological knowledge on causality relationships of biological processes and their effects on conditions to a computational framework for hypothesis testing. The methodology consists of two main points: hypothesis graph construction from the formalization of the background knowledge on causality relationships, and confidence measurement in a causality hypothesis as a normalized weighted path computation in the hypothesis graph. In this framework, we can simulate the collection of evidence and assess confidence in a causality hypothesis by measuring it proportionally to the amount of available knowledge and collected evidence. Conclusions We evaluate our methodology on a hypothesis graph that represents both the contributing factors that may cause cartilage degradation and the factors that might be caused by cartilage degradation during osteoarthritis. Hypothesis graph construction has proven to be robust to the addition of potentially contradictory information on simultaneously positive and negative effects. The obtained confidence measures for the specific causality hypotheses have been validated by our domain experts and correspond closely to their subjective assessments of confidence in the investigated hypotheses. Overall, our methodology for a shared hypothesis testing framework exhibits important properties that researchers will find useful in literature review for their experimental studies, in planning and prioritizing evidence acquisition procedures, and in testing their hypotheses with different depths of knowledge on causal dependencies of biological processes and their effects on the observed conditions.
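The abstract describes confidence in a causality hypothesis as a normalized weighted path computation over a hypothesis graph, but it does not give the formula. The sketch below is one possible reading, not the authors' method: it assumes each edge carries an evidence weight in [0, 1], scores a path by its mean edge weight, and takes the best-scoring simple path as the confidence. The node names, the weights, and the functions `path_scores` and `confidence` are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical hypothesis graph: node -> list of (successor, evidence weight in [0, 1]).
Graph = Dict[str, List[Tuple[str, float]]]


def path_scores(graph: Graph, source: str, target: str) -> List[Tuple[List[str], float]]:
    """Return every simple path from source to target with its mean edge weight."""
    found: List[Tuple[List[str], float]] = []

    def walk(node: str, path: List[str], weights: List[float]) -> None:
        if node == target:
            if weights:  # skip the trivial source == target case
                found.append((list(path), sum(weights) / len(weights)))
            return
        for successor, weight in graph.get(node, []):
            if successor not in path:  # keep paths simple (no cycles)
                path.append(successor)
                weights.append(weight)
                walk(successor, path, weights)
                path.pop()
                weights.pop()

    walk(source, [source], [])
    return found


def confidence(graph: Graph, source: str, target: str) -> float:
    """Assumed aggregation: the best normalized path score, or 0.0 if no path exists."""
    scores = [score for _, score in path_scores(graph, source, target)]
    return max(scores, default=0.0)


# Illustrative data only: a weight of 0.0 stands for a link with no collected evidence.
graph: Graph = {
    "inflammation": [("enzyme_release", 1.0), ("cartilage_degradation", 0.0)],
    "enzyme_release": [("cartilage_degradation", 0.5)],
}
score = confidence(graph, "inflammation", "cartilage_degradation")
print(round(score, 2))
```

Dividing by path length keeps the score in [0, 1] regardless of how many intermediate factors a path crosses, which is the sense in which the score is normalized here. Raising an edge weight models collecting further evidence for that link.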
Affiliation(s)
- Asan Agibetov
  - Italian National Research Council, Via De Marini 6, Genoa, 16149, Italy
  - Center for Medical Statistics, Informatics, and Intelligent Systems, Institute for Artificial Intelligence and Decision Support, Medical University of Vienna, Spitalgasse 23, Vienna, 1090, Austria
- Marta Ondrésik
  - 3B's Research Group, Biomaterials, Biodegradables and Biomimetics, Headquarters of the European Institute of Excellence on Tissue Engineering and Regenerative Medicine, University of Minho, Caldas das Taipas, Portugal
  - ICVS/3B's - PT Government Associate Laboratory, Braga/Guimarães, Portugal
- Imon Banerjee
  - Italian National Research Council, Via De Marini 6, Genoa, 16149, Italy
  - Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, 94305, California, USA
- Chiara E Catalano
  - Italian National Research Council, Via De Marini 6, Genoa, 16149, Italy
- Joaquim M Oliveira
  - 3B's Research Group, Biomaterials, Biodegradables and Biomimetics, Headquarters of the European Institute of Excellence on Tissue Engineering and Regenerative Medicine, University of Minho, Caldas das Taipas, Portugal
  - ICVS/3B's - PT Government Associate Laboratory, Braga/Guimarães, Portugal
- Giuseppe Patanè
  - Italian National Research Council, Via De Marini 6, Genoa, 16149, Italy
- Rui L Reis
  - 3B's Research Group, Biomaterials, Biodegradables and Biomimetics, Headquarters of the European Institute of Excellence on Tissue Engineering and Regenerative Medicine, University of Minho, Caldas das Taipas, Portugal
  - ICVS/3B's - PT Government Associate Laboratory, Braga/Guimarães, Portugal
- Michela Spagnuolo
  - Italian National Research Council, Via De Marini 6, Genoa, 16149, Italy
12
Abstract
Background Since the establishment of the first biomedical ontology, Gene Ontology (GO), the number of biomedical ontologies has increased dramatically. Over 300 ontologies have now been built, including the extensively used Disease Ontology (DO) and Human Phenotype Ontology (HPO). Because it can reveal novel relationships between terms, calculating similarity between ontology terms is one of the major tasks in this research area. Though similarities between terms within each ontology have been studied with in silico methods, term similarities across different ontologies have not been investigated as deeply. The latest method took advantage of a gene functional interaction network (GFIN) to explore such inter-ontology similarities of terms. However, it only used gene interactions and failed to make full use of the connectivity among gene nodes of the network. In addition, all existing methods are designed specifically for GO, and their performance on the extended ontology community remains unknown. Results We proposed a method, InfAcrOnt, to infer similarities between terms across ontologies utilizing the entire GFIN. InfAcrOnt builds a term-gene-gene network comprising ontology annotations and the GFIN, and acquires similarities between terms across ontologies by modeling the information flow within the network by random walk. In our benchmark experiments on sub-ontologies of GO, InfAcrOnt achieves a high average area under the receiver operating characteristic curve (AUC) (0.9322 and 0.9309) and low standard deviations (1.8746e-6 and 3.0977e-6) in both human and yeast benchmark datasets, exhibiting superior performance. Meanwhile, comparisons of InfAcrOnt results and prior knowledge on pair-wise DO-HPO terms and pair-wise DO-GO terms show high correlations. Conclusions The experimental results show that InfAcrOnt significantly improves the performance of inferring similarities between terms across ontologies on the benchmark sets.
Electronic supplementary material The online version of this article (10.1186/s12864-017-4338-6) contains supplementary material, which is available to authorized users.
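The abstract states that InfAcrOnt models information flow in a term-gene-gene network by random walk, without giving the update rule. As a rough illustration only, the sketch below uses random walk with restart, a common variant that is assumed here rather than confirmed as the authors' formulation. The term and gene identifiers, the edges, the restart probability, and the function names are all hypothetical.

```python
from typing import Dict, List, Tuple


def build_adjacency(edges: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Build an undirected adjacency list from (node, node) pairs."""
    adjacency: Dict[str, List[str]] = {}
    for a, b in edges:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)
    return adjacency


def random_walk_with_restart(
    adjacency: Dict[str, List[str]],
    seed: str,
    restart: float = 0.3,
    iterations: int = 100,
) -> Dict[str, float]:
    """Iterate the visiting probabilities of a walker that restarts at `seed`."""
    scores = {node: 1.0 if node == seed else 0.0 for node in adjacency}
    for _ in range(iterations):
        # The restart mass always returns to the seed node.
        updated = {node: restart if node == seed else 0.0 for node in adjacency}
        for node, neighbours in adjacency.items():
            # The remaining mass is split evenly among the node's neighbours.
            share = (1.0 - restart) * scores[node] / len(neighbours)
            for neighbour in neighbours:
                updated[neighbour] += share
        scores = updated
    return scores


# Illustrative network: term-gene annotation edges plus gene-gene interaction edges.
edges = [
    ("DO:term", "g1"),   # annotation
    ("HPO:near", "g2"),  # annotation
    ("HPO:far", "g4"),   # annotation
    ("g1", "g2"),        # gene-gene interaction
    ("g2", "g3"),
    ("g3", "g4"),
]
adjacency = build_adjacency(edges)
scores = random_walk_with_restart(adjacency, "DO:term")
print(scores["HPO:near"] > scores["HPO:far"])
```

In this toy network the phenotype term annotated to a gene close to the seed term's gene receives more probability mass than the term reached only through a longer chain of gene interactions. That is the intuition behind using whole-network connectivity instead of direct gene interactions alone.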
Affiliation(s)
- Liang Cheng
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, People's Republic of China
- Yue Jiang
  - Hospital for Sick Children, Toronto, M5G 1X8, Canada
- Hong Ju
  - Department of Information Engineering, Heilongjiang Biological Science and Technology Career Academy, Harbin, 150081, People's Republic of China
- Jie Sun
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, People's Republic of China
- Jiajie Peng
  - School of Computer Science, Northwestern Polytechnical University, Xian, 710072, People's Republic of China
- Meng Zhou
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, People's Republic of China
- Yang Hu
  - School of Life Science and Technology, Harbin Institute of Technology, Harbin, 150088, People's Republic of China
13
Harrow I, Jiménez-Ruiz E, Splendiani A, Romacker M, Woollard P, Markel S, Alam-Faruque Y, Koch M, Malone J, Waaler A. Matching disease and phenotype ontologies in the ontology alignment evaluation initiative. J Biomed Semantics 2017; 8:55. [PMID: 29197409 PMCID: PMC5712086 DOI: 10.1186/s13326-017-0162-9] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 04/13/2017] [Accepted: 10/27/2017] [Indexed: 11/30/2022] Open
Abstract
Background The disease and phenotype track was designed to evaluate the relative performance of ontology matching systems that generate mappings between source ontologies. Disease and phenotype ontologies are important for applications such as data mining, data integration and knowledge management to support translational science in drug discovery and understanding the genetics of disease. Results Eleven systems (out of 21 OAEI participating systems) were able to cope with at least one of the tasks in the Disease and Phenotype track. AML, FCA-Map, LogMap(Bio) and PhenoMF systems produced the top results for ontology matching in comparison to consensus alignments. The results against manually curated mappings proved to be more difficult most likely because these mapping sets comprised mostly subsumption relationships rather than equivalence. Manual assessment of unique equivalence mappings showed that AML, LogMap(Bio) and PhenoMF systems have the highest precision results. Conclusions Four systems gave the highest performance for matching disease and phenotype ontologies. These systems coped well with the detection of equivalence matches, but struggled to detect semantic similarity. This deserves more attention in the future development of ontology matching systems. The findings of this evaluation show that such systems could help to automate equivalence matching in the workflow of curators, who maintain ontology mapping services in numerous domains such as disease and phenotype.
Affiliation(s)
- Ian Harrow
  - Pistoia Alliance Ontologies Mapping Project, Pistoia Alliance Inc, USA
- Martin Romacker
  - Roche Pharma Research and Early Development, pRED Informatics, Roche Innovation Center, Basel, Switzerland
- Arild Waaler
  - Department of Informatics, University of Oslo, Oslo, Norway
14
Hogan WR, Hanna J, Hicks A, Amirova S, Bramblett B, Diller M, Enderez R, Modzelewski T, Vasconcelos M, Delcher C. Therapeutic indications and other use-case-driven updates in the drug ontology: anti-malarials, anti-hypertensives, opioid analgesics, and a large term request. J Biomed Semantics 2017; 8:10. [PMID: 28253937 PMCID: PMC5335794 DOI: 10.1186/s13326-017-0121-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 11/11/2016] [Accepted: 02/24/2017] [Indexed: 11/28/2022] Open
Abstract
BACKGROUND The Drug Ontology (DrOn) is an OWL2-based representation of drug products and their ingredients, mechanisms of action, strengths, and dose forms. We originally created DrOn for use cases in comparative effectiveness research, primarily to identify historically complete sets of United States National Drug Codes (NDCs) that represent packaged drug products, by the ingredient(s), mechanism(s) of action, and so on contained in those products. Although we had designed DrOn from the outset to carefully distinguish those entities that have a therapeutic indication from those entities that have a molecular mechanism of action, we had not previously represented in DrOn any particular therapeutic indication. RESULTS In this work, we add therapeutic indications for three research use cases: resistant hypertension, malaria, and opioid abuse research. We also added mechanisms of action for opioid analgesics and added 108 classes representing drug products in response to a large term request from the Program for Resistance, Immunology, Surveillance and Modeling of Malaria in Uganda (PRISM) project. The net result is a new version of DrOn, current to May 2016, that represents three major therapeutic classes of drugs and six new mechanisms of action. CONCLUSIONS A therapeutic indication of a drug product is represented as a therapeutic function in DrOn. Adverse effects of drug products, as well as other therapeutic uses for which the drug product was not designed are dispositions. Our work provides a framework for representing additional therapeutic indications, adverse effects, and uses of drug products beyond their design. Our work also validated our past modeling decisions for specific types of mechanisms of action, namely effects mediated via receptor and/or enzyme binding. DrOn is available at: http://purl.obolibrary.org/obo/dron.owl . A smaller version without NDCs is available at: http://purl.obolibrary.org/obo/dron/dron-lite.owl.
Affiliation(s)
- William R. Hogan
  - Department of Health Outcomes and Policy, University of Florida, Clinical and Translational Research Building, 2004 Mowry Road, P.O. Box 100219, Gainesville, FL 32610 USA
- Josh Hanna
  - Department of Health Outcomes and Policy, University of Florida, Clinical and Translational Research Building, 2004 Mowry Road, P.O. Box 100219, Gainesville, FL 32610 USA
- Amanda Hicks
  - Department of Health Outcomes and Policy, University of Florida, Clinical and Translational Research Building, 2004 Mowry Road, P.O. Box 100219, Gainesville, FL 32610 USA
- Samira Amirova
  - Department of Health Outcomes and Policy, University of Florida, Clinical and Translational Research Building, 2004 Mowry Road, P.O. Box 100219, Gainesville, FL 32610 USA
- Baxter Bramblett
  - Department of Health Outcomes and Policy, University of Florida, Clinical and Translational Research Building, 2004 Mowry Road, P.O. Box 100219, Gainesville, FL 32610 USA
- Matthew Diller
  - Department of Health Outcomes and Policy, University of Florida, Clinical and Translational Research Building, 2004 Mowry Road, P.O. Box 100219, Gainesville, FL 32610 USA
- Rodel Enderez
  - Department of Health Outcomes and Policy, University of Florida, Clinical and Translational Research Building, 2004 Mowry Road, P.O. Box 100219, Gainesville, FL 32610 USA
- Timothy Modzelewski
  - Department of Health Outcomes and Policy, University of Florida, Clinical and Translational Research Building, 2004 Mowry Road, P.O. Box 100219, Gainesville, FL 32610 USA
- Mirela Vasconcelos
  - Department of Health Outcomes and Policy, University of Florida, Clinical and Translational Research Building, 2004 Mowry Road, P.O. Box 100219, Gainesville, FL 32610 USA
- Chris Delcher
  - Department of Health Outcomes and Policy, University of Florida, Clinical and Translational Research Building, 2004 Mowry Road, P.O. Box 100219, Gainesville, FL 32610 USA
15
Abstract
Background In the era of the semantic web, life science ontologies play an important role in tasks such as annotating biological objects, linking relevant data pieces, and verifying data consistency. Understanding ontology structures and overlapping ontologies is essential for tasks such as ontology reuse and development. We present an exploratory study where we examine structure and look for patterns in BioPortal, a comprehensive publicly available repository of life science ontologies. Methods We report an analysis of biomedical ontology mapping data over time. We apply graph theory methods such as Modularity Analysis and Betweenness Centrality to analyse data gathered at five different time points. We identify communities, i.e., sets of overlapping ontologies, and define similar and closest communities. We demonstrate the evolution of identified communities over time and identify core ontologies of the closest communities. We use BioPortal project and category data to measure community coherence. We also validate identified communities with their mutual mentions in scientific literature. Results By comparing mapping data gathered at five different time points, we identified similar and closest communities of overlapping ontologies, and demonstrated the evolution of communities over time. Results showed that anatomy and health ontologies tend to form more isolated communities compared to other categories. We also showed that communities contain all or the majority of ontologies being used in narrower projects. In addition, we identified major changes in mapping data after migration to BioPortal Version 4.
Affiliation(s)
- Simon Kocbek
  - Database Center for Life Science, Research Organization of Information and Systems, Tokyo, Japan
  - Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Sydney, NSW, Australia
  - Department of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
- Jin-Dong Kim
  - Database Center for Life Science, Research Organization of Information and Systems, Tokyo, Japan
16
Abstract
Background Disease and diagnosis have been the subject of much ontological inquiry. However, the insights gained therein have not yet been well enough applied to the study, management, and improvement of data quality in electronic health records (EHR) and administrative systems. Data in these systems suffer from workarounds clinicians are forced to apply due to limitations in the current state of the art in system design, which ignores the various types of entities that diagnoses, as information content entities, can be and are about. This leads to difficulties in distinguishing, among diagnostic assertions, misdiagnosis from correct diagnosis, and the former from coincidentally correct statements about disease. Methods We applied recent advances in the ontological understanding of the aboutness relation to the problem of diagnosis and disease as defined by the Ontology for General Medical Science. We created six scenarios that we analyzed using the method of Referent Tracking to identify all the entities and their relationships which must be present for each scenario to hold true. We discovered deficiencies in existing ontological definitions and proposed revisions of them to account for the improved understanding that resulted from our analysis. Results Our key result is that a diagnosis is an information content entity (ICE) whose concretization(s) are typically about a configuration in which there exists a disease that inheres in an organism and instantiates a certain type (e.g., hypertension). Misdiagnoses are ICEs whose concretizations succeed in aboutness on the level of reference for individual entities and types (the organism and the disease), but fail in aboutness on the level of compound expression (i.e., there is no configuration that corresponds in total with what is asserted). Provenance of diagnoses as concretizations is critical to distinguishing them from lucky guesses, hearsay, and justified layperson belief.
Conclusions Recent improvements in our understanding of aboutness significantly improved our understanding of the ontology of diagnosis and related information content entities, which in turn opens new perspectives for the implementation of data capture methods in EHR and other systems to allow diagnostic assertions to be captured with less ambiguity. Electronic supplementary material The online version of this article (doi:10.1186/s13326-016-0098-5) contains supplementary material, which is available to authorized users.
Affiliation(s)
- William R Hogan
  - University of Florida, 2004 Mowry Rd, P.O. Box 100219, Gainesville, FL, 32610-0219, USA
- Werner Ceusters
  - Department of Biomedical Informatics, University at Buffalo, 77 Goodell street, 5th floor, Buffalo, NY, 14203, USA
17
Groß A, Pruski C, Rahm E. Evolution of biomedical ontologies and mappings: Overview of recent approaches. Comput Struct Biotechnol J 2016; 14:333-40. [PMID: 27642503 PMCID: PMC5018063 DOI: 10.1016/j.csbj.2016.08.002] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 05/02/2016] [Revised: 08/19/2016] [Accepted: 08/23/2016] [Indexed: 11/16/2022] Open
Abstract
Biomedical ontologies are heavily used to annotate data, and different ontologies are often interlinked by ontology mappings. These ontology-based mappings and annotations are used in many applications and analysis tasks. Since biomedical ontologies are continuously updated, dependent artifacts can become outdated and need to undergo evolution as well. Hence, there is a need for largely automated approaches to keep ontology-based mappings up-to-date in the presence of evolving ontologies. In this article, we survey current approaches and novel directions in the context of ontology and mapping evolution. We will discuss requirements for mapping adaptation and provide a comprehensive overview of existing approaches. We will further identify open challenges and outline ideas for future developments.
Affiliation(s)
- Anika Groß
  - Institute of Computer Science, Universität Leipzig, P.O. Box 100920, 04009 Leipzig, Germany
- Cédric Pruski
  - Luxembourg Institute of Science and Technology, 5 Avenue des Hauts-Fourneaux, L-4362 Esch-sur-Alzette, Luxembourg
- Erhard Rahm
  - Institute of Computer Science, Universität Leipzig, P.O. Box 100920, 04009 Leipzig, Germany
18
Hogan WR, Wagner MM, Brochhausen M, Levander J, Brown ST, Millett N, DePasse J, Hanna J. The Apollo Structured Vocabulary: an OWL2 ontology of phenomena in infectious disease epidemiology and population biology for use in epidemic simulation. J Biomed Semantics 2016; 7:50. [PMID: 27538448 PMCID: PMC4989460 DOI: 10.1186/s13326-016-0092-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 06/26/2015] [Accepted: 08/10/2016] [Indexed: 01/03/2023] Open
Abstract
Background We developed the Apollo Structured Vocabulary (Apollo-SV)—an OWL2 ontology of phenomena in infectious disease epidemiology and population biology—as part of a project whose goal is to increase the use of epidemic simulators in public health practice. Apollo-SV defines a terminology for use in simulator configuration. Apollo-SV is the product of an ontological analysis of the domain of infectious disease epidemiology, with particular attention to the inputs and outputs of nine simulators. Results Apollo-SV contains 802 classes for representing the inputs and outputs of simulators, of which approximately half are new and half are imported from existing ontologies. The most important Apollo-SV class for users of simulators is infectious disease scenario, which is a representation of an ecosystem at simulator time zero that has at least one infection process (a class) affecting at least one population (also a class). Other important classes represent ecosystem elements (e.g., households), ecosystem processes (e.g., infection acquisition and infectious disease), censuses of ecosystem elements (e.g., censuses of populations), and infectious disease control measures. In the larger project, which created an end-user application that can send the same infectious disease scenario to multiple simulators, Apollo-SV serves as the controlled terminology and strongly influences the design of the message syntax used to represent an infectious disease scenario. As we added simulators for different pathogens (e.g., malaria and dengue), the core classes of Apollo-SV have remained stable, suggesting that our conceptualization of the information required by simulators is sound. Despite adhering to the OBO Foundry principle of orthogonality, we could not reuse Infectious Disease Ontology classes as the basis for infectious disease scenarios. We thus defined new classes in Apollo-SV for host, pathogen, infection, infectious disease, colonization, and infection acquisition. 
Unlike IDO, our ontological analysis extended to existing mathematical models of key biological phenomena studied by infectious disease epidemiology and population biology. Conclusion Our ontological analysis as expressed in Apollo-SV was instrumental in developing a simulator-independent representation of infectious disease scenarios that can be run on multiple epidemic simulators. Our experience suggests the importance of extending ontological analysis of a domain to include existing mathematical models of the phenomena studied by the domain. Apollo-SV is freely available at: http://purl.obolibrary.org/obo/apollo_sv.owl.
Affiliation(s)
- William R Hogan
  - University of Florida, P.O. Box 100219, 2004 Mowry Rd, Gainesville, FL, 32610-0219, USA
- Michael M Wagner
  - University of Pittsburgh, 5607 Baum Boulevard, Room 434, Pittsburgh, PA, 15206, USA
- Mathias Brochhausen
  - University of Arkansas for Medical Sciences, 4301 W. Markham St. Slot #782, Little Rock, AR, 72205, USA
- John Levander
  - University of Pittsburgh, 5607 Baum Boulevard, Room 434G, Pittsburgh, PA, 15206, USA
- Shawn T Brown
  - Pittsburgh Supercomputing Center, 300 S. Craig St., Pittsburgh, PA, 15213, USA
- Nicholas Millett
  - University of Pittsburgh, 5607 Baum Boulevard, Room 435 J, Pittsburgh, PA, 15206, USA
- Jay DePasse
  - Pittsburgh Supercomputing Center, 300 S. Craig St., Pittsburgh, PA, 15213, USA
- Josh Hanna
  - University of Florida, P.O. Box 100212, Gainesville, FL, 32610-0212, USA
19
Abstract
BACKGROUND Biomedical information and knowledge, structural and non-structural, stored in different repositories can be semantically connected to form a hybrid knowledge network. How to compute relatedness between concepts and discover valuable but implicit information or knowledge from it effectively and efficiently is of paramount importance for precision medicine, and a major challenge facing the biomedical research community. RESULTS In this study, a hybrid biomedical knowledge network is constructed by linking concepts across multiple biomedical ontologies as well as non-structural biomedical knowledge sources. To discover implicit relatedness between concepts in ontologies for which potentially valuable relationships (implicit knowledge) may exist, we developed a Multi-Ontology Relatedness Model (MORM) within the knowledge network, for which a relatedness network (RN) is defined and computed across multiple ontologies using a formal inference mechanism of set-theoretic operations. Semantic constraints are designed and implemented to prune the search space of the relatedness network. CONCLUSIONS Experiments to test examples of several biomedical applications have been carried out, and the evaluation of the results showed an encouraging potential of the proposed approach to biomedical knowledge discovery.
Affiliation(s)
- Tian Bai
  - College of Computer Science and Technology, Jilin University, 2699 Qianjin St, Changchun, China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, 2699 Qianjin St, Changchun, China
- Leiguang Gong
  - College of Computer Science and Technology, Jilin University, 2699 Qianjin St, Changchun, China
  - Yantai Intelligent Information Technologies Ltd., 2699 Qianjin St, Yantai, China
- Ye Wang
  - College of Computer Science and Technology, Jilin University, 2699 Qianjin St, Changchun, China
- Yan Wang
  - College of Computer Science and Technology, Jilin University, 2699 Qianjin St, Changchun, China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, 2699 Qianjin St, Changchun, China
- Casimir A. Kulikowski
  - Department of Computer Science, Rutgers, The State University of New Jersey, 2699 Qianjin St, Piscataway, NJ USA
- Lan Huang
  - College of Computer Science and Technology, Jilin University, 2699 Qianjin St, Changchun, China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, 2699 Qianjin St, Changchun, China
20
Huang J, Gutierrez F, Strachan HJ, Dou D, Huang W, Smith B, Blake JA, Eilbeck K, Natale DA, Lin Y, Wu B, Silva ND, Wang X, Liu Z, Borchert GM, Tan M, Ruttenberg A. OmniSearch: a semantic search system based on the Ontology for MIcroRNA Target (OMIT) for microRNA-target gene interaction data. J Biomed Semantics 2016; 7:25. [PMID: 27175225 PMCID: PMC4863347 DOI: 10.1186/s13326-016-0064-2] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 12/15/2015] [Accepted: 04/12/2016] [Indexed: 01/05/2023] Open
Abstract
As a special class of non-coding RNAs (ncRNAs), microRNAs (miRNAs) perform important roles in numerous biological and pathological processes. The realization of miRNA functions depends largely on how miRNAs regulate specific target genes. It is therefore critical to identify, analyze, and cross-reference miRNA-target interactions to better explore and delineate miRNA functions. Semantic technologies can help in this regard. We previously developed a miRNA domain-specific application ontology, Ontology for MIcroRNA Target (OMIT), whose goal was to serve as a foundation for semantic annotation, data integration, and semantic search in the miRNA field. In this paper we describe our continuing effort to develop the OMIT, and demonstrate its use within a semantic search system, OmniSearch, designed to facilitate knowledge capture of miRNA-target interaction data. Important changes in the current version OMIT are summarized as: (1) following a modularized ontology design (with 2559 terms imported from the NCRO ontology); (2) encoding all 1884 human miRNAs (vs. 300 in previous versions); and (3) setting up a GitHub project site along with an issue tracker for more effective community collaboration on the ontology development. The OMIT ontology is free and open to all users, accessible at: http://purl.obolibrary.org/obo/omit.owl. The OmniSearch system is also free and open to all users, accessible at: http://omnisearch.soc.southalabama.edu/index.php/Software.
Affiliation(s)
- Jingshan Huang
  - School of Computing, University of South Alabama, Mobile, Alabama, 36688-0002 USA
- Fernando Gutierrez
  - Computer and Information Science Department, University of Oregon, Eugene, Oregon, 97403-1202 USA
- Harrison J Strachan
  - School of Computing, University of South Alabama, Mobile, Alabama, 36688-0002 USA
- Dejing Dou
  - Computer and Information Science Department, University of Oregon, Eugene, Oregon, 97403-1202 USA
- Weili Huang
  - Miracle Query, Inc., Eugene, Oregon, 97403-1202 USA
- Barry Smith
  - Department of Philosophy, University at Buffalo, Buffalo, New York, 14260-4150 USA
- Judith A Blake
  - Genome Informatics, The Jackson Laboratory, Bar Harbor, Maine, 04609-1523 USA
- Karen Eilbeck
  - Department of Biomedical Informatics, University of Utah, Salt Lake City, Utah, 84112-5775 USA
- Darren A Natale
  - Department of Biochemistry and Molecular and Cellular Biology, Georgetown University Medical Center, Washington D.C., 20007-1485 USA
- Yu Lin
  - Center for Computational Science, University of Miami, Miami, Florida, 33146-2960 USA
- Bin Wu
  - Department of Microbiology and Immunology, First Affiliated Hospital, Kunming Medical University, Kunming, Yunnan, 650032 China
- Nisansa de Silva
  - Computer and Information Science Department, University of Oregon, Eugene, Oregon, 97403-1202 USA
- Xiaowei Wang
  - Department of Radiation Oncology, Washington University School of Medicine, St. Louis, Missouri, 63110-0001 USA
- Zixing Liu
  - Mitchell Cancer Institute, University of South Alabama, Mobile, Alabama, 36604-1405 USA
- Glen M Borchert
  - Department of Biology, University of South Alabama, Mobile, Alabama, 36688-0002 USA
- Ming Tan
  - Mitchell Cancer Institute, University of South Alabama, Mobile, Alabama, 36604-1405 USA
- Alan Ruttenberg
  - School of Dental Medicine, University at Buffalo, Buffalo, New York, 14214-8006 USA
21
Huang J, Eilbeck K, Smith B, Blake JA, Dou D, Huang W, Natale DA, Ruttenberg A, Huan J, Zimmermann MT, Jiang G, Lin Y, Wu B, Strachan HJ, He Y, Zhang S, Wang X, Liu Z, Borchert GM, Tan M. The Non-Coding RNA Ontology (NCRO): a comprehensive resource for the unification of non-coding RNA biology. J Biomed Semantics 2016; 7:24. [PMID: 27152146 DOI: 10.1186/s13326-016-0066-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 12/15/2015] [Accepted: 04/19/2016] [Indexed: 11/17/2022] Open
Abstract
In recent years, sequencing technologies have enabled the identification of a wide range of non-coding RNAs (ncRNAs). Unfortunately, annotation and integration of ncRNA data has lagged behind their identification. Given the large quantity of information being obtained in this area, there emerges an urgent need to integrate what is being discovered by a broad range of relevant communities. To this end, the Non-Coding RNA Ontology (NCRO) is being developed to provide a systematically structured and precisely defined controlled vocabulary for the domain of ncRNAs, thereby facilitating the discovery, curation, analysis, exchange, and reasoning of data about structures of ncRNAs, their molecular and cellular functions, and their impacts upon phenotypes. The goal of NCRO is to serve as a common resource for annotations of diverse research in a way that will significantly enhance integrative and comparative analysis of the myriad resources currently housed in disparate sources. It is our belief that the NCRO ontology can perform an important role in the comprehensive unification of ncRNA biology and, indeed, fill a critical gap in both the Open Biological and Biomedical Ontologies (OBO) Library and the National Center for Biomedical Ontology (NCBO) BioPortal. Our initial focus is on the ontological representation of small regulatory ncRNAs, which we see as the first step in providing a resource for the annotation of data about all forms of ncRNAs. The NCRO ontology is free and open to all users, accessible at: http://purl.obolibrary.org/obo/ncro.owl.
|
22
|
Banerjee I, Catalano CE, Patané G, Spagnuolo M. Semantic annotation of 3D anatomical models to support diagnosis and follow-up analysis of musculoskeletal pathologies. Int J Comput Assist Radiol Surg 2015; 11:707-20. [PMID: 26615427 DOI: 10.1007/s11548-015-1327-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 01/13/2015] [Accepted: 11/09/2015] [Indexed: 10/22/2022]
Abstract
PURPOSE: While 3D patient-specific digital models are currently available, thanks to advanced medical acquisition devices, there is still a long way to go before these models can be used in clinical practice. The goal of this paper is to demonstrate how 3D patient-specific models of anatomical parts can be analysed and documented accurately with morphological information extracted automatically from the data. Part-based semantic annotation of 3D anatomical models is discussed as a basic approach for sharing and reusing knowledge among clinicians for next-generation CAD-assisted diagnosis and treatments. METHODS: We have developed (1) basic services for the analysis of 3D anatomical models and (2) a methodology for the enrichment of such models with relevant descriptions and attributes, which reflect the parameters of interest for medical investigations. The proposed semantic annotation is ontology-driven and includes both descriptive and quantitative labelling. Most importantly, the developed methodology also makes it possible to identify and annotate parts-of-relevance of anatomical entities. RESULTS: The computational tools for the automatic computation of qualitative and quantitative parameters have been integrated in a prototype system, the SemAnatomy3D framework, which demonstrates the functionalities needed to support effective annotation of 3D patient-specific models. From the first evaluation, SemAnatomy3D appears to be an effective tool for clinical data analysis and opens new ways to support clinical diagnosis. CONCLUSIONS: The SemAnatomy3D framework integrates several functionalities for 3D part-based annotation. The idea has been presented and discussed for the case study of rheumatoid arthritis of carpal bones; however, the framework can be extended to support similar annotations in different clinical applications.
|
23
|
Hsu W, Gonzalez NR, Chien A, Pablo Villablanca J, Pajukanta P, Viñuela F, Bui AAT. An integrated, ontology-driven approach to constructing observational databases for research. J Biomed Inform 2015; 55:132-42. [PMID: 25817919 DOI: 10.1016/j.jbi.2015.03.008] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 03/21/2014] [Revised: 02/14/2015] [Accepted: 03/19/2015] [Indexed: 11/28/2022]
Abstract
The electronic health record (EHR) contains a diverse set of clinical observations that are captured as part of routine care, but the incomplete, inconsistent, and sometimes incorrect nature of clinical data poses significant impediments for its secondary use in retrospective studies or comparative effectiveness research. In this work, we describe an ontology-driven approach for extracting and analyzing data from the patient record in a longitudinal and continuous manner. We demonstrate how the ontology helps enforce consistent data representation, integrates phenotypes generated through analyses of available clinical data sources, and facilitates subsequent studies to identify clinical predictors for an outcome of interest. Development and evaluation of our approach are described in the context of studying factors that influence intracranial aneurysm (ICA) growth and rupture. We report our experiences in capturing information on 78 individuals with a total of 120 aneurysms. Two example applications related to assessing the relationship between aneurysm size, growth, gene expression modules, and rupture are described. Our work highlights the challenges with respect to data quality, workflow, and analysis of data and its implications toward a learning health system paradigm.
Affiliation(s)
- William Hsu
- Department of Radiological Sciences, UCLA David Geffen School of Medicine, Los Angeles, CA, United States.
| | - Nestor R Gonzalez
- Department of Radiological Sciences, UCLA David Geffen School of Medicine, Los Angeles, CA, United States; Department of Neurosurgery, UCLA David Geffen School of Medicine, Los Angeles, CA, United States
| | - Aichi Chien
- Department of Radiological Sciences, UCLA David Geffen School of Medicine, Los Angeles, CA, United States
| | - J Pablo Villablanca
- Department of Radiological Sciences, UCLA David Geffen School of Medicine, Los Angeles, CA, United States
| | - Päivi Pajukanta
- Department of Human Genetics, UCLA David Geffen School of Medicine, Los Angeles, CA, United States
| | - Fernando Viñuela
- Department of Radiological Sciences, UCLA David Geffen School of Medicine, Los Angeles, CA, United States
| | - Alex A T Bui
- Department of Radiological Sciences, UCLA David Geffen School of Medicine, Los Angeles, CA, United States
|
24
|
Hur J, Özgür A, Xiang Z, He Y. Development and application of an interaction network ontology for literature mining of vaccine-associated gene-gene interactions. J Biomed Semantics 2015; 6:2. [PMID: 25785184 PMCID: PMC4362819 DOI: 10.1186/2041-1480-6-2] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 11/15/2014] [Accepted: 12/17/2014] [Indexed: 12/31/2022] Open
Abstract
Background: Literature mining of gene-gene interactions has been enhanced by ontology-based name classifications. However, in biomedical literature mining, interaction keywords have not been carefully studied and used beyond a collection of keywords. Methods: In this study, we report the development of a new Interaction Network Ontology (INO) that classifies >800 interaction keywords and incorporates interaction terms from the PSI Molecular Interactions (PSI-MI) and Gene Ontology (GO). Using INO-based literature mining results, a modified Fisher’s exact test was established to analyze significantly over- and under-represented gene-gene interaction types within a specific area. Such a strategy was applied to study the vaccine-mediated gene-gene interactions using all PubMed abstracts. The Vaccine Ontology (VO) and INO were used to support the retrieval of vaccine terms and interaction keywords from the literature. Results: INO is aligned with the Basic Formal Ontology (BFO) and imports terms from 10 other existing ontologies. The current INO includes 540 terms. In terms of interaction-related terms, INO imports and aligns PSI-MI and GO interaction terms and includes over 100 newly generated ontology terms with the ‘INO_’ prefix. A new annotation property, ‘has literature mining keywords’, was generated to allow the listing of different keywords mapping to the interaction types in INO. Using all PubMed documents published as of 12/31/2013, approximately 266,000 vaccine-associated documents were identified, and a total of 6,116 gene-pairs were associated with at least one INO term. Out of 78 INO interaction terms associated with at least five gene-pairs of the vaccine-associated sub-network, 14 terms were significantly over-represented (i.e., more frequently used) and 17 under-represented based on our modified Fisher’s exact test. These over-represented and under-represented terms share some common top-level terms but are distinct at the bottom levels of the INO hierarchy. The analysis of these interaction types and their associated gene-gene pairs uncovered many scientific insights. Conclusions: INO provides a novel approach for defining hierarchical interaction types and related keywords for literature mining. The ontology-based literature mining, in combination with an INO-based statistical interaction enrichment test, provides a new platform for efficient mining and analysis of topic-specific gene interaction networks.
Affiliation(s)
- Junguk Hur
- Department of Neurology, University of Michigan, Ann Arbor, MI 48109 USA
| | - Arzucan Özgür
- Department of Computer Engineering, Bogazici University, 34342 Istanbul, Turkey
| | - Zuoshuang Xiang
- Unit for Laboratory Animal Medicine, University of Michigan, Ann Arbor, MI 48109 USA
| | - Yongqun He
- Unit for Laboratory Animal Medicine, University of Michigan, Ann Arbor, MI 48109 USA; Department of Microbiology and Immunology, University of Michigan, Ann Arbor, MI 48109 USA; Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109 USA; Comprehensive Cancer Center, University of Michigan, Ann Arbor, MI 48109 USA
|