1
Mensah GA, Burns KM, Peprah EK, Sampson UKA, Engelgau MM. Opportunities and challenges in chronic Chagas cardiomyopathy. Glob Heart 2016; 10:203-7. [PMID: 26407517 DOI: 10.1016/j.gheart.2015.08.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 12/26/2022]
Affiliation(s)
- George A Mensah
  - Center for Translation Research and Implementation Science, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Kristin M Burns
  - Division of Cardiovascular Sciences, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Emmanuel K Peprah
  - Center for Translation Research and Implementation Science, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Uchechukwu K A Sampson
  - Center for Translation Research and Implementation Science, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Michael M Engelgau
  - Center for Translation Research and Implementation Science, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
2
Asiaee AH, Minning T, Doshi P, Tarleton RL. A framework for ontology-based question answering with application to parasite immunology. J Biomed Semantics 2015; 6:31. [PMID: 26185615 PMCID: PMC4504081 DOI: 10.1186/s13326-015-0029-x] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 12/15/2013] [Accepted: 06/19/2015] [Indexed: 11/15/2022]
Abstract
Background: Large quantities of biomedical data are being produced at a rapid pace for a variety of organisms. With ontologies proliferating, data is increasingly being stored using the RDF data model and queried using RDF based querying languages. While existing systems facilitate the querying in various ways, the scientist must map the question in his or her mind to the interface used by the systems. The field of natural language processing has long investigated the challenges of designing natural language based retrieval systems. Recent efforts seek to bring the ability to pose natural language questions to RDF data querying systems while leveraging the associated ontologies. These analyze the input question and extract triples (subject, relationship, object), if possible, mapping them to RDF triples in the data. However, in the biomedical context, relationships between entities are not always explicit in the question and these are often complex involving many intermediate concepts.
Results: We present a new framework, OntoNLQA, for querying RDF data annotated using ontologies which allows posing questions in natural language. OntoNLQA offers five steps in order to answer natural language questions. In comparison to previous systems, OntoNLQA differs in how some of the methods are realized. In particular, it introduces a novel approach for discovering the sophisticated semantic associations that may exist between the key terms of a natural language question, in order to build an intuitive query and retrieve precise answers. We apply this framework to the context of parasite immunology data, leading to a system called AskCuebee that allows parasitologists to pose genomic, proteomic and pathway questions in natural language related to the parasite, Trypanosoma cruzi. We separately evaluate the accuracy of each component of OntoNLQA as implemented in AskCuebee and the accuracy of the whole system. AskCuebee answers 68% of the questions in a corpus of 125 questions, and 60% of the questions in a new previously unseen corpus. If we allow simple corrections by the scientists, this proportion increases to 92%.
Conclusions: We introduce a novel framework for question answering and apply it to parasite immunology data. Evaluations of translating the questions to RDF triple queries by combining machine learning, lexical similarity matching with ontology classes, properties and instances for specificity, and discovering associations between them demonstrate that the approach performs well and improves on previous systems. Subsequently, OntoNLQA offers a viable framework for building question answering systems in other biomedical domains.
Electronic supplementary material: The online version of this article (doi:10.1186/s13326-015-0029-x) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Amir H Asiaee
  - THINC Lab, Department of Computer Science, University of Georgia, Athens, GA, USA
- Todd Minning
  - Tarleton Research Group, Department of Cellular Biology, University of Georgia, Athens, GA, USA
- Prashant Doshi
  - THINC Lab, Department of Computer Science, University of Georgia, Athens, GA, USA
- Rick L Tarleton
  - Tarleton Research Group, Department of Cellular Biology, University of Georgia, Athens, GA, USA
3
Parikh PP, Zheng J, Logan-Klumpler F, Stoeckert CJ, Louis C, Topalis P, Protasio AV, Sheth AP, Carrington M, Berriman M, Sahoo SS. The Ontology for Parasite Lifecycle (OPL): towards a consistent vocabulary of lifecycle stages in parasitic organisms. J Biomed Semantics 2012; 3:5. [PMID: 22621763 PMCID: PMC3488002 DOI: 10.1186/2041-1480-3-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 10/07/2011] [Accepted: 05/04/2012] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Genome sequencing of many eukaryotic pathogens and the volume of data available on public resources have created a clear requirement for a consistent vocabulary to describe the range of developmental forms of parasites. Consistent labeling of experimental data and external data, in databases and the literature, is essential for integration, cross database comparison, and knowledge discovery. The primary objective of this work was to develop a dynamic and controlled vocabulary that can be used for various parasites. The paper describes the Ontology for Parasite Lifecycle (OPL) and discusses its application in parasite research.
RESULTS: The OPL is based on the Basic Formal Ontology (BFO) and follows the rules set by the OBO Foundry consortium. The first version of the OPL models complex life cycle stage details of a range of parasites, such as Trypanosoma sp., Leishmania sp., Plasmodium sp., and Schistosoma sp. In addition, the ontology also models necessary contextual details, such as host information, vector information, and anatomical locations. OPL is primarily designed to serve as a reference ontology for parasite life cycle stages that can be used for database annotation purposes and in the lab for data integration or information retrieval as exemplified in the application section below.
CONCLUSION: OPL is freely available at http://purl.obolibrary.org/obo/opl.owl and has been submitted to the BioPortal site of NCBO and to the OBO Foundry. We believe that database and phenotype annotations using OPL will help run fundamental queries on databases to know more about gene functions and to find intervention targets for various parasites. The OPL is under continuous development and new parasites and/or terms are being added.
Affiliation(s)
- Priti P Parikh
  - The Kno.e.sis Center, Department of Computer Science and Engineering, Wright State University, Dayton, OH, USA
- Jie Zheng
  - Center for Bioinformatics, Department of Genetics, University of Pennsylvania, 1420 Blockley Hall, 423 Guardian Drive, Philadelphia, Pennsylvania, 19104, USA
- Flora Logan-Klumpler
  - The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
  - Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QW, UK
- Christian J Stoeckert
  - Center for Bioinformatics, Department of Genetics, University of Pennsylvania, 1420 Blockley Hall, 423 Guardian Drive, Philadelphia, Pennsylvania, 19104, USA
- Christos Louis
  - Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology-Hellas, and Dept. of Biology, University of Crete, 700 13, Heraklion, Crete, Greece
- Pantelis Topalis
  - Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology-Hellas, and Dept. of Biology, University of Crete, 700 13, Heraklion, Crete, Greece
- Anna V Protasio
  - The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
- Amit P Sheth
  - The Kno.e.sis Center, Department of Computer Science and Engineering, Wright State University, Dayton, OH, USA
- Mark Carrington
  - Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QW, UK
- Matthew Berriman
  - The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
- Satya S Sahoo
  - The Kno.e.sis Center, Department of Computer Science and Engineering, Wright State University, Dayton, OH, USA
  - Division of Medical Informatics, School of Medicine, Case Western Reserve University, Cleveland, OH, USA