1
Cui L, Agrawal A. Special supplement issue on quality assurance and enrichment of biological and biomedical ontologies and terminologies. BMC Med Inform Decis Mak 2024; 23:302. [PMID: 39215285 PMCID: PMC11363377 DOI: 10.1186/s12911-024-02654-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/04/2024] Open
Abstract
Ontologies and terminologies serve as the backbone of knowledge representation in biomedical domains, facilitating data integration, interoperability, and semantic understanding across diverse applications. However, the quality assurance and enrichment of these resources remain an ongoing challenge due to the dynamic nature of biomedical knowledge. In this editorial, we provide an introductory summary of seven articles included in this special supplement issue for quality assurance and enrichment of biological and biomedical ontologies and terminologies. These articles span a spectrum of topics, such as development of automated quality assessment frameworks for Resource Description Framework (RDF) resources, identification of missing concepts in SNOMED CT through logical definitions, and developing a COVID interface terminology to enable automatic annotations of COVID-19 related Electronic Health Records (EHRs). Collectively, these contributions underscore the ongoing efforts to improve the accuracy, consistency, and interoperability of biomedical ontologies and terminologies, thus advancing their pivotal role in healthcare and biomedical research.
Affiliation(s)
- Licong Cui
- McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.
| | - Ankur Agrawal
- Department of Computer Science, St. Edward's University, Austin, TX, USA
2
Schenk PM, Wright AJ, West R, Hastings J, Lorencatto F, Moore C, Hayes E, Schneider V, Howes E, Michie S. An ontology of mechanisms of action in behaviour change interventions. Wellcome Open Res 2024; 8:337. [PMID: 38481854 PMCID: PMC10933577 DOI: 10.12688/wellcomeopenres.19489.1] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Accepted: 05/24/2024] [Indexed: 06/04/2024] Open
Abstract
Background: Behaviour change interventions influence behaviour through causal processes called "mechanisms of action" (MoAs). Reports of such interventions and their evaluations often use inconsistent or ambiguous terminology, creating problems for searching, evidence synthesis and theory development. This inconsistency includes the reporting of MoAs. An ontology can help address these challenges by serving as a classification system that labels and defines MoAs and their relationships. The aim of this study was to develop an ontology of MoAs of behaviour change interventions.
Methods: To develop the MoA Ontology, we (1) defined the ontology's scope; (2) identified, labelled and defined the ontology's entities; (3) refined the ontology by annotating (i.e., coding) MoAs in intervention reports; (4) refined the ontology via stakeholder review of the ontology's comprehensiveness and clarity; (5) tested whether researchers could reliably apply the ontology to annotate MoAs in intervention evaluation reports; (6) refined the relationships between entities; (7) reviewed the alignment of the MoA Ontology with other relevant ontologies; (8) reviewed the ontology's alignment with the Theories and Techniques Tool; and (9) published a machine-readable version of the ontology.
Results: An MoA was defined as "a process that is causally active in the relationship between a behaviour change intervention scenario and its outcome behaviour". We created an initial MoA Ontology with 261 entities through Steps 2-5. Inter-rater reliability for annotating study reports using these entities was α=0.68 ("acceptable") for researchers familiar with the ontology and α=0.47 for researchers unfamiliar with it. As a result of additional revisions (Steps 6-8), 23 further entities were added to the ontology, resulting in 284 entities organised in seven hierarchical levels.
Conclusions: The MoA Ontology extensively captures MoAs of behaviour change interventions. The ontology can serve as a controlled vocabulary for MoAs to consistently describe and synthesise evidence about MoAs across diverse sources.
Affiliation(s)
- Paulina M. Schenk
- Centre for Behaviour Change, University College London, London, England, UK
| | - Alison J. Wright
- Centre for Behaviour Change, University College London, London, England, UK
- Institute of Pharmaceutical Science, King's College London, London, England, UK
| | - Robert West
- Department of Behavioural Science and Health, University College London, London, England, UK
| | - Janna Hastings
- Institute for Implementation Science in Health Care, Universität Zürich, Zürich, Zurich, Switzerland
- School of Medicine, University of St Gallen, St. Gallen, St. Gallen, Switzerland
| | - Fabiana Lorencatto
- Centre for Behaviour Change, University College London, London, England, UK
| | - Candice Moore
- Centre for Behaviour Change, University College London, London, England, UK
| | - Emily Hayes
- Centre for Behaviour Change, University College London, London, England, UK
| | - Verena Schneider
- Research Department of Epidemiology and Public Health, University College London, London, England, UK
| | - Ella Howes
- Leeds Unit for Complex Intervention Development, University of Leeds, Leeds, England, UK
| | - Susan Michie
- Centre for Behaviour Change, University College London, London, England, UK
3
Xia Y, Duan Y, Sha L, Lai W, Zhang Z, Hou J, Chen L. Whole-cycle management of women with epilepsy of child-bearing age: ontology construction and application. BMC Med Inform Decis Mak 2024; 24:101. [PMID: 38637746 PMCID: PMC11027401 DOI: 10.1186/s12911-024-02509-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2023] [Accepted: 04/15/2024] [Indexed: 04/20/2024] Open
Abstract
BACKGROUND: The effective management of epilepsy in women of child-bearing age necessitates a concerted effort from multidisciplinary teams. Nevertheless, there exists an inadequacy in the seamless exchange of knowledge among healthcare providers within this context. Consequently, it is imperative to enhance the availability of informatics resources and the development of decision support tools to address this issue comprehensively.
MATERIALS AND METHODS: The development of the Women with Epilepsy of Child-Bearing Age Ontology (WWECA) adhered to established ontology construction principles. The ontology's scope and universal terminology were initially established by the development team and subsequently subjected to external evaluation through a rapid Delphi consensus exercise involving domain experts. Additional entities and attribute annotation data were sourced from authoritative guideline documents and specialized terminology databases within the respective field. Furthermore, the ontology has played a pivotal role in steering the creation of an online question-and-answer system, which is actively employed and assessed by a diverse group of multidisciplinary healthcare providers.
RESULTS: WWECA successfully integrated a total of 609 entities encompassing various facets related to the diagnosis and medication for women of child-bearing age afflicted with epilepsy. The ontology exhibited a maximum depth of 8 within its hierarchical structure. Each of these entities featured three fundamental attributes, namely Chinese labels, definitions, and synonyms. The evaluation of WWECA involved 35 experts from 10 different hospitals across China, resulting in a favorable consensus among the experts. Furthermore, the ontology-driven online question-and-answer system underwent evaluation by a panel of 10 experts, including neurologists, obstetricians, and gynecologists. This evaluation yielded an average rating of 4.2, signifying a positive reception and endorsement of the system's utility and effectiveness.
CONCLUSIONS: Our ontology and the associated online question-and-answer system hold the potential to serve as a scalable assistant for healthcare providers engaged in the management of women with epilepsy (WWE). In the future, this developmental framework has the potential for broader application in the context of long-term management of more intricate chronic health conditions.
Affiliation(s)
- Yilin Xia
- Department of Neurology, West China Hospital, Sichuan University, #37 Guoxue Alley, Wuhou District, 610041, Chengdu, Sichuan Province, China
| | - Yifei Duan
- Department of Neurology, West China Hospital, Sichuan University, #37 Guoxue Alley, Wuhou District, 610041, Chengdu, Sichuan Province, China
| | - Leihao Sha
- Department of Neurology, West China Hospital, Sichuan University, #37 Guoxue Alley, Wuhou District, 610041, Chengdu, Sichuan Province, China
| | - Wanlin Lai
- Department of Neurology, West China Hospital, Sichuan University, #37 Guoxue Alley, Wuhou District, 610041, Chengdu, Sichuan Province, China
| | - Zhimeng Zhang
- Department of Neurology, West China Hospital, Sichuan University, #37 Guoxue Alley, Wuhou District, 610041, Chengdu, Sichuan Province, China
| | - Jiaxin Hou
- Department of Neurology, West China Hospital, Sichuan University, #37 Guoxue Alley, Wuhou District, 610041, Chengdu, Sichuan Province, China
| | - Lei Chen
- Department of Neurology, West China Hospital, Sichuan University, #37 Guoxue Alley, Wuhou District, 610041, Chengdu, Sichuan Province, China.
- Pazhou Lab, Guangzhou, China.
4
Callahan TJ, Tripodi IJ, Stefanski AL, Cappelletti L, Taneja SB, Wyrwa JM, Casiraghi E, Matentzoglu NA, Reese J, Silverstein JC, Hoyt CT, Boyce RD, Malec SA, Unni DR, Joachimiak MP, Robinson PN, Mungall CJ, Cavalleri E, Fontana T, Valentini G, Mesiti M, Gillenwater LA, Santangelo B, Vasilevsky NA, Hoehndorf R, Bennett TD, Ryan PB, Hripcsak G, Kahn MG, Bada M, Baumgartner WA, Hunter LE. An open source knowledge graph ecosystem for the life sciences. Sci Data 2024; 11:363. [PMID: 38605048 PMCID: PMC11009265 DOI: 10.1038/s41597-024-03171-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 03/21/2024] [Indexed: 04/13/2024] Open
Abstract
Translational research requires data at multiple scales of biological organization. Advancements in sequencing and multi-omics technologies have increased the availability of these data, but researchers face significant integration challenges. Knowledge graphs (KGs) are used to model complex phenomena, and methods exist to construct them automatically. However, tackling complex biomedical integration problems requires flexibility in the way knowledge is modeled. Moreover, existing KG construction methods provide robust tooling at the cost of fixed or limited choices among knowledge representation models. PheKnowLator (Phenotype Knowledge Translator) is a semantic ecosystem for automating the FAIR (Findable, Accessible, Interoperable, and Reusable) construction of ontologically grounded KGs with fully customizable knowledge representation. The ecosystem includes KG construction resources (e.g., data preparation APIs), analysis tools (e.g., SPARQL endpoint resources and abstraction algorithms), and benchmarks (e.g., prebuilt KGs). We evaluated the ecosystem by systematically comparing it to existing open-source KG construction methods and by analyzing its computational performance when used to construct 12 different large-scale KGs. With flexible knowledge representation, PheKnowLator enables fully customizable KGs without compromising performance or usability.
Affiliation(s)
- Tiffany J Callahan
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA.
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, 10032, USA.
| | - Ignacio J Tripodi
- Computer Science Department, Interdisciplinary Quantitative Biology, University of Colorado Boulder, Boulder, CO, 80301, USA
| | - Adrianne L Stefanski
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
| | - Luca Cappelletti
- AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
| | - Sanya B Taneja
- Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, 15260, USA
| | - Jordan M Wyrwa
- Department of Physical Medicine and Rehabilitation, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
| | - Elena Casiraghi
- AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
| | - Justin Reese
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
| | - Jonathan C Silverstein
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15206, USA
| | - Charles Tapley Hoyt
- Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, 02115, USA
| | - Richard D Boyce
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15206, USA
| | - Scott A Malec
- Division of Translational Informatics, University of New Mexico School of Medicine, Albuquerque, NM, 87131, USA
| | - Deepak R Unni
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Marcin P Joachimiak
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
| | - Peter N Robinson
- Berlin Institute of Health at Charité-Universitätsmedizin, 10117, Berlin, Germany
| | - Christopher J Mungall
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
| | - Emanuele Cavalleri
- AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
| | - Tommaso Fontana
- AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
| | - Giorgio Valentini
- AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
- ELLIS, European Laboratory for Learning and Intelligent Systems, Milan Unit, Italy
| | - Marco Mesiti
- AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
| | - Lucas A Gillenwater
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
| | - Brook Santangelo
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
| | - Nicole A Vasilevsky
- Data Collaboration Center, Critical Path Institute, 1840 E River Rd. Suite 100, Tucson, AZ, 85718, USA
| | - Robert Hoehndorf
- Computer, Electrical and Mathematical Sciences & Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Kingdom of Saudi Arabia
| | - Tellen D Bennett
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Department of Pediatrics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
| | - Patrick B Ryan
- Janssen Research and Development, Raritan, NJ, 08869, USA
| | - George Hripcsak
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, 10032, USA
| | - Michael G Kahn
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
| | - Michael Bada
- Division of General Internal Medicine, University of Colorado School of Medicine, Aurora, CO, 80045, USA
| | - William A Baumgartner
- Division of General Internal Medicine, University of Colorado School of Medicine, Aurora, CO, 80045, USA.
| | - Lawrence E Hunter
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA.
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA.
5
Zhang Z, Yu P, Yin M, Chang HC, Thomas SJ, Wei W, Song T, Deng C. Developing an ontology of non-pharmacological treatment for emotional and mood disturbances in dementia. Sci Rep 2024; 14:1937. [PMID: 38253678 PMCID: PMC10803746 DOI: 10.1038/s41598-023-46226-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2022] [Accepted: 10/30/2023] [Indexed: 01/24/2024] Open
Abstract
Emotional and mood disturbances are common in people with dementia. Non-pharmacological interventions are beneficial for managing these disturbances. However, effectively applying these interventions, particularly in the person-centred approach, is a complex and knowledge-intensive task. Healthcare professionals need the assistance of tools to obtain all relevant information that is often buried in a vast amount of clinical data to form a holistic understanding of the person for successfully applying non-pharmacological interventions. A machine-readable knowledge model, e.g., ontology, can codify the research evidence to underpin these tools. For the first time, this study aims to develop an ontology entitled Dementia-Related Emotional And Mood Disturbance Non-Pharmacological Treatment Ontology (DREAMDNPTO). DREAMDNPTO consists of 1258 unique classes (concepts) and 70 object properties that represent relationships between these classes. It meets the requirements and quality standards for biomedical ontology. As DREAMDNPTO provides a computerisable semantic representation of knowledge specific to non-pharmacological treatment for emotional and mood disturbances in dementia, it will facilitate the application of machine learning to this particular and important health domain of emotional and mood disturbance management for people with dementia.
Affiliation(s)
- Zhenyu Zhang
- Centre for Digital Transformation, School of Computing and Information Technology, University of Wollongong, Northfield Ave, Wollongong, NSW, 2522, Australia
| | - Ping Yu
- Centre for Digital Transformation, School of Computing and Information Technology, University of Wollongong, Northfield Ave, Wollongong, NSW, 2522, Australia.
- Illawarra Health and Medical Research Institute, University of Wollongong, Wollongong, Australia.
| | - Mengyang Yin
- Centre for Digital Transformation, School of Computing and Information Technology, University of Wollongong, Northfield Ave, Wollongong, NSW, 2522, Australia
- Systems and Reporting Residential Care, Catholic Healthcare Ltd, Wollongong, Australia
| | - Hui Chen Chang
- Illawarra Health and Medical Research Institute, University of Wollongong, Wollongong, Australia
- School of Nursing, University of Wollongong, Wollongong, Australia
| | - Susan J Thomas
- Illawarra Health and Medical Research Institute, University of Wollongong, Wollongong, Australia
- Graduate School of Medicine, University of Wollongong, Wollongong, Australia
| | - Wenxi Wei
- School of Nursing, University of Wollongong, Wollongong, Australia
| | - Ting Song
- Centre for Digital Transformation, School of Computing and Information Technology, University of Wollongong, Northfield Ave, Wollongong, NSW, 2522, Australia
- Illawarra Health and Medical Research Institute, University of Wollongong, Wollongong, Australia
| | - Chao Deng
- Illawarra Health and Medical Research Institute, University of Wollongong, Wollongong, Australia
- School of Medical, Indigenous and Health Sciences, University of Wollongong, Wollongong, Australia
6
Hernández L, Estévez-Priego E, López-Pérez L, Fernanda Cabrera-Umpiérrez M, Arredondo MT, Fico G. HeNeCOn: An ontology for integrative research in Head and Neck cancer. Int J Med Inform 2024; 181:105284. [PMID: 37981440 DOI: 10.1016/j.ijmedinf.2023.105284] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2023] [Revised: 07/14/2023] [Accepted: 11/01/2023] [Indexed: 11/21/2023]
Abstract
BACKGROUND: Head and Neck Cancer (HNC) has a high incidence and prevalence in the worldwide population. The broad terminology associated with these diseases and their multimodality treatments generates large amounts of heterogeneous clinical data, which motivates the construction of a high-quality harmonization model to standardize this multi-source clinical data in terms of format and semantics. The use of ontologies and semantic techniques is a well-known approach to face this challenge.
OBJECTIVE: This work aims to provide a clinically reliable data model for HNC processes during all phases of the disease: prognosis, treatment, and follow-up. Therefore, we built the first ontology specifically focused on the HNC domain, named HeNeCOn (Head and Neck Cancer Ontology).
METHODS: First, an annotated dataset was established to provide a formal reference description of HNC. Then, 170 clinical variables were organized into a taxonomy, and later expanded and mapped to formalize and integrate multiple databases into the HeNeCOn ontology. The outcomes of this iterative process were reviewed and validated by clinicians and statisticians.
RESULTS: HeNeCOn is an ontology consisting of 502 classes, a taxonomy with a hierarchical structure, semantic definitions of 283 medical terms and detailed relations between them, which can be used as a tool for information extraction and knowledge management.
CONCLUSION: HeNeCOn is a reusable, extendible and standardized ontology which establishes a reference data model for terminology structure and standard definitions in the Head and Neck Cancer domain. This ontology allows handling both current and newly generated knowledge in Head and Neck cancer research, by means of data linking and mapping with other public ontologies.
Affiliation(s)
- Liss Hernández
- Universidad Politécnica de Madrid-Life Supporting Technologies Research Group, ETSIT, 28040 Madrid, Spain
| | - Estefanía Estévez-Priego
- Universidad Politécnica de Madrid-Life Supporting Technologies Research Group, ETSIT, 28040 Madrid, Spain
| | - Laura López-Pérez
- Universidad Politécnica de Madrid-Life Supporting Technologies Research Group, ETSIT, 28040 Madrid, Spain
| | - María Teresa Arredondo
- Universidad Politécnica de Madrid-Life Supporting Technologies Research Group, ETSIT, 28040 Madrid, Spain
| | - Giuseppe Fico
- Universidad Politécnica de Madrid-Life Supporting Technologies Research Group, ETSIT, 28040 Madrid, Spain.
7
Tavakoli K, Kalaw FGP, Bhanvadia S, Hogarth M, Baxter SL. Concept Coverage Analysis of Ophthalmic Infections and Trauma among the Standardized Medical Terminologies SNOMED-CT, ICD-10-CM, and ICD-11. OPHTHALMOLOGY SCIENCE 2023; 3:100337. [PMID: 37449050 PMCID: PMC10336190 DOI: 10.1016/j.xops.2023.100337] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/04/2022] [Revised: 05/10/2023] [Accepted: 05/19/2023] [Indexed: 07/18/2023]
Abstract
Purpose: Widespread electronic health record adoption has generated a large volume of data and emphasized the need for standardized terminology to describe clinical concepts. Here, we undertook a systematic concept coverage analysis to determine the representation of clinical concepts in ophthalmic infection and ophthalmic trauma among standardized medical terminologies, including the Systematized Nomenclature of Medicine Clinical Terms (SNOMED-CT), the International Classification of Diseases (ICD) version 10 with clinical modifications (ICD-10-CM), and ICD version 11 (ICD-11).
Design: Extraction of concepts related to ophthalmic infection and ophthalmic trauma and structured search in terminology browsers.
Data Sources: The American Academy of Ophthalmology Basic and Clinical Science Course (BCSC), SNOMED-CT, and ICD-10-CM terminologies from the Observational Health Data Sciences and Informatics Athena browser, and the ICD-11 terminology browser.
Methods: Concepts pertaining to ophthalmic infection and ophthalmic trauma were extracted from the 2022 BCSC free text and index terms. We searched terminology browsers to identify corresponding codes and classified the extent of semantic alignment as equal, wide, narrow, or unmatched in each terminology. The overlap of equal concepts in each terminology was represented in a Venn diagram.
Main Outcome Measures: Proportions of clinical concepts with corresponding codes at various levels of semantic alignment.
Results: A total of 443 concepts were identified: 304 concepts related to ophthalmic infection and 139 concepts related to ophthalmic trauma. The SNOMED-CT had the highest proportion of equal coverage, with 82.0% (249 of 304) among concepts related to ophthalmic infection and 82.0% (115 of 139) among concepts related to ophthalmic trauma. Across all concepts, 28% (124 of 443) were classified as equal in ICD-10-CM and 52.8% (234 of 443) were classified as equal in ICD-11.
Conclusions: The SNOMED-CT had significantly better semantic alignment than ICD-10-CM and ICD-11 for ophthalmic infections and ophthalmic trauma. This demonstrates opportunity for continuing advancement of representation of ophthalmic concepts in standardized medical terminologies.
Affiliation(s)
- Kiana Tavakoli
- Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute, University of California San Diego, La Jolla, California
- Division of Biomedical Informatics, Department of Medicine, University of California San Diego, La Jolla, California
| | - Fritz Gerald P. Kalaw
- Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute, University of California San Diego, La Jolla, California
- Division of Biomedical Informatics, Department of Medicine, University of California San Diego, La Jolla, California
| | - Sonali Bhanvadia
- Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute, University of California San Diego, La Jolla, California
- Division of Biomedical Informatics, Department of Medicine, University of California San Diego, La Jolla, California
| | - Michael Hogarth
- Division of Biomedical Informatics, Department of Medicine, University of California San Diego, La Jolla, California
| | - Sally L. Baxter
- Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute, University of California San Diego, La Jolla, California
- Division of Biomedical Informatics, Department of Medicine, University of California San Diego, La Jolla, California
8
Golec M, Kamdar M, Barteit S. Comprehensive Ontology of Fibroproliferative Diseases: Protocol for a Semantic Technology Study. JMIR Res Protoc 2023; 12:e48645. [PMID: 37566458 PMCID: PMC10457705 DOI: 10.2196/48645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2023] [Revised: 06/16/2023] [Accepted: 07/04/2023] [Indexed: 08/12/2023] Open
Abstract
BACKGROUND: Fibroproliferative or fibrotic diseases (FDs), which represent a significant proportion of age-related pathologies and account for over 40% of mortality in developed nations, are often underrepresented in focused research. Typically, these conditions are studied individually, such as chronic obstructive pulmonary disease or idiopathic pulmonary fibrosis (IPF), rather than as a collective entity, thereby limiting the holistic understanding and development of effective treatments. To address this, we propose creating and publicizing a comprehensive fibroproliferative disease ontology (FDO) to unify the understanding of FDs.
OBJECTIVE: This paper aims to delineate the study protocol for the creation of the FDO, foster transparency and high quality standards during its development, and subsequently promote its use once it becomes publicly available.
METHODS: We aim to establish an ontology encapsulating the broad spectrum of FDs, constructed in the Web Ontology Language format using the Protégé ontology editor, adhering to ontology development life cycle principles. The modeling process will leverage Protégé in accordance with a methodologically defined process, involving targeted scoping reviews of MEDLINE and PubMed information, expert knowledge, and an ontology development process. A hybrid top-down and bottom-up strategy will guide the identification of core concepts and relations, conducted by a team of domain experts based on systematic iterations of scientific literature reviews.
RESULTS: The result will be an exhaustive FDO accommodating a wide variety of crucial biomedical concepts, augmented with synonyms, definitions, and references. The FDO aims to encapsulate diverse perspectives on the FD domain, including those of clinicians, health informaticians, medical researchers, and public health experts.
CONCLUSIONS: The FDO is expected to stimulate broader and more in-depth FD research by enabling reasoning, inference, and the identification of relationships between concepts for application in multiple contexts, such as developing specialized software, fostering research communities, and enhancing domain comprehension. A common vocabulary and understanding of relationships among medical professionals could potentially expedite scientific progress and the discovery of innovative solutions. The publicly available FDO will form the foundation for future research, technological advancements, and public health initiatives.
INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): PRR1-10.2196/48645.
Affiliation(s)
- Marcin Golec
- Heidelberg Institute of Global Health, Faculty of Medicine and University Hospital, Heidelberg University, Heidelberg, Germany
| | - Maulik Kamdar
- Center for Advanced Clinical Solutions, Optum Health, Eden Prairie, MN, United States
| | - Sandra Barteit
- Heidelberg Institute of Global Health, Faculty of Medicine and University Hospital, Heidelberg University, Heidelberg, Germany
9
Sabbaghi H, Madani S, Ahmadieh H, Daftarian N, Suri F, Khorrami F, Saviz P, Shahriari MH, Motevasseli T, Fekri S, Nourinia R, Moradian S, Sheikhtaheri A. A health terminological system for inherited retinal diseases: Content coverage evaluation and a proposed classification. PLoS One 2023; 18:e0281858. [PMID: 37540684 PMCID: PMC10403057 DOI: 10.1371/journal.pone.0281858] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2022] [Accepted: 02/02/2023] [Indexed: 08/06/2023] Open
Abstract
PURPOSE To present a classification of inherited retinal diseases (IRDs) and evaluate its content coverage in comparison with common standard terminology systems. METHODS In this comparative cross-sectional study, a panel of subject matter experts annotated a list of IRDs based on a comprehensive review of the literature. Then, they leveraged clinical terminologies from various reference sets including Unified Medical Language System (UMLS), Online Mendelian Inheritance in Man (OMIM), International Classification of Diseases (ICD-11), Systematized Nomenclature of Medicine Clinical Terms (SNOMED-CT) and Orphanet Rare Disease Ontology (ORDO). RESULTS Initially, we generated a hierarchical classification of 62 IRD diagnosis concepts in six categories. Subsequently, the classification was extended to 164 IRD diagnoses after adding concepts from various standard terminologies. Finally, 158 concepts were selected to be classified into six categories and genetic subtypes of 412 cases were added to the related concepts. UMLS had the greatest content coverage (90.51%), followed by SNOMED-CT (83.54%), ORDO (81.01%), OMIM (60.76%), and ICD-11 (60.13%). There were 53 IRD concepts (33.54%) that were covered by all five investigated systems. However, 2.53% of the IRD concepts in our classification were not covered by any of the standard terminologies. CONCLUSIONS This comprehensive classification system was established to organize IRDs based on phenotypic and genotypic specifications. It could potentially be used for IRD clinical documentation purposes and could also be considered a preliminary step toward developing a more robust standard ontology for IRDs or updating available standard terminologies. Among the systems compared, the UMLS Metathesaurus provided the greatest content coverage of our proposed classification.
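The content coverage figures above are set-overlap percentages, and a minimal Python sketch may make the calculation concrete. The concept names and the two terminologies ("A" and "B") below are hypothetical toy data, not taken from the study, and `coverage` is an illustrative helper name. The last lines check one reported figure: 143 is the only whole number of concepts out of 158 that rounds to 90.51%, a count that is back-calculated here and not stated in the abstract.

```python
from typing import Dict, Set


def coverage(classification: Set[str], terminology: Set[str]) -> float:
    """Percentage of classification concepts also present in a terminology."""
    if not classification:
        raise ValueError("classification must not be empty")
    shared = classification & terminology
    return round(100 * len(shared) / len(classification), 2)


# Hypothetical toy classification and terminologies (illustration only).
classification = {
    "retinitis pigmentosa",
    "Stargardt disease",
    "achromatopsia",
    "choroideremia",
    "Leber congenital amaurosis",
}
terminologies: Dict[str, Set[str]] = {
    "A": {"retinitis pigmentosa", "Stargardt disease", "achromatopsia"},
    "B": {"retinitis pigmentosa", "choroideremia"},
}

per_system = {name: coverage(classification, terms)
              for name, terms in terminologies.items()}
# Concepts found in every terminology, and concepts found in none.
covered_by_all = classification.intersection(*terminologies.values())
covered_by_none = classification - set().union(*terminologies.values())

# Arithmetic check of a reported figure (143 is back-calculated, not reported).
reported_umls = round(100 * 143 / 158, 2)
```

The same intersection and difference operations correspond to the "covered by all five systems" and "not covered by any" figures in the abstract.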
Affiliation(s)
- Hamideh Sabbaghi
- Ophthalmic Epidemiology Research Center, Research Institute for Ophthalmology and Vision Science, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Department of Optometry, School of Rehabilitation, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Sina Madani
- Department of HealthIT, Vanderbilt University Medical Center, Nashville, TN, United States of America
| | - Hamid Ahmadieh
- Ophthalmic Research Center, Research Institute for Ophthalmology and Vision Science, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Narsis Daftarian
- Ocular Tissue Engineering Research Center, Research Institute for Ophthalmology and Vision Science, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Fatemeh Suri
- Ophthalmic Research Center, Research Institute for Ophthalmology and Vision Science, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Farid Khorrami
- Department of Health Information Technology, Hormozgan University of Medical Sciences, Bandar Abbas, Iran
| | - Proshat Saviz
- Department of Health Information Management, School of Health Management and Information Sciences, Iran University of Medical Sciences, Tehran, Iran
| | - Mohammad Hasan Shahriari
- Department of Health Information Technology and Management, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Tahmineh Motevasseli
- Ophthalmic Research Center, Research Institute for Ophthalmology and Vision Science, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Sahba Fekri
- Ophthalmic Research Center, Research Institute for Ophthalmology and Vision Science, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Ramin Nourinia
- Ophthalmic Research Center, Research Institute for Ophthalmology and Vision Science, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Siamak Moradian
- Ophthalmic Research Center, Research Institute for Ophthalmology and Vision Science, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Abbas Sheikhtaheri
- Department of Health Information Management, School of Health Management and Information Sciences, Iran University of Medical Sciences, Tehran, Iran
| |
|
10
|
He Z, Pfaff E, Guo SJ, Guo Y, Wu Y, Tao C, Stiglic G, Bian J. Enriching Real-world Data with Social Determinants of Health for Health Outcomes and Health Equity: Successes, Challenges, and Opportunities. Yearb Med Inform 2023; 32:253-263. [PMID: 38147867 PMCID: PMC10751148 DOI: 10.1055/s-0043-1768732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2023] Open
Abstract
OBJECTIVE To summarize the recent methods and applications that leverage real-world data such as electronic health records (EHRs) with social determinants of health (SDoH) for public and population health and health equity and identify successes, challenges, and possible solutions. METHODS In this opinion review, grounded in a social-ecological-model-based conceptual framework, we surveyed data sources and recent informatics approaches that enable leveraging SDoH along with real-world data to support public health and clinical health applications, including helping design public health interventions, enhancing risk stratification, and enabling the prediction of unmet social needs. RESULTS Besides summarizing data sources, we identified gaps in capturing SDoH data in existing EHR systems and opportunities to leverage informatics approaches to collect SDoH information either from structured and unstructured EHR data or through linking with public surveys and environmental data. We also surveyed recently developed ontologies for standardizing SDoH information and approaches that incorporate SDoH for disease risk stratification, public health crisis prediction, and development of tailored interventions. CONCLUSIONS To enable effective public health and clinical applications using real-world data with SDoH, it is necessary to develop both non-technical solutions involving incentives, policies, and training and technical solutions such as novel social risk management tools that are integrated into clinical workflow. Ultimately, SDoH-powered social risk management, disease risk prediction, and development of SDoH-tailored interventions for disease prevention and management have the potential to improve population health, reduce disparities, and improve health equity.
Affiliation(s)
- Zhe He
- School of Information, Florida State University, United States
- Department of Behavioral Sciences and Social Medicine, College of Medicine, Florida State University, United States
| | - Emily Pfaff
- Department of Medicine, University of North Carolina at Chapel Hill School of Medicine, United States
| | - Serena Jingchuan Guo
- Department of Pharmaceutical Outcomes and Policy, College of Pharmacy, University of Florida, United States
| | - Yi Guo
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, United States
| | - Yonghui Wu
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, United States
| | - Cui Tao
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, United States
| | - Gregor Stiglic
- Faculty of Health Science, University of Maribor, Slovenia
- Faculty of Electrical Engineering and Computer Science, University of Maribor, Slovenia
- Usher Institute, University of Edinburgh, UK
| | - Jiang Bian
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, United States
| |
|
11
|
Callahan TJ, Stefanski AL, Wyrwa JM, Zeng C, Ostropolets A, Banda JM, Baumgartner WA, Boyce RD, Casiraghi E, Coleman BD, Collins JH, Deakyne Davies SJ, Feinstein JA, Lin AY, Martin B, Matentzoglu NA, Meeker D, Reese J, Sinclair J, Taneja SB, Trinkley KE, Vasilevsky NA, Williams AE, Zhang XA, Denny JC, Ryan PB, Hripcsak G, Bennett TD, Haendel MA, Robinson PN, Hunter LE, Kahn MG. Ontologizing health systems data at scale: making translational discovery a reality. NPJ Digit Med 2023; 6:89. [PMID: 37208468 PMCID: PMC10196319 DOI: 10.1038/s41746-023-00830-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2022] [Accepted: 04/28/2023] [Indexed: 05/21/2023] Open
Abstract
Common data models solve many challenges of standardizing electronic health record (EHR) data but are unable to semantically integrate all of the resources needed for deep phenotyping. Open Biological and Biomedical Ontology (OBO) Foundry ontologies provide computable representations of biological knowledge and enable the integration of heterogeneous data. However, mapping EHR data to OBO ontologies requires significant manual curation and domain expertise. We introduce OMOP2OBO, an algorithm for mapping Observational Medical Outcomes Partnership (OMOP) vocabularies to OBO ontologies. Using OMOP2OBO, we produced mappings for 92,367 conditions, 8611 drug ingredients, and 10,673 measurement results, which covered 68-99% of concepts used in clinical practice when examined across 24 hospitals. When used to phenotype rare disease patients, the mappings helped systematically identify undiagnosed patients who might benefit from genetic testing. By aligning OMOP vocabularies to OBO ontologies our algorithm presents new opportunities to advance EHR-based deep phenotyping.
Affiliation(s)
- Tiffany J Callahan
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA.
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, 10032, USA.
| | - Adrianne L Stefanski
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
| | - Jordan M Wyrwa
- Department of Physical Medicine and Rehabilitation, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
| | - Chenjie Zeng
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Anna Ostropolets
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, 10032, USA
| | - Juan M Banda
- Department of Computer Science, Georgia State University, Atlanta, GA, 30303, USA
| | - William A Baumgartner
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
| | - Richard D Boyce
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15260, USA
| | - Elena Casiraghi
- Computer Science, Università degli Studi di Milano, Milan, Italy
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
| | - Ben D Coleman
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
| | - Janine H Collins
- Department of Haematology, University of Cambridge, Cambridge, UK
| | - Sara J Deakyne Davies
- Department of Research Informatics & Data Science, Analytics Resource Center, Children's Hospital Colorado, Aurora, CO, 80045, USA
| | - James A Feinstein
- Adult and Child Center for Health Outcomes Research and Delivery Science (ACCORDS), University of Colorado Anschutz School of Medicine, Aurora, CO, 80045, USA
| | - Asiyah Y Lin
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Blake Martin
- Departments of Biomedical Informatics and Pediatrics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
| | | | | | - Justin Reese
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
| | | | - Sanya B Taneja
- Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, 15260, USA
| | - Katy E Trinkley
- Department of Family Medicine, University of Colorado Anschutz School of Medicine, Aurora, CO, 80045, USA
| | - Nicole A Vasilevsky
- Translational and Integrative Sciences Lab, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
| | - Andrew E Williams
- Tufts Institute for Clinical Research and Health Policy Studies, Tufts University, Boston, MA, 02155, USA
| | - Xingmin A Zhang
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
| | - Joshua C Denny
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Patrick B Ryan
- Janssen Research and Development, Raritan, NJ, 08869, USA
| | - George Hripcsak
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, 10032, USA
| | - Tellen D Bennett
- Departments of Biomedical Informatics and Pediatrics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
| | - Melissa A Haendel
- Departments of Biomedical Informatics and Pediatrics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
| | - Peter N Robinson
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
| | - Lawrence E Hunter
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
| | - Michael G Kahn
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
| |
|
12
|
Dirkson A, Verberne S, van Oortmerssen G, Gelderblom H, Kraaij W. How do others cope? Extracting coping strategies for adverse drug events from social media. J Biomed Inform 2023; 139:104228. [PMID: 36309197 DOI: 10.1016/j.jbi.2022.104228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2022] [Revised: 09/09/2022] [Accepted: 10/09/2022] [Indexed: 02/16/2023]
Abstract
Patients advise their peers in online support groups on how to cope with their illness in daily life. To date, no efforts have been made to automatically extract recommended coping strategies from online patient discussion groups. We introduce this new task, which poses a number of challenges including complex, long entities, a large long-tailed label space, and cross-document relations. We present an initial ontology for coping strategies as a starting point for future research on coping strategies, and the first end-to-end pipeline for extracting coping strategies for side effects. We also compared two possible computational solutions for this novel and highly challenging task: multi-label classification and named entity recognition (NER) with entity linking (EL). We evaluated our methods on the discussion forum from the Facebook group of the worldwide patient support organization 'GIST support international' (GSI); GIST support international donated the data to us. We found that coping strategy extraction is difficult and both methods attain limited performance (measured with F1 score) on held-out test sets; multi-label classification outperforms NER+EL (F1=0.220 vs F1=0.155). An inspection of the multi-label classification output revealed that for some of the incorrect predictions, the reference label is close to the predicted label in the ontology (e.g. the predicted label 'juice' instead of the more specific reference label 'grapefruit juice'). Performance increased to F1=0.498 when we evaluated at a coarser level of the ontology. We conclude that our pipeline can be used in a semi-automatic setting, in interaction with domain experts, to discover coping strategies for side effects from a patient forum. For example, we found that patients recommend ginger tea for nausea and magnesium and potassium supplements for cramps. This information can be used as input for patient surveys or clinical studies.
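The effect of evaluating "at a coarser level of the ontology" can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' evaluation code: micro-averaged F1 is assumed (the abstract does not say which averaging was used), the `parent` map is a hypothetical two-entry fragment of an ontology, and the posts are toy data built around the abstract's 'juice' versus 'grapefruit juice' example.

```python
from typing import Dict, List, Set


def micro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Micro-averaged F1 over per-post label sets."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    fp = sum(len(p - g) for g, p in zip(gold, pred))
    fn = sum(len(g - p) for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def coarsen(labels: Set[str], parent: Dict[str, str]) -> Set[str]:
    """Map each label to its parent; labels without a parent stay unchanged."""
    return {parent.get(label, label) for label in labels}


# Hypothetical ontology fragment: specific label -> coarser label.
parent = {"grapefruit juice": "juice", "ginger tea": "tea"}

# Toy reference and predicted labels for two forum posts.
gold = [{"grapefruit juice"}, {"ginger tea"}]
pred = [{"juice"}, {"ginger tea"}]

fine = micro_f1(gold, pred)
coarse = micro_f1([coarsen(g, parent) for g in gold],
                  [coarsen(p, parent) for p in pred])
```

In this toy case the prediction 'juice' counts as an error at the fine level but as a match once both sides are mapped to the coarser level, which is the kind of near miss the abstract describes.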
Affiliation(s)
- Anne Dirkson
- Leiden Institute of Advanced Computer Science, Leiden University, Niels Bohrweg 1, 2333 CA Leiden, Netherlands.
| | - Suzan Verberne
- Leiden Institute of Advanced Computer Science, Leiden University, Niels Bohrweg 1, 2333 CA Leiden, Netherlands.
| | - Gerard van Oortmerssen
- Leiden Institute of Advanced Computer Science, Leiden University, Niels Bohrweg 1, 2333 CA Leiden, Netherlands.
| | - Hans Gelderblom
- Department of Medical Oncology, Leiden University Medical Centre, Albinusdreef 2, 2333 ZA Leiden, Netherlands.
| | - Wessel Kraaij
- Leiden Institute of Advanced Computer Science, Leiden University, Niels Bohrweg 1, 2333 CA Leiden, Netherlands.
| |
|
13
|
Keshavarzi M, Ghaffary HR. An ontology-driven framework for knowledge representation of digital extortion attacks. COMPUTERS IN HUMAN BEHAVIOR 2023; 139:107520. [PMID: 36268220 PMCID: PMC9557090 DOI: 10.1016/j.chb.2022.107520] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/06/2022] [Revised: 10/02/2022] [Accepted: 10/07/2022] [Indexed: 11/22/2022]
Abstract
With the COVID-19 pandemic and the growing influence of the Internet in critical sectors of industry and society, cyberattacks have not declined but have risen sharply. Ransomware is at the forefront of the most devastating threats and has given rise to a lucrative illegal business. Due to the proliferation and variety of ransomware attacks, there is a need for a new theory of categories. The intricacy and multiplicity of components involved in digital extortion call for a knowledge representation system that can organize large volumes of information from heterogeneous sources in a formal structured format and infer new knowledge from it. This paper proposes and develops a dedicated ontology of digital blackmail, called Rantology, with a particular focus on ransomware attacks. The logic encoded in this ontology makes it possible to assess the maliciousness of programs based on various factors, including called API functions and their behaviors. The proposed framework can be used to facilitate interoperability between cybersecurity experts and knowledge-based systems, and to identify sensitive points for surveillance. The evaluation results based on several criteria confirm the adequacy of the suggested ontology in terms of clarity, modularity, consistency, coverage and inheritance richness.
|
14
|
Hao X, Abeysinghe R, Shi J, Cui L. A substring replacement approach for identifying missing IS-A relations in SNOMED CT. PROCEEDINGS. IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE 2022; 2022:2611-2618. [PMID: 36776766 PMCID: PMC9918377 DOI: 10.1109/bibm55620.2022.9995595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
Biomedical ontologies provide formalized information and knowledge in the biomedical domain. Over the years, biomedical ontologies have played an important role in facilitating biomedical research and applications. Common quality issues of biomedical ontologies include inconsistent naming of concepts, redundant concepts, redundant relations, incomplete/incorrect concept definitions, and incomplete/incorrect class hierarchies. In this work, we focus on addressing the incompleteness of the class hierarchy in SNOMED CT. We develop a substring replacement approach, leveraging concepts' lexical features and existing IS-A relations to identify potential missing IS-A relations in SNOMED CT. To evaluate the effectiveness of our approach, we performed both automated and manual validation. For the automated validation, we leverage relations from external terminologies in the Unified Medical Language System (UMLS) to validate the identified missing IS-A relations. For the manual validation, a random sample of 100 of the identified relations was reviewed by a domain expert. Applying our approach to the March 2022 release of SNOMED CT US Edition, we identified 3,228 potential missing IS-A relations, among which 63 were validated through the UMLS. The evaluation by the domain expert revealed that 89 out of 100 missing IS-A relations (a precision of 89%) are valid cases, showing the effectiveness of this substring replacement approach in facilitating the quality assurance of IS-A relations in SNOMED CT.
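The abstract does not spell out the algorithm, so the following Python sketch shows only one plausible reading of a substring replacement approach. It assumes a deliberately simplified rule: if an existing IS-A relation links two names that differ in exactly one word, that word pair is reused on other concept names to propose new child and parent candidates. The function names and the toy concept names are hypothetical, word-level replacement stands in for general substrings, and inferred (transitive) relations are ignored.

```python
from typing import Set, Tuple

Relation = Tuple[str, str]  # (child name, parent name)


def learn_replacements(is_a: Set[Relation]) -> Set[Tuple[str, str]]:
    """Collect (old, new) word pairs that turn a child name into its parent name."""
    pairs = set()
    for child, parent in is_a:
        c_words, p_words = child.split(), parent.split()
        if len(c_words) != len(p_words):
            continue
        diffs = [(a, b) for a, b in zip(c_words, p_words) if a != b]
        if len(diffs) == 1:  # names differ in exactly one word
            pairs.add(diffs[0])
    return pairs


def suggest_missing(names: Set[str], is_a: Set[Relation]) -> Set[Relation]:
    """Apply learned replacements to every name and keep unlinked existing targets."""
    suggestions = set()
    for old, new in learn_replacements(is_a):
        for name in names:
            words = name.split()
            if old not in words:
                continue
            candidate = " ".join(new if w == old else w for w in words)
            if candidate in names and (name, candidate) not in is_a:
                suggestions.add((name, candidate))
    return suggestions


# Hypothetical toy terminology (not actual SNOMED CT content).
names = {
    "fracture of femur", "fracture of bone",
    "open fracture of femur", "open fracture of bone",
}
is_a = {
    ("fracture of femur", "fracture of bone"),
    ("open fracture of femur", "fracture of femur"),
}
missing = suggest_missing(names, is_a)
```

Candidates produced this way are only suggestions; as in the study, they would still need validation against external sources or by a domain expert.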
Affiliation(s)
- Xubing Hao
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
| | - Rashmie Abeysinghe
- Department of Neurology, The University of Texas Health Science Center at Houston, Houston, Texas, USA
| | - Jay Shi
- SCL Health Medical Group, Denver, Colorado, USA
| | - Licong Cui
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
| |
|
15
|
Turki H, Jemielniak D, Hadj Taieb MA, Labra Gayo JE, Ben Aouicha M, Banat M, Shafee T, Prud’hommeaux E, Lubiana T, Das D, Mietchen D. Using logical constraints to validate statistical information about disease outbreaks in collaborative knowledge graphs: the case of COVID-19 epidemiology in Wikidata. PeerJ Comput Sci 2022; 8:e1085. [PMID: 36262159 PMCID: PMC9575845 DOI: 10.7717/peerj-cs.1085] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/07/2022] [Accepted: 08/15/2022] [Indexed: 06/16/2023]
Abstract
Urgent global research demands real-time dissemination of precise data. Wikidata, a collaborative and openly licensed knowledge graph available in RDF format, provides an ideal forum for exchanging structured data that can be verified and consolidated using validation schemas and bot edits. In this research article, we catalog an automatable task set necessary to assess and validate the portion of Wikidata relating to COVID-19 epidemiology. These tasks assess statistical data and are implemented in SPARQL, a query language for semantic databases. We demonstrate the efficiency of our methods for evaluating structured non-relational information on COVID-19 in Wikidata, and their applicability in collaborative ontologies and knowledge graphs more broadly. We show the advantages and limitations of our proposed approach by comparing it to the features of other methods for the validation of linked web data as revealed by previous research.
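The idea of validating outbreak statistics with logical constraints can be illustrated without a triple store. The study implements its checks in SPARQL against Wikidata; the Python sketch below is a stand-in that applies the same kind of rule to in-memory records. The `Record` fields, the specific constraints, and the toy figures are assumptions chosen for illustration and are not the paper's task set.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    """One place's outbreak statistics; None means the value is not recorded."""
    place: str
    cases: Optional[int] = None
    deaths: Optional[int] = None
    recoveries: Optional[int] = None


def violations(record: Record) -> List[str]:
    """Return a description of every logical constraint the record breaks."""
    problems = []
    for field in ("cases", "deaths", "recoveries"):
        value = getattr(record, field)
        if value is not None and value < 0:
            problems.append(f"{field} is negative")
    if (record.cases is not None and record.deaths is not None
            and record.deaths > record.cases):
        problems.append("deaths exceed cases")
    if (None not in (record.cases, record.deaths, record.recoveries)
            and record.deaths + record.recoveries > record.cases):
        problems.append("deaths plus recoveries exceed cases")
    return problems


# Toy records (hypothetical figures).
records = [
    Record("Region A", cases=100, deaths=5, recoveries=80),
    Record("Region B", cases=50, deaths=60),
]
report = {r.place: violations(r) for r in records}
```

Missing values are skipped, not treated as errors, since a collaborative knowledge graph is often incomplete; a flagged record is a candidate for review, not proof of a mistake.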
Affiliation(s)
- Houcemeddine Turki
- Data Engineering and Semantics Research Unit, Faculty of Sciences of Sfax, University of Sfax, Sfax, Tunisia
| | - Dariusz Jemielniak
- Department of Management in Networked and Digital Societies, Kozminski University, Warsaw, Masovia, Poland
| | - Mohamed A. Hadj Taieb
- Data Engineering and Semantics Research Unit, Faculty of Sciences of Sfax, University of Sfax, Sfax, Tunisia
| | - Jose E. Labra Gayo
- Web Semantics Oviedo (WESO) Research Group, University of Oviedo, Oviedo, Asturias, Spain
| | - Mohamed Ben Aouicha
- Data Engineering and Semantics Research Unit, Faculty of Sciences of Sfax, University of Sfax, Sfax, Tunisia
| | - Mus’ab Banat
- Faculty of Medicine, Hashemite University, Zarqa, Jordan
| | - Thomas Shafee
- La Trobe University, Melbourne, Victoria, Australia
- Swinburne University of Technology, Melbourne, Victoria, Australia
| | - Eric Prud’hommeaux
- World Wide Web Consortium, Cambridge, Massachusetts, United States of America
| | - Tiago Lubiana
- Computational Systems Biology Laboratory, University of São Paulo, São Paulo, Brazil
| | - Diptanshu Das
- Institute of Child Health (ICH), Kolkata, West Bengal, India
- Medica Superspecialty Hospital, Kolkata, West Bengal, India
| | - Daniel Mietchen
- Ronin Institute, Montclair, New Jersey, United States of America
- Department of Evolutionary and Integrative Ecology, Leibniz Institute of Freshwater Ecology and Inland Fisheries, Berlin, Germany
- School of Data Science, University of Virginia, Charlottesville, Virginia, United States
- Institute for Globally Distributed Open Research and Education (IGDORE), Jena, Germany
| |
|
16
|
Targeting stopwords for quality assurance of SNOMED-CT. Int J Med Inform 2022; 167:104870. [PMID: 36148752 DOI: 10.1016/j.ijmedinf.2022.104870] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2022] [Revised: 09/08/2022] [Accepted: 09/12/2022] [Indexed: 11/23/2022]
Abstract
OBJECTIVE We assess the potential of exploiting stopwords in biomedical concept names to complete the logical definitions of concepts that are not sufficiently defined. METHODS Concepts containing stopwords are selected from the Disorder hierarchy of the Systematized NOmenclature of MEDicine (SNOMED-CT). SNOMED-CT consists of two types of concepts: Fully Defined (FD) concepts, which are sufficiently defined, and Partially Defined (PD) concepts, which are not sufficiently defined. In this work, FD concepts containing stopwords are treated as a source of ground truth to complete the definitions of lexically and semantically similar PD concepts. FD and PD concepts are lexically and semantically analysed to create sample-sets. Mandatory attribute-relationships are calculated by using an intersection-set logic for each FD sample-set. PD sample-sets are audited against this mandatory attribute-relationship template to identify inconsistencies in modelling styles and potentially missing attribute-relationships. RESULTS Lexical and semantic patterns around 11 stopwords were analysed. 26 sample-sets were extracted for the 11 stopwords. Mandatory attribute-relationships were identified for 24 of the 26 sample-sets. The method identified 62.5% - 72.22% of the PD concepts containing the stopwords "in" and "due to" to be inconsistent in their modelling style and potentially missing at least one attribute-relationship according to the created template.
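The intersection-set logic and the audit step described above can be sketched briefly in Python. This is a minimal illustration under assumptions: each concept is reduced to the set of attribute types in its definition, attribute values are ignored, and the concept labels and sample-sets are hypothetical toy data (only the attribute names follow SNOMED-CT naming).

```python
from typing import Dict, Set


def mandatory_attributes(fd_sample_set: Dict[str, Set[str]]) -> Set[str]:
    """Attributes shared by every fully defined concept in a sample-set."""
    attribute_sets = list(fd_sample_set.values())
    if not attribute_sets:
        return set()
    return set.intersection(*attribute_sets)


def audit(pd_sample_set: Dict[str, Set[str]],
          template: Set[str]) -> Dict[str, Set[str]]:
    """Map each partially defined concept to the template attributes it lacks."""
    return {concept: template - attributes
            for concept, attributes in pd_sample_set.items()
            if template - attributes}


# Hypothetical sample-sets for one stopword pattern.
fd_sample_set = {
    "FD concept 1": {"Due to", "Finding site"},
    "FD concept 2": {"Due to", "Finding site", "Associated morphology"},
}
pd_sample_set = {
    "PD concept 1": {"Finding site"},
    "PD concept 2": {"Due to", "Finding site"},
}

template = mandatory_attributes(fd_sample_set)
flagged = audit(pd_sample_set, template)
```

Using the intersection keeps the template conservative: an attribute becomes mandatory only if every FD concept in the sample-set carries it, so optional attributes such as the morphology above do not trigger flags.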
|
17
|
Rodler P. One step at a time: An efficient approach to query-based ontology debugging. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.108987] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
18
|
Manuel W, Abeysinghe R, He Y, Tao C, Cui L. Identification of missing hierarchical relations in the vaccine ontology using acquired term pairs. J Biomed Semantics 2022; 13:22. [PMID: 35964149 PMCID: PMC9375092 DOI: 10.1186/s13326-022-00276-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2022] [Accepted: 07/24/2022] [Indexed: 11/10/2022] Open
Abstract
Background The Vaccine Ontology (VO) is a biomedical ontology that standardizes vaccine annotation. Errors in VO will affect the many applications in which it is used. Quality assurance of VO is imperative to ensure that it provides accurate domain knowledge to these downstream tasks. Manual review to identify and fix quality issues (such as missing hierarchical is-a relations) is challenging given the complexity of the ontology. Automated approaches are highly desirable to facilitate the quality assurance of VO. Methods We developed an automated lexical approach that identifies potentially missing is-a relations in VO. First, we construct two types of VO concept-pairs: (1) linked; and (2) unlinked. Each concept-pair further derives an Acquired Term Pair (ATP) based on the lexical features of its two concepts. If the same ATP is obtained by a linked concept-pair and an unlinked concept-pair, this is considered to indicate a potentially missing is-a relation between the unlinked pair of concepts. Results Applying this approach to version 1.1.192 of VO, we were able to identify 232 potentially missing is-a relations. A manual review by a VO domain expert of a random sample of 70 potentially missing is-a relations revealed that 65 of the cases were valid missing is-a relations in VO (a precision of 92.86%). Conclusions The results indicate that our approach is highly effective in identifying missing is-a relations in VO.
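The matching of linked and unlinked concept-pairs through a shared ATP can be sketched as follows. The abstract does not define how an ATP is derived, so this Python example assumes the simplest possible rule: the ATP of a pair is the words unique to the first name together with the words unique to the second. The function names and the toy vaccine names are hypothetical, and the published method may apply further conditions that are not modelled here.

```python
from itertools import permutations
from typing import FrozenSet, Set, Tuple

Pair = Tuple[str, str]  # (candidate child, candidate parent)
ATP = Tuple[FrozenSet[str], FrozenSet[str]]


def acquired_term_pair(first: str, second: str) -> ATP:
    """Assumed ATP rule: words unique to each name after dropping shared words."""
    a, b = set(first.lower().split()), set(second.lower().split())
    return frozenset(a - b), frozenset(b - a)


def suggest_missing(concepts: Set[str], is_a: Set[Pair]) -> Set[Pair]:
    """Flag unlinked pairs whose ATP also arises from some linked pair."""
    linked_atps = {acquired_term_pair(child, parent) for child, parent in is_a}
    return {(a, b) for a, b in permutations(concepts, 2)
            if (a, b) not in is_a and acquired_term_pair(a, b) in linked_atps}


# Hypothetical toy ontology fragment (not actual VO content).
concepts = {
    "influenza vaccine", "live attenuated influenza vaccine",
    "measles vaccine", "live attenuated measles vaccine",
}
is_a = {("live attenuated influenza vaccine", "influenza vaccine")}
candidates = suggest_missing(concepts, is_a)
```

Enumerating all ordered pairs is quadratic in the number of concepts, which is acceptable for a sketch but would need indexing by ATP for an ontology of realistic size.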
Affiliation(s)
- Warren Manuel
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Rashmie Abeysinghe
- Department of Neurology, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Yongqun He
- Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, Center for Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, USA
| | - Cui Tao
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Licong Cui
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.
| |
|
19
|
Patient safety classification, taxonomy and ontology systems: A systematic review on development and evaluation methodologies. J Biomed Inform 2022; 133:104150. [PMID: 35878822 DOI: 10.1016/j.jbi.2022.104150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2021] [Revised: 06/11/2022] [Accepted: 07/19/2022] [Indexed: 11/24/2022]
Abstract
INTRODUCTION Patient safety classifications/ontologies enable patient safety information systems to receive and analyze patient safety data to improve patient safety. Patient safety classifications/ontologies have been developed and evaluated using a variety of methods. The purpose of this review was to discuss and analyze the methodologies for developing and evaluating patient safety classifications/ontologies. METHODS Studies that developed or evaluated patient safety classifications, terminologies, taxonomies, or ontologies were searched through Google Scholar, Google search engines, National Center for Biomedical Ontology (NCBO) BioPortal, Open Biological and Biomedical Ontology (OBO) Foundry and World Health Organization (WHO) websites and Scopus, Web of Science, PubMed, and Science Direct. We updated our search on 30 February 2021 and included all studies published until the end of 2020. Studies that developed or evaluated classifications only for patient safety and provided information on how they were developed or evaluated were included. Systems that cover patient safety terms (such as ICD-10) but were not specifically developed for patient safety were excluded. The quality and the risk of bias of studies were not assessed because all methodologies and criteria were intended to be covered. In addition, we analyzed the data through descriptive narrative synthesis and compared and classified the development and evaluation methods and evaluation criteria according to available development and evaluation approaches for biomedical ontologies. RESULTS We identified 84 articles that met all of the inclusion criteria, resulting in 70 classifications/ontologies, nine of which were for the general medical domain. The largest numbers of papers were published in 2010 and 2011 (8 and 7 papers, respectively). The United States (50) and Australia (23) had the most studies.
The most commonly used methods for developing classifications/ontologies included the use of existing systems (for expanding or mapping) (44) and qualitative analysis of event reports (39). The most common evaluation methods were coding or classifying some safety report samples (25), quantitative analysis of incidents based on the developed classification (24), and consensus among physicians (16). The most commonly applied evaluation criteria were reliability (27), content and face validity (9), comprehensiveness (6), usability (5), linguistic clarity (5), and impact (4). CONCLUSIONS Because each development/evaluation method has its own weaknesses and strengths, it is advised that more than one method for development or evaluation, as well as more than one evaluation criterion, should be used. To organize the processes of developing classifications/ontologies, well-established approaches such as Methontology are recommended. The most prevalent evaluation methods applied in this domain fit well with the biomedical ontology evaluation methods, but it is also advised to apply logic-based, rule-based, and natural language processing (NLP)-based evaluation approaches in combination with other evaluation approaches. This research can assist domain researchers in developing or evaluating domain ontologies using more complete methodologies. There is also a lack of reporting consistency in the literature, and the same methods or criteria were reported with different terminologies.
|
20
|
Abeysinghe R, Yang Y, Bartels M, Zheng WJ, Cui L. An evidence-based lexical pattern approach for quality assurance of Gene Ontology relations. Brief Bioinform 2022; 23:bbac122. [PMID: 35419584 PMCID: PMC9116247 DOI: 10.1093/bib/bbac122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2021] [Revised: 03/11/2022] [Accepted: 03/15/2022] [Indexed: 11/14/2022] Open
Abstract
Gene Ontology (GO) is widely used in the biological domain. It is the most comprehensive ontology providing formal representation of gene functions (GO concepts) and relations between them. However, unintentional quality defects (e.g. missing or erroneous relations) may exist in GO due to the large number of GO concepts and the complexity of GO structures. Such quality defects would impact the results of GO-based analyses and applications. In this work, we introduce a novel evidence-based lexical pattern approach for quality assurance of GO relations. We leverage two layers of evidence to suggest potentially missing relations in GO as follows. We first utilize related concept pairs (i.e. existing relations) in GO to extract relationship-specific lexical patterns, which serve as the first layer of evidence to automatically suggest potentially missing relations between unrelated concept pairs. For each suggested missing relation, we further identify two other existing relations as the second layer of evidence that resemble the difference between the missing relation and the existing relation based on which the missing relation is suggested. Applied to the 15 December 2021 release of GO, this approach suggested a total of 866 potentially missing relations. Local domain experts evaluated the entire set of potentially missing relations, and identified 821 as missing relations and 45 as indicating erroneous existing relations. We submitted these findings to the GO consortium for further validation and received encouraging feedback. These results indicate that our evidence-based approach can be utilized to uncover missing relations and erroneous existing relations in GO.
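The first layer of evidence described in this abstract can be illustrated with a small sketch. The abstract does not specify how lexical patterns are represented, so the code below assumes a deliberately simple form: a pattern is the name prefix left over when the target concept's name is a suffix of the source concept's name. The function names, the `min_support` threshold, and the toy concept names are all hypothetical, and the second layer of evidence is not modeled.

```python
from collections import Counter

def prefix_pattern(source, target):
    """Return the prefix of `source` that precedes `target`, or None.

    Assumption for illustration only: a lexical pattern is the leading
    phrase left when the target name is a suffix of the source name.
    """
    if source != target and source.endswith(" " + target):
        return source[: -len(target)].strip()
    return None

def learn_patterns(relations, min_support=2):
    """Collect (prefix, relation type) patterns seen in existing relations."""
    counts = Counter()
    for source, rel, target in relations:
        pattern = prefix_pattern(source, target)
        if pattern is not None:
            counts[(pattern, rel)] += 1
    # Keep only patterns supported by at least `min_support` existing relations.
    return {key for key, n in counts.items() if n >= min_support}

def suggest_missing(concepts, relations, patterns):
    """Suggest relations for concept pairs that match a pattern but are not yet related."""
    existing = {(s, t) for s, _, t in relations}
    suggestions = []
    for s in sorted(concepts):
        for t in sorted(concepts):
            pattern = prefix_pattern(s, t)
            if pattern is None or (s, t) in existing:
                continue
            for pat, rel in sorted(patterns):
                if pat == pattern:
                    suggestions.append((s, rel, t))
    return suggestions

# Toy data (hypothetical, not taken from a GO release).
relations = [
    ("positive regulation of cell growth", "positively_regulates", "cell growth"),
    ("positive regulation of cell death", "positively_regulates", "cell death"),
    ("negative regulation of cell growth", "negatively_regulates", "cell growth"),
]
concepts = {name for s, _, t in relations for name in (s, t)}
concepts |= {"positive regulation of cell migration", "cell migration"}

patterns = learn_patterns(relations)
suggestions = suggest_missing(concepts, relations, patterns)
print(suggestions)
```

In this toy run only the "positive regulation of" pattern reaches the support threshold, so the single suggestion links "positive regulation of cell migration" to "cell migration". Any such suggestion would still need expert review, as in the evaluation the abstract reports.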
Affiliation(s)
- Rashmie Abeysinghe
- Department of Neurology, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
| | - Yuntao Yang
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
| | - Mason Bartels
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
| | - W Jim Zheng
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
| | - Licong Cui
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
| |
|
21
|
A framework for selection of health terminology systems: A prerequisite for interoperability of health information systems. INFORMATICS IN MEDICINE UNLOCKED 2022. [DOI: 10.1016/j.imu.2022.100949] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open
|
22
|
Zhang Z, Yu P, Pai N, Chang HCR, Chen S, Yin M, Song T, Lau SK, Deng C. Developing an Intuitive Graph Representation of Knowledge for Nonpharmacological Treatment of Psychotic Symptoms in Dementia. J Gerontol Nurs 2022; 48:49-55. [PMID: 35343842 DOI: 10.3928/00989134-20220308-02] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/20/2022]
Abstract
Applying person-centered, nonpharmacological interventions to manage psychotic symptoms of dementia is promoted for health care professionals, particularly gerontological nurses, who are responsible for care of older adults in nursing homes. A knowledge graph is a graph consisting of a set of concepts that are linked together by their interrelationships, and it has been widely used as a formal representation of domain knowledge in health. However, there is a lack of a knowledge graph for nonpharmacological treatment of psychotic symptoms in dementia. Therefore, we developed a comprehensive, human- and machine-understandable knowledge graph for this domain, named Dementia-Related Psychotic Symptom Nonpharmacological Treatment Ontology (DRPSNPTO). This graph was built by adopting the established NeOn methodology, a knowledge graph engineering method, to meet the quality standards for biomedical knowledge graphs. This intuitive graph representation of the domain knowledge sets a new direction for visualizing and computerizing gerontological knowledge to facilitate human comprehension and build intelligent aged care information systems. [Journal of Gerontological Nursing, 48(4), 49-55.].
|
23
|
Lu Y, Barrett LA, Lin RZ, Amith M, Tao C, He Z. Understanding Information Needs and Barriers to Accessing Health Information Across All Stages of Pregnancy: Systematic Review. JMIR Pediatr Parent 2022; 5:e32235. [PMID: 35188477 PMCID: PMC8902674 DOI: 10.2196/32235] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 07/27/2021] [Revised: 11/15/2021] [Accepted: 12/08/2021] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND Understanding consumers' health information needs across all stages of the pregnancy trajectory is crucial to the development of mechanisms that allow them to retrieve high-quality, customized, and layperson-friendly health information. OBJECTIVE The objective of this study was to identify research gaps in pregnancy-related consumer information needs and available information from different sources. METHODS We conducted a systematic review of CINAHL, Cochrane, PubMed, and Web of Science for relevant articles that were published from 2009 to 2019. The quality of the included articles was assessed using the Critical Appraisal Skills Program. A descriptive data analysis was performed on these articles. Based on the review results, we developed the Pregnancy Information Needs Ontology (PINO) and made it publicly available on GitHub and BioPortal. RESULTS A total of 33 articles from 9 countries met the inclusion criteria for this review, of which the majority were published no earlier than 2016. Most studies were either descriptive (9/33, 27%), interviews (7/33, 21%), or surveys/questionnaires (7/33, 21%); 20 articles mentioned consumers' pregnancy-related information needs. Half (9/18, 50%) of the human-subject studies were conducted in the United States. More than a third (13/33, 39%) of all studies focused on the during-pregnancy stage; only one study (1/33, 3%) was about all stages of pregnancy. The most frequent consumer information needs were related to labor delivery (9/20, 45%), medication in pregnancy (6/20, 30%), newborn care (5/20, 25%), and lab tests (6/20, 30%). The most frequently available source of information was the internet (15/24, 63%). PINO consists of 267 classes, 555 axioms, and 271 subclass relationships. CONCLUSIONS Only a few articles assessed the barriers to access to pregnancy-related information and the quality of each source of information; further work is needed. 
Future work is also needed to address the gaps between the information needed and the information available.
Affiliation(s)
- Yu Lu
- School of Information, Florida State University, Tallahassee, FL, United States
| | - Laura A Barrett
- School of Information, Florida State University, Tallahassee, FL, United States
| | - Rebecca Z Lin
- Washington University School of Medicine at St. Louis, St. Louis, MO, United States
| | - Muhammad Amith
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, United States
| | - Cui Tao
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, United States
| | - Zhe He
- School of Information, Florida State University, Tallahassee, FL, United States
| |
|
24
|
Abeysinghe R, Zheng F, Cui L. A Comparison of Exhaustive and Non-lattice-based Methods for Auditing Hierarchical Relations in Gene Ontology. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2022; 2021:177-186. [PMID: 35308995 PMCID: PMC8861660] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Uncovering and fixing errors in biomedical terminologies is essential so that they provide accurate knowledge to downstream applications that rely on them. Non-lattice-based methods have been applied to identify various kinds of inconsistencies in different biomedical terminologies. In previous work, we introduced two inference-based approaches that were applied in an exhaustive manner to audit hierarchical relations in the Gene Ontology: (1) Lexical-based inference framework, and (2) Subsumption-based sub-term inference framework. However, it is unclear how effectively these exhaustive approaches perform compared with their corresponding non-lattice-based approaches. Therefore, in this paper, we implement the non-lattice versions of these two exhaustive approaches, and perform a comprehensive comparison between non-lattice-based and exhaustive approaches to audit the Gene Ontology. The domain expert evaluations performed for the two exhaustive approaches are leveraged to evaluate the non-lattice versions. The results indicate that the non-lattice versions have higher precision than their exhaustive counterparts even though they do not capture some of the potential inconsistencies that the exhaustive approaches identify.
Affiliation(s)
- Rashmie Abeysinghe
- Department of Neurology, University of Texas Health Science Center at Houston, Houston, TX
| | - Fengbo Zheng
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX
| | - Licong Cui
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX
| |
|
25
|
Azzi S, Michalowski W, Iglewski M. Developing a pneumonia diagnosis ontology from multiple knowledge sources. Health Informatics J 2022; 28:14604582221083850. [PMID: 35377253 DOI: 10.1177/14604582221083850] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
Abstract
Background: Pneumonia is difficult to differentiate from other pulmonary diseases because it shares many symptoms with these diseases. Diagnosing pneumonia in clinical practice would benefit from having access to a codified representation of clinical knowledge. An ontology represents a well-established paradigm for such codification. Objectives: The goal of this research is to create the Pneumonia Diagnosis Ontology (PNADO), which brings together the medical knowledge dispersed among multiple medical knowledge sources. Material and Methods: We used several clinical practice guidelines (CPGs) describing the pneumonia diagnostic process as a starting point in developing PNADO. The preliminary version of PNADO was subsequently expanded to cover a broader range of concepts by reusing ontologies from the Open Biological and Biomedical Ontology (OBO) Foundry and BioPortal. PNADO was evaluated by examining relevant concepts from pneumonia-specific systematic reviews, using patient data from the MIMIC-III clinical dataset, and by clinical domain experts. Results: PNADO is a comprehensive ontology and has a rich set of classes and properties that cover different types of pneumonia, pathogens, symptoms, clinical signs, laboratory tests and imaging, clinical findings, complications, and diagnoses. Conclusion: PNADO unifies pneumonia diagnostic concepts from multiple knowledge sources. It is available in the BioPortal repository.
|
26
|
Hao X, Abeysinghe R, Zheng F, Cui L. Leveraging non-lattice subgraphs for suggestion of new concepts for SNOMED CT. PROCEEDINGS. IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE 2021; 2021:1805-1812. [PMID: 35291311 PMCID: PMC8919474 DOI: 10.1109/bibm52615.2021.9669407] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 05/25/2023]
Abstract
Missing hierarchical is-a relations and missing concepts are common quality issues in biomedical ontologies. Non-lattice subgraphs have been extensively studied for automatically identifying missing is-a relations in biomedical ontologies like SNOMED CT. However, little is known about non-lattice subgraphs' capability to uncover new or missing concepts in biomedical ontologies. In this work, we investigate a lexical-based intersection approach based on non-lattice subgraphs to identify potential missing concepts in SNOMED CT. We first construct lexical features of concepts using their fully specified names. Then we generate hierarchically unrelated concept pairs in non-lattice subgraphs as the candidates to derive new concepts. For each candidate pair of concepts, we conduct an order-preserving intersection based on the two concepts' lexical features, with the intersection result serving as the potential new concept name suggested. We further perform automatic validation through terminologies in the Unified Medical Language System (UMLS) and literature in PubMed. Applying this approach to the March 2021 release of SNOMED CT US Edition, we obtained 7,702 potential missing concepts, among which 1,288 were validated through UMLS and 1,309 were validated through PubMed. The results showed that non-lattice subgraphs have the potential to facilitate suggestion of new concepts for SNOMED CT.
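The order-preserving intersection step in this abstract can be sketched in a few lines. The abstract does not define the lexical features or the intersection precisely, so this sketch assumes that features are the lowercased words of a fully specified name with the trailing semantic tag removed, and that the intersection keeps the shared words in the order they appear in the first name. The function names and the two example names are hypothetical, not drawn from a SNOMED CT release.

```python
import re

def lexical_features(fully_specified_name: str) -> list[str]:
    """Lowercased word tokens of a name, without a trailing tag such as '(disorder)'."""
    name = re.sub(r"\s*\([^()]*\)\s*$", "", fully_specified_name)
    return re.findall(r"[a-z0-9]+", name.lower())

def order_preserving_intersection(a: list[str], b: list[str]) -> list[str]:
    """Words of `a` that also occur in `b`, kept in the order they appear in `a`."""
    b_words = set(b)
    seen = set()
    shared = []
    for word in a:
        if word in b_words and word not in seen:
            shared.append(word)
            seen.add(word)
    return shared

def suggest_concept_name(fsn_a: str, fsn_b: str) -> str:
    """Candidate name for a potentially missing concept shared by two concepts."""
    shared = order_preserving_intersection(lexical_features(fsn_a), lexical_features(fsn_b))
    return " ".join(shared)

# Hypothetical pair of hierarchically unrelated concepts.
candidate = suggest_concept_name(
    "Fracture of left femur (disorder)",
    "Open fracture of femur (disorder)",
)
print(candidate)  # fracture of femur
```

A candidate produced this way is only a name suggestion. In the study it would still have to pass the automatic validation against UMLS terminologies and PubMed literature described in the abstract.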
Affiliation(s)
- Xubing Hao
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
| | - Rashmie Abeysinghe
- Department of Neurology, The University of Texas Health Science Center at Houston, Houston, Texas, USA
| | - Fengbo Zheng
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
| | - Licong Cui
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
| |
|
27
|
Wan L, Song J, He V, Roman J, Whah G, Peng S, Zhang L, He Y. Development of the International Classification of Diseases Ontology (ICDO) and its application for COVID-19 diagnostic data analysis. BMC Bioinformatics 2021; 22:508. [PMID: 34663204 PMCID: PMC8522253 DOI: 10.1186/s12859-021-04402-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/09/2021] [Accepted: 09/24/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The 10th and 9th revisions of the International Statistical Classification of Diseases and Related Health Problems (ICD10 and ICD9) have been adopted worldwide as a well-recognized norm to share codes for diseases, signs and symptoms, abnormal findings, etc. The international Consortium for Clinical Characterization of COVID-19 by EHR (4CE) website stores COVID-19 diagnosis data using ICD10 and ICD9 codes. However, the ICD systems are difficult to decode due to their many shortcomings, which can be addressed using ontology. METHODS An ICD ontology (ICDO) was developed to logically and scientifically represent ICD terms and the relations among them. ICDO is also aligned with the Basic Formal Ontology (BFO) and reuses terms from existing ontologies. As a use case, the ICD10 and ICD9 diagnosis data from the 4CE website were extracted, mapped to ICDO, and analyzed using ICDO. RESULTS We have developed the ICDO to ontologize the ICD terms and relations. Different from existing disease ontologies, all ICD diseases in ICDO are defined as disease processes to describe their occurrence with other properties. The ICDO decomposes each disease term into different components, including anatomic entities, process profiles, etiological causes, output phenotype, etc. Over 900 ICD terms have been represented in ICDO. Many ICDO terms are presented in both English and Chinese. The ICD10/ICD9-based diagnosis data of over 27,000 COVID-19 patients from 5 countries were extracted from the 4CE. A total of 917 COVID-19-related disease codes, each of which was associated with 1 or more cases in the 4CE dataset, were mapped to ICDO and further analyzed using the ICDO logical annotations. Our study showed that COVID-19 targeted multiple systems and organs such as the lung, heart, and kidney. Different acute and chronic kidney phenotypes were identified. Some kidney diseases appeared to result from other diseases, such as diabetes. 
Some of the findings could only be easily found using ICDO instead of ICD9/10. CONCLUSIONS ICDO was developed to ontologize ICD10/ICD9 codes and applied to study COVID-19 patient diagnosis data. Our findings showed that ICDO provides a semantic platform for more accurate detection of disease profiles.
Affiliation(s)
- Ling Wan
- University of Michigan Medical School, Ann Arbor, MI 48109 USA
- OntoWise, Nanjing, Jiangsu China
| | - Justin Song
- Cranbrook Kingswood Upper School, Bloomfield Hills, MI 48304 USA
| | | | - Jennifer Roman
- College of Literacy, Science, and Arts, University of Michigan, Ann Arbor, MI 48109 USA
| | - Grace Whah
- College of Engineering, University of Michigan, Ann Arbor, MI 48109 USA
| | - Suyuan Peng
- School of Public Health, Peking University, Beijing, China
- National Institute of Health Data Science, Peking University, Beijing, China
| | - Luxia Zhang
- National Institute of Health Data Science, Peking University, Beijing, China
- Advanced Institute of Information Technology, Peking University, Hangzhou, China
- Renal Division, Department of Medicine, Peking University First Hospital, Peking University Institute of Nephrology, Beijing, China
| | - Yongqun He
- University of Michigan Medical School, Ann Arbor, MI 48109 USA
| |
|
28
|
Prediction of Bladder Cancer Treatment Side Effects Using an Ontology-Based Reasoning for Enhanced Patient Health Safety. INFORMATICS 2021. [DOI: 10.3390/informatics8030055] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/16/2023] Open
Abstract
Predicting potential cancer treatment side effects at the time of prescription could decrease potential health risks and achieve better patient satisfaction. This paper presents a new approach, founded on evidence-based medical knowledge, that uses as much information and proof as possible to help a computer program predict bladder cancer treatment side effects and support the oncologist’s decision. This will help in deciding treatment options for patients with bladder malignancies. Bladder cancer knowledge is complex and requires simplification before any attempt to represent it in a formal or computerized manner. In this work we rely on the capabilities of OWL ontologies to seamlessly capture and conceptualize the required knowledge about this type of cancer and the underlying patient treatment process. Our ontology allows case-based reasoning to effectively predict treatment side effects for a given set of contextual information related to a specific medical case. The ontology is enriched with proofs and evidence collected from online biomedical research databases using “web crawlers”. We designed the crawler algorithm specifically to search for the required knowledge based on a set of specified keywords. In the study, the approach predicted 80.3% of the actually reported bladder cancer treatment side effects, and its predictions were close to the adverse events recorded in the collected test samples. Evidence-based medicine combined with semantic knowledge-based models is promising for generating predictions related to possible health concerns. The integration of a diversity of knowledge and evidence into one single integrated knowledge base could dramatically enhance the process of predicting treatment risks and side effects applied to bladder cancer oncotherapy.
|
29
|
Mazzotti DR. Landscape of biomedical informatics standards and terminologies for clinical sleep medicine research: A systematic review. Sleep Med Rev 2021; 60:101529. [PMID: 34455108 DOI: 10.1016/j.smrv.2021.101529] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 12/09/2020] [Revised: 05/14/2021] [Accepted: 07/03/2021] [Indexed: 12/31/2022]
Abstract
A systematic literature review was conducted to understand the current landscape of standards and terminologies used in clinical sleep medicine. Literature search on PubMed, EMBASE, Medline and Web of Science was performed in March 2021 using terms related to sleep, terminologies, standards, harmonization, semantics, ontology, and electronic health records (EHR). Systematic review was carried out according to PRISMA. Among 128 included studies, 35 were eligible for review. Articles were broadly classified into six topics: standard terminology efforts, reporting standards, databases and resources, data integration efforts, EHR abstraction and standards for automated sleep scoring. This review highlights the progress and challenges related to establishing computable terminologies in sleep medicine, and identifies gaps, limitations and research opportunities related to data integration that could improve adoption of clinical research informatics in this field. There is a need for the systematic adoption of standardized terminologies in all areas of sleep medicine. Existing data aggregation resources could be leveraged to support the development of an integrated infrastructure and subsequent deployment in EHR systems within sleep centers. Ultimately, the adoption of standardized practices for documenting sleep disorders and related traits facilitates data sharing, thus accelerating discovery and clinical translation of informatics approaches applied to sleep medicine.
Affiliation(s)
- Diego R Mazzotti
- Division of Medical Informatics, Department of Internal Medicine, University of Kansas Medical Center, Kansas City, KS, USA.
| |
|
30
|
Abstract
Ontologies are widely used nowadays. However, the plethora of ontologies currently available online makes it difficult to identify which ontologies are appropriate for a given task and to decide on their quality characteristics. This is further complicated by the fact that multiple quality criteria have been proposed for ontologies, making it even more difficult to decide which ontology to adopt. In this context, we present Delta, a modular online tool for analyzing and evaluating ontologies. The interested user can upload an ontology to the tool, which then automatically analyzes it and graphically visualizes numerous statistics, metrics, and pitfalls. The visuals presented cover a diverse set of quality dimensions, helping users understand the benefits and drawbacks of each individual ontology and how to properly develop and extend it.
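As a rough illustration of the kind of structural statistics such a tool might report, the sketch below computes a few simple metrics over a toy class hierarchy. It does not reproduce Delta's actual metrics or interface, which the abstract does not list. The `ontology_metrics` function, the metric names, and the toy hierarchy are assumptions made for this example, and the hierarchy is assumed to be acyclic.

```python
def ontology_metrics(parents):
    """Compute simple structural metrics for a class hierarchy.

    `parents` maps each class name to the list of its direct superclasses.
    The hierarchy is assumed to be acyclic (illustrative sketch only).
    """
    classes = set(parents) | {p for ps in parents.values() for p in ps}
    depth_cache = {}

    def depth(cls):
        # Length of the longest superclass chain from `cls` up to a root.
        if cls not in depth_cache:
            supers = parents.get(cls, [])
            depth_cache[cls] = 1 + max(depth(p) for p in supers) if supers else 0
        return depth_cache[cls]

    return {
        "classes": len(classes),
        "subclass_axioms": sum(len(ps) for ps in parents.values()),
        "roots": sum(1 for c in classes if not parents.get(c)),
        "max_depth": max((depth(c) for c in classes), default=0),
        "multiple_inheritance": sum(1 for ps in parents.values() if len(ps) > 1),
    }

# Toy hierarchy (hypothetical class names).
toy = {
    "Disease": [],
    "Pneumonia": ["Disease"],
    "Viral disease": ["Disease"],
    "Viral pneumonia": ["Pneumonia", "Viral disease"],
}
metrics = ontology_metrics(toy)
print(metrics)
```

For the toy hierarchy this yields 4 classes, 4 subclass axioms, 1 root, a maximum depth of 2, and 1 class with multiple parents. A real evaluation would read these structures from an OWL file rather than a hand-written dictionary.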
|
31
|
Zhang H, Hu H, Diller M, Hogan WR, Prosperi M, Guo Y, Bian J. Semantic standards of external exposome data. ENVIRONMENTAL RESEARCH 2021; 197:111185. [PMID: 33901445 PMCID: PMC8597904 DOI: 10.1016/j.envres.2021.111185] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 11/28/2020] [Revised: 03/25/2021] [Accepted: 04/12/2021] [Indexed: 05/21/2023]
Abstract
An individual's health and conditions are associated with a complex interplay between the individual's genetics and his or her exposures to both internal and external environments. Much attention has been placed on characterizing the genome in the past; nevertheless, genetics account for only about 10% of an individual's health conditions, while the remainder appears to be determined by environmental factors and gene-environment interactions. To comprehensively understand the causes of diseases and prevent them, environmental exposures, especially the external exposome, need to be systematically explored. However, the heterogeneity of the external exposome data sources (e.g., the same exposure variables using different nomenclature in different data sources, or vice versa, two variables having the same or similar name but measuring different exposures in reality) increases the difficulty of analyzing and understanding the associations between environmental exposures and health outcomes. To solve the issue, the development of semantic standards using an ontology-driven approach is inevitable because ontologies can (1) provide an unambiguous and consistent understanding of the variables in heterogeneous data sources, and (2) explicitly express and model the context of the variables and relationships between those variables. We conducted a review of existing ontologies for the external exposome and found only four relevant ontologies. Further, the four existing ontologies are limited: they (1) often ignored the spatiotemporal characteristics of external exposome data, and (2) were developed in isolation from other conceptual frameworks (e.g., the socioecological model and the social determinants of health). Moving forward, the combination of multi-domain and multi-scale data (i.e., genome, phenome and exposome at different granularity) and different conceptual frameworks is the basis of health outcomes research in the future.
Affiliation(s)
- Hansi Zhang
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA
| | - Hui Hu
- Department of Epidemiology, College of Public Health and Health Professions and College of Medicine, University of Florida, Gainesville, FL, USA
| | - Matthew Diller
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA
| | - William R Hogan
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA
| | - Mattia Prosperi
- Department of Epidemiology, College of Public Health and Health Professions and College of Medicine, University of Florida, Gainesville, FL, USA; Cancer Informatics Shared Resource, University of Florida Health Cancer Center, Gainesville, FL, USA
| | - Yi Guo
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA; Cancer Informatics Shared Resource, University of Florida Health Cancer Center, Gainesville, FL, USA
| | - Jiang Bian
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA; Cancer Informatics Shared Resource, University of Florida Health Cancer Center, Gainesville, FL, USA.
| |
|
32
|
Xie J, Zi W, Li Z, He Y. Ontology-based Precision Vaccinology for Deep Mechanism Understanding and Precision Vaccine Development. Curr Pharm Des 2021; 27:900-910. [PMID: 33238868 DOI: 10.2174/1381612826666201125112131] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/31/2020] [Accepted: 10/08/2020] [Indexed: 11/22/2022]
Abstract
Vaccination is one of the most important innovations in human history. It has also become a hot research area in a new application - the development of new vaccines against non-infectious diseases such as cancers. However, effective and safe vaccines still do not exist for many diseases, and where vaccines exist, their protective immune mechanisms are often unclear. Although licensed vaccines are generally safe, various adverse events, and sometimes severe adverse events, still occur in a small population. Precision medicine tailors medical intervention to the personal characteristics of individual patients or sub-populations of individuals with similar immunity-related characteristics. Precision vaccinology is a new strategy that applies precision medicine to the development, administration, and post-administration analysis of vaccines. Several conditions contribute to making this the right time to embark on the development of precision vaccinology. First, the increased level of research in vaccinology has generated voluminous "big data" repositories of vaccinology data. Second, new technologies such as multi-omics and immunoinformatics bring new methods for investigating vaccines and immunology. Finally, the advent of AI and machine learning software now makes possible the marriage of Big Data to the development of new vaccines in ways not possible before. However, something is missing in this marriage, and that is a common language that facilitates the correlation, analysis, and reporting nomenclature for the field of vaccinology. Solving this bioinformatics problem is the domain of applied biomedical ontology. Ontology in the informatics field is a human- and machine-interpretable representation of entities and the relations among entities in a specific domain. 
The Vaccine Ontology (VO) and Ontology of Vaccine Adverse Events (OVAE) have been developed to support the standard representation of vaccines, vaccine components, vaccinations, host responses, and vaccine adverse events. Many other biomedical ontologies have also been developed and can be applied in vaccine research. Here, we review the current status of precision vaccinology and how ontological development will enhance this field, and propose an ontology-based precision vaccinology strategy to support precision vaccine research and development.
Affiliation(s)
- Jiangan Xie
- Chongqing Engineering Research Center of Medical Electronics and Information Technology, School of Bioinformatics, Chongqing University of Posts and Telecommunications, Chongqing, China
| | - Wenrui Zi
- Chongqing Engineering Research Center of Medical Electronics and Information Technology, School of Bioinformatics, Chongqing University of Posts and Telecommunications, Chongqing, China
| | - Zhangyong Li
- Chongqing Engineering Research Center of Medical Electronics and Information Technology, School of Bioinformatics, Chongqing University of Posts and Telecommunications, Chongqing, China
| | - Yongqun He
- Unit of Laboratory Animal Medicine, Department of Microbiology and Immunology, Center of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, Michigan, United States
| |
|
33
|
Naghizadeh A, Salamat M, Hamzeian D, Akbari S, Rezaeizadeh H, Vaghasloo MA, Karbalaei R, Mirzaie M, Karimi M, Jafari M. IrGO: Iranian traditional medicine General Ontology and knowledge base. J Biomed Semantics 2021; 12:9. [PMID: 33863373 PMCID: PMC8052758 DOI: 10.1186/s13326-021-00237-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/25/2019] [Accepted: 03/04/2021] [Indexed: 11/22/2022] Open
Abstract
Background Iranian traditional medicine, also known as Persian Medicine, is a holistic school of medicine with a long prolific history. It describes numerous concepts and the relationships between them. However, no unified language system has been proposed for the concepts of this medicine up to the present time. Considering the extensive terminology in the numerous textbooks written by the scholars over centuries, comprehending the totality of concepts is obviously a very challenging task. To resolve this issue, overcome the obstacles, and code the concepts in a reusable manner, constructing an ontology of the concepts of Iranian traditional medicine seems a necessity. Construction and content Makhzan al-Advieh, an encyclopedia of materia medica compiled by Mohammad Hossein Aghili Khorasani, was selected as the resource to create an ontology of the concepts used to describe medicinal substances. The steps followed to accomplish this task included (1) compiling the list of classes via examination of textbooks, and text mining the resource followed by manual review to ensure comprehensiveness of extracted terms; (2) arranging the classes in a taxonomy; (3) determining object and data properties; (4) specifying annotation properties including ID, labels (English and Persian), alternative terms, and definitions (English and Persian); (5) ontology evaluation. The ontology was created using Protégé with adherence to the principles of ontology development provided by the Open Biological and Biomedical Ontology (OBO) foundry. Utility and discussion The ontology was finalized with inclusion of 3521 classes, 15 properties, and 20,903 axioms in the Iranian traditional medicine General Ontology (IrGO) database, freely available at http://ir-go.net/. An indented list and an interactive graph view using WebVOWL were used to visualize the ontology. All classes were linked to their instances in UNaProd database to create a knowledge base of ITM materia medica. 
Conclusion We constructed an ontology-based knowledge base of ITM concepts in the domain of materia medica to help offer a shared and common understanding of this concept, enable reuse of the knowledge, and make the assumptions explicit. This ontology will aid Persian medicine practitioners in clinical decision-making to select drugs. Extending IrGO will bridge the gap between traditional and conventional schools of medicine, helping guide future research in the process of drug discovery.
Affiliation(s)
- Ayeh Naghizadeh
- Department of Traditional Medicine, School of Persian Medicine, Tehran University of Medical Sciences, Tehran, Iran
| | - Mahdi Salamat
- Department of Traditional Medicine, School of Persian Medicine, Tehran University of Medical Sciences, Tehran, Iran
| | - Donya Hamzeian
- Department of Traditional Medicine, School of Persian Medicine, Tehran University of Medical Sciences, Tehran, Iran
| | - Shaghayegh Akbari
- Department of Traditional Medicine, School of Persian Medicine, Tehran University of Medical Sciences, Tehran, Iran
| | - Hossein Rezaeizadeh
- Department of Traditional Medicine, School of Persian Medicine, Tehran University of Medical Sciences, Tehran, Iran
| | - Mahdi Alizadeh Vaghasloo
- Department of Traditional Medicine, School of Persian Medicine, Tehran University of Medical Sciences, Tehran, Iran
| | | | - Mehdi Mirzaie
- Department of Applied Mathematics, Faculty of Mathematical Sciences, Tarbiat Modarres University, Jalal Ale Ahmad Highway, Tehran, Iran
| | - Mehrdad Karimi
- Department of Traditional Medicine, School of Persian Medicine, Tehran University of Medical Sciences, Tehran, Iran
| | - Mohieddin Jafari
- Department of Traditional Medicine, School of Persian Medicine, Tehran University of Medical Sciences, Tehran, Iran.
| |
|
34
|
Zuo X, Li J, Zhao B, Zhou Y, Dong X, Duke J, Natarajan K, Hripcsak G, Shah N, Banda JM, Reeves R, Miller T, Xu H. Normalizing Clinical Document Titles to LOINC Document Ontology: an Initial Study. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2021; 2020:1441-1450. [PMID: 33936520 PMCID: PMC8075502] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
The normalization of clinical documents is essential for health information management with the enormous amount of clinical documentation generated each year. The LOINC Document Ontology (DO) is a universal clinical document standard in a hierarchical structure. The objective of this study is to investigate the feasibility and generalizability of LOINC DO by mapping from clinical note titles across five institutions to five DO axes. We first developed an annotation framework based on the definition of LOINC DO axes and manually mapped 4,000 titles. Then we introduced a pre-trained deep learning model named Bidirectional Encoder Representations from Transformers (BERT) to enable automatic mapping from titles to LOINC DO axes. The results showed that the BERT-based automatic mapping achieved improved performance compared with the baseline model. By analyzing both manual annotations and predicted results, ambiguities in LOINC DO axes definition were discussed.
Affiliation(s)
- Xu Zuo
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Jianfu Li
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Bo Zhao
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Yujia Zhou
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Xiao Dong
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Jon Duke
- Georgia Institute of Technology, Atlanta, GA, USA
- OHDSI Consortium, Natural Language Processing Working Group
| | - Karthik Natarajan
- Columbia University, New York City, NY, USA
- OHDSI Consortium, Natural Language Processing Working Group
| | - George Hripcsak
- Columbia University, New York City, NY, USA
- OHDSI Consortium, Natural Language Processing Working Group
| | - Nigam Shah
- Stanford University, Stanford, CA, USA
- OHDSI Consortium, Natural Language Processing Working Group
| | - Juan M Banda
- Georgia State University, Atlanta, GA, USA
- OHDSI Consortium, Natural Language Processing Working Group
| | - Ruth Reeves
- Department of Veterans Affairs, Tennessee Valley Healthcare System, Nashville, TN, USA
- OHDSI Consortium, Natural Language Processing Working Group
| | - Timothy Miller
- Boston Children's Hospital, Boston, MA, USA
- OHDSI Consortium, Natural Language Processing Working Group
| | - Hua Xu
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
- OHDSI Consortium, Natural Language Processing Working Group
| |
|
35
|
Agrawal A, Cui L. Quality assurance and enrichment of biological and biomedical ontologies and terminologies. BMC Med Inform Decis Mak 2020; 20:301. [PMID: 33319696 PMCID: PMC7737253 DOI: 10.1186/s12911-020-01342-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/19/2022] Open
Abstract
Biological and biomedical ontologies and terminologies are used to organize and store various domain-specific knowledge to provide standardization of terminology usage and to improve interoperability. The growing number of such ontologies and terminologies and their increasing adoption in clinical, research and healthcare settings call for effective and efficient quality assurance and semantic enrichment techniques of these ontologies and terminologies. In this editorial, we provide an introductory summary of nine articles included in this supplement issue for quality assurance and enrichment of biological and biomedical ontologies and terminologies. The articles cover a range of standards including SNOMED CT, National Cancer Institute Thesaurus, Unified Medical Language System, North American Association of Central Cancer Registries and OBO Foundry Ontologies.
Affiliation(s)
- Ankur Agrawal
- Department of Computer Science, Manhattan College, New York, USA
| | - Licong Cui
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA.
| |
|
36
|
He Z, Tao C, Bian J, Zhang R. Selected articles from the Fourth International Workshop on Semantics-Powered Data Mining and Analytics (SEPDA 2019). BMC Med Inform Decis Mak 2020; 20:315. [PMID: 33317524 PMCID: PMC7734704 DOI: 10.1186/s12911-020-01292-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022] Open
Abstract
In this introduction, we first summarize the Fourth International Workshop on Semantics-Powered Data Mining and Analytics (SEPDA 2019) held on October 26, 2019 in conjunction with the 18th International Semantic Web Conference (ISWC 2019) in Auckland, New Zealand, and then briefly introduce seven research articles included in this supplement issue, covering the topics on Knowledge Graph, Ontology-Powered Analytics, and Deep Learning.
Affiliation(s)
- Zhe He
- School of Information, College of Communication and Information, Florida State University, 142 Collegiate Loop, Tallahassee, FL, 32306-2100, USA.
| | - Cui Tao
- School of Biomedical Informatics, University of Texas Health Science Center At Houston, Houston, TX, USA
| | - Jiang Bian
- Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
| | - Rui Zhang
- Institute for Health Informatics and College of Pharmacy, University of Minnesota, Minneapolis, MN, USA
| |
|
37
|
Zheng L, He Z, Wei D, Keloth V, Fan JW, Lindemann L, Zhu X, Cimino JJ, Perl Y. A review of auditing techniques for the Unified Medical Language System. J Am Med Inform Assoc 2020; 27:1625-1638. [PMID: 32766692 PMCID: PMC7566540 DOI: 10.1093/jamia/ocaa108] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 02/10/2020] [Revised: 05/05/2020] [Accepted: 05/13/2020] [Indexed: 11/12/2022] Open
Abstract
OBJECTIVE The study sought to describe the literature related to the development of methods for auditing the Unified Medical Language System (UMLS), with particular attention to identifying errors and inconsistencies of attributes of the concepts in the UMLS Metathesaurus. MATERIALS AND METHODS We applied the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) approach by searching the MEDLINE database and Google Scholar for studies referencing the UMLS and any of several terms related to auditing, error detection, and quality assurance. A qualitative analysis and summarization of articles that met inclusion criteria were performed. RESULTS Eighty-three studies were reviewed in detail. We first categorized techniques based on various aspects including concepts, concept names, and synonymy (n = 37), semantic type assignments (n = 36), hierarchical relationships (n = 24), lateral relationships (n = 12), ontology enrichment (n = 8), and ontology alignment (n = 18). We also categorized the methods according to their level of automation (ie, automated systematic, automated heuristic, or manual) and the type of knowledge used (ie, intrinsic or extrinsic knowledge). CONCLUSIONS This study is a comprehensive review of the published methods for auditing the various conceptual aspects of the UMLS. Categorizing the auditing techniques according to the various aspects will give the curators of the UMLS, as well as researchers, comprehensive and easy access to this wealth of knowledge (eg, for auditing lateral relationships in the UMLS). We also reviewed ontology enrichment and alignment techniques due to their critical use of and impact on the UMLS.
Affiliation(s)
- Ling Zheng
- Department of Computer Science and Software Engineering, Monmouth University, West Long Branch, New Jersey, USA
| | - Zhe He
- School of Information, Florida State University, Tallahassee, Florida, USA
| | - Duo Wei
- School of Business, Stockton University, Galloway, New Jersey, USA
| | - Vipina Keloth
- Department of Computer Science, New Jersey Institute of Technology, Newark, New Jersey, USA
| | - Jung-Wei Fan
- Division of Digital Health Sciences, Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, USA
| | - Luke Lindemann
- Center for Biomedical Data Science, Yale School of Medicine, New Haven, Connecticut, USA
| | - Xinxin Zhu
- Center for Biomedical Data Science, Yale School of Medicine, New Haven, Connecticut, USA
| | - James J Cimino
- Informatics Institute, University of Alabama at Birmingham, Birmingham, Alabama, USA
| | - Yehoshua Perl
- Department of Computer Science, New Jersey Institute of Technology, Newark, New Jersey, USA
| |
|
38
|
Zhang Z, Yu P, Chang HCR, Lau SK, Tao C, Wang N, Yin M, Deng C. Developing an ontology for representing the domain knowledge specific to non-pharmacological treatment for agitation in dementia. ALZHEIMER'S & DEMENTIA (NEW YORK, N. Y.) 2020; 6:e12061. [PMID: 32995470 PMCID: PMC7507392 DOI: 10.1002/trc2.12061] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 02/24/2020] [Revised: 06/19/2020] [Accepted: 07/09/2020] [Indexed: 11/12/2022]
Abstract
INTRODUCTION A large volume of clinical care data has been generated for managing agitation in dementia. However, the valuable information in these data has not been used effectively to generate insights for improving the quality of care. Application of artificial intelligence technologies offers us enormous opportunities to reuse these data. For health data science to achieve this, this study focuses on using an ontology to encode clinical knowledge for non-pharmacological treatment of agitation in a machine-readable format. METHODS The resultant ontology, the Dementia-Related Agitation Non-Pharmacological Treatment Ontology (DRANPTO), was developed using a method adopted from the NeOn methodology. RESULTS DRANPTO consisted of 569 concepts and 48 object properties. It meets the standards for biomedical ontology. DISCUSSION DRANPTO is the first comprehensive semantic representation of non-pharmacological management for agitation in dementia in the long-term care setting. As a knowledge base, it will play a vital role in facilitating the development of intelligent systems for managing agitation in dementia.
Affiliation(s)
- Zhenyu Zhang
- Centre for Digital Transformation, School of Computing and Information Technology, University of Wollongong, Wollongong, New South Wales, Australia
| | - Ping Yu
- Centre for Digital Transformation, School of Computing and Information Technology, University of Wollongong, Wollongong, New South Wales, Australia
- Illawarra Health and Medical Research Institute, Wollongong, New South Wales, Australia
| | - Hui Chen Rita Chang
- Illawarra Health and Medical Research Institute, Wollongong, New South Wales, Australia
- School of Nursing, University of Wollongong, Wollongong, New South Wales, Australia
| | - Sim Kim Lau
- Centre for Digital Transformation, School of Computing and Information Technology, University of Wollongong, Wollongong, New South Wales, Australia
| | - Cui Tao
- School of Biomedical Informatics, University of Texas Health Science Center, Houston, Texas, USA
| | - Ning Wang
- PR China Southern Centre for Evidence Based Nursing and Midwifery Practice, School of Nursing, Southern Medical University, Guangzhou City, PR China
| | - Mengyang Yin
- Systems and Reporting, Residential Care, Catholic Healthcare Ltd, Macquarie Park, New South Wales, Australia
| | - Chao Deng
- Illawarra Health and Medical Research Institute, Wollongong, New South Wales, Australia
- School of Medicine, University of Wollongong, Wollongong, New South Wales, Australia
| |
|
39
|
Tiwari S, Abraham A. Semantic assessment of smart healthcare ontology. INTERNATIONAL JOURNAL OF WEB INFORMATION SYSTEMS 2020. [DOI: 10.1108/ijwis-05-2020-0027] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/17/2022]
Abstract
Purpose Health-care ontologies and their terminologies play a vital role in knowledge representation and data integration for health information. In health-care systems, Internet of Things (IoT) technologies provide data exchange among various entities, and ontologies offer a formal description to present the knowledge of health-care domains. The quality of these ontologies should be assured before they are adopted and applied in the real world. Design/methodology/approach Ontology assessment is an integral part of ontology construction and maintenance. It is performed by experts during ontology development to identify inconsistencies and modeling errors. A smart health-care ontology (SHCO) has been designed to deal with health-care information and IoT devices. In this paper, an integrated approach is proposed to assess the SHCO with different assessment tools: Themis, TDDonto (Test-Driven Development), Protégé and OOPS!. Several test cases are framed to assess the ontology with these tools. Themis and TDDonto provide verification of the test cases, while Protégé and OOPS! provide validation of the knowledge modeled in the ontology. Findings To the best of the authors' knowledge, no earlier study has conducted an integrated assessment across different tools. All test cases were successfully analyzed with these tools, and the results were drawn and compared with other ontologies. Originality/value The developed ontology is analyzed with different verification and validation tools to assure its quality.
|
40
|
Kim S, Oh SG. Extracting and applying evaluation criteria for ontology quality assessment. LIBRARY HI TECH 2019. [DOI: 10.1108/lht-01-2019-0012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Purpose
The purpose of this paper is to formulate apposite criteria for ontology evaluation and test them through assessments of existing ontologies.
Design/methodology/approach
A literature review provided the basis from which to extract the categories relevant to an evaluation of internal ontology components. According to the ontology evaluation categories, a panel of experts provided the evaluation criteria for each category via Delphi survey. Reliability was gauged by applying the criteria to assessments of existing smartphone ontologies.
Findings
Existing research tends to approach ontology evaluation through comparison with well-engineered ontologies, implementation in target applications and appropriateness/interconnection appraisals in relation to raw data, but such methodologies fall short of shedding light on the internal workings of ontologies, such as structure, semantic representation and interoperability. This study adopts its evaluation categories from previous research while also collecting concrete evaluation criteria from an expert panel and verifying the reliability of the resulting 53 criteria.
Originality/value
This is the first published study to extract ontology evaluation criteria in terms of syntax, semantics and pragmatics. The results can be used as an evaluation index following ontology construction.
|
41
|
Jiomekong A, Camara G, Tchuente M. Extracting ontological knowledge from Java source code using Hidden Markov Models. OPEN COMPUTER SCIENCE 2019. [DOI: 10.1515/comp-2019-0013] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 11/15/2022] Open
Abstract
Ontologies have been a key element of information systems for decades, for example in the epidemiological surveillance domain. Building domain ontologies requires access to domain knowledge owned by domain experts or contained in knowledge sources. However, domain experts are not always available for interviews. Therefore, there is a lot of value in ontology learning, which consists of automatic or semi-automatic extraction of ontological knowledge from structured or unstructured knowledge sources such as texts, databases, etc. Many techniques have been used, but they are all limited to the extraction of concepts, properties and terminology, leaving behind axioms and rules. Source code, which naturally embeds domain knowledge, is rarely used. In this paper, we propose an approach based on Hidden Markov Models (HMMs) for learning concepts, properties, axioms and rules from Java source code. This approach was tested on the source code of EPICAM, an epidemiological platform developed in Java and used in Cameroon for tuberculosis surveillance. Domain experts involved in the evaluation estimated that the knowledge extracted was relevant to the domain. In addition, we performed an automatic evaluation of the relevance of the extracted terms to the medical domain by aligning them with ontologies hosted on the BioPortal platform through the Ontology Recommender tool. The results were interesting since 82.9% of the extracted terms were covered by biomedical ontologies such as NCIT, SNOMEDCT and ONTOPARON.
Affiliation(s)
- Azanzi Jiomekong
- University of Yaounde I, Faculty of Science, Yaounde, Cameroon; IRD, Sorbonne Université, UMMISCO, F-93143, Bondy, France
| | - Gaoussou Camara
- LIMA, Université Alioune Diop de Bambey, Sénégal; IRD, Sorbonne Université, UMMISCO, F-93143, Bondy, France
| | - Maurice Tchuente
- University of Yaounde I, Faculty of Science, Yaounde, Cameroon; IRD, Sorbonne Université, UMMISCO, F-93143, Bondy, France
| |
|
42
|
Amith M, Manion F, Liang C, Harris M, Wang D, He Y, Tao C. Architecture and usability of OntoKeeper, an ontology evaluation tool. BMC Med Inform Decis Mak 2019; 19:152. [PMID: 31391056 PMCID: PMC6686219 DOI: 10.1186/s12911-019-0859-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 12/04/2022] Open
Abstract
Background The existing community-wide bodies of biomedical ontologies are known to contain quality and content problems. Past research has revealed various errors related to their semantics and logical structure. Automated tools may help to ease the ontology construction, maintenance, assessment and quality assurance processes. However, relatively few tools exist that can provide this support to knowledge engineers. Method We introduce OntoKeeper as a web-based tool that can automate quality scoring for ontology developers. We enlisted 5 experienced ontologists to test the tool and then administered the System Usability Scale to measure their assessment. Results In this paper, we present usability results from 5 ontologists revealing high system usability of OntoKeeper, and use-cases that demonstrate its capabilities in previously published biomedical ontology research. Conclusion To the best of our knowledge, OntoKeeper is the first of a few ontology evaluation tools that can help provide ontology evaluation functionality for knowledge engineers with good usability.
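The System Usability Scale mentioned in the abstract above is conventionally scored from ten items rated 1 to 5: odd-numbered items contribute (rating - 1), even-numbered items contribute (5 - rating), and the sum is multiplied by 2.5 to give a 0-100 score. The following is a minimal sketch of that standard scoring rule only; the function name and the sample ratings are hypothetical and are not taken from the OntoKeeper study.

```python
def sus_score(responses):
    """Return the System Usability Scale score (0-100) for one respondent.

    `responses` holds the ten item ratings in questionnaire order, each 1-5.
    """
    if len(responses) != 10 or any(r not in range(1, 6) for r in responses):
        raise ValueError("SUS needs ten ratings between 1 and 5")
    total = 0
    for position, rating in enumerate(responses, start=1):
        # Odd-numbered items are positively worded, even-numbered negatively.
        total += (rating - 1) if position % 2 == 1 else (5 - rating)
    return total * 2.5


# Hypothetical ratings for two respondents (illustrative, not study data).
example_panel = [
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],
]
mean_score = sum(sus_score(r) for r in example_panel) / len(example_panel)
print(mean_score)
```

A panel result is usually reported as the mean of the per-respondent scores, as in the last lines of the sketch.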
Affiliation(s)
- Muhammad Amith
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin Street, Suite 600, Houston, 77030, TX, USA
| | - Frank Manion
- Department of Systems, Populations and Leadership, University of Michigan School of Nursing, 426 N. Ingalls St, Ann Arbor, 48109, MI, USA
| | - Chen Liang
- Arnold School of Public Health, University of South Carolina, Columbia, 29208, SC, USA
| | - Marcelline Harris
- Department of Systems, Populations and Leadership, University of Michigan School of Nursing, 426 N. Ingalls St, Ann Arbor, 48109, MI, USA
| | | | - Yongqun He
- Center for Computational Medicine & Bioinformatics, University of Michigan Medical School, Room 2017, Palmer Commons 100 Washtenaw Avenue, Ann Arbor, 48109, MI, USA
| | - Cui Tao
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin Street, Suite 600, Houston, 77030, TX, USA.
| |
|
43
|
Block LJ, Currie LM, Hardiker NR, Strudwick G. Visibility of Community Nursing Within an Administrative Health Classification System: Evaluation of Content Coverage. J Med Internet Res 2019; 21:e12847. [PMID: 31244480 PMCID: PMC6617914 DOI: 10.2196/12847] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/16/2018] [Revised: 04/11/2019] [Accepted: 05/02/2019] [Indexed: 12/21/2022] Open
Abstract
Background The World Health Organization is in the process of developing an international administrative classification for health called the International Classification of Health Interventions (ICHI). The purpose of ICHI is to provide a tool for supporting intervention reporting and analysis at a global level for policy development and beyond. Nurses represent the largest resource carrying out clinical interventions in any health system. With the shift in nursing care from hospital to community settings in many countries, it is important to ensure that community nursing interventions are present in any international health information system. Thus, an investigation into the extent to which community nursing interventions were covered in ICHI was needed. Objective The objectives of this study were to examine the extent to which International Classification for Nursing Practice (ICNP) community nursing interventions were represented in the ICHI administrative classification system, to identify themes related to gaps in coverage, and to support continued advancements in understanding the complexities of knowledge representation in standardized clinical terminologies and classifications. Methods This descriptive study used a content mapping approach in 2 phases in 2018. A total of 187 nursing intervention codes were extracted from the ICNP Community Nursing Catalogue and mapped to ICHI. In phase 1, 2 coders completed independent mapping activities. In phase 2, the 2 coders compared each list and discussed concept matches until consensus on ICNP-ICHI match and on mapping relationship was reached. Results The initial percentage agreement between the 2 coders was 47% (n=88), but reached 100% with consensus processes. After consensus was reached, 151 (81%) of the community nursing interventions resulted in an ICHI match. A total of 36 (19%) of community nursing interventions had no match to ICHI content. 
A total of 100 (53%) community nursing interventions resulted in a broader ICHI code, 9 (5%) resulted in a narrower ICHI code, and 42 (23%) were considered equivalent. ICNP concepts that were not represented in ICHI were thematically grouped into the categories family and caregivers, death and dying, and case management. Conclusions Overall, the content mapping yielded similar results to other content mapping studies in nursing. However, it also found areas of missing concept coverage, difficulties with interterminology mapping, and further need to develop mapping methods.
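The percentages in the abstract above can be re-derived from the stated counts over the 187 extracted ICNP codes. The sketch below is only an arithmetic check; the dictionary labels are illustrative names, not terms from the study. Note that 42 of 187 is about 22.5%, which rounds to 22% rather than the reported 23%; one possible explanation (an assumption) is that the figure was adjusted so the three match categories sum to the 81% total.

```python
TOTAL_CODES = 187  # ICNP community nursing interventions mapped to ICHI

# Counts as stated in the abstract; the labels are illustrative.
counts = {
    "initial coder agreement": 88,
    "matched in ICHI": 151,
    "no ICHI match": 36,
    "broader ICHI code": 100,
    "narrower ICHI code": 9,
    "equivalent": 42,
}


def percent(count, total=TOTAL_CODES):
    """Share of `total` expressed as a percentage."""
    return 100 * count / total


for label, count in counts.items():
    print(f"{label}: {count}/{TOTAL_CODES} = {percent(count):.1f}%")

# Consistency checks on the counts themselves.
matched_breakdown = (
    counts["broader ICHI code"]
    + counts["narrower ICHI code"]
    + counts["equivalent"]
)
assert matched_breakdown == counts["matched in ICHI"]
assert counts["matched in ICHI"] + counts["no ICHI match"] == TOTAL_CODES
```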
Affiliation(s)
- Lorraine J Block
- School of Nursing, University of British Columbia, Vancouver, BC, Canada
| | - Leanne M Currie
- School of Nursing, University of British Columbia, Vancouver, BC, Canada
| | - Nicholas R Hardiker
- School of Human and Health Sciences, University of Huddersfield, Huddersfield, United Kingdom
| | - Gillian Strudwick
- Campbell Family Mental Health Research Institute, Centre for Addiction and Mental Health, Toronto, ON, Canada
| |
|
44
|
Li F, Rao G, Du J, Xiang Y, Zhang Y, Selek S, Hamilton JE, Xu H, Tao C. Ontological representation-oriented term normalization and standardization of the Research Domain Criteria. Health Informatics J 2019; 26:726-737. [PMID: 30843449 DOI: 10.1177/1460458219832059] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 01/20/2023]
Abstract
The Research Domain Criteria, launched by the National Institute of Mental Health, is a new dimensional and interdisciplinary research framework for mental disorders. The Research Domain Criteria matrix is its core part. Since an ontology has the strengths of supporting semantic inferencing and automatic data processing, we would like to transform the Research Domain Criteria matrix into an ontological structure. In terms of data normalization, which is the essential part of an ontology representation, the Research Domain Criteria elements (mainly in the Units of Analysis) have some limitations. In this article, we propose a series of solutions to improve data normalization of the Research Domain Criteria elements in the Units of Analysis, including leveraging standard terminologies (i.e. the Unified Medical Language System Metathesaurus), context-combining queries, and domain expertise. The evaluation results show the positive (Yes) percentage is more than 80 percent, indicating our work is favorably received by the mental health professionals, and we have formed a good data foundation for the Research Domain Criteria ontological representation in the future work.
Affiliation(s)
- Fang Li
- The University of Texas Health Science Center at Houston, USA
| | | | | | | | | | | | | | | | - Cui Tao
- The University of Texas Health Science Center at Houston, USA
| |
|
45
|
A scoping review of ontologies related to human behaviour change. Nat Hum Behav 2019; 3:164-172. [PMID: 30944444 DOI: 10.1038/s41562-018-0511-4] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 06/01/2018] [Accepted: 12/06/2018] [Indexed: 12/16/2022]
Abstract
Ontologies are classification systems specifying entities, definitions and inter-relationships for a given domain, with the potential to advance knowledge about human behaviour change. A scoping review was conducted to: (1) identify what ontologies exist related to human behaviour change, (2) describe the methods used to develop these ontologies and (3) assess the quality of identified ontologies. Using a systematic search, 2,303 papers were identified. Fifteen ontologies met the eligibility criteria for inclusion, developed in areas such as cognition, mental disease and emotions. Methods used for developing the ontologies were expert consultation, data-driven techniques and reuse of terms from existing taxonomies, terminologies and ontologies. Best practices used in ontology development and maintenance were documented. The review did not identify any ontologies representing the breadth and detail of human behaviour change. This suggests that advancing behavioural science would benefit from the development of a behaviour change intervention ontology.
|
46
|
Quality assurance of biomedical terminologies and ontologies. J Biomed Inform 2018; 86:106-108. [PMID: 30205171 DOI: 10.1016/j.jbi.2018.09.006] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 09/04/2018] [Accepted: 09/07/2018] [Indexed: 11/22/2022]
|