1
Beverley J, Babcock S, Carvalho G, Cowell LG, Duesing S, He Y, Hurley R, Merrell E, Scheuermann RH, Smith B. Coordinating virus research: The Virus Infectious Disease Ontology. PLoS One 2024; 19:e0285093. [PMID: 38236918 PMCID: PMC10796065 DOI: 10.1371/journal.pone.0285093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2022] [Accepted: 04/12/2023] [Indexed: 01/22/2024] Open
Abstract
The COVID-19 pandemic prompted immense work on the investigation of the SARS-CoV-2 virus. Rapid, accurate, and consistent interpretation of generated data is thereby of fundamental concern. Ontologies (structured, controlled vocabularies) are designed to support consistency of interpretation, and thereby to prevent the development of data silos. This paper describes how ontologies are serving this purpose in the COVID-19 research domain, by following principles of the Open Biological and Biomedical Ontology (OBO) Foundry and by reusing existing ontologies such as the Infectious Disease Ontology (IDO) Core, which provides terminological content common to investigations of all infectious diseases. We report here on the development of an IDO extension, the Virus Infectious Disease Ontology (VIDO), a reference ontology covering viral infectious diseases. We motivate term and definition choices, showcase reuse of terms from existing OBO ontologies, illustrate how ontological decisions were motivated by relevant life science research, and connect VIDO to the Coronavirus Infectious Disease Ontology (CIDO). We next use terms from these ontologies to annotate selections from life science research on SARS-CoV-2, highlighting how ontologies employing a common upper-level vocabulary may be seamlessly interwoven. Finally, we outline future work, including bacteria and fungus infectious disease reference ontologies currently under development, then cite uses of VIDO and CIDO in host-pathogen data analytics, electronic health record annotation, and ontology conflict-resolution projects.
Affiliation(s)
- John Beverley
  - Department of Philosophy, University at Buffalo, Buffalo, NY, United States of America
  - National Center for Ontological Research, Buffalo, NY, United States of America
- Shane Babcock
  - National Center for Ontological Research, Buffalo, NY, United States of America
  - Air Force Research Laboratory, Wright Patterson Air Force Base, Riverside, OH, United States of America
- Gustavo Carvalho
  - Department of Cognitive Science, Northwestern University, Evanston, IL, United States of America
- Lindsay G. Cowell
  - Department of Clinical Sciences, University of Texas Southwestern Medical Center, Dallas, TX, United States of America
- Sebastian Duesing
  - Department of Philosophy, Loyola University, Chicago, IL, United States of America
- Yongqun He
  - Computational Medicine and Bioinformatics, University of Michigan Medical School, He Group, Ann Arbor, MI, United States of America
- Regina Hurley
  - National Center for Ontological Research, Buffalo, NY, United States of America
  - Department of Philosophy, Northwestern University, Evanston, IL, United States of America
- Eric Merrell
  - Department of Philosophy, University at Buffalo, Buffalo, NY, United States of America
  - National Center for Ontological Research, Buffalo, NY, United States of America
- Richard H. Scheuermann
  - Department of Informatics, J. Craig Venter Institute, La Jolla, CA, United States of America
  - Department of Pathology, University of California, San Diego, CA, United States of America
  - Division of Vaccine Discovery, La Jolla Institute for Immunology, La Jolla, CA, United States of America
- Barry Smith
  - Department of Philosophy, University at Buffalo, Buffalo, NY, United States of America
  - National Center for Ontological Research, Buffalo, NY, United States of America
2
Zheng L, Perl Y, He Y. Big knowledge visualization of the COVID-19 CIDO ontology evolution. BMC Med Inform Decis Mak 2023; 23:88. [PMID: 37161560 PMCID: PMC10169115 DOI: 10.1186/s12911-023-02184-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/05/2022] [Accepted: 04/20/2023] [Indexed: 05/11/2023] Open
Abstract
BACKGROUND The extensive international research into medications and vaccines for the devastating COVID-19 pandemic requires a standard reference ontology. Among the current COVID-19 ontologies, the Coronavirus Infectious Disease Ontology (CIDO) is the largest. Furthermore, it continues to grow through frequent releases. Researchers using CIDO as a reference ontology need a quick update about the content added in a recent release to know how relevant the new concepts are to their research needs. Although CIDO is only a medium-size ontology, it is still a large knowledge base, posing a challenge for a user interested in obtaining the "big picture" of content changes between releases. Both a theoretical framework and a proper visualization are required to provide such a "big picture". METHODS The child-of-based layout of the weighted aggregate partial-area taxonomy summarization network (WAT) provides a convenient "big picture" visualization of the content of an ontology. In this paper we address the "big picture" of content changes between two releases of an ontology. We introduce a new DIFF framework named Diff Weighted Aggregate Taxonomy (DWAT) to display the differences between the WATs of two releases of an ontology. We use a layered approach which consists first of a DWAT of major subjects in CIDO, and then drills down into a major subject of interest in the top-level DWAT to obtain a DWAT of secondary subjects and even further refined layers. RESULTS A visualization of the Diff Weighted Aggregate Taxonomy is demonstrated on the CIDO ontology. The evolution of CIDO between 2020 and 2022 is demonstrated from two perspectives. Drilling down for a DWAT of secondary subject networks is also demonstrated. We illustrate how the DWAT of CIDO provides insight into its evolution. CONCLUSIONS The new Diff Weighted Aggregate Taxonomy enables a layered approach to view the "big picture" of the changes in the content between two releases of an ontology.
Affiliation(s)
- Ling Zheng
  - Computer Science and Software Engineering Department, Monmouth University, West Long Branch, NJ, USA
- Yehoshua Perl
  - Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, USA
- Yongqun He
  - Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, and Center for Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, USA
3
Zgheib R, Chahbandarian G, Kamalov F, Messiry HE, Al-Gindy A. Towards an ML-based semantic IoT for pandemic management: A survey of enabling technologies for COVID-19. Neurocomputing 2023; 528:160-177. [PMID: 36647510 PMCID: PMC9833856 DOI: 10.1016/j.neucom.2023.01.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2022] [Revised: 12/03/2022] [Accepted: 01/08/2023] [Indexed: 01/13/2023]
Abstract
The connection between humans and digital technologies has been documented extensively in the past decades but needs to be evaluated through the current global pandemic. Artificial Intelligence (AI), with its two strands, Machine Learning (ML) and Semantic Reasoning, has proven to be a great solution to provide efficient ways to prevent, diagnose and limit the spread of COVID-19. IoT solutions have been widely proposed for COVID-19 disease monitoring, infection geolocation, and social applications. In this paper, we investigate the usage of the three technologies for handling the COVID-19 pandemic. For this purpose, we surveyed the existing ML applications and algorithms proposed during the pandemic to detect COVID-19 disease using symptom factors and image processing. The survey also covers existing approaches based on semantic technologies and IoT systems for COVID-19. Based on the survey results, we classified the main challenges and the solutions that could solve them. The study proposes a conceptual framework for pandemic management and discusses challenges and trends for future research.
Affiliation(s)
- Rita Zgheib
  - Department of Computer Engineering, Canadian University Dubai, Dubai, United Arab Emirates
  - Corresponding author at: Canadian University Dubai, City Walk, Dubai, UAE
- Firuz Kamalov
  - Department of Electrical Engineering, Canadian University Dubai, Dubai, United Arab Emirates
- Haythem El Messiry
  - University of Science and Technology of Fujairah, Fujairah, United Arab Emirates
  - University of Ain Shams, Cairo, Egypt
- Ahmed Al-Gindy
  - Department of Electrical Engineering, Canadian University Dubai, Dubai, United Arab Emirates
4
Ontology-Driven Knowledge Sharing in Alzheimer’s Disease Research. INFORMATION 2023. [DOI: 10.3390/info14030188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2023] Open
Abstract
Alzheimer’s disease is a debilitating neurodegenerative condition which is known to be the most common cause of dementia. Despite its rapidly growing prevalence, medicine still lacks a comprehensive definition of the disease. As a result, Alzheimer’s disease remains neither preventable nor curable. In recent years, broad interdisciplinary collaborations in Alzheimer’s disease research have become more common. Furthermore, such collaborations have already demonstrated their superiority in addressing the complexity of the disease in innovative ways. However, establishing effective communication and optimal knowledge distribution between researchers and specialists with different expertise and backgrounds is not a straightforward task. To address this challenge, we propose the Alzheimer’s disease Ontology for Diagnosis and Preclinical Classification (AD-DPC) as a tool for effective knowledge sharing in interdisciplinary/multidisciplinary teams working on Alzheimer’s disease. It covers six major conceptual groups, namely Alzheimer’s disease pathology, Alzheimer’s disease spectrum, Diagnostic process, Symptoms, Assessments, and Relevant clinical findings. All concepts were annotated with definitions or elucidations and in some cases enriched with synonyms and additional resources. The potential of AD-DPC to support non-medical experts is demonstrated through an evaluation of its usability, applicability and correctness. The results show that the participants in the evaluation process who lack prior medical knowledge can successfully answer Alzheimer’s disease-related questions by interacting with AD-DPC. Furthermore, their perceived level of knowledge in the field increased, leading to effective communication with medical experts.
5
Kreuzthaler M, Brochhausen M, Zayas C, Blobel B, Schulz S. Linguistic and ontological challenges of multiple domains contributing to transformed health ecosystems. Front Med (Lausanne) 2023; 10:1073313. [PMID: 37007792 PMCID: PMC10050682 DOI: 10.3389/fmed.2023.1073313] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2022] [Accepted: 02/13/2023] [Indexed: 03/17/2023] Open
Abstract
This paper provides an overview of current linguistic and ontological challenges that must be met to fully support the transformation of health ecosystems toward precision medicine (5 PM) standards. It highlights both standardization and interoperability aspects regarding formal, controlled representations of clinical and research data, and requirements for smart support to produce and encode content in a way that humans and machines can understand and process. Starting from the current text-centered communication practices in healthcare and biomedical research, it addresses the state of the art in information extraction using natural language processing (NLP). An important aspect of the language-centered perspective of managing health data is the integration of heterogeneous data sources, employing different natural languages and different terminologies. This is where biomedical ontologies, in the sense of formal, interchangeable representations of types of domain entities, come into play. The paper discusses the state of the art of biomedical ontologies, addresses their importance for standardization and interoperability, and sheds light on current misconceptions and shortcomings. Finally, the paper points out next steps and possible synergies between the field of NLP and the area of Applied Ontology and Semantic Web to foster data interoperability for 5 PM.
Affiliation(s)
- Markus Kreuzthaler
  - Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Graz, Austria
- Mathias Brochhausen
  - Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, United States
- Cilia Zayas
  - Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, United States
- Bernd Blobel
  - Medical Faculty, University of Regensburg, Regensburg, Germany
  - eHealth Competence Center Bavaria, Deggendorf Institute of Technology, Deggendorf, Germany
  - First Medical Faculty, Charles University Prague, Prague, Czechia
- Stefan Schulz
  - Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Graz, Austria
  - Averbis GmbH, Freiburg, Germany
  - *Correspondence: Stefan Schulz,
6
Keloth VK, Zhou S, Lindemann L, Zheng L, Elhanan G, Einstein AJ, Geller J, Perl Y. Mining of EHR for interface terminology concepts for annotating EHRs of COVID patients. BMC Med Inform Decis Mak 2023; 23:40. [PMID: 36829139 PMCID: PMC9951157 DOI: 10.1186/s12911-023-02136-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/22/2022] [Accepted: 02/09/2023] [Indexed: 02/26/2023] Open
Abstract
BACKGROUND Two years into the COVID-19 pandemic and with more than five million deaths worldwide, the healthcare establishment continues to struggle with every new wave of the pandemic resulting from a new coronavirus variant. Research has demonstrated that there are variations in the symptoms, and even in the order of symptom presentations, in COVID-19 patients infected by different SARS-CoV-2 variants (e.g., Alpha and Omicron). Textual data in the form of admission notes and physician notes in the Electronic Health Records (EHRs) is rich in information regarding the symptoms and their orders of presentation. Unstructured EHR data is often underutilized in research due to the lack of annotations that enable automatic extraction of useful information from the available extensive volumes of textual data. METHODS We present the design of a COVID Interface Terminology (CIT), not just a generic COVID-19 terminology, but one serving a specific purpose of enabling automatic annotation of EHRs of COVID-19 patients. CIT was constructed by integrating existing COVID-related ontologies and mining additional fine granularity concepts from clinical notes. The iterative mining approach utilized the techniques of 'anchoring' and 'concatenation' to identify potential fine granularity concepts to be added to the CIT. We also tested the generalizability of our approach on a hold-out dataset and compared the annotation coverage to the coverage obtained for the dataset used to build the CIT. RESULTS Our experiments demonstrate that this approach results in higher annotation coverage compared to existing ontologies such as SNOMED CT and Coronavirus Infectious Disease Ontology (CIDO). The final version of CIT achieved about 20% more coverage than SNOMED CT and 50% more coverage than CIDO. In the future, the concepts mined and added into CIT could be used as training data for machine learning models for mining even more concepts into CIT and further increasing the annotation coverage. 
CONCLUSION In this paper, we demonstrated the construction of a COVID interface terminology that can be utilized for automatically annotating EHRs of COVID-19 patients. The techniques presented can identify frequently documented fine granularity concepts that are missing in other ontologies thereby increasing the annotation coverage.
Affiliation(s)
- Vipina K Keloth
  - School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Shuxin Zhou
  - Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, USA
- Luke Lindemann
  - School of Medicine and Health Sciences, The George Washington University, Washington (D.C.), USA
- Ling Zheng
  - Computer Science and Software Engineering Department, Monmouth University, West Long Branch, NJ, USA
- Gai Elhanan
  - Renown Institute for Health Innovation, Desert Research Institute, Reno, NV, USA
- Andrew J Einstein
  - Cardiology Division, Department of Medicine, Columbia University Irving Medical Center, New York, NY, USA
  - Department of Radiology, Columbia University Irving Medical Center, New York, NY, USA
- James Geller
  - Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, USA
- Yehoshua Perl
  - Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, USA
7
Bakkas J, Hanine M, Chekry A, Gounane S, de la Torre Díez I, Lipari V, López NMM, Ashraf I. SARSMutOnto: An Ontology for SARS-CoV-2 Lineages and Mutations. Viruses 2023; 15:v15020505. [PMID: 36851719 PMCID: PMC9967353 DOI: 10.3390/v15020505] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/13/2022] [Revised: 02/04/2023] [Accepted: 02/08/2023] [Indexed: 02/15/2023] Open
Abstract
Mutations allow viruses to continuously evolve by changing their genetic code to adapt to the hosts they infect. It is an adaptive and evolutionary mechanism that helps viruses acquire characteristics favoring their survival and propagation. The COVID-19 pandemic declared by the WHO in March 2020 is caused by the SARS-CoV-2 virus. The non-stop adaptive mutations of this virus and the emergence of several variants over time with characteristics favoring their spread constitute one of the biggest obstacles that researchers face in controlling this pandemic. Understanding the mutation mechanism allows for the adoption of anticipatory measures and the proposal of strategies to control its propagation. In this study, we focus on the mutations of this virus, and we propose the SARSMutOnto ontology to model SARS-CoV-2 mutations reported by Pango researchers. A detailed description is given for each mutation. The genes where the mutations occur and the genomic structure of this virus are also included. The sub-lineages and the recombinant sub-lineages resulting from these mutations are additionally represented while maintaining their hierarchy. We developed a Python-based tool to automatically generate this ontology from various published Pango source files. At the end of this paper, we provide some examples of SPARQL queries that can be used to exploit this ontology. SARSMutOnto might become a 'wet bench' machine learning tool for predicting likely future mutations based on previous mutations.
Affiliation(s)
- Jamal Bakkas
  - LAPSSII Laboratory, Graduate School of Technology, Cadi Ayyad University, Safi 46000, Morocco
- Mohamed Hanine
  - Department of Telecommunications, Networks, and Informatics, LTI Laboratory, ENSA, Chouaib Doukkali University, Eljadida 24000, Morocco
- Abderrahman Chekry
  - LAPSSII Laboratory, Graduate School of Technology, Cadi Ayyad University, Safi 46000, Morocco
- Said Gounane
  - MIMSC Laboratory, Graduate School of Technology, Cadi Ayyad University, Essaouira 44000, Morocco
- Isabel de la Torre Díez
  - Department of Signal Theory and Communications and Telematic Engineering, University of Valladolid, Paseo de Belén, 15, 47011 Valladolid, Spain
- Vivian Lipari
  - Research Group on Foods, Nutritional Biochemistry and Health, Universidad Europea del Atlántico, Isabel Torres 21, 39011 Santander, Spain
  - Department of Project Management, Universidad Internacional Iberoamericana Campeche, Mexico City 24560, Mexico
  - Fundación Universitaria Internacional de Colombia Bogotá, Bogotá 11001, Colombia
- Nohora Milena Martínez López
  - Research Group on Foods, Nutritional Biochemistry and Health, Universidad Europea del Atlántico, Isabel Torres 21, 39011 Santander, Spain
  - Research Group on Foods, Nutritional Biochemistry and Health, Universidad Internacional Iberoamericana, Arecibo, PR 00613, USA
  - Research Group on Foods, Nutritional Biochemistry and Health, Universidade Internacional do Cuanza, Cuito EN250, Angola
- Imran Ashraf
  - Department of Information and Communication Engineering, Yeungnam University, Gyeongsan 38541, Republic of Korea
8
Yu H, Li L, Huffman A, Beverley J, Hur J, Merrell E, Huang HH, Wang Y, Liu Y, Ong E, Cheng L, Zeng T, Zhang J, Li P, Liu Z, Wang Z, Zhang X, Ye X, Handelman SK, Sexton J, Eaton K, Higgins G, Omenn GS, Athey B, Smith B, Chen L, He Y. A new framework for host-pathogen interaction research. Front Immunol 2022; 13:1066733. [PMID: 36591248 PMCID: PMC9797517 DOI: 10.3389/fimmu.2022.1066733] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/11/2022] [Accepted: 11/14/2022] [Indexed: 12/23/2022] Open
Abstract
COVID-19 often manifests with different outcomes in different patients, highlighting the complexity of the host-pathogen interactions involved in manifestations of the disease at the molecular and cellular levels. In this paper, we propose a set of postulates and a framework for systematically understanding complex molecular host-pathogen interaction networks. Specifically, we first propose four host-pathogen interaction (HPI) postulates as the basis for understanding molecular and cellular host-pathogen interactions and their relations to disease outcomes. These four postulates cover the evolutionary dispositions involved in HPIs, the dynamic nature of HPI outcomes, roles that HPI components may occupy leading to such outcomes, and HPI checkpoints that are critical for specific disease outcomes. Based on these postulates, an HPI Postulate and Ontology (HPIPO) framework is proposed to apply interoperable ontologies to systematically model and represent various granular details and knowledge within the scope of the HPI postulates, in a way that will support AI-ready data standardization, sharing, integration, and analysis. As a demonstration, the HPI postulates and the HPIPO framework were applied to study COVID-19 with the Coronavirus Infectious Disease Ontology (CIDO), leading to a novel approach to rational design of drug/vaccine cocktails aimed at interrupting processes occurring at critical host-coronavirus interaction checkpoints. Furthermore, the host-coronavirus protein-protein interactions (PPIs) relevant to COVID-19 were predicted and evaluated based on prior knowledge of curated PPIs and domain-domain interactions, and how such studies can be further explored with the HPI postulates and the HPIPO framework is discussed.
Affiliation(s)
- Hong Yu
  - Department of Respiratory and Critical Care Medicine, Guizhou Provincial People’s Hospital and National Health Commission (NHC) Key Laboratory of Immunological Diseases, People’s Hospital of Guizhou Province, Guiyang, Guizhou, China
  - Department of Basic Medicine, Guizhou University Medical College, Guiyang, Guizhou, China
- Li Li
  - Department of Genetics, Harvard Medical School, Boston, MA, United States
- Anthony Huffman
  - University of Michigan Medical School, Ann Arbor, MI, United States
- John Beverley
  - Department of Philosophy, University at Buffalo, Buffalo, NY, United States
  - Asymmetric Operations Sector, Johns Hopkins University Applied Physics Laboratory, Laurel, MD, United States
- Junguk Hur
  - Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, ND, United States
- Eric Merrell
  - Department of Philosophy, University at Buffalo, Buffalo, NY, United States
- Hsin-hui Huang
  - University of Michigan Medical School, Ann Arbor, MI, United States
  - Department of Biotechnology and Laboratory Science in Medicine, National Yang-Ming University, Taipei, Taiwan
- Yang Wang
  - Department of Respiratory and Critical Care Medicine, Guizhou Provincial People’s Hospital and National Health Commission (NHC) Key Laboratory of Immunological Diseases, People’s Hospital of Guizhou Province, Guiyang, Guizhou, China
  - Department of Basic Medicine, Guizhou University Medical College, Guiyang, Guizhou, China
  - University of Michigan Medical School, Ann Arbor, MI, United States
- Yingtong Liu
  - University of Michigan Medical School, Ann Arbor, MI, United States
- Edison Ong
  - University of Michigan Medical School, Ann Arbor, MI, United States
- Liang Cheng
  - Department of Bioinformatics, Harbin Medical University, Harbin, Heilongjiang, China
- Tao Zeng
  - Key Laboratory of Systems Biology, Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, China
- Jingsong Zhang
  - Key Laboratory of Systems Biology, Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, China
- Pengpai Li
  - Center of Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong, China
- Zhiping Liu
  - Center of Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong, China
- Zhigang Wang
  - Department of Biomedical Engineering, Institute of Basic Medical Sciences and School of Basic Medicine, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Xiangyan Zhang
  - Department of Respiratory and Critical Care Medicine, Guizhou Provincial People’s Hospital and National Health Commission (NHC) Key Laboratory of Immunological Diseases, People’s Hospital of Guizhou Province, Guiyang, Guizhou, China
  - Department of Basic Medicine, Guizhou University Medical College, Guiyang, Guizhou, China
- Xianwei Ye
  - Department of Respiratory and Critical Care Medicine, Guizhou Provincial People’s Hospital and National Health Commission (NHC) Key Laboratory of Immunological Diseases, People’s Hospital of Guizhou Province, Guiyang, Guizhou, China
  - Department of Basic Medicine, Guizhou University Medical College, Guiyang, Guizhou, China
- Jonathan Sexton
  - University of Michigan Medical School, Ann Arbor, MI, United States
- Kathryn Eaton
  - University of Michigan Medical School, Ann Arbor, MI, United States
- Gerry Higgins
  - University of Michigan Medical School, Ann Arbor, MI, United States
- Gilbert S. Omenn
  - University of Michigan Medical School, Ann Arbor, MI, United States
- Brian Athey
  - University of Michigan Medical School, Ann Arbor, MI, United States
- Barry Smith
  - Department of Philosophy, University at Buffalo, Buffalo, NY, United States
- Luonan Chen
  - Key Laboratory of Systems Biology, Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, China
- Yongqun He
  - University of Michigan Medical School, Ann Arbor, MI, United States
9
Cui L, Dhombres F, Charlet J. Knowledge Representation and Management: Notable Contributions in 2021. Yearb Med Inform 2022; 31:236-240. [PMID: 36463882 PMCID: PMC9719756 DOI: 10.1055/s-0042-1742523] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/07/2022] Open
Abstract
OBJECTIVES To select, present, and summarize the best papers in the field of Knowledge Representation and Management (KRM) published in 2021. METHODS Following the International Medical Informatics Association (IMIA) Yearbook guidelines, a comprehensive and standardized review of the biomedical informatics literature was performed to select the best KRM papers published in 2021, based on PubMed queries. RESULTS A total of 1,231 publications were retrieved from PubMed. We nominated 15 candidate best papers, and four of them were finally selected as the best papers in the KRM section. The topics covered by these papers include knowledge graph, ontology development, ontology alignment, and the International Classification of Diseases. CONCLUSION In the KRM best paper selection for 2021, the candidate best papers covered a wider spectrum of topics compared to last year's significant focus on ontology curation. In particular, ontology development for specific domains (e.g., Alzheimer's disease, infectious diseases, bioethics) has received the most attention.
Affiliation(s)
- Licong Cui
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
  - Correspondence to: Licong Cui, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin Street, Houston, TX 77030, USA
- Ferdinand Dhombres
  - Sorbonne Université, INSERM, Univ Sorbonne Paris Nord, LIMICS, Paris, France
  - Sorbonne Université, Service de Médecine Foetale, DMU Origyne, AP-HP, Hôpital Armand Trousseau, Paris, France
- Jean Charlet
  - Sorbonne Université, INSERM, Univ Sorbonne Paris Nord, LIMICS, Paris, France
  - AP-HP, DRCI, Paris, France
10
Bernasconi A, Guizzardi G, Pastor O, Storey VC. Semantic interoperability: ontological unpacking of a viral conceptual model. BMC Bioinformatics 2022; 23:491. [PMID: 36396980 PMCID: PMC9672571 DOI: 10.1186/s12859-022-05022-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/17/2022] [Accepted: 10/29/2022] [Indexed: 11/18/2022] Open
Abstract
BACKGROUND Genomics and virology are unquestionably important, but complex, domains being investigated by a large number of scientists. The need to facilitate and support work within these domains requires sharing of databases, although it is often difficult to do so because of the different ways in which data is represented across the databases. To foster semantic interoperability, models are needed that provide a deep understanding and interpretation of the concepts in a domain, so that the data can be consistently interpreted among researchers. RESULTS In this research, we propose the use of conceptual models to support semantic interoperability among databases and assess their ontological clarity to support their effective use. This modeling effort is illustrated by its application to the Viral Conceptual Model (VCM) that captures and represents the sequencing of viruses, inspired by the need to understand the genomic aspects of the virus responsible for COVID-19. For achieving semantic clarity on the VCM, we leverage the "ontological unpacking" method, a process of ontological analysis that reveals the ontological foundation of the information that is represented in a conceptual model. This is accomplished by applying the stereotypes of the OntoUML ontology-driven conceptual modeling language. As a result, we propose a new OntoVCM, an ontologically grounded model, based on the initial VCM, but with guaranteed interoperability among the data sources that employ it. CONCLUSIONS We propose and illustrate how the unpacking of the Viral Conceptual Model resolves several issues related to semantic interoperability, the importance of which is recognized by the "I" in FAIR principles. The research addresses conceptual uncertainty within the domain of SARS-CoV-2 data and knowledge. The method employed provides the basis for further analyses of complex models currently used in life science applications, but lacking ontological grounding, subsequently hindering the interoperability needed for scientists to progress their research.
Affiliation(s)
- Anna Bernasconi
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy.
- PROS Research Center, VRAIN Research Institute, Universitat Politècnica de València, Valencia, Spain.
- Giancarlo Guizzardi
- Conceptual and Cognitive Modeling Research Group, Free University of Bozen-Bolzano, Bolzano, Italy
- Services and Cybersecurity Group, University of Twente, Enschede, The Netherlands
- Oscar Pastor
- PROS Research Center, VRAIN Research Institute, Universitat Politècnica de València, Valencia, Spain
- Veda C Storey
- J. Mack Robinson College of Business, Georgia State University, Atlanta, Georgia, USA
|
11
|
He Y, Yu H, Huffman A, Lin AY, Natale DA, Beverley J, Zheng L, Perl Y, Wang Z, Liu Y, Ong E, Wang Y, Huang P, Tran L, Du J, Shah Z, Shah E, Desai R, Huang HH, Tian Y, Merrell E, Duncan WD, Arabandi S, Schriml LM, Zheng J, Masci AM, Wang L, Liu H, Smaili FZ, Hoehndorf R, Pendlington ZM, Roncaglia P, Ye X, Xie J, Tang YW, Yang X, Peng S, Zhang L, Chen L, Hur J, Omenn GS, Athey B, Smith B. A comprehensive update on CIDO: the community-based coronavirus infectious disease ontology. J Biomed Semantics 2022; 13:25. [PMID: 36271389 PMCID: PMC9585694 DOI: 10.1186/s13326-022-00279-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/17/2022] [Accepted: 09/13/2022] [Indexed: 11/24/2022] Open
Abstract
Background: The current COVID-19 pandemic and the previous SARS/MERS outbreaks of 2003 and 2012 have resulted in a series of major global public health crises. We argue that in the interest of developing effective and safe vaccines and drugs and to better understand coronaviruses and associated disease mechanisms it is necessary to integrate the large and exponentially growing body of heterogeneous coronavirus data. Ontologies play an important role in standard-based knowledge and data representation, integration, sharing, and analysis. Accordingly, we initiated the development of the community-based Coronavirus Infectious Disease Ontology (CIDO) in early 2020. Results: As an Open Biomedical Ontology (OBO) library ontology, CIDO is open source and interoperable with other existing OBO ontologies. CIDO is aligned with the Basic Formal Ontology and Viral Infectious Disease Ontology. CIDO has imported terms from over 30 OBO ontologies. For example, CIDO imports all SARS-CoV-2 protein terms from the Protein Ontology, COVID-19-related phenotype terms from the Human Phenotype Ontology, and over 100 COVID-19 terms for vaccines (both authorized and in clinical trial) from the Vaccine Ontology. CIDO systematically represents variants of SARS-CoV-2 viruses and over 300 amino acid substitutions therein, along with over 300 diagnostic kits and methods. CIDO also describes hundreds of host-coronavirus protein-protein interactions (PPIs) and the drugs that target proteins in these PPIs. CIDO has been used to model COVID-19 related phenomena in areas such as epidemiology. The scope of CIDO was evaluated by visual analysis supported by a summarization network method. CIDO has been used in various applications such as term standardization, inference, natural language processing (NLP) and clinical data integration. We have applied the amino acid variant knowledge present in CIDO to analyze differences between SARS-CoV-2 Delta and Omicron variants.
CIDO's integrative host-coronavirus PPIs and drug-target knowledge has also been used to support drug repurposing for COVID-19 treatment. Conclusion: CIDO represents entities and relations in the domain of coronavirus diseases with a special focus on COVID-19. It supports shared knowledge representation, data and metadata standardization and integration, and has been used in a range of applications. Supplementary Information: The online version contains supplementary material available at 10.1186/s13326-022-00279-z.
Affiliation(s)
- Yongqun He
- University of Michigan Medical School, Ann Arbor, MI, USA.
- Hong Yu
- People's Hospital of Guizhou Province, Guiyang, Guizhou, China.
- Asiyah Yu Lin
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA; National Center for Ontological Research, Buffalo, NY, USA
- John Beverley
- National Center for Ontological Research, Buffalo, NY, USA; The Johns Hopkins University Applied Physics Laboratory, Laurel, MD, USA
- Ling Zheng
- Computer Science and Software Engineering Department, Monmouth University, West Long Branch, NJ, USA
- Yehoshua Perl
- Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, USA
- Zhigang Wang
- Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & School of Basic Medicine, Peking Union Medical College, Beijing, China
- Yingtong Liu
- University of Michigan Medical School, Ann Arbor, MI, USA
- Edison Ong
- University of Michigan Medical School, Ann Arbor, MI, USA
- Yang Wang
- University of Michigan Medical School, Ann Arbor, MI, USA; People's Hospital of Guizhou Province, Guiyang, Guizhou, China
- Philip Huang
- University of Michigan Medical School, Ann Arbor, MI, USA
- Long Tran
- University of Michigan Medical School, Ann Arbor, MI, USA
- Jinyang Du
- University of Michigan Medical School, Ann Arbor, MI, USA
- Zalan Shah
- University of Michigan Medical School, Ann Arbor, MI, USA
- Easheta Shah
- University of Michigan Medical School, Ann Arbor, MI, USA
- Roshan Desai
- University of Michigan Medical School, Ann Arbor, MI, USA
- Hsin-Hui Huang
- University of Michigan Medical School, Ann Arbor, MI, USA; National Yang-Ming University, Taipei, Taiwan
- Yujia Tian
- Rutgers University, New Brunswick, NJ, USA
- Lynn M Schriml
- University of Maryland School of Medicine, Baltimore, MD, USA
- Jie Zheng
- Department of Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Anna Maria Masci
- Office of Data Science, National Institute of Environmental Health Sciences, Research Triangle Park, NC, USA
- Robert Hoehndorf
- King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Zoë May Pendlington
- European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Paola Roncaglia
- European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Xianwei Ye
- People's Hospital of Guizhou Province, Guiyang, Guizhou, China
- Jiangan Xie
- School of Bioinformatics, Chongqing University of Posts and Telecommunications, Chongqing, China
- Yi-Wei Tang
- Cepheid, Danaher Diagnostic Platform, Shanghai, China
- Xiaolin Yang
- Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & School of Basic Medicine, Peking Union Medical College, Beijing, China
- Suyuan Peng
- National Institute of Health Data Science, Peking University, Beijing, China
- Luxia Zhang
- National Institute of Health Data Science, Peking University, Beijing, China
- Luonan Chen
- Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, China
- Junguk Hur
- University of North Dakota School of Medicine and Health Sciences, Grand Forks, ND, USA
- Brian Athey
- University of Michigan Medical School, Ann Arbor, MI, USA
- Barry Smith
- National Center for Ontological Research, Buffalo, NY, USA; University at Buffalo, Buffalo, NY, 14260, USA
|
12
|
Covid19/IT the digital side of Covid19: A picture from Italy with clustering and taxonomy. PLoS One 2022; 17:e0269687. [PMID: 35679235 PMCID: PMC9182266 DOI: 10.1371/journal.pone.0269687] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2022] [Accepted: 05/26/2022] [Indexed: 11/19/2022] Open
Abstract
The Covid19 pandemic has significantly impacted our lives, triggering a strong reaction resulting in vaccines, more effective diagnoses and therapies, and policies to contain the pandemic outbreak, to name but a few. A significant contribution to their success comes from the computer science and information technology communities, both in support of other disciplines and as the primary driver of solutions for, e.g., diagnostics, social distancing, and contact tracing. In this work, we surveyed the Italian computer science and engineering community initiatives against the Covid19 pandemic. The 128 responses thus collected document the response of this community during the first pandemic wave in Italy (February-May 2020), through several initiatives carried out by both single researchers and research groups able to promptly react to Covid19, even remotely. The data obtained by the survey are reported here, discussed, and further investigated with Natural Language Processing techniques to generate semantic clusters based on embedding representations of the surveyed activity descriptions. The resulting clusters were then used to extend an existing Covid19 taxonomy with the classification of related research activities in computer science and information technology areas, summarizing this work's contribution through a reproducible survey-to-taxonomy methodology.
|
13
|
CoV2K model, a comprehensive representation of SARS-CoV-2 knowledge and data interplay. Sci Data 2022; 9:260. [PMID: 35650205 PMCID: PMC9160032 DOI: 10.1038/s41597-022-01348-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/01/2021] [Accepted: 04/26/2022] [Indexed: 11/08/2022] Open
Abstract
Since the outbreak of the COVID-19 pandemic, many research organizations have studied the genome of the SARS-CoV-2 virus; a body of public resources has been published for monitoring its evolution. While we experience an unprecedented richness of information in this domain, we have also ascertained the presence of several information quality issues. We hereby propose CoV2K, an abstract model for explaining SARS-CoV-2-related concepts and interactions, focusing on viral mutations, their co-occurrence within variants, and their effects. CoV2K provides a clear and concise route map for understanding different connected types of information related to the virus; it thus drives a process of data and knowledge integration that aggregates information from several current resources, harmonizing their content and overcoming incompleteness and inconsistency issues. CoV2K is available for exploration as a graph that can be queried through a RESTful API addressing single entities or paths through their relationships. Practical use cases demonstrate its application to current knowledge inquiries.
|
14
|
Medeiros GHA, Soualmia LF, Zanni-Merk C, Hagverdiyev R. Tracing and analyzing COVID-19 dissemination using knowledge graphs. PROCEDIA COMPUTER SCIENCE 2022; 207:2172-2181. [PMID: 36275379 PMCID: PMC9578937 DOI: 10.1016/j.procs.2022.09.277] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Abstract
The spread of COVID-19 (SARS-CoV-2) around the globe could have been halted if we had had a better understanding of the situation and applied more restrictive travel measures adapted to each country. That this did not happen is due to a lack of efficient tools to visualize, analyze and control the virus dissemination. In the context of virus proliferation, analyzing flight connections between countries together with COVID-19 data seems helpful for understanding spatial and temporal information about the virus and its possible spread. To manage these complex, massive, and heterogeneous data, we propose a methodology based on knowledge graph models. Several analysis and visualization tools can be applied, and our results show that these knowledge graph models may be a promising way to study the dissemination of any virus. These graphs can also be easily enriched with additional information that could be useful in the future to analyze or predict other interesting indicators.
Affiliation(s)
- Lina F Soualmia
- Normandy University, LITIS EA 4108, Saint-Étienne-du-Rouvray, France
- Ramiz Hagverdiyev
- Normandy University, LITIS EA 4108, Saint-Étienne-du-Rouvray, France
- French-Azerbaijani University, Baku, Azerbaijan
|
15
|
Laddada W, Soualmia LF, Zanni-Merk C, Ayadi A, Frydman C, L'Hote I, Imbert I. OntoRepliCov: an Ontology-Based Approach for Modeling the SARS-CoV-2 Replication Process. PROCEDIA COMPUTER SCIENCE 2021; 192:487-496. [PMID: 34630741 PMCID: PMC8486259 DOI: 10.1016/j.procs.2021.08.050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
Understanding the replication machinery of viruses helps to suggest and test effective antiviral strategies. Exhaustive knowledge about protein structure, function, and interaction is one of the preconditions for successfully modeling it. In this context, modeling methods based on a formal representation with high semantic expressiveness would be relevant for extracting proteins and their nucleotide or amino acid sequences as elements of the replication process. Consequently, our approach relies on the use of semantic technologies to model the SARS-CoV-2 replication machinery. This provides the ability to infer new knowledge related to each step of the virus replication. More specifically, we developed an ontology-based approach, enriched with a reasoning process, to the complete replication machinery of SARS-CoV-2. We present in this paper a partial overview of our ontology OntoRepliCov to describe one step of this process, namely, the continuous translation or protein synthesis, through classes, properties, axioms, and SWRL (Semantic Web Rule Language) rules.
Affiliation(s)
- Wissame Laddada
- Normandie Université, LITIS, 7600 Rouen, France; Aix-Marseille Université, LIS, 13009 Marseille, France
- Ali Ayadi
- Aix-Marseille Université, LIS, 13009 Marseille, France
- India L'Hote
- Aix-Marseille Université, AFMB, 13009 Marseille, France
|
16
|
Patel A, Debnath NC, Mishra AK, Jain S. Covid19-IBO: A Covid-19 Impact on Indian Banking Ontology Along with an Efficient Schema Matching Approach. NEW GENERATION COMPUTING 2021; 39:647-676. [PMID: 34667368 PMCID: PMC8517947 DOI: 10.1007/s00354-021-00136-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/19/2021] [Accepted: 08/26/2021] [Indexed: 05/21/2023]
Abstract
The exponential spread of Covid-19 is not only a serious concern for public health but has also severely affected the global economy. India is not an exception. The banking sector must plan innovatively in a wide range of scenarios focusing upon Covid-19 specific requirements. It becomes essential to examine the impact of Covid-19 on the performance of the Indian banking sector and take focused initiatives at both the tactical and the strategic levels. This paper offers the Covid-19 Impact on Banking Ontology (Covid19-IBO) that provides semantic information about the impact of Covid-19 on the banking sector of India. The developed ontology has been verified and validated and has been made available on the Linked Open Data cloud. It can be utilized to annotate the related data to provide meaningful insights. The Covid-19 ontologies already available have some overlapping information that causes redundancy. Unified integration of these ontologies is required to operate upon them unambiguously. It becomes reasonable to develop a matching approach to link all these ontologies semantically. We, therefore, also provide a schema matching approach with reasonable results to map the Covid-19 ontologies.
Affiliation(s)
- Archana Patel
- Department of Software Engineering, School of Computing and Information Technology, Eastern International University, Binh Duong New City, Vietnam
- Narayan C. Debnath
- Department of Software Engineering, School of Computing and Information Technology, Eastern International University, Binh Duong New City, Vietnam
- Sarika Jain
- Department of Computer Applications, National Institute of Technology, Kurukshetra, Haryana, India
|
17
|
Implications of Knowledge Organization Systems for Health Information Exchange and Communication during the COVID-19 Pandemic. DATA AND INFORMATION MANAGEMENT 2020; 4:148-170. [PMID: 35382097 PMCID: PMC8969569 DOI: 10.2478/dim-2020-0009] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 05/21/2020] [Accepted: 05/23/2020] [Indexed: 12/25/2022]
Abstract
This article aims to review the important roles of health knowledge organization systems (KOSs) during the COVID-19 pandemic. Different types of knowledge organization systems, including term lists, synonym rings, thesauri, subject heading systems, taxonomies, classification schemes, and ontologies are widely recognized and applied in both modern and traditional information systems. Apart from their usage in the management of data, information, and knowledge, KOSs are seen as valuable components for large information architecture, content management, findability improvement, and many other applications. After introducing the challenges of information overload and semantic conflicts, the article reviews the efforts of major health KOSs, illustrates various health coding schemes, explains their usages and implementations, and reveals their implications for health information exchange and communication during the COVID-19 pandemic. Some general examples of the applications, services, and analysis powered by KOSs are presented at the end. As revealed in this article, they have become even more critical to aid the frontline endeavors to overcome the obstacles due to information overload and semantic conflicts that can occur during devastating historic and worldwide events like the COVID-19 pandemic.
|