1
Beverley J, Babcock S, Carvalho G, Cowell LG, Duesing S, He Y, Hurley R, Merrell E, Scheuermann RH, Smith B. Coordinating virus research: The Virus Infectious Disease Ontology. PLoS One 2024; 19:e0285093. [PMID: 38236918; PMCID: PMC10796065; DOI: 10.1371/journal.pone.0285093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2022] [Accepted: 04/12/2023] [Indexed: 01/22/2024]
Abstract
The COVID-19 pandemic prompted immense work on the investigation of the SARS-CoV-2 virus. Rapid, accurate, and consistent interpretation of generated data is thereby of fundamental concern. Ontologies (structured, controlled vocabularies) are designed to support consistency of interpretation, and thereby to prevent the development of data silos. This paper describes how ontologies are serving this purpose in the COVID-19 research domain, by following principles of the Open Biological and Biomedical Ontology (OBO) Foundry and by reusing existing ontologies such as the Infectious Disease Ontology (IDO) Core, which provides terminological content common to investigations of all infectious diseases. We report here on the development of an IDO extension, the Virus Infectious Disease Ontology (VIDO), a reference ontology covering viral infectious diseases. We motivate term and definition choices, showcase reuse of terms from existing OBO ontologies, illustrate how ontological decisions were motivated by relevant life science research, and connect VIDO to the Coronavirus Infectious Disease Ontology (CIDO). We next use terms from these ontologies to annotate selections from life science research on SARS-CoV-2, highlighting how ontologies employing a common upper-level vocabulary may be seamlessly interwoven. Finally, we outline future work, including bacteria and fungus infectious disease reference ontologies currently under development, then cite uses of VIDO and CIDO in host-pathogen data analytics, electronic health record annotation, and ontology conflict-resolution projects.
Affiliation(s)
- John Beverley
  - Department of Philosophy, University at Buffalo, Buffalo, NY, United States of America
  - National Center for Ontological Research, Buffalo, NY, United States of America
- Shane Babcock
  - National Center for Ontological Research, Buffalo, NY, United States of America
  - Air Force Research Laboratory, Wright Patterson Air Force Base, Riverside, OH, United States of America
- Gustavo Carvalho
  - Department of Cognitive Science, Northwestern University, Evanston, IL, United States of America
- Lindsay G. Cowell
  - Department of Clinical Sciences, University of Texas Southwestern Medical Center, Dallas, TX, United States of America
- Sebastian Duesing
  - Department of Philosophy, Loyola University, Chicago, IL, United States of America
- Yongqun He
  - Computational Medicine and Bioinformatics, University of Michigan Medical School, He Group, Ann Arbor, MI, United States of America
- Regina Hurley
  - National Center for Ontological Research, Buffalo, NY, United States of America
  - Department of Philosophy, Northwestern University, Evanston, IL, United States of America
- Eric Merrell
  - Department of Philosophy, University at Buffalo, Buffalo, NY, United States of America
  - National Center for Ontological Research, Buffalo, NY, United States of America
- Richard H. Scheuermann
  - Department of Informatics, J. Craig Venter Institute, La Jolla, CA, United States of America
  - Department of Pathology, University of California, San Diego, CA, United States of America
  - Division of Vaccine Discovery, La Jolla Institute for Immunology, La Jolla, CA, United States of America
- Barry Smith
  - Department of Philosophy, University at Buffalo, Buffalo, NY, United States of America
  - National Center for Ontological Research, Buffalo, NY, United States of America
2
Huffman A, Ong E, Hur J, D’Mello A, Tettelin H, He Y. COVID-19 vaccine design using reverse and structural vaccinology, ontology-based literature mining and machine learning. Brief Bioinform 2022; 23:bbac190. [PMID: 35649389; PMCID: PMC9294427; DOI: 10.1093/bib/bbac190] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/30/2022] [Revised: 04/13/2022] [Accepted: 04/26/2022] [Indexed: 12/11/2022]
Abstract
Rational vaccine design, especially vaccine antigen identification and optimization, is critical to successful and efficient vaccine development against various infectious diseases including coronavirus disease 2019 (COVID-19). In general, computational vaccine design includes three major stages: (i) identification and annotation of experimentally verified gold standard protective antigens through literature mining, (ii) rational vaccine design using reverse vaccinology (RV) and structural vaccinology (SV) and (iii) post-licensure vaccine success and adverse event surveillance and its usage for vaccine design. Protegen is a database of experimentally verified protective antigens, which can be used as gold standard data for rational vaccine design. RV predicts protective antigen targets primarily from genome sequence analysis. SV refines antigens through structural engineering. Recently, RV and SV approaches, with the support of various machine learning methods, have been applied to COVID-19 vaccine design. The analysis of post-licensure vaccine adverse event report data also provides valuable results in terms of vaccine safety and how vaccines should be used or paused. Ontology standardizes and incorporates heterogeneous data and knowledge in a human- and computer-interpretable manner, further supporting machine learning and vaccine design. Future directions on rational vaccine design are discussed.
Affiliation(s)
- Anthony Huffman
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
- Edison Ong
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
- Junguk Hur
  - Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, North Dakota 58202, USA
- Adonis D’Mello
  - Department of Microbiology and Immunology, Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Hervé Tettelin
  - Department of Microbiology and Immunology, Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Yongqun He
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
  - Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, Center for Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
3
Jayachandran SK, Anusuyadevi M, Essa MM, Qoronfleh MW. Decoding information on COVID-19: Ontological approach towards design possible therapeutics. Informatics in Medicine Unlocked 2020; 22:100486. [PMID: 33263073; PMCID: PMC7691137; DOI: 10.1016/j.imu.2020.100486] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/01/2020] [Revised: 11/20/2020] [Accepted: 11/20/2020] [Indexed: 12/23/2022]
Abstract
To date, no effective preventive or curative medical interventions exist against COVID-19, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The available interventions are only supportive and palliative in nature. Popular among the emerging explanations for the mortality from COVID-19 is "cytokine storm", attributed to the body's aggressive immune response to this novel pathogen. In less than a year the disease has spread to almost all countries, though mortality rates have varied significantly from country to country based on factors such as the demographic mix of the population, the prevalence of comorbidities, and prior exposure to viruses from the coronavirus family. This review examines the current literature on mortality rates across the globe and explores possible reasons for the variations. COVID-19 researchers have noted unique characteristics in the structure and host-pathogen interaction of the virus and have identified several possible target proteins and sites that could control the entry of SARS-CoV-2 into the host, which this paper reviews in detail. Identification of new targets, both in the virus and the host, may accelerate the search for effective vaccines and curative drugs against COVID-19. Further, the ontological approach of this review is likely to provide insights that help researchers anticipate and prepare for mutant viruses that may emerge in future.
Affiliation(s)
- Swaminathan K Jayachandran
  - Drug Discovery and Molecular Cardiology Lab, Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, 620204, India
- Muthuswamy Anusuyadevi
  - Molecular Gerontology Lab, Department of Biochemistry, School of Life Sciences, Bharathidasan University, Tiruchirappalli, 620204, India
- Musthafa Mohamed Essa
  - Department of Food Science and Nutrition, CAMS, Sultan Qaboos University, Muscat, Oman
  - Ageing and Dementia Research Group, Sultan Qaboos University, Muscat, Oman
- M Walid Qoronfleh
  - Research & Policy Department, World Innovation Summit for Health (WISH), Qatar Foundation, P.O. Box 5825, Doha, Qatar
4
Protein ontology on the semantic web for knowledge discovery. Sci Data 2020; 7:337. [PMID: 33046717; PMCID: PMC7550340; DOI: 10.1038/s41597-020-00679-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 05/20/2020] [Accepted: 09/17/2020] [Indexed: 11/26/2022]
Abstract
The Protein Ontology (PRO) provides an ontological representation of protein-related entities, ranging from protein families to proteoforms to complexes. Protein Ontology Linked Open Data (LOD) exposes, shares, and connects knowledge about protein-related entities on the Semantic Web using the Resource Description Framework (RDF), thus enabling integration with other Linked Open Data for biological knowledge discovery. For example, proteins (or variants thereof) can be retrieved on the basis of specific disease associations. As a community resource, we strive to follow the Findability, Accessibility, Interoperability, and Reusability (FAIR) principles, disseminate regular updates of our data, support multiple methods for accessing, querying, and downloading data in various formats, and provide documentation for both scientists and programmers. PRO Linked Open Data can be browsed via a faceted browser interface and queried using SPARQL via YASGUI. RDF data dumps are also available for download. Additionally, we developed RESTful APIs to support programmatic data access. We also provide W3C HCLS specification-compliant metadata descriptions for our data. The PRO Linked Open Data is available at https://lod.proconsortium.org/.
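The retrieval described in this abstract (finding proteins or their variants through a disease association) rests on matching subject-predicate-object triples. The following is a minimal sketch of that idea in plain Python, not the PRO SPARQL endpoint or its RESTful API. Every identifier (`ex:protein_A`, `ex:associated_with`, `ex:disease_X`, and so on) is hypothetical and does not reflect the actual PRO vocabulary or data.

```python
# Toy in-memory triple store. All identifiers are hypothetical placeholders,
# not real PRO terms, predicates, or disease associations.
TRIPLES = [
    ("ex:protein_A", "rdf:type", "ex:Protein"),
    ("ex:protein_A", "ex:associated_with", "ex:disease_X"),
    ("ex:protein_B", "rdf:type", "ex:Protein"),
    ("ex:protein_B", "ex:associated_with", "ex:disease_Y"),
    ("ex:variant_A1", "ex:variant_of", "ex:protein_A"),
    ("ex:variant_A1", "ex:associated_with", "ex:disease_X"),
]


def match(pattern, triples=TRIPLES):
    """Return the triples matching an (s, p, o) pattern; None is a wildcard."""
    return [
        triple
        for triple in triples
        if all(want is None or want == have for want, have in zip(pattern, triple))
    ]


def entities_for_disease(disease):
    """Return, in sorted order, every subject associated with the disease."""
    hits = match((None, "ex:associated_with", disease))
    return sorted(subject for subject, _, _ in hits)


disease_x_entities = entities_for_disease("ex:disease_X")
print(disease_x_entities)
```

A real SPARQL query expresses the same wildcard pattern declaratively, with variables in place of `None`.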
5
Abstract
The Protein Ontology (PRO) is the reference ontology for proteins in the Open Biomedical Ontologies (OBO) Foundry and consists of three sub-ontologies representing protein classes of homologous genes, proteoforms (e.g., splice isoforms, sequence variants, and post-translationally modified forms), and protein complexes. PRO defines classes of proteins and protein complexes, both species-specific and species-nonspecific, and indicates their relationships in a hierarchical framework, supporting accurate protein annotation at the appropriate level of granularity, analyses of protein conservation across species, and semantic reasoning. In the first section of this chapter, we describe the PRO framework, including categories of PRO terms and the relationship of PRO to other ontologies and protein resources. Next, we provide a tutorial on the PRO website (proconsortium.org), where users can browse and search the PRO hierarchy, view reports on individual PRO terms, and visualize relationships among PRO terms in a hierarchical table view, a multiple sequence alignment view, and a Cytoscape network view. Finally, we describe several examples illustrating the unique and rich information available in PRO.
6
Full-Length cDNA Cloning, Molecular Characterization and Differential Expression Analysis of Lysophospholipase I from Ovis aries. Int J Mol Sci 2016; 17:ijms17081206. [PMID: 27483239; PMCID: PMC5000604; DOI: 10.3390/ijms17081206] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/03/2016] [Revised: 06/15/2016] [Accepted: 07/19/2016] [Indexed: 01/23/2023]
Abstract
Lysophospholipase I (LYPLA1) is an important protein with multiple functions. In this study, the full-length cDNA of the LYPLA1 gene from Ovis aries (OaLypla1) was cloned using primers and rapid amplification of cDNA ends (RACE) technology. The full-length OaLypla1 was 2457 bp with a 5′-untranslated region (UTR) of 24 bp, a 3′-UTR of 1740 bp with a poly (A) tail, and an open reading frame (ORF) of 693 bp encoding a protein of 230 amino acid residues with a predicted molecular weight of 24,625.78 Da. Phylogenetic analysis showed that the OaLypla1 protein shared a high amino acid identity with LYPLA1 of Bos taurus. The recombinant OaLypla1 protein was expressed and purified, and its phospholipase activity was identified. Monoclonal antibodies (mAb) against OaLypla1 that bound native OaLypla1 were generated. Real-time PCR analysis revealed that OaLypla1 was constitutively expressed in the liver, spleen, lung, kidney, and white blood cells of sheep, with the highest level in the kidney. Additionally, the mRNA levels of OaLypla1 in the buffy coats of sheep challenged with virulent or avirulent Brucella strains were down-regulated compared to untreated sheep. The results suggest that OaLypla1 may have an important physiological role in the host response to bacteria. The function of OaLypla1 in the host response to bacterial infection requires further study in the future.
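The sequence figures reported in this abstract can be checked against one another with simple arithmetic. The sketch below assumes that the 693 bp ORF includes a single stop codon and that the stated 3′-UTR length already accounts for the poly(A) tail; neither assumption is stated explicitly in the abstract.

```python
# Lengths in base pairs, as reported in the abstract.
FIVE_PRIME_UTR = 24
ORF = 693
THREE_PRIME_UTR = 1740
REPORTED_TOTAL = 2457
REPORTED_RESIDUES = 230
REPORTED_MASS_DA = 24625.78

# The three regions should add up to the full-length cDNA.
total_length = FIVE_PRIME_UTR + ORF + THREE_PRIME_UTR

# An ORF is read in codons of three bases. Assumption: the final codon is a
# stop codon, which encodes no amino acid.
codons = ORF // 3
residues = codons - 1

# Mean mass per residue, a rough plausibility check on the predicted weight.
mean_residue_mass = REPORTED_MASS_DA / REPORTED_RESIDUES

print(total_length, codons, residues, round(mean_residue_mass, 2))
```

Under these assumptions the region lengths sum to the reported 2457 bp, and 231 codons minus one stop codon give the reported 230 residues.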
7
Karadeniz İ, Hur J, He Y, Özgür A. Literature Mining and Ontology based Analysis of Host-Brucella Gene-Gene Interaction Network. Front Microbiol 2015; 6:1386. [PMID: 26696993; PMCID: PMC4673313; DOI: 10.3389/fmicb.2015.01386] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 05/19/2015] [Accepted: 11/20/2015] [Indexed: 01/27/2023]
Abstract
Brucella is an intracellular bacterium that causes chronic brucellosis in humans and various mammals. The identification of host-Brucella interactions is crucial to understanding host immunity against Brucella infection and Brucella pathogenesis against host immune responses. Most of the information about inter-species interactions between host and Brucella genes is available only in the text of scientific publications. Many text-mining systems for extracting gene and protein interactions have been proposed. However, only a few of them have been designed with the peculiarities of host–pathogen interactions in mind. In this paper, we used a text mining approach to extract host-Brucella gene–gene interactions from the abstracts of articles in PubMed. The gene–gene interactions here represent interactions between genes and/or gene products (e.g., proteins). The SciMiner tool, originally designed for detecting mammalian gene/protein names in text, was extended to identify host and Brucella gene/protein names in the abstracts. Next, sentence-level and abstract-level co-occurrence based approaches, as well as sentence-level machine learning based methods, originally designed for extracting intra-species gene interactions, were used to extract the interactions among the identified host and Brucella genes. The extracted interactions were manually evaluated. A total of 46 host-Brucella gene interactions were identified and represented as an interaction network. Twenty-four of these interactions were identified from sentence-level processing. Twenty-two additional interactions were identified when abstract-level processing was performed. The Interaction Network Ontology (INO) was used to represent the identified interaction types in a hierarchical ontology structure. Ontological modeling of specific gene–gene interactions demonstrates that host–pathogen gene–gene interactions occur under experimental conditions that can be ontologically represented. Our results show that the introduced literature mining and ontology-based modeling approaches are effective in retrieving and analyzing host–pathogen gene–gene interaction networks.
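The difference between sentence-level and abstract-level co-occurrence can be illustrated with a short sketch. This is not the SciMiner pipeline or the authors' implementation: gene names are matched by plain whole-word lookup against supplied lists, sentences are split with a naive regular expression, and the example text and the names `hostA`, `hostB`, `bruX`, and `bruY` are invented placeholders.

```python
import re
from itertools import product


def split_sentences(text):
    """Naively split text into sentences at ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def mentions(text, names):
    """Return the subset of names that appear in text as whole words."""
    return {
        name
        for name in names
        if re.search(r"\b" + re.escape(name) + r"\b", text, flags=re.IGNORECASE)
    }


def cooccurrences(abstract, host_genes, pathogen_genes):
    """Return (sentence-level pairs, pairs found only at abstract level)."""
    sentence_pairs = set()
    for sentence in split_sentences(abstract):
        hosts = mentions(sentence, host_genes)
        pathogens = mentions(sentence, pathogen_genes)
        sentence_pairs |= set(product(hosts, pathogens))
    abstract_pairs = set(
        product(mentions(abstract, host_genes), mentions(abstract, pathogen_genes))
    )
    return sentence_pairs, abstract_pairs - sentence_pairs


# Invented example text with placeholder gene names.
example = (
    "hostA expression dropped after infection with a bruX mutant. "
    "hostB was unchanged. bruY was also studied."
)
sentence_level, abstract_only = cooccurrences(
    example, ["hostA", "hostB"], ["bruX", "bruY"]
)
print(sorted(sentence_level))
print(sorted(abstract_only))
```

Abstract-level processing always yields a superset of the sentence-level pairs, which is why the abstract reports the abstract-level results as additional interactions. Co-occurrence alone only proposes candidates, consistent with the manual evaluation step the abstract describes.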
Affiliation(s)
- İlknur Karadeniz
  - Department of Computer Engineering, Boğaziçi University, Istanbul, Turkey
- Junguk Hur
  - Department of Basic Sciences, School of Medicine and Health Sciences, University of North Dakota, Grand Forks, ND, USA
- Yongqun He
  - Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, University of Michigan, Ann Arbor, MI, USA
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, USA
  - Comprehensive Cancer Center, University of Michigan Health System, Ann Arbor, MI, USA
- Arzucan Özgür
  - Department of Computer Engineering, Boğaziçi University, Istanbul, Turkey