1. Gu Z. simona: a comprehensive R package for semantic similarity analysis on bio-ontologies. BMC Genomics 2024; 25:869. [PMID: 39285315; PMCID: PMC11406866; DOI: 10.1186/s12864-024-10759-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2024] [Accepted: 09/02/2024] [Indexed: 09/19/2024] Open
Abstract
BACKGROUND: Bio-ontologies are key to structuring complex biological information for effective data integration and knowledge representation. Semantic similarity analysis on bio-ontologies quantitatively assesses the degree of similarity between biological concepts based on the semantics encoded in ontologies. It plays an important role in the structured and meaningful interpretation and integration of complex data from multiple biological domains.
RESULTS: We present simona, a novel R package for semantic similarity analysis on general bio-ontologies. Simona implements infrastructure for ontology analysis by offering efficient data structures, fast ontology traversal methods, and elegant visualizations. Moreover, it provides a robust toolbox supporting over 70 methods for semantic similarity analysis. With simona, we conducted a benchmark of current semantic similarity methods. The results demonstrate that methods cluster according to their mathematical methodologies, which can guide researchers in selecting appropriate methods. Additionally, we explored annotation-based versus topology-based methods, revealing that semantic similarities based solely on ontology topology can efficiently reveal semantic similarity structures, facilitating analysis on less-studied organisms and other ontologies.
CONCLUSIONS: Simona offers a versatile interface and efficient implementation for processing, visualization, and semantic similarity analysis on bio-ontologies. We believe that simona will serve as a robust tool for uncovering relationships and enhancing the interoperability of biological knowledge systems.
Affiliation(s)
- Zuguang Gu
- Molecular Precision Oncology Program, National Center for Tumor Diseases (NCT), Im Neuenheimer Feld 280, Heidelberg, 69120, Germany.
2. Jannasch A, Tulok S, Okafornta CW, Kugel T, Bortolomeazzi M, Boissonnet T, Schmidt C, Vogelsang A, Dittfeld C, Tugtekin SM, Matschke K, Paliulis L, Thomas C, Lindemann D, Fabig G, Müller-Reichert T. Setting up an institutional OMERO environment for bioimage data: Perspectives from both facility staff and users. J Microsc 2024. [PMID: 39275979; DOI: 10.1111/jmi.13360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2024] [Revised: 09/03/2024] [Accepted: 09/04/2024] [Indexed: 09/16/2024]
Abstract
Modern bioimaging core facilities at research institutions are essential for managing and maintaining high-end instruments and for providing training and support for researchers in experimental design, image acquisition and data analysis. An important task for these facilities is the professional management of complex multidimensional bioimaging data, which are often produced in large quantities and in very different file formats. This article details the process that led to the successful implementation of the OME Remote Objects system (OMERO) for bioimage-specific research data management (RDM) at the Core Facility Cellular Imaging (CFCI) at the Technische Universität Dresden (TU Dresden). With the aim of ensuring compliance with the FAIR (findable, accessible, interoperable, reusable) principles, we outline here the challenges that we faced in adapting data handling and storage to a new RDM system. These challenges included the introduction of a standardised group-specific naming convention, metadata curation with tagging and Key-Value pairs, and integration of existing image processing workflows. By sharing our experiences, this article aims to provide insights and recommendations for both individual researchers and educational institutions intending to implement OMERO as a management system for bioimaging data. We showcase how tailored decisions and structured approaches lead to successful outcomes in RDM practices.
Affiliation(s)
- Anett Jannasch
- Department of Cardiac Surgery, Faculty of Medicine and University Hospital Carl Gustav Carus, Heart Centre Dresden, Technische Universität Dresden, Dresden, Germany
- Silke Tulok
- Core Facility Cellular Imaging, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Thomas Kugel
- IT Department, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Tom Boissonnet
- Center for Advanced Imaging, Heinrich-Heine-Universität Düsseldorf, Düsseldorf, Germany
- Christian Schmidt
- Enabling Technology Department, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Andy Vogelsang
- Core Facility Cellular Imaging, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Claudia Dittfeld
- Department of Cardiac Surgery, Faculty of Medicine and University Hospital Carl Gustav Carus, Heart Centre Dresden, Technische Universität Dresden, Dresden, Germany
- Sems-Malte Tugtekin
- Department of Cardiac Surgery, Faculty of Medicine and University Hospital Carl Gustav Carus, Heart Centre Dresden, Technische Universität Dresden, Dresden, Germany
- Klaus Matschke
- Department of Cardiac Surgery, Faculty of Medicine and University Hospital Carl Gustav Carus, Heart Centre Dresden, Technische Universität Dresden, Dresden, Germany
- Leocadia Paliulis
- Biology Department, Bucknell University, Lewisburg, Pennsylvania, USA
- Carola Thomas
- Faculty of Medicine Carl Gustav Carus, Institute of Medical Microbiology and Virology, Technische Universität Dresden, Dresden, Germany
- Dirk Lindemann
- Faculty of Medicine Carl Gustav Carus, Institute of Medical Microbiology and Virology, Technische Universität Dresden, Dresden, Germany
- Gunar Fabig
- Experimental Center, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Thomas Müller-Reichert
- Core Facility Cellular Imaging, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
- Experimental Center, Faculty of Medicine Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
3. Li J, Li Y, Pan Y, Guo J, Sun Z, Li F, He Y, Tao C. Mapping vaccine names in clinical trials to vaccine ontology using cascaded fine-tuned domain-specific language models. J Biomed Semantics 2024; 15:14. [PMID: 39123237; PMCID: PMC11316402; DOI: 10.1186/s13326-024-00318-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2023] [Accepted: 07/31/2024] [Indexed: 08/12/2024] Open
Abstract
BACKGROUND: Vaccines have revolutionized public health by providing protection against infectious diseases. They stimulate the immune system and generate memory cells to defend against targeted diseases. Clinical trials evaluate vaccine performance, including dosage, administration routes, and potential side effects. ClinicalTrials.gov is a valuable repository of clinical trial information, but the vaccine data in it lack standardization, leading to challenges in automatic concept mapping, vaccine-related knowledge development, evidence-based decision-making, and vaccine surveillance.
RESULTS: In this study, we developed a cascaded framework that capitalized on multiple domain knowledge sources, including clinical trials, the Unified Medical Language System (UMLS), and the Vaccine Ontology (VO), to enhance the performance of domain-specific language models for automated mapping of VO from clinical trials. The VO is a community-based ontology that was developed to promote vaccine data standardization, integration, and computer-assisted reasoning. Our methodology involved extracting and annotating data from various sources. We then performed pre-training on the PubMedBERT model, leading to the development of CTPubMedBERT. Subsequently, we enhanced CTPubMedBERT by incorporating SAPBERT, which was pretrained using the UMLS, resulting in CTPubMedBERT + SAPBERT. Further refinement was accomplished through fine-tuning using the Vaccine Ontology corpus and vaccine data from clinical trials, yielding the CTPubMedBERT + SAPBERT + VO model. Finally, we utilized a collection of pre-trained models, along with a weighted rule-based ensemble approach, to normalize the vaccine corpus and improve the accuracy of the process. The ranking process in concept normalization involves prioritizing and ordering potential concepts to identify the most suitable match for a given context. We ranked the top 10 concepts, and our experimental results demonstrate that the proposed cascaded framework consistently outperformed existing effective baselines on vaccine mapping, achieving 71.8% top-1 accuracy and 90.0% top-10 accuracy.
CONCLUSION: This study provides detailed insight into a cascaded framework of fine-tuned domain-specific language models that improves the mapping of VO from clinical trials. By effectively leveraging domain-specific information and applying weighted rule-based ensembles of different pre-trained BERT models, our framework can significantly enhance the mapping of VO from clinical trials.
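The top-1 and top-10 accuracy figures reported in this abstract follow a standard ranking metric: a mention counts as correct at rank k if its reference concept appears among the first k ranked candidates. The following is a minimal sketch of that metric only, not the authors' evaluation code. The function name `top_k_accuracy` and the concept identifiers are hypothetical placeholders, not real VO terms or results.

```python
from typing import Sequence


def top_k_accuracy(
    ranked: Sequence[Sequence[str]], gold: Sequence[str], k: int
) -> float:
    """Fraction of mentions whose gold concept is among the first k candidates."""
    if len(ranked) != len(gold):
        raise ValueError("ranked and gold must have the same length")
    if not gold:
        return 0.0
    hits = sum(1 for cands, g in zip(ranked, gold) if g in cands[:k])
    return hits / len(gold)


# Hypothetical ranked candidates for four vaccine mentions.
# The identifiers are placeholders, not real Vaccine Ontology IDs.
ranked = [
    ["VO:A", "VO:B", "VO:C"],
    ["VO:D", "VO:E", "VO:F"],
    ["VO:G", "VO:H", "VO:I"],
    ["VO:J", "VO:K", "VO:L"],
]
gold = ["VO:A", "VO:E", "VO:X", "VO:L"]

top1 = top_k_accuracy(ranked, gold, 1)  # only the first mention is right at rank 1
top3 = top_k_accuracy(ranked, gold, 3)  # three of four gold concepts are in the top 3
```

Because the top-k candidate set grows with k, the metric can only stay the same or increase as k increases, which is why a top-10 figure is never lower than the corresponding top-1 figure.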
Affiliation(s)
- Jianfu Li
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Jacksonville, FL, 32224, USA
- Yiming Li
- McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Yuanyi Pan
- Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, Center for Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
- Jinjing Guo
- Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, Center for Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
- Zenan Sun
- McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Fang Li
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Jacksonville, FL, 32224, USA
- Yongqun He
- Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, Center for Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA.
- Cui Tao
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Jacksonville, FL, 32224, USA.
4. He YO, Barisoni L, Rosenberg AZ, Robinson PN, Diehl AD, Chen Y, Phuong JP, Hansen J, Herr BW, Börner K, Schaub J, Bonevich N, Arnous G, Boddapati S, Zheng J, Alakwaa F, Sarder P, Duncan WD, Liang C, Valerius MT, Jain S, Iyengar R, Himmelfarb J, Kretzler M. Ontology-based modeling, integration, and analysis of heterogeneous clinical, pathological, and molecular kidney data for precision medicine. bioRxiv 2024:2024.04.01.587658. [PMID: 38617362; PMCID: PMC11014593; DOI: 10.1101/2024.04.01.587658] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/16/2024]
Abstract
Many data resources generate, process, store, or provide kidney-related molecular, pathological, and clinical data. Reference ontologies offer an opportunity to support knowledge and data integration. The Kidney Precision Medicine Project (KPMP) team contributed to the representation and addition of 329 kidney phenotype terms to the Human Phenotype Ontology (HPO), and identified many subcategories of acute kidney injury (AKI) and chronic kidney disease (CKD). The Kidney Tissue Atlas Ontology (KTAO) imports and integrates kidney-related terms from existing ontologies (e.g., HPO, CL, and Uberon) and represents 259 kidney-related biomarkers. We have also developed a precision medicine metadata ontology (PMMO) to integrate 50 variables from the KPMP and CZ CellxGene data resources and applied PMMO for integrative kidney data analysis. The gene expression profiles of kidney gene biomarkers were specifically analyzed under healthy control or AKI/CKD disease states. This work demonstrates how ontology-based approaches support multi-domain data and knowledge integration in precision medicine.
5. Raj S, Raj S, Namdeo V, Srivastava A. Decoding the gene-disease associations in type 2 diabetes: A curated dataset for text mining-based classification. Data Brief 2024; 54:110418. [PMID: 38708311; PMCID: PMC11068543; DOI: 10.1016/j.dib.2024.110418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2024] [Revised: 04/03/2024] [Accepted: 04/09/2024] [Indexed: 05/07/2024] Open
Abstract
Type 2 Diabetes (T2D) exerts a substantial impact on mortality rates. According to 2023 statistics, more than half a billion individuals are affected by T2D, making it one of the top 10 leading contributors to deaths worldwide. Multiple factors contribute to the onset of T2D, such as obesity, poor diet and lifestyle, and mutations in specific genes. Among these factors, genetics is pivotal. Because genes significantly influence the initiation and advancement of the various phases of T2D, our focus lies on exploring the association between T2D and genes. In the present article, we have curated standard disease-gene association data consisting of evidence (reference) sentences that carry disease-gene association information. Each sentence is classified into one of four classes: Yes, No, Ambiguous and X, corresponding to positive, negative, ambiguous and unrelated disease-gene associations, respectively. For this work, we downloaded T2D-related abstracts from PubMed using EDirect and pre-processed them to extract the reference sentences. These sentences then underwent two-fold manual validation to compile the disease-gene association data. The data produced in this article serve as reference data for training text mining-based classifiers of biological literature. Such classifiers can then be used to predict the classes of published literature, not only for T2D but also for a wide range of diseases and their complications. The positively linked genes compiled from these predictions can then be used for in-depth system-level analysis of T2D.
Affiliation(s)
- Sushrutha Raj
- Amity Institute of Integrative Sciences and Health, Amity University Haryana, Amity Education Valley, Gurgaon 122413, India
- Sushmitha Raj
- Sri Innovation and Research Foundation, Ghaziabad 201009, India
- Vindhya Namdeo
- Sri Innovation and Research Foundation, Ghaziabad 201009, India
- Alok Srivastava
- Sri Innovation and Research Foundation, Ghaziabad 201009, India
- L V Prasad Eye Institute, Hyderabad 500034, Telangana, India
6. Corcho O, Ekaputra FJ, Heibi I, Jonquet C, Micsik A, Peroni S, Storti E. A maturity model for catalogues of semantic artefacts. Sci Data 2024; 11:479. [PMID: 38730252; PMCID: PMC11087458; DOI: 10.1038/s41597-024-03185-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Accepted: 03/25/2024] [Indexed: 05/12/2024] Open
Abstract
This work presents a maturity model for assessing catalogues of semantic artefacts, one of the keystones that permit semantic interoperability of systems. We defined the dimensions and related features to include in the maturity model by analysing the current literature and existing catalogues of semantic artefacts provided by experts. In addition, we assessed 26 different catalogues to demonstrate the effectiveness of the maturity model, which includes 12 different dimensions (Metadata, Openness, Quality, Availability, Statistics, PID, Governance, Community, Sustainability, Technology, Transparency, and Assessment) and 43 related features (or sub-criteria) associated with these dimensions. Such a maturity model is one of the first attempts to provide recommendations for governance and processes for preserving and maintaining semantic artefacts and helps assess/address interoperability challenges.
Affiliation(s)
- Oscar Corcho
- Ontology Engineering Group (OEG), Computer Science School, Universidad Politécnica de Madrid, Madrid, Spain
- Fajar J Ekaputra
- DPKM, Vienna University of Economic and Business (WU), Vienna, Austria
- Data Science Research Unit, TU Wien, Vienna, Austria
- Ivan Heibi
- Digital Humanities Advanced Research Centre (/DH.arc), Department of Classical Philology and Italian Studies, University of Bologna, Bologna, Italy
- Research Centre for Open Scholarly Metadata, Department of Classical Philology and Italian Studies, University of Bologna, Bologna, Italy
- Clement Jonquet
- MISTEA, University of Montpellier, INRAE & Institut Agro, Montpellier, France
- LIRMM, University of Montpellier & CNRS, Montpellier, France
- Andras Micsik
- Department of Distributed Systems (DSD), Institute for Computer Science and Control (SZTAKI), Hungarian Research Network (HUN-REN), Budapest, Hungary
- Silvio Peroni
- Digital Humanities Advanced Research Centre (/DH.arc), Department of Classical Philology and Italian Studies, University of Bologna, Bologna, Italy.
- Research Centre for Open Scholarly Metadata, Department of Classical Philology and Italian Studies, University of Bologna, Bologna, Italy.
- Emanuele Storti
- Department of Information Engineering, Polytechnic University of Marche, Ancona, Italy
- European Council of Doctoral Candidates and Junior Researchers (Eurodoc), Brussels, Belgium
7. Galgonek J, Vondrášek J. The IDSM mass spectrometry extension: searching mass spectra using SPARQL. Bioinformatics 2024; 40:btae174. [PMID: 38561173; PMCID: PMC11034985; DOI: 10.1093/bioinformatics/btae174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Revised: 02/24/2024] [Accepted: 03/28/2024] [Indexed: 04/04/2024] Open
Abstract
SUMMARY: The Integrated Database of Small Molecules (IDSM) integrates data from small-molecule datasets, making them accessible through the SPARQL query language. Its unique feature is the ability to search for compounds through SPARQL based on their molecular structure. We extended IDSM to enable mass spectra databases to be integrated and searched based on mass spectrum similarity. As sources of mass spectra, we employed the MassBank of North America database and the In Silico Spectral Database of natural products.
AVAILABILITY AND IMPLEMENTATION: The extension is an integral part of IDSM, which is available at https://idsm.elixir-czech.cz. The manual and usage examples are available at https://idsm.elixir-czech.cz/docs/ms. The source code of all IDSM parts is available under open-source licences at https://github.com/idsm-src.
Affiliation(s)
- Jakub Galgonek
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, Prague 160 00, Czech Republic
- Jiří Vondrášek
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, Prague 160 00, Czech Republic
8. Gao Y, Zhou Q, Luo J, Xia C, Zhang Y, Yue Z. Crop-GPA: an integrated platform of crop gene-phenotype associations. NPJ Syst Biol Appl 2024; 10:15. [PMID: 38346982; PMCID: PMC10861494; DOI: 10.1038/s41540-024-00343-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Accepted: 01/22/2024] [Indexed: 02/15/2024] Open
Abstract
With the increasing availability of large-scale biology data in crop plants, there is an urgent demand for a versatile platform that fully mines and utilizes the data for modern molecular breeding. We present Crop-GPA (https://crop-gpa.aielab.net), a comprehensive and functional open-source platform for crop gene-phenotype association data. The current Crop-GPA provides well-curated information on genes, phenotypes, and their associations (GPAs) to researchers through an intuitive interface, dynamic graphical visualizations, and efficient online tools. Two computational tools, GPA-BERT and GPA-GCN, are specifically developed and integrated into Crop-GPA, facilitating the automatic extraction of gene-phenotype associations from bio-crop literature and predicting unknown relations based on known associations. Through usage examples, we demonstrate how our platform enables the exploration of complex correlations between genes and phenotypes in crop plants. In summary, Crop-GPA serves as a valuable multi-functional resource, empowering the crop research community to gain deeper insights into the biological mechanisms of interest.
Affiliation(s)
- Yujia Gao
- School of Information and Artificial Intelligence, Anhui Beidou Precision Agriculture Information Engineering Research Center, Anhui Agricultural University, Hefei, Anhui, 230036, China
- Qian Zhou
- School of Information and Artificial Intelligence, Anhui Beidou Precision Agriculture Information Engineering Research Center, Anhui Agricultural University, Hefei, Anhui, 230036, China
- Jiaxin Luo
- School of Information and Artificial Intelligence, Anhui Beidou Precision Agriculture Information Engineering Research Center, Anhui Agricultural University, Hefei, Anhui, 230036, China
- Chuan Xia
- School of Information and Artificial Intelligence, Anhui Beidou Precision Agriculture Information Engineering Research Center, Anhui Agricultural University, Hefei, Anhui, 230036, China
- Youhua Zhang
- School of Information and Artificial Intelligence, Anhui Beidou Precision Agriculture Information Engineering Research Center, Anhui Agricultural University, Hefei, Anhui, 230036, China.
- Zhenyu Yue
- School of Information and Artificial Intelligence, Anhui Beidou Precision Agriculture Information Engineering Research Center, Anhui Agricultural University, Hefei, Anhui, 230036, China.
9. Wang Y, Ye M, Zhang F, Freeman ZT, Yu H, Ye X, He Y. Ontology-based taxonomical analysis of experimentally verified natural and laboratory human coronavirus hosts and its implication for COVID-19 virus origination and transmission. PLoS One 2024; 19:e0295541. [PMID: 38252647; PMCID: PMC10802970; DOI: 10.1371/journal.pone.0295541] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Accepted: 11/26/2023] [Indexed: 01/24/2024] Open
Abstract
To fully understand COVID-19, it is critical to study all possible hosts of SARS-CoV-2 (the pathogen of COVID-19). In this work, we collected, annotated, and performed ontology-based taxonomical analysis of all the reported and verified hosts for all human coronaviruses, including SARS-CoV, MERS-CoV, SARS-CoV-2, HCoV-229E, HCoV-NL63, HCoV-OC43, and HCoV-HKU1. A total of 37 natural hosts and 19 laboratory animal hosts of human coronaviruses were identified based on experimental evidence. Our analysis found that all the verified susceptible natural and laboratory animals are therian mammals. Specifically, the 37 natural therian hosts include one wildlife marsupial mammal (i.e., Virginia opossum) and 36 Eutheria mammals (a.k.a. placental mammals). The 19 laboratory animal hosts are also classified as therian mammals. The mouse models with genetically modified human ACE2 or DPP4 were more susceptible to virulent human coronaviruses with clear symptoms, suggesting a critical role of ACE2 and DPP4 in coronavirus virulence. Coronaviruses became more virulent and adaptive in the mouse hosts after a series of viral passages in the mice, providing a clue to possible coronavirus origination. The Huanan Seafood Wholesale Market animals identified early in the COVID-19 outbreak were also systematically analyzed as possible COVID-19 hosts. To support knowledge standardization and query, the annotated host knowledge was modeled and represented in the Coronavirus Infectious Disease Ontology (CIDO). Based on our and others' findings, we further propose a MOVIE model (i.e., Multiple-Organism viral Variations and Immune Evasion) to address how viral variations in therian animal hosts and host immune evasion might have led to dynamic COVID-19 pandemic outcomes.
Affiliation(s)
- Yang Wang
- Guizhou University School of Medicine, Guiyang, Guizhou, China
- Department of Respiratory and Critical Care Medicine, Guizhou Provincial People’s Hospital and NHC Key Laboratory of Immunological Diseases, People’s Hospital of Guizhou University, Guiyang, Guizhou, China
- Unit for Laboratory Animal Medicine, University of Michigan Medical School, Ann Arbor, MI, United States of America
- Muhui Ye
- Chinese University of Hong Kong (Shenzhen), Shenzhen, Guangdong, China
- Fengwei Zhang
- Guizhou University School of Medicine, Guiyang, Guizhou, China
- Zachary Thomas Freeman
- Unit for Laboratory Animal Medicine, University of Michigan Medical School, Ann Arbor, MI, United States of America
- Hong Yu
- Guizhou University School of Medicine, Guiyang, Guizhou, China
- Department of Respiratory and Critical Care Medicine, Guizhou Provincial People’s Hospital and NHC Key Laboratory of Immunological Diseases, People’s Hospital of Guizhou University, Guiyang, Guizhou, China
- Xianwei Ye
- Guizhou University School of Medicine, Guiyang, Guizhou, China
- Department of Respiratory and Critical Care Medicine, Guizhou Provincial People’s Hospital and NHC Key Laboratory of Immunological Diseases, People’s Hospital of Guizhou University, Guiyang, Guizhou, China
- Yongqun He
- Unit for Laboratory Animal Medicine, University of Michigan Medical School, Ann Arbor, MI, United States of America
- Department of Microbiology and Immunology, University of Michigan Medical School, Ann Arbor, MI, United States of America
- Center for Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, United States of America
10. Baron JA, Johnson CSB, Schor MA, Olley D, Nickel L, Felix V, Munro J, Bello S, Bearer C, Lichenstein R, Bisordi K, Koka R, Greene C, Schriml L. The DO-KB Knowledgebase: a 20-year journey developing the disease open science ecosystem. Nucleic Acids Res 2024; 52:D1305-D1314. [PMID: 37953304; PMCID: PMC10767934; DOI: 10.1093/nar/gkad1051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Revised: 10/20/2023] [Accepted: 10/23/2023] [Indexed: 11/14/2023] Open
Abstract
In 2003, the Human Disease Ontology (DO, https://disease-ontology.org/) was established at Northwestern University. In the intervening 20 years, the DO has expanded to become a highly utilized disease knowledge resource. Serving as the nomenclature and classification standard for human diseases, the DO provides a stable, etiology-based structure integrating mechanistic drivers of human disease. Over the past two decades the DO has grown from a collection of clinical vocabularies into an expertly curated semantic resource of over 11300 common and rare diseases, linking disease concepts through more than 37000 vocabulary cross mappings (v2023-08-08). Here, we introduce the recently launched DO Knowledgebase (DO-KB), which expands the DO's representation of the diseaseome and enhances the findability, accessibility, interoperability and reusability (FAIR) of disease data through a new SPARQL service and a new Faceted Search Interface. The DO-KB is an integrated data system, built upon the DO's semantic disease knowledge backbone, with resources that expose and connect the DO's semantic knowledge with disease-related data across Open Linked Data resources. This update includes descriptions of efforts to assess the DO's global impact and improvements to data quality and content, with emphasis on changes in the last two years.
Affiliation(s)
- J Allen Baron
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
- Michael A Schor
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
- Dustin Olley
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
- Lance Nickel
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
- Victor Felix
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
- James B Munro
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
- Animal and Plant Health Inspection Service, Plant Protection and Quarantine, USDA, USA
- Susan M Bello
- Mouse Genome Informatics, The Jackson Laboratory, Bar Harbor, ME, USA
- Rima Koka
- University of Maryland School of Medicine, Baltimore, MD, USA
- Carol Greene
- University of Maryland School of Medicine, Baltimore, MD, USA
- Lynn M Schriml
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
11. Zhang Y, Sun H, Zhang W, Fu T, Huang S, Mou M, Zhang J, Gao J, Ge Y, Yang Q, Zhu F. CellSTAR: a comprehensive resource for single-cell transcriptomic annotation. Nucleic Acids Res 2024; 52:D859-D870. [PMID: 37855686; PMCID: PMC10767908; DOI: 10.1093/nar/gkad874] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 09/12/2023] [Accepted: 09/27/2023] [Indexed: 10/20/2023] Open
Abstract
Large-scale studies of single-cell sequencing and biological experiments have successfully revealed expression patterns that distinguish different cell types in tissues, emphasizing the importance of studying cellular heterogeneity and accurately annotating cell types. Analysis of gene expression profiles in these experiments provides two essential types of data for cell type annotation: annotated references and canonical markers. In this study, the first comprehensive database of single-cell transcriptomic annotation resource (CellSTAR) was thus developed. It is unique in (a) offering the comprehensive expertly annotated reference data for annotating hundreds of cell types for the first time and (b) enabling the collective consideration of reference data and marker genes by incorporating tens of thousands of markers. Given its unique features, CellSTAR is expected to attract broad research interests from the technological innovations in single-cell transcriptomics, the studies of cellular heterogeneity & dynamics, and so on. It is now publicly accessible without any login requirement at: https://idrblab.org/cellstar.
Affiliation(s)
- Ying Zhang
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Huaicheng Sun
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Wei Zhang
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Tingting Fu
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Shijie Huang
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Minjie Mou
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Jinsong Zhang
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Jianqing Gao
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Yichao Ge
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Qingxia Yang
  - Zhejiang Provincial Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases, Women's Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
  - Department of Bioinformatics, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
- Feng Zhu
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
|
12
|
Zancolli G, von Reumont BM, Anderluh G, Caliskan F, Chiusano ML, Fröhlich J, Hapeshi E, Hempel BF, Ikonomopoulou MP, Jungo F, Marchot P, de Farias TM, Modica MV, Moran Y, Nalbantsoy A, Procházka J, Tarallo A, Tonello F, Vitorino R, Zammit ML, Antunes A. Web of venom: exploration of big data resources in animal toxin research. Gigascience 2024; 13:giae054. [PMID: 39250076 PMCID: PMC11382406 DOI: 10.1093/gigascience/giae054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2024] [Revised: 07/01/2024] [Accepted: 07/13/2024] [Indexed: 09/10/2024] Open
Abstract
Research on animal venoms and their components spans multiple disciplines, including biology, biochemistry, bioinformatics, pharmacology, medicine, and more. Manipulating and analyzing the diverse array of data required for venom research can be challenging, and relevant tools and resources are often dispersed across different online platforms, making them less accessible to nonexperts. In this article, we address the multifaceted needs of the scientific community involved in venom and toxin-related research by identifying and discussing web resources, databases, and tools commonly used in this field. We have compiled these resources into a comprehensive table available on the VenomZone website (https://venomzone.expasy.org/10897). Furthermore, we highlight the challenges currently faced by researchers in accessing and using these resources and emphasize the importance of community-driven interdisciplinary approaches. We conclude by underscoring the significance of enhancing standards, promoting interoperability, and encouraging data and method sharing within the venom research community.
Affiliation(s)
- Giulia Zancolli
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Björn Marcus von Reumont
  - Goethe University Frankfurt, Faculty of Biological Sciences, 60438 Frankfurt, Germany
  - LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Gregor Anderluh
  - Department of Molecular Biology and Nanobiotechnology, National Institute of Chemistry, 1000 Ljubljana, Slovenia
- Figen Caliskan
  - Department of Biology, Faculty of Science, Eskisehir Osmangazi University, 26040 Eskişehir, Turkey
- Maria Luisa Chiusano
  - Department of Agricultural Sciences, University Federico II of Naples, 80055 Portici, Naples, Italy
  - Department of Research Infrastructures for Marine Biological Resources, Stazione Zoologica Anton Dohrn, Villa Comunale, 80121 Naples, Italy
- Jacob Fröhlich
  - Veterinary Center for Resistance Research (TZR), Freie Universität Berlin, 14163 Berlin, Germany
- Evroula Hapeshi
  - Department of Health Sciences, School of Life and Health Sciences, University of Nicosia, 1700 Nicosia, Cyprus
- Benjamin-Florian Hempel
  - Veterinary Center for Resistance Research (TZR), Freie Universität Berlin, 14163 Berlin, Germany
- Maria P Ikonomopoulou
  - Madrid Institute of Advanced Studies in Food, Precision Nutrition & Aging Program, 28049 Madrid, Spain
- Florence Jungo
  - SIB Swiss Institute of Bioinformatics, Swiss-Prot Group, 1211 Geneva, Switzerland
- Pascale Marchot
  - Laboratory Architecture et Fonction des Macromolécules Biologiques, Aix-Marseille University, Centre National de la Recherche Scientifique, Faculté des Sciences, Campus Luminy, 13288 Marseille, France
- Tarcisio Mendes de Farias
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Maria Vittoria Modica
  - Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, 00198 Rome, Italy
- Yehu Moran
  - Department of Ecology, Evolution and Behavior, Alexander Silberman Institute of Life Sciences, Faculty of Science, The Hebrew University of Jerusalem, 9190401 Jerusalem, Israel
- Ayse Nalbantsoy
  - Engineering Faculty, Bioengineering Department, Ege University, 35100 Bornova-Izmir, Turkey
- Jan Procházka
  - Laboratory of Transgenic Models of Diseases, Institute of Molecular Genetics of the Czech Academy of Sciences, 252 50 Vestec, Czech Republic
- Andrea Tarallo
  - Institute of Research on Terrestrial Ecosystems (IRET), National Research Council (CNR), 73100 Lecce, Italy
- Fiorella Tonello
  - Neuroscience Institute, National Research Council (CNR), 35131 Padua, Italy
- Rui Vitorino
  - Department of Medical Sciences, iBiMED, University of Aveiro, 3810-193 Aveiro, Portugal
- Mark Lawrence Zammit
  - Department of Clinical Pharmacology & Therapeutics, Faculty of Medicine & Surgery, University of Malta, 2090 Msida, Malta
  - Malta National Poisons Centre, Malta Life Sciences Park, 3000 San Ġwann, Malta
- Agostinho Antunes
  - CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, 4450-208 Porto, Portugal
  - Department of Biology, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
|
13
|
Eloe-Fadrosh EA, Mungall CJ, Miller MA, Smith M, Patil SS, Kelliher JM, Johnson LYD, Rodriguez FE, Chain PSG, Hu B, Thornton MB, McCue LA, McHardy AC, Harris NL, Reddy TBK, Mukherjee S, Hunter CI, Walls R, Schriml LM. A Practical Approach to Using the Genomic Standards Consortium MIxS Reporting Standard for Comparative Genomics and Metagenomics. Methods Mol Biol 2024; 2802:587-609. [PMID: 38819573 DOI: 10.1007/978-1-0716-3838-5_20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024]
Abstract
Comparative analysis of (meta)genomes necessitates aggregation, integration, and synthesis of well-annotated data using standards. The Genomic Standards Consortium (GSC) collaborates with the research community to develop and maintain the Minimum Information about any (x) Sequence (MIxS) reporting standard for genomic data. To facilitate the use of the GSC's MIxS reporting standard, we describe its structure and terminology, explain how to navigate ontologies for required terms in MIxS, and demonstrate practical usage through a soil metagenome example.
Affiliation(s)
- Emiley A Eloe-Fadrosh
  - Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Christopher J Mungall
  - Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Mark Andrew Miller
  - Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Montana Smith
  - Pacific Northwest National Laboratory, Richland, WA, USA
- Sujay Sanjeev Patil
  - Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Julia M Kelliher
  - Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Leah Y D Johnson
  - Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Patrick S G Chain
  - Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Bin Hu
  - Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Michael B Thornton
  - Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Lee Ann McCue
  - Pacific Northwest National Laboratory, Richland, WA, USA
- Alice Carolyn McHardy
  - Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Braunschweig, Germany
- Nomi L Harris
  - Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- T B K Reddy
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Supratim Mukherjee
  - DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Christopher I Hunter
  - GigaScience Press, Hong Kong Science Park, Pak Shek Kok, New Territories, Hong Kong
- Lynn M Schriml
  - University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
|
14
|
Carmody LC, Gargano MA, Toro S, Vasilevsky NA, Adam MP, Blau H, Chan LE, Gomez-Andres D, Horvath R, Kraus ML, Ladewig MS, Lewis-Smith D, Lochmüller H, Matentzoglu NA, Munoz-Torres MC, Schuetz C, Seitz B, Similuk MN, Sparks TN, Strauss T, Swietlik EM, Thompson R, Zhang XA, Mungall CJ, Haendel MA, Robinson PN. The Medical Action Ontology: A tool for annotating and analyzing treatments and clinical management of human disease. MED 2023; 4:913-927.e3. [PMID: 37963467 PMCID: PMC10842845 DOI: 10.1016/j.medj.2023.10.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 08/31/2023] [Accepted: 10/14/2023] [Indexed: 11/16/2023]
Abstract
BACKGROUND Navigating the clinical literature to determine the optimal clinical management for rare diseases presents significant challenges. We introduce the Medical Action Ontology (MAxO), an ontology specifically designed to organize medical procedures, therapies, and interventions. METHODS MAxO incorporates logical structures that link MAxO terms to numerous other ontologies within the OBO Foundry. Term development involves a blend of manual and semi-automated processes. Additionally, we have generated annotations detailing diagnostic modalities for specific phenotypic abnormalities defined by the Human Phenotype Ontology (HPO). We introduce a web application, POET, that facilitates MAxO annotations for specific medical actions for diseases using the Mondo Disease Ontology. FINDINGS MAxO encompasses 1,757 terms spanning a wide range of biomedical domains, from human anatomy and investigations to the chemical and protein entities involved in biological processes. These terms annotate phenotypic features associated with specific diseases (using HPO and Mondo). Presently, there are over 16,000 MAxO diagnostic annotations that target HPO terms. Through POET, we have created 413 MAxO annotations specifying treatments for 189 rare diseases. CONCLUSIONS MAxO offers a computational representation of treatments and other actions taken for the clinical management of patients. Its development is closely coupled to Mondo and HPO, broadening the scope of our computational modeling of diseases and phenotypic features. We invite the community to contribute disease annotations using POET (https://poet.jax.org/). MAxO is available under the open-source CC-BY 4.0 license (https://github.com/monarch-initiative/MAxO). FUNDING NHGRI 1U24HG011449-01A1 and NHGRI 5RM1HG010860-04.
Affiliation(s)
- Leigh C Carmody
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Sabrina Toro
  - University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Margaret P Adam
  - University of Washington School of Medicine, Seattle, WA, USA
- Hannah Blau
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- David Gomez-Andres
  - Pediatric Neurology, Vall d'Hebron Institut de Recerca (VHIR), Hospital Universitari Vall d'Hebron, Vall d'Hebron Barcelona Hospital Campus, Passeig Vall d'Hebron 119-129, 08035 Barcelona, Spain
- Rita Horvath
  - Department of Clinical Neurosciences, University of Cambridge, Robinson Way, Cambridge CB2 0PY, UK
- Megan L Kraus
  - University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Markus S Ladewig
  - Department of Ophthalmology, Klinikum Saarbrücken, Saarbrücken, Germany
- David Lewis-Smith
  - Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
- Hanns Lochmüller
  - Children's Hospital of Eastern Ontario Research Institute, Ottawa, Canada; Division of Neurology, Department of Medicine, The Ottawa Hospital, Ottawa, Canada; Brain and Mind Research Institute, University of Ottawa, Ottawa, Canada; Department of Neuropediatrics and Muscle Disorders, Medical Center - University of Freiburg, Faculty of Medicine, Freiburg, Germany; Centro Nacional de Análisis Genómico, Barcelona, Spain
- Catharina Schuetz
  - Department of Pediatrics, Medizinische Fakultät Carl Gustav Carus, Technische Universität Dresden, 01307 Dresden, Germany
- Berthold Seitz
  - Department of Ophthalmology, Saarland University Medical Center UKS, Homburg, Saar, Germany
- Morgan N Similuk
  - National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
- Teresa N Sparks
  - Department of Obstetrics, Gynecology, & Reproductive Sciences, University of California, San Francisco, San Francisco, CA 94143, USA
- Timmy Strauss
  - Department of Pediatrics, Medizinische Fakultät Carl Gustav Carus, Technische Universität Dresden, 01307 Dresden, Germany
- Emilia M Swietlik
  - Department of Medicine, University of Cambridge, Heart and Lung Research Institute, Cambridge CB2 0BB, UK
- Rachel Thompson
  - Children's Hospital of Eastern Ontario Research Institute, Ottawa, Canada
- Peter N Robinson
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
|
15
|
Dumschott K, Dörpholz H, Laporte MA, Brilhaus D, Schrader A, Usadel B, Neumann S, Arnaud E, Kranz A. Ontologies for increasing the FAIRness of plant research data. FRONTIERS IN PLANT SCIENCE 2023; 14:1279694. [PMID: 38098789 PMCID: PMC10720748 DOI: 10.3389/fpls.2023.1279694] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2023] [Accepted: 11/15/2023] [Indexed: 12/17/2023]
Abstract
The importance of improving the FAIRness (findability, accessibility, interoperability, reusability) of research data is undeniable, especially in the face of large, complex datasets currently being produced by omics technologies. Facilitating the integration of a dataset with other types of data increases the likelihood of reuse, and the potential of answering novel research questions. Ontologies are a useful tool for semantically tagging datasets, as adding relevant metadata increases the understanding of how data was produced and increases its interoperability. Ontologies provide concepts for a particular domain as well as the relationships between concepts. By tagging data with ontology terms, data becomes both human- and machine-interpretable, allowing for increased reuse and interoperability. However, the task of identifying ontologies relevant to a particular research domain or technology is challenging, especially within the diverse realm of fundamental plant research. In this review, we outline the ontologies most relevant to the fundamental plant sciences and how they can be used to annotate data related to plant-specific experiments within metadata frameworks, such as Investigation-Study-Assay (ISA). We also outline repositories and platforms most useful for identifying applicable ontologies or finding ontology terms.
Affiliation(s)
- Kathryn Dumschott
  - Institute of Bio- and Geosciences (IBG-4: Bioinformatics) & Bioeconomy Science Center (BioSC), CEPLAS, Forschungszentrum Jülich, Jülich, Germany
- Hannah Dörpholz
  - Institute of Bio- and Geosciences (IBG-4: Bioinformatics) & Bioeconomy Science Center (BioSC), CEPLAS, Forschungszentrum Jülich, Jülich, Germany
- Marie-Angélique Laporte
  - Digital Solutions Team, Digital Inclusion Lever, Bioversity International, Montpellier Office, Montpellier, France
- Dominik Brilhaus
  - Data Science and Management & Cluster of Excellence on Plant Sciences (CEPLAS), Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Andrea Schrader
  - Data Science and Management & Cluster of Excellence on Plant Sciences (CEPLAS), University of Cologne, Cologne, Germany
- Björn Usadel
  - Institute of Bio- and Geosciences (IBG-4: Bioinformatics) & Bioeconomy Science Center (BioSC), CEPLAS, Forschungszentrum Jülich, Jülich, Germany
  - Institute for Biological Data Science & Cluster of Excellence on Plant Sciences (CEPLAS), Faculty of Mathematics and Life Sciences, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Steffen Neumann
  - Program Center MetaCom, Leibniz Institute of Plant Biochemistry, Halle, Germany
  - German Centre for Integrative Biodiversity Research (iDiv), Halle-Jena-Leipzig, Germany
- Elizabeth Arnaud
  - Digital Solutions Team, Digital Inclusion Lever, Bioversity International, Montpellier Office, Montpellier, France
- Angela Kranz
  - Institute of Bio- and Geosciences (IBG-4: Bioinformatics) & Bioeconomy Science Center (BioSC), CEPLAS, Forschungszentrum Jülich, Jülich, Germany
|
16
|
Li J, Li Y, Pan Y, Guo J, Sun Z, Li F, He Y, Tao C. Mapping Vaccine Names in Clinical Trials to Vaccine Ontology using Cascaded Fine-Tuned Domain-Specific Language Models. RESEARCH SQUARE 2023:rs.3.rs-3362256. [PMID: 37841880 PMCID: PMC10571639 DOI: 10.21203/rs.3.rs-3362256/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2023]
Abstract
Background Vaccines have revolutionized public health by providing protection against infectious diseases. They stimulate the immune system and generate memory cells to defend against targeted diseases. Clinical trials evaluate vaccine performance, including dosage, administration routes, and potential side effects. ClinicalTrials.gov is a valuable repository of clinical trial information, but the vaccine data in it lacks standardization, leading to challenges in automatic concept mapping, vaccine-related knowledge development, evidence-based decision-making, and vaccine surveillance. Results In this study, we developed a cascaded framework that capitalized on multiple domain knowledge sources, including clinical trials, Unified Medical Language System (UMLS), and the Vaccine Ontology (VO), to enhance the performance of domain-specific language models for automated mapping of VO from clinical trials. The Vaccine Ontology (VO) is a community-based ontology that was developed to promote vaccine data standardization, integration, and computer-assisted reasoning. Our methodology involved extracting and annotating data from various sources. We then performed pre-training on the PubMedBERT model, leading to the development of CTPubMedBERT. Subsequently, we enhanced CTPubMedBERT by incorporating SAPBERT, which was pretrained using the UMLS, resulting in CTPubMedBERT + SAPBERT. Further refinement was accomplished through fine-tuning using the Vaccine Ontology corpus and vaccine data from clinical trials, yielding the CTPubMedBERT + SAPBERT + VO model. Finally, we utilized a collection of pre-trained models, along with the weighted rule-based ensemble approach, to normalize the vaccine corpus and improve the accuracy of the process. The ranking process in concept normalization involves prioritizing and ordering potential concepts to identify the most suitable match for a given context. We conducted a ranking of the top 10 concepts, and our experimental results demonstrate that our proposed cascaded framework consistently outperformed existing effective baselines on vaccine mapping, achieving 71.8% top-1 accuracy and 90.0% top-10 accuracy. Conclusion This study provides a detailed insight into a cascaded framework of fine-tuned domain-specific language models improving mapping of VO from clinical trials. By effectively leveraging domain-specific information and applying weighted rule-based ensembles of different pre-trained BERT models, our framework can significantly enhance the mapping of VO from clinical trials.
Affiliation(s)
- Jianfu Li
  - The University of Texas Health Science Center at Houston
- Yiming Li
  - The University of Texas Health Science Center at Houston
- Zenan Sun
  - The University of Texas Health Science Center at Houston
- Fang Li
  - The University of Texas Health Science Center at Houston
- Cui Tao
  - The University of Texas Health Science Center at Houston
|
17
|
Stefancsik R, Balhoff JP, Balk MA, Ball RL, Bello SM, Caron AR, Chesler EJ, de Souza V, Gehrke S, Haendel M, Harris LW, Harris NL, Ibrahim A, Koehler S, Matentzoglu N, McMurry JA, Mungall CJ, Munoz-Torres MC, Putman T, Robinson P, Smedley D, Sollis E, Thessen AE, Vasilevsky N, Walton DO, Osumi-Sutherland D. The Ontology of Biological Attributes (OBA)-computational traits for the life sciences. Mamm Genome 2023; 34:364-378. [PMID: 37076585 PMCID: PMC10382347 DOI: 10.1007/s00335-023-09992-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 01/27/2023] [Accepted: 04/06/2023] [Indexed: 04/21/2023]
Abstract
Existing phenotype ontologies were originally developed to represent phenotypes that manifest as a character state in relation to a wild-type or other reference. However, these do not include the phenotypic trait or attribute categories required for the annotation of genome-wide association studies (GWAS), Quantitative Trait Loci (QTL) mappings or any population-focussed measurable trait data. The integration of trait and biological attribute information with an ever increasing body of chemical, environmental and biological data greatly facilitates computational analyses and it is also highly relevant to biomedical and clinical applications. The Ontology of Biological Attributes (OBA) is a formalised, species-independent collection of interoperable phenotypic trait categories that is intended to fulfil a data integration role. OBA is a standardised representational framework for observable attributes that are characteristics of biological entities, organisms, or parts of organisms. OBA has a modular design which provides several benefits for users and data integrators, including an automated and meaningful classification of trait terms computed on the basis of logical inferences drawn from domain-specific ontologies for cells, anatomical and other relevant entities. The logical axioms in OBA also provide a previously missing bridge that can computationally link Mendelian phenotypes with GWAS and quantitative traits. The term components in OBA provide semantic links and enable knowledge and data integration across specialised research community boundaries, thereby breaking silos.
Affiliation(s)
- Ray Stefancsik
  - European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
- James P Balhoff
  - Renaissance Computing Institute, University of North Carolina, Chapel Hill, NC, 27517, USA
- Meghan A Balk
  - Natural History Museum, University of Oslo, Oslo, Norway
- Robyn L Ball
  - The Jackson Laboratory, Bar Harbor, ME, 04609, USA
- Anita R Caron
  - European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
- Vinicius de Souza
  - European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
- Sarah Gehrke
  - Anschutz Medical Campus, University of Colorado, Aurora, CO, 80045, USA
- Melissa Haendel
  - Anschutz Medical Campus, University of Colorado, Aurora, CO, 80045, USA
- Laura W Harris
  - European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
- Nomi L Harris
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Arwa Ibrahim
  - European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
- Julie A McMurry
  - Anschutz Medical Campus, University of Colorado, Aurora, CO, 80045, USA
- Christopher J Mungall
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Tim Putman
  - Anschutz Medical Campus, University of Colorado, Aurora, CO, 80045, USA
- Damian Smedley
  - William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, EC1M 6BQ, UK
- Elliot Sollis
  - European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
- Anne E Thessen
  - Anschutz Medical Campus, University of Colorado, Aurora, CO, 80045, USA
- Nicole Vasilevsky
  - Data Collaboration Center, Critical Path Institute, Tucson, AZ, 85718, USA
|
18
|
Menotti L, Silvello G, Atzori M, Boytcheva S, Ciompi F, Di Nunzio GM, Fraggetta F, Giachelle F, Irrera O, Marchesin S, Marini N, Müller H, Primov T. Modelling digital health data: The ExaMode ontology for computational pathology. J Pathol Inform 2023; 14:100332. [PMID: 37705689 PMCID: PMC10495665 DOI: 10.1016/j.jpi.2023.100332] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 07/14/2023] [Accepted: 08/16/2023] [Indexed: 09/15/2023] Open
Abstract
Computational pathology can significantly benefit from ontologies to standardize the employed nomenclature and help with knowledge extraction processes for high-quality annotated image datasets. The end goal is to reach a shared model for digital pathology to overcome data variability and integration problems. Indeed, data annotation in such a specific domain is still an unsolved challenge and datasets cannot be steadily reused in diverse contexts due to heterogeneity issues of the adopted labels, multilingualism, and different clinical practices. Material and methods This paper presents the ExaMode ontology, modeling the histopathology process by considering 3 key cancer diseases (colon, cervical, and lung tumors) and celiac disease. The ExaMode ontology has been designed bottom-up in an iterative fashion with continuous feedback and validation from pathologists and clinicians. The ontology is organized into 5 semantic areas that define an ontological template to model any disease of interest in histopathology. Results The ExaMode ontology is currently being used as a common semantic layer in: (i) an entity linking tool for the automatic annotation of medical records; (ii) a web-based collaborative annotation tool for histopathology text reports; and (iii) a software platform for building holistic solutions integrating multimodal histopathology data. Discussion The ExaMode ontology is a key means to store data in a graph database according to the RDF data model. The creation of an RDF dataset can help develop more accurate algorithms for image analysis, especially in the field of digital pathology. This approach allows for seamless data integration and a unified query access point, from which we can extract relevant clinical insights about the considered diseases using SPARQL queries.
Affiliation(s)
- Laura Menotti
  - Department of Information Engineering, University of Padua, Padova, Italy
- Gianmaria Silvello
  - Department of Information Engineering, University of Padua, Padova, Italy
- Manfredo Atzori
  - Information Systems Institute, University of Applied Sciences Western Switzerland, Delémont, Switzerland
  - Department of Neuroscience, University of Padua, Padova, Italy
- Francesco Ciompi
  - Department of Pathology, Radboud University Medical Center, Nijmegen, The Netherlands
- Fabio Giachelle
  - Department of Information Engineering, University of Padua, Padova, Italy
- Ornella Irrera
  - Department of Information Engineering, University of Padua, Padova, Italy
- Stefano Marchesin
  - Department of Information Engineering, University of Padua, Padova, Italy
- Niccolò Marini
  - Information Systems Institute, University of Applied Sciences Western Switzerland, Delémont, Switzerland
- Henning Müller
  - Information Systems Institute, University of Applied Sciences Western Switzerland, Delémont, Switzerland
|
19
|
Danis D, Jacobsen JOB, Wagner AH, Groza T, Beckwith MA, Rekerle L, Carmody LC, Reese J, Hegde H, Ladewig MS, Seitz B, Munoz-Torres M, Harris NL, Rambla J, Baudis M, Mungall CJ, Haendel MA, Robinson PN. Phenopacket-tools: Building and validating GA4GH Phenopackets. PLoS One 2023; 18:e0285433. [PMID: 37196000 PMCID: PMC10191354 DOI: 10.1371/journal.pone.0285433] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2023] [Accepted: 04/21/2023] [Indexed: 05/19/2023] Open
Abstract
The Global Alliance for Genomics and Health (GA4GH) is a standards-setting organization that is developing a suite of coordinated standards for genomics. The GA4GH Phenopacket Schema is a standard for sharing disease and phenotype information that characterizes an individual person or biosample. The Phenopacket Schema is flexible and can represent clinical data for any kind of human disease including rare disease, complex disease, and cancer. It also allows consortia or databases to apply additional constraints to ensure uniform data collection for specific goals. We present phenopacket-tools, an open-source Java library and command-line application for construction, conversion, and validation of phenopackets. Phenopacket-tools simplifies construction of phenopackets by providing concise builders, programmatic shortcuts, and predefined building blocks (ontology classes) for concepts such as anatomical organs, age of onset, biospecimen type, and clinical modifiers. Phenopacket-tools can be used to validate the syntax and semantics of phenopackets as well as to assess adherence to additional user-defined requirements. The documentation includes examples showing how to use the Java library and the command-line tool to create and validate phenopackets. We demonstrate how to create, convert, and validate phenopackets using the library or the command-line application. Source code, API documentation, comprehensive user guide and a tutorial can be found at https://github.com/phenopackets/phenopacket-tools. The library can be installed from the public Maven Central artifact repository and the application is available as a standalone archive. The phenopacket-tools library helps developers implement and standardize the collection and exchange of phenotypic and other clinical data for use in phenotype-driven genomic diagnostics, translational research, and precision medicine applications.
Affiliation(s)
- Daniel Danis
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, United States of America
| | - Julius O. B. Jacobsen
- William Harvey Research Institute, Queen Mary University of London, London, United Kingdom
| | - Alex H. Wagner
- Departments of Pediatrics and Biomedical Informatics, The Ohio State University College of Medicine, Columbus, OH, United States of America
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children’s Hospital, Columbus, OH, United States of America
| | | | - Martha A. Beckwith
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, United States of America
| | - Lauren Rekerle
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, United States of America
| | - Leigh C. Carmody
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, United States of America
| | - Justin Reese
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, United States of America
| | - Harshad Hegde
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, United States of America
| | - Markus S. Ladewig
- Department of Ophthalmology, Klinikum Saarbrücken, Saarbrücken, Germany
| | - Berthold Seitz
- Department of Ophthalmology, Saarland University Medical Center, Homburg/Saar, Germany
| | - Monica Munoz-Torres
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, United States of America
| | - Nomi L. Harris
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, United States of America
| | - Jordi Rambla
- European Genome-Phenome Archive (EGA) in the Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Michael Baudis
- University of Zurich and Swiss Institute of Bioinformatics, Zurich, Switzerland
| | - Christopher J. Mungall
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, United States of America
| | - Melissa A. Haendel
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, United States of America
| | - Peter N. Robinson
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, United States of America
- Institute for Systems Genomics, University of Connecticut, Farmington, CT, United States of America
| |
|
20
|
Sanders LM, Scott RT, Yang JH, Qutub AA, Garcia Martin H, Berrios DC, Hastings JJA, Rask J, Mackintosh G, Hoarfrost AL, Chalk S, Kalantari J, Khezeli K, Antonsen EL, Babdor J, Barker R, Baranzini SE, Beheshti A, Delgado-Aparicio GM, Glicksberg BS, Greene CS, Haendel M, Hamid AA, Heller P, Jamieson D, Jarvis KJ, Komarova SV, Komorowski M, Kothiyal P, Mahabal A, Manor U, Mason CE, Matar M, Mias GI, Miller J, Myers JG, Nelson C, Oribello J, Park SM, Parsons-Wingerter P, Prabhu RK, Reynolds RJ, Saravia-Butler A, Saria S, Sawyer A, Singh NK, Snyder M, Soboczenski F, Soman K, Theriot CA, Van Valen D, Venkateswaran K, Warren L, Worthey L, Zitnik M, Costes SV. Biological research and self-driving labs in deep space supported by artificial intelligence. Nat Mach Intell 2023. [DOI: 10.1038/s42256-023-00618-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/28/2023]
|
21
|
Brunson T, Sanati N, Huffman A, Masci AM, Zheng J, Cooke MF, Conley P, He Y, Wu G. VIGET: A web portal for study of vaccine-induced host responses based on Reactome pathways and ImmPort data. Front Immunol 2023; 14:1141030. [PMID: 37180100 PMCID: PMC10172660 DOI: 10.3389/fimmu.2023.1141030] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/09/2023] [Accepted: 03/07/2023] [Indexed: 05/15/2023] Open
Abstract
Host responses to vaccines are complex but important to investigate. To facilitate the study, we have developed a tool called Vaccine Induced Gene Expression Analysis Tool (VIGET), which aims to provide an interactive online tool for users to efficiently and robustly analyze the host immune response gene expression data collected in the ImmPort/GEO databases. VIGET allows users to select vaccines, choose ImmPort studies, set up analysis models by choosing confounding variables and two groups of samples having different vaccination times, and then perform differential expression analysis to select genes for pathway enrichment analysis and functional interaction network construction using Reactome's web services. VIGET provides features for users to compare results from two analyses, facilitating comparative response analysis across different demographic groups. VIGET uses the Vaccine Ontology (VO) to classify various types of vaccines such as live or inactivated flu vaccines, yellow fever vaccines, etc. To showcase the utility of VIGET, we conducted a longitudinal analysis of immune responses to yellow fever vaccines and found an intriguing complex activity response pattern of pathways in the immune system annotated in Reactome, demonstrating that VIGET is a valuable web portal that supports effective vaccine response studies using Reactome pathways and ImmPort data.
Affiliation(s)
- Timothy Brunson
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR, United States
| | - Nasim Sanati
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR, United States
| | - Anthony Huffman
- Department for Computational Medicine and Biology, University of Michigan, Ann Arbor, MI, United States
| | - Anna Maria Masci
- Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC, United States
- Office of Data Science, National Institute of Environmental Health Sciences, Research Triangle Park, NC, United States
| | - Jie Zheng
- Department of Genetics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, United States
| | - Michael F. Cooke
- Department for Computational Medicine and Biology, University of Michigan, Ann Arbor, MI, United States
| | - Patrick Conley
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR, United States
| | - Yongqun He
- Department for Computational Medicine and Biology, University of Michigan, Ann Arbor, MI, United States
- Unit for Laboratory Animal Medicine, University of Michigan Medical School, Ann Arbor, MI, United States
- Department of Microbiology and Immunology, University of Michigan Medical School, Ann Arbor, MI, United States
| | - Guanming Wu
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR, United States
| |
|
22
|
Stefancsik R, Balhoff JP, Balk MA, Ball R, Bello SM, Caron AR, Chessler E, de Souza V, Gehrke S, Haendel M, Harris LW, Harris NL, Ibrahim A, Koehler S, Matentzoglu N, McMurry JA, Mungall CJ, Munoz-Torres MC, Putman T, Robinson P, Smedley D, Sollis E, Thessen AE, Vasilevsky N, Walton DO, Osumi-Sutherland D. The Ontology of Biological Attributes (OBA) - Computational Traits for the Life Sciences. bioRxiv 2023:2023.01.26.525742. [PMID: 36747660 PMCID: PMC9900877 DOI: 10.1101/2023.01.26.525742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/30/2023]
Abstract
Existing phenotype ontologies were originally developed to represent phenotypes that manifest as a character state in relation to a wild-type or other reference. However, these do not include the phenotypic trait or attribute categories required for the annotation of genome-wide association studies (GWAS), Quantitative Trait Loci (QTL) mappings or any population-focused measurable trait data. Moreover, variations in gene expression in response to environmental disturbances even without any genetic alterations can also be associated with particular biological attributes. The integration of trait and biological attribute information with an ever increasing body of chemical, environmental and biological data greatly facilitates computational analyses and it is also highly relevant to biomedical and clinical applications. The Ontology of Biological Attributes (OBA) is a formalised, species-independent collection of interoperable phenotypic trait categories that is intended to fulfil a data integration role. OBA is a standardised representational framework for observable attributes that are characteristics of biological entities, organisms, or parts of organisms. OBA has a modular design which provides several benefits for users and data integrators, including an automated and meaningful classification of trait terms computed on the basis of logical inferences drawn from domain-specific ontologies for cells, anatomical and other relevant entities. The logical axioms in OBA also provide a previously missing bridge that can computationally link Mendelian phenotypes with GWAS and quantitative traits. The term components in OBA provide semantic links and enable knowledge and data integration across specialised research community boundaries, thereby breaking silos.
Affiliation(s)
- Ray Stefancsik
- European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
| | - James P. Balhoff
- Renaissance Computing Institute, University of North Carolina, Chapel Hill, NC 27517, USA
| | - Meghan A. Balk
- National Ecological Observatory Network, Battelle, Boulder, CO 80301, USA
| | - Robyn Ball
- The Jackson Laboratory, Bar Harbor, ME 04609, USA
| | | | - Anita R. Caron
- European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
| | | | - Vinicius de Souza
- European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
| | - Sarah Gehrke
- Anschutz Medical Campus, University of Colorado, Aurora, CO 80045, USA
| | - Melissa Haendel
- Anschutz Medical Campus, University of Colorado, Aurora, CO 80045, USA
| | - Laura W. Harris
- European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
| | - Nomi L. Harris
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
| | - Arwa Ibrahim
- European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
| | | | | | - Julie A. McMurry
- Anschutz Medical Campus, University of Colorado, Aurora, CO 80045, USA
| | - Christopher J. Mungall
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
| | | | - Tim Putman
- Anschutz Medical Campus, University of Colorado, Aurora, CO 80045, USA
| | | | - Damian Smedley
- William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London EC1M 6BQ, UK
| | - Elliot Sollis
- European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
| | - Anne E Thessen
- Anschutz Medical Campus, University of Colorado, Aurora, CO 80045, USA
| | - Nicole Vasilevsky
- Anschutz Medical Campus, University of Colorado, Aurora, CO 80045, USA
| | | | | |
|
23
|
Kirsten T, Meineke FA, Loeffler-Wirth H, Beger C, Uciteli A, Stäubert S, Löbe M, Hänsel R, Rauscher FG, Schuster J, Peschel T, Herre H, Wagner J, Zachariae S, Engel C, Scholz M, Rahm E, Binder H, Loeffler M. The Leipzig Health Atlas-An Open Platform to Present, Archive, and Share Biomedical Data, Analyses, and Models Online. Methods Inf Med 2022; 61:e103-e115. [PMID: 35915977 PMCID: PMC9788914 DOI: 10.1055/a-1914-1985] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/27/2022]
Abstract
BACKGROUND Clinical trials, epidemiological studies, clinical registries, and other prospective research projects, together with patient care services, are the main sources of data in the medical research domain. They often serve as a basis for secondary research in evidence-based medicine and prediction models for disease and its progression. These data are often neither sufficiently described nor accessible. Related models are often not accessible as a functional program tool for interested users from the health care and biomedical domains. OBJECTIVE The interdisciplinary project Leipzig Health Atlas (LHA) was developed to close this gap. LHA is an online platform that serves as a sustainable archive providing medical data, metadata, models, and novel phenotypes from clinical trials, epidemiological studies, and other medical research projects. METHODS Data, models, and phenotypes are described by semantically rich metadata. The platform prefers to share data and models presented in original publications but is also open for nonpublished data. LHA provides and associates unique permanent identifiers for each dataset and model. Hence, the platform can be used to share prepared, quality-assured datasets and models while they are referenced in publications. All managed data, models, and phenotypes in LHA follow the FAIR principles, with public availability or restricted access for specific user groups. RESULTS The LHA platform is in productive mode (https://www.health-atlas.de/). It is already used by a variety of clinical trial and research groups and is also becoming increasingly popular in the biomedical community. LHA is an integral part of the forthcoming initiative building a national research data infrastructure for health in Germany.
Affiliation(s)
- Toralf Kirsten
- Department of Medical Data Science, Leipzig University Medical Center, Leipzig, Germany
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
- Interdisciplinary Centre for Bioinformatics, Leipzig University, Leipzig, Germany
- Address for correspondence: Toralf Kirsten, Department of Medical Data Science, Leipzig University, Härtelstraße 16-18, 04107 Leipzig, Germany
| | - Frank A. Meineke
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
| | - Henry Loeffler-Wirth
- LIFE Research Centre for Civilization Diseases, Leipzig University, Leipzig, Germany
| | - Christoph Beger
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
| | - Alexandr Uciteli
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
| | - Sebastian Stäubert
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
| | - Matthias Löbe
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
- Interdisciplinary Centre for Bioinformatics, Leipzig University, Leipzig, Germany
| | - René Hänsel
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
| | - Franziska G. Rauscher
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
- Interdisciplinary Centre for Bioinformatics, Leipzig University, Leipzig, Germany
| | - Judith Schuster
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
| | - Thomas Peschel
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
| | - Heinrich Herre
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
| | - Jonas Wagner
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
- Interdisciplinary Centre for Bioinformatics, Leipzig University, Leipzig, Germany
| | - Silke Zachariae
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
| | - Christoph Engel
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
- Interdisciplinary Centre for Bioinformatics, Leipzig University, Leipzig, Germany
| | - Markus Scholz
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
| | - Erhard Rahm
- Department of Computer Sciences, Leipzig University, Leipzig, Germany
| | - Hans Binder
- LIFE Research Centre for Civilization Diseases, Leipzig University, Leipzig, Germany
| | - Markus Loeffler
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
- Interdisciplinary Centre for Bioinformatics, Leipzig University, Leipzig, Germany
- LIFE Research Centre for Civilization Diseases, Leipzig University, Leipzig, Germany
| | | |
|
24
|
Yu H, Li L, Huffman A, Beverley J, Hur J, Merrell E, Huang HH, Wang Y, Liu Y, Ong E, Cheng L, Zeng T, Zhang J, Li P, Liu Z, Wang Z, Zhang X, Ye X, Handelman SK, Sexton J, Eaton K, Higgins G, Omenn GS, Athey B, Smith B, Chen L, He Y. A new framework for host-pathogen interaction research. Front Immunol 2022; 13:1066733. [PMID: 36591248 PMCID: PMC9797517 DOI: 10.3389/fimmu.2022.1066733] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/11/2022] [Accepted: 11/14/2022] [Indexed: 12/23/2022] Open
Abstract
COVID-19 often manifests with different outcomes in different patients, highlighting the complexity of the host-pathogen interactions involved in manifestations of the disease at the molecular and cellular levels. In this paper, we propose a set of postulates and a framework for systematically understanding complex molecular host-pathogen interaction networks. Specifically, we first propose four host-pathogen interaction (HPI) postulates as the basis for understanding molecular and cellular host-pathogen interactions and their relations to disease outcomes. These four postulates cover the evolutionary dispositions involved in HPIs, the dynamic nature of HPI outcomes, roles that HPI components may occupy leading to such outcomes, and HPI checkpoints that are critical for specific disease outcomes. Based on these postulates, an HPI Postulate and Ontology (HPIPO) framework is proposed to apply interoperable ontologies to systematically model and represent various granular details and knowledge within the scope of the HPI postulates, in a way that will support AI-ready data standardization, sharing, integration, and analysis. As a demonstration, the HPI postulates and the HPIPO framework were applied to study COVID-19 with the Coronavirus Infectious Disease Ontology (CIDO), leading to a novel approach to rational design of drug/vaccine cocktails aimed at interrupting processes occurring at critical host-coronavirus interaction checkpoints. Furthermore, the host-coronavirus protein-protein interactions (PPIs) relevant to COVID-19 were predicted and evaluated based on prior knowledge of curated PPIs and domain-domain interactions, and how such studies can be further explored with the HPI postulates and the HPIPO framework is discussed.
Affiliation(s)
- Hong Yu
- Department of Respiratory and Critical Care Medicine, Guizhou Provincial People’s Hospital and National Health Commission (NHC) Key Laboratory of Immunological Diseases, People’s Hospital of Guizhou Province, Guiyang, Guizhou, China
- Department of Basic Medicine, Guizhou University Medical College, Guiyang, Guizhou, China
| | - Li Li
- Department of Genetics, Harvard Medical School, Boston, MA, United States
| | - Anthony Huffman
- University of Michigan Medical School, Ann Arbor, MI, United States
| | - John Beverley
- Department of Philosophy, University at Buffalo, Buffalo, NY, United States
- Asymmetric Operations Sector, Johns Hopkins University Applied Physics Laboratory, Laurel, MD, United States
| | - Junguk Hur
- Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, ND, United States
| | - Eric Merrell
- Department of Philosophy, University at Buffalo, Buffalo, NY, United States
| | - Hsin-hui Huang
- University of Michigan Medical School, Ann Arbor, MI, United States
- Department of Biotechnology and Laboratory Science in Medicine, National Yang-Ming University, Taipei, Taiwan
| | - Yang Wang
- Department of Respiratory and Critical Care Medicine, Guizhou Provincial People’s Hospital and National Health Commission (NHC) Key Laboratory of Immunological Diseases, People’s Hospital of Guizhou Province, Guiyang, Guizhou, China
- Department of Basic Medicine, Guizhou University Medical College, Guiyang, Guizhou, China
- University of Michigan Medical School, Ann Arbor, MI, United States
| | - Yingtong Liu
- University of Michigan Medical School, Ann Arbor, MI, United States
| | - Edison Ong
- University of Michigan Medical School, Ann Arbor, MI, United States
| | - Liang Cheng
- Department of Bioinformatics, Harbin Medical University, Harbin, Heilongjiang, China
| | - Tao Zeng
- Key Laboratory of Systems Biology, Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, China
| | - Jingsong Zhang
- Key Laboratory of Systems Biology, Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, China
| | - Pengpai Li
- Center of Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong, China
| | - Zhiping Liu
- Center of Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong, China
| | - Zhigang Wang
- Department of Biomedical Engineering, Institute of Basic Medical Sciences and School of Basic Medicine, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
| | - Xiangyan Zhang
- Department of Respiratory and Critical Care Medicine, Guizhou Provincial People’s Hospital and National Health Commission (NHC) Key Laboratory of Immunological Diseases, People’s Hospital of Guizhou Province, Guiyang, Guizhou, China
- Department of Basic Medicine, Guizhou University Medical College, Guiyang, Guizhou, China
| | - Xianwei Ye
- Department of Respiratory and Critical Care Medicine, Guizhou Provincial People’s Hospital and National Health Commission (NHC) Key Laboratory of Immunological Diseases, People’s Hospital of Guizhou Province, Guiyang, Guizhou, China
- Department of Basic Medicine, Guizhou University Medical College, Guiyang, Guizhou, China
| | | | - Jonathan Sexton
- University of Michigan Medical School, Ann Arbor, MI, United States
| | - Kathryn Eaton
- University of Michigan Medical School, Ann Arbor, MI, United States
| | - Gerry Higgins
- University of Michigan Medical School, Ann Arbor, MI, United States
| | - Gilbert S. Omenn
- University of Michigan Medical School, Ann Arbor, MI, United States
| | - Brian Athey
- University of Michigan Medical School, Ann Arbor, MI, United States
| | - Barry Smith
- Department of Philosophy, University at Buffalo, Buffalo, NY, United States
| | - Luonan Chen
- Key Laboratory of Systems Biology, Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, China
| | - Yongqun He
- University of Michigan Medical School, Ann Arbor, MI, United States
| |
|
25
|
Hoyt CT, Balk M, Callahan TJ, Domingo-Fernández D, Haendel MA, Hegde HB, Himmelstein DS, Karis K, Kunze J, Lubiana T, Matentzoglu N, McMurry J, Moxon S, Mungall CJ, Rutz A, Unni DR, Willighagen E, Winston D, Gyori BM. Unifying the identification of biomedical entities with the Bioregistry. Sci Data 2022; 9:714. [PMID: 36402838 PMCID: PMC9675740 DOI: 10.1038/s41597-022-01807-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2022] [Accepted: 10/26/2022] [Indexed: 11/21/2022] Open
Abstract
The standardized identification of biomedical entities is a cornerstone of interoperability, reuse, and data integration in the life sciences. Several registries have been developed to catalog resources maintaining identifiers for biomedical entities such as small molecules, proteins, cell lines, and clinical trials. However, existing registries have struggled to provide sufficient coverage and metadata standards that meet the evolving needs of modern life sciences researchers. Here, we introduce the Bioregistry, an integrative, open, community-driven metaregistry that synthesizes and substantially expands upon 23 existing registries. The Bioregistry addresses the need for a sustainable registry by leveraging public infrastructure and automation, and employing a progressive governance model centered around open code and open data to foster community contribution. The Bioregistry can be used to support the standardized annotation of data, models, ontologies, and scientific literature, thereby promoting their interoperability and reuse. The Bioregistry can be accessed through https://bioregistry.io and its source code and data are available under the MIT and CC0 Licenses at https://github.com/biopragmatics/bioregistry .
Affiliation(s)
| | | | | | - Daniel Domingo-Fernández
- Department of Bioinformatics, Fraunhofer SCAI, Sankt Augustin, Germany
- Enveda Biosciences, Boulder, USA
| | | | | | | | - Klas Karis
- Laboratory of Systems Pharmacology, Harvard Medical School, Boston, USA
| | - John Kunze
- California Digital Library, University of California, Berkeley, USA
| | - Tiago Lubiana
- School of Pharmaceutical Sciences, University of São Paulo, São Paulo, Brazil
| | | | - Julie McMurry
- University of Colorado Anschutz Medical Campus, Aurora, USA
| | - Sierra Moxon
- Lawrence Berkeley National Laboratory, Berkeley, USA
| | | | - Adriano Rutz
- School of Pharmaceutical Sciences, University of Geneva, Geneva, Switzerland
- Institute of Pharmaceutical Sciences of Western Switzerland, University of Geneva, Geneva, Switzerland
| | - Deepak R Unni
- Lawrence Berkeley National Laboratory, Berkeley, USA
- European Molecular Biology Laboratory, Heidelberg, Germany
| | - Egon Willighagen
- Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, Netherlands
| | | | - Benjamin M Gyori
- Laboratory of Systems Pharmacology, Harvard Medical School, Boston, USA
| |
|
26
|
Zi W, Yang Q, Su J, He Y, Xie J. OAE-based data mining and modeling analysis of adverse events associated with three licensed HPV vaccines. Heliyon 2022; 8:e11515. [DOI: 10.1016/j.heliyon.2022.e11515] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2022] [Revised: 06/11/2022] [Accepted: 11/03/2022] [Indexed: 11/13/2022] Open
|
27
|
Zhang M, Zong W, Zou D, Wang G, Zhao W, Yang F, Wu S, Zhang X, Guo X, Ma Y, Xiong Z, Zhang Z, Bao Y, Li R. MethBank 4.0: an updated database of DNA methylation across a variety of species. Nucleic Acids Res 2022; 51:D208-D216. [PMID: 36318250 PMCID: PMC9825483 DOI: 10.1093/nar/gkac969] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/15/2022] [Revised: 10/05/2022] [Accepted: 10/13/2022] [Indexed: 11/05/2022] Open
Abstract
DNA methylation, as the most intensively studied epigenetic mark, regulates gene expression in numerous biological processes including development, aging, and disease. With the rapid accumulation of whole-genome bisulfite sequencing data, integrating, archiving, analyzing, and visualizing those data becomes critical. Since its first publication in 2015, MethBank has been continuously updated to include more DNA methylomes across more diverse species. Here, we present MethBank 4.0 (https://ngdc.cncb.ac.cn/methbank/), which reports an increase of 309% in data volume, with 1449 single-base resolution methylomes of 23 species, covering 236 tissues/cell lines and 15 biological contexts. Value-added information, such as more rigorous quality evaluation, more standardized metadata, and comprehensive downstream annotations have been integrated in the new version. Moreover, expert-curated knowledge modules of featured differentially methylated genes associated with biological contexts and methylation analysis tools have been incorporated as new components of MethBank. In addition, MethBank 4.0 is equipped with a series of new web interfaces to browse, search, and visualize DNA methylation profiles and related information. With all these improvements, we believe the updated MethBank 4.0 will serve as a fundamental resource to provide a wide range of data services for the global research community.
Affiliation(s)
| | | | | | | | - Wei Zhao
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Fei Yang
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
| | - Song Wu
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Xinran Zhang
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Xutong Guo
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Yingke Ma
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
| | - Zhuang Xiong
- National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Zhang Zhang
- Correspondence may also be addressed to Zhang Zhang. Tel: +86 10 84097261;
| | - Yiming Bao
- Correspondence may also be addressed to Yiming Bao. Tel: +86 10 84097858;
| | - Rujiao Li
- To whom correspondence should be addressed. Tel: +86 10 84097638;
| |
|
28
|
Gupta S, Sharma N, Naorem LD, Jain S, Raghava GP. Collection, compilation and analysis of bacterial vaccines. Comput Biol Med 2022; 149:106030. [DOI: 10.1016/j.compbiomed.2022.106030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2022] [Revised: 08/16/2022] [Accepted: 08/20/2022] [Indexed: 11/03/2022]
|
29
|
The Representation of Causality and Causation with Ontologies: A Systematic Literature Review. Online J Public Health Inform 2022; 14:e4. [PMID: 36120162 PMCID: PMC9473331 DOI: 10.5210/ojphi.v14i1.12577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022] Open
Abstract
Objective To explore how disease-related causality is formally represented in current ontologies and to identify their potential limitations. Methods We conducted a systematic literature search on eight databases (PubMed, Institute of Electrical and Electronics Engineers (IEEE Xplore), Association for Computing Machinery (ACM), Scopus, Web of Science, Ontobee, OBO Foundry, and BioPortal). We included studies published between January 1, 1970, and December 9, 2020, that formally represent the notions of causality and causation in the medical domain using ontology as a representational tool. Further inclusion criteria were publication in English and in peer-reviewed journals or conference proceedings. Two authors (SS, RM) independently assessed study quality and performed content analysis using a modified validated extraction grid with pre-established categorization. Results The search strategy led to a total of 8,501 potentially relevant papers, of which 50 met the inclusion criteria. Only 14 out of 50 (28%) specified the nature of causation, and only 7 (14%) included clear and non-circular natural language definitions. Although several theories of causality were mentioned, none of the articles offers a widely accepted conceptualization of how causation and causality can be formally represented. Conclusion No current ontology captures the wealth of available concepts of causality. This provides an opportunity for the development of a formal ontology of causation/causality.
|
30
|
Bernabé-Díaz JA, Franco M, Vivo JM, Quesada-Martínez M, Fernández-Breis JT. An automated process for supporting decisions in clustering-based data analysis. Computer Methods and Programs in Biomedicine 2022; 219:106765. [PMID: 35367914 DOI: 10.1016/j.cmpb.2022.106765] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2021] [Revised: 03/14/2022] [Accepted: 03/18/2022] [Indexed: 06/14/2023]
Abstract
BACKGROUND AND OBJECTIVE Metrics are commonly used by biomedical researchers and practitioners to measure and evaluate properties of individuals, instruments, models, methods, or datasets. Because there is no standardized validation procedure for a metric, it is assumed that if a metric is appropriate for analyzing a dataset in a certain domain, then it will be appropriate for other datasets in the same domain. However, such generalizability cannot be taken for granted, since the behavior of a metric can vary in different scenarios. The objective of this paper is to study that behavior, which would allow the reliability of a metric to be assessed before any conclusion is drawn about biomedical datasets. METHODS We present a method to support the evaluation of the behavior of quantitative metrics on datasets. Our approach assesses a metric by using clustering-based data analysis and by enhancing the decision-making process for the optimal classification. The method applies two important criteria of unsupervised classification validation, calculated on the clusterings generated by the metric: the stability and the goodness of the clusters. Our evaluomeR tool makes the method available to biomedical researchers. RESULTS The analytical power of our method is shown by applying it to analyze (1) the behavior of the impact factor metric for a series of journal categories; (2) which structural metrics provide a better partitioning of the content of a repository of biomedical ontologies; and (3) the sources of heterogeneity in effect size metrics of biomedical primary studies. CONCLUSIONS The use of statistical properties such as stability and goodness of classifications allows for a useful analysis of the behavior of quantitative metrics, which can be used to support decisions about which metrics to apply to a certain dataset.
Affiliation(s)
| | - Manuel Franco
- Dept. Statistics and Operations Research, University of Murcia, IMIB-Arrixaca, Spain
| | - Juana-María Vivo
- Dept. Statistics and Operations Research, University of Murcia, IMIB-Arrixaca, Spain
|
31
|
Umberfield EE, Stansbury C, Ford K, Jiang Y, Kardia SLR, Thomer AK, Harris MR. Evaluating and Extending the Informed Consent Ontology for Representing Permissions from the Clinical Domain. Applied Ontology 2022; 17:321-336. [PMID: 36312514 PMCID: PMC9616177 DOI: 10.3233/ao-210260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
The purpose of this study was to evaluate, revise, and extend the Informed Consent Ontology (ICO) for expressing clinical permissions, including reuse of residual clinical biospecimens and health data. This study followed a formative evaluation design and used a bottom-up modeling approach. Data were collected from the literature on US federal regulations and a study of clinical consent forms. Eleven federal regulations and fifteen permission-sentences from clinical consent forms were iteratively modeled to identify entities and their relationships, followed by community reflection and negotiation based on a series of predetermined evaluation questions. ICO included fifty-two classes and twelve object properties necessary when modeling, demonstrating the appropriateness of extending ICO for the clinical domain. Twenty-six additional classes were imported into ICO from other ontologies, and twelve new classes were recommended for development. This work addresses a critical gap in formally representing clinical permissions, including reuse of residual clinical biospecimens and health data. It makes missing content available to the OBO Foundry, enabling use alongside other widely adopted biomedical ontologies. ICO serves as a machine-interpretable and interoperable tool for responsible reuse of residual clinical biospecimens and health data at scale.
Affiliation(s)
- Elizabeth E. Umberfield
- Indiana University Richard M Fairbanks School of Public Health, Health Policy & Management; Indianapolis, IN, USA
- Regenstrief Institute Inc, Center for Biomedical Informatics, Indianapolis, IN, USA
| | - Cooper Stansbury
- University of Michigan Medical School, Computational Medicine and Bioinformatics; Ann Arbor, MI, USA
- University of Michigan, Institute for Computational Discovery & Engineering; Ann Arbor, MI, USA
| | | | - Yun Jiang
- University of Michigan School of Nursing, Systems, Populations and Leadership; Ann Arbor, MI, USA
| | - Sharon L. R. Kardia
- University of Michigan School of Public Health, Epidemiology; Ann Arbor, MI, USA
| | - Andrea K. Thomer
- University of Michigan School of Information, Ann Arbor, MI, USA
| | - Marcelline R. Harris
- University of Michigan School of Nursing, Systems, Populations and Leadership; Ann Arbor, MI, USA
| |
|
32
|
He Y. Development and Applications of Interoperable Biomedical Ontologies for Integrative Data and Knowledge Representation and Multiscale Modeling in Systems Medicine. Methods in Molecular Biology (Clifton, N.J.) 2022; 2486:233-244. [PMID: 35437726 DOI: 10.1007/978-1-0716-2265-0_12] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 10/18/2022]
Abstract
The data FAIR Guiding Principles state that all data should be Findable, Accessible, Interoperable, and Reusable. Ontology is critical to data integration, sharing, and analysis. Given thousands of ontologies have been developed in the era of artificial intelligence, it is critical to have interoperable ontologies to support standardized data and knowledge presentation and reasoning. For interoperable ontology development, the eXtensible ontology development (XOD) strategy offers four principles including ontology term reuse, semantic alignment, ontology design pattern usage, and community extensibility. Many software programs are available to help implement these principles. As a demonstration, the XOD strategy is applied to developing the interoperable Coronavirus Infectious Disease Ontology (CIDO). Various applications of interoperable ontologies, such as COVID-19 and kidney precision medicine research, are also introduced in this chapter.
Affiliation(s)
- Yongqun He
- Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, Center of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, USA.
| |
|
33
|
Wang Y, Zhang F, Byrd JB, Yu H, Ye X, He Y. Differential COVID-19 Symptoms Given Pandemic Locations, Time, and Comorbidities During the Early Pandemic. Front Med (Lausanne) 2022; 9:770031. [PMID: 35155491 PMCID: PMC8831795 DOI: 10.3389/fmed.2022.770031] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 09/03/2021] [Accepted: 01/04/2022] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND The COVID-19 pandemic is a disaster for public health worldwide. A better perspective on COVID-19's features early in its course, prior to the development of vaccines and widespread variants, may prove useful in understanding future pandemics. Ontology provides a standardized integrative method for knowledge modeling and computer-assisted reasoning. In this study, we systematically extracted and analyzed clinical phenotypes and comorbidities in COVID-19 patients from different countries and regions during the early pandemic using an ontology-based bioinformatics approach, with the aim of identifying new insights and hidden patterns of COVID-19 symptoms. RESULTS A total of 48 research articles reporting analysis of first-hand clinical data from over 40,000 COVID-19 patients were surveyed. The patients studied therein were diagnosed with COVID-19 before May 2020. A total of 18 commonly occurring phenotypes in these COVID-19 patients were first identified and then classified into different hierarchical groups based on the Human Phenotype Ontology (HPO). This meta-analytic approach revealed that fever, cough, and the loss of smell and taste were ranked as the most commonly occurring phenotypes in China, the US, and Italy, respectively. We also found that patients from Europe and the US appeared to have a more frequent occurrence of many nervous and abdominal symptom phenotypes (e.g., loss of smell, loss of taste, and diarrhea) than patients from China during the early pandemic. A total of 22 comorbidities, such as diabetes and kidney failure, were found to commonly exist in COVID-19 patients and to correlate positively with the severity of the disease. The knowledge learned from the study was further modeled and represented in the Coronavirus Infectious Disease Ontology (CIDO), supporting semantic queries and analysis. Furthermore, also considering the symptoms caused by new viral variants at the later stages, a spiral model hypothesis was proposed to address the changes of specific symptoms during different stages of the pandemic. CONCLUSIONS Differential patterns of symptoms in COVID-19 patients were found given different locations, time, and comorbidity types during the early pandemic. Ontology-based informatics provides a unique approach to systematically model, represent, and analyze COVID-19 symptoms, comorbidities, and the factors that influence disease outcomes.
Affiliation(s)
- Yang Wang
- Guizhou University School of Medicine, Guiyang, China
- NHC Key Laboratory of Immunological Diseases, Department of Pulmonary and Critical Care Medicine, Guizhou Provincial People's Hospital and People's Hospital of Guizhou University, Guiyang, China
| | - Fengwei Zhang
- Guizhou University School of Medicine, Guiyang, China
| | - J. Brian Byrd
- Division of Cardiovascular Medicine, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, MI, United States
| | - Hong Yu
- Guizhou University School of Medicine, Guiyang, China
- NHC Key Laboratory of Immunological Diseases, Department of Pulmonary and Critical Care Medicine, Guizhou Provincial People's Hospital and People's Hospital of Guizhou University, Guiyang, China
| | - Xianwei Ye
- Guizhou University School of Medicine, Guiyang, China
- NHC Key Laboratory of Immunological Diseases, Department of Pulmonary and Critical Care Medicine, Guizhou Provincial People's Hospital and People's Hospital of Guizhou University, Guiyang, China
| | - Yongqun He
- Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, Center for Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, United States
| |
|
34
|
Liu M, Liu J, Liu G, Wang H, Wang X, Deng Z, He Y, Ou HY. ICEO, a biological ontology for representing and analyzing bacterial integrative and conjugative elements. Sci Data 2022; 9:11. [PMID: 35058462 PMCID: PMC8776819 DOI: 10.1038/s41597-021-01112-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/14/2021] [Accepted: 12/13/2021] [Indexed: 01/18/2023] Open
Abstract
Bacterial integrative and conjugative elements (ICEs) are highly modular mobile genetic elements critical to the horizontal transfer of antibiotic resistance and virulence factor genes. To better understand and analyze the ongoing increase of ICEs, we developed an Integrative and Conjugative Element Ontology (ICEO) to represent the gene components, functional modules, and other information of experimentally verified ICEs. ICEO is aligned with the upper-level Basic Formal Ontology and reuses existing reliable ontologies. ICEO currently contains 31,081 terms, including 26,814 classes from 14 ontologies and 4,128 ICEO-specific classes, representing the information of 271 known experimentally verified ICEs from 235 bacterial strains and 311 predicted ICEs of 272 completely sequenced Klebsiella pneumoniae strains. Three ICEO use cases were illustrated to investigate complex joins of ICEs and the antibiotic resistance or virulence factor genes they harbor by using SPARQL or DL queries. ICEO has been approved as an Open Biomedical Ontology library ontology. It may facilitate systematic ICE knowledge representation, integration, and computer-assisted queries.
Affiliation(s)
- Meng Liu
- State Key Laboratory of Microbial Metabolism, Joint International Laboratory on Metabolic & Developmental Sciences, School of Life Sciences & Biotechnology, Shanghai Jiao Tong University, Shanghai, 200030, China
| | - Jialin Liu
- Department of Critical Care Medicine, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
| | - Guitian Liu
- State Key Laboratory of Microbial Metabolism, Joint International Laboratory on Metabolic & Developmental Sciences, School of Life Sciences & Biotechnology, Shanghai Jiao Tong University, Shanghai, 200030, China
| | - Hui Wang
- State Key Laboratory of Pathogens and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, 100071, China
| | - Xiaoli Wang
- Department of Critical Care Medicine, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
| | - Zixin Deng
- State Key Laboratory of Microbial Metabolism, Joint International Laboratory on Metabolic & Developmental Sciences, School of Life Sciences & Biotechnology, Shanghai Jiao Tong University, Shanghai, 200030, China
| | - Yongqun He
- University of Michigan Medical School, Ann Arbor, MI, 48109, USA.
| | - Hong-Yu Ou
- State Key Laboratory of Microbial Metabolism, Joint International Laboratory on Metabolic & Developmental Sciences, School of Life Sciences & Biotechnology, Shanghai Jiao Tong University, Shanghai, 200030, China.
| |
|
35
|
Hollas MAR, Robey M, Fellers R, LeDuc R, Thomas P, Kelleher N. The Human Proteoform Atlas: a FAIR community resource for experimentally derived proteoforms. Nucleic Acids Res 2022; 50:D526-D533. [PMID: 34986596 PMCID: PMC8728143 DOI: 10.1093/nar/gkab1086] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 08/15/2021] [Revised: 10/06/2021] [Accepted: 11/14/2021] [Indexed: 01/01/2023] Open
Abstract
The Human Proteoform Atlas (HPfA) is a web-based repository of experimentally verified human proteoforms, online at http://human-proteoform-atlas.org, and is a direct descendant of the Consortium of Top-Down Proteomics' (CTDP) Proteoform Atlas. Proteoforms are the specific forms of protein molecules expressed by our cells and include the unique combination of post-translational modifications (PTMs), alternative splicing and other sources of variation deriving from a specific gene. The HPfA uses a FAIR system to assign persistent identifiers to proteoforms, which allows for redundancy calling and tracking from prior and future studies in the growing community of proteoform biology and measurement. The HPfA is organized around open ontologies and enables flexible classification of proteoforms. To achieve this, a public registry of experimentally verified proteoforms was also created. Submission of new proteoforms can be processed through email via nrtdphelp@northwestern.edu, and future iterations of these proteoform atlases will help to organize and assign function to proteoforms, their PTMs and their complexes in the years ahead.
Affiliation(s)
- Michael A R Hollas
- Departments of Molecular Biosciences, Chemistry, and the Chemistry of Life Processes Institute, Northwestern University, Evanston, IL 60208, USA
| | - Matthew T Robey
- Departments of Molecular Biosciences, Chemistry, and the Chemistry of Life Processes Institute, Northwestern University, Evanston, IL 60208, USA
| | - Ryan T Fellers
- Departments of Molecular Biosciences, Chemistry, and the Chemistry of Life Processes Institute, Northwestern University, Evanston, IL 60208, USA
| | - Richard D LeDuc
- Departments of Molecular Biosciences, Chemistry, and the Chemistry of Life Processes Institute, Northwestern University, Evanston, IL 60208, USA
| | - Paul M Thomas
- Departments of Molecular Biosciences, Chemistry, and the Chemistry of Life Processes Institute, Northwestern University, Evanston, IL 60208, USA
| | - Neil L Kelleher
- Departments of Molecular Biosciences, Chemistry, and the Chemistry of Life Processes Institute, Northwestern University, Evanston, IL 60208, USA
| |
|
36
|
Juanes Cortés B, Vera-Ramos JA, Lovering RC, Gaudet P, Laegreid A, Logie C, Schulz S, Roldán-García MDM, Kuiper M, Fernández-Breis JT. Formalization of gene regulation knowledge using ontologies and gene ontology causal activity models. Biochimica et Biophysica Acta-Gene Regulatory Mechanisms 2021; 1864:194766. [PMID: 34710644 DOI: 10.1016/j.bbagrm.2021.194766] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/08/2021] [Revised: 09/13/2021] [Accepted: 10/11/2021] [Indexed: 02/02/2023]
Abstract
Computational research on gene regulation requires handling and integrating large amounts of heterogeneous data. The Gene Ontology has demonstrated that ontologies play a fundamental role in biological data interoperability and integration. Ontologies help to express data and knowledge in a machine-processable way, which enables complex querying and advanced exploitation of distributed data. Improving data interoperability in gene regulation is a major objective of the GREEKC Consortium, which aims to develop a standardized gene regulation knowledge commons. GREEKC proposes the use of ontologies and semantic tools for developing interoperable gene regulation knowledge models, which should support data annotation. In this work, we study how such knowledge models can be generated from cartoons of gene regulation scenarios. The proposed method consists of generating natural language descriptions of the cartoons; extracting the entities from the texts; finding those entities in existing ontologies to reuse as much content as possible, especially from well-known and maintained ontologies such as the Gene Ontology, the Sequence Ontology, the Relations Ontology and ChEBI; and implementing the knowledge models. The models have been implemented using Protégé, a general ontology editor, and Noctua, the tool developed by the Gene Ontology Consortium for the development of causal activity models to capture more comprehensive annotations of genes and link their activities in a causal framework for Gene Ontology Annotations. We applied the method to two gene regulation scenarios and illustrate how to apply the generated models to support the annotation of data from research articles.
Affiliation(s)
- Belén Juanes Cortés
- Departamento de Informatica y Sistemas, University of Murcia, CEIR Campus Mare Nostrum, IMIB-Arrixaca, Campus de Espinardo, 30100 Murcia, Spain.
| | - José Antonio Vera-Ramos
- Institute of Medical Informatics, Statistics and Documentation, Medical University of Graz, Auenbruggerpl. 2, Graz, Austria.
| | - Ruth C Lovering
- Institute of Cardiovascular Science, Faculty of Pop Health Sciences, University College London, Rayne Building, 5 University Street, London WC1E 6JF, United Kingdom.
| | - Pascale Gaudet
- Swiss Institute of Bioinformatics, 1, rue Michel Servet, 1211 Geneva 4, Switzerland.
| | - Astrid Laegreid
- Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology, Gastrosenteret, 431.03.046, Øya, Prinsesse Kristinas gate 1, Trondheim, Norway.
| | - Colin Logie
- Faculty of Science, Radboud Institute for Molecular Life Sciences, Geert Grooteplein Zuid 28, 6525, GA, Nijmegen, the Netherlands.
| | - Stefan Schulz
- Institute of Medical Informatics, Statistics and Documentation, Medical University of Graz, Auenbruggerpl. 2, Graz, Austria.
| | - María Del Mar Roldán-García
- Departamento de Lenguajes y Ciencias de la Computación, University of Málaga, Bulevard Louis Pasteur 35, 29071 Málaga, Spain; ITIS Software, University of Málaga, Calle Arquitecto Francisco Peñalosa s/n, 29071 Málaga, Spain; Biomedical Research Institute of Málaga (IBIMA), University of Málaga, Calle Doctor Miguel Díaz Recio, 28, 29010 Málaga, Spain.
| | - Martin Kuiper
- Department of Biology, Norwegian University of Science and Technology, Realfagbygget, Høgskoleringen 5, 7034 Trondheim, Norway.
| | - Jesualdo Tomás Fernández-Breis
- Departamento de Informatica y Sistemas, University of Murcia, CEIR Campus Mare Nostrum, IMIB-Arrixaca, Campus de Espinardo, 30100 Murcia, Spain.
| |
|
37
|
Wan L, Song J, He V, Roman J, Whah G, Peng S, Zhang L, He Y. Development of the International Classification of Diseases Ontology (ICDO) and its application for COVID-19 diagnostic data analysis. BMC Bioinformatics 2021; 22:508. [PMID: 34663204 PMCID: PMC8522253 DOI: 10.1186/s12859-021-04402-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/09/2021] [Accepted: 09/24/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The 10th and 9th revisions of the International Statistical Classification of Diseases and Related Health Problems (ICD10 and ICD9) have been adopted worldwide as a well-recognized norm to share codes for diseases, signs and symptoms, abnormal findings, etc. The international Consortium for Clinical Characterization of COVID-19 by EHR (4CE) website stores COVID-19 diagnosis data using ICD10 and ICD9 codes. However, the ICD systems are difficult to decode due to their many shortcomings, which can be addressed using ontology. METHODS An ICD ontology (ICDO) was developed to logically and scientifically represent ICD terms and the relations among them. ICDO is also aligned with the Basic Formal Ontology (BFO) and reuses terms from existing ontologies. As a use case, the ICD10 and ICD9 diagnosis data from the 4CE website were extracted, mapped to ICDO, and analyzed using ICDO. RESULTS We have developed the ICDO to ontologize the ICD terms and relations. Unlike in existing disease ontologies, all ICD diseases in ICDO are defined as disease processes to describe their occurrence with other properties. The ICDO decomposes each disease term into different components, including anatomic entities, process profiles, etiological causes, output phenotype, etc. Over 900 ICD terms have been represented in ICDO. Many ICDO terms are presented in both English and Chinese. The ICD10/ICD9-based diagnosis data of over 27,000 COVID-19 patients from 5 countries were extracted from the 4CE. A total of 917 COVID-19-related disease codes, each of which was associated with 1 or more cases in the 4CE dataset, were mapped to ICDO and further analyzed using the ICDO logical annotations. Our study showed that COVID-19 targeted multiple systems and organs such as the lung, heart, and kidney. Different acute and chronic kidney phenotypes were identified. Some kidney diseases appeared to result from other diseases, such as diabetes. Some of the findings could only be easily found using ICDO instead of ICD9/10. CONCLUSIONS ICDO was developed to ontologize ICD10/ICD9 codes and applied to study COVID-19 patient diagnosis data. Our findings showed that ICDO provides a semantic platform for more accurate detection of disease profiles.
Affiliation(s)
- Ling Wan
- University of Michigan Medical School, Ann Arbor, MI 48109 USA
- OntoWise, Nanjing, Jiangsu China
| | - Justin Song
- Cranbrook Kingswood Upper School, Bloomfield Hills, MI 48304 USA
| | | | - Jennifer Roman
- College of Literacy, Science, and Arts, University of Michigan, Ann Arbor, MI 48109 USA
| | - Grace Whah
- College of Engineering, University of Michigan, Ann Arbor, MI 48109 USA
| | - Suyuan Peng
- School of Public Health, Peking University, Beijing, China
- National Institute of Health Data Science, Peking University, Beijing, China
| | - Luxia Zhang
- National Institute of Health Data Science, Peking University, Beijing, China
- Advanced Institute of Information Technology, Peking University, Hangzhou, China
- Renal Division, Department of Medicine, Peking University First Hospital, Peking University Institute of Nephrology, Beijing, China
| | - Yongqun He
- University of Michigan Medical School, Ann Arbor, MI 48109 USA
| |
|
38
|
Zhu Y, Liu L, Gao B, Liu J, Qiao X, Lian C, He Y. TCDO: A Community-Based Ontology for Integrative Representation and Analysis of Traditional Chinese Drugs and Their Properties. Evidence-Based Complementary and Alternative Medicine: ECAM 2021; 2021:6637810. [PMID: 34603473 PMCID: PMC8483929 DOI: 10.1155/2021/6637810] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/11/2020] [Revised: 08/04/2021] [Accepted: 08/31/2021] [Indexed: 11/17/2022]
Abstract
Traditional Chinese drugs (TCDs) have been widely used in clinical practice in China and many other regions for thousands of years. The bioactive ingredients and mechanisms of action of TCDs are now being identified. However, the lack of standardized terminologies or ontologies for the description of TCDs has hindered the interoperability and deep analysis of TCD knowledge and data. By aligning with the Basic Formal Ontology (BFO), an ISO-approved top-level ontology, we constructed a community-driven TCD ontology (TCDO) with the aim of supporting standardized TCD representation and integrated analysis. TCDO provides logical and textual definitions of TCDs, TCD categories, and the properties of TCDs (i.e., nature, flavor, toxicity, and channel tropism). More than 400 popular TCD decoction pieces (TCD-DPs) and Chinese medicinal materials (CMMs) are systematically represented. The logical TCD representation in TCDO supports computer-assisted reasoning and queries using tools such as Description Logic (DL) and SPARQL queries. Our statistical analysis of the knowledge represented in TCDO revealed scientific insights about TCDs. A total of 36 TCDs with medium or high toxicity are most densely distributed, primarily in the Aconitum genus, Lamiids clade, and Fabids clade. TCD toxicity is mostly associated with the hot nature and pungent or bitter flavors and has liver, kidney, and spleen channel tropism. Three pairs of TCD flavor-nature associations (i.e., bitter-cold, pungent-warm, and sweet-neutral) were identified. The significance of these findings is discussed. TCDO has also been used to support the development of a web-based traditional Chinese medicine semantic annotation system that provides comprehensive annotation for individual TCDs. As a novel formal TCD ontology, TCDO lays a strong foundation for more advanced TCD studies in the future.
Affiliation(s)
- Yan Zhu
- Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China
| | - Lihong Liu
- Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China
| | - Bo Gao
- Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China
| | - Jing Liu
- Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China
| | - Xingchao Qiao
- Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China
| | - Chaojie Lian
- National Institutes for Food and Drug Control, Beijing 102627, China
| | - Yongqun He
- University of Michigan Medical School, Ann Arbor, MI 48109, USA
| |
Collapse
|
39
|
Huffman A, Masci AM, Zheng J, Sanati N, Brunson T, Wu G, He Y. CIDO ontology updates and secondary analysis of host responses to COVID-19 infection based on ImmPort reports and literature. J Biomed Semantics 2021; 12:18. [PMID: 34454610 PMCID: PMC8400831 DOI: 10.1186/s13326-021-00250-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 05/05/2021] [Accepted: 08/05/2021] [Indexed: 12/29/2022] Open
Abstract
BACKGROUND With COVID-19 still in its pandemic stage, extensive research has generated increasing amounts of data and knowledge. As many studies are published within a short span of time, we often lose an integrative and comprehensive picture of host-coronavirus interaction (HCI) mechanisms. As of early April 2021, the ImmPort database has stored 7 studies (with 6 having details) that cover topics including molecular immune signatures, epitopes, and sex differences in terms of mortality in COVID-19 patients. The Coronavirus Infectious Disease Ontology (CIDO) represents basic HCI information. We hypothesize that the CIDO can be used as the platform to represent newly recorded information from ImmPort, leading to the reinforcement of CIDO. METHODS The CIDO was used as the semantic platform for logically modeling and representing newly identified knowledge reported in the 6 ImmPort studies. A recursive eXtensible Ontology Development (XOD) strategy was established to support the CIDO representation and enhancement. Secondary data analysis was also performed to analyze different aspects of the HCI from these ImmPort studies and other related literature reports. RESULTS The topics covered by the 6 ImmPort papers were identified to overlap with existing CIDO representation. SARS-CoV-2 viral S protein related HCI knowledge was emphasized for CIDO modeling, including its binding with ACE2, mutations causing different variants, and epitope homology by comparison with other coronavirus S proteins. Different types of cytokine signatures were also identified and added to CIDO. Our secondary analysis of two cohort COVID-19 studies with cytokine panel detection found that a total of 11 cytokines were up-regulated in female patients after infection and 8 cytokines in male patients. These sex-specific gene responses were newly modeled and represented in CIDO. A new DL query was generated to demonstrate the benefits of such integrative ontology representation. Furthermore, the IL-10 signaling pathway was found to be statistically significant for both male patients and female patients. CONCLUSION Using the recursive XOD strategy, six new ImmPort COVID-19 studies were systematically reviewed, and the results were modeled and represented in CIDO, leading to the enhancement of CIDO. The enhanced ontology and further secondary analysis supported a more comprehensive understanding of the molecular mechanism of host responses to COVID-19 infection.
Affiliation(s)
- Anthony Huffman
- Department of Computational Medicine and Biology, University of Michigan, Ann Arbor, MI 48109 USA
| | - Anna Maria Masci
- Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC 27710 USA
- Office of Data Science, National Institute of Environmental Health Sciences, 530 Davis Drive, Research Triangle Park, NC 27560 USA
| | - Jie Zheng
- Department of Genetics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104 USA
| | - Nasim Sanati
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR 97239 USA
| | - Timothy Brunson
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR 97239 USA
| | - Guanming Wu
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR 97239 USA
| | - Yongqun He
- Department of Computational Medicine and Biology, University of Michigan, Ann Arbor, MI 48109 USA
- Unit for Laboratory Animal Medicine, University of Michigan Medical School, Ann Arbor, MI 48109 USA
- Department of Microbiology and Immunology, University of Michigan Medical School, Ann Arbor, MI 48109 USA
- Center for Computational Medicine and Biology, University of Michigan, Ann Arbor, MI 48109 USA
|
40
|
Zhang L, Shi J, Ouyang J, Zhang R, Tao Y, Yuan D, Lv C, Wang R, Ning B, Roberts R, Tong W, Liu Z, Shi T. X-CNV: genome-wide prediction of the pathogenicity of copy number variations. Genome Med 2021; 13:132. [PMID: 34407882 PMCID: PMC8375180 DOI: 10.1186/s13073-021-00945-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 01/18/2021] [Accepted: 07/30/2021] [Indexed: 01/04/2023] Open
Abstract
Background Gene copy number variations (CNVs) contribute to genetic diversity and disease prevalence across populations. Substantial efforts have been made to decipher the relationship between CNVs and pathogenesis but with limited success. Results We have developed a novel computational framework, X-CNV (www.unimd.org/XCNV), to predict the pathogenicity of CNVs by integrating more than 30 informative features such as allele frequency (AF), CNV length, CNV type, and some deleterious scores. Notably, over 14 million CNVs across various ethnic groups, covering nearly 93% of the human genome, were unified to calculate the AF. X-CNV, which yielded area under the curve (AUC) values of 0.96 and 0.94 in training and validation sets, was demonstrated to outperform other available tools in terms of CNV pathogenicity prediction. A meta-voting prediction (MVP) score, based on the probabilistic value generated from the XGBoost algorithm, was developed to quantitatively measure the pathogenic effect. The proposed MVP score demonstrated a high discriminative power in determining pathogenic CNVs for inherited traits/diseases in different ethnic groups. Conclusions The ability of the X-CNV framework to quantitatively prioritize functional, deleterious, and disease-causing CNVs on a genome-wide basis outperformed current CNV-annotation tools and will have broad utility in population genetics, disease-association studies, and diagnostic screening. Supplementary Information The online version contains supplementary material available at 10.1186/s13073-021-00945-4.
Affiliation(s)
- Li Zhang
- Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, 200241, China; School of Statistics, Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, East China Normal University, Shanghai, 200062, China
| | - Jingru Shi
- Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, 200241, China
| | - Jian Ouyang
- Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, 200241, China
| | - Riquan Zhang
- School of Statistics, Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, East China Normal University, Shanghai, 200062, China
| | - Yiran Tao
- Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, 200241, China
| | - Dongsheng Yuan
- Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, 200241, China
| | - Chengkai Lv
- Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, 200241, China
| | - Ruiyuan Wang
- Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, 200241, China
| | - Baitang Ning
- National Center for Toxicological Research, Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Ruth Roberts
- ApconiX Ltd, Alderley Park, Alderley Edge, SK10 4TG, UK; University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK
| | - Weida Tong
- National Center for Toxicological Research, Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Zhichao Liu
- National Center for Toxicological Research, Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Tieliu Shi
- Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, 200241, China; School of Statistics, Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, East China Normal University, Shanghai, 200062, China; Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, Beihang University & Capital Medical University, Beijing, 100083, China
|
41
|
Matsuzawa Y, Higashi Y, Takano K, Takahashi M, Yamada Y, Okazaki Y, Nakabayashi R, Saito K, Tsugawa H. Food Lipidomics for 155 Agricultural Plant Products. J Agric Food Chem 2021; 69:8981-8990. [PMID: 33570932 DOI: 10.1021/acs.jafc.0c07356] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 06/12/2023]
Abstract
Lipids exhibit functional bioactivities based on their polar and acyl chain properties; humans obtain lipids from dietary plant product intake. Therefore, the identification of different molecular species facilitates the evaluation of biological functions and nutrition levels and new phenotype-modulating lipid structures. As a rapid screening strategy, we performed untargeted lipidomics for 155 agricultural products in 58 species from 23 plant families, wherein product-specific lipid diversities were shown using computational mass spectrometry. We characterized 716 lipid species, for which the profiles revealed the National Center for Biotechnology Information-established organismal classification and unique plant tissue metabotypes. Moreover, we annotated unreported subclasses in plant lipidology; e.g., triacylglycerol estolide (TG-EST) was detected in rice seeds (Oryza sativa) and several plant species. TG-EST is known as the precursor molecule producing the fatty acid ester of hydroxy fatty acid, which lowers ambient glycemia and improves glucose tolerance. Hence, our method can identify agricultural plant products containing valuable lipid ingredients.
Affiliation(s)
- Yuki Matsuzawa
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Department of Biotechnology and Life Science, Tokyo University of Agriculture and Technology, 2-24-16 Nakamachi, Koganei-shi, Tokyo 184-8588, Japan
| | - Yasuhiro Higashi
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
| | - Kouji Takano
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
| | - Mikiko Takahashi
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
| | - Yutaka Yamada
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
| | - Yozo Okazaki
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Graduate School of Bioresources, Mie University, 1577 Kurimamachiya-cho, Tsu, Mie 514-8507 Japan
| | - Ryo Nakabayashi
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
| | - Kazuki Saito
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
| | - Hiroshi Tsugawa
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Graduate School of Medical Life Science, Yokohama City University, Yokohama, Kanagawa 230-0045, Japan
|
42
|
Wang Z, He Y. Precision omics data integration and analysis with interoperable ontologies and their application for COVID-19 research. Brief Funct Genomics 2021; 20:235-248. [PMID: 34159360 PMCID: PMC8287950 DOI: 10.1093/bfgp/elab029] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/25/2021] [Revised: 05/10/2021] [Accepted: 05/24/2021] [Indexed: 12/12/2022] Open
Abstract
Omics technologies are widely used in biomedical research. Precision medicine focuses on individual-level disease treatment and prevention. Here, we propose the usage of the term 'precision omics' to represent the combinatorial strategy that applies omics to translate large-scale molecular omics data for precision disease understanding and accurate disease diagnosis, treatment and prevention. Given the complexity of both omics and precision medicine, precision omics requires standardized representation and integration of heterogeneous data types. Ontology has emerged as an important artificial intelligence component to become critical for standard data and metadata representation, standardization and integration. To support precision omics, we propose a precision omics ontology hypothesis, which hypothesizes that the effectiveness of precision omics is positively correlated with the interoperability of ontologies used for data and knowledge integration. Therefore, to make effective precision omics studies, interoperable ontologies are required to standardize and incorporate heterogeneous data and knowledge in a human- and computer-interpretable manner. Methods for efficient development and application of interoperable ontologies are proposed and illustrated. With the interoperable omics data and knowledge, omics tools such as OmicsViz can also be evolved to process, integrate, visualize and analyze various omics data, leading to the identification of new knowledge and hypotheses of molecular mechanisms underlying the outcomes of diseases such as COVID-19. Given extensive COVID-19 omics research, we propose the strategy of precision omics supported by interoperable ontologies, accompanied with ontology-based semantic reasoning and machine learning, leading to systematic disease mechanism understanding and rational design of precision treatment and prevention. SHORT ABSTRACT Precision medicine focuses on individual-level disease treatment and prevention. 
Precision omics is a new strategy that applies omics for precision medicine research, which requires standardized representation and integration of individual genetics and phenotypes, experimental conditions, and data analysis settings. Ontology has emerged as an important artificial intelligence component to become critical for standard data and metadata representation, standardization and integration. To support precision omics, interoperable ontologies are required in order to standardize and incorporate heterogeneous data and knowledge in a human- and computer-interpretable manner. With the interoperable omics data and knowledge, omics tools such as OmicsViz can also be evolved to process, integrate, visualize and analyze various omics data, leading to the identification of new knowledge and hypotheses of molecular mechanisms underlying disease outcomes. The precision COVID-19 omics study is provided as the primary use case to illustrate the rationale and implementation of the precision omics strategy.
Affiliation(s)
| | - Yongqun He
- University of Michigan Medical School, Ann Arbor, MI, USA
|
43
|
Bertrand S, Carvalho JE, Dauga D, Matentzoglu N, Daric V, Yu JK, Schubert M, Escrivá H. The Ontology of the Amphioxus Anatomy and Life Cycle (AMPHX). Front Cell Dev Biol 2021; 9:668025. [PMID: 33981708 PMCID: PMC8107275 DOI: 10.3389/fcell.2021.668025] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/15/2021] [Accepted: 03/31/2021] [Indexed: 11/13/2022] Open
Abstract
An ontology is a computable representation of the different parts of an organism and its different developmental stages as well as the relationships between them. The ontology of model organisms is therefore a fundamental tool for a multitude of bioinformatics and comparative analyses. The cephalochordate amphioxus is a marine animal representing the earliest diverging evolutionary lineage of chordates. Furthermore, its morphology, its anatomy and its genome can be considered as prototypes of the chordate phylum. For these reasons, amphioxus is a very important animal model for evolutionary developmental biology studies aimed at understanding the origin and diversification of vertebrates. Here, we have constructed an amphioxus ontology (AMPHX) which combines anatomical and developmental terms and includes the relationships between these terms. AMPHX will be used to annotate amphioxus gene expression patterns as well as phenotypes. We encourage the scientific community to adopt this amphioxus ontology and send recommendations for future updates and improvements.
Affiliation(s)
- Stephanie Bertrand
- CNRS, Biologie Intégrative des Organismes Marins, Sorbonne Université, Paris, France
| | - João E. Carvalho
- CNRS, Laboratoire de Biologie du Développement de Villefranche-sur-Mer, Institut de la Mer de Villefranche, Sorbonne Université, Paris, France
| | - Vladimir Daric
- CNRS, Biologie Intégrative des Organismes Marins, Sorbonne Université, Paris, France
| | - Jr-Kai Yu
- Institute of Cellular and Organismic Biology, Academia Sinica, Taipei City, Taiwan
- Marine Research Station, Institute of Cellular and Organismic Biology, Academia Sinica, Yilan, Taiwan
| | - Michael Schubert
- CNRS, Laboratoire de Biologie du Développement de Villefranche-sur-Mer, Institut de la Mer de Villefranche, Sorbonne Université, Paris, France
| | - Hector Escrivá
- CNRS, Biologie Intégrative des Organismes Marins, Sorbonne Université, Paris, France
|
44
|
Peng D, Ruan C, Fu S, He C, Song J, Li H, Tu Y, Tang D, Yao L, Lin S, Shi Y, Zhang W, Zhou H, Zhu L, Ma C, Chang C, Ma J, Xie Z, Wang C, Xue Y. Atg9-centered multi-omics integration reveals new autophagy regulators in Saccharomyces cerevisiae. Autophagy 2021; 17:4453-4476. [PMID: 33722159 DOI: 10.1080/15548627.2021.1898749] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/21/2022] Open
Abstract
In Saccharomyces cerevisiae, Atg9 is an important autophagy-related (Atg) protein that interacts with hundreds of other proteins. How many Atg9-interacting proteins are involved in macroautophagy/autophagy is unclear. Here, we conducted multi-omic profiling of Atg9-dependent molecular landscapes during nitrogen starvation-induced autophagy, and identified 290 and 256 genes to be markedly regulated by ATG9 at the transcriptional and translational levels, respectively. Unexpectedly, we found that most of the known Atg proteins and autophagy regulators that interact with Atg9 were not significantly changed at the mRNA or protein level during autophagy. Based on the hypothesis that proteins with similar molecular characteristics might have similar functions, we developed a new method named inference of functional interacting partners (iFIP) to integrate the transcriptomic, proteomic and interactomic data, and predicted 42 Atg9-interacting proteins to be potentially involved in autophagy, including 15 known Atg proteins or autophagy regulators. We validated 2 Atg9-interacting partners, Glo3 and Scs7, to be functional in both bulk and selective autophagy. The mRNA and protein expressions but not subcellular localizations of Glo3 and Scs7 were affected with or without ATG9 during autophagy, whereas the colocalizations of the 2 proteins and Atg9 were markedly enhanced at early stages of the autophagic process. Further analyses demonstrated that Glo3 but not Scs7 regulates the retrograde transport of Atg9 during autophagy. A working model was illustrated to highlight the importance of the Atg9 interactome.
Taken together, our study not only provided a powerful method for analyzing the multi-omics data, but also revealed 2 new players that regulate autophagy. Abbreviations: ALP: alkaline phosphatase; Arf1: ADP-ribosylation factor 1; Atg: autophagy-related; Co-IP: co-immunoprecipitation; Cvt: cytoplasm-to-vacuole targeting; DEM: differentially expressed mRNA; DEP: differentially expressed protein; DIC: differential interference contrast; E-ratio: enrichment ratio; ER: endoplasmic reticulum; ES: enrichment score; FC: fold change; FPKM: fragments per kilobase of exon per million fragments mapped; GAP: GTPase-activating protein; GFP: green fluorescent protein; GO: gene ontology; GSEA: gene set enrichment analysis; GST: glutathione S-transferase; HA: hemagglutinin; iFIP: inference of functional interacting partners; KO: knockout; LR: logistic regression; OE: over-expression; PAS: phagophore assembly site; PPI: protein-protein interaction; RFP: red fluorescent protein; RNA-seq: RNA sequencing; RT-PCR: real-time polymerase chain reaction; SCC: Spearman's correlation coefficient; SD-N: synthetic minimal medium lacking nitrogen; THANATOS: The Autophagy, Necrosis, ApopTosis OrchestratorS; Vsn: variance stabilization normalization; WT: wild-type.
Affiliation(s)
- Di Peng
- Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei China; Nanjing University Institute of Artificial Intelligence Biomedicine, Nanjing, Jiangsu China
| | - Chen Ruan
- Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei China
| | - Shanshan Fu
- Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei China
| | - Chengwen He
- State Key Laboratory of Microbial Metabolism & Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai China
| | - Jingzhen Song
- State Key Laboratory of Microbial Metabolism & Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai China
| | - Hui Li
- State Key Laboratory of Microbial Metabolism & Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai China
| | - Yiran Tu
- Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei China
| | - Dachao Tang
- Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei China
| | - Lan Yao
- Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei China
| | - Shaofeng Lin
- Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei China
| | - Ying Shi
- Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei China
| | - Weizhi Zhang
- Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei China
| | - Hao Zhou
- Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei China
| | - Le Zhu
- Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei China
| | - Cong Ma
- Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei China
| | - Cheng Chang
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Life Omics, Beijing China
| | - Jie Ma
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Life Omics, Beijing China
| | - Zhiping Xie
- State Key Laboratory of Microbial Metabolism & Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai China
| | - Chenwei Wang
- Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei China
| | - Yu Xue
- Key Laboratory of Molecular Biophysics of Ministry of Education, Hubei Bioinformatics and Molecular Imaging Key Laboratory, Center for Artificial Intelligence Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei China; Nanjing University Institute of Artificial Intelligence Biomedicine, Nanjing, Jiangsu China
|
45
|
Berke K, Sun P, Ong E, Sanati N, Huffman A, Brunson T, Loney F, Ostrow J, Racz R, Zhao B, Xiang Z, Masci AM, Zheng J, Wu G, He Y. VaximmutorDB: A Web-Based Vaccine Immune Factor Database and Its Application for Understanding Vaccine-Induced Immune Mechanisms. Front Immunol 2021; 12:639491. [PMID: 33777032 PMCID: PMC7994782 DOI: 10.3389/fimmu.2021.639491] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/09/2020] [Accepted: 02/18/2021] [Indexed: 01/07/2023] Open
Abstract
Vaccines stimulate various immune factors critical to protective immune responses. However, a comprehensive picture of vaccine-induced immune factors and pathways has not been systematically collected and analyzed. To address this issue, we developed VaximmutorDB, a web-based database system of vaccine immune factors (abbreviated as “vaximmutors”) manually curated from peer-reviewed articles. VaximmutorDB currently stores 1,740 vaccine immune factors from 13 host species (e.g., human, mouse, and pig). These vaximmutors were induced by 154 vaccines for 46 pathogens. The top 10 vaximmutors include three antibodies (IgG, IgG2a and IgG1), Th1 immune factors (IFN-γ and IL-2), Th2 immune factors (IL-4 and IL-6), TNF-α, CASP-1, and TLR8. Many host processes (e.g., stimulatory C-type lectin receptor signaling pathway, SRP-dependent cotranslational protein targeting to membrane) and cellular components (e.g., extracellular exosome, nucleoplasm) enriched among all the vaximmutors were identified. Using influenza as a model, live attenuated and killed inactivated influenza vaccines stimulate many shared pathways such as signaling of many interleukins (including IL-1, IL-4, IL-6, IL-13, IL-20, and IL-27), interferon signaling, MARK1 activation, and neutrophil degranulation. However, they also present their unique response patterns. While live attenuated influenza vaccine FluMist induced significant signal transduction responses, killed inactivated influenza vaccine Fluarix induced significant metabolism of protein responses. Two different Yellow Fever vaccine (YF-Vax) studies resulted in overlapping gene lists; however, they shared more portions of pathways than gene lists. Interestingly, live attenuated YF-Vax stimulated significant metabolism of protein responses, which was similar to the pattern induced by killed inactivated Fluarix. A user-friendly web interface was generated to access, browse and search the VaximmutorDB database information.
As the first web-based database of vaccine immune factors, VaximmutorDB provides systematic collection, standardization, storage, and analysis of experimentally verified vaccine immune factors, supporting a better understanding of protective vaccine immunity.
Affiliation(s)
- Kimberly Berke
- College of Literature, Science, and the Arts, University of Michigan, Ann Arbor, MI, United States; Central Michigan College of Medicine, Mt. Pleasant, MI, United States
| | - Peter Sun
- College of Literature, Science, and the Arts, University of Michigan, Ann Arbor, MI, United States
| | - Edison Ong
- Department of Computational Medicine and Biology, University of Michigan, Ann Arbor, MI, United States
| | - Nasim Sanati
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR, United States
| | - Anthony Huffman
- Department of Computational Medicine and Biology, University of Michigan, Ann Arbor, MI, United States
| | - Timothy Brunson
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR, United States
| | - Fred Loney
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR, United States
| | - Joseph Ostrow
- College of Literature, Science, and the Arts, University of Michigan, Ann Arbor, MI, United States
| | - Rebecca Racz
- College of Literature, Science, and the Arts, University of Michigan, Ann Arbor, MI, United States
| | - Bin Zhao
- College of Literature, Science, and the Arts, University of Michigan, Ann Arbor, MI, United States
| | - Zuoshuang Xiang
- College of Literature, Science, and the Arts, University of Michigan, Ann Arbor, MI, United States
| | - Anna Maria Masci
- Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC, United States
| | - Jie Zheng
- Department of Genetics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, United States
| | - Guanming Wu
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR, United States
| | - Yongqun He
- Unit for Laboratory Animal Medicine, University of Michigan Medical School, Ann Arbor, MI, United States; Department of Microbiology and Immunology, University of Michigan Medical School, Ann Arbor, MI, United States; Center for Computational Medicine and Biology, University of Michigan, Ann Arbor, MI, United States
| |
|
46
|
Barh D, Tiwari S, Andrade BS, Weener ME, Góes-Neto A, Azevedo V, Ghosh P, Blum K, Ganguly NK. A novel multi-omics-based highly accurate prediction of symptoms, comorbid conditions, and possible long-term complications of COVID-19. Mol Omics 2021; 17:317-337. [PMID: 33683246 DOI: 10.1039/d0mo00189a] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Indexed: 12/13/2022]
Abstract
Comprehensive clinical pictures, comorbid conditions, and long-term complications of COVID-19 are still unknown. Recently, using a multi-omics-based strategy, we predicted potential drugs for COVID-19 with ∼70% accuracy. Herein, using a novel multi-omics-based bioinformatic approach and three ways of analysis, we identified the symptoms, comorbid conditions, and short-, mid-, and possible long-term complications of COVID-19 with >90% precision, including 27 parent, 170 child, and 403 specific conditions. Among the specific conditions, 36 viral, 53 short-term, 62 short-mid-long-term, 194 mid-long-term, and 57 congenital conditions are identified. At a threshold "count of occurrence" of 4, we found that 83-100% (average 92.67%) of enriched conditions are associated with COVID-19. Except for dry cough and loss of taste, all the other COVID-19-associated mild and severe symptoms are enriched. CVDs, and pulmonary, metabolic, musculoskeletal, neuropsychiatric, kidney, liver, and immune system disorders are top comorbid conditions. Specific diseases like myocardial infarction, hypertension, COPD, lung injury, diabetes, cirrhosis, mood disorders, dementia, macular degeneration, chronic kidney disease, lupus, arthritis, etc., along with several other NCDs, were found to be top candidates. Interestingly, many cancers and congenital disorders associated with COVID-19 severity are also identified. Arthritis, gliomas, diabetes, psychiatric disorders, and CVDs having a bidirectional relationship with COVID-19 are also identified as top conditions. Based on our accuracy (>90%), the long-term presence of SARS-CoV-2 RNA in humans, and our "genetic remittance" assumption, we hypothesize that all the identified top-ranked conditions could be potential long-term consequences in COVID-19 survivors, warranting long-term observational studies.
Affiliation(s)
- Debmalya Barh
- Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, WB, India.
|
47
|
Liu Y, Hur J, Chan WKB, Wang Z, Xie J, Sun D, Handelman S, Sexton J, Yu H, He Y. Ontological modeling and analysis of experimentally or clinically verified drugs against coronavirus infection. Sci Data 2021; 8:16. [PMID: 33441564 PMCID: PMC7806933 DOI: 10.1038/s41597-021-00799-w] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 10/12/2020] [Accepted: 12/14/2020] [Indexed: 12/25/2022] Open
Abstract
Our systematic literature collection and annotation identified 106 chemical drugs and 31 antibodies effective against the infection of at least one human coronavirus (including SARS-CoV, SARS-CoV-2, and MERS-CoV) in vitro or in vivo in an experimental or clinical setting. A total of 163 drug protein targets were identified, and 125 biological processes involving the drug targets were significantly enriched based on a Gene Ontology (GO) enrichment analysis. The Coronavirus Infectious Disease Ontology (CIDO) was used as an ontological platform to represent the anti-coronaviral drugs, chemical compounds, drug targets, biological processes, viruses, and the relations among these entities. In addition to new term generation, CIDO also adopted various terms from existing ontologies and developed new relations and axioms to semantically represent our annotated knowledge. The CIDO knowledgebase was systematically analyzed for scientific insights. To support rational drug design, a "Host-coronavirus interaction (HCI) checkpoint cocktail" strategy was proposed to interrupt the important checkpoints in the dynamic HCI network, and ontologies would greatly support the design process with interoperable knowledge representation and reasoning.
Affiliation(s)
- Yingtong Liu
- Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
| | - Junguk Hur
- University of North Dakota School of Medicine and Health Sciences, Grand Forks, ND, 58202, USA
| | - Wallace K B Chan
- Department of Pharmacology, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
| | - Zhigang Wang
- Department of Biomedical Engineering, Institute of Basic Medical Sciences and School of Basic Medicine, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, 100005, China
| | - Jiangan Xie
- School of Bioinformatics, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
| | - Duxin Sun
- Department of Pharmaceutical Sciences, College of Pharmacy, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Samuel Handelman
- Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
- U-M Center for Drug Repurposing, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Jonathan Sexton
- Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
- U-M Center for Drug Repurposing, University of Michigan, Ann Arbor, MI, 48109, USA
- Department of Medicinal Chemistry, College of Pharmacy, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Hong Yu
- Department of Respiratory and Critical Care Medicine, Guizhou Province People's Hospital and NHC Key Laboratory of Immunological Diseases, People's Hospital of Guizhou University, Guiyang, Guizhou, 550002, China
- Department of Basic Medicine, Guizhou University Medical College, Guiyang, Guizhou, 550025, China
| | - Yongqun He
- Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA.
- Unit for Laboratory Animal Medicine, University of Michigan Medical School, Ann Arbor, MI, 48109, USA.
- Department of Microbiology and Immunology, University of Michigan Medical School, Ann Arbor, MI, 48109, USA.
| |
|
48
|
Feng G, Li X, Wang W, Deng L, Zeng K. Effects of Peptide Thanatin on the Growth and Transcriptome of Penicillium digitatum. Front Microbiol 2020; 11:606482. [PMID: 33381100 PMCID: PMC7767931 DOI: 10.3389/fmicb.2020.606482] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/15/2020] [Accepted: 10/09/2020] [Indexed: 11/28/2022] Open
Abstract
Penicillium digitatum is the most damaging pathogen causing green mold in citrus fruit during storage, and there is an urgent need for novel, highly efficient antifungal agents. The aim of this study was to investigate the antifungal effects of the peptide thanatin against P. digitatum and the underlying molecular mechanisms. Results showed that thanatin had a prominent inhibitory effect on P. digitatum in both in vitro and in vivo tests. A total of 938 genes, including 556 downregulated and 382 upregulated genes, were differentially expressed, as revealed by genome-wide RNA-seq analysis of P. digitatum with or without thanatin treatment. The downregulated genes were mainly related to RNA polymerase, ribosome biogenesis, amino acid metabolism, and the major facilitator superfamily. The genes associated with heat shock proteins and antioxidative systems were widely expressed in the thanatin-treated group. The DNA, RNA, and protein contents of P. digitatum were significantly decreased after thanatin treatment. In conclusion, thanatin could inhibit the growth of P. digitatum, and the underlying mechanism might involve effects on genetic information processing and the stress response. This research provides more precise and directed clues for exploring the inhibitory mechanism of thanatin on the growth of P. digitatum.
Affiliation(s)
- Guirong Feng
- College of Food Science, Southwest University, Chongqing, China
| | - Xindan Li
- College of Food Science, Southwest University, Chongqing, China
| | - Wenjun Wang
- College of Food Science, Southwest University, Chongqing, China
| | - Lili Deng
- College of Food Science, Southwest University, Chongqing, China; Research Center of Food Storage and Logistics, Southwest University, Chongqing, China
| | - Kaifang Zeng
- College of Food Science, Southwest University, Chongqing, China; Research Center of Food Storage and Logistics, Southwest University, Chongqing, China
| |
|
49
|
Thessen AE, Walls RL, Vogt L, Singer J, Warren R, Buttigieg PL, Balhoff JP, Mungall CJ, McGuinness DL, Stucky BJ, Yoder MJ, Haendel MA. Transforming the study of organisms: Phenomic data models and knowledge bases. PLoS Comput Biol 2020; 16:e1008376. [PMID: 33232313 PMCID: PMC7685442 DOI: 10.1371/journal.pcbi.1008376] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 01/07/2023] Open
Abstract
The rapidly decreasing cost of gene sequencing has resulted in a deluge of genomic data from across the tree of life; however, outside a few model organism databases, genomic data are limited in their scientific impact because they are not accompanied by computable phenomic data. The majority of phenomic data are contained in countless small, heterogeneous phenotypic data sets that are very difficult or impossible to integrate at scale because of variable formats, lack of digitization, and linguistic problems. One powerful solution is to represent phenotypic data using data models with precise, computable semantics, but adoption of semantic standards for representing phenotypic data has been slow, especially in biodiversity and ecology. Some phenotypic and trait data are available in a semantic language from knowledge bases, but these are often not interoperable. In this review, we will compare and contrast existing ontology and data models, focusing on nonhuman phenotypes and traits. We discuss barriers to integration of phenotypic data and make recommendations for developing an operationally useful, semantically interoperable phenotypic data ecosystem.
Affiliation(s)
- Anne E. Thessen
- Environmental and Molecular Toxicology, Oregon State University, Corvallis, Oregon, United States of America
- Ronin Institute for Independent Scholarship, Montclair, New Jersey, United States of America
| | - Ramona L. Walls
- Bio5 Institute, University of Arizona, Tucson, Arizona, United States of America
| | - Lars Vogt
- TIB Leibniz Information Centre for Science and Technology, Hannover, Germany
| | | | | | - Pier Luigi Buttigieg
- Alfred-Wegener-Institut, Helmholtz-Zentrum für Polar- und Meeresforschung, Bremerhaven, Germany
| | - James P. Balhoff
- Renaissance Computing Institute, University of North Carolina, Chapel Hill, North Carolina, United States of America
| | - Christopher J. Mungall
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
| | | | - Brian J. Stucky
- Florida Museum of Natural History, University of Florida, Gainesville, Florida, United States of America
| | - Matthew J. Yoder
- Illinois Natural History Survey, Champaign, Illinois, United States of America
| | - Melissa A. Haendel
- Environmental and Molecular Toxicology, Oregon State University, Corvallis, Oregon, United States of America
| |
|
50
|
Diller M, Johnson E, Hicks A, Hogan WR. A realism-based approach to an ontological representation of symbiotic interactions. BMC Med Inform Decis Mak 2020; 20:258. [PMID: 33032576 PMCID: PMC7542735 DOI: 10.1186/s12911-020-01273-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2020] [Accepted: 09/22/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The symbiotic interactions that occur between humans and organisms in our environment have a tremendous impact on our health. Recently, there has been a surge in interest in understanding the complex relationships between the microbiome and human health and host immunity against microbial pathogens, among other things. To collect and manage data about these interactions and their complexity, scientists will need ontologies that represent symbiotic interactions as they occur in reality. METHODS We began with two papers that reviewed the usage of ‘symbiosis’ and related terms in the biology and ecology literature and prominent textbooks. We then analyzed several prominent standard terminologies and ontologies that contain representations of symbiotic interactions, to determine if they appropriately defined ‘symbiosis’ and related terms according to current scientific usage as identified by the review papers. In the process, we identified several subtypes of symbiotic interactions, as well as the characteristics that differentiate them, which we used to propose textual and axiomatic definitions for each subtype of interaction. To both illustrate how to use the ontological representations and definitions we created and provide additional quality assurance on key definitions, we carried out a referent tracking analysis and representation of three scenarios involving symbiotic interactions among organisms. RESULTS We found one definition of ‘symbiosis’ in an existing ontology that was consistent with the vast preponderance of scientific usage in biology and ecology. However, that ontology changed its definition during the course of our work, and discussions are ongoing. We present a new definition that we have proposed. We also define 34 subtypes of symbiosis. Our referent tracking analysis showed that it is necessary to define symbiotic interactions at the level of the individual, rather than at the species level, due to the complex nature in which organisms can go from participating in one type of symbiosis with one organism to participating in another type of symbiosis with a different organism. CONCLUSION As a result of our efforts here, we have developed a robust representation of symbiotic interactions using a realism-based approach, which fills a gap in existing biomedical ontologies.
Affiliation(s)
- Matthew Diller
- Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA.
| | - Evan Johnson
- Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
| | - Amanda Hicks
- Applied Physics Laboratory, Johns Hopkins University, Baltimore, MD, USA
| | - William R Hogan
- Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
| |
|