1
Network based systems biology approach to identify diseasome and comorbidity associations of Systemic Sclerosis with cancers. Heliyon 2022; 8:e08892. [PMID: 35198765 PMCID: PMC8841363 DOI: 10.1016/j.heliyon.2022.e08892] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/17/2021] [Revised: 08/04/2021] [Accepted: 01/29/2022] [Indexed: 01/11/2023] Open
Abstract
Systemic Sclerosis (SSc) is an autoimmune disease in which the immune system attacks the body, and it is associated with changes in the skin's structure. A recent meta-analysis has reported a high incidence of cancers, including lung cancer (LC), leukemia (LK), and lymphoma (LP), as comorbidities in patients with SSc, but the underlying mechanistic details are yet to be revealed. To address this research gap, bioinformatics methodologies were developed to explore the comorbidity interactions between a pair of diseases. First, appropriate gene expression datasets on SSc and its comorbidities were collected from different repositories. Then the interconnection between SSc and its cancer comorbidities was identified by applying the developed pipeline. The pipeline was designed as a generic workflow for examining a presumed comorbid condition: it integrates gene expression data, tissue/organ metadata, Gene Ontology (GO), molecular pathways, and other online resources, and analyzes them with Gene Set Enrichment Analysis (GSEA), pathway enrichment, and Semantic Similarity (SS). The pipeline was implemented in R and can be accessed through our GitHub repository: https://github.com/hiddenntreasure/comorbidity. Our results suggest that SSc and its cancer comorbidities share differentially expressed genes, functional terms (gene ontology), and pathways. The findings have led to a better understanding of disease pathways, and our developed methodologies may be applied to any set of diseases to find associations between them. This research may be used by physicians, researchers, biologists, and others.
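As a rough illustration of the "shared differentially expressed genes" result mentioned in this abstract, the Python sketch below intersects two gene lists and reports their Jaccard index. It is not the authors' R pipeline: the function name `shared_genes` and the gene labels are hypothetical placeholders, and the Jaccard index is simply one overlap measure chosen here for illustration.

```python
def shared_genes(deg_a, deg_b):
    """Return the genes present in both lists and the Jaccard index of the two sets."""
    a, b = set(deg_a), set(deg_b)
    common = a & b
    union = a | b
    # Jaccard index: size of the intersection divided by size of the union.
    jaccard = len(common) / len(union) if union else 0.0
    return common, jaccard


# Hypothetical gene labels, for illustration only (not taken from the paper).
ssc_degs = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
lung_cancer_degs = ["GENE_B", "GENE_D", "GENE_E"]

common, jaccard = shared_genes(ssc_degs, lung_cancer_degs)
print(sorted(common), round(jaccard, 2))
```

In a real analysis the two lists would come from a differential expression step on each disease's dataset; that step is outside the scope of this sketch.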
2
Wesołowski S, Lemmon G, Hernandez EJ, Henrie A, Miller TA, Weyhrauch D, Puchalski MD, Bray BE, Shah RU, Deshmukh VG, Delaney R, Yost HJ, Eilbeck K, Tristani-Firouzi M, Yandell M. An explainable artificial intelligence approach for predicting cardiovascular outcomes using electronic health records. PLOS Digital Health 2022; 1:e0000004. [PMID: 35373216 PMCID: PMC8975108 DOI: 10.1371/journal.pdig.0000004] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/31/2021] [Accepted: 11/17/2021] [Indexed: 11/19/2022]
Abstract
Understanding the conditionally-dependent clinical variables that drive cardiovascular health outcomes is a major challenge for precision medicine. Here, we deploy a recently developed massively scalable comorbidity discovery method called Poisson Binomial based Comorbidity discovery (PBC), to analyze Electronic Health Records (EHRs) from the University of Utah and Primary Children's Hospital (over 1.6 million patients and 77 million visits) for comorbid diagnoses, procedures, and medications. Using explainable Artificial Intelligence (AI) methodologies, we then tease apart the intertwined, conditionally-dependent impacts of comorbid conditions and demography upon cardiovascular health, focusing on the key areas of heart transplant, sinoatrial node dysfunction and various forms of congenital heart disease. The resulting multimorbidity networks make possible wide-ranging explorations of the comorbid and demographic landscapes surrounding these cardiovascular outcomes, and can be distributed as web-based tools for further community-based outcomes research. The ability to transform enormous collections of EHRs into compact, portable tools devoid of Protected Health Information solves many of the legal, technological, and data-scientific challenges associated with large-scale EHR analyses.
Affiliation(s)
- Sergiusz Wesołowski
  - Department of Human Genetics and Utah Center for Genetic Discovery, University of Utah, Salt Lake City, UT, United States of America
- Gordon Lemmon
  - Department of Human Genetics and Utah Center for Genetic Discovery, University of Utah, Salt Lake City, UT, United States of America
- Edgar J. Hernandez
  - Department of Human Genetics and Utah Center for Genetic Discovery, University of Utah, Salt Lake City, UT, United States of America
- Alex Henrie
  - Department of Human Genetics and Utah Center for Genetic Discovery, University of Utah, Salt Lake City, UT, United States of America
- Thomas A. Miller
  - Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, United States of America
- Derek Weyhrauch
  - Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, United States of America
- Michael D. Puchalski
  - Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, United States of America
- Bruce E. Bray
  - Division of Cardiovascular Medicine, University of Utah School of Medicine, Salt Lake City, UT, United States of America
  - University of Utah, Biomedical Informatics, Salt Lake City, UT 84108, United States of America
- Rashmee U. Shah
  - Division of Cardiovascular Medicine, University of Utah School of Medicine, Salt Lake City, UT, United States of America
- Vikrant G. Deshmukh
  - University of Utah Health Care CMIO Office, Salt Lake City, UT, United States of America
- Rebecca Delaney
  - Department of Population Health Sciences, University of Utah, Salt Lake City, UT, United States of America
- H. Joseph Yost
  - Molecular Medicine Program, University of Utah, Salt Lake City, UT, United States of America
- Karen Eilbeck
  - Department of Population Health Sciences, University of Utah, Salt Lake City, UT, United States of America
- Martin Tristani-Firouzi
  - Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, United States of America
  - Nora Eccles Harrison CVRTI, University of Utah School of Medicine, Salt Lake City, UT, United States of America
- Mark Yandell
  - Department of Human Genetics and Utah Center for Genetic Discovery, University of Utah, Salt Lake City, UT, United States of America
3
Rahman MH, Rana HK, Peng S, Kibria MG, Islam MZ, Mahmud SMH, Moni MA. Bioinformatics and system biology approaches to identify pathophysiological impact of COVID-19 to the progression and severity of neurological diseases. Comput Biol Med 2021; 138:104859. [PMID: 34601390 PMCID: PMC8483812 DOI: 10.1016/j.compbiomed.2021.104859] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 05/11/2021] [Revised: 08/21/2021] [Accepted: 09/06/2021] [Indexed: 02/06/2023]
Abstract
Coronavirus Disease 2019 (COVID-19) continues to spread across the globe. Clinical and epidemiological analyses indicate a link between COVID-19 and Neurological Diseases (NDs) that drives the progression and severity of NDs. It remains unclear why COVID-19 influences the progression of NDs in some patients, and why some patients with NDs who are diagnosed with COVID-19 become increasingly sick while others do not. In this research, we investigated how COVID-19 and NDs interact, and the impact of COVID-19 on the severity of NDs, by performing transcriptomic analyses of COVID-19 and ND samples with a pipeline of bioinformatics and network-based approaches. The transcriptomic study identified the contributing genes, which were then filtered with cell signaling pathway, gene ontology, protein-protein interaction, transcription factor, and microRNA analyses. Identifying hub proteins using protein-protein interactions leads to the identification of a therapeutic strategy. Additionally, incorporating a comorbidity interaction score extends the analysis beyond simply detecting novel biological mechanisms involved in the pathophysiology of COVID-19 and its ND comorbidities. By computing the semantic similarity between COVID-19 and each ND, we found that the gene-based semantic score was highest between COVID-19 and Parkinson's disease and lowest between COVID-19 and multiple sclerosis. Similarly, the gene ontology-based semantic score was highest between COVID-19 and Huntington disease and lowest between COVID-19 and epilepsy. Finally, we validated our findings using gold-standard databases and literature searches to determine which genes and pathways had previously been associated with COVID-19 and NDs.
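The abstract mentions identifying hub proteins from protein-protein interactions without saying how hubs are scored. One common and simple choice, assumed here purely for illustration, is to rank proteins by degree (number of distinct interaction partners). The function name `hub_proteins` and the protein labels in this Python sketch are hypothetical.

```python
from collections import Counter


def hub_proteins(edges, top_n=2):
    """Rank proteins by degree in an undirected interaction network.

    `edges` is an iterable of (protein_a, protein_b) pairs. Self-loops are
    ignored and duplicate pairs are counted once.
    """
    unique_pairs = {frozenset(pair) for pair in edges if pair[0] != pair[1]}
    degree = Counter()
    for pair in unique_pairs:
        a, b = tuple(pair)
        degree[a] += 1
        degree[b] += 1
    # Sort by descending degree, then by name so ties are ordered deterministically.
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical interaction list, for illustration only.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P3", "P2")]
print(hub_proteins(edges, top_n=2))
```

Other centrality measures (betweenness, closeness) are also used for hub detection; degree is shown only because it needs no external library.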
Affiliation(s)
- Md Habibur Rahman
  - Dept. of Computer Science and Engineering, Islamic University, Kushtia 7003, Bangladesh
- Humayan Kabir Rana
  - Dept. of Computer Science and Engineering, Green University of Bangladesh, Dhaka, Bangladesh
- Silong Peng
  - Institute of Automation, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Beijing 100190, China
- Md Golam Kibria
  - Dept. of Chemical and Petroleum Engineering, Schulich School of Engineering, University of Calgary, Canada
- Md Zahidul Islam
  - Department of Electronics, Graduate School of Engineering, Nagoya University, Japan
- S M Hasan Mahmud
  - Dept. of Computer Science, American International University Bangladesh, Dhaka, Bangladesh
- Mohammad Ali Moni
  - School of Health and Rehabilitation Sciences, Faculty of Health and Behavioural Sciences, The University of Queensland, St Lucia, QLD 4072, Australia
4
Lemmon G, Wesolowski S, Henrie A, Tristani-Firouzi M, Yandell M. A Poisson binomial-based statistical testing framework for comorbidity discovery across electronic health record datasets. Nature Computational Science 2021; 1:694-702. [PMID: 35252879 PMCID: PMC8896515 DOI: 10.1038/s43588-021-00141-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/13/2020] [Accepted: 09/16/2021] [Indexed: 01/28/2023]
Abstract
Discovering the concomitant occurrence of distinct medical conditions in a patient, also known as comorbidities, is a prerequisite for creating patient outcome prediction tools. Current comorbidity discovery applications are designed for small datasets and use stratification to control for confounding variables such as age, sex or ancestry. Stratification lowers false positive rates, but reduces power, as the size of the study cohort is decreased. Here we describe a Poisson binomial-based approach to comorbidity discovery (PBC) designed for big-data applications that circumvents the need for stratification. PBC adjusts for confounding demographic variables on a per-patient basis and models temporal relationships. We benchmark PBC using two datasets to compute comorbidity statistics on 4,623,841 pairs of potentially comorbid medical terms. The results of this computation are provided as a searchable web resource. Compared with current methods, the PBC approach reduces false positive associations while retaining statistical power to discover true comorbidities.
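To make the Poisson binomial idea concrete, here is a minimal Python sketch of an upper-tail test. It assumes, as an illustration only, that each patient i has their own null probability p_i of carrying both diagnoses (for example, derived from that patient's age and sex), so the number of co-occurrences across patients follows a Poisson binomial distribution. How PBC actually derives these probabilities is not described in the abstract, and the function names and input values below are hypothetical.

```python
def poisson_binomial_pmf(probs):
    """PMF of the number of successes among independent Bernoulli trials
    with (possibly different) success probabilities `probs`."""
    pmf = [1.0]  # with zero trials, P(0 successes) = 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # this trial fails
            nxt[k + 1] += mass * p       # this trial succeeds
        pmf = nxt
    return pmf


def upper_tail_p_value(probs, observed):
    """P(X >= observed) for X ~ PoissonBinomial(probs)."""
    pmf = poisson_binomial_pmf(probs)
    return sum(pmf[max(observed, 0):])


# Hypothetical per-patient null probabilities and observed co-occurrence count.
per_patient_probs = [0.01, 0.02, 0.05, 0.10]
print(upper_tail_p_value(per_patient_probs, observed=2))
```

The dynamic program runs in time quadratic in the number of patients, so a dataset with millions of patients would need an approximation or a faster exact method; this sketch only shows the definition of the test statistic.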
Affiliation(s)
- Gordon Lemmon
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
  - Utah Center for Genetic Discovery and Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Sergiusz Wesolowski
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
  - Utah Center for Genetic Discovery and Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Alex Henrie
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
  - Utah Center for Genetic Discovery and Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Martin Tristani-Firouzi
  - Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, USA
  - Nora Eccles Harrison CVRTI, University of Utah School of Medicine, Salt Lake City, UT, USA
- Mark Yandell
  - Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
  - Utah Center for Genetic Discovery and Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
5
Chowdhury UN, Ahmad S, Islam MB, Alyami SA, Quinn JMW, Eapen V, Moni MA. System biology and bioinformatics pipeline to identify comorbidities risk association: Neurodegenerative disorder case study. PLoS One 2021; 16:e0250660. [PMID: 33956862 PMCID: PMC8101720 DOI: 10.1371/journal.pone.0250660] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/03/2020] [Accepted: 04/12/2021] [Indexed: 12/17/2022] Open
Abstract
Alzheimer's disease (AD) is the commonest progressive neurodegenerative condition in humans, and is currently incurable. A wide spectrum of comorbidities, including other neurodegenerative diseases, is frequently associated with AD. How AD interacts with those comorbidities can be examined by analysing gene expression patterns in affected tissues using bioinformatics tools. We surveyed public data repositories for available gene expression data on tissue from AD subjects and from people affected by neurodegenerative diseases that are often found as comorbidities with AD. We then utilized a large set of gene expression data, cell-related data and other public resources through an analytical process to identify functional disease links. This process incorporated gene set enrichment analysis and utilized semantic similarity to give proximity measures. We identified genes with abnormal expression that were common to AD and its comorbidities, as well as shared gene ontology terms and molecular pathways. Our methodological pipeline was implemented in the R platform as an open-source package and is available at the following link: https://github.com/unchowdhury/AD_comorbidity. The pipeline was thus able to identify factors and pathways that may constitute functional links between AD and these common comorbidities, by which they affect each other's development and progression. This pipeline can also be useful to identify key pathological factors and therapeutic targets for other diseases and disease interactions.
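The abstract refers to gene set enrichment analysis without giving details. As a hedged illustration of the general idea of testing whether a gene list is enriched for a pathway, the Python sketch below computes a hypergeometric over-representation p-value. This is a simpler relative of GSEA, not the rank-based GSEA statistic and not the authors' R implementation; the function name `hypergeom_upper_tail` and all counts are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(population, annotated, drawn, overlap):
    """P(X >= overlap) when `drawn` genes are sampled without replacement
    from `population` genes, of which `annotated` belong to the pathway."""
    total = comb(population, drawn)
    max_overlap = min(annotated, drawn)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(max(overlap, 0), max_overlap + 1)
    )
    return favourable / total


# Hypothetical counts: 20000 background genes, 150 in the pathway,
# 300 shared differentially expressed genes, 12 of them in the pathway.
print(hypergeom_upper_tail(population=20000, annotated=150, drawn=300, overlap=12))
```

In practice the resulting p-values would be corrected for multiple testing across all pathways examined.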
Affiliation(s)
- Utpala Nanda Chowdhury
  - Department of Computer Science and Engineering, University of Rajshahi, Rajshahi, Bangladesh
- Shamim Ahmad
  - Department of Computer Science and Engineering, University of Rajshahi, Rajshahi, Bangladesh
- M. Babul Islam
  - Department of Electrical and Electronic Engineering, University of Rajshahi, Rajshahi, Bangladesh
- Salem A. Alyami
  - Department of Mathematics and Statistics, Imam Mohammad Ibn Saud Islamic University, Riyadh, Saudi Arabia
- Julian M. W. Quinn
  - Healthy Ageing Theme, Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
- Valsamma Eapen
  - School of Psychiatry, Faculty of Medicine, University of New South Wales, Sydney, Australia
- Mohammad Ali Moni
  - Healthy Ageing Theme, Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
  - School of Psychiatry, Faculty of Medicine, University of New South Wales, Sydney, Australia
  - WHO Collaborating Centre on eHealth, School of Public Health and Community Medicine, Faculty of Medicine, UNSW Sydney, Sydney, Australia
6
Alcaide D, Aerts J. A visual analytic approach for the identification of ICU patient subpopulations using ICD diagnostic codes. PeerJ Comput Sci 2021; 7:e430. [PMID: 33954230 PMCID: PMC8049127 DOI: 10.7717/peerj-cs.430] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/17/2020] [Accepted: 02/15/2021] [Indexed: 05/03/2023]
Abstract
A large number of clinical concepts are categorized under standardized formats that ease the manipulation, understanding, analysis, and exchange of information. One of the most widespread codifications is the International Classification of Diseases (ICD), used for characterizing diagnoses and clinical procedures. With formatted ICD concepts, a patient profile can be described through a set of standardized attributes sorted according to the relevance or chronology of events. This structured data is fundamental to quantifying the similarity between patients and detecting relevant clinical characteristics. Data visualization tools allow the representation and comprehension of data patterns, usually of a high-dimensional nature, where only a partial picture can be projected. In this paper, we provide a visual analytics approach for the identification of homogeneous patient cohorts by combining custom distance metrics with a flexible dimensionality reduction technique. First, we define a new metric to measure the similarity between diagnosis profiles through the concordance and relevance of events. Second, we describe a variation of the Simplified Topological Abstraction of Data (STAD) dimensionality reduction technique to enhance the projection of signals while preserving the global structure of the data. The MIMIC-III clinical database is used to implement the analysis in an interactive dashboard, providing a highly expressive environment for the exploration and comparison of patient groups with at least one identical diagnostic ICD code. The combination of the distance metric and STAD not only allows the identification of patterns but also provides a new layer of information to establish additional relationships between patient cohorts. The method and tool presented here add a valuable new approach for exploring heterogeneous patient populations. In addition, the distance metric described can be applied in other domains that employ ordered lists of categorical data.
Affiliation(s)
- Daniel Alcaide
  - Department of Electrical Engineering (ESAT) STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, Leuven, Belgium
- Jan Aerts
  - Department of Electrical Engineering (ESAT) STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, Leuven, Belgium
  - UHasselt, I-BioStat, Data Science Institute, Hasselt, Belgium