1
|
Huang B, Xin C, Yan H, Yu Z. A Machine Learning Method for a Blood Diagnostic Model of Pancreatic Cancer Based on microRNA Signatures. Crit Rev Immunol 2024; 44:13-23. [PMID: 38421702 DOI: 10.1615/critrevimmunol.2023051250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/02/2024]
Abstract
This study aimed to construct a blood diagnostic model for pancreatic cancer (PC) from miRNA signatures by combining machine learning with experimental verification. Gene expression profiles of patients with PC and normalized transcriptome data were obtained from the Gene Expression Omnibus (GEO) database. A classifier of differentially expressed miRNAs was identified, on the basis of algorithmic selection and functional properties, using the random forest algorithm, lasso regression, and multivariate Cox regression analysis. Receiver operating characteristic (ROC) curve analysis was then used to evaluate the predictive performance of the diagnostic model. Finally, the expression of two specific miRNAs was analyzed in the Capan-1, PANC-1, and MIA PaCa-2 pancreatic cell lines using qRT-PCR. Integrated microarray analysis revealed 33 common miRNAs with significantly different expression between the tumor and normal groups (P value < 0.05 and |logFC| > 0.3). Pathway analysis showed that the differentially expressed miRNAs were related to the P00059 p53 pathway, the hsa04062 chemokine signaling pathway, and cancer-related pathways including PC. In the ENCORI database, hsa-miR-4486 and hsa-miR-6075 were identified by the random forest and lasso regression algorithms and proposed as major miRNA markers for PC diagnosis. ROC curve analysis yielded an area under the curve (AUC) above 80%, indicating good sensitivity and specificity of the two-miRNA signature model in PC diagnosis. Additionally, qRT-PCR showed that hsa-miR-4486 and hsa-miR-6075 expression was up-regulated in all three pancreatic cell lines. In summary, these findings suggest that the two miRNAs, hsa-miR-4486 and hsa-miR-6075, could serve as valuable prognostic markers for PC.
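The differential-expression cutoff quoted in this abstract (P value < 0.05 and |logFC| > 0.3) can be illustrated with a minimal sketch. This is not the authors' code: the `MirnaStat` record, the `filter_differential` function, and the sample values are hypothetical and serve only to show how the two thresholds combine.

```python
from typing import List, NamedTuple


class MirnaStat(NamedTuple):
    """Hypothetical per-miRNA summary: name, log fold change, P value."""
    name: str
    log_fc: float
    p_value: float


def filter_differential(stats: List[MirnaStat],
                        p_cutoff: float = 0.05,
                        log_fc_cutoff: float = 0.3) -> List[MirnaStat]:
    """Keep entries with p_value < p_cutoff AND |log_fc| > log_fc_cutoff."""
    return [s for s in stats
            if s.p_value < p_cutoff and abs(s.log_fc) > log_fc_cutoff]


# Made-up values for illustration only; not data from the study.
example = [
    MirnaStat("miR-A", 0.45, 0.01),    # passes both thresholds
    MirnaStat("miR-B", 0.10, 0.001),   # fold change too small
    MirnaStat("miR-C", -0.60, 0.03),   # passes (down-regulated)
    MirnaStat("miR-D", 0.80, 0.20),    # not significant
]
selected = [s.name for s in filter_differential(example)]
print(selected)
```

Both conditions are strict inequalities, so an entry sitting exactly on either cutoff is excluded.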
Affiliation(s)
- Bin Huang
- The Affiliated People's Hospital of Ningbo University
| | - Chang Xin
- Department of Hepatopancreatobiliary Surgery, The Affiliated People's Hospital of Ningbo University, Ningbo, Zhejiang, China
| | - Huanjun Yan
- Department of Hepatopancreatobiliary Surgery, The Affiliated People's Hospital of Ningbo University, Ningbo, Zhejiang, China
| | - Zhewei Yu
- Department of Hepatopancreatobiliary Surgery, The Affiliated People's Hospital of Ningbo University, Ningbo, Zhejiang, China
| |
|
2
|
Akhtar MR, Mondal MNI, Rana HK. Bioinformatics approach to identify the impacts of microgravity on the development of bone and joint diseases. Informatics in Medicine Unlocked 2023. [DOI: 10.1016/j.imu.2023.101211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/17/2023] Open
|
3
|
Network-Based Data Analysis Reveals Ion Channel-Related Gene Features in COVID-19: A Bioinformatic Approach. Biochem Genet 2022; 61:471-505. [PMID: 36104591 PMCID: PMC9473477 DOI: 10.1007/s10528-022-10280-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2022] [Accepted: 09/01/2022] [Indexed: 11/02/2022]
Abstract
Coronavirus disease 2019 (COVID-19) seriously threatens human health and has been disseminated worldwide. Although there are several treatments for COVID-19, its control is currently suboptimal. Therefore, the development of novel strategies to treat COVID-19 is necessary. Ion channels are located on the membranes of all excitable cells and many intracellular organelles and are key components involved in various biological processes. They are a target of interest when searching for drug targets. This study aimed to reveal the relevant molecular features of ion channel genes in COVID-19 based on bioinformatic analyses. The RNA-sequencing data of patients with COVID-19 and healthy subjects (GSE152418 and GSE171110 datasets) were obtained from the Gene Expression Omnibus (GEO) database. Ion channel genes were selected from the Hugo Gene Nomenclature Committee (HGNC) database. The RStudio software was used to process the data based on the corresponding R language package to identify ion channel-associated differentially expressed genes (DEGs). Based on the DEGs, Gene Ontology (GO) functional and pathway enrichment analyses were performed using the Enrichr web tool. The STRING database was used to generate a protein-protein interaction (PPI) network, and the Cytoscape software was used to screen for hub genes in the PPI network based on the cytoHubba plug-in. Transcription factors (TF)-DEG, DEG-microRNA (miRNA) and DEG-disease association networks were constructed using the NetworkAnalyst web tool. Finally, the screened hub genes as drug targets were subjected to enrichment analysis based on the DSigDB using the Enrichr web tool to identify potential therapeutic agents for COVID-19. A total of 29 ion channel-associated DEGs were identified. GO functional analysis showed that the DEGs were integral components of the plasma membrane and were mainly involved in inorganic cation transmembrane transport and ion channel activity functions. 
Pathway analysis showed that the DEGs were mainly involved in nicotine addiction, calcium regulation in the cardiac cell and neuronal system pathways. The top 10 hub genes screened based on the PPI network included KCNA2, KCNJ4, CACNA1A, CACNA1E, NALCN, KCNA5, CACNA2D1, TRPC1, TRPM3 and KCNN3. The TF-DEG and DEG-miRNA networks revealed significant TFs (FOXC1, GATA2, HINFP, USF2, JUN and NFKB1) and miRNAs (hsa-mir-146a-5p, hsa-mir-27a-3p, hsa-mir-335-5p, hsa-let-7b-5p and hsa-mir-129-2-3p). Gene-disease association network analysis revealed that the DEGs were closely associated with intellectual disability and cerebellar ataxia. Drug-target enrichment analysis showed that the relevant drugs targeting the hub genes CACNA2D1, CACNA1A, CACNA1E, KCNA2 and KCNA5 were gabapentin, gabapentin enacarbil, pregabalin, guanidine hydrochloride and 4-aminopyridine. The results of this study provide a valuable basis for exploring the mechanisms of ion channel genes in COVID-19 and clues for developing therapeutic strategies for COVID-19.
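Hub-gene screening of a PPI network, as mentioned in this abstract, can be sketched with a simple degree ranking. The abstract does not say which ranking method was used in cytoHubba, so degree is an assumption chosen for illustration; the function name `rank_hubs_by_degree` and the toy edge list (placeholder node names G1 to G5) are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def rank_hubs_by_degree(edges: Iterable[Tuple[str, str]],
                        top_n: int = 10) -> List[Tuple[str, int]]:
    """Rank nodes of an undirected network by number of distinct neighbors."""
    neighbors: Dict[str, Set[str]] = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    # Highest degree first; ties broken alphabetically for a stable result.
    ranked = sorted(neighbors, key=lambda g: (-len(neighbors[g]), g))
    return [(g, len(neighbors[g])) for g in ranked[:top_n]]


# Toy network with placeholder node names; not data from the study.
toy_edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"),
             ("G2", "G3"), ("G4", "G5")]
top = rank_hubs_by_degree(toy_edges, top_n=2)
print(top)
```

Degree is only one possible centrality measure; other rankings may reorder the hubs on the same network.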
|
4
|
Bioinformatics and System Biological Approaches for the Identification of Genetic Risk Factors in the Progression of Cardiovascular Disease. Cardiovasc Ther 2022; 2022:9034996. [PMID: 36035865 PMCID: PMC9381297 DOI: 10.1155/2022/9034996] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/11/2022] [Revised: 07/17/2022] [Accepted: 07/23/2022] [Indexed: 11/17/2022] Open
Abstract
Background Cardiovascular disease (CVD) encompasses coronary heart disease, myocardial infarction, rheumatic heart disease, and peripheral vascular disease, which affect the heart and blood vessels. It is one of the leading causes of death, accounting for one-third of deaths worldwide each year. Its associated risk factors further complicate the condition of cardiovascular patients and raise their mortality, but the genetic association between CVD and its risk factors has not been clearly explored in the literature. We addressed this issue and explored the linkage between CVD and its risk factors. Methods We developed an analytical approach to reveal the risk factors and their linkages with CVD. We used GEO microarray datasets for CVD and its risk factors. We performed several analyses, including gene expression analysis, diseasome analysis, protein-protein interaction (PPI) analysis, and pathway analysis, to discover the relationship between CVD and its risk factors. We also validated our findings against the gold benchmark databases OMIM, dbGaP, and DisGeNET. Results We observed that 32, 17, 53, 70, and 89 differentially expressed genes (DEGs) overlap between CVD and its risk factors of hypertension (HTN), type 2 diabetes (T2D), hypercholesterolemia (HCL), obesity, and aging, respectively. We identified 10 major hub proteins (FPR2, TNF, CXCL8, CXCL1, IL1B, VEGFA, CYBB, PTGS2, ITGAX, and CCR5), 12 significant functional pathways, and 11 gene ontological pathways that are associated with CVD. We also found the connection of CVD with its risk factors in the gold benchmark databases. Our experimental outcomes indicate a strong association of CVD with its risk factors of HTN, T2D, HCL, obesity, and aging.
Conclusions Our computational approach explored the genetic association of CVD with its risk factors by identifying the significant DEGs, hub proteins, and signaling and ontological pathways. The outcomes of this study may be used in further laboratory-based analyses to develop effective treatment strategies for CVD.
|
5
|
Chen Q, Wang Y, Liu Y, Xi B. ESRRG, ATP4A, and ATP4B as Diagnostic Biomarkers for Gastric Cancer: A Bioinformatic Analysis Based on Machine Learning. Front Physiol 2022; 13:905523. [PMID: 35812327 PMCID: PMC9262247 DOI: 10.3389/fphys.2022.905523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2022] [Accepted: 05/10/2022] [Indexed: 11/13/2022] Open
Abstract
Based on multiple bioinformatics methods and machine learning techniques, this study was designed to explore potential hub genes of gastric cancer with a diagnostic value. The novel biomarkers were detected through multiple databases of gastric cancer–related genes. The NCBI Gene Expression Omnibus (GEO) database was used to obtain gene expression files. Three hub genes (ESRRG, ATP4A, and ATP4B) were detected through a combination of weighted gene co-expression network analysis (WGCNA), gene–gene interaction network analysis, and supervised feature selection method. GEPIA2 was used to verify the differences in the expression levels of the hub genes in normal and cancer tissues in the RNA-seq levels of Genotype-Tissue Expression (GTEx) and The Cancer Genome Atlas (TCGA) databases. The objectivity of potential hub genes was also verified by immunohistochemistry in the Human Protein Atlas (HPA) database and transcription factor–hub gene regulatory network. Machine learning (ML) methods including data pre-processing, model selection and cross-validation, and performance evaluation were examined on the hub-gene expression profiles in five Gene Expression Omnibus datasets and verified on a GEO external validation (EV) dataset. Six supervised learning models (support vector machine, random forest, k-nearest neighbors, neural network, decision tree, and eXtreme Gradient Boosting) and one semi-supervised learning model (label spreading) were established to evaluate the diagnostic value of biomarkers. Among the six supervised models, the support vector machine (SVM) algorithm was the most effective one according to calculated performance metrics, including 0.93 and 0.99 area under the curve (AUC) scores on the test and external validation datasets, respectively. Furthermore, the semi-supervised model could also successfully learn and predict sample types, achieving a 0.986 AUC score on the EV dataset, even when 10% samples in the five GEO datasets were labeled. 
In conclusion, three hub genes (ATP4A, ATP4B, and ESRRG) closely related to gastric cancer were mined, and an ML diagnostic model of gastric cancer was constructed on the basis of these genes.
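The area under the ROC curve (AUC) reported in this abstract equals the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one. A minimal sketch of that rank-based definition follows; it is not the authors' pipeline, and the function name `roc_auc` and the sample labels and scores are hypothetical.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for positive (e.g. tumor), 0 for negative (e.g. normal).
    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up classifier scores for illustration only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a small illustration; a sort-based rank computation scales better on large datasets.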
Affiliation(s)
- Qiu Chen
- Medical College, Yangzhou University, Yangzhou, China
| | - Yu Wang
- College of Physics Science and Technology, Yangzhou University, Yangzhou, China
| | - Yongjun Liu
- College of Physics Science and Technology, Yangzhou University, Yangzhou, China
| | - Bin Xi
- College of Physics Science and Technology, Yangzhou University, Yangzhou, China
- *Correspondence: Bin Xi,
| |
|
6
|
Kim S, Hollinger H, Radke EG. 'Omics in environmental epidemiological studies of chemical exposures: A systematic evidence map. Environment International 2022; 164:107243. [PMID: 35551006 PMCID: PMC11515950 DOI: 10.1016/j.envint.2022.107243] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 05/27/2021] [Revised: 03/25/2022] [Accepted: 04/10/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND Systematic evidence maps are increasingly used to develop chemical risk assessments. These maps can provide an overview of available studies and relevant study information to be used for various research objectives and applications. Environmental epidemiological studies that examine the impact of chemical exposures on various 'omic profiles in human populations provide relevant mechanistic information and can be used for benchmark dose modeling to derive potential human health reference values. OBJECTIVES To create a systematic evidence map of environmental epidemiological studies examining environmental contaminant exposures with 'omics in order to characterize the extent of available studies for future research needs. METHODS Systematic review methods were used to search and screen the literature and included the use of machine learning methods to facilitate screening studies. The Populations, Exposures, Comparators and Outcomes (PECO) criteria were developed to identify and screen relevant studies. Studies that met the PECO criteria after full-text review were summarized with information such as study population, study design, sample size, exposure measurement, and 'omics analysis. RESULTS Over 10,000 studies were identified from scientific databases. Screening processes were used to identify 84 studies considered PECO-relevant after full-text review. Various contaminants (e.g., phthalate, benzene, arsenic) were investigated in epidemiological studies that used one or more of the four 'omics of interest: epigenomics, transcriptomics, proteomics, and metabolomics. The epidemiological study designs used to explore single or integrated 'omic research questions with contaminant exposures were cohort, controlled trial, cross-sectional, and case-control studies. An interactive web-based systematic evidence map was created to display more study-related information.
CONCLUSIONS This systematic evidence map is a novel tool to visually characterize the available environmental epidemiological studies investigating contaminants and biological effects using 'omics technology and serves as a resource for investigators and allows for a range of applications in chemical research and risk assessment needs.
Affiliation(s)
- Stephanie Kim
- Superfund and Emergency Management Division, Region 2, U.S. Environmental Protection Agency, NY, USA.
| | - Hillary Hollinger
- Office of Pollution Prevention and Toxics, U.S. Environmental Protection Agency, NC, USA.
| | - Elizabeth G Radke
- Center for Public Health and Environmental Assessment, U.S. Environmental Protection Agency, D.C, USA.
| |
|
7
|
Aziz RM. Cuckoo Search-Based Optimization for Cancer Classification: A New Hybrid Approach. J Comput Biol 2022; 29:565-584. [DOI: 10.1089/cmb.2021.0410] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 01/26/2023] Open
|
8
|
Bhadra S, Chen S, Liu C. Analysis of Differentially Expressed Genes That Aggravate Metabolic Diseases in Depression. Life (Basel) 2021; 11:life11111203. [PMID: 34833079 PMCID: PMC8620538 DOI: 10.3390/life11111203] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/19/2021] [Revised: 10/22/2021] [Accepted: 10/25/2021] [Indexed: 12/14/2022] Open
Abstract
Depression is considered the second leading cause of the global health burden after cancer. It is recognized as the most common physiological disorder. It affects about 350 million people worldwide to a serious degree. The onset of depression, inadequate food intake, abnormal glycemic control and cognitive impairment have strong associations with various metabolic disorders which are mediated through alterations in diet and physical activities. The key regulatory factors shared by metabolic diseases and depression are poorly understood. To understand the molecular mechanisms of gene dysregulation in depressive disorder, we employed an analytical, quantitative framework for depression and related metabolic diseases. In this study, we examined datasets from patients with depression, obesity, diabetes and NASH. After correcting for batch effects to minimize heterogeneity across the datasets, we found differentially expressed genes (DEGs) common to all the datasets. We identified significantly associated enrichment pathways, ontology pathways, protein–protein cluster networks and gene–disease associations among the genes co-expressed in depression and the metabolic disorders. Our study suggested potentially active signaling pathways and co-expressed gene sets which may play key roles in crosstalk between metabolic diseases and depression.
|
9
|
Rahman MH, Rana HK, Peng S, Kibria MG, Islam MZ, Mahmud SMH, Moni MA. Bioinformatics and system biology approaches to identify pathophysiological impact of COVID-19 to the progression and severity of neurological diseases. Comput Biol Med 2021; 138:104859. [PMID: 34601390 PMCID: PMC8483812 DOI: 10.1016/j.compbiomed.2021.104859] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 05/11/2021] [Revised: 08/21/2021] [Accepted: 09/06/2021] [Indexed: 02/06/2023]
Abstract
Coronavirus disease 2019 (COVID-19) continues to spread across the globe. Clinical and epidemiological analyses indicate a link between COVID-19 and neurological diseases (NDs) that drives the progression and severity of NDs. It is unclear why COVID-19 influences the progression of NDs in some patients, and why some patients with NDs who are diagnosed with COVID-19 become increasingly sick while others do not. In this research, we investigated how COVID-19 and NDs interact and the impact of COVID-19 on the severity of NDs by performing transcriptomic analyses of COVID-19 and ND samples with a pipeline of bioinformatics and network-based approaches. The transcriptomic study identified the contributing genes, which were then filtered with cell signaling pathway, gene ontology, protein-protein interaction, transcription factor, and microRNA analyses. Identifying hub proteins using protein-protein interactions leads to the identification of a therapeutic strategy. Additionally, the incorporation of a comorbidity interaction score enhances the identification beyond simply detecting novel biological mechanisms involved in the pathophysiology of COVID-19 and its ND comorbidities. By computing the semantic similarity between COVID-19 and each ND, we found the maximum gene-based semantic score between COVID-19 and Parkinson's disease and the minimum between COVID-19 and multiple sclerosis. Similarly, we found the maximum gene ontology-based semantic score between COVID-19 and Huntington disease and the minimum between COVID-19 and epilepsy. Finally, we validated our findings using gold-standard databases and literature searches to determine which genes and pathways had previously been associated with COVID-19 and NDs.
Affiliation(s)
- Md Habibur Rahman
- Dept. of Computer Science and Engineering, Islamic University, Kushtia 7003, Bangladesh
| | - Humayan Kabir Rana
- Dept. of Computer Science and Engineering, Green University of Bangladesh, Dhaka, Bangladesh
| | - Silong Peng
- Institute of Automation, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Beijing 100190, China
| | - Md Golam Kibria
- Dept. of Chemical and Petroleum Engineering, Schulich School of Engineering, University of Calgary, Canada
| | - Md Zahidul Islam
- Department of Electronics, Graduate School of Engineering, Nagoya University, Japan
| | - S M Hasan Mahmud
- Dept. of Computer Science, American International University Bangladesh, Dhaka, Bangladesh
| | - Mohammad Ali Moni
- School of Health and Rehabilitation Sciences, Faculty of Health and Behavioural Sciences, The University of Queensland, St Lucia, QLD 4072, Australia.
| |
|
10
|
Auwul MR, Zhang C, Rahman MR, Shahjaman M, Alyami SA, Moni MA. Network-based transcriptomic analysis identifies the genetic effect of COVID-19 to chronic kidney disease patients: A bioinformatics approach. Saudi J Biol Sci 2021; 28:5647-5656. [PMID: 34127904 PMCID: PMC8190333 DOI: 10.1016/j.sjbs.2021.06.015] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 01/05/2021] [Revised: 06/01/2021] [Accepted: 06/02/2021] [Indexed: 12/15/2022] Open
Abstract
COVID-19 has emerged as a global health threat. Chronic kidney disease (CKD) patients are immune-compromised and may have a high risk of infection by SARS-CoV-2. We aimed to detect common transcriptomic signatures and pathways between COVID-19 and CKD by systems biology analysis. We analyzed transcriptomic data obtained from peripheral blood mononuclear cells (PBMC) infected with SARS-CoV-2 and PBMC of CKD patients. We identified 49 differentially expressed genes (DEGs) which were common between COVID-19 and CKD. The gene ontology and pathway analyses showed the DEGs were associated with "platelet degranulation", "regulation of wound healing", "platelet activation", "focal adhesion", "regulation of actin cytoskeleton" and "PI3K-Akt signalling pathway". The protein-protein interaction (PPI) network encoded by the common DEGs showed ten hub proteins (EPHB2, PRKAR2B, CAV1, ARHGEF12, HSP90B1, ITGA2B, BCL2L1, E2F1, TUBB1, and C3). In addition, we identified significant transcription factors and microRNAs that may regulate the common DEGs. We performed protein-drug interaction analysis and identified potential drugs, namely aspirin, estradiol, rapamycin, and nebivolol. The identified common gene signature and pathways between COVID-19 and CKD may be therapeutic targets in COVID-19 patients with CKD comorbidity.
Affiliation(s)
- Md. Rabiul Auwul
- School of Economics and Statistics, Guangzhou University, Guangzhou 510006, China
| | - Chongqi Zhang
- School of Economics and Statistics, Guangzhou University, Guangzhou 510006, China
| | - Md Rezanur Rahman
- Department of Biotechnology and Genetic Engineering, Faculty of Biological Sciences, Islamic University, Kushtia 7003, Bangladesh
- Department of Biochemistry and Biotechnology, School of Biomedical Science, Khwaja Yunus Ali University, Enayetpur, Sirajganj 6751, Bangladesh
| | - Md. Shahjaman
- Department of Statistics, Begum Rokeya University, Rangpur 5400, Bangladesh
| | - Salem A. Alyami
- Department of Mathematics and Statistics, Imam Mohammad Ibn Saud Islamic University, Saudi Arabia
| | - Mohammad Ali Moni
- WHO Collaborating Centre on eHealth, UNSW Digital Health, School of Public Health and Community Medicine, Faculty of Medicine, UNSW Sydney, Australia
- The Garvan Institute of Medical Research, Healthy Ageing Theme, Darlinghurst, NSW 2010, Australia
| |
|
11
|
Hossain MJ, Chowdhury UN, Islam MB, Uddin S, Ahmed MB, Quinn JMW, Moni MA. Machine learning and network-based models to identify genetic risk factors to the progression and survival of colorectal cancer. Comput Biol Med 2021; 135:104539. [PMID: 34153790 DOI: 10.1016/j.compbiomed.2021.104539] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 02/17/2021] [Revised: 05/12/2021] [Accepted: 05/26/2021] [Indexed: 01/04/2023]
Abstract
Colorectal cancer (CRC) is one of the most common and lethal malignant lesions. Determining how the identified risk factors drive the formation and development of CRC could be an essential means for effective therapeutic development. To this end, we investigated how the altered gene expression resulting from exposure to putative CRC risk factors contributes to prognostic biomarker identification. Differentially expressed genes (DEGs) were first identified for CRC and eight other risk factors. Gene set enrichment analysis (GSEA) through molecular pathways and gene ontology (GO), as well as protein-protein interaction (PPI) network analysis, was then conducted to predict the functions of these DEGs. Our identified genes were explored through the dbGaP and OMIM databases to compare with the already identified and known prognostic CRC biomarkers. The survival time of CRC patients was also examined using a Cox Proportional Hazard regression-based prognostic model by integrating transcriptome data from The Cancer Genome Atlas (TCGA). In this study, PPI analysis identified 4 sub-networks and 8 hub genes that may be potential therapeutic targets, including CXCL8, ICAM1, SOD2, CXCL2, CCL20, OIP5, BUB1, ASPM and IL1RN. We also identified seven signature genes (PRR5.ARHGAP8, CA7, NEDD4L, GFR2, ARHGAP8, SMTN, OIP5) in independent analysis, among which PRR5.ARHGAP8 was found both in multivariate analyses and in analyses that combined gene expression and clinical information. This approach provides both mechanistic information and, when combined with predictive clinical information, good evidence that the identified genes are significant biomarkers of processes involved in CRC progression and survival.
Affiliation(s)
- Md Jakir Hossain
- Department of Electrical and Electronic Engineering, University of Rajshahi, Rajshahi 6205, Bangladesh
| | - Utpala Nanda Chowdhury
- Department of Computer Science and Engineering, University of Rajshahi, Rajshahi 6205, Bangladesh
| | - M Babul Islam
- Department of Electrical and Electronic Engineering, University of Rajshahi, Rajshahi 6205, Bangladesh
| | - Shahadat Uddin
- Complex Systems Research Group & Project Management Program, Faculty of Engineering, The University of Sydney, NSW, 2006, Australia
| | - Mohammad Boshir Ahmed
- School of Material Science and Engineering, Gwangju Institute of Science and Technology, Gwangju, 61005, Republic of Korea
| | - Julian M W Quinn
- Healthy Ageing Theme, Garvan Institute of Medical Research, Darlinghurst, NSW, 2010, Australia
| | - Mohammad Ali Moni
- Healthy Ageing Theme, Garvan Institute of Medical Research, Darlinghurst, NSW, 2010, Australia; WHO Collaborating Centre on eHealth, School of Public Health and Community Medicine, Faculty of Medicine, UNSW Sydney, NSW, 2052, Australia.
| |
|
12
|
Nain Z, Rana HK, Liò P, Islam SMS, Summers MA, Moni MA. Pathogenetic profiling of COVID-19 and SARS-like viruses. Brief Bioinform 2021; 22:1175-1196. [PMID: 32778874 PMCID: PMC7454314 DOI: 10.1093/bib/bbaa173] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 04/02/2020] [Revised: 06/23/2020] [Accepted: 07/08/2020] [Indexed: 12/15/2022] Open
Abstract
The novel coronavirus (2019-nCoV) has recently emerged, causing COVID-19 outbreaks and significant societal/global disruption. Importantly, COVID-19 infection resembles SARS-like complications. However, the lack of knowledge about the underlying genetic mechanisms of COVID-19 warrants the development of prospective control measures. In this study, we employed whole-genome alignment and digital DNA-DNA hybridization analyses to assess genomic linkage between 2019-nCoV and other coronaviruses. To understand the pathogenetic behavior of 2019-nCoV, we compared gene expression datasets of viral infections closest to 2019-nCoV with four COVID-19 clinical presentations followed by functional enrichment of shared dysregulated genes. Potential chemical antagonists were also identified using protein-chemical interaction analysis. Based on phylogram analysis, the 2019-nCoV was found genetically closest to SARS-CoVs. In addition, we identified 562 upregulated and 738 downregulated genes (adj. P ≤ 0.05) with SARS-CoV infection. Among the dysregulated genes, SARS-CoV shared ≤19 upregulated and ≤22 downregulated genes with each of different COVID-19 complications. Notably, upregulation of BCL6 and PFKFB3 genes was common to SARS-CoV, pneumonia and severe acute respiratory syndrome, while they shared CRIP2, NSG1 and TNFRSF21 genes in downregulation. Besides, 14 genes were common to different SARS-CoV comorbidities that might influence COVID-19 disease. We also observed similarities in pathways that can lead to COVID-19 and SARS-CoV diseases. Finally, protein-chemical interactions suggest cyclosporine, resveratrol and quercetin as promising drug candidates against COVID-19 as well as other SARS-like viral infections. The pathogenetic analyses, along with identified biomarkers, signaling pathways and chemical antagonists, could prove useful for novel drug development in the fight against the current global 2019-nCoV pandemic.
Affiliation(s)
- Zulkar Nain
- Department of Genetic Engineering and Biotechnology, East West University, Bangladesh
| | - Humayan Kabir Rana
- Department of Computer Science and Engineering, Green University of Bangladesh
| | - Pietro Liò
- Artificial Intelligence Group at the University of Cambridge
| | | | | | | |
|
13
|
Moni MA, Quinn JMW, Sinmaz N, Summers MA. Gene expression profiling of SARS-CoV-2 infections reveal distinct primary lung cell and systemic immune infection responses that identify pathways relevant in COVID-19 disease. Brief Bioinform 2021; 22:1324-1337. [PMID: 33333559 PMCID: PMC7799202 DOI: 10.1093/bib/bbaa376] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 06/22/2020] [Revised: 11/02/2020] [Accepted: 11/26/2020] [Indexed: 12/18/2022] Open
Abstract
To identify key gene expression pathways altered with infection of the novel coronavirus SARS-CoV-2, we performed the largest comparative genomic and transcriptomic analysis to date. We compared the novel pandemic coronavirus SARS-CoV-2 with SARS-CoV and MERS-CoV, as well as influenza A strains H1N1, H3N2 and H5N1. Phylogenetic analysis confirms that SARS-CoV-2 is closely related to SARS-CoV at the level of the viral genome. RNAseq analyses demonstrate that human lung epithelial cell responses to SARS-CoV-2 infection are distinct. Extensive Gene Expression Omnibus literature screening and drug predictive analyses show that SARS-CoV-2 infection response pathways are closely related to those of SARS-CoV and respiratory syncytial virus infections. We validated SARS-CoV-2 infection response genes as disease-associated using Kaplan-Meier survival estimates in lung disease patient data. We also analysed COVID-19 patient peripheral blood samples, which identified signalling pathway concordance between the primary lung cell and blood cell infection responses.
|
14
|
Rahman MH, Rana HK, Peng S, Hu X, Chen C, Quinn JMW, Moni MA. Bioinformatics and machine learning methodologies to identify the effects of central nervous system disorders on glioblastoma progression. Brief Bioinform 2021; 22:6066369. [PMID: 33406529 DOI: 10.1093/bib/bbaa365] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 08/26/2020] [Revised: 10/25/2020] [Accepted: 11/11/2020] [Indexed: 12/14/2022] Open
Abstract
Glioblastoma (GBM) is a common malignant brain tumor which often presents as a comorbidity with central nervous system (CNS) disorders. Both CNS disorders and GBM cells release glutamate and show an abnormality, but differ in cellular behavior. So, their etiology is not well understood, nor is it clear how CNS disorders influence GBM behavior or growth. This led us to employ a quantitative analytical framework to unravel shared differentially expressed genes (DEGs) and cell signaling pathways that could link CNS disorders and GBM using datasets acquired from the Gene Expression Omnibus database (GEO) and The Cancer Genome Atlas (TCGA) datasets where normal tissue and disease-affected tissue were examined. After identifying DEGs, we identified disease-gene association networks and signaling pathways and performed gene ontology (GO) analyses as well as hub protein identifications to predict the roles of these DEGs. We expanded our study to determine the significant genes that may play a role in GBM progression and the survival of the GBM patients by exploiting clinical and genetic factors using the Cox Proportional Hazard Model and the Kaplan-Meier estimator. In this study, 177 DEGs with 129 upregulated and 48 downregulated genes were identified. Our findings indicate new ways that CNS disorders may influence the incidence of GBM progression, growth or establishment and may also function as biomarkers for GBM prognosis and potential targets for therapies. Our comparison with gold standard databases also provides further proof to support the connection of our identified biomarkers in the pathology underlying the GBM progression.
Affiliation(s)
- Md Habibur Rahman
- Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100190, China; Department of Computer Science and Engineering, Islamic University, Kushtia 7003, Bangladesh
- Humayan Kabir Rana
- Department of Computer Science and Engineering, Green University of Bangladesh, Bangladesh
- Silong Peng
- Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100190, China
- Xiyuan Hu
- Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100190, China
- Chen Chen
- Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100190, China
- Julian M W Quinn
- Bone Biology Division, Garvan Institute of Medical Research, Darlinghurst, NSW, Australia; The Surgical Education and Research Training Institute, Royal North Shore Hospital, Sydney, Australia
- Mohammad Ali Moni
- Bone Biology Division, Garvan Institute of Medical Research, Darlinghurst, NSW, Australia; WHO Collaborating Centre on eHealth, School of Public Health and Community Medicine, Faculty of Medicine, The University of New South Wales, Sydney, Australia
|
15
|
Deng F, Shen L, Wang H, Zhang L. Classify multicategory outcome in patients with lung adenocarcinoma using clinical, transcriptomic and clinico-transcriptomic data: machine learning versus multinomial models. Am J Cancer Res 2020; 10:4624-4639. [PMID: 33415023 PMCID: PMC7783755] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2020] [Accepted: 11/25/2020] [Indexed: 06/12/2023] Open
Abstract
Classification of multicategory survival-outcome is important for precision oncology. Machine learning (ML) algorithms have been used to accurately classify multi-category survival-outcome of some cancer-types, but not yet that of lung adenocarcinoma. Therefore, we compared the performances of 3 ML models (random forests, support vector machine [SVM], multilayer perceptron) and multinomial logistic regression (Mlogit) models for classifying 4-category survival-outcome of lung adenocarcinoma using the TCGA. The Mlogit model overall performed similarly to the SVM and multilayer perceptron models (micro-average area under curve=0.82), while the random forests model was inferior. Surprisingly, transcriptomic data alone and clinico-transcriptomic data appeared sufficient to accurately classify the 4-category survival-outcome in these patients, but no models using clinical data alone performed well. Notably, NDUFS5, P2RY2, PRPF18, CCL24, ZNF813, MYL6, FLJ41941, POU5F1B, and SUV420H1 were the top-ranked genes that were associated with alive without disease and inversely linked to other outcomes. Similarly, BDKRB2, TERC, DNAJA3, MRPL15, SLC16A13, CRHBP and ACSBG2 were associated with alive with progression, and GAL3ST3, AD2, RAB41, HDC, and PLEKHG1 were associated with dead with disease, while also inversely linked to other outcomes. These cross-linked genes may be used for risk-stratification and future treatment development.
Affiliation(s)
- Fei Deng
- School of Electrical and Electronic Engineering, Shanghai Institute of Technology, Shanghai, China
- Lanlan Shen
- Department of Pediatrics, Baylor College of Medicine, USDA/ARS Children’s Nutrition Research Center, Houston, TX, USA
- He Wang
- Department of Pathology, Yale University School of Medicine, New Haven, CT, USA
- Lanjing Zhang
- Department of Pathology, Princeton Medical Center, Plainsboro, NJ, USA
- Department of Biological Sciences, Rutgers University, Newark, NJ
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ, USA
- Department of Chemical Biology, Ernest Mario School of Pharmacy, Rutgers University, Piscataway, NJ, USA
|
16
|
A system biological approach to investigate the genetic profiling and comorbidities of type 2 diabetes. GENE REPORTS 2020. [DOI: 10.1016/j.genrep.2020.100830] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 01/08/2023]
|
17
|
Al-Mustanjid M, Mahmud SMH, Royel MRI, Rahman MH, Islam T, Rahman MR, Moni MA. Detection of molecular signatures and pathways shared in inflammatory bowel disease and colorectal cancer: A bioinformatics and systems biology approach. Genomics 2020; 112:3416-3426. [PMID: 32535071 DOI: 10.1016/j.ygeno.2020.06.001] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 01/21/2020] [Revised: 05/03/2020] [Accepted: 06/02/2020] [Indexed: 02/07/2023]
Abstract
Emerging evidence indicates that inflammatory bowel disease (IBD) is a risk factor for the increasing incidence of colorectal cancer (CRC) development. We used a system biology approach to identify common molecular signatures and pathways that interact between IBD and CRC and the indispensable pathological mechanisms. First, we identified 177 common differentially expressed genes (DEGs) between IBD and CRC. Gene set enrichment, protein-protein, DEGs-transcription factors, DEGs-microRNAs, protein-drug interaction, gene-disease association, Gene Ontology, and pathway enrichment analyses were conducted on these common genes. The inclusion of common DEGs with biomolecular networks disclosed hub proteins (LYN, PLCB1, NPSR1, WNT5A, CDC25B, CD44, RIPK2, ASAP1), transcription factors (SCD, SLC7A5, IKZF3, SLC16A1, SLC7A11) and miRNAs (mir-335-5p, mir-26b-5p, mir-124-3p, mir-16-5p, mir-192-5p, mir-548c-3p, mir-29b-3p, mir-155-5p, mir-21-5p, mir-15a-5p). Analysis of the interaction between protein and drug discovered that ASAP1 interacts with cysteine sulfonic acid and double oxidized cysteine drug compounds. Gene-disease association analysis found that ASAP1 is also associated with pulmonary and bladder neoplasm diseases.
Affiliation(s)
- Md Al-Mustanjid
- Department of Software Engineering, Faculty of Science and Information Technology, Daffodil International University, Dhaka 1207, Bangladesh
- S M Hasan Mahmud
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Md Rejaul Islam Royel
- Department of Software Engineering, Faculty of Science and Information Technology, Daffodil International University, Dhaka 1207, Bangladesh
- Md Habibur Rahman
- Department of Information and Communication Technology, Mawlana Bhashani Science and Technology University, Tangail, Bangladesh
- Tania Islam
- Department of Biotechnology and Genetic Engineering, Faculty of Biological Sciences, Islamic University, Kushtia 7003, Bangladesh
- Md Rezanur Rahman
- Department of Biochemistry and Biotechnology, School of Biomedical Science, Khwaja Yunus Ali University, Enayetpur, Sirajganj 6751, Bangladesh
- Mohammad Ali Moni
- WHO Collaborating Centre on eHealth, UNSW Digital Health, School of Public Health and Community Medicine, Faculty of Medicine, UNSW Sydney, Australia
|