1
Park C, Kim B, Park T. DeepHisCoM: deep learning pathway analysis using hierarchical structural component models. Brief Bioinform 2022; 23:6590446. [DOI: 10.1093/bib/bbac171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2021] [Revised: 04/04/2022] [Accepted: 04/18/2022] [Indexed: 11/13/2022]
Abstract
Many statistical methods for pathway analysis have been used to identify disease-associated pathways along with biological factors such as genes and proteins. However, most pathway analysis methods neglect the complex nonlinear relationship between biological factors and pathways. In this study, we propose Deep-learning pathway analysis using Hierarchical structured CoMponent models (DeepHisCoM), which uses deep learning to capture the complex nonlinear contribution of biological factors to pathways by constructing a multilayered model that accounts for hierarchical biological structure. In simulation studies, DeepHisCoM showed higher power than conventional pathway methods for nonlinear pathway effects and comparable power for linear pathway effects. Application to hepatocellular carcinoma (HCC) omics datasets, including metabolomic, transcriptomic and metagenomic datasets, demonstrated that DeepHisCoM successfully identified three well-known pathways that are highly associated with HCC: lysine degradation; valine, leucine and isoleucine biosynthesis; and phenylalanine, tyrosine and tryptophan. Application to the coronavirus disease-2019 (COVID-19) single-nucleotide polymorphism (SNP) dataset also showed that DeepHisCoM identified four pathways that are highly associated with the severity of COVID-19: the mitogen-activated protein kinase (MAPK) signaling pathway, the gonadotropin-releasing hormone (GnRH) signaling pathway, hypertrophic cardiomyopathy and dilated cardiomyopathy. Code is available at https://github.com/chanwoo-park-official/DeepHisCoM.
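The hierarchical structure described in this abstract can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation (see the repository linked above for that). It assumes that each pathway has its own small tanh network turning that pathway's biological factors into one latent score, and that the scores are combined linearly under a logistic link for a binary phenotype. The function names, the logistic link, the layer sizes and all weights are hypothetical.

```python
import math

def tanh_layer(inputs, weights, biases):
    """Dense layer with tanh activation: out[j] = tanh(w[j] . x + b[j])."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def pathway_score(factors, hidden_w, hidden_b, out_w):
    """Map one pathway's biological factors to a single latent pathway score."""
    hidden = tanh_layer(factors, hidden_w, hidden_b)
    return sum(w * h for w, h in zip(out_w, hidden))

def predict(sample, pathways, pathway_effects, intercept=0.0):
    """Combine pathway scores linearly, then apply a logistic link (assumed)."""
    scores = [
        pathway_score([sample[name] for name in p["factors"]],
                      p["hidden_w"], p["hidden_b"], p["out_w"])
        for p in pathways
    ]
    linear = intercept + sum(b * s for b, s in zip(pathway_effects, scores))
    return 1.0 / (1.0 + math.exp(-linear)), scores

# Toy, made-up inputs: two pathways built from three biological factors.
pathways = [
    {"factors": ["m1", "m2"], "hidden_w": [[0.5, -0.3], [0.1, 0.8]],
     "hidden_b": [0.0, 0.1], "out_w": [1.0, -0.5]},
    {"factors": ["m3"], "hidden_w": [[0.7], [-0.2]],
     "hidden_b": [0.0, 0.0], "out_w": [0.6, 0.4]},
]
sample = {"m1": 1.2, "m2": -0.4, "m3": 0.9}
prob, scores = predict(sample, pathways, pathway_effects=[0.8, -0.3])
```

In a sketch like this, the pathway-level coefficients (`pathway_effects`) are the quantities one would test for association, while the per-pathway networks absorb the nonlinear factor-to-pathway relationship.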
Affiliation(s)
- Chanwoo Park
  - Department of Statistics, Seoul National University, Seoul 08826, Korea
- Boram Kim
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
- Taesung Park
  - Department of Statistics, Seoul National University, Seoul 08826, Korea
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
2
Kim B, Cho EJ, Yoon JH, Kim SS, Cheong JY, Cho SW, Park T. Pathway-Based Integrative Analysis of Metabolome and Microbiome Data from Hepatocellular Carcinoma and Liver Cirrhosis Patients. Cancers (Basel) 2020; 12:E2705. [PMID: 32967314 PMCID: PMC7563418 DOI: 10.3390/cancers12092705] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/29/2020] [Revised: 09/14/2020] [Accepted: 09/16/2020] [Indexed: 12/12/2022]
Abstract
Aberrations of the human microbiome are associated with diverse liver diseases, including hepatocellular carcinoma (HCC). Even if we can associate specific microbes with particular diseases, it is difficult to know mechanistically how the microbe contributes to the pathophysiology. Here, we sought to reveal the functional potential of the HCC-associated microbiome using the human metabolome, which is known to play a role in connecting host phenotype to microbiome function. To utilize both microbiome and metabolomic data sets, we propose an innovative pathway-based analysis, Hierarchical structural Component Model for pathway analysis of Microbiome and Metabolome (HisCoM-MnM), for integrating microbiome and metabolomic data. In particular, we used pathway information to integrate these two omics data sets, thus providing insight into biological interactions between different biological layers with regard to the host's phenotype. The application of HisCoM-MnM to data sets from 103 patients with HCC and 97 patients with liver cirrhosis (LC) showed that this approach could identify HCC-associated pathways involved in cancer metabolic reprogramming, as well as the significant metabolome and metagenome components that make up those pathways.
Affiliation(s)
- Boram Kim
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
- Eun Ju Cho
  - Department of Internal Medicine and Liver Research Institute, Seoul National University College of Medicine, Seoul 03080, Korea
- Jung-Hwan Yoon
  - Department of Internal Medicine and Liver Research Institute, Seoul National University College of Medicine, Seoul 03080, Korea
- Soon Sun Kim
  - Department of Gastroenterology, Ajou University School of Medicine, Suwon 16499, Korea
- Jae Youn Cheong
  - Department of Gastroenterology, Ajou University School of Medicine, Suwon 16499, Korea
- Sung Won Cho
  - Department of Gastroenterology, Ajou University School of Medicine, Suwon 16499, Korea
- Taesung Park
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
  - Department of Statistics, Seoul National University, Seoul 08826, Korea
3
Choi S, Lee S, Huh I, Hwang H, Park T. HisCoM-G×E: Hierarchical Structural Component Analysis of Gene-Based Gene-Environment Interactions. Int J Mol Sci 2020; 21:E6724. [PMID: 32937825 PMCID: PMC7555026 DOI: 10.3390/ijms21186724] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/27/2020] [Revised: 08/31/2020] [Accepted: 09/04/2020] [Indexed: 11/30/2022]
Abstract
Gene-environment interaction (G×E) studies are one of the most important solutions for understanding the "missing heritability" problem in genome-wide association studies (GWAS). Although many statistical methods have been proposed for detecting and identifying G×E, most employ single nucleotide polymorphism (SNP)-level analysis. In this study, we propose a new statistical method, Hierarchical structural CoMponent analysis of gene-based Gene-Environment interactions (HisCoM-G×E). HisCoM-G×E is based on the hierarchical structural relationship among all SNPs within a gene, and can accommodate all possible SNP-level effects into a single latent variable, by imposing a ridge penalty, and thus more efficiently takes into account the latent interaction term of G×E. The performance of the proposed method was evaluated in simulation studies, and we applied the proposed method to investigate gene-alcohol intake interactions affecting systolic blood pressure (SBP), using samples from the Korea Associated REsource (KARE) consortium data.
Affiliation(s)
- Sungkyoung Choi
  - Department of Applied Mathematics, Hanyang University (ERICA), Ansan 15588, Korea
- Sungyoung Lee
  - Center for Precision Medicine, Seoul National University Hospital, Seoul 03080, Korea
- Iksoo Huh
  - Department of Nursing, College of Nursing and Research Institute of Nursing Science, Seoul National University, Seoul 03080, Korea
- Heungsun Hwang
  - Department of Psychology, McGill University, Montreal, QC H3A 1G1, Canada
- Taesung Park
  - Department of Statistics, Seoul National University, Seoul 08826, Korea
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
4
Jiang N, Lee S, Park T. HisCoM-PCA: software for hierarchical structural component analysis for pathway analysis based using principal component analysis. Genomics Inform 2020; 18:e11. [PMID: 32224844 PMCID: PMC7120349 DOI: 10.5808/gi.2020.18.1.e11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2020] [Accepted: 03/11/2020] [Indexed: 11/20/2022]
Affiliation(s)
- Nan Jiang
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
- Sungyoung Lee
  - Center for Precision Medicine, Seoul National University Hospital, Seoul 08826, Korea
- Taesung Park
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
  - Department of Statistics, Seoul National University, Seoul 08826, Korea
  - Corresponding author: E-mail:
5
Mok L, Park T. HisCoM-PAGE: software for hierarchical structural component models for pathway analysis of gene expression data. Genomics Inform 2019; 17:e45. [PMID: 31896245 PMCID: PMC6944051 DOI: 10.5808/gi.2019.17.4.e45] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2019] [Accepted: 11/22/2019] [Indexed: 12/04/2022]
Abstract
To identify pathways associated with survival phenotypes using gene expression data, we recently proposed the hierarchical structural component model for pathway analysis of gene expression data (HisCoM-PAGE) method. The HisCoM-PAGE software can consider hierarchical structural relationships between genes and pathways and analyze multiple pathways simultaneously. It can be applied to various types of gene expression data, such as microarray data or RNA sequencing data. We expect that the HisCoM-PAGE software will make our method more easily accessible to researchers who want to perform pathway analysis for survival times.
Affiliation(s)
- Lydia Mok
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
- Taesung Park
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
  - Department of Statistics, Seoul National University, Seoul 08826, Korea
  - Corresponding author: E-mail:
6
Mok L, Kim Y, Lee S, Choi S, Lee S, Jang JY, Park T. HisCoM-PAGE: Hierarchical Structural Component Models for Pathway Analysis of Gene Expression Data. Genes (Basel) 2019; 10:E931. [PMID: 31739607 PMCID: PMC6896173 DOI: 10.3390/genes10110931] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 09/23/2019] [Revised: 11/06/2019] [Accepted: 11/07/2019] [Indexed: 01/10/2023]
Abstract
Although there have been several analyses for identifying cancer-associated pathways based on gene expression data, most of these are single-pathway analyses and thus do not consider correlations between pathways. In this paper, we propose a hierarchical structural component model for pathway analysis of gene expression data (HisCoM-PAGE), which accounts for the hierarchical structure of genes and pathways, as well as the correlations among pathways. Specifically, HisCoM-PAGE focuses on the survival phenotype and identifies its associated pathways. Moreover, its application to real pancreatic cancer data demonstrated that HisCoM-PAGE could successfully identify pathways associated with pancreatic cancer prognosis. Simulation studies comparing the performance of HisCoM-PAGE with competing methods such as Gene Set Enrichment Analysis (GSEA), Global Test, and Wald-type Test showed HisCoM-PAGE to have the highest power to detect causal pathways in most simulation scenarios.
Affiliation(s)
- Lydia Mok
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
- Yongkang Kim
  - Department of Statistics, Seoul National University, Seoul 08826, Korea
- Sungyoung Lee
  - Center for Precision Medicine, Seoul National University Hospital, Seoul 03080, Korea
- Sungkyoung Choi
  - Department of Applied Mathematics, Hanyang University (ERICA), Ansan 15588, Korea
- Seungyeoun Lee
  - Department of Mathematics and Statistics, Sejong University, Seoul 05006, Korea
- Jin-Young Jang
  - Department of Surgery, Seoul National University College of Medicine, Seoul 03080, Korea
- Taesung Park
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
  - Department of Statistics, Seoul National University, Seoul 08826, Korea
7
Zhang Z, Wang S, Yang F, Meng Z, Liu Y. LncRNA ROR1‑AS1 high expression and its prognostic significance in liver cancer. Oncol Rep 2019; 43:55-74. [PMID: 31746401 PMCID: PMC6908930 DOI: 10.3892/or.2019.7398] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 07/16/2019] [Accepted: 09/27/2019] [Indexed: 12/11/2022]
Abstract
Hepatocellular carcinoma (HCC) is a common disease of the digestive system with no curative treatments. Long noncoding RNA tyrosine protein kinase transmembrane receptor 1 antisense RNA 1 (lncRNA ROR1-AS1) is an lncRNA whose functions have been predicted in human diseases; however, its important role in cancer has been probed only in mantle cell lymphoma, not in HCC. Therefore, the present study aimed to elucidate the prognostic significance of lncRNA ROR1-AS1 in HCC. The Cancer Genome Atlas Liver Hepatocellular Carcinoma was used to analyze the expression of ROR1-AS1 in liver cancer. χ² tests were performed to evaluate associations between clinical characteristics and ROR1-AS1 expression. The role of ROR1-AS1 in HCC prognosis was assessed using Kaplan-Meier curves and proportional hazards model (Cox) analysis. Gene set enrichment analysis was performed by using a Gene Expression Omnibus dataset. At the same time, Multi Experiment Matrix was used to predict genes that may be co-expressed with ROR1-AS1. The Database for Annotation, Visualization and Integrated Discovery and KO-Based Annotation System were used to analyze the most closely associated cytological behaviors and pathways in HCC. Then, the genes in the three databases were integrated to screen mRNAs, microRNAs and lncRNAs that had co-expression relationships with ROR1-AS1. Cytoscape, Search Tool for the Retrieval of Interacting Genes/Proteins and Molecular Evolutionary Genetics Analysis were used to map potential regulatory networks and developmental relationships associated with ROR1-AS1. Finally, 12 genes most closely associated with ROR1-AS1 were identified, and their relationship was described using a Circos plot. The results showed that ROR1-AS1 was upregulated in HCC, and its expression was related to clinical stage, T stage and N stage. Furthermore, Kaplan-Meier curves and Cox analysis indicated that high expression of ROR1-AS1 was associated with poor prognosis, and that ROR1-AS1 was an independent risk factor for HCC. Co-expression data suggested that there may be a large regulatory network of 45 genes with indirect associations with ROR1-AS1, a small regulatory network of 15 genes with direct or indirect regulatory relationships, and a special regulatory network containing 12 genes directly associated with ROR1-AS1. The present findings indicated that high expression of ROR1-AS1 suggests poor prognosis in patients with HCC.
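For readers unfamiliar with the Kaplan-Meier curves mentioned in this abstract, the product-limit estimate can be sketched in a few lines. This is a generic illustration, not the study's analysis pipeline. The function name `kaplan_meier` and the toy follow-up data are made up for this example. At each distinct event time the running survival probability is multiplied by (1 - deaths / number at risk), and censored subjects simply leave the risk set.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times:  follow-up time for each subject
    events: 1 if the event (e.g. death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every subject whose follow-up ends exactly at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after t.
        at_risk -= removed
    return curve

# Made-up follow-up data: five subjects, two of them censored.
curve = kaplan_meier(times=[1, 2, 2, 3, 4], events=[1, 1, 0, 1, 0])
```

Comparing two such curves (for example, high versus low expression groups) is usually done with a log-rank test or a Cox model, which this sketch does not cover.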
Affiliation(s)
- Ze Zhang
  - Department of Hepatobiliary‑Pancreatic Surgery, China‑Japan Union Hospital of Jilin University, Changchun, Jilin 130000, P.R. China
- Shouqian Wang
  - Department of General Surgery, The First Hospital of Jilin University, Changchun, Jilin 130021, P.R. China
- Fan Yang
  - Department of General Surgery, The First Hospital of Jilin University, Changchun, Jilin 130021, P.R. China
- Zihui Meng
  - Department of Hepatobiliary‑Pancreatic Surgery, China‑Japan Union Hospital of Jilin University, Changchun, Jilin 130000, P.R. China
- Yahui Liu
  - Department of General Surgery, The First Hospital of Jilin University, Changchun, Jilin 130021, P.R. China
8
Zhang Z, Wang S, Liu Y, Meng Z, Chen F. Low lncRNA ZNF385D‑AS2 expression and its prognostic significance in liver cancer. Oncol Rep 2019; 42:1110-1124. [PMID: 31322274 PMCID: PMC6667919 DOI: 10.3892/or.2019.7238] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 04/03/2019] [Accepted: 07/10/2019] [Indexed: 12/18/2022]
Abstract
Hepatocellular carcinoma (HCC) is a common digestive system disease with no curative treatment. Zinc finger protein 385D antisense RNA 2 (ZNF385D-AS2) is a long non-coding RNA (lncRNA) that has been predicted to function in human diseases, including several types of cancer. Yet, it has not been investigated in relation to liver cancer. Thus, the present study was designed to elucidate the prognostic significance of lncRNA ZNF385D-AS2 in HCC. The Cancer Genome Atlas-Liver Hepatocellular Carcinoma (TCGA-LIHC) collection of data was utilized to analyze the expression of lncRNA ZNF385D-AS2 in liver cancer. Then chi-square tests were used to evaluate the correlation between clinical characteristics and lncRNA ZNF385D-AS2 expression. The significance of lncRNA ZNF385D-AS2 in patient prognosis was evaluated using Kaplan-Meier curves and Cox analysis. Concomitantly, Gene Set Enrichment Analysis (GSEA) was performed to analyze the most closely related cytological behavior. Finally, we used the Database for Annotation, Visualization and Integrated Discovery (DAVID) and KOBAS software and data from the Gene Expression Omnibus (GEO) database to analyze the possible competing endogenous RNA (ceRNA) network pattern as well as the co-expression network in liver cancer. Analysis of RNA-Seq gene expression data for 303 patients with primary tumors revealed low expression of ZNF385D-AS2 in liver cancer. Low expression of ZNF385D-AS2 was found to be significantly associated with sex (P=0.050), T stage (P=0.049), M stage (P=0.040), N stage (P<0.001) and clinical stage (P=0.037). Patients with ZNF385D-AS2 low-expression liver cancers had a shorter median overall survival compared with the patients with ZNF385D-AS2 high-expression liver cancers (P=0.0079). Cox analysis identified ZNF385D-AS2 low-expression as an independent prognostic variable (AUC=0.594) for overall survival in liver cancer patients. Co-expression and ceRNA predictive analysis data suggested that there may be a regulatory signaling axis between ZNF385D-AS2 and miR-96 and miR-182. In conclusion, our results suggest that low expression of ZNF385D-AS2 is predictive of a poor prognosis of liver cancer patients.
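The chi-square tests of association mentioned in this abstract compare observed counts in a contingency table with the counts expected under independence. The following is a minimal sketch for a 2x2 table, with no continuity correction. The function name `chi_square_2x2` and the counts are hypothetical and are not taken from the study. For one degree of freedom the p-value can be written with the complementary error function, so only the standard library is needed.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table.

    table: [[a, b], [c, d]] of observed counts.
    Returns (statistic, p_value) with 1 degree of freedom, no Yates correction.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Made-up counts: rows = low/high expression, columns = early/late stage.
stat, p_value = chi_square_2x2([[10, 20], [20, 10]])
```

With small expected counts, an exact test such as Fisher's is generally preferred over this approximation.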
Affiliation(s)
- Ze Zhang
  - Department of General Surgery, The First Hospital of Jilin University, Changchun, Jilin 130021, P.R. China
- Shouqian Wang
  - Department of General Surgery, The First Hospital of Jilin University, Changchun, Jilin 130021, P.R. China
- Yahui Liu
  - Department of General Surgery, The First Hospital of Jilin University, Changchun, Jilin 130021, P.R. China
- Zihui Meng
  - Department of Hepatobiliary‑Pancreatic Surgery, China‑Japan Union Hospital of Jilin University, Changchun, Jilin 130033, P.R. China
- Fangfang Chen
  - Department of Gastrointestinal Colorectal and Anal Surgery, China‑Japan Union Hospital of Jilin University, Changchun, Jilin 130033, P.R. China
9
Li J, Nakai K, Zheng Y, Sato K, Wong L. Introduction to Selected Papers from GIW2018. J Bioinform Comput Biol 2019; 16:1802005. [PMID: 30616475 DOI: 10.1142/s0219720018020055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Affiliation(s)
- Jinyan Li
  - University of Technology Sydney, Australia
- Yun Zheng
  - Kunming University of Science and Technology, China
10
Choi S, Lee S, Park T. HisCoM-GGI: Software for Hierarchical Structural Component Analysis of Gene-Gene Interactions. Genomics Inform 2018; 16:e38. [PMID: 30602099 PMCID: PMC6440671 DOI: 10.5808/gi.2018.16.4.e38] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 12/11/2018] [Accepted: 12/18/2018] [Indexed: 11/20/2022]
Abstract
Gene-gene interaction (GGI) analysis is known to play an important role in explaining the missing heritability issue. Many previous studies have already proposed software to analyze GGI, but most methods focus on a binary phenotype in a case-control design. In this study, we developed 'Hierarchical structural CoMponent analysis of Gene-Gene Interactions' (HisCoM-GGI) software for gene-gene interaction analysis with a continuous phenotype. The HisCoM-GGI method considers hierarchical structural relationships between genes and SNPs, which enables both gene-level and SNP-level interaction analysis in a single model. Furthermore, this software accepts various types of genomic data, and supports data management and multithreading to improve the efficiency of GWAS data analysis. We expect that the HisCoM-GGI software will make genetic interaction studies more accessible to researchers and provide a more effective way to understand the biological mechanisms of complex diseases.
Affiliation(s)
- Sungkyoung Choi
  - Department of Pharmacology, Yonsei University College of Medicine, Seoul 03722, Korea
- Sungyoung Lee
  - Center for Precision Medicine, Seoul National University Hospital, Seoul 03080, Korea
- Taesung Park
  - Department of Statistics, Seoul National University, Seoul 08826, Korea
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea