1
Liu Y, Zhang H, Xu Y, Liu YZ, Al-Adra DP, Yeh MM, Zhang Z. Five Critical Gene-Based Biomarkers With Optimal Performance for Hepatocellular Carcinoma. Cancer Inform 2023; 22:11769351231190477. [PMID: 37577174 PMCID: PMC10413891 DOI: 10.1177/11769351231190477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Accepted: 07/11/2023] [Indexed: 08/15/2023] Open
Abstract
Hepatocellular carcinoma (HCC) is one of the most fatal cancers in the world. There is an urgent need to understand the molecular background of HCC to facilitate the identification of biomarkers and discover effective therapeutic targets. Published transcriptomic studies have reported a large number of genes that are individually significant for HCC. However, reliable biomarkers remain to be determined. In this study, built on max-linear competing risk factor models, we developed a machine learning analytical framework to analyze transcriptomic data and identify the smallest set of differentially expressed genes (DEGs). By analyzing 9 public whole-transcriptome datasets (containing 1184 HCC samples and 672 nontumor controls), we identified 5 critical DEGs (ie, CCDC107, CXCL12, GIGYF1, GMNN, and IFFO1) between HCC and control samples. The classifiers built on these 5 DEGs reached nearly perfect performance in identifying HCC. The performance of the 5 DEGs was further validated in a US Caucasian cohort that we collected (containing 17 HCC samples with paired nontumor tissue). The conceptual advance of our work lies in modeling gene-gene interactions and correcting for batch effects in the analytic framework. The classifiers built on the 5 DEGs demonstrated clear signature patterns for HCC. The results are interpretable, robust, and reproducible across diverse cohorts/populations with various disease etiologies, indicating that the 5 DEGs are intrinsic variables that can describe the overall features of HCC at the genomic level. The analytical framework applied in this study may pave a new way for improving transcriptome profiling analysis of human cancers.
Affiliation(s)
- Yongjun Liu
- Department of Laboratory Medicine and Pathology, University of Washington Medical Center, Seattle, WA, USA
- Heping Zhang
- Yale School of Public Health, Yale University, New Haven, CT, USA
- Yuqing Xu
- Department of Statistics, University of Wisconsin-Madison, Madison, WI, USA
- Yao-Zhong Liu
- Department of Biostatistics, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, USA
- David P Al-Adra
- Department of Surgery, University of Wisconsin School of Medicine and Public Health, Madison, WI, USA
- Matthew M Yeh
- Department of Laboratory Medicine and Pathology, University of Washington Medical Center, Seattle, WA, USA
- Department of Medicine, University of Washington Medical Center, Seattle, WA, USA
- Zhengjun Zhang
- Department of Statistics, University of Wisconsin-Madison, Madison, WI, USA
- Biostatistics and Medical Informatics, University of Wisconsin-Madison School of Medicine and Public Health, Madison, WI, USA
2
Chiang CC, Yeh H, Lim SN, Lin WR. Transcriptome analysis creates a new era of precision medicine for managing recurrent hepatocellular carcinoma. World J Gastroenterol 2023; 29:780-799. [PMID: 36816628 PMCID: PMC9932421 DOI: 10.3748/wjg.v29.i5.780] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/17/2022] [Revised: 11/23/2022] [Accepted: 01/10/2023] [Indexed: 02/06/2023] Open
Abstract
The high incidence of hepatocellular carcinoma (HCC) recurrence negatively impacts outcomes of patients treated with curative intent despite advances in surgical techniques and other locoregional liver-targeting therapies. Over the past few decades, the emergence of transcriptome analysis tools, including real-time quantitative reverse transcription PCR, microarrays, and RNA sequencing, has not only largely contributed to our knowledge about the pathogenesis of recurrent HCC but also led to the development of outcome prediction models based on differentially expressed gene signatures. In recent years, the single-cell RNA sequencing technique has revolutionized our ability to study the complicated crosstalk between cancer cells and the immune environment, which may benefit further investigations on the role of different immune cells in HCC recurrence and the identification of potential therapeutic targets. In the present article, we summarized the major findings yielded with these transcriptome methods within the framework of a causal model consisting of three domains: primary cancer cells; carcinogenic stimuli; and tumor microenvironment. We provided a comprehensive review of the insights that transcriptome analyses have provided into diagnostics, surveillance, and treatment of HCC recurrence.
Affiliation(s)
- Chun-Cheng Chiang
- UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA 15232, United States
- Hsuan Yeh
- School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, United States
- Siew-Na Lim
- Department of Neurology, Linkou Chang Gung Memorial Hospital, Taoyuan 333, Taiwan
- College of Medicine, Chang Gung University, Taoyuan 333, Taiwan
- Wey-Ran Lin
- College of Medicine, Chang Gung University, Taoyuan 333, Taiwan
- Department of Gastroenterology and Hepatology, Linkou Chang Gung Memorial Hospital, Taoyuan 333, Taiwan
3
Sucularli C. Identification of BRIP1, NSMCE2, ANAPC7, RAD18 and TTL from chromosome segregation gene set associated with hepatocellular carcinoma. Cancer Genet 2022; 268-269:28-36. [PMID: 36126360 DOI: 10.1016/j.cancergen.2022.09.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2022] [Revised: 07/12/2022] [Accepted: 09/06/2022] [Indexed: 01/25/2023]
Abstract
INTRODUCTION: Hepatocellular carcinoma is one of the most frequent cancers, with a high mortality rate worldwide. METHODS: TCGA LIHC HTSeq counts were analyzed. GSEA was performed with GO BP gene sets. GO analysis was performed with differentially expressed genes. The subset of genes contributing most to the GSEA enrichment result for GO_BP_CHROMOSOME_SEGREGATION was identified. Five genes from this subset were selected for further analysis. A microarray data set, GSE112790, was analyzed as a validation data set. Survival analysis was performed. RESULTS: According to GSEA and GO analysis, several gene sets and processes related to chromosome segregation were enriched in LIHC. The GO_BP_CHROMOSOME_SEGREGATION gene set from GSEA had the largest number of genes contributing most to the enrichment. Five genes in this gene set (BRIP1, NSMCE2, ANAPC7, RAD18 and TTL), whose expression and prognostic value have not been studied in detail in hepatocellular carcinoma, were selected for further analyses. Expression of these five genes was significantly upregulated in both the LIHC RNA-seq data and the HCC microarray data set. Survival analysis showed that high expression of the five genes was associated with poor overall survival in HCC patients. CONCLUSION: The selected genes were upregulated and had prognostic value in HCC.
Affiliation(s)
- Ceren Sucularli
- Department of Bioinformatics, Institute of Health Sciences, Hacettepe University, Ankara, Turkey.
4
Liu Y, Al‐Adra DP, Lan R, Jung G, Li H, Yeh MM, Liu Y. RNA sequencing analysis of hepatocellular carcinoma identified oxidative phosphorylation as a major pathologic feature. Hepatol Commun 2022; 6:2170-2181. [PMID: 35344307 PMCID: PMC9315135 DOI: 10.1002/hep4.1945] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 11/01/2021] [Revised: 02/12/2022] [Accepted: 03/03/2022] [Indexed: 11/12/2022] Open
Abstract
Dysregulation of expression of functional genes and pathways plays critical roles in the etiology and progression of hepatocellular carcinoma (HCC). Next-generation RNA sequencing (RNA-seq) offers unparalleled power to comprehensively characterize HCC at the whole transcriptome level. In this study, 17 fresh-frozen HCC samples with paired non-neoplastic liver tissue from Caucasian patients undergoing liver resection or transplantation were used for RNA-seq analysis. Pairwise differential expression analysis of the RNA-seq data was performed to identify genes, pathways, and functional terms differentially regulated in HCC versus normal tissues. At a false discovery rate (FDR) of 0.10, 13% (n = 4335) of transcripts were up-regulated and 19% (n = 6454) of transcripts were down-regulated in HCC versus non-neoplastic tissue. Eighty-five Kyoto Encyclopedia of Genes and Genomes pathways were differentially regulated (FDR, <0.10), with almost all pathways (n = 83) being up-regulated in HCC versus non-neoplastic tissue. Among the top up-regulated pathways was oxidative phosphorylation (hsa00190; FDR, 1.12E-15), which was confirmed by Database for Annotation, Visualization, and Integrated Discovery (DAVID) gene set enrichment analysis. Consistent with potential oxidative stress due to activated oxidative phosphorylation, DNA damage-related signals (e.g., the up-regulated hsa03420 nucleotide excision repair [FDR, 1.14E-04] and hsa03410 base excision repair [FDR, 2.71E-04] pathways) were observed. Among down-regulated genes (FDR, <0.10), functional terms related to cellular structures (e.g., cell membrane [FDR, 3.05E-21] and cell junction [FDR, 2.41E-07]) were highly enriched, suggesting compromised formation of cellular structure in HCC at the transcriptome level. Interestingly, the olfactory transduction (hsa04740; FDR, 1.53E-07) pathway was observed to be down-regulated in HCC versus non-neoplastic tissue, suggesting impaired liver chemosensory functions in HCC. Our findings suggest oxidative phosphorylation and the associated DNA damage may be the major driving pathologic feature in HCC.
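For readers unfamiliar with FDR thresholds such as the 0.10 cutoff quoted in this abstract, the following is a minimal sketch of one common way to control the FDR across many per-gene tests. It assumes the Benjamini-Hochberg step-up adjustment, which the abstract does not name; the function name `benjamini_hochberg` and the toy p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used here for illustration only; the abstract
    reports FDR values but does not state which procedure produced them.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p-values (not data from the study).
toy_p = [0.01, 0.04, 0.03, 0.20]
toy_q = benjamini_hochberg(toy_p)
# Genes whose adjusted value falls below the 0.10 threshold.
significant = [i for i, q in enumerate(toy_q) if q < 0.10]
```

A gene is then called differentially expressed when its adjusted value is below the chosen threshold, which bounds the expected fraction of false calls among all calls.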
Affiliation(s)
- Yongjun Liu
- Department of Pathology and Laboratory Medicine, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA
- David P. Al‐Adra
- Department of Surgery, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA
- Ruoxin Lan
- Department of Biostatistics and Data Science, Tulane University School of Public Health and Tropical Medicine, New Orleans, Louisiana, USA
- Geunyoung Jung
- Department of Pathology and Laboratory Medicine, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA
- Huihua Li
- Department of Pathology and Laboratory Medicine, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA
- Matthew M. Yeh
- Department of Laboratory Medicine and Pathology, University of Washington School of Medicine, Seattle, Washington, USA
- Yao‐Zhong Liu
- Department of Biostatistics and Data Science, Tulane University School of Public Health and Tropical Medicine, New Orleans, Louisiana, USA
5
Zeng W, Rao N, Li Q, Wang G, Liu D, Li Z, Yang Y. Genome-wide Analyses on Single Disease Samples for Potential Biomarkers and Biological Features of Molecular Subtypes: A Case Study in Gastric Cancer. Int J Biol Sci 2018; 14:833-842. [PMID: 29989098 PMCID: PMC6036754 DOI: 10.7150/ijbs.24816] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 01/08/2018] [Accepted: 03/06/2018] [Indexed: 02/06/2023] Open
Abstract
Purpose: Based on the 3 previously well-defined subtypes of gastric adenocarcinoma (invasive, proliferative and metabolic), we aimed to find potential biomarkers and biological features of each subtype. Methods: The genome-wide co-expression network of each subtype of gastric cancer was first constructed. Then, each genome-wide co-expression network was divided into functional modules. Next, the key genes were screened from each functional module. Finally, enrichment analysis was performed on the key genes to mine the biological features of each subtype. Comparative analysis between each pair of subtypes was performed to find the common and unique features among different subtypes. Results: A total of 207 key genes were identified in the invasive subtype, 215 in the proliferative subtype, and 204 in the metabolic subtype. Most key genes in each subtype were unique and were new findings compared with existing related research. The GO and KEGG enrichment analyses for the key genes of each subtype revealed important biological features of each subtype. Conclusions: For each subtype, most identified key genes and important biological features were unique, which means that the key genes can be used as potential biomarkers of that subtype, and each subtype of gastric cancer might have different occurrence and development mechanisms. Thus, different diagnosis and therapy methods should be applied to the invasive, proliferative and metabolic subtypes of gastric cancer.
Affiliation(s)
- Wei Zeng
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China; School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China; Key Laboratory for NeuroInformation of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu 610054, China; Department of Biomedical Engineering, School of Automation and Information Engineering, Sichuan University of Science and Engineering, Zigong, 643000, China
- Nini Rao
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China; School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China; Key Laboratory for NeuroInformation of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu 610054, China; Institute of Electronic and Information Engineering of UESTC in Guangdong, Dongguan, 523808, China
- Qian Li
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China; School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China; Key Laboratory for NeuroInformation of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu 610054, China
- Guangbin Wang
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China; School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China; Key Laboratory for NeuroInformation of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu 610054, China
- Dingyun Liu
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China; School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China; Key Laboratory for NeuroInformation of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu 610054, China
- Zhengwen Li
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China; School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China; Key Laboratory for NeuroInformation of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu 610054, China
- Yuntao Yang
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China; School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China; Key Laboratory for NeuroInformation of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu 610054, China
6
Peng H, Yang Y, Zhe S, Wang J, Gribskov M, Qi Y. DEIsoM: a hierarchical Bayesian model for identifying differentially expressed isoforms using biological replicates. Bioinformatics 2017; 33:3018-3027. [PMID: 28595376 PMCID: PMC5870796 DOI: 10.1093/bioinformatics/btx357] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2016] [Accepted: 06/02/2017] [Indexed: 11/18/2022] Open
Abstract
Motivation: High-throughput mRNA sequencing (RNA-Seq) is a powerful tool for quantifying gene expression. Identification of transcript isoforms that are differentially expressed in different conditions, such as in patients and healthy subjects, can provide insights into the molecular basis of diseases. Current transcript quantification approaches, however, do not take advantage of the shared information in the biological replicates, potentially decreasing sensitivity and accuracy. Results: We present a novel hierarchical Bayesian model called Differentially Expressed Isoform detection from Multiple biological replicates (DEIsoM) for identifying differentially expressed (DE) isoforms from multiple biological replicates representing two conditions, e.g. multiple samples from healthy and diseased subjects. DEIsoM first estimates isoform expression within each condition by (1) capturing common patterns from sample replicates while allowing individual differences, and (2) modeling the uncertainty introduced by ambiguous read mapping in each replicate. Specifically, we introduce a Dirichlet prior distribution to capture the common expression pattern of replicates from the same condition, and treat the isoform expression of individual replicates as samples from this distribution. Ambiguous read mapping is modeled as a multinomial distribution, and ambiguous reads are assigned to the most probable isoform in each replicate. Additionally, DEIsoM couples an efficient variational inference and a post-analysis method to improve the accuracy and speed of identification of DE isoforms over alternative methods. Application of DEIsoM to a hepatocellular carcinoma (HCC) dataset identifies biologically relevant DE isoforms. The relevance of these genes/isoforms to HCC is supported by principal component analysis (PCA), read coverage visualization, and the biological literature. Availability and implementation: The software is available at https://github.com/hao-peng/DEIsoM. Supplementary information: Supplementary data are available at Bioinformatics online.
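The generative idea described in this abstract (a shared Dirichlet distribution per condition, per-replicate isoform proportions drawn from it, and reads allocated to isoforms multinomially) can be illustrated with a small simulation. This is a rough sketch under stated assumptions, not the DEIsoM implementation: the function names, the concentration values in `alpha`, and the read counts are all hypothetical, and the variational inference step is not shown.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one probability vector from Dirichlet(alpha) via gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def simulate_replicates(alpha, n_replicates, n_reads, seed=0):
    """Simulate isoform proportions and read counts for one condition.

    alpha        -- hypothetical shared Dirichlet concentration per isoform
    n_replicates -- number of biological replicates to simulate
    n_reads      -- reads per replicate, allocated multinomially
    """
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_replicates):
        # Each replicate gets its own proportions around the shared pattern.
        proportions = sample_dirichlet(alpha, rng)
        counts = [0] * len(alpha)
        # Multinomial allocation of reads to isoforms.
        for isoform in rng.choices(range(len(alpha)), weights=proportions, k=n_reads):
            counts[isoform] += 1
        replicates.append((proportions, counts))
    return replicates


# Three isoforms of one hypothetical gene, three replicates, 1000 reads each.
simulated = simulate_replicates([8.0, 1.5, 0.5], n_replicates=3, n_reads=1000)
```

Larger values in `alpha` make the replicates resemble each other more closely, which is the sense in which a shared prior pools information across replicates of the same condition.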
Affiliation(s)
- Yifan Yang
- Department of Computer Science; Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
- Jian Wang
- Eli Lilly and Company, Indianapolis, IN 46285, USA
- Michael Gribskov
- Department of Computer Science; Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
- Yuan Qi
- Department of Computer Science; Department of Statistics, Purdue University, West Lafayette, IN 47907, USA