1
Che J, Zhao Y, Gu B, Li S, Li Y, Pan K, Sun T, Han X, Lv J, Zhang S, Fan B, Li C, Wang C, Wang J, Zhang T. Untargeted serum metabolomics reveals potential biomarkers and metabolic pathways associated with the progression of gastroesophageal cancer. BMC Cancer 2023; 23:1238. [PMID: 38102546 PMCID: PMC10724912 DOI: 10.1186/s12885-023-11744-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Accepted: 12/12/2023] [Indexed: 12/17/2023]
Abstract
BACKGROUND: Previous metabolic studies in upper digestive cancer have mostly been limited to cross-sectional study designs, which hinders the ability to effectively predict outcomes in the early stage of cancer. This study aims to identify key metabolites and metabolic pathways associated with the multistage progression of epithelial cancer and to explore their predictive value for gastroesophageal cancer (GEC) formation and for the early screening of esophageal squamous cell carcinoma (ESCC).
METHODS: A case-cohort study within the 7-year prospective Esophageal Cancer Screening Cohort of Shandong Province included 77 GEC cases and 77 sub-cohort individuals. Untargeted metabolic analysis was performed in serum samples. Metabolites with FDR q value < 0.05 and variable importance in projection (VIP) > 1 were selected as differential metabolites to predict GEC formation using Random Forest (RF) models. Subsequently, we evaluated the predictive performance of these differential metabolites for the early screening of ESCC.
RESULTS: We found a distinct metabolic profile alteration in GEC cases compared to the sub-cohort, and identified eight differential metabolites. Pathway analyses showed dysregulation in D-glutamine and D-glutamate metabolism, nitrogen metabolism, primary bile acid biosynthesis, and steroid hormone biosynthesis in GEC patients. A panel of eight differential metabolites showed good predictive performance for GEC formation, with an area under the receiver operating characteristic curve (AUC) of 0.893 (95% CI = 0.816-0.951). Furthermore, four of the GEC pathological progression-related metabolites were validated in the early screening of ESCC, with an AUC of 0.761 (95% CI = 0.716-0.805).
CONCLUSIONS: These findings indicated a panel of metabolites might be an alternative approach to predict GEC formation, and therefore have the potential to mitigate the risk of cancer progression at the early stage of GEC.
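The selection rule in the abstract above (FDR q value < 0.05 and VIP > 1) can be illustrated with a minimal sketch. It assumes the q-values come from the Benjamini-Hochberg procedure, which the abstract does not state, and that VIP scores have already been computed elsewhere. The function names, metabolite names, and numbers are hypothetical and are not taken from the study.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def select_differential(names, pvalues, vips, q_cut=0.05, vip_cut=1.0):
    """Keep metabolites that pass both the q-value and the VIP threshold."""
    qvalues = bh_qvalues(pvalues)
    return [n for n, q, v in zip(names, qvalues, vips) if q < q_cut and v > vip_cut]


# Hypothetical illustrative inputs (not data from the paper).
names = ["m1", "m2", "m3", "m4"]
pvalues = [0.001, 0.010, 0.030, 0.600]
vips = [1.8, 0.7, 1.2, 1.5]

selected = select_differential(names, pvalues, vips)
print(selected)
```

With these made-up inputs, `m2` is dropped for its low VIP score and `m4` for its high q-value, so only `m1` and `m3` remain. The selected metabolites would then feed a classifier such as the Random Forest model the abstract mentions.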
Affiliation(s)
- Jiajing Che
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
  - Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China
- Yongbin Zhao
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
  - Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China
- Bingbing Gu
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
  - Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China
- Shuting Li
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
  - Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China
- Yunfei Li
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
  - Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China
- Keyu Pan
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
  - Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China
- Tiantian Sun
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
  - Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China
- Xinyue Han
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
  - Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China
- Jiali Lv
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
  - Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China
- Shuai Zhang
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
  - Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China
- Bingbing Fan
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
  - Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China
- Chunxia Li
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
  - Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China
- Cheng Wang
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
  - Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China
- Jialin Wang
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
  - Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, 440 Jiyan Road, Jinan, 250117, China
- Tao Zhang
  - Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
  - Institute for Medical Dataology, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China
3
Wang Y, Zhang S, Yang L, Yang S, Tian Y, Ma Q. Measurement of Conditional Relatedness Between Genes Using Fully Convolutional Neural Network. Front Genet 2019; 10:1009. [PMID: 31695723 PMCID: PMC6818468 DOI: 10.3389/fgene.2019.01009] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 04/18/2019] [Accepted: 09/23/2019] [Indexed: 11/13/2022]
Abstract
Measuring conditional relatedness, the degree of relation between a pair of genes under a given condition, is a basic but difficult task in bioinformatics, because traditional co-expression analysis methods rely on co-expression similarities, which are well known to have a high false positive rate. Complementing them with prior-knowledge similarities is a feasible way to tackle the problem. However, classical combination machine learning algorithms fail to detect and apply the complex mapping relations between similarities and conditional relatedness, so a powerful predictive model would be of great benefit for measuring this kind of complex mapping relation. To meet this need, we propose a novel deep learning model, a convolutional neural network with a fully connected first layer, named fully convolutional neural network (FCNN), to measure conditional relatedness between genes using both co-expression and prior-knowledge similarities. Under grid-search 10-fold cross validation, the FCNN model yields on average 3.0% and 2.7% higher accuracy on the validation and test datasets, respectively, for identifying gene–gene interactions collected from the COXPRESdb, KEGG, and TRRUST databases and a benchmark dataset from the study of Xiao-Yong et al. To further evaluate the FCNN model, we conduct an additional verification on the GeneFriends and DIP datasets, where it obtains on average 1.8% and 7.6% higher accuracy, respectively. The FCNN model is then applied to construct cancer gene networks, where it also yields more practical results than the other compared models and methods. A website for the FCNN model and the relevant datasets can be accessed at https://bmbl.bmi.osumc.edu/FCNN.
Affiliation(s)
- Yan Wang
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
  - School of Artificial Intelligence, Jilin University, Changchun, China
- Shuangquan Zhang
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
- Lili Yang
  - Department of Obstetrics, The First Hospital of Jilin University, Changchun, China
- Sen Yang
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
- Yuan Tian
  - School of Artificial Intelligence, Jilin University, Changchun, China
- Qin Ma
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, United States
4
Glazko G, Zybailov B, Emmert-Streib F, Baranova A, Rahmatallah Y. Proteome-transcriptome alignment of molecular portraits achieved by self-contained gene set analysis: Consensus colon cancer subtypes case study. PLoS One 2019; 14:e0221444. [PMID: 31437237 PMCID: PMC6705791 DOI: 10.1371/journal.pone.0221444] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/01/2019] [Accepted: 08/06/2019] [Indexed: 01/10/2023]
Abstract
Gene set analysis (GSA) has become the common methodology for analyzing transcriptomics data. However, self-contained GSA techniques are rarely, if ever, used for proteomics data analysis. Here we present a self-contained proteome-level GSA of four consensus molecular subtypes (CMSs) previously established by transcriptome dissection of colon carcinoma specimens. Despite notable differences in the structure of proteomics and transcriptomics data, many pathway-wide characteristic features of CMSs found at the mRNA level were reproduced at the protein level. In particular, CMS1 features show heavy involvement of the immune system as well as pathways related to mismatch repair, DNA replication and proteasome function, while CMS4 tumors upregulate the complement pathway and proteins participating in epithelial-to-mesenchymal transition (EMT). In addition, protein-level GSA yielded a set of novel observations visible at the proteome but not at the transcriptome level, including possible involvement of major histocompatibility complex II (MHC-II) antigens in the known immunogenicity of CMS1 and a connection between cholesterol trafficking and the regulation of integrin-linked kinase (ILK) in CMS3. Overall, this study proves the utility of self-contained GSA approaches as a critical tool for analyzing proteomics data in general and dissecting protein-level molecular portraits of human tumors in particular.
Affiliation(s)
- Galina Glazko
  - Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America
- Boris Zybailov
  - Department of Biochemistry and Molecular Biology, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America
- Frank Emmert-Streib
  - Computational Medicine and Statistical Learning Laboratory, Tampere University of Technology, Korkeakoulunkatu, Tampere, Finland
- Ancha Baranova
  - School of Systems Biology, George Mason University, Manassas, VA, United States of America
  - Research Center for Medical Genetics, Moscow, Russia
- Yasir Rahmatallah
  - Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America
5
Rogozin IB, Pavlov YI, Goncearenco A, De S, Lada AG, Poliakov E, Panchenko AR, Cooper DN. Mutational signatures and mutable motifs in cancer genomes. Brief Bioinform 2019; 19:1085-1101. [PMID: 28498882 DOI: 10.1093/bib/bbx049] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 01/25/2017] [Indexed: 12/22/2022]
Abstract
Cancer is a genetic disorder, meaning that a plethora of different mutations, whether somatic or germ line, underlie the etiology of the 'Emperor of Maladies'. Point mutations, chromosomal rearrangements and copy number changes, whether they have occurred spontaneously in predisposed individuals or have been induced by intrinsic or extrinsic (environmental) mutagens, lead to the activation of oncogenes and inactivation of tumor suppressor genes, thereby promoting malignancy. This scenario has now been recognized and experimentally confirmed in a wide range of different contexts. Over the past decade, a surge in available sequencing technologies has allowed the sequencing of whole genomes from liquid malignancies and solid tumors belonging to different types and stages of cancer, giving birth to the new field of cancer genomics. One of the most striking discoveries has been that cancer genomes are highly enriched with mutations of specific kinds. It has been suggested that these mutations can be classified into 'families' based on their mutational signatures. A mutational signature may be regarded as a type of base substitution (e.g. C:G to T:A) within a particular context of neighboring nucleotide sequence (the bases upstream and/or downstream of the mutation). These mutational signatures, supplemented by mutable motifs (a wider mutational context), promise to help us to understand the nature of the mutational processes that operate during tumor evolution because they represent the footprints of interactions between DNA, mutagens and the enzymes of the repair/replication/modification pathways.
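The notion of a base substitution read within its neighboring nucleotides, as described in the abstract above, can be sketched in a few lines. This is an illustrative sketch only: it assumes a trinucleotide context (one base on each side) and the common convention of reporting every substitution from the pyrimidine strand, neither of which is specified in the abstract. The function names and the toy sequence are hypothetical.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def substitution_context(genome, pos, alt):
    """Label a substitution at 0-based `pos` as e.g. 'T[C>T]G'.

    `pos` must have a neighbor on both sides. Purine references (A, G)
    are flipped to the opposite strand so that C:G>T:A events share a label.
    """
    ref = genome[pos]
    context = genome[pos - 1 : pos + 2]
    if ref in "AG":
        context = revcomp(context)
        ref = COMPLEMENT[ref]
        alt = COMPLEMENT[alt]
    return f"{context[0]}[{ref}>{alt}]{context[2]}"


# Toy sequence and mutations (hypothetical, for illustration only).
genome = "ATCGGA"
mutations = [(2, "T"), (3, "A")]  # C>T at index 2, G>A at index 3

spectrum = Counter(substitution_context(genome, p, a) for p, a in mutations)
print(spectrum)
```

Both toy mutations are C:G to T:A changes, but they occur in different flanking contexts, so they are counted under separate labels. Tallying such labels across a tumor genome gives the kind of mutation spectrum from which signatures are inferred.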
Affiliation(s)
- Igor B Rogozin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, USA
- Youri I Pavlov
  - Eppley Institute for Cancer Research, University of Nebraska Medical Center, USA
- Artem G Lada
  - Department Microbiology and Molecular Genetics, University of California, Davis, USA
- Eugenia Poliakov
  - Laboratory of Retinal Cell and Molecular Biology, National Eye Institute, National Institutes of Health, USA
- Anna R Panchenko
  - National Center for Biotechnology Information, National Institutes of Health, USA
9
Kim P, Cheng F, Zhao J, Zhao Z. ccmGDB: a database for cancer cell metabolism genes. Nucleic Acids Res 2015; 44:D959-68. [PMID: 26519468 PMCID: PMC4702820 DOI: 10.1093/nar/gkv1128] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Received: 08/12/2015] [Accepted: 10/15/2015] [Indexed: 12/24/2022]
Abstract
Accumulating evidence has demonstrated that rewiring of metabolism in cells is an important hallmark of cancer. An estimated 30% of advanced-stage cancer patients are killed by metabolic disorder. Thus, a systematic annotation of cancer cell metabolism genes is imperative. Here, we present ccmGDB (Cancer Cell Metabolism Gene DataBase), a comprehensive annotation database for cell metabolism genes in cancer, available at http://bioinfo.mc.vanderbilt.edu/ccmGDB. We assembled, curated, and integrated genetic, genomic, transcriptomic, proteomic, biological network and functional information for over 2000 cell metabolism genes in more than 30 cancer types. In total, we integrated over 260 000 somatic alterations including non-synonymous mutations, copy number variants and structural variants. We also integrated RNA-Seq data in various primary tumors, gene expression microarray data in over 1000 cancer cell lines and protein expression data. Furthermore, we constructed cancer or tissue type-specific, gene co-expression based protein interaction networks and drug-target interaction networks. Using these systematic annotations, the ccmGDB portal site provides 6 categories (gene summary, phenotypic information, somatic mutations, gene and protein expression, gene co-expression network and drug pharmacological information) with a user-friendly interface for browsing and searching. ccmGDB is developed and maintained as a useful resource for the cancer research community.
Affiliation(s)
- Pora Kim
  - Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN 37203, USA
- Feixiong Cheng
  - Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN 37203, USA
- Junfei Zhao
  - Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN 37203, USA
- Zhongming Zhao
  - Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN 37203, USA
  - Department of Cancer Biology, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
  - Department of Psychiatry, Vanderbilt University School of Medicine, Nashville, TN 37212, USA