1. Fang F, Sun Y. Prediction of systemic lupus erythematosus-related genes based on graph attention network and deep neural network. Comput Biol Med 2024; 175:108371. [PMID: 38691916 DOI: 10.1016/j.compbiomed.2024.108371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 03/13/2024] [Accepted: 03/24/2024] [Indexed: 05/03/2024]
Abstract
Systemic lupus erythematosus (SLE) is an autoimmune disorder intricately linked to genetic factors, and numerous approaches have identified genes linked to its development, diagnosis and prognosis. Although genome-wide association analysis and gene knockout experiments have confirmed some genes associated with SLE, numerous potential genes remain to be discovered. The search for relevant genes through biological experiments entails significant financial and human resources. With the advancement of computational technologies like deep learning, we aim to identify SLE-related genes through deep learning methods, thereby narrowing down the scope for biological experimentation. This study introduces SLEDL, a deep learning-based approach that leverages deep neural networks (DNN) and graph neural networks to effectively identify SLE-related genes by capturing relevant features in the gene interaction network. SLEDL thus transforms the identification of SLE-related genes into a binary classification problem, ultimately solved through a fully connected layer. The results demonstrate the superiority of SLEDL, which achieves higher AUC (0.7274) and AUPR (0.7599), and the predictions are further validated through case studies.
Affiliation(s)
- Fang Fang
  - Department of Rheumatology and Immunology, The First Hospital of China Medical University, Shenyang, Liaoning, China
- Yizhou Sun
  - Department of Ophthalmology, The First Hospital of China Medical University, Shenyang, Liaoning, China
2. Shi J, Chen Y, Wang Y. Deep learning and machine learning approaches to classify stomach distant metastatic tumors using DNA methylation profiles. Comput Biol Med 2024; 175:108496. [PMID: 38657466 DOI: 10.1016/j.compbiomed.2024.108496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2024] [Revised: 04/14/2024] [Accepted: 04/21/2024] [Indexed: 04/26/2024]
Abstract
Distant metastasis of cancer is a significant contributor to cancer-related complications, and early identification of unidentified stomach adenocarcinoma is crucial for a positive prognosis. Changes in DNA methylation are increasingly recognized as a crucial factor in predicting cancer progression. In this research, we developed machine learning and deep learning models for distinguishing distant metastasis in samples of stomach adenocarcinoma based on DNA methylation profiles. We employed deep neural networks (DNN), support vector machines (SVM), random forest (RF), Naive Bayes (NB) and decision tree (DT) to build models for forecasting distant metastasis in stomach adenocarcinoma. The results show that the DNN performs better than the other models, with AUC and AUPR reaching 99.9% and 99.5%, respectively. Additionally, a weighted random sampling technique was used to address the issue of imbalanced datasets, enabling the identification of crucial methylation markers associated with functionally significant genes in stomach distant metastasis tumors with greater performance.
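The weighted random sampling mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common inverse-class-frequency weighting; the abstract does not specify the authors' exact scheme, and the function names `inverse_frequency_weights` and `weighted_resample` are hypothetical.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Give each sample a weight of 1 / (size of its class).

    Assumption: inverse-frequency weighting; the paper's scheme may differ.
    With these weights every class carries the same total weight.
    """
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def weighted_resample(samples, labels, k, seed=0):
    """Draw k samples with replacement, favouring the minority class."""
    rng = random.Random(seed)
    weights = inverse_frequency_weights(labels)
    picked = rng.choices(range(len(samples)), weights=weights, k=k)
    return [samples[i] for i in picked], [labels[i] for i in picked]
```

With nine non-metastatic samples and one metastatic sample, each class ends up with a total weight of 1.0, so a resampled batch is expected to be roughly balanced between the two classes.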
Affiliation(s)
- Jing Shi
  - Department of Medical Oncology, The First Hospital of China Medical University, Shenyang, China
- Ying Chen
  - Department of Medical Oncology, The First Hospital of China Medical University, Shenyang, China
- Ying Wang
  - Department of Endoscopy, The First Hospital of China Medical University, Shenyang, China
3. Stincone P, Naimi A, Saviola AJ, Reher R, Petras D. Decoding the molecular interplay in the central dogma: An overview of mass spectrometry-based methods to investigate protein-metabolite interactions. Proteomics 2024; 24:e2200533. [PMID: 37929699 DOI: 10.1002/pmic.202200533] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Revised: 10/15/2023] [Accepted: 10/23/2023] [Indexed: 11/07/2023]
Abstract
With the emergence of next-generation nucleotide sequencing and mass spectrometry-based proteomics and metabolomics tools, we have comprehensive and scalable methods to analyze the genes, transcripts, proteins, and metabolites of a multitude of biological systems. Despite the fascinating new molecular insights at the genome, transcriptome, proteome and metabolome scale, we are still far from fully understanding cellular organization, cell cycles and biology at the molecular level. Significant advances in sensitivity and depth for both sequencing and mass spectrometry-based methods allow analysis at the single-cell and single-molecule level. At the same time, new tools are emerging that enable the investigation of molecular interactions throughout the central dogma of molecular biology. In this review, we provide an overview of established and recently developed mass spectrometry-based tools to probe metabolite-protein interactions, from individual interaction pairs to interactions at the proteome-metabolome scale.
Affiliation(s)
- Paolo Stincone
  - University of Tuebingen, CMFI Cluster of Excellence, Interfaculty Institute of Microbiology and Infection Medicine, Tuebingen, Germany
  - University of Tuebingen, Center for Plant Molecular Biology, Tuebingen, Germany
- Amira Naimi
  - University of Marburg, Institute of Pharmaceutical Biology and Biotechnology, Marburg, Germany
- Raphael Reher
  - University of Marburg, Institute of Pharmaceutical Biology and Biotechnology, Marburg, Germany
- Daniel Petras
  - University of Tuebingen, CMFI Cluster of Excellence, Interfaculty Institute of Microbiology and Infection Medicine, Tuebingen, Germany
  - University of California Riverside, Department of Biochemistry, Riverside, USA
4. Bai Y, Di L, Liu W, Zhou F, Ma J, Meng G, Li M, Sun G. Elucidating immune cell dynamics in chronic lung allograft dysfunction: A comprehensive single-cell transcriptomic study. Comput Biol Med 2024; 173:108254. [PMID: 38520924 DOI: 10.1016/j.compbiomed.2024.108254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Revised: 02/12/2024] [Accepted: 03/06/2024] [Indexed: 03/25/2024]
Abstract
Chronic Lung Allograft Dysfunction (CLAD) is a critical post-transplant complication that predominantly determines the long-term survival rates and quality of life of patients undergoing lung transplantation. The limited efficacy of current immunosuppressive strategies underscores our incomplete understanding of the immunological aspects of CLAD. Hence, there is an urgent need for more comprehensive and targeted research to unravel the complex interplay of immune cells in the development and progression of CLAD. This study conducts an in-depth analysis of the immune environment in CLAD. By examining the gene expression profiles of T cells, natural killer cells, B cells, macrophages, and monocytes, we have elucidated a unique immunological landscape in CLAD compared to healthy controls. We highlight the heterogeneity within the immune populations and provide a comprehensive understanding of the immune mechanisms driving CLAD. Enrichment analysis identified specific pathways that are either overactive or suppressed in CLAD, revealing potential molecular targets for therapeutic intervention. Our findings emphasize the crucial role of T cells in the pathophysiology of CLAD, coordinating the immune response and revealing an amplified immune cell network, potentially leading to maladaptive tissue responses. By integrating a comprehensive cellular and molecular portrait of the immune environment, our research not only deepens our understanding of the pathogenesis of CLAD but also lays a foundational approach for the development of targeted therapies.
Affiliation(s)
- Yu Bai
  - Department of Thoracic Surgery, The Second Hospital of Dalian Medical University, Dalian, China
- Liang Di
  - Department of Thoracic Surgery, The Second Hospital of Dalian Medical University, Dalian, China
- Wanying Liu
  - Department of Thoracic Surgery, The Second Hospital of Dalian Medical University, Dalian, China
- Feixue Zhou
  - Department of Thoracic Surgery, The Second Hospital of Dalian Medical University, Dalian, China
- Jiaxiang Ma
  - Department of Thoracic Surgery, The Second Hospital of Dalian Medical University, Dalian, China
- Guangxian Meng
  - Department of Thoracic Surgery, The Second Hospital of Dalian Medical University, Dalian, China
- Mo Li
  - Department of Thoracic Surgery, The Second Hospital of Dalian Medical University, Dalian, China
- Ge Sun
  - Department of Thoracic Surgery, The Second Hospital of Dalian Medical University, Dalian, China
5. Wei W, Yue D. CoGSPro-net: A graph neural network based on protein-protein interaction for classifying lung cancer-related proteins. Comput Biol Med 2024; 172:108251. [PMID: 38508055 DOI: 10.1016/j.compbiomed.2024.108251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2024] [Revised: 02/16/2024] [Accepted: 03/06/2024] [Indexed: 03/22/2024]
Abstract
This paper proposes a deep learning algorithm named CoGSPro for classifying lung cancer-related proteins. CoGSPro combines graph neural networks and attention mechanisms to extract key features from protein data and accurately classify proteins. It utilizes large-scale protein expression datasets to train and validate the model, enabling it to identify subtle patterns related to lung cancer. CoGSPro integrates protein-protein interaction network information to improve its predictive accuracy. The experimental results indicate that CoGSPro achieves cutting-edge performance, attaining an accuracy of 96.60% in the classification of lung cancer proteins, surpassing other baseline methods. Additionally, CoGSPro has uncovered new biomarkers for lung cancer, offering potential targets for early detection and treatment.
Affiliation(s)
- Wei Wei
  - Department of Lung Cancer, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin's Clinical Research Center for Cancer, Tianjin Lung Cancer Center, Tianjin, China
- Dongsheng Yue
  - Department of Lung Cancer, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin's Clinical Research Center for Cancer, Tianjin Lung Cancer Center, Tianjin, China
6. Wang Y, Du Y. Graph neural network model GGDisnet for identifying genes in gastrointestinal cancer and single-cell analysis. Comput Biol Med 2024; 172:108285. [PMID: 38503088 DOI: 10.1016/j.compbiomed.2024.108285] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2024] [Revised: 02/22/2024] [Accepted: 03/12/2024] [Indexed: 03/21/2024]
Abstract
Gastrointestinal cancer, a highly prevalent form of cancer, has been the subject of extensive research resulting in the identification of numerous pathogenic genes. However, validation and exploration of these findings often require traditional biological experiments, which are time-consuming and limit the ability to make extensive assessments promptly. To address this challenge, this paper introduces GGDisnet, a novel model for identifying genes associated with gastrointestinal cancer. GGDisnet efficiently screens human genes, providing a set of genes with a high correlation to gastrointestinal cancer for reference. Comparative analysis with other models demonstrates GGDisnet's superior performance. Furthermore, we conducted enrichment and single-cell analyses based on GGDisnet-predicted genes, offering valuable clinical insights.
Affiliation(s)
- Ying Wang
  - Department of Endoscopy, The First Hospital of China Medical University, Shenyang, Liaoning, China
- Yaqi Du
  - Department of Gastroenterology, The First Hospital of China Medical University, Shenyang, Liaoning, China
7. Habibpour M, Razaghi-Moghadam Z, Nikoloski Z. Prediction and integration of metabolite-protein interactions with genome-scale metabolic models. Metab Eng 2024; 82:216-224. [PMID: 38367764 DOI: 10.1016/j.ymben.2024.02.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Revised: 01/13/2024] [Accepted: 02/14/2024] [Indexed: 02/19/2024]
Abstract
Metabolites, as small molecules, can act not only as substrates to enzymes, but also as effectors of the activity of proteins with different functions, thereby affecting various cellular processes. While several experimental techniques have started to catalogue the metabolite-protein interactions (MPIs) present in different cellular contexts, characterizing the functional relevance of MPIs remains a challenging problem. Computational approaches from the constraint-based modeling framework allow for predicting MPIs and integrating their effects in the in silico analysis of metabolic and physiological phenotypes, like cell growth. Here, we provide a classification of all existing constraint-based approaches that predict and integrate MPIs using genome-scale metabolic networks as input. In addition, we benchmark the performance of the approaches to predict MPIs in a comparative study using different features extracted from the model structure and predicted metabolic phenotypes with the state-of-the-art metabolic networks of Escherichia coli and Saccharomyces cerevisiae. Lastly, we provide an outlook for future, feasible directions to expand the consideration of MPIs in constraint-based modeling approaches with wide biotechnological applications.
Affiliation(s)
- Mahdis Habibpour
  - Systems Biology and Mathematical Modeling Group, Max Planck Institute of Molecular Plant Physiology, 14476, Potsdam, Germany
- Zahra Razaghi-Moghadam
  - Systems Biology and Mathematical Modeling Group, Max Planck Institute of Molecular Plant Physiology, 14476, Potsdam, Germany; Bioinformatics Department, Institute of Biochemistry and Biology, University of Potsdam, 14476, Potsdam, Germany
- Zoran Nikoloski
  - Systems Biology and Mathematical Modeling Group, Max Planck Institute of Molecular Plant Physiology, 14476, Potsdam, Germany; Bioinformatics Department, Institute of Biochemistry and Biology, University of Potsdam, 14476, Potsdam, Germany
8. Cui X, Lin Q, Chen M, Wang Y, Wang Y, Wang Y, Tao J, Yin H, Zhao T. Long-read sequencing unveils novel somatic variants and methylation patterns in the genetic information system of early lung cancer. Comput Biol Med 2024; 171:108174. [PMID: 38442557 DOI: 10.1016/j.compbiomed.2024.108174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2024] [Revised: 01/25/2024] [Accepted: 02/18/2024] [Indexed: 03/07/2024]
Abstract
Lung cancer poses a global health challenge, necessitating advanced diagnostics for improved outcomes. Intensive efforts are ongoing to pinpoint early detection biomarkers, such as genomic variations and DNA methylation, to elevate diagnostic precision. We conducted long-read sequencing on cancerous and adjacent non-cancerous tissues from a patient with lung adenocarcinoma. We identified somatic structural variations (SVs) specific to lung cancer by integrating data from various SV calling methods and differentially methylated regions (DMRs) that were distinct between these two tissue samples, revealing a unique methylation pattern associated with lung cancer. This study discovered over 40,000 somatic SVs and over 180,000 DMRs linked to lung cancer. We identified approximately 700 genes of significant relevance through comprehensive analysis, including genes intricately associated with many lung cancers, such as NOTCH1, SMOC2, CSMD2, and others. Furthermore, we observed that somatic SVs and DMRs were substantially enriched in several pathways, such as axon guidance signaling pathways, which suggests a comprehensive multi-omics impact on lung cancer progression across various biological investigation levels. These datasets can potentially serve as biomarkers for early lung cancer detection and may hold significant value in clinical diagnosis and treatment applications.
Affiliation(s)
- Xinran Cui
  - School of Computer Science and Technology, Harbin Institute of Technology, 92 West Da Zhi St, Harbin, Heilongjiang, 150000, China
- Qingyan Lin
  - Department of Respiratory and Critical Care, Heilongjiang Provincial Hospital, 405 Gorokhovaya Street, Harbin, Heilongjiang, 150000, China
- Ming Chen
  - Institute of Bioinformatics, Harbin Institute of Technology, 92 West Da Zhi St, Harbin, Heilongjiang, 150000, China
- Yidan Wang
  - Department of Respiratory and Critical Care, Heilongjiang Provincial Hospital, 405 Gorokhovaya Street, Harbin, Heilongjiang, 150000, China
- Yiwen Wang
  - Tanwei College, Tsinghua University, Shuangqing Road, Beijing, 100084, China
- Yadong Wang
  - School of Computer Science and Technology, Harbin Institute of Technology, 92 West Da Zhi St, Harbin, Heilongjiang, 150000, China
- Jiang Tao
  - School of Computer Science and Technology, Harbin Institute of Technology, 92 West Da Zhi St, Harbin, Heilongjiang, 150000, China
- Honglei Yin
  - Department of Respiratory and Critical Care, Heilongjiang Provincial Hospital, 405 Gorokhovaya Street, Harbin, Heilongjiang, 150000, China
- Tianyi Zhao
  - School of Medicine, Harbin Institute of Technology, 92 West Da Zhi St, Harbin, Heilongjiang, 150000, China
9. Devasahayam Arokia Balaya R, Palollathil A, Kumar STA, Chandrasekaran J, Upadhyay SS, Parate SS, Sajida M, Karthikkeyan G, Prasad TSK. Role of Hemigraphis alternata in wound healing: metabolomic profiling and molecular insights into mechanisms. Sci Rep 2024; 14:3872. [PMID: 38365839 PMCID: PMC10873326 DOI: 10.1038/s41598-024-54352-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Accepted: 02/12/2024] [Indexed: 02/18/2024] [Open access]
Abstract
Hemigraphis alternata (H. alternata), commonly known as Red Flame Ivy, is widely recognized for its wound healing capabilities. However, the pharmacologically active plant components and their mechanisms of action in wound healing are yet to be determined. This study presents the mass spectrometry-based global metabolite profiling of aqueous and ethanolic extracts of H. alternata leaves. The analysis identified 2285 metabolites from 24,203 spectra obtained in both positive and negative polarities. The identified metabolites were classified under ketones, carboxylic acids, primary aliphatic amines, steroids and steroid derivatives. We performed network pharmacology analysis to explore metabolite-protein interactions and identified 124 human proteins as targets for H. alternata metabolites. Several of these were implicated in wound healing, including prothrombin (F2), alpha-2A adrenergic receptor (ADRA2A) and fibroblast growth factor receptor 1 (FGFR1). Gene ontology analysis of the target proteins showed enrichment of cellular functions related to glucose metabolic process, platelet activation, membrane organization and response to wounding. Additionally, pathway enrichment analysis revealed a potential molecular network involved in wound healing. Moreover, in silico docking analysis showed strong binding energy between H. alternata metabolites and the identified protein targets (F2 and PTPN11). Finally, the key metabolites involved in wound healing were validated by multiple reaction monitoring-based targeted analysis.
Affiliation(s)
- Rex Devasahayam Arokia Balaya
  - Center for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore, India, 575018
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA
- Akhina Palollathil
  - Center for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore, India, 575018
- Sumaithangi Thattai Arun Kumar
  - Center for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore, India, 575018
- Jaikanth Chandrasekaran
  - Department of Pharmacology, Sri Ramachandra Faculty of Pharmacy, Sri Ramachandra Institute of Higher Education and Research (Deemed to be University), Chennai, 600116, India
- Shubham Sukerndeo Upadhyay
  - Center for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore, India, 575018
- Sakshi Sanjay Parate
  - Center for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore, India, 575018
- M Sajida
  - Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore, India
- Gayathree Karthikkeyan
  - Center for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore, India, 575018
10. Cai WL, Cheng M, Wang Y, Xu PH, Yang X, Sun ZW, Yan WJ. Prediction and related genes of cancer distant metastasis based on deep learning. Comput Biol Med 2024; 168:107664. [PMID: 38000245 DOI: 10.1016/j.compbiomed.2023.107664] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Revised: 09/27/2023] [Accepted: 10/31/2023] [Indexed: 11/26/2023]
Abstract
Cancer metastasis is one of the main causes of cancer progression and difficulty in treatment. Genes play a key role in the process of cancer metastasis, as they can influence tumor cell invasiveness, migration ability and fitness. At the same time, the organs to which cancers metastasize are heterogeneous: breast cancer and prostate cancer, for example, tend to metastasize to the bone. Previous studies have pointed out that the occurrence of metastasis is closely related to both the target tissue and the genes involved. In this paper, we identified genes associated with cancer metastasis to different tissues based on LASSO and Pearson correlation coefficients. In total, we identified 45 genes associated with bone metastases, 89 genes associated with lung metastases, and 86 genes associated with liver metastases. Using the expression of these genes, we propose a CNN-based model to predict the occurrence of metastasis. We call this method MDCNN. It introduces a modulation mechanism that allows the weights of convolution kernels to be adjusted at different positions and feature maps, thereby adaptively changing the convolution operation at different positions. Experiments show that MDCNN achieves satisfactory prediction accuracy for bone, lung and liver metastasis, and outperforms four other methods of the same kind. We performed enrichment analysis and immune infiltration analysis on bone metastasis-related genes, found multiple pathways and GO terms related to bone metastasis, and found that the abundance of macrophages and monocytes was the highest in patients with bone metastasis.
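The Pearson-correlation part of the gene selection described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the threshold value, the data layout and the function names `pearson` and `select_by_correlation` are hypothetical, and the LASSO step that the authors combine with it is not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant vector carries no correlation signal
    return cov / (sd_x * sd_y)


def select_by_correlation(expression, labels, threshold=0.5):
    """Keep genes whose |r| with the metastasis label meets the threshold.

    expression: dict mapping gene name -> per-sample expression values
    labels: per-sample 0/1 metastasis status (hypothetical encoding)
    """
    return [
        gene
        for gene, values in expression.items()
        if abs(pearson(values, labels)) >= threshold
    ]
```

Genes that correlate strongly with the label in either direction are retained, which is why the filter uses the absolute value of the coefficient.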
Affiliation(s)
- Wei-Luo Cai
  - Department of Musculoskeletal Surgery, Fudan University Shanghai Cancer Center, China
- Mo Cheng
  - Department of Musculoskeletal Surgery, Fudan University Shanghai Cancer Center, China
- Yi Wang
  - Department of Gastrointestinal Surgical Oncology, Clinical Oncology School of Fujian Medical University, Fujian Cancer Hospital, China
- Pei-Hang Xu
  - Department of Musculoskeletal Surgery, Fudan University Shanghai Cancer Center, China
- Xi Yang
  - Department of Radiation Oncology, Fudan University Shanghai Cancer Center, China; Department of Oncology, Shanghai Medical College, Fudan University, China
- Zheng-Wang Sun
  - Department of Musculoskeletal Surgery, Fudan University Shanghai Cancer Center, China
- Wang-Jun Yan
  - Department of Musculoskeletal Surgery, Fudan University Shanghai Cancer Center, China
11. Wang YM, Sun Y, Wang B, Wu Z, He XY, Zhao Y. Transfer learning for clustering single-cell RNA-seq data crossing-species and batch, case on uterine fibroids. Brief Bioinform 2023; 25:bbad426. [PMID: 37991248 PMCID: PMC10664408 DOI: 10.1093/bib/bbad426] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Revised: 10/12/2023] [Accepted: 10/30/2023] [Indexed: 11/23/2023] [Open access]
Abstract
The high dimensionality and sparsity of the gene expression matrix in single-cell RNA-sequencing (scRNA-seq) data, coupled with significant noise generated by shallow sequencing, pose a great challenge for cell clustering methods. While numerous computational methods have been proposed, the majority of existing approaches center on processing the target dataset itself. This approach disregards the wealth of knowledge present within other species and batches of scRNA-seq data. In light of this, our paper proposes a novel method named graph-based deep embedding clustering (GDEC) that leverages transfer learning across species and batches. GDEC integrates graph convolutional networks, effectively overcoming the challenges posed by sparse gene expression matrices. Additionally, the incorporation of DEC in GDEC enables the partitioning of cell clusters within a lower-dimensional space, thereby mitigating the adverse effects of noise on clustering outcomes. GDEC constructs a model based on existing scRNA-seq datasets and then applies transfer learning techniques to fine-tune the model using a limited amount of prior knowledge gleaned from the target dataset. This enables GDEC to cluster scRNA-seq data across different species and batches. Through cross-species and cross-batch clustering experiments, we conducted a comparative analysis between GDEC and conventional packages. Furthermore, we applied GDEC to the scRNA-seq data of uterine fibroids. Compared with results obtained from the Seurat package, GDEC unveiled a novel cell type (epithelial cells) and identified a notable number of new pathways among various cell types, thus underscoring the enhanced analytical capabilities of GDEC. Availability and implementation: https://github.com/YuzhiSun/GDEC/tree/main.
Affiliation(s)
- Yu Mei Wang
  - Department of Gynecology, Shanghai First Maternity and Infant Hospital, School of Medicine, Tong Ji University, Shanghai, China
  - Shanghai Key Laboratory of Maternal and Fetal Medicine, Shanghai First Maternity and Infant Hospital, Shanghai, China
- Yuzhi Sun
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Beiying Wang
  - Department of Gynecology, Shanghai First Maternity and Infant Hospital, School of Medicine, Tong Ji University, Shanghai, China
  - Shanghai Key Laboratory of Maternal and Fetal Medicine, Shanghai First Maternity and Infant Hospital, Shanghai, China
- Zhiping Wu
  - Department of Gynecology, Shanghai First Maternity and Infant Hospital, School of Medicine, Tong Ji University, Shanghai, China
  - Shanghai Key Laboratory of Maternal and Fetal Medicine, Shanghai First Maternity and Infant Hospital, Shanghai, China
- Xiao Ying He
  - Department of Gynecology, Shanghai First Maternity and Infant Hospital, School of Medicine, Tong Ji University, Shanghai, China
  - Shanghai Key Laboratory of Maternal and Fetal Medicine, Shanghai First Maternity and Infant Hospital, Shanghai, China
- Yuansong Zhao
  - University of Texas Health Science Center at Houston, 77030-5400, USA
12. Han L, Wang Z, Li C, Fan M, Wang Y, Sun G, Dai G. Functional identification and prediction of lncRNAs in esophageal cancer. Comput Biol Med 2023; 165:107205. [PMID: 37611425 DOI: 10.1016/j.compbiomed.2023.107205] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/26/2023] [Revised: 05/29/2023] [Accepted: 06/25/2023] [Indexed: 08/25/2023]
Abstract
Esophageal cancer is a highly lethal malignancy with poor prognosis, and the identification of molecular biomarkers is crucial for improving diagnosis and treatment. Long non-coding RNAs (lncRNAs) have been shown to play important roles in the development and progression of esophageal cancer. However, due to the time cost of biological experiments, only a small number of lncRNAs related to esophageal cancer have been discovered. Currently, computational methods have emerged as powerful tools for identifying and characterizing lncRNAs, as well as predicting their potential functions. Therefore, this article proposes a transformer-based method for identifying esophageal cancer-related lncRNAs. Experimental results show that the AUC and AUPR of this method are superior to other comparison methods, with an AUC of 0.87 and an AUPR of 0.83, and the identified lncRNA targets are closely associated with esophageal cancer. We focus on the role of esophageal cancer-related lncRNAs in the immune microenvironment, and fully explore the functions of the target genes regulated by lncRNAs. Enrichment analysis shows that the predicted target genes are related to multiple pathways involved in the occurrence, development, and prognosis of esophageal cancer. This not only demonstrates the effectiveness of the method but also indicates the accuracy of the prediction results.
Collapse
Affiliation(s)
- Lu Han
- Department of Oncology, The Fifth Medical Center of Chinese PLA General Hospital, Beijing, China; Medical School of Chinese PLA, Beijing, China
| | - Zhikuan Wang
- Department of Oncology, The Fifth Medical Center of Chinese PLA General Hospital, Beijing, China
| | - Congyong Li
- Medical School of Chinese PLA, Beijing, China; Sixth Health Care Department, The Second Medical Center of PLA General Hospital, Beijing, China
| | - Mengjiao Fan
- Department of Oncology, The Fifth Medical Center of Chinese PLA General Hospital, Beijing, China; Medical School of Chinese PLA, Beijing, China
| | - Yanrong Wang
- Department of Oncology, The Fifth Medical Center of Chinese PLA General Hospital, Beijing, China; Medical School of Chinese PLA, Beijing, China
| | - Gang Sun
- Department of Gastroenterology and Hepatology, The First Medical Center of Chinese PLA General Hospital, Beijing, China.
| | - Guanghai Dai
- Department of Oncology, The Fifth Medical Center of Chinese PLA General Hospital, Beijing, China.
| |
|
13
|
Sheng QJ, Tan Y, Zhang L, Wu ZP, Wang B, He XY. Heterogeneous graph framework for predicting the association between lncRNA and disease and case on uterine fibroid. Comput Biol Med 2023; 165:107331. [PMID: 37619322 DOI: 10.1016/j.compbiomed.2023.107331] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 07/24/2023] [Accepted: 08/07/2023] [Indexed: 08/26/2023]
Abstract
Long non-coding RNAs (lncRNAs) play crucial regulatory roles in various cellular processes, including gene expression, chromatin remodeling, and protein localization. Dysregulation of lncRNAs has been linked to several diseases, making it essential to understand their functions in disease mechanisms and therapeutic strategies. However, traditional experimental methods for studying lncRNA function are time-consuming, expensive, and offer limited insights. In recent years, computational methods have emerged as valuable tools for predicting lncRNA functions and their associations with diseases. However, many existing methods focus on constructing separate networks for lncRNA and disease similarity, resulting in information loss and insufficient processing capacity for isolated nodes. To address this, we developed 'RGLD' by combining random walk with restart (RWR), Graph Neural Network (GNN), and Graph Attention Networks (GAT) to predict lncRNA-disease associations in a heterogeneous network. RGLD achieved an AUC of 0.88, outperforming other methods. It can also predict novel associations between lncRNAs and diseases. RGLD identified HOTAIR, MEG3, and PVT1 as lncRNAs associated with uterine fibroids. Biological experiments directly or indirectly verified the involvement of these three lncRNAs in uterine fibroids, validating the accuracy of RGLD's predictions. Furthermore, we extensively discussed the functions of the target genes regulated by these lncRNAs in uterine fibroids, providing evidence for their role in the development and progression of the disease.
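As a rough illustration of the random walk with restart step named in this abstract, the sketch below runs a generic RWR on a toy lncRNA-disease graph. It is not the RGLD implementation: the node names, the restart probability of 0.5, and the choice to send an isolated node's probability mass back to the seeds are all assumptions made for the example.

```python
def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-12, max_iter=1000):
    """Generic RWR on an unweighted graph given as {node: [neighbours]}.

    Returns the stationary visiting probability of every node for a walker
    that jumps back to the seed set with probability `restart` at each step.
    """
    nodes = sorted(adj)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            moving = (1.0 - restart) * p[u]
            if adj[u]:
                # Spread the walking mass evenly over the neighbours.
                share = moving / len(adj[u])
                for v in adj[u]:
                    nxt[v] += share
            else:
                # Assumption: an isolated node returns its mass to the seeds.
                for n in nodes:
                    nxt[n] += moving * p0[n]
        diff = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if diff < tol:
            break
    return p


# Hypothetical toy network: lncRNA and disease nodes in one heterogeneous graph.
toy_graph = {
    "disease_X": ["lnc_A", "lnc_B"],
    "disease_Y": ["lnc_B"],
    "lnc_A": ["disease_X"],
    "lnc_B": ["disease_X", "disease_Y"],
    "lnc_C": [],  # isolated node, never reached from the seed
}
scores = random_walk_with_restart(toy_graph, seeds={"disease_X"})
for node in sorted(scores, key=scores.get, reverse=True):
    print(f"{node}: {scores[node]:.4f}")
```

Nodes closer to the seed disease receive higher scores, which is why RWR profiles are often used as network-aware features. A plain RWR like this one leaves isolated nodes at zero, the kind of limitation the abstract says existing methods struggle with.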
Affiliation(s)
- Qing-Jing Sheng
- Department of Gynecology, Shanghai First Maternity and Infant Hospital, School of Medicine, Tongji University, Shanghai, China; Shanghai Key Laboratory of Maternal and Fetal Medicine, Shanghai First Maternity and Infant Hospital, Shanghai, China
| | - Yuan Tan
- Department of Integrated Traditional Chinese Medicine (TCM) & Western Medicine, Shanghai First Maternity and Infant Hospital, School of Medicine, Tongji University, Shanghai, China; Shanghai Key Laboratory of Maternal and Fetal Medicine, Shanghai First Maternity and Infant Hospital, Shanghai, China
| | - Liyuan Zhang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
| | - Zhi-Ping Wu
- Department of Gynecology, Shanghai First Maternity and Infant Hospital, School of Medicine, Tongji University, Shanghai, China; Shanghai Key Laboratory of Maternal and Fetal Medicine, Shanghai First Maternity and Infant Hospital, Shanghai, China
| | - Beiying Wang
- Department of Gynecology, Shanghai First Maternity and Infant Hospital, School of Medicine, Tongji University, Shanghai, China; Shanghai Key Laboratory of Maternal and Fetal Medicine, Shanghai First Maternity and Infant Hospital, Shanghai, China
| | - Xiao-Ying He
- Department of Gynecology, Shanghai First Maternity and Infant Hospital, School of Medicine, Tongji University, Shanghai, China; Shanghai Key Laboratory of Maternal and Fetal Medicine, Shanghai First Maternity and Infant Hospital, Shanghai, China.
| |
|
14
|
Wang L, Ding X, Qiu X. Mechanism of breast cancer immune microenvironment in prognosis of heart failure. Comput Biol Med 2023; 164:107339. [PMID: 37586207 DOI: 10.1016/j.compbiomed.2023.107339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Revised: 07/15/2023] [Accepted: 08/07/2023] [Indexed: 08/18/2023]
Abstract
The treatment of breast cancer can potentially impose a burden on the heart, leading to an increased risk of heart failure. Studies have shown that more than half of breast cancer patients die from non-tumor-related causes, with cardiovascular disease (CVD) being the leading cause of death. However, the underlying mechanism linking breast cancer prognosis and heart failure remains unclear. To investigate this, we conducted an analysis where we compared the differentially expressed genes (DEGs) in early and advanced breast cancer with genes associated with heart failure. This analysis revealed 18 genes that overlapped between the two conditions, with 15 of them being related to immune function. This suggests that immune pathways may play a role in the prognosis of breast cancer patients with heart failure. Using gene expression data from 1260 breast cancer patients, we further examined the impact of these 15 genes on survival time. Additionally, through enrichment analysis, we explored the functions and pathways associated with these genes in relation to breast cancer and heart failure. By constructing a transformer model, we discovered that the expression patterns of these 15 genes can accurately predict the occurrence of heart failure. The model achieved an AUC of 0.86 and an AUPR of 0.91. Moreover, through analysis of single-cell sequencing data from breast cancer patients undergoing PD-1 treatment and experiencing heart failure, we identified a significant number of cell-type-specific genes that were shared between both diseases. This suggests that changes in gene expression in immune cells following breast cancer treatment may be associated with the development of heart failure.
Affiliation(s)
- Lida Wang
- Department of Cardiology, The Second Hospital of Dalian Medical University, Dalian, China.
| | - Xiaolei Ding
- Department of Medical Oncology, The Second Hospital of Dalian Medical University, Dalian, China.
| | - Xun Qiu
- Department of Medical Oncology, The Second Hospital of Dalian Medical University, Dalian, China.
| |
|
15
|
Liu D, Yao L, Ding X, Zhou H. Multi-omics immune regulatory mechanisms in lung adenocarcinoma metastasis and survival time. Comput Biol Med 2023; 164:107333. [PMID: 37586202 DOI: 10.1016/j.compbiomed.2023.107333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 07/23/2023] [Accepted: 08/07/2023] [Indexed: 08/18/2023]
Abstract
Lung adenocarcinoma (LUAD) is the most common type of lung cancer. Despite previous research on immune mechanisms and related molecules in LUAD, the specific regulatory mechanisms of these molecules in the immune microenvironment remain unclear. Furthermore, the impact of regulatory genes or RNA on LUAD metastasis and survival time is yet to be understood. To address these gaps, we collected a substantial amount of data, including 17,226 gene expression profiles from 1,018 samples, 370,640 methylation sites from 461 samples, and 248 miRNAs from 513 samples. Our aim was to explore the genes, miRNAs, and methylation sites associated with LUAD progression. Leveraging the regulatory functions of miRNAs and methylation sites, we identified target and regulated genes. Through the utilization of LASSO and survival analysis, we pinpointed 22 key genes that play pivotal roles in the immune regulatory mechanism of LUAD. Notably, the expression levels of these 22 genes demonstrated significant discriminatory power in predicting LUAD patient survival time. Additionally, our deep learning model accurately predicted distant metastasis in LUAD patients using the expression levels of these genes. Further pathway enrichment analysis revealed that these 22 genes are significantly enriched in pathways closely linked to LUAD progression. Through Immune Infiltration Assay, we observed that T cell CD4 memory resting, monocytes, and macrophages.M2 were the three most abundant cell types in the immune microenvironment of LUAD. These cells are known to play crucial roles in tumor growth, invasion, and metastasis. Single-cell data analysis further validated the functional significance of these genes, indicating their involvement not only in immune cells but also in epithelial cells, showcasing significant differential expression. 
Overall, this study sheds light on the regulatory mechanisms underlying the immune microenvironment of LUAD by identifying key genes associated with LUAD progression. The findings provide insights into potential prognostic markers and therapeutic targets.
Affiliation(s)
- Dan Liu
- Department of Medical Oncology, The Second Hospital of Dalian Medical University, Dalian, China
| | - Lulu Yao
- Department of Medical Oncology, The Second Hospital of Dalian Medical University, Dalian, China
| | - Xiaolei Ding
- Department of Medical Oncology, The Second Hospital of Dalian Medical University, Dalian, China.
| | - Huan Zhou
- Department of Medical Oncology, The Second Hospital of Dalian Medical University, Dalian, China.
| |
|
16
|
Wang C, Yuan C, Wang Y, Chen R, Shi Y, Zhang T, Xue F, Patti GJ, Wei L, Hou Q. MPI-VGAE: protein-metabolite enzymatic reaction link learning by variational graph autoencoders. Brief Bioinform 2023; 24:bbad189. [PMID: 37225420 PMCID: PMC10359079 DOI: 10.1093/bib/bbad189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2023] [Revised: 04/10/2023] [Accepted: 04/27/2023] [Indexed: 05/26/2023] Open
Abstract
Enzymatic reactions are crucial to explore the mechanistic function of metabolites and proteins in cellular processes and to understand the etiology of diseases. The increasing number of interconnected metabolic reactions allows the development of in silico deep learning-based methods to discover new enzymatic reaction links between metabolites and proteins to further expand the landscape of existing metabolite-protein interactome. Computational approaches to predict the enzymatic reaction link by metabolite-protein interaction (MPI) prediction are still very limited. In this study, we developed a Variational Graph Autoencoders (VGAE)-based framework to predict MPI in genome-scale heterogeneous enzymatic reaction networks across ten organisms. By incorporating molecular features of metabolites and proteins as well as neighboring information in the MPI networks, our MPI-VGAE predictor achieved the best predictive performance compared to other machine learning methods. Moreover, when applying the MPI-VGAE framework to reconstruct hundreds of metabolic pathways, functional enzymatic reaction networks and a metabolite-metabolite interaction network, our method showed the most robust performance among all scenarios. To the best of our knowledge, this is the first MPI predictor by VGAE for enzymatic reaction link prediction. Furthermore, we implemented the MPI-VGAE framework to reconstruct the disease-specific MPI network based on the disrupted metabolites and proteins in Alzheimer's disease and colorectal cancer, respectively. A substantial number of novel enzymatic reaction links were identified. We further validated and explored the interactions of these enzymatic reactions using molecular docking. These results highlight the potential of the MPI-VGAE framework for the discovery of novel disease-related enzymatic reactions and facilitate the study of the disrupted metabolisms in diseases.
Affiliation(s)
- Cheng Wang
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China
- National Institute of Health Data Science of China, Shandong University, Jinan, 250000, China
| | - Chuang Yuan
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China
- National Institute of Health Data Science of China, Shandong University, Jinan, 250000, China
| | - Yahui Wang
- Department of Chemistry, Washington University in St. Louis, St. Louis, MO, 63130, USA
- Center for Metabolomics and Isotope Tracing, Washington University in St. Louis, St. Louis, MO, 63130, USA
| | - Ranran Chen
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China
- National Institute of Health Data Science of China, Shandong University, Jinan, 250000, China
| | - Yuying Shi
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China
- National Institute of Health Data Science of China, Shandong University, Jinan, 250000, China
| | - Tao Zhang
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China
- National Institute of Health Data Science of China, Shandong University, Jinan, 250000, China
| | - Fuzhong Xue
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China
- National Institute of Health Data Science of China, Shandong University, Jinan, 250000, China
| | - Gary J Patti
- Department of Chemistry, Washington University in St. Louis, St. Louis, MO, 63130, USA
- Department of Medicine, Washington University in St. Louis, St. Louis, MO, 63130, USA
- Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO, 63130, USA
- Center for Metabolomics and Isotope Tracing, Washington University in St. Louis, St. Louis, MO, 63130, USA
| | - Leyi Wei
- School of Software, Shandong University, Jinan, 250100, China
| | - Qingzhen Hou
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, 250012, China
- National Institute of Health Data Science of China, Shandong University, Jinan, 250000, China
| |
|
17
|
Soleymani Babadi F, Razaghi-Moghadam Z, Zare-Mirakabad F, Nikoloski Z. Prediction of metabolite-protein interactions based on integration of machine learning and constraint-based modeling. Bioinformatics Advances 2023; 3:vbad098. [PMID: 37521309 PMCID: PMC10374491 DOI: 10.1093/bioadv/vbad098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Revised: 06/28/2023] [Accepted: 07/15/2023] [Indexed: 08/01/2023]
Abstract
Motivation: Metabolite-protein interactions play an important role in regulating protein functions and metabolism. Yet, predictions of metabolite-protein interactions using genome-scale metabolic networks are lacking. Here, we fill this gap by presenting a computational framework, termed SARTRE, that employs features corresponding to shadow prices determined in the context of flux variability analysis to predict metabolite-protein interactions using supervised machine learning. Results: By using gold standards for metabolite-protein interactomes and well-curated genome-scale metabolic models of Escherichia coli and Saccharomyces cerevisiae, we found that the implementation of SARTRE with random forest classifiers accurately predicts metabolite-protein interactions, supported by an average area under the receiver operating curve of 0.86 and 0.85, respectively. Ranking of features based on their importance for classification demonstrated the key role of shadow prices in predicting metabolite-protein interactions. The quality of predictions is further supported by the excellent agreement of the organism-specific classifiers on unseen interactions shared between the two model organisms. Further, predictions from SARTRE are highly competitive against those obtained from a recent deep-learning approach relying on a variety of protein and metabolite features. Together, these findings show that features extracted from constraint-based analyses of metabolic networks pave the way for understanding the functional roles of the interactions between proteins and small molecules. Availability and implementation: https://github.com/fayazsoleymani/SARTRE.
Affiliation(s)
- Fayaz Soleymani Babadi
- Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran
| | - Zahra Razaghi-Moghadam
- Systems Biology and Mathematical Biology, Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany
| | - Fatemeh Zare-Mirakabad
- Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran
| | - Zoran Nikoloski
- Corresponding author. Bioinformatics Department, Institute of Biochemistry and Biology, University of Potsdam, Karl-Liebknecht-Str. 24-25, 14476 Potsdam, Germany. E-mail:
| |
|
18
|
Han L, Chen S, Luan Z, Fan M, Wang Y, Sun G, Dai G. Immune function of colon cancer associated miRNA and target genes. Front Immunol 2023; 14:1203070. [PMID: 37465677 PMCID: PMC10351377 DOI: 10.3389/fimmu.2023.1203070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2023] [Accepted: 05/15/2023] [Indexed: 07/20/2023] Open
Abstract
Introduction: Colon cancer is a complex disease that involves intricate interactions between cancer cells and the immune microenvironment. MicroRNAs (miRNAs) have recently emerged as critical regulators of gene expression in cancer, including colon cancer. There is increasing evidence suggesting that miRNA dysregulation plays a crucial role in modulating the immune microenvironment of intestinal cancer. In particular, miRNAs regulate immune cell activation, differentiation, and function, as well as cytokine and chemokine production in intestinal cancer. It is urgent to fully investigate the potential role of intestinal cancer-related miRNAs in shaping the immune microenvironment. Methods: This paper therefore aims to identify miRNAs that are potentially associated with colon cancer and regulate a large number of genes related to immune function. We explored the role of these genes in colon cancer patient prognosis, immune infiltration, and tumor purity based on data from 174 colon cancer patients through a convolutional neural network, survival analysis, and multiple analysis tools. Results: Our findings suggest that miRNA-regulated genes play important roles in resting CD4 memory T cells, M2 macrophages, and activated mast cells, and that they are concentrated in the cytokine-cytokine receptor interaction pathway. Discussion: Our study enhances our understanding of the underlying mechanisms of intestinal cancer and provides new insights into the development of effective therapies. Additionally, identification of miRNA biomarkers could aid in diagnosis and prognosis, as well as guide personalized treatment strategies for patients with intestinal cancer.
Affiliation(s)
- Lu Han
- Department of Oncology, The Fifth Medical Center of Chinese PLA General Hospital, Beijing, China
- Medical School of Chinese PLA, Beijing, China
| | - Shiyun Chen
- Department of Oncology, The Fifth Medical Center of Chinese PLA General Hospital, Beijing, China
- Medical School of Chinese PLA, Beijing, China
| | - Zhe Luan
- Department of Gastroenterology and Hepatology, The First Medical Center of Chinese PLA General Hospital, Beijing, China
| | - Mengjiao Fan
- Department of Oncology, The Fifth Medical Center of Chinese PLA General Hospital, Beijing, China
- Medical School of Chinese PLA, Beijing, China
| | - Yanrong Wang
- Department of Oncology, The Fifth Medical Center of Chinese PLA General Hospital, Beijing, China
- Medical School of Chinese PLA, Beijing, China
| | - Gang Sun
- Department of Gastroenterology and Hepatology, The First Medical Center of Chinese PLA General Hospital, Beijing, China
| | - Guanghai Dai
- Department of Oncology, The Fifth Medical Center of Chinese PLA General Hospital, Beijing, China
| |
|
19
|
Wei W, Su Y. Function of CD8 +, conventional CD4 +, and regulatory CD4 + T cell identification in lung cancer. Comput Biol Med 2023; 160:106933. [PMID: 37156220 DOI: 10.1016/j.compbiomed.2023.106933] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2023] [Revised: 04/06/2023] [Accepted: 04/13/2023] [Indexed: 05/10/2023]
Abstract
Lung cancer is the malignant tumor with the highest mortality rate in the world, and tumors show marked internal heterogeneity. Single-cell sequencing enables researchers to obtain information about cell type, state, subpopulation distribution, and cell-cell communication in the tumor microenvironment at the cellular level. However, because of limited sequencing depth, some genes with low expression cannot be detected. As a result, most immune cell-specific genes go unrecognized, which hampers the functional identification of immune cells. In this paper, we used single-cell sequencing data of 12,346 T cells from 14 treatment-naïve non-small-cell lung cancer patients to identify immune cell-specific genes and infer the function of three types of T cells. The method, named GRAPH-LC, does this with a gene interaction network and graph learning: graph learning methods are used to extract gene features, and a dense neural network is used to identify immune cell-specific genes. Experiments with 10-fold cross-validation show that the AUROC and AUPR reached at least 0.802 and 0.815, respectively, in identifying cell-specific genes of the three types of T cells. We also performed functional enrichment analysis on the top 15 expressed genes, obtaining 95 GO terms and 39 KEGG pathways related to the three types of T cells. The use of this technology will help to deepen understanding of the mechanisms underlying the occurrence and development of lung cancer, find new diagnostic markers and therapeutic targets, and provide a theoretical reference for the precise treatment of lung cancer patients in the future.
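For readers unfamiliar with the two metrics reported in this abstract, the sketch below shows one common way to compute them from labels and predicted scores. It is a generic illustration, not the GRAPH-LC evaluation code: the function names and the six toy predictions are made up, and AUPR is estimated here by average precision, a standard step-wise approximation that assumes no tied scores.

```python
def auroc(labels, scores):
    """AUROC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


# Hypothetical predictions for six genes (1 = cell-specific, 0 = not).
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_auroc = auroc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)
print(f"AUROC = {toy_auroc:.4f}, average precision = {toy_ap:.4f}")
```

In a 10-fold setting, these functions would be applied to the held-out predictions of each fold and the results averaged or, as the abstract's "at least" suggests, reported as a minimum across tasks.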
Affiliation(s)
- Wei Wei
- Department of Lung Cancer, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin's Clinical Research Center for Cancer, Tianjin Lung Cancer Center, Tianjin, China
| | - Yanjun Su
- Department of Lung Cancer, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin's Clinical Research Center for Cancer, Tianjin Lung Cancer Center, Tianjin, China.
| |
|
20
|
Liu J, Qu J, Xu L, Qiao C, Shao G, Liu X, He H, Zhang J. Prediction of liver cancer prognosis based on immune cell marker genes. Front Immunol 2023; 14:1147797. [PMID: 37180166 PMCID: PMC10174299 DOI: 10.3389/fimmu.2023.1147797] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/19/2023] [Accepted: 02/24/2023] [Indexed: 05/15/2023] Open
Abstract
Introduction: Monitoring the response after treatment of liver cancer and adjusting the treatment strategy in time are crucial to improving the survival rate of liver cancer. At present, the clinical monitoring of liver cancer after treatment is mainly based on serum markers and imaging. Morphological evaluation has limitations, such as the inability to measure small tumors and the poor repeatability of measurement, and it is not applicable to cancer evaluation after immunotherapy or targeted treatment. The determination of serum markers is greatly affected by the environment and cannot accurately evaluate the prognosis. With the development of single-cell sequencing technology, a large number of immune cell-specific genes have been identified. Immune cells and the microenvironment play an important role in the process of prognosis. We speculate that the expression changes of immune cell-specific genes can indicate the process of prognosis. Method: This paper therefore first screened out the immune cell-specific genes related to liver cancer, and then built a deep learning model based on the expression of these genes to predict metastasis and the survival time of liver cancer patients. We verified and compared the model on a data set of 372 patients with liver cancer. Result: The experiments found that our model is significantly superior to other methods, and can accurately identify whether liver cancer patients have metastasis and predict the survival time of liver cancer patients according to the expression of immune cell-specific genes. Discussion: We found that these immune cell-specific genes participate in multiple cancer-related pathways. We fully explored the function of these genes, which would support the development of immunotherapy for liver cancer.
Affiliation(s)
- Jianfei Liu
- Department of Interventional Therapy, The First Affiliated Hospital of Dalian Medical University, Dalian, Liaoning, China
| | - Junjie Qu
- Interventional Medicine Center, Affiliated Zhongshan Hospital of Dalian University, Dalian, Liaoning, China
| | - Lingling Xu
- Department of Medical Oncology, The Second Affiliated Hospital of Dalian Medical University, Dalian, China
| | - Chen Qiao
- Department of Interventional Therapy, The First Affiliated Hospital of Dalian Medical University, Dalian, Liaoning, China
| | - Guiwen Shao
- Department of Interventional Therapy, The First Affiliated Hospital of Dalian Medical University, Dalian, Liaoning, China
| | - Xin Liu
- Department of Medical Oncology, The Second Affiliated Hospital of Dalian Medical University, Dalian, China
| | - Hui He
- Department of Laparoscopic Surgery, The First Affiliated Hospital of Dalian Medical University, Dalian, Liaoning, China
| | - Jian Zhang
- Department of Interventional Therapy, The First Affiliated Hospital of Dalian Medical University, Dalian, Liaoning, China
| |
|
21
|
Wang C, Yuan C, Wang Y, Chen R, Shi Y, Patti GJ, Hou Q. Genome-scale enzymatic reaction prediction by variational graph autoencoders. bioRxiv 2023:2023.03.08.531729. [PMID: 36945484 PMCID: PMC10028866 DOI: 10.1101/2023.03.08.531729] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/14/2023]
Abstract
Background: Enzymatic reaction networks are crucial to explore the mechanistic function of metabolites and proteins in biological systems and to understand the etiology of diseases and potential targets for drug discovery. The increasing number of metabolic reactions allows the development of deep learning-based methods to discover new enzymatic reactions, which will expand the landscape of existing enzymatic reaction networks to investigate the disrupted metabolisms in diseases. Results: In this study, we propose the MPI-VGAE framework to predict metabolite-protein interactions (MPI) in a genome-scale heterogeneous enzymatic reaction network across ten organisms with thousands of enzymatic reactions. We improved the Variational Graph Autoencoders (VGAE) model to incorporate both molecular features of metabolites and proteins as well as neighboring features to achieve the best predictive performance of MPI. The MPI-VGAE framework showed robust performance in the reconstruction of hundreds of metabolic pathways and five functional enzymatic reaction networks. The MPI-VGAE framework was also applied to a homogenous metabolic reaction network and performed as well as other state-of-the-art methods. Furthermore, the MPI-VGAE framework could be implemented to reconstruct the disease-specific MPI network based on hundreds of disrupted metabolites and proteins in Alzheimer's disease and colorectal cancer, respectively. A substantial number of new potential enzymatic reactions were predicted and validated by molecular docking. These results highlight the potential of the MPI-VGAE framework for the discovery of novel disease-related enzymatic reactions and drug targets in real-world applications. Data availability and implementation: The MPI-VGAE framework and datasets are publicly accessible on GitHub at https://github.com/mmetalab/mpi-vgae.
Author Biographies: Cheng Wang received his Ph.D. in Chemistry from The Ohio State University, USA. He is currently an Assistant Professor in the School of Public Health at Shandong University, China. His research interests include bioinformatics and machine learning-based approaches with applications to biomedical networks. Chuang Yuan is a research assistant at Shandong University. He obtained his MS degree in Biology at the University of Science and Technology of China. His research interests include biochemistry & molecular biology, cell biology, biomedicine, bioinformatics, and computational biology. Yahui Wang is a PhD student in the Department of Chemistry at Washington University in St. Louis. Her research interests include biochemistry, mass spectrometry-based metabolomics, and cancer metabolism. Ranran Chen is a master's student in the School of Public Health at Shandong University, China. Yuying Shi is a master's student in the School of Public Health at Shandong University, China. Gary J. Patti is the Michael and Tana Powell Professor at Washington University in St. Louis, where he holds appointments in the Department of Chemistry and the Department of Medicine. He is also the Senior Director of the Center for Metabolomics and Isotope Tracing at Washington University. His research interests include metabolomics, bioinformatics, high-throughput mass spectrometry, environmental health, cancer, and aging. Leyi Wei received his Ph.D. in Computer Science from Xiamen University, China. He is currently a Professor in the School of Software at Shandong University, China. His research interests include machine learning and its applications to bioinformatics. Qingzhen Hou received his Ph.D. in the Centre for Integrative Bioinformatics VU (IBIVU) from Vrije Universiteit Amsterdam, the Netherlands. Since 2020, he has served as the head of the Bioinformatics Center in the National Institute of Health Data Science of China and Assistant Professor in the School of Public Health, Shandong University, China. His areas of research are bioinformatics and computational biophysics.
Key points:
- Genome-scale heterogeneous networks of metabolite-protein interaction (MPI) based on thousands of enzymatic reactions across ten organisms were constructed semi-automatically.
- An enzymatic reaction prediction method called Metabolite-Protein Interaction Variational Graph Autoencoders (MPI-VGAE) was developed and optimized to achieve higher performance compared with existing machine learning methods by using both molecular features of metabolites and proteins.
- MPI-VGAE is broadly useful for applications involving the reconstruction of metabolic pathways, functional enzymatic reaction networks, and homogenous networks (e.g., metabolic reaction networks).
- By applying MPI-VGAE to Alzheimer's disease and colorectal cancer, we obtained several novel disease-related protein-metabolite reactions with biological meaning. Moreover, we further investigated the binding details of protein-metabolite interactions using molecular docking approaches, which provided useful information for disease mechanism and drug design.
|
22
|
Kurbatov I, Dolgalev G, Arzumanian V, Kiseleva O, Poverennaya E. The Knowns and Unknowns in Protein-Metabolite Interactions. Int J Mol Sci 2023; 24:ijms24044155. [PMID: 36835565 PMCID: PMC9964805 DOI: 10.3390/ijms24044155] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/08/2023] [Revised: 02/11/2023] [Accepted: 02/15/2023] [Indexed: 02/22/2023]
Abstract
Increasing attention has been focused on the study of protein-metabolite interactions (PMI), which play a key role in regulating protein functions and directing an orchestra of cellular processes. The investigation of PMIs is complicated by the fact that many such interactions are extremely short-lived, which requires very high resolution in order to detect them. As in the case of protein-protein interactions, protein-metabolite interactions are still not clearly defined. Existing assays for detecting protein-metabolite interactions have an additional limitation in the form of a limited capacity to identify interacting metabolites. Thus, although recent advances in mass spectrometry allow the routine identification and quantification of thousands of proteins and metabolites today, they still need to be improved to provide a complete inventory of biological molecules, as well as all interactions between them. Multiomic studies aimed at deciphering the implementation of genetic information often end with the analysis of changes in metabolic pathways, as they constitute one of the most informative phenotypic layers. In this approach, the quantity and quality of knowledge about PMIs become vital to establishing the full scope of crosstalk between the proteome and the metabolome in a biological object of interest. In this review, we analyze the current state of investigation into the detection and annotation of protein-metabolite interactions, describe the recent progress in developing associated research methods, and attempt to deconstruct the very term "interaction" to advance the field of interactomics further.
|
23
|
Cheng N, Liu J, Chen C, Zheng T, Li C, Huang J. Prediction of lung cancer metastasis by gene expression. Comput Biol Med 2023; 153:106490. [PMID: 36638618 DOI: 10.1016/j.compbiomed.2022.106490] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/18/2022] [Revised: 12/14/2022] [Accepted: 12/27/2022] [Indexed: 12/31/2022]
Abstract
Tumor metastasis is the main cause of death in cancer patients. Early prediction of tumor metastasis can allow for timely intervention. At present, research on tumor metastasis mainly focuses on manual diagnosis by imaging or diagnosis by computational methods. With the deterioration of the tumor, gene expression levels in blood change greatly. It is feasible to measure the transcripts of key genes to predict whether cancer will metastasize. Therefore, in this paper, we obtained gene expression data from 226 patients from TCGA. These data included 239,322 transcripts. Background screening and LASSO analysis were used to select 31 transcripts as features. Finally, a deep neural network (DNN) was used to determine whether or not lung cancer would metastasize. We compared our methods with several other methods and found that our method achieved the best precision. In addition, in a previous study, we identified 7 genes that play a vital role in lung cancer. We added those gene transcripts into the DNN and found that the AUC and AUPR of the model were increased.
Affiliation(s)
- Nitao Cheng
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
| | - Junliang Liu
- Faculty of Computing, Harbin Institute of Technology, Harbin, China
| | - Chen Chen
- Department of Biological Repositories, Zhongnan Hospital of Wuhan University, China
| | - Tang Zheng
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
| | - Changsheng Li
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
| | - Jingyu Huang
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China.
| |
|
24
|
Li C, Tang H, Yang Z, Tang Z, Cheng N, Huang J, Zhou X. Mechanism of CAV and CAVIN Family Genes in Acute Lung Injury based on DeepGENE. Curr Gene Ther 2023; 23:72-80. [PMID: 36043785 DOI: 10.2174/1566523222666220829140649] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2022] [Revised: 07/01/2022] [Accepted: 07/06/2022] [Indexed: 02/08/2023]
Abstract
BACKGROUND The fatality rate of acute lung injury (ALI) is as high as 40% to 60%. Although various factors, such as sepsis, trauma, pneumonia, burns, blood transfusion, cardiopulmonary bypass, and pancreatitis, can induce ALI, not all patients with these risk factors eventually develop ALI. The rate of developing ALI is not high, and the outcomes of ALI patients vary, indicating that it is related to genetic differences between individuals. In a previous study, we found multiple functions of cavin-2 in lung function. In addition, many other studies have revealed that CAV1 is a critical regulator of lung injury. Due to the strong relationship between cavin-2 and CAV1, we suspect that cavin-2 is also associated with ALI. Furthermore, we are curious about the role of the CAV family and cavin family genes in ALI. METHODS To reveal the mechanism of CAV and CAVIN family genes in ALI, we propose DeepGENE to predict whether CAV and CAVIN family genes are associated with ALI. This method constructs a gene interaction network and extracts gene expression in 84 tissues. We divided these features into two groups and used two network encoders to encode and learn the features. RESULTS Compared with DNN, GBDT, RF and KNN, the AUC of DeepGENE increased by 7.89%, 16.84%, 20.19% and 32.01%, respectively. The AUPR scores increased by 8.05%, 15.58%, 22.56% and 23.34%, respectively. DeepGENE shows that CAVIN-1, CAVIN-2, CAVIN-3 and CAV2 are related to ALI. CONCLUSION DeepGENE is a reliable method for identifying acute lung injury-related genes. Multiple CAV and CAVIN family genes are associated with acute lung injury-related genes through multiple pathways and gene functions.
Affiliation(s)
- Changsheng Li
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
| | - Hexiao Tang
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
| | - Zetian Yang
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
| | - Zheng Tang
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
| | - Nitao Cheng
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
| | - Jingyu Huang
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
| | - Xuefeng Zhou
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
| |
|
25
|
Walther D. Specifics of Metabolite-Protein Interactions and Their Computational Analysis and Prediction. Methods Mol Biol 2023; 2554:179-197. [PMID: 36178627 DOI: 10.1007/978-1-0716-2624-5_12] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/16/2023]
Abstract
Computational approaches to the characterization and prediction of compound-protein interactions have a long research history and are well established, driven primarily by the needs of drug development. While, in principle, many of the computational methods developed in the context of drug development can also be applied directly to the investigation of metabolite-protein interactions, the interactions of metabolites with proteins (enzymes) are characterized by a number of particularities that result from their natural evolutionary origin and their biological and biochemical roles, as well as from a different problem setting when investigating them. In this review, these special aspects will be highlighted and recent research on them and developed computational approaches presented, along with available resources. They concern, among others, binding promiscuity, allostery, the role of posttranslational modifications, molecular steering and crowding effects, and metabolic conversion rate predictions. Recent breakthroughs in the field of protein structure prediction and newly developed machine learning techniques are being discussed as a tremendous opportunity for developing a more detailed molecular understanding of metabolism.
Affiliation(s)
- Dirk Walther
- Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany.
| |
|
26
|
de Souza LP, Fernie AR. Databases and Tools to Investigate Protein-Metabolite Interactions. Methods Mol Biol 2023; 2554:231-249. [PMID: 36178629 DOI: 10.1007/978-1-0716-2624-5_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Protein-metabolite interactions (PMIs) are directly responsible for the regulation of numerous processes. From the direct regulation of enzymes to complex developmental processes intermediated by hormones, PMIs are central to understanding the molecular mechanisms of important physiological phenomena. Still, demonstrating such interactions experimentally has proven an arduous task. We discuss here some of the current technologies contributing to expanding our knowledge of PMIs, with particular emphasis on platforms and databases for exploring the highly heterogeneous nature of characterized PMIs, which are likely to be an essential resource for the development of new computational approaches to predict and validate interactions based on large-scale PMI screenings.
Affiliation(s)
| | - Alisdair R Fernie
- Max-Planck-Institute of Molecular Plant Physiology, Potsdam-Golm, Germany.
| |
|
27
|
Tang H, Sun L, Huang J, Yang Z, Li C, Zhou X. The mechanism and biomarker function of Cavin-2 in lung ischemia-reperfusion injury. Comput Biol Med 2022; 151:106234. [PMID: 36335812 DOI: 10.1016/j.compbiomed.2022.106234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2022] [Revised: 10/01/2022] [Accepted: 10/22/2022] [Indexed: 12/27/2022]
Abstract
BACKGROUND Lung ischemia-reperfusion injury (LIRI) is one of the most predominant complications of ischemic lung disease. Cavin-2 has emerged as a regulator of a variety of cellular processes, including endocytosis, lipid homeostasis, signal transduction and tumorigenesis, but the function of Cavin-2 in LIRI is unknown. The purpose of this study was to determine the predictive potential of Cavin-2 in protecting against lung ischemia-reperfusion injury and its corresponding mechanisms. METHODS We found a strong relationship between Cavin-2 and multiple immune-related genes using a deep learning method. To reveal the mechanism of Cavin-2 in LIRI, the LIRI SD rat model was constructed to detect the expression of Cavin-2 in the lung tissue of SD rats after LIRI, and the expression of Cavin-2 in lung cell lines was also detected. The expression of IL-6, IL-10 and MDA in cells after Cavin-2 over-expression or knockdown was examined under hypoxic conditions. The expression levels of p-AKT, p-STAT3 and p-ERK1/2 were measured in Cavin-2 over-expressing cells under hypoxic-ischemia conditions, and then the corresponding blockers of AKT, STAT3 and ERK1/2 were given to verify whether they play a protective role in LIRI. RESULTS After hypoxia, the expression of Cavin-2 in rat lung tissues was significantly increased, and the cellular activity and IL-10 in Cavin-2 over-expressing cells were significantly higher than in the control group, while IL-6 and MDA were significantly lower than in the control group; these results were reversed in Cavin-2 knockdown cells. Meanwhile, the phosphorylation levels of AKT, STAT3, and ERK1/2 were significantly increased in Cavin-2 over-expressing cells after hypoxia. When AKT, STAT3, and ERK1/2 specific blockers were given, they lost their protective effect against LIRI.
CONCLUSIONS Cavin-2 shows biomarker potential in protecting lung from ischemia-reperfusion injury through the survivor activating factor enhancement (SAFE) and reperfusion injury salvage kinase (RISK) pathway.
Affiliation(s)
- Hexiao Tang
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, Hubei, China
| | - Linao Sun
- Tianjin Medical University, Tianjin, China
| | - Jingyu Huang
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, Hubei, China
| | - Zetian Yang
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, Hubei, China
| | - Changsheng Li
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, Hubei, China.
| | - Xuefeng Zhou
- Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, Hubei, China.
| |
|
28
|
Zhang L, Qin Q, Xu C, Zhang N, Zhao T. Identification of immune cell function in breast cancer by integrating multiple single-cell data. Front Immunol 2022; 13:1058239. [PMID: 36479102 PMCID: PMC9719918 DOI: 10.3389/fimmu.2022.1058239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Accepted: 10/31/2022] [Indexed: 11/22/2022]
Abstract
Breast cancer has now become the most commonly diagnosed cancer worldwide. It is a highly complex and heterogeneous disease that comprises distinct histological features and treatment responses. With the development of molecular biology and immunology, immunotherapy has become a new field of breast cancer treatment. Identifying cell-type-specific genes critical to the immune microenvironment contributes to breast cancer treatment. Single-cell RNA sequencing (scRNA-seq) technology could serve as a powerful tool to analyze cellular genetic information at single-cell resolution and to uncover the gene expression status of each cell, thus allowing comprehensive assessment of intercellular heterogeneity. Because of the influence of sample size and sequencing depth, the specificity of genes in different cell types for breast cancer cannot be fully revealed. Therefore, the present study integrated two public breast cancer scRNA-seq datasets to investigate the functions of different types of immune cells in the tumor microenvironment. We identified a total of five significantly differentially expressed genes in B cells, T cells and macrophages and explored their functions and immune mechanisms in breast cancer. Finally, we performed functional annotation analyses using the top fifteen differentially expressed genes in each immune cell type to discover the immune-related pathways and gene ontology (GO) terms.
Affiliation(s)
- Liyuan Zhang
- Department of Computer Science, Harbin Institute of Technology, Harbin, China
| | - Qiyuan Qin
- Department of Colorectal Surgery, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
| | - Chen Xu
- Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
| | - Ningyi Zhang
- Department of Computer Science, Harbin Institute of Technology, Harbin, China
| | - Tianyi Zhao
- School of Medicine and Health, Harbin Institute of Technology, Harbin, China
- *Correspondence: Tianyi Zhao,
| |
|
29
|
Zhang T, Lin Y, He W, Yuan F, Zeng Y, Zhang S. GCN-GENE: A novel method for prediction of coronary heart disease-related genes. Comput Biol Med 2022; 150:105918. [PMID: 36215847 DOI: 10.1016/j.compbiomed.2022.105918] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/13/2022] [Revised: 07/19/2022] [Accepted: 07/30/2022] [Indexed: 11/22/2022]
Abstract
Coronary heart disease is the most common heart disease. It can induce myocardial infarction, and its cause is closely tied to lifestyle and eating habits. The results of a large number of epidemiological studies at home and abroad show that the incidence of coronary heart disease has an obvious familial tendency. However, little is known about the genetic factors of coronary heart disease. Although genome-wide association analysis and gene knockout experiments have found some genes related to coronary heart disease, there are still a large number of genes potentially related to coronary heart disease that have not been discovered. Confirming them by biological experiments costs too much time and money. Therefore, it is urgent to identify genes related to coronary heart disease on a large scale by computational means, so as to conduct targeted biological experimental verification. This paper proposes a deep learning method based on biological networks for the identification of coronary heart disease-related genes. We constructed gene interaction networks and extracted gene expression levels in different tissues as features. Through the association information and expression characteristics between genes, we constructed a model of coronary heart disease-related genes. Through cross-validation, we found that our proposed GCN-GENE achieves an AUC of 0.75 and an AUPR of 0.78, which is more accurate than other methods, making it a reliable method for predicting coronary heart disease-related genes.
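The AUC and AUPR figures quoted in abstracts like the one above can be illustrated with a minimal Python sketch. It assumes binary labels (1 = disease-related gene) and real-valued prediction scores; the function names `roc_auc` and `average_precision` are hypothetical, and average precision is used here as one common estimate of AUPR, which may differ from how the cited authors computed it.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of the precision values at each positive hit."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy example (made-up values, not data from any cited study).
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
aupr = average_precision(labels, scores)
```

The pairwise AUC loop is quadratic in the number of genes, which is acceptable for illustration; a rank-based formulation would be preferable on large gene sets.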
Affiliation(s)
- Tong Zhang
- Department of Cardiology, The Sixth Affiliated Hospital, School of Medicine, South China University of Technology, Guangdong, China.
| | - Yixuan Lin
- Department of Cardiology, The Sixth Affiliated Hospital, School of Medicine, South China University of Technology, Guangdong, China.
| | - Weimin He
- Department of Cardiology, The Sixth Affiliated Hospital, School of Medicine, South China University of Technology, Guangdong, China.
| | - FengXin Yuan
- Department of Cardiology, The Sixth Affiliated Hospital, School of Medicine, South China University of Technology, Guangdong, China.
| | - Yu Zeng
- Department of Cardiology, The Sixth Affiliated Hospital, School of Medicine, South China University of Technology, Guangdong, China.
| | - Shihua Zhang
- College of Life Science and Health, Wuhan University of Science and Technology, Wuhan, China.
| |
|
30
|
Cai Y, Wu Q, Chen Y, Liu Y, Wang J. Predicting non-small cell lung cancer-related genes by a new network-based machine learning method. Front Oncol 2022; 12:981154. [PMID: 36203453 PMCID: PMC9530852 DOI: 10.3389/fonc.2022.981154] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2022] [Accepted: 07/25/2022] [Indexed: 12/03/2022]
Abstract
Lung cancer is the leading cause of cancer death globally, killing 1.8 million people yearly. Over 85% of lung cancer cases are non-small cell lung cancer (NSCLC). The tendency of lung cancer to run in families indicates that some genes are linked to lung cancer. Genes associated with NSCLC have been found by next-generation sequencing (NGS) and genome-wide association studies (GWAS). Many papers, however, neglected the complex information about interactions between gene pairs. Along with its high cost, GWAS analysis has an obvious drawback of false-positive results. To address these problems, computational techniques are used to offer researchers alternative and complementary low-cost disease–gene association findings. To help find NSCLC-related genes, we proposed a new network-based machine learning method, named deepRW, to predict genes linked to NSCLC. We first constructed a gene interaction network consisting of genes that are related and unrelated to NSCLC and used the DeepWalk and graph convolutional network (GCN) methods to learn gene–disease interactions. Finally, a deep neural network (DNN) was utilized as the prediction module to decide which genes are related to NSCLC. To evaluate the performance of deepRW, we ran tests with 10-fold cross-validation. The experimental results showed that our method greatly exceeded the existing methods. In addition, the effectiveness of each module in deepRW was demonstrated in comparative experiments.
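The 10-fold cross-validation mentioned in the abstract above can be sketched as follows. This is a generic illustration in Python, not the deepRW code: the helper name `k_fold_indices`, the fixed seed, and the sample count are assumptions made for the example.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)  # shuffle once so folds are random but reproducible
    folds = [idx[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, test


# Hypothetical usage: 25 genes, each used for testing exactly once.
splits = list(k_fold_indices(25, k=10))
```

Each gene appears in exactly one test fold, so averaging a metric over the ten folds uses every labeled gene for evaluation once while never testing on a gene seen in training for that fold.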
Affiliation(s)
- Yong Cai
- Department of Radiation Oncology, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, China
| | - Qiongya Wu
- Department of Radiation Oncology, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, China
| | - Yun Chen
- Department of Radiation Oncology, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, China
| | - Yu Liu
- Department of Radiation Oncology, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, China
- *Correspondence: Yu Liu, ; Jiying Wang,
| | - Jiying Wang
- Department of Oncology, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, China
- *Correspondence: Yu Liu, ; Jiying Wang,
| |
|
31
|
Skolnick J, Zhou H. Implications of the Essential Role of Small Molecule Ligand Binding Pockets in Protein-Protein Interactions. J Phys Chem B 2022; 126:6853-6867. [PMID: 36044742 PMCID: PMC9484464 DOI: 10.1021/acs.jpcb.2c04525] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/29/2022] [Revised: 08/18/2022] [Indexed: 11/28/2022]
Abstract
Protein-protein interactions (PPIs) and protein-metabolite interactions play a key role in many biochemical processes, yet they are often viewed as being independent. However, the fact that small molecule drugs have been successful in inhibiting PPIs suggests a deeper relationship between protein pockets that bind small molecules and PPIs. We demonstrate that 2/3 of PPI interfaces, including antibody-epitope interfaces, contain at least one significant small molecule ligand binding pocket. In a representative library of 50 distinct protein-protein interactions involving hundreds of mutations, >75% of hot spot residues overlap with small molecule ligand binding pockets. Hence, ligand binding pockets play an essential role in PPIs. In representative cases, evolutionary unrelated monomers that are involved in different multimeric interactions yet share the same pocket are predicted to bind the same metabolites/drugs; these results are confirmed by examples in the PDB. Thus, the binding of a metabolite can shift the equilibrium between monomers and multimers. This implicit coupling of PPI equilibria, termed "metabolic entanglement", was successfully employed to suggest novel functional relationships among protein multimers that do not directly interact. Thus, the current work provides an approach to unify metabolomics and protein interactomics.
Affiliation(s)
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, NW, Atlanta, Georgia 30332, United States
| | - Hongyi Zhou
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, NW, Atlanta, Georgia 30332, United States
| |
|
32
|
He X, Li WS, Qiu ZG, Zhang L, Long HM, Zhang GS, Huang YW, Zhan YM, Meng F. A computational method for large-scale identification of esophageal cancer-related genes. Front Oncol 2022; 12:982641. [PMID: 36052230 PMCID: PMC9425068 DOI: 10.3389/fonc.2022.982641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Accepted: 07/25/2022] [Indexed: 11/14/2022]
Abstract
The incidence of esophageal cancer has obvious genetic susceptibility. Identifying esophageal cancer-related genes plays a huge role in the prevention and treatment of esophageal cancer. Through various sequencing methods, researchers have found only a small number of genes associated with esophageal cancer. In order to improve the efficiency of esophageal cancer genetic susceptibility research, this paper proposes a method for large-scale identification of esophageal cancer-related genes by computational methods. This method fuses graph convolutional network and logical matrix factorization to effectively identify esophageal cancer-related genes through the association between genes. We call this method GCNLMF, which achieved an AUC of 0.927 and an AUPR of 0.86. Compared with five other methods, GCNLMF performed best. We conducted a case study of the top three predicted genes. Although the association of these three genes with esophageal cancer has not been reported in the database, studies by other researchers have shown that these three genes are significantly associated with esophageal cancer, which illustrates the accuracy of the prediction results of GCNLMF.
Affiliation(s)
- Xin He
- Department of Respiratory and Critical Care, The First Affiliated Hospital of Gannan Medical University, Ganzhou, China
| | - Wei-Song Li
- Department of pathology, The First Affiliated Hospital of Gannan Medical University, Ganzhou, China
| | - Zhen-Gang Qiu
- Department of Oncology, The First Affiliated Hospital of Gannan Medical University, Ganzhou, China
| | - Lei Zhang
- Department of Gastroenterology, The First Affiliated Hospital of Gannan Medical University, Ganzhou, China
| | - He-Ming Long
- Department of Oncology, The First Affiliated Hospital of Gannan Medical University, Ganzhou, China
| | - Gui-Sheng Zhang
- School of Basic Medicine, Gannan Medical University, Ganzhou, China
| | - Yang-Wen Huang
- School of Basic Medicine, Gannan Medical University, Ganzhou, China
| | - Yun-mei Zhan
- School of Basic Medicine, Gannan Medical University, Ganzhou, China
| | - Fan Meng
- Department of Gastroenterology, The First Affiliated Hospital of Gannan Medical University, Ganzhou, China
- *Correspondence: Fan Meng,
| |
|
33
|
Li M, Meng GX, Liu XW, Ma T, Sun G, He H. Deep-LC: A Novel Deep Learning Method of Identifying Non-Small Cell Lung Cancer-Related Genes. Front Oncol 2022; 12:949546. [PMID: 35936745 PMCID: PMC9353732 DOI: 10.3389/fonc.2022.949546] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/21/2022] [Accepted: 06/16/2022] [Indexed: 12/15/2022]
Abstract
According to statistics, lung cancer kills 1.8 million people each year and is the main cause of cancer mortality worldwide. Non-small cell lung cancer (NSCLC) accounts for over 85% of all lung cancers. Lung cancer has a strong genetic predisposition, demonstrating that the susceptibility and survival of lung cancer are related to specific genes. Genome-wide association studies (GWASs) and next-generation sequencing have been used to discover genes related to NSCLC. However, many studies ignored the intricate interaction information between gene pairs. In this paper, we proposed a novel deep learning method named Deep-LC for predicting NSCLC-related genes. First, we built a gene interaction network and used graph convolutional networks (GCNs) to extract features of genes and interactions between gene pairs. Then a simple convolutional neural network (CNN) module is used as the decoder to decide whether the gene is related to the disease. Deep-LC is an end-to-end method, and from the evaluation results, we can conclude that Deep-LC performs well in mining potential NSCLC-related genes and performs better than existing state-of-the-art methods.
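To make the GCN feature-extraction step described above more concrete, here is a minimal pure-Python sketch of a single graph convolution over a gene interaction network. It uses the widely cited propagation rule H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W); whether Deep-LC uses exactly this rule is an assumption, and the function names, toy adjacency matrix, and weights are all hypothetical.

```python
import math


def normalized_adjacency(adj):
    """Return D^(-1/2) (A + I) D^(-1/2) for a symmetric adjacency matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]  # degrees including the added self-loops
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain dense matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(adj, features, weights):
    """One graph convolution: aggregate neighbor features, project, apply ReLU."""
    propagated = matmul(normalized_adjacency(adj), features)
    return [[max(0.0, v) for v in row] for row in matmul(propagated, weights)]


# Toy network: two interacting genes with one feature each (made-up values).
adj = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0], [3.0]]
out = gcn_layer(adj, features, [[1.0]])
```

In this toy case each gene ends up with the average of its own and its neighbor's feature, which shows how a GCN layer mixes information along interaction edges before any decoder sees it.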
Affiliation(s)
| | | | | | | | - Ge Sun
- *Correspondence: Ge Sun, ; HongMei He,
| | | |
|
34
|
Chen Y, Sun X, Yang J. Prediction of Gastric Cancer-Related Genes Based on the Graph Transformer Network. Front Oncol 2022; 12:902616. [PMID: 35847949 PMCID: PMC9281472 DOI: 10.3389/fonc.2022.902616] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2022] [Accepted: 04/26/2022] [Indexed: 02/01/2023]
Abstract
Gastric cancer is a complex multifactorial and multistage process that involves a large number of tumor-related gene structural changes and abnormal expression. Therefore, identifying gastric cancer-related genes can improve understanding of the pathogenesis of gastric cancer and provide guidance for the development of targeted drugs. Traditional methods to discover gastric cancer-related genes based on biological experiments are time-consuming and expensive. In recent years, a large number of computational methods have been developed to identify gastric cancer-related genes. In addition, a large number of experiments show that establishing a biological network to identify disease-related genes has higher accuracy than ordinary methods. However, most current computational methods focus on the processing of homogeneous networks and cannot encode heterogeneous networks. In this paper, we built a heterogeneous network using a disease similarity network and a gene interaction network. We implemented the graph transformer network (GTN) to encode this heterogeneous network. Meanwhile, the deep belief network (DBN) was applied to reduce the dimension of features. We call this method “DBN-GTN”, and it outperformed four traditional methods and five similar methods.
|
35
|
Resurreccion EP, Fong KW. The Integration of Metabolomics with Other Omics: Insights into Understanding Prostate Cancer. Metabolites 2022; 12:metabo12060488. [PMID: 35736421 PMCID: PMC9230859 DOI: 10.3390/metabo12060488] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2022] [Revised: 05/21/2022] [Accepted: 05/24/2022] [Indexed: 02/06/2023]
Abstract
Our understanding of prostate cancer (PCa) has shifted from a disease caused solely by a few genetic aberrations to a combination of complex biochemical dysregulations with the prostate metabolome at its core. The role of metabolomics in analyzing the pathophysiology of PCa is indispensable. However, to fully elucidate real-time complex dysregulation in prostate cells, an integrated approach based on metabolomics and other omics is warranted. Individually, genomics, transcriptomics, and proteomics are robust, but they are not enough to achieve a holistic view of PCa tumorigenesis. This review is the first of its kind to focus solely on the integration of metabolomics with multi-omic platforms in PCa research, including a detailed emphasis on the metabolomic profile of PCa. The authors intend to provide researchers in the field with a comprehensive knowledge base in PCa metabolomics and offer perspectives on overcoming limitations of the tool to guide future point-of-care applications.
Affiliation(s)
- Eleazer P. Resurreccion
  - Department of Toxicology and Cancer Biology, University of Kentucky, Lexington, KY 40506, USA
- Ka-wing Fong
  - Department of Toxicology and Cancer Biology, University of Kentucky, Lexington, KY 40506, USA
  - Markey Cancer Center, University of Kentucky, Lexington, KY 40506, USA
  - Correspondence: Tel.: +1-859-562-3455
|
36
|
Zhang N, Zang T. A multi-network integration approach for measuring disease similarity based on ncRNA regulation and heterogeneous information. BMC Bioinformatics 2022; 23:89. [PMID: 35255810 PMCID: PMC8902705 DOI: 10.1186/s12859-022-04613-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2022] [Accepted: 02/14/2022] [Indexed: 11/28/2022]
Abstract
Background: Measuring similarity between complex diseases has significant implications for revealing disease pathogenesis and for developments in biomedicine. It is widely accepted that functional associations between disease-related genes, together with semantic associations, can be used to calculate disease similarity. A growing number of studies have demonstrated the profound involvement of non-coding RNA (ncRNA) in the regulation of genome organization and gene expression, so taking ncRNA into account can be useful in measuring disease similarity. However, existing methods ignore the regulatory functions of ncRNA in biological processes. In this study, we proposed a novel deep-learning method to deduce disease similarity.
Results: We proposed ImpAESim, a framework that integrates multiple network embeddings to learn compact feature representations and calculate disease similarity. We first use three different disease-related information networks to build a heterogeneous network. After a network diffusion process, random walk with restart (RWR), a compact feature learning model composed of a classic Auto Encoder (AE) and an improved AE model extracts constraints and low-dimensional feature representations. This yields an accurate, low-dimensional feature representation of diseases, and we then employ the cosine distance as the measure of disease similarity.
Conclusion: ImpAESim focuses on extracting a low-dimensional vector representation of features based on ncRNA regulation and the gene–gene interaction network. Our method can significantly reduce the calculation bias resulting from the sparse disease associations that are derived from semantic associations.
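The diffusion and similarity steps named in this abstract can be illustrated with a minimal sketch. It is not the ImpAESim implementation: the autoencoder stage is skipped entirely, and the toy adjacency matrix, the restart probability of 0.5, and the function names are assumptions made for illustration only.

```python
import numpy as np


def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walk that restarts at `seed`."""
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # avoid division by zero for isolated nodes
    transition = adj / col_sums            # column-normalised transition matrix
    start = np.zeros(adj.shape[0])
    start[seed] = 1.0
    p = start.copy()
    for _ in range(max_iter):
        nxt = (1 - restart) * (transition @ p) + restart * start
        if np.abs(nxt - p).sum() < tol:    # L1 convergence check
            return nxt
        p = nxt
    return p


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either vector is all zeros)."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0


# Hypothetical 4-node network: nodes 0, 1, 2 form a triangle, node 3 hangs off node 2.
adjacency = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
])
profiles = np.array([random_walk_with_restart(adjacency, i) for i in range(4)])
sim_01 = cosine_similarity(profiles[0], profiles[1])
sim_03 = cosine_similarity(profiles[0], profiles[3])
```

In this toy network, nodes 0 and 1 share the same neighbourhood, so their diffusion profiles are more alike than those of nodes 0 and 3. A cosine distance, if needed, is one minus the similarity.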
Affiliation(s)
- Ningyi Zhang
  - Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Tianyi Zang
  - Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
|
37
|
Zhang C, Lu Y, Zang T. CNN-DDI: a learning-based method for predicting drug-drug interactions using convolution neural networks. BMC Bioinformatics 2022; 23:88. [PMID: 35255808 PMCID: PMC8902704 DOI: 10.1186/s12859-022-04612-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 02/10/2022] [Accepted: 02/14/2022] [Indexed: 01/07/2023]
Abstract
Background: Drug–drug interactions (DDIs) are the reactions between drugs. They are classified into three types: synergistic, antagonistic and no reaction. Predicting DDI-associated events is a rapidly developing technique that is receiving growing attention and application in drug development and disease diagnosis. In this work, we study not only whether two drugs interact, but also the specific interaction type, and we propose a learning-based method that uses convolution neural networks to learn feature representations and predict DDIs.
Results: We proposed a novel algorithm using a CNN architecture, named CNN-DDI, to predict drug–drug interactions. First, we extract feature interactions from drug categories, targets, pathways and enzymes as feature vectors and employ the Jaccard similarity as the measure of drug similarity. Then, based on this feature representation, we build a new convolution neural network as the DDI predictor.
Conclusion: The experimental results indicate that drug categories are effective as a new feature type for the CNN-DDI method, and that using multiple features is more informative and more effective than using a single feature. We conclude that CNN-DDI outperforms other existing algorithms on the task of predicting DDIs.
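The Jaccard similarity mentioned in this abstract is the size of the intersection of two feature sets divided by the size of their union. A minimal sketch follows; the drug names, target identifiers, and helper names are hypothetical and are not taken from the CNN-DDI paper, which also feeds such similarity vectors into a CNN that is not shown here.

```python
def jaccard_similarity(a, b):
    """|A ∩ B| / |A ∪ B| for two collections of features (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical target sets for three drugs.
drug_targets = {
    "drug_a": {"T1", "T2", "T3"},
    "drug_b": {"T2", "T3", "T4"},
    "drug_c": {"T9"},
}


def similarity_vector(name, features):
    """Similarity of one drug to every drug (in sorted name order)."""
    return [jaccard_similarity(features[name], features[other])
            for other in sorted(features)]


vector_a = similarity_vector("drug_a", drug_targets)
```

One such vector per feature type (categories, targets, pathways, enzymes) could then be concatenated to represent a drug, although the exact layout used by the authors is not stated in the abstract.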
Affiliation(s)
- Chengcheng Zhang
  - Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Yao Lu
  - General Hospital of Heilongjiang Province Land Reclamation Bureau, Harbin, China
- Tianyi Zang
  - Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
|
38
|
Xia Y, Li X, Chen X, Lu C, Yu X. Inferring Retinal Degeneration-Related Genes Based on Xgboost. Front Mol Biosci 2022; 9:843150. [PMID: 35223997 PMCID: PMC8880610 DOI: 10.3389/fmolb.2022.843150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/24/2021] [Accepted: 01/17/2022] [Indexed: 11/13/2022]
Abstract
Retinal Degeneration (RD) is an inherited retinal disease characterized by degeneration of rod and cone photoreceptor cells and of retinal pigment epithelial cells. The age of onset and the progression of RD are related to genes and environment. To date, research has discovered five genes closely related to RD: RHO, PDE6B, MERTK, RLBP1 and RPGR, and researchers have developed corresponding gene therapy methods. Gene therapy uses vectors to transfer therapeutic genes, genetically modify target cells, and correct or replace disease-causing RD genes. Identifying the pathogenic genes of RD will therefore play an important role in the development of treatments for the disease. However, traditional methods of identifying RD-related genes are mostly based on animal experiments, and only a small number of RD-related genes have been identified so far. With the growth of biological data, Xgboost is proposed in this article to identify RD-related genes. Xgboost adds a regularization term to control the complexity of the model, which makes it suitable for finding true RD-related genes among a large and complex set of genes and helps avoid overfitting to some extent. To verify the power of Xgboost to identify RD-related genes, we performed 10-fold cross-validation and compared it with three traditional methods: Random Forest, Back Propagation network, and Support Vector Machine. The accuracy of Xgboost is 99.13%, and its AUC is much higher than that of the other three methods. This article can therefore provide technical support for the efficient identification of RD-related genes and help researchers gain a deeper understanding of the genetic characteristics of RD.
Affiliation(s)
- Yujie Xia
  - Department of Ophthalmology, The First Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China
- Xiaojie Li
  - Department of Ophthalmology, The First Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China
- Xinlin Chen
  - Guangzhou University of Chinese Medicine, Guangzhou, China
- Changjin Lu
  - Department of Ophthalmology, The First Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China
- Xiaoyi Yu
  - Department of Ophthalmology, The First Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China
|
39
|
Li C, Huang J, Tang H, Liu B, Zhou X. Revealing Cavin-2 Gene Function in Lung Based on Multi-Omics Data Analysis Method. Front Cell Dev Biol 2022; 9:827108. [PMID: 35174175 PMCID: PMC8841408 DOI: 10.3389/fcell.2021.827108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2021] [Accepted: 12/15/2021] [Indexed: 11/23/2022]
Abstract
Research indicates that comprehensively evaluating immune microenvironmental indicators and gene mutation characteristics is particularly important for selecting the best treatment plan. Exploring the genes relevant to pulmonary injury is therefore an important basis for improving survival. In recent years, with the massive production of omics data, a large number of computational methods have been applied in biomedicine. Most of these methods were developed for a certain type of disease or for diseases in general. Algorithms that specifically identify genes associated with pulmonary injury have not yet been developed. To fill this gap, we developed a novel method, named AdaRVM, to identify pulmonary injury-related genes on a large scale. AdaRVM fuses Adaboost and the Relevance Vector Machine (RVM) to achieve fast, high-precision pattern recognition of the genetic mechanism of pulmonary injury. AdaRVM found that the Cavin-2 gene has strong potential to be related to pulmonary injury. It is known that the formation and function of caveolae are mediated by two protein families: Caveolin and Cavin. Many studies have explored the role of Caveolin proteins, but little is known about Cavin family members. To verify our method and reveal the functions of cavin-2, we integrated data from six genome-wide association studies (GWAS) related to lung function traits, four expression Quantitative Trait Loci (eQTL) datasets, and one methylation Quantitative Trait Loci (mQTL) dataset using Summary data-based Mendelian Randomization (SMR). We found a strong relationship between cavin-2 and the canonical signaling pathways ERK1/2, AKT, and STAT3, which are all known to be related to lung injury.
Affiliation(s)
- Changsheng Li
  - Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
- Jingyu Huang
  - Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
- Hexiao Tang
  - Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
- Bing Liu
  - Department of Pulmonary and Critical Care Medicine, Zhongnan Hospital of Wuhan University, Wuhan, China
  - Wuhan Research Center for Infectious Diseases and Cancer, Chinese Academy of Medical Sciences, Wuhan, China
- Xuefeng Zhou
  - Department of Thoracic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
  - *Correspondence: Xuefeng Zhou,
|
40
|
Chen Q, Zhang J, Bao B, Zhang F, Zhou J. Large-Scale Gastric Cancer Susceptibility Gene Identification Based on Gradient Boosting Decision Tree. Front Mol Biosci 2022; 8:815243. [PMID: 35096975 PMCID: PMC8793069 DOI: 10.3389/fmolb.2021.815243] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/15/2021] [Accepted: 12/06/2021] [Indexed: 01/21/2023]
Abstract
The early clinical symptoms of gastric cancer are not obvious, and metastasis may already have occurred by the time of treatment. Poor prognosis is one of the important reasons for the high mortality of gastric cancer. Gastric cancer-related genes, once identified, can therefore serve as markers to improve diagnostic precision and guide personalized treatment. To further reveal the pathogenesis of gastric cancer at the gene level, we proposed a method based on the Gradient Boosting Decision Tree (GBDT) to identify gastric cancer susceptibility genes through a gene interaction network. Starting from the known genes related to gastric cancer, we collected additional genes that interact with them and constructed a gene interaction network. Random Walk was used to extract the network associations of each gene, and GBDT was used to identify the gastric cancer-related genes. To assess the AUC and AUPR of our algorithm, we performed 10-fold cross-validation. GBDT achieved an AUC of 0.89 and an AUPR of 0.81. We compared GBDT with four other methods and found that GBDT performed best.
Affiliation(s)
- Qing Chen
  - Department of Hepatobiliary Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Ji Zhang
  - Department of Hepatobiliary Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Banghe Bao
  - Department of Pathology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Fan Zhang
  - Wuhan Asia General Hospital, Wuhan, China
- Jie Zhou
  - Department of Biochemistry and Molecular Biology, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
  - *Correspondence: Jie Zhou,
|
41
|
Cheng N, Cui X, Chen C, Li C, Huang J. Exploration of Lung Cancer-Related Genetic Factors via Mendelian Randomization Method Based on Genomic and Transcriptomic Summarized Data. Front Cell Dev Biol 2021; 9:800756. [PMID: 34938740 PMCID: PMC8686495 DOI: 10.3389/fcell.2021.800756] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2021] [Accepted: 11/22/2021] [Indexed: 12/24/2022]
Abstract
Lung carcinoma is one of the deadliest malignant tumors in humans. With the rising incidence of lung cancer, the search for highly effective cures is becoming more and more imperative. There is sufficient research evidence that living habits and conditions such as smoking and air pollution are associated with an increased risk of lung cancer. At the same time, the influence of individual genetic susceptibility on lung carcinoma morbidity has been confirmed, and a growing body of evidence has accumulated on the relationship between various risk factors and the risk of different pathological types of lung cancer. Additionally, analyses of many large-scale cancer registries have shown a degree of familial aggregation of lung cancer. To explore lung cancer-related genetic factors, Genome-Wide Association Studies (GWAS) have been used to identify several lung cancer susceptibility sites, which have been widely validated. However, the biological mechanism behind the impact of mutations at these sites on lung cancer remains unclear. This study therefore applied the Summary data-based Mendelian Randomization (SMR) model, integrating two GWAS datasets and four expression Quantitative Trait Loci (eQTL) datasets, to identify susceptibility genes. Using this strategy, we found ten Single Nucleotide Polymorphism (SNP) sites that affect the occurrence and development of lung tumors by regulating the expression of seven genes. Further analysis of the signaling pathways of these genes not only provides important clues for explaining the pathogenesis of lung cancer but also has critical significance for its diagnosis and treatment.
Affiliation(s)
- Nitao Cheng
  - Department of Thoracic Surgery, Zhongnan Hospital, Wuhan University, Wuhan, China
- Xinran Cui
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Chen Chen
  - Department of Biological Repositories, Zhongnan Hospital, Wuhan University, Wuhan, China
- Changsheng Li
  - Department of Thoracic Surgery, Zhongnan Hospital, Wuhan University, Wuhan, China
- Jingyu Huang
  - Department of Thoracic Surgery, Zhongnan Hospital, Wuhan University, Wuhan, China
|
42
|
Zhang H, Xu R, Ding M, Zhang Y. Prediction of Gastric Cancer-Related Proteins Based on Graph Fusion Method. Front Cell Dev Biol 2021; 9:739715. [PMID: 34790662 PMCID: PMC8591485 DOI: 10.3389/fcell.2021.739715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2021] [Accepted: 08/02/2021] [Indexed: 12/09/2022]
Abstract
Gastric cancer is a common malignant tumor of the digestive system with no specific symptoms. Because knowledge of its pathogenesis is limited, patients are usually diagnosed at an advanced stage, when effective treatment methods are lacking. The proteome has unique tissue and time specificity and can reflect the influence of external factors, which makes it a potential source of biomarkers for early diagnosis. Discovering gastric cancer-related proteins could therefore greatly help researchers design drugs and develop an early diagnosis kit. However, identifying gastric cancer-related proteins by biological experiments costs considerable time and money. With the rapid growth of data, mining knowledge from proteomics data on a large scale through computational methods has become a hot topic. Based on the hypothesis that the stronger the association between two proteins, the more likely they are to be associated with the same disease, we constructed both a disease similarity network and a protein interaction network. A Graph Convolutional Network (GCN) was then applied to extract topological features of these networks. Finally, Xgboost was used to identify the relationship between proteins and gastric cancer. In 10-fold cross-validation experiments, our method achieved a high area under the curve (AUC) (0.85) and area under the precision-recall curve (AUPR) (0.76), which supports its effectiveness.
Affiliation(s)
- Hao Zhang
  - Endoscopy Center, China-Japan Union Hospital of Jilin University, Changchun, China
- Ruisi Xu
  - Endoscopy Center, China-Japan Union Hospital of Jilin University, Changchun, China
- Meng Ding
  - Endoscopy Center, China-Japan Union Hospital of Jilin University, Changchun, China
- Ying Zhang
  - Endoscopy Center, China-Japan Union Hospital of Jilin University, Changchun, China
|
43
|
Zhao Z, Shi J, Zhao G, Gao Y, Jiang Z, Yuan F. Large Scale Identification of Osteosarcoma Pathogenic Genes by Multiple Extreme Learning Machine. Front Cell Dev Biol 2021; 9:755511. [PMID: 34646831 PMCID: PMC8502917 DOI: 10.3389/fcell.2021.755511] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/09/2021] [Accepted: 09/02/2021] [Indexed: 11/13/2022]
Abstract
At present, the main treatments for osteosarcoma are chemotherapy and surgery. Its 5-year survival rate has not improved significantly in the past decades. Osteosarcoma has extremely complex multigenomic heterogeneity and lacks universally applicable signal-blocking targets. Osteosarcoma is often found in adolescents or children under the age of 20, so it is very important to explore its genetic pathogenic factors. We used known osteosarcoma-related genes and computer algorithms to find more osteosarcoma pathogenic genes, laying the foundation for treatments related to the osteosarcoma immune microenvironment and for further exploration of these genes. The traditional way to identify osteosarcoma-related genes is to collect clinical samples, measure gene expression with RNA-seq technology, and compare differentially expressed genes. The high cost and time consumption of this approach make it difficult to carry out research on a large scale. In this paper, we developed a novel method, “RELM”, which fuses multiple extreme learning machines (ELM) to identify osteosarcoma pathogenic genes. The AUC and AUPR of RELM are 0.91 and 0.88, respectively, in 10-fold cross-validation, which illustrates the reliability of RELM.
Affiliation(s)
- Zhipeng Zhao
  - Department of Basic Medical Sciences, Taizhou University, Taizhou, China
- Jijun Shi
  - Department of Orthopedics, Songyuan Central Hospital, Songyuan, China
- Guang Zhao
  - Department of Orthopedics, The Fourth Affiliated Hospital of China Medical University, Shenyang, China
- Yanjun Gao
  - Department of Orthopedics, The Fourth Affiliated Hospital of China Medical University, Shenyang, China
- Zhigang Jiang
  - Department of Hand Surgery, Changchun Central Hospital, Changchun, China
- Fusheng Yuan
  - Department of Orthopedics, The Fourth Affiliated Hospital of China Medical University, Shenyang, China
|
44
|
Al-Saggaf UM, Usman M, Naseem I, Moinuddin M, Jiman AA, Alsaggaf MU, Alshoubaki HK, Khan S. ECM-LSE: Prediction of Extracellular Matrix Proteins Using Deep Latent Space Encoding of k-Spaced Amino Acid Pairs. Front Bioeng Biotechnol 2021; 9:752658. [PMID: 34722479 PMCID: PMC8552119 DOI: 10.3389/fbioe.2021.752658] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/03/2021] [Accepted: 09/13/2021] [Indexed: 12/26/2022]
Abstract
Extracellular matrix (ECM) proteins create complex networks of macromolecules that fill the extracellular spaces of living tissues. They provide structural support and play an important role in maintaining cellular functions. Identification of ECM proteins can play a vital role in studying various types of diseases. Conventional wet lab-based methods are reliable; however, they are expensive and time-consuming and are therefore not scalable. In this research, we propose a novel sequence-based machine learning approach for the prediction of ECM proteins. In the proposed method, composition of k-spaced amino acid pair (CKSAAP) features are encoded into a classifiable latent space (LS) with the help of deep latent space encoding (LSE). A comprehensive ablation analysis is conducted to evaluate the performance of the proposed method. Results are compared with other state-of-the-art methods on the benchmark dataset, and the proposed ECM-LSE approach is shown to comprehensively outperform the contemporary methods.
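CKSAAP features count how often each ordered pair of amino acids occurs with exactly k residues between them, for every k up to a chosen maximum. The sketch below shows one common way to compute them; the normalisation by the number of windows, the default `max_gap`, and the function name are assumptions, and the latent space encoding step of ECM-LSE is not shown.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def cksaap(sequence, max_gap=2):
    """Composition of k-spaced amino acid pairs for k = 0..max_gap.

    Returns 400 * (max_gap + 1) frequencies; pairs containing a
    non-standard residue are ignored.
    """
    features = []
    for gap in range(max_gap + 1):
        counts = dict.fromkeys(PAIRS, 0)
        windows = len(sequence) - gap - 1      # number of residue pairs at this spacing
        for i in range(max(windows, 0)):
            pair = sequence[i] + sequence[i + gap + 1]
            if pair in counts:
                counts[pair] += 1
        total = max(windows, 1)                # avoid division by zero on short sequences
        features.extend(counts[p] / total for p in PAIRS)
    return features


# Toy sequence, chosen only to keep the arithmetic easy to follow.
toy_features = cksaap("ACDA", max_gap=1)
```

For the toy sequence `ACDA`, the adjacent pairs (k = 0) are AC, CD and DA, each with frequency 1/3, and the pairs one residue apart (k = 1) are AD and CA, each with frequency 1/2.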
Affiliation(s)
- Ubaid M. Al-Saggaf
  - Center of Excellence in Intelligent Engineering Systems, King Abdulaziz University, Jeddah, Saudi Arabia
  - Electrical and Computer Engineering Department, King Abdulaziz University, Jeddah, Saudi Arabia
- Muhammad Usman
  - Department of Computer Engineering, Chosun University, Gwangju, South Korea
- Imran Naseem
  - Research and Development, Love For Data, Karachi, Pakistan
  - School of Electrical, Electronic and Computer Engineering, The University of Western Australia, Perth, WA, Australia
  - College of Engineering, Karachi Institute of Economics and Technology, Korangi Creek, Karachi, Pakistan
- Muhammad Moinuddin
  - Center of Excellence in Intelligent Engineering Systems, King Abdulaziz University, Jeddah, Saudi Arabia
  - Electrical and Computer Engineering Department, King Abdulaziz University, Jeddah, Saudi Arabia
- Ahmad A. Jiman
  - Electrical and Computer Engineering Department, King Abdulaziz University, Jeddah, Saudi Arabia
- Mohammed U. Alsaggaf
  - Center of Excellence in Intelligent Engineering Systems, King Abdulaziz University, Jeddah, Saudi Arabia
  - Department of Radiology, Faculty of Medicine, King Abdulaziz University, Jeddah, Saudi Arabia
- Hitham K. Alshoubaki
  - Center of Excellence in Intelligent Engineering Systems, King Abdulaziz University, Jeddah, Saudi Arabia
  - Electrical and Computer Engineering Department, King Abdulaziz University, Jeddah, Saudi Arabia
- Shujaat Khan
  - Department of Bio and Brain Engineering, Daejeon, South Korea
|
45
|
Liu Y, Jin G, Wang X, Dong Y, Ding F. Identification of New Genes and Loci Associated With Bone Mineral Density Based on Mendelian Randomization. Front Genet 2021; 12:728563. [PMID: 34567079 PMCID: PMC8456003 DOI: 10.3389/fgene.2021.728563] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/21/2021] [Accepted: 08/02/2021] [Indexed: 02/05/2023]
Abstract
Bone mineral density (BMD) is a complex and highly heritable trait, and low BMD can lead to osteoporotic fractures. It is estimated that BMD is mainly affected by genetic factors (about 85%). BMD has been reported to be associated with both common and rare variants, and numerous loci related to BMD have been identified by genome-wide association studies (GWAS). We systematically integrated expression quantitative trait loci (eQTL) data with GWAS summary statistics. Because we focused mainly on loci that can affect gene expression, we used Summary data-based Mendelian randomization (SMR) analysis to investigate new genes and loci associated with BMD. We identified 12,477 single-nucleotide polymorphisms (SNPs) regulating 564 genes that are associated with BMD. The genetic mechanism we detected could contribute to BMD in individuals and play an important role in understanding the pathophysiology of cataclasis.
Affiliation(s)
- Yijun Liu
  - Department of Orthopedics, The First Hospital of Jilin University, Changchun, China
- Guang Jin
  - Department of Orthopedics, The First Hospital of Jilin University, Changchun, China
- Xue Wang
  - Department of Anesthesiology, The First Hospital of Jilin University, Changchun, China
- Ying Dong
  - The Third Department of Radiotherapy, Jilin Provincial Tumor Hospital, Changchun, China
- Fupeng Ding
  - Department of Orthopedics, The First Hospital of Jilin University, Changchun, China
|
46
|
Zhong LK, Xie CL, Jiang S, Deng XY, Gan XX, Feng JH, Cai WS, Liu CZ, Shen F, Miao JH, Xu B. Prioritizing Susceptible Genes for Thyroid Cancer Based on Gene Interaction Network. Front Cell Dev Biol 2021; 9:740267. [PMID: 34497810 PMCID: PMC8421023 DOI: 10.3389/fcell.2021.740267] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/12/2021] [Accepted: 08/02/2021] [Indexed: 12/05/2022]
Abstract
Thyroid cancer ranks second in incidence among endocrine malignancies. It is usually asymptomatic at the initial stage, so patients easily miss the window for early treatment. Combining genetic testing with imaging can greatly improve the diagnostic efficiency for thyroid cancer. Researchers have discovered many genes related to thyroid cancer. However, the effects of these genes on thyroid cancer differ. We hypothesize that the core genes that cause thyroid cancer interact more strongly with one another. Based on this hypothesis, we constructed an interaction network of thyroid cancer-related genes. We traversed the network through random walks and ranked thyroid cancer-related genes with ADNN, which is a fusion of Adaboost and a deep neural network (DNN). In addition, we discovered more thyroid cancer-related genes with ADNN. To verify the accuracy of ADNN, we conducted fivefold cross-validation. ADNN achieved an AUC of 0.85 and an AUPR of 0.81, higher than those of the other methods.
Affiliation(s)
- Lin-Kun Zhong
  - Department of General Surgery, Zhongshan City People's Hospital, Zhongshan, China
- Chang-Lian Xie
  - Intensive Care Unit, Zhongshan Hospital of Traditional Chinese Medicine Affiliated to Guangzhou University of Chinese Medicine, Zhongshan, China
- Shan Jiang
  - Reproductive Medicine Center, Boai Hospital of Zhongshan, Zhongshan, China
- Xing-Yan Deng
  - Department of Thyrovascular Surgery, Maoming People's Hospital, Maoming, China
- Xiao-Xiong Gan
  - Department of Thyroid Surgery, Guangzhou First People's Hospital, School of Medicine, South China University of Technology, Guangzhou, China
- Jian-Hua Feng
  - Department of Thyroid Surgery, Guangzhou First People's Hospital, School of Medicine, South China University of Technology, Guangzhou, China
- Wen-Song Cai
  - Department of Thyroid Surgery, Guangzhou First People's Hospital, School of Medicine, South China University of Technology, Guangzhou, China
- Chi-Zhuai Liu
  - Department of General Surgery, Zhongshan City People's Hospital, Zhongshan, China
- Fei Shen
  - Department of Thyroid Surgery, Guangzhou First People's Hospital, School of Medicine, South China University of Technology, Guangzhou, China
- Jian-Hang Miao
  - Department of General Surgery, Zhongshan City People's Hospital, Zhongshan, China
- Bo Xu
  - Department of Thyroid Surgery, Guangzhou First People's Hospital, School of Medicine, South China University of Technology, Guangzhou, China
|
47
|
Shen F, Cai W, Gan X, Feng J, Chen Z, Guo M, Wei F, Cao J, Xu B. Prediction of Genetic Factors of Hyperthyroidism Based on Gene Interaction Network. Front Cell Dev Biol 2021; 9:700355. [PMID: 34409035 PMCID: PMC8365469 DOI: 10.3389/fcell.2021.700355] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/26/2021] [Accepted: 06/02/2021] [Indexed: 12/25/2022]
Abstract
The number of hyperthyroidism patients has been increasing in recent years. As a disease that can lead to cardiovascular disease, it poses great potential health risks. Since hyperthyroidism can induce many other diseases, studying its genetic factors will promote the early diagnosis and treatment of hyperthyroidism and its related diseases. Previous studies have used genome-wide association analysis (GWAS) to identify genes related to hyperthyroidism. However, these studies only identify significant sites related to the disease from a statistical point of view and ignore the complex regulatory relationships between genes. In addition, mutation is not the only genetic factor that causes hyperthyroidism. Identifying hyperthyroidism-related genes from gene interactions would help researchers discover the disease mechanism. In this paper, we proposed a novel machine learning method for identifying hyperthyroidism-related genes based on a gene interaction network. The method, called “RW-RVM,” is a combination of Random Walk (RW) and Relevance Vector Machines (RVM). RW was implemented to encode the gene interaction network. The features of genes were the regulatory relationships between genes and non-coding RNAs. Finally, multiple RVMs were applied to identify hyperthyroidism-related genes. The results of 10-fold cross-validation show that the area under the receiver operating characteristic curve (AUC) of our method reached 0.9 and the area under the precision-recall curve (AUPR) was 0.87. Seventy-eight novel genes were found to be related to hyperthyroidism. We checked two of these novel genes against the existing literature, which supported the accuracy of our result and method.
Collapse
Affiliation(s)
- Fei Shen
  - Department of Thyroid Surgery, School of Medicine, Guangzhou First People's Hospital, South China University of Technology, Guangzhou, China; Department of Thyroid Surgery, Guangzhou First People's Hospital, Guangzhou Medical University, Guangzhou, China
- Wensong Cai
  - Department of Thyroid Surgery, School of Medicine, Guangzhou First People's Hospital, South China University of Technology, Guangzhou, China; Department of Thyroid Surgery, Guangzhou First People's Hospital, Guangzhou Medical University, Guangzhou, China
- Xiaoxiong Gan
  - Department of Thyroid Surgery, School of Medicine, Guangzhou First People's Hospital, South China University of Technology, Guangzhou, China; Department of Thyroid Surgery, Guangzhou First People's Hospital, Guangzhou Medical University, Guangzhou, China
- Jianhua Feng
  - Department of Thyroid Surgery, School of Medicine, Guangzhou First People's Hospital, South China University of Technology, Guangzhou, China; Department of Thyroid Surgery, Guangzhou First People's Hospital, Guangzhou Medical University, Guangzhou, China
- Zhen Chen
  - Department of Thyroid Surgery, School of Medicine, Guangzhou First People's Hospital, South China University of Technology, Guangzhou, China; Department of Thyroid Surgery, Guangzhou First People's Hospital, Guangzhou Medical University, Guangzhou, China
- Mengli Guo
  - Department of Thyroid Surgery, School of Medicine, Guangzhou First People's Hospital, South China University of Technology, Guangzhou, China; Department of Thyroid Surgery, Guangzhou First People's Hospital, Guangzhou Medical University, Guangzhou, China
- Fang Wei
  - Department of General Surgery, School of Medicine, Guangzhou First People's Hospital, South China University of Technology, Guangzhou, China
- Jie Cao
  - Department of General Surgery, School of Medicine, Guangzhou First People's Hospital, South China University of Technology, Guangzhou, China
- Bo Xu
  - Department of Thyroid Surgery, School of Medicine, Guangzhou First People's Hospital, South China University of Technology, Guangzhou, China; Department of Thyroid Surgery, Guangzhou First People's Hospital, Guangzhou Medical University, Guangzhou, China
|
48
|
Yaoxing H, Danchun Y, Xiaojuan S, Shuman J, Qingqing Y, Lin J. Identification of Novel Susceptible Genes of Gastric Cancer Based on Integrated Omics Data. Front Cell Dev Biol 2021; 9:712020. [PMID: 34354996 PMCID: PMC8329722 DOI: 10.3389/fcell.2021.712020] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/19/2021] [Accepted: 06/23/2021] [Indexed: 12/24/2022]
Abstract
Gastric cancer (GC) is one of the most common causes of cancer-related death worldwide. It is regarded as a biologically and genetically heterogeneous disease whose carcinogenesis is poorly understood at the molecular level. Thousands of biomarkers and susceptibility loci have been explored via experimental and computational methods, but their effects on disease outcome are still unknown. Genome-wide association studies (GWAS) have identified multiple susceptibility loci for GC, but because of linkage disequilibrium (LD), single-nucleotide polymorphisms (SNPs) may fall within non-coding regions and exert their biological function by modulating gene expression levels. In this study, we collected 1,091 cases and 410,350 controls from the GWAS Catalog database. Integrating these with gene expression data obtained from stomach tissue, we applied a machine learning-based method to predict GC susceptibility genes. As a result, we identified 787 novel susceptibility genes related to GC, which will provide new insight into the genetic and biological basis of the mechanism and pathology of GC development.
Affiliation(s)
- Huang Yaoxing
  - Department of Gastroenterology, Guangzhou First People's Hospital, School of Medicine, South China University of Technology, Guangzhou, China
- Yu Danchun
  - Department of Gastroenterology, Guangzhou First People's Hospital, School of Medicine, South China University of Technology, Guangzhou, China
- Sun Xiaojuan
  - Department of Gastroenterology, Guangzhou First People's Hospital, School of Medicine, South China University of Technology, Guangzhou, China
- Jiang Shuman
  - Department of Gastroenterology, Guangzhou First People's Hospital, School of Medicine, South China University of Technology, Guangzhou, China
- Yan Qingqing
  - Department of Gastroenterology, Guangzhou First People's Hospital, School of Medicine, South China University of Technology, Guangzhou, China
- Jia Lin
  - Department of Gastroenterology, Guangzhou First People's Hospital, School of Medicine, South China University of Technology, Guangzhou, China
|
49
|
Xu Y, Cui X, Wang Y. Pan-Cancer Metastasis Prediction Based on Graph Deep Learning Method. Front Cell Dev Biol 2021; 9:675978. [PMID: 34179004 PMCID: PMC8220811 DOI: 10.3389/fcell.2021.675978] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/04/2021] [Accepted: 04/12/2021] [Indexed: 11/29/2022]
Abstract
Tumor metastasis is the major cause of mortality from cancer. From this perspective, detecting cancer gene expression and transcriptome changes is important for exploring the molecular mechanisms and cellular events of tumor metastasis. Precisely estimating a patient’s cancer state and prognosis is the key challenge in developing a patient’s therapeutic schedule. In recent years, data mining and machine learning techniques have contributed widely to analyzing real-world gene expression data and predicting tumor outcomes by supplying computational models that support decision-making. Nevertheless, the limitations of real-world data severely restrict model predictive performance, and the complexity of the data makes it difficult to extract vital features. In addition, the efficacy of standard machine learning pipelines is far from satisfactory even though diverse feature selection strategies have been applied. To address these problems, we developed a directed relational graph convolutional network to provide an advanced feature extraction strategy. We first constructed a gene regulation network and extracted gene expression features with a relational graph convolutional network. The high-dimensional features of each sample were treated as image pixels, and a convolutional neural network was used to predict the risk of metastasis for each patient. Ten cross-validations on 1,779 cases from The Cancer Genome Atlas show that our model’s performance (area under the curve, AUC = 0.837; area under the precision-recall curve, AUPRC = 0.717) exceeds that of an existing network-based method (AUC = 0.707, AUPRC = 0.555).
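To make the feature-extraction step concrete, here is a minimal sketch of one relational graph convolution step on a directed toy regulation network. It assumes the commonly used formulation in which each relation type has its own weight matrix and incoming messages are averaged per relation. The abstract does not confirm that the authors use this exact form, and the node names, relation names, and weights below are hypothetical.

```python
def relu(x):
    """Rectified linear unit for a single float."""
    return x if x > 0.0 else 0.0


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rgcn_layer(features, edges_by_relation, weights, self_weight):
    """One relational graph convolution step.

    features: dict node -> feature vector
    edges_by_relation: dict relation -> list of directed (source, target) edges
    weights: dict relation -> weight matrix for that relation
    self_weight: weight matrix applied to each node's own features
    """
    # Self-connection: every node keeps a transformed copy of its own features.
    out = {node: matvec(self_weight, h) for node, h in features.items()}
    for relation, edges in edges_by_relation.items():
        # Count incoming edges per target to average messages within a relation.
        in_degree = {}
        for _, target in edges:
            in_degree[target] = in_degree.get(target, 0) + 1
        for source, target in edges:
            message = matvec(weights[relation], features[source])
            scale = 1.0 / in_degree[target]
            out[target] = [o + scale * m for o, m in zip(out[target], message)]
    return {node: [relu(x) for x in h] for node, h in out.items()}


# Hypothetical regulator "TF" activating "G1" and repressing "G2".
features = {"TF": [1.0, 0.0], "G1": [0.0, 1.0], "G2": [1.0, 1.0]}
edges_by_relation = {
    "activates": [("TF", "G1")],
    "represses": [("TF", "G2")],
}
identity = [[1.0, 0.0], [0.0, 1.0]]
negated = [[-1.0, 0.0], [0.0, -1.0]]
hidden = rgcn_layer(
    features,
    edges_by_relation,
    weights={"activates": identity, "represses": negated},
    self_weight=identity,
)
print(hidden)
```

Because messages follow edge direction, the regulator's features reach its targets but not the reverse, which is the property a directed regulation network needs. In a trained model the weight matrices would be learned instead of fixed as they are here.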
Affiliation(s)
- Yining Xu
  - Department of Computer Science, Harbin Institute of Technology, Harbin, China
- Xinran Cui
  - Department of Computer Science, Harbin Institute of Technology, Harbin, China
- Yadong Wang
  - Department of Computer Science, Harbin Institute of Technology, Harbin, China
|