1
Cheng Y, Xu SM, Santucci K, Lindner G, Janitz M. Machine learning and related approaches in transcriptomics. Biochem Biophys Res Commun 2024; 724:150225. [PMID: 38852503 DOI: 10.1016/j.bbrc.2024.150225] [Received: 02/25/2024] [Revised: 05/18/2024] [Accepted: 06/03/2024] [Indexed: 06/11/2024]
Abstract
Data acquisition for transcriptomic studies used to be the bottleneck in the transcriptomic analytical pipeline. However, recent developments in transcriptome profiling technologies have increased researchers' ability to obtain data, resulting in a shift in focus to data analysis. Incorporating machine learning to traditional analytical methods allows the possibility of handling larger volumes of complex data more efficiently. Many bioinformaticians, especially those unfamiliar with ML in the study of human transcriptomics and complex biological systems, face a significant barrier stemming from their limited awareness of the current landscape of ML utilisation in this field. To address this gap, this review endeavours to introduce those individuals to the general types of ML, followed by a comprehensive range of more specific techniques, demonstrated through examples of their incorporation into analytical pipelines for human transcriptome investigations. Important computational aspects such as data pre-processing, task formulation, results (performance of ML models), and validation methods are encompassed. In hope of better practical relevance, there is a strong focus on studies published within the last five years, almost exclusively examining human transcriptomes, with outcomes compared with standard non-ML tools.
Affiliation(s)
- Yuning Cheng
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, 2052, Australia
- Si-Mei Xu
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, 2052, Australia
- Kristina Santucci
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, 2052, Australia
- Grace Lindner
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, 2052, Australia
- Michael Janitz
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, 2052, Australia.
2
Wang X, Zhang XY, Liao NQ, He ZH, Chen QF. Identification of ribosome biogenesis genes and subgroups in ischaemic stroke. Front Immunol 2024; 15:1449158. [PMID: 39290696 PMCID: PMC11406505 DOI: 10.3389/fimmu.2024.1449158] [Received: 06/14/2024] [Accepted: 08/14/2024] [Indexed: 09/19/2024] Open
Abstract
Background: Ischaemic stroke is a leading cause of death and severe disability worldwide. Given the importance of protein synthesis in the inflammatory response and neuronal repair and regeneration after stroke, and that proteins are acquired by ribosomal translation of mRNA, it has been theorised that ribosome biogenesis may have an impact on promoting and facilitating recovery after stroke. However, the relationship between stroke and ribosome biogenesis has not been investigated.
Methods: In the present study, a ribosome biogenesis gene signature (RSG) was developed using Cox and least absolute shrinkage and selection operator (LASSO) analysis. We classified ischaemic stroke patients into high-risk and low-risk groups using the obtained relevant genes, and further elucidated the immune infiltration of the disease using ssGSEA, which clarified the close relationship between ischaemic stroke and immune subgroups. The concentration of related proteins in the serum of stroke patients was determined by ELISA, and the patients were divided into groups to evaluate the effect of the ribosome biogenesis gene on patients. Through bioinformatics analysis, we identified potential IS-RSGs and explored future therapeutic targets, thereby facilitating the development of more effective therapeutic strategies and novel drugs against potential therapeutic targets in ischaemic stroke.
Results: We obtained a set of 12 ribosome biogenesis-related genes (EXOSC5, MRPS11, MRPS7, RNASEL, RPF1, RPS28, C1QBP, GAR1, GRWD1, PELP1, UTP, ERI3), which play a key role in assessing the prognostic risk of ischaemic stroke. Importantly, risk grouping using ribosome biogenesis-related genes was also closely associated with important signaling pathways in stroke. ELISA detected the expression of C1QBP, RPS28 and RNASEL proteins in stroke patients, and the proportion of neutrophils was significantly increased in the high-risk group.
Conclusions: The present study demonstrates the involvement of ribosomal biogenesis genes in the pathogenesis of ischaemic stroke, providing novel insights into the underlying pathogenic mechanisms and potential therapeutic strategies for ischaemic stroke.
Affiliation(s)
- Xi Wang
  - School of Medicine, Guangxi University, Nanning, China
- Xiao-Yu Zhang
  - The College of Life Sciences, Northwest University, Xian, China
- Nan-Qing Liao
  - School of Medicine, Guangxi University, Nanning, China
- Ze-Hua He
  - Department of General Surgery, Guangxi Hospital Division of The First Affiliated Hospital, Sun Yat-sen University, Nanning, China
- Qing-Feng Chen
  - School of Computer, Electronics and Information, Guangxi University, Nanning, China
3
Contini C, Manconi B, Olianas A, Guadalupi G, Schirru A, Zorcolo L, Castagnola M, Messana I, Faa G, Diaz G, Cabras T. Combined High-Throughput Proteomics and Random Forest Machine-Learning Approach Differentiates and Classifies Metabolic, Immune, Signaling and ECM Intra-Tumor Heterogeneity of Colorectal Cancer. Cells 2024; 13:1311. [PMID: 39195201 DOI: 10.3390/cells13161311] [Received: 06/12/2024] [Revised: 07/24/2024] [Accepted: 07/31/2024] [Indexed: 08/29/2024] Open
Abstract
Colorectal cancer (CRC) is a frequent, worldwide tumor described for its huge complexity, including inter-/intra-heterogeneity and tumor microenvironment (TME) variability. Intra-tumor heterogeneity and its connections with metabolic reprogramming and epithelial-mesenchymal transition (EMT) were investigated with explorative shotgun proteomics complemented by a Random Forest (RF) machine-learning approach. Deep and superficial tumor regions and distant-site non-tumor samples from the same patients (n = 16) were analyzed. Among the 2009 proteins analyzed, 91 proteins, including 23 novel potential CRC hallmarks, showed significant quantitative changes. In addition, a 98.4% accurate classification of the three analyzed tissues was obtained by RF using a set of 21 proteins. Subunit E1 of 2-oxoglutarate dehydrogenase (OGDH-E1) was the best classifying factor for the superficial tumor region, while sorting nexin-18 and coatomer-beta protein (beta-COP), implicated in protein trafficking, classified the deep region. Down- and up-regulations of metabolic checkpoints involved different proteins in superficial and deep tumors. Analogously to immune checkpoints affecting the TME, cytoskeleton and extracellular matrix (ECM) dynamics were crucial for EMT. Galectin-3, basigin, S100A9, and fibronectin involved in TME-CRC-ECM crosstalk were found to be differently variated in both tumor regions. Different metabolic strategies appeared to be adopted by the two CRC regions to uncouple the Krebs cycle and cytosolic glucose metabolism, promote lipogenesis, promote amino acid synthesis, down-regulate bioenergetics in mitochondria, and up-regulate oxidative stress. Finally, correlations with the Dukes stage and budding supported the finding of novel potential CRC hallmarks and therapeutic targets.
Affiliation(s)
- Cristina Contini
  - Department of Medical Sciences and Public Health, Statal University of Cagliari, 09042 Monserrato (CA), Italy
- Barbara Manconi
  - Department of Life and Environmental Sciences, Statal University of Cagliari, 09042 Monserrato (CA), Italy
- Alessandra Olianas
  - Department of Life and Environmental Sciences, Statal University of Cagliari, 09042 Monserrato (CA), Italy
- Giulia Guadalupi
  - Department of Surgical Sciences, Statal University of Cagliari, 09042 Monserrato (CA), Italy
- Alessandra Schirru
  - Department of Life and Environmental Sciences, Statal University of Cagliari, 09042 Monserrato (CA), Italy
- Luigi Zorcolo
  - Department of Surgical Sciences, Statal University of Cagliari, 09042 Monserrato (CA), Italy
- Massimo Castagnola
  - Laboratorio di Proteomica, Centro Europeo di Ricerca sul Cervello, IRCCS Fondazione Santa Lucia, 00143 Roma, Italy
- Irene Messana
  - Istituto di Scienze e Tecnologie Chimiche "Giulio Natta", Consiglio Nazionale delle Ricerche, 00168 Roma, Italy
- Gavino Faa
  - Department of Medical Sciences and Public Health, Statal University of Cagliari, 09042 Monserrato (CA), Italy
  - Department of Biology, College of Science and Technology, Temple University, Philadelphia, PA 19122, USA
- Giacomo Diaz
  - Department of Biomedical Sciences, Statal University of Cagliari, 09042 Monserrato (CA), Italy
- Tiziana Cabras
  - Department of Life and Environmental Sciences, Statal University of Cagliari, 09042 Monserrato (CA), Italy
4
Xu JX, Zhu QL, Bi YM, Peng YC. New evidence: Metformin unsuitable as routine adjuvant for breast cancer: a drug-target mendelian randomization analysis. BMC Cancer 2024; 24:691. [PMID: 38844880 PMCID: PMC11155042 DOI: 10.1186/s12885-024-12453-w] [Received: 01/14/2024] [Accepted: 05/30/2024] [Indexed: 06/10/2024] Open
Abstract
PURPOSE: The potential efficacy of metformin in breast cancer (BC) has been hotly discussed but never conclusive. This genetics-based study aimed to evaluate the relationships between metformin targets and BC risk.
METHODS: Metformin targets from DrugBank and genome-wide association study (GWAS) data from IEU OpenGWAS and FinnGen were used to investigate the breast cancer (BC)-metformin causal link with various Mendelian Randomization (MR) methods (e.g., inverse-variance-weighting). The genetic association between type 2 diabetes (T2D) and the drug target of metformin was also analyzed as a positive control. Sensitivity and pleiotropic tests ensured reliability.
RESULTS: The primary targets of metformin are PRKAB1, ETFDH and GPD1L. We found a causal association between PRKAB1 and T2D (odds ratio [OR] 0.959, P = 0.002), but no causal relationship was observed between metformin targets and overall BC risk (PRKAB1: OR 0.990, P = 0.530; ETFDH: OR 0.986, P = 0.592; GPD1L: OR 1.002, P = 0.806). A noteworthy causal relationship was observed between ETFDH and estrogen receptor (ER)-positive BC (OR 0.867, P = 0.018), and between GPD1L and human epidermal growth factor receptor 2 (HER2)-negative BC (OR 0.966, P = 0.040). Other group analyses did not yield positive results.
CONCLUSION: The star target of metformin, PRKAB1, does not exhibit a substantial causal association with the risk of BC. Conversely, metformin, acting as an inhibitor of ETFDH and GPD1L, may potentially elevate the likelihood of developing ER-positive BC and HER2-negative BC. Consequently, it is not advisable to employ metformin as a standard supplementary therapy for BC patients without T2D.
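The inverse-variance-weighted (IVW) estimate named in the Methods can be illustrated with a minimal sketch. It assumes the standard fixed-effect form, in which each genetic instrument contributes a Wald ratio (outcome effect divided by exposure effect) weighted by the inverse of its first-order variance. The function name `ivw_estimate` and the summary statistics below are hypothetical and are not taken from the study.

```python
import math
from typing import Sequence, Tuple


def ivw_estimate(
    beta_exposure: Sequence[float],
    beta_outcome: Sequence[float],
    se_outcome: Sequence[float],
) -> Tuple[float, float]:
    """Return the fixed-effect IVW estimate (a log odds ratio) and its standard error."""
    weighted_sum = 0.0
    total_weight = 0.0
    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome):
        # First-order variance of the Wald ratio by/bx is se^2 / bx^2,
        # so its inverse-variance weight is (bx / se)^2.
        weight = (bx / se) ** 2
        weighted_sum += weight * (by / bx)
        total_weight += weight
    return weighted_sum / total_weight, math.sqrt(1.0 / total_weight)


# Made-up summary statistics for three instruments (illustration only).
bx = [0.10, 0.20, 0.15]        # SNP effects on the exposure
by = [-0.005, -0.010, -0.0075] # SNP effects on the outcome (log odds)
se = [0.01, 0.02, 0.015]       # standard errors of the outcome effects

log_or, log_or_se = ivw_estimate(bx, by, se)
odds_ratio = math.exp(log_or)  # exponentiate to report an odds ratio
```

Every instrument in this toy input has the same Wald ratio, so the pooled estimate equals that ratio; with real data the ratios differ and the weights determine how much each instrument contributes.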
Affiliation(s)
- Jing-Xuan Xu
  - Department of General Surgery, Chongqing Hospital of Traditional Chinese Medicine, Chongqing, 400021, China
  - Department of Hepatobiliary Surgery, Guangxi Medical University Cancer Hospital, Nanning, Guangxi Province, 530021, China
- Qi-Long Zhu
  - Pharmacy Department, The Ninth People's Hospital of Chongqing, Chongqing, 400015, China
- Yu-Miao Bi
  - Department of General Surgery, Chongqing Hospital of Traditional Chinese Medicine, Chongqing, 400021, China.
- Yu-Chong Peng
  - Department of General Surgery, Chongqing Hospital of Traditional Chinese Medicine, Chongqing, 400021, China.
5
Anuntakarun S, Khamjerm J, Tangkijvanich P, Chuaypen N. Classification of Long Non-Coding RNAs s Between Early and Late Stage of Liver Cancers From Non-coding RNA Profiles Using Machine-Learning Approach. Bioinform Biol Insights 2024; 18:11779322241258586. [PMID: 38846329 PMCID: PMC11155358 DOI: 10.1177/11779322241258586] [Received: 06/15/2023] [Accepted: 05/10/2024] [Indexed: 06/09/2024] Open
Abstract
Long non-coding RNAs (lncRNAs), which are RNA sequences greater than 200 nucleotides in length, play a crucial role in regulating gene expression and biological processes associated with cancer development and progression. Liver cancer is a major cause of cancer-related mortality worldwide, notably in Thailand. Although machine learning has been extensively used in analyzing RNA-sequencing data for advanced knowledge, the identification of potential lncRNA biomarkers for cancer, particularly focusing on lncRNAs as molecular biomarkers in liver cancer, remains comparatively limited. In this study, our objective was to identify candidate lncRNAs in liver cancer. We employed an expression data set of lncRNAs from patients with liver cancer, which comprised 40 699 lncRNAs sourced from The CancerLivER database. Various feature selection methods and machine-learning approaches were used to identify these candidate lncRNAs. The results showed that the random forest algorithm could predict lncRNAs using features extracted from the database, which achieved an area under the curve (AUC) of 0.840 for classifying lncRNAs between early (stage 1) and late stages (stages 2, 3, and 4) of liver cancer. Five of 23 significant lncRNAs (WAC-AS1, MAPKAPK5-AS1, ARRDC1-AS1, AC133528.2, and RP11-1094M14.11) were differentially expressed between early and late stage of liver cancer. Based on the Gene Expression Profiling Interactive Analysis (GEPIA) database, higher expression of WAC-AS1, MAPKAPK5-AS1, and ARRDC1-AS1 was associated with shorter overall survival. In conclusion, the classification model could predict the early and late stages of liver cancer using the signature expression of lncRNA genes. The identified lncRNAs might be used as early diagnostic and prognostic biomarkers for patients with liver cancer.
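The area under the curve (AUC) reported above can be read as the probability that a randomly chosen positive sample receives a higher classifier score than a randomly chosen negative one. The sketch below computes it directly from pairwise comparisons. The function name `roc_auc`, the 0/1 label coding, and the toy scores are assumptions made for illustration and are not data from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: 0 = early stage, 1 = late stage (hypothetical coding).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
```

The pairwise form is quadratic in the number of samples, which is acceptable for a small illustration; rank-based implementations give the same value more efficiently.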
Affiliation(s)
- Songtham Anuntakarun
  - Center of Excellence in Hepatitis and Liver Cancer, Department of Biochemistry, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
- Jakkrit Khamjerm
  - Center of Excellence in Hepatitis and Liver Cancer, Department of Biochemistry, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
  - Biomedical Engineering Program, Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, Thailand
- Pisit Tangkijvanich
  - Center of Excellence in Hepatitis and Liver Cancer, Department of Biochemistry, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
- Natthaya Chuaypen
  - Center of Excellence in Hepatitis and Liver Cancer, Department of Biochemistry, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
6
Mukherjee A, Abraham S, Singh A, Balaji S, Mukunthan KS. From Data to Cure: A Comprehensive Exploration of Multi-omics Data Analysis for Targeted Therapies. Mol Biotechnol 2024:10.1007/s12033-024-01133-6. [PMID: 38565775 DOI: 10.1007/s12033-024-01133-6] [Received: 12/27/2023] [Accepted: 02/27/2024] [Indexed: 04/04/2024]
Abstract
In the dynamic landscape of targeted therapeutics, drug discovery has pivoted towards understanding underlying disease mechanisms, placing a strong emphasis on molecular perturbations and target identification. This paradigm shift, crucial for drug discovery, is underpinned by big data, a transformative force in the current era. Omics data, characterized by its heterogeneity and enormity, has ushered biological and biomedical research into the big data domain. Acknowledging the significance of integrating diverse omics data strata, known as multi-omics studies, researchers delve into the intricate interrelationships among various omics layers. This review navigates the expansive omics landscape, showcasing tailored assays for each molecular layer through genomes to metabolomes. The sheer volume of data generated necessitates sophisticated informatics techniques, with machine-learning (ML) algorithms emerging as robust tools. These datasets not only refine disease classification but also enhance diagnostics and foster the development of targeted therapeutic strategies. Through the integration of high-throughput data, the review focuses on targeting and modeling multiple disease-regulated networks, validating interactions with multiple targets, and enhancing therapeutic potential using network pharmacology approaches. Ultimately, this exploration aims to illuminate the transformative impact of multi-omics in the big data era, shaping the future of biological research.
Affiliation(s)
- Arnab Mukherjee
  - Department of Biotechnology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India
- Suzanna Abraham
  - Department of Biotechnology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India
- Akshita Singh
  - Department of Biotechnology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India
- S Balaji
  - Department of Biotechnology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India
- K S Mukunthan
  - Department of Biotechnology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India.
7
Hassani M, Mahdevar M, Peymani M. Exploring the role of interleukin 11 in cancer progression, patient survival, and therapeutic insights. Mol Biol Rep 2024; 51:461. [PMID: 38551695 DOI: 10.1007/s11033-024-09358-z] [Received: 12/22/2023] [Accepted: 02/15/2024] [Indexed: 04/02/2024]
Abstract
BACKGROUND: The Interleukin (IL)-11 gene, which is one of the members of the cytokine family, has an oncogenic role in some cancers. The main goal of this study is to analyze IL-11 expression level in 14 prevalent cancers and highlights its role in patients' survival, drug resistance, and sensitivities. Also, an association of this gene with metastasis and inflammation pathways has been investigated.
METHODS AND RESULTS: Using the cancer genome atlas (TCGA) data, the level of IL-11 expression and its role in prognosis and survival rate were evaluated in 13 common cancers. Then, confirming the obtained in-silico outcomes, the relative expression level of this gene in colorectal cancer (CRC) samples and their adjusted tissues were assayed by the RT-qPCR method. Furthermore, to examine the association between IL-11 expression and drug resistance and sensitivity, PharmacoGX data was applied. The co-expression network was used to recognize the pathways in which IL-11 was involved. The results from the TCGA dataset indicated that the expression level of IL-11 increased significantly in 13 prevalence cancers compared to the control groups. Interestingly, this enhanced expression level is associated with a high rate of mortality in patients with bladder, stomach, colorectal, and endometrial cancers. Also, the co-expression network analysis showed a strong correlation between IL-11 and the genes of metastasis pathway and the genes related to the inflammation process. Finally, regarding drug sensitivity, IL-11 expression level can be introduced as a remarkable biomarker for cancer detection due to area under curve (AUC).
CONCLUSION: Altered expression of the IL-11 gene is observed in 13 common cancers and is associated with prognosis and mortality rate in patients. Moreover, this gene can be considered a prognostic biomarker in different types of cancer, such as CRC.
Affiliation(s)
- Mahsa Hassani
  - Department of Biology, Faculty of Basic Sciences, Shahrekord Branch, Islamic Azad University, Shahrekord, Iran
- Mohammad Mahdevar
  - Genius Gene, Genetics and Biotechnology Company, Isfahan, Iran
  - Department of Genetics and Molecular Biology, School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Maryam Peymani
  - Department of Biology, Faculty of Basic Sciences, Shahrekord Branch, Islamic Azad University, Shahrekord, Iran.
8
Wu J, Singleton SS, Bhuiyan U, Krammer L, Mazumder R. Multi-omics approaches to studying gastrointestinal microbiome in the context of precision medicine and machine learning. Front Mol Biosci 2024; 10:1337373. [PMID: 38313584 PMCID: PMC10834744 DOI: 10.3389/fmolb.2023.1337373] [Received: 11/15/2023] [Accepted: 12/27/2023] [Indexed: 02/06/2024] Open
Abstract
The human gastrointestinal (gut) microbiome plays a critical role in maintaining host health and has been increasingly recognized as an important factor in precision medicine. High-throughput sequencing technologies have revolutionized -omics data generation, facilitating the characterization of the human gut microbiome with exceptional resolution. The analysis of various -omics data, including metatranscriptomics, metagenomics, glycomics, and metabolomics, holds potential for personalized therapies by revealing information about functional genes, microbial composition, glycans, and metabolites. This multi-omics approach has not only provided insights into the role of the gut microbiome in various diseases but has also facilitated the identification of microbial biomarkers for diagnosis, prognosis, and treatment. Machine learning algorithms have emerged as powerful tools for extracting meaningful insights from complex datasets, and more recently have been applied to metagenomics data via efficiently identifying microbial signatures, predicting disease states, and determining potential therapeutic targets. Despite these rapid advancements, several challenges remain, such as key knowledge gaps, algorithm selection, and bioinformatics software parametrization. In this mini-review, our primary focus is metagenomics, while recognizing that other -omics can enhance our understanding of the functional diversity of organisms and how they interact with the host. We aim to explore the current intersection of multi-omics, precision medicine, and machine learning in advancing our understanding of the gut microbiome. A multidisciplinary approach holds promise for improving patient outcomes in the era of precision medicine, as we unravel the intricate interactions between the microbiome and human health.
Affiliation(s)
- Jingyue Wu
  - Department of Biochemistry and Molecular Medicine, School of Medicine and Health Sciences, The George Washington University, Washington, DC, United States
- Stephanie S. Singleton
  - Department of Biochemistry and Molecular Medicine, School of Medicine and Health Sciences, The George Washington University, Washington, DC, United States
- Urnisha Bhuiyan
  - Department of Biochemistry and Molecular Medicine, School of Medicine and Health Sciences, The George Washington University, Washington, DC, United States
- Lori Krammer
  - Department of Biochemistry and Molecular Medicine, School of Medicine and Health Sciences, The George Washington University, Washington, DC, United States
  - Milken Institute School of Public Health, The George Washington University, Washington, DC, United States
- Raja Mazumder
  - Department of Biochemistry and Molecular Medicine, School of Medicine and Health Sciences, The George Washington University, Washington, DC, United States
  - The McCormick Genomic and Proteomic Center, The George Washington University, Washington, DC, United States
9
Qi P, Huang M, Ren X, Zhai Y, Qiu C, Zhu H. Identification of potential biomarkers and therapeutic targets related to post-traumatic stress disorder due to traumatic brain injury. Eur J Med Res 2024; 29:44. [PMID: 38212778 PMCID: PMC10782540 DOI: 10.1186/s40001-024-01640-x] [Received: 08/12/2023] [Accepted: 01/03/2024] [Indexed: 01/13/2024] Open
Abstract
BACKGROUND: Post-traumatic stress disorder (PTSD), a disease state that has an unclear pathogenesis, imposes a substantial burden on individuals and society. Traumatic brain injury (TBI) is one of the most significant triggers of PTSD. Identifying biomarkers associated with TBI-related PTSD will help researchers to uncover the underlying mechanism that drives disease development. Furthermore, it remains to be confirmed whether different types of traumas share a common mechanism of action.
METHODS: For this study, we screened the eligible data sets from the Gene Expression Omnibus (GEO) database, obtained differentially expressed genes (DEGs) through analysis, conducted functional enrichment analysis on the DEGs in order to understand their molecular mechanisms, constructed a PPI network, used various algorithms to obtain hub genes, and finally evaluated, validated, and analyzed the diagnostic performance of the hub genes.
RESULTS: A total of 430 upregulated and 992 down-regulated differentially expressed genes were extracted from the TBI data set. A total of 1919 upregulated and 851 down-regulated differentially expressed genes were extracted from the PTSD data set. Functional enrichment analysis revealed that the differentially expressed genes had biological functions linked to molecular regulation, cell signaling transduction, cell metabolic regulation, and immune response. After constructing a PPI network and introducing algorithm analysis, the upregulated hub genes were identified as VNN1, SERPINB2, and ETFDH, and the down-regulated hub genes were identified as FLT3LG, DYRK1A, DCN, and FKBP8. In addition, by comparing the data with patients with other types of trauma, it was revealed that PTSD showed different molecular processes that are under the influence of different trauma characteristics and responses.
CONCLUSIONS: By exploring the role of different types of traumas during the pathogenesis of PTSD, its possible molecular mechanisms have been revealed, providing vital information for understanding the complex pathways associated with TBI-related PTSD. The data in this study has important implications for the design and development of new diagnostic and therapeutic methods needed to treat and manage PTSD.
Affiliation(s)
- Peng Qi
  - Department of Emergency, First Medical Center of Chinese PLA General Hospital, 28 Fuxing Road, Beijing, 100853, China
- Mengjie Huang
  - Department of Nephrology, First Medical Center of Chinese PLA General Hospital, 28 Fuxing Road, Beijing, 100853, China
- Xuewen Ren
  - Department of Emergency, First Medical Center of Chinese PLA General Hospital, 28 Fuxing Road, Beijing, 100853, China
- Yongzhi Zhai
  - Department of Emergency, First Medical Center of Chinese PLA General Hospital, 28 Fuxing Road, Beijing, 100853, China
- Chen Qiu
  - Department of Orthopedics, Fourth Medical Center of Chinese PLA General Hospital, Beijing, 100853, China.
- Haiyan Zhu
  - Department of Emergency, First Medical Center of Chinese PLA General Hospital, 28 Fuxing Road, Beijing, 100853, China.
10
Ren JX, Chen L, Guo W, Feng KY, Cai YD, Huang T. Patterns of Gene Expression Profiles Associated with Colorectal Cancer in Colorectal Mucosa by Using Machine Learning Methods. Comb Chem High Throughput Screen 2024; 27:2921-2934. [PMID: 37957897 DOI: 10.2174/0113862073266300231026103844] [Received: 06/08/2023] [Revised: 09/11/2023] [Accepted: 09/30/2023] [Indexed: 11/15/2023]
Abstract
BACKGROUND: Colorectal cancer (CRC) has a very high incidence and lethality rate and is one of the most dangerous cancer types. Timely diagnosis can effectively reduce the incidence of colorectal cancer. Changes in para-cancerous tissues may serve as an early signal for tumorigenesis. Comparison of the differences in gene expression between para-cancerous and normal mucosa can help in the diagnosis of CRC and understanding the mechanisms of development.
OBJECTIVES: This study aimed to identify specific genes at the level of gene expression, which are expressed in normal mucosa and may be predictive of CRC risk.
METHODS: A machine learning approach was used to analyze transcriptomic data in 459 samples of normal colonic mucosal tissue from 322 CRC cases and 137 non-CRC, in which each sample contained 28,706 gene expression levels. The genes were ranked using four ranking methods based on importance estimation (LASSO, LightGBM, MCFS, and mRMR) and four classification algorithms (decision tree [DT], K-nearest neighbor [KNN], random forest [RF], and support vector machine [SVM]) were combined with incremental feature selection [IFS] methods to construct a prediction model with excellent performance.
RESULT: The top-ranked genes, namely, HOXD12, CDH1, and S100A12, were associated with tumorigenesis based on previous studies.
CONCLUSION: This study summarized four sets of quantitative classification rules based on the DT algorithm, providing clues for understanding the microenvironmental changes caused by CRC. According to the rules, the effect of CRC on normal mucosa can be determined.
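Incremental feature selection (IFS), named in the Methods above, is commonly described as taking a pre-ranked feature list, evaluating a classifier on the top 1, top 2, and so on up to the top k features, and keeping the prefix with the best score. The sketch below follows that description under stated assumptions: a leave-one-out 1-nearest-neighbour accuracy stands in for the study's classifiers and metrics, and every name and data value is hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def incremental_feature_selection(
    ranked: Sequence[int],
    evaluate: Callable[[Sequence[int]], float],
) -> Tuple[List[int], float, List[Tuple[int, float]]]:
    """Score every top-k prefix of a ranked feature list and keep the best one."""
    best_k, best_score = 0, float("-inf")
    history = []
    for k in range(1, len(ranked) + 1):
        score = evaluate(ranked[:k])
        history.append((k, score))
        if score > best_score:  # strict '>' prefers the smaller set on ties
            best_k, best_score = k, score
    return list(ranked[:best_k]), best_score, history


def loo_1nn_accuracy(X, y, cols) -> float:
    """Leave-one-out accuracy of a 1-nearest-neighbour rule on the chosen columns."""
    correct = 0
    for i in range(len(X)):
        nearest, nearest_dist = None, float("inf")
        for j in range(len(X)):
            if j == i:
                continue
            dist = sum((X[i][c] - X[j][c]) ** 2 for c in cols)
            if dist < nearest_dist:
                nearest, nearest_dist = j, dist
        correct += int(y[nearest] == y[i])
    return correct / len(X)


# Toy data: column 0 separates the classes, columns 1 and 2 are noise.
X = [
    [0.0, 5.0, 1.0], [0.1, 1.0, 9.0], [0.2, 8.0, 4.0],  # class 0
    [1.0, 5.1, 1.1], [1.1, 1.1, 9.1], [1.2, 8.1, 4.1],  # class 1
]
y = [0, 0, 0, 1, 1, 1]
ranked_features = [0, 1, 2]  # assumed output of some ranking method

selected, best_score, history = incremental_feature_selection(
    ranked_features, lambda cols: loo_1nn_accuracy(X, y, cols)
)
```

On this toy input the informative column alone classifies every sample correctly, and adding the noisy columns lowers the accuracy, so the shortest prefix is kept.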
Collapse
Affiliation(s)
- Jing Xin Ren
- School of Life Sciences, Shanghai University, Shanghai, 200444, China
| | - Lei Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai, 201306, China
| | - Wei Guo
- Key Laboratory of Stem Cell Biology, Shanghai Jiao Tong University School of Medicine (SJTUSM) & Shanghai Institutes for Biological Sciences (SIBS), Chinese Academy of Sciences (CAS), Shanghai, 200030, China
| | - Kai Yan Feng
- Department of Computer Science, Guangdong AIB Polytechnic College, Guangzhou, 510507, China
| | - Yu-Dong Cai
- School of Life Sciences, Shanghai University, Shanghai, 200444, China
| | - Tao Huang
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
- CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
| |
|
11
|
Jiang S, Wang T, Zhang KH. Data-driven decision-making for precision diagnosis of digestive diseases. Biomed Eng Online 2023; 22:87. [PMID: 37658345 PMCID: PMC10472739 DOI: 10.1186/s12938-023-01148-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/14/2023] [Accepted: 08/15/2023] [Indexed: 09/03/2023] Open
Abstract
Modern omics technologies can generate massive amounts of biomedical data, providing unprecedented opportunities for individualized precision medicine. However, traditional statistical methods cannot effectively process and utilize such big data. To meet this new challenge, machine learning algorithms have been developed and applied rapidly in recent years; they are capable of reducing dimensionality, extracting features, organizing data, and forming automatable data-driven clinical decision systems. Data-driven clinical decision-making has promising applications in precision medicine and has been studied in digestive diseases, including early diagnosis and screening, molecular typing, staging and stratification of digestive malignancies, as well as precise diagnosis of Crohn's disease, auxiliary diagnosis of imaging and endoscopy, differential diagnosis of cystic lesions, etiology discrimination of acute abdominal pain, stratification of upper gastrointestinal bleeding (UGIB), and real-time diagnosis of esophageal motility function, showing good application prospects. Herein, after a brief introduction to methods for data-driven decision-making, we review recent progress in data-driven clinical decision-making for the precision diagnosis of digestive diseases and discuss its limitations.
Affiliation(s)
- Song Jiang
- Department of Gastroenterology, The First Affiliated Hospital of Nanchang University, No. 17, Yongwai Zheng Street, Nanchang, 330006 China
- Jiangxi Institute of Gastroenterology and Hepatology, Nanchang, 330006 China
| | - Ting Wang
- Department of Gastroenterology, The First Affiliated Hospital of Nanchang University, No. 17, Yongwai Zheng Street, Nanchang, 330006 China
- Jiangxi Institute of Gastroenterology and Hepatology, Nanchang, 330006 China
| | - Kun-He Zhang
- Department of Gastroenterology, The First Affiliated Hospital of Nanchang University, No. 17, Yongwai Zheng Street, Nanchang, 330006 China
- Jiangxi Institute of Gastroenterology and Hepatology, Nanchang, 330006 China
| |
|
12
|
Asadnia A, Nazari E, Goshayeshi L, Zafari N, Moetamani-Ahmadi M, Goshayeshi L, Azari H, Pourali G, Khalili-Tanha G, Abbaszadegan MR, Khojasteh-Leylakoohi F, Bazyari M, Kahaei MS, Ghorbani E, Khazaei M, Hassanian SM, Gataa IS, Kiani MA, Peters GJ, Ferns GA, Batra J, Lam AKY, Giovannetti E, Avan A. The Prognostic Value of ASPHD1 and ZBTB12 in Colorectal Cancer: A Machine Learning-Based Integrated Bioinformatics Approach. Cancers (Basel) 2023; 15:4300. [PMID: 37686578 PMCID: PMC10486397 DOI: 10.3390/cancers15174300] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2023] [Revised: 08/22/2023] [Accepted: 08/24/2023] [Indexed: 09/10/2023] Open
Abstract
Introduction: Colorectal cancer (CRC) is a common cancer associated with poor outcomes, underscoring a need for the identification of novel prognostic and therapeutic targets to improve outcomes. This study aimed to identify genetic variants and differentially expressed genes (DEGs) using genome-wide DNA and RNA sequencing followed by validation in a large cohort of patients with CRC. Methods: Whole genome and gene expression profiling were used to identify DEGs and genetic alterations in 146 patients with CRC. Gene Ontology, Reactome, GSEA, and Human Disease Ontology were employed to study the biological processes and pathways involved in CRC. Survival analysis on dysregulated genes in patients with CRC was conducted using Cox regression and Kaplan-Meier analysis. The STRING database was used to construct a protein-protein interaction (PPI) network. Moreover, candidate genes were subjected to ML-based analysis and receiver operating characteristic (ROC) curve analysis. Subsequently, the expression of the identified genes was evaluated by real-time PCR (RT-PCR) in another cohort of 64 patients with CRC. Gene variants affecting the regulation of candidate gene expression were further validated by whole exome sequencing (WES) in 15 patients with CRC. Results: A total of 3576 DEGs in the early stages of CRC and 2985 DEGs in the advanced stages of CRC were identified. The ASPHD1 and ZBTB12 genes were identified as potential prognostic markers. Moreover, the combination of ASPHD1 and ZBTB12 genes was sensitive, and the two were considered specific markers, with an area under the curve (AUC) of 0.934, 1.00, and 0.986, respectively. The expression levels of these two genes were higher in patients with CRC. Moreover, our data identified two novel genetic variants, the rs925939730 variant in ASPHD1 and the rs1428982750 variant in ZBTB12, as being potentially involved in the regulation of gene expression.
Conclusions: Our findings provide a proof of concept for the prognostic values of two novel genes, ASPHD1 and ZBTB12, and their associated variants (rs925939730 and rs1428982750) in CRC, supporting further functional analyses to evaluate the value of emerging biomarkers in colorectal cancer.
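As an illustration of the Kaplan-Meier analysis this abstract refers to, the product-limit estimator can be sketched in a few lines. This is a generic textbook sketch, not the study's code; the follow-up times and event flags below are invented toy values, and the function name is hypothetical.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return [(t, S(t))] at each distinct time with at least one observed event.

    events[i] is 1 if the event (e.g. death) was observed at times[i],
    and 0 if the patient was censored at that time.
    """
    pairs = sorted(zip(times, events))
    n_at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(pairs) and pairs[i][0] == t:
            deaths += pairs[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set.
        n_at_risk -= removed
    return curve


# Toy cohort (assumption): six patients, follow-up in months.
toy_times = [2, 3, 3, 5, 8, 8]
toy_events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(toy_times, toy_events)
for t, s in km_curve:
    print(t, round(s, 4))
```

Comparing two such curves (for example high versus low expression of a candidate gene) is what a log-rank test or Cox model would then formalise; that step is not shown here.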
Affiliation(s)
- Alireza Asadnia
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad 91779-48564, Iran; (A.A.); (N.Z.); (M.M.-A.); (H.A.); (G.P.); (G.K.-T.); (F.K.-L.); (E.G.); (M.K.); (S.M.H.)
- Medical Genetics Research Center, Mashhad University of Medical Sciences, Mashhad 91886-17871, Iran; (M.R.A.); (M.S.K.)
- Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad 13944-91388, Iran;
| | - Elham Nazari
- Department of Health Information Technology and Management, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran 19839-69411, Iran;
| | - Ladan Goshayeshi
- Department of Gastroenterology and Hepatology, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad 91779-48564, Iran;
- Surgical Oncology Research Center, Mashhad University of Medical Sciences, Mashhad 91779-48954, Iran;
| | - Nima Zafari
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad 91779-48564, Iran; (A.A.); (N.Z.); (M.M.-A.); (H.A.); (G.P.); (G.K.-T.); (F.K.-L.); (E.G.); (M.K.); (S.M.H.)
| | - Mehrdad Moetamani-Ahmadi
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad 91779-48564, Iran; (A.A.); (N.Z.); (M.M.-A.); (H.A.); (G.P.); (G.K.-T.); (F.K.-L.); (E.G.); (M.K.); (S.M.H.)
- Medical Genetics Research Center, Mashhad University of Medical Sciences, Mashhad 91886-17871, Iran; (M.R.A.); (M.S.K.)
| | - Lena Goshayeshi
- Surgical Oncology Research Center, Mashhad University of Medical Sciences, Mashhad 91779-48954, Iran;
| | - Haneih Azari
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad 91779-48564, Iran; (A.A.); (N.Z.); (M.M.-A.); (H.A.); (G.P.); (G.K.-T.); (F.K.-L.); (E.G.); (M.K.); (S.M.H.)
| | - Ghazaleh Pourali
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad 91779-48564, Iran; (A.A.); (N.Z.); (M.M.-A.); (H.A.); (G.P.); (G.K.-T.); (F.K.-L.); (E.G.); (M.K.); (S.M.H.)
| | - Ghazaleh Khalili-Tanha
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad 91779-48564, Iran; (A.A.); (N.Z.); (M.M.-A.); (H.A.); (G.P.); (G.K.-T.); (F.K.-L.); (E.G.); (M.K.); (S.M.H.)
| | - Mohammad Reza Abbaszadegan
- Medical Genetics Research Center, Mashhad University of Medical Sciences, Mashhad 91886-17871, Iran; (M.R.A.); (M.S.K.)
- Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad 13944-91388, Iran;
| | - Fatemeh Khojasteh-Leylakoohi
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad 91779-48564, Iran; (A.A.); (N.Z.); (M.M.-A.); (H.A.); (G.P.); (G.K.-T.); (F.K.-L.); (E.G.); (M.K.); (S.M.H.)
- Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad 13944-91388, Iran;
| | - MohammadJavad Bazyari
- Department of Medical Biotechnology, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad 91779-48564, Iran;
| | - Mir Salar Kahaei
- Medical Genetics Research Center, Mashhad University of Medical Sciences, Mashhad 91886-17871, Iran; (M.R.A.); (M.S.K.)
| | - Elnaz Ghorbani
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad 91779-48564, Iran; (A.A.); (N.Z.); (M.M.-A.); (H.A.); (G.P.); (G.K.-T.); (F.K.-L.); (E.G.); (M.K.); (S.M.H.)
| | - Majid Khazaei
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad 91779-48564, Iran; (A.A.); (N.Z.); (M.M.-A.); (H.A.); (G.P.); (G.K.-T.); (F.K.-L.); (E.G.); (M.K.); (S.M.H.)
- Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad 13944-91388, Iran;
| | - Seyed Mahdi Hassanian
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad 91779-48564, Iran; (A.A.); (N.Z.); (M.M.-A.); (H.A.); (G.P.); (G.K.-T.); (F.K.-L.); (E.G.); (M.K.); (S.M.H.)
- Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad 13944-91388, Iran;
| | | | - Mohammad Ali Kiani
- Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad 13944-91388, Iran;
| | - Godefridus J. Peters
- Department of Biochemistry, Medical University of Gdansk, 80-211 Gdansk, Poland;
- Cancer Center Amsterdam, Amsterdam U.M.C., VU University Medical Center (VUMC), Department of Medical Oncology, 1081 HV Amsterdam, The Netherlands
| | - Gordon A. Ferns
- Brighton & Sussex Medical School, Department of Medical Education, Falmer, Brighton, Sussex BN1 9PH, UK;
| | - Jyotsna Batra
- Faculty of Health, School of Biomedical Sciences, Queensland University of Technology (QUT), Brisbane, QLD 4059, Australia;
| | - Alfred King-yin Lam
- Pathology, School of Medicine and Dentistry, Gold Coast Campus, Griffith University, Gold Coast, QLD 4222, Australia;
| | - Elisa Giovannetti
- Cancer Center Amsterdam, Amsterdam U.M.C., VU University Medical Center (VUMC), Department of Medical Oncology, 1081 HV Amsterdam, The Netherlands
- Cancer Pharmacology Lab, AIRC Start Up Unit, Fondazione Pisana per La Scienza, 56017 Pisa, Italy
| | - Amir Avan
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad 91779-48564, Iran; (A.A.); (N.Z.); (M.M.-A.); (H.A.); (G.P.); (G.K.-T.); (F.K.-L.); (E.G.); (M.K.); (S.M.H.)
- Department of Medical Biotechnology, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad 91779-48564, Iran;
- Faculty of Health, School of Biomedical Sciences, Queensland University of Technology (QUT), Brisbane, QLD 4059, Australia;
| |
|
13
|
Khalili-Tanha G, Mohit R, Asadnia A, Khazaei M, Dashtiahangar M, Maftooh M, Nassiri M, Hassanian SM, Ghayour-Mobarhan M, Kiani MA, Ferns GA, Batra J, Nazari E, Avan A. Identification of ZMYND19 as a novel biomarker of colorectal cancer: RNA-sequencing and machine learning analysis. J Cell Commun Signal 2023:10.1007/s12079-023-00779-2. [PMID: 37428302 DOI: 10.1007/s12079-023-00779-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2022] [Accepted: 05/29/2023] [Indexed: 07/11/2023] Open
Abstract
Colorectal cancer (CRC) is the third most common cause of cancer-related deaths. The five-year relative survival rate for CRC is estimated to be approximately 90% for patients diagnosed at early stages and 14% for those diagnosed at advanced stages of disease. Hence, the development of accurate prognostic markers is required. Bioinformatics enables the identification of dysregulated pathways and novel biomarkers. RNA expression profiling was performed in CRC patients from the TCGA database using a machine learning approach to identify differentially expressed genes (DEGs). Survival curves were assessed using Kaplan-Meier analysis to identify prognostic biomarkers. Furthermore, the molecular pathways, protein-protein interactions, the co-expression of DEGs, and the correlation between DEGs and clinical data were evaluated. The diagnostic markers were then determined based on machine learning analysis. The results indicated that key upregulated genes are associated with RNA processing and the heterocycle metabolic process, including C10orf2, NOP2, DKC1, BYSL, RRP12, PUS7, MTHFD1L, and PPAT. Furthermore, the survival analysis identified NOP58, OSBPL3, DNAJC2, and ZMYND19 as prognostic markers. The combineROC curve analysis indicated that the combination of C10orf2-PPAT-ZMYND19 can be considered a diagnostic marker panel, with sensitivity, specificity, and AUC values of 0.98, 1.00, and 0.99, respectively. Finally, the ZMYND19 gene was validated in CRC patients. In conclusion, novel biomarkers of CRC have been identified that may offer a promising strategy for early diagnosis, potential treatment, and better prognosis.
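The sensitivity, specificity, and AUC figures quoted in this abstract are standard ROC quantities. The sketch below shows how they are computed from a per-sample score and a binary label. It is an assumption-laden illustration: the scores stand in for some combined marker score, the data are invented, and nothing here reproduces the study's analysis.

```python
from typing import Sequence, Tuple


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a randomly chosen positive sample scores
    higher than a randomly chosen negative one (ties count one half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores: Sequence[float], labels: Sequence[int],
                            threshold: float) -> Tuple[float, float]:
    """Call a sample positive when its score is >= threshold."""
    tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= threshold)
    fn = sum(1 for s, l in zip(scores, labels) if l == 1 and s < threshold)
    tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < threshold)
    fp = sum(1 for s, l in zip(scores, labels) if l == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (assumption): 1 = tumour, 0 = normal; scores are a hypothetical
# combined marker score per sample.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.5, 0.3, 0.2, 0.1]

auc = roc_auc(toy_scores, toy_labels)
sens, spec = sensitivity_specificity(toy_scores, toy_labels, threshold=0.6)
print(auc, sens, spec)
```

Note that sensitivity and specificity depend on the chosen threshold, whereas the AUC summarises performance over all thresholds.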
Affiliation(s)
- Ghazaleh Khalili-Tanha
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Medical Genetics Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Reza Mohit
- Department of Anesthesia, Bushehr University of Medical Sciences, Bushehr, Iran
| | - Alireza Asadnia
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Medical Genetics Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Majid Khazaei
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
| | | | - Mina Maftooh
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Mohammadreza Nassiri
- Recombinant Proteins Research Group, The Research Institute of Biotechnology, Ferdowsi University of Mashhad, Mashhad, Iran
| | - Seyed Mahdi Hassanian
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Majid Ghayour-Mobarhan
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Mohammad Ali Kiani
- Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad, Iran
- Department of Pediatrics, Ghaem Hospital, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Gordon A Ferns
- Brighton & Sussex Medical School, Division of Medical Education, Falmer, Brighton, Sussex, BN1 9PH, UK
| | - Jyotsna Batra
- Centre for Genomics and Personalised Health, Queensland University of Technology, Brisbane, 4059, Australia
- Translational Research Institute, Queensland University of Technology, Brisbane, 4102, Australia
- Faculty of Health, School of Biomedical Sciences, Queensland University of Technology, Brisbane, Australia
| | - Elham Nazari
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran.
- Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad, Iran.
| | - Amir Avan
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran.
- College of Medicine, University of Warith Al-Anbiyaa, Karbala, Iraq.
- Faculty of Health, School of Biomedical Sciences, Queensland University of Technology, Brisbane, Australia.
| |
|
14
|
Azari H, Nazari E, Mohit R, Asadnia A, Maftooh M, Nassiri M, Hassanian SM, Ghayour-Mobarhan M, Shahidsales S, Khazaei M, Ferns GA, Avan A. Machine learning algorithms reveal potential miRNAs biomarkers in gastric cancer. Sci Rep 2023; 13:6147. [PMID: 37061507 PMCID: PMC10105697 DOI: 10.1038/s41598-023-32332-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/02/2022] [Accepted: 03/26/2023] [Indexed: 04/17/2023] Open
Abstract
Gastric cancer (GC) is among the cancers with the highest mortality rates globally, and the current survival rate is 30% even with the use of combination therapies. Recently, mounting evidence indicates the potential role of miRNAs in the diagnosis and prognostic assessment of cancers. In state-of-the-art cancer research, machine learning (ML) has gained increasing attention as a way to find clinically useful biomarkers. The present study aimed to identify potential diagnostic and prognostic miRNAs in GC with the application of ML. Using the TCGA database and ML algorithms such as Support Vector Machine (SVM), Random Forest, k-NN, etc., a panel of 29 miRNAs was obtained. Among the ML algorithms, SVM was chosen (AUC: 88.5%, accuracy: 93% in GC). To find common molecular mechanisms of the miRNAs, their common gene targets were predicted using online databases such as miRWalk, miRDB, and TargetScan. Functional and enrichment analyses were performed using Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG), and protein-protein interactions (PPI) were identified using the STRING database. Pathway analysis of the target genes revealed the involvement of several cancer-related pathways including miRNA-mediated inhibition of translation, regulation of gene expression by genetic imprinting, and the Wnt signaling pathway. Survival and ROC curve analysis showed that the expression levels of hsa-miR-21, hsa-miR-133a, hsa-miR-146b, and hsa-miR-29c were associated with higher mortality and potentially earlier detection of GC patients. A panel of dysregulated miRNAs that may serve as reliable biomarkers for gastric cancer was identified using machine learning, which represents a powerful tool in biomarker identification.
Affiliation(s)
- Hanieh Azari
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Medical Genetics Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Elham Nazari
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Reza Mohit
- Department of Anesthesia, Bushehr University of Medical Sciences, Bushehr, Iran
| | - Alireza Asadnia
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Medical Genetics Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Mina Maftooh
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Mohammadreza Nassiri
- Recombinant Proteins Research Group, The Research Institute of Biotechnology, Ferdowsi University of Mashhad, Mashhad, Iran
| | - Seyed Mahdi Hassanian
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Majid Ghayour-Mobarhan
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
| | | | - Majid Khazaei
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Gordon A Ferns
- Division of Medical Education, Brighton and Sussex Medical School, Falmer, Brighton, Sussex, BN1 9PH, UK.
- Faculty of Health, School of Biomedical Sciences, Queensland University of Technology, Brisbane, Australia.
| | - Amir Avan
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran.
- Faculty of Health, School of Biomedical Sciences, Queensland University of Technology, Brisbane, Australia.
- College of Medicine, University of Warith Al-Anbiyaa, Karbala, Iraq.
| |
|
15
|
Zafari N, Bathaei P, Velayati M, Khojasteh-Leylakoohi F, Khazaei M, Fiuji H, Nassiri M, Hassanian SM, Ferns GA, Nazari E, Avan A. Integrated analysis of multi-omics data for the discovery of biomarkers and therapeutic targets for colorectal cancer. Comput Biol Med 2023; 155:106639. [PMID: 36805214 DOI: 10.1016/j.compbiomed.2023.106639] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2022] [Revised: 01/14/2023] [Accepted: 02/05/2023] [Indexed: 02/12/2023]
Abstract
The considerable burden of colorectal cancer and its rising trend in young adults emphasize the necessity of understanding its underlying mechanisms, providing new diagnostic and prognostic markers, and improving therapeutic approaches. Precision medicine is a new trend all over the world, and the identification of novel biomarkers and therapeutic targets is a step towards it. In this context, multi-omics data and integrated analysis are being investigated to develop personalized medicine in the management of colorectal cancer. Given the large amount of data produced by multi-omics approaches, data integration and analysis is a great challenge. In this Review, we summarize how statistical and machine learning techniques are applied to analyze multi-omics data and how they contribute to the discovery of useful diagnostic and prognostic biomarkers and therapeutic targets. Moreover, we discuss the importance of these biomarkers and therapeutic targets in the future clinical management of colorectal cancer. Taken together, integrated analysis of multi-omics data has great potential for finding novel diagnostic and prognostic biomarkers and therapeutic targets; however, there are still challenges to overcome in future studies.
Affiliation(s)
- Nima Zafari
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Parsa Bathaei
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Mahla Velayati
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Fatemeh Khojasteh-Leylakoohi
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran; Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad, Iran; Medical Genetics Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Majid Khazaei
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran; Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Hamid Fiuji
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Mohammadreza Nassiri
- Recombinant Proteins Research Group, The Research Institute of Biotechnology, Ferdowsi University of Mashhad, Mashhad, Iran
| | - Seyed Mahdi Hassanian
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran; Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Gordon A Ferns
- Brighton & Sussex Medical School, Division of Medical Education, Falmer, Brighton, Sussex, BN1 9PH, UK
| | - Elham Nazari
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran; Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad, Iran.
| | - Amir Avan
- Metabolic Syndrome Research Center, Mashhad University of Medical Sciences, Mashhad, Iran; Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad, Iran; Medical Genetics Research Center, Mashhad University of Medical Sciences, Mashhad, Iran.
| |
|
16
|
Li H, Lin D, Yu Z, Li H, Zhao S, Hainisayimu T, Liu L, Wang K. A nomogram model based on the number of examined lymph nodes-related signature to predict prognosis and guide clinical therapy in gastric cancer. Front Immunol 2022; 13:947802. [PMID: 36405735 PMCID: PMC9667298 DOI: 10.3389/fimmu.2022.947802] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2022] [Accepted: 09/30/2022] [Indexed: 01/25/2023] Open
Abstract
BACKGROUND Increasing evidence suggests that the number of examined lymph nodes (ELNs) is strongly linked to the survivorship of gastric cancer (GC). The goal of this study was to assess the prognostic implications of the ELNs number and to construct an ELNs-based risk signature and nomogram model to predict overall survival (OS) characteristics in GC patients. METHODS This inception cohort study included 19,317 GC patients from the U.S. Surveillance, Epidemiology, and End Results (SEER) database, who were separated into a training group and an internal validation group. The nomogram was built with the training set, then internally verified with SEER data, and externally validated with two different data sets. Based on the RNA-seq data, ELNs-related DERNAs (DElncRNAs, DEmiRNAs, and DEmRNAs) and immune cells were identified. The LASSO-Cox regression analysis was utilized to construct ELNs-related DERNAs and immune cell prognostic signature in The Cancer Genome Atlas (TCGA) cohort. The OS of subgroups with high- and low-ELN signature was compared using the Kaplan-Meier (K-M) analysis. A nomogram was successfully constructed based on the ELNs signature and other clinical characteristics. The concordance index (C-index), calibration plot, receiver operating characteristic curve, and decision curve analysis (DCA) were all used to evaluate the nomogram model. The meta-analysis, the Gene Expression Profiling Interactive Analysis database, and reverse transcription-quantitative PCR (RT-qPCR) were utilized to validate the RNA expression or abundance of prognostic genes and immune cells between GC tissues and normal gastric tissues, respectively. Finally, we analyzed the correlations between immune checkpoints, chemotherapy drug sensitivity, and risk score. RESULTS The multivariate analysis revealed that the high ELNs improved OS compared with low ELNs (hazard ratio [HR] = 0.659, 95% confidence interval [CI]: 0.626-0.694, p < 0.0001).
Using the training set, a nomogram incorporating ELNs was built and proven to have good calibration and discrimination (C-index [95% CI], 0.714 [0.710-0.718]), which was validated in the internal validation set (C-index [95% CI], 0.720 [0.714-0.726]), the TCGA set (C-index [95% CI], 0.693 [0.662-0.724]), and the Chinese set (C-index [95% CI], 0.750 [0.720-0.782]). An ELNs-related signature model based on ELNs group, regulatory T cells (Tregs), neutrophils, CDKN2B-AS1, H19, HOTTIP, LINC00643, MIR663AHG, TMEM236, ZNF705A, and hsa-miR-135a-5p was constructed by the LASSO-Cox regression analysis. The result showed that OS was remarkably lower in patients with high-ELNs signature compared with those with low-ELN signature (HR = 2.418, 95% CI: 1.804-3.241, p < 0.001). This signature performed well in predicting 1-, 3-, and 5-year survival (AUC [95% CI] = 0.688 [0.612-0.763], 0.744 [0.659-0.830], and 0.778 [0.647-0.909], respectively). The multivariate Cox analysis illustrated that the risk score was an independent predictor of survival for patients with GC. Moreover, the expression of prognostic genes (LINC00643, TMEM236, and hsa-miR-135a-5p) displayed differences between GC tissues and adjacent non-tumor tissues. The C-index of the nomogram that can be used to predict the OS of GC patients was 0.710 (95% CI: 0.663-0.753). Both the calibration plots and DCA showed that the nomogram has good predictive performance. Moreover, the signature was significantly correlated with the N stage and T stage. According to our analysis, GC patients in the low-ELN signature group may have a better immunotherapy response and OS outcome. CONCLUSIONS We explored the prognostic role of ELNs in GC and successfully constructed an ELNs signature linked to the GC prognosis in TCGA. The findings manifested that the signature is a powerful predictive indicator for patients with GC. The signature might contain potential biomarkers for treatment response prediction for GC patients. 
Additionally, we identified a novel and robust nomogram combining the characteristics of ELNs and clinical factors for predicting 1-, 3-, and 5-year OS in GC patients, which will facilitate personalized survival prediction and aid clinical decision-making in GC patients.
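The concordance index (C-index) used above to evaluate the nomogram can be illustrated with a minimal sketch of Harrell's C. This is a generic illustration under stated assumptions: the survival times, event flags, and risk scores are invented, the function name is hypothetical, and ties in survival time are simply treated as not comparable.

```python
from typing import Sequence


def concordance_index(times: Sequence[float], events: Sequence[int],
                      risk_scores: Sequence[float]) -> float:
    """Harrell's C: among comparable pairs, the fraction in which the patient
    with the shorter observed survival has the higher predicted risk.
    Ties in predicted risk count one half."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable


# Toy cohort (assumption): follow-up in months, 1 = death observed,
# 0 = censored; risk scores as a model such as a nomogram might output.
toy_times = [5, 10, 12, 20]
toy_events = [1, 0, 1, 1]
toy_risk = [0.9, 0.4, 0.1, 0.2]

c_index = concordance_index(toy_times, toy_events, toy_risk)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for reported values in the 0.69 to 0.75 range.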
Affiliation(s)
- Huling Li
- School of Public Health, Xinjiang Medical University, Urumqi, China
| | - Dandan Lin
- School of Public Health, Xinjiang Medical University, Urumqi, China
| | - Zhen Yu
- Department of Gastrointestinal Surgery, The Third Affiliated Hospital, Xinjiang Medical University, Urumqi, China
| | - Hui Li
- Central Laboratory of Xinjiang Medical University, Urumqi, China
| | - Shi Zhao
- JC School of Public Health and Primary Care, Chinese University of Hong Kong, Hong Kong, Hong Kong SAR, China
| | - Tuersun Hainisayimu
- Department of Biochemistry and Molecular Biology, Basic Medicine School, Xinjiang Medical University, Urumqi, China
| | - Lin Liu
- Department of Gastrointestinal Surgery, The Third Affiliated Hospital, Xinjiang Medical University, Urumqi, China. *Correspondence: Kai Wang; Lin Liu
| | - Kai Wang
- Department of Medical Engineering and Technology, Xinjiang Medical University, Urumqi, China. *Correspondence: Kai Wang; Lin Liu
| |
|
17
|
Phan TTH, Nguyen-Doan D, Nguyen-Huu D, Nguyen-Van H, Pham-Hong T. Investigation on new Mel frequency cepstral coefficients features and hyper-parameters tuning technique for bee sound recognition. Soft comput 2022. [DOI: 10.1007/s00500-022-07596-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
|
18
|
Bayrak T, Çetin Z, Saygılı Eİ, Ogul H. Identifying the tumor location-associated candidate genes in development of new drugs for colorectal cancer using machine-learning-based approach. Med Biol Eng Comput 2022; 60:2877-2897. [DOI: 10.1007/s11517-022-02641-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/18/2021] [Accepted: 07/28/2022] [Indexed: 02/07/2023]
|
19
|
Keshavarz-Rahaghi F, Pleasance E, Kolisnik T, Jones SJM. A p53 transcriptional signature in primary and metastatic cancers derived using machine learning. Front Genet 2022; 13:987238. [PMID: 36134028 PMCID: PMC9483853 DOI: 10.3389/fgene.2022.987238] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/05/2022] [Accepted: 08/01/2022] [Indexed: 11/13/2022] Open
Abstract
The tumor suppressor gene, TP53, has the highest rate of mutation among all genes in human cancer. This transcription factor plays an essential role in the regulation of many cellular processes. Mutations in TP53 result in loss of wild-type p53 function in a dominant negative manner. Although TP53 is a well-studied gene, the transcriptome modifications caused by the mutations in this gene have not yet been explored in a pan-cancer study using both primary and metastatic samples. In this work, we used a random forest model to stratify tumor samples based on TP53 mutational status and detected a p53 transcriptional signature. We hypothesize that the existence of this transcriptional signature is due to the loss of wild-type p53 function and is universal across primary and metastatic tumors as well as different tumor types. Additionally, we showed that the algorithm successfully detected this signature in samples with apparent silent mutations that affect correct mRNA splicing. Furthermore, we observed that most of the highly ranked genes contributing to the classification extracted from the random forest have known associations with p53 within the literature. We suggest that other genes found in this list including GPSM2, OR4N2, CTSL2, SPERT, and RPE65 protein coding genes have yet undiscovered linkages to p53 function. Our analysis of time on different therapies also revealed that this signature is more effective than the recorded TP53 status in detecting patients who can benefit from platinum therapies and taxanes. Our findings delineate a p53 transcriptional signature, expand the knowledge of p53 biology and further identify genes important in p53 related pathways.
Affiliation(s)
- Faeze Keshavarz-Rahaghi
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Department of Bioinformatics, University of British Columbia, Vancouver, BC, Canada
| | - Erin Pleasance
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
| | - Tyler Kolisnik
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- School of Natural and Computational Sciences, Massey University, Auckland, New Zealand
| | - Steven J. M. Jones
- Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Vancouver, BC, Canada
- *Correspondence: Steven J. M. Jones,
| |
|
20
|
Pietrucci D, Teofani A, Milanesi M, Fosso B, Putignani L, Messina F, Pesole G, Desideri A, Chillemi G. Machine Learning Data Analysis Highlights the Role of Parasutterella and Alloprevotella in Autism Spectrum Disorders. Biomedicines 2022; 10:biomedicines10082028. [PMID: 36009575 PMCID: PMC9405825 DOI: 10.3390/biomedicines10082028] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 06/20/2022] [Revised: 08/10/2022] [Accepted: 08/15/2022] [Indexed: 11/25/2022]
Abstract
In recent years, the involvement of the gut microbiota in disease and health has been investigated by sequencing the 16S gene from fecal samples. Dysbiotic gut microbiota was also observed in Autism Spectrum Disorder (ASD), a neurodevelopmental disorder characterized by gastrointestinal symptoms. However, despite the relevant number of studies, it is still difficult to identify a typical dysbiotic profile in ASD patients. The discrepancies among these studies are due to technical factors (i.e., experimental procedures) and external parameters (i.e., dietary habits). In this paper, we collected 959 samples from eight available projects (540 ASD and 419 Healthy Controls, HC) and reduced the observed bias among studies. Then, we applied a Machine Learning (ML) approach to create a predictor able to discriminate between ASD and HC. We tested and optimized three algorithms: Random Forest, Support Vector Machine and Gradient Boosting Machine. All three algorithms confirmed the importance of five different genera, including Parasutterella and Alloprevotella. Furthermore, our results show that ML algorithms could identify common taxonomic features by comparing datasets obtained from countries characterized by latent confounding variables.
Affiliation(s)
- Daniele Pietrucci
- Department for Innovation in Biological, Agro-Food and Forest Systems (DIBAF), University of Tuscia, 01100 Viterbo, Italy
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, IBIOM, CNR, 70126 Bari, Italy
| | - Adelaide Teofani
- Department of Biology, University of Rome Tor Vergata, Via Montpellier 1, 00133 Rome, Italy
| | - Marco Milanesi
- Department for Innovation in Biological, Agro-Food and Forest Systems (DIBAF), University of Tuscia, 01100 Viterbo, Italy
| | - Bruno Fosso
- Department of Biosciences, Biotechnology and Biopharmaceutics, University of Bari “A. Moro”, Piazza Umberto I, 1, 70121 Bari, Italy
| | - Lorenza Putignani
- Unit of Microbiology and Diagnostic Immunology, Units of Microbiomics, Department of Diagnostic and Laboratory Medicine, Bambino Gesù Children’s Hospital, IRCCS, 00146 Rome, Italy
| | - Francesco Messina
- Laboratory of Microbiology and Biological Bank National Institute for Infectious Diseases “Lazzaro Spallanzani” Istituto di Ricovero e Cura a Carattere Scientifico, 00149 Rome, Italy
| | - Graziano Pesole
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, IBIOM, CNR, 70126 Bari, Italy
- Department of Biosciences, Biotechnology and Biopharmaceutics, University of Bari “A. Moro”, Piazza Umberto I, 1, 70121 Bari, Italy
| | - Alessandro Desideri
- Department of Biology, University of Rome Tor Vergata, Via Montpellier 1, 00133 Rome, Italy
| | - Giovanni Chillemi
- Department for Innovation in Biological, Agro-Food and Forest Systems (DIBAF), University of Tuscia, 01100 Viterbo, Italy
- Correspondence: ; Tel.: +39-0761-357-429
| |
|
21
|
Yang L, Wei S, Zhang J, Hu Q, Hu W, Cao M, Zhang L, Wang Y, Wang P, Wang K. Construction of a predictive model for immunotherapy efficacy in lung squamous cell carcinoma based on the degree of tumor-infiltrating immune cells and molecular typing. J Transl Med 2022; 20:364. [PMID: 35962453 PMCID: PMC9373274 DOI: 10.1186/s12967-022-03565-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 05/02/2022] [Accepted: 08/02/2022] [Indexed: 12/20/2022]
Abstract
Background To construct a predictive model of immunotherapy efficacy for patients with lung squamous cell carcinoma (LUSC) based on the degree of tumor-infiltrating immune cells (TIIC) in the tumor microenvironment (TME). Methods The data of 501 patients with LUSC in the TCGA database were used as a training set and grouped using non-negative matrix factorization (NMF) based on the degree of TIIC assessed by single-sample gene set enrichment analysis (GSEA). Two data sets (GSE126044 and GSE135222) were used as validation sets. Genes were screened for modeling by least absolute shrinkage and selection operator (LASSO) regression and used to construct a model based on an immunophenotyping score (IPTS). RNA extraction and qPCR were performed to validate the prognostic value of IPTS in our independent LUSC cohort. The receiver operating characteristic (ROC) curve was constructed to determine the predictive value for immunotherapy efficacy. Kaplan–Meier survival curve analysis was performed to evaluate the prognostic predictive ability. Correlation analysis and enrichment analysis were used to explore the potential mechanism of IPTS molecular typing involved in predicting the immunotherapy efficacy for patients with LUSC. Results The training set was divided into a low immune cell infiltration type (C1) and a high immune cell infiltration type (C2) by NMF typing, and the IPTS molecular typing based on the 17-gene model could replace the results of the NMF typing. The area under the ROC curve (AUC) was 0.82. In both validation sets, the IPTS of patients who responded to immunotherapy was significantly higher than that of patients who did not respond (P = 0.0032 and P = 0.0451), and the AUC was 0.95 (95% CI = 0.84–1.00) and 0.77 (95% CI = 0.58–0.96), respectively. In our independent cohort, we validated its ability to predict the response to cancer immunotherapy, with an AUC of 0.88 (95% CI = 0.66–1.00). GSEA suggested that the high IPTS group was mainly involved in immune-related signaling pathways. Conclusions IPTS molecular typing based on the degree of TIIC in the TME could predict the efficacy of immunotherapy in patients with LUSC well and has a certain prognostic value. Supplementary Information The online version contains supplementary material available at 10.1186/s12967-022-03565-7.
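As a rough illustration of the Kaplan–Meier analysis mentioned in this abstract, the following sketch computes the product-limit survival estimate in plain Python. The function name `kaplan_meier` and the toy follow-up data are hypothetical and are not taken from the study; this is a minimal sketch, not the authors' pipeline.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time per patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival_probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set.
        n_at_risk -= leaving
    return curve


# Hypothetical toy cohort: four patients, the second one censored.
toy_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Censored patients lower the number at risk without producing a step in the curve, which is why the toy curve has three steps for four patients.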
Affiliation(s)
- Lingge Yang
- Department of Respiratory and Critical Care Medicine, The Fourth Affiliated Hospital, International Institutes of Medicine, Zhejiang University School of Medicine, Yiwu, China
| | - Shuli Wei
- Department of Respiratory and Critical Care Medicine, The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China
| | - Jingnan Zhang
- Department of Respiratory and Critical Care Medicine, The Fourth Affiliated Hospital, International Institutes of Medicine, Zhejiang University School of Medicine, Yiwu, China
| | - Qiongjie Hu
- Department of Respiratory and Critical Care Medicine, The Fourth Affiliated Hospital, International Institutes of Medicine, Zhejiang University School of Medicine, Yiwu, China
| | - Wansong Hu
- Department of Heart Center, The Fourth Affiliated Hospital, International Institutes of Medicine, Zhejiang University School of Medicine, Yiwu, China
| | - Mengqing Cao
- Department of Respiratory and Critical Care Medicine, The Fourth Affiliated Hospital, International Institutes of Medicine, Zhejiang University School of Medicine, Yiwu, China
| | - Long Zhang
- Department of Respiratory and Critical Care Medicine, The Fourth Affiliated Hospital, International Institutes of Medicine, Zhejiang University School of Medicine, Yiwu, China
| | - Yongfang Wang
- Department of Respiratory and Critical Care Medicine, The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China
| | - Pingli Wang
- Department of Respiratory and Critical Care Medicine, The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China.
| | - Kai Wang
- Department of Respiratory and Critical Care Medicine, The Fourth Affiliated Hospital, International Institutes of Medicine, Zhejiang University School of Medicine, Yiwu, China.
| |
|
22
|
A novel biomarker selection method combining graph neural network and gene relationships applied to microarray data. BMC Bioinformatics 2022; 23:303. [PMID: 35883022 PMCID: PMC9327232 DOI: 10.1186/s12859-022-04848-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2022] [Accepted: 07/15/2022] [Indexed: 11/10/2022]
Abstract
BACKGROUND The discovery of critical biomarkers is significant for clinical diagnosis and for drug research and development. Researchers usually obtain biomarkers from microarray data, which suffer from the curse of dimensionality. Feature selection in machine learning is usually used to solve this problem. However, most methods do not fully consider feature dependence, especially the real pathway relationships of genes. RESULTS Experimental results show that the proposed method is superior to classical algorithms and advanced methods in feature number and accuracy, and the selected features are more significant. METHOD This paper proposes a feature selection method based on a graph neural network. The proposed method uses the actual dependencies between features and the Pearson correlation coefficient to construct graph-structured data. Information propagation and aggregation operations based on the graph neural network are applied to fuse node information on the graph-structured data. The redundant features are clustered by the spectral clustering method. Then, a feature ranking aggregation model using eight feature evaluation methods is applied to each cluster for feature selection. CONCLUSION The proposed method can effectively remove redundant features. The algorithm's output has high stability and classification accuracy, and it can potentially identify candidate biomarkers.
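The graph-construction step described in this abstract can be loosely sketched as follows. This covers only the Pearson-correlation part, under the assumption that two genes are linked when their absolute correlation reaches a cutoff. The function names, the 0.8 threshold, and the toy expression values are illustrative assumptions, not details from the paper, which also uses known gene dependencies.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / (sd_x * sd_y)


def correlation_graph(expression, threshold=0.8):
    """Build an undirected gene graph as an adjacency dict.

    expression : dict mapping gene name -> list of expression values
    threshold  : hypothetical cutoff on |r| for drawing an edge
    """
    genes = sorted(expression)
    adjacency = {g: set() for g in genes}
    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            if abs(pearson(expression[a], expression[b])) >= threshold:
                adjacency[a].add(b)
                adjacency[b].add(a)
    return adjacency


# Hypothetical toy data: g1 and g2 are perfectly correlated, g3 is not.
toy_graph = correlation_graph({
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],
    "g3": [1, -1, 1, -1],
})
```

An adjacency structure of this kind is the usual input to message passing in a graph neural network or to spectral clustering.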
|
23
|
Distinct power of bone marrow microRNA signatures and tumor suppressor genes for early detection of acute leukemia. Clin Transl Oncol 2022; 24:1372-1380. [PMID: 35247197 DOI: 10.1007/s12094-022-02781-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2021] [Accepted: 01/13/2022] [Indexed: 10/18/2022]
Abstract
BACKGROUND Acute leukemia involving lymphocytic and myeloid cells is a cancer with a high mortality rate. Swift and timely diagnosis might be a potential approach to improving patient prognosis and survival. MicroRNA (miRNA) signatures are emerging for their promising diagnostic potential. MiRNA levels from bone marrow can be used as prognostic biomarkers. METHODS The current study was designed to evaluate whether profiling of microRNAs and tumor suppressor genes (TSGs) in hematopoietic bone marrow could help in the early detection of acute leukemia. We also assessed DNA methyltransferase 3A (DNMT3A) expression and its possible epigenetic effects on miRNA and TSG expression levels. The expression levels of ten miRNAs and four TSGs involved in acute lymphocytic leukemia (ALL) as well as acute myeloid leukemia (AML) were quantified in 43 and 40 bone marrow samples of ALL and AML patients in comparison with cancer-free subjects via real-time quantitative PCR (RT-qPCR). Receiver operating characteristic (ROC) analysis of the miRNAs was performed in the study groups. Further, the correlation between DNMT3A and the TSGs was calculated. RESULTS Significant differences were detected in the bone marrow expression of miRNAs and TSGs (P < 0.05) between acute leukemia patients and the healthy group. ROC analysis confirmed the ability of miR-30a, miR-101, miR-132, miR-129, miR-124, and miR-143 to discriminate both ALL and AML patients with an area under the ROC curve of ≥ 0.80 (P < 0.001) and high accuracy. The correlation between DNMT3A and the P15/P16 TSGs revealed that DNMT3A plays a vital role in the epigenetic control of TSG expression. Our findings indicated that the downregulation of bone marrow miRNAs and TSGs accompanied acute leukemia development. CONCLUSIONS The authors conclude that this study could contribute to introducing useful biomarkers for acute leukemia diagnosis.
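To make the ROC metric reported here concrete, the sketch below computes the area under the ROC curve as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counting half. The function name `roc_auc` and the toy scores are hypothetical and unrelated to the study's data.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of positives against negatives.

    scores : classifier or biomarker scores (higher = more likely positive)
    labels : 1 for positive (e.g. patient), 0 for negative (e.g. control)
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical toy scores for three positives and two negatives.
toy_auc = roc_auc([0.9, 0.8, 0.4, 0.3, 0.2], [1, 1, 0, 1, 0])
```

For a marker that is downregulated in disease, as the abstract reports for these miRNAs, the score would typically be negated before the comparison so that higher still means more likely positive.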
|
24
|
Mo L, Su Y, Yuan J, Xiao Z, Zhang Z, Lan X, Huang D. Comparisons of Forecasting for Survival Outcome for Head and Neck Squamous Cell Carcinoma by using Machine Learning Models based on Multi-omics. Curr Genomics 2022; 23:94-108. [PMID: 36778975 PMCID: PMC9878835 DOI: 10.2174/1389202923666220204153744] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2021] [Revised: 01/13/2022] [Accepted: 01/19/2022] [Indexed: 11/22/2022]
Abstract
Background: Machine learning methods have shown excellent predictive ability in a wide range of fields. For the survival of head and neck squamous cell carcinoma (HNSC), the influence of multi-omics is crucial. This study attempts to establish a variety of machine learning multi-omics models to predict the survival of HNSC and find the most suitable machine learning prediction method. Methods: The HNSC clinical data and multi-omics data were downloaded from the TCGA database. The important variables were screened by the LASSO algorithm. We used a total of 12 supervised machine learning models to predict the outcome of HNSC survival and compared the results. In vitro qPCR was performed to verify core genes predicted by the random forest algorithm. Results: Across the twelve models, multi-omics data performed better than any single omic alone. The Bayesian network (BN) model (area under the curve [AUC] 0.8250, F1 score = 0.7917) and the random forest (RF) model (AUC 0.8002, F1 score = 0.7839) showed good predictive performance on HNSC multi-omics data. The results of in vitro qPCR were consistent with the RF algorithm. Conclusion: Machine learning methods could better forecast the survival outcome of HNSC. This study found that the BN and RF models performed best. Moreover, multi-omics forecasts were better than those from a single omic alone in HNSC.
Affiliation(s)
- Liying Mo
- School of Basic Medical Sciences, Guangxi Medical University, Nanning, Guangxi, China
- These authors contributed equally to this work
| | - Yuangang Su
- School of Basic Medical Sciences, Guangxi Medical University, Nanning, Guangxi, China
- Research Centre for Regenerative Medicine, Guangxi Key Laboratory of Regenerative Medicine, Guangxi Medical University, Nanning, Guangxi, China
- These authors contributed equally to this work
| | - Jianhui Yuan
- School of Basic Medical Sciences, Guangxi Medical University, Nanning, Guangxi, China
- The Laboratory of Biomedical Photonics and Engineering, Guangxi Medical University, Nanning, China
| | - Zhiwei Xiao
- School of Information and Management, Guangxi Medical University, Nanning, Guangxi, China
| | - Ziyan Zhang
- Life Sciences Institute, Guangxi Medical University, Nanning, Guangxi, China
| | - Xiuwan Lan
- School of Basic Medical Sciences, Guangxi Medical University, Nanning, Guangxi, China
- These authors contributed equally to this work
| | - Daizheng Huang
- School of Basic Medical Sciences, Guangxi Medical University, Nanning, Guangxi, China
- The Laboratory of Biomedical Photonics and Engineering, Guangxi Medical University, Nanning, China
- Address correspondence to this author at the School of Basic Medical Sciences, Guangxi Medical University, Nanning, Guangxi, China; The Laboratory of Biomedical Photonics and Engineering, Guangxi Medical University, Nanning, China; Tel: +867715358270; E-mail:
| |
|
25
|
Li YR, Meng K, Yang G, Liu BH, Li CQ, Zhang JY, Zhang XM. Diagnostic genes and immune infiltration analysis of colorectal cancer determined by LASSO and SVM machine learning methods: a bioinformatics analysis. J Gastrointest Oncol 2022; 13:1188-1203. [PMID: 35837194 PMCID: PMC9274036 DOI: 10.21037/jgo-22-536] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/24/2022] [Accepted: 06/16/2022] [Indexed: 03/28/2024]
Abstract
BACKGROUND Genetic factors account for approximately 35% of colorectal cancer risk. The specificity and sensitivity of previous diagnostic biomarkers for colorectal cancer could not meet the needs of clinical application. The expanding scale and inherent complexity of biological data have encouraged a growing use of machine learning to build informative and predictive models of the underlying biological processes. The aim of this study is to identify diagnostic genes of colorectal cancer by using machine learning methods. METHODS The GSE41328 and GSE106582 data sets were downloaded from the Gene Expression Omnibus (GEO) database. The gene expression differences between colon cancer and normal tissues were analyzed. The key colorectal cancer genes were screened and validated by Least Absolute Shrinkage and Selection Operator (LASSO) and Support Vector Machine (SVM) regression. Immune cell infiltration and the correlation with the key genes in patients with colon cancer were further analyzed by CIBERSORT. RESULTS Eleven key genes were identified as biomarkers for colon cancer, namely ASCL2, BEST4, CFD, DPEPCFD, FOXQ1, TRIB3, KLF4, MMP7, MMP11, PYY, and PDK4. The mean area under the receiver operating characteristic (ROC) curve (AUC) of all 11 genes for colon cancer diagnosis was 0.94, with a range of 0.91-0.97. In the validation set, the expression of the 11 key genes was significantly different between colon cancer and normal subjects (P<0.05), and the mean AUC was 0.82, with a range of 0.70-0.88. Immune cell infiltration analyses demonstrated that the relative quantities of plasma cells, T cells, B cells, NK cells, M0, M1, Dendritic cells resting, Mast cells resting, Mast cells activated, and Neutrophils in the tumor group were significantly different from those in the normal group. CONCLUSIONS ASCL2, BEST4, CFD, DPEPCFD, FOXQ1, TRIB3, KLF4, MMP7, MMP11, PYY, and PDK4 were identified as the key genes for colon cancer diagnosis. These genes are expected to become novel diagnostic markers and targets of new pharmacotherapies for colorectal cancer.
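The LASSO screening step named in this abstract relies on an L1 penalty that drives uninformative coefficients to exactly zero. A minimal sketch of that mechanism, using cyclic coordinate descent with soft-thresholding on the objective (1/2n)·||y − Xw||² + λ·||w||₁ and no intercept, might look like the following. The function names, the penalty value, and the toy matrix are assumptions for illustration only; the study's actual software and settings are not stated here.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Fit LASSO weights by cyclic coordinate descent (no intercept).

    X   : list of samples, each a list of feature values
    y   : list of target values
    lam : L1 penalty strength (hypothetical value chosen by the caller)
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (norm / n) if norm > 0 else 0.0
    return w


# Hypothetical toy data: feature 0 equals the target, feature 1 is uninformative.
toy_X = [[1, 1], [2, -1], [3, 1], [4, -1]]
toy_y = [1, 2, 3, 4]
toy_weights = lasso_coordinate_descent(toy_X, toy_y, lam=0.1)
```

In the toy fit the informative feature keeps a slightly shrunken weight while the uninformative one is set to exactly zero, which is the property that makes LASSO usable for gene screening.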
Affiliation(s)
- Yan-Rong Li
- Department of Gastroenterology, The First Affiliated Hospital of Jinzhou Medical University, Jinzhou, China
| | - Ke Meng
- Department of Gastroenterology and Hepatology, The First Medical Center, Chinese PLA General Hospital, Beijing, China
| | - Guang Yang
- Department of Laboratory, The Red Cross (SEN GONG GENERAL) Hospital of Heilongjiang, Heilongjiang, China
| | - Bao-Hai Liu
- Department of Gastroenterology, The First Affiliated Hospital of Jinzhou Medical University, Jinzhou, China
| | - Chu-Qiao Li
- Department of Gastroenterology, The First Affiliated Hospital of Jinzhou Medical University, Jinzhou, China
| | - Jia-Yuan Zhang
- Department of Gastroenterology, The First Affiliated Hospital of Jinzhou Medical University, Jinzhou, China
| | - Xiao-Mei Zhang
- Department of Gastroenterology and Hepatology, The First Medical Center, Chinese PLA General Hospital, Beijing, China
| |
|
26
|
Su Y, Tian X, Gao R, Guo W, Chen C, Chen C, Jia D, Li H, Lv X. Colon cancer diagnosis and staging classification based on machine learning and bioinformatics analysis. Comput Biol Med 2022; 145:105409. [PMID: 35339846 DOI: 10.1016/j.compbiomed.2022.105409] [Citation(s) in RCA: 37] [Impact Index Per Article: 18.5] [Received: 01/08/2022] [Revised: 02/20/2022] [Accepted: 03/12/2022] [Indexed: 12/13/2022]
Abstract
Advanced metastasis makes colon cancer more difficult to treat. Finding markers of colon cancer can help diagnose the stage of the cancer in time and improve the prognosis through timely treatment. This paper uses gene expression profiling data from The Cancer Genome Atlas (TCGA) for the diagnosis of colon cancer and its staging. In this study, we first selected the gene modules with the greatest correlation with cancer by Weighted Gene Co-expression Network Analysis (WGCNA), extracted the characteristic genes from the differential expression results using the least absolute shrinkage and selection operator (Lasso) algorithm, and performed survival analysis. The genes in the modules were then combined with the Lasso-extracted feature genes to diagnose colon cancer versus healthy controls using RF, SVM and decision trees, and colon cancer staging was diagnosed using differentially expressed genes for each stage. Protein-Protein Interaction (PPI) networks were then constructed for 289 genes to identify clusters of aggregated proteins for survival analysis. Finally, the RF model had the best results. In the diagnosis of colon cancer versus the control group under fold cross-validation, it reached an average accuracy of 99.81%, an F1 value of 0.9968, an accuracy of 99.88%, and a recall of 99.5%. In the diagnosis of colon cancer stages I, II, III and IV, it reached an average accuracy of 91.5%, an F1 value of 0.7679, an accuracy of 86.94%, and a recall of 73.04%. Eight genes associated with colon cancer prognosis were identified: GCNT2, GLDN, SULT1B1, UGT2B15, PTGDR2, GPR15, BMP5 and CPT2.
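For readers less familiar with the metrics quoted in this abstract, the sketch below shows how accuracy, precision, recall and F1 follow from the counts of true and false predictions in a binary task. The function name `classification_metrics` and the toy labels are hypothetical; they do not reproduce the study's data or its exact evaluation protocol.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary classification."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels: 1 = tumour, 0 = control.
toy_metrics = classification_metrics([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
```

Because F1 combines precision and recall, it can sit well below accuracy when classes are imbalanced, which is consistent with the gap between accuracy and F1 reported for the staging task.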
Affiliation(s)
- Ying Su
- College of Software, Xinjiang University, Urumqi, 830046, Xinjiang, China
| | - Xuecong Tian
- College of Software, Xinjiang University, Urumqi, 830046, Xinjiang, China
| | - Rui Gao
- College of Information Science and Engineering, Xinjiang University, Urumqi, 830046, China
| | - Wenjia Guo
- Affiliated Tumor Hospital of Xinjiang Medical University, Urumqi, 830011, China
| | - Cheng Chen
- College of Software, Xinjiang University, Urumqi, 830046, Xinjiang, China.
| | - Chen Chen
- College of Information Science and Engineering, Xinjiang University, Urumqi, 830046, China; Cloud Computing Engineering Technology Research Center of Xinjiang, Kelamayi, 834099, China
| | - Dongfang Jia
- College of Software, Xinjiang University, Urumqi, 830046, Xinjiang, China
| | - Hongtao Li
- Affiliated Tumor Hospital of Xinjiang Medical University, Urumqi, 830011, China
| | - Xiaoyi Lv
- College of Software, Xinjiang University, Urumqi, 830046, Xinjiang, China; Key Laboratory of Signal Detection and Processing, Xinjiang University, Urumqi, 830046, Xinjiang, China.
| |
|
27
|
Hammad A, Elshaer M, Tang X. Identification of potential biomarkers with colorectal cancer based on bioinformatics analysis and machine learning. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2021; 18:8997-9015. [PMID: 34814332 DOI: 10.3934/mbe.2021443] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 06/13/2023]
Abstract
Colorectal cancer (CRC) is one of the most common malignancies worldwide. Biomarker discovery is critical to improving CRC diagnosis, and machine learning offers a new platform to study the etiology of CRC for this purpose. Therefore, the current study aimed to perform an integrated bioinformatics and machine learning analysis to explore novel biomarkers for CRC prognosis. In this study, we acquired gene expression microarray data from the Gene Expression Omnibus (GEO) database. The GSE103512 microarray expression dataset was downloaded and integrated. Subsequently, differentially expressed genes (DEGs) were identified and functionally analyzed via Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG). Furthermore, protein-protein interaction (PPI) network analysis was conducted using the STRING database and Cytoscape software to identify hub genes, which were then subjected to Support Vector Machine (SVM), receiver operating characteristic (ROC) curve and survival analyses to explore their diagnostic value. Meanwhile, TCGA transcriptomics data in the Gene Expression Profiling Interactive Analysis (GEPIA) database and the pathology data presented in the Human Protein Atlas (HPA) database were used to verify our transcriptomic analyses. A total of 105 DEGs were identified in this study. Functional enrichment analysis showed that these genes were significantly enriched in biological processes related to cancer progression. Thereafter, PPI network analysis identified a total of 10 significant hub genes. The ROC curve was used to assess the potential application of the biomarkers in CRC diagnosis; the area under the ROC curve (AUC) of these genes exceeded 0.92, suggesting that this risk classifier can discriminate between CRC patients and normal controls. Moreover, the prognostic values of these hub genes were confirmed by survival analyses using different CRC patient cohorts. Our results demonstrated that these 10 differentially expressed hub genes could be used as potential biomarkers for CRC diagnosis.
Affiliation(s)
- Ahmed Hammad
- Department of Biochemistry and Department of Thoracic Surgery of the First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310003, China
- Radiation Biology Department, National Center for Radiation Research and Technology, Egyptian Atomic Energy Authority, Cairo 13759, Egypt
| | - Mohamed Elshaer
- Department of Biochemistry and Department of Thoracic Surgery of the First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310003, China
- Labeled Compounds Department, Hot Labs Center, Egyptian Atomic Energy Authority, Cairo 13759, Egypt
| | - Xiuwen Tang
- Department of Biochemistry and Department of Thoracic Surgery of the First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310003, China
| |
|