1
Ozcelik F, Dundar MS, Yildirim AB, Henehan G, Vicente O, Sánchez-Alcázar JA, Gokce N, Yildirim DT, Bingol NN, Karanfilska DP, Bertelli M, Pojskic L, Ercan M, Kellermayer M, Sahin IO, Greiner-Tollersrud OK, Tan B, Martin D, Marks R, Prakash S, Yakubi M, Beccari T, Lal R, Temel SG, Fournier I, Ergoren MC, Mechler A, Salzet M, Maffia M, Danalev D, Sun Q, Nei L, Matulis D, Tapaloaga D, Janecke A, Bown J, Cruz KS, Radecka I, Ozturk C, Nalbantoglu OU, Sag SO, Ko K, Arngrimsson R, Belo I, Akalin H, Dundar M. The impact and future of artificial intelligence in medical genetics and molecular medicine: an ongoing revolution. Funct Integr Genomics 2024; 24:138. [PMID: 39147901 DOI: 10.1007/s10142-024-01417-9] [Received: 07/02/2024] [Revised: 08/01/2024] [Accepted: 08/05/2024] [Indexed: 08/17/2024]
Abstract
Artificial intelligence (AI) platforms have emerged as pivotal tools in genetics and molecular medicine, as in many other fields. The growth in patient data, identification of new diseases and phenotypes, discovery of new intracellular pathways, availability of greater sets of omics data, and the need to continuously analyse them have led to the development of new AI platforms. AI continues to weave its way into the fabric of genetics with the potential to unlock new discoveries and enhance patient care. This technology is setting the stage for breakthroughs across various domains, including dysmorphology, rare hereditary diseases, cancers, clinical microbiomics, the investigation of zoonotic diseases, and omics studies in all medical disciplines. AI's role in facilitating a deeper understanding of these areas heralds a new era of personalised medicine, where treatments and diagnoses are tailored to the individual's molecular features, offering a more precise approach to combating genetic or acquired disorders. The significance of these AI platforms is growing as they assist healthcare professionals in the diagnostic and treatment processes, marking a pivotal shift towards more informed, efficient, and effective medical practice. In this review, we explore the range of AI tools available and show how they have become vital in various sectors of genomic research supporting clinical decisions.
Affiliation(s)
- Firat Ozcelik
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Mehmet Sait Dundar
  - Department of Electrical and Computer Engineering, Graduate School of Engineering and Sciences, Abdullah Gul University, Kayseri, Turkey
- A Baki Yildirim
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Gary Henehan
  - School of Food Science and Environmental Health, Technological University of Dublin, Dublin, Ireland
- Oscar Vicente
  - Institute for the Conservation and Improvement of Valencian Agrodiversity (COMAV), Universitat Politècnica de València, Valencia, Spain
- José A Sánchez-Alcázar
  - Centro de Investigación Biomédica en Red: Enfermedades Raras, Centro Andaluz de Biología del Desarrollo (CABD-CSIC-Universidad Pablo de Olavide), Instituto de Salud Carlos III, Sevilla, Spain
- Nuriye Gokce
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Duygu T Yildirim
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Nurdeniz Nalbant Bingol
  - Department of Translational Medicine, Institute of Health Sciences, Bursa Uludag University, Bursa, Turkey
- Dijana Plaseska Karanfilska
  - Research Centre for Genetic Engineering and Biotechnology, Macedonian Academy of Sciences and Arts, Skopje, Macedonia
- Lejla Pojskic
  - Institute for Genetic Engineering and Biotechnology, University of Sarajevo, Sarajevo, Bosnia and Herzegovina
- Mehmet Ercan
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Miklos Kellermayer
  - Department of Biophysics and Radiation Biology, Faculty of Medicine, Semmelweis University, Budapest, Hungary
- Izem Olcay Sahin
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Busra Tan
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Donald Martin
  - University Grenoble Alpes, CNRS, TIMC-IMAG/SyNaBi (UMR 5525), Grenoble, France
- Robert Marks
  - Avram and Stella Goldstein-Goren Department of Biotechnology Engineering, Ben-Gurion University of the Negev, Be'er Sheva, Israel
- Satya Prakash
  - Department of Biomedical Engineering, University of McGill, Montreal, QC, Canada
- Mustafa Yakubi
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Tommaso Beccari
  - Department of Pharmaceutical Sciences, University of Perugia, Perugia, Italy
- Ratnesh Lal
  - Neuroscience Research Institute, University of California, Santa Barbara, USA
- Sehime G Temel
  - Department of Translational Medicine, Institute of Health Sciences, Bursa Uludag University, Bursa, Turkey
  - Department of Medical Genetics, Bursa Uludag University Faculty of Medicine, Bursa, Turkey
  - Department of Histology and Embryology, Faculty of Medicine, Bursa Uludag University, Bursa, Turkey
- Isabelle Fournier
  - Réponse Inflammatoire et Spectrométrie de Masse-PRISM, University of Lille, Lille, France
- M Cerkez Ergoren
  - Department of Medical Genetics, Near East University Faculty of Medicine, Nicosia, Cyprus
- Adam Mechler
  - Department of Chemistry, La Trobe Institute for Molecular Science, La Trobe University, Melbourne, VIC, Australia
- Michel Salzet
  - Réponse Inflammatoire et Spectrométrie de Masse-PRISM, University of Lille, Lille, France
- Michele Maffia
  - Department of Experimental Medicine, University of Salento, Via Lecce-Monteroni, Lecce, 73100, Italy
- Dancho Danalev
  - University of Chemical Technology and Metallurgy, Sofia, Bulgaria
- Qun Sun
  - Department of Food Science and Technology, Sichuan University, Chengdu, China
- Lembit Nei
  - School of Engineering Tallinn University of Technology, Tartu College, Tartu, Estonia
- Daumantas Matulis
  - Department of Biothermodynamics and Drug Design, Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Dana Tapaloaga
  - Faculty of Veterinary Medicine, University of Agronomic Sciences and Veterinary Medicine of Bucharest, Bucharest, Romania
- Andres Janecke
  - Department of Paediatrics I, Medical University of Innsbruck, Innsbruck, Austria
  - Division of Human Genetics, Medical University of Innsbruck, Innsbruck, Austria
- James Bown
  - School of Science, Engineering and Technology, Abertay University, Dundee, UK
- Iza Radecka
  - School of Science, Faculty of Science and Engineering, University of Wolverhampton, Wolverhampton, UK
- Celal Ozturk
  - Department of Software Engineering, Erciyes University, Kayseri, Turkey
- Ozkan Ufuk Nalbantoglu
  - Department of Computer Engineering, Engineering Faculty, Erciyes University, Kayseri, Turkey
- Sebnem Ozemri Sag
  - Department of Medical Genetics, Bursa Uludag University Faculty of Medicine, Bursa, Turkey
- Kisung Ko
  - Department of Medicine, College of Medicine, Chung-Ang University, Seoul, Korea
- Reynir Arngrimsson
  - Iceland Landspitali University Hospital, University of Iceland, Reykjavik, Iceland
- Isabel Belo
  - Centre of Biological Engineering, University of Minho, Braga, Portugal
- Hilal Akalin
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey.
- Munis Dundar
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey.
2
Keuper K, Bartek J, Maya-Mendoza A. The nexus of nuclear envelope dynamics, circular economy and cancer cell pathophysiology. Eur J Cell Biol 2024; 103:151394. [PMID: 38340500 DOI: 10.1016/j.ejcb.2024.151394] [Received: 10/29/2023] [Revised: 02/01/2024] [Accepted: 02/02/2024] [Indexed: 02/12/2024]
Abstract
The nuclear envelope (NE) is a critical component in maintaining the function and structure of the eukaryotic nucleus. The NE and lamina are disassembled during each cell cycle to enable an open mitosis. Nuclear architecture construction and deconstruction is a prime example of a circular economy, as it fulfills a highly efficient recycling program bound to continuous assessment of the quality and functionality of the building blocks. Alterations in the nuclear dynamics and lamina structure have emerged as important contributors to both oncogenic transformation and cancer progression. However, the knowledge of the NE breakdown and reassembly is still limited to a fraction of participating proteins and complexes. As cancer cells contain highly diverse nuclei in terms of DNA content, but also in terms of nuclear number, size, and shape, it is of great interest to understand the intricate relationship between these nuclear features in cancer cell pathophysiology. In this review, we provide insights into how those NE dynamics are regulated, and how lamina destabilization processes may alter the NE circular economy. Moreover, we expand the knowledge of the lamina-associated domain region by using strategic algorithms, including Artificial Intelligence, to infer protein associations, assess their function and location, and predict cancer-type specificity with implications for the future of cancer diagnosis, prognosis and treatment. Using this approach we identified NUP98 and MECP2 as potential proteins that exhibit upregulation in Acute Myeloid Leukemia (LAML) patients with implications for early diagnosis.
Affiliation(s)
- Kristina Keuper
  - DNA Replication and Cancer Group, Danish Cancer Institute, Copenhagen, Denmark; Genome Integrity Group, Danish Cancer Institute, Copenhagen, Denmark
- Jiri Bartek
  - Genome Integrity Group, Danish Cancer Institute, Copenhagen, Denmark; Division of Genome Biology, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, SciLifeLab, Stockholm, Sweden
3
Wang CW, Muzakky H, Firdi NP, Liu TC, Lai PJ, Wang YC, Yu MH, Chao TK. Deep learning to assess microsatellite instability directly from histopathological whole slide images in endometrial cancer. NPJ Digit Med 2024; 7:143. [PMID: 38811811 PMCID: PMC11137095 DOI: 10.1038/s41746-024-01131-7] [Received: 10/25/2023] [Accepted: 05/08/2024] [Indexed: 05/31/2024]
Abstract
Molecular classification, particularly microsatellite instability-high (MSI-H), has gained attention for immunotherapy in endometrial cancer (EC). MSI-H is associated with DNA mismatch repair defects and is a crucial treatment predictor. The NCCN guidelines recommend pembrolizumab and nivolumab for advanced or recurrent MSI-H/mismatch repair deficient (dMMR) EC. However, evaluating MSI in all cases is impractical due to time and cost constraints. To overcome this challenge, we present a deep learning-based model designed to accurately and rapidly assess the MSI status of EC using H&E-stained whole slide images. Our framework was evaluated on a dataset of gigapixel histopathology images of 529 patients from the Cancer Genome Atlas (TCGA). In assessing MSI status, the proposed method achieved an F-measure, accuracy, precision and sensitivity of 96%, 94%, 93% and 100%, respectively, for endometrioid carcinoma G1G2, and of 87%, 84%, 81% and 94%, respectively, for endometrioid carcinoma G3. Furthermore, the proposed deep learning framework outperforms four state-of-the-art benchmarked methods by a significant margin (p < 0.001) in terms of accuracy, precision, sensitivity and F-measure. Additionally, a run-time analysis shows an AI inference time of 1.03 seconds per slide, making the proposed framework viable for practical clinical usage. These results highlight the efficacy and efficiency of the proposed model in assessing the MSI status of EC directly from histopathological slides.
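The four measures reported in this abstract can all be derived from a binary confusion matrix. The following is a minimal sketch of that arithmetic, not the authors' evaluation code; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return accuracy, precision, sensitivity and F-measure from confusion counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # also called recall
    denom = precision + sensitivity
    # F-measure (F1) is the harmonic mean of precision and sensitivity.
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f_measure": f_measure,
    }


# Hypothetical counts for illustration only (MSI-H treated as the positive class).
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A sensitivity of 100%, as reported for G1G2, corresponds to `fn == 0` in this formulation: no MSI-H case is missed, although false positives can still lower precision.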
Affiliation(s)
- Ching-Wei Wang
  - Graduate Institute of Biomedical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Hikam Muzakky
  - Graduate Institute of Biomedical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Nabila Puspita Firdi
  - Graduate Institute of Biomedical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Tzu-Chien Liu
  - Graduate Institute of Biomedical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Po-Jen Lai
  - Graduate Institute of Biomedical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Yu-Chi Wang
  - Department of Gynecology and Obstetrics, Tri-Service General Hospital, Taipei, Taiwan
  - Department of Gynecology and Obstetrics, National Defense Medical Center, Taipei, Taiwan
- Mu-Hsien Yu
  - Department of Gynecology and Obstetrics, Tri-Service General Hospital, Taipei, Taiwan
  - Department of Gynecology and Obstetrics, National Defense Medical Center, Taipei, Taiwan
- Tai-Kuang Chao
  - Institute of Pathology and Parasitology, National Defense Medical Center, Taipei, Taiwan.
  - Department of Pathology, Tri-Service General Hospital, Taipei, Taiwan.
4
Van R, Alvarez D, Mize T, Gannavarapu S, Chintham Reddy L, Nasoz F, Han MV. A comparison of RNA-Seq data preprocessing pipelines for transcriptomic predictions across independent studies. BMC Bioinformatics 2024; 25:181. [PMID: 38720247 PMCID: PMC11080237 DOI: 10.1186/s12859-024-05801-x] [Received: 01/31/2024] [Accepted: 05/02/2024] [Indexed: 05/12/2024]
Abstract
BACKGROUND: RNA sequencing combined with machine learning techniques has provided a modern approach to the molecular classification of cancer. Class predictors, reflecting the disease class, can be constructed for known tissue types using the gene expression measurements extracted from cancer patients. One challenge of current cancer predictors is that they often have suboptimal performance estimates when integrating molecular datasets generated from different labs. Often, the data vary in quality, are procured differently, and contain unwanted noise, hampering the ability of a predictive model to extract useful information. Data preprocessing methods can be applied in attempts to reduce these systematic variations and harmonize the datasets before they are used to build a machine learning model for resolving tissue of origin. RESULTS: We aimed to investigate the impact of data preprocessing steps, focusing on normalization, batch effect correction, and data scaling, through trial and comparison. Our goal was to improve the cross-study predictions of tissue of origin for common cancers on large-scale RNA-Seq datasets derived from thousands of patients and over a dozen tumor types. The results showed that the choice of data preprocessing operations affected the performance of the associated classifier models constructed for tissue of origin predictions in cancer. CONCLUSION: By using TCGA as a training set and applying data preprocessing methods, we demonstrated that batch effect correction improved performance measured by weighted F1-score in resolving tissue of origin against an independent GTEx test dataset. On the other hand, the use of data preprocessing operations worsened classification performance when the independent test dataset was aggregated from separate studies in ICGC and GEO. Therefore, based on our findings with these publicly available large-scale RNA-Seq datasets, the application of data preprocessing techniques to a machine learning pipeline is not always appropriate.
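To make the idea of batch-wise harmonization concrete, here is a minimal sketch of per-batch standardization for a single gene. This is a deliberately simple stand-in and is not one of the batch effect correction methods the paper evaluated, which the abstract does not name. The function name `standardize_within_batch`, the expression values, and the study labels are all hypothetical.

```python
from statistics import mean, pstdev


def standardize_within_batch(values, batches):
    """Z-score each value against the mean and SD of its own batch."""
    grouped = {}
    for value, batch in zip(values, batches):
        grouped.setdefault(batch, []).append(value)
    # Per-batch location (mean) and scale (population standard deviation).
    stats = {b: (mean(vs), pstdev(vs)) for b, vs in grouped.items()}
    adjusted = []
    for value, batch in zip(values, batches):
        mu, sd = stats[batch]
        adjusted.append((value - mu) / sd if sd > 0 else 0.0)
    return adjusted


# Hypothetical expression of one gene in two studies with a large offset between them.
expression = [10.0, 12.0, 14.0, 110.0, 112.0, 114.0]
study = ["A", "A", "A", "B", "B", "B"]
adjusted = standardize_within_batch(expression, study)
print(adjusted)
```

After adjustment the two studies share the same scale, which removes the study offset. The same operation would also erase a genuine biological difference if the studies differed in tissue composition, which is one plausible reason preprocessing can hurt as well as help across independent datasets.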
Affiliation(s)
- Richard Van
  - School of Life Sciences, University of Nevada Las Vegas, Las Vegas, NV, USA
  - Nevada Institute of Personalized Medicine, Las Vegas, NV, USA
- Daniel Alvarez
  - Department of Computer Science, University of Nevada Las Vegas, Las Vegas, NV, USA
  - Nevada Institute of Personalized Medicine, Las Vegas, NV, USA
- Travis Mize
  - Icahn School of Medicine at Mount Sinai, Institute for Genomic Health, New York City, NY, USA
- Sravani Gannavarapu
  - Department of Computer Science, University of Nevada Las Vegas, Las Vegas, NV, USA
  - Nevada Institute of Personalized Medicine, Las Vegas, NV, USA
- Lohitha Chintham Reddy
  - Department of Computer Science, University of Nevada Las Vegas, Las Vegas, NV, USA
  - Nevada Institute of Personalized Medicine, Las Vegas, NV, USA
- Fatma Nasoz
  - Department of Computer Science, University of Nevada Las Vegas, Las Vegas, NV, USA
  - Nevada Institute of Personalized Medicine, Las Vegas, NV, USA
- Mira V Han
  - School of Life Sciences, University of Nevada Las Vegas, Las Vegas, NV, USA.
  - Nevada Institute of Personalized Medicine, Las Vegas, NV, USA.
5
Amanzholova A, Coşkun A. Enhancing cancer stage prediction through hybrid deep neural networks: a comparative study. Front Big Data 2024; 7:1359703. [PMID: 38586474 PMCID: PMC10995364 DOI: 10.3389/fdata.2024.1359703] [Received: 12/22/2023] [Accepted: 02/20/2024] [Indexed: 04/09/2024]
Abstract
Efficiently detecting and treating cancer at an early stage is crucial to improve the overall treatment process and mitigate the risk of disease progression. In the realm of research, the utilization of artificial intelligence technologies holds significant promise for enhancing advanced cancer diagnosis. Nonetheless, a notable hurdle arises when striving for precise cancer-stage diagnoses through the analysis of gene sets. Issues such as limited sample volumes, data dispersion, overfitting, and the use of linear classifiers with simple parameters hinder prediction performance. This study introduces an innovative approach for predicting early and late-stage cancers by integrating hybrid deep neural networks. A deep neural network classifier, developed using the open-source TensorFlow library and Keras network, incorporates a novel method that combines genetic algorithms, Extreme Learning Machines (ELM), and Deep Belief Networks (DBN). Specifically, two evolutionary techniques, DBN-ELM-BP and DBN-ELM-ELM, are proposed and evaluated using data from The Cancer Genome Atlas (TCGA), encompassing mRNA expression, miRNA levels, DNA methylation, and clinical information. The models demonstrate outstanding prediction accuracy (89.35%-98.75%) in distinguishing between early- and late-stage cancers. Comparative analysis against existing methods in the literature using the same cancer dataset reveals the superiority of the proposed hybrid method, highlighting its enhanced accuracy in cancer stage prediction.
Affiliation(s)
- Alina Amanzholova
  - Graduate School of Natural and Applied Sciences, Department of Computer Engineering, Gazi University, Ankara, Türkiye
  - Khoja Akhmet Yassawi International Kazakh-Turkish University, Faculty of Engineering, Department of Computer Engineering, Turkistan, Kazakhstan
- Aysun Coşkun
  - Department of Computer Engineering, Faculty of Technology, Gazi University, Ankara, Türkiye
6
Zhang ZH, Du Y, Wei S, Pei W. Multilayered insights: a machine learning approach for personalized prognostic assessment in hepatocellular carcinoma. Front Oncol 2024; 13:1327147. [PMID: 38486931 PMCID: PMC10937467 DOI: 10.3389/fonc.2023.1327147] [Received: 10/24/2023] [Accepted: 12/08/2023] [Indexed: 03/17/2024]
Abstract
Background: Hepatocellular carcinoma (HCC) is a complex malignancy, and precise prognosis assessment is vital for personalized treatment decisions. Objective: This study aimed to develop a multi-level prognostic risk model for HCC, offering individualized prognosis assessment and treatment guidance. Methods: By utilizing data from The Cancer Genome Atlas (TCGA) and the Surveillance, Epidemiology, and End Results (SEER) database, we performed differential gene expression analysis to identify genes associated with survival in HCC patients. The HCC Differential Gene Prognostic Model (HCC-DGPM) was developed through multivariate Cox regression. Clinical indicators were incorporated into the HCC-DGPM using Cox regression, leading to the creation of the HCC Multilevel Prognostic Model (HCC-MLPM). Immune function was evaluated using single-sample Gene Set Enrichment Analysis (ssGSEA), and immune cell infiltration was assessed. Patient responsiveness to immunotherapy was evaluated using the Immunophenoscore (IPS). Clinical drug responsiveness was investigated using drug-related information from the TCGA database. Cox regression, Kaplan-Meier analysis, and trend association tests were conducted. Results: Seven differentially expressed genes from the TCGA database were used to construct the HCC-DGPM. Additionally, four clinical indicators associated with survival were identified from the SEER database for model adjustment. The adjusted HCC-MLPM showed significantly improved discriminative capacity (AUC=0.819 vs. 0.724). External validation involving 153 HCC patients from the International Cancer Genome Consortium (ICGC) database verified the performance of the HCC-MLPM (AUC=0.776). Significantly, the HCC-MLPM exhibited predictive capacity for patient response to immunotherapy and clinical drug efficacy (P < 0.05). Conclusion: This study offers comprehensive insights into HCC prognosis and develops predictive models to enhance patient outcomes. The evaluation of immune function, immune cell infiltration, and clinical drug responsiveness enhances our comprehension and management of HCC.
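The AUC values quoted in this abstract measure discrimination: the probability that a randomly chosen patient with the event receives a higher risk score than a randomly chosen patient without it. The sketch below computes that quantity for a binary outcome by direct pairwise comparison. It assumes a fixed-horizon binary endpoint; the abstract does not say whether the authors used a time-dependent AUC, and the function name `auc`, the risk scores, and the event labels are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical model risk scores and observed events (1 = event occurred).
risk = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
event = [1, 1, 1, 0, 0, 0]
value = auc(risk, event)
print(f"AUC = {value:.3f}")
```

The pairwise form is quadratic in the number of patients, which is fine for illustration; rank-based formulations give the same value more efficiently on large cohorts.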
Affiliation(s)
- Yunxiang Du
  - Department of Oncology, Huai’an 82 Hospital, China RongTong Medical Healthcare Group Co., Ltd., Chengdu, China
- Shuzhen Wei
  - Department of Oncology, Huai’an 82 Hospital, China RongTong Medical Healthcare Group Co., Ltd., Chengdu, China
- Weidong Pei
  - Department of Discipline Development, China RongTong Medical Healthcare Group Co., Ltd., Chengdu, China
7
Ren Y, Gao Y, Du W, Qiao W, Li W, Yang Q, Liang Y, Li G. Classifying breast cancer using multi-view graph neural network based on multi-omics data. Front Genet 2024; 15:1363896. [PMID: 38444760 PMCID: PMC10912483 DOI: 10.3389/fgene.2024.1363896] [Received: 12/31/2023] [Accepted: 02/02/2024] [Indexed: 03/07/2024]
Abstract
Introduction: As evaluation indices, cancer grading and subtyping have diverse clinical, pathological, and molecular characteristics with prognostic and therapeutic implications. Although researchers have begun to study cancer differentiation and subtype prediction, most of the relevant methods are based on traditional machine learning and rely on single omics data. It is necessary to explore a deep learning algorithm that integrates multi-omics data to achieve classification prediction of cancer differentiation and subtypes. Methods: This paper proposes a multi-omics data fusion algorithm based on a multi-view graph neural network (MVGNN) for predicting cancer differentiation and subtype classification. The model framework consists of a graph convolutional network (GCN) module for learning features from different omics data and an attention module for integrating multi-omics data. Three different types of omics data are used. For each type of omics data, feature selection is performed using methods such as the chi-square test and minimum redundancy maximum relevance (mRMR). Weighted patient similarity networks are constructed based on the selected omics features, and the GCN is trained using omics features and the corresponding similarity networks. Finally, an attention module integrates different types of omics features and performs the final cancer classification prediction. Results: To validate the cancer classification predictive performance of the MVGNN model, we conducted experimental comparisons with traditional machine learning models and currently popular methods based on integrating multi-omics data using 5-fold cross-validation. Additionally, we performed comparative experiments on cancer differentiation and its subtypes based on single omics data, two omics data, and three omics data. Discussion: This paper proposes the MVGNN model, which performed well in cancer classification prediction based on multiple omics data.
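The two building blocks named in the Methods, a weighted patient similarity network and graph convolution over it, can be sketched as follows. This is an illustrative sketch and not the MVGNN implementation: cosine similarity, the non-negative threshold, and all function names are assumptions, and the propagation step omits the learnable weights, nonlinearity, and attention module that a trained model would include.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_network(features, threshold):
    """Weighted adjacency matrix keeping edges with similarity >= threshold.

    Assumes threshold >= 0 so that all edge weights are non-negative.
    """
    n = len(features)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s = cosine(features[i], features[j])
            if s >= threshold:
                adj[i][j] = adj[j][i] = s
    return adj


def gcn_propagate(adj, features):
    """One parameter-free propagation step: D^-1/2 (A + I) D^-1/2 X."""
    n = len(adj)
    # Add self-loops so each patient keeps part of its own signal.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in a_hat]
    width = len(features[0])
    out = [[0.0] * width for _ in range(n)]
    for i in range(n):
        for j in range(n):
            weight = a_hat[i][j] / math.sqrt(degree[i] * degree[j])
            if weight:
                for k in range(width):
                    out[i][k] += weight * features[j][k]
    return out


# Hypothetical selected omics features for three patients.
patients = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
adj = similarity_network(patients, threshold=0.8)
smoothed = gcn_propagate(adj, patients)
print(adj)
print(smoothed)
```

In this toy input the first two patients are linked and exchange information, while the third has no neighbours and keeps its original features. In a multi-view setting, one such network would be built per omics type before the views are combined.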
Affiliation(s)
- Yanjiao Ren
  - College of Information Technology, Smart Agriculture Research Institute, Jilin Agricultural University, Changchun, Jilin, China
- Yimeng Gao
  - College of Information Technology, Smart Agriculture Research Institute, Jilin Agricultural University, Changchun, Jilin, China
- Wei Du
  - College of Computer Science and Technology, Jilin University, Changchun, China
- Weibo Qiao
  - College of Computer Science and Technology, Jilin University, Changchun, China
- Wei Li
  - College of Information Technology, Smart Agriculture Research Institute, Jilin Agricultural University, Changchun, Jilin, China
- Qianqian Yang
  - College of Information Technology, Smart Agriculture Research Institute, Jilin Agricultural University, Changchun, Jilin, China
- Yanchun Liang
  - College of Computer Science and Technology, Jilin University, Changchun, China
  - School of Computer Science, Zhuhai College of Science and Technology, Zhuhai, China
- Gaoyang Li
  - Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, China
8
Holubekova V, Loderer D, Grendar M, Mikolajcik P, Kolkova Z, Turyova E, Kudelova E, Kalman M, Marcinek J, Miklusica J, Laca L, Lasabova Z. Differential gene expression of immunity and inflammation genes in colorectal cancer using targeted RNA sequencing. Front Oncol 2023; 13:1206482. [PMID: 37869102 PMCID: PMC10586664 DOI: 10.3389/fonc.2023.1206482] [Received: 04/15/2023] [Accepted: 08/24/2023] [Indexed: 10/24/2023]
Abstract
Introduction: Colorectal cancer (CRC) is a heterogeneous disease caused by molecular changes such as driver mutations and gene methylation, and influenced by the tumor microenvironment (TME), which is pervaded with immune cells with both pro- and anti-tumor effects. Studying the interactions between the immune system (IS) and the TME is important for developing effective immunotherapeutic strategies for CRC. In our study, we focused on the analysis of expression profiles of inflammatory and immune-relevant genes to identify aberrant signaling pathways involved in carcinogenesis, the metastatic potential of tumors, and their association with Kirsten rat sarcoma virus (KRAS) gene mutation. Methods: A total of 91 patients were enrolled in the study. Using NGS, differential gene expression analysis of 11 tumor samples and 11 matching non-tumor controls was carried out by applying a targeted RNA panel for inflammation and immunity genes containing 475 target genes. The obtained data were evaluated by the CLC Genomics Workbench and R library. The significantly differentially expressed genes (DEGs) were analyzed in Reactome GSA software, and some selected DEGs were used for real-time PCR validation. Results: After prioritization, the most significant differences in gene expression were shown by the genes TNFRSF4, IRF7, IL6R, NR3C1, EIF2AK2, MIF, CCL5, TNFSF10, CCL20, CXCL11, RIPK2, and BLNK. Validation analyses on 91 samples showed a correlation between RNA-seq data and qPCR for TNFSF10, RIPK2, and BLNK gene expression. The top differentially regulated signaling pathways between the studied groups (cancer vs. control, metastatic vs. primary CRC, and KRAS-positive vs. KRAS-negative CRC) belong to immune system, signal transduction, disease, gene expression, DNA repair, and programmed cell death. Conclusion: The analyzed data suggest changes at multiple levels of CRC carcinogenesis, including surface receptors of epithelial or immune cells, their signal transduction pathways, programmed cell death modifications, alterations in DNA repair machinery, and cell cycle control leading to uncontrolled proliferation. This study indicates only the basic molecular pathways that enabled the formation of metastatic cancer stem cells and may contribute to clarifying the function of the IS in the TME of CRC. A precise identification of signaling pathways responsible for CRC may help in the selection of personalized pharmacological treatment.
Affiliation(s)
- Veronika Holubekova
  - Laboratory of Genomics and Prenatal Diagnostics, Biomedical Center in Martin, Jessenius Faculty of Medicine, Comenius University in Bratislava, Martin, Slovakia
- Dusan Loderer
  - Laboratory of Genomics and Prenatal Diagnostics, Biomedical Center in Martin, Jessenius Faculty of Medicine, Comenius University in Bratislava, Martin, Slovakia
- Marian Grendar
  - Laboratory of Bioinformatics and Biostatistics, Biomedical Center in Martin, Jessenius Faculty of Medicine, Comenius University in Bratislava, Martin, Slovakia
- Peter Mikolajcik
  - Clinic of Surgery and Transplant Center, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, Martin University Hospital, Martin, Slovakia
- Zuzana Kolkova
  - Laboratory of Genomics and Prenatal Diagnostics, Biomedical Center in Martin, Jessenius Faculty of Medicine, Comenius University in Bratislava, Martin, Slovakia
- Eva Turyova
  - Department of Molecular Biology and Genomics, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, Martin, Slovakia
- Eva Kudelova
  - Clinic of Surgery and Transplant Center, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, Martin University Hospital, Martin, Slovakia
- Michal Kalman
  - Department of Pathological Anatomy, Jessenius Faculty of Medicine, Comenius University in Bratislava, Martin University Hospital, Martin, Slovakia
- Juraj Marcinek
  - Department of Pathological Anatomy, Jessenius Faculty of Medicine, Comenius University in Bratislava, Martin University Hospital, Martin, Slovakia
- Juraj Miklusica
  - Clinic of Surgery and Transplant Center, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, Martin University Hospital, Martin, Slovakia
- Ludovit Laca
  - Clinic of Surgery and Transplant Center, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, Martin University Hospital, Martin, Slovakia
- Zora Lasabova
  - Department of Molecular Biology and Genomics, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, Martin, Slovakia
Collapse
|
9
|
TCNN: A Transformer Convolutional Neural Network for artifact classification in whole slide images. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2023.104812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/14/2023]
|
10
|
Guttà C, Morhard C, Rehm M. Applying a GAN-based classifier to improve transcriptome-based prognostication in breast cancer. PLoS Comput Biol 2023; 19:e1011035. [PMID: 37011102 PMCID: PMC10101642 DOI: 10.1371/journal.pcbi.1011035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2022] [Revised: 04/13/2023] [Accepted: 03/17/2023] [Indexed: 04/05/2023]
Abstract
Established prognostic tests based on limited numbers of transcripts can identify high-risk breast cancer patients yet are approved only for individuals presenting with specific clinical features or disease characteristics. Deep learning algorithms could hold potential for stratifying patient cohorts based on full transcriptome data, yet the development of robust classifiers is hampered by the number of variables in omics datasets typically far exceeding the number of patients. To overcome this hurdle, we propose a classifier based on a data augmentation pipeline consisting of a Wasserstein generative adversarial network (GAN) with gradient penalty and an embedded auxiliary classifier to obtain a trained GAN discriminator (T-GAN-D). Applied to 1244 patients of the METABRIC breast cancer cohort, this classifier outperformed established breast cancer biomarkers in separating low- from high-risk patients (disease specific death, progression or relapse within 10 years from initial diagnosis). Importantly, the T-GAN-D also performed across independent, merged transcriptome datasets (METABRIC and TCGA-BRCA cohorts), and merging data improved overall patient stratification. In conclusion, the reiterative GAN-based training process allowed generating a robust classifier capable of stratifying low- vs high-risk patients based on full transcriptome data and across independent and heterogeneous breast cancer cohorts.
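For orientation, the critic objective of a Wasserstein GAN with gradient penalty can be written as below. This is the standard WGAN-GP formulation and only a sketch: the abstract does not state the exact objective used for the T-GAN-D, and the auxiliary classifier term it mentions is not shown here. The symbols $D$ (critic), $\mathbb{P}_r$ (real data distribution), $\mathbb{P}_g$ (generator distribution), and $\lambda$ (penalty weight) are notation assumed for this illustration.

```latex
\[
L_{\text{critic}} =
\mathbb{E}_{\tilde{x} \sim \mathbb{P}_g}\big[D(\tilde{x})\big]
- \mathbb{E}_{x \sim \mathbb{P}_r}\big[D(x)\big]
+ \lambda\, \mathbb{E}_{\hat{x} \sim \mathbb{P}_{\hat{x}}}
\Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big],
\qquad
\hat{x} = \epsilon x + (1 - \epsilon)\,\tilde{x},\quad \epsilon \sim U[0, 1]
\]
```

The penalty term pushes the gradient norm of the critic toward 1 at points interpolated between real and generated samples, which is how this formulation enforces the Lipschitz constraint without weight clipping.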
Affiliation(s)
- Cristiano Guttà
- Institute of Cell Biology and Immunology, University of Stuttgart, Stuttgart, Germany
- Markus Rehm
- Institute of Cell Biology and Immunology, University of Stuttgart, Stuttgart, Germany
- Stuttgart Research Center Systems Biology, University of Stuttgart, Stuttgart, Germany
|
11
|
Taghizadeh E, Heydarheydari S, Saberi A, JafarpoorNesheli S, Rezaeijo SM. Breast cancer prediction with transcriptome profiling using feature selection and machine learning methods. BMC Bioinformatics 2022; 23:410. [PMID: 36183055 PMCID: PMC9526906 DOI: 10.1186/s12859-022-04965-8] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 07/22/2022] [Accepted: 09/27/2022] [Indexed: 11/10/2022]
Abstract
BACKGROUND We used a hybrid machine learning system (HMLS) strategy that includes an extensive search for the optimal HMLSs, combining feature selection algorithms, a feature extraction algorithm, and classifiers for diagnosing breast cancer. Hence, this study aims to obtain a high-importance transcriptome profile linked with classification procedures that can facilitate the early detection of breast cancer. METHODS In the present study, 762 breast cancer patients and 138 solid tissue normal subjects were included. Three groups of machine learning (ML) algorithms were employed: (i) four feature selection procedures, compared to select the most valuable features: (1) ANOVA; (2) Mutual Information; (3) Extra Trees Classifier; and (4) Logistic Regression (LGR); (ii) a feature extraction algorithm (Principal Component Analysis); and (iii) 13 classification algorithms accompanied by automated ML hyperparameter tuning, including (1) LGR; (2) Support Vector Machine; (3) Bagging; (4) Gaussian Naive Bayes; (5) Decision Tree; (6) Gradient Boosting Decision Tree; (7) K-Nearest Neighbors; (8) Bernoulli Naive Bayes; (9) Random Forest; (10) AdaBoost; (11) ExtraTrees; (12) Linear Discriminant Analysis; and (13) Multilayer Perceptron (MLP). To evaluate the proposed models' performance, balanced accuracy and area under the curve (AUC) were used. RESULTS The LGR feature selection procedure combined with the MLP classifier achieved the highest prediction accuracy and AUC (balanced accuracy: 0.86, AUC = 0.94), followed by the LGR + LGR classifier (balanced accuracy: 0.84, AUC = 0.94). The results showed that the AUC achieved by the LGR + LGR classifier was obtained with the following 20 biomarkers: TMEM212, SNORD115-13, ATP1A4, FRG2, CFHR4, ZCCHC13, FLJ46361, LY6G6E, ZNF323, KRT28, KRT25, LPPR5, C10orf99, PRKACG, SULT2A1, GRIN2C, EN2, GBA2, CUX2, and SNORA66.
CONCLUSIONS The best performance was achieved using the LGR feature selection procedure and MLP classifier. Results show that the 20 biomarkers had the highest score or ranking in breast cancer detection.
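The two evaluation metrics named in this abstract, balanced accuracy and AUC, can be sketched in a few lines. The following is a minimal illustration using only the Python standard library; the function names, the toy labels, the toy scores, and the 0.5 decision threshold are all assumptions made for this example and do not come from the study.

```python
from typing import Sequence


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity (recall on class 1) and specificity (recall on class 0)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    return 0.5 * (tp / pos + tn / neg)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 4 tumour samples (1) and 2 normal samples (0).
y_true = [1, 1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

bal_acc = balanced_accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"balanced accuracy = {bal_acc:.3f}, AUC = {auc:.3f}")
```

Balanced accuracy averages the per-class recall, so with an imbalanced cohort such as the 762 patients and 138 normal subjects described above, a classifier cannot score well simply by predicting the majority class.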
Affiliation(s)
- Eskandar Taghizadeh
- Department of Medical Genetic, Faculty of Medicine, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran
- Sahel Heydarheydari
- Department of Radiology Technology, Shoushtar Faculty of Medical Sciences, Shoushtar, Iran
- Alihossein Saberi
- Department of Medical Genetic, Faculty of Medicine, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran
- Seyed Masoud Rezaeijo
- Department of Medical Physics, Faculty of Medicine, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran.
|
12
|
Maabreh RSA, Alazzam MB, AlGhamdi AS. Machine Learning Algorithms for Prediction of Survival Curves in Breast Cancer Patients. Appl Bionics Biomech 2021; 2021:9338091. [PMID: 34845416 PMCID: PMC8627349 DOI: 10.1155/2021/9338091] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/07/2021] [Revised: 10/19/2021] [Accepted: 10/29/2021] [Indexed: 12/30/2022]
Abstract
Today, cancer is the second leading cause of death worldwide, and the number of people diagnosed with the disease is expected to rise. Breast cancer is the most commonly diagnosed cancer in women, and it has one of the highest survival rates when treated properly. Because the effectiveness and, as a result, the survival of the patient depend on each case, it is critical to model their survival ahead of time. Artificial intelligence is a rapidly expanding field, and its clinical applications are following suit (having surpassed humans in many evidence-based medical tasks). Since the first stable risk estimator based on statistical methods appeared in survival analysis, numerous versions of it have been created, with machine learning used in only a few of them. Nonlinear relationships between variables and the impact they have on the variable to be predicted are very easy to evaluate using statistical methods. However, because they are just mathematical equations, they have flaws that limit the quality of their output. The main goal of this study is to find the best machine learning algorithms for predicting the individualised survival of breast cancer patients, as well as the most appropriate treatment, and to propose new numerical variable stratifications. These stratifications are carried out using unsupervised machine learning methods that divide patients into groups based on their risk in each dataset. We compare them to standard groupings to see whether they are more significant. Knowing that the greatest challenge in dealing with clinical data is its quantity and quality, we have gone to great lengths to ensure their quality before replicating them. We used the Cox statistical method in conjunction with other statistical methods and tests to find the best possible dataset with which to train our model, despite its ease of multivariate analysis.
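As background for the Cox method mentioned in this abstract, the standard Cox proportional hazards model and its partial likelihood are given below. This is the textbook formulation, assuming no tied event times; the abstract does not specify which variant or covariates the authors used. The symbols $h_0(t)$ (baseline hazard), $x_i$ (covariates of patient $i$), $\beta$ (coefficients), $\delta_i$ (event indicator), and $R(t_i)$ (patients still at risk at time $t_i$) are notation assumed for this illustration.

```latex
\[
h(t \mid x_i) = h_0(t)\,\exp\!\big(\beta^{\top} x_i\big),
\qquad
L(\beta) = \prod_{i:\,\delta_i = 1}
\frac{\exp\!\big(\beta^{\top} x_i\big)}
     {\sum_{j \in R(t_i)} \exp\!\big(\beta^{\top} x_j\big)}
\]
```

Because the baseline hazard cancels out of the partial likelihood, the coefficients can be estimated without specifying $h_0(t)$. Each $\exp(\beta_k)$ is then read as a hazard ratio for its covariate.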
Affiliation(s)
- Ahmed S. AlGhamdi
- Department of Computer Engineering, College of Computers and Information Technology, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
|