1
Du W, Jia M, Li J, Gao M, Zhang W, Yu Y, Wang H, Peng X. Prognostic prediction model for salivary gland carcinoma based on machine learning. Int J Oral Maxillofac Surg 2024; 53:905-910. [PMID: 38981745 DOI: 10.1016/j.ijom.2024.07.006] [Received: 08/23/2023] [Revised: 06/23/2024] [Accepted: 07/01/2024] [Indexed: 07/11/2024]
Abstract
Although rare overall, salivary gland carcinomas (SGCs) are among the most common oral and maxillofacial malignancies. The aim of this study was to develop a machine learning-based model to predict the survival of patients with SGC. Patients in whom SGC was confirmed by histological testing and who underwent primary extirpation at the authors' institution between 1963 and 2014 were identified. Demographic and clinicopathological data with complete follow-up information were collected for analysis. Feature selection methods were used to determine the correlation between prognosis-related factors and survival in the collected patient data. The collected clinicopathological data and multiple machine learning algorithms were used to develop a survival prediction model. Three machine learning algorithms were applied to construct the prediction models. The area under the receiver operating characteristic curve (AUC) and accuracy were used to measure model performance. The best classification performance was achieved with a LightGBM algorithm (AUC = 0.83, accuracy = 0.91). This model enabled prognostic prediction of patient survival. The model may be useful in developing personalized diagnostic and treatment strategies and formulating individualized follow-up plans, as well as assisting in the communication between doctors and patients, facilitating a better understanding of and compliance with treatment.
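The two metrics reported in this abstract, AUC and accuracy, measure different things: AUC scores how well the predicted probabilities rank positive cases above negative ones, while accuracy depends on a decision threshold. The following is a minimal sketch of both, not the authors' evaluation code; the labels, scores, and the 0.5 threshold are hypothetical values chosen for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = event, 0 = no event) and model scores.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]
# Hypothetical decision threshold of 0.5 for turning scores into classes.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

auc_value = roc_auc(y_true, scores)
acc_value = accuracy(y_true, y_pred)
print(f"AUC = {auc_value:.3f}, accuracy = {acc_value:.3f}")
```

Because accuracy changes with the threshold and AUC does not, the two numbers need not move together, which is why abstracts such as this one report both.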
Affiliation(s)
- W Du
  - Department of Oral and Maxillofacial Surgery, Peking University School and Hospital of Stomatology, Beijing, China
- M Jia
  - Department of Oral and Maxillofacial Surgery, Peking University School and Hospital of Stomatology, Beijing, China; Zhongguancun Hospital, Beijing, China
- J Li
  - Department of Oral and Maxillofacial Surgery, Peking University School and Hospital of Stomatology, Beijing, China
- M Gao
  - Department of Oral and Maxillofacial Surgery, Peking University School and Hospital of Stomatology, Beijing, China
- W Zhang
  - Department of Oral and Maxillofacial Surgery, Peking University School and Hospital of Stomatology, Beijing, China
- Y Yu
  - Department of Oral and Maxillofacial Surgery, Peking University School and Hospital of Stomatology, Beijing, China
- H Wang
  - School of Mathematical Sciences, Beihang University, Beijing, China
- X Peng
  - Department of Oral and Maxillofacial Surgery, Peking University School and Hospital of Stomatology, Beijing, China
2
Kaur P, Singh A, Chana I. OmicPredict: a framework for omics data prediction using ANOVA-Firefly algorithm for feature selection. Comput Methods Biomech Biomed Engin 2024; 27:1970-1983. [PMID: 37842810 DOI: 10.1080/10255842.2023.2268236] [Received: 11/30/2022] [Revised: 09/12/2023] [Accepted: 09/30/2023] [Indexed: 10/17/2023]
Abstract
High-throughput technologies and machine learning (ML), when applied to a huge pool of medical data such as omics data, result in efficient analysis. Recent research aims to apply and develop ML models to predict disease in a timely manner using available omics datasets. The present work proposes a framework, 'OmicPredict', deploying a hybrid feature selection method and a deep neural network (DNN) model to predict multiple diseases using omics data. The hybrid feature selection method is developed using the Analysis of Variance (ANOVA) technique and the firefly algorithm. The OmicPredict framework is applied to three case studies: Alzheimer's disease, breast cancer, and coronavirus disease 2019 (COVID-19). In the case study of Alzheimer's disease, the framework predicts patients using the GSE33000 and GSE44770 datasets. In the case study of breast cancer, the framework predicts human epidermal growth factor receptor 2 (HER2) subtype status using the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) dataset. In the case study of COVID-19, the framework performs patient classification using the GSE157103 dataset. The experimental results show that the DNN model achieved an Area Under Curve (AUC) score of 0.949 for the Alzheimer's (GSE33000 and GSE44770) datasets. Furthermore, it achieved AUC scores of 0.987 and 0.989 for the breast cancer (METABRIC) and COVID-19 (GSE157103) datasets, respectively, outperforming Random Forest and Naïve Bayes models and the existing research.
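The ANOVA half of the hybrid feature selection described above can be illustrated with a one-way F-statistic filter: each feature is scored by how much its class means differ relative to the spread within each class. This is a sketch under stated assumptions and not the OmicPredict implementation; the firefly search stage is not shown, and the function names and toy data are hypothetical.

```python
def anova_f(values, labels):
    """One-way ANOVA F-statistic for a single feature across classes."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    if k < 2:
        raise ValueError("ANOVA needs at least two classes")
    grand_mean = sum(values) / n
    means = {y: sum(g) / len(g) for y, g in groups.items()}
    ss_between = sum(len(g) * (means[y] - grand_mean) ** 2 for y, g in groups.items())
    ss_within = sum((v - means[y]) ** 2 for y, g in groups.items() for v in g)
    if ss_within == 0:
        # Classes are perfectly tight: infinite separation unless means coincide.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def top_k_features(X, y, k):
    """Rank feature columns of X by F-statistic and keep the k best indices."""
    n_features = len(X[0])
    f_scores = [anova_f([row[j] for row in X], y) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: f_scores[j], reverse=True)
    return ranked[:k], f_scores


# Hypothetical toy data: 6 samples, 3 features, binary labels.
X = [
    [1.0, 5.0, 1.0],
    [1.2, 4.0, 2.0],
    [0.8, 6.0, 3.0],
    [3.0, 5.0, 2.0],
    [3.1, 4.0, 3.0],
    [2.9, 6.0, 4.0],
]
y = [0, 0, 0, 1, 1, 1]

selected, f_scores = top_k_features(X, y, k=2)
print(selected, f_scores)
```

In a hybrid scheme of the kind the abstract names, a filter like this would typically shrink the candidate set before a metaheuristic search explores feature subsets; how OmicPredict combines the two stages is not specified here.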
Affiliation(s)
- Parampreet Kaur
  - Computer Science and Engineering Department, Thapar Institute of Engineering and Technology, Patiala, India
- Ashima Singh
  - Computer Science and Engineering Department, Thapar Institute of Engineering and Technology, Patiala, India
- Inderveer Chana
  - Computer Science and Engineering Department, Thapar Institute of Engineering and Technology, Patiala, India
3
Na S, Jeong H, Kim I, Hong SM, Shim J, Yoon IH, Cho KH. Distribution coefficient prediction using multimodal machine learning based on soil adsorption factors, XRF, and XRD spectrum data. Journal of Hazardous Materials 2024; 478:135285. [PMID: 39121738 DOI: 10.1016/j.jhazmat.2024.135285] [Received: 04/03/2024] [Revised: 07/08/2024] [Accepted: 07/20/2024] [Indexed: 08/12/2024]
Abstract
The distribution coefficient (Kd) plays a crucial role in predicting the migration behavior of radionuclides in the soil environment. However, Kd depends on the complexities of geological and environmental factors, and existing models often do not reflect the unique soil properties. We propose a multimodal technique to predict Kd values for radionuclide adsorption in soils surrounding nuclear facilities in the Republic of Korea. We integrated and trained three sub-networks reflecting different data domains: soil adsorption factors for physicochemical conditions, X-ray fluorescence (XRF) data, and X-ray diffraction (XRD) spectra for inherent soil properties. Our multimodal model achieved high performance, with a coefficient of determination (R²) of 0.84 and root mean squared error (RMSE) of 0.89 for natural log-transformed Kd. This is the first study to develop a multimodal model that simultaneously incorporates inherent soil properties and adsorption factors to predict Kd. We investigated influential peaks in XRD spectra and also revealed that pH and calcium oxide (CaO) were significant variables in soil adsorption factors and XRF data, respectively. These results promote the use of a multimodal model to predict Kd values by integrating data from different domains, providing a cost-effective and novel approach to elucidate the mechanisms of radionuclide adsorption in soil.
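Both scores quoted above are computed on the natural log of Kd rather than on Kd itself, so an RMSE of 0.89 is in log units. A minimal sketch of the two metrics follows; the Kd values are hypothetical and the code is not taken from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted Kd values (arbitrary units).
kd_measured = [10.0, 100.0, 1000.0, 50.0]
kd_predicted = [12.0, 90.0, 1100.0, 40.0]

# Apply the natural-log transform before scoring, as the abstract describes.
ln_measured = [math.log(v) for v in kd_measured]
ln_predicted = [math.log(v) for v in kd_predicted]

print("R^2 on ln(Kd): ", r_squared(ln_measured, ln_predicted))
print("RMSE on ln(Kd):", rmse(ln_measured, ln_predicted))
```

Scoring in log space weights relative errors equally across orders of magnitude, which suits a quantity like Kd that can span several decades.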
Affiliation(s)
- Seongyeon Na
  - Department of Civil, Urban, Earth and Environmental Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan 44919, Republic of Korea
- Heewon Jeong
  - Future and Fusion Lab of Architectural, Civil and Environmental Engineering, Korea University, Seoul 02841, Republic of Korea
- Ilgook Kim
  - Decommissioning Technology Research Division, Korea Atomic Energy Research Institute, Daejeon 34057, Republic of Korea
- Seok Min Hong
  - Department of Civil, Urban, Earth and Environmental Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan 44919, Republic of Korea
- Jaegyu Shim
  - Department of Civil, Urban, Earth and Environmental Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan 44919, Republic of Korea
- In-Ho Yoon
  - Decommissioning Technology Research Division, Korea Atomic Energy Research Institute, Daejeon 34057, Republic of Korea
- Kyung Hwa Cho
  - School of Civil, Environmental, and Architectural Engineering, Korea University, Seoul 02841, Republic of Korea
4
El-Ateif S, Idri A. Multimodality Fusion Strategies in Eye Disease Diagnosis. Journal of Imaging Informatics in Medicine 2024; 37:2524-2558. [PMID: 38639808 PMCID: PMC11522204 DOI: 10.1007/s10278-024-01105-x] [Received: 02/05/2024] [Revised: 03/08/2024] [Accepted: 03/26/2024] [Indexed: 04/20/2024]
Abstract
Multimodality fusion has gained significance in medical applications, particularly in diagnosing challenging diseases like eye diseases, notably diabetic eye diseases that pose risks of vision loss and blindness. Mono-modality eye disease diagnosis proves difficult, often missing crucial disease indicators. In response, researchers advocate multimodality-based approaches to enhance diagnostics. This study is a unique exploration, evaluating three multimodality fusion strategies (early, joint, and late) in conjunction with state-of-the-art convolutional neural network models for automated eye disease binary detection across three datasets: fundus fluorescein angiography, macula, and combination of digital retinal images for vessel extraction, structured analysis of the retina, and high-resolution fundus. Findings reveal the efficacy of each fusion strategy: type 0 early fusion with DenseNet121 achieves an impressive 99.45% average accuracy. InceptionResNetV2 emerges as the top-performing joint fusion architecture with an average accuracy of 99.58%. Late fusion ResNet50V2 achieves a perfect score of 100% across all metrics, surpassing both early and joint fusion. Comparative analysis demonstrates that late fusion ResNet50V2 matches the accuracy of the state-of-the-art feature-level fusion model for multiview learning. In conclusion, this study substantiates late fusion as the optimal strategy for eye disease diagnosis compared to early and joint fusion, showcasing its superiority in leveraging multimodal information.
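The three strategies compared above differ in where the modalities are combined. The sketch below shows only that data flow, using the common definitions of the terms: early fusion merges raw inputs, joint fusion merges learned intermediate features, and late fusion merges per-modality predictions. The toy "models" are hypothetical stand-ins for the CNNs in the study, and the abstract's "type 0" early fusion variant is not defined here, so it is not reproduced.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]


def early_fusion_predict(model: Callable[[Vector], float],
                         inputs: Sequence[Vector]) -> float:
    """Concatenate raw inputs from all modalities, then run one model."""
    fused = [x for modality in inputs for x in modality]
    return model(fused)


def joint_fusion_predict(encoders: Sequence[Callable[[Vector], List[float]]],
                         head: Callable[[Vector], float],
                         inputs: Sequence[Vector]) -> float:
    """Encode each modality, concatenate the features, then classify.
    In practice the encoders and head are trained together end to end."""
    features = [z for enc, modality in zip(encoders, inputs) for z in enc(modality)]
    return head(features)


def late_fusion_predict(models: Sequence[Callable[[Vector], float]],
                        inputs: Sequence[Vector]) -> float:
    """Run one model per modality and average the predicted probabilities."""
    probs = [m(x) for m, x in zip(models, inputs)]
    return sum(probs) / len(probs)


# Hypothetical stand-ins: a "model" that averages its inputs and an
# "encoder" that keeps only the largest input value.
def mean_model(x: Vector) -> float:
    return sum(x) / len(x)


def max_encoder(x: Vector) -> List[float]:
    return [max(x)]


modalities = [[0.2, 0.4], [0.8]]  # two toy modalities of different sizes

p_early = early_fusion_predict(mean_model, modalities)
p_joint = joint_fusion_predict([max_encoder, max_encoder], mean_model, modalities)
p_late = late_fusion_predict([mean_model, mean_model], modalities)
print(p_early, p_joint, p_late)
```

Even with identical toy components the three routes give different outputs, which illustrates why the choice of fusion point is treated as an experimental variable.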
Affiliation(s)
- Sara El-Ateif
  - Software Project Management Research Team, ENSIAS, Mohammed V University, BP 713, Agdal, Rabat, Morocco
- Ali Idri
  - Software Project Management Research Team, ENSIAS, Mohammed V University, BP 713, Agdal, Rabat, Morocco
  - Faculty of Medical Sciences, Mohammed VI Polytechnic University, Marrakech-Rhamna, Benguerir, Morocco
5
Mathur A, Arya N, Pasupa K, Saha S, Roy Dey S, Saha S. Breast cancer prognosis through the use of multi-modal classifiers: current state of the art and the way forward. Brief Funct Genomics 2024; 23:561-569. [PMID: 38688724 DOI: 10.1093/bfgp/elae015] [Received: 09/29/2023] [Revised: 03/01/2024] [Accepted: 04/09/2024] [Indexed: 05/02/2024] [Open access]
Abstract
We present a survey of the current state of the art in breast cancer detection and prognosis. We analyze the evolution of Artificial Intelligence-based approaches from using just uni-modal information to multi-modality for detection, and how such a paradigm shift improves the efficacy of detection, consistent with clinical observations. We conclude that interpretable AI-based predictions and the ability to handle class imbalance should be considered priorities.
Affiliation(s)
- Archana Mathur
  - Department of Information Science and Engineering, Nitte Meenakshi Institute of Technology, Yelahanka, 560064, Karnataka, India
- Nikhilanand Arya
  - School of Computer Engineering, Kalinga Institute of Industrial Technology, Deemed to be University, Bhubaneshwar, 751024, Odisha, India
- Kitsuchart Pasupa
  - School of Information Technology, King Mongkut's Institute of Technology Ladkrabang, 1 Soi Chalongkrung 1, 10520, Bangkok, Thailand
- Sriparna Saha
  - Computer Science and Engineering, Indian Institute of Technology Patna, Bihta, 801106, Bihar, India
- Sudeepa Roy Dey
  - Department of Computer Science and Engineering, PES University, Hosur Road, 560100, Karnataka, India
- Snehanshu Saha
  - CSIS and APPCAIR, BITS Pilani K.K Birla Goa Campus, Goa, 403726, Goa, India
  - Div of AI Research, HappyMonk AI, Bangalore, 560078, Karnataka, India
6
Wang F, Jia K, Li Y. Integrative deep learning with prior assisted feature selection. Stat Med 2024; 43:3792-3814. [PMID: 38923006 DOI: 10.1002/sim.10148] [Received: 11/23/2023] [Revised: 04/23/2024] [Accepted: 06/07/2024] [Indexed: 06/28/2024]
Abstract
Integrative analysis has emerged as a prominent tool in biomedical research, offering a solution to the "small $n$ and large $p$" challenge. Leveraging the powerful capabilities of deep learning in extracting complex relationships between genes and diseases, our objective in this study is to incorporate deep learning into the framework of integrative analysis. Recognizing the redundancy within candidate features, we introduce a dedicated feature selection layer in the proposed integrative deep learning method. To further improve the performance of feature selection, the rich body of previous research is utilized by an ensemble learning method to identify "prior information". This leads to the proposed prior assisted integrative deep learning (PANDA) method. We demonstrate the superiority of the PANDA method through a series of simulation studies, showing its clear advantages over competing approaches in both feature selection and outcome prediction. Finally, a skin cutaneous melanoma (SKCM) dataset is extensively analyzed by the PANDA method to show its practical application.
Affiliation(s)
- Feifei Wang
  - Center for Applied Statistics, Renmin University of China, Beijing, China
  - School of Statistics, Renmin University of China, Beijing, China
- Ke Jia
  - School of Statistics, Renmin University of China, Beijing, China
- Yang Li
  - Center for Applied Statistics, Renmin University of China, Beijing, China
  - School of Statistics, Renmin University of China, Beijing, China
7
Yang P, Chen W, Qiu H. MMGCN: Multi-modal multi-view graph convolutional networks for cancer prognosis prediction. Computer Methods and Programs in Biomedicine 2024; 257:108400. [PMID: 39270533 DOI: 10.1016/j.cmpb.2024.108400] [Received: 05/13/2024] [Revised: 07/14/2024] [Accepted: 08/27/2024] [Indexed: 09/15/2024]
Abstract
BACKGROUND AND OBJECTIVE Accurate prognosis prediction for cancer patients plays a significant role in the formulation of treatment strategies, considerably impacting personalized medicine. Recent advancements in this field indicate that integrating information from various modalities, such as genetic and clinical data, and developing multi-modal deep learning models can enhance prediction accuracy. However, most existing multi-modal deep learning methods either overlook patient similarities that benefit prognosis prediction or fail to effectively capture diverse information due to measuring patient similarities from a single perspective. To address these issues, a novel framework called multi-modal multi-view graph convolutional networks (MMGCN) is proposed for cancer prognosis prediction. METHODS Initially, we utilize the similarity network fusion (SNF) algorithm to merge patient similarity networks (PSNs), individually constructed using gene expression, copy number alteration, and clinical data, into a fused PSN for integrating multi-modal information. To capture diverse perspectives of patient similarities, we treat the fused PSN as a multi-view graph by considering each single-edge-type subgraph as a view graph, and propose multi-view graph convolutional networks (GCNs) with a view-level attention mechanism. Moreover, an edge homophily prediction module is designed to alleviate the adverse effects of heterophilic edges on the representation power of GCNs. Finally, comprehensive representations of patient nodes are obtained to predict cancer prognosis. RESULTS Experimental results demonstrate that MMGCN outperforms state-of-the-art baselines on four public datasets, including METABRIC, TCGA-BRCA, TCGA-LGG, and TCGA-LUSC, with the area under the receiver operating characteristic curve achieving 0.827 ± 0.005, 0.805 ± 0.014, 0.925 ± 0.007, and 0.746 ± 0.013, respectively. 
CONCLUSIONS Our study reveals the effectiveness of the proposed MMGCN, which deeply explores patient similarities related to different modalities from a broad perspective, in enhancing the performance of multi-modal cancer prognosis prediction. The source code is publicly available at https://github.com/ping-y/MMGCN.
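The building block behind the method described above is graph convolution over a patient similarity network: each patient's representation is updated from its neighbours' features. The sketch below shows one propagation step using the widely used symmetric normalization with self-loops, followed by ReLU. It is an assumption that this matches the layer used in MMGCN; the similarity network fusion, view-level attention, and edge homophily modules are not shown, and all values are hypothetical.

```python
import math


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gcn_propagate(adj, features, weight):
    """One GCN step: ReLU(D^-1/2 (A + I) D^-1/2 X W).

    adj      : n x n adjacency matrix without self-loops
    features : n x d node feature matrix X
    weight   : d x h weight matrix W (fixed here; learned in practice)
    """
    n = len(adj)
    # Add self-loops so each node keeps part of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    degree = [sum(row) for row in a_hat]
    # Symmetric normalization by the square roots of both endpoint degrees.
    a_norm = [[a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
              for i in range(n)]
    hidden = matmul(matmul(a_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in hidden]


# Hypothetical patient similarity graph: two patients joined by one edge.
adjacency = [[0.0, 1.0],
             [1.0, 0.0]]
patient_features = [[1.0], [3.0]]
layer_weight = [[1.0]]

updated = gcn_propagate(adjacency, patient_features, layer_weight)
print(updated)  # both patients move to the neighbourhood average
```

In a multi-view setting, a step like this would be applied separately to each view graph before the per-view outputs are combined; the repository linked in the abstract is the authoritative source for how the authors do that.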
Affiliation(s)
- Ping Yang
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, PR China
- Wengxiang Chen
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, PR China
- Hang Qiu
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, PR China; Big Data Research Center, University of Electronic Science and Technology of China, Chengdu, 611731, PR China
8
Darbandi MR, Darbandi M, Darbandi S, Bado I, Hadizadeh M, Khorram Khorshid HR. Artificial intelligence breakthroughs in pioneering early diagnosis and precision treatment of breast cancer: A multimethod study. Eur J Cancer 2024; 209:114227. [PMID: 39053289 DOI: 10.1016/j.ejca.2024.114227] [Received: 07/01/2024] [Accepted: 07/07/2024] [Indexed: 07/27/2024]
Abstract
This article delves into the potential of artificial intelligence (AI) to enhance early breast cancer (BC) detection for improved treatment outcomes and patient care. Utilizing a multimethod approach comprising literature review and experiments, the study systematically reviewed 310 articles utilizing 30 diverse datasets. Among the techniques assessed, recurrent neural network (RNN) emerged as the most accurate, achieving 98.58% accuracy, followed by genetic principles (GP), transfer learning (TL), and artificial neural networks (ANNs), with accuracies exceeding 96%. While conventional machine learning (ML) methods demonstrated accuracies above 90%, DL techniques outperformed them. Evaluation of BC diagnostic models using the Wisconsin breast cancer dataset (WBCD) highlighted logistic regression (LR) and support vector machine (SVM) as the most accurate predictors, with minimal errors for clinical data. Conversely, decision trees (DT) exhibited higher error rates due to overfitting, emphasizing the importance of algorithm selection for complex datasets. Analysis of ultrasound images underscored the significance of preprocessing, while histopathological image analysis using convolutional neural networks (CNNs) demonstrated robust classification capabilities. These findings underscore the transformative potential of ML and DL in BC diagnosis, offering automated, accurate, and accessible diagnostic tools. Collaboration among stakeholders is crucial for further advancements in BC detection methods.
Affiliation(s)
- Mahsa Darbandi
  - Fetal Health Research Center, Hope Generation Foundation, Tehran, Iran
- Sara Darbandi
  - Gene Therapy and Regenerative Medicine Research Center, Hope Generation Foundation, Tehran, Iran
- Igor Bado
  - Department of Oncological Sciences, Tisch Cancer Institute, New York, USA
- Mohammad Hadizadeh
  - Cancer Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Hamid Reza Khorram Khorshid
  - Genetics Research Center, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran; Personalized Medicine and Genometabolics Research Center, Hope Generation Foundation, Tehran, Iran
9
Scala G, Ferraro L, Brandi A, Guo Y, Majello B, Ceccarelli M. MoNETA: MultiOmics Network Embedding for SubType Analysis. NAR Genom Bioinform 2024; 6:lqae141. [PMID: 39416887 PMCID: PMC11482636 DOI: 10.1093/nargab/lqae141] [Received: 03/25/2024] [Revised: 07/19/2024] [Accepted: 10/04/2024] [Indexed: 10/19/2024] [Open access]
Abstract
Cells are complex systems whose behavior emerges from a huge number of reactions taking place within and among different molecular districts. The availability of bulk and single-cell omics data fueled the creation of multi-omics systems biology models capturing the dynamics within and between omics layers. Powerful modeling strategies are needed to cope with the increased amount of data to be interrogated and the related research questions. Here, we present MultiOmics Network Embedding for SubType Analysis (MoNETA) for fast and scalable identification of relevant multi-omics relationships between biological entities at the bulk and single-cell level. We apply MoNETA to show how previously described glioma subtypes naturally emerge with our approach. We also show how MoNETA can be used to identify cell types in five multi-omic single-cell datasets.
Affiliation(s)
- Giovanni Scala
  - Department of Biology, University of Naples ‘Federico II’, 80128 Naples, Italy
- Luigi Ferraro
  - Sylvester Comprehensive Cancer Center, University of Miami, 33136, Miami, USA
- Aurora Brandi
  - Department of Biology, University of Naples ‘Federico II’, 80128 Naples, Italy
- Yan Guo
  - Sylvester Comprehensive Cancer Center, University of Miami, 33136, Miami, USA
- Barbara Majello
  - Department of Biology, University of Naples ‘Federico II’, 80128 Naples, Italy
- Michele Ceccarelli
  - Sylvester Comprehensive Cancer Center, University of Miami, 33136, Miami, USA
10
Sathyamoorthi K, Vp A, Venkataramana LY, Prasad D VV. Enhancing Breast Cancer Survival Prognosis Through Omic and Non-Omic Data Integration. Clin Breast Cancer 2024:S1526-8209(24)00221-0. [PMID: 39294025 DOI: 10.1016/j.clbc.2024.08.009] [Received: 06/10/2024] [Revised: 07/31/2024] [Accepted: 08/11/2024] [Indexed: 09/20/2024]
Abstract
BACKGROUND AND OBJECTIVE Cancer, the second leading cause of death globally, claimed 685,000 lives among 2.3 million women affected by breast cancer in 2020. Cancer prognosis plays a pivotal role in tailoring treatments and assessing efficacy, emphasizing the need for a comprehensive understanding. The goal is to develop a predictive model capable of accurately predicting patient outcomes and guiding personalized treatment strategies, thereby advancing precision medicine in breast cancer care. METHODS This project addresses limitations in current cancer prognosis models by integrating omics and non-omics data. While existing models often neglect crucial omics data such as DNA methylation and miRNA, the method utilizes the TCGA dataset to incorporate these data types along with others. mRMR feature selection and a CNN model are applied to each data type for feature extraction; the extracted features are then stacked, and a Random Forest classifier is employed for the final prognosis. RESULT The proposed method is applied to the dataset to predict whether the patient is a long-time or a short-time survivor. This strategy showcases excellent performance, with an AUC value of 0.873, precision of 0.881, and sensitivity of 0.943. The accuracy rate is 0.861, an improvement of 11.96% compared to prior studies. CONCLUSION In conclusion, integrating diverse data with advanced machine learning holds promise for improving breast cancer prognosis. Addressing model limitations and leveraging comprehensive datasets can enhance accuracy, paving the way for better patient care. Further refinement offers potential for significant advancements in cancer prognosis and treatment strategies.
Affiliation(s)
- Kishaanth Sathyamoorthi
  - Department of Computer Science, Sri Sivasubramaniya Nadar College of Engineering, Chennai, Tamil Nadu, India
- Abishek Vp
  - Department of Computer Science, Sri Sivasubramaniya Nadar College of Engineering, Chennai, Tamil Nadu, India
- Lokeswari Y Venkataramana
  - Department of Computer Science, Sri Sivasubramaniya Nadar College of Engineering, Chennai, Tamil Nadu, India
- Venkata Vara Prasad D
  - Department of Computer Science, Sri Sivasubramaniya Nadar College of Engineering, Chennai, Tamil Nadu, India
11
Ozcelik F, Dundar MS, Yildirim AB, Henehan G, Vicente O, Sánchez-Alcázar JA, Gokce N, Yildirim DT, Bingol NN, Karanfilska DP, Bertelli M, Pojskic L, Ercan M, Kellermayer M, Sahin IO, Greiner-Tollersrud OK, Tan B, Martin D, Marks R, Prakash S, Yakubi M, Beccari T, Lal R, Temel SG, Fournier I, Ergoren MC, Mechler A, Salzet M, Maffia M, Danalev D, Sun Q, Nei L, Matulis D, Tapaloaga D, Janecke A, Bown J, Cruz KS, Radecka I, Ozturk C, Nalbantoglu OU, Sag SO, Ko K, Arngrimsson R, Belo I, Akalin H, Dundar M. The impact and future of artificial intelligence in medical genetics and molecular medicine: an ongoing revolution. Funct Integr Genomics 2024; 24:138. [PMID: 39147901 DOI: 10.1007/s10142-024-01417-9] [Received: 07/02/2024] [Revised: 08/01/2024] [Accepted: 08/05/2024] [Indexed: 08/17/2024]
Abstract
Artificial intelligence (AI) platforms have emerged as pivotal tools in genetics and molecular medicine, as in many other fields. The growth in patient data, identification of new diseases and phenotypes, discovery of new intracellular pathways, availability of greater sets of omics data, and the need to continuously analyse them have led to the development of new AI platforms. AI continues to weave its way into the fabric of genetics with the potential to unlock new discoveries and enhance patient care. This technology is setting the stage for breakthroughs across various domains, including dysmorphology, rare hereditary diseases, cancers, clinical microbiomics, the investigation of zoonotic diseases, and omics studies in all medical disciplines. AI's role in facilitating a deeper understanding of these areas heralds a new era of personalised medicine, where treatments and diagnoses are tailored to the individual's molecular features, offering a more precise approach to combating genetic or acquired disorders. The significance of these AI platforms is growing as they assist healthcare professionals in the diagnostic and treatment processes, marking a pivotal shift towards more informed, efficient, and effective medical practice. In this review, we will explore the range of AI tools available and show how they have become vital in various sectors of genomic research supporting clinical decisions.
Affiliation(s)
- Firat Ozcelik
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Mehmet Sait Dundar
  - Department of Electrical and Computer Engineering, Graduate School of Engineering and Sciences, Abdullah Gul University, Kayseri, Turkey
- A Baki Yildirim
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Gary Henehan
  - School of Food Science and Environmental Health, Technological University of Dublin, Dublin, Ireland
- Oscar Vicente
  - Institute for the Conservation and Improvement of Valencian Agrodiversity (COMAV), Universitat Politècnica de València, Valencia, Spain
- José A Sánchez-Alcázar
  - Centro de Investigación Biomédica en Red: Enfermedades Raras, Centro Andaluz de Biología del Desarrollo (CABD-CSIC-Universidad Pablo de Olavide), Instituto de Salud Carlos III, Sevilla, Spain
- Nuriye Gokce
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Duygu T Yildirim
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Nurdeniz Nalbant Bingol
  - Department of Translational Medicine, Institute of Health Sciences, Bursa Uludag University, Bursa, Turkey
- Dijana Plaseska Karanfilska
  - Research Centre for Genetic Engineering and Biotechnology, Macedonian Academy of Sciences and Arts, Skopje, Macedonia
- Lejla Pojskic
  - Institute for Genetic Engineering and Biotechnology, University of Sarajevo, Sarajevo, Bosnia and Herzegovina
- Mehmet Ercan
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Miklos Kellermayer
  - Department of Biophysics and Radiation Biology, Faculty of Medicine, Semmelweis University, Budapest, Hungary
- Izem Olcay Sahin
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Busra Tan
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Donald Martin
  - University Grenoble Alpes, CNRS, TIMC-IMAG/SyNaBi (UMR 5525), Grenoble, France
- Robert Marks
  - Avram and Stella Goldstein-Goren Department of Biotechnology Engineering, Ben-Gurion University of the Negev, Be'er Sheva, Israel
- Satya Prakash
  - Department of Biomedical Engineering, University of McGill, Montreal, QC, Canada
- Mustafa Yakubi
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Tommaso Beccari
  - Department of Pharmaceutical Sciences, University of Perugia, Perugia, Italy
- Ratnesh Lal
  - Neuroscience Research Institute, University of California, Santa Barbara, USA
- Sehime G Temel
  - Department of Translational Medicine, Institute of Health Sciences, Bursa Uludag University, Bursa, Turkey
  - Department of Medical Genetics, Bursa Uludag University Faculty of Medicine, Bursa, Turkey
  - Department of Histology and Embryology, Faculty of Medicine, Bursa Uludag University, Bursa, Turkey
- Isabelle Fournier
  - Réponse Inflammatoire et Spectrométrie de Masse-PRISM, University of Lille, Lille, France
- M Cerkez Ergoren
  - Department of Medical Genetics, Near East University Faculty of Medicine, Nicosia, Cyprus
- Adam Mechler
  - Department of Chemistry, La Trobe Institute for Molecular Science, La Trobe University, Melbourne, VIC, Australia
- Michel Salzet
  - Réponse Inflammatoire et Spectrométrie de Masse-PRISM, University of Lille, Lille, France
- Michele Maffia
  - Department of Experimental Medicine, University of Salento, Via Lecce-Monteroni, Lecce, 73100, Italy
- Dancho Danalev
  - University of Chemical Technology and Metallurgy, Sofia, Bulgaria
- Qun Sun
  - Department of Food Science and Technology, Sichuan University, Chengdu, China
- Lembit Nei
  - School of Engineering Tallinn University of Technology, Tartu College, Tartu, Estonia
- Daumantas Matulis
  - Department of Biothermodynamics and Drug Design, Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Dana Tapaloaga
  - Faculty of Veterinary Medicine, University of Agronomic Sciences and Veterinary Medicine of Bucharest, Bucharest, Romania
- Andres Janecke
  - Department of Paediatrics I, Medical University of Innsbruck, Innsbruck, Austria
  - Division of Human Genetics, Medical University of Innsbruck, Innsbruck, Austria
- James Bown
  - School of Science, Engineering and Technology, Abertay University, Dundee, UK
- Iza Radecka
  - School of Science, Faculty of Science and Engineering, University of Wolverhampton, Wolverhampton, UK
- Celal Ozturk
  - Department of Software Engineering, Erciyes University, Kayseri, Turkey
- Ozkan Ufuk Nalbantoglu
  - Department of Computer Engineering, Engineering Faculty, Erciyes University, Kayseri, Turkey
- Sebnem Ozemri Sag
  - Department of Medical Genetics, Bursa Uludag University Faculty of Medicine, Bursa, Turkey
- Kisung Ko
  - Department of Medicine, College of Medicine, Chung-Ang University, Seoul, Korea
- Reynir Arngrimsson
  - Iceland Landspitali University Hospital, University of Iceland, Reykjavik, Iceland
- Isabel Belo
  - Centre of Biological Engineering, University of Minho, Braga, Portugal
- Hilal Akalin
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
- Munis Dundar
  - Department of Medical Genetics, Faculty of Medicine, Erciyes University, Kayseri, Turkey
| |
Collapse
|
12
|
Zhang G, Ma C, Yan C, Luo H, Wang J, Liang W, Luo J. MSFN: a multi-omics stacked fusion network for breast cancer survival prediction. Front Genet 2024; 15:1378809. [PMID: 39161422 PMCID: PMC11331006 DOI: 10.3389/fgene.2024.1378809] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Accepted: 07/22/2024] [Indexed: 08/21/2024] Open
Abstract
Introduction: Developing effective breast cancer survival prediction models is critical to breast cancer prognosis. With the widespread use of next-generation sequencing technologies, numerous studies have focused on survival prediction. However, previous methods predominantly relied on single-omics data, and survival prediction using multi-omics data remains a significant challenge. Methods: In this study, considering the similarity of patients and the relevance of multi-omics data, we propose a novel multi-omics stacked fusion network (MSFN) based on a stacking strategy to predict the survival of breast cancer patients. MSFN first constructs a patient similarity network (PSN) and employs a residual graph neural network (ResGCN) to obtain correlative prognostic information from the PSN. Simultaneously, it employs convolutional neural networks (CNNs) to obtain specific prognostic information from the multi-omics data. Finally, MSFN stacks the prognostic information from these networks and feeds it into AdaboostRF for survival prediction. Results: Experimental results demonstrated that our method outperformed several state-of-the-art methods, and the results were biologically validated by Kaplan-Meier analysis and t-SNE.
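The abstract does not say how the patient similarity network is built. Purely as an illustration of the general idea, the Python sketch below links each patient to its most similar peers using cosine similarity over feature profiles. The similarity measure, the neighbour count `k`, the function names, and the toy profiles are all assumptions, not details taken from MSFN.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def similarity_network(profiles, k=1):
    """Return undirected edges linking each patient to its k most similar peers."""
    edges = set()
    for i, p in enumerate(profiles):
        sims = [(cosine(p, q), j) for j, q in enumerate(profiles) if j != i]
        sims.sort(reverse=True)  # most similar first
        for _, j in sims[:k]:
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)

# Hypothetical two-feature profiles for four patients.
profiles = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
edges = similarity_network(profiles, k=1)
print(edges)  # [(0, 1), (2, 3)]
```

A graph neural network would then operate on these edges; that step is outside the scope of this sketch.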
Affiliation(s)
- Ge Zhang
  - Academy for Advanced Interdisciplinary Studies, Henan University, Kaifeng, Henan, China
  - School of Computer and Information Engineering, Henan University, Kaifeng, Henan, China
  - Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, Henan, China
- Chenwei Ma
  - School of Computer and Information Engineering, Henan University, Kaifeng, Henan, China
- Chaokun Yan
  - Academy for Advanced Interdisciplinary Studies, Henan University, Kaifeng, Henan, China
  - School of Computer and Information Engineering, Henan University, Kaifeng, Henan, China
  - Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, Henan, China
- Huimin Luo
  - Academy for Advanced Interdisciplinary Studies, Henan University, Kaifeng, Henan, China
  - School of Computer and Information Engineering, Henan University, Kaifeng, Henan, China
  - Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, Henan, China
- Jianlin Wang
  - Academy for Advanced Interdisciplinary Studies, Henan University, Kaifeng, Henan, China
  - School of Computer and Information Engineering, Henan University, Kaifeng, Henan, China
  - Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, Henan, China
- Wenjuan Liang
  - Academy for Advanced Interdisciplinary Studies, Henan University, Kaifeng, Henan, China
  - School of Computer and Information Engineering, Henan University, Kaifeng, Henan, China
  - Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, Henan, China
- Junwei Luo
  - College of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, Henan, China
13
Sadique FL, Subramaiam H, Krishnappa P, Chellappan DK, Ma JH. Recent advances in breast cancer metastasis with special emphasis on metastasis to the brain. Pathol Res Pract 2024; 260:155378. [PMID: 38850880 DOI: 10.1016/j.prp.2024.155378] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2024] [Revised: 05/28/2024] [Accepted: 05/28/2024] [Indexed: 06/10/2024]
Abstract
Understanding the underlying mechanisms of breast cancer metastasis is of vital importance for developing treatment approaches. This review emphasizes contemporary breakthrough studies with special focus on breast cancer brain metastasis. Acquired mutational changes in metastatic lesions are often distinct from the primary tumor, suggesting altered mutagenesis pathways. The concept of micrometastases and heterogeneity within the tumors unravels novel therapeutic targets at genomic and molecular levels through epigenetic and proteomic profiling. Several pre-clinical studies have identified mechanisms involving the immune system, where tumor-associated macrophages are key players. Expression of cell proteins such as Syndecan1, fatty acid-binding protein 7 and tropomyosin receptor kinase B has been implicated in aiding the transmigration of breast cancer cells to the brain. Changes in the proteomic landscape of the blood-brain barrier show altered permeability characteristics, supporting entry of cancer cells. Findings from laboratory studies pave the path for the emergence of new biomarkers, especially blood-based miRNA and circulating tumor cell markers for prognostic staging. Constantly evolving therapeutics call for clinical trials that provide supporting evidence for the efficacy of both novel and existing approaches. The challenge lying ahead is discovering innovative techniques to replace use of human samples and optimize small-scale patient recruitment in trials.
Affiliation(s)
- Fairooz Labiba Sadique
  - Department of Biomedical Science, School of Health Sciences, International Medical University, Kuala Lumpur 57000, Malaysia
- Hemavathy Subramaiam
  - Division of Pathology, School of Medicine, International Medical University, Kuala Lumpur 57000, Malaysia
- Purushotham Krishnappa
  - Division of Pathology, School of Medicine, International Medical University, Kuala Lumpur 57000, Malaysia
- Dinesh Kumar Chellappan
  - Department of Life Sciences, School of Pharmacy, International Medical University, Kuala Lumpur 57000, Malaysia
- Jin Hao Ma
  - School of Medicine, International Medical University, Kuala Lumpur 57000, Malaysia
14
Zhu B, Zhang Z, Leung SY, Fan X. NetMIM: network-based multi-omics integration with block missingness for biomarker selection and disease outcome prediction. Brief Bioinform 2024; 25:bbae454. [PMID: 39288230 PMCID: PMC11407451 DOI: 10.1093/bib/bbae454] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2023] [Revised: 07/24/2024] [Accepted: 08/30/2024] [Indexed: 09/19/2024] Open
Abstract
Compared with analyzing omics data from a single platform, an integrative analysis of multi-omics data provides a more comprehensive understanding of the regulatory relationships among biological features associated with complex diseases. However, most existing frameworks for integrative analysis overlook two crucial aspects of multi-omics data. Firstly, they neglect the known dependencies among biological features that exist in highly credible biological databases. Secondly, most existing integrative frameworks simply remove the subjects without full omics data to handle block missingness, which reduces statistical power. To overcome these issues, we propose a network-based integrative Bayesian framework for biomarker selection and disease outcome prediction based on multi-omics data. Our framework utilizes a Dirac spike-and-slab variable selection prior to identify a small subset of biomarkers. The incorporation of gene pathway information improves the interpretability of feature selection. Furthermore, following the strategy of the FBM (which stands for "full Bayesian model with missingness") model, where missing omics data are augmented via a mechanistic model, our framework handles block missingness in multi-omics data via a data augmentation approach. The real application illustrates that our approach, which incorporates existing gene pathway information and includes subjects without DNA methylation data, results in more interpretable feature selection results and more accurate predictions.
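For readers unfamiliar with the prior named in this abstract, a generic Dirac spike-and-slab prior on a regression coefficient can be written as below. The symbols are notation chosen here for illustration: $\beta_j$ is the coefficient of feature $j$, $\gamma_j$ its binary inclusion indicator, $\tau^2$ the slab variance, $\pi$ the prior inclusion probability, and $\delta_0$ a point mass at zero. The paper's network-based construction, which ties the indicators to pathway information, is not reproduced here.

```latex
\beta_j \mid \gamma_j \;\sim\; \gamma_j \,\mathcal{N}(0, \tau^2) + (1 - \gamma_j)\,\delta_0,
\qquad
\gamma_j \;\sim\; \mathrm{Bernoulli}(\pi)
```

Features with $\gamma_j = 0$ have coefficients fixed at exactly zero, which is what yields a small selected subset of biomarkers.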
Affiliation(s)
- Bencong Zhu
  - Department of Statistics, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong SAR, China
- Zhen Zhang
  - Department of Statistics, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong SAR, China
- Suet Yi Leung
  - Department of Pathology, School of Clinical Medicine, LKS Faculty of Medicine, The University of Hong Kong, Queen Mary Hospital, Hong Kong SAR, China
- Xiaodan Fan
  - Department of Statistics, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong SAR, China
15
Verma S, Magazzù G, Eftekhari N, Lou T, Gilhespy A, Occhipinti A, Angione C. Cross-attention enables deep learning on limited omics-imaging-clinical data of 130 lung cancer patients. Cell Rep Methods 2024; 4:100817. [PMID: 38981473 PMCID: PMC11294841 DOI: 10.1016/j.crmeth.2024.100817] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Revised: 04/18/2024] [Accepted: 06/17/2024] [Indexed: 07/11/2024]
Abstract
Deep-learning tools that extract prognostic factors derived from multi-omics data have recently contributed to individualized predictions of survival outcomes. However, the limited size of integrated omics-imaging-clinical datasets poses challenges. Here, we propose two biologically interpretable and robust deep-learning architectures for survival prediction of non-small cell lung cancer (NSCLC) patients, learning simultaneously from computed tomography (CT) scan images, gene expression data, and clinical information. The proposed models integrate patient-specific clinical, transcriptomic, and imaging data and incorporate Kyoto Encyclopedia of Genes and Genomes (KEGG) and Reactome pathway information, adding biological knowledge within the learning process to extract prognostic gene biomarkers and molecular pathways. While both models accurately stratify patients in high- and low-risk groups when trained on a dataset of only 130 patients, introducing a cross-attention mechanism in a sparse autoencoder significantly improves the performance, highlighting tumor regions and NSCLC-related genes as potential biomarkers and thus offering a significant methodological advancement when learning from small imaging-omics-clinical samples.
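To make the term "cross-attention" concrete, here is a minimal single-head scaled dot-product sketch in plain Python, in which queries come from one modality and keys and values from another. It omits learned projections and the sparse autoencoder mentioned in the abstract; the token values and variable names are hypothetical, and this is not the authors' architecture.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention: queries from one modality, keys/values from another."""
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of value vectors, one output channel at a time.
        outputs.append([sum(wj * v[c] for wj, v in zip(w, values))
                        for c in range(len(values[0]))])
    return outputs, weights

# Hypothetical tokens: two gene-expression queries attend over three image regions.
gene_tokens = [[1.0, 0.0], [0.0, 1.0]]
image_keys = [[4.0, 0.0], [0.0, 4.0], [0.0, 0.0]]
image_values = [[1.0], [2.0], [3.0]]
attended, weights = cross_attention(gene_tokens, image_keys, image_values)
print([round(w, 3) for w in weights[0]])
```

Each row of `weights` sums to one and shows which image regions a given gene token attends to, which is the kind of signal that can be inspected when looking for highlighted tumor regions.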
Affiliation(s)
- Suraj Verma
  - School of Computing, Engineering and Digital Technologies, Teesside University, Middlesbrough, UK
- Thai Lou
  - Gateshead Health NHS Foundation Trust, Gateshead, UK
- Alex Gilhespy
  - South Tyneside and Sunderland NHS Foundation Trust, Sunderland, UK
- Annalisa Occhipinti
  - School of Computing, Engineering and Digital Technologies, Teesside University, Middlesbrough, UK
  - Centre for Digital Innovation, Teesside University, Middlesbrough, UK
  - National Horizons Centre, Teesside University, Darlington, UK
- Claudio Angione
  - School of Computing, Engineering and Digital Technologies, Teesside University, Middlesbrough, UK
  - Centre for Digital Innovation, Teesside University, Middlesbrough, UK
  - National Horizons Centre, Teesside University, Darlington, UK
16
Li L, Sun M, Wang J, Wan S. Multi-omics based artificial intelligence for cancer research. Adv Cancer Res 2024; 163:303-356. [PMID: 39271266 DOI: 10.1016/bs.acr.2024.06.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/15/2024]
Abstract
With significant advancements in next-generation sequencing technologies, large amounts of multi-omics data, including genomics, epigenomics, transcriptomics, proteomics, and metabolomics, have been accumulated, offering an unprecedented opportunity to explore the heterogeneity and complexity of cancer across various molecular levels and scales. One of the promising aspects of multi-omics lies in its capacity to offer a holistic view of the biological networks and pathways underpinning cancer, facilitating a deeper understanding of its development, progression, and response to treatment. However, the exponential growth of data generated by multi-omics studies presents significant analytical challenges. Processing, analyzing, integrating, and interpreting these multi-omics datasets to extract meaningful insights is an ambitious task that stands at the forefront of current cancer research. The application of artificial intelligence (AI) has emerged as a powerful solution to these challenges, demonstrating exceptional capabilities in deciphering complex patterns and extracting valuable information from large-scale, intricate omics datasets. This review delves into the synergy of AI and multi-omics, highlighting its revolutionary impact on oncology. We dissect how this confluence is reshaping the landscape of cancer research and clinical practice, particularly in the realms of early detection, diagnosis, prognosis, treatment and pathology. Additionally, we elaborate on the latest AI methods for multi-omics integration to provide comprehensive insight into the complex biological mechanisms and inherent heterogeneity of cancer. Finally, we discuss the current challenges of data harmonization, algorithm interpretability, and ethical considerations. Addressing these challenges necessitates multidisciplinary collaboration, paving the way for more precise, personalized, and effective treatments for cancer patients.
Affiliation(s)
- Lusheng Li
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE, United States
- Mengtao Sun
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE, United States
- Jieqiong Wang
  - Department of Neurological Sciences, University of Nebraska Medical Center, Omaha, NE, United States
- Shibiao Wan
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE, United States
17
Winicki NM, Radomski SN, Ciftci Y, Sabit AH, Johnston FM, Greer JB. Mortality risk prediction for primary appendiceal cancer. Surgery 2024; 175:1489-1495. [PMID: 38494390 DOI: 10.1016/j.surg.2024.02.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2023] [Revised: 02/10/2024] [Accepted: 02/13/2024] [Indexed: 03/19/2024]
Abstract
BACKGROUND: Accurately predicting survival in patients with cancer is crucial for both clinical decision-making and patient counseling. The primary aim of this study was to generate the first machine-learning algorithm to predict the risk of mortality following the diagnosis of an appendiceal neoplasm. METHODS: Patients with primary appendiceal cancer in the Surveillance, Epidemiology, and End Results database from 2000 to 2019 were included. Patient demographics, tumor characteristics, and survival data were extracted from the Surveillance, Epidemiology, and End Results database. Extreme gradient boost, random forest, neural network, and logistic regression machine learning models were employed to predict 1-, 5-, and 10-year mortality. After algorithm validation, the best-performing model was used to develop a patient-specific web-based risk prediction model. RESULTS: A total of 16,579 patients were included in the study, with 13,262 in the training group (80%) and 3,317 in the validation group (20%). Extreme gradient boost exhibited the highest prediction accuracy for 1-, 5-, and 10-year mortality, with the 10-year model exhibiting the maximum area under the curve (0.909 [±0.006]) after 10-fold cross-validation. Variables that significantly influenced the predictive ability of the model were disease grade, malignant carcinoid histology, incidence of positive regional lymph nodes, number of nodes harvested, and presence of distant disease. CONCLUSION: Here, we report the development and validation of a novel prognostic prediction model for patients with appendiceal neoplasms of numerous histologic subtypes that incorporates a vast array of patient, surgical, and pathologic variables. By using machine learning, we achieved an excellent predictive accuracy that was superior to that of previous nomograms.
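As a rough illustration of the evaluation protocol described in this abstract (an 80/20 training/validation split followed by 10-fold cross-validation scored by area under the curve), the following Python sketch uses only the standard library. The function names, the random seed, and the toy sample size of 100 are assumptions made for illustration. No model is fitted here, and this is not the authors' code.

```python
import random

def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def train_validation_split(n, validation_fraction=0.2, seed=0):
    """Shuffle indices 0..n-1 and split them into training and validation lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_val = round(n * validation_fraction)
    return idx[n_val:], idx[:n_val]

def k_fold_indices(indices, k=10):
    """Yield (train, held_out) index lists for k roughly equal folds."""
    folds = [indices[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, held_out

train_idx, val_idx = train_validation_split(100)
folds = list(k_fold_indices(train_idx, k=10))
print(len(train_idx), len(val_idx), len(folds))   # 80 20 10
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))   # 0.75
```

In practice a classifier would be fitted on each `train` list and scored with `auc` on the matching `held_out` list, and the mean and spread across the ten folds would be reported.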
Affiliation(s)
- Nolan M Winicki
  - Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD
- Shannon N Radomski
  - Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD
- Yusuf Ciftci
  - Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD
- Ahmed H Sabit
  - Department of Biostatistics, Johns Hopkins University School of Medicine, Baltimore, MD
- Fabian M Johnston
  - Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD
- Jonathan B Greer
  - Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD
18
Wang H, Lin K, Zhang Q, Shi J, Song X, Wu J, Zhao C, He K. HyperTMO: a trusted multi-omics integration framework based on hypergraph convolutional network for patient classification. Bioinformatics 2024; 40:btae159. [PMID: 38530977 PMCID: PMC11212491 DOI: 10.1093/bioinformatics/btae159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2023] [Revised: 02/02/2024] [Accepted: 03/24/2024] [Indexed: 03/28/2024] Open
Abstract
MOTIVATION: The rapid development of high-throughput biomedical technologies can provide researchers with detailed multi-omics data. The multi-omics integrated analysis approach based on machine learning contributes a more comprehensive perspective to human disease research. However, there are still significant challenges in representing single-omics data and integrating multi-omics information. RESULTS: This article presents HyperTMO, a Trusted Multi-Omics integration framework based on Hypergraph convolutional network for patient classification. HyperTMO constructs hypergraph structures to represent the association between samples in single-omics data, then evidence extraction is performed by hypergraph convolutional network, and multi-omics information is integrated at an evidence level. Last, we experimentally demonstrate that HyperTMO outperforms other state-of-the-art methods in breast cancer subtype classification and Alzheimer's disease classification tasks using multi-omics data from TCGA (BRCA) and ROSMAP datasets. Importantly, HyperTMO is the first attempt to integrate hypergraph structure, evidence theory, and multi-omics integration for patient classification. Its accurate and robust properties bring great potential for applications in clinical diagnosis. AVAILABILITY AND IMPLEMENTATION: HyperTMO and datasets are publicly available at https://github.com/ippousyuga/HyperTMO.
Affiliation(s)
- Haohua Wang
  - School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
- Kai Lin
  - School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
- Qiang Zhang
  - School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
- Jinlong Shi
  - Research Center for Medical Big Data, Medical Innovation Research Division of Chinese PLA General Hospital, Beijing 100039, China
- Xinyu Song
  - Research Center for Medical Big Data, Medical Innovation Research Division of Chinese PLA General Hospital, Beijing 100039, China
- Jue Wu
  - Research Center for Medical Big Data, Medical Innovation Research Division of Chinese PLA General Hospital, Beijing 100039, China
- Chenghui Zhao
  - Research Center for Medical Big Data, Medical Innovation Research Division of Chinese PLA General Hospital, Beijing 100039, China
- Kunlun He
  - Research Center for Medical Big Data, Medical Innovation Research Division of Chinese PLA General Hospital, Beijing 100039, China
19
Taheri G, Habibi M. Uncovering driver genes in breast cancer through an innovative machine learning mutational analysis method. Comput Biol Med 2024; 171:108234. [PMID: 38430742 DOI: 10.1016/j.compbiomed.2024.108234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 01/25/2024] [Accepted: 02/25/2024] [Indexed: 03/05/2024]
Abstract
Breast cancer has become a severe public health concern and one of the leading causes of cancer-related death in women worldwide. Several genes and mutations in these genes linked to breast cancer have been identified using sophisticated techniques, despite the fact that the exact cause of breast cancer is still unknown. A commonly used feature for identifying driver mutations is the recurrence of a mutation in patients. Nevertheless, some mutations are more likely to occur than others for various reasons. Sequencing analysis has shown that cancer-driving genes operate across complex networks, often with mutations appearing in a modular pattern. In this work, as a retrospective study, we used TCGA data, which is gathered from breast cancer patients. We introduced a new machine-learning approach to examine gene functionality in networks derived from mutation associations, gene-gene interactions, and graph clustering for breast cancer analysis. These networks have uncovered crucial biological components in critical pathways, particularly those that exhibit low-frequency mutations. The statistical strength of the clinical study is significantly boosted by evaluating the network as a whole instead of just single gene effects. Our method successfully identified essential driver genes with diverse mutation frequencies. We then explored the functions of these potential driver genes and their related pathways. By uncovering low-frequency genes, we shed light on understudied pathways associated with breast cancer. Additionally, we present a novel Monte Carlo-based algorithm to identify driver modules in breast cancer. Our findings highlight the significance and role of these modules in critical signaling pathways in breast cancer, providing a comprehensive understanding of breast cancer development. Materials and implementations are available at: [https://github.com/MahnazHabibi/BreastCancer].
Affiliation(s)
- Golnaz Taheri
  - Department of Computer and Systems Sciences, Stockholm University, Stockholm, Sweden
  - Science for Life Laboratory, Stockholm, Sweden
- Mahnaz Habibi
  - Department of Mathematics, Qazvin Branch, Islamic Azad University, Qazvin, Iran
20
Cai WL, Cheng M, Wang Y, Xu PH, Yang X, Sun ZW, Yan WJ. Prediction and related genes of cancer distant metastasis based on deep learning. Comput Biol Med 2024; 168:107664. [PMID: 38000245 DOI: 10.1016/j.compbiomed.2023.107664] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Revised: 09/27/2023] [Accepted: 10/31/2023] [Indexed: 11/26/2023]
Abstract
Cancer metastasis is one of the main causes of cancer progression and of difficulty in treatment. Genes play a key role in the process of cancer metastasis, as they can influence tumor cell invasiveness, migration ability and fitness. At the same time, the organs to which cancers metastasize are heterogeneous: breast cancer and prostate cancer, for example, tend to metastasize to the bone. Previous studies have pointed out that the occurrence of metastasis is closely related to both the target tissue and the genes involved. In this paper, we identified genes associated with cancer metastasis to different tissues based on LASSO and Pearson correlation coefficients. In total, we identified 45 genes associated with bone metastases, 89 genes associated with lung metastases, and 86 genes associated with liver metastases. Using the expression of these genes, we propose a CNN-based model to predict the occurrence of metastasis. We call this method MDCNN; it introduces a modulation mechanism that allows the weights of convolution kernels to be adjusted at different positions and feature maps, thereby adaptively changing the convolution operation at different positions. Experiments showed that MDCNN achieved satisfactory prediction accuracy for bone, lung and liver metastasis and outperformed four other methods of the same kind. We performed enrichment analysis and immune infiltration analysis on bone metastasis-related genes, found multiple pathways and GO terms related to bone metastasis, and found that the abundance of macrophages and monocytes was the highest in patients with bone metastasis.
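The abstract names Pearson correlation as one of its two gene-screening criteria. A minimal sketch of such a filter follows, keeping genes whose expression correlates with a binary metastasis label above a cutoff. The threshold of 0.5, the gene names, and the toy values are hypothetical, and the LASSO step mentioned alongside it is not shown.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant vector: correlation undefined, treated as 0 here
    return cov / (sx * sy)

def select_genes(expression, outcome, threshold=0.5):
    """Keep genes whose |r| with the outcome meets the (assumed) threshold."""
    selected = {}
    for gene, values in expression.items():
        r = pearson(values, outcome)
        if abs(r) >= threshold:
            selected[gene] = r
    return selected

# Hypothetical expression values for six samples; 1 = metastasis observed.
outcome = [0, 0, 1, 1, 1, 0]
expression = {
    "GENE_A": [1.0, 1.2, 3.1, 2.9, 3.3, 0.8],  # tracks the outcome closely
    "GENE_B": [2.0, 2.0, 2.0, 2.0, 2.0, 2.0],  # constant, carries no signal
    "GENE_C": [5.0, 1.0, 4.0, 2.0, 3.0, 3.0],  # uncorrelated with the outcome
}
selected = select_genes(expression, outcome)
print(sorted(selected))  # ['GENE_A']
```

The surviving genes would then serve as input features for a downstream predictor.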
Affiliation(s)
- Wei-Luo Cai
  - Department of Musculoskeletal Surgery, Fudan University Shanghai Cancer Center, China
- Mo Cheng
  - Department of Musculoskeletal Surgery, Fudan University Shanghai Cancer Center, China
- Yi Wang
  - Department of Gastrointestinal Surgical Oncology, Clinical Oncology School of Fujian Medical University, Fujian Cancer Hospital, China
- Pei-Hang Xu
  - Department of Musculoskeletal Surgery, Fudan University Shanghai Cancer Center, China
- Xi Yang
  - Department of Radiation Oncology, Fudan University Shanghai Cancer Center, China
  - Department of Oncology, Shanghai Medical College, Fudan University, China
- Zheng-Wang Sun
  - Department of Musculoskeletal Surgery, Fudan University Shanghai Cancer Center, China
- Wang-Jun Yan
  - Department of Musculoskeletal Surgery, Fudan University Shanghai Cancer Center, China
21
Zhang H, Hussin H, Hoh CC, Cheong SH, Lee WK, Yahaya BH. Big data in breast cancer: Towards precision treatment. Digit Health 2024; 10:20552076241293695. [PMID: 39502482 PMCID: PMC11536614 DOI: 10.1177/20552076241293695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2024] [Accepted: 10/07/2024] [Indexed: 11/08/2024] Open
Abstract
Breast cancer is the most prevalent and deadliest cancer among women globally, representing a major threat to public health. In response, the World Health Organization has established the Global Breast Cancer Initiative framework to reduce breast cancer mortality through global collaboration. The integration of big data analytics (BDA) and precision medicine has transformed our understanding of breast cancer's biological traits and treatment responses. By harnessing large-scale datasets - encompassing genetic, clinical, and environmental data - BDA has enhanced strategies for breast cancer prevention, diagnosis, and treatment, driving the advancement of precision oncology and personalised care. Despite the increasing importance of big data in breast cancer research, comprehensive studies remain sparse, underscoring the need for more systematic investigation. This review evaluates the contributions of big data to breast cancer precision medicine while addressing the associated opportunities and challenges. Through the application of big data, we aim to deepen insights into breast cancer pathogenesis, optimise therapeutic approaches, improve patient outcomes, and ultimately contribute to better survival rates and quality of life. This review seeks to provide a foundation for future research in breast cancer prevention, treatment, and management.
Affiliation(s)
- Hao Zhang
  - Breast Cancer Translational Research Program (BCTRP@IPPT), Universiti Sains Malaysia, Kepala Batas, Penang, Malaysia
  - Department of Biomedical Sciences, Advanced Medical and Dental Institute (IPPT), Universiti Sains Malaysia, Kepala Batas, Penang, Malaysia
- Hasmah Hussin
  - Breast Cancer Translational Research Program (BCTRP@IPPT), Universiti Sains Malaysia, Kepala Batas, Penang, Malaysia
  - Department of Clinical Medicine, Advanced Medical and Dental Institute (IPPT), Universiti Sains Malaysia, Kepala Batas, Penang, Malaysia
- Wei-Kang Lee
  - Codon Genomics Sdn Bhd, Seri Kembangan, Selangor, Malaysia
- Badrul Hisham Yahaya
  - Breast Cancer Translational Research Program (BCTRP@IPPT), Universiti Sains Malaysia, Kepala Batas, Penang, Malaysia
  - Department of Biomedical Sciences, Advanced Medical and Dental Institute (IPPT), Universiti Sains Malaysia, Kepala Batas, Penang, Malaysia
22
Wang H, Han X, Ren J, Cheng H, Li H, Li Y, Li X. A prognostic prediction model for ovarian cancer using a cross-modal view correlation discovery network. Math Biosci Eng 2024; 21:736-764. [PMID: 38303441 DOI: 10.3934/mbe.2024031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2024]
Abstract
Ovarian cancer is a tumor with different clinicopathological and molecular features, and the vast majority of patients have local or extensive spread at the time of diagnosis. Early diagnosis and prognostic prediction of patients can contribute to the understanding of the underlying pathogenesis of ovarian cancer and the improvement of therapeutic outcomes. The occurrence of ovarian cancer is influenced by multiple complex mechanisms, including the genome, transcriptome and proteome. Different types of omics analysis help predict the survival rate of ovarian cancer patients. Multi-omics data of ovarian cancer exhibit high-dimensional heterogeneity, and existing methods for integrating multi-omics data have not taken into account the variability and inter-correlation between different omics data. In this paper, we propose a deep learning model, MDCADON, which utilizes multi-omics data and a cross-modal view correlation discovery network. We introduce random forest into LASSO regression for feature selection on mRNA expression, DNA methylation, miRNA expression and copy number variation (CNV), aiming to select important features highly correlated with ovarian cancer prognosis. A multi-modal deep neural network is used to comprehensively learn feature representations of each omics data type and of the clinical data, and a cross-modal view correlation discovery network is employed to construct the multi-omics discovery tensor, exploring the inter-relationships between different omics data. The experimental results demonstrate that MDCADON is superior to existing methods in predicting ovarian cancer prognosis, which enables survival analysis for patients and facilitates the determination of follow-up treatment plans. Finally, we perform Gene Ontology (GO) term analysis and biological pathway analysis on the genes identified by MDCADON, revealing the underlying mechanisms of ovarian cancer and providing some support for guiding ovarian cancer treatments.
Affiliation(s)
- Huiqing Wang
- College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan 030024, China
| | - Xiao Han
- College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan 030024, China
| | - Jianxue Ren
- College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan 030024, China
| | - Hao Cheng
- College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan 030024, China
| | - Haolin Li
- College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan 030024, China
| | - Ying Li
- College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan 030024, China
| | - Xue Li
- College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan 030024, China
| |
|
23
|
Liu H, Shi Y, Li A, Wang M. Multi-modal fusion network with intra- and inter-modality attention for prognosis prediction in breast cancer. Comput Biol Med 2024; 168:107796. [PMID: 38064843 DOI: 10.1016/j.compbiomed.2023.107796] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2023] [Revised: 11/20/2023] [Accepted: 11/29/2023] [Indexed: 01/10/2024]
Abstract
Accurate breast cancer prognosis prediction can help clinicians develop appropriate treatment plans and improve quality of life for patients. Recent prognostic prediction studies suggest that fusing multi-modal data, e.g., genomic data and pathological images, plays a crucial role in improving predictive performance. Despite the promising results of existing approaches, challenges remain in effective multi-modal fusion. First, although it is a powerful fusion technique, the Kronecker product produces a high-dimensional quadratic expansion of features that may result in high computational cost and overfitting risk, thereby limiting its performance and applicability in cancer prognosis prediction. Second, most existing methods focus on learning cross-modality relations between different modalities, ignoring modality-specific relations that are complementary to cross-modality relations and beneficial for cancer prognosis prediction. To address these challenges, in this study we propose a novel attention-based multi-modal network to accurately predict breast cancer prognosis, which efficiently models both modality-specific and cross-modality relations without introducing high-dimensional features. Specifically, two intra-modality self-attentional modules and an inter-modality cross-attentional module, accompanied by latent space transformation of the channel affinity matrix, are developed to capture modality-specific and cross-modality relations, respectively, for efficient integration of genomic data and pathological images. Moreover, we design an adaptive fusion block to take full advantage of both modality-specific and cross-modality relations. Comprehensive experiments demonstrate that our method effectively boosts breast cancer prognosis prediction performance and compares favorably with state-of-the-art methods.
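To make the dimensionality concern concrete, the following is a minimal sketch (not the authors' code) of Kronecker-product fusion of two feature vectors in plain Python. The function name `kronecker_fusion` and the 32-dimensional inputs are illustrative assumptions; appending a constant 1 to each vector is a common convention for retaining unimodal terms and is assumed here rather than taken from the abstract.

```python
def kronecker_fusion(genomic, image):
    """Fuse two feature vectors by their Kronecker (outer) product.

    A constant 1.0 is appended to each vector (an assumed convention) so
    that the fused vector also contains the original unimodal features.
    """
    g = list(genomic) + [1.0]
    p = list(image) + [1.0]
    # Every pairwise product g_i * p_j becomes one fused feature.
    return [gi * pj for gi in g for pj in p]


# Hypothetical embedding sizes, chosen only for illustration.
genomic = [0.1] * 32
image = [0.2] * 32
fused = kronecker_fusion(genomic, image)
print(len(fused))  # (32 + 1) * (32 + 1) = 1089
```

The fused length grows with the product of the input sizes, so any layer placed on top of it needs correspondingly more parameters, which is the cost and overfitting risk the abstract refers to.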
Affiliation(s)
- Honglei Liu
- School of Information Science and Technology, University of Science and Technology of China, Hefei 230027, China
| | - Yi Shi
- School of Information Science and Technology, University of Science and Technology of China, Hefei 230027, China
| | - Ao Li
- School of Information Science and Technology, University of Science and Technology of China, Hefei 230027, China.
| | - Minghui Wang
- School of Information Science and Technology, University of Science and Technology of China, Hefei 230027, China.
| |
|
24
|
Vanmathi P, Jose D. An ensemble-based serial cascaded attention network and improved variational auto encoder for breast cancer prognosis prediction using data. Comput Methods Biomech Biomed Engin 2024; 27:98-115. [PMID: 38006210 DOI: 10.1080/10255842.2023.2280883] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Accepted: 11/02/2023] [Indexed: 11/26/2023]
Abstract
Breast cancer is one of the most common types of cancer in women and causes a large number of deaths worldwide. Early recognition lessens its impact: it can lead patients to receive surgical therapy, which significantly improves the chance of recovery. Machine learning techniques use patient data to find links among variables and to make predictions for new cases. Late recognition of breast cancer can lead to death. An accurate predictive framework for breast cancer is therefore urgently needed. To this end, an adaptive ensemble model is proposed for breast cancer prognosis prediction using data. At the initial stage, the raw data are fetched from benchmark datasets, followed by data cleaning and preprocessing. Subsequently, the pre-processed data are fed into the Improved Variational Autoencoder (IVAE), where deep features are extracted. Finally, the resulting features are given as input to the Ensemble-based Serial Cascaded Attention Network (ESCANet), which is built with a Deep Temporal Convolution Network (DTCN), Bi-directional Long Short-Term Memory (BiLSTM), and a Recurrent Neural Network (RNN). The effectiveness of the model is validated and compared with conventional methodologies. The results indicate that the proposed methodology performs well and increases the system's efficiency.
Affiliation(s)
- P Vanmathi
- Full time Research Scholar, Department of ECE, KCG College of Technology, Karapakkam, Chennai, Tamil Nadu, India
| | - Deepa Jose
- Professor, Department of ECE, KCG College of Technology, Karapakkam, Chennai, Tamil Nadu, India
| |
|
25
|
Arya N, Saha S. Deviation-support based fuzzy ensemble of multi-modal deep learning classifiers for breast cancer prognosis prediction. Sci Rep 2023; 13:21326. [PMID: 38044381 PMCID: PMC10694142 DOI: 10.1038/s41598-023-47543-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2023] [Accepted: 11/15/2023] [Indexed: 12/05/2023] Open
Abstract
Breast cancer is the fifth leading cause of death in females worldwide. Early detection and treatment are crucial for improving health outcomes and preventing more serious conditions. Analyzing diverse information from multiple sources without errors, particularly with the growing burden of cancer cases, is a daunting task for humans. In this study, our main objective is to improve the accuracy of breast cancer survival prediction using a novel ensemble approach. The approach is novel in that it considers the deviation (closeness between predicted classes and actual classes) and support (sparsity between predicted classes and actual classes) of the predicted class with respect to the actual class, a feature lacking in traditional ensembles. The ensemble uses fuzzy integrals on support and deviation scores from base classifiers to calculate aggregated scores while considering how confident or uncertain each classifier is. The proposed ensemble mechanism has been evaluated on a multi-modal dataset of breast tumors collected from participants in the METABRIC trial. The proposed architecture achieves accuracy, sensitivity, F1-score, and balanced accuracy of 82.88%, 58.64%, 62.94%, and 74.75%, respectively. The obtained results are superior to the performance of individual classifiers and existing ensemble approaches.
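As a reading aid for the four reported metrics, here is a minimal sketch of how they are conventionally computed from binary confusion-matrix counts. The function name `binary_metrics` and the counts in the example are hypothetical and are not taken from the METABRIC experiments; the standard binary definitions are assumed.

```python
def binary_metrics(tp, fn, tn, fp):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # recall on the positive class
    specificity = tn / (tn + fp)          # recall on the negative class
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "balanced_accuracy": balanced_accuracy,
    }


# Hypothetical counts for an imbalanced test set (illustration only).
metrics = binary_metrics(tp=2, fn=2, tn=5, fp=1)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions, balanced accuracy is the mean of sensitivity and specificity, so the reported 58.64% sensitivity and 74.75% balanced accuracy would correspond to a specificity of roughly 90.86%. This is an inference from the standard formula, not a figure stated in the abstract.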
Affiliation(s)
- Nikhilanand Arya
- Department of Computer Science & Engineering, Indian Institute of Technology Patna, Bihar, 801106, India.
| | - Sriparna Saha
- Department of Computer Science & Engineering, Indian Institute of Technology Patna, Bihar, 801106, India
| |
|
26
|
Chandrashekar PB, Alatkar S, Wang J, Hoffman GE, He C, Jin T, Khullar S, Bendl J, Fullard JF, Roussos P, Wang D. DeepGAMI: deep biologically guided auxiliary learning for multimodal integration and imputation to improve genotype-phenotype prediction. Genome Med 2023; 15:88. [PMID: 37904203 PMCID: PMC10617196 DOI: 10.1186/s13073-023-01248-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Accepted: 10/16/2023] [Indexed: 11/01/2023] Open
Abstract
BACKGROUND Genotypes are strongly associated with disease phenotypes, particularly in brain disorders. However, the molecular and cellular mechanisms behind this association remain elusive. With emerging multimodal data for these mechanisms, machine learning methods can be applied for phenotype prediction at different scales, but due to the black-box nature of machine learning, integrating these modalities and interpreting biological mechanisms can be challenging. Additionally, the partial availability of these multimodal data presents a challenge in developing these predictive models. METHOD To address these challenges, we developed DeepGAMI, an interpretable neural network model to improve genotype-phenotype prediction from multimodal data. DeepGAMI leverages functional genomic information, such as eQTLs and gene regulation, to guide neural network connections. Additionally, it includes an auxiliary learning layer for cross-modal imputation allowing the imputation of latent features of missing modalities and thus predicting phenotypes from a single modality. Finally, DeepGAMI uses integrated gradient to prioritize multimodal features for various phenotypes. RESULTS We applied DeepGAMI to several multimodal datasets including genotype and bulk and cell-type gene expression data in brain diseases, and gene expression and electrophysiology data of mouse neuronal cells. Using cross-validation and independent validation, DeepGAMI outperformed existing methods for classifying disease types, and cellular and clinical phenotypes, even using single modalities (e.g., AUC score of 0.79 for Schizophrenia and 0.73 for cognitive impairment in Alzheimer's disease). CONCLUSION We demonstrated that DeepGAMI improves phenotype prediction and prioritizes phenotypic features and networks in multiple multimodal datasets in complex brains and brain diseases. 
Also, it prioritized disease-associated variants, genes, and regulatory networks linked to different phenotypes, providing novel insights into the interpretation of gene regulatory mechanisms. DeepGAMI is open-source and available for general use.
Affiliation(s)
- Pramod Bharadwaj Chandrashekar
- Waisman Center, University of Wisconsin-Madison, Madison, WI, 53705, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, 53076, USA
| | - Sayali Alatkar
- Waisman Center, University of Wisconsin-Madison, Madison, WI, 53705, USA
- Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI, 53076, USA
| | - Jiebiao Wang
- Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, 15261, USA
| | - Gabriel E Hoffman
- Center for Disease Neurogenomics, Department of Psychiatry and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
| | - Chenfeng He
- Waisman Center, University of Wisconsin-Madison, Madison, WI, 53705, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, 53076, USA
| | - Ting Jin
- Waisman Center, University of Wisconsin-Madison, Madison, WI, 53705, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, 53076, USA
| | - Saniya Khullar
- Waisman Center, University of Wisconsin-Madison, Madison, WI, 53705, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, 53076, USA
| | - Jaroslav Bendl
- Center for Disease Neurogenomics, Department of Psychiatry and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
| | - John F Fullard
- Center for Disease Neurogenomics, Department of Psychiatry and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
| | - Panos Roussos
- Center for Disease Neurogenomics, Department of Psychiatry and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Mental Illness Research, Education and Clinical Centers, James J. Peters VA Medical Center, Bronx, NY, 10468, USA
- Center for Dementia Research, Nathan Kline Institute for Psychiatric Research, Orangeburg, NY, 10962, USA
| | - Daifeng Wang
- Waisman Center, University of Wisconsin-Madison, Madison, WI, 53705, USA.
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, 53076, USA.
- Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI, 53076, USA.
| |
|
27
|
Palmal S, Arya N, Saha S, Tripathy S. Breast cancer survival prognosis using the graph convolutional network with Choquet fuzzy integral. Sci Rep 2023; 13:14757. [PMID: 37679421 PMCID: PMC10485011 DOI: 10.1038/s41598-023-40341-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/31/2023] [Accepted: 08/09/2023] [Indexed: 09/09/2023] Open
Abstract
Breast cancer is the most prevalent kind of cancer among women and there is a need for a reliable algorithm to predict its prognosis. Previous studies focused on using gene expression data to build predictive models. However, recent advancements have made multi-omics cancer data sets (gene expression, copy number alteration, etc.) accessible. This motivated the creation of a novel model that utilizes a graph convolutional network (GCN) and a Choquet fuzzy ensemble, incorporating multi-omics and clinical data retrieved from the publicly available METABRIC database. In this study, graphs have been used to extract structural information, and a Choquet fuzzy ensemble with Logistic Regression, Random Forest, and Support Vector Machine as base classifiers has been employed to classify breast cancer patients as short-term or long-term survivors. The model has been run using all possible combinations of the gene expression, copy number alteration, and clinical modalities, and the results have been reported. Furthermore, the obtained results have been compared with different baseline models and state-of-the-art methods to demonstrate the efficacy of the proposed model in terms of different metrics. The Accuracy, Matthews correlation coefficient, Precision, Sensitivity, Specificity, Balanced Accuracy, and F1-Measure of this model are 0.820, 0.528, 0.630, 0.666, 0.871, 0.769, and 0.647, respectively.
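The aggregation step named in the abstract can be illustrated with a minimal sketch of the discrete Choquet integral over three base-classifier scores. This is a generic textbook formulation, not the authors' implementation; the function name `choquet_integral`, the confidence scores, and the fuzzy-measure values below are all hypothetical.

```python
def choquet_integral(scores, measure):
    """Discrete Choquet integral of `scores` with respect to a fuzzy measure.

    scores  : dict mapping classifier name -> confidence in [0, 1]
    measure : dict mapping frozenset of classifier names -> value in [0, 1];
              it should be monotone and assign 1.0 to the full set
    """
    # Visit classifiers from the lowest to the highest score.
    ordered = sorted(scores, key=scores.get)
    total, previous = 0.0, 0.0
    for i, name in enumerate(ordered):
        # Coalition of all classifiers whose score is at least this one.
        coalition = frozenset(ordered[i:])
        total += (scores[name] - previous) * measure[coalition]
        previous = scores[name]
    return total


# Hypothetical per-classifier confidences that a patient is a long-term survivor.
scores = {"LR": 0.9, "RF": 0.6, "SVM": 0.2}

# Hypothetical non-additive fuzzy measure over coalitions of classifiers.
measure = {
    frozenset(["LR"]): 0.4,
    frozenset(["RF"]): 0.4,
    frozenset(["SVM"]): 0.3,
    frozenset(["LR", "RF"]): 0.7,
    frozenset(["LR", "SVM"]): 0.6,
    frozenset(["RF", "SVM"]): 0.6,
    frozenset(["LR", "RF", "SVM"]): 1.0,
}

fused = choquet_integral(scores, measure)
print(round(fused, 3))  # 0.6
```

When the measure is additive (each coalition's value is the sum of its members' weights), the Choquet integral reduces to an ordinary weighted average; a non-additive measure lets the ensemble reward or discount particular combinations of classifiers.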
Affiliation(s)
- Susmita Palmal
- Department of Computer Science and Engineering, Indian Institute of Technology, Patna, Bihar, 801106, India.
| | - Nikhilanand Arya
- Department of Computer Science and Engineering, Indian Institute of Technology, Patna, Bihar, 801106, India
| | - Sriparna Saha
- Department of Computer Science and Engineering, Indian Institute of Technology, Patna, Bihar, 801106, India
| | - Somanath Tripathy
- Department of Computer Science and Engineering, Indian Institute of Technology, Patna, Bihar, 801106, India
| |
|
28
|
Kim SY. GNN-surv: Discrete-Time Survival Prediction Using Graph Neural Networks. Bioengineering (Basel) 2023; 10:1046. [PMID: 37760148 PMCID: PMC10525217 DOI: 10.3390/bioengineering10091046] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/02/2023] [Revised: 08/31/2023] [Accepted: 09/04/2023] [Indexed: 09/29/2023] Open
Abstract
Survival prediction models play a key role in patient prognosis and personalized treatment. However, their accuracy can be improved by incorporating patient similarity networks, which uncover complex data patterns. Our study uses Graph Neural Networks (GNNs) to enhance discrete-time survival predictions (GNN-surv) by leveraging relationships in these networks. We build these networks using cancer patients' genomic and clinical data and train various GNN models on them, integrating Logistic Hazard and PMF survival models. GNN-surv models exhibit superior performance in survival prediction across two urologic cancer datasets, outperforming traditional MLP models. They maintain robustness and effectiveness under varying graph construction hyperparameter μ values, with performance boosts of up to 14.6% and 7.9% in the time-dependent concordance index and reductions in the integrated brier score of 26.7% and 24.1% in the BLCA and KIRC datasets, respectively. Notably, these models also maintain their effectiveness across three different types of GNN models, suggesting potential adaptability to other cancer datasets. The superior performance of our GNN-surv models underscores their wide applicability in the fields of oncology and personalized medicine, providing clinicians with a more accurate tool for patient prognosis and personalized treatment planning. Future studies can further optimize these models by incorporating other survival models or additional data modalities.
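For readers unfamiliar with discrete-time survival models, the sketch below shows how a logistic-hazard parameterization turns one logit per time interval into a survival curve. It is a generic illustration, not code from GNN-surv: the function name `survival_from_logits` and the logits are hypothetical, and in the paper's setting the logits would come from a trained network rather than being typed in by hand.

```python
import math


def survival_from_logits(logits):
    """Convert per-interval logits into hazards, event PMF and survival curve.

    hazard_j   = sigmoid(logit_j) = P(event in interval j | alive at its start)
    survival_j = product over k <= j of (1 - hazard_k)
    pmf_j      = survival_{j-1} * hazard_j
    """
    hazards = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    pmf, survival = [], []
    alive = 1.0  # probability of having survived all earlier intervals
    for h in hazards:
        pmf.append(alive * h)    # event happens in this interval
        alive *= 1.0 - h         # patient survives this interval as well
        survival.append(alive)
    return hazards, pmf, survival


# Hypothetical logits for four follow-up intervals.
hazards, pmf, survival = survival_from_logits([-2.0, -1.0, 0.0, 0.5])
for j, s in enumerate(survival, start=1):
    print(f"S(t_{j}) = {s:.3f}")
```

The event probabilities and the final survival probability sum to one by construction, which is what makes the hazard and PMF views of the same model interchangeable.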
Affiliation(s)
- So Yeon Kim
- Department of Artificial Intelligence, Ajou University, Suwon 16499, Republic of Korea;
- Department of Software and Computer Engineering, Ajou University, Suwon 16499, Republic of Korea
| |
|
29
|
Ramachandran RA, Barão VAR, Ozevin D, Sukotjo C, Srinivasa PP, Mathew M. Early Predicting Tribocorrosion Rate of Dental Implant Titanium Materials Using Random Forest Machine Learning Models. TRIBOLOGY INTERNATIONAL 2023; 187:108735. [PMID: 37720691 PMCID: PMC10503681 DOI: 10.1016/j.triboint.2023.108735] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/19/2023]
Abstract
Early detection and prediction of bio-tribocorrosion can avert unexpected damage that may lead to secondary revision surgery and the associated risks of implantable devices. Therefore, this study sought to develop a state-of-the-art prediction technique leveraging machine learning (ML) models to classify and predict the possibility of mechanical degradation in dental implant materials. Key features considered in the study, which involved pure titanium and titanium-zirconium (zirconium = 5, 10, and 15 wt%) alloys, include corrosion potential, acoustic emission (AE) absolute energy, hardness, and weight-loss estimates. The ML prototype models deployed confirm their suitability for tribocorrosion prediction, with an accuracy above 90%. The proposed system can evolve into a continuous structural-health monitoring tool as well as a reliable predictive modeling technique for dental implant monitoring.
Affiliation(s)
| | - Valentim A R Barão
- Department of Prosthodontics and Periodontology, Piracicaba Dental School, University of Campinas (UNICAMP), Piracicaba, São Paulo, Brazil
| | - Didem Ozevin
- Department of Civil, Materials, and Environmental Engineering, University of Illinois at Chicago, IL, USA
| | - Cortino Sukotjo
- Department of Restorative Dentistry, College of Dentistry, University of Illinois at Chicago, IL, USA
| | - Pai P Srinivasa
- Department of Mechanical Engineering, NMAM IT, Nitte, Karnataka, India
| | - Mathew Mathew
- Department of Biomedical Engineering, University of Illinois at Chicago, IL, USA
- Department of Restorative Dentistry, College of Dentistry, University of Illinois at Chicago, IL, USA
| |
|
30
|
Furtney I, Bradley R, Kabuka MR. Patient Graph Deep Learning to Predict Breast Cancer Molecular Subtype. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:3117-3127. [PMID: 37379184 PMCID: PMC10623656 DOI: 10.1109/tcbb.2023.3290394] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/30/2023]
Abstract
Breast cancer is a heterogeneous disease consisting of a diverse set of genomic mutations and clinical characteristics. The molecular subtypes of breast cancer are closely tied to prognosis and therapeutic treatment options. We investigate using deep graph learning on a collection of patient factors from multiple diagnostic disciplines to better represent breast cancer patient information and predict molecular subtype. Our method models breast cancer patient data into a multi-relational directed graph with extracted feature embeddings to directly represent patient information and diagnostic test results. We develop a radiographic image feature extraction pipeline to produce vector representation of breast cancer tumors in DCE-MRI and an autoencoder-based genomic variant embedding method to map variant assay results to a low-dimensional latent space. We leverage related-domain transfer learning to train and evaluate a Relational Graph Convolutional Network to predict the probabilities of molecular subtypes for individual breast cancer patient graphs. Our work found that utilizing information from multiple multimodal diagnostic disciplines improved the model's prediction results and produced more distinct learned feature representations for breast cancer patients. This research demonstrates the capabilities of graph neural networks and deep learning feature representation to perform multimodal data fusion and representation in the breast cancer domain.
|
31
|
Huang W, Liu Y, Hu P, Ding S, Gao S, Zhang M. What influence farmers' relative poverty in China: A global analysis based on statistical and interpretable machine learning methods. Heliyon 2023; 9:e19525. [PMID: 37809468 PMCID: PMC10558733 DOI: 10.1016/j.heliyon.2023.e19525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Revised: 08/24/2023] [Accepted: 08/25/2023] [Indexed: 10/10/2023] Open
Abstract
Poverty eradication has always been a major challenge to global development and governance, and it has received widespread attention from every country. With the completion of the poverty alleviation task in 2020, relative poverty governance has become an urgent issue for China. Because of a large population, poor infrastructure, insufficient resources, and long-term uneven development, raising the living standard of farmers in rural areas is critical to China's success in realizing moderate prosperity. Therefore, identifying poor farmers, exploring the factors that influence relative poverty, and clarifying their effect mechanisms in rural areas are significant for subsequent poverty governance. Most previous studies took an a priori approach, assuming a factor system and then verifying the hypothesis. We constructed a relative poverty index system consistent with China's actual conditions, selecting all the possible variables that could affect relative poverty based on the existing literature, including individual characteristics, psychological endowment, and geographical environment, and rebuilt an experimental database. Then, through data processing and data analysis, the main factors influencing the relative poverty of farmers were systematically sorted out using machine learning methods. Finally, the 25 selected influencing factors were discussed in detail. The research findings show that: 1) Machine learning algorithms can be applied well in the relative poverty field, especially XGBoost, which achieves 81.9% accuracy and an ROC AUC of 0.819. 2) This study sheds light on many new research directions in applying machine learning to relative poverty research, and it offers an integral framework and useful reference for target identification using machine learning algorithms. 3) With interpretable tools, the "black box" of ML becomes transparent through PDP and SHAP explanations, which also reveal that machine learning models can readily handle non-linear associations.
Affiliation(s)
- Wei Huang
- School of Management and Economics, North China University of Water Resources and Electric Power, Zhengzhou 450046, China
| | - Yinke Liu
- School of Management and Economics, North China University of Water Resources and Electric Power, Zhengzhou 450046, China
| | - Peiqi Hu
- School of Management and Economics, North China University of Water Resources and Electric Power, Zhengzhou 450046, China
| | - Shiyu Ding
- School of Management and Economics, North China University of Water Resources and Electric Power, Zhengzhou 450046, China
| | - Shuhui Gao
- School of Management and Economics, North China University of Water Resources and Electric Power, Zhengzhou 450046, China
| | - Ming Zhang
- School of Management and Economics, North China University of Water Resources and Electric Power, Zhengzhou 450046, China
| |
|
32
|
Wen G, Li L. FGCNSurv: dually fused graph convolutional network for multi-omics survival prediction. Bioinformatics 2023; 39:btad472. [PMID: 37522887 PMCID: PMC10412406 DOI: 10.1093/bioinformatics/btad472] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2022] [Revised: 05/24/2023] [Accepted: 07/29/2023] [Indexed: 08/01/2023] Open
Abstract
MOTIVATION Survival analysis is an important tool for modeling time-to-event data, e.g. to predict the survival time of a patient after a cancer diagnosis or a certain treatment. While deep neural networks work well in standard prediction tasks, it is still unclear how to best utilize these deep models in survival analysis due to the difficulty of modeling right-censored data, especially for multi-omics data. Although existing methods have shown the advantage of multi-omics integration in survival prediction, it remains challenging to extract complementary information from different omics and improve the prediction accuracy. RESULTS In this work, we propose a novel multi-omics deep survival prediction approach based on a dually fused graph convolutional network (GCN), named FGCNSurv. FGCNSurv is a complete generative model from multi-omics data to the survival outcome of patients, including feature fusion by a factorized bilinear model, graph fusion of multiple graphs, higher-level feature extraction by GCN and survival prediction by a Cox proportional hazards (Cox-PH) model. The factorized bilinear model makes it possible to capture cross-omics features and quantify complex relations in multi-omics data. By fusing single-omics features with the cross-omics features, and simultaneously fusing multiple graphs from different omics, the GCN with the generated dually fused graph can capture higher-level features for computing the survival loss in the Cox-PH model. Comprehensive experimental results on real-world datasets with gene expression and microRNA expression data show that the proposed FGCNSurv method outperforms existing survival prediction methods, and suggest that it can extract complementary information for survival prediction from multi-omics data. AVAILABILITY AND IMPLEMENTATION The codes are freely available at https://github.com/LiminLi-xjtu/FGCNSurv.
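The survival loss mentioned in the abstract is based on the Cox proportional hazards model. As a hedged illustration of what such a loss usually looks like, the sketch below computes the average negative log partial likelihood from per-patient risk scores. It is not taken from the FGCNSurv repository: the function name, the toy inputs, and the Breslow-style handling of tied times are assumptions made for this example.

```python
import math


def cox_neg_log_partial_likelihood(risk_scores, times, events):
    """Average negative log partial likelihood of a Cox-PH model.

    risk_scores : model outputs (log relative hazards), one per patient
    times       : observed follow-up times
    events      : 1 if the event was observed, 0 if right-censored
    Tied times are handled in the Breslow style: every patient with
    time >= t_i is included in the risk set of patient i.
    """
    n = len(times)
    loss, n_events = 0.0, 0
    for i in range(n):
        if not events[i]:
            continue  # censored patients contribute only through risk sets
        # Risk set: everyone still under observation at time t_i.
        risk_set = [risk_scores[j] for j in range(n) if times[j] >= times[i]]
        # Numerically stable log-sum-exp over the risk set.
        m = max(risk_set)
        log_denominator = m + math.log(sum(math.exp(r - m) for r in risk_set))
        loss -= risk_scores[i] - log_denominator
        n_events += 1
    return loss / n_events if n_events else 0.0


# Hypothetical risk scores, follow-up times (months) and event indicators.
scores = [1.2, 0.3, -0.5, 0.8]
times = [5.0, 12.0, 30.0, 8.0]
events = [1, 1, 0, 1]
print(cox_neg_log_partial_likelihood(scores, times, events))
```

This quadratic-time loop is written for readability; implementations used for training typically sort patients by time and use a cumulative log-sum-exp so the loss can be computed in one pass.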
Affiliation(s)
- Gang Wen
- School of Mathematics and Statistics, Xi’an Jiaotong University, Xi’an, Shaanxi 710049, China
| | - Limin Li
- School of Mathematics and Statistics, Xi’an Jiaotong University, Xi’an, Shaanxi 710049, China
| |
|
33
|
Hao Y, Jing XY, Sun Q. Cancer survival prediction by learning comprehensive deep feature representation for multiple types of genetic data. BMC Bioinformatics 2023; 24:267. [PMID: 37380946 DOI: 10.1186/s12859-023-05392-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2023] [Accepted: 06/19/2023] [Indexed: 06/30/2023] Open
Abstract
BACKGROUND Cancer is one of the leading causes of death around the world. Accurate prediction of patient survival time is important because it can help clinicians choose appropriate therapeutic schemes. Cancer data can be characterized by varied molecular features, clinical behaviors and morphological appearances. However, cancer heterogeneity usually makes patient samples with different risks (i.e., short and long survival times) inseparable, thereby causing unsatisfactory prediction results. Clinical studies have shown that genetic data tend to contain more molecular biomarkers associated with cancer, and hence integrating multi-type genetic data may be a feasible way to deal with cancer heterogeneity. Although multi-type gene data have been used in existing work, how to learn more effective features for cancer survival prediction has not been well studied. RESULTS To this end, we propose a deep learning approach to reduce the negative impact of cancer heterogeneity and improve cancer survival prediction. It represents each type of genetic data as shared and specific features, which can capture the consensus and complementary information among all types of data. We collect mRNA expression, DNA methylation and microRNA expression data for four cancers to conduct experiments. CONCLUSIONS Experimental results demonstrate that our approach substantially outperforms established integrative methods and is effective for cancer survival prediction. AVAILABILITY AND IMPLEMENTATION https://github.com/githyr/ComprehensiveSurvival.
Affiliation(s)
- Yaru Hao
- School of Computer Science, Wuhan University, Wuhan, China.
| | - Xiao-Yuan Jing
- School of Computer Science, Wuhan University, Wuhan, China.
- School of Computer, Guangdong University of Petrochemical Technology, Maoming, China.
- State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China.
| | - Qixing Sun
- School of Computer Science, Wuhan University, Wuhan, China
| |
|
34
|
Chong D, Jones NC, Schittenhelm RB, Anderson A, Casillas-Espinosa PM. Multi-omics Integration and Epilepsy: Towards a Better Understanding of Biological Mechanisms. Prog Neurobiol 2023:102480. [PMID: 37286031 DOI: 10.1016/j.pneurobio.2023.102480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2023] [Revised: 05/09/2023] [Accepted: 06/03/2023] [Indexed: 06/09/2023]
Abstract
The epilepsies are a group of complex neurological disorders characterised by recurrent seizures. Approximately 30% of patients fail to respond to anti-seizure medications, despite the recent introduction of many new drugs. The molecular processes underlying epilepsy development are not well understood and this knowledge gap impedes efforts to identify effective targets and develop novel therapies against epilepsy. Omics studies allow a comprehensive characterisation of a class of molecules. Omics-based biomarkers have led to clinically validated diagnostic and prognostic tests for personalised oncology, and more recently for non-cancer diseases. We believe that, in epilepsy, the full potential of multi-omics research is yet to be realised and we envisage that this review will serve as a guide to researchers planning to undertake omics-based mechanistic studies.
Affiliation(s)
- Debbie Chong
  - Department of Neuroscience, Central Clinical School, Monash University, Melbourne, 3004, Victoria, Australia
- Nigel C Jones
  - Department of Neuroscience, Central Clinical School, Monash University, Melbourne, 3004, Victoria, Australia; Department of Medicine (The Royal Melbourne Hospital), The University of Melbourne, 3000, Victoria, Australia; Department of Neurology, Alfred Health, Melbourne, 3004, Victoria, Australia
- Ralf B Schittenhelm
  - Monash Proteomics & Metabolomics Facility and Monash Biomedicine Discovery Institute, Monash University, Clayton, Victoria, 3800, Australia
- Alison Anderson
  - Department of Neuroscience, Central Clinical School, Monash University, Melbourne, 3004, Victoria, Australia; Department of Medicine (The Royal Melbourne Hospital), The University of Melbourne, 3000, Victoria, Australia; Department of Neurology, Alfred Health, Melbourne, 3004, Victoria, Australia
- Pablo M Casillas-Espinosa
  - Department of Neuroscience, Central Clinical School, Monash University, Melbourne, 3004, Victoria, Australia; Department of Medicine (The Royal Melbourne Hospital), The University of Melbourne, 3000, Victoria, Australia; Department of Neurology, Alfred Health, Melbourne, 3004, Victoria, Australia
|
35
|
Mustafa E, Jadoon EK, Khaliq-uz-Zaman S, Humayun MA, Maray M. An Ensembled Framework for Human Breast Cancer Survivability Prediction Using Deep Learning. Diagnostics (Basel) 2023; 13:1688. [PMID: 37238173 PMCID: PMC10217686 DOI: 10.3390/diagnostics13101688] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/16/2023] [Revised: 04/13/2023] [Accepted: 04/23/2023] [Indexed: 05/28/2023] Open
Abstract
Breast cancer is categorized as an aggressive disease, and it is one of the leading causes of death. Accurate and timely survival predictions for both long-term and short-term survivors can help physicians make effective treatment decisions for their patients. Therefore, there is a pressing need for an efficient and rapid computational model for breast cancer prognosis. In this study, we propose an ensemble model for breast cancer survivability prediction (EBCSP) that utilizes multi-modal data and stacks the output of multiple neural networks. Specifically, we design a convolutional neural network (CNN) for clinical modalities, a deep neural network (DNN) for copy number variations (CNV), and a long short-term memory (LSTM) architecture for gene expression modalities to effectively handle multi-dimensional data. The independent models' results are then used for binary classification (long term > 5 years and short term < 5 years) based on survivability using the random forest method. The EBCSP model outperforms both models that use a single data modality for prediction and existing benchmarks.
Affiliation(s)
- Ehzaz Mustafa
  - Department of Computer Science, Comsats University Islamabad, Abbottabad Campus, Islamabad 22060, Pakistan
- Ehtisham Khan Jadoon
  - Department of Computer Science, Comsats University Islamabad, Abbottabad Campus, Islamabad 22060, Pakistan
- Sardar Khaliq-uz-Zaman
  - Department of Computer Science, Comsats University Islamabad, Abbottabad Campus, Islamabad 22060, Pakistan
- Mohammad Ali Humayun
  - Department of Computer Science, Information Technology University of the Punjab, Lahore 54590, Pakistan
- Mohammed Maray
  - Department of Information Systems, King Khalid University, Abha 62529, Saudi Arabia
|
36
|
Yin Z, Chen T, Shu Y, Li Q, Yuan Z, Zhang Y, Xu X, Liu Y. A Gallbladder Cancer Survival Prediction Model Based on Multimodal Fusion Analysis. Dig Dis Sci 2023; 68:1762-1776. [PMID: 36496528 PMCID: PMC10133088 DOI: 10.1007/s10620-022-07782-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2022] [Accepted: 11/28/2022] [Indexed: 04/27/2023]
Abstract
BACKGROUND Gallbladder cancer is the sixth most common malignant gastrointestinal tumor. Radical surgery is currently the only effective treatment, but patient prognosis is poor, with a 5-year survival rate of only 5-10%. Establishing an effective survival prediction model for gallbladder cancer patients is crucial for disease status assessment, early intervention, and individualized treatment approaches. The existing gallbladder cancer survival prediction model uses clinical data (radiotherapy and chemotherapy, pathology, and surgical scope) but fails to utilize laboratory examination and imaging data, limiting its prediction accuracy and preventing sufficient treatment plan guidance. AIMS The aim of this work is to propose an accurate survival prediction model, based on the deep learning 3D-DenseNet network, integrated with multimodal medical data (enhanced CT imaging, laboratory test results, and data regarding systemic treatments). METHODS Data were collected from 195 gallbladder cancer patients at two large tertiary hospitals in Shanghai. The 3D-DenseNet network extracted deep imaging features and constructed prognostic factors, from which a multimodal survival prediction model was established, based on the Cox regression model and incorporating patients' laboratory test and systemic treatment data. RESULTS The model had a C-index of 0.787 in predicting patients' survival rate. Moreover, the area under the curve (AUC) of predicting patients' 1-, 3-, and 5-year survival rates reached 0.827, 0.865, and 0.926, respectively. CONCLUSIONS Compared with the monomodal model based on deep imaging features and the tumor-node-metastasis (TNM) staging system, which is widely used in clinical practice, our model's prediction accuracy was greatly improved, aiding the prognostic assessment of gallbladder cancer patients.
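For readers unfamiliar with the C-index reported in this abstract, the following is a minimal sketch of Harrell's concordance index in plain Python. It is an illustration only, not the authors' implementation: the function name `concordance_index` and the toy data are hypothetical, and pairs with tied survival times are simply skipped, which is a simplification.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable patient pairs ranked correctly.

    times       -- observed follow-up times
    events      -- 1 if the event (death) was observed, 0 if censored
    risk_scores -- model output; a higher score should mean shorter survival
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the earlier
        # time is an observed event (simplification: tied times are skipped).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
print(concordance_index(times, events, [0.9, 0.7, 0.4, 0.1]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives some context for the reported value of 0.787.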
Affiliation(s)
- Ziming Yin
  - School of Health Science and Engineering, University of Shanghai for Science and Technology, 516 Jungong Road, Yangpu District, Shanghai, 200093, China.
- Tao Chen
  - Department of Biliary and Pancreatic Surgery, Renji Hospital Affiliated to Shanghai Jiaotong University School of Medicine, 160 Pujian Road, Pudong New District, Shanghai, 200127, China
- Yijun Shu
  - Department of General Surgery, Xinhua Hospital Affiliated to Shanghai Jiaotong University School of Medicine, 1665 Kongjiang Road, Yangpu District, Shanghai, 200092, China
  - Shanghai Key Laboratory of Biliary Disease Research, Institute of Biliary Tract Disease Research, Shanghai Jiaotong University School of Medicine, 1665 Kongjiang Road, Yangpu District, Shanghai, 200092, China
- Qiwei Li
  - Department of Biliary and Pancreatic Surgery, Renji Hospital Affiliated to Shanghai Jiaotong University School of Medicine, 160 Pujian Road, Pudong New District, Shanghai, 200127, China
- Zhiqing Yuan
  - Department of Biliary and Pancreatic Surgery, Renji Hospital Affiliated to Shanghai Jiaotong University School of Medicine, 160 Pujian Road, Pudong New District, Shanghai, 200127, China
- Yijue Zhang
  - Department of Anesthesiology, Renji Hospital Affiliated to Shanghai Jiaotong University School of Medicine, 160 Pujian Road, Pudong New District, Shanghai, 200127, China
- Xinsen Xu
  - Department of Biliary and Pancreatic Surgery, Renji Hospital Affiliated to Shanghai Jiaotong University School of Medicine, 160 Pujian Road, Pudong New District, Shanghai, 200127, China
- Yingbin Liu
  - Department of Biliary and Pancreatic Surgery, Renji Hospital Affiliated to Shanghai Jiaotong University School of Medicine, 160 Pujian Road, Pudong New District, Shanghai, 200127, China
  - Shanghai Key Laboratory of Biliary Disease Research, Institute of Biliary Tract Disease Research, Shanghai Jiaotong University School of Medicine, 1665 Kongjiang Road, Yangpu District, Shanghai, 200092, China
  - State Key Laboratory of Oncogenes and Related Genes, Renji Hospital Affiliated to Shanghai Jiaotong University School of Medicine, 160 Pujian Road, Pudong New District, Shanghai, 200127, China
|
37
|
Cui C, Yang H, Wang Y, Zhao S, Asad Z, Coburn LA, Wilson KT, Landman BA, Huo Y. Deep multimodal fusion of image and non-image data in disease diagnosis and prognosis: a review. Prog Biomed Eng (Bristol) 2023; 5:10.1088/2516-1091/acc2fe. [PMID: 37360402 PMCID: PMC10288577 DOI: 10.1088/2516-1091/acc2fe] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 06/28/2023]
Abstract
The rapid development of diagnostic technologies in healthcare is leading to higher requirements for physicians to handle and integrate the heterogeneous, yet complementary data that are produced during routine practice. For instance, the personalized diagnosis and treatment planning for a single cancer patient relies on various images (e.g. radiology, pathology and camera images) and non-image data (e.g. clinical data and genomic data). However, such decision-making procedures can be subjective, qualitative, and have large inter-subject variabilities. With the recent advances in multimodal deep learning technologies, an increasingly large number of efforts have been devoted to a key question: how do we extract and aggregate multimodal information to ultimately provide more objective, quantitative computer-aided clinical decision making? This paper reviews the recent studies on dealing with such a question. Briefly, this review will include the (a) overview of current multimodal learning workflows, (b) summarization of multimodal fusion methods, (c) discussion of the performance, (d) applications in disease diagnosis and prognosis, and (e) challenges and future directions.
Affiliation(s)
- Can Cui
  - Department of Computer Science, Vanderbilt University, Nashville, TN 37235, United States of America
- Haichun Yang
  - Department of Pathology, Microbiology and Immunology, Vanderbilt University Medical Center, Nashville, TN 37215, United States of America
- Yaohong Wang
  - Department of Pathology, Microbiology and Immunology, Vanderbilt University Medical Center, Nashville, TN 37215, United States of America
- Shilin Zhao
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN 37215, United States of America
- Zuhayr Asad
  - Department of Computer Science, Vanderbilt University, Nashville, TN 37235, United States of America
- Lori A Coburn
  - Division of Gastroenterology, Hepatology, and Nutrition, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, United States of America
  - Veterans Affairs Tennessee Valley Healthcare System, Nashville, TN 37212, United States of America
- Keith T Wilson
  - Department of Pathology, Microbiology and Immunology, Vanderbilt University Medical Center, Nashville, TN 37215, United States of America
  - Division of Gastroenterology, Hepatology, and Nutrition, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, United States of America
  - Veterans Affairs Tennessee Valley Healthcare System, Nashville, TN 37212, United States of America
- Bennett A Landman
  - Department of Computer Science, Vanderbilt University, Nashville, TN 37235, United States of America
  - Department of Electrical and Computer Engineering, Vanderbilt University, Nashville, TN 37235, United States of America
- Yuankai Huo
  - Department of Computer Science, Vanderbilt University, Nashville, TN 37235, United States of America
  - Department of Electrical and Computer Engineering, Vanderbilt University, Nashville, TN 37235, United States of America
|
38
|
Du X, Zhao Y. Multimodal adversarial representation learning for breast cancer prognosis prediction. Comput Biol Med 2023; 157:106765. [PMID: 36963355 DOI: 10.1016/j.compbiomed.2023.106765] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2022] [Revised: 02/27/2023] [Accepted: 03/07/2023] [Indexed: 03/17/2023]
Abstract
With the increasing incidence of breast cancer, accurate prognosis prediction for breast cancer patients is a key issue in current cancer research, and it is also of great significance for patients' psychological rehabilitation and for assisting clinical decision-making. Many studies that integrate data from different heterogeneous modalities, such as gene expression profiles, clinical data, and copy number alterations, have achieved greater success in prognostic prediction than those using only one modality. However, many existing approaches fail to substantially reduce the modality gap by aligning multimodal distributions. Therefore, it is crucial to develop a method that fully considers a modality-invariant embedding space to effectively integrate multimodal data. In this study, we propose a multimodal data adversarial representation framework (MDAR) that reduces modal heterogeneity by translating source modalities into distributions for the target modality. Additionally, we apply reconstruction and classification losses to the embedding space to further constrain it. Then, we design a multi-scale bilinear convolutional neural network (MS-B-CNN) for each single modality to improve the feature expression ability. In addition, the embedding space generates predictions that serve as stacked feature inputs to an extremely randomized trees classifier. With 10-fold cross-validation, our results show that the proposed adversarial representation learning improves prognostic performance. A comparative study of this method and other existing methods on the METABRIC dataset (1980 patients) showed that the Matthews correlation coefficient (MCC) was significantly enhanced by 7.4% in the prognosis prediction of breast cancer patients.
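The Matthews correlation coefficient used as the headline metric in this abstract can be computed directly from the binary confusion matrix. The sketch below is a plain-Python illustration under stated assumptions: the function name `matthews_corrcoef_binary` and the toy labels are hypothetical, and returning 0.0 when the denominator is zero is a common convention rather than something taken from the paper.

```python
import math


def matthews_corrcoef_binary(y_true, y_pred):
    """Matthews correlation coefficient for 0/1 labels.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention (an assumption here): MCC is 0 when any marginal is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical imbalanced example: 3 short-term survivors out of 8 patients.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
print(round(matthews_corrcoef_binary(y_true, y_pred), 4))
```

Unlike accuracy, MCC stays near zero for a classifier that always predicts the majority class, which is one reason it is often preferred on imbalanced survival labels.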
Affiliation(s)
- Xiuquan Du
  - Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei, China; School of Computer Science and Technology, Anhui University, Hefei, China.
- Yuefan Zhao
  - School of Computer Science and Technology, Anhui University, Hefei, China
|
39
|
Arya N, Saha S, Mathur A, Saha S. Improving the robustness and stability of a machine learning model for breast cancer prognosis through the use of multi-modal classifiers. Sci Rep 2023; 13:4079. [PMID: 36906618 PMCID: PMC10008603 DOI: 10.1038/s41598-023-30143-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/19/2022] [Accepted: 02/16/2023] [Indexed: 03/13/2023] Open
Abstract
Breast cancer is a deadly disease with a high mortality rate among PAN cancers. Advancements in biomedical information retrieval techniques have been beneficial in developing early prognosis and diagnosis systems for cancer patients. These systems provide the oncologist with plenty of information from several modalities to make a correct and feasible treatment plan for breast cancer patients and protect them from unnecessary therapies and their toxic side effects. Information related to a cancer patient can be collected using various modalities such as clinical data, copy number variation, DNA methylation, microRNA sequencing, gene expression, and histopathological whole slide images. The high dimensionality and heterogeneity of these modalities demand intelligent systems that can identify features related to the prognosis and diagnosis of disease and make correct predictions. In this work, we have studied some end-to-end systems having two main components: (a) dimensionality reduction techniques applied to original features from different modalities and (b) classification techniques applied to the fusion of reduced feature vectors from different modalities for automatic classification of breast cancer patients into two categories: short-time and long-time survivors. Principal component analysis (PCA) and variational auto-encoders (VAEs) are used as the dimensionality reduction techniques, followed by support vector machines (SVM) or random forest as the machine learning classifiers. The study utilizes raw, PCA, and VAE extracted features of the TCGA-BRCA dataset from six different modalities as input to the machine learning classifiers. We conclude this study by suggesting that adding more modalities provides complementary information to the classifiers and increases their stability and robustness. In this study, the multimodal classifiers have not been validated on primary data prospectively.
Affiliation(s)
- Nikhilanand Arya
  - Department of Computer Science & Engineering, Indian Institute of Technology, Patna, Bihar, 801106, India.
- Sriparna Saha
  - Department of Computer Science & Engineering, Indian Institute of Technology, Patna, Bihar, 801106, India
- Archana Mathur
  - Department of Information Science & Engineering, Nitte Meenakshi Institute of Technology, Bangalore, 560064, India
- Snehanshu Saha
  - APPCAIR & CSIS, Birla Institute of Technology and Science, Pilani-Goa Campus, Pilani, Goa, 403726, India
|
40
|
Arya N, Mathur A, Saha S, Saha S. Proposal of SVM Utility Kernel for Breast Cancer Survival Estimation. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:1372-1383. [PMID: 35994556 DOI: 10.1109/tcbb.2022.3198879] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 05/04/2023]
Abstract
The advancement of medical research in the field of cancer prognosis and diagnosis using various modalities has put oncologists under tremendous stress. The complexity and heterogeneity involved in multiple modalities and their significantly varied clinical outcomes make it difficult to analyze the disease and provide the correct treatment. Breast cancer is the major concern among all cancers worldwide, specifically for females. To help oncologists and cancer patients, research for breast cancer survival estimation has been proposed. It ranges from complex deep neural networks to simple and interpretable architectures. We propose a utility kernel for a support vector machine (SVM) in this article. It is a simple yet powerful function, which performs better than other popular machine learning algorithms and deep neural networks in the task of breast cancer survival prediction using the TCGA-BRCA dataset. This study validates the proposed utility kernel using four different modalities (gene expression, copy number variation, clinical, and histopathological tissue images) and their multi-modal combinations. The SVM based on our utility kernel empirically proves its efficacy by achieving the highest value on various performance measures, whereas advanced deep neural networks fail to train on small and highly imbalanced breast cancer data.
|
41
|
Alsubai S, Alqahtani A, Sha M. Genetic hyperparameter optimization with Modified Scalable-Neighbourhood Component Analysis for breast cancer prognostication. Neural Netw 2023; 162:240-257. [PMID: 36913821 DOI: 10.1016/j.neunet.2023.02.035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2022] [Revised: 12/30/2022] [Accepted: 02/23/2023] [Indexed: 03/02/2023]
Abstract
Breast cancer is common among women and results in mortality when left untreated. Early detection is vital so that suitable treatment can prevent the cancer from spreading further and save lives. The traditional way of detection is a time-consuming process. With the evolution of DM (Data Mining), the healthcare industry can benefit in predicting the disease, as DM permits physicians to determine the significant attributes for diagnosis. Although conventional techniques have used DM-based methods to identify breast cancer, they were lacking in terms of prediction rate. Moreover, parametric Softmax classifiers have been a general option in conventional works with fixed classes, particularly when large amounts of labelled data are present during training. Nevertheless, this becomes an issue for open set cases where new classes are encountered along with few instances from which to learn a generalized parametric classifier. Thus, the present study aims to implement a non-parametric strategy by optimizing the embedding of a feature rather than parametric classifiers. This research utilizes Deep CNN (Deep Convolutional Neural Network) and Inception V3 for learning visual features which preserve neighbourhood outline in semantic space relying on NCA (Neighbourhood Component Analysis) criteria. Delimited by its bottleneck, the study proposes MS-NCA (Modified Scalable-Neighbourhood Component Analysis), which relies on a non-linear objective function to perform feature fusion by optimizing the distance-learning objective; as a result, it can compute inner feature products without performing mapping, which increases the scalability of MS-NCA. Finally, G-HPO (Genetic-Hyper-parameter Optimization) is proposed. In this case, the new stage in the algorithm simply denotes an increase in the chromosome length, bringing several hyperparameters into the subsequent XGBoost, NB and RF models having numerous layers for identifying the normal and affected cases of breast cancer, for which optimized hyper-parameter values of RF (Random Forest), NB (Naïve Bayes), and XGBoost (eXtreme Gradient Boosting) are determined. This process helps to improve the classification rate, which is confirmed through analytical results.
Affiliation(s)
- Shtwai Alsubai
  - College of Computer Engineering and Sciences, Prince Sattam Bin AbdulAziz University, Al Kharj, Saudi Arabia.
- Abdullah Alqahtani
  - College of Computer Engineering and Sciences, Prince Sattam Bin AbdulAziz University, Al Kharj, Saudi Arabia.
- Mohemmed Sha
  - College of Computer Engineering and Sciences, Prince Sattam Bin AbdulAziz University, Al Kharj, Saudi Arabia.
|
42
|
Flores JE, Claborne DM, Weller ZD, Webb-Robertson BJM, Waters KM, Bramer LM. Missing data in multi-omics integration: Recent advances through artificial intelligence. Front Artif Intell 2023; 6:1098308. [PMID: 36844425 PMCID: PMC9949722 DOI: 10.3389/frai.2023.1098308] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 11/14/2022] [Accepted: 01/23/2023] [Indexed: 02/11/2023] Open
Abstract
Biological systems function through complex interactions between various 'omics (biomolecules), and a more complete understanding of these systems is only possible through an integrated, multi-omic perspective. This has presented the need for the development of integration approaches that are able to capture the complex, often non-linear, interactions that define these biological systems and are adapted to the challenges of combining the heterogeneous data across 'omic views. A principal challenge to multi-omic integration is missing data, because not all biomolecules are measured in all samples. Due to cost, instrument sensitivity, or other experimental factors, data for a biological sample may be missing for one or more 'omic technologies. Recent methodological developments in artificial intelligence and statistical learning have greatly facilitated the analysis of multi-omics data; however, many of these techniques assume access to completely observed data. A subset of these methods incorporate mechanisms for handling partially observed samples, and these methods are the focus of this review. We describe recently developed approaches, noting their primary use cases and highlighting each method's approach to handling missing data. We additionally provide an overview of the more traditional missing data workflows and their limitations, and we discuss potential avenues for further developments as well as how the missing data issue and its current solutions may generalize beyond the multi-omics context.
Affiliation(s)
- Javier E. Flores
  - Pacific Northwest National Laboratory, Biological Sciences Division, Earth and Biological Sciences Directorate, Richland, WA, United States
- Daniel M. Claborne
  - Pacific Northwest National Laboratory, Artificial Intelligence and Data Analytics Division, National Security Directorate, Richland, WA, United States
- Zachary D. Weller
  - Pacific Northwest National Laboratory, Artificial Intelligence and Data Analytics Division, National Security Directorate, Richland, WA, United States
- Bobbie-Jo M. Webb-Robertson
  - Pacific Northwest National Laboratory, Biological Sciences Division, Earth and Biological Sciences Directorate, Richland, WA, United States
- Katrina M. Waters
  - Pacific Northwest National Laboratory, Biological Sciences Division, Earth and Biological Sciences Directorate, Richland, WA, United States
- Lisa M. Bramer
  - Pacific Northwest National Laboratory, Biological Sciences Division, Earth and Biological Sciences Directorate, Richland, WA, United States
|
43
|
Accuracy Assessment of Machine Learning Algorithms Used to Predict Breast Cancer. Data 2023. [DOI: 10.3390/data8020035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023] Open
Abstract
Machine learning (ML) was used to develop classification models to predict individual tumor patients’ outcomes. Binary classification defined whether the tumor was malignant or benign. This paper presents a comparative analysis of machine learning algorithms used for breast cancer prediction. This study used a dataset obtained from the National Cancer Institute (NIH), USA, which contains 1.7 million data records. Classical and deep learning methods were included in the accuracy assessment. Classical decision tree (DT), linear discriminant (LD), logistic regression (LR), support vector machine (SVM), and ensemble techniques (ET) algorithms were used. Probabilistic neural network (PNN), deep neural network (DNN), and recurrent neural network (RNN) methods were used for comparison. Feature selection and its effect on accuracy were also investigated. The results showed that decision trees and ensemble techniques outperformed the other techniques, as they both achieved a 98.7% accuracy.
|
44
|
Huang Y, Rong Z, Zhang L, Xu Z, Ji J, He J, Liu W, Hou Y, Li K. HiRAND: A novel GCN semi-supervised deep learning-based framework for classification and feature selection in drug research and development. Front Oncol 2023; 13:1047556. [PMID: 36776339 PMCID: PMC9909422 DOI: 10.3389/fonc.2023.1047556] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2022] [Accepted: 01/03/2023] [Indexed: 01/28/2023] Open
Abstract
The prediction of response to drugs before initiating therapy based on transcriptome data is a major challenge. However, identifying effective drug response label data costs time and resources. Methods available often predict poorly and fail to identify robust biomarkers due to the curse of dimensionality: high dimensionality and low sample size. Therefore, this necessitates the development of predictive models to effectively predict the response to drugs using limited labeled data while being interpretable. In this study, we report a novel Hierarchical Graph Random Neural Networks (HiRAND) framework to predict the drug response using transcriptome data of few labeled data and additional unlabeled data. HiRAND completes the information integration of the gene graph and sample graph by graph convolutional network (GCN). The innovation of our model is leveraging data augmentation strategy to solve the dilemma of limited labeled data and using consistency regularization to optimize the prediction consistency of unlabeled data across different data augmentations. The results showed that HiRAND achieved better performance than competitive methods in various prediction scenarios, including both simulation data and multiple drug response data. We found that the prediction ability of HiRAND in the drug vorinostat showed the best results across all 62 drugs. In addition, HiRAND was interpreted to identify the key genes most important to vorinostat response, highlighting critical roles for ribosomal protein-related genes in the response to histone deacetylase inhibition. Our HiRAND could be utilized as an efficient framework for improving the drug response prediction performance using few labeled data.
Affiliation(s)
- Yue Huang
  - Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin, China
- Zhiwei Rong
  - Department of Biostatistics, School of Public Health, Peking University, Beijing, China
- Liuchao Zhang
  - Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin, China
- Zhenyi Xu
  - Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin, China
- Jianxin Ji
  - Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin, China
- Jia He
  - Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin, China
- Weisha Liu
  - Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin, China
- Yan Hou
  - Department of Biostatistics, School of Public Health, Peking University, Beijing, China (corresponding author)
- Kang Li
  - Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin, China (corresponding author)
|
45
|
Duan Y, Huo J, Chen M, Hou F, Yan G, Li S, Wang H. Early prediction of sepsis using double fusion of deep features and handcrafted features. Appl Intell 2023; 53:1-17. [PMID: 36685641 PMCID: PMC9843111 DOI: 10.1007/s10489-022-04425-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/21/2022] [Indexed: 01/19/2023]
Abstract
Sepsis is a life-threatening medical condition characterized by a dysregulated immune system response to infection, with high morbidity and mortality rates. Early prediction of sepsis is critical to reducing mortality. This paper presents a novel early warning model for sepsis onset called Double Fusion Sepsis Predictor (DFSP). DFSP is a double fusion framework that combines the benefits of early and late fusion strategies. First, a hybrid deep learning model that combines convolutional and recurrent neural networks to extract deep features is proposed. Second, deep features and handcrafted features, such as clinical scores, are concatenated to build the joint feature representation (early fusion). Third, several tree-based models based on the joint feature representation are developed to generate risk scores of sepsis onset, which are combined with an End-to-End neural network for final sepsis detection (late fusion). To evaluate DFSP, a retrospective study was conducted, which included patients admitted to the ICUs of a hospital in Shanghai, China. The results demonstrate that DFSP outperforms state-of-the-art approaches in early sepsis prediction.
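The difference between the early and late fusion steps described in this abstract can be shown with a small plain-Python sketch. Everything here is a hypothetical illustration rather than the DFSP implementation: the function names, the toy feature values, the stand-in risk scores, and the choice of a weighted average as the late-fusion rule are all assumptions.

```python
def early_fusion(deep_features, handcrafted_features):
    """Early fusion: concatenate feature vectors into one joint representation."""
    return list(deep_features) + list(handcrafted_features)


def late_fusion(risk_scores, weights=None):
    """Late fusion: combine per-model risk scores (here, a weighted average)."""
    if not risk_scores:
        raise ValueError("need at least one risk score")
    if weights is None:
        weights = [1.0 / len(risk_scores)] * len(risk_scores)
    if len(weights) != len(risk_scores):
        raise ValueError("weights and risk_scores must have the same length")
    return sum(w * s for w, s in zip(weights, risk_scores))


# Hypothetical inputs: two deep features from a neural network and one
# handcrafted clinical score for a single patient.
joint = early_fusion([0.2, 0.8], [3.0])

# Stand-in outputs of two tree-based models (which would be trained on
# `joint`) and of an end-to-end neural network for the same patient.
tree_scores = [0.6, 0.8]
network_score = 0.4
final_risk = late_fusion(tree_scores + [network_score])
print(joint, round(final_risk, 3))
```

In this sketch early fusion happens at the feature level and late fusion at the decision level, which is the distinction the abstract draws between its second and third steps.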
Affiliation(s)
- Yongrui Duan
- School of Economics & Management, Tongji University, Shanghai, China
| | - Jiazhen Huo
- School of Economics & Management, Tongji University, Shanghai, China
| | - Mingzhou Chen
- School of Economics & Management, Tongji University, Shanghai, China
| | - Fenggang Hou
- Department of Oncology, Shanghai Municipal Hospital of Traditional Chinese Medicine, Shanghai, China
| | - Guoliang Yan
- Department of Geriatrics, Shanghai Municipal Hospital of Traditional Chinese Medicine, Shanghai, China
| | - Shufang Li
- Emergency Department, Shuguang Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
| | - Haihui Wang
- Department of Geriatrics, Shanghai Municipal Hospital of Traditional Chinese Medicine, Shanghai, China
| |
|
46
|
Hsu TC, Lin C. Learning from small medical data-robust semi-supervised cancer prognosis classifier with Bayesian variational autoencoder. Bioinformatics Advances 2023; 3:vbac100. [PMID: 36698767 PMCID: PMC9832968 DOI: 10.1093/bioadv/vbac100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2022] [Revised: 12/07/2022] [Accepted: 01/08/2023] [Indexed: 01/11/2023]
Abstract
Motivation: Cancer is one of the world's leading mortality causes, and its prognosis is hard to predict due to complicated biological interactions among heterogeneous data types. Numerous challenges, such as censorship, high dimensionality and small sample size, prevent researchers from using deep learning models for precise prediction. Results: We propose a robust Semi-supervised Cancer prognosis classifier with bAyesian variational autoeNcoder (SCAN) as a structured machine-learning framework for cancer prognosis prediction. SCAN incorporates semi-supervised learning for predicting 5-year disease-specific survival and overall survival in breast and non-small cell lung cancer (NSCLC) patients, respectively. SCAN achieved significantly better AUROC scores than all existing benchmarks (81.73% for breast cancer; 80.46% for NSCLC), including our previously proposed bimodal neural network classifiers (77.71% for breast cancer; 78.67% for NSCLC). Independent validation results showed that SCAN still achieved better AUROC scores (74.74% for breast; 72.80% for NSCLC) than the bimodal neural network classifiers (64.13% for breast; 67.07% for NSCLC). SCAN is general and can potentially be trained on more patient data. This paves the foundation for personalized medicine for early cancer risk screening. Availability and implementation: The source codes reproducing the main results are available on GitHub: https://gitfront.io/r/user-4316673/36e8714573f3fbfa0b24690af5d1a9d5ca159cf4/scan/. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Te-Cheng Hsu
- Institute of Communications Engineering, National Tsing Hua University, Hsinchu 30013, Taiwan
| | - Che Lin
- To whom correspondence should be addressed.
| |
|
47
|
Liao J, Li X, Gan Y, Han S, Rong P, Wang W, Li W, Zhou L. Artificial intelligence assists precision medicine in cancer treatment. Front Oncol 2023; 12:998222. [PMID: 36686757 PMCID: PMC9846804 DOI: 10.3389/fonc.2022.998222] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 07/19/2022] [Accepted: 11/22/2022] [Indexed: 01/06/2023]
Abstract
Cancer is a major medical problem worldwide. Due to its high heterogeneity, the same drugs or surgical methods may have different curative effects in patients with the same tumor, leading to the need for more accurate treatment methods for tumors and personalized treatments for patients. The precise treatment of tumors is essential, which makes it urgent to obtain an in-depth understanding of the changes that tumors undergo, including changes in their genes, proteins and cancer cell phenotypes, in order to develop targeted treatment strategies for patients. Artificial intelligence (AI) based on big data can extract the hidden patterns, important information, and corresponding knowledge behind the enormous amount of data. For example, machine learning (ML) and deep learning, which are subsets of AI, can be used to mine deep-level information in genomics, transcriptomics, proteomics, radiomics, digital pathological images, and other data, helping clinicians understand tumors comprehensively. In addition, AI can find new biomarkers from data to assist tumor screening, detection, diagnosis, treatment and prognosis prediction, so as to provide the best treatment for individual patients and improve their clinical outcomes.
Affiliation(s)
- Jinzhuang Liao
- Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
| | - Xiaoying Li
- Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
| | - Yu Gan
- Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
| | - Shuangze Han
- Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
| | - Pengfei Rong
- Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
- Cell Transplantation and Gene Therapy Institute, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China
| | - Wei Wang
- Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
- Cell Transplantation and Gene Therapy Institute, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China
| | - Wei Li
- Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
- Cell Transplantation and Gene Therapy Institute, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China
| | - Li Zhou
- Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
- Cell Transplantation and Gene Therapy Institute, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China
- Department of Pathology, The Xiangya Hospital of Central South University, Changsha, Hunan, China
| |
|
48
|
Du J, Huang M, Liu L. AI-Aided Disease Prediction in Visualized Medicine. Advances in Experimental Medicine and Biology 2023; 1199:107-126. [PMID: 37460729 DOI: 10.1007/978-981-32-9902-3_6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/20/2023]
Abstract
Artificial intelligence (AI) is playing a vitally important role in promoting the revolution of future technology. Healthcare is one of the promising applications of AI, covering medical imaging, diagnosis, robotics, disease prediction, pharmacy, health management, and hospital management. Numerous achievements made in these fields are overturning every aspect of the traditional healthcare system. Therefore, to understand state-of-the-art AI in healthcare, as well as the opportunities and obstacles in its development, this chapter discusses the applications of AI in disease detection and the outlook and future trends of AI-aided disease prediction.
Affiliation(s)
- Juan Du
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
| | - Mengen Huang
- Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
| | - Lin Liu
- Tianjin Key Laboratory of Retinal Functions and Diseases, Eye Institute and School of Optometry, Tianjin Medical University Eye Hospital, Tianjin, China
| |
|
49
|
Wang X, Yu G, Yan Z, Wan L, Wang W, Cui L. Lung Cancer Subtype Diagnosis by Fusing Image-Genomics Data and Hybrid Deep Networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:512-523. [PMID: 34855599 DOI: 10.1109/tcbb.2021.3132292] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/13/2023]
Abstract
Accurate diagnosis of cancer subtypes is crucial for precise treatment, because different cancer subtypes involve different pathologies and require different therapies. Although deep learning techniques have achieved great success in computer vision and other fields, they do not work well on lung cancer subtype diagnosis, because the distinction between slide images of different cancer subtypes is ambiguous. Furthermore, they often over-fit to high-dimensional genomics data with limited samples, and do not fuse the image and genomics data in a sensible way. In this paper, we propose LungDIG, a hybrid deep network based approach for Lung cancer subtype Diagnosis by fusing Image-Genomics data. LungDIG first tiles the tissue slide image into small patches and extracts the patch-level features by fine-tuning an Inception-V3 model. Since the patches may contain some false positives in non-diagnostic regions, it further designs a patch-level feature combination strategy to integrate the extracted patch features and maintain the diversity between different cancer subtypes. At the same time, it extracts the genomics features from Copy Number Variation data with an attention based nonlinear extractor. Next, it fuses the image and genomics features with an attention based multilayer perceptron (MLP) to diagnose the cancer subtype. Experiments on TCGA lung cancer data show that LungDIG not only achieves higher accuracy for cancer subtype diagnosis than state-of-the-art methods, but also has high authenticity and good interpretability.
|
50
|
Wu X, Shi Y, Wang M, Li A. CAMR: cross-aligned multimodal representation learning for cancer survival prediction. Bioinformatics 2023; 39:btad025. [PMID: 36637188 PMCID: PMC9857974 DOI: 10.1093/bioinformatics/btad025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2022] [Revised: 12/10/2022] [Accepted: 01/12/2023] [Indexed: 01/14/2023]
Abstract
MOTIVATION: Accurately predicting cancer survival is crucial for helping clinicians to plan appropriate treatments, which greatly improves the quality of life of cancer patients and reduces the related medical costs. Recent advances in survival prediction methods suggest that integrating complementary information from different modalities, e.g. histopathological images and genomic data, plays a key role in enhancing predictive performance. Despite promising results obtained by existing multimodal methods, the disparate and heterogeneous characteristics of multimodal data cause the so-called modality gap problem, which leads to dramatically diverse modality representations in feature space. Consequently, detrimental modality gaps make it difficult to comprehensively integrate multimodal information via representation learning and therefore pose a great challenge to further improvements in cancer survival prediction. RESULTS: To solve the above problems, we propose a novel method called cross-aligned multimodal representation learning (CAMR), which generates both modality-invariant and -specific representations for more accurate cancer survival prediction. Specifically, a cross-modality representation alignment learning network is introduced to reduce modality gaps by effectively learning modality-invariant representations in a common subspace, which is achieved by aligning the distributions of different modality representations through adversarial training. In addition, we adopt a cross-modality fusion module to fuse modality-invariant representations into a unified cross-modality representation for each patient. Meanwhile, CAMR learns modality-specific representations which complement modality-invariant representations and therefore provides a holistic view of the multimodal data for cancer survival prediction. Comprehensive experimental results demonstrate that CAMR can successfully narrow modality gaps and consistently yields better performance than other survival prediction methods using multimodal data. AVAILABILITY AND IMPLEMENTATION: CAMR is freely available at https://github.com/wxq-ustc/CAMR. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Xingqi Wu
- School of Information Science and Technology, University of Science and Technology of China, Hefei AH230027, China
| | - Yi Shi
- School of Information Science and Technology, University of Science and Technology of China, Hefei AH230027, China
| | - Minghui Wang
- School of Information Science and Technology, University of Science and Technology of China, Hefei AH230027, China
| | - Ao Li
- School of Information Science and Technology, University of Science and Technology of China, Hefei AH230027, China
| |
|