1
Luo D, Luo A, Hu S, Ye G, Li D, Zhao H, Peng B. Genomics and proteomics to determine novel molecular subtypes and predict the response to immunotherapy and the effect of bevacizumab in glioblastoma. Sci Rep 2024; 14:17630. [PMID: 39085480 PMCID: PMC11292017 DOI: 10.1038/s41598-024-68648-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Accepted: 07/25/2024] [Indexed: 08/02/2024] Open
Abstract
Glioblastoma (GBM) is a highly aggressive, infiltrative malignancy that cannot be completely cured by current treatment modalities, and therefore requires more precise molecular subtype signatures to predict treatment response for personalized precision therapy. Expression subtypes of GBM samples from the Cancer Genome Atlas (TCGA) were identified using BayesNM and compared with existing molecular subtypes of GBM. Biological features of the subtypes were determined by single-sample gene set enrichment analysis. Genomic and proteomic data from GBM samples were combined, and Genomic Identification of Significant Targets in Cancer analysis was used to screen for genes with recurrent somatic copy-number alterations. The immune environment among subtypes was compared by assessing the expression of immune molecules and the infiltration of immune cells. Molecular subtypes suited to immunotherapy were identified based on the Tumor Immune Dysfunction and Exclusion (TIDE) score. Finally, least absolute shrinkage and selection operator (LASSO) logistic regression was performed on the expression profiles of S2, S3 and S4 in TCGA-GBM and RPPA to determine the best predictive model for each. Four novel molecular subtypes were classified. Specifically, S1 exhibited a low proliferative profile; S2 exhibited a profile of high proliferation, IDH1 mutation, and TP53 mutation and deletion; S3 was characterized by high immune scores and high innate and adaptive immune infiltration scores, had the lowest TIDE score, and was most likely to benefit from immunotherapy; S4 was characterized by high proliferation, EGFR amplification, and high protein abundance, and was the most suitable subtype for bevacizumab. LASSO analysis yielded a best prediction model of 13 genes for S2 (accuracy 96.7%), 17 genes for S3 (accuracy 86.7%), and 14 genes for S4 (accuracy 93%). To conclude, our study classified reproducible and robust molecular subtypes of GBM, and these findings might contribute to the identification of patients responding to immunotherapy, thereby improving GBM prognosis.
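As a rough illustration of why LASSO logistic regression doubles as a feature (gene) selector, the sketch below fits an L1-penalized logistic model by proximal gradient descent using only the Python standard library. It is not the authors' pipeline: the function names, the penalty and step values, and the two-column toy data are all assumptions made for illustration. The point it shows is that the soft-thresholding step sets the weights of uninformative features to exactly zero, so the features that keep a nonzero weight form the selected set.

```python
import math


def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within the threshold becomes 0.0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_l1_logistic(rows, labels, penalty=0.1, step=0.5, iterations=500):
    """Proximal gradient descent for L1-penalized logistic regression (no intercept)."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    for _ in range(iterations):
        # Residuals (predicted probability minus label) for every sample.
        residuals = [
            sigmoid(sum(w * x for w, x in zip(weights, row))) - y
            for row, y in zip(rows, labels)
        ]
        # Gradient of the mean logistic loss with respect to each weight.
        gradient = [
            sum(r * row[j] for r, row in zip(residuals, rows)) / len(rows)
            for j in range(n_features)
        ]
        # Gradient step, then soft-thresholding (the proximal step for the L1 penalty).
        weights = [
            soft_threshold(w - step * g, step * penalty)
            for w, g in zip(weights, gradient)
        ]
    return weights


# Hypothetical toy data: column 0 separates the classes, column 1 carries no signal.
toy_rows = [[1.0, 1.0], [2.0, -1.0], [-1.0, 1.0], [-2.0, -1.0]]
toy_labels = [1, 1, 0, 0]
toy_weights = fit_l1_logistic(toy_rows, toy_labels)

# Features with a nonzero weight are the ones the penalty "selected".
selected = [j for j, w in enumerate(toy_weights) if w != 0.0]
print(selected)
```

In practice a library implementation with a cross-validated penalty would normally be used; the hand-written loop here only makes the shrinkage step visible.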
Affiliation(s)
- Dongdong Luo
  - Neurosurgery Department, Guangzhou Institute of Cancer Research, The Affiliated Cancer Hospital, Guangzhou Medical University, Guangzhou, 510032, China
- Aiping Luo
  - Radiology Department, Guangzhou Institute of Cancer Research, The Affiliated Cancer Hospital, Guangzhou Medical University, Guangzhou, 510032, China
- Su Hu
  - Neurosurgery Department, Guangzhou Institute of Cancer Research, The Affiliated Cancer Hospital, Guangzhou Medical University, Guangzhou, 510032, China
- Ganwei Ye
  - Neurosurgery Department, Guangzhou Institute of Cancer Research, The Affiliated Cancer Hospital, Guangzhou Medical University, Guangzhou, 510032, China
- Dan Li
  - Neurosurgery Department, Guangzhou Institute of Cancer Research, The Affiliated Cancer Hospital, Guangzhou Medical University, Guangzhou, 510032, China
- Hailin Zhao
  - Neurosurgery Department, Guangzhou Institute of Cancer Research, The Affiliated Cancer Hospital, Guangzhou Medical University, Guangzhou, 510032, China
- Biao Peng
  - Neurosurgery Department, Guangzhou Institute of Cancer Research, The Affiliated Cancer Hospital, Guangzhou Medical University, Guangzhou, 510032, China
2
Li Z, Wei C, Zhang Z, Han L. ecGBMsub: an integrative stacking ensemble model framework based on eccDNA molecular profiling for improving IDH wild-type glioblastoma molecular subtype classification. Front Pharmacol 2024; 15:1375112. [PMID: 38666025 PMCID: PMC11043526 DOI: 10.3389/fphar.2024.1375112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Accepted: 03/18/2024] [Indexed: 04/28/2024] Open
Abstract
IDH wild-type glioblastoma (GBM) intrinsic subtypes have been linked to different molecular landscapes and outcomes. Accurate prediction of GBM molecular subtypes is important for guiding clinical diagnosis and treatment. Leveraging machine learning to improve subtype classification is considered a robust strategy. Several single machine learning models have been developed to predict survival or stratify patients. An ensemble learning strategy combines several base learners to boost model performance. However, a robust stacking ensemble learning model with high accuracy in clinical practice is still lacking. Here, we developed a novel integrative stacking ensemble model framework (ecGBMsub) for improving IDH wild-type GBM molecular subtype classification. In the framework, nine single models with the best hyperparameters were fitted based on extrachromosomal circular DNA (eccDNA) molecular profiling. Then, the top five single models were selected as base models. By randomly combining the five base models, 26 different combinations were generated. Nine different meta-models with the best hyperparameters were fitted based on the prediction results of the 26 combinations, resulting in 234 different stacked ensemble models. All models in ecGBMsub were comprehensively evaluated and compared. Finally, the stacking ensemble model named "XGBoost.Enet-stacking-Enet" was chosen as the optimal model in the ecGBMsub framework. A user-friendly web tool was developed to facilitate access to the XGBoost.Enet-stacking-Enet model (https://lizesheng20190820.shinyapps.io/ecGBMsub/).
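The counts in this abstract (26 combinations, 234 stacked models) can be reproduced with a short enumeration. The abstract does not state how the combinations were formed; the sketch below assumes that every subset of at least two of the five base models counts as one combination, which happens to give 26, and that each combination is paired with each of the nine meta-models. The model names are placeholders, not the ones used in ecGBMsub.

```python
from itertools import combinations

# Placeholder names; the abstract does not list the five base or nine meta-models.
base_models = [f"base_{i}" for i in range(1, 6)]
meta_models = [f"meta_{i}" for i in range(1, 10)]

# Assumption: a "combination" is any subset containing two or more base models.
base_combinations = [
    combo
    for size in range(2, len(base_models) + 1)
    for combo in combinations(base_models, size)
]

# Each combination feeds each meta-model, giving one stacked ensemble per pair.
stacked_models = [
    (combo, meta) for combo in base_combinations for meta in meta_models
]

print(len(base_combinations))  # 26 = C(5,2) + C(5,3) + C(5,4) + C(5,5)
print(len(stacked_models))     # 234 = 26 * 9
```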
Affiliation(s)
- Zesheng Li
  - Tianjin Neurological Institute, Key Laboratory of Post-Neuro Injury, Neuro-Repair and Regeneration in Central Nervous System, Ministry of Education and Tianjin City, Tianjin Medical University General Hospital, Tianjin, China
- Cheng Wei
  - Tianjin Neurological Institute, Key Laboratory of Post-Neuro Injury, Neuro-Repair and Regeneration in Central Nervous System, Ministry of Education and Tianjin City, Tianjin Medical University General Hospital, Tianjin, China
- Zhenyu Zhang
  - Department of Neurosurgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
- Lei Han
  - Tianjin Neurological Institute, Key Laboratory of Post-Neuro Injury, Neuro-Repair and Regeneration in Central Nervous System, Ministry of Education and Tianjin City, Tianjin Medical University General Hospital, Tianjin, China
3
Li C, Ye G, Jiang Y, Wang Z, Yu H, Yang M. Artificial Intelligence in battling infectious diseases: A transformative role. J Med Virol 2024; 96:e29355. [PMID: 38179882 DOI: 10.1002/jmv.29355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Revised: 12/01/2023] [Accepted: 12/17/2023] [Indexed: 01/06/2024]
Abstract
It is widely acknowledged that infectious diseases have wrought immense havoc on human society and are regarded as adversaries that humanity cannot elude. In recent years, the advancement of Artificial Intelligence (AI) technology has ushered in a revolutionary era in infectious disease prevention and control. This evolution encompasses early warning of outbreaks, contact tracing, infection diagnosis, drug discovery, and the facilitation of drug design, alongside other facets of epidemic management. This article presents an overview of the use of AI systems in the field of infectious diseases, with a specific focus on their role during the COVID-19 pandemic. The article also highlights the current challenges that AI faces within this domain and proposes strategies for their mitigation. There is a pressing need to further harness the potential applications of AI across multiple domains to strengthen its capacity to address future disease outbreaks.
Affiliation(s)
- Chunhui Li
  - School of Life Science, Advanced Research Institute of Multidisciplinary Science, Key Laboratory of Molecular Medicine and Biotherapy, Beijing Institute of Technology, Beijing, People's Republic of China
- Guoguo Ye
  - Shenzhen Key Laboratory of Pathogen and Immunity, National Clinical Research Center for Infectious Disease, The Third People's Hospital of Shenzhen, Second Hospital Affiliated to Southern University of Science and Technology, Shenzhen, China
- Yinghan Jiang
  - School of Life Science, Advanced Research Institute of Multidisciplinary Science, Key Laboratory of Molecular Medicine and Biotherapy, Beijing Institute of Technology, Beijing, People's Republic of China
- Zhiming Wang
  - School of Life Science, Advanced Research Institute of Multidisciplinary Science, Key Laboratory of Molecular Medicine and Biotherapy, Beijing Institute of Technology, Beijing, People's Republic of China
- Haiyang Yu
  - Hangzhou Yalla Information Technology Service Co., Ltd., Hangzhou, People's Republic of China
- Minghui Yang
  - School of Life Science, Advanced Research Institute of Multidisciplinary Science, Key Laboratory of Molecular Medicine and Biotherapy, Beijing Institute of Technology, Beijing, People's Republic of China
4
Bhandari M, Shahi TB, Neupane A, Walsh KB. BotanicX-AI: Identification of Tomato Leaf Diseases Using an Explanation-Driven Deep-Learning Model. J Imaging 2023; 9:jimaging9020053. [PMID: 36826972 PMCID: PMC9964407 DOI: 10.3390/jimaging9020053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2022] [Revised: 02/13/2023] [Accepted: 02/14/2023] [Indexed: 02/23/2023] Open
Abstract
Early and accurate tomato disease detection using easily available leaf photos is essential for farmers and stakeholders, as it helps reduce yield loss due to possible disease epidemics. This paper aims to visually identify nine different infectious diseases (bacterial spot, early blight, Septoria leaf spot, late blight, leaf mold, two-spotted spider mite, mosaic virus, target spot, and yellow leaf curl virus) in tomato leaves in addition to healthy leaves. We implemented EfficientNetB5 with a tomato leaf disease (TLD) dataset without any segmentation, and the model achieved an average training accuracy of 99.84% ± 0.10%, average validation accuracy of 98.28% ± 0.20%, and average test accuracy of 99.07% ± 0.38% over 10 cross-validation folds. The use of gradient-weighted class activation mapping (GradCAM) and local interpretable model-agnostic explanations is proposed to provide model interpretability, which is essential to predictive performance, helpful in building trust, and required for integration into agricultural practice.
Affiliation(s)
- Mohan Bhandari
  - Department of Science and Technology, Samriddhi College, Bhaktapur 44800, Nepal
- Tej Bahadur Shahi
  - School of Engineering and Technology, Central Queensland University, Norman Gardens, Rockhampton 4701, Australia
  - Central Department of Computer Science and IT, Tribhuvan University, Kathmandu 44600, Nepal
- Arjun Neupane (corresponding author)
  - School of Engineering and Technology, Central Queensland University, Norman Gardens, Rockhampton 4701, Australia
- Kerry Brian Walsh
  - Institute for Future Farming Systems, Central Queensland University, Rockhampton 4701, Australia
5
Khandelwal M, Kumar Rout R, Umer S, Mallik S, Li A. Multifactorial feature extraction and site prognosis model for protein methylation data. Brief Funct Genomics 2023; 22:20-30. [PMID: 36310537 DOI: 10.1093/bfgp/elac034] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 05/09/2022] [Revised: 09/23/2022] [Accepted: 09/28/2022] [Indexed: 01/24/2023] Open
Abstract
Integrated studies (multi-omics studies) comprising genetic, proteomic and epigenetic data analyses have become an emerging topic in biomedical research. Protein methylation is a posttranslational modification that plays an essential role in various cellular activities. The prediction of methylation sites (arginine and lysine) is vital to understand the molecular processes of protein methylation. However, current experimental techniques used for methylation site prediction are tedious and expensive. Hence, computational techniques for predicting methylation sites in proteins are necessary, and various computational methods have been proposed in recent years. Most existing methods require structural and evolutionary information for retrieving features, and acquiring this information is not always convenient. Thus, we proposed a novel method, called multi-factorial feature extraction and site prognosis model (MufeSPM), for the prediction of protein methylation sites based on information theory features (Renyi, Shannon, Havrda-Charvat and Arimoto entropy), amino acid composition and physicochemical properties acquired from protein methylation data. A random forest algorithm was used to predict methylation sites in protein sequences. This paper also studied the impact of different features and classifiers on arginine and lysine methylation data sets. For the R methylation data set, MufeSPM yielded 82.45% (± 3.47) accuracy, and for the K methylation data set, it provided an average accuracy of 71.94% (± 2.12). Additionally, the area under the receiver operating characteristic curve for different classifiers in predicting methylation sites was provided. The experimental results indicate that MufeSPM performs better than the state-of-the-art predictors.
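To make the information-theory features concrete, the sketch below computes two of the entropies named in the abstract, Shannon and Rényi, from the amino acid composition of a sequence window. It uses the textbook definitions, $H = -\sum_i p_i \log_2 p_i$ and $H_\alpha = \frac{1}{1-\alpha} \log_2 \sum_i p_i^\alpha$; whether MufeSPM uses these exact forms, this window, or this order $\alpha$ is not stated in the abstract, so the function names, the example window, and $\alpha = 2$ are assumptions.

```python
import math
from collections import Counter


def residue_frequencies(window):
    """Relative frequency of each residue in a sequence window."""
    counts = Counter(window)
    total = len(window)
    return [count / total for count in counts.values()]


def shannon_entropy(window):
    """Shannon entropy (in bits) of the residue composition."""
    return -sum(p * math.log2(p) for p in residue_frequencies(window))


def renyi_entropy(window, alpha=2.0):
    """Renyi entropy (in bits) of order alpha; alpha must not equal 1."""
    if alpha == 1.0:
        raise ValueError("alpha = 1 is the Shannon limit; use shannon_entropy")
    power_sum = sum(p ** alpha for p in residue_frequencies(window))
    return math.log2(power_sum) / (1.0 - alpha)


# Hypothetical window centred on a lysine (K); not taken from the paper's data.
example_window = "ARTKQTARKSTGG"
features = [shannon_entropy(example_window), renyi_entropy(example_window)]
print(features)
```

A window containing a single repeated residue gives zero for both measures, and a window with four equally frequent residues gives 2 bits for both, which is a quick sanity check on the definitions.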
Affiliation(s)
- Monika Khandelwal
  - Computer Science & Engineering, National Institute of Technology Srinagar, Hazratbal, Srinagar, 190006, Jammu and Kashmir, India
- Ranjeet Kumar Rout
  - Computer Science & Engineering, National Institute of Technology Srinagar, Hazratbal, Srinagar, 190006, Jammu and Kashmir, India
- Saiyed Umer
  - Computer Science & Engineering, Aliah University, Kolkata, 700016, West Bengal, India
- Saurav Mallik
  - Department of Environmental Health, Harvard T H Chan School of Public Health, Huntington Ave, Boston, 02115, MA, USA
- Aimin Li
  - School of Computer Science and Engineering, Xi'an University of Technology, Jinhua S Rd, 710048, Shaanxi, China
6
Li H, He J, Li M, Li K, Pu X, Guo Y. Immune landscape-based machine-learning-assisted subclassification, prognosis, and immunotherapy prediction for glioblastoma. Front Immunol 2022; 13:1027631. [PMID: 36532035 PMCID: PMC9751405 DOI: 10.3389/fimmu.2022.1027631] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2022] [Accepted: 11/15/2022] [Indexed: 12/04/2022] Open
Abstract
Introduction: As a malignant brain tumor, glioblastoma (GBM) is characterized by intratumor heterogeneity, a poor prognosis, and a highly invasive, lethal, and refractory nature. Immunotherapy has become a promising strategy to treat diverse cancers. It is known that there are highly heterogeneous immunosuppressive microenvironments among different GBM molecular subtypes, which mainly include classical (CL), mesenchymal (MES), and proneural (PN). Therefore, an in-depth understanding of the immune landscapes among them is essential for identifying novel immune markers of GBM. Methods and results: In the present study, based on a collection of 109 immune signatures, the largest number to date, we aim to achieve precise diagnosis, prognosis, and immunotherapy prediction for GBM by performing a comprehensive immunogenomic analysis. Firstly, machine-learning (ML) methods were proposed to evaluate the diagnostic values of these immune signatures, and the optimal classifier was constructed for accurate recognition of the three GBM subtypes with robust and promising performance. The prognostic values of these signatures were then confirmed, and a risk score was established to divide all GBM patients into high-, medium-, and low-risk groups with a high predictive accuracy for overall survival (OS). Then, a complete differential analysis across GBM subtypes was performed in terms of the immune characteristics along with clinicopathological and molecular features, which indicates that MES shows much higher immune heterogeneity compared to CL and PN but has significantly better immunotherapy responses, although MES patients may have an immunosuppressive microenvironment and be more proinflammatory and invasive. Finally, drug sensitivity analysis showed the MES subtype to be more sensitive to 17-AAG, docetaxel, and erlotinib, and the three compounds AS-703026, PD-0325901, and MEK1-2-inhibitor might be potential therapeutic agents. Conclusion: Overall, the findings of this research could help enhance our understanding of the tumor immune microenvironment and provide new insights for improving the prognosis and immunotherapy of GBM patients.
7
Hierarchical Voting-Based Feature Selection and Ensemble Learning Model Scheme for Glioma Grading with Clinical and Molecular Characteristics. Int J Mol Sci 2022; 23:ijms232214155. [PMID: 36430631 PMCID: PMC9697273 DOI: 10.3390/ijms232214155] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/16/2022] [Revised: 10/31/2022] [Accepted: 11/12/2022] [Indexed: 11/18/2022] Open
Abstract
Determining the aggressiveness of gliomas, termed grading, is a critical step toward treatment optimization to increase the survival rate and decrease treatment toxicity for patients. Streamlined grading using molecular information has the potential to facilitate decision making in the clinic and aid in treatment planning. In recent years, molecular markers have increasingly gained importance in the classification of tumors. In this study, we propose a novel hierarchical voting-based methodology for improving the performance results of the feature selection stage and machine learning models for glioma grading with clinical and molecular predictors. To identify the best scheme for the given soft-voting-based ensemble learning model selections, we utilized publicly available TCGA and CGGA datasets and employed four dimensionality reduction methods to carry out a voting-based ensemble feature selection and five supervised models, with a total of sixteen combination sets. We also compared our proposed feature selection method with the LASSO feature selection method in isolation. The computational results indicate that the proposed method achieves 87.606% and 79.668% accuracy rates on TCGA and CGGA datasets, respectively, outperforming the LASSO feature selection method.
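For readers unfamiliar with soft voting, the sketch below shows the general idea in plain Python: each model outputs class probabilities, the ensemble averages them (optionally with weights), and the class with the highest averaged probability wins. It is a generic illustration, not the paper's hierarchical scheme; the class labels, the three probability vectors, and the function name are made up for the example.

```python
def soft_vote(probabilities_per_model, class_labels, weights=None):
    """Average per-class probabilities across models and return (label, averages)."""
    n_models = len(probabilities_per_model)
    if weights is None:
        weights = [1.0] * n_models
    total_weight = sum(weights)
    averaged = [
        sum(w * probs[k] for w, probs in zip(weights, probabilities_per_model))
        / total_weight
        for k in range(len(class_labels))
    ]
    best_index = max(range(len(class_labels)), key=lambda k: averaged[k])
    return class_labels[best_index], averaged


# Hypothetical outputs of three classifiers for one sample: [P(LGG), P(GBM)].
model_outputs = [
    [0.90, 0.10],  # model A is confident in LGG
    [0.40, 0.60],  # model B leans GBM
    [0.45, 0.55],  # model C leans GBM
]
label, averaged = soft_vote(model_outputs, ["LGG", "GBM"])
print(label, averaged)
```

In this toy case a hard majority vote would pick GBM (two of three models), while soft voting picks LGG because model A's high confidence outweighs the two marginal votes; that sensitivity to confidence is the usual reason for preferring soft voting when the base models produce calibrated probabilities.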
8
Munquad S, Si T, Mallik S, Li A, Das AB. Subtyping and grading of lower-grade gliomas using integrated feature selection and support vector machine. Brief Funct Genomics 2022; 21:408-421. [PMID: 35923100 DOI: 10.1093/bfgp/elac025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2022] [Revised: 06/23/2022] [Accepted: 07/17/2022] [Indexed: 11/13/2022] Open
Abstract
Classifying lower-grade gliomas (LGGs) is a crucial step for accurate therapeutic intervention. The histopathological classification of various subtypes of LGG, including astrocytoma, oligodendroglioma and oligoastrocytoma, suffers from intraobserver and interobserver variability leading to inaccurate classification and greater risk to patient health. We designed an efficient machine learning-based classification framework to diagnose LGG subtypes and grades using transcriptome data. First, we developed an integrated feature selection method based on correlation and support vector machine (SVM) recursive feature elimination. Then, implementation of the SVM classifier achieved superior accuracy compared with other machine learning frameworks. Most importantly, we found that the accuracy of subtype classification is always high (>90%) in a specific grade rather than in mixed grade (~80%) cancer. Differential co-expression analysis revealed higher heterogeneity in mixed grade cancer, resulting in reduced prediction accuracy. Our findings suggest that it is necessary to identify cancer grades and subtypes to attain a higher classification accuracy. Our six-class classification model efficiently predicts the grades and subtypes with an average accuracy of 91% (±0.02). Furthermore, we identify several predictive biomarkers using co-expression, gene set enrichment and survival analysis, indicating our framework is biologically interpretable and can potentially support the clinician.
Affiliation(s)
- Sana Munquad
  - Department of Biotechnology, National Institute of Technology Warangal, Warangal 506004, Telangana, India
- Tapas Si
  - Department of Computer Science and Engineering, Bankura Unnayani Institute of Engineering, Bankura 722146, West Bengal, India
- Saurav Mallik
  - Department of Environmental Epigenetics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Aimin Li
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Asim Bikas Das
  - Department of Biotechnology, National Institute of Technology Warangal, Warangal 506004, Telangana, India