1
Ji K, Shi L, Feng Y, Wang L, Guo H, Li H, Xing J, Xia S, Xu B, Liu E, Zheng Y, Li C, Liu M. Construction and interpretation of machine learning-based prognostic models for survival prediction among intestinal-type and diffuse-type gastric cancer patients. World J Surg Oncol 2024; 22:275. [PMID: 39407221 PMCID: PMC11481450 DOI: 10.1186/s12957-024-03550-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2024] [Accepted: 10/01/2024] [Indexed: 10/19/2024]
Abstract
BACKGROUND Gastric cancer is one of the most common malignant tumors worldwide, with high incidence and mortality rates, and it has a complex etiology and complex pathological features. Depending on the tumor type, gastric cancer can be classified as intestinal-type and diffuse-type gastric cancer, each with distinct pathogenic mechanisms and clinical presentations. In recent years, machine learning techniques have been widely applied in the medical field, offering new perspectives for the diagnosis, treatment, and prognosis of gastric cancer patients. METHODS This study recruited 2158 gastric cancer patients and constructed prognostic prediction models for both intestinal-type and diffuse-type gastric cancer. Clinical pathological data were collected from patients, and machine learning algorithms were used for feature selection and model construction. The performance of the models was validated with training and testing datasets. The Shapley additive explanations (SHAP) values were used to interpret the model predictions and identify the main factors that influence patient survival. RESULTS In the prognostic model for intestinal-type gastric cancer, the gradient boosting decision tree (GBDT) model demonstrated the best performance, with key features including pTNM, CA125, tumor size, CA199, and PALB. Similarly, in the prognostic model for diffuse-type gastric cancer, the GBDT model was utilized, with key features comprising pTNM, Borrmann type IV disease, lymphocyte (LYM), lactate dehydrogenase (LDH), potassium (K), perineural invasion (PNI), tumor size, and whole stomach location. Risk stratification analysis revealed that the prognosis of high-risk patients was significantly worse than that of low-risk patients. CONCLUSION Machine learning shows great potential in predicting survival outcomes of gastric cancer patients, providing strong support for the development of personalized treatment plans.
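The abstract above uses SHAP values to rank features, and the idea behind them can be shown with a small sketch. The code below computes exact Shapley values for a toy risk function by averaging each feature's marginal contribution over all orderings. The function `toy_risk`, its weights, and the interaction term are hypothetical illustrations and are not taken from the study; the SHAP library itself uses faster approximations for tree models.

```python
from itertools import permutations


def shapley_values(value_fn, features):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = value_fn(frozenset(present))
            present.add(f)
            after = value_fn(frozenset(present))
            totals[f] += after - before
    return {f: totals[f] / len(orderings) for f in features}


def toy_risk(present):
    """Hypothetical risk score: additive effects plus one interaction term."""
    score = 0.0
    if "pTNM" in present:
        score += 2.0
    if "tumor_size" in present:
        score += 1.0
    if "CA125" in present:
        score += 0.5
    if "pTNM" in present and "tumor_size" in present:
        score += 1.0  # interaction is shared equally between the two features
    return score


phi = shapley_values(toy_risk, ["pTNM", "tumor_size", "CA125"])
for name, value in sorted(phi.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.2f}")
```

The values sum to the difference between the full-model score and the empty-model score, which is the property that makes them usable as a per-feature attribution.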
Affiliation(s)
- Kunxiang Ji
  - Department of Oncology IV, Beidahuang Industry Group General Hospital, Harbin, China
- Lei Shi
  - Department of Oncology IV, Beidahuang Industry Group General Hospital, Harbin, China
- Yan Feng
  - Department of Oncology IV, Beidahuang Industry Group General Hospital, Harbin, China
- Linna Wang
  - Department of Oncology IV, Beidahuang Industry Group General Hospital, Harbin, China
- HuanNan Guo
  - Department of Oncology IV, Beidahuang Industry Group General Hospital, Harbin, China
- Hui Li
  - Department of Oncology IV, Beidahuang Industry Group General Hospital, Harbin, China
- Jiacheng Xing
  - Department of Oncology IV, Beidahuang Industry Group General Hospital, Harbin, China
- Siyu Xia
  - Department of Oncology IV, Beidahuang Industry Group General Hospital, Harbin, China
- Boran Xu
  - Department of Oncology III, Beidahuang Industry Group General Hospital, Harbin, China
- Eryu Liu
  - Department of Oncology III, Beidahuang Industry Group General Hospital, Harbin, China
- YanDan Zheng
  - Department of Oncology, Anda City Hospital, Anda, China
- Chunfeng Li
  - Department of Gastrointestinal Surgery, Harbin Medical University Cancer Hospital, Harbin, China
- Mingyang Liu
  - Department of Oncology IV, Beidahuang Industry Group General Hospital, Harbin, China
2
Gu J, Zhao Y, Ben Y, Zhang S, Hua L, He S, Liu R, Chen X, Sheng H. A personalized mRNA signature for predicting hypertrophic cardiomyopathy applying machine learning methods. Sci Rep 2024; 14:17023. [PMID: 39043774 PMCID: PMC11266364 DOI: 10.1038/s41598-024-67201-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Accepted: 07/09/2024] [Indexed: 07/25/2024]
Abstract
Hypertrophic cardiomyopathy (HCM) may lead to cardiac dysfunction and sudden death. This study was designed to develop an HCM signature using bioinformatics and machine learning methods. Data from HCM and normal tissues were obtained from public databases to screen differentially expressed genes (DEGs) using the R package limma. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were performed on the HCM-associated DEGs. Hub genes for HCM were determined using weighted gene co-expression network analysis (WGCNA) together with two machine learning algorithms (SVM-RFE and LASSO). Finally, a zebrafish model was used to simulate changes in the hub genes in HCM and to observe their effects on cardiac disease development. mRNA expression data from a total of 106 HCM tissues and 39 normal samples were collected, and 157 DEGs were identified. Enrichment analysis showed that immune pathways played an important role in the pathogenesis of HCM. Three hub genes (FCN3, MYH6 and RASD1) were identified using WGCNA, SVM-RFE, and LASSO analysis. In the zebrafish model, knockdown of MYH6 and RASD1 resulted in cardiac malformations with reduced ventricular capacity and heart rate, which validated the clinical significance of these genes in the diagnosis of HCM. Based on machine learning algorithms, our study created a signature with potential impact on cardiac function and cardiac quality index for HCM. The current findings have important implications for the early diagnosis and treatment of HCM.
Affiliation(s)
- Jue Gu
  - Affiliated Hospital of Nantong University, No.20 Xisi Road, Nantong, 226000, Jiangsu Province, China
- Yamin Zhao
  - Nantong Second People's Hospital, Nantong, China
- Yue Ben
  - Affiliated Hospital of Nantong University, No.20 Xisi Road, Nantong, 226000, Jiangsu Province, China
- Siming Zhang
  - Medical School of Nantong University, Nantong University, Nantong, China
- Liqi Hua
  - Medical School of Nantong University, Nantong University, Nantong, China
- Songnian He
  - Medical School of Nantong University, Nantong University, Nantong, China
- Ruizi Liu
  - Medical School of Nantong University, Nantong University, Nantong, China
- Xu Chen
  - Medical School of Nantong University, Nantong University, Nantong, China
- Hongzhuan Sheng
  - Affiliated Hospital of Nantong University, No.20 Xisi Road, Nantong, 226000, Jiangsu Province, China
3
Alinia S, Asghari-Jafarabadi M, Mahmoudi L, Roshanaei G, Safari M. Predicting mortality and recurrence in colorectal cancer: Comparative assessment of predictive models. Heliyon 2024; 10:e27854. [PMID: 38515707 PMCID: PMC10955293 DOI: 10.1016/j.heliyon.2024.e27854] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Revised: 03/05/2024] [Accepted: 03/07/2024] [Indexed: 03/23/2024]
Abstract
Introduction Colorectal cancer (CRC) is a significant disease marked by high fatality rates, ranking as the third leading cause of global mortality. The main objective of this study was to assess the accuracy of predictive models in predicting both mortality events and the probability of disease recurrence. Method A retrospective analysis was conducted on a cohort of 284 individuals diagnosed with colorectal cancer between 2001 and 2017. Demographic and clinical data, including gender, disease stage, age at diagnosis, recurrence status, and treatment details, were meticulously recorded. We rigorously evaluated various predictive models, including Decision Trees, Random Forests, Random Survival Forests (RSF), Gradient Boosting, mboost, Deep Learning Neural Network (DLNN), and Cox regression. Performance metrics, such as sensitivity, positive predictive value (PPV), specificity, area under the receiver operating characteristic curve (ROC area), and overall accuracy, were calculated for each model to predict mortality and disease recurrence. The analysis was performed using R version 4.1.3 software and the Python programming language. Results For mortality prediction, the mboost model demonstrated the highest sensitivity at 96.9% (95% CI: 0.83-0.99) and an ROC area of 0.88. It also exhibited high specificity at 80% (95% CI: 0.59-0.93), a positive predictive value of 86.1% (95% CI: 0.70-0.95), and an overall accuracy of 89% (95% CI: 0.78-0.96). Random Forests showed perfect sensitivity of 100% (95% CI: 0.85-1) but had low specificity at 0% (95% CI: 0-0.52) and poor overall accuracy (50%). On the other hand, DLNN had the lowest performance metrics for mortality prediction, with a sensitivity of 24% (95% CI: 0.222-0.268), specificity of 75% (95% CI: 0.73-0.77), and a lower positive predictive value of 42% (95% CI: 0.38-0.45).
The Gradient Boosting model showed the best performance in predicting recurrence, achieving perfect sensitivity of 100% (95% CI: 0.87-1) and high specificity at 92.9% (95% CI: 0.76-0.99). It also had a high positive predictive value of 93.3% (95% CI: 0.77-0.99). Gradient Boosting, with an ROC area of 96.4%, and mboost, with an ROC area of 75%, demonstrated remarkable performance. DLNN had the lowest performance metrics for recurrence prediction, with sensitivity at 1.75% (95% CI: 0.01-0.02), specificity at 98% (95% CI: 0.97-0.98), and a lower positive predictive value at 52.6% (95% CI: 0.39-0.65). Conclusion In summary, the mboost model demonstrated outstanding performance in predicting mortality, achieving exceptional results across various evaluation metrics. Random Forests exhibited perfect sensitivity but showed poor specificity and overall accuracy. The DLNN model displayed the lowest performance metrics for mortality prediction. In terms of recurrence prediction, the Gradient Boosting model outperformed other models with perfect sensitivity, high specificity, and positive predictive value. The DLNN model had the lowest performance metrics for recurrence prediction. Overall, the results emphasize the effectiveness of the mboost and Gradient Boosting models in predicting mortality and recurrence in colorectal cancer patients.
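The metrics compared in this abstract all derive from a 2x2 confusion matrix, and a short sketch makes the Random Forests result (100% sensitivity, 0% specificity, 50% accuracy) easier to read. The counts below are hypothetical and chosen only to reproduce that pattern: a classifier that labels every case as positive on a balanced test set. They are not the study's data.

```python
def safe_div(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts."""
    return {
        "sensitivity": safe_div(tp, tp + fn),   # true positive rate
        "specificity": safe_div(tn, tn + fp),   # true negative rate
        "ppv": safe_div(tp, tp + fp),           # positive predictive value
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
    }


# Hypothetical balanced test set (20 events, 20 non-events) where the
# model predicts "event" for everyone.
all_positive = classification_metrics(tp=20, fp=20, tn=0, fn=0)
print(all_positive)
```

A model with this profile has not learned to separate the classes, which is why sensitivity alone is a poor basis for comparing models.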
Affiliation(s)
- Shayeste Alinia
  - Department of Statistics and Epidemiology, School of Medicine, Zanjan University of Medical Sciences, Zanjan, Iran
- Leila Mahmoudi
  - Department of Statistics and Epidemiology, School of Medicine, Zanjan University of Medical Sciences, Zanjan, Iran
- Ghodratollah Roshanaei
  - Modeling of Non-communicable Diseases Research Center, Department of Biostatistics, School of Public Health, Hamadan University of Medical Sciences, Hamadan, Iran
- Maliheh Safari
  - Department of Biostatistics, School of Medicine, Arak University of Medical Sciences, Arak, Iran
4
You Y, Yang Q. Glycosylation-related genes mediated prognostic signature contribute to prognostic prediction and treatment options in ovarian cancer: based on bulk and single‑cell RNA sequencing data. BMC Cancer 2024; 24:207. [PMID: 38355446 PMCID: PMC10865697 DOI: 10.1186/s12885-024-11908-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Accepted: 01/22/2024] [Indexed: 02/16/2024]
Abstract
BACKGROUND Ovarian cancer (OC) is a complex disease with significant tumor heterogeneity and has the worst prognosis and highest mortality among all gynecological cancers. Glycosylation is a specific post-translational modification that plays an important role in tumor progression, immune escape and metastatic spread. The aim of this work was to identify the major glycosylation-related genes (GRGs) in OC and construct an effective GRGs signature to predict prognosis and immunotherapy response. METHODS The AUCell algorithm was used to identify GRGs based on scRNA-seq and bulk RNA-seq data. A GRGs signature was constructed using Cox and LASSO regression. The testing dataset and clinical sample data were used to assess the accuracy of the GRGs signature. We evaluated the differences in immune cell infiltration, enrichment of immune checkpoints, immunotherapy response, and gene mutation status among different risk groups. Finally, RT-qPCR, wound-healing, and Transwell assays were performed to verify the effect of CYBRD1 on OC. RESULTS A total of 1187 GRGs were obtained and a GRGs signature including 16 genes was established. The OC patients were divided into high- and low-risk groups based on the median risk score, and patients in the high-risk group had poor outcomes. We also found that patients in the low-risk group had higher immune cell infiltration, enrichment of immune checkpoints and immunotherapy response. The laboratory results showed that CYBRD1 can promote the invasion and migration of OC and is closely related to the poor prognosis of OC patients. CONCLUSIONS Our study established a GRGs signature consisting of 16 genes based on scRNA-seq and bulk RNA-seq data, which provides a new perspective on prognosis prediction and treatment strategy for OC.
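The median-based risk grouping described in this abstract can be sketched as follows, assuming the usual form of a gene signature in which the risk score is a coefficient-weighted sum of expression values. The gene names, coefficients, and patient values are hypothetical placeholders; the abstract does not list the 16 genes or their weights.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Weighted sum of expression values using signature coefficients."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def split_by_median(scores):
    """Label each patient 'high' if above the cohort median score, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical two-gene signature and four patients.
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}
cohort = {
    "P1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "P2": {"GENE_A": 4.0, "GENE_B": 0.0},
    "P3": {"GENE_A": 2.0, "GENE_B": 0.0},
    "P4": {"GENE_A": 0.0, "GENE_B": 4.0},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in cohort.items()}
groups = split_by_median(scores)
print(scores)
print(groups)
```

Patients whose score equals the median are assigned to the low-risk group here; that tie-breaking rule is a choice of this sketch.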
Affiliation(s)
- Yue You
  - Department of gynaecology, Shengjing Hospital of China Medical University, Shenyang, China
- Qing Yang
  - Department of gynaecology, Shengjing Hospital of China Medical University, Shenyang, China
5
Zhang P, Wu L, Zou TT, Zou Z, Tu J, Gong R, Kuang J. Machine Learning for Early Prediction of Major Adverse Cardiovascular Events After First Percutaneous Coronary Intervention in Patients With Acute Myocardial Infarction: Retrospective Cohort Study. JMIR Form Res 2024; 8:e48487. [PMID: 38170581 PMCID: PMC10794958 DOI: 10.2196/48487] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Revised: 08/29/2023] [Accepted: 09/15/2023] [Indexed: 01/05/2024]
Abstract
BACKGROUND The incidence of major adverse cardiovascular events (MACEs) remains high in patients with acute myocardial infarction (AMI) who undergo percutaneous coronary intervention (PCI), and early prediction models to guide their clinical management are lacking. OBJECTIVE This study aimed to develop machine learning-based early prediction models for MACEs in patients with newly diagnosed AMI who underwent PCI. METHODS A total of 1531 patients with AMI who underwent PCI from January 2018 to December 2019 were enrolled in this consecutive cohort. The data comprised demographic characteristics, clinical investigations, laboratory tests, and disease-related events. Four machine learning models (artificial neural network [ANN], k-nearest neighbors, support vector machine, and random forest) were developed and compared with the logistic regression model. Our primary outcome was the performance of the models in predicting MACEs, which was assessed by accuracy, area under the receiver operating characteristic curve, and F1-score. RESULTS In total, 1362 patients were successfully followed up. With a median follow-up of 25.9 months, the incidence of MACEs was 18.5% (252/1362). The areas under the receiver operating characteristic curve of the ANN, random forest, k-nearest neighbors, support vector machine, and logistic regression models were 80.49%, 72.67%, 79.80%, 77.20%, and 71.77%, respectively. The top 5 predictors in the ANN model were left ventricular ejection fraction, the number of implanted stents, age, diabetes, and the number of vessels with coronary artery disease. CONCLUSIONS The ANN model showed good MACE prediction after PCI for patients with AMI. The use of machine learning-based prediction models may improve patient management and outcomes in clinical practice.
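The area under the receiver operating characteristic curve reported for each model has a simple interpretation: it is the probability that a randomly chosen patient with an event receives a higher predicted score than a randomly chosen patient without one. The sketch below computes it directly from that definition. The labels and scores are hypothetical and unrelated to the study cohort.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predictions for five patients (1 = event, 0 = no event).
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of patients, which is fine for illustration; rank-based formulas give the same value faster on large cohorts.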
Affiliation(s)
- Pin Zhang
  - Jiangxi Provincial Key Laboratory of Preventive Medicine, School of Public Health, Nanchang University, Nanchang, China
  - School of Public Health and Management, Nanchang Medical College, Nanchang, China
- Lei Wu
  - Jiangxi Provincial Key Laboratory of Preventive Medicine, School of Public Health, Nanchang University, Nanchang, China
- Ting-Ting Zou
  - Jiangxi Provincial Key Laboratory of Preventive Medicine, School of Public Health, Nanchang University, Nanchang, China
- ZiXuan Zou
  - Jiangxi Provincial Key Laboratory of Preventive Medicine, School of Public Health, Nanchang University, Nanchang, China
- JiaXin Tu
  - Jiangxi Provincial Key Laboratory of Preventive Medicine, School of Public Health, Nanchang University, Nanchang, China
- Ren Gong
  - Department of Cardiology, The Second Affiliated Hospital of Nanchang University, Nanchang, China
- Jie Kuang
  - Jiangxi Provincial Key Laboratory of Preventive Medicine, School of Public Health, Nanchang University, Nanchang, China
6
Cheng Y, Yang X, Wang Y, Li Q, Chen W, Dai R, Zhang C. Multiple machine-learning tools identifying prognostic biomarkers for acute Myeloid Leukemia. BMC Med Inform Decis Mak 2024; 24:2. [PMID: 38167056 PMCID: PMC10759623 DOI: 10.1186/s12911-023-02408-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Accepted: 12/14/2023] [Indexed: 01/05/2024]
Abstract
BACKGROUND Acute Myeloid Leukemia (AML) generally has a relatively low survival rate after treatment. There is an urgent need to find new biomarkers that may improve the survival prognosis of patients. Machine-learning tools are increasingly used in the screening of biomarkers. METHODS Least Absolute Shrinkage and Selection Operator (LASSO), Support Vector Machine-Recursive Feature Elimination (SVM-RFE), Random Forest (RF), eXtreme Gradient Boosting (XGBoost), lrFuncs, IdaProfile, caretFuncs, and nbFuncs models were used to screen key genes closely associated with AML. Then, based on The Cancer Genome Atlas (TCGA), pan-cancer analysis was performed to determine the correlation between important genes and AML or other cancers. Finally, the diagnostic value of important genes for AML was verified in different data sets. RESULTS The survival analysis of the training set identified 26 genes with survival differences. After intersecting the results of each machine learning method, DNM1, MEIS1, and SUSD3 were selected as key genes for subsequent analysis. The pan-cancer analysis showed that MEIS1 and DNM1 were significantly highly expressed in AML; MEIS1 and SUSD3 are potential risk factors for the prognosis of AML, and DNM1 is a potential protective factor. The three key genes were significantly associated with AML immune subtypes and multiple immune checkpoints in AML. The verification analysis showed that DNM1, MEIS1, and SUSD3 have potential diagnostic value for AML. CONCLUSION Multiple machine learning methods identified DNM1, MEIS1, and SUSD3 as potential prognostic biomarkers for AML.
Affiliation(s)
- Yujing Cheng
  - Department of blood transfusion, The First People's Hospital of Yunnan Province. The Affiliated Hospital of Kunming University of Science and Technology, No.157 Jinbi Road, 650034, Kunming, Yunnan, China
- Xin Yang
  - Department of blood transfusion, The First People's Hospital of Yunnan Province. The Affiliated Hospital of Kunming University of Science and Technology, No.157 Jinbi Road, 650034, Kunming, Yunnan, China
- Ying Wang
  - Department of blood transfusion, The First People's Hospital of Yunnan Province. The Affiliated Hospital of Kunming University of Science and Technology, No.157 Jinbi Road, 650034, Kunming, Yunnan, China
- Qi Li
  - Department of blood transfusion, The First People's Hospital of Yunnan Province. The Affiliated Hospital of Kunming University of Science and Technology, No.157 Jinbi Road, 650034, Kunming, Yunnan, China
- Wanlu Chen
  - Department of blood transfusion, The First People's Hospital of Yunnan Province. The Affiliated Hospital of Kunming University of Science and Technology, No.157 Jinbi Road, 650034, Kunming, Yunnan, China
- Run Dai
  - Department of blood transfusion, The First People's Hospital of Yunnan Province. The Affiliated Hospital of Kunming University of Science and Technology, No.157 Jinbi Road, 650034, Kunming, Yunnan, China
- Chan Zhang
  - Department of blood transfusion, The First People's Hospital of Yunnan Province. The Affiliated Hospital of Kunming University of Science and Technology, No.157 Jinbi Road, 650034, Kunming, Yunnan, China
7
Zhao X, Jiang C. The prediction of distant metastasis risk for male breast cancer patients based on an interpretable machine learning model. BMC Med Inform Decis Mak 2023; 23:74. [PMID: 37085843 PMCID: PMC10120176 DOI: 10.1186/s12911-023-02166-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Accepted: 04/04/2023] [Indexed: 04/23/2023]
Abstract
OBJECTIVES This research was designed to compare the ability of different machine learning (ML) models and a nomogram to predict distant metastasis in male breast cancer (MBC) patients and to interpret the optimal ML model using the SHapley Additive exPlanations (SHAP) framework. METHODS Four ML models were developed using data from MBC patients in the SEER database between 2010 and 2015 and MBC patients from our hospital between 2010 and 2020. The area under the curve (AUC) and Brier score were used to assess the capacity of the different models. The DeLong test was applied to compare the performance of the models. Univariable and multivariable analyses were conducted using logistic regression. RESULTS A total of 2351 patients were analyzed; 168 (7.1%) had distant metastasis (M1); 117 (5.0%) had bone metastasis, and 71 (3.0%) had lung metastasis. The median age at diagnosis was 68.0 years. Most patients did not receive radiotherapy (1723, 73.3%) or chemotherapy (1447, 61.5%). The XGB model was the best ML model for predicting M1 in MBC patients. It showed the largest AUC value in the tenfold cross-validation (AUC: 0.884; SD: 0.02), training (AUC: 0.907; 95% CI: 0.899-0.917), testing (AUC: 0.827; 95% CI: 0.802-0.857) and external validation (AUC: 0.754; 95% CI: 0.739-0.771) sets. It also performed well in the prediction of bone metastasis (AUC: 0.880, 95% CI: 0.856-0.903 in the training set; AUC: 0.823, 95% CI: 0.790-0.848 in the test set; AUC: 0.747, 95% CI: 0.727-0.764 in the external validation set) and lung metastasis (AUC: 0.906, 95% CI: 0.877-0.928 in the training set; AUC: 0.859, 95% CI: 0.816-0.891 in the test set; AUC: 0.756, 95% CI: 0.732-0.777 in the external validation set). The AUC value of the XGB model was larger than that of the nomogram in the training (0.907 vs 0.802) and external validation (0.754 vs 0.706) sets.
CONCLUSIONS The XGB model is a better predictor of distant metastasis among MBC patients than the other ML models and the nomogram; furthermore, the XGB model is a powerful model for predicting bone and lung metastasis. Combined with SHAP values, it could help doctors intuitively understand the impact of each variable on the outcome.
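Alongside the AUC, this abstract uses the Brier score, which measures calibration as well as discrimination: it is the mean squared difference between the predicted probability and the observed binary outcome, so lower is better. A minimal sketch follows; the labels and probabilities are hypothetical and not taken from the study.

```python
def brier_score(labels, probabilities):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(labels) != len(probabilities) or not labels:
        raise ValueError("labels and probabilities must be non-empty and equal in length")
    return sum((p - y) ** 2 for y, p in zip(labels, probabilities)) / len(labels)


# Hypothetical predictions for four patients (1 = distant metastasis).
labels = [1, 0, 0, 1]
probabilities = [0.9, 0.1, 0.2, 0.6]
score = brier_score(labels, probabilities)
print(f"Brier score = {score:.3f}")
```

A perfect forecaster scores 0, and a forecaster that always predicts 0.5 scores 0.25, which gives a rough reference point when reading reported values.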
Affiliation(s)
- Xuhai Zhao
  - Department of Breast Surgery, Harbin Medical University Cancer Hospital, Harbin, China
- Cong Jiang
  - Department of Breast Surgery, Harbin Medical University Cancer Hospital, Harbin, China
8
Lyu J, Xu Z, Sun H, Zhai F, Qu X. Machine learning-based CT radiomics model to discriminate the primary and secondary intracranial hemorrhage. Sci Rep 2023; 13:3709. [PMID: 36879050 PMCID: PMC9988881 DOI: 10.1038/s41598-023-30678-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2022] [Accepted: 02/28/2023] [Indexed: 03/08/2023]
Abstract
It is challenging to distinguish between primary and secondary intracranial hemorrhage (ICH) purely from imaging data, and the two forms of ICH are treated differently. This study aims to evaluate the potential of CT-based machine learning to identify the etiology of ICH and to compare the effectiveness of two region of interest (ROI) sketching methods. A total of 1702 radiomic features were extracted from the CT brain images of 238 patients with acute ICH. We used the Select K Best method and least absolute shrinkage and selection operator (LASSO) logistic regression to select the most discriminative features, and a support vector machine to build a classifier model. Then, a ten-fold cross-validation strategy was employed to evaluate the performance of the classifier. From all quantitative CT-based imaging features obtained by the two sketching methods, eighteen features were selected for each method. The radiomics model outperformed radiologists in distinguishing between primary and secondary ICH with both the volume of interest and the three-layer ROI sketches. As a result, a machine learning-based CT radiomics model can improve the accuracy of identifying primary and secondary ICH. A three-layer ROI sketch can identify primary versus secondary ICH based on the CT radiomics method.
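The ten-fold cross-validation mentioned in this abstract partitions the patients into ten disjoint folds and uses each fold once as the held-out test set. The sketch below shows only the index bookkeeping, using the cohort size of 238 from the abstract. The shuffling seed and the helper name `k_fold_indices` are assumptions of this sketch, and the abstract does not say whether the authors stratified the folds by class.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k spreads the remainder so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(238, k=10))
for fold_number, (train, test) in enumerate(splits, start=1):
    print(f"fold {fold_number}: train={len(train)} test={len(test)}")
```

Feature selection steps such as Select K Best or LASSO would normally be refit inside each training fold, since selecting features on the full data set before splitting leaks information into the test folds.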
Affiliation(s)
- Jianbo Lyu
  - Department of Radiology, The Second Hospital of Dalian Medical University, No. 467 Zhongshan Road, Shahekou District, Dalian, 116023, China
- Zhaohui Xu
  - Department of Hernia and Colorectal Surgery, The Second Hospital of Dalian Medical University, No. 467 Zhongshan Road, Shahekou District, Dalian, 116023, China
- HaiYan Sun
  - Department of Radiology, The Second Hospital of Dalian Medical University, No. 467 Zhongshan Road, Shahekou District, Dalian, 116023, China
- Fangbing Zhai
  - Department of Radiology, The Second Hospital of Dalian Medical University, No. 467 Zhongshan Road, Shahekou District, Dalian, 116023, China
- Xiaofeng Qu
  - Department of Radiology, The Second Hospital of Dalian Medical University, No. 467 Zhongshan Road, Shahekou District, Dalian, 116023, China
9
Mubarik S, Wang F, Luo L, Hezam K, Yu C. Evaluation of Lee-Carter model to breast cancer mortality prediction in China and Pakistan. Front Oncol 2023; 13:1101249. [PMID: 36845742 PMCID: PMC9954621 DOI: 10.3389/fonc.2023.1101249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2022] [Accepted: 01/27/2023] [Indexed: 02/12/2023]
Abstract
Background Precise breast cancer-related mortality forecasts are required for public health program and healthcare service planning. A number of stochastic model-based approaches for predicting mortality have been developed. The trends shown by mortality data from various diseases and countries are critical to the effectiveness of these models. This study illustrates the unconventional statistical method for estimating and predicting the mortality risk between the early-onset and screen-age/late-onset breast cancer population in China and Pakistan using the Lee-Carter model. Methods Longitudinal death data for female breast cancer from 1990 to 2019 obtained from the Global Burden of Disease study database were used to compare statistical approach between early-onset (age group, 25-49 years) and screen-age/late-onset (age group, 50-84 years) population. We evaluated the model performance both within (training period, 1990-2010) and outside (test period, 2011-2019) data forecast accuracy using the different error measures and graphical analysis. Finally, using the Lee-Carter model, we predicted the general index for the time period (2011 to 2030) and derived corresponding life expectancy at birth for the female breast cancer population using life tables. Results Study findings revealed that the Lee-Carter approach to predict breast cancer mortality rate outperformed in the screen-age/late-onset compared with that in the early-onset population in terms of goodness of fit and within and outside forecast accuracy check. Moreover, the trend in forecast error was decreasing gradually in the screen-age/late-onset compared with that in the early-onset breast cancer population in China and Pakistan. Furthermore, we observed that this approach had provided almost comparable results between the early-onset and screen-age/late-onset population in forecast accuracy for more varying mortality behavior over time like in Pakistan. 
Both the early-onset and screen-age/late-onset populations in Pakistan were expected to have an increase in breast cancer mortality by 2030, whereas for China, mortality was expected to decrease in the early-onset population. Conclusion The Lee-Carter model can be used to estimate breast cancer mortality and thus to project future life expectancy at birth, especially in the screen-age/late-onset population. As a result, it is suggested that this approach may be useful and convenient for predicting cancer-related mortality even when epidemiological and demographic disease data sets are limited. According to model predictions for breast cancer mortality, improved health facilities for disease diagnosis, control, and prevention are required to reduce the disease's future burden, particularly in less developed countries.
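For readers unfamiliar with the Lee-Carter model, its standard form is given below. This is the textbook formulation; the abstract does not state the exact specification or estimation procedure the authors used, so the notation here is an assumption. The "general index" forecast in the abstract corresponds to the time index k_t.

```latex
% m_{x,t}: death rate for age group x in year t
% a_x: average log-mortality for age group x
% b_x: sensitivity of age group x to the time index
% k_t: general mortality index in year t
\ln m_{x,t} = a_x + b_x k_t + \varepsilon_{x,t},
\qquad \sum_x b_x = 1, \qquad \sum_t k_t = 0

% Forecasts are usually obtained by modelling k_t as a random walk with drift d:
k_t = k_{t-1} + d + e_t
```

The two sum constraints make the parameters identifiable; without them, a_x, b_x, and k_t could be rescaled without changing the fitted rates.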
Affiliation(s)
- Sumaira Mubarik
  - Department of Epidemiology and Biostatistics, School of Public Health, Wuhan University, Wuhan, China
- Fang Wang
  - Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu, China
- Lisha Luo
  - Center for Evidence-Based and Translational Medicine, Zhongnan Hospital of Wuhan University, Wuhan, Hubei, China
- Kamal Hezam
  - Nankai University, School of Medicine, Tianjin, China
- Chuanhua Yu
  - Department of Epidemiology and Biostatistics, School of Public Health, Wuhan University, Wuhan, China (corresponding author)
10
Wu R, Luo J, Wan H, Zhang H, Yuan Y, Hu H, Feng J, Wen J, Wang Y, Li J, Liang Q, Gan F, Zhang G. Evaluation of machine learning algorithms for the prognosis of breast cancer from the Surveillance, Epidemiology, and End Results database. PLoS One 2023; 18:e0280340. [PMID: 36701415 PMCID: PMC9879508 DOI: 10.1371/journal.pone.0280340] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/04/2022] [Accepted: 12/26/2022] [Indexed: 01/27/2023]
Abstract
INTRODUCTION Many researchers have used machine learning (ML) to predict the prognosis of breast cancer (BC) patients and have reported that ML models had good individualized prediction performance. OBJECTIVE This cohort study was intended to establish a reliable data analysis model by comparing the performance of 10 common ML algorithms and the traditional American Joint Committee on Cancer (AJCC) stage, and to use this model in a Web application to provide individualized predictions for others. METHODS This study included 63145 BC patients from the Surveillance, Epidemiology, and End Results database. RESULTS Comparing the performance of the 10 ML algorithms and the 7th AJCC stage in the optimal test set, we found that in terms of 5-year overall survival, multivariate adaptive regression splines (MARS) had the highest area under the curve (AUC) value (0.831) and F1-score (0.608), and both sensitivity (0.737) and specificity (0.772) were relatively high. In addition, MARS showed the highest AUC value (0.831, 95% confidence interval: 0.820-0.842) in comparison to the other ML algorithms and the 7th AJCC stage (all P < 0.05). MARS, the best performing model, was selected for web application development (https://w12251393.shinyapps.io/app2/). CONCLUSIONS This comparative study of multiple forecasting models on a large data set found that the MARS-based model performed much better than the other ML algorithms and the 7th AJCC stage in individualized estimation of survival of BC patients, which is very likely to be the next step towards precision medicine.
Affiliation(s)
- Ruiyang Wu
- Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
- Jing Luo
- Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
- Hangyu Wan
- Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
- Haiyan Zhang
- Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
- Yewei Yuan
- Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
- Huihua Hu
- Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
- Jinyan Feng
- Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
- Jing Wen
- Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
- Yan Wang
- Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
- Junyan Li
- Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
- Qi Liang
- Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
- Fengjiao Gan
- Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
- Gang Zhang
- Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
|
11
|
Xu J, Zhou J, Hu J, Ren Q, Wang X, Shu Y. Development and validation of a machine learning model for survival risk stratification after esophageal cancer surgery. Front Oncol 2022; 12:1068198. [PMID: 36568178 PMCID: PMC9780661 DOI: 10.3389/fonc.2022.1068198] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/12/2022] [Accepted: 11/24/2022] [Indexed: 12/13/2022] Open
Abstract
Background Predicting the prognosis of patients with esophageal cancer (EC) supports postoperative clinical decision-making. This study aimed to create a dependable machine learning (ML) model for predicting the prognosis of patients with EC after surgery. Methods The records of patients from China with thoracic esophageal squamous cell carcinoma (ESCC) who underwent radical surgery for EC were analyzed. The data were split into training and test sets, and prognostic risk variables were identified in the training set using univariate and multivariate Cox regression. Based on the screened features, five ML models were trained and validated through nested cross-validation (nCV). The performance of each model was evaluated using the area under the curve (AUC), accuracy (ACC), and F1-score, and the best model was chosen as the final model for risk stratification and survival analysis. Results This study enrolled 810 patients with thoracic ESCC. Six variables were ultimately included for modeling. Five ML models were trained and validated, and the XGBoost model was selected as the final model. The XGBoost model was trained, optimized, and tested (AUC = 0.855; 95% CI, 0.808-0.902). Patients were separated into three risk groups. Statistically significant differences (p < 0.001) were found among all three groups in both the training and test sets. Conclusions An ML model that was highly practical and reliable for predicting the prognosis of patients with EC after surgery was established, and an application to facilitate clinical use was developed.
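Nested cross-validation, as named in the Methods above, tunes hyperparameters in an inner loop and estimates performance in an outer loop, so that the rows used for scoring never influence the tuning. The sketch below illustrates that structure only. It swaps in a tiny one-feature k-nearest-neighbour classifier instead of the study's five ML models, and every function name and data value is a hypothetical choice for illustration.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Split range(n) into k shuffled folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def knn_predict(x, train_x, train_y, k):
    """Majority vote among the k training points nearest to x (1-D feature)."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def knn_accuracy(xs, ys, train_idx, eval_idx, k):
    """Fit on train_idx (k-NN just stores the rows) and score on eval_idx."""
    tx = [xs[i] for i in train_idx]
    ty = [ys[i] for i in train_idx]
    hits = sum(knn_predict(xs[i], tx, ty, k) == ys[i] for i in eval_idx)
    return hits / len(eval_idx)


def nested_cv(xs, ys, k_candidates, outer_k=3, inner_k=2):
    """Mean outer-fold accuracy, with k chosen by an inner CV on training rows."""
    outer_folds = k_fold_indices(len(xs), outer_k)
    outer_scores = []
    for o, test_idx in enumerate(outer_folds):
        train_idx = [i for f, fold in enumerate(outer_folds) if f != o for i in fold]
        inner_folds = k_fold_indices(len(train_idx), inner_k, seed=o + 1)

        def inner_score(k):
            # Average validation accuracy over the inner folds for this k.
            total = 0.0
            for v, fold in enumerate(inner_folds):
                val = [train_idx[j] for j in fold]
                fit = [train_idx[j]
                       for w, other in enumerate(inner_folds) if w != v
                       for j in other]
                total += knn_accuracy(xs, ys, fit, val, k)
            return total / len(inner_folds)

        best_k = max(k_candidates, key=inner_score)
        # The outer test fold is touched only once, after tuning is finished.
        outer_scores.append(knn_accuracy(xs, ys, train_idx, test_idx, best_k))
    return sum(outer_scores) / len(outer_scores)


# Hypothetical toy data: one feature, two classes.
xs = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6]
ys = [0] * 6 + [1] * 6
score = nested_cv(xs, ys, k_candidates=[1, 3])
print(score)
```

The key design point is that `best_k` is selected without ever looking at `test_idx`, which is what keeps the outer estimate from being optimistically biased by the tuning step.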
Affiliation(s)
- Jinye Xu
- Clinical Medical College, Yangzhou University, Yangzhou, China; Department of Thoracic Surgery, Northern Jiangsu People’s Hospital Affiliated to Yangzhou University, Yangzhou, China
- Jianghui Zhou
- Clinical Medical College, Yangzhou University, Yangzhou, China; Department of Thoracic Surgery, Northern Jiangsu People’s Hospital Affiliated to Yangzhou University, Yangzhou, China
- Junxi Hu
- Clinical Medical College, Yangzhou University, Yangzhou, China; Department of Thoracic Surgery, Northern Jiangsu People’s Hospital Affiliated to Yangzhou University, Yangzhou, China
- Qinglin Ren
- Department of Thoracic Surgery, Northern Jiangsu People’s Hospital Affiliated to Yangzhou University, Yangzhou, China
- Xiaolin Wang
- Department of Thoracic Surgery, Northern Jiangsu People’s Hospital Affiliated to Yangzhou University, Yangzhou, China; *Correspondence: Yusheng Shu; Xiaolin Wang
- Yusheng Shu
- Department of Thoracic Surgery, Northern Jiangsu People’s Hospital Affiliated to Yangzhou University, Yangzhou, China; *Correspondence: Yusheng Shu; Xiaolin Wang
|
12
|
A Breast Cancer Prediction Model Based on a Panel from Circulating Exosomal miRNAs. BIOMED RESEARCH INTERNATIONAL 2022; 2022:5170261. [PMID: 36312858 PMCID: PMC9615554 DOI: 10.1155/2022/5170261] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2022] [Revised: 09/19/2022] [Accepted: 09/22/2022] [Indexed: 12/09/2022]
Abstract
Breast cancer (BC) is a serious threat to women's health. Exosomes contain a variety of biomolecules, which makes them promising candidates for disease diagnostic markers, but whether they can serve as noninvasive biomarkers for BC diagnosis requires further study. In this study, we aimed to build a predictive model and assess the value of plasma exosomal miRNA (exo-miRNA) in the early diagnosis of BC. First, exosomes isolated from plasma were identified by Nanoparticle Tracking Analysis (NTA), Transmission Electron Microscopy (TEM), and Western blot. miRNA expression in plasma samples from 56 BC patients and 40 normal controls was analyzed by high-throughput sequencing. miRNAs with strong correlation characteristics were selected by Lasso logistic regression. We then built the training set and test set, evaluated the accuracy of the Lasso regression, and evaluated the performance of different models in both sets. Finally, GO analysis and KEGG and Reactome pathway enrichment analyses were used to understand the biological significance of the 16 characteristic miRNAs. The successful separation of exosomes in serum was confirmed by NTA, TEM, and Western blot. The training set data matrix containing 1962 miRNAs was obtained by sequencing for model construction, and 16 strongly correlated miRNAs were selected by Lasso logistic regression. The accuracy of Lasso regression was 97.22% in the training set and 95.83% in the test set. We built different models and evaluated the performance of each model in the training set and test set. The AUC values of the Lasso, SVM, GBDT, and Random Forest models were all 1 in the training set and 0.979, 0.936, 0.971, and 0.979, respectively, in the test set. Bioinformatics analysis showed that the 16 signature miRNAs were significantly enriched in cancer-related pathways such as herpes simplex virus 1 infection, TGF-β signaling, and the Toll-like receptor family.
The results of this study suggest that the 16 characteristic miRNAs screened from plasma exosomes can be used as a group of biomarkers, and that the prediction model built on this set of markers is expected to be useful in the early diagnosis of BC.
|
13
|
Accuracy and Utility of Preoperative Ultrasound-Guided Axillary Lymph Node Biopsy for Invasive Breast Cancer: A Systematic Review and Meta-Analysis. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:3307627. [PMID: 36203726 PMCID: PMC9532070 DOI: 10.1155/2022/3307627] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/23/2022] [Revised: 09/07/2022] [Accepted: 09/10/2022] [Indexed: 12/05/2022]
Abstract
Background With the accelerating pace of life and work, the incidence of invasive breast cancer is rising, and early diagnosis is very important. This study screened and analyzed the published literature on ultrasound-guided biopsy of invasive breast cancer to assess the accuracy and utility of preoperative biopsy. Method Four databases were searched. No start date was set for the search, and the end date was July 2, 2022. Two researchers each screened the literature and included studies on preoperative ultrasound-guided biopsy and intraoperative and postoperative pathological diagnosis of invasive breast cancer. The diagnostic data in the included studies were extracted and meta-analyzed with RevMan 5.4 software, and the risk-of-bias plot, forest plot, and summary receiver operating characteristic (SROC) curves were drawn. Results The 19 included studies involved about 18668 patients with invasive breast cancer. The risk of bias in the included literature was low. The true positive, false positive, true negative, and false negative counts in the forest plot varied widely, which may be related to the large differences in the number of patients across studies. Most studies lay in the upper left of the SROC plot, indicating that the accuracy of ultrasound-guided axillary biopsy is very high. Conclusion For invasive breast cancer, preoperative ultrasound-guided biopsy can accurately predict the staging and grading of breast cancer, which has important reference value for surgery and follow-up treatment.
|
14
|
Xue Q, Wen D, Ji MH, Tong J, Yang JJ, Zhou CM. Developing Machine Learning Algorithms to Predict Pulmonary Complications After Emergency Gastrointestinal Surgery. Front Med (Lausanne) 2021; 8:655686. [PMID: 34409047 PMCID: PMC8365303 DOI: 10.3389/fmed.2021.655686] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 01/19/2021] [Accepted: 07/12/2021] [Indexed: 12/12/2022] Open
Abstract
Objective: To investigate whether machine learning can predict postoperative pulmonary complications (PPCs) after emergency gastrointestinal surgery in patients with acute diffuse peritonitis. Methods: This is a secondary data analysis. We used five machine learning algorithms (logistic regression, DecisionTree, GradientBoosting, Xgbc, and gbm) to predict PPCs. Results: Nine hundred and twenty-six cases were included in this study; 187 cases (20.19%) had PPCs. The five variables with the highest importance weights for pulmonary complications were preoperative albumin, cholesterol on the 3rd day after surgery, albumin on the day of surgery, platelet count on the 1st day after surgery, and cholesterol on the 1st day after surgery. In the test group, the logistic regression model showed AUC = 0.808, accuracy = 0.824, and precision = 0.621; the decision tree model showed AUC = 0.702, accuracy = 0.795, and precision = 0.486; the GradientBoosting model showed AUC = 0.788, accuracy = 0.827, and precision = 1.000; the Xgbc model showed AUC = 0.784, accuracy = 0.806, and precision = 0.583; and the gbm model showed AUC = 0.814, accuracy = 0.806, and precision = 0.750. Conclusion: Machine learning algorithms can predict PPCs in patients with acute diffuse peritonitis. Moreover, the importance matrix for the Gbdt algorithm model shows that albumin, cholesterol, age, and platelets are the variables with the highest weights for pulmonary complications.
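The AUC values quoted above can be read as the probability that a randomly chosen case with the complication receives a higher predicted score than a randomly chosen case without it. A minimal sketch of that pairwise (Mann-Whitney) definition follows; the function name `roc_auc` and the toy scores are assumptions for illustration, not the authors' code or data.

```python
def roc_auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities (1 = complication, 0 = none).
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.6, 0.8]
auc = roc_auc(y_true, scores)
print(round(auc, 3))  # 7 of 9 pairs are ordered correctly -> 0.778
```

Unlike accuracy and precision, this quantity does not depend on a classification threshold, which is one reason the three metrics can rank models differently.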
Affiliation(s)
- Qiong Xue
- Department of Anesthesiology, Pain and Perioperative Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Duan Wen
- Department of Anesthesiology, Pain and Perioperative Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Mu-Huo Ji
- Department of Anesthesiology, Pain and Perioperative Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China; Department of Anesthesiology, The Second Affiliated Hospital, Nanjing Medical University, Nanjing, China
- Jianhua Tong
- Department of Anesthesiology, Pain and Perioperative Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Jian-Jun Yang
- Department of Anesthesiology, Pain and Perioperative Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Cheng-Mao Zhou
- Department of Anesthesiology, Pain and Perioperative Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
|
15
|
Wang Y, Zhu Y, Xue Q, Ji M, Tong J, Yang JJ, Zhou CM. Predicting chronic pain in postoperative breast cancer patients with multiple machine learning and deep learning models. J Clin Anesth 2021; 74:110423. [PMID: 34364190 DOI: 10.1016/j.jclinane.2021.110423] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/11/2021] [Revised: 06/14/2021] [Accepted: 06/17/2021] [Indexed: 11/26/2022]
Affiliation(s)
- Ying Wang
- Department of Anesthesiology, Pain and Perioperative Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou 450000, China
- Yu Zhu
- Department of Anesthesiology, Pain and Perioperative Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou 450000, China; Anesthesia and Big Data Research Group, Department of Scientific Research, Zhaoqing Medical College, China
- Qiong Xue
- Department of Anesthesiology, Pain and Perioperative Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou 450000, China
- Muhuo Ji
- Department of Anesthesiology, Pain and Perioperative Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou 450000, China
- Jianhua Tong
- Department of Anesthesiology, Pain and Perioperative Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou 450000, China
- Jian-Jun Yang
- Department of Anesthesiology, Pain and Perioperative Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou 450000, China
- Cheng-Mao Zhou
- Department of Anesthesiology, Pain and Perioperative Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou 450000, China
|