1. Wang H, Zhang C, Li Q, Tian T, Huang R, Qiu J, Tian R. Development and validation of prediction models for papillary thyroid cancer structural recurrence using machine learning approaches. BMC Cancer 2024; 24:427. [PMID: 38589799 PMCID: PMC11000372 DOI: 10.1186/s12885-024-12146-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Accepted: 03/19/2024] [Indexed: 04/10/2024] Open
Abstract
BACKGROUND Although papillary thyroid cancer (PTC) patients are known to have an excellent prognosis, up to 30% of patients experience disease recurrence after initial treatment. Accurately predicting disease prognosis remains a challenge given that the predictive value of several predictors remains controversial. Thus, we investigated whether machine learning (ML) approaches based on comprehensive predictors can predict the risk of structural recurrence for PTC patients. METHODS A total of 2244 patients treated with thyroid surgery and radioiodine were included. Twenty-nine perioperative variables spanning four dimensions (demographic characteristics and comorbidities, tumor-related variables, lymph node (LN)-related variables, and metabolic and inflammatory markers) were analyzed. We applied five ML algorithms, namely logistic regression (LR), support vector machine (SVM), extreme gradient boosting (XGBoost), random forest (RF), and neural network (NN), to develop the models. The area under the receiver operating characteristic curve (AUC-ROC), calibration curve, and variable importance were used to evaluate the models' performance. RESULTS During a median follow-up of 45.5 months, 179 patients (8.0%) experienced structural recurrence. The non-stimulated thyroglobulin (Tg), LN dissection, number of LNs dissected, lymph node metastasis ratio (LNR), N stage, comorbidity of hypertension, comorbidity of diabetes, body mass index, and low-density lipoprotein were used to develop the models. All models showed a greater AUC (AUC = 0.738 to 0.767) than did the American Thyroid Association (ATA) risk stratification (AUC = 0.620, DeLong test: P < 0.01). The SVM, XGBoost, and RF models showed greater sensitivity (0.568, 0.595, 0.676), specificity (0.903, 0.857, 0.784), accuracy (0.875, 0.835, 0.775), positive predictive value (PPV) (0.344, 0.272, 0.219), negative predictive value (NPV) (0.959, 0.959, 0.964), and F1 score (0.429, 0.373, 0.331) than did the ATA risk stratification (sensitivity = 0.432, specificity = 0.770, accuracy = 0.742, PPV = 0.144, NPV = 0.938, F1 score = 0.216). The RF model had generally consistent calibration compared with the other models. Tg and the LNR were the two most important variables in all the models, and N stage was among the five most important variables in all the models. CONCLUSIONS The RF model achieved the expected prediction performance with generally good discrimination, calibration and interpretability in this study. This study sheds light on the potential of ML approaches for improving the accuracy of risk stratification for PTC patients. TRIAL REGISTRATION Retrospectively registered at www.chictr.org.cn (trial registration number: ChiCTR2300075574, date of registration: 2023-09-08).
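The F1 scores quoted in this abstract can be cross-checked against the quoted sensitivity and PPV values. The sketch below assumes the standard definition of F1 as the harmonic mean of PPV (precision) and sensitivity (recall); the function name is illustrative and is not taken from the paper, and small differences in the last digit are expected because the inputs are already rounded.

```python
def f1_from_ppv_and_sensitivity(ppv: float, sensitivity: float) -> float:
    """Return F1 as the harmonic mean of PPV (precision) and sensitivity (recall)."""
    if ppv + sensitivity == 0:
        return 0.0  # convention: F1 is 0 when both components are 0
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# (sensitivity, PPV, reported F1) exactly as quoted in the abstract.
reported = {
    "SVM": (0.568, 0.344, 0.429),
    "XGBoost": (0.595, 0.272, 0.373),
    "RF": (0.676, 0.219, 0.331),
    "ATA": (0.432, 0.144, 0.216),
}

for name, (sensitivity, ppv, f1_reported) in reported.items():
    f1 = f1_from_ppv_and_sensitivity(ppv, sensitivity)
    # Inputs are rounded to three decimals, so allow a small tolerance.
    print(f"{name}: recomputed F1 = {f1:.4f}, reported F1 = {f1_reported:.3f}")
```

Under that assumption, each recomputed value agrees with the reported F1 to within about 0.001, which is consistent with rounding of the published inputs.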
Affiliation(s)
- Hongxi Wang
  - Department of Nuclear Medicine, West China Hospital, Sichuan University, No. 37 Guoxue Alley, 610041, Chengdu, China
- Chao Zhang
  - West China Biomedical Big Data Center, West China Hospital, Sichuan University, 610041, Chengdu, China
- Qianrui Li
  - Department of Nuclear Medicine, West China Hospital, Sichuan University, No. 37 Guoxue Alley, 610041, Chengdu, China
- Tian Tian
  - Department of Nuclear Medicine, West China Hospital, Sichuan University, No. 37 Guoxue Alley, 610041, Chengdu, China
- Rui Huang
  - Department of Nuclear Medicine, West China Hospital, Sichuan University, No. 37 Guoxue Alley, 610041, Chengdu, China
- Jiajun Qiu
  - West China Biomedical Big Data Center, West China Hospital, Sichuan University, 610041, Chengdu, China
- Rong Tian
  - Department of Nuclear Medicine, West China Hospital, Sichuan University, No. 37 Guoxue Alley, 610041, Chengdu, China

2. Jin X, Pan Y, Zhai C, Shen H, You L, Pan H. Exploration and machine learning model development for T2 NSCLC with bronchus infiltration and obstructive pneumonia/atelectasis. Sci Rep 2024; 14:4793. [PMID: 38413705 PMCID: PMC10899628 DOI: 10.1038/s41598-024-55507-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Accepted: 02/24/2024] [Indexed: 02/29/2024] Open
Abstract
In the 8th edition of the American Joint Committee on Cancer (AJCC) staging system for non-small cell lung cancer (NSCLC), tumors exhibiting main bronchial infiltration (MBI) near the carina and those presenting with complete lung obstructive pneumonia/atelectasis (P/ATL) have been reclassified from T3 to T2. Our investigation of the Surveillance, Epidemiology, and End Results (SEER) database, spanning 2007 to 2015 and adjusted via propensity score matching (PSM) for additional variables, revealed notably inferior overall survival (OS) for patients with these conditions. Specifically, patients with P/ATL had a median OS of 12 months compared to 15 months (p < 0.001), while MBI patients had a slightly worse prognosis, with a median OS of 22 months versus 23 months (p = 0.037); both conditions were significantly correlated with lymph node metastasis (all p < 0.001). When different treatment approaches for these particular T2 NSCLC variants were evaluated with adjustment for other factors, surgery emerged as the optimal therapeutic strategy. Among patients who underwent surgery, a higher proportion of the MBI and P/ATL groups than of the non-MBI/(P/ATL) group received preoperative induction therapy or postoperative adjuvant therapy rather than surgery alone (41.3%/54.7% vs. 36.6%). However, for MBI patients, initial surgery followed by adjuvant treatment or induction therapy significantly improved prognosis, a benefit that was not replicated for P/ATL patients. An XGBoost model for 5-year survival prediction and treatment selection in P/ATL and MBI patients yielded area under the curve (AUC) values of 0.853 for P/ATL and 0.814 for MBI, supporting the model's usefulness in prognostication and treatment allocation for these distinct T2 NSCLC categories.
Affiliation(s)
- Xuanhong Jin
  - Department of Medical Oncology, Sir Run Run Shaw Hospital, College of Medicine, Zhejiang University, Hangzhou, China
- Yang Pan
  - Postgraduate Training Base Alliance of Wenzhou Medical University (Zhejiang Cancer Hospital), Hangzhou, China
  - Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, China
- Chongya Zhai
  - Department of Medical Oncology, Sir Run Run Shaw Hospital, College of Medicine, Zhejiang University, Hangzhou, China
- Hangchen Shen
  - Department of Medical Oncology, Sir Run Run Shaw Hospital, College of Medicine, Zhejiang University, Hangzhou, China
- Liangkun You
  - Department of Medical Oncology, Sir Run Run Shaw Hospital, College of Medicine, Zhejiang University, Hangzhou, China
- Hongming Pan
  - Department of Medical Oncology, Sir Run Run Shaw Hospital, College of Medicine, Zhejiang University, Hangzhou, China

3. Li S, Yi H, Leng Q, Wu Y, Mao Y. New perspectives on cancer clinical research in the era of big data and machine learning. Surg Oncol 2024; 52:102009. [PMID: 38215544 DOI: 10.1016/j.suronc.2023.102009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Accepted: 10/16/2023] [Indexed: 01/14/2024]
Abstract
In the 21st century, the development of medical science has entered the era of big data, and machine learning has become an essential tool for mining medical big data. The establishment of the SEER database has provided a wealth of epidemiological data for cancer clinical research, and the number of studies based on SEER and machine learning has been growing in recent years. This article reviews recent research based on SEER and machine learning and finds that the current focus of such studies is primarily on the development and validation of models using machine learning algorithms, with the main directions being lymph node metastasis prediction, distant metastasis prediction, and prognosis-related research. Compared to traditional models, machine learning algorithms have the advantage of stronger adaptability, but also suffer from disadvantages such as overfitting and poor interpretability, which need to be weighed in practical applications. At present, machine learning algorithms, as the foundation of artificial intelligence, have just begun to emerge in the field of cancer clinical research. The future development of oncology will enter a more precise era of cancer research, characterized by larger data, higher dimensions, and more frequent information exchange. Machine learning is bound to shine brightly in this field.
Affiliation(s)
- Shujun Li
  - Department of Hematology, Xiangya Hospital, Central South University, Changsha, 410008, China; National Clinical Research Center for Geriatric Diseases (Xiangya Hospital), China; Hunan Hematology Oncology Clinical Medical Research Center, China
- Hang Yi
  - Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Qihao Leng
  - Xiangya School of Medicine, Central South University, Changsha, 410013, Hunan Province, China
- You Wu
  - Institute for Hospital Management, School of Medicine, Tsinghua University, 30 Shuangqing Rd, Haidian District, Beijing, China; Department of Health Policy and Management, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, 21205, USA
- Yousheng Mao
  - Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China

4. Babaei Rikan S, Sorayaie Azar A, Naemi A, Bagherzadeh Mohasefi J, Pirnejad H, Wiil UK. Survival prediction of glioblastoma patients using modern deep learning and machine learning techniques. Sci Rep 2024; 14:2371. [PMID: 38287149 PMCID: PMC10824760 DOI: 10.1038/s41598-024-53006-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2023] [Accepted: 01/25/2024] [Indexed: 01/31/2024] Open
Abstract
In this study, we used data from the Surveillance, Epidemiology, and End Results (SEER) database to predict survival outcomes of glioblastoma patients. To assess dataset skewness and detect feature importance, we applied Pearson's second coefficient test of skewness and the Ordinary Least Squares method, respectively. Using two sampling strategies, holdout and five-fold cross-validation, we developed five machine learning (ML) models alongside a feed-forward deep neural network (DNN) for the multiclass classification and regression prediction of glioblastoma patient survival. After balancing the classification and regression datasets, we obtained 46,340 and 28,573 samples, respectively. Shapley additive explanations (SHAP) were then used to explain the decision-making process of the best model. In both classification and regression tasks, as well as across holdout and cross-validation sampling strategies, the DNN consistently outperformed the ML models. Notably, the accuracies were 90.25% and 90.22% for holdout and five-fold cross-validation, respectively, while the corresponding R² values were 0.6565 and 0.6622. SHAP analysis identified age at diagnosis as the most influential feature in the DNN's survival predictions. These findings suggest that the DNN holds promise as a practical auxiliary tool for clinicians, aiding them in optimal decision-making concerning the treatment and care trajectories for glioblastoma patients.
Affiliation(s)
- Amin Naemi
  - SDU Health Informatics and Technology, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark
- Habibollah Pirnejad
  - Erasmus School of Health Policy and Management (ESHPM), Erasmus University Rotterdam, Rotterdam, The Netherlands
  - Patient Safety Research Center, Clinical Research Institute, Urmia University of Medical Sciences, Urmia, Iran
- Uffe Kock Wiil
  - SDU Health Informatics and Technology, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark

5. Toro-Tobon D, Loor-Torres R, Duran M, Fan JW, Singh Ospina N, Wu Y, Brito JP. Artificial Intelligence in Thyroidology: A Narrative Review of the Current Applications, Associated Challenges, and Future Directions. Thyroid 2023; 33:903-917. [PMID: 37279303 PMCID: PMC10440669 DOI: 10.1089/thy.2023.0132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Background: The use of artificial intelligence (AI) in health care has grown exponentially with the promise of facilitating biomedical research and enhancing diagnosis, treatment, monitoring, disease prevention, and health care delivery. We aim to examine the current state, limitations, and future directions of AI in thyroidology. Summary: AI has been explored in thyroidology since the 1990s, and currently, there is an increasing interest in applying AI to improve the care of patients with thyroid nodules (TNODs), thyroid cancer, and functional or autoimmune thyroid disease. These applications aim to automate processes, improve the accuracy and consistency of diagnosis, personalize treatment, decrease the burden for health care professionals, improve access to specialized care in areas lacking expertise, deepen the understanding of subtle pathophysiologic patterns, and accelerate the learning curve of less experienced clinicians. There are promising results for many of these applications. Yet, most are in the validation or early clinical evaluation stages. Only a few are currently adopted for risk stratification of TNODs by ultrasound and determination of the malignant nature of indeterminate TNODs by molecular testing. Challenges of the currently available AI applications include the lack of prospective and multicenter validations and utility studies, small and low diversity of training data sets, differences in data sources, lack of explainability, unclear clinical impact, inadequate stakeholder engagement, and inability to use outside of the research setting, which might limit the value of their future adoption. Conclusions: AI has the potential to improve many aspects of thyroidology; however, addressing the limitations affecting the suitability of AI interventions in thyroidology is a prerequisite to ensure that AI provides added value for patients with thyroid disease.
Affiliation(s)
- David Toro-Tobon
  - Division of Endocrinology, Diabetes, Metabolism and Nutrition, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
- Ricardo Loor-Torres
  - Division of Endocrinology, Diabetes, Metabolism and Nutrition, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
- Mayra Duran
  - Division of Endocrinology, Diabetes, Metabolism and Nutrition, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA
- Jungwei W. Fan
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, Minnesota, USA
- Naykky Singh Ospina
  - Division of Endocrinology, Department of Medicine, University of Florida, Gainesville, Florida, USA
- Yonghui Wu
  - Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, Florida, USA
- Juan P. Brito
  - Division of Endocrinology, Diabetes, Metabolism and Nutrition, Department of Medicine, Mayo Clinic, Rochester, Minnesota, USA

6. Ajilisa OA, Jagathy Raj VP, Sabu MK. A Deep Learning Framework for the Characterization of Thyroid Nodules from Ultrasound Images Using Improved Inception Network and Multi-Level Transfer Learning. Diagnostics (Basel) 2023; 13:2463. [PMID: 37510206 PMCID: PMC10378664 DOI: 10.3390/diagnostics13142463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Revised: 07/07/2023] [Accepted: 07/13/2023] [Indexed: 07/30/2023] Open
Abstract
In the past few years, deep learning has gained increasingly widespread attention and has been applied to diagnosing benign and malignant thyroid nodules. It is difficult to acquire sufficient medical images, resulting in insufficient data, which hinders the development of an efficient deep-learning model. In this paper, we developed a deep-learning-based characterization framework to differentiate malignant and benign nodules from the thyroid ultrasound images. This approach improves the recognition accuracy of the inception network by combining squeeze and excitation networks with the inception modules. We have also integrated the concept of multi-level transfer learning using breast ultrasound images as a bridge dataset. This transfer learning approach addresses the issues regarding domain differences between natural images and ultrasound images during transfer learning. This paper aimed to investigate how the entire framework could help radiologists improve diagnostic performance and avoid unnecessary fine-needle aspiration. The proposed approach based on multi-level transfer learning and improved inception blocks achieved higher precision (0.9057 for the benign class and 0.9667 for the malignant class), recall (0.9796 for the benign class and 0.8529 for malignant), and F1-score (0.9412 for benign class and 0.9062 for malignant class). It also obtained an AUC value of 0.9537, which is higher than that of the single-level transfer learning method. The experimental results show that this model can achieve satisfactory classification accuracy comparable to experienced radiologists. Using this model, we can save time and effort as well as deliver potential clinical application value.
Affiliation(s)
- O A Ajilisa
  - Department of Computer Applications, Cochin University of Science and Technology, South Kalamassery, Kochi 682022, Kerala, India
- V P Jagathy Raj
  - School of Management Studies, Cochin University of Science and Technology, South Kalamassery, Kochi 682022, Kerala, India
- M K Sabu
  - Department of Computer Applications, Cochin University of Science and Technology, South Kalamassery, Kochi 682022, Kerala, India

7. Yeramosu T, Ahmad W, Bashir A, Wait J, Bassett J, Domson G. Predicting five-year mortality in soft-tissue sarcoma patients. Bone Joint J 2023; 105-B:702-710. [PMID: 37257862 DOI: 10.1302/0301-620x.105b6.bjj-2022-0998.r1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]
Abstract
Aims The aim of this study was to identify factors associated with five-year cancer-related mortality in patients with limb and trunk soft-tissue sarcoma (STS) and develop and validate machine learning algorithms in order to predict five-year cancer-related mortality in these patients. Methods Demographic, clinicopathological, and treatment variables of limb and trunk STS patients in the Surveillance, Epidemiology, and End Results Program (SEER) database from 2004 to 2017 were analyzed. Multivariable logistic regression was used to determine factors significantly associated with five-year cancer-related mortality. Various machine learning models were developed and compared using area under the curve (AUC), calibration, and decision curve analysis. The model that performed best on the SEER testing data was further assessed to determine the variables most important in its predictive capacity. This model was externally validated using our institutional dataset. Results A total of 13,646 patients with STS from the SEER database were included, of whom 35.9% experienced five-year cancer-related mortality. The random forest model performed the best overall and identified tumour size as the most important variable when predicting mortality in patients with STS, followed by M stage, histological subtype, age, and surgical excision. Each variable was significant in logistic regression. External validation yielded an AUC of 0.752. Conclusion This study identified clinically important variables associated with five-year cancer-related mortality in patients with limb and trunk STS, and developed a predictive model that demonstrated good accuracy and predictability. Orthopaedic oncologists may use these findings to further risk-stratify their patients and recommend an optimal course of treatment.
Affiliation(s)
- Teja Yeramosu
  - School of Medicine, Virginia Commonwealth University, Richmond, Virginia, USA
- Waleed Ahmad
  - School of Medicine, Virginia Commonwealth University, Richmond, Virginia, USA
- Azhar Bashir
  - Department of Orthopaedic Surgery, School of Medicine, Virginia Commonwealth University, Richmond, Virginia, USA
- Jacob Wait
  - School of Medicine, Virginia Commonwealth University, Richmond, Virginia, USA
- James Bassett
  - College of Medicine and Life Sciences, University of Toledo, Toledo, Ohio, USA
- Gregory Domson
  - Department of Orthopaedic Surgery, School of Medicine, Virginia Commonwealth University, Richmond, Virginia, USA

8. Sitahong A, Yuan Y, Li M, Ma J, Ba Z, Lu Y. Learning dispatching rules via novel genetic programming with feature selection in energy-aware dynamic job-shop scheduling. Sci Rep 2023; 13:8558. [PMID: 37236998 DOI: 10.1038/s41598-023-34951-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Accepted: 05/10/2023] [Indexed: 05/28/2023] Open
Abstract
The incorporation of energy conservation measures into production efficiency is widely recognized as a crucial aspect of contemporary industry. This study aims to develop interpretable and high-quality dispatching rules for energy-aware dynamic job shop scheduling (EDJSS). In contrast to traditional modeling methods, this paper proposes a novel genetic programming (GP) method with an online feature selection mechanism to learn dispatching rules automatically. The idea of the novel GP method is to achieve a progressive transition from exploration to exploitation by relating the level of population diversity to the stopping criteria and elapsed duration. We hypothesize that diverse and promising individuals obtained from the novel GP method can guide the feature selection to design competitive rules. The proposed approach is compared with three GP-based algorithms and 20 benchmark rules under different job shop conditions and scheduling objectives that consider energy consumption. Experiments show that the proposed approach greatly outperforms the compared methods in generating more interpretable and effective rules. Overall, the average improvement over the best-evolved rules by the other three GP-based algorithms is 12.67%, 15.38%, and 11.59% in the makespan with energy consumption (EMS), mean weighted tardiness with energy consumption (EMWT), and mean flow time with energy consumption (EMFT) scenarios, respectively.
Affiliation(s)
- Adilanmu Sitahong
  - School of Mechanical Engineering, Xinjiang University, Urumqi, 830047, China
- Yiping Yuan
  - School of Mechanical Engineering, Xinjiang University, Urumqi, 830047, China
- Ming Li
  - School of Mechanical Engineering, Xinjiang University, Urumqi, 830047, China
- Junyan Ma
  - School of Mechanical Engineering, Xinjiang University, Urumqi, 830047, China
- Zhiyong Ba
  - School of Mechanical Engineering, Xinjiang University, Urumqi, 830047, China
- Yongxin Lu
  - School of Mechanical Engineering, Xinjiang University, Urumqi, 830047, China

9. Long-term survival and second malignant tumor prediction in pediatric, adolescent, and young adult cancer survivors using Random Survival Forests: a SEER analysis. Sci Rep 2023; 13:1911. [PMID: 36732358 PMCID: PMC9894907 DOI: 10.1038/s41598-023-29167-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/01/2022] [Accepted: 01/31/2023] [Indexed: 02/04/2023] Open
Abstract
Survival and second malignancy prediction models can aid clinical decision making. Most commonly, survival analysis studies are performed using traditional proportional hazards models, which require strong assumptions and can lead to biased estimates if those assumptions are violated. Therefore, this study aims to implement an alternative machine learning (ML) model for survival analysis: Random Survival Forest (RSF). In this study, RSFs were built using the U.S. Surveillance, Epidemiology, and End Results (SEER) database to (1) predict 30-year survival in pediatric, adolescent, and young adult cancer survivors; and (2) predict risk and site of a second tumor within 30 years of the first tumor diagnosis in these age groups. The final RSF model for pediatric, adolescent, and young adult survival has an average concordance index (C-index) of 92.9%, 94.2%, and 94.4% and an average time-dependent area under the receiver operating characteristic curve (AUC) at 30 years since first diagnosis of 90.8%, 93.6%, and 96.1%, respectively. The final RSF model for pediatric, adolescent, and young adult second malignancy has an average C-index of 86.8%, 85.2%, and 88.6% and an average time-dependent AUC at 30 years since first diagnosis of 76.5%, 88.1%, and 99.0%, respectively. This study suggests the robustness and potential clinical value of ML models to alleviate physician burden by quickly identifying the highest-risk individuals.

10. George MM, Tolley NS. AIM in Otolaryngology and Head and Neck Surgery. Artif Intell Med 2022. [DOI: 10.1007/978-3-030-64573-1_198] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/09/2022]

11. Nagy M, Radakovich N, Nazha A. Machine Learning in Oncology: What Should Clinicians Know? JCO Clin Cancer Inform 2021; 4:799-810. [PMID: 32926637 DOI: 10.1200/cci.20.00049] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Indexed: 12/16/2022] Open
Abstract
The volume and complexity of scientific and clinical data in oncology have grown markedly over recent years, including but not limited to the realms of electronic health data, radiographic and histologic data, and genomics. This growth holds promise for a deeper understanding of malignancy and, accordingly, more personalized and effective oncologic care. Such goals require, however, the development of new methods to fully make use of the wealth of available data. Improvements in computer processing power and algorithm development have positioned machine learning, a branch of artificial intelligence, to play a prominent role in oncology research and practice. This review provides an overview of the basics of machine learning and highlights current progress and challenges in applying this technology to cancer diagnosis, prognosis, and treatment recommendations, including a discussion of current takeaways for clinicians.
Affiliation(s)
- Matthew Nagy
  - Cleveland Clinic Lerner College of Medicine of Case Western Reserve University, Cleveland, OH
- Nathan Radakovich
  - Cleveland Clinic Lerner College of Medicine of Case Western Reserve University, Cleveland, OH
- Aziz Nazha
  - Center for Clinical Artificial Intelligence, Cleveland Clinic, Cleveland, OH
  - Department of Hematology and Medical Oncology, Cleveland Clinic, Cleveland, OH

12. Oei RW, Lyu Y, Ye L, Kong F, Du C, Zhai R, Xu T, Shen C, He X, Kong L, Hu C, Ying H. Progression-Free Survival Prediction in Patients with Nasopharyngeal Carcinoma after Intensity-Modulated Radiotherapy: Machine Learning vs. Traditional Statistics. J Pers Med 2021; 11:787. [PMID: 34442430 PMCID: PMC8398698 DOI: 10.3390/jpm11080787] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/30/2021] [Revised: 08/08/2021] [Accepted: 08/10/2021] [Indexed: 12/24/2022] Open
Abstract
Background: The Cox proportional hazards (CPH) model is the most commonly used statistical method for nasopharyngeal carcinoma (NPC) prognostication. Recently, machine learning (ML) models have been increasingly adopted for this purpose. However, only a few studies have compared the performance of CPH and ML models. This study aimed to compare CPH with two state-of-the-art ML algorithms, namely conditional survival forest (CSF) and DeepSurv, for disease progression prediction in NPC. Methods: From January 2010 to March 2013, 412 eligible NPC patients were reviewed. The entire dataset was split into a training cohort and a testing cohort in a ratio of 90%:10%. Ten features from patient-related, disease-related, and treatment-related data were used to train the models for progression-free survival (PFS) prediction. Model performance was compared using the concordance index (c-index), Brier score, and log-rank test based on the risk stratification results. Results: DeepSurv (c-index = 0.68, Brier score = 0.13, log-rank test p = 0.02) achieved the best performance compared to CSF (c-index = 0.63, Brier score = 0.14, log-rank test p = 0.38) and CPH (c-index = 0.57, Brier score = 0.15, log-rank test p = 0.81). Conclusions: Both CSF and DeepSurv outperformed CPH in our relatively small dataset. ML-based survival prediction may guide physicians in choosing the most suitable treatment strategy for NPC patients.
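The concordance index used in this abstract to compare models can be illustrated with a minimal sketch of Harrell's C for right-censored data. This is not the study's implementation: the function name and toy data are hypothetical, pairs with tied follow-up times are skipped for simplicity, and a higher risk score is assumed to mean earlier expected progression.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs ranked correctly by risk score.

    times       -- follow-up time per patient
    events      -- 1/True if the event was observed, 0/False if censored
    risk_scores -- model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore pairs with tied times
        # 'early' is the patient with the shorter follow-up time.
        early, late = (i, j) if times[i] < times[j] else (j, i)
        if not events[early]:
            continue  # shorter time is censored, so the order is unknown
        comparable += 1
        if risk_scores[early] > risk_scores[late]:
            concordant += 1.0  # higher risk progressed first: concordant
        elif risk_scores[early] == risk_scores[late]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (invented for illustration only).
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
perfect_risk = [0.9, 0.7, 0.5, 0.1]
print(concordance_index(times, events, perfect_risk))
```

On this scale, 0.5 corresponds to uninformative ranking and 1.0 to perfect ranking, which is how values such as 0.57, 0.63, and 0.68 in the abstract are usually read.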
Collapse
Affiliation(s)
- Ronald Wihal Oei, Yingchen Lyu, Lulu Ye, Fangfang Kong, Chengrun Du, Ruiping Zhai, Tingting Xu, Chunying Shen, Xiayun He, Lin Kong, Chaosu Hu, Hongmei Ying
- Department of Radiation Oncology, Fudan University Shanghai Cancer Center, Shanghai 200032, China; (R.W.O.); (Y.L.); (L.Y.); (F.K.); (C.D.); (R.Z.); (T.X.); (C.S.); (X.H.); (L.K.); (C.H.)
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, China
- Correspondence (Hongmei Ying): Tel.: +86-21-64175590; Fax: +86-21-6417477
|
13
|
Liu Z, Thapa N, Shaver A, Roy K, Siddula M, Yuan X, Yu A. Using Embedded Feature Selection and CNN for Classification on CCD-INID-V1-A New IoT Dataset. SENSORS 2021; 21:s21144834. [PMID: 34300574 PMCID: PMC8309834 DOI: 10.3390/s21144834] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/27/2021] [Revised: 07/09/2021] [Accepted: 07/12/2021] [Indexed: 11/20/2022]
Abstract
As Internet of Things (IoT) networks expand globally and the number of active devices grows each year, providing better safeguards against threats is becoming more important. An intrusion detection system (IDS) is the most viable solution for mitigating the threat of cyberattacks. Given the many constraints of the ever-changing network environment of IoT devices, an effective yet lightweight IDS is required to detect cyber anomalies and categorize various cyberattacks. Additionally, most publicly available datasets used for research neither reflect recent network behaviors nor originate from IoT networks. To address these issues, this paper makes the following contributions: (1) we create a dataset from IoT networks, namely the Center for Cyber Defense (CCD) IoT Network Intrusion Dataset V1 (CCD-INID-V1); (2) we propose a hybrid lightweight IDS that combines an embedded model (EM) for feature selection with a convolutional neural network (CNN) for attack detection and classification. The proposed method has two models: (a) RCNN, in which Random Forest (RF) is combined with CNN, and (b) XCNN, in which eXtreme Gradient Boosting (XGBoost) is combined with CNN. RF and XGBoost are the embedded models used to remove less impactful features; (3) we perform anomaly (binary) classification and attack-based (multiclass) classification on CCD-INID-V1 and two other IoT datasets, the detection_of_IoT_botnet_attacks_N_BaIoT dataset (BaIoT) and the CIRA-CIC-DoHBrw-2020 dataset (DoH20), to explore the effectiveness of these learning-based security models. Using RCNN, we achieved an area under the receiver operating characteristic (ROC) curve (AUC) score of 0.956 with a runtime of 32.28 s on CCD-INID-V1, 0.999 with a runtime of 71.46 s on BaIoT, and 0.986 with a runtime of 35.45 s on DoH20. Using XCNN, we achieved an AUC score of 0.998 with a runtime of 51.38 s on CCD-INID-V1, 0.999 with a runtime of 72.12 s on BaIoT, and 0.999 with a runtime of 72.91 s on DoH20. Compared to KNN, XCNN required 86.98% less computational time and RCNN required 91.74% less computational time to achieve equally or more accurate anomaly detection. We find that XCNN and RCNN are consistently efficient and scale well; in particular, they are 1000 times faster than KNN on a relatively larger dataset, BaIoT. Finally, we highlight the ability of RCNN and XCNN to detect anomalies accurately with a significant reduction in computational time. This advantage grants flexibility in the IDS placement strategy: our IDS can be placed at a central server as well as on resource-constrained edge devices. Our lightweight IDS requires little training time and hence decreases reaction time to zero-day attacks.
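For readers unfamiliar with the AUC scores reported in this abstract, the following is a minimal sketch of one standard way to compute the area under the ROC curve: as the probability that a randomly chosen positive (attack) sample receives a higher score than a randomly chosen negative (benign) sample. The function name `roc_auc` and the toy labels and scores are hypothetical; this is not the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels -- 1 for the positive class (e.g. attack), 0 for the negative class
    scores -- classifier scores, higher meaning "more likely positive"
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # tied scores count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical toy scores (illustration only).
toy_labels = [0, 0, 1, 1, 0, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]
toy_auc = roc_auc(toy_labels, toy_scores)
```

The pairwise loop is quadratic in the number of samples, which is fine for a small illustration; a rank-based formulation would be the usual choice for datasets of the size discussed above.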
|
14
|
George MM, Tolley NS. AIM in Otolaryngology and Head & Neck Surgery. Artif Intell Med 2021. [DOI: 10.1007/978-3-030-58080-3_198-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
|
15
|
Hussain L, Aziz W, Khan IR, Alkinani MH, Alowibdi JS. Machine learning based congestive heart failure detection using feature importance ranking of multimodal features. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2020; 18:69-91. [PMID: 33525081 DOI: 10.3934/mbe.2021004] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 06/12/2023]
Abstract
In this study, we ranked the multimodal features extracted from Congestive Heart Failure (CHF) and Normal Sinus Rhythm (NSR) subjects. We categorized the ranked features into 1 to 5 categories based on Empirical Receiver Operating Characteristics (EROC) values. Instead of using all multimodal features, we used the high-ranking features to distinguish CHF from normal subjects. We employed machine learning techniques including Decision Tree (DT), Naïve Bayes (NB), SVM Gaussian, SVM RBF and SVM Polynomial. Performance was measured in terms of Sensitivity, Specificity, Positive Predictive Value (PPV), Negative Predictive Value (NPV), Accuracy, False Positive Rate (FPR), and area under the Receiver Operating Characteristic Curve (AUC). The highest detection performance in terms of accuracy and AUC was obtained with all multimodal features using SVM Gaussian, with Sensitivity (93.06%), Specificity (81.82%), Accuracy (88.79%) and AUC (0.95). Using the top five ranked features, SVM Gaussian achieved the highest performance, with accuracy (84.48%) and AUC (0.86); the top nine ranked features with Decision Tree and Naïve Bayes gave accuracy (84.48%) and AUC (0.88); the last thirteen ranked features with SVM Polynomial gave accuracy (80.17%) and AUC (0.84). The findings indicate that the proposed approach with feature ranking can be useful for automatic detection of congestive heart failure patients and can support further decision making by clinicians and physicians in order to decrease the mortality rate.
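The threshold-based measures listed in this abstract all derive from the four cells of a binary confusion matrix. The sketch below shows that arithmetic. The function name `binary_metrics` is hypothetical, and the counts are illustrative only: they are not reported in the abstract and were chosen simply because their ratios round to the sensitivity, specificity and accuracy quoted above.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics for a binary classifier.

    tp, fp, tn, fn -- counts of true positives, false positives,
                      true negatives and false negatives
    """
    return {
        "sensitivity": tp / (tp + fn),            # true positive rate
        "specificity": tn / (tn + fp),            # true negative rate
        "ppv": tp / (tp + fp),                    # positive predictive value
        "npv": tn / (tn + fn),                    # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "fpr": fp / (fp + tn),                    # equals 1 - specificity
    }


# Hypothetical counts (assumption, not data from the paper).
toy_metrics = binary_metrics(tp=67, fp=8, tn=36, fn=5)
toy_percent = {name: round(100 * value, 2) for name, value in toy_metrics.items()}
```

With these assumed counts, `toy_percent` gives 93.06 for sensitivity, 81.82 for specificity and 88.79 for accuracy, which shows how the three reported figures can be mutually consistent for a single confusion matrix.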
Affiliation(s)
- Lal Hussain
- Department of Computer Science & IT, University of Azad Jammu and Kashmir, King Abdullah Campus, 13100, Muzaffarabad, Pakistan
- Department of Computer Science & IT, University of Azad Jammu and Kashmir, Neelum Campus, 13230, Muzaffarabad, Pakistan
- Wajid Aziz, Ishtiaq Rasool Khan, Monagi H Alkinani, Jalal S Alowibdi
- Department of Computer & AI, University of Jeddah, Jeddah, 23890, Saudi Arabia
|