1
Cai Y, Zhaoxiong Y, Zhu W, Wang H. Association between sleep duration, depression and breast cancer in the United States: a national health and nutrition examination survey analysis 2009-2018. Ann Med 2024; 56:2314235. [PMID: 38329808 PMCID: PMC10854439 DOI: 10.1080/07853890.2024.2314235] [Received: 08/30/2023] [Accepted: 12/01/2023] [Indexed: 02/10/2024]
Abstract
OBJECTIVE Breast cancer is the most common cancer in women, threatening both physical and mental health. The epidemiological evidence for association between sleep duration, depression and breast cancer is inconsistent. The aim of this study was to determine the association between them and build machine-learning algorithms to predict breast cancer. METHODS A total of 1,789 participants from the National Health and Nutrition Examination Survey (NHANES) were included in the study, and 263 breast cancer patients were identified. Sleep duration was collected using a standardized questionnaire, and the Nine-item Patient Health Questionnaire (PHQ-9) was used to assess depression. Logistic regression yielded multivariable-adjusted breast cancer odds ratios (OR) and 95% confidence intervals (CI) for sleep duration and depression. Then, six machine learning algorithms, including AdaBoost, random forest, Boost tree, artificial neural network, limit gradient enhancement and support vector machine, were used to predict the development of breast cancer and find out the best algorithm. RESULTS Body mass index (BMI), race and smoking were statistically different between breast cancer and non-breast cancer groups. Participants with depression were associated with breast cancer (OR = 1.99, 95%CI: 1.55-3.51). Compared with 7-9h of sleep, the ORs for <7 and >9 h of sleep were 1.25 (95% CI: 0.85-1.37) and 1.05 (95% CI: 0.95-1.15), respectively. The AdaBoost model outperformed other machine learning algorithms and predicted well for breast cancer, with an area under curve (AUC) of 0.84 (95%CI: 0.81-0.87). CONCLUSIONS No significant association was observed between sleep duration and breast cancer, and participants with depression were associated with an increased risk for breast cancer. This finding provides new clues into the relationship between breast cancer and depression and sleep duration, and provides potential evidence for subsequent studies of pathological mechanisms.
Affiliation(s)
- Yufan Cai
- Zhongshan Hospital of Fudan University, Shanghai, China
- Wei Zhu
- Zhongshan Hospital of Fudan University, Shanghai, China
- Haiyu Wang
- Zhongshan Hospital of Fudan University, Shanghai, China
2
Ma P, Shang S, Liu R, Dong Y, Wu J, Gu W, Yu M, Liu J, Li Y, Chen Y. Prediction of teicoplanin plasma concentration in critically ill patients: a combination of machine learning and population pharmacokinetics. J Antimicrob Chemother 2024; 79:2815-2827. [PMID: 39207798 DOI: 10.1093/jac/dkae292] [Received: 03/19/2024] [Accepted: 08/02/2024] [Indexed: 09/04/2024]
Abstract
BACKGROUND Teicoplanin has been widely used in patients with infections caused by Staphylococcus aureus, especially for critically ill patients. The pharmacokinetics (PK) of teicoplanin vary between individuals and within the same individual. We aim to establish a prediction model via a combination of machine learning and population PK (PPK) to support personalized medication decisions for critically ill patients. METHODS A retrospective study was performed incorporating 33 variables, including PPK parameters (clearance and volume of distribution). Multiple algorithms and Shapley additive explanations were employed for feature selection of variables to determine the strongest driving factors. RESULTS The performance of each algorithm with PPK parameters was superior to that without PPK parameters. The composition of support vector regression, categorical boosting and a backpropagation neural network (7:2:1) with the highest R² (0.809) was determined as the final ensemble model. The model included 15 variables after feature selection, of which the predictive performance was superior to that of models considering all variables or using only PPK. The R², mean absolute error, mean squared error, absolute accuracy (±5 mg/L) and relative accuracy (±30%) of external validation were 0.649, 3.913, 28.347, 76.12% and 76.12%, respectively. CONCLUSIONS Our study offers a non-invasive, fast and cost-effective prediction model of teicoplanin plasma concentration in critically ill patients. The model serves as a fundamental tool for clinicians to determine the effective plasma concentration range of teicoplanin and formulate individualized dosing regimens accordingly.
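The 7:2:1 composition and the two accuracy measures reported in this abstract can be illustrated with a minimal Python sketch. It assumes that the ratio denotes a weighted average of the three base models' predictions, and that absolute and relative accuracy are the shares of predictions falling within ±5 mg/L and ±30% of the measured concentration. The abstract does not spell out either definition, and all function names and numbers below are hypothetical.

```python
def weighted_ensemble(predictions, weights):
    """Blend per-model prediction lists; weights are normalised by their sum."""
    total = sum(weights)
    n = len(predictions[0])
    return [
        sum(w * model[i] for w, model in zip(weights, predictions)) / total
        for i in range(n)
    ]


def absolute_accuracy(y_true, y_pred, tol=5.0):
    """Share of predictions within +/- tol (same unit as y_true) of the truth."""
    hits = sum(abs(p - t) <= tol for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


def relative_accuracy(y_true, y_pred, tol=0.30):
    """Share of predictions within +/- tol (as a fraction) of the truth."""
    hits = sum(abs(p - t) <= tol * abs(t) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


# Made-up concentrations in mg/L, not data from the study.
svr = [10.0, 20.0, 30.0]
catboost = [12.0, 18.0, 40.0]
bpnn = [8.0, 25.0, 20.0]
observed = [11.0, 19.0, 40.0]

# Assumed reading of "7:2:1": weights 0.7, 0.2 and 0.1 after normalisation.
blended = weighted_ensemble([svr, catboost, bpnn], weights=[7, 2, 1])
abs_acc = absolute_accuracy(observed, blended)
rel_acc = relative_accuracy(observed, blended)
```

With these toy values the third prediction misses the ±5 mg/L band but stays inside the ±30% band, which shows why the two measures can differ on the same data.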
Affiliation(s)
- Pan Ma
- Department of Pharmacy, The First Affiliated Hospital of Army Medical University, Chongqing 400038, China
- Shenglan Shang
- Department of Clinical Pharmacy, General Hospital of Central Theater Command, Wuhan, Hubei Province 430070, China
- Ruixiang Liu
- Department of Pharmacy, The First Affiliated Hospital of Army Medical University, Chongqing 400038, China
- Yuzhu Dong
- Department of Pharmacy, The Third Affiliated Hospital of Chongqing Medical University, Chongqing 401120, China
- Jiangfan Wu
- Department of Pharmacy, The First Affiliated Hospital of Chongqing Medical University, Chongqing 400016, China
- Wenrui Gu
- Department of Pharmacy, The First Affiliated Hospital of Army Medical University, Chongqing 400038, China
- Mengchen Yu
- Department of Clinical Pharmacy, General Hospital of Central Theater Command, Wuhan, Hubei Province 430070, China
- Jing Liu
- Department of Clinical Pharmacy, General Hospital of Central Theater Command, Wuhan, Hubei Province 430070, China
- Ying Li
- Medical Big Data and Artificial Intelligence Center, The First Affiliated Hospital of Army Medical University, Chongqing 400038, China
- Yongchuan Chen
- Department of Pharmacy, The First Affiliated Hospital of Army Medical University, Chongqing 400038, China
3
Ma P, Shang S, Huang Y, Liu R, Yu H, Zhou F, Yu M, Xiao Q, Zhang Y, Ding Q, Nie Y, Wang Z, Chen Y, Yu A, Shi Q. Joint use of population pharmacokinetics and machine learning for prediction of valproic acid plasma concentration in elderly epileptic patients. Eur J Pharm Sci 2024; 201:106876. [PMID: 39128815 DOI: 10.1016/j.ejps.2024.106876] [Received: 07/05/2024] [Revised: 07/31/2024] [Accepted: 08/08/2024] [Indexed: 08/13/2024]
Abstract
BACKGROUND Valproic acid (VPA) is a commonly used broad-spectrum antiepileptic drug. In elderly epileptic patients, VPA plasma concentrations vary considerably. We aim to establish a prediction model for VPA plasma concentration via a combination of machine learning and population pharmacokinetics (PPK). METHODS A retrospective study was performed incorporating 43 variables, including PPK parameters. Recursive Feature Elimination with Cross-Validation was used for feature selection. Multiple algorithms were combined into an ensemble model, and the model was interpreted by Shapley Additive exPlanations. RESULTS The inclusion of PPK parameters significantly enhanced the performance of the individual algorithm models. The composition of categorical boosting, light gradient boosting machine, and random forest (7:2:1) with the highest R² (0.74) was determined as the ensemble model. The model included 11 variables after feature selection, and its predictive performance was comparable to that of the model incorporating all variables. CONCLUSIONS Our model was specifically tailored for elderly epileptic patients, providing an efficient and cost-effective approach to predict VPA plasma concentration. The model combined classical PPK with machine learning, and underwent optimization through feature selection and algorithm integration. Our model can serve as a fundamental tool for clinicians in determining VPA plasma concentration and individualized dosing regimens accordingly.
Affiliation(s)
- Pan Ma
- State Key Laboratory of Ultrasound in Medicine and Engineering, College of Biomedical Engineering, Chongqing Medical University, Chongqing 400016, China; Chongqing Key Laboratory of Biomedical Engineering, Chongqing Medical University, Chongqing 400016, China; Department of Pharmacy, the First Affiliated Hospital of Army Medical University, No. 29 Gaotanyan Street, Chongqing 400038, China
- Shenglan Shang
- Department of Clinical Pharmacy, General Hospital of Central Theater Command, No. 627 Wuluo Street, Wuhan City, Hubei Province 430070, China
- Yifan Huang
- Medical Big Data and Artificial Intelligence Center, the First Affiliated Hospital of Army Medical University, Chongqing 400038, China
- Ruixiang Liu
- Department of Pharmacy, the First Affiliated Hospital of Army Medical University, No. 29 Gaotanyan Street, Chongqing 400038, China
- Hongfan Yu
- State Key Laboratory of Ultrasound in Medicine and Engineering, College of Biomedical Engineering, Chongqing Medical University, Chongqing 400016, China; Chongqing Key Laboratory of Biomedical Engineering, Chongqing Medical University, Chongqing 400016, China
- Fan Zhou
- Department of Clinical Pharmacy, General Hospital of Central Theater Command, No. 627 Wuluo Street, Wuhan City, Hubei Province 430070, China
- Mengchen Yu
- Department of Clinical Pharmacy, General Hospital of Central Theater Command, No. 627 Wuluo Street, Wuhan City, Hubei Province 430070, China
- Qin Xiao
- Department of Pharmacy, Shengjing Hospital, China Medical University, Shenyang 110002, China
- Ying Zhang
- Department of Clinical Pharmacy, General Hospital of Central Theater Command, No. 627 Wuluo Street, Wuhan City, Hubei Province 430070, China
- Qianxue Ding
- Department of Clinical Pharmacy, General Hospital of Central Theater Command, No. 627 Wuluo Street, Wuhan City, Hubei Province 430070, China
- Yuxian Nie
- State Key Laboratory of Ultrasound in Medicine and Engineering, College of Biomedical Engineering, Chongqing Medical University, Chongqing 400016, China
- Zhibiao Wang
- State Key Laboratory of Ultrasound in Medicine and Engineering, College of Biomedical Engineering, Chongqing Medical University, Chongqing 400016, China
- Yongchuan Chen
- Department of Pharmacy, the First Affiliated Hospital of Army Medical University, No. 29 Gaotanyan Street, Chongqing 400038, China.
- Airong Yu
- Department of Clinical Pharmacy, General Hospital of Central Theater Command, No. 627 Wuluo Street, Wuhan City, Hubei Province 430070, China.
- Qiuling Shi
- State Key Laboratory of Ultrasound in Medicine and Engineering, College of Biomedical Engineering, Chongqing Medical University, Chongqing 400016, China; School of Public Health, Chongqing Medical University, Chongqing 400016, China.
4
Su Y, Li Y, Yang W, Luo X, Chen L. Optimized machine learning model for predicting unplanned reoperation after rectal cancer anterior resection. Eur J Surg Oncol 2024; 50:108703. [PMID: 39326305 DOI: 10.1016/j.ejso.2024.108703] [Received: 04/15/2024] [Revised: 09/12/2024] [Accepted: 09/19/2024] [Indexed: 09/28/2024]
Abstract
BACKGROUND Unplanned reoperation (URO) after surgery adversely affects the quality of life and prognosis of patients undergoing anterior resection for rectal cancer. This study aims to meet the urgent need for reliable predictive tools by developing an optimized machine learning model to estimate the risk of URO following anterior resection in rectal cancer patients. METHODS This retrospective study collected multidimensional data from patients who underwent anterior resection for rectal cancer at Tongji Hospital of Huazhong University of Science and Technology from January 2012 to December 2022. Feature selection was conducted using both least absolute shrinkage and selection operator (LASSO) regression and the Boruta algorithm. Multiple machine learning models were developed, with parameter optimization via grid search and cross-validation. Performance metrics included accuracy, specificity, sensitivity, and area under curve (AUC). The optimal model was interpreted using SHapley Additive exPlanations (SHAP), and an online platform was created for real-time risk prediction. RESULTS A total of 2384 patients who underwent anterior resection for rectal cancer were included in this study. Following rigorous selection, 14 variables were identified for constructing the machine learning model. The optimized model demonstrated high predictive accuracy, with the random forest (RF) model achieving the best overall performance. The model achieved an AUC of 0.889 and an accuracy of 0.842 on the test dataset. SHAP analysis revealed that the tumor location, previous abdominal surgery, and operative time were the most significant factors influencing the risk of URO. CONCLUSION This study developed an optimized machine learning-based online predictive system to assess the risk of URO after anterior resection in rectal cancer patients. 
Accessible at https://yangsu2023.shinyapps.io/UROrisk/, this system improves prediction accuracy and offers real-time risk assessment, providing a valuable tool that may support clinical decision-making and potentially improve the prognosis of rectal cancer patients.
Affiliation(s)
- Yang Su
- Department of Gastrointestinal Surgery Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China; Molecular Medicine Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China
- Yanqi Li
- Department of Gastrointestinal Surgery Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China; Molecular Medicine Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China
- Wangshuo Yang
- Department of Gastrointestinal Surgery Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China; Molecular Medicine Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China
- Xuelai Luo
- Department of Gastrointestinal Surgery Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China; Molecular Medicine Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China
- Lisheng Chen
- Department of Gastrointestinal Surgery Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China; Molecular Medicine Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, China.
5
Lanjewar MG, Panchbhai KG, Patle LB. Sugar detection in adulterated honey using hyper-spectral imaging with stacking generalization method. Food Chem 2024; 450:139322. [PMID: 38613963 DOI: 10.1016/j.foodchem.2024.139322] [Received: 12/29/2023] [Revised: 03/26/2024] [Accepted: 04/08/2024] [Indexed: 04/15/2024]
Abstract
This paper develops a new hybrid, automated, and non-invasive approach by combining hyper-spectral imaging, Savitzky-Golay (SG) Filter, Principal Components Analysis (PCA), Machine Learning (ML) classifiers/regressors, and stacking generalization methods to detect sugar in honey. First, the 32 different sugar concentration levels in honey were predicted using various ML regressors. Second, the six ranges of sugar were classified using various classifiers. Third, the 11 types of honey and 100% sugar were classified using classifiers. The stacking model (STM) obtained R²: 0.999, RMSE: 0.493 ml (v/v), RPD: 40.2, a 10-fold average R²: 0.996 and RMSE: 1.27 ml (v/v) for predicting 32 sugar concentrations. The STM achieved a Matthews Correlation Coefficient (MCC) of 99.7% and a Kappa score of 99.7%, a 10-fold average MCC of 98.9% and a Kappa score of 98.9% for classifying the six sugar ranges and the 12 categories (11 honey types and 100% sugar).
Affiliation(s)
- Madhusudan G Lanjewar
- School of Physical and Applied Sciences, Goa University, Taleigao Plateau, Goa 403206, India.
- Lalchand B Patle
- PG Department of Electronics, MGSM's DDSGP College Chopda, KBCNMU, Jalgaon 425107, Maharashtra, India
6
Seu MY, Rezania N, Murray CE, Qiao MT, Arnold S, Siotos C, Ferraro J, Jazayeri HE, Hood K, Shenaq D, Kokosis G. Predicting Reduction Mammaplasty Total Resection Weight With Machine Learning. Ann Plast Surg 2024; 93:246-252. [PMID: 38833662 DOI: 10.1097/sap.0000000000004016] [Indexed: 06/06/2024]
Abstract
BACKGROUND Machine learning (ML) is a form of artificial intelligence that has been used to create better predictive models in medicine. Using ML algorithms, we sought to create a predictive model for breast resection weight based on anthropometric measurements. METHODS We analyzed 237 patients (474 individual breasts) who underwent reduction mammaplasty at our institution. Anthropometric variables included body surface area (BSA), body mass index, sternal notch-to-nipple (SN-N), and nipple-to-inframammary fold values. Four different ML algorithms (linear regression, ridge regression, support vector regression, and random forest regression), either including or excluding the Schnur Scale prediction for the same data, were trained and tested on their ability to recognize the relationship between the anthropometric variables and total resection weights. Resection weight prediction accuracy for each model and for the Schnur Scale alone was evaluated using mean absolute error (MAE). RESULTS In our cohort, mean age was 40.36 years. Most patients (71.61%) were African American. Mean BSA was 2.0 m², mean body mass index was 33.045 kg/m², mean SN-N was 35.0 cm, and mean nipple-to-inframammary fold was 16.0 cm. Mean SN-N was found to have the greatest variable importance. All 4 models made resection weight predictions with MAE lower than that of the Schnur Scale alone in both the training and testing datasets. Overall, the random forest regression model without Schnur Scale weight had the lowest MAE at 186.20. CONCLUSION Our ML resection weight prediction model represents an accurate and promising alternative to the Schnur Scale in the setting of reduction mammaplasty consultations.
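The comparison in this abstract rests on mean absolute error, the average absolute difference between predicted and actual resection weights. A minimal Python sketch of that comparison follows. The weights are made-up values in grams (the unit is an assumption), and the function and variable names are hypothetical rather than taken from the study.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over paired observations."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up resection weights (grams assumed), not data from the study.
actual_g = [500.0, 750.0, 1000.0]
model_pred_g = [450.0, 800.0, 900.0]      # hypothetical ML model output
baseline_pred_g = [300.0, 600.0, 700.0]   # hypothetical baseline estimate

model_mae = mean_absolute_error(actual_g, model_pred_g)
baseline_mae = mean_absolute_error(actual_g, baseline_pred_g)
```

A lower MAE for the model than for the baseline on held-out data is the kind of evidence the abstract summarizes; the toy numbers here only show how the metric is computed.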
Affiliation(s)
- Nikki Rezania
- From the Division of Plastic & Reconstructive Surgery, Rush University Medical Center, Chicago, IL
- Carolyn E Murray
- From the Division of Plastic & Reconstructive Surgery, Rush University Medical Center, Chicago, IL
- Mark T Qiao
- From the Division of Plastic & Reconstructive Surgery, Rush University Medical Center, Chicago, IL
- Sydney Arnold
- From the Division of Plastic & Reconstructive Surgery, Rush University Medical Center, Chicago, IL
- Charalampos Siotos
- From the Division of Plastic & Reconstructive Surgery, Rush University Medical Center, Chicago, IL
- Jennifer Ferraro
- From the Division of Plastic & Reconstructive Surgery, Rush University Medical Center, Chicago, IL
- Hossein E Jazayeri
- Section of Oral and Maxillofacial Surgery, Department of Surgery, Michigan Medicine, Ann Arbor, MI
- Keith Hood
- From the Division of Plastic & Reconstructive Surgery, Rush University Medical Center, Chicago, IL
- Deana Shenaq
- From the Division of Plastic & Reconstructive Surgery, Rush University Medical Center, Chicago, IL
- George Kokosis
- From the Division of Plastic & Reconstructive Surgery, Rush University Medical Center, Chicago, IL
7
Uddin MG, Rana MSP, Diganta MTM, Bamal A, Sajib AM, Abioui M, Shaibur MR, Ashekuzzaman S, Nikoo MR, Rahman A, Moniruzzaman M, Olbert AI. Enhancing groundwater quality assessment in coastal area: A hybrid modeling approach. Heliyon 2024; 10:e33082. [PMID: 39027495 PMCID: PMC11255574 DOI: 10.1016/j.heliyon.2024.e33082] [Received: 05/06/2024] [Revised: 06/12/2024] [Accepted: 06/13/2024] [Indexed: 07/20/2024]
Abstract
Monitoring of groundwater (GW) resources in coastal areas is vital for human needs, agriculture, ecosystems, securing water supply, biodiversity, and environmental sustainability. Although the utilization of water quality index (WQI) models has proven effective in monitoring GW resources, it has faced substantial criticism due to its inconsistent outcomes, prompting the need for more reliable assessment methods. Therefore, this study addressed this concern by employing the data-driven root mean squared (RMS) models to evaluate groundwater quality (GWQ) in the coastal Bhola district near the Bay of Bengal, Bangladesh. To enhance the reliability of the RMS-WQI model, the research incorporated the extreme gradient boosting (XGBoost) machine learning (ML) algorithm. For the assessment of GWQ, the study utilized eleven crucial indicators, including turbidity (TURB), electric conductivity (EC), pH, total dissolved solids (TDS), nitrate (NO₃⁻), ammonium (NH₄⁺), sodium (Na), potassium (K), magnesium (Mg), calcium (Ca), and iron (Fe). In terms of the GW indicators, concentration of K, Ca and Mg exceeded the guideline limit in the collected GW samples. The computed RMS-WQI scores ranged from 54.3 to 72.1, with an average of 65.2, categorizing all sampling sites' GWQ as "fair." In terms of model reliability, XGBoost demonstrated exceptional sensitivity (R² = 0.97) in predicting GWQ accurately. Furthermore, the RMS-WQI model exhibited minimal uncertainty (<1%) in predicting WQI scores. These findings implied the efficacy of the RMS-WQI model in accurately assessing GWQ in coastal areas, that would ultimately assist regional environmental managers and strategic planners for effective monitoring and sustainable management of coastal GW resources.
Affiliation(s)
- Md Galal Uddin
- School of Engineering, University of Galway, Ireland
- Ryan Institute, University of Galway, Ireland
- MaREI Research Centre, University of Galway, Ireland
- Eco-HydroInformatics Research Group (EHIRG), Civil Engineering, University of Galway, Ireland
- M.M. Shah Porun Rana
- The Department of Geography and Environment, Jagannath University, Dhaka, Bangladesh
- Mir Talas Mahammad Diganta
- School of Engineering, University of Galway, Ireland
- Ryan Institute, University of Galway, Ireland
- MaREI Research Centre, University of Galway, Ireland
- Eco-HydroInformatics Research Group (EHIRG), Civil Engineering, University of Galway, Ireland
- Apoorva Bamal
- School of Engineering, University of Galway, Ireland
- Ryan Institute, University of Galway, Ireland
- MaREI Research Centre, University of Galway, Ireland
- Eco-HydroInformatics Research Group (EHIRG), Civil Engineering, University of Galway, Ireland
- Abdul Majed Sajib
- School of Engineering, University of Galway, Ireland
- Ryan Institute, University of Galway, Ireland
- MaREI Research Centre, University of Galway, Ireland
- Eco-HydroInformatics Research Group (EHIRG), Civil Engineering, University of Galway, Ireland
- Mohamed Abioui
- Geosciences, Environment and Geomatics Laboratory (GEG), Department of Earth Sciences, Faculty of Sciences, Ibnou Zohr University, Agadir, Morocco
- MARE-Marine and Environmental Sciences Centre-Sedimentary Geology Group, Department of Earth Sciences, Faculty of Sciences and Technology, University of Coimbra, Coimbra, Portugal
- Laboratory for Sustainable Innovation and Applied Research, Universiapolis—International University of Agadir, Agadir, Morocco
- Molla Rahman Shaibur
- Laboratory of Environmental Chemistry, Department of Environmental Science and Technology, Faculty of Applied Science and Technology, Jashore University of Science and Technology, Jashore, 7408, Bangladesh
- S.M. Ashekuzzaman
- Department of Civil, Structural and Environmental Engineering, and Sustainable Infrastructure Research & Innovation Group, Munster Technological University, Cork, Ireland
- Mohammad Reza Nikoo
- Department of Civil and Architectural Engineering, Sultan Qaboos University, Muscat, Oman
- Azizur Rahman
- School of Computing, Mathematics and Engineering, Charles Sturt University, Wagga Wagga, Australia
- The Gulbali Institute of Agriculture, Water and Environment, Charles Sturt University, Wagga Wagga, Australia
- Md Moniruzzaman
- The Department of Geography and Environment, Jagannath University, Dhaka, Bangladesh
- Agnieszka I. Olbert
- School of Engineering, University of Galway, Ireland
- Ryan Institute, University of Galway, Ireland
- MaREI Research Centre, University of Galway, Ireland
- Eco-HydroInformatics Research Group (EHIRG), Civil Engineering, University of Galway, Ireland
8
Xiao H, Tian Y, Gao H, Cui X, Dong S, Xue Q, Yao D. Analysis of the fatigue status of medical security personnel during the closed-loop period using multiple machine learning methods: a case study of the Beijing 2022 Olympic Winter Games. Sci Rep 2024; 14:8987. [PMID: 38637575 PMCID: PMC11026406 DOI: 10.1038/s41598-024-59397-6] [Received: 12/20/2023] [Accepted: 04/10/2024] [Indexed: 04/20/2024]
Abstract
This study used machine learning methods to analyze the fatigue status of medical security personnel and the factors influencing fatigue (such as BMI, gender, and working hours in protective clothing), with the goal of identifying the key factors contributing to fatigue. By validating the predicted outcomes, actionable and practical recommendations can be offered to improve fatigue status, such as reducing working hours in protective clothing. A questionnaire was designed to assess the fatigue status of medical security personnel during the closed-loop period, aiming to capture information on fatigue experienced during work and disease recovery. The collected data were then preprocessed and used to determine the structural parameters for each machine learning algorithm. To evaluate the prediction performance of different models, the mean relative error (MRE) and goodness of fit (R²) between the true and predicted values were calculated. Furthermore, the importance rankings of various parameters in relation to fatigue status were determined using the RF feature importance analysis method. The fatigue status of medical security personnel during the closed-loop period was analyzed using multiple machine learning methods. The prediction performance of these methods was ranked from highest to lowest as follows: Gradient Boosting Regression (GBM) > Random Forest (RF) > Adaptive Boosting (AdaBoost) > K-Nearest Neighbors (KNN) > Support Vector Regression (SVR). Among these algorithms, four out of the five achieved good prediction results, with the GBM method performing the best. The five most critical parameters influencing fatigue status were identified as working hours in protective clothing, a customized symptom and disease score (CSDS), physical exercise, body mass index (BMI), and age, all of which had importance scores exceeding 0.06.
Notably, working hours in protective clothing obtained the highest importance score of 0.54, making it the most critical factor impacting fatigue status. Fatigue is a prevalent and pressing issue among medical security personnel operating in closed-loop environments. In our investigation, we observed that the GBM method exhibited superior predictive performance in determining the fatigue status of medical security personnel during the closed-loop period, surpassing other machine learning techniques. Notably, our analysis identified several critical factors influencing the fatigue status of medical security personnel, including the duration of working hours in protective clothing, CSDS, and engagement in physical exercise. These findings shed light on the multifaceted nature of fatigue among healthcare workers and emphasize the importance of considering various contributing factors. To effectively alleviate fatigue, prudent management of working hours for security personnel and minimizing the duration of wearing protective clothing are promising strategies. Furthermore, promoting regular physical exercise among medical security personnel can significantly impact fatigue reduction. Additionally, the exploration of medication interventions and the adoption of innovative protective clothing options present potential avenues for mitigating fatigue. The insights derived from this study offer valuable guidance to management personnel involved in organizing large-scale events, enabling them to make informed decisions and implement targeted interventions to address fatigue among medical security personnel. In our upcoming research, we will further expand the fatigue dataset while considering higher-precision prediction algorithms, such as XGBoost and ensemble models, and explore their potential contributions to our research.
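The two evaluation measures named in this abstract, mean relative error (MRE) and goodness of fit (R²), can be sketched in a few lines of Python. The definitions below are the common textbook ones (mean of |true − predicted| / |true|, and one minus the ratio of residual to total sum of squares); the abstract does not state its exact formulas, so treat them as assumptions. The scores and names are hypothetical.

```python
def mean_relative_error(y_true, y_pred):
    """Mean of |true - predicted| / |true|; assumes no true value is zero."""
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up fatigue scores, not data from the study.
true_scores = [2.0, 4.0, 6.0, 8.0]
pred_scores = [2.5, 3.5, 6.0, 9.0]

mre = mean_relative_error(true_scores, pred_scores)
r2 = r_squared(true_scores, pred_scores)
```

A lower MRE and a higher R² both indicate a closer fit, which is the basis on which a ranking of several regressors, such as the one reported above, would be made.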
Affiliation(s)
- Hao Xiao
- Department of Emergency, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, China
- Yingping Tian
- Department of Emergency, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, China
- Hengbo Gao
- Department of Emergency, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, China
- Xiaolei Cui
- Department of Emergency, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, China
- Shimin Dong
- Department of Emergency, The Third Hospital of Hebei Medical University, Shijiazhuang, 050000, China
- Qianlong Xue
- Department of Emergency, The First Affiliated Hospital of Hebei North University, Zhangjiakou, 075000, China
- Dongqi Yao
- Department of Emergency, The Second Hospital of Hebei Medical University, Shijiazhuang, 050000, China.
9
Feng S, Wang S, Liu C, Wu S, Zhang B, Lu C, Huang C, Chen T, Zhou C, Zhu J, Chen J, Xue J, Wei W, Zhan X. Prediction model for spinal cord injury in spinal tuberculosis patients using multiple machine learning algorithms: a multicentric study. Sci Rep 2024; 14:7691. [PMID: 38565845 PMCID: PMC10987632 DOI: 10.1038/s41598-024-56711-0] [Received: 11/29/2023] [Accepted: 03/09/2024] [Indexed: 04/04/2024]
Abstract
Spinal cord injury (SCI) is a prevalent and serious complication among patients with spinal tuberculosis (STB) that can lead to motor and sensory impairment and potentially paraplegia. This research aims to identify factors associated with SCI in STB patients and to develop a clinically significant predictive model. Clinical data from STB patients at a single hospital were collected and divided into training and validation sets. Univariate analysis was employed to screen clinical indicators in the training set. Multiple machine learning (ML) algorithms were utilized to establish predictive models. Model performance was evaluated and compared using receiver operating characteristic (ROC) curves, area under the curve (AUC), calibration curve analysis, decision curve analysis (DCA), and precision-recall (PR) curves. The optimal model was determined, and a prospective cohort from two other hospitals served as a testing set to assess its accuracy. Model interpretation and variable importance ranking were conducted using the DALEX R package. The model was deployed on the web by using the Shiny app. Ten clinical characteristics were utilized for the model. The random forest (RF) model emerged as the optimal choice based on the AUC, PRs, calibration curve analysis, and DCA, achieving a test set AUC of 0.816. Additionally, MONO was identified as the primary predictor of SCI in STB patients through variable importance ranking. The RF predictive model provides an efficient and swift approach for predicting SCI in STB patients.
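The AUC used above to compare models can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The following is a minimal sketch of that definition in plain Python; the function name `roc_auc` and the labels and scores are hypothetical illustrations, not data or code from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities (1 = SCI, 0 = no SCI); not study data.
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(y_true, y_score)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; a rank-based computation would be preferable for large cohorts.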
Affiliation(s)
- Sitan Feng
- Department of Spine and Osteopathy Ward, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, People's Republic of China
| | - Shujiang Wang
- Department of Outpatient, General Hospital of Eastern Theater Command, Nanjing, Jiangsu, People's Republic of China
| | - Chong Liu
- Department of Spine and Osteopathy Ward, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, People's Republic of China
| | - Shaofeng Wu
- Department of Spine and Osteopathy Ward, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, People's Republic of China
| | - Bin Zhang
- Department of Spine and Osteopathy Ward, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, People's Republic of China
- Department of Spine Ward, Bei Jing Ji Shui Tan Hospital Gui Zhou Hospital, Guiyang, Guizhou, People's Republic of China
| | - Chunxian Lu
- Department of Spine and Osteopathy Ward, Bai Se People's Hospital, Baise, Guangxi, People's Republic of China
| | - Chengqian Huang
- Department of Spine and Osteopathy Ward, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, People's Republic of China
| | - Tianyou Chen
- Department of Spine and Osteopathy Ward, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, People's Republic of China
| | - Chenxing Zhou
- Department of Spine and Osteopathy Ward, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, People's Republic of China
| | - Jichong Zhu
- Department of Spine and Osteopathy Ward, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, People's Republic of China
| | - Jiarui Chen
- Department of Spine and Osteopathy Ward, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, People's Republic of China
| | - Jiang Xue
- Department of Spine and Osteopathy Ward, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, People's Republic of China
| | - Wendi Wei
- Department of Spine and Osteopathy Ward, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, People's Republic of China
| | - Xinli Zhan
- Department of Spine and Osteopathy Ward, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, People's Republic of China.
| |
|
10
|
Dong C, Zhao L, Liu X, Dang L, Zhang X. Single-cell analysis reveals landscape of endometrial cancer response to estrogen and identification of early diagnostic markers. PLoS One 2024; 19:e0301128. [PMID: 38517922 PMCID: PMC10959392 DOI: 10.1371/journal.pone.0301128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Accepted: 03/08/2024] [Indexed: 03/24/2024]
Abstract
BACKGROUND The development of endometrial cancer (EC) is closely related to the abnormal activation of the estrogen signaling pathway. Effective diagnostic markers are important for the early detection and treatment of EC. METHOD We downloaded single-cell RNA sequencing (scRNA-seq) and spatial transcriptome (ST) data of EC from public databases. Enrichment scores were calculated for EC cell subpopulations using the "AddModuleScore" function and the AUCell package, respectively. Six predictive models were constructed, including logistic regression (LR), Gaussian naive Bayes (GaussianNB), k-nearest neighbor (KNN), support vector machine (SVM), extreme gradient boosting (XGB), and neural network (NK). Subsequently, receiver operating characteristic curves with areas under the curve (AUCs) were used to assess the robustness of the predictive models. RESULT We classified EC cells into six cell clusters, of which the epithelial, fibroblast and endothelial cell clusters had higher estrogen signaling pathway activity. We found that the epithelial cell subtype Epi cluster1, the fibroblast cell subtype Fib cluster3, and the endothelial cell subtype Endo cluster3 all showed early activation of the estrogen response. Based on EC cell subtypes, estrogen-responsive early genes, and genes encoding Stage I and para-cancer differentially expressed proteins in EC patients, a total of 24 early diagnostic markers were identified. The AUC values of all six classifiers were higher than 0.95, which indicates that the early diagnostic markers we screened are robust across different classification algorithms. CONCLUSION Our study elucidates the potential biological mechanism of EC response to estrogen at single-cell resolution, which provides a new direction for early diagnosis of EC.
Affiliation(s)
- Chunli Dong
- Department of Anesthesiology and Operation, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, China
| | - Liyan Zhao
- Department of Anesthesiology and Operation, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, China
| | - Xiongtao Liu
- Department of Anesthesiology and Operation, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, China
| | - Ling Dang
- Department of Obstetrics and Gynecology, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, China
| | - Xin Zhang
- Department of Obstetrics and Gynecology, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, China
| |
|
11
|
Yang K, Liu L, Wen Y. The impact of Bayesian optimization on feature selection. Sci Rep 2024; 14:3948. [PMID: 38366092 PMCID: PMC10873405 DOI: 10.1038/s41598-024-54515-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2023] [Accepted: 02/13/2024] [Indexed: 02/18/2024]
Abstract
Feature selection is an indispensable step for the analysis of high-dimensional molecular data. Despite its importance, consensus is lacking on how to choose the most appropriate feature selection methods, especially when the performance of a feature selection method itself depends on hyper-parameters. Bayesian optimization has demonstrated its advantages in automatically configuring the settings of hyper-parameters for various models. However, it remains unclear whether Bayesian optimization can benefit feature selection methods. In this research, we conducted extensive simulation studies to compare the performance of various feature selection methods, with a particular focus on the impact of Bayesian optimization on those where hyper-parameter tuning is needed. We further utilized the gene expression data obtained from the Alzheimer's Disease Neuroimaging Initiative to predict various brain imaging-related phenotypes, where various feature selection methods were employed to mine the data. We found through simulation studies that feature selection methods with hyper-parameters tuned using Bayesian optimization often yield better recall rates, and the analysis of transcriptomic data further revealed that Bayesian optimization-guided feature selection can improve the accuracy of disease risk prediction models. In conclusion, Bayesian optimization can facilitate feature selection methods when hyper-parameter tuning is needed and has the potential to substantially benefit downstream tasks.
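In a simulation study the truly relevant features are known, so the recall rate of a feature selection method can be computed directly. The sketch below assumes recall means the fraction of truly relevant features that the method selected; the function name `selection_recall` and the gene labels are hypothetical, and the abstract does not state the exact definition the authors used.

```python
def selection_recall(selected, truly_relevant):
    """Fraction of the truly relevant features that were selected."""
    relevant = set(truly_relevant)
    if not relevant:
        raise ValueError("truly_relevant must not be empty")
    return len(relevant & set(selected)) / len(relevant)


# Hypothetical simulation: four causal genes, two selection runs.
causal = ["g1", "g2", "g3", "g4"]
default_run = ["g1", "g2", "g7"]       # e.g. default hyper-parameters
tuned_run = ["g1", "g2", "g3", "g9"]   # e.g. tuned hyper-parameters

recall_default = selection_recall(default_run, causal)
recall_tuned = selection_recall(tuned_run, causal)
print(recall_default, recall_tuned)
```

Recall alone ignores false selections such as `g7` and `g9`, so it is usually reported alongside a precision-type measure.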
Affiliation(s)
- Kaixin Yang
- Department of Health Statistics, School of Public Health, Shanxi Medical University, No 56 Xinjian South Road, Yingze District, Taiyuan, Shanxi, China
| | - Long Liu
- Department of Health Statistics, School of Public Health, Shanxi Medical University, No 56 Xinjian South Road, Yingze District, Taiyuan, Shanxi, China.
| | - Yalu Wen
- Department of Statistics, University of Auckland, 38 Princes Street, Auckland Central, Auckland, 1010, New Zealand.
| |
|
12
|
Alsenan S, Al-Turaiki I, Aldayel M, Tounsi M. Role of Optimization in RNA-Protein-Binding Prediction. Curr Issues Mol Biol 2024; 46:1360-1373. [PMID: 38392205 PMCID: PMC11154364 DOI: 10.3390/cimb46020087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Revised: 01/25/2024] [Accepted: 01/31/2024] [Indexed: 02/24/2024]
Abstract
RNA-binding proteins (RBPs) play an important role in regulating biological processes, such as gene regulation. Understanding their behaviors, for example, their binding sites, can be helpful in understanding RBP-related diseases. Studies have focused on predicting RNA binding by means of machine learning algorithms including deep convolutional neural network models. One of the integral parts of modeling deep learning is achieving optimal hyperparameter tuning and minimizing a loss function using optimization algorithms. In this paper, we investigate the role of optimization in the RBP classification problem using the CLIP-Seq 21 dataset. Three optimization methods are employed on the RNA-protein binding CNN prediction model; namely, grid search, random search, and Bayesian optimizer. The empirical results show an AUC of 94.42%, 93.78%, 93.23% and 92.68% on the ELAVL1C, ELAVL1B, ELAVL1A, and HNRNPC datasets, respectively, and a mean AUC of 85.30% on 24 datasets. This paper's findings provide evidence on the role of optimizers in improving the performance of RNA-protein binding prediction.
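Two of the tuning strategies named above, grid search and random search, differ only in how candidate settings are proposed. The sketch below contrasts them on a made-up objective; `toy_score`, the hyperparameter names (`lr`, `dropout`), and all ranges are assumptions for illustration and are not taken from the paper, where the objective would be a validation AUC from a trained CNN.

```python
import itertools
import random


def toy_score(lr, dropout):
    """Stand-in for a validation score; peaks at lr=0.01, dropout=0.3 (made up)."""
    return 1.0 - (abs(lr - 0.01) * 10 + abs(dropout - 0.3))


def grid_search(lrs, dropouts):
    """Evaluate every combination on a fixed grid and keep the best one."""
    return max(itertools.product(lrs, dropouts), key=lambda p: toy_score(*p))


def random_search(n_trials, seed=0):
    """Sample settings at random (log-uniform lr, uniform dropout)."""
    rng = random.Random(seed)
    trials = [
        (10 ** rng.uniform(-4, -1), rng.uniform(0.0, 0.6))
        for _ in range(n_trials)
    ]
    return max(trials, key=lambda p: toy_score(*p))


best_grid = grid_search([0.001, 0.01, 0.1], [0.1, 0.3, 0.5])
best_random = random_search(50)
print("grid:", best_grid, "random:", best_random)
```

Grid search costs one evaluation per grid cell, while random search has a fixed budget regardless of dimensionality. A Bayesian optimizer would instead use earlier evaluations to choose the next setting, which is not shown here.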
Affiliation(s)
- Shrooq Alsenan
- Information Systems Department, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
| | - Isra Al-Turaiki
- Department of Computer Science, College of Computer and Information Sciences, King Saud University, Riyadh 11653, Saudi Arabia;
| | - Mashael Aldayel
- Information Technology Department, College of Computer and Information Sciences, King Saud University, Riyadh 11451, Saudi Arabia;
| | - Mohamed Tounsi
- Department of Computer Science, College of Computer and information Sciences, Prince Sultan University, P.O. Box 66833, Riyadh 12435, Saudi Arabia;
| |
|
13
|
Uddin MG, Imran MH, Sajib AM, Hasan MA, Diganta MTM, Dabrowski T, Olbert AI, Moniruzzaman M. Assessment of human health risk from potentially toxic elements and predicting groundwater contamination using machine learning approaches. Journal of Contaminant Hydrology 2024; 261:104307. [PMID: 38278020 DOI: 10.1016/j.jconhyd.2024.104307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Revised: 01/10/2024] [Accepted: 01/18/2024] [Indexed: 01/28/2024]
Abstract
The Rooppur Nuclear Power Plant (RNPP) at Ishwardi, Bangladesh is planned to go into operation in 2024, and the areas adjacent to the RNPP are therefore gaining attention from the scientific community for environmental monitoring purposes, especially for water resources management. However, there is a substantial lack of literature as well as environmental datasets for earlier years since very little was done at the beginning of the RNPP's construction phase. Therefore, this study was conducted to assess potentially toxic element (PTE) contamination in the groundwater and its associated health risk for residents in the area adjacent to the RNPP during 2014-2015. To this end, groundwater samples were collected seasonally (dry and wet season) from nine sampling sites and afterwards analyzed for water quality indicators such as temperature (Temp.), pH, electrical conductivity (EC), total dissolved solids (TDS), total hardness (TH) and for PTEs including iron (Fe), manganese (Mn), copper (Cu), lead (Pb), chromium (Cr), cadmium (Cd) and arsenic (As). This study adopted the newly developed Root Mean Square water quality index (RMS-WQI) model to assess the state of PTE contamination in groundwater, whereas the human health risk assessment model was utilized to quantify the risk of toxicity from PTEs. At most of the sampling sites, PTE concentrations were higher during the wet season than the dry season, and Fe, Mn, Cd and As exceeded the guideline limits for drinking water. The RMS score mostly classified the groundwater, in terms of PTE contamination, as being in "Fair" condition. The non-carcinogenic risks (expressed as the Hazard Index, HI) revealed that around 44% and 89% of samples for adults and 67% and 100% of samples for children exceeded the threshold limit set by USEPA (HI > 1) and posed risks through the oral pathway during the dry and wet seasons, respectively. Furthermore, the calculated cumulative HI score was higher for children than for adults throughout the study period. In terms of carcinogenic risk (CR) from PTEs, the magnitude of risk decreased following the pattern of Cr > As > Cd. Although the current study is based on an old dataset, the findings might serve as a baseline for monitoring purposes to reduce future hazardous impacts from the power plant.
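To make the HI > 1 threshold concrete, the sketch below uses a commonly cited USEPA-style ingestion equation: the chronic daily intake is CDI = (C × IR × EF × ED) / (BW × AT), the hazard quotient of each element is CDI / RfD, and HI is the sum of the hazard quotients. This is an assumed formulation, since the abstract does not give the authors' exact equations, and every concentration, reference dose, and exposure parameter below is an illustrative placeholder rather than a value from the study or a regulatory table.

```python
def chronic_daily_intake(conc_mg_per_l, intake_l_per_day,
                         exposure_days_per_year, exposure_years,
                         body_weight_kg):
    """Oral chronic daily intake in mg/kg/day (assumed USEPA-style form)."""
    # For non-carcinogenic risk the averaging time equals the exposure period.
    averaging_time_days = exposure_years * 365
    dose = (conc_mg_per_l * intake_l_per_day
            * exposure_days_per_year * exposure_years)
    return dose / (body_weight_kg * averaging_time_days)


def hazard_index(concentrations, reference_doses, **exposure):
    """Sum of hazard quotients (CDI / RfD) over all elements."""
    return sum(
        chronic_daily_intake(conc, **exposure) / reference_doses[element]
        for element, conc in concentrations.items()
    )


# Placeholder inputs (mg/L and mg/kg/day); NOT measured or regulatory values.
conc = {"As": 0.003, "Mn": 0.14}
rfd = {"As": 0.0003, "Mn": 0.14}

hi_adult = hazard_index(conc, rfd, intake_l_per_day=2.0,
                        exposure_days_per_year=365, exposure_years=30,
                        body_weight_kg=70.0)
hi_child = hazard_index(conc, rfd, intake_l_per_day=1.0,
                        exposure_days_per_year=365, exposure_years=6,
                        body_weight_kg=15.0)
print(f"adult HI = {hi_adult:.3f}, child HI = {hi_child:.3f}")
```

With these placeholders the child HI exceeds the adult HI because intake per kilogram of body weight is larger, which is the usual mechanism behind higher reported risk for children.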
Affiliation(s)
- Md Galal Uddin
- Civil Engineering, School of Engineering, College of Science and Engineering, University of Galway, Ireland; Ryan Institute, University of Galway, Ireland; MaREI Research Centre, University of Galway, Ireland; Eco-HydroInformatics Research Group (EHIRG), Civil Engineering, University of Galway, Ireland; Department of Geography and Environment, Jagannath University, Dhaka, Bangladesh.
| | - Md Hasan Imran
- Department of Environmental Science and Resource Management, Mawlana Bhashani Science and Technology University, Tangail 1902, Bangladesh
| | - Abdul Majed Sajib
- Civil Engineering, School of Engineering, College of Science and Engineering, University of Galway, Ireland; Ryan Institute, University of Galway, Ireland; MaREI Research Centre, University of Galway, Ireland; Eco-HydroInformatics Research Group (EHIRG), Civil Engineering, University of Galway, Ireland
| | - Md Abu Hasan
- Bangladesh Reference institute for Chemical Measurements (BRiCM), Dr. Qudrat-e-Khuda Road, Dhanmondi, Dhaka 1205, Bangladesh
| | - Mir Talas Mahammad Diganta
- Civil Engineering, School of Engineering, College of Science and Engineering, University of Galway, Ireland; Ryan Institute, University of Galway, Ireland; MaREI Research Centre, University of Galway, Ireland; Eco-HydroInformatics Research Group (EHIRG), Civil Engineering, University of Galway, Ireland
| | | | - Agnieszka I Olbert
- Civil Engineering, School of Engineering, College of Science and Engineering, University of Galway, Ireland; Ryan Institute, University of Galway, Ireland; MaREI Research Centre, University of Galway, Ireland; Eco-HydroInformatics Research Group (EHIRG), Civil Engineering, University of Galway, Ireland
| | - Md Moniruzzaman
- Department of Geography and Environment, Jagannath University, Dhaka, Bangladesh
| |
|
14
|
Kang Z, Fan R, Zhan C, Wu Y, Lin Y, Li K, Qing R, Xu L. The Rapid Non-Destructive Differentiation of Different Varieties of Rice by Fluorescence Hyperspectral Technology Combined with Machine Learning. Molecules 2024; 29:682. [PMID: 38338424 PMCID: PMC10856461 DOI: 10.3390/molecules29030682] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 01/27/2024] [Accepted: 01/30/2024] [Indexed: 02/12/2024]
Abstract
A method for the fast, non-destructive differentiation of rice varieties is of considerable current research interest. In this study, fluorescence hyperspectral technology combined with machine learning techniques was used to distinguish five rice varieties by analyzing the fluorescence hyperspectral features of Thai jasmine rice and four rice varieties with a similar appearance to Thai jasmine rice in the wavelength range of 475-1000 nm. The fluorescence hyperspectral data were preprocessed by a first-order derivative (FD) to reduce the background and baseline drift effects of the rice samples. Then, principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE) were used for feature reduction and 3D visualization. Partial least squares discriminant analysis (PLS-DA), a BP neural network (BP), and random forest (RF) were used to build the rice classification models. The RF classification model parameters were optimized using the grey wolf optimizer (GWO). The results show that FD-t-SNE-GWO-RF is the best model for rice classification, with accuracy values of 99.8% and 95.3% for the training and test sets, respectively. The fluorescence hyperspectral technique combined with machine learning is feasible for classifying rice varieties.
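The first-order derivative step can be illustrated with a simple finite difference: a constant background offset added to a spectrum disappears after differencing, which is why the transform reduces baseline effects. The sketch below is a minimal plain-Python version; the function name `first_derivative` and the wavelengths and intensities are hypothetical values, and the authors may have used a smoothed derivative (for example a Savitzky-Golay filter) instead.

```python
def first_derivative(wavelengths_nm, intensities):
    """Forward finite-difference derivative of a spectrum (intensity per nm)."""
    if len(wavelengths_nm) != len(intensities) or len(intensities) < 2:
        raise ValueError("need two equally long sequences with >= 2 points")
    return [
        (intensities[i + 1] - intensities[i])
        / (wavelengths_nm[i + 1] - wavelengths_nm[i])
        for i in range(len(intensities) - 1)
    ]


# Hypothetical spectrum and the same spectrum with a constant offset of +5.
wavelengths = [475, 480, 485, 490]
raw = [10, 14, 13, 20]
shifted = [value + 5 for value in raw]

fd_raw = first_derivative(wavelengths, raw)
fd_shifted = first_derivative(wavelengths, shifted)
print(fd_raw, fd_shifted)
```

Both derivative spectra are identical, so the offset no longer affects downstream steps such as PCA or t-SNE. The output has one point fewer than the input.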
Affiliation(s)
- Zhiliang Kang
- College of Mechanical and Electrical Engineering, Sichuan Agriculture University, Ya’an 625000, China; (Z.K.); (R.F.); (C.Z.); (Y.W.); (Y.L.); (K.L.); (R.Q.)
- Sichuan Research Center for Smart Agriculture Engineering Technology, Ya’an 625000, China
| | - Rongsheng Fan
- College of Mechanical and Electrical Engineering, Sichuan Agriculture University, Ya’an 625000, China; (Z.K.); (R.F.); (C.Z.); (Y.W.); (Y.L.); (K.L.); (R.Q.)
- Sichuan Research Center for Smart Agriculture Engineering Technology, Ya’an 625000, China
| | - Chunyi Zhan
- College of Mechanical and Electrical Engineering, Sichuan Agriculture University, Ya’an 625000, China; (Z.K.); (R.F.); (C.Z.); (Y.W.); (Y.L.); (K.L.); (R.Q.)
- Sichuan Research Center for Smart Agriculture Engineering Technology, Ya’an 625000, China
| | - Youli Wu
- College of Mechanical and Electrical Engineering, Sichuan Agriculture University, Ya’an 625000, China; (Z.K.); (R.F.); (C.Z.); (Y.W.); (Y.L.); (K.L.); (R.Q.)
- Sichuan Research Center for Smart Agriculture Engineering Technology, Ya’an 625000, China
| | - Yi Lin
- College of Mechanical and Electrical Engineering, Sichuan Agriculture University, Ya’an 625000, China; (Z.K.); (R.F.); (C.Z.); (Y.W.); (Y.L.); (K.L.); (R.Q.)
- Sichuan Research Center for Smart Agriculture Engineering Technology, Ya’an 625000, China
| | - Kunyu Li
- College of Mechanical and Electrical Engineering, Sichuan Agriculture University, Ya’an 625000, China; (Z.K.); (R.F.); (C.Z.); (Y.W.); (Y.L.); (K.L.); (R.Q.)
- Sichuan Research Center for Smart Agriculture Engineering Technology, Ya’an 625000, China
| | - Rui Qing
- College of Mechanical and Electrical Engineering, Sichuan Agriculture University, Ya’an 625000, China; (Z.K.); (R.F.); (C.Z.); (Y.W.); (Y.L.); (K.L.); (R.Q.)
- Sichuan Research Center for Smart Agriculture Engineering Technology, Ya’an 625000, China
| | - Lijia Xu
- College of Mechanical and Electrical Engineering, Sichuan Agriculture University, Ya’an 625000, China; (Z.K.); (R.F.); (C.Z.); (Y.W.); (Y.L.); (K.L.); (R.Q.)
- Sichuan Research Center for Smart Agriculture Engineering Technology, Ya’an 625000, China
| |
|
15
|
Uddin MG, Nash S, Rahman A, Dabrowski T, Olbert AI. Data-driven modelling for assessing trophic status in marine ecosystems using machine learning approaches. Environmental Research 2024; 242:117755. [PMID: 38008200 DOI: 10.1016/j.envres.2023.117755] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 07/17/2023] [Revised: 10/05/2023] [Accepted: 11/20/2023] [Indexed: 11/28/2023]
Abstract
Assessing eutrophication in coastal and transitional waters is of utmost importance, yet existing Trophic Status Index (TSI) models face challenges like multicollinearity, data redundancy, inappropriate aggregation methods, and complex classification schemes. To tackle these issues, we developed a novel tool that harnesses machine learning (ML) and artificial intelligence (AI), enhancing the reliability and accuracy of trophic status assessments. Our research introduces an improved data-driven methodology specifically tailored for transitional and coastal (TrC) waters, with a focus on Cork Harbour, Ireland, as a case study. Our innovative approach, named the Assessment Trophic Status Index (ATSI) model, comprises three main components: the selection of pertinent water quality indicators, the computation of ATSI scores, and the implementation of a new classification scheme. To optimize input data and minimize redundancy, we employed ML techniques, including advanced deep learning methods. Specifically, we developed a CHL prediction model utilizing ten algorithms, among which XGBoost demonstrated exceptional performance, showcasing minimal errors during both training (RMSE = 0.0, MSE = 0.0, MAE = 0.01) and testing (RMSE = 0.0, MSE = 0.0, MAE = 0.01) phases. Utilizing a novel linear rescaling interpolation function, we calculated ATSI scores and evaluated the model's sensitivity and efficiency across diverse application domains, employing metrics such as R², the Nash-Sutcliffe efficiency (NSE), and the model efficiency factor (MEF). The results consistently revealed heightened sensitivity and efficiency across all application domains. Additionally, we introduced a brand new classification scheme for ranking the trophic status of transitional and coastal waters. 
To assess spatial sensitivity, we applied the ATSI model to four distinct waterbodies in Ireland, comparing trophic assessment outcomes with the Assessment of Trophic Status of Estuaries and Bays in Ireland (ATSEBI) System. Remarkably, significant disparities between the ATSI and ATSEBI System were evident in all domains, except for Mulroy Bay. Overall, our research significantly enhances the accuracy of trophic status assessments in marine ecosystems. The ATSI model, combined with cutting-edge ML techniques and our new classification scheme, represents a promising avenue for evaluating and monitoring trophic conditions in TrC waters. The study also demonstrated the effectiveness of ATSI in assessing trophic status across various waterbodies, including lakes, rivers, and more. These findings make substantial contributions to the field of marine ecosystem management and conservation.
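Of the efficiency metrics named in this abstract, the Nash-Sutcliffe efficiency has a standard closed form: NSE = 1 − Σ(obs − sim)² / Σ(obs − mean(obs))². A value of 1 indicates a perfect match, and 0 means the model is no better than predicting the observed mean. The sketch below implements that textbook definition; the function name `nash_sutcliffe` and the sample values are hypothetical and unrelated to the Cork Harbour data.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency of simulated values against observations."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("inputs must be non-empty and equally long")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("observations have zero variance; NSE is undefined")
    return 1.0 - residual / variance


# Hypothetical observed and modelled index scores.
obs = [1.0, 2.0, 3.0, 4.0]
nse_perfect = nash_sutcliffe(obs, obs)
nse_mean_model = nash_sutcliffe(obs, [2.5, 2.5, 2.5, 2.5])
print(nse_perfect, nse_mean_model)
```

Negative NSE values are possible and indicate a model that performs worse than the mean of the observations.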
Affiliation(s)
- Md Galal Uddin
- School of Engineering, University of Galway, Ireland; Ryan Institute, University of Galway, Ireland; MaREI Research Centre, University of Galway, Ireland; Eco-HydroInformatics Research Group (EHIRG), Civil Engineering, University of Galway, Ireland.
| | - Stephen Nash
- School of Engineering, University of Galway, Ireland; Ryan Institute, University of Galway, Ireland; MaREI Research Centre, University of Galway, Ireland
| | - Azizur Rahman
- School of Computing, Mathematics and Engineering, Charles Sturt University, Wagga Wagga, Australia; The Gulbali Institute of Agriculture, Water and Environment, Charles Sturt University, Wagga Wagga, Australia
| | | | - Agnieszka I Olbert
- School of Engineering, University of Galway, Ireland; Ryan Institute, University of Galway, Ireland; MaREI Research Centre, University of Galway, Ireland; Eco-HydroInformatics Research Group (EHIRG), Civil Engineering, University of Galway, Ireland
| |
|
16
|
Jian Z, Song T, Zhang Z, Ai Z, Zhao H, Tang M, Liu K. An Improved Nested U-Net Network for Fluorescence In Situ Hybridization Cell Image Segmentation. Sensors (Basel, Switzerland) 2024; 24:928. [PMID: 38339644 PMCID: PMC10857237 DOI: 10.3390/s24030928] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Revised: 01/26/2024] [Accepted: 01/28/2024] [Indexed: 02/12/2024]
Abstract
Fluorescence in situ hybridization (FISH) is a powerful cytogenetic method used to precisely detect and localize nucleic acid sequences. This technique is proving to be an invaluable tool in medical diagnostics and has made significant contributions to biology and the life sciences. However, the number of cells is large and the nucleic acid sequences are disorganized in the FISH images taken using the microscope. Processing and analyzing images is a time-consuming and laborious task for researchers, as it can easily tire the human eyes and lead to errors in judgment. In recent years, deep learning has made significant progress in the field of medical imaging, especially through the successful introduction of the attention mechanism. The attention mechanism, as a key component of deep learning, improves the understanding and interpretation of medical images by giving different weights to different regions of the image, enabling the model to focus more on important features. To address the challenges in FISH image analysis, we combined medical imaging with deep learning to develop SEAM-Unet++, an automated cell contour segmentation algorithm with an integrated attention mechanism. The significant advantage of this algorithm is that it improves the accuracy of cell contours in FISH images. Experiments have demonstrated that by introducing the attention mechanism, our method is able to segment cells that are adherent to each other more efficiently.
Affiliation(s)
| | | | | | | | | | - Man Tang
- School of Electronic and Electrical Engineering, Wuhan Textile University, Wuhan 430200, China; (Z.J.); (T.S.); (Z.Z.); (Z.A.); (H.Z.)
| | - Kan Liu
- School of Electronic and Electrical Engineering, Wuhan Textile University, Wuhan 430200, China; (Z.J.); (T.S.); (Z.Z.); (Z.A.); (H.Z.)
| |
|
17
|
Jain P, Aggarwal S, Adam S, Imam M. Parametric optimization and comparative study of machine learning and deep learning algorithms for breast cancer diagnosis. Breast Dis 2024; 43:257-270. [PMID: 39331085 PMCID: PMC11492030 DOI: 10.3233/bd-240018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/28/2024]
Abstract
Breast cancer is the leading form of cancer found in women and a major cause of increased mortality rates among them. However, manual diagnosis of the disease is time-consuming and often limited by the availability of screening systems. Thus, there is a pressing need for an automatic diagnosis system that can quickly detect cancer in its early stages. Data mining and machine learning techniques have emerged as valuable tools in developing such a system. In this study we investigated the performance of several machine learning models on the Wisconsin Breast Cancer (original) dataset with a particular emphasis on finding which models perform the best for breast cancer diagnosis. The study also explores the contrast between the proposed ANN methodology and conventional machine learning techniques. The methods employed in the current study are also compared with those used in earlier research on the Wisconsin Breast Cancer dataset. The findings of this study are in line with those of previous studies which also highlighted the efficacy of SVM, Decision Tree, CART, ANN, and ELM ANN for breast cancer detection. Several classifiers achieved high accuracy, precision and F1 scores for both benign and malignant tumours. It was also found that models with hyperparameter adjustment performed better than those without, and that boosting methods such as XGBoost, AdaBoost, and Gradient Boosting consistently performed well across benign and malignant tumours. The study emphasizes the significance of hyperparameter tuning and the efficacy of boosting algorithms in addressing the complexity and nonlinearity of data. Using the Wisconsin Breast Cancer (original) dataset, a detailed summary of the current status of research on breast cancer diagnosis is provided.
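Per-class precision and F1 scores of the kind reported above follow directly from counts of true positives, false positives, and false negatives, computed once with each class treated as the positive class. A minimal sketch follows; the function name `class_metrics` and the labels and predictions are hypothetical and are not drawn from the Wisconsin dataset.

```python
def class_metrics(y_true, y_pred, positive):
    """Precision, recall and F1 with `positive` treated as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels: "M" = malignant, "B" = benign.
y_true = ["M", "M", "M", "B", "B", "B", "B", "B"]
y_pred = ["M", "M", "B", "B", "B", "B", "B", "M"]

malignant = class_metrics(y_true, y_pred, positive="M")
benign = class_metrics(y_true, y_pred, positive="B")
print("malignant:", malignant)
print("benign:", benign)
```

Reporting both classes matters on imbalanced data, where a high score for the majority class can hide weak detection of malignant cases.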
Affiliation(s)
- Parul Jain
- Department of Computer Science, Atma Ram Sanatan Dharma College, University of Delhi, New Delhi, India
| | - Shalini Aggarwal
- Department of Computer Science, Atma Ram Sanatan Dharma College, University of Delhi, New Delhi, India
| | - Sufiyan Adam
- Department of Computer Science, Atma Ram Sanatan Dharma College, University of Delhi, New Delhi, India
| | - Mohsin Imam
- Department of Computer Science, Atma Ram Sanatan Dharma College, University of Delhi, New Delhi, India
| |
|
18
|
Yang TH, Chen YF, Cheng YF, Huang JN, Wu CS, Chu YC. Optimizing age-related hearing risk predictions: an advanced machine learning integration with HHIE-S. BioData Min 2023; 16:35. [PMID: 38098102 PMCID: PMC10722728 DOI: 10.1186/s13040-023-00351-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2023] [Accepted: 11/28/2023] [Indexed: 12/17/2023]
Abstract
OBJECTIVES The elderly are disproportionately affected by age-related hearing loss (ARHL). Despite being a well-known tool for ARHL evaluation, the Hearing Handicap Inventory for the Elderly Screening version (HHIE-S) has traditionally been used only for direct screening based on self-reported outcomes. This work uses a novel integration of machine learning approaches to improve the predictive accuracy of the HHIE-S tool for ARHL in older adults. METHODS We employed a dataset that was gathered between 2016 and 2018 and included 1,526 senior citizens from several Taipei City Hospital branches. 80% of the data were used for training (n = 1220) and 20% were used for testing (n = 356). XGBoost, Gradient Boosting, and LightGBM were among the machine learning models that were trained and assessed only on the training set. To prevent data leakage and overfitting, the Light Gradient Boosting Machine (LGBM) model, which had the greatest AUC of 0.83 (95% CI 0.81-0.85), was the only model then applied to the holdout testing data. RESULTS On the testing set, the LGBM model showed a strong AUC of 0.82 (95% CI 0.79-0.86), far outperforming conventional techniques. Notably, several HHIE-S items and age were found to be significant features. In contrast to traditional HHIE research, which concentrates on the psychological effects of hearing loss, this study combines cutting-edge machine learning techniques, specifically the LGBM classifier, with the HHIE-S tool. The incorporation of SHAP values enhances the interpretability of the model's predictions and provides a more comprehensive understanding of the significance of various features. CONCLUSIONS Our methodology highlights the great potential that arises from combining machine learning with validated hearing evaluation instruments such as the HHIE-S. Healthcare practitioners can anticipate ARHL more accurately thanks to this integration, which makes it easier to intervene quickly and precisely.
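The evaluation protocol described here, in which models are compared on the training portion and only the chosen model is scored on a holdout set, depends on a split that keeps the two portions disjoint. The sketch below shows a seeded 80/20 split in plain Python; the function name `holdout_split`, the seed, and the cohort of 100 integer record IDs are hypothetical stand-ins rather than details from the study.

```python
import random


def holdout_split(records, test_fraction=0.2, seed=42):
    """Shuffle once with a fixed seed, then cut off a holdout test portion."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    # The first n_test shuffled records form the holdout set.
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical cohort of 100 participant IDs.
train_ids, test_ids = holdout_split(range(100))
print(len(train_ids), len(test_ids))

# Model comparison (e.g. cross-validation) would use train_ids only;
# test_ids would be scored once, with the single selected model.
```

Fixing the seed makes the split reproducible, and keeping the test records out of every tuning step is what guards against the data leakage mentioned in the abstract.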
Affiliation(s)
- Tzong-Hann Yang
  - Department of Otorhinolaryngology, Taipei City Hospital, Taipei, 100, Taiwan
  - General Education Center, University of Taipei, Taipei, 10671, Taiwan
  - Department of Speech-Language Pathology and Audiology, National Taipei University of Nursing and Health Sciences, Taipei, 112303, Taiwan
  - Department of Otolaryngology-Head and Neck Surgery, School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Yu-Fu Chen
  - Department of Speech-Language Pathology and Audiology, National Taipei University of Nursing and Health Sciences, Taipei, 112303, Taiwan
- Yen-Fu Cheng
  - Department of Medical Research, Taipei Veterans General Hospital, Taipei, 112, Taiwan
  - School of Medicine, National Yang Ming Chiao Tung University, Taipei, 112, Taiwan
  - Department of Otolaryngology-Head and Neck Surgery, Taipei Veterans General Hospital, Taipei, 112, Taiwan
  - Institute of Brain Science, National Yang Ming Chiao Tung University, Taipei, 112, Taiwan
- Jue-Ni Huang
  - Information Management Office, Taipei Veterans General Hospital, Taipei, 112, Taiwan
- Chuan-Song Wu
  - Department of Otorhinolaryngology, Taipei City Hospital, Taipei, 100, Taiwan
  - College of Science and Engineering, Fu Jen University, Taipei, 243, Taiwan
- Yuan-Chia Chu
  - Information Management Office, Taipei Veterans General Hospital, Taipei, 112, Taiwan
  - Big Data Center, Taipei Veterans General Hospital, Taipei, 112, Taiwan
  - Department of Information Management, National Taipei University of Nursing and Health Sciences, Taipei, 112, Taiwan
|
19
|
Tang SY, Chen TH, Kuo KL, Huang JN, Kuo CT, Chu YC. Using artificial intelligence algorithms to predict the overall survival of hemodialysis patients during the COVID-19 pandemic: A prospective cohort study. J Chin Med Assoc 2023; 86:1020-1027. [PMID: 37713313 DOI: 10.1097/jcma.0000000000000994] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/17/2023] Open
Abstract
BACKGROUND Hemodialysis (HD) patients are a vulnerable population at high risk for severe complications from COVID-19. The impact of partial COVID-19 vaccination on the survival of HD patients remains uncertain. This prospective cohort study was designed to use artificial intelligence algorithms to predict the survival impact of partial COVID-19 vaccination in HD patients. METHODS A cohort of 433 HD patients was used to develop machine-learning models based on a subset of clinical features assessed between July 1, 2021, and April 29, 2022. The patient cohort was randomly split into training (80%) and testing (20%) sets for model development and evaluation. Machine-learning models, including categorical boosting (CatBoost), light gradient boosting machines (LightGBM), RandomForest, and extreme gradient boosting models (XGBoost), were applied to evaluate their discriminative performance using the patient cohorts. RESULTS Among these models, LightGBM achieved the highest F1 score of 0.95, followed by CatBoost, RandomForest, and XGBoost, with area under the receiver operating characteristic curve values of 0.94 on the testing dataset. The SHapley Additive explanation summary plot derived from the XGBoost model indicated that key features such as age, albumin, and vaccination details had a significant impact on survival. Furthermore, the fully vaccinated group exhibited higher levels of anti-spike (S) receptor-binding domain antibodies. CONCLUSION This prospective cohort study involved using artificial intelligence algorithms to predict overall survival in HD patients during the COVID-19 pandemic. These predictive models assisted in identifying high-risk individuals and guiding vaccination strategies for HD patients, ultimately improving overall prognosis. Further research is warranted to validate and refine these predictive models in larger and more diverse populations of HD patients.
Affiliation(s)
- Shao-Yu Tang
  - Division of Nephrology, Taipei Tzu Chi Hospital, Buddhist Tzu Chi Medical Foundation, Taipei, Taiwan, ROC
- Tz-Heng Chen
  - Division of Nephrology, Department of Medicine, Taipei Veterans General Hospital, Taipei, Taiwan, ROC
  - School of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan, ROC
  - Institute of Emergency and Critical Care Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan, ROC
- Ko-Lin Kuo
  - Division of Nephrology, Taipei Tzu Chi Hospital, Buddhist Tzu Chi Medical Foundation, Taipei, Taiwan, ROC
- Jue-Ni Huang
  - Information Management Office, Taipei Veterans General Hospital, Taipei, Taiwan, ROC
- Chen-Tsung Kuo
  - Information Management Office, Taipei Veterans General Hospital, Taipei, Taiwan, ROC
  - Big Data Center, Taipei Veterans General Hospital, Taipei, Taiwan, ROC
  - Department of Information Management, National Taipei University of Nursing and Health Sciences, Taipei, Taiwan, ROC
- Yuan-Chia Chu
  - Information Management Office, Taipei Veterans General Hospital, Taipei, Taiwan, ROC
  - Big Data Center, Taipei Veterans General Hospital, Taipei, Taiwan, ROC
  - Department of Information Management, National Taipei University of Nursing and Health Sciences, Taipei, Taiwan, ROC
|
20
|
Uddin MG, Rahman A, Nash S, Diganta MTM, Sajib AM, Moniruzzaman M, Olbert AI. Marine waters assessment using improved water quality model incorporating machine learning approaches. J Environ Manage 2023; 344:118368. [PMID: 37364491 DOI: 10.1016/j.jenvman.2023.118368] [Citation(s) in RCA: 30] [Impact Index Per Article: 30.0] [Received: 03/19/2023] [Revised: 05/06/2023] [Accepted: 06/08/2023] [Indexed: 06/28/2023]
Abstract
In marine ecosystems, both living organisms and non-living components depend on "good" water quality. Ecosystem health depends on a number of factors, and one of the most important is the quality of the water. The water quality index (WQI) model is widely used to assess water quality, but existing models have uncertainty issues. To address this, the authors introduced two new WQI models: the weighted quadratic mean (WQM) model and the unweighted root mean squared (RMS) model. These models were used to assess water quality in the Bay of Bengal, using seven water quality indicators: salinity (SAL), temperature (TEMP), pH, transparency (TRAN), dissolved oxygen (DOX), total oxidized nitrogen (TON), and molybdate reactive phosphorus (MRP). Both models ranked water quality between the "good" and "fair" categories, with no significant difference between the weighted and unweighted models' results. The computed WQI scores varied considerably, ranging from 68 to 88 with an average of 75 for WQM and from 70 to 76 with an average of 72 for RMS. The models did not have any issues with sub-index or aggregation functions, and both had a high level of sensitivity (R² = 1) in terms of the spatio-temporal resolution of waterbodies. The study demonstrated that both WQI approaches effectively assessed marine waters, reducing uncertainty and improving the accuracy of the WQI score.
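To make the two aggregation ideas concrete, here is a minimal sketch of a weighted quadratic mean and an unweighted root mean square over sub-index scores. The formulas are the textbook definitions and are an assumption here, since the abstract does not give the paper's exact aggregation functions; the sub-index values and weights are hypothetical.

```python
import math


def weighted_quadratic_mean(scores, weights):
    """sqrt(sum(w_i * s_i^2) / sum(w_i)) over sub-index scores s_i."""
    if len(scores) != len(weights) or not scores:
        raise ValueError("scores and weights must be non-empty and equal in length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return math.sqrt(sum(w * s * s for s, w in zip(scores, weights)) / total_weight)


def root_mean_square(scores):
    """Unweighted special case: every indicator counts equally."""
    return weighted_quadratic_mean(scores, [1.0] * len(scores))


# Hypothetical sub-index scores (0-100) for SAL, TEMP, pH, TRAN, DOX, TON, MRP.
sub_indices = [80.0, 75.0, 90.0, 60.0, 70.0, 65.0, 72.0]
weights = [0.10, 0.10, 0.15, 0.15, 0.25, 0.15, 0.10]  # illustrative only

print(round(weighted_quadratic_mean(sub_indices, weights), 2))
print(round(root_mean_square(sub_indices), 2))
```

Because both aggregations square the sub-indices before averaging, a few high-scoring indicators pull the index up more than they would in an arithmetic mean.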
Affiliation(s)
- Md Galal Uddin
  - School of Engineering, University of Galway, Ireland; Ryan Institute, University of Galway, Ireland; MaREI Research Centre, University of Galway, Ireland; Eco HydroInformatics Research Group (EHIRG), School of Engineering, College of Science and Engineering, University of Galway, Ireland
- Azizur Rahman
  - School of Computing, Mathematics and Engineering, Charles Sturt University, Wagga Wagga, Australia; The Gulbali Institute of Agriculture, Water and Environment, Charles Sturt University, Wagga Wagga, Australia
- Stephen Nash
  - School of Engineering, University of Galway, Ireland; Ryan Institute, University of Galway, Ireland; MaREI Research Centre, University of Galway, Ireland
- Mir Talas Mahammad Diganta
  - School of Engineering, University of Galway, Ireland; Ryan Institute, University of Galway, Ireland; MaREI Research Centre, University of Galway, Ireland; Eco HydroInformatics Research Group (EHIRG), School of Engineering, College of Science and Engineering, University of Galway, Ireland
- Abdul Majed Sajib
  - School of Engineering, University of Galway, Ireland; Ryan Institute, University of Galway, Ireland; MaREI Research Centre, University of Galway, Ireland; Eco HydroInformatics Research Group (EHIRG), School of Engineering, College of Science and Engineering, University of Galway, Ireland
- Md Moniruzzaman
  - The Department of Geography and Environment, Jagannath University, Dhaka, Bangladesh
- Agnieszka I Olbert
  - School of Engineering, University of Galway, Ireland; Ryan Institute, University of Galway, Ireland; MaREI Research Centre, University of Galway, Ireland; Eco HydroInformatics Research Group (EHIRG), School of Engineering, College of Science and Engineering, University of Galway, Ireland
|
21
|
Mahardika T NQ, Fuadah YN, Jeong DU, Lim KM. PPG Signals-Based Blood-Pressure Estimation Using Grid Search in Hyperparameter Optimization of CNN-LSTM. Diagnostics (Basel) 2023; 13:2566. [PMID: 37568929 PMCID: PMC10417316 DOI: 10.3390/diagnostics13152566] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Revised: 07/24/2023] [Accepted: 07/26/2023] [Indexed: 08/13/2023] Open
Abstract
Researchers commonly use continuous noninvasive blood-pressure measurement (cNIBP) based on photoplethysmography (PPG) signals to monitor blood pressure conveniently. However, the performance of such systems still needs to be improved. Accuracy and precision in blood-pressure measurements are critical factors in diagnosing and managing patients' health conditions. Therefore, we propose a convolutional long short-term memory neural network (CNN-LSTM) tuned by grid search, which provides a robust blood-pressure estimation system by extracting meaningful information from PPG signals and reducing the complexity of hyperparameter optimization in the proposed model. PPG and arterial-blood-pressure (ABP) signals were obtained from the multiparameter intelligent monitoring for intensive care III (MIMIC III) dataset. We obtained 75,226 signal segments, with 60,180 signals allocated for training data, 12,030 signals allocated for the validation set, and 15,045 signals allocated for the test data. During training, we applied five-fold cross-validation with a grid-search method to select the best model and determine the optimal hyperparameter settings. The optimized configuration of the CNN-LSTM consisted of five convolutional layers, one long short-term memory (LSTM) layer, and two fully connected layers for blood-pressure estimation. This study achieved good accuracy in estimating both systolic blood pressure (SBP) and diastolic blood pressure (DBP), with mean absolute error (MAE) ± standard deviation (SD) values of 7.89 ± 3.79 and 5.34 ± 2.89 mmHg, respectively. The optimal configuration of the CNN-LSTM provided satisfactory performance according to the standards set by the British Hypertension Society (BHS), the Association for the Advancement of Medical Instrumentation (AAMI), and the Institute of Electrical and Electronics Engineers (IEEE) for blood-pressure monitoring devices.
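The error statistics named in this abstract can be sketched as follows. The helper names are hypothetical, the readings are made-up illustrative values, and the thresholds in `meets_aami` (mean error within ±5 mmHg, SD at most 8 mmHg) are the commonly cited AAMI limits; how the paper applied the standards is not stated in the abstract.

```python
import math


def error_summary(reference, estimate):
    """Summarise blood-pressure estimation errors in mmHg."""
    if len(reference) != len(estimate) or len(reference) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    errors = [e - r for r, e in zip(reference, estimate)]
    n = len(errors)
    mean_error = sum(errors) / n
    mae = sum(abs(x) for x in errors) / n
    # Sample standard deviation of the signed error.
    sd = math.sqrt(sum((x - mean_error) ** 2 for x in errors) / (n - 1))
    # Share of estimates within 5, 10 and 15 mmHg (BHS-style cumulative bands).
    within = {limit: sum(abs(x) <= limit for x in errors) / n for limit in (5, 10, 15)}
    return {"mean_error": mean_error, "mae": mae, "sd": sd, "within": within}


def meets_aami(summary):
    """Commonly cited AAMI limits: |mean error| <= 5 mmHg and SD <= 8 mmHg."""
    return abs(summary["mean_error"]) <= 5.0 and summary["sd"] <= 8.0


# Hypothetical reference (arterial line) and estimated SBP values in mmHg.
reference_sbp = [120.0, 130.0, 110.0, 140.0]
estimated_sbp = [123.0, 126.0, 112.0, 139.0]

summary = error_summary(reference_sbp, estimated_sbp)
print(summary["mae"], round(summary["sd"], 3), meets_aami(summary))
```

Reporting the signed mean error alongside MAE matters because positive and negative errors can cancel in the mean while still producing a large absolute error.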
Affiliation(s)
- Nurul Qashri Mahardika T
  - Computational Medicine Lab, Department of IT Convergence Engineering, Kumoh National Institute of Technology, Gumi 39177, Gyeongbuk, Republic of Korea
- Yunendah Nur Fuadah
  - Computational Medicine Lab, Department of IT Convergence Engineering, Kumoh National Institute of Technology, Gumi 39177, Gyeongbuk, Republic of Korea
  - School of Electrical Engineering, Telkom University, Bandung 40257, Indonesia
- Da Un Jeong
  - Computational Medicine Lab, Department of IT Convergence Engineering, Kumoh National Institute of Technology, Gumi 39177, Gyeongbuk, Republic of Korea
- Ki Moo Lim
  - Computational Medicine Lab, Department of IT Convergence Engineering, Kumoh National Institute of Technology, Gumi 39177, Gyeongbuk, Republic of Korea
  - Computational Medicine Lab, Department of Medical IT Convergence Engineering, Kumoh National Institute of Technology, Gumi 39177, Gyeongbuk, Republic of Korea
  - Meta Heart Co., Ltd., Gumi 39177, Gyeongbuk, Republic of Korea
|
22
|
Xu C, Coen-Pirani P, Jiang X. Empirical Study of Overfitting in Deep Learning for Predicting Breast Cancer Metastasis. Cancers (Basel) 2023; 15:1969. [PMID: 37046630 PMCID: PMC10093528 DOI: 10.3390/cancers15071969] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Revised: 03/18/2023] [Accepted: 03/20/2023] [Indexed: 03/29/2023] Open
Abstract
Overfitting may affect the accuracy of predicting future data because of weakened generalization. In this research, we used an electronic health records (EHR) dataset concerning breast cancer metastasis to study the overfitting of deep feedforward neural network (FNN) prediction models. We studied how each hyperparameter, and some interesting pairs of hyperparameters, interact to influence model performance and overfitting. The 11 hyperparameters we studied were activation function, weight initializer, number of hidden layers, learning rate, momentum, decay, dropout rate, batch size, epochs, L1, and L2. Our results show that most of the single hyperparameters are either negatively or positively correlated with model prediction performance and overfitting. In particular, we found that overfitting overall tends to negatively correlate with learning rate, decay, batch size, and L2, but tends to positively correlate with momentum, epochs, and L1. According to our results, learning rate, decay, and batch size may have a more significant impact on both overfitting and prediction performance than most of the other hyperparameters, including L1, L2, and dropout rate, which were designed for minimizing overfitting. We also found some interesting interacting pairs of hyperparameters, such as learning rate and momentum, learning rate and decay, and batch size and epochs.
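One way to read "overfitting correlates with a hyperparameter" is sketched below: overfitting is taken as the gap between training and test performance, and a Pearson coefficient is computed against the hyperparameter value. Both the gap definition and the run results are assumptions for illustration; the abstract does not say how overfitting was quantified.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def overfitting_gap(train_score, test_score):
    """Assumed overfitting measure: training score minus test score."""
    return train_score - test_score


# Hypothetical runs: (learning rate, training AUC, test AUC).
runs = [(0.001, 0.99, 0.80), (0.01, 0.95, 0.81), (0.05, 0.90, 0.82), (0.1, 0.86, 0.81)]

learning_rates = [lr for lr, _, _ in runs]
gaps = [overfitting_gap(train, test) for _, train, test in runs]
print(round(pearson(learning_rates, gaps), 3))
```

A negative coefficient on these made-up numbers corresponds to the direction the abstract reports for learning rate: larger values go with a smaller train-test gap.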
|
23
|
Ou SM, Tsai MT, Lee KH, Tseng WC, Yang CY, Chen TH, Bin PJ, Chen TJ, Lin YP, Sheu WHH, Chu YC, Tarng DC. Prediction of the risk of developing end-stage renal diseases in newly diagnosed type 2 diabetes mellitus using artificial intelligence algorithms. BioData Min 2023; 16:8. [PMID: 36899426 PMCID: PMC10007785 DOI: 10.1186/s13040-023-00324-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 09/04/2022] [Accepted: 02/17/2023] [Indexed: 03/12/2023] Open
Abstract
OBJECTIVES Type 2 diabetes mellitus (T2DM) imposes a great burden on healthcare systems, and these patients experience higher long-term risks for developing end-stage renal disease (ESRD). Managing diabetic nephropathy becomes more challenging when kidney function starts declining. Therefore, developing predictive models for the risk of developing ESRD in newly diagnosed T2DM patients may be helpful in clinical settings. METHODS We established machine learning models constructed from a subset of clinical features collected from 53,477 newly diagnosed T2DM patients from January 2008 to December 2018 and then selected the best model. The cohort was divided, with 70% and 30% of patients randomly assigned to the training and testing sets, respectively. RESULTS The discriminative ability of our machine learning models, including logistic regression, extra tree classifier, random forest, gradient boosting decision tree (GBDT), extreme gradient boosting (XGBoost), and light gradient boosting machine, was evaluated across the cohort. XGBoost yielded the highest area under the receiver operating characteristic curve (AUC) of 0.953, followed by extra tree and GBDT, with AUC values of 0.952 and 0.938 on the testing dataset. The SHapley Additive explanation summary plot for the XGBoost model showed that the top five important features were baseline serum creatinine, mean serum creatinine within 1 year before the diagnosis of T2DM, high-sensitivity C-reactive protein, spot urine protein-to-creatinine ratio, and female gender. CONCLUSIONS Because our machine learning prediction models were based on routinely collected clinical features, they can be used as risk assessment tools for developing ESRD. By identifying high-risk patients, intervention strategies may be provided at an early stage.
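For readers less familiar with the AUC values compared here, the sketch below computes AUC directly from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function name and the four example records are hypothetical and unrelated to the study data.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count as 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes (1 = developed ESRD) and model risk scores.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
print(roc_auc(labels, scores))  # 0.75
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; sorting-based implementations are preferable for cohorts of tens of thousands.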
Affiliation(s)
- Shuo-Ming Ou
  - Division of Nephrology, Department of Medicine, Taipei Veterans General Hospital, 201, Section 2, Shih-Pai Road, Taipei, 11217, Taiwan
  - School of Medicine, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Institute of Clinical Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Ming-Tsun Tsai
  - Division of Nephrology, Department of Medicine, Taipei Veterans General Hospital, 201, Section 2, Shih-Pai Road, Taipei, 11217, Taiwan
  - School of Medicine, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Institute of Clinical Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Kuo-Hua Lee
  - Division of Nephrology, Department of Medicine, Taipei Veterans General Hospital, 201, Section 2, Shih-Pai Road, Taipei, 11217, Taiwan
  - School of Medicine, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Institute of Clinical Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Wei-Cheng Tseng
  - Division of Nephrology, Department of Medicine, Taipei Veterans General Hospital, 201, Section 2, Shih-Pai Road, Taipei, 11217, Taiwan
  - School of Medicine, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Institute of Clinical Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Chih-Yu Yang
  - Division of Nephrology, Department of Medicine, Taipei Veterans General Hospital, 201, Section 2, Shih-Pai Road, Taipei, 11217, Taiwan
  - School of Medicine, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Institute of Clinical Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Tz-Heng Chen
  - Division of Nephrology, Department of Medicine, Taipei Veterans General Hospital, 201, Section 2, Shih-Pai Road, Taipei, 11217, Taiwan
  - School of Medicine, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Pin-Jie Bin
  - Graduate Institute of Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan
- Tzeng-Ji Chen
  - School of Medicine, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Department of Family Medicine, Taipei Veterans General Hospital, Taipei, Taiwan
  - Department of Family Medicine, Taipei Veterans General Hospital, Hsinchu Branch, Hsinchu, Taiwan
  - Institute of Hospital and Health Care Administration, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Yao-Ping Lin
  - Division of Nephrology, Department of Medicine, Taipei Veterans General Hospital, 201, Section 2, Shih-Pai Road, Taipei, 11217, Taiwan
  - School of Medicine, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Institute of Clinical Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Wayne Huey-Herng Sheu
  - School of Medicine, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Division of Endocrinology and Metabolism, Department of Internal Medicine, Taipei Veterans General Hospital, Taipei, Taiwan
  - Institute of Molecular and Genetic Medicine, National Health Research Institute, Miaoli, Taiwan
- Yuan-Chia Chu
  - Information Management Office, Taipei Veterans General Hospital, 201, Section 2, Shih-Pai Road, Taipei, 11217, Taiwan
  - Big Data Center, Taipei Veterans General Hospital, Taipei, Taiwan
  - Department of Information Management, National Taipei University of Nursing and Health Sciences, Taipei, Taiwan
- Der-Cherng Tarng
  - Division of Nephrology, Department of Medicine, Taipei Veterans General Hospital, 201, Section 2, Shih-Pai Road, Taipei, 11217, Taiwan
  - School of Medicine, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Institute of Clinical Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan
  - Department and Institute of Physiology, National Yang Ming Chiao Tung University, Taipei, Taiwan
|
24
|
Zeng L, Liu L, Chen D, Lu H, Xue Y, Bi H, Yang W. The innovative model based on artificial intelligence algorithms to predict recurrence risk of patients with postoperative breast cancer. Front Oncol 2023; 13:1117420. [PMID: 36959794 PMCID: PMC10029918 DOI: 10.3389/fonc.2023.1117420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2022] [Accepted: 02/16/2023] [Indexed: 03/09/2023] Open
Abstract
Purpose This study aimed to develop a machine learning model to retrospectively study and predict the recurrence risk of breast cancer patients after surgery by extracting the clinicopathological features of tumors from unstructured clinical electronic health record (EHR) data. Methods This retrospective cohort included 1,841 breast cancer patients who underwent surgical treatment. To extract the principal features associated with recurrence risk, the clinical notes and histopathology reports of patients were collected and feature engineering was applied. Predictive models were then built on these features. All algorithms were implemented in Python. The accuracy of the prediction models was further verified in the test cohort. The area under the curve (AUC), precision, recall, and F1 score were used to evaluate the performance of each model. Results The training cohort comprised 1,289 patients and the test cohort 552 patients. From 2011 to 2019, a total of 1,841 textual reports were included. For the prediction of recurrence risk, LSTM, XGBoost, and SVM had accuracies of 0.89, 0.86, and 0.78, respectively. The AUC values of the micro-average ROC curve for LSTM, XGBoost, and SVM were 0.98 ± 0.01, 0.97 ± 0.03, and 0.92 ± 0.06, respectively. The LSTM model in particular outperformed the other models, with higher accuracy, F1 score, macro-avg F1 score (0.87), and weighted-avg F1 score (0.89). All P values were statistically significant. Patients in the high-risk group predicted by our model were more resistant to DNA damage and microtubule targeting drugs than those in the intermediate-risk group. Differences between the predicted low-risk patients and the intermediate- or high-risk patients were not statistically significant because of the small sample size (188 low-risk patients were predicted by our model, and only two of them received chemotherapy alone after surgery). The prognosis of patients predicted by our model was consistent with the actual follow-up records. Conclusions The constructed model accurately predicted the recurrence risk of breast cancer patients from EHR data and reliably evaluated the chemoresistance and prognosis of patients. Therefore, our model can help clinicians formulate individualized management of breast cancer patients.
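The macro-average and weighted-average F1 scores reported here differ only in how per-class scores are combined, as the sketch below shows for three risk classes. The function names and the seven example labels are hypothetical and are not drawn from the study.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """F1 for each class, computed one-vs-rest from true and predicted labels."""
    scores = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[cls] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by each class's number of true cases."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[cls] * support[cls] for cls in scores) / len(y_true)


# Hypothetical recurrence-risk labels for seven patients.
y_true = ["high", "high", "high", "high", "mid", "mid", "low"]
y_pred = ["high", "high", "high", "mid", "mid", "mid", "mid"]

print(round(macro_f1(y_true, y_pred), 3), round(weighted_f1(y_true, y_pred), 3))
```

When a small class is predicted poorly, as the single "low" case is in this toy example, the macro average drops more than the weighted average, which is why the two figures are usually reported together.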
Affiliation(s)
- Lixuan Zeng
  - Department of Pathology, Harbin Medical University, Harbin, China
- Lei Liu
  - Department of Breast Surgery, The Third Affiliated Hospital of Harbin Medical University, Harbin, China
- Dongxin Chen
  - Department of Pathology, Harbin Medical University, Harbin, China
- Henghui Lu
  - Department of Dermatology, The Second Affiliated Hospital of Harbin Medical University, Harbin, China
- Yang Xue
  - Department of Pathology, Harbin Medical University, Harbin, China
- Hongjie Bi
  - Department of Pathology, Harbin Medical University, Harbin, China
- Weiwei Yang
  - Department of Pathology, Harbin Medical University, Harbin, China
|
25
|
Wu R, Luo J, Wan H, Zhang H, Yuan Y, Hu H, Feng J, Wen J, Wang Y, Li J, Liang Q, Gan F, Zhang G. Evaluation of machine learning algorithms for the prognosis of breast cancer from the Surveillance, Epidemiology, and End Results database. PLoS One 2023; 18:e0280340. [PMID: 36701415 PMCID: PMC9879508 DOI: 10.1371/journal.pone.0280340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2022] [Accepted: 12/26/2022] [Indexed: 01/27/2023] Open
Abstract
INTRODUCTION Many researchers have used machine learning (ML) to predict the prognosis of breast cancer (BC) patients and have reported that ML models had good individualized prediction performance. OBJECTIVE This cohort study was intended to establish a reliable data analysis model by comparing the performance of 10 common ML algorithms with the traditional American Joint Committee on Cancer (AJCC) stage, and to use this model in a web application to provide individualized predictions. METHODS This study included 63,145 BC patients from the Surveillance, Epidemiology, and End Results database. RESULTS Comparing the 10 ML algorithms and the 7th AJCC stage on the optimal test set, we found that for 5-year overall survival, multivariate adaptive regression splines (MARS) had the highest area under the curve (AUC) value (0.831) and F1-score (0.608), and both sensitivity (0.737) and specificity (0.772) were relatively high. MARS also showed the highest AUC value (0.831, 95% confidence interval: 0.820-0.842) in comparison with the other ML algorithms and the 7th AJCC stage (all P < 0.05). MARS, the best performing model, was selected for web application development (https://w12251393.shinyapps.io/app2/). CONCLUSIONS This comparative study of multiple forecasting models on a large dataset showed that the MARS-based model performed much better than the other ML algorithms and the 7th AJCC stage in individualized estimation of survival of BC patients, which is very likely to be the next step towards precision medicine.
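Sensitivity, specificity, and F1-score all derive from the same four confusion-matrix counts, which the sketch below makes explicit. The function name and the counts are hypothetical; they were chosen only to show that a moderate F1 can coexist with fairly high sensitivity and specificity when the positive class is the minority.

```python
def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, precision and F1 from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one actual positive and one actual negative")
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn)  # tp + fn > 0, so the denominator is nonzero
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical test set: 100 positive cases and 300 negative cases.
metrics = binary_metrics(tp=74, fp=68, tn=232, fn=26)
print({name: round(value, 3) for name, value in metrics.items()})
```

With three times as many negatives as positives, even a specificity above 0.77 produces enough false positives to pull precision, and therefore F1, well below the sensitivity.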
Affiliation(s)
- Ruiyang Wu
  - Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
- Jing Luo
  - Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
- Hangyu Wan
  - Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
- Haiyan Zhang
  - Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
- Yewei Yuan
  - Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
- Huihua Hu
  - Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
- Jinyan Feng
  - Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
- Jing Wen
  - Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
- Yan Wang
  - Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
- Junyan Li
  - Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
- Qi Liang
  - Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
- Fengjiao Gan
  - Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
- Gang Zhang
  - Department of Breast and Thyroid Surgery, Sichuan Provincial Hospital for Women and Children (Affiliated Women and Children’s Hospital of Chengdu Medical College), Chengdu, China
|