1
Balian J, Sakowitz S, Verma A, Vadlakonda A, Cruz E, Ali K, Benharash P. Machine learning based predictive modeling of readmissions following extracorporeal membrane oxygenation hospitalizations. Surg Open Sci 2024; 19:125-130. [PMID: 38655069 PMCID: PMC11035075 DOI: 10.1016/j.sopen.2024.04.003] [Received: 03/26/2024] [Accepted: 04/05/2024] [Indexed: 04/26/2024] Open
Abstract
Background: Despite increasing utilization and survival benefit over the last decade, extracorporeal membrane oxygenation (ECMO) remains resource-intensive, with significant complications and rehospitalization risk. We thus utilized machine learning (ML) to develop prediction models for 90-day nonelective readmission following ECMO. Methods: All adult patients receiving ECMO who survived index hospitalization were tabulated from the 2016-2020 Nationwide Readmissions Database. Extreme Gradient Boosting (XGBoost) models were developed to identify features associated with readmission following ECMO. Area under the receiver operating characteristic curve (AUROC), mean Average Precision (mAP), and the Brier score were calculated to estimate model performance relative to logistic regression (LR). SHapley Additive exPlanations (SHAP) summary plots evaluated the relative impact of each factor on the model. An additional sensitivity analysis solely included patient comorbidities and indication for ECMO as potential model covariates. Results: Of ∼22,947 patients, 4495 (19.6%) were readmitted nonelectively within 90 days. The XGBoost model exhibited superior discrimination (AUROC 0.64 vs 0.49), classification accuracy (mAP 0.30 vs 0.20) and calibration (Brier score 0.154 vs 0.165, all P < 0.001) in predicting readmission compared to LR. SHAP plots identified duration of index hospitalization, undergoing heart/lung transplantation, and Medicare insurance to be associated with increased odds of readmission. Upon sub-analysis, XGBoost demonstrated superior discrimination compared to LR (AUROC 0.61 vs 0.60, P < 0.05). Chronic liver disease and frailty were linked with increased odds of nonelective readmission. Conclusions: ML outperformed LR in predicting readmission following ECMO. Future work is needed to identify other factors linked with readmission and further optimize post-ECMO care among this cohort.
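The abstract reports AUROC and Brier score without defining them. As a minimal sketch (not the authors' code), both can be computed from predicted probabilities with the standard library alone. The function names and toy data below are hypothetical and chosen only for illustration.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true, y_prob):
    """Probability that a random positive case outranks a random negative one (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(1.0 if pp > pn else 0.5 if pp == pn else 0.0 for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = readmitted, 0 = not readmitted.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]
print(auroc(y_true, y_prob))
print(brier_score(y_true, y_prob))
```

For context, a model that always predicts the observed event rate p has a Brier score of p(1 - p). At the 19.6% readmission rate reported above, that is roughly 0.158.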
Affiliation(s)
- Jeffrey Balian
  - Cardiovascular Outcomes Research Laboratories (CORELAB), University of California, Los Angeles, CA, United States of America
- Sara Sakowitz
  - Cardiovascular Outcomes Research Laboratories (CORELAB), University of California, Los Angeles, CA, United States of America
- Arjun Verma
  - Cardiovascular Outcomes Research Laboratories (CORELAB), University of California, Los Angeles, CA, United States of America
- Amulya Vadlakonda
  - Cardiovascular Outcomes Research Laboratories (CORELAB), University of California, Los Angeles, CA, United States of America
- Emma Cruz
  - Cardiovascular Outcomes Research Laboratories (CORELAB), University of California, Los Angeles, CA, United States of America
- Konmal Ali
  - Cardiovascular Outcomes Research Laboratories (CORELAB), University of California, Los Angeles, CA, United States of America
- Peyman Benharash
  - Cardiovascular Outcomes Research Laboratories (CORELAB), University of California, Los Angeles, CA, United States of America
  - Division of Cardiac Surgery, Department of Surgery, University of California, Los Angeles, CA, United States of America
2
Chauhan R, Goel A, Alankar B, Kaur H. Predictive modeling and web-based tool for cervical cancer risk assessment: A comparative study of machine learning models. MethodsX 2024; 12:102653. [PMID: 38524310 PMCID: PMC10957413 DOI: 10.1016/j.mex.2024.102653] [Received: 07/09/2023] [Accepted: 03/08/2024] [Indexed: 03/26/2024] Open
Abstract
In today's digital era, the rapid growth of databases presents significant challenges in data management. To address this, we have designed and developed CHAMP (Cervical Health Assessment using machine learning for Prediction), a user interface tool that can effectively and efficiently handle cervical cancer databases to detect patterns for future prediction and diagnosis. CHAMP employs various machine learning algorithms, including XGBoost, SVM, Naive Bayes, AdaBoost, Decision Tree, and K-Nearest Neighbors, to predict cervical cancer accurately. The tool is also designed to evaluate and optimize these processes in order to identify the best-performing algorithm for predicting cervical cancer. The user interface tool was implemented in Python 3.9.0 using Flask, which provides a personalized and intuitive platform for pattern detection. The current study approach contributes to the accurate prediction and early detection of cervical cancer by leveraging the power of machine learning algorithms and comprehensive validation tools, which aim to support informed decision-making.
- CHAMP is a user interface tool designed for the detection of patterns for future diagnosis and prognosis of cervical cancer.
- Various machine learning algorithms are employed for accurate prediction.
- This tool provides personalized and intuitive data analysis, which enables informed decision-making in healthcare.
Affiliation(s)
- Ritu Chauhan
  - Artificial Intelligence and IoT Automation Lab, Center for Computational Biology and Bioinformatics, Amity University, Noida, Uttar Pradesh 201313, India
- Anika Goel
  - Artificial Intelligence and IoT Automation Lab, Center for Computational Biology and Bioinformatics, Amity University, Noida, Uttar Pradesh 201313, India
- Bhavya Alankar
  - Department of Computer Science and Engineering, School of Engineering Sciences and Technology, Jamia Hamdard, New Delhi 110062, India
- Harleen Kaur
  - Department of Computer Science and Engineering, School of Engineering Sciences and Technology, Jamia Hamdard, New Delhi 110062, India
3
Jiang Y, Zhao Q, Guan J, Wang Y, Chen J, Li Y. Analyzing prehospital delays in recurrent acute ischemic stroke: Insights from interpretable machine learning. Patient Educ Couns 2024; 123:108228. [PMID: 38458092 DOI: 10.1016/j.pec.2024.108228] [Received: 11/14/2023] [Revised: 02/18/2024] [Accepted: 02/24/2024] [Indexed: 03/10/2024]
Abstract
OBJECTIVE: This study investigates prehospital delays in recurrent Acute Ischemic Stroke (AIS) patients, aiming to identify key factors contributing to these delays to inform effective interventions. METHODS: A retrospective cohort analysis of 1419 AIS patients in Shenzhen from December 2021 to August 2023 was performed. The study applied the Extreme Gradient Boosting (XGBoost) algorithm and SHapley Additive exPlanations (SHAP) for identifying determinants of delay. RESULTS: Living with others and lack of stroke knowledge emerged as significant risk factors for delayed hospital presentation in recurrent AIS patients. Key features impacting delay times included residential status, awareness of stroke symptoms, presence of conscious disturbance, diabetes mellitus awareness, physical weakness, mode of hospital presentation, type of stroke, and presence of coronary artery disease. CONCLUSION: Prehospital delays are similarly prevalent among both recurrent and first-time AIS patients, highlighting a pronounced knowledge gap in the former group. This discovery underscores the urgent need for enhanced stroke education and management. PRACTICE IMPLICATION: The similarity in prehospital delay patterns between recurrent and first-time AIS patients emphasizes the necessity for public health initiatives and tailored educational programs. These strategies aim to improve stroke response times and outcomes for all patients.
Affiliation(s)
- Youli Jiang
  - Department of Neurology, People's Hospital of Longhua, 38 Jinglong Jianshe Road, Longhua District, Shenzhen 518109, China
- Qingshi Zhao
  - Department of Neurology, People's Hospital of Longhua, 38 Jinglong Jianshe Road, Longhua District, Shenzhen 518109, China
- Jincheng Guan
  - Department of Neurology, People's Hospital of Longhua, 38 Jinglong Jianshe Road, Longhua District, Shenzhen 518109, China
- Yuying Wang
  - Department of Neurology, People's Hospital of Longhua, 38 Jinglong Jianshe Road, Longhua District, Shenzhen 518109, China
- Jingfang Chen
  - The Third People's Hospital of Shenzhen, Shenzhen 518112, China; National Clinical Research Center for Infectious Diseases, 29 Bulan Road, Longgang District, Shenzhen 518112, China
- Yanfeng Li
  - Department of Neurology, People's Hospital of Longhua, 38 Jinglong Jianshe Road, Longhua District, Shenzhen 518109, China
4
Guo L, Xu X, Niu C, Wang Q, Park J, Zhou L, Lei H, Wang X, Yuan X. Machine learning-based prediction and experimental validation of heavy metal adsorption capacity of bentonite. Sci Total Environ 2024; 926:171986. [PMID: 38552979 DOI: 10.1016/j.scitotenv.2024.171986] [Received: 01/22/2024] [Revised: 03/23/2024] [Accepted: 03/24/2024] [Indexed: 04/01/2024]
Abstract
As a natural adsorbent material, bentonite is widely used in the field of heavy metal adsorption. The heavy metal adsorption capacity of bentonite varies significantly across studies due to differences in the properties of the bentonite, the solution, and the heavy metal. To achieve accurate predictions of bentonite's heavy metal adsorption capacity, this study employed six machine learning (ML) regression algorithms to investigate the adsorption characteristics of bentonite. An eXtreme Gradient Boosting Regression (XGB) model with outstanding predictive performance was ultimately constructed. Explanation analysis of the XGB model further reveals the importance and manner of influence of each input feature in predicting the heavy metal adsorption capacity of bentonite. The feature categories influencing heavy metal adsorption capacity were ranked in order of importance as adsorption conditions > bentonite properties > heavy metal properties. Furthermore, web-based graphical user interface (GUI) software was developed, allowing researchers and engineers to conveniently use the XGB model for predicting the heavy metal adsorption capacity of bentonite. This study provides new insights into the adsorption behaviors of bentonite for heavy metals, offering guidance and support for enhancing its application efficiency and addressing heavy metal pollution remediation.
Affiliation(s)
- Lisheng Guo
  - College of Construction Engineering, Jilin University, Changchun 130026, China
- Xin Xu
  - College of Construction Engineering, Jilin University, Changchun 130026, China
- Cencen Niu
  - College of Construction Engineering, Jilin University, Changchun 130026, China
- Qing Wang
  - College of Construction Engineering, Jilin University, Changchun 130026, China
- Junboum Park
  - Department of Civil and Environment Engineering, Seoul National University, Seoul 08826, Republic of Korea
- Lu Zhou
  - College of Construction Engineering, Jilin University, Changchun 130026, China
- Haomin Lei
  - College of Construction Engineering, Jilin University, Changchun 130026, China
- Xinhai Wang
  - College of Construction Engineering, Jilin University, Changchun 130026, China
- Xiaoqing Yuan
  - College of Construction Engineering, Jilin University, Changchun 130026, China
5
Ebrahimian A, Mohammadi H, Maftoon N. Material characterization of human middle ear using machine-learning-based surrogate models. J Mech Behav Biomed Mater 2024; 153:106478. [PMID: 38493562 DOI: 10.1016/j.jmbbm.2024.106478] [Received: 09/14/2023] [Revised: 02/09/2024] [Accepted: 02/24/2024] [Indexed: 03/19/2024]
Abstract
This study introduces a novel non-invasive method for rapid material characterization of middle-ear structures. Valuable insights into various ear pathologies can be gleaned from the mechanical properties of ear tissues, yet conventional techniques for assessing these properties often entail invasive procedures that preclude their use on living patients. As a first step, we developed machine-learning models of the middle ear to predict its responses at a significantly lower computational cost than finite-element models. Leveraging findings from prior research, we focused on the most influential model parameters: the Young's modulus and thickness of the tympanic membrane and the Young's modulus of the stapedial annular ligament. The eXtreme Gradient Boosting (XGBoost) method was used to create the machine-learning models. Subsequently, we combined these models with Bayesian optimization (BoTorch) for fast and efficient estimation of the Young's moduli of the tympanic membrane and the stapedial annular ligament. We demonstrate that the resultant surrogate models can fairly represent the vibrational responses of the umbo and stapes footplate, and the vibration patterns of the tympanic membrane, at most frequencies. Our proposed material characterization approach also successfully estimated the Young's moduli of the tympanic membrane and stapedial annular ligament (separately and simultaneously) with mean absolute percentage errors of less than 7%. The accuracy achieved by the proposed material characterization method underscores its potential for eventual clinical application in estimating the mechanical properties of middle-ear structures for diagnostic purposes.
Affiliation(s)
- Arash Ebrahimian
  - Department of Systems Design Engineering, University of Waterloo, Waterloo, ON, Canada; Centre for Bioengineering and Biotechnology, University of Waterloo, Waterloo, ON, Canada
- Hossein Mohammadi
  - Department of Systems Design Engineering, University of Waterloo, Waterloo, ON, Canada; Centre for Bioengineering and Biotechnology, University of Waterloo, Waterloo, ON, Canada
- Nima Maftoon
  - Department of Systems Design Engineering, University of Waterloo, Waterloo, ON, Canada; Centre for Bioengineering and Biotechnology, University of Waterloo, Waterloo, ON, Canada
6
Schoonemann J, Nagelkerke J, Seuntjens TG, Osinga N, van Liere D. Applying XGBoost and SHAP to Open Source Data to Identify Key Drivers and Predict Likelihood of Wolf Pair Presence. Environ Manage 2024; 73:1072-1087. [PMID: 38372749 DOI: 10.1007/s00267-024-01941-1] [Received: 09/05/2023] [Accepted: 01/20/2024] [Indexed: 02/20/2024]
Abstract
Wolves have returned to Germany since 2000. Numbers have grown to 209 territorial pairs in 2021. XGBoost machine learning, combined with SHAP analysis, is applied to predict German wolf pair presence in 2022 for 10 × 10 km grid cells. Model input consisted of 38 variables from open sources, covering the period 2000 to 2021. The XGBoost model predicted well, with an AUC of 0.91. SHAP analysis ranked the variables: distance to the closest neighboring wolf pair was the main driver for a grid cell to become occupied by a wolf pair. The clustering tendency of related wolves seems to be an important explanatory factor here. Second was the percentage of wooded area. The next eight variables related to wolf presence in the preceding year, except those at the fifth, eighth and tenth positions in the overall ranking: human density (square root) in the grid cell, percentage of arable land, and road density, respectively. Other variables, including the occurrence of wild prey, were the weakest predictors. The SHAP analysis also provided crucial added value in identifying variables with threshold values at which their contribution to the prediction changed from positive to negative or vice versa. For instance, a low density of people increased the probability of wolf pair presence, whereas a high density decreased this probability. Cumulative lift techniques showed that the model performed almost four times better than random prediction. The combination of XGBoost, SHAP and cumulative lift techniques is new in wolf management and conservation, allowing for the focusing of educational and financial resources.
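The "almost four times better than random" statement refers to cumulative lift. The sketch below shows one common definition: the positive rate among the top-scored fraction of cases divided by the overall positive rate. It is an illustration under that assumption, not the authors' implementation; the function name and toy data are hypothetical.

```python
def cumulative_lift(y_true, y_score, fraction):
    """Positive rate in the top-scored `fraction` of cases, relative to the overall rate."""
    n_top = max(1, round(len(y_true) * fraction))
    # Rank cases from highest to lowest predicted score.
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(label for _, label in ranked[:n_top]) / n_top
    base_rate = sum(y_true) / len(y_true)
    return top_rate / base_rate


# Hypothetical toy data: 10 grid cells, 3 of them occupied (label 1).
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
print(cumulative_lift(labels, scores, 0.2))  # top 20% of cells versus the base rate
```

A lift of 1.0 means the ranking does no better than selecting cells at random.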
Affiliation(s)
- Nynke Osinga
  - Institute for Coexistence with Wildlife, Heuvelweg 7, 7218 BD, Almen, The Netherlands
- Diederik van Liere
  - Institute for Coexistence with Wildlife, Heuvelweg 7, 7218 BD, Almen, The Netherlands
7
Sawant PA, Hiralkar SS, Hulsurkar YP, Phutane MS, Mahajan US, Kudale AM. Predicting over-the-counter antibiotic use in rural Pune, India, using machine learning methods. Epidemiol Health 2024:e2024044. [PMID: 38637971 DOI: 10.4178/epih.e2024044] [Received: 11/02/2023] [Accepted: 03/25/2024] [Indexed: 04/20/2024] Open
Abstract
Objectives: Over-the-counter (OTC) antibiotic use can cause antibiotic resistance, threatening global public health gains. To counter OTC use, this study used machine learning (ML) methods to identify predictors of OTC antibiotic use in rural Pune, India. Methods: The features of OTC antibiotic use were selected using stepwise logistic, lasso, random forest, XGBoost, and Boruta algorithms. Regression and tree-based models with all confirmed and tentatively important features were built to predict the use of OTC antibiotics. Five-fold cross-validation was used to tune the models' hyperparameters. The final model was selected based on the highest area under the receiver operating characteristic curve (AUROC) with a 95% confidence interval and the lowest log-loss. Results: In rural Pune, the prevalence of OTC antibiotic use was 35.9% (95% CI, 31.56%-40.46%). The perception that buying medicines directly from a medicine shop/pharmacy is useful, using antibiotics for eye-related complaints, more household members consuming antibiotics, and longer duration and higher doses of antibiotic consumption in rural blocks and other social groups were confirmed as important features by the Boruta algorithm. The final model was the XGBoost+Boruta model with 7 predictors (AUROC=0.934; 95% CI, 0.8906-0.9782; log-loss=0.2793). Conclusion: XGBoost+Boruta, with 7 predictors, was the most accurate model for predicting OTC antibiotic use in rural Pune. Using OTC antibiotics for eye-related complaints, higher consumption of antibiotics, and the perception that buying antibiotics directly from a medicine shop/pharmacy is useful were identified as key factors for planning interventions to improve awareness about proper antibiotic use.
Affiliation(s)
- Pravin Arun Sawant
  - School of Health Sciences, Savitribai Phule Pune University, Pune, Maharashtra, India
- Sakshi Shantanu Hiralkar
  - School of Health Sciences, Savitribai Phule Pune University, Pune, Maharashtra, India
- Mugdha Sharad Phutane
  - School of Health Sciences, Savitribai Phule Pune University, Pune, Maharashtra, India
- Uma Satish Mahajan
  - School of Health Sciences, Savitribai Phule Pune University, Pune, Maharashtra, India
- Abhay Machindra Kudale
  - School of Health Sciences, Savitribai Phule Pune University, Pune, Maharashtra, India
8
Nikpour P, Shafiei M, Khatibi V. Gelato: a new hybrid deep learning-based Informer model for multivariate air pollution prediction. Environ Sci Pollut Res Int 2024:10.1007/s11356-024-33190-4. [PMID: 38592633 DOI: 10.1007/s11356-024-33190-4] [Received: 12/07/2023] [Accepted: 03/29/2024] [Indexed: 04/10/2024]
Abstract
The increase in air pollutants and their adverse effects on human health and the environment have raised significant concerns, which makes predicting air pollutant levels necessary. Numerous studies have aimed to provide new models for more accurate prediction of air pollutants such as CO2, O3, and PM2.5. Most of the models used in the literature are deep learning models, with Transformers being the best for time series prediction. However, there is still a need to enhance accuracy in air pollution prediction using Transformers. Alongside the need for increased accuracy, there is a significant demand for predicting a broader spectrum of air pollutants. To address this challenge, this paper proposes a new hybrid deep learning-based Informer model called "Gelato" for multivariate air pollution prediction. Gelato takes a leap forward by taking several air pollutants into consideration simultaneously. Besides introducing changes to the Informer structure as the base model, Gelato utilizes Particle Swarm Optimization for hyperparameter optimization. Moreover, XGBoost is used at the final stage to minimize errors. Gelato's performance is assessed by applying the proposed model to a dataset containing eight important air pollutants: CO2, O3, NO, NO2, SO2, PM10, NH3, and PM2.5. Comparing the results of Gelato with those of other models shows Gelato's superiority over them, indicating that it is a high-confidence model for multivariate air pollution prediction.
Affiliation(s)
- Parsa Nikpour
  - Department of Intelligent Systems Engineering, School of Industrial Engineering, Iran University of Science and Technology, Tehran, 16846-13114, Iran
- Mahdis Shafiei
  - Department of Intelligent Systems Engineering, School of Industrial Engineering, Iran University of Science and Technology, Tehran, 16846-13114, Iran
- Vahid Khatibi
  - Department of Intelligent Systems Engineering, School of Industrial Engineering, Iran University of Science and Technology, Tehran, 16846-13114, Iran
9
Rasouli S, Dakkali MS, Azarbad R, Ghazvini A, Asani M, Mirzaasgari Z, Arish M. Predicting the conversion from clinically isolated syndrome to multiple sclerosis: An explainable machine learning approach. Mult Scler Relat Disord 2024; 86:105614. [PMID: 38642495 DOI: 10.1016/j.msard.2024.105614] [Received: 12/03/2023] [Revised: 04/04/2024] [Accepted: 04/07/2024] [Indexed: 04/22/2024]
Abstract
INTRODUCTION: Predicting the conversion of clinically isolated syndrome (CIS) to clinically definite multiple sclerosis (CDMS) is critical to personalizing treatment planning and benefits for patients. The aim of this study is to develop an explainable machine learning (ML) model for predicting this conversion based on demographic, clinical, and imaging data. METHOD: The ML model, Extreme Gradient Boosting (XGBoost), was applied to a public dataset of 273 Mexican mestizo CIS patients with 10-year follow-up. The data were divided into a training set for cross-validation and feature selection, and a holdout test set for final testing. Feature importance was determined using the SHapley Additive exPlanations (SHAP) library. Then, two experiments were conducted to optimize the model's performance by selectively adding variables and selecting the most contributive variables for the final model. RESULTS: Nine variables, including age, gender, schooling, motor symptoms, infratentorial and periventricular lesions at imaging, oligoclonal bands in cerebrospinal fluid, and lesion and symptom types, were significant. The model achieved an accuracy of 83.6%, AUC of 91.8%, sensitivity of 83.9%, and specificity of 83.4% in cross-validation. In the final testing, the model achieved an accuracy of 78.3%, AUC of 85.8%, sensitivity of 75%, and specificity of 81.1%. Finally, a web-based demo of the model was created for testing purposes. CONCLUSION: The model, focusing on feature selection and interpretability, effectively stratifies risk for treatment decisions and disability prevention in MS patients. It provides a numerical risk estimate for CDMS conversion, enhancing transparency in clinical decision-making and aiding in patient care.
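Accuracy, sensitivity, and specificity as reported above all derive from the four cells of a confusion matrix. The sketch below shows the standard definitions. The counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard metrics from confusion-matrix counts (assumes both classes are present)."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true converters correctly flagged
        "specificity": tn / (tn + fp),  # share of non-converters correctly cleared
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=8, fn=2, tn=15, fp=5)
print(metrics)
```

Accuracy depends on the class balance of the test set, while sensitivity and specificity are each computed within a single class, which is presumably why abstracts such as this one report all three.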
Affiliation(s)
- Saeid Rasouli
  - School of Medicine, Five Senses Health Research Institute, Hazrat-e Rasool General Hospital, Iran University of Medical Sciences, Tehran, Iran
- Mohammad Sedigh Dakkali
  - Department of Ophthalmology, School of Medicine, Al Zahra Eye Hospital, Zahedan University of Medical Sciences, Zahedan, Iran
- Reza Azarbad
  - Cellular and Molecular Biology Research Center, Health Research Institute, Babol University of Medical Sciences, Babol, Iran
- Azim Ghazvini
  - School of Medicine, Iran University of Medical Sciences, Tehran, Iran
- Mahdi Asani
  - Department of Ophthalmology, School of Medicine, Al Zahra Eye Hospital, Zahedan University of Medical Sciences, Zahedan, Iran
- Zahra Mirzaasgari
  - Department of Neurology, Firoozgar Hospital, School of Medicine, University of Medical Science, Iran
- Mohammed Arish
  - Department of Ophthalmology, School of Medicine, Al Zahra Eye Hospital, Zahedan University of Medical Sciences, Zahedan, Iran
10
Huckvale ED, Moseley HN. Predicting The Pathway Involvement Of Metabolites Based on Combined Metabolite and Pathway Features. bioRxiv 2024:2024.04.01.587582. [PMID: 38617261 PMCID: PMC11014601 DOI: 10.1101/2024.04.01.587582] [Indexed: 04/16/2024]
Abstract
A major limitation of most metabolomics datasets is the sparsity of pathway annotations of detected metabolites. It is common for less than half of identified metabolites in these datasets to have known metabolic pathway involvement. To address this limitation, machine learning models have been developed to predict the association of a metabolite with a "pathway category", as defined by one of the metabolic knowledgebases like the Kyoto Encyclopedia of Genes and Genomes. Most of these models are implemented as a single binary classifier specific to a single pathway category, requiring a set of binary classifiers for generating predictions for multiple pathway categories. This one-classifier-per-pathway-category approach both multiplies the computational resources necessary for training and dilutes the positive entries in the gold standard datasets needed for training. To address the limitations of training separate classifiers, we propose a generalization of the metabolic pathway prediction problem using a single binary classifier that accepts both features representing a metabolite and features representing a generic pathway category and then predicts whether the given metabolite is involved in the corresponding pathway category. We demonstrate that this metabolite-pathway features-pair approach is not only competitive with the combined performance of training separate binary classifiers, but also outperforms the previous benchmark models.
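The single-classifier formulation can be pictured as a dataset in which each row represents one metabolite-pathway pair. The sketch below assumes the two feature vectors are simply concatenated. The abstract does not state how the features are combined, so that choice, the function name, and the toy feature values are all illustrative assumptions.

```python
from itertools import product


def build_pair_dataset(metabolite_features, pathway_features, known_pairs):
    """Return one row per (metabolite, pathway) pair with a binary involvement label."""
    rows, labels = [], []
    pairs = product(sorted(metabolite_features.items()), sorted(pathway_features.items()))
    for (m_id, m_vec), (p_id, p_vec) in pairs:
        rows.append(list(m_vec) + list(p_vec))  # assumed combination: concatenation
        labels.append(1 if (m_id, p_id) in known_pairs else 0)
    return rows, labels


# Hypothetical toy features: 2 metabolites, 3 pathway categories.
metabolites = {"m1": [0.1, 0.9], "m2": [0.7, 0.2]}
pathways = {"pA": [1.0], "pB": [0.5], "pC": [0.0]}
known = {("m1", "pA"), ("m2", "pC")}

X, y = build_pair_dataset(metabolites, pathways, known)
print(len(X), len(X[0]), y)
```

Every positive annotation then contributes to one shared training set instead of being split across per-pathway classifiers, which corresponds to the dilution problem the abstract describes.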
Affiliation(s)
- Erik D. Huckvale
  - Markey Cancer Center, University of Kentucky, Lexington, KY 40506, USA
- Hunter N.B. Moseley
  - Markey Cancer Center, University of Kentucky, Lexington, KY 40506, USA
  - Superfund Research Center, University of Kentucky, Lexington, KY 40506, USA
  - Department of Toxicology and Cancer Biology, University of Kentucky, Lexington, KY 40536, USA
  - Department of Molecular and Cellular Biochemistry, University of Kentucky, Lexington, KY 40506, USA
  - Institute for Biomedical Informatics, University of Kentucky, Lexington, KY 40506, USA
11
Reveguk I, Simonson T. Classifying protein kinase conformations with machine learning. Protein Sci 2024; 33:e4918. [PMID: 38501429 PMCID: PMC10962494 DOI: 10.1002/pro.4918] [Received: 07/26/2023] [Revised: 01/02/2024] [Accepted: 01/22/2024] [Indexed: 03/20/2024]
Abstract
Protein kinases are key actors of signaling networks and important drug targets. They cycle between active and inactive conformations, distinguished by a few elements within the catalytic domain. One is the activation loop, whose conserved DFG motif can occupy DFG-in, DFG-out, and some rarer conformations. Annotation and classification of the structural kinome are important, as different conformations can be targeted by different inhibitors and activators. Valuable resources exist; however, large-scale applications will benefit from increased automation and interpretability of structural annotation. Interpretable machine learning models are described for this purpose, based on ensembles of decision trees. To train them, a set of catalytic domain sequences and structures was collected, somewhat larger and more diverse than existing resources. The structures were clustered based on the DFG conformation and manually annotated. They were then used as training input. Two main models were constructed, which distinguished active/inactive and in/out/other DFG conformations. They initially considered 1692 structural variables, spanning the whole catalytic domain, then identified ("learned") a small subset that sufficed for accurate classification. The first model correctly labeled all but 3 of 3289 structures as active or inactive, while the second assigned the correct DFG label to all but 17 of 8826 structures. The most potent classifying variables were all related to well-known structural elements in or near the activation loop, and their ranking gives insights into the conformational preferences. The models were used to automatically annotate 3850 kinase structures predicted recently with the AlphaFold2 tool, showing that AlphaFold2 reproduced the active/inactive but not the DFG-in proportions seen in the Protein Data Bank. We expect the models will be useful for understanding and engineering kinases.
Affiliation(s)
- Ivan Reveguk
  - Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
- Thomas Simonson
  - Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Palaiseau, France
12
|
Mohit A, Remya N. Exploring effects of carbon, nitrogen, and phosphorus on greywater treatment by polyculture microalgae using response surface methodology and machine learning. J Environ Manage 2024; 356:120728. [PMID: 38531138 DOI: 10.1016/j.jenvman.2024.120728] [Received: 12/14/2023] [Revised: 02/20/2024] [Accepted: 03/19/2024] [Indexed: 03/28/2024]
Abstract
Microalgae-based wastewater treatment is a promising technique that contributes to achieving sustainable development goals (SDGs), such as SDG-6, "Clean Water and Sanitation". However, it is strongly influenced by the initial composition of the wastewater. In this study, the impact of initial organics and nutrient concentration on the removal of total organic carbon (TOC), total carbon (TC), ammonium (NH₄⁺), total nitrogen (TN), and phosphate (PO₄³⁻) from greywater using native polyculture microalgae was explored. Response surface methodology (RSM) was employed along with two machine learning approaches, AdaBoost and XGBoost, to evaluate the interactions among three main factors (TOC, NH₄⁺, and PO₄³⁻) and their effects on treatment efficiency. The C/N ratios for achieving maximum TOC and TC removal efficiencies of 99.2% and 97.7% were determined to be 10.3 and 65.4-73.6, respectively. Notably, the N/P ratio did not significantly affect their removal. The highest NH₄⁺ removal efficiency, 96.2%, was attained at C/N ratios of 4.3, 24.0, 38.2, and 212.9, coupled with N/P ratios of 0.3, 2.6, and 23.4. The highest TN removal efficiency of 77.2% was achieved at C/N and N/P ratios of 12.2 and 2.0, respectively. The highest PO₄³⁻ removal of 78.8% was obtained at an N/P ratio of 12.8. However, the C/N ratio did not affect this removal efficiency. Maintaining these specified C/N and N/P ratios in the influent greywater would ensure that the treated greywater meets the required standards for various reuse applications, including flushing, groundwater recharge, and surface water discharge. The integration of RSM with AdaBoost and XGBoost provided accurate predictions of removal efficiencies. Across all models, XGBoost had the highest R² and the lowest MAE and MSE values. The cross-validation of RSM models with AdaBoost and XGBoost further reinforced the reliability of these models in predicting treatment outcomes.
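The abstract compares models by R², MAE, and MSE. As a rough illustration of how these three metrics relate, here is a minimal sketch in plain Python. The function name `regression_metrics` and the observed/predicted values are hypothetical and are not taken from the study.

```python
def regression_metrics(y_true, y_pred):
    """Return (r2, mae, mse) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n          # mean absolute error
    mse = sum(e * e for e in errors) / n           # mean squared error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return r2, mae, mse


# Hypothetical removal efficiencies (%), for illustration only.
observed = [90.0, 80.0, 70.0, 60.0]
predicted = [88.0, 82.0, 69.0, 61.0]
r2, mae, mse = regression_metrics(observed, predicted)
print(r2, mae, mse)
```

A higher R² together with lower MAE and MSE indicates a closer fit, which is the sense in which the abstract ranks XGBoost above the other models.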
Affiliation(s)
- Aggarwal Mohit
  - School of Infrastructure, Indian Institute of Technology Bhubaneswar, Odisha, 752050, India
- Neelancherry Remya
  - School of Infrastructure, Indian Institute of Technology Bhubaneswar, Odisha, 752050, India.
|
13
|
Schönnagel L, Tani S, Vu-Han TL, Zhu J, Camino-Willhuber G, Dodo Y, Caffard T, Chiapparelli E, Oezel L, Shue J, Zelenty WD, Lebl DR, Cammisa FP, Girardi FP, Sokunbi G, Hughes AP, Sama AA. Predicting conversion of ambulatory ACDF patients to inpatient: a machine learning approach. Spine J 2024; 24:563-571. [PMID: 37980960 DOI: 10.1016/j.spinee.2023.11.010] [Received: 08/27/2023] [Revised: 10/29/2023] [Accepted: 11/12/2023] [Indexed: 11/21/2023]
Abstract
BACKGROUND CONTEXT Machine learning is a powerful tool that has become increasingly important in the orthopedic field. Recently, several studies have reported that predictive models could provide new insights into patient risk factors and outcomes. Anterior cervical discectomy and fusion (ACDF) is a common operation that is performed as an outpatient procedure. However, some patients require conversion to inpatient status and prolonged hospitalization due to their condition. Appropriate patient selection and identification of risk factors for conversion could benefit patients and the use of medical resources. PURPOSE This study aimed to develop a machine-learning algorithm to identify risk factors associated with unplanned conversion from outpatient to inpatient status for ACDF patients. STUDY DESIGN/SETTING This is a machine-learning-based analysis using retrospectively collected data. PATIENT SAMPLE Patients who underwent one- or two-level ACDF in an ambulatory setting at a single specialized orthopedic hospital between February 2016 and December 2021. OUTCOME MEASURES Length of stay, conversion rates from ambulatory setting to inpatient. METHODS Patients were divided into two groups based on length of stay: (1) Ambulatory (discharge within 24 hours) or Extended Stay (greater than 24 hours but fewer than 48 hours), and (2) Inpatient (greater than 48 hours). Factors included in the model were based on literature review and clinical expertise. Patient demographics, comorbidities, and intraoperative factors, such as surgery duration and time, were included. We compared the performance of different machine learning algorithms: Logistic Regression, Random Forest (RF), Support Vector Machine (SVM), and Extreme Gradient Boosting (XGBoost). We split the patient data into a training and validation dataset using a 70/30 split. The different models were trained on the training dataset using cross-validation.
The performance was then tested in the unseen validation set. This step is important to detect overfitting. The performance was evaluated using the area under the curve (AUC) of the receiver operating characteristics analysis (ROC) as the primary outcome. An AUC of 0.7 was considered fair, 0.8 good, and 0.9 excellent, according to established cut-offs. RESULTS A total of 581 patients (59% female) were available for analysis. Of those, 140 (24.1%) were converted to inpatient status. The median age was 51 (IQR 44-59), and the median BMI was 28 kg/m2 (IQR 24-32). The XGBoost model showed the best performance with an AUC of 0.79. The most important features were the length of the operation, followed by sex (based on biological attributes), age, and operation start time. The logistic regression model and the SVM showed worse results, with an AUC of 0.71 each. CONCLUSIONS This study demonstrated a novel approach to predicting conversion to inpatient status in eligible patients for ambulatory surgery. The XGBoost model showed good predictive capabilities, superior to the older machine learning approaches. This model also revealed the importance of surgical duration time, BMI, and age as risk factors for patient conversion. A developing field of study is using machine learning in clinical decision-making. Our findings contribute to this field by demonstrating the feasibility and accuracy of such methods in predicting outcomes and identifying risk factors, although external and multi-center validation studies are needed.
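The evaluation design described above (a 70/30 split, AUC on the held-out set, and fixed AUC cut-offs) can be sketched in plain Python. This is an illustrative sketch only: the function names, the random seed, the toy labels and scores, and the reading of the cut-offs as lower bounds are all assumptions, not the authors' implementation.

```python
import random


def train_validation_split(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices and split them into training and validation sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = int(round(n_samples * train_fraction))
    return indices[:cut], indices[cut:]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def grade_auc(auc):
    """Map an AUC to the abstract's categories, treating cut-offs as lower bounds (assumption)."""
    if auc >= 0.9:
        return "excellent"
    if auc >= 0.8:
        return "good"
    if auc >= 0.7:
        return "fair"
    return "below fair"


# 581 patients, as in the abstract; the split itself is illustrative.
train_idx, valid_idx = train_validation_split(581)

# Toy validation labels (1 = converted to inpatient) and model scores.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.6, 0.4, 0.7, 0.8, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(len(train_idx), len(valid_idx), toy_auc, grade_auc(toy_auc))
```

Scoring only on indices the model never saw during training is what lets a gap between training and validation AUC reveal overfitting.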
Affiliation(s)
- Lukas Schönnagel
  - Spine Care Institute, Hospital for Special Surgery, 535 East 70th Street, New York, NY 10021, USA; Center for Musculoskeletal Surgery, Charité - Universitätsmedizin Berlin, Freie Universität Berlin, Charitéplatz 1, 10117 Berlin, Germany
- Soji Tani
  - Spine Care Institute, Hospital for Special Surgery, 535 East 70th Street, New York, NY 10021, USA; Department of Orthopaedic Surgery, Showa University School of Medicine, 1-5-8 Hatanodai, Shinagawa-ku, Tokyo 142-8666, Japan
- Tu-Lan Vu-Han
  - Center for Musculoskeletal Surgery, Charité - Universitätsmedizin Berlin, Freie Universität Berlin, Charitéplatz 1, 10117 Berlin, Germany
- Jiaqi Zhu
  - Biostatistics Core, Hospital for Special Surgery, 541 E. 71st Street, New York, NY 10021, USA
- Gaston Camino-Willhuber
  - Spine Care Institute, Hospital for Special Surgery, 535 East 70th Street, New York, NY 10021, USA
- Yusuke Dodo
  - Department of Orthopaedic Surgery, Showa University School of Medicine, 1-5-8 Hatanodai, Shinagawa-ku, Tokyo 142-8666, Japan
- Thomas Caffard
  - Spine Care Institute, Hospital for Special Surgery, 535 East 70th Street, New York, NY 10021, USA; Department of Orthopedic Surgery, University of Ulm, Oberer Eselsberg 45, 89081 Ulm, Germany
- Erika Chiapparelli
  - Spine Care Institute, Hospital for Special Surgery, 535 East 70th Street, New York, NY 10021, USA
- Lisa Oezel
  - Department of Orthopedic Surgery and Traumatology, University Hospital Duesseldorf, Moorenstraße 5, 40225 Duesseldorf, Germany
- Jennifer Shue
  - Spine Care Institute, Hospital for Special Surgery, 535 East 70th Street, New York, NY 10021, USA
- William D Zelenty
  - Spine Care Institute, Hospital for Special Surgery, 535 East 70th Street, New York, NY 10021, USA
- Darren R Lebl
  - Spine Care Institute, Hospital for Special Surgery, 535 East 70th Street, New York, NY 10021, USA
- Frank P Cammisa
  - Spine Care Institute, Hospital for Special Surgery, 535 East 70th Street, New York, NY 10021, USA
- Federico P Girardi
  - Spine Care Institute, Hospital for Special Surgery, 535 East 70th Street, New York, NY 10021, USA
- Gbolabo Sokunbi
  - Spine Care Institute, Hospital for Special Surgery, 535 East 70th Street, New York, NY 10021, USA
- Alexander P Hughes
  - Spine Care Institute, Hospital for Special Surgery, 535 East 70th Street, New York, NY 10021, USA
- Andrew A Sama
  - Spine Care Institute, Hospital for Special Surgery, 535 East 70th Street, New York, NY 10021, USA.
|
14
|
Alan N, Zenkin S, Lavadi RS, Legarreta AD, Hudson JS, Fields DP, Agarwal N, Mamindla P, Ak M, Peddagangireddy V, Puccio L, Buell TJ, Hamilton DK, Kanter AS, Okonkwo DO, Zinn PO, Colen RR. Associating T1-Weighted and T2-Weighted Magnetic Resonance Imaging Radiomic Signatures With Preoperative Symptom Severity in Patients With Cervical Spondylotic Myelopathy. World Neurosurg 2024; 184:e137-e143. [PMID: 38253177 DOI: 10.1016/j.wneu.2024.01.072] [Received: 11/18/2023] [Accepted: 01/14/2024] [Indexed: 01/24/2024]
Abstract
BACKGROUND Preoperative symptom severity in cervical spondylotic myelopathy (CSM) can be variable. Radiomic signatures could provide an imaging biomarker for symptom severity in CSM. This study uses radiomic signatures of T1-weighted and T2-weighted magnetic resonance images to correlate with preoperative symptom severity based on modified Japanese Orthopaedic Association (mJOA) scores for patients with CSM. METHODS Sixty-two patients with CSM were identified. Preoperative T1-weighted and T2-weighted magnetic resonance images for each patient were segmented from C2-C7. A total of 205 texture features were extracted from each volume of interest. After feature normalization, each second-order feature was further subdivided to yield a total of 400 features from each volume of interest for analysis. Supervised machine learning was used to build radiomic models. RESULTS The patient cohort had a median preoperative mJOA score of 13; of these, 30 patients had a score of >13 (low severity) and 32 patients had a score of ≤13 (high severity). Radiomic analysis of T2-weighted imaging resulted in 4 radiomic signatures that correlated with preoperative mJOA with a sensitivity, specificity, and accuracy of 78%, 89%, and 83%, respectively (P < 0.004). The areas under the ROC curve were 0.69, 0.70, and 0.77 for models generated by independent T1 texture features, T1 and T2 texture features in combination, and independent T2 texture features, respectively. CONCLUSIONS Radiomic models correlate with preoperative mJOA scores using T2 texture features in patients with CSM. This may serve as a surrogate, objective imaging biomarker to measure the preoperative functional status of patients.
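The severity split at the median mJOA score and the reported sensitivity, specificity, and accuracy can be illustrated with a small sketch. The helper names, the toy scores, and the toy predictions below are hypothetical, and the sketch treats "high severity" as the positive class, which is an assumption rather than something the abstract states.

```python
def severity_label(mjoa_score, cutoff=13):
    """Label a patient 'high' severity if mJOA <= cutoff, else 'low' (as in the abstract)."""
    return "high" if mjoa_score <= cutoff else "low"


def binary_metrics(true_labels, predicted_labels, positive="high"):
    """Return (sensitivity, specificity, accuracy) for two label sequences."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)   # share of high-severity cases detected
    specificity = tn / (tn + fp)   # share of low-severity cases correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Toy mJOA scores and toy model predictions, for illustration only.
toy_scores = [10, 12, 13, 14, 16, 17]
truth = [severity_label(s) for s in toy_scores]
predicted = ["high", "high", "low", "low", "low", "high"]
sensitivity, specificity, accuracy = binary_metrics(truth, predicted)
print(truth, sensitivity, specificity, accuracy)
```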
Affiliation(s)
- Nima Alan
  - Department of Neurological Surgery, University of California, San Francisco, San Francisco, California.
- Serafettin Zenkin
  - Department of Radiology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania
- Raj Swaroop Lavadi
  - Department of Neurological Surgery, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania
- Andrew D Legarreta
  - Department of Neurological Surgery, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania
- Joseph S Hudson
  - Department of Neurological Surgery, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania
- Daryl P Fields
  - Department of Neurological Surgery, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania
- Nitin Agarwal
  - Department of Neurological Surgery, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania; Department of Neurological Surgery, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania
- Priyadarshini Mamindla
  - Department of Radiology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania
- Murat Ak
  - Department of Radiology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania
- Vishal Peddagangireddy
  - Department of Radiology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania
- Lauren Puccio
  - Department of Neurological Surgery, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania
- Thomas J Buell
  - Department of Neurological Surgery, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania
- D Kojo Hamilton
  - Department of Neurological Surgery, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania
- Adam S Kanter
  - Department of Neurosurgery, Hoag Neurosciences Institute, Newport Beach, California
- David O Okonkwo
  - Department of Neurological Surgery, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania
- Pascal O Zinn
  - Department of Neurological Surgery, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania
- Rivka R Colen
  - Department of Radiology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania; Hillman Cancer Center, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania
|
15
|
Mishra M, Chen PH, Lin GY, Nguyen TTN, Le TC, Dejchanchaiwong R, Tekasakul P, Shih SH, Jhang CW, Tsai CJ. Photochemical oxidation of VOCs and their source impact assessment on ozone under de-weather conditions in Western Taiwan. Environ Pollut 2024; 346:123662. [PMID: 38417604 DOI: 10.1016/j.envpol.2024.123662] [Received: 01/15/2024] [Revised: 02/17/2024] [Accepted: 02/25/2024] [Indexed: 03/01/2024]
Abstract
Statistical models can provide crucial information for mitigating ozone (O3) pollution by capturing its associations with explanatory variables, including reactive precursors (VOCs and NOx) and meteorology. Given the large contribution of O3 to degraded air quality in western Taiwan, three years (2019-2021) of hourly VOC, NOx, and O3 concentration data from four monitoring stations in western Taiwan, Tucheng (TC), Zhongming (ZM), Taixi (TX), and Xiaogang (XG), were evaluated to identify the effect of anthropogenic emissions on O3 formation. Because the high ambient reactivity of VOCs leads to underestimation of their sources, photochemical oxidation was assessed to calculate the consumed VOC (VOCcons), followed by source identification based on initial concentrations. VOCcons was highest in summer (16.7 and 22.7 ppbC) at the northern stations (TC and ZM, respectively) and in autumn (17.8 and 11.4 ppbC) at the southern stations (TX and XG, respectively). Solvents (25-27%) were the major VOC source at the northern stations, whereas industrial emissions (30%) dominated in the south. Furthermore, de-weather analysis based on a machine learning (ML) model, eXtreme Gradient Boosting (XGBoost), indicated that meteorological factors tend to reduce ambient O3 levels at the TC, ZM, and XG stations (-67%, -47%, and -21%, respectively) but play a major role in accumulating O3 (+38%) at the TX station, where O3 is primarily transported from the upwind region of south-central Taiwan. The ML outputs suggest that the findings can be used for region-specific, data-driven control of emissions from VOC sources, prioritized to limit O3 pollution at the study locations as well as its accumulation in distant regions.
Affiliation(s)
- Manisha Mishra
  - Institute of Environmental Engineering, National Yang Ming Chiao Tung University, Hsinchu 30010, Taiwan
- Pin-Hsin Chen
  - Institute of Environmental Engineering, National Yang Ming Chiao Tung University, Hsinchu 30010, Taiwan
- Guan-Yu Lin
  - Department of Environmental Science and Engineering, Tunghai University, Taichung 407302, Taiwan
- Thi-Thuy-Nghiem Nguyen
  - Institute of Environmental Engineering, National Yang Ming Chiao Tung University, Hsinchu 30010, Taiwan
- Thi-Cuc Le
  - Institute of Environmental Engineering, National Yang Ming Chiao Tung University, Hsinchu 30010, Taiwan
- Racha Dejchanchaiwong
  - Air Pollution and Health Effect Research Center, and Department of Chemical Engineering, Prince of Songkla University, Songkhla 90100, Thailand
- Perapong Tekasakul
  - Air Pollution and Health Effect Research Center, and Department of Mechanical and Mechatronics Engineering, Prince of Songkla University, Songkhla 90100, Thailand
- Shih-Heng Shih
  - Wisdom Environmental Technical Service and Consultant Company, New Taipei City, Taiwan
- Chuen-Jinn Tsai
  - Institute of Environmental Engineering, National Yang Ming Chiao Tung University, Hsinchu 30010, Taiwan.
|
16
|
Yanagawa R, Iwadoh K, Akabane M, Imaoka Y, Bozhilov KK, Melcher ML, Sasaki K. LightGBM outperforms other machine learning techniques in predicting graft failure after liver transplantation: Creation of a predictive model through large-scale analysis. Clin Transplant 2024; 38:e15316. [PMID: 38607291 DOI: 10.1111/ctr.15316] [Received: 08/25/2023] [Revised: 03/18/2024] [Accepted: 03/24/2024] [Indexed: 04/13/2024]
Abstract
BACKGROUND The incidence of graft failure following liver transplantation (LTx) is consistent. While traditional risk scores for LTx have limited accuracy, the potential of machine learning (ML) in this area remains uncertain, despite its promise in other transplant domains. This study aims to determine ML's predictive limitations in LTx by replicating methods used in previous heart transplant research. METHODS This study utilized the UNOS STAR database, selecting 64,384 adult patients who underwent LTx between 2010 and 2020. Gradient boosting models (XGBoost and LightGBM) were used to predict 14-, 30-, and 90-day graft failure and were compared with a conventional logistic regression model. Models were evaluated using both shuffled and rolling cross-validation (CV) methodologies. Model performance was assessed using the AUC across validation iterations. RESULTS In comparing predictive models for 14-day, 30-day, and 90-day graft survival, LightGBM consistently outperformed the other models, achieving the highest AUCs of 0.740, 0.722, and 0.700 with shuffled CV. However, with rolling CV, accuracy declined for every ML algorithm. The analysis revealed influential factors for graft survival prediction across all models, including total bilirubin, medical condition, recipient age, and donor AST, among others. Several features, such as donor age and recipient diabetes history, were important in two out of three models. CONCLUSIONS LightGBM enhances short-term graft survival predictions post-LTx. However, due to changing medical practices and selection criteria, continuous model evaluation is essential. Future studies should focus on temporal variations and clinical implications, and should ensure model transparency for broader medical utility.
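The contrast between shuffled and rolling cross-validation can be made concrete with a short sketch. The abstract does not specify the exact rolling scheme, so the version below assumes one common reading: train on all earlier transplant years and validate on the next year. The function names and the toy year labels are hypothetical.

```python
import random


def shuffled_folds(n_samples, n_folds, seed=0):
    """Yield (train, test) index lists after shuffling, ignoring time order."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = sorted(folds[k])
        train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train, test


def rolling_folds(years, min_train_years=1):
    """Yield (test_year, train, test): train only on years before the test year."""
    unique_years = sorted(set(years))
    for test_year in unique_years[min_train_years:]:
        train = [i for i, y in enumerate(years) if y < test_year]
        test = [i for i, y in enumerate(years) if y == test_year]
        yield test_year, train, test


# Toy transplant years for six samples, for illustration only.
toy_years = [2010, 2010, 2011, 2011, 2012, 2012]
rolling = list(rolling_folds(toy_years))
shuffled = list(shuffled_folds(10, 5))
print(rolling)
```

Shuffled folds let the model train on cases from later years than those it is tested on, whereas rolling folds do not, which is one plausible reason performance drops under rolling CV when practice patterns change over time.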
Affiliation(s)
- Kazuhiro Iwadoh
  - Department of Transplant Surgery, Mita Hospital, International University of Health and Welfare, Tokyo, Japan
- Miho Akabane
  - Division of Abdominal Transplant, Department of Surgery, Stanford University Medical Center, Stanford, California, USA
- Yuki Imaoka
  - Division of Abdominal Transplant, Department of Surgery, Stanford University Medical Center, Stanford, California, USA
  - Department of Gastroenterological and Transplant Surgery, Graduate School of Biomedical and Health Sciences, Hiroshima University, Hiroshima, Japan
- Kliment Krassimirov Bozhilov
  - Division of Abdominal Transplant, Department of Surgery, Stanford University Medical Center, Stanford, California, USA
- Marc L Melcher
  - Division of Abdominal Transplant, Department of Surgery, Stanford University Medical Center, Stanford, California, USA
- Kazunari Sasaki
  - Division of Abdominal Transplant, Department of Surgery, Stanford University Medical Center, Stanford, California, USA
|
17
|
Wu J, Chen X, Li R, Wang A, Huang S, Li Q, Qi H, Liu M, Cheng H, Wang Z. A novel framework for high resolution air quality index prediction with interpretable artificial intelligence and uncertainties estimation. J Environ Manage 2024; 357:120785. [PMID: 38583378 DOI: 10.1016/j.jenvman.2024.120785] [Received: 09/08/2023] [Revised: 02/02/2024] [Accepted: 03/27/2024] [Indexed: 04/09/2024]
Abstract
Accurate air quality index (AQI) prediction is essential in environmental monitoring and management. Because previous studies neglected uncertainty estimation and the need to constrain the output during prediction, we propose a new hybrid model, TMSSICX, to forecast the AQI of multiple cities. First, time-varying filter-based empirical mode decomposition (TVFEMD) was adopted to decompose the AQI sequence into multiple intrinsic mode function (IMF) components. Second, multi-scale fuzzy entropy (MFE) was applied to evaluate the complexity of each IMF component, and the components were clustered into high- and low-frequency portions. The high-frequency portion was then decomposed a second time by successive variational mode decomposition (SVMD) to reduce volatility. Next, six air pollutant concentrations, namely CO, SO2, PM2.5, PM10, O3, and NO2, were used as inputs. The secondary decomposition and preliminary portion were employed as the outputs for the bidirectional long short-term memory network optimized by the snake optimization algorithm (SOABiLSTM) and improved CatBoost (ICatboost), respectively. Furthermore, extreme gradient boosting (XGBoost) was applied to ensemble the sub-model predictions into the final forecast. Finally, we introduced adaptive kernel density estimation (AKDE) for interval estimation. The empirical results indicated that the TMSSICX model achieved the best performance among the 23 other models across all datasets. Moreover, using XGBoost to ensemble the sub-model predictions led to 8.73%, 8.94%, and 0.19% reductions in RMSE compared to SVM. Additionally, SHapley Additive exPlanations (SHAP) were used to assess the impact of the six pollutant concentrations on AQI; the results reveal that PM2.5 and PM10 had the most notable positive effects on the long-term trend of AQI. We hope this model can provide guidance for air quality management.
Affiliation(s)
- Junhao Wu
  - State Key Laboratory of Estuarine and Coastal Research, East China Normal University, Shanghai, 200062, China
- Xi Chen
  - School of Geographic Sciences, East China Normal University, Shanghai, 200241, China; Key Laboratory of Geographic Information Science, Ministry of Education, East China Normal University, Shanghai, 200241, China; Key Laboratory of Spatial-Temporal Big Data Analysis and Application of Natural Resources in Megacities, Ministry of Natural Resources, Shanghai, 200241, China.
- Rui Li
  - School of Geographic Sciences, East China Normal University, Shanghai, 200241, China
- Anqi Wang
  - Department of Mathematics, The University of Manchester, Manchester, M13 9PL, UK
- Shutong Huang
  - School of Geographic Sciences, East China Normal University, Shanghai, 200241, China
- Qingli Li
  - Shanghai Key Laboratory of Multidimensional Information Processing, East China Normal University, Shanghai, 200241, China
- Honggang Qi
  - School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, 100049, China
- Min Liu
  - School of Geographic Sciences, East China Normal University, Shanghai, 200241, China; Key Laboratory of Geographic Information Science, Ministry of Education, East China Normal University, Shanghai, 200241, China
- Heqin Cheng
  - State Key Laboratory of Estuarine and Coastal Research, East China Normal University, Shanghai, 200062, China.
- Zhaocai Wang
  - College of Information, Shanghai Ocean University, Shanghai, 201306, China.
|
18
|
Yilmaz R, Yagin FH, Colak C, Toprak K, Abdel Samee N, Mahmoud NF, Alshahrani AA. Analysis of hematological indicators via explainable artificial intelligence in the diagnosis of acute heart failure: a retrospective study. Front Med (Lausanne) 2024; 11:1285067. [PMID: 38633310 PMCID: PMC11023638 DOI: 10.3389/fmed.2024.1285067] [Received: 08/29/2023] [Accepted: 03/14/2024] [Indexed: 04/19/2024] Open
Abstract
Introduction Acute heart failure (AHF) is a serious medical problem that necessitates hospitalization and often results in death. Patients hospitalized in the emergency department (ED) should therefore receive an immediate diagnosis and treatment. Unfortunately, there is not yet a fast and accurate laboratory test for identifying AHF. The purpose of this research is to apply the principles of explainable artificial intelligence (XAI) to the analysis of hematological indicators for the diagnosis of AHF. Methods In this retrospective analysis, 425 patients with AHF and 430 healthy individuals were assessed. Patients' demographic and hematological information was analyzed to diagnose AHF. Important risk variables for AHF diagnosis were identified using Least Absolute Shrinkage and Selection Operator (LASSO) feature selection. To test the efficacy of the suggested prediction model, Extreme Gradient Boosting (XGBoost), a 10-fold cross-validation procedure was implemented. The area under the receiver operating characteristic curve (AUC), F1 score, Brier score, Positive Predictive Value (PPV), and Negative Predictive Value (NPV) were all computed to evaluate the model's efficacy. Permutation-based analysis and SHAP were used to assess the importance and influence of the model's incorporated risk factors. Results White blood cell (WBC), monocyte, neutrophil, neutrophil-lymphocyte ratio (NLR), red cell distribution width-standard deviation (RDW-SD), RDW-coefficient of variation (RDW-CV), and platelet distribution width (PDW) values were significantly higher in AHF patients than in the healthy group (p < 0.05). On the other hand, erythrocyte, hemoglobin, basophil, lymphocyte, mean platelet volume (MPV), platelet, hematocrit, mean erythrocyte hemoglobin (MCH), and procalcitonin (PCT) values were found to be significantly lower in AHF patients compared to healthy controls (p < 0.05).
When XGBoost was used in conjunction with LASSO to diagnose AHF, the resulting model had an AUC of 87.9%, an F1 score of 87.4%, and a Brier score of 0.036. PDW, age, RDW-SD, and PLT were identified as the most crucial risk factors in differentiating AHF. Conclusion The results of this study showed that XAI combined with ML could successfully diagnose AHF. SHAP descriptions show that advanced age, low platelet count, high RDW-SD, and PDW are the primary hematological parameters for the diagnosis of AHF.
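Several of the evaluation metrics named in this abstract (Brier score, F1 score, PPV, NPV) can be computed directly from predicted probabilities and true labels. The sketch below is a minimal plain-Python illustration; the function names, the 0.5 decision threshold, and the toy data are assumptions and do not come from the study.

```python
def brier_score(labels, probabilities):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    if len(labels) != len(probabilities) or not labels:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(labels, probabilities)) / len(labels)


def _ratio(numerator, denominator):
    """Safe division that returns NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def threshold_metrics(labels, probabilities, threshold=0.5):
    """Return PPV, NPV, and F1 after thresholding the predicted probabilities."""
    tp = fp = tn = fn = 0
    for y, p in zip(labels, probabilities):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return {
        "ppv": _ratio(tp, tp + fp),              # precision among predicted positives
        "npv": _ratio(tn, tn + fn),              # correctness among predicted negatives
        "f1": _ratio(2 * tp, 2 * tp + fp + fn),  # harmonic mean of precision and recall
    }


# Toy labels (1 = AHF) and predicted probabilities, for illustration only.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_probs = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1]
toy_brier = brier_score(toy_labels, toy_probs)
toy_metrics = threshold_metrics(toy_labels, toy_probs)
print(toy_brier, toy_metrics)
```

Note that the Brier score is a loss (lower is better) on a 0 to 1 scale, unlike AUC and F1, where higher is better.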
Affiliation(s)
- Rustem Yilmaz
  - Department of Cardiology, Samsun Training and Research Hospital, Samsun University Faculty of Medicine, Samsun, Türkiye
- Fatma Hilal Yagin
  - Department of Biostatistics and Medical Informatics, Inonu University Faculty of Medicine, Malatya, Türkiye
- Cemil Colak
  - Department of Biostatistics and Medical Informatics, Inonu University Faculty of Medicine, Malatya, Türkiye
- Kenan Toprak
  - Department of Cardiology, Faculty of Medicine, Harran University, Sanlıurfa, Türkiye
- Nagwan Abdel Samee
  - Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
- Noha F. Mahmoud
  - Department of Rehabilitation Sciences, Health and Rehabilitation Sciences College, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
- Amnah Ali Alshahrani
  - Department of Computer Science and Information Technology, Applied College, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
|
19
|
Nath SJ, Girach IA, Harithasree S, Bhuyan K, Ojha N, Kumar M. Urban ozone variability using automated machine learning: inference from different feature importance schemes. Environ Monit Assess 2024; 196:393. [PMID: 38520559 DOI: 10.1007/s10661-024-12549-7] [Received: 10/16/2023] [Accepted: 03/16/2024] [Indexed: 03/25/2024]
Abstract
Tropospheric ozone is an air pollutant at ground level and a greenhouse gas that contributes significantly to global warming. Strong anthropogenic emissions in and around urban environments enhance surface ozone pollution, adversely affecting human health and vegetation. However, observations are often scarce and the factors driving ozone variability remain uncertain in the developing regions of the world. Here, we conducted machine learning (ML) simulations of ozone variability and comprehensively examined the governing factors over a major urban environment (Ahmedabad) in western India. Ozone precursors (NO2, NO, CO, C5H8 and CH2O) from the CAMS (Copernicus Atmosphere Monitoring Service) reanalysis and meteorological parameters from the ERA5 (European Centre for Medium-Range Weather Forecast's (ECMWF) fifth-generation reanalysis) were included as features in the ML models. Automated ML (AutoML) fitted the deep learning model optimally and simulated the daily ozone with a root mean square error (RMSE) of ~2 ppbv, reproducing 84-88% of the variability. The model performance achieved here is comparable to that of widely used ML models (RF-Random Forest and XGBoost-eXtreme Gradient Boosting). Explainability of the models is discussed through different schemes of feature importance, including SAGE (Shapley Additive Global importancE) and permutation importance. The leading features are found to differ across feature importance schemes. We show that urban ozone could be simulated well (RMSE = 2.5 ppbv and R² = 0.78) by considering the first four leading features from different schemes, which are consistent with ozone photochemistry. Our study underscores the need to conduct science-informed analysis of feature importance from multiple schemes to infer the roles of input variables in ozone variability.
AutoML-based studies, exploiting the potential of long-term observations, can strongly complement conventional chemistry-transport modelling and can also help in accurate simulation and forecasting of urban ozone.
Collapse
Affiliation(s)
- Sankar Jyoti Nath
  - Centre for Environment and Energy Development, Ranchi, 834001, India
- Imran A Girach
  - Space Applications Centre, Indian Space Research Organisation, Ahmedabad, 380015, India
- S Harithasree
  - Physical Research Laboratory, Ahmedabad, 380009, India
  - Indian Institute of Technology, Gandhinagar, 382055, Gujarat, India
- Kalyan Bhuyan
  - Centre for Atmospheric Studies, Dibrugarh University, Dibrugarh, 786004, India
- Narendra Ojha
  - Physical Research Laboratory, Ahmedabad, 380009, India
- Manish Kumar
  - Centre for Environment and Energy Development, Ranchi, 834001, India
20
Myśliwiec P, Kubit A, Szawara P. Optimization of 2024-T3 Aluminum Alloy Friction Stir Welding Using Random Forest, XGBoost, and MLP Machine Learning Techniques. Materials (Basel) 2024; 17:1452. [PMID: 38611968 PMCID: PMC11012866 DOI: 10.3390/ma17071452] [Received: 03/04/2024] [Revised: 03/18/2024] [Accepted: 03/20/2024] [Indexed: 04/14/2024]
Abstract
This study optimized friction stir welding (FSW) parameters for 1.6 mm thick 2024-T3 aluminum alloy sheets. A 3 × 3 factorial design was employed to explore tool rotation speeds (1100 to 1300 rpm) and welding speeds (140 to 180 mm/min). Static tensile tests showed a maximum joint strength of 87% relative to the base material. Hyperparameter optimization was conducted using grid search for machine learning (ML) models, including random forest and XGBoost, and for multilayer perceptron artificial neural network (MLP-ANN) models. Welding parameter optimization and extrapolation were then carried out, with final strength predictions analyzed using response surface methodology (RSM). The ML models achieved over 98% accuracy in parameter regression, demonstrating significant effectiveness in FSW process enhancement. The optimized parameters, validated experimentally, resulted in an FSW joint efficiency of 93% relative to the base material. This outcome highlights the critical role of advanced analytical techniques in improving welding quality and efficiency.
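As a rough illustration of the 3 × 3 factorial design mentioned in this abstract, the sketch below enumerates the nine parameter combinations. The abstract reports only the ranges, so the middle levels (1200 rpm and 160 mm/min) and all names in the code are assumptions made for the example, not values taken from the paper.

```python
from itertools import product

# End points come from the abstract; the midpoints are assumed, not reported there.
ROTATION_SPEEDS_RPM = [1100, 1200, 1300]
WELDING_SPEEDS_MM_MIN = [140, 160, 180]


def factorial_design(levels_a, levels_b):
    """Return every (a, b) combination of a full two-factor factorial design."""
    return list(product(levels_a, levels_b))


# Hypothetical run table: one row per welding trial.
runs = factorial_design(ROTATION_SPEEDS_RPM, WELDING_SPEEDS_MM_MIN)
for run_id, (rpm, feed) in enumerate(runs, start=1):
    print(f"run {run_id}: {rpm} rpm, {feed} mm/min")
```

A full factorial design with three levels per factor gives 3 × 3 = 9 runs, which is why such designs are practical only for a small number of factors.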
Affiliation(s)
- Piotr Myśliwiec
  - Department of Materials Forming and Processing, Rzeszow University of Technology, al. Powst. Warszawy 8, 35-959 Rzeszów, Poland
- Andrzej Kubit
  - Department of Manufacturing and Production Engineering, Rzeszow University of Technology, al. Powst. Warszawy 8, 35-959 Rzeszów, Poland
- Paulina Szawara
  - Doctoral School of Engineering and Technical Sciences, Rzeszow University of Technology, al. Powst. Warszawy 12, 35-959 Rzeszów, Poland
21
Miao R, Dong Q, Liu X, Chen Y, Wang J, Chen J. A cost-effective, machine learning-driven approach for screening arterial functional aging in a large-scale Chinese population. Front Public Health 2024; 12:1365479. [PMID: 38572001 PMCID: PMC10987946 DOI: 10.3389/fpubh.2024.1365479] [Received: 01/04/2024] [Accepted: 02/23/2024] [Indexed: 04/05/2024]
Abstract
Introduction An easily accessible and cost-free machine learning model based on prior probabilities of vascular aging enables an application to pinpoint high-risk populations before physical checks and optimize healthcare investment. Methods A dataset containing questionnaire responses and physical measurement parameters from 77,134 adults was extracted from the electronic records of the Health Management Center at the Third Xiangya Hospital. The least absolute shrinkage and selection operator and recursive feature elimination-Lightweight Gradient Elevator were employed to select features from a pool of potential covariates. The participants were randomly divided into training (70%) and test cohorts (30%). Four machine learning algorithms were applied to build the screening models for elevated arterial stiffness (EAS), and the performance of models was evaluated by calculating the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and accuracy. Results Fourteen easily accessible features were selected to construct the model, including "systolic blood pressure" (SBP), "age," "waist circumference," "history of hypertension," "sex," "exercise," "awareness of normal blood pressure," "eat fruit," "work intensity," "drink milk," "eat bean products," "smoking," "alcohol consumption," and "Irritableness." The extreme gradient boosting (XGBoost) model outperformed the other three models, achieving AUC values of 0.8722 and 0.8710 in the training and test sets, respectively. The most important five features are SBP, age, waist, history of hypertension, and sex. Conclusion The XGBoost model ideally assesses the prior probability of the current EAS in the general population. The integration of the model into primary care facilities has the potential to lower medical expenses and enhance the management of arterial aging.
Affiliation(s)
- Rujia Miao
  - Health Management Medicine Center, The Third Xiangya Hospital, Central South University, Changsha, China
- Qian Dong
  - School of Science, Hunan University of Technology and Business, Changsha, China
- Xuelian Liu
  - Health Management Medicine Center, The Third Xiangya Hospital, Central South University, Changsha, China
- Yingying Chen
  - Health Management Medicine Center, The Third Xiangya Hospital, Central South University, Changsha, China
- Jiangang Wang
  - Health Management Medicine Center, The Third Xiangya Hospital, Central South University, Changsha, China
- Jianwen Chen
  - School of Science, Hunan University of Technology and Business, Changsha, China
22
Codde C, Rivals F, Destere A, Fromage Y, Labriffe M, Marquet P, Benoist C, Ponthier L, Faucher JF, Woillard JB. A machine learning approach to predict daptomycin exposure from two concentrations based on Monte Carlo simulations. Antimicrob Agents Chemother 2024:e0141523. [PMID: 38501807 DOI: 10.1128/aac.01415-23] [Received: 10/30/2023] [Accepted: 02/23/2024] [Indexed: 03/20/2024]
Abstract
Daptomycin is a concentration-dependent lipopeptide antibiotic for which exposure/effect relationships have been shown. Machine learning (ML) algorithms, developed to predict the individual exposure to drugs, have shown very good performances in comparison to maximum a posteriori Bayesian estimation (MAP-BE). The aim of this work was to predict the area under the blood concentration curve (AUC) of daptomycin from two samples and a few covariates using XGBoost ML algorithm trained on Monte Carlo simulations. Five thousand one hundred fifty patients were simulated from two literature population pharmacokinetics models. Data from the first model were split into a training set (75%) and a testing set (25%). Four ML algorithms were built to learn AUC based on daptomycin blood concentration samples at pre-dose and 1 h post-dose. The XGBoost model (best ML algorithm) with the lowest root mean square error (RMSE) in a 10-fold cross-validation experiment was evaluated in both the test set and the simulations from the second population pharmacokinetic model (validation). The ML model based on the two concentrations, the differences between these concentrations, and five other covariates (sex, weight, daptomycin dose, creatinine clearance, and body temperature) yielded very good AUC estimation in the test (relative bias/RMSE = 0.43/7.69%) and validation sets (relative bias/RMSE = 4.61/6.63%). The XGBoost ML model developed allowed accurate estimation of daptomycin AUC using C0, C1h, and a few covariates and could be used for exposure estimation and dose adjustment. This ML approach can facilitate the conduct of future therapeutic drug monitoring (TDM) studies.
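The abstract reports relative bias and relative RMSE without defining them. The sketch below uses one common convention (per-patient error divided by the reference AUC, expressed in percent), which is an assumption rather than the authors' stated formula. The function names and the AUC values are made up for illustration.

```python
import math


def relative_bias_pct(predicted, reference):
    """Mean of (predicted - reference) / reference, in percent."""
    errors = [(p - r) / r for p, r in zip(predicted, reference)]
    return 100.0 * sum(errors) / len(errors)


def relative_rmse_pct(predicted, reference):
    """Root mean square of (predicted - reference) / reference, in percent."""
    squared = [((p - r) / r) ** 2 for p, r in zip(predicted, reference)]
    return 100.0 * math.sqrt(sum(squared) / len(squared))


# Illustrative numbers only; these are not data from the study.
reference_auc = [100.0, 200.0, 400.0]
predicted_auc = [110.0, 190.0, 400.0]

print(f"relative bias: {relative_bias_pct(predicted_auc, reference_auc):.2f}%")
print(f"relative RMSE: {relative_rmse_pct(predicted_auc, reference_auc):.2f}%")
```

Under this convention the bias can be close to zero while the RMSE stays large, because positive and negative errors cancel in the former but not in the latter.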
Affiliation(s)
- Cyrielle Codde
  - Service de Maladies Infectieuses et Tropicales, CHU Dupuytren, Limoges, France
- Florence Rivals
  - Service de Pharmacologie, Toxicologie et Pharmacovigilance, CHU Dupuytren, Limoges, France
- Yeleen Fromage
  - Service de Pharmacologie, Toxicologie et Pharmacovigilance, CHU Dupuytren, Limoges, France
- Marc Labriffe
  - Service de Pharmacologie, Toxicologie et Pharmacovigilance, CHU Dupuytren, Limoges, France
  - Inserm, Univ. Limoges, CHU Limoges, Pharmacology & Toxicology, Limoges, France
- Pierre Marquet
  - Service de Pharmacologie, Toxicologie et Pharmacovigilance, CHU Dupuytren, Limoges, France
  - Inserm, Univ. Limoges, CHU Limoges, Pharmacology & Toxicology, Limoges, France
- Clément Benoist
  - Inserm, Univ. Limoges, CHU Limoges, Pharmacology & Toxicology, Limoges, France
- Laure Ponthier
  - Inserm, Univ. Limoges, CHU Limoges, Pharmacology & Toxicology, Limoges, France
- Jean-Baptiste Woillard
  - Service de Pharmacologie, Toxicologie et Pharmacovigilance, CHU Dupuytren, Limoges, France
  - Inserm, Univ. Limoges, CHU Limoges, Pharmacology & Toxicology, Limoges, France
23
Zhou X, Chen X, Tang L, Wang Y, Zheng J, Zhang W. Event-related driver stress detection with smartphones in an urban environment: a naturalistic driving study. Ergonomics 2024:1-19. [PMID: 38501496 DOI: 10.1080/00140139.2024.2323997] [Received: 01/23/2023] [Accepted: 02/23/2024] [Indexed: 03/20/2024]
Abstract
Driving in urban areas can be challenging and can induce acute stress. To detect driver stress, it is preferable to collect data on real roads without interfering with the driver. A smartphone-based data collection protocol was developed to support a naturalistic driving study. Sixty-one participants drove on predetermined real road routes, and driving information as well as physiological, psychological, and facial data were collected. An algorithm identified potentially stressful events based on the collected data. After the experiment, participants watched the recorded videos and classified these events as low, medium, or highly stressful. These events were then used to train prediction models. The best model achieved an accuracy of 92.5% in classifying low/medium/highly stressful events. The contribution of physiological, psychological, and facial expression indices and individual profile information was evaluated. The method can be applied to visualise the geographical distribution of stressors, monitor driver behaviour, and help drivers regulate their driving habits.
Affiliation(s)
- Xin Zhou
  - Department of Industrial Engineering, Tsinghua University, Beijing, China
- Xing Chen
  - Human Factors Engineering Laboratory, Chongqing Changan Automobile Co., Ltd, Chongqing, China
- Liu Tang
  - Human Factors Engineering Laboratory, Chongqing Changan Automobile Co., Ltd, Chongqing, China
- Yi Wang
  - Department of Industrial Engineering, Tsinghua University, Beijing, China
- Jingyue Zheng
  - Department of Industrial Engineering, Tsinghua University, Beijing, China
- Wei Zhang
  - Department of Industrial Engineering, Tsinghua University, Beijing, China
24
Liu L, Zhang P, Liu Z, Sun T, Qiao H. Joint global and local interpretation method for CIN status classification in breast cancer. Heliyon 2024; 10:e27054. [PMID: 38562500 PMCID: PMC10982965 DOI: 10.1016/j.heliyon.2024.e27054] [Received: 06/14/2023] [Revised: 12/10/2023] [Accepted: 02/22/2024] [Indexed: 04/04/2024]
Abstract
Breast cancer is among the cancer types with the highest numbers of new cases. The study of this disease from a microscopic perspective has been a prominent research topic. Previous studies have shown that microRNAs (miRNAs) are closely linked to chromosomal instability (CIN). Correctly predicting CIN status from miRNAs can help to improve the survival of breast cancer patients. In this study, a joint global and local interpretation method called GL_XGBoost is proposed for predicting CIN status in breast cancer. GL_XGBoost integrates the eXtreme Gradient Boosting (XGBoost) and SHapley Additive exPlanation (SHAP) methods. XGBoost is used to predict CIN status from miRNA data, whereas SHAP is used to select miRNA features that have strong relationships with CIN. Furthermore, SHAP's rich visualization strategies enhance the interpretability of the entire model at the global and local levels. The performance of GL_XGBoost is validated on the TCGA-BRCA dataset, and it is shown to have an accuracy of 78.57% and an area under the curve value of 0.87. Rich visual analysis is used to explain the relationships between miRNAs and CIN status from different perspectives. Our study demonstrates an intuitive way of exploring the relationship between CIN and cancer from a microscopic perspective.
Affiliation(s)
- Liangliang Liu
  - College of Information and Management Science, Henan Agricultural University, Zhengzhou, Henan 450046, PR China
- Pei Zhang
  - College of Information and Management Science, Henan Agricultural University, Zhengzhou, Henan 450046, PR China
- Zhihong Liu
  - College of Information and Management Science, Henan Agricultural University, Zhengzhou, Henan 450046, PR China
- Tong Sun
  - College of Information and Management Science, Henan Agricultural University, Zhengzhou, Henan 450046, PR China
- Hongbo Qiao
  - College of Information and Management Science, Henan Agricultural University, Zhengzhou, Henan 450046, PR China
25
Zhang Y, Xiao L, LYu L, Zhang L. Construction of a predictive model for bone metastasis from first primary lung adenocarcinoma within 3 cm based on machine learning algorithm: a retrospective study. PeerJ 2024; 12:e17098. [PMID: 38495760 PMCID: PMC10944632 DOI: 10.7717/peerj.17098] [Received: 08/28/2023] [Accepted: 02/21/2024] [Indexed: 03/19/2024]
Abstract
Background Adenocarcinoma, the most prevalent histological subtype of non-small cell lung cancer, is associated with a significantly higher likelihood of bone metastasis compared to other subtypes. The presence of bone metastasis has a profound adverse impact on patient prognosis. However, accurate bone metastasis prediction models are still lacking. This study therefore aims to employ machine learning algorithms to predict the risk of bone metastasis in patients. Method We collected a dataset comprising 19,454 cases of solitary, primary lung adenocarcinoma with pulmonary nodules measuring less than 3 cm. These cases were diagnosed between 2010 and 2015 and were sourced from the Surveillance, Epidemiology, and End Results (SEER) database. Utilizing clinical feature indicators, we developed predictive models using seven machine learning algorithms, namely extreme gradient boosting (XGBoost), logistic regression (LR), light gradient boosting machine (LightGBM), Adaptive Boosting (AdaBoost), Gaussian Naive Bayes (GNB), multilayer perceptron (MLP) and support vector machine (SVM). Results XGBoost exhibited the best performance among the algorithms compared (training set: AUC: 0.913; test set: AUC: 0.853). Furthermore, for convenient application, we created an online scoring system based on the highest performing model, accessible at the following URL: https://www.xsmartanalysis.com/model/predict/?mid=731symbol=7Fr16wX56AR9Mk233917. Conclusion XGBoost proves to be an effective algorithm for predicting the occurrence of bone metastasis in patients with solitary, primary lung adenocarcinoma featuring pulmonary nodules below 3 cm in size. Moreover, its robust clinical applicability enhances its potential utility.
Affiliation(s)
- Yu Zhang
  - Department of Thoracic Surgery, First Affiliated Hospital of Xinjiang Medical University, Urumqi, Xinjiang, China
- Lixia Xiao
  - Department of Thoracic Surgery, Feicheng Hospital Affiliated to Shandong First Medical University, Taian, Shandong, China
- Lan LYu
  - Department of Plastic Surgery, Feicheng Hospital Affiliated to Shandong First Medical University, Taian, Shandong, China
- Liwei Zhang
  - Department of Thoracic Surgery, First Affiliated Hospital of Xinjiang Medical University, Urumqi, Xinjiang, China
26
Zhang G, Shao F, Yuan W, Wu J, Qi X, Gao J, Shao R, Tang Z, Wang T. Predicting sepsis in-hospital mortality with machine learning: a multi-center study using clinical and inflammatory biomarkers. Eur J Med Res 2024; 29:156. [PMID: 38448999 PMCID: PMC10918942 DOI: 10.1186/s40001-024-01756-0] [Received: 08/30/2023] [Accepted: 02/28/2024] [Indexed: 03/08/2024]
Abstract
BACKGROUND This study aimed to develop and validate an interpretable machine-learning model that uses clinical features and inflammatory biomarkers to predict the risk of in-hospital mortality in critically ill patients with sepsis. METHODS We enrolled all patients diagnosed with sepsis in the Medical Information Mart for Intensive Care IV (MIMIC-IV, v.2.0), the eICU Collaborative Research Database (eICU-CRD 2.0), and the Amsterdam University Medical Centers database (AmsterdamUMCdb 1.0.2). LASSO regression was employed for feature selection. Seven machine-learning methods were applied to develop prognostic models. The optimal model was chosen based on its accuracy, F1 score and area under the curve (AUC) in the validation cohort. We used the SHapley Additive exPlanations (SHAP) method to elucidate the contributions of the features to the model and to analyze how individual features affect the model's output. Finally, Spearman correlation analysis examined the associations among continuous predictor variables, and restricted cubic splines (RCS) explored potential non-linear relationships between continuous risk factors and in-hospital mortality. RESULTS A total of 3535 patients with sepsis were eligible for this study. The median age of the participants was 66 years (IQR, 55-77 years), and 56% were male. After selection, 12 of the 45 clinical parameters collected on the first day after ICU admission remained associated with prognosis and were used to develop machine-learning models. Among the seven constructed models, the eXtreme Gradient Boosting (XGBoost) model achieved the best performance, with an AUC of 0.94 and an F1 score of 0.937 in the validation cohort. Feature importance analysis revealed that age, AST, invasive ventilation treatment, and serum urea nitrogen (BUN) were the four features with the greatest impact on the XGBoost model. Inflammatory biomarkers may have prognostic value. Furthermore, SHAP force analysis visualized how the constructed model arrived at its predictions. CONCLUSIONS This study demonstrated the potential of machine-learning approaches for early prediction of outcomes in patients with sepsis. The SHAP method could improve the interpretability of machine-learning models and help clinicians better understand the reasoning behind the outcome.
Affiliation(s)
- Guyu Zhang
  - Emergency Medicine Clinical Research Center, Beijing Chaoyang Hospital, Capital Medical University, Beijing Key Laboratory of Cardiopulmonary Cerebral Resuscitation, Beijing, 100020, China
- Fei Shao
  - Emergency Medicine Clinical Research Center, Beijing Chaoyang Hospital, Capital Medical University, Beijing Key Laboratory of Cardiopulmonary Cerebral Resuscitation, Beijing, 100020, China
- Wei Yuan
  - Emergency Medicine Clinical Research Center, Beijing Chaoyang Hospital, Capital Medical University, Beijing Key Laboratory of Cardiopulmonary Cerebral Resuscitation, Beijing, 100020, China
- Junyuan Wu
  - Emergency Medicine Clinical Research Center, Beijing Chaoyang Hospital, Capital Medical University, Beijing Key Laboratory of Cardiopulmonary Cerebral Resuscitation, Beijing, 100020, China
- Xuan Qi
  - Emergency Medicine Clinical Research Center, Beijing Chaoyang Hospital, Capital Medical University, Beijing Key Laboratory of Cardiopulmonary Cerebral Resuscitation, Beijing, 100020, China
- Jie Gao
  - Emergency Medicine Clinical Research Center, Beijing Chaoyang Hospital, Capital Medical University, Beijing Key Laboratory of Cardiopulmonary Cerebral Resuscitation, Beijing, 100020, China
- Rui Shao
  - Emergency Medicine Clinical Research Center, Beijing Chaoyang Hospital, Capital Medical University, Beijing Key Laboratory of Cardiopulmonary Cerebral Resuscitation, Beijing, 100020, China
- Ziren Tang
  - Emergency Medicine Clinical Research Center, Beijing Chaoyang Hospital, Capital Medical University, Beijing Key Laboratory of Cardiopulmonary Cerebral Resuscitation, Beijing, 100020, China
- Tao Wang
  - Emergency Medicine Clinical Research Center, Beijing Chaoyang Hospital, Capital Medical University, Beijing Key Laboratory of Cardiopulmonary Cerebral Resuscitation, Beijing, 100020, China
27
Banat R, Daoud S, Taha MO. Ligand-based pharmacophore modeling and machine learning for the discovery of potent aurora A kinase inhibitory leads of novel chemotypes. Mol Divers 2024:10.1007/s11030-024-10814-y. [PMID: 38446372 DOI: 10.1007/s11030-024-10814-y] [Received: 10/02/2023] [Accepted: 01/19/2024] [Indexed: 03/07/2024]
Abstract
Aurora-A (AURKA) is a serine/threonine protein kinase involved in the regulation of numerous processes of cell division. Numerous studies have demonstrated a strong association between AURKA and cancer. AURKA is overexpressed in many cancers, such as colon, breast and prostate cancers. Consequently, AURKA has emerged as a promising target for therapeutic intervention in cancer management. Herein, we describe a computational workflow for the discovery of novel anti-AURKA inhibitory leads, starting with ligand-based assessment of the pharmacophoric space of six diverse sets of inhibitors. Subsequently, machine learning/QSAR modeling was coupled with a genetic function algorithm to search for the best possible combination of machine learner, ligand-based pharmacophore(s) and molecular descriptors capable of explaining variation in anti-AURKA bioactivities within a collected list of inhibitors. Two learners succeeded in achieving acceptable structure/activity correlations, namely, random forests and extreme gradient boosting (XGBoost). Three pharmacophores emerged in the successful ML models. These were then used as 3D search queries to mine the National Cancer Institute database for novel anti-AURKA leads. The 38 top-ranking hits were assessed in vitro for their anti-AURKA bioactivities. Among them, three compounds exhibited promising dose-response curves, with experimental IC50 values ranging from sub-micromolar to low micromolar. Remarkably, two of these compounds are of novel chemotypes.
Affiliation(s)
- Rajaa Banat
  - Department of Pharmaceutical Sciences, Faculty of Pharmacy, University of Jordan, Amman, Jordan
- Safa Daoud
  - Department of Pharmaceutical Chemistry and Pharmacognosy, Faculty of Pharmacy, Applied Sciences Private University, Amman, Jordan
- Mutasem Omar Taha
  - Department of Pharmaceutical Sciences, Faculty of Pharmacy, University of Jordan, Amman, Jordan
28
Fu R, Hao X, Yu J, Wang D, Zhang J, Yu Z, Gao F, Zhou C. Machine learning-based prediction of sertraline concentration in patients with depression through therapeutic drug monitoring. Front Pharmacol 2024; 15:1289673. [PMID: 38510645 PMCID: PMC10953499 DOI: 10.3389/fphar.2024.1289673] [Received: 09/06/2023] [Accepted: 02/21/2024] [Indexed: 03/22/2024]
Abstract
Background: Sertraline is a commonly employed antidepressant in clinical practice. To keep the plasma concentration of sertraline within the therapeutic window, achieving the best effect while avoiding adverse reactions, a personalized model to predict sertraline concentration is necessary. Aims: This study aimed to establish a machine learning-based personalized medication model for patients with depression receiving sertraline, to provide a reference for clinicians formulating drug regimens. Methods: A total of 496 sertraline concentration samples from 415 patients, collected from December 2019 to July 2022 at the First Hospital of Hebei Medical University, formed the dataset. Nine different algorithms, namely, XGBoost, LightGBM, CatBoost, random forest, GBDT, SVM, lasso regression, ANN, and TabNet, were used for modeling to compare their ability to predict sertraline concentration. Results: XGBoost showed the best performance (R² = 0.63) and was chosen to establish the personalized medication model. Five important variables, namely, sertraline dose, alanine transaminase, aspartate transaminase, uric acid, and sex, were shown to be correlated with sertraline concentration. The model's prediction accuracy for sertraline concentration within the therapeutic window was 62.5%. Conclusion: The personalized medication model of sertraline for patients with depression based on XGBoost had good predictive ability, which provides guidance for clinicians in proposing an optimal medication regimen.
Affiliation(s)
- Ran Fu
  - Department of Clinical Pharmacy, The First Hospital of Hebei Medical University, Shijiazhuang, China
  - The Technology Innovation Center for Artificial Intelligence in Clinical Pharmacy of Hebei Province, The First Hospital of Hebei Medical University, Shijiazhuang, China
- Xin Hao
  - Dalian Medicinovo Technology Co., Ltd, Dalian, China
- Jing Yu
  - Department of Clinical Pharmacy, The First Hospital of Hebei Medical University, Shijiazhuang, China
  - The Technology Innovation Center for Artificial Intelligence in Clinical Pharmacy of Hebei Province, The First Hospital of Hebei Medical University, Shijiazhuang, China
- Donghan Wang
  - Department of Clinical Pharmacy, The First Hospital of Hebei Medical University, Shijiazhuang, China
  - The Technology Innovation Center for Artificial Intelligence in Clinical Pharmacy of Hebei Province, The First Hospital of Hebei Medical University, Shijiazhuang, China
- Jinyuan Zhang
  - Beijing Medicinovo Technology Co., Ltd, Beijing, China
- Ze Yu
  - Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Fei Gao
  - Beijing Medicinovo Technology Co., Ltd, Beijing, China
- Chunhua Zhou
  - Department of Clinical Pharmacy, The First Hospital of Hebei Medical University, Shijiazhuang, China
  - The Technology Innovation Center for Artificial Intelligence in Clinical Pharmacy of Hebei Province, The First Hospital of Hebei Medical University, Shijiazhuang, China
29
Guo Y, Yang Y, Li R, Liao X, Li Y. Cadmium accumulation in tropical island paddy soils: From environment and health risk assessment to model prediction. J Hazard Mater 2024; 465:133212. [PMID: 38101012 DOI: 10.1016/j.jhazmat.2023.133212] [Received: 09/26/2023] [Revised: 11/22/2023] [Accepted: 12/07/2023] [Indexed: 12/17/2023]
Abstract
Cultivated soil quality is crucial because it directly affects food safety and human health, and rice is of primary concern because of its centrality to global food networks. However, a detailed understanding of cadmium (Cd) geochemical cycling in paddy soils is complicated by the multiple influencing factors present in many rice-growing areas that overlap with industrial centers. This study analyzed the pollution characteristics and health risks of Cd in paddy soils across Hainan Island and identified key influencing factors based on multi-source environmental data and prediction models. Approximately 27.07% of the soil samples exceeded the risk control standard screening value for Cd in China, posing an uncontaminated to moderate contamination risk. Cd concentration and exposure duration contributed the most to non-carcinogenic and carcinogenic risks to children, teens, and adults through ingestion. Among the nine prediction models tested, Extreme Gradient Boosting (XGBoost) exhibited the best performance for Cd prediction with soil properties having the highest importance, followed by climatic variables and topographic attributes. In summary, XGBoost reliably predicted the soil Cd concentrations on tropical islands. Further research should incorporate additional soil properties and environmental variables for more accurate predictions and to comprehensively identify their driving factors and corresponding contribution rates.
Affiliation(s)
- Yan Guo
  - Key Laboratory of Land Surface Pattern and Simulation, Institute of Geographical Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Yi Yang
  - Key Laboratory of Land Surface Pattern and Simulation, Institute of Geographical Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Ruxia Li
  - Key Laboratory of Land Surface Pattern and Simulation, Institute of Geographical Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Xiaoyong Liao
  - Key Laboratory of Land Surface Pattern and Simulation, Institute of Geographical Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
- Yonghua Li
  - Key Laboratory of Land Surface Pattern and Simulation, Institute of Geographical Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
30
Patel RH, Fan L, Kelly NR, Gelsey F, Hertzberg JK, Brnabic AJM. A machine learning-based algorithm to identify U-500R insulin candidates among adults with type 2 diabetes mellitus in US retrospective databases. Curr Med Res Opin 2024; 40:367-375. [PMID: 38259227 DOI: 10.1080/03007995.2023.2293116] [Received: 02/16/2023] [Accepted: 12/06/2023] [Indexed: 01/24/2024]
Abstract
OBJECTIVE To develop a machine learning-based predictive algorithm to identify patients with type 2 diabetes mellitus (T2DM) who are candidates for initiation of U-500R insulin (U-500R). METHODS A retrospective cohort of patients with T2DM was used from a large US administrative claims and electronic health records (EHR) database affiliated with Optum. Predictor variables derived from the data were used to identify appropriate supervised machine learning models including least absolute shrinkage and selection operator (LASSO) and extreme gradient boosted (XGBoost) methods. Predictive performance was assessed using precision-recall (PR) and receiver operating characteristic (ROC) area under the curve (AUC). The clinical interpretation of the final model was supported by fitting the final set of variables from the LASSO and XGBoost models to a traditional logistic regression model. Model choice was determined by comparing Akaike Information Criterion (AIC), residual deviances, and scaled Brier scores. RESULTS Among 81,242 patients who met the study eligibility criteria, 577 initiated U-500R and were assigned to the positive class. Predictors of U-500R initiation included overweight/obesity, neuropathy, HbA1c ≥9% and 8%-9%, BUN 23.8 to <112 mg/dl, ALT 35.9-2056.2 U/L, no radiological chest exams, no GFR labs, and gait/mobility abnormalities. The best performing model was the LASSO model with an ROC AUC of 0.776 on the hold-out test set. CONCLUSION This study successfully developed and validated a machine learning-based algorithm to identify U-500R candidates among patients with T2DM. This may help health care providers and decision-makers to understand important characteristics of patients who could use U-500R therapies which in turn could support policies and guidelines for optimal patient management.
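The abstract above compares models by scaled Brier score. As a minimal sketch of what that metric measures, the following uses one common definition (1 minus the ratio of the model's Brier score to that of a reference model that always predicts the outcome prevalence). The function names and the toy data are assumptions made for illustration and are not taken from the study.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def scaled_brier_score(y_true, y_prob):
    """1 - Brier / Brier_ref, where the reference always predicts the prevalence."""
    prevalence = sum(y_true) / len(y_true)
    reference = brier_score(y_true, [prevalence] * len(y_true))
    if reference == 0:
        raise ValueError("undefined when only one class is present")
    return 1.0 - brier_score(y_true, y_prob) / reference


# Hypothetical toy data: 1 = positive class, 0 = negative class.
outcomes = [0, 0, 1, 0, 1]
predicted = [0.1, 0.2, 0.8, 0.3, 0.6]
print(brier_score(outcomes, predicted))
print(scaled_brier_score(outcomes, predicted))
```

Under this definition a score of 0 means the model is no better than predicting the prevalence for everyone, and values closer to 1 indicate better calibrated and sharper probabilities.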
Affiliation(s)
- Ludi Fan
  - Eli Lilly and Company, Indianapolis, IN, USA
|
31
|
Chen C, He Y, Ni Y, Tang Z, Zhang W. Identification of crosstalk genes relating to ECM-receptor interaction genes in MASH and DN using bioinformatics and machine learning. J Cell Mol Med 2024; 28:e18156. [PMID: 38429902 PMCID: PMC10907849 DOI: 10.1111/jcmm.18156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2023] [Revised: 01/01/2024] [Accepted: 01/12/2024] [Indexed: 03/03/2024] Open
Abstract
This study aimed to identify genes shared by metabolic dysfunction-associated steatohepatitis (MASH) and diabetic nephropathy (DN) and the effect of extracellular matrix (ECM)-receptor interaction genes on them. Datasets for MASH and DN were downloaded from the Gene Expression Omnibus (GEO) database. Pearson's coefficients assessed the correlation between ECM-receptor interaction genes and crosstalk genes. The co-expression network of co-expression pair (CP) genes was integrated with its protein-protein interaction (PPI) network, and machine learning was employed to identify essential disease-representing genes. Finally, immune infiltration analysis was performed on the MASH and DN gene datasets using the CIBERSORT algorithm to evaluate the plausibility of these genes in the diseases. We found 19 key CP genes. Fos proto-oncogene (FOS), belonging to the IL-17 signalling pathway, showed greater centrality in the PPI network; Hyaluronan Mediated Motility Receptor (HMMR), belonging to the ECM-receptor interaction genes, was the most critical in the co-expression network map of the 19 CP genes; Forkhead Box C1 (FOXC1), like FOS, showed a high ability to predict disease in the XGBoost analysis. Further immune infiltration analysis showed a clear positive correlation between FOS/FOXC1 and mast cells, which secrete IL-17 during inflammation. Combining the results of previous studies, we suggest that a FOS/FOXC1/HMMR regulatory axis in MASH and DN may be associated with mast cells acting in the IL-17 signalling pathway. Extracellular HMMR may regulate the IL-17 pathway represented by FOS through the Mitogen-Activated Protein Kinase 1 (ERK) or PI3K-Akt-mTOR pathway. HMMR may serve as a signalling carrier between MASH and DN and could be targeted for therapeutic development.
Affiliation(s)
- Chao Chen
  - Instrumentation and Service Center for Science and Technology, Beijing Normal University, Zhuhai, China
- Yuxi He
  - Pediatric Research Institute, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, China
- Ying Ni
  - Zhuhai Branch of State Key Laboratory of Earth Surface Processes and Resource Ecology, Advanced Institute of Natural Sciences, Beijing Normal University, Zhuhai, China
  - Engineering Research Center of Natural Medicine, Ministry of Education, Advanced Institute of Natural Sciences, Beijing Normal University, Zhuhai, China
- Zhanming Tang
  - Zhuhai Branch of State Key Laboratory of Earth Surface Processes and Resource Ecology, Advanced Institute of Natural Sciences, Beijing Normal University, Zhuhai, China
  - Engineering Research Center of Natural Medicine, Ministry of Education, Advanced Institute of Natural Sciences, Beijing Normal University, Zhuhai, China
- Wensheng Zhang
  - Zhuhai Branch of State Key Laboratory of Earth Surface Processes and Resource Ecology, Advanced Institute of Natural Sciences, Beijing Normal University, Zhuhai, China
  - Engineering Research Center of Natural Medicine, Ministry of Education, Advanced Institute of Natural Sciences, Beijing Normal University, Zhuhai, China
|
32
|
Ayinde BO, Musa MR, Ayinde AAO. Application of machine learning models and landsat 8 data for estimating seasonal pm 2.5 concentrations. Environ Anal Health Toxicol 2024; 39:e2024011-0. [PMID: 38631403 DOI: 10.5620/eaht.2024011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2023] [Accepted: 03/12/2024] [Indexed: 04/19/2024] Open
Abstract
Air pollution is a significant global challenge that affects many cities. In Europe, Bosnia and Herzegovina (BiH) is among the countries most affected by air pollution. In this study, we integrate open-source Landsat 8 remote sensing products, topographical data, and the limited ground-truth PM2.5 data to spatially predict the air quality level across different seasons in Tuzla Canton, BiH, by adopting three existing machine learning models, namely XGBoost, K-Nearest Neighbour (KNN) and Naive Bayes (NB). These classification models were implemented based on Landsat 8 bands, environmentally derived indices, and topographical variables generated for the study area. Based on the predicted results, the XGBoost model exhibited the highest overall accuracy across all seasons. The predicted model results were used to generate spatial air quality maps. Based on the classification maps, the PM2.5 air quality level predicted for Tuzla Canton in the winter season is very unhealthy. We conclude that PM2.5 air quality in Tuzla Canton is relatively unsatisfactory and requires urgent intervention by the government to prevent further deterioration of air quality in Tuzla and other affected cantons in BiH.
|
33
|
Barry KA, Manzali Y, Flouchi R, Balouki Y, Chelhi K, Elfar M. Exploring the use of association rules in random forest for predicting heart disease. Comput Methods Biomech Biomed Engin 2024; 27:338-346. [PMID: 36877167 DOI: 10.1080/10255842.2023.2185477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2023] [Revised: 02/07/2023] [Accepted: 02/16/2023] [Indexed: 03/07/2023]
Abstract
Heart disease is one of the most dangerous diseases in the world, and most people who have it eventually die from it. Machine learning algorithms have proven useful for supporting decision-making and prediction from the large amount of data generated by the healthcare sector. In this work, we propose a novel method that improves the performance of the classical random forest technique so that it can be used more effectively for the prediction of heart disease. For comparison, we also used other classifiers: classical random forest, support vector machine, decision tree, Naïve Bayes, and XGBoost. The work was carried out on the Cleveland heart disease dataset. According to the experimental results, the proposed model achieves an accuracy of 83.5%, better than that of the other classifiers. This study contributes to the optimization of the random forest technique and provides insight into how the technique is constructed.
Affiliation(s)
- Rachid Flouchi
  - Laboratory of Microbial Biotechnology and Bioactive Molecules, Science and Technologies Faculty, Sidi Mohamed Ben Abdellah University, Fez, Morocco
- Youssef Balouki
  - Labo: Mathematics, Computer Science and Engineering Sciences (MISI), Settat, Morocco
- Khadija Chelhi
  - The logistics center of excellence, Higher School of Textile and Clothing Industries (ESITH Casablanca), Casablanca, Morocco
- Mohamed Elfar
  - LPAIS Laboratory, Faculty of Sciences, USMBA, Fez, Morocco
|
34
|
Ren Y, Cui M, Zhou Y, Sun S, Guo F, Ma J, Han Z, Park J, Son Y, Khim J. Utilizing machine learning for reactive material selection and width design in permeable reactive barrier (PRB). Water Res 2024; 251:121097. [PMID: 38218071 DOI: 10.1016/j.watres.2023.121097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Revised: 12/19/2023] [Accepted: 12/30/2023] [Indexed: 01/15/2024]
Abstract
Permeable reactive barrier (PRB) is an important groundwater treatment technology. However, selecting the optimal reactive material and estimating the barrier width remain critical and challenging problems in PRB design. Machine learning (ML) has advantages in predicting the evolution of contaminants and tracing their temporal and spatial distribution. In this study, an ML approach was developed for PRB design, and its feasibility was validated through experiments and a case study. The ML algorithm predicted the Freundlich equilibrium parameters well (R² 0.94 for KF, R² 0.96 for n). After SHapley Additive exPlanation (SHAP) analysis, redefining the range of the most influential factors (initial concentration and pH) can further improve the prediction accuracy (R² 0.99 for KF, R² 0.99 for n). To mitigate model bias and ensure comprehensiveness, an evaluation index incorporating expert opinions was used to determine the optimal material from the candidate materials. The ML algorithm was also applied to predict the width of the mass transport zone in the adsorption column, with R² and root-mean-square error (RMSE) of 0.98 and 1.2, respectively. Compared with the traditional width design methodology, ML can enhance design efficiency and save experiment time. The novel approach is based on traditional design principles, and its limitations and challenges are highlighted. With further expansion of the data set and optimization of the algorithm, the accuracy of ML may overcome the existing limitations and allow wider application.
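For readers unfamiliar with the two Freundlich parameters named in the abstract, the sketch below shows what KF and n represent by fitting the standard isotherm q = KF · C^(1/n) with a classical log-log least-squares regression. This is not the ML method used in the study; the function name and the synthetic data are assumptions chosen only to illustrate the formula.

```python
import math


def fit_freundlich(c_eq, q_eq):
    """Fit q = K_F * C**(1/n) by least squares on ln q = ln K_F + (1/n) ln C."""
    if len(c_eq) != len(q_eq) or len(c_eq) < 2:
        raise ValueError("need at least two paired observations")
    xs = [math.log(c) for c in c_eq]
    ys = [math.log(q) for q in q_eq]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # slope equals 1/n
    intercept = y_mean - slope * x_mean  # intercept equals ln K_F
    return math.exp(intercept), 1.0 / slope


# Hypothetical noise-free data generated with K_F = 2.0 and n = 2.5.
concentrations = [1.0, 5.0, 10.0, 50.0, 100.0]
uptakes = [2.0 * c ** (1 / 2.5) for c in concentrations]
k_f, n = fit_freundlich(concentrations, uptakes)
print(round(k_f, 6), round(n, 6))
```

With real, noisy adsorption data the log transform reweights the errors, so a nonlinear fit on the original scale may give somewhat different estimates.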
Affiliation(s)
- Yangmin Ren
  - School of Civil, Environmental, and Architectural Engineering, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul 02841, Republic of Korea
- Mingcan Cui
  - School of Civil, Environmental, and Architectural Engineering, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul 02841, Republic of Korea
- Yongyue Zhou
  - School of Civil, Environmental, and Architectural Engineering, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul 02841, Republic of Korea
- Shiyu Sun
  - School of Civil, Environmental, and Architectural Engineering, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul 02841, Republic of Korea
- Fengshi Guo
  - School of Civil, Environmental, and Architectural Engineering, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul 02841, Republic of Korea
- Junjun Ma
  - Nanjing Green-water Environment Engineering Limited by Share Ltd, C Building No. 606 Ningliu Road, Chemical Industrial Park, Nanjing, China
- Zhengchang Han
  - Nanjing Green-water Environment Engineering Limited by Share Ltd, C Building No. 606 Ningliu Road, Chemical Industrial Park, Nanjing, China
- Jooyoung Park
  - Emtomega Co.,Ltd, Seochojungang-ro 8-gil, Seocho-gu, Seoul 06642, Republic of Korea
- Younggyu Son
  - Department of Environmental Engineering, Kumoh National Institute of Technology, Gumi 39177, Republic of Korea
- Jeehyeong Khim
  - School of Civil, Environmental, and Architectural Engineering, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul 02841, Republic of Korea
|
35
|
Liu F, Wu R, Liu S, Liu C, Su M. Assessing the determinants of corporate environmental investment: a machine learning approach. Environ Sci Pollut Res Int 2024; 31:17401-17416. [PMID: 38337115 DOI: 10.1007/s11356-024-32158-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Accepted: 01/19/2024] [Indexed: 02/12/2024]
Abstract
In recent years, experts and academics in the environmental management field have developed an interest in the factors and evaluation techniques that influence corporate environmental investment decisions. However, there are substantial differences between studies employing the most recent evaluation methodologies and those that use indicator systems. To explore the mechanisms that influence corporate environmental investment, this study investigated the determinants of environmental investment through the perspectives of firm, board, chair, and chief executive officer (CEO) characteristics using a machine learning approach. Based on a large-scale data sample from Chinese-listed companies, the results indicated that the extreme gradient boosting (XGBoost) model had an accuracy of up to 97.63%, thus performing the best. Additionally, the model that used SHapley Additive exPlanations (SHAP) to interpret XGBoost showed that a company's sales performance was the most important factor that influenced environmental investment, followed by CEO tenure, board independence, board gender diversity, chair academic experience, and the company's level of internationalization. Furthermore, when examining the sample of heavily polluting enterprises, sales, board gender diversity, CEO tenure, chair academic experience, board independence, and chair-CEO duality, all were found to play crucial roles in predicting environmental investment. Overall, this study aids in evaluating the factors that influence corporate environmental investment decisions and provides policymakers and practitioners with a machine learning approach for use when assessing these factors.
Affiliation(s)
- Feng Liu
  - Business School, Shandong University, Weihai, China
- Ruixue Wu
  - Business School, Shandong University, Weihai, China
- Si Liu
  - The Graduate School of Technology Management, Kyunghee University, Yongin, 17104, Yongin, South Korea
- Caixia Liu
  - Business School, Shandong University, Weihai, China
- Miao Su
  - The Graduate School of Technology Management, Kyunghee University, Yongin, 17104, Yongin, South Korea
|
36
|
Shopsowitz K, Lofroth J, Chan G, Kim J, Rana M, Brinkman R, Weng A, Medvedev N, Wang X. MAGIC-DR: An interpretable machine-learning guided approach for acute myeloid leukemia measurable residual disease analysis. Cytometry B Clin Cytom 2024. [PMID: 38415807 DOI: 10.1002/cyto.b.22168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Revised: 02/08/2024] [Accepted: 02/13/2024] [Indexed: 02/29/2024]
Abstract
Multiparameter flow cytometry is widely used for acute myeloid leukemia minimal residual disease testing (AML MRD) but is time-consuming and demands substantial expertise. Machine learning offers potential advancements in accuracy and efficiency, but has yet to be widely adopted for this application. To explore this, we trained single cell XGBoost classifiers from 98 diagnostic AML cell populations and 30 MRD negative samples. Performance was assessed by cross-validation. Predictions were integrated with UMAP as a heatmap parameter for an augmented/interactive AML MRD analysis framework, which was benchmarked against traditional MRD analysis for 25 test cases. The results showed that XGBoost achieved a median AUC of 0.97, effectively distinguishing diverse AML cell populations from normal cells. When integrated with UMAP, the classifiers highlighted MRD populations against the background of normal events. Our pipeline, MAGIC-DR, incorporated classifier predictions and UMAP into flow cytometry standard (FCS) files. This enabled a human-in-the-loop machine learning guided MRD workflow. Validation against conventional analysis for 25 MRD samples showed 100% concordance in myeloid blast detection, with MAGIC-DR also identifying several immature monocytic populations not readily found by conventional analysis. In conclusion, integrating a supervised classifier with unsupervised dimension reduction offers a robust method for AML MRD analysis that can be seamlessly integrated into conventional workflows. Our approach can support and augment human analysis by highlighting abnormal populations that can be gated on for quantification and further assessment. This has the potential to speed up MRD analysis, and potentially improve detection sensitivity for certain AML immunophenotypes.
Affiliation(s)
- Kevin Shopsowitz
  - Division of Hematopathology, Vancouver General Hospital, Vancouver, British Columbia, Canada
  - Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, British Columbia, Canada
- Jack Lofroth
  - Faculty of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
- Geoffrey Chan
  - Division of Hematopathology, Vancouver General Hospital, Vancouver, British Columbia, Canada
- Jubin Kim
  - Terry Fox Lab, BC Cancer, Vancouver, British Columbia, Canada
- Makhan Rana
  - Division of Hematopathology, Vancouver General Hospital, Vancouver, British Columbia, Canada
- Ryan Brinkman
  - Terry Fox Lab, BC Cancer, Vancouver, British Columbia, Canada
- Andrew Weng
  - Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, British Columbia, Canada
  - Terry Fox Lab, BC Cancer, Vancouver, British Columbia, Canada
- Nadia Medvedev
  - Division of Hematopathology, Vancouver General Hospital, Vancouver, British Columbia, Canada
  - Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, British Columbia, Canada
- Xuehai Wang
  - Division of Hematopathology, Vancouver General Hospital, Vancouver, British Columbia, Canada
  - Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, British Columbia, Canada
|
37
|
Saylam B, İncel ÖD. Multitask Learning for Mental Health: Depression, Anxiety, Stress (DAS) Using Wearables. Diagnostics (Basel) 2024; 14:501. [PMID: 38472973 DOI: 10.3390/diagnostics14050501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2024] [Revised: 02/23/2024] [Accepted: 02/24/2024] [Indexed: 03/14/2024] Open
Abstract
This study investigates the prediction of mental well-being factors (depression, stress, and anxiety) using the NetHealth dataset from college students. The research addresses four key questions, exploring the impact of digital biomarkers on these factors, their alignment with conventional psychology literature, the time-based performance of applied methods, and potential enhancements through multitask learning. The findings reveal modality rankings aligned with psychology literature, validated against paper-based studies. Improved predictions are noted with temporal considerations, and further enhanced by multitasking. Mental health multitask prediction results show aligned baseline and multitask performances, with notable enhancements using temporal aspects, particularly with the random forest (RF) classifier. Multitask learning improves outcomes for depression and stress but not anxiety using RF and XGBoost.
Affiliation(s)
- Berrenur Saylam
  - Computer Engineering Department, Boğaziçi University, 34342 İstanbul, Türkiye
- Özlem Durmaz İncel
  - Computer Engineering Department, Boğaziçi University, 34342 İstanbul, Türkiye
|
38
|
Sharif S, Wunder C, Amendt J, Qamar A. Deciphering the impact of microenvironmental factors on cuticular hydrocarbon degradation in Lucilia sericata empty Puparia: Bridging ecological and forensic entomological perspectives using machine learning models. Sci Total Environ 2024; 913:169719. [PMID: 38171456 DOI: 10.1016/j.scitotenv.2023.169719] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 12/23/2023] [Accepted: 12/25/2023] [Indexed: 01/05/2024]
Abstract
Blow flies (Calliphoridae) play essential ecological roles in nutrient recycling by consuming decaying organic matter. They serve as valuable bioindicators in ecosystem management and forensic entomology, with their unique feeding behavior leading to the accumulation of environmental pollutants in their cuticular hydrocarbons (CHCs), making them potential indicators of exposure history. This study focuses on CHC degradation dynamics in empty puparia of Lucilia sericata under different environmental conditions for up to 90 days. Three distinct conditions were considered: outdoor-buried, outdoor-above-ground, and indoor environments. Five predominant CHCs, n-Pentacosane (n-C25), n-Hexacosane (n-C26), n-Heptacosane (n-C27), n-Octacosane (n-C28), and n-Nonacosane (n-C29), were analyzed using Gas Chromatography-Mass Spectrometry (GC-MS). The findings revealed variations in CHC concentrations over time, influenced by environmental factors, with significant differences at different time points. Correlation heatmap analysis indicated negative correlations between weathering time and certain CHCs, suggesting decreasing concentrations over time. Three machine learning models, Support Vector Machine (SVM), Multilayer Perceptron (MLP), and eXtreme Gradient Boosting (XGBoost), were used to explore the potential of CHCs as age indicators. SVM achieved an R-squared value of 0.991, demonstrating high accuracy in age estimation based on CHC concentrations. MLP also exhibited satisfactory performance in outdoor conditions, while SVM and MLP yielded unsatisfactory results indoors due to the lack of significant CHC variations. After comprehensive model selection and performance evaluations, it was found that the XGBoost model excelled in capturing the patterns in all three datasets. This study bridges the gap between baseline and ecological/forensic use of empty puparia, offering valuable insights into the potential of CHCs in environmental monitoring and investigations. Understanding CHCs' stability and degradation enhances blow flies' utility as bioindicators for pollutants and exposure history, benefiting environmental monitoring and forensic entomology.
Affiliation(s)
- Swaima Sharif
  - Institute of Legal Medicine, Forensic Biology, University Hospital, Goethe University, Frankfurt am Main, Germany
- Cora Wunder
  - Institute of Legal Medicine, Forensic Biology, University Hospital, Goethe University, Frankfurt am Main, Germany
- Jens Amendt
  - Institute of Legal Medicine, Forensic Biology, University Hospital, Goethe University, Frankfurt am Main, Germany
- Ayesha Qamar
  - Section of Entomology, Department of Zoology, Aligarh Muslim University, Aligarh 202002, U.P., India
|
39
|
Xu A, Gao J, Sui X, Wang C, Shi Z. LiDAR Dynamic Target Detection Based on Multidimensional Features. Sensors (Basel) 2024; 24:1369. [PMID: 38474905 DOI: 10.3390/s24051369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2024] [Revised: 02/17/2024] [Accepted: 02/18/2024] [Indexed: 03/14/2024]
Abstract
To address the limitations of LiDAR dynamic target detection methods, which often require heuristic thresholding, indirect computational assistance, supplementary sensor data, or postdetection, we propose an innovative method based on multidimensional features. Using the differences between the positions and geometric structures of point cloud clusters scanned from the same target in adjacent frame point clouds, the motion states of the point cloud clusters are comprehensively evaluated. To enable the automatic precision pairing of point cloud clusters from adjacent frames of the same target, a double registration algorithm is proposed for point cloud cluster centroids. The iterative closest point (ICP) algorithm is employed for approximate interframe pose estimation during coarse registration. The random sample consensus (RANSAC) and four-parameter transformation algorithms are employed to obtain precise interframe pose relations during fine registration. These processes standardize the coordinate systems of adjacent point clouds and facilitate the association of point cloud clusters from the same target. Based on the paired point cloud clusters, a classification feature system is used to construct the XGBoost decision tree. To enhance the XGBoost training efficiency, a dimensionality reduction algorithm based on Spearman's rank correlation coefficient and bidirectional search is proposed to expedite the construction of the optimal classification feature subset. After preliminary outcomes are generated by XGBoost, a double Boyer-Moore voting-sliding window algorithm is proposed to refine the final LiDAR dynamic target detection accuracy. To validate the efficacy and efficiency of our method in LiDAR dynamic target detection, an experimental platform is established. Real-world data are collected and pertinent experiments are designed. The experimental results illustrate the soundness of our method. The LiDAR dynamic target correct detection rate is 92.41%, the static target error detection rate is 1.43%, and the detection efficiency is 0.0299 s. Our method exhibits notable advantages over open-source comparative methods, achieving highly efficient and precise LiDAR dynamic target detection.
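The abstract does not spell out the authors' double Boyer-Moore voting-sliding window algorithm, so the following is only a minimal sketch of the two building blocks its name suggests: the classic Boyer-Moore majority vote and a centered sliding window used to smooth per-frame labels. The function names, the window size, and the toy label sequence are all assumptions for illustration.

```python
def majority_vote(labels):
    """Boyer-Moore majority vote: return the strict-majority label, or None."""
    candidate, count = None, 0
    for label in labels:
        if count == 0:
            candidate, count = label, 1
        elif label == candidate:
            count += 1
        else:
            count -= 1
    # Second pass confirms the candidate really holds a strict majority.
    occurrences = sum(1 for item in labels if item == candidate)
    if candidate is not None and occurrences * 2 > len(labels):
        return candidate
    return None


def smooth_labels(labels, window=3):
    """Replace each label with its window's majority; keep it if there is none."""
    half = window // 2
    smoothed = []
    for i, label in enumerate(labels):
        segment = labels[max(0, i - half): i + half + 1]
        winner = majority_vote(segment)
        smoothed.append(label if winner is None else winner)
    return smoothed


# Hypothetical per-frame predictions for one cluster: 1 = dynamic, 0 = static.
frames = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]
print(smooth_labels(frames, window=3))  # [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
```

The vote runs in linear time with constant extra memory per window, which is one plausible reason to prefer it over counting every label when refining classifier outputs frame by frame.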
Affiliation(s)
- Aigong Xu
  - School of Geomatics, Liaoning Technical University, Fuxin 123000, China
- Jiaxin Gao
  - School of Geomatics, Liaoning Technical University, Fuxin 123000, China
- Xin Sui
  - School of Geomatics, Liaoning Technical University, Fuxin 123000, China
- Changqiang Wang
  - School of Geomatics, Liaoning Technical University, Fuxin 123000, China
- Zhengxu Shi
  - School of Geomatics, Liaoning Technical University, Fuxin 123000, China
|
40
|
Wang P, Wu S, Tian M, Liu K, Cong J, Zhang W, Wei B. A conformal regressor for predicting negative conversion time of Omicron patients. Med Biol Eng Comput 2024:10.1007/s11517-024-03029-8. [PMID: 38363486 DOI: 10.1007/s11517-024-03029-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2023] [Accepted: 01/09/2024] [Indexed: 02/17/2024]
Abstract
In light of the situation and the characteristics of Omicron, China has continuously optimized its rules for the prevention and control of COVID-19. The global epidemic is still spreading, and new cases of infection continue to emerge in China. To help infected individuals estimate the course of their infection, this article proposes a model for predicting negative conversion time. The clinical features of Omicron-infected patients in Shandong Province in the first half of 2022 are retrospectively studied. These features are grouped by disease diagnosis result, clinical signs, traditional Chinese medicine symptoms, and drug use. The features are input to the eXtreme Gradient Boosting (XGBoost) model, and the output is the predicted number of days to negative conversion. XGBoost is also used as the underlying algorithm of the conformal prediction (CP) framework, which provides prediction intervals with a controllable error rate. The results show that the proposed model has a mean absolute error of 3.54 days and yields the shortest prediction intervals. The method therefore carries more decision-making information and can, to a certain extent, help people better understand the disease and estimate its course themselves.
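To make the idea of an interval with a controllable error rate concrete, here is a minimal sketch of split conformal regression. It assumes exchangeable data and takes point predictions from any fitted model (the study uses XGBoost; no model is trained here). The function names and the calibration numbers are hypothetical and are not drawn from the article.

```python
import math


def conformal_halfwidth(calib_true, calib_pred, alpha=0.1):
    """Split conformal: half-width q so [pred - q, pred + q] targets 1 - alpha coverage."""
    scores = sorted(abs(y - p) for y, p in zip(calib_true, calib_pred))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-based rank of the conformal quantile
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return scores[rank - 1]


def predict_interval(point_prediction, halfwidth):
    """Symmetric interval around a point prediction."""
    return point_prediction - halfwidth, point_prediction + halfwidth


# Hypothetical calibration set: observed vs. predicted negative conversion days.
observed = [10, 12, 9, 15, 11, 14, 8, 13, 10, 12]
predicted = [11, 10, 9, 12, 12, 15, 10, 13, 14, 11]
q = conformal_halfwidth(observed, predicted, alpha=0.2)
print(q, predict_interval(12.0, q))  # 3 (9.0, 15.0)
```

The calibration set must be held out from model training; lowering `alpha` widens the interval, which is the trade-off behind comparing interval lengths at a fixed error rate.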
Affiliation(s)
- Pingping Wang
  - Qingdao Academy of Chinese Medical Sciences, Shandong University of Traditional Chinese Medicine, Qingdao, 266112, China
  - Center for Medical Artificial Intelligence, Shandong University of Traditional Chinese Medicine, Qingdao, 266112, China
- Shenjing Wu
  - Qingdao Academy of Chinese Medical Sciences, Shandong University of Traditional Chinese Medicine, Qingdao, 266112, China
  - Center for Medical Artificial Intelligence, Shandong University of Traditional Chinese Medicine, Qingdao, 266112, China
- Mei Tian
  - Affiliated Hospital of Shandong University of Chinese Medicine, Jinan, 250011, China
- Kunmeng Liu
  - Qingdao Academy of Chinese Medical Sciences, Shandong University of Traditional Chinese Medicine, Qingdao, 266112, China
  - Center for Medical Artificial Intelligence, Shandong University of Traditional Chinese Medicine, Qingdao, 266112, China
- Jinyu Cong
  - Qingdao Academy of Chinese Medical Sciences, Shandong University of Traditional Chinese Medicine, Qingdao, 266112, China
  - Center for Medical Artificial Intelligence, Shandong University of Traditional Chinese Medicine, Qingdao, 266112, China
- Wei Zhang
  - Affiliated Hospital of Shandong University of Chinese Medicine, Jinan, 250011, China
- Benzheng Wei
  - Qingdao Academy of Chinese Medical Sciences, Shandong University of Traditional Chinese Medicine, Qingdao, 266112, China
  - Center for Medical Artificial Intelligence, Shandong University of Traditional Chinese Medicine, Qingdao, 266112, China
|
41
|
Mehdary A, Chehri A, Jakimi A, Saadane R. Hyperparameter Optimization with Genetic Algorithms and XGBoost: A Step Forward in Smart Grid Fraud Detection. Sensors (Basel) 2024; 24:1230. [PMID: 38400385 PMCID: PMC10892895 DOI: 10.3390/s24041230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2024] [Revised: 02/07/2024] [Accepted: 02/13/2024] [Indexed: 02/25/2024]
Abstract
This study provides a comprehensive analysis of the combination of Genetic Algorithms (GA) and XGBoost, a well-known machine-learning model. The primary emphasis lies in hyperparameter optimization for fraud detection in smart grid applications. The empirical findings demonstrate a noteworthy enhancement in the model's performance metrics following optimization, particularly emphasizing a substantial increase in accuracy from 0.82 to 0.978. The precision, recall, and AUROC metrics demonstrate a clear improvement, indicating the effectiveness of optimizing the XGBoost model for fraud detection. The findings from our study significantly contribute to the expanding field of smart grid fraud detection. These results emphasize the potential uses of advanced metaheuristic algorithms to optimize complex machine-learning models. This work showcases significant progress in enhancing the accuracy and efficiency of fraud detection systems in smart grids.
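The abstract does not give the details of the genetic algorithm, so the following is only a minimal sketch of how GA-based hyperparameter search generally works. The search space, the ranges, and the `fitness` function are illustrative assumptions; in particular, `fitness` is a stand-in for a cross-validated model score and does not train XGBoost.

```python
import random

# Hypothetical search space. The names mirror common XGBoost hyperparameters,
# but the ranges are illustrative assumptions, not values from the study.
SPACE = {
    "max_depth": (2, 10),          # integer-valued
    "learning_rate": (0.01, 0.5),  # real-valued
    "subsample": (0.5, 1.0),       # real-valued
}

def random_individual(rng):
    """Draw one candidate hyperparameter set uniformly from the search space."""
    return {
        "max_depth": rng.randint(*SPACE["max_depth"]),
        "learning_rate": rng.uniform(*SPACE["learning_rate"]),
        "subsample": rng.uniform(*SPACE["subsample"]),
    }

def fitness(ind):
    """Stand-in for a validation score (assumption): higher is better."""
    return -((ind["max_depth"] - 6) ** 2 / 16
             + (ind["learning_rate"] - 0.1) ** 2 * 10
             + (ind["subsample"] - 0.8) ** 2 * 4)

def crossover(a, b, rng):
    """Uniform crossover: each gene is taken from one of the two parents."""
    return {key: (a[key] if rng.random() < 0.5 else b[key]) for key in a}

def mutate(ind, rng, rate=0.2):
    """Resample each gene from its range with probability `rate`."""
    child = dict(ind)
    for key, (lo, hi) in SPACE.items():
        if rng.random() < rate:
            child[key] = rng.randint(lo, hi) if key == "max_depth" else rng.uniform(lo, hi)
    return child

def genetic_search(generations=30, pop_size=20, elite=4, seed=0):
    """Keep the best `elite` candidates each generation and breed the rest."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:elite]
        children = [mutate(crossover(*rng.sample(parents, 2), rng), rng)
                    for _ in range(pop_size - elite)]
        population = parents + children
    return max(population, key=fitness)

best = genetic_search()
print(best, fitness(best))
```

Because the elite candidates are carried over unchanged, the best score never decreases from one generation to the next. In a real setting, `fitness` would be replaced by a cross-validated metric of the model being tuned.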
Affiliation(s)
- Adil Mehdary
- LaGes, Hassania School of Public Works, Casablanca 20000, Morocco; (A.M.); (R.S.)
- Abdellah Chehri
- Department of Mathematics and Computer Science, Royal Military College of Canada, Kingston, ON K7K 7B4, Canada
- Abdeslam Jakimi
- GL-ISI Team, Faculty of Science and Technology Errachidia, Moulay Ismail University, Meknes 50050, Morocco;
- Rachid Saadane
- LaGes, Hassania School of Public Works, Casablanca 20000, Morocco; (A.M.); (R.S.)
42
Cao C, Zhang T, Xin T. The effect of reading engagement on scientific literacy - an analysis based on the XGBoost method. Front Psychol 2024; 15:1329724. [PMID: 38420178 PMCID: PMC10899671 DOI: 10.3389/fpsyg.2024.1329724] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2023] [Accepted: 01/22/2024] [Indexed: 03/02/2024] Open
Abstract
Scientific literacy is a key factor in personal competitiveness, and reading is the most common activity in daily learning. Harnessing the day-to-day influence of reading on individuals is therefore the most convenient way to raise the scientific literacy of the whole population. Reading engagement is an important student characteristic related to reading literacy; it is highly malleable and is jointly reflected by behavioral, cognitive, and affective engagement. Exploring the relationship between reading engagement and scientific literacy, with reading engagement as the entry point, is therefore of theoretical and practical significance. In this study, we used PISA 2018 data from China to explore this relationship in a sample of 15-year-old students in mainland China. Thirty-six variables related to reading engagement and background variables (gender, grade, and the socioeconomic and cultural status of the family) were selected from the questionnaire as the independent variables, the score of the Scientific Literacy Assessment (SLA) was taken as the outcome variable, and a supervised machine learning method, the XGBoost algorithm, was used to construct the model. The dataset was randomly divided into a training set and a test set to optimize the model and to verify that the resulting model has good fit and generalization ability. Global and local personalized interpretation was provided by SHAP values, a cutting-edge method for interpreting machine learning models. Among the three major components of reading engagement, cognitive engagement was the most influential factor in the model of this study: students with a high level of cognitive reading engagement were more likely to obtain high scores in the scientific literacy assessment. This study also verifies the feasibility of a currently popular machine learning model, XGBoost, in a large-scale international education assessment program, with good model adaptability and suitability for global and local interpretation.
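To make the idea behind SHAP values concrete, here is a minimal sketch that computes exact Shapley values for a single prediction by enumerating every feature coalition. The toy model, the input, and the baseline are hypothetical. Enumeration is only feasible for a handful of features, and this is not the algorithm the SHAP library uses for tree models.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance by enumerating feature coalitions.

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)
    features = range(n)

    def value(coalition):
        masked = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(masked)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                phi[i] += weight * (value(members | {i}) - value(members))
    return phi

def toy_model(v):
    """Hypothetical model with two linear terms and one interaction term."""
    return 2.0 * v[0] + 1.0 * v[1] + 0.5 * v[0] * v[2]

x = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)  # approximately [3.0, 2.0, 1.0]
```

The contributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term is split equally between the two features involved. The same additive property is what allows per-feature contributions to be aggregated into global summaries.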
Affiliation(s)
- Tao Xin
- Collaborative Innovation Center of Assessment for Basic Education Quality, Beijing Normal University, Beijing, China
43
Radhakrishnan BL, Ezra K, Jebadurai IJ, Selvakumar I, Karthikeyan P. An Autonomous Sleep-Stage Detection Technique in Disruptive Technology Environment. Sensors (Basel) 2024; 24:1197. [PMID: 38400354 PMCID: PMC10892786 DOI: 10.3390/s24041197] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 02/07/2024] [Accepted: 02/08/2024] [Indexed: 02/25/2024]
Abstract
Autonomous sleep tracking at home has become inevitable in today's fast-paced world. A crucial aspect of addressing sleep-related issues involves accurately classifying sleep stages. This paper introduces a novel approach, PSO-XGBoost, which combines particle swarm optimisation (PSO) with extreme gradient boosting (XGBoost) to enhance the XGBoost model's performance. Our model achieves improved overall accuracy and faster convergence by leveraging PSO to fine-tune hyperparameters. Our proposed model utilises features extracted from EEG signals, spanning time, frequency, and time-frequency domains. We employed the Pz-Oz signal dataset from the Sleep-EDF Expanded repository for experimentation. Our model achieves impressive metrics through stratified K-fold validation on ten selected subjects: 95.4% accuracy, 95.4% F1-score, 95.4% precision, and 94.3% recall. The experiment results demonstrate the effectiveness of our technique, showcasing an average accuracy of 95%, outperforming traditional machine learning classifications. The findings revealed that the feature-shifting approach improves the classification outcome by 3 to 4 per cent. Moreover, our findings suggest that prefrontal EEG derivations are ideal options and could open up exciting possibilities for using wearable EEG devices in sleep monitoring. The ease of obtaining EEG signals with dry electrodes on the forehead enhances the feasibility of this application. Furthermore, the proposed method demonstrates computational efficiency and holds significant value for real-time sleep classification applications.
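Stratified K-fold validation, as mentioned in the abstract, keeps the class proportions roughly equal across folds. The sketch below shows one simple way to assign samples to folds. The function name and the sleep-stage labels are hypothetical, and unlike most library implementations this version does not shuffle the samples first.

```python
from collections import defaultdict

def stratified_folds(labels, k):
    """Assign sample indices to k folds, keeping class proportions roughly equal."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal the samples of each class round-robin across the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds

# Hypothetical, imbalanced sleep-stage labels.
labels = ["wake"] * 10 + ["rem"] * 5 + ["n2"] * 15
folds = stratified_folds(labels, k=5)

for fold in folds:
    counts = {stage: sum(labels[i] == stage for i in fold) for stage in ("wake", "rem", "n2")}
    print(len(fold), counts)
```

Each fold then serves once as the held-out set while the remaining folds are used for training, so every class is represented in every evaluation round.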
Affiliation(s)
- Baskaran Lizzie Radhakrishnan
- Department of Computer Science and Engineering, Karunya Institute of Technology and Sciences, Coimbatore 641114, India; (B.L.R.); (I.J.J.)
- Kirubakaran Ezra
- Department of Computer Science and Engineering, Grace College of Engineering, Coimbatore 628005, India;
- Immanuel Johnraja Jebadurai
- Department of Computer Science and Engineering, Karunya Institute of Technology and Sciences, Coimbatore 641114, India; (B.L.R.); (I.J.J.)
- Immanuel Selvakumar
- Department of Electrical and Electronics Engineering, Karunya Institute of Technology and Sciences, Coimbatore 641114, India;
44
Navratil G, Giannopoulos I. Classifying Motorcyclist Behaviour with XGBoost Based on IMU Data. Sensors (Basel) 2024; 24:1042. [PMID: 38339759 PMCID: PMC10857319 DOI: 10.3390/s24031042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Revised: 01/31/2024] [Accepted: 02/01/2024] [Indexed: 02/12/2024]
Abstract
Human behaviour detection is relevant in many fields. During navigational tasks it is an indicator of environmental conditions. Therefore, monitoring people while they move along the street network provides insights into the environment. This is especially true for motorcyclists, who have to observe aspects such as road surface conditions or traffic very carefully. We thus performed an experiment to check whether IMU data is sufficient to classify motorcyclist behaviour as a data source for later spatial and temporal analysis. The classification was done using XGBoost and proved successful for four of the original five types of behaviour. A classification accuracy of approximately 80% was achieved. Only overtaking manoeuvres were not identified reliably.
Affiliation(s)
- Gerhard Navratil
- Department for Geodesy and Geoinformation, TU Wien, Wiedner Hauptstr. 8-10, 1040 Vienna, Austria;
45
Zheng Z, Liang L, Luo X, Chen J, Lin M, Wang G, Xue C. Diagnosing and tracking depression based on eye movement in response to virtual reality. Front Psychiatry 2024; 15:1280935. [PMID: 38374979 PMCID: PMC10875075 DOI: 10.3389/fpsyt.2024.1280935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Accepted: 01/16/2024] [Indexed: 02/21/2024] Open
Abstract
Introduction Depression is a prevalent mental illness that is primarily diagnosed using psychological and behavioral assessments. However, these assessments lack objective and quantitative indices, making rapid and objective detection challenging. In this study, we propose a novel method for depression detection based on eye movement data captured in response to virtual reality (VR). Methods Eye movement data were collected and used to establish high-performance classification and prediction models. Four machine learning algorithms, namely eXtreme Gradient Boosting (XGBoost), multilayer perceptron (MLP), Support Vector Machine (SVM), and Random Forest, were employed. The models were evaluated using five-fold cross-validation, and performance metrics including accuracy, precision, recall, area under the curve (AUC), and F1-score were assessed. The prediction error for the Patient Health Questionnaire-9 (PHQ-9) score was also determined. Results The XGBoost model achieved a mean accuracy of 76%, precision of 94%, recall of 73%, and AUC of 82%, with an F1-score of 78%. The MLP model achieved a classification accuracy of 86%, precision of 96%, recall of 91%, and AUC of 86%, with an F1-score of 92%. The prediction error for the PHQ-9 score ranged from -0.6 to 0.6. To investigate the role of computerized cognitive behavioral therapy (CCBT) in treating depression, participants were divided into intervention and control groups. The intervention group received CCBT, while the control group received no treatment. After five CCBT sessions, significant changes were observed in the eye movement indices of fixation and saccade, as well as in the PHQ-9 scores. These two indices played significant roles in the predictive model, indicating their potential as biomarkers for detecting depression symptoms. Discussion The results suggest that eye movement indices obtained using a VR eye tracker can serve as useful biomarkers for detecting depression symptoms.
Specifically, the fixation and saccade indices showed promise in predicting depression. Furthermore, CCBT demonstrated effectiveness in treating depression, as evidenced by the observed changes in eye movement indices and PHQ-9 scores. In conclusion, this study presents a novel approach for depression detection using eye movement data captured in VR. The findings highlight the potential of eye movement indices as biomarkers and underscore the effectiveness of CCBT in treating depression.
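For readers unfamiliar with the reported metrics, the following minimal sketch shows how accuracy, precision, recall, and F1-score are derived from binary predictions. The labels below are made-up illustrative data, not results from the study, and the function name is hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Made-up example: 1 = depressed, 0 = control.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

When metrics are averaged over cross-validation folds, the mean F1-score need not equal the F1 computed from the mean precision and mean recall, because F1 is a non-linear function of the two.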
Affiliation(s)
- Zhiguo Zheng
- School of Information and Communication Engineering, Hainan University, Haikou, China
- School of Information Engineering, Hainan Vocational University of Science and Technology, Haikou, China
- Lijuan Liang
- The First Affiliated Hospital of Hainan Medical University, Haikou, China
- Xiong Luo
- Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
- Jie Chen
- School of Information Engineering, Hainan Vocational University of Science and Technology, Haikou, China
- Meirong Lin
- School of Information Engineering, Hainan Vocational University of Science and Technology, Haikou, China
- Guanjun Wang
- School of Electronic Science and Technology, Hainan University, Haikou, China
- Chenyang Xue
- School of Electronic Science and Technology, Hainan University, Haikou, China
46
Joe H, Kim HG. Multi-label classification with XGBoost for metabolic pathway prediction. BMC Bioinformatics 2024; 25:52. [PMID: 38297220 PMCID: PMC10832249 DOI: 10.1186/s12859-024-05666-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Accepted: 01/22/2024] [Indexed: 02/02/2024] Open
Abstract
BACKGROUND Metabolic pathway prediction is one possible approach to the systems biology problem of reconstructing an organism's metabolic network from its genome sequence. Recent work on machine learning-based pathway prediction has concluded that such approaches perform similarly to the most widely used method, PathoLogic, which is rule-based. One issue is that previous studies evaluated PathoLogic without taxonomic pruning, which decreases its performance. RESULTS In this study, we update the evaluation results from previous studies to demonstrate that PathoLogic with taxonomic pruning outperforms previous machine learning-based approaches and that further improvements in performance are needed for them to be competitive. Furthermore, we introduce mlXGPR, an XGBoost-based metabolic pathway prediction method built on the multi-label classification pathway prediction framework introduced by mlLGPR. We also improve on this multi-label framework by utilizing correlations between labels using classifier chains. We propose a ranking method that determines the order of the chain so that lower-performing classifiers are placed later in the chain and can make more use of the correlations between labels. We evaluate mlXGPR with and without classifier chains on single-organism and multi-organism benchmarks. Our results indicate that mlXGPR outperforms previous pathway prediction methods, including PathoLogic with taxonomic pruning, in terms of Hamming loss, precision, and F1 score on single-organism benchmarks. CONCLUSIONS The results from our study indicate that the performance of machine learning-based pathway prediction methods can be substantially improved and can even outperform PathoLogic with taxonomic pruning.
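The chain-ordering idea described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the mlXGPR implementation: the pathway labels, the validation scores, and the rule-based "classifiers" are all hypothetical stand-ins for trained models.

```python
def chain_order(label_scores):
    """Order labels so that lower-scoring classifiers come later in the chain."""
    return sorted(label_scores, key=label_scores.get, reverse=True)

def predict_chain(classifiers, order, features):
    """Run a classifier chain: each classifier sees the features plus earlier predictions."""
    augmented = list(features)
    predictions = {}
    for label in order:
        predictions[label] = classifiers[label](augmented)
        augmented.append(predictions[label])
    return predictions

def hamming_loss(y_true, y_pred):
    """Fraction of label slots that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p for row_t, row_p in zip(y_true, y_pred) for t, p in zip(row_t, row_p))
    return wrong / total

# Hypothetical per-label validation scores (higher is better).
scores = {"glycolysis": 0.90, "tca_cycle": 0.60, "urea_cycle": 0.75}
order = chain_order(scores)  # ['glycolysis', 'urea_cycle', 'tca_cycle']

# Hypothetical rule-based stand-ins for trained per-label classifiers.
classifiers = {
    "glycolysis": lambda f: int(f[0] > 0.5),
    "urea_cycle": lambda f: int(f[1] > 0.5),
    # Last in the chain: f[2] is the earlier glycolysis prediction.
    "tca_cycle": lambda f: int(f[2] == 1 and f[1] > 0.2),
}

predictions = predict_chain(classifiers, order, [0.9, 0.3])
print(order, predictions)
print(hamming_loss([[1, 0, 1], [0, 1, 0]], [[1, 1, 1], [0, 1, 1]]))
```

Placing the weakest classifier last gives it the largest number of earlier label predictions as additional inputs, which is the intuition the abstract describes.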
Affiliation(s)
- Hyunwhan Joe
- Biomedical Knowledge Engineering Lab., Seoul National University, Seoul, Republic of Korea
- Hong-Gee Kim
- Biomedical Knowledge Engineering Lab., Seoul National University, Seoul, Republic of Korea.
- School of Dentistry and Dental Research Institute, Seoul National University, Seoul, Republic of Korea.
47
Lei L, Zhang L, Han Z, Chen Q, Liao P, Wu D, Tai J, Xie B, Su Y. Advancing chronic toxicity risk assessment in freshwater ecology by molecular characterization-based machine learning. Environ Pollut 2024; 342:123093. [PMID: 38072027 DOI: 10.1016/j.envpol.2023.123093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2023] [Revised: 11/30/2023] [Accepted: 12/02/2023] [Indexed: 01/26/2024]
Abstract
The continuously increasing production of various chemicals and their release into the environment have raised concerns about potential negative effects on ecological health. However, traditional labor-intensive assessment methods cannot effectively and rapidly evaluate these hazards, especially for chronic risk. In this study, machine learning (ML) was employed to construct quantitative structure-activity relationship (QSAR) models, enabling the prediction of chronic toxicity to aquatic organisms by leveraging the molecular characteristics of pollutants, namely, the molecular descriptors, fingerprints, and graphs. The limited dataset size prevented the graph attention network (GAT) model from showing a notable advantage on the molecular graphs. Considering computational efficiency and performance (R² = 0.78; RMSE = 0.77), XGBoost (XGB) was used to build reliable QSAR-ML models that predict chronic toxicity from small- or medium-sized tabular data and the molecular descriptors. Further kernel density estimation analysis confirmed the high accuracy of the model for pollutant concentrations ranging from 10⁻³ to 10² mg/L, which covers most environmental scenarios. Model interpretation showed SlogP and exposure duration to be the primary influential factors. SlogP, representing the distribution coefficient of a molecule between lipophilic and hydrophilic environments, had a negative effect on the toxicity outcomes. Additionally, the exposure duration played a crucial role in determining the chronic toxicity. Finally, the chronic toxicity data of bisphenol A validated the robustness and reliability of the model established in this research. Our study provides a robust and feasible methodology for chronic ecological risk evaluation of various types of pollutants and could facilitate and increase the use of ML applications in environmental fields.
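The two regression metrics quoted in the abstract, R² and RMSE, can be computed as in the minimal sketch below. The observed and predicted values are made-up illustrative numbers, not data from the study.

```python
from math import sqrt

def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up example values (e.g. log-transformed toxicity endpoints).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]

print(r_squared(y_true, y_pred))  # approximately 0.9
print(rmse(y_true, y_pred))       # approximately 0.354
```

RMSE is expressed in the units of the target variable, whereas R² is unitless and measures the share of variance explained relative to always predicting the mean.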
Affiliation(s)
- Lang Lei
- Shanghai Engineering Research Center of Biotransformation of Organic Solid Waste, School of Ecological and Environmental Sciences, East China Normal University, Shanghai, 200241, China
- Liangmao Zhang
- Shanghai Engineering Research Center of Biotransformation of Organic Solid Waste, School of Ecological and Environmental Sciences, East China Normal University, Shanghai, 200241, China
- Zhibang Han
- Shanghai Engineering Research Center of Biotransformation of Organic Solid Waste, School of Ecological and Environmental Sciences, East China Normal University, Shanghai, 200241, China
- Qirui Chen
- Shanghai Engineering Research Center of Biotransformation of Organic Solid Waste, School of Ecological and Environmental Sciences, East China Normal University, Shanghai, 200241, China
- Pengcheng Liao
- Shanghai Engineering Research Center of Biotransformation of Organic Solid Waste, School of Ecological and Environmental Sciences, East China Normal University, Shanghai, 200241, China
- Dong Wu
- Shanghai Engineering Research Center of Biotransformation of Organic Solid Waste, School of Ecological and Environmental Sciences, East China Normal University, Shanghai, 200241, China; Chongqing Key Laboratory of Precision Optics, Chongqing Institute of East China Normal University, Chongqing, 401120, China; Shanghai Institute of Pollution Control and Ecological Security, Shanghai, 200092, China
- Jun Tai
- Shanghai Environmental Sanitation Engineering Design Institute Co., Ltd., Shanghai, 200232, China
- Bing Xie
- Shanghai Engineering Research Center of Biotransformation of Organic Solid Waste, School of Ecological and Environmental Sciences, East China Normal University, Shanghai, 200241, China; Shanghai Institute of Pollution Control and Ecological Security, Shanghai, 200092, China
- Yinglong Su
- Shanghai Engineering Research Center of Biotransformation of Organic Solid Waste, School of Ecological and Environmental Sciences, East China Normal University, Shanghai, 200241, China; Chongqing Key Laboratory of Precision Optics, Chongqing Institute of East China Normal University, Chongqing, 401120, China; Shanghai Institute of Pollution Control and Ecological Security, Shanghai, 200092, China.
48
Tao Q, Wu L, An J, Liu Z, Zhang K, Zhou L, Zhang X. Proteomic analysis of human aqueous humor from fuchs uveitis syndrome. Exp Eye Res 2024; 239:109752. [PMID: 38123010 DOI: 10.1016/j.exer.2023.109752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Revised: 11/25/2023] [Accepted: 12/11/2023] [Indexed: 12/23/2023]
Abstract
Fuchs uveitis syndrome (FUS) is a commonly misdiagnosed uveitis syndrome often presenting as an asymptomatic mild inflammatory condition until complications arise. The diagnosis of this disease remains clinical because of the lack of specific laboratory tests. The aqueous humor (AH) is a complex fluid containing nutrients and metabolic wastes from the eye. Changes in AH proteins provide important information for diagnosing intraocular diseases. This study aimed to analyze the proteomic profile of AH in individuals diagnosed with FUS and to identify potential biomarkers of the disease. We used liquid chromatography-tandem mass spectrometry-based proteomic methods to evaluate the AH protein profiles of all 37 samples, comprising 15 patients with FUS, six patients with Posner-Schlossman syndrome (PSS), and 16 patients with age-related cataract. A total of 538 proteins were identified from a comprehensive spectral library of 634 proteins. Subsequent differential expression analysis, enrichment analysis, and construction of key sub-networks revealed that the inflammatory response, complement activation, and hypoxia might be crucial in mediating the process of FUS. Hypoxia-inducible factor-1 may serve as a key regulator and therapeutic target. Additionally, the innate and adaptive immune responses are considered dominant in patients with FUS. A diagnostic model was constructed using a machine-learning algorithm to classify FUS, PSS, and normal controls. Two proteins, complement C1q subcomponent subunit B and secretogranin-1, were found to have the highest scores by Extreme Gradient Boosting, suggesting their potential utility as a biomarker panel. Furthermore, these two proteins were validated as biomarkers in a cohort of 18 patients using high-resolution multiple reaction monitoring assays. Therefore, this study contributes to advancing current knowledge of FUS pathogenesis and promotes the development of effective diagnostic strategies.
Affiliation(s)
- Qingqin Tao
- Tianjin Key Laboratory of Retinal Functions and Diseases, Tianjin Branch of National Clinical Research Center for Ocular Disease, Eye Institute and School of Optometry, Tianjin Medical University Eye Hospital, Tianjin, China
- Lingzi Wu
- Tianjin Key Laboratory of Retinal Functions and Diseases, Tianjin Branch of National Clinical Research Center for Ocular Disease, Eye Institute and School of Optometry, Tianjin Medical University Eye Hospital, Tianjin, China; Beijing Institute of Ophthalmology, Beijing Tongren Eye Center, Beijing Tongren Hospital, Capital Medical University, Beijing, China
- Jinying An
- Tianjin Key Laboratory of Retinal Functions and Diseases, Tianjin Branch of National Clinical Research Center for Ocular Disease, Eye Institute and School of Optometry, Tianjin Medical University Eye Hospital, Tianjin, China
- Kai Zhang
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Key Laboratory of Immune Microenvironment and Disease (Ministry of Education), Tianjin Key Laboratory of Medical Epigenetics, Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
- Lei Zhou
- School of Optometry, Department of Applied Biology and Chemical Technology, Research Centre for SHARP Vision (RCSV), The Hong Kong Polytechnic University, Hong Kong, China; Centre for Eye and Vision Research (CEVR), 17W Hong Kong Science Park, Hong Kong, China
- Xiaomin Zhang
- Tianjin Key Laboratory of Retinal Functions and Diseases, Tianjin Branch of National Clinical Research Center for Ocular Disease, Eye Institute and School of Optometry, Tianjin Medical University Eye Hospital, Tianjin, China.
49
Alabi RO, Almangush A, Elmusrati M, Leivo I, Mäkitie AA. Interpretable machine learning model for prediction of overall survival in laryngeal cancer. Acta Otolaryngol 2024:1-7. [PMID: 38279817 DOI: 10.1080/00016489.2023.2301648] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Accepted: 12/21/2023] [Indexed: 01/29/2024]
Abstract
Background: The mortality rates of laryngeal squamous cell carcinoma (LSCC) have not significantly decreased in the last decades. Objectives: We primarily aimed to compare the predictive performance of DeepTables with state-of-the-art machine learning (ML) algorithms (Voting ensemble, Stack ensemble, and XGBoost) in stratifying patients with LSCC by chance of overall survival (OS). In addition, we complemented the developed model by providing interpretability using both global and local model-agnostic techniques. Methods: A total of 2792 patients in the Surveillance, Epidemiology, and End Results (SEER) database diagnosed with LSCC were reviewed. Global model-agnostic interpretability was examined using the SHapley Additive exPlanations (SHAP) technique. Likewise, individual predictions were interpreted using Local Interpretable Model-Agnostic Explanations (LIME). Results: The state-of-the-art ML ensemble algorithms outperformed DeepTables. Specifically, the examined ensemble algorithms showed comparable weighted areas under the receiver operating characteristic curve of 76.9, 76.8, and 76.1, with accuracies of 71.2%, 70.2%, and 71.8%, respectively. The global interpretability method (SHAP) demonstrated that the age of the patient at diagnosis, N-stage, T-stage, tumor grade, and marital status are among the prominent parameters. Conclusions: An ML model for OS prediction may serve as an ancillary tool for treatment planning of LSCC patients.
Affiliation(s)
- Rasheed Omobolaji Alabi
- Research Program in Systems Oncology, University of Helsinki, Helsinki, Finland
- Department of Industrial Digitalization, School of Technology and Innovations, University of Vaasa, Vaasa, Finland
- Alhadi Almangush
- Research Program in Systems Oncology, University of Helsinki, Helsinki, Finland
- Department of Pathology, University of Helsinki, Helsinki, Finland
- Institute of Biomedicine, University of Turku, Pathology, Finland
- Mohammed Elmusrati
- Department of Industrial Digitalization, School of Technology and Innovations, University of Vaasa, Vaasa, Finland
- Ilmo Leivo
- Institute of Biomedicine, University of Turku, Pathology, Finland
- Antti A Mäkitie
- Research Program in Systems Oncology, University of Helsinki, Helsinki, Finland
- Department of Otorhinolaryngology - Head and Neck Surgery, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
- Division of Ear, Nose and Throat Diseases, Department of Clinical Sciences, Intervention and Technology, Karolinska Institute and Karolinska University Hospital, Stockholm, Sweden
50
Liu SH, Ting CE, Wang JJ, Chang CJ, Chen W, Sharma AK. Estimation of Gait Parameters for Adults with Surface Electromyogram Based on Machine Learning Models. Sensors (Basel) 2024; 24:734. [PMID: 38339451 PMCID: PMC10857519 DOI: 10.3390/s24030734] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Revised: 01/18/2024] [Accepted: 01/22/2024] [Indexed: 02/12/2024]
Abstract
Gait analysis has been studied over the last few decades as the best way to objectively assess the technical outcome of a procedure designed to improve gait. With gait analysis, the treating physician can understand the type of gait problem, gain insight into the etiology, and find the best treatment. The gait parameters are kinematic, covering temporal and spatial parameters, and lack information on the activity of skeletal muscles. Thus, gait analysis measures not only the three-dimensional temporal and spatial graphs of kinematics but also the surface electromyograms (sEMGs) of the lower limbs. The shoe-worn GaitUp Physilog® wearable inertial sensors can now easily measure the gait parameters while subjects walk on ordinary ground. However, they cannot measure muscle activity. The aim of this study is to estimate the gait parameters using the sEMGs of the lower limbs. A self-made wireless device was used to measure the sEMGs from the vastus lateralis and gastrocnemius muscles of the left and right feet. Twenty young female subjects with a skeletal muscle index (SMI) below 5.7 kg/m², as examined by the InBody 270 instrument, were recruited for this study. Four parameters of sEMG were used to estimate 23 gait parameters, which were measured using the GaitUp Physilog® wearable inertial sensors, with three machine learning models: random forest (RF), decision tree (DT), and XGBoost. The results show that 14 gait parameters could be well estimated, with correlation coefficients above 0.800. This study signifies a step towards a more comprehensive analysis of gait with only sEMGs.
Affiliation(s)
- Shing-Hong Liu
- Department of Computer Science and Information Engineering, Chaoyang University of Technology, Taichung City 41349, Taiwan; (S.-H.L.); (C.-E.T.)
- Chi-En Ting
- Department of Computer Science and Information Engineering, Chaoyang University of Technology, Taichung City 41349, Taiwan; (S.-H.L.); (C.-E.T.)
- Jia-Jung Wang
- Department of Biomedical Engineering, I-Shou University, Kaohsiung 82445, Taiwan
- Chun-Ju Chang
- Department of Golden-Ager Industry Management, Chaoyang University of Technology, Taichung City 41349, Taiwan;
- Wenxi Chen
- Division of Information Systems, School of Computer Science and Engineering, The University of Aizu, Aizu-Wakamatsu City 965-8580, Fukushima, Japan;
- Alok Kumar Sharma
- Department of Computer Science and Information Engineering, Chaoyang University of Technology, Taichung City 41349, Taiwan; (S.-H.L.); (C.-E.T.)