Reference Citation Analysis

For: Zhang C, Chen X, Wang S, Hu J, Wang C, Liu X. Using CatBoost algorithm to identify middle-aged and elderly depression, national health and nutrition examination survey 2011-2018. Psychiatry Res 2021;306:114261. [PMID: 34781111 DOI: 10.1016/j.psychres.2021.114261] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 08/04/2021] [Revised: 10/10/2021] [Accepted: 10/30/2021] [Indexed: 12/16/2022]

Cited by Other Article(s)

Razavi-Termeh SV, Sadeghi-Niaraki A, Yao XA, Naqvi RA, Choi SM. Assessment of noise pollution-prone areas using an explainable geospatial artificial intelligence approach. J Environ Manage 2024;370:122361. [PMID: 39255573 DOI: 10.1016/j.jenvman.2024.122361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2024] [Revised: 08/12/2024] [Accepted: 08/30/2024] [Indexed: 09/12/2024]

Abstract

This research aims to use the power of geospatial artificial intelligence (GeoAI), employing the categorical boosting (CatBoost) machine learning model in conjunction with two metaheuristic algorithms, the firefly algorithm (CatBoost-FA) and the fruit fly optimization algorithm (CatBoost-FOA), to spatially assess and map noise-pollution-prone areas in Tehran city, Iran. To spatially model areas susceptible to noise pollution, we established a comprehensive spatial database encompassing data for the annual average Leq (equivalent continuous sound level) from 2019 to 2022. This database was enriched with critical spatial criteria influencing noise pollution, including urban land use, traffic volume, population density, and normalized difference vegetation index (NDVI). Our study evaluated the predictive accuracy of these models using key performance metrics, including root mean square error (RMSE), mean absolute error (MAE), and receiver operating characteristic (ROC) indices. The results demonstrated the superior performance of the CatBoost-FA algorithm, with RMSE and MAE values of 0.159 and 0.114 for the training data and 0.437 and 0.371 for the test data, outperforming both the CatBoost-FOA and CatBoost models. ROC analysis further confirmed the efficacy of the models, with CatBoost-FA achieving an accuracy of 0.897, CatBoost-FOA an accuracy of 0.871, and CatBoost an accuracy of 0.846, highlighting their robust modeling capabilities. Additionally, we employed an explainable artificial intelligence (XAI) approach, utilizing the SHAP (Shapley Additive Explanations) method to interpret the underlying mechanisms of our models. The SHAP results revealed the significant influence of various factors on noise-pollution-prone areas, with airport, commercial, and administrative zones emerging as pivotal contributors.
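The RMSE and MAE figures quoted in this abstract are both averages of prediction residuals. As a minimal sketch of the two standard definitions (the sample values are hypothetical and are not taken from the study):

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical observed and predicted values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]

print(rmse(observed, predicted))  # 1.0
print(mae(observed, predicted))   # 0.5
```

Because RMSE squares each residual before averaging, a single large miss raises it more than it raises MAE, which is why the two are usually reported together.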

Li R, Wang X, Luo L, Yuan Y. Identifying the most crucial factors associated with depression based on interpretable machine learning: a case study from CHARLS. Front Psychol 2024;15:1392240. [PMID: 39118849 PMCID: PMC11306142 DOI: 10.3389/fpsyg.2024.1392240] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Accepted: 07/08/2024] [Indexed: 08/10/2024]

Abstract

Background

Depression is one of the most common mental illnesses among middle-aged and older adults in China. It is of great importance to find the crucial factors that lead to depression and to effectively control and reduce the risk of depression. Currently, there are limited methods available to accurately predict the risk of depression and identify the crucial factors that influence it.

Methods

We collected data from 25,586 samples from the harmonized China Health and Retirement Longitudinal Study (CHARLS), and the latest records from 2018 were included in the current cross-sectional analysis. Ninety-three input variables in the survey were considered as potential influential features. Five machine learning (ML) models were utilized: CatBoost, eXtreme Gradient Boosting (XGBoost), Gradient Boosting Decision Tree (GBDT), Random Forest (RF), and Light Gradient Boosting Machine (LightGBM). The models were compared to the traditional multivariable Linear Regression (LR) model. SHapley Additive exPlanations (SHAP) were used to identify key influencing factors at the global level and to explain individual heterogeneity through instance-level analysis. To explore how different factors are non-linearly associated with the risk of depression, we employed the Accumulated Local Effects (ALE) approach to analyze the identified critical variables while controlling for other covariates.

Results

CatBoost outperformed other machine learning models in terms of the MAE, MSE, MedAE, and R² metrics. The top three crucial factors identified by SHAP were r4satlife, r4slfmem, and r4shlta, representing life satisfaction, self-reported memory, and health status levels, respectively.

Conclusion

This study demonstrates that the CatBoost model is an appropriate choice for predicting depression among middle-aged and older adults in Harmonized CHARLS. The SHAP and ALE interpretable methods have identified crucial factors and the nonlinear relationship with depression, which require the attention of domain experts.

Li Q, Lv H, Chen Y, Shen J, Shi J, Zhou C. Development and validation of a machine learning predictive model for perioperative myocardial injury in cardiac surgery with cardiopulmonary bypass. J Cardiothorac Surg 2024;19:384. [PMID: 38926872 PMCID: PMC11201784 DOI: 10.1186/s13019-024-02856-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Accepted: 06/14/2024] [Indexed: 06/28/2024]

Abstract

BACKGROUND

Perioperative myocardial injury (PMI) with different cut-off values has been shown to be associated with different prognostic effects after cardiac surgery. Machine learning (ML) methods have been widely used in perioperative risk prediction during cardiac surgery. However, the utilization of ML in PMI has not yet been studied. Therefore, we sought to develop and validate ML models for PMI with different cut-off values in cardiac surgery with cardiopulmonary bypass (CPB).

METHODS

This was a secondary analysis of a multicenter clinical trial (OPTIMAL), and the requirement for written informed consent was waived due to the retrospective design. Patients aged 18-70 undergoing elective cardiac surgery with CPB from December 2018 to April 2021 were enrolled in China. The models were developed using the data from Fuwai Hospital and externally validated by the other three cardiac centres. Traditional logistic regression (LR) and eleven ML models were constructed. The primary outcome was PMI, defined as the postoperative maximum cardiac troponin I exceeding different multiples of the upper reference limit (URL): 40x, 70x, 100x, and 130x. We measured model performance by examining the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC), and the Brier score for calibration.

RESULTS

A total of 2983 eligible patients eventually participated in both the model development (n = 2420) and external validation (n = 563). The CatBoostClassifier and RandomForestClassifier emerged as potential alternatives to the LR model for predicting PMI. The AUROC demonstrated an increase with each of the four cutoffs, peaking at 100x URL in the testing dataset and at 70x URL in the external validation dataset. However, it's worth noting that the AUPRC decreased with each cutoff increment. Additionally, the Brier loss score decreased as the cutoffs increased, reaching its lowest point at 0.16 with a 130x URL cutoff. Moreover, extended CPB time, aortic duration, elevated preoperative N-terminal pro-brain natriuretic peptide, reduced preoperative neutrophil count, higher body mass index, and increased high-sensitivity C-reactive protein levels were identified as risk factors for PMI across all four cutoff values.

CONCLUSIONS

The CatBoostClassifier and RandomForestClassifier algorithms could be alternatives to LR in the prediction of PMI. Furthermore, higher preoperative N-terminal pro-brain natriuretic peptide and lower high-sensitivity C-reactive protein were strong risk factors for PMI; the underlying mechanisms require further investigation.

Li Q, Lv H, Chen Y, Shen J, Shi J, Zhou C. Hybrid feature selection in a machine learning predictive model for perioperative myocardial injury in noncoronary cardiac surgery with cardiopulmonary bypass. Perfusion 2024:2676591241253459. [PMID: 38733257 DOI: 10.1177/02676591241253459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/13/2024]

Abstract

BACKGROUND

Perioperative myocardial injury (PMI) is associated with increased morbidity and mortality after noncoronary cardiac surgery. However, few studies have developed a predictive model for PMI. Therefore, we used hybrid feature selection (FS) methods to establish a predictive model for PMI in noncoronary cardiac surgery with cardiopulmonary bypass (CPB).

METHODS

This was a single-center retrospective study conducted at the Fuwai Hospital in China. Patients aged 18-70 years who underwent elective noncoronary surgery with CPB at our institution from December 2018 to April 2021 were enrolled. The primary outcome was PMI, defined as a postoperative cardiac troponin I (cTnI) level exceeding 220 times the upper reference limit (URL). Statistical analyses were conducted in Python (Python Software Foundation, version 3.9.7, with the integrated development environment Jupyter Notebook 1.1.0) and SPSS software version 26.0 (IBM Corp., Armonk, New York, USA).

RESULTS

A total of 1130 patients were eventually eligible for this study. The incidence of PMI was 20.3% (229/1130) in the overall cohort, 20.6% (163/791) in the training dataset, and 19.5% (66/339) in the testing dataset. The logistic regression model achieved its best AUC of 0.6893 (95% CI: 0.6371-0.7382) with the traditional selection method; the random forest model achieved its best AUC of 0.6937 (95% CI: 0.6416-0.7423) with the union of the Wrapper and Embedded methods; the CatBoost model achieved its best AUC of 0.6828 (95% CI: 0.6304-0.7320) with the union of the Embedded and forward logistic regression techniques; and the Naïve Bayes model achieved its best AUC of 0.7254 (95% CI: 0.6746-0.7723) with the forward logistic regression method. Moreover, the decision tree, KNeighborsClassifier, and support vector machine models had the worst AUCs across all selection methods. Furthermore, the SHapley Additive exPlanations plot showed that prolonged CPB, aortic clamp time, and low preoperative platelet count were strongly related to PMI risk.

CONCLUSIONS

In total, four categories of feature selection methods were utilized, comprising five individual selection techniques and 15 combined methods. Notably, the combination of logistic regression and embedded methods demonstrated outstanding performance in predicting PMI risk. We also concluded that machine learning models, including random forest, CatBoost, and Naïve Bayes, were suitable candidates for establishing a PMI predictive model. Nevertheless, additional investigation and validation are imperative for substantiating these findings.
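The incidence percentages in the Results paragraph of this abstract follow directly from the reported counts. A small sketch that recomputes them and checks that the training and testing subsets add up to the full cohort (only the counts come from the abstract; the names `COUNTS` and `incidence_pct` are introduced here for illustration):

```python
# (PMI cases, patients) as reported in the abstract's Results paragraph.
COUNTS = {
    "overall": (229, 1130),
    "training": (163, 791),
    "testing": (66, 339),
}

def incidence_pct(cases, total):
    """Incidence as a percentage, rounded to one decimal place."""
    return round(100 * cases / total, 1)

for name, (cases, total) in COUNTS.items():
    print(name, incidence_pct(cases, total))
# overall 20.3
# training 20.6
# testing 19.5

# The two subsets should partition the overall cohort.
cases_match = COUNTS["training"][0] + COUNTS["testing"][0] == COUNTS["overall"][0]
totals_match = COUNTS["training"][1] + COUNTS["testing"][1] == COUNTS["overall"][1]
print(cases_match, totals_match)  # True True
```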

Lin W, Shi S, Lan H, Wang N, Huang H, Wen J, Chen G. Identification of influence factors in overweight population through an interpretable risk model based on machine learning: a large retrospective cohort. Endocrine 2024;83:604-614. [PMID: 37776483 DOI: 10.1007/s12020-023-03536-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2023] [Accepted: 09/12/2023] [Indexed: 10/02/2023]

Yang J, Wan J, Feng L, Hou S, Yv K, Xu L, Chen K. Machine learning algorithms for the prediction of adverse prognosis in patients undergoing peritoneal dialysis. BMC Med Inform Decis Mak 2024;24:8. [PMID: 38166909 PMCID: PMC10763100 DOI: 10.1186/s12911-023-02412-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Accepted: 12/19/2023] [Indexed: 01/05/2024]

Zhang K, Xu X, You H. Social causation, social selection, and economic selection in the health outcomes of Chinese older adults and their gender disparities. SSM Popul Health 2023;24:101508. [PMID: 37720820 PMCID: PMC10500472 DOI: 10.1016/j.ssmph.2023.101508] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2023] [Revised: 08/26/2023] [Accepted: 09/02/2023] [Indexed: 09/19/2023]

Abstract

Background

The economic selection hypothesis, which argues that the initial economic situation determines both subsequent health and economic conditions, has been drawn into the debate on causation-selection issues. This study aims to construct a path model with self-rated health and depression score of older adults as health outcomes to measure and compare the social causation forces of wealth accumulation, social selection forces of adulthood health, and economic selection forces of childhood economics, and to examine their gender disparities.

Methods

Data was obtained from a sample of 19613 older adults aged 45 years or above from the 2014 life history survey and the 2015 routine follow-up survey of the China Health and Retirement Longitudinal Study. Structural equation modeling analysis was conducted employing the full information maximum likelihood estimation method.

Results

The presence of social causation, social selection, and economic selection were all statistically supported. In self-rated health, social selection forces held the dominant position, while social causation forces were comparable to economic selection forces. In depression score, social selection still exhibited stronger forces than economic selection, but social causation had forces close to social selection and greater than economic selection. The forces of the three hypotheses in self-rated health did not significantly change with gender, but social causation exerted mightier forces than economic selection within the male group, unlike the female group. The forces of economic selection in depression score were greater in females than males and no significant differences were observed among the forces of the three hypotheses in the female group.

Conclusions

Social causation, social selection, and economic selection operate simultaneously on the self-rated health and depression score of older adults. However, the force magnitudes of the three hypotheses and/or their rankings differ by health outcomes and gender.

Zhang Y, Wang H, Yin C, Shu T, Yu J, Jian J, Jian C, Duan M, Kadier K, Xu Q, Wang X, Xiang T, Liu X. Development of a prediction model for the risk of 30-day unplanned readmission in older patients with heart failure: A multicenter retrospective study. Nutr Metab Cardiovasc Dis 2023;33:1878-1887. [PMID: 37500347 DOI: 10.1016/j.numecd.2023.05.034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Revised: 05/21/2023] [Accepted: 05/31/2023] [Indexed: 07/29/2023]

Qu Z, Wang Y, Guo D, He G, Sui C, Duan Y, Zhang X, Lan L, Meng H, Wang Y, Liu X. Identifying depression in the United States veterans using deep learning algorithms, NHANES 2005-2018. BMC Psychiatry 2023;23:620. [PMID: 37612646 PMCID: PMC10463693 DOI: 10.1186/s12888-023-05109-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/25/2022] [Accepted: 08/13/2023] [Indexed: 08/25/2023]

Nhu NT, Kang JH, Yeh TS, Wu CC, Tsai CY, Piravej K, Lam C. Prediction of posttraumatic functional recovery in middle-aged and older patients through dynamic ensemble selection modeling. Front Public Health 2023;11:1164820. [PMID: 37408743 PMCID: PMC10319009 DOI: 10.3389/fpubh.2023.1164820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Accepted: 05/17/2023] [Indexed: 07/07/2023]

Abstract

Introduction

Age-specific risk factors may delay posttraumatic functional recovery; complex interactions exist between these factors. In this study, we investigated the prediction ability of machine learning models for posttraumatic (6 months) functional recovery in middle-aged and older patients on the basis of their preexisting health conditions.

Methods

Data obtained from injured patients aged ≥45 years were divided into training-validation (n = 368) and test (n = 159) data sets. The input features were the sociodemographic characteristics and baseline health conditions of the patients. The output feature was functional status 6 months after injury; this was assessed using the Barthel Index (BI). On the basis of their BI scores, the patients were categorized into functionally independent (BI >60) and functionally dependent (BI ≤60) groups. The permutation feature importance method was used for feature selection. Six algorithms were validated through cross-validation with hyperparameter optimization. The algorithms exhibiting satisfactory performance were subjected to bagging to construct stacking, voting, and dynamic ensemble selection models. The best model was evaluated on the test data set. Partial dependence (PD) and individual conditional expectation (ICE) plots were created.

Results

In total, nineteen of twenty-seven features were selected. Logistic regression, linear discriminant analysis, and Gaussian Naive Bayes algorithms exhibited satisfactory performances and were, therefore, used to construct ensemble models. The k-Nearest Oracle Elimination model outperformed the other models when evaluated on the training-validation data set (sensitivity: 0.732, 95% CI: 0.702-0.761; specificity: 0.813, 95% CI: 0.805-0.822); it exhibited comparable performance on the test data set (sensitivity: 0.779, 95% CI: 0.559-0.950; specificity: 0.859, 95% CI: 0.799-0.912). The PD and ICE plots showed patterns consistent with practical tendencies.

Conclusion

Preexisting health conditions can predict long-term functional outcomes in injured middle-aged and older patients, thus predicting prognosis and facilitating clinical decision-making.
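The Methods paragraph of this abstract dichotomizes patients at a Barthel Index of 60, and the Results report sensitivity and specificity. A minimal sketch of both steps, assuming that "functionally dependent" is treated as the positive class (the abstract does not say which class is positive) and using hypothetical scores and predictions:

```python
def bi_group(bi_score):
    """Apply the abstract's cut-off: BI > 60 is independent, BI <= 60 is dependent."""
    return "independent" if bi_score > 60 else "dependent"

def sensitivity_specificity(actual, predicted, positive="dependent"):
    """Return (sensitivity, specificity) for labels given as strings.

    The choice of positive class is an assumption made for this sketch.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical Barthel Index scores and model predictions, for illustration only.
bi_scores = [100, 60, 45, 85, 61, 20]
actual = [bi_group(score) for score in bi_scores]
predicted = ["independent", "dependent", "independent",
             "independent", "dependent", "dependent"]

sens, spec = sensitivity_specificity(actual, predicted)
print(actual)
print(round(sens, 3), round(spec, 3))
```

Note that a score of exactly 60 falls in the dependent group, since the abstract defines independence as BI >60.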

Affiliation(s)

Nguyen Thanh Nhu: International Ph.D. Program in Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan; Faculty of Medicine, Can Tho University of Medicine and Pharmacy, Can Tho, Vietnam
Jiunn-Horng Kang: International Ph.D. Program in Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan; Department of Physical Medicine and Rehabilitation, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan; Department of Physical Medicine and Rehabilitation, Taipei Medical University Hospital, Taipei, Taiwan; Graduate Institute of Nanomedicine and Medical Engineering, College of Biomedical Engineering, Taipei Medical University, Taipei, Taiwan; Professional Master Program in Artificial Intelligence in Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
Tian-Shin Yeh: Department of Physical Medicine and Rehabilitation, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan; Department of Physical Medicine and Rehabilitation, Wan Fang Hospital, Taipei Medical University, Taipei, Taiwan; Department of Epidemiology and Nutrition, Harvard T. H. Chan School of Public Health, Harvard University, Boston, MA, United States; Nuffield Department of Population Health, University of Oxford, Oxford, United Kingdom
Chia-Chieh Wu: Emergency Department, Wan Fang Hospital, Taipei Medical University, Taipei, Taiwan; Department of Emergency, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
Cheng-Yu Tsai: Centre for Transport Studies, Department of Civil and Environmental Engineering, Imperial College London, London, United Kingdom
Krisna Piravej: Department of Rehabilitation Medicine, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand; Department of Chula Neuroscience Center, King Chulalongkorn Memorial Hospital, Bangkok, Thailand
Carlos Lam: Emergency Department, Wan Fang Hospital, Taipei Medical University, Taipei, Taiwan; Department of Emergency, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan

Tsai HJ, Yang WC, Tsai SJ, Lin CH, Yang AC. Right-side frontal-central cortical hyperactivation before the treatment predicts outcomes of antidepressant and electroconvulsive therapy responsivity in major depressive disorder. J Psychiatr Res 2023;161:377-385. [PMID: 37012197 DOI: 10.1016/j.jpsychires.2023.03.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2023] [Revised: 03/08/2023] [Accepted: 03/13/2023] [Indexed: 04/05/2023]

Abstract

Major depressive disorder places a great burden on healthcare resources worldwide. Antidepressants are the first-line treatment for major depressive disorder, but if patients don't respond adequately, brain stimulation therapy may be needed as second-line treatment. Digital phenotyping in patients with major depressive disorder will aid in the timely prediction of treatment effectiveness. This study explored electroencephalographic (EEG) signatures that diversify depression treatment responsivity, including antidepressant administration or brain stimulation therapy. Resting-state, pre-treatment EEG sequences from depressive patients who received fluoxetine treatment (n = 55; 26 remitters and 29 poor responders) or electroconvulsive therapy (ECT, n = 58; 36 remitters and 22 nonremitters) were recorded on 19 channels. Twenty-nine EEG segments were obtained from each patient per recording electrode. Power spectral analysis was conducted for feature extraction and showed the highest predictive accuracy for fluoxetine or ECT outcomes. Both occurred with beta-band oscillations within right-side frontal-central (F1-score = 0.9437) or prefrontal areas of the brain (F1-score = 0.9416), respectively. Significantly higher beta-band power was observed among patients who lacked adequate treatment response than the remitters, specifically at 19.2 Hz or 24.5 Hz for fluoxetine administration or ECT outcome, respectively. Our findings indicated that pre-treatment, right-side cortical hyperactivation is associated with poor outcomes of antidepressant-based or ECT-based treatment in major depression. Whether depression treatment response rates can be improved by reducing the high-frequency EEG power in corresponding areas of the brain to provide a protective effect against depression recurrence warrants further study.

Chang W, Wang X, Yang J, Qin T. An Improved CatBoost-Based Classification Model for Ecological Suitability of Blueberries. Sensors (Basel) 2023;23:1811. [PMID: 36850409 PMCID: PMC9961688 DOI: 10.3390/s23041811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2022] [Revised: 01/30/2023] [Accepted: 01/31/2023] [Indexed: 06/18/2023]

Abstract

Selecting the best planting area for blueberries is an essential issue in agriculture. To improve the effectiveness of blueberry cultivation, this paper proposes, for the first time, a machine learning-based classification model for blueberry ecological suitability and validates it using multi-source environmental feature data. The sparrow search algorithm (SSA) was adopted to optimize the CatBoost model and classify the ecological suitability of blueberries based on the selection of data features. Firstly, the Borderline-SMOTE algorithm was used to balance the number of positive and negative samples. The Variance Inflation Factor and information gain methods were applied to filter out the factors affecting the growth of blueberries. Subsequently, the processed data were fed into CatBoost for training, and the parameters of CatBoost were optimized using SSA to obtain the optimal model. Finally, the SSA-CatBoost model was adopted to classify the ecological suitability of blueberries and output the suitability types. Taking a blueberry plantation in Majiang County, Guizhou Province, China as an example, the findings demonstrate that the AUC value of the SSA-CatBoost-based blueberry ecological suitability model is 0.921, which is 2.68% higher than that of CatBoost (AUC = 0.897) and significantly higher than those of Logistic Regression (AUC = 0.855), Support Vector Machine (AUC = 0.864), and Random Forest (AUC = 0.875). Furthermore, the ecological suitability of blueberries in Majiang County is mapped according to the classification results of the different models. Compared with the actual blueberry cultivation situation in Majiang County, the classification results of the SSA-CatBoost model proposed in this paper match it best, which gives the model a high reference value for the selection of blueberry cultivation sites.
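The "2.68% higher" figure in this abstract is a relative improvement over the plain CatBoost AUC, not an absolute difference (the absolute gap is 0.024). A short sketch of that arithmetic, using only the AUC values quoted in the abstract; the helper name `relative_gain_pct` is introduced here for illustration:

```python
SSA_CATBOOST_AUC = 0.921

# Baseline AUC values as quoted in the abstract.
BASELINE_AUCS = {
    "CatBoost": 0.897,
    "Logistic Regression": 0.855,
    "Support Vector Machine": 0.864,
    "Random Forest": 0.875,
}

def relative_gain_pct(new, baseline):
    """Relative improvement of `new` over `baseline`, in percent (2 decimals)."""
    return round(100 * (new - baseline) / baseline, 2)

print(relative_gain_pct(SSA_CATBOOST_AUC, BASELINE_AUCS["CatBoost"]))  # 2.68

# The same calculation for the other baselines named in the abstract.
for model, auc in BASELINE_AUCS.items():
    print(model, relative_gain_pct(SSA_CATBOOST_AUC, auc))
```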

Development and Validation of a Machine Learning Predictive Model for Cardiac Surgery-Associated Acute Kidney Injury. J Clin Med 2023;12:jcm12031166. [PMID: 36769813 PMCID: PMC9917969 DOI: 10.3390/jcm12031166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/25/2022] [Revised: 01/16/2023] [Accepted: 01/27/2023] [Indexed: 02/05/2023]

Abstract

OBJECTIVE

We aimed to develop and validate a predictive machine learning (ML) model for cardiac surgery-associated acute kidney injury (CSA-AKI) based on a multicenter randomized controlled trial (RCT) and the Medical Information Mart for Intensive Care-IV (MIMIC-IV) dataset.

METHODS

This was a subanalysis from a completed RCT approved by the Ethics Committee of Fuwai Hospital in Beijing, China (NCT03782350). Data from Fuwai Hospital were randomly assigned, with 80% for the training dataset and 20% for the testing dataset. The data from three other centers were used for the external validation dataset. Furthermore, the MIMIC-IV dataset was also utilized to validate the performance of the predictive model. The area under the receiver operating characteristic curve (ROC-AUC), the precision-recall curve (PR-AUC), and the calibration brier score were applied to evaluate the performance of the traditional logistic regression (LR) and eleven ML algorithms. Additionally, the Shapley Additive Explanations (SHAP) interpreter was used to explain the potential risk factors for CSA-AKI.

RESULT

A total of 6495 eligible patients undergoing cardiopulmonary bypass (CPB) were eventually included in this study: 2416 from Fuwai Hospital (Beijing) were used for model development, and 562 from three other cardiac centers in China and 3517 from the MIMIC-IV dataset were used for external validation. The CatBoostClassifier algorithm outperformed other models, with excellent discrimination and calibration performance for the development and MIMIC-IV datasets. In addition, the CatBoostClassifier achieved ROC-AUCs of 0.85, 0.67, and 0.77 and Brier scores of 0.14, 0.19, and 0.16 in the testing, external, and MIMIC-IV datasets, respectively. Moreover, the most important risk factor, N-terminal pro-brain natriuretic peptide (NT-proBNP), was confirmed by the LASSO method in the feature selection process. Notably, the SHAP explainer identified that the preoperative blood urea nitrogen level, prothrombin time, serum creatinine level, total bilirubin level, and age were positively correlated with CSA-AKI, whereas the preoperative platelet level, systolic and diastolic blood pressure, albumin level, and body weight were negatively associated with CSA-AKI.

CONCLUSIONS

The CatBoostClassifier algorithm outperformed other ML models in the discrimination and calibration of CSA-AKI prediction in cardiac surgery with CPB, based on a multicenter RCT and the MIMIC-IV dataset. Moreover, the preoperative NT-proBNP level was confirmed to be strongly related to CSA-AKI.
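The Brier scores reported in this abstract measure calibration as the mean squared difference between predicted probabilities and observed binary outcomes, so lower is better and 0 is perfect. A minimal sketch of the standard definition, with hypothetical probabilities and outcomes that are not taken from the study:

```python
def brier_score(outcomes, probabilities):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    if len(outcomes) != len(probabilities):
        raise ValueError("outcomes and probabilities must have the same length")
    return sum((p - y) ** 2 for y, p in zip(outcomes, probabilities)) / len(outcomes)

# Hypothetical data: 1 = event occurred, 0 = no event.
outcomes = [1, 0, 1, 0]
probabilities = [0.9, 0.1, 0.8, 0.3]

score = brier_score(outcomes, probabilities)
print(round(score, 4))  # 0.0375
```

A model that always predicts 0.5 scores 0.25 on any binary outcome, which gives a rough reference point when reading values such as those quoted above.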

Wei Q, Xu X, Xu X, Cheng Q. Early identification of autism spectrum disorder by multi-instrument fusion: A clinically applicable machine learning approach. Psychiatry Res 2023;320:115050. [PMID: 36645989 DOI: 10.1016/j.psychres.2023.115050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2022] [Revised: 12/30/2022] [Accepted: 01/05/2023] [Indexed: 01/12/2023]

Ustebay S, Sarmis A, Kaya GK, Sujan M. A comparison of machine learning algorithms in predicting COVID-19 prognostics. Intern Emerg Med 2023;18:229-239. [PMID: 36116079 PMCID: PMC9483274 DOI: 10.1007/s11739-022-03101-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 07/11/2022] [Accepted: 09/05/2022] [Indexed: 02/01/2023]

Qin Y, Wu J, Xiao W, Wang K, Huang A, Liu B, Yu J, Li C, Yu F, Ren Z. Machine Learning Models for Data-Driven Prediction of Diabetes by Lifestyle Type. Int J Environ Res Public Health 2022;19:ijerph192215027. [PMID: 36429751 PMCID: PMC9690067 DOI: 10.3390/ijerph192215027] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/06/2022] [Revised: 11/04/2022] [Accepted: 11/10/2022] [Indexed: 06/01/2023]

Shi S, Pan X, Zhang L, Wang X, Zhuang Y, Lin X, Shi S, Zheng J, Lin W. An application based on bioinformatics and machine learning for risk prediction of sepsis at first clinical presentation using transcriptomic data. Front Genet 2022;13:979529. [PMID: 36159979 PMCID: PMC9490444 DOI: 10.3389/fgene.2022.979529] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/27/2022] [Accepted: 08/10/2022] [Indexed: 12/02/2022]

Affiliation(s)

Songchang Shi, Department of Critical Care Medicine, Shengli Clinical Medical College of Fujian Medical University, Fujian Provincial Hospital South Branch, Fujian Provincial Jinshan Hospital, Fujian Provincial Hospital, Fuzhou, China
Xiaobin Pan, Department of Critical Care Medicine, Shengli Clinical Medical College of Fujian Medical University, Fujian Provincial Hospital South Branch, Fujian Provincial Jinshan Hospital, Fujian Provincial Hospital, Fuzhou, China
Lihui Zhang, Department of Critical Care Medicine, Shengli Clinical Medical College of Fujian Medical University, Fujian Provincial Hospital South Branch, Fujian Provincial Jinshan Hospital, Fujian Provincial Hospital, Fuzhou, China
Xincai Wang, Department of Critical Care Medicine, Shengli Clinical Medical College of Fujian Medical University, Fujian Provincial Hospital South Branch, Fujian Provincial Jinshan Hospital, Fujian Provincial Hospital, Fuzhou, China
Yingfeng Zhuang, Department of Critical Care Medicine, Shengli Clinical Medical College of Fujian Medical University, Fujian Provincial Hospital South Branch, Fujian Provincial Jinshan Hospital, Fujian Provincial Hospital, Fuzhou, China
Xingsheng Lin, Department of Critical Care Medicine, Shengli Clinical Medical College of Fujian Medical University, Fujian Provincial Hospital South Branch, Fujian Provincial Jinshan Hospital, Fujian Provincial Hospital, Fuzhou, China
Songjing Shi, Department of Critical Care Medicine, Shengli Clinical Medical College of Fujian Medical University, Fujian Provincial Hospital, Fuzhou, China
Jianzhang Zheng, Department of Orthopedics, Shengli Clinical Medical College of Fujian Medical University, Fujian Provincial Hospital, Fuzhou, China
Wei Lin, Department of Endocrinology, Shengli Clinical Medical College of Fujian Medical University, Fujian Provincial Hospital, Fuzhou, China

Duan M, Shu T, Zhao B, Xiang T, Wang J, Huang H, Zhang Y, Xiao P, Zhou B, Xie Z, Liu X. Explainable machine learning models for predicting 30-day readmission in pediatric pulmonary hypertension: A multicenter, retrospective study. Front Cardiovasc Med 2022;9:919224. [PMID: 35958416 PMCID: PMC9360407 DOI: 10.3389/fcvm.2022.919224] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/13/2022] [Accepted: 06/23/2022] [Indexed: 11/13/2022] Open

Gao W, Zhou L, Liu S, Guan Y, Gao H, Hui B. Machine learning prediction of lignin content in poplar with Raman spectroscopy. BIORESOURCE TECHNOLOGY 2022;348:126812. [PMID: 35131461 DOI: 10.1016/j.biortech.2022.126812] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/03/2022] [Revised: 01/29/2022] [Accepted: 01/31/2022] [Indexed: 06/14/2023]

Chen S, Liu LP, Wang YJ, Zhou XH, Dong H, Chen ZW, Wu J, Gui R, Zhao QY. Advancing Prediction of Risk of Intraoperative Massive Blood Transfusion in Liver Transplantation With Machine Learning Models. A Multicenter Retrospective Study. Front Neuroinform 2022;16:893452. [PMID: 35645754 PMCID: PMC9140217 DOI: 10.3389/fninf.2022.893452] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/10/2022] [Accepted: 04/25/2022] [Indexed: 11/13/2022] Open

Abstract

Background

Liver transplantation surgery is often accompanied by massive blood loss and massive transfusion (MT), and MT can cause many serious complications associated with high mortality. Therefore, there is an urgent need for a model that can predict the demand for MT to reduce the waste of blood resources and improve the prognosis of patients.

Objective

To develop a model for predicting intraoperative massive blood transfusion in liver transplantation surgery based on machine learning algorithms.

Methods

A total of 1,239 patients who underwent liver transplantation surgery in three large grade III-A general hospitals in China from March 2014 to November 2021 were included and analyzed. A total of 1,193 cases were randomly divided into a training set (70%) and a test set (30%), and 46 cases were prospectively collected as a validation set. The outcome of this study was intraoperative massive blood transfusion. A total of 27 candidate risk factors were collected, and recursive feature elimination (RFE) was used to select key features based on the Categorical Boosting (CatBoost) model. A total of ten machine learning models were built, among which the three best-performing models and the traditional logistic regression (LR) method were prospectively verified in the validation set. The Area Under the Receiver Operating Characteristic Curve (AUROC) was used for model performance evaluation. Shapley additive explanation values were applied to explain the complex ensemble learning models.

Results

Fifteen key variables were screened out: age, weight, hemoglobin, platelets, white blood cell count, activated partial thromboplastin time, prothrombin time, thrombin time, direct bilirubin, aspartate aminotransferase, total protein, albumin, globulin, creatinine, and urea. Among all algorithms, the predictive performance of the CatBoost model (AUROC: 0.810) was the best. In the prospective validation cohort, LR performed far worse than the other algorithms.

Conclusion

A prediction model for massive blood transfusion in liver transplantation surgery was successfully established based on the CatBoost algorithm, and its generalization was verified to a certain degree in the validation set. The model may be superior to the traditional LR model and other algorithms, and it can more accurately predict the risk of massive blood transfusions and guide clinical decision-making.
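
The evaluation protocol named in the Methods (a random 70%/30% split scored by AUROC) can be illustrated with a minimal sketch. This is not the authors' code: the function names `split_train_test` and `auroc`, the fixed seed, and the toy labels and scores are all hypothetical, and the feature selection and CatBoost model themselves are not reproduced here. AUROC is computed from its rank interpretation, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
import random


def split_train_test(cases, train_fraction=0.7, seed=0):
    """Randomly split cases into a training set and a test set.

    The 0.7 default mirrors the 70%/30% split in the abstract; the seed
    is an arbitrary choice made here for reproducibility.
    """
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def auroc(labels, scores):
    """Return AUROC for binary labels (1 = event, 0 = no event).

    Counts, over all positive/negative pairs, how often the positive
    case has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    print(auroc(labels, scores))  # 3 of 4 positive/negative pairs are ordered correctly

    train, test = split_train_test(range(10))
    print(len(train), len(test))
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a cohort of this size; a sort-based rank computation would be the usual choice for larger data.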
