Ma J, An S, Cao M, Zhang L, Lu J. Integrated machine learning and deep learning for predicting diabetic nephropathy model construction, validation, and interpretability.
Endocrine 2024:10.1007/s12020-024-03735-1. [PMID: 38393509 DOI: 10.1007/s12020-024-03735-1]
[Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Accepted: 02/06/2024] [Indexed: 02/25/2024]
Abstract
OBJECTIVE
To construct a risk prediction model for assisted diagnosis of Diabetic Nephropathy (DN) using machine learning algorithms, and to validate it internally and externally.
METHODS
First, the data were cleaned and enhanced, then divided into training and test sets at a 7:3 ratio. Next, variables related to DN were screened by difference analysis and by the Least Absolute Shrinkage and Selection Operator (LASSO), Recursive Feature Elimination (RFE), and Max-relevance and Min-redundancy (MRMR) algorithms. Ten machine learning models were constructed from the key variables. The best model was selected on the basis of Receiver Operating Characteristic (ROC) and Precision-Recall (PR) curves, Accuracy, Matthews Correlation Coefficient (MCC), and Kappa, and was then validated internally and externally. An online platform was constructed based on the best model.
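As an illustration of three of the selection metrics named above, the following minimal sketch computes Accuracy, MCC, and Cohen's Kappa from a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the study.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Return accuracy, MCC, and Cohen's kappa for a 2x2 confusion matrix."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n

    # MCC: correlation between predicted and true labels, in [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    # Kappa: observed agreement corrected for agreement expected by chance.
    p_pos = ((tp + fp) / n) * ((tp + fn) / n)  # both say "positive" by chance
    p_neg = ((tn + fn) / n) * ((tn + fp) / n)  # both say "negative" by chance
    p_chance = p_pos + p_neg
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0

    return {"accuracy": accuracy, "mcc": mcc, "kappa": kappa}


# Hypothetical counts for illustration only (not data from the paper).
example = binary_metrics(tp=40, fp=10, fn=5, tn=45)
perfect = binary_metrics(tp=50, fp=0, fn=0, tn=50)
```

Unlike Accuracy, both MCC and Kappa account for agreement that would occur by chance, which is one reason they are often reported alongside ROC and PR curves when classes are imbalanced.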
RESULTS
Fifteen key variables were selected, and among the 10 machine learning models, the Random Forest model achieved the best predictive performance. The area under the ROC curve was 0.912 in the test set and 0.828 and 0.863 in the two external validation cohorts, indicating excellent predictive and generalization abilities.
CONCLUSION
The model has good predictive value and is expected to aid the early diagnosis and clinical screening of DN.