Mo D, Zheng Q, Xiao B, Li L. Predicting thalassemia using deep neural network based on red blood cell indices. Clin Chim Acta 2023;543:117329. [PMID: 37019327 DOI: 10.1016/j.cca.2023.117329]
[Received: 01/19/2023] [Revised: 03/11/2023] [Accepted: 03/30/2023] [Indexed: 04/05/2023]
Abstract
BACKGROUND AND OBJECTIVE
The traditional statistical screening method for thalassemia based on red blood cell (RBC) indices is being replaced by machine learning. Here, we developed deep neural networks (DNNs) that outperformed the traditional method for predicting thalassemia.
METHOD
Using a dataset of 8693 records comprising genetic test results and 11 other features, we constructed 11 DNN models and 4 traditional statistical models, compared their performance, and analysed feature importance to interpret the DNN models.
RESULTS
For our best model, the area under the receiver operating characteristic curve, accuracy, Youden's index, F1 score, sensitivity, specificity, positive predictive value, and negative predictive value were 0.960, 0.897, 0.794, 0.897, 0.883, 0.911, 0.914, and 0.882, respectively. Compared with the traditional statistical model based on the mean corpuscular volume, these values were higher by 10.22%, 10.09%, 26.55%, 8.92%, 4.13%, 16.90%, 13.86%, and 6.07%, respectively; compared with the model based on the mean cellular haemoglobin, they were higher by 15.38%, 11.70%, 31.70%, 9.89%, 3.05%, 22.13%, 17.11%, and 5.94%, respectively. The performance of the DNN model decreased when age, RBC distribution width (RDW), sex, or both white blood cell count (WBC) and platelet count (PLT) were removed.
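The threshold-based metrics listed above can all be derived from the four cells of a confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code; the function name `screening_metrics` and the counts passed to it are hypothetical and chosen only for illustration. The area under the ROC curve is not included because it needs the model's continuous scores rather than a single confusion matrix.

```python
def screening_metrics(tp, fp, tn, fn):
    """Standard binary screening metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    youden = sensitivity + specificity - 1   # Youden's index J
    return {
        "accuracy": accuracy,
        "youden": youden,
        "f1": f1,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
    }


# Hypothetical counts, NOT taken from the study.
m = screening_metrics(tp=90, fp=10, tn=80, fn=20)

# Consistency check on the reported values: Youden's index is
# sensitivity + specificity - 1, so 0.883 + 0.911 - 1 gives 0.794.
reported_youden = 0.883 + 0.911 - 1
```

The last lines show that the reported Youden's index of 0.794 follows directly from the reported sensitivity and specificity.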
CONCLUSIONS
Our DNN model outperformed the current screening model. Of the 8 features, RDW and age were the most useful, followed by sex and the combination of WBC and PLT; the remaining features were nearly useless.
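One common way to obtain a feature ranking of this kind is ablation: re-evaluate the model with one feature, or one group of features, removed and record how far the score falls. The sketch below illustrates that general idea only; the abstract does not describe the authors' exact procedure. The names `ablation_drops` and `toy_evaluate` are hypothetical, and the numbers inside `toy_evaluate` are arbitrary placeholders, not results from the study. In practice `evaluate` would retrain and score a model on the kept features.

```python
from typing import Callable, Dict, Sequence, Tuple


def ablation_drops(
    features: Sequence[str],
    groups: Dict[str, Tuple[str, ...]],
    evaluate: Callable[[Tuple[str, ...]], float],
) -> Dict[str, float]:
    """Return the score drop caused by removing each feature group."""
    baseline = evaluate(tuple(features))
    drops = {}
    for name, removed in groups.items():
        kept = tuple(f for f in features if f not in removed)
        drops[name] = baseline - evaluate(kept)
    return drops


def toy_evaluate(kept: Tuple[str, ...]) -> float:
    """Hypothetical scorer with arbitrary numbers (NOT study results)."""
    score = 0.70
    score += 0.10 if "rdw" in kept else 0.0
    score += 0.10 if "age" in kept else 0.0
    score += 0.03 if "sex" in kept else 0.0
    # WBC and PLT are treated as redundant here: the score falls
    # only when both are missing.
    score += 0.02 if ("wbc" in kept or "plt" in kept) else 0.0
    return score


features = ["age", "rdw", "sex", "wbc", "plt"]
groups = {
    "age": ("age",),
    "rdw": ("rdw",),
    "sex": ("sex",),
    "wbc": ("wbc",),
    "plt": ("plt",),
    "wbc+plt": ("wbc", "plt"),
}
drops = ablation_drops(features, groups, toy_evaluate)
ranking = sorted(drops, key=drops.get, reverse=True)
```

Removing features as a group matters when two of them carry overlapping information: in the toy scorer, dropping WBC or PLT alone changes nothing, while dropping both lowers the score.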