1
Schipper A, Rutten M, van Gammeren A, Harteveld CL, Urrechaga E, Weerkamp F, den Besten G, Krabbe J, Slomp J, Schoonen L, Broeren M, van Wijnen M, Huijskens MJAJ, Koopmann T, van Ginneken B, Kusters R, Kurstjens S. Machine Learning-Based Prediction of Hemoglobinopathies Using Complete Blood Count Data. Clin Chem 2024; 70:1064-1075. [PMID: 38906831 DOI: 10.1093/clinchem/hvae081] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Accepted: 05/13/2024] [Indexed: 06/23/2024]
Abstract
BACKGROUND Hemoglobinopathies, the most common inherited blood disorder, are frequently underdiagnosed. Early identification of carriers is important for genetic counseling of couples at risk. The aim of this study was to develop and validate a novel machine learning model on a multicenter data set, covering a wide spectrum of hemoglobinopathies based on routine complete blood count (CBC) testing. METHODS Hemoglobinopathy test results from 10 322 adults were extracted retrospectively from 8 Dutch laboratories. eXtreme Gradient Boosting (XGB) and logistic regression models were developed to differentiate negative from positive hemoglobinopathy cases, using 7 routine CBC parameters. External validation was conducted on a data set from an independent Dutch laboratory, with an additional external validation on a Spanish data set (n = 2629) specifically for differentiating thalassemia from iron deficiency anemia (IDA). RESULTS The XGB and logistic regression models achieved an area under the receiver operating characteristic (AUROC) of 0.88 and 0.84, respectively, in distinguishing negative from positive hemoglobinopathy cases in the independent external validation set. Subclass analysis showed that the XGB model reached an AUROC of 0.97 for β-thalassemia, 0.98 for α0-thalassemia, 0.95 for homozygous α+-thalassemia, 0.78 for heterozygous α+-thalassemia, and 0.94 for the structural hemoglobin variants Hemoglobin C, Hemoglobin D, Hemoglobin E. Both models attained AUROCs of 0.95 in differentiating IDA from thalassemia. CONCLUSIONS Both the XGB and logistic regression model demonstrate high accuracy in predicting a broad range of hemoglobinopathies and are effective in differentiating hemoglobinopathies from IDA. Integration of these models into the laboratory information system facilitates automated hemoglobinopathy detection using routine CBC parameters.
Affiliation(s)
- Anoeska Schipper
  - Laboratory of Clinical Chemistry and Hematology, Jeroen Bosch Hospital, 's-Hertogenbosch, the Netherlands
  - Diagnostic Image Analysis Group, Radboudumc, Nijmegen, the Netherlands
- Matthieu Rutten
  - Diagnostic Image Analysis Group, Radboudumc, Nijmegen, the Netherlands
  - Department of Radiology, Jeroen Bosch Hospital, 's-Hertogenbosch, the Netherlands
- Adriaan van Gammeren
  - Laboratory of Clinical Chemistry and Laboratory Medicine, Amphia Hospital, Breda, the Netherlands
- Cornelis L Harteveld
  - Department of Clinical Genetics, Laboratory for Genome Diagnostics, Leiden University Medical Center, Leiden, the Netherlands
- Eloísa Urrechaga
  - Laboratory of Hematology, Hospital Universitario Galdakao Usansolo, Galdakao, Spain
- Floor Weerkamp
  - Laboratory of Clinical Chemistry, Maasstad Hospital, Rotterdam, the Netherlands
- Gijs den Besten
  - Laboratory of Clinical Chemistry and Laboratory Medicine, Isala Hospital, Zwolle, the Netherlands
- Johannes Krabbe
  - Laboratory of Clinical Chemistry and Hematology, Medisch Spectrum Twente/Medlon BV, Enschede, the Netherlands
- Jennichjen Slomp
  - Laboratory of Clinical Chemistry and Hematology, Medisch Spectrum Twente/Medlon BV, Enschede, the Netherlands
- Lise Schoonen
  - Laboratory of Clinical Chemistry, Maasstad Hospital, Rotterdam, the Netherlands
  - Laboratory of Clinical Chemistry and Laboratory Medicine, Canisius Wilhelmina Hospital, Nijmegen, the Netherlands
- Maarten Broeren
  - Laboratory of Clinical Chemistry and Laboratory Medicine, Máxima Medical Center, Eindhoven, the Netherlands
- Merel van Wijnen
  - Laboratory of Clinical Chemistry and Laboratory Medicine, Meander Medical Center, Amersfoort, the Netherlands
- Mirelle J A J Huijskens
  - Department of Clinical Chemistry and Haematology, Zuyderland Medical Center, Sittard/Heerlen, the Netherlands
- Tamara Koopmann
  - Department of Clinical Genetics, Laboratory for Genome Diagnostics, Leiden University Medical Center, Leiden, the Netherlands
- Bram van Ginneken
  - Diagnostic Image Analysis Group, Radboudumc, Nijmegen, the Netherlands
- Ron Kusters
  - Laboratory of Clinical Chemistry and Hematology, Jeroen Bosch Hospital, 's-Hertogenbosch, the Netherlands
  - Department of Health Technology and Services Research, Technical Medical Centre, University of Twente, Enschede, the Netherlands
- Steef Kurstjens
  - Laboratory of Clinical Chemistry and Hematology, Jeroen Bosch Hospital, 's-Hertogenbosch, the Netherlands
2
Saleem M, Aslam W, Lali MIU, Rauf HT, Nasr EA. Predicting Thalassemia Using Feature Selection Techniques: A Comparative Analysis. Diagnostics (Basel) 2023; 13:3441. [PMID: 37998577 PMCID: PMC10670018 DOI: 10.3390/diagnostics13223441] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2023] [Revised: 10/25/2023] [Accepted: 11/06/2023] [Indexed: 11/25/2023] Open
Abstract
Thalassemia represents one of the most common genetic disorders worldwide, characterized by defects in hemoglobin synthesis. The affected individuals suffer from malfunctioning of one or more of the four globin genes, leading to chronic hemolytic anemia, an imbalance in the hemoglobin chain ratio, iron overload, and ineffective erythropoiesis. Despite the challenges posed by this condition, recent years have witnessed significant advancements in diagnosis, therapy, and transfusion support, significantly improving the prognosis for thalassemia patients. This research empirically evaluates the efficacy of models constructed using classification methods and explores the effectiveness of relevant features that are derived using various machine-learning techniques. Five feature selection approaches, namely Chi-Square (χ2), Exploratory Factor Score (EFS), tree-based Recursive Feature Elimination (RFE), gradient-based RFE, and Linear Regression Coefficient, were employed to determine the optimal feature set. Nine classifiers, namely K-Nearest Neighbors (KNN), Decision Trees (DT), Gradient Boosting Classifier (GBC), Linear Regression (LR), AdaBoost, Extreme Gradient Boosting (XGB), Random Forest (RF), Light Gradient Boosting Machine (LGBM), and Support Vector Machine (SVM), were utilized to evaluate the performance. The χ2 method achieved accuracy, registering 91.56% precision, 91.04% recall, and 92.65% f-score when aligned with the LR classifier. Moreover, the results underscore that amalgamating over-sampling with Synthetic Minority Over-sampling Technique (SMOTE), RFE, and 10-fold cross-validation markedly elevates the detection accuracy for αT patients. Notably, the Gradient Boosting Classifier (GBC) achieves 93.46% accuracy, 93.89% recall, and 92.72% F1 score.
Affiliation(s)
- Muniba Saleem
  - Department of Computer Science & Information Technology, The Government Sadiq College Women University Bahawalpur, Bahawalpur 63100, Pakistan
- Waqar Aslam
  - Department of Information Security, The Islamia University of Bahawalpur, Bahawalpur 63100, Pakistan
- Hafiz Tayyab Rauf
  - Centre for Smart Systems, AI and Cybersecurity, Staffordshire University, Stoke-on-Trent ST4 2DE, UK
- Emad Abouel Nasr
  - Industrial Engineering Department, College of Engineering, King Saud University, Riyadh 11421, Saudi Arabia
3
Ferih K, Elsayed B, Elshoeibi AM, Elsabagh AA, Elhadary M, Soliman A, Abdalgayoom M, Yassin M. Applications of Artificial Intelligence in Thalassemia: A Comprehensive Review. Diagnostics (Basel) 2023; 13:1551. [PMID: 37174943 PMCID: PMC10177591 DOI: 10.3390/diagnostics13091551] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 01/01/2023] [Revised: 04/18/2023] [Accepted: 04/21/2023] [Indexed: 05/15/2023] Open
Abstract
Thalassemia is an autosomal recessive genetic disorder that affects the beta or alpha subunits of the hemoglobin structure. Thalassemia is classified as a hypochromic microcytic anemia and a definitive diagnosis of thalassemia is made by genetic testing of the alpha and beta genes. Thalassemia carries similar features to the other diseases that lead to microcytic hypochromic anemia, particularly iron deficiency anemia (IDA). Therefore, distinguishing between thalassemia and other causes of microcytic anemia is important to help in the treatment of the patients. Different indices and algorithms are used based on the complete blood count (CBC) parameters to diagnose thalassemia. In this article, we review how effective artificial intelligence is in aiding in the diagnosis and classification of thalassemia.
Affiliation(s)
- Khaled Ferih
  - College of Medicine, QU Health, Qatar University, Doha P.O. Box 2713, Qatar
- Basel Elsayed
  - College of Medicine, QU Health, Qatar University, Doha P.O. Box 2713, Qatar
- Amgad M Elshoeibi
  - College of Medicine, QU Health, Qatar University, Doha P.O. Box 2713, Qatar
- Ahmed A Elsabagh
  - College of Medicine, QU Health, Qatar University, Doha P.O. Box 2713, Qatar
- Mohamed Elhadary
  - College of Medicine, QU Health, Qatar University, Doha P.O. Box 2713, Qatar
- Ashraf Soliman
  - Hematology Section, Pediatrics Department, Hamad Medical Corporation (HMC), Doha P.O. Box 3050, Qatar
- Mohammed Abdalgayoom
  - Hematology Section, Medical Oncology, National Center for Cancer Care and Research (NCCCR), Hamad Medical Corporation (HMC), Doha P.O. Box 3050, Qatar
- Mohamed Yassin
  - College of Medicine, QU Health, Qatar University, Doha P.O. Box 2713, Qatar
  - Hematology Section, Medical Oncology, National Center for Cancer Care and Research (NCCCR), Hamad Medical Corporation (HMC), Doha P.O. Box 3050, Qatar
4
A New Artificial Intelligence Approach Using Extreme Learning Machine as the Potentially Effective Model to Predict and Analyze the Diagnosis of Anemia. Healthcare (Basel) 2023; 11:697. [PMID: 36900702 PMCID: PMC10000789 DOI: 10.3390/healthcare11050697] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 01/18/2023] [Revised: 02/09/2023] [Accepted: 02/16/2023] [Indexed: 03/02/2023] Open
Abstract
The procedure to diagnose anemia is time-consuming and resource-intensive due to the existence of a multitude of symptoms that can be felt physically or seen visually. Anemia also has several forms, which can be distinguished based on several characteristics. It is possible to diagnose anemia through a quick, affordable, and easily accessible laboratory test known as the complete blood count (CBC), but the method cannot directly identify different kinds of anemia. Therefore, further tests are required to establish a gold standard for the type of anemia in a patient. These tests are uncommon in settings that offer healthcare on a smaller scale because they require expensive equipment. Moreover, it is also difficult to discern between beta thalassemia trait (BTT), iron deficiency anemia (IDA), hemoglobin E (HbE), and combination anemias despite the presence of multiple red blood cell (RBC) formulas and indices with differing optimal cutoff values. This is due to the existence of several varieties of anemia in individuals, making it difficult to distinguish between BTT, IDA, HbE, and combinations. Therefore, a more precise and automated prediction model is proposed to distinguish these four types to accelerate the identification process for doctors. Historical data were retrieved from the Laboratory of the Department of Clinical Pathology and Laboratory Medicine, Faculty of Medicine, Public Health, and Nursing, Universitas Gadjah Mada, Yogyakarta, Indonesia for this purpose. Furthermore, the model was developed using the algorithm for the extreme learning machine (ELM). This was followed by the measurement of the performance using the confusion matrix and 190 data representing the four classes, and the results showed 99.21% accuracy, 98.44% sensitivity, 99.30% precision, and an F1 score of 98.84%.
5
Novel Decision Tool for More Severe α-Thalassemia Genotypes Screening with Functional Loss of Two or More α-Globin Genes: A Diagnostic Test Study. Diagnostics (Basel) 2022; 12:3008. [PMID: 36553015 PMCID: PMC9777031 DOI: 10.3390/diagnostics12123008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2022] [Revised: 11/16/2022] [Accepted: 11/22/2022] [Indexed: 12/04/2022] Open
Abstract
After the exclusion of iron deficiency and β-thalassemia, molecular research for α-thalassemia is recommended to investigate microcytic anemia. Aiming to suggest more efficiently the molecular analysis for individuals with a greater chance of having a symptomatic form of the disease, we have developed and validated a new decision tool to predict the presence of two or more deletions of α-thalassemia, increasing considerably the pre-test probability. The model was created using the variables: the percentage of HbA2, serum ferritin and mean corpuscular volume standardized by age. The model was trained in 134 patients and validated in 160 randomly selected patients from the total sample. We used Youden's index applied to the ROC curve methodology to establish the optimal odds ratio (OR) cut-off for the presence of two or more α-globin gene deletions. Using the OR cut-off of 0.4, the model's negative predictive value (NPV) was 96.8%; the cut-off point accuracy was 85.4%; and the molecular analysis pre-test probability increased from 25.9% to 65.4% after the use of the proposed model. This tool aims to assist the physician in deciding when to perform molecular studies for the diagnosis of α-thalassemia. The model is useful in places with few financial health resources.
6
Chen YC, Hsu KN, Lai JCY, Chen LY, Kuo MS, Liao CC, Hsu K. Influence of hemoglobin on blood pressure among people with GP.Mur blood type. J Formos Med Assoc 2022; 121:1721-1727. [PMID: 35000824 DOI: 10.1016/j.jfma.2021.12.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2021] [Revised: 12/06/2021] [Accepted: 12/16/2021] [Indexed: 10/19/2022] Open
Abstract
BACKGROUND/PURPOSE GP.Mur is a clinically important red blood cell (RBC) type. GP.Mur and band 3 interact on the RBCs. We previously observed that healthy adults with GP.Mur type present slightly higher blood pressure (BP). Because band 3 and Hb comodulate nitric oxide (NO)-dependent vasodilation and hemoglobin (Hb) is positively associated with BP, we aimed to test whether these could contribute to higher BP in GP.Mur+ people. METHODS We recruited 989 non-elderly adults (21% GP.Mur) free of catastrophic illness and not on cardiovascular or anti-hypertensive medication. Their body indices, blood lab data and lifestyle data were collected for analyses of potential BP-related factors (BMI, age, smoking, Hb, and GP.Mur). RESULTS BMI and age remained the most significant contributors to BP. GP.Mur slightly increased systolic BP (SBP). The direct correlation between Hb and BP was only found in Taiwanese non-anemic men, not women. After age and BMI adjusted, we estimated an increase of 1.8 mmHg and 2.6 mmHg of SBP by 1 g/dL Hb among men without and with GP.Mur type, respectively. Hb was generally lower among people expressing GP.Mur, which likely limited their larger impact on BP. CONCLUSION GP.Mur contributed to BP in both Hb-dependent and Hb-independent fashion. A pronounced impact of hemoglobin on BP likely requires sufficient Hb, as GP.Mur increased the sensitivity of SBP to Hb only in non-anemic Taiwanese men, and not in Taiwanese women or anemic men. The mechanism through which GP.Mur affected BP independent of Hb is unknown.
Affiliation(s)
- Yung-Chih Chen
  - Division of Cardiology, Department of Internal Medicine, Taitung MacKay Memorial Hospital, Taitung, Taiwan
- Kuang-Nan Hsu
  - Department of Neurology, Taitung MacKay Memorial Hospital, Taitung, Taiwan
- Jerry Cheng-Yen Lai
  - Department of Medical Research, Taitung MacKay Memorial Hospital, Taitung, Taiwan
- Li-Yang Chen
  - The Laboratory of Immunogenetics, Department of Medical Research, MacKay Memorial Hospital, Tamsui, New Taipei City, Taiwan
- Mei-Shin Kuo
  - The Department of Laboratory Medicine, Taitung MacKay Memorial Hospital, Taitung, Taiwan
- Chiu-Chu Liao
  - The Department of Laboratory Medicine, Taitung MacKay Memorial Hospital, Taitung, Taiwan
- Kate Hsu
  - The Laboratory of Immunogenetics, Department of Medical Research, MacKay Memorial Hospital, Tamsui, New Taipei City, Taiwan
  - MacKay Junior College of Medicine, Nursing, and Management, New Taipei City, Taiwan
  - Institute of Biomedical Sciences, MacKay Medical College, New Taipei City, Taiwan