1
Pruski M. What does it mean for a clinical AI to be just: conflicts between local fairness and being fit-for-purpose? J Med Ethics 2024:jme-2023-109675. [PMID: 38423759 DOI: 10.1136/jme-2023-109675] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2023] [Accepted: 02/15/2024] [Indexed: 03/02/2024]
Abstract
There have been repeated calls to ensure that clinical artificial intelligence (AI) is not discriminatory, that is, it provides its intended benefit to all members of society irrespective of the status of any protected characteristics of individuals in whose healthcare the AI might participate. There have also been repeated calls to ensure that any clinical AI is tailored to the local population in which it is being used to ensure that it is fit-for-purpose. Yet, there might be a clash between these two calls since tailoring an AI to a local population might reduce its effectiveness when the AI is used in the care of individuals who have characteristics which are not represented in the local population. Here, I explore the bioethical concept of local fairness as applied to clinical AI. I first introduce the discussion concerning fairness and inequalities in healthcare and how this problem has continued in attempts to develop AI-enhanced healthcare. I then discuss various technical aspects which might affect the implementation of local fairness. Next, I introduce some rule of law considerations into the discussion to contextualise the issue better by drawing key parallels. I then discuss some potential technical solutions which have been proposed to address the issue of local fairness. Finally, I outline which solutions I consider most likely to contribute to a fit-for-purpose and fair AI.
Affiliation(s)
- Michal Pruski
  - Department of Medical Physics and Clinical Engineering, Cardiff and Vale UHB, Cardiff, UK
  - School of Health Sciences, The University of Manchester, Manchester, UK
2
Kagerbauer SM, Ulm B, Podtschaske AH, Andonov DI, Blobner M, Jungwirth B, Graessner M. Susceptibility of AutoML mortality prediction algorithms to model drift caused by the COVID pandemic. BMC Med Inform Decis Mak 2024; 24:34. [PMID: 38308256 PMCID: PMC10837894 DOI: 10.1186/s12911-024-02428-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Accepted: 01/16/2024] [Indexed: 02/04/2024] [Open Access]
Abstract
BACKGROUND Concept drift and covariate shift lead to a degradation of machine learning (ML) models. The objective of our study was to characterize sudden data drift as caused by the COVID pandemic. Furthermore, we investigated the suitability of certain methods in model training to prevent model degradation caused by data drift. METHODS We trained different ML models with the H2O AutoML method on a dataset comprising 102,666 cases of surgical patients collected in the years 2014-2019 to predict postoperative mortality using preoperatively available data. Models applied were Generalized Linear Model with regularization, Default Random Forest, Gradient Boosting Machine, eXtreme Gradient Boosting, Deep Learning and Stacked Ensembles comprising all base models. Further, we modified the original models by applying three different methods when training on the original pre-pandemic dataset: (1) we weighted older data weaker, (2) used only the most recent data for model training and (3) performed a z-transformation of the numerical input parameters. Afterwards, we tested model performance on a pre-pandemic and an in-pandemic data set not used in the training process, and analysed common features. RESULTS The models produced showed excellent areas under receiver-operating characteristic and acceptable precision-recall curves when tested on a dataset from January-March 2020, but significant degradation when tested on a dataset collected in the first wave of the COVID pandemic from April-May 2020. When comparing the probability distributions of the input parameters, significant differences between pre-pandemic and in-pandemic data were found. The endpoint of our models, in-hospital mortality after surgery, did not differ significantly between pre- and in-pandemic data and was about 1% in each case.
However, the models varied considerably in the composition of their input parameters. None of our applied modifications prevented a loss of performance, although very different models emerged from it, using a large variety of parameters. CONCLUSIONS Our results show that none of our tested easy-to-implement measures in model training can prevent deterioration in the case of sudden external events. Therefore, we conclude that, in the presence of concept drift and covariate shift, close monitoring and critical review of model predictions are necessary.
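Two of the training modifications named in this abstract, down-weighting older cases and z-transforming numerical inputs, can be illustrated with a short sketch. This is not the authors' code: the abstract does not say how the weights were computed or which statistics the z-transformation used, so the exponential half-life scheme, the use of training-set statistics, and the function names `recency_weights` and `z_transform` are all assumptions made for illustration.

```python
import numpy as np

def z_transform(train, new):
    """Standardise the columns of `new` using the mean and standard
    deviation of `train` (assumption: training-set statistics are reused)."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # constant columns: avoid division by zero
    return (new - mean) / std

def recency_weights(years, half_life=2.0):
    """Hypothetical weighting scheme: a case loses half its weight for every
    `half_life` years of age; cases from the newest year get weight 1.0."""
    years = np.asarray(years, dtype=float)
    age = years.max() - years
    return 0.5 ** (age / half_life)

# Toy data: two numerical features, three cases from different years.
train = np.array([[1.0, 5.0], [3.0, 5.0]])
print(z_transform(train, train))
print(recency_weights([2014, 2016, 2018]))
```

Weights of this kind would typically be passed to a learner as per-sample weights during training; whether the study did so in this form is not stated in the abstract.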
Affiliation(s)
- Simone Maria Kagerbauer
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, Technical University of Munich, Munich, Germany
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, University of Ulm, Albert-Einstein-Allee 23, Ulm, 89081, Germany
- Bernhard Ulm
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, Technical University of Munich, Munich, Germany
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, University of Ulm, Albert-Einstein-Allee 23, Ulm, 89081, Germany
- Armin Horst Podtschaske
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, Technical University of Munich, Munich, Germany
- Dimislav Ivanov Andonov
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, Technical University of Munich, Munich, Germany
- Manfred Blobner
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, Technical University of Munich, Munich, Germany
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, University of Ulm, Albert-Einstein-Allee 23, Ulm, 89081, Germany
- Bettina Jungwirth
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, University of Ulm, Albert-Einstein-Allee 23, Ulm, 89081, Germany
- Martin Graessner
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, Technical University of Munich, Munich, Germany
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, University of Ulm, Albert-Einstein-Allee 23, Ulm, 89081, Germany
3
Hussain W, Mabrok M, Gao H, Rabhi FA, Rashed EA. Revolutionising healthcare with artificial intelligence: A bibliometric analysis of 40 years of progress in health systems. Digit Health 2024; 10:20552076241258757. [PMID: 38817839 PMCID: PMC11138196 DOI: 10.1177/20552076241258757] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Accepted: 05/14/2024] [Indexed: 06/01/2024] [Open Access]
Abstract
The development of artificial intelligence (AI) has revolutionised the medical system, empowering healthcare professionals to analyse complex nonlinear big data and identify hidden patterns, facilitating well-informed decisions. Over the last decade, there has been a notable trend of research in AI, machine learning (ML), and their associated algorithms in health and medical systems. These approaches have transformed the healthcare system, enhancing efficiency, accuracy, personalised treatment, and decision-making. Recognising the importance and growing trend of research in the topic area, this paper presents a bibliometric analysis of AI in health and medical systems. The paper utilises the Web of Science (WoS) Core Collection database, considering documents published in the topic area for the last four decades. A total of 64,063 papers were identified from 1983 to 2022. The paper evaluates the bibliometric data from various perspectives, such as annual papers published, annual citations, highly cited papers, and most productive institutions, and countries. The paper visualises the relationship among various scientific actors by presenting bibliographic coupling and co-occurrences of the author's keywords. The analysis indicates that the field began its significant growth in the late 1970s and early 1980s, with significant growth since 2019. The most influential institutions are in the USA and China. The study also reveals that the scientific community's top keywords include 'ML', 'Deep Learning', and 'Artificial Intelligence'.
Affiliation(s)
- Walayat Hussain
  - Peter Faber Business School, Australian Catholic University, North Sydney, Australia
- Mohamed Mabrok
  - Department of Mathematics and Statistics, Qatar University, Doha, Qatar
- Honghao Gao
  - School of Computer Engineering and Science, Shanghai University, Shanghai, China
- Fethi A. Rabhi
  - School of Computer Science and Engineering, University of New South Wales (UNSW), Sydney, Australia
- Essam A. Rashed
  - Graduate School of Information Science, University of Hyogo, Kobe, Japan
4
Tozzi AE, Croci I, Voicu P, Dotta F, Colafati GS, Carai A, Fabozzi F, Lacanna G, Premuselli R, Mastronuzzi A. A systematic review of data sources for artificial intelligence applications in pediatric brain tumors in Europe: implications for bias and generalizability. Front Oncol 2023; 13:1285775. [PMID: 38016063 PMCID: PMC10646175 DOI: 10.3389/fonc.2023.1285775] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Accepted: 10/16/2023] [Indexed: 11/30/2023] [Open Access]
Abstract
Introduction Europe works to improve cancer management through the use of artificial intelligence (AI), and there is a need to accelerate the development of AI applications for childhood cancer. However, the current strategies used for algorithm development in childhood cancer may have bias and limited generalizability. This study reviewed existing publications on AI tools for pediatric brain tumors, Europe's most common type of childhood solid tumor, to examine the data sources for developing AI tools. Methods We performed a bibliometric analysis of the publications on AI tools for pediatric brain tumors, and we examined the type of data used, data sources, and geographic location of cohorts to evaluate the generalizability of the algorithms. Results We screened 10503 publications, and we selected 45. A total of 34/45 publications developing AI tools focused on glial tumors, while 35/45 used MRI as a source of information to predict the classification and prognosis. The median number of patients for algorithm development was 89 for single-center studies and 120 for multicenter studies. A total of 17/45 publications used pediatric datasets from the UK. Discussion Since the development of AI tools for pediatric brain tumors is still in its infancy, there is a need to support data exchange and collaboration between centers to increase the number of patients used for algorithm training and improve their generalizability. To this end, there is a need for increased data exchange and collaboration between centers and to explore the applicability of decentralized privacy-preserving technologies consistent with the General Data Protection Regulation (GDPR). This is particularly important in light of using the European Health Data Space and international collaborations.
Affiliation(s)
- Alberto Eugenio Tozzi
  - Predictive and Preventive Medicine Research Unit, Bambino Gesù Children’s Hospital, IRCCS, Rome, Italy
- Ileana Croci
  - Predictive and Preventive Medicine Research Unit, Bambino Gesù Children’s Hospital, IRCCS, Rome, Italy
- Paul Voicu
  - Department of Neuroscience and Imaging, “SS Annunziata” Hospital, “G. D’Annunzio” University, Chieti, Italy
- Francesco Dotta
  - Imaging Department, Bambino Gesù Children’s Hospital, IRCCS, Rome, Italy
- Andrea Carai
  - Department of Neurosciences, Bambino Gesù Children’s Hospital, IRCCS, Rome, Italy
- Francesco Fabozzi
  - Department of Hematology/Oncology, Cell and Gene Therapy, Bambino Gesù Children’s Hospital, IRCCS, Rome, Italy
- Giuseppe Lacanna
  - Predictive and Preventive Medicine Research Unit, Bambino Gesù Children’s Hospital, IRCCS, Rome, Italy
- Roberto Premuselli
  - Department of Hematology/Oncology, Cell and Gene Therapy, Bambino Gesù Children’s Hospital, IRCCS, Rome, Italy
- Angela Mastronuzzi
  - Department of Hematology/Oncology, Cell and Gene Therapy, Bambino Gesù Children’s Hospital, IRCCS, Rome, Italy
5
van Breugel M, Fehrmann RSN, Bügel M, Rezwan FI, Holloway JW, Nawijn MC, Fontanella S, Custovic A, Koppelman GH. Current state and prospects of artificial intelligence in allergy. Allergy 2023; 78:2623-2643. [PMID: 37584170 DOI: 10.1111/all.15849] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Revised: 07/08/2023] [Accepted: 07/31/2023] [Indexed: 08/17/2023]
Abstract
The field of medicine is witnessing an exponential growth of interest in artificial intelligence (AI), which enables new research questions and the analysis of larger and new types of data. Nevertheless, applications that go beyond proof of concepts and deliver clinical value remain rare, especially in the field of allergy. This narrative review provides a fundamental understanding of the core concepts of AI and critically discusses its limitations and open challenges, such as data availability and bias, along with potential directions to surmount them. We provide a conceptual framework to structure AI applications within this field and discuss forefront case examples. Most of these applications of AI and machine learning in allergy concern supervised learning and unsupervised clustering, with a strong emphasis on diagnosis and subtyping. A perspective is shared on guidelines for good AI practice to guide readers in applying it effectively and safely, along with prospects of field advancement and initiatives to increase clinical impact. We anticipate that AI can further deepen our knowledge of disease mechanisms and contribute to precision medicine in allergy.
Affiliation(s)
- Merlijn van Breugel
  - Department of Pediatric Pulmonology and Pediatric Allergology, Beatrix Children's Hospital, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
  - Groningen Research Institute for Asthma and COPD (GRIAC), University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
  - MIcompany, Amsterdam, the Netherlands
- Rudolf S N Fehrmann
  - Department of Medical Oncology, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
- Faisal I Rezwan
  - Human Development and Health, Faculty of Medicine, University of Southampton, Southampton, UK
  - Department of Computer Science, Aberystwyth University, Aberystwyth, UK
- John W Holloway
  - Human Development and Health, Faculty of Medicine, University of Southampton, Southampton, UK
  - National Institute for Health and Care Research Southampton Biomedical Research Centre, University Hospitals Southampton NHS Foundation Trust, Southampton, UK
- Martijn C Nawijn
  - Groningen Research Institute for Asthma and COPD (GRIAC), University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
  - Department of Pathology and Medical Biology, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
- Sara Fontanella
  - National Heart and Lung Institute, Imperial College London, London, UK
  - National Institute for Health and Care Research Imperial Biomedical Research Centre (BRC), London, UK
- Adnan Custovic
  - National Heart and Lung Institute, Imperial College London, London, UK
  - National Institute for Health and Care Research Imperial Biomedical Research Centre (BRC), London, UK
- Gerard H Koppelman
  - Department of Pediatric Pulmonology and Pediatric Allergology, Beatrix Children's Hospital, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
  - Groningen Research Institute for Asthma and COPD (GRIAC), University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
6
Rahmani K, Thapa R, Tsou P, Casie Chetty S, Barnes G, Lam C, Foon Tso C. Assessing the effects of data drift on the performance of machine learning models used in clinical sepsis prediction. Int J Med Inform 2023; 173:104930. [PMID: 36893656 DOI: 10.1016/j.ijmedinf.2022.104930] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 05/20/2022] [Revised: 10/30/2022] [Accepted: 11/15/2022] [Indexed: 11/21/2022]
Abstract
BACKGROUND Data drift can negatively impact the performance of machine learning algorithms (MLAs) that were trained on historical data. As such, MLAs should be continuously monitored and tuned to overcome the systematic changes that occur in the distribution of data. In this paper, we study the extent of data drift and provide insights about its characteristics for sepsis onset prediction. This study will help elucidate the nature of data drift for prediction of sepsis and similar diseases. This may aid with the development of more effective patient monitoring systems that can stratify risk for dynamic disease states in hospitals. METHODS We devise a series of simulations that measure the effects of data drift in patients with sepsis, using electronic health records (EHR). We simulate multiple scenarios in which data drift may occur, namely the change in the distribution of the predictor variables (covariate shift), the change in the statistical relationship between the predictors and the target (concept shift), and the occurrence of a major healthcare event (major event) such as the COVID-19 pandemic. We measure the impact of data drift on model performances, identify the circumstances that necessitate model retraining, and compare the effects of different retraining methodologies and model architecture on the outcomes. We present the results for two different MLAs, eXtreme Gradient Boosting (XGB) and Recurrent Neural Network (RNN). RESULTS Our results show that the properly retrained XGB models outperform the baseline models in all simulation scenarios, hence signifying the existence of data drift. In the major event scenario, the area under the receiver operating characteristic curve (AUROC) at the end of the simulation period is 0.811 for the baseline XGB model and 0.868 for the retrained XGB model. In the covariate shift scenario, the AUROC at the end of the simulation period for the baseline and retrained XGB models is 0.853 and 0.874 respectively. 
In the concept shift scenario and under the mixed labeling method, the retrained XGB models perform worse than the baseline model for most simulation steps. However, under the full relabeling method, the AUROC at the end of the simulation period for the baseline and retrained XGB models is 0.852 and 0.877 respectively. The results for the RNN models were mixed, suggesting that retraining based on a fixed network architecture may be inadequate for an RNN. We also present the results in the form of other performance metrics such as the ratio of observed to expected probabilities (calibration) and the normalized rate of positive predictive values (PPV) by prevalence, referred to as lift, at a sensitivity of 0.8. CONCLUSION Our simulations reveal that retraining periods of a couple of months or using several thousand patients are likely to be adequate to monitor machine learning models that predict sepsis. This indicates that a machine learning system for sepsis prediction will probably need less infrastructure for performance monitoring and retraining compared to other applications in which data drift is more frequent and continuous. Our results also show that in the event of a concept shift, a full overhaul of the sepsis prediction model may be necessary because it indicates a discrete change in the definition of sepsis labels, and mixing the labels for the sake of incremental training may not produce the desired results.
Affiliation(s)
- Keyvan Rahmani
  - Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, TX 77080-2059, USA
- Rahul Thapa
  - Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, TX 77080-2059, USA
- Peiling Tsou
  - Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, TX 77080-2059, USA
- Satish Casie Chetty
  - Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, TX 77080-2059, USA
- Gina Barnes
  - Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, TX 77080-2059, USA
- Carson Lam
  - Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, TX 77080-2059, USA
- Chak Foon Tso
  - Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, TX 77080-2059, USA
7
Andonov DI, Ulm B, Graessner M, Podtschaske A, Blobner M, Jungwirth B, Kagerbauer SM. Impact of the Covid-19 pandemic on the performance of machine learning algorithms for predicting perioperative mortality. BMC Med Inform Decis Mak 2023; 23:67. [PMID: 37046259 PMCID: PMC10092913 DOI: 10.1186/s12911-023-02151-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/06/2022] [Accepted: 03/15/2023] [Indexed: 04/14/2023] [Open Access]
Abstract
BACKGROUND Machine-learning models are susceptible to external influences which can result in performance deterioration. The aim of our study was to elucidate the impact of a sudden shift in covariates, like the one caused by the Covid-19 pandemic, on model performance. METHODS After ethical approval and registration in Clinical Trials (NCT04092933, initial release 17/09/2019), we developed different models for the prediction of perioperative mortality based on preoperative data: one for the pre-pandemic data period until March 2020, one including data before the pandemic and from the first wave until May 2020, and one that covers the complete period before and during the pandemic until October 2021. We applied XGBoost as well as a Deep Learning neural network (DL). Performance metrics of each model during the different pandemic phases were determined, and XGBoost models were analysed for changes in feature importance. RESULTS XGBoost and DL provided similar performance on the pre-pandemic data with respect to area under receiver operating characteristic (AUROC, 0.951 vs. 0.942) and area under precision-recall curve (AUPR, 0.144 vs. 0.187). Validation in patient cohorts of the different pandemic waves showed high fluctuations in performance from both AUROC and AUPR for DL, whereas the XGBoost models seemed more stable. Change in variable frequencies with onset of the pandemic were visible in age, ASA score, and the higher proportion of emergency operations, among others. Age consistently showed the highest information gain. Models based on pre-pandemic data performed worse during the first pandemic wave (AUROC 0.914 for XGBoost and DL) whereas models augmented with data from the first wave lacked performance after the first wave (AUROC 0.907 for XGBoost and 0.747 for DL). The deterioration was also visible in AUPR, which worsened by over 50% in both XGBoost and DL in the first phase after re-training. CONCLUSIONS A sudden shift in data impacts model performance. 
Re-training the model with updated data may cause degradation in predictive accuracy if the changes are only transient. Too early re-training should therefore be avoided, and close model surveillance is necessary.
Affiliation(s)
- D I Andonov
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, Technical University of Munich, Munich, Germany
- B Ulm
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, Technical University of Munich, Munich, Germany
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, University Hospital Ulm, University of Ulm, Albert-Einstein-Allee 23, Ulm, 89081, Germany
- M Graessner
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, University Hospital Ulm, University of Ulm, Albert-Einstein-Allee 23, Ulm, 89081, Germany
- A Podtschaske
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, Technical University of Munich, Munich, Germany
- M Blobner
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, Technical University of Munich, Munich, Germany
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, University Hospital Ulm, University of Ulm, Albert-Einstein-Allee 23, Ulm, 89081, Germany
- B Jungwirth
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, University Hospital Ulm, University of Ulm, Albert-Einstein-Allee 23, Ulm, 89081, Germany
- S M Kagerbauer
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, Technical University of Munich, Munich, Germany
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, University Hospital Ulm, University of Ulm, Albert-Einstein-Allee 23, Ulm, 89081, Germany
8
Mo D, Zheng Q, Xiao B, Li L. Predicting thalassemia using deep neural network based on red blood cell indices. Clin Chim Acta 2023; 543:117329. [PMID: 37019327 DOI: 10.1016/j.cca.2023.117329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2023] [Revised: 03/11/2023] [Accepted: 03/30/2023] [Indexed: 04/05/2023]
Abstract
BACKGROUND AND OBJECTIVE The traditional statistical screening method for thalassemia based on red blood cell (RBC) indices is being replaced by machine learning. Here, we developed deep neural networks (DNNs) that outperformed the traditional method for predicting thalassemia. METHOD Using a dataset of 8693 records comprising genetic tests and other 11 features we constructed 11 DNN models and 4 traditional statistical models and then compared their performances and analysed feature importance for interpreting DNN models. RESULTS The area under the receiver operating characteristic curve, accuracy, Youden's index, F1 score, sensitivity, specificity, positive predictive value and negative predictive value, were 0.960, 0.897, 0.794, 0.897, 0.883, 0.911, 0.914, and 0.882, respectively, for our best model, and compared with the traditional statistical model based on the mean corpuscular volume, these values were increased by 10.22%, 10.09%, 26.55%, 8.92%, 4.13%, 16.90%, 13.86% and 6.07%, respectively, and by 15.38%, 11.70%, 31.70%, 9.89%, 3.05%, 22.13%, 17.11% and 5.94%, respectively, for the mean cellular haemoglobin model. The DNN model performance will reduce without age, RBC distribution width (RDW), sex, or both WBC and PLT. CONCLUSIONS Our DNN model outperformed the current screening model. In 8 features, RDW and age were the most useful, followed by sex and the combination of WBC and PLT, the remaining nearly useless.
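Of the metrics listed in this abstract, Youden's index follows directly from two of the others, which allows a quick consistency check. The sketch below uses the standard definition (sensitivity + specificity - 1) and the values reported for the best model; the function name `youden_index` is illustrative and not taken from the paper.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0

# Values reported in the abstract for the best DNN model.
j = youden_index(sensitivity=0.883, specificity=0.911)
print(round(j, 3))  # 0.794, matching the reported Youden's index
```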
Affiliation(s)
- Donghua Mo
  - The First School of Clinical Medicine, Southern Medical University, Guangzhou, China; Clinical Laboratory Medicine Department, The Second Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Qian Zheng
  - Department of Cardiovascular, The First Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
- Bin Xiao
  - Department of Laboratory Medicine, The Sixth Affiliated Hospital of Guangzhou Medical University, Qingyuan People's Hospital, 511518 Qingyuan, China
- Linhai Li
  - The First School of Clinical Medicine, Southern Medical University, Guangzhou, China; Department of Laboratory Medicine, The Sixth Affiliated Hospital of Guangzhou Medical University, Qingyuan People's Hospital, 511518 Qingyuan, China
9
van Klaveren D, Zanos TP, Nelson J, Levy TJ, Park JG, Retel Helmrich IRA, Rietjens JAC, Basile MJ, Hajizadeh N, Lingsma HF, Kent DM. Prognostic models for COVID-19 needed updating to warrant transportability over time and space. BMC Med 2022; 20:456. [PMID: 36424619 PMCID: PMC9686462 DOI: 10.1186/s12916-022-02651-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2022] [Accepted: 11/04/2022] [Indexed: 11/25/2022] [Open Access]
Abstract
BACKGROUND Supporting decisions for patients who present to the emergency department (ED) with COVID-19 requires accurate prognostication. We aimed to evaluate prognostic models for predicting outcomes in hospitalized patients with COVID-19, in different locations and across time. METHODS We included patients who presented to the ED with suspected COVID-19 and were admitted to 12 hospitals in the New York City (NYC) area and 4 large Dutch hospitals. We used second-wave patients who presented between September and December 2020 (2137 and 3252 in NYC and the Netherlands, respectively) to evaluate models that were developed on first-wave patients who presented between March and August 2020 (12,163 and 5831). We evaluated two prognostic models for in-hospital death: The Northwell COVID-19 Survival (NOCOS) model was developed on NYC data and the COVID Outcome Prediction in the Emergency Department (COPE) model was developed on Dutch data. These models were validated on subsequent second-wave data at the same site (temporal validation) and at the other site (geographic validation). We assessed model performance by the Area Under the receiver operating characteristic Curve (AUC), by the E-statistic, and by net benefit. RESULTS Twenty-eight-day mortality was considerably higher in the NYC first-wave data (21.0%), compared to the second-wave (10.1%) and the Dutch data (first wave 10.8%; second wave 10.0%). COPE discriminated well at temporal validation (AUC 0.82), with excellent calibration (E-statistic 0.8%). At geographic validation, discrimination was satisfactory (AUC 0.78), but with moderate over-prediction of mortality risk, particularly in higher-risk patients (E-statistic 2.9%). While discrimination was adequate when NOCOS was tested on second-wave NYC data (AUC 0.77), NOCOS systematically overestimated the mortality risk (E-statistic 5.1%). 
Discrimination in the Dutch data was good (AUC 0.81), but with over-prediction of risk, particularly in lower-risk patients (E-statistic 4.0%). Recalibration of COPE and NOCOS led to limited net benefit improvement in Dutch data, but to substantial net benefit improvement in NYC data. CONCLUSIONS NOCOS performed moderately worse than COPE, probably reflecting unique aspects of the early pandemic in NYC. Frequent updating of prognostic models is likely to be required for transportability over time and space during a dynamic pandemic.
Collapse
Affiliation(s)
- David van Klaveren
- Department of Public Health, Erasmus MC University Medical Center Rotterdam, Dr. Molewaterplein 50, 3015 GE, Rotterdam, The Netherlands
- Predictive Analytics and Comparative Effectiveness Center, Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, USA
- Theodoros P Zanos
- Institute of Bioelectronic Medicine, Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, USA
- Jason Nelson
- Predictive Analytics and Comparative Effectiveness Center, Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, USA
- Todd J Levy
- Institute of Bioelectronic Medicine, Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, USA
- Jinny G Park
- Predictive Analytics and Comparative Effectiveness Center, Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, USA
- Isabel R A Retel Helmrich
- Department of Public Health, Erasmus MC University Medical Center Rotterdam, Dr. Molewaterplein 50, 3015 GE, Rotterdam, The Netherlands
- Judith A C Rietjens
- Department of Public Health, Erasmus MC University Medical Center Rotterdam, Dr. Molewaterplein 50, 3015 GE, Rotterdam, The Netherlands
- Melissa J Basile
- Division of Pulmonary Critical Care and Sleep Medicine, Department of Medicine, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell Health, Hempstead, NY, USA
- Negin Hajizadeh
- Division of Pulmonary Critical Care and Sleep Medicine, Department of Medicine, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell Health, Hempstead, NY, USA
- Hester F Lingsma
- Department of Public Health, Erasmus MC University Medical Center Rotterdam, Dr. Molewaterplein 50, 3015 GE, Rotterdam, The Netherlands
- David M Kent
- Predictive Analytics and Comparative Effectiveness Center, Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, USA
10
Shreve JT, Khanani SA, Haddad TC. Artificial Intelligence in Oncology: Current Capabilities, Future Opportunities, and Ethical Considerations. Am Soc Clin Oncol Educ Book 2022; 42:1-10. [PMID: 35687826 DOI: 10.1200/edbk_350652] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Indexed: 01/04/2023]
Abstract
The promise of highly personalized oncology care using artificial intelligence (AI) technologies has been forecasted since the emergence of the field. Cumulative advances across the science are bringing this promise to realization, including refinement of machine learning- and deep learning algorithms; expansion in the depth and variety of databases, including multiomics; and the decreased cost of massively parallelized computational power. Examples of successful clinical applications of AI can be found throughout the cancer continuum and in multidisciplinary practice, with computer vision-assisted image analysis in particular having several U.S. Food and Drug Administration-approved uses. Techniques with emerging clinical utility include whole blood multicancer detection from deep sequencing, virtual biopsies, natural language processing to infer health trajectories from medical notes, and advanced clinical decision support systems that combine genomics and clinomics. Substantial issues have delayed broad adoption, with data transparency and interpretability suffering from AI's "black box" mechanism, and intrinsic bias against underrepresented persons limiting the reproducibility of AI models and perpetuating health care disparities. Midfuture projections of AI maturation involve increasing a model's complexity by using multimodal data elements to better approximate an organic system. Far-future positing includes living databases that accumulate all aspects of a person's health into discrete data elements; this will fuel highly convoluted modeling that can tailor treatment selection, dose determination, surveillance modality and schedule, and more. The field of AI has had a historical dichotomy between its proponents and detractors. The successful development of recent applications, and continued investment in prospective validation that defines their impact on multilevel outcomes, has established a momentum of accelerated progress.
Affiliation(s)
- Tufia C Haddad
- Department of Oncology, Mayo Clinic, Rochester, MN
- Center for Digital Health, Mayo Clinic, Rochester, MN