1
Islam MS, Kalmady SV, Hindle A, Sandhu R, Sun W, Sepehrvand N, Greiner R, Kaul P. Diagnostic and Prognostic Electrocardiogram-Based Models for Rapid Clinical Applications. Can J Cardiol 2024:S0828-282X(24)00523-3. [PMID: 38992812 DOI: 10.1016/j.cjca.2024.07.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Revised: 07/04/2024] [Accepted: 07/05/2024] [Indexed: 07/13/2024]
Abstract
Leveraging artificial intelligence (AI) for the analysis of electrocardiograms (ECGs) has the potential to transform diagnosis and estimate the prognosis of not only cardiac but, increasingly, noncardiac conditions. In this review, we summarize clinical studies and AI-enhanced ECG-based clinical applications in the early detection, diagnosis, and prognosis estimation of cardiovascular diseases in the past 5 years (2019-2023). With advancements in deep learning and the rapidly increasing use of ECG technologies, a large number of clinical studies have been published. However, most of these studies are single-centre, retrospective, proof-of-concept studies that lack external validation. Prospective studies that progress from development toward deployment in clinical settings account for < 15% of the studies. Successful implementations of ECG-based AI applications that have received approval from the Food and Drug Administration have been developed through commercial collaborations, with approximately half of them being for mobile or wearable devices. The field is in its early stages, and overcoming several obstacles is essential, such as prospective validation in multicentre large data sets, addressing technical issues, bias, privacy, data security, model generalizability, and global scalability. This review concludes with a discussion of these challenges and potential solutions. By providing a holistic view of the state of AI in ECG analysis, this review aims to set a foundation for future research directions, emphasizing the need for comprehensive, clinically integrated, and globally deployable AI solutions in cardiovascular disease management.
Affiliation(s)
- Md Saiful Islam
  - Canadian VIGOUR Centre, University of Alberta, Edmonton, Alberta, Canada; Department of Medicine, University of Alberta, Edmonton, Alberta, Canada
- Sunil Vasu Kalmady
  - Canadian VIGOUR Centre, University of Alberta, Edmonton, Alberta, Canada; Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada
- Abram Hindle
  - Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada
- Roopinder Sandhu
  - Canadian VIGOUR Centre, University of Alberta, Edmonton, Alberta, Canada; Smidt Heart Institute, Cedars-Sinai Medical Center Hospital System, Los Angeles, California, USA
- Weijie Sun
  - Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada
- Nariman Sepehrvand
  - Canadian VIGOUR Centre, University of Alberta, Edmonton, Alberta, Canada; Department of Medicine, University of Calgary, Calgary, Alberta, Canada
- Russell Greiner
  - Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada; Alberta Machine Intelligence Institute, Edmonton, Alberta, Canada
- Padma Kaul
  - Canadian VIGOUR Centre, University of Alberta, Edmonton, Alberta, Canada; Department of Medicine, University of Alberta, Edmonton, Alberta, Canada
2
Wagner P, Mehari T, Haverkamp W, Strodthoff N. Explaining deep learning for ECG analysis: Building blocks for auditing and knowledge discovery. Comput Biol Med 2024; 176:108525. [PMID: 38749322 DOI: 10.1016/j.compbiomed.2024.108525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Revised: 04/22/2024] [Accepted: 04/25/2024] [Indexed: 05/31/2024]
Abstract
Deep neural networks have become increasingly popular for analyzing ECG data because of their ability to accurately identify cardiac conditions and hidden clinical factors. However, the lack of transparency due to the black box nature of these models is a common concern. To address this issue, explainable AI (XAI) methods can be employed. In this study, we present a comprehensive analysis of post-hoc XAI methods, investigating the glocal (aggregated local attributions over multiple samples) and global (concept based XAI) perspectives. We have established a set of sanity checks to identify saliency as the most sensible attribution method. We provide a dataset-wide analysis across entire patient subgroups, which goes beyond anecdotal evidence, to establish the first quantitative evidence for the alignment of model behavior with cardiologists' decision rules. Furthermore, we demonstrate how these XAI techniques can be utilized for knowledge discovery, such as identifying subtypes of myocardial infarction. We believe that these proposed methods can serve as building blocks for a complementary assessment of the internal validity during a certification process, as well as for knowledge discovery in the field of ECG analysis.
Affiliation(s)
- Temesgen Mehari
  - Fraunhofer Heinrich Hertz Institute, Berlin, Germany; Physikalisch-Technische Bundesanstalt, Berlin, Germany
- Nils Strodthoff
  - Carl von Ossietzky Universität Oldenburg, Oldenburg, Germany
3
Diaw MD, Papelier S, Durand-Salmon A, Felblinger J, Oster J. A Human-Centered AI Framework for Efficient Labelling of ECGs From Drug Safety Trials. IEEE Trans Biomed Eng 2024; 71:1697-1704. [PMID: 38157467 DOI: 10.1109/tbme.2023.3348329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2024]
Abstract
Drug safety trials require substantial ECG labelling; in thorough QT studies, for example, this means measuring the QT interval, whose prolongation is a biomarker of proarrhythmic risk. The traditional method of manually measuring the QT interval is time-consuming and error-prone. Studies have demonstrated the potential of deep learning (DL)-based methods to automate this task, but expert validation of these computerized measurements remains of paramount importance, particularly for abnormal ECG recordings. In this paper, we propose a highly automated framework that combines such a DL-based QT estimator with human expertise. The framework consists of 3 key components: (1) automated QT measurement with uncertainty quantification; (2) expert review of a few DL-based measurements, mostly those with high model uncertainty; and (3) recalibration of the unreviewed measurements based on the expert-validated data. We assess its effectiveness on 3 drug safety trials and show that it can significantly reduce the effort required for ECG labelling (in our experiments, only 10% of the data were reviewed per trial) while maintaining high levels of QT accuracy. Our study thus demonstrates the possibility of productive human-machine collaboration in ECG analysis without any compromise on the reliability of subsequent clinical interpretations.
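The three components listed in this abstract can be illustrated with a short sketch. This is not the authors' implementation: the function name `review_and_recalibrate`, the `expert_qt` callback, and the toy numbers are hypothetical, and the recalibration step is assumed here to be a simple constant-offset correction estimated from the reviewed subset, since the abstract does not say how recalibration is performed.

```python
from statistics import mean


def review_and_recalibrate(qt_pred, uncertainty, expert_qt, review_fraction=0.10):
    """Hypothetical human-in-the-loop QT labelling loop (illustrative only).

    qt_pred:      model QT estimates in ms, step (1)
    uncertainty:  model uncertainty per estimate (higher means less certain)
    expert_qt:    callable returning the expert's QT in ms for a given index;
                  it stands in for the manual review step
    """
    n = len(qt_pred)
    n_review = max(1, round(review_fraction * n))

    # Step (2): send the most uncertain measurements to the expert.
    by_uncertainty = sorted(range(n), key=lambda i: uncertainty[i], reverse=True)
    reviewed = set(by_uncertainty[:n_review])
    expert_values = {i: expert_qt(i) for i in reviewed}

    # Step (3): recalibrate the unreviewed measurements. ASSUMPTION: a single
    # mean offset between expert and model values on the reviewed subset.
    offset = mean(expert_values[i] - qt_pred[i] for i in reviewed)
    final = [
        expert_values[i] if i in reviewed else qt_pred[i] + offset
        for i in range(n)
    ]
    return final, sorted(reviewed)


# Toy data: the model reads 5 ms short on every beat.
preds = [400.0, 410.0, 395.0, 420.0, 405.0, 398.0, 415.0, 402.0, 408.0, 412.0]
sigma = [2.0, 3.0, 1.0, 9.0, 2.5, 1.5, 4.0, 2.0, 3.5, 2.2]
truth = [p + 5.0 for p in preds]
final, reviewed = review_and_recalibrate(preds, sigma, lambda i: truth[i])
```

With a 10% review budget, only the single most uncertain measurement (index 3) goes to the expert, and the offset learned from it is applied to the other nine. A real system would likely need a richer recalibration model than one constant.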
4
Lu L, Zhu T, Ribeiro AH, Clifton L, Zhao E, Zhou J, Ribeiro ALP, Zhang YT, Clifton DA. Decoding 2.3 million ECGs: interpretable deep learning for advancing cardiovascular diagnosis and mortality risk stratification. Eur Heart J Digit Health 2024; 5:247-259. [PMID: 38774384 PMCID: PMC11104458 DOI: 10.1093/ehjdh/ztae014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Revised: 02/07/2024] [Accepted: 02/14/2024] [Indexed: 05/24/2024]
Abstract
Aims: Electrocardiogram (ECG) is widely considered the primary test for evaluating cardiovascular diseases. However, the use of artificial intelligence (AI) to advance these medical practices and learn new clinical insights from ECGs remains largely unexplored. We hypothesize that AI models with a specific design can provide fine-grained interpretation of ECGs to advance cardiovascular diagnosis, stratify mortality risks, and identify new clinically useful information. Methods and results: Utilizing a data set of 2 322 513 ECGs collected from 1 558 772 patients with 7 years follow-up, we developed a deep-learning model with state-of-the-art granularity for the interpretable diagnosis of cardiac abnormalities, gender identification, and hypertension screening solely from ECGs, which are then used to stratify the risk of mortality. The model achieved the area under the receiver operating characteristic curve (AUC) scores of 0.998 (95% confidence interval (CI), 0.995-0.999), 0.964 (95% CI, 0.963-0.965), and 0.839 (95% CI, 0.837-0.841) for the three diagnostic tasks separately. Using ECG-predicted results, we find high risks of mortality for subjects with sinus tachycardia (adjusted hazard ratio (HR) of 2.24, 1.96-2.57), and atrial fibrillation (adjusted HR of 2.22, 1.99-2.48). We further use salient morphologies produced by the deep-learning model to identify key ECG leads that achieved similar performance for the three diagnoses, and we find that the V1 ECG lead is important for hypertension screening and mortality risk stratification of hypertensive cohorts, with an AUC of 0.816 (0.814-0.818) and a univariate HR of 1.70 (1.61-1.79) for the two tasks separately. Conclusion: Using ECGs alone, our developed model showed cardiologist-level accuracy in interpretable cardiac diagnosis and the advancement in mortality risk stratification. In addition, it demonstrated the potential to facilitate clinical knowledge discovery for gender and hypertension detection, which are not readily available.
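For readers less familiar with the metric, the AUC values reported in this abstract can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The sketch below computes that pairwise definition directly; the function name `auc_pairwise` and the toy scores are illustrative assumptions and are not data from the study.

```python
def auc_pairwise(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This brute-force form is quadratic in
    the number of cases, so it is only meant for small illustrative inputs.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: three positive and three negative cases.
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
auc = auc_pairwise(toy_scores, toy_labels)
```

In the toy data, eight of the nine positive-negative pairs are ordered correctly, so the AUC is 8/9. On data sets with millions of ECGs a rank-based computation would be used instead of the double loop.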
Affiliation(s)
- Lei Lu
  - Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, OX3 7DQ, UK
  - School of Life Course and Population Sciences, King’s College London, London, SE1 1UL, UK
- Tingting Zhu
  - Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, OX3 7DQ, UK
- Antonio H Ribeiro
  - Department of Information Technology, Uppsala University, Uppsala, Sweden
- Lei Clifton
  - Nuffield Department of Population Health, University of Oxford Big Data Institute, Oxford, OX3 7LF, UK
- Erying Zhao
  - Psychological Science and Health Management Center, Harbin Medical University, Harbin, 150076, China
  - Department of Psychiatry, University of Oxford, Oxford, OX3 7JX, UK
- Jiandong Zhou
  - Department of Family Medicine and Primary Care, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
  - Division of Health Science, Warwick Medical School, University of Warwick, Coventry, CV4 7AL, UK
- Antonio Luiz P Ribeiro
  - Department of Internal Medicine, Faculdade de Medicina, and Telehealth Center and Cardiology Service, Hospital das Clínicas, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- Yuan-Ting Zhang
  - Department of Electronic Engineering, Chinese University of Hong Kong, Hong Kong SAR, China
- David A Clifton
  - Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, OX3 7DQ, UK
  - Oxford Suzhou Centre for Advanced Research, Suzhou, 215123, China
5
Tai M, Jin Z, Wang H, Guo Y. [Application of photoplethysmography for atrial fibrillation in early warning, diagnosis and integrated management]. Sheng Wu Yi Xue Gong Cheng Xue Za Zhi (Journal of Biomedical Engineering) 2023; 40:1102-1107. [PMID: 38151932 PMCID: PMC10753309 DOI: 10.7507/1001-5515.202206005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2022] [Revised: 08/07/2023] [Indexed: 12/29/2023]
Abstract
Atrial fibrillation (AF) is the most common sustained cardiac arrhythmia. Early diagnosis and effective management are important to reduce AF-related adverse events. Photoplethysmography (PPG) is often used in wearable devices to assist continuous electrocardiographic monitoring, where it has shown unique value. The development of PPG has provided an innovative solution to AF management. A series of studies on mobile health technology for improved screening and optimized integrated care in AF have explored the application of PPG in screening, diagnosis, early warning, and integrated management in patients with AF. This review summarizes recent progress in PPG analysis based on artificial intelligence and mobile health in the AF field, as well as the limitations of current research and the focus of future research.
Affiliation(s)
- Meihui Tai (邰美慧)
  - Department of Cardiopulmonary Vascular and Thrombotic Diseases, Sixth Medical Department, Chinese PLA General Hospital, Beijing 100048, P. R. China
  - Chinese PLA Medical College, Beijing 100853, P. R. China
- Zhigeng Jin (金至赓)
  - Department of Cardiopulmonary Vascular and Thrombotic Diseases, Sixth Medical Department, Chinese PLA General Hospital, Beijing 100048, P. R. China
- Hao Wang (王浩)
  - Department of Cardiopulmonary Vascular and Thrombotic Diseases, Sixth Medical Department, Chinese PLA General Hospital, Beijing 100048, P. R. China
- Yutao Guo (郭豫涛)
  - Department of Cardiopulmonary Vascular and Thrombotic Diseases, Sixth Medical Department, Chinese PLA General Hospital, Beijing 100048, P. R. China
6
Ansari MY, Qaraqe M, Charafeddine F, Serpedin E, Righetti R, Qaraqe K. Estimating age and gender from electrocardiogram signals: A comprehensive review of the past decade. Artif Intell Med 2023; 146:102690. [PMID: 38042607 DOI: 10.1016/j.artmed.2023.102690] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Revised: 10/13/2023] [Accepted: 10/18/2023] [Indexed: 12/04/2023]
Abstract
Twelve lead electrocardiogram signals capture unique fingerprints about the body's biological processes and electrical activity of heart muscles. Machine learning and deep learning-based models can learn the embedded patterns in the electrocardiogram to estimate complex metrics such as age and gender that depend on multiple aspects of human physiology. ECG estimated age with respect to the chronological age reflects the overall well-being of the cardiovascular system, with significant positive deviations indicating an aged cardiovascular system and a higher likelihood of cardiovascular mortality. Several conventional, machine learning, and deep learning-based methods have been proposed to estimate age from electronic health records, health surveys, and ECG data. This manuscript comprehensively reviews the methodologies proposed for ECG-based age and gender estimation over the last decade. Specifically, the review highlights that elevated ECG age is associated with atherosclerotic cardiovascular disease, abnormal peripheral endothelial dysfunction, and high mortality, among many other cardiovascular disorders. Furthermore, the survey presents overarching observations and insights across methods for age and gender estimation. This paper also presents several essential methodological improvements and clinical applications of ECG-estimated age and gender to encourage further improvements of the state-of-the-art methodologies.
Affiliation(s)
- Mohammed Yusuf Ansari
  - Texas A&M University, College Station, TX, USA; Texas A&M University at Qatar, Doha, Qatar
- Marwa Qaraqe
  - Division of Information and Computing Technology, Hamad Bin Khalifa University, Doha, Qatar; Texas A&M University at Qatar, Doha, Qatar
7
Thunold HH, Riegler MA, Yazidi A, Hammer HL. A Deep Diagnostic Framework Using Explainable Artificial Intelligence and Clustering. Diagnostics (Basel) 2023; 13:3413. [PMID: 37998548 PMCID: PMC10670034 DOI: 10.3390/diagnostics13223413] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 11/03/2023] [Accepted: 11/06/2023] [Indexed: 11/25/2023]
Abstract
An important part of diagnostics is to gain insight into properties that characterize a disease. Machine learning has been used for this purpose, for instance, to identify biomarkers in genomics. However, when patient data are presented as images, identifying properties that characterize a disease becomes far more challenging. A common strategy involves extracting features from the images and analyzing their occurrence in healthy versus pathological images. A limitation of this approach is that the ability to gain new insights into the disease from the data is constrained by the information in the extracted features. Typically, these features are manually extracted by humans, which further limits the potential for new insights. To overcome these limitations, in this paper, we propose a novel framework that provides insights into diseases without relying on handcrafted features or human intervention. Our framework is based on deep learning (DL), explainable artificial intelligence (XAI), and clustering. DL is employed to learn deep patterns, enabling efficient differentiation between healthy and pathological images. Explainable artificial intelligence (XAI) visualizes these patterns, and a novel "explanation-weighted" clustering technique is introduced to gain an overview of these patterns across multiple patients. We applied the method to images from the gastrointestinal tract. In addition to real healthy images and real images of polyps, some of the images had synthetic shapes added to represent other types of pathologies than polyps. The results show that our proposed method was capable of organizing the images based on the reasons they were diagnosed as pathological, achieving high cluster quality and a rand index close to or equal to one.
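The Rand index mentioned at the end of this abstract measures how well two groupings of the same items agree: it is the fraction of item pairs that are either placed together in both groupings or placed apart in both. The sketch below implements the plain (unadjusted) Rand index; whether the authors used this or an adjusted variant is not stated in the abstract, and the labels used here are invented for illustration.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Unadjusted Rand index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two items")

    # A pair agrees when both labelings make the same together/apart decision.
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Hypothetical reference reasons and two candidate clusterings.
reference = ["polyp", "polyp", "square", "square", "healthy"]
ri_perfect = rand_index(reference, [0, 0, 1, 1, 2])  # same partition
ri_partial = rand_index(reference, [0, 0, 1, 2, 2])  # one item misplaced
```

Cluster identifiers do not need to match the reference names, because only the pairwise together/apart decisions are compared. A value of one, as reported in the abstract, means the clustering reproduces the reference partition exactly.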
Affiliation(s)
- Håvard Horgen Thunold
  - Department of Computer Science, Faculty of Technology, Art and Design, Oslo Metropolitan University, 0176 Oslo, Norway
- Michael A. Riegler
  - Department of Computer Science, Faculty of Technology, Art and Design, Oslo Metropolitan University, 0176 Oslo, Norway
  - Department of Holistic Systems, SimulaMet, 0176 Oslo, Norway
- Anis Yazidi
  - Department of Computer Science, Faculty of Technology, Art and Design, Oslo Metropolitan University, 0176 Oslo, Norway
- Hugo L. Hammer
  - Department of Computer Science, Faculty of Technology, Art and Design, Oslo Metropolitan University, 0176 Oslo, Norway
  - Department of Holistic Systems, SimulaMet, 0176 Oslo, Norway
8
Storås AM, Andersen OE, Lockhart S, Thielemann R, Gnesin F, Thambawita V, Hicks SA, Kanters JK, Strümke I, Halvorsen P, Riegler MA. Usefulness of Heat Map Explanations for Deep-Learning-Based Electrocardiogram Analysis. Diagnostics (Basel) 2023; 13:2345. [PMID: 37510089 PMCID: PMC10378376 DOI: 10.3390/diagnostics13142345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Revised: 07/06/2023] [Accepted: 07/10/2023] [Indexed: 07/30/2023]
Abstract
Deep neural networks are complex machine learning models that have shown promising results in analyzing high-dimensional data such as those collected from medical examinations. Such models have the potential to provide fast and accurate medical diagnoses. However, the high complexity makes deep neural networks and their predictions difficult to understand. Providing model explanations can be a way of increasing the understanding of "black box" models and building trust. In this work, we applied transfer learning to develop a deep neural network to predict sex from electrocardiograms. Using the visual explanation method Grad-CAM, heat maps were generated from the model in order to understand how it makes predictions. To evaluate the usefulness of the heat maps and determine if the heat maps identified electrocardiogram features that could be recognized to discriminate sex, medical doctors provided feedback. Based on the feedback, we concluded that, in our setting, this mode of explainable artificial intelligence does not provide meaningful information to medical doctors and is not useful in the clinic. Our results indicate that improved explanation techniques that are tailored to medical data should be developed before deep neural networks can be applied in the clinic for diagnostic purposes.
Affiliation(s)
- Andrea M Storås
  - Department of Holistic Systems, Simula Metropolitan Center for Digital Engineering, 0167 Oslo, Norway
  - Department of Computer Science, Oslo Metropolitan University, 0130 Oslo, Norway
- Ole Emil Andersen
  - Department of Public Health, Aarhus University, 8000 Aarhus, Denmark
  - Steno Diabetes Center, Aarhus University, 8000 Aarhus, Denmark
- Sam Lockhart
  - Wellcome Trust-Medical Research Council Institute of Metabolic Science, University of Cambridge, Cambridge CB2 0QQ, UK
- Roman Thielemann
  - Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, 2200 Copenhagen, Denmark
- Filip Gnesin
  - Department of Cardiology, North Zealand Hospital, 3400 Hillerød, Denmark
- Vajira Thambawita
  - Department of Holistic Systems, Simula Metropolitan Center for Digital Engineering, 0167 Oslo, Norway
- Steven A Hicks
  - Department of Holistic Systems, Simula Metropolitan Center for Digital Engineering, 0167 Oslo, Norway
- Jørgen K Kanters
  - Department of Biomedical Sciences, University of Copenhagen, 2200 Copenhagen, Denmark
- Inga Strümke
  - Department of Computer Science, Norwegian University of Science and Technology, 7491 Trondheim, Norway
- Pål Halvorsen
  - Department of Holistic Systems, Simula Metropolitan Center for Digital Engineering, 0167 Oslo, Norway
  - Department of Computer Science, Oslo Metropolitan University, 0130 Oslo, Norway
- Michael A Riegler
  - Department of Holistic Systems, Simula Metropolitan Center for Digital Engineering, 0167 Oslo, Norway
  - Department of Computer Science, UiT The Arctic University of Norway, 9037 Tromsø, Norway
9
Diaw MD, Papelier S, Durand-Salmon A, Felblinger J, Oster J. AI-Assisted QT Measurements for Highly Automated Drug Safety Studies. IEEE Trans Biomed Eng 2023; 70:1504-1515. [PMID: 36355743 DOI: 10.1109/tbme.2022.3221339] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Indexed: 11/12/2022]
Abstract
Rate-corrected QT interval (QTc) prolongation has been suggested as a biomarker for the risk of drug-induced torsades de pointes, and is therefore monitored during clinical trials for the assessment of drug safety. Manual QT measurements by expert ECG analysts are expensive, laborious and prone to errors. Wavelet-based delineators and other automatic methods do not generalize well to different T wave morphologies and may require laborious tuning. Our study investigates the robustness of convolutional neural networks (CNNs) for QT measurement. We trained 3 CNN-based deep learning models on a private ECG database with human expert-annotated QT intervals. Among these models, we propose a U-Net model, which is widely used for segmentation tasks, to build a novel clinically useful QT estimator that includes QT delineation for better interpretability. We tested the 3 models on four external databases, including a clinical trial investigating four drugs. Our results show that the deep learning models are in stronger agreement with the experts than the state-of-the-art wavelet-based algorithm. Indeed, the deep learning models yielded up to 71% accurate QT measurements (absolute difference between manual and automatic QT below 15 ms), whereas the wavelet-based algorithm achieved only 52%. For the 2 studies of drugs with small to no QT prolonging effect, a mean absolute difference of 6 ms (std = 5 ms) was obtained between the manual and deep learning methods. For the other 2 drugs with more significant effect on the volunteers, a mean difference of up to 17 ms (std = 17 ms) was obtained. The proposed models are therefore promising for automated QT measurements during clinical trials. They can analyze various ECG morphologies from a diversity of individuals, although some QT-prolonged ECGs can be challenging. The U-Net model is particularly interesting for our application as it facilitates expert review of automatic QT intervals, which is still required by regulatory bodies, by providing QRS onset and T offset positions that are consistent with the estimated QT intervals.
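The agreement figures quoted in this abstract rest on two simple quantities: the share of measurements whose absolute manual-versus-automatic difference is below 15 ms, and the mean and standard deviation of that absolute difference. A minimal sketch follows; the function name `qt_agreement`, the toy values, and the use of the sample standard deviation are assumptions, since the abstract does not state which estimator was used.

```python
from statistics import mean, stdev


def qt_agreement(manual_ms, auto_ms, tolerance_ms=15.0):
    """Summarise agreement between manual and automatic QT intervals (ms)."""
    if len(manual_ms) != len(auto_ms):
        raise ValueError("need one automatic value per manual value")
    if len(manual_ms) < 2:
        raise ValueError("need at least two measurement pairs")

    diffs = [abs(m - a) for m, a in zip(manual_ms, auto_ms)]
    return {
        # "Accurate" follows the abstract's criterion: difference below 15 ms.
        "fraction_accurate": sum(d < tolerance_ms for d in diffs) / len(diffs),
        "mean_abs_diff_ms": mean(diffs),
        # ASSUMPTION: sample standard deviation of the absolute differences.
        "std_abs_diff_ms": stdev(diffs),
    }


# Toy values, not taken from the study.
manual = [400.0, 410.0, 420.0, 430.0]
auto = [395.0, 412.0, 440.0, 431.0]
result = qt_agreement(manual, auto)
```

Here three of the four toy measurements fall within the 15 ms tolerance, and the absolute differences of 5, 2, 20 and 1 ms average to 7 ms.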
10
Interpretable Machine Learning Techniques in ECG-Based Heart Disease Classification: A Systematic Review. Diagnostics (Basel) 2022; 13:111. [PMID: 36611403 PMCID: PMC9818170 DOI: 10.3390/diagnostics13010111] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 12/05/2022] [Revised: 12/22/2022] [Accepted: 12/23/2022] [Indexed: 12/31/2022]
Abstract
Heart disease is one of the leading causes of mortality throughout the world. Among the different heart diagnosis techniques, an electrocardiogram (ECG) is the least expensive non-invasive procedure. However, the following are challenges: the scarcity of medical experts, the complexity of ECG interpretations, the manifestation similarities of heart disease in ECG signals, and heart disease comorbidity. Machine learning algorithms are viable alternatives to the traditional diagnoses of heart disease from ECG signals. However, the black box nature of complex machine learning algorithms and the difficulty in explaining a model's outcomes are obstacles for medical practitioners in having confidence in machine learning models. This observation paves the way for interpretable machine learning (IML) models as diagnostic tools that can build a physician's trust and provide evidence-based diagnoses. Therefore, in this systematic literature review, we studied and analyzed the research landscape in interpretable machine learning techniques by focusing on heart disease diagnosis from an ECG signal. In this regard, the contribution of our work is manifold; first, we present an elaborate discussion on interpretable machine learning techniques. In addition, we identify and characterize ECG signal recording datasets that are readily available for machine learning-based tasks. Furthermore, we identify the progress that has been achieved in ECG signal interpretation using IML techniques. Finally, we discuss the limitations and challenges of IML techniques in interpreting ECG signals.
11
Niraula D, Cui S, Pakela J, Wei L, Luo Y, Ten Haken RK, El Naqa I. Current status and future developments in predicting outcomes in radiation oncology. Br J Radiol 2022; 95:20220239. [PMID: 35867841 PMCID: PMC9793488 DOI: 10.1259/bjr.20220239] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 01/09/2023]
Abstract
Advancements in data-driven technologies and the inclusion of information-rich multiomics features have significantly improved the performance of outcomes modeling in radiation oncology. For this current trend to be sustainable, challenges related to robust data modeling such as small sample size, low size to feature ratio, noisy data, as well as issues related to algorithmic modeling such as complexity, uncertainty, and interpretability, need to be mitigated if not resolved. Emerging computational technologies and new paradigms such as federated learning, human-in-the-loop, quantum computing, and novel interpretability methods show great potential in overcoming these challenges and bridging the gap towards precision outcome modeling in radiotherapy. Examples of these promising technologies will be presented and their potential role in improving outcome modeling will be discussed.
Affiliation(s)
- Dipesh Niraula
  - Department of Machine Learning, H Lee Moffitt Cancer Center and Research Institute, Tampa, USA
- Sunan Cui
  - Department of Radiation Oncology, Stanford Medicine, Stanford University, Stanford, USA
- Julia Pakela
  - Department of Radiation Oncology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA
- Lise Wei
  - Department of Radiation Oncology, University of Michigan, Ann Arbor, USA
- Yi Luo
  - Department of Machine Learning, H Lee Moffitt Cancer Center and Research Institute, Tampa, USA
- Issam El Naqa
  - Department of Machine Learning, H Lee Moffitt Cancer Center and Research Institute, Tampa, USA
12
Al-Zaiti SS, Alghwiri AA, Hu X, Clermont G, Peace A, Macfarlane P, Bond R. A clinician's guide to understanding and critically appraising machine learning studies: a checklist for Ruling Out Bias Using Standard Tools in Machine Learning (ROBUST-ML). Eur Heart J Digit Health 2022; 3:125-140. [PMID: 36713011 PMCID: PMC9708024 DOI: 10.1093/ehjdh/ztac016] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 12/20/2021] [Revised: 02/11/2022] [Indexed: 05/06/2023]
Abstract
Developing functional machine learning (ML)-based models to address unmet clinical needs requires unique considerations for optimal clinical utility. Recent debates about the rigour, transparency, explainability, and reproducibility of ML models, terms which are defined in this article, have raised concerns about their clinical utility and suitability for integration in current evidence-based practice paradigms. This featured article focuses on increasing the literacy of ML among clinicians by providing them with the knowledge and tools needed to understand and critically appraise clinical studies focused on ML. A checklist is provided for evaluating the rigour and reproducibility of the four ML building blocks: data curation, feature engineering, model development, and clinical deployment. Checklists like this are important for quality assurance and to ensure that ML studies are rigorously and confidently reviewed by clinicians and are guided by domain knowledge of the setting in which the findings will be applied. Bridging the gap between clinicians, healthcare scientists, and ML engineers can address many shortcomings and pitfalls of ML-based solutions and their potential deployment at the bedside.
Affiliation(s)
- Alaa A Alghwiri
  - Data Science Core, The Provost Office, University of Pittsburgh, Pittsburgh PA, USA
- Xiao Hu
  - Center for Data Science, Emory University, Atlanta, GA, USA
- Gilles Clermont
  - Departments of Critical Care Medicine, Mathematics, Clinical and Translational Science, and Industrial Engineering, University of Pittsburgh, Pittsburgh, PA, USA
- Aaron Peace
  - The Clinical Translational Research and Innovation Centre, Northern Ireland, UK
- Peter Macfarlane
  - Institute of Health and Wellbeing, Electrocardiology Section, University of Glasgow, Glasgow, UK
- Raymond Bond
  - School of Computing, Ulster University, Ulster, UK
| |
Collapse
|
13
|
Anand A, Kadian T, Shetty MK, Gupta A. Explainable AI decision model for ECG data of cardiac disorders. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103584] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 12/31/2022]
|
14
|
Petmezas G, Stefanopoulos L, Kilintzis V, Tzavelis A, Rogers JA, Katsaggelos AK, Maglaveras N. State-of-the-art Deep Learning Methods on Electrocardiogram Data: A Systematic Review (Preprint). JMIR Med Inform 2022; 10:e38454. [PMID: 35969441 PMCID: PMC9425174 DOI: 10.2196/38454] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 04/03/2022] [Revised: 06/03/2022] [Accepted: 07/03/2022] [Indexed: 11/13/2022]
Abstract
Background: Electrocardiogram (ECG) is one of the most common noninvasive diagnostic tools that can provide useful information regarding a patient’s health status. Deep learning (DL) is an area of intense exploration that leads the way in most attempts to create powerful diagnostic models based on physiological signals. Objective: This study aimed to provide a systematic review of DL methods applied to ECG data for various clinical applications. Methods: The PubMed search engine was systematically searched by combining “deep learning” and keywords such as “ecg,” “ekg,” “electrocardiogram,” “electrocardiography,” and “electrocardiology.” Irrelevant articles were excluded from the study after screening titles and abstracts, and the remaining articles were further reviewed. The reasons for article exclusion were manuscripts written in any language other than English, absence of ECG data or DL methods involved in the study, and absence of a quantitative evaluation of the proposed approaches. Results: We identified 230 relevant articles published between January 2020 and December 2021 and grouped them into 6 distinct medical applications, namely, blood pressure estimation, cardiovascular disease diagnosis, ECG analysis, biometric recognition, sleep analysis, and other clinical analyses. We provide a complete account of the state-of-the-art DL strategies per the field of application, as well as major ECG data sources. We also present open research problems, such as the lack of attempts to address the issue of blood pressure variability in training data sets, and point out potential gaps in the design and implementation of DL models. Conclusions: We expect that this review will provide insights into state-of-the-art DL methods applied to ECG data and point to future directions for research on DL to create robust models that can assist medical experts in clinical decision-making.
Affiliation(s)
- Georgios Petmezas
- Lab of Computing, Medical Informatics and Biomedical-Imaging Technologies, The Medical School, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Leandros Stefanopoulos
- Lab of Computing, Medical Informatics and Biomedical-Imaging Technologies, The Medical School, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Vassilis Kilintzis
- Lab of Computing, Medical Informatics and Biomedical-Imaging Technologies, The Medical School, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Andreas Tzavelis
- Department of Biomedical Engineering, Northwestern University, Evanston, IL, United States
- John A Rogers
- Department of Material Science, Northwestern University, Evanston, IL, United States
- Aggelos K Katsaggelos
- Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL, United States
- Nicos Maglaveras
- Lab of Computing, Medical Informatics and Biomedical-Imaging Technologies, The Medical School, Aristotle University of Thessaloniki, Thessaloniki, Greece
|
15
|
Conti F, Frosini P, Quercioli N. On the Construction of Group Equivariant Non-Expansive Operators via Permutants and Symmetric Functions. Front Artif Intell 2022; 5:786091. [PMID: 35243336 PMCID: PMC8887714 DOI: 10.3389/frai.2022.786091] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/29/2021] [Accepted: 01/18/2022] [Indexed: 11/21/2022]
Abstract
Group Equivariant Operators (GEOs) are a fundamental tool in the research on neural networks, since they make available a new kind of geometric knowledge engineering for deep learning, which can exploit symmetries in artificial intelligence and reduce the number of parameters required in the learning process. In this paper we introduce a new method to build non-linear GEOs and non-linear Group Equivariant Non-Expansive Operators (GENEOs), based on the concepts of symmetric function and permutant. This method is particularly interesting because of the good theoretical properties of GENEOs and the ease of use of permutants to build equivariant operators, compared to the direct use of the equivariance groups we are interested in. In our paper, we prove that the technique we propose works for any symmetric function, and benefits from the approximability of continuous symmetric functions by symmetric polynomials. A possible use in Topological Data Analysis of the GENEOs obtained by this new method is illustrated.
Affiliation(s)
- Francesco Conti
- Department of Mathematics, University of Pisa, Pisa, Italy
- Institute of Information Science and Technologies “A. Faedo”, National Research Council of Italy (CNR), Pisa, Italy
- Patrizio Frosini
- Department of Mathematics, University of Bologna, Bologna, Italy
- Alma Mater Research Center on Applied Mathematics, University of Bologna, Bologna, Italy
- Alma Mater Research Institute for Human-Centered Artificial Intelligence, University of Bologna, Bologna, Italy
- Research Centre on Electronic Systems for the Information and Communication Technology, University of Bologna, Bologna, Italy
- Nicola Quercioli
- Department of Mathematics, University of Bologna, Bologna, Italy
- ENEA Centro Ricerche Bologna, Bologna, Italy
|
16
|
Isaksen JL, Baumert M, Hermans ANL, Maleckar M, Linz D. Artificial intelligence for the detection, prediction, and management of atrial fibrillation. Herzschrittmacherther Elektrophysiol 2022; 33:34-41. [PMID: 35147766 PMCID: PMC8853037 DOI: 10.1007/s00399-022-00839-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 01/14/2022] [Accepted: 01/17/2022] [Indexed: 11/07/2022]
Abstract
The present article reviews the state of the art of machine learning algorithms for the detection, prediction, and management of atrial fibrillation (AF), as well as of the development and evaluation of artificial intelligence (AI) in cardiology and beyond. Today, AI detects AF with a high accuracy using 12-lead or single-lead electrocardiograms or photoplethysmography. The prediction of paroxysmal or future AF currently operates at a level of precision that is too low for clinical use. Further studies are needed to determine whether patient selection for interventions may be possible with machine learning.
Affiliation(s)
- Jonas L Isaksen
- Department of Biomedical Sciences, University of Copenhagen, Copenhagen, Denmark
- Mathias Baumert
- School of Electrical and Electronic Engineering, The University of Adelaide, Adelaide, SA, Australia
- Astrid N L Hermans
- Department of Cardiology, Maastricht University Medical Center and Cardiovascular Research Institute Maastricht, Maastricht, The Netherlands
- Molly Maleckar
- Department of Computational Physiology, Simula Research Laboratory, Oslo, Norway
- Dominik Linz
- Department of Biomedical Sciences, University of Copenhagen, Copenhagen, Denmark; Department of Cardiology, Maastricht University Medical Center and Cardiovascular Research Institute Maastricht, Maastricht, The Netherlands
|
17
|
Identifying Risk of Adverse Outcomes in COVID-19 Patients via Artificial Intelligence-Powered Analysis of 12-Lead Intake Electrocardiogram. Cardiovasc Digit Health J 2021; 3:62-74. [PMID: 35005676 PMCID: PMC8719367 DOI: 10.1016/j.cvdhj.2021.12.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/25/2022]
Abstract
Background: Adverse events in COVID-19 are difficult to predict. Risk stratification is encumbered by the need to protect healthcare workers. We hypothesize that artificial intelligence (AI) can help identify subtle signs of myocardial involvement in the 12-lead electrocardiogram (ECG), which could help predict complications. Objective: Use intake ECGs from COVID-19 patients to train AI models to predict risk of mortality or major adverse cardiovascular events (MACE). Methods: We studied intake ECGs from 1448 COVID-19 patients (60.5% male, aged 63.4 ± 16.9 years). Records were labeled by mortality (death vs discharge) or MACE (no events vs arrhythmic, heart failure [HF], or thromboembolic [TE] events), then used to train AI models; these were compared to conventional regression models developed using demographic and comorbidity data. Results: A total of 245 (17.7%) patients died (67.3% male, aged 74.5 ± 14.4 years); 352 (24.4%) experienced at least 1 MACE (119 arrhythmic, 107 HF, 130 TE). AI models predicted mortality and MACE with area under the curve (AUC) values of 0.60 ± 0.05 and 0.55 ± 0.07, respectively; these were comparable to AUC values for conventional models (0.73 ± 0.07 and 0.65 ± 0.10). There were no prominent temporal trends in mortality rate or MACE incidence in our cohort; holdout testing with data from after a cutoff date (June 9, 2020) did not degrade model performance. Conclusion: Using intake ECGs alone, our AI models had limited ability to predict hospitalized COVID-19 patients’ risk of mortality or MACE. Our models’ accuracy was comparable to that of conventional models built using more in-depth information, but translation to clinical use would require higher sensitivity and positive predictive value. In the future, we hope that mixed-input AI models utilizing both ECG and clinical data may be developed to enhance predictive accuracy.
|
18
|
Maleckar MM, Myklebust L, Uv J, Florvaag PM, Strøm V, Glinge C, Jabbari R, Vejlstrup N, Engstrøm T, Ahtarovski K, Jespersen T, Tfelt-Hansen J, Naumova V, Arevalo H. Combined In-silico and Machine Learning Approaches Toward Predicting Arrhythmic Risk in Post-infarction Patients. Front Physiol 2021; 12:745349. [PMID: 34819872 PMCID: PMC8606551 DOI: 10.3389/fphys.2021.745349] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/21/2021] [Accepted: 10/06/2021] [Indexed: 11/29/2022]
Abstract
Background: Remodeling due to myocardial infarction (MI) significantly increases patient arrhythmic risk. Simulations using patient-specific models have shown promise in predicting personalized risk for arrhythmia. However, these are computationally and time intensive, hindering translation to clinical practice. Classical machine learning (ML) algorithms (such as K-nearest neighbors, Gaussian support vector machines, and decision trees) as well as neural network techniques, shown to increase prediction accuracy, can be used to predict occurrence of arrhythmia as predicted by simulations based solely on infarct and ventricular geometry. We present an initial combined image-based patient-specific in silico and machine learning methodology to assess risk for dangerous arrhythmia in post-infarct patients. Furthermore, we aim to demonstrate that simulation-supported data augmentation improves prediction models, combining patient data, computational simulation, and advanced statistical modeling, improving overall accuracy for arrhythmia risk assessment. Methods: MRI-based computational models were constructed from 30 patients 5 days post-MI (the “baseline” population). In order to assess the utility of biophysical model-supported data augmentation for improving arrhythmia prediction, we augmented the virtual baseline patient population. Each patient ventricular and ischemic geometry in the baseline population was used to create a subfamily of geometric models, resulting in an expanded set of patient models (the “augmented” population). Arrhythmia induction was attempted via programmed stimulation at 17 sites for each virtual patient, corresponding to AHA LV segments, and the simulation outcomes, “arrhythmia” or “no-arrhythmia,” were used as ground truth for subsequent statistical prediction (machine learning, ML) models. For each patient geometric model, we measured and used choice data features: the myocardial volume and ischemic volume, as well as the segment-specific myocardial volume and ischemia percentage, as input to ML algorithms. For classical ML techniques (ML), we trained k-nearest neighbors, support vector machine, logistic regression, xgboost, and decision tree models to predict the simulation outcome from these geometric features alone. To explore neural network ML techniques, we trained both a three- and a four-hidden-layer multilayer perceptron feedforward neural network (NN), again predicting simulation outcomes from these geometric features alone. ML and NN models were trained on 70% of randomly selected segments and the remaining 30% was used for validation for both baseline and augmented populations. Results: Stimulation in the baseline population (30 patient models) resulted in reentry in 21.8% of sites tested; in the augmented population (129 total patient models) reentry occurred in 13.0% of sites tested. ML and NN models ranged in mean accuracy from 0.83 to 0.86 for the baseline population, improving to 0.88 to 0.89 in all cases for the augmented population. Conclusion: Machine learning techniques, combined with patient-specific, image-based computational simulations, can provide key clinical insights with high accuracy rapidly and efficiently. In the case of sparse or missing patient data, simulation-supported data augmentation can be employed to further improve predictive results for patient benefit. This work paves the way for using data-driven simulations for prediction of dangerous arrhythmia in MI patients.
Affiliation(s)
- Mary M Maleckar
- Computational Physiology, Simula Research Laboratory, Oslo, Norway
- Lena Myklebust
- Computational Physiology, Simula Research Laboratory, Oslo, Norway
- Julie Uv
- Computational Physiology, Simula Research Laboratory, Oslo, Norway
- Vilde Strøm
- Computational Physiology, Simula Research Laboratory, Oslo, Norway
- Charlotte Glinge
- Department of Cardiology, Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark
- Reza Jabbari
- Department of Cardiology, Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark
- Niels Vejlstrup
- Department of Cardiology, Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark
- Thomas Engstrøm
- Department of Cardiology, Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark
- Kiril Ahtarovski
- Department of Cardiology, Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark
- Thomas Jespersen
- Department of Biomedical Sciences, University of Copenhagen, Copenhagen, Denmark
- Jacob Tfelt-Hansen
- Department of Cardiology, Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark; Department of Forensic Medicine, Faculty of Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Valeriya Naumova
- Computational Physiology, Simula Research Laboratory, Oslo, Norway
|
19
|
Thambawita V, Isaksen JL, Hicks SA, Ghouse J, Ahlberg G, Linneberg A, Grarup N, Ellervik C, Olesen MS, Hansen T, Graff C, Holstein-Rathlou NH, Strümke I, Hammer HL, Maleckar MM, Halvorsen P, Riegler MA, Kanters JK. DeepFake electrocardiograms using generative adversarial networks are the beginning of the end for privacy issues in medicine. Sci Rep 2021; 11:21896. [PMID: 34753975 PMCID: PMC8578227 DOI: 10.1038/s41598-021-01295-2] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 06/22/2021] [Accepted: 10/26/2021] [Indexed: 11/09/2022]
Abstract
Recent global developments underscore the prominent role big data have in modern medical science. However, privacy issues remain a prevalent obstacle to collecting and sharing data between researchers. Synthetic data generated to represent real data, carrying similar information and distribution, may alleviate the privacy issue. In this study, we present generative adversarial networks (GANs) capable of generating realistic synthetic DeepFake 10-s 12-lead electrocardiograms (ECGs). We have developed and compared two methods, named WaveGAN* and Pulse2Pulse. We trained the GANs with 7,233 real normal ECGs to produce 121,977 DeepFake normal ECGs. By verifying the ECGs using a commercial ECG interpretation program (MUSE 12SL, GE Healthcare), we demonstrate that the Pulse2Pulse GAN was superior to WaveGAN* in producing realistic ECGs. ECG intervals and amplitudes were similar between the DeepFake and real ECGs. Although these synthetic ECGs mimic the dataset used for creation, the ECGs are not linked to any individuals and may thus be used freely. The synthetic dataset will be available as open access for researchers at OSF.io and the DeepFake generator available at the Python Package Index (PyPI) for generating synthetic ECGs. In conclusion, we were able to generate realistic synthetic ECGs using generative adversarial neural networks on normal ECGs from two population studies, thereby addressing the relevant privacy issues in medical datasets.
Affiliation(s)
- Vajira Thambawita
- SimulaMet, 0167, Oslo, Norway; Oslo Metropolitan University, 0167, Oslo, Norway
- Steven A Hicks
- SimulaMet, 0167, Oslo, Norway; Oslo Metropolitan University, 0167, Oslo, Norway
- Jonas Ghouse
- University of Copenhagen, 2200, Copenhagen N, Denmark
- Allan Linneberg
- University of Copenhagen, 2200, Copenhagen N, Denmark; Bispebjerg and Frederiksberg Hospital, 2400, Copenhagen NV, Denmark
- Niels Grarup
- University of Copenhagen, 2200, Copenhagen N, Denmark
- Torben Hansen
- University of Copenhagen, 2200, Copenhagen N, Denmark
- Hugo L Hammer
- SimulaMet, 0167, Oslo, Norway; Oslo Metropolitan University, 0167, Oslo, Norway
- Mary M Maleckar
- SimulaMet, 0167, Oslo, Norway; Oslo Metropolitan University, 0167, Oslo, Norway
- Pål Halvorsen
- SimulaMet, 0167, Oslo, Norway; Oslo Metropolitan University, 0167, Oslo, Norway
- Michael A Riegler
- SimulaMet, 0167, Oslo, Norway; UiT The Arctic University of Norway, Tromsø, Norway
|
20
|
McCoy LG, Brenna CTA, Chen S, Vold K, Das S. Believing in Black Boxes: Machine Learning for Healthcare Does Not Need Explainability to be Evidence-Based. J Clin Epidemiol 2021; 142:252-257. [PMID: 34748907 DOI: 10.1016/j.jclinepi.2021.11.001] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 07/08/2021] [Revised: 10/25/2021] [Accepted: 11/01/2021] [Indexed: 12/31/2022]
Abstract
Objective: To examine the role of explainability in machine learning for healthcare (MLHC), and its necessity and significance with respect to effective and ethical MLHC application. Study Design and Setting: This commentary engages with the growing and dynamic corpus of literature on the use of MLHC and artificial intelligence (AI) in medicine, which provides the context for a focused narrative review of arguments presented in favour of and in opposition to explainability in MLHC. Results: We find that concerns regarding explainability are not limited to MLHC, but rather extend to numerous well-validated treatment interventions as well as to human clinical judgment itself. We examine the role of evidence-based medicine in evaluating inexplicable treatments and technologies, and highlight the analogy between the concept of explainability in MLHC and the related concept of mechanistic reasoning in evidence-based medicine. Conclusion: Ultimately, we conclude that the value of explainability in MLHC is not intrinsic, but is instead instrumental to achieving greater imperatives such as performance and trust. We caution against the uncompromising pursuit of explainability, and advocate instead for the development of robust empirical methods to successfully evaluate increasingly inexplicable algorithmic systems.
Affiliation(s)
- Liam G McCoy
- Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
- Connor T A Brenna
- Department of Anesthesiology & Pain Medicine, University of Toronto, Toronto, Ontario, Canada; Department of Philosophy, University of Toronto, Toronto, Ontario, Canada
- Stacy Chen
- Joint Centre for Bioethics, University of Toronto, Toronto, Ontario, Canada
- Karina Vold
- Institute for the History and Philosophy of Science and Technology, University of Toronto, Toronto, Ontario, Canada; Schwartz Reisman Institute for Technology and Society, University of Toronto, Toronto, Ontario, Canada; Centre for Ethics, University of Toronto, Toronto, Ontario, Canada; Leverhulme Centre for the Future of Intelligence, University of Cambridge, Cambridge, United Kingdom
- Sunit Das
- Centre for Ethics, University of Toronto, Toronto, Ontario, Canada; Division of Neurosurgery, University of Toronto, Toronto, Ontario, Canada
|
21
|
Petch J, Di S, Nelson W. Opening the black box: the promise and limitations of explainable machine learning in cardiology. Can J Cardiol 2021; 38:204-213. [PMID: 34534619 DOI: 10.1016/j.cjca.2021.09.004] [Citation(s) in RCA: 110] [Impact Index Per Article: 36.7] [Received: 07/30/2021] [Revised: 08/23/2021] [Accepted: 09/08/2021] [Indexed: 11/29/2022]
Abstract
Many clinicians remain wary of machine learning due to long-standing concerns about "black box" models. "Black box" is shorthand for models that are sufficiently complex that they are not straightforwardly interpretable to humans. Lack of interpretability in predictive models can undermine trust in those models, especially in health care where so many decisions are literally life and death. There has recently been an explosion of research in the field of explainable machine learning aimed at addressing these concerns. The promise of explainable machine learning is considerable, but it is important for cardiologists who may encounter these techniques in clinical decision support tools or novel research papers to have a critical understanding of both their strengths and their limitations. This paper reviews key concepts and techniques in the field of explainable machine learning as they apply to cardiology. Key concepts reviewed include interpretability versus explainability and global versus local explanations. Techniques demonstrated include permutation importance, surrogate decision trees, local interpretable model-agnostic explanations, and partial dependence plots. We discuss several limitations with explainability techniques, focusing on how the nature of explanations as approximations may omit important information about how black box models work and why they make certain predictions. We conclude by proposing a rule of thumb about when it is appropriate to use black box models with explanations, rather than interpretable models.
Affiliation(s)
- Jeremy Petch
- Centre for Data Science and Digital Health, Hamilton Health Sciences; Institute of Health Policy, Management and Evaluation, University of Toronto; Division of Cardiology, Department of Medicine, McMaster University; Population Health Research Institute
- Shuang Di
- Centre for Data Science and Digital Health, Hamilton Health Sciences; Dalla Lana School of Public Health, University of Toronto
- Walter Nelson
- Centre for Data Science and Digital Health, Hamilton Health Sciences; Department of Statistical Sciences, University of Toronto
|