Reference Citation Analysis

For: Petersen E, Holm S, Ganz M, Feragen A. The path toward equal performance in medical machine learning. Patterns (N Y) 2023;4:100790. [PMID: 37521051 PMCID: PMC10382979 DOI: 10.1016/j.patter.2023.100790] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 08/01/2023]

Cited by Other Article(s)

Yang Y, Zhang H, Gichoya JW, Katabi D, Ghassemi M. The limits of fair medical imaging AI in real-world generalization. Nat Med 2024:10.1038/s41591-024-03113-4. [PMID: 38942996 DOI: 10.1038/s41591-024-03113-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Accepted: 06/05/2024] [Indexed: 06/30/2024]

Wei Y, Deng Y, Sun C, Lin M, Jiang H, Peng Y. Deep learning with noisy labels in medical prediction problems: a scoping review. J Am Med Inform Assoc 2024;31:1596-1607. [PMID: 38814164 PMCID: PMC11187424 DOI: 10.1093/jamia/ocae108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2024] [Revised: 04/27/2024] [Accepted: 05/03/2024] [Indexed: 05/31/2024]

Abstract

OBJECTIVES

Medical research faces substantial challenges from noisy labels attributed to factors like inter-expert variability and machine-extracted labels. Despite this, the adoption of label noise management remains limited, and label noise is largely ignored. To this end, there is a critical need to conduct a scoping review focusing on the problem space. This scoping review aims to comprehensively review label noise management in deep learning-based medical prediction problems, which includes label noise detection, label noise handling, and evaluation. Research involving label uncertainty is also included.

METHODS

Our scoping review follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. We searched 4 databases, including PubMed, IEEE Xplore, Google Scholar, and Semantic Scholar. Our search terms include "noisy label AND medical/healthcare/clinical," "uncertainty AND medical/healthcare/clinical," and "noise AND medical/healthcare/clinical."

RESULTS

A total of 60 papers published between 2016 and 2023 met the inclusion criteria. A series of practical questions in medical research are investigated. These include the sources of label noise, the impact of label noise, the detection of label noise, label noise handling techniques, and their evaluation. Categorizations of both label noise detection methods and handling techniques are provided.

DISCUSSION

From a methodological perspective, we observe that the medical community has been up to date with the broader deep-learning community, given that most techniques have been evaluated on medical data. We recommend considering label noise as a standard element in medical research, even if it is not dedicated to handling noisy labels. Initial experiments can start with easy-to-implement methods, such as noise-robust loss functions, weighting, and curriculum learning.
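As an illustration of what a noise-robust loss function can look like, the sketch below implements the generalized cross-entropy loss, (1 - p_y^q) / q, for a single example. This particular loss is chosen here only as an example; the abstract does not name a specific loss, and the function names and the default q = 0.7 are assumptions made for this sketch.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def generalized_cross_entropy(logits, label, q=0.7):
    """Generalized cross-entropy for one example: (1 - p_y**q) / q.

    As q approaches 0 the value approaches the usual cross-entropy -log(p_y);
    at q = 1 it equals 1 - p_y, which is bounded and therefore less sensitive
    to confidently mislabeled examples.
    """
    if not 0.0 < q <= 1.0:
        raise ValueError("q must be in (0, 1]")
    p_y = softmax(logits)[label]
    return (1.0 - p_y ** q) / q


if __name__ == "__main__":
    # Hypothetical three-class logits with a (possibly noisy) label of class 2.
    print(generalized_cross_entropy([2.0, 0.5, -1.0], label=2))
```

Smaller values of q behave more like ordinary cross-entropy, while values closer to 1 trade some fitting speed for robustness; the right setting depends on the data and is not something the review prescribes.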

Schaekermann M, Spitz T, Pyles M, Cole-Lewis H, Wulczyn E, Pfohl SR, Martin D, Jaroensri R, Keeling G, Liu Y, Farquhar S, Xue Q, Lester J, Hughes C, Strachan P, Tan F, Bui P, Mermel CH, Peng LH, Matias Y, Corrado GS, Webster DR, Virmani S, Semturs C, Liu Y, Horn I, Cameron Chen PH. Health equity assessment of machine learning performance (HEAL): a framework and dermatology AI model case study. EClinicalMedicine 2024;70:102479. [PMID: 38685924 PMCID: PMC11056401 DOI: 10.1016/j.eclinm.2024.102479] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Revised: 01/16/2024] [Accepted: 01/25/2024] [Indexed: 05/02/2024]

Abstract

Background

Artificial intelligence (AI) has repeatedly been shown to encode historical inequities in healthcare. We aimed to develop a framework to quantitatively assess the performance equity of health AI technologies and to illustrate its utility via a case study.

Methods

Here, we propose a methodology to assess whether health AI technologies prioritise performance for patient populations experiencing worse outcomes, that is complementary to existing fairness metrics. We developed the Health Equity Assessment of machine Learning performance (HEAL) framework designed to quantitatively assess the performance equity of health AI technologies via a four-step interdisciplinary process to understand and quantify domain-specific criteria, and the resulting HEAL metric. As an illustrative case study (analysis conducted between October 2022 and January 2023), we applied the HEAL framework to a dermatology AI model. A set of 5420 teledermatology cases (store-and-forward cases from patients aged 20 years or older, submitted from primary care providers in the USA and skin cancer clinics in Australia), enriched for diversity in age, sex and race/ethnicity, was used to retrospectively evaluate the AI model's HEAL metric, defined as the likelihood that the AI model performs better for subpopulations with worse average health outcomes as compared to others. The likelihood that AI performance was anticorrelated to pre-existing health outcomes was estimated using bootstrap methods as the probability that the negated Spearman's rank correlation coefficient (i.e., "R") was greater than zero. Positive values of R suggest that subpopulations with poorer health outcomes have better AI model performance. Thus, the HEAL metric, defined as p(R > 0), measures how likely the AI technology is to prioritise performance for subpopulations with worse average health outcomes as compared to others (presented as a percentage below). Health outcomes were quantified as disability-adjusted life years (DALYs) when grouping by sex and age, and years of life lost (YLLs) when grouping by race/ethnicity. AI performance was measured as top-3 agreement with the reference diagnosis from a panel of 3 dermatologists per case.
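The bootstrap estimate of p(R > 0) described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes that cases are resampled with replacement within each subpopulation, that each subpopulation has a single score where higher means better health outcomes (for example, a negated DALY rate), and that performance is the fraction of cases with top-3 agreement. All function and variable names are hypothetical.

```python
import random


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: treat an undefined correlation as zero
    return cov / (vx * vy) ** 0.5


def heal_metric(outcome_quality, hits, n_boot=2000, seed=0):
    """Estimate p(R > 0), where R is the negated Spearman correlation.

    outcome_quality: dict mapping subpopulation -> score, higher = better
        health outcomes (assumption of this sketch).
    hits: dict mapping subpopulation -> list of 0/1 flags, 1 meaning the
        model's top-3 predictions agreed with the reference diagnosis.
    """
    rng = random.Random(seed)
    groups = sorted(outcome_quality)
    quality = [outcome_quality[g] for g in groups]
    positive = 0
    for _ in range(n_boot):
        performance = []
        for g in groups:
            # Resample this subpopulation's cases with replacement.
            sample = rng.choices(hits[g], k=len(hits[g]))
            performance.append(sum(sample) / len(sample))
        r = -spearman(quality, performance)
        if r > 0:
            positive += 1
    return positive / n_boot
```

Under these assumptions, a value near 1 means the resampled performance is almost always anticorrelated with outcome quality, that is, the model tends to perform better for subpopulations with worse health outcomes.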

Findings

Across all dermatologic conditions, the HEAL metric was 80.5% for prioritizing AI performance of racial/ethnic subpopulations based on YLLs, and 92.1% and 0.0% respectively for prioritizing AI performance of sex and age subpopulations based on DALYs. Certain dermatologic conditions were significantly associated with greater AI model performance compared to a reference category of less common conditions. For skin cancer conditions, the HEAL metric was 73.8% for prioritizing AI performance of age subpopulations based on DALYs.

Interpretation

Analysis using the proposed HEAL framework showed that the dermatology AI model prioritised performance for race/ethnicity, sex (all conditions) and age (cancer conditions) subpopulations with respect to pre-existing health disparities. More work is needed to investigate ways of promoting equitable AI performance across age for non-cancer conditions and to better understand how AI models can contribute towards improving equity in health outcomes.

Funding

Google LLC.

Meissen F, Breuer S, Knolle M, Buyx A, Müller R, Kaissis G, Wiestler B, Rückert D. (Predictable) performance bias in unsupervised anomaly detection. EBioMedicine 2024;101:105002. [PMID: 38335791 PMCID: PMC10873649 DOI: 10.1016/j.ebiom.2024.105002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Revised: 01/23/2024] [Accepted: 01/24/2024] [Indexed: 02/12/2024]

Affiliation(s)

Felix Meissen: Chair for AI in Healthcare and Medicine, Klinikum rechts der Isar der Technischen Universität München, Einsteinstr. 25, Munich, 81675, Germany.
Svenja Breuer: Department of Science, Technology and Society, School of Social Sciences and Technology, and Technical University of Munich, Arcisstr. 21, Munich, 80333, Germany; Department of Economics and Policy, School of Management, Technical University of Munich, Arcisstraße 21, 80333, Munich, Germany.
Moritz Knolle: Chair for AI in Healthcare and Medicine, Klinikum rechts der Isar der Technischen Universität München, Einsteinstr. 25, Munich, 81675, Germany; Konrad Zuse School of Excellence in Reliable AI, Munich Data Science Institute (MDSI), Walther-von-Dyck-Str. 10, Garching, 85748, Germany.
Alena Buyx: Department of Science, Technology and Society, School of Social Sciences and Technology, and Technical University of Munich, Arcisstr. 21, Munich, 80333, Germany; Institute for History and Ethics of Medicine, School of Medicine, Technical University of Munich, Prinzregentenstraße 68, Munich, 81675, Germany.
Ruth Müller: Department of Science, Technology and Society, School of Social Sciences and Technology, and Technical University of Munich, Arcisstr. 21, Munich, 80333, Germany; Department of Economics and Policy, School of Management, Technical University of Munich, Arcisstraße 21, 80333, Munich, Germany.
Georgios Kaissis: Chair for AI in Healthcare and Medicine, Klinikum rechts der Isar der Technischen Universität München, Einsteinstr. 25, Munich, 81675, Germany; Institute for Machine Learning in Biomedical Imaging, Helmholtz Munich, Ingolstädter Landstraße 1, 85764, Neuherberg, Germany; Department of Computing, Imperial College London, London, SW7 2AZ, UK.
Benedikt Wiestler: Department of Diagnostic and Interventional Neuroradiology, Klinikum rechts der Isar, Ismaninger Str. 22, Munich, 81675, Germany; TranslaTUM, Center for Translational Cancer Research, Technical University of Munich, Ismaninger Str. 22, Munich, 81675, Germany.
Daniel Rückert: Chair for AI in Healthcare and Medicine, Klinikum rechts der Isar der Technischen Universität München, Einsteinstr. 25, Munich, 81675, Germany; Department of Computing, Imperial College London, London, SW7 2AZ, UK.
