1
McLennan S, Fiske A, Celi LA. Building a house without foundations? A 24-country qualitative interview study on artificial intelligence in intensive care medicine. BMJ Health Care Inform 2024; 31:e101052. [PMID: 38642921 PMCID: PMC11033632 DOI: 10.1136/bmjhci-2024-101052] [Received: 02/15/2024] [Accepted: 04/08/2024] [Indexed: 04/22/2024] [Open access]
Abstract
OBJECTIVES To explore the views of intensive care professionals in high-income countries (HICs) and lower-to-middle-income countries (LMICs) regarding the use and implementation of artificial intelligence (AI) technologies in intensive care units (ICUs). METHODS Individual semi-structured qualitative interviews were conducted between December 2021 and August 2022 with 59 intensive care professionals from 24 countries. Transcripts were analysed using conventional content analysis. RESULTS Participants had generally positive views about the potential use of AI in ICUs but also reported some well-known concerns about the use of AI in clinical practice and important technical and non-technical barriers to the implementation of AI. Important differences existed between ICUs regarding their current readiness to implement AI. However, these differences were not primarily between HICs and LMICs, but between a small number of ICUs in large tertiary hospitals in HICs, which were reported to have the necessary digital infrastructure for AI, and nearly all other ICUs in both HICs and LMICs, which were reported to neither have the technical capability to capture the necessary data or use AI, nor the staff with the right knowledge and skills to use the technology. CONCLUSION Pouring massive amounts of resources into developing AI without first building the necessary digital infrastructure foundation needed for AI is unethical. Real-world implementation and routine use of AI in the vast majority of ICUs in both HICs and LMICs included in our study is unlikely to occur any time soon. ICUs should not be using AI until certain preconditions are met.
Affiliation(s)
- Stuart McLennan
- Institute of History and Ethics in Medicine, Department of Preclinical Medicine, TUM School of Medicine and Health, Technical University of Munich, Munich, Bavaria, Germany
- Institute for Biomedical Ethics, University of Basel, Basel, Switzerland
| | - Amelia Fiske
- Institute of History and Ethics in Medicine, Department of Preclinical Medicine, TUM School of Medicine and Health, Technical University of Munich, Munich, Bavaria, Germany
| | - Leo Anthony Celi
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA 02215, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
2
Collins GS, Moons KGM, Dhiman P, Riley RD, Beam AL, Van Calster B, Ghassemi M, Liu X, Reitsma JB, van Smeden M, Boulesteix AL, Camaradou JC, Celi LA, Denaxas S, Denniston AK, Glocker B, Golub RM, Harvey H, Heinze G, Hoffman MM, Kengne AP, Lam E, Lee N, Loder EW, Maier-Hein L, Mateen BA, McCradden MD, Oakden-Rayner L, Ordish J, Parnell R, Rose S, Singh K, Wynants L, Logullo P. TRIPOD+AI statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ 2024; 385:e078378. [PMID: 38626948 PMCID: PMC11019967 DOI: 10.1136/bmj-2023-078378] [Accepted: 01/17/2024] [Indexed: 04/19/2024]
Affiliation(s)
- Gary S Collins
- Centre for Statistics in Medicine, UK EQUATOR Centre, Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
| | - Karel G M Moons
- Julius Centre for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
| | - Paula Dhiman
- Centre for Statistics in Medicine, UK EQUATOR Centre, Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
| | - Richard D Riley
- Institute of Applied Health Research, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- National Institute for Health and Care Research (NIHR) Birmingham Biomedical Research Centre, Birmingham, UK
| | - Andrew L Beam
- Department of Epidemiology, Harvard T H Chan School of Public Health, Boston, MA, USA
| | - Ben Van Calster
- Department of Development and Regeneration, KU Leuven, Leuven, Belgium
- Department of Biomedical Data Science, Leiden University Medical Centre, Leiden, Netherlands
| | - Marzyeh Ghassemi
- Department of Electrical Engineering and Computer Science, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Xiaoxuan Liu
- Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
| | - Johannes B Reitsma
- Julius Centre for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
| | - Maarten van Smeden
- Julius Centre for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, Netherlands
| | - Anne-Laure Boulesteix
- Department of Medical Information Processing, Biometry and Epidemiology, Ludwig-Maximilians-University of Munich, Munich, Germany
| | - Jennifer Catherine Camaradou
- Patient representative, Health Data Research UK patient and public involvement and engagement group
- Patient representative, University of East Anglia, Faculty of Health Sciences, Norwich Research Park, Norwich, UK
| | - Leo Anthony Celi
- Beth Israel Deaconess Medical Center, Boston, MA, USA
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Biostatistics, Harvard T H Chan School of Public Health, Boston, MA, USA
| | - Spiros Denaxas
- Institute of Health Informatics, University College London, London, UK
- British Heart Foundation Data Science Centre, London, UK
| | - Alastair K Denniston
- National Institute for Health and Care Research (NIHR) Birmingham Biomedical Research Centre, Birmingham, UK
- Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
| | - Ben Glocker
- Department of Computing, Imperial College London, London, UK
| | - Robert M Golub
- Northwestern University Feinberg School of Medicine, Chicago, IL, USA
| | | | - Georg Heinze
- Section for Clinical Biometrics, Centre for Medical Data Science, Medical University of Vienna, Vienna, Austria
| | - Michael M Hoffman
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
- Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Vector Institute for Artificial Intelligence, Toronto, ON, Canada
| | | | - Emily Lam
- Patient representative, Health Data Research UK patient and public involvement and engagement group
| | - Naomi Lee
- National Institute for Health and Care Excellence, London, UK
| | - Elizabeth W Loder
- The BMJ, London, UK
- Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Lena Maier-Hein
- Department of Intelligent Medical Systems, German Cancer Research Centre, Heidelberg, Germany
| | - Bilal A Mateen
- Institute of Health Informatics, University College London, London, UK
- Wellcome Trust, London, UK
- Alan Turing Institute, London, UK
| | - Melissa D McCradden
- Department of Bioethics, Hospital for Sick Children, Toronto, ON, Canada
- Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, Canada
| | - Lauren Oakden-Rayner
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, SA, Australia
| | - Johan Ordish
- Medicines and Healthcare products Regulatory Agency, London, UK
| | - Richard Parnell
- Patient representative, Health Data Research UK patient and public involvement and engagement group
| | - Sherri Rose
- Department of Health Policy and Center for Health Policy, Stanford University, Stanford, CA, USA
| | - Karandeep Singh
- Department of Epidemiology, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, Netherlands
| | - Laure Wynants
- Department of Epidemiology, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, Netherlands
| | - Patricia Logullo
- Centre for Statistics in Medicine, UK EQUATOR Centre, Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
3
Teotia K, Jia Y, Link Woite N, Celi LA, Matos J, Struja T. Variation in monitoring: Glucose measurement in the ICU as a case study to preempt spurious correlations. J Biomed Inform 2024; 153:104643. [PMID: 38621640 DOI: 10.1016/j.jbi.2024.104643] [Received: 09/30/2023] [Revised: 03/29/2024] [Accepted: 04/12/2024] [Indexed: 04/17/2024]
Abstract
OBJECTIVE Health inequities can be influenced by demographic factors such as race and ethnicity, proficiency in English, and biological sex. Disparities may manifest as a differential likelihood of testing, which correlates directly with the likelihood of an intervention to address an abnormal finding. Our retrospective observational study evaluated the presence of variation in glucose measurements in the intensive care unit (ICU). METHODS Using the MIMIC-IV database (2008-2019) from a single-center academic referral hospital in Boston (USA), we identified adult patients meeting Sepsis-3 criteria. Exclusion criteria were diabetic ketoacidosis, ICU length of stay under 1 day, and unknown race or ethnicity. We performed a logistic regression analysis to assess differential likelihoods of glucose measurements on day 1. A negative binomial regression was fitted to assess the frequency of subsequent glucose readings. Analyses were adjusted for relevant clinical confounders and performed across three disparity proxy axes: race and ethnicity, sex, and English proficiency. RESULTS We studied 24,927 patients, of whom 19.5% represented racial and ethnic minority groups, 42.4% were female, and 9.8% had limited English proficiency. No significant differences were found for glucose measurement on day 1 in the ICU. This pattern was consistent irrespective of the axis of analysis, i.e. race and ethnicity, sex, or English proficiency. Conversely, subsequent measurement frequency revealed potential disparities. Specifically, males (incidence rate ratio (IRR) 1.06, 95% confidence interval (CI) 1.01 - 1.21), patients who identified as Hispanic (IRR 1.11, 95% CI 1.01 - 1.21) or Black (IRR 1.06, 95% CI 1.01 - 1.12), and English-proficient patients (IRR 1.08, 95% CI 1.01 - 1.15) had higher rates of subsequent glucose readings. CONCLUSION We found disparities in ICU glucose measurements among patients with sepsis, although the magnitude was small. Variation in disease monitoring is a source of data bias that may lead to spurious correlations when modeling health data.
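For readers unfamiliar with how an incidence rate ratio (IRR) and its confidence interval are usually derived from a count model such as the negative binomial regression named in the abstract above, the following is a minimal sketch. It assumes a Wald-type interval on the log scale; the function name `irr_with_ci` and the coefficient and standard error values are hypothetical illustrations, not values or code from the study.

```python
import math


def irr_with_ci(beta: float, se: float, z: float = 1.96):
    """Convert a log-scale count-model coefficient into an IRR with a Wald CI.

    beta: regression coefficient on the log scale (hypothetical input)
    se:   standard error of beta
    z:    normal quantile; 1.96 corresponds to a 95% interval
    """
    irr = math.exp(beta)              # point estimate of the rate ratio
    lower = math.exp(beta - z * se)   # lower bound, back-transformed
    upper = math.exp(beta + z * se)   # upper bound, back-transformed
    return irr, lower, upper


# Hypothetical coefficient and standard error, NOT taken from the study.
irr, lo, hi = irr_with_ci(beta=0.10, se=0.04)
print(f"IRR {irr:.2f} (95% CI {lo:.2f} - {hi:.2f})")
```

An IRR above 1 with a lower bound above 1 indicates a higher measurement rate in the group of interest than in the reference group, which is how the intervals reported in the abstract are read.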
Affiliation(s)
- Khushboo Teotia
- Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA.
| | - Yueran Jia
- Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Naira Link Woite
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
| | - Leo Anthony Celi
- Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA; Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA.
| | - João Matos
- Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA; Faculty of Engineering, University of Porto (FEUP), Porto, Portugal; Institute for Systems and Computer Engineering, Technology and Science (INESCTEC), Porto, Portugal.
| | - Tristan Struja
- Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA; Medical University Clinic, Kantonsspital Aarau, Aarau, Switzerland.
4
Iqbal U, Lee LTJ, Rahmanti AR, Celi LA, Li YCJ. Can large language models provide secondary reliable opinion on treatment options for dermatological diseases? J Am Med Inform Assoc 2024:ocae067. [PMID: 38578616 DOI: 10.1093/jamia/ocae067] [Received: 01/25/2024] [Revised: 02/26/2024] [Accepted: 03/27/2024] [Indexed: 04/06/2024] [Open access]
Abstract
OBJECTIVE To investigate the consistency and reliability of medication recommendations provided by ChatGPT for common dermatological conditions, highlighting the potential for ChatGPT to offer second opinions in patient treatment while also delineating possible limitations. MATERIALS AND METHODS In this mixed-methods study, we used survey questions in April 2023 for drug recommendations generated by ChatGPT with data from secondary databases, that is, Taiwan's National Health Insurance Research Database and a US medical center database, and validated by dermatologists. The methodology included preprocessing queries, executing them multiple times, and evaluating ChatGPT responses against the databases and dermatologists. The ChatGPT-generated responses were analyzed statistically in a disease-drug matrix, considering disease-medication associations (Q-value) and expert evaluation. RESULTS ChatGPT achieved a high 98.87% dermatologist approval rate for common dermatological medication recommendations. We evaluated its drug suggestions using the Q-value, showing that human expert validation agreement surpassed Q-value cutoff-based agreement. Varying cutoff values for disease-medication associations, a cutoff of 3 achieved 95.14% accurate prescriptions, 5 yielded 85.42%, and 10 resulted in 72.92%. While ChatGPT offered accurate drug advice, it occasionally included incorrect ATC codes, leading to issues like incorrect drug use and type, nonexistent codes, repeated errors, and incomplete medication codes. CONCLUSION ChatGPT provides medication recommendations as a second opinion in dermatology treatment, but its reliability and comprehensiveness need refinement for greater accuracy. In the future, integrating a medical domain-specific knowledge base for training and ongoing optimization will enhance the precision of ChatGPT's results.
Affiliation(s)
- Usman Iqbal
- School of Population Health, Faculty of Medicine and Health, University of New South Wales (UNSW), Sydney, NSW 2052, Australia
- Department of Health, Tasmania 7000, Australia
- Global Health and Health Security Department, College of Public Health, Taipei Medical University, Taipei 110, Taiwan
| | - Leon Tsung-Ju Lee
- Graduate Institute of Clinical Medicine, Taipei Medical University, Taipei 110, Taiwan
- Department of Dermatology, Taipei Medical University Hospital, Taipei Medical University, Taipei 110, Taiwan
- Department of Dermatology, School of Medicine, College of Medicine, Taipei Medical University, Taipei 110, Taiwan
| | - Annisa Ristya Rahmanti
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan
- International Center for Health Information and Technology, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan
- Department of Health Policy and Management, Faculty of Medicine, Public Health and Nursing, Universitas Gadjah Mada, Yogyakarta 55281, Indonesia
| | - Leo Anthony Celi
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA 02139, United States
- Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA 02215, United States
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, United States
| | - Yu-Chuan Jack Li
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan
- International Center for Health Information and Technology, College of Medical Science and Technology, Taipei Medical University, Taipei 110, Taiwan
- Department of Dermatology, Taipei Municipal Wanfang Hospital, Taipei Medical University, Taipei 116, Taiwan
- The International Medical Informatics Association (IMIA), Genève CH-1204, Switzerland
5
Hassan A, Critelli B, Lahooti I, Lahooti A, Matzko N, Adams JN, Liss L, Quion J, Restrepo D, Nikahd M, Culp S, Noh L, Tong K, Park JS, Akshintala V, Windsor JA, Mull NK, Papachristou GI, Celi LA, Lee PJ. Critical appraisal of machine learning prognostic models for acute pancreatitis: protocol for a systematic review. Diagn Progn Res 2024; 8:6. [PMID: 38561864 PMCID: PMC10986113 DOI: 10.1186/s41512-024-00169-1] [Received: 10/27/2023] [Accepted: 02/15/2024] [Indexed: 04/04/2024] [Open access]
Abstract
Acute pancreatitis (AP) is an acute inflammatory disorder that is common, costly, and increasing in incidence worldwide, with over 300,000 hospitalizations occurring yearly in the United States alone. As its course and outcomes vary widely, a critical knowledge gap in the field has been a lack of accurate prognostic tools to forecast AP patients' outcomes. Despite several published studies in the last three decades, the predictive performance of published prognostic models has been found to be suboptimal. Recently, non-regression machine learning (ML) models have garnered intense interest in medicine for their potential for better predictive performance. Each year, an increasing number of AP models are being published. However, their methodologic quality relating to transparent reporting and risk of bias in study design has never been systematically appraised. Therefore, through collaboration between a group of clinicians and data scientists with appropriate content expertise, we will perform a systematic review of papers published between January 2021 and December 2023 containing artificial intelligence prognostic models in AP. To systematically assess these studies, the authors will leverage the CHARMS checklist, the PROBAST tool for risk of bias assessment, and the most current version of TRIPOD-AI. Research Registry (http://www.reviewregistry1727).
Affiliation(s)
- Amier Hassan
- Division of Gastroenterology and Hepatology, Weill Cornell Medical College, New York, USA
| | - Brian Critelli
- Division of Gastroenterology and Hepatology, Weill Cornell Medical College, New York, USA
| | - Ila Lahooti
- Division of Gastroenterology and Hepatology, Ohio State University Wexner Medical Center, Columbus, OH, USA
| | - Ali Lahooti
- Division of Gastroenterology and Hepatology, Weill Cornell Medical College, New York, USA
| | - Nate Matzko
- Division of Gastroenterology and Hepatology, Weill Cornell Medical College, New York, USA
| | - Jan Niklas Adams
- Division of Process and Data Science, Rheinisch-Westfälische Technische Hochschule Aachen University, Aachen, Germany
| | - Lukas Liss
- Division of Process and Data Science, Rheinisch-Westfälische Technische Hochschule Aachen University, Aachen, Germany
| | - Justin Quion
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, USA
| | - David Restrepo
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, USA
| | - Melica Nikahd
- Division of Bioinformatics, Ohio State University Wexner Medical Center, Columbus, USA
| | - Stacey Culp
- Division of Bioinformatics, Ohio State University Wexner Medical Center, Columbus, USA
| | - Lydia Noh
- Northeast Ohio Medical School, Rootstown, USA
| | - Kathleen Tong
- Division of Gastroenterology and Hepatology, Ohio State University Wexner Medical Center, Columbus, OH, USA
| | - Jun Sung Park
- Division of Gastroenterology and Hepatology, Ohio State University Wexner Medical Center, Columbus, OH, USA
| | - Venkata Akshintala
- Division of Gastroenterology, Johns Hopkins Medical Center, Baltimore, USA
| | - John A Windsor
- Department of Surgery, University of Auckland, Auckland, New Zealand
| | - Nikhil K Mull
- Division of Hospital Medicine and Penn Medicine Center for Evidence-based Practice, University of Pennsylvania, Philadelphia, USA
| | - Georgios I Papachristou
- Division of Gastroenterology and Hepatology, Ohio State University Wexner Medical Center, Columbus, OH, USA
| | - Leo Anthony Celi
- Department of Surgery, University of Auckland, Auckland, New Zealand
- Division of Critical Care, Beth Israel Medical Center, Boston, USA
| | - Peter J Lee
- Division of Gastroenterology and Hepatology, Ohio State University Wexner Medical Center, Columbus, OH, USA.
6
Armoundas AA, Narayan SM, Arnett DK, Spector-Bagdady K, Bennett DA, Celi LA, Friedman PA, Gollob MH, Hall JL, Kwitek AE, Lett E, Menon BK, Sheehan KA, Al-Zaiti SS. Use of Artificial Intelligence in Improving Outcomes in Heart Disease: A Scientific Statement From the American Heart Association. Circulation 2024; 149:e1028-e1050. [PMID: 38415358 PMCID: PMC11042786 DOI: 10.1161/cir.0000000000001201] [Indexed: 02/29/2024]
Abstract
A major focus of academia, industry, and global governmental agencies is to develop and apply artificial intelligence and other advanced analytical tools to transform health care delivery. The American Heart Association supports the creation of tools and services that would further the science and practice of precision medicine by enabling more precise approaches to cardiovascular and stroke research, prevention, and care of individuals and populations. Nevertheless, several challenges exist, and few artificial intelligence tools have been shown to improve cardiovascular and stroke care sufficiently to be widely adopted. This scientific statement outlines the current state of the art on the use of artificial intelligence algorithms and data science in the diagnosis, classification, and treatment of cardiovascular disease. It also sets out to advance this mission, focusing on how digital tools and, in particular, artificial intelligence may provide clinical and mechanistic insights, address bias in clinical studies, and facilitate education and implementation science to improve cardiovascular and stroke outcomes. Last, a key objective of this scientific statement is to further the field by identifying best practices, gaps, and challenges for interested stakeholders.
7
Restrepo D, Quion JM, Do Carmo Novaes F, Azevedo Costa ID, Vasquez C, Bautista AN, Quiminiano E, Lim PA, Mwavu R, Celi LA, Nakayama LF. Ophthalmology Optical Coherence Tomography Databases for Artificial Intelligence Algorithm: A Review. Semin Ophthalmol 2024; 39:193-200. [PMID: 38334303 DOI: 10.1080/08820538.2024.2308248] [Received: 10/05/2023] [Accepted: 12/28/2023] [Indexed: 02/10/2024]
Abstract
BACKGROUND Imaging plays a pivotal role in eye assessment. With the introduction of advanced machine learning and artificial intelligence (AI), the focus has shifted to imaging datasets in ophthalmology. While disparities and health inequalities hidden within data are well-documented, the ophthalmology field faces specific challenges to the creation and maintenance of datasets. Optical Coherence Tomography (OCT) is useful for the diagnosis and monitoring of retinal pathologies, making it valuable for AI applications. This review aims to identify and compare the landscape of publicly available optical coherence tomography databases for AI applications. METHODS We conducted a literature review on OCT and AI articles with publicly accessible datasets, using PubMed, Scopus, and Web of Science databases. The review retrieved 183 articles, and after full-text analysis, 50 articles were included. From the included articles, 8 publicly available OCT datasets were identified, and the review focused on their patient demographics and clinical details for thorough assessment and comparison. RESULTS The resulting datasets encompass 154,313 images collected from Spectralis, Cirrus HD, Topcon 3D, and Bioptigen devices. These datasets included normal exams, age-related macular degeneration, and diabetic maculopathy, among others. Comprehensive demographic information is available in one dataset, and the USA is the most represented population. DISCUSSION Current publicly available OCT databases for AI applications exhibit limitations, stemming from their non-representative nature and the lack of comprehensive demographic information. Limited datasets hamper research and equitable AI development. To promote equitable AI algorithmic development in ophthalmology, there is a need for the creation and dissemination of more representative datasets.
Affiliation(s)
- David Restrepo
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Telematics Department, University of Cauca, Popayan, Colombia
| | - Justin Michael Quion
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Frederico Do Carmo Novaes
- Department of Ophthalmology, São Paulo Federal University, São Paulo, Brazil; Scientific Image Analysis Lab, Integrative Biology Program, Biomedical Sciences Institute (ICBM), Faculty of Medicine, Universidad de Chile, Santiago, Chile
| | - Iago Diogenes Azevedo Costa
- Department of Ophthalmology, São Paulo Federal University, São Paulo, Brazil; Scientific Image Analysis Lab, Integrative Biology Program, Biomedical Sciences Institute (ICBM), Faculty of Medicine, Universidad de Chile, Santiago, Chile
- Department of Ophthalmology, São Paulo Federal University, São Paulo Brazil
| | - Constanza Vasquez
- Department of Medicine, Instituto Politécnico Nacional, Escuela Superior de Medicina, Ciudad de México, Mexico
| | - Alyssa Nicole Bautista
- Department of Medicine, University of the East Ramon Magsaysay Memorial Medical Center Inc, Quezon, Philippines
| | - Ellaine Quiminiano
- Department of Medicine, University of the East Ramon Magsaysay Memorial Medical Center Inc, Quezon, Philippines
| | | | - Roger Mwavu
- Department of Information Technology, Mbarara University of Science and Technology, Mbarara, Uganda
| | - Leo Anthony Celi
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Luis Filipe Nakayama
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Ophthalmology, São Paulo Federal University, São Paulo, Brazil; Scientific Image Analysis Lab, Integrative Biology Program, Biomedical Sciences Institute (ICBM), Faculty of Medicine, Universidad de Chile, Santiago, Chile
8
Holste G, Zhou Y, Wang S, Jaiswal A, Lin M, Zhuge S, Yang Y, Kim D, Nguyen-Mau TH, Tran MT, Jeong J, Park W, Ryu J, Hong F, Verma A, Yamagishi Y, Kim C, Seo H, Kang M, Celi LA, Lu Z, Summers RM, Shih G, Wang Z, Peng Y. Towards long-tailed, multi-label disease classification from chest X-ray: Overview of the CXR-LT challenge. ArXiv 2024:arXiv:2310.16112v2. [PMID: 37986726 PMCID: PMC10659524] [Indexed: 11/22/2023]
Abstract
Many real-world image recognition problems, such as diagnostic medical imaging exams, are "long-tailed" - there are a few common findings followed by many more relatively rare conditions. In chest radiography, diagnosis is both a long-tailed and multi-label problem, as patients often present with multiple findings simultaneously. While researchers have begun to study the problem of long-tailed learning in medical image recognition, few have studied the interaction of label imbalance and label co-occurrence posed by long-tailed, multi-label disease classification. To engage with the research community on this emerging topic, we conducted an open challenge, CXR-LT, on long-tailed, multi-label thorax disease classification from chest X-rays (CXRs). We publicly release a large-scale benchmark dataset of over 350,000 CXRs, each labeled with at least one of 26 clinical findings following a long-tailed distribution. We synthesize common themes of top-performing solutions, providing practical recommendations for long-tailed, multi-label medical image classification. Finally, we use these insights to propose a path forward involving vision-language foundation models for few- and zero-shot disease classification.
Affiliation(s)
- Gregory Holste
- Department of Electrical and Computer Engineering, The University of Texas at Austin, 78712, Austin, TX USA
| | - Yiliang Zhou
- Department of Population Health Sciences, Weill Cornell Medicine, 10065, New York, NY USA
| | - Song Wang
- Department of Electrical and Computer Engineering, The University of Texas at Austin, 78712, Austin, TX USA
| | - Ajay Jaiswal
- Department of Electrical and Computer Engineering, The University of Texas at Austin, 78712, Austin, TX USA
| | - Mingquan Lin
- Department of Population Health Sciences, Weill Cornell Medicine, 10065, New York, NY USA
| | - Sherry Zhuge
- School of Information Systems, Carnegie Mellon University, 15213, Pittsburgh, PA USA
| | - Yuzhe Yang
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 02139, Cambridge, MA USA
| | - Dongkyun Kim
- School of Computer Science, Carnegie Mellon University, 15213, Pittsburgh, PA USA
| | | | - Minh-Triet Tran
- University of Science, VNU-HCM, 70000, Ho Chi Minh City, Vietnam
| | - Jaehyup Jeong
- KT Research & Development Center, KT Corporation, 06763, Seoul, South Korea
| | - Wongi Park
- Department of Software and Computer Engineering, Ajou University, 16499, Suwon, South Korea
| | - Jongbin Ryu
- Department of Software and Computer Engineering, Ajou University, 16499, Suwon, South Korea
| | - Feng Hong
- Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, 200240, Shanghai, China
| | - Arsh Verma
- Wadhwani Institute for Artificial Intelligence, 400079, Mumbai, India
| | - Yosuke Yamagishi
- Division of Radiology and Biomedical Engineering, Graduate School of Medicine, The University of Tokyo, 113-0033, Tokyo, Japan
| | - Changhyun Kim
- BioMedical AI Team, AIX Future R&D Center, SK Telecom, 04539, Seoul, South Korea
| | - Hyeryeong Seo
- Interdisciplinary Program in AI (IPAI), Seoul National University, 02504, Seoul, South Korea
| | - Myungjoo Kang
- Department of Mathematical Sciences, Seoul National University, 02504, Seoul, South Korea
| | - Leo Anthony Celi
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, 02139, Cambridge, MA USA
- Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, 02215, Boston, MA USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, 02115, Boston, MA USA
| | - Zhiyong Lu
- National Center for Biotechnology Information, National Library of Medicine, 20894, Bethesda, MD USA
| | - Ronald M. Summers
- Clinical Center, National Institutes of Health, 20892, Bethesda, MD USA
| | - George Shih
- Department of Radiology, Weill Cornell Medicine, 10065, New York, NY USA
| | - Zhangyang Wang
- Department of Electrical and Computer Engineering, The University of Texas at Austin, 78712, Austin, TX USA
| | - Yifan Peng
- Department of Population Health Sciences, Weill Cornell Medicine, 10065, New York, NY USA
| |
Collapse
|
9
|
Balagopalan A, Baldini I, Celi LA, Gichoya J, McCoy LG, Naumann T, Shalit U, van der Schaar M, Wagstaff KL. Machine learning for healthcare that matters: Reorienting from technical novelty to equitable impact. PLOS Digit Health 2024; 3:e0000474. [PMID: 38620047 PMCID: PMC11018283 DOI: 10.1371/journal.pdig.0000474] [Received: 09/06/2023] [Accepted: 02/18/2024] [Indexed: 04/17/2024]
Abstract
Despite significant technical advances in machine learning (ML) over the past several years, the tangible impact of this technology in healthcare has been limited. This is due not only to the particular complexities of healthcare, but also to structural issues in the machine learning for healthcare (MLHC) community which broadly reward technical novelty over tangible, equitable impact. We structure our work as a healthcare-focused echo of the 2012 paper "Machine Learning that Matters", which highlighted such structural issues in the ML community at large, and offered a series of clearly defined "Impact Challenges" to which the field should orient itself. Drawing on the expertise of a diverse and international group of authors, we engage in a narrative review and examine issues in the research background environment, training processes, evaluation metrics, and deployment protocols which act to limit the real-world applicability of MLHC. Broadly, we seek to distinguish between machine learning ON healthcare data and machine learning FOR healthcare: the former sees healthcare as merely a source of interesting technical challenges, while the latter regards ML as a tool in service of meeting tangible clinical needs. We offer specific recommendations for a series of stakeholders in the field, from ML researchers and clinicians, to the institutions in which they work, and the governments which regulate their data access.
Affiliation(s)
- Aparna Balagopalan
  - Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology; Cambridge, Massachusetts, United States of America
- Ioana Baldini
  - IBM Research; Yorktown Heights, New York, United States of America
- Leo Anthony Celi
  - Laboratory for Computational Physiology, Massachusetts Institute of Technology; Cambridge, Massachusetts, United States of America
  - Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center; Boston, Massachusetts, United States of America
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health; Boston, Massachusetts, United States of America
- Judy Gichoya
  - Department of Radiology and Imaging Sciences, School of Medicine, Emory University; Atlanta, Georgia, United States of America
- Liam G. McCoy
  - Division of Neurology, Department of Medicine, University of Alberta; Edmonton, Alberta, Canada
- Tristan Naumann
  - Microsoft Research; Redmond, Washington, United States of America
- Uri Shalit
  - The Faculty of Data and Decision Sciences, Technion; Haifa, Israel
- Mihaela van der Schaar
  - Department of Applied Mathematics and Theoretical Physics, University of Cambridge; Cambridge, United Kingdom
  - The Alan Turing Institute; London, United Kingdom
10
Ellen JG, Matos J, Viola M, Gallifant J, Quion J, Celi LA, Abu Hussein NS. Participant flow diagrams for health equity in AI. J Biomed Inform 2024; 152:104631. [PMID: 38548006 DOI: 10.1016/j.jbi.2024.104631] [Received: 09/30/2023] [Revised: 12/29/2023] [Accepted: 03/26/2024] [Indexed: 04/01/2024]
Abstract
Selection bias can arise through many aspects of a study, including recruitment, inclusion/exclusion criteria, input-level exclusion and outcome-level exclusion, and often reflects the underrepresentation of populations historically disadvantaged in medical research. The effects of selection bias can be further amplified when non-representative samples are used in artificial intelligence (AI) and machine learning (ML) applications to construct clinical algorithms. Building on the "Data Cards" initiative for transparency in AI research, we advocate for the addition of a participant flow diagram for AI studies detailing relevant sociodemographic and/or clinical characteristics of excluded participants across study phases, with the goal of identifying potential algorithmic biases before their clinical implementation. We include both a model for this flow diagram as well as a brief case study explaining how it could be implemented in practice. Through standardized reporting of participant flow diagrams, we aim to better identify potential inequities embedded in AI applications, facilitating more reliable and equitable clinical algorithms.
Affiliation(s)
- João Matos
  - Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA; Faculty of Engineering, University of Porto, Porto, Portugal; Institute for Systems and Computer Engineering, Technology and Science (INESCTEC), Porto, Portugal
- Jack Gallifant
  - Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA; Department of Critical Care, Guy's and St Thomas' NHS Trust, London, United Kingdom
- Justin Quion
  - University of the East Ramon Magsaysay Memorial Medical School, Quezon City, Philippines
- Leo Anthony Celi
  - Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA; Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
11
Wang R, Kuo PC, Chen LC, Seastedt KP, Gichoya JW, Celi LA. Drop the shortcuts: image augmentation improves fairness and decreases AI detection of race and other demographics from medical images. EBioMedicine 2024; 102:105047. [PMID: 38471396 PMCID: PMC10945176 DOI: 10.1016/j.ebiom.2024.105047] [Received: 12/04/2023] [Revised: 02/15/2024] [Accepted: 02/21/2024] [Indexed: 03/14/2024]
Abstract
BACKGROUND It has been shown that AI models can learn race from medical images, leading to algorithmic bias. Our aim in this study was to enhance the fairness of medical image models by eliminating bias related to race, age, and sex. We hypothesise that models may be learning demographics via shortcut learning and combat this using image augmentation. METHODS This study included 44,953 patients who identified as Asian, Black, or White (mean age, 60.68 years ±18.21; 23,499 women) for a total of 194,359 chest X-rays (CXRs) from the MIMIC-CXR database. CheXpert images, comprising 45,095 patients (mean age 63.10 years ±18.14; 20,437 women) and a total of 134,300 CXRs, were used for external validation. We also collected 1195 3D brain magnetic resonance imaging (MRI) scans from the ADNI database, which included 273 participants with an average age of 76.97 years ±14.22, of whom 142 were female. Deep learning (DL) models were trained on either non-augmented or augmented images and assessed using disparity metrics. The features learned by the models were analysed using task transfer experiments and model visualisation techniques. FINDINGS In the detection of radiological findings, training a model using augmented CXR images was shown to reduce disparities in error rate among racial groups (-5.45%), age groups (-13.94%), and sex (-22.22%). For Alzheimer's disease (AD) detection, the model trained with augmented MRI images showed 53.11% and 31.01% reductions in error-rate disparities among age and sex groups, respectively. Image augmentation led to a reduction in the model's ability to identify demographic attributes and resulted in the model trained for clinical purposes incorporating fewer demographic features. INTERPRETATION The model trained using the augmented images was less likely to be influenced by demographic information in detecting image labels. These results demonstrate that the proposed augmentation scheme could enhance the fairness of interpretations by DL models when dealing with data from patients with different demographic backgrounds. FUNDING National Science and Technology Council (Taiwan), National Institutes of Health.
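The abstract reports disparities in error rate across demographic groups without defining the metric. As a minimal sketch, assuming the disparity is the gap between the highest and lowest per-group error rates (an assumption for illustration, not necessarily the authors' definition), the calculation could look like this. The function names and the toy labels are hypothetical.

```python
from collections import defaultdict


def error_rate_by_group(y_true, y_pred, groups):
    """Return {group: fraction of misclassified samples in that group}."""
    errors = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        totals[group] += 1
        errors[group] += int(truth != pred)
    return {g: errors[g] / totals[g] for g in totals}


def error_rate_disparity(y_true, y_pred, groups):
    """Gap between the worst and best per-group error rates (assumed metric)."""
    rates = error_rate_by_group(y_true, y_pred, groups)
    return max(rates.values()) - min(rates.values())


# Toy, made-up labels for two hypothetical groups "A" and "B".
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = error_rate_by_group(y_true, y_pred, groups)
gap = error_rate_disparity(y_true, y_pred, groups)
```

Comparing `gap` for a model trained with and without augmentation would give a relative change of the kind the abstract reports, under the stated assumption about the metric.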
Affiliation(s)
- Ryan Wang
  - Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
- Po-Chih Kuo
  - Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan.
- Li-Ching Chen
  - Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
- Kenneth Patrick Seastedt
  - Department of Surgery, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA; Department of Thoracic Surgery, Roswell Park Comprehensive Cancer Center, Buffalo, NY, USA
- Leo Anthony Celi
  - Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA; Division of Pulmonary Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
12
Furfaro D, Celi LA, Schwartzstein RM. Artificial Intelligence in Medical Education: A Long Way to Go. Chest 2024; 165:771-774. [PMID: 38599751 DOI: 10.1016/j.chest.2023.11.028] [Received: 08/29/2023] [Revised: 11/16/2023] [Accepted: 11/19/2023] [Indexed: 04/12/2024]
Affiliation(s)
- David Furfaro
  - Division of Pulmonary, Critical Care, and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA; Harvard Medical School, Boston, MA.
- Leo Anthony Celi
  - Division of Pulmonary, Critical Care, and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA; Harvard Medical School, Boston, MA; Massachusetts Institute of Technology Laboratory of Computational Physiology, Boston, MA; Harvard T.H. Chan School of Public Health, Boston, MA
- Richard M Schwartzstein
  - Division of Pulmonary, Critical Care, and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA; Harvard Medical School, Boston, MA
13
Ranard BL, Park S, Jia Y, Zhang Y, Alwan F, Celi LA, Lusczek ER. Minimizing bias when using artificial intelligence in critical care medicine. J Crit Care 2024; 82:154796. [PMID: 38552451 DOI: 10.1016/j.jcrc.2024.154796] [Received: 12/01/2023] [Revised: 03/02/2024] [Accepted: 03/06/2024] [Indexed: 04/02/2024]
Affiliation(s)
- Benjamin L Ranard
  - Division of Pulmonary, Allergy, and Critical Care Medicine, Department of Medicine, Columbia University Vagelos College of Physicians and Surgeons and NewYork-Presbyterian Hospital, New York, NY, USA; Program for Hospital and Intensive Care Informatics, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, USA.
- Soojin Park
  - Program for Hospital and Intensive Care Informatics, Columbia University Vagelos College of Physicians and Surgeons, New York, NY, USA; Departments of Neurology and Bioinformatics, Columbia University Vagelos College of Physicians and Surgeons and NewYork-Presbyterian Hospital, New York, NY, USA
- Yugang Jia
  - Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Yiye Zhang
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
- Fatima Alwan
  - Department of Surgery, Hennepin Healthcare, Minneapolis, MN, USA
- Leo Anthony Celi
  - Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA; Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Elizabeth R Lusczek
  - Department of Surgery, University of Minnesota Department of Surgery, Minneapolis, MN, USA.
14
Rush B, Ziegler J, Dyck S, Dhaliwal S, Mooney O, Lother S, Celi LA, Mendelson AA. Disparities in access to and timing of interventional therapies for pulmonary embolism across the United States. J Thromb Haemost 2024:S1538-7836(24)00171-5. [PMID: 38554934 DOI: 10.1016/j.jtha.2024.03.013] [Received: 10/17/2023] [Revised: 02/20/2024] [Accepted: 03/15/2024] [Indexed: 04/02/2024]
Abstract
BACKGROUND Interventional therapies (ITs) are an emerging treatment modality for pulmonary embolism (PE); however, the degree of racial, sex-based, and sociodemographic disparities in access and timing is unknown. OBJECTIVES To investigate barriers to access and timing of ITs for PE across the United States. METHODS A retrospective cohort study utilizing the Nationwide Inpatient Sample from 2016-2020 included adult patients with PE. The use of ITs (mechanical thrombectomy and catheter-directed thrombolysis) was identified via International Classification of Diseases 10th revision codes. Early IT was defined as procedure performed within the first 2 days after admission. RESULTS A total of 27 805 273 records from the 2016-2020 Nationwide Inpatient Sample database were examined. There were 387 514 (1.4%) patients with PE, with 14 249 (3.6%) of them having undergone IT procedures (11 115 catheter-directed thrombolysis, 2314 thrombectomy, and 780 both procedures). After multivariate adjustment, factors associated with less use of IT included Black race (odds ratio [OR], 0.90; 95% CI, 0.86-0.94; P < .01), Hispanic race (OR, 0.73; 95% CI, 0.68-0.79; P < .01), female sex (OR, 0.88; 95% CI, 0.85-0.91; P < .01), treatment in a rural hospital (OR, 0.49; 95% CI, 0.44-0.54; P < .01), and lack of private insurance (Medicare OR, 0.77; 95% CI, 0.73-0.80; P < .01; Medicaid OR, 0.65; 95% CI, 0.61-0.69; P < .01; no coverage OR, 0.87; 95% CI, 0.82-0.93; P < .01). Among the patients who received IT, 11 315 (79%) procedures were conducted within 2 days of admission and 2934 (21%) were delayed. Factors associated with delayed procedures included Black race (OR, 1.12; 95% CI, 1.01-1.26; P = .04), Hispanic race (OR, 1.52; 95% CI, 1.28-1.80; P < .01), weekend admission (OR, 1.37; 95% CI, 1.25-1.51; P < .01), Medicare coverage (OR, 1.24; 95% CI, 1.10-1.40; P < .01), and Medicaid coverage (OR, 1.29; 95% CI, 1.12-1.49; P < .01). 
CONCLUSION Significant racial, sex-based, and geographic barriers exist in overall access to IT for PE in the United States.
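The odds ratios above come from a multivariate adjustment that cannot be reproduced from the abstract alone. For readers unfamiliar with the quantity, the following minimal sketch computes an unadjusted odds ratio and its Wald 95% confidence interval from a 2x2 table. The function name and the counts are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed, received the procedure      b: exposed, did not
    c: unexposed, received the procedure    d: unexposed, did not
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
odds_ratio, lower, upper = odds_ratio_wald_ci(a=30, b=970, c=60, d=940)
```

An interval that lies entirely below 1, as with these made-up counts, corresponds to lower odds of receiving the procedure in the exposed group. The adjusted estimates in the abstract additionally account for covariates, which this sketch does not.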
Affiliation(s)
- Barret Rush
  - Section of Critical Care Medicine, Department of Internal Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada.
- Jennifer Ziegler
  - Section of Critical Care Medicine, Department of Internal Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Stephanie Dyck
  - Department of Radiology, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Surinder Dhaliwal
  - Department of Radiology, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Owen Mooney
  - Section of Critical Care Medicine, Department of Internal Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Sylvain Lother
  - Section of Critical Care Medicine, Department of Internal Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Leo Anthony Celi
  - Harvard Medical School, Boston, Massachusetts, USA; Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA
- Asher A Mendelson
  - Section of Critical Care Medicine, Department of Internal Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
15
Charpignon ML, Matos J, Nakayama L, Gallifant J, Alfonso PGI, Cobanaj M, Fiske A, Gates AJ, Ho FDV, Jain U, Kashkooli M, McCoy LG, Shaffer J, Link Woite N, Celi LA. Does diversity beget diversity? A scientometric analysis of over 150,000 studies and 49,000 authors published in high-impact medical journals between 2007 and 2022. medRxiv 2024:2024.03.21.24304695. [PMID: 38562711 PMCID: PMC10984076 DOI: 10.1101/2024.03.21.24304695] [Indexed: 04/04/2024]
Abstract
Background Health research that significantly impacts global clinical practice and policy is often published in high-impact factor (IF) medical journals. These outlets play a pivotal role in the worldwide dissemination of novel medical knowledge. However, researchers identifying as women and those affiliated with institutions in low- and middle-income countries (LMIC) have been largely underrepresented in high-IF journals across multiple fields of medicine. To evaluate disparities in gender and geographical representation among authors who have published in any of five top general medical journals, we conducted scientometric analyses using a large-scale dataset extracted from the New England Journal of Medicine (NEJM), Journal of the American Medical Association (JAMA), The British Medical Journal (BMJ), The Lancet, and Nature Medicine. Methods Author metadata from all articles published in the selected journals between 2007 and 2022 were collected using the DimensionsAI platform. The Genderize.io API was then utilized to infer each author's likely gender based on their extracted first name. The World Bank country classification was used to map countries associated with researcher affiliations to the LMIC or the high-income country (HIC) category. We characterized the overall gender and country income category representation across the medical journals. In addition, we computed article-level diversity metrics and contrasted their distributions across the journals. Findings We studied 151,536 authors across 49,764 articles published in five top medical journals, over a long period spanning 15 years. On average, approximately one-third (33.1%) of the authors of a given paper were inferred to be women; this result was consistent across the journals we studied. Further, 86.6% of the teams were exclusively composed of HIC authors; in contrast, only 3.9% were exclusively composed of LMIC authors. 
The probability of serving as the first or last author was significantly higher if the author was inferred to be a man (18.1% vs 16.8%, P < .01) or was affiliated with an institution in a HIC (16.9% vs 15.5%, P < .01). Our primary finding reveals that having a diverse team promotes further diversity, within the same dimension (i.e., gender or geography) and across dimensions. Notably, papers with at least one woman among the authors were more likely to also involve at least two LMIC authors (11.7% versus 10.4% in baseline, P < .001; based on inferred gender); conversely, papers with at least one LMIC author were more likely to also involve at least two women (49.4% versus 37.6%, P < .001; based on inferred gender). Conclusion We provide a scientometric framework to assess authorship diversity. Our research suggests that the inclusiveness of high-impact medical journals is limited in terms of both gender and geography. We advocate for medical journals to adopt policies and practices that promote greater diversity and collaborative research. In addition, our findings offer a first step towards understanding the composition of teams conducting medical research globally and an opportunity for individual authors to reflect on their own collaborative research practices and possibilities to cultivate more diverse partnerships in their work.
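The abstract mentions article-level diversity metrics without listing them. A minimal sketch of what such metrics might look like is given below, assuming each author record already carries an inferred gender and a World Bank income category. The field names, the function name, and the toy paper are hypothetical; the gender inference step itself is not shown.

```python
def article_diversity(authors):
    """Summarise gender and income-group composition of one article's authors.

    Each author is a dict with 'gender' ('woman', 'man', or None when the
    inference was inconclusive) and 'income_group' ('HIC' or 'LMIC').
    """
    known = [a["gender"] for a in authors if a["gender"] is not None]
    share_women = known.count("woman") / len(known) if known else None
    incomes = [a["income_group"] for a in authors]
    return {
        "share_women": share_women,
        "n_lmic_authors": incomes.count("LMIC"),
        "all_hic": bool(incomes) and all(i == "HIC" for i in incomes),
        "all_lmic": bool(incomes) and all(i == "LMIC" for i in incomes),
    }


# Toy, made-up author list for a single paper.
paper = [
    {"gender": "woman", "income_group": "HIC"},
    {"gender": "man", "income_group": "HIC"},
    {"gender": None, "income_group": "LMIC"},
    {"gender": "man", "income_group": "HIC"},
]
metrics = article_diversity(paper)
```

Authors with an inconclusive gender inference are left out of the denominator here, which is one possible design choice; the study may have handled them differently.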
Affiliation(s)
- Marie-Laure Charpignon
  - Institute for Data Systems and Society, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- João Matos
  - Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Faculty of Engineering, University of Porto (FEUP), Porto, Portugal
  - Institute for Systems and Computer Engineering, Technology and Science (INESCTEC), Porto, Portugal
- Luis Nakayama
  - Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Department of Ophthalmology, São Paulo Federal University, São Paulo, SP, Brazil
- Jack Gallifant
  - Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Department of Critical Care, Guy's and St Thomas' NHS Trust, London, United Kingdom
- Marisa Cobanaj
  - Institute of Radiooncology-OncoRay, National Center for Radiation Research in Oncology, Helmholtz-Zentrum Dresden-Rossendorf, Dresden, Germany
- Amelia Fiske
  - Institute of History and Ethics in Medicine, Department of Clinical Medicine, TUM School of Medicine and Health, Technical University of Munich, Germany
- Alexander J Gates
  - School of Data Science, University of Virginia, Charlottesville, VA, USA
- Urvish Jain
  - University of Pittsburgh, Pittsburgh, PA, USA
- Mohammad Kashkooli
  - Epilepsy Research Center, Department of Neurology, School of Medicine, Shiraz University of Medical Sciences, Shiraz, Iran
- Liam G McCoy
  - Division of Neurology, Department of Medicine, University of Alberta, Edmonton, Alberta, Canada
- Jonathan Shaffer
  - Department of Sociology, University of Vermont, Burlington, VT, USA
- Naira Link Woite
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Leo Anthony Celi
  - Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
16
Gottlieb ER, Celi LA. A Reassessment of Sodium Correction Rates and Hospital Length of Stay Accounting for Admission Diagnosis. medRxiv 2024:2024.03.08.24303993. [PMID: 38559087 PMCID: PMC10980130 DOI: 10.1101/2024.03.08.24303993] [Indexed: 04/04/2024]
Abstract
Background Recent studies have challenged assumptions about slow correction of severe hyponatremia and have shown that rapid correction is associated with shorter hospital length of stay. However, the confounding effect of admission diagnosis has not been fully explored. The objective of this study was to determine whether rapid correction is still associated with shorter length of stay when controlling for admission diagnosis. Methods This retrospective cohort study is based on the Medical Information Mart for Intensive Care, including data from both MIMIC-III (2001-2012) and MIMIC-IV (2008-2019). Patients were identified who presented to the hospital with initial sodium <120 mEq/L and were categorized according to total sodium correction achieved in the first day (<6 mEq/L; 6-10 mEq/L; >10 mEq/L). Linear regression was used to assess for an association between correction rate and hospital length of stay, and to determine if this association was significant when controlling for admission diagnosis classifications based on diagnosis related groups (DRGs). Results There were 636 patients included in this study. Median [IQR] hospital length of stay was 7 [4, 11] days. Patients had a median [IQR] initial sodium value of 117 [114, 118] mEq/L and final sodium value of 124 [119, 128] mEq/L. In a univariate linear regression, the highest rate of correction (>10 mEq/L) was associated with a shorter length of stay than a moderate rate of correction (coef. -2.363, 95% CI [-4.710, -0.017], p=0.048), but the association was not significant when controlling for admission diagnosis group (coef. -1.685, 95% CI [-3.836, 0.467], p=0.125). Conclusions Faster sodium correction was not associated with shorter length of stay when controlling for admission diagnosis categories, suggesting that the disease state confounds this association. While some patients may be discharged earlier if sodium is corrected more rapidly, others may not benefit or may be harmed by this strategy.
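The grouping rule and the significance readings in this abstract can be made concrete with a short sketch. The category thresholds (<6, 6-10, >10 mEq/L) and the two confidence intervals are taken from the abstract; the function names are hypothetical, and treating the 6 and 10 mEq/L boundaries as belonging to the middle group is an assumption.

```python
def correction_category(initial_na, day1_na):
    """Classify first-day sodium correction (mEq/L) into the abstract's groups."""
    delta = day1_na - initial_na
    if delta < 6:
        return "<6 mEq/L"
    if delta <= 10:  # assumption: both boundaries fall in the middle group
        return "6-10 mEq/L"
    return ">10 mEq/L"


def ci_excludes_zero(lower, upper):
    """A regression coefficient is significant at the CI's level if 0 is outside it."""
    return lower > 0 or upper < 0


# Confidence intervals reported in the abstract for the >10 mEq/L group.
univariate_significant = ci_excludes_zero(-4.710, -0.017)
adjusted_significant = ci_excludes_zero(-3.836, 0.467)
```

The univariate interval excludes zero while the diagnosis-adjusted interval does not, which matches the reported p-values of 0.048 and 0.125.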
17
D'Couto HT, Celi LA. Racial Physiology: A Dangerous Precedent. Am J Respir Crit Care Med 2024. [PMID: 38452375 DOI: 10.1164/rccm.202402-0296le] [Received: 02/06/2024] [Accepted: 03/06/2024] [Indexed: 03/09/2024]
Affiliation(s)
- Helen T D'Couto
  - MedStar Georgetown University Hospital, 71541, Pulmonary, Critical Care, and Sleep Medicine, Washington, District of Columbia, United States
- Leo Anthony Celi
  - Massachusetts Institute of Technology, 2167, Laboratory for Computational Physiology, Cambridge, Massachusetts, United States
  - Beth Israel Deaconess Medical Center, 1859, Division of Pulmonary, Critical Care and Sleep Medicine, Boston, Massachusetts, United States
  - Harvard University T H Chan School of Public Health, 1857, Department of Biostatistics, Boston, Massachusetts, United States
18
Gonçalves MB, Nakayama LF, Ferraz D, Faber H, Korot E, Malerbi FK, Regatieri CV, Maia M, Celi LA, Keane PA, Belfort R. Image quality assessment of retinal fundus photographs for diabetic retinopathy in the machine learning era: a review. Eye (Lond) 2024; 38:426-433. [PMID: 37667028 PMCID: PMC10858054 DOI: 10.1038/s41433-023-02717-3] [Received: 03/08/2023] [Revised: 06/26/2023] [Accepted: 08/25/2023] [Indexed: 09/06/2023]
Abstract
This study aimed to evaluate the image quality assessment (IQA) and quality criteria employed in publicly available datasets for diabetic retinopathy (DR). A literature search strategy was used to identify relevant datasets, and 20 datasets were included in the analysis. Out of these, 12 datasets mentioned performing IQA, but only eight specified the quality criteria used. The reported quality criteria varied widely across datasets, and accessing the information was often challenging. The findings highlight the importance of IQA for AI model development while emphasizing the need for clear and accessible reporting of IQA information. The study suggests that automated quality assessments can be a valid alternative to manual labeling and emphasizes the importance of establishing quality standards based on population characteristics, clinical use, and research purposes. In conclusion, image quality assessment is important for AI model development; however, strict data quality standards must not limit data sharing. Given the importance of IQA for developing, validating, and implementing deep learning (DL) algorithms, it's recommended that this information be reported in a clear, specific, and accessible way whenever possible. Automated quality assessments are a valid alternative to the traditional manual labeling process, and quality standards should be determined according to population characteristics, clinical use, and research purpose.
Collapse
Affiliation(s)
- Mariana Batista Gonçalves
  - Department of Ophthalmology, Sao Paulo Federal University, São Paulo, SP, Brazil
  - Instituto Paulista de Estudos e Pesquisas em Oftalmologia, IPEPO, Vision Institute, São Paulo, SP, Brazil
  - NIHR Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital, NHS Foundation Trust, and UCL Institute of Ophthalmology, London, UK
- Luis Filipe Nakayama
  - Department of Ophthalmology, Sao Paulo Federal University, São Paulo, SP, Brazil
  - Massachusetts Institute of Technology, Laboratory for Computational Physiology, Cambridge, MA, USA
- Daniel Ferraz
  - Department of Ophthalmology, Sao Paulo Federal University, São Paulo, SP, Brazil
  - Instituto Paulista de Estudos e Pesquisas em Oftalmologia, IPEPO, Vision Institute, São Paulo, SP, Brazil
  - NIHR Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital, NHS Foundation Trust, and UCL Institute of Ophthalmology, London, UK
- Hanna Faber
  - Department of Ophthalmology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
  - Department of Ophthalmology, University of Tuebingen, Tuebingen, Germany
- Edward Korot
  - Retina Specialists of Michigan, Grand Rapids, MI, USA
  - Stanford University Byers Eye Institute Palo Alto, Palo Alto, CA, USA
- Mauricio Maia
  - Department of Ophthalmology, Sao Paulo Federal University, São Paulo, SP, Brazil
- Leo Anthony Celi
  - Massachusetts Institute of Technology, Laboratory for Computational Physiology, Cambridge, MA, USA
  - Harvard TH Chan School of Public Health, Department of Biostatistics, Boston, MA, USA
  - Beth Israel Deaconess Medical Center, Department of Medicine, Boston, MA, USA
- Pearse A Keane
  - NIHR Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital, NHS Foundation Trust, and UCL Institute of Ophthalmology, London, UK
- Rubens Belfort
  - Department of Ophthalmology, Sao Paulo Federal University, São Paulo, SP, Brazil
  - Instituto Paulista de Estudos e Pesquisas em Oftalmologia, IPEPO, Vision Institute, São Paulo, SP, Brazil

19
Abdelmalek FM, Angriman F, Moore J, Liu K, Burry L, Seyyed-Kalantari L, Mehta S, Gichoya J, Celi LA, Tomlinson G, Fralick M, Yarnell CJ. Association between Patient Race and Ethnicity and Use of Invasive Ventilation in the United States. Ann Am Thorac Soc 2024; 21:287-295. [PMID: 38029405 DOI: 10.1513/annalsats.202305-485oc] [Received: 05/29/2023] [Accepted: 11/28/2023] [Indexed: 12/01/2023]
Abstract
Rationale: Outcomes for people with respiratory failure in the United States vary by patient race and ethnicity. Invasive ventilation is an important treatment initiated based on expert opinion. It is unknown whether the use of invasive ventilation varies by patient race and ethnicity. Objectives: To measure 1) the association between patient race and ethnicity and the use of invasive ventilation; and 2) the change in 28-day mortality mediated by any association. Methods: We performed a multicenter cohort study of nonintubated adults receiving oxygen within 24 hours of intensive care admission using the Medical Information Mart for Intensive Care IV (MIMIC-IV, 2008-2019) and Philips eICU (eICU, 2014-2015) databases from the United States. We modeled the association between patient race and ethnicity (Asian, Black, Hispanic, White) and invasive ventilation rate using a Bayesian multistate model that adjusted for baseline and time-varying covariates, calculated hazard ratios (HRs), and estimated 28-day hospital mortality changes mediated by differential invasive ventilation use. We reported posterior means and 95% credible intervals (CrIs). Results: We studied 38,258 patients, 52% (20,032) from MIMIC-IV and 48% (18,226) from eICU: 2% Asian (892), 11% Black (4,289), 5% Hispanic (1,964), and 81% White (31,113). Invasive ventilation occurred in 9.2% (3,511), and 7.5% (2,869) died. The adjusted rate of invasive ventilation was lower in Asian (HR, 0.82; CrI, 0.70-0.95), Black (HR, 0.78; CrI, 0.71-0.86), and Hispanic (HR, 0.70; CrI, 0.61-0.79) patients compared with White patients. For the average patient, lower rates of invasive ventilation did not mediate differences in 28-day mortality. For a patient on high-flow nasal cannula with inspired oxygen fraction of 1.0, the odds ratios for mortality if invasive ventilation rates were equal to the rate for White patients were 0.97 (CrI, 0.91-1.03) for Asian patients, 0.96 (CrI, 0.91-1.03) for Black patients, and 0.94 (CrI, 0.89-1.01) for Hispanic patients. Conclusions: Asian, Black, and Hispanic patients had lower rates of invasive ventilation than White patients. These decreases did not mediate harm for the average patient, but we could not rule out harm for patients with more severe hypoxemia.
Affiliation(s)
- Federico Angriman
  - Institute of Health Policy, Management, and Evaluation
  - Interdepartmental Division of Critical Care Medicine
  - Sunnybrook Health Sciences Center, Toronto, Ontario, Canada
- Julie Moore
  - Lawrence S. Bloomberg Faculty of Nursing, University of Toronto, Toronto, Ontario, Canada
  - University Health Network/Sinai Health, Toronto, Ontario, Canada
- Kuan Liu
  - Institute of Health Policy, Management, and Evaluation
- Lisa Burry
  - Interdepartmental Division of Critical Care Medicine
  - Leslie Dan Faculty of Pharmacy
  - University Health Network/Sinai Health, Toronto, Ontario, Canada
- Laleh Seyyed-Kalantari
  - Department of Electrical Engineering and Computer Science, Lassonde School of Engineering, York University, Toronto, Ontario, Canada
- Sangeeta Mehta
  - Interdepartmental Division of Critical Care Medicine
  - University Health Network/Sinai Health, Toronto, Ontario, Canada
- Judy Gichoya
  - Department of Radiology and Biomedical Informatics, Emory University, Atlanta, Georgia
- Leo Anthony Celi
  - Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
- George Tomlinson
  - Institute of Health Policy, Management, and Evaluation
  - University Health Network/Sinai Health, Toronto, Ontario, Canada
- Michael Fralick
  - University Health Network/Sinai Health, Toronto, Ontario, Canada
- Christopher J Yarnell
  - Institute of Health Policy, Management, and Evaluation
  - Interdepartmental Division of Critical Care Medicine
  - University Health Network/Sinai Health, Toronto, Ontario, Canada
  - Department of Critical Care Medicine
  - Scarborough Health Network Research Institute, Scarborough Health Network, Toronto, Ontario, Canada

20
Nakayama LF, Restrepo D, Matos J, Ribeiro LZ, Malerbi FK, Celi LA, Regatieri CS. BRSET: A Brazilian Multilabel Ophthalmological Dataset of Retina Fundus Photos. medRxiv 2024:2024.01.23.24301660. [PMID: 38343827 PMCID: PMC10854338 DOI: 10.1101/2024.01.23.24301660] [Indexed: 02/17/2024]
Abstract
Introduction The Brazilian Multilabel Ophthalmological Dataset (BRSET) addresses the scarcity of publicly available ophthalmological datasets in Latin America. BRSET comprises 16,266 color fundus retinal photos from 8,524 Brazilian patients, aiming to enhance data representativeness and to serve as a research and teaching tool. It contains sociodemographic information, enabling investigations into differential model performance across demographic groups. Methods Data from three São Paulo outpatient centers yielded demographic and medical information from electronic records, including nationality, age, sex, clinical history, insulin use, and duration of diabetes diagnosis. A retinal specialist labeled images for anatomical features (optic disc, blood vessels, macula), quality control (focus, illumination, image field, artifacts), and pathologies (e.g., diabetic retinopathy). Diabetic retinopathy was graded using the International Clinical Diabetic Retinopathy and Scottish Diabetic Retinopathy Grading schemes. Validation used Dino V2 Base for feature extraction, with 70% training and 30% testing subsets. Support Vector Machines (SVM) and Logistic Regression (LR) were employed with weighted training. Performance metrics included area under the receiver operating characteristic curve (AUC) and Macro F1-score. Results BRSET comprises 65.1% Canon CR2 and 34.9% Nikon NF5050 images. 61.8% of the patients are female, and the average age is 57.6 years. Diabetic retinopathy affected 15.8% of patients, across a spectrum of disease severity. Anatomically, 20.2% showed abnormal optic discs, 4.9% abnormal blood vessels, and 28.8% abnormal macula. Models were trained on BRSET in three prediction tasks: "diabetes diagnosis"; "sex classification"; and "diabetic retinopathy diagnosis". Discussion BRSET is the first multilabel ophthalmological dataset in Brazil and Latin America. It provides an opportunity for investigating model biases by evaluating performance across demographic groups. The model performance of three prediction tasks demonstrates the value of the dataset for external validation and for teaching medical computer vision to learners in Latin America using locally relevant data sources.
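The abstract names AUC and Macro F1-score as its performance metrics. As a rough illustration of what those two numbers measure, the sketch below computes both in plain Python. The function names `macro_f1` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration; they are not code or results from the BRSET study.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = disease present, 0 = absent.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(f"Macro F1: {macro_f1(y_true, y_pred):.3f}")
print(f"AUC:      {roc_auc(y_true, scores):.3f}")
```

Macro F1 averages the per-class scores without weighting by class size, so a rare class such as diabetic retinopathy counts as much as the majority class, while AUC does not depend on the choice of threshold.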
Affiliation(s)
- Luis Filipe Nakayama
  - Department of Ophthalmology, São Paulo Federal University, São Paulo, São Paulo, Brazil
  - Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- David Restrepo
  - Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
  - Telematics Department, University of Cauca, Popayán, Cauca, Colombia
- João Matos
  - Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
  - Faculty of Engineering of University of Porto, Porto, Portugal
- Lucas Zago Ribeiro
  - Department of Ophthalmology, São Paulo Federal University, São Paulo, São Paulo, Brazil
- Fernando Korn Malerbi
  - Department of Ophthalmology, São Paulo Federal University, São Paulo, São Paulo, Brazil
- Leo Anthony Celi
  - Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
  - Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Caio Saito Regatieri
  - Department of Ophthalmology, São Paulo Federal University, São Paulo, São Paulo, Brazil

21
Alberto IRI, Alberto NRI, Altinel Y, Blacker S, Binotti WW, Celi LA, Chua T, Fiske A, Griffin M, Karaca G, Mokolo N, Naawu DKN, Patscheider J, Petushkov A, Quion JM, Senteio C, Taisbak S, Tırnova İ, Tokashiki H, Velasquez A, Yaghy A, Yap K. A scientometric analysis of fairness in health AI literature. PLOS Glob Public Health 2024; 4:e0002513. [PMID: 38241250 PMCID: PMC10798451 DOI: 10.1371/journal.pgph.0002513] [Received: 05/07/2023] [Accepted: 12/07/2023] [Indexed: 01/21/2024]
Abstract
Artificial intelligence (AI) and machine learning are central components of today's medical environment. The fairness of AI, i.e. the ability of AI to be free from bias, has repeatedly come into question. This study investigates the diversity of members of academia whose scholarship poses questions about the fairness of AI. Articles that combine the topics of fairness, artificial intelligence, and medicine were selected from PubMed, Google Scholar, and Embase using keywords. Eligibility screening and data extraction were done manually and cross-checked by another author for accuracy. Articles were selected for further analysis, cleaned, and organized in Microsoft Excel; spatial diagrams were generated using Tableau Public. Additional graphs were generated using Matplotlib and Seaborn. Linear and logistic regressions were conducted using Python to measure the relationship between funding status, number of citations, and the gender demographics of the authorship team. We identified 375 eligible publications, including research and review articles concerning AI and fairness in healthcare. Analysis of the bibliographic data revealed an overrepresentation of authors who are white, male, and from high-income countries, especially in the roles of first and last author. Additionally, papers whose authors are based in higher-income countries were more likely to be cited more often and to be published in higher-impact journals. These findings highlight the lack of diversity among the authors in the AI fairness community whose work gains the largest readership, potentially compromising the very impartiality that the AI fairness community is working towards.
Affiliation(s)
- Yuksel Altinel
  - Bagcilar Research and Training Hospital, General Surgery Department, University of Health Sciences, Istanbul, Turkey
- Sarah Blacker
  - Department of Social Science, York University, Toronto, Ontario, Canada
- William Warr Binotti
  - New England Eye Center, Tufts Medical Center, Boston, Massachusetts, United States of America
- Leo Anthony Celi
  - Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
  - Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, United States of America
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
- Tiffany Chua
  - University of San Francisco, San Francisco, California, United States of America
- Amelia Fiske
  - Institute for History and Ethics in Medicine, School of Medicine, Technical University of Munich, Munich, Germany
- Molly Griffin
  - Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, United States of America
- Gulce Karaca
  - Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- Nkiruka Mokolo
  - Meharry Medical College School of Medicine, Nashville, Tennessee, United States of America
- David Kojo N Naawu
  - Meharry Medical College School of Medicine, Nashville, Tennessee, United States of America
- Anton Petushkov
  - University of Michigan, Ann Arbor, Michigan, United States of America
- Justin Michael Quion
  - University of the East Ramon Magsaysay Memorial Medical Center Inc, Quezon City, Philippines
- Charles Senteio
  - Department of Library and Information Science, Rutgers University School of Communication and Information, New Brunswick, New Jersey, United States of America
- İsmail Tırnova
  - Department of General Surgery, Baskent University School of Medicine, Istanbul, Turkey
- Harumi Tokashiki
  - Department of Medicine, Carney Hospital, Boston, Massachusetts, United States of America
- Adrian Velasquez
  - Department of Medicine, Carney Hospital, Boston, Massachusetts, United States of America
  - Warren Alpert School of Medicine at Brown University, Providence, Rhode Island, United States of America
- Antonio Yaghy
  - New England Eye Center, Boston, Massachusetts, United States of America
- Keagan Yap
  - Harvard College, Cambridge, Massachusetts, United States of America

22
Manzo G, Celi LA, Shabazz Y, Mulcahey R, Flores LJ, Demner-Fushman D. Caregivers Attitude Detection From Clinical Notes. AMIA Annu Symp Proc 2024; 2023:1125-1134. [PMID: 38222330 PMCID: PMC10785866] [Indexed: 01/16/2024]
Abstract
Caregivers' attitudes impact healthcare quality and disparities. Clinical notes contain highly specialized and ambiguous language that requires extensive domain knowledge to understand, and using negative language does not necessarily imply a negative attitude. This study discusses the challenge of detecting caregivers' attitudes from their clinical notes. To address these challenges, we annotate MIMIC clinical notes and train state-of-the-art language models from the Hugging Face platform. The study focuses on the Neonatal Intensive Care Unit and evaluates models in zero-shot, few-shot, and fully-trained scenarios. Among the chosen models, RoBERTa identifies caregivers' attitudes from clinical notes with an F1-score of 0.75. This approach not only enhances patient satisfaction, but opens up exciting possibilities for detecting and preventing care provider syndromes, such as fatigue, stress, and burnout. The paper concludes by discussing limitations and potential future work.
Affiliation(s)
- Gaetano Manzo
  - Computational Health Research Branch, National Library of Medicine, Bethesda, Maryland, USA
- Leo Anthony Celi
  - Massachusetts Institute of Technology (MIT), Harvard Medical School, and the Beth Israel Deaconess Medical Center
- Yasmeen Shabazz
  - Massachusetts Institute of Technology (MIT), Harvard Medical School, and the Beth Israel Deaconess Medical Center
- Rory Mulcahey
  - Computational Health Research Branch, National Library of Medicine, Bethesda, Maryland, USA
- Lorenzo Jaime Flores
  - Massachusetts Institute of Technology (MIT), Harvard Medical School, and the Beth Israel Deaconess Medical Center
- Dina Demner-Fushman
  - Computational Health Research Branch, National Library of Medicine, Bethesda, Maryland, USA

23
Gallifant J, Fiske A, Levites Strekalova YA, Osorio-Valencia JS, Parke R, Mwavu R, Martinez N, Gichoya JW, Ghassemi M, Demner-Fushman D, McCoy LG, Celi LA, Pierce R. Peer review of GPT-4 technical report and systems card. PLOS Digit Health 2024; 3:e0000417. [PMID: 38236824 PMCID: PMC10795998 DOI: 10.1371/journal.pdig.0000417] [Indexed: 01/22/2024]
Abstract
The study provides a comprehensive review of OpenAI's Generative Pre-trained Transformer 4 (GPT-4) technical report, with an emphasis on applications in high-risk settings like healthcare. A diverse team, including experts in artificial intelligence (AI), natural language processing, public health, law, policy, social science, healthcare research, and bioethics, analyzed the report against established peer review guidelines. The GPT-4 report shows a significant commitment to transparent AI research, particularly in creating a systems card for risk assessment and mitigation. However, it reveals limitations such as restricted access to training data, inadequate confidence and uncertainty estimations, and concerns over privacy and intellectual property rights. Key strengths identified include the considerable time and economic investment in transparent AI research and the creation of a comprehensive systems card. On the other hand, the lack of clarity in training processes and data raises concerns about encoded biases and interests in GPT-4. The report also lacks confidence and uncertainty estimations, crucial in high-risk areas like healthcare, and fails to address potential privacy and intellectual property issues. Furthermore, this study emphasizes the need for diverse, global involvement in developing and evaluating large language models (LLMs) to ensure broad societal benefits and mitigate risks. The paper presents recommendations such as improving data transparency, developing accountability frameworks, establishing confidence standards for LLM outputs in high-risk settings, and enhancing industry research review processes. It concludes that while GPT-4's report is a step towards open discussions on LLMs, more extensive interdisciplinary reviews are essential for addressing bias, harm, and risk concerns, especially in high-risk domains. 
The review aims to expand the understanding of LLMs in general and highlights the need for new forms of reflection on how LLMs are reviewed, what data are required for effective evaluation, and how critical issues like bias and risk are addressed.
Affiliation(s)
- Jack Gallifant
  - Department of Critical Care, Guy’s & St Thomas’ NHS Trust, London, United Kingdom
  - Massachusetts Institute of Technology, Laboratory for Computational Physiology, Cambridge, Massachusetts, United States of America
- Amelia Fiske
  - Institute of History and Ethics in Medicine, Department of Clinical Medicine, TUM School of Medicine and Health, Technical University of Munich, Munich, Germany
- Yulia A. Levites Strekalova
  - Department of Health Services Research, Management, and Policy, College of Public Health and Health Professions, University of Florida, Gainesville, Florida, United States of America
- Juan S. Osorio-Valencia
  - A.I. and Innovation Committee, Colombian Radiology Association, Medellin, Colombia
  - ScienteLab, Bogota, Colombia
  - Be4tech, Medellin, Colombia
- Rachael Parke
  - Cardiothoracic and Vascular Intensive Care Unit, Auckland City Hospital, Auckland, New Zealand
  - School of Nursing, The University of Auckland, Auckland, New Zealand
- Rogers Mwavu
  - Faculty of Computing and Informatics, Mbarara University of Science and Technology, Mbarara, Uganda
- Nicole Martinez
  - Center for Biomedical Ethics, Stanford University, Stanford, California, United States of America
- Judy Wawira Gichoya
  - Department of Radiology, Emory University School of Medicine, Atlanta, Georgia, United States of America
- Marzyeh Ghassemi
  - Massachusetts Institute of Technology, Electrical Engineering and Computer Science (EECS), Cambridge, Massachusetts, United States of America
- Dina Demner-Fushman
  - National Library of Medicine, NIH, HHS, Bethesda, Maryland, United States of America
- Liam G. McCoy
  - Faculty of Medicine and Dentistry, University of Alberta, Edmonton, Alberta, Canada
- Leo Anthony Celi
  - Massachusetts Institute of Technology, Laboratory for Computational Physiology, Cambridge, Massachusetts, United States of America
  - Division of Pulmonary, Critical Care, and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, United States of America
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
- Robin Pierce
  - The Law School, Faculty of Humanities, Arts, and Social Sciences, University of Exeter, Exeter, United Kingdom

24
Chidambaram S, Jain B, Jain U, Mwavu R, Baru R, Thomas B, Greaves F, Jayakumar S, Jain P, Rojo M, Battaglino MR, Meara JG, Sounderajah V, Celi LA, Darzi A. An introduction to digital determinants of health. PLOS Digit Health 2024; 3:e0000346. [PMID: 38175828 PMCID: PMC10766177 DOI: 10.1371/journal.pdig.0000346] [Indexed: 01/06/2024]
Abstract
In recent years, technology has been increasingly incorporated within healthcare for the safe and efficient delivery of services. Although this can be attributed to the benefits that can be harnessed, digital technology has the potential to exacerbate and reinforce preexisting health disparities. Previous work has highlighted how sociodemographic, economic, and political factors, termed social determinants of health [SDOH], affect individuals' interactions with digital health systems. However, there is a paucity of literature addressing how the intrinsic design, implementation, and use of technology interact with SDOH to influence health outcomes. Such interactions are termed digital determinants of health [DDOH]. This paper will, for the first time, propose a definition of DDOH and provide a conceptual model characterizing its influence on healthcare outcomes. Specifically, DDOH is implicit in the design of artificial intelligence systems, mobile phone applications, telemedicine, digital health literacy [DHL], and other forms of digital technology. A better appreciation of DDOH by the various stakeholders at the individual and societal levels can be channeled towards policies that are more digitally inclusive. In tandem with ongoing work to minimize the digital divide caused by existing SDOH, further work is necessary to recognize digital determinants as an important and distinct entity.
Affiliation(s)
- Swathikan Chidambaram
  - Department of Surgery & Cancer, Imperial College London, St. Mary’s Hospital, London, United Kingdom
  - Institute of Global Health Innovation, Imperial College London, South Kensington Campus, London, United Kingdom
- Bhav Jain
  - Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Urvish Jain
  - Dietrich School of Arts and Sciences, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Rogers Mwavu
  - Mbarara University of Science & Technology, Uganda
- Rama Baru
  - Centre of Social Medicine and Community Health, Jawaharlal Nehru University, New Delhi, India
- Beena Thomas
  - Indian Council of Medical Research, National Institute for Research in Tuberculosis, Chennai, India
- Felix Greaves
  - Science, Evidence and Analytics, National Institute for Health and Care Excellence, England, United Kingdom
  - Faculty of Medicine, School of Public Health, Imperial College London, United Kingdom
- Shruti Jayakumar
  - Department of Surgery & Cancer, Imperial College London, St. Mary’s Hospital, London, United Kingdom
  - Institute of Global Health Innovation, Imperial College London, South Kensington Campus, London, United Kingdom
- Pankaj Jain
  - Health Plan Consumer and Provider Technology, Highmark Health, Pittsburgh, Pennsylvania, United States of America
  - Department of Marketing, Indiana University of Pennsylvania, Indiana, Pennsylvania, United States of America
- Marina Rojo
  - Public Health Innovation Lab, Med School, Buenos Aires University, Argentina
- John G. Meara
  - Department of Plastic and Oral Surgery, Longwood Avenue, Boston, Massachusetts, United States of America
- Viknesh Sounderajah
  - Department of Surgery & Cancer, Imperial College London, St. Mary’s Hospital, London, United Kingdom
  - Institute of Global Health Innovation, Imperial College London, South Kensington Campus, London, United Kingdom
- Leo Anthony Celi
  - Division of Pulmonary, Critical Care and Pain Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, United States of America
  - Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Boston, Massachusetts, United States of America
- Ara Darzi
  - Department of Surgery & Cancer, Imperial College London, St. Mary’s Hospital, London, United Kingdom
  - Institute of Global Health Innovation, Imperial College London, South Kensington Campus, London, United Kingdom

25
Tiangco B, Daguit SEJ, Astrologo NC, Flores L, Parma RN, Celi LA. Challenges in the maintenance of an open hospital-based cancer registry system in a low-to-middle-income country (LMIC): 2017-2022 experience. PLOS Digit Health 2024; 3:e0000328. [PMID: 38265986 PMCID: PMC10807826 DOI: 10.1371/journal.pdig.0000328] [Received: 07/17/2023] [Accepted: 12/14/2023] [Indexed: 01/26/2024]
Abstract
Hospital-based cancer registries (HBCRs) record data on all patients diagnosed with and/or treated for cancer at a healthcare facility. They are used to evaluate the burden of the disease and the quality of healthcare services at that hospital, helping to improve patient care. The CARE PH app was created as a tool to facilitate a system of hospital-based cancer registries in the Philippines, a lower-middle-income country. From 2017 to 2022, a total of 60,021 cancer registrants from 44 CARE PH hospitals were entered into the database. Breast cancer was the most common primary site, accounting for 17,660 cases (29.4%). This was followed by colorectal cancer at 11.1%, cervical cancer at 6.2%, head and neck cancer at 5.9%, and prostate and other male genital cancer at 5.1%. Among the 30 data fields collected, 17 had 0-20% missing data, eight had 21-90% missing data, and five had 91-100% missing data. Most of the data fields with missing data are in the treatment and follow-up modules, which are stored in separate forms in a patient's record. Digital transformation of hospitals from paper-based charts to electronic medical records (EMRs), and the integration of the HBCR with the EMR and hospital information system, will likely be the best solution for these limitations. It is recommended that the creation and maintenance of HBCRs nationwide be harmonized and embedded in all relevant national programs and legislation. The development of an information technology process based on a cancer patient's journey should be built on an open system embedded in a well-designed enterprise architecture, functioning under the guidance of a strong leadership and governance team. All of these must be present in order to create and maintain a robust HBCR that is useful for furthering cancer registry and research in the country.
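The grouping of data fields by share of missing values described above can be sketched as a simple binning step. This is a hypothetical illustration only: the field names, the toy records, and the helper names `missing_fraction` and `missing_band` are invented, and the rule that `None` or an empty string counts as missing is an assumption. Only the three bands are taken from the abstract.

```python
def missing_fraction(records, field):
    """Share of records where the field is absent, None, or an empty string."""
    missing = sum(1 for record in records if record.get(field) in (None, ""))
    return missing / len(records)


def missing_band(percent_missing):
    """Assign a whole-number percentage to one of the three reported bands."""
    if percent_missing <= 20:
        return "0-20%"
    if percent_missing <= 90:
        return "21-90%"
    return "91-100%"


# Hypothetical registry rows; real registry fields and values will differ.
records = [
    {"primary_site": "breast", "stage": "II", "followup_date": None},
    {"primary_site": "colorectal", "stage": "", "followup_date": None},
    {"primary_site": "cervical", "stage": None, "followup_date": None},
    {"primary_site": "breast", "stage": "I", "followup_date": ""},
]
fields = ["primary_site", "stage", "followup_date"]

# Round to a whole percentage before binning so every value falls in a band.
summary = {
    field: missing_band(round(100 * missing_fraction(records, field)))
    for field in fields
}
print(summary)
```

Running a check like this per module would make it easy to see whether the gaps cluster in the treatment and follow-up forms, as the abstract reports.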
Affiliation(s)
- Beatrice Tiangco
  - Cancer CARE Registry and Research Philippines Foundation, Inc, Pasig, Philippines
  - University of the Philippines National Institutes of Health, Manila, Philippines
- Nicole Cathlene Astrologo
  - Cancer CARE Registry and Research Philippines Foundation, Inc, Pasig, Philippines
  - University of the Philippines Los Baños, Los Baños, Philippines
- Leo Flores
  - Cancer CARE Registry and Research Philippines Foundation, Inc, Pasig, Philippines
- Ric Nonato Parma
  - Cancer CARE Registry and Research Philippines Foundation, Inc, Pasig, Philippines
- Leo Anthony Celi
  - Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
  - Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, United States of America
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America

26
Zack T, Lehman E, Suzgun M, Rodriguez JA, Celi LA, Gichoya J, Jurafsky D, Szolovits P, Bates DW, Abdulnour REE, Butte AJ, Alsentzer E. Assessing the potential of GPT-4 to perpetuate racial and gender biases in health care: a model evaluation study. Lancet Digit Health 2024; 6:e12-e22. [PMID: 38123252 DOI: 10.1016/s2589-7500(23)00225-x] [Received: 07/17/2023] [Revised: 09/30/2023] [Accepted: 10/26/2023] [Indexed: 12/23/2023]
Abstract
BACKGROUND Large language models (LLMs) such as GPT-4 hold great promise as transformative tools in health care, ranging from automating administrative tasks to augmenting clinical decision making. However, these models also pose a danger of perpetuating biases and delivering incorrect medical diagnoses, which can have a direct, harmful impact on medical care. We aimed to assess whether GPT-4 encodes racial and gender biases that impact its use in health care. METHODS Using the Azure OpenAI application interface, this model evaluation study tested whether GPT-4 encodes racial and gender biases and examined the impact of such biases on four potential applications of LLMs in the clinical domain, namely medical education, diagnostic reasoning, clinical plan generation, and subjective patient assessment. We conducted experiments with prompts designed to resemble typical use of GPT-4 within clinical and medical education applications. We used clinical vignettes from NEJM Healer and from published research on implicit bias in health care. GPT-4 estimates of the demographic distribution of medical conditions were compared with true US prevalence estimates. Differential diagnosis and treatment planning were evaluated across demographic groups using standard statistical tests for significance between groups. FINDINGS We found that GPT-4 did not appropriately model the demographic diversity of medical conditions, consistently producing clinical vignettes that stereotype demographic presentations. The differential diagnoses created by GPT-4 for standardised clinical vignettes were more likely to include diagnoses that stereotype certain races, ethnicities, and genders. Assessment and plans created by the model showed significant association between demographic attributes and recommendations for more expensive procedures as well as differences in patient perception. 
INTERPRETATION Our findings highlight the urgent need for comprehensive and transparent bias assessments of LLM tools such as GPT-4 for intended use cases before they are integrated into clinical care. We discuss the potential sources of these biases and potential mitigation strategies before clinical implementation. FUNDING Priscilla Chan and Mark Zuckerberg.
Affiliation(s)
- Travis Zack
  - Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA, USA; Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, CA, USA
- Eric Lehman
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Mirac Suzgun
  - Department of Computer Science, Stanford University, Stanford, CA, USA; Stanford Law School, Stanford University, Stanford, CA, USA
- Jorge A Rodriguez
  - Division of General Internal Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Leo Anthony Celi
  - Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA; Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA; Department of Biostatistics, Harvard T H Chan School of Public Health, Boston, MA, USA
- Judy Gichoya
  - Department of Radiology, Emory University, Atlanta, GA, USA
- Dan Jurafsky
  - Department of Computer Science, Stanford University, Stanford, CA, USA; Department of Linguistics, Stanford University, Stanford, CA, USA
- Peter Szolovits
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- David W Bates
  - Division of General Internal Medicine, Brigham and Women's Hospital, Boston, MA, USA; Department of Health Policy and Management, Harvard T H Chan School of Public Health, Boston, MA, USA
- Raja-Elie E Abdulnour
  - Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA
- Atul J Butte
  - Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA, USA; Center for Data-Driven Insights and Innovation, University of California, Office of the President, Oakland, CA, USA
- Emily Alsentzer
  - Division of General Internal Medicine, Brigham and Women's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA.
|
27
|
Liu X, Shen M, Lie M, Zhang Z, Liu C, Li D, Mark RG, Zhang Z, Celi LA. Evaluating Prognostic Bias of Critical Illness Severity Scores Based on Age, Sex, and Primary Language in the United States: A Retrospective Multicenter Study. Crit Care Explor 2024; 6:e1033. [PMID: 38239408 PMCID: PMC10796141 DOI: 10.1097/cce.0000000000001033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/22/2024] Open
Abstract
OBJECTIVES Although illness severity scoring systems are widely used to support clinical decision-making and assess ICU performance, their potential bias across different age, sex, and primary language groups has not been well-studied. DESIGN SETTING AND PATIENTS We aimed to identify potential bias of Sequential Organ Failure Assessment (SOFA) and Acute Physiology and Chronic Health Evaluation (APACHE) IVa scores via large ICU databases. SETTING/PATIENTS This multicenter, retrospective study was conducted using data from the Medical Information Mart for Intensive Care (MIMIC) and eICU Collaborative Research Database. SOFA and APACHE IVa scores were obtained from ICU admission. Hospital mortality was the primary outcome. Discrimination (area under receiver operating characteristic [AUROC] curve) and calibration (standardized mortality ratio [SMR]) were assessed for all subgroups. INTERVENTIONS Not applicable. MEASUREMENTS AND MAIN RESULTS A total of 196,310 patient encounters were studied. Discrimination for both scores was worse in older patients than in younger patients and in female patients than in male patients. In MIMIC, discrimination of SOFA in non-English primary language speakers was worse than in English speakers (AUROC 0.726 vs. 0.783, p < 0.0001). Evaluating calibration via SMR showed statistically significant underestimations of mortality when compared with the overall cohort in the oldest patients for both SOFA and APACHE IVa, female patients (1.09) for SOFA, and non-English primary language patients (1.38) for SOFA in MIMIC. CONCLUSIONS Differences in discrimination and calibration of two scores across varying age, sex, and primary language groups suggest illness severity scores are prone to bias in mortality predictions. Caution must be taken when using them for quality benchmarking and decision-making among diverse real-world populations.
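The two metrics named in this abstract can be illustrated with a minimal sketch. It assumes the SMR is computed as observed deaths divided by the sum of predicted mortality risks, and the AUROC as the pairwise ranking probability; the function names and the toy data are hypothetical and are not taken from the study.

```python
from typing import Sequence


def standardized_mortality_ratio(died: Sequence[int], predicted_risk: Sequence[float]) -> float:
    """Observed deaths divided by expected deaths (sum of predicted risks)."""
    if len(died) != len(predicted_risk):
        raise ValueError("inputs must have the same length")
    expected = sum(predicted_risk)
    if expected <= 0:
        raise ValueError("expected deaths must be positive")
    return sum(died) / expected


def auroc(died: Sequence[int], predicted_risk: Sequence[float]) -> float:
    """Probability that a non-survivor is scored higher than a survivor (ties count 0.5)."""
    pos = [r for d, r in zip(died, predicted_risk) if d == 1]
    neg = [r for d, r in zip(died, predicted_risk) if d == 0]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy subgroup: 1 = died in hospital, 0 = survived.
died = [1, 0, 1, 0, 0, 1, 0, 0]
risk = [0.6, 0.2, 0.3, 0.1, 0.4, 0.5, 0.2, 0.1]

print(f"SMR   = {standardized_mortality_ratio(died, risk):.2f}")
print(f"AUROC = {auroc(died, risk):.3f}")
```

An SMR above 1 means more deaths were observed than the score predicted, which is how the abstract's values of 1.09 and 1.38 are read as underestimation of mortality.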
Affiliation(s)
- Xiaoli Liu
  - Center for Artificial Intelligence in Medicine, The General Hospital of PLA, Beijing, China
  - School of Biological Science and Medical Engineering, Beihang University, Beijing, China
  - Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA
- Max Shen
  - Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA
- Margaret Lie
  - Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA
- Zhongheng Zhang
  - Department of Emergency Medicine, Key Laboratory of Precision Medicine in Diagnosis and Monitoring Research of Zhejiang Province, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Chao Liu
  - Department of Critical Care Medicine, The First Medical Center, The General Hospital of PLA, Beijing, China
- Deyu Li
  - School of Biological Science and Medical Engineering, Beihang University, Beijing, China
- Roger G Mark
  - Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA
- Zhengbo Zhang
  - Center for Artificial Intelligence in Medicine, The General Hospital of PLA, Beijing, China
  - School of Biological Science and Medical Engineering, Beihang University, Beijing, China
- Leo Anthony Celi
  - Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA
  - Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA
|
28
|
Agha-Mir-Salim L, McCullum L, Dähnert E, Scheel YD, Wilson A, Carpio M, Chan C, Lo C, Maher L, Dressler C, Balzer F, Celi LA, Poncette AS, Pelter MM. Interdisciplinary collaboration in critical care alarm research: A bibliometric analysis. Int J Med Inform 2024; 181:105285. [PMID: 37977055 DOI: 10.1016/j.ijmedinf.2023.105285] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Revised: 08/30/2023] [Accepted: 11/02/2023] [Indexed: 11/19/2023]
Abstract
BACKGROUND Alarm fatigue in nurses is a major patient safety concern in the intensive care unit. This is caused by exposure to high rates of false and non-actionable alarms. Despite decades of research, the problem persists, leading to stress, burnout, and patient harm resulting from true missed events. While engineering approaches to reduce false alarms have spurred hope, they appear to lack collaboration between nurses and engineers to produce real-world solutions. The aim of this bibliometric analysis was to examine the relevant literature to quantify the level of authorial collaboration between nurses, physicians, and engineers. METHODS We conducted a bibliometric analysis of articles on alarm fatigue and false alarm reduction strategies in critical care published between 2010 and 2022. Data were extracted at the article and author level. The percentages of author disciplines per publication were calculated by study design, journal subject area, and other article-level factors. RESULTS A total of 155 articles with 583 unique authors were identified. While 31.73 % (n = 185) of the unique authors had a nursing background, publications using an engineering study design (n = 46), e.g., model development, had a very low involvement of nursing authors (mean proportion at 1.09 %). Observational studies (n = 58) and interventional studies (n = 33) had a higher mean involvement of 52.27 % and 47.75 %, respectively. Articles published in nursing journals (n = 32) had the highest mean proportion of nursing authors (80.32 %), while those published in engineering journals (n = 46) had the lowest (9.00 %), with 6 (13.04 %) articles having one or more nurses as co-authors. CONCLUSION Minimal involvement of nursing expertise in alarm research utilizing engineering methodologies may be one reason for the lack of successful, real-world solutions to ameliorate alarm fatigue. 
Fostering a collaborative, interdisciplinary research culture can promote a common publication culture across fields and may yield sustainable implementation of technological solutions in healthcare.
Affiliation(s)
- Louis Agha-Mir-Salim
  - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany.
- Lucas McCullum
  - Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Enrico Dähnert
  - Hospital Management, Nursing Directorate, Practice Development and Nursing Science, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Yanick-Daniel Scheel
  - Hospital Management, Nursing Directorate, Practice Development and Nursing Science, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Ainsley Wilson
  - Department of Nursing, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Marianne Carpio
  - Medical Intensive Care Unit, Boston Children's Hospital, Boston, MA, USA
- Carmen Chan
  - School of Nursing and Health Professions, University of San Francisco, San Francisco, CA, USA
- Claudia Lo
  - School of Nursing and Health Professions, University of San Francisco, San Francisco, CA, USA; Department of Business Analytics and Information Systems, School of Management, University of San Francisco, San Francisco, CA, USA
- Lindsay Maher
  - School of Nursing and Health Professions, University of San Francisco, San Francisco, CA, USA
- Corinna Dressler
  - Medical Library, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Felix Balzer
  - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Leo Anthony Celi
  - Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA; Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Akira-Sebastian Poncette
  - Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany; Department of Anesthesiology and Intensive Care Medicine, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Michele M Pelter
  - Department of Physiological Nursing, University of California San Francisco School of Nursing, San Francisco, CA, USA
|
29
|
Khavandi S, Zaghloul F, Higham A, Lim E, de Pennington N, Celi LA. Investigating the Impact of Automation on the Health Care Workforce Through Autonomous Telemedicine in the Cataract Pathway: Protocol for a Multicenter Study. JMIR Res Protoc 2023; 12:e49374. [PMID: 38051569 PMCID: PMC10731565 DOI: 10.2196/49374] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Revised: 09/17/2023] [Accepted: 09/18/2023] [Indexed: 12/07/2023] Open
Abstract
BACKGROUND While digital health innovations are increasingly being adopted by health care organizations, implementation is often carried out without considering the impacts on frontline staff who will be using the technology and who will be affected by its introduction. The enthusiasm surrounding the use of artificial intelligence (AI)-enabled digital solutions in health care is tempered by uncertainty around how it will change the working lives and practices of health care professionals. Digital enablement can be viewed as facilitating enhanced effectiveness and efficiency by improving services and automating cognitive labor, yet the implementation of such AI technology comes with challenges related to changes in work practices brought by automation. This research explores staff experiences before and after care pathway automation with an autonomous clinical conversational assistant, Dora (Ufonia Ltd), that is able to automate routine clinical conversations. OBJECTIVE The primary objective is to examine the impact of AI-enabled automation on clinicians, allied health professionals, and administrators who provide or facilitate health care to patients in high-volume, low-complexity care pathways. In the process of transforming care pathways through automation of routine tasks, staff will increasingly "work at the top of their license." The impact of this fundamental change on the professional identity, well-being, and work practices of the individual is poorly understood at present. METHODS We will adopt a multiple case study approach, combining qualitative and quantitative data collection methods, over 2 distinct phases, namely phase A (preimplementation) and phase B (postimplementation). RESULTS The analysis is expected to reveal the interrelationship between Dora and those affected by its introduction. 
This will reveal how tasks and responsibilities have changed or shifted, current tensions and contradictions, ways of working, and challenges, benefits, and opportunities as perceived by those on the frontlines of the health care system. The findings will enable a better understanding of the resistance or susceptibility of different stakeholders within the health care workforce and encourage managerial awareness of differing needs, demands, and uncertainties. CONCLUSIONS The implementation of AI in the health care sector, as well as the body of research on this topic, remain in their infancy. The project's key contribution will be to understand the impact of AI-enabled automation on the health care workforce and their work practices. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID) PRR1-10.2196/49374.
Affiliation(s)
- Sarah Khavandi
  - Ufonia, Oxford, United Kingdom
  - Gloucestershire Hospitals NHS Foundation Trust, Cheltenham, United Kingdom
  - Imperial College School of Medicine, Imperial College London, London, United Kingdom
- Fatema Zaghloul
  - Operations and Management Science, Healthcare and Innovation, University of Bristol, Bristol, United Kingdom
- Aisling Higham
  - Ufonia, Oxford, United Kingdom
  - Royal Berkshire NHS Foundation Trust, Reading, United Kingdom
- Ernest Lim
  - Ufonia, Oxford, United Kingdom
  - Department of Computer Science, University of York, York, United Kingdom
- Leo Anthony Celi
  - Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, United States
  - Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, United States
  - Department of Biostatistics, Harvard TH Chan School of Public Health, Boston, MA, United States
|
30
|
Gallifant J, Celi LA, Pierce RL. Digital determinants of health: opportunities and risks amidst health inequities. Nat Rev Nephrol 2023; 19:749-750. [PMID: 37626271 DOI: 10.1038/s41581-023-00763-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/27/2023]
Affiliation(s)
- Jack Gallifant
  - Department of Critical Care, Guy's & St Thomas' NHS Trust, London, UK.
- Leo Anthony Celi
  - Division of Pulmonary, Critical Care, and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA.
  - Massachusetts Institute of Technology, Laboratory for Computational Physiology, Cambridge, MA, USA.
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
- Robin L Pierce
  - The Law School, Faculty of Humanities, Arts, and Social Sciences, University of Exeter, Exeter, UK
|
31
|
Samadani A, Wang T, van Zon K, Celi LA. VAP risk index: Early prediction and hospital phenotyping of ventilator-associated pneumonia using machine learning. Artif Intell Med 2023; 146:102715. [PMID: 38042602 DOI: 10.1016/j.artmed.2023.102715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2022] [Revised: 11/03/2023] [Accepted: 11/05/2023] [Indexed: 12/04/2023]
Abstract
BACKGROUND Ventilator-associated pneumonia (VAP) is a leading cause of morbidity and mortality in intensive care units (ICUs). Early identification of patients at risk of VAP enables early intervention, which in turn improves patient outcomes. We developed a predictive model for individualized risk assessment utilizing machine learning to identify patients at risk of developing VAP. METHODS The Philips eRI dataset, a multi-institution electronic medical record (EMR), was used for model development. For adult (≥18y) patients, we propose a set of criteria using indications of the start of a new antibiotic treatment temporally contiguous to a microbiological test to mark suspected infection events, of which those with a positive culture are labeled as presumed VAP if 1) the event occurs at least 48 h after intubation, and 2) there are no indications of community-acquired pneumonia (CAP) or other hospital-acquired infections (HAI) in the patient charts. The resulting VAP and no-VAP (control) cases were then used to build an ensemble of decision trees to predict the risk of VAP in the next 24 h using data on patients' demographics, vitals, labs, and ventilator settings. RESULTS The resulting model predicts the development of VAP 24 h in advance with an AUC of 76 % and AUPRC of 75 %. Additionally, we group hospitals that are similar in healthcare processes into distinct clusters and characterize VAP prediction for the identified hospital clusters. We show inter-hospital (teaching status and healthcare processes) and cohort-specific (age groups, gender, early vs late VAP, ICU mortality status) differences in VAP prediction and associated symptomologies. CONCLUSIONS Our proposed VAP criteria use clinical actions to mark incidences of presumed VAP infection, which enables the development of models for early detection of these events. 
We curated a patient cohort using these criteria and used it to build a model for predicting impending VAP events prior to clinical suspicions. We present a clustering approach for tailoring the VAP prediction model for different hospital types based on their EMR data characteristics. The model provides an instantaneous risk score that allows early interventions and confirmatory diagnostic actions.
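The labelling rule described in the METHODS can be sketched as a small predicate. This is only an illustration under stated assumptions: the record type and its field names are hypothetical, and the step that pairs a new antibiotic start with a temporally contiguous microbiological test is taken as already done, since the abstract does not give the time window.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class SuspectedInfectionEvent:
    """Hypothetical record for one suspected infection event (fields are assumptions)."""
    event_time: datetime
    intubation_time: datetime
    positive_culture: bool
    cap_or_other_hai_documented: bool


def is_presumed_vap(event: SuspectedInfectionEvent) -> bool:
    """Apply the abstract's criteria: positive culture, at least 48 h after
    intubation, and no indication of CAP or another hospital-acquired infection."""
    late_enough = event.event_time - event.intubation_time >= timedelta(hours=48)
    return (
        event.positive_culture
        and late_enough
        and not event.cap_or_other_hai_documented
    )


# Hypothetical example: culture-positive event 50 h after intubation.
intubated = datetime(2023, 1, 1, 8, 0)
example = SuspectedInfectionEvent(
    event_time=intubated + timedelta(hours=50),
    intubation_time=intubated,
    positive_culture=True,
    cap_or_other_hai_documented=False,
)
print(is_presumed_vap(example))
```

Events that fail any of the three checks would fall into the no-VAP (control) group or be excluded, depending on cohort rules the abstract does not spell out.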
Affiliation(s)
- Ali Samadani
  - Philips Research North America, Cambridge, MA, USA.
- Taiyao Wang
  - Philips Research North America, Cambridge, MA, USA
- Kees van Zon
  - Philips Research North America, Cambridge, MA, USA
- Leo Anthony Celi
  - Massachusetts Institute of Technology, Laboratory for Computational Physiology, Cambridge, MA, USA; Beth Israel Deaconess Medical Center, Division of Pulmonary, Critical Care, and Sleep Medicine, Boston, MA, USA
|
32
|
Peine A, Gronholz M, Seidl-Rathkopf K, Wolfram T, Hallawa A, Reitz A, Celi LA, Marx G, Martin L. Standardized Comparison of Voice-Based Information and Documentation Systems to Established Systems in Intensive Care: Crossover Study. JMIR Med Inform 2023; 11:e44773. [PMID: 38015593 DOI: 10.2196/44773] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2022] [Revised: 06/21/2023] [Accepted: 10/17/2023] [Indexed: 11/29/2023] Open
Abstract
BACKGROUND The medical teams in intensive care units (ICUs) spend increasing amounts of time at computer systems for data processing, input, and interpretation purposes. As each patient creates about 1000 data points per hour, the available information is abundant, making the interpretation difficult and time-consuming. This data flood leads to a decrease in time for evidence-based, patient-centered care. Information systems, such as patient data management systems (PDMSs), are increasingly used at ICUs. However, they often create new challenges arising from the increasing documentation burden. OBJECTIVE New concepts, such as artificial intelligence (AI)-based assistant systems, are hence introduced to the workflow to cope with these challenges. However, there is a lack of standardized, published metrics in order to compare the various data input and management systems in the ICU setting. The objective of this study is to compare established documentation and retrieval processes with newer methods, such as PDMSs and voice information and documentation systems (VIDSs). METHODS In this crossover study, we compare traditional, paper-based documentation systems with PDMSs and newer AI-based VIDSs in terms of performance (required time), accuracy, mental workload, and user experience in an intensive care setting. Performance is assessed on a set of 6 standardized, typical ICU tasks, ranging from documentation to medical interpretation. RESULTS A total of 60 ICU-experienced medical professionals participated in the study. The VIDS showed a statistically significant advantage compared to the other 2 systems. The tasks were completed significantly faster with the VIDS than with the PDMS (1-tailed t59=12.48; Cohen d=1.61; P<.001) or paper documentation (t59=20.41; Cohen d=2.63; P<.001). Significantly fewer errors were made with VIDS than with the PDMS (t59=3.45; Cohen d=0.45; P=.03) and paper-based documentation (t59=11.2; Cohen d=1.45; P<.001). 
The analysis of the mental workload of VIDS and PDMS showed no statistically significant difference (P=.06). However, the analysis of subjective user perception showed a statistically significant perceived benefit of the VIDS compared to the PDMS (P<.001) and paper documentation (P<.001). CONCLUSIONS The results of this study show that the VIDS reduced error rate, documentation time, and mental workload regarding the set of 6 standardized typical ICU tasks. In conclusion, this indicates that AI-based systems such as the VIDS tested in this study have the potential to reduce this workload and improve evidence-based and safe patient care.
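The reported effect sizes can be cross-checked against the t statistics. The sketch below assumes the Cohen d values are the paired-design variant d = t / sqrt(n), with n = 60 participants (df = 59); the abstract does not state which formula was used, so this is a consistency check rather than the authors' method, and the function name is hypothetical.

```python
import math


def cohen_dz_from_paired_t(t_stat: float, n_pairs: int) -> float:
    """Effect size for a paired design: d_z = t / sqrt(n)."""
    return t_stat / math.sqrt(n_pairs)


# (t statistic, Cohen d) pairs as reported in the abstract, n = 60 participants.
reported = [(12.48, 1.61), (20.41, 2.63), (3.45, 0.45), (11.2, 1.45)]

for t_stat, d_reported in reported:
    d_calc = cohen_dz_from_paired_t(t_stat, 60)
    print(f"t = {t_stat:>5}: computed d = {d_calc:.2f}, reported d = {d_reported}")
```

Under this assumption all four computed values round to the figures given in the RESULTS.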
Affiliation(s)
- Arne Peine
  - Department of Intensive Care Medicine and Intermediate Care, University Hospital RWTH Aachen, Aachen, Germany
  - Clinomic Group GmbH, Aachen, Germany
- Ahmed Hallawa
  - Department of Intensive Care Medicine and Intermediate Care, University Hospital RWTH Aachen, Aachen, Germany
- Leo Anthony Celi
  - Laboratory of Computational Physiology, Harvard-MIT Division of Health Sciences Technology, Cambridge, MA, United States
  - Beth Israel Deaconess Medical Center, Boston, MA, United States
- Gernot Marx
  - Department of Intensive Care Medicine and Intermediate Care, University Hospital RWTH Aachen, Aachen, Germany
- Lukas Martin
  - Department of Intensive Care Medicine and Intermediate Care, University Hospital RWTH Aachen, Aachen, Germany
  - Clinomic Group GmbH, Aachen, Germany
|
33
|
Gallifant J, Kistler EA, Nakayama LF, Zera C, Kripalani S, Ntatin A, Fernandez L, Bates D, Dankwa-Mullan I, Celi LA. Disparity dashboards: an evaluation of the literature and framework for health equity improvement. Lancet Digit Health 2023; 5:e831-e839. [PMID: 37890905 PMCID: PMC10639125 DOI: 10.1016/s2589-7500(23)00150-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2023] [Revised: 06/25/2023] [Accepted: 07/26/2023] [Indexed: 10/29/2023]
Abstract
The growing recognition of differences in health outcomes across populations has led to a slow but increasing shift towards transparent reporting of patient outcomes. In addition, pay-for-equity initiatives, such as those proposed by the Centers for Medicare and Medicaid, will require the reporting of health outcomes across subgroups over time. Dashboards offer one means of visualising data in the health-care context that can highlight essential disparities in clinical outcomes, guide targeted quality-improvement efforts, and ultimately improve health equity. In this Viewpoint, we evaluate all studies that have reported the successful development of a disparity dashboard and share the data collected and unintended consequences reported. We propose a framework for systematic equality improvement through incentivisation of the collecting and reporting of health data and through implementation of reward systems to reduce health disparities.
Affiliation(s)
- Jack Gallifant
  - Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA.
- Emmett Alexander Kistler
  - Division of Pulmonary, Critical Care, and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Luis Filipe Nakayama
  - Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA; Department of Ophthalmology, São Paulo Federal University, São Paulo, Brazil
- Chloe Zera
  - Department of Obstetrics, Gynecology and Reproductive Biology, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Sunil Kripalani
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Adelline Ntatin
  - Department of Health Equity, Beth Israel Lahey Health, Boston, MA, USA
- Leonor Fernandez
  - Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
- David Bates
  - Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, USA; Department of Medicine, Harvard Medical School, Boston, MA, USA
- Irene Dankwa-Mullan
  - Merative & Center for AI, Research, and Evaluation, IBM Watson Health, Cambridge, MA, USA; Department of Health Policy and Management, Milken Institute School of Public Health, George Washington University, Washington, DC, USA
- Leo Anthony Celi
  - Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA; Division of Pulmonary, Critical Care, and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA; Department of Biostatistics, Harvard T H Chan School of Public Health, Boston, MA, USA
|
34
|
Nazer L, Abusara A, Aloran B, Szakmany T, Nabulsi H, Petushkov A, Charpignon ML, Ahmed T, Cobanaj M, Elaibaid M, Lee C, Li C, Mlombwa D, Moukheiber S, Panitchote A, Parke R, Shapiro S, Link Woite N, Celi LA. Patient diversity and author representation in clinical studies supporting the Surviving Sepsis Campaign guidelines for management of sepsis and septic shock 2021: a systematic review of citations. BMC Infect Dis 2023; 23:751. [PMID: 37915042 PMCID: PMC10621092 DOI: 10.1186/s12879-023-08745-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 10/25/2023] [Indexed: 11/03/2023] Open
Abstract
BACKGROUND The generalizability of the Surviving Sepsis Campaign (SSC) guidelines to various patient populations and hospital settings has been debated. A quantitative assessment of the diversity and representation in the clinical evidence supporting the guidelines would help evaluate the generalizability of the recommendations and identify strategic research goals and priorities. In this study, we evaluated the diversity of patients in the original studies, in terms of sex, race/ethnicity, and geographical location. We also assessed diversity in sex and geographical representation among study first and last authors. METHODS All clinical studies cited in support of the 2021 SSC adult guideline recommendations were identified. Original clinical studies were included, while editorials, reviews, non-clinical studies, and meta-analyses were excluded. For eligible studies, we recorded the proportion of male patients, percentage of each represented racial/ethnic subgroup (when available), and countries in which they were conducted. We also recorded the sex and location of the first and last authors. The World Bank classification was used to categorize countries. RESULTS The SSC guidelines included six sections, with 85 recommendations based on 351 clinical studies. The proportion of male patients ranged from 47 to 62%. Most studies did not report the racial/ethnic distribution of the included patients; when they did so, most were White patients (68-77%). Most studies were conducted in high-income countries (77-99%), which included Europe/Central Asia (33-66%) and North America (36-55%). Moreover, most first/last authors were males (55-93%) and from high-income countries (77-99%). CONCLUSIONS To enhance the generalizability of the SSC guidelines, stakeholders should define strategies to enhance the diversity and representation in clinical studies.
Though there was reasonable representation in sex among patients included in clinical studies, the evidence did not reflect diversity in the race/ethnicity and geographical locations. There was also lack of diversity among the first and last authors contributing to the evidence.
Affiliation(s)
- Lama Nazer
  - King Hussein Cancer Center, Amman, Jordan.
- Chenyu Li
  - University of Pittsburgh School of Medicine, Pittsburgh, USA
- Leo Anthony Celi
  - Massachusetts Institute of Technology, Massachusetts, USA
  - Harvard T.H. Chan School of Public Health, Massachusetts, USA
  - Beth Israel Deaconess Medical Center, Massachusetts, Boston, USA
|
35
|
Zhang Z, Chen L, Liu X, Yang J, Huang J, Yang Q, Hu Q, Jin K, Celi LA, Hong Y. Exploring disease axes as an alternative to distinct clusters for characterizing sepsis heterogeneity. Intensive Care Med 2023; 49:1349-1359. [PMID: 37792053 DOI: 10.1007/s00134-023-07226-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/21/2023] [Accepted: 09/05/2023] [Indexed: 10/05/2023]
Abstract
PURPOSE Various studies have analyzed sepsis subtypes, yet the reproducibility of such results remains unclear. This study aimed to determine the reproducibility of sepsis subtypes across multiple cohorts. METHODS The study examined 63,547 sepsis patients from six distinct cohorts who had similar sepsis-related characteristics (vital signs, lactate, sequential organ failure assessment score, bilirubin, serum, urine output, and Glasgow coma scale). Identical cluster analysis techniques were used, employing 27 clustering schemes, and normalized mutual information (NMI), a metric ranging from 0 to 1 with higher values indicating better concordance, was employed to quantify the clustering solutions' reproducibility. Principal component analysis (PCA) was utilized to obtain the disease axis, and its uniformity across cohorts was evaluated through patterns of feature loading and correlation. RESULTS The reproducibility of sepsis clustering subtypes across the various studies was modest (median NMI ranging from 0.08 to 0.54). The top-down transfer learning method (model trained on cohorts with greater severity was transferred to cohorts with lower severity score) had a higher NMI value than the bottom-up approach (median [Q1, Q3]: 0.64 [0.49, 0.78] vs. 0.23 [0.2, 0.31], p < 0.001). The reproducibility was greater when the transfer solution was performed within United States (US) cohorts. The PCA analysis revealed that the correlation pattern between variables was consistent across all cohorts, and the first two disease axes were the "shock axis" and "systemic inflammatory response syndrome (SIRS) axis." CONCLUSIONS Cluster analysis of sepsis patients across various cohorts showed modest reproducibility. Sepsis heterogeneity is better characterized through continuous disease axes that coexist to varying degrees within the same individual instead of mutually exclusive subtypes.
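The normalized mutual information (NMI) metric mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, and normalizing by the arithmetic mean of the two entropies is an assumption, since the abstract does not state which normalization was used.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a list of cluster labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two labelings of the same patients."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def nmi(a, b):
    """NMI in [0, 1]; ASSUMPTION: arithmetic-mean normalization."""
    if len(a) != len(b):
        raise ValueError("both labelings must cover the same patients")
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0 and h_b == 0:
        return 1.0  # both labelings put everyone in a single cluster
    return mutual_information(a, b) / ((h_a + h_b) / 2)


# Hypothetical toy labelings: the same partition with swapped label names,
# and a partition that is statistically independent of the first.
same_partition = nmi([0, 0, 1, 1], [1, 1, 0, 0])
independent = nmi([0, 0, 1, 1], [0, 1, 0, 1])
```

Because NMI ignores the label names themselves, two clustering solutions that split patients identically score close to 1 even if one calls a subtype "A" and the other calls it "2", which is why it suits cross-cohort comparisons of subtypes.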
Affiliation(s)
- Zhongheng Zhang
- Department of Emergency Medicine, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, 310016, China.
| | - Lin Chen
- Neurological Intensive Care Unit, Department of Neurosurgery, Affiliated Jinhua Hospital, Zhejiang University School of Medicine, Jinhua, China
| | - Xiaoli Liu
- Center for Artificial Intelligence in Medicine, The General Hospital of PLA, Beijing, China
| | - Jie Yang
- Department of Emergency Medicine, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, 310016, China
| | - Jiajie Huang
- Department of Emergency Medicine, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, 310016, China
| | - Qiling Yang
- Department of Critical Care, The Second Affiliated Hospital of Guangzhou Medical University, No. 250 Changgang East Road, Haizhu District, Guangzhou, Guangdong, China
| | - Qichao Hu
- Key Laboratory of Digital Technology in Medical Diagnostics of Zhejiang Province, Dian Diagnostics Group Co., Ltd., Hangzhou, Zhejiang Province, China
| | - Ketao Jin
- Department of Colorectal Surgery, Affiliated Jinhua Hospital, Zhejiang University School of Medicine, Jinhua, 321000, Zhejiang, China
| | - Leo Anthony Celi
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Yucai Hong
- Department of Emergency Medicine, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, 310016, China
| |
|
36
|
Gallifant J, Griffin M, Pierce RL, Celi LA. From quality improvement to equality improvement projects: A scoping review and framework. iScience 2023; 26:107924. [PMID: 37817930 PMCID: PMC10561034 DOI: 10.1016/j.isci.2023.107924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/12/2023] Open
Abstract
Increasing awareness of health disparities has led to proposals for a pay-for-equity scheme. Implementing such proposals requires systematic methods of collecting and reporting health outcomes for targeted demographics over time. This lays the foundation for a shift from quality improvement projects (QIPs) to equality improvement projects (EQIPs) that could evaluate adherence to standards and progress toward health equity. We performed a scoping review on EQIPs to inform a new framework for quality improvement through a health equity lens. Forty studies implemented an intervention after identifying a disparity compared to 149 others which merely identified group differences. Most evaluated race-based differences and were conducted at the institutional level, with representation in both the inpatient and outpatient settings. EQIPs that improved equity leveraged multidisciplinary expertise, healthcare staff education, and developed tools to track health outcomes continuously. EQIPs can help bridge the inequality gap and form part of an incentivized systematic equality improvement framework.
Affiliation(s)
- Jack Gallifant
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Critical Care, Guy’s and St Thomas’ NHS Foundation Trust, London, UK
| | - Molly Griffin
- Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
| | - Robin L. Pierce
- The Law School, School of Social Sciences and International Studies, University of Exeter, Exeter, UK
| | - Leo Anthony Celi
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Division of Pulmonary, Critical Care, and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| |
|
37
|
Teotia K, Jia Y, Woite NL, Celi LA, Matos J, Struja T. Variation in monitoring: Glucose measurement in the ICU as a case study to preempt spurious correlations. medRxiv 2023:2023.10.12.23296568. [PMID: 37873163 PMCID: PMC10593024 DOI: 10.1101/2023.10.12.23296568] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
Objective Health inequities can be influenced by demographic factors such as race and ethnicity, proficiency in English, and biological sex. Disparities may manifest as differential likelihood of testing which correlates directly with the likelihood of an intervention to address an abnormal finding. Our retrospective observational study evaluated the presence of variation in glucose measurements in the Intensive Care Unit (ICU). Methods Using the MIMIC-IV database (2008-2019), a single-center, academic referral hospital in Boston (USA), we identified adult patients meeting sepsis-3 criteria. Exclusion criteria were diabetic ketoacidosis, ICU length of stay under 1 day, and unknown race or ethnicity. We performed a logistic regression analysis to assess differential likelihoods of glucose measurements on day 1. A negative binomial regression was fitted to assess the frequency of subsequent glucose readings. Analyses were adjusted for relevant clinical confounders, and performed across three disparity proxy axes: race and ethnicity, sex, and English proficiency. Results We studied 24,927 patients, of which 19.5% represented racial and ethnic minority groups, 42.4% were female, and 9.8% had limited English proficiency. No significant differences were found for glucose measurement on day 1 in the ICU. This pattern was consistent irrespective of the axis of analysis, i.e. race and ethnicity, sex, or English proficiency. Conversely, subsequent measurement frequency revealed potential disparities. Specifically, males (incidence rate ratio (IRR) 1.06, 95% confidence interval (CI) 1.01 - 1.21), patients who identify themselves as Hispanic (IRR 1.11, 95% CI 1.01 - 1.21), or Black (IRR 1.06, 95% CI 1.01 - 1.12), and patients being English proficient (IRR 1.08, 95% CI 1.01 - 1.15) had higher chances of subsequent glucose readings. Conclusion We found disparities in ICU glucose measurements among patients with sepsis, albeit the magnitude was small. 
Variation in disease monitoring is a source of data bias that may lead to spurious correlations when modeling health data.
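The incidence rate ratios (IRRs) in the abstract above come from a negative binomial regression, where each coefficient is on the log scale. A minimal sketch of how a coefficient and its standard error translate into an IRR with a Wald 95% confidence interval follows. The function name and the input values are hypothetical and are not taken from the study, and the Wald interval is an assumption about how such intervals are typically formed.

```python
import math


def irr_with_ci(beta, se, z=1.96):
    """Convert a log-scale regression coefficient into an IRR and Wald CI.

    beta: coefficient from a count model with a log link (hypothetical input)
    se:   standard error of beta (hypothetical input)
    z:    normal quantile; 1.96 gives an approximate 95% interval
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Illustrative values only: a coefficient of 0.10 with standard error 0.04.
irr, lower, upper = irr_with_ci(0.10, 0.04)
```

An IRR above 1 with a lower bound above 1 corresponds to a group that receives more subsequent measurements per unit time than the reference group, which is how the abstract's small but nonzero disparities should be read.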
|
38
|
Struja T, Matos J, Lam B, Cao Y, Liu X, Jia Y, Sauer CM, D'Couto H, Dankwa-Mullan I, Celi LA, Waschka AK. Evaluating equitable care in the ICU: Creating a causal inference framework to assess the impact of life-sustaining interventions across racial and ethnic groups. medRxiv 2023:2023.10.12.23296933. [PMID: 37873267 PMCID: PMC10592988 DOI: 10.1101/2023.10.12.23296933] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
Background Variability in the provision of intensive care unit (ICU)-interventions may lead to disparities between socially defined racial-ethnic groups. Research Question We used causal inference to examine the use of invasive mechanical ventilation (IMV), renal replacement therapy (RRT), and vasopressor agents (VP) to identify disparities in outcomes across race-ethnicity in patients with sepsis. Study Design and Methods Single-center, academic referral hospital in Boston, Massachusetts, USA. Retrospective analysis of treatment effect with a targeted trial design categorized by treatment assignment within the first 24 hours in the MIMIC-IV dataset (2008-2019) using targeted maximum likelihood estimation. Of 76,943 ICU stays in MIMIC-IV, 32,971 adult stays fulfilling sepsis-3 criteria were included. The primary outcome was in-hospital mortality. Secondary outcomes were hospital-free days, and occurrence of nosocomial infection stratified by predicted mortality probability ranges and self-reported race-ethnicity. Average treatment effects by treatment type and race-ethnicity, Racial-ethnic group (REG) or White group (WG), were estimated. Results Of 19,419 admissions that met inclusion criteria, median age was 68 years, 57.4% were women, 82% were White, and mortality was 18.2%. There was no difference in mortality benefit associated with the administration of IMV, RRT, or VP between the REG and the WG. There was also no difference in hospital-free days or nosocomial infections. These findings are unchanged with different eligibility periods. Interpretation There were no differences in the treatment outcomes from three life-sustaining interventions in the ICU according to race-ethnicity. While there was no discernible harm from the treatments across mortality risk, there was also no measurable benefit. These findings highlight the need for research to understand better the risk-benefit of life-sustaining interventions in the ICU.
|
39
|
Matos J, Struja T, Gallifant J, Nakayama L, Charpignon ML, Liu X, Economou-Zavlanos N, Cardoso JS, Johnson KS, Bhavsar N, Gichoya J, Celi LA, Wong AI. BOLD: Blood-gas and Oximetry Linked Dataset - Open Source Research. medRxiv 2023:2023.10.03.23296485. [PMID: 37873343 PMCID: PMC10593048 DOI: 10.1101/2023.10.03.23296485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
Pulse oximeters measure peripheral arterial oxygen saturation (SpO2) noninvasively, while the gold standard (SaO2) involves arterial blood gas measurement. There are known racial and ethnic disparities in their performance. BOLD is a new comprehensive dataset that aims to underscore the importance of addressing biases in pulse oximetry accuracy, which disproportionately affect darker-skinned patients. The dataset was created by harmonizing three Electronic Health Record databases (MIMIC-III, MIMIC-IV, eICU-CRD) comprising Intensive Care Unit stays of US patients. Paired SpO2 and SaO2 measurements were time-aligned and combined with various other sociodemographic and parameters to provide a detailed representation of each patient. BOLD includes 49,099 paired measurements, within a 5-minute window and with oxygen saturation levels between 70-100%. Minority racial and ethnic groups account for ∼25% of the data - a proportion seldom achieved in previous studies. The codebase is publicly available. Given the prevalent use of pulse oximeters in the hospital and at home, we hope that BOLD will be leveraged to develop debiasing algorithms that can result in more equitable healthcare solutions.
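The time alignment of pulse oximetry and arterial blood gas readings described in the abstract above can be sketched as follows. This is a hypothetical illustration, not the BOLD codebase: the function name, the data layout, the nearest-reading matching rule, the symmetric window, and the choice to apply the 70-100% filter to both values are all assumptions, since the abstract states only a 5-minute window and a 70-100% saturation range.

```python
from bisect import bisect_left


def pair_measurements(spo2, sao2, window_s=300, lo=70.0, hi=100.0):
    """Pair each arterial reading with the nearest pulse oximeter reading.

    spo2, sao2: lists of (timestamp_in_seconds, saturation_percent) tuples.
    Returns a list of (t_sao2, sao2_value, t_spo2, spo2_value).
    ASSUMPTIONS: nearest-in-time matching, symmetric window, and the
    saturation range filter applied to both measurements.
    """
    spo2 = sorted(spo2)
    times = [t for t, _ in spo2]
    pairs = []
    for t_abg, sa in sorted(sao2):
        i = bisect_left(times, t_abg)
        # Only the readings just before and just after can be the nearest.
        candidates = spo2[max(i - 1, 0):i + 1]
        if not candidates:
            continue
        t_sp, sp = min(candidates, key=lambda r: abs(r[0] - t_abg))
        in_window = abs(t_sp - t_abg) <= window_s
        in_range = lo <= sa <= hi and lo <= sp <= hi
        if in_window and in_range:
            pairs.append((t_abg, sa, t_sp, sp))
    return pairs


# Hypothetical readings for one ICU stay (seconds since admission, percent).
spo2_readings = [(0, 97.0), (600, 95.0), (1000, 60.0)]
sao2_readings = [(120, 96.0), (900, 94.0), (5000, 90.0)]
paired = pair_measurements(spo2_readings, sao2_readings)
```

In this toy input only the first arterial reading survives: the second is closest to an out-of-range oximeter value, and the third has no oximeter reading within five minutes. A stricter or looser matching rule would change how many pairs are retained.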
|
40
|
Liu X, Hu P, Yeung W, Zhang Z, Ho V, Liu C, Dumontier C, Thoral PJ, Mao Z, Cao D, Mark RG, Zhang Z, Feng M, Li D, Celi LA. Illness severity assessment of older adults in critical illness using machine learning (ELDER-ICU): an international multicentre study with subgroup bias evaluation. Lancet Digit Health 2023; 5:e657-e667. [PMID: 37599147 DOI: 10.1016/s2589-7500(23)00128-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/29/2022] [Revised: 05/31/2023] [Accepted: 06/22/2023] [Indexed: 08/22/2023]
Abstract
BACKGROUND Comorbidity, frailty, and decreased cognitive function lead to a higher risk of death in elderly patients (more than 65 years of age) during acute medical events. Early and accurate illness severity assessment can support appropriate decision making for clinicians caring for these patients. We aimed to develop ELDER-ICU, a machine learning model to assess the illness severity of older adults admitted to the intensive care unit (ICU) with cohort-specific calibration and evaluation for potential model bias. METHODS In this retrospective, international multicentre study, the ELDER-ICU model was developed using data from 14 US hospitals, and validated in 171 hospitals from the USA and Netherlands. Data were extracted from the Medical Information Mart for Intensive Care database, electronic ICU Collaborative Research Database, and Amsterdam University Medical Centers Database. We used six categories of data as predictors, including demographics and comorbidities, physical frailty, laboratory tests, vital signs, treatments, and urine output. Patient data from the first day of ICU stay were used to predict in-hospital mortality. We used the eXtreme Gradient Boosting algorithm (XGBoost) to develop models and the SHapley Additive exPlanations method to explain model prediction. The trained model was calibrated before internal, external, and temporal validation. The final XGBoost model was compared against three other machine learning algorithms and five clinical scores. We performed subgroup analysis based on age, sex, and race. We assessed the discrimination and calibration of models using the area under receiver operating characteristic (AUROC) and standardised mortality ratio (SMR) with 95% CIs. 
FINDINGS Using the development dataset (n=50 366) and predictive model building process, the XGBoost algorithm performed the best in all types of validations compared with other machine learning algorithms and clinical scores (internal validation with 5037 patients from 14 US hospitals, AUROC=0·866 [95% CI 0·851-0·880]; external validation in the US population with 20 541 patients from 169 hospitals, AUROC=0·838 [0·829-0·847]; external validation in European population with 2411 patients from one hospital, AUROC=0·833 [0·812-0·853]; temporal validation with 4311 patients from one hospital, AUROC=0·884 [0·869-0·897]). In the external validation set (US population), the median AUROCs of bias evaluations covering eight subgroups were above 0·81, and the overall SMR was 0·99 (0·96-1·03). The top ten risk predictors were the minimum Glasgow Coma Scale score, total urine output, average respiratory rate, mechanical ventilation use, best state of activity, Charlson Comorbidity Index score, geriatric nutritional risk index, code status, age, and maximum blood urea nitrogen. A simplified model containing only the top 20 features (ELDER-ICU-20) had similar predictive performance to the full model. INTERPRETATION The ELDER-ICU model reliably predicts the risk of in-hospital mortality using routinely collected clinical features. The predictions could inform clinicians about patients who are at elevated risk of deterioration. Prospective validation of this model in clinical practice and a process for continuous performance monitoring and model recalibration are needed. FUNDING National Institutes of Health, National Natural Science Foundation of China, National Special Health Science Program, Health Science and Technology Plan of Zhejiang Province, Fundamental Research Funds for the Central Universities, Drug Clinical Evaluate Research of Chinese Pharmaceutical Association, and National Key R&D Program of China.
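The two evaluation metrics named in the abstract above, AUROC for discrimination and the standardised mortality ratio (SMR) for calibration, can be computed as in the minimal sketch below. The function names and toy data are hypothetical, the confidence intervals reported in the abstract are not reproduced here, and the pairwise AUROC is chosen for readability rather than speed.

```python
def auroc(y_true, y_score):
    """Probability that a random death is scored above a random survivor.

    Pairwise (Mann-Whitney) formulation; ties count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def smr(y_true, y_prob):
    """Observed deaths divided by expected deaths (sum of predicted risks)."""
    expected = sum(y_prob)
    if expected == 0:
        raise ValueError("expected deaths must be positive")
    return sum(y_true) / expected


# Hypothetical toy cohort: 1 = died in hospital, 0 = survived.
outcomes = [0, 0, 1, 1]
predicted_risk = [0.1, 0.4, 0.35, 0.8]
discrimination = auroc(outcomes, predicted_risk)
calibration = smr(outcomes, predicted_risk)
```

An SMR near 1, such as the 0·99 reported for the US external validation set, means the model's predicted risks add up to roughly the number of deaths actually observed; values above 1 indicate under-prediction and values below 1 over-prediction.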
Affiliation(s)
- Xiaoli Liu
- School of Biological Science and Medical Engineering, Beihang University, Beijing, China; Center for Artificial Intelligence in Medicine, Chinese PLA General Hospital, Beijing, China; Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Pan Hu
- Department of Anesthesiology, The 920 Hospital of Joint Logistic Support Force of Chinese PLA, Kunming Yunnan, China; Department of Critical Care Medicine, The First Medical Center of PLA General Hospital, Beijing, China
| | - Wesley Yeung
- Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA; Department of Cardiology, National University Heart Centre, Singapore
| | - Zhongheng Zhang
- Department of Emergency Medicine, Key Laboratory of Precision Medicine in Diagnosis and Monitoring Research of Zhejiang Province, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Vanda Ho
- Division of Geriatric Medicine, Department of Medicine, National University Hospital, Singapore
| | - Chao Liu
- Department of Critical Care Medicine, The First Medical Center of PLA General Hospital, Beijing, China
| | - Clark Dumontier
- New England Geriatric Research Education and Clinical Center, VA Boston Healthcare System, Boston, MA, USA; Division of Aging, Brigham and Women's Hospital, Boston, MA, USA
| | - Patrick J Thoral
- Center for Critical Care Computational Intelligence, Department of Intensive Care Medicine, Amsterdam UMC, Vrije Universiteit, Amsterdam, Netherlands
| | - Zhi Mao
- Department of Critical Care Medicine, The First Medical Center of PLA General Hospital, Beijing, China
| | - Desen Cao
- Department of Biomedical Engineering, Chinese PLA General Hospital, Beijing, China
| | - Roger G Mark
- Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Zhengbo Zhang
- Center for Artificial Intelligence in Medicine, Chinese PLA General Hospital, Beijing, China
| | - Mengling Feng
- Saw Swee Hock School of Public Health and the Institute of Data Science, National University of Singapore, Singapore
| | - Deyu Li
- School of Biological Science and Medical Engineering, Beihang University, Beijing, China; National Key Lab for Virtual Reality Technology and Systems, Beihang University, Beijing, China.
| | - Leo Anthony Celi
- Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA; Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA; Department of Biostatistics, Harvard T H Chan School of Public Health, Boston, MA, USA
| |
|
41
|
Charpignon ML, Byers J, Cabral S, Celi LA, Fernandes C, Gallifant J, Lough ME, Mlombwa D, Moukheiber L, Ong BA, Panitchote A, William W, Wong AKI, Nazer L. Critical Bias in Critical Care Devices. Crit Care Clin 2023; 39:795-813. [PMID: 37704341 DOI: 10.1016/j.ccc.2023.02.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
Abstract
Critical care data contain information about the most physiologically fragile patients in the hospital, who require a significant level of monitoring. However, medical devices used for patient monitoring suffer from measurement biases that have been largely underreported. This article explores sources of bias in commonly used clinical devices, including pulse oximeters, thermometers, and sphygmomanometers. Further, it provides a framework for mitigating these biases and key principles to achieve more equitable health care delivery.
Affiliation(s)
- Marie-Laure Charpignon
- Institute for Data, Systems, and Society (IDSS), E18-407A, 50 Ames Street, Cambridge, MA 02142, USA.
| | - Joseph Byers
- Respiratory Therapy, Beth Israel Deaconess Medical Center, 330 Brookline Avenue, Boston, MA 02215, USA
| | - Stephanie Cabral
- Department of Medicine, Beth Israel Deaconess Medical Center, 330 Brookline Avenue, Boston, MA 02215, USA
| | - Leo Anthony Celi
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA; Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Chrystinne Fernandes
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
| | - Jack Gallifant
- Imperial College London NHS Trust, St Thomas' Hospital, Westminster Bridge Road, London SE1 7EH, UK
| | - Mary E Lough
- Stanford Health Care, Stanford University, 300 Pasteur Drive, Stanford, CA 94305, USA
| | - Donald Mlombwa
- Zomba Central Hospital, 8th Avenue, Zomba, Malawi; Kamuzu College of Health Sciences, Blantyre, Malawi; St. Luke's College of Health Sciences, Chilema-Zomba, Malawi
| | - Lama Moukheiber
- Institute for Medical Engineering and Science, Massachusetts Institute of Technology, 77 Massachusetts Avenue, E25-330, Cambridge, MA 02139, USA
| | - Bradley Ashley Ong
- College of Medicine, University of the Philippines Manila, Calderon hall, UP College of Medicine, 547 Pedro Gil Street, Ermita Manila, Philippines
| | - Anupol Panitchote
- Faculty of Medicine, Khon Kaen University, 123 Mittraparp Highway, Muang District, Khon Kaen 40002, Thailand
| | - Wasswa William
- Mbarara University of Science and Technology, P.O. Box 1410, Mbarara, Uganda
| | - An-Kwok Ian Wong
- Duke University Medical Center, 2424 Erwin Road, Suite 1102, Hock Plaza Box 2721, Durham, NC 27710, USA
| | - Lama Nazer
- King Hussein Cancer Center, Queen Rania Street 202, Amman, Jordan
| |
|
42
|
Wanis KN, Madenci AL, Hao S, Moukheiber M, Moukheiber L, Moukheiber D, Moukheiber S, Young JG, Celi LA. Emulating Target Trials Comparing Early and Delayed Intubation Strategies. Chest 2023; 164:885-891. [PMID: 37150505 PMCID: PMC10567927 DOI: 10.1016/j.chest.2023.04.048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2022] [Revised: 04/15/2023] [Accepted: 04/30/2023] [Indexed: 05/09/2023] Open
Abstract
BACKGROUND Whether intubation should be initiated early in the clinical course of critically ill patients remains a matter of debate. Results from prior observational studies are difficult to interpret because of avoidable flaws including immortal time bias, inappropriate eligibility criteria, and unrealistic treatment strategies. RESEARCH QUESTION Do treatment strategies that intubate patients early in the critical care admission improve 30-day survival compared with strategies that delay intubation? STUDY DESIGN AND METHODS We estimated the effect of strategies that require early intubation of critically ill patients compared with those that delay intubation. With data extracted from the Medical Information Mart for Intensive Care-IV database, we emulated three target trials, varying the flexibility of the treatment strategies and the baseline eligibility criteria. RESULTS Under unrealistically strict treatment strategies with broad eligibility criteria, the 30-day mortality risk was 7.1 percentage points higher for intubating early compared with delaying intubation (95% CI, 6.2-7.9). Risk differences were 0.4 (95% CI, -0.1 to 0.9) and -0.9 (95% CI, -2.5 to 0.7) percentage points in subsequent target trial emulations that included more realistic treatment strategies and eligibility criteria. INTERPRETATION When realistic treatment strategies and eligibility criteria are used, strategies that delay intubation result in similar 30-day mortality risks compared with those that intubate early. Delaying intubation ultimately avoids intubation in most patients.
Affiliation(s)
- Kerollos Nashat Wanis
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA; Division of General Surgery, Department of Surgery, Western University, London, ON, Canada.
| | - Arin L Madenci
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA; Department of Surgery, Boston Children's Hospital, Harvard Medical School, Boston, MA
| | - Sicheng Hao
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA
| | - Mira Moukheiber
- The Picower Institute for Learning and Memory, Massachusetts Institute of Technology, Cambridge, MA
| | - Lama Moukheiber
- Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA
| | - Dana Moukheiber
- Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA
| | - Sulaiman Moukheiber
- Department of Computer Science, Worcester Polytechnic Institute, Worcester, MA
| | - Jessica G Young
- Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA
| | - Leo Anthony Celi
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA; Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA; Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA
| |
|
43
|
Restrepo D, Quion J, Vásquez-Venegas C, Villanueva C, Anthony Celi L, Nakayama LF. A scoping review of the landscape of health-related open datasets in Latin America. PLOS Digit Health 2023; 2:e0000368. [PMID: 37878549 PMCID: PMC10599518 DOI: 10.1371/journal.pdig.0000368] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Accepted: 09/16/2023] [Indexed: 10/27/2023]
Abstract
Artificial intelligence (AI) algorithms have the potential to revolutionize healthcare, but their successful translation into clinical practice has been limited. One crucial factor is the data used to train these algorithms, which must be representative of the population. However, most healthcare databases are derived from high-income countries, leading to non-representative models and potentially exacerbating health inequities. This review focuses on the landscape of health-related open datasets in Latin America, aiming to identify existing datasets, examine data-sharing frameworks, techniques, platforms, and formats, and identify best practices in Latin America. The review found 61 datasets from 23 countries, with the DATASUS dataset from Brazil contributing to the majority of articles. The analysis revealed a dearth of datasets created by the authors themselves, indicating a reliance on existing open datasets. The findings underscore the importance of promoting open data in Latin America. We provide recommendations for enhancing data sharing in the region.
Affiliation(s)
- David Restrepo
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Telematics Department, University of Cauca, Popayán, Cauca, Colombia
| | - Justin Quion
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
| | - Constanza Vásquez-Venegas
- Scientific Image Analysis Lab, Integrative Biology Program, Biomedical Sciences Institute (ICBM), Faculty of Medicine, Universidad de Chile, Santiago, Chile
| | - Cleva Villanueva
- Instituto Politécnico Nacional, Escuela Superior de Medicina, Ciudad de Mexico, Mexico
| | - Leo Anthony Celi
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
| | - Luis Filipe Nakayama
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Department of Ophthalmology, São Paulo Federal University, São Paulo, São Paulo, Brazil
| |
|
44
|
Gichoya JW, Thomas K, Celi LA, Safdar N, Banerjee I, Banja JD, Seyyed-Kalantari L, Trivedi H, Purkayastha S. AI pitfalls and what not to do: mitigating bias in AI. Br J Radiol 2023; 96:20230023. [PMID: 37698583 PMCID: PMC10546443 DOI: 10.1259/bjr.20230023] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 01/08/2023] [Revised: 08/10/2023] [Accepted: 08/14/2023] [Indexed: 09/13/2023] Open
Abstract
Various forms of artificial intelligence (AI) applications are being deployed and used in many healthcare systems. As the use of these applications increases, we are learning the failures of these models and how they can perpetuate bias. With these new lessons, we need to prioritize bias evaluation and mitigation for radiology applications; all the while not ignoring the impact of changes in the larger enterprise AI deployment which may have downstream impact on performance of AI models. In this paper, we provide an updated review of known pitfalls causing AI bias and discuss strategies for mitigating these biases within the context of AI deployment in the larger healthcare enterprise. We describe these pitfalls by framing them in the larger AI lifecycle from problem definition, data set selection and curation, model training and deployment emphasizing that bias exists across a spectrum and is a sequela of a combination of both human and machine factors.
Affiliation(s)
| | - Kaesha Thomas
- Department of Radiology, Emory University, Atlanta, United States
| | - Nabile Safdar
- Department of Radiology, Emory University, Atlanta, United States
| | - Imon Banerjee
- School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, United States
| | - John D Banja
- Emory University Center for Ethics, Emory University, Atlanta, United States
| | - Laleh Seyyed-Kalantari
- Department of Electrical Engineering and Computer Science, Lassonde School of Engineering, York University, North York, United States
| | - Hari Trivedi
- Department of Radiology, Emory University, Atlanta, United States
| | - Saptarshi Purkayastha
- School of Informatics and Computing, Indiana University Purdue University, Indianapolis, United States
|
45
|
Holmes Fee C, Hicklen RS, Jean S, Abu Hussein N, Moukheiber L, de Lota MF, Moukheiber M, Moukheiber D, Anthony Celi L, Dankwa-Mullan I. Strategies and solutions to address Digital Determinants of Health (DDOH) across underinvested communities. PLOS Digit Health 2023; 2:e0000314. [PMID: 37824481 PMCID: PMC10569606 DOI: 10.1371/journal.pdig.0000314] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2023]
Abstract
Healthcare has long struggled to improve services through technology without further widening health disparities. With the significant expansion of digital health, a group of healthcare professionals and scholars from across the globe are proposing the official usage of the term "Digital Determinants of Health" (DDOH) to explicitly call out the relationship between technology, healthcare, and equity. This is the final paper in a series published in PLOS Digital Health that seeks to understand and summarize current knowledge of the strategies and solutions that help to mitigate the negative effects of DDOH for underinvested communities. Through a search of English-language Medline, Scopus, and Google Scholar articles published since 2010, 345 articles were identified that discussed the application of digital health technology among underinvested communities. A group of 8 reviewers assessed 132 articles selected at random for the mention of solutions that minimize differences in DDOH. Solutions were then organized by categories of policy; design and development; implementation and adoption; and evaluation and ongoing monitoring. The data were then assessed by category and the findings summarized. The reviewers also looked for common themes across the solutions and evidence of effectiveness. From this limited scoping review, the authors found numerous solutions mentioned across the papers for addressing DDOH and many common themes emerged regardless of the specific community or digital health technology under review. There was notably less information on solutions regarding ongoing evaluation and monitoring which corresponded with a lack of research evidence regarding effectiveness. The findings directionally suggest that universal strategies and solutions can be developed to address DDOH independent of the specific community under focus. With the need for the further development of DDOH measures, we also provide a framework for DDOH assessment.
Affiliation(s)
- Casey Holmes Fee
  - Healthcare Consultant, Newton, Massachusetts, United States of America
- Rachel Scarlett Hicklen
  - Research Medical Library, MD Anderson Cancer Center, Houston, Texas, United States of America
- Sidney Jean
  - Massachusetts Executive Office of Health and Human Services, Boston, Massachusetts, United States of America
  - Simmons University, Boston, Massachusetts, United States of America
- Nebal Abu Hussein
  - Pulmonary, Critical Care and Sleep Medicine, Yale School of Medicine, New Haven, Connecticut, United States of America
  - Department for BioMedical Research DBMR, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
- Lama Moukheiber
  - Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Mira Moukheiber
  - Picower Institute for Learning and Memory, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Dana Moukheiber
  - Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Leo Anthony Celi
  - Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
  - Division of Pulmonary, Critical Care and Pain Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, United States of America
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
- Irene Dankwa-Mullan
  - Marti Health, Atlanta, Georgia, United States of America
  - Department of Health Policy and Management, Milken Institute School of Public Health, The George Washington University, Washington, DC, United States of America
|
46
|
Sheikhalishahi S, Bhattacharyya A, Celi LA, Osmani V. An interpretable deep learning model for time-series electronic health records: Case study of delirium prediction in critical care. Artif Intell Med 2023; 144:102659. [PMID: 37783541 DOI: 10.1016/j.artmed.2023.102659] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/24/2022] [Revised: 07/19/2023] [Accepted: 09/04/2023] [Indexed: 10/04/2023]
Abstract
Deep Learning (DL) models have received increasing attention in the clinical setting, particularly in intensive care units (ICU). In this context, the interpretability of the outcomes estimated by the DL models is an essential step towards increasing adoption of DL models in clinical practice. To address this challenge, we propose an ante-hoc, interpretable neural network model. Our proposed model, named double self-attention architecture (DSA), uses two attention-based mechanisms: self-attention and effective attention. It can capture the importance of input variables in general, as well as changes in importance along the time dimension for the outcome of interest. We evaluated our model using two real-world clinical datasets covering 22840 patients in predicting onset of delirium 12 h and 48 h in advance. Additionally, we compare the descriptive performance of our model with three post-hoc interpretable algorithms as well as with the opinion of clinicians based on the published literature and clinical experience. We find that our model covers the majority of the top-10 variables ranked by the three post-hoc interpretable algorithms as well as by clinical opinion, with the advantage of taking into account both the dependencies among variables and the dependencies between time-steps. Finally, our results show that our model can improve descriptive performance without sacrificing predictive performance.
Affiliation(s)
- Leo Anthony Celi
  - Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA; Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Venet Osmani
  - Fondazione Bruno Kessler Research Institute, Trento, Italy; Information School, University of Sheffield, UK
|
47
|
Yeung AWK, Torkamani A, Butte AJ, Glicksberg BS, Schuller B, Rodriguez B, Ting DSW, Bates D, Schaden E, Peng H, Willschke H, van der Laak J, Car J, Rahimi K, Celi LA, Banach M, Kletecka-Pulker M, Kimberger O, Eils R, Islam SMS, Wong ST, Wong TY, Gao W, Brunak S, Atanasov AG. The promise of digital healthcare technologies. Front Public Health 2023; 11:1196596. [PMID: 37822534 PMCID: PMC10562722 DOI: 10.3389/fpubh.2023.1196596] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/01/2023] [Accepted: 09/04/2023] [Indexed: 10/13/2023] Open
Abstract
Digital health technologies have been in use for many years in a wide spectrum of healthcare scenarios. This narrative review outlines the current use and the future strategies and significance of digital health technologies in modern healthcare applications. It covers the current state of the scientific field (delineating major strengths, limitations, and applications) and envisions the future impact of relevant emerging key technologies. Furthermore, we attempt to provide recommendations for innovative approaches that would accelerate and benefit the research, translation and utilization of digital health technologies.
Affiliation(s)
- Andy Wai Kan Yeung
  - Oral and Maxillofacial Radiology, Applied Oral Sciences and Community Dental Care, Faculty of Dentistry, University of Hong Kong, Hong Kong, China
  - Ludwig Boltzmann Institute Digital Health and Patient Safety, Medical University of Vienna, Vienna, Austria
- Ali Torkamani
  - Department of Integrative Structural and Computational Biology, Scripps Research Translational Institute, La Jolla, CA, United States
- Atul J. Butte
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, United States
  - Department of Pediatrics, University of California, San Francisco, San Francisco, CA, United States
- Benjamin S. Glicksberg
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
  - Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Björn Schuller
  - Department of Computing, Imperial College London, London, United Kingdom
  - Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, Augsburg, Germany
- Blanca Rodriguez
  - Department of Computer Science, University of Oxford, Oxford, United Kingdom
- Daniel S. W. Ting
  - Singapore National Eye Center, Singapore Eye Research Institute, Singapore, Singapore
  - Duke-NUS Medical School, National University of Singapore, Singapore, Singapore
- David Bates
  - Department of General Internal Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States
- Eva Schaden
  - Ludwig Boltzmann Institute Digital Health and Patient Safety, Medical University of Vienna, Vienna, Austria
  - Department of Anaesthesia, Intensive Care Medicine and Pain Medicine, Medical University of Vienna, Vienna, Austria
- Hanchuan Peng
  - Institute for Brain and Intelligence, Southeast University, Nanjing, China
- Harald Willschke
  - Ludwig Boltzmann Institute Digital Health and Patient Safety, Medical University of Vienna, Vienna, Austria
  - Department of Anaesthesia, Intensive Care Medicine and Pain Medicine, Medical University of Vienna, Vienna, Austria
- Jeroen van der Laak
  - Department of Pathology, Radboud University Medical Center, Nijmegen, Netherlands
- Josip Car
  - Primary Care and Public Health, School of Public Health, Imperial College London, London, United Kingdom
  - Centre for Population Health Sciences, LKC Medicine, Nanyang Technological University, Singapore, Singapore
- Kazem Rahimi
  - Deep Medicine Nuffield Department of Women’s and Reproductive Health, University of Oxford, Oxford, United Kingdom
- Leo Anthony Celi
  - Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, United States
  - Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, United States
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, United States
- Maciej Banach
  - Department of Preventive Cardiology and Lipidology, Medical University of Lodz (MUL), Lodz, Poland
  - Department of Cardiology and Adult Congenital Heart Diseases, Polish Mother’s Memorial Hospital Research Institute (PMMHRI), Lodz, Poland
- Maria Kletecka-Pulker
  - Ludwig Boltzmann Institute Digital Health and Patient Safety, Medical University of Vienna, Vienna, Austria
  - Institute for Ethics and Law in Medicine, University of Vienna, Vienna, Austria
- Oliver Kimberger
  - Ludwig Boltzmann Institute Digital Health and Patient Safety, Medical University of Vienna, Vienna, Austria
  - Department of Anaesthesia, Intensive Care Medicine and Pain Medicine, Medical University of Vienna, Vienna, Austria
- Roland Eils
  - Digital Health Center, Berlin Institute of Health (BIH), Charité – Universitätsmedizin Berlin, Berlin, Germany
- Stephen T. Wong
  - Department of Systems Medicine and Bioengineering, Houston Methodist Cancer Center, T. T. and W. F. Chao Center for BRAIN, Houston Methodist Academic Institute, Houston Methodist Hospital, Houston, TX, United States
  - Departments of Radiology, Pathology and Laboratory Medicine and Brain and Mind Research Institute, Weill Cornell Medicine, New York, NY, United States
- Tien Yin Wong
  - Singapore National Eye Center, Singapore Eye Research Institute, Singapore, Singapore
  - Duke-NUS Medical School, National University of Singapore, Singapore, Singapore
  - Tsinghua Medicine, Tsinghua University, Beijing, China
- Wei Gao
  - Andrew and Peggy Cherng Department of Medical Engineering, California Institute of Technology, Pasadena, CA, United States
- Søren Brunak
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Atanas G. Atanasov
  - Ludwig Boltzmann Institute Digital Health and Patient Safety, Medical University of Vienna, Vienna, Austria
  - Institute of Genetics and Animal Biotechnology of the Polish Academy of Sciences, Jastrzebiec, Poland
|
48
|
Nakayama LF, Zago Ribeiro L, Novaes F, Miyawaki IA, Miyawaki AE, de Oliveira JAE, Oliveira T, Malerbi FK, Regatieri CVS, Celi LA, Silva PS. Artificial intelligence for telemedicine diabetic retinopathy screening: a review. Ann Med 2023; 55:2258149. [PMID: 37734417 PMCID: PMC10515659 DOI: 10.1080/07853890.2023.2258149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2023] [Accepted: 08/31/2023] [Indexed: 09/23/2023] Open
Abstract
PURPOSE This study aims to compare artificial intelligence (AI) systems applied in diabetic retinopathy (DR) teleophthalmology screening, currently deployed systems, fairness initiatives and the challenges for implementation. METHODS The review included articles retrieved from a PubMed/Medline/EMBASE literature search strategy regarding telemedicine, DR and AI. The screening criteria included articles on human subjects in English, Portuguese or Spanish related to telemedicine and AI for DR screening. The authors' affiliations and the study populations' income groups were classified according to the World Bank Country and Lending Groups. RESULTS The literature search yielded a total of 132 articles, and nine were included after full-text assessment. The selected articles were published between 2004 and 2020 and were grouped as telemedicine systems, algorithms, economic analysis and image quality assessment. Four telemedicine systems that perform quality assessment, image preprocessing and pathological screening were reviewed. Data and post-deployment bias assessments are not performed for any of the algorithms, and none of the studies evaluates the social impact of implementation. There is a lack of representativeness in the reviewed articles, with most authors and target populations from high-income countries and no low-income country representation. CONCLUSIONS Telemedicine and AI hold great promise for augmenting decision-making in medical care, expanding patient access and enhancing cost-effectiveness. Economic studies and social science analysis are crucial to support the implementation of AI in teleophthalmology screening programs. Promoting fairness and generalizability in automated systems combined with telemedicine screening programs is not straightforward. Improving data representativeness, reducing biases and promoting equity in deployment and post-deployment studies are all critical steps in model development.
Affiliation(s)
- Luis Filipe Nakayama
  - Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Department of Ophthalmology, São Paulo Federal University, Sao Paulo, Brazil
- Lucas Zago Ribeiro
  - Department of Ophthalmology, São Paulo Federal University, Sao Paulo, Brazil
- Frederico Novaes
  - Department of Ophthalmology, São Paulo Federal University, Sao Paulo, Brazil
- Talita Oliveira
  - Department of Ophthalmology, São Paulo Federal University, Sao Paulo, Brazil
- Leo Anthony Celi
  - Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Department of Biostatistics, Harvard TH Chan School of Public Health, Boston, MA, USA
  - Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Paolo S. Silva
  - Beetham Eye Institute, Joslin Diabetes Centre, Harvard Medical School, Boston, MA, USA
  - Philippine Eye Research Institute, University of the Philippines, Manila, Philippines
|
49
|
Liu M, Ning Y, Teixayavong S, Mertens M, Xu J, Ting DSW, Cheng LTE, Ong JCL, Teo ZL, Tan TF, RaviChandran N, Wang F, Celi LA, Ong MEH, Liu N. A translational perspective towards clinical AI fairness. NPJ Digit Med 2023; 6:172. [PMID: 37709945 PMCID: PMC10502051 DOI: 10.1038/s41746-023-00918-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/16/2023] [Accepted: 09/04/2023] [Indexed: 09/16/2023] Open
Abstract
Artificial intelligence (AI) has demonstrated the ability to extract insights from data, but the fairness of such data-driven insights remains a concern in high-stakes fields. Despite extensive developments, issues of AI fairness in clinical contexts have not been adequately addressed. A fair model is normally expected to perform equally across subgroups defined by sensitive variables (e.g., age, gender/sex, race/ethnicity, socio-economic status, etc.). Various fairness measurements have been developed to detect differences between subgroups as evidence of bias, and bias mitigation methods are designed to reduce the differences detected. This perspective of fairness, however, is misaligned with some key considerations in clinical contexts. The set of sensitive variables used in healthcare applications must be carefully examined for relevance and justified by clear clinical motivations. In addition, clinical AI fairness should closely investigate the ethical implications of fairness measurements (e.g., potential conflicts between group- and individual-level fairness) to select suitable and objective metrics. Generally defining AI fairness as "equality" is not necessarily reasonable in clinical settings, as differences may have clinical justifications and do not indicate biases. Instead, "equity" would be an appropriate objective of clinical AI fairness. Moreover, clinical feedback is essential to developing fair and well-performing AI models, and efforts should be made to actively involve clinicians in the process. The adaptation of AI fairness towards healthcare is not self-evident due to misalignments between technical developments and clinical considerations. Multidisciplinary collaboration between AI researchers, clinicians, and ethicists is necessary to bridge the gap and translate AI fairness into real-life benefits.
Affiliation(s)
- Mingxuan Liu
  - Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore, Singapore
- Yilin Ning
  - Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore, Singapore
- Mayli Mertens
  - Centre for Ethics, Department of Philosophy, University of Antwerp, Antwerp, Belgium
  - Antwerp Center on Responsible AI, University of Antwerp, Antwerp, Belgium
- Jie Xu
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, FL, USA
- Daniel Shu Wei Ting
  - Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore, Singapore
  - Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore
  - SingHealth AI Office, Singapore Health Services, Singapore, Singapore
- Lionel Tim-Ee Cheng
  - Department of Diagnostic Radiology, Singapore General Hospital, Singapore, Singapore
- Zhen Ling Teo
  - Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore
- Ting Fang Tan
  - Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore
- Fei Wang
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
- Leo Anthony Celi
  - Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Marcus Eng Hock Ong
  - Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore
  - Department of Emergency Medicine, Singapore General Hospital, Singapore, Singapore
- Nan Liu
  - Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore, Singapore
  - SingHealth AI Office, Singapore Health Services, Singapore, Singapore
  - Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore
  - Institute of Data Science, National University of Singapore, Singapore, Singapore
|
50
|
Patel TA, Jain B, Eala MAB, Manlongat KD, Vapiwala N, Celi LA, Dee EC. Disparities in Receipt of Mental Health Services and Mental Distress Among Patients with Chronic Obstructive Pulmonary Disease. J Gen Intern Med 2023; 38:2849-2851. [PMID: 37349638 PMCID: PMC10506969 DOI: 10.1007/s11606-023-08273-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2022] [Accepted: 06/09/2023] [Indexed: 06/24/2023]
Affiliation(s)
- Tej A Patel
  - University of Pennsylvania, Philadelphia, PA, USA
- Bhav Jain
  - Massachusetts Institute of Technology, Cambridge, MA, USA
- Michelle Ann B Eala
  - College of Medicine, University of the Philippines, Manila, Philippines
  - Department of Radiation Oncology, University of California Los Angeles, Los Angeles, CA, USA
- Neha Vapiwala
  - Department of Radiation Oncology, University of Pennsylvania, Philadelphia, PA, USA
- Leo Anthony Celi
  - Harvard Medical School, Boston, MA, USA
  - Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Edward Christopher Dee
  - Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
|