51
Singhal K, Azizi S, Tu T, Mahdavi SS, Wei J, Chung HW, Scales N, Tanwani A, Cole-Lewis H, Pfohl S, Payne P, Seneviratne M, Gamble P, Kelly C, Babiker A, Schärli N, Chowdhery A, Mansfield P, Demner-Fushman D, Agüera Y Arcas B, Webster D, Corrado GS, Matias Y, Chou K, Gottweis J, Tomasev N, Liu Y, Rajkomar A, Barral J, Semturs C, Karthikesalingam A, Natarajan V. Large language models encode clinical knowledge. Nature 2023; 620:172-180. [PMID: 37438534 PMCID: PMC10396962 DOI: 10.1038/s41586-023-06291-2] [Citation(s) in RCA: 251] [Impact Index Per Article: 251.0] [Received: 01/25/2023] [Accepted: 06/05/2023] [Indexed: 07/14/2023]
Abstract
Large language models (LLMs) have demonstrated impressive capabilities, but the bar for clinical applications is high. Attempts to assess the clinical knowledge of models typically rely on automated evaluations based on limited benchmarks. Here, to address these limitations, we present MultiMedQA, a benchmark combining six existing medical question answering datasets spanning professional medicine, research and consumer queries and a new dataset of medical questions searched online, HealthSearchQA. We propose a human evaluation framework for model answers along multiple axes including factuality, comprehension, reasoning, possible harm and bias. In addition, we evaluate Pathways Language Model [1] (PaLM, a 540-billion parameter LLM) and its instruction-tuned variant, Flan-PaLM [2] on MultiMedQA. Using a combination of prompting strategies, Flan-PaLM achieves state-of-the-art accuracy on every MultiMedQA multiple-choice dataset (MedQA [3], MedMCQA [4], PubMedQA [5] and Measuring Massive Multitask Language Understanding (MMLU) clinical topics [6]), including 67.6% accuracy on MedQA (US Medical Licensing Exam-style questions), surpassing the prior state of the art by more than 17%. However, human evaluation reveals key gaps. To resolve this, we introduce instruction prompt tuning, a parameter-efficient approach for aligning LLMs to new domains using a few exemplars. The resulting model, Med-PaLM, performs encouragingly, but remains inferior to clinicians. We show that comprehension, knowledge recall and reasoning improve with model scale and instruction prompt tuning, suggesting the potential utility of LLMs in medicine. Our human evaluations reveal limitations of today's models, reinforcing the importance of both evaluation frameworks and method development in creating safe, helpful LLMs for clinical applications.
Affiliation(s)
- Tao Tu
- Google Research, Mountain View, CA, USA
- Jason Wei
- Google Research, Mountain View, CA, USA
- Yun Liu
- Google Research, Mountain View, CA, USA
52
Polevikov S. Advancing AI in healthcare: A comprehensive review of best practices. Clin Chim Acta 2023; 548:117519. [PMID: 37595864 DOI: 10.1016/j.cca.2023.117519] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/02/2023] [Revised: 08/14/2023] [Accepted: 08/15/2023] [Indexed: 08/20/2023]
Abstract
Artificial Intelligence (AI) and Machine Learning (ML) are powerful tools shaping the healthcare sector. This review considers twelve key aspects of AI in clinical practice: 1) Ethical AI; 2) Explainable AI; 3) Health Equity and Bias in AI; 4) Sponsorship Bias; 5) Data Privacy; 6) Genomics and Privacy; 7) Insufficient Sample Size and Self-Serving Bias; 8) Bridging the Gap Between Training Datasets and Real-World Scenarios; 9) Open Source and Collaborative Development; 10) Dataset Bias and Synthetic Data; 11) Measurement Bias; 12) Reproducibility in AI Research. These categories represent both the challenges and opportunities of AI implementation in healthcare. While AI holds significant potential for improving patient care, it also presents risks and challenges, such as ensuring privacy, combating bias, and maintaining transparency and ethics. The review underscores the necessity of developing comprehensive best practices for healthcare organizations and fostering a diverse dialogue involving data scientists, clinicians, patient advocates, ethicists, economists, and policymakers. We are at the precipice of significant transformation in healthcare powered by AI. By continuing to reassess and refine our approach, we can ensure that AI is implemented responsibly and ethically, maximizing its benefit to patient care and public health.
53
Do H, Chang Y, Cho YS, Smyth P, Zhong J. Fair Survival Time Prediction via Mutual Information Minimization. Proceedings of Machine Learning Research 2023; 219:128-149. [PMID: 38707261 PMCID: PMC11067550] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/07/2024]
Abstract
Survival analysis is a general framework for predicting the time until a specific event occurs, often in the presence of censoring. Although this framework is widely used in practice, few studies to date have considered fairness for time-to-event outcomes, despite recent significant advances in the algorithmic fairness literature more broadly. In this paper, we propose a framework to achieve demographic parity in survival analysis models by minimizing the mutual information between predicted time-to-event and sensitive attributes. We show that our approach effectively minimizes mutual information to encourage statistical independence of time-to-event predictions and sensitive attributes. Furthermore, we propose four types of disparity assessment metrics based on common survival analysis metrics. Through experiments on multiple benchmark datasets, we demonstrate that by minimizing the dependence between the prediction and the sensitive attributes, our method can systematically improve the fairness of survival predictions and is robust to censoring.
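The dependence the abstract refers to can be illustrated with a plug-in estimate of mutual information between discretized predicted times and a sensitive attribute. This is only a diagnostic sketch under stated assumptions, not the authors' training procedure: the abstract does not say how mutual information is estimated or minimized, and the function names, bin edge, and toy data below are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats for two equal-length discrete sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def bin_times(times, edges):
    """Map each predicted time to a bin index: the number of edges it strictly exceeds."""
    return [sum(t > e for e in edges) for t in times]


# Hypothetical toy data: predicted times-to-event and a binary sensitive attribute.
edges = [5.0]  # assumed single cut point, for illustration only

# Case 1: predictions are fully determined by group membership.
dependent_times = [2.0, 3.0, 8.0, 9.0, 2.5, 8.5, 3.5, 9.5]
dependent_groups = ["a", "a", "b", "b", "a", "b", "a", "b"]
mi_dependent = mutual_information(bin_times(dependent_times, edges), dependent_groups)

# Case 2: the binned predictions are distributed identically in both groups.
independent_times = [2.0, 2.0, 8.0, 8.0, 3.0, 3.0, 9.0, 9.0]
independent_groups = ["a", "b", "a", "b", "a", "b", "a", "b"]
mi_independent = mutual_information(bin_times(independent_times, edges), independent_groups)

print(mi_dependent, mi_independent)
```

A value near zero indicates that the binned predictions carry little information about the attribute, which is the statistical-independence reading of demographic parity; the estimate depends on the chosen bins and is biased upward for small samples.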
Affiliation(s)
- Hyungrok Do
- Department of Population Health NYU Grossman School of Medicine
- Yuxin Chang
- Department of Computer Science University of California, Irvine
- Yoon Sang Cho
- Department of Population Health NYU Grossman School of Medicine
- Padhraic Smyth
- Department of Computer Science University of California, Irvine
- Judy Zhong
- Department of Population Health NYU Grossman School of Medicine
54
Madrid-García A, Merino-Barbancho B, Rodríguez-González A, Fernández-Gutiérrez B, Rodríguez-Rodríguez L, Menasalvas-Ruiz E. Understanding the role and adoption of artificial intelligence techniques in rheumatology research: An in-depth review of the literature. Semin Arthritis Rheum 2023; 61:152213. [PMID: 37315379 DOI: 10.1016/j.semarthrit.2023.152213] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/21/2022] [Revised: 04/28/2023] [Accepted: 05/02/2023] [Indexed: 06/16/2023]
Abstract
The major and upward trend in the number of published research related to rheumatic and musculoskeletal diseases, in which artificial intelligence plays a key role, has exhibited the interest of rheumatology researchers in using these techniques to answer their research questions. In this review, we analyse the original research articles that combine both worlds in a five-year period (2017-2021). In contrast to other published papers on the same topic, we first studied the review and recommendation articles that were published during that period, including up to October 2022, as well as the publication trends. Secondly, we review the published research articles and classify them into one of the following categories: disease identification and prediction, disease classification, patient stratification and disease subtype identification, disease progression and activity, treatment response, and predictors of outcomes. Thirdly, we provide a table with illustrative studies in which artificial intelligence techniques have played a central role in more than twenty rheumatic and musculoskeletal diseases. Finally, the findings of the research articles, in terms of disease and/or data science techniques employed, are highlighted in a discussion. Therefore, the present review aims to characterise how researchers are applying data science techniques in the rheumatology medical field. The most immediate conclusions that can be drawn from this work are: multiple and novel data science techniques have been used in a wide range of rheumatic and musculoskeletal diseases including rare diseases; the sample size and the data type used are heterogeneous, and new technical approaches are expected to arrive in the short-middle term.
Affiliation(s)
- Alfredo Madrid-García
- Grupo de Patología Musculoesquelética. Hospital Clínico San Carlos, Prof. Martin Lagos s/n, Madrid, 28040, Spain; Escuela Técnica Superior de Ingenieros de Telecomunicación. Universidad Politécnica de Madrid, Avenida Complutense, 30, Madrid, 28040, Spain.
- Beatriz Merino-Barbancho
- Escuela Técnica Superior de Ingenieros de Telecomunicación. Universidad Politécnica de Madrid, Avenida Complutense, 30, Madrid, 28040, Spain
- Benjamín Fernández-Gutiérrez
- Grupo de Patología Musculoesquelética. Hospital Clínico San Carlos, Prof. Martin Lagos s/n, Madrid, 28040, Spain
- Luis Rodríguez-Rodríguez
- Grupo de Patología Musculoesquelética. Hospital Clínico San Carlos, Prof. Martin Lagos s/n, Madrid, 28040, Spain
- Ernestina Menasalvas-Ruiz
- Centro de Tecnología Biomédica. Universidad Politécnica de Madrid, Pozuelo de Alarcón, Madrid, 28223, Spain
55
Watkins SH, Testa C, Chen JT, De Vivo I, Simpkin AJ, Tilling K, Diez Roux AV, Davey Smith G, Waterman PD, Suderman M, Relton C, Krieger N. Epigenetic clocks and research implications of the lack of data on whom they have been developed: a review of reported and missing sociodemographic characteristics. Environmental Epigenetics 2023; 9:dvad005. [PMID: 37564905 PMCID: PMC10411856 DOI: 10.1093/eep/dvad005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2023] [Revised: 06/17/2023] [Accepted: 07/13/2023] [Indexed: 08/12/2023]
Abstract
Epigenetic clocks are increasingly being used as a tool to assess the impact of a wide variety of phenotypes and exposures on healthy ageing, with a recent focus on social determinants of health. However, little attention has been paid to the sociodemographic characteristics of participants on whom these clocks have been based. Participant characteristics are important because sociodemographic and socioeconomic factors are known to be associated with both DNA methylation variation and healthy ageing. It is also well known that machine learning algorithms have the potential to exacerbate health inequities through the use of unrepresentative samples - prediction models may underperform in social groups that were poorly represented in the training data used to construct the model. To address this gap in the literature, we conducted a review of the sociodemographic characteristics of the participants whose data were used to construct 13 commonly used epigenetic clocks. We found that although some of the epigenetic clocks were created utilizing data provided by individuals from different ages, sexes/genders, and racialized groups, sociodemographic characteristics are generally poorly reported. Reported information is limited by inadequate conceptualization of the social dimensions and exposure implications of gender and racialized inequality, and socioeconomic data are infrequently reported. It is important for future work to ensure clear reporting of tangible data on the sociodemographic and socioeconomic characteristics of all the participants in the study to ensure that other researchers can make informed judgements about the appropriateness of the model for their study population.
Affiliation(s)
- Sarah Holmes Watkins
- Population Health Sciences, Bristol Medical School, University of Bristol, Bristol BS8 2BN, UK
- Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol BS8 2BN, UK
- Christian Testa
- Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA 02115, USA
- Jarvis T Chen
- Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA 02115, USA
- Immaculata De Vivo
- Program in Genetic Epidemiology and Statistical Genetics, Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Department of Medicine, Harvard Medical School, Brigham and Women’s Hospital, Boston, MA 02115, USA
- Andrew J Simpkin
- School of Medicine, National University of Ireland Galway, Galway H91 TK33, Ireland
- Kate Tilling
- Population Health Sciences, Bristol Medical School, University of Bristol, Bristol BS8 2BN, UK
- Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol BS8 2BN, UK
- Ana V Diez Roux
- Department of Epidemiology and Biostatistics and Urban Health Collaborative, Dornsife School of Public Health, Drexel University, Philadelphia, PA 19104, USA
- George Davey Smith
- Population Health Sciences, Bristol Medical School, University of Bristol, Bristol BS8 2BN, UK
- Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol BS8 2BN, UK
- Pamela D Waterman
- Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA 02115, USA
- Matthew Suderman
- Population Health Sciences, Bristol Medical School, University of Bristol, Bristol BS8 2BN, UK
- Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol BS8 2BN, UK
- Caroline Relton
- Population Health Sciences, Bristol Medical School, University of Bristol, Bristol BS8 2BN, UK
- Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol BS8 2BN, UK
- Nancy Krieger
- Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA 02115, USA
56
Lehmann B, Mackintosh M, McVean G, Holmes C. Optimal strategies for learning multi-ancestry polygenic scores vary across traits. Nat Commun 2023; 14:4023. [PMID: 37419925 PMCID: PMC10328935 DOI: 10.1038/s41467-023-38930-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2022] [Accepted: 05/22/2023] [Indexed: 07/09/2023] Open
Abstract
Polygenic scores (PGSs) are individual-level measures that aggregate the genome-wide genetic predisposition to a given trait. As PGS have predominantly been developed using European-ancestry samples, trait prediction using such European ancestry-derived PGS is less accurate in non-European ancestry individuals. Although there has been recent progress in combining multiple PGS trained on distinct populations, the problem of how to maximize performance given a multiple-ancestry cohort is largely unexplored. Here, we investigate the effect of sample size and ancestry composition on PGS performance for fifteen traits in UK Biobank. For some traits, PGS estimated using a relatively small African-ancestry training set outperformed, on an African-ancestry test set, PGS estimated using a much larger European-ancestry only training set. We observe similar, but not identical, results when considering other minority-ancestry groups within UK Biobank. Our results emphasise the importance of targeted data collection from underrepresented groups in order to address existing disparities in PGS performance.
Affiliation(s)
- Brieuc Lehmann
- Department of Statistical Science, University College London, London, UK.
- Gil McVean
- Big Data Institute, University of Oxford, Oxford, UK
- Chris Holmes
- The Alan Turing Institute, London, UK
- Big Data Institute, University of Oxford, Oxford, UK
- Department of Statistics, University of Oxford, Oxford, UK
57
Master SR, Badrick TC, Bietenbeck A, Haymond S. Machine Learning in Laboratory Medicine: Recommendations of the IFCC Working Group. Clin Chem 2023; 69:690-698. [PMID: 37252943 PMCID: PMC10320011 DOI: 10.1093/clinchem/hvad055] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 02/21/2023] [Accepted: 04/12/2023] [Indexed: 06/01/2023]
Abstract
BACKGROUND: Machine learning (ML) has been applied to an increasing number of predictive problems in laboratory medicine, and published work to date suggests that it has tremendous potential for clinical applications. However, a number of groups have noted the potential pitfalls associated with this work, particularly if certain details of the development and validation pipelines are not carefully controlled. METHODS: To address these pitfalls and other specific challenges when applying machine learning in a laboratory medicine setting, a working group of the International Federation for Clinical Chemistry and Laboratory Medicine was convened to provide a guidance document for this domain. RESULTS: This manuscript represents consensus recommendations for best practices from that committee, with the goal of improving the quality of developed and published ML models designed for use in clinical laboratories. CONCLUSIONS: The committee believes that implementation of these best practices will improve the quality and reproducibility of machine learning utilized in laboratory medicine. SUMMARY: We have provided our consensus assessment of a number of important practices that are required to ensure that valid, reproducible machine learning (ML) models can be applied to address operational and diagnostic questions in the clinical laboratory. These practices span all phases of model development, from problem formulation through predictive implementation. Although it is not possible to exhaustively discuss every potential pitfall in ML workflows, we believe that our current guidelines capture best practices for avoiding the most common and potentially dangerous errors in this important emerging field.
Affiliation(s)
- Stephen R Master
- Department of Pathology and Laboratory Medicine, Children’s Hospital of Philadelphia, Philadelphia, PA, United States
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Tony C Badrick
- Royal College of Pathologists of Australasia Quality Assurance Programs, Sydney, Australia
- Shannon Haymond
- Ann & Robert H. Lurie Children’s Hospital of Chicago, Chicago, IL, United States
- Department of Pathology, Feinberg School of Medicine, Northwestern University, Chicago, IL, United States
58
Pierson L, Tsai B. Misaligned AI constitutes a growing public health threat. BMJ 2023; 381:p1340. [PMID: 37308217 DOI: 10.1136/bmj.p1340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Affiliation(s)
- Leah Pierson
- Department of Global Health and Population, Harvard TH Chan School of Public Health
- Harvard-MIT doctoral program, Harvard Medical School
59
Do H, Nandi S, Putzel P, Smyth P, Zhong J. A joint fairness model with applications to risk predictions for underrepresented populations. Biometrics 2023; 79:826-840. [PMID: 35142367 PMCID: PMC9363518 DOI: 10.1111/biom.13632] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2021] [Accepted: 01/18/2022] [Indexed: 11/29/2022]
Abstract
In data collection for predictive modeling, underrepresentation of certain groups, based on gender, race/ethnicity, or age, may yield less accurate predictions for these groups. Recently, this issue of fairness in predictions has attracted significant attention, as data-driven models are increasingly utilized to perform crucial decision-making tasks. Existing methods to achieve fairness in the machine learning literature typically build a single prediction model in a manner that encourages fair prediction performance for all groups. These approaches have two major limitations: (i) fairness is often achieved by compromising accuracy for some groups; (ii) the underlying relationship between dependent and independent variables may not be the same across groups. We propose a joint fairness model (JFM) approach for logistic regression models for binary outcomes that estimates group-specific classifiers using a joint modeling objective function that incorporates fairness criteria for prediction. We introduce an accelerated smoothing proximal gradient algorithm to solve the convex objective function, and present the key asymptotic properties of the JFM estimates. Through simulations, we demonstrate the efficacy of the JFM in achieving good prediction performance and across-group parity, in comparison with the single fairness model, group-separate model, and group-ignorant model, especially when the minority group's sample size is small. Finally, we demonstrate the utility of the JFM method in a real-world example to obtain fair risk predictions for underrepresented older patients diagnosed with coronavirus disease 2019 (COVID-19).
Affiliation(s)
- Hyungrok Do
- Department of Population Health, NYU Grossman School of Medicine, New York, NY, 10016, USA
- Shinjini Nandi
- Department of Mathematical Sciences, Montana State University, Bozeman, MT, 59717, USA
- Preston Putzel
- Department of Computer Science, University of California, Irvine, CA, 92697, USA
- Padhraic Smyth
- Department of Computer Science, University of California, Irvine, CA, 92697, USA
- Judy Zhong
- Department of Population Health, NYU Grossman School of Medicine, New York, NY, 10016, USA
60
Chen RJ, Wang JJ, Williamson DFK, Chen TY, Lipkova J, Lu MY, Sahai S, Mahmood F. Algorithmic fairness in artificial intelligence for medicine and healthcare. Nat Biomed Eng 2023; 7:719-742. [PMID: 37380750 PMCID: PMC10632090 DOI: 10.1038/s41551-023-01056-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 10/01/2021] [Accepted: 04/13/2023] [Indexed: 06/30/2023]
Abstract
In healthcare, the development and deployment of insufficiently fair systems of artificial intelligence (AI) can undermine the delivery of equitable care. Assessments of AI models stratified across subpopulations have revealed inequalities in how patients are diagnosed, treated and billed. In this Perspective, we outline fairness in machine learning through the lens of healthcare, and discuss how algorithmic biases (in data acquisition, genetic variation and intra-observer labelling variability, in particular) arise in clinical workflows and the resulting healthcare disparities. We also review emerging technology for mitigating biases via disentanglement, federated learning and model explainability, and their role in the development of AI-based software as a medical device.
Affiliation(s)
- Richard J Chen
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA, USA
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Judy J Wang
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Boston University School of Medicine, Boston, MA, USA
- Drew F K Williamson
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA, USA
- Tiffany Y Chen
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA, USA
- Jana Lipkova
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA, USA
- Ming Y Lu
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA, USA
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Sharifa Sahai
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Faisal Mahmood
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
- Cancer Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA, USA.
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA.
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA.
- Harvard Data Science Initiative, Harvard University, Cambridge, MA, USA.
61
Sun TY, Bhave SA, Altosaar J, Elhadad N. Assessing Phenotype Definitions for Algorithmic Fairness. AMIA ... Annual Symposium Proceedings. AMIA Symposium 2023; 2022:1032-1041. [PMID: 37128361 PMCID: PMC10148336] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/03/2023]
Abstract
Phenotyping is a core, routine activity in observational health research. Cohorts impact downstream analyses, such as how a condition is characterized, how patient risk is defined, and what treatments are studied. It is thus critical to ensure that cohorts are representative of all patients, independently of their demographics or social determinants of health. In this paper, we propose a set of best practices to assess the fairness of phenotype definitions. We leverage established fairness metrics commonly used in predictive models and relate them to commonly used epidemiological metrics. We describe an empirical study for Crohn's disease and diabetes type 2, each with multiple phenotype definitions taken from the literature across gender and race. We show that the different phenotype definitions exhibit widely varying and disparate performance according to the different fairness metrics and subgroups. We hope that the proposed best practices can help in constructing fair and inclusive phenotype definitions.
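One way to read the link between fairness metrics and epidemiological metrics is that equal opportunity compares sensitivity (true positive rate) across groups, and predictive parity compares positive predictive value. The sketch below computes both per subgroup for a single phenotype definition judged against reference labels. The abstract does not list which metrics the authors use, so the metric choice, function names, group labels, and toy data here are all assumptions made for illustration.

```python
from collections import defaultdict


def groupwise_rates(y_true, y_pred, groups):
    """Return {group: {"sensitivity": ..., "ppv": ...}}; a rate is None if its denominator is zero."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        c = counts[g]  # ensures every group appears, even with only true negatives
        if t and p:
            c["tp"] += 1
        elif p:
            c["fp"] += 1
        elif t:
            c["fn"] += 1
    rates = {}
    for g, c in counts.items():
        positives = c["tp"] + c["fn"]  # patients who truly have the condition
        flagged = c["tp"] + c["fp"]    # patients selected by the phenotype definition
        rates[g] = {
            "sensitivity": c["tp"] / positives if positives else None,
            "ppv": c["tp"] / flagged if flagged else None,
        }
    return rates


def max_gap(rates, metric):
    """Largest between-group difference for one metric, ignoring undefined values."""
    values = [r[metric] for r in rates.values() if r[metric] is not None]
    return max(values) - min(values) if values else None


# Hypothetical toy cohort: reference labels, phenotype-definition output, and a subgroup label.
y_true = [1, 1, 1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 1]
groups = ["F", "F", "F", "F", "M", "M", "M", "M"]

rates = groupwise_rates(y_true, y_pred, groups)
sensitivity_gap = max_gap(rates, "sensitivity")
ppv_gap = max_gap(rates, "ppv")
print(rates, sensitivity_gap, ppv_gap)
```

Running the same comparison for each candidate phenotype definition makes it possible to see whether one definition misses, or over-selects, patients in a particular subgroup more than another does.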
62
McDermott MBA, Nestor B, Szolovits P. Clinical Artificial Intelligence: Design Principles and Fallacies. Clin Lab Med 2023; 43:29-46. [PMID: 36764807 DOI: 10.1016/j.cll.2022.09.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 12/15/2022]
Abstract
Clinical artificial intelligence (AI)/machine learning (ML) is anticipated to offer new abilities in clinical decision support, diagnostic reasoning, precision medicine, clinical operational support, and clinical research, but careful concern is needed to ensure these technologies work effectively in the clinic. Here, we detail the clinical ML/AI design process, identifying several key questions and detailing several common forms of issues that arise with ML tools, as motivated by real-world examples, such that clinicians and researchers can better anticipate and correct for such issues in their own use of ML/AI techniques.
Affiliation(s)
- Bret Nestor
- Department of Computer Science, University of Toronto, 40 St George St, Toronto, ON M5S 2E4, Canada
63
Baroudi H, Brock KK, Cao W, Chen X, Chung C, Court LE, El Basha MD, Farhat M, Gay S, Gronberg MP, Gupta AC, Hernandez S, Huang K, Jaffray DA, Lim R, Marquez B, Nealon K, Netherton TJ, Nguyen CM, Reber B, Rhee DJ, Salazar RM, Shanker MD, Sjogreen C, Woodland M, Yang J, Yu C, Zhao Y. Automated Contouring and Planning in Radiation Therapy: What Is 'Clinically Acceptable'? Diagnostics (Basel) 2023; 13:diagnostics13040667. [PMID: 36832155 PMCID: PMC9955359 DOI: 10.3390/diagnostics13040667] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 12/18/2022] [Revised: 01/21/2023] [Accepted: 01/30/2023] [Indexed: 02/12/2023] Open
Abstract
Developers and users of artificial-intelligence-based tools for automatic contouring and treatment planning in radiotherapy are expected to assess clinical acceptability of these tools. However, what is 'clinical acceptability'? Quantitative and qualitative approaches have been used to assess this ill-defined concept, all of which have advantages and disadvantages or limitations. The approach chosen may depend on the goal of the study as well as on available resources. In this paper, we discuss various aspects of 'clinical acceptability' and how they can move us toward a standard for defining clinical acceptability of new autocontouring and planning tools.
Affiliation(s)
- Hana Baroudi
- Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- The University of Texas MD Anderson Cancer Center UTHealth Houston Graduate School of Biomedical Sciences, Houston, TX 77030, USA
- Kristy K. Brock
- Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Department of Imaging Physics, Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Wenhua Cao
- Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Xinru Chen
- Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- The University of Texas MD Anderson Cancer Center UTHealth Houston Graduate School of Biomedical Sciences, Houston, TX 77030, USA
- Caroline Chung
- Department of Radiation Oncology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Laurence E. Court
- Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Corresponding author.
| | - Mohammad D. El Basha
- Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- The University of Texas MD Anderson Cancer Center UTHealth Houston Graduate School of Biomedical Sciences, Houston, TX 77030, USA
| | - Maguy Farhat
- Department of Radiation Oncology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Skylar Gay
- Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- The University of Texas MD Anderson Cancer Center UTHealth Houston Graduate School of Biomedical Sciences, Houston, TX 77030, USA
| | - Mary P. Gronberg
- Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- The University of Texas MD Anderson Cancer Center UTHealth Houston Graduate School of Biomedical Sciences, Houston, TX 77030, USA
| | - Aashish Chandra Gupta
- Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- The University of Texas MD Anderson Cancer Center UTHealth Houston Graduate School of Biomedical Sciences, Houston, TX 77030, USA
- Department of Imaging Physics, Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Soleil Hernandez
- Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- The University of Texas MD Anderson Cancer Center UTHealth Houston Graduate School of Biomedical Sciences, Houston, TX 77030, USA
| | - Kai Huang
- Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- The University of Texas MD Anderson Cancer Center UTHealth Houston Graduate School of Biomedical Sciences, Houston, TX 77030, USA
| | - David A. Jaffray
- Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Department of Imaging Physics, Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Rebecca Lim
- Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- The University of Texas MD Anderson Cancer Center UTHealth Houston Graduate School of Biomedical Sciences, Houston, TX 77030, USA
| | - Barbara Marquez
- Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- The University of Texas MD Anderson Cancer Center UTHealth Houston Graduate School of Biomedical Sciences, Houston, TX 77030, USA
| | - Kelly Nealon
- Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- The University of Texas MD Anderson Cancer Center UTHealth Houston Graduate School of Biomedical Sciences, Houston, TX 77030, USA
| | - Tucker J. Netherton
- Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Callistus M. Nguyen
- Department of Imaging Physics, Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Brandon Reber
- The University of Texas MD Anderson Cancer Center UTHealth Houston Graduate School of Biomedical Sciences, Houston, TX 77030, USA
- Department of Imaging Physics, Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Dong Joo Rhee
- Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Ramon M. Salazar
- Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Mihir D. Shanker
- The University of Queensland, Saint Lucia 4072, Australia
- The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Carlos Sjogreen
- Department of Physics, University of Houston, Houston, TX 77004, USA
| | - McKell Woodland
- Department of Imaging Physics, Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Department of Computer Science, Rice University, Houston, TX 77005, USA
| | - Jinzhong Yang
- Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Cenji Yu
- Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- The University of Texas MD Anderson Cancer Center UTHealth Houston Graduate School of Biomedical Sciences, Houston, TX 77030, USA
| | - Yao Zhao
- Department of Radiation Physics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- The University of Texas MD Anderson Cancer Center UTHealth Houston Graduate School of Biomedical Sciences, Houston, TX 77030, USA
| |
|
64
|
Berdahl CT, Baker L, Mann S, Osoba O, Girosi F. Strategies to Improve the Impact of Artificial Intelligence on Health Equity: Scoping Review. JMIR AI 2023; 2:e42936. [PMID: 38875587 PMCID: PMC11041459 DOI: 10.2196/42936] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/24/2022] [Revised: 12/14/2022] [Accepted: 12/29/2022] [Indexed: 06/16/2024]
Abstract
BACKGROUND Emerging artificial intelligence (AI) applications have the potential to improve health, but they may also perpetuate or exacerbate inequities. OBJECTIVE This review aims to provide a comprehensive overview of the health equity issues related to the use of AI applications and identify strategies proposed to address them. METHODS We searched PubMed, Web of Science, the IEEE (Institute of Electrical and Electronics Engineers) Xplore Digital Library, ProQuest U.S. Newsstream, Academic Search Complete, the Food and Drug Administration (FDA) website, and ClinicalTrials.gov to identify academic and gray literature related to AI and health equity that were published between 2014 and 2021 and additional literature related to AI and health equity during the COVID-19 pandemic from 2020 and 2021. Literature was eligible for inclusion in our review if it identified at least one equity issue and a corresponding strategy to address it. To organize and synthesize equity issues, we adopted a 4-step AI application framework: Background Context, Data Characteristics, Model Design, and Deployment. We then created a many-to-many mapping of the links between issues and strategies. RESULTS In 660 documents, we identified 18 equity issues and 15 strategies to address them. Equity issues related to Data Characteristics and Model Design were the most common. The most common strategies recommended to improve equity were improving the quantity and quality of data, evaluating the disparities introduced by an application, increasing model reporting and transparency, involving the broader community in AI application development, and improving governance. CONCLUSIONS Stakeholders should review our many-to-many mapping of equity issues and strategies when planning, developing, and implementing AI applications in health care so that they can make appropriate plans to ensure equity for populations affected by their products. 
AI application developers should consider adopting equity-focused checklists, and regulators such as the FDA should consider requiring them. Given that our review was limited to documents published online, developers may have unpublished knowledge of additional issues and strategies that we were unable to identify.
Affiliation(s)
- Carl Thomas Berdahl
- RAND Corporation, Santa Monica, CA, United States
- Department of Medicine, Cedars-Sinai Medical Center, Los Angeles, CA, United States
- Department of Emergency Medicine, Cedars-Sinai Medical Center, Los Angeles, CA, United States
| | | | - Sean Mann
- RAND Corporation, Santa Monica, CA, United States
| | - Osonde Osoba
- RAND Corporation, Santa Monica, CA, United States
| | | |
|
65
|
The computational psychiatry of antisocial behaviour and psychopathy. Neurosci Biobehav Rev 2023; 145:104995. [PMID: 36535376 DOI: 10.1016/j.neubiorev.2022.104995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2022] [Revised: 11/21/2022] [Accepted: 12/07/2022] [Indexed: 12/23/2022]
Abstract
Antisocial behaviours such as disobedience, lying, stealing, destruction of property, and aggression towards others are common to multiple disorders of childhood and adulthood, including conduct disorder, oppositional defiant disorder, psychopathy, and antisocial personality disorder. These disorders have a significant negative impact for individuals and for society, but whether they represent clinically different phenomena, or simply different approaches to diagnosing the same underlying psychopathology is highly debated. Computational psychiatry, with its dual focus on identifying different classes of disorder and health (data-driven) and latent cognitive and neurobiological mechanisms (theory-driven), is well placed to address these questions. The elucidation of mechanisms that might characterise latent processes across different disorders of antisocial behaviour can also provide important advances. In this review, we critically discuss the contribution of computational research to our understanding of various antisocial behaviour disorders, and highlight suggestions for how computational psychiatry can address important clinical and scientific questions about these disorders in the future.
|
66
|
Davoudi A, Sajdeya R, Ison R, Hagen J, Rashidi P, Price CC, Tighe PJ. Fairness in the prediction of acute postoperative pain using machine learning models. Front Digit Health 2023; 4:970281. [PMID: 36714611 PMCID: PMC9874861 DOI: 10.3389/fdgth.2022.970281] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2022] [Accepted: 10/24/2022] [Indexed: 01/12/2023] Open
Abstract
Introduction Overall performance of machine learning-based prediction models is promising; however, their generalizability and fairness must be vigorously investigated to ensure they perform sufficiently well for all patients. Objective This study aimed to evaluate prediction bias in machine learning models used for predicting acute postoperative pain. Method We conducted a retrospective review of electronic health records for patients undergoing orthopedic surgery from June 1, 2011, to June 30, 2019, at the University of Florida Health system/Shands Hospital. CatBoost machine learning models were trained for predicting the binary outcome of low (≤4) and high pain (>4). Model biases were assessed against seven protected attributes of age, sex, race, area deprivation index (ADI), speaking language, health literacy, and insurance type. Reweighing of protected attributes was investigated for reducing model bias compared with base models. Fairness metrics of equal opportunity, predictive parity, predictive equality, statistical parity, and overall accuracy equality were examined. Results The final dataset included 14,263 patients [age: 60.72 (16.03) years, 53.87% female, 39.13% low acute postoperative pain]. The machine learning model (area under the curve, 0.71) was biased in terms of age, race, ADI, and insurance type, but not in terms of sex, language, and health literacy. Despite promising overall performance in predicting acute postoperative pain, machine learning-based prediction models may be biased with respect to protected attributes. Conclusion These findings show the need to evaluate fairness in machine learning models involved in perioperative pain before they are implemented as clinical decision support tools.
Affiliation(s)
- Anis Davoudi
- Department of Anesthesiology, University of Florida College of Medicine, Gainesville, FL, United States
| | - Ruba Sajdeya
- Department of Epidemiology, University of Florida College of Public Health and Health Professions, Gainesville, FL, United States
| | - Ron Ison
- Department of Anesthesiology, University of Florida College of Medicine, Gainesville, FL, United States
| | - Jennifer Hagen
- Department of Orthopedic Surgery, University of Florida College of Medicine, Gainesville, FL, United States
| | - Parisa Rashidi
- Department of Biomedical Engineering, University of Florida Herbert Wertheim College of Engineering, Gainesville, FL, United States
| | - Catherine C. Price
- Department of Anesthesiology, University of Florida College of Medicine, Gainesville, FL, United States
- Department of Clinical and Health Psychology, University of Florida College of Public Health and Health Professions, Gainesville, FL, United States
| | - Patrick J. Tighe
- Department of Anesthesiology, University of Florida College of Medicine, Gainesville, FL, United States
| |
|
67
|
|
68
|
Kumar R, Singh D, Srinivasan K, Hu YC. AI-Powered Blockchain Technology for Public Health: A Contemporary Review, Open Challenges, and Future Research Directions. Healthcare (Basel) 2022; 11:healthcare11010081. [PMID: 36611541 PMCID: PMC9819078 DOI: 10.3390/healthcare11010081] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/08/2022] [Revised: 12/14/2022] [Accepted: 12/20/2022] [Indexed: 12/29/2022] Open
Abstract
Blockchain technology has been growing at a substantial growth rate over the last decade. Introduced as the backbone of cryptocurrencies such as Bitcoin, it soon found its application in other fields because of its security and privacy features. Blockchain has been used in the healthcare industry for several purposes including secure data logging, transactions, and maintenance using smart contracts. Great work has been carried out to make blockchain smart, with the integration of Artificial Intelligence (AI) to combine the best features of the two technologies. This review incorporates the conceptual and functional aspects of the individual technologies and innovations in the domains of blockchain and artificial intelligence and lays down a strong foundational understanding of the domains individually and also rigorously discusses the various ways AI has been used along with blockchain to power the healthcare industry including areas of great importance such as electronic health record (EHR) management, distant-patient monitoring and telemedicine, genomics, drug research, and testing, specialized imaging and outbreak prediction. It compiles various algorithms from supervised and unsupervised machine learning problems along with deep learning algorithms such as convolutional/recurrent neural networks and numerous platforms currently being used in AI-powered blockchain systems and discusses their applications. The review also presents the challenges still faced by these systems which they inherit from the AI and blockchain algorithms used at the core of them and the scope of future work.
Affiliation(s)
- Ritik Kumar
- School of Computer Science and Engineering, Vellore Institute of Technology, Vellore 632014, India
| | - Divyangi Singh
- School of Computer Science and Engineering, Vellore Institute of Technology, Vellore 632014, India
| | - Kathiravan Srinivasan
- School of Computer Science and Engineering, Vellore Institute of Technology, Vellore 632014, India
| | - Yuh-Chung Hu
- Department of Mechanical and Electromechanical Engineering, National ILan University, Yilan 26047, Taiwan
| |
|
69
|
Wambua S, Crowe F, Thangaratinam S, O'Reilly D, McCowan C, Brophy S, Yau C, Nirantharakumar K, Riley R. Protocol for development and validation of postpartum cardiovascular disease (CVD) risk prediction model incorporating reproductive and pregnancy-related candidate predictors. Diagn Progn Res 2022; 6:23. [PMID: 36536453 PMCID: PMC9761974 DOI: 10.1186/s41512-022-00137-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/02/2022] [Accepted: 12/07/2022] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND Cardiovascular disease (CVD) is a leading cause of death among women. CVD is associated with reduced quality of life, significant treatment and management costs, and lost productivity. Estimating the risk of CVD would help patients at a higher risk of CVD to initiate preventive measures to reduce risk of disease. The Framingham risk score and the QRISK® score are two risk prediction models used to evaluate future CVD risk in the UK. Although the algorithms perform well in the general population, they do not take into account pregnancy complications, which are well known risk factors for CVD in women and have been highlighted in a recent umbrella review. We plan to develop a robust CVD risk prediction model to assess the additional value of pregnancy risk factors in risk prediction of CVD in women postpartum. METHODS Using candidate predictors from QRISK®-3, the umbrella review identified from literature and from discussions with clinical experts and patient research partners, we will use time-to-event Cox proportional hazards models to develop and validate a 10-year risk prediction model for CVD postpartum using Clinical Practice Research Datalink (CPRD) primary care database for development and internal validation of the algorithm and the Secure Anonymised Information Linkage (SAIL) databank for external validation. We will then assess the value of additional candidate predictors to the QRISK®-3 in our internal and external validations. DISCUSSION The developed risk prediction model will incorporate pregnancy-related factors which have been shown to be associated with future risk of CVD but have not been taken into account in current risk prediction models. Our study will therefore highlight the importance of incorporating pregnancy-related risk factors into risk prediction modeling for CVD postpartum.
Affiliation(s)
- Steven Wambua
- Institute of Applied Health Research, College of Medical and Dental Sciences, University of Birmingham, Edgbaston, Birmingham, UK.
| | - Francesca Crowe
- Institute of Applied Health Research, College of Medical and Dental Sciences, University of Birmingham, Edgbaston, Birmingham, UK
| | - Shakila Thangaratinam
- WHO Collaborating Centre for Global Women's Health, Institute of Metabolism and Systems Research, University of Birmingham, Birmingham, UK
- Department of Obstetrics and Gynaecology, Birmingham Women's and Children's NHS Foundation Trust, Birmingham, UK
| | - Dermot O'Reilly
- Centre for Public Health, Queen's University Belfast, Belfast, UK
| | - Colin McCowan
- School of Medicine, University of St Andrews, St Andrews, UK
| | - Sinead Brophy
- Data Science, Medical School, Swansea University, Swansea, UK
| | - Christopher Yau
- Big Data Institute, University of Oxford, Li Ka Shing Centre for Health Information and Discovery, Old Road Campus, Oxford, OX3 7LF, UK
- Nuffield Department of Women's & Reproductive Health, University of Oxford, Level 3 Women's Centre, John Radcliffe Hospital, Oxford, OX3 9DU, UK
- Health Data Research, London, UK
| | - Krishnarajah Nirantharakumar
- Institute of Applied Health Research, College of Medical and Dental Sciences, University of Birmingham, Edgbaston, Birmingham, UK
| | - Richard Riley
- Institute of Applied Health Research, College of Medical and Dental Sciences, University of Birmingham, Edgbaston, Birmingham, UK
| |
|
70
|
Gulamali FF, Sawant AS, Kovatch P, Glicksberg B, Charney A, Nadkarni GN, Oermann E. Autoencoders for sample size estimation for fully connected neural network classifiers. NPJ Digit Med 2022; 5:180. [PMID: 36513729 PMCID: PMC9747810 DOI: 10.1038/s41746-022-00728-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/02/2022] [Accepted: 11/29/2022] [Indexed: 12/15/2022] Open
Abstract
Sample size estimation is a crucial step in experimental design but is understudied in the context of deep learning. Currently, estimating the quantity of labeled data needed to train a classifier to a desired performance, is largely based on prior experience with similar models and problems or on untested heuristics. In many supervised machine learning applications, data labeling can be expensive and time-consuming and would benefit from a more rigorous means of estimating labeling requirements. Here, we study the problem of estimating the minimum sample size of labeled training data necessary for training computer vision models as an exemplar for other deep learning problems. We consider the problem of identifying the minimal number of labeled data points to achieve a generalizable representation of the data, a minimum converging sample (MCS). We use autoencoder loss to estimate the MCS for fully connected neural network classifiers. At sample sizes smaller than the MCS estimate, fully connected networks fail to distinguish classes, and at sample sizes above the MCS estimate, generalizability strongly correlates with the loss function of the autoencoder. We provide an easily accessible, code-free, and dataset-agnostic tool to estimate sample sizes for fully connected networks. Taken together, our findings suggest that MCS and convergence estimation are promising methods to guide sample size estimates for data collection and labeling prior to training deep learning models in computer vision.
Affiliation(s)
- Faris F. Gulamali
- Icahn School of Medicine, New York, NY 10029, USA
| | - Ashwin S. Sawant
- Icahn School of Medicine, New York, NY 10029, USA
| | - Patricia Kovatch
- Icahn School of Medicine, New York, NY 10029, USA
| | - Benjamin Glicksberg
- Icahn School of Medicine, New York, NY 10029, USA
| | - Alexander Charney
- Icahn School of Medicine, New York, NY 10029, USA
| | - Girish N. Nadkarni
- Icahn School of Medicine, New York, NY 10029, USA
| | - Eric Oermann
- New York University, New York, NY 10016, USA
| |
|
71
|
Butt MA, Qayyum A, Ali H, Al-Fuqaha A, Qadir J. Towards Secure Private and Trustworthy Human-Centric Embedded Machine Learning: An Emotion-Aware Facial Recognition Case Study. Comput Secur 2022. [DOI: 10.1016/j.cose.2022.103058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]
|
72
|
Mashraqi AM, Allehyani B. Current trends on the application of artificial intelligence in medical sciences. Bioinformation 2022; 18:1050-1061. [PMID: 37693078 PMCID: PMC10484692 DOI: 10.6026/973206300181050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/01/2022] [Revised: 11/29/2022] [Accepted: 11/30/2022] [Indexed: 09/12/2023] Open
Abstract
Artificial Intelligence (AI) is expanding with colossal applications in various sectors. In the healthcare sector, it is booming to make life simpler with utmost accuracy by predicting, diagnosing and up to care with the help of Machine Learning (ML) and Deep Learning (DL) applications. Modern computer algorithms have attained accuracy levels comparable to those of human specialists in medical sciences, although computers often do jobs more quickly than people do. It is also expected that there will not be a mandate for humans to be present for the jobs that machines can do, and it is gaining the highest peak because of good trained artificial models in the medical field. ML enhances the therapeutic process and improves health by encouraging more patient participation. ML may get more accurate patient data when used with the Internet of Medical Things (IoMT) and automate message notifications that prompt patients to respond at certain times. The motivation behind this article is to make a comprehensive review of the on-going implementation of ML in medical science, what challenges it is facing now, and how it can be simplified for future researchers to contribute better to medical sciences while applying it to the practitioners' jobs easier. In this review, we have extensively mined the data and brought up systematised applications of AI in healthcare, what challenges have been faced by the experts, and what ethical responsibilities are liable to them while taking the data. We also tabulated which algorithms will be helpful for what kind of data and disease conditions will be useful for future researchers and developers. This article will provide a better insight into AI and ML for the beginner to the advanced developer and researcher to understand the concepts from the basics.
Affiliation(s)
- Aisha Mousa Mashraqi
- Department of Computer Science, College of Computer Science and Information Systems, Najran University, Saudi Arabia
| | - Budoor Allehyani
- Department of Information System, College of Computers and Information Systems, Umm Al-Qura University, Saudi Arabia
| |
|
73
|
Fisher S, Rosella LC. Priorities for successful use of artificial intelligence by public health organizations: a literature review. BMC Public Health 2022; 22:2146. [PMID: 36419010 PMCID: PMC9682716 DOI: 10.1186/s12889-022-14422-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2022] [Accepted: 10/23/2022] [Indexed: 11/24/2022] Open
Abstract
Artificial intelligence (AI) has the potential to improve public health's ability to promote the health of all people in all communities. To successfully realize this potential and use AI for public health functions it is important for public health organizations to thoughtfully develop strategies for AI implementation. Six key priorities for successful use of AI technologies by public health organizations are discussed: 1) Contemporary data governance; 2) Investment in modernized data and analytic infrastructure and procedures; 3) Addressing the skills gap in the workforce; 4) Development of strategic collaborative partnerships; 5) Use of good AI practices for transparency and reproducibility, and; 6) Explicit consideration of equity and bias.
Affiliation(s)
- Stacey Fisher
- Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
- Public Health Ontario, Toronto, ON, Canada
- Vector Institute for Artificial Intelligence, Toronto, ON, Canada
- ICES, Toronto, ON, Canada
| | - Laura C. Rosella
- Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
- Vector Institute for Artificial Intelligence, Toronto, ON, Canada
- ICES, Toronto, ON, Canada
- Institute for Better Health, Trillium Health Partners, Mississauga, ON, Canada
- Department of Laboratory Medicine and Pathobiology, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada
| |
|
74
|
Koutsouleris N, Hauser TU, Skvortsova V, De Choudhury M. From promise to practice: towards the realisation of AI-informed mental health care. THE LANCET DIGITAL HEALTH 2022; 4:e829-e840. [DOI: 10.1016/s2589-7500(22)00153-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/28/2022] [Revised: 07/14/2022] [Accepted: 07/27/2022] [Indexed: 11/07/2022]
|
75
|
Leist AK, Klee M, Kim JH, Rehkopf DH, Bordas SPA, Muniz-Terrera G, Wade S. Mapping of machine learning approaches for description, prediction, and causal inference in the social and health sciences. SCIENCE ADVANCES 2022; 8:eabk1942. [PMID: 36260666 PMCID: PMC9581488 DOI: 10.1126/sciadv.abk1942] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/28/2021] [Accepted: 09/01/2022] [Indexed: 05/20/2023]
Abstract
Machine learning (ML) methodology used in the social and health sciences needs to fit the intended research purposes of description, prediction, or causal inference. This paper provides a comprehensive, systematic meta-mapping of research questions in the social and health sciences to appropriate ML approaches by incorporating the necessary requirements to statistical analysis in these disciplines. We map the established classification into description, prediction, counterfactual prediction, and causal structural learning to common research goals, such as estimating prevalence of adverse social or health outcomes, predicting the risk of an event, and identifying risk factors or causes of adverse outcomes, and explain common ML performance metrics. Such mapping may help to fully exploit the benefits of ML while considering domain-specific aspects relevant to the social and health sciences and hopefully contribute to the acceleration of the uptake of ML applications to advance both basic and applied social and health sciences research.
Affiliation(s)
- Anja K. Leist
- Department of Social Sciences, Institute for Research on Socio-Economic Inequality (IRSEI), University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Corresponding author.
| | - Matthias Klee
- Department of Social Sciences, Institute for Research on Socio-Economic Inequality (IRSEI), University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | - Jung Hyun Kim
- Department of Social Sciences, Institute for Research on Socio-Economic Inequality (IRSEI), University of Luxembourg, Esch-sur-Alzette, Luxembourg
| | - David H. Rehkopf
- Department of Epidemiology and Population Health, Stanford University, Palo Alto, CA, USA
| | | | - Graciela Muniz-Terrera
- Centre for Dementia Prevention, University of Edinburgh, Edinburgh, UK
- Ohio University, Athens, OH, USA
| | - Sara Wade
- School of Mathematics, University of Edinburgh, Edinburgh, UK
| |
|
76
|
de Biase A, Sourlos N, van Ooijen PM. Standardization of Artificial Intelligence Development in Radiotherapy. Semin Radiat Oncol 2022; 32:415-420. [DOI: 10.1016/j.semradonc.2022.06.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]
|
77
|
Rasheed K, Qayyum A, Ghaly M, Al-Fuqaha A, Razi A, Qadir J. Explainable, trustworthy, and ethical machine learning for healthcare: A survey. Comput Biol Med 2022; 149:106043. [PMID: 36115302 DOI: 10.1016/j.compbiomed.2022.106043] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/22/2022] [Revised: 08/15/2022] [Accepted: 08/20/2022] [Indexed: 12/18/2022]
Abstract
With the advent of machine learning (ML) and deep learning (DL) empowered applications for critical applications like healthcare, the questions about liability, trust, and interpretability of their outputs are raising. The black-box nature of various DL models is a roadblock to clinical utilization. Therefore, to gain the trust of clinicians and patients, we need to provide explanations about the decisions of models. With the promise of enhancing the trust and transparency of black-box models, researchers are in the phase of maturing the field of eXplainable ML (XML). In this paper, we provided a comprehensive review of explainable and interpretable ML techniques for various healthcare applications. Along with highlighting security, safety, and robustness challenges that hinder the trustworthiness of ML, we also discussed the ethical issues arising because of the use of ML/DL for healthcare. We also describe how explainable and trustworthy ML can resolve all these ethical problems. Finally, we elaborate on the limitations of existing approaches and highlight various open research problems that require further development.
Affiliation(s)
- Khansa Rasheed
- IHSAN Lab, Information Technology University of the Punjab (ITU), Lahore, Pakistan.
- Adnan Qayyum
- IHSAN Lab, Information Technology University of the Punjab (ITU), Lahore, Pakistan.
- Mohammed Ghaly
- Research Center for Islamic Legislation and Ethics (CILE), College of Islamic Studies, Hamad Bin Khalifa University (HBKU), Doha, Qatar.
- Ala Al-Fuqaha
- Information and Computing Technology Division, College of Science and Engineering, Hamad Bin Khalifa University (HBKU), Doha, Qatar.
- Adeel Razi
- Turner Institute for Brain and Mental Health, Monash University, Clayton, Australia; Monash Biomedical Imaging, Monash University, Clayton, Australia; Wellcome Centre for Human Neuroimaging, UCL, London, United Kingdom; CIFAR Azrieli Global Scholars program, CIFAR, Toronto, Canada.
- Junaid Qadir
- Department of Computer Science and Engineering, College of Engineering, Qatar University, Doha, Qatar.
|
78
|
Uche-Anya E, Anyane-Yeboa A, Berzin TM, Ghassemi M, May FP. Artificial intelligence in gastroenterology and hepatology: how to advance clinical practice while ensuring health equity. Gut 2022; 71:1909-1915. [PMID: 35688612 DOI: 10.1136/gutjnl-2021-326271] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 10/07/2021] [Accepted: 04/19/2022] [Indexed: 12/12/2022]
Abstract
Artificial intelligence (AI) and machine learning (ML) systems are increasingly used in medicine to improve clinical decision-making and healthcare delivery. In gastroenterology and hepatology, studies have explored a myriad of opportunities for AI/ML applications which are already making the transition to bedside. Despite these advances, there is a risk that biases and health inequities can be introduced or exacerbated by these technologies. If unrecognised, these technologies could generate or worsen systematic racial, ethnic and sex disparities when deployed on a large scale. There are several mechanisms through which AI/ML could contribute to health inequities in gastroenterology and hepatology, including diagnosis of oesophageal cancer, management of inflammatory bowel disease (IBD), liver transplantation, colorectal cancer screening and many others. This review adapts a framework for ethical AI/ML development and application to gastroenterology and hepatology such that clinical practice is advanced while minimising bias and optimising health equity.
Affiliation(s)
- Eugenia Uche-Anya
- Division of Gastroenterology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Adjoa Anyane-Yeboa
- Division of Gastroenterology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Tyler M Berzin
- Center for Advanced Endoscopy, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
- Marzyeh Ghassemi
- Institute for Medical and Evaluative Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Folasade P May
- Vatche and Tamar Manoukian Division of Digestive Diseases, UCLA Kaiser Permanente Center for Health Equity and Jonsson Comprehensive Cancer Center, University of California Los Angeles, Los Angeles, California, USA
|
79
|
Albert K, Delano M. Sex trouble: Sex/gender slippage, sex confusion, and sex obsession in machine learning using electronic health records. PATTERNS (NEW YORK, N.Y.) 2022; 3:100534. [PMID: 36033589 PMCID: PMC9403398 DOI: 10.1016/j.patter.2022.100534] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/15/2023]
Abstract
False assumptions that sex and gender are binary, static, and concordant are deeply embedded in the medical system. As machine learning researchers use medical data to build tools to solve novel problems, understanding how existing systems represent sex/gender incorrectly is necessary to avoid perpetuating harm. In this perspective, we identify and discuss three factors to consider when working with sex/gender in research: "sex/gender slippage," the frequent substitution of sex and sex-related terms for gender and vice versa; "sex confusion," the fact that any given sex variable holds many different potential meanings; and "sex obsession," the idea that the relevant variable for most inquiries related to sex/gender is sex assigned at birth. We then explore how these phenomena show up in medical machine learning research using electronic health records, with a specific focus on HIV risk prediction. Finally, we offer recommendations about how machine learning researchers can engage more carefully with questions of sex/gender.
Affiliation(s)
- Kendra Albert
- Cyberlaw Clinic, Harvard Law School, Cambridge, MA 02138, USA
- Maggie Delano
- Engineering Department, Swarthmore College, Swarthmore, PA 19146, USA
|
80
|
Addressing fairness in artificial intelligence for medical imaging. Nat Commun 2022; 13:4581. [PMID: 35933408 PMCID: PMC9357063 DOI: 10.1038/s41467-022-32186-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/08/2022] [Accepted: 07/21/2022] [Indexed: 12/19/2022]
|
81
|
Honarvar H, Agarwal C, Somani S, Vaid A, Lampert J, Wanyan T, Reddy VY, Nadkarni GN, Miotto R, Zitnik M, Wang F, Glicksberg BS. Enhancing convolutional neural network predictions of electrocardiograms with left ventricular dysfunction using a novel sub-waveform representation. CARDIOVASCULAR DIGITAL HEALTH JOURNAL 2022; 3:220-231. [PMID: 36310683 PMCID: PMC9596304 DOI: 10.1016/j.cvdhj.2022.07.074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/03/2022]
Abstract
Background: Electrocardiogram (ECG) deep learning (DL) has promise to improve the outcomes of patients with cardiovascular abnormalities. In ECG DL, researchers often use convolutional neural networks (CNNs) and traditionally use the full duration of raw ECG waveforms that create redundancies in feature learning and result in inaccurate predictions with large uncertainties. Objective: For enhancing these predictions, we introduced a sub-waveform representation that leverages the rhythmic pattern of ECG waveforms (data-centric approach) rather than changing the CNN architecture (model-centric approach). Results: We applied the proposed representation to a population with 92,446 patients to identify left ventricular dysfunction. We found that the sub-waveform representation increases the performance metrics compared to the full-waveform representation. We observed a 2% increase for area under the receiver operating characteristic curve and 10% increase for area under the precision-recall curve. We also carefully examined three reliability components of explainability, interpretability, and fairness. We provided an explanation for enhancements obtained by heartbeat alignment mechanism. By developing a new scoring system, we interpreted the clinical relevance of ECG features and showed that sub-waveform representation further pushes the scores towards clinical predictions. Finally, we showed that the new representation significantly reduces prediction uncertainties within subgroups that contributes to individual fairness. Conclusion: We expect that this added control over the granularity of ECG data will improve the DL modeling for new artificial intelligence technologies in the cardiovascular space.
Affiliation(s)
- Hossein Honarvar
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, New York
- Chirag Agarwal
- Department of Biomedical Informatics, Harvard University, Boston, Massachusetts
- Sulaiman Somani
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, New York
- Akhil Vaid
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, New York
- Joshua Lampert
- Helmsley Center for Cardiac Electrophysiology, Mount Sinai Hospital, New York, New York
- Tingyi Wanyan
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, New York
- School of Informatics, Computing, and Engineering, Indiana University, Bloomington, Indiana
- Vivek Y. Reddy
- Helmsley Center for Cardiac Electrophysiology, Mount Sinai Hospital, New York, New York
- Girish N. Nadkarni
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, New York
- Division of Nephrology, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, New York
- Riccardo Miotto
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, New York
- Marinka Zitnik
- Department of Biomedical Informatics, Harvard University, Boston, Massachusetts
- Fei Wang
- Department of Population Health Sciences, Weill Cornell Medicine, New York, New York
- Benjamin S. Glicksberg
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, New York
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York
- Address reprint requests and correspondence: Dr Benjamin S. Glicksberg, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY 10029.
|
82
|
Tang A, Woldemariam S, Roger J, Sirota M. Translational Bioinformatics to Enable Precision Medicine for All: Elevating Equity across Molecular, Clinical, and Digital Realms. Yearb Med Inform 2022; 31:106-115. [PMID: 36463867 PMCID: PMC9719766 DOI: 10.1055/s-0042-1742513] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/07/2022]
Abstract
OBJECTIVES: Over the past few years, challenges from the pandemic have led to an explosion of data sharing and algorithmic development efforts in the areas of molecular measurements, clinical data, and digital health. We aim to characterize and describe recent advanced computational approaches in translational bioinformatics across these domains in the context of issues or progress related to equity and inclusion. METHODS: We conducted a literature assessment of the trends and approaches in translational bioinformatics in the past few years. RESULTS: We present a review of recent computational approaches across molecular, clinical, and digital realms. We discuss applications of phenotyping, disease subtype characterization, predictive modeling, biomarker discovery, and treatment selection. We consider these methods and applications through the lens of equity and inclusion in biomedicine. CONCLUSION: Equity and inclusion should be incorporated at every step of translational bioinformatics projects, including project design, data collection, model creation, and clinical implementation. These considerations, coupled with the exciting breakthroughs in big data and machine learning, are pivotal to reach the goals of precision medicine for all.
Affiliation(s)
- Alice Tang
- Bakar Computational Health Sciences Institute, UCSF, San Francisco, CA, USA
- Graduate Program in Bioengineering, UCSF, San Francisco, CA, USA
- School of Medicine, UCSF, San Francisco, CA, USA
- Sarah Woldemariam
- Bakar Computational Health Sciences Institute, UCSF, San Francisco, CA, USA
- School of Medicine, UCSF, San Francisco, CA, USA
- Jacquelyn Roger
- Bakar Computational Health Sciences Institute, UCSF, San Francisco, CA, USA
- Graduate Program in Biological and Medical Informatics, UCSF, San Francisco, CA, USA
- Marina Sirota
- Bakar Computational Health Sciences Institute, UCSF, San Francisco, CA, USA
- Department of Pediatrics, UCSF, San Francisco, CA, USA
|
83
|
Vigdorchik JM, Jang SJ, Taunton MJ, Haddad FS. Deep learning in orthopaedic research: weighing idealism against realism. Bone Joint J 2022; 104-B:909-910. [PMID: 35909380 DOI: 10.1302/0301-620x.104b8.bjj-2022-0416] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 11/05/2022]
Affiliation(s)
- Jonathan M Vigdorchik
- Department of Orthopaedic Surgery, Adult Reconstruction and Joint Replacement Service, New York, New York, USA
- Seong J Jang
- Department of Orthopaedic Surgery, Adult Reconstruction and Joint Replacement Service, New York, New York, USA; Weill Cornell Medical College, New York, New York, USA
- Michael J Taunton
- Department of Orthopedic Surgery, Mayo Clinic, Rochester, Minnesota, USA
- Fares S Haddad
- University College London Hospitals NHS Foundation Trust, The Princess Grace Hospital, and The NIHR Biomedical Research Centre at UCLH, London, UK; The Bone & Joint Journal, London, UK
|
84
|
An Optimized Hyperparameter of Convolutional Neural Network Algorithm for Bug Severity Prediction in Alzheimer’s-Based IoT System. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:7210928. [PMID: 35800696 PMCID: PMC9256343 DOI: 10.1155/2022/7210928] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2022] [Revised: 05/17/2022] [Accepted: 05/27/2022] [Indexed: 11/17/2022]
Abstract
Software is involved in all aspects of healthcare, from booking appointments to systems used for the treatment and care of patients. Many vendors and consultants develop high-quality healthcare software such as hospital management systems, medical electronic systems, and middleware in medical devices. Internet of Things (IoT) medical devices are gaining attention and give people access to new technology. IoT devices use sensors to monitor the health condition of patients, specifically for brain diseases such as Alzheimer's, Parkinson's, and traumatic brain injury. Embedded software is present in IoT medical devices, and its complexity increases day by day along with the number and complexity of bugs in the devices. Bugs in IoT medical devices can have severe consequences such as inaccurate records, circulatory suffering, and in some cases death, along with delays in handling patients. There is a need to predict the impact of bugs (severe or nonsevere), especially in IoT medical devices, given their critical nature. This research proposes a hybrid bug severity prediction model that combines a convolutional neural network (CNN) with Harris Hawk optimization (HHO), using HHO to optimize the CNN hyperparameters. A dataset of bugs present in healthcare systems and IoT medical devices is created and used to evaluate the proposed model. A preprocessing technique is applied to the textual dataset, along with a feature extraction technique for the CNN embedding layer. In HHO, we define the hyperparameter values of “Batch Size, Learning Rate, Activation Function, Optimizer Parameters, and Kernel Initializers” before training the model. The hybrid CNN-HHO model is applied, and 10-fold cross validation is performed for evaluation. Results indicate an accuracy of 96.21% with the proposed model.
|
85
|
Berger SE, Baria AT. Assessing Pain Research: A Narrative Review of Emerging Pain Methods, Their Technosocial Implications, and Opportunities for Multidisciplinary Approaches. FRONTIERS IN PAIN RESEARCH 2022; 3:896276. [PMID: 35721658 PMCID: PMC9201034 DOI: 10.3389/fpain.2022.896276] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 03/14/2022] [Accepted: 05/12/2022] [Indexed: 11/13/2022]
Abstract
Pain research traverses many disciplines and methodologies. Yet, despite our understanding and field-wide acceptance of the multifactorial essence of pain as a sensory perception, emotional experience, and biopsychosocial condition, pain scientists and practitioners often remain siloed within their domain expertise and associated techniques. The context in which the field finds itself today, with increasing reliance on digital technologies, an on-going pandemic, and continued disparities in pain care, requires new collaborations and different approaches to measuring pain. Here, we review the state-of-the-art in human pain research, summarizing emerging practices and cutting-edge techniques across multiple methods and technologies. For each, we outline foreseeable technosocial considerations, reflecting on implications for standards of care, pain management, research, and societal impact. Through overviewing alternative data sources and varied ways of measuring pain and by reflecting on the concerns, limitations, and challenges facing the field, we hope to create critical dialogues, inspire more collaborations, and foster new ideas for future pain research methods.
Affiliation(s)
- Sara E. Berger
- Responsible and Inclusive Technologies Research, Exploratory Sciences Division, IBM Thomas J. Watson Research Center, Yorktown Heights, NY, United States
|
86
|
Yousefi PD, Suderman M, Langdon R, Whitehurst O, Davey Smith G, Relton CL. DNA methylation-based predictors of health: applications and statistical considerations. Nat Rev Genet 2022; 23:369-383. [PMID: 35304597 DOI: 10.1038/s41576-022-00465-w] [Citation(s) in RCA: 67] [Impact Index Per Article: 33.5] [Accepted: 02/18/2022] [Indexed: 12/12/2022]
Abstract
DNA methylation data have become a valuable source of information for biomarker development, because, unlike static genetic risk estimates, DNA methylation varies dynamically in relation to diverse exogenous and endogenous factors, including environmental risk factors and complex disease pathology. Reliable methods for genome-wide measurement at scale have led to the proliferation of epigenome-wide association studies and subsequently to the development of DNA methylation-based predictors across a wide range of health-related applications, from the identification of risk factors or exposures, such as age and smoking, to early detection of disease or progression in cancer, cardiovascular and neurological disease. This Review evaluates the progress of existing DNA methylation-based predictors, including the contribution of machine learning techniques, and assesses the uptake of key statistical best practices needed to ensure their reliable performance, such as data-driven feature selection, elimination of data leakage in performance estimates and use of generalizable, adequately powered training samples.
Affiliation(s)
- Paul D Yousefi
- Medical Research Council Integrative Epidemiology Unit at the University of Bristol, University of Bristol, Bristol, UK
- Matthew Suderman
- Medical Research Council Integrative Epidemiology Unit at the University of Bristol, University of Bristol, Bristol, UK
- Ryan Langdon
- Medical Research Council Integrative Epidemiology Unit at the University of Bristol, University of Bristol, Bristol, UK
- Oliver Whitehurst
- Medical Research Council Integrative Epidemiology Unit at the University of Bristol, University of Bristol, Bristol, UK
- George Davey Smith
- Medical Research Council Integrative Epidemiology Unit at the University of Bristol, University of Bristol, Bristol, UK
- Caroline L Relton
- Medical Research Council Integrative Epidemiology Unit at the University of Bristol, University of Bristol, Bristol, UK.
|
87
|
Borowiec ML, Dikow RB, Frandsen PB, McKeeken A, Valentini G, White AE. Deep learning as a tool for ecology and evolution. Methods Ecol Evol 2022. [DOI: 10.1111/2041-210x.13901] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/29/2022]
Affiliation(s)
- Marek L. Borowiec
- Entomology, Plant Pathology and Nematology University of Idaho Moscow ID USA
- Institute for Bioinformatics and Evolutionary Studies (IBEST) University of Idaho Moscow ID USA
- Rebecca B. Dikow
- Data Science Lab, Office of the Chief Information Officer Smithsonian Institution Washington DC USA
- Paul B. Frandsen
- Data Science Lab, Office of the Chief Information Officer Smithsonian Institution Washington DC USA
- Department of Plant and Wildlife Sciences Brigham Young University Provo UT USA
- Alexander McKeeken
- Entomology, Plant Pathology and Nematology University of Idaho Moscow ID USA
- Alexander E. White
- Data Science Lab, Office of the Chief Information Officer Smithsonian Institution Washington DC USA
- Department of Botany, National Museum of Natural History Smithsonian Institution Washington DC USA
|
88
|
Ghassemi M, Mohamed S. Machine learning and health need better values. NPJ Digit Med 2022; 5:51. [PMID: 35459793 PMCID: PMC9033858 DOI: 10.1038/s41746-022-00595-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2021] [Accepted: 03/29/2022] [Indexed: 11/10/2022]
Affiliation(s)
- Marzyeh Ghassemi
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA; Institute for Medical Engineering & Science, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA; CIFAR AI Chair, Vector Institute, Toronto, Ontario, M5G 1M1, Canada.
|
89
|
Ethics methods are required as part of reporting guidelines for artificial intelligence in healthcare. NAT MACH INTELL 2022. [DOI: 10.1038/s42256-022-00479-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/08/2022]
|
90
|
Rosella LC. Deep learning approaches applied to routinely collected health data: future directions. Int J Epidemiol 2022; 51:931-933. [PMID: 35373824 DOI: 10.1093/ije/dyac064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2021] [Accepted: 03/24/2022] [Indexed: 11/12/2022]
Affiliation(s)
- Laura C Rosella
- Dalla Lana School of Public Health, Division of Epidemiology, University of Toronto, ON, Canada; Department of Laboratory Medicine and Pathobiology, Temerty Faculty of Medicine, University of Toronto, ON, Canada; Institute for Better Health, Trillium Health Partners, Mississauga, ON, Canada; ICES, Toronto, ON, Canada; Vector Institute, Toronto, ON, Canada
|
91
|
Filipow N, Main E, Sebire NJ, Booth J, Taylor AM, Davies G, Stanojevic S. Implementation of prognostic machine learning algorithms in paediatric chronic respiratory conditions: a scoping review. BMJ Open Respir Res 2022; 9:9/1/e001165. [PMID: 35297371 PMCID: PMC8928277 DOI: 10.1136/bmjresp-2021-001165] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 12/02/2021] [Accepted: 03/06/2022] [Indexed: 11/23/2022]
Abstract
Machine learning (ML) holds great potential for predicting clinical outcomes in heterogeneous chronic respiratory diseases (CRD) affecting children, where timely individualised treatments offer opportunities for health optimisation. This paper identifies rate-limiting steps in ML prediction model development that impair clinical translation and discusses regulatory, clinical and ethical considerations for ML implementation. A scoping review of ML prediction models in paediatric CRDs was undertaken using the PRISMA extension scoping review guidelines. From 1209 results, 25 articles published between 2013 and 2021 were evaluated for features of a good clinical prediction model using the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) guidelines. Most of the studies were in asthma (80%), with few in cystic fibrosis (12%), bronchiolitis (4%) and childhood wheeze (4%). There were inconsistencies in model reporting and studies were limited by a lack of validation, and absence of equations or code for replication. Clinician involvement during ML model development is essential and diversity, equity and inclusion should be assessed at each step of the ML pipeline to ensure algorithms do not promote or amplify health disparities among marginalised groups. As ML prediction studies become more frequent, it is important that models are rigorously developed using published guidelines and take account of regulatory frameworks which depend on model complexity, patient safety, accountability and liability.
Affiliation(s)
- Nicole Filipow
- UCL Great Ormond Street Institute of Child Health, University College London, London, UK
- Eleanor Main
- UCL Great Ormond Street Institute of Child Health, University College London, London, UK
- Neil J Sebire
- Population, Policy and Practice Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, University College London, London, UK; GOSH NIHR BRC, Great Ormond Street Hospital for Children, London, UK
- John Booth
- Population, Policy and Practice Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, University College London, London, UK; GOSH NIHR BRC, Great Ormond Street Hospital for Children, London, UK
- Andrew M Taylor
- GOSH NIHR BRC, Great Ormond Street Hospital for Children, London, UK; Institute of Cardiovascular Science, University College London, London, UK
- Gwyneth Davies
- Population, Policy and Practice Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, University College London, London, UK; GOSH NIHR BRC, Great Ormond Street Hospital for Children, London, UK
- Sanja Stanojevic
- Community Health and Epidemiology, Dalhousie University, Halifax, Nova Scotia, Canada
|
92
|
Nedios S, Iliodromitis K, Kowalewski C, Bollmann A, Hindricks G, Dagres N, Bogossian H. Big Data in electrophysiology. Herzschrittmacherther Elektrophysiol 2022; 33:26-33. [PMID: 35137276 DOI: 10.1007/s00399-022-00837-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/02/2022] [Accepted: 01/07/2022] [Indexed: 06/14/2023]
Abstract
The quantity of data produced and captured in medicine today is unprecedented. Technological improvements and automation have expanded the traditional statistical methods and enabled the analysis of Big Data. This has permitted the discovery of new associations with a granularity that was previously hidden to human eyes. In the first part of this review, the authors would like to provide an overview of basic Machine Learning (ML) principles and techniques in order to better understand their application in recent publications about cardiac arrhythmias. In the second part, ML-enabled advances in disease detection and diagnosis, outcome prediction, and novel disease characterization in topics like electrocardiography, atrial fibrillation, ventricular arrhythmias, and cardiac devices are presented. Finally, the limitations and challenges of applying ML in clinical practice, such as validation, replication, generalizability, and regulatory issues, are discussed. More carefully designed studies and collaborations are needed for ML to become feasible, trustworthy, accurate, and reproducible and to reach its full potential for patient-oriented precision medicine.
Affiliation(s)
- Sotirios Nedios
- Department of Electrophysiology, Heart Center Leipzig at the University of Leipzig, Leipzig, Germany.
- Rhythmologie, Herzzentrum Leipzig, Universität Leipzig, Strümpellstr. 39, 04289, Leipzig, Germany.
- Konstantinos Iliodromitis
- Department of Cardiology and Rhythmology, Ev. Krankenhaus Hagen, Hagen, Germany
- Department of Cardiology, University Witten/Herdecke, Witten, Germany
- Christopher Kowalewski
- Department of Electrophysiology, Heart Center Leipzig at the University of Leipzig, Leipzig, Germany
- Andreas Bollmann
- Department of Electrophysiology, Heart Center Leipzig at the University of Leipzig, Leipzig, Germany
- Gerhard Hindricks
- Department of Electrophysiology, Heart Center Leipzig at the University of Leipzig, Leipzig, Germany
- Nikolaos Dagres
- Department of Electrophysiology, Heart Center Leipzig at the University of Leipzig, Leipzig, Germany
- Harilaos Bogossian
- Department of Cardiology and Rhythmology, Ev. Krankenhaus Hagen, Hagen, Germany
- Department of Cardiology, University Witten/Herdecke, Witten, Germany
|
93
|
A comparison of approaches to improve worst-case predictive model performance over patient subpopulations. Sci Rep 2022; 12:3254. [PMID: 35228563 PMCID: PMC8885701 DOI: 10.1038/s41598-022-07167-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/25/2021] [Accepted: 01/31/2022] [Indexed: 12/12/2022]
Abstract
Predictive models for clinical outcomes that are accurate on average in a patient population may underperform drastically for some subpopulations, potentially introducing or reinforcing inequities in care access and quality. Model training approaches that aim to maximize worst-case model performance across subpopulations, such as distributionally robust optimization (DRO), attempt to address this problem without introducing additional harms. We conduct a large-scale empirical study of DRO and several variations of standard learning procedures to identify approaches for model development and selection that consistently improve disaggregated and worst-case performance over subpopulations compared to standard approaches for learning predictive models from electronic health records data. In the course of our evaluation, we introduce an extension to DRO approaches that allows for specification of the metric used to assess worst-case performance. We conduct the analysis for models that predict in-hospital mortality, prolonged length of stay, and 30-day readmission for inpatient admissions, and predict in-hospital mortality using intensive care data. We find that, with relatively few exceptions, no approach performs better, for each patient subpopulation examined, than standard learning procedures using the entire training dataset. These results imply that when it is of interest to improve model performance for patient subpopulations beyond what can be achieved with standard practices, it may be necessary to do so via data collection techniques that increase the effective sample size or reduce the level of noise in the prediction problem.
94
Rubinger L, Gazendam A, Ekhtiari S, Bhandari M. Machine learning and artificial intelligence in research and healthcare. Injury 2022:S0020-1383(22)00076-6. [PMID: 35135685 DOI: 10.1016/j.injury.2022.01.046] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 11/03/2021] [Accepted: 01/29/2022] [Indexed: 02/02/2023]
Abstract
Artificial intelligence (AI) is a broad term referring to the application of computational algorithms that can analyze large data sets to classify, predict, or gain useful conclusions. Under the umbrella of AI is machine learning (ML). ML is the process of building or learning statistical models using previously observed real-world data to predict outcomes, or categorize observations based on 'training' provided by humans. These predictions are then applied to future data, all the while folding the new data into its perpetually improving and calibrated statistical model. The future of AI and ML in healthcare research is exciting and expansive. AI and ML are becoming cornerstones in the medical and healthcare-research domains and are integral in our continued processing and capitalization of robust patient EMR data. Considerations for the use and application of ML in healthcare settings include assessing the quality of data inputs and decision-making that serve as the foundations of the ML model, and ensuring that the end product is interpretable and transparent and that ethical concerns are considered throughout the development process. The current and future applications of ML include improving the quality and quantity of data collected from EMRs to improve registry data, utilizing these robust datasets to improve and standardize research protocols and outcomes, clinical decision-making applications, natural language processing and improving the fundamentals of value-based care, to name only a few.
Affiliation(s)
- Luc Rubinger
- Division of Orthopaedics, Department of Surgery, McMaster University, Hamilton, ON Canada; Centre for Evidence-Based Orthopaedics, 293 Wellington St. N, Suite 110, Hamilton, ON L8L 8E7 Canada
- Aaron Gazendam
- Division of Orthopaedics, Department of Surgery, McMaster University, Hamilton, ON Canada; Centre for Evidence-Based Orthopaedics, 293 Wellington St. N, Suite 110, Hamilton, ON L8L 8E7 Canada
- Seper Ekhtiari
- Division of Orthopaedics, Department of Surgery, McMaster University, Hamilton, ON Canada; Centre for Evidence-Based Orthopaedics, 293 Wellington St. N, Suite 110, Hamilton, ON L8L 8E7 Canada
- Mohit Bhandari
- Division of Orthopaedics, Department of Surgery, McMaster University, Hamilton, ON Canada; Centre for Evidence-Based Orthopaedics, 293 Wellington St. N, Suite 110, Hamilton, ON L8L 8E7 Canada
95
Huang J, Galal G, Etemadi M, Vaidyanathan M. Evaluation and Mitigation of Racial Bias in Clinical Machine Learning Models: A Scoping Review (Preprint). JMIR Med Inform 2022; 10:e36388. [PMID: 35639450 PMCID: PMC9198828 DOI: 10.2196/36388] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 01/12/2022] [Revised: 02/17/2022] [Accepted: 03/27/2022] [Indexed: 01/12/2023] Open
Abstract
Background: Racial bias is a key concern regarding the development, validation, and implementation of machine learning (ML) models in clinical settings. Despite the potential of bias to propagate health disparities, racial bias in clinical ML has yet to be thoroughly examined and best practices for bias mitigation remain unclear.
Objective: Our objective was to perform a scoping review to characterize the methods by which the racial bias of ML has been assessed and describe strategies that may be used to enhance algorithmic fairness in clinical ML.
Methods: A scoping review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) Extension for Scoping Reviews. A literature search using PubMed, Scopus, and Embase databases, as well as Google Scholar, identified 635 records, of which 12 studies were included.
Results: Applications of ML were varied and involved diagnosis, outcome prediction, and clinical score prediction performed on data sets including images, diagnostic studies, clinical text, and clinical variables. Of the 12 studies, 1 (8%) described a model in routine clinical use, 2 (17%) examined prospectively validated clinical models, and the remaining 9 (75%) described internally validated models. In addition, 8 (67%) studies concluded that racial bias was present, 2 (17%) concluded that it was not, and 2 (17%) assessed the implementation of bias mitigation strategies without comparison to a baseline model. Fairness metrics used to assess algorithmic racial bias were inconsistent. The most commonly observed metrics were equal opportunity difference (5/12, 42%), accuracy (4/12, 25%), and disparate impact (2/12, 17%). All 8 (67%) studies that implemented methods for mitigation of racial bias successfully increased fairness, as measured by the authors’ chosen metrics. Preprocessing methods of bias mitigation were most commonly used across all studies that implemented them.
Conclusions: The broad scope of medical ML applications and potential patient harms demand an increased emphasis on evaluation and mitigation of racial bias in clinical ML. However, the adoption of algorithmic fairness principles in medicine remains inconsistent and is limited by poor data availability and ML model reporting. We recommend that researchers and journal editors emphasize standardized reporting and data availability in medical ML studies to improve transparency and facilitate evaluation for racial bias.
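The abstract names equal opportunity difference and disparate impact without defining them. The sketch below assumes the commonly used definitions (difference in true positive rates between groups, and ratio of positive-prediction rates between groups); the reviewed studies may have used variants. The group labels "u" and "p" and the toy data are hypothetical.

```python
def rate(values):
    """Mean of a non-empty list of 0/1 values."""
    return sum(values) / len(values)


def equal_opportunity_difference(y_true, y_pred, group, unprivileged, privileged):
    """TPR(unprivileged) - TPR(privileged); 0 means equal true positive rates.

    Assumes each group contains at least one true positive case.
    """
    def tpr(g):
        return rate([p for t, p, a in zip(y_true, y_pred, group) if a == g and t == 1])

    return tpr(unprivileged) - tpr(privileged)


def disparate_impact(y_pred, group, unprivileged, privileged):
    """P(pred=1 | unprivileged) / P(pred=1 | privileged); 1 means parity.

    Assumes the privileged group receives at least one positive prediction.
    """
    def selection_rate(g):
        return rate([p for p, a in zip(y_pred, group) if a == g])

    return selection_rate(unprivileged) / selection_rate(privileged)


# Hypothetical toy data: four patients per group.
group = ["u", "u", "u", "u", "p", "p", "p", "p"]
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]

eod = equal_opportunity_difference(y_true, y_pred, group, "u", "p")
di = disparate_impact(y_pred, group, "u", "p")
print("equal opportunity difference:", eod)
print("disparate impact:", di)
```

Here the unprivileged group has a true positive rate of 0.5 against 1.0 for the privileged group, so the difference is -0.5, and the ratio of positive-prediction rates is 0.25 / 0.75. The two metrics measure different things, which is one reason results across studies are hard to compare when metric choice is inconsistent.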
Affiliation(s)
- Jonathan Huang
- Department of Anesthesiology, Northwestern University Feinberg School of Medicine, Chicago, IL, United States
- Galal Galal
- Department of Anesthesiology, Northwestern University Feinberg School of Medicine, Chicago, IL, United States
- Mozziyar Etemadi
- Department of Anesthesiology, Northwestern University Feinberg School of Medicine, Chicago, IL, United States
- Department of Biomedical Engineering, Northwestern University, Evanston, IL, United States
- Mahesh Vaidyanathan
- Department of Anesthesiology, Northwestern University Feinberg School of Medicine, Chicago, IL, United States
- Digital Health & Data Science Curricular Thread, Northwestern University Feinberg School of Medicine, Chicago, IL, United States
96
Nethery RC, Chen JT, Krieger N, Waterman PD, Peterson E, Waller LA, Coull BA. Statistical implications of endogeneity induced by residential segregation in small-area modelling of health inequities. Am Stat 2022; 76:142-151. [PMID: 35531350 PMCID: PMC9070859 DOI: 10.1080/00031305.2021.2003245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/07/2023]
Abstract
Health inequities are assessed by health departments to identify social groups disproportionately burdened by disease and by academic researchers to understand how social, economic, and environmental inequities manifest as health inequities. To characterize inequities, group-specific small-area health data are often modeled using log-linear generalized linear models (GLM) or generalized linear mixed models (GLMM) with a random intercept. These approaches estimate the same marginal rate ratio comparing disease rates across groups under standard assumptions. Here we explore how residential segregation combined with social group differences in disease risk can lead to contradictory findings from the GLM and GLMM. We show that this occurs because small-area disease rate data collected under these conditions induce endogeneity in the GLMM due to correlation between the model's offset and random effect. This results in GLMM estimates that represent conditional rather than marginal associations. We refer to endogeneity arising from the offset, which to our knowledge has not been noted previously, as "offset endogeneity". We illustrate this phenomenon in simulated data and real premature mortality data, and we propose alternative modeling approaches to address it. We also introduce to a statistical audience the social epidemiologic terminology for framing health inequities, which enables responsible interpretation of results.
Affiliation(s)
- Rachel C. Nethery
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA. Corresponding author: 655 Huntington Ave, Building 2, 4th floor, Boston, MA 02215
- Jarvis T. Chen
- Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Nancy Krieger
- Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Pamela D. Waterman
- Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Emily Peterson
- Department of Biostatistics and Bioinformatics, Emory Rollins School of Public Health, Atlanta, GA, USA
- Lance A. Waller
- Department of Biostatistics and Bioinformatics, Emory Rollins School of Public Health, Atlanta, GA, USA
- Brent A. Coull
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
97
McGuire TG, Zink AL, Rose S. Improving the Performance of Risk Adjustment Systems: Constrained Regressions, Reinsurance, and Variable Selection. American Journal of Health Economics 2021; 7:497-521. [PMID: 34869790 PMCID: PMC8635414 DOI: 10.1086/716199] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/13/2023]
Abstract
Modifications of risk-adjustment systems used to pay health plans in individual health insurance markets typically seek to reduce selection incentives at the individual and group levels by adding variables to the payment formula. Adding variables can be costly and lead to unintended incentives for upcoding or service utilization. While these drawbacks are recognized, they are hard to quantify and difficult to balance against the concrete, measurable improvements in fit that may be achieved by adding variables to the formula. This paper takes a different approach to improving the performance of health plan payment systems. Using the HHS-HHC V0519 model from the Marketplaces as a starting point, we constrain fit at the individual and group level to be as good or better than the current payment model while reducing the number of variables in the model. We introduce three elements in the design of plan payment: reinsurance, constrained regressions, and machine learning methods for variable selection. The fit performance of our alternative formulas with many fewer variables is as good or better than the current HHS-HHC V0519 formula.
Affiliation(s)
- Thomas G McGuire
- Health Economics, Department of Health Care Policy, Harvard Medical School
- Sherri Rose
- Center for Health Policy and Center for Primary Care and Outcomes Research, Stanford University
98
Zink A, Rose S. Identifying undercompensated groups defined by multiple attributes in risk adjustment. BMJ Health Care Inform 2021; 28:bmjhci-2021-100414. [PMID: 34535447 PMCID: PMC8451283 DOI: 10.1136/bmjhci-2021-100414] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/19/2021] [Accepted: 08/25/2021] [Indexed: 11/22/2022] Open
Abstract
Objective: To identify undercompensated groups in plan payment risk adjustment that are defined by multiple attributes with a systematic new approach, improving on the arbitrary and inconsistent nature of existing evaluations.
Methods: Extending the concept of variable importance for single attributes, we construct a measure of ‘group importance’ in the random forests algorithm to identify groups with multiple attributes that are undercompensated by current risk adjustment formulas. Using 2016–2018 IBM MarketScan and 2015–2018 Medicare claims and enrolment data, we evaluate two risk adjustment scenarios: the risk adjustment formula used in the individual health insurance Marketplaces and the risk adjustment formula used in Medicare.
Results: A number of previously unidentified groups with multiple chronic conditions are undercompensated in the Marketplaces risk adjustment formula, while groups without chronic conditions tend to be overcompensated in the Marketplaces. The magnitude of undercompensation when defining groups with multiple attributes is many times larger than with single attributes. No complex groups were found to be consistently undercompensated or overcompensated in the Medicare risk adjustment formula.
Conclusions: Our method is effective at identifying complex undercompensated groups in health plan payment risk adjustment where undercompensation creates incentives for insurers to discriminate against these groups. This work provides policy-makers with new information on potential targets of discrimination in the healthcare system and a path towards more equitable health coverage.
Affiliation(s)
- Anna Zink
- PhD Candidate in Health Policy, Harvard University, Cambridge, Massachusetts, USA
- Sherri Rose
- Center for Health Policy and Center for Primary Care & Outcomes Research, Stanford University, Stanford, California, USA
99
Adhikari S, Normand SL, Bloom J, Shahian D, Rose S. Revisiting performance metrics for prediction with rare outcomes. Stat Methods Med Res 2021; 30:2352-2366. [PMID: 34468239 DOI: 10.1177/09622802211038754] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/17/2022]
Abstract
Machine learning algorithms are increasingly used in the clinical literature, claiming advantages over logistic regression. However, they are generally designed to maximize the area under the receiver operating characteristic curve. While area under the receiver operating characteristic curve and other measures of accuracy are commonly reported for evaluating binary prediction problems, these metrics can be misleading. We aim to give clinical and machine learning researchers a realistic medical example of the dangers of relying on a single measure of discriminatory performance to evaluate binary prediction questions. Prediction of medical complications after surgery is a frequent but challenging task because many post-surgery outcomes are rare. We predicted post-surgery mortality among patients in a clinical registry who received at least one aortic valve replacement. Estimation incorporated multiple evaluation metrics and algorithms typically regarded as performing well with rare outcomes, as well as an ensemble and a new extension of the lasso for multiple unordered treatments. Results demonstrated high accuracy for all algorithms with moderate measures of cross-validated area under the receiver operating characteristic curve. False positive rates were <1%, however, true positive rates were <7%, even when paired with a 100% positive predictive value, and graphical representations of calibration were poor. Similar results were seen in simulations, with the addition of high area under the receiver operating characteristic curve (>90%) accompanying low true positive rates. Clinical studies should not primarily report only area under the receiver operating characteristic curve or accuracy.
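The pattern reported in this abstract (high accuracy and a low false positive rate alongside a very low true positive rate) follows directly from the arithmetic of rare outcomes. The worked example below is a minimal sketch with hypothetical counts, not data from the study: 10,000 patients, 200 deaths (2%), and a model that flags only 10 patients, all correctly.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Compute common binary classification metrics from confusion-matrix counts.

    Assumes at least one predicted positive, one actual positive,
    and one actual negative, so no denominator is zero.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "true_positive_rate": tp / (tp + fn),
        "false_positive_rate": fp / (fp + tn),
        "positive_predictive_value": tp / (tp + fp),
    }


# Hypothetical counts: 200 events among 10,000 patients; the model flags 10.
metrics = confusion_metrics(tp=10, fp=0, fn=190, tn=9800)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts the accuracy is 0.981, the false positive rate is 0 and the positive predictive value is 1, yet the model identifies only 5% of the patients who die. Any single one of the first three numbers would give a misleading picture of clinical usefulness.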
Affiliation(s)
- Samrachana Adhikari
- Department of Population Health, New York University School of Medicine, USA
- Jordan Bloom
- Department of Surgery, Massachusetts General Hospital, USA
- David Shahian
- Department of Surgery, Massachusetts General Hospital, USA
- Sherri Rose
- Center for Health Policy, Stanford University, USA
100
Katsaouni N, Tashkandi A, Wiese L, Schulz MH. Machine learning based disease prediction from genotype data. Biol Chem 2021; 402:871-885. [PMID: 34218544 DOI: 10.1515/hsz-2021-0109] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/10/2021] [Accepted: 06/15/2021] [Indexed: 12/16/2022]
Abstract
Using results from genome-wide association studies for understanding complex traits is a current challenge. Here we review how genotype data can be used with different machine learning (ML) methods to predict phenotype occurrence and severity from genotype data. We discuss common feature encoding schemes and how studies handle the often small number of samples compared to the huge number of variants. We compare which ML methods are being applied, including recent results using deep neural networks. Further, we review the application of methods for feature explanation and interpretation.
Affiliation(s)
- Nikoletta Katsaouni
- Institute for Cardiovascular Regeneration, Goethe University, 60590 Frankfurt am Main, Germany
- Araek Tashkandi
- Institute of Computer Sciences and Engineering, University of Jeddah, 21959 Jeddah, Saudi Arabia
- Lena Wiese
- Institute of Computer Science, Goethe University, 60629 Frankfurt am Main, Germany
- Marcel H Schulz
- Institute for Cardiovascular Regeneration, Goethe University, 60590 Frankfurt am Main, Germany; German Center for Cardiovascular Research (DZHK), Partner Site RheinMain, 60590 Frankfurt am Main, Germany; Cardio-Pulmonary Institute, Goethe University Hospital, Frankfurt am Main, Germany