1. Eyler Dang L, Klazura G, Yap A, Ozgediz D, Bryce E, Cheung M, Fedatto M, Ameh EA. Development and Internal-external Validation of a Post-operative Mortality Risk Calculator for Pediatric Surgical Patients in Low- and Middle-Income Countries Using Machine Learning. J Pediatr Surg 2024:161883. PMID: 39317568; DOI: 10.1016/j.jpedsurg.2024.161883.
Abstract
BACKGROUND The purpose of this study was to develop and validate a mortality risk algorithm for pediatric surgery patients treated at KidsOR sites in 14 low- and middle-income countries. METHODS A Super Learner machine learning algorithm was trained to predict post-operative mortality by hospital discharge using the retrospectively and prospectively collected KidsOR database, which includes patients treated at 20 KidsOR sites from June 2018 to June 2023. Algorithm performance was evaluated by internal-external cross-validated AUC and calibration. FINDINGS Of 23,905 eligible patients, 21,703 with discharge status recorded were included in the analysis, representing a post-operative mortality rate of 3.1% (671 mortality events). The candidate algorithm with the best cross-validated performance was an extreme gradient boosting model. The cross-validated AUC was 0.945 (95% CI 0.936 to 0.954), and the cross-validated calibration slope and intercept were 1.01 (95% CI 0.96 to 1.06) and 0.05 (95% CI -0.10 to 0.21). For Super Learner models trained on all but one site and evaluated in the held-out site, restricted to sites with at least 25 mortality events, the overall external validation AUC was 0.864 (95% CI 0.846 to 0.882), with calibration slope and intercept of 1.03 (95% CI 0.97 to 1.09) and 1.18 (95% CI 0.98 to 1.39). INTERPRETATION The KidsOR post-operative mortality risk algorithm had outstanding cross-validated discrimination and strong cross-validated calibration. Across all external validation sites, discrimination of Super Learner models trained on the remaining sites was excellent, though re-calibration may be necessary prior to use at new sites. This model has the potential to inform clinical practice and guide resource allocation at KidsOR sites worldwide. TYPE OF STUDY AND LEVEL OF EVIDENCE Observational Study, Level III.
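Editor's note: the leave-one-site-out ("internal-external") validation and the calibration slope/intercept reported here follow a standard recipe. Below is a minimal Python sketch, assuming hypothetical arrays `X`, `y`, and `sites` standing in for the KidsOR data, and using XGBoost (the study's best candidate learner) in place of the full Super Learner ensemble:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier  # the best-performing candidate learner above

def calibration_slope_intercept(y_true, p_pred):
    """Regress the outcome on the logit of the predicted risk.
    A slope near 1 and an intercept near 0 indicate good calibration."""
    eps = 1e-12
    p = np.clip(p_pred, eps, 1 - eps)
    logit = np.log(p / (1 - p))
    lr = LogisticRegression(C=1e6)  # effectively unpenalized
    lr.fit(logit.reshape(-1, 1), y_true)
    return lr.coef_[0, 0], lr.intercept_[0]

def leave_one_site_out(X, y, sites, min_events=25):
    """Train on all-but-one site and evaluate on the held-out site,
    mirroring the >=25-mortality-event rule described above."""
    for site in np.unique(sites):
        test = sites == site
        if y[test].sum() < min_events:
            continue
        model = XGBClassifier(n_estimators=200, max_depth=3).fit(X[~test], y[~test])
        p = model.predict_proba(X[test])[:, 1]
        slope, intercept = calibration_slope_intercept(y[test], p)
        yield site, roc_auc_score(y[test], p), slope, intercept
```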
Affiliation(s)
- Greg Klazura
  - University of Illinois at Chicago Department of Surgery, Chicago, IL, USA; Loyola University Medical Center Department of Surgery, Maywood, IL, USA
- Ava Yap
  - UCSF Center for Health Equity in Surgery and Anesthesia, San Francisco, CA, USA
- Doruk Ozgediz
  - UCSF Center for Health Equity in Surgery and Anesthesia, San Francisco, CA, USA
- Emma Bryce
  - Usher Institute of Population Health Sciences and Informatics at the University of Edinburgh, Edinburgh, Scotland, UK; KidsOR Research Team, Edinburgh, Scotland, UK
- Maija Cheung
  - Yale University Medical Center, New Haven, CT, USA
2. Cunningham JW, Singh P, Reeder C, Claggett B, Marti-Castellote PM, Lau ES, Khurshid S, Batra P, Lubitz SA, Maddah M, Philippakis A, Desai AS, Ellinor PT, Vardeny O, Solomon SD, Ho JE. Natural Language Processing for Adjudication of Heart Failure in a Multicenter Clinical Trial: A Secondary Analysis of a Randomized Clinical Trial. JAMA Cardiol 2024;9:174-181. PMID: 37950744; PMCID: PMC10640703; DOI: 10.1001/jamacardio.2023.4859.
Abstract
Importance The gold standard for outcome adjudication in clinical trials is medical record review by a physician clinical events committee (CEC), which requires substantial time and expertise. Automated adjudication of medical records by natural language processing (NLP) may offer a more resource-efficient alternative, but this approach has not been validated in a multicenter setting. Objective To externally validate the Community Care Cohort Project (C3PO) NLP model for heart failure (HF) hospitalization adjudication, which was previously developed and tested within one health care system, compared to gold-standard CEC adjudication in a multicenter clinical trial. Design, Setting, and Participants This was a retrospective analysis of the Influenza Vaccine to Effectively Stop Cardio Thoracic Events and Decompensated Heart Failure (INVESTED) trial, which compared 2 influenza vaccines in 5260 participants with cardiovascular disease at 157 sites in the US and Canada between September 2016 and January 2019. Analysis was performed from November 2022 to October 2023. Exposures Individual sites submitted medical records for each hospitalization. The central INVESTED CEC and the C3PO NLP model independently adjudicated whether the cause of hospitalization was HF using the prepared hospitalization dossier. The C3PO NLP model was fine-tuned (C3PO + INVESTED) and a de novo NLP model was trained using half the INVESTED hospitalizations. Main Outcomes and Measures Concordance between the C3PO NLP model HF adjudication and the gold-standard INVESTED CEC adjudication was measured by raw agreement, κ, sensitivity, and specificity. The fine-tuned and de novo INVESTED NLP models were evaluated in an internal validation cohort not used for training. Results Among 4060 hospitalizations in 1973 patients (mean [SD] age, 66.4 [13.2] years; 514 [27.4%] female and 1432 [72.6%] male), 1074 hospitalizations (26%) were adjudicated as HF by the CEC. There was good agreement between the C3PO NLP and CEC HF adjudications (raw agreement, 87% [95% CI, 86-88]; κ, 0.69 [95% CI, 0.66-0.72]). C3PO NLP model sensitivity was 94% (95% CI, 92-95) and specificity was 84% (95% CI, 83-85). The fine-tuned C3PO and de novo NLP models demonstrated agreement of 93% (95% CI, 92-94) and κ of 0.82 (95% CI, 0.77-0.86) and 0.83 (95% CI, 0.79-0.87), respectively, vs the CEC. CEC reviewer interrater reproducibility was 94% (95% CI, 93-95; κ, 0.85 [95% CI, 0.80-0.89]). Conclusions and Relevance The C3PO NLP model developed within 1 health care system identified HF events with good agreement relative to the gold-standard CEC in an external multicenter clinical trial. Fine-tuning the model improved agreement and approximated human reproducibility. Further study is needed to determine whether NLP will improve the efficiency of future multicenter clinical trials by identifying clinical events at scale.
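Editor's note: for readers reimplementing the comparison, the concordance statistics used here reduce to confusion-matrix arithmetic. A minimal sketch with invented adjudication vectors (not trial data):

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score, confusion_matrix

def adjudication_concordance(cec, nlp):
    """Compare NLP adjudications against gold-standard CEC labels (1 = HF)."""
    tn, fp, fn, tp = confusion_matrix(cec, nlp, labels=[0, 1]).ravel()
    return {
        "raw_agreement": (tp + tn) / (tp + tn + fp + fn),
        "kappa": cohen_kappa_score(cec, nlp),  # chance-corrected agreement
        "sensitivity": tp / (tp + fn),         # NLP recovers CEC-positive events
        "specificity": tn / (tn + fp),
    }

# Invented labels for ten hospitalizations, for illustration only.
cec = np.array([1, 0, 0, 1, 1, 0, 0, 0, 1, 0])
nlp = np.array([1, 0, 1, 1, 1, 0, 0, 0, 1, 0])
print(adjudication_concordance(cec, nlp))
```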
Affiliation(s)
- Jonathan W. Cunningham
  - Division of Cardiovascular Medicine, Brigham and Women’s Hospital, Boston, Massachusetts
  - Cardiovascular Disease Initiative, Broad Institute of Harvard University and the Massachusetts Institute of Technology, Cambridge
- Pulkit Singh
  - Data Sciences Platform, Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge
- Christopher Reeder
  - Data Sciences Platform, Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge
- Brian Claggett
  - Division of Cardiovascular Medicine, Brigham and Women’s Hospital, Boston, Massachusetts
- Emily S. Lau
  - Cardiovascular Disease Initiative, Broad Institute of Harvard University and the Massachusetts Institute of Technology, Cambridge
  - Division of Cardiology, Massachusetts General Hospital, Boston
- Shaan Khurshid
  - Cardiovascular Disease Initiative, Broad Institute of Harvard University and the Massachusetts Institute of Technology, Cambridge
  - Demoulas Center for Cardiac Arrhythmias, Massachusetts General Hospital, Boston
- Puneet Batra
  - Data Sciences Platform, Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge
- Steven A. Lubitz
  - Cardiovascular Disease Initiative, Broad Institute of Harvard University and the Massachusetts Institute of Technology, Cambridge
  - Demoulas Center for Cardiac Arrhythmias, Massachusetts General Hospital, Boston
- Mahnaz Maddah
  - Data Sciences Platform, Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge
- Anthony Philippakis
  - Data Sciences Platform, Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge
- Akshay S. Desai
  - Division of Cardiovascular Medicine, Brigham and Women’s Hospital, Boston, Massachusetts
- Patrick T. Ellinor
  - Cardiovascular Disease Initiative, Broad Institute of Harvard University and the Massachusetts Institute of Technology, Cambridge
  - Demoulas Center for Cardiac Arrhythmias, Massachusetts General Hospital, Boston
- Orly Vardeny
  - Minneapolis VA Hospital, University of Minnesota, Minneapolis
- Scott D. Solomon
  - Division of Cardiovascular Medicine, Brigham and Women’s Hospital, Boston, Massachusetts
- Jennifer E. Ho
  - Cardiovascular Disease Initiative, Broad Institute of Harvard University and the Massachusetts Institute of Technology, Cambridge
  - CardioVascular Institute and Division of Cardiology, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts
3. Gorham TJ, Tumin D, Groner J, Allen E, Retzke J, Hersey S, Liu SB, Macias C, Alachraf K, Smith AW, Blount T, Wall B, Crickmore K, Wooten WI, Jamison SD, Rust S. Predicting emergency department visits among children with asthma in two academic medical systems. J Asthma 2023;60:2137-2144. PMID: 37318283; DOI: 10.1080/02770903.2023.2225603.
Abstract
Objective: To develop and validate a predictive algorithm that identifies pediatric patients at risk of asthma-related emergencies, and to test whether algorithm performance can be improved in an external site via local retraining. Methods: In a retrospective cohort at the first site, data from 26 008 patients with asthma aged 2-18 years (2012-2017) were used to develop a lasso-regularized logistic regression model predicting emergency department visits for asthma within one year of a primary care encounter, known as the Asthma Emergency Risk (AER) score. Internal validation was conducted on 8634 patient encounters from 2018. External validation of the AER score was conducted using 1313 pediatric patient encounters from a second site during 2018. The AER score components were then reweighted using logistic regression on data from the second site to improve local model performance. Prediction intervals (PI) were constructed via 10 000 bootstrapped samples. Results: At the first site, the AER score had a cross-validated area under the receiver operating characteristic curve (AUROC) of 0.768 (95% PI: 0.745-0.790) during model training and an AUROC of 0.769 in the 2018 internal validation dataset (p = 0.959). When applied without modification to the second site, the AER score had an AUROC of 0.684 (95% PI: 0.624-0.742). After local refitting, the cross-validated AUROC improved to 0.737 (95% PI: 0.676-0.794; p = 0.037 as compared to the initial AUROC). Conclusions: The AER score demonstrated strong internal validity, but external validity was dependent on reweighting model components to reflect local data characteristics at the external site.
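Editor's note: the two building blocks named here, a lasso-regularized logistic regression and 10 000-sample bootstrap prediction intervals for the AUROC, can be sketched as follows. This is a simplified illustration; feature construction and the exact interval method are assumptions, not the authors' code:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Lasso-regularized logistic regression in the spirit of the AER score.
# Usage: aer_model.fit(X_train, y_train) on first-site encounters.
aer_model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)

def bootstrap_auroc_interval(y, p, n_boot=10_000, seed=0):
    """95% percentile interval for the AUROC via bootstrap resampling."""
    rng = np.random.default_rng(seed)
    n, aucs = len(y), []
    for _ in range(n_boot):
        b = rng.integers(0, n, size=n)
        if y[b].min() == y[b].max():  # resample drew a single class; skip
            continue
        aucs.append(roc_auc_score(y[b], p[b]))
    return np.percentile(aucs, [2.5, 97.5])
```

Local refitting at the second site then amounts to re-estimating the same model's coefficients on that site's encounters before recomputing the interval.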
Affiliation(s)
- Tyler J Gorham
  - Information Technology Research & Innovation, Nationwide Children's Hospital, Columbus, OH, USA
- Dmitry Tumin
  - Department of Pediatrics, Brody School of Medicine at East Carolina University, Greenville, NC, USA
- Judith Groner
  - Division of Primary Care Pediatrics, Nationwide Children's Hospital, Columbus, OH, USA
  - Department of Pediatrics, The Ohio State University, Columbus, OH, USA
- Elizabeth Allen
  - Division of Primary Care Pediatrics, Nationwide Children's Hospital, Columbus, OH, USA
  - Department of Pediatrics, The Ohio State University, Columbus, OH, USA
- Jessica Retzke
  - Division of Primary Care Pediatrics, Nationwide Children's Hospital, Columbus, OH, USA
  - Department of Pediatrics, The Ohio State University, Columbus, OH, USA
- Stephen Hersey
  - Division of Primary Care Pediatrics, Nationwide Children's Hospital, Columbus, OH, USA
  - Department of Pediatrics, The Ohio State University, Columbus, OH, USA
- Swan Bee Liu
  - Information Technology Research & Innovation, Nationwide Children's Hospital, Columbus, OH, USA
- Charlie Macias
  - Quality Improvement Services, Nationwide Children's Hospital, Columbus, OH, USA
- Kamel Alachraf
  - Brody School of Medicine at East Carolina University, Greenville, NC, USA
- Aimee W Smith
  - Department of Psychology, East Carolina University, Greenville, NC, USA
- William I Wooten
  - Department of Pediatrics, Brody School of Medicine at East Carolina University, Greenville, NC, USA
- Shaundreal D Jamison
  - Department of Pediatrics, Brody School of Medicine at East Carolina University, Greenville, NC, USA
- Steve Rust
  - Information Technology Research & Innovation, Nationwide Children's Hospital, Columbus, OH, USA
4. Parnass G, Levtzion-Korach O, Peres R, Assaf M. Estimating emergency department crowding with stochastic population models. PLoS One 2023;18:e0295130. PMID: 38039309; PMCID: PMC10691698; DOI: 10.1371/journal.pone.0295130.
Abstract
Environments such as shopping malls, airports, or hospital emergency departments often experience crowding, with many people simultaneously requesting service. Crowding fluctuates strongly, with sudden overcrowding "spikes". Past research has either focused on average behavior, used context-specific models with a large number of parameters, or relied on machine-learning models that are hard to interpret. Here we show that a stochastic population model, previously applied to a broad range of natural phenomena, can aptly describe hospital emergency-department crowding. We test the model using five years of minute-by-minute emergency-department records. The model provides reliable forecasting of the crowding distribution. Overcrowding is highly sensitive to the patient arrival flux and length of stay: a 10% increase in arrivals triples the probability of overcrowding events. Expediting the patient exit rate to shorten the typical length of stay by just 20 minutes (8.5%) cuts the probability of severe overcrowding events by 50%. Such forecasting is critical for the prevention and mitigation of breakdown events. Our results demonstrate that despite its high volatility, crowding follows a dynamic behavior common to many systems in nature.
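Editor's note: the paper's model class can be illustrated with the simplest stochastic population (birth-death) process: arrivals at flux λ, independent exponential lengths-of-stay with mean τ, hence a departure rate of n/τ at occupancy n. A Gillespie-style sketch, with λ, τ, and the overcrowding threshold chosen for illustration rather than taken from the paper:

```python
import numpy as np

def overcrowding_probability(lam=0.4, tau=235.0, threshold=115,
                             t_max=1_000_000.0, seed=1):
    """Fraction of time ED occupancy exceeds `threshold`.

    lam: arrivals per minute; tau: mean length-of-stay (minutes).
    Mean occupancy is lam * tau (about 94 here)."""
    rng = np.random.default_rng(seed)
    t, n, time_over = 0.0, int(lam * tau), 0.0
    while t < t_max:
        rate = lam + n / tau                    # total event rate
        dt = rng.exponential(1.0 / rate)
        if n > threshold:
            time_over += dt
        t += dt
        # arrival with probability lam/rate, otherwise a departure
        n += 1 if rng.random() < lam / rate else -1
    return time_over / t_max

base = overcrowding_probability()
surge = overcrowding_probability(lam=0.44)      # a 10% increase in arrival flux
print(f"P(overcrowded): {base:.4f} -> {surge:.4f}")
```

Even this toy version reproduces the qualitative sensitivity reported above: a modest rise in arrival flux disproportionately inflates the time spent above the threshold.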
Affiliation(s)
- Gil Parnass
  - Racah Institute of Physics, Hebrew University of Jerusalem, Jerusalem, Israel
- Renana Peres
  - The Hebrew University Business School, Hebrew University of Jerusalem, Jerusalem, Israel
- Michael Assaf
  - Racah Institute of Physics, Hebrew University of Jerusalem, Jerusalem, Israel
5. Barak-Corren Y, Tsurel D, Keidar D, Gofer I, Shahaf D, Leventer-Roberts M, Barda N, Reis BY. The value of parental medical records for the prediction of diabetes and cardiovascular disease: a novel method for generating and incorporating family histories. J Am Med Inform Assoc 2023;30:1915-1924. PMID: 37535812; PMCID: PMC10654871; DOI: 10.1093/jamia/ocad154.
Abstract
OBJECTIVE To determine whether data-driven family histories (DDFH) derived from linked EHRs of patients and their parents can improve prediction of patients' 10-year risk of diabetes and atherosclerotic cardiovascular disease (ASCVD). MATERIALS AND METHODS A retrospective cohort study using data from Israel's largest healthcare organization. A random sample of 200 000 subjects aged 40-60 years on the index date (January 1, 2010) was included. Subjects with insufficient history (<1 year) or insufficient follow-up (<10 years) were excluded. Two separate XGBoost models were developed, one for diabetes and one for ASCVD, to predict the 10-year risk for each outcome based on data available prior to the index date of January 1, 2010. RESULTS Overall, the study included 110 734 subject-father-mother triplets. There were 22 153 cases of diabetes (20%) and 11 715 cases of ASCVD (10.6%). The addition of parental information significantly improved prediction of diabetes risk (P < .001), but not ASCVD risk. For both outcomes, maternal medical history was more predictive than paternal medical history. A binary variable summarizing parental disease state delivered similar predictive results to the full parental EHR. DISCUSSION The increasing availability of EHRs for multiple family generations makes DDFH possible and can assist in delivering more personalized and precise medicine to patients. Consent frameworks must be established to enable sharing of information across generations, and the results suggest that sharing the full records may not be necessary. CONCLUSION DDFH can address limitations of patient self-reported family history, and it improves clinical predictions for some conditions but not others, particularly among younger adults.
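Editor's note: the DDFH construction is essentially a record-linkage join. The sketch below shows one hypothetical way to attach the binary parental disease-state summary that the study found nearly as predictive as the full parental record; table and column names are invented:

```python
import pandas as pd

def build_ddfh_features(patients: pd.DataFrame, parents: pd.DataFrame) -> pd.DataFrame:
    """Attach a data-driven family history (DDFH) to each subject.

    `patients` carries father_id/mother_id links from record linkage;
    `parents` has one row per person with binary disease-state flags.
    The parental record is summarized as one flag per outcome, which the
    study found performed similarly to sharing the full parental EHR."""
    par = parents.set_index("person_id")[["has_diabetes", "has_ascvd"]]
    out = patients.copy()
    for role in ("father", "mother"):
        linked = par.reindex(out[f"{role}_id"]).to_numpy()
        out[[f"{role}_diabetes", f"{role}_ascvd"]] = linked
    return out
```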
Affiliation(s)
- Yuval Barak-Corren
  - Predictive Medicine Group, Computational Health Informatics Program, Boston Children’s Hospital, Boston, Massachusetts, USA
- David Tsurel
  - Predictive Medicine Group, Computational Health Informatics Program, Boston Children’s Hospital, Boston, Massachusetts, USA
  - Clalit Research Institute, Ramat Gan, Israel
  - The Hebrew University of Jerusalem, Jerusalem, Israel
- Daphna Keidar
  - Predictive Medicine Group, Computational Health Informatics Program, Boston Children’s Hospital, Boston, Massachusetts, USA
  - Clalit Research Institute, Ramat Gan, Israel
- Ilan Gofer
  - Clalit Research Institute, Ramat Gan, Israel
- Dafna Shahaf
  - The Hebrew University of Jerusalem, Jerusalem, Israel
- Maya Leventer-Roberts
  - Clalit Research Institute, Ramat Gan, Israel
  - Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Department of Pediatrics, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Noam Barda
  - Clalit Research Institute, Ramat Gan, Israel
- Ben Y Reis
  - Predictive Medicine Group, Computational Health Informatics Program, Boston Children’s Hospital, Boston, Massachusetts, USA
  - Harvard Medical School, Boston, Massachusetts, USA
6. Millarch AS, Bonde A, Bonde M, Klein KV, Folke F, Rudolph SS, Sillesen M. Assessing optimal methods for transferring machine learning models to low-volume and imbalanced clinical datasets: experiences from predicting outcomes of Danish trauma patients. Front Digit Health 2023;5:1249258. PMID: 38026835; PMCID: PMC10656776; DOI: 10.3389/fdgth.2023.1249258.
Abstract
Introduction Accurately predicting patient outcomes is crucial for improving healthcare delivery, but large-scale risk prediction models are often developed and tested on specific datasets where clinical parameters and outcomes may not fully reflect local clinical settings. Where this is the case, whether to opt for de novo training of prediction models on local datasets, direct porting of externally trained models, or a transfer learning approach is not well studied, and constitutes the focus of this study. Using the clinical challenge of predicting mortality and hospital length of stay on a Danish trauma dataset, we hypothesized that a transfer learning approach using models trained on large external datasets would provide optimal prediction results compared to de novo training on sparse but local datasets or directly porting externally trained models. Methods Using an external dataset of trauma patients from the US Trauma Quality Improvement Program (TQIP) and a local dataset aggregated from the Danish Trauma Database (DTD) enriched with Electronic Health Record data, we tested a range of model-level approaches focused on predicting trauma mortality and hospital length of stay on DTD data. Modeling approaches included de novo training of models on DTD data, direct porting of models trained on TQIP data to the DTD, and a transfer learning approach by training a model on TQIP data with subsequent transfer and retraining on DTD data. Furthermore, data-level approaches, including mixed-dataset training and methods countering imbalanced outcomes (e.g., low mortality rates), were also tested. Results Using a neural network trained on a mixed dataset consisting of a subset of TQIP and DTD, with class weighting and transfer learning (retraining on DTD), we achieved excellent results in predicting mortality, with a ROC-AUC of 0.988 and an F2-score of 0.866. The best-performing models for predicting long-term hospitalization were trained only on local data, achieving a ROC-AUC of 0.890 and an F1-score of 0.897, although only marginally better than alternative approaches. Conclusion Our results suggest that when assessing the optimal modeling approach, it is important to have domain knowledge of how incidence rates and workflows compare between the hospital systems and datasets where models are trained. Including data from other healthcare systems is particularly beneficial when outcomes suffer from class imbalance and low incidence. Scenarios where outcomes are not directly comparable are best addressed through either de novo local training or a transfer learning approach.
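Editor's note: the three model-level strategies compared here (de novo local training, direct porting, and transfer learning with retraining) differ only in which data each training stage sees. A condensed PyTorch sketch with placeholder tensors and an illustrative class weight; none of this is the authors' code:

```python
import copy
import torch
import torch.nn as nn

def make_net(n_features):
    return nn.Sequential(nn.Linear(n_features, 64), nn.ReLU(),
                         nn.Linear(64, 1))          # logit for mortality

def train(model, X, y, pos_weight=20.0, epochs=50, lr=1e-3):
    """Class-weighted training to counter the low mortality incidence."""
    loss_fn = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([pos_weight]))
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(X).squeeze(1), y)
        loss.backward()
        opt.step()
    return model

# Placeholder tensors standing in for TQIP (external) and DTD (local) data.
X_tqip, y_tqip = torch.randn(5000, 40), (torch.rand(5000) < 0.05).float()
X_dtd, y_dtd = torch.randn(500, 40), (torch.rand(500) < 0.05).float()

ported = train(make_net(40), X_tqip, y_tqip)                    # direct porting
de_novo = train(make_net(40), X_dtd, y_dtd)                     # local-only training
transfer = train(copy.deepcopy(ported), X_dtd, y_dtd, lr=1e-4)  # retrain a copy locally
```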
Affiliation(s)
- Andreas Skov Millarch
  - Department of Organ Surgery and Transplantation, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
  - Center for Surgical Translational and Artificial Intelligence Research (CSTAR), Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Alexander Bonde
  - Department of Organ Surgery and Transplantation, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
  - Center for Surgical Translational and Artificial Intelligence Research (CSTAR), Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Mikkel Bonde
  - Department of Organ Surgery and Transplantation, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
  - Center for Surgical Translational and Artificial Intelligence Research (CSTAR), Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Fredrik Folke
  - Copenhagen Emergency Medical Services, University of Copenhagen, Ballerup, Denmark
  - Department of Cardiology, Herlev Gentofte University Hospital, Hellerup, Denmark
- Søren Steemann Rudolph
  - Department of Anesthesia, Center of Head and Orthopedics, Rigshospitalet, Copenhagen, Denmark
- Martin Sillesen
  - Department of Organ Surgery and Transplantation, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
  - Center for Surgical Translational and Artificial Intelligence Research (CSTAR), Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
7. Kim JP, Ryan K, Kasun M, Hogg J, Dunn LB, Roberts LW. Physicians' and Machine Learning Researchers' Perspectives on Ethical Issues in the Early Development of Clinical Machine Learning Tools: Qualitative Interview Study. JMIR AI 2023;2:e47449. PMID: 38875536; PMCID: PMC11041441; DOI: 10.2196/47449.
Abstract
BACKGROUND Innovative tools leveraging artificial intelligence (AI) and machine learning (ML) are rapidly being developed for medicine, with new applications emerging in prediction, diagnosis, and treatment across a range of illnesses, patient populations, and clinical procedures. One barrier for successful innovation is the scarcity of research in the current literature seeking and analyzing the views of AI or ML researchers and physicians to support ethical guidance. OBJECTIVE This study aims to describe, using a qualitative approach, the landscape of ethical issues that AI or ML researchers and physicians with professional exposure to AI or ML tools observe or anticipate in the development and use of AI and ML in medicine. METHODS Semistructured interviews were used to facilitate in-depth, open-ended discussion, and a purposeful sampling technique was used to identify and recruit participants. We conducted 21 semistructured interviews with a purposeful sample of AI and ML researchers (n=10) and physicians (n=11). We asked interviewees about their views regarding ethical considerations related to the adoption of AI and ML in medicine. Interviews were transcribed and deidentified by members of our research team. Data analysis was guided by the principles of qualitative content analysis. This approach, in which transcribed data is broken down into descriptive units that are named and sorted based on their content, allows for the inductive emergence of codes directly from the data set. RESULTS Notably, both researchers and physicians articulated concerns regarding how AI and ML innovations are shaped in their early development (ie, the problem formulation stage). Considerations encompassed the assessment of research priorities and motivations, clarity and centeredness of clinical needs, professional and demographic diversity of research teams, and interdisciplinary knowledge generation and collaboration. Phase-1 ethical issues identified by interviewees were notably interdisciplinary in nature and invited questions regarding how to align priorities and values across disciplines and ensure clinical value throughout the development and implementation of medical AI and ML. Relatedly, interviewees suggested interdisciplinary solutions to these issues, for example, more resources to support knowledge generation and collaboration between developers and physicians, engagement with a broader range of stakeholders, and efforts to increase diversity in research broadly and within individual teams. CONCLUSIONS These qualitative findings help elucidate several ethical challenges anticipated or encountered in AI and ML for health care. Our study is unique in that its use of open-ended questions allowed interviewees to explore their sentiments and perspectives without overreliance on implicit assumptions about what AI and ML currently are or are not. This analysis, however, does not include the perspectives of other relevant stakeholder groups, such as patients, ethicists, industry researchers or representatives, or other health care professionals beyond physicians. Additional qualitative and quantitative research is needed to reproduce and build on these findings.
Affiliation(s)
- Jane Paik Kim
  - Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Palo Alto, CA, United States
- Katie Ryan
  - Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Palo Alto, CA, United States
- Max Kasun
  - Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Palo Alto, CA, United States
- Justin Hogg
  - Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Palo Alto, CA, United States
- Laura B Dunn
  - Department of Psychiatry, University of Arkansas for Medical Sciences, Little Rock, AR, United States
- Laura Weiss Roberts
  - Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Palo Alto, CA, United States
8. Farrell S, Appleton C, Noble PJM, Al Moubayed N. PetBERT: automated ICD-11 syndromic disease coding for outbreak detection in first opinion veterinary electronic health records. Sci Rep 2023;13:18015. PMID: 37865683; PMCID: PMC10590382; DOI: 10.1038/s41598-023-45155-7.
Abstract
Effective public health surveillance requires consistent monitoring of disease signals such that researchers and decision-makers can react dynamically to changes in disease occurrence. However, whilst surveillance initiatives exist in production animal veterinary medicine, comparable frameworks for companion animals are lacking. First-opinion veterinary electronic health records (EHRs) have the potential to reveal disease signals and often represent the initial reporting of clinical syndromes in animals presenting for medical attention, highlighting their possible significance in early disease detection. Yet despite their availability, their free-text nature limits their use, preventing the derivation of national-level mortality and morbidity statistics. This paper presents PetBERT, a large language model trained on over 500 million words from 5.1 million EHRs across the UK. PetBERT-ICD further trains PetBERT as a multi-label classifier for the automated coding of veterinary clinical EHRs with the International Classification of Diseases 11 (ICD-11) framework, achieving F1 scores exceeding 83% across 20 disease codings with minimal annotations. PetBERT-ICD effectively identifies disease outbreaks, up to 3 weeks earlier than current clinician-assigned point-of-care labelling strategies. The potential for PetBERT-ICD to enhance disease surveillance in veterinary medicine represents a promising avenue for advancing animal health and improving public health outcomes.
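Editor's note: the PetBERT-ICD stage is a standard multi-label fine-tuning setup. A minimal Hugging Face-style sketch, where a generic BERT checkpoint and an invented clinical note stand in for PetBERT and real EHR text:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Generic BERT stands in here; PetBERT weights would be loaded the same way.
name = "bert-base-uncased"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(
    name,
    num_labels=20,                               # 20 ICD-11 disease codings
    problem_type="multi_label_classification",   # sigmoid outputs + BCE loss
)

narrative = "V+E 3d, inappetent, possible dietary indiscretion."  # invented EHR text
batch = tok(narrative, return_tensors="pt", truncation=True)
with torch.no_grad():
    probs = torch.sigmoid(model(**batch).logits)  # one probability per code
flags = (probs > 0.5).squeeze(0)  # multi-label assignment (head untrained here;
                                  # shown for the API shape only)
```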
Affiliation(s)
- Sean Farrell
  - Department of Computer Science, Durham University, Durham, UK
- Charlotte Appleton
  - Centre for Health Informatics, Computing, and Statistics, Lancaster Medical School, Lancaster University, Lancaster, UK
- Peter-John Mäntylä Noble
  - Institute of Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, UK
- Noura Al Moubayed
  - Department of Computer Science, Durham University, Durham, UK
  - Evergreen Life Ltd, Manchester, UK
9. Cunningham JW, Singh P, Reeder C, Claggett B, Marti-Castellote PM, Lau ES, Khurshid S, Batra P, Lubitz SA, Maddah M, Philippakis A, Desai AS, Ellinor PT, Vardeny O, Solomon SD, Ho JE. Natural Language Processing for Adjudication of Heart Failure Hospitalizations in a Multi-Center Clinical Trial. medRxiv [preprint] 2023:2023.08.17.23294234. PMID: 37662283; PMCID: PMC10473787; DOI: 10.1101/2023.08.17.23294234.
Abstract
Background The gold standard for outcome adjudication in clinical trials is chart review by a physician clinical events committee (CEC), which requires substantial time and expertise. Automated adjudication by natural language processing (NLP) may offer a more resource-efficient alternative. We previously showed that the Community Care Cohort Project (C3PO) NLP model adjudicates heart failure (HF) hospitalizations accurately within one healthcare system. Methods This study externally validated the C3PO NLP model against CEC adjudication in the INVESTED trial. INVESTED compared influenza vaccination formulations in 5260 patients with cardiovascular disease at 157 North American sites. A central CEC adjudicated the cause of hospitalizations from medical records. We applied the C3PO NLP model to medical records from 4060 INVESTED hospitalizations and evaluated agreement between the NLP and final consensus CEC HF adjudications. We then fine-tuned the C3PO NLP model (C3PO+INVESTED) and trained a de novo model using half the INVESTED hospitalizations, and evaluated these models in the other half. NLP performance was benchmarked to CEC reviewer inter-rater reproducibility. Results 1074 hospitalizations (26%) were adjudicated as HF by the CEC. There was high agreement between the C3PO NLP and CEC HF adjudications (agreement 87%, kappa statistic 0.69). C3PO NLP model sensitivity was 94% and specificity was 84%. The fine-tuned C3PO and de novo NLP models demonstrated agreement of 93% and kappa of 0.82 and 0.83, respectively. CEC reviewer inter-rater reproducibility was 94% (kappa 0.85). Conclusion Our NLP model developed within a single healthcare system accurately identified HF events relative to the gold-standard CEC in an external multi-center clinical trial. Fine-tuning the model improved agreement and approximated human reproducibility. NLP may improve the efficiency of future multi-center clinical trials by accurately identifying clinical events at scale.
Affiliation(s)
- Jonathan W. Cunningham
  - Division of Cardiovascular Medicine, Brigham and Women’s Hospital, Boston, Massachusetts
  - Cardiovascular Disease Initiative, Broad Institute of Harvard University and the Massachusetts Institute of Technology, Cambridge, Massachusetts
- Pulkit Singh
  - Data Sciences Platform, Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge, Massachusetts
- Christopher Reeder
  - Data Sciences Platform, Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge, Massachusetts
- Brian Claggett
  - Division of Cardiovascular Medicine, Brigham and Women’s Hospital, Boston, Massachusetts
- Emily S. Lau
  - Cardiovascular Disease Initiative, Broad Institute of Harvard University and the Massachusetts Institute of Technology, Cambridge, Massachusetts
  - Division of Cardiology, Massachusetts General Hospital, Boston, Massachusetts
- Shaan Khurshid
  - Cardiovascular Disease Initiative, Broad Institute of Harvard University and the Massachusetts Institute of Technology, Cambridge, Massachusetts
  - Demoulas Center for Cardiac Arrhythmias, Massachusetts General Hospital, Boston, Massachusetts
- Puneet Batra
  - Data Sciences Platform, Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge, Massachusetts
- Steven A. Lubitz
  - Cardiovascular Disease Initiative, Broad Institute of Harvard University and the Massachusetts Institute of Technology, Cambridge, Massachusetts
  - Demoulas Center for Cardiac Arrhythmias, Massachusetts General Hospital, Boston, Massachusetts
- Mahnaz Maddah
  - Data Sciences Platform, Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge, Massachusetts
- Anthony Philippakis
  - Data Sciences Platform, Broad Institute of Harvard and the Massachusetts Institute of Technology, Cambridge, Massachusetts
- Akshay S. Desai
  - Division of Cardiovascular Medicine, Brigham and Women’s Hospital, Boston, Massachusetts
- Patrick T. Ellinor
  - Cardiovascular Disease Initiative, Broad Institute of Harvard University and the Massachusetts Institute of Technology, Cambridge, Massachusetts
  - Demoulas Center for Cardiac Arrhythmias, Massachusetts General Hospital, Boston, Massachusetts
- Orly Vardeny
  - Minneapolis VA Hospital, University of Minnesota, Minneapolis, Minnesota
- Scott D. Solomon
  - Division of Cardiovascular Medicine, Brigham and Women’s Hospital, Boston, Massachusetts
- Jennifer E. Ho
  - Cardiovascular Disease Initiative, Broad Institute of Harvard University and the Massachusetts Institute of Technology, Cambridge, Massachusetts
  - CardioVascular Institute and Division of Cardiology, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts
10. de Hond AAH, Kant IMJ, Fornasa M, Cinà G, Elbers PWG, Thoral PJ, Sesmu Arbous M, Steyerberg EW. Predicting Readmission or Death After Discharge From the ICU: External Validation and Retraining of a Machine Learning Model. Crit Care Med 2023;51:291-300. PMID: 36524820; PMCID: PMC9848213; DOI: 10.1097/ccm.0000000000005758.
Abstract
OBJECTIVES Many machine learning (ML) models have been developed for application in the ICU, but few models have been subjected to external validation. The performance of these models in new settings therefore remains unknown. The objective of this study was to assess the performance of an existing decision support tool based on a ML model predicting readmission or death within 7 days after ICU discharge before, during, and after retraining and recalibration. DESIGN A gradient boosted ML model was developed and validated on electronic health record data from 2004 to 2021. We performed an independent validation of this model on electronic health record data from 2011 to 2019 from a different tertiary care center. SETTING Two ICUs in tertiary care centers in The Netherlands. PATIENTS Adult patients who were admitted to the ICU and stayed for longer than 12 hours. INTERVENTIONS None. MEASUREMENTS AND MAIN RESULTS We assessed discrimination by area under the receiver operating characteristic curve (AUC) and calibration (slope and intercept). We retrained and recalibrated the original model and assessed performance via a temporal validation design. The final retrained model was cross-validated on all data from the new site. Readmission or death within 7 days after ICU discharge occurred in 577 of 10,052 ICU admissions (5.7%) at the new site. External validation revealed moderate discrimination with an AUC of 0.72 (95% CI 0.67-0.76). Retrained models showed improved discrimination with AUC 0.79 (95% CI 0.75-0.82) for the final validation model. Calibration was poor initially and good after recalibration via isotonic regression. CONCLUSIONS In this era of expanding availability of ML models, external validation and retraining are key steps to consider before applying ML models to new settings. Clinicians and decision-makers should take this into account when considering applying new ML models to their local settings.
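Editor's note: recalibration of a ported model, as performed here, remaps predicted probabilities onto the new site's outcome rates while leaving the risk ranking essentially intact, so discrimination is preserved. A compact sketch of isotonic recalibration on synthetic probabilities (illustrative only):

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

def recalibrate(p_tune, y_tune, p_new):
    """Learn a monotone map from model outputs to observed event rates
    on a site-specific tuning set, then apply it to new predictions."""
    iso = IsotonicRegression(out_of_bounds="clip", y_min=0.0, y_max=1.0)
    iso.fit(p_tune, y_tune)   # held-out data from the new site
    return iso.predict(p_new)

# Synthetic stand-ins for the ported model's outputs at the new site:
rng = np.random.default_rng(0)
p_val = rng.beta(2, 20, size=2000)                    # outputs on a tuning split
y_val = rng.binomial(1, np.clip(2.5 * p_val, 0, 1))   # observed 7-day events
p_test = rng.beta(2, 20, size=500)
print(recalibrate(p_val, y_val, p_test)[:5])
```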
Affiliation(s)
- Anne A H de Hond
  - Department of Information Technology and Digital Innovation, Leiden University Medical Centre, Leiden, The Netherlands
  - Department of Biomedical Informatics, Stanford Medicine, Stanford, CA
  - Department of Biomedical Data Sciences, Leiden University Medical Centre, Leiden, The Netherlands
- Ilse M J Kant
  - Department of Information Technology and Digital Innovation, Leiden University Medical Centre, Leiden, The Netherlands
  - Department of Biomedical Data Sciences, Leiden University Medical Centre, Leiden, The Netherlands
- Giovanni Cinà
  - Pacmed, Stadhouderskade 55, Amsterdam, The Netherlands
  - Institute of Logic, Language and Computation, University of Amsterdam, Amsterdam, The Netherlands
- Paul W G Elbers
  - Department of Intensive Care Medicine, Laboratory for Critical Care Computational Intelligence, Amsterdam UMC, Amsterdam, The Netherlands
- Patrick J Thoral
  - Department of Intensive Care Medicine, Laboratory for Critical Care Computational Intelligence, Amsterdam UMC, Amsterdam, The Netherlands
- M Sesmu Arbous
  - Department of Intensive Care Medicine, Leiden University Medical Centre, Leiden, The Netherlands
- Ewout W Steyerberg
  - Department of Biomedical Data Sciences, Leiden University Medical Centre, Leiden, The Netherlands
11. Yogarajan V, Dobbie G, Leitch S, Keegan TT, Bensemann J, Witbrock M, Asrani V, Reith D. Data and model bias in artificial intelligence for healthcare applications in New Zealand. Front Comput Sci 2022. DOI: 10.3389/fcomp.2022.1070493.
Abstract
Introduction Developments in Artificial Intelligence (AI) are being widely adopted in healthcare. However, the introduction and use of AI may come with biases and disparities, resulting in concerns about healthcare access and outcomes for underrepresented indigenous populations. In New Zealand, Māori experience significant inequities in health compared to the non-Indigenous population. This research explores equity concepts and fairness measures concerning AI for healthcare in New Zealand. Methods This research considers data and model bias in NZ-based electronic health records (EHRs). Two very distinct NZ datasets are used, one obtained from one hospital and another from multiple GP practices, both collected by clinicians. To ensure research equality and fair inclusion of Māori, we combine expertise in Artificial Intelligence (AI), the New Zealand clinical context, and te ao Māori. The mitigation of inequity needs to be addressed in data collection, model development, and model deployment. In this paper, we analyze data and algorithmic bias concerning data collection and model development, training, and testing using health data collected by experts. We use fairness measures such as disparate impact scores, equal opportunity, and equalized odds to analyze tabular data. Furthermore, token frequencies, statistical significance testing, and fairness measures for word embeddings, such as the WEAT and WEFE frameworks, are used to analyze bias in free-form medical text. The AI model predictions are also explained using SHAP and LIME. Results This research analyzed fairness metrics for NZ EHRs while considering data and algorithmic bias. We show evidence of bias due to changes made in algorithmic design. Furthermore, we observe unintentional bias due to the underlying pre-trained models used to represent text data. This research addresses some vital issues while opening up the need and opportunity for future research. Discussion This research takes early steps toward developing a model of socially responsible and fair AI for New Zealand's population. We provide an overview of reproducible concepts that can be adopted for any NZ population data. Furthermore, we discuss the gaps and future research avenues that will enable more focused development of fairness measures suited to the New Zealand population's needs and social structure. One of the primary focuses of this research was ensuring fair inclusion. As such, we combine expertise in AI, clinical knowledge, and the representation of indigenous populations. This inclusion of experts will be vital moving forward, providing a stepping stone toward the integration of AI for better outcomes in healthcare.
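Editor's note: the tabular fairness measures named here have simple closed forms. A sketch of disparate impact and equalized-odds gaps over binary predictions, with group labels and thresholds chosen for illustration:

```python
import numpy as np

def group_rates(y_true, y_pred, group):
    """Selection rate, TPR, and FPR for one boolean group mask.
    (Assumes each subgroup is non-empty.)"""
    sel = y_pred[group].mean()
    tpr = y_pred[group & (y_true == 1)].mean()
    fpr = y_pred[group & (y_true == 0)].mean()
    return sel, tpr, fpr

def fairness_report(y_true, y_pred, is_maori):
    """Disparate impact and equalized-odds gaps between Māori and
    non-Māori patients (group labels are illustrative)."""
    sel_a, tpr_a, fpr_a = group_rates(y_true, y_pred, is_maori)
    sel_b, tpr_b, fpr_b = group_rates(y_true, y_pred, ~is_maori)
    return {
        "disparate_impact": sel_a / sel_b,       # ~1.0 is parity; <0.8 flags concern
        "equal_opportunity_gap": tpr_a - tpr_b,  # TPR difference
        "equalized_odds_gap": max(abs(tpr_a - tpr_b), abs(fpr_a - fpr_b)),
    }

# Toy usage with synthetic labels (illustration only):
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 1000)
y_pred = rng.integers(0, 2, 1000)
is_maori = rng.random(1000) < 0.17
print(fairness_report(y_true, y_pred, is_maori))
```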
12. King Z, Farrington J, Utley M, Kung E, Elkhodair S, Harris S, Sekula R, Gillham J, Li K, Crowe S. Machine learning for real-time aggregated prediction of hospital admission for emergency patients. NPJ Digit Med 2022;5:104. PMID: 35882903; PMCID: PMC9321296; DOI: 10.1038/s41746-022-00649-y.
Abstract
Machine learning for hospital operations is under-studied. We present a prediction pipeline that uses live electronic health records for patients in a UK teaching hospital’s emergency department (ED) to generate short-term, probabilistic forecasts of emergency admissions. A set of XGBoost classifiers applied to 109,465 ED visits yielded AUROCs from 0.82 to 0.90 depending on elapsed visit-time at the point of prediction. Patient-level probabilities of admission were aggregated to forecast the number of admissions among current ED patients and, incorporating patients yet to arrive, total emergency admissions within specified time-windows. The pipeline gave a mean absolute error (MAE) of 4.0 admissions (mean percentage error of 17%) versus 6.5 (32%) for a benchmark metric. Models developed with 104,504 later visits during the COVID-19 pandemic gave AUROCs of 0.68–0.90 and MAE of 4.2 (30%) versus a 4.9 (33%) benchmark. We discuss how we surmounted challenges of designing and implementing models for real-time use, including temporal framing, data preparation, and changing operational conditions.
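Editor's note: the aggregation step described here, turning patient-level admission probabilities into a forecast of the total, is a Poisson-binomial computation that can be done exactly by convolution. A minimal sketch with invented probabilities:

```python
import numpy as np

def admission_count_distribution(probs):
    """Exact distribution of the number of admissions among current ED
    patients, given each patient's predicted admission probability.

    Sequential convolution of independent Bernoulli outcomes yields the
    Poisson-binomial distribution over counts."""
    dist = np.array([1.0])                    # P(0 admissions) before any patient
    for p in probs:
        dist = np.convolve(dist, [1.0 - p, p])
    return dist                               # dist[k] = P(k admissions)

# Hypothetical classifier outputs for patients currently in the ED:
probs = [0.9, 0.7, 0.4, 0.2, 0.1, 0.05]
dist = admission_count_distribution(probs)
print("expected admissions:", np.dot(np.arange(len(dist)), dist))
print("P(>=3 admissions):", dist[3:].sum())
```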
Affiliation(s)
- Zella King
  - Clinical Operational Research Unit, University College London, 4 Taviton Street, London, WC1H 0BT, UK
  - Institute of Health Informatics, University College London, 222 Euston Road, London, NW1 2DA, UK
- Joseph Farrington
  - Institute of Health Informatics, University College London, 222 Euston Road, London, NW1 2DA, UK
- Martin Utley
  - Clinical Operational Research Unit, University College London, 4 Taviton Street, London, WC1H 0BT, UK
- Enoch Kung
  - Clinical Operational Research Unit, University College London, 4 Taviton Street, London, WC1H 0BT, UK
- Samer Elkhodair
  - University College London Hospitals NHS Foundation Trust, 250 Euston Road, London, NW1 2PG, UK
- Steve Harris
  - University College London Hospitals NHS Foundation Trust, 250 Euston Road, London, NW1 2PG, UK
- Richard Sekula
  - University College London Hospitals NHS Foundation Trust, 250 Euston Road, London, NW1 2PG, UK
- Jonathan Gillham
  - University College London Hospitals NHS Foundation Trust, 250 Euston Road, London, NW1 2PG, UK
- Kezhi Li
  - Institute of Health Informatics, University College London, 222 Euston Road, London, NW1 2DA, UK
- Sonya Crowe
  - Clinical Operational Research Unit, University College London, 4 Taviton Street, London, WC1H 0BT, UK
13. Yang J, Soltan AAS, Clifton DA. Machine learning generalizability across healthcare settings: insights from multi-site COVID-19 screening. NPJ Digit Med 2022;5:69. PMID: 35672368; PMCID: PMC9174159; DOI: 10.1038/s41746-022-00614-9.
Abstract
As patient health information is highly regulated due to privacy concerns, most machine learning (ML)-based healthcare studies are unable to test on external patient cohorts, resulting in a gap between locally reported model performance and cross-site generalizability. Different approaches have been introduced for developing models across multiple clinical sites; however, less attention has been given to adopting ready-made models in new settings. We introduce three methods to do this: (1) applying a ready-made model "as-is"; (2) readjusting the decision threshold on the model's output using site-specific data; and (3) fine-tuning the model using site-specific data via transfer learning. Using a case study of COVID-19 diagnosis across four NHS Hospital Trusts, we show that all methods achieve clinically effective performances (NPV > 0.959), with transfer learning achieving the best results (mean AUROCs between 0.870 and 0.925). Our models demonstrate that site-specific customization improves predictive performance when compared to other ready-made approaches.
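Editor's note: method (2), readjusting only the decision threshold with site-specific data, is the lightest-touch of the three adoption strategies. A sketch that picks the threshold achieving a target sensitivity on a local tuning set, using synthetic numbers rather than the study's data:

```python
import numpy as np

def readjust_threshold(p_site, y_site, target_sensitivity=0.9):
    """Keep the ready-made model fixed and move only its decision
    threshold, chosen on a small site-specific tuning set."""
    order = np.argsort(-p_site)                 # sweep thresholds high to low
    sens = np.cumsum(y_site[order]) / y_site.sum()
    k = np.searchsorted(sens, target_sensitivity)  # first index reaching target
    return p_site[order][min(k, len(order) - 1)]

# Synthetic outputs of a ported screening model at a new Trust:
rng = np.random.default_rng(7)
y = rng.binomial(1, 0.1, size=1000)
p = np.clip(rng.beta(2, 8, size=1000) + 0.3 * y, 0, 1)
print(f"site-specific threshold: {readjust_threshold(p, y):.3f}")
```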
Affiliation(s)
- Jenny Yang
  - Institute of Biomedical Engineering, Dept. Engineering Science, University of Oxford, Oxford, UK
- Andrew A S Soltan
  - John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
  - RDM Division of Cardiovascular Medicine, University of Oxford, Oxford, UK
- David A Clifton
  - Institute of Biomedical Engineering, Dept. Engineering Science, University of Oxford, Oxford, UK
14. Kim JH, Hua M, Whittington RA, Lee J, Liu C, Ta CN, Marcantonio ER, Goldberg TE, Weng C. A machine learning approach to identifying delirium from electronic health records. JAMIA Open 2022;5:ooac042. PMID: 35663114; PMCID: PMC9152701; DOI: 10.1093/jamiaopen/ooac042.
Abstract
The identification of delirium in electronic health records (EHRs) remains difficult due to inadequate assessment or under-documentation. The purpose of this research is to present a classification model that identifies delirium using retrospective EHR data. Delirium was confirmed with the Confusion Assessment Method for the Intensive Care Unit. Age, sex, Elixhauser comorbidity index, drug exposures, and diagnoses were used as features. The model was developed based on the Columbia University Irving Medical Center EHR data and further validated with the Medical Information Mart for Intensive Care III dataset. Seventy-six patients from the Surgical/Cardiothoracic ICU were included in the model. The logistic regression model achieved the best performance in identifying delirium, with a mean AUC of 0.874 ± 0.033. The mean positive predictive value of the logistic regression model was 0.80. The model promises to identify delirium cases from EHR data, thereby enabling a sustainable infrastructure for building a retrospective delirium cohort.

Delirium is a commonly observed complication in hospitalized patients, especially in intensive care. While signs and symptoms of delirium can be observed and well managed during the hospital stay, less is known about the long-term complications of delirium after discharge. In order to monitor the long-term sequelae of delirium, the correct identification of delirium patients is crucial. Currently, the retrospective identification of delirium patients is limited due to the under-coding of delirium diagnoses in electronic health records. We propose a simple machine-learning model to retrospectively identify patients who experienced delirium during their intensive care unit stay. The model could be used to identify missed delirium cases and to establish a delirium cohort for long-term monitoring and surveillance.
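Editor's note: a stripped-down version of the classification setup described here, logistic regression over age, sex, Elixhauser index, and exposure/diagnosis flags with cross-validated AUC and PPV, might look as follows. The data are fully synthetic and the feature encoding is an assumption:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import roc_auc_score, precision_score

# Synthetic feature matrix in the spirit of the study: age, sex,
# Elixhauser index, and binary drug-exposure / diagnosis indicators.
rng = np.random.default_rng(3)
X = np.column_stack([
    rng.normal(65, 10, 300),          # age
    rng.integers(0, 2, 300),          # sex
    rng.poisson(3, 300),              # Elixhauser comorbidity index
    rng.integers(0, 2, (300, 10)),    # drug / diagnosis flags
])
y = rng.binomial(1, 0.3, 300)         # CAM-ICU-confirmed delirium (synthetic)

clf = LogisticRegression(max_iter=1000)
p = cross_val_predict(clf, X, y, cv=5, method="predict_proba")[:, 1]
print("cross-validated AUC:", roc_auc_score(y, p))
print("PPV at 0.5 threshold:", precision_score(y, (p > 0.5).astype(int)))
```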
Affiliation(s)
- Jae Hyun Kim
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- May Hua
  - Department of Anesthesiology, Columbia University Medical Center, New York Presbyterian Hospital, New York, New York, USA
  - Department of Epidemiology, Columbia University Mailman School of Public Health, New York, New York, USA
- Robert A Whittington
  - Department of Anesthesiology, Columbia University Medical Center, New York Presbyterian Hospital, New York, New York, USA
- Junghwan Lee
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Cong Liu
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Casey N Ta
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Edward R Marcantonio
  - Harvard Medical School, Boston, Massachusetts, USA
  - Division of General Medicine, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA
  - Division of Gerontology, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA
- Terry E Goldberg
  - Department of Anesthesiology, Columbia University Medical Center, New York Presbyterian Hospital, New York, New York, USA
  - Department of Psychiatry, Columbia University Irving Medical Center, New York, New York, USA
- Chunhua Weng
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA