1
|
Lala A, Johnson KW, Januzzi JL, Russak AJ, Paranjpe I, Richter F, Zhao S, Somani S, Van Vleck T, Vaid A, Chaudhry F, De Freitas JK, Fayad ZA, Pinney SP, Levin M, Charney A, Bagiella E, Narula J, Glicksberg BS, Nadkarni G, Mancini DM, Fuster V. Prevalence and Impact of Myocardial Injury in Patients Hospitalized With COVID-19 Infection. J Am Coll Cardiol 2020; 76:533-546. [PMID: 32517963 PMCID: PMC7279721 DOI: 10.1016/j.jacc.2020.06.007] [Citation(s) in RCA: 544] [Impact Index Per Article: 108.8] [Received: 05/18/2020] [Accepted: 06/02/2020] [Indexed: 02/06/2023]
Abstract
BACKGROUND The degree of myocardial injury, as reflected by troponin elevation, and associated outcomes among U.S. hospitalized patients with coronavirus disease-2019 (COVID-19) are unknown. OBJECTIVES The purpose of this study was to describe the degree of myocardial injury and associated outcomes in a large hospitalized cohort with laboratory-confirmed COVID-19. METHODS Patients with COVID-19 admitted to 1 of 5 Mount Sinai Health System hospitals in New York City between February 27, 2020, and April 12, 2020, with troponin-I (normal value <0.03 ng/ml) measured within 24 h of admission were included (n = 2,736). Demographics, medical histories, admission laboratory results, and outcomes were captured from the hospitals' electronic health records. RESULTS The median age was 66.4 years, with 59.6% men. Cardiovascular disease (CVD), including coronary artery disease, atrial fibrillation, and heart failure, was more prevalent in patients with higher troponin concentrations, as were hypertension and diabetes. A total of 506 (18.5%) patients died during hospitalization. In all, 985 (36%) patients had elevated troponin concentrations. After adjusting for disease severity and relevant clinical factors, even small amounts of myocardial injury (e.g., troponin I >0.03 to 0.09 ng/ml; n = 455; 16.6%) were significantly associated with death (adjusted hazard ratio: 1.75; 95% CI: 1.37 to 2.24; p < 0.001) while greater amounts (e.g., troponin I >0.09 ng/ml; n = 530; 19.4%) were significantly associated with higher risk (adjusted HR: 3.03; 95% CI: 2.42 to 3.80; p < 0.001). CONCLUSIONS Myocardial injury is prevalent among patients hospitalized with COVID-19; however, troponin concentrations were generally present at low levels. Patients with CVD are more likely to have myocardial injury than patients without CVD. Troponin elevation among patients hospitalized with COVID-19 is associated with higher risk of mortality.
|
Research Support, N.I.H., Extramural |
5 |
544 |
2
|
Johnson KW, Torres Soto J, Glicksberg BS, Shameer K, Miotto R, Ali M, Ashley E, Dudley JT. Artificial Intelligence in Cardiology. J Am Coll Cardiol 2019; 71:2668-2679. [PMID: 29880128 DOI: 10.1016/j.jacc.2018.03.521] [Citation(s) in RCA: 520] [Impact Index Per Article: 86.7] [Received: 09/12/2017] [Revised: 03/01/2018] [Accepted: 03/05/2018] [Indexed: 01/24/2023]
Abstract
Artificial intelligence and machine learning are poised to influence nearly every aspect of the human condition, and cardiology is not an exception to this trend. This paper provides a guide for clinicians on relevant aspects of artificial intelligence and machine learning, reviews selected applications of these methods in cardiology to date, and identifies how cardiovascular medicine could incorporate artificial intelligence in the future. In particular, the paper first reviews predictive modeling concepts relevant to cardiology such as feature selection and frequent pitfalls such as improper dichotomization. Second, it discusses common algorithms used in supervised learning and reviews selected applications in cardiology and related disciplines. Third, it describes the advent of deep learning and related methods collectively called unsupervised learning, provides contextual examples both in general medicine and in cardiovascular medicine, and then explains how these methods could be applied to enable precision cardiology and improve patient outcomes.
|
Review |
6 |
520 |
3
|
Chan L, Chaudhary K, Saha A, Chauhan K, Vaid A, Zhao S, Paranjpe I, Somani S, Richter F, Miotto R, Lala A, Kia A, Timsina P, Li L, Freeman R, Chen R, Narula J, Just AC, Horowitz C, Fayad Z, Cordon-Cardo C, Schadt E, Levin MA, Reich DL, Fuster V, Murphy B, He JC, Charney AW, Böttinger EP, Glicksberg BS, Coca SG, Nadkarni GN. AKI in Hospitalized Patients with COVID-19. J Am Soc Nephrol 2021; 32:151-160. [PMID: 32883700 PMCID: PMC7894657 DOI: 10.1681/asn.2020050615] [Citation(s) in RCA: 468] [Impact Index Per Article: 117.0] [Received: 05/08/2020] [Accepted: 08/03/2020] [Indexed: 02/04/2023] [Open Access]
Abstract
BACKGROUND Early reports indicate that AKI is common among patients with coronavirus disease 2019 (COVID-19) and associated with worse outcomes. However, AKI among hospitalized patients with COVID-19 in the United States is not well described. METHODS This retrospective, observational study involved a review of data from electronic health records of patients aged ≥18 years with laboratory-confirmed COVID-19 admitted to the Mount Sinai Health System from February 27 to May 30, 2020. We describe the frequency of AKI and dialysis requirement, AKI recovery, and adjusted odds ratios (aORs) for mortality. RESULTS Of 3993 hospitalized patients with COVID-19, AKI occurred in 1835 (46%) patients; 347 (19%) of the patients with AKI required dialysis. The proportions with stages 1, 2, or 3 AKI were 39%, 19%, and 42%, respectively. A total of 976 (24%) patients were admitted to intensive care, and 745 (76%) experienced AKI. Of the 435 patients with AKI and urine studies, 84% had proteinuria, 81% had hematuria, and 60% had leukocyturia. Independent predictors of severe AKI were CKD, male sex, and higher serum potassium at admission. In-hospital mortality was 50% among patients with AKI versus 8% among those without AKI (aOR, 9.2; 95% confidence interval, 7.5 to 11.3). Of survivors with AKI who were discharged, 35% had not recovered to baseline kidney function by the time of discharge. An additional 28 of 77 (36%) patients who had not recovered kidney function at discharge did so on posthospital follow-up. CONCLUSIONS AKI is common among patients hospitalized with COVID-19 and is associated with high mortality. Of all patients with AKI, only 30% survived with recovery of kidney function by the time of discharge.
|
Observational Study |
4 |
468 |
4
|
Pujadas E, Chaudhry F, McBride R, Richter F, Zhao S, Wajnberg A, Nadkarni G, Glicksberg BS, Houldsworth J, Cordon-Cardo C. SARS-CoV-2 viral load predicts COVID-19 mortality. Lancet Respir Med 2020; 8:e70. [PMID: 32771081 PMCID: PMC7836878 DOI: 10.1016/s2213-2600(20)30354-4] [Citation(s) in RCA: 355] [Impact Index Per Article: 71.0] [Received: 06/26/2020] [Revised: 07/16/2020] [Accepted: 07/29/2020] [Indexed: 12/16/2022]
|
Letter |
5 |
355 |
5
|
Nadkarni GN, Lala A, Bagiella E, Chang HL, Moreno PR, Pujadas E, Arvind V, Bose S, Charney AW, Chen MD, Cordon-Cardo C, Dunn AS, Farkouh ME, Glicksberg BS, Kia A, Kohli-Seth R, Levin MA, Timsina P, Zhao S, Fayad ZA, Fuster V. Anticoagulation, Bleeding, Mortality, and Pathology in Hospitalized Patients With COVID-19. J Am Coll Cardiol 2020; 76:1815-1826. [PMID: 32860872 PMCID: PMC7449655 DOI: 10.1016/j.jacc.2020.08.041] [Citation(s) in RCA: 336] [Impact Index Per Article: 67.2] [Received: 08/03/2020] [Revised: 08/20/2020] [Accepted: 08/20/2020] [Indexed: 12/22/2022]
Abstract
Background Thromboembolic disease is common in coronavirus disease-2019 (COVID-19). There is limited evidence on the association of in-hospital anticoagulation (AC) with outcomes and postmortem findings. Objectives The purpose of this study was to examine association of AC with in-hospital outcomes and describe thromboembolic findings on autopsies. Methods This retrospective analysis examined the association of AC with mortality, intubation, and major bleeding. Subanalyses were also conducted on the association of therapeutic versus prophylactic AC initiated ≤48 h from admission. Thromboembolic disease was contextualized by premortem AC among consecutive autopsies. Results Among 4,389 patients, median age was 65 years with 44% women. Compared with no AC (n = 1,530; 34.9%), therapeutic AC (n = 900; 20.5%) and prophylactic AC (n = 1,959; 44.6%) were associated with lower in-hospital mortality (adjusted hazard ratio [aHR]: 0.53; 95% confidence interval [CI]: 0.45 to 0.62 and aHR: 0.50; 95% CI: 0.45 to 0.57, respectively), and intubation (aHR: 0.69; 95% CI: 0.51 to 0.94 and aHR: 0.72; 95% CI: 0.58 to 0.89, respectively). When initiated ≤48 h from admission, there was no statistically significant difference between therapeutic (n = 766) versus prophylactic AC (n = 1,860) (aHR: 0.86; 95% CI: 0.73 to 1.02; p = 0.08). Overall, 89 patients (2%) had major bleeding adjudicated by clinician review, with 27 of 900 (3.0%) on therapeutic, 33 of 1,959 (1.7%) on prophylactic, and 29 of 1,530 (1.9%) on no AC. Of 26 autopsies, 11 (42%) had thromboembolic disease not clinically suspected and 3 of 11 (27%) were on therapeutic AC. Conclusions AC was associated with lower mortality and intubation among hospitalized COVID-19 patients. Compared with prophylactic AC, therapeutic AC was associated with lower mortality, although not statistically significant. Autopsies revealed frequent thromboembolic disease. These data may inform trials to determine optimal AC regimens.
|
Research Support, N.I.H., Extramural |
5 |
336 |
6
|
Li L, Cheng WY, Glicksberg BS, Gottesman O, Tamler R, Chen R, Bottinger EP, Dudley JT. Identification of type 2 diabetes subgroups through topological analysis of patient similarity. Sci Transl Med 2016; 7:311ra174. [PMID: 26511511 DOI: 10.1126/scitranslmed.aaa9364] [Citation(s) in RCA: 301] [Impact Index Per Article: 33.4] [Indexed: 12/16/2022]
Abstract
Type 2 diabetes (T2D) is a heterogeneous complex disease affecting more than 29 million Americans alone with a rising prevalence trending toward steady increases in the coming decades. Thus, there is a pressing clinical need to improve early prevention and clinical management of T2D and its complications. Clinicians have understood that patients who carry the T2D diagnosis have a variety of phenotypes and susceptibilities to diabetes-related complications. We used a precision medicine approach to characterize the complexity of T2D patient populations based on high-dimensional electronic medical records (EMRs) and genotype data from 11,210 individuals. We successfully identified three distinct subgroups of T2D from topology-based patient-patient networks. Subtype 1 was characterized by T2D complications diabetic nephropathy and diabetic retinopathy; subtype 2 was enriched for cancer malignancy and cardiovascular diseases; and subtype 3 was associated most strongly with cardiovascular diseases, neurological diseases, allergies, and HIV infections. We performed a genetic association analysis of the emergent T2D subtypes to identify subtype-specific genetic markers and identified 1279, 1227, and 1338 single-nucleotide polymorphisms (SNPs) that mapped to 425, 322, and 437 unique genes specific to subtypes 1, 2, and 3, respectively. By assessing the human disease-SNP association for each subtype, the enriched phenotypes and biological functions at the gene level for each subtype matched with the disease comorbidities and clinical differences that we identified through EMRs. Our approach demonstrates the utility of applying the precision medicine paradigm in T2D and the promise of extending the approach to the study of other complex, multifactorial diseases.
|
Research Support, N.I.H., Extramural |
9 |
301 |
7
|
Shameer K, Johnson KW, Glicksberg BS, Dudley JT, Sengupta PP. Machine learning in cardiovascular medicine: are we there yet? Heart 2018; 104:1156-1164. [PMID: 29352006 DOI: 10.1136/heartjnl-2017-311198] [Citation(s) in RCA: 240] [Impact Index Per Article: 34.3] [Received: 07/10/2017] [Revised: 12/19/2017] [Accepted: 12/21/2017] [Indexed: 12/11/2022] [Open Access]
Abstract
Artificial intelligence (AI) broadly refers to analytical algorithms that iteratively learn from data, allowing computers to find hidden insights without being explicitly programmed where to look. These include a family of operations encompassing several terms like machine learning, cognitive learning, deep learning and reinforcement learning-based methods that can be used to integrate and interpret complex biomedical and healthcare data in scenarios where traditional statistical methods may not be able to perform. In this review article, we discuss the basics of machine learning algorithms and what potential data sources exist; evaluate the need for machine learning; and examine the potential limitations and challenges of implementing machine learning in the context of cardiovascular medicine. The most promising avenues for AI in medicine are the development of automated risk prediction algorithms which can be used to guide clinical care; use of unsupervised learning techniques to more precisely phenotype complex disease; and the implementation of reinforcement learning algorithms to intelligently augment healthcare providers. The utility of a machine learning-based predictive model will depend on factors including data heterogeneity, data depth, data breadth, nature of modelling task, choice of machine learning and feature selection algorithms, and orthogonal evidence. A critical understanding of the strength and limitations of various methods and tasks amenable to machine learning is vital. By leveraging the growing corpus of big data in medicine, we detail pathways by which machine learning may facilitate optimal development of patient-specific models for improving diagnoses, intervention and outcome in cardiovascular medicine.
|
Review |
7 |
240 |
8
|
Ramlall V, Thangaraj PM, Meydan C, Foox J, Butler D, Kim J, May B, De Freitas JK, Glicksberg BS, Mason CE, Tatonetti NP, Shapira SD. Immune complement and coagulation dysfunction in adverse outcomes of SARS-CoV-2 infection. Nat Med 2020; 26:1609-1615. [PMID: 32747830 PMCID: PMC7809634 DOI: 10.1038/s41591-020-1021-2] [Citation(s) in RCA: 232] [Impact Index Per Article: 46.4] [Received: 06/25/2020] [Accepted: 07/16/2020] [Indexed: 11/08/2022]
Abstract
Understanding the pathophysiology of SARS-CoV-2 infection is critical for therapeutic and public health strategies. Viral-host interactions can guide discovery of disease regulators, and protein structure function analysis points to several immune pathways, including complement and coagulation, as targets of coronaviruses. To determine whether conditions associated with dysregulated complement or coagulation systems impact disease, we performed a retrospective observational study and found that history of macular degeneration (a proxy for complement-activation disorders) and history of coagulation disorders (thrombocytopenia, thrombosis and hemorrhage) are risk factors for SARS-CoV-2-associated morbidity and mortality, effects that are independent of age, sex or history of smoking. Transcriptional profiling of nasopharyngeal swabs demonstrated that in addition to type-I interferon and interleukin-6-dependent inflammatory responses, infection results in robust engagement of the complement and coagulation pathways. Finally, in a candidate-driven genetic association study of severe SARS-CoV-2 disease, we identified putative complement and coagulation-associated loci including missense, eQTL and sQTL variants of critical complement and coagulation regulators. In addition to providing evidence that complement function modulates SARS-CoV-2 infection outcome, the data point to putative transcriptional genetic markers of susceptibility. The results highlight the value of using a multimodal analytical approach to reveal determinants and predictors of immunity, susceptibility and clinical outcome associated with infection.
|
Observational Study |
5 |
232 |
9
|
Xu J, Glicksberg BS, Su C, Walker P, Bian J, Wang F. Federated Learning for Healthcare Informatics. J Healthc Inform Res 2020; 5:1-19. [PMID: 33204939 PMCID: PMC7659898 DOI: 10.1007/s41666-020-00082-4] [Citation(s) in RCA: 223] [Impact Index Per Article: 44.6] [Received: 08/19/2020] [Revised: 10/21/2020] [Accepted: 10/30/2020] [Indexed: 01/02/2023]
Abstract
With the rapid development of computer software and hardware technologies, more and more healthcare data are becoming readily available from clinical institutions, patients, insurance companies, and pharmaceutical industries, among others. This access provides an unprecedented opportunity for data science technologies to derive data-driven insights and improve the quality of care delivery. Healthcare data, however, are usually fragmented and private making it difficult to generate robust results across populations. For example, different hospitals own the electronic health records (EHR) of different patient populations and these records are difficult to share across hospitals because of their sensitive nature. This creates a big barrier for developing effective analytical approaches that are generalizable, which need diverse, “big data.” Federated learning, a mechanism of training a shared global model with a central server while keeping all the sensitive data in local institutions where the data belong, provides great promise to connect the fragmented healthcare data sources with privacy-preservation. The goal of this survey is to provide a review for federated learning technologies, particularly within the biomedical space. In particular, we summarize the general solutions to the statistical challenges, system challenges, and privacy issues in federated learning, and point out the implications and potentials in healthcare.
|
Journal Article |
5 |
223 |
10
|
Sigel K, Swartz T, Golden E, Paranjpe I, Somani S, Richter F, De Freitas JK, Miotto R, Zhao S, Polak P, Mutetwa T, Factor S, Mehandru S, Mullen M, Cossarini F, Bottinger E, Fayad Z, Merad M, Gnjatic S, Aberg J, Charney A, Nadkarni G, Glicksberg BS. Coronavirus 2019 and People Living With Human Immunodeficiency Virus: Outcomes for Hospitalized Patients in New York City. Clin Infect Dis 2020; 71:2933-2938. [PMID: 32594164 PMCID: PMC7337691 DOI: 10.1093/cid/ciaa880] [Citation(s) in RCA: 155] [Impact Index Per Article: 31.0] [Received: 05/15/2020] [Accepted: 06/22/2020] [Indexed: 12/15/2022] [Open Access]
Abstract
BACKGROUND There are limited data regarding the clinical impact of coronavirus disease 2019 (COVID-19) on people living with human immunodeficiency virus (PLWH). In this study, we compared outcomes for PLWH with COVID-19 to a matched comparison group. METHODS We identified 88 PLWH hospitalized with laboratory-confirmed COVID-19 in our hospital system in New York City between 12 March and 23 April 2020. We collected data on baseline clinical characteristics, laboratory values, HIV status, treatment, and outcomes from this group and matched comparators (1 PLWH to up to 5 patients by age, sex, race/ethnicity, and calendar week of infection). We compared clinical characteristics and outcomes (death, mechanical ventilation, hospital discharge) for these groups, as well as cumulative incidence of death by HIV status. RESULTS Patients did not differ significantly by HIV status by age, sex, or race/ethnicity due to the matching algorithm. PLWH hospitalized with COVID-19 had high proportions of HIV virologic control on antiretroviral therapy. PLWH had greater proportions of smoking (P < .001) and comorbid illness than uninfected comparators. There was no difference in COVID-19 severity on admission by HIV status (P = .15). Poor outcomes for hospitalized PLWH were frequent but similar to proportions in comparators; 18% required mechanical ventilation and 21% died during follow-up (compared with 23% and 20%, respectively). There was similar cumulative incidence of death over time by HIV status (P = .94). CONCLUSIONS We found no differences in adverse outcomes associated with HIV infection for hospitalized COVID-19 patients compared with a demographically similar patient group.
|
Research Support, N.I.H., Extramural |
5 |
155 |
11
|
Alvarez-Garcia J, Lee S, Gupta A, Cagliostro M, Joshi AA, Rivas-Lasarte M, Contreras J, Mitter SS, LaRocca G, Tlachi P, Brunjes D, Glicksberg BS, Levin MA, Nadkarni G, Fayad Z, Fuster V, Mancini D, Lala A. Prognostic Impact of Prior Heart Failure in Patients Hospitalized With COVID-19. J Am Coll Cardiol 2020; 76:2334-2348. [PMID: 33129663 PMCID: PMC7598769 DOI: 10.1016/j.jacc.2020.09.549] [Citation(s) in RCA: 145] [Impact Index Per Article: 29.0] [Received: 08/05/2020] [Revised: 09/14/2020] [Accepted: 09/17/2020] [Indexed: 02/07/2023]
Abstract
BACKGROUND Patients with pre-existing heart failure (HF) are likely at higher risk for adverse outcomes in coronavirus disease-2019 (COVID-19), but data on this population are sparse. OBJECTIVES This study described the clinical profile and associated outcomes among patients with HF hospitalized with COVID-19. METHODS This study conducted a retrospective analysis of 6,439 patients admitted for COVID-19 at 1 of 5 Mount Sinai Health System hospitals in New York City between February 27 and June 26, 2020. Clinical characteristics and outcomes (length of stay, need for intensive care unit, mechanical ventilation, and in-hospital mortality) were captured from electronic health records. For patients identified as having a history of HF by International Classification of Diseases-9th and/or 10th Revisions codes, manual chart abstraction informed etiology, functional class, and left ventricular ejection fraction (LVEF). RESULTS Mean age was 63.5 years, and 45% were women. Compared with patients without HF, those with previous HF experienced longer length of stay (8 days vs. 6 days; p < 0.001), increased risk of mechanical ventilation (22.8% vs. 11.9%; adjusted odds ratio: 3.64; 95% confidence interval: 2.56 to 5.16; p < 0.001), and mortality (40.0% vs. 24.9%; adjusted odds ratio: 1.88; 95% confidence interval: 1.27 to 2.78; p = 0.002). Outcomes among patients with HF were similar, regardless of LVEF or renin-angiotensin-aldosterone inhibitor use. CONCLUSIONS History of HF was associated with higher risk of mechanical ventilation and mortality among patients hospitalized for COVID-19, regardless of LVEF.
|
research-article |
5 |
145 |
12
|
Vaid A, Somani S, Russak AJ, De Freitas JK, Chaudhry FF, Paranjpe I, Johnson KW, Lee SJ, Miotto R, Richter F, Zhao S, Beckmann ND, Naik N, Kia A, Timsina P, Lala A, Paranjpe M, Golden E, Danieletto M, Singh M, Meyer D, O'Reilly PF, Huckins L, Kovatch P, Finkelstein J, Freeman RM, Argulian E, Kasarskis A, Percha B, Aberg JA, Bagiella E, Horowitz CR, Murphy B, Nestler EJ, Schadt EE, Cho JH, Cordon-Cardo C, Fuster V, Charney DS, Reich DL, Bottinger EP, Levin MA, Narula J, Fayad ZA, Just AC, Charney AW, Nadkarni GN, Glicksberg BS. Machine Learning to Predict Mortality and Critical Events in a Cohort of Patients With COVID-19 in New York City: Model Development and Validation. J Med Internet Res 2020; 22:e24018. [PMID: 33027032 PMCID: PMC7652593 DOI: 10.2196/24018] [Citation(s) in RCA: 133] [Impact Index Per Article: 26.6] [Received: 09/01/2020] [Revised: 10/02/2020] [Accepted: 10/02/2020] [Indexed: 01/08/2023] [Open Access]
Abstract
BACKGROUND COVID-19 has infected millions of people worldwide and is responsible for several hundred thousand fatalities. The COVID-19 pandemic has necessitated thoughtful resource allocation and early identification of high-risk patients. However, effective methods to meet these needs are lacking. OBJECTIVE The aims of this study were to analyze the electronic health records (EHRs) of patients who tested positive for COVID-19 and were admitted to hospitals in the Mount Sinai Health System in New York City; to develop machine learning models for making predictions about the hospital course of the patients over clinically meaningful time horizons based on patient characteristics at admission; and to assess the performance of these models at multiple hospitals and time points. METHODS We used Extreme Gradient Boosting (XGBoost) and baseline comparator models to predict in-hospital mortality and critical events at time windows of 3, 5, 7, and 10 days from admission. Our study population included harmonized EHR data from five hospitals in New York City for 4098 COVID-19-positive patients admitted from March 15 to May 22, 2020. The models were first trained on patients from a single hospital (n=1514) before or on May 1, externally validated on patients from four other hospitals (n=2201) before or on May 1, and prospectively validated on all patients after May 1 (n=383). Finally, we established model interpretability to identify and rank variables that drive model predictions. RESULTS Upon cross-validation, the XGBoost classifier outperformed baseline models, with an area under the receiver operating characteristic curve (AUC-ROC) for mortality of 0.89 at 3 days, 0.85 at 5 and 7 days, and 0.84 at 10 days. XGBoost also performed well for critical event prediction, with an AUC-ROC of 0.80 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days. 
In external validation, XGBoost achieved an AUC-ROC of 0.88 at 3 days, 0.86 at 5 days, 0.86 at 7 days, and 0.84 at 10 days for mortality prediction. Similarly, the unimputed XGBoost model achieved an AUC-ROC of 0.78 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days. Trends in performance on prospective validation sets were similar. At 7 days, acute kidney injury on admission, elevated LDH, tachypnea, and hyperglycemia were the strongest drivers of critical event prediction, while higher age, anion gap, and C-reactive protein were the strongest drivers of mortality prediction. CONCLUSIONS We externally and prospectively trained and validated machine learning models for mortality and critical events for patients with COVID-19 at different time horizons. These models identified at-risk patients and uncovered underlying relationships that predicted outcomes.
|
Research Support, N.I.H., Extramural |
5 |
133 |
13
|
Badgeley MA, Zech JR, Oakden-Rayner L, Glicksberg BS, Liu M, Gale W, McConnell MV, Percha B, Snyder TM, Dudley JT. Deep learning predicts hip fracture using confounding patient and healthcare variables. NPJ Digit Med 2019; 2:31. [PMID: 31304378 PMCID: PMC6550136 DOI: 10.1038/s41746-019-0105-1] [Citation(s) in RCA: 124] [Impact Index Per Article: 20.7] [Received: 10/24/2018] [Accepted: 03/05/2019] [Indexed: 01/31/2023] [Open Access]
Abstract
Hip fractures are a leading cause of death and disability among older adults. Hip fractures are also the most commonly missed diagnosis on pelvic radiographs, and delayed diagnosis leads to higher cost and worse outcomes. Computer-aided diagnosis (CAD) algorithms have shown promise for helping radiologists detect fractures, but the image features underpinning their predictions are notoriously difficult to understand. In this study, we trained deep-learning models on 17,587 radiographs to classify fracture, 5 patient traits, and 14 hospital process variables. All 20 variables could be individually predicted from a radiograph, with the best performances on scanner model (AUC = 1.00), scanner brand (AUC = 0.98), and whether the order was marked "priority" (AUC = 0.79). Fracture was predicted moderately well from the image (AUC = 0.78) and better when combining image features with patient data (AUC = 0.86, DeLong paired AUC comparison, p = 2e-9) or patient data plus hospital process features (AUC = 0.91, p = 1e-21). Fracture prediction on a test set that balanced fracture risk across patient variables was significantly lower than a random test set (AUC = 0.67, DeLong unpaired AUC comparison, p = 0.003); and on a test set with fracture risk balanced across patient and hospital process variables, the model performed randomly (AUC = 0.52, 95% CI 0.46-0.58), indicating that these variables were the main source of the model's fracture predictions. A single model that directly combines image features, patient, and hospital process data outperforms a Naive Bayes ensemble of an image-only model prediction, patient, and hospital process data. If CAD algorithms are inexplicably leveraging patient and process variables in their predictions, it is unclear how radiologists should interpret their predictions in the context of other known patient data. Further research is needed to illuminate deep-learning decision processes so that computers and clinicians can effectively cooperate.
|
research-article |
6 |
124 |
14
|
Norgeot B, Glicksberg BS, Trupin L, Lituiev D, Gianfrancesco M, Oskotsky B, Schmajuk G, Yazdany J, Butte AJ. Assessment of a Deep Learning Model Based on Electronic Health Record Data to Forecast Clinical Outcomes in Patients With Rheumatoid Arthritis. JAMA Netw Open 2019; 2:e190606. [PMID: 30874779 PMCID: PMC6484652 DOI: 10.1001/jamanetworkopen.2019.0606] [Citation(s) in RCA: 120] [Impact Index Per Article: 20.0] [Received: 10/15/2018] [Accepted: 01/17/2019] [Indexed: 12/12/2022] [Open Access]
Abstract
Importance Knowing the future condition of a patient would enable a physician to customize current therapeutic options to prevent disease worsening, but predicting that future condition requires sophisticated modeling and information. If artificial intelligence models were capable of forecasting future patient outcomes, they could be used to aid practitioners and patients in prognosticating outcomes or simulating potential outcomes under different treatment scenarios. Objective To assess the ability of an artificial intelligence system to prognosticate the state of disease activity of patients with rheumatoid arthritis (RA) at their next clinical visit. Design, Setting, and Participants This prognostic study included 820 patients with RA from rheumatology clinics at 2 distinct health care systems with different electronic health record platforms: a university hospital (UH) and a public safety-net hospital (SNH). The UH and SNH had substantially different patient populations and treatment patterns. The UH has records on approximately 1 million total patients starting in January 2012. The UH data for this study were accessed on July 1, 2017. The SNH has records on 65 000 unique individuals starting in January 2013. The SNH data for the study were collected on February 27, 2018. Exposures Structured data were extracted from the electronic health record, including exposures (medications), patient demographics, laboratories, and prior measures of disease activity. A longitudinal deep learning model was used to predict disease activity for patients with RA at their next rheumatology clinic visit and to evaluate interhospital performance and model interoperability strategies. Main Outcomes and Measures Model performance was quantified using the area under the receiver operating characteristic curve (AUROC). Disease activity in RA was measured using a composite index score. 
Results A total of 578 UH patients (mean [SD] age, 57 [15] years; 477 [82.5%] female; 296 [51.2%] white) and 242 SNH patients (mean [SD] age, 60 [15] years; 195 [80.6%] female; 30 [12.4%] white) were included in the study. Patients at the UH compared with those at the SNH were seen more frequently (median time between visits, 100 vs 180 days) and were more frequently prescribed higher-class medications (biologics) (364 [63.0%] vs 70 [28.9%]). At the UH, the model reached an AUROC of 0.91 (95% CI, 0.86-0.96) in a test cohort of 116 patients. The UH-trained model had an AUROC of 0.74 (95% CI, 0.65-0.83) in the SNH test cohort (n = 117) despite marked differences in the patient populations. In both settings, baseline prediction using each patient's most recent disease activity score had statistically random performance. Conclusions and Relevance The findings suggest that building accurate models to forecast complex disease outcomes using electronic health record data is possible and that these models can be shared across hospitals with diverse patient populations.
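The abstract above reports model performance as AUROC and describes the baseline as having "statistically random performance", which corresponds to an AUROC near 0.5. As a minimal sketch of what that metric measures (not the authors' evaluation code; the function name `auroc` and the toy data are illustrative assumptions), AUROC can be computed as the fraction of positive/negative pairs in which the positive case receives the higher score:

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison (Mann-Whitney formulation).

    labels: iterable of 0/1 outcomes; scores: predicted risk per case.
    Tied scores count as half a win. Illustrative helper, not from the paper.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up scores): 3 of the 4 positive/negative pairs are ordered correctly.
toy_auc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
# A predictor that gives every case the same score is indistinguishable from chance.
chance_auc = auroc([0, 1, 0, 1], [0.5, 0.5, 0.5, 0.5])
```

The pairwise form is quadratic in the number of cases, which is acceptable for test cohorts of roughly a hundred patients like those described; a rank-based implementation would be preferable for large datasets.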
|
Research Support, N.I.H., Extramural |
6 |
120 |
15
|
Livanos AE, Jha D, Cossarini F, Gonzalez-Reiche AS, Tokuyama M, Aydillo T, Parigi TL, Ladinsky MS, Ramos I, Dunleavy K, Lee B, Dixon RE, Chen ST, Martinez-Delgado G, Nagula S, Bruce EA, Ko HM, Glicksberg BS, Nadkarni G, Pujadas E, Reidy J, Naymagon S, Grinspan A, Ahmad J, Tankelevich M, Bram Y, Gordon R, Sharma K, Houldsworth J, Britton GJ, Chen-Liaw A, Spindler MP, Plitt T, Wang P, Cerutti A, Faith JJ, Colombel JF, Kenigsberg E, Argmann C, Merad M, Gnjatic S, Harpaz N, Danese S, Cordon-Cardo C, Rahman A, Schwartz RE, Kumta NA, Aghemo A, Bjorkman PJ, Petralia F, van Bakel H, Garcia-Sastre A, Mehandru S. Intestinal Host Response to SARS-CoV-2 Infection and COVID-19 Outcomes in Patients With Gastrointestinal Symptoms. Gastroenterology 2021; 160:2435-2450.e34. [PMID: 33676971 PMCID: PMC7931673 DOI: 10.1053/j.gastro.2021.02.056] [Citation(s) in RCA: 109] [Impact Index Per Article: 27.3] [Received: 11/20/2020] [Revised: 02/22/2021] [Accepted: 02/23/2021] [Indexed: 02/06/2023]
Abstract
BACKGROUND & AIMS Given that gastrointestinal (GI) symptoms are a prominent extrapulmonary manifestation of COVID-19, we investigated intestinal infection with SARS-CoV-2, its effect on pathogenesis, and clinical significance. METHODS Human intestinal biopsy tissues were obtained from patients with COVID-19 (n = 19) and uninfected control individuals (n = 10) for microscopic examination, cytometry by time of flight analyses, and RNA sequencing. Additionally, disease severity and mortality were examined in patients with and without GI symptoms in 2 large, independent cohorts of hospitalized patients in the United States (N = 634) and Europe (N = 287) using multivariate logistic regressions. RESULTS COVID-19 case patients and control individuals in the biopsy cohort were comparable for age, sex, rates of hospitalization, and relevant comorbid conditions. SARS-CoV-2 was detected in small intestinal epithelial cells by immunofluorescence staining or electron microscopy in 15 of 17 patients studied. High-dimensional analyses of GI tissues showed low levels of inflammation, including down-regulation of key inflammatory genes such as IFNG, CXCL8, CXCL2, and IL1B and reduced frequencies of proinflammatory dendritic cells compared with control individuals. Consistent with these findings, we found a significant reduction in disease severity and mortality in patients presenting with GI symptoms that was independent of sex, age, and comorbid illnesses and despite similar nasopharyngeal SARS-CoV-2 viral loads. Furthermore, there were reduced levels of key inflammatory proteins in circulation in patients with GI symptoms. CONCLUSIONS These data highlight the absence of a proinflammatory response in the GI tract despite detection of SARS-CoV-2. In parallel, reduced mortality in patients with COVID-19 presenting with GI symptoms was observed. A potential role of the GI tract in attenuating SARS-CoV-2-associated inflammation needs to be further examined.
|
Multicenter Study |
4 |
109 |
16
|
Somani S, Russak AJ, Richter F, Zhao S, Vaid A, Chaudhry F, De Freitas JK, Naik N, Miotto R, Nadkarni GN, Narula J, Argulian E, Glicksberg BS. Deep learning and the electrocardiogram: review of the current state-of-the-art. Europace 2021; 23:1179-1191. [PMID: 33564873 PMCID: PMC8350862 DOI: 10.1093/europace/euaa377] [Citation(s) in RCA: 101] [Impact Index Per Article: 25.3] [Received: 07/21/2020] [Accepted: 11/25/2020] [Indexed: 12/22/2022] Open
Abstract
In the recent decade, deep learning, a subset of artificial intelligence and machine learning, has been used to identify patterns in big healthcare datasets for disease phenotyping, event predictions, and complex decision making. Public datasets for electrocardiograms (ECGs) have existed since the 1980s and have been used for very specific tasks in cardiology, such as arrhythmia, ischemia, and cardiomyopathy detection. Recently, private institutions have begun curating large ECG databases that are orders of magnitude larger than the public databases for ingestion by deep learning models. These efforts have demonstrated not only improved performance and generalizability in these aforementioned tasks but also application to novel clinical scenarios. This review focuses on orienting the clinician towards fundamental tenets of deep learning, state-of-the-art prior to its use for ECG analysis, and current applications of deep learning on ECGs, as well as their limitations and future areas of improvement.
|
Journal Article |
4 |
101 |
17
|
Dobbyn A, Huckins LM, Boocock J, Sloofman LG, Glicksberg BS, Giambartolomei C, Hoffman GE, Perumal TM, Girdhar K, Jiang Y, Raj T, Ruderfer DM, Kramer RS, Pinto D, Akbarian S, Roussos P, Domenici E, Devlin B, Sklar P, Stahl EA, Sieberts SK. Landscape of Conditional eQTL in Dorsolateral Prefrontal Cortex and Co-localization with Schizophrenia GWAS. Am J Hum Genet 2018; 102:1169-1184. [PMID: 29805045 PMCID: PMC5993513 DOI: 10.1016/j.ajhg.2018.04.011] [Citation(s) in RCA: 89] [Impact Index Per Article: 12.7] [Received: 04/06/2017] [Accepted: 04/24/2018] [Indexed: 12/12/2022] Open
Abstract
Causal genes and variants within genome-wide association study (GWAS) loci can be identified by integrating GWAS statistics with expression quantitative trait loci (eQTL) and determining which variants underlie both GWAS and eQTL signals. Most analyses, however, consider only the marginal eQTL signal, rather than dissect this signal into multiple conditionally independent signals for each gene. Here we show that analyzing conditional eQTL signatures, which could be important under specific cellular or temporal contexts, leads to improved fine mapping of GWAS associations. Using genotypes and gene expression levels from post-mortem human brain samples (n = 467) reported by the CommonMind Consortium (CMC), we find that conditional eQTL are widespread; 63% of genes with primary eQTL also have conditional eQTL. In addition, genomic features associated with conditional eQTL are consistent with context-specific (e.g., tissue-, cell type-, or developmental time point-specific) regulation of gene expression. Integrating the 2014 Psychiatric Genomics Consortium schizophrenia (SCZ) GWAS and CMC primary and conditional eQTL data reveals 40 loci with strong evidence for co-localization (posterior probability > 0.8), including six loci with co-localization of conditional eQTL. Our co-localization analyses support previously reported genes, identify novel genes associated with schizophrenia risk, and provide specific hypotheses for their functional follow-up.
|
Research Support, N.I.H., Extramural |
7 |
89 |
18
|
Brin D, Sorin V, Vaid A, Soroush A, Glicksberg BS, Charney AW, Nadkarni G, Klang E. Comparing ChatGPT and GPT-4 performance in USMLE soft skill assessments. Sci Rep 2023; 13:16492. [PMID: 37779171 PMCID: PMC10543445 DOI: 10.1038/s41598-023-43436-9] [Citation(s) in RCA: 85] [Impact Index Per Article: 42.5] [Received: 07/26/2023] [Accepted: 09/23/2023] [Indexed: 10/03/2023] Open
Abstract
The United States Medical Licensing Examination (USMLE) has been a subject of performance study for artificial intelligence (AI) models. However, their performance on questions involving USMLE soft skills remains unexplored. This study aimed to evaluate ChatGPT and GPT-4 on USMLE questions involving communication skills, ethics, empathy, and professionalism. We used 80 USMLE-style questions involving soft skills, taken from the USMLE website and the AMBOSS question bank. A follow-up query was used to assess the models' consistency. The performance of the AI models was compared to that of previous AMBOSS users. GPT-4 outperformed ChatGPT, correctly answering 90% compared to ChatGPT's 62.5%. GPT-4 showed more confidence, not revising any responses, while ChatGPT modified its original answers 82.5% of the time. The performance of GPT-4 was higher than that of AMBOSS's past users. Both AI models, notably GPT-4, showed capacity for empathy, indicating AI's potential to meet the complex interpersonal, ethical, and professional demands intrinsic to the practice of medicine.
|
research-article |
2 |
85 |
19
|
Belbin GM, Cullina S, Wenric S, Soper ER, Glicksberg BS, Torre D, Moscati A, Wojcik GL, Shemirani R, Beckmann ND, Cohain A, Sorokin EP, Park DS, Ambite JL, Ellis S, Auton A, Bottinger EP, Cho JH, Loos RJF, Abul-Husn NS, Zaitlen NA, Gignoux CR, Kenny EE. Toward a fine-scale population health monitoring system. Cell 2021; 184:2068-2083.e11. [PMID: 33861964 DOI: 10.1016/j.cell.2021.03.034] [Citation(s) in RCA: 83] [Impact Index Per Article: 20.8] [Received: 09/23/2019] [Revised: 11/18/2020] [Accepted: 03/12/2021] [Indexed: 12/22/2022]
Abstract
Understanding population health disparities is an essential component of equitable precision health efforts. Epidemiology research often relies on definitions of race and ethnicity, but these population labels may not adequately capture disease burdens and environmental factors impacting specific sub-populations. Here, we propose a framework for repurposing data from electronic health records (EHRs) in concert with genomic data to explore the demographic ties that can impact disease burdens. Using data from a diverse biobank in New York City, we identified 17 communities sharing recent genetic ancestry. We observed 1,177 health outcomes that were statistically associated with a specific group and demonstrated significant differences in the segregation of genetic variants contributing to Mendelian diseases. We also demonstrated that fine-scale population structure can impact the prediction of complex disease risk within groups. This work reinforces the utility of linking genomic data to EHRs and provides a framework toward fine-scale monitoring of population health.
|
Research Support, N.I.H., Extramural |
4 |
83 |
20
|
Shameer K, Badgeley MA, Miotto R, Glicksberg BS, Morgan JW, Dudley JT. Translational bioinformatics in the era of real-time biomedical, health care and wellness data streams. Brief Bioinform 2017; 18:105-124. [PMID: 26876889 PMCID: PMC5221424 DOI: 10.1093/bib/bbv118] [Citation(s) in RCA: 80] [Impact Index Per Article: 10.0] [Received: 09/14/2015] [Revised: 11/27/2015] [Indexed: 01/01/2023] Open
Abstract
Monitoring and modeling biomedical, health care and wellness data from individuals and converging data on a population scale have tremendous potential to improve understanding of the transition from the healthy state of human physiology to the disease setting. Wellness monitoring devices and companion software applications capable of generating alerts and sharing data with health care providers or social networks are now available. The accessibility and clinical utility of such data for disease or wellness research are currently limited. Designing methods for streaming data capture, real-time data aggregation, machine learning, predictive analytics and visualization solutions to integrate wellness or health monitoring data elements with the electronic medical records (EMRs) maintained by health care providers permits better utilization. Integration of population-scale biomedical, health care and wellness data would help to stratify patients for active health management and to understand clinically asymptomatic patients and underlying illness trajectories. In this article, we discuss various health-monitoring devices, their ability to capture the unique state of health represented in a patient and their application in individualized diagnostics, prognosis, clinical or wellness intervention. We also discuss examples of translational bioinformatics approaches to integrating patient-generated data with existing EMRs, personal health records, patient portals and clinical data repositories. Briefly, translational bioinformatics methods, tools and resources are at the center of these advances in implementing real-time biomedical and health care analytics in the clinical setting. Furthermore, these advances are poised to play a significant role in clinical decision-making and implementation of data-driven medicine and wellness care.
|
research-article |
8 |
80 |
21
|
Hirten RP, Danieletto M, Tomalin L, Choi KH, Zweig M, Golden E, Kaur S, Helmus D, Biello A, Pyzik R, Charney A, Miotto R, Glicksberg BS, Levin M, Nabeel I, Aberg J, Reich D, Charney D, Bottinger EP, Keefer L, Suarez-Farinas M, Nadkarni GN, Fayad ZA. Use of Physiological Data From a Wearable Device to Identify SARS-CoV-2 Infection and Symptoms and Predict COVID-19 Diagnosis: Observational Study. J Med Internet Res 2021; 23:e26107. [PMID: 33529156 PMCID: PMC7901594 DOI: 10.2196/26107] [Citation(s) in RCA: 76] [Impact Index Per Article: 19.0] [Received: 11/27/2020] [Revised: 01/14/2021] [Accepted: 01/29/2021] [Indexed: 12/29/2022] Open
Abstract
BACKGROUND Changes in autonomic nervous system function, characterized by heart rate variability (HRV), have been associated with infection and observed prior to its clinical identification. OBJECTIVE We performed an evaluation of HRV collected by a wearable device to identify and predict COVID-19 and its related symptoms. METHODS Health care workers in the Mount Sinai Health System were prospectively followed in an ongoing observational study using the custom Warrior Watch Study app, which was downloaded to their smartphones. Participants wore an Apple Watch for the duration of the study, measuring HRV throughout the follow-up period. Surveys assessing infection and symptom-related questions were obtained daily. RESULTS Using a mixed-effect cosinor model, the mean amplitude of the circadian pattern of the standard deviation of the interbeat interval of normal sinus beats (SDNN), an HRV metric, differed between subjects with and without COVID-19 (P=.006). The mean amplitude of this circadian pattern differed between individuals during the 7 days before and the 7 days after a COVID-19 diagnosis compared to this metric during uninfected time periods (P=.01). Significant changes in the mean and amplitude of the circadian pattern of the SDNN were observed between the first day of reporting a COVID-19-related symptom and all other symptom-free days (P=.01). CONCLUSIONS Longitudinally collected HRV metrics from a commonly worn commercial wearable device (Apple Watch) can predict the diagnosis of COVID-19 and identify COVID-19-related symptoms. Prior to the diagnosis of COVID-19 by nasal swab polymerase chain reaction testing, significant changes in HRV were observed, demonstrating the predictive ability of this metric to identify COVID-19 infection.
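For readers unfamiliar with cosinor analysis, the "mean" and "amplitude" of a circadian pattern referred to above are parameters of a cosine curve fitted to the measurements over time. The textbook single-component form is sketched below; this is the generic model, not necessarily the exact specification used in the study, and the mixed-effect version additionally lets parameters vary by participant. The symbols $M$ (mean level, or MESOR), $A$ (amplitude), $\varphi$ (phase), and $\tau$ (period, 24 h for a circadian rhythm) are notation introduced here for illustration.

```latex
\begin{align}
  \mathrm{SDNN}(t) &= M + A\cos\!\left(\frac{2\pi t}{\tau} + \varphi\right) + \varepsilon(t) \\
                   &= M + \beta\cos\!\left(\frac{2\pi t}{\tau}\right)
                        + \gamma\sin\!\left(\frac{2\pi t}{\tau}\right) + \varepsilon(t),
  \qquad \beta = A\cos\varphi,\quad \gamma = -A\sin\varphi, \\
  A &= \sqrt{\beta^{2} + \gamma^{2}} .
\end{align}
```

The second line is what makes the model practical: it is linear in $M$, $\beta$, and $\gamma$, so those coefficients can be estimated by ordinary (or mixed-effect) linear regression and the amplitude recovered afterward.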
|
Observational Study |
4 |
76 |
22
|
Landi I, Glicksberg BS, Lee HC, Cherng S, Landi G, Danieletto M, Dudley JT, Furlanello C, Miotto R. Deep representation learning of electronic health records to unlock patient stratification at scale. NPJ Digit Med 2020; 3:96. [PMID: 32699826 PMCID: PMC7367859 DOI: 10.1038/s41746-020-0301-z] [Citation(s) in RCA: 68] [Impact Index Per Article: 13.6] [Received: 04/09/2020] [Accepted: 06/17/2020] [Indexed: 12/15/2022] Open
Abstract
Deriving disease subtypes from electronic health records (EHRs) can guide next-generation personalized medicine. However, challenges in summarizing and representing patient data prevent widespread practice of scalable EHR-based stratification analysis. Here we present an unsupervised framework based on deep learning to process heterogeneous EHRs and derive patient representations that can efficiently and effectively enable patient stratification at scale. We considered EHRs of 1,608,741 patients from a diverse hospital cohort comprising a total of 57,464 clinical concepts. We introduce a representation learning model based on word embeddings, convolutional neural networks, and autoencoders (i.e., ConvAE) to transform patient trajectories into low-dimensional latent vectors. We evaluated these representations as broadly enabling patient stratification by applying hierarchical clustering to different multi-disease and disease-specific patient cohorts. ConvAE significantly outperformed several baselines in a clustering task to identify patients with different complex conditions, with 2.61 entropy and 0.31 purity average scores. When applied to stratify patients within a certain condition, ConvAE led to various clinically relevant subtypes for different disorders, including type 2 diabetes, Parkinson's disease, and Alzheimer's disease, largely related to comorbidities, disease progression, and symptom severity. With these results, we demonstrate that ConvAE can generate patient representations that lead to clinically meaningful insights. This scalable framework can help better understand varying etiologies in heterogeneous sub-populations and unlock patterns for EHR-based research in the realm of personalized medicine.
|
research-article |
5 |
68 |
23
|
Somani SS, Richter F, Fuster V, De Freitas JK, Naik N, Sigel K, Bottinger EP, Levin MA, Fayad Z, Just AC, Charney AW, Zhao S, Glicksberg BS, Lala A, Nadkarni GN. Characterization of Patients Who Return to Hospital Following Discharge from Hospitalization for COVID-19. J Gen Intern Med 2020; 35:2838-2844. [PMID: 32815060 PMCID: PMC7437962 DOI: 10.1007/s11606-020-06120-6] [Citation(s) in RCA: 66] [Impact Index Per Article: 13.2] [Received: 05/19/2020] [Accepted: 08/06/2020] [Indexed: 11/29/2022]
Abstract
BACKGROUND Data on patients with coronavirus disease 2019 (COVID-19) who return to hospital after discharge are scarce. Characterization of these patients may inform post-hospitalization care. OBJECTIVE To describe clinical characteristics of patients with COVID-19 who returned to the emergency department (ED) or required readmission within 14 days of discharge. DESIGN Retrospective cohort study of SARS-CoV-2-positive patients with index hospitalization between February 27 and April 12, 2020, with ≥ 14-day follow-up. Significance was defined as P < 0.05 after multiplying P by 125 study-wide comparisons. PARTICIPANTS Hospitalized patients with confirmed SARS-CoV-2 discharged alive from five New York City hospitals. MAIN MEASURES Readmission or return to ED following discharge. RESULTS Of 2864 discharged patients, 103 (3.6%) returned for emergency care after a median of 4.5 days, with 56 requiring inpatient readmission. The most common reason for return was respiratory distress (50%). Compared with patients who did not return, there were higher proportions of COPD (6.8% vs 2.9%) and hypertension (36% vs 22.1%) among those who returned. Patients who returned also had a shorter median length of stay (LOS) during index hospitalization (4.5 [2.9, 9.1] vs 6.7 [3.5, 11.5] days; adjusted P = 0.006), and were less likely to have required intensive care on index hospitalization (5.8% vs 19%; adjusted P = 0.001). A trend towards association between absence of in-hospital treatment-dose anticoagulation on index admission and return to hospital was also observed (20.9% vs 30.9%, adjusted P = 0.06). On readmission, rates of intensive care and death were 5.8% and 3.6%, respectively. CONCLUSIONS Return to hospital after admission for COVID-19 was infrequent within 14 days of discharge. The most common cause for return was respiratory distress. Patients who returned were more likely to have COPD and hypertension, a shorter LOS on index hospitalization, and lower rates of in-hospital treatment-dose anticoagulation. Future studies should focus on whether these comorbid conditions, longer LOS, and anticoagulation are associated with reduced readmissions.
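The significance rule stated in the design (P < 0.05 after multiplying P by 125 study-wide comparisons) is a Bonferroni-style correction. A minimal sketch of that arithmetic follows; the function name and the example P values are illustrative assumptions rather than figures from the study, and capping the adjusted value at 1 is the usual convention, not something the abstract states.

```python
def bonferroni_adjust(p_values, n_comparisons):
    """Multiply each raw P value by the number of comparisons, capped at 1.0.

    Illustrative helper: the cap at 1.0 is the conventional choice and is an
    assumption here, since the abstract only describes the multiplication.
    """
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return [min(1.0, p * n_comparisons) for p in p_values]


# Hypothetical raw P values, adjusted for 125 study-wide comparisons.
raw = [0.00004, 0.0003, 0.01]
adjusted = bonferroni_adjust(raw, 125)
significant = [p < 0.05 for p in adjusted]
```

Under this rule a raw P value must fall below 0.05 / 125 = 0.0004 to be declared significant, which is why only strong associations survive the adjustment.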
|
Multicenter Study |
5 |
66 |
24
|
Vaid A, Jaladanki SK, Xu J, Teng S, Kumar A, Lee S, Somani S, Paranjpe I, De Freitas JK, Wanyan T, Johnson KW, Bicak M, Klang E, Kwon YJ, Costa A, Zhao S, Miotto R, Charney AW, Böttinger E, Fayad ZA, Nadkarni GN, Wang F, Glicksberg BS. Federated Learning of Electronic Health Records to Improve Mortality Prediction in Hospitalized Patients With COVID-19: Machine Learning Approach. JMIR Med Inform 2021; 9:e24207. [PMID: 33400679 PMCID: PMC7842859 DOI: 10.2196/24207] [Citation(s) in RCA: 65] [Impact Index Per Article: 16.3] [Received: 09/09/2020] [Revised: 10/23/2020] [Accepted: 12/14/2020] [Indexed: 12/18/2022] Open
Abstract
BACKGROUND Machine learning models require large datasets that may be siloed across different health care institutions. Machine learning studies that focus on COVID-19 have been limited to single-hospital data, which limits model generalizability. OBJECTIVE We aimed to use federated learning, a machine learning technique that avoids locally aggregating raw clinical data across multiple institutions, to predict mortality in hospitalized patients with COVID-19 within 7 days. METHODS Patient data were collected from the electronic health records of 5 hospitals within the Mount Sinai Health System. Logistic regression with L1 regularization/least absolute shrinkage and selection operator (LASSO) and multilayer perceptron (MLP) models were trained by using local data at each site. We developed a pooled model with combined data from all 5 sites, and a federated model that only shared parameters with a central aggregator. RESULTS The federated LASSO model outperformed the local LASSO model at 3 hospitals, and the federated MLP model performed better than the local MLP model at all 5 hospitals, as determined by the area under the receiver operating characteristic curve. The pooled LASSO model outperformed the federated LASSO model at all hospitals, and the federated MLP model outperformed the pooled MLP model at 2 hospitals. CONCLUSIONS The federated learning of COVID-19 electronic health record data shows promise in developing robust predictive models without compromising patient privacy.
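The abstract describes a federated model in which sites share only parameters with a central aggregator. One common way to combine such parameters is a sample-size-weighted average (often called federated averaging). The sketch below assumes that scheme for illustration; the abstract does not say which aggregation rule was used, and the function name, site sizes, and parameter values are all hypothetical.

```python
def federated_average(site_params, site_sizes):
    """Combine per-site parameter vectors into one global vector.

    site_params: list of equal-length lists of floats, one per hospital.
    site_sizes:  number of training patients at each hospital (the weights).
    Weighted averaging is an assumed aggregation rule, not one taken from
    the paper; raw patient records never leave the sites in this setup.
    """
    if len(site_params) != len(site_sizes) or not site_params:
        raise ValueError("need one size per site and at least one site")
    n_params = len(site_params[0])
    if any(len(p) != n_params for p in site_params):
        raise ValueError("all sites must report the same number of parameters")
    total = sum(site_sizes)
    return [
        sum(params[i] * size for params, size in zip(site_params, site_sizes)) / total
        for i in range(n_params)
    ]


# Hypothetical round: two sites report coefficients for the same two features.
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In a full training loop the aggregator would send `global_params` back to every site, each site would continue training locally from that starting point, and the cycle would repeat for several rounds.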
|
research-article |
4 |
65 |
25
|
Uzilov AV, Ding W, Fink MY, Antipin Y, Brohl AS, Davis C, Lau CY, Pandya C, Shah H, Kasai Y, Powell J, Micchelli M, Castellanos R, Zhang Z, Linderman M, Kinoshita Y, Zweig M, Raustad K, Cheung K, Castillo D, Wooten M, Bourzgui I, Newman LC, Deikus G, Mathew B, Zhu J, Glicksberg BS, Moe AS, Liao J, Edelmann L, Dudley JT, Maki RG, Kasarskis A, Holcombe RF, Mahajan M, Hao K, Reva B, Longtine J, Starcevic D, Sebra R, Donovan MJ, Li S, Schadt EE, Chen R. Development and clinical application of an integrative genomic approach to personalized cancer therapy. Genome Med 2016; 8:62. [PMID: 27245685 PMCID: PMC4888213 DOI: 10.1186/s13073-016-0313-0] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.8] [Received: 02/20/2016] [Accepted: 05/04/2016] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND Personalized therapy provides the best outcome of cancer care and its implementation in the clinic has been greatly facilitated by recent convergence of enormous progress in basic cancer research, rapid advancement of new tumor profiling technologies, and an expanding compendium of targeted cancer therapeutics. METHODS We developed a personalized cancer therapy (PCT) program in a clinical setting, using an integrative genomics approach to fully characterize the complexity of each tumor. We carried out whole exome sequencing (WES) and single-nucleotide polymorphism (SNP) microarray genotyping on DNA from tumor and patient-matched normal specimens, as well as RNA sequencing (RNA-Seq) on available frozen specimens, to identify somatic (tumor-specific) mutations, copy number alterations (CNAs), gene expression changes, gene fusions, and also germline variants. To provide high sensitivity in known cancer mutation hotspots, Ion AmpliSeq Cancer Hotspot Panel v2 (CHPv2) was also employed. We integrated the resulting data with cancer knowledge bases and developed a specific workflow for each cancer type to improve interpretation of genomic data. RESULTS We returned genomics findings to 46 patients and their physicians describing somatic alterations and predicting drug response, toxicity, and prognosis. A mean of 17.3 cancer-relevant somatic mutations per patient were identified, 13.3-fold, 6.9-fold, and 4.7-fold more than could have been detected using CHPv2, Oncomine Cancer Panel (OCP), and FoundationOne, respectively. Our approach delineated the underlying genetic drivers at the pathway level and provided meaningful predictions of therapeutic efficacy and toxicity. Actionable alterations were found in 91% of patients (mean 4.9 per patient, including somatic mutations, copy number alterations, gene expression alterations, and germline variants), a 7.5-fold, 2.0-fold, and 1.9-fold increase over what could have been uncovered by CHPv2, OCP, and FoundationOne, respectively. The findings altered the course of treatment in four cases. CONCLUSIONS These results show that a comprehensive, integrative genomic approach as outlined above significantly enhanced genomics-based PCT strategies.
|
research-article |
9 |
61 |