Reference Citation Analysis

Total Articles

18
(from Reference Citation Analysis)

Article PDFs (7)

Cited by > 0 (11)

Searched Name

Zachary H. Strasser

Azhir A, Hügel J, Tian J, Cheng J, Bassett IV, Bell DS, Bernstam EV, Farhat MR, Henderson DW, Lau ES, Morris M, Semenov YR, Triant VA, Visweswaran S, Strasser ZH, Klann JG, Murphy SN, Estiri H. Precision Phenotyping for Curating Research Cohorts of Patients with Post-Acute Sequelae of COVID-19 (PASC) as a Diagnosis of Exclusion. medRxiv 2024:2024.04.13.24305771. [PMID: 38699316 PMCID: PMC11065031 DOI: 10.1101/2024.04.13.24305771] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2024]

Abstract

Scalable identification of patients with the post-acute sequelae of COVID-19 (PASC) is challenging due to a lack of reproducible precision phenotyping algorithms and the suboptimal accuracy, demographic biases, and underestimation of the PASC diagnosis code (ICD-10 U09.9). In a retrospective case-control study, we developed a precision phenotyping algorithm for identifying research cohorts of PASC patients, defined as a diagnosis of exclusion. We used longitudinal electronic health records (EHR) data from over 295 thousand patients from 14 hospitals and 20 community health centers in Massachusetts. The algorithm employs an attention mechanism to exclude sequelae that prior conditions can explain. We performed independent chart reviews to tune and validate our precision phenotyping algorithm. Our PASC phenotyping algorithm improves precision and prevalence estimation and reduces bias in identifying Long COVID patients compared to the U09.9 diagnosis code. Our algorithm identified a PASC research cohort of over 24 thousand patients (compared to about 6 thousand when using the U09.9 diagnosis code), with a 79.9 percent precision (compared to 77.8 percent from the U09.9 diagnosis code). Our estimated prevalence of PASC was 22.8 percent, which is close to the national estimates for the region. We also provide an in-depth analysis outlining the clinical attributes, encompassing identified lingering effects by organ, comorbidity profiles, and temporal differences in the risk of PASC. The PASC phenotyping method presented in this study boasts superior precision, accurately gauges the prevalence of PASC without underestimating it, and exhibits less bias in pinpointing Long COVID patients. The PASC cohort derived from our algorithm will serve as a springboard for delving into Long COVID's genetic, metabolomic, and clinical intricacies, surmounting the constraints of recent PASC cohort studies, which were hampered by their limited size and available outcome data.
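The precision figures quoted above (79.9 percent versus 77.8 percent) are positive predictive values from chart review: the share of algorithm-flagged patients whom reviewers confirmed as true cases. A minimal sketch of that arithmetic follows. The function name and the review counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Positive predictive value: confirmed cases / all flagged cases."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("precision is undefined when no patients are flagged")
    return true_positives / flagged


# Hypothetical chart-review sample (NOT study data): 1,000 flagged charts,
# of which reviewers confirmed 799 as PASC.
confirmed = 799
not_confirmed = 201
ppv = precision(confirmed, not_confirmed)
print(f"precision = {ppv:.1%}")
```

Precision alone says nothing about missed cases, which is why the abstract also compares cohort size and estimated prevalence against the U09.9 code.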

Foer D, Strasser ZH, Cui J, Cahill KN, Boyce JA, Murphy SN, Karlson EW. Reply to Li et al. Am J Respir Crit Care Med 2023;208:1346-1347. [PMID: 37855723 PMCID: PMC10765385 DOI: 10.1164/rccm.202310-1721le] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Accepted: 10/18/2023] [Indexed: 10/20/2023]

Foer D, Strasser ZH, Cui J, Cahill KN, Boyce JA, Murphy SN, Karlson EW. Association of GLP-1 Receptor Agonists with Chronic Obstructive Pulmonary Disease Exacerbations among Patients with Type 2 Diabetes. Am J Respir Crit Care Med 2023;208:1088-1100. [PMID: 37647574 PMCID: PMC10867930 DOI: 10.1164/rccm.202303-0491oc] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/15/2023] [Accepted: 08/30/2023] [Indexed: 09/01/2023]

Abstract

Rationale: Patients with chronic obstructive pulmonary disease (COPD) and type 2 diabetes (T2D) have worse clinical outcomes compared with patients without metabolic dysregulation. GLP-1 (glucagon-like peptide 1) receptor agonists (GLP-1RAs) reduce asthma exacerbation risk and improve FVC in patients with COPD. Objectives: To determine whether GLP-1RA use is associated with reduced COPD exacerbation rates, and severe and moderate exacerbation risk, compared with other T2D therapies. Methods: A retrospective, observational, electronic health records-based study was conducted using an active comparator, new-user design of 1,642 patients with COPD in a U.S. health system from 2012 to 2022. The COPD cohort was identified using a previously validated machine learning algorithm that includes a natural language processing tool. Exposures were defined as prescriptions for GLP-1RAs (reference group), DPP-4 (dipeptidyl peptidase 4) inhibitors (DPP-4is), SGLT2 (sodium-glucose cotransporter 2) inhibitors, or sulfonylureas. Measurements and Main Results: Unadjusted COPD exacerbation counts were lower in GLP-1RA users. Adjusted exacerbation rates were significantly higher in DPP-4i (incidence rate ratio, 1.48 [95% confidence interval, 1.08-2.04]; P = 0.02) and sulfonylurea (incidence rate ratio, 2.09 [95% confidence interval, 1.62-2.69]; P < 0.0001) users compared with GLP-1RA users. GLP-1RA use was also associated with significantly reduced risk of severe exacerbations compared with DPP-4i and sulfonylurea use, and of moderate exacerbations compared with sulfonylurea use. After adjustment for clinical covariates, moderate exacerbation risk was also lower in GLP-1RA users compared with DPP-4i users. No statistically significant difference in exacerbation outcomes was seen between GLP-1RA and SGLT2 inhibitor users. Conclusions: Prospective studies of COPD exacerbations in patients with comorbid T2D are warranted. Additional research may elucidate the mechanisms underlying these observed associations with T2D medications.
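For readers unfamiliar with the incidence rate ratio (IRR) reported above, the sketch below shows the crude, unadjusted version: the event rate in a comparator group divided by the rate in the reference group, with a Wald interval computed on the log scale. This is an illustration only. The abstract reports adjusted IRRs, which come from a regression model that this sketch does not reproduce, and the function name and all counts are hypothetical.

```python
from math import exp, log, sqrt


def crude_irr(events_cmp, time_cmp, events_ref, time_ref, z=1.96):
    """Crude incidence rate ratio (comparator vs. reference) with a Wald CI.

    Rates are events per unit of person-time. The standard error of
    log(IRR) is approximated by sqrt(1/events_cmp + 1/events_ref).
    """
    if min(events_cmp, events_ref) <= 0 or min(time_cmp, time_ref) <= 0:
        raise ValueError("events and person-time must be positive")
    irr = (events_cmp / time_cmp) / (events_ref / time_ref)
    se = sqrt(1 / events_cmp + 1 / events_ref)
    return irr, exp(log(irr) - z * se), exp(log(irr) + z * se)


# Hypothetical counts (NOT study data): 30 exacerbations over 100
# person-years in a comparator group, 20 over 100 in the reference group.
irr, lower, upper = crude_irr(30, 100.0, 20, 100.0)
print(f"IRR = {irr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

An IRR above 1 means the comparator group had a higher exacerbation rate than the reference group, which is how the DPP-4i and sulfonylurea results in the abstract read relative to GLP-1RA users.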

Dagliati A, Strasser ZH, Hossein Abad ZS, Klann JG, Wagholikar KB, Mesa R, Visweswaran S, Morris M, Luo Y, Henderson DW, Samayamuthu MJ, Tan BW, Verdy G, Omenn GS, Xia Z, Bellazzi R, Murphy SN, Holmes JH, Estiri H. Characterization of long COVID temporal sub-phenotypes by distributed representation learning from electronic health record data: a cohort study. EClinicalMedicine 2023;64:102210. [PMID: 37745021 PMCID: PMC10511779 DOI: 10.1016/j.eclinm.2023.102210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Revised: 08/29/2023] [Accepted: 08/29/2023] [Indexed: 09/26/2023]

Affiliation(s)

Arianna Dagliati Department of Electrical Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
Zachary H. Strasser Department of Medicine, Massachusetts General Hospital, Boston, United States
Zahra Shakeri Hossein Abad University of Toronto, Dalla Lana School of Public Health, Toronto, Canada
Jeffrey G. Klann Department of Medicine, Massachusetts General Hospital, Boston, United States
Kavishwar B. Wagholikar Department of Medicine, Massachusetts General Hospital, Boston, United States
Rebecca Mesa Department of Electrical Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
Shyam Visweswaran Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, United States
Michele Morris Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, United States
Yuan Luo Department of Preventive Medicine, Northwestern University, Chicago, United States
Darren W. Henderson University of Kentucky, Center for Clinical and Translational Science, Lexington, United States
Malarkodi Jebathilagam Samayamuthu Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, United States
Bryce W.Q. Tan National University Hospital, Singapore Department of Medicine, Singapore
Guillaume Verdy Bordeaux University Hospital, IAM Unit, Bordeaux, France
Gilbert S. Omenn University of Michigan, Department of Computational Medicine and Bioinformatics, Internal Medicine, Human Genetics, and School of Public Health, Ann Arbor, United States
Zongqi Xia University of Pittsburgh Department of Neurology, Pittsburgh, United States
Riccardo Bellazzi Department of Electrical Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
Shawn N. Murphy Department of Neurology, Massachusetts General Hospital, Boston, United States
John H. Holmes University of Pennsylvania Perelman School of Medicine, Department of Biostatistics, Epidemiology, and Informatics, Institute for Biomedical Informatics, Philadelphia, United States
Hossein Estiri Department of Medicine, Massachusetts General Hospital, Boston, United States

Strasser ZH, Dagliati A, Shakeri Hossein Abad Z, Klann JG, Wagholikar KB, Mesa R, Visweswaran S, Morris M, Luo Y, Henderson DW, Samayamuthu MJ, Omenn GS, Xia Z, Holmes JH, Estiri H, Murphy SN. A retrospective cohort analysis leveraging augmented intelligence to characterize long COVID in the electronic health record: A precision medicine framework. PLOS Digit Health 2023;2:e0000301. [PMID: 37490472 PMCID: PMC10368277 DOI: 10.1371/journal.pdig.0000301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2022] [Accepted: 06/16/2023] [Indexed: 07/27/2023]

Abstract

Physical and psychological symptoms lasting months following an acute COVID-19 infection are now recognized as post-acute sequelae of COVID-19 (PASC). Accurate tools for identifying such patients could enhance screening capabilities for the recruitment for clinical trials, improve the reliability of disease estimates, and allow for more accurate downstream cohort analysis. In this retrospective cohort study, we analyzed the EHR of hospitalized COVID-19 patients across three healthcare systems to develop a pipeline for better identifying patients with persistent PASC symptoms (dyspnea, fatigue, or joint pain) after their SARS-CoV-2 infection. We implemented distributed representation learning powered by the Machine Learning for modeling Health Outcomes (MLHO) to identify novel EHR features that could suggest PASC symptoms outside of typical diagnosis codes. MLHO applies an entropy-based feature selection and boosting algorithms for representation mining. These improved definitions were then used for estimating PASC among hospitalized patients. 30,422 hospitalized patients were diagnosed with COVID-19 across three healthcare systems between March 13, 2020 and February 28, 2021. The mean age of the population was 62.3 years (SD, 21.0 years) and 15,124 (49.7%) were female. We implemented the distributed representation learning technique to augment PASC definitions. These definitions were found to have positive predictive values of 0.73, 0.74, and 0.91 for dyspnea, fatigue, and joint pain, respectively. We estimated that 25 percent (CI 95%: 6-48), 11 percent (CI 95%: 6-15), and 13 percent (CI 95%: 8-17) of hospitalized COVID-19 patients will have dyspnea, fatigue, and joint pain, respectively, 3 months or longer after a COVID-19 diagnosis. We present a validated framework for screening and identifying patients with PASC in the EHR and then use the tool to estimate its prevalence among hospitalized COVID-19 patients.

Affiliation(s)

Zachary H. Strasser Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts, United States of America
Arianna Dagliati Department of Electrical Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
Zahra Shakeri Hossein Abad Institute of Health Policy, Management and Evaluation, Dalla Lana School of Public Health, University of Toronto, Toronto, Canada
Jeffrey G. Klann Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts, United States of America
Kavishwar B. Wagholikar Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts, United States of America
Rebecca Mesa Department of Electrical Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
Shyam Visweswaran Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States
Michele Morris Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States
Yuan Luo Department of Preventive Medicine, Northwestern University, Chicago, Illinois, United States of America
Darren W. Henderson Center for Clinical and Translational Science, University of Kentucky, Lexington, Kentucky, United States of America
Malarkodi Jebathilagam Samayamuthu Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States
The Consortium for Clinical Characterization of COVID-19 by EHR (4CE) Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, United States of America
Gilbert S. Omenn Dept of Computational Medicine & Bioinformatics, Internal Medicine, Human Genetics, and School of Public Health, University of Michigan, Ann Arbor, Michigan, United States of America
Zongqi Xia Department of Neurology, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
John H. Holmes Department of Biostatistics, Epidemiology, and Informatics; Institute for Biomedical Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, United States of America
Hossein Estiri Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts, United States of America
Shawn N. Murphy Department of Neurology, Massachusetts General Hospital, Boston, Massachusetts, United States of America

Azhir A, Strasser ZH, Murphy SN, Estiri H. Severity of COVID-19-Related Illness in Massachusetts, July 2021 to December 2022. JAMA Netw Open 2023;6:e238203. [PMID: 37052921 PMCID: PMC10102873 DOI: 10.1001/jamanetworkopen.2023.8203] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/14/2023]

Tan ALM, Getzen EJ, Hutch MR, Strasser ZH, Gutiérrez-Sacristán A, Le TT, Dagliati A, Morris M, Hanauer DA, Moal B, Bonzel CL, Yuan W, Chiudinelli L, Das P, Zhang HG, Aronow BJ, Avillach P, Brat GA, Cai T, Hong C, La Cava WG, Hooi Will Loh H, Luo Y, Murphy SN, Yuan Hgiam K, Omenn GS, Patel LP, Jebathilagam Samayamuthu M, Shriver ER, Shakeri Hossein Abad Z, Tan BWL, Visweswaran S, Wang X, Weber GM, Xia Z, Verdy B, Long Q, Mowery DL, Holmes JH. Informative missingness: What can we learn from patterns in missing laboratory data in the electronic health record? J Biomed Inform 2023;139:104306. [PMID: 36738870 PMCID: PMC10849195 DOI: 10.1016/j.jbi.2023.104306] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2022] [Revised: 01/21/2023] [Accepted: 01/29/2023] [Indexed: 02/05/2023]

Abstract

BACKGROUND

In electronic health records, patterns of missing laboratory test results could capture patients' course of disease as well as reflect clinicians' concerns or worries about possible conditions. These patterns are often understudied and overlooked. This study aims to identify informative patterns of missingness among laboratory data collected across 15 healthcare system sites in three countries for COVID-19 inpatients.

METHODS

We collected and analyzed demographic, diagnosis, and laboratory data for 69,939 patients with positive COVID-19 PCR tests across three countries from 1 January 2020 through 30 September 2021. We analyzed missing laboratory measurements across sites, missingness stratification by demographic variables, temporal trends of missingness, correlations between labs based on missingness indicators over time, and clustering of groups of labs based on their missingness/ordering pattern.

RESULTS

With these analyses, we identified mapping issues faced in seven out of 15 sites. We also identified nuances in data collection and variable definition for the various sites. Temporal trend analyses may support the use of laboratory test result missingness patterns in identifying severe COVID-19 patients. Lastly, using missingness patterns, we determined relationships between various labs that reflect clinical behaviors.

CONCLUSION

In this work, we use computational approaches to relate missingness patterns to hospital treatment capacity and highlight the heterogeneity of looking at COVID-19 over time and at multiple sites, where there might be different phases, policies, etc. Changes in missingness could suggest a change in a patient's condition, and patterns of missingness among laboratory measurements could potentially identify clinical outcomes. This allows sites to consider missing data as informative to analyses and help researchers identify which sites are better poised to study particular questions.
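One of the analyses named in the Methods, correlating labs through their missingness indicators, can be sketched as follows. Each lab value is turned into a 0/1 flag (1 when the result is absent), and the flags for two labs are compared with a Pearson correlation. This is an assumed, simplified reading of the approach, not the authors' pipeline; the lab names, records, and helper functions are all hypothetical.

```python
from math import sqrt


def missingness_indicators(records, labs):
    """Map each lab to a list of flags: 1 if the value is missing, else 0."""
    return {lab: [1 if rec.get(lab) is None else 0 for rec in records]
            for lab in labs}


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # a lab that is always or never missing
    return cov / sqrt(var_x * var_y)


# Hypothetical patient-day records (NOT study data); None marks a lab
# that was not resulted on that day.
records = [
    {"crp": 12.0, "d_dimer": 0.8, "creatinine": 1.1},
    {"crp": None, "d_dimer": None, "creatinine": 0.9},
    {"crp": 30.5, "d_dimer": 1.4, "creatinine": None},
    {"crp": None, "d_dimer": None, "creatinine": 1.0},
]
labs = ["crp", "d_dimer", "creatinine"]
flags = missingness_indicators(records, labs)

r_crp_ddimer = pearson(flags["crp"], flags["d_dimer"])
r_crp_creatinine = pearson(flags["crp"], flags["creatinine"])
print(r_crp_ddimer, r_crp_creatinine)
```

In this toy data the two inflammatory markers are always ordered (or skipped) together, so their missingness flags correlate perfectly, while creatinine follows a different ordering pattern. Correlated missingness of this kind is what the abstract describes as reflecting clinical behavior.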

Affiliation(s)

Amelia L M Tan Harvard Medical School, Cambridge, MA, USA
Emily J Getzen University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
Meghan R Hutch Northwestern University, Chicago, IL, USA
Zachary H Strasser Massachusetts General Hospital, Boston, MA, USA
Alba Gutiérrez-Sacristán Harvard Medical School, Cambridge, MA, USA
Trang T Le University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
Arianna Dagliati University of Pavia, Pavia, Italy
Michele Morris University of Pittsburgh, Pittsburgh, PA, USA
David A Hanauer University of Michigan, Ann Arbor, MI, USA
Bertrand Moal Bordeaux University Hospital, Talence, France
Clara-Lea Bonzel Harvard Medical School, Cambridge, MA, USA
William Yuan Harvard Medical School, Cambridge, MA, USA
Lorenzo Chiudinelli ASST Papa Giovanni XXIII, Bergamo, Italy
Priam Das Harvard Medical School, Cambridge, MA, USA
Harrison G Zhang Harvard Medical School, Cambridge, MA, USA
Bruce J Aronow Cincinnati Children's Hospital Medical Center, University of Cincinnati, Cincinnati, OH, USA
Paul Avillach Harvard Medical School, Cambridge, MA, USA
Gabriel A Brat Harvard Medical School, Cambridge, MA, USA
Tianxi Cai Harvard Medical School, Cambridge, MA, USA
Chuan Hong Harvard Medical School, Cambridge, MA, USA; Duke University, Durham, NC, USA
William G La Cava Harvard Medical School, Cambridge, MA, USA; Boston Children's Hospital, Boston, MA, USA
He Hooi Will Loh National University Health Systems, Singapore
Yuan Luo Northwestern University, Chicago, IL, USA
Shawn N Murphy Massachusetts General Hospital, Boston, MA, USA
Kee Yuan Ngiam National University Health Systems, Singapore
Gilbert S Omenn University of Pittsburgh, Pittsburgh, PA, USA
Lav P Patel University of Kansas Medical Center, United States
Malarkodi Jebathilagam Samayamuthu University of Pittsburgh, Pittsburgh, PA, USA
Emily R Shriver University of Pennsylvania Health System, Philadelphia, PA, USA
Zahra Shakeri Hossein Abad Harvard Medical School, Cambridge, MA, USA
Byorn W L Tan National University Health Systems, Singapore
Shyam Visweswaran University of Pittsburgh, Pittsburgh, PA, USA
Xuan Wang Harvard Medical School, Cambridge, MA, USA
Griffin M Weber Harvard Medical School, Cambridge, MA, USA
Zongqi Xia University of Pittsburgh, Pittsburgh, PA, USA
Bertrand Verdy Bordeaux University Hospital, Talence, France
Qi Long University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
Danielle L Mowery University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
John H Holmes University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA

Strasser ZH, Greifer N, Hadavand A, Murphy SN, Estiri H. Estimates of SARS-CoV-2 Omicron BA.2 Subvariant Severity in New England. JAMA Netw Open 2022;5:e2238354. [PMID: 36282501 PMCID: PMC9597387 DOI: 10.1001/jamanetworkopen.2022.38354] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Indexed: 01/09/2023]

Abstract

IMPORTANCE

The SARS-CoV-2 Omicron subvariant, BA.2, may be less severe than previous variants; however, confounding factors make interpreting the intrinsic severity challenging.

OBJECTIVE

To compare the adjusted risks of mortality, hospitalization, intensive care unit admission, and invasive ventilation between the BA.2 subvariant and the Omicron and Delta variants, after accounting for multiple confounders.

DESIGN, SETTING, AND PARTICIPANTS

This was a retrospective cohort study that applied an entropy balancing approach. Patients in a multicenter inpatient and outpatient system in New England with COVID-19 between March 3, 2020, and June 20, 2022, were identified.

EXPOSURES

Cases were assigned as being exposed to the Delta (B.1.617.2) variant, the Omicron (B.1.1.529) variant, or the Omicron BA.2 lineage subvariants.

MAIN OUTCOMES AND MEASURES

The primary study outcome planned before analysis was risk of 30-day mortality. Secondary outcomes included the risks of hospitalization, invasive ventilation, and intensive care unit admissions.

RESULTS

Of 102 315 confirmed COVID-19 cases (mean [SD] age, 44.2 [21.6] years; 63 482 women [62.0%]), 20 770 were labeled as Delta variants, 52 605 were labeled as the Omicron B.1.1.529 variant, and 28 940 were labeled as Omicron BA.2 subvariants. Patient cases were excluded if they occurred outside the prespecified temporal windows associated with the variants or had minimal longitudinal data in the Mass General Brigham system before COVID-19. Mortality rates were 0.7% for Delta (B.1.617.2), 0.4% for Omicron (B.1.1.529), and 0.3% for Omicron (BA.2). The adjusted odds ratio of mortality from the Delta variant compared with the Omicron BA.2 subvariants was 2.07 (95% CI, 1.04-4.10) and that of the original Omicron variant compared with the Omicron BA.2 subvariant was 2.20 (95% CI, 1.56-3.11). For all outcomes, the Omicron BA.2 subvariants were significantly less severe than the Omicron and Delta variants.

CONCLUSIONS AND RELEVANCE

In this cohort study, after having accounted for a variety of confounding factors associated with SARS-CoV-2 outcomes, the Omicron BA.2 subvariant was found to be intrinsically less severe than both the Delta and Omicron variants. With respect to these variants, the severity profile of SARS-CoV-2 appears to be diminishing after taking into account various factors including therapeutics, vaccinations, and prior infections.

Zhang HG, Dagliati A, Shakeri Hossein Abad Z, Xiong X, Bonzel CL, Xia Z, Tan BWQ, Avillach P, Brat GA, Hong C, Morris M, Visweswaran S, Patel LP, Gutiérrez-Sacristán A, Hanauer DA, Holmes JH, Samayamuthu MJ, Bourgeois FT, L'Yi S, Maidlow SE, Moal B, Murphy SN, Strasser ZH, Neuraz A, Ngiam KY, Loh NHW, Omenn GS, Prunotto A, Dalvin LA, Klann JG, Schubert P, Vidorreta FJS, Benoit V, Verdy G, Kavuluru R, Estiri H, Luo Y, Malovini A, Tibollo V, Bellazzi R, Cho K, Ho YL, Tan ALM, Tan BWL, Gehlenborg N, Lozano-Zahonero S, Jouhet V, Chiovato L, Aronow BJ, Toh EMS, Wong WGS, Pizzimenti S, Wagholikar KB, Bucalo M, Cai T, South AM, Kohane IS, Weber GM. International electronic health record-derived post-acute sequelae profiles of COVID-19 patients. NPJ Digit Med 2022;5:81. [PMID: 35768548 PMCID: PMC9242995 DOI: 10.1038/s41746-022-00623-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2022] [Accepted: 05/19/2022] [Indexed: 11/10/2022]

Affiliation(s)

Harrison G Zhang Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Arianna Dagliati Department of Electrical Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
Zahra Shakeri Hossein Abad Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Xin Xiong Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
Clara-Lea Bonzel Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Zongqi Xia Department of Neurology, University of Pittsburgh, Pittsburgh, PA, USA
Bryce W Q Tan Department of Medicine, National University Hospital, Singapore, Singapore
Paul Avillach Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Gabriel A Brat Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Chuan Hong Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA
Michele Morris Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
Shyam Visweswaran Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
Lav P Patel Department of Internal Medicine, Division of Medical Informatics, University Of Kansas Medical Center, Kansas City, MO, USA
Alba Gutiérrez-Sacristán Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
David A Hanauer Department of Learning Health Sciences, University of Michigan Medical School, Ann Arbor, MI, USA
John H Holmes Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA; Institute for Biomedical Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
Malarkodi Jebathilagam Samayamuthu Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
Florence T Bourgeois Department of Pediatrics, Harvard Medical School, Boston, MA, USA
Sehi L'Yi Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Sarah E Maidlow Michigan Institute for Clinical and Health Research (MICHR) Informatics, University of Michigan, Ann Arbor, MI, USA
Bertrand Moal IAM unit, Bordeaux University Hospital, Bordeaux, France
Shawn N Murphy Department of Neurology, Massachusetts General Hospital, Boston, MA, USA
Zachary H Strasser Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
Antoine Neuraz Department of biomedical informatics, Hôpital Necker-Enfants Malade, Assistance Publique Hôpitaux de Paris (APHP), University of Paris, Paris, France
Kee Yuan Ngiam Department of Biomedical informatics, WiSDM, National University Health Systems Singapore, Singapore, Singapore
Ne Hooi Will Loh Department of Anaesthesia, National University Health Systems Singapore, Singapore, Singapore
Gilbert S Omenn Department of Computational Medicine & Bioinformatics, Internal Medicine, Human Genetics, and School of Public Health, University of Michigan, Ann Arbor, MI, USA
Andrea Prunotto Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany
Lauren A Dalvin Department of Ophthalmology, Mayo Clinic, Rochester, NY, USA
Jeffrey G Klann Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
Petra Schubert Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), VA Boston Healthcare System, Boston, MA, USA
Fernando J Sanz Vidorreta Department of Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
Vincent Benoit IT Department, Innovation & Data, APHP Greater Paris University Hospital, Paris, France
Guillaume Verdy IAM unit, Bordeaux University Hospital, Bordeaux, France
Ramakanth Kavuluru Division of Biomedical Informatics (Department of Internal Medicine), University of Kentucky, Lexington, KY, USA
Hossein Estiri Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
Yuan Luo Department of Preventive Medicine, Northwestern University, Chicago, IL, USA
Alberto Malovini Laboratory of Informatics and Systems Engineering for Clinical Research, Istituti Clinici Scientifici Maugeri SpA SB IRCCS, Pavia, Italy
Valentina Tibollo Laboratory of Informatics and Systems Engineering for Clinical Research, Istituti Clinici Scientifici Maugeri SpA SB IRCCS, Pavia, Italy
Riccardo Bellazzi Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
Kelly Cho Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), VA Boston Healthcare System, Boston, MA, USA; Population Health and Data Science, VA Boston Healthcare System, Boston, MA, USA
Yuk-Lam Ho Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), VA Boston Healthcare System, Boston, MA, USA
Amelia L M Tan Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Byorn W L Tan Department of Medicine, National University Hospital, Singapore, Singapore
Nils Gehlenborg Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Sara Lozano-Zahonero Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany
Vianney Jouhet IAM unit, INSERM Bordeaux Population Health ERIAS TEAM, Bordeaux University Hospital / ERIAS - Inserm, U1219 BPH, Bordeaux, France
Luca Chiovato Unit of Internal Medicine and Endocrinology, Istituti Clinici Scientifici Maugeri SpA SB IRCCS, Pavia, Italy
Bruce J Aronow Departments of Biomedical Informatics, Pediatrics, Cincinnati Children's Hospital Medical Center, University of Cincinnati, Cincinnati, OH, USA
Emma M S Toh Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
Wei Gen Scott Wong Department of Medicine, National University Health Systems Singapore, Singapore, Singapore
Sara Pizzimenti Scientific Direction, IRCCS Ca' Granda Ospedale Maggiore Policlinico di Milano, Milan, Italy
Kavishwar B Wagholikar Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
Mauro Bucalo BIOMERIS (BIOMedical Research Informatics Solutions), Pavia, Italy
Tianxi Cai Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Andrew M South Department of Pediatrics-Section of Nephrology, Brenner Children's, Wake Forest School of Medicine, Winston Salem, NC, USA
Isaac S Kohane Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Griffin M Weber Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA

Klann JG, Strasser ZH, Hutch MR, Kennedy CJ, Marwaha JS, Morris M, Samayamuthu MJ, Pfaff AC, Estiri H, South AM, Weber GM, Yuan W, Avillach P, Wagholikar KB, Luo Y, Omenn GS, Visweswaran S, Holmes JH, Xia Z, Brat GA, Murphy SN. Distinguishing Admissions Specifically for COVID-19 From Incidental SARS-CoV-2 Admissions: National Retrospective Electronic Health Record Study. J Med Internet Res 2022;24:e37931. [PMID: 35476727 PMCID: PMC9119395 DOI: 10.2196/37931] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 03/16/2022] [Revised: 04/22/2022] [Accepted: 04/22/2022] [Indexed: 01/16/2023]

Abstract

BACKGROUND

Admissions are generally classified as COVID-19 hospitalizations if the patient has a positive SARS-CoV-2 polymerase chain reaction (PCR) test. However, because 35% of SARS-CoV-2 infections are asymptomatic, patients admitted for unrelated indications with an incidentally positive test could be misclassified as a COVID-19 hospitalization. Electronic health record (EHR)-based studies have been unable to distinguish between a hospitalization specifically for COVID-19 versus an incidental SARS-CoV-2 hospitalization. Although the need to improve classification of COVID-19 versus incidental SARS-CoV-2 is well understood, the magnitude of the problems has only been characterized in small, single-center studies. Furthermore, there have been no peer-reviewed studies evaluating methods for improving classification.

OBJECTIVE

The aims of this study are to, first, quantify the frequency of incidental hospitalizations over the first 15 months of the pandemic in multiple hospital systems in the United States and, second, to apply electronic phenotyping techniques to automatically improve COVID-19 hospitalization classification.

METHODS

From a retrospective EHR-based cohort in 4 US health care systems in Massachusetts, Pennsylvania, and Illinois, a random sample of 1123 SARS-CoV-2 PCR-positive patients hospitalized from March 2020 to August 2021 was manually chart-reviewed and classified as "admitted with COVID-19" (incidental) versus specifically admitted for COVID-19 ("for COVID-19"). EHR-based phenotyping was used to find feature sets to filter out incidental admissions.

RESULTS

EHR-based phenotyped feature sets filtered out incidental admissions, which occurred in an average of 26% of hospitalizations (although this varied widely over time, from 0% to 75%). The top site-specific feature sets had 79%-99% specificity with 62%-75% sensitivity, while the best-performing across-site feature sets had 71%-94% specificity with 69%-81% sensitivity.
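As a reading aid for the metrics quoted above, the following minimal Python sketch shows how sensitivity and specificity are computed from confusion-matrix counts. The function name and the counts are hypothetical and are not taken from the study; "positive" is assumed here to mean an admission specifically for COVID-19, and "negative" an incidental admission.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: "for COVID-19" admissions the phenotype kept / wrongly filtered out.
    tn/fp: incidental admissions the phenotype filtered out / wrongly kept.
    """
    sensitivity = tp / (tp + fn)  # share of true "for COVID-19" admissions kept
    specificity = tn / (tn + fp)  # share of incidental admissions filtered out
    return sensitivity, specificity


# Hypothetical chart-review counts, for illustration only.
sens, spec = sensitivity_specificity(tp=70, fn=30, tn=90, fp=10)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A feature set tuned for high specificity filters out more incidental admissions at the cost of missing some true COVID-19 admissions, which is the trade-off the reported ranges reflect.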

CONCLUSIONS

A large proportion of SARS-CoV-2 PCR-positive admissions were incidental. Straightforward EHR-based phenotypes differentiated admissions, which is important to assure accurate public health reporting and research.

Affiliation(s)

Jeffrey G Klann Laboratory of Computer Science, Department of Medicine, Massachusetts General Hospital, Boston, MA, United States
Zachary H Strasser Laboratory of Computer Science, Department of Medicine, Massachusetts General Hospital, Boston, MA, United States
Meghan R Hutch Department of Preventive Medicine, Northwestern University, Chicago, IL, United States
Chris J Kennedy Center for Precision Psychiatry, Massachusetts General Hospital, Boston, MA, United States
Jayson S Marwaha Department of Surgery, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, United States
Michele Morris Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, United States
Malarkodi Jebathilagam Samayamuthu Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, United States
Ashley C Pfaff Department of Surgery, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, United States
Hossein Estiri Laboratory of Computer Science, Department of Medicine, Massachusetts General Hospital, Boston, MA, United States
Andrew M South Section of Nephrology, Department of Pediatrics, Brenner Children's, Wake Forest School of Medicine, Winston Salem, NC, United States
Griffin M Weber see Acknowledgments
William Yuan see Acknowledgments
Paul Avillach see Acknowledgments
Kavishwar B Wagholikar Laboratory of Computer Science, Department of Medicine, Massachusetts General Hospital, Boston, MA, United States
Yuan Luo Department of Preventive Medicine, Northwestern University, Chicago, IL, United States
Gilbert S Omenn Center for Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI, United States
Shyam Visweswaran Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, United States
John H Holmes Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
Zongqi Xia Department of Neurology, University of Pittsburgh, Pittsburgh, PA, United States
Gabriel A Brat see Acknowledgments
Shawn N Murphy Department of Neurology, Massachusetts General Hospital, Boston, MA, United States

Estiri H, Strasser ZH, Rashidian S, Klann JG, Wagholikar KB, McCoy TH, Murphy SN. An Objective Framework for Evaluating Unrecognized Bias in Medical AI Models Predicting COVID-19 Outcomes. J Am Med Inform Assoc 2022;29:1334-1341. [PMID: 35511151 PMCID: PMC9277645 DOI: 10.1093/jamia/ocac070] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/09/2021] [Revised: 04/04/2022] [Accepted: 04/27/2022] [Indexed: 12/15/2022] Open

Abstract

OBJECTIVE

The increasing translation of artificial intelligence (AI)/machine learning (ML) models into clinical practice brings an increased risk of direct harm from modeling bias; however, bias remains incompletely measured in many medical AI applications. This article aims to provide a framework for objective evaluation of medical AI from multiple aspects, focusing on binary classification models.

MATERIALS AND METHODS

Using data from over 56 thousand Mass General Brigham (MGB) patients with confirmed severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), we evaluate unrecognized bias in four AI models developed during the early months of the pandemic in Boston, Massachusetts that predict risks of hospital admission, ICU admission, mechanical ventilation, and death after a SARS-CoV-2 infection purely based on their pre-infection longitudinal medical records. Models were evaluated both retrospectively and prospectively using model-level metrics of discrimination, accuracy, and reliability, and a novel individual-level metric for error.

RESULTS

We found inconsistent instances of model-level bias in the prediction models. From an individual-level aspect, however, we found nearly all models performing with slightly higher error rates for older patients.

DISCUSSION

While a model can be biased against certain protected groups (i.e., perform worse) in certain tasks, it can be at the same time biased towards another protected group (i.e., perform better). As such, current bias evaluation studies may lack a full depiction of the variable effects of a model on its subpopulations.

CONCLUSION

Only a holistic evaluation, a diligent search for unrecognized bias, can provide enough information for an unbiased judgment of AI bias that can invigorate follow-up investigations on identifying the underlying roots of bias and ultimately make a change.

Klann JG, Strasser ZH, Hutch MR, Kennedy CJ, Marwaha JS, Morris M, Samayamuthu MJ, Pfaff AC, Estiri H, South AM, Weber GM, Yuan W, Avillach P, Wagholikar KB, Luo Y, Omenn GS, Visweswaran S, Holmes JH, Xia Z, Brat GA, Murphy SN. Distinguishing Admissions Specifically for COVID-19 from Incidental SARS-CoV-2 Admissions: A National EHR Research Consortium Study. medRxiv 2022:2022.02.10.22270728. [PMID: 35350202 PMCID: PMC8963684 DOI: 10.1101/2022.02.10.22270728] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 02/01/2023]

Estiri H, Strasser ZH, Brat GA, Semenov YR, Patel CJ, Murphy SN. Evolving phenotypes of non-hospitalized patients that indicate long COVID. BMC Med 2021;19:249. [PMID: 34565368 PMCID: PMC8474909 DOI: 10.1186/s12916-021-02115-0] [Citation(s) in RCA: 57] [Impact Index Per Article: 19.0] [Received: 04/28/2021] [Accepted: 09/01/2021] [Indexed: 01/28/2023] Open

Abstract

BACKGROUND

METHODS

In this retrospective electronic health record (EHR) cohort study, we applied a computational framework for knowledge discovery from clinical data, MLHO, to identify phenotypes that positively associate with a past positive reverse transcription-polymerase chain reaction (RT-PCR) test for COVID-19. We evaluated the post-test phenotypes in two temporal windows at 3-6 and 6-9 months after the test and by age and gender. Data from longitudinal diagnosis records stored in EHRs from Mass General Brigham in the Boston Metropolitan Area was used for the analyses. Statistical analyses were performed on data from March 2020 to June 2021. Study participants included over 96 thousand patients who had tested positive or negative for COVID-19 and were not hospitalized.

RESULTS

We identified 33 phenotypes among different age/gender cohorts or time windows that were positively associated with past SARS-CoV-2 infection. All identified phenotypes were newly recorded in patients' medical records 2 months or longer after a COVID-19 RT-PCR test in non-hospitalized patients regardless of the test result. Among these phenotypes, a new diagnosis record for anosmia and dysgeusia (OR 2.60, 95% CI [1.94-3.46]), alopecia (OR 3.09, 95% CI [2.53-3.76]), chest pain (OR 1.27, 95% CI [1.09-1.48]), chronic fatigue syndrome (OR 2.60, 95% CI [1.22-2.10]), shortness of breath (OR 1.41, 95% CI [1.22-1.64]), pneumonia (OR 1.66, 95% CI [1.28-2.16]), and type 2 diabetes mellitus (OR 1.41, 95% CI [1.22-1.64]) is one of the most significant indicators of a past COVID-19 infection. Additionally, more new phenotypes were found with increased confidence among the cohorts who were younger than 65.

CONCLUSIONS

The findings of this study confirm many of the post-COVID-19 symptoms and suggest that a variety of new diagnoses, including new diabetes mellitus and neurological disorder diagnoses, are more common among those with a history of COVID-19 than those without the infection. Additionally, more than 63% of PASC phenotypes were observed in patients under 65 years of age, pointing out the importance of vaccination to minimize the risk of debilitating post-acute sequelae of COVID-19 among younger adults.

Estiri H, Strasser ZH, Murphy SN. High-throughput phenotyping with temporal sequences. J Am Med Inform Assoc 2021;28:772-781. [PMID: 33313899 DOI: 10.1093/jamia/ocaa288] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/01/2020] [Accepted: 11/04/2020] [Indexed: 12/15/2022] Open

Abstract

OBJECTIVE

High-throughput electronic phenotyping algorithms can accelerate translational research using data from electronic health record (EHR) systems. The temporal information buried in EHRs is often underutilized in developing computational phenotypic definitions. This study aims to develop a high-throughput phenotyping method, leveraging temporal sequential patterns from EHRs.

MATERIALS AND METHODS

We develop a representation mining algorithm to extract 5 classes of representations from EHR diagnosis and medication records: the aggregated vector of the records (aggregated vector representation), the standard sequential patterns (sequential pattern mining), the transitive sequential patterns (transitive sequential pattern mining), and 2 hybrid classes. Using EHR data on 10 phenotypes from the Mass General Brigham Biobank, we train and validate phenotyping algorithms.
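The difference between standard and transitive sequential patterns can be illustrated with a minimal Python sketch. It assumes, as an interpretation rather than a statement of the authors' implementation, that a standard pattern pairs each record only with the next one in time, while a transitive pattern pairs each record with every later record. The function names and the toy record codes are hypothetical.

```python
from itertools import combinations


def standard_sequences(records):
    """Pairs of immediately consecutive codes, ordered by date.

    records: list of (iso_date, code) tuples for one patient.
    """
    ordered = sorted(records)
    return {(a[1], b[1]) for a, b in zip(ordered, ordered[1:]) if a[0] < b[0]}


def transitive_sequences(records):
    """Every (earlier_code, later_code) pair, not only adjacent ones.

    Assumption: records on the same date are not treated as ordered.
    """
    ordered = sorted(records)
    return {
        (a[1], b[1])
        for a, b in combinations(ordered, 2)
        if a[0] < b[0]  # keep only strictly earlier -> later pairs
    }


# Hypothetical patient timeline: codes A, B, C recorded on three dates.
patient = [("2020-01-05", "A"), ("2020-03-10", "B"), ("2020-06-01", "C")]
print(sorted(standard_sequences(patient)))
print(sorted(transitive_sequences(patient)))
```

Under this reading, the transitive set additionally contains the pair (A, C), which links records that are separated by other events in the timeline.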

RESULTS

Phenotyping with temporal sequences resulted in a superior classification performance across all 10 phenotypes compared with the standard representations in electronic phenotyping. The high-throughput algorithm's classification performance was superior or similar to the performance of previously published electronic phenotyping algorithms. We characterize and evaluate the top transitive sequences of diagnosis records paired with the records of risk factors, symptoms, complications, medications, or vaccinations.

DISCUSSION

The proposed high-throughput phenotyping approach enables seamless discovery of sequential record combinations that may be difficult to assume from raw EHR data. Transitive sequences offer more accurate characterization of the phenotype, compared with its individual components, and reflect the actual lived experiences of the patients with that particular disease.

CONCLUSION

Sequential data representations provide a precise mechanism for incorporating raw EHR records into downstream machine learning. Our approach starts with user interpretability and works backward to the technology.

Estiri H, Strasser ZH, Brat GA, Semenov YR, Patel CJ, Murphy SN. Evolving Phenotypes of non-hospitalized Patients that Indicate Long Covid. medRxiv 2021. [PMID: 33948602 DOI: 10.1101/2021.04.25.21255923] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 11/24/2022]

Abstract

For some SARS-CoV-2 survivors, recovery from the acute phase of the infection has been grueling with lingering effects. Many of the symptoms characterized as the post-acute sequelae of COVID-19 (PASC) could have multiple causes or are similarly seen in non-COVID patients. Accurate identification of phenotypes will be important to guide future research and help the healthcare system focus its efforts and resources on adequately controlled age- and gender-specific sequelae of a COVID-19 infection. In this retrospective electronic health records (EHR) cohort study, we applied a computational framework for knowledge discovery from clinical data, MLHO, to identify phenotypes that positively associate with a past positive reverse transcription-polymerase chain reaction (RT-PCR) test for COVID-19. We evaluated the post-test phenotypes in two temporal windows at 3-6 and 6-9 months after the test and by age and gender. Data from longitudinal diagnosis records stored in EHRs from Mass General Brigham in the Boston metropolitan area was used for the analyses. Statistical analyses were performed on data from March 2020 to June 2021. Study participants included over 96 thousand patients who had tested positive or negative for COVID-19 and were not hospitalized. We identified 33 phenotypes among different age/gender cohorts or time windows that were positively associated with past SARS-CoV-2 infection. All identified phenotypes were newly recorded in patients' medical records two months or longer after a COVID-19 RT-PCR test in non-hospitalized patients regardless of the test result.
Among these phenotypes, a new diagnosis record for anosmia and dysgeusia (OR: 2.60, 95% CI [1.94 - 3.46]), alopecia (OR: 3.09, 95% CI [2.53 - 3.76]), chest pain (OR: 1.27, 95% CI [1.09 - 1.48]), chronic fatigue syndrome (OR 2.60, 95% CI [1.22-2.10]), shortness of breath (OR 1.41, 95% CI [1.22 - 1.64]), pneumonia (OR 1.66, 95% CI [1.28 - 2.16]), and type 2 diabetes mellitus (OR 1.41, 95% CI [1.22 - 1.64]) are some of the most significant indicators of a past COVID-19 infection. Additionally, more new phenotypes were found with increased confidence among the cohorts who were younger than 65. Our approach avoids a flood of false positive discoveries while offering a more robust probabilistic approach compared to the standard linear phenome-wide association study (PheWAS). The findings of this study confirm many of the post-COVID symptoms and suggest that a variety of new diagnoses, including new diabetes mellitus and neurological disorder diagnoses, are more common among those with a history of COVID-19 than those without the infection. Additionally, more than 63 percent of PASC phenotypes were observed in patients under 65 years of age, pointing out the importance of vaccination to minimize the risk of debilitating post-acute sequelae of COVID-19 among younger adults.

Estiri H, Strasser ZH, Murphy SN. Individualized prediction of COVID-19 adverse outcomes with MLHO. Sci Rep 2021;11:5322. [PMID: 33674708 PMCID: PMC7935934 DOI: 10.1038/s41598-021-84781-x] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 09/08/2020] [Accepted: 02/18/2021] [Indexed: 12/28/2022] Open

Abstract

The COVID-19 pandemic has devastated the world with health and economic wreckage. Precise estimates of adverse outcomes from COVID-19 could have led to better allocation of healthcare resources and more efficient targeted preventive measures, including insight into prioritizing how to best distribute a vaccination. We developed MLHO (pronounced as melo), an end-to-end Machine Learning framework that leverages iterative feature and algorithm selection to predict Health Outcomes. MLHO implements iterative sequential representation mining, and feature and model selection, for predicting patient-level risk of hospitalization, ICU admission, need for mechanical ventilation, and death. It bases this prediction on data from patients' past medical records (before their COVID-19 infection). MLHO's architecture enables a parallel and outcome-oriented model calibration, in which different statistical learning algorithms and vectors of features are simultaneously tested to improve prediction of health outcomes. Using clinical and demographic data from a large cohort of over 13,000 COVID-19-positive patients, we modeled the four adverse outcomes utilizing about 600 features representing patients' pre-COVID health records and demographics. The mean AUC ROC for mortality prediction was 0.91, while the prediction performance ranged between 0.80 and 0.81 for the ICU, hospitalization, and ventilation. We broadly describe the clusters of features that were utilized in modeling and their relative influence for predicting each outcome. Our results demonstrated that while demographic variables (namely age) are important predictors of adverse outcomes after a COVID-19 infection, the incorporation of past clinical records is vital for a reliable prediction model.
As the COVID-19 pandemic unfolds around the world, adaptable and interpretable machine learning frameworks (like MLHO) are crucial to improve our readiness for confronting the potential future waves of COVID-19, as well as other novel infectious diseases that may emerge.
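For readers unfamiliar with the AUC ROC figures quoted in this abstract, the metric can be read as the probability that a randomly chosen patient with the outcome receives a higher predicted risk than a randomly chosen patient without it. The minimal Python sketch below computes it by direct pairwise comparison. It is a generic illustration with made-up scores, not part of MLHO, and the function name is hypothetical.

```python
def auc_roc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for patients with the outcome, 0 otherwise.
    scores: predicted risks; ties count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 3 of the 4 positive/negative pairs are ranked correctly.
print(auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The pairwise form is quadratic in the number of patients, so it suits small illustrations; rank-based implementations are the usual choice for large cohorts.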

Estiri H, Strasser ZH, Klann JG, Naseri P, Wagholikar KB, Murphy SN. Predicting COVID-19 mortality with electronic medical records. NPJ Digit Med 2021;4:15. [PMID: 33542473 PMCID: PMC7862405 DOI: 10.1038/s41746-021-00383-x] [Citation(s) in RCA: 55] [Impact Index Per Article: 18.3] [Received: 06/18/2020] [Accepted: 12/24/2020] [Indexed: 01/31/2023] Open

Estiri H, Strasser ZH, Klann JG, McCoy TH, Wagholikar KB, Vasey S, Castro VM, Murphy ME, Murphy SN. Transitive Sequencing Medical Records for Mining Predictive and Interpretable Temporal Representations. Patterns (N Y) 2020;1:100051. [PMID: 32835307 PMCID: PMC7301790 DOI: 10.1016/j.patter.2020.100051] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 02/13/2020] [Revised: 04/27/2020] [Accepted: 05/26/2020] [Indexed: 12/13/2022]

Affiliation(s)

Hossein Estiri Laboratory of Computer Science, Massachusetts General Hospital, Boston, MA 02144, USA; Research Information Science and Computing, Mass General Brigham, Somerville, MA 02145, USA; Harvard Medical School, Boston, MA 02115, USA
Zachary H. Strasser Laboratory of Computer Science, Massachusetts General Hospital, Boston, MA 02144, USA; Research Information Science and Computing, Mass General Brigham, Somerville, MA 02145, USA; Harvard Medical School, Boston, MA 02115, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
Jeffrey G. Klann Laboratory of Computer Science, Massachusetts General Hospital, Boston, MA 02144, USA; Research Information Science and Computing, Mass General Brigham, Somerville, MA 02145, USA; Harvard Medical School, Boston, MA 02115, USA
Thomas H. McCoy Harvard Medical School, Boston, MA 02115, USA; Center for Quantitative Health, Massachusetts General Hospital, Boston, MA 02114, USA
Kavishwar B. Wagholikar Laboratory of Computer Science, Massachusetts General Hospital, Boston, MA 02144, USA; Research Information Science and Computing, Mass General Brigham, Somerville, MA 02145, USA; Harvard Medical School, Boston, MA 02115, USA
Sebastien Vasey Department of Mathematics, Harvard University, Cambridge, MA 02138, USA
Victor M. Castro Research Information Science and Computing, Mass General Brigham, Somerville, MA 02145, USA
MaryKate E. Murphy Research Information Science and Computing, Mass General Brigham, Somerville, MA 02145, USA
Shawn N. Murphy Laboratory of Computer Science, Massachusetts General Hospital, Boston, MA 02144, USA; Research Information Science and Computing, Mass General Brigham, Somerville, MA 02145, USA; Harvard Medical School, Boston, MA 02115, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA; Department of Neurology, Massachusetts General Hospital, Boston, MA 02114, USA
