Reference Citation Analysis

For: Zhang Y, Cai T, Yu S, Cho K, Hong C, Sun J, Huang J, Ho YL, Ananthakrishnan AN, Xia Z, Shaw SY, Gainer V, Castro V, Link N, Honerlaw J, Huang S, Gagnon D, Karlson EW, Plenge RM, Szolovits P, Savova G, Churchill S, O'Donnell C, Murphy SN, Gaziano JM, Kohane I, Cai T, Liao KP. High-throughput phenotyping with electronic medical record data using a common semi-supervised approach (PheCAP). Nat Protoc 2019;14:3426-44. [PMID: 31748751 DOI: 10.1038/s41596-019-0227-6] [Citation(s) in RCA: 83] [Impact Index Per Article: 16.6] [Received: 10/19/2018] [Accepted: 07/22/2019] [Indexed: 01/12/2023]

Cited by Other Article(s)

Castro VM, Gainer V, Wattanasin N, Benoit B, Cagan A, Ghosh B, Goryachev S, Metta R, Park H, Wang D, Mendis M, Rees M, Herrick C, Murphy SN. The Mass General Brigham Biobank Portal: an i2b2-based data repository linking disparate and high-dimensional patient data to support multimodal analytics. J Am Med Inform Assoc 2022;29:643-651. [PMID: 34849976 PMCID: PMC8922162 DOI: 10.1093/jamia/ocab264] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 07/15/2021] [Revised: 10/20/2021] [Accepted: 11/16/2021] [Indexed: 01/07/2023] Open

Wang DD, Li Y, Nguyen XMT, Song RJ, Ho YL, Hu FB, Willett WC, Wilson PWF, Cho K, Gaziano JM, Djoussé L. Dietary Sodium and Potassium Intake and Risk of Non-Fatal Cardiovascular Diseases: The Million Veteran Program. Nutrients 2022;14:nu14051121. [PMID: 35268096 PMCID: PMC8912456 DOI: 10.3390/nu14051121] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 01/25/2022] [Revised: 03/02/2022] [Accepted: 03/03/2022] [Indexed: 11/16/2022] Open

Abstract

Objective: To examine the association between intakes of sodium and potassium and the ratio of sodium to potassium and incident myocardial infarction and stroke.

Design, Setting and Participants: Prospective cohort study of 180,156 Veterans aged 19 to 107 years with plausible dietary intake measured by food frequency questionnaire (FFQ) who were free of cardiovascular disease (CVD) and cancer at baseline in the VA Million Veteran Program (MVP).

Main outcome measures: CVD defined as non-fatal myocardial infarction (MI) or acute ischemic stroke (AIS) ascertained using high-throughput phenotyping algorithms applied to electronic health records.

Results: During up to 8 years of follow-up, we documented 4090 CVD cases (2499 MI and 1712 AIS). After adjustment for confounding factors, a higher sodium intake was associated with a higher risk of CVD, whereas potassium intake was inversely associated with the risk of CVD [hazard ratio (HR) comparing extreme quintiles, 95% confidence interval (CI): 1.09 (95% CI: 0.99−1.21, p trend = 0.01) for sodium and 0.87 (95% CI: 0.79−0.96, p trend = 0.005) for potassium]. In addition, the ratio of sodium to potassium (Na/K ratio) was positively associated with the risk of CVD (HR comparing extreme quintiles = 1.26, 95% CI: 1.14−1.39, p trend < 0.0001). The associations of Na/K ratio were consistent for two subtypes of CVD; one standard deviation increment in the ratio was associated with HRs (95% CI) of 1.12 (1.06−1.19) for MI and 1.11 (1.03−1.19) for AIS. In secondary analyses, the observed associations were consistent across race and status for diabetes, hypertension, and high cholesterol at baseline. Associations appeared to be more pronounced among participants with poor dietary quality.

Conclusions: A high sodium intake and a low potassium intake were associated with a higher risk of CVD in this large population of US veterans.

Affiliation(s)

Dong D Wang: Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), VA Boston Healthcare System, Boston, MA 02111, USA; The Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA; Department of Nutrition, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
Yanping Li: Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), VA Boston Healthcare System, Boston, MA 02111, USA; Department of Nutrition, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
Xuan-Mai T Nguyen: Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), VA Boston Healthcare System, Boston, MA 02111, USA; Division of Aging, Department of Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA; Harvard Medical School, Boston, MA 02115, USA
Rebecca J Song: Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), VA Boston Healthcare System, Boston, MA 02111, USA; Department of Epidemiology, Boston University School of Public Health, Boston, MA 02115, USA
Yuk-Lam Ho: Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), VA Boston Healthcare System, Boston, MA 02111, USA
Frank B Hu: The Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA; Department of Nutrition, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA; Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
Walter C Willett: The Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA; Department of Nutrition, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA; Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
Peter W F Wilson: Atlanta VA Medical Center, Atlanta, GA 30033, USA; Emory Clinical Cardiovascular Research Institute, Atlanta, GA 30033, USA
Kelly Cho: Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), VA Boston Healthcare System, Boston, MA 02111, USA; Division of Aging, Department of Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA; Harvard Medical School, Boston, MA 02115, USA
J Michael Gaziano: Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), VA Boston Healthcare System, Boston, MA 02111, USA; Division of Aging, Department of Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA; Harvard Medical School, Boston, MA 02115, USA
Luc Djoussé: Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), VA Boston Healthcare System, Boston, MA 02111, USA; Department of Nutrition, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA; Division of Aging, Department of Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA; Harvard Medical School, Boston, MA 02115, USA

Cade BE, Hassan SM, Dashti HS, Kiernan M, Pavlova MK, Redline S, Karlson EW. Sleep apnea phenotyping and relationship to disease in a large clinical biobank. JAMIA Open 2022;5:ooab117. [PMID: 35156000 PMCID: PMC8826997 DOI: 10.1093/jamiaopen/ooab117] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/09/2021] [Revised: 12/08/2021] [Accepted: 12/28/2021] [Indexed: 11/14/2022] Open

Abstract

Objective

Sleep apnea is associated with a broad range of pathophysiology. While electronic health record (EHR) information has the potential for revealing relationships between sleep apnea and associated risk factors and outcomes, practical challenges hinder its use. Our objectives were to develop a sleep apnea phenotyping algorithm that improves the precision of EHR case/control information using natural language processing (NLP); identify novel associations between sleep apnea and comorbidities in a large clinical biobank; and investigate the relationship between polysomnography statistics and comorbid disease using NLP phenotyping.

Materials and Methods

We performed clinical chart reviews on 300 participants putatively diagnosed with sleep apnea and applied International Classification of Sleep Disorders criteria to classify true cases and noncases. We evaluated 2 NLP and diagnosis code-only methods for their abilities to maximize phenotyping precision. The lead algorithm was used to identify incident and cross-sectional associations between sleep apnea and common comorbidities using 4876 NLP-defined sleep apnea cases and 3× matched controls.

Results

The optimal NLP phenotyping strategy had improved model precision (≥0.943) compared to the use of one diagnosis code (≤0.733). Of the tested diseases, 170 disorders had significant incidence odds ratios (ORs) between cases and controls, 8 of which were confirmed using polysomnography (n = 4544), and 281 disorders had significant prevalence OR between sleep apnea cases versus controls, 41 of which were confirmed using polysomnography data.
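The precision figures above compare how many algorithm-flagged cases are confirmed by chart review. As a minimal sketch of that metric only (not the authors' pipeline), assuming hypothetical toy labels and a hypothetical helper named `precision`:

```python
def precision(predicted, truth):
    """Fraction of predicted cases that chart review confirmed as true cases."""
    true_pos = sum(1 for p, t in zip(predicted, truth) if p and t)
    false_pos = sum(1 for p, t in zip(predicted, truth) if p and not t)
    if true_pos + false_pos == 0:
        raise ValueError("no predicted cases, precision is undefined")
    return true_pos / (true_pos + false_pos)


# Hypothetical toy labels for six reviewed charts; not data from the study.
chart_review = [True, True, False, True, False, True]
one_code_rule = [True, True, True, True, True, True]     # flags everyone with a code
nlp_rule = [True, True, False, True, False, False]       # stricter, note-informed rule

print(precision(one_code_rule, chart_review))
print(precision(nlp_rule, chart_review))
```

A stricter rule can raise precision while missing some true cases (the last chart here), which is why precision is usually reported alongside recall or case counts.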

Discussion and Conclusion

An NLP-informed algorithm can improve the accuracy of case-control sleep apnea ascertainment and thus improve the performance of phenome-wide, genetic, and other EHR analyses of a highly prevalent disorder.

Sleep apnea is a common disease in which breathing partially or completely pauses during sleep, leading to less oxygen in the blood, repeated awakenings, and increased risk of developing multiple diseases. Current studies of sleep apnea often have relatively few participants due to the challenge of performing overnight sleep recordings. Electronic health record (EHR) billing code diagnoses of sleep apnea could be repurposed to increase the size of research studies, but the accuracy of the diagnoses is reduced. We developed a reusable algorithm that improves the accuracy of EHR sleep apnea diagnoses using natural language processing to extract information from clinical notes. As a proof of concept, we used the algorithm to identify hundreds of diseases that are increased among participants with sleep apnea compared to similar patients without sleep apnea. Many of these disease relationships with sleep apnea have not been previously recognized. This improved algorithm will help to accelerate future large-scale investigations of the causes and consequences of sleep apnea.

Affiliation(s)

Brian E Cade: Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital, Boston, Massachusetts, USA; Division of Sleep Medicine, Harvard Medical School, Boston, Massachusetts, USA; Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, USA
Syed Moin Hassan: Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital, Boston, Massachusetts, USA; Division of Sleep Medicine, Harvard Medical School, Boston, Massachusetts, USA; Division of Pulmonary Disease and Critical Care Medicine, University of Vermont, Burlington, Vermont, USA
Hassan S Dashti: Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, USA; Center for Genomic Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA; Department of Anesthesia, Pain, and Critical Care Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA
Melissa Kiernan: Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital, Boston, Massachusetts, USA; NeuroCare Center for Sleep, Newton, Massachusetts, USA
Milena K Pavlova: Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital, Boston, Massachusetts, USA; Division of Sleep Medicine, Harvard Medical School, Boston, Massachusetts, USA
Susan Redline: Division of Sleep and Circadian Disorders, Brigham and Women’s Hospital, Boston, Massachusetts, USA; Division of Sleep Medicine, Harvard Medical School, Boston, Massachusetts, USA; Division of Pulmonary, Critical Care, and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA
Elizabeth W Karlson: Center for Genomic Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA; Division of Rheumatology, Inflammation and Immunity, Brigham and Women's Hospital, Boston, Massachusetts, USA

Zhang Y, Liu M, Neykov M, Cai T. Prior Adaptive Semi-supervised Learning with Application to EHR Phenotyping. J Mach Learn Res 2022;23:83. [PMID: 37974910 PMCID: PMC10653017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2023]

Artificial Intelligence in Clinical Immunology. Artif Intell Med 2022. [DOI: 10.1007/978-3-030-64573-1_83] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]

Liang L, Kim N, Hou J, Cai T, Dahal K, Lin C, Finan S, Savovoa G, Rosso M, Polgar-Tucsanyi M, Weiner H, Chitnis T, Cai T, Xia Z. Temporal trends of multiple sclerosis disease activity: Electronic health records indicators. Mult Scler Relat Disord 2022;57:103333. [PMID: 35158446 PMCID: PMC8849591 DOI: 10.1016/j.msard.2021.103333] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/16/2021] [Revised: 10/03/2021] [Accepted: 10/14/2021] [Indexed: 01/03/2023]

Zaccaria GM, Colella V, Colucci S, Clemente F, Pavone F, Vegliante MC, Esposito F, Opinto G, Scattone A, Loseto G, Minoia C, Rossini B, Quinto AM, Angiulli V, Grieco LA, Fama A, Ferrero S, Moia R, Di Rocco A, Quaglia FM, Tabanelli V, Guarini A, Ciavarella S. Electronic case report forms generation from pathology reports by ARGO, automatic record generator for onco-hematology. Sci Rep 2021;11:23823. [PMID: 34893665 PMCID: PMC8664934 DOI: 10.1038/s41598-021-03204-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/01/2021] [Accepted: 11/23/2021] [Indexed: 12/04/2022] Open

Affiliation(s)

Gian Maria Zaccaria: Hematology and Cell Therapy Unit, IRCCS Istituto Tumori 'Giovanni Paolo II', Viale Orazio Flacco, 65, Bari, Italy
Vito Colella: Department of Electrical and Information Engineering, Politecnico of Bari, Bari, Italy
Simona Colucci: Department of Electrical and Information Engineering, Politecnico of Bari, Bari, Italy
Felice Clemente: Hematology and Cell Therapy Unit, IRCCS Istituto Tumori 'Giovanni Paolo II', Viale Orazio Flacco, 65, Bari, Italy
Fabio Pavone: Hematology and Cell Therapy Unit, IRCCS Istituto Tumori 'Giovanni Paolo II', Viale Orazio Flacco, 65, Bari, Italy
Maria Carmela Vegliante: Hematology and Cell Therapy Unit, IRCCS Istituto Tumori 'Giovanni Paolo II', Viale Orazio Flacco, 65, Bari, Italy
Flavia Esposito: Hematology and Cell Therapy Unit, IRCCS Istituto Tumori 'Giovanni Paolo II', Viale Orazio Flacco, 65, Bari, Italy; Department of Mathematics, University of Bari Aldo Moro, Bari, Italy
Giuseppina Opinto: Hematology and Cell Therapy Unit, IRCCS Istituto Tumori 'Giovanni Paolo II', Viale Orazio Flacco, 65, Bari, Italy
Anna Scattone: Pathology Department, IRCCS Istituto Tumori 'Giovanni Paolo II', Bari, Italy
Giacomo Loseto: Hematology and Cell Therapy Unit, IRCCS Istituto Tumori 'Giovanni Paolo II', Viale Orazio Flacco, 65, Bari, Italy
Carla Minoia: Hematology and Cell Therapy Unit, IRCCS Istituto Tumori 'Giovanni Paolo II', Viale Orazio Flacco, 65, Bari, Italy
Bernardo Rossini: Hematology and Cell Therapy Unit, IRCCS Istituto Tumori 'Giovanni Paolo II', Viale Orazio Flacco, 65, Bari, Italy
Angela Maria Quinto: Hematology and Cell Therapy Unit, IRCCS Istituto Tumori 'Giovanni Paolo II', Viale Orazio Flacco, 65, Bari, Italy
Vito Angiulli: Clinical Engineering Unit, IRCCS Istituto Tumori 'Giovanni Paolo II', Bari, Italy
Luigi Alfredo Grieco: Department of Electrical and Information Engineering, Politecnico of Bari, Bari, Italy
Angelo Fama: Hematology, Azienda USL - IRCCS Di Reggio Emilia, Reggio Emilia, Italy
Simone Ferrero: Division of Hematology 1, AOU "Città Della Salute e Della Scienza di Torino", Torino, Italy; Department of Molecular Biotechnologies and Health Sciences, University of Torino, Torino, Italy
Riccardo Moia: Division of Hematology, Azienda Ospedaliero-Universitaria Maggiore Della Carità Di Novara, Novara, Italy
Alice Di Rocco: Unit of Hematology, Azienda Ospedaliero-Universitaria Policlinico Umberto I, Roma, Italy
Francesca Maria Quaglia: Department of Medicine, Section of Hematology, University of Verona, Verona, Italy
Valentina Tabanelli: Division of Diagnostic Haematopathology, European Institute of Oncology, IRCCS, Milano, Italy
Attilio Guarini: Hematology and Cell Therapy Unit, IRCCS Istituto Tumori 'Giovanni Paolo II', Viale Orazio Flacco, 65, Bari, Italy
Sabino Ciavarella: Hematology and Cell Therapy Unit, IRCCS Istituto Tumori 'Giovanni Paolo II', Viale Orazio Flacco, 65, Bari, Italy

Hou J, Kim N, Cai T, Dahal K, Weiner H, Chitnis T, Cai T, Xia Z. Comparison of Dimethyl Fumarate vs Fingolimod and Rituximab vs Natalizumab for Treatment of Multiple Sclerosis. JAMA Netw Open 2021;4:e2134627. [PMID: 34783826 PMCID: PMC8596196 DOI: 10.1001/jamanetworkopen.2021.34627] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 04/28/2021] [Accepted: 09/20/2021] [Indexed: 01/17/2023] Open

Abstract

Importance

As disease-modifying treatment options for multiple sclerosis increase, comparisons of the options based on real-world evidence may guide clinical decision-making.

Objective

To compare the relapse outcomes between 2 pairs of disease-modifying treatments: dimethyl fumarate vs fingolimod and natalizumab vs rituximab.

Design, Setting, and Participants

This comparative effectiveness study integrated data from a clinic-based multiple sclerosis research registry and its linked electronic health records (EHR) system between January 1, 2006, and December 31, 2016, and built treatment groups for each pairwise disease-modifying treatment comparison according to both registry records and electronic prescriptions. Parallel analyses were conducted from October 11, 2019, to July 7, 2021.

Main Outcomes and Measures

The main outcomes were the 1-year and 2-year relapse rates as well as the time to relapse. To compare relapse outcomes, the study adjusted for covariates from 2 sources (registry and EHR) and corrected for confounding biases among the covariates by the doubly robust estimation.
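One common doubly robust estimator is augmented inverse probability weighting (AIPW), which combines a propensity model with an outcome model and stays consistent if either is correctly specified. The sketch below is illustrative only: the abstract does not say which doubly robust estimator was used, and the function name, argument names, and toy inputs are assumptions.

```python
def aipw_difference(y, a, propensity, pred_treated, pred_control):
    """AIPW estimate of the mean outcome difference, treated minus control.

    y            : observed outcomes (e.g. 1 = relapse within the window, 0 = none)
    a            : treatment indicators (1 = treatment A, 0 = comparator)
    propensity   : fitted P(a = 1 | covariates) for each patient, strictly in (0, 1)
    pred_treated : outcome-model prediction for each patient under treatment
    pred_control : outcome-model prediction for each patient under the comparator
    """
    total = 0.0
    for yi, ai, ei, m1, m0 in zip(y, a, propensity, pred_treated, pred_control):
        # Outcome-model prediction plus an inverse-probability-weighted residual.
        mu1 = m1 + ai * (yi - m1) / ei
        mu0 = m0 + (1 - ai) * (yi - m0) / (1 - ei)
        total += mu1 - mu0
    return total / len(y)


# Hypothetical toy inputs; with a constant propensity of 0.5 and outcome models
# that predict 0, AIPW reduces to a plain inverse-probability-weighted contrast.
y = [1, 1, 0, 0]
a = [1, 1, 0, 0]
print(aipw_difference(y, a, [0.5] * 4, [0.0] * 4, [0.0] * 4))
```

In practice the propensity and outcome predictions would come from fitted models over the registry and EHR covariates; here they are passed in directly to keep the estimator itself visible.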

Results

The study included 4 treatment groups: dimethyl fumarate (n = 260; 198 women [76.2%]; 227 non-Hispanic White individuals [87.3%]; mean [SD] age at diagnosis, 41.7 [10.4] years), fingolimod (n = 267; 190 women [71.2%]; 222 non-Hispanic White individuals [83.1%]; mean [SD] age at diagnosis, 37.9 [9.9] years), natalizumab (n = 204; 160 women [78.4%]; 172 non-Hispanic White individuals [84.3%]; mean [SD] age at diagnosis, 37.2 [10.6] years), and rituximab (n = 115; 83 women [72.2%]; 99 non-Hispanic White individuals [86.1%]; mean [SD] age at diagnosis, 44.1 [11.1] years). No significant differences were found in the relapse outcomes between dimethyl fumarate and fingolimod after correcting for confounding biases and multiple testing (difference in 1-year relapse rate, 0.028 [95% CI, -0.031 to 0.084]; difference in 2-year relapse rate, 0.071 [95% CI, 0.008-0.128]; relative risk of 2-year non-relapse, 0.957 [95% CI, 0.884-1.035] with dimethyl fumarate as reference). When compared with rituximab, natalizumab was associated with a higher relapse rate for all 3 outcomes after bias correction and multiple testing (difference in 1-year relapse rate, 0.080 [95% CI, 0.013-0.137]; difference in 2-year relapse rate, 0.132 [95% CI, 0.043-0.189]; relative risk of 2-year non-relapse, 0.903 [95% CI, 0.822-0.944]). Confounders were identified from EHR data not recorded in the registry data through data-driven feature selection.

Conclusions and Relevance

This study reports real-world evidence of equivalent relapse outcomes between dimethyl fumarate and fingolimod and relapse reduction in favor of rituximab relative to natalizumab. This approach illustrates the value of incorporating EHR data as high-dimensional covariates in real-world treatment comparison.

Hong C, Rush E, Liu M, Zhou D, Sun J, Sonabend A, Castro VM, Schubert P, Panickan VA, Cai T, Costa L, He Z, Link N, Hauser R, Gaziano JM, Murphy SN, Ostrouchov G, Ho YL, Begoli E, Lu J, Cho K, Liao KP, Cai T. Clinical knowledge extraction via sparse embedding regression (KESER) with multi-center large scale electronic health record data. NPJ Digit Med 2021;4:151. [PMID: 34707226 PMCID: PMC8551205 DOI: 10.1038/s41746-021-00519-z] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 09/30/2020] [Accepted: 09/13/2021] [Indexed: 11/11/2022] Open

Affiliation(s)

Chuan Hong: Harvard Medical School, Boston, MA, USA; VA Boston Healthcare System, Boston, MA, USA
Everett Rush: Department of Energy, Oak Ridge National Lab, Oak Ridge, TN, USA
Molei Liu: Harvard T.H. Chan School of Public Health, Boston, MA, USA
Doudou Zhou: University of California, Davis, CA, USA
Jiehuan Sun: University of Illinois at Chicago, Chicago, IL, USA
Aaron Sonabend: Harvard T.H. Chan School of Public Health, Boston, MA, USA
Victor M Castro: Mass General Brigham, Boston, MA, USA
Petra Schubert: VA Boston Healthcare System, Boston, MA, USA
Vidul A Panickan: Harvard Medical School, Boston, MA, USA
Tianrun Cai: VA Boston Healthcare System, Boston, MA, USA; Mass General Brigham, Boston, MA, USA
Lauren Costa: VA Boston Healthcare System, Boston, MA, USA
Zeling He: Mass General Brigham, Boston, MA, USA
Nicholas Link: VA Boston Healthcare System, Boston, MA, USA
Ronald Hauser: West Haven VA Medical Center, West Haven, CT, USA
J Michael Gaziano: Harvard Medical School, Boston, MA, USA; VA Boston Healthcare System, Boston, MA, USA; Brigham and Women's Hospital, Boston, MA, USA
Shawn N Murphy: Mass General Brigham, Boston, MA, USA
George Ostrouchov: Department of Energy, Oak Ridge National Lab, Oak Ridge, TN, USA
Yuk-Lam Ho: VA Boston Healthcare System, Boston, MA, USA
Edmon Begoli: Department of Energy, Oak Ridge National Lab, Oak Ridge, TN, USA
Junwei Lu: VA Boston Healthcare System, Boston, MA, USA; Harvard T.H. Chan School of Public Health, Boston, MA, USA
Kelly Cho: Harvard Medical School, Boston, MA, USA; VA Boston Healthcare System, Boston, MA, USA; Brigham and Women's Hospital, Boston, MA, USA
Katherine P Liao: Harvard Medical School, Boston, MA, USA; VA Boston Healthcare System, Boston, MA, USA; Brigham and Women's Hospital, Boston, MA, USA
Tianxi Cai: Harvard Medical School, Boston, MA, USA; VA Boston Healthcare System, Boston, MA, USA; Harvard T.H. Chan School of Public Health, Boston, MA, USA

Le TT, Gutiérrez-Sacristán A, Son J, Hong C, South AM, Beaulieu-Jones BK, Loh NHW, Luo Y, Morris M, Ngiam KY, Patel LP, Samayamuthu MJ, Schriver E, Tan ALM, Moore J, Cai T, Omenn GS, Avillach P, Kohane IS, Visweswaran S, Mowery DL, Xia Z. Multinational characterization of neurological phenotypes in patients hospitalized with COVID-19. Sci Rep 2021;11:20238. [PMID: 34642371 PMCID: PMC8510999 DOI: 10.1038/s41598-021-99481-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/22/2021] [Accepted: 09/23/2021] [Indexed: 01/08/2023] Open

Affiliation(s)

Trang T Le: Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
Alba Gutiérrez-Sacristán: Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Jiyeon Son: Department of Neurology, University of Pittsburgh, Biomedical Science Tower 3, Suite 7014, 3501 5th Avenue, Pittsburgh, PA, 15260, USA
Chuan Hong: Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Andrew M South: Department of Pediatrics, Wake Forest School of Medicine, Winston Salem, NC, USA
Brett K Beaulieu-Jones: Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Ne Hooi Will Loh: Department of Critical Care, National University Health Systems, Singapore, Singapore
Yuan Luo: Department of Preventive Medicine, Northwestern University, Chicago, IL, USA
Michele Morris: Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
Kee Yuan Ngiam: Department of Surgery, National University Health Systems, Singapore, Singapore
Lav P Patel: Department of Internal Medicine, University of Kansas Medical Center, Kansas City, KS, USA
Malarkodi J Samayamuthu: Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
Emily Schriver: Data Analytics Center, University of Pennsylvania Health System, Philadelphia, PA, USA
Amelia L M Tan: Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Jason Moore: Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
Tianxi Cai: Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Gilbert S Omenn: Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
Paul Avillach: Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Isaac S Kohane: Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Shyam Visweswaran: Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
Danielle L Mowery: Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
Zongqi Xia: Department of Neurology, University of Pittsburgh, Biomedical Science Tower 3, Suite 7014, 3501 5th Avenue, Pittsburgh, PA, 15260, USA

Cai T, Cai F, Dahal KP, Cremone G, Lam E, Golnik C, Seyok T, Hong C, Cai T, Liao KP. Improving the Efficiency of Clinical Trial Recruitment Using an Ensemble Machine Learning to Assist With Eligibility Screening. ACR Open Rheumatol 2021;3:593-600. [PMID: 34296815 PMCID: PMC8449035 DOI: 10.1002/acr2.11289] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/30/2020] [Accepted: 05/18/2021] [Indexed: 11/22/2022] Open

Abstract

Objective

Efficiently identifying eligible patients is a crucial first step for a successful clinical trial. The objective of this study was to test whether an approach using electronic health record (EHR) data and an ensemble machine learning algorithm incorporating billing codes and data from clinical notes processed by natural language processing (NLP) can improve the efficiency of eligibility screening.

Methods

We studied patients screened for a clinical trial of rheumatoid arthritis (RA) with one or more International Classification of Diseases (ICD) codes for RA and age greater than 35 years, from a tertiary care center and a community hospital. The following three groups of EHR features were considered for the algorithm: 1) structured features, 2) the counts of NLP concepts from notes, 3) health care utilization. All features were linked to dates. We applied random forest and logistic regression with least absolute shrinkage and selection operator penalty against the following two standard approaches: 1) one or more RA ICD codes and no ICD codes related to exclusion criteria (Screen_RAICD1+EX) and 2) two or more RA ICD codes (Screen_RAICD2). To test the portability, we trained the algorithm at one institution and tested it at the other.
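The two standard rule-based screens described above can be sketched as simple predicates over per-patient code counts. This is a minimal illustration, not the study's code: the `Patient` fields, the function names, and the toy cohort are hypothetical, and the reduction computed here is the share of all screened patients removed from chart review.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    ra_icd_count: int         # number of RA ICD codes on record (hypothetical field)
    exclusion_icd_count: int  # number of ICD codes tied to exclusion criteria (hypothetical field)


def screen_raicd2(p: Patient) -> bool:
    """Two or more RA ICD codes."""
    return p.ra_icd_count >= 2


def screen_raicd1_ex(p: Patient) -> bool:
    """One or more RA ICD codes and no ICD codes related to exclusion criteria."""
    return p.ra_icd_count >= 1 and p.exclusion_icd_count == 0


def chart_review_reduction(patients, screen) -> float:
    """Fraction of patients the screen removes from manual chart review."""
    kept = sum(1 for p in patients if screen(p))
    return 1 - kept / len(patients)


# Hypothetical toy cohort; not data from the study.
cohort = [Patient(1, 0), Patient(3, 1), Patient(2, 0), Patient(1, 2)]
print(chart_review_reduction(cohort, screen_raicd2))
print(chart_review_reduction(cohort, screen_raicd1_ex))
```

A screen is only useful if the patients it removes are mostly ineligible, so a fuller comparison would also count how many eligible patients each rule drops.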

Results

In total, 3359 patients at Brigham and Women’s Hospital (BWH) and 642 patients at Faulkner Hospital (FH) were studied, with 461 (13.7%) eligible patients at BWH and 84 (13.4%) at FH. The application of the algorithm reduced ineligible patients from chart review by 40.5% at the tertiary care center and by 57.0% at the community hospital. In contrast, Screen_RAICD2 reduced patients for chart review by 2.7% to 11.3%; Screen_RAICD1+EX reduced patients for chart review by 63% to 65% but excluded 22% to 27% of eligible patients.

Conclusion

The ensemble machine learning algorithm incorporating billing codes and NLP data increased the efficiency of eligibility screening by reducing the number of patients requiring chart review while not excluding eligible patients. Moreover, this approach can be trained at one institution and applied at another for multicenter clinical trials.

Chen IY, Joshi S, Ghassemi M, Ranganath R. Probabilistic Machine Learning for Healthcare. Annu Rev Biomed Data Sci 2021;4:393-415. [PMID: 34465179 DOI: 10.1146/annurev-biodatasci-092820-033938] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 11/09/2022]

Bastarache L. Using Phecodes for Research with the Electronic Health Record: From PheWAS to PheRS. Annu Rev Biomed Data Sci 2021;4:1-19. [PMID: 34465180 PMCID: PMC9307256 DOI: 10.1146/annurev-biodatasci-122320-112352] [Citation(s) in RCA: 74] [Impact Index Per Article: 24.7] [Indexed: 11/09/2022]

Yuan Q, Cai T, Hong C, Du M, Johnson BE, Lanuti M, Cai T, Christiani DC. Performance of a Machine Learning Algorithm Using Electronic Health Record Data to Identify and Estimate Survival in a Longitudinal Cohort of Patients With Lung Cancer. JAMA Netw Open 2021;4:e2114723. [PMID: 34232304 PMCID: PMC8264641 DOI: 10.1001/jamanetworkopen.2021.14723] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Indexed: 01/21/2023] Open

Abstract

Importance

Electronic health records (EHRs) provide a low-cost means of accessing detailed longitudinal clinical data for large populations. A lung cancer cohort assembled from EHR data would be a powerful platform for clinical outcome studies.

OBJECTIVE

To investigate whether a clinical cohort assembled from EHRs could be used in a lung cancer prognosis study.

DESIGN, SETTING, AND PARTICIPANTS

In this cohort study, patients with lung cancer were identified among 76 643 patients with at least 1 lung cancer diagnostic code deposited in an EHR in Mass General Brigham health care system from July 1988 to October 2018. Patients were identified via a semisupervised machine learning algorithm, for which clinical information was extracted from structured and unstructured data via natural language processing tools. Data completeness and accuracy were assessed by comparing with the Boston Lung Cancer Study and against criterion standard EHR review results. A prognostic model for non-small cell lung cancer (NSCLC) overall survival was further developed for clinical application. Data were analyzed from March 2019 through July 2020.

EXPOSURES

Clinical data deposited in EHRs for cohort construction and variables of interest for the prognostic model were collected.

MAIN OUTCOMES AND MEASURES

The primary outcomes were the performance of the lung cancer classification model and the quality of the extracted variables; the secondary outcome was the performance of the prognostic model.

RESULTS

Among 76 643 patients with at least 1 lung cancer diagnostic code, 42 069 patients were identified as having lung cancer, with a positive predictive value of 94.4%. The study cohort consisted of 35 375 patients (16 613 men [47.0%] and 18 756 women [53.0%]; 30 140 White individuals [85.2%], 1040 Black individuals [2.9%], and 857 Asian individuals [2.4%]) after excluding patients with lung cancer history and less than 14 days of follow-up after initial diagnosis. The median (interquartile range) age at diagnosis was 66.7 (58.4-74.1) years. The areas under the receiver operating characteristic curve of the prognostic model for overall survival with NSCLC were 0.828 (95% CI, 0.815-0.842) for 1-year prediction, 0.825 (95% CI, 0.812-0.836) for 2-year prediction, 0.814 (95% CI, 0.800-0.826) for 3-year prediction, 0.814 (95% CI, 0.799-0.828) for 4-year prediction, and 0.812 (95% CI, 0.798-0.825) for 5-year prediction.

CONCLUSIONS AND RELEVANCE

These findings suggest the feasibility of assembling a large-scale EHR-based lung cancer cohort with detailed longitudinal clinical measurements and that EHR data may be applied in cancer progression with a set of generalizable approaches.

Lee J, Liu C, Kim JH, Butler A, Shang N, Pang C, Natarajan K, Ryan P, Ta C, Weng C. Comparative effectiveness of medical concept embedding for feature engineering in phenotyping. JAMIA Open 2021;4:ooab028. [PMID: 34142015 PMCID: PMC8206403 DOI: 10.1093/jamiaopen/ooab028] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/17/2020] [Revised: 02/23/2021] [Accepted: 05/03/2021] [Indexed: 01/20/2023] Open

Wen A, Rasmussen LV, Stone D, Liu S, Kiefer R, Adekkanattu P, Brandt PS, Pacheco JA, Luo Y, Wang F, Pathak J, Liu H, Jiang G. CQL4NLP: Development and Integration of FHIR NLP Extensions in Clinical Quality Language for EHR-driven Phenotyping. AMIA Jt Summits Transl Sci Proc 2021;2021:624-633. [PMID: 34457178 PMCID: PMC8378647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]

Wang H, Goodman MO, Sofer T, Redline S. Cutting the fat: advances and challenges in sleep apnoea genetics. Eur Respir J 2021;57:2004644. [PMID: 33958377 DOI: 10.1183/13993003.04644-2020] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/28/2020] [Accepted: 02/10/2021] [Indexed: 01/25/2023]

Veiga RV, Schuler-Faccini L, França GVA, Andrade RFS, Teixeira MG, Costa LC, Paixão ES, Costa MDCN, Barreto ML, Oliveira JF, Oliveira WK, Cardim LL, Rodrigues MS. Classification algorithm for congenital Zika Syndrome: characterizations, diagnosis and validation. Sci Rep 2021;11:6770. [PMID: 33762667 PMCID: PMC7990918 DOI: 10.1038/s41598-021-86361-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/19/2020] [Accepted: 03/09/2021] [Indexed: 11/09/2022] Open

Affiliation(s)

Rafael V Veiga Center of Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Bahia, Brazil; Instituto de Ciências da Saúde, Universidade Federal da Bahia, Salvador, Bahia, Brazil
Lavinia Schuler-Faccini Universidade Federal do Rio Grande do Sul, Rio Grande do Sul, Brazil
Giovanny V A França Secretariat of Health Surveillance, Ministry of Health, Brasilia, Brazil
Roberto F S Andrade Center of Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Bahia, Brazil; Instituto de Física, Universidade Federal da Bahia, Salvador, Bahia, Brazil
Maria Glória Teixeira Center of Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Bahia, Brazil; Instituto de Saúde Coletiva, Universidade Federal da Bahia, Salvador, Bahia, Brazil
Larissa C Costa Center of Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Bahia, Brazil
Enny S Paixão Center of Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Bahia, Brazil; London School of Hygiene and Tropical Medicine, London, England, United Kingdom
Maria da Conceição N Costa Center of Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Bahia, Brazil; Instituto de Saúde Coletiva, Universidade Federal da Bahia, Salvador, Bahia, Brazil
Maurício L Barreto Center of Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Bahia, Brazil
Juliane F Oliveira Center of Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Bahia, Brazil; Department of Mathematics, Centre of Mathematics of the University of Porto (CMUP), Porto, Portugal
Wanderson K Oliveira Hospital das Forças Armadas, Ministério da Defesa, Distrito Federal, Brasília, Brazil
Luciana L Cardim Center of Data and Knowledge Integration for Health (CIDACS), Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, Bahia, Brazil
Moreno S Rodrigues Fundação Oswaldo Cruz, Porto Velho, Rondônia, Brazil

Tedeschi SK, Cai T, He Z, Ahuja Y, Hong C, Yates KA, Dahal K, Xu C, Lyu H, Yoshida K, Solomon DH, Cai T, Liao KP. Classifying Pseudogout Using Machine Learning Approaches With Electronic Health Record Data. Arthritis Care Res (Hoboken) 2021;73:442-448. [PMID: 31910317 DOI: 10.1002/acr.24132] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/26/2019] [Accepted: 12/31/2019] [Indexed: 12/19/2022]

Abstract

OBJECTIVE

Identifying pseudogout in large data sets is difficult due to its episodic nature and a lack of billing codes specific to this acute subtype of calcium pyrophosphate (CPP) deposition disease. The objective of this study was to evaluate a novel machine learning approach for classifying pseudogout using electronic health record (EHR) data.

METHODS

We created an EHR data mart of patients with ≥1 relevant billing code or ≥2 natural language processing (NLP) mentions of pseudogout or chondrocalcinosis, 1991-2017. We selected 900 subjects for gold standard chart review for definite pseudogout (synovitis + synovial fluid CPP crystals), probable pseudogout (synovitis + chondrocalcinosis), or not pseudogout. We applied a topic modeling approach to identify definite/probable pseudogout. A combined algorithm included topic modeling plus manually reviewed CPP crystal results. We compared algorithm performance and cohorts identified by billing codes, the presence of CPP crystals, topic modeling, and a combined algorithm.

RESULTS

Among 900 subjects, 123 (13.7%) had pseudogout by chart review (68 definite, 55 probable). Billing codes had a sensitivity of 65% and a positive predictive value (PPV) of 22% for pseudogout. The presence of CPP crystals had a sensitivity of 29% and a PPV of 92%. Without using CPP crystal results, topic modeling had a sensitivity of 29% and a PPV of 79%. The combined algorithm yielded a sensitivity of 42% and a PPV of 81%. The combined algorithm identified 50% more patients than the presence of CPP crystals; the latter captured a portion of definite pseudogout and missed probable pseudogout.

CONCLUSION

For pseudogout, an episodic disease with no specific billing code, combining NLP, machine learning methods, and synovial fluid laboratory results yielded an algorithm that significantly boosted the PPV compared to billing codes.
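
The sensitivity and PPV figures in this abstract follow the usual confusion-matrix definitions. The Python sketch below restates those definitions and rechecks the one proportion that can be recomputed from the abstract itself (123 of 900 reviewed subjects). The function names and the toy counts passed to them are hypothetical and chosen only for illustration; the abstract does not report true-positive or false-positive counts for any algorithm.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of true cases that the algorithm flags."""
    return true_pos / (true_pos + false_neg)


def ppv(true_pos: int, false_pos: int) -> float:
    """Share of flagged subjects who are true cases."""
    return true_pos / (true_pos + false_pos)


# Counts stated in the abstract.
reviewed = 900
definite, probable = 68, 55
cases = definite + probable  # subjects with pseudogout by chart review
prevalence_pct = round(100 * cases / reviewed, 1)

# Toy counts (NOT from the study): a permissive rule can catch most cases
# (high sensitivity) while most of what it flags is not a case (low PPV).
toy_sensitivity = sensitivity(true_pos=8, false_neg=2)
toy_ppv = ppv(true_pos=8, false_pos=12)

print(cases, prevalence_pct, toy_sensitivity, toy_ppv)
```

The toy numbers mirror the qualitative pattern the abstract reports for billing codes (relatively high sensitivity, low PPV), without claiming to reproduce the study's counts.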

Kohane IS, Aronow BJ, Avillach P, Beaulieu-Jones BK, Bellazzi R, Bradford RL, Brat GA, Cannataro M, Cimino JJ, García-Barrio N, Gehlenborg N, Ghassemi M, Gutiérrez-Sacristán A, Hanauer DA, Holmes JH, Hong C, Klann JG, Loh NHW, Luo Y, Mandl KD, Daniar M, Moore JH, Murphy SN, Neuraz A, Ngiam KY, Omenn GS, Palmer N, Patel LP, Pedrera-Jiménez M, Sliz P, South AM, Tan ALM, Taylor DM, Taylor BW, Torti C, Vallejos AK, Wagholikar KB, Weber GM, Cai T. What Every Reader Should Know About Studies Using Electronic Health Record Data but May Be Afraid to Ask. J Med Internet Res 2021;23:e22219. [PMID: 33600347 PMCID: PMC7927948 DOI: 10.2196/22219] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Received: 07/13/2020] [Revised: 09/14/2020] [Accepted: 01/10/2021] [Indexed: 12/13/2022] Open

Affiliation(s)

Isaac S Kohane Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Bruce J Aronow Biomedical Informatics, Cincinnati Children's Hospital Medical Center, University of Cincinnati, Cincinnati, OH, United States
Paul Avillach Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Brett K Beaulieu-Jones Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Riccardo Bellazzi Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy; ICS Maugeri, Pavia, Italy
Robert L Bradford North Carolina Translational and Clinical Sciences Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
Gabriel A Brat Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Mario Cannataro Data Analytics Research Center, University Magna Graecia of Catanzaro, Catanzaro, Italy; Department of Medical and Surgical Sciences, University Magna Graecia of Catanzaro, Catanzaro, Italy
James J Cimino Informatics Institute, University of Alabama at Birmingham, Birmingham, AL, United States
Noelia García-Barrio Department of Informatics, 12 de Octubre University Hospital, Madrid, Spain
Nils Gehlenborg Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Marzyeh Ghassemi Department of Computer Science and Medicine, University of Toronto, Toronto, ON, Canada
Alba Gutiérrez-Sacristán Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
David A Hanauer Department of Learning Health Sciences, University of Michigan Medical School, Ann Arbor, MI, United States
John H Holmes Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
Chuan Hong Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Jeffrey G Klann Department of Medicine, Harvard Medical School, Boston, MA, United States; Laboratory of Computer Science, Massachusetts General Hospital, Boston, MA, United States
Ne Hooi Will Loh National University Health Systems, Singapore, Singapore
Yuan Luo Department of Preventive Medicine, Northwestern University, Chicago, IL, United States
Kenneth D Mandl Computational Health Informatics Program, Boston Children's Hospital, Boston, MA, United States
Mohamad Daniar Clinical Research Informatics, Boston Children's Hospital, Boston, MA, United States
Jason H Moore Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA, United States
Shawn N Murphy Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States; Department of Neurology, Massachusetts General Hospital, Boston, MA, United States
Antoine Neuraz Department of Biomedical Informatics, Necker-Enfant Malades Hospital, Assistance Publique - Hôpitaux de Paris, Paris, France; Centre de Recherche des Cordeliers, INSERM UMRS 1138 Team 22, Université de Paris, Paris, France
Kee Yuan Ngiam National University Health Systems, Singapore, Singapore
Gilbert S Omenn Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI, United States
Nathan Palmer Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Lav P Patel Department of Internal Medicine, Division of Medical Informatics, University of Kansas Medical Center, Kansas City, KS, United States
Miguel Pedrera-Jiménez Department of Informatics, 12 de Octubre University Hospital, Madrid, Spain
Piotr Sliz Computational Health Informatics Program, Boston Children's Hospital, Boston, MA, United States
Andrew M South Section of Nephrology, Department of Pediatrics, Brenner Children's Hospital, Wake Forest School of Medicine, Winston Salem, NC, United States
Amelia Li Min Tan Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States; Department of Biomedical Informatics, National University of Singapore, Singapore, Singapore
Deanne M Taylor Department of Biomedical and Health Informatics, The Children's Hospital of Philadelphia, Philadelphia, PA, United States; Department of Pediatrics, Perelman School of Medicine, The University of Pennsylvania, Philadelphia, PA, United States
Bradley W Taylor Clinical and Translational Science Institute, Medical College of Wisconsin, Milwaukee, WI, United States
Carlo Torti Department of Medical and Surgical Sciences, University Magna Graecia of Catanzaro, Catanzaro, Italy
Andrew K Vallejos Clinical and Translational Science Institute, Medical College of Wisconsin, Milwaukee, WI, United States
Kavishwar B Wagholikar Department of Medicine, Harvard Medical School, Boston, MA, United States; Laboratory of Computer Science, Massachusetts General Hospital, Boston, MA, United States
See Acknowledgments,
Griffin M Weber Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Tianxi Cai Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States

Ahuja Y, Kim N, Liang L, Cai T, Dahal K, Seyok T, Lin C, Finan S, Liao K, Savova G, Chitnis T, Cai T, Xia Z. Leveraging electronic health records data to predict multiple sclerosis disease activity. Ann Clin Transl Neurol 2021;8:800-810. [PMID: 33626237 PMCID: PMC8045951 DOI: 10.1002/acn3.51324] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 11/04/2020] [Revised: 12/26/2020] [Accepted: 02/01/2021] [Indexed: 12/26/2022] Open

Geva A, Liu M, Panickan VA, Avillach P, Cai T, Mandl KD. A high-throughput phenotyping algorithm is portable from adult to pediatric populations. J Am Med Inform Assoc 2021;28:1265-1269. [PMID: 33594412 DOI: 10.1093/jamia/ocaa343] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/12/2020] [Revised: 11/27/2020] [Accepted: 12/28/2020] [Indexed: 11/14/2022] Open

Le TT, Gutiérrez-Sacristán A, Son J, Hong C, South AM, Beaulieu-Jones BK, Loh NHW, Luo Y, Morris M, Ngiam KY, Patel LP, Samayamuthu MJ, Schriver E, Tan AL, Moore J, Cai T, Omenn GS, Avillach P, Kohane IS, Visweswaran S, Mowery DL, Xia Z. Multinational Prevalence of Neurological Phenotypes in Patients Hospitalized with COVID-19. medRxiv 2021. [PMID: 33655281 PMCID: PMC7924306 DOI: 10.1101/2021.01.27.21249817] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 01/08/2023]

Abstract

OBJECTIVE:

Neurological complications can worsen outcomes in COVID-19. We defined the prevalence of a wide range of neurological conditions among patients hospitalized with COVID-19 in geographically diverse multinational populations.

METHODS:

Using electronic health record (EHR) data from 348 participating hospitals across 6 countries and 3 continents between January and September 2020, we performed a cross-sectional study of hospitalized adult and pediatric patients with a positive SARS-CoV-2 reverse transcription polymerase chain reaction test, both with and without severe COVID-19. We assessed the frequency of each disease category and 3-character International Classification of Disease (ICD) code of neurological diseases by countries, sites, time before and after admission for COVID-19, and COVID-19 severity.

RESULTS:

Among the 35,177 hospitalized patients with SARS-CoV-2 infection, there was increased prevalence of disorders of consciousness (5.8%, 95% confidence interval [CI]: 3.7%−7.8%, p_FDR<.001) and unspecified disorders of the brain (8.1%, 95%CI: 5.7%−10.5%, p_FDR<.001), compared to pre-admission prevalence. During hospitalization, patients who experienced severe COVID-19 status had 22% (95%CI: 19%−25%) increase in the relative risk (RR) of disorders of consciousness, 24% (95%CI: 13%−35%) increase in other cerebrovascular diseases, 34% (95%CI: 20%−50%) increase in nontraumatic intracranial hemorrhage, 37% (95%CI: 17%−60%) increase in encephalitis and/or myelitis, and 72% (95%CI: 67%−77%) increase in myopathy compared to those who never experienced severe disease.

INTERPRETATION:

Using an international network and common EHR data elements, we highlight an increase in the prevalence of central and peripheral neurological phenotypes in patients hospitalized with SARS-CoV-2 infection, particularly among those with severe disease.

Affiliation(s)

Trang T Le Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
Alba Gutiérrez-Sacristán Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Jiyeon Son Department of Neurology, University of Pittsburgh, Pittsburgh, PA, USA
Chuan Hong Department of Neurology, University of Pittsburgh, Pittsburgh, PA, USA
Andrew M South Department of Pediatrics, Wake Forest School of Medicine, Winston Salem, NC, USA
Brett K Beaulieu-Jones Department of Neurology, University of Pittsburgh, Pittsburgh, PA, USA
Ne Hooi Will Loh Department of Critical Care, National University Health Systems, Singapore
Yuan Luo Department of Preventive Medicine, Northwestern University, Chicago, IL, USA
Michele Morris Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
Kee Yuan Ngiam Department of Surgery, National University Health Systems, Singapore
Lav P Patel Department of Internal Medicine, University of Kansas Medical Center, Kansas City, KS, USA
Malarkodi J Samayamuthu Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
Emily Schriver Data Analytics Center, University of Pennsylvania Health System, Philadelphia, PA, USA
Amelia LM Tan Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Jason Moore Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
Tianxi Cai Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Gilbert S Omenn Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
Paul Avillach Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Isaac S Kohane Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA

Shyam Visweswaran Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
Danielle L Mowery Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
Zongqi Xia Department of Neurology, University of Pittsburgh, PA, USA

Artificial Intelligence in Clinical Immunology. Artif Intell Med 2021. [DOI: 10.1007/978-3-030-58080-3_83-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]

Huang S, Huang J, Cai T, Dahal KP, Cagan A, He Z, Stratton J, Gorelik I, Hong C, Cai T, Liao KP. Impact of ICD10 and secular changes on electronic medical record rheumatoid arthritis algorithms. Rheumatology (Oxford) 2020;59:3759-3766. [PMID: 32413107 DOI: 10.1093/rheumatology/keaa198] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 11/22/2019] [Accepted: 03/17/2020] [Indexed: 12/18/2022] Open

Dligach D, Afshar M, Miller T. Pre-training phenotyping classifiers. J Biomed Inform 2020;113:103626. [PMID: 33259943 DOI: 10.1016/j.jbi.2020.103626] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2020] [Revised: 09/09/2020] [Accepted: 11/14/2020] [Indexed: 11/17/2022]

Statistical Physics for Medical Diagnostics: Learning, Inference, and Optimization Algorithms. Diagnostics (Basel) 2020;10:diagnostics10110972. [PMID: 33228143 PMCID: PMC7699346 DOI: 10.3390/diagnostics10110972] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/06/2020] [Revised: 11/16/2020] [Accepted: 11/17/2020] [Indexed: 02/03/2023] Open

Raghavan S, Ho YL, Vassy JL, Posner D, Honerlaw J, Costa L, Phillips LS, Gagnon DR, Wilson PWF, Cho K. Optimizing Atherosclerotic Cardiovascular Disease Risk Estimation for Veterans With Diabetes Mellitus. Circ Cardiovasc Qual Outcomes 2020;13:e006528. [PMID: 32862698 PMCID: PMC7914289 DOI: 10.1161/circoutcomes.120.006528] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/14/2023]

Abstract

BACKGROUND

Estimated 10-year atherosclerotic cardiovascular disease (ASCVD) risk in diabetes mellitus patients is used to guide primary prevention, but the performance of risk estimators (2013 Pooled Cohort Equations [PCE] and Risk Equations for Complications of Diabetes [RECODe]) varies across populations. Data from electronic health records could be used to improve risk estimation for a health system's patients. We aimed to evaluate risk equations for initial ASCVD events in US veterans with diabetes mellitus and improve model performance in this population.

METHODS AND RESULTS

We studied 183 096 adults with diabetes mellitus and without prior ASCVD who received care in the Veterans Affairs Healthcare System (VA) from 2002 to 2016 with mean follow-up of 4.6 years. We evaluated model discrimination, using Harrell's C statistic, and calibration, using the reclassification χ2 test, of the PCE and RECODe equations to predict fatal or nonfatal myocardial infarction or stroke and cardiovascular mortality. We then tested whether model performance was affected by deriving VA-specific β-coefficients. Discrimination of ASCVD events by the PCE was improved by deriving VA-specific β-coefficients (C statistic increased from 0.560 to 0.597) and improved further by including measures of glycemia, renal function, and diabetes mellitus treatment (C statistic, 0.632). Discrimination by the RECODe equations was improved by substituting VA-specific coefficients (C statistic increased from 0.604 to 0.621). Absolute risk estimation by PCE and RECODe equations also improved with VA-specific coefficients; the calibration P increased from <0.001 to 0.08 for PCE and from <0.001 to 0.005 for RECODe, where higher P indicates better calibration. Approximately two-thirds of veterans would meet a guideline indication for high-intensity statin therapy based on the PCE versus only 10% to 15% using VA-fitted models.

CONCLUSIONS

Existing ASCVD risk equations overestimate risk in veterans with diabetes mellitus, potentially impacting guideline-indicated statin therapy. Prediction model performance can be improved for a health system's patients using readily available electronic health record data.
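
The abstract above reports discrimination as Harrell's C statistic. As a rough illustration of what that number measures, the Python sketch below implements a plain pairwise version of the statistic: among pairs where the subject with the shorter follow-up had an observed event, it counts how often that subject also had the higher predicted risk. This is a minimal sketch under stated assumptions, not the study's code: the function name `harrell_c`, the choice to score tied risks as one half, the choice to skip pairs with tied times, and the four-subject toy data are all hypothetical.

```python
from typing import Sequence


def harrell_c(times: Sequence[float],
              events: Sequence[bool],
              risks: Sequence[float]) -> float:
    """Pairwise concordance between predicted risk and observed survival order."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            # Comparable pair: subject i had an event strictly before
            # subject j's event or censoring time.
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # higher risk failed first
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): follow-up years, event indicator, predicted risk.
toy_times = [1.0, 2.0, 3.0, 4.0]
toy_events = [True, True, False, True]

c_perfect = harrell_c(toy_times, toy_events, [0.9, 0.7, 0.5, 0.1])
c_reversed = harrell_c(toy_times, toy_events, [0.1, 0.5, 0.7, 0.9])
c_uninformative = harrell_c(toy_times, toy_events, [0.5, 0.5, 0.5, 0.5])
print(c_perfect, c_reversed, c_uninformative)
```

On this scale 0.5 corresponds to a model that ranks patients no better than chance and 1.0 to perfect ranking, which gives context for the reported values between 0.560 and 0.632.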

Daniel C, Kalra D. Clinical Research Informatics. Yearb Med Inform 2020;29:203-207. [PMID: 32823317 PMCID: PMC7442510 DOI: 10.1055/s-0040-1702007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022] Open

Abstract

Objectives: To summarize key contributions to current research in the field of Clinical Research Informatics (CRI) and to select best papers published in 2019.

Method: A bibliographic search using a combination of MeSH descriptors and free-text terms on CRI was performed using PubMed, followed by a double-blind review in order to select a list of candidate best papers to be then peer-reviewed by external reviewers. After peer-review ranking, a consensus meeting between the two section editors and the editorial team was organized to finally conclude on the selected three best papers.

Results: Among the 517 papers, published in 2019, returned by the search, that were in the scope of the various areas of CRI, the full review process selected three best papers. The first best paper describes the use of a homomorphic encryption technique to enable federated analysis of real-world data while complying more easily with data protection requirements. The authors of the second best paper demonstrate the evidence value of federated data networks reporting a large real world data study related to the first line treatment for hypertension. The third best paper reports the migration of the US Food and Drug Administration (FDA) adverse event reporting system database to the OMOP common data model. This work opens the combined analysis of both spontaneous reporting system and electronic health record (EHR) data for pharmacovigilance.

Conclusions: The most significant research efforts in the CRI field are currently focusing on real world evidence generation and especially the reuse of EHR data. With the progress achieved this year in the areas of phenotyping, data integration, semantic interoperability, and data quality assessment, real world data is becoming more accessible and reusable. High quality data sets are key assets not only for large scale observational studies or for changing the way clinical trials are conducted but also for developing or evaluating artificial intelligence algorithms guiding clinical decision for more personalized care. And lastly, security and confidentiality, ethical and regulatory issues, and more generally speaking data governance are still active research areas this year.

Vassy JL, Lu B, Ho YL, Galloway A, Raghavan S, Honerlaw J, Tarko L, Russo J, Qazi S, Orkaby AR, Tanukonda V, Djousse L, Gaziano JM, Gagnon DR, Cho K, Wilson PWF. Estimation of Atherosclerotic Cardiovascular Disease Risk Among Patients in the Veterans Affairs Health Care System. JAMA Netw Open 2020;3:e208236. [PMID: 32662843 PMCID: PMC7361654 DOI: 10.1001/jamanetworkopen.2020.8236] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Indexed: 01/06/2023] Open

Abstract

IMPORTANCE

Current guidelines recommend statin therapy for millions of US residents for the primary prevention of atherosclerotic cardiovascular disease (ASCVD). It is unclear whether traditional prediction models that do not account for current widespread statin use are sufficient for risk assessment.

OBJECTIVES

To examine the performance of the Pooled Cohort Equations (PCE) for 5-year ASCVD risk estimation in a contemporary cohort and to test the hypothesis that inclusion of statin therapy improves model performance.

DESIGN, SETTING, AND PARTICIPANTS

This cohort study included adult patients in the Veterans Affairs health care system without baseline ASCVD. Using national electronic health record data, 3 Cox proportional hazards models were developed to estimate 5-year ASCVD risk, as follows: the variables and published β coefficients from the PCE (model 1), the PCE variables with cohort-derived β coefficients (model 2), and model 2 plus baseline statin use (model 3). Data were collected from January 2002 to December 2012 and analyzed from June 2016 to March 2020.

EXPOSURES

Traditional ASCVD risk factors from the PCE plus baseline statin use.

MAIN OUTCOMES AND MEASURES

Incident ASCVD and ASCVD mortality.

RESULTS

Of 1 672 336 patients in the cohort (mean [SD] baseline age 58.0 [13.8] years, 1 575 163 [94.2%] men, 1 383 993 [82.8%] white), 312 155 (18.7%) were receiving statin therapy at baseline. During 5 years of follow-up, 66 605 (4.0%) experienced an ASCVD event, and 31 878 (1.9%) experienced ASCVD death. Compared with the original PCE, the cohort-derived model did not improve model discrimination in any of the 4 age-sex strata but did improve model calibration. The PCE overestimated ASCVD risk compared with the cohort-derived model; 211 237 of 1 136 161 white men (18.6%), 29 634 of 218 463 black men (13.6%), 1741 of 44 399 white women (3.9%), and 836 of 16 034 black women (5.2%) would be potentially eligible for statin therapy under the PCE but not the cohort-derived model. When added to the cohort-derived model, baseline statin therapy was associated with a 7% (95% CI, 5%-9%) lower relative risk of ASCVD and a 25% (95% CI, 23%-28%) lower relative risk for ASCVD death.

CONCLUSIONS AND RELEVANCE

In this study, lower than expected rates of incident ASCVD events in a contemporary national cohort were observed. The PCE overestimated ASCVD risk, and more than 15% of patients would be potentially eligible for statin therapy based on the PCE but not on a cohort-derived model. In the statin era, health care professionals and systems should base ASCVD risk assessment on models calibrated to their patient populations.
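
The "more than 15%" figure in the conclusion can be related to the stratum-level counts in RESULTS with simple arithmetic. The Python sketch below recomputes each stratum's percentage and then pools the four strata. The pooling step is an assumption about how the summary figure relates to the reported counts; the abstract does not state the calculation, and the helper name `pct` is hypothetical.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


cohort = 1_672_336

# (eligible under the PCE but not the cohort-derived model, stratum size),
# copied from the RESULTS paragraph.
strata = {
    "white men": (211_237, 1_136_161),
    "black men": (29_634, 218_463),
    "white women": (1_741, 44_399),
    "black women": (836, 16_034),
}

per_stratum = {name: pct(n, size) for name, (n, size) in strata.items()}

# Assumption: pool the four age-sex-race strata reported in the abstract.
reclassified = sum(n for n, _ in strata.values())
stratum_total = sum(size for _, size in strata.values())
pooled = pct(reclassified, stratum_total)

statin_at_baseline = pct(312_155, cohort)

print(per_stratum, pooled, statin_at_baseline)
```

The four strata sum to fewer patients than the full cohort, so the pooled percentage describes only the groups for which the abstract gives reclassification counts.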

Affiliation(s)

Jason L. Vassy: Veterans Affairs Boston Healthcare System, Boston, Massachusetts; Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts
Bing Lu: Veterans Affairs Boston Healthcare System, Boston, Massachusetts; Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts
Yuk-Lam Ho: Veterans Affairs Boston Healthcare System, Boston, Massachusetts
Ashley Galloway: Veterans Affairs Boston Healthcare System, Boston, Massachusetts
Sridharan Raghavan: Veterans Affairs Eastern Colorado Healthcare System, Aurora; Division of Hospital Medicine, University of Colorado School of Medicine, Aurora; Colorado Cardiovascular Outcomes Research Consortium, Aurora
Jacqueline Honerlaw: Veterans Affairs Boston Healthcare System, Boston, Massachusetts
Laura Tarko: Veterans Affairs Boston Healthcare System, Boston, Massachusetts
John Russo: Veterans Affairs Boston Healthcare System, Boston, Massachusetts; Landmark College, Putney, Vermont
Saadia Qazi: Veterans Affairs Boston Healthcare System, Boston, Massachusetts; Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts
Ariela R. Orkaby: Veterans Affairs Boston Healthcare System, Boston, Massachusetts; Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts
Vidisha Tanukonda: Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts
Luc Djousse: Veterans Affairs Boston Healthcare System, Boston, Massachusetts; Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts
J. Michael Gaziano: Veterans Affairs Boston Healthcare System, Boston, Massachusetts; Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts
David R. Gagnon: Veterans Affairs Boston Healthcare System, Boston, Massachusetts; Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts
Kelly Cho: Veterans Affairs Boston Healthcare System, Boston, Massachusetts; Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts
Peter W. F. Wilson: Atlanta Veterans Affairs Medical Center, Decatur, Georgia; Division of Cardiology, Emory University School of Medicine, Atlanta, Georgia; Rollins School of Public Health, Department of Epidemiology, Emory University, Atlanta, Georgia

Siontis KC, Yao X, Pirruccello JP, Philippakis AA, Noseworthy PA. How Will Machine Learning Inform the Clinical Care of Atrial Fibrillation? Circ Res 2020;127:155-169. [DOI: 10.1161/circresaha.120.316401] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Indexed: 11/16/2022]

Zhao SS, Hong C, Cai T, Xu C, Huang J, Ermann J, Goodson NJ, Solomon DH, Cai T, Liao KP. Incorporating natural language processing to improve classification of axial spondyloarthritis using electronic health records. Rheumatology (Oxford) 2020;59:1059-1065. [PMID: 31535693 DOI: 10.1093/rheumatology/kez375] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 05/16/2019] [Revised: 07/22/2019] [Indexed: 12/13/2022]

Raghavan S, Ho YL, Kini V, Rhee MK, Vassy JL, Gagnon DR, Cho K, Wilson PWF, Phillips LS. Association Between Early Hypertension Control and Cardiovascular Disease Incidence in Veterans With Diabetes. Diabetes Care 2019;42:1995-2003. [PMID: 31515207 PMCID: PMC6754236 DOI: 10.2337/dc19-0686] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/04/2019] [Accepted: 07/26/2019] [Indexed: 02/03/2023]

Abstract

OBJECTIVE

Guidelines for hypertension treatment in patients with diabetes diverge regarding the systolic blood pressure (SBP) threshold at which treatment should be initiated and the treatment goal. We examined associations of early SBP treatment with atherosclerotic cardiovascular disease (ASCVD) events in U.S. adults with diabetes.

RESEARCH DESIGN AND METHODS

We studied 43,986 patients with diabetes who newly initiated antihypertensive therapy between 2002 and 2007. Patients were classified into categories based on SBP at treatment initiation (130-139 or ≥140 mmHg) and after 2 years of treatment (100-119, 120-129, 130-139, 140-159, and ≥160 mmHg). The primary outcome was composite ASCVD events (fatal and nonfatal myocardial infarction and stroke), estimated using inverse probability of treatment-weighted Poisson regression and multivariable Cox proportional hazards regression.

RESULTS

Relative to individuals who initiated treatment when SBP was 130-139 mmHg, those with pretreatment SBP ≥140 mmHg had higher ASCVD risk (hazard ratio 1.10 [95% CI 1.02, 1.19]). Relative to those with pretreatment SBP of 130-139 mmHg and on-treatment SBP of 120-129 mmHg (reference group), ASCVD incidence was higher in those with pretreatment SBP ≥140 mmHg and on-treatment SBP 120-129 mmHg (adjusted incidence rate difference [IRD] 1.0 [-0.2, 2.1] events/1,000 person-years) and in those who achieved on-treatment SBP 130-139 mmHg (IRD 1.9 [0.6, 3.2] and 1.1 [0.04, 2.2] events/1,000 person-years for those with pretreatment SBP 130-139 mmHg and ≥140 mmHg, respectively).

CONCLUSIONS

In this observational study, patients with diabetes initiating antihypertensive therapy when SBP was 130-139 mmHg and those achieving on-treatment SBP <130 mmHg had better outcomes than those with higher SBP levels when initiating or after 2 years on treatment.
