Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Total Articles: 73 (from Reference Citation Analysis)

Article PDFs (17)

Cited by > 0 (57)

Searched Name: Ramakanth Kavuluru

Davarpanah MA, Adatorwovor R, Mansoori Y, Ramsheh FSR, Parsa A, Hajiani M, Faramarzi H, Kavuluru R, Asadipooya K. Combination of spironolactone and sitagliptin improves clinical outcomes of outpatients with COVID-19: a prospective cohort study. J Endocrinol Invest 2024;47:235-243. [PMID: 37354247 DOI: 10.1007/s40618-023-02141-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Accepted: 06/16/2023] [Indexed: 06/26/2023]

Liu S, Wen A, Wang L, He H, Fu S, Miller R, Williams A, Harris D, Kavuluru R, Liu M, Abu-el-Rub N, Schutte D, Zhang R, Rouhizadeh M, Osborne JD, He Y, Topaloglu U, Hong SS, Saltz JH, Schaffter T, Pfaff E, Chute CG, Duong T, Haendel MA, Fuentes R, Szolovits P, Xu H, Liu H. An open natural language processing (NLP) framework for EHR-based clinical research: a case demonstration using the National COVID Cohort Collaborative (N3C). J Am Med Inform Assoc 2023;30:2036-2040. [PMID: 37555837 PMCID: PMC10654844 DOI: 10.1093/jamia/ocad134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2022] [Revised: 06/28/2023] [Accepted: 08/08/2023] [Indexed: 08/10/2023] Open

Affiliation(s)

Sijia Liu Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, Minnesota, USA
Andrew Wen Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, Minnesota, USA
Liwei Wang Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, Minnesota, USA
Huan He Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, Minnesota, USA
Sunyang Fu Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, Minnesota, USA
Robert Miller Tufts Clinical and Translational Science Institute, Tufts Medical Center, Boston, Massachusetts, USA
Andrew Williams Tufts Clinical and Translational Science Institute, Tufts Medical Center, Boston, Massachusetts, USA
Daniel Harris Department of Internal Medicine, University of Kentucky, Lexington, Kentucky, USA
Ramakanth Kavuluru Department of Internal Medicine, University of Kentucky, Lexington, Kentucky, USA
Mei Liu Department of Internal Medicine, University of Kansas Medical Center, Kansas City, Kansas, USA
Noor Abu-el-Rub Department of Internal Medicine, University of Kansas Medical Center, Kansas City, Kansas, USA
Dalton Schutte Department of Pharmaceutical Care & Health Systems, University of Minnesota at Twin Cities, Minneapolis, Minnesota, USA
Rui Zhang Department of Pharmaceutical Care & Health Systems, University of Minnesota at Twin Cities, Minneapolis, Minnesota, USA
Masoud Rouhizadeh Department of Pharmaceutical Outcomes & Policy, University of Florida, Gainesville, Florida, USA
John D Osborne Department of Computer Science, University of Alabama at Birmingham, Birmingham, Alabama, USA
Yongqun He Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, Michigan, USA
Umit Topaloglu Department of Cancer Biology, Wake Forest School of Medicine, Winston-Salem, North Carolina, USA
Stephanie S Hong Department of Medicine, Johns Hopkins University, Baltimore, Maryland, USA
Joel H Saltz Department of Biomedical Informatics, Stony Brook University, Stony Brook, New York, USA
Thomas Schaffter Sage Bionetworks, Seattle, Washington, USA
Emily Pfaff Department of Medicine, University of North Carolina Chapel Hill, Chapel Hill, North Carolina, USA
Christopher G Chute Department of Medicine, Johns Hopkins University, Baltimore, Maryland, USA
Tim Duong Department of Radiology, Albert Einstein College of Medicine, Bronx, New York, USA
Melissa A Haendel Center for Health AI, University of Colorado Anschutz Medical Campus, Denver, Colorado, USA
Rafael Fuentes Axle Informatics, North Bethesda, Maryland, USA
Peter Szolovits Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
Hua Xu School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, USA
Hongfang Liu Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, Minnesota, USA; School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, USA

Hochheiser H, Finan S, Yuan Z, Durbin EB, Jeong JC, Hands I, Rust D, Kavuluru R, Wu XC, Warner JL, Savova G. DeepPhe-CR: Natural Language Processing Software Services for Cancer Registrar Case Abstraction. medRxiv 2023:2023.05.05.23289524. [PMID: 37205575 PMCID: PMC10187451 DOI: 10.1101/2023.05.05.23289524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023]

Abstract

Objective

The manual extraction of case details from patient records for cancer surveillance efforts is a resource-intensive task. Natural Language Processing (NLP) techniques have been proposed for automating the identification of key details in clinical notes. Our goal was to develop NLP application programming interfaces (APIs) for integration into cancer registry data abstraction tools in a computer-assisted abstraction setting.

Methods

We used cancer registry manual abstraction processes to guide the design of DeepPhe-CR, a web-based NLP service API. The coding of key variables was done through NLP methods validated using established workflows. A container-based implementation including the NLP was developed. Existing registry data abstraction software was modified to include results from DeepPhe-CR. An initial usability study with data registrars provided early validation of the feasibility of the DeepPhe-CR tools.

Results

API calls support submission of single documents and summarization of cases across multiple documents. The container-based implementation uses a REST router to handle requests and support a graph database for storing results. NLP modules extract topography, histology, behavior, laterality, and grade at 0.79-1.00 F1 across common and rare cancer types (breast, prostate, lung, colorectal, ovary and pediatric brain) on data from two cancer registries. Usability study participants were able to use the tool effectively and expressed interest in adopting the tool.

Discussion

Our DeepPhe-CR system provides a flexible architecture for building cancer-specific NLP tools directly into registrar workflows in a computer-assisted abstraction setting. Improving user interactions in client tools may be needed to realize the potential of these approaches. DeepPhe-CR: https://deepphe.github.io/.

Hochheiser H, Finan S, Yuan Z, Durbin EB, Jeong JC, Hands I, Rust D, Kavuluru R, Wu XC, Warner JL, Savova G. DeepPhe-CR: Natural Language Processing Software Services for Cancer Registrar Case Abstraction. JCO Clin Cancer Inform 2023;7:e2300156. [PMID: 38113411 PMCID: PMC10752457 DOI: 10.1200/cci.23.00156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2023] [Revised: 10/04/2023] [Accepted: 10/04/2023] [Indexed: 12/21/2023] Open

Wang L, He H, Wen A, Moon S, Fu S, Peterson KJ, Ai X, Liu S, Kavuluru R, Liu H. Acquisition of a Lexicon for Family History Information: Bidirectional Encoder Representations From Transformers-Assisted Sublanguage Analysis. JMIR Med Inform 2023;11:e48072. [PMID: 37368483 PMCID: PMC10337517 DOI: 10.2196/48072] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/13/2023] [Revised: 05/25/2023] [Accepted: 06/01/2023] [Indexed: 06/28/2023] Open

Abstract

BACKGROUND

A patient's family history (FH) information significantly influences downstream clinical care. Despite this importance, there is no standardized method to capture FH information in electronic health records and a substantial portion of FH information is frequently embedded in clinical notes. This renders FH information difficult to use in downstream data analytics or clinical decision support applications. To address this issue, a natural language processing system capable of extracting and normalizing FH information can be used.

OBJECTIVE

In this study, we aimed to construct an FH lexical resource for information extraction and normalization.

METHODS

We exploited a transformer-based method to construct an FH lexical resource leveraging a corpus consisting of clinical notes generated as part of primary care. The usability of the lexicon was demonstrated through the development of a rule-based FH system that extracts FH entities and relations as specified in previous FH challenges. We also experimented with a deep learning-based FH system for FH information extraction. Previous FH challenge data sets were used for evaluation.

RESULTS

The resulting lexicon contains 33,603 lexicon entries normalized to 6408 concept unique identifiers of the Unified Medical Language System and 15,126 codes of the Systematized Nomenclature of Medicine Clinical Terms, with an average number of 5.4 variants per concept. The performance evaluation demonstrated that the rule-based FH system achieved reasonable performance. The combination of the rule-based FH system with a state-of-the-art deep learning-based FH system can improve the recall of FH information evaluated using the BioCreative/N2C2 FH challenge data set, with the F1 score varied but comparable.

CONCLUSIONS

The resulting lexicon and rule-based FH system are freely available through the Open Health Natural Language Processing GitHub.

Ai X, Kavuluru R. End-to-End Models for Chemical-Protein Interaction Extraction: Better Tokenization and Span-Based Pipeline Strategies. IEEE Int Conf Healthc Inform 2023;2023:610-618. [PMID: 38274947 PMCID: PMC10809256 DOI: 10.1109/ichi57859.2023.00108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/27/2024]

Jiang Y, Kavuluru R. End-to-End n-ary Relation Extraction for Combination Drug Therapies. IEEE Int Conf Healthc Inform 2023;2023:72-80. [PMID: 38283165 PMCID: PMC10814995 DOI: 10.1109/ichi57859.2023.00021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/30/2024]

Fouladvand S, Talbert J, Dwoskin LP, Bush H, Meadows AL, Peterson LE, Mishra YR, Roggenkamp SK, Wang F, Kavuluru R, Chen J. A Comparative Effectiveness Study on Opioid Use Disorder Prediction Using Artificial Intelligence and Existing Risk Models. IEEE J Biomed Health Inform 2023;PP. [PMID: 37037255 DOI: 10.1109/jbhi.2023.3265920] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/12/2023]

Ward PJ, Young AM, Slavova S, Liford M, Daniels L, Lucas R, Kavuluru R. Deep Neural Networks for Fine-Grained Surveillance of Overdose Mortality. Am J Epidemiol 2023;192:257-266. [PMID: 36222700 DOI: 10.1093/aje/kwac180] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/30/2021] [Revised: 08/16/2022] [Accepted: 10/10/2022] [Indexed: 02/07/2023] Open

Phuong J, Riches NO, Calzoni L, Datta G, Duran D, Lin AY, Singh RP, Solomonides AE, Whysel NY, Kavuluru R. Toward informatics-enabled preparedness for natural hazards to minimize health impacts of climate change. J Am Med Inform Assoc 2022;29:2161-2167. [PMID: 36094062 PMCID: PMC9667167 DOI: 10.1093/jamia/ocac162] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/18/2022] [Revised: 08/21/2022] [Accepted: 08/30/2022] [Indexed: 09/14/2023] Open

Zhang HG, Dagliati A, Shakeri Hossein Abad Z, Xiong X, Bonzel CL, Xia Z, Tan BWQ, Avillach P, Brat GA, Hong C, Morris M, Visweswaran S, Patel LP, Gutiérrez-Sacristán A, Hanauer DA, Holmes JH, Samayamuthu MJ, Bourgeois FT, L'Yi S, Maidlow SE, Moal B, Murphy SN, Strasser ZH, Neuraz A, Ngiam KY, Loh NHW, Omenn GS, Prunotto A, Dalvin LA, Klann JG, Schubert P, Vidorreta FJS, Benoit V, Verdy G, Kavuluru R, Estiri H, Luo Y, Malovini A, Tibollo V, Bellazzi R, Cho K, Ho YL, Tan ALM, Tan BWL, Gehlenborg N, Lozano-Zahonero S, Jouhet V, Chiovato L, Aronow BJ, Toh EMS, Wong WGS, Pizzimenti S, Wagholikar KB, Bucalo M, Cai T, South AM, Kohane IS, Weber GM. International electronic health record-derived post-acute sequelae profiles of COVID-19 patients. NPJ Digit Med 2022;5:81. [PMID: 35768548 PMCID: PMC9242995 DOI: 10.1038/s41746-022-00623-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2022] [Accepted: 05/19/2022] [Indexed: 11/10/2022] Open

Affiliation(s)

Harrison G Zhang Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Arianna Dagliati Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
Zahra Shakeri Hossein Abad Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Xin Xiong Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
Clara-Lea Bonzel Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Zongqi Xia Department of Neurology, University of Pittsburgh, Pittsburgh, PA, USA
Bryce W Q Tan Department of Medicine, National University Hospital, Singapore, Singapore
Paul Avillach Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Gabriel A Brat Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Chuan Hong Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA; Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA
Michele Morris Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
Shyam Visweswaran Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
Lav P Patel Department of Internal Medicine, Division of Medical Informatics, University of Kansas Medical Center, Kansas City, KS, USA
Alba Gutiérrez-Sacristán Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
David A Hanauer Department of Learning Health Sciences, University of Michigan Medical School, Ann Arbor, MI, USA
John H Holmes Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA; Institute for Biomedical Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
Malarkodi Jebathilagam Samayamuthu Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
Florence T Bourgeois Department of Pediatrics, Harvard Medical School, Boston, MA, USA
Sehi L'Yi Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Sarah E Maidlow Michigan Institute for Clinical and Health Research (MICHR) Informatics, University of Michigan, Ann Arbor, MI, USA
Bertrand Moal IAM unit, Bordeaux University Hospital, Bordeaux, France
Shawn N Murphy Department of Neurology, Massachusetts General Hospital, Boston, MA, USA
Zachary H Strasser Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
Antoine Neuraz Department of Biomedical Informatics, Hôpital Necker-Enfants Malades, Assistance Publique Hôpitaux de Paris (APHP), University of Paris, Paris, France
Kee Yuan Ngiam Department of Biomedical Informatics, WiSDM, National University Health Systems Singapore, Singapore, Singapore
Ne Hooi Will Loh Department of Anaesthesia, National University Health Systems Singapore, Singapore, Singapore
Gilbert S Omenn Department of Computational Medicine & Bioinformatics, Internal Medicine, Human Genetics, and School of Public Health, University of Michigan, Ann Arbor, MI, USA
Andrea Prunotto Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany
Lauren A Dalvin Department of Ophthalmology, Mayo Clinic, Rochester, MN, USA
Jeffrey G Klann Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
Petra Schubert Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), VA Boston Healthcare System, Boston, MA, USA
Fernando J Sanz Vidorreta Department of Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
Vincent Benoit IT Department, Innovation & Data, APHP Greater Paris University Hospital, Paris, France
Guillaume Verdy IAM unit, Bordeaux University Hospital, Bordeaux, France
Ramakanth Kavuluru Division of Biomedical Informatics (Department of Internal Medicine), University of Kentucky, Lexington, KY, USA
Hossein Estiri Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
Yuan Luo Department of Preventive Medicine, Northwestern University, Chicago, IL, USA
Alberto Malovini Laboratory of Informatics and Systems Engineering for Clinical Research, Istituti Clinici Scientifici Maugeri SpA SB IRCCS, Pavia, Italy
Valentina Tibollo Laboratory of Informatics and Systems Engineering for Clinical Research, Istituti Clinici Scientifici Maugeri SpA SB IRCCS, Pavia, Italy
Riccardo Bellazzi Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
Kelly Cho Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), VA Boston Healthcare System, Boston, MA, USA; Population Health and Data Science, VA Boston Healthcare System, Boston, MA, USA
Yuk-Lam Ho Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), VA Boston Healthcare System, Boston, MA, USA
Amelia L M Tan Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Byorn W L Tan Department of Medicine, National University Hospital, Singapore, Singapore
Nils Gehlenborg Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Sara Lozano-Zahonero Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany
Vianney Jouhet IAM unit, INSERM Bordeaux Population Health ERIAS TEAM, Bordeaux University Hospital / ERIAS - Inserm, U1219 BPH, Bordeaux, France
Luca Chiovato Unit of Internal Medicine and Endocrinology, Istituti Clinici Scientifici Maugeri SpA SB IRCCS, Pavia, Italy
Bruce J Aronow Departments of Biomedical Informatics, Pediatrics, Cincinnati Children's Hospital Medical Center, University of Cincinnati, Cincinnati, OH, USA
Emma M S Toh Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
Wei Gen Scott Wong Department of Medicine, National University Health Systems Singapore, Singapore, Singapore
Sara Pizzimenti Scientific Direction, IRCCS Ca' Granda Ospedale Maggiore Policlinico di Milano, Milan, Italy
Kavishwar B Wagholikar Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
Mauro Bucalo BIOMERIS (BIOMedical Research Informatics Solutions), Pavia, Italy
Tianxi Cai Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Andrew M South Department of Pediatrics-Section of Nephrology, Brenner Children's, Wake Forest School of Medicine, Winston Salem, NC, USA
Isaac S Kohane Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Griffin M Weber Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA

Hong C, Zhang HG, L'Yi S, Weber G, Avillach P, Tan BWQ, Gutiérrez-Sacristán A, Bonzel CL, Palmer NP, Malovini A, Tibollo V, Luo Y, Hutch MR, Liu M, Bourgeois F, Bellazzi R, Chiovato L, Sanz Vidorreta FJ, Le TT, Wang X, Yuan W, Neuraz A, Benoit V, Moal B, Morris M, Hanauer DA, Maidlow S, Wagholikar K, Murphy S, Estiri H, Makoudjou A, Tippmann P, Klann J, Follett RW, Gehlenborg N, Omenn GS, Xia Z, Dagliati A, Visweswaran S, Patel LP, Mowery DL, Schriver ER, Samayamuthu MJ, Kavuluru R, Lozano-Zahonero S, Zöller D, Tan ALM, Tan BWL, Ngiam KY, Holmes JH, Schubert P, Cho K, Ho YL, Beaulieu-Jones BK, Pedrera-Jiménez M, García-Barrio N, Serrano-Balazote P, Kohane I, South A, Brat GA, Cai T. Changes in laboratory value improvement and mortality rates over the course of the pandemic: an international retrospective cohort study of hospitalised patients infected with SARS-CoV-2. BMJ Open 2022;12:e057725. [PMID: 35738646 PMCID: PMC9226470 DOI: 10.1136/bmjopen-2021-057725] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/30/2021] [Accepted: 06/12/2022] [Indexed: 01/08/2023] Open

Abstract

OBJECTIVE

To assess changes in international mortality rates and laboratory recovery rates during hospitalisation for patients hospitalised with SARS-CoV-2 between the first wave (1 March to 30 June 2020) and the second wave (1 July 2020 to 31 January 2021) of the COVID-19 pandemic.

DESIGN, SETTING AND PARTICIPANTS

This is a retrospective cohort study of 83 178 hospitalised patients admitted between 7 days before or 14 days after PCR-confirmed SARS-CoV-2 infection within the Consortium for Clinical Characterization of COVID-19 by Electronic Health Record, an international multihealthcare system collaborative of 288 hospitals in the USA and Europe. The laboratory recovery rates and mortality rates over time were compared between the two waves of the pandemic.

PRIMARY AND SECONDARY OUTCOME MEASURES

The primary outcome was all-cause mortality rate within 28 days after hospitalisation stratified by predicted low, medium and high mortality risk at baseline. The secondary outcome was the average rate of change in laboratory values during the first week of hospitalisation.

RESULTS

Baseline Charlson Comorbidity Index and laboratory values at admission were not significantly different between the first and second waves. The improvement in laboratory values over time was faster in the second wave compared with the first. The average C reactive protein rate of change was -4.72 mg/dL vs -4.14 mg/dL per day (p=0.05). The mortality rates within each risk category significantly decreased over time, with the most substantial decrease in the high-risk group (42.3% in March-April 2020 vs 30.8% in November 2020 to January 2021, p<0.001) and a moderate decrease in the intermediate-risk group (21.5% in March-April 2020 vs 14.3% in November 2020 to January 2021, p<0.001).

CONCLUSIONS

Admission profiles of patients hospitalised with SARS-CoV-2 infection did not differ greatly between the first and second waves of the pandemic, but there were notable differences in laboratory improvement rates during hospitalisation. Mortality risks among patients with similar risk profiles decreased over the course of the pandemic. The improvement in laboratory values and mortality risk was consistent across multiple countries.

Affiliation(s)

Chuan Hong Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
Harrison G Zhang Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
Sehi L'Yi Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
Griffin Weber Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
Paul Avillach Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
Bryce W Q Tan Department of Medicine, National University Hospital, Singapore
Alba Gutiérrez-Sacristán Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
Clara-Lea Bonzel Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
Nathan P Palmer Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
Alberto Malovini Laboratory of Informatics and Systems Engineering for Clinical Research, Istituti Clinici Scientifici Maugeri SpA SB IRCCS, Pavia, Lombardia, Italy
Valentina Tibollo Laboratory of Informatics and Systems Engineering for Clinical Research, Istituti Clinici Scientifici Maugeri SpA SB IRCCS, Pavia, Lombardia, Italy
Yuan Luo Department of Preventive Medicine, Northwestern University, Evanston, Illinois, USA
Meghan R Hutch Department of Preventive Medicine, Northwestern University, Evanston, Illinois, USA
Molei Liu Department of Biostatistics, Harvard University T H Chan School of Public Health, Boston, Massachusetts, USA
Florence Bourgeois Department of Pediatrics, Harvard Medical School, Boston, Massachusetts, USA
Riccardo Bellazzi Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
Luca Chiovato Unit of Internal Medicine and Endocrinology, Istituti Clinici Scientifici Maugeri SpA SB IRCCS, Pavia, Lombardia, Italy
Fernando J Sanz Vidorreta Department of Medicine, David Geffen School of Medicine, Los Angeles, California, USA
Trang T Le Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
Xuan Wang Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
William Yuan Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
Antoine Neuraz Department of Biomedical Informatics, Hopital Universitaire Necker-Enfants Malades, Paris, Île-de-France, France
Vincent Benoit IT department, Innovation & Data, APHP Greater Paris University Hospital, Paris, France
Bertrand Moal IAM unit, Bordeaux University Hospital, Bordeaux, France
Michele Morris Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
David A Hanauer Department of Learning Health Sciences, University of Michigan Medical School, Ann Arbor, Michigan, USA
Sarah Maidlow MICHR Informatics, University of Michigan, Ann Arbor, Michigan, USA
Kavishwar Wagholikar Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA
Shawn Murphy Neurology, Massachusetts General Hospital, Boston, Massachusetts, USA
Hossein Estiri Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA
Adeline Makoudjou Institute of Medical Biometry and Statistics, University of Freiburg Faculty of Medicine, Freiburg, Baden-Württemberg, Germany
Patric Tippmann Institute of Medical Biometry and Statistics, Medical Center-University of Freiburg, Freiburg, Baden-Württemberg, Germany
Jeffery Klann Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA
Robert W Follett Department of Medicine, David Geffen School of Medicine, Los Angeles, California, USA
Nils Gehlenborg Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
Gilbert S Omenn Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
Zongqi Xia Department of Neurology, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
Arianna Dagliati Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
Shyam Visweswaran Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
Lav P Patel Department of Internal Medicine, Division of Medical Informatics, University of Kansas Medical Center, Kansas City, Kansas, USA
Danielle L Mowery Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
Emily R Schriver Data Analytics Center, University of Pennsylvania Health System, Philadelphia, Pennsylvania, USA
Malarkodi Jebathilagam Samayamuthu Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
Ramakanth Kavuluru Institute for Biomedical Informatics, University of Kentucky, Lexington, Kentucky, USA
Sara Lozano-Zahonero Institute of Medical Biometry and Statistics, University of Freiburg Faculty of Medicine, Freiburg, Baden-Württemberg, Germany
Daniela Zöller Institute of Medical Biometry and Statistics, University of Freiburg Faculty of Medicine, Freiburg, Baden-Württemberg, Germany
Amelia L M Tan Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
Byorn W L Tan Department of Medicine, National University Hospital, Singapore
Kee Yuan Ngiam Department of Surgery, National University Hospital, Singapore
John H Holmes Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA; Institute for Biomedical Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
Petra Schubert Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), VA Boston Healthcare System, Boston, Massachusetts, USA
Kelly Cho Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), VA Boston Healthcare System, Boston, Massachusetts, USA
Yuk-Lam Ho Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), VA Boston Healthcare System, Boston, Massachusetts, USA
Brett K Beaulieu-Jones Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
Miguel Pedrera-Jiménez Health Informatics, Hospital Universitario 12 de Octubre, Madrid, Comunidad de Madrid, Spain
Noelia García-Barrio Health Informatics, Hospital Universitario 12 de Octubre, Madrid, Comunidad de Madrid, Spain
Pablo Serrano-Balazote Health Informatics, Hospital Universitario 12 de Octubre, Madrid, Comunidad de Madrid, Spain
Isaac Kohane Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
Andrew South Department of Pediatrics, Section of Nephrology, Wake Forest University, Winston Salem, North Carolina, USA
Gabriel A Brat Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
T Cai Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA

Bopaiah J, Garimella K, Kavuluru R. Opinions on Homeopathy for COVID-19 on Twitter. Proc ACM Web Sci Conf 2022;2022:359-363. [PMID: 36112977 PMCID: PMC9472594 DOI: 10.1145/3501247.3531575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]

Phuong J, Riches NO, Madlock-Brown C, Duran D, Calzoni L, Espinoza JC, Datta G, Kavuluru R, Weiskopf NG, Ward-Caviness CK, Lin AY. Social Determinants of Health Factors for Gene-Environment COVID-19 Research: Challenges and Opportunities. Adv Genet (Hoboken) 2022;3:2100056. [PMID: 35574521 PMCID: PMC9087427 DOI: 10.1002/ggn2.202100056] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/23/2021] [Indexed: 01/25/2023]

Song Q, Bates B, Shao YR, Hsu FC, Liu F, Madhira V, Mitra AK, Bergquist T, Kavuluru R, Li X, Sharafeldin N, Su J, Topaloglu U. Risk and Outcome of Breakthrough COVID-19 Infections in Vaccinated Patients With Cancer: Real-World Evidence From the National COVID Cohort Collaborative. J Clin Oncol 2022;40:1414-1427. [PMID: 35286152 PMCID: PMC9061155 DOI: 10.1200/jco.21.02419] [Citation(s) in RCA: 55] [Impact Index Per Article: 27.5] [Received: 10/11/2021] [Revised: 02/07/2022] [Accepted: 02/18/2022] [Indexed: 12/15/2022]

Abstract

PURPOSE

To provide real-world evidence on risks and outcomes of breakthrough COVID-19 infections in vaccinated patients with cancer using the largest national cohort of COVID-19 cases and controls.

METHODS

We used the National COVID Cohort Collaborative (N3C) to identify breakthrough infections between December 1, 2020, and May 31, 2021. We included patients partially or fully vaccinated with mRNA COVID-19 vaccines with no prior SARS-CoV-2 infection record. Risks for breakthrough infection and severe outcomes were analyzed using logistic regression.

RESULTS

A total of 6,860 breakthrough cases were identified within the N3C-vaccinated population, among whom 1,460 (21.3%) were patients with cancer. Solid tumors and hematologic malignancies had significantly higher risks for breakthrough infection (odds ratios [ORs] = 1.12, 95% CI, 1.01 to 1.23 and 4.64, 95% CI, 3.98 to 5.38) and severe outcomes (ORs = 1.33, 95% CI, 1.09 to 1.62 and 1.45, 95% CI, 1.08 to 1.95) compared with noncancer patients, adjusting for age, sex, race/ethnicity, smoking status, vaccine type, and vaccination date. Compared with solid tumors, hematologic malignancies were at increased risk for breakthrough infections (adjusted OR ranged from 2.07 for lymphoma to 7.25 for lymphoid leukemia). Breakthrough risk was reduced after the second vaccine dose for all cancers (OR = 0.04; 95% CI, 0.04 to 0.05), and for Moderna's mRNA-1273 compared with Pfizer's BNT162b2 vaccine (OR = 0.66; 95% CI, 0.62 to 0.70), particularly in patients with multiple myeloma (OR = 0.35; 95% CI, 0.15 to 0.72). Medications with major immunosuppressive effects and bone marrow transplantation were strongly associated with breakthrough risk among the vaccinated population.

CONCLUSION

Real-world evidence shows that patients with cancer, especially hematologic malignancies, are at higher risk for developing breakthrough infections and severe outcomes. Patients with vaccination were at markedly decreased risk for breakthrough infections. Further work is needed to assess boosters and new SARS-CoV-2 variants.
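
Note on reading the odds ratios above: the abstract reports ORs with 95% CIs from logistic regression. The following is a minimal sketch, not the authors' code, of how a fitted log-odds coefficient and its standard error are conventionally converted to an OR and a Wald-type interval. The function name `odds_ratio_ci` and the example coefficient are hypothetical and are not taken from the study.

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Convert a log-odds coefficient and its standard error
    into (odds ratio, lower bound, upper bound).

    z=1.96 gives an approximate 95% Wald interval.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Hypothetical coefficient and standard error, for illustration only.
or_, lower, upper = odds_ratio_ci(beta=0.5, se=0.1)
print(f"OR = {or_:.2f}, 95% CI, {lower:.2f} to {upper:.2f}")
```

An OR above 1 whose interval excludes 1 corresponds to a statistically significant increase in odds at roughly the 5% level, which is how the "significantly higher risks" wording in the abstract is usually read.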

Fouladvand S, Talbert J, Dwoskin LP, Bush H, Meadows AL, Peterson LE, Roggenkamp SK, Kavuluru R, Chen J. Identifying Opioid Use Disorder from Longitudinal Healthcare Data using a Multi-stream Transformer. AMIA Annu Symp Proc 2022;2021:476-485. [PMID: 35308960 PMCID: PMC8861731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]

Bograd S, Chen B, Kavuluru R. Tracking sentiments toward fat acceptance over a decade on Twitter. Health Informatics J 2022;28:14604582211065702. [PMID: 34986689 DOI: 10.1177/14604582211065702] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 01/07/2023]

Soleymanpour M, Saderholm S, Kavuluru R. Therapeutic Claims in Cannabidiol (CBD) Marketing Messages on Twitter. Proceedings (IEEE Int Conf Bioinformatics Biomed) 2021;2021:3083-3088. [PMID: 35096472 DOI: 10.1109/bibm52615.2021.9669404] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/19/2022]

Deer RR, Rock MA, Vasilevsky N, Carmody L, Rando H, Anzalone AJ, Basson MD, Bennett TD, Bergquist T, Boudreau EA, Bramante CT, Byrd JB, Callahan TJ, Chan LE, Chu H, Chute CG, Coleman BD, Davis HE, Gagnier J, Greene CS, Hillegass WB, Kavuluru R, Kimble WD, Koraishy FM, Köhler S, Liang C, Liu F, Liu H, Madhira V, Madlock-Brown CR, Matentzoglu N, Mazzotti DR, McMurry JA, McNair DS, Moffitt RA, Monteith TS, Parker AM, Perry MA, Pfaff E, Reese JT, Saltz J, Schuff RA, Solomonides AE, Solway J, Spratt H, Stein GS, Sule AA, Topaloglu U, Vavougios GD, Wang L, Haendel MA, Robinson PN. Characterizing Long COVID: Deep Phenotype of a Complex Condition. EBioMedicine 2021;74:103722. [PMID: 34839263 PMCID: PMC8613500 DOI: 10.1016/j.ebiom.2021.103722] [Citation(s) in RCA: 102] [Impact Index Per Article: 34.0] [Received: 09/10/2021] [Revised: 10/22/2021] [Accepted: 11/15/2021] [Indexed: 12/12/2022]

Abstract

BACKGROUND

Numerous publications describe the clinical manifestations of post-acute sequelae of SARS-CoV-2 (PASC or "long COVID"), but they are difficult to integrate because of heterogeneous methods and the lack of a standard for denoting the many phenotypic manifestations. Patient-led studies are of particular importance for understanding the natural history of COVID-19, but integration is hampered because they often use different terms to describe the same symptom or condition. This significant disparity in patient versus clinical characterization motivated the proposed ontological approach to specifying manifestations, which will improve capture and integration of future long COVID studies.

METHODS

The Human Phenotype Ontology (HPO) is a widely used standard for exchange and analysis of phenotypic abnormalities in human disease but has not yet been applied to the analysis of COVID-19.

FINDINGS

We identified 303 articles published before April 29, 2021, curated 59 relevant manuscripts that described clinical manifestations in 81 cohorts three weeks or more following acute COVID-19, and mapped 287 unique clinical findings to HPO terms. We present layperson synonyms and definitions that can be used to link patient self-report questionnaires to standard medical terminology. Long COVID clinical manifestations are not assessed consistently across studies, and most manifestations have been reported with a wide range of synonyms by different authors. Across at least 10 cohorts, authors reported 31 unique clinical features corresponding to HPO terms; the most commonly reported feature was Fatigue (median 45.1%) and the least commonly reported was Nausea (median 3.9%), but the reported percentages varied widely between studies.

INTERPRETATION

Translating long COVID manifestations into computable HPO terms will improve analysis, data capture, and classification of long COVID patients. If researchers, clinicians, and patients share a common language, then studies can be compared/pooled more effectively. Furthermore, mapping lay terminology to HPO will help patients assist clinicians and researchers in creating phenotypic characterizations that are computationally accessible, thereby improving the stratification, diagnosis, and treatment of long COVID.

FUNDING

U24TR002306; UL1TR001439; P30AG024832; GBMF4552; R01HG010067; UL1TR002535; K23HL128909; UL1TR002389; K99GM145411.

Affiliation(s)

Rachel R Deer University of Texas Medical Branch, Galveston, TX, USA
Madeline A Rock University of Texas Medical Branch, Galveston, TX, USA
Nicole Vasilevsky Center for Health AI, University of Colorado Anschutz Medical Campus, Aurora, CO, USA; Monarch Initiative
Leigh Carmody Monarch Initiative; The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
Halie Rando Center for Health AI, University of Colorado Anschutz Medical Campus, Aurora, CO, USA; Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
Alfred J Anzalone Department of Neurological Sciences, College of Medicine, University of Nebraska Medical Center, Omaha, NE, USA
Marc D Basson Department of Surgery, University of North Dakota School of Medicine and Health Sciences
Tellen D Bennett Section of Informatics and Data Science, Department of Pediatrics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
Timothy Bergquist Sage Bionetworks, Seattle, WA
Eilis A Boudreau Department of Neurology; Department of Medical Informatics & Clinical Epidemiology, Oregon Health & Science University, Portland, OR 97239
Carolyn T Bramante Departments of Internal Medicine and Pediatrics, University of Minnesota Medical School, Minneapolis, MN 55455
James Brian Byrd Department of Internal Medicine, Division of Cardiovascular Medicine, University of Michigan Medical School, Ann Arbor, MI, 48109
Tiffany J Callahan Center for Health AI, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
Lauren E Chan Monarch Initiative; College of Public Health and Human Sciences, Oregon State University, Corvallis, OR, USA
Haitao Chu Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN USA
Christopher G Chute Johns Hopkins University, Schools of Medicine, Public Health, and Nursing, Baltimore, MD, USA
Ben D Coleman The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA; Institute for Systems Genomics, University of Connecticut, Farmington, CT 06032, USA
Hannah E Davis Patient-Led Research Collaborative
Joel Gagnier Departments of Orthopaedic Surgery & Epidemiology, University of Michigan, Ann Arbor, MI, USA
Casey S Greene Center for Health AI, University of Colorado Anschutz Medical Campus, Aurora, CO, USA; Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
William B Hillegass Departments of Data Science and Medicine, University of Mississippi Medical Center, Jackson, MS, USA
Ramakanth Kavuluru Institute for Biomedical Informatics, University of Kentucky
Wesley D Kimble West Virginia Clinical and Translational Science Institute, West Virginia University, Morgantown, WV, USA
Farrukh M Koraishy Division of Nephrology, Department of Medicine, Stony Brook University
Sebastian Köhler Monarch Initiative; Ada Health GmbH, Berlin, Germany
Chen Liang Arnold School of Public Health, University of South Carolina, Columbia, SC, USA
Feifan Liu Department of Population and Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, MA, USA
Hongfang Liu Department of Artificial Intelligence and Informatics, Mayo Clinic, MN, USA
Vithal Madhira Palila Software LLC, Reno, NV, USA
Charisse R Madlock-Brown Department of Diagnostic and Health Sciences, University of Tennessee Health Science Center, 920 Madison Ave. Suite 518N, Memphis TN 38613
Nicolas Matentzoglu Monarch Initiative; Semanticly Ltd; European Bioinformatics Institute (EMBL-EBI)
Diego R Mazzotti Division of Medical Informatics, Department of Internal Medicine, University of Kansas Medical Center
Julie A McMurry Center for Health AI, University of Colorado Anschutz Medical Campus, Aurora, CO, USA; Monarch Initiative
Douglas S McNair Quantitative Sciences, Global Health Div., Gates Foundation, Seattle, WA 98109, USA
Richard A Moffitt Stony Brook University, Stony Brook, NY 11794, USA
Teshamae S Monteith University of Miami, Miller School of Medicine, Miami, FL 33136
Ann M Parker Pulmonary and Critical Care Medicine, Johns Hopkins University, Schools of Medicine, Baltimore, MD, USA
Mallory A Perry Children's Hospital of Philadelphia Research Institute, Philadelphia, PA, USA
Emily Pfaff University of North Carolina, Chapel Hill
Justin T Reese Monarch Initiative; Lawrence Berkeley National Laboratory
Joel Saltz Stony Brook University; Biomedical Informatics
Robert A Schuff OCHIN, Inc Portland, OR, USA
Anthony E Solomonides Outcomes Research Network, Research Institute, NorthShore University HealthSystem, Evanston, IL 60201, USA; Institute for Translational Medicine, University of Chicago, Chicago, IL, USA
Julian Solway Institute for Translational Medicine, University of Chicago, Chicago, IL, USA
Heidi Spratt University of Texas Medical Branch, Galveston, TX, USA
Gary S Stein University of Vermont Larner College of Medicine, Departments of Biochemistry and Surgery, Burlington, Vermont 05405
Anupam A Sule St Joseph Mercy Oakland, Pontiac, MI, USA
Umit Topaloglu Wake Forest School of Medicine
George D Vavougios Department of Computer Science and Telecommunications, University of Thessaly, Papasiopoulou 2 - 4, P.C.; 131 - Galaneika, Lamia, Greece; Department of Neurology, Athens Naval Hospital 70 Deinokratous Street, P.C. 115 21 Athens, Greece; Department of Respiratory Medicine, Faculty of Medicine, University of Thessaly, Biopolis, P.C. 41500 Larissa, Greece
Liwei Wang Department of Artificial Intelligence and Informatics, Mayo Clinic, MN, USA
Melissa A Haendel Center for Health AI, University of Colorado Anschutz Medical Campus, Aurora, CO, USA; Monarch Initiative
Peter N Robinson Monarch Initiative; The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA; Institute for Systems Genomics, University of Connecticut, Farmington, CT 06032, USA

Weber GM, Zhang HG, L'Yi S, Bonzel CL, Hong C, Avillach P, Gutiérrez-Sacristán A, Palmer NP, Tan ALM, Wang X, Yuan W, Gehlenborg N, Alloni A, Amendola DF, Bellasi A, Bellazzi R, Beraghi M, Bucalo M, Chiovato L, Cho K, Dagliati A, Estiri H, Follett RW, García Barrio N, Hanauer DA, Henderson DW, Ho YL, Holmes JH, Hutch MR, Kavuluru R, Kirchoff K, Klann JG, Krishnamurthy AK, Le TT, Liu M, Loh NHW, Lozano-Zahonero S, Luo Y, Maidlow S, Makoudjou A, Malovini A, Martins MR, Moal B, Morris M, Mowery DL, Murphy SN, Neuraz A, Ngiam KY, Okoshi MP, Omenn GS, Patel LP, Pedrera Jiménez M, Prudente RA, Samayamuthu MJ, Sanz Vidorreta FJ, Schriver ER, Schubert P, Serrano Balazote P, Tan BW, Tanni SE, Tibollo V, Visweswaran S, Wagholikar KB, Xia Z, Zöller D, Kohane IS, Cai T, South AM, Brat GA. Authorship Correction: International Changes in COVID-19 Clinical Trajectories Across 315 Hospitals and 6 Countries: Retrospective Cohort Study. J Med Internet Res 2021;23:e34625. [PMID: 34889759 PMCID: PMC8672293 DOI: 10.2196/34625] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/02/2021] [Accepted: 11/10/2021] [Indexed: 11/15/2022]

Affiliation(s)

Griffin M Weber Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Harrison G Zhang Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Sehi L'Yi Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Clara-Lea Bonzel Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Chuan Hong Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Paul Avillach Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Alba Gutiérrez-Sacristán Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Nathan P Palmer Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Amelia Li Min Tan Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Xuan Wang Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
William Yuan Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Nils Gehlenborg Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Anna Alloni BIOMERIS (BIOMedical Research Informatics Solutions), Pavia, Italy
Danilo F Amendola Clinical Research Unit, Botucatu Medical School, São Paulo State University, Botucatu, Brazil
Antonio Bellasi Division of Nephrology, Department of Medicine, Ente Ospedaliero Cantonale, Lugano, Switzerland
Riccardo Bellazzi Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
Michele Beraghi Information Technology Department, Azienda Socio-Sanitaria Territoriale di Pavia, Pavia, Italy
Mauro Bucalo BIOMERIS (BIOMedical Research Informatics Solutions), Pavia, Italy
Luca Chiovato Unit of Internal Medicine and Endocrinology, Istituti Clinici Scientifici Maugeri SpA SB IRCCS, Pavia, Italy
Kelly Cho Massachusetts Veterans Epidemiology Research and Information Center, Veterans Affairs Boston Healthcare System, Boston, MA, United States
Arianna Dagliati Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
Hossein Estiri Department of Medicine, Massachusetts General Hospital, Boston, MA, United States
Robert W Follett Department of Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, United States
Noelia García Barrio Health Informatics, Hospital Universitario 12 de Octubre, Madrid, Spain
David A Hanauer Department of Learning Health Sciences, University of Michigan Medical School, Ann Arbor, MI, United States
Darren W Henderson Department of Biomedical Informatics, University of Kentucky, Lexington, KY, United States
Yuk-Lam Ho Massachusetts Veterans Epidemiology Research and Information Center, Veterans Affairs Boston Healthcare System, Boston, MA, United States
John H Holmes Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, United States; Institute for Biomedical Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, United States
Meghan R Hutch Department of Preventive Medicine, Northwestern University, Chicago, IL, United States
Ramakanth Kavuluru Institute for Biomedical Informatics, University of Kentucky, Lexington, KY, United States
Katie Kirchoff Medical University of South Carolina, Charleston, SC, United States
Jeffrey G Klann Department of Medicine, Massachusetts General Hospital, Boston, MA, United States
Ashok K Krishnamurthy Department of Computer Science, Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
Trang T Le Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, United States
Molei Liu Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, United States
Ne Hooi Will Loh Department of Anaesthesia, National University Health System, Singapore, Singapore
Sara Lozano-Zahonero Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany
Yuan Luo Department of Preventive Medicine, Northwestern University, Chicago, IL, United States
Sarah Maidlow Michigan Institute for Clinical & Health Research Informatics, University of Michigan, Ann Arbor, MI, United States
Adeline Makoudjou Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany
Alberto Malovini Laboratory of Informatics and Systems Engineering for Clinical Research, Istituti Clinici Scientifici Maugeri SpA SB IRCCS, Pavia, Italy
Marcelo Roberto Martins Clinical Hospital of Botucatu Medical School, São Paulo State University, Botucatu, Brazil
Bertrand Moal Informatique et archivistique médicales unit, Bordeaux University Hospital, Bordeaux, France
Michele Morris Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, United States
Danielle L Mowery Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, United States
Shawn N Murphy Department of Neurology, Massachusetts General Hospital, Boston, MA, United States
Antoine Neuraz Department of Biomedical Informatics, Hôpital Necker-Enfants Malades, Assistance Publique Hôpitaux de Paris, University of Paris, Paris, France
Kee Yuan Ngiam Department of Biomedical Informatics, Institute for Digital Medicine, National University Health System, Singapore, Singapore
Marina P Okoshi Internal Medicine Department, Botucatu Medical School, São Paulo State University, Botucatu, Brazil
Gilbert S Omenn Department of Computational Medicine & Bioinformatics, Internal Medicine, Human Genetics, and Public Health, University of Michigan, Ann Arbor, MI, United States
Lav P Patel Division of Medical Informatics, Department of Internal Medicine, University of Kansas Medical Center, Kansas City, KS, United States
Miguel Pedrera Jiménez Health Informatics, Hospital Universitario 12 de Octubre, Madrid, Spain
Robson A Prudente Internal Medicine Department, Botucatu Medical School, São Paulo State University, Botucatu, Brazil
Malarkodi Jebathilagam Samayamuthu Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, United States
Fernando J Sanz Vidorreta Department of Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, United States
Emily R Schriver Data Analytics Center, University of Pennsylvania Health System, Philadelphia, PA, United States
Petra Schubert Massachusetts Veterans Epidemiology Research and Information Center, Veterans Affairs Boston Healthcare System, Boston, MA, United States
Pablo Serrano Balazote Health Informatics, Hospital Universitario 12 de Octubre, Madrid, Spain
Byorn W L Tan Department of Medicine, National University Health System, Singapore, Singapore
Suzana E Tanni Internal Medicine Department, Botucatu Medical School, São Paulo State University, Botucatu, Brazil
Valentina Tibollo Laboratory of Informatics and Systems Engineering for Clinical Research, Istituti Clinici Scientifici Maugeri SpA SB IRCCS, Pavia, Italy
Shyam Visweswaran Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, United States
Kavishwar B Wagholikar Department of Medicine, Massachusetts General Hospital, Boston, MA, United States
Zongqi Xia Department of Neurology, University of Pittsburgh, Pittsburgh, PA, United States
Daniela Zöller Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany
Isaac S Kohane Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Tianxi Cai Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Andrew M South Section of Nephrology, Department of Pediatrics, Brenner Children's Hospital, Wake Forest School of Medicine, Winston Salem, NC, United States
Gabriel A Brat Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States

Weidner K, Lowman J, Fleischer A, Kosik K, Goodbread P, Chen B, Kavuluru R. Twitter, Telepractice, and the COVID-19 Pandemic: A Social Media Content Analysis. Am J Speech Lang Pathol 2021;30:2561-2571. [PMID: 34499843 PMCID: PMC9132031 DOI: 10.1044/2021_ajslp-21-00034] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/26/2021] [Revised: 04/04/2021] [Accepted: 06/21/2021] [Indexed: 05/31/2023]

Weber GM, Zhang HG, L'Yi S, Bonzel CL, Hong C, Avillach P, Gutiérrez-Sacristán A, Palmer NP, Tan ALM, Wang X, Yuan W, Gehlenborg N, Alloni A, Amendola DF, Bellasi A, Bellazzi R, Beraghi M, Bucalo M, Chiovato L, Cho K, Dagliati A, Estiri H, Follett RW, García Barrio N, Hanauer DA, Henderson DW, Ho YL, Holmes JH, Hutch MR, Kavuluru R, Kirchoff K, Klann JG, Krishnamurthy AK, Le TT, Liu M, Loh NHW, Lozano-Zahonero S, Luo Y, Maidlow S, Makoudjou A, Malovini A, Martins MR, Moal B, Morris M, Mowery DL, Murphy SN, Neuraz A, Ngiam KY, Okoshi MP, Omenn GS, Patel LP, Pedrera Jiménez M, Prudente RA, Samayamuthu MJ, Sanz Vidorreta FJ, Schriver ER, Schubert P, Serrano Balazote P, Tan BW, Tanni SE, Tibollo V, Visweswaran S, Wagholikar KB, Xia Z, Zöller D, Kohane IS, Cai T, South AM, Brat GA. International Changes in COVID-19 Clinical Trajectories Across 315 Hospitals and 6 Countries: Retrospective Cohort Study. J Med Internet Res 2021;23:e31400. [PMID: 34533459 PMCID: PMC8510151 DOI: 10.2196/31400] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 06/23/2021] [Revised: 09/02/2021] [Accepted: 09/02/2021] [Indexed: 02/06/2023]

Abstract

Background

Many countries have experienced 2 predominant waves of COVID-19–related hospitalizations. Comparing the clinical trajectories of patients hospitalized in separate waves of the pandemic enables further understanding of the evolving epidemiology, pathophysiology, and health care dynamics of the COVID-19 pandemic.

Objective

In this retrospective cohort study, we analyzed electronic health record (EHR) data from patients with SARS-CoV-2 infections hospitalized in participating health care systems representing 315 hospitals across 6 countries. We compared hospitalization rates, severe COVID-19 risk, and mean laboratory values between patients hospitalized during the first and second waves of the pandemic.

Methods

Using a federated approach, each participating health care system extracted patient-level clinical data on their first and second wave cohorts and submitted aggregated data to the central site. Data quality control steps were adopted at the central site to correct for implausible values and harmonize units. Statistical analyses were performed by computing individual health care system effect sizes and synthesizing these using random effect meta-analyses to account for heterogeneity. We focused the laboratory analysis on C-reactive protein (CRP), ferritin, fibrinogen, procalcitonin, D-dimer, and creatinine based on their reported associations with severe COVID-19.

Results

Data were available for 79,613 patients, of which 32,467 were hospitalized in the first wave and 47,146 in the second wave. The prevalence of male patients and patients aged 50 to 69 years decreased significantly between the first and second waves. Patients hospitalized in the second wave had a 9.9% reduction in the risk of severe COVID-19 compared to patients hospitalized in the first wave (95% CI 8.5%-11.3%). Demographic subgroup analyses indicated that patients aged 26 to 49 years and 50 to 69 years; male and female patients; and black patients had significantly lower risk for severe disease in the second wave than in the first wave. At admission, the mean values of CRP were significantly lower in the second wave than in the first wave. On the seventh hospital day, the mean values of CRP, ferritin, fibrinogen, and procalcitonin were significantly lower in the second wave than in the first wave. In general, countries exhibited variable changes in laboratory testing rates from the first to the second wave. At admission, there was a significantly higher testing rate for D-dimer in France, Germany, and Spain.

Conclusions

Patients hospitalized in the second wave were at significantly lower risk for severe COVID-19. This corresponded to mean laboratory values in the second wave that were more likely to be in typical physiological ranges on the seventh hospital day compared to the first wave. Our federated approach demonstrated the feasibility and power of harmonizing heterogeneous EHR data from multiple international health care systems to rapidly conduct large-scale studies to characterize how COVID-19 clinical trajectories evolve.
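
Note on the pooling step described under Methods: the abstract says per-system effect sizes were synthesized with random effect meta-analyses but does not name the estimator. The sketch below assumes the common DerSimonian-Laird estimator purely for illustration; the function name `random_effects_pool` and the input numbers are hypothetical and do not come from the study.

```python
import math

def random_effects_pool(effects, variances):
    """Pool per-site effect sizes with a DerSimonian-Laird random-effects model.

    effects   -- one effect estimate per site (e.g. a risk difference)
    variances -- the within-site variance of each estimate
    Returns (pooled estimate, its standard error, between-site variance tau^2).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]              # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q measures how much sites disagree beyond sampling error.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Re-weight each site by its total (within + between) variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2

# Hypothetical per-site estimates and variances, for illustration only.
pooled, se, tau2 = random_effects_pool([0.08, 0.12, 0.10], [0.0004, 0.0009, 0.0006])
print(f"pooled = {pooled:.3f}, 95% CI {pooled - 1.96 * se:.3f} to {pooled + 1.96 * se:.3f}")
```

In a federated design of this kind only the per-site estimates and variances need to reach the central site, which is consistent with the abstract's description of sites submitting aggregated rather than patient-level data.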

Affiliation(s)

Griffin M Weber Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Harrison G Zhang Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Sehi L'Yi Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Clara-Lea Bonzel Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Chuan Hong Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Paul Avillach Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Alba Gutiérrez-Sacristán Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Nathan P Palmer Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Amelia Li Min Tan Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Xuan Wang Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
William Yuan Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Nils Gehlenborg Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Anna Alloni BIOMERIS (BIOMedical Research Informatics Solutions), Pavia, Italy
Danilo F Amendola Clinical Research Unit, Botucatu Medical School, São Paulo State University, Botucatu, Brazil
Antonio Bellasi Division of Nephrology, Department of Medicine, Ente Ospedaliero Cantonale, Lugano, Switzerland
Riccardo Bellazzi Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
Michele Beraghi Information Technology Department, Azienda Socio-Sanitaria Territoriale di Pavia, Pavia, Italy
Mauro Bucalo BIOMERIS (BIOMedical Research Informatics Solutions), Pavia, Italy
Luca Chiovato Unit of Internal Medicine and Endocrinology, Istituti Clinici Scientifici Maugeri SpA SB IRCCS, Pavia, Italy
Kelly Cho Massachusetts Veterans Epidemiology Research and Information Center, Veterans Affairs Boston Healthcare System, Boston, MA, United States
Arianna Dagliati Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
Hossein Estiri Department of Medicine, Massachusetts General Hospital, Boston, MA, United States
Robert W Follett Department of Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, United States
Noelia García Barrio Health Informatics, Hospital Universitario 12 de Octubre, Madrid, Spain
David A Hanauer Department of Learning Health Sciences, University of Michigan Medical School, Ann Arbor, MI, United States
Darren W Henderson Department of Biomedical Informatics, University of Kentucky, Lexington, KY, United States
Yuk-Lam Ho Massachusetts Veterans Epidemiology Research and Information Center, Veterans Affairs Boston Healthcare System, Boston, MA, United States
John H Holmes Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, United States; Institute for Biomedical Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, United States
Meghan R Hutch Department of Preventive Medicine, Northwestern University, Chicago, IL, United States
Ramakanth Kavuluru Institute for Biomedical Informatics, University of Kentucky, Lexington, KY, United States
Katie Kirchoff Medical University of South Carolina, Charleston, SC, United States
Jeffrey G Klann Department of Medicine, Massachusetts General Hospital, Boston, MA, United States
Ashok K Krishnamurthy Department of Computer Science, Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
Trang T Le Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, United States
Molei Liu Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, United States
Ne Hooi Will Loh Department of Anaesthesia, National University Health System, Singapore, Singapore
Sara Lozano-Zahonero Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany
Yuan Luo Department of Preventive Medicine, Northwestern University, Chicago, IL, United States
Sarah Maidlow Michigan Institute for Clinical & Health Research Informatics, University of Michigan, Ann Arbor, MI, United States
Adeline Makoudjou Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany
Alberto Malovini Laboratory of Informatics and Systems Engineering for Clinical Research, Istituti Clinici Scientifici Maugeri SpA SB IRCCS, Pavia, Italy
Marcelo Roberto Martins Clinical Hospital of Botucatu Medical School, São Paulo State University, Botucatu, Brazil
Bertrand Moal Informatique et archivistique médicales unit, Bordeaux University Hospital, Bordeaux, France
Michele Morris Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, United States
Danielle L Mowery Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, United States
Shawn N Murphy Department of Neurology, Massachusetts General Hospital, Boston, MA, United States
Antoine Neuraz Department of Biomedical Informatics, Hôpital Necker-Enfants Malades, Assistance Publique Hôpitaux de Paris, University of Paris, Paris, France
Kee Yuan Ngiam Department of Biomedical Informatics, Institute for Digital Medicine, National University Health System, Singapore, Singapore
Marina P Okoshi Internal Medicine Department, Botucatu Medical School, São Paulo State University, Botucatu, Brazil
Gilbert S Omenn Department of Computational Medicine & Bioinformatics, Internal Medicine, Human Genetics, and Public Health, University of Michigan, Ann Arbor, MI, United States
Lav P Patel Division of Medical Informatics, Department of Internal Medicine, University of Kansas Medical Center, Kansas City, KS, United States
Miguel Pedrera Jiménez Health Informatics, Hospital Universitario 12 de Octubre, Madrid, Spain
Robson A Prudente Internal Medicine Department, Botucatu Medical School, São Paulo State University, Botucatu, Brazil
Malarkodi Jebathilagam Samayamuthu Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, United States
Fernando J Sanz Vidorreta Department of Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, United States
Emily R Schriver Data Analytics Center, University of Pennsylvania Health System, Philadelphia, PA, United States
Petra Schubert Massachusetts Veterans Epidemiology Research and Information Center, Veterans Affairs Boston Healthcare System, Boston, MA, United States
Pablo Serrano Balazote Health Informatics, Hospital Universitario 12 de Octubre, Madrid, Spain
Byorn Wl Tan Department of Medicine, National University Health System, Singapore, Singapore
Suzana E Tanni Internal Medicine Department, Botucatu Medical School, São Paulo State University, Botucatu, Brazil
Valentina Tibollo Laboratory of Informatics and Systems Engineering for Clinical Research, Istituti Clinici Scientifici Maugeri SpA SB IRCCS, Pavia, Italy
Shyam Visweswaran Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, United States
Kavishwar B Wagholikar Department of Medicine, Massachusetts General Hospital, Boston, MA, United States
Zongqi Xia Department of Neurology, University of Pittsburgh, Pittsburgh, PA, United States
Daniela Zöller Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany
Isaac S Kohane Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Tianxi Cai Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
Andrew M South Section of Nephrology, Department of Pediatrics, Brenner Children's Hospital, Wake Forest School of Medicine, Winston Salem, NC, United States
Gabriel A Brat Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States


Kavuluru R, Noh J, Rose SW. Twitter discourse on nicotine as potential prophylactic or therapeutic for COVID-19. Int J Drug Policy 2021;99:103470. [PMID: 34607223 PMCID: PMC8450069 DOI: 10.1016/j.drugpo.2021.103470] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/03/2021] [Revised: 09/04/2021] [Accepted: 09/10/2021] [Indexed: 12/20/2022]

Abstract

Background

An unproven “nicotine hypothesis” that indicates nicotine's therapeutic potential for COVID-19 has been proposed in recent literature. This study is about Twitter posts that misinterpret this hypothesis to make baseless claims about benefits of smoking and vaping in the context of COVID-19. We quantify the presence of such misinformation and characterize the tweeters who post such messages.

Methods

Twitter premium API was used to download tweets (n = 17,533) that match terms indicating (a) nicotine or vaping themes, (b) a prophylactic or therapeutic effect, and (c) COVID-19 (January-July 2020) as a conjunctive query. A constraint on the length of the span of text containing the terms in the tweets allowed us to focus on those that convey the therapeutic intent. We hand-annotated these filtered tweets and built a classifier that identifies tweets that extrapolate the nicotine hypothesis to smoking/vaping with a positive predictive value of 85%. We analyzed the frequently used terms in author bios, top Web links, and hashtags of such tweets.

Results

21% of our filtered COVID-19 tweets indicate a vaping or smoking-based prevention/treatment narrative. Qualitative analyses show a variety of ways therapeutic claims are being made and tweeter bios reveal pre-existing notions of positive stances toward vaping.

Conclusion

The social media landscape is a double-edged sword in tobacco communication. Although it increases information reach, consumers can also be subject to confirmation bias when exposed to inadvertent or deliberate framing of scientific discourse that may border on misinformation. This calls for circumspection and additional planning in countering such narratives as the COVID-19 pandemic continues to ravage our world. Our results also serve as a cautionary tale in how social media can be leveraged to spread misleading information about tobacco products in the wake of pandemics.
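The filtering step described in the Methods above (a conjunctive query over three term groups, plus a limit on the length of the text span containing the matched terms) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the term lists, the `max_span` value of 12 tokens, and all function names are assumptions made for the example, since the abstract does not give the actual query terms or threshold.

```python
import re

# Hypothetical term lists; the study's actual query terms are not given in the abstract.
NICOTINE_TERMS = {"nicotine", "vaping", "vape", "smoking"}
EFFECT_TERMS = {"prevent", "prevents", "protect", "protects", "cure", "cures", "treat", "treats"}
COVID_TERMS = {"covid", "covid19", "covid-19", "coronavirus"}


def tokenize(text):
    """Lowercase the text and split it into word-like tokens (hyphens are kept)."""
    return re.findall(r"[a-z0-9\-]+", text.lower())


def shortest_span(tokens, groups):
    """Return the length (in tokens) of the shortest window that contains at
    least one term from every group, or None if any group is missing."""
    best = None
    last_seen = [None] * len(groups)
    for i, tok in enumerate(tokens):
        for g, terms in enumerate(groups):
            if tok in terms:
                last_seen[g] = i
        if all(pos is not None for pos in last_seen):
            span = i - min(last_seen) + 1
            if best is None or span < best:
                best = span
    return best


def matches_query(text, max_span=12):
    """Conjunctive match: all three themes must occur within max_span tokens.
    The default of 12 is an assumed value for illustration only."""
    span = shortest_span(tokenize(text), [NICOTINE_TERMS, EFFECT_TERMS, COVID_TERMS])
    return span is not None and span <= max_span


def positive_predictive_value(true_positives, false_positives):
    """PPV = TP / (TP + FP), the metric reported for the classifier."""
    return true_positives / (true_positives + false_positives)
```

For instance, `matches_query("Nicotine protects against covid-19, study says")` holds because all three themes fall within a four-token window, while a tweet that mentions smoking and COVID-19 far apart would be filtered out. A classifier with 85 correct and 15 incorrect positive predictions would have the reported PPV of 0.85.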


Kavuluru R, Noh J, Rose SW. Twitter Discourse on Nicotine as Potential Prophylactic or Therapeutic for COVID-19. medRxiv 2021. [PMID: 33442710 PMCID: PMC7805473 DOI: 10.1101/2021.01.05.21249284] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/24/2022]

Abstract

Background:

An unproven “nicotine hypothesis” that indicates nicotine’s therapeutic potential for COVID-19 has been proposed in recent literature. This study is about Twitter posts that misinterpret this hypothesis to make baseless claims about benefits of smoking and vaping in the context of COVID-19. We quantify the presence of such misinformation and characterize the tweeters who post such messages.

Methods:

Results:

Conclusion:


Noh J, Kavuluru R. Joint Learning for Biomedical NER and Entity Normalization: Encoding Schemes, Counterfactual Examples, and Zero-Shot Evaluation. ACM BCB 2021;2021. [PMID: 34505115 DOI: 10.1145/3459930.3469533] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]

Liang G, Greenwell C, Zhang Y, Xing X, Wang X, Kavuluru R, Jacobs N. Contrastive Cross-Modal Pre-Training: A General Strategy for Small Sample Medical Imaging. IEEE J Biomed Health Inform 2021;26:1640-1649. [PMID: 34495856 PMCID: PMC9242687 DOI: 10.1109/jbhi.2021.3110805] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/10/2022]

Rios A, Durbin EB, Hands I, Kavuluru R. Assigning ICD-O-3 Codes to Pathology Reports using Neural Multi-Task Training with Hierarchical Regularization. ACM BCB 2021;2021:32. [PMID: 34541582 PMCID: PMC8445227 DOI: 10.1145/3459930.3469541] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 10/20/2022]

Noh J, Kavuluru R. Improved biomedical word embeddings in the transformer era. J Biomed Inform 2021;120:103867. [PMID: 34284119 PMCID: PMC8373296 DOI: 10.1016/j.jbi.2021.103867] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/22/2020] [Revised: 05/10/2021] [Accepted: 07/11/2021] [Indexed: 10/20/2022]

Abstract

BACKGROUND

Recent natural language processing (NLP) research is dominated by neural network methods that employ word embeddings as basic building blocks. Pre-training with neural methods that capture local and global distributional properties (e.g., skip-gram, GloVe) using free text corpora is often used to embed both words and concepts. Pre-trained embeddings are typically leveraged in downstream tasks using various neural architectures that are designed to optimize task-specific objectives that might further tune such embeddings.

OBJECTIVE

Despite advances in contextualized language model based embeddings, static word embeddings still form an essential starting point in BioNLP research and applications. They are useful in low resource settings and in lexical semantics studies. Our main goal is to build improved biomedical word embeddings and make them publicly available for downstream applications.

METHODS

We jointly learn word and concept embeddings by first using the skip-gram method and further fine-tuning them with correlational information manifesting in co-occurring Medical Subject Heading (MeSH) concepts in biomedical citations. This fine-tuning is accomplished with the transformer-based BERT architecture in the two-sentence input mode with a classification objective that captures MeSH pair co-occurrence. We conduct evaluations of these tuned static embeddings using multiple datasets for word relatedness developed by previous efforts.

RESULTS

Both in qualitative and quantitative evaluations we demonstrate that our methods produce improved biomedical embeddings in comparison with other static embedding efforts. Without selectively culling concepts and terms (as was pursued by previous efforts), we believe we offer the most exhaustive evaluation of biomedical embeddings to date with clear performance improvements across the board.

CONCLUSION

We repurposed a transformer architecture (typically used to generate dynamic embeddings) to improve static biomedical word embeddings using concept correlations. We provide our code and embeddings for public use for downstream applications and research endeavors: https://github.com/bionlproc/BERT-CRel-Embeddings.
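The Methods above describe a classification objective over MeSH pair co-occurrence. One way such training examples might be assembled is sketched below. This is only an illustration under stated assumptions: the function name, the input format (one set of MeSH headings per citation), and the negative-sampling scheme (pairs that never co-occur, sampled in equal number to the positives) are hypothetical and are not taken from the paper or its repository. Enumerating all vocabulary pairs is only practical for a toy vocabulary.

```python
import itertools
import random


def build_pair_examples(citations, seed=0):
    """Build labeled MeSH heading pairs from a list of per-citation heading sets.

    Label 1: the two headings co-occur in at least one citation.
    Label 0: a sampled pair that never co-occurs (assumed negative-sampling scheme).
    """
    rng = random.Random(seed)

    # Positive pairs: every unordered pair of headings within a citation.
    positives = set()
    for headings in citations:
        for a, b in itertools.combinations(sorted(headings), 2):
            positives.add((a, b))

    # Candidate negatives: all vocabulary pairs that were never observed together.
    vocabulary = sorted(set().union(*citations))
    all_pairs = set(itertools.combinations(vocabulary, 2))
    candidates = sorted(all_pairs - positives)
    negatives = rng.sample(candidates, min(len(positives), len(candidates)))

    return [(pair, 1) for pair in sorted(positives)] + [(pair, 0) for pair in negatives]
```

Each resulting pair could then be fed to a two-segment classifier as (first heading, second heading) with its binary label. How the authors actually encode the pairs and choose negatives should be checked against the linked repository.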


Bennett TD, Moffitt RA, Hajagos JG, Amor B, Anand A, Bissell MM, Bradwell KR, Bremer C, Byrd JB, Denham A, DeWitt PE, Gabriel D, Garibaldi BT, Girvin AT, Guinney J, Hill EL, Hong SS, Jimenez H, Kavuluru R, Kostka K, Lehmann HP, Levitt E, Mallipattu SK, Manna A, McMurry JA, Morris M, Muschelli J, Neumann AJ, Palchuk MB, Pfaff ER, Qian Z, Qureshi N, Russell S, Spratt H, Walden A, Williams AE, Wooldridge JT, Yoo YJ, Zhang XT, Zhu RL, Austin CP, Saltz JH, Gersing KR, Haendel MA, Chute CG. Clinical Characterization and Prediction of Clinical Severity of SARS-CoV-2 Infection Among US Adults Using Data From the US National COVID Cohort Collaborative. JAMA Netw Open 2021;4:e2116901. [PMID: 34255046 PMCID: PMC8278272 DOI: 10.1001/jamanetworkopen.2021.16901] [Citation(s) in RCA: 153] [Impact Index Per Article: 51.0] [Received: 01/26/2021] [Accepted: 05/03/2021] [Indexed: 12/15/2022] Open

Abstract

Importance

The National COVID Cohort Collaborative (N3C) is a centralized, harmonized, high-granularity electronic health record repository that is the largest, most representative COVID-19 cohort to date. This multicenter data set can support robust evidence-based development of predictive and diagnostic tools and inform clinical care and policy.

Objectives

To evaluate COVID-19 severity and risk factors over time and assess the use of machine learning to predict clinical severity.

Design, Setting, and Participants

In a retrospective cohort study of 1 926 526 US adults with SARS-CoV-2 infection (polymerase chain reaction >99% or antigen <1%) and adult patients without SARS-CoV-2 infection who served as controls from 34 medical centers nationwide between January 1, 2020, and December 7, 2020, patients were stratified using a World Health Organization COVID-19 severity scale and demographic characteristics. Differences between groups over time were evaluated using multivariable logistic regression. Random forest and XGBoost models were used to predict severe clinical course (death, discharge to hospice, invasive ventilatory support, or extracorporeal membrane oxygenation).

Main Outcomes and Measures

Patient demographic characteristics and COVID-19 severity using the World Health Organization COVID-19 severity scale and differences between groups over time using multivariable logistic regression.

Results

The cohort included 174 568 adults who tested positive for SARS-CoV-2 (mean [SD] age, 44.4 [18.6] years; 53.2% female) and 1 133 848 adult controls who tested negative for SARS-CoV-2 (mean [SD] age, 49.5 [19.2] years; 57.1% female). Of the 174 568 adults with SARS-CoV-2, 32 472 (18.6%) were hospitalized, and 6565 (20.2%) of those had a severe clinical course (invasive ventilatory support, extracorporeal membrane oxygenation, death, or discharge to hospice). Of the hospitalized patients, mortality was 11.6% overall and decreased from 16.4% in March to April 2020 to 8.6% in September to October 2020 (P = .002 for monthly trend). Using 64 inputs available on the first hospital day, this study predicted a severe clinical course using random forest and XGBoost models (area under the receiver operating curve = 0.87 for both) that were stable over time. The factor most strongly associated with clinical severity was pH; this result was consistent across machine learning methods. In a separate multivariable logistic regression model built for inference, age (odds ratio [OR], 1.03 per year; 95% CI, 1.03-1.04), male sex (OR, 1.60; 95% CI, 1.51-1.69), liver disease (OR, 1.20; 95% CI, 1.08-1.34), dementia (OR, 1.26; 95% CI, 1.13-1.41), African American (OR, 1.12; 95% CI, 1.05-1.20) and Asian (OR, 1.33; 95% CI, 1.12-1.57) race, and obesity (OR, 1.36; 95% CI, 1.27-1.46) were independently associated with higher clinical severity.

Conclusions and Relevance

This cohort study found that COVID-19 mortality decreased over time during 2020 and that patient demographic characteristics and comorbidities were associated with higher clinical severity. The machine learning models accurately predicted ultimate clinical severity using commonly collected clinical data from the first 24 hours of a hospital admission.
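The percentages in the Results above follow directly from the reported counts, and the per-year odds ratio for age can be scaled to a longer interval. A short check is sketched below. The counts and the odds ratio are taken from the abstract; the helper `scaled_odds_ratio` is a hypothetical name, and the scaling assumes log-odds that are linear in age, which is the standard logistic regression form. Because the reported odds ratio of 1.03 is itself rounded, the 10-year figure is approximate.

```python
# Counts as reported in the abstract.
positive_adults = 174_568   # adults who tested positive for SARS-CoV-2
hospitalized = 32_472       # of those, hospitalized
severe_course = 6_565       # of the hospitalized, severe clinical course

hospitalized_share = hospitalized / positive_adults
severe_share = severe_course / hospitalized


def scaled_odds_ratio(or_per_unit, units):
    """Odds ratio for a multi-unit change, assuming log-odds linear in the covariate."""
    return or_per_unit ** units


print(f"Hospitalized among positives: {hospitalized_share:.1%}")      # 18.6%
print(f"Severe course among hospitalized: {severe_share:.1%}")         # 20.2%
print(f"Approx. OR per 10 years of age: {scaled_odds_ratio(1.03, 10):.2f}")  # 1.34
```

Under these assumptions, a 10-year age difference corresponds to roughly 34% higher odds of a severe clinical course.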


Affiliation(s)

Tellen D. Bennett Section of Informatics and Data Science, Department of Pediatrics, University of Colorado School of Medicine, University of Colorado, Aurora
Richard A. Moffitt Department of Biomedical Informatics, Stony Brook University, Stony Brook, New York
Janos G. Hajagos Stony Brook University, Stony Brook, New York
Benjamin Amor Palantir Technologies, Denver, Colorado
Adit Anand Stony Brook University, Stony Brook, New York
Mark M. Bissell Palantir Technologies, Denver, Colorado
Katie Rebecca Bradwell Palantir Technologies, Denver, Colorado
Carolyn Bremer Stony Brook University, Stony Brook, New York
James Brian Byrd Department of Internal Medicine, The University of Michigan at Ann Arbor, Ann Arbor
Alina Denham Department of Public Health Sciences, University of Rochester Medical Center, Rochester, New York
Peter E. DeWitt Section of Informatics and Data Science, Department of Pediatrics, University of Colorado School of Medicine, University of Colorado, Aurora
Davera Gabriel Institute for Clinical and Translational Research, Johns Hopkins University School of Medicine, Baltimore, Maryland
Brian T. Garibaldi Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland
Andrew T. Girvin Palantir Technologies, Denver, Colorado
Justin Guinney Sage Bionetworks, Seattle, Washington
Elaine L. Hill Department of Public Health Sciences, University of Rochester Medical Center, Rochester, New York
Stephanie S. Hong Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland
Hunter Jimenez Stony Brook University, Stony Brook, New York
Ramakanth Kavuluru Division of Biomedical Informatics, Department of Internal Medicine, University of Kentucky, Lexington
Kristin Kostka Real World Solutions, IQVIA, Cambridge, Massachusetts; Observational Health Data Sciences and Informatics, New York, New York
Harold P. Lehmann Division of Health Science Informatics, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland
Eli Levitt Department of Orthopaedic Surgery, University of Alabama at Birmingham, Birmingham
Sandeep K. Mallipattu Stony Brook University, Stony Brook, New York
Amin Manna Palantir Technologies, Denver, Colorado
Julie A. McMurry Translational and Integrative Sciences Center, Oregon State University, Corvallis
Michele Morris Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania
John Muschelli Department of Biostatistics, Johns Hopkins University School of Medicine, Baltimore, Maryland
Andrew J. Neumann Translational and Integrative Sciences Center, Oregon State University, Corvallis
Matvey B. Palchuk TriNetX, Cambridge, Massachusetts
Emily R. Pfaff North Carolina Translational and Clinical Sciences Institute, University of North Carolina at Chapel Hill, Chapel Hill
Zhenglong Qian Department of Biomedical Informatics, Stony Brook University, Stony Brook, New York
Nabeel Qureshi Palantir Technologies, Denver, Colorado
Seth Russell Section of Informatics and Data Science, Department of Pediatrics, University of Colorado School of Medicine, University of Colorado, Aurora
Heidi Spratt Department of Preventive Medicine and Public Health, University of Texas Medical Branch, Galveston
Anita Walden Sage Bionetworks, Seattle, Washington; Oregon Clinical and Translational Research Institute, Oregon Health & Science University, Portland
Andrew E. Williams Tufts Medical Center Clinical and Translational Science Institute, Tufts Medical Center, Boston, Massachusetts
Jacob T. Wooldridge Stony Brook University, Stony Brook, New York
Yun Jae Yoo Stony Brook University, Stony Brook, New York
Xiaohan Tanner Zhang Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland
Richard L. Zhu Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland
Christopher P. Austin National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, Maryland
Joel H. Saltz Department of Biomedical Informatics, Stony Brook University, Stony Brook, New York
Ken R. Gersing National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, Maryland
Melissa A. Haendel TriNetX, Cambridge, Massachusetts; Center for Health AI, University of Colorado, Aurora
Christopher G. Chute Department of Health Policy and Management, Johns Hopkins University School of Medicine, Baltimore, Maryland; Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland; Department of Nursing, Johns Hopkins University School of Medicine, Baltimore, Maryland


Tran T, Kavuluru R, Kilicoglu H. Attention-Gated Graph Convolutions for Extracting Drug Interaction Information from Drug Labels. ACM Trans Comput Healthc 2021;2:10. [PMID: 34541578 PMCID: PMC8445229 DOI: 10.1145/3423209] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/01/2019] [Accepted: 09/01/2020] [Indexed: 01/02/2023]

Tran T, Ickes MJ, Hester JW, Kavuluru R. Identifying current Juul users among emerging adults through Twitter feeds. Int J Med Inform 2021;146:104350. [PMID: 33341556 PMCID: PMC7855996 DOI: 10.1016/j.ijmedinf.2020.104350] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/01/2020] [Revised: 11/12/2020] [Accepted: 11/20/2020] [Indexed: 11/27/2022]

Bennett TD, Moffitt RA, Hajagos JG, Amor B, Anand A, Bissell MM, Bradwell KR, Bremer C, Byrd JB, Denham A, DeWitt PE, Gabriel D, Garibaldi BT, Girvin AT, Guinney J, Hill EL, Hong SS, Jimenez H, Kavuluru R, Kostka K, Lehmann HP, Levitt E, Mallipattu SK, Manna A, McMurry JA, Morris M, Muschelli J, Neumann AJ, Palchuk MB, Pfaff ER, Qian Z, Qureshi N, Russell S, Spratt H, Walden A, Williams AE, Wooldridge JT, Yoo YJ, Zhang XT, Zhu RL, Austin CP, Saltz JH, Gersing KR, Haendel MA, Chute CG. The National COVID Cohort Collaborative: Clinical Characterization and Early Severity Prediction. medRxiv 2021. [PMID: 33469592 PMCID: PMC7814838 DOI: 10.1101/2021.01.12.21249511] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Indexed: 12/15/2022]

Abstract

Background:

The majority of U.S. reports of COVID-19 clinical characteristics, disease course, and treatments are from single health systems or focused on one domain. Here we report the creation of the National COVID Cohort Collaborative (N3C), a centralized, harmonized, high-granularity electronic health record repository that is the largest, most representative U.S. cohort of COVID-19 cases and controls to date. This multi-center dataset supports robust evidence-based development of predictive and diagnostic tools and informs critical care and policy.

Methods and Findings:

In a retrospective cohort study of 1,926,526 patients from 34 medical centers nationwide, we stratified patients using a World Health Organization COVID-19 severity scale and demographics; we then evaluated differences between groups over time using multivariable logistic regression. We established vital signs and laboratory values among COVID-19 patients with different severities, providing the foundation for predictive analytics. The cohort included 174,568 adults with severe acute respiratory syndrome associated with SARS-CoV-2 (PCR >99% or antigen <1%) as well as 1,133,848 adult patients that served as lab-negative controls. Among 32,472 hospitalized patients, mortality was 11.6% overall and decreased from 16.4% in March/April 2020 to 8.6% in September/October 2020 (p = 0.002 monthly trend). In a multivariable logistic regression model, age, male sex, liver disease, dementia, African-American and Asian race, and obesity were independently associated with higher clinical severity. To demonstrate the utility of the N3C cohort for analytics, we used machine learning (ML) to predict clinical severity and risk factors over time. Using 64 inputs available on the first hospital day, we predicted a severe clinical course (death, discharge to hospice, invasive ventilation, or extracorporeal membrane oxygenation) using random forest and XGBoost models (AUROC 0.86 and 0.87 respectively) that were stable over time. The most powerful predictors in these models are patient age and widely available vital sign and laboratory values. The established expected trajectories for many vital signs and laboratory values among patients with different clinical severities validates observations from smaller studies, and provides comprehensive insight into COVID-19 characterization in U.S. patients.

Conclusions:

This is the first description of an ongoing longitudinal observational study of patients seen in diverse clinical settings and geographical regions and is the largest COVID-19 cohort in the United States. Such data are the foundation for ML models that can be the basis for generalizable clinical decision support tools. The N3C Data Enclave is unique in providing transparent, reproducible, easily shared, versioned, and fully auditable data and analytic provenance for national-scale patient-level EHR data. The N3C is built for intensive ML analyses by academic, industry, and citizen scientists internationally. Many observational correlations can inform trial designs and care guidelines for this new disease.


Noh J, Kavuluru R. Literature Retrieval for Precision Medicine with Neural Matching and Faceted Summarization. Proc Conf Empir Methods Nat Lang Process 2020;2020:3389-3399. [PMID: 34541588 PMCID: PMC8444997 DOI: 10.18653/v1/2020.findings-emnlp.304] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/26/2022]

Ickes M, Hester JW, Wiggins AT, Rayens MK, Hahn EJ, Kavuluru R. Prevalence and reasons for Juul use among college students. J Am Coll Health 2020;68:455-459. [PMID: 30913003 PMCID: PMC6763357 DOI: 10.1080/07448481.2019.1577867] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Received: 09/10/2018] [Revised: 12/17/2018] [Accepted: 01/29/2019] [Indexed: 06/01/2023]

Bakal G, Kilicoglu H, Kavuluru R. Non-Negative Matrix Factorization for Drug Repositioning: Experiments with the repoDB Dataset. AMIA Annu Symp Proc 2020;2019:238-247. [PMID: 32308816 PMCID: PMC7153111] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]

Tran T, Kavuluru R. Social media surveillance for perceived therapeutic effects of cannabidiol (CBD) products. Int J Drug Policy 2020;77:102688. [PMID: 32092666 DOI: 10.1016/j.drugpo.2020.102688] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 08/20/2019] [Revised: 01/06/2020] [Accepted: 01/24/2020] [Indexed: 01/03/2023]

Abstract

BACKGROUND

CBD products have risen in popularity given CBD's therapeutic potential and lack of legal oversight, despite lacking conclusive scientific evidence for widespread over-the-counter usage for many of its perceived benefits. While medical evidence is being generated, social media surveillance offers a fast and inexpensive alternative to traditional surveys in ascertaining perceived therapeutic purposes and modes of consumption for CBD products.

METHODS

We collected all comments from the CBD subreddit posted between January 1 and April 30, 2019 as well as comments submitted to the FDA regarding regulation of cannabis-derived products and analyzed them using a rule-based language processing method. A relative ranking of popular therapeutic uses and product groups for CBD is obtained based on frequency of pattern matches including precise queries that entail identifying mentions of the condition, a CBD product, and some "trigger" phrase indicating therapeutic use. We validated the social media-based findings using a similar analysis on comments to the U.S. Food and Drug Administration's (FDA) 2019 request-for-comments on cannabis-derived products.

RESULTS

CBD is mostly discussed as a remedy for anxiety disorders and pain and this is consistent across both comment sources. Of comments posted to the CBD subreddit during the monitored time span, 6.19% mentioned anxiety at least once with at least 6.02% of these comments specifically mentioning CBD as a treatment for anxiety (i.e., 0.37% of total comments). The most popular CBD product group is oil and tinctures.

CONCLUSION

Social media surveillance of CBD usage has the potential to surface new therapeutic use cases as they are posted. Contemporary social media data indicate, for example, that stress and nausea are frequently mentioned as therapeutic use cases for CBD, although the research literature offers no corresponding evidence that affirms or refutes these uses. However, the abundance of anecdotal claims warrants serious scientific exploration moving forward. Meanwhile, as the FDA ponders regulation, our effort demonstrates that social data offer a convenient way to surveil CBD usage patterns that is fast and inexpensive and can inform conventional electronic surveys.
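The rule-based matching described in the Methods above (a comment counts as a therapeutic-use mention when it contains a condition, a CBD product, and a "trigger" phrase) and the percentage arithmetic in the Results can be sketched as follows. The vocabularies, function names, and example comments are hypothetical stand-ins chosen for illustration; the study's actual lexicons and patterns are not reproduced in the abstract.

```python
import re
from collections import Counter

# Hypothetical vocabularies for illustration only.
CONDITIONS = ["anxiety", "pain", "stress", "nausea"]
PRODUCTS = ["cbd oil", "tincture", "gummies", "cbd"]
TRIGGERS = ["helps with", "helped with", "for my", "to treat", "relieves"]


def find_any(text, phrases):
    """Return the first phrase that occurs in the text on word boundaries, else None."""
    for phrase in phrases:
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            return phrase
    return None


def therapeutic_mention(comment):
    """Return (condition, product, trigger) if all three are present, else None."""
    text = comment.lower()
    condition = find_any(text, CONDITIONS)
    product = find_any(text, PRODUCTS)
    trigger = find_any(text, TRIGGERS)
    if condition and product and trigger:
        return (condition, product, trigger)
    return None


def rank_conditions(comments):
    """Rank conditions by how many comments contain a full pattern match."""
    counts = Counter()
    for comment in comments:
        hit = therapeutic_mention(comment)
        if hit:
            counts[hit[0]] += 1
    return counts.most_common()


# Arithmetic from the Results: 6.02% of the 6.19% of comments that mention anxiety.
anxiety_share = 0.0619
treatment_share_within_anxiety = 0.0602
overall_share = anxiety_share * treatment_share_within_anxiety  # about 0.0037, i.e. 0.37%
```

The product of the two reported shares, 0.0619 × 0.0602 ≈ 0.0037, matches the 0.37% of total comments stated in the abstract.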

Walsh CG, Chaudhry B, Dua P, Goodman KW, Kaplan B, Kavuluru R, Solomonides A, Subbian V. Stigma, biomarkers, and algorithmic bias: recommendations for precision behavioral health with artificial intelligence. JAMIA Open 2020;3:9-15. [PMID: 32607482 PMCID: PMC7309258 DOI: 10.1093/jamiaopen/ooz054] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 03/08/2019] [Revised: 07/29/2019] [Accepted: 10/30/2019] [Indexed: 12/22/2022] Open

Sarker A, Belousov M, Friedrichs J, Hakala K, Kiritchenko S, Mehryary F, Han S, Tran T, Rios A, Kavuluru R, de Bruijn B, Ginter F, Mahata D, Mohammad SM, Nenadic G, Gonzalez-Hernandez G. Data and systems for medication-related text classification and concept normalization from Twitter: insights from the Social Media Mining for Health (SMM4H)-2017 shared task. J Am Med Inform Assoc 2019;25:1274-1283. [PMID: 30272184 PMCID: PMC6188524 DOI: 10.1093/jamia/ocy114] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 05/08/2018] [Accepted: 08/02/2018] [Indexed: 12/19/2022] Open

Abstract

Objective

We executed the Social Media Mining for Health (SMM4H) 2017 shared tasks to enable the community-driven development and large-scale evaluation of automatic text processing methods for the classification and normalization of health-related text from social media. An additional objective was to publicly release manually annotated data.

Materials and Methods

We organized 3 independent subtasks: automatic classification of self-reports of 1) adverse drug reactions (ADRs) and 2) medication consumption, from medication-mentioning tweets, and 3) normalization of ADR expressions. Training data consisted of 15 717 annotated tweets for (1), 10 260 for (2), and 6650 ADR phrases and identifiers for (3); and exhibited typical properties of social-media-based health-related texts. Systems were evaluated using 9961, 7513, and 2500 instances for the 3 subtasks, respectively. We evaluated performances of classes of methods and ensembles of system combinations following the shared tasks.

Results

Among 55 system runs, the best system scores for the 3 subtasks were 0.435 (ADR class F₁-score) for subtask-1, 0.693 (micro-averaged F₁-score over two classes) for subtask-2, and 88.5% (accuracy) for subtask-3. Ensembles of system combinations obtained best scores of 0.476, 0.702, and 88.7%, outperforming individual systems.
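For readers less familiar with the two F₁ variants named above, the following sketch computes a single-class F₁ and a micro-averaged F₁ restricted to a subset of classes. The function name, the integer class coding, and the toy labels are assumptions made for illustration; they are not the shared task's evaluation script.

```python
def f1_for_classes(gold, pred, classes):
    """Micro-averaged F1 restricted to `classes`.

    With a single class this reduces to the ordinary per-class F1.
    """
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == p and p in classes)
    fp = sum(1 for g, p in pairs if g != p and p in classes)
    fn = sum(1 for g, p in pairs if g != p and g in classes)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy labels invented for illustration; the integer coding is an assumption.
gold = [1, 1, 2, 3, 3, 2]
pred = [1, 2, 2, 3, 1, 3]
single_class_f1 = f1_for_classes(gold, pred, {1})
micro_f1_two_classes = f1_for_classes(gold, pred, {1, 2})
```

Micro-averaging pools true positives, false positives, and false negatives across the selected classes before computing precision and recall, so frequent classes weigh more than rare ones.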

Discussion

Among individual systems, support vector machines and convolutional neural networks showed high performance. Performance gains achieved by ensembles of system combinations suggest that such strategies may be suitable for operational systems relying on difficult text classification tasks (eg, subtask-1).

Conclusions

Data imbalance and lack of context remain challenges for natural language processing of social media text. Annotated data from the shared task have been made available as reference standards for future studies (http://dx.doi.org/10.17632/rxwfb3tysd.1).

Affiliation(s)

Abeed Sarker Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
Maksim Belousov School of Computer Science, University of Manchester, Manchester, UK
Jasper Friedrichs Infosys Limited, Palo Alto, California, USA
Kai Hakala Turku NLP Group, Department of Future Technologies, University of Turku, Turku, Finland; The University of Turku Graduate School, University of Turku, Turku, Finland
Svetlana Kiritchenko Digital Technologies Research Centre, National Research Council Canada, Ottawa, Canada
Farrokh Mehryary Turku NLP Group, Department of Future Technologies, University of Turku, Turku, Finland; The University of Turku Graduate School, University of Turku, Turku, Finland
Sifei Han Department of Computer Science, University of Kentucky, Lexington, Kentucky, USA
Tung Tran Department of Computer Science, University of Kentucky, Lexington, Kentucky, USA
Anthony Rios Department of Computer Science, University of Kentucky, Lexington, Kentucky, USA
Ramakanth Kavuluru Department of Computer Science, University of Kentucky, Lexington, Kentucky, USA; Division of Biomedical Informatics, Department of Internal Medicine, University of Kentucky, Lexington, Kentucky, USA
Berry de Bruijn Digital Technologies Research Centre, National Research Council Canada, Ottawa, Canada
Filip Ginter Turku NLP Group, Department of Future Technologies, University of Turku, Turku, Finland
Debanjan Mahata Bloomberg, New York, New York, USA
Saif M Mohammad Digital Technologies Research Centre, National Research Council Canada, Ottawa, Canada
Goran Nenadic School of Computer Science, University of Manchester, Manchester, UK
Graciela Gonzalez-Hernandez Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA

Ward PJ, Rock PJ, Slavova S, Young AM, Bunn TL, Kavuluru R. Enhancing timeliness of drug overdose mortality surveillance: A machine learning approach. PLoS One 2019;14:e0223318. [PMID: 31618226 PMCID: PMC6795484 DOI: 10.1371/journal.pone.0223318] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 03/04/2019] [Accepted: 09/18/2019] [Indexed: 11/26/2022] Open

Abstract

Background

Timely data is key to effective public health responses to epidemics. Drug overdose deaths are identified in surveillance systems through ICD-10 codes present on death certificates. ICD-10 coding takes time, but free-text information is available on death certificates prior to ICD-10 coding. The objective of this study was to develop a machine learning method to classify free-text death certificates as drug overdoses to provide faster drug overdose mortality surveillance.

Methods

Using 2017–2018 Kentucky death certificate data, free-text fields were tokenized and features were created from these tokens using natural language processing (NLP). Word, bigram, and trigram features were created as well as features indicating the part-of-speech of each word. These features were then used to train machine learning classifiers on 2017 data. The resulting models were tested on 2018 Kentucky data and compared to a simple rule-based classification approach. Documented code for this method is available for reuse and extensions: https://github.com/pjward5656/dcnlp.
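The word, bigram, and trigram features mentioned above can be illustrated with a short standard-library sketch. The tokenizer and function names here are simplifying assumptions and are not taken from the linked repository; part-of-speech features are omitted because they would require a tagger.

```python
import re
from collections import Counter

def tokenize(text):
    # Simplified tokenizer (an assumption): lowercase runs of letters and digits.
    return re.findall(r"[a-z0-9]+", text.lower())

def ngram_features(text, max_n=3):
    """Count word n-grams of length 1..max_n in a free-text field."""
    tokens = tokenize(text)
    feats = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats[" ".join(tokens[i:i + n])] += 1
    return feats

# Invented example of a cause-of-death free-text entry.
feats = ngram_features("Acute fentanyl toxicity")
```

Feature dictionaries like `feats` would then be vectorized and passed to a classifier trained on one year of data and tested on the next.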

Results

The top scoring machine learning model achieved 0.96 positive predictive value (PPV) and 0.98 sensitivity for an F-score of 0.97 in identification of fatal drug overdoses on test data. This machine learning model achieved significantly higher performance for sensitivity (p<0.001) than the rule-based approach. Additional feature engineering may improve the model’s prediction. This model can be deployed on death certificates as soon as the free-text is available, eliminating the time needed to code the death certificates.
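Assuming the reported F-score is the balanced F₁ (the harmonic mean of PPV and sensitivity), which the abstract does not state explicitly, the three reported figures are mutually consistent:

```latex
F_1 = \frac{2 \cdot \mathrm{PPV} \cdot \mathrm{Sens}}{\mathrm{PPV} + \mathrm{Sens}}
    = \frac{2\,(0.96)(0.98)}{0.96 + 0.98}
    = \frac{1.8816}{1.94} \approx 0.97
```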

Conclusion

Machine learning using natural language processing is a relatively new approach in the context of surveillance of health conditions. This method presents an accessible application of machine learning that improves the timeliness of drug overdose mortality surveillance. As such, it can be employed to inform public health responses to the drug overdose epidemic in near-real time as opposed to several weeks following events.

Rios A, Durbin EB, Hands I, Arnold SM, Shah D, Schwartz SM, Goulart BHL, Kavuluru R. Cross-registry neural domain adaptation to extract mutational test results from pathology reports. J Biomed Inform 2019;97:103267. [PMID: 31401235 PMCID: PMC6736690 DOI: 10.1016/j.jbi.2019.103267] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 03/23/2019] [Revised: 07/30/2019] [Accepted: 08/05/2019] [Indexed: 10/26/2022]

Abstract

OBJECTIVE

We study the performance of machine learning (ML) methods, including neural networks (NNs), to extract mutational test results from pathology reports collected by cancer registries. Given the lack of hand-labeled datasets for mutational test result extraction, we focus on the particular use-case of extracting Epidermal Growth Factor Receptor mutation results in non-small cell lung cancers. We explore the generalization of NNs across different registries where our goals are twofold: (1) to assess how well models trained on a registry's data port to test data from a different registry and (2) to assess whether and to what extent such models can be improved using state-of-the-art neural domain adaptation techniques under different assumptions about what is available (labeled vs unlabeled data) at the target registry site.

MATERIALS AND METHODS

We collected data from two registries: the Kentucky Cancer Registry (KCR) and the Fred Hutchinson Cancer Research Center (FH) Cancer Surveillance System. We combine NNs with adversarial domain adaptation to improve cross-registry performance. We compare to other classifiers in the standard supervised classification, unsupervised domain adaptation, and supervised domain adaptation scenarios.

RESULTS

The performance of ML methods varied between registries. To extract positive results, the basic convolutional neural network (CNN) had an F1 of 71.5% on the KCR dataset and 95.7% on the FH dataset. For the KCR dataset, the CNN F1 results were low when trained on FH data (Positive F1: 23%). Using our proposed adversarial CNN, without any labeled data, we match the F1 of the models trained directly on each target registry's data. The adversarial CNN F1 improved when trained on FH and applied to KCR dataset (Positive F1: 70.8%). We found similar performance improvements when we trained on KCR and tested on FH reports (Positive F1: 45% to 96%).

CONCLUSION

Adversarial domain adaptation improves the performance of NNs applied to pathology reports. In the unsupervised domain adaptation setting, we match the performance of models that are trained directly on target registry's data by using source registry's labeled data and unlabeled examples from the target registry.

Tran T, Kavuluru R. Distant supervision for treatment relation extraction by leveraging MeSH subheadings. Artif Intell Med 2019;98:18-26. [PMID: 31521249 PMCID: PMC6748648 DOI: 10.1016/j.artmed.2019.06.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 08/16/2018] [Revised: 06/04/2019] [Accepted: 06/05/2019] [Indexed: 11/26/2022]

Goulart BHL, Silgard ET, Baik CS, Bansal A, Sun Q, Durbin EB, Hands I, Shah D, Arnold SM, Ramsey SD, Kavuluru R, Schwartz SM. Validity of Natural Language Processing for Ascertainment of EGFR and ALK Test Results in SEER Cases of Stage IV Non-Small-Cell Lung Cancer. JCO Clin Cancer Inform 2019;3:1-15. [PMID: 31058542 PMCID: PMC6874053 DOI: 10.1200/cci.18.00098] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Accepted: 01/29/2019] [Indexed: 01/03/2023] Open

Abstract

PURPOSE

SEER registries do not report results of epidermal growth factor receptor (EGFR) and anaplastic lymphoma kinase (ALK) mutation tests. To facilitate population-based research in molecularly defined subgroups of non-small-cell lung cancer (NSCLC), we assessed the validity of natural language processing (NLP) for the ascertainment of EGFR and ALK testing from electronic pathology (e-path) reports of NSCLC cases included in two SEER registries: the Cancer Surveillance System (CSS) and the Kentucky Cancer Registry (KCR).

METHODS

We obtained 4,278 e-path reports from 1,634 patients who were diagnosed with stage IV nonsquamous NSCLC from September 1, 2011, to December 31, 2013, included in CSS. We used 855 CSS reports to train NLP systems for the ascertainment of EGFR and ALK test status (reported v not reported) and test results (positive v negative). We assessed sensitivity, specificity, and positive and negative predictive values in an internal validation sample of 3,423 CSS e-path reports and repeated the analysis in an external sample of 1,041 e-path reports from 565 KCR patients. Two oncologists manually reviewed all e-path reports to generate gold-standard data sets.
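The four validity measures named above all derive from a 2x2 table of system output against the gold standard. A minimal sketch follows; the function name and the counts are made up for illustration and are not the study's data.

```python
def validity_metrics(tp, fp, tn, fn):
    """Standard validity measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among gold-standard positives
        "specificity": tn / (tn + fp),  # true negatives among gold-standard negatives
        "ppv": tp / (tp + fp),          # correct among predicted positives
        "npv": tn / (tn + fn),          # correct among predicted negatives
    }

# Made-up counts, for illustration only.
metrics = validity_metrics(tp=90, fp=5, tn=95, fn=10)
```

Because PPV and NPV depend on how common positive results are in the sample, they can shift between an internal and an external validation set even when sensitivity and specificity stay similar.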

RESULTS

NLP systems yielded internal validity metrics that ranged from 0.95 to 1.00 for EGFR and ALK test status and results in CSS e-path reports. NLP showed high internal accuracy for the ascertainment of EGFR and ALK in CSS patients, with F scores of 0.95 and 0.96, respectively. In the external validation analysis, NLP yielded metrics that ranged from 0.02 to 0.96 in KCR reports and F scores of 0.70 and 0.72, respectively, in KCR patients.

CONCLUSION

NLP is an internally valid method for the ascertainment of EGFR and ALK test information from e-path reports available in SEER registries, but future work is necessary to increase NLP external validity.

Rios A, Kavuluru R. Neural transfer learning for assigning diagnosis codes to EMRs. Artif Intell Med 2019;96:116-122. [PMID: 31164204 DOI: 10.1016/j.artmed.2019.04.002] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 07/19/2018] [Revised: 12/20/2018] [Accepted: 04/10/2019] [Indexed: 11/25/2022]

Abstract

OBJECTIVE

Electronic medical records (EMRs) are manually annotated by healthcare professionals and specialized medical coders with a standardized set of alphanumeric diagnosis and procedure codes, specifically from the International Classification of Diseases (ICD). Annotating EMRs with ICD codes is important for medical billing and downstream epidemiological studies. However, manually annotating EMRs is both time-consuming and error prone. In this paper, we explore the use of convolutional neural networks (CNNs) for automatic ICD coding. Because many codes occur infrequently, CNN performance is inhibited. Therefore, we propose supplementing EMR data with PubMed indexed biomedical research abstracts through neural transfer learning.

MATERIALS AND METHODS

Transfer learning is the process of "transferring" knowledge acquired from one task (the source task) to a different (target) task. For the source task, we train a CNN to predict medical subject headings (MeSH) using 1.6 million PubMed indexed biomedical abstracts. For the target task, we train a CNN on 71,463 real-world EMRs collected from the University of Kentucky (UKY) medical center to predict ICD diagnosis codes. We introduce a simple, yet effective, transfer learning methodology which avoids forgetting knowledge gained from the source task.

RESULTS

Compared to our prior work using EMRs from the UKY medical center, we improve both the micro and macro F-scores by more than 8%. Likewise, compared to other transfer learning methods, our approach results in nearly 2% improvement in macro F-score.

CONCLUSION

We show that transfer learning can improve CNN performance for EMR coding in the presence of data sparsity issues. Furthermore, we find that our proposed transfer learning approach outperforms other methods with respect to macro F-score. Finally, we analyze how transfer learning impacts codes with respect to code frequency. We find that we achieve greater improvement on infrequent codes compared to improvements in most frequent codes.

Kavuluru R, Han S, Hahn EJ. On the popularity of the USB flash drive-shaped electronic cigarette Juul. Tob Control 2019;28:110-112. [PMID: 29654121 PMCID: PMC6186192 DOI: 10.1136/tobaccocontrol-2018-054259] [Citation(s) in RCA: 81] [Impact Index Per Article: 16.2] [Received: 01/16/2018] [Revised: 03/28/2018] [Accepted: 03/29/2018] [Indexed: 11/04/2022]

Islamaj Dogan R, Kim S, Chatr-Aryamontri A, Wei CH, Comeau DC, Antunes R, Matos S, Chen Q, Elangovan A, Panyam NC, Verspoor K, Liu H, Wang Y, Liu Z, Altinel B, Hüsünbeyi ZM, Özgür A, Fergadis A, Wang CK, Dai HJ, Tran T, Kavuluru R, Luo L, Steppi A, Zhang J, Qu J, Lu Z. Overview of the BioCreative VI Precision Medicine Track: mining protein interactions and mutations for precision medicine. Database (Oxford) 2019;2019:5303240. [PMID: 30689846 PMCID: PMC6348314 DOI: 10.1093/database/bay147] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 03/02/2018] [Accepted: 12/19/2018] [Indexed: 12/16/2022]

Abstract

The Precision Medicine Initiative is a multicenter effort aiming at formulating personalized treatments leveraging individual patient data (clinical, genome sequence and functional genomic data) together with the information in large knowledge bases (KBs) that integrate genome annotation, disease association studies, electronic health records and other data types. The biomedical literature provides a rich foundation for populating these KBs, reporting genetic and molecular interactions that provide the scaffold for the cellular regulatory systems and detailing the influence of genetic variants in these interactions. The BioCreative VI Precision Medicine Track aimed to extract this particular type of information and was organized in two tasks: (i) document triage task, focused on identifying scientific literature containing experimentally verified protein-protein interactions (PPIs) affected by genetic mutations and (ii) relation extraction task, focused on extracting the affected interactions (protein pairs). To assist system developers and task participants, a large-scale corpus of PubMed documents was manually annotated for this task. Ten teams worldwide contributed 22 distinct text-mining models for the document triage task, and six teams worldwide contributed 14 different text-mining systems for the relation extraction task. When comparing the text-mining system predictions with human annotations, for the triage task, the best F-score was 69.06%, the best precision was 62.89%, the best recall was 98.0% and the best average precision was 72.5%. For the relation extraction task, when taking homologous genes into account, the best F-score was 37.73%, the best precision was 46.5% and the best recall was 54.1%. Submitted systems explored a wide range of methods, from traditional rule-based, statistical and machine learning systems to state-of-the-art deep learning methods.
Given the level of participation and the individual team results we find the precision medicine track to be successful in engaging the text-mining research community. In the meantime, the track produced a manually annotated corpus of 5509 PubMed documents developed by BioGRID curators and relevant for precision medicine. The data set is freely available to the community, and the specific interactions have been integrated into the BioGRID data set. In addition, this challenge provided the first results of automatically identifying PubMed articles that describe PPI affected by mutations, as well as extracting the affected relations from those articles. Still, much progress is needed for computer-assisted precision medicine text mining to become mainstream. Future work should focus on addressing the remaining technical challenges and incorporating the practical benefits of text-mining tools into real-world precision medicine information-related curation.

Affiliation(s)

Rezarta Islamaj Dogan National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
Sun Kim National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
Andrew Chatr-Aryamontri Institute for Research in Immunology and Cancer, Université de Montréal, Montréal, Canada
Chih-Hsuan Wei National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
Donald C Comeau National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
Rui Antunes Department of Electronics, Telecommunications and Informatics (DETI)/Institute of Electronics and Informatics Engineering of Aveiro (IEETA), University of Aveiro, Aveiro, Portugal
Sérgio Matos Department of Electronics, Telecommunications and Informatics (DETI)/Institute of Electronics and Informatics Engineering of Aveiro (IEETA), University of Aveiro, Aveiro, Portugal
Qingyu Chen School of Computing and Information Systems, The University of Melbourne, Melbourne, VIC, Australia
Aparna Elangovan School of Computing and Information Systems, The University of Melbourne, Melbourne, VIC, Australia
Nagesh C Panyam School of Computing and Information Systems, The University of Melbourne, Melbourne, VIC, Australia
Karin Verspoor School of Computing and Information Systems, The University of Melbourne, Melbourne, VIC, Australia
Hongfang Liu Department of Health Science Research, Mayo Clinic, Rochester, MN, USA
Yanshan Wang Department of Health Science Research, Mayo Clinic, Rochester, MN, USA
Zhuang Liu School of Computer Science and Technology, Dalian University of Technology, Dalian, China
Berna Altinel Department of Computer Engineering, Marmara University, Istanbul, Turkey
Zehra Melce Hüsünbeyi Department of Computer Engineering, Bogaziçi University, Istanbul, Turkey
Arzucan Özgür
Aris Fergadis School of Electrical and Computer Engineering, National Technical University of Athens, Zografou, Athens, Greece
Chen-Kai Wang Graduate Institute of Biomedical Informatics, Taipei Medical University, Taipei, Taiwan
Hong-Jie Dai Department of Electrical Engineering, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan
Tung Tran Department of Computer Science, University of Kentucky, Lexington, KY, USA
Ramakanth Kavuluru Division of Biomedical Informatics, Department of Internal Medicine, University of Kentucky, Lexington, KY, USA
Ling Luo College of Computer Science and Technology, Dalian University of Technology, Dalian, China
Albert Steppi Department of Statistics, Florida State University, Florida, USA
Jinfeng Zhang Department of Statistics, Florida State University, Florida, USA
Jinchan Qu Department of Statistics, Florida State University, Florida, USA
Zhiyong Lu National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA

Noh J, Kavuluru R. Document Retrieval for Biomedical Question Answering with Neural Sentence Matching. Proc Int Conf Mach Learn Appl 2018;2018:194-201. [PMID: 30714048 PMCID: PMC6353660 DOI: 10.1109/icmla.2018.00036] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/07/2022]

Peng Y, Rios A, Kavuluru R, Lu Z. Extracting chemical-protein relations with ensembles of SVM and deep learning models. Database (Oxford) 2018;2018:5055578. [PMID: 30020437 PMCID: PMC6051439 DOI: 10.1093/database/bay073] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 02/05/2018] [Accepted: 06/15/2018] [Indexed: 11/14/2022]

Rios A, Kavuluru R. Few-Shot and Zero-Shot Multi-Label Learning for Structured Label Spaces. Proc Conf Empir Methods Nat Lang Process 2018;2018:3132-3142. [PMID: 30775726 PMCID: PMC6375489] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]

Rios A, Kavuluru R, Lu Z. Generalizing biomedical relation classification with neural adversarial domain adaptation. Bioinformatics 2018;34:2973-2981. [PMID: 29590309 PMCID: PMC6129312 DOI: 10.1093/bioinformatics/bty190] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 12/27/2017] [Revised: 03/15/2018] [Accepted: 03/25/2018] [Indexed: 11/13/2022] Open

Abstract

Motivation

Creating large datasets for biomedical relation classification can be prohibitively expensive. While some datasets have been curated to extract protein-protein and drug-drug interactions (PPIs and DDIs) from text, we are also interested in other interactions including gene-disease and chemical-protein connections. Also, many biomedical researchers have begun to explore ternary relationships. Even when annotated data are available, many datasets used for relation classification are inherently biased. For example, issues such as sample selection bias typically prevent models from generalizing in the wild. To address the problem of cross-corpora generalization, we present a novel adversarial learning algorithm for unsupervised domain adaptation tasks where no labeled data are available in the target domain. Instead, our method takes advantage of unlabeled data to improve biased classifiers through learning domain-invariant features via an adversarial process. Finally, our method is built upon recent advances in neural network (NN) methods.

Results

We experiment by extracting PPIs and DDIs from text. In our experiments, we show domain invariant features can be learned in NNs such that classifiers trained for one interaction type (protein-protein) can be re-purposed to others (drug-drug). We also show that our method can adapt to different source and target pairs of PPI datasets. Compared to prior convolutional and recurrent NN-based relation classification methods without domain adaptation, we achieve improvements as high as 30% in F1-score. Likewise, we show improvements over state-of-the-art adversarial methods.

Availability and implementation

Experimental code is available at https://github.com/bionlproc/adversarial-relation-classification.

Supplementary information

Supplementary data are available at Bioinformatics online.

Bakal G, Talari P, Kakani EV, Kavuluru R. Exploiting semantic patterns over biomedical knowledge graphs for predicting treatment and causative relations. J Biomed Inform 2018;82:189-199. [PMID: 29763706 PMCID: PMC6070294 DOI: 10.1016/j.jbi.2018.05.003] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 08/09/2017] [Revised: 01/31/2018] [Accepted: 05/09/2018] [Indexed: 01/27/2023]

Abstract

BACKGROUND

Identifying new potential treatment options for medical conditions that cause human disease burden is a central task of biomedical research. Since not all candidate drugs can be tested in animal and clinical trials, in vitro approaches are first attempted to identify promising candidates. Likewise, identifying different causal relations between biomedical entities is also critical to understanding biomedical processes. Generally, natural language processing (NLP) and machine learning are used to predict specific relations between any given pair of entities using the distant supervision approach.

OBJECTIVE

To build high accuracy supervised predictive models to predict previously unknown treatment and causative relations between biomedical entities based only on semantic graph pattern features extracted from biomedical knowledge graphs.

METHODS

We used 7000 "treats" and 2918 "causes" hand-curated relations from the UMLS Metathesaurus to train and test our models. Our graph pattern features are extracted from simple paths connecting biomedical entities in the SemMedDB graph (based on the well-known SemMedDB database made available by the U.S. National Library of Medicine). Using these graph patterns connecting biomedical entities as features of logistic regression and decision tree models, we computed mean performance measures (precision, recall, F-score) over 100 distinct 80-20% train-test splits of the datasets. For all experiments, we used a positive:negative class imbalance of 1:10 in the test set to model relatively more realistic scenarios.
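The evaluation protocol above (repeated 80-20 splits with a 1:10 positive-to-negative ratio in the test set) might be sketched as follows. How the authors sampled negatives is not described in the abstract, so the sampling scheme, the function names, and the toy data here are assumptions.

```python
import random

def split_once(positives, negatives, seed, test_frac=0.2, neg_per_pos=10):
    """One train/test split with a fixed positive:negative ratio in the test set."""
    rng = random.Random(seed)
    pos, neg = positives[:], negatives[:]
    rng.shuffle(pos)
    rng.shuffle(neg)
    n_pos = round(len(pos) * test_frac)
    n_neg = n_pos * neg_per_pos
    if n_neg > len(neg):
        raise ValueError("not enough negatives for the requested ratio")
    train = (pos[n_pos:], neg[n_neg:])
    test = (pos[:n_pos], neg[:n_neg])
    return train, test

def mean_score(positives, negatives, evaluate, n_splits=100):
    """Average evaluate(train, test) over n_splits differently seeded splits."""
    scores = [evaluate(*split_once(positives, negatives, seed=s))
              for s in range(n_splits)]
    return sum(scores) / len(scores)

# Toy data: integers stand in for entity pairs.
positives = list(range(100))
negatives = list(range(1000, 3000))
train, test = split_once(positives, negatives, seed=0)
```

In practice `evaluate` would train a model on `train` and return precision, recall, or F-score on `test`; averaging over many seeded splits reduces the variance of any single split.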

RESULTS

Our models predict "treats" and "causes" relations with high F-scores of 99% and 90%, respectively. Logistic regression model coefficients also help us identify highly discriminative patterns that have an intuitive interpretation. In collaboration with two physician co-authors, we also identified some new plausible relations among the false positives that our models scored highly. Finally, our decision tree models are able to retrieve over 50% of treatment relations from a recently created external dataset.

CONCLUSIONS

We employed semantic graph patterns connecting pairs of candidate biomedical entities in a knowledge graph as features to predict treatment/causative relations between them. We provide what we believe is the first evidence in direct prediction of biomedical relations based on graph features. Our work complements lexical pattern based approaches in that the graph patterns can be used as additional features for weakly supervised relation prediction.
