1
Campion TR, Sholle ET, Abedian S, Fuld X, McGregor R, Lewis AN, Gripp LT, Leonard JP, Cole CL. Implementation of a commercial federated network of electronic health record data to enable sponsor-initiated clinical trials at an academic medical center. Int J Med Inform 2024; 182:105322. [PMID: 38128198 PMCID: PMC10843646 DOI: 10.1016/j.ijmedinf.2023.105322] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Revised: 12/11/2023] [Accepted: 12/17/2023] [Indexed: 12/23/2023]
Abstract
BACKGROUND A commercial federated network called TriNetX has connected electronic health record (EHR) data from academic medical centers (AMCs) with biopharmaceutical sponsors in a privacy-preserving manner to promote sponsor-initiated clinical trials. Little is known about how AMCs have implemented TriNetX to support clinical trials. FINDINGS At our AMC over a six-year period, TriNetX integrated into existing institutional workflows enabled 402 requests for sponsor-initiated clinical trials, 14% (n = 56) of which local investigators expressed interest in conducting. Although clinical trials administrators indicated TriNetX yielded unique study opportunities, measurement of impact of institutional participation in the network was challenging due to lack of a common trial identifier shared across TriNetX, sponsor, and our institution. CONCLUSION To the best of our knowledge, this study is among the first to describe integration of a federated network of EHR data into institutional workflows for sponsor-initiated clinical trials. This case report may inform efforts at other institutions.
Affiliation(s)
- Thomas R Campion
  - Clinical & Translational Science Center, Weill Cornell Medicine, New York, NY, USA; Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, USA; Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
- Evan T Sholle
  - Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, USA; Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
- Sajjad Abedian
  - Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, USA
- Xiaobo Fuld
  - Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, USA
- Ryan McGregor
  - Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, USA
- Alicia N Lewis
  - Joint Clinical Trials Office, Weill Cornell Medicine, New York, NY, USA
- Lauren T Gripp
  - Joint Clinical Trials Office, Weill Cornell Medicine, New York, NY, USA
- John P Leonard
  - Department of Medicine, Weill Cornell Medicine, New York, NY, USA
- Curtis L Cole
  - Clinical & Translational Science Center, Weill Cornell Medicine, New York, NY, USA; Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, USA; Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA; Department of Medicine, Weill Cornell Medicine, New York, NY, USA
2
Sholle ET, Davila MA, Kostka K, Abedian S, Cusick M, Krichevsky S, Pathak J, Campion TR. Comparative Merits of Available Mortality Data Sources for Clinical Research. AMIA Annu Symp Proc 2024; 2023:634-640. [PMID: 38222379 PMCID: PMC10785894] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/16/2024]
Abstract
Obtaining reliable data on patient mortality is a critical challenge facing observational researchers seeking to conduct studies using real-world data. As these analyses are conducted more broadly using newly-available sources of real-world evidence, missing data can serve as a rate-limiting factor. We conducted a comparison of mortality data sources from different stakeholder perspectives - academic medical center (AMC) informatics service providers, AMC research coordinators, industry analytics professionals, and academics - to understand the strengths and limitations of differing mortality data sources: locally generated data from sites conducting research, data provided by governmental sources, and commercially available data sets. Researchers seeking to conduct observational studies using extant data should consider these factors in sourcing outcomes data for their populations of interest.
Affiliation(s)
- Evan T Sholle
  - Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, NY
- Marcos A Davila
  - Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
- Sajjad Abedian
  - Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
- Marika Cusick
  - Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
  - Stanford University, Stanford, CA
- Spencer Krichevsky
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, NY
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY
- Jyotishman Pathak
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, NY
- Thomas R Campion
  - Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, NY
  - Department of Pediatrics, Weill Cornell Medicine, New York, NY
3
Hartman VC, Bapat SS, Weiner MG, Navi BB, Sholle ET, Campion TR. A method to automate the discharge summary hospital course for neurology patients. J Am Med Inform Assoc 2023; 30:1995-2003. [PMID: 37639624 PMCID: PMC10654848 DOI: 10.1093/jamia/ocad177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2023] [Revised: 07/17/2023] [Accepted: 08/16/2023] [Indexed: 08/31/2023]
Abstract
OBJECTIVE Generation of automated clinical notes has been posited as a strategy to mitigate physician burnout. In particular, an automated narrative summary of a patient's hospital stay could supplement the hospital course section of the discharge summary that inpatient physicians document in electronic health record (EHR) systems. In the current study, we developed and evaluated an automated method for summarizing the hospital course section using encoder-decoder sequence-to-sequence transformer models. MATERIALS AND METHODS We fine-tuned BERT and BART models and optimized for factuality through constraining beam search, which we trained and tested using EHR data from patients admitted to the neurology unit of an academic medical center. RESULTS The approach demonstrated good ROUGE scores with an R-2 of 13.76. In a blind evaluation, 2 board-certified physicians rated 62% of the automated summaries as meeting the standard of care, which suggests the method may be useful clinically. DISCUSSION AND CONCLUSION To our knowledge, this study is among the first to demonstrate an automated method for generating a discharge summary hospital course that approaches a quality level of what a physician would write.
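For readers unfamiliar with the metric reported above, ROUGE-2 scores a generated summary by its bigram overlap with a reference summary. The following is a minimal sketch, assuming lowercase whitespace tokenization and a single reference; the function names `bigrams` and `rouge_2` are illustrative and this is not the evaluation code used in the study (standard ROUGE toolkits typically add stemming and other preprocessing).

```python
from collections import Counter


def bigrams(text: str) -> Counter:
    """Count lowercase, whitespace-delimited bigrams in a text."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def rouge_2(candidate: str, reference: str) -> dict:
    """Return ROUGE-2 precision, recall, and F1 for one candidate/reference pair."""
    cand, ref = bigrams(candidate), bigrams(reference)
    # Counter intersection keeps the minimum count per bigram (clipped matches).
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical example sentences, not data from the study.
scores = rouge_2(
    "the patient was admitted for stroke",
    "the patient was admitted with acute stroke",
)
```

Scores from this sketch fall between 0 and 1; papers often report them multiplied by 100.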
Affiliation(s)
- Vince C Hartman
  - Cornell Tech, New York, NY 10044, United States
  - Abstractive Health, New York, NY 10022, United States
- Sanika S Bapat
  - Cornell Tech, New York, NY 10044, United States
  - Abstractive Health, New York, NY 10022, United States
- Mark G Weiner
  - Department of Medicine, Weill Cornell Medicine, New York, NY 10065, United States
  - Department of Population Health, Weill Cornell Medicine, New York, NY 10065, United States
- Babak B Navi
  - Department of Neurology and Feil Family Brain and Mind Research Institute, Weill Cornell Medicine, New York, NY 10065, United States
- Evan T Sholle
  - Department of Population Health, Weill Cornell Medicine, New York, NY 10065, United States
- Thomas R Campion
  - Department of Population Health, Weill Cornell Medicine, New York, NY 10065, United States
  - Clinical & Translational Science Center, Weill Cornell Medicine, New York, NY 10065, United States
4
Turchioe MR, Mangal S, Goyal P, Axsom K, Myers A, Liu LG, Lee J, Campion TR, Creber RM. A RE-AIM Evaluation of a Visualization-Based Electronic Patient-Reported Outcome System. Appl Clin Inform 2023; 14:227-237. [PMID: 36603838 PMCID: PMC10033223 DOI: 10.1055/a-2008-4036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2022] [Accepted: 01/04/2023] [Indexed: 01/07/2023]
Abstract
OBJECTIVES Health care systems are primarily collecting patient-reported outcomes (PROs) for research and clinical care using proprietary, institution- and disease-specific tools for remote assessment. The purpose of this study was to conduct a Reach, Effectiveness, Adoption, Implementation, and Maintenance (RE-AIM) evaluation of a scalable electronic PRO (ePRO) reporting and visualization system in a single-arm study. METHODS The "mi.symptoms" ePRO system was designed using gerontechnological design principles to ensure high usability among older adults. The system enables longitudinal reporting of disease-agnostic ePROs and includes patient-facing PRO visualizations. We conducted an evaluation of the implementation of the system guided by the RE-AIM framework. Quantitative data were analyzed using basic descriptive statistics, and qualitative data were analyzed using directed content analysis. RESULTS Reach: the total reach of the study was 70 participants (median age: 69, 31% female, 17% Black or African American, 27% reported not having enough financial resources). Effectiveness: half (51%) of participants completed the 2-week follow-up survey and 36% completed all follow-up surveys. Adoption: the desire for increased self-knowledge, the value of tracking symptoms, and altruism motivated participants to adopt the tool. Implementation: the predisposing factor was access to, and comfort with, computers. Three enabling factors were incorporation into routines, multimodal nudges, and ease of use. Maintenance: reinforcing factors were perceived usefulness of viewing symptom reports with the tool and understanding the value of sustained symptom tracking in general. CONCLUSION Challenges in ePRO reporting, particularly sustained patient engagement, remain. Nonetheless, freely available, scalable, disease-agnostic systems may pave the road toward inclusion of a more diverse range of health systems and patients in ePRO collection and use.
Affiliation(s)
- Sabrina Mangal
  - University of Washington School of Nursing, Seattle, Washington, United States
  - Division of General Internal Medicine, Department of Medicine, Weill Cornell Medicine, New York, New York, United States
- Parag Goyal
  - Division of General Internal Medicine, Department of Medicine, Weill Cornell Medicine, New York, New York, United States
- Kelly Axsom
  - Division of Cardiology, Center for Advanced Cardiac Care, Columbia University Medical Center, New York, New York, United States
- Annie Myers
  - Columbia University School of Nursing, New York, New York, United States
- Lisa G. Liu
  - Department of Pediatrics, University of California San Francisco, San Francisco, California, United States
- Jessie Lee
  - Department of Pediatrics, University of California San Francisco, San Francisco, California, United States
- Thomas R. Campion
  - University of Washington School of Nursing, Seattle, Washington, United States
  - Clinical and Translational Science Center, Weill Cornell Medicine, New York, New York, United States
5
Cusick M, Velupillai S, Downs J, Campion TR, Sholle ET, Dutta R, Pathak J. Portability of natural language processing methods to detect suicidality from clinical text in US and UK electronic health records. J Affect Disord Rep 2022; 10:100430. [PMID: 36644339 PMCID: PMC9835770 DOI: 10.1016/j.jadr.2022.100430] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/06/2022]
Abstract
Background In the global effort to prevent death by suicide, many academic medical institutions are implementing natural language processing (NLP) approaches to detect suicidality from unstructured clinical text in electronic health records (EHRs), with the hope of targeting timely, preventative interventions to individuals most at risk of suicide. Despite the international need, the development of these NLP approaches in EHRs has been largely local and not shared across healthcare systems. Methods In this study, we developed a process to share NLP approaches that were individually developed at King's College London (KCL), UK and Weill Cornell Medicine (WCM), US - two academic medical centers based in different countries with vastly different healthcare systems. We tested and compared the algorithms' performance on manually annotated clinical notes (KCL: n = 4,911; WCM: n = 837). Results After a successful technical porting of the NLP approaches, our quantitative evaluation determined that independently developed NLP approaches can detect suicidality at another healthcare organization with a different EHR system, clinical documentation processes, and culture, yet do not achieve the same level of success as at the institution where the NLP algorithm was developed (KCL approach: F1-score 0.85 vs. 0.68, WCM approach: F1-score 0.87 vs. 0.72). Limitations Independent NLP algorithm development and patient cohort selection at the two institutions compromised direct comparability. Conclusions Shared use of these NLP approaches is a critical step forward towards improving data-driven algorithms for early suicide risk identification and timely prevention.
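The F1-scores quoted above combine precision and recall into a single number. As a minimal sketch, assuming a binary note-level label (1 = suicidality documented, 0 = not) and treating the manual annotation as the gold standard, the metric can be computed as follows. The function name and the example labels are hypothetical; the abstract does not state how the study averaged or aggregated its scores.

```python
def f1_score(gold: list[int], predicted: list[int]) -> float:
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = list(zip(gold, predicted))
    true_pos = sum(1 for g, p in pairs if g == 1 and p == 1)
    false_pos = sum(1 for g, p in pairs if g == 0 and p == 1)
    false_neg = sum(1 for g, p in pairs if g == 1 and p == 0)
    if true_pos == 0:
        return 0.0  # no correct positive predictions
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations for six notes, not data from the study.
gold_labels = [1, 1, 1, 0, 0, 1]
nlp_labels = [1, 0, 1, 1, 0, 1]
score = f1_score(gold_labels, nlp_labels)
```

Running the same function on notes from the developing site and from the receiving site would give the kind of paired comparison the abstract reports.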
Affiliation(s)
- Marika Cusick
  - Weill Cornell Medicine, 402 E. 67th St., New York, NY 10065, USA, South London and Maudsley NHS Foundation Trust, London, UK, Corresponding author. (M. Cusick)
- Sumithra Velupillai
  - IoPPN, King’s College London, London, UK, South London and Maudsley NHS Foundation Trust, London, UK
- Johnny Downs
  - IoPPN, King’s College London, London, UK, South London and Maudsley NHS Foundation Trust, London, UK
- Thomas R. Campion
  - Weill Cornell Medicine, 402 E. 67th St., New York, NY 10065, USA, South London and Maudsley NHS Foundation Trust, London, UK
- Evan T. Sholle
  - Weill Cornell Medicine, 402 E. 67th St., New York, NY 10065, USA, South London and Maudsley NHS Foundation Trust, London, UK
- Rina Dutta
  - IoPPN, King’s College London, London, UK, South London and Maudsley NHS Foundation Trust, London, UK
- Jyotishman Pathak
  - Weill Cornell Medicine, 402 E. 67th St., New York, NY 10065, USA, South London and Maudsley NHS Foundation Trust, London, UK
6
Wood EA, Campion TR. Design and implementation of an integrated data model to support clinical and translational research administration. J Am Med Inform Assoc 2022; 29:1559-1566. [PMID: 35713633 DOI: 10.1093/jamia/ocac100] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/28/2022] [Revised: 05/17/2022] [Accepted: 06/06/2022] [Indexed: 11/13/2022]
Abstract
OBJECTIVE Both academic medical centers and biomedical research sponsors need to understand impact of scientific funding to determine value. For the National Institutes of Health (NIH) Clinical and Translational Science Award (CTSA) hubs, tracking research activities can be complex, often involving multiple institutions and continually changing federal reporting requirements. Existing research administrative systems are institution-specific and tend to focus only on parts of a greater whole. The goal of this case report is to describe a comprehensive data model that addresses this gap. MATERIALS AND METHODS Web-based Center Administrative Management Program (WebCAMP) has been developed over a period of over 15 years in the context of CTSA hubs, with the recent addition of T32 programs. Its data model centers around the key concepts of people, projects, resources (inputs), and outcomes (outputs). RESULTS The WebCAMP data model and associated toolset for biomedical research administration integrates multiple components of the research enterprise, has been used by our CTSA hub for over 15 years and has been adopted by more than 20 other CTSA hubs. DISCUSSION To the best of our knowledge, this study is among the first to describe a comprehensive data model for biomedical research administration. Opportunities for future work include improved grant tracking through the development of a universal identifier that spans public and private funders, and a more generic outcomes tracking model able to rapidly incorporate new outcome types. CONCLUSION We propose that the WebCAMP data model, or a derivative of it, could serve as a future standard for research administrative data warehousing.
Affiliation(s)
- Elizabeth A Wood
  - Clinical and Translational Science Center, Weill Cornell Medical College, New York, New York, USA
- Thomas R Campion
  - Clinical and Translational Science Center, Weill Cornell Medical College, New York, New York, USA; Department of Population Health Sciences, Weill Cornell Medical College, New York, New York, USA
7
Hartman V, Campion TR. A Day-to-Day Approach for Automating the Hospital Course Section of the Discharge Summary. AMIA Jt Summits Transl Sci Proc 2022; 2022:216-225. [PMID: 35854728 PMCID: PMC9285173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/01/2023]
Abstract
Optimal solutions for abstractive summarization of electronic health record content have yet to be discovered. Although studies have applied state-of-the-art transformers in the clinical domain to radiology reports and information extraction, little is known of transformers' performance with the hospital course section of the discharge summary. This paper compares two summarization approaches for automating the hospital course section within the discharge summary: (1) a truncation approach that uses all clinical notes and (2) a day-to-day approach that segments the notes per clinical day. We pair both approaches with different transformer encoder-decoder-based models (BART, BERT2GPT2, ClinicalBERT2GPT2, and ClinicalBERT2ClinicalBERT) and evaluate which transformers work best for each approach using ROUGE metrics. The results demonstrate that the day-to-day approach can overcome the limitations of long-form document summarization for the patient clinical record.
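The difference between the two input strategies can be sketched in a few lines. This is an illustration only, built on several assumptions that the abstract does not confirm: the `Note` type is hypothetical, the truncation approach is assumed to cut the concatenated notes at a fixed token limit, the per-day summaries are assumed to be joined in date order, and `summarize` is a stand-in for whichever transformer model is used.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Callable


@dataclass
class Note:
    """Hypothetical clinical note: the day it was written and its text."""
    written_on: date
    text: str


def truncation_input(notes: list[Note], max_tokens: int) -> str:
    """Assumed approach 1: join all notes in date order, keep the first max_tokens tokens."""
    ordered = sorted(notes, key=lambda n: n.written_on)
    tokens = " ".join(n.text for n in ordered).split()
    return " ".join(tokens[:max_tokens])


def day_to_day_inputs(notes: list[Note]) -> dict[date, str]:
    """Assumed approach 2: build one model input per clinical day."""
    by_day: dict[date, list[str]] = defaultdict(list)
    for note in notes:
        by_day[note.written_on].append(note.text)
    return {day: " ".join(texts) for day, texts in sorted(by_day.items())}


def summarize_by_day(notes: list[Note], summarize: Callable[[str], str]) -> str:
    """Summarize each day separately, then join the daily summaries in date order."""
    return " ".join(summarize(text) for text in day_to_day_inputs(notes).values())


# Hypothetical notes and a trivial placeholder "model" that keeps the first word.
example_notes = [
    Note(date(2022, 1, 2), "MRI confirms infarct"),
    Note(date(2022, 1, 1), "Admitted with weakness"),
    Note(date(2022, 1, 1), "Started aspirin"),
]
daily = day_to_day_inputs(example_notes)
hospital_course = summarize_by_day(example_notes, lambda text: text.split()[0])
```

The point of the per-day split is that each model input stays short, so no single day's notes are lost to an input length limit.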
8
Murphy SN, Visweswaran S, Becich MJ, Campion TR, Knosp BM, Melton-Meaux GB, Lenert LA. Research data warehouse best practices: catalyzing national data sharing through informatics innovation. J Am Med Inform Assoc 2022; 29:581-584. [PMID: 35289371 PMCID: PMC8922176 DOI: 10.1093/jamia/ocac024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2022] [Accepted: 02/14/2022] [Indexed: 11/12/2022]
Affiliation(s)
- Shawn N Murphy
  - Research Information Science and Computing, Mass General Brigham, Somerville, Massachusetts, USA
  - Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Shyam Visweswaran
  - Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
  - Clinical and Translational Science Institute, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
- Michael J Becich
  - Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
  - Clinical and Translational Science Institute, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
- Thomas R Campion
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, USA
  - Clinical and Translational Science Center, Weill Cornell Medicine, New York, New York, USA
- Boyd M Knosp
  - Roy J. and Lucille A. Carver College of Medicine and the Institute for Clinical & Translational Science, University of Iowa, Iowa City, Iowa, USA
- Genevieve B Melton-Meaux
  - Department of Surgery, University of Minnesota, Minneapolis, Minnesota, USA
  - Institute for Health Informatics (IHI), University of Minnesota, Minneapolis, Minnesota, USA
- Leslie A Lenert
  - Biomedical Informatics Center (BMIC), Medical University of South Carolina, Charleston, South Carolina, USA
  - Health Sciences South Carolina, Columbia, South Carolina, USA
9
Pfaff ER, Girvin AT, Gabriel DL, Kostka K, Morris M, Palchuk MB, Lehmann HP, Amor B, Bissell M, Bradwell KR, Gold S, Hong SS, Loomba J, Manna A, McMurry JA, Niehaus E, Qureshi N, Walden A, Zhang XT, Zhu RL, Moffitt RA, Haendel MA, Chute CG, Adams WG, Al-Shukri S, Anzalone A, Baghal A, Bennett TD, Bernstam EV, Bissell MM, Bush B, Campion TR, Castro V, Chang J, Chaudhari DD, Chen W, Chu S, Cimino JJ, Crandall KA, Crooks M, Davies SJD, DiPalazzo J, Dorr D, Eckrich D, Eltinge SE, Fort DG, Golovko G, Gupta S, Haendel MA, Hajagos JG, Hanauer DA, Harnett BM, Horswell R, Huang N, Johnson SG, Kahn M, Khanipov K, Kieler C, Luzuriaga KRD, Maidlow S, Martinez A, Mathew J, McClay JC, McMahan G, Melancon B, Meystre S, Miele L, Morizono H, Pablo R, Patel L, Phuong J, Popham DJ, Pulgarin C, Santos C, Sarkar IN, Sazo N, Setoguchi S, Soby S, Surampalli S, Suver C, Vangala UMR, Visweswaran S, von Oehsen J, Walters KM, Wiley L, Williams DA, Zai A. Synergies between centralized and federated approaches to data quality: a report from the national COVID cohort collaborative. J Am Med Inform Assoc 2022; 29:609-618. [PMID: 34590684 PMCID: PMC8500110 DOI: 10.1093/jamia/ocab217] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Received: 07/15/2021] [Revised: 08/19/2021] [Accepted: 09/23/2021] [Indexed: 02/01/2023]
Abstract
OBJECTIVE In response to COVID-19, the informatics community united to aggregate as much clinical data as possible to characterize this new disease and reduce its impact through collaborative analytics. The National COVID Cohort Collaborative (N3C) is now the largest publicly available HIPAA limited dataset in US history with over 6.4 million patients and is a testament to a partnership of over 100 organizations. MATERIALS AND METHODS We developed a pipeline for ingesting, harmonizing, and centralizing data from 56 contributing data partners using 4 federated Common Data Models. N3C data quality (DQ) review involves both automated and manual procedures. In the process, several DQ heuristics were discovered in our centralized context, both within the pipeline and during downstream project-based analysis. Feedback to the sites led to many local and centralized DQ improvements. RESULTS Beyond well-recognized DQ findings, we discovered 15 heuristics relating to source Common Data Model conformance, demographics, COVID tests, conditions, encounters, measurements, observations, coding completeness, and fitness for use. Of 56 sites, 37 sites (66%) demonstrated issues through these heuristics. These 37 sites demonstrated improvement after receiving feedback. DISCUSSION We encountered site-to-site differences in DQ which would have been challenging to discover using federated checks alone. We have demonstrated that centralized DQ benchmarking reveals unique opportunities for DQ improvement that will support improved research analytics locally and in aggregate. CONCLUSION By combining rapid, continual assessment of DQ with a large volume of multisite data, it is possible to support more nuanced scientific questions with the scale and rigor that they require.
Affiliation(s)
- Emily R Pfaff
  - Department of Medicine, UNC Chapel Hill School of Medicine, Chapel Hill, North Carolina, USA
- Davera L Gabriel
  - Section of Biomedical Informatics and Data Science, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Kristin Kostka
  - The OHDSI Center at the Roux Institute, Northeastern University, Portland, Maine, USA
- Michele Morris
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- Harold P Lehmann
  - Department of Medicine, Johns Hopkins School of Medicine, Baltimore, Maryland, USA
- Sigfried Gold
  - Section of Biomedical Informatics and Data Science, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Stephanie S Hong
  - Section of Biomedical Informatics and Data Science, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Amin Manna
  - Palantir Technologies, Denver, Colorado, USA
- Julie A McMurry
  - Center for Health AI, University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- Anita Walden
  - Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, Oregon, USA
- Richard L Zhu
  - Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Richard A Moffitt
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, New York, USA
- Melissa A Haendel
  - University of Colorado Anschutz Medical Campus, Aurora, Colorado, USA
- Christopher G Chute
  - Schools of Medicine, Public Health, and Nursing, Johns Hopkins University, Baltimore, Maryland, USA
10
Knosp BM, Craven CK, Dorr DA, Bernstam EV, Campion TR. Understanding enterprise data warehouses to support clinical and translational research: enterprise information technology relationships, data governance, workforce, and cloud computing. J Am Med Inform Assoc 2022; 29:671-676. [PMID: 35289370 PMCID: PMC8922193 DOI: 10.1093/jamia/ocab256] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/15/2021] [Accepted: 11/05/2021] [Indexed: 01/22/2023]
Abstract
OBJECTIVE Among National Institutes of Health Clinical and Translational Science Award (CTSA) hubs, effective approaches for enterprise data warehouses for research (EDW4R) development, maintenance, and sustainability remain unclear. The goal of this qualitative study was to understand CTSA EDW4R operations within the broader contexts of academic medical centers and technology. MATERIALS AND METHODS We performed a directed content analysis of transcripts generated from semistructured interviews with informatics leaders from 20 CTSA hubs. RESULTS Respondents referred to services provided by health system, university, and medical school information technology (IT) organizations as "enterprise information technology (IT)." Seventy-five percent of respondents stated that the team providing EDW4R service at their hub was separate from enterprise IT; strong relationships between EDW4R teams and enterprise IT were critical for success. Managing challenges of EDW4R staffing was made easier by executive leadership support. Data governance appeared to be a work in progress, as most hubs reported complex and incomplete processes, especially for commercial data sharing. Although nearly all hubs (n = 16) described use of cloud computing for specific projects, only 2 hubs reported using a cloud-based EDW4R. Respondents described EDW4R cloud migration facilitators, barriers, and opportunities. DISCUSSION Descriptions of approaches to how EDW4R teams at CTSA hubs work with enterprise IT organizations, manage workforces, make decisions about data, and approach cloud computing provide insights for institutions seeking to leverage patient data for research. CONCLUSION Identification of EDW4R best practices is challenging, and this study helps identify a breadth of viable options for CTSA hubs to consider when implementing EDW4R services.
Affiliation(s)
- Boyd M Knosp
  - Roy J. and Lucille A. Carver College of Medicine and the Institute for Clinical & Translational Science, University of Iowa, Iowa City, Iowa, USA
- Catherine K Craven
  - Division of Clinical Research Informatics, Department of Population Health Sciences, University of Texas Health San Antonio, San Antonio, Texas, USA
- David A Dorr
  - Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, Oregon, USA
  - Department of Medicine, Oregon Health & Science University, Portland, Oregon, USA
- Elmer V Bernstam
  - Center for Clinical and Translational Sciences, University of Texas Health Science Center, Houston, Texas, USA
- Thomas R Campion
  - Clinical & Translational Science Center, Weill Cornell Medicine, New York, New York, USA
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, USA
11
Vest JR, Adler-Milstein J, Gottlieb LM, Bian J, Campion TR, Cohen GR, Donnelly N, Harper J, Huerta TR, Kansky JP, Kharrazi H, Khurshid A, Kooreman HE, McDonnell C, Overhage JM, Pantell MS, Parisi W, Shenkman EA, Tierney WM, Wiehe S, Harle CA. Assessment of structured data elements for social risk factors. Am J Manag Care 2022; 28:e14-e23. [PMID: 35049262 DOI: 10.37765/ajmc.2022.88816] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/14/2023]
Abstract
OBJECTIVES Computable social risk factor phenotypes derived from routinely collected structured electronic health record (EHR) or health information exchange (HIE) data may represent a feasible and robust approach to measuring social factors. This study convened an expert panel to identify and assess the quality of individual EHR and HIE structured data elements that could be used as components in future computable social risk factor phenotypes. STUDY DESIGN Technical expert panel. METHODS A 2-round Delphi technique included 17 experts with an in-depth knowledge of available EHR and/or HIE data. The first-round identification sessions followed a nominal group approach to generate candidate data elements that may relate to socioeconomics, cultural context, social relationships, and community context. In the second-round survey, panelists rated each data element according to overall data quality and likelihood of systematic differences in quality across populations (ie, bias). RESULTS Panelists identified a total of 89 structured data elements. About half of the data elements (n = 45) were related to socioeconomic characteristics. The panelists identified a diverse set of data elements. Elements used in reimbursement-related processes were generally rated as higher quality. Panelists noted that several data elements may be subject to implicit bias or reflect biased systems of care, which may limit their utility in measuring social factors. CONCLUSIONS Routinely collected structured data within EHR and HIE systems may reflect patient social risk factors. Identifying and assessing available data elements serves as a foundational step toward developing future computable social factor phenotypes.
Affiliation(s)
- Joshua R Vest
- Indiana University Richard M. Fairbanks School of Public Health, 1050 Wishard Blvd, Indianapolis, IN 46202.
12
Yin AL, Guo WL, Sholle ET, Rajan M, Alshak MN, Choi JJ, Goyal P, Jabri A, Li HA, Pinheiro LC, Wehmeyer GT, Weiner M, Safford MM, Campion TR, Cole CL. Comparing automated vs. manual data collection for COVID-specific medications from electronic health records. Int J Med Inform 2022; 157:104622. [PMID: 34741892 PMCID: PMC8529289 DOI: 10.1016/j.ijmedinf.2021.104622] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/10/2021] [Revised: 09/19/2021] [Accepted: 10/15/2021] [Indexed: 12/29/2022]
Abstract
INTRODUCTION Data extraction from electronic health record (EHR) systems occurs through manual abstraction, automated extraction, or a combination of both. While each method has its strengths and weaknesses, both are necessary for retrospective observational research as well as sudden clinical events, like the COVID-19 pandemic. Assessing the strengths, weaknesses, and potentials of these methods is important to continue to understand optimal approaches to extracting clinical data. We set out to assess automated and manual techniques for collecting medication use data in patients with COVID-19 to inform future observational studies that extract data from the electronic health record (EHR). MATERIALS AND METHODS For 4,123 COVID-positive patients hospitalized and/or seen in the emergency department at an academic medical center between 03/03/2020 and 05/15/2020, we compared medication use data of 25 medications or drug classes collected through manual abstraction and automated extraction from the EHR. Quantitatively, we assessed concordance using Cohen's kappa to measure interrater reliability, and qualitatively, we audited observed discrepancies to determine causes of inconsistencies. RESULTS For the 16 inpatient medications, 11 (69%) demonstrated moderate or better agreement; 7 of those demonstrated strong or almost perfect agreement. For 9 outpatient medications, 3 (33%) demonstrated moderate agreement, but none achieved strong or almost perfect agreement. We audited 12% of all discrepancies (716/5,790) and, in those audited, observed three principal categories of error: human error in manual abstraction (26%), errors in the extract-transform-load (ETL) or mapping of the automated extraction (41%), and abstraction-query mismatch (33%). CONCLUSION Our findings suggest many inpatient medications can be collected reliably through automated extraction, especially when abstraction instructions are designed with data architecture in mind. 
We discuss quality issues, concerns, and improvements for institutions to consider when crafting an approach. During crises, institutions must decide how to allocate limited resources. We show that automated extraction of medications is feasible and make recommendations on how to improve future iterations.
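The concordance statistic named in this abstract, Cohen's kappa, can be illustrated with a minimal sketch. The function name `cohens_kappa` and the ten example labels are hypothetical and are not taken from the study; the sketch only shows how observed agreement is corrected for agreement expected by chance.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of items on which the two sources agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each source's marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sources used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical data: 1 = medication recorded for the patient, 0 = not recorded.
manual = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
automated = [1, 1, 0, 0, 0, 0, 1, 0, 0, 1]
kappa = cohens_kappa(manual, automated)
print(round(kappa, 3))
```

In this made-up example the two sources agree on 8 of 10 patients, but about half of that agreement would be expected by chance, so kappa is noticeably lower than the raw agreement rate.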
Affiliation(s)
- Andrew L. Yin
- Weill Cornell Medical College, Weill Cornell Medicine, New York, NY, United States; Department of Medicine, Weill Cornell Medicine, New York, NY, United States. Corresponding author at: 1300 York Avenue, New York, NY 10021, United States
- Winston L. Guo
- Weill Cornell Medical College, Weill Cornell Medicine, New York, NY, United States
- Evan T. Sholle
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, United States
- Mangala Rajan
- Department of Medicine, Weill Cornell Medicine, New York, NY, United States
- Mark N. Alshak
- Weill Cornell Medical College, Weill Cornell Medicine, New York, NY, United States; Department of Medicine, Weill Cornell Medicine, New York, NY, United States
- Justin J. Choi
- Division of General Internal Medicine, Weill Cornell Medicine, New York, NY, United States
- Parag Goyal
- Division of General Internal Medicine, Weill Cornell Medicine, New York, NY, United States
- Assem Jabri
- Division of General Internal Medicine, Weill Cornell Medicine, New York, NY, United States
- Han A. Li
- Weill Cornell Medical College, Weill Cornell Medicine, New York, NY, United States; Department of Medicine, Weill Cornell Medicine, New York, NY, United States
- Laura C. Pinheiro
- Department of Medicine, Weill Cornell Medicine, New York, NY, United States
- Graham T. Wehmeyer
- Weill Cornell Medical College, Weill Cornell Medicine, New York, NY, United States; Department of Medicine, Weill Cornell Medicine, New York, NY, United States
- Mark Weiner
- Department of Medicine, Weill Cornell Medicine, New York, NY, United States; Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, United States
- Monika M. Safford
- Division of General Internal Medicine, Weill Cornell Medicine, New York, NY, United States
- Thomas R. Campion
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, United States; Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, United States; Clinical and Translational Science Center, Weill Cornell Medicine, New York, NY, United States
- Curtis L. Cole
- Department of Medicine, Weill Cornell Medicine, New York, NY, United States; Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, United States
13
Campion TR, Sholle ET, Pathak J, Johnson SB, Leonard JP, Cole CL. An architecture for research computing in health to support clinical and translational investigators with electronic patient data. J Am Med Inform Assoc 2021; 29:677-685. [PMID: 34850911 PMCID: PMC8690260 DOI: 10.1093/jamia/ocab266] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/16/2021] [Revised: 10/20/2021] [Accepted: 11/15/2021] [Indexed: 12/13/2022]
Abstract
Objective Obtaining electronic patient data, especially from electronic health record (EHR) systems, for clinical and translational research is difficult. Multiple research informatics systems exist but navigating the numerous applications can be challenging for scientists. This article describes Architecture for Research Computing in Health (ARCH), our institution’s approach for matching investigators with tools and services for obtaining electronic patient data. Materials and Methods Supporting the spectrum of studies from populations to individuals, ARCH delivers a breadth of scientific functions—including but not limited to cohort discovery, electronic data capture, and multi-institutional data sharing—that manifest in specific systems—such as i2b2, REDCap, and PCORnet. Through a consultative process, ARCH staff align investigators with tools with respect to study design, data sources, and cost. Although most ARCH services are available free of charge, advanced engagements require fee for service. Results Since 2016 at Weill Cornell Medicine, ARCH has supported over 1200 unique investigators through more than 4177 consultations. Notably, ARCH infrastructure enabled critical coronavirus disease 2019 response activities for research and patient care. Discussion ARCH has provided a technical, regulatory, financial, and educational framework to support the biomedical research enterprise with electronic patient data. Collaboration among informaticians, biostatisticians, and clinicians has been critical to rapid generation and analysis of EHR data. Conclusion A suite of tools and services, ARCH helps match investigators with informatics systems to reduce time to science. ARCH has facilitated research at Weill Cornell Medicine and may provide a model for informatics and research leaders to support scientists elsewhere.
Affiliation(s)
- Thomas R Campion
- Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, USA; Department of Pediatrics, Weill Cornell Medicine, New York, New York, USA; Information Technologies & Services Department, Weill Cornell Medicine, New York, New York, USA; Clinical and Translational Science Center, Weill Cornell Medicine, New York, New York, USA
- Evan T Sholle
- Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, USA; Information Technologies & Services Department, Weill Cornell Medicine, New York, New York, USA; Clinical and Translational Science Center, Weill Cornell Medicine, New York, New York, USA
- Jyotishman Pathak
- Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, USA; Clinical and Translational Science Center, Weill Cornell Medicine, New York, New York, USA
- Stephen B Johnson
- Department of Population Health, New York University Grossman School of Medicine, New York, New York, USA
- John P Leonard
- Department of Medicine, Weill Cornell Medicine, New York, New York, USA
- Curtis L Cole
- Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, USA; Information Technologies & Services Department, Weill Cornell Medicine, New York, New York, USA; Clinical and Translational Science Center, Weill Cornell Medicine, New York, New York, USA; Department of Medicine, Weill Cornell Medicine, New York, New York, USA
14
Patra BG, Sharma MM, Vekaria V, Adekkanattu P, Patterson OV, Glicksberg B, Lepow LA, Ryu E, Biernacka JM, Furmanchuk A, George TJ, Hogan W, Wu Y, Yang X, Bian J, Weissman M, Wickramaratne P, Mann JJ, Olfson M, Campion TR, Weiner M, Pathak J. Extracting social determinants of health from electronic health records using natural language processing: a systematic review. J Am Med Inform Assoc 2021; 28:2716-2727. [PMID: 34613399 PMCID: PMC8633615 DOI: 10.1093/jamia/ocab170] [Citation(s) in RCA: 56] [Impact Index Per Article: 18.7] [Received: 04/29/2021] [Revised: 07/09/2021] [Accepted: 08/04/2021] [Indexed: 11/27/2022]
Abstract
OBJECTIVE Social determinants of health (SDoH) are nonclinical dispositions that impact patient health risks and clinical outcomes. Leveraging SDoH in clinical decision-making can potentially improve diagnosis, treatment planning, and patient outcomes. Despite increased interest in capturing SDoH in electronic health records (EHRs), such information is typically locked in unstructured clinical notes. Natural language processing (NLP) is the key technology to extract SDoH information from clinical text and expand its utility in patient care and research. This article presents a systematic review of the state-of-the-art NLP approaches and tools that focus on identifying and extracting SDoH data from unstructured clinical text in EHRs. MATERIALS AND METHODS A broad literature search was conducted in February 2021 using 3 scholarly databases (ACL Anthology, PubMed, and Scopus) following Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. A total of 6402 publications were initially identified, and after applying the study inclusion criteria, 82 publications were selected for the final review. RESULTS Smoking status (n = 27), substance use (n = 21), homelessness (n = 20), and alcohol use (n = 15) are the most frequently studied SDoH categories. Homelessness (n = 7) and other less-studied SDoH (eg, education, financial problems, social isolation and support, family problems) are mostly identified using rule-based approaches. In contrast, machine learning approaches are popular for identifying smoking status (n = 13), substance use (n = 9), and alcohol use (n = 9). CONCLUSION NLP offers significant potential to extract SDoH data from narrative clinical notes, which in turn can aid in the development of screening tools, risk prediction models, and clinical decision support systems.
Affiliation(s)
- Braja G Patra
- Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, USA
- Mohit M Sharma
- Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, USA
- Veer Vekaria
- Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, USA
- Prakash Adekkanattu
- Information Technologies and Services, Weill Cornell Medicine, New York, New York, USA
- Olga V Patterson
- Department of Internal Medicine, Division of Epidemiology, University of Utah, Salt Lake City, Utah, USA
- US Department of Veterans Affairs, Salt Lake City, Utah, USA
- Lauren A Lepow
- Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Euijung Ryu
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, Minnesota, USA
- Joanna M Biernacka
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, Minnesota, USA
- Thomas J George
- Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, Florida, USA
- William Hogan
- Division of Hematology & Oncology, Department of Medicine, College of Medicine, University of Florida, Gainesville, Florida, USA
- Yonghui Wu
- Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, Florida, USA
- Xi Yang
- Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, Florida, USA
- Jiang Bian
- Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, Florida, USA
- Myrna Weissman
- Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Priya Wickramaratne
- Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- J John Mann
- Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Mark Olfson
- Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Thomas R Campion
- Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, USA
- Information Technologies and Services, Weill Cornell Medicine, New York, New York, USA
- Mark Weiner
- Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, USA
- Jyotishman Pathak
- Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, USA
15
Schenck EJ, Hoffman KL, Oromendia C, Sanchez E, Finkelsztein EJ, Hong KS, Kabariti J, Torres LK, Harrington JS, Siempos II, Choi AMK, Campion TR. A Comparative Analysis of the Respiratory Subscore of the Sequential Organ Failure Assessment Scoring System. Ann Am Thorac Soc 2021; 18:1849-1860. [PMID: 33760709 PMCID: PMC8641830 DOI: 10.1513/annalsats.202004-399oc] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 04/29/2020] [Accepted: 03/23/2021] [Indexed: 11/20/2022]
Abstract
Rationale: The Sequential Organ Failure Assessment (SOFA) tool is a commonly used measure of illness severity. Calculation of the respiratory subscore of SOFA is frequently limited by missing arterial oxygen pressure (PaO2) data. Although missing PaO2 data are commonly replaced with normal values, the performance of different methods of substituting PaO2 for SOFA calculation is unclear. Objectives: The study objective was to compare the performance of different substitution strategies for missing PaO2 data for SOFA score calculation. Methods: This retrospective cohort study was performed using the Weill Cornell Critical Care Database for Advanced Research from a tertiary care hospital in the United States. All adult patients admitted to an intensive care unit (ICU) from 2011 to 2019 with an available respiratory SOFA score were included. We analyzed the availability of the PaO2/fraction of inspired oxygen (FiO2) ratio on the first day of ICU admission. In those without a PaO2/FiO2 ratio available, the ratio of oxygen saturation as measured by pulse oximetry to FiO2 was used to calculate a respiratory SOFA subscore according to four methods (linear substitution [Rice], nonlinear substitution [Severinghaus], modified respiratory SOFA, and multiple imputation by chained equations [MICE]) as well as the missing-as-normal technique. We then compared how well the different total SOFA scores discriminated in-hospital mortality. We performed several subgroup and sensitivity analyses. Results: We identified 35,260 unique visits, of which 9,172 included predominant respiratory failure. PaO2 data were available for 14,939 (47%). The area under the receiver operating characteristic curve for each substitution technique for discriminating in-hospital mortality was higher than that for the missing-as-normal technique (0.78 [0.77-0.79]) in all analyses (modified, 0.80 [0.79-0.81]; Rice, 0.80 [0.79-0.81]; Severinghaus, 0.80 [0.79-0.81]; and MICE, 0.80 [0.79-0.81]) (P < 0.01). 
Each substitution method had a higher accuracy for discriminating in-hospital mortality (MICE, 0.67; Rice, 0.67; modified, 0.66; and Severinghaus, 0.66) than the missing-as-normal technique. Model calibration for in-hospital mortality was less precise for the missing-as-normal technique than for the other substitution techniques at the lower range of SOFA and among the subgroups. Conclusions: Using physiologic and statistical substitution methods improved the total SOFA score's ability to discriminate mortality compared with the missing-as-normal technique. Treating missing data as normal may result in underreporting the severity of illness compared with using substitution. The simplicity of a direct oxygen saturation as measured by pulse oximetry/FiO2 ratio-modified SOFA technique makes it an attractive choice for electronic health record-based research. This knowledge can inform comparisons of severity of illness across studies that used different techniques.
Affiliation(s)
- Edward J Schenck
- Division of Pulmonary and Critical Care Medicine, Joan and Sanford I. Weill Department of Medicine
- NewYork-Presbyterian Hospital, Weill Cornell Medicine, New York, New York
- Elizabeth Sanchez
- Division of Pulmonary and Critical Care Medicine, Joan and Sanford I. Weill Department of Medicine
- Eli J Finkelsztein
- Division of Pulmonary and Critical Care Medicine, Joan and Sanford I. Weill Department of Medicine
- Kyung Sook Hong
- Division of Pulmonary and Critical Care Medicine, Joan and Sanford I. Weill Department of Medicine
- Department of Surgery and Critical Care Medicine, College of Medicine, Ewha Womans University, Seoul, Republic of Korea
- Lisa K Torres
- Division of Pulmonary and Critical Care Medicine, Joan and Sanford I. Weill Department of Medicine
- NewYork-Presbyterian Hospital, Weill Cornell Medicine, New York, New York
- John S Harrington
- Division of Pulmonary and Critical Care Medicine, Joan and Sanford I. Weill Department of Medicine
- NewYork-Presbyterian Hospital, Weill Cornell Medicine, New York, New York
- Ilias I Siempos
- Division of Pulmonary and Critical Care Medicine, Joan and Sanford I. Weill Department of Medicine
- Augustine M K Choi
- Division of Pulmonary and Critical Care Medicine, Joan and Sanford I. Weill Department of Medicine
- NewYork-Presbyterian Hospital, Weill Cornell Medicine, New York, New York
- Thomas R Campion
- Department of Population Health Sciences
- Information Technologies and Services
- Clinical and Translational Science Center, Weill Cornell Medicine, Cornell University, New York, New York
16
Abedian S, Sholle ET, Adekkanattu PM, Cusick MM, Weiner SE, Shoag JE, Hu JC, Campion TR. Automated Extraction of Tumor Staging and Diagnosis Information From Surgical Pathology Reports. JCO Clin Cancer Inform 2021; 5:1054-1061. [PMID: 34694896 PMCID: PMC8812635 DOI: 10.1200/cci.21.00065] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/14/2021] [Revised: 08/25/2021] [Accepted: 09/29/2021] [Indexed: 11/20/2022]
Abstract
PURPOSE Typically stored as unstructured notes, surgical pathology reports contain data elements valuable to cancer research that require labor-intensive manual extraction. Although studies have described natural language processing (NLP) of surgical pathology reports to automate information extraction, efforts have focused on specific cancer subtypes rather than across multiple oncologic domains. To address this gap, we developed and evaluated an NLP method to extract tumor staging and diagnosis information across multiple cancer subtypes. METHODS The NLP pipeline was implemented on an open-source framework called Leo. We used a total of 555,681 surgical pathology reports of 329,076 patients to develop the pipeline and evaluated our approach on subsets of reports from patients with breast, prostate, colorectal, and randomly selected cancer subtypes. RESULTS Averaged across all four cancer subtypes, the NLP pipeline achieved an accuracy of 1.00 for International Classification of Diseases, Tenth Revision codes, 0.89 for T staging, 0.90 for N staging, and 0.97 for M staging. It achieved an F1 score of 1.00 for International Classification of Diseases, Tenth Revision codes, 0.88 for T staging, 0.90 for N staging, and 0.24 for M staging. CONCLUSION The NLP pipeline was developed to extract tumor staging and diagnosis information across multiple cancer subtypes to support the research enterprise in our institution. Although it was not possible to demonstrate generalizability of our NLP pipeline to other institutions, other institutions may find value in adopting a similar NLP approach, and reusing code available at GitHub, to support the oncology research enterprise with elements extracted from surgical pathology reports.
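The abstract reports high accuracy alongside a low F1 score for M staging. One common way such a gap arises is class imbalance, although the abstract does not say this is the cause here. The sketch below uses hypothetical counts (100 reports, 3 of which truly contain the rare label) and a hypothetical helper `accuracy_and_f1`; it is not the authors' evaluation code, and their averaging scheme may differ.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    # F1 ignores true negatives, so it is not inflated by a large majority class.
    denominator = 2 * tp + fp + fn
    f1 = 2 * tp / denominator if denominator else 0.0
    return accuracy, f1


# Hypothetical data: 3 true positives exist; the extractor finds 1 of them
# and raises 1 false alarm among the 97 negatives.
y_true = [1, 1, 1] + [0] * 97
y_pred = [1, 0, 0] + [1] + [0] * 96
accuracy, f1 = accuracy_and_f1(y_true, y_pred)
print(accuracy, f1)
```

With these made-up counts accuracy is 0.97 while F1 is 0.40, because the 96 correctly labeled negatives count toward accuracy but not toward F1.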
Affiliation(s)
- Sajjad Abedian
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
- Evan T. Sholle
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY
- Marika M. Cusick
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
- Stephanie E. Weiner
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
- Jonathan E. Shoag
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY
- Jim C. Hu
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY
- Thomas R. Campion
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
- Department of Urology, Weill Cornell Medicine, New York, NY
- Clinical and Translational Science Center, Weill Cornell Medicine, New York, NY
- Department of Pediatrics, Weill Cornell Medicine, New York, NY
17
Su C, Xu Z, Hoffman K, Goyal P, Safford MM, Lee J, Alvarez-Mulett S, Gomez-Escobar L, Price DR, Harrington JS, Torres LK, Martinez FJ, Campion TR, Wang F, Schenck EJ. Identifying organ dysfunction trajectory-based subphenotypes in critically ill patients with COVID-19. Sci Rep 2021; 11:15872. [PMID: 34354174 PMCID: PMC8342520 DOI: 10.1038/s41598-021-95431-7] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 12/05/2020] [Accepted: 07/14/2021] [Indexed: 12/13/2022]
Abstract
COVID-19-associated respiratory failure offers the unprecedented opportunity to evaluate the differential host response to a uniform pathogenic insult. Understanding whether there are distinct subphenotypes of severe COVID-19 may offer insight into its pathophysiology. Sequential Organ Failure Assessment (SOFA) score is an objective and comprehensive measure of the dysfunction severity of six organ systems, i.e., cardiovascular, central nervous system, coagulation, liver, renal, and respiration. Our aim was to identify and characterize distinct subphenotypes of COVID-19 critical illness defined by the post-intubation trajectory of SOFA score. Intubated COVID-19 patients at two hospitals in New York City were leveraged as development and validation cohorts. Patients were grouped into mild, intermediate, and severe strata by their baseline post-intubation SOFA. Hierarchical agglomerative clustering was performed within each stratum to detect subphenotypes based on similarities amongst SOFA score trajectories evaluated by Dynamic Time Warping. Distinct worsening and recovering subphenotypes were identified within each stratum, which had distinct 7-day post-intubation SOFA progression trends. Patients in the worsening subphenotypes had a higher mortality than those in the recovering subphenotypes within each stratum (mild stratum, 29.7% vs. 10.3%, p = 0.033; intermediate stratum, 29.3% vs. 8.0%, p = 0.002; severe stratum, 53.7% vs. 22.2%, p < 0.001). Pathophysiologic biomarkers associated with progression were distinct at each stratum, including findings suggestive of inflammation in low baseline severity of illness versus hemophagocytic lymphohistiocytosis in higher baseline severity of illness. The findings suggest that there are clear worsening and recovering subphenotypes of COVID-19 respiratory failure after intubation, which are more predictive of outcomes than baseline severity of illness. 
Distinct progression biomarkers at differential baseline severity of illness suggest a heterogeneous pathobiology in the progression of COVID-19 respiratory failure.
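Dynamic Time Warping, the trajectory similarity measure named in this abstract, can be sketched with the classic dynamic-programming recurrence. The function `dtw_distance` and the SOFA-like trajectories below are hypothetical illustrations, not the study's code or data. A matrix of such pairwise distances is the kind of input a hierarchical agglomerative clustering routine could take.

```python
def dtw_distance(series_a, series_b):
    """Dynamic Time Warping distance between two numeric sequences,
    using absolute difference as the local cost and no window constraint."""
    inf = float("inf")
    n, m = len(series_a), len(series_b)
    # cost[i][j] = minimal cumulative cost of aligning the first i points
    # of series_a with the first j points of series_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(series_a[i - 1] - series_b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # advance in series_a only
                cost[i][j - 1],      # advance in series_b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


# Hypothetical daily scores after intubation.
worsening = [8, 9, 10, 11]
worsening_delayed = [8, 8, 9, 10, 11]  # same shape, shifted by one day
recovering = [8, 7, 6, 5]

print(dtw_distance(worsening, worsening_delayed))
print(dtw_distance(worsening, recovering))
```

Because warping allows one point to align with several, the delayed trajectory is at distance zero from the original despite having a different length, while the recovering trajectory is far from both.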
Affiliation(s)
- Chang Su
- Department of Population Health Sciences, Weill Cornell Medicine, 425 E 61 St., New York, NY, 10065, USA
- Zhenxing Xu
- Department of Population Health Sciences, Weill Cornell Medicine, 425 E 61 St., New York, NY, 10065, USA
- Katherine Hoffman
- Department of Population Health Sciences, Weill Cornell Medicine, 425 E 61 St., New York, NY, 10065, USA
- Parag Goyal
- Division of General Internal Medicine, Joan and Sanford I. Weill Department of Medicine, Weill Cornell Medicine, New York, NY, USA
- New York-Presbyterian Hospital, Weill Cornell Medicine, 1300 York Ave., Box 96, New York, NY, 10065, USA
- Monika M Safford
- Division of General Internal Medicine, Joan and Sanford I. Weill Department of Medicine, Weill Cornell Medicine, New York, NY, USA
- New York-Presbyterian Hospital, Weill Cornell Medicine, 1300 York Ave., Box 96, New York, NY, 10065, USA
- Jerry Lee
- Weill Cornell Medical College, Weill Cornell Medicine, New York, NY, USA
- Sergio Alvarez-Mulett
- New York-Presbyterian Hospital, Weill Cornell Medicine, 1300 York Ave., Box 96, New York, NY, 10065, USA
- Division of Pulmonary and Critical Care Medicine, Joan and Sanford I. Weill Department of Medicine, Weill Cornell Medicine, New York, NY, USA
- Luis Gomez-Escobar
- New York-Presbyterian Hospital, Weill Cornell Medicine, 1300 York Ave., Box 96, New York, NY, 10065, USA
- Division of Pulmonary and Critical Care Medicine, Joan and Sanford I. Weill Department of Medicine, Weill Cornell Medicine, New York, NY, USA
- David R Price
- New York-Presbyterian Hospital, Weill Cornell Medicine, 1300 York Ave., Box 96, New York, NY, 10065, USA
- Division of Pulmonary and Critical Care Medicine, Joan and Sanford I. Weill Department of Medicine, Weill Cornell Medicine, New York, NY, USA
- John S Harrington
- New York-Presbyterian Hospital, Weill Cornell Medicine, 1300 York Ave., Box 96, New York, NY, 10065, USA
- Division of Pulmonary and Critical Care Medicine, Joan and Sanford I. Weill Department of Medicine, Weill Cornell Medicine, New York, NY, USA
- Lisa K Torres
- New York-Presbyterian Hospital, Weill Cornell Medicine, 1300 York Ave., Box 96, New York, NY, 10065, USA
- Division of Pulmonary and Critical Care Medicine, Joan and Sanford I. Weill Department of Medicine, Weill Cornell Medicine, New York, NY, USA
- Fernando J Martinez
- New York-Presbyterian Hospital, Weill Cornell Medicine, 1300 York Ave., Box 96, New York, NY, 10065, USA
- Division of Pulmonary and Critical Care Medicine, Joan and Sanford I. Weill Department of Medicine, Weill Cornell Medicine, New York, NY, USA
- Thomas R Campion
- Department of Population Health Sciences, Weill Cornell Medicine, 425 E 61 St., New York, NY, 10065, USA
- Fei Wang
- Department of Population Health Sciences, Weill Cornell Medicine, 425 E 61 St., New York, NY, 10065, USA.
- Edward J Schenck
- New York-Presbyterian Hospital, Weill Cornell Medicine, 1300 York Ave., Box 96, New York, NY, 10065, USA.
- Division of Pulmonary and Critical Care Medicine, Joan and Sanford I. Weill Department of Medicine, Weill Cornell Medicine, New York, NY, USA.
18
Schenck EJ, Hoffman KL, Cusick M, Kabariti J, Sholle ET, Campion TR. Critical carE Database for Advanced Research (CEDAR): An automated method to support intensive care units with electronic health record data. J Biomed Inform 2021; 118:103789. [PMID: 33862230 DOI: 10.1016/j.jbi.2021.103789] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/23/2020] [Revised: 02/12/2021] [Accepted: 04/10/2021] [Indexed: 12/28/2022]
Abstract
Patients treated in an intensive care unit (ICU) are critically ill and require life-sustaining organ failure support. Existing critical care data resources are limited to a select number of institutions, contain only ICU data, and do not enable the study of local changes in care patterns. To address these limitations, we developed the Critical carE Database for Advanced Research (CEDAR), a method for automating extraction and transformation of data from an electronic health record (EHR) system. Compared to an existing gold standard of manually collected data at our institution, CEDAR was statistically similar in most measures, including patient demographics and sepsis-related organ failure assessment (SOFA) scores. Additionally, CEDAR automated data extraction obviated the need for manual collection of 550 variables. Critically, during the spring 2020 COVID-19 surge in New York City, a modified version of CEDAR supported pandemic response efforts, including clinical operations and research. Other academic medical centers may find value in using the CEDAR method to automate data extraction from EHR systems to support ICU activities.
Affiliation(s)
- Edward J Schenck
- Weill Department of Medicine, Weill Cornell Medicine, New York, NY, United States
| | - Katherine L Hoffman
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, United States
| | - Marika Cusick
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, United States
| | - Joseph Kabariti
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, United States
| | - Evan T Sholle
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, United States
| | - Thomas R Campion
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, United States; Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, United States; Department of Pediatrics, Weill Cornell Medicine, New York, NY, United States; Clinical & Translational Science Center, Weill Cornell Medicine, New York, NY, United States
|
19
|
Butler D, Mozsary C, Meydan C, Foox J, Rosiene J, Shaiber A, Danko D, Afshinnekoo E, MacKay M, Sedlazeck FJ, Ivanov NA, Sierra M, Pohle D, Zietz M, Gisladottir U, Ramlall V, Sholle ET, Schenck EJ, Westover CD, Hassan C, Ryon K, Young B, Bhattacharya C, Ng DL, Granados AC, Santos YA, Servellita V, Federman S, Ruggiero P, Fungtammasan A, Chin CS, Pearson NM, Langhorst BW, Tanner NA, Kim Y, Reeves JW, Hether TD, Warren SE, Bailey M, Gawrys J, Meleshko D, Xu D, Couto-Rodriguez M, Nagy-Szakal D, Barrows J, Wells H, O'Hara NB, Rosenfeld JA, Chen Y, Steel PAD, Shemesh AJ, Xiang J, Thierry-Mieg J, Thierry-Mieg D, Iftner A, Bezdan D, Sanchez E, Campion TR, Sipley J, Cong L, Craney A, Velu P, Melnick AM, Shapira S, Hajirasouliha I, Borczuk A, Iftner T, Salvatore M, Loda M, Westblade LF, Cushing M, Wu S, Levy S, Chiu C, Schwartz RE, Tatonetti N, Rennert H, Imielinski M, Mason CE. Shotgun transcriptome, spatial omics, and isothermal profiling of SARS-CoV-2 infection reveals unique host responses, viral diversification, and drug interactions. Nat Commun 2021; 12:1660. [PMID: 33712587 PMCID: PMC7954844 DOI: 10.1038/s41467-021-21361-7] [Citation(s) in RCA: 92] [Impact Index Per Article: 30.7] [Received: 09/03/2020] [Accepted: 01/25/2021] [Indexed: 02/08/2023] Open
Abstract
In less than nine months, the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) killed over a million people, including >25,000 in New York City (NYC) alone. The COVID-19 pandemic caused by SARS-CoV-2 highlights clinical needs to detect infection, track strain evolution, and identify biomarkers of disease course. To address these challenges, we designed a fast (30-minute) colorimetric test (LAMP) for SARS-CoV-2 infection from naso/oropharyngeal swabs and a large-scale shotgun metatranscriptomics platform (total-RNA-seq) for host, viral, and microbial profiling. We applied these methods to clinical specimens gathered from 669 patients in New York City during the first two months of the outbreak, yielding a broad molecular portrait of the emerging COVID-19 disease. We find significant enrichment of a NYC-distinctive clade of the virus (20C), as well as host responses in interferon, ACE, hematological, and olfaction pathways. In addition, we use 50,821 patient records to find that renin-angiotensin-aldosterone system inhibitors have a protective effect for severe COVID-19 outcomes, unlike similar drugs. Finally, spatial transcriptomic data from COVID-19 patient autopsy tissues reveal distinct ACE2 expression loci, with macrophage and neutrophil infiltration in the lungs. These findings can inform public health and may help develop and drive SARS-CoV-2 diagnostic, prevention, and treatment strategies.
Affiliation(s)
- Daniel Butler
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
| | - Christopher Mozsary
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
| | - Cem Meydan
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
- WorldQuant Initiative for Quantitative Prediction, Weill Cornell Medicine, New York, NY, USA
| | - Jonathan Foox
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
| | - Joel Rosiene
- New York Genome Center, New York, NY, USA
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Alon Shaiber
- New York Genome Center, New York, NY, USA
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
- Englander Institute for Precision Medicine and the Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
| | - David Danko
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
| | - Ebrahim Afshinnekoo
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
- WorldQuant Initiative for Quantitative Prediction, Weill Cornell Medicine, New York, NY, USA
| | - Matthew MacKay
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
| | - Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - Nikolay A Ivanov
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
- Clinical & Translational Science Center, Weill Cornell Medicine, New York, NY, USA
| | - Maria Sierra
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
| | - Diana Pohle
- Institute of Medical Virology and Epidemiology of Viral Diseases, University Hospital Tuebingen, Tuebingen, Germany
| | - Michael Zietz
- Department of Biomedical Informatics, Department of Systems Biology, Department of Medicine, Institute for Genomic Medicine, Columbia University, New York, NY, USA
| | - Undina Gisladottir
- Department of Biomedical Informatics, Department of Systems Biology, Department of Medicine, Institute for Genomic Medicine, Columbia University, New York, NY, USA
| | - Vijendra Ramlall
- Department of Biomedical Informatics, Department of Systems Biology, Department of Medicine, Institute for Genomic Medicine, Columbia University, New York, NY, USA
- Department of Cellular, Molecular Physiology & Biophysics, Columbia University, New York, NY, USA
| | - Evan T Sholle
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, USA
| | - Edward J Schenck
- Department of Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Craig D Westover
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
| | - Ciaran Hassan
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
| | - Krista Ryon
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
| | - Benjamin Young
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
| | - Dianna L Ng
- Department of Laboratory Medicine, University of California, San Francisco, CA, USA
| | - Andrea C Granados
- Department of Laboratory Medicine, University of California, San Francisco, CA, USA
- UCSF-Abbott Viral Diagnostics and Discovery Center, San Francisco, CA, USA
| | - Yale A Santos
- Department of Laboratory Medicine, University of California, San Francisco, CA, USA
- UCSF-Abbott Viral Diagnostics and Discovery Center, San Francisco, CA, USA
| | - Venice Servellita
- Department of Laboratory Medicine, University of California, San Francisco, CA, USA
- UCSF-Abbott Viral Diagnostics and Discovery Center, San Francisco, CA, USA
| | - Scot Federman
- Department of Laboratory Medicine, University of California, San Francisco, CA, USA
- UCSF-Abbott Viral Diagnostics and Discovery Center, San Francisco, CA, USA
| | - Phyllis Ruggiero
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Justyna Gawrys
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Dmitry Meleshko
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- Tri-Institutional Computational Biology & Medicine Program, Weill Cornell Medicine, New York, NY, USA
| | - Dong Xu
- Genomics Resources Core Facility, Weill Cornell Medicine, New York, NY, USA
| | - Dorottya Nagy-Szakal
- Biotia, Inc., New York, NY, USA
- Department of Cell Biology, SUNY Downstate Health Sciences University, New York, NY, USA
| | - Niamh B O'Hara
- Biotia, Inc., New York, NY, USA
- Department of Cell Biology, SUNY Downstate Health Sciences University, New York, NY, USA
| | - Jeffrey A Rosenfeld
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ, USA
- Department of Pathology, Robert Wood Johnson Medical School, New Brunswick, NJ, USA
| | - Ying Chen
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ, USA
| | - Peter A D Steel
- Department of Emergency Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Amos J Shemesh
- Department of Emergency Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Jenny Xiang
- Genomics Resources Core Facility, Weill Cornell Medicine, New York, NY, USA
| | - Jean Thierry-Mieg
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Danielle Thierry-Mieg
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Angelika Iftner
- Institute of Medical Virology and Epidemiology of Viral Diseases, University Hospital Tuebingen, Tuebingen, Germany
| | - Daniela Bezdan
- Institute of Medical Virology and Epidemiology of Viral Diseases, University Hospital Tuebingen, Tuebingen, Germany
| | - Thomas R Campion
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, USA
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
| | - John Sipley
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Lin Cong
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Arryn Craney
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Priya Velu
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Ari M Melnick
- Department of Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Sagi Shapira
- Department of Biomedical Informatics, Department of Systems Biology, Department of Medicine, Institute for Genomic Medicine, Columbia University, New York, NY, USA
| | - Iman Hajirasouliha
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
- Englander Institute for Precision Medicine and the Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
| | - Alain Borczuk
- Department of Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Thomas Iftner
- Institute of Medical Virology and Epidemiology of Viral Diseases, University Hospital Tuebingen, Tuebingen, Germany
| | - Mirella Salvatore
- Department of Medicine, Weill Cornell Medicine, New York, NY, USA
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
| | - Massimo Loda
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Lars F Westblade
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
- Department of Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Melissa Cushing
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Shixiu Wu
- Hangzhou Cancer Institute, Hangzhou Cancer Hospital, Hangzhou, China
- Department of Radiation Oncology, Hangzhou Cancer Hospital, Hangzhou, China
| | - Shawn Levy
- HudsonAlpha Discovery Institute, Huntsville, AL, USA
| | - Charles Chiu
- Department of Laboratory Medicine, University of California, San Francisco, CA, USA
- UCSF-Abbott Viral Diagnostics and Discovery Center, San Francisco, CA, USA
- Department of Medicine, Division of Infectious Diseases, University of California, San Francisco, CA, USA
| | - Nicholas Tatonetti
- Department of Biomedical Informatics, Department of Systems Biology, Department of Medicine, Institute for Genomic Medicine, Columbia University, New York, NY, USA.
| | - Hanna Rennert
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA.
| | - Marcin Imielinski
- New York Genome Center, New York, NY, USA.
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA.
- Englander Institute for Precision Medicine and the Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA.
| | - Christopher E Mason
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA.
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA.
- WorldQuant Initiative for Quantitative Prediction, Weill Cornell Medicine, New York, NY, USA.
- The Feil Family Brain and Mind Research Institute, Weill Cornell Medicine, New York, NY, USA.
|
20
|
Cusick M, Adekkanattu P, Campion TR, Sholle ET, Myers A, Banerjee S, Alexopoulos G, Wang Y, Pathak J. Using weak supervision and deep learning to classify clinical notes for identification of current suicidal ideation. J Psychiatr Res 2021; 136:95-102. [PMID: 33581461 DOI: 10.1016/j.jpsychires.2021.01.052] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/12/2020] [Revised: 01/17/2021] [Accepted: 01/29/2021] [Indexed: 10/22/2022]
Abstract
Mental health concerns, such as suicidal thoughts, are frequently documented by providers in clinical notes, as opposed to structured coded data. In this study, we evaluated weakly supervised methods for detecting "current" suicidal ideation from unstructured clinical notes in electronic health record (EHR) systems. Weakly supervised machine learning methods leverage imperfect labels for training, alleviating the burden of creating a large manually annotated dataset. After identifying a cohort of 600 patients at risk for suicidal ideation, we used a rule-based natural language processing (NLP) approach to label the training and validation notes (n = 17,978). Using this large corpus of clinical notes, we trained several statistical machine learning models (a logistic classifier, support vector machines [SVM], and a Naive Bayes classifier) and one deep learning model, namely a text classification convolutional neural network (CNN), to be evaluated on a manually reviewed test set (n = 837). The CNN model outperformed all other methods, achieving an overall accuracy of 94% and an F1-score of 0.82 on documents with "current" suicidal ideation. This algorithm correctly identified an additional 42 encounters and 9 patients indicative of suicidal ideation but missing a structured diagnosis code. When applied to a random subset of 5,000 clinical notes, the algorithm classified 0.46% (n = 23) for "current" suicidal ideation, of which 87% were truly indicative via manual review. Implementation of this approach for large-scale document screening may play an important role in point-of-care clinical information systems for targeted suicide prevention interventions and improve research on the pathways from ideation to attempt.
Affiliation(s)
- Marika Cusick
- Department of Information and Technology Services, Weill Cornell Medicine, New York, USA; Department of Population Health Sciences, Weill Cornell Medicine, New York, USA.
| | - Prakash Adekkanattu
- Department of Information and Technology Services, Weill Cornell Medicine, New York, USA.
| | - Thomas R Campion
- Department of Information and Technology Services, Weill Cornell Medicine, New York, USA; Department of Population Health Sciences, Weill Cornell Medicine, New York, USA.
| | - Evan T Sholle
- Department of Information and Technology Services, Weill Cornell Medicine, New York, USA.
| | - Annie Myers
- Department of Population Health Sciences, Weill Cornell Medicine, New York, USA.
| | - Samprit Banerjee
- Department of Population Health Sciences, Weill Cornell Medicine, New York, USA.
| | - Yanshan Wang
- Division of Digital Health Sciences, Mayo Clinic, MN, USA; Department of Health Sciences Research, Mayo Clinic, MN, USA.
| | - Jyotishman Pathak
- Department of Population Health Sciences, Weill Cornell Medicine, New York, USA; Department of Psychiatry, Weill Cornell Medicine, New York, USA.
|
21
|
Sholle ET, Pinheiro LC, Adekkanattu P, Davila MA, Johnson SB, Pathak J, Sinha S, Li C, Lubansky SA, Safford MM, Campion TR. Underserved populations with missing race ethnicity data differ significantly from those with structured race/ethnicity documentation. J Am Med Inform Assoc 2019; 26:722-729. [PMID: 31329882 DOI: 10.1093/jamia/ocz040] [Citation(s) in RCA: 45] [Impact Index Per Article: 15.0] [Received: 01/10/2019] [Revised: 03/06/2019] [Accepted: 03/13/2019] [Indexed: 11/13/2022] Open
Abstract
OBJECTIVE We aimed to address deficiencies in structured electronic health record (EHR) data for race and ethnicity by identifying black and Hispanic patients from unstructured clinical notes and assessing differences between patients with or without structured race/ethnicity data. MATERIALS AND METHODS Using EHR notes for 16 665 patients with encounters at a primary care practice, we developed rule-based natural language processing (NLP) algorithms to classify patients as black/Hispanic. We evaluated performance of the method against an annotated gold standard, compared race and ethnicity between NLP-derived and structured EHR data, and compared characteristics of patients identified as black or Hispanic using only NLP vs patients identified as such only in structured EHR data. RESULTS For the sample of 16 665 patients, NLP identified 948 additional patients as black, a 26% increase, and 665 additional patients as Hispanic, a 20% increase. Compared with the patients identified as black or Hispanic in structured EHR data, patients identified as black or Hispanic via NLP only were older, more likely to be male, less likely to have commercial insurance, and more likely to have higher comorbidity. DISCUSSION Structured EHR data for race and ethnicity are subject to data quality issues. Supplementing structured EHR race data with NLP-derived race and ethnicity may allow researchers to better assess the demographic makeup of populations and draw more accurate conclusions about intergroup differences in health outcomes. CONCLUSIONS Black or Hispanic patients who are not documented as such in structured EHR race/ethnicity fields differ significantly from those who are. Relatively simple NLP can help address this limitation.
Affiliation(s)
- Evan T Sholle
- Information Technologies & Services Department, Weill Cornell Medicine, New York, New York, USA
| | - Laura C Pinheiro
- Department of Medicine, Weill Cornell Medicine, New York, New York, USA
| | - Prakash Adekkanattu
- Information Technologies & Services Department, Weill Cornell Medicine, New York, New York, USA
| | - Marcos A Davila
- Information Technologies & Services Department, Weill Cornell Medicine, New York, New York, USA
| | - Stephen B Johnson
- Department of Healthcare Policy & Research, Weill Cornell Medicine, New York, New York, USA
| | - Jyotishman Pathak
- Department of Healthcare Policy & Research, Weill Cornell Medicine, New York, New York, USA
| | - Sanjai Sinha
- Department of Medicine, Weill Cornell Medicine, New York, New York, USA
| | - Cassidie Li
- Department of Medicine, Weill Cornell Medicine, New York, New York, USA
| | - Stasi A Lubansky
- Department of Medicine, Weill Cornell Medicine, New York, New York, USA
| | - Monika M Safford
- Department of Medicine, Weill Cornell Medicine, New York, New York, USA
| | - Thomas R Campion
- Information Technologies & Services Department, Weill Cornell Medicine, New York, New York, USA; Department of Healthcare Policy & Research, Weill Cornell Medicine, New York, New York, USA; Department of Pediatrics, Weill Cornell Medicine, New York, New York, USA
|
22
|
Cusick MM, Sholle ET, Davila MA, Kabariti J, Cole CL, Campion TR. A Method to Improve Availability and Quality of Patient Race Data in an Electronic Health Record System. Appl Clin Inform 2020; 11:785-791. [PMID: 33241548 DOI: 10.1055/s-0040-1718756] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND Although federal regulations mandate documentation of structured race data according to Office of Management and Budget (OMB) categories in electronic health record (EHR) systems, many institutions have reported gaps in EHR race data that hinder secondary use for population-level research focused on underserved populations. When evaluating race data available for research purposes, we found our institution's enterprise EHR contained structured race data for only 51% (1.6 million) of patients. OBJECTIVES We seek to improve the availability and quality of structured race data available to researchers by integrating values from multiple local sources. METHODS To address the deficiency in race data availability, we implemented a method to supplement OMB race values from four local sources: inpatient EHR, inpatient billing, natural language processing, and coded clinical observations. We evaluated this method by measuring race data availability and data quality with respect to completeness, concordance, and plausibility. RESULTS The supplementation method improved race data availability in the enterprise EHR up to 10% for some minority groups and 4% overall. We identified structured OMB race values for more than 142,000 patients, nearly a third of whom were from racial minority groups. Our data quality evaluation indicated that the supplemented race values improved completeness in the enterprise EHR, originated from sources in agreement with the enterprise EHR, and were unbiased to the enterprise EHR. CONCLUSION Implementation of this method can successfully increase OMB race data availability, potentially enhancing accrual of patients from underserved populations to research studies.
Affiliation(s)
- Marika M Cusick
- Information Technologies and Services Department, Weill Cornell Medicine, New York, New York, United States
| | - Evan T Sholle
- Information Technologies and Services Department, Weill Cornell Medicine, New York, New York, United States
| | - Marcos A Davila
- Information Technologies and Services Department, Weill Cornell Medicine, New York, New York, United States
| | - Joseph Kabariti
- Information Technologies and Services Department, Weill Cornell Medicine, New York, New York, United States
| | - Curtis L Cole
- Information Technologies and Services Department, Weill Cornell Medicine, New York, New York, United States; Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, United States
| | - Thomas R Campion
- Information Technologies and Services Department, Weill Cornell Medicine, New York, New York, United States; Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, United States; Clinical and Translational Science Center, Weill Cornell Medicine, New York, New York, United States; Department of Pediatrics, Weill Cornell Medicine, New York, New York, United States
|
23
|
Goyal P, Ringel JB, Rajan M, Choi JJ, Pinheiro LC, Li HA, Wehmeyer GT, Alshak MN, Jabri A, Schenck EJ, Chen R, Satlin MJ, Campion TR, Nahid M, Plataki M, Hoffman KL, Reshetnyak E, Hupert N, Horn EM, Martinez FJ, Gulick RM, Safford MM. Obesity and COVID-19 in New York City: A Retrospective Cohort Study. Ann Intern Med 2020; 173:855-858. [PMID: 32628537 PMCID: PMC7384267 DOI: 10.7326/m20-2730] [Citation(s) in RCA: 63] [Impact Index Per Article: 15.8] [Indexed: 12/28/2022] Open
Affiliation(s)
- Parag Goyal
- Weill Cornell Medicine, New York, New York (P.G., J.B.R., M.R., J.J.C., L.C.P., A.J., E.J.S., M.J.S., T.R.C., M.N., M.P., K.L.H., E.R., N.H., E.M.H., F.J.M., R.M.G., M.M.S.)
| | - Joanna Bryan Ringel
- Weill Cornell Medicine, New York, New York (P.G., J.B.R., M.R., J.J.C., L.C.P., A.J., E.J.S., M.J.S., T.R.C., M.N., M.P., K.L.H., E.R., N.H., E.M.H., F.J.M., R.M.G., M.M.S.)
| | - Mangala Rajan
- Weill Cornell Medicine, New York, New York (P.G., J.B.R., M.R., J.J.C., L.C.P., A.J., E.J.S., M.J.S., T.R.C., M.N., M.P., K.L.H., E.R., N.H., E.M.H., F.J.M., R.M.G., M.M.S.)
| | - Justin J Choi
- Weill Cornell Medicine, New York, New York (P.G., J.B.R., M.R., J.J.C., L.C.P., A.J., E.J.S., M.J.S., T.R.C., M.N., M.P., K.L.H., E.R., N.H., E.M.H., F.J.M., R.M.G., M.M.S.)
| | - Laura C Pinheiro
- Weill Cornell Medicine, New York, New York (P.G., J.B.R., M.R., J.J.C., L.C.P., A.J., E.J.S., M.J.S., T.R.C., M.N., M.P., K.L.H., E.R., N.H., E.M.H., F.J.M., R.M.G., M.M.S.)
| | - Han A Li
- Weill Cornell Medical College, New York, New York (H.A.L., G.T.W., M.N.A.)
| | - Graham T Wehmeyer
- Weill Cornell Medical College, New York, New York (H.A.L., G.T.W., M.N.A.)
| | - Mark N Alshak
- Weill Cornell Medical College, New York, New York (H.A.L., G.T.W., M.N.A.)
| | - Assem Jabri
- Weill Cornell Medicine, New York, New York (P.G., J.B.R., M.R., J.J.C., L.C.P., A.J., E.J.S., M.J.S., T.R.C., M.N., M.P., K.L.H., E.R., N.H., E.M.H., F.J.M., R.M.G., M.M.S.)
| | - Edward J Schenck
- Weill Cornell Medicine, New York, New York (P.G., J.B.R., M.R., J.J.C., L.C.P., A.J., E.J.S., M.J.S., T.R.C., M.N., M.P., K.L.H., E.R., N.H., E.M.H., F.J.M., R.M.G., M.M.S.)
| | - Ruijun Chen
- Weill Cornell Medicine and Columbia University, New York, New York (R.C.)
| | - Michael J Satlin
- Weill Cornell Medicine, New York, New York (P.G., J.B.R., M.R., J.J.C., L.C.P., A.J., E.J.S., M.J.S., T.R.C., M.N., M.P., K.L.H., E.R., N.H., E.M.H., F.J.M., R.M.G., M.M.S.)
| | - Thomas R Campion
- Weill Cornell Medicine, New York, New York (P.G., J.B.R., M.R., J.J.C., L.C.P., A.J., E.J.S., M.J.S., T.R.C., M.N., M.P., K.L.H., E.R., N.H., E.M.H., F.J.M., R.M.G., M.M.S.)
| | - Musarrat Nahid
- Weill Cornell Medicine, New York, New York (P.G., J.B.R., M.R., J.J.C., L.C.P., A.J., E.J.S., M.J.S., T.R.C., M.N., M.P., K.L.H., E.R., N.H., E.M.H., F.J.M., R.M.G., M.M.S.)
| | - Maria Plataki
- Weill Cornell Medicine, New York, New York (P.G., J.B.R., M.R., J.J.C., L.C.P., A.J., E.J.S., M.J.S., T.R.C., M.N., M.P., K.L.H., E.R., N.H., E.M.H., F.J.M., R.M.G., M.M.S.)
| | - Katherine L Hoffman
- Weill Cornell Medicine, New York, New York (P.G., J.B.R., M.R., J.J.C., L.C.P., A.J., E.J.S., M.J.S., T.R.C., M.N., M.P., K.L.H., E.R., N.H., E.M.H., F.J.M., R.M.G., M.M.S.)
| | - Evgeniya Reshetnyak
- Weill Cornell Medicine, New York, New York (P.G., J.B.R., M.R., J.J.C., L.C.P., A.J., E.J.S., M.J.S., T.R.C., M.N., M.P., K.L.H., E.R., N.H., E.M.H., F.J.M., R.M.G., M.M.S.)
| | - Nathaniel Hupert
- Weill Cornell Medicine, New York, New York (P.G., J.B.R., M.R., J.J.C., L.C.P., A.J., E.J.S., M.J.S., T.R.C., M.N., M.P., K.L.H., E.R., N.H., E.M.H., F.J.M., R.M.G., M.M.S.)
| | - Evelyn M Horn
- Weill Cornell Medicine, New York, New York (P.G., J.B.R., M.R., J.J.C., L.C.P., A.J., E.J.S., M.J.S., T.R.C., M.N., M.P., K.L.H., E.R., N.H., E.M.H., F.J.M., R.M.G., M.M.S.)
| | - Fernando J Martinez
- Weill Cornell Medicine, New York, New York (P.G., J.B.R., M.R., J.J.C., L.C.P., A.J., E.J.S., M.J.S., T.R.C., M.N., M.P., K.L.H., E.R., N.H., E.M.H., F.J.M., R.M.G., M.M.S.)
| | - Roy M Gulick
- Weill Cornell Medicine, New York, New York (P.G., J.B.R., M.R., J.J.C., L.C.P., A.J., E.J.S., M.J.S., T.R.C., M.N., M.P., K.L.H., E.R., N.H., E.M.H., F.J.M., R.M.G., M.M.S.)
| | - Monika M Safford
- Weill Cornell Medicine, New York, New York (P.G., J.B.R., M.R., J.J.C., L.C.P., A.J., E.J.S., M.J.S., T.R.C., M.N., M.P., K.L.H., E.R., N.H., E.M.H., F.J.M., R.M.G., M.M.S.)
|
24
|
Lin E, Lantos JE, Strauss SB, Phillips CD, Campion TR, Navi BB, Parikh NS, Merkler AE, Mir S, Zhang C, Kamel H, Cusick M, Goyal P, Gupta A. Brain Imaging of Patients with COVID-19: Findings at an Academic Institution during the Height of the Outbreak in New York City. AJNR Am J Neuroradiol 2020; 41:2001-2008. [PMID: 32819899 DOI: 10.3174/ajnr.a6793] [Citation(s) in RCA: 73] [Impact Index Per Article: 18.3] [Received: 06/11/2020] [Accepted: 07/17/2020] [Indexed: 12/13/2022]
Abstract
BACKGROUND AND PURPOSE A large spectrum of neurologic disease has been reported in patients with coronavirus disease 2019 (COVID-19) infection. Our aim was to investigate the yield of neuroimaging in patients with COVID-19 undergoing CT or MR imaging of the brain and to describe associated imaging findings. MATERIALS AND METHODS We performed a retrospective cohort study involving 2054 patients with laboratory-confirmed COVID-19 presenting to 2 hospitals in New York City between March 4 and May 9, 2020, of whom 278 (14%) underwent either CT or MR imaging of the brain. All images initially received a formal interpretation from a neuroradiologist within the institution and were subsequently reviewed by 2 neuroradiologists in consensus, with disputes resolved by a third neuroradiologist. RESULTS The median age of these patients was 64 years (interquartile range, 50-75 years), and 43% were women. Among imaged patients, 58 (21%) demonstrated acute or subacute neuroimaging findings, the most common including cerebral infarctions (11%), parenchymal hematomas (3.6%), and posterior reversible encephalopathy syndrome (1.1%). Among the 51 patients with MR imaging examinations, 26 (51%) demonstrated acute or subacute findings; notable findings included 6 cases of cranial nerve abnormalities (including 4 patients with olfactory bulb abnormalities) and 3 patients with a microhemorrhage pattern compatible with critical illness-associated microbleeds. CONCLUSIONS Our experience confirms the wide range of neurologic imaging findings in patients with COVID-19 and suggests the need for further studies to optimize management for these patients.
Affiliation(s)
- E Lin
- From the Department of Radiology (E.L., J.E.L., S.B.S., C.D.P., A.G.)
| | - J E Lantos
- From the Department of Radiology (E.L., J.E.L., S.B.S., C.D.P., A.G.)
| | - S B Strauss
- From the Department of Radiology (E.L., J.E.L., S.B.S., C.D.P., A.G.)
| | - C D Phillips
- From the Department of Radiology (E.L., J.E.L., S.B.S., C.D.P., A.G.)
| | - T R Campion
- Department of Population Health Sciences (T.R.C., M.C.)
| | - B B Navi
- Clinical and Translational Neuroscience Unit (B.B.N., N.S.P., A.E.M., S.M., C.Z., A.G.)
| | - N S Parikh
- Clinical and Translational Neuroscience Unit (B.B.N., N.S.P., A.E.M., S.M., C.Z., A.G.)
| | - A E Merkler
- Clinical and Translational Neuroscience Unit (B.B.N., N.S.P., A.E.M., S.M., C.Z., A.G.)
| | - S Mir
- Clinical and Translational Neuroscience Unit (B.B.N., N.S.P., A.E.M., S.M., C.Z., A.G.)
| | - C Zhang
- Clinical and Translational Neuroscience Unit (B.B.N., N.S.P., A.E.M., S.M., C.Z., A.G.)
| | - M Cusick
- Department of Population Health Sciences (T.R.C., M.C.)
| | - P Goyal
- Feil Family Brain and Mind Research Institute and Department of Neurology, and Department of Medicine (P.G.), Weill Cornell Medicine, New York, New York
| | - A Gupta
- From the Department of Radiology (E.L., J.E.L., S.B.S., C.D.P., A.G.)
- Clinical and Translational Neuroscience Unit (B.B.N., N.S.P., A.E.M., S.M., C.Z., A.G.)
25
|
Racine-Brzostek SE, Yang HS, Chadburn A, Orlander D, An A, Campion TR, Yee J, Chen Z, Loda M, Zhao Z, Kaushal R, Cushing MM. COVID-19 Viral and Serology Testing in New York City Health Care Workers. Am J Clin Pathol 2020; 154:592-595. [PMID: 32914176 PMCID: PMC7499487 DOI: 10.1093/ajcp/aqaa142] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 11/18/2022] [Open Access]
Affiliation(s)
| | - He S Yang
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY
| | - Amy Chadburn
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY
| | - Duncan Orlander
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY
| | - Anjile An
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY
| | - Thomas R Campion
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY
| | - Jim Yee
- New York-Presbyterian Hospital Weill Cornell, New York, NY
| | - Zhengming Chen
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY
| | - Massimo Loda
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY
| | - Zhen Zhao
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY
| | - Rainu Kaushal
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY
| | - Melissa M Cushing
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY
26
|
Fu JT, Sholle E, Krichevsky S, Scandura J, Campion TR. Extracting and classifying diagnosis dates from clinical notes: A case study. J Biomed Inform 2020; 110:103569. [PMID: 32949781 DOI: 10.1016/j.jbi.2020.103569] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/10/2020] [Revised: 08/24/2020] [Accepted: 09/12/2020] [Indexed: 11/29/2022]
Abstract
Myeloproliferative neoplasms (MPNs) are chronic hematologic malignancies that may progress over long disease courses. The original date of diagnosis is an important piece of information for patient care and research, but is not consistently documented. We describe an attempt to build a pipeline for extracting dates with natural language processing (NLP) tools and techniques and classifying them as relevant diagnoses or not. Inaccurate and incomplete date extraction and interpretation impacted the performance of the overall pipeline. Existing lightweight Python packages tended to have low specificity for identifying and interpreting partial and relative dates in clinical text. A rules-based regular expression (regex) approach achieved recall of 83.0% on dates manually annotated as diagnosis dates, and 77.4% on all annotated dates. With only 3.8% of annotated dates representing initial MPN diagnoses, additional methods of targeting candidate date instances may alleviate noise and class imbalance.
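As an illustration of the rules-based approach described in this abstract, the following is a minimal sketch of regex date extraction and a recall calculation. The pattern, the sample note, and the helper names `extract_dates` and `recall` are assumptions made for this example; they are not the authors' pipeline. The sketch covers full, partial (month/year), and year-only dates, and deliberately leaves relative dates unhandled to show how they reduce recall.

```python
import re

# Month names and common abbreviations (illustrative, not exhaustive).
MONTHS = (
    r"jan(?:uary)?|feb(?:ruary)?|mar(?:ch)?|apr(?:il)?|may|jun(?:e)?|"
    r"jul(?:y)?|aug(?:ust)?|sep(?:t(?:ember)?)?|oct(?:ober)?|"
    r"nov(?:ember)?|dec(?:ember)?"
)

# Hypothetical pattern: alternatives are ordered from most to least specific.
DATE_PATTERN = re.compile(
    rf"""
    \b(?:
        \d{{1,2}}/\d{{1,2}}/\d{{2,4}}                  # full numeric date
      | \d{{1,2}}/\d{{4}}                              # partial date: month/year
      | (?:{MONTHS})\.?\s+(?:\d{{1,2}},\s*)?\d{{4}}    # month name, optional day, year
      | (?:19|20)\d{{2}}                               # year only
    )\b
    """,
    re.IGNORECASE | re.VERBOSE,
)


def extract_dates(text):
    """Return every date-like string found in the text, in order."""
    return DATE_PATTERN.findall(text)


def recall(extracted, annotated):
    """Fraction of manually annotated dates that the extractor found."""
    annotated = set(annotated)
    if not annotated:
        return 0.0
    return len(annotated & set(extracted)) / len(annotated)


# Invented sample note, for illustration only.
note = (
    "Pt diagnosed with PV in 03/2012. JAK2 testing sent 3/14/2012. "
    "Follow-up since Mar 2015; first seen here 2016. Symptoms began last spring."
)
found = extract_dates(note)
gold = ["03/2012", "3/14/2012", "Mar 2015", "2016", "last spring"]
print(found)
print(recall(found, gold))
```

In this toy note the relative expression "last spring" is annotated but never matched, which is the kind of miss the abstract attributes to partial and relative dates.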
Affiliation(s)
- Julia T Fu
- Department of Health Policy and Research, Weill Cornell Medicine, 402 E. 67th St, New York, NY 10065, United States; Division of Health Informatics, Memorial Sloan Kettering Cancer Center, 600 3rd Ave, 8th Fl, New York, NY 10016, United States.
| | - Evan Sholle
- Department of Health Policy and Research, Weill Cornell Medicine, 402 E. 67th St, New York, NY 10065, United States; Information Technologies & Services, Weill Cornell Medicine, 575 Lexington Ave, 3rd Fl, New York, NY 10022, United States.
| | - Spencer Krichevsky
- Joint Clinical Trials Office, Weill Cornell Medicine, 1300 York Ave, Box 305, New York, NY 10065, United States.
| | - Joseph Scandura
- Department of Hematology and Oncology, Weill Cornell Medicine, 428 E 72nd St, Ste 300, New York, NY 10065, United States.
| | - Thomas R Campion
- Department of Health Policy and Research, Weill Cornell Medicine, 402 E. 67th St, New York, NY 10065, United States; Information Technologies & Services, Weill Cornell Medicine, 575 Lexington Ave, 3rd Fl, New York, NY 10022, United States; Clinical and Translational Science Center, Weill Cornell Medicine, 1300 York Ave., Box 149, New York, NY 10065, United States; Department of Pediatrics, Weill Cornell Medicine, 525 E 68th St, Rm M610A, New York, NY 10065, United States.
27
|
Merkler AE, Parikh NS, Mir S, Gupta A, Kamel H, Lin E, Lantos J, Schenck EJ, Goyal P, Bruce SS, Kahan J, Lansdale KN, LeMoss NM, Murthy SB, Stieg PE, Fink ME, Iadecola C, Segal AZ, Cusick M, Campion TR, Diaz I, Zhang C, Navi BB. Risk of Ischemic Stroke in Patients With Coronavirus Disease 2019 (COVID-19) vs Patients With Influenza. JAMA Neurol 2020; 77:2768098. [PMID: 32614385 PMCID: PMC7333175 DOI: 10.1001/jamaneurol.2020.2730] [Citation(s) in RCA: 414] [Impact Index Per Article: 103.5] [Indexed: 12/13/2022]
Abstract
IMPORTANCE It is uncertain whether coronavirus disease 2019 (COVID-19) is associated with a higher risk of ischemic stroke than would be expected from a viral respiratory infection. OBJECTIVE To compare the rate of ischemic stroke between patients with COVID-19 and patients with influenza, a respiratory viral illness previously associated with stroke. DESIGN, SETTING, AND PARTICIPANTS This retrospective cohort study was conducted at 2 academic hospitals in New York City, New York, and included adult patients with emergency department visits or hospitalizations with COVID-19 from March 4, 2020, through May 2, 2020. The comparison cohort included adults with emergency department visits or hospitalizations with influenza A/B from January 1, 2016, through May 31, 2018 (spanning moderate and severe influenza seasons). EXPOSURES COVID-19 infection confirmed by evidence of severe acute respiratory syndrome coronavirus 2 in the nasopharynx by polymerase chain reaction and laboratory-confirmed influenza A/B. MAIN OUTCOMES AND MEASURES A panel of neurologists adjudicated the primary outcome of acute ischemic stroke and its clinical characteristics, mechanisms, and outcomes. We used logistic regression to compare the proportion of patients with COVID-19 with ischemic stroke vs the proportion among patients with influenza. RESULTS Among 1916 patients with emergency department visits or hospitalizations with COVID-19, 31 (1.6%; 95% CI, 1.1%-2.3%) had an acute ischemic stroke. The median age of patients with stroke was 69 years (interquartile range, 66-78 years); 18 (58%) were men. Stroke was the reason for hospital presentation in 8 cases (26%). In comparison, 3 of 1486 patients with influenza (0.2%; 95% CI, 0.0%-0.6%) had an acute ischemic stroke. After adjustment for age, sex, and race, the likelihood of stroke was higher with COVID-19 infection than with influenza infection (odds ratio, 7.6; 95% CI, 2.3-25.2). The association persisted across sensitivity analyses adjusting for vascular risk factors, viral symptomatology, and intensive care unit admission. CONCLUSIONS AND RELEVANCE In this retrospective cohort study from 2 New York City academic hospitals, approximately 1.6% of adults with COVID-19 who visited the emergency department or were hospitalized experienced ischemic stroke, a higher rate of stroke compared with a cohort of patients with influenza. Additional studies are needed to confirm these findings and to investigate possible thrombotic mechanisms associated with COVID-19.
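To make the reported proportions concrete, the sketch below recomputes the stroke rates from the counts in this abstract and derives a crude (unadjusted) odds ratio from the 2x2 table. The function name `crude_odds_ratio` is an assumption for this example. The crude value is not the study's estimate: the reported odds ratio of 7.6 came from logistic regression adjusted for age, sex, and race, so the two figures are expected to differ.

```python
def crude_odds_ratio(events_a, total_a, events_b, total_b):
    """Unadjusted odds ratio of group A versus group B from a 2x2 table."""
    odds_a = events_a / (total_a - events_a)
    odds_b = events_b / (total_b - events_b)
    return odds_a / odds_b


# Counts taken from the abstract.
covid_strokes, covid_total = 31, 1916
flu_strokes, flu_total = 3, 1486

covid_rate = covid_strokes / covid_total   # proportion with stroke, COVID-19 cohort
flu_rate = flu_strokes / flu_total         # proportion with stroke, influenza cohort
crude_or = crude_odds_ratio(covid_strokes, covid_total, flu_strokes, flu_total)

print(f"COVID-19 stroke rate: {covid_rate:.1%}")
print(f"Influenza stroke rate: {flu_rate:.1%}")
print(f"Crude odds ratio (unadjusted, illustrative only): {crude_or:.2f}")
```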
Affiliation(s)
- Alexander E Merkler
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, New York
| | - Neal S Parikh
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, New York
| | - Saad Mir
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, New York
| | - Ajay Gupta
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, New York
- Department of Radiology, Weill Cornell Medicine, New York, New York
| | - Hooman Kamel
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, New York
- Deputy Editor
| | - Eaton Lin
- Department of Radiology, Weill Cornell Medicine, New York, New York
| | - Joshua Lantos
- Department of Radiology, Weill Cornell Medicine, New York, New York
| | - Edward J Schenck
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, Weill Cornell Medicine, New York, New York
| | - Parag Goyal
- Division of Cardiology, Department of Medicine, Weill Cornell Medicine, New York, New York
| | - Samuel S Bruce
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, New York
| | - Joshua Kahan
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, New York
| | - Kelsey N Lansdale
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, New York
| | - Natalie M LeMoss
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, New York
| | - Santosh B Murthy
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, New York
| | - Philip E Stieg
- Department of Neurosurgery, Weill Cornell Medicine, New York, New York
| | - Matthew E Fink
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, New York
| | - Costantino Iadecola
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, New York
| | - Alan Z Segal
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, New York
| | - Marika Cusick
- Department of Population Health Sciences, Weill Cornell Medicine, New York, New York
| | - Thomas R Campion
- Department of Population Health Sciences, Weill Cornell Medicine, New York, New York
| | - Ivan Diaz
- Department of Population Health Sciences, Weill Cornell Medicine, New York, New York
| | - Cenai Zhang
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, New York
| | - Babak B Navi
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, New York
28
|
Campion TR, Craven CK, Dorr DA, Knosp BM. Understanding enterprise data warehouses to support clinical and translational research. J Am Med Inform Assoc 2020; 27:1352-1358. [PMID: 32679585 PMCID: PMC7647350 DOI: 10.1093/jamia/ocaa089] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 01/28/2020] [Revised: 04/24/2020] [Accepted: 05/12/2020] [Indexed: 11/13/2022] [Open Access]
Abstract
OBJECTIVE Among National Institutes of Health Clinical and Translational Science Award (CTSA) hubs, adoption of electronic data warehouses for research (EDW4R) containing data from electronic health record systems is nearly ubiquitous. Although benefits of EDW4R include more effective, efficient support of scientists, little is known about how CTSA hubs have implemented EDW4R services. The goal of this qualitative study was to understand the ways in which CTSA hubs have operationalized EDW4R to support clinical and translational researchers. MATERIALS AND METHODS After conducting semistructured interviews with informatics leaders from 20 CTSA hubs, we performed a directed content analysis of interview notes informed by naturalistic inquiry. RESULTS We identified 12 themes: organization and data; oversight and governance; data access request process; data access modalities; data access for users with different skill sets; engagement, communication, and literacy; service management coordinated with enterprise information technology; service management coordinated within a CTSA hub; service management coordinated between informatics and biostatistics; funding approaches; performance metrics; and future trends and current technology challenges. DISCUSSION This study is a step in developing an improved understanding and creating a common vocabulary about EDW4R operations across institutions. Findings indicate an opportunity for establishing best practices for EDW4R operations in academic medicine. Such guidance could reduce the costs associated with developing an EDW4R by establishing a clear roadmap and maturity path for institutions to follow. CONCLUSIONS CTSA hubs described varying approaches to EDW4R operations that may assist other institutions in better serving investigators with electronic patient data.
Affiliation(s)
- Thomas R Campion
- Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, USA
| | - Catherine K Craven
- Institute for Health Care Delivery Science, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - David A Dorr
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, Oregon, USA
| | - Boyd M Knosp
- Institute for Clinical and Translational Science, Roy J. and Lucille A. Carver College of Medicine, University of Iowa, Iowa City, Iowa, USA
29
|
Goyal P, Choi JJ, Pinheiro LC, Schenck EJ, Chen R, Jabri A, Satlin MJ, Campion TR, Nahid M, Ringel JB, Hoffman KL, Alshak MN, Li HA, Wehmeyer GT, Rajan M, Reshetnyak E, Hupert N, Horn EM, Martinez FJ, Gulick RM, Safford MM. Clinical Characteristics of Covid-19 in New York City. N Engl J Med 2020; 382:2372-2374. [PMID: 32302078 PMCID: PMC7182018 DOI: 10.1056/nejmc2010419] [Citation(s) in RCA: 1549] [Impact Index Per Article: 387.3] [Indexed: 01/08/2023]
Affiliation(s)
| | - Han A Li
- Weill Cornell Medicine, New York, NY
30
|
Chen C, Lee PI, Pain KJ, Delgado D, Cole CL, Campion TR. Replacing Paper Informed Consent with Electronic Informed Consent for Research in Academic Medical Centers: A Scoping Review. AMIA Jt Summits Transl Sci Proc 2020; 2020:80-88. [PMID: 32477626 PMCID: PMC7233043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Although experts have identified benefits to replacing paper with electronic consent (eConsent) for research, a comprehensive understanding of strategies to overcome barriers to adoption is lacking. To address this gap, we performed a scoping review of the literature describing eConsent in academic medical centers. Of 69 studies that met inclusion criteria, 81% (n=56) addressed ethical, legal, and social issues; 67% (n=46) described user interface/user experience considerations; 39% (n=27) compared electronic versus paper approaches; 33% (n=23) discussed approaches to enterprise scalability; and 25% (n=17) described changes to consent elections. Findings indicate a lack of a leading commercial eConsent vendor, as articles described a myriad of homegrown systems and extensions of vendor EHR patient portals. Opportunities appear to exist for researchers and commercial software vendors to develop eConsent approaches that address the five critical areas identified in this review.
Affiliation(s)
- Cindy Chen
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
| | - Pou-I Lee
- Department of Healthcare Policy & Research, Weill Cornell Medicine, New York, NY
| | - Kevin J Pain
- Samuel J. Wood Library & C.V. Starr Biomedical Information Center, Weill Cornell Medicine, New York, NY
| | - Diana Delgado
- Samuel J. Wood Library & C.V. Starr Biomedical Information Center, Weill Cornell Medicine, New York, NY
| | - Curtis L Cole
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
- Department of Healthcare Policy & Research, Weill Cornell Medicine, New York, NY
- Department of Medicine, Weill Cornell Medicine, New York, NY
| | - Thomas R Campion
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
- Department of Healthcare Policy & Research, Weill Cornell Medicine, New York, NY
- Department of Pediatrics, Weill Cornell Medicine, New York, NY
- Clinical & Translational Science Center, Weill Cornell Medicine, New York, NY
31
|
Sholle ET, Cusick M, Davila MA, Kabariti J, Flores S, Campion TR. Characterizing Basic and Complex Usage of i2b2 at an Academic Medical Center. AMIA Jt Summits Transl Sci Proc 2020; 2020:589-596. [PMID: 32477681 PMCID: PMC7233105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Developed to enable basic queries for cohort discovery, i2b2 has evolved to support complex queries. Little is known about whether query sophistication, and the informatics resources required to support it, addresses researcher needs. In three years at our institution, 609 researchers ran 6,662 queries and requested re-identification of 80 patient cohorts to support specific studies. After characterizing all queries as "basic" or "complex" with respect to use of sophisticated query features, we found that the majority of all queries, and the majority of queries resulting in a request for cohort re-identification, did not use complex i2b2 features. Data domains that required extensive effort to implement saw relatively little use compared to common domains (e.g., diagnoses). These findings suggest that efforts to ensure the performance of basic queries using common data domains may better serve the needs of the research community than efforts to integrate novel domains or introduce complex new features.
Affiliation(s)
- Evan T Sholle
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
| | - Marika Cusick
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
| | - Marcos A Davila
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
| | - Joseph Kabariti
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
| | - Steven Flores
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
| | - Thomas R Campion
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, NY
- Clinical and Translational Science Center, Weill Cornell Medicine, New York, NY
- Department of Pediatrics, Weill Cornell Medicine, New York, NY
32
|
Michael CL, Sholle ET, Wulff RT, Roboz GJ, Campion TR. Mapping Local Biospecimen Records to the OMOP Common Data Model. AMIA Jt Summits Transl Sci Proc 2020; 2020:422-429. [PMID: 32477663 PMCID: PMC7233045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Research to support precision medicine for leukemia patients requires integration of biospecimen and clinical data. The Observational Medical Outcomes Partnership common data model (OMOP CDM) and its Specimen table present a potential solution. Although researchers have described progress and challenges in mapping electronic health record (EHR) data to populate the OMOP CDM, to our knowledge no studies have described populating the OMOP CDM with biospecimen data. Using biobank data from our institution, we mapped 26% of biospecimen records to the OMOP Specimen table. Records failed mapping due to local codes for time point that were incompatible with the OMOP reference terminology. We recommend expanding allowable codes to encompass research data, adding foreign keys to leverage additional OMOP tables with data from other sources or to store additional specimen details, and considering a new table to represent processed samples and inventory.
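The mapping failure described in this abstract can be sketched as follows. This is a hypothetical illustration, not the authors' method: the local time-point codes, the placeholder concept IDs, the sample records, and the choice to store the time point in the Specimen table's `disease_status_concept_id` and `disease_status_source_value` columns are all assumptions made for the example. Only the column names themselves come from the OMOP CDM Specimen table.

```python
# Hypothetical lookup from local time-point codes to standard concepts.
# The IDs are placeholders, not real OMOP concept IDs.
TIME_POINT_TO_CONCEPT = {
    "DIAGNOSIS": 2_000_000_001,
    "REMISSION": 2_000_000_002,
}


def map_specimen(record):
    """Map one local biospecimen record to a Specimen-style row.

    Returns None when the local time-point code has no standard concept,
    mirroring the failure mode the abstract describes.
    """
    concept_id = TIME_POINT_TO_CONCEPT.get(record["time_point"])
    if concept_id is None:
        return None
    return {
        "person_id": record["patient_id"],
        "specimen_date": record["collected_on"],
        "specimen_source_value": record["specimen_type"],
        "disease_status_concept_id": concept_id,
        "disease_status_source_value": record["time_point"],
    }


# Invented sample records, for illustration only.
local_records = [
    {"patient_id": 1, "collected_on": "2018-01-05", "specimen_type": "bone marrow", "time_point": "DIAGNOSIS"},
    {"patient_id": 1, "collected_on": "2018-03-02", "specimen_type": "bone marrow", "time_point": "CYCLE2_DAY14"},
    {"patient_id": 2, "collected_on": "2018-04-11", "specimen_type": "blood", "time_point": "PRE_TRANSPLANT"},
    {"patient_id": 3, "collected_on": "2018-06-20", "specimen_type": "blood", "time_point": "SCREENING"},
]

mapped = [row for row in map(map_specimen, local_records) if row is not None]
mapping_rate = len(mapped) / len(local_records)
print(f"Mapped {len(mapped)} of {len(local_records)} records ({mapping_rate:.0%})")
```

Study-specific time points such as a treatment cycle day have no entry in the lookup, so those records drop out, which is why the abstract's recommendation to expand allowable codes would raise the mapping rate.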
Affiliation(s)
- Chelsea L Michael
- Department of Health Informatics, Memorial Sloan Kettering Cancer Center, New York, NY
- Department of Healthcare Policy & Research, Weill Cornell Medicine, New York, NY
| | - Evan T Sholle
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
| | - Regina T Wulff
- Department of Medicine, Weill Cornell Medicine New York, NY
| | - Gail J Roboz
- Department of Medicine, Weill Cornell Medicine New York, NY
| | - Thomas R Campion
- Department of Healthcare Policy & Research, Weill Cornell Medicine, New York, NY
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
- Clinical and Translational Science Center, Weill Cornell Medicine, New York, NY
- Department of Pediatrics, Weill Cornell Medicine, New York, NY
33
|
Merkler AE, Parikh NS, Mir S, Gupta A, Kamel H, Lin E, Lantos J, Schenck EJ, Goyal P, Bruce SS, Kahan J, Lansdale KN, LeMoss NM, Murthy SB, Stieg PE, Fink ME, Iadecola C, Segal AZ, Campion TR, Diaz I, Zhang C, Navi BB. Risk of Ischemic Stroke in Patients with Covid-19 versus Patients with Influenza. medRxiv 2020. [PMID: 32511527 PMCID: PMC7273295 DOI: 10.1101/2020.05.18.20105494] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Indexed: 12/26/2022]
Abstract
Importance: Case series without control groups suggest that Covid-19 may cause ischemic stroke, but whether Covid-19 is associated with a higher risk of ischemic stroke than would be expected from a viral respiratory infection is uncertain. Objective: To compare the rate of ischemic stroke between patients with Covid-19 and patients with influenza, a respiratory viral illness previously linked to stroke. Design: A retrospective cohort study. Setting: Two academic hospitals in New York City. Participants: We included adult patients with emergency department visits or hospitalizations with Covid-19 from March 4, 2020 through May 2, 2020. Our comparison cohort included adult patients with emergency department visits or hospitalizations with influenza A or B from January 1, 2016 through May 31, 2018 (calendar years spanning moderate and severe influenza seasons). Exposures: Covid-19 infection confirmed by evidence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in the nasopharynx by polymerase chain reaction, and laboratory-confirmed influenza A or B. Main Outcomes and Measures: A panel of neurologists adjudicated the primary outcome of acute ischemic stroke and its clinical characteristics, etiological mechanisms, and outcomes. We used logistic regression to compare the proportion of Covid-19 patients with ischemic stroke versus the proportion among patients with influenza. Results: Among 2,132 patients with emergency department visits or hospitalizations with Covid-19, 31 patients (1.5%; 95% confidence interval [CI], 1.0%-2.1%) had an acute ischemic stroke. The median age of patients with stroke was 69 years (interquartile range, 66-78) and 58% were men. Stroke was the reason for hospital presentation in 8 (26%) cases. For our comparison cohort, we identified 1,516 patients with influenza, of whom 0.2% (95% CI, 0.0-0.6%) had an acute ischemic stroke. After adjustment for age, sex, and race, the likelihood of stroke was significantly higher with Covid-19 than with influenza infection (odds ratio, 7.5; 95% CI, 2.3-24.9). Conclusions and Relevance: Approximately 1.5% of patients with emergency department visits or hospitalizations with Covid-19 experienced ischemic stroke, a rate 7.5-fold higher than in patients with influenza. Future studies should investigate the thrombotic mechanisms in Covid-19 in order to determine optimal strategies to prevent disabling complications like ischemic stroke.
Affiliation(s)
- Alexander E Merkler
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, NY, USA
| | - Neal S Parikh
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, NY, USA
| | - Saad Mir
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, NY, USA
| | - Ajay Gupta
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, NY, USA
- Department of Radiology, Weill Cornell Medicine, New York, NY, USA
| | - Hooman Kamel
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, NY, USA
| | - Eaton Lin
- Department of Radiology, Weill Cornell Medicine, New York, NY, USA
| | - Joshua Lantos
- Department of Radiology, Weill Cornell Medicine, New York, NY, USA
| | - Edward J Schenck
- Department of Medicine, Division of Pulmonary and Critical Care Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Parag Goyal
- Department of Medicine, Division of Cardiology, Weill Cornell Medicine, New York, NY, USA
| | - Samuel S Bruce
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, NY, USA
| | - Joshua Kahan
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, NY, USA
| | - Kelsey N Lansdale
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, NY, USA
| | - Natalie M LeMoss
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, NY, USA
| | - Santosh B Murthy
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, NY, USA
| | - Philip E Stieg
- Department of Neurosurgery, Weill Cornell Medicine, New York, NY, USA
| | - Matthew E Fink
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, NY, USA
| | - Costantino Iadecola
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, NY, USA
| | - Alan Z Segal
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, NY, USA
| | - Thomas R Campion
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
| | - Ivan Diaz
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
| | - Cenai Zhang
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, NY, USA
| | - Babak B Navi
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medicine, New York, NY, USA
34
|
Adekkanattu P, Jiang G, Luo Y, Kingsbury PR, Xu Z, Rasmussen LV, Pacheco JA, Kiefer RC, Stone DJ, Brandt PS, Yao L, Zhong Y, Deng Y, Wang F, Ancker JS, Campion TR, Pathak J. Evaluating the Portability of an NLP System for Processing Echocardiograms: A Retrospective, Multi-site Observational Study. AMIA Annu Symp Proc 2020; 2019:190-199. [PMID: 32308812 PMCID: PMC7153064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
While natural language processing (NLP) of unstructured clinical narratives holds the potential for patient care and clinical research, portability of NLP approaches across multiple sites remains a major challenge. This study investigated the portability of an NLP system developed initially at the Department of Veterans Affairs (VA) to extract 27 key cardiac concepts from free-text or semi-structured echocardiograms from three academic medical centers: Weill Cornell Medicine, Mayo Clinic and Northwestern Medicine. While the NLP system showed high precision and recall measurements for four target concepts (aortic valve regurgitation, left atrium size at end systole, mitral valve regurgitation, tricuspid valve regurgitation) across all sites, we found moderate or poor results for the remaining concepts and the NLP system performance varied between individual sites.
Collapse
Affiliation(s)
| | | | - Yuan Luo
- Northwestern University, Chicago, IL
| | | | | | | | | | | | | | | | - Liang Yao
- Northwestern University, Chicago, IL
| | | | - Yu Deng
- Northwestern University, Chicago, IL
| | - Fei Wang
- Weill Cornell Medicine, New York, NY
|
35
|
Kamel H, Okin PM, Merkler AE, Navi BB, Campion TR, Devereux RB, Díaz I, Weinsaft JW, Kim J. Relationship between left atrial volume and ischemic stroke subtype. Ann Clin Transl Neurol 2019; 6:1480-1486. [PMID: 31402612 PMCID: PMC6689681 DOI: 10.1002/acn3.50841] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 03/20/2019] [Revised: 06/19/2019] [Accepted: 06/19/2019] [Indexed: 01/15/2023]
Abstract
OBJECTIVE Atrial cardiopathy without atrial fibrillation (AF) may be a potential cardiac source of embolic strokes of undetermined source (ESUS). Atrial volume is a feature of atrial cardiopathy, but the relationship between atrial volume and ESUS remains unclear. METHODS We compared left atrial volume among ischemic stroke subtypes in the Cornell Acute Stroke Academic Registry (CAESAR), which includes all patients with acute ischemic stroke at our hospital since 2011. Stroke subtype was determined by neurologists per the TOAST classification and consensus ESUS definition. Left atrial volume index (LAVI) was obtained directly from our echocardiography image system (Xcelera, Philips Healthcare). We used t-tests and analysis of variance for unadjusted comparisons and targeted minimum loss-based estimation for comparisons adjusted for demographics and comorbidities. RESULTS Among 2116 patients in CAESAR from 2011 to 2016, 1293 had LAVI measurements. LAVI varied across subtypes (P < 0.001) from 48.8 (±30.0) mL/m² in cardioembolic strokes to 30.3 (±10.5) mL/m² in small-vessel strokes. LAVI was larger in ESUS (33.3 ± 13.6 mL/m²) than in small- or large-vessel stroke (30.9 ± 10.7 mL/m²) (P = 0.01). The association between LAVI and ESUS persisted after the adjustment for demographics and comorbidities: a 10 mL/m² increase in LAVI was associated with a 4.4% increase in ESUS probability (95% CI, 2.3%-6.4%). Results were similar after excluding patients with AF during post-discharge heart-rhythm monitoring. INTERPRETATION We found larger left atria among patients with ESUS versus non-cardioembolic stroke. There was significant overlap in left atrial size between ESUS and non-cardioembolic stroke, highlighting that many ESUS cases are not cardioembolic.
Affiliation(s)
- Hooman Kamel
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medical College, New York, New York
| | - Peter M Okin
- Division of Cardiology, Weill Cornell Medical College, New York, New York
| | - Alexander E Merkler
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medical College, New York, New York
| | - Babak B Navi
- Clinical and Translational Neuroscience Unit, Feil Family Brain and Mind Research Institute and Department of Neurology, Weill Cornell Medical College, New York, New York
| | - Thomas R Campion
- Department of Healthcare Policy and Research, Weill Cornell Medical College, New York, New York
| | - Richard B Devereux
- Division of Cardiology, Weill Cornell Medical College, New York, New York
| | - Iván Díaz
- Department of Healthcare Policy and Research, Weill Cornell Medical College, New York, New York
| | | | - Jiwon Kim
- Division of Cardiology, Weill Cornell Medical College, New York, New York
|
36
|
Chen C, Turner SP, Sholle ET, Brown SW, Blau VLI, Brouwer JP, Lewis AN, Cole CL, Nanus DM, Shah MA, Leonard JP, Campion TR. Evaluation of a REDCap-based Workflow for Supporting Federal Guidance for Electronic Informed Consent. AMIA Jt Summits Transl Sci Proc 2019; 2019:163-172. [PMID: 31258968 PMCID: PMC6568140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Adoption of electronic informed consent (eConsent) for research remains low despite evidence of improved patient comprehension, usability, and workflow processes compared to paper. At our institution, we implemented an eConsent workflow using REDCap, a widely used electronic data capture system. The goal of this study was to evaluate the extent to which the REDCap eConsent solution adhered to federal guidance for eConsent. Of 29 requirements derived from sixteen recommendations from the United States Office for Human Research Protections (OHRP) and Food and Drug Administration (FDA), the REDCap eConsent solution supported 24 (86%). To the best of our knowledge, this is among the first studies to evaluate an eConsent approach's support for federal guidance. Findings suggest use of REDCap may help other institutions overcome barriers to eConsent adoption, and that OHRP and FDA expand guidance to recommend eConsent solutions integrate with enterprise clinical and research information systems.
Affiliation(s)
- Cindy Chen
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
| | - Scott P Turner
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
| | - Evan T Sholle
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
| | - Scott W Brown
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
| | - Vanessa L I Blau
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
| | | | - Alicia N Lewis
- Department of Medicine, Weill Cornell Medicine, New York, NY
| | - Curtis L Cole
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
- Department of Medicine, Weill Cornell Medicine, New York, NY
- Department of Healthcare Policy & Research, Weill Cornell Medicine, New York, NY
| | - David M Nanus
- Department of Medicine, Weill Cornell Medicine, New York, NY
| | - Manish A Shah
- Department of Medicine, Weill Cornell Medicine, New York, NY
| | - John P Leonard
- Department of Medicine, Weill Cornell Medicine, New York, NY
| | - Thomas R Campion
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
- Department of Healthcare Policy & Research, Weill Cornell Medicine, New York, NY
- Department of Pediatrics, Weill Cornell Medicine, New York, NY
|
37
|
Campion TR, Pompea ST, Turner SP, Sholle ET, Cole CL, Kaushal R. A Method for Integrating Healthcare Provider Organization and Research Sponsor Systems and Workflows to Support Large-Scale Studies. AMIA Jt Summits Transl Sci Proc 2019; 2019:648-655. [PMID: 31259020 PMCID: PMC6568055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Healthcare provider organizations (HPOs) increasingly participate in large-scale research efforts sponsored by external organizations that require use of consent management systems that may not integrate seamlessly with local workflows. The resulting inefficiency can hinder the ability of HPOs to participate in studies. To overcome this challenge, we developed a method using REDCap, a widely adopted electronic data capture system, and novel middleware that can potentially generalize to other settings. In this paper, we describe the method, illustrate its use to support the NIH All of Us Research Program and PCORI ADAPTABLE studies at our HPO, and encourage other HPOs to test replicability of the method to facilitate similar research efforts. Code is available on GitHub at https://github.com/wcmc-research-informatics/.
Affiliation(s)
- Thomas R Campion
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
- Department of Healthcare Policy & Research, Weill Cornell Medicine, New York, NY
- Department of Pediatrics, Weill Cornell Medicine, New York, NY
| | - Sean T Pompea
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
| | - Scott P Turner
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
| | - Evan T Sholle
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
| | - Curtis L Cole
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
- Department of Medicine, Weill Cornell Medicine, New York, NY
| | - Rainu Kaushal
- Department of Healthcare Policy & Research, Weill Cornell Medicine, New York, NY
|
38
|
Turner SP, Pompea ST, Williams KL, Kraemer DA, Sholle ET, Chen C, Cole CL, Kaushal R, Campion TR. Implementation of Informatics to Support the NIH All of Us Research Program in a Healthcare Provider Organization. AMIA Jt Summits Transl Sci Proc 2019; 2019:602-609. [PMID: 31259015 PMCID: PMC6568061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
The NIH All of Us Research Program, a national effort to collect biospecimens and health data for over one million participants from across the United States, requires participating healthcare provider organizations (HPOs) to use informatics tools maintained by the NIH to manage participant consent, biospecimen processing, physical measurements, and other workflows. HPOs also maintain distinct workflows for handling overlapping tasks within their individual aegis, which do not necessarily achieve seamless interoperability with NIH-maintained cloud-based systems. At our HPO, we implemented informatics to address gaps in enrollment workflows and hardware, clinical workflow integration, patient engagement, laboratory support, and study team reporting. In this case report we detail our approach to inform efforts at other institutions for the NIH All of Us Research Program and other studies.
Affiliation(s)
- Scott P Turner
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
| | - Sean T Pompea
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
| | - Kelly L Williams
- Department of Healthcare Policy & Research, Weill Cornell Medicine, New York, NY
| | - David A Kraemer
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
| | - Evan T Sholle
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
| | - Cindy Chen
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
| | - Curtis L Cole
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
- Department of Medicine, Weill Cornell Medicine, New York, NY
| | - Rainu Kaushal
- Department of Healthcare Policy & Research, Weill Cornell Medicine, New York, NY
| | - Thomas R Campion
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY
- Department of Healthcare Policy & Research, Weill Cornell Medicine, New York, NY
- Department of Pediatrics, Weill Cornell Medicine, New York, NY
|
39
|
Adekkanattu P, Sholle ET, DeFerio J, Pathak J, Johnson SB, Campion TR. Ascertaining Depression Severity by Extracting Patient Health Questionnaire-9 (PHQ-9) Scores from Clinical Notes. AMIA Annu Symp Proc 2018; 2018:147-156. [PMID: 30815052 PMCID: PMC6371338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
The Patient Health Questionnaire-9 (PHQ-9) is a validated instrument for assessing depression severity. While some electronic health record (EHR) systems capture PHQ-9 scores in a structured format, unstructured clinical notes remain the only source in many settings, which presents data retrieval challenges for research and clinical decision support. To address this gap, we extended the open-source Leo natural language processing (NLP) platform to extract PHQ-9 scores from clinical notes and evaluated performance using EHR data for n=123,703 patients who were prescribed antidepressants. Compared to a reference standard, the NLP method exhibited high accuracy (97%), sensitivity (98%), precision (97%), and F-score (97%). Furthermore, of patients with PHQ-9 scores identified by the NLP method, 31% (n=498) had at least one PHQ-9 score clinically indicative of major depressive disorder (MDD), but lacked a structured ICD-9/10 diagnosis code for MDD. This NLP technique may facilitate accurate identification and stratification of patients with depression.
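As a rough illustration of what pulling PHQ-9 scores out of free text involves, the sketch below uses a plain regular expression. It is not the Leo-based method evaluated in the study; the pattern, the function name `extract_phq9_scores`, and the sample note wording are assumptions made for illustration only. The one fixed fact it relies on is that PHQ-9 totals range from 0 to 27.

```python
import re

# Hypothetical pattern: "PHQ-9" / "PHQ9" / "PHQ 9", an optional "score" or
# "total score", an optional separator word or symbol, then a 1-2 digit number.
PHQ9_PATTERN = re.compile(
    r"\bPHQ[\s-]?9\b(?:\s+(?:total\s+)?score)?\s*(?:[:=]|is|of|was)?\s*(\d{1,2})\b",
    re.IGNORECASE,
)


def extract_phq9_scores(note: str) -> list[int]:
    """Return plausible PHQ-9 totals (0-27) mentioned in a clinical note."""
    scores = []
    for match in PHQ9_PATTERN.finditer(note):
        value = int(match.group(1))
        # PHQ-9 has nine items scored 0-3, so valid totals are 0 through 27.
        if 0 <= value <= 27:
            scores.append(value)
    return scores


if __name__ == "__main__":
    sample = "Pt seen today. PHQ-9 score: 14. Prior PHQ9 = 7."
    print(extract_phq9_scores(sample))
```

A pattern this simple will miss many real-world phrasings (item-level tables, scores written as words, negated mentions), which is part of why a dedicated NLP platform and a reference-standard evaluation are used in practice.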
Affiliation(s)
- Prakash Adekkanattu
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
| | - Evan T Sholle
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
| | - Joseph DeFerio
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, NY
| | - Jyotishman Pathak
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, NY
- Clinical and Translational Science Center, Weill Cornell Medicine, New York, NY
| | - Stephen B Johnson
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, NY
- Clinical and Translational Science Center, Weill Cornell Medicine, New York, NY
| | - Thomas R Campion
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, NY
- Clinical and Translational Science Center, Weill Cornell Medicine, New York, NY
- Department of Pediatrics, Weill Cornell Medicine, New York, NY
|
40
|
Oxley PR, Ruffing J, Campion TR, Wheeler TR, Cole CL. Design and Implementation of a Secure Computing Environment for Analysis of Sensitive Data at an Academic Medical Center. AMIA Annu Symp Proc 2018; 2018:857-866. [PMID: 30815128 PMCID: PMC6371349] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Academic medical centers need to make sensitive data from electronic health records, payer claims, genomic pipelines, and other sources available for analytical and educational purposes while ensuring privacy and security. Although many studies have described warehouses for collecting biomedical data, few studies have described secure computing environments for analysis of sensitive data. This case report describes the Weill Cornell Medicine Data Core with respect to user access, data controls, hardware, software, audit, and financial considerations. In the 2.5 years since launch, the Data Core has supported more than 200 faculty, staff, and students across nearly 60 research and education projects. Other institutions may benefit from adopting elements of the approach, including tools available on GitHub, for balancing access with privacy and security.
Affiliation(s)
- Peter R Oxley
- Samuel J. Wood Library, Weill Cornell Medicine, New York, NY
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
| | - John Ruffing
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
- Center for Advanced Computing, Cornell University, Ithaca, NY
| | - Thomas R Campion
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, NY
- Department of Pediatrics, Weill Cornell Medicine, New York, NY
| | - Terrie R Wheeler
- Samuel J. Wood Library, Weill Cornell Medicine, New York, NY
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
| | - Curtis L Cole
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, NY
- Department of Medicine, Weill Cornell Medicine, New York, NY
|
41
|
Pacheco JA, Rasmussen LV, Kiefer RC, Campion TR, Speltz P, Carroll RJ, Stallings SC, Mo H, Ahuja M, Jiang G, LaRose ER, Peissig PL, Shang N, Benoit B, Gainer VS, Borthwick K, Jackson KL, Sharma A, Wu AY, Kho AN, Roden DM, Pathak J, Denny JC, Thompson WK. A case study evaluating the portability of an executable computable phenotype algorithm across multiple institutions and electronic health record environments. J Am Med Inform Assoc 2018; 25:1540-1546. [PMID: 30124903 PMCID: PMC6213083 DOI: 10.1093/jamia/ocy101] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 05/04/2018] [Revised: 06/13/2018] [Accepted: 07/10/2018] [Indexed: 12/12/2022]
Abstract
Electronic health record (EHR) algorithms for defining patient cohorts are commonly shared as free-text descriptions that require human intervention both to interpret and implement. We developed the Phenotype Execution and Modeling Architecture (PhEMA, http://projectphema.org) to author and execute standardized computable phenotype algorithms. With PhEMA, we converted an algorithm for benign prostatic hyperplasia, developed for the electronic Medical Records and Genomics network (eMERGE), into a standards-based computable format. Eight sites (7 within eMERGE) received the computable algorithm, and 6 successfully executed it against local data warehouses and/or i2b2 instances. Blinded random chart review of cases selected by the computable algorithm shows PPV ≥90%, and 3 out of 5 sites had >90% overlap of selected cases when comparing the computable algorithm to their original eMERGE implementation. This case study demonstrates potential use of PhEMA computable representations to automate phenotyping across different EHR systems, but also highlights some ongoing challenges.
Affiliation(s)
- Jennifer A Pacheco
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
| | - Luke V Rasmussen
- Division of Health and Biomedical Informatics, Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
| | - Richard C Kiefer
- Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, USA
| | - Thomas R Campion
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, New York, USA
| | - Peter Speltz
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - Robert J Carroll
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - Sarah C Stallings
- Meharry-Vanderbilt Alliance, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - Huan Mo
- Department of Pathology, Loma Linda University Health, Loma Linda, California, USA
| | - Monika Ahuja
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, New York, USA
| | - Guoqian Jiang
- Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, USA
| | - Eric R LaRose
- Department of Biomedical Informatics, Marshfield Clinic Research Institute, Marshfield, Wisconsin, USA
| | - Peggy L Peissig
- Department of Biomedical Informatics, Marshfield Clinic Research Institute, Marshfield, Wisconsin, USA
| | - Ning Shang
- Department of Biomedical Informatics, Columbia University, New York, New York, USA
| | - Barbara Benoit
- Research IS and Computing, Partners HealthCare, Harvard University, Somerville, Massachusetts, USA
| | - Vivian S Gainer
- Research IS and Computing, Partners HealthCare, Harvard University, Somerville, Massachusetts, USA
| | - Kenneth Borthwick
- Henry Hood Center for Health Research, Geisinger, Danville, Pennsylvania, USA
| | - Kathryn L Jackson
- Department of Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
| | - Ambrish Sharma
- Department of Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
| | - Andy Yizhou Wu
- Department of Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
| | - Abel N Kho
- Department of Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
| | - Dan M Roden
- Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Department of Pharmacology, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - Jyotishman Pathak
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, New York, USA
| | - Joshua C Denny
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - William K Thompson
- Department of Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
|
42
|
Sholle E, Krichevsky S, Scandura J, Sosner C, Campion TR. Lessons Learned in the Development of a Computable Phenotype for Response in Myeloproliferative Neoplasms. IEEE Int Conf Healthc Inform 2018; 2018:328-331. [PMID: 31276120 DOI: 10.1109/ichi.2018.00045] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/09/2022]
Abstract
Determining response status in patients with myeloproliferative neoplasms is a complex problem requiring the integration of both structured and unstructured data elements from disparate information systems. By applying multiple techniques, a collaborative team of informatics professionals and research personnel were able to determine which elements were amenable to automated extraction and which required expert adjudication. With this knowledge in mind, we were able to build a system that joins together programmatically-derived and manually-abstracted data elements to facilitate response assessment - an important end point in clinical and translational research in this disease area.
Affiliation(s)
- Evan Sholle
- Information Technologies & Services, Weill Cornell Medicine, New York, NY
| | | | - Joseph Scandura
- Department of Medicine, Weill Cornell Medicine, New York, NY
| | - Claudia Sosner
- Department of Medicine, Weill Cornell Medicine, New York, USA
| | - Thomas R Campion
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, USA
|
43
|
Sholle ET, Davila MA, Kabariti J, Schwartz JZ, Varughese VI, Cole CL, Campion TR. A scalable method for supporting multiple patient cohort discovery projects using i2b2. J Biomed Inform 2018; 84:179-183. [PMID: 30009991 DOI: 10.1016/j.jbi.2018.07.010] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 12/22/2017] [Revised: 05/16/2018] [Accepted: 07/11/2018] [Indexed: 11/18/2022]
Abstract
Although i2b2, a popular platform for patient cohort discovery using electronic health record (EHR) data, can support multiple projects specific to individual disease areas or research interests, the standard approach for doing so duplicates data across projects, requiring additional disk space and processing time, which limits scalability. To address this deficiency, we developed a novel approach that stored data in a single i2b2 fact table and used structured query language (SQL) views to access data for specific projects. Compared to the standard approach, the view-based approach reduced required disk space by 59% and extract-transfer-load (ETL) time by 46%, without substantially impacting query performance. The view-based approach has enabled scalability of multiple i2b2 projects and generalized to another data model at our institution. Other institutions may benefit from this approach, code of which is available on GitHub (https://github.com/wcmc-research-informatics/super-i2b2).
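The view-based idea in this abstract can be sketched in a few lines. The example below is a minimal illustration using SQLite from Python, not the authors' implementation (which is in the linked repository): the `project_patient` mapping table, the project names, the view naming scheme, and the three-column fact table are all assumptions, and a real i2b2 `observation_fact` table has many more columns.

```python
import sqlite3


def build_demo(conn: sqlite3.Connection) -> None:
    """Create one shared fact table and one SQL view per project."""
    cur = conn.cursor()
    # Simplified stand-in for the single shared i2b2 fact table.
    cur.execute(
        "CREATE TABLE observation_fact ("
        "patient_num INTEGER, concept_cd TEXT, start_date TEXT)"
    )
    # Hypothetical mapping of patients to projects (an assumption, not i2b2 schema).
    cur.execute("CREATE TABLE project_patient (project_id TEXT, patient_num INTEGER)")
    cur.executemany(
        "INSERT INTO observation_fact VALUES (?, ?, ?)",
        [
            (1, "ICD10:C91.0", "2017-01-05"),
            (1, "LOINC:718-7", "2017-01-06"),
            (2, "ICD10:I48.91", "2017-02-01"),
            (3, "ICD10:I63.9", "2017-03-10"),
        ],
    )
    cur.executemany(
        "INSERT INTO project_patient VALUES (?, ?)",
        [("oncology", 1), ("cardiology", 2), ("cardiology", 3)],
    )
    # Each project gets a view over the shared table instead of its own copy.
    # Project names are fixed literals here, so string formatting is safe.
    for project_id in ("oncology", "cardiology"):
        cur.execute(
            f"CREATE VIEW observation_fact_{project_id} AS "
            "SELECT f.* FROM observation_fact AS f "
            "JOIN project_patient AS p ON p.patient_num = f.patient_num "
            f"WHERE p.project_id = '{project_id}'"
        )
    conn.commit()


def count_rows(conn: sqlite3.Connection, relation: str) -> int:
    """Count rows in a table or view (relation name must be trusted input)."""
    return conn.execute(f"SELECT COUNT(*) FROM {relation}").fetchone()[0]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_demo(connection)
    for name in ("observation_fact", "observation_fact_oncology", "observation_fact_cardiology"):
        print(name, count_rows(connection, name))
```

Because each project reads through a view, the fact rows are stored once; adding a project means adding mapping rows and a view rather than reloading a copy of the data, which is the kind of saving in disk space and load time the abstract reports.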
Affiliation(s)
- Evan T Sholle
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, USA
| | - Marcos A Davila
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, USA
| | - Joseph Kabariti
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, USA
| | - Julian Z Schwartz
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, USA
| | - Vinay I Varughese
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, USA
| | - Curtis L Cole
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, USA; Department of Medicine, Weill Cornell Medicine, New York, NY, USA; Department of Healthcare Policy & Research, Weill Cornell Medicine, New York, NY, USA
| | - Thomas R Campion
- Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, USA; Department of Healthcare Policy & Research, Weill Cornell Medicine, New York, NY, USA; Department of Pediatrics, Weill Cornell Medicine, New York, NY, USA; Clinical and Translational Science Center, Weill Cornell Medicine, New York, NY, USA.
|
44
|
Johnson SB, Adekkanattu P, Campion TR, Flory J, Pathak J, Patterson OV, DuVall SL, Major V, Aphinyanaphongs Y. From Sour Grapes to Low-Hanging Fruit: A Case Study Demonstrating a Practical Strategy for Natural Language Processing Portability. AMIA Jt Summits Transl Sci Proc 2018; 2017:104-112. [PMID: 29888051 PMCID: PMC5961788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Natural Language Processing (NLP) holds potential for patient care and clinical research, but a gap exists between promise and reality. While some studies have demonstrated portability of NLP systems across multiple sites, challenges remain. Strategies to mitigate these challenges can strive for complex NLP problems using advanced methods (hard-to-reach fruit), or focus on simple NLP problems using practical methods (low-hanging fruit). This paper investigates a practical strategy for NLP portability using extraction of left ventricular ejection fraction (LVEF) as a use case. We used a tool developed at the Department of Veterans Affairs (VA) to extract the LVEF values from free-text echocardiograms in the MIMIC-III database. The approach showed an accuracy of 98.4%, sensitivity of 99.4%, a positive predictive value of 98.7%, and F-score of 99.0%. This experience, in which a simple NLP solution proved highly portable with excellent performance, illustrates the point that simple NLP applications may be easier to disseminate and adapt, and in the short term may prove more useful, than complex applications.
Affiliation(s)
- Stephen B Johnson
- Healthcare Policy and Research, Weill Cornell Medicine, New York, New York
| | - Prakash Adekkanattu
- Information Technologies & Services, Weill Cornell Medicine, New York, New York
| | - Thomas R Campion
- Healthcare Policy and Research, Weill Cornell Medicine, New York, New York
- Information Technologies & Services, Weill Cornell Medicine, New York, New York
| | - James Flory
- Healthcare Policy and Research, Weill Cornell Medicine, New York, New York
| | - Jyotishman Pathak
- Healthcare Policy and Research, Weill Cornell Medicine, New York, New York
| | - Olga V Patterson
- VA Salt Lake City Health Care System
- University of Utah, Salt Lake City, UT
| | - Scott L DuVall
- VA Salt Lake City Health Care System
- University of Utah, Salt Lake City, UT
| | - Vincent Major
- Center for Health Informatics and Bioinformatics, NYU Langone Medical Center, New York, New York
| | - Yindalon Aphinyanaphongs
- Center for Health Informatics and Bioinformatics, NYU Langone Medical Center, New York, New York
|
45
|
Chen C, Wulff RT, Sholle ET, Roboz GJ, Kraemer DA, Campion TR. Evaluating Generalizability of a Biospecimen Informatics Approach: Support for Local Requirements and Best Practices. AMIA Jt Summits Transl Sci Proc 2018; 2017:55-62. [PMID: 29888041 PMCID: PMC5961803] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/30/2022]
Abstract
To enable clinical and translational research, academic medical centers increasingly implement biospecimen information management systems. At our institution, one laboratory successfully implemented a multi-system solution that enabled collection and reporting of specimen- and aliquot-level data. The objective of this study was to assess the solution against the laboratory's requirements and with respect to support of best practices for biospecimen information management systems defined by the International Society for Biological and Environmental Repositories (ISBER). The solution supported the laboratory's reporting needs and 90% (n=26) of ISBER best practices. To the best of our knowledge, this is among the first studies to demonstrate the generalizability of a biospecimen informatics approach. Findings suggest that development and evaluation of biospecimen informatics approaches can potentially improve through closer collaboration of informatics and biorepository professional societies.
Affiliation(s)
- Cindy Chen
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
| | - Regina T. Wulff
- Department of Medicine, Weill Cornell Medicine, New York, NY
| | - Evan T. Sholle
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
| | - Gail J. Roboz
- Department of Medicine, Weill Cornell Medicine, New York, NY
| | - David A. Kraemer
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
| | - Thomas R. Campion
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY; Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, NY; Clinical and Translational Science Center, Weill Cornell Medicine, New York, NY; Department of Pediatrics, Weill Cornell Medicine, New York, NY
|
46
|
Sholle ET, Kabariti J, Johnson SB, Leonard JP, Pathak J, Varughese VI, Cole CL, Campion TR. Secondary Use of Patients' Electronic Records (SUPER): An Approach for Meeting Specific Data Needs of Clinical and Translational Researchers. AMIA Annu Symp Proc 2018; 2017:1581-1588. [PMID: 29854228 PMCID: PMC5977622] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Academic medical centers commonly approach secondary use of electronic health record (EHR) data by implementing centralized clinical data warehouses (CDWs). However, CDWs require extensive resources to model data dimensions and harmonize clinical terminology, which can hinder effective support of the specific and varied data needs of investigators. We hypothesized that an approach that aggregates raw data from source systems, ignores initial modeling typical of CDWs, and transforms raw data for specific research purposes would meet investigator needs. The approach has successfully enabled multiple tools that provide utility to the institutional research enterprise. To our knowledge, this is the first complete description of a methodology for electronic patient data acquisition and provisioning that ignores data harmonization at the time of initial storage in favor of downstream transformation to address specific research questions and applications.
Affiliation(s)
- Evan T Sholle
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
- Joseph Kabariti
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
- Stephen B Johnson
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, NY
- Clinical and Translational Science Center, Weill Cornell Medicine, New York, NY
- John P Leonard
- Department of Medicine, Weill Cornell Medicine, New York, NY
- Jyotishman Pathak
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, NY
- Vinay I Varughese
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
- Curtis L Cole
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, NY
- Department of Medicine, Weill Cornell Medicine, New York, NY
- Thomas R Campion
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, NY
- Clinical and Translational Science Center, Weill Cornell Medicine, New York, NY
- Department of Pediatrics, Weill Cornell Medicine, New York, NY
|
47
|
Campion TR, Weinberg ST, Lorenzi NM, Waitman LR. Evaluation of Computerized Free Text Sign-Out Notes: Baseline Understanding and Recommendations. Appl Clin Inform 2017; 1:304-317. [PMID: 21258575 DOI: 10.4338/aci-2010-04-ra-0023] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 11/23/2022]
Abstract
BACKGROUND: Standardization of sign-out, the transfer of patient information and responsibility between inpatient providers at shift change, is a Joint Commission National Patient Safety Goal intended to improve communication and reduce risk of error. Computerized systems with free text data entry and limited structure allow clinicians to generate sign-out notes in a variety of ways. OBJECTIVES: The literature lacks a systematic exploration of the range of content generated by users of computerized sign-out systems. The goal of this study was to determine if and how clinicians record standardized sign-out information using a system with free text data entry and limited structure. METHODS: Using qualitative methods, we reviewed free text sign-out notes for 730 patient cases across 39 hospital units at an academic medical center. RESULTS: Two categories of information expression emerged from analysis: patient treatment (comprising patient summaries, awareness items, and action items) and care team coordination (consisting of discharge information, contact information, and social concerns). A third category describing the format of sign-out note content, presentation of information, also emerged. Location and structure of information varied, but sign-out note content for some hospital units exhibited specific characteristics and was relatively standardized. CONCLUSIONS: Findings provide a baseline understanding of computerized free text sign-out note content. Sign-out notes contained a synthesis of data from disparate sources. We recommend formalizing existing unit-specific content standardization and system use patterns to reduce sign-out note variability and improve communication.
Affiliation(s)
- Thomas R Campion
- Vanderbilt University School of Medicine, Department of Biomedical Informatics
|
48
|
Campion TR, Sholle ET, Davila MA. Generalizable Middleware to Support Use of REDCap Dynamic Data Pull for Integrating Clinical and Research Data. AMIA Jt Summits Transl Sci Proc 2017; 2017:76-81. [PMID: 28815111 PMCID: PMC5543341] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]
Abstract
To support integration of clinical and research data, the makers of REDCap, a widely-used electronic data capture system, released the Dynamic Data Pull (DDP) module. Although DDP is a standard module in REDCap, institutions must develop custom middleware web services to exchange data between REDCap and local source systems. The lack of middleware is a barrier to institutional adoption and use by investigators. To overcome this gap, we developed a REDCap DDP web service middleware (accessible at https://github.com/wcmc-research-informatics/redcap-ddp) that minimizes developer effort, relies on configuration by non-developers, and can generalize to other settings. Early findings suggest the approach is successful.
Affiliation(s)
- Thomas R Campion
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
- Clinical and Translational Science Center, Weill Cornell Medicine, New York, NY
- Department of Healthcare Policy and Research, Weill Cornell Medicine, New York, NY
- Department of Pediatrics, Weill Cornell Medicine, New York, NY
- Evan T Sholle
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
- Marcos A Davila
- Information Technologies and Services Department, Weill Cornell Medicine, New York, NY
|
49
|
Campion TR, Vest JR, Kern LM, Kaushal R. Adoption of clinical data exchange in community settings: a comparison of two approaches. AMIA Annu Symp Proc 2014; 2014:359-365. [PMID: 25954339 PMCID: PMC4419957] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2023]
Abstract
Adoption of electronic clinical data exchange (CDE) across disparate healthcare organizations remains low in community settings despite demonstrated benefits. To expand CDE in communities, New York State funded sixteen community-based organizations to implement point-to-point directed exchange (n=8) and multi-site query-based health information exchange (HIE) (n=8). We conducted a cross-sectional study to compare adoption of directed exchange versus query-based HIE. From 2008 to 2011, 66% (n=1,747) of providers targeted for directed exchange and 21% (n=5,427) of providers targeted for query-based HIE adopted CDE. Funding per provider adoptee was almost two times greater for directed exchange (median (interquartile range): $25,535 ($17,391-$42,240)) than query-based HIE ($14,649 ($9,897-$28,078)), although the difference was not statistically significant. Because its infrastructure can cover larger populations using similar levels of public funding, query-based HIE may scale more broadly than directed exchange. To our knowledge, this is among the first studies to compare directed exchange versus query-based HIE.
Affiliation(s)
- Thomas R Campion
- Department of Healthcare Policy and Research, Weill Cornell Medical College, New York, NY; Department of Pediatrics, Weill Cornell Medical College, New York, NY; Center for Healthcare Informatics and Policy, Weill Cornell Medical College, New York, NY; Health Information Technology Evaluation Collaborative, New York, NY
- Joshua R Vest
- Department of Healthcare Policy and Research, Weill Cornell Medical College, New York, NY; Center for Healthcare Informatics and Policy, Weill Cornell Medical College, New York, NY; Health Information Technology Evaluation Collaborative, New York, NY
- Lisa M Kern
- Department of Healthcare Policy and Research, Weill Cornell Medical College, New York, NY; Center for Healthcare Informatics and Policy, Weill Cornell Medical College, New York, NY; Health Information Technology Evaluation Collaborative, New York, NY; Department of Medicine, Weill Cornell Medical College, New York, NY
- Rainu Kaushal
- Department of Healthcare Policy and Research, Weill Cornell Medical College, New York, NY; Department of Pediatrics, Weill Cornell Medical College, New York, NY; Center for Healthcare Informatics and Policy, Weill Cornell Medical College, New York, NY; Health Information Technology Evaluation Collaborative, New York, NY; Department of Medicine, Weill Cornell Medical College, New York, NY; Komansky Center for Children's Health at NewYork-Presbyterian Hospital, New York, NY
|
50
|
Vest JR, Campion TR, Kern LM, Kaushal R. Public and private sector roles in health information technology policy: Insights from the implementation and operation of exchange efforts in the United States. Health Policy and Technology 2014. [DOI: 10.1016/j.hlpt.2014.03.002] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
|