For: Pendergrass SA, Crawford DC. Using Electronic Health Records To Generate Phenotypes For Research. Curr Protoc Hum Genet 2019;100:e80. [PMID: 30516347 PMCID: PMC6318047 DOI: 10.1002/cphg.80] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Indexed: 12/19/2022]

Cited by Other Article(s)

Merritt VC, Chen AW, Bonzel CL, Hong C, Sangar R, Morini Sweet S, Sorg SF, Chanfreau-Coffinier C. Development and validation of an electronic health record-based algorithm for identifying TBI in the VA: A VA Million Veteran Program study. Brain Inj 2024:1-9. [PMID: 39004925 DOI: 10.1080/02699052.2024.2373920] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Accepted: 06/24/2024] [Indexed: 07/16/2024]

Chen JS, Copado IA, Vallejos C, Kalaw FGP, Soe P, Cai CX, Toy BC, Borkar D, Sun CQ, Shantha JG, Baxter SL. Variations in Electronic Health Record-Based Definitions of Diabetic Retinopathy Cohorts: A Literature Review and Quantitative Analysis. Ophthalmology Science 2024;4:100468. [PMID: 38560278 PMCID: PMC10973665 DOI: 10.1016/j.xops.2024.100468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Revised: 01/04/2024] [Accepted: 01/11/2024] [Indexed: 04/04/2024]

Abstract

Purpose

Use of the electronic health record (EHR) has motivated the need for data standardization. A gap in knowledge exists regarding variations in existing terminologies for defining diabetic retinopathy (DR) cohorts. This study aimed to review the literature and analyze variations regarding codified definitions of DR.

Design

Literature review and quantitative analysis.

Subjects

Published manuscripts.

Methods

Four graders reviewed PubMed and Google Scholar for peer-reviewed studies. Studies were included if they used codified definitions of DR (e.g., billing codes). Data elements such as author names, publication year, purpose, data set type, and DR definitions were manually extracted. Each study was reviewed by ≥ 2 authors to validate inclusion eligibility. Quantitative analyses of the codified definitions were then performed to characterize the variation between DR cohort definitions.

Main Outcome Measures

Number of studies included and numeric counts of billing codes used to define codified cohorts.

Results

In total, 43 studies met the inclusion criteria. Half of the included studies used datasets based on structured EHR data (i.e., data registries, institutional EHR review), and half used claims data. All but 1 of the studies used billing codes such as the International Classification of Diseases 9th or 10th edition (ICD-9 or ICD-10), either alone or in addition to another terminology for defining disease. Of the 27 included studies that used ICD-9 and the 20 studies that used ICD-10 codes, the most common codes used pertained to the full spectrum of DR severity. Diabetic retinopathy complications (e.g., vitreous hemorrhage) were also used to define some DR cohorts.

Conclusions

Substantial variations exist among codified definitions for DR cohorts within retrospective studies. Variable definitions may limit generalizability and reproducibility of retrospective studies. More work is needed to standardize disease cohorts.

Financial Disclosures

Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.

Affiliation(s)

Jimmy S Chen Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute, University of California San Diego, La Jolla, California UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California
Ivan A Copado Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute, University of California San Diego, La Jolla, California UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California
Cecilia Vallejos Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute, University of California San Diego, La Jolla, California UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California
Fritz Gerald P Kalaw Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute, University of California San Diego, La Jolla, California UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California
Priyanka Soe Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute, University of California San Diego, La Jolla, California UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California
Cindy X Cai Wilmer Eye Institute, Johns Hopkins School of Medicine, Baltimore, Maryland
Brian C Toy Department of Ophthalmology, Roski Eye Institute, Keck School of Medicine, University of Southern California, Los Angeles, California
Durga Borkar Department of Ophthalmology, Duke Eye Center, Duke University, Durham, North Carolina
Catherine Q Sun F.I. Proctor Foundation, University of California San Francisco, San Francisco, California Department of Ophthalmology, University of California San Francisco, San Francisco, California
Jessica G Shantha F.I. Proctor Foundation, University of California San Francisco, San Francisco, California Department of Ophthalmology, University of California San Francisco, San Francisco, California
Sally L Baxter Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology and Shiley Eye Institute, University of California San Diego, La Jolla, California UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California

Jafari E, Blackman MH, Karnes JH, Van Driest SL, Crawford DC, Choi L, McDonough CW. Using electronic health records for clinical pharmacology research: Challenges and considerations. Clin Transl Sci 2024;17:e13871. [PMID: 38943244 PMCID: PMC11213823 DOI: 10.1111/cts.13871] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Revised: 05/21/2024] [Accepted: 05/24/2024] [Indexed: 07/01/2024] Open

Miller M, Jorm L, Partyka C, Burns B, Habig K, Oh C, Immens S, Ballard N, Gallego B. Identifying prehospital trauma patients from ambulance patient care records; comparing two methods using linked data in New South Wales, Australia. Injury 2024;55:111570. [PMID: 38664086 DOI: 10.1016/j.injury.2024.111570] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2024] [Revised: 04/11/2024] [Accepted: 04/14/2024] [Indexed: 06/16/2024]

Abstract

BACKGROUND

Linked datasets for trauma system monitoring should ideally follow patients from the prehospital scene to hospital admission and post-discharge. Having a well-defined cohort when using administrative datasets is essential because they must capture the representative population. Unlike hospital electronic health records (EHR), ambulance patient-care records lack access to sources beyond immediate clinical notes. Relying on a limited set of variables to define a study population might result in missed patient inclusion. We aimed to compare two methods of identifying prehospital trauma patients: one using only those documented under a trauma protocol and another incorporating additional data elements from ambulance patient care records.

METHODS

We analyzed data from six routinely collected administrative datasets from 2015 to 2018, including ambulance patient-care records, aeromedical data, emergency department visits, hospitalizations, rehabilitation outcomes, and death records. Three prehospital trauma cohorts were created: an Extended-T-protocol cohort (patients transported under a trauma protocol and/or patients with prespecified criteria from structured data fields), a T-protocol cohort (only patients documented as transported under a trauma protocol) and a non-T-protocol cohort (extended-T-protocol population not in the T-protocol cohort). Patient-encounter characteristics, mortality, clinical and post-hospital discharge outcomes were compared. A conservative p-value of 0.01 was considered significant.

RESULTS

Of 1 038 263 patient-encounters included in the extended-T-population, 814 729 (78.5%) were transported, with 438 893 (53.9%) documented as a T-protocol patient. Half (49.6%) of the non-T-protocol sub-cohort had an International Classification of Diseases 10th edition injury or external cause code, indicating 79 644 missed patients when a T-protocol-only definition was used. The non-T-protocol sub-cohort also identified additional patients with intubation, prehospital blood transfusion and positive eFAST. A higher proportion of non-T-protocol patients than T-protocol patients were admitted to the ICU (4.6% vs 3.6%), ventilated (1.8% vs 1.3%), received in-hospital transfusion (7.9% vs 6.8%) or died (1.8% vs 1.3%). Urgent trauma surgery was similar between groups (1.3% vs 1.4%).

CONCLUSION

The extended-T-population definition identified 50 % more admitted patients with an ICD-10-AM code consistent with an injury, including patients with severe trauma. Developing an EHR phenotype incorporating multiple data fields of ambulance-transported trauma patients for use with linked data may avoid missing these patients.
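
The three cohort definitions in this abstract reduce to simple set operations, which the following minimal Python sketch illustrates. The `Encounter` record and its field names are hypothetical stand-ins (the abstract does not describe the study's actual schema), and the `pct` helper only re-checks the two percentages reported in the results under the assumption that the second percentage uses transported encounters as its denominator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Encounter:
    # Hypothetical fields for illustration; not the study's real data model.
    encounter_id: int
    trauma_protocol: bool       # documented as transported under a trauma protocol
    meets_extra_criteria: bool  # meets prespecified criteria from structured fields


def build_cohorts(encounters):
    """Return (extended, t_protocol, non_t_protocol) as sets of encounter ids."""
    extended = {
        e.encounter_id
        for e in encounters
        if e.trauma_protocol or e.meets_extra_criteria
    }
    t_protocol = {e.encounter_id for e in encounters if e.trauma_protocol}
    # The non-T-protocol cohort is the extended population minus the T-protocol cohort.
    non_t_protocol = extended - t_protocol
    return extended, t_protocol, non_t_protocol


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Toy data: encounter 4 meets neither definition and is excluded everywhere.
sample = [
    Encounter(1, True, False),
    Encounter(2, False, True),
    Encounter(3, True, True),
    Encounter(4, False, False),
]
extended, t_protocol, non_t_protocol = build_cohorts(sample)
print(sorted(extended), sorted(t_protocol), sorted(non_t_protocol))

# Arithmetic check of the reported proportions.
print(pct(814_729, 1_038_263))  # transported / extended-T-population
print(pct(438_893, 814_729))    # T-protocol / transported
```

Because the T-protocol cohort is a subset of the extended cohort by construction, the set difference is enough to recover the patients a protocol-only definition would miss.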

Newby D, Taylor N, Joyce DW, Winchester LM. Optimising the use of electronic medical records for large scale research in psychiatry. Transl Psychiatry 2024;14:232. [PMID: 38824136 PMCID: PMC11144247 DOI: 10.1038/s41398-024-02911-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Revised: 04/13/2024] [Accepted: 04/15/2024] [Indexed: 06/03/2024] Open

Bazemore K, Joo J, Hwang WT, Himes BE. Clarifying Chronic Obstructive Pulmonary Disease Genetic Associations Observed in Biobanks via Mediation Analysis of Smoking. AMIA Joint Summits on Translational Science Proceedings. AMIA Joint Summits on Translational Science 2024;2024:499-508. [PMID: 38827081 PMCID: PMC11141825] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2024]

Mathis M, Steffner KR, Subramanian H, Gill GP, Girardi NI, Bansal S, Bartels K, Khanna AK, Huang J. Overview and Clinical Applications of Artificial Intelligence and Machine Learning in Cardiac Anesthesiology. J Cardiothorac Vasc Anesth 2024;38:1211-1220. [PMID: 38453558 PMCID: PMC10999327 DOI: 10.1053/j.jvca.2024.02.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/25/2023] [Revised: 01/30/2024] [Accepted: 02/05/2024] [Indexed: 03/09/2024]

Cao X, Zhang S, Sha Q. A novel method for multiple phenotype association studies based on genotype and phenotype network. PLoS Genet 2024;20:e1011245. [PMID: 38728360 PMCID: PMC11111089 DOI: 10.1371/journal.pgen.1011245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2023] [Revised: 05/22/2024] [Accepted: 03/29/2024] [Indexed: 05/12/2024] Open

Choudhary T, Upadhyaya P, Davis CM, Yang P, Tallowin S, Lisboa FA, Schobel SA, Coopersmith CM, Elster EA, Buchman TG, Dente CJ, Kamaleswaran R. Derivation and Validation of Generalized Sepsis-induced Acute Respiratory Failure Phenotypes Among Critically Ill Patients: A Retrospective Study. Research Square 2024:rs.3.rs-4307475. [PMID: 38746442 PMCID: PMC11092838 DOI: 10.21203/rs.3.rs-4307475/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]

Abstract

Background

Septic patients who develop acute respiratory failure (ARF) requiring mechanical ventilation represent a heterogeneous subgroup of critically ill patients with widely variable clinical characteristics. Identifying distinct phenotypes of these patients may reveal insights about the broader heterogeneity in the clinical course of sepsis. We aimed to derive novel phenotypes of sepsis-induced ARF using observational clinical data and investigate their generalizability across multi-ICU specialties, considering multi-organ dynamics.

Methods

We performed a multi-center retrospective study of ICU patients with sepsis who required mechanical ventilation for ≥24 hours. Data from two different high-volume academic hospital systems were used as a derivation set with N=3,225 medical ICU (MICU) patients and a validation set with N=848 MICU patients. For the multi-ICU validation, we utilized retrospective data from two surgical ICUs at the same hospitals (N=1,577). Clinical data from 24 hours preceding intubation was used to derive distinct phenotypes using an explainable machine learning-based clustering model interpreted by clinical experts.

Results

Four distinct ARF phenotypes were identified: A (severe multi-organ dysfunction (MOD) with a high likelihood of kidney injury and heart failure), B (severe hypoxemic respiratory failure [median P/F=123]), C (mild hypoxia [median P/F=240]), and D (severe MOD with a high likelihood of hepatic injury, coagulopathy, and lactic acidosis). Patients in each phenotype showed differences in clinical course and mortality rates despite similarities in demographics and admission co-morbidities. The phenotypes were reproduced in external validation utilizing an external MICU from a second hospital and SICUs from both centers. Kaplan-Meier analysis showed a significant difference in 28-day mortality across the phenotypes (p<0.01) that was consistent across both centers. The phenotypes demonstrated differences in treatment effects associated with a high positive end-expiratory pressure (PEEP) strategy.

Conclusion

The phenotypes demonstrated unique patterns of organ injury and differences in clinical outcomes, which may help inform future research and clinical trial design for tailored management strategies.

Lemas DJ, Du X, Rouhizadeh M, Lewis B, Frank S, Wright L, Spirache A, Gonzalez L, Cheves R, Magalhães M, Zapata R, Reddy R, Xu K, Parker L, Harle C, Young B, Louis-Jaques A, Zhang B, Thompson L, Hogan WR, Modave F. Classifying early infant feeding status from clinical notes using natural language processing and machine learning. Sci Rep 2024;14:7831. [PMID: 38570569 PMCID: PMC10991582 DOI: 10.1038/s41598-024-58299-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Accepted: 03/27/2024] [Indexed: 04/05/2024] Open

Abstract

The objective of this study is to develop and evaluate natural language processing (NLP) and machine learning models to predict infant feeding status from clinical notes in the Epic electronic health records system. The primary outcome was the classification of infant feeding status from clinical notes using Medical Subject Headings (MeSH) terms. Annotation of notes was completed using TeamTat to uniquely classify clinical notes according to infant feeding status. We trained 6 machine learning models to classify infant feeding status: logistic regression, random forest, XGBoost gradient descent, k-nearest neighbors, and support-vector classifier. Model comparison was evaluated based on overall accuracy, precision, recall, and F1 score. Our modeling corpus included an even number of clinical notes that was a balanced sample across each class. We manually reviewed 999 notes that represented 746 mother-infant dyads with a mean gestational age of 38.9 weeks and a mean maternal age of 26.6 years. The most frequent feeding status classification present for this study was exclusive breastfeeding [n = 183 (18.3%)], followed by exclusive formula bottle feeding [n = 146 (14.6%)], and exclusive feeding of expressed mother's milk [n = 102 (10.2%)], with mixed feeding being the least frequent [n = 23 (2.3%)]. Our final analysis evaluated the classification of clinical notes as breast, formula/bottle, and missing. The machine learning models were trained on these three classes after performing balancing and down sampling. The XGBoost model outperformed all others by achieving an accuracy of 90.1%, a macro-averaged precision of 90.3%, a macro-averaged recall of 90.1%, and a macro-averaged F1 score of 90.1%. Our results demonstrate that natural language processing can be applied to clinical notes stored in the electronic health records to classify infant feeding status. Early identification of breastfeeding status using NLP on unstructured electronic health records data can be used to inform precision public health interventions focused on improving lactation support for postpartum patients.
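
For readers unfamiliar with the macro-averaged metrics quoted in this abstract, the sketch below shows how they are conventionally computed: precision, recall, and F1 are calculated per class and then averaged with equal weight per class. This is a generic illustration in plain Python, not the authors' code; the class labels and the toy predictions are invented for the example.

```python
def macro_metrics(y_true, y_pred):
    """Compute accuracy plus macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Guard against empty denominators by scoring the class as 0.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    return {
        "accuracy": accuracy,
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f1": sum(f1s) / n,
    }


# Toy three-class example (labels are illustrative only).
y_true = ["breast", "breast", "formula", "formula", "missing", "missing"]
y_pred = ["breast", "formula", "formula", "formula", "missing", "missing"]
metrics = macro_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Macro averaging gives each class the same weight regardless of its frequency, which is why it is often reported alongside accuracy when class sizes have been balanced by down sampling.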

Affiliation(s)

Dominick J Lemas Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, 2004 Mowry Road, Clinical and Translational Research Building, Gainesville, FL, 32610, USA. Department of Obstetrics and Gynecology, University of Florida College of Medicine, Gainesville, FL, 32610, USA.
Xinsong Du Division of General Internal Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, MA, 02115, USA Department of Medicine, Harvard Medical School, Boston, MA, 02115, USA
Masoud Rouhizadeh Department of Pharmaceutical Outcomes and Policy, University of Florida College of Medicine, Gainesville, FL, 32610, USA Biomedical Informatics and Data Science Section, Division of General Internal Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, 21205, USA
Braeden Lewis Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, 2004 Mowry Road, Clinical and Translational Research Building, Gainesville, FL, 32610, USA
Simon Frank Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, 2004 Mowry Road, Clinical and Translational Research Building, Gainesville, FL, 32610, USA
Lauren Wright Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, 2004 Mowry Road, Clinical and Translational Research Building, Gainesville, FL, 32610, USA
Alex Spirache Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, 2004 Mowry Road, Clinical and Translational Research Building, Gainesville, FL, 32610, USA
Lisa Gonzalez Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, 2004 Mowry Road, Clinical and Translational Research Building, Gainesville, FL, 32610, USA
Ryan Cheves Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, 2004 Mowry Road, Clinical and Translational Research Building, Gainesville, FL, 32610, USA
Marina Magalhães Division of Neonatal and Developmental Medicine, Department of Pediatrics, Stanford University School of Medicine, Palo Alto, CA, 94305, USA
Ruben Zapata Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, 2004 Mowry Road, Clinical and Translational Research Building, Gainesville, FL, 32610, USA
Rahul Reddy Department of Computer and Information Science, Herbert Wertheim College of Engineering, University of Florida, Gainesville, FL, 32611, USA
Ke Xu Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, 2004 Mowry Road, Clinical and Translational Research Building, Gainesville, FL, 32610, USA
Leslie Parker Department of Biobehavioral Nursing Science, University of Florida College of Nursing, Gainesville, FL, 32603, USA
Chris Harle Health Policy and Management Department, Richard M. Fairbanks School of Public Health, Indiana University-Purdue University Indianapolis, Indianapolis, IN, 46202, USA
Bridget Young Division of Breastfeeding and Lactation Medicine, University of Rochester Medical Center, Rochester, NY, 14642, USA
Adetola Louis-Jaques Department of Obstetrics and Gynecology, University of Florida College of Medicine, Gainesville, FL, 32610, USA
Bouri Zhang Health Science Center Libraries, University of Florida, Gainesville, FL, 32610, USA
Lindsay Thompson Department of Pediatrics, Wake Forest School of Medicine, Winston-Salem, NC, 27101, USA
William R Hogan Data Science Institute, Medical College of Wisconsin, Milwaukee, WI, 53226, USA
François Modave Department of Anesthesiology, University of Florida College of Medicine, Gainesville, FL, 32610, USA

Levites Strekalova YA, Wang X, Sanchez O, Midence S. Trends in publication and levels of social determinants of health reporting in Journal of Clinical and Translational Science from 2017 to 2023. J Clin Transl Sci 2024;8:e58. [PMID: 38655458 PMCID: PMC11036436 DOI: 10.1017/cts.2024.508] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Revised: 03/13/2024] [Accepted: 03/19/2024] [Indexed: 04/26/2024] Open

Clarke H, Fitzcharles MA. Are Electronic Health Records Sufficiently Accurate to Phenotype Rheumatology Patients With Chronic Pain? J Rheumatol 2024;51:218-220. [PMID: 38224990 DOI: 10.3899/jrheum.2023-1227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/17/2024]

Al-Sahab B, Leviton A, Loddenkemper T, Paneth N, Zhang B. Biases in Electronic Health Records Data for Generating Real-World Evidence: An Overview. Journal of Healthcare Informatics Research 2024;8:121-139. [PMID: 38273982 PMCID: PMC10805748 DOI: 10.1007/s41666-023-00153-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Revised: 09/05/2023] [Accepted: 11/07/2023] [Indexed: 01/27/2024]

Acharya A, Shrestha S, Chen A, Conte J, Avramovic S, Sikdar S, Anastasopoulos A, Das S. Clinical risk prediction using language models: benefits and considerations. J Am Med Inform Assoc 2024:ocae030. [PMID: 38412328 DOI: 10.1093/jamia/ocae030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2023] [Revised: 01/11/2024] [Accepted: 02/03/2024] [Indexed: 02/29/2024] Open

Kashkoush J, Gupta M, Meissner MA, Nielsen ME, Kirchner HL, Garg T. Performance Characteristics of a Rule-Based Electronic Health Record Algorithm to Identify Patients with Gross and Microscopic Hematuria. Methods Inf Med 2023;62:183-192. [PMID: 37666279 DOI: 10.1055/a-2165-5552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/06/2023]

Chen Q, Dwaraka VB, Carreras-Gallo N, Mendez K, Chen Y, Begum S, Kachroo P, Prince N, Went H, Mendez T, Lin A, Turner L, Moqri M, Chu SH, Kelly RS, Weiss ST, Rattray NJ, Gladyshev VN, Karlson E, Wheelock C, Mathé EA, Dahlin A, McGeachie MJ, Smith R, Lasky-Su JA. OMICmAge: An integrative multi-omics approach to quantify biological age with electronic medical records. bioRxiv: The Preprint Server for Biology 2023:2023.10.16.562114. [PMID: 37904959 PMCID: PMC10614756 DOI: 10.1101/2023.10.16.562114] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 11/02/2023]

Affiliation(s)

Qingwen Chen Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
Varun B. Dwaraka TruDiagnostic, Inc., Lexington, KY USA
Natàlia Carreras-Gallo TruDiagnostic, Inc., Lexington, KY USA
Kevin Mendez Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
Yulu Chen Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
Sofina Begum Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
Priyadarshini Kachroo Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
Nicole Prince Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
Hannah Went TruDiagnostic, Inc., Lexington, KY USA
Tavis Mendez TruDiagnostic, Inc., Lexington, KY USA
Aaron Lin TruDiagnostic, Inc., Lexington, KY USA
Logan Turner TruDiagnostic, Inc., Lexington, KY USA
Mahdi Moqri Division of Genetics, Dept. of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA Department of Genetics, School of Medicine, Stanford University, Stanford, CA, USA
Su H. Chu Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
Rachel S. Kelly Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
Scott T. Weiss Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
Nicholas J.W. Rattray Strathclyde Institute of Pharmacy and Biomedical Sciences, University of Strathclyde, Glasgow, UK Strathclyde Centre for Molecular Bioscience, University of Strathclyde, Glasgow, UK
Vadim N. Gladyshev Division of Genetics, Dept. of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
Elizabeth Karlson Department of Personalized Medicine, Mass General Brigham and Harvard Medical School, Boston, MA, USA
Craig Wheelock Division of Physiological Chemistry 2, Dept of Medical Biochemistry and Biophysics, Karolinska Institute, Stockholm, Sweden
Ewy A. Mathé Division of Preclinical Innovation, National Center for Advancing Translational Science, National Institutes of Health, Rockville, MD, USA
Amber Dahlin Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
Michael J. McGeachie Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
Ryan Smith TruDiagnostic, Inc., Lexington, KY USA
Jessica A. Lasky-Su Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA

Nealon CL, Halladay CW, Gorman BR, Simpson P, Roncone DP, Canania RL, Anthony SA, Rogers LRS, Leber JN, Dougherty JM, Bailey JNC, Crawford DC, Sullivan JM, Galor A, Wu WC, Greenberg PB, Lass JH, Iyengar SK, Peachey NS. Association Between Fuchs Endothelial Corneal Dystrophy, Diabetes Mellitus, and Multimorbidity. Cornea 2023;42:1140-1149. [PMID: 37170406 PMCID: PMC10523841 DOI: 10.1097/ico.0000000000003311] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/09/2022] [Accepted: 04/11/2023] [Indexed: 05/13/2023]

Affiliation(s)

Cari L. Nealon Eye Clinic, VA Northeast Ohio Healthcare System, Cleveland, Ohio, USA
Christopher W. Halladay Center of Innovation in Long Term Services and Supports, Providence VA Medical Center, Providence, Rhode Island, USA
Bryan R. Gorman VA Cooperative Studies Program, VA Boston Healthcare System, Boston, Massachusetts Booz Allen Hamilton, McLean, Virginia, USA
Piana Simpson Eye Clinic, VA Northeast Ohio Healthcare System, Cleveland, Ohio, USA
David P. Roncone Eye Clinic, VA Northeast Ohio Healthcare System, Cleveland, Ohio, USA
Rachael L. Canania Eye Clinic, VA Northeast Ohio Healthcare System, Cleveland, Ohio, USA
Scott A. Anthony: Eye Clinic, VA Northeast Ohio Healthcare System, Cleveland, Ohio, USA
Lea R. Sawicki Rogers: Ophthalmology Section, VA Western NY Health Care System, Buffalo, New York, USA
Jenna N. Leber: Ophthalmology Section, VA Western NY Health Care System, Buffalo, New York, USA
Jacquelyn M. Dougherty: Ophthalmology Section, VA Western NY Health Care System, Buffalo, New York, USA
Jessica N. Cooke Bailey: Cleveland Institute for Computational Biology, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA; Department of Population & Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA; Research Service, VA Northeast Ohio Healthcare System, Cleveland, Ohio, USA
Dana C. Crawford: Cleveland Institute for Computational Biology, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA; Department of Population & Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA; Research Service, VA Northeast Ohio Healthcare System, Cleveland, Ohio, USA
Jack M. Sullivan: Ophthalmology Section, VA Western NY Health Care System, Buffalo, New York, USA; Research Service, VA Western NY Health Care System, Buffalo, New York, USA; Department of Ophthalmology (Ross Eye Institute), University at Buffalo-SUNY, Buffalo, New York, USA
Anat Galor: Miami Veterans Affairs Medical Center, Miami, Florida, USA; Bascom Palmer Eye Institute, University of Miami, Miami, Florida, USA
Wen-Chih Wu: Cardiology Section, Medical Service, Providence VA Medical Center, Providence, Rhode Island, USA
Paul B. Greenberg: Ophthalmology Section, Providence VA Medical Center, Providence, Rhode Island, USA; Division of Ophthalmology, Alpert Medical School, Brown University, Providence, Rhode Island, USA
Million Veteran Program
Jonathan H. Lass: Department of Ophthalmology & Visual Sciences, Case Western Reserve University, Cleveland, Ohio, USA; University Hospitals Eye Institute, Cleveland, Ohio, USA
Sudha K. Iyengar: Cleveland Institute for Computational Biology, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA; Department of Population & Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA; Research Service, VA Northeast Ohio Healthcare System, Cleveland, Ohio, USA
Neal S. Peachey: Research Service, VA Northeast Ohio Healthcare System, Cleveland, Ohio, USA; Cole Eye Institute, Cleveland Clinic Foundation, Cleveland, Ohio, USA; Department of Ophthalmology, Cleveland Clinic Lerner College of Medicine of Case Western Reserve University, Cleveland, Ohio, USA

Sathe NA, Xian S, Mabrey FL, Crosslin DR, Mooney SD, Morrell ED, Lybarger K, Yetisgen M, Jarvik GP, Bhatraju PK, Wurfel MM. Evaluating construct validity of computable acute respiratory distress syndrome definitions in adults hospitalized with COVID-19: an electronic health records based approach. BMC Pulm Med 2023;23:292. [PMID: 37559024 PMCID: PMC10413524 DOI: 10.1186/s12890-023-02560-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2023] [Accepted: 07/11/2023] [Indexed: 08/11/2023]

Abstract

BACKGROUND

Evolving ARDS epidemiology and management during COVID-19 have prompted calls to reexamine the construct validity of Berlin criteria, which have been rarely evaluated in real-world data. We developed a Berlin ARDS definition (EHR-Berlin) computable in electronic health records (EHR) to (1) assess its construct validity, and (2) assess how expanding its criteria affected validity.

METHODS

We performed a retrospective cohort study at two tertiary care hospitals with one EHR, among adults hospitalized with COVID-19 February 2020-March 2021. We assessed five candidate definitions for ARDS: the EHR-Berlin definition modeled on Berlin criteria, and four alternatives informed by recent proposals to expand criteria and include patients on high-flow oxygen (EHR-Alternative 1), relax imaging criteria (EHR-Alternatives 2-3), and extend timing windows (EHR-Alternative 4). We evaluated two aspects of construct validity for the EHR-Berlin definition: (1) criterion validity: agreement with manual ARDS classification by experts, available in 175 patients; (2) predictive validity: relationships with hospital mortality, assessed by Pearson r and by area under the receiver operating curve (AUROC). We assessed predictive validity and timing of identification of EHR-Berlin definition compared to alternative definitions.

RESULTS

Among 765 patients, mean (SD) age was 57 (18) years and 471 (62%) were male. The EHR-Berlin definition classified 171 (22%) patients as ARDS, which had high agreement with manual classification (kappa 0.85), and was associated with mortality (Pearson r = 0.39; AUROC 0.72, 95% CI 0.68, 0.77). In comparison, EHR-Alternative 1 classified 219 (29%) patients as ARDS, maintained similar relationships to mortality (r = 0.40; AUROC 0.74, 95% CI 0.70, 0.79, Delong test P = 0.14), and identified patients earlier in their hospitalization (median 13 vs. 15 h from admission, Wilcoxon signed-rank test P < 0.001). EHR-Alternative 3, which removed imaging criteria, had similar correlation (r = 0.41) but better discrimination for mortality (AUROC 0.76, 95% CI 0.72, 0.80; P = 0.036), and identified patients median 2 h (P < 0.001) from admission.

CONCLUSIONS

The EHR-Berlin definition can enable ARDS identification with high criterion validity, supporting large-scale study and surveillance. There are opportunities to expand the Berlin criteria that preserve predictive validity and facilitate earlier identification.
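
The agreement statistic reported above (kappa between the computable definition and expert classification) can be illustrated with a minimal sketch. The function name and the 2x2 counts below are hypothetical and are not taken from the study; the abstract does not report the underlying agreement table.

```python
def cohen_kappa(both_pos, algo_only, expert_only, both_neg):
    """Cohen's kappa for a 2x2 agreement table between two binary raters.

    both_pos:    algorithm and expert both classify the patient as ARDS
    algo_only:   only the algorithm classifies the patient as ARDS
    expert_only: only the expert classifies the patient as ARDS
    both_neg:    neither classifies the patient as ARDS
    """
    n = both_pos + algo_only + expert_only + both_neg
    if n == 0:
        raise ValueError("agreement table is empty")
    observed = (both_pos + both_neg) / n
    algo_pos = (both_pos + algo_only) / n
    expert_pos = (both_pos + expert_only) / n
    # Agreement expected by chance if the two raters were independent.
    expected = algo_pos * expert_pos + (1 - algo_pos) * (1 - expert_pos)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical counts for illustration only (not the study's data).
kappa = cohen_kappa(both_pos=40, algo_only=5, expert_only=5, both_neg=50)
print(round(kappa, 3))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is usually preferred over simple percent agreement when one class is much more common than the other.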

Penrod N, Okeh C, Velez Edwards DR, Barnhart K, Senapati S, Verma SS. Leveraging electronic health record data for endometriosis research. Front Digit Health 2023;5:1150687. [PMID: 37342866 PMCID: PMC10278662 DOI: 10.3389/fdgth.2023.1150687] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2023] [Accepted: 05/10/2023] [Indexed: 06/23/2023]

Deutsch AJ, Stalbow L, Majarian TD, Mercader JM, Manning AK, Florez JC, Loos RJ, Udler MS. Polygenic Scores Help Reduce Racial Disparities in Predictive Accuracy of Automated Type 1 Diabetes Classification Algorithms. Diabetes Care 2023;46:794-800. [PMID: 36745605 PMCID: PMC10090893 DOI: 10.2337/dc22-1833] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2022] [Accepted: 01/10/2023] [Indexed: 02/07/2023]

Affiliation(s)

Aaron J. Deutsch: Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA; Programs in Metabolism and Medical & Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA; Department of Medicine, Harvard Medical School, Boston, MA
Lauren Stalbow: Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY
Timothy D. Majarian: Programs in Metabolism and Medical & Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
Josep M. Mercader: Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA; Programs in Metabolism and Medical & Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA; Department of Medicine, Harvard Medical School, Boston, MA
Alisa K. Manning: Programs in Metabolism and Medical & Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA; Department of Medicine, Harvard Medical School, Boston, MA; Clinical and Translational Epidemiology Unit, Mongan Institute, Massachusetts General Hospital, Boston, MA
Jose C. Florez: Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA; Programs in Metabolism and Medical & Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA; Department of Medicine, Harvard Medical School, Boston, MA
Ruth J.F. Loos: Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY; Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
Miriam S. Udler: Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA; Programs in Metabolism and Medical & Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA; Department of Medicine, Harvard Medical School, Boston, MA

Roy S, Bruehl S, Feng X, Shotwell MS, Van De Ven T, Shaw AD, Kertai MD. Developing a risk stratification tool for predicting opioid-related respiratory depression after non-cardiac surgery: a retrospective study. BMJ Open 2022;12:e064089. [PMID: 36219738 PMCID: PMC9445779 DOI: 10.1136/bmjopen-2022-064089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]

Abstract

OBJECTIVES

Accurately assessing the probability of significant respiratory depression following opioid administration can potentially enhance perioperative risk assessment and pain management. We developed and validated a risk prediction tool to estimate the probability of significant respiratory depression (indexed by naloxone administration) in patients undergoing noncardiac surgery.

DESIGN

Retrospective cohort study.

SETTING

Single academic centre.

PARTICIPANTS

We studied n=63 084 patients (mean age 47.1±18.2 years; 50% men) who underwent emergency or elective non-cardiac surgery between 1 January 2007 and 30 October 2017.

INTERVENTIONS

A derivation subsample reflecting two-thirds of available patients (n=42 082) was randomly selected for model development, and associations were identified between predictor variables and naloxone administration occurring within 5 days following surgery. The resulting probability model for predicting naloxone administration was then cross-validated in a separate validation cohort reflecting the remaining one-third of patients (n=21 002).

RESULTS

The rate of naloxone administration was identical in the derivation (n=2720 (6.5%)) and validation (n=1360 (6.5%)) cohorts. The risk prediction model identified female sex (OR: 3.01; 95% CI: 2.73 to 3.32), high-risk surgical procedures (OR: 4.16; 95% CI: 3.78 to 4.58), history of drug abuse (OR: 1.81; 95% CI: 1.52 to 2.16) and any opioids being administered on a scheduled rather than as-needed basis (OR: 8.31; 95% CI: 7.26 to 9.51) as risk factors for naloxone administration. Advanced age (OR: 0.971; 95% CI: 0.968 to 0.973), opioids administered via patient-controlled analgesia pump (OR: 0.55; 95% CI: 0.49 to 0.62) and any scheduled non-opioids (OR: 0.63; 95% CI: 0.58 to 0.69) were associated with decreased risk of naloxone administration. An overall risk prediction model incorporating the common clinically available variables above displayed excellent discriminative ability in both the derivation and validation cohorts (c-index=0.820 and 0.814, respectively).

CONCLUSION

Our cross-validated clinical predictive model accurately estimates the risk of serious opioid-related respiratory depression requiring naloxone administration in postoperative patients.
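
The derivation/validation design and the c-index described above can be sketched roughly as follows. This is an illustrative sketch only: the function names, the seed, and the toy risks and outcomes are assumptions, and the abstract does not describe how the authors drew their random subsample or computed discrimination.

```python
import random


def split_derivation_validation(ids, derivation_fraction=2 / 3, seed=0):
    """Randomly split patient identifiers into derivation and validation sets."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(ids)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * derivation_fraction)
    return shuffled[:cut], shuffled[cut:]


def c_index(risks, outcomes):
    """Concordance index for a binary outcome.

    The fraction of (event, non-event) pairs in which the patient with the
    event received the higher predicted risk; ties count as half.
    """
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Toy data for illustration only (not the study's data).
derivation, validation = split_derivation_validation(range(9))
toy_risks = [0.9, 0.4, 0.3, 0.2, 0.6]
toy_outcomes = [1, 1, 0, 0, 0]  # 1 = naloxone administered
print(len(derivation), len(validation), round(c_index(toy_risks, toy_outcomes), 3))
```

For a binary outcome the c-index equals the area under the receiver operating characteristic curve, so values near 0.5 indicate chance-level discrimination and values near 1 indicate near-perfect ranking.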

Avery CL, Howard AG, Ballou AF, Buchanan VL, Collins JM, Downie CG, Engel SM, Graff M, Highland HM, Lee MP, Lilly AG, Lu K, Rager JE, Staley BS, North KE, Gordon-Larsen P. Strengthening Causal Inference in Exposomics Research: Application of Genetic Data and Methods. Environmental Health Perspectives 2022;130:55001. [PMID: 35533073 PMCID: PMC9084332 DOI: 10.1289/ehp9098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023]

Abstract

Advances in technologies to measure a broad set of exposures have led to a range of exposome research efforts. Yet, these efforts have insufficiently integrated methods that incorporate genetic data to strengthen causal inference, despite evidence that many exposome-associated phenotypes are heritable.

Objective: We demonstrate how integration of methods and study designs that incorporate genetic data can strengthen causal inference in exposomics research by helping address six challenges: reverse causation and unmeasured confounding, comprehensive examination of phenotypic effects, low efficiency, replication, multilevel data integration, and characterization of tissue-specific effects. Examples are drawn from studies of biomarkers and health behaviors, exposure domains where the causal inference methods we describe are most often applied.

Discussion: Technological, computational, and statistical advances in genotyping, imputation, and analysis, combined with broad data sharing and cross-study collaborations, offer multiple opportunities to strengthen causal inference in exposomics research. Full application of these opportunities will require an expanded understanding of genetic variants that predict exposome phenotypes as well as an appreciation that the utility of genetic variants for causal inference will vary by exposure and may depend on large sample sizes. However, several of these challenges can be addressed through international scientific collaborations that prioritize data sharing. Ultimately, we anticipate that efforts to better integrate methods that incorporate genetic data will extend the reach of exposomics research by helping address the challenges of comprehensively measuring the exposome and its health effects across studies, the life course, and in varied contexts and diverse populations. https://doi.org/10.1289/EHP9098.

Affiliation(s)

Christy L Avery: Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA; Carolina Population Center, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
Annie Green Howard: Department of Biostatistics, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA; Carolina Population Center, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
Anna F Ballou: Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
Victoria L Buchanan: Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
Jason M Collins: Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
Carolina G Downie: Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
Stephanie M Engel: Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
Mariaelisa Graff: Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
Heather M Highland: Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
Moa P Lee: Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
Adam G Lilly: Carolina Population Center, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA; Department of Sociology, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
Kun Lu: Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
Julia E Rager: Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
Brooke S Staley: Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
Kari E North: Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
Penny Gordon-Larsen: Department of Nutrition, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA; Carolina Population Center, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA

Almowil Z, Zhou SM, Brophy S, Croxall J. Concept Libraries for Repeatable and Reusable Research: Qualitative Study Exploring the Needs of Users. JMIR Hum Factors 2022;9:e31021. [PMID: 35289755 PMCID: PMC8965669 DOI: 10.2196/31021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2021] [Revised: 11/17/2021] [Accepted: 12/05/2021] [Indexed: 12/05/2022]

Abstract

Background

Big data research in the field of health sciences is hindered by a lack of agreement on how to identify and define different conditions and their medications. This means that researchers and health professionals often have different phenotype definitions for the same condition. This lack of agreement makes it difficult to compare different study findings and hinders the ability to conduct repeatable and reusable research.

Objective

This study aims to examine the requirements of various users, such as researchers, clinicians, machine learning experts, and managers, in the development of a data portal for phenotypes (a concept library).

Methods

This was a qualitative study using interviews and focus group discussion. One-to-one interviews were conducted with researchers, clinicians, machine learning experts, and senior research managers in health data science (N=6) to explore their specific needs in the development of a concept library. In addition, a focus group discussion with researchers (N=14) working with the Secured Anonymized Information Linkage databank, a national eHealth data linkage infrastructure, was held to perform a SWOT (strengths, weaknesses, opportunities, and threats) analysis for the phenotyping system and the proposed concept library. The interviews and focus group discussion were transcribed verbatim, and 2 thematic analyses were performed.

Results

Most of the participants thought that the prototype concept library would be a very helpful resource for conducting repeatable research, but they specified that many requirements are needed before its development. Although all the participants stated that they were aware of some existing concept libraries, most of them expressed negative perceptions about them. The participants mentioned several facilitators that would stimulate them to share their work and reuse the work of others, and they pointed out several barriers that could inhibit them from sharing their work and reusing the work of others. The participants suggested some developments that they would like to see to improve reproducible research output using routine data.

Conclusions

The study indicated that most interviewees valued a concept library for phenotypes. However, only half of the participants felt that they would contribute by providing definitions for the concept library, and they reported many barriers regarding sharing their work on a publicly accessible platform. Analysis of interviews and the focus group discussion revealed that different stakeholders have different requirements, facilitators, barriers, and concerns about a prototype concept library.

Cereceda K, Jorquera R, Villarroel-Espíndola F. Advances in mass cytometry and its applicability to digital pathology in clinical-translational cancer research. Advances in Laboratory Medicine 2022;3:5-29. [PMID: 37359436 PMCID: PMC10197474 DOI: 10.1515/almed-2021-0075] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/03/2021] [Accepted: 07/16/2021] [Indexed: 06/28/2023]

Seedahmed MI, Mogilnicka I, Zeng S, Luo G, Whooley MA, McCulloch CE, Koth L, Arjomandi M. Performance of a Computational Phenotyping Algorithm for Sarcoidosis Using Diagnostic Codes in Electronic Medical Records: A Pilot Study from Two Veterans Affairs Medical Centers. JMIR Form Res 2022;6:e31615. [PMID: 35081036 PMCID: PMC8928044 DOI: 10.2196/31615] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/29/2021] [Revised: 01/24/2022] [Accepted: 01/24/2022] [Indexed: 11/29/2022]

Abstract

Background

Electronic medical records (EMRs) offer the promise of computationally identifying sarcoidosis cases. However, the accuracy of identifying these cases in the EMR is unknown.

Objective

The aim of this study is to determine the statistical performance of using the International Classification of Diseases (ICD) diagnostic codes to identify patients with sarcoidosis in the EMR.

Methods

We used the ICD diagnostic codes to identify sarcoidosis cases by searching the EMRs of the San Francisco and Palo Alto Veterans Affairs medical centers and randomly selecting 200 patients. To improve the diagnostic accuracy of the computational algorithm in cases where histopathological data are unavailable, we developed an index of suspicion to identify cases with a high index of suspicion for sarcoidosis (confirmed and probable) based on clinical and radiographic features alone using the American Thoracic Society practice guideline. Through medical record review, we determined the positive predictive value (PPV) of diagnosing sarcoidosis by two computational methods: using ICD codes alone and using ICD codes plus the high index of suspicion.

Results

Among the 200 patients, 158 (79%) had a high index of suspicion for sarcoidosis. Of these 158 patients, 142 (89.9%) had documentation of nonnecrotizing granuloma, confirming biopsy-proven sarcoidosis. The PPV of using ICD codes alone was 79% (95% CI 78.6%-80.5%) for identifying sarcoidosis cases and 71% (95% CI 64.7%-77.3%) for identifying histopathologically confirmed sarcoidosis in the EMRs. The inclusion of the generated high index of suspicion to identify confirmed sarcoidosis cases increased the PPV significantly to 100% (95% CI 96.5%-100%). Histopathology documentation alone was 90% sensitive compared with high index of suspicion.

Conclusions

ICD codes are reasonable classifiers for identifying sarcoidosis cases within EMRs with a PPV of 79%. Using a computational algorithm to capture index of suspicion data elements could significantly improve the case-identification accuracy.
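
As a worked illustration of the positive predictive value figures above, the sketch below computes a PPV with a normal-approximation (Wald) 95% confidence interval. The choice of the Wald method is an assumption: the abstract does not state how its intervals were derived. With 142 histopathologically confirmed cases among 200 code-identified patients, this approach happens to reproduce the reported 64.7%-77.3% interval; it is not offered as a reconstruction of the other reported intervals.

```python
import math


def ppv_with_wald_ci(true_positives, flagged, z=1.96):
    """Positive predictive value with a normal-approximation (Wald) interval.

    true_positives: flagged patients confirmed on chart review
    flagged:        all patients flagged by the algorithm
    z:              critical value (1.96 for a 95% interval)
    """
    if flagged <= 0:
        raise ValueError("flagged must be positive")
    ppv = true_positives / flagged
    half_width = z * math.sqrt(ppv * (1 - ppv) / flagged)
    # Clamp to [0, 1] because the Wald interval can overshoot near the bounds.
    return ppv, max(0.0, ppv - half_width), min(1.0, ppv + half_width)


# Counts taken from the Results paragraph: 142 confirmed of 200 reviewed.
ppv, low, high = ppv_with_wald_ci(142, 200)
print(f"PPV {ppv:.0%} (95% CI {low:.1%}-{high:.1%})")
# prints "PPV 71% (95% CI 64.7%-77.3%)"
```

The Wald interval behaves poorly when the proportion is close to 0 or 1 or the sample is small; an exact or score-based interval is usually a safer choice in those cases.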

Sulieman L, Cronin RM, Carroll RJ, Natarajan K, Marginean K, Mapes B, Roden D, Harris P, Ramirez A. OUP accepted manuscript. J Am Med Inform Assoc 2022;29:1131-1141. [PMID: 35396991 PMCID: PMC9196700 DOI: 10.1093/jamia/ocac046] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/14/2021] [Revised: 02/18/2022] [Accepted: 03/23/2022] [Indexed: 11/13/2022]

Barajas R, Hair B, Lai G, Rotunno M, Shams-White MM, Gillanders EM, Mechanic LE. Facilitating cancer systems epidemiology research. PLoS One 2022;16:e0255328. [PMID: 34972102 PMCID: PMC8719747 DOI: 10.1371/journal.pone.0255328] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/19/2022]

Association of step counts over time with the risk of chronic disease in the All of Us Research Program. Nat Med 2022;28:2301-2308. [PMID: 36216933 PMCID: PMC9671804 DOI: 10.1038/s41591-022-02012-w] [Citation(s) in RCA: 50] [Impact Index Per Article: 25.0] [Received: 03/28/2022] [Accepted: 08/15/2022] [Indexed: 01/14/2023]

Greer ML, Davis K, Stack BC. Machine learning can identify patients at risk of hyperparathyroidism without known calcium and intact parathyroid hormone. Head Neck 2021;44:817-822. [PMID: 34953008 DOI: 10.1002/hed.26970] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/29/2021] [Revised: 11/01/2021] [Accepted: 12/16/2021] [Indexed: 01/16/2023]

Wyatt B, Perumalswami PV, Mageras A, Miller M, Harty A, Ma N, Bowman CA, Collado F, Jeon J, Paulino L, Dinani A, Dieterich D, Li L, Vandromme M, Branch AD. A Digital Case-Finding Algorithm for Diagnosed but Untreated Hepatitis C: A Tool for Increasing Linkage to Treatment and Cure. Hepatology 2021;74:2974-2987. [PMID: 34333777 PMCID: PMC9299620 DOI: 10.1002/hep.32086] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/03/2021] [Revised: 06/29/2021] [Accepted: 07/22/2021] [Indexed: 12/20/2022]

Abstract

BACKGROUND AND AIMS

Although chronic HCV infection increases mortality, thousands of patients remain diagnosed-but-untreated (DBU). We aimed to (1) develop a DBU phenotyping algorithm, (2) use it to facilitate case finding and linkage to care, and (3) identify barriers to successful treatment.

APPROACH AND RESULTS

We developed a phenotyping algorithm using Java and SQL and applied it to ~2.5 million EPIC electronic medical records (EMRs; data entered January 2003 to December 2017). Approximately 72,000 EMRs contained an HCV International Classification of Diseases code and/or diagnostic test. The algorithm classified 10,614 cases as DBU (HCV-RNA positive and alive). Its positive and negative predictive values were 88% and 97%, respectively, as determined by manual review of 500 EMRs randomly selected from the ~72,000. Navigators reviewed the charts of 6,187 algorithm-defined DBUs and they attempted to contact potential treatment candidates by phone. By June 2020, 30% (n = 1,862) had completed an HCV-related appointment. Outcomes analysis revealed that DBU patients enrolled in our care coordination program were more likely to complete treatment (72% [n = 219] vs. 54% [n = 256]; P < 0.001) and to have a verified sustained virological response (67% vs. 46%; P < 0.001) than other patients. Forty-eight percent (n = 2,992) of DBU patients could not be reached by phone, which was a major barrier to engagement. Nearly half of these patients had Fibrosis-4 scores ≥ 2.67, indicating significant fibrosis. Multivariable logistic regression showed that DBUs who could not be contacted were less likely to have private insurance than those who could (18% vs. 50%; P < 0.001).

CONCLUSIONS

The digital DBU case-finding algorithm efficiently identified potential HCV treatment candidates, freeing resources for navigation and coordination. The algorithm is portable and accelerated HCV elimination when incorporated in our comprehensive program.
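
The two-stage logic summarized above (records with an HCV code and/or diagnostic test, then classified as DBU when HCV-RNA positive and alive) might be expressed as a simple rule, sketched below. The authors report using Java and SQL against an EPIC system; this Python sketch is not their implementation, and every field and function name here is hypothetical. The abstract does not describe how treatment history or the most recent test result was determined, so those details are not modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PatientRecord:
    """Hypothetical, simplified view of one patient's EMR data."""
    patient_id: str
    has_hcv_icd_code: bool
    has_hcv_test: bool
    latest_hcv_rna_positive: Optional[bool]  # None if no RNA result on file
    is_alive: bool


def is_candidate(record: PatientRecord) -> bool:
    """Stage 1: an HCV diagnostic code and/or an HCV diagnostic test is present."""
    return record.has_hcv_icd_code or record.has_hcv_test


def is_dbu(record: PatientRecord) -> bool:
    """Stage 2: candidate who is HCV-RNA positive and alive."""
    return (
        is_candidate(record)
        and record.latest_hcv_rna_positive is True
        and record.is_alive
    )


# Made-up records for illustration only.
records = [
    PatientRecord("A", True, True, True, True),      # code, RNA positive, alive
    PatientRecord("B", True, False, None, True),     # code but no RNA result
    PatientRecord("C", False, True, False, True),    # tested, RNA negative
    PatientRecord("D", False, False, True, True),    # no code or test flag
    PatientRecord("E", True, True, True, False),     # RNA positive but deceased
]
dbu_ids = [r.patient_id for r in records if is_dbu(r)]
print(dbu_ids)
```

Keeping the candidate filter separate from the DBU rule mirrors the abstract's distinction between the ~72,000 records with HCV evidence and the 10,614 classified as DBU, and makes each stage easy to validate by chart review.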

Daniels H, Jones KH, Heys S, Ford DV. Exploring the Use of Genomic and Routinely Collected Data: Narrative Literature Review and Interview Study. J Med Internet Res 2021;23:e15739. [PMID: 34559060 PMCID: PMC8501405 DOI: 10.2196/15739] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2019] [Revised: 10/01/2020] [Accepted: 07/15/2021] [Indexed: 11/13/2022]

Abstract

Background

Advancing the use of genomic data with routinely collected health data holds great promise for health care and research. Increasing the use of these data is a high priority to understand and address the causes of disease.

Objective

This study aims to provide an outline of the use of genomic data alongside routinely collected data in health research to date. As this field prepares to move forward, it is important to take stock of the current state of play in order to highlight new avenues for development, identify challenges, and ensure that adequate data governance models are in place for safe and socially acceptable progress.

Methods

We conducted a literature review to draw information from past studies that have used genomic and routinely collected data and conducted interviews with individuals who use these data for health research. We collected data on the following: the rationale of using genomic data in conjunction with routinely collected data, types of genomic and routinely collected data used, data sources, project approvals, governance and access models, and challenges encountered.

Results

The main purpose of using genomic and routinely collected data was to conduct genome-wide and phenome-wide association studies. Routine data sources included electronic health records, disease and death registries, health insurance systems, and deprivation indices. The types of genomic data included polygenic risk scores, single nucleotide polymorphisms, and measures of genetic activity, and biobanks generally provided these data. Although the literature search showed that biobanks released data to researchers, the case studies revealed a growing tendency for use within a data safe haven. Challenges of working with these data revolved around data collection, data storage, technical, and data privacy issues.

Conclusions

Using genomic and routinely collected data holds great promise for progressing health research. Several challenges are involved, particularly in terms of privacy. Overcoming these barriers will ensure that the use of these data to progress health research can be exploited to its full potential.

Almowil ZA, Zhou SM, Brophy S. Concept libraries for automatic electronic health record based phenotyping: A review. Int J Popul Data Sci 2021;6:1362. [PMID: 34189274 PMCID: PMC8210840 DOI: 10.23889/ijpds.v5i1.1362] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2022]

Abstract

Introduction

Electronic health records (EHR) are linked together to examine disease history and to undertake research into the causes and outcomes of disease. However, the process of constructing algorithms for phenotyping (e.g., identifying disease characteristics) or health characteristics (e.g., smoker) is very time consuming and resource costly. In addition, results can vary greatly between researchers. Reusing or building on algorithms that others have created is a compelling solution to these problems. However, sharing algorithms is not a common practice and many published studies do not detail the clinical code lists used by the researchers in the disease/characteristic definition. To address these challenges, a number of centres across the world have developed health data portals which contain concept libraries (e.g., algorithms for defining concepts such as disease and characteristics) in order to facilitate disease phenotyping and health studies.

Objectives

This study aims to review the literature of existing concept libraries, examine their utilities, identify the current gaps, and suggest future developments.

Methods

The five-stage framework of Arksey and O'Malley was used for the literature search. This approach included defining the research questions, identifying relevant studies through literature review, selecting eligible studies, charting and extracting data, and summarising and reporting the findings.

Results

This review identified seven publicly accessible Electronic Health data concept libraries which were developed in different countries including UK, USA, and Canada. The concept libraries (n = 7) investigated were either general libraries that hold phenotypes of multiple specialties (n = 4) or specialized libraries that manage only certain specialities such as rare diseases (n = 3). There were some clear differences between the general libraries such as archiving data from different electronic sources, and using a range of different types of coding systems. However, they share some clear similarities such as enabling users to upload their own code lists, and allowing users to use/download the publicly accessible code. In addition, there were some differences between the specialized libraries such as difference in ability to search, and if it was possible to use different searching queries such as simple or complex searches. Conversely, there were some similarities between the specialized libraries such as enabling users to upload their own concepts into the libraries and to show where they were published, which facilitates assessing the validity of the concepts. All the specialized libraries aimed to encourage the reuse of research methods such as lists of clinical code and/or metadata.

Conclusion

The seven libraries identified have been developed independently and appear to replicate similar concepts but in different ways. Collaboration between similar libraries would greatly facilitate the use of these libraries for the user. The process of building code lists takes time and effort. Access to existing code lists increases consistency and accuracy of definitions across studies. Concept library developers should collaborate with each other to raise awareness of their existence and of their various functions, which could increase users’ contributions to those libraries and promote their wide-ranging adoption.

Tam CS, Gullick J, Saavedra A, Vernon ST, Figtree GA, Chow CK, Cretikos M, Morris RW, William M, Morris J, Brieger D. Combining structured and unstructured data in EMRs to create clinically-defined EMR-derived cohorts. BMC Med Inform Decis Mak 2021;21:91. [PMID: 33685456 PMCID: PMC7938556 DOI: 10.1186/s12911-021-01441-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/08/2020] [Accepted: 02/15/2021] [Indexed: 11/29/2022]

Abstract

Background

There have been few studies describing how production EMR systems can be systematically queried to identify clinically-defined populations and limited studies utilising free-text in this process. The aim of this study is to provide a generalisable methodology for constructing clinically-defined EMR-derived patient cohorts using structured and unstructured data in EMRs.

Methods

Patients with possible acute coronary syndrome (ACS) were used as an exemplar. Cardiologists defined clinical criteria for patients presenting with possible ACS. These were mapped to data tables within the production EMR system, creating seven inclusion criteria comprised of structured data fields (orders and investigations, procedures, scanned electrocardiogram (ECG) images, and diagnostic codes) and unstructured clinical documentation. Data were extracted from two local health districts (LHD) in Sydney, Australia. Outcome measures included examination of the relative contribution of individual inclusion criteria to the identification of eligible encounters, comparisons between inclusion criteria, and evaluation of consistency of data extracts across years and LHDs.

Results

Among 802,742 encounters in a 5-year dataset (1/1/13–30/12/17), the presence of an ECG image (54.8% of encounters) and symptoms and keywords in clinical documentation (41.4–64.0%) were used most often to identify presentations of possible ACS. Orders and investigations (27.3%) and procedures (1.4%) were less often present for identified presentations. Relevant ICD-10/SNOMED CT codes were present for 3.7% of identified encounters. Similar trends were seen when the two LHDs were examined separately, and across years.

Conclusions

Combining structured and unstructured data during cohort identification to create clinically-defined EMR-derived cohorts is a necessary prerequisite for the critical validation work required for development of real-time clinical decision support and learning health systems.

Walters CE, Nitin R, Margulis K, Boorom O, Gustavson DE, Bush CT, Davis LK, Below JE, Cox NJ, Camarata SM, Gordon RL. Automated Phenotyping Tool for Identifying Developmental Language Disorder Cases in Health Systems Data (APT-DLD): A New Research Algorithm for Deployment in Large-Scale Electronic Health Record Systems. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2020;63:3019-3035. [PMID: 32791019 PMCID: PMC7890229 DOI: 10.1044/2020_jslhr-19-00397] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/13/2019] [Revised: 04/23/2020] [Accepted: 05/19/2020] [Indexed: 05/13/2023]

Abstract

Purpose

Data mining algorithms using electronic health records (EHRs) are useful in large-scale population-wide studies to classify etiology and comorbidities (Casey et al., 2016). Here, we apply this approach to developmental language disorder (DLD), a prevalent communication disorder whose risk factors and epidemiology remain largely undiscovered.

Method

We first created a reliable system for manually identifying DLD in EHRs based on speech-language pathologist (SLP) diagnostic expertise. We then developed and validated an automated algorithmic procedure, the Automated Phenotyping Tool for identifying DLD cases in health systems data (APT-DLD), that classifies DLD status for patients within EHRs on the basis of ICD (International Statistical Classification of Diseases and Related Health Problems) codes. APT-DLD was validated in a discovery sample (N = 973) using expert SLP manual phenotype coding as a gold-standard comparison and then applied and further validated in a replication sample of N = 13,652 EHRs.

Results

In the discovery sample, the APT-DLD algorithm correctly classified 98% of DLD cases in concordance with manually coded records in the training set, indicating that APT-DLD successfully mimics a comprehensive chart review. The output of APT-DLD was also validated in relation to independently conducted SLP clinician coding in a subset of records, with a positive predictive value of 95% of cases correctly classified as DLD. We also applied APT-DLD to the replication sample, where it achieved a positive predictive value of 90% in relation to SLP clinician classification of DLD.

Conclusions

APT-DLD is a reliable, valid, and scalable tool for identifying DLD cohorts in EHRs. This new method has promising public health implications for future large-scale epidemiological investigations of DLD and may inform EHR data mining algorithms for other communication disorders.

Supplemental Material https://doi.org/10.23641/asha.12753578.
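For readers unfamiliar with the validation metric quoted in this abstract, the following is a minimal sketch of how a positive predictive value is computed from a chart-review comparison. The function name and the counts in the usage line are hypothetical illustrations and are not taken from the study.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Share of algorithm-flagged cases that expert review confirms.

    true_positives:  flagged by the algorithm and confirmed by the reviewer
    false_positives: flagged by the algorithm but rejected by the reviewer
    """
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("PPV is undefined when no cases were flagged")
    return true_positives / flagged


# Hypothetical counts, chosen only to illustrate the arithmetic.
ppv = positive_predictive_value(true_positives=90, false_positives=10)
print(f"PPV = {ppv:.0%}")
```

Note that PPV depends on how common the condition is among the reviewed records, so values from two samples are only directly comparable when the samples have similar case mixes.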

Solomonides A. Review of Clinical Research Informatics. Yearb Med Inform 2020;29:193-202. [PMID: 32823316 PMCID: PMC7442526 DOI: 10.1055/s-0040-1701988] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/31/2022]

Abstract

OBJECTIVES

Clinical Research Informatics (CRI) declares its scope in its name, but its content, both in terms of the clinical research it supports-and sometimes initiates-and the methods it has developed over time, reach much further than the name suggests. The goal of this review is to celebrate the extraordinary diversity of activity and of results, not as a prize-giving pageant, but in recognition of the field, the community that both serves and is sustained by it, and of its interdisciplinarity and its international dimension.

METHODS

Beyond personal awareness of a range of work commensurate with the author's own research, it is clear that, even with a thorough literature search, a comprehensive review is impossible. Moreover, the field has grown and subdivided to an extent that makes it very hard for one individual to be familiar with every branch or with more than a few branches in any depth. A literature survey was conducted that focused on informatics-related terms in the general biomedical and healthcare literature, and specific concerns ("artificial intelligence", "data models", "analytics", etc.) in the biomedical informatics (BMI) literature. In addition to a selection from the results from these searches, suggestive references within them were also considered.

RESULTS

The substantive sections of the paper-Artificial Intelligence, Machine Learning, and "Big Data" Analytics; Common Data Models, Data Quality, and Standards; Phenotyping and Cohort Discovery; Privacy: Deidentification, Distributed Computation, Blockchain; Causal Inference and Real-World Evidence-provide broad coverage of these active research areas, with, no doubt, a bias towards this reviewer's interests and preferences, landing on a number of papers that stood out in one way or another, or, alternatively, exemplified a particular line of work.

CONCLUSIONS

CRI is thriving, not only in the familiar major centers of research, but more widely, throughout the world. This is not to pretend that the distribution is uniform, but to highlight the potential for this domain to play a prominent role in supporting progress in medicine, healthcare, and wellbeing everywhere. We conclude with the observation that CRI and its practitioners would make apt stewards of the new medical knowledge that their methods will bring forward.

Khalid SI, Omotosho PA, Spagnoli A, Torquati A. Association of Bariatric Surgery With Risk of Fracture in Patients With Severe Obesity. JAMA Netw Open 2020;3:e207419. [PMID: 32520360 PMCID: PMC7287567 DOI: 10.1001/jamanetworkopen.2020.7419] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Indexed: 12/14/2022]

Abstract

IMPORTANCE

Given the complex relationship between body mass index, body composition, and bone density and the correlative nature of the studies that have established the prevailing notion that higher body mass indices may be protective against osteopenia and osteoporosis and, therefore, fracture, the absolute risk of fracture in patients with severe obesity who undergo either Roux-en-Y gastric bypass (RYGB) or sleeve gastrectomy (SG) compared with those who do not undergo bariatric surgery is unknown.

OBJECTIVE

To assess the rates of fractures associated with obesity and compare rates between those who do not undergo bariatric surgery, those who undergo RYGB, and those who undergo SG.

DESIGN, SETTING, AND PARTICIPANTS

In this retrospective multicenter cohort study of Medicare Standard Analytic Files derived from Medicare parts A and B records from January 2004 to December 2014, patients classified as eligible for bariatric surgery using the US Centers for Medicare & Medicaid Services criteria who either did not undergo bariatric surgery or underwent RYGB or SG were exactly matched in a 1:1 fashion based on their age, sex, Elixhauser Comorbidity Index, hypertension, smoking status, nonalcoholic fatty liver disease, hyperlipidemia, type 2 diabetes, osteoporosis, osteoarthritis, and obstructive sleep apnea status. Data were analyzed from November to December 2019.

EXPOSURES

RYGB or SG.

MAIN OUTCOMES AND MEASURES

The primary outcome measured in this study was the odds of fracture overall based on exposure to bariatric surgery. Secondary outcomes included the odds of type of fracture (humerus, radius or ulna, pelvis, hip, vertebrae, and total fractures) based on exposure to bariatric surgery.

RESULTS

A total of 49 113 patients were included and were equally made up of 16 371 bariatric surgery-eligible patients who did not undergo weight loss surgery, 16 371 patients who had undergone RYGB, and 16 371 patients who had undergone SG. Each group consisted of an equal number of 4109 men (25.1%) and 12 262 women (74.9%) and had an equal distribution of ages, with 11 780 patients (72.0%) 64 years or younger, 4230 (25.8%) aged 65 to 69 years, 346 (2.1%) aged 70 to 74 years, and 15 (0.1%) aged 75 to 79 years. Patients undergoing RYGB were found to have no significant difference in odds of fractures compared with bariatric surgery-eligible patients who did not undergo surgery. Patients who had undergone SG were found to have decreased odds of fractures of the humerus (odds ratio [OR], 0.57; 95% CI, 0.45-0.73), radius or ulna (OR, 0.38; 95% CI, 0.25-0.58), hip (OR, 0.49; 95% CI, 0.33-0.74), pelvis (OR, 0.34; 95% CI, 0.18-0.64), vertebrae (OR, 0.60; 95% CI, 0.48-0.74), or fractures in general (OR, 0.53; 95% CI, 0.46-0.62). Compared with patients undergoing SG, patients undergoing RYGB had a significantly greater risk of total fractures (OR, 1.79; 95% CI, 1.55-2.06) and humeral fractures (OR, 1.60; 95% CI, 1.24-2.07).

CONCLUSIONS AND RELEVANCE

In this cohort study, bariatric surgery was associated with a reduced risk of fracture in bariatric surgery-eligible patients. Sleeve gastrectomy might be the best option for weight loss in patients for whom fractures could be a concern, as RYGB may be associated with an increased fracture risk compared with SG.

Wu CS, Luedtke AR, Sadikova E, Tsai HJ, Liao SC, Liu CC, Gau SSF, VanderWeele TJ, Kessler RC. Development and Validation of a Machine Learning Individualized Treatment Rule in First-Episode Schizophrenia. JAMA Netw Open 2020;3:e1921660. [PMID: 32083693 PMCID: PMC7043195 DOI: 10.1001/jamanetworkopen.2019.21660] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 10/07/2019] [Accepted: 12/23/2019] [Indexed: 12/31/2022]

Abstract

Importance

Little guidance exists to date on how to select antipsychotic medications for patients with first-episode schizophrenia.

Objective

To develop a preliminary individualized treatment rule (ITR) for patients with first-episode schizophrenia.

Design, Setting, and Participants

This prognostic study obtained data from Taiwan's National Health Insurance Research Database on patients with prescribed antipsychotic medications, ambulatory claims, or discharge diagnoses of a schizophrenic disorder between January 1, 2005, and December 31, 2011. An ITR was developed by applying a targeted minimum loss-based ensemble machine learning method to predict treatment success from baseline clinical and demographic data in a 70% training sample. The model was validated in the remaining 30% of the sample. The probability of treatment success was estimated for each medication for each patient under the model. The analysis was conducted between July 16, 2018, and July 15, 2019.

Exposures

Fifteen different antipsychotic medications.

Main Outcomes and Measures

Treatment success was defined as not switching medication and not being hospitalized for 12 months.

Results

Among the 32 277 patients in the analysis, the mean (SD) age was 36.7 (14.3) years, and 15 752 (48.8%) were male. In the validation sample, the treatment success rate (SE) was 51.7% (1.0%) under the ITR and was 44.5% (0.5%) in the observed population (Z = 7.1; P < .001). The estimated treatment success if all patients were given a prescription for 1 medication was significantly lower for each of the 13 medications than under the ITR (Z = 4.2-16.8; all P < .001). Aripiprazole (3088 [31.9%]) and amisulpride (2920 [30.2%]) were the medications most often recommended by the ITR. Only 1054 patients (10.9%) received ITR-recommended medications. Observed treatment success, although lower than the success under the ITR, was nonetheless significantly higher than if medications had been randomized (44.5% [SE, 0.55%] vs 41.3% [SE, 0.4%]; Z = 6.9; P < .001), although only marginally higher than if medications had been randomized in their observed population proportions (44.5% [SE, 0.5%] vs 43.5% [SE, 0.4%]; Z = 2.2; P = .03).

Conclusions and Relevance

These results suggest that an ITR may be associated with an increase in the treatment success rate among patients with first-episode schizophrenia, but experimental evaluation is needed to confirm this possibility. If confirmed, model refinement that investigates biomarkers, clinical observations, and patient reports as additional predictors in iterative pragmatic trials would be needed before clinical implementation.

Crawford DC, Lin J, Bailey JNC, Kinzy T, Sedor JR, O’Toole JF, Bush WS. Frequency of ClinVar Pathogenic Variants in Chronic Kidney Disease Patients Surveyed for Return of Research Results at a Cleveland Public Hospital. PACIFIC SYMPOSIUM ON BIOCOMPUTING. PACIFIC SYMPOSIUM ON BIOCOMPUTING 2020;25:575-586. [PMID: 31797629 PMCID: PMC6931908] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]

Abstract

Return of results is not common in research settings as standards are not yet in place for what to return, how to return, and to whom. As a pioneer of large-scale return of research results, the Precision Medicine Initiative Cohort, now known as All of Us, plans to return pharmacogenomic results and variants of clinical significance to its participants starting late 2019. To better understand the local landscape of possibilities regarding return of research results, we assessed the frequency of pathogenic variants and APOL1 renal risk variants in a small diverse cohort of patients with chronic kidney disease (CKD) ascertained from a public hospital in Cleveland, Ohio, and genotyped on the Illumina Infinium MegaEX. Of the 23,720 ClinVar-designated variants directly assayed by the MegaEX, 8,355 (35%) had at least one alternate allele in the 130 participants genotyped. Of these, 18 ClinVar variants deemed pathogenic by multiple submitters with no conflicts in interpretation were distributed across 27 participants. The majority of these pathogenic ClinVar variants (14/18) were associated with autosomal recessive disorders. Of note were four African American carriers of TTR rs76992529 associated with amyloidogenic transthyretin amyloidosis, otherwise known as familial transthyretin amyloidosis (FTA). FTA, an autosomal dominant disorder with variable penetrance, is more common among African-descent populations compared with European-descent populations. Also common in this CKD population were APOL1 renal risk alleles G1 (rs73885319) and G2 (rs71785313), with 60% of the study population carrying at least one renal risk allele. Both pathogenic ClinVar variants and APOL1 renal risk alleles were distributed among participants who wanted actionable genetic results returned, wanted genetic results returned regardless of actionability, and wanted no results returned. Results from this local genetic study highlight challenges in which variants to report, how to interpret them, and the participant's potential for follow-up, only some of the challenges in return of research results likely facing larger studies such as All of Us.

Affiliation(s)

Dana C. Crawford Cleveland Institute for Computational Biology, Case Western Reserve University, Wolstein Research Building, 2103 Cornell Road, Cleveland, OH 44106, USA; Department of Population and Quantitative Health Sciences, Case Western Reserve University, Wolstein Research Building, 2103 Cornell Road, Cleveland, OH 44106, USA; Department of Genetics and Genome Sciences, Case Western Reserve University, Wolstein Research Building, 2103 Cornell Road, Cleveland, OH 44106, USA
John Lin Cleveland Institute for Computational Biology, Case Western Reserve University, Wolstein Research Building, 2103 Cornell Road, Cleveland, OH 44106, USA
Jessica N. Cooke Bailey Cleveland Institute for Computational Biology, Case Western Reserve University, Wolstein Research Building, 2103 Cornell Road, Cleveland, OH 44106, USA; Department of Population and Quantitative Health Sciences, Case Western Reserve University, Wolstein Research Building, 2103 Cornell Road, Cleveland, OH 44106, USA
Tyler Kinzy Cleveland Institute for Computational Biology, Case Western Reserve University, Wolstein Research Building, 2103 Cornell Road, Cleveland, OH 44106, USA
John R. Sedor Department of Physiology and Biophysics, Case Western Reserve University; Department of Nephrology and Hypertension, Glickman Urology and Kidney and Lerner Research Institutes, Cleveland Clinic, Cleveland, OH 44106, USA
John F. O’Toole Department of Nephrology and Hypertension, Glickman Urology and Kidney and Lerner Research Institutes, Cleveland Clinic, Cleveland, OH 44106, USA
William S. Bush Cleveland Institute for Computational Biology, Case Western Reserve University, Wolstein Research Building, 2103 Cornell Road, Cleveland, OH 44106, USA; Department of Population and Quantitative Health Sciences, Case Western Reserve University, Wolstein Research Building, 2103 Cornell Road, Cleveland, OH 44106, USA; Department of Genetics and Genome Sciences, Case Western Reserve University, Wolstein Research Building, 2103 Cornell Road, Cleveland, OH 44106, USA

Claussnitzer M, Cho JH, Collins R, Cox NJ, Dermitzakis ET, Hurles ME, Kathiresan S, Kenny EE, Lindgren CM, MacArthur DG, North KN, Plon SE, Rehm HL, Risch N, Rotimi CN, Shendure J, Soranzo N, McCarthy MI. A brief history of human disease genetics. Nature 2020;577:179-189. [PMID: 31915397 PMCID: PMC7405896 DOI: 10.1038/s41586-019-1879-7] [Citation(s) in RCA: 338] [Impact Index Per Article: 84.5] [Received: 07/16/2019] [Accepted: 11/13/2019] [Indexed: 12/16/2022]

Affiliation(s)

Melina Claussnitzer Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA; Institute of Nutritional Science, University of Hohenheim, Stuttgart, Germany
Judy H Cho Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
Rory Collins Nuffield Department of Population Health (NDPH), University of Oxford, Oxford, UK; UK Biobank, Stockport, UK
Nancy J Cox Vanderbilt Genetics Institute and Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
Emmanouil T Dermitzakis Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland; Health 2030 Genome Center, Geneva, Switzerland
Matthew E Hurles Wellcome Sanger Institute, Hinxton, UK
Sekar Kathiresan Broad Institute of MIT and Harvard, Cambridge, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA; Verve Therapeutics, Cambridge, MA, USA
Eimear E Kenny Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Center for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
Cecilia M Lindgren Broad Institute of MIT and Harvard, Cambridge, MA, USA; Big Data Institute at the Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK; Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK
Daniel G MacArthur Broad Institute of MIT and Harvard, Cambridge, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
Kathryn N North Murdoch Children's Research Institute, Parkville, Victoria, Australia; University of Melbourne, Parkville, Victoria, Australia
Sharon E Plon Departments of Pediatrics and Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA; Texas Children's Cancer Center, Texas Children's Hospital, Houston, TX, USA
Heidi L Rehm Broad Institute of MIT and Harvard, Cambridge, MA, USA; Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA; Department of Pathology, Harvard Medical School, Boston, MA, USA
Neil Risch Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
Charles N Rotimi Center for Research on Genomics and Global Health, National Human Genome Research Institute, Bethesda, MD, USA
Jay Shendure Department of Genome Sciences, University of Washington, Seattle, WA, USA; Brotman Baty Institute for Precision Medicine, Magnuson Health Sciences Building, Seattle, WA, USA; Howard Hughes Medical Institute, Seattle, WA, USA
Nicole Soranzo Wellcome Sanger Institute, Hinxton, UK; Department of Haematology, University of Cambridge, Cambridge, UK
Mark I McCarthy Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK; Oxford Centre for Diabetes, Endocrinology and Metabolism, Oxford, UK; Oxford NIHR Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, John Radcliffe Hospital, Oxford, UK; Human Genetics, Genentech, South San Francisco, CA, USA

Pendergrass SA, Buyske S, Jeff JM, Frase A, Dudek S, Bradford Y, Ambite JL, Avery CL, Buzkova P, Deelman E, Fesinmeyer MD, Haiman C, Heiss G, Hindorff LA, Hsu CN, Jackson RD, Lin Y, Le Marchand L, Matise TC, Monroe KR, Moreland L, North KE, Park SL, Reiner A, Wallace R, Wilkens LR, Kooperberg C, Ritchie MD, Crawford DC. A phenome-wide association study (PheWAS) in the Population Architecture using Genomics and Epidemiology (PAGE) study reveals potential pleiotropy in African Americans. PLoS One 2019;14:e0226771. [PMID: 31891604 PMCID: PMC6938343 DOI: 10.1371/journal.pone.0226771] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 12/02/2019] [Accepted: 12/03/2019] [Indexed: 12/11/2022]

Abstract

We performed a hypothesis-generating phenome-wide association study (PheWAS) to identify and characterize cross-phenotype associations, where one SNP is associated with two or more phenotypes, between thousands of genetic variants assayed on the Metabochip and hundreds of phenotypes in 5,897 African Americans as part of the Population Architecture using Genomics and Epidemiology (PAGE) I study. The PAGE I study was a National Human Genome Research Institute-funded collaboration of four study sites accessing diverse epidemiologic studies genotyped on the Metabochip, a custom genotyping chip that has dense coverage of regions in the genome previously associated with cardio-metabolic traits and outcomes in mostly European-descent populations. Here we focus on identifying novel phenome-genome relationships, where SNPs are associated with more than one phenotype. To do this, we performed a PheWAS, testing each SNP on the Metabochip for an association with up to 273 phenotypes in the participating PAGE I study sites. We identified 133 putative pleiotropic variants, defined as SNPs associated at an empirically derived p-value threshold of p<0.01 in two or more PAGE study sites for two or more phenotype classes. We further annotated these PheWAS-identified variants using publicly available functional data and local genetic ancestry. Amongst our novel findings is SPARC rs4958487, associated with increased glucose levels and hypertension. SPARC has been implicated in the pathogenesis of diabetes and is also known to have a potential role in fibrosis, a common consequence of multiple conditions including hypertension. The SPARC example and others highlight the potential that PheWAS approaches have in improving our understanding of complex disease architecture by identifying novel relationships between genetic variants and an array of common human phenotypes.
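The pleiotropy definition in this abstract (p<0.01 in two or more study sites for two or more phenotype classes) can be read as a two-level filter. The sketch below shows one possible reading, in which a phenotype class counts for a SNP only when it passes the threshold in at least two sites. The function name, the tuple layout, and the toy rows are hypothetical and do not come from the PAGE study or its software.

```python
from collections import defaultdict

P_THRESHOLD = 0.01   # threshold quoted in the abstract
MIN_SITES = 2        # "two or more PAGE study sites"
MIN_CLASSES = 2      # "two or more phenotype classes"


def putative_pleiotropic(results):
    """Return SNPs meeting the site and phenotype-class criteria.

    results: iterable of (snp, site, phenotype_class, p_value) tuples.
    Assumption: a class is counted for a SNP only if the association
    passes the threshold in at least MIN_SITES distinct sites.
    """
    sites_by_snp_class = defaultdict(set)
    for snp, site, pheno_class, p_value in results:
        if p_value < P_THRESHOLD:
            sites_by_snp_class[(snp, pheno_class)].add(site)

    classes_by_snp = defaultdict(set)
    for (snp, pheno_class), sites in sites_by_snp_class.items():
        if len(sites) >= MIN_SITES:
            classes_by_snp[snp].add(pheno_class)

    return sorted(
        snp for snp, classes in classes_by_snp.items()
        if len(classes) >= MIN_CLASSES
    )


# Toy rows (made-up identifiers and p-values, for illustration only).
toy = [
    ("rsA", "site1", "glucose", 0.001), ("rsA", "site2", "glucose", 0.005),
    ("rsA", "site1", "blood_pressure", 0.002), ("rsA", "site3", "blood_pressure", 0.009),
    ("rsB", "site1", "glucose", 0.001), ("rsB", "site2", "glucose", 0.2),
    ("rsC", "site1", "lipids", 0.001), ("rsC", "site2", "lipids", 0.001),
]
print(putative_pleiotropic(toy))
```

In the toy data only rsA qualifies: rsB passes the threshold at a single site, and rsC replicates across sites but in only one phenotype class.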

Affiliation(s)

Sarah A. Pendergrass Genentech, Inc., South San Francisco, California, United States of America
Steven Buyske Department of Statistics, Rutgers University, Piscataway, New Jersey, United States of America; Department of Genetics, Rutgers University, Piscataway, New Jersey, United States of America
Janina M. Jeff Illumina, Inc., San Diego, California, United States of America
Alex Frase Department of Genetics, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
Scott Dudek Department of Genetics, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
Yuki Bradford Department of Genetics, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
Jose-Luis Ambite Information Sciences Institute; University of Southern California, Marina del Rey, California, United States of America
Christy L. Avery Department of Epidemiology, University of North Carolina, Chapel Hill, North Carolina, United States of America
Petra Buzkova Department of Biostatistics, University of Washington, Seattle, Washington, United States of America
Ewa Deelman Information Sciences Institute; University of Southern California, Marina del Rey, California, United States of America
Megan D. Fesinmeyer Amgen, Thousand Oaks, California, United States of America
Christopher Haiman Department of Preventive Medicine, Keck School of Medicine, University of Southern California/Norris Comprehensive Cancer Center, Los Angeles, California, United States of America
Gerardo Heiss Department of Epidemiology, University of North Carolina, Chapel Hill, North Carolina, United States of America; Carolina Center for Genome Sciences, University of North Carolina, Chapel Hill, North Carolina, United States of America
Lucia A. Hindorff National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
Chun-Nan Hsu Center for Research in Biological Systems, Department of Neurosciences, University of California, San Diego, La Jolla, California, United States of America
Rebecca D. Jackson The Ohio State University, Columbus, Ohio, United States of America
Yi Lin Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
Loic Le Marchand Epidemiology Program, University of Hawaii Cancer Center, Honolulu, Hawaii, United States of America
Tara C. Matise Department of Genetics, Rutgers University, Piscataway, New Jersey, United States of America
Kristine R. Monroe Department of Preventive Medicine, Keck School of Medicine, University of Southern California/Norris Comprehensive Cancer Center, Los Angeles, California, United States of America
Larry Moreland University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
Kari E. North Department of Epidemiology, University of North Carolina, Chapel Hill, North Carolina, United States of America; Carolina Center for Genome Sciences, University of North Carolina, Chapel Hill, North Carolina, United States of America
Sungshim L. Park Department of Preventive Medicine, Keck School of Medicine, University of Southern California/Norris Comprehensive Cancer Center, Los Angeles, California, United States of America
Alex Reiner Department of Epidemiology, University of Washington, Seattle, Washington, United States of America
Robert Wallace Departments of Epidemiology and Internal Medicine, University of Iowa, Iowa City, Iowa, United States of America
Lynne R. Wilkens Epidemiology Program, University of Hawaii Cancer Center, Honolulu, Hawaii, United States of America
Charles Kooperberg Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
Marylyn D. Ritchie Department of Genetics, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
Dana C. Crawford Cleveland Institute for Computational Biology, Cleveland, Ohio, United States of America; Departments of Population and Quantitative Health Sciences and Genetics and Genome Sciences, Case Western Reserve University, Cleveland, Ohio, United States of America

Preo N, Capobianco E. Significant EHR Feature-Driven T2D Inference: Predictive Machine Learning and Networks. Front Big Data 2019;2:30. [PMID: 33693353 PMCID: PMC7931876 DOI: 10.3389/fdata.2019.00030] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/25/2019] [Accepted: 08/16/2019] [Indexed: 01/11/2023] Open

Abstract

Background: Electronic health records (EHR) play an important role in the redefinition of phenotypes, given the wealth and heterogeneity of information now available from disparate data sources. A recent cross-sectional retrospective study has described the potential of EHR for type 2 diabetes mellitus (T2D) screening when ad hoc models are used. About 10,000 US patients were analyzed through a variety of inference techniques applied to all records, which varied in their degree of completeness. The analyses conducted in the reference study indicated that EHR phenotypes significantly improved T2D detection.

Methods: With these US patients and the T2D data evidenced in the above study, we propose an integrative inference approach that leverages the predictive power of EHR features selected by two well-known methods, Random Forests and Lasso. The goal is twofold: reducing the Big Data redundancies that are potentially harmful to the predictive learning task, and exploiting the interconnectivity of EHR features. A mutual information (MI) network is the inference tool used to identify communities useful to prioritize significant T2D features underlying the similarity between patients.

Results: Endowed with different degrees of granularity, the communities detected after the application of both methods were centered especially on T2D comorbidities and risk factors. As such, they appear very relevant for the assessment of two main issues: T2D disease burden and prevention.

Conclusions: Our analytical approach offers a solution for managing the EHR scale factor in a complex disease context. EHR are rich sources of phenotypic diversity through which novel stratifications of patients are expected. To enable these results, both pre-screening of variables and calibration of risk prediction methods become necessary steps in EHR analyses. We have presented networks identifying major T2D communities. The specific significance assigned to comorbidities and risk factors in relation to T2D can be inferred with accuracy from just a suitably reduced number of EHR features.

Abul-Husn NS, Kenny EE. Personalized Medicine and the Power of Electronic Health Records. Cell 2019;177:58-69. [PMID: 30901549 PMCID: PMC6921466 DOI: 10.1016/j.cell.2019.02.039] [Citation(s) in RCA: 145] [Impact Index Per Article: 29.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/15/2019] [Revised: 02/13/2019] [Accepted: 02/22/2019] [Indexed: 02/06/2023]