76
|
Wang Q, Reps JM, Kostka KF, Ryan PB, Zou Y, Voss EA, Rijnbeek PR, Chen R, Rao GA, Morgan Stewart H, Williams AE, Williams RD, Van Zandt M, Falconer T, Fernandez-Chas M, Vashisht R, Pfohl SR, Shah NH, Kasthurirathne SN, You SC, Jiang Q, Reich C, Zhou Y. Development and validation of a prognostic model predicting symptomatic hemorrhagic transformation in acute ischemic stroke at scale in the OHDSI network. PLoS One 2020; 15:e0226718. [PMID: 31910437 PMCID: PMC6946584 DOI: 10.1371/journal.pone.0226718] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 09/04/2019] [Accepted: 12/02/2019] [Indexed: 12/26/2022] Open
Abstract
BACKGROUND AND PURPOSE Hemorrhagic transformation (HT) after cerebral infarction is a complex and multifactorial phenomenon in the acute stage of ischemic stroke, and often results in a poor prognosis. Thus, identifying risk factors and making an early prediction of HT in acute cerebral infarction contributes not only to the selection of a therapeutic regimen but also, more importantly, to the improvement of prognosis of acute cerebral infarction. The purpose of this study was to develop and validate a model to predict a patient's risk of HT within 30 days of initial ischemic stroke. METHODS We utilized a retrospective multicenter observational cohort study design to develop a Lasso Logistic Regression prediction model with a large US Electronic Health Record dataset structured according to the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). To examine clinical transportability, the model was externally validated across 10 additional real-world healthcare datasets that include EHR records for patients from America, Europe and Asia. RESULTS In the database in which the model was developed, the target population cohort contained 621,178 patients with ischemic stroke, of whom 5,624 had HT within 30 days following initial ischemic stroke. A total of 612 risk predictors, including the distance a patient travels in an ambulance to get to care for an HT, were identified. An area under the receiver operating characteristic curve (AUC) of 0.75 was achieved in the internal validation of the risk model. External validation was performed across 10 databases totaling 5,515,508 patients with ischemic stroke, of whom 86,401 had HT within 30 days following initial ischemic stroke. The mean external AUC was 0.71 and ranged between 0.60 and 0.78. CONCLUSIONS An HT prognostic prediction model was developed with Lasso Logistic Regression based on routinely collected EMR data.
This model can identify patients who have a higher risk of HT than the population average with an AUC of 0.78. It shows that the OMOP CDM is an appropriate data standard for secondary use of EMR data in multicenter clinical research for prognostic prediction model development and validation. In the future, combining this model with clinical information systems will help clinicians make the right therapy decision for patients with acute ischemic stroke.
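The AUC values in this abstract summarize discrimination: the probability that a randomly chosen patient with the outcome receives a higher predicted risk than a randomly chosen patient without it. The sketch below illustrates that pairwise definition only. The function name `auc` and the toy scores and labels are hypothetical and are not taken from the study.

```python
def auc(scores, labels):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    scores: predicted risks; labels: 1 for outcome, 0 for no outcome.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only.
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
print(auc(toy_scores, toy_labels))  # 0.75
```

The quadratic loop is fine for a small illustration; at the cohort sizes reported here a rank-based implementation would be the usual choice.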
|
77
|
Myers KD, Knowles JW, Staszak D, Shapiro MD, Howard W, Yadava M, Zuzick D, Williamson L, Shah NH, Banda JM, Leader J, Cromwell WC, Trautman E, Murray MF, Baum SJ, Myers S, Gidding SS, Wilemon K, Rader DJ. Precision screening for familial hypercholesterolaemia: a machine learning study applied to electronic health encounter data. Lancet Digit Health 2019; 1:e393-e402. [PMID: 33323221 DOI: 10.1016/s2589-7500(19)30150-5] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 07/12/2019] [Revised: 09/10/2019] [Accepted: 09/20/2019] [Indexed: 12/18/2022]
Abstract
BACKGROUND Cardiovascular outcomes for people with familial hypercholesterolaemia can be improved with diagnosis and medical management. However, 90% of individuals with familial hypercholesterolaemia remain undiagnosed in the USA. We aimed to accelerate early diagnosis and timely intervention for more than 1·3 million undiagnosed individuals with familial hypercholesterolaemia at high risk for early heart attacks and strokes by applying machine learning to large health-care encounter datasets. METHODS We trained the FIND FH machine learning model using deidentified health-care encounter data, including procedure and diagnostic codes, prescriptions, and laboratory findings, from 939 clinically diagnosed individuals with familial hypercholesterolaemia (395 of whom had a molecular diagnosis) and 83 136 individuals presumed free of familial hypercholesterolaemia, sampled from four US institutions. The model was then applied to a national health-care encounter database (170 million individuals) and an integrated health-care delivery system dataset (174 000 individuals). Individuals used in model training and those evaluated by the model were required to have at least one cardiovascular disease risk factor (eg, hypertension, hypercholesterolaemia, or hyperlipidemia). A Health Insurance Portability and Accountability Act of 1996-compliant programme was developed to allow providers to receive identification of individuals likely to have familial hypercholesterolaemia in their practice. FINDINGS Using a model with a measured precision (positive predictive value) of 0·85, recall (sensitivity) of 0·45, area under the precision-recall curve of 0·55, and area under the receiver operating characteristic curve of 0·89, we flagged 1 331 759 of 170 416 201 patients in the national database and 866 of 173 733 individuals in the health-care delivery system dataset as likely to have familial hypercholesterolaemia. 
Familial hypercholesterolaemia experts reviewed a sample of flagged individuals (45 from the national database and 103 from the health-care delivery system dataset) and applied clinical familial hypercholesterolaemia diagnostic criteria. Of those reviewed, 87% (95% CI 73-100) in the national database and 77% (68-86) in the health-care delivery system dataset were categorised as having a high enough clinical suspicion of familial hypercholesterolaemia to warrant guideline-based clinical evaluation and treatment. INTERPRETATION The FIND FH model successfully scans large, diverse, and disparate health-care encounter databases to identify individuals with familial hypercholesterolaemia. FUNDING The FH Foundation funded this study. Support was received from Amgen, Sanofi, and Regeneron.
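As a reading aid for the FINDINGS figures, the sketch below shows how precision and recall are derived from confusion-matrix counts and how small the flagged fraction of each dataset is. The flagged and total counts are those quoted in the abstract. The true-positive, false-positive, and false-negative counts are hypothetical, chosen only to land near the reported 0·85 and 0·45, and are not the study's counts.

```python
def precision(tp, fp):
    """Positive predictive value: share of flagged individuals who are true cases."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Sensitivity: share of true cases that were flagged."""
    return tp / (tp + fn)


def flagged_percent(flagged, total):
    """Percentage of a dataset that the model flagged."""
    return 100.0 * flagged / total


# Hypothetical counts (NOT from the study), picked to approximate 0.85 / 0.45.
tp, fp, fn = 85, 15, 104
print(round(precision(tp, fp), 2), round(recall(tp, fn), 2))

# Flagged and total counts as reported in the abstract.
print(round(flagged_percent(1_331_759, 170_416_201), 2))  # national database
print(round(flagged_percent(866, 173_733), 2))            # delivery system dataset
```

With the reported counts, roughly 0.78% of the national database and 0.50% of the health-care delivery system dataset were flagged.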
|
78
|
|
79
|
Callahan A, Fries JA, Ré C, Huddleston JI, Giori NJ, Delp S, Shah NH. Medical device surveillance with electronic health records. NPJ Digit Med 2019; 2:94. [PMID: 31583282 PMCID: PMC6761113 DOI: 10.1038/s41746-019-0168-z] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 04/01/2019] [Accepted: 08/15/2019] [Indexed: 12/18/2022] Open
Abstract
Post-market medical device surveillance is a challenge facing manufacturers, regulatory agencies, and health care providers. Electronic health records are valuable sources of real-world evidence for assessing device safety and tracking device-related patient outcomes over time. However, distilling this evidence remains challenging, as information is fractured across clinical notes and structured records. Modern machine learning methods for machine reading promise to unlock increasingly complex information from text, but face barriers due to their reliance on large and expensive hand-labeled training sets. To address these challenges, we developed and validated state-of-the-art deep learning methods that identify patient outcomes from clinical notes without requiring hand-labeled training data. Using hip replacements, one of the most common implantable devices, as a test case, our methods accurately extracted implant details and reports of complications and pain from electronic health records with up to 96.3% precision, 98.5% recall, and 97.4% F1, improved classification performance by 12.8-53.9% over rule-based methods, and detected over six times as many complication events compared to using structured data alone. Using these additional events to assess complication-free survivorship of different implant systems, we found significant variation between implants, including for risk of revision surgery, which could not be detected using coded data alone. Patients with revision surgeries had more hip pain mentions in the post-hip replacement, pre-revision period compared to patients with no evidence of revision surgery (mean hip pain mentions 4.97 vs. 3.23; t = 5.14; p < 0.001). Some implant models were associated with higher or lower rates of hip pain mentions.
Our methods complement existing surveillance mechanisms by requiring orders of magnitude less hand-labeled training data, offering a scalable solution for national medical device surveillance using electronic health records.
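The F1 figure quoted in this abstract is the harmonic mean of precision and recall. A minimal sketch of that relationship follows. Because the abstract reports the three values as "up to" maxima, treating them as coming from the same extraction task is an assumption made here for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract (assumed to belong together).
print(round(f1_score(0.963, 0.985), 3))  # 0.974
```

The harmonic mean sits closer to the lower of the two inputs, which is why F1 penalizes a model that trades one metric heavily for the other.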
|
80
|
Ling AY, Kurian AW, Caswell-Jin JL, Sledge GW, Shah NH, Tamang SR. Using natural language processing to construct a metastatic breast cancer cohort from linked cancer registry and electronic medical records data. JAMIA Open 2019; 2:528-537. [PMID: 32025650 PMCID: PMC6994019 DOI: 10.1093/jamiaopen/ooz040] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 03/21/2019] [Revised: 07/13/2019] [Accepted: 08/13/2019] [Indexed: 02/04/2023] Open
Abstract
Objectives Most population-based cancer databases lack information on metastatic recurrence. Electronic medical records (EMR) and cancer registries contain complementary information on cancer diagnosis, treatment and outcome, yet are rarely used synergistically. To construct a cohort of metastatic breast cancer (MBC) patients, we applied natural language processing techniques within a semisupervised machine learning framework to linked EMR-California Cancer Registry (CCR) data. Materials and Methods We studied all female patients treated at Stanford Health Care with an incident breast cancer diagnosis from 2000 to 2014. Our database consisted of structured fields and unstructured free-text clinical notes from EMR, linked to CCR, a component of the Surveillance, Epidemiology and End Results Program (SEER). We identified de novo MBC patients from CCR and extracted information on distant recurrences from patient notes in EMR. Furthermore, we trained a regularized logistic regression model for recurrent MBC classification and evaluated its performance on a gold standard set of 146 patients. Results There were 11 459 breast cancer patients in total and the median follow-up time was 96.3 months. We identified 1886 MBC patients, 512 (27.1%) of whom were de novo MBC patients and 1374 (72.9%) were recurrent MBC patients. Our final MBC classifier achieved an area under the receiver operating characteristic curve (AUC) of 0.917, with sensitivity 0.861, specificity 0.878, and accuracy 0.870. Discussion and Conclusion To enable population-based research on MBC, we developed a framework for retrospective case detection combining EMR and CCR data. Our classifier achieved good AUC, sensitivity, and specificity without expert-labeled examples.
|
81
|
Ransohoff JD, Nikfarjam A, Jones E, Loew B, Kwong BY, Sarin KY, Shah NH. Detecting Chemotherapeutic Skin Adverse Reactions in Social Health Networks Using Deep Learning. JAMA Oncol 2019; 4:581-583. [PMID: 29494731 DOI: 10.1001/jamaoncol.2017.5688] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 11/14/2022]
|
82
|
Banerjee I, Sofela M, Yang J, Chen JH, Shah NH, Ball R, Mushlin AI, Desai M, Bledsoe J, Amrhein T, Rubin DL, Zamanian R, Lungren MP. Development and Performance of the Pulmonary Embolism Result Forecast Model (PERFORM) for Computed Tomography Clinical Decision Support. JAMA Netw Open 2019; 2:e198719. [PMID: 31390040 PMCID: PMC6686780 DOI: 10.1001/jamanetworkopen.2019.8719] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Indexed: 12/19/2022] Open
Abstract
IMPORTANCE Pulmonary embolism (PE) is a life-threatening clinical problem, and computed tomographic imaging is the standard for diagnosis. Clinical decision support rules based on PE risk-scoring models have been developed to compute pretest probability but are underused and tend to underperform in practice, leading to persistent overuse of CT imaging for PE. OBJECTIVE To develop a machine learning model to generate a patient-specific risk score for PE by analyzing longitudinal clinical data as clinical decision support for patients referred for CT imaging for PE. DESIGN, SETTING, AND PARTICIPANTS In this diagnostic study, the proposed workflow for the machine learning model, the Pulmonary Embolism Result Forecast Model (PERFORM), transforms raw electronic medical record (EMR) data into temporal feature vectors and develops a decision analytical model targeted toward adult patients referred for CT imaging for PE. The model was tested on holdout patient EMR data from 2 large, academic medical practices. A total of 3397 annotated CT imaging examinations for PE from 3214 unique patients seen at Stanford University hospitals and clinics were used for training and validation. The models were externally validated on 240 unique patients seen at Duke University Medical Center. The comparison with clinical scoring systems was done on 100 randomly selected outpatient samples from Stanford University hospitals and clinics and 101 outpatient samples from Duke University Medical Center. MAIN OUTCOMES AND MEASURES Prediction performance of diagnosing acute PE was evaluated using ElasticNet, artificial neural networks, and other machine learning approaches on holdout data sets from both institutions, and performance of models was measured by area under the receiver operating characteristic curve (AUROC). RESULTS Of the 3214 patients included in the study, 1704 (53.0%) were women from Stanford University hospitals and clinics; mean (SD) age was 60.53 (19.43) years.
The 240 patients from Duke University Medical Center used for validation included 132 women (55.0%); mean (SD) age was 70.2 (14.2) years. In the samples for clinical scoring system comparisons, the 100 outpatients from Stanford University hospitals and clinics included 67 women (67.0%); mean (SD) age was 57.74 (19.87) years, and the 101 patients from Duke University Medical Center included 59 women (58.4%); mean (SD) age was 73.06 (15.3) years. The best-performing model achieved an AUROC performance of predicting a positive PE study of 0.90 (95% CI, 0.87-0.91) on intrainstitutional holdout data with an AUROC of 0.71 (95% CI, 0.69-0.72) on an external data set from Duke University Medical Center; superior AUROC performance and cross-institutional generalization of the model of 0.81 (95% CI, 0.77-0.87) and 0.81 (95% CI, 0.73-0.82), respectively, were noted on holdout outpatient populations from both intrainstitutional and extrainstitutional data. CONCLUSIONS AND RELEVANCE The machine learning model, PERFORM, may consider multitudes of applicable patient-specific risk factors and dependencies to arrive at a PE risk prediction that generalizes to new population distributions. This approach might be used as an automated clinical decision-support tool for patients referred for CT PE imaging to improve CT use.
|
83
|
Nikfarjam A, Ransohoff JD, Callahan A, Polony V, Shah NH. Profiling off-label prescriptions in cancer treatment using social health networks. JAMIA Open 2019; 2:301-305. [PMID: 31709388 PMCID: PMC6824514 DOI: 10.1093/jamiaopen/ooz025] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/10/2018] [Revised: 05/10/2019] [Accepted: 06/20/2019] [Indexed: 11/12/2022] Open
Abstract
Objectives To investigate using patient posts in social media as a resource to profile off-label prescriptions of cancer drugs. Methods We analyzed patient posts from the Inspire health forums (www.inspire.com) and extracted mentions of cancer drugs from the 14 most active cancer-type specific support groups. To quantify drug-disease associations, we calculated information component scores from the frequency of posts in each cancer-specific group with mentions of a given drug. We evaluated the results against three sources: manual review, Wolters-Kluwer Medi-span, and Truven MarketScan insurance claims. Results We identified 279 frequently discussed and therefore highly associated drug-disease pairs from Inspire posts. Of these, 96 are FDA approved, 9 are known off-label uses, and 174 do not have records of known usage (potentially novel off-label uses). We achieved a mean average precision of 74.9% in identifying drug-disease pairs with a true indication association from patient posts and found consistent evidence in medical claims records. We achieved a recall of 69.2% in identifying known off-label drug uses (based on Wolters-Kluwer Medi-span) from patient posts.
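The abstract does not spell out how the information component scores were computed. One common form in disproportionality analysis compares the observed number of co-mentions with the number expected if drug and group were independent, on a log2 scale with a small shrinkage term. The sketch below assumes that form. The 0.5 shrinkage value, the function name, and all counts are illustrative assumptions, not details from the paper.

```python
import math


def information_component(n_pair, n_drug, n_group, n_total, shrink=0.5):
    """log2 of shrunken observed-to-expected ratio for a drug-group pair.

    n_pair:  posts in the cancer-specific group that mention the drug
    n_drug:  posts anywhere that mention the drug
    n_group: posts in the cancer-specific group
    n_total: all posts
    shrink:  assumed shrinkage constant that damps scores for rare pairs
    """
    expected = n_drug * n_group / n_total  # co-mentions expected under independence
    return math.log2((n_pair + shrink) / (expected + shrink))


# Hypothetical counts: the drug is mentioned ten times more often in this
# group than independence would predict.
print(round(information_component(20, 100, 200, 10_000), 2))  # 3.04

# When observed equals expected, the score is zero.
print(information_component(2, 100, 200, 10_000))  # 0.0
```

Positive scores indicate a drug discussed more often in a group than chance would suggest, which is the sense in which the abstract calls pairs "highly associated".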
|
84
|
Lerrigo R, Coffey JTR, Kravitz JL, Jadhav P, Nikfarjam A, Shah NH, Jurafsky D, Sinha SR. The Emotional Toll of Inflammatory Bowel Disease: Using Machine Learning to Analyze Online Community Forum Discourse. Crohns Colitis 360 2019. [DOI: 10.1093/crocol/otz011] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 01/14/2023]
Abstract
Background
Patients with inflammatory bowel disease are using online community forums (OCFs) to seek emotional support. The impact of OCFs on well-being and their emotional content are unknown.
Methods
We used an unsupervised machine learning algorithm to identify the thematic content of 51,591 public, online posts from the Crohn’s & Colitis Foundation Community Forum.
Results
We identified 10,702 (20.8%) posts expressing: gratitude (40%), anxiety/fear (20.8%), empathy (18.2%), anger/frustration (13.4%), hope (13.2%), happiness (10.0%), sadness/depression (5.8%), shame/guilt (2.5%), and/or loneliness (2.5%). A common subtheme was the importance of fostering social support.
Conclusions
High-throughput, machine learning-directed analysis of OCFs may help identify psychosocial impacts of inflammatory bowel disease on patients and their caregivers.
|
85
|
Nikfarjam A, Ransohoff JD, Callahan A, Jones E, Loew B, Kwong BY, Sarin KY, Shah NH. Early Detection of Adverse Drug Reactions in Social Health Networks: A Natural Language Processing Pipeline for Signal Detection. JMIR Public Health Surveill 2019; 5:e11264. [PMID: 31162134 PMCID: PMC6684218 DOI: 10.2196/11264] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 06/08/2018] [Revised: 02/27/2019] [Accepted: 04/04/2019] [Indexed: 12/31/2022] Open
Abstract
Background Adverse drug reactions (ADRs) occur in nearly all patients on chemotherapy, causing morbidity and therapy disruptions. Detection of such ADRs is limited in clinical trials, which are underpowered to detect rare events. Early recognition of ADRs in the postmarketing phase could substantially reduce morbidity and decrease societal costs. Internet community health forums provide a mechanism for individuals to discuss real-time health concerns and can enable computational detection of ADRs. Objective The goal of this study is to identify cutaneous ADR signals in social health networks and compare the frequency and timing of these ADRs to clinical reports in the literature. Methods We present a natural language processing-based, ADR signal-generation pipeline based on patient posts on Internet social health networks. We identified user posts from the Inspire health forums related to two chemotherapy classes: erlotinib, an epidermal growth factor receptor inhibitor, and nivolumab and pembrolizumab, immune checkpoint inhibitors. We extracted mentions of ADRs from unstructured content of patient posts. We then performed population-level association analyses and time-to-detection analyses. Results Our system detected cutaneous ADRs from patient reports with high precision (0.90) and at frequencies comparable to those documented in the literature but an average of 7 months ahead of their literature reporting. Known ADRs were associated with higher proportional reporting ratios compared to negative controls, demonstrating the robustness of our analyses. Our named entity recognition system achieved a 0.738 microaveraged F-measure in detecting ADR entities, not limited to cutaneous ADRs, in health forum posts. Additionally, we discovered the novel ADR of hypohidrosis reported by 23 patients in erlotinib-related posts; this ADR was absent from 15 years of literature on this medication and we recently reported the finding in a clinical oncology journal. 
Conclusions Several hundred million patients report health concerns in social health networks, yet this information is markedly underutilized for pharmacosurveillance. We demonstrated the ability of a natural language processing-based signal-generation pipeline to accurately detect patient reports of ADRs months in advance of literature reporting and the robustness of statistical analyses to validate system detections. Our findings suggest that social health network data can make important contributions to more comprehensive and timely pharmacovigilance.
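The proportional reporting ratio mentioned in the Results compares how often a reaction is reported for the drug of interest with how often it is reported for all other drugs. A minimal sketch of the standard 2x2 calculation follows. The counts are hypothetical and are not drawn from the study.

```python
def proportional_reporting_ratio(a, b, c, d):
    """PRR from a 2x2 table of post counts.

    a: posts about the drug that mention the reaction
    b: posts about the drug that do not mention the reaction
    c: posts about other drugs that mention the reaction
    d: posts about other drugs that do not mention the reaction
    """
    if a + b == 0 or c + d == 0 or c == 0:
        raise ValueError("PRR is undefined for empty rows or zero comparator reports")
    rate_drug = a / (a + b)    # reporting rate for the drug of interest
    rate_other = c / (c + d)   # reporting rate for the comparator drugs
    return rate_drug / rate_other


# Hypothetical counts: 3% of posts about the drug mention the reaction,
# against 0.1% of posts about other drugs.
print(round(proportional_reporting_ratio(30, 970, 10, 9990), 1))  # 30.0
```

A PRR well above 1 for known reactions and near 1 for negative controls is the pattern the abstract describes as evidence of robustness.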
|
86
|
Scott MKD, Quinn K, Li Q, Carroll R, Warsinske H, Vallania F, Chen S, Carns MA, Aren K, Sun J, Koloms K, Lee J, Baral J, Kropski J, Zhao H, Herzog E, Martinez FJ, Moore BB, Hinchcliff M, Denny J, Kaminski N, Herazo-Maya JD, Shah NH, Khatri P. Increased monocyte count as a cellular biomarker for poor outcomes in fibrotic diseases: a retrospective, multicentre cohort study. Lancet Respir Med 2019; 7:497-508. [PMID: 30935881 PMCID: PMC6529612 DOI: 10.1016/s2213-2600(18)30508-3] [Citation(s) in RCA: 140] [Impact Index Per Article: 28.0] [Received: 03/20/2018] [Revised: 11/14/2018] [Accepted: 11/27/2018] [Indexed: 12/27/2022]
Abstract
BACKGROUND There is an urgent need for biomarkers to better stratify patients with idiopathic pulmonary fibrosis by risk for lung transplantation allocation who have the same clinical presentation. We aimed to investigate whether a specific immune cell type from patients with idiopathic pulmonary fibrosis could identify those at higher risk of poor outcomes. We then sought to validate our findings using cytometry and electronic health records. METHODS We first did a discovery analysis with transcriptome data from the Gene Expression Omnibus at the National Center for Biotechnology Information for 120 peripheral blood mononuclear cell (PBMC) samples of patients with idiopathic pulmonary fibrosis. We estimated percentages of 13 immune cell types using statistical deconvolution, and investigated the association of these cell types with transplant-free survival. We validated these results using PBMC samples from patients with idiopathic pulmonary fibrosis in two independent cohorts (COMET and Yale). COMET profiled monocyte counts in 45 patients with idiopathic pulmonary fibrosis from March 12, 2010, to March 10, 2011, using flow cytometry; we tested if increased monocyte count was associated with the primary outcome of disease progression. In the Yale cohort, 15 patients with idiopathic pulmonary fibrosis (with five healthy controls) were classed as high risk or low risk from April 28, 2014, to Aug 20, 2015, using a 52-gene signature, and we assessed whether monocyte percentage (measured by cytometry by time of flight) was higher in high-risk patients. 
We then examined complete blood count values in the electronic health records (EHR) of 45 068 patients with idiopathic pulmonary fibrosis, systemic sclerosis, hypertrophic cardiomyopathy, or myelofibrosis from Stanford (Jan 01, 2008, to Dec 31, 2015), Northwestern (Feb 15, 2001, to July 31, 2017), Vanderbilt (Jan 01, 2008, to Dec 31, 2016), and Optum Clinformatics DataMart (Jan 01, 2004, to Dec 31, 2016) cohorts, and examined whether absolute monocyte counts of 0·95 K/μL or greater were associated with all-cause mortality in these patients. FINDINGS In the discovery analysis, estimated CD14+ classical monocyte percentages above the mean were associated with shorter transplant-free survival times (hazard ratio [HR] 1·82, 95% CI 1·05-3·14), whereas higher percentages of T cells and B cells were not (0·97, 0·59-1·66; and 0·78, 0·45-1·34 respectively). In two validation cohorts (COMET trial and the Yale cohort), patients with higher monocyte counts were at higher risk for poor outcomes (COMET Wilcoxon p=0·025; Yale Wilcoxon p=0·049). Monocyte counts of 0·95 K/μL or greater were associated with mortality after adjusting for forced vital capacity (HR 2·47, 95% CI 1·48-4·15; p=0·0063), and the gender, age, and physiology index (HR 2·06, 95% CI 1·22-3·47; p=0·0068) across the COMET, Stanford, and Northwestern datasets. Analysis of medical records of 7459 patients with idiopathic pulmonary fibrosis showed that patients with monocyte counts of 0·95 K/μL or greater were at increased risk of mortality with lung transplantation as a censoring event, after adjusting for age at diagnosis and sex (Stanford HR=2·30, 95% CI 0·94-5·63; Vanderbilt 1·52, 1·21-1·89; Optum 1·74, 1·33-2·27). Likewise, higher absolute monocyte count was associated with shortened survival in patients with hypertrophic cardiomyopathy across all three cohorts, and in patients with systemic sclerosis or myelofibrosis in two of the three cohorts.
INTERPRETATION Monocyte count could be incorporated into the clinical assessment of patients with idiopathic pulmonary fibrosis and other fibrotic disorders. Further investigation into the mechanistic role of monocytes in fibrosis might lead to insights that assist the development of new therapies. FUNDING Bill & Melinda Gates Foundation, US National Institute of Allergy and Infectious Diseases, and US National Library of Medicine.
|
87
|
Banda JM, Sarraju A, Abbasi F, Parizo J, Pariani M, Ison H, Briskin E, Wand H, Dubois S, Jung K, Myers SA, Rader DJ, Leader JB, Murray MF, Myers KD, Wilemon K, Shah NH, Knowles JW. Finding missed cases of familial hypercholesterolemia in health systems using machine learning. NPJ Digit Med 2019; 2:23. [PMID: 31304370 PMCID: PMC6550268 DOI: 10.1038/s41746-019-0101-5] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Received: 12/03/2018] [Accepted: 03/13/2019] [Indexed: 01/26/2023] Open
Abstract
Familial hypercholesterolemia (FH) is an underdiagnosed dominant genetic condition that affects approximately 0.4% of the population and confers up to a 20-fold increased risk of coronary artery disease if untreated. Simple screening strategies have false positive rates greater than 95%. As part of the FH Foundation's FIND FH initiative, we developed a classifier to identify potential FH patients using electronic health record (EHR) data at Stanford Health Care. We trained a random forest classifier using data from known patients (n = 197) and matched non-cases (n = 6590). Our classifier obtained a positive predictive value (PPV) of 0.88 and sensitivity of 0.75 on a held-out test set. We evaluated the accuracy of the classifier's predictions by chart review of 100 patients at risk of FH not included in the original dataset. The classifier correctly flagged 84% of patients at the highest probability threshold, with decreasing performance as the threshold lowers. In external validation on 466 FH patients (236 with genetically proven FH) and 5000 matched non-cases from the Geisinger Healthcare System, our FH classifier achieved a PPV of 0.85. Our EHR-derived FH classifier is effective in finding candidate patients for further FH screening. Such machine learning-guided strategies can lead to effective identification of the highest-risk patients for enhanced management strategies.
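The statement that simple screening yields mostly false positives follows from the low prevalence of FH. The sketch below applies Bayes' rule to show the effect, reading "false positive rate" as the share of positive screens that are false. The 0.4% prevalence is taken from the abstract. The 90% sensitivity and 90% specificity are hypothetical values for an imagined screening rule and do not describe any strategy evaluated in the paper.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Share of positive screens that are true cases, by Bayes' rule."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


# Prevalence from the abstract; sensitivity and specificity are assumed.
ppv = positive_predictive_value(prevalence=0.004, sensitivity=0.90, specificity=0.90)
print(round(ppv, 3))      # 0.035
print(round(1 - ppv, 3))  # 0.965
```

Under these assumed values, about 96.5% of positive screens would be false, which is in line with the figure quoted above. It also puts the classifier's reported PPV of 0.85 to 0.88 in context, although those values were measured on matched case and non-case sets rather than at population prevalence.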
|
88
|
Jung K, Sudat SEK, Kwon N, Stewart WF, Shah NH. Predicting need for advanced illness or palliative care in a primary care population using electronic health record data. J Biomed Inform 2019; 92:103115. [PMID: 30753951 PMCID: PMC6512802 DOI: 10.1016/j.jbi.2019.103115] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 11/15/2022]
Abstract
Timely outreach to individuals in an advanced stage of illness offers opportunities to exercise decision control over health care. Predictive models built using electronic health record (EHR) data are being explored as a way to anticipate such need with enough lead time for patient engagement. Prior studies have focused on hospitalized patients, who typically have more data available for predicting care needs. It is unclear if prediction-driven outreach is feasible in the primary care setting. In this study, we apply predictive modeling to the primary care population of a large, regional health system and systematically examine the impact of technical choices, such as requiring a minimum number of health care encounters (data density requirements) and aggregating diagnosis codes using Clinical Classifications Software (CCS) groupings to reduce dimensionality, on model performance in terms of discrimination and positive predictive value. We assembled a cohort of 349,667 primary care patients between 65 and 90 years of age who sought care from Sutter Health between July 1, 2011 and June 30, 2014, of whom 2.1% died during the study period. EHR data comprising demographics, encounters, orders, and diagnoses for each patient from a 12-month observation window prior to the point when a prediction is made were extracted. L1-regularized logistic regression and gradient boosted tree models were fit to training data and tuned by cross-validation. Model performance in predicting one-year mortality was assessed using held-out test patients. Our experiments systematically varied three factors: model type, diagnosis coding, and data density requirements. We found substantial, consistent benefit from using gradient boosting vs logistic regression (mean AUROC over all other technical choices of 84.8% vs 80.7% respectively). There was no benefit from aggregation of ICD codes into CCS code groups (mean AUROC over all other technical choices of 82.9% vs 82.6% respectively).
Likewise, increasing data density requirements did not affect discrimination (mean AUROC over other technical choices ranged from 82.5% to 83%). We also examine model performance as a function of lead time, which is the interval between death and when a prediction was made. In subgroup analysis by lead time, mean AUROC over all other choices ranged from 87.9% for patients who died within 0 to 3 months to 83.6% for those who died 9 to 12 months after prediction time.
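The phrase "mean AUROC over all other technical choices" describes a marginal mean in a factorial experiment: every combination of model type, diagnosis coding, and data density requirement is evaluated, and results are then averaged over the factors not being compared. The sketch below illustrates that bookkeeping only. The density thresholds, the `toy_auroc` stand-in, and every number it returns are invented for illustration and are not results from the study.

```python
from itertools import product
from statistics import mean

model_types = ["logistic_regression", "gradient_boosting"]
diagnosis_codings = ["icd", "ccs"]
min_encounters = [1, 3, 5]  # hypothetical data-density thresholds


def toy_auroc(model, coding, density):
    """Stand-in for a real train-and-evaluate run; returns invented values."""
    base = 0.80 if model == "logistic_regression" else 0.84
    return base + (0.002 if coding == "icd" else 0.0) + 0.001 * density


# One result per configuration in the full factorial grid.
results = {
    cfg: toy_auroc(*cfg)
    for cfg in product(model_types, diagnosis_codings, min_encounters)
}


def marginal_mean(results, position, level):
    """Average AUROC over every configuration whose factor at `position` equals `level`."""
    return mean(v for cfg, v in results.items() if cfg[position] == level)


for model in model_types:
    print(model, round(marginal_mean(results, 0, model), 3))
```

Averaging this way isolates the effect of one factor while keeping the comparison balanced across the others, which is how the abstract can attribute a gain to model type independent of coding or density choices.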
|
89
|
Ross EG, Jung K, Dudley JT, Li L, Leeper NJ, Shah NH. Predicting Future Cardiovascular Events in Patients With Peripheral Artery Disease Using Electronic Health Record Data. Circ Cardiovasc Qual Outcomes 2019; 12:e004741. [PMID: 30857412 PMCID: PMC6415677 DOI: 10.1161/circoutcomes.118.004741] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Indexed: 12/13/2022]
Abstract
BACKGROUND Patients with peripheral artery disease (PAD) are at risk of major adverse cardiac and cerebrovascular events. There are no readily available risk scores that can accurately identify which patients are most likely to sustain an event, making it difficult to identify those who might benefit from more aggressive intervention. Thus, we aimed to develop a novel predictive model, using machine learning methods on electronic health record data, to identify which PAD patients are most likely to develop major adverse cardiac and cerebrovascular events. METHODS AND RESULTS Data were derived from patients diagnosed with PAD at 2 tertiary care institutions. Predictive models were built using a common data model that allowed for utilization of both structured (coded) and unstructured (text) data. Only data from time of entry into the health system up to PAD diagnosis were used for modeling. Models were developed and tested using nested cross-validation. A total of 7686 patients were included in learning our predictive models. Utilizing almost 1000 variables, our best predictive model accurately determined which PAD patients would go on to develop major adverse cardiac and cerebrovascular events with an area under the curve of 0.81 (95% CI, 0.80-0.83). CONCLUSIONS Machine learning algorithms applied to data in the electronic health record can learn models that accurately identify PAD patients at risk of future major adverse cardiac and cerebrovascular events, highlighting the great potential of electronic health records to provide automated risk stratification for cardiovascular diseases. Common data models that can enable cross-institution research and technology development could potentially be an important aspect of widespread adoption of newer risk-stratification models.
|
90
|
Ding DY, Simpson C, Pfohl S, Kale DC, Jung K, Shah NH. The Effectiveness of Multitask Learning for Phenotyping with Electronic Health Records Data. PACIFIC SYMPOSIUM ON BIOCOMPUTING. PACIFIC SYMPOSIUM ON BIOCOMPUTING 2019; 24:18-29. [PMID: 30864307 PMCID: PMC6662921] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
Abstract
Electronic phenotyping is the task of ascertaining whether an individual has a medical condition of interest by analyzing their medical record and is foundational in clinical informatics. Increasingly, electronic phenotyping is performed via supervised learning. We investigate the effectiveness of multitask learning for phenotyping using electronic health records (EHR) data. Multitask learning aims to improve model performance on a target task by jointly learning additional auxiliary tasks and has been used in disparate areas of machine learning. However, its utility when applied to EHR data has not been established, and prior work suggests that its benefits are inconsistent. We present experiments that elucidate when multitask learning with neural nets improves performance for phenotyping using EHR data relative to neural nets trained for a single phenotype and to well-tuned baselines. We find that multitask neural nets consistently outperform single-task neural nets for rare phenotypes but underperform for relatively more common phenotypes. The effect size increases as more auxiliary tasks are added. Moreover, multitask learning reduces the sensitivity of neural nets to hyperparameter settings for rare phenotypes. Last, we quantify phenotype complexity and find that neural nets trained with or without multitask learning do not improve on simple baselines unless the phenotypes are sufficiently complex.
|
91
|
Agarwal V, Smuck M, Tomkins-Lane C, Shah NH. Inferring Physical Function From Wearable Activity Monitors: Analysis of Free-Living Activity Data From Patients With Knee Osteoarthritis. JMIR Mhealth Uhealth 2018; 6:e11315. [PMID: 30394876 PMCID: PMC6315255 DOI: 10.2196/11315] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 06/18/2018] [Revised: 09/20/2018] [Accepted: 10/01/2018] [Indexed: 12/18/2022] Open
Abstract
BACKGROUND Clinical assessments for physical function do not objectively quantify routine daily activities. Wearable activity monitors (WAMs) enable objective measurement of daily activities, but it remains unclear how these map to clinically measured physical function measures. OBJECTIVE This study aims to derive a representation of physical function from daily measurements of free-living activity obtained through a WAM. In addition, we evaluate our derived measure against objectively measured function using an ordinal classification setup. METHODS We defined function profiles representing average time spent in a set of pattern classes over consecutive days. We constructed a function profile using minute-level activity data from a WAM available from the Osteoarthritis Initiative. Using the function profile as input, we trained statistical models that classified subjects into quartiles of objective measurements of physical function as measured through the 400-m walk test, 20-m walk test, and 5 times sit-stand test. Furthermore, we evaluated model performance on held-out data. RESULTS The function profile derived from minute-level activity data can accurately predict physical performance as measured through clinical assessments. Using held-out data, the Goodman-Kruskal Gamma statistic obtained in classifying performance values in the first quartile, interquartile range, and the fourth quartile was 0.62, 0.53, and 0.51 for the 400-m walk, 20-m walk, and 5 times sit-stand tests, respectively. CONCLUSIONS Function profiles accurately represent physical function, as demonstrated by the relationship between the profiles and clinically measured physical performance. The estimation of physical performance through function profiles derived from free-living activity data may enable remote functional monitoring of patients.
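For readers unfamiliar with the Goodman-Kruskal Gamma statistic reported above, the following is a minimal sketch of how it can be computed for ordinal labels. It is not the authors' code: the function name `goodman_kruskal_gamma` and the three-level encoding (0 = first quartile, 1 = interquartile range, 2 = fourth quartile) are assumptions made for illustration, and the labels are toy values rather than study data.

```python
from itertools import combinations


def goodman_kruskal_gamma(true_levels, predicted_levels):
    """Return (C - D) / (C + D), where C and D count concordant and
    discordant pairs; pairs tied on either variable are ignored."""
    if len(true_levels) != len(predicted_levels):
        raise ValueError("inputs must have the same length")
    concordant = 0
    discordant = 0
    pairs = zip(true_levels, predicted_levels)
    for (t1, p1), (t2, p2) in combinations(pairs, 2):
        direction = (t1 - t2) * (p1 - p2)
        if direction > 0:
            concordant += 1   # both orderings agree
        elif direction < 0:
            discordant += 1   # orderings disagree
    if concordant + discordant == 0:
        raise ValueError("gamma is undefined when every pair is tied")
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical ordinal labels: 0 = first quartile, 1 = interquartile
# range, 2 = fourth quartile (toy values, not taken from the study).
observed = [0, 1, 2]
predicted = [0, 2, 1]
print(goodman_kruskal_gamma(observed, predicted))
```

Gamma ranges from -1 (every untied pair is ordered in the opposite direction) to 1 (every untied pair is ordered the same way), so it rewards predictions that preserve the ordering of the categories rather than only exact matches.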
|
92
|
Abstract
BACKGROUND Access to palliative care is a key quality metric which most healthcare organizations strive to improve. The primary challenges to increasing palliative care access are a combination of physicians over-estimating patient prognoses and a general shortage of palliative care staff. This, in combination with treatment inertia, can result in a mismatch between patient wishes and their actual care towards the end of life. METHODS In this work, we address this problem, with Institutional Review Board approval, using machine learning and Electronic Health Record (EHR) data of patients. We train a Deep Neural Network model on the EHR data of patients from previous years to predict mortality of patients within the next 3-12 month period. This prediction is used as a proxy decision for identifying patients who could benefit from palliative care. RESULTS The EHR data of all admitted patients are evaluated every night by this algorithm, and the palliative care team is automatically notified of the list of patients with a positive prediction. In addition, we present a novel technique for decision interpretation, which we use to provide explanations for the model's predictions. CONCLUSION The automatic screening and notification saves the palliative care team the burden of time-consuming chart reviews of all patients, and allows them to take a proactive approach in reaching out to such patients rather than relying on referrals from the treating physicians.
|
93
|
|
94
|
Seneviratne MG, Banda JM, Brooks JD, Shah NH, Hernandez-Boussard TM. Identifying Cases of Metastatic Prostate Cancer Using Machine Learning on Electronic Health Records. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2018; 2018:1498-1504. [PMID: 30815195 PMCID: PMC6371284] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Cancer stage is rarely captured in structured form in the electronic health record (EHR). We evaluate the performance of a classifier, trained on structured EHR data, in identifying prostate cancer patients with metastatic disease. Using EHR data for a cohort of 5,861 prostate cancer patients mapped to the Observational Health Data Sciences and Informatics (OHDSI) data model, we constructed feature vectors containing frequency counts of conditions, procedures, medications, observations and laboratory values. Staging information from the California Cancer Registry was used as the ground truth. For identifying patients with metastatic disease, a random forest model achieved a precision of 0.90 and a recall of 0.40 using data within 12 months of diagnosis. This compares with a precision of 0.33 and a recall of 0.54 for an ICD code-based query. High-precision classifiers using hundreds of structured data elements significantly outperform ICD queries, and may assist in identifying cohorts for observational research or clinical trial matching.
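The abstract reports precision and recall separately, which makes the two approaches hard to compare at a glance: the classifier is more precise while the ICD query has higher recall. As a rough illustration only, the sketch below combines each reported pair into an F1 score (the harmonic mean of precision and recall). F1 is not reported in the abstract, and the helper name `f1_score` is an assumption introduced here.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract above.
random_forest_f1 = f1_score(precision=0.90, recall=0.40)
icd_query_f1 = f1_score(precision=0.33, recall=0.54)

# F1 is a derived summary for illustration, not a result from the paper.
print(f"random forest F1: {random_forest_f1:.2f}")
print(f"ICD query F1:     {icd_query_f1:.2f}")
```

A single summary such as F1 weights precision and recall equally; for cohort identification, where false positives are costly, precision alone may be the more relevant figure.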
|
95
|
Coulet A, Shah NH, Wack M, Chawki MB, Jay N, Dumontier M. Predicting the need for a reduced drug dose, at first prescription. Sci Rep 2018; 8:15558. [PMID: 30349060 PMCID: PMC6197198 DOI: 10.1038/s41598-018-33980-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 10/04/2017] [Accepted: 10/06/2018] [Indexed: 01/21/2023] Open
Abstract
Prescribing the right drug with the right dose is a central tenet of precision medicine. We examined the use of patients’ prior Electronic Health Records to predict a reduction in drug dosage. We focus on drugs that interact with the P450 enzyme family, because their dosage is known to be sensitive and variable. We extracted diagnostic codes, conditions reported in clinical notes, and laboratory orders from Stanford’s clinical data warehouse to construct cohorts of patients that either did or did not need a dose change. After feature selection, we trained models to predict the patients who will (or will not) require a dose change after being prescribed one of 34 drugs across 23 drug classes. Overall, we can predict (AUC ≥ 0.70–0.95) a dose reduction for 23 drugs and 22 drug classes. Several of these drugs are associated with clinical guidelines that recommend dose reduction exclusively in the case of adverse reaction. For these cases, a reduction in dosage may be considered as a surrogate for an adverse reaction, which our system could indirectly help predict and prevent. Our study illustrates the role machine learning may take in providing guidance in setting the starting dose for drugs associated with response variability.
|
96
|
Wang JK, Hom J, Balasubramanian S, Schuler A, Shah NH, Goldstein MK, Baiocchi MTM, Chen JH. An evaluation of clinical order patterns machine-learned from clinician cohorts stratified by patient mortality outcomes. J Biomed Inform 2018; 86:109-119. [PMID: 30195660 PMCID: PMC6250126 DOI: 10.1016/j.jbi.2018.09.005] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 02/14/2018] [Revised: 08/09/2018] [Accepted: 09/05/2018] [Indexed: 01/06/2023]
Abstract
OBJECTIVE Evaluate the quality of clinical order practice patterns machine-learned from clinician cohorts stratified by patient mortality outcomes. MATERIALS AND METHODS Inpatient electronic health records from 2010 to 2013 were extracted from a tertiary academic hospital. Clinicians (n = 1822) were stratified into low-mortality (21.8%, n = 397) and high-mortality (6.0%, n = 110) extremes using a two-sided P-value score quantifying deviation of observed vs. expected 30-day patient mortality rates. Three patient cohorts were assembled: patients seen by low-mortality clinicians, high-mortality clinicians, and an unfiltered crowd of all clinicians (n = 1046, 1046, and 5230 post-propensity score matching, respectively). Predicted order lists were automatically generated from recommender system algorithms trained on each patient cohort and evaluated against (i) real-world practice patterns reflected in patient cases with better-than-expected mortality outcomes and (ii) reference standards derived from clinical practice guidelines. RESULTS Across six common admission diagnoses, order lists learned from the crowd demonstrated the greatest alignment with guideline references (AUROC range = 0.86-0.91), performing on par with or better than those learned from low-mortality clinicians (0.79-0.84, P < 10⁻⁵) or manually-authored hospital order sets (0.65-0.77, P < 10⁻³). The same trend was observed in evaluating model predictions against better-than-expected patient cases, with the crowd model (AUROC mean = 0.91) outperforming the low-mortality model (0.87, P < 10⁻¹⁶) and order set benchmarks (0.78, P < 10⁻³⁵). DISCUSSION Whether machine-learning models are trained on all clinicians or a subset of experts illustrates a bias-variance tradeoff in data usage. Defining robust metrics to assess quality based on internal (e.g. practice patterns from better-than-expected patient cases) or external reference standards (e.g. clinical practice guidelines) is critical to assess decision support content. CONCLUSION Learning relevant decision support content from all clinicians is as robust as, if not more robust than, learning from a select subgroup of clinicians favored by patient outcomes.
|
97
|
Smuck M, Agarwal V, Shah NH. Poster 369: Inferring Functional Status in People with Knee Osteoarthritis Using Patterns Derived from Physical Activity Monitors. PM R 2018. [DOI: 10.1016/j.pmrj.2018.09.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2022]
|
98
|
Vashisht R, Jung K, Schuler A, Banda JM, Park RW, Jin S, Li L, Dudley JT, Johnson KW, Shervey MM, Xu H, Wu Y, Natrajan K, Hripcsak G, Jin P, Van Zandt M, Reckard A, Reich CG, Weaver J, Schuemie MJ, Ryan PB, Callahan A, Shah NH. Association of Hemoglobin A1c Levels With Use of Sulfonylureas, Dipeptidyl Peptidase 4 Inhibitors, and Thiazolidinediones in Patients With Type 2 Diabetes Treated With Metformin: Analysis From the Observational Health Data Sciences and Informatics Initiative. JAMA Netw Open 2018; 1:e181755. [PMID: 30646124 PMCID: PMC6324274 DOI: 10.1001/jamanetworkopen.2018.1755] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Indexed: 12/15/2022] Open
Abstract
IMPORTANCE Consensus around an efficient second-line treatment option for type 2 diabetes (T2D) remains ambiguous. The availability of electronic medical records and insurance claims data, which capture routine medical practice, accessed via the Observational Health Data Sciences and Informatics network presents an opportunity to generate evidence for the effectiveness of second-line treatments. OBJECTIVE To identify which drug classes among sulfonylureas, dipeptidyl peptidase 4 (DPP-4) inhibitors, and thiazolidinediones are associated with reduced hemoglobin A1c (HbA1c) levels and lower risk of myocardial infarction, kidney disorders, and eye disorders in patients with T2D treated with metformin as a first-line therapy. DESIGN, SETTING, AND PARTICIPANTS Three retrospective, propensity-matched, new-user cohort studies with replication across 8 sites were performed from 1975 to 2017. Medical data of 246 558 805 patients from multiple countries from the Observational Health Data Sciences and Informatics (OHDSI) initiative were included and medical data sets were transformed into a unified common data model, with analysis done using open-source analytical tools. Participants included patients with T2D receiving metformin with at least 1 prior HbA1c laboratory test who were then prescribed either sulfonylureas, DPP-4 inhibitors, or thiazolidinediones. Data analysis was conducted from 2015 to 2018. EXPOSURES Treatment with sulfonylureas, DPP-4 inhibitors, or thiazolidinediones starting at least 90 days after the initial prescription of metformin. MAIN OUTCOMES AND MEASURES The primary outcome is the first observation of the reduction of HbA1c level to 7% of total hemoglobin or less after prescription of a second-line drug. Secondary outcomes are myocardial infarction, kidney disorder, and eye disorder after prescription of a second-line drug. RESULTS A total of 246 558 805 patients (126 977 785 women [51.5%]) were analyzed. 
Effectiveness of sulfonylureas, DPP-4 inhibitors, and thiazolidinediones prescribed after metformin to lower HbA1c level to 7% or less of total hemoglobin remained indistinguishable in patients with T2D. Patients treated with sulfonylureas compared with DPP-4 inhibitors had a small increased consensus hazard ratio of myocardial infarction (1.12; 95% CI, 1.02-1.24) and eye disorders (1.15; 95% CI, 1.11-1.19) in the meta-analysis. Hazard of observing kidney disorders after treatment with sulfonylureas, DPP-4 inhibitors, or thiazolidinediones was equally likely. CONCLUSIONS AND RELEVANCE The examined drug classes did not differ in lowering HbA1c and in hazards of kidney disorders in patients with T2D treated with metformin as a first-line therapy. Sulfonylureas had a small, higher observed hazard of myocardial infarction and eye disorders compared with DPP-4 inhibitors in the meta-analysis. The OHDSI collaborative network can be used to conduct a large international study examining the effectiveness of second-line treatment choices made in clinical management of T2D.
|
99
|
Banda JM, Seneviratne M, Hernandez-Boussard T, Shah NH. Advances in Electronic Phenotyping: From Rule-Based Definitions to Machine Learning Models. Annu Rev Biomed Data Sci 2018; 1:53-68. [PMID: 31218278 PMCID: PMC6583807 DOI: 10.1146/annurev-biodatasci-080917-013315] [Citation(s) in RCA: 97] [Impact Index Per Article: 16.2] [Indexed: 12/18/2022]
Abstract
With the widespread adoption of electronic health records (EHRs), large repositories of structured and unstructured patient data are becoming available to conduct observational studies. Finding patients with specific conditions or outcomes, known as phenotyping, is one of the most fundamental research problems encountered when using these new EHR data. Phenotyping forms the basis of translational research, comparative effectiveness studies, clinical decision support, and population health analyses using routinely collected EHR data. We review the evolution of electronic phenotyping, from the early rule-based methods to the cutting edge of supervised and unsupervised machine learning models. We aim to cover the most influential papers in commensurate detail, with a focus on both methodology and implementation. Finally, future research directions are explored.
|
100
|
Wang JK, Schuler A, Shah NH, Baiocchi MTM, Chen JH. Inpatient Clinical Order Patterns Machine-Learned From Teaching Versus Attending-Only Medical Services. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE PROCEEDINGS. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE 2018; 2017:226-235. [PMID: 29888077 PMCID: PMC5961816] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/29/2022]
Abstract
Clinical order patterns derived from data-mining electronic health records can be a valuable source of decision support content. However, the quality of crowdsourcing such patterns may be suspect depending on the population learned from. For example, it is unclear whether learning inpatient practice patterns from a university teaching service, characterized by physician-trainee teams with an emphasis on medical education, will be of variable quality versus an attending-only medical service that focuses strictly on clinical care. Machine learning clinical order patterns by association rule episode mining from teaching versus attending-only inpatient medical services illustrated some practice variability, but converged towards similar top results in either case. We further validated the automatically generated content by confirming alignment with external reference standards extracted from clinical practice guidelines.
|