Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Ni Y, Alwell K, Moomaw CJ, Woo D, Adeoye O, Flaherty ML, Ferioli S, Mackey J, De Los Rios La Rosa F, Martini S, Khatri P, Kleindorfer D, Kissela BM. Towards phenotyping stroke: Leveraging data from a large-scale epidemiological study to detect stroke diagnosis. PLoS One 2018;13:e0192586. [PMID: 29444182 PMCID: PMC5812624 DOI: 10.1371/journal.pone.0192586] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 01/09/2017] [Accepted: 01/26/2018] [Indexed: 01/30/2023] Open

Cited by Other Article(s)

Lim H, Park Y, Hong JH, Yoo KB, Seo KD. Use of machine learning techniques for identifying ischemic stroke instead of the rule-based methods: a nationwide population-based study. Eur J Med Res 2024;29:6. [PMID: 38173022 PMCID: PMC10763197 DOI: 10.1186/s40001-023-01594-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Accepted: 12/13/2023] [Indexed: 01/05/2024] Open

Li Q, Chi L, Zhao W, Wu L, Jiao C, Zheng X, Zhang K, Li X. Machine learning prediction of motor function in chronic stroke patients: a systematic review and meta-analysis. Front Neurol 2023;14:1039794. [PMID: 37388543 PMCID: PMC10299899 DOI: 10.3389/fneur.2023.1039794] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/08/2022] [Accepted: 05/25/2023] [Indexed: 07/01/2023] Open

Abstract

Background

Recent studies have reported that machine learning (ML), with its relatively strong capacity for processing non-linear data and its adaptability, could improve the accuracy and efficiency of prediction. This article summarizes the published studies on ML models that predict motor function 3-6 months post-stroke.

Methods

A systematic literature search was conducted in PubMed, Embase, Cochrane and Web of Science as of April 3, 2023 for studies on ML prediction of motor function in stroke patients. The quality of the literature was assessed using the Prediction model Risk Of Bias Assessment Tool (PROBAST). A random-effects model was preferred for meta-analysis using R 4.2.0 because of the different variables and parameters.
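The random-effects pooling mentioned above can be illustrated with a minimal sketch. The abstract does not say which estimator or R package was used, so the DerSimonian-Laird estimator below is an assumption, the function name `pool_random_effects` is hypothetical, and the inputs are made-up numbers rather than data from the review. Pooling C-statistics on their raw scale is also a simplification; analyses often work on a transformed scale.

```python
import math

def pool_random_effects(estimates, std_errors, z=1.96):
    """Pool study estimates with a DerSimonian-Laird random-effects model.

    Returns (pooled, ci_low, ci_high, tau2). This is an illustrative
    estimator choice, not necessarily the one used in the review.
    """
    variances = [se ** 2 for se in std_errors]
    weights = [1.0 / v for v in variances]          # fixed-effect weights
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum_w

    # Cochran's Q measures between-study heterogeneity.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w

    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each within-study variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / sum_re
    se_pooled = math.sqrt(1.0 / sum_re)
    return pooled, pooled - z * se_pooled, pooled + z * se_pooled, tau2

# Hypothetical C-statistics and standard errors from three studies.
pooled, low, high, tau2 = pool_random_effects([0.70, 0.80, 0.90], [0.02, 0.02, 0.02])
print(f"pooled={pooled:.3f} CI=({low:.3f}, {high:.3f}) tau2={tau2:.4f}")
```

When the studies disagree more than their standard errors would predict, `tau2` becomes positive and the interval widens relative to a fixed-effect pooling of the same inputs.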

Results

A total of 44 studies were included in this meta-analysis, involving 72,368 patients and 136 models. Models were categorized into subgroups according to the predicted outcome Modified Rankin Scale cut-off value and whether they were constructed based on radiomics. C-statistics, sensitivity, and specificity were calculated. The random-effects model showed that the C-statistics of all models were 0.81 (95% CI: 0.79; 0.83) in the training set and 0.82 (95% CI: 0.80; 0.85) in the validation set. According to different Modified Rankin Scale cut-off values, C-statistics of ML models predicting Modified Rankin Scale > 2 (used most widely) in stroke patients were 0.81 (95% CI: 0.78; 0.84) in the training set, and 0.84 (95% CI: 0.81; 0.87) in the validation set. C-statistics of radiomics-based ML models in the training set and validation set were 0.81 (95% CI: 0.78; 0.84) and 0.87 (95% CI: 0.83; 0.90), respectively.

Conclusion

ML can be used as an assessment tool for predicting motor function in patients 3-6 months post-stroke. Additionally, the study found that ML models with radiomics as a predictive variable also had good predictive capabilities. This systematic review provides valuable guidance for the future optimization of ML prediction systems that predict poor motor outcomes in stroke patients.

Systematic review registration

https://www.crd.york.ac.uk/prospero/display_record.php?ID=CRD42022335260, identifier: CRD42022335260.

Surianarayanan C, Lawrence JJ, Chelliah PR, Prakash E, Hewage C. Convergence of Artificial Intelligence and Neuroscience towards the Diagnosis of Neurological Disorders-A Scoping Review. SENSORS (BASEL, SWITZERLAND) 2023;23:3062. [PMID: 36991773 PMCID: PMC10053494 DOI: 10.3390/s23063062] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/25/2023] [Revised: 03/09/2023] [Accepted: 03/09/2023] [Indexed: 06/19/2023]

Abstract

Artificial intelligence (AI) is a field of computer science that deals with the simulation of human intelligence using machines so that such machines gain problem-solving and decision-making capabilities similar to those of the human brain. Neuroscience is the scientific study of the structure and cognitive functions of the brain. Neuroscience and AI are mutually interrelated. These two fields help each other in their advancements. The theory of neuroscience has brought many distinct improvisations into the AI field. The biological neural network has led to the realization of complex deep neural network architectures that are used to develop versatile applications, such as text processing, speech recognition, object detection, etc. Additionally, neuroscience helps to validate the existing AI-based models. Reinforcement learning in humans and animals has inspired computer scientists to develop algorithms for reinforcement learning in artificial systems, which enables those systems to learn complex strategies without explicit instruction. Such learning helps in building complex applications, like robot-based surgery, autonomous vehicles, gaming applications, etc. In turn, with its ability to intelligently analyze complex data and extract hidden patterns, AI fits as a perfect choice for analyzing neuroscience data that are very complex. Large-scale AI-based simulations help neuroscientists test their hypotheses. Through an interface with the brain, an AI-based system can extract the brain signals and commands that are generated according to the signals. These commands are fed into devices, such as a robotic arm, which helps in the movement of paralyzed muscles or other human parts. AI has several use cases in analyzing neuroimaging data and reducing the workload of radiologists. The study of neuroscience helps in the early detection and diagnosis of neurological disorders. In the same way, AI can effectively be applied to the prediction and detection of neurological disorders. Thus, in this paper, a scoping review has been carried out on the mutual relationship between AI and neuroscience, emphasizing the convergence between AI and neuroscience in order to detect and predict various neurological disorders.

Eysenbach G, Tan X, Padman R. A Machine Learning Approach to Support Urgent Stroke Triage Using Administrative Data and Social Determinants of Health at Hospital Presentation: Retrospective Study. J Med Internet Res 2023;25:e36477. [PMID: 36716097 PMCID: PMC9926350 DOI: 10.2196/36477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2022] [Revised: 07/17/2022] [Accepted: 12/18/2022] [Indexed: 01/31/2023] Open

Abstract

BACKGROUND

The key to effective stroke management is timely diagnosis and triage. Machine learning (ML) methods developed to assist in detecting stroke have focused on interpreting detailed clinical data such as clinical notes and diagnostic imaging results. However, such information may not be readily available when patients are initially triaged, particularly in rural and underserved communities.

OBJECTIVE

This study aimed to develop an ML stroke prediction algorithm based on data widely available at the time of patients' hospital presentations and assess the added value of social determinants of health (SDoH) in stroke prediction.

METHODS

We conducted a retrospective study of the emergency department and hospitalization records from 2012 to 2014 from all the acute care hospitals in the state of Florida, merged with the SDoH data from the American Community Survey. A case-control design was adopted to construct stroke and stroke mimic cohorts. We compared the algorithm performance and feature importance measures of the ML models (ie, gradient boosting machine and random forest) with those of the logistic regression model based on 3 sets of predictors. To provide insights into the prediction and ultimately assist care providers in decision-making, we used TreeSHAP for tree-based ML models to explain the stroke prediction.

RESULTS

Our analysis included 143,203 hospital visits of unique patients, and it was confirmed based on the principal diagnosis at discharge that 73% (n=104,662) of these patients had a stroke. The approach proposed in this study has high sensitivity and is particularly effective at reducing the misdiagnosis of dangerous stroke chameleons (false-negative rate <4%). ML classifiers consistently outperformed the benchmark logistic regression in all 3 input combinations. We found significant consistency across the models in the features that explain their performance. The most important features are age, the number of chronic conditions on admission, and primary payer (eg, Medicare or private insurance). Although both the individual- and community-level SDoH features helped improve the predictive performance of the models, the inclusion of the individual-level SDoH features led to a much larger improvement (area under the receiver operating characteristic curve increased from 0.694 to 0.823) than the inclusion of the community-level SDoH features (area under the receiver operating characteristic curve increased from 0.823 to 0.829).

CONCLUSIONS

Using data widely available at the time of patients' hospital presentations, we developed a stroke prediction model with high sensitivity and reasonable specificity. The prediction algorithm uses variables that are routinely collected by providers and payers and might be useful in underresourced hospitals with limited availability of sensitive diagnostic tools or incomplete data-gathering capabilities.

Yang S, Varghese P, Stephenson E, Tu K, Gronsbell J. Machine learning approaches for electronic health records phenotyping: a methodical review. J Am Med Inform Assoc 2023;30:367-381. [PMID: 36413056 PMCID: PMC9846699 DOI: 10.1093/jamia/ocac216] [Citation(s) in RCA: 23] [Impact Index Per Article: 23.0] [Received: 07/28/2022] [Revised: 09/27/2022] [Accepted: 10/27/2022] [Indexed: 11/23/2022] Open

Abstract

OBJECTIVE

Accurate and rapid phenotyping is a prerequisite to leveraging electronic health records for biomedical research. While early phenotyping relied on rule-based algorithms curated by experts, machine learning (ML) approaches have emerged as an alternative to improve scalability across phenotypes and healthcare settings. This study evaluates ML-based phenotyping with respect to (1) the data sources used, (2) the phenotypes considered, (3) the methods applied, and (4) the reporting and evaluation methods used.

MATERIALS AND METHODS

We searched PubMed and Web of Science for articles published between 2018 and 2022. After screening 850 articles, we recorded 37 variables on 100 studies.

RESULTS

Most studies utilized data from a single institution and included information in clinical notes. Although chronic conditions were most commonly considered, ML also enabled the characterization of nuanced phenotypes such as social determinants of health. Supervised deep learning was the most popular ML paradigm, while semi-supervised and weakly supervised learning were applied to expedite algorithm development and unsupervised learning to facilitate phenotype discovery. ML approaches did not uniformly outperform rule-based algorithms, but deep learning offered a marginal improvement over traditional ML for many conditions.

DISCUSSION

Despite the progress in ML-based phenotyping, most articles focused on binary phenotypes and few articles evaluated external validity or used multi-institution data. Study settings were infrequently reported and analytic code was rarely released.

CONCLUSION

Continued research in ML-based phenotyping is warranted, with emphasis on characterizing nuanced phenotypes, establishing reporting and evaluation standards, and developing methods to accommodate misclassified phenotypes due to algorithm errors in downstream applications.

Lee T, Jeon ET, Jung JM, Lee M. Deep-Learning-Based Stroke Screening Using Skeleton Data from Neurological Examination Videos. J Pers Med 2022;12:1691. [PMID: 36294830 PMCID: PMC9604814 DOI: 10.3390/jpm12101691] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2022] [Revised: 09/25/2022] [Accepted: 10/07/2022] [Indexed: 11/19/2022] Open

Liu L, Wu DTY, Spooner SA, Ni Y. Development and Evaluation of an Automated Approach to Detect Weight Abnormalities in Pediatric Weight Charts. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2022;2021:783-792. [PMID: 35308946 PMCID: PMC8861738] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]

Mainali S, Darsie ME, Smetana KS. Machine Learning in Action: Stroke Diagnosis and Outcome Prediction. Front Neurol 2021;12:734345. [PMID: 34938254 PMCID: PMC8685212 DOI: 10.3389/fneur.2021.734345] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 07/01/2021] [Accepted: 10/28/2021] [Indexed: 01/01/2023] Open

Predicting Mortality in Patients with Stroke Using Data Mining Techniques. ACTA INFORMATICA PRAGENSIA 2021. [DOI: 10.18267/j.aip.163] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open

Atrial fibrillation detection in primary care during blood pressure measurements and using a smartphone cardiac monitor. Sci Rep 2021;11:17721. [PMID: 34489508 PMCID: PMC8421380 DOI: 10.1038/s41598-021-97475-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/20/2021] [Accepted: 08/24/2021] [Indexed: 11/22/2022] Open

Wyrwa JM, Shirel TM, Hostetter TA, Schneider AL, Hoffmire CA, Stearns-Yoder KA, Forster JE, Odom NE, Brenner LA. Suicide After Stroke in the United States Veteran Health Administration Population. Arch Phys Med Rehabil 2021;102:1729-1734. [PMID: 33811852 DOI: 10.1016/j.apmr.2021.03.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/26/2020] [Revised: 03/10/2021] [Accepted: 03/12/2021] [Indexed: 11/22/2022]

Abstract

OBJECTIVE

To evaluate risk for suicide among veterans with a history of stroke, seeking care within the Veterans Health Administration (VHA), we analyzed existing clinical data.

DESIGN

This retrospective cohort study was approved and performed in accordance with the local Institutional Review Board. Veterans were identified via the VHA's Corporate Data Warehouse. Initial eligibility criteria included confirmed veteran status and at least 90 days of VHA utilization between fiscal years 2001-2015. Cox proportional hazards models were used to assess the association between history of stroke and suicide. Among those veterans who died by suicide, the association between history of stroke and method of suicide was also investigated.

SETTING

VHA.

PARTICIPANTS

Veterans with at least 90 days of VHA utilization between fiscal years 2001-2015 (N=1,647,671). Data from these 1,647,671 veterans were analyzed (1,405,762 without stroke and 241,909 with stroke).

INTERVENTIONS

Not applicable.

MAIN OUTCOME MEASURES

Suicide and method of suicide.

RESULTS

The fully adjusted model, which controlled for age, sex, mental health diagnoses, mild traumatic brain injury, and modified Charlson/Deyo Index (stroke-related diagnoses excluded), demonstrated a hazard ratio of 1.13 (95% confidence interval, 1.02-1.25; P=.02). The majority of suicides in both cohorts was by firearm, and a significantly larger proportion of suicides occurred by firearm in the group with stroke than the cohort without (81.2% vs 76.6%).

CONCLUSIONS

Findings suggest that veterans with a history of stroke are at increased risk for suicide, specifically by firearm, compared with veterans without a history of stroke. Increased efforts are needed to address the mental health needs and lethal means safety of veterans with a history of stroke, with the goal of improving function and decreasing negative psychiatric outcomes, such as suicide.

Zhao Y, Fu S, Bielinski SJ, Decker PA, Chamberlain AM, Roger VL, Liu H, Larson NB. Natural Language Processing and Machine Learning for Identifying Incident Stroke From Electronic Health Records: Algorithm Development and Validation. J Med Internet Res 2021;23:e22951. [PMID: 33683212 PMCID: PMC7985804 DOI: 10.2196/22951] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/27/2020] [Revised: 08/25/2020] [Accepted: 01/20/2021] [Indexed: 11/29/2022] Open

Abstract

Background

Stroke is an important clinical outcome in cardiovascular research. However, the ascertainment of incident stroke is typically accomplished via time-consuming manual chart abstraction. Current phenotyping efforts using electronic health records for stroke focus on case ascertainment rather than incident disease, which requires knowledge of the temporal sequence of events.

Objective

The aim of this study was to develop a machine learning–based phenotyping algorithm for incident stroke ascertainment based on diagnosis codes, procedure codes, and clinical concepts extracted from clinical notes using natural language processing.

Methods

The algorithm was trained and validated using an existing epidemiology cohort consisting of 4914 patients with atrial fibrillation (AF) with manually curated incident stroke events. Various combinations of feature sets and machine learning classifiers were compared. Using a heuristic rule based on the composition of concepts and codes, we further detected the stroke subtype (ischemic stroke/transient ischemic attack or hemorrhagic stroke) of each identified stroke. The algorithm was further validated using a cohort (n=150) drawn by stratified sampling from a population in Olmsted County, Minnesota (N=74,314).

Results

Among the 4914 patients with AF, 740 had validated incident stroke events. The best-performing stroke phenotyping algorithm used clinical concepts, diagnosis codes, and procedure codes as features in a random forest classifier. Among patients with stroke codes in the general population sample, the best-performing model achieved a positive predictive value of 86% (43/50; 95% CI 0.74-0.93) and a negative predictive value of 96% (96/100). For subtype identification, we achieved an accuracy of 83% in the AF cohort and 80% in the general population sample.
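The predictive values reported above are simple proportions, and their confidence intervals can be sketched in a few lines. The abstract does not state which interval method was used; the Wilson score interval below is an assumption, chosen because it reproduces the reported PPV bounds (0.74-0.93) to two decimals. The function name `wilson_interval` is hypothetical.

```python
import math

def wilson_interval(successes, total, z=1.96):
    """Wilson score confidence interval for a binomial proportion."""
    p = successes / total
    denom = 1.0 + z * z / total
    center = (p + z * z / (2.0 * total)) / denom
    half_width = z * math.sqrt(
        p * (1.0 - p) / total + z * z / (4.0 * total * total)
    ) / denom
    return center - half_width, center + half_width

# Counts taken from the abstract: PPV = 43/50, NPV = 96/100.
ppv_low, ppv_high = wilson_interval(43, 50)
npv_low, npv_high = wilson_interval(96, 100)
print(f"PPV {43/50:.2f} CI ({ppv_low:.2f}, {ppv_high:.2f})")
print(f"NPV {96/100:.2f} CI ({npv_low:.2f}, {npv_high:.2f})")
```

The Wilson interval stays within [0, 1] and behaves reasonably for small samples and proportions near the boundary, which is why it is a common default for counts such as 43/50.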

Conclusions

We developed and validated a machine learning–based algorithm that performed well for identifying incident stroke and for determining type of stroke. The algorithm also performed well on a sample from a general population, further demonstrating its generalizability and potential for adoption by other institutions.

Lee S, Doktorchik C, Martin EA, D'Souza AG, Eastwood C, Shaheen AA, Naugler C, Lee J, Quan H. Electronic Medical Record-Based Case Phenotyping for the Charlson Conditions: Scoping Review. JMIR Med Inform 2021;9:e23934. [PMID: 33522976 PMCID: PMC7884219 DOI: 10.2196/23934] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 08/28/2020] [Revised: 11/20/2020] [Accepted: 12/05/2020] [Indexed: 12/16/2022] Open

Abstract

Background

Electronic medical records (EMRs) contain large amounts of rich clinical information. Developing EMR-based case definitions, also known as EMR phenotyping, is an active area of research that has implications for epidemiology, clinical care, and health services research.

Objective

This review aims to describe and assess the present landscape of EMR-based case phenotyping for the Charlson conditions.

Methods

A scoping review of EMR-based algorithms for defining the Charlson comorbidity index conditions was completed. This study covered articles published between January 2000 and April 2020, both inclusive. Embase (Excerpta Medica database) and MEDLINE (Medical Literature Analysis and Retrieval System Online) were searched using keywords developed in the following 3 domains: terms related to EMR, terms related to case finding, and disease-specific terms. The manuscript follows the Preferred Reporting Items for Systematic reviews and Meta-analyses extension for Scoping Reviews (PRISMA) guidelines.

Results

A total of 274 articles representing 299 algorithms were assessed and summarized. Most studies were undertaken in the United States (181/299, 60.5%), followed by the United Kingdom (42/299, 14.0%) and Canada (15/299, 5.0%). These algorithms were mostly developed either in primary care (103/299, 34.4%) or inpatient (168/299, 56.2%) settings. Diabetes, congestive heart failure, myocardial infarction, and rheumatology had the highest number of developed algorithms. Data-driven and clinical rule–based approaches have been identified. EMR-based phenotype and algorithm development reflect the data access allowed by respective health systems, and algorithms vary in their performance.

Conclusions

Recognizing similarities and differences in health systems, data collection strategies, extraction, data release protocols, and existing clinical pathways is critical to algorithm development strategies. Several strategies to assist with phenotype-based case definitions have been proposed.

Affiliation(s)

Seungwon Lee: Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Health Services, Calgary, AB, Canada; Data Intelligence for Health Lab, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
Chelsea Doktorchik: Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
Elliot Asher Martin: Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Health Services, Calgary, AB, Canada
Adam Giles D'Souza: Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Health Services, Calgary, AB, Canada
Cathy Eastwood: Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
Abdel Aziz Shaheen: Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Medicine, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
Christopher Naugler: Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Pathology and Laboratory Medicine, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
Joon Lee: Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Data Intelligence for Health Lab, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Cardiac Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
Hude Quan: Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada

Aguiar de Sousa D, Katan M. Promising Use of Automated Electronic Phenotyping: Turning Big Data Into Big Value in Stroke Research. Stroke 2020;52:190-192. [PMID: 33297867 DOI: 10.1161/strokeaha.120.033061] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]

Thangaraj PM, Kummer BR, Lorberbaum T, Elkind MSV, Tatonetti NP. Comparative analysis, applications, and interpretation of electronic health record-based stroke phenotyping methods. BioData Min 2020;13:21. [PMID: 33372632 PMCID: PMC7720570 DOI: 10.1186/s13040-020-00230-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 02/26/2020] [Accepted: 11/15/2020] [Indexed: 01/14/2023] Open

Sung SF, Lin CY, Hu YH. EMR-Based Phenotyping of Ischemic Stroke Using Supervised Machine Learning and Text Mining Techniques. IEEE J Biomed Health Inform 2020;24:2922-2931. [DOI: 10.1109/jbhi.2020.2976931] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Indexed: 11/09/2022]

Zhao J, Zhang Y, Schlueter DJ, Wu P, Eric Kerchberger V, Trent Rosenbloom S, Wells QS, Feng Q, Denny JC, Wei WQ. Detecting time-evolving phenotypic topics via tensor factorization on electronic health records: Cardiovascular disease case study. J Biomed Inform 2019;98:103270. [PMID: 31445983 PMCID: PMC6783385 DOI: 10.1016/j.jbi.2019.103270] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 04/01/2019] [Revised: 07/10/2019] [Accepted: 08/16/2019] [Indexed: 12/12/2022]

Abstract

OBJECTIVE

Discovering subphenotypes of complex diseases can help characterize disease cohorts for investigative studies aimed at developing better diagnoses and treatments. Recent advances in unsupervised machine learning on electronic health record (EHR) data have enabled researchers to discover phenotypes without input from domain experts. However, most existing studies have ignored time and modeled diseases as discrete events. Uncovering the evolution of phenotypes - how they emerge, evolve and contribute to health outcomes - is essential to define more precise phenotypes and refine the understanding of disease progression. Our objective was to assess the benefits of an unsupervised approach that incorporates time to model diseases as dynamic processes in phenotype discovery.

METHODS

In this study, we applied a constrained non-negative tensor-factorization approach to characterize the complexity of a cardiovascular disease (CVD) patient cohort based on longitudinal EHR data. Through tensor-factorization, we identified a set of phenotypic topics (i.e., subphenotypes) that these patients established over the 10 years prior to the diagnosis of CVD, and showed their progression patterns. For each identified subphenotype, we examined its association with the risk for adverse cardiovascular outcomes estimated by the American College of Cardiology/American Heart Association Pooled Cohort Risk Equations, a conventional CVD-risk assessment tool frequently used in clinical practice. Furthermore, we compared the subsequent myocardial infarction (MI) rates among the six most prevalent subphenotypes using survival analysis.

RESULTS

From a cohort of 12,380 adults with CVD and 1068 unique PheCodes, we successfully identified 14 subphenotypes. Through the association analysis with estimated CVD risk for each subtype, we found that some phenotypic topics, such as vitamin D deficiency and depression, and urinary infections, cannot be explained by the conventional risk factors. Through a survival analysis, we found markedly different risks of subsequent MI following the diagnosis of CVD among the six most prevalent topics (p < 0.0001), indicating these topics may capture clinically meaningful subphenotypes of CVD.

CONCLUSION

This study demonstrates the potential benefits of using tensor-decomposition to model diseases as dynamic processes from longitudinal EHR data. Our results suggest that this data-driven approach may potentially help researchers identify complex and chronic disease subphenotypes in precision medicine research.
