Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Zhou SM, Fernandez-Gutierrez F, Kennedy J, Cooksey R, Atkinson M, Denaxas S, Siebert S, Dixon WG, O’Neill TW, Choy E, Sudlow C, Brophy S. Defining Disease Phenotypes in Primary Care Electronic Health Records by a Machine Learning Approach: A Case Study in Identifying Rheumatoid Arthritis. PLoS One 2016;11:e0154515. [PMID: 27135409 PMCID: PMC4852928 DOI: 10.1371/journal.pone.0154515] [Citation(s) in RCA: 53] [Impact Index Per Article: 6.6] [Received: 01/05/2016] [Accepted: 04/14/2016] [Indexed: 12/20/2022] Open

Cited by Other Article(s)

Inchingolo F, Inchingolo AM, Fatone MC, Avantario P, Del Vecchio G, Pezzolla C, Mancini A, Galante F, Palermo A, Inchingolo AD, Dipalma G. Management of Rheumatoid Arthritis in Primary Care: A Scoping Review. International Journal of Environmental Research and Public Health 2024;21:662. [PMID: 38928909 PMCID: PMC11203333 DOI: 10.3390/ijerph21060662] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Revised: 05/13/2024] [Accepted: 05/16/2024] [Indexed: 06/28/2024]

Abdulazeem H, Whitelaw S, Schauberger G, Klug SJ. A systematic review of clinical health conditions predicted by machine learning diagnostic and prognostic models trained or validated using real-world primary health care data. PLoS One 2023;18:e0274276. [PMID: 37682909 PMCID: PMC10491005 DOI: 10.1371/journal.pone.0274276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2022] [Accepted: 08/29/2023] [Indexed: 09/10/2023] Open

Abstract

With the advances in technology and data science, machine learning (ML) is being rapidly adopted by the health care sector. However, there is a lack of literature addressing the health conditions targeted by the ML prediction models within primary health care (PHC) to date. To fill this gap in knowledge, we conducted a systematic review following the PRISMA guidelines to identify health conditions targeted by ML in PHC. We searched the Cochrane Library, Web of Science, PubMed, Elsevier, BioRxiv, Association of Computing Machinery (ACM), and IEEE Xplore databases for studies published from January 1990 to January 2022. We included primary studies addressing ML diagnostic or prognostic predictive models that were supplied completely or partially by real-world PHC data. Studies selection, data extraction, and risk of bias assessment using the prediction model study risk of bias assessment tool were performed by two investigators. Health conditions were categorized according to international classification of diseases (ICD-10). Extracted data were analyzed quantitatively. We identified 106 studies investigating 42 health conditions. These studies included 207 ML prediction models supplied by the PHC data of 24.2 million participants from 19 countries. We found that 92.4% of the studies were retrospective and 77.3% of the studies reported diagnostic predictive ML models. A majority (76.4%) of all the studies were for models' development without conducting external validation. Risk of bias assessment revealed that 90.8% of the studies were of high or unclear risk of bias. The most frequently reported health conditions were diabetes mellitus (19.8%) and Alzheimer's disease (11.3%). Our study provides a summary on the presently available ML prediction models within PHC. We draw the attention of digital health policy makers, ML models developer, and health care professionals for more future interdisciplinary research collaboration in this regard.

Kennedy J, Kennedy N, Cooksey R, Choy E, Siebert S, Rahman M, Brophy S. Predicting a diagnosis of ankylosing spondylitis using primary care health records-A machine learning approach. PLoS One 2023;18:e0279076. [PMID: 37000839 PMCID: PMC10065228 DOI: 10.1371/journal.pone.0279076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2020] [Accepted: 12/01/2022] [Indexed: 04/03/2023] Open

Abstract

Ankylosing spondylitis is the second most common cause of inflammatory arthritis. However, a successful diagnosis can take a decade to confirm from symptom onset (via x-rays). The aim of this study was to use machine learning methods to develop a profile of the characteristics of people who are likely to be given a diagnosis of AS in future. The Secure Anonymised Information Linkage databank was used. Patients with ankylosing spondylitis were identified using their routine data and matched with controls who had no record of a diagnosis of ankylosing spondylitis or axial spondyloarthritis. Data was analysed separately for men and women. The model was developed using feature/variable selection and principal component analysis to develop decision trees. The decision tree with the highest average F value was selected and validated with a test dataset. The model for men indicated that lower back pain, uveitis, and NSAID use under age 20 is associated with AS development. The model for women showed an older age of symptom presentation compared to men with back pain and multiple pain relief medications. The models showed good prediction (positive predictive value 70%-80%) in test data but in the general population where prevalence is very low (0.09% of the population in this dataset) the positive predictive value would be very low (0.33%-0.25%). Machine learning can be used to help profile and understand the characteristics of people who will develop AS, and in test datasets with artificially high prevalence, will perform well. However, when applied to a general population with low prevalence rates, such as that in primary care, the positive predictive value for even the best model would be 1.4%. Multiple models may be needed to narrow down the population over time to improve the predictive value and therefore reduce the time to diagnosis of ankylosing spondylitis.
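The drop in positive predictive value described in this abstract follows from Bayes' rule. The sketch below is a minimal illustration, not the authors' model: the sensitivity and specificity of 0.80 are hypothetical values chosen for the example (the abstract does not report them), and only the 0.09% prevalence comes from the abstract.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              prevalence: float) -> float:
    """Return P(disease | positive test) using Bayes' rule."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical classifier performance (assumed, not taken from the study).
SENSITIVITY = 0.80
SPECIFICITY = 0.80

# Balanced test set: cases and controls in equal numbers.
ppv_balanced = positive_predictive_value(SENSITIVITY, SPECIFICITY, 0.5)

# General population at the prevalence quoted in the abstract (0.09%).
ppv_population = positive_predictive_value(SENSITIVITY, SPECIFICITY, 0.0009)

print(f"PPV in balanced test data: {ppv_balanced:.1%}")
print(f"PPV at 0.09% prevalence:   {ppv_population:.2%}")
```

With the same assumed sensitivity and specificity, the PPV falls from a high value in case-control test data to well under 1% at population prevalence, which is the effect the abstract reports.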

Abbas J, Yousef M, Peled N, Hershkovitz I, Hamoud K. Predictive factors for degenerative lumbar spinal stenosis: a model obtained from a machine learning algorithm technique. BMC Musculoskelet Disord 2023;24:218. [PMID: 36949452 PMCID: PMC10035245 DOI: 10.1186/s12891-023-06330-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Accepted: 03/16/2023] [Indexed: 03/24/2023] Open

Mehta B, Goodman S, DiCarlo E, Jannat-Khah D, Gibbons JAB, Otero M, Donlin L, Pannellini T, Robinson WH, Sculco P, Figgie M, Rodriguez J, Kirschmann JM, Thompson J, Slater D, Frezza D, Xu Z, Wang F, Orange DE. Machine learning identification of thresholds to discriminate osteoarthritis and rheumatoid arthritis synovial inflammation. Arthritis Res Ther 2023;25:31. [PMID: 36864474 PMCID: PMC9979511 DOI: 10.1186/s13075-023-03008-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 11/17/2022] [Accepted: 02/06/2023] [Indexed: 03/04/2023] Open

Zheng HW, Ranganath VK, Perry LC, Chetrit DA, Criner KM, Pham AQ, Seto R, Vangala S, Elashoff DA, Bui AA. Evaluation of an automated phenotyping algorithm for rheumatoid arthritis. J Biomed Inform 2022;135:104214. [DOI: 10.1016/j.jbi.2022.104214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2022] [Revised: 09/24/2022] [Accepted: 09/26/2022] [Indexed: 11/16/2022]

Tarakci F, Ozkan IA, Yilmaz S, Tezcan D. Diagnosing rheumatoid arthritis disease using fuzzy expert system and machine learning techniques. Journal of Intelligent & Fuzzy Systems 2022. [DOI: 10.3233/jifs-221582] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]

Abstract

Rheumatoid Arthritis (RA) is a very common autoimmune disease that causes significant morbidity and mortality, and therefore early diagnosis and treatment are important. Early diagnosis of RA and knowing the severity of the disease are very important for the treatment to be applied. The diagnosis of RA usually requires a physical examination, laboratory tests, and a review of the patient’s medical history. In this study, the diagnosis of RA was made with two different methods using a fuzzy expert system (FES) and machine learning (ML) techniques, which were designed and implemented with the help of a specialist in the field, and the results were compared. For this purpose, blood counts were taken from 286 people, including 91 men and 195 women from various age groups. In the first method, an FES structure that determines the severity of RA disease has been established from blood count using the laboratory test results of CRP, ESR, RF, and ANA. The FES result that determines RA disease severity, the Anti-CCP level that is used to distinguish RA disease, and the patient’s medical history were used to design the Decision Support System (DSS) that diagnoses RA disease. The DSS is web-based and publicly accessible. In the second method, RA disease was diagnosed using kNN, SVM, LR, DT, NB, and MLP algorithms, which are widely used in machine learning. To examine the effect of the patient’s history on RA disease diagnosis, two different models were used in machine learning techniques, one with and one without the patient’s history. The results of the fuzzy-based DSS were also compared with the diagnoses made by the specialist and the diagnoses made according to the 2010 ACR / EULAR RA classification criteria. The performed DSS has achieved a diagnostic success rate of 94.05% on 286 patients. In the study of machine learning techniques, the highest success rate was achieved with the LR model. While the success rate of the model was 91.25% with only blood count data, the success rate was 97.90% with the addition of the patient’s history. In addition to the high success rate, the results show that the patient’s history is important in diagnosing RA disease.

Momtazmanesh S, Nowroozi A, Rezaei N. Artificial Intelligence in Rheumatoid Arthritis: Current Status and Future Perspectives: A State-of-the-Art Review. Rheumatol Ther 2022;9:1249-1304. [PMID: 35849321 PMCID: PMC9510088 DOI: 10.1007/s40744-022-00475-4] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 06/01/2022] [Accepted: 06/24/2022] [Indexed: 11/23/2022] Open

Abstract

Investigation of the potential applications of artificial intelligence (AI), including machine learning (ML) and deep learning (DL) techniques, is an exponentially growing field in medicine and healthcare. These methods can be critical in providing high-quality care to patients with chronic rheumatological diseases lacking an optimal treatment, like rheumatoid arthritis (RA), which is the second most prevalent autoimmune disease. Herein, following reviewing the basic concepts of AI, we summarize the advances in its applications in RA clinical practice and research. We provide directions for future investigations in this field after reviewing the current knowledge gaps and technical and ethical challenges in applying AI. Automated models have been largely used to improve RA diagnosis since the early 2000s, and they have used a wide variety of techniques, e.g., support vector machine, random forest, and artificial neural networks. AI algorithms can facilitate screening and identification of susceptible groups, diagnosis using omics, imaging, clinical, and sensor data, patient detection within electronic health record (EHR), i.e., phenotyping, treatment response assessment, monitoring disease course, determining prognosis, novel drug discovery, and enhancing basic science research. They can also aid in risk assessment for incidence of comorbidities, e.g., cardiovascular diseases, in patients with RA. However, the proposed models may vary significantly in their performance and reliability. Despite the promising results achieved by AI models in enhancing early diagnosis and management of patients with RA, they are not fully ready to be incorporated into clinical practice. Future investigations are required to ensure development of reliable and generalizable algorithms while they carefully look for any potential source of bias or misconduct. 
We showed that a growing body of evidence supports the potential role of AI in revolutionizing screening, diagnosis, and management of patients with RA. However, multiple obstacles hinder clinical applications of AI models. Incorporating the machine and/or deep learning algorithms into real-world settings would be a key step in the progress of AI in medicine.

Kaplan AD, Greene JD, Liu VX, Ray P. Unsupervised probabilistic models for sequential Electronic Health Records. J Biomed Inform 2022;134:104163. [PMID: 36038064 PMCID: PMC10588733 DOI: 10.1016/j.jbi.2022.104163] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2022] [Revised: 06/23/2022] [Accepted: 08/11/2022] [Indexed: 11/18/2022]

Freda PJ, Kranzler HR, Moore JH. Novel digital approaches to the assessment of problematic opioid use. BioData Min 2022;15:14. [PMID: 35840990 PMCID: PMC9284824 DOI: 10.1186/s13040-022-00301-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2021] [Accepted: 06/30/2022] [Indexed: 11/16/2022] Open

De Cock D, Myasoedova E, Aletaha D, Studenic P. Big data analyses and individual health profiling in the arena of rheumatic and musculoskeletal diseases (RMDs). Ther Adv Musculoskelet Dis 2022;14:1759720X221105978. [PMID: 35794905 PMCID: PMC9251966 DOI: 10.1177/1759720x221105978] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/12/2022] [Accepted: 05/22/2022] [Indexed: 11/17/2022] Open

Duong SQ, Crowson CS, Athreya A, Atkinson EJ, Davis JM, Warrington KJ, Matteson EL, Weinshilboum R, Wang L, Myasoedova E. Clinical predictors of response to methotrexate in patients with rheumatoid arthritis: a machine learning approach using clinical trial data. Arthritis Res Ther 2022;24:162. [PMID: 35778714 PMCID: PMC9248180 DOI: 10.1186/s13075-022-02851-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 03/15/2022] [Accepted: 06/18/2022] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Methotrexate is the preferred initial disease-modifying antirheumatic drug (DMARD) for rheumatoid arthritis (RA). However, clinically useful tools for individualized prediction of response to methotrexate treatment in patients with RA are lacking. We aimed to identify clinical predictors of response to methotrexate in patients with rheumatoid arthritis (RA) using machine learning methods.

METHODS

Randomized clinical trials (RCT) of patients with RA who were DMARD-naïve and randomized to placebo plus methotrexate were identified and accessed through the Clinical Study Data Request Consortium and Vivli Center for Global Clinical Research Data. Studies with available Disease Activity Score with 28-joint count and erythrocyte sedimentation rate (DAS28-ESR) at baseline and 12 and 24 weeks were included. Latent class modeling of methotrexate response was performed. The least absolute shrinkage and selection operator (LASSO) and random forests methods were used to identify predictors of response.

RESULTS

A total of 775 patients from 4 RCTs were included (mean age 50 years, 80% female). Two distinct classes of patients were identified based on DAS28-ESR change over 24 weeks: "good responders" and "poor responders." Baseline DAS28-ESR, anti-citrullinated protein antibody (ACPA), and Health Assessment Questionnaire (HAQ) score were the top predictors of good response using LASSO (area under the curve [AUC] 0.79) and random forests (AUC 0.68) in the external validation set. DAS28-ESR ≤ 7.4, ACPA positive, and HAQ ≤ 2 provided the highest likelihood of response. Among patients with 12-week DAS28-ESR > 3.2, ≥ 1 point improvement in DAS28-ESR baseline-to-12-week was predictive of achieving DAS28-ESR ≤ 3.2 at 24 weeks.

CONCLUSIONS

We have developed and externally validated a prediction model for response to methotrexate within 24 weeks in DMARD-naïve patients with RA, providing variably weighted clinical features and defined cutoffs for clinical decision-making.

Kaplan AD, Tipnis U, Beckham JC, Kimbrel NA, Oslin DW, McMahon BH. Continuous-Time Probabilistic Models for Longitudinal Electronic Health Records. J Biomed Inform 2022;130:104084. [DOI: 10.1016/j.jbi.2022.104084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2021] [Revised: 03/18/2022] [Accepted: 04/25/2022] [Indexed: 10/18/2022]

Predicting Hospital Readmission for Campylobacteriosis from Electronic Health Records: A Machine Learning and Text Mining Perspective. J Pers Med 2022;12:jpm12010086. [PMID: 35055401 PMCID: PMC8779953 DOI: 10.3390/jpm12010086] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/03/2021] [Revised: 12/09/2021] [Accepted: 12/14/2021] [Indexed: 02/04/2023] Open

Davids J, Ashrafian H. AIM and mHealth, Smartphones and Apps. Artif Intell Med 2022. [DOI: 10.1007/978-3-030-64573-1_242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]

AIM in Rheumatology. Artif Intell Med 2022. [DOI: 10.1007/978-3-030-64573-1_179] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]

Kedra J, Davergne T, Braithwaite B, Servy H, Gossec L. Machine learning approaches to improve disease management of patients with rheumatoid arthritis: review and future directions. Expert Rev Clin Immunol 2021;17:1311-1321. [PMID: 34890271 DOI: 10.1080/1744666x.2022.2017773] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 12/24/2022]

Rehberg M, Giegerich C, Praestgaard A, van Hoogstraten H, Iglesias-Rodriguez M, Curtis JR, Gottenberg JE, Schwarting A, Castañeda S, Rubbert-Roth A, Choy EHS. Identification of a Rule to Predict Response to Sarilumab in Patients with Rheumatoid Arthritis Using Machine Learning and Clinical Trial Data. Rheumatol Ther 2021;8:1661-1675. [PMID: 34519964 PMCID: PMC8572308 DOI: 10.1007/s40744-021-00361-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/25/2021] [Accepted: 08/11/2021] [Indexed: 11/22/2022] Open

Abstract

INTRODUCTION

In rheumatoid arthritis, time spent using ineffective medications may lead to irreversible disease progression. Despite availability of targeted treatments, only a minority of patients achieve sustained remission, and little evidence exists to direct the choice of biologic disease-modifying antirheumatic drugs in individual patients. Machine learning was used to identify a rule to predict the response to sarilumab and discriminate between responses to sarilumab versus adalimumab, with a focus on clinically feasible blood biomarkers.

METHODS

The decision tree model GUIDE was trained using a data subset from the sarilumab trial with the most biomarker data, MOBILITY, to identify a rule to predict disease activity after sarilumab 200 mg. The training set comprised 18 categorical and 24 continuous baseline variables; some data were omitted from training and used for validation by the algorithm (cross-validation). The rule was tested using full datasets from four trials (MOBILITY, MONARCH, TARGET, and ASCERTAIN), focusing on the recommended sarilumab dose of 200 mg.

RESULTS

In the training set, the presence of anti-cyclic citrullinated peptide antibodies, combined with C-reactive protein > 12.3 mg/l, was identified as the "rule" that predicts American College of Rheumatology 20% response (ACR20) to sarilumab. In testing, the rule reliably predicted response to sarilumab in MOBILITY, MONARCH, and ASCERTAIN for many efficacy parameters (e.g., ACR70 and the 28-joint disease activity score using CRP [DAS28-CRP] remission). The rule applied less to TARGET, which recruited individuals refractory to tumor necrosis factor inhibitors. The potential clinical benefit of the rule was highlighted in a clinical scenario based on MONARCH data, which found that increased ACR70 rates could be achieved by treating either rule-positive patients with sarilumab or rule-negative patients with adalimumab.

CONCLUSIONS

Well-established and clinically feasible blood biomarkers can guide individual treatment choice. Real-world validation of the rule identified in this post hoc analysis is merited.

CLINICAL TRIAL REGISTRATION

NCT01061736, NCT02332590, NCT01709578, NCT01768572.
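The "rule" reported in the results of this abstract can be written as a single boolean check. The sketch below only restates the two conditions given in the abstract (anti-CCP antibodies present and C-reactive protein above 12.3 mg/l). It is not the GUIDE decision tree itself, and the `Baseline` type and its field names are hypothetical, introduced for illustration.

```python
from dataclasses import dataclass

# Threshold quoted in the abstract; strictly greater than, per "> 12.3 mg/l".
CRP_THRESHOLD_MG_PER_L = 12.3


@dataclass(frozen=True)
class Baseline:
    """Hypothetical container for the two baseline biomarkers in the rule."""
    anti_ccp_positive: bool
    crp_mg_per_l: float


def is_rule_positive(baseline: Baseline) -> bool:
    """True when both conditions of the reported rule hold."""
    return (baseline.anti_ccp_positive
            and baseline.crp_mg_per_l > CRP_THRESHOLD_MG_PER_L)


# Usage: a patient with anti-CCP antibodies and CRP of 15.0 mg/l.
print(is_rule_positive(Baseline(anti_ccp_positive=True, crp_mg_per_l=15.0)))
```

Keeping the threshold as a named constant makes it easy to see that a CRP of exactly 12.3 mg/l does not satisfy the rule as stated.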

Affiliation(s)

Markus Rehberg: Sanofi, Frankfurt, Germany
Clemens Giegerich: Sanofi, Frankfurt, Germany
Amy Praestgaard: Sanofi, Cambridge, MA USA
Hubert van Hoogstraten: Sanofi, Cambridge, MA USA
Melitza Iglesias-Rodriguez: Sanofi, Cambridge, MA USA
Jeffrey R. Curtis: Division of Clinical Immunology and Rheumatology, University of Alabama at Birmingham, Birmingham, AL USA
Jacques-Eric Gottenberg: Strasbourg University Hospital, Strasbourg, France
Andreas Schwarting: Acura Kliniken Rheinland-Pfalz AG, Bad Kreuznach, Germany; University Center of Autoimmunity, University Medical Center Mainz, Mainz, Germany
Santos Castañeda: Rheumatology Division, Hospital Universitario de La Princesa, IIS-IP and EPID-Future Cátedra, Autónoma University of Madrid (UAM), Madrid, Spain
Andrea Rubbert-Roth: Kantonsspital St Gallen, St Gallen, Switzerland
Ernest H. S. Choy: Section of Rheumatology and Translational Research, Division of Infection and Immunity, Arthritis Research UK CREATE Centre and Welsh Arthritis Research Network (WARN), Cardiff University School of Medicine, Tenovus Building, Heath Park Campus, Cardiff, CF14 4XN UK
The MOBILITY, MONARCH, TARGET, and ASCERTAIN investigators: the same affiliations as the authors listed above

Machine Learning in Rheumatic Diseases. Clin Rev Allergy Immunol 2021;60:96-110. [PMID: 32681407 DOI: 10.1007/s12016-020-08805-6] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 12/17/2022]

Wang JX, Somani S, Chen JH, Murray S, Sarkar U. Health Equity in Artificial Intelligence and Primary Care Research: Protocol for a Scoping Review. JMIR Res Protoc 2021;10:e27799. [PMID: 34533458 PMCID: PMC8486995 DOI: 10.2196/27799] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/07/2021] [Revised: 05/09/2021] [Accepted: 06/16/2021] [Indexed: 11/13/2022] Open

Abstract

BACKGROUND

Though artificial intelligence (AI) has the potential to augment the patient-physician relationship in primary care, bias in intelligent health care systems has the potential to differentially impact vulnerable patient populations.

OBJECTIVE

The purpose of this scoping review is to summarize the extent to which AI systems in primary care examine the inherent bias toward or against vulnerable populations and appraise how these systems have mitigated the impact of such biases during their development.

METHODS

We will conduct a search update from an existing scoping review to identify studies on AI and primary care in the following databases: Medline-OVID, Embase, CINAHL, Cochrane Library, Web of Science, Scopus, IEEE Xplore, ACM Digital Library, MathSciNet, AAAI, and arXiv. Two screeners will independently review all abstracts, titles, and full-text articles. The team will extract data using a structured data extraction form and synthesize the results in accordance with PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) guidelines.

RESULTS

This review will provide an assessment of the current state of health care equity within AI for primary care. Specifically, we will identify the degree to which vulnerable patients have been included, assess how bias is interpreted and documented, and understand the extent to which harmful biases are addressed. As of October 2020, the scoping review is in the title- and abstract-screening stage. The results are expected to be submitted for publication in fall 2021.

CONCLUSIONS

AI applications in primary care are becoming an increasingly common tool in health care delivery and in preventative care efforts for underserved populations. This scoping review would potentially show the extent to which studies on AI in primary care employ a health equity lens and take steps to mitigate bias.

INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID)

PRR1-10.2196/27799.

Abbasgholizadeh Rahimi S, Légaré F, Sharma G, Archambault P, Zomahoun HTV, Chandavong S, Rheault N, T Wong S, Langlois L, Couturier Y, Salmeron JL, Gagnon MP, Légaré J. Application of Artificial Intelligence in Community-Based Primary Health Care: Systematic Scoping Review and Critical Appraisal. J Med Internet Res 2021;23:e29839. [PMID: 34477556 PMCID: PMC8449300 DOI: 10.2196/29839] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 04/23/2021] [Revised: 05/29/2021] [Accepted: 05/31/2021] [Indexed: 12/27/2022] Open

Abstract

BACKGROUND

Research on the integration of artificial intelligence (AI) into community-based primary health care (CBPHC) has highlighted several advantages and disadvantages in practice regarding, for example, facilitating diagnosis and disease management, as well as doubts concerning the unintended harmful effects of this integration. However, there is a lack of evidence about a comprehensive knowledge synthesis that could shed light on AI systems tested or implemented in CBPHC.

OBJECTIVE

We intended to identify and evaluate published studies that have tested or implemented AI in CBPHC settings.

METHODS

We conducted a systematic scoping review informed by an earlier study and the Joanna Briggs Institute (JBI) scoping review framework and reported the findings according to PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analysis-Scoping Reviews) reporting guidelines. An information specialist performed a comprehensive search from the date of inception until February 2020, in seven bibliographic databases: Cochrane Library, MEDLINE, EMBASE, Web of Science, Cumulative Index to Nursing and Allied Health Literature (CINAHL), ScienceDirect, and IEEE Xplore. The selected studies considered all populations who provide and receive care in CBPHC settings, AI interventions that had been implemented, tested, or both, and assessed outcomes related to patients, health care providers, or CBPHC systems. Risk of bias was assessed using the Prediction Model Risk of Bias Assessment Tool (PROBAST). Two authors independently screened the titles and abstracts of the identified records, read the selected full texts, and extracted data from the included studies using a validated extraction form. Disagreements were resolved by consensus, and if this was not possible, the opinion of a third reviewer was sought. A third reviewer also validated all the extracted data.

RESULTS

We retrieved 22,113 documents. After the removal of duplicates, 16,870 documents were screened, and 90 peer-reviewed publications met our inclusion criteria. Machine learning (ML) (41/90, 45%), natural language processing (NLP) (24/90, 27%), and expert systems (17/90, 19%) were the most commonly studied AI interventions. These were primarily implemented for diagnosis, detection, or surveillance purposes. Neural networks (ie, convolutional neural networks and abductive networks) demonstrated the highest accuracy, considering the given database for the given clinical task. The risk of bias in diagnosis or prognosis studies was the lowest in the participant category (4/49, 4%) and the highest in the outcome category (22/49, 45%).

CONCLUSIONS

We observed variabilities in reporting the participants, types of AI methods, analyses, and outcomes, and highlighted the large gap in the effective development and implementation of AI in CBPHC. Further studies are needed to efficiently guide the development and implementation of AI interventions in CBPHC settings.

Lee S, Doktorchik C, Martin EA, D'Souza AG, Eastwood C, Shaheen AA, Naugler C, Lee J, Quan H. Electronic Medical Record-Based Case Phenotyping for the Charlson Conditions: Scoping Review. JMIR Med Inform 2021;9:e23934. [PMID: 33522976 PMCID: PMC7884219 DOI: 10.2196/23934] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 08/28/2020] [Revised: 11/20/2020] [Accepted: 12/05/2020] [Indexed: 12/16/2022]

Abstract

Background

Electronic medical records (EMRs) contain large amounts of rich clinical information. Developing EMR-based case definitions, also known as EMR phenotyping, is an active area of research that has implications for epidemiology, clinical care, and health services research.

Objective

This review aims to describe and assess the present landscape of EMR-based case phenotyping for the Charlson conditions.

Methods

A scoping review of EMR-based algorithms for defining the Charlson comorbidity index conditions was completed. This study covered articles published between January 2000 and April 2020, both inclusive. Embase (Excerpta Medica database) and MEDLINE (Medical Literature Analysis and Retrieval System Online) were searched using keywords developed in the following 3 domains: terms related to EMR, terms related to case finding, and disease-specific terms. The manuscript follows the Preferred Reporting Items for Systematic reviews and Meta-analyses extension for Scoping Reviews (PRISMA-ScR) guidelines.

Results

A total of 274 articles representing 299 algorithms were assessed and summarized. Most studies were undertaken in the United States (181/299, 60.5%), followed by the United Kingdom (42/299, 14.0%) and Canada (15/299, 5.0%). These algorithms were mostly developed either in primary care (103/299, 34.4%) or inpatient (168/299, 56.2%) settings. Diabetes, congestive heart failure, myocardial infarction, and rheumatology had the highest number of developed algorithms. Data-driven and clinical rule–based approaches have been identified. EMR-based phenotype and algorithm development reflect the data access allowed by respective health systems, and algorithms vary in their performance.

Conclusions

Recognizing similarities and differences in health systems, data collection strategies, extraction, data release protocols, and existing clinical pathways is critical to algorithm development strategies. Several strategies to assist with phenotype-based case definitions have been proposed.

Affiliation(s)

Seungwon Lee: Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Health Services, Calgary, AB, Canada; Data Intelligence for Health Lab, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
Chelsea Doktorchik: Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
Elliot Asher Martin: Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Health Services, Calgary, AB, Canada
Adam Giles D'Souza: Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Health Services, Calgary, AB, Canada
Cathy Eastwood: Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
Abdel Aziz Shaheen: Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Medicine, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
Christopher Naugler: Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Pathology and Laboratory Medicine, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
Joon Lee: Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Data Intelligence for Health Lab, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Cardiac Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
Hude Quan: Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada

AIM in Rheumatology. Artif Intell Med 2021. [DOI: 10.1007/978-3-030-58080-3_179-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]

Davids J, Ashrafian H. AIM and mHealth, Smartphones and Apps. Artif Intell Med 2021. [DOI: 10.1007/978-3-030-58080-3_242-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]

Maksabedian Hernandez EJ, Tingzon I, Ampil L, Tiu J. Identifying chronic disease patients using predictive algorithms in pharmacy administrative claims: an application in rheumatoid arthritis. J Med Econ 2021;24:1272-1279. [PMID: 34704871 DOI: 10.1080/13696998.2021.1999132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]

Abstract

OBJECTIVE

To evaluate the predictive performance of logistic and linear regression versus machine learning (ML) algorithms to identify patients with rheumatoid arthritis (RA) treated with target immunomodulators (TIMs) using only pharmacy administrative claims.

METHODS

Adults aged 18-64 years with ≥1 TIM claim in the IBM MarketScan commercial database were included in this retrospective analysis. The predictive ability of logistic regression to identify RA patients was compared with supervised ML classification algorithms including random forest (RF), decision trees, linear support vector machines (SVMs), neural networks, naïve Bayes classifier, linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and K-nearest neighbors (k-NN). Model performance was evaluated using F1 score, accuracy, precision, sensitivity, area under the receiver operating characteristic curve (AUROC), and Matthews correlation coefficient (MCC). Analyses were conducted in all-patient and etanercept-only samples.
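
As a rough illustration of the evaluation measures named above, the following Python sketch derives accuracy, precision, sensitivity, F1 score, and MCC from the four cells of a binary confusion matrix. The function name and the counts are made up for illustration and are not taken from the study; AUROC is left out because it needs predicted scores rather than counts.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)            # share of predicted positives that are true
    sensitivity = tp / (tp + fn)          # share of actual positives that are found
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Matthews correlation coefficient: uses all four cells of the matrix.
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denominator
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to exercise the formulas.
metrics = classification_metrics(tp=80, fp=20, tn=70, fn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The sketch assumes every denominator is non-zero; a production version would need to handle degenerate cases such as a classifier that never predicts the positive class.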

RESULTS

In the all-patients sample, ML approaches did not outperform logistic regression. RF showed small improvements over logistic regression that were not considered remarkable: F1 score (84.55% vs 83.96%), accuracy (84.05% vs 83.79%), sensitivity (84.53% vs 82.20%), AUROC (84.04% vs 83.85%), and MCC (68.07% vs 67.66%). Findings were similar in the etanercept samples.

CONCLUSION

Logistic regression and ML approaches successfully identified patients with RA in a large pharmacy administrative claims database. The ML algorithms were no better than logistic regression at prediction. RF, SVMs, LDA, and ridge classifier showed comparable performance, while neural networks, decision trees, naïve Bayes classifier, and QDA underperformed compared with logistic regression in identifying patients with RA.

Luo YF, Henry S, Wang Y, Shen F, Uzuner O, Rumshisky A. The 2019 National Natural language processing (NLP) Clinical Challenges (n2c2)/Open Health NLP (OHNLP) shared task on clinical concept normalization for clinical records. J Am Med Inform Assoc 2020;27:1529-1537. [PMID: 32968800 PMCID: PMC7647359 DOI: 10.1093/jamia/ocaa106] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 02/10/2020] [Revised: 05/01/2020] [Accepted: 05/14/2020] [Indexed: 01/19/2023]

Jamthikar AD, Gupta D, Puvvula A, Johri AM, Khanna NN, Saba L, Mavrogeni S, Laird JR, Pareek G, Miner M, Sfikakis PP, Protogerou A, Kitas GD, Kolluri R, Sharma AM, Viswanathan V, Rathore VS, Suri JS. Cardiovascular risk assessment in patients with rheumatoid arthritis using carotid ultrasound B-mode imaging. Rheumatol Int 2020;40:1921-1939. [PMID: 32857281 PMCID: PMC7453675 DOI: 10.1007/s00296-020-04691-5] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 08/05/2020] [Accepted: 08/18/2020] [Indexed: 12/18/2022]

Charting the life course: Emerging opportunities to advance scientific approaches using life course research. J Clin Transl Sci 2020;5:e9. [PMID: 33948236 PMCID: PMC8057465 DOI: 10.1017/cts.2020.492] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/18/2022]

Hemingway H, Lyons R, Li Q, Buchan I, Ainsworth J, Pell J, Morris A. A national initiative in data science for health: an evaluation of the UK Farr Institute. Int J Popul Data Sci 2020;5:1128. [PMID: 32935051 PMCID: PMC7480324 DOI: 10.23889/ijpds.v5i1.1128] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 02/05/2023]

Abstract

OBJECTIVE

To evaluate the extent to which the inter-institutional, inter-disciplinary mobilisation of data and skills in the Farr Institute contributed to establishing the emerging field of data science for health in the UK.

DESIGN AND OUTCOME MEASURES

We evaluated evidence of six domains characterising a new field of science: (1) defining central scientific challenges, (2) demonstrating how the central challenges might be solved, (3) creating novel interactions among groups of scientists, (4) training new types of experts, (5) re-organising universities, and (6) demonstrating impacts in society. We carried out citation, network and time trend analyses of publications, and a narrative review of infrastructure, methods and tools.

SETTING

Four UK centres in London, North England, Scotland and Wales (23 university partners), 2013-2018.

RESULTS

1. The Farr Institute helped define a central scientific challenge by publishing a research corpus, demonstrating insights from electronic health record (EHR) and administrative data at each stage of the translational cycle in 593 papers with at least one Farr Institute author affiliation on PubMed.
2. The Farr Institute offered some demonstrations of how these scientific challenges might be solved: it established the first four ISO27001 certified trusted research environments in the UK, approved more than 1000 research users, and published on 102 unique EHR and administrative data sources, although there was no clear evidence of an increase in novel, sustained record linkages. The Farr Institute established open platforms for the EHR phenotyping algorithms and validations (>70 diseases, CALIBER). Sample sizes showed some evidence of increase but remained less than 10% of the UK population in primary care-hospital care linked studies.
3. The Farr Institute created novel interactions among researchers: the co-author publication network expanded from 944 unique co-authors (based on 67 publications in the first 30 months) to 3839 unique co-authors (545 papers in the final 30 months).
4. Training expanded substantially, with 3 new masters courses and training of >400 people at masters, short-course and leadership level and 48 PhD students.
5. Universities reorganised, with 4/5 Centres establishing 27 new faculty (tenured) positions and 3 new university institutes.
6. Emerging evidence of impacts included >3200 citations for the 10 most cited papers, and Farr research informed eight practice-changing clinical guidelines and policies relevant to the health of millions of UK citizens.

CONCLUSION

The Farr Institute played a major role in establishing and growing the field of data science for health in the UK, with some initial evidence of benefits for health and healthcare. The Farr Institute has now expanded into Health Data Research (HDR) UK, but key challenges remain, including how to network such activities internationally.

Li R, Chen Y, Ritchie MD, Moore JH. Electronic health records and polygenic risk scores for predicting disease risk. Nat Rev Genet 2020;21:493-502. [PMID: 32235907 DOI: 10.1038/s41576-020-0224-1] [Citation(s) in RCA: 53] [Impact Index Per Article: 13.3] [Accepted: 03/02/2020] [Indexed: 01/03/2023]

Stafford IS, Kellermann M, Mossotto E, Beattie RM, MacArthur BD, Ennis S. A systematic review of the applications of artificial intelligence and machine learning in autoimmune diseases. NPJ Digit Med 2020;3:30. [PMID: 32195365 PMCID: PMC7062883 DOI: 10.1038/s41746-020-0229-3] [Citation(s) in RCA: 102] [Impact Index Per Article: 25.5] [Received: 08/12/2019] [Accepted: 01/17/2020] [Indexed: 02/07/2023]

Hügle M, Omoumi P, van Laar JM, Boedecker J, Hügle T. Applied machine learning and artificial intelligence in rheumatology. Rheumatol Adv Pract 2020;4:rkaa005. [PMID: 32296743 PMCID: PMC7151725 DOI: 10.1093/rap/rkaa005] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 08/26/2019] [Revised: 01/07/2020] [Indexed: 12/28/2022]

Colombo Filho ME, Mello Galliez R, Andrade Bernardi F, de Oliveira LL, Kritski A, Koenigkam Santos M, Alves D. Preliminary Results on Pulmonary Tuberculosis Detection in Chest X-Ray Using Convolutional Neural Networks. Lecture Notes in Computer Science 2020. [PMCID: PMC7303695 DOI: 10.1007/978-3-030-50423-6_42] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022]

McBrien KA, Souri S, Symonds NE, Rouhi A, Lethebe BC, Williamson TS, Garies S, Birtwhistle R, Quan H, Fabreau GE, Ronksley PE. Identification of validated case definitions for medical conditions used in primary care electronic medical record databases: a systematic review. J Am Med Inform Assoc 2019;25:1567-1578. [PMID: 30137498 DOI: 10.1093/jamia/ocy094] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 01/18/2018] [Accepted: 07/02/2018] [Indexed: 01/11/2023]

Abstract

Objectives

Data derived from primary care electronic medical records (EMRs) are being used for research and surveillance. Case definitions are required to identify patients with specific conditions in EMR data with a degree of accuracy. The purpose of this study is to identify and provide a summary of case definitions that have been validated in primary care EMR data.

Materials and Methods

We searched MEDLINE and Embase (from inception to June 2016) to identify studies that describe case definitions for clinical conditions in EMR data and report on the performance metrics of these definitions.

Results

We identified 40 studies reporting on case definitions for 47 unique clinical conditions. The studies used combinations of International Classification of Disease version 9 (ICD-9) codes, Read codes, laboratory values, and medications in their algorithms. The most common validation metric reported was positive predictive value, with inconsistent reporting of sensitivity and specificity.

Discussion

This review describes validated case definitions derived in primary care EMR data, which can be used to understand disease patterns and prevalence among primary care populations. Limitations include incomplete reporting of performance metrics and uncertainty regarding performance of case definitions across different EMR databases and countries.

Conclusion

Our review found a significant number of validated case definitions with good performance for use in primary care EMR data. These could be applied to other EMR databases in similar contexts and may enable better disease surveillance when using clinical EMR data. Consistent reporting across validation studies using EMR data would facilitate comparison across studies.

Systematic review registration

PROSPERO CRD42016040020 (submitted June 8, 2016, and last revised June 14, 2016).

Denaxas S, Gonzalez-Izquierdo A, Direk K, Fitzpatrick NK, Fatemifar G, Banerjee A, Dobson RJB, Howe LJ, Kuan V, Lumbers RT, Pasea L, Patel RS, Shah AD, Hingorani AD, Sudlow C, Hemingway H. UK phenomics platform for developing and validating electronic health record phenotypes: CALIBER. J Am Med Inform Assoc 2019;26:1545-1559. [PMID: 31329239 PMCID: PMC6857510 DOI: 10.1093/jamia/ocz105] [Citation(s) in RCA: 104] [Impact Index Per Article: 20.8] [Received: 01/29/2019] [Revised: 04/25/2019] [Accepted: 05/29/2019] [Indexed: 01/13/2023]

Abstract

OBJECTIVE

Electronic health records (EHRs) are a rich source of information on human diseases, but the information is variably structured, fragmented, curated using different coding systems, and collected for purposes other than medical research. We describe an approach for developing, validating, and sharing reproducible phenotypes from national structured EHR in the United Kingdom with applications for translational research.

MATERIALS AND METHODS

We implemented a rule-based phenotyping framework, with up to 6 approaches of validation. We applied our framework to a sample of 15 million individuals in a national EHR data source (population-based primary care, all ages) linked to hospitalization and death records in England. Data comprised continuous measurements (for example, blood pressure), medication information, and coded diagnoses, symptoms, procedures, and referrals, recorded using 5 controlled clinical terminologies: (1) Read (primary care, subset of SNOMED-CT [Systematized Nomenclature of Medicine Clinical Terms]), (2) International Classification of Diseases, Ninth Revision and Tenth Revision (secondary care diagnoses and cause of mortality), (3) Office of Population Censuses and Surveys Classification of Surgical Operations and Procedures, Fourth Revision (hospital surgical procedures), and (4) DM+D prescription codes.

RESULTS

Using the CALIBER phenotyping framework, we created algorithms for 51 diseases, syndromes, biomarkers, and lifestyle risk factors and provide up to 6 validation approaches. The EHR phenotypes are curated in the open-access CALIBER Portal (https://www.caliberresearch.org/portal) and have been used by 40 national and international research groups in 60 peer-reviewed publications.

CONCLUSIONS

We describe a UK EHR phenomics approach within the CALIBER EHR data platform with initial evidence of validity and use, as an important step toward international use of UK EHR data for health research.

Affiliation(s)

Spiros Denaxas: Institute of Health Informatics, University College London, London, United Kingdom; Health Data Research UK, London, United Kingdom; The Alan Turing Institute, London, United Kingdom; The National Institute for Health Research University College London Hospitals Biomedical Research Centre, University College London, London, United Kingdom; British Heart Foundation Research Accelerator, University College London, London, United Kingdom
Arturo Gonzalez-Izquierdo: Institute of Health Informatics, University College London, London, United Kingdom; Health Data Research UK, London, United Kingdom; The National Institute for Health Research University College London Hospitals Biomedical Research Centre, University College London, London, United Kingdom
Kenan Direk: Institute of Health Informatics, University College London, London, United Kingdom; Health Data Research UK, London, United Kingdom; The National Institute for Health Research University College London Hospitals Biomedical Research Centre, University College London, London, United Kingdom
Natalie K Fitzpatrick: Institute of Health Informatics, University College London, London, United Kingdom; Health Data Research UK, London, United Kingdom
Ghazaleh Fatemifar: Institute of Health Informatics, University College London, London, United Kingdom; Health Data Research UK, London, United Kingdom
Amitava Banerjee: Institute of Health Informatics, University College London, London, United Kingdom; Health Data Research UK, London, United Kingdom; British Heart Foundation Research Accelerator, University College London, London, United Kingdom
Richard J B Dobson: Institute of Health Informatics, University College London, London, United Kingdom; Health Data Research UK, London, United Kingdom; Department of Biostatistics and Health Informatics, Institute of Psychiatry Psychology and Neuroscience, King’s College London, London, United Kingdom; The National Institute for Health Research University College London Hospitals Biomedical Research Centre, University College London, London, United Kingdom; British Heart Foundation Research Accelerator, University College London, London, United Kingdom
Laurence J Howe: Institute of Cardiovascular Science, University College London, London, United Kingdom
Valerie Kuan: Health Data Research UK, London, United Kingdom; Institute of Cardiovascular Science, University College London, London, United Kingdom
R Tom Lumbers: Institute of Health Informatics, University College London, London, United Kingdom; Health Data Research UK, London, United Kingdom; British Heart Foundation Research Accelerator, University College London, London, United Kingdom
Laura Pasea: Institute of Health Informatics, University College London, London, United Kingdom; Health Data Research UK, London, United Kingdom
Riyaz S Patel: Institute of Cardiovascular Science, University College London, London, United Kingdom; British Heart Foundation Research Accelerator, University College London, London, United Kingdom
Anoop D Shah: Institute of Health Informatics, University College London, London, United Kingdom; Health Data Research UK, London, United Kingdom; British Heart Foundation Research Accelerator, University College London, London, United Kingdom
Aroon D Hingorani: Health Data Research UK, London, United Kingdom; Institute of Cardiovascular Science, University College London, London, United Kingdom
Cathie Sudlow: Centre for Medical Informatics, Usher Institute of Population Health Science and Informatics, University of Edinburgh, Edinburgh, United Kingdom; Health Data Research UK, Scotland, United Kingdom
Harry Hemingway: Institute of Health Informatics, University College London, London, United Kingdom; Health Data Research UK, London, United Kingdom; The National Institute for Health Research University College London Hospitals Biomedical Research Centre, University College London, London, United Kingdom; British Heart Foundation Research Accelerator, University College London, London, United Kingdom

A Review of Automatic Phenotyping Approaches using Electronic Health Records. Electronics 2019. [DOI: 10.3390/electronics8111235] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Indexed: 12/12/2022]

Nøhr C, Kuziemsky CE, Elkin PL, Marcilly R, Pelayo S. Sustainable Health Informatics: Health Informaticians as Alchemists. Stud Health Technol Inform 2019;265:3-11. [PMID: 31431570 PMCID: PMC7323624 DOI: 10.3233/shti190129] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 01/14/2023]

Walsh JA, Rozycki M, Yi E, Park Y. Application of machine learning in the diagnosis of axial spondyloarthritis. Curr Opin Rheumatol 2019;31:362-367. [PMID: 31033569 PMCID: PMC6553337 DOI: 10.1097/bor.0000000000000612] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 12/15/2022]

Guinn D, Wilhelm EE, Lieberman G, Khozin S. Assessing function of electronic health records for real-world data generation. BMJ Evid Based Med 2019;24:95-98. [PMID: 30478146 DOI: 10.1136/bmjebm-2018-111111] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Accepted: 11/08/2018] [Indexed: 01/08/2023]

Talaei-Khoei A, Wilson JM, Kazemi SF. Period of Measurement in Time-Series Predictions of Disease Counts from 2007 to 2017 in Northern Nevada: Analytics Experiment. JMIR Public Health Surveill 2019;5:e11357. [PMID: 30664479 PMCID: PMC6350093 DOI: 10.2196/11357] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 06/20/2018] [Revised: 10/23/2018] [Accepted: 10/30/2018] [Indexed: 12/28/2022]

McMillan B, Eastham R, Brown B, Fitton R, Dickinson D. Primary Care Patient Records in the United Kingdom: Past, Present, and Future Research Priorities. J Med Internet Res 2018;20:e11293. [PMID: 30567695 PMCID: PMC6315263 DOI: 10.2196/11293] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 06/27/2018] [Accepted: 09/04/2018] [Indexed: 12/25/2022]

Wang H, Cui Z, Chen Y, Avidan M, Abdallah AB, Kronzer A. Predicting Hospital Readmission via Cost-Sensitive Deep Learning. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018;15:1968-1978. [PMID: 29993930 DOI: 10.1109/tcbb.2018.2827029] [Citation(s) in RCA: 75] [Impact Index Per Article: 12.5] [Indexed: 06/08/2023]

Abstract

With increased use of electronic medical records (EMRs), data mining on medical data has great potential to improve the quality of hospital treatment and increase the survival rate of patients. Early readmission prediction enables early intervention, which is essential to preventing serious or life-threatening events, and acts as a substantial contributor to reducing healthcare costs. Existing works on predicting readmission often focus on certain vital signs and diseases by extracting statistical features. They also fail to consider skewness of class labels in medical data and different costs of misclassification errors. In this paper, we draw on the merits of convolutional neural networks (CNNs) to automatically learn features from time series of vital signs, and on categorical feature embedding to effectively encode feature vectors with heterogeneous clinical features, such as demographics, hospitalization history, vital signs, and laboratory tests. Then, both the features learnt via the CNN and the statistical features obtained via feature embedding are fed into a multilayer perceptron (MLP) for prediction. We use a cost-sensitive formulation to train the MLP used for prediction, in order to tackle the imbalance and skewness challenge. We validate the proposed approach on two real medical datasets from Barnes-Jewish Hospital; all data are taken from historical EMR databases and reflect the kinds of data that would realistically be available to a clinical prediction system in hospitals. We find that early prediction of readmission is possible and that, when compared with state-of-the-art existing methods used by hospitals, our methods perform significantly better. For example, using the general hospital wards data for 30-day readmission prediction, the area under the curve (AUC) for the proposed model was 0.70, significantly higher than all the baseline methods. Based on these results, a system is being deployed in hospital settings with the proposed forecasting algorithms to support treatment.
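
The abstract does not spell out its cost-sensitive formulation. One common way to make a classifier cost-sensitive is to weight the log loss by class, so that a missed positive costs more than a false alarm. The Python sketch below shows that generic idea only; the function name, the cost values, and the toy labels are assumptions for illustration and are not the authors' method.

```python
import math


def weighted_log_loss(y_true, y_prob, cost_fn=1.0, cost_fp=1.0):
    """Class-weighted binary cross-entropy.

    cost_fn scales the penalty on positive cases (missed readmissions),
    cost_fp scales the penalty on negative cases (false alarms).
    """
    eps = 1e-12  # keep probabilities away from 0 and 1 so log() is defined
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        if y == 1:
            total += -cost_fn * math.log(p)
        else:
            total += -cost_fp * math.log(1 - p)
    return total / len(y_true)


# Toy imbalanced example: one positive among four cases, predicted poorly.
y_true = [1, 0, 0, 0]
y_prob = [0.2, 0.1, 0.1, 0.1]

plain = weighted_log_loss(y_true, y_prob)                  # equal costs
weighted = weighted_log_loss(y_true, y_prob, cost_fn=5.0)  # missed positives cost 5x
print(f"unweighted loss: {plain:.4f}, cost-sensitive loss: {weighted:.4f}")
```

With equal costs the poorly predicted positive case is diluted by the three easy negatives; raising `cost_fn` makes that single error dominate the loss, which is the pressure a cost-sensitive objective puts on training under class imbalance.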

Ritchie MD. Large-Scale Analysis of Genetic and Clinical Patient Data. Annu Rev Biomed Data Sci 2018. [DOI: 10.1146/annurev-biodatasci-080917-013508] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 11/09/2022]

Elkin PL, Schlegel DR, Anderson M, Komm J, Ficheur G, Bisson L. Artificial Intelligence: Bayesian versus Heuristic Method for Diagnostic Decision Support. Appl Clin Inform 2018;9:432-439. [PMID: 29898469 DOI: 10.1055/s-0038-1656547] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 12/27/2022]

Deans C, Griffin LD, Marmugi L, Renzoni F. Machine Learning Based Localization and Classification with Atomic Magnetometers. Physical Review Letters 2018;120:033204. [PMID: 29400506 DOI: 10.1103/physrevlett.120.033204] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 09/14/2017] [Revised: 11/24/2017] [Indexed: 06/07/2023]

Williams R, Kontopantelis E, Buchan I, Peek N. Clinical code set engineering for reusing EHR data for research: A review. J Biomed Inform 2017;70:1-13. [PMID: 28442434 DOI: 10.1016/j.jbi.2017.04.010] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Received: 01/23/2017] [Revised: 03/21/2017] [Accepted: 04/13/2017] [Indexed: 01/26/2023]

Abstract

INTRODUCTION

The construction of reliable, reusable clinical code sets is essential when re-using Electronic Health Record (EHR) data for research. Yet code set definitions are rarely transparent and their sharing is almost non-existent. There is a lack of methodological standards for the management (construction, sharing, revision and reuse) of clinical code sets which needs to be addressed to ensure the reliability and credibility of studies which use code sets.

OBJECTIVE

To review methodological literature on the management of sets of clinical codes used in research on clinical databases and to provide a list of best practice recommendations for future studies and software tools.

METHODS

We performed an exhaustive search for methodological papers about clinical code set engineering for re-using EHR data in research. This was supplemented with papers identified by snowball sampling. In addition, a list of e-phenotyping systems was constructed by merging references from several systematic reviews on this topic, and the processes adopted by those systems for code set management were reviewed.

RESULTS

Thirty methodological papers were reviewed. Common approaches included: creating an initial list of synonyms for the condition of interest (n=20); making use of the hierarchical nature of coding terminologies during searching (n=23); reviewing sets with clinician input (n=20); and reusing and updating an existing code set (n=20). Several open source software tools (n=3) were discovered.

DISCUSSION

There is a need for software tools that enable users to easily and quickly create, revise, extend, review and share code sets and we provide a list of recommendations for their design and implementation.

CONCLUSION

Research re-using EHR data could be improved through the further development, more widespread use and routine reporting of the methods by which clinical codes were selected.
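Two of the approaches listed in the results above, searching term descriptions with a list of synonyms and then exploiting the terminology's hierarchy, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the codes, descriptions, and parent links form a hypothetical toy terminology (not Read, SNOMED CT, or ICD), and the function names `search_by_synonyms` and `expand_descendants` do not come from the review or from any of the tools it discusses.

```python
from collections import deque

# Hypothetical toy terminology: code -> (description, parent code or None).
# These codes are invented for illustration and belong to no real coding system.
TERMINOLOGY = {
    "J10": ("Inflammatory joint disease", None),
    "J10.1": ("Rheumatoid arthritis", "J10"),
    "J10.1a": ("Seropositive rheumatoid arthritis", "J10.1"),
    "J10.1b": ("Rheumatoid arthritis of hand", "J10.1"),
    "J10.1c": ("Felty syndrome", "J10.1"),
    "J10.2": ("Psoriatic arthropathy", "J10"),
    "K20": ("Osteoarthritis", None),
}


def search_by_synonyms(terminology, synonyms):
    """Return codes whose description contains any synonym (case-insensitive)."""
    lowered = [s.lower() for s in synonyms]
    return {
        code
        for code, (description, _parent) in terminology.items()
        if any(s in description.lower() for s in lowered)
    }


def expand_descendants(terminology, seeds):
    """Add every descendant of the seed codes using the parent links."""
    children = {}
    for code, (_description, parent) in terminology.items():
        children.setdefault(parent, []).append(code)
    result = set(seeds)
    queue = deque(seeds)
    while queue:
        for child in children.get(queue.popleft(), []):
            if child not in result:
                result.add(child)
                queue.append(child)
    return result


# Step 1: text search with an initial synonym list.
seeds = search_by_synonyms(TERMINOLOGY, ["rheumatoid arthritis"])
# Step 2: hierarchy expansion picks up child codes the text search missed.
code_set = expand_descendants(TERMINOLOGY, seeds)
print(sorted(code_set))
```

In this toy example the hierarchy step adds a child code whose description does not contain the search term. A candidate set built this way would still need the clinician review that the abstract lists as a separate common approach.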

Povalej Brzan P, Obradovic Z, Stiglic G. Contribution of temporal data to predictive performance in 30-day readmission of morbidly obese patients. PeerJ 2017;5:e3230. [PMID: 28462037 PMCID: PMC5407280 DOI: 10.7717/peerj.3230] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 11/21/2016] [Accepted: 03/25/2017] [Indexed: 12/21/2022]

Abstract

Background

Reducing readmissions after discharge is an important challenge for many hospitals and has attracted the interest of many researchers in the past few years. Most studies in this field focus on building cross-sectional predictive models that aim to predict the occurrence of readmission within 30 days based on information from the current hospitalization. The aim of this study is to demonstrate the gain in predictive performance obtained by including information from historical hospitalization records among morbidly obese patients.

Methods

The California Statewide inpatient database was used to build regularized logistic regression models for prediction of readmission in morbidly obese patients (n = 18,881). Temporal features were extracted from historical patient hospitalization records in a one-year timeframe. Five different datasets of patients were prepared based on the number of available hospitalizations per patient. Sample size of the five datasets ranged from 4,787 patients with more than five hospitalizations to 20,521 patients with at least two hospitalization records in one year. A 10-fold cross-validation was repeated 100 times to assess the variability of the results. Additionally, random forest and extreme gradient boosting were used to confirm the results.

Results

Area under the ROC curve increased significantly when including information from up to three historical records on all datasets. The inclusion of more than three historical records was not efficient. Similar results can be observed for the Brier score and PPV. The number of selected predictors corresponded to the complexity of the dataset, ranging from an average of 29.50 selected features on the smallest dataset to 184.96 on the largest dataset, based on 100 repetitions of 10-fold cross-validation.

Discussion

The results show a positive influence of adding information from historical hospitalization records on predictive performance across all predictive modeling techniques used in this study. We can conclude that it is advantageous to build separate readmission prediction models in subgroups of patients with more hospital admissions by aggregating information from up to three previous hospitalizations.
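The evaluation scheme named in the Methods, repeated k-fold cross-validation of a regularized logistic regression scored by area under the ROC curve, can be sketched in plain Python as follows. This is an illustration under stated assumptions and not the authors' pipeline: the abstract does not say which penalty was used, so an L2 penalty fitted by gradient descent is swapped in for simplicity; the data are synthetic; and all function names and hyperparameters (`fit_logreg`, `repeated_kfold_auc`, `l2`, `lr`, `epochs`) are hypothetical.

```python
import math
import random
from statistics import mean, stdev


def predict(w, b, x):
    """Predicted probability from a logistic model."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logreg(X, y, l2=0.01, lr=0.5, epochs=100):
    """L2-regularized logistic regression by batch gradient descent
    (assumed penalty; the abstract only says 'regularized')."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def auc(scores, labels):
    """Area under the ROC curve as the probability that a positive
    case outranks a negative case (ties count one half)."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def repeated_kfold_auc(X, y, k=10, repeats=100, seed=0):
    """Reshuffle and rerun k-fold CV `repeats` times; return the mean and
    standard deviation of the pooled out-of-fold AUC across repeats."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    per_repeat = []
    for _ in range(repeats):
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]
        scores = [0.0] * len(X)
        for fold in folds:
            held_out = set(fold)
            train = [i for i in idx if i not in held_out]
            w, b = fit_logreg([X[i] for i in train], [y[i] for i in train])
            for i in fold:
                scores[i] = predict(w, b, X[i])
        per_repeat.append(auc(scores, y))
    return mean(per_repeat), stdev(per_repeat)


# Synthetic data: the outcome depends on the first feature plus noise.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(60)]
y = [1 if x[0] + 0.3 * data_rng.gauss(0, 1) > 0 else 0 for x in X]

# Small k and few repeats keep the demonstration fast.
mean_auc, sd_auc = repeated_kfold_auc(X, y, k=5, repeats=5)
print(f"AUC over repeats: mean={mean_auc:.3f}, sd={sd_auc:.3f}")
```

Repeating the fold assignment yields a distribution of scores rather than a single estimate, which is how variability of the results can be assessed. The study's setting corresponds to `k=10` and `repeats=100`.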
