Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Bates J, Fodeh SJ, Brandt CA, Womack JA. Classification of radiology reports for falls in an HIV study cohort. J Am Med Inform Assoc 2016;23:e113-7. [PMID: 26567329 PMCID: PMC4954638 DOI: 10.1093/jamia/ocv155] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 04/08/2015] [Revised: 08/14/2015] [Accepted: 09/08/2015] [Indexed: 11/13/2022] Open



Cited by Other Article(s)

Kidwai-Khan F, Wang R, Skanderson M, Brandt CA, Fodeh S, Womack JA. A roadmap to artificial intelligence (AI): Methods for designing and building AI ready data to promote fairness. J Biomed Inform 2024;154:104654. [PMID: 38740316 PMCID: PMC11144439 DOI: 10.1016/j.jbi.2024.104654] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Revised: 05/01/2024] [Accepted: 05/10/2024] [Indexed: 05/16/2024]

Abstract

OBJECTIVES

We evaluated methods for preparing electronic health record data to reduce bias before applying artificial intelligence (AI).

METHODS

We created methods for transforming raw data into a data framework for applying machine learning and natural language processing techniques for predicting falls and fractures. Strategies such as inclusion and reporting for multiple races, mixed data sources such as outpatient, inpatient, structured codes, and unstructured notes, and addressing missingness were applied to raw data to promote a reduction in bias. The raw data was carefully curated using validated definitions to create data variables such as age, race, gender, and healthcare utilization. For the formation of these variables, clinical, statistical, and data expertise were used. The research team included a variety of experts with diverse professional and demographic backgrounds to include diverse perspectives.

RESULTS

For the prediction of falls, information extracted from radiology reports was converted to a matrix for applying machine learning. The processing of the data resulted in an input of 5,377,673 reports to the machine learning algorithm, out of which 45,304 were flagged as positive and 5,332,369 as negative for falls. Processed data resulted in lower missingness and a better representation of race and diagnosis codes. For fractures, specialized algorithms extracted snippets of text around the keyword "femoral" from dual x-ray absorptiometry (DXA) scans to identify femoral neck T-scores that are important for predicting fracture risk. The natural language processing algorithms yielded 98% accuracy and a 2% error rate. The methods to prepare data for input to artificial intelligence processes are reproducible and can be applied to other studies.
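The keyword-window step described above can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function names, the 60-character window, and the T-score regular expression are all assumptions made for illustration, and real DXA report text would likely need more robust patterns.

```python
import re

def keyword_snippets(text, keyword="femoral", window=60):
    """Return text windows of +/- `window` characters around each keyword hit."""
    snippets = []
    for m in re.finditer(re.escape(keyword), text, flags=re.IGNORECASE):
        start = max(0, m.start() - window)
        end = min(len(text), m.end() + window)
        snippets.append(text[start:end])
    return snippets

# Hypothetical pattern: "T-score" followed within a few non-digit characters
# by a signed decimal number (ASCII or Unicode minus sign).
T_SCORE = re.compile(r"\bT[-\s]?score\D{0,15}?([-+−]?\d+(?:\.\d+)?)", re.IGNORECASE)

def femoral_t_scores(text, window=60):
    """Collect T-score values found inside snippets around 'femoral'."""
    scores = []
    for snippet in keyword_snippets(text, "femoral", window):
        for m in T_SCORE.finditer(snippet):
            scores.append(float(m.group(1).replace("−", "-")))
    return scores
```

Overlapping windows can return the same value more than once, so a production version would probably deduplicate by character offset.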

CONCLUSION

The life cycle of data from raw to analytic form includes data governance, cleaning, management, and analysis. When applying artificial intelligence methods, input data must be prepared optimally to reduce algorithmic bias, as biased output is harmful. Building AI-ready data frameworks that improve efficiency can contribute to transparency and reproducibility. The roadmap for the application of AI involves applying specialized techniques to input data, some of which are suggested here. This study highlights data curation aspects to be considered when preparing data for the application of artificial intelligence to reduce bias.


Qiao S, Li X, Olatosi B, Young SD. Utilizing Big Data analytics and electronic health record data in HIV prevention, treatment, and care research: a literature review. AIDS Care 2024;36:583-603. [PMID: 34260325 DOI: 10.1080/09540121.2021.1948499] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/13/2021] [Accepted: 06/22/2021] [Indexed: 01/07/2023]

Casey A, Davidson E, Grover C, Tobin R, Grivas A, Zhang H, Schrempf P, O’Neil AQ, Lee L, Walsh M, Pellie F, Ferguson K, Cvoro V, Wu H, Whalley H, Mair G, Whiteley W, Alex B. Understanding the performance and reliability of NLP tools: a comparison of four NLP tools predicting stroke phenotypes in radiology reports. Front Digit Health 2023;5:1184919. [PMID: 37840686 PMCID: PMC10569314 DOI: 10.3389/fdgth.2023.1184919] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2023] [Accepted: 09/06/2023] [Indexed: 10/17/2023] Open

Abstract

Background

Natural language processing (NLP) has the potential to automate the reading of radiology reports, but there is a need to demonstrate that NLP methods are adaptable and reliable for use in real-world clinical applications.

Methods

We used the F1 score, precision, and recall to compare NLP tools on a cohort from a study on delirium using images and radiology reports from NHS Fife and a population-based cohort (Generation Scotland) that spans multiple National Health Service health boards. We compared four off-the-shelf rule-based and neural NLP tools (namely, EdIE-R, ALARM+, ESPRESSO, and Sem-EHR) and reported on their performance for three cerebrovascular phenotypes, namely, ischaemic stroke, small vessel disease (SVD), and atrophy. Clinical experts from the EdIE-R team defined phenotypes using labelling techniques established during the development of EdIE-R, in conjunction with an expert researcher who read underlying images.
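For readers less familiar with the three metrics named above, the following Python sketch shows how precision, recall, and F1 are conventionally computed from binary gold-standard and predicted labels. The function name and the example labels are hypothetical and are not taken from the study.

```python
def precision_recall_f1(gold, predicted):
    """Compute precision, recall, and F1 for binary labels (1 = phenotype present)."""
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical labels for six reports (not study data).
gold = [1, 1, 1, 0, 0, 1]
pred = [1, 1, 0, 0, 1, 1]
precision, recall, f1 = precision_recall_f1(gold, pred)
```

Because F1 ignores true negatives, a tool can have high precision and still a modest F1 if its recall is low, which is consistent with the pattern the abstract reports for ESPRESSO.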

Results

EdIE-R obtained the highest F1 score in both cohorts for ischaemic stroke, ≥93%, followed by ALARM+, ≥87%. The F1 score of ESPRESSO was ≥74%, whilst that of Sem-EHR was ≥66%, although ESPRESSO had the highest precision in both cohorts, 90% and 98%. For F1 scores for SVD, EdIE-R scored ≥98% and ALARM+ ≥90%. ESPRESSO scored lowest at ≥77%, with Sem-EHR at ≥81%. In NHS Fife, F1 scores for atrophy by EdIE-R and ALARM+ were 99%, dropping in Generation Scotland to 96% for EdIE-R and 91% for ALARM+. Sem-EHR performed lowest for atrophy at 89% in NHS Fife and 73% in Generation Scotland. When comparing NLP tool output with brain image reads using F1 scores, ALARM+ scored 80%, outperforming EdIE-R at 66% in ischaemic stroke. For SVD, EdIE-R performed best, scoring 84%, with Sem-EHR at 82%. For atrophy, EdIE-R and both ALARM+ versions were comparable at 80%.

Conclusions

The four NLP tools show varying F1 (and precision/recall) scores across all three phenotypes, although the variation is more apparent for ischaemic stroke. If NLP tools are to be used in clinical settings, this cannot be performed "out of the box." It is essential to understand the context of their development to assess whether they are suitable for the task at hand or whether further training, re-training, or modification is required to adapt tools to the target task.


Affiliation(s)

Arlene Casey Advanced Care Research Centre, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
Emma Davidson Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, United Kingdom
Claire Grover School of Informatics, University of Edinburgh, Edinburgh, United Kingdom
Richard Tobin School of Informatics, University of Edinburgh, Edinburgh, United Kingdom
Andreas Grivas School of Informatics, University of Edinburgh, Edinburgh, United Kingdom
Huayu Zhang Advanced Care Research Centre, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
Patrick Schrempf Canon Medical Research Europe Ltd., AI Research, Edinburgh, United Kingdom; School of Computer Science, University of St Andrews, St Andrews, United Kingdom
Alison Q. O’Neil Canon Medical Research Europe Ltd., AI Research, Edinburgh, United Kingdom; School of Engineering, University of Edinburgh, Edinburgh, United Kingdom
Liam Lee Medical School, University of Edinburgh, Edinburgh, United Kingdom
Michael Walsh Intensive Care Department, University Hospitals Bristol and Weston, Bristol, United Kingdom
Freya Pellie National Horizons Centre, Teesside University, Darlington, United Kingdom; School of Health and Life Sciences, Teesside University, Middlesbrough, United Kingdom
Karen Ferguson Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, United Kingdom
Vera Cvoro Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, United Kingdom; Department of Geriatric Medicine, NHS Fife, Fife, United Kingdom
Honghan Wu Institute of Health Informatics, University College London, London, United Kingdom; Alan Turing Institute, London, United Kingdom
Heather Whalley Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, United Kingdom; Generation Scotland, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, United Kingdom
Grant Mair Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, United Kingdom; Neuroradiology, Department of Clinical Neurosciences, NHS Lothian, Edinburgh, United Kingdom
William Whiteley Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, United Kingdom; Neuroradiology, Department of Clinical Neurosciences, NHS Lothian, Edinburgh, United Kingdom
Beatrice Alex Edinburgh Futures Institute, University of Edinburgh, Edinburgh, United Kingdom; School of Literatures, Languages and Cultures, University of Edinburgh, Edinburgh, United Kingdom


Mottin L, Goldman JP, Jäggli C, Achermann R, Gobeill J, Knafou J, Ehrsam J, Wicky A, Gérard CL, Schwenk T, Charrier M, Tsantoulis P, Lovis C, Leichtle A, Kiessling MK, Michielin O, Pradervand S, Foufi V, Ruch P. Multilingual RECIST classification of radiology reports using supervised learning. Front Digit Health 2023;5:1195017. [PMID: 37388252 PMCID: PMC10303934 DOI: 10.3389/fdgth.2023.1195017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Accepted: 06/05/2023] [Indexed: 07/01/2023] Open

Affiliation(s)

Luc Mottin HES-SO\HEG Genève, Information Sciences, Geneva, Switzerland; SIB Text Mining Group, Swiss Institute of Bioinformatics, Geneva, Switzerland
Jean-Philippe Goldman Division of Medical Information Sciences, University Hospitals of Geneva, Geneva, Switzerland
Christoph Jäggli Inselspital – Bern University Hospital and University of Bern, Bern, Switzerland
Rita Achermann Department of Radiology, Clinic of Radiology & Nuclear Medicine, University Hospital Basel, University of Basel, Basel, Switzerland
Julien Gobeill HES-SO\HEG Genève, Information Sciences, Geneva, Switzerland; SIB Text Mining Group, Swiss Institute of Bioinformatics, Geneva, Switzerland
Julien Knafou HES-SO\HEG Genève, Information Sciences, Geneva, Switzerland; SIB Text Mining Group, Swiss Institute of Bioinformatics, Geneva, Switzerland
Julien Ehrsam Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
Alexandre Wicky Precision Oncology Center, Oncology Department, Centre Hospitalier Universitaire Vaudois – CHUV, Lausanne, Switzerland
Camille L. Gérard Precision Oncology Center, Oncology Department, Centre Hospitalier Universitaire Vaudois – CHUV, Lausanne, Switzerland
Tanja Schwenk Department of Oncology, Kantonsspital Aarau, Aarau, Switzerland
Mélinda Charrier Division of Medical Information Sciences, University Hospitals of Geneva, Geneva, Switzerland
Petros Tsantoulis Division of Medical Information Sciences, University Hospitals of Geneva, Geneva, Switzerland; Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
Christian Lovis Division of Medical Information Sciences, University Hospitals of Geneva, Geneva, Switzerland; Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
Alexander Leichtle Inselspital – Bern University Hospital and University of Bern, Bern, Switzerland
Michael K. Kiessling Department of Medical Oncology and Hematology, University Hospital Zurich, Zurich, Switzerland
Olivier Michielin Precision Oncology Center, Oncology Department, Centre Hospitalier Universitaire Vaudois – CHUV, Lausanne, Switzerland
Sylvain Pradervand Precision Oncology Center, Oncology Department, Centre Hospitalier Universitaire Vaudois – CHUV, Lausanne, Switzerland
Vasiliki Foufi Division of Medical Information Sciences, University Hospitals of Geneva, Geneva, Switzerland
Patrick Ruch HES-SO\HEG Genève, Information Sciences, Geneva, Switzerland; SIB Text Mining Group, Swiss Institute of Bioinformatics, Geneva, Switzerland


Womack JA, Murphy TE, Leo-Summers L, Bates J, Jarad S, Gill TM, Hsieh E, Rodriguez-Barradas MC, Tien PC, Yin MT, Brandt CA, Justice AC. Assessing the contributions of modifiable risk factors to serious falls and fragility fractures among older persons living with HIV. J Am Geriatr Soc 2023;71:1891-1901. [PMID: 36912153 PMCID: PMC10258163 DOI: 10.1111/jgs.18304] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2022] [Revised: 01/14/2023] [Accepted: 01/25/2023] [Indexed: 03/14/2023]

Abstract

BACKGROUND

Although 50 years represents middle age among uninfected individuals, studies have shown that persons living with HIV (PWH) begin to demonstrate elevated risk for serious falls and fragility fractures in the sixth decade; the proportions of these outcomes attributable to modifiable factors are unknown.

METHODS

We analyzed 21,041 older PWH on antiretroviral therapy (ART) from the Veterans Aging Cohort Study from 01/01/2010 through 09/30/2015. Serious falls were identified by Ecodes and a machine-learning algorithm applied to radiology reports. Fragility fractures (hip, vertebral, and upper arm) were identified using ICD9 codes. Predictors for both models included a serious fall within the past 12 months, body mass index, physiologic frailty (VACS Index 2.0), illicit substance and alcohol use disorders, and measures of multimorbidity and polypharmacy. We separately fit multivariable logistic models to each outcome using generalized estimating equations. From these models, the longitudinal extensions of average attributable fraction (LE-AAF) for modifiable risk factors were estimated.

RESULTS

Key risk factors for both outcomes included physiologic frailty (VACS Index 2.0) (serious falls [15%; 95% CI 14%-15%]; fractures [13%; 95% CI 12%-14%]), a serious fall in the past year (serious falls [7%; 95% CI 7%-7%]; fractures [5%; 95% CI 4%-5%]), polypharmacy (serious falls [5%; 95% CI 4%-5%]; fractures [5%; 95% CI 4%-5%]), an opioid prescription in the past month (serious falls [7%; 95% CI 6%-7%]; fractures [9%; 95% CI 8%-9%]), and diagnosis of alcohol use disorder (serious falls [4%; 95% CI 4%-5%]; fractures [8%; 95% CI 7%-8%]).

CONCLUSIONS

This study confirms the contributions of risk factors important in the general population to both serious falls and fragility fractures among older PWH. Successful prevention programs for these outcomes should build on existing prevention efforts while including risk factors specific to PWH.


Kidwai-Khan F, Wang R, Skanderson M, Brandt CA, Fodeh S, Womack JA. A Roadmap to Artificial Intelligence (AI): Methods for Designing and Building AI ready Data for Women's Health Studies. medRxiv 2023:2023.05.25.23290399. [PMID: 37398113 PMCID: PMC10312839 DOI: 10.1101/2023.05.25.23290399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]

Womack JA, Murphy TE, Leo-Summers L, Bates J, Jarad S, Smith AC, Gill TM, Hsieh E, Rodriguez-Barradas MC, Tien PC, Yin MT, Brandt CA, Justice AC. Predictive Risk Model for Serious Falls Among Older Persons Living With HIV. J Acquir Immune Defic Syndr 2022;91:168-174. [PMID: 36094483 PMCID: PMC9470988 DOI: 10.1097/qai.0000000000003030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2022] [Accepted: 04/26/2022] [Indexed: 11/26/2022]

Dipnall JF, Lu J, Gabbe BJ, Cosic F, Edwards E, Page R, Du L. Comparison of state-of-the-art machine and deep learning algorithms to classify proximal humeral fractures using radiology text. Eur J Radiol 2022;153:110366. [DOI: 10.1016/j.ejrad.2022.110366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2022] [Revised: 04/08/2022] [Accepted: 05/16/2022] [Indexed: 12/01/2022]

Davidson EM, Poon MTC, Casey A, Grivas A, Duma D, Dong H, Suárez-Paniagua V, Grover C, Tobin R, Whalley H, Wu H, Alex B, Whiteley W. The reporting quality of natural language processing studies: systematic review of studies of radiology reports. BMC Med Imaging 2021;21:142. [PMID: 34600486 PMCID: PMC8487512 DOI: 10.1186/s12880-021-00671-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 04/01/2021] [Accepted: 09/20/2021] [Indexed: 01/04/2023] Open

Abstract

BACKGROUND

Automated language analysis of radiology reports using natural language processing (NLP) can provide valuable information on patients' health and disease. With its rapid development, NLP studies should have transparent methodology to allow comparison of approaches and reproducibility. This systematic review aims to summarise the characteristics and reporting quality of studies applying NLP to radiology reports.

METHODS

We searched Google Scholar for studies published in English that applied NLP to radiology reports of any imaging modality between January 2015 and October 2019. At least two reviewers independently performed screening and completed data extraction. We specified 15 criteria relating to data source, datasets, ground truth, outcomes, and reproducibility for quality assessment. The primary NLP performance measures were precision, recall and F1 score.

RESULTS

Of the 4,836 records retrieved, we included 164 studies that used NLP on radiology reports. The commonest clinical applications of NLP were disease information or classification (28%) and diagnostic surveillance (27.4%). Most studies used English radiology reports (86%). Reports from mixed imaging modalities were used in 28% of the studies. Oncology (24%) was the most frequent disease area. Most studies had dataset size > 200 (85.4%) but the proportions of studies that described their annotated, training, validation, and test sets were 67.1%, 63.4%, 45.7%, and 67.7% respectively. About half of the studies reported precision (48.8%) and recall (53.7%). Few studies reported external validation (10.8%), data availability (8.5%) and code availability (9.1%). There was no pattern of performance associated with the overall reporting quality.

CONCLUSIONS

There is a range of potential clinical applications for NLP of radiology reports in health services and research. However, we found suboptimal reporting quality that precludes comparison, reproducibility, and replication. Our results support the need for development of reporting standards specific to clinical NLP studies.


Affiliation(s)

Emma M Davidson Centre for Clinical Brain Sciences, University of Edinburgh, Chancellor's Building, Little France, Edinburgh, EH16 4TJ, Scotland, UK.
Michael T C Poon Centre for Medical Informatics, Usher Institute, University of Edinburgh, Edinburgh, Scotland, UK; Brain Tumour Centre of Excellence, Cancer Research UK Edinburgh Centre, University of Edinburgh, Edinburgh, Scotland, UK
Arlene Casey School of Literatures, Languages and Cultures (LLC), University of Edinburgh, Edinburgh, Scotland, UK
Andreas Grivas School of Literatures, Languages and Cultures (LLC), University of Edinburgh, Edinburgh, Scotland, UK
Daniel Duma School of Literatures, Languages and Cultures (LLC), University of Edinburgh, Edinburgh, Scotland, UK
Hang Dong Centre for Medical Informatics, Usher Institute, University of Edinburgh, Edinburgh, Scotland, UK; Health Data Research UK, London, UK
Víctor Suárez-Paniagua Centre for Medical Informatics, Usher Institute, University of Edinburgh, Edinburgh, Scotland, UK; Health Data Research UK, London, UK
Claire Grover Institute for Language, Cognition and Computation, School of Informatics, University of Edinburgh, Edinburgh, Scotland, UK
Richard Tobin Institute for Language, Cognition and Computation, School of Informatics, University of Edinburgh, Edinburgh, Scotland, UK
Heather Whalley Centre for Clinical Brain Sciences, University of Edinburgh, Chancellor's Building, Little France, Edinburgh, EH16 4TJ, Scotland, UK; Division of Psychiatry, University of Edinburgh, Edinburgh, UK
Honghan Wu Health Data Research UK, London, UK; Institute of Health Informatics, University College London, London, UK
Beatrice Alex School of Literatures, Languages and Cultures (LLC), University of Edinburgh, Edinburgh, Scotland, UK; Edinburgh Futures Institute, University of Edinburgh, Edinburgh, Scotland, UK
William Whiteley Centre for Clinical Brain Sciences, University of Edinburgh, Chancellor's Building, Little France, Edinburgh, EH16 4TJ, Scotland, UK; Health Data Research UK, London, UK; Nuffield Department of Population Health, University of Oxford, Oxford, UK


Womack JA, Murphy TE, Ramsey C, Bathulapalli H, Leo-Summers L, Smith AC, Bates J, Jarad S, Gill TM, Hsieh E, Rodriguez-Barradas MC, Tien PC, Yin MT, Brandt C, Justice AC. Brief Report: Are Serious Falls Associated With Subsequent Fragility Fractures Among Veterans Living With HIV? J Acquir Immune Defic Syndr 2021;88:192-196. [PMID: 34506360 PMCID: PMC8513792 DOI: 10.1097/qai.0000000000002752] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/10/2021] [Accepted: 06/09/2021] [Indexed: 11/26/2022]

Jing X. The Unified Medical Language System at 30 Years and How It Is Used and Published: Systematic Review and Content Analysis. JMIR Med Inform 2021;9:e20675. [PMID: 34236337 PMCID: PMC8433943 DOI: 10.2196/20675] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/25/2020] [Revised: 11/25/2020] [Accepted: 07/02/2021] [Indexed: 01/22/2023] Open

Abstract

BACKGROUND

The Unified Medical Language System (UMLS) has been a critical tool in biomedical and health informatics, and the year 2021 marks its 30th anniversary. The UMLS brings together many broadly used vocabularies and standards in the biomedical field to facilitate interoperability among different computer systems and applications.

OBJECTIVE

Despite its longevity, there is no comprehensive publication analysis of the use of the UMLS. This review and analysis therefore provides an overview of the UMLS and a comprehensive account of how it has been used in English-language peer-reviewed publications over the last 30 years.

METHODS

PubMed, ACM Digital Library, and the Nursing & Allied Health Database were used to search for studies. The primary search strategy was as follows: UMLS was used as a Medical Subject Headings term or a keyword or appeared in the title or abstract. Only English-language publications were considered. The publications were screened first, then coded and categorized iteratively, following the grounded theory. The review process followed the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines.

RESULTS

A total of 943 publications were included in the final analysis. Moreover, 32 publications were categorized into 2 categories; hence the total number of publications before duplicates are removed is 975. After analysis and categorization of the publications, UMLS was found to be used in the following emerging themes or areas (the number of publications and their respective percentages are given in parentheses): natural language processing (230/975, 23.6%), information retrieval (125/975, 12.8%), terminology study (90/975, 9.2%), ontology and modeling (80/975, 8.2%), medical subdomains (76/975, 7.8%), other language studies (53/975, 5.4%), artificial intelligence tools and applications (46/975, 4.7%), patient care (35/975, 3.6%), data mining and knowledge discovery (25/975, 2.6%), medical education (20/975, 2.1%), degree-related theses (13/975, 1.3%), digital library (5/975, 0.5%), and the UMLS itself (150/975, 15.4%), as well as the UMLS for other purposes (27/975, 2.8%).
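The counts above can be checked with a short Python sketch: the theme counts are copied from the abstract, while the dictionary layout and variable names are only illustrative. It confirms that the 14 theme counts sum to 975 (943 publications plus 32 counted twice) and recomputes each percentage.

```python
# Theme counts as reported in the abstract (denominator 975 = 943 + 32 double-counted).
themes = {
    "natural language processing": 230,
    "information retrieval": 125,
    "terminology study": 90,
    "ontology and modeling": 80,
    "medical subdomains": 76,
    "other language studies": 53,
    "artificial intelligence tools and applications": 46,
    "patient care": 35,
    "data mining and knowledge discovery": 25,
    "medical education": 20,
    "degree-related theses": 13,
    "digital library": 5,
    "the UMLS itself": 150,
    "the UMLS for other purposes": 27,
}

total = sum(themes.values())
# Percentage share of each theme, rounded to one decimal as in the abstract.
shares = {name: round(100 * count / total, 1) for name, count in themes.items()}

for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name}: {themes[name]}/{total} ({share}%)")
```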

CONCLUSIONS

The UMLS has been used successfully in patient care, medical education, digital libraries, and software development, as originally planned, as well as in degree-related theses, the building of artificial intelligence tools, data mining and knowledge discovery, foundational work in methodology, and middle layers that may lead to advanced products. Natural language processing, the UMLS itself, and information retrieval are the 3 most common themes that emerged among the included publications. The results, although largely related to academia, demonstrate that UMLS achieves its intended uses successfully, in addition to achieving uses broadly beyond its original intentions.


Casey A, Davidson E, Poon M, Dong H, Duma D, Grivas A, Grover C, Suárez-Paniagua V, Tobin R, Whiteley W, Wu H, Alex B. A systematic review of natural language processing applied to radiology reports. BMC Med Inform Decis Mak 2021;21:179. [PMID: 34082729 PMCID: PMC8176715 DOI: 10.1186/s12911-021-01533-7] [Citation(s) in RCA: 61] [Impact Index Per Article: 20.3] [Received: 02/09/2021] [Accepted: 05/17/2021] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Natural language processing (NLP) has a significant role in advancing healthcare and has been found to be key in extracting structured information from radiology reports. Understanding recent developments in NLP application to radiology is of significance but recent reviews on this are limited. This study systematically assesses and quantifies recent literature in NLP applied to radiology reports.

METHODS

We conduct an automated literature search yielding 4836 results using automated filtering, metadata enriching steps and citation search combined with manual review. Our analysis is based on 21 variables including radiology characteristics, NLP methodology, performance, study, and clinical application characteristics.

RESULTS

We present a comprehensive analysis of the 164 publications retrieved with publications in 2019 almost triple those in 2015. Each publication is categorised into one of 6 clinical application categories. Deep learning use increases in the period but conventional machine learning approaches are still prevalent. Deep learning remains challenged when data is scarce and there is little evidence of adoption into clinical practice. Despite 17% of studies reporting greater than 0.85 F1 scores, it is hard to comparatively evaluate these approaches given that most of them use different datasets. Only 14 studies made their data and 15 their code available with 10 externally validating results.

CONCLUSIONS

Automated understanding of clinical narratives of the radiology reports has the potential to enhance the healthcare process and we show that research in this field continues to grow. Reproducibility and explainability of models are important if the domain is to move applications into clinical use. More could be done to share code enabling validation of methods on different institutional data and to reduce heterogeneity in reporting of study properties allowing inter-study comparisons. Our results have significance for researchers in the field providing a systematic synthesis of existing work to build on, identify gaps, opportunities for collaboration and avoid duplication.


Simon ST, Mandair D, Tiwari P, Rosenberg MA. Prediction of Drug-Induced Long QT Syndrome Using Machine Learning Applied to Harmonized Electronic Health Record Data. J Cardiovasc Pharmacol Ther 2021;26:335-340. [PMID: 33682475 DOI: 10.1177/1074248421995348] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 01/08/2023]

Abstract

BACKGROUND

Drug-induced QT prolongation is a potentially preventable cause of morbidity and mortality; however, no clinical tools are in widespread use to predict which individuals are at greatest risk. Machine learning (ML) algorithms may provide a method for identifying these individuals, and could be automated to directly alert providers in real time.

OBJECTIVE

This study applies ML techniques to electronic health record (EHR) data to identify an integrated risk-prediction model that can be deployed to predict risk of drug-induced QT prolongation.

METHODS

We examined harmonized data from the UCHealth EHR and identified inpatients who had received a medication known to prolong the QT interval. Using a binary outcome of the development of a QTc interval >500 ms within 24 hours of medication initiation or no ECG with a QTc interval >500 ms, we compared multiple machine learning methods by classification accuracy and performed calibration and rescaling of the final model.

RESULTS

We identified 35,639 inpatients who received a known QT-prolonging medication and an ECG performed within 24 hours of administration. Of those, 4,558 patients developed a QTc > 500 ms and 31,081 patients did not. A deep neural network with random oversampling of controls was found to provide superior classification accuracy (F1 score 0.404; AUC 0.71) for the development of a long QT interval compared with other methods. The optimal cutpoint for prediction was determined and was reasonably accurate (sensitivity 71%; specificity 73%).
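The reported figures can be related to one another with a small worked calculation. The Python sketch below is an illustration, not part of the study: it assumes the reported sensitivity and specificity apply to the full cohort of 4,558 cases and 31,081 non-cases at a single cutpoint, which the abstract does not state, and the function name is hypothetical.

```python
def implied_metrics(n_pos, n_neg, sensitivity, specificity):
    """Derive an implied confusion matrix and summary metrics from class counts."""
    tp = sensitivity * n_pos   # cases correctly flagged
    fn = n_pos - tp            # cases missed
    tn = specificity * n_neg   # non-cases correctly cleared
    fp = n_neg - tn            # non-cases incorrectly flagged
    return {
        "prevalence": n_pos / (n_pos + n_neg),
        "precision": tp / (tp + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

# Counts and rates as reported in the abstract; applying them together is an assumption.
metrics = implied_metrics(n_pos=4558, n_neg=31081, sensitivity=0.71, specificity=0.73)
```

Under these assumptions the implied F1 works out to about 0.40, in line with the reported 0.404. With roughly 13% of patients developing a QTc > 500 ms, even a specificity of 73% leaves more false positives than true positives, which is why precision (and hence F1) is much lower than sensitivity.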

CONCLUSIONS

We found that deep neural networks applied to EHR data provide reasonable prediction of which individuals are most susceptible to drug-induced QT prolongation. Future studies are needed to validate this model in novel EHRs and within the physician order entry system to assess the ability to improve patient safety.


Oliveira CR, Niccolai P, Ortiz AM, Sheth SS, Shapiro ED, Niccolai LM, Brandt CA. Natural Language Processing for Surveillance of Cervical and Anal Cancer and Precancer: Algorithm Development and Split-Validation Study. JMIR Med Inform 2020;8:e20826. [PMID: 32469840 PMCID: PMC7671846 DOI: 10.2196/20826] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 05/29/2020] [Revised: 09/18/2020] [Accepted: 10/04/2020] [Indexed: 12/13/2022] Open

Abstract

Background

Accurate identification of new diagnoses of human papillomavirus–associated cancers and precancers is an important step toward the development of strategies that optimize the use of human papillomavirus vaccines. The diagnosis of human papillomavirus cancers hinges on a histopathologic report, which is typically stored in electronic medical records as free-form, or unstructured, narrative text. Previous efforts to perform surveillance for human papillomavirus cancers have relied on the manual review of pathology reports to extract diagnostic information, a process that is both labor- and resource-intensive. Natural language processing can be used to automate the structuring and extraction of clinical data from unstructured narrative text in medical records and may provide a practical and effective method for identifying patients with vaccine-preventable human papillomavirus disease for surveillance and research.

Objective

This study's objective was to develop and assess the accuracy of a natural language processing algorithm for the identification of individuals with cancer or precancer of the cervix and anus.

Methods

A pipeline-based natural language processing algorithm was developed, which incorporated machine learning and rule-based methods to extract diagnostic elements from the narrative pathology reports. To test the algorithm’s classification accuracy, we used a split-validation study design. Full-length cervical and anal pathology reports were randomly selected from 4 clinical pathology laboratories. Two study team members, blinded to the classifications produced by the natural language processing algorithm, manually and independently reviewed all reports and classified them at the document level according to 2 domains (diagnosis and human papillomavirus testing results). Using the manual review as the gold standard, the algorithm’s performance was evaluated using standard measurements of accuracy, recall, precision, and F-measure.

Results

The natural language processing algorithm’s performance was validated on 949 pathology reports. The algorithm demonstrated accurate identification of abnormal cytology, histology, and positive human papillomavirus tests with accuracies greater than 0.91. Precision was lowest for anal histology reports (0.87, 95% CI 0.59-0.98) and highest for cervical cytology (0.98, 95% CI 0.95-0.99). The natural language processing algorithm missed 2 out of the 15 abnormal anal histology reports, which led to a relatively low recall (0.68, 95% CI 0.43-0.87).

Conclusions

This study outlines the development and validation of a freely available and easily implementable natural language processing algorithm that can automate the extraction and classification of clinical data from cervical and anal cytology and histology.

Womack JA, Murphy TE, Bathulapalli H, Smith A, Bates J, Jarad S, Redeker NS, Luther SL, Gill TM, Brandt CA, Justice AC. Serious Falls in Middle-Aged Veterans: Development and Validation of a Predictive Risk Model. J Am Geriatr Soc 2020;68:2847-2854. [PMID: 32860222 DOI: 10.1111/jgs.16773] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2020] [Revised: 07/13/2020] [Accepted: 07/14/2020] [Indexed: 12/27/2022]

Abstract

BACKGROUND/OBJECTIVES

Due to high rates of multimorbidity, polypharmacy, and hazardous alcohol and opioid use, middle-aged Veterans are at risk for serious falls (those prompting a visit with a healthcare provider), posing significant risk to their forthcoming geriatric health and quality of life. We developed and validated a predictive model of the 6-month risk of serious falls among middle-aged Veterans.

DESIGN

Cohort study.

SETTING

Veterans Health Administration (VA).

PARTICIPANTS

Veterans, aged 45 to 65 years, who presented for care within the VA between 2012 and 2015 (N = 275,940).

EXPOSURES

The exposures of primary interest were substance use (including alcohol and prescription opioid use), multimorbidity, and polypharmacy. Hazardous alcohol use was defined as an Alcohol Use Disorders Identification Test - Consumption (AUDIT-C) score of 3 or greater for women and 4 or greater for men. We used International Classification of Diseases, Ninth Revision (ICD-9), codes to identify alcohol and illicit substance use disorders and identified prescription opioid use from pharmacy fill-refill data. We included counts of chronic medications and of physical and mental health comorbidities.
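The sex-specific AUDIT-C rule described above can be expressed as a small function. This is a sketch only: the function name and the string coding of sex are hypothetical, and the 0 to 12 range check reflects the standard AUDIT-C scoring rather than anything stated in the abstract.

```python
def is_hazardous_alcohol_use(audit_c_score, sex):
    """Apply the abstract's rule: score >= 3 for women, >= 4 for men.

    `sex` is assumed to be coded as "female" or "male" (hypothetical coding).
    """
    if not 0 <= audit_c_score <= 12:
        raise ValueError("AUDIT-C scores range from 0 to 12")
    if sex == "female":
        threshold = 3
    elif sex == "male":
        threshold = 4
    else:
        raise ValueError("sex must be 'female' or 'male' in this sketch")
    return audit_c_score >= threshold


print(is_hazardous_alcohol_use(3, "female"))  # True
print(is_hazardous_alcohol_use(3, "male"))    # False
```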

MEASUREMENTS

We identified serious falls using external cause of injury codes and a machine-learning algorithm that identified serious falls in radiology reports. We used multivariable logistic regression with generalized estimating equations to calculate risk. We used an integrated predictiveness curve to identify intervention thresholds.

RESULTS

Most of our sample (54%) was aged 60 years or younger. Duration of follow-up was up to 4 years. Veterans who fell were more likely to be female (11% vs 7%) and White (72% vs 68%). They experienced 43,641 serious falls during follow-up. We identified 16 key predictors of serious falls and five interaction terms. Model performance was enhanced by addition of opioid use, as evidenced by overall category-free net reclassification improvement of 0.32 (P < .001). Discrimination (C-statistic = 0.76) and calibration were excellent for both development and validation data sets.
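For readers unfamiliar with the discrimination measure quoted above, the C-statistic is the probability that a randomly chosen person who fell was assigned a higher predicted risk than a randomly chosen person who did not. The sketch below computes it by brute-force pair counting on made-up numbers; it is an illustration of the general definition, not the authors' code, and the toy data are invented.

```python
def c_statistic(risks, outcomes):
    """Concordance between predicted risks and binary outcomes (1 = event).

    Ties in predicted risk count as half-concordant.
    """
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    nonevents = [r for r, y in zip(risks, outcomes) if y == 0]
    if not events or not nonevents:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in nonevents:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(nonevents))


# Toy data (invented): two people with events, two without.
toy_risks = [0.9, 0.4, 0.35, 0.2]
toy_outcomes = [1, 0, 1, 0]
print(c_statistic(toy_risks, toy_outcomes))  # 0.75
```

A value of 0.5 corresponds to chance and 1.0 to perfect ranking. The pairwise loop is quadratic, so a cohort of the size reported here would call for a rank-based implementation instead.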

CONCLUSION

We developed and internally validated a model to predict 6-month risk of serious falls among middle-aged Veterans with excellent discrimination and calibration.

Polypharmacy, Hazardous Alcohol and Illicit Substance Use, and Serious Falls Among PLWH and Uninfected Comparators. J Acquir Immune Defic Syndr 2020;82:305-313. [PMID: 31339866 DOI: 10.1097/qai.0000000000002130] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/06/2023]

Spasic I, Nenadic G. Clinical Text Data in Machine Learning: Systematic Review. JMIR Med Inform 2020;8:e17984. [PMID: 32229465 PMCID: PMC7157505 DOI: 10.2196/17984] [Citation(s) in RCA: 111] [Impact Index Per Article: 27.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/28/2020] [Revised: 02/24/2020] [Accepted: 02/24/2020] [Indexed: 12/22/2022] Open

Abstract

Background

Clinical narratives represent the main form of communication within health care, providing a personalized account of patient history and assessments, and offering rich information for clinical decision making. Natural language processing (NLP) has repeatedly demonstrated its feasibility to unlock evidence buried in clinical narratives. Machine learning can facilitate rapid development of NLP tools by leveraging large amounts of text data.

Objective

The main aim of this study was to provide systematic evidence on the properties of text data used to train machine learning approaches to clinical NLP. We also investigated the types of NLP tasks that have been supported by machine learning and how they can be applied in clinical practice.

Methods

Our methodology was based on the guidelines for performing systematic reviews. In August 2018, we used PubMed, a multifaceted interface, to perform a literature search against MEDLINE. We identified 110 relevant studies and extracted information about text data used to support machine learning, NLP tasks supported, and their clinical applications. The data properties considered included their size, provenance, collection methods, annotation, and any relevant statistics.

Results

The majority of datasets used to train machine learning models included only hundreds or thousands of documents. Only 10 studies used tens of thousands of documents, with a handful of studies utilizing more. Relatively small datasets were utilized for training even when much larger datasets were available. The main reason for such poor data utilization is the annotation bottleneck faced by supervised machine learning algorithms. Active learning was explored to iteratively sample a subset of data for manual annotation as a strategy for minimizing the annotation effort while maximizing the predictive performance of the model. Supervised learning was successfully used where clinical codes integrated with free-text notes into electronic health records were utilized as class labels. Similarly, distant supervision was used to utilize an existing knowledge base to automatically annotate raw text. Where manual annotation was unavoidable, crowdsourcing was explored, but it remains unsuitable because of the sensitive nature of data considered. Besides the small volume, training data were typically sourced from a small number of institutions, thus offering no hard evidence about the transferability of machine learning models. The majority of studies focused on text classification. Most commonly, the classification results were used to support phenotyping, prognosis, care improvement, resource management, and surveillance.
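To make the active learning idea above concrete, here is a minimal sketch of one common selection rule, uncertainty sampling, in which the documents the current model is least sure about are sent for manual annotation first. The review does not specify which sampling strategies the included studies used, so this choice, the function name, and the example probabilities are all assumptions made for illustration.

```python
def select_for_annotation(probabilities, batch_size):
    """Return indices of the unlabeled documents closest to p = 0.5.

    `probabilities` holds the current model's predicted probability of the
    positive class for each unlabeled document.
    """
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),  # smaller = more uncertain
    )
    return ranked[:batch_size]


# Invented predictions for five unlabeled documents.
predicted = [0.95, 0.52, 0.10, 0.45, 0.70]
print(select_for_annotation(predicted, 2))  # [1, 3]
```

In a full loop, the selected documents would be labeled, added to the training set, and the model retrained before the next batch is chosen.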

Conclusions

We identified the data annotation bottleneck as one of the key obstacles to machine learning approaches in clinical NLP. Active learning and distant supervision were explored as a way of saving the annotation efforts. Future research in this field would benefit from alternatives such as data augmentation and transfer learning, or unsupervised learning, which do not require data annotation.

Song X, Waitman LR, Hu Y, Yu ASL, Robins D, Liu M. Robust clinical marker identification for diabetic kidney disease with ensemble feature selection. J Am Med Inform Assoc 2019;26:242-253. [PMID: 30602020 PMCID: PMC7792755 DOI: 10.1093/jamia/ocy165] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2018] [Revised: 11/05/2018] [Accepted: 11/21/2018] [Indexed: 11/15/2022] Open

Feller DJ, Zucker J, Don't Walk OB, Srikishan B, Martinez R, Evans H, Yin MT, Gordon P, Elhadad N. Towards the Inference of Social and Behavioral Determinants of Sexual Health: Development of a Gold-Standard Corpus with Semi-Supervised Learning. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2018;2018:422-429. [PMID: 30815082 PMCID: PMC6371339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]

Zhao Y, Wong ZSY, Tsui KL. A Framework of Rebalancing Imbalanced Healthcare Data for Rare Events' Classification: A Case of Look-Alike Sound-Alike Mix-Up Incident Detection. JOURNAL OF HEALTHCARE ENGINEERING 2018;2018:6275435. [PMID: 29951182 PMCID: PMC5987310 DOI: 10.1155/2018/6275435] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/11/2017] [Revised: 02/02/2018] [Accepted: 02/22/2018] [Indexed: 11/17/2022]

Gehrmann S, Dernoncourt F, Li Y, Carlson ET, Wu JT, Welt J, Foote J, Moseley ET, Grant DW, Tyler PD, Celi LA. Comparing deep learning and concept extraction based methods for patient phenotyping from clinical narratives. PLoS One 2018;13:e0192360. [PMID: 29447188 PMCID: PMC5813927 DOI: 10.1371/journal.pone.0192360] [Citation(s) in RCA: 95] [Impact Index Per Article: 15.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2017] [Accepted: 01/21/2018] [Indexed: 01/22/2023] Open

Abstract

In secondary analysis of electronic health records, a crucial task is correctly identifying the patient cohort under investigation. In many cases, the most valuable and relevant information for an accurate classification of medical conditions exists only in clinical narratives. Therefore, it is necessary to use natural language processing (NLP) techniques to extract and evaluate these narratives. The most commonly used approach to this problem relies on extracting a number of clinician-defined medical concepts from text and using machine learning techniques to identify whether a particular patient has a certain condition. However, recent advances in deep learning and NLP enable models to learn a rich representation of (medical) language. Convolutional neural networks (CNN) for text classification can augment the existing techniques by leveraging the representation of language to learn which phrases in a text are relevant for a given medical condition. In this work, we compare concept-extraction-based methods with CNNs and other commonly used models in NLP in ten phenotyping tasks using 1,610 discharge summaries from the MIMIC-III database. We show that CNNs outperform concept-extraction-based methods in almost all of the tasks, with an improvement in F1-score of up to 26 and up to 7 percentage points in area under the ROC curve (AUC). We additionally assess the interpretability of both approaches by presenting and evaluating methods that calculate and extract the most salient phrases for a prediction. The results indicate that CNNs are a valid alternative to existing approaches in patient phenotyping and cohort identification, and should be further investigated. Moreover, the deep learning approach presented in this paper can be used to assist clinicians during chart review or support the extraction of billing codes from text by identifying and highlighting relevant phrases for various medical conditions.

Affiliation(s)

Sebastian Gehrmann: MIT Critical Data, Laboratory for Computational Physiology, Cambridge, MA, United States of America; Harvard SEAS, Harvard University, Cambridge, MA, United States of America
Franck Dernoncourt: MIT Critical Data, Laboratory for Computational Physiology, Cambridge, MA, United States of America; Massachusetts Institute of Technology, Cambridge, MA, United States of America; Adobe Research, San Jose, CA, United States of America
Yeran Li: MIT Critical Data, Laboratory for Computational Physiology, Cambridge, MA, United States of America; Harvard T.H. Chan School of Public Health, Cambridge, MA, United States of America
Eric T. Carlson: MIT Critical Data, Laboratory for Computational Physiology, Cambridge, MA, United States of America; Philips Research North America, Cambridge, MA, United States of America
Joy T. Wu: MIT Critical Data, Laboratory for Computational Physiology, Cambridge, MA, United States of America; Harvard T.H. Chan School of Public Health, Cambridge, MA, United States of America
Jonathan Welt: MIT Critical Data, Laboratory for Computational Physiology, Cambridge, MA, United States of America; Wellman Center for Photomedicine, Massachusetts General Hospital, Boston, MA, United States of America
John Foote: MIT Critical Data, Laboratory for Computational Physiology, Cambridge, MA, United States of America; Tufts University School of Medicine, Cambridge, MA, United States of America
Edward T. Moseley: MIT Critical Data, Laboratory for Computational Physiology, Cambridge, MA, United States of America; College of Science and Mathematics, University of Massachusetts, Boston, MA, United States of America
David W. Grant: MIT Critical Data, Laboratory for Computational Physiology, Cambridge, MA, United States of America; Department of Surgery, Division of Plastic and Reconstructive Surgery, Washington University School of Medicine, St. Louis, MO, United States of America
Patrick D. Tyler: MIT Critical Data, Laboratory for Computational Physiology, Cambridge, MA, United States of America; Department of Internal Medicine, Beth Israel Deaconess Medical Center, Boston, MA, United States of America
Leo A. Celi: MIT Critical Data, Laboratory for Computational Physiology, Cambridge, MA, United States of America; Massachusetts Institute of Technology, Cambridge, MA, United States of America

Kennell TI, Willig JH, Cimino JJ. Clinical Informatics Researcher's Desiderata for the Data Content of the Next Generation Electronic Health Record. Appl Clin Inform 2017;8:1159-1172. [PMID: 29270955 DOI: 10.4338/aci-2017-06-r-0101] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/29/2023] Open

Abstract

OBJECTIVE

Clinical informatics researchers depend on the availability of high-quality data from the electronic health record (EHR) to design and implement new methods and systems for clinical practice and research. However, these data are frequently unavailable or present in a format that requires substantial revision. This article reports the results of a review of informatics literature published from 2010 to 2016 that addresses these issues by identifying categories of data content that might be included or revised in the EHR.

MATERIALS AND METHODS

We used an iterative review process on 1,215 biomedical informatics research articles. We placed them into generic categories, reviewed and refined the categories, and then assigned additional articles, for a total of three iterations.

RESULTS

Our process identified eight categories of data content issues: Adverse Events, Clinician Cognitive Processes, Data Standards Creation and Data Communication, Genomics, Medication List Data Capture, Patient Preferences, Patient-reported Data, and Phenotyping.

DISCUSSION

These categories summarize discussions in biomedical informatics literature that concern data content issues restricting clinical informatics research. These barriers to research result from data that are either absent from the EHR or are inadequate (e.g., in narrative text form) for the downstream applications of the data. In light of these categories, we discuss changes to EHR data storage that should be considered in the redesign of EHRs, to promote continued innovation in clinical informatics.

CONCLUSION

Based on published literature of clinical informaticians' reuse of EHR data, we characterize eight types of data content that, if included in the next generation of EHRs, would find immediate application in advanced informatics tools and techniques.

Greene M, Justice AC, Covinsky KE. Assessment of geriatric syndromes and physical function in people living with HIV. Virulence 2016;8:586-598. [PMID: 27715455 DOI: 10.1080/21505594.2016.1245269] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/10/2023] Open