1
Golder S, Xu D, O'Connor K, Wang Y, Batra M, Hernandez GG. Leveraging Natural Language Processing and Machine Learning Methods for Adverse Drug Event Detection in Electronic Health/Medical Records: A Scoping Review. Drug Saf 2025:10.1007/s40264-024-01505-6. [PMID: 39786481 DOI: 10.1007/s40264-024-01505-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/24/2024] [Indexed: 01/12/2025]
Abstract
BACKGROUND Natural language processing (NLP) and machine learning (ML) techniques may help harness unstructured free-text electronic health record (EHR) data to detect adverse drug events (ADEs) and thus improve pharmacovigilance. However, evidence of their real-world effectiveness remains unclear. OBJECTIVE To summarise the evidence on the effectiveness of NLP/ML in detecting ADEs from unstructured EHR data and ultimately improve pharmacovigilance in comparison to other data sources. METHODS A scoping review was conducted by searching six databases in July 2023. Studies leveraging NLP/ML to identify ADEs from EHR were included. Titles/abstracts were screened by two independent researchers as were full-text articles. Data extraction was conducted by one researcher and checked by another. A narrative synthesis summarises the research techniques, ADEs analysed, model performance and pharmacovigilance impacts. RESULTS Seven studies met the inclusion criteria covering a wide range of ADEs and medications. The utilisation of rule-based NLP, statistical models, and deep learning approaches was observed. Natural language processing/ML techniques with unstructured data improved the detection of under-reported adverse events and safety signals. However, substantial variability was noted in the techniques and evaluation methods employed across the different studies and limitations exist in integrating the findings into practice. CONCLUSIONS Natural language processing (NLP) and machine learning (ML) have promising possibilities in extracting valuable insights with regard to pharmacovigilance from unstructured EHR data. These approaches have demonstrated proficiency in identifying specific adverse events and uncovering previously unknown safety signals that would not have been apparent through structured data alone. Nevertheless, challenges such as the absence of standardised methodologies and validation criteria obstruct the widespread adoption of NLP/ML for pharmacovigilance leveraging of unstructured EHR data.
Affiliation(s)
- Su Golder
- Department of Health Sciences, University of York, York, YO10 5DD, UK.
- Dongfang Xu
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Karen O'Connor
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Yunwen Wang
- William Allen White School of Journalism and Mass Communications, The University of Kansas, Lawrence, KS, USA
- Mahak Batra
- Department of Health Sciences, University of York, York, YO10 5DD, UK
2
Gallifant J, Celi LA, Sharon E, Bitterman DS. Navigating the Complexities of Artificial Intelligence-Enabled Real-World Data Collection for Oncology Pharmacovigilance. JCO Clin Cancer Inform 2024; 8:e2400051. [PMID: 38713889 PMCID: PMC11466373 DOI: 10.1200/cci.24.00051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2024] [Accepted: 04/03/2024] [Indexed: 05/09/2024]
Abstract
This new editorial discusses the promise and challenges of successful integration of natural language processing methods into electronic health records for timely, robust, and fair oncology pharmacovigilance.
Affiliation(s)
- Jack Gallifant
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA 02139
- Department of Critical Care, Guy’s & St Thomas’ NHS Trust, London, United Kingdom, SE1 7EH
- Leo Anthony Celi
- Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA 02139
- Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA 02215
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115
- Elad Sharon
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Danielle S. Bitterman
- Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, USA
- Department of Radiation Oncology, Brigham and Women’s Hospital/Dana-Farber Cancer Institute, Boston, MA, USA
3
Shriver SP, Adams D, McKelvey BA, McCune JS, Miles D, Pratt VM, Ashcraft K, McLeod HL, Williams H, Fleury ME. Overcoming Barriers to Discovery and Implementation of Equitable Pharmacogenomic Testing in Oncology. J Clin Oncol 2024:JCO2301748. [PMID: 38386947 DOI: 10.1200/jco.23.01748] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 11/08/2023] [Accepted: 12/12/2023] [Indexed: 02/24/2024]
Abstract
Pharmacogenomics (PGx), the study of inherited genomic variation and drug response or safety, is a vital tool in precision medicine. In oncology, testing to identify PGx variants offers patients the opportunity for customized treatments that can minimize adverse effects and maximize the therapeutic benefits of drugs used for cancer treatment and supportive care. Because individuals of shared ancestry share specific genetic variants, PGx factors may contribute to outcome disparities across racial and ethnic categories when genetic ancestry is not taken into account or mischaracterized in PGx research, discovery, and application. Here, we examine how the current scientific understanding of the role of PGx in differential oncology safety and outcomes may be biased toward a greater understanding and more complete clinical implementation of PGx for individuals of European descent compared with other genetic ancestry groups. We discuss the implications of this bias for PGx discovery, access to care, drug labeling, and patient and provider understanding and use of PGx approaches. Testing for somatic genetic variants is now the standard of care in treatment of many solid tumors, but the integration of PGx into oncology care is still lacking despite demonstrated actionable findings from PGx testing, reduction in avoidable toxicity and death, and return on investment from testing. As the field of oncology is poised to expand and integrate germline genetic variant testing, it is vital that PGx discovery and application are equitable for all populations. Recommendations are introduced to address barriers to facilitate effective and equitable PGx application in cancer care.
Affiliation(s)
- Jeannine S McCune
- City of Hope/Beckman Research Institute Department of Hematologic Malignancies Translational Sciences, Duarte, CA
4
Bazoge A, Morin E, Daille B, Gourraud PA. Applying Natural Language Processing to Textual Data From Clinical Data Warehouses: Systematic Review. JMIR Med Inform 2023; 11:e42477. [PMID: 38100200 PMCID: PMC10757232 DOI: 10.2196/42477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2022] [Revised: 01/16/2023] [Accepted: 09/07/2023] [Indexed: 12/17/2023]
Abstract
BACKGROUND In recent years, health data collected during the clinical care process have been often repurposed for secondary use through clinical data warehouses (CDWs), which interconnect disparate data from different sources. A large amount of information of high clinical value is stored in unstructured text format. Natural language processing (NLP), which implements algorithms that can operate on massive unstructured textual data, has the potential to structure the data and make clinical information more accessible. OBJECTIVE The aim of this review was to provide an overview of studies applying NLP to textual data from CDWs. It focuses on identifying the (1) NLP tasks applied to data from CDWs and (2) NLP methods used to tackle these tasks. METHODS This review was performed according to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. We searched for relevant articles in 3 bibliographic databases: PubMed, Google Scholar, and ACL Anthology. We reviewed the titles and abstracts and included articles according to the following inclusion criteria: (1) focus on NLP applied to textual data from CDWs, (2) articles published between 1995 and 2021, and (3) written in English. RESULTS We identified 1353 articles, of which 194 (14.34%) met the inclusion criteria. Among all identified NLP tasks in the included papers, information extraction from clinical text (112/194, 57.7%) and the identification of patients (51/194, 26.3%) were the most frequent tasks. To address the various tasks, symbolic methods were the most common NLP methods (124/232, 53.4%), showing that some tasks can be partially achieved with classical NLP techniques, such as regular expressions or pattern matching that exploit specialized lexica, such as drug lists and terminologies. Machine learning (70/232, 30.2%) and deep learning (38/232, 16.4%) have been increasingly used in recent years, including the most recent approaches based on transformers. NLP methods were mostly applied to English language data (153/194, 78.9%). CONCLUSIONS CDWs are central to the secondary use of clinical texts for research purposes. Although the use of NLP on data from CDWs is growing, there remain challenges in this field, especially with regard to languages other than English. Clinical NLP is an effective strategy for accessing, extracting, and transforming data from CDWs. Information retrieved with NLP can assist in clinical research and have an impact on clinical practice.
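As an illustration of the symbolic approach this abstract mentions (regular expressions driven by a specialized lexicon such as a drug list), the following is a minimal Python sketch. It is not taken from the review: the three-term lexicon, the example note, and the function names `compile_lexicon` and `find_drug_mentions` are all hypothetical, and a real system would presumably load a curated terminology instead.

```python
import re

# Hypothetical mini-lexicon; stands in for a curated drug terminology.
DRUG_LEXICON = ["vancomycin", "ibuprofen", "amoxicillin"]


def compile_lexicon(terms):
    """Build one case-insensitive regex that matches any lexicon term as a whole word."""
    # Longest terms first so a longer term wins over a shorter one sharing its prefix.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = r"\b(" + "|".join(re.escape(t) for t in ordered) + r")\b"
    return re.compile(pattern, flags=re.IGNORECASE)


def find_drug_mentions(text, regex):
    """Return (normalized term, start offset, end offset) for every match in the text."""
    return [(m.group(1).lower(), m.start(), m.end()) for m in regex.finditer(text)]


# Invented example note, for illustration only.
note = "Patient started on Vancomycin; ibuprofen held due to rising creatinine."
DRUG_REGEX = compile_lexicon(DRUG_LEXICON)
mentions = find_drug_mentions(note, DRUG_REGEX)
```

Returning character offsets alongside the normalized term keeps each mention traceable to its position in the note, which is convenient if the matches are later reviewed or passed to a downstream step.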
Affiliation(s)
- Adrien Bazoge
- Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, F-44000 Nantes, France
- Nantes Université, CHU de Nantes, Pôle Hospitalo-Universitaire 11: Santé Publique, Clinique des données, INSERM, CIC 1413, F-44000 Nantes, France
- Emmanuel Morin
- Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, F-44000 Nantes, France
- Béatrice Daille
- Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, F-44000 Nantes, France
- Pierre-Antoine Gourraud
- Nantes Université, CHU de Nantes, Pôle Hospitalo-Universitaire 11: Santé Publique, Clinique des données, INSERM, CIC 1413, F-44000 Nantes, France
- Nantes Université, INSERM, CHU de Nantes, École Centrale Nantes, Centre de Recherche Translationnelle en Transplantation et Immunologie, CR2TI, F-44000 Nantes, France
5
Sim JA, Huang X, Horan MR, Stewart CM, Robison LL, Hudson MM, Baker JN, Huang IC. Natural language processing with machine learning methods to analyze unstructured patient-reported outcomes derived from electronic health records: A systematic review. Artif Intell Med 2023; 146:102701. [PMID: 38042599 PMCID: PMC10693655 DOI: 10.1016/j.artmed.2023.102701] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/06/2023] [Revised: 09/30/2023] [Accepted: 10/29/2023] [Indexed: 12/04/2023]
Abstract
OBJECTIVE Natural language processing (NLP) combined with machine learning (ML) techniques are increasingly used to process unstructured/free-text patient-reported outcome (PRO) data available in electronic health records (EHRs). This systematic review summarizes the literature reporting NLP/ML systems/toolkits for analyzing PROs in clinical narratives of EHRs and discusses the future directions for the application of this modality in clinical care. METHODS We searched PubMed, Scopus, and Web of Science for studies written in English between 1/1/2000 and 12/31/2020. Seventy-nine studies meeting the eligibility criteria were included. We abstracted and summarized information related to the study purpose, patient population, type/source/amount of unstructured PRO data, linguistic features, and NLP systems/toolkits for processing unstructured PROs in EHRs. RESULTS Most of the studies used NLP/ML techniques to extract PROs from clinical narratives (n = 74) and mapped the extracted PROs into specific PRO domains for phenotyping or clustering purposes (n = 26). Some studies used NLP/ML to process PROs for predicting disease progression or onset of adverse events (n = 22) or developing/validating NLP/ML pipelines for analyzing unstructured PROs (n = 19). Studies used different linguistic features, including lexical, syntactic, semantic, and contextual features, to process unstructured PROs. Among the 25 NLP systems/toolkits we identified, 15 used rule-based NLP, 6 used hybrid NLP, and 4 used non-neural ML algorithms embedded in NLP. CONCLUSIONS This study supports the potential utility of different NLP/ML techniques in processing unstructured PROs available in EHRs for clinical care. Though using annotation rules for NLP/ML to analyze unstructured PROs is dominant, deploying novel neural ML-based methods is warranted.
Affiliation(s)
- Jin-Ah Sim
- Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, United States; School of AI Convergence, Hallym University, Chuncheon, Republic of Korea
- Xiaolei Huang
- Department of Computer Science, University of Memphis, Memphis, TN, United States
- Madeline R Horan
- Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, United States
- Christopher M Stewart
- Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States
- Leslie L Robison
- Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, United States
- Melissa M Hudson
- Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, United States; Department of Oncology, St. Jude Children's Research Hospital, Memphis, TN, United States
- Justin N Baker
- Department of Pediatrics, Stanford University, Stanford, CA, United States
- I-Chan Huang
- Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, United States.
6
Botsis T, Kreimeyer K. Improving drug safety with adverse event detection using natural language processing. Expert Opin Drug Saf 2023; 22:659-668. [PMID: 37339273 DOI: 10.1080/14740338.2023.2228197] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Accepted: 06/19/2023] [Indexed: 06/22/2023]
Abstract
INTRODUCTION Pharmacovigilance (PV) involves monitoring and aggregating adverse event information from a variety of data sources, including health records, biomedical literature, spontaneous adverse event reports, product labels, and patient-generated content like social media posts, but the most pertinent details in these sources are typically available in narrative free-text formats. Natural language processing (NLP) techniques can be used to extract clinically relevant information from PV texts to inform decision-making. AREAS COVERED We conducted a non-systematic literature review by querying the PubMed database to examine the uses of NLP in drug safety and distilled the findings to present our expert opinion on the topic. EXPERT OPINION New NLP techniques and approaches continue to be applied for drug safety use cases; however, systems that are fully deployed and in use in a clinical environment remain vanishingly rare. To see high-performing NLP techniques implemented in the real setting will require long-term engagement with end users and other stakeholders and revised workflows in fully formulated business plans for the targeted use cases. Additionally, we found little to no evidence of extracted information placed into standardized data models, which should be a way to make implementations more portable and adaptable.
Affiliation(s)
- Taxiarchis Botsis
- Department of Oncology, the Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Kory Kreimeyer
- Department of Oncology, the Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
7
Han R, Zhang Z, Wei H, Yin D. Chinese medical event detection based on event frequency distribution ratio and document consistency. Mathematical Biosciences and Engineering: MBE 2023; 20:11063-11080. [PMID: 37322971 DOI: 10.3934/mbe.2023489] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/17/2023]
Abstract
Structured information, especially medical events extracted from electronic medical records, has great practical value and plays a basic role in various intelligent diagnosis and treatment systems. Fine-grained Chinese medical event detection is crucial in the process of structuring Chinese Electronic Medical Records (EMRs). The current methods for detecting fine-grained Chinese medical events primarily rely on statistical machine learning and deep learning. However, they have two shortcomings: (1) they neglect the distribution characteristics of these fine-grained medical events, and (2) they overlook the consistency in the distribution of medical events within each individual document. Therefore, this paper presents a fine-grained Chinese medical event detection method based on the event frequency distribution ratio and document consistency. First, a significant number of Chinese EMR texts are used to adapt the Chinese pre-trained BERT model to the domain. Second, based on the fundamental features, the Event Frequency - Event Distribution Ratio (EF-DR) is devised to select distinct event information as supplementary features, taking into account the distribution of events within the EMR. Finally, using EMR document consistency within the model improves the outcome of event detection. Our experiments demonstrate that the proposed method significantly outperforms the baseline model.
Affiliation(s)
- Ruirui Han
- College of Computer Science and Engineering, Northwest Normal University, 967 Anning East Road, Lanzhou 730070, China
- Zhichang Zhang
- College of Computer Science and Engineering, Northwest Normal University, 967 Anning East Road, Lanzhou 730070, China
- Hao Wei
- College of Computer Science and Engineering, Northwest Normal University, 967 Anning East Road, Lanzhou 730070, China
- Deyue Yin
- College of Computer Science and Engineering, Northwest Normal University, 967 Anning East Road, Lanzhou 730070, China
8
Murphy RM, Dongelmans DA, Kom IYD, Calixto I, Abu-Hanna A, Jager KJ, de Keizer NF, Klopotowska JE. Drug-related causes attributed to acute kidney injury and their documentation in intensive care patients. J Crit Care 2023; 75:154292. [PMID: 36959015 DOI: 10.1016/j.jcrc.2023.154292] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/27/2023] [Revised: 03/14/2023] [Accepted: 03/14/2023] [Indexed: 03/25/2023]
Abstract
PURPOSE To investigate drug-related causes attributed to acute kidney injury (DAKI) and their documentation in patients admitted to the Intensive Care Unit (ICU). METHODS This study was conducted in an academic hospital in the Netherlands by reusing electronic health record (EHR) data of adult ICU admissions between November 2015 and January 2020. First, ICU admissions with acute kidney injury (AKI) stage 2 or 3 were identified. Subsequently, three modes of DAKI documentation in the EHR were examined: diagnosis codes (structured data), allergy module (semi-structured data), and clinical notes (unstructured data). RESULTS In total, 8124 ICU admissions were included, with 542 (6.7%) ICU admissions experiencing AKI stage 2 or 3. The ICU physicians deemed 102 of these AKI cases (18.8%) to be drug-related. These DAKI cases were all documented in the clinical notes (100%), one in the allergy module (1%), and none via diagnosis codes. The clinical notes required the highest time investment to analyze. CONCLUSIONS Drug-related causes comprise a substantial part of AKI in ICU patients. However, current unstructured DAKI documentation practice via clinical notes hampers our ability to gain better insights about DAKI occurrence. Therefore, both automating DAKI identification from the clinical notes and increasing structured DAKI documentation should be encouraged.
Affiliation(s)
- Rachel M Murphy
- Amsterdam UMC location University of Amsterdam, Department of Medical Informatics, Meibergdreef 9, Amsterdam, the Netherlands; Amsterdam Public Health, Digital Health, Amsterdam, the Netherlands; Amsterdam Public Health, Quality of Care, Amsterdam, the Netherlands.
- Dave A Dongelmans
- Amsterdam Public Health, Quality of Care, Amsterdam, the Netherlands; Amsterdam UMC location University of Amsterdam, Department of Intensive Care Medicine, Meibergdreef 9, Amsterdam, the Netherlands
- Izak Yasrebi-de Kom
- Amsterdam UMC location University of Amsterdam, Department of Medical Informatics, Meibergdreef 9, Amsterdam, the Netherlands; Amsterdam Public Health, Methodology, Amsterdam, the Netherlands
- Iacer Calixto
- Amsterdam UMC location University of Amsterdam, Department of Medical Informatics, Meibergdreef 9, Amsterdam, the Netherlands; Amsterdam Public Health, Methodology, Amsterdam, the Netherlands; Amsterdam Public Health, Mental Health, Amsterdam, the Netherlands
- Ameen Abu-Hanna
- Amsterdam UMC location University of Amsterdam, Department of Medical Informatics, Meibergdreef 9, Amsterdam, the Netherlands; Amsterdam Public Health, Methodology, Amsterdam, the Netherlands; Amsterdam Public Health, Aging & Later Life, Amsterdam, the Netherlands
- Kitty J Jager
- Amsterdam UMC location University of Amsterdam, Department of Medical Informatics, Meibergdreef 9, Amsterdam, the Netherlands; Amsterdam Public Health, Quality of Care, Amsterdam, the Netherlands; Amsterdam Public Health, Aging & Later Life, Amsterdam, the Netherlands; Amsterdam Cardiovascular Sciences, Pulmonary hypertension & thrombosis, Amsterdam, the Netherlands
- Nicolette F de Keizer
- Amsterdam UMC location University of Amsterdam, Department of Medical Informatics, Meibergdreef 9, Amsterdam, the Netherlands; Amsterdam Public Health, Digital Health, Amsterdam, the Netherlands; Amsterdam Public Health, Quality of Care, Amsterdam, the Netherlands
- Joanna E Klopotowska
- Amsterdam UMC location University of Amsterdam, Department of Medical Informatics, Meibergdreef 9, Amsterdam, the Netherlands; Amsterdam Public Health, Digital Health, Amsterdam, the Netherlands; Amsterdam Public Health, Quality of Care, Amsterdam, the Netherlands
9
Yang S, Varghese P, Stephenson E, Tu K, Gronsbell J. Machine learning approaches for electronic health records phenotyping: a methodical review. J Am Med Inform Assoc 2023; 30:367-381. [PMID: 36413056 PMCID: PMC9846699 DOI: 10.1093/jamia/ocac216] [Citation(s) in RCA: 37] [Impact Index Per Article: 18.5] [Received: 07/28/2022] [Revised: 09/27/2022] [Accepted: 10/27/2022] [Indexed: 11/23/2022]
Abstract
OBJECTIVE Accurate and rapid phenotyping is a prerequisite to leveraging electronic health records for biomedical research. While early phenotyping relied on rule-based algorithms curated by experts, machine learning (ML) approaches have emerged as an alternative to improve scalability across phenotypes and healthcare settings. This study evaluates ML-based phenotyping with respect to (1) the data sources used, (2) the phenotypes considered, (3) the methods applied, and (4) the reporting and evaluation methods used. MATERIALS AND METHODS We searched PubMed and Web of Science for articles published between 2018 and 2022. After screening 850 articles, we recorded 37 variables on 100 studies. RESULTS Most studies utilized data from a single institution and included information in clinical notes. Although chronic conditions were most commonly considered, ML also enabled the characterization of nuanced phenotypes such as social determinants of health. Supervised deep learning was the most popular ML paradigm, while semi-supervised and weakly supervised learning were applied to expedite algorithm development and unsupervised learning to facilitate phenotype discovery. ML approaches did not uniformly outperform rule-based algorithms, but deep learning offered a marginal improvement over traditional ML for many conditions. DISCUSSION Despite the progress in ML-based phenotyping, most articles focused on binary phenotypes and few articles evaluated external validity or used multi-institution data. Study settings were infrequently reported and analytic code was rarely released. CONCLUSION Continued research in ML-based phenotyping is warranted, with emphasis on characterizing nuanced phenotypes, establishing reporting and evaluation standards, and developing methods to accommodate misclassified phenotypes due to algorithm errors in downstream applications.
Affiliation(s)
- Siyue Yang
- Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
- Ellen Stephenson
- Department of Family & Community Medicine, University of Toronto, Toronto, Ontario, Canada
- Karen Tu
- Department of Family & Community Medicine, University of Toronto, Toronto, Ontario, Canada
- Jessica Gronsbell
- Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
- Department of Family & Community Medicine, University of Toronto, Toronto, Ontario, Canada
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
10
Murphy RM, Klopotowska JE, de Keizer NF, Jager KJ, Leopold JH, Dongelmans DA, Abu-Hanna A, Schut MC. Adverse drug event detection using natural language processing: A scoping review of supervised learning methods. PLoS One 2023; 18:e0279842. [PMID: 36595517 PMCID: PMC9810201 DOI: 10.1371/journal.pone.0279842] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 03/22/2022] [Accepted: 12/15/2022] [Indexed: 01/04/2023]
Abstract
To reduce adverse drug events (ADEs), hospitals need a system to support them in monitoring ADE occurrence routinely, rapidly, and at scale. Natural language processing (NLP), a computerized approach to analyze text data, has shown promising results for the purpose of ADE detection in the context of pharmacovigilance. However, a detailed qualitative assessment and critical appraisal of NLP methods for ADE detection in the context of ADE monitoring in hospitals is lacking. Therefore, we have conducted a scoping review to close this knowledge gap, and to provide directions for future research and practice. We included articles where NLP was applied to detect ADEs in clinical narratives within electronic health records of inpatients. Quantitative and qualitative data items relating to NLP methods were extracted and critically appraised. Out of 1,065 articles screened for eligibility, 29 articles met the inclusion criteria. Most frequent tasks included named entity recognition (n = 17; 58.6%) and relation extraction/classification (n = 15; 51.7%). Clinical involvement was reported in nine studies (31%). Multiple NLP modelling approaches seem suitable, with Long Short Term Memory and Conditional Random Field methods most commonly used. Although reported overall performance of the systems was high, it provides an inflated impression given a steep drop in performance when predicting the ADE entity or ADE relation class. When annotating corpora, treating an ADE as a relation between a drug and non-drug entity seems the best practice. Future research should focus on semi-automated methods to reduce the manual annotation effort, and examine implementation of the NLP methods in practice.
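The annotation practice noted in this abstract, treating an ADE as a relation between a drug and a non-drug entity, can be sketched as a small data structure. The Python below is an illustrative sketch only: the class names, the label strings (`"Drug"`, `"Problem"`, `"ADE"`, `"No_Relation"`), the helper functions, and the example sentence are assumptions introduced here, not the schema of any corpus covered by the review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    text: str
    label: str   # hypothetical labels, e.g. "Drug" or "Problem"
    start: int   # character offset where the span begins
    end: int     # character offset just past the span


@dataclass(frozen=True)
class Relation:
    head: Entity  # the drug entity
    tail: Entity  # the non-drug entity (sign, symptom, diagnosis)
    label: str    # "ADE" when the drug is annotated as causing the event


def make_entity(sentence, phrase, label):
    """Locate the first occurrence of a phrase and wrap it as an Entity with offsets."""
    start = sentence.index(phrase)
    return Entity(phrase, label, start, start + len(phrase))


def ade_relations(relations):
    """Keep only relations labeled ADE that link a drug to a non-drug entity."""
    return [
        r for r in relations
        if r.label == "ADE" and r.head.label == "Drug" and r.tail.label != "Drug"
    ]


# Invented example sentence, for illustration only.
sentence = "Developed acute kidney injury after vancomycin; metoprolol continued."
drug = make_entity(sentence, "vancomycin", "Drug")
other_drug = make_entity(sentence, "metoprolol", "Drug")
event = make_entity(sentence, "acute kidney injury", "Problem")
relations = [Relation(drug, event, "ADE"), Relation(other_drug, event, "No_Relation")]
ades = ade_relations(relations)
```

Under this representation the same clinical finding can be linked to one drug as an ADE and to another as unrelated, which a single "ADE" entity label on the finding could not express.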
Affiliation(s)
- Rachel M. Murphy
- Department of Medical Informatics, Amsterdam UMC (location AMC), Amsterdam, The Netherlands
- Amsterdam Public Health Research Institute, Amsterdam, The Netherlands
- Joanna E. Klopotowska
- Department of Medical Informatics, Amsterdam UMC (location AMC), Amsterdam, The Netherlands
- Amsterdam Public Health Research Institute, Amsterdam, The Netherlands
- Nicolette F. de Keizer
- Department of Medical Informatics, Amsterdam UMC (location AMC), Amsterdam, The Netherlands
- Amsterdam Public Health Research Institute, Amsterdam, The Netherlands
- Kitty J. Jager
- Department of Medical Informatics, Amsterdam UMC (location AMC), Amsterdam, The Netherlands
- Amsterdam Public Health Research Institute, Amsterdam, The Netherlands
- Jan Hendrik Leopold
- Department of Medical Informatics, Amsterdam UMC (location AMC), Amsterdam, The Netherlands
- Amsterdam Public Health Research Institute, Amsterdam, The Netherlands
- Dave A. Dongelmans
- Amsterdam Public Health Research Institute, Amsterdam, The Netherlands
- Department of Intensive Care Medicine, Amsterdam UMC (location AMC), Amsterdam, The Netherlands
- Ameen Abu-Hanna
- Department of Medical Informatics, Amsterdam UMC (location AMC), Amsterdam, The Netherlands
- Amsterdam Public Health Research Institute, Amsterdam, The Netherlands
- Martijn C. Schut
- Department of Medical Informatics, Amsterdam UMC (location AMC), Amsterdam, The Netherlands
- Amsterdam Public Health Research Institute, Amsterdam, The Netherlands
11
Aronson JK. Artificial Intelligence in Pharmacovigilance: An Introduction to Terms, Concepts, Applications, and Limitations. Drug Saf 2022; 45:407-418. [PMID: 35579806 DOI: 10.1007/s40264-022-01156-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Accepted: 02/10/2022] [Indexed: 01/29/2023]
Abstract
The tools of artificial intelligence (AI) have enormous potential to enhance activities in pharmacovigilance. Pharmacovigilance experts need not be AI experts, but they should know enough about AI to explore the possibilities of collaboration with those who are. Modern concepts of AI date from Alan Turing's work, especially his paper on "the imitation game", in the late 1940s and early 1950s. Its scope today includes computational skills, including the formulation of mathematical proofs; visual perception, including facial recognition and virtual reality; decision making by expert systems; aspects of language, such as language processing, speech recognition, creative composition, and translation; and combinations of these, e.g. in self-driving vehicles. Machines can be programmed with the ability to learn, using neural networks that mimic cognitive actions of the human brain, leading to deep structural learning. Limitations of AI include difficulties with language, arising from the need to understand context and interpret ambiguities, which particularly affect translation, and inadequacies of databases, requiring careful preparation and curation. New techniques may cause unforeseen difficulties via unexpected malfunctioning. Relevant terms and concepts include different types of machine learning, neural networks, natural language programming, ontologies, and expert systems. Adoption of the tools of AI in pharmacovigilance has been slow. Machine learning, in conjunction with natural language processing and data mining, to study adverse drug reactions in databases such as those found in electronic health records, claims databases, and social media, has the potential to enhance the characterization of known adverse effects and reactions and detect new signals.
Affiliation(s)
- Jeffrey K Aronson
- Centre for Evidence-Based Medicine, Nuffield Department of Primary Care Health Sciences, Oxford, UK.
12
Seetharam K, Shrestha S, Sengupta PP. Cardiovascular Imaging and Intervention Through the Lens of Artificial Intelligence. Interv Cardiol 2021; 16:e31. [PMID: 34754333 PMCID: PMC8559149 DOI: 10.15420/icr.2020.04] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 02/07/2020] [Accepted: 06/18/2021] [Indexed: 12/13/2022]
Abstract
Artificial Intelligence (AI) is the simulation of human intelligence in machines so they can perform various actions and execute decision-making. Machine learning (ML), a branch of AI, can analyse information from data and discover novel patterns. AI and ML are rapidly gaining prominence in healthcare as data become increasingly complex. These algorithms can enhance the role of cardiovascular imaging by automating many tasks or calculations, find new patterns or phenotypes in data and provide alternative diagnoses. In interventional cardiology, AI can assist in intraprocedural guidance, intravascular imaging and provide additional information to the operator. AI is slowly expanding its boundaries into interventional cardiology and can fundamentally alter the field. In this review, the authors discuss how AI can enhance the role of cardiovascular imaging and imaging in interventional cardiology.
Affiliation(s)
- Karthik Seetharam
- West Virginia University Medicine Heart and Vascular Institute, Morgantown, WV, US
- Sirish Shrestha
- West Virginia University Medicine Heart and Vascular Institute, Morgantown, WV, US
- Partho P Sengupta
- West Virginia University Medicine Heart and Vascular Institute, Morgantown, WV, US
13
Geva A, Stedman JP, Manzi SF, Lin C, Savova GK, Avillach P, Mandl KD. Adverse drug event presentation and tracking (ADEPT): semiautomated, high throughput pharmacovigilance using real-world data. JAMIA Open 2020; 3:413-421. [PMID: 33215076 PMCID: PMC7660953 DOI: 10.1093/jamiaopen/ooaa031] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 03/17/2020] [Revised: 06/23/2020] [Accepted: 06/27/2020] [Indexed: 11/24/2022]
Abstract
Objective: To advance use of real-world data (RWD) for pharmacovigilance, we sought to integrate a high-sensitivity natural language processing (NLP) pipeline for detecting potential adverse drug events (ADEs) with easily interpretable output for high-efficiency human review and adjudication of true ADEs. Materials and methods: The adverse drug event presentation and tracking (ADEPT) system employs an open source NLP pipeline to identify in clinical notes mentions of medications and signs and symptoms potentially indicative of ADEs. ADEPT presents the output to human reviewers by highlighting these drug-event pairs within the context of the clinical note. To measure incidence of seizures associated with sildenafil, we applied ADEPT to 149 029 notes for 982 patients with pediatric pulmonary hypertension. Results: Of 416 patients identified as taking sildenafil, NLP found 72 [17%, 95% confidence interval (CI) 14–21] with seizures as a potential ADE. Upon human review and adjudication, only 4 (0.96%, 95% CI 0.37–2.4) patients with seizures were determined to have true ADEs. Reviewers using ADEPT required a median of 89 s (interquartile range 57–142 s) per patient to review potential ADEs. Discussion: ADEPT combines high throughput NLP to increase sensitivity of ADE detection and human review, to increase specificity by differentiating true ADEs from signs and symptoms related to comorbidities, effects of other medications, or other confounders. Conclusion: ADEPT is a promising tool for creating gold standard, patient-level labels for advancing NLP-based pharmacovigilance. ADEPT is a potentially time savings platform for computer-assisted pharmacovigilance based on RWD.
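The proportions and 95% confidence intervals reported in this abstract (72 of 416 and 4 of 416 patients) can be checked with a short calculation. The sketch below is illustrative only: the abstract does not say which interval method the authors used, so the Wilson score interval and the function name `wilson_interval` are assumptions made here, chosen because this method is a common choice for small proportions.

```python
import math


def wilson_interval(events: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion events / n.

    Assumption: the abstract does not name its CI method; Wilson is used
    here purely as an illustration. z = 1.96 corresponds to a 95% interval.
    """
    if n <= 0 or not 0 <= events <= n:
        raise ValueError("need n > 0 and 0 <= events <= n")
    p = events / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    total = 416  # patients identified as taking sildenafil (from the abstract)
    for label, events in [("NLP-flagged seizures", 72), ("adjudicated true ADEs", 4)]:
        low, high = wilson_interval(events, total)
        print(f"{label}: {events / total:.2%} (95% CI {low:.2%} to {high:.2%})")
```

Under this assumption the computed bounds round to the values given in the abstract (about 14 to 21% and about 0.37 to 2.4%), which suggests, but does not confirm, that a Wilson-type interval was used.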
Affiliation(s)
- Alon Geva
- Computational Health Informatics Program, Boston Children's Hospital, Boston, Massachusetts, USA; Division of Critical Care Medicine, Department of Anesthesiology, Critical Care, and Pain Medicine, Boston Children's Hospital, Boston, Massachusetts, USA; Department of Anaesthesia, Harvard Medical School, Boston, Massachusetts, USA
- Jason P Stedman
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
- Shannon F Manzi
- Computational Health Informatics Program, Boston Children's Hospital, Boston, Massachusetts, USA; Clinical Pharmacogenomics Service, Division of Genetics & Genomics and Department of Pharmacy, Boston Children's Hospital, Boston, Massachusetts, USA; Department of Pediatrics, Harvard Medical School, Boston, Massachusetts, USA
- Chen Lin
- Computational Health Informatics Program, Boston Children's Hospital, Boston, Massachusetts, USA
- Guergana K Savova
- Computational Health Informatics Program, Boston Children's Hospital, Boston, Massachusetts, USA; Department of Pediatrics, Harvard Medical School, Boston, Massachusetts, USA
- Paul Avillach
- Computational Health Informatics Program, Boston Children's Hospital, Boston, Massachusetts, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
- Kenneth D Mandl
- Computational Health Informatics Program, Boston Children's Hospital, Boston, Massachusetts, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA; Department of Pediatrics, Harvard Medical School, Boston, Massachusetts, USA