Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Xiong Y, Shi X, Chen S, Jiang D, Tang B, Wang X, Chen Q, Yan J. Cohort selection for clinical trials using hierarchical neural network. J Am Med Inform Assoc 2021;26:1203-1208. [PMID: 31305921 DOI: 10.1093/jamia/ocz099] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 01/17/2019] [Revised: 04/28/2019] [Accepted: 06/13/2019] [Indexed: 12/22/2022] Open

Cited by Other Article(s)

Chowdhury S, Rajaganapathy S, Yu Y, Tao C, Vassilaki M, Zong N. Matching Patients to Clinical Trials using LLaMA 2 Embeddings and Siamese Neural Network. medRxiv 2024:2024.06.28.24309677. [PMID: 38978646 PMCID: PMC11230334 DOI: 10.1101/2024.06.28.24309677] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]

Seng EC, Mehdipour S, Simpson S, Gabriel RA. Tracking persistent postoperative opioid use: a proof-of-concept study demonstrating a use case for natural language processing. Reg Anesth Pain Med 2024;49:241-247. [PMID: 37419509 DOI: 10.1136/rapm-2023-104629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Accepted: 06/24/2023] [Indexed: 07/09/2023]

Abstract

BACKGROUND

Large language models have been gaining tremendous popularity since the introduction of ChatGPT in late 2022. Perioperative pain providers should leverage natural language processing (NLP) technology and explore pertinent use cases to improve patient care. One example is tracking persistent postoperative opioid use after surgery. Since much of the relevant data may be 'hidden' within unstructured clinical text, NLP models may prove to be advantageous. The primary objective of this proof-of-concept study was to demonstrate the ability of an NLP engine to review clinical notes and accurately identify patients who had persistent postoperative opioid use after major spine surgery.

METHODS

Clinical documents from all patients who underwent major spine surgery between July 2015 and August 2021 were extracted from the electronic health record. The primary outcome was persistent postoperative opioid use, defined as continued use of opioids 3 months or longer after surgery. This outcome was ascertained via manual clinician review of outpatient spine surgery follow-up notes. An NLP engine was applied to these notes to ascertain the presence of persistent opioid use; this was then compared with the results of the manual clinician review.

RESULTS

The final study sample consisted of 965 patients, of whom 705 (73.1%) were determined to have persistent opioid use following surgery. The NLP engine correctly determined the patients' opioid use status in 92.9% of cases: it correctly identified persistent opioid use in 95.6% of cases and no persistent opioid use in 86.1% of cases.

DISCUSSION

Access to unstructured data within the perioperative history can contextualize patients' opioid use and provide further insight into the opioid crisis, while at the same time improving care directly at the patient level. While these goals are within reach, future work is needed to evaluate how best to implement NLP within different healthcare systems for use in clinical decision support.

Dobbins NJ, Han B, Zhou W, Lan KF, Kim HN, Harrington R, Uzuner Ö, Yetisgen M. LeafAI: query generator for clinical cohort discovery rivaling a human programmer. J Am Med Inform Assoc 2023;30:1954-1964. [PMID: 37550244 PMCID: PMC10654856 DOI: 10.1093/jamia/ocad149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Revised: 07/14/2023] [Accepted: 07/19/2023] [Indexed: 08/09/2023] Open

Yang S, Varghese P, Stephenson E, Tu K, Gronsbell J. Machine learning approaches for electronic health records phenotyping: a methodical review. J Am Med Inform Assoc 2023;30:367-381. [PMID: 36413056 PMCID: PMC9846699 DOI: 10.1093/jamia/ocac216] [Citation(s) in RCA: 23] [Impact Index Per Article: 23.0] [Received: 07/28/2022] [Revised: 09/27/2022] [Accepted: 10/27/2022] [Indexed: 11/23/2022] Open

Abstract

OBJECTIVE

Accurate and rapid phenotyping is a prerequisite to leveraging electronic health records for biomedical research. While early phenotyping relied on rule-based algorithms curated by experts, machine learning (ML) approaches have emerged as an alternative to improve scalability across phenotypes and healthcare settings. This study evaluates ML-based phenotyping with respect to (1) the data sources used, (2) the phenotypes considered, (3) the methods applied, and (4) the reporting and evaluation methods used.

MATERIALS AND METHODS

We searched PubMed and Web of Science for articles published between 2018 and 2022. After screening 850 articles, we recorded 37 variables on 100 studies.

RESULTS

Most studies utilized data from a single institution and included information in clinical notes. Although chronic conditions were most commonly considered, ML also enabled the characterization of nuanced phenotypes such as social determinants of health. Supervised deep learning was the most popular ML paradigm, while semi-supervised and weakly supervised learning were applied to expedite algorithm development and unsupervised learning to facilitate phenotype discovery. ML approaches did not uniformly outperform rule-based algorithms, but deep learning offered a marginal improvement over traditional ML for many conditions.

DISCUSSION

Despite the progress in ML-based phenotyping, most articles focused on binary phenotypes and few articles evaluated external validity or used multi-institution data. Study settings were infrequently reported and analytic code was rarely released.

CONCLUSION

Continued research in ML-based phenotyping is warranted, with emphasis on characterizing nuanced phenotypes, establishing reporting and evaluation standards, and developing methods to accommodate misclassified phenotypes due to algorithm errors in downstream applications.

Meirelles AL, Kurc T, Saltz J, Teodoro G. Effective active learning in digital pathology: A case study in tumor infiltrating lymphocytes. Computer Methods and Programs in Biomedicine 2022;220:106828. [PMID: 35500506 DOI: 10.1016/j.cmpb.2022.106828] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/05/2021] [Revised: 04/09/2022] [Accepted: 04/19/2022] [Indexed: 06/14/2023]

Idnay B, Dreisbach C, Weng C, Schnall R. A systematic review on natural language processing systems for eligibility prescreening in clinical research. J Am Med Inform Assoc 2021;29:197-206. [PMID: 34725689 PMCID: PMC8714283 DOI: 10.1093/jamia/ocab228] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 06/16/2021] [Revised: 08/30/2021] [Accepted: 10/04/2021] [Indexed: 11/14/2022] Open

Li M, Cai H, Nan S, Li J, Lu X, Duan H. A Patient-Screening Tool for Clinical Research Based on Electronic Health Records Using OpenEHR: Development Study. JMIR Med Inform 2021;9:e33192. [PMID: 34673526 PMCID: PMC8569542 DOI: 10.2196/33192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2021] [Revised: 09/27/2021] [Accepted: 09/27/2021] [Indexed: 11/28/2022] Open

Abstract

Background

The widespread adoption of electronic health records (EHRs) has facilitated the secondary use of EHR data for clinical research. However, screening eligible patients from EHRs is a challenging task. The concepts in eligibility criteria, especially derived concepts, do not map completely onto the data in EHRs. Structured Query Language (SQL) lacks high-level expressions, which makes such concepts difficult and time consuming to express. The openEHR Expression Language (EL), a domain-specific language based on clinical information models, shows promise for representing complex eligibility criteria.

Objective

The study aims to develop a patient-screening tool based on EHRs for clinical research using openEHR to solve concept mismatch and improve query performance.

Methods

A patient-screening tool based on EHRs using openEHR was proposed. It uses the advantages of information models and EL in openEHR to provide high-level expressions and improve query performance. First, openEHR archetypes and templates were chosen to define simple concepts, that is, concepts taken directly from EHRs. Second, openEHR EL was used to generate derived concepts by combining simple concepts and constraints. Third, a hierarchical index corresponding to archetypes in Elasticsearch (ES) was generated to improve query performance for subqueries and join queries related to the derived concepts. Finally, we implemented a patient-screening tool for clinical research.
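The second step, building a derived concept from simple concepts plus constraints, can be sketched in a few lines. This is an illustration in plain Python, not openEHR EL syntax and not the authors' tool: the `DerivedConcept` class, the `screen` helper, the concept names, and the blood-pressure thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# One patient's simple concepts: name -> value read directly from the EHR.
Record = Dict[str, float]


@dataclass(frozen=True)
class DerivedConcept:
    """A named rule over simple concepts (hypothetical structure)."""
    name: str
    inputs: Tuple[str, ...]
    rule: Callable[..., bool]

    def holds(self, record: Record) -> bool:
        # A record missing any required simple concept does not satisfy the rule.
        if any(key not in record for key in self.inputs):
            return False
        return self.rule(*(record[key] for key in self.inputs))


# Illustrative derived concept; the thresholds are assumptions, not from the source.
HYPERTENSION = DerivedConcept(
    name="hypertension",
    inputs=("systolic_bp", "diastolic_bp"),
    rule=lambda sbp, dbp: sbp >= 140 or dbp >= 90,
)


def screen(patients: Dict[str, Record], criteria: List[DerivedConcept]) -> List[str]:
    """Return the IDs of patients whose records satisfy every criterion."""
    return [pid for pid, rec in patients.items() if all(c.holds(rec) for c in criteria)]
```

Treating a missing simple concept as "not met" is one possible design choice; a real screening tool might instead flag such patients for manual review.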

Results

In total, 500 sentences randomly selected from 4691 eligibility criteria in 389 clinical trials on stroke from the Chinese Clinical Trial Registry (ChiCTR) were evaluated. An openEHR-based clinical data repository (CDR) in a grade A tertiary hospital in China served as the experimental environment. In the 500 sentences, 589 medical concepts were found. Of them, 513 (87.1%) concepts could be represented, while the others could not be, because of a lack of information models and coarse-grained requirements. In addition, our case study on 6 queries demonstrated that our tool showed better query performance in 4 of the cases (66.67%).

Conclusions

We developed a patient-screening tool using openEHR. It not only helps solve concept mismatch but also improves query performance to reduce the burden on researchers. In addition, we demonstrated a promising solution for secondary use of EHR data using openEHR, which can be referenced by other researchers.

Xiong Y, Peng W, Chen Q, Huang Z, Tang B. A Unified Machine Reading Comprehension Framework for Cohort Selection. IEEE J Biomed Health Inform 2021;26:379-387. [PMID: 34236972 DOI: 10.1109/jbhi.2021.3095478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]

Stubbs A, Filannino M, Soysal E, Henry S, Uzuner Ö. Cohort selection for clinical trials: n2c2 2018 shared task track 1. J Am Med Inform Assoc 2021;26:1163-1171. [PMID: 31562516 DOI: 10.1093/jamia/ocz163] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 07/18/2019] [Revised: 08/07/2019] [Accepted: 09/18/2019] [Indexed: 01/02/2023] Open

Using supervised machine learning classifiers to estimate likelihood of participating in clinical trials of a de-identified version of ResearchMatch. J Clin Transl Sci 2020;5:e42. [PMID: 33948264 PMCID: PMC8057403 DOI: 10.1017/cts.2020.535] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/21/2022] Open

Dai HJ, Wang FD, Chen CW, Su CH, Wu CS, Jonnagaddala J. Cohort selection for clinical trials using multiple instance learning. J Biomed Inform 2020;107:103438. [PMID: 32360937 DOI: 10.1016/j.jbi.2020.103438] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/23/2019] [Revised: 02/29/2020] [Accepted: 04/27/2020] [Indexed: 10/24/2022]

Hassanzadeh H, Karimi S, Nguyen A. Matching patients to clinical trials using semantically enriched document representation. J Biomed Inform 2020;105:103406. [DOI: 10.1016/j.jbi.2020.103406] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 06/28/2019] [Revised: 01/28/2020] [Accepted: 03/02/2020] [Indexed: 12/16/2022]

Trends and Features of the Applications of Natural Language Processing Techniques for Clinical Trials Text Analysis. Applied Sciences-Basel 2020. [DOI: 10.3390/app10062157] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 12/12/2022]

Juhn Y, Liu H. Artificial intelligence approaches using natural language processing to advance EHR-based clinical research. J Allergy Clin Immunol 2020;145:463-469. [PMID: 31883846 PMCID: PMC7771189 DOI: 10.1016/j.jaci.2019.12.897] [Citation(s) in RCA: 85] [Impact Index Per Article: 21.3] [Received: 10/31/2019] [Revised: 12/18/2019] [Accepted: 12/19/2019] [Indexed: 01/17/2023]

Primary care perspectives on implementation of clinical trial recruitment. J Clin Transl Sci 2019;4:61-68. [PMID: 32257412 PMCID: PMC7103461 DOI: 10.1017/cts.2019.435] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 08/07/2019] [Revised: 10/23/2019] [Accepted: 10/23/2019] [Indexed: 12/12/2022] Open

Spasic I, Krzeminski D, Corcoran P, Balinsky A. Cohort Selection for Clinical Trials From Longitudinal Patient Records: Text Mining Approach. JMIR Med Inform 2019;7:e15980. [PMID: 31674914 PMCID: PMC6913747 DOI: 10.2196/15980] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 08/28/2019] [Revised: 09/29/2019] [Accepted: 10/02/2019] [Indexed: 12/17/2022] Open

Abstract

Background

Clinical trials are an important step in introducing new interventions into clinical practice by generating data on their safety and efficacy. Clinical trials need to ensure that participants are similar so that the findings can be attributed to the interventions studied and not to some other factors. Therefore, each clinical trial defines eligibility criteria, which describe characteristics that must be shared by the participants. Unfortunately, the complexities of eligibility criteria may not allow them to be translated directly into readily executable database queries. Instead, they may require careful analysis of the narrative sections of medical records. Manual screening of medical records is time consuming, thus negatively affecting the timeliness of the recruitment process.

Objective

Track 1 of the 2018 National Natural Language Processing Clinical Challenge focused on the task of cohort selection for clinical trials, aiming to answer the following question: Can natural language processing be applied to narrative medical records to identify patients who meet eligibility criteria for clinical trials? The task required the participating systems to analyze longitudinal patient records to determine if the corresponding patients met the given eligibility criteria. We aimed to describe a system developed to address this task.

Methods

Our system consisted of 13 classifiers, one for each eligibility criterion. All classifiers used a bag-of-words document representation model. To prevent the loss of relevant contextual information associated with such representation, a pattern-matching approach was used to extract context-sensitive features. They were embedded back into the text as lexically distinguishable tokens, which were consequently featured in the bag-of-words representation. Supervised machine learning was chosen wherever a sufficient number of both positive and negative instances was available to learn from. A rule-based approach focusing on a small set of relevant features was chosen for the remaining criteria.
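The step of embedding pattern-matched context features back into the text as distinguishable tokens, so that they survive in a bag-of-words representation, can be sketched as follows. This is a minimal illustration in plain Python and not the authors' system: the regular expressions, the token names (`FEAT_HBA1C_HIGH`, `FEAT_HBA1C_NORMAL`, `FEAT_NEG_MI`), the 6.5 threshold, and both helper functions are hypothetical.

```python
import re
from collections import Counter

# Hypothetical pattern -> token rules; the abstract does not list the real patterns.
RULES = [
    # Collapse an HbA1c measurement into a token that keeps the comparison outcome.
    (
        re.compile(r"hba1c\s*(?:of|:)?\s*(\d+(?:\.\d+)?)\s*%?", re.I),
        lambda m: " FEAT_HBA1C_HIGH " if float(m.group(1)) >= 6.5 else " FEAT_HBA1C_NORMAL ",
    ),
    # Collapse a negated mention so the bag of words does not count it as a positive one.
    (
        re.compile(r"\b(?:no|denies)\s+(?:history of\s+)?myocardial infarction", re.I),
        lambda m: " FEAT_NEG_MI ",
    ),
]


def embed_features(text):
    """Replace each matched span with its feature token."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


def bag_of_words(text):
    """Count tokens after feature embedding; ordinary words are lowercased."""
    tokens = re.findall(r"[A-Za-z0-9_]+", embed_features(text))
    return Counter(t if t.startswith("FEAT_") else t.lower() for t in tokens)
```

For the note "Patient denies history of myocardial infarction. HbA1c of 7.2% noted.", the resulting bag contains `FEAT_NEG_MI` and `FEAT_HBA1C_HIGH` but no bare "myocardial" token, so a downstream classifier does not mistake the negated mention for a positive finding.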

Results

The system was evaluated using microaveraged F measure. Overall, 4 machine learning algorithms, including support vector machine, logistic regression, naïve Bayesian classifier, and gradient tree boosting (GTB), were evaluated on the training data using 10-fold cross-validation. Overall, GTB demonstrated the most consistent performance. Its performance peaked when oversampling was used to balance the training data. The final evaluation was performed on previously unseen test data. On average, the F measure of 89.04% was comparable to 3 of the top ranked performances in the shared task (91.11%, 90.28%, and 90.21%). With an F measure of 88.14%, we significantly outperformed these systems (81.03%, 78.50%, and 70.81%) in identifying patients with advanced coronary artery disease.
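For readers unfamiliar with the metric, a microaveraged F measure pools true positives, false positives, and false negatives across all criteria before computing precision and recall, so criteria with more instances weigh more. The sketch below shows the standard calculation in plain Python; the function name and the counts in the usage note are illustrative and are not taken from the study or from the shared task's evaluation script.

```python
def micro_f1(per_criterion_counts):
    """Microaveraged F1 from (tp, fp, fn) tuples, one tuple per criterion."""
    counts = list(per_criterion_counts)
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    # Guard against empty denominators when nothing was predicted or nothing was positive.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

With made-up counts `[(8, 2, 1), (1, 0, 3)]`, the pooled totals are 9 true positives, 2 false positives, and 4 false negatives, giving precision 9/11, recall 9/13, and a microaveraged F1 of 0.75.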

Conclusions

The holdout evaluation provides evidence that our system was able to identify eligible patients for the given clinical trial with high accuracy. Our approach demonstrates how rule-based knowledge infusion can improve the performance of machine learning algorithms even when trained on a relatively small dataset.
