Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Zhang K, Demner-Fushman D. Automated classification of eligibility criteria in clinical trials to facilitate patient-trial matching for specific patient populations. J Am Med Inform Assoc 2018;24:781-787. [PMID: 28339690 DOI: 10.1093/jamia/ocw176] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 08/01/2016] [Accepted: 12/01/2016] [Indexed: 11/14/2022] Open

Cited by Other Article(s)

Trinkley KE, An R, Maw AM, Glasgow RE, Brownson RC. Leveraging artificial intelligence to advance implementation science: potential opportunities and cautions. Implement Sci 2024;19:17. [PMID: 38383393 PMCID: PMC10880216 DOI: 10.1186/s13012-024-01346-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Accepted: 01/25/2024] [Indexed: 02/23/2024] Open

Abstract

BACKGROUND

The field of implementation science was developed to address the significant time delay between establishing an evidence-based practice and its widespread use. Although implementation science has contributed much toward bridging this gap, the evidence-to-practice chasm remains a challenge. There are some key aspects of implementation science in which advances are needed, including speed and assessing causality and mechanisms. The increasing availability of artificial intelligence applications offers opportunities to help address specific issues faced by the field of implementation science and expand its methods.

MAIN TEXT

This paper discusses the many ways artificial intelligence can address key challenges in applying implementation science methods while also considering potential pitfalls to the use of artificial intelligence. We answer the questions of "why" the field of implementation science should consider artificial intelligence, for "what" (the purpose and methods), and the "what" (consequences and challenges). We describe specific ways artificial intelligence can address implementation science challenges related to (1) speed, (2) sustainability, (3) equity, (4) generalizability, (5) assessing context and context-outcome relationships, and (6) assessing causality and mechanisms. Examples are provided from global health systems, public health, and precision health that illustrate both potential advantages and hazards of integrating artificial intelligence applications into implementation science methods. We conclude by providing recommendations and resources for implementation researchers and practitioners to leverage artificial intelligence in their work responsibly.

CONCLUSIONS

Artificial intelligence holds promise to advance implementation science methods ("why") and accelerate its goals of closing the evidence-to-practice gap ("purpose"). However, evaluation of artificial intelligence's potential unintended consequences must be considered and proactively monitored. Given the technical nature of artificial intelligence applications as well as their potential impact on the field, transdisciplinary collaboration is needed and may suggest the need for a subset of implementation scientists cross-trained in both fields to ensure artificial intelligence is used optimally and ethically.

Yang Y, Jayaraj S, Ludmir E, Roberts K. Text Classification of Cancer Clinical Trial Eligibility Criteria. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2024;2023:1304-1313. [PMID: 38222417 PMCID: PMC10785908] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/16/2024]

Han S, Sohn TJ, Ng BP, Park C. Predicting unplanned readmission due to cardiovascular disease in hospitalized patients with cancer: a machine learning approach. Sci Rep 2023;13:13491. [PMID: 37596346 PMCID: PMC10439193 DOI: 10.1038/s41598-023-40552-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/25/2023] [Accepted: 08/12/2023] [Indexed: 08/20/2023] Open

Doi K, Takegawa H, Yui M, Anetai Y, Koike Y, Nakamura S, Tanigawa N, Koziumi M, Nishio T. Deep learning-based detection of patients with bone metastasis from Japanese radiology reports. Jpn J Radiol 2023;41:900-908. [PMID: 36988827 DOI: 10.1007/s11604-023-01413-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2022] [Accepted: 03/07/2023] [Indexed: 03/30/2023]

Affiliation(s)

Kentaro Doi: Department of Medical Physics and Engineering, Osaka University Graduate School of Medicine, 1-7 Yamadaoka, Suita-Shi, Osaka, Japan; Department of Radiology, Kansai Medical University Graduate School of Medicine, 2-5-1 Shinmachi, Hirakata-Shi, Osaka, Japan
Hideki Takegawa: Department of Medical Physics and Engineering, Osaka University Graduate School of Medicine, 1-7 Yamadaoka, Suita-Shi, Osaka, Japan; Department of Radiology, Kansai Medical University Graduate School of Medicine, 2-5-1 Shinmachi, Hirakata-Shi, Osaka, Japan; Department of Radiation Oncology, Kansai Medical University Hospital, 2-5-1 Shinmachi, Hirakata-Shi, Osaka, Japan
Midori Yui: Department of Radiology, Kansai Medical University Graduate School of Medicine, 2-5-1 Shinmachi, Hirakata-Shi, Osaka, Japan; Department of Radiation Oncology, Kansai Medical University Hospital, 2-5-1 Shinmachi, Hirakata-Shi, Osaka, Japan
Yusuke Anetai: Department of Radiology, Kansai Medical University Graduate School of Medicine, 2-5-1 Shinmachi, Hirakata-Shi, Osaka, Japan; Department of Radiation Oncology, Kansai Medical University Hospital, 2-5-1 Shinmachi, Hirakata-Shi, Osaka, Japan
Yuhei Koike: Department of Radiology, Kansai Medical University Graduate School of Medicine, 2-5-1 Shinmachi, Hirakata-Shi, Osaka, Japan; Department of Radiation Oncology, Kansai Medical University Hospital, 2-5-1 Shinmachi, Hirakata-Shi, Osaka, Japan
Satoaki Nakamura: Department of Radiology, Kansai Medical University Graduate School of Medicine, 2-5-1 Shinmachi, Hirakata-Shi, Osaka, Japan; Department of Radiation Oncology, Kansai Medical University Hospital, 2-5-1 Shinmachi, Hirakata-Shi, Osaka, Japan
Noboru Tanigawa: Department of Radiology, Kansai Medical University Graduate School of Medicine, 2-5-1 Shinmachi, Hirakata-Shi, Osaka, Japan
Masahiko Koziumi: Department of Medical Physics and Engineering, Osaka University Graduate School of Medicine, 1-7 Yamadaoka, Suita-Shi, Osaka, Japan
Teiji Nishio: Department of Medical Physics and Engineering, Osaka University Graduate School of Medicine, 1-7 Yamadaoka, Suita-Shi, Osaka, Japan

Miller MI, Shih LC, Kolachalama VB. Machine Learning in Clinical Trials: A Primer with Applications to Neurology. Neurotherapeutics 2023;20:1066-1080. [PMID: 37249836 PMCID: PMC10228463 DOI: 10.1007/s13311-023-01384-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Accepted: 04/21/2023] [Indexed: 05/31/2023] Open

Artificial Intelligence Applied to clinical trials: opportunities and challenges. HEALTH AND TECHNOLOGY 2023;13:203-213. [PMID: 36923325 PMCID: PMC9974218 DOI: 10.1007/s12553-023-00738-2] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 04/05/2022] [Accepted: 02/08/2023] [Indexed: 03/06/2023]

Fang C, Markuzon N, Patel N, Rueda JD. Natural Language Processing for Automated Classification of Qualitative Data From Interviews of Patients With Cancer. VALUE IN HEALTH : THE JOURNAL OF THE INTERNATIONAL SOCIETY FOR PHARMACOECONOMICS AND OUTCOMES RESEARCH 2022;25:1995-2002. [PMID: 35840523 DOI: 10.1016/j.jval.2022.06.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2021] [Revised: 05/19/2022] [Accepted: 06/12/2022] [Indexed: 06/15/2023]

Abstract

OBJECTIVES

This study sought to explore the use of novel natural language processing (NLP) methods for classifying unstructured, qualitative textual data from interviews of patients with cancer to identify patient-reported symptoms and impacts on quality of life.

METHODS

We tested the ability of 4 NLP models to accurately classify text from interview transcripts as "symptom," "quality of life impact," and "other." Interview data sets from patients with hepatocellular carcinoma (HCC) (n = 25), biliary tract cancer (BTC) (n = 23), and gastric cancer (n = 24) were used. Models were cross-validated with transcript subsets designated for training, validation, and testing. Multiclass classification performance of the 4 models was evaluated at paragraph and sentence level using the HCC testing data set and analyzed by the one-versus-rest technique quantified by the receiver operating characteristic area under the curve (ROC AUC) score.
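
The one-versus-rest ROC AUC evaluation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the toy labels, and the probability values are all assumptions made for the example, and the AUC is computed directly as the fraction of positive-negative pairs ranked correctly (ties count as half).

```python
from typing import Dict, List, Sequence


def binary_auc(scores: Sequence[float], positives: Sequence[bool]) -> float:
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, positives) if y]
    neg = [s for s, y in zip(scores, positives) if not y]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(y_true: List[str], y_prob: List[Dict[str, float]]) -> Dict[str, float]:
    """Treat each class in turn as 'positive' and all other classes as 'negative'."""
    return {
        c: binary_auc([p[c] for p in y_prob], [y == c for y in y_true])
        for c in sorted(set(y_true))
    }


# Toy data (hypothetical): true labels and per-class predicted probabilities.
y_true = ["symptom", "other", "quality of life impact", "symptom", "other"]
y_prob = [
    {"symptom": 0.7, "quality of life impact": 0.2, "other": 0.1},
    {"symptom": 0.2, "quality of life impact": 0.2, "other": 0.6},
    {"symptom": 0.3, "quality of life impact": 0.5, "other": 0.2},
    {"symptom": 0.4, "quality of life impact": 0.4, "other": 0.2},
    {"symptom": 0.5, "quality of life impact": 0.1, "other": 0.4},
]

per_class = one_vs_rest_auc(y_true, y_prob)
macro_auc = sum(per_class.values()) / len(per_class)  # unweighted mean over classes
print(per_class, macro_auc)
```

Averaging the per-class scores without weighting is one common choice; the abstract does not say which averaging the authors used.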

RESULTS

NLP models accurately classified multiclass text from patient interviews. The Bidirectional Encoder Representations from Transformers model generally outperformed all other models at paragraph and sentence level. The highest predictive performance of the Bidirectional Encoder Representations from Transformers model was observed using the HCC data set to train and BTC data set to test (mean ROC AUC, 0.940 [SD 0.028]), with similarly high predictive performance using balanced and imbalanced training data sets from BTC and gastric cancer populations.

CONCLUSIONS

NLP models were accurate in predicting multiclass classification of text from interviews of patients with cancer, with most surpassing 0.9 ROC AUC at paragraph level. NLP may be a useful tool for scaling up processing of patient interviews in clinical studies and, thus, could serve to facilitate patient input into drug development and improve patient care.

Li J, Wei Q, Ghiasvand O, Chen M, Lobanov V, Weng C, Xu H. A comparative study of pre-trained language models for named entity recognition in clinical trial eligibility criteria from multiple corpora. BMC Med Inform Decis Mak 2022;22:235. [PMID: 36068551 PMCID: PMC9450226 DOI: 10.1186/s12911-022-01967-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/21/2022] [Accepted: 08/10/2022] [Indexed: 05/29/2024] Open

Rafee A, Riepenhausen S, Neuhaus P, Meidt A, Dugas M, Varghese J. ELaPro, a LOINC-mapped core dataset for top laboratory procedures of eligibility screening for clinical trials. BMC Med Res Methodol 2022;22:141. [PMID: 35568796 PMCID: PMC9107639 DOI: 10.1186/s12874-022-01611-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/01/2021] [Accepted: 04/20/2022] [Indexed: 12/21/2022] Open

Abstract

Background

Screening for eligible patients continues to pose a great challenge for many clinical trials. This has led to a rapidly growing interest in standardizing computable representations of eligibility criteria (EC) in order to develop tools that leverage data from electronic health record (EHR) systems. Although laboratory procedures (LP) represent a common entity of EC that is readily available and retrievable from EHR systems, there is a lack of interoperable data models for this entity of EC. A public, specialized data model that utilizes international, widely-adopted terminology for LP, e.g. Logical Observation Identifiers Names and Codes (LOINC®), is much needed to support automated screening tools.

Objective

The aim of this study is to establish a core dataset for LP most frequently requested to recruit patients for clinical trials using LOINC terminology. Employing such a core dataset could enhance the interface between study feasibility platforms and EHR systems and significantly improve automatic patient recruitment.

Methods

We used a semi-automated approach to analyze 10,516 screening forms from the Medical Data Models (MDM) portal’s data repository that are pre-annotated with Unified Medical Language System (UMLS) concepts. An automated semantic analysis based on concept frequency was followed by an extensive manual expert review, performed by physicians, to analyze complex recruitment-relevant concepts not amenable to an automatic approach.

Results

Based on analysis of 138,225 EC from 10,516 screening forms, 55 laboratory procedures represented 77.87% of all UMLS laboratory concept occurrences identified in the selected EC forms. We identified 26,413 unique UMLS concepts from 118 UMLS semantic types and covered the vast majority of Medical Subject Headings (MeSH) disease domains.
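
The coverage figure above (a small set of procedures accounting for most concept occurrences) rests on a simple frequency calculation. A minimal sketch, assuming a flat list of concept occurrences; the function name `top_k_coverage` and the toy laboratory concepts are hypothetical and not taken from the ELaPro pipeline:

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_k_coverage(concepts: Iterable[str], k: int) -> Tuple[List[str], float]:
    """Return the k most frequent concepts and the share of all occurrences they cover."""
    counts = Counter(concepts)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no concept occurrences supplied")
    top = counts.most_common(k)
    covered = sum(n for _, n in top)
    return [name for name, _ in top], covered / total


# Toy data (hypothetical): one entry per occurrence of a laboratory concept.
occurrences = (
    ["hemoglobin"] * 5 + ["creatinine"] * 3 + ["platelet count"] + ["bilirubin"]
)
top, share = top_k_coverage(occurrences, k=2)
print(top, share)
```

With k set to 55 and real annotation data, the same calculation would yield a figure comparable to the 77.87% reported, although the study's exact counting rules are not given in the abstract.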

Conclusions

Only a small set of common LP covers the majority of laboratory concepts in screening EC forms, which supports the feasibility of establishing a focused core dataset for LP. We present ELaPro, a novel, LOINC-mapped core dataset for the 55 LP most frequently requested in screening for clinical trials. ELaPro is available in multiple machine-readable data formats, such as CSV, ODM, and HL7 FHIR. The extensive manual curation of this large number of free-text EC, as well as the combination of UMLS and LOINC terminologies, distinguishes this specialized dataset from previous relevant datasets in the literature.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12874-022-01611-y.

Bhatnagar R, Sardar S, Beheshti M, Podichetty JT. How can natural language processing help model informed drug development?: a review. JAMIA Open 2022;5:ooac043. [PMID: 35702625 PMCID: PMC9188322 DOI: 10.1093/jamiaopen/ooac043] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/23/2022] [Revised: 04/28/2022] [Accepted: 05/26/2022] [Indexed: 01/20/2023] Open

Abstract

Objective

To summarize applications of natural language processing (NLP) in model informed drug development (MIDD) and identify potential areas of improvement.

Materials and Methods

Publications found on PubMed and Google Scholar, websites and GitHub repositories for NLP libraries and models. Publications describing applications of NLP in MIDD were reviewed. The applications were stratified into 3 stages: drug discovery, clinical trials, and pharmacovigilance. Key NLP functionalities used for these applications were assessed. Programming libraries and open-source resources for the implementation of NLP functionalities in MIDD were identified.

Results

NLP has been utilized to aid various processes in the drug development lifecycle, such as gene-disease mapping, biomarker discovery, patient-trial matching, and adverse drug event detection. These applications commonly use the NLP functionalities of named entity recognition, word embeddings, entity resolution, assertion status detection, relation extraction, and topic modeling. The current state of the art for implementing these functionalities in MIDD applications is transformer models that utilize transfer learning for enhanced performance. Various libraries in Python, R, and Java, like huggingface, sparkNLP, and KoRpus, as well as open-source platforms such as DisGeNet, DeepEnroll, and Transmol, have enabled convenient implementation of NLP models in MIDD applications.

Discussion

Challenges such as reproducibility, explainability, fairness, limited data, limited language support, and security need to be overcome to ensure wider adoption of NLP in the MIDD landscape. There are opportunities to improve the performance of existing models and expand the use of NLP in newer areas of MIDD.

Conclusions

This review provides an overview of the potential and pitfalls of current NLP approaches in MIDD.

Kataria S, Ravindran V. Musculoskeletal care - at the confluence of data science, sensors, engineering, and computation. BMC Musculoskelet Disord 2022;23:169. [PMID: 35193536 PMCID: PMC8863295 DOI: 10.1186/s12891-022-05126-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/23/2021] [Accepted: 02/17/2022] [Indexed: 12/27/2022] Open

Alkaitis MS, Agrawal MN, Riely GJ, Razavi P, Sontag D. Automated NLP Extraction of Clinical Rationale for Treatment Discontinuation in Breast Cancer. JCO Clin Cancer Inform 2021;5:550-560. [PMID: 33989016 DOI: 10.1200/cci.20.00139] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022] Open

Abstract

PURPOSE

Key oncology end points are not routinely encoded into electronic medical records (EMRs). We assessed whether natural language processing (NLP) can abstract treatment discontinuation rationale from unstructured EMR notes to estimate toxicity incidence and progression-free survival (PFS).

METHODS

We constructed a retrospective cohort of 6,115 patients with early-stage and 701 patients with metastatic breast cancer initiating care at Memorial Sloan Kettering Cancer Center from 2008 to 2019. Each cohort was divided into training (70%), validation (15%), and test (15%) subsets. Human abstractors identified the clinical rationale associated with treatment discontinuation events. Concatenated EMR notes were used to train high-dimensional logistic regression and convolutional neural network models. Kaplan-Meier analyses were used to compare toxicity incidence and PFS estimated by our NLP models to estimates generated by manual labeling and time-to-treatment discontinuation (TTD).
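
The 70% / 15% / 15% division described above could look like the following sketch. The abstract does not say how patients were assigned to subsets, so the seeded random shuffle, the integer rounding rule, and the name `split_70_15_15` are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_70_15_15(items: Sequence[T], seed: int = 0) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle reproducibly, then cut into training, validation, and test subsets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = n * 70 // 100  # integer arithmetic avoids floating-point rounding surprises
    n_val = n * 15 // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder goes to the test subset
    return train, val, test


# Hypothetical patient identifiers for a cohort the size of the early-stage group.
patient_ids = list(range(6115))
train, val, test = split_70_15_15(patient_ids, seed=42)
print(len(train), len(val), len(test))
```

Splitting at the patient level, rather than the note level, keeps all notes from one patient in a single subset, which avoids leakage between training and test data.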

RESULTS

Our best high-dimensional logistic regression models identified toxicity events in early-stage patients with an area under the curve of the receiver-operator characteristic of 0.857 ± 0.014 (standard deviation) and progression events in metastatic patients with an area under the curve of 0.752 ± 0.027 (standard deviation). NLP-extracted toxicity incidence and PFS curves were not significantly different from manually extracted curves (P = .95 and P = .67, respectively). By contrast, TTD overestimated toxicity in early-stage patients (P < .001) and underestimated PFS in metastatic patients (P < .001). Additionally, we tested an extrapolation approach in which 20% of the metastatic cohort were labeled manually, and NLP algorithms were used to abstract the remaining 80%. This extrapolated outcomes approach resolved PFS differences between receptor subtypes (P < .001 for hormone receptor+/human epidermal growth factor receptor 2- v human epidermal growth factor receptor 2+ v triple-negative) that could not be resolved with TTD.

CONCLUSION

NLP models are capable of abstracting treatment discontinuation rationale with minimal manual labeling.

Zeng K, Xu Y, Lin G, Liang L, Hao T. Automated classification of clinical trial eligibility criteria text based on ensemble learning and metric learning. BMC Med Inform Decis Mak 2021;21:129. [PMID: 34330259 PMCID: PMC8323220 DOI: 10.1186/s12911-021-01492-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/15/2021] [Accepted: 04/08/2021] [Indexed: 12/02/2022] Open

Abstract

BACKGROUND

Eligibility criteria are the primary strategy for screening the target participants of a clinical trial. Automated classification of clinical trial eligibility criteria text by using machine learning methods improves recruitment efficiency to reduce the cost of clinical research. However, existing methods suffer from poor classification performance due to the complexity and imbalance of eligibility criteria text data.

METHODS

An ensemble learning-based model with metric learning is proposed for eligibility criteria classification. The model integrates a set of pre-trained models including Bidirectional Encoder Representations from Transformers (BERT), A Robustly Optimized BERT Pretraining Approach (RoBERTa), XLNet, Pre-training Text Encoders as Discriminators Rather Than Generators (ELECTRA), and Enhanced Representation through Knowledge Integration (ERNIE). Focal Loss is used as the loss function to address the data imbalance problem. Metric learning is employed to train the embedding of each base model so that features are easier to distinguish. Soft Voting is applied to obtain the final classification of the ensemble model. The dataset is from standard evaluation Task 3 of the 5th China Health Information Processing Conference and contains 38,341 eligibility criteria texts in 44 categories.
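
Two of the components named above, focal loss and soft voting, are compact enough to sketch. This is an illustration under stated assumptions rather than the authors' implementation: the focusing parameter `gamma=2.0` is a commonly used default that the abstract does not confirm, the class-weighting term of focal loss is omitted, and the probability vectors are invented.

```python
import math
from typing import List, Sequence


def focal_loss(probs: Sequence[float], target: int, gamma: float = 2.0) -> float:
    """Focal loss for one example: cross-entropy scaled by (1 - p_t) ** gamma."""
    p_t = probs[target]  # predicted probability of the true class
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def soft_vote(model_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average the class-probability vectors produced by several base models."""
    n_models = len(model_probs)
    n_classes = len(model_probs[0])
    return [sum(m[c] for m in model_probs) / n_models for c in range(n_classes)]


# A confidently correct prediction contributes far less loss than an uncertain one,
# which shifts training effort toward hard (often minority-class) examples.
easy = focal_loss([0.9, 0.05, 0.05], target=0)
hard = focal_loss([0.2, 0.5, 0.3], target=0)

# Hypothetical outputs of three base models for one criterion over three classes.
ensemble = soft_vote([
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.3, 0.4, 0.3],
])
predicted = max(range(len(ensemble)), key=ensemble.__getitem__)
print(easy, hard, ensemble, predicted)
```

Note that the first model alone would pick class 0, while the averaged probabilities favor class 1; this is the behavior that distinguishes soft voting from relying on any single base model.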

RESULTS

Our ensemble method had an accuracy of 0.8497, a precision of 0.8229, and a recall of 0.8216 on the dataset. The macro F1-score was 0.8169, outperforming state-of-the-art baseline methods by 0.84% on average. In addition, the performance improvement had a p-value of 2.152e-07 with a standard t-test, indicating that our model achieved a significant improvement.
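
The macro F1-score reported above is the unweighted mean of per-class F1 scores, which is why it need not equal the harmonic mean of the overall precision and recall (for 0.8229 and 0.8216 that harmonic mean would be about 0.8222, assuming those two figures are themselves macro averages). A minimal sketch with invented category labels and predictions:

```python
from typing import Dict, List


def per_class_f1(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """F1 for each class, computed from true positives, false positives, false negatives."""
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true: List[str], y_pred: List[str]) -> float:
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Toy data (hypothetical category names, not the 44 categories of the dataset).
y_true = ["age", "age", "age", "disease", "disease", "therapy"]
y_pred = ["age", "age", "disease", "disease", "disease", "age"]

scores = per_class_f1(y_true, y_pred)
macro = macro_f1(y_true, y_pred)
print(scores, macro)
```

In this toy case the single missed "therapy" example drives that class's F1 to zero and pulls the macro average down sharply, which illustrates why macro F1 is sensitive to performance on imbalanced categories.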

CONCLUSIONS

A model for classifying eligibility criteria text of clinical trials based on multi-model ensemble learning and metric learning was proposed. The experiments demonstrated that our ensemble model significantly improved classification performance. In addition, metric learning improved the word embedding representation, and the focal loss reduced the impact of data imbalance on model performance.

Zong H, Yang J, Zhang Z, Li Z, Zhang X. Semantic categorization of Chinese eligibility criteria in clinical trials using machine learning methods. BMC Med Inform Decis Mak 2021;21:128. [PMID: 33858409 PMCID: PMC8050926 DOI: 10.1186/s12911-021-01487-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/20/2020] [Accepted: 04/01/2021] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Semantic categorization of clinical trial eligibility criteria based on natural language processing is crucial for optimizing clinical trial design and building automated patient recruitment systems. However, most related research has focused on English eligibility criteria, and to the best of our knowledge, no study has examined Chinese eligibility criteria. In this study, we therefore aimed to explore the semantic categories of Chinese eligibility criteria.

METHODS

We downloaded the clinical trial registration files from the website of the Chinese Clinical Trial Registry (ChiCTR) and extracted both the Chinese eligibility criteria and the corresponding English eligibility criteria. We represented the criteria sentences based on Unified Medical Language System semantic types and applied a hierarchical clustering algorithm to induce semantic categories. Furthermore, to explore the classification performance of Chinese eligibility criteria with our developed semantic categories, we implemented multiple classification algorithms, including four baseline machine learning algorithms (LR, NB, kNN, SVM), three deep learning algorithms (CNN, RNN, FastText), and two pre-trained language models (BERT, ERNIE).

RESULTS

In total, we developed 44 semantic categories, summarized them into 8 topic groups, and investigated their average incidence and prevalence in 272 hepatocellular carcinoma-related Chinese clinical trials. Compared with the categories previously proposed for English eligibility criteria, 13 novel categories were identified in Chinese eligibility criteria. The classification results show that most semantic categories performed quite well; the pre-trained language model ERNIE achieved the best performance, with a macro-average F1 score of 0.7980 and a micro-average F1 score of 0.8484.

CONCLUSION

As a pilot study of Chinese eligibility criteria analysis, we developed 44 semantic categories by hierarchical clustering for the first time and validated their classification capacity with multiple classification algorithms.

Evaluating eligibility criteria of oncology trials using real-world data and AI. Nature 2021;592:629-633. [PMID: 33828294 DOI: 10.1038/s41586-021-03430-5] [Citation(s) in RCA: 99] [Impact Index Per Article: 33.0] [Received: 08/24/2020] [Accepted: 03/08/2021] [Indexed: 01/04/2023]

宗辉, 张泽, 杨金, 雷健, 李作, 郝天, 张晓. [Artificial intelligence based Chinese clinical trials eligibility criteria classification]. SHENG WU YI XUE GONG CHENG XUE ZA ZHI = JOURNAL OF BIOMEDICAL ENGINEERING = SHENGWU YIXUE GONGCHENGXUE ZAZHI 2021;38:105-110. [PMID: 33899434 PMCID: PMC10307579 DOI: 10.7507/1001-5515.202006035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2020] [Revised: 11/18/2020] [Indexed: 11/03/2022]

Bitterman DS, Miller TA, Mak RH, Savova GK. Clinical Natural Language Processing for Radiation Oncology: A Review and Practical Primer. Int J Radiat Oncol Biol Phys 2021;110:641-655. [PMID: 33545300 DOI: 10.1016/j.ijrobp.2021.01.044] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 06/25/2020] [Revised: 12/22/2020] [Accepted: 01/23/2021] [Indexed: 02/07/2023]

Abstract

Natural language processing (NLP), which aims to convert human language into expressions that can be analyzed by computers, is one of the most rapidly developing and widely used technologies in the field of artificial intelligence. Natural language processing algorithms convert unstructured free text data into structured data that can be extracted and analyzed at scale. In medicine, this unlocking of the rich, expressive data within clinical free text in electronic medical records will help untap the full potential of big data for research and clinical purposes. Recent major NLP algorithmic advances have significantly improved the performance of these algorithms, leading to a surge in academic and industry interest in developing tools to automate information extraction and phenotyping from clinical texts. Thus, these technologies are poised to transform medical research and alter clinical practices in the future. Radiation oncology stands to benefit from NLP algorithms if they are appropriately developed and deployed, as they may enable advances such as automated inclusion of radiation therapy details into cancer registries, discovery of novel insights about cancer care, and improved patient data curation and presentation at the point of care. However, challenges remain before the full value of NLP is realized, such as the plethora of jargon specific to radiation oncology, nonstandard nomenclature, a lack of publicly available labeled data for model development, and interoperability limitations between radiation oncology data silos. Successful development and implementation of high quality and high value NLP models for radiation oncology will require close collaboration between computer scientists and the radiation oncology community. 
Here, we present a primer on artificial intelligence algorithms in general and NLP algorithms in particular; provide guidance on how to assess the performance of such algorithms; review prior research on NLP algorithms for oncology; and describe future avenues for NLP in radiation oncology research and clinics.

Stemerman R, Bunning T, Grover J, Kitzmiller R, Patel MD. Identifying Patient Phenotype Cohorts Using Prehospital Electronic Health Record Data. PREHOSP EMERG CARE 2021:1-14. [PMID: 33315497 DOI: 10.1080/10903127.2020.1859658] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/19/2020] [Accepted: 12/01/2020] [Indexed: 10/22/2022]

Abstract

Objective

Emergency medical services (EMS) provide critical interventions for patients with acute illness and injury and are important in implementing prehospital emergency care research. Retrospective, manual patient record review, the current reference standard for identifying patient cohorts, requires significant time and financial investment. We developed automated classification models to identify eligible patients for prehospital clinical trials using EMS clinical notes and compared model performance to manual review.

Methods

With eligibility criteria for an ongoing prehospital study of chest pain patients, we used EMS clinical notes (n = 1208) to manually classify patients as eligible, ineligible, and indeterminate. We randomly split these same records into training and test sets to develop and evaluate machine-learning (ML) algorithms using natural language processing (NLP) for feature (variable) selection. We compared models to the manual classification to calculate sensitivity, specificity, accuracy, positive predictive value, and F1 measure. We measured clinical expert time to perform review for manual and automated methods.

Results

ML models' sensitivity, specificity, accuracy, positive predictive value, and F1 measure ranged from 0.93 to 0.98. Compared to manual classification (N = 363 records), the automated method excluded 90.9% of records as ineligible, leaving only 33 records for manual review.

Conclusions

Our ML-derived approach demonstrates the feasibility of developing a high-performing, automated classification system using EMS clinical notes to streamline the identification of a specific cardiac patient cohort. This efficient approach can be leveraged to facilitate prehospital patient-trial matching, patient phenotyping (e.g., influenza-like illness), and the creation of prehospital patient registries.
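
The five measures reported in the Results above all derive from a confusion matrix, and the workload figure follows from simple arithmetic. The sketch below simplifies the study's three-way labeling (eligible, ineligible, indeterminate) to a binary eligible-versus-not decision, and the confusion-matrix counts are hypothetical values chosen only to sum to 363; they are not taken from the paper.

```python
from typing import Dict


def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary classification measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of truly eligible patients that were found
    specificity = tn / (tn + fp)  # share of truly ineligible patients that were excluded
    ppv = tp / (tp + fp)          # positive predictive value (precision)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts for a 363-record test set (not reported in the abstract).
metrics = screening_metrics(tp=32, fp=2, tn=328, fn=1)

# Workload arithmetic from the abstract: excluding 90.9% of 363 records leaves 33.
total_records = 363
excluded = round(total_records * 0.909)
remaining_for_review = total_records - excluded
print(metrics, remaining_for_review)
```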

Affiliation(s)

Rachel Stemerman: Carolina Health Informatics Program, University of North Carolina, Chapel Hill, North Carolina
Thomas Bunning: Department of Anesthesiology, Duke University Medical Center, Durham, North Carolina
Joseph Grover: Department of Emergency Medicine, University of North Carolina, Chapel Hill, North Carolina
Rebecca Kitzmiller: Carolina Health Informatics Program, University of North Carolina, Chapel Hill, North Carolina
Mehul D Patel: Department of Emergency Medicine, University of North Carolina, Chapel Hill, North Carolina


Davidson L, Boland MR. Towards deep phenotyping pregnancy: a systematic review on artificial intelligence and machine learning methods to improve pregnancy outcomes. Brief Bioinform 2021;22:6065792. [PMID: 33406530 PMCID: PMC8424395 DOI: 10.1093/bib/bbaa369] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 05/22/2020] [Revised: 10/13/2020] [Accepted: 11/18/2020] [Indexed: 12/16/2022]

Artificial intelligence in oncology. Artif Intell Med 2021. [DOI: 10.1016/b978-0-12-821259-2.00018-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]

Chamberlin SR, Bedrick SD, Cohen AM, Wang Y, Wen A, Liu S, Liu H, Hersh WR. Evaluation of patient-level retrieval from electronic health record data for a cohort discovery task. JAMIA Open 2020;3:395-404. [PMID: 33215074 PMCID: PMC7660955 DOI: 10.1093/jamiaopen/ooaa026] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/13/2019] [Revised: 04/17/2020] [Accepted: 06/03/2020] [Indexed: 11/24/2022]

Abstract

OBJECTIVE

Growing numbers of academic medical centers offer patient cohort discovery tools to their researchers, yet the performance of systems for this use case is not well understood. The objective of this research was to assess patient-level information retrieval methods using electronic health records for different types of cohort definition retrieval.

MATERIALS AND METHODS

We developed a test collection consisting of about 100 000 patient records and 56 test topics that characterized patient cohort requests for various clinical studies. Automated information retrieval tasks using word-based approaches were performed, varying 4 different parameters for a total of 48 permutations, with performance measured using B-Pref. We subsequently created structured Boolean queries for the 56 topics for performance comparisons. In addition, we performed a more detailed analysis of 10 topics.
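B-Pref scores a ranking using only judged records, which suits test collections where most patient records are never assessed. The sketch below follows one common formulation (the one used by trec_eval); the abstract does not state which variant the authors used, so treat this as an assumption. The function name and the patient identifiers are hypothetical.

```python
def bpref(ranking, relevant, nonrelevant):
    """Compute B-Pref for one topic.

    ranking     -- retrieved record ids, best first
    relevant    -- set of ids judged relevant
    nonrelevant -- set of ids judged non-relevant
    Unjudged ids in the ranking are ignored.
    """
    r_count = len(relevant)
    n_count = len(nonrelevant)
    if r_count == 0:
        return 0.0
    denom = min(r_count, n_count)
    nonrel_seen = 0  # judged non-relevant records ranked above the current one
    total = 0.0
    for record_id in ranking:
        if record_id in relevant:
            if denom == 0:
                # No judged non-relevant records exist, so nothing can outrank it.
                total += 1.0
            else:
                total += 1.0 - min(nonrel_seen, r_count) / denom
        elif record_id in nonrelevant:
            nonrel_seen += 1
    # Relevant records that were never retrieved contribute zero.
    return total / r_count


# Hypothetical topic: three relevant patients, two judged non-relevant.
score = bpref(
    ranking=["p1", "p2", "p3", "p4", "p5"],
    relevant={"p1", "p4", "p9"},
    nonrelevant={"p2", "p3"},
)
```

In this toy topic, p1 is ranked above every judged non-relevant record and contributes 1, p4 sits below both and contributes 0, and p9 is never retrieved, so the score is 1/3.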

RESULTS

The best-performing word-based automated query parameter settings achieved a mean B-Pref of 0.167 across all 56 topics. The way a topic was structured (topic representation) had the largest impact on performance. Performance not only varied widely across topics, but there was also a large variance in sensitivity to parameter settings across the topics. Structured queries generally performed better than automated queries on measures of recall and precision but were still not able to recall all relevant patients found by the automated queries.

CONCLUSION

While word-based automated methods of cohort retrieval offer an attractive solution to the labor-intensive nature of this task currently used at many medical centers, we generally found suboptimal performance in those approaches, with better performance obtained from structured Boolean queries. Future work will focus on using the test collection to develop and evaluate new approaches to query structure, weighting algorithms, and application of semantic methods.


Harkin LJ, Beaver K, Dey P, Choong KA. Secret groups and open forums: Defining online support communities from the perspective of people affected by cancer. Digit Health 2020;6:2055207619898993. [PMID: 32010450 PMCID: PMC6970481 DOI: 10.1177/2055207619898993] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 05/30/2019] [Accepted: 12/11/2019] [Indexed: 12/12/2022]

Abstract

Objective

A quarter of people diagnosed with cancer lack social support. Online cancer communities could allow people to connect and support one another. However, the current proliferation of online support communities constitutes a range of online environments with differing communication capacities and limitations. It is unclear what is perceived as online cancer community support and how different features can help or hinder supportive group processes. This study aimed to explore how perceived support is influenced by the different features and formats of online support environments.

Methods

In-depth qualitative interviews were conducted with 23 individuals affected by a range of cancer diagnoses, including both cancer survivors and family members. Data were analysed using deductive thematic analysis guided by a constructivist epistemological perspective.

Findings

Online supportive communities were defined and differentiated by two themes. Firstly, ‘Open forums’ were identified with thematic properties which facilitated a uniquely informative environment including ‘Safety in anonymity’, ‘Perceived reliability’ and ‘Exposure and detachment’. Secondly, ‘Secret groups’ were identified with thematic properties which enhanced an emotionally supportive environment including ‘Personalised interactions’, an overt ‘Peer hierarchy’, and ‘Crossing the virtual divide’.

Conclusions

Properties of groups can engender different degrees of interpersonal relations and different supportive interactions. In particular, support community designers may want to adapt key features such as anonymity, trustworthiness of websites, and the personalised nature of conversations to influence the development of supportive environments. In personalised peer-led groups, it may be prudent to provide guidance on how to reassert a positive environment if arguments break out online.


Liu J, Zhang W, Jiang X, Zhou Y. Data Mining of the Reviews from Online Private Doctors. Telemed J E Health 2019;26:1157-1166. [PMID: 31674890 DOI: 10.1089/tmj.2019.0159] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/13/2022]

Abstract

Background: User-generated content shared in the online health communities (OHCs) is becoming a valuable resource for researchers to understand patients' decision-making behaviors in the management of their health. Many studies have focused on how to obtain useful information from online reviews in OHCs.

Introduction: This study focuses on a telemedicine service called Online Private Doctor (OPD), which is offered by a leading Chinese physician review website (PRW). OPD reviews have not received much attention. By data mining the reviews, our goal is to determine what patients are talking about when they use the OPD service and whether they are satisfied with the service or not.

Materials and Methods: We used a Python web crawler to collect 41,029 reviews and 84,510 short reviews (labels) of all 5,645 physicians who offered the OPD service on a PRW (haodf.com) in China. Mixed methods (i.e., a literature review, topic discovery, annotation, and a sentiment analysis) were used to determine the information that the OPD reviews are meant to express.

Results: We discovered that the OPD reviews can be categorized into four subjects: competence (35.1%), communication (29.4%), treatment (26.0%), and convenience (9.5%). In terms of previously discovered topics, we found that competence, communication, and treatment have been discussed before, but convenience is an emerging subject. The sentiment analysis indicated that 93.67% of the reviews indicated positive emotions, and the area under the receiver operating characteristic (ROC) curve is 0.64. Furthermore, the labels indicated that only 0.72% (603/84,570) of reviews were negative toward the OPD service. The subjects of the labels were distributed according to competence (34.7%), communication (23.8%), treatment (33.5%), and convenience (8.0%).

Discussion: The findings of our study suggest that patients who have used OPD have been quite satisfied with the service. From their reviews, we discovered that OPD has its special characteristics and is convenient. However, it still has some shortcomings, for example, the quality of the phone connection. In terms of both the platform and the doctors, more efforts should be made to make the OPD better and more regulated.

Conclusion: OPD is an emerging telemedicine service that still needs more time and space to evolve. For patients, it helps reduce problems such as scheduling and queuing. Therefore, it brings more convenience to people's daily lives. In the future, more attention should be paid to this service, as it is helpful in reducing the uneven distribution of medical resources.


Savova GK, Danciu I, Alamudun F, Miller T, Lin C, Bitterman DS, Tourassi G, Warner JL. Use of Natural Language Processing to Extract Clinical Cancer Phenotypes from Electronic Medical Records. Cancer Res 2019;79:5463-5470. [PMID: 31395609 PMCID: PMC7227798 DOI: 10.1158/0008-5472.can-19-0579] [Citation(s) in RCA: 72] [Impact Index Per Article: 14.4] [Received: 02/15/2019] [Revised: 06/17/2019] [Accepted: 07/29/2019] [Indexed: 12/12/2022]

Learning Eligibility in Cancer Clinical Trials Using Deep Neural Networks. APPLIED SCIENCES-BASEL 2018. [DOI: 10.3390/app8071206] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 11/16/2022]

Using Self-Reported Patient Experiences to Understand Patient Burden: Learnings from Digital Patient Communities in Ankylosing Spondylitis. Adv Ther 2018;35:424-437. [PMID: 29450863 PMCID: PMC5859700 DOI: 10.1007/s12325-018-0669-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 10/19/2017] [Indexed: 12/17/2022]

Abstract

Introduction

Online communities contain a wealth of information containing unsolicited patient experiences that may go beyond what is captured by guided surveys or patient-reported outcome (PRO) instruments used in clinical settings. This study described patient experiences reported online to better understand the day-to-day disease burden of ankylosing spondylitis (AS).

Methods

Unguided, English-language patient narratives reported between January 2010 and May 2016 were collected from 52 online sources (e.g., general/health social networking sites, patient–physician Q&A sites, AS forums). Using natural language processing combined with manual curation, patient-reported experiences within narratives were evaluated and categorized into social, physical, emotional, cognitive, and role activity (SPEC-R) concepts to assess functional impairment. The same SPEC-R categorization was applied to 5 AS-specific PRO instruments to evaluate their coverage of concepts extracted from patient narratives.

Results

A total of 34,780 narratives from 3449 patients with AS were included. Physical aspects of AS (e.g., pain and mobility) were most commonly reported by patients (86.7%), followed by emotional (32.5%), cognitive (23.6%), role activity (8.7%) and social (5.1%). Some frequently discussed subconcepts were effectively captured by ≥ 2 PRO instruments, such as pain (65.3%), asthenia (19.9%), musculoskeletal impairment (19.9%), depression (9.9%), and anger/frustration (5.4%); others [e.g., anxiety (19.1%), mental impairment (3.2%), impulsivity (2.9%)] were not addressed by any of the PRO instruments.

Conclusion

These findings highlight the importance of analyzing patient experiences beyond clinical trial settings and physician reports; continuous assessment of existing PRO instruments in collaboration with patients may increase their utility in real-world settings.

Funding

Novartis Pharmaceuticals Corporation.

Electronic supplementary material

The online version of this article (10.1007/s12325-018-0669-1) contains supplementary material, which is available to authorized users.
