1
Ahmad PN, Liu Y, Khan K, Jiang T, Burhan U. BIR: Biomedical Information Retrieval System for Cancer Treatment in Electronic Health Record Using Transformers. Sensors (Basel, Switzerland) 2023; 23:9355. [PMID: 38067736 PMCID: PMC10708614 DOI: 10.3390/s23239355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Revised: 10/25/2023] [Accepted: 10/29/2023] [Indexed: 12/18/2023]
Abstract
The rapid growth of electronic health records (EHRs) has led to unprecedented biomedical data. Clinician access to the latest patient information can improve the quality of healthcare. However, clinicians have difficulty finding information quickly and easily due to the sheer data mining volume. Biomedical information retrieval (BIR) systems can help clinicians find the information required by automatically searching EHRs and returning relevant results. However, traditional BIR systems cannot understand the complex relationships between EHR entities. Transformers are a new type of neural network that is very effective for natural language processing (NLP) tasks. As a result, transformers are well suited for tasks such as machine translation and text summarization. In this paper, we propose a new BIR system for EHRs that uses transformers for predicting cancer treatment from EHR. Our system can understand the complex relationships between the different entities in an EHR, which allows it to return more relevant results to clinicians. We evaluated our system on a dataset of EHRs and found that it outperformed state-of-the-art BIR systems on various tasks, including medical question answering and information extraction. Our results show that Transformers are a promising approach for BIR in EHRs, reaching an accuracy and an F1-score of 86.46% and 0.8157, respectively. We believe that our system can help clinicians find the information they need more quickly and easily, leading to improved patient care.
Affiliation(s)
- Pir Noman Ahmad
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
- Yuanchao Liu
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
- Khalid Khan
  - Department of Computing Science and Mathematics, University of Stirling, Stirling FK9 4LA, UK
- Tao Jiang
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
- Umama Burhan
  - Department of Computing Science and Mathematics, University of Stirling, Stirling FK9 4LA, UK
2
Solarte-Pabón O, Montenegro O, García-Barragán A, Torrente M, Provencio M, Menasalvas E, Robles V. Transformers for extracting breast cancer information from Spanish clinical narratives. Artif Intell Med 2023; 143:102625. [PMID: 37673566 DOI: 10.1016/j.artmed.2023.102625] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Revised: 05/11/2023] [Accepted: 07/08/2023] [Indexed: 09/08/2023]
Abstract
The wide adoption of electronic health records (EHRs) offers immense potential as a source of support for clinical research. However, previous studies focused on extracting only a limited set of medical concepts to support information extraction in the cancer domain for the Spanish language. Building on the success of deep learning for processing natural language texts, this paper proposes a transformer-based approach to extract named entities from breast cancer clinical notes written in Spanish and compares several language models. To facilitate this approach, a schema for annotating clinical notes with breast cancer concepts is presented, and a corpus for breast cancer is developed. Results indicate that both BERT-based and RoBERTa-based language models demonstrate competitive performance in clinical Named Entity Recognition (NER). Specifically, BETO and multilingual BERT achieve F-scores of 93.71% and 94.63%, respectively. Additionally, RoBERTa Biomedical attains an F-score of 95.01%, while RoBERTa BNE achieves an F-score of 94.54%. The findings suggest that transformers can feasibly extract information in the clinical domain in the Spanish language, with the use of models trained on biomedical texts contributing to enhanced results. The proposed approach takes advantage of transfer learning techniques by fine-tuning language models to automatically represent text features and avoiding the time-consuming feature engineering process.
Affiliation(s)
- Oswaldo Solarte-Pabón
  - Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Madrid, Spain; Escuela de Ingeniería de Sistemas, Universidad del Valle, Cali, Colombia
- Orlando Montenegro
  - Escuela de Ingeniería de Sistemas, Universidad del Valle, Cali, Colombia
- Maria Torrente
  - Hospital Universitario Puerta de Hierro de Madrid, Madrid, Spain
- Ernestina Menasalvas
  - Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Madrid, Spain
- Víctor Robles
  - Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Madrid, Spain
3
Zeng J, Cruz Pico CX, Saridogan T, Shufean MA, Kahle M, Yang D, Shaw K, Meric-Bernstam F. Natural Language Processing-Assisted Literature Retrieval and Analysis for Combination Therapy in Cancer. JCO Clin Cancer Inform 2022; 6:e2100109. [PMID: 34990212 PMCID: PMC9848576 DOI: 10.1200/cci.21.00109] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/06/2021] [Revised: 09/15/2021] [Accepted: 11/30/2021] [Indexed: 01/26/2023] Open
Abstract
PURPOSE Despite advances in molecular therapeutics, few anticancer agents achieve durable responses. Rational combinations using two or more anticancer drugs have the potential to achieve a synergistic effect and overcome drug resistance, enhancing antitumor efficacy. A publicly accessible biomedical literature search engine dedicated to this domain will facilitate knowledge discovery and reduce manual search and review. METHODS We developed RetriLite, an information retrieval and extraction framework that leverages natural language processing and domain-specific knowledgebase to computationally identify highly relevant papers and extract key information. The modular architecture enables RetriLite to benefit from synergizing information retrieval and natural language processing techniques while remaining flexible to customization. We customized the application and created an informatics pipeline that strategically identifies papers that describe efficacy of using combination therapies in clinical or preclinical studies. RESULTS In a small pilot study, RetriLite achieved an F1 score of 0.93. A more extensive validation experiment was conducted to determine agents that have enhanced antitumor efficacy in vitro or in vivo with poly (ADP-ribose) polymerase inhibitors: 95.9% of the papers determined to be relevant by our application were true positive and the application's feature of distinguishing a clinical paper from a preclinical paper achieved an accuracy of 97.6%. Interobserver assessment was conducted, which resulted in a 100% concordance. The data derived from the informatics pipeline have also been made accessible to the public via a dedicated online search engine with an intuitive user interface. CONCLUSION RetriLite is a framework that can be applied to establish domain-specific information retrieval and extraction systems. 
The extensive and high-quality metadata tags along with keyword highlighting facilitate information seekers to more effectively and efficiently discover knowledge in the combination therapy domain.
Affiliation(s)
- Jia Zeng
  - Sheikh Khalifa Bin Zayed Al Nahyan Institute for Personalized Cancer Therapy, The University of Texas MD Anderson Cancer Center, Houston, TX
- Christian X. Cruz Pico
  - Department of Surgical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX
- Turçin Saridogan
  - Department of Investigational Cancer Therapeutics, The University of Texas MD Anderson Cancer Center, Houston, TX
- Md Abu Shufean
  - Sheikh Khalifa Bin Zayed Al Nahyan Institute for Personalized Cancer Therapy, The University of Texas MD Anderson Cancer Center, Houston, TX
- Michael Kahle
  - Sheikh Khalifa Bin Zayed Al Nahyan Institute for Personalized Cancer Therapy, The University of Texas MD Anderson Cancer Center, Houston, TX
- Dong Yang
  - Sheikh Khalifa Bin Zayed Al Nahyan Institute for Personalized Cancer Therapy, The University of Texas MD Anderson Cancer Center, Houston, TX
- Kenna Shaw
  - Sheikh Khalifa Bin Zayed Al Nahyan Institute for Personalized Cancer Therapy, The University of Texas MD Anderson Cancer Center, Houston, TX
- Funda Meric-Bernstam
  - Sheikh Khalifa Bin Zayed Al Nahyan Institute for Personalized Cancer Therapy, The University of Texas MD Anderson Cancer Center, Houston, TX
  - Department of Surgical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX
  - Department of Investigational Cancer Therapeutics, The University of Texas MD Anderson Cancer Center, Houston, TX
4
Stenzl A, Sternberg CN, Ghith J, Serfass L, Schijvenaars BJA, Sboner A. Application of Artificial Intelligence to Overcome Clinical Information Overload in Urologic Cancer. BJU Int 2021; 130:291-300. [PMID: 34846775 DOI: 10.1111/bju.15662] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/29/2022]
Abstract
OBJECTIVE To describe the use of artificial intelligence (AI) in medical literature and trial data extraction, and its applications in uro-oncology. This bridging review, which consolidates information from the diverse applications of AI, highlights how AI users can investigate more sophisticated queries than with traditional methods, leading to synthesis of raw data and complex outputs into more actionable and personalized results, particularly in the field of uro-oncology. METHODS Literature and clinical trial searches were performed in PubMed, Dimensions, Embase and Google (1999-2020). The searches focused on the use of AI and its various forms to facilitate literature searches, clinical guidelines development, and clinical trial data extraction in uro-oncology. To illustrate how AI can be applied to address questions about optimizing therapeutic decision making and individualizing treatment regimens, the Dimensions-linked information platform was searched for "prostate cancer" keywords (76 publications were identified; 48 were included). RESULTS AI offers the promise of transforming raw data and complex outputs into actionable insights. Literature and clinical trial searches can be automated, enabling clinicians to develop and analyze publications expeditiously on complex issues such as therapeutic sequencing and to obtain updates on documents that evolve at the pace and scope of the landscape. An AI-based platform inclusive of 12 trial databases and >100 scientific literature sources enabled the creation of an interactive visualization. CONCLUSION As the literature and clinical trial landscape continues to grow in complexity and with increasing speed, the ability to pull the right information at the right time from different search engines and resources while excluding social media bias becomes more challenging.
This review demonstrates that by applying natural language processing and machine learning algorithms, validated and optimized AI leads to a speedier, more personalized, efficient and focused search compared with traditional methods.
Affiliation(s)
- Arnulf Stenzl
  - Department of Urology, University of Tübingen, Tübingen, Germany
- Cora N Sternberg
  - Clinical Director, Englander Institute for Precision Medicine, Professor of Medicine, Weill Cornell Medicine Hematology/Oncology, Sandra and Edward Meyer Cancer Center, New York, NY, USA
- Andrea Sboner
  - Director of Informatics and Computational Biology, Englander Institute for Precision Medicine; Assistant Professor at the Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
5
Emani S, Rui A, Rocha HAL, Rizvi RF, Juaçaba SF, Jackson GP, Bates DW. Physician Perception and Satisfaction with Artificial Intelligence in Cancer Treatment: The Watson for Oncology Experience and Implications for Low-Middle Income Countries (Preprint). JMIR Cancer 2021; 8:e31461. [PMID: 35389353 PMCID: PMC9030908 DOI: 10.2196/31461] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/22/2021] [Revised: 01/21/2022] [Accepted: 02/08/2022] [Indexed: 12/24/2022] Open
Abstract
As technology continues to improve, health care systems have the opportunity to use a variety of innovative tools for decision-making, including artificial intelligence (AI) applications. However, there has been little research on the feasibility and efficacy of integrating AI systems into real-world clinical practice, especially from the perspectives of clinicians who use such tools. In this paper, we review physicians’ perceptions of and satisfaction with an AI tool, Watson for Oncology, which is used for the treatment of cancer. Watson for Oncology has been implemented in several different settings, including Brazil, China, India, South Korea, and Mexico. By focusing on the implementation of an AI-based clinical decision support system for oncology, we aim to demonstrate how AI can be both beneficial and challenging for cancer management globally and particularly for low-middle–income countries. By doing so, we hope to highlight the need for additional research on user experience and the unique social, cultural, and political barriers to the successful implementation of AI in low-middle–income countries for cancer care.
Affiliation(s)
- Srinivas Emani
  - Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, United States
  - Department of Behavioral, Social, and Health Education Sciences, Emory University, Atlanta, GA, United States
- Angela Rui
  - Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, United States
- Hermano Alexandre Lima Rocha
  - Department of Community Health, Federal University of Ceará, Fortaleza, CE, Brazil
  - Instituto do Câncer do Ceará, Fortaleza, CE, Brazil
- Sergio Ferreira Juaçaba
  - Instituto do Câncer do Ceará, Fortaleza, CE, Brazil
  - Rodolfo Teofilo College, Fortaleza, CE, Brazil
- Gretchen Purcell Jackson
  - Intuitive Surgical, Sunnyvale, CA, United States
  - Departments of Pediatric Surgery, Pediatrics, and Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
- David W Bates
  - Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, United States
  - Department of Healthcare Policy and Management, Harvard School of Public Health, Boston, MA, United States
6
Schmidt L, Finnerty Mutlu AN, Elmore R, Olorisade BK, Thomas J, Higgins JPT. Data extraction methods for systematic review (semi)automation: Update of a living systematic review. F1000Res 2021; 10:401. [PMID: 34408850 PMCID: PMC8361807 DOI: 10.12688/f1000research.51117.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/27/2023] [Indexed: 10/12/2023] Open
Abstract
Background: The reliable and usable (semi)automation of data extraction can support the field of systematic review by reducing the workload required to gather information about the conduct and results of the included studies. This living systematic review examines published approaches for data extraction from reports of clinical studies. Methods: We systematically and continually search PubMed, ACL Anthology, arXiv, OpenAlex via EPPI-Reviewer, and the dblp computer science bibliography. Full text screening and data extraction are conducted within an open-source living systematic review application created for the purpose of this review. This living review update includes publications up to December 2022 and OpenAlex content up to March 2023. Results: 76 publications are included in this review. Of these, 64 (84%) of the publications addressed extraction of data from abstracts, while 19 (25%) used full texts. A total of 71 (93%) publications developed classifiers for randomised controlled trials. Over 30 entities were extracted, with PICOs (population, intervention, comparator, outcome) being the most frequently extracted. Data are available from 25 (33%), and code from 30 (39%) publications. Six (8%) implemented publicly available tools. Conclusions: This living systematic review presents an overview of (semi)automated data-extraction literature of interest to different types of literature review. We identified a broad evidence base of publications describing data extraction for interventional reviews and a small number of publications extracting epidemiological or diagnostic accuracy data. Between review updates, trends for sharing data and code increased strongly: in the base-review, data and code were available for 13 and 19% respectively; these numbers increased to 78 and 87% within the 23 new publications.
Compared with the base-review, we observed another research trend, away from straightforward data extraction and towards additionally extracting relations between entities or automatic text summarisation. With this living review we aim to review the literature continually.
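The percentages quoted in the Results of this abstract can be recomputed from the counts it gives. The snippet below is a minimal sketch of that arithmetic, assuming every percentage is taken over the 76 included publications and rounded to the nearest whole number; the dictionary keys and function names are illustrative labels, not terms from the review.

```python
# Counts and stated percentages as quoted in the abstract above.
# Keys are hypothetical labels chosen for this sketch.
TOTAL_PUBLICATIONS = 76

REPORTED = {
    "abstracts": (64, 84),
    "full_texts": (19, 25),
    "rct_classifiers": (71, 93),
    "data_available": (25, 33),
    "code_available": (30, 39),
    "public_tools": (6, 8),
}


def percent(count: int, total: int) -> int:
    """Share of `total`, as a percentage rounded to the nearest integer."""
    return round(100 * count / total)


def compare(reported: dict, total: int) -> dict:
    """Map each label to (computed percentage, stated percentage)."""
    return {
        label: (percent(count, total), stated)
        for label, (count, stated) in reported.items()
    }


if __name__ == "__main__":
    for label, (computed, stated) in compare(REPORTED, TOTAL_PUBLICATIONS).items():
        status = "matches" if computed == stated else "differs"
        print(f"{label}: computed {computed}%, stated {stated}% ({status})")
```

Under the assumed denominator, each recomputed value can be compared directly with the figure stated in the abstract.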
Affiliation(s)
- Lena Schmidt
  - NIHR Innovation Observatory, Newcastle University, Newcastle upon Tyne, NE4 5TG, UK
  - Sciome LLC, Research Triangle Park, North Carolina, 27713, USA
  - Bristol Medical School, University of Bristol, Bristol, BS8 2PS, UK
- Rebecca Elmore
  - Sciome LLC, Research Triangle Park, North Carolina, 27713, USA
- Babatunde K. Olorisade
  - Bristol Medical School, University of Bristol, Bristol, BS8 2PS, UK
  - Evaluate Ltd, London, SE1 2RE, UK
  - Cardiff School of Technologies, Cardiff Metropolitan University, Cardiff, CF5 2YB, UK
- James Thomas
  - UCL Social Research Institute, University College London, London, WC1H 0AL, UK
7
Suwanvecho S, Suwanrusme H, Jirakulaporn T, Issarachai S, Taechakraichana N, Lungchukiet P, Decha W, Boonpakdee W, Thanakarn N, Wongrattananon P, Preininger AM, Solomon M, Wang S, Hekmat R, Dankwa-Mullan I, Shortliffe E, Patel VL, Arriaga Y, Jackson GP, Kiatikajornthada N. Comparison of an oncology clinical decision-support system's recommendations with actual treatment decisions. J Am Med Inform Assoc 2021; 28:832-838. [PMID: 33517389 PMCID: PMC7973455 DOI: 10.1093/jamia/ocaa334] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 09/24/2020] [Indexed: 12/02/2022] Open
Abstract
OBJECTIVE IBM(R) Watson for Oncology (WfO) is a clinical decision-support system (CDSS) that provides evidence-informed therapeutic options to cancer-treating clinicians. A panel of experienced oncologists compared CDSS treatment options to treatment decisions made by clinicians to characterize the quality of CDSS therapeutic options and decisions made in practice. METHODS This study included patients treated between 1/2017 and 7/2018 for breast, colon, lung, and rectal cancers at Bumrungrad International Hospital (BIH), Thailand. Treatments selected by clinicians were paired with therapeutic options presented by the CDSS and coded to mask the origin of options presented. The panel rated the acceptability of each treatment in the pair by consensus, with acceptability defined as compliant with BIH's institutional practices. Descriptive statistics characterized the study population and treatment-decision evaluations by cancer type and stage. RESULTS Nearly 60% (187) of 313 treatment pairs for breast, lung, colon, and rectal cancers were identical or equally acceptable, with 70% (219) of WfO therapeutic options identical to, or acceptable alternatives to, BIH therapy. In 30% of cases (94), 1 or both treatment options were rated as unacceptable. Of 32 cases where both WfO and BIH options were acceptable, WfO was preferred in 18 cases and BIH in 14 cases. Colorectal cancers exhibited the highest proportion of identical or equally acceptable treatments; stage IV cancers demonstrated the lowest. CONCLUSION This study demonstrates that a system designed in the US to support, rather than replace, cancer-treating clinicians provides therapeutic options which are generally consistent with recommendations from oncologists outside the US.
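The proportions in the Results of this abstract follow from the counts it quotes. The snippet below is a small sketch of that arithmetic, assuming all three percentages use the 313 treatment pairs as the denominator; the variable names and dictionary keys are illustrative and do not come from the study.

```python
# Counts as quoted in the abstract above; labels are hypothetical.
TOTAL_PAIRS = 313

COUNTS = {
    "identical_or_equally_acceptable": 187,
    "wfo_identical_or_acceptable_alternative": 219,
    "one_or_both_unacceptable": 94,
}

# Preference split among the 32 cases where both options were acceptable.
WFO_PREFERRED = 18
BIH_PREFERRED = 14


def share(count: int, total: int = TOTAL_PAIRS) -> float:
    """Percentage of `total`, rounded to one decimal place."""
    return round(100 * count / total, 1)


SHARES = {label: share(count) for label, count in COUNTS.items()}

if __name__ == "__main__":
    for label, value in SHARES.items():
        print(f"{label}: {value}% of {TOTAL_PAIRS} pairs")
    print(f"preferred when both acceptable: WfO {WFO_PREFERRED}, BIH {BIH_PREFERRED}")
```

Rounded to whole numbers, these shares correspond to the "nearly 60%", "70%", and "30%" figures in the abstract, and the acceptable and unacceptable counts (219 and 94) add up to the 313 pairs.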
Affiliation(s)
- Harit Suwanrusme
  - Bumrungrad International Hospital, Khlong Toei Nuea, Bangkok, Thailand
- Wimolrat Decha
  - Bumrungrad International Hospital, Khlong Toei Nuea, Bangkok, Thailand
- Wisanu Boonpakdee
  - Bumrungrad International Hospital, Khlong Toei Nuea, Bangkok, Thailand
- Nittaya Thanakarn
  - Bumrungrad International Hospital, Khlong Toei Nuea, Bangkok, Thailand
- Suwei Wang
  - IBM Watson Health, Cambridge, Massachusetts, USA
- Edward Shortliffe
  - IBM Watson Health, Cambridge, Massachusetts, USA
  - Columbia University, New York, New York, USA
- Vimla L Patel
  - IBM Watson Health, Cambridge, Massachusetts, USA
  - New York Academy of Medicine, New York, New York, USA
- Yull Arriaga
  - IBM Watson Health, Cambridge, Massachusetts, USA
- Gretchen Purcell Jackson
  - IBM Watson Health, Cambridge, Massachusetts, USA
  - Vanderbilt University Medical Center, Nashville, Tennessee, USA