1
Bazoge A, Morin E, Daille B, Gourraud PA. Applying Natural Language Processing to Textual Data From Clinical Data Warehouses: Systematic Review. JMIR Med Inform 2023; 11:e42477. [PMID: 38100200 PMCID: PMC10757232 DOI: 10.2196/42477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2022] [Revised: 01/16/2023] [Accepted: 09/07/2023] [Indexed: 12/17/2023]
Abstract
BACKGROUND In recent years, health data collected during the clinical care process have been often repurposed for secondary use through clinical data warehouses (CDWs), which interconnect disparate data from different sources. A large amount of information of high clinical value is stored in unstructured text format. Natural language processing (NLP), which implements algorithms that can operate on massive unstructured textual data, has the potential to structure the data and make clinical information more accessible. OBJECTIVE The aim of this review was to provide an overview of studies applying NLP to textual data from CDWs. It focuses on identifying the (1) NLP tasks applied to data from CDWs and (2) NLP methods used to tackle these tasks. METHODS This review was performed according to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. We searched for relevant articles in 3 bibliographic databases: PubMed, Google Scholar, and ACL Anthology. We reviewed the titles and abstracts and included articles according to the following inclusion criteria: (1) focus on NLP applied to textual data from CDWs, (2) articles published between 1995 and 2021, and (3) written in English. RESULTS We identified 1353 articles, of which 194 (14.34%) met the inclusion criteria. Among all identified NLP tasks in the included papers, information extraction from clinical text (112/194, 57.7%) and the identification of patients (51/194, 26.3%) were the most frequent tasks. To address the various tasks, symbolic methods were the most common NLP methods (124/232, 53.4%), showing that some tasks can be partially achieved with classical NLP techniques, such as regular expressions or pattern matching that exploit specialized lexica, such as drug lists and terminologies. Machine learning (70/232, 30.2%) and deep learning (38/232, 16.4%) have been increasingly used in recent years, including the most recent approaches based on transformers. NLP methods were mostly applied to English language data (153/194, 78.9%). CONCLUSIONS CDWs are central to the secondary use of clinical texts for research purposes. Although the use of NLP on data from CDWs is growing, there remain challenges in this field, especially with regard to languages other than English. Clinical NLP is an effective strategy for accessing, extracting, and transforming data from CDWs. Information retrieved with NLP can assist in clinical research and have an impact on clinical practice.
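As a rough illustration of the symbolic approach the abstract describes (regular expressions exploiting a specialized lexicon such as a drug list), the sketch below matches lexicon terms in a clinical note. It is not taken from the review: the three-entry `DRUG_LEXICON`, the function `find_drug_mentions`, and the sample note are all hypothetical, and a real system would presumably load a curated terminology instead.

```python
import re

# Hypothetical mini-lexicon (assumption); stands in for a curated drug list.
DRUG_LEXICON = ["metformin", "warfarin", "amoxicillin"]

# One case-insensitive alternation; \b restricts matches to whole words.
DRUG_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(d) for d in DRUG_LEXICON) + r")\b",
    re.IGNORECASE,
)


def find_drug_mentions(text):
    """Return (normalized_drug, start, end) for each lexicon term found in text."""
    return [
        (m.group(1).lower(), m.start(), m.end())
        for m in DRUG_PATTERN.finditer(text)
    ]


# Invented example note, not real patient data.
note = "Patient started on Metformin; warfarin held before surgery."
mentions = find_drug_mentions(note)
```

Keeping the character offsets alongside the normalized term lets each extracted mention be traced back to its position in the source note.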
Affiliation(s)
- Adrien Bazoge
  - Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, F-44000 Nantes, France
  - Nantes Université, CHU de Nantes, Pôle Hospitalo-Universitaire 11: Santé Publique, Clinique des données, INSERM, CIC 1413, F-44000 Nantes, France
- Emmanuel Morin
  - Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, F-44000 Nantes, France
- Béatrice Daille
  - Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, F-44000 Nantes, France
- Pierre-Antoine Gourraud
  - Nantes Université, CHU de Nantes, Pôle Hospitalo-Universitaire 11: Santé Publique, Clinique des données, INSERM, CIC 1413, F-44000 Nantes, France
  - Nantes Université, INSERM, CHU de Nantes, École Centrale Nantes, Centre de Recherche Translationnelle en Transplantation et Immunologie, CR2TI, F-44000 Nantes, France
2
Comparative Effectiveness of Digital Breast Tomosynthesis and Mammography in Older Women. J Gen Intern Med 2022; 37:1870-1876. [PMID: 34595682 PMCID: PMC8483166 DOI: 10.1007/s11606-021-07132-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2021] [Accepted: 09/02/2021] [Indexed: 11/23/2022]
Abstract
BACKGROUND Digital breast tomosynthesis (DBT) has become a prevalent mode of breast cancer screening in recent years. Although older women are commonly screened for breast cancer, little is known about screening outcomes using DBT among older women. OBJECTIVE To assess proximal screening outcomes with DBT compared to traditional two-dimensional (2-D) mammography among women 67-74 and women 75 and older. DESIGN Cohort study. PARTICIPANTS Medicare fee-for-service beneficiaries aged 67 years and older with no history of prior cancer who received a screening mammogram in 2015. MAIN MEASURES Use of subsequent imaging (ultrasound and diagnostic mammography) as an indication of recall, breast cancer detection, and characteristics of breast cancer at the time of diagnosis. Analyses used weighted logistic regression to adjust for potential confounders. KEY RESULTS Our study included 26,406 women aged 67-74 and 17,001 women 75 and older who were screened for breast cancer. Among women 75 and older, the rate of subsequent imaging among women screened with DBT did not differ significantly from 2-D mammography (91.8 versus 97.0 per 1,000 screening mammograms, p=0.37). In this age group, DBT was associated with 2.1 additional cancers detected per 1,000 screening mammograms compared to 2-D mammography (11.5 versus 9.4, p=0.003), though these additional cancers were almost exclusively in situ and stage I invasive cancers. For women 67-74 years old, DBT was associated with a higher rate of subsequent imaging than 2-D mammography (113.9 versus 100.3, p=0.004) and a higher rate of stage I invasive cancer detection (4.7 versus 3.7, p=0.002), but not other stages. CONCLUSIONS Breast cancer screening with DBT was not associated with lower rates of subsequent imaging among older women. Most additional cancers detected with DBT were early stage. Whether detecting these additional early-stage cancers among older women improves health outcomes remains uncertain.
3
Chikarmane S. Synthetic Mammography: Review of Benefits and Drawbacks in Clinical Use. Journal of Breast Imaging 2022; 4:124-134. [PMID: 38417004 DOI: 10.1093/jbi/wbac008] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/24/2021] [Indexed: 03/01/2024]
Abstract
Digital breast tomosynthesis (DBT) has been widely adopted as a breast cancer screening tool, demonstrating decreased recall rates and other improved screening performance metrics when compared to digital mammography (DM) alone. Drawbacks of DBT when added to 2D DM include the increased radiation dose and longer examination time. Synthetic mammography (SM), a 2D reconstruction from the tomosynthesis slices, has been introduced to eliminate the need for a separate acquisition of 2D DM. Data show that the replacement of 2D DM by SM, when used with DBT, maintains the benefits of DBT, such as decreased recall rates, improved cancer detection rates, and similar positive predictive values. Key differences between SM and 2D DM include how the image is acquired, assessment of breast density, and visualization of mammographic findings, such as calcifications. Although SM is approved by the Food and Drug Administration and has been shown to be non-inferior when used with DBT, concerns surrounding SM include image quality and artifacts. The purpose of this review article is to review the benefits, drawbacks, and screening performance metrics of SM versus DBT.
Affiliation(s)
- Sona Chikarmane
  - Brigham and Women's Hospital, Department of Radiology, Boston, MA, USA
4
Alabousi M, Wadera A, Kashif Al-Ghita M, Kashef Al-Ghetaa R, Salameh JP, Pozdnyakov A, Zha N, Samoilov L, Dehmoobad Sharifabadi A, Sadeghirad B, Freitas V, McInnes MDF, Alabousi A. Performance of Digital Breast Tomosynthesis, Synthetic Mammography, and Digital Mammography in Breast Cancer Screening: A Systematic Review and Meta-Analysis. J Natl Cancer Inst 2021; 113:680-690. [PMID: 33372954 PMCID: PMC8168096 DOI: 10.1093/jnci/djaa205] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Indexed: 12/29/2022]
Abstract
BACKGROUND Our objective was to perform a systematic review and meta-analysis comparing the breast cancer detection rate (CDR), invasive CDR, recall rate, and positive predictive value 1 (PPV1) of digital mammography (DM) alone, combined digital breast tomosynthesis (DBT) and DM, combined DBT and synthetic 2-dimensional mammography (S2D), and DBT alone. METHODS MEDLINE and Embase were searched until April 2020 to identify comparative design studies reporting on patients undergoing routine breast cancer screening. Random effects model proportional meta-analyses estimated CDR, invasive CDR, recall rate, and PPV1. Meta-regression modeling was used to compare imaging modalities. All statistical tests were 2-sided. RESULTS Forty-two studies reporting on 2 606 296 patients (13 003 breast cancer cases) were included. CDR was highest in combined DBT and DM (6.36 per 1000 screened, 95% confidence interval [CI] = 5.62 to 7.14, P < .001), and combined DBT and S2D (7.40 per 1000 screened, 95% CI = 6.49 to 8.37, P < .001) compared with DM alone (4.68 per 1000 screened, 95% CI = 4.28 to 5.11). Invasive CDR was highest in combined DBT and DM (4.53 per 1000 screened, 95% CI = 3.97 to 5.12, P = .003) and combined DBT and S2D (5.68 per 1000 screened, 95% CI = 4.43 to 7.09, P < .001) compared with DM alone (3.42 per 1000 screened, 95% CI = 3.02 to 3.83). Recall rate was lowest in combined DBT and S2D (42.3 per 1000 screened, 95% CI = 37.4 to 60.4, P < .001). PPV1 was highest in combined DBT and DM (10.0%, 95% CI = 8.0% to 12.0%, P = .004), and combined DBT and S2D (16.0%, 95% CI = 10.0% to 23.0%, P < .001), whereas no difference was detected for DBT alone (7.0%, 95% CI = 6.0% to 8.0%, P = .75) compared with DM alone (7.0%, 95% CI = 5.0% to 8.0%). CONCLUSIONS Our findings provide evidence on key performance metrics for DM, DBT alone, combined DBT and DM, and combined DBT and S2D, which may inform optimal application of these modalities for breast cancer screening.
Affiliation(s)
- Mostafa Alabousi
  - Department of Radiology, McMaster University, Hamilton, ON, Canada
- Akshay Wadera
  - Department of Radiology, McMaster University, Hamilton, ON, Canada
- Rayeh Kashef Al-Ghetaa
  - Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, ON, Canada
- Alex Pozdnyakov
  - Faculty of Medicine, McMaster University, Hamilton, ON, Canada
- Nanxi Zha
  - Department of Radiology, McMaster University, Hamilton, ON, Canada
- Lucy Samoilov
  - Department of Radiology, McMaster University, Hamilton, ON, Canada
- Behnam Sadeghirad
  - Department of Health Research Methods, Evidence, and Impact (HEI), The Michael G. DeGroote Institute for Pain Research and Care, McMaster University, Hamilton, ON, Canada
- Vivianne Freitas
  - Joint Department of Medical Imaging, University of Toronto, Toronto, Ontario, Canada
- Matthew DF McInnes
  - Department of Radiology and Epidemiology, University of Ottawa, Ottawa Hospital Research Institute, Clinical Epidemiology Program, Ottawa, ON, Canada
- Abdullah Alabousi
  - Department of Radiology, McMaster University, St Joseph’s Healthcare Hamilton, Hamilton, ON, Canada
5
Gao Y, Moy L, Heller SL. Digital Breast Tomosynthesis: Update on Technology, Evidence, and Clinical Practice. Radiographics 2021; 41:321-337. [PMID: 33544665 DOI: 10.1148/rg.2021200101] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Indexed: 01/13/2023]
Abstract
Digital breast tomosynthesis (DBT) has been widely adopted in breast imaging in both screening and diagnostic settings. The benefits of DBT are well established. Compared with two-dimensional digital mammography (DM), DBT preferentially increases detection of invasive cancers without increased detection of in-situ cancers, maximizing identification of biologically significant disease, while mitigating overdiagnosis. The higher sensitivity of DBT for architectural distortion allows increased diagnosis of invasive cancers overall and particularly improves the visibility of invasive lobular cancers. Implementation of DBT has decreased the number of recalls for false-positive findings at screening, contributing to improved specificity at diagnostic evaluation. Integration of DBT in diagnostic examinations has also resulted in an increased percentage of biopsies with positive results, improving diagnostic confidence. Although individual DBT examinations have a longer interpretation time compared with that for DM, DBT has streamlined the diagnostic workflow and minimized the need for short-term follow-up examinations, redistributing much-needed time resources to screening. Yet DBT has limitations. Although improvements in cancer detection and recall rates are seen for patients in a large spectrum of age groups and breast density categories, these benefits are minimal in women with extremely dense breast tissue, and the extent of these benefits may vary by practice environment and by geographic location. Although DBT allows detection of more invasive cancers than does DM, its incremental yield is lower than that of US and MRI. Current understanding of the biologic profile of DBT-detected cancers is limited. Whether DBT improves breast cancer-specific mortality remains a key question that requires further investigation. ©RSNA, 2021.
Affiliation(s)
- Yiming Gao
  - Department of Radiology, New York University Langone Medical Center, 160 E 34th St, New York, NY 10016
- Linda Moy
  - Department of Radiology, New York University Langone Medical Center, 160 E 34th St, New York, NY 10016
- Samantha L Heller
  - Department of Radiology, New York University Langone Medical Center, 160 E 34th St, New York, NY 10016