1
Azizi S, Culp L, Freyberg J, Mustafa B, Baur S, Kornblith S, Chen T, Tomasev N, Mitrović J, Strachan P, Mahdavi SS, Wulczyn E, Babenko B, Walker M, Loh A, Chen PHC, Liu Y, Bavishi P, McKinney SM, Winkens J, Roy AG, Beaver Z, Ryan F, Krogue J, Etemadi M, Telang U, Liu Y, Peng L, Corrado GS, Webster DR, Fleet D, Hinton G, Houlsby N, Karthikesalingam A, Norouzi M, Natarajan V. Robust and data-efficient generalization of self-supervised machine learning for diagnostic imaging. Nat Biomed Eng 2023:10.1038/s41551-023-01049-7. [PMID: 37291435 DOI: 10.1038/s41551-023-01049-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 07/22/2022] [Accepted: 05/02/2023] [Indexed: 06/10/2023]
Abstract
Machine-learning models for medical tasks can match or surpass the performance of clinical experts. However, in settings differing from those of the training dataset, the performance of a model can deteriorate substantially. Here we report a representation-learning strategy for machine-learning models applied to medical-imaging tasks that mitigates such 'out of distribution' performance problem and that improves model robustness and training efficiency. The strategy, which we named REMEDIS (for 'Robust and Efficient Medical Imaging with Self-supervision'), combines large-scale supervised transfer learning on natural images and intermediate contrastive self-supervised learning on medical images and requires minimal task-specific customization. We show the utility of REMEDIS in a range of diagnostic-imaging tasks covering six imaging domains and 15 test datasets, and by simulating three realistic out-of-distribution scenarios. REMEDIS improved in-distribution diagnostic accuracies up to 11.5% with respect to strong supervised baseline models, and in out-of-distribution settings required only 1-33% of the data for retraining to match the performance of supervised models retrained using all available data. REMEDIS may accelerate the development lifecycle of machine-learning models for medical imaging.
Affiliation(s)
- Ting Chen
- Google Research, Mountain View, CA, USA
- Aaron Loh
- Google Research, Mountain View, CA, USA
- Yuan Liu
- Google Research, Mountain View, CA, USA
- Fiona Ryan
- Georgia Institute of Technology, Computer Science, Atlanta, GA, USA
- Mozziyar Etemadi
- School of Medicine/School of Engineering, Northwestern University, Chicago, IL, USA
- Yun Liu
- Google Research, Mountain View, CA, USA
- Lily Peng
- Google Research, Mountain View, CA, USA
2
Rezazade Mehrizi MH, Mol F, Peter M, Ranschaert E, Dos Santos DP, Shahidi R, Fatehi M, Dratsch T. The impact of AI suggestions on radiologists' decisions: a pilot study of explainability and attitudinal priming interventions in mammography examination. Sci Rep 2023; 13:9230. [PMID: 37286665 DOI: 10.1038/s41598-023-36435-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/01/2022] [Accepted: 06/03/2023] [Indexed: 06/09/2023] Open
Abstract
Various studies have shown that medical professionals are prone to follow the incorrect suggestions offered by algorithms, especially when they have limited inputs to interrogate and interpret such suggestions and when they have an attitude of relying on them. We examine the effect of correct and incorrect algorithmic suggestions on the diagnosis performance of radiologists when (1) they have no, partial, and extensive informational inputs for explaining the suggestions (study 1) and (2) they are primed to hold a positive, negative, ambivalent, or neutral attitude towards AI (study 2). Our analysis of 2760 decisions made by 92 radiologists conducting 15 mammography examinations shows that radiologists' diagnoses follow both incorrect and correct suggestions, despite variations in the explainability inputs and attitudinal priming interventions. We identify and explain various pathways through which radiologists navigate through the decision process and arrive at correct or incorrect decisions. Overall, the findings of both studies show the limited effect of using explainability inputs and attitudinal priming for overcoming the influence of (incorrect) algorithmic suggestions.
Affiliation(s)
- Ferdinand Mol
- Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Marcel Peter
- Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Daniel Pinto Dos Santos
- Institute of Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Ramin Shahidi
- Bushehr University of Medical Sciences, Bushehr, Iran
- Thomas Dratsch
- Institute of Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
3
Pullar-Strecker Z, Dost K, Frank E, Wicker J. Hitting the target: stopping active learning at the cost-based optimum. Mach Learn 2022. [DOI: 10.1007/s10994-022-06253-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
Abstract
Active learning allows machine learning models to be trained using fewer labels while retaining similar performance to traditional supervised learning. An active learner selects the most informative data points, requests their labels, and retrains itself. While this approach is promising, it raises the question of how to determine when the model is ‘good enough’ without the additional labels required for traditional evaluation. Previously, different stopping criteria have been proposed aiming to identify the optimal stopping point. Yet, optimality can only be expressed as a domain-dependent trade-off between accuracy and the number of labels, and no criterion is superior in all applications. As a further complication, a comparison of criteria for a particular real-world application would require practitioners to collect additional labelled data they are aiming to avoid by using active learning in the first place. This work enables practitioners to employ active learning by providing actionable recommendations for which stopping criteria are best for a given real-world scenario. We contribute the first large-scale comparison of stopping criteria for pool-based active learning, using a cost measure to quantify the accuracy/label trade-off, public implementations of all stopping criteria we evaluate, and an open-source framework for evaluating stopping criteria. Our research enables practitioners to substantially reduce labelling costs by utilizing the stopping criterion which best suits their domain.
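The select, label, retrain loop described in this abstract, together with a stopping rule and a label/error cost, can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the 1-D threshold classifier, the stabilisation rule (`tol`, `patience`), the hidden boundary at 0.6, and the `label_cost` and `error_cost` weights are all hypothetical.

```python
import random

def fit_threshold(labelled):
    """Fit a 1-D threshold classifier: midpoint between the largest
    labelled negative and the smallest labelled positive point."""
    neg = [x for x, y in labelled if y == 0]
    pos = [x for x, y in labelled if y == 1]
    lo = max(neg) if neg else 0.0
    hi = min(pos) if pos else 1.0
    return (lo + hi) / 2.0

def active_learn(pool, oracle, tol=0.01, patience=3, max_labels=100):
    """Pool-based active learning with uncertainty sampling.

    Illustrative stopping rule (hypothetical): stop once the fitted
    threshold has moved by less than `tol` for `patience` rounds in a row.
    """
    unlabelled = list(pool)
    labelled = []
    threshold = fit_threshold(labelled)
    stable = 0
    while unlabelled and len(labelled) < max_labels and stable < patience:
        # Most informative point = the one closest to the decision boundary.
        x = min(unlabelled, key=lambda p: abs(p - threshold))
        unlabelled.remove(x)
        labelled.append((x, oracle(x)))       # request the label
        new_threshold = fit_threshold(labelled)  # retrain
        stable = stable + 1 if abs(new_threshold - threshold) < tol else 0
        threshold = new_threshold
    return threshold, len(labelled)

def total_cost(threshold, n_labels, test_points, oracle,
               label_cost=1.0, error_cost=5.0):
    """Hypothetical cost: labelling effort plus misclassification penalty."""
    errors = sum(int(x >= threshold) != oracle(x) for x in test_points)
    return n_labels * label_cost + errors * error_cost

rng = random.Random(0)
pool = [rng.random() for _ in range(500)]
oracle = lambda x: int(x >= 0.6)  # hidden ground truth, assumed for the demo

threshold, n_labels = active_learn(pool, oracle)
cost = total_cost(threshold, n_labels, pool, oracle)
print(f"threshold={threshold:.3f}, labels used={n_labels}, cost={cost:.1f}")
```

Raising `patience` or lowering `tol` buys accuracy with more labels, which is the kind of trade-off a cost measure is meant to make explicit.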
4
Shahan CL, Layne GP. Advances in Breast Imaging with Current Screening Recommendations and Controversies. Obstet Gynecol Clin North Am 2022; 49:1-33. [DOI: 10.1016/j.ogc.2021.11.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
5
Leithner D, Moy L, Morris EA, Marino MA, Helbich TH, Pinker K. Abbreviated MRI of the Breast: Does It Provide Value? J Magn Reson Imaging 2018; 49:e85-e100. [PMID: 30194749 PMCID: PMC6408315 DOI: 10.1002/jmri.26291] [Citation(s) in RCA: 89] [Impact Index Per Article: 14.8] [Received: 05/23/2018] [Revised: 07/26/2018] [Accepted: 07/26/2018] [Indexed: 12/13/2022] Open
Abstract
MRI of the breast is the most sensitive test for breast cancer detection and outperforms conventional imaging with mammography, digital breast tomosynthesis, or ultrasound. However, the long scan time and relatively high costs limit its widespread use. Hence, it is currently only routinely implemented in the screening of women at an increased risk of breast cancer. To overcome these limitations, abbreviated dynamic contrast‐enhanced (DCE)‐MRI protocols have been introduced that substantially shorten image acquisition and interpretation time while maintaining a high diagnostic accuracy. Efforts to develop abbreviated MRI protocols reflect the increasing scrutiny of the disproportionate contribution of radiology to the rising overall healthcare expenditures. Healthcare policy makers are now focusing on curbing the use of advanced imaging examinations such as MRI while continuing to promote the quality and appropriateness of imaging. An important cornerstone of value‐based healthcare defines value as the patient's outcome over costs. Therefore, the concept of a fast, abbreviated MRI exam is very appealing, given its high diagnostic accuracy coupled with the possibility of a marked reduction in the cost of an MRI examination. Given recent concerns about gadolinium‐based contrast agents, unenhanced MRI techniques such as diffusion‐weighted imaging (DWI) are also being investigated for breast cancer diagnosis. Although further larger prospective studies, standardized imaging protocol, and reproducibility studies are necessary, initial results with abbreviated MRI protocols suggest that it seems feasible to offer screening breast DCE‐MRI to a broader population. This article aims to give an overview of abbreviated and fast breast MRI protocols, their utility for breast cancer detection, and their emerging role in the new value‐based healthcare paradigm that has replaced the fee‐for‐service model. Level of Evidence: 1. Technical Efficacy: Stage 2. J. Magn. Reson. Imaging 2019;49:e85–e100.
Affiliation(s)
- Doris Leithner
- Department of Radiology, Breast Imaging Service, Memorial Sloan Kettering Cancer Center, New York, New York, USA; Department of Diagnostic and Interventional Radiology, University Hospital Frankfurt, Frankfurt, Germany
- Linda Moy
- Department of Radiology, Center for Biomedical Imaging, NYU School of Medicine, New York, New York, USA
- Elizabeth A Morris
- Department of Radiology, Breast Imaging Service, Memorial Sloan Kettering Cancer Center, New York, New York, USA
- Maria A Marino
- Department of Radiology, Breast Imaging Service, Memorial Sloan Kettering Cancer Center, New York, New York, USA; Department of Biomedical Sciences and Morphologic and Functional Imaging, University of Messina, Messina, Italy
- Thomas H Helbich
- Department of Biomedical Imaging and Image-guided Therapy, Division of Molecular and Gender Imaging, Medical University Vienna, Vienna, Austria
- Katja Pinker
- Department of Radiology, Breast Imaging Service, Memorial Sloan Kettering Cancer Center, New York, New York, USA; Department of Biomedical Imaging and Image-guided Therapy, Division of Molecular and Gender Imaging, Medical University Vienna, Vienna, Austria
6
Breast cancers missed by screening radiologists can be detected by reading mammograms at a distance. Ir J Med Sci 2018; 188:289-293. [DOI: 10.1007/s11845-018-1828-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/23/2017] [Accepted: 04/25/2018] [Indexed: 10/17/2022]
7
Pham R, Forsberg D, Plecha D. Improved Screening Mammogram Workflow by Maximizing PACS Streamlining Capabilities in an Academic Breast Center. J Digit Imaging 2018; 30:133-140. [PMID: 27766443 DOI: 10.1007/s10278-016-9909-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/29/2022] Open
Abstract
The aim of this study was to perform an operational improvement project targeted at the breast imaging reading workflow of mammography examinations at an academic medical center with its associated breast centers and satellite sites. Through careful analysis of the current workflow, two major issues were identified: stockpiling of paperwork and multiple worklists. Both issues were considered to cause significant delays to the start of interpreting screening mammograms. Four workflow changes were suggested (scanning of paperwork, worklist consolidation, use of chat functionality, and tracking of case distribution among trainees) and implemented in July 2015. Timestamp data was collected 2 months before (May-Jun) and after (Aug-Sep) the implemented changes. Generalized linear models were used to analyze the data. The results showed significant improvements for the interpretation of screening mammograms. The average time elapsed for time to open a case reduced from 70 to 28 min (60 % decrease, p < 0.001), report turn-around time with preliminary signature decreased from 151 to 107 min (29 % decrease, p < 0.001), and report turn-around time final signature from 153 to 139 min (9 % decrease, p = 0.002). These improvements were achieved while keeping the efficiency of the workflow for diagnostic mammograms at large unaltered even with increased volume of mammography examinations (31 % increase of 4344 examinations for May-Jun to 5678 examinations for Aug-Sep). In conclusion, targeted efforts to improve the breast imaging reading workflow for screening mammograms in a teaching environment provided significant performance improvements without affecting the workflow of diagnostic mammograms.
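The percentage changes quoted in this abstract can be re-derived from the before and after values it reports. The short sketch below does only that arithmetic; the function name `percent_change` and the dictionary layout are illustrative choices, and the numbers are copied from the abstract.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent (negative = decrease)."""
    return (after - before) / before * 100.0

# Values quoted in the abstract: (before, after, reported percent change).
reported = {
    "time to open a case (min)": (70, 28, -60),
    "turn-around, preliminary signature (min)": (151, 107, -29),
    "turn-around, final signature (min)": (153, 139, -9),
    "examination volume (May-Jun vs. Aug-Sep)": (4344, 5678, 31),
}

for name, (before, after, claimed) in reported.items():
    computed = percent_change(before, after)
    print(f"{name}: {computed:+.1f}% (reported {claimed:+d}%)")
```

Each computed value rounds to the figure given in the abstract.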
Affiliation(s)
- Ramya Pham
- Department of Radiology, Case Western Reserve University and University Hospitals Case Medical Center, 11100 Euclid Avenue, Cleveland, OH, 44106, USA
- Daniel Forsberg
- Department of Radiology, Case Western Reserve University and University Hospitals Case Medical Center, 11100 Euclid Avenue, Cleveland, OH, 44106, USA; Sectra, Teknikringen 20, SE-583 30, Linköping, Sweden
- Donna Plecha
- Department of Radiology, Case Western Reserve University and University Hospitals Case Medical Center, 11100 Euclid Avenue, Cleveland, OH, 44106, USA
8
Abbreviated MRI Protocols: Wave of the Future for Breast Cancer Screening. AJR Am J Roentgenol 2017; 208:284-289. [DOI: 10.2214/ajr.16.17205] [Citation(s) in RCA: 45] [Impact Index Per Article: 6.4] [Indexed: 11/18/2022]
9
Jinnouchi M, Yabuuchi H, Kubo M, Tokunaga E, Yamamoto H, Honda H. Utility of adaptive control processing for the interpretation of digital mammograms. Acta Radiol 2016; 57:1297-1303. [PMID: 25995309 DOI: 10.1177/0284185115586022] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
Abstract
Background Adaptive control processing for mammography (ACM) is a novel program that automatically sets up appropriate image-processing parameters for individual mammograms (MMGs) by analyzing the focal and whole breast histogram. Purpose To investigate whether ACM improves the image contrast of digital MMGs and whether it improves radiologists' diagnostic performance in reading of MMGs. Material and Methods One hundred normal cases for image quality assessment and another 100 cases (50 normal and 50 cancers) for observer performance assessment were enrolled. All mammograms were examined with and without ACM. Five radiologists assessed the intra- and extra-mammary contrast of 100 normal MMGs, and the mean scores of the intra- and extra-mammary contrast were compared between MMGs with and without ACM in both the dense and non-dense group. They classified 100 MMGs into BI-RADS categories 1-5, and were asked to rate the images on a scale of 0 to 100 for the likelihood of the presence of category 3-5 lesions in each breast. Detectability of breast cancer, reading time, and frequency of window adjustment were compared between MMGs with and without ACM. Results ACM improved the intra-mammary contrast in both the dense and non-dense group but degraded extra-mammary contrast in the dense group. There was no significant difference in detectability of breast cancer between MMGs with and without ACM. Frequency of window adjustment without ACM was significantly higher than that with ACM. Reading time without ACM was significantly longer than that with ACM. Conclusion ACM improves the image contrast of MMGs and shortens reading time.
Affiliation(s)
- Mikako Jinnouchi
- Department of Clinical Radiology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Hidetake Yabuuchi
- Department of Health Sciences, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Makoto Kubo
- Department of Surgery and Oncology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Eriko Tokunaga
- Department of Surgery and Science, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Hidetaka Yamamoto
- Department of Anatomic Pathology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Hiroshi Honda
- Department of Clinical Radiology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
10
Consistency and efficiency of CT analysis of metastatic disease: semiautomated lesion management application within a PACS. AJR Am J Roentgenol 2013; 201:618-25. [PMID: 23971455 DOI: 10.2214/ajr.12.10136] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 11/18/2022]
Abstract
OBJECTIVE The purpose of this study was to evaluate the success, consistency, and efficiency of a semiautomated lesion management application within a PACS in the analysis of metastatic lesions in serial CT examinations of cancer patients. MATERIALS AND METHODS Two observers using baseline and follow-up CT data independently reviewed 93 target lesions (17 lung, five liver, 71 lymph node) in 50 patients with either metastatic bladder or prostate cancer. The observers measured the longest axis (or short axis for lymph nodes) of each lesion and made Response Evaluation Criteria in Solid Tumors (RECIST) determinations using manual and lesion management application methods. The times required for examination review, RECIST calculations, and data input were recorded. The Wilcoxon signed rank test was used to assess time differences, and Bland-Altman analysis was used to assess interobserver agreement within the manual and lesion management application methods. Percentage success rates were also reported. RESULTS With the lesion management application, most lung and liver lesions were semiautomatically segmented. Comparison of the lesion management application and manual methods for all lesions showed a median time saving of 45% for observer 1 (p<0.05) and 28% for observer 2 (p=0.05) on follow-up scans versus 28% for observer 1 (p<0.05) and 9% for observer 2 (p=0.087) on baseline scans. Variability of measurements showed mean percentage change differences of only 8.9% for the lesion management application versus 26.4% for manual measurements. CONCLUSION With the lesion management application method, most lung and liver lesions were successfully segmented semiautomatically; the results were more consistent between observers; and assessment of tumor size was faster than with the manual method.
11
Henderson LM, Hubbard RA, Onega TL, Zhu W, Buist DSM, Fishman P, Tosteson ANA. Assessing health care use and cost consequences of a new screening modality: the case of digital mammography. Med Care 2012; 50:1045-52. [PMID: 22922432 PMCID: PMC3650634 DOI: 10.1097/mlr.0b013e318269e0d1] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Indexed: 02/05/2023]
Abstract
BACKGROUND Full-field digital mammography (FFDM) has largely replaced screen-film mammography (SFM) for breast cancer screening, but how this affects downstream breast-related use and costs is unknown. OBJECTIVES To compare breast-related health care use and costs among Medicare beneficiaries undergoing SFM versus FFDM from 1999 to 2005. DESIGN Retrospective cohort study. SUBJECTS Medicare-enrolled women aged 66 and older with mammograms in Breast Cancer Surveillance Consortium registries. MEASURES Subsequent follow-up with additional imaging or breast biopsy within 12 months was ascertained through Medicare claims. Associated mean costs were estimated by screening modality and year, adjusting for confounding factors, and clustering within mammography facilities using Generalized Estimating Equations. RESULTS Among 138,199 women, 332,324 SFM and 22,407 FFDM mammograms were analyzed. Approximately 6.5% of SFM and 9.0% of FFDM had positive findings. In 2001, subsequent imaging was higher among FFDM versus SFM (127.5 vs. 97.4 follow-up mammography claims per 1000 index mammograms), whereas subsequent biopsy was lower among FFDM versus SFM (19.2 vs. 24.9 per 1000 index mammograms) with differences decreasing over time. From 2001 to 2004, mammography subsequent to FFDM had higher mean costs than SFM ($82.60 vs. $64.31 in 2001). The only cost differences between SFM and FFDM for ultrasound or biopsy were in 2001. CONCLUSIONS Subsequent breast-related health care use differed early in FFDM introduction, but diminished over time with differences attributable to higher recall rates for additional imaging and lower rates of biopsy in those undergoing FFDM versus SFM. Remaining cost differences are because of higher reimbursement rates for FFDM versus SFM.
Affiliation(s)
- Louise M Henderson
- Department of Radiology, School of Medicine, The University of North Carolina, Chapel Hill, NC 27599-7515, USA
12
Lopetegui M, Yen PY, Lai AM, Embi PJ, Payne PRO. Time Capture Tool (TimeCaT): development of a comprehensive application to support data capture for Time Motion Studies. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2012; 2012:596-605. [PMID: 23304332 PMCID: PMC3540552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2023]
Abstract
Time Motion Studies (TMS) have proved to be the gold standard method to measure and quantify clinical workflow, and have been widely used to assess the impact of health information systems implementation. Although there are tools available to conduct TMS, they provide different approaches for multitasking, interruptions, inter-observer reliability assessment and task taxonomy, making results across studies not comparable. We postulate that a significant contributing factor towards the standardization and spread of TMS would be the availability and spread of an accessible, scalable and dynamic tool. We present the development of a comprehensive Time Capture Tool (TimeCaT): a web application developed to support data capture for TMS. Ongoing and continuous development of TimeCaT includes the development and validation of a realistic inter-observer reliability scoring algorithm, the creation of an online clinical tasks ontology, and a novel quantitative workflow comparison method.
Affiliation(s)
- Marcelo Lopetegui
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
13
Abstract
OBJECTIVE Interpretive accuracy varies among radiologists, especially in mammography. This study examines the relationship between radiologists' confidence in their assessments and their accuracy in interpreting mammograms. MATERIALS AND METHODS In this study, 119 community radiologists interpreted 109 expert-defined screening mammography examinations in test sets and rated their confidence in their assessment for each case. They also provided a global assessment of their ability to interpret mammograms. Positive predictive value (PPV) and negative predictive value (NPV) were modeled as functions of self-rated confidence on each examination using log-linear regression estimated with generalized estimating equations. Reference measures were cancer status and expert-defined need for recall. Effect modification by weekly mammography volume was examined. RESULTS Radiologists who self-reported higher global interpretive ability tended to interpret more mammograms per week (p = 0.08), were more likely to specialize (p = 0.02) and to have completed a fellowship in breast or women's imaging (p = 0.05), and had a higher PPV for cancer detection (p = 0.01). Examinations for which low-volume radiologists were "very confident" had a PPV of 2.93 times (95% CI, 2.01-4.27) higher than examinations they rated with neutral confidence. Trends of increasing NPVs with increasing confidence were significant for low-volume radiologists relative to noncancers (p = 0.01) and expert nonrecalls (p < 0.001). A trend of significantly increasing NPVs existed for high-volume radiologists relative to expert nonrecall (p = 0.02) but not relative to noncancer status (p = 0.32). CONCLUSION Confidence in mammography assessments was associated with better accuracy, especially for low-volume readers. Asking for a second opinion when confidence in an assessment is low may increase accuracy.
14
Association between time spent interpreting, level of confidence, and accuracy of screening mammography. AJR Am J Roentgenol 2012; 198:970-8. [PMID: 22451568 DOI: 10.2214/ajr.11.6988] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Indexed: 12/24/2022]
Abstract
OBJECTIVE The objective of this study was to examine the effect of time spent viewing images and level of confidence on a screening mammography test set on interpretive performance. MATERIALS AND METHODS Radiologists from six mammography registries participated in this study and were randomized to interpret one of four test sets and complete 12 survey questions. Each test set had 109 cases of digitized four-view screening screen-film mammograms with prior comparison screening views. Viewing time for each case was defined as the cumulative time spent viewing all mammographic images before recording which visible feature, if any, was the "most significant finding." Log-linear regression fit via the generalized estimating equation was used to test the effect of viewing time and level of confidence in the interpretation on test set sensitivity and false-positive rate. RESULTS One hundred nineteen radiologists completed a test set and contributed data on 11,484 interpretations. The radiologists spent more time viewing cases that had significant findings or cases for which they had less confidence in their interpretation. Each additional minute of viewing time increased the probability of a true-positive interpretation among cancer cases by 1.12 (95% CI, 1.06-1.19; p < 0.001) regardless of confidence in the assessment. Among the radiologists who were very confident in their assessment, each additional minute of viewing time increased the adjusted risk of a false-positive interpretation among noncancer cases by 1.42 (95% CI, 1.21-1.68), and this viewing-time effect diminished with decreasing confidence. CONCLUSION Longer interpretation times and higher levels of confidence in an interpretation are both associated with higher sensitivity and false-positive rates in mammography screening.
15
Idowu MO, Hardy LB, Souers RJ, Nakhleh RE. Pathologic Diagnostic Correlation With Breast Imaging Findings: A College of American Pathologists Q-Probes Study of 48 Institutions. Arch Pathol Lab Med 2012; 136:53-60. [DOI: 10.5858/arpa.2011-0217-cp] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Indexed: 11/06/2022]
Abstract
Context.—Correlation of radiologic and pathologic findings is important for optimal management of patients with image-guided breast biopsies.
Objectives.—To (1) evaluate the rates of radiologic and pathologic correlation in breast needle core biopsies, (2) evaluate laboratory and radiology practices associated with greater correlation rates, and (3) determine the rates at which the lack of radiologic-pathologic correlation is documented in pathology reports.
Design.—The study was offered and conducted as a College of American Pathologists voluntary Q-Probes program. Participants in this study retrospectively reviewed 30 consecutive, initial, diagnostic needle core biopsy cases performed for abnormal radiologic findings. If 12 months of accessioned cases were reviewed without identifying 30 qualifying cases, participants stopped the retrospective review and included all cases identified. For each case or specimen, the participants provided detailed information about the radiologic and pathologic findings.
Results.—In aggregate, a radiologic-pathologic correlation was found in 94.9% (1328 of 1399) of the cases reviewed, based on the participants' judgments. Significant differences in the correlation rates existed when cases were discussed at an interdepartmental, multidisciplinary conference (P < .001). No significant differences were found in the correlation rates of the following: whether surgeons or radiologists performed the biopsy, whether cores with calcifications were identified by any method, and whether the laboratory had one or more designated breast pathologists.
Conclusions.—Participation in a multidisciplinary breast conference is useful in radiologic-pathologic correlation. Active involvement by pathologists in correlating pathologic and radiologic findings is important.
16
Every second counts: digital and analogue mammography - comparison of reading times at the Queen Elizabeth Breast Screening Unit, Gateshead, UK. Breast Cancer Res 2011. [PMCID: PMC3238256 DOI: 10.1186/bcr2971] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022] Open
17
Moin P, Deshpande R, Sayre J, Messer E, Gupte S, Romsdahl H, Hasegawa A, Liu BJ. An observer study for a computer-aided reading protocol (CARP) in the screening environment for digital mammography. Acad Radiol 2011; 18:1420-9. [PMID: 21971259 DOI: 10.1016/j.acra.2011.07.003] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 04/20/2011] [Revised: 07/14/2011] [Accepted: 07/14/2011] [Indexed: 11/26/2022]
Abstract
RATIONALE AND OBJECTIVES The aims of this study were to investigate improving work flow efficiency by shortening the reading time of digital mammograms using a computer-aided reading protocol (CARP) in the screening environment and to increase detection sensitivity using CARP, compared to the current protocol, commonly referred to as the quadrant view (QV). MATERIALS AND METHODS A total of 200 cases were selected for a receiver-operating characteristic (ROC) study to evaluate two image display work flows, CARP and QV, in the screening environment. A Web-based tool was developed for scoring, reporting, and statistical analysis. Cases were scored for and stratified by difficulty. A total of six radiologists of differing levels of training ranging from dedicated mammographers to senior radiology residents participated. Each was timed while interpreting the 200 cases in groups of 50, first using QV and then, after a washout period, using CARP. The data were analyzed using ROC and κ analysis. Interpretation times were also assessed. RESULTS Using QV, readers' average area under the ROC curve was 0.68 (range, 0.54-0.73). Using CARP, readers' average area under the ROC curve was 0.71 (range, 0.66-0.75). There was no statistically significant difference in reader performance using either work flow. However, there was a statistically significant reduction in the average interpretation time of negative cases from 64.7 seconds using QV to 58.8 seconds using CARP. CONCLUSIONS CARP determines the display order of regions of interest depending on computer-aided detection findings. This is a variation of traditional computer-aided detection for digital mammography that has the potential to reduce interpretation times of studies with negative findings without significantly affecting sensitivity, thus allowing improved work flow efficiency in the screening environment, in which, in most settings, the majority of cases are negative.
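For readers unfamiliar with the area under the ROC curve reported in this abstract, the sketch below computes it as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The scores are made-up illustrative values, not data from the study, and the function name `roc_auc` is a hypothetical helper.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison
    (equivalent to the Mann-Whitney U statistic divided by n_pos * n_neg)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical reader suspicion scores (0-100) for illustration only.
cancer_cases = [80, 65, 90, 40]
normal_cases = [30, 65, 20, 50, 10]

auc = roc_auc(cancer_cases, normal_cases)
print(f"AUC = {auc:.3f}")  # 17.5 of 20 pairs are ordered correctly -> 0.875
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives context for reader averages such as 0.68 and 0.71.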
|
18
|
Detection of microcalcifications on digital screening mammograms using varying degrees of monitor zooming. AJR Am J Roentgenol 2011; 197:W761-8. [PMID: 21940549 DOI: 10.2214/ajr.10.5238] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/18/2022]
Abstract
OBJECTIVE The American College of Radiology recommends that mammogram images be viewed at 100% resolution (also called one-to-one or full resolution). We tested the effect of this and three other levels of zooming on the ability of radiologists to identify malignant calcifications on screening mammographic views. MATERIALS AND METHODS Seven breast imagers viewed 77 mammographic images, 32 with and 45 without malignant microcalcifications, using four different degrees of monitor zooming. The readers indicated whether they thought a cluster of potentially malignant calcifications was present and where the cluster was located. Tested degrees of zooming included fit screen, a size midway between fit screen and 100%, 100%, and a size slightly larger than 100%. RESULTS Readers failed to detect 17 clusters of malignant calcifications with fit-screen images, 12 clusters with midway images, 13 clusters with 100% images, and 11 clusters with slightly larger images. When viewing images without malignant microcalcifications, the readers marked false-positive areas on 25 images using fit-screen images, 43 of the midway images, 40 of the 100% images, and 29 of the slightly larger images. CONCLUSION All four tested levels of zooming functioned well. There was a trend for the fit-screen images to function slightly less well than the others with regard to sensitivity, so it may not be prudent to rely on those images without other levels of zooming. The 100% resolution images did not function noticeably better than the others.
|
19
|
Erickson BJ, Wood CP, Kaufmann TJ, Patriarche JW, Mandrekar J. Optimal presentation modes for detecting brain tumor progression. AJNR Am J Neuroradiol 2011; 32:1652-7. [PMID: 21852368 DOI: 10.3174/ajnr.a2596] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Indexed: 11/07/2022]
Abstract
BACKGROUND AND PURPOSE A common task in radiology interpretation is visual comparison of images. The purpose of this study was to compare traditional side-by-side and in-place (flicker) image presentation modes with advanced methods for detecting primary brain tumors on MR imaging. MATERIALS AND METHODS We identified 66 patients with gliomas and 3 consecutive brain MR imaging examinations (a "triplet"). A display application that presented images in side-by-side mode with or without flicker display as well as display of image subtraction or automated change detection information (also with and without flicker display) was used by 3 board-certified neuroradiologists. They identified regions of brain tumor progression by using this display application. Each case was reviewed using all modes (side-by-side presentation with and without flicker, subtraction with and without flicker, and change detection with and without flicker), with results compared via a panel rating. RESULTS Automated change detection with or without flicker (P < .0027) as well as subtraction with or without flicker (P < .0027) were more sensitive to tumor progression than side-by-side presentation in cases where all 3 raters agreed. Change detection afforded the highest interrater agreement, followed by subtraction. Clinically determined time to progression was longer for cases rated as nonprogressing by using subtraction images and change-detection images both with and without flicker display mode compared with side-by-side presentation. CONCLUSIONS Automated change detection and image subtraction, with and without flicker display mode, are superior to side-by-side image comparison.
Affiliation(s)
- B J Erickson
- Department of Radiology, Mayo Clinic, Rochester, Minnesota, USA.
|
20
|
Full-Field Digital Mammographic Interpretation With Prior Analog Versus Prior Digitized Analog Mammography: Time for Interpretation. AJR Am J Roentgenol 2011; 196:1436-8. [DOI: 10.2214/ajr.10.5430] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022]
|
21
|
Guerriero C, Gillan MGC, Cairns J, Wallis MG, Gilbert FJ. Is computer aided detection (CAD) cost effective in screening mammography? A model based on the CADET II study. BMC Health Serv Res 2011; 11:11. [PMID: 21241473 PMCID: PMC3032650 DOI: 10.1186/1472-6963-11-11] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Received: 01/15/2010] [Accepted: 01/17/2011] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Single reading with computer aided detection (CAD) is an alternative to double reading for detecting cancer in screening mammograms. The aim of this study is to investigate whether the use of a single reader with CAD is more cost-effective than double reading. METHODS Based on data from the CADET II study, the cost-effectiveness of single reading with CAD versus double reading was measured in terms of cost per cancer detected. Cost (Pound (£), year 2007/08) of single reading with CAD versus double reading was estimated assuming a health and social service perspective and a 7 year time horizon. As the equipment cost varies according to the unit size, a separate analysis was conducted for high, average and low volume screening units. One-way sensitivity analyses were performed by varying the reading time, equipment and assessment cost, recall rate and reader qualification. RESULTS CAD is cost increasing for all sizes of screening unit. The introduction of CAD is cost-increasing compared to double reading because the cost of CAD equipment, staff training and the higher assessment cost associated with CAD are greater than the saving in reading costs. The introduction of single reading with CAD, in place of double reading, would produce an additional cost of £227 and £253 per 1,000 women screened in high and average volume units respectively. In low volume screening units, the high cost of purchasing the equipment will result in an additional cost of £590 per 1,000 women screened. One-way sensitivity analysis showed that the factors having the greatest effect on the cost-effectiveness of CAD with single reading compared with double reading were the reading time and the reader's professional qualification (radiologist versus advanced practitioner). CONCLUSIONS Without improvements in CAD effectiveness (e.g. a decrease in the recall rate) CAD is unlikely to be a cost effective alternative to double reading for mammography screening in the UK.
This study provides updated estimates of CAD costs in a full-field digital system and assessment cost for women who are re-called after initial screening. However, the model is highly sensitive to various parameters e.g. reading time, reader qualification, and equipment cost.
Affiliation(s)
- Carla Guerriero
- Health Service Research and Policy Department, London School of Hygiene and Tropical Medicine, London, UK.
|
22
|
Comparison of image acquisition and radiologist interpretation times in a diagnostic mammography center. Acad Radiol 2010; 17:1168-74. [PMID: 20646940 DOI: 10.1016/j.acra.2010.04.018] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 03/31/2010] [Revised: 04/28/2010] [Accepted: 04/29/2010] [Indexed: 11/22/2022]
Abstract
RATIONALE AND OBJECTIVES The purpose of this study was to determine the acquisition and interpretation times of screen-film mammography and soft-copy digital mammography in a diagnostic mammography center. MATERIALS AND METHODS The study was conducted in three phases for patients presenting for clinical diagnostic workup to a mammography clinic. In the first phase, technologist acquisition and processing times and radiologist interpretation time were measured for patients imaged with a screen-film mammographic system. During the second phase of the study, times were measured for patients imaged with a direct radiographic digital mammographic system, with interpretation performed on a soft-copy display system. During the third phase, 3 months after installation of the soft-copy display system, times were measured again for patients imaged on the same direct radiographic digital mammographic system, with interpretation with the same soft-copy system. The same four experienced breast imaging radiologists and seven technologists participated in all phases of the study. All data were entered into a database, and statistical analysis was conducted using weighted linear models and logarithmic transformation. RESULTS Times were obtained for 295 patients. There were 100 patients each for phases 1 and 2 and 95 patients for phase 3. Diagnostic mammographic acquisition times with processing were 13.02 min/case for screen film (phase 1), 8.16 min/case for digital (phase 2), and 10.66 min/case for digital (phase 3) (P < .001 and P < .0001, respectively). In addition, the radiologist interpretation time for digital mammography in both phases was not significantly different from that for film mammography (P = .2853 and P = .2893, respectively). There was no significant difference between phases 2 and 3 (P = 1.0000). The mean interpretation times were 3.75 min/case for screen film, 2.14 min/case for digital (phase 2), and 2.26 min/case for digital (phase 3). 
CONCLUSIONS Digital mammography significantly shortened the acquisition time for diagnostic mammography. There was no significant difference in interpretation time compared to screen-film mammography in a diagnostic mammography setting.
|
23
|
Diagnostic digital mammography in Japan: issues to consider. Breast Cancer 2010; 17:180-2. [DOI: 10.1007/s12282-009-0196-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2009] [Accepted: 12/21/2009] [Indexed: 10/20/2022]
|
24
|
Reed W, Poulos A, Rickard M, Brennan P. Reader practice in mammography screen reporting in Australia. J Med Imaging Radiat Oncol 2010; 53:530-7. [PMID: 20002284 DOI: 10.1111/j.1754-9485.2009.02119.x] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
Abstract
Reader variability is a problem in mammography image reporting and compromises the efficacy of screening programmes. The purpose of this exploratory study was to survey reader practice in reporting screening mammograms in Australia to identify aspects of practice that warrant further investigation. Mammography reporting practice and influences on concentration and attention were investigated by using an original questionnaire distributed to screen readers in Australia. A response rate of 71% (83 out of 117) was achieved. Demographic data indicated that the majority of readers were over 46 years of age (73%), have been reporting on screening mammograms for over 10 years (61%), take less than 1 min to report upon a screening mammogram examination (66%), report up to 200 examinations in a single session (83%) and take up to 2 h to report one session (61%). A majority report on more than 5000 examinations annually (66%); 93% of participants regard their search strategy as systematic, 87% agreed that their concentration can vary throughout a session, 64% agreed that the relatively low number of positives can lead to lapses in concentration and attention and almost all (94%) participants agreed that methods to maximise concentration should be explored. Participants identified a range of influences on concentration within their working environment including volume of images reported in one session, image types and aspects of the physical environment. This study has provided important evidence of the need to investigate adverse influences on concentration during mammography screen reporting.
Affiliation(s)
- W Reed
- Discipline of Medical Radiation Sciences, Faculty of Health Sciences, The University of Sydney, Sydney, New South Wales, Australia.
|
25
|
Haygood TM, Arribas E, Brennan PC, Atkinson EN, Herndon M, Dieber J, Geiser W, Santiago L, Mills CM, Davis P, Adrada B, Carkaci S, Stephens TW, Whitman GJ. Conspicuity of microcalcifications on digital screening mammograms using varying degrees of monitor zooming. Acad Radiol 2009; 16:1509-17. [PMID: 19896068 DOI: 10.1016/j.acra.2009.07.010] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 05/21/2009] [Revised: 07/10/2009] [Accepted: 06/13/2009] [Indexed: 11/27/2022]
Abstract
RATIONALE AND OBJECTIVES American College of Radiology guidelines suggest that digital screening mammographic images should be viewed at the full resolution at which they were acquired. This slows interpretation speed. The aim of this study was to examine the effect of various levels of zooming on the detection and conspicuity of microcalcifications. MATERIALS AND METHODS Six radiologists viewed 40 mammographic images five times in different random orders using five different levels of zooming: full resolution (100%) and 30%, 61%, 88%, and 126% of that size. Thirty-three images contained microcalcifications varying in subtlety, all associated with breast cancer. The clusters were circled. Seven images contained no malignant calcifications but also had randomly placed circles. The radiologists graded the presence or absence and visual conspicuity of any calcifications compared to calcifications in a reference image. They also counted the microcalcifications. RESULTS The radiologists saw the microcalcifications in 94% of the images at 30% size and in either 99% or 100% of the other tested levels of zooming. Conspicuity ratings were worst for the 30% size and fairly similar for the others. Using the 30% size, two radiologists failed to see the microcalcifications on either the craniocaudal or mediolateral oblique view taken from one patient. Interobserver agreement regarding the number of calcifications was lowest for the 30% images and second lowest for the 100% images. CONCLUSIONS Images at 30% size should not be relied on alone for systematic scanning for microcalcifications. The other four levels of magnification all performed well enough to warrant further testing.
|
26
|
Sadaf A, Crystal P, Scaranelo A, Helbich T. Performance of computer-aided detection applied to full-field digital mammography in detection of breast cancers. Eur J Radiol 2009; 77:457-61. [PMID: 19875260 DOI: 10.1016/j.ejrad.2009.08.024] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.1] [Received: 06/25/2009] [Revised: 08/26/2009] [Accepted: 08/26/2009] [Indexed: 11/19/2022]
Abstract
OBJECTIVE The aim of this retrospective study was to evaluate the performance of computer-aided detection (CAD) with full-field digital mammography (FFDM) in the detection of breast cancers. MATERIALS AND METHODS CAD was retrospectively applied to standard mammographic views of 127 cases with biopsy-proven breast cancers detected with FFDM (Senographe 2000, GE Medical Systems). CAD sensitivity was assessed in the total group of 127 cases and for subgroups based on breast density, mammographic lesion type, mammographic lesion size, histopathology and mode of presentation. RESULTS Overall CAD sensitivity was 91% (115 of 127 cases). There was no statistically significant difference (p > 0.1) in CAD detection of cancers in dense breasts, 90% (53/59), versus non-dense breasts, 91% (62/68). There was a statistically significant difference (p < 0.05) in CAD detection of cancers that appeared mammographically as microcalcifications only versus other mammographic manifestations. CAD detected 100% (44/44) of cancers manifesting as microcalcifications, 89% (47/53) as noncalcified masses or asymmetries, 88% (14/16) as masses with associated calcifications, and 71% (10/14) as architectural distortions. CAD sensitivity for cancers 1-10 mm was 84% (38/45); 11-20 mm, 93% (55/59); and >20 mm, 97% (22/23). CONCLUSION CAD applied to FFDM showed 100% sensitivity in identifying cancers manifesting as microcalcifications only and a high sensitivity of 86% (71/83) for other mammographic appearances of cancer. Sensitivity is influenced by lesion size. CAD in FFDM is an adjunct that helps radiologists in the early detection of breast cancers.
Affiliation(s)
- Arifa Sadaf
- Department of Medical Imaging, Mount Sinai Hospital, Toronto, Ontario, Canada M5G 1X5.
|
27
|
Haygood TM, Wang J, Lane D, Galvan E, Atkinson EN, Stephens T, Whitman GJ. Why does it take longer to read digital than film-screen screening mammograms? A partial explanation. J Digit Imaging 2009; 23:170-80. [PMID: 19214635 DOI: 10.1007/s10278-009-9177-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 04/29/2008] [Revised: 11/21/2008] [Accepted: 01/04/2009] [Indexed: 11/26/2022] Open
Abstract
Digital screening mammograms (DM) take longer to interpret than film-screen screening mammograms (FSM). We evaluated what part of the process takes longer in our reading environment. We selected cases from those for which timed readings had been performed as part of a previous study. Readers were timed as they performed various computer manipulations on groups of DM cases and as they moved the alternator and adjusted lighting and manual shutters for FSM cases. Subtracting manipulation time from the original interpretation times yielded estimated times to reach a decision. Manipulation times for DM ranged from a low of 11 s, when four-view DM were simply opened and closed in a 4-on-1 hanging protocol before moving on to the next study, to 113.8 s, when each view of six-view DM was brought up 1-on-1, enlarged to 100% resolution, and panned through. Manipulation times for groups of FSM ranged from 8.3 to 12.1 s. Estimated decision-making times for DM ranged from 128.0 to 202.2 s, while estimated decision-making times for FSM ranged from 60.9 to 146.3 s. Computer manipulation time partially explains the discrepancy in interpretation times between DM and FSM. Radiologists also appear to spend more time looking at DM than at FSM before making a decision.
Affiliation(s)
- Tamara Miner Haygood
- Department of Diagnostic Radiology, Unit 1273, The University of Texas M. D. Anderson Cancer Center, 1515 Holcombe Boulevard, Houston, TX 77030-4009, USA.
|