1
Murray E, Velleman S, Preston JL, Heard R, Shibu A, McCabe P. The Reliability of Expert Diagnosis of Childhood Apraxia of Speech. Journal of Speech, Language, and Hearing Research: JSLHR 2024; 67:3309-3326. [PMID: 37642523 DOI: 10.1044/2023_jslhr-22-00677] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/31/2023]
Abstract
PURPOSE The current standard for clinical diagnosis of childhood apraxia of speech (CAS) is expert clinician judgment. The psychometric properties of this standard are not well understood; however, they are important for improving clinical diagnosis. The purpose of this study is to determine the extent to which experts agree on the clinical diagnosis of CAS using two cohorts of children with mixed speech sound disorders (SSDs). METHOD Speech samples of children with SSDs were obtained from previous and ongoing research from video recordings of children aged 3-8 years (n = 36) and audio recordings of children aged 8-17 years (n = 56). A total of 23 expert, English-speaking clinicians were recruited internationally. Three of these experts rated each speech sample to provide a description of the observed features and a diagnosis. Intrarater reliability was acceptable at 85% agreement. RESULTS Interrater reliability on the presence or absence of CAS among experts was poor both as a categorical diagnosis (κ = .187, 95% confidence interval [CI] [0.089, 0.286]) and on a continuous "likelihood of CAS" scale (0-100; intraclass correlation = .183, 95% CI [.037, .347]). Reliability was similar across the video-recorded and audio-only samples. There was greater agreement on other diagnoses (such as articulation disorder) than on the diagnosis of CAS, although these too did not meet the predetermined standard. Likelihood of CAS was greater in children who presented with more American Speech-Language-Hearing Association CAS consensus features. CONCLUSIONS Different expert raters had different thresholds for applying the diagnosis of CAS. If expert clinician judgment is to be used for diagnosis of CAS or other SSDs, further standardization and calibration is needed to increase interrater reliability. Diagnosis may require operationalized checklists or reliable measures that operate along a diagnostic continuum. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.23949105.
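The categorical agreement statistic reported above can be illustrated with a small sketch. The abstract does not say which kappa variant was used; this example assumes Fleiss' kappa, a common choice when every sample is rated by the same number of raters (three here). The function name and the rating counts are made up for illustration and are not the study's data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of rating counts.

    counts[i][j] is the number of raters who assigned sample i to category j.
    Every sample must be rated by the same number of raters (at least two).
    """
    n_samples = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each sample needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Observed agreement per sample, averaged over all samples.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_samples

    # Chance agreement derived from the overall category proportions.
    totals = [sum(row[j] for row in counts) for j in range(n_categories)]
    p_e = sum((t / (n_samples * n_raters)) ** 2 for t in totals)

    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_bar - p_e) / (1.0 - p_e)


# Hypothetical data: columns are [CAS, not CAS], three raters per sample.
ratings = [[3, 0], [2, 1], [1, 2], [0, 3], [2, 1], [1, 2]]
kappa = fleiss_kappa(ratings)
```

Values near 0 indicate agreement close to chance and values near 1 indicate near-perfect agreement, which is why a kappa below 0.2 is described as poor.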
Affiliation(s)
- Elizabeth Murray
  - The University of Sydney, New South Wales, Australia
  - Remarkable Speech + Movement, Sydney, New South Wales, Australia
- Robert Heard
  - The University of Sydney, New South Wales, Australia
- Akhila Shibu
  - The University of Sydney, New South Wales, Australia
2
Mackenzie A, Lewis E, Loveland J. Successes and challenges in extracting information from DICOM image databases for audit and research. Br J Radiol 2023; 96:20230104. [PMID: 37698251 PMCID: PMC10607388 DOI: 10.1259/bjr.20230104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2023] [Revised: 05/05/2023] [Accepted: 05/11/2023] [Indexed: 09/13/2023] Open
Abstract
In radiography, much valuable associated data (metadata) is generated during image acquisition. The current setup of picture archive and communication systems (PACS) can make extraction of this metadata difficult, especially as it is typically stored with the image. The aim of this work is to examine the current challenges in extracting image metadata and to discuss the potential benefits of using this rich information. This work focuses on breast screening, though the conclusions are applicable to other modalities.

The data stored in PACS contain information that is currently underutilised and is of great benefit for auditing and improving imaging and radiographic practice. From the literature, we present examples of the potential clinical benefit, such as audits of dose and radiographic practice, as well as more advanced research highlighting the effects of radiographic practice, e.g. cancer detection rates affected by imaging technology.

This review considers the challenges in extracting data, namely:
- The search tools for data on most PACS are inadequate, being both time-consuming and limited in the elements that can be searched
- Security and information governance considerations
- Anonymisation of data if required
- Data curation

The review describes some solutions that have been successfully implemented:
- Retrospective extraction: direct query on PACS
- Extracting data prospectively
- Use of structured reports
- Use of trusted research environments

Ultimately, the data access process will be made easier by inclusion during PACS procurement. Auditing data from PACS can be used to improve quality of imaging and workflow, all of which will be a clinical benefit to patients.
Affiliation(s)
- John Loveland
  - NCCPM, Royal Surrey NHS Foundation Trust, Guildford, United Kingdom
3
Siviengphanom S, Gandomkar Z, Lewis SJ, Brennan PC. Global Radiomic Features from Mammography for Predicting Difficult-To-Interpret Normal Cases. J Digit Imaging 2023; 36:1541-1552. [PMID: 37253894 PMCID: PMC10406750 DOI: 10.1007/s10278-023-00836-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2022] [Revised: 04/05/2023] [Accepted: 04/13/2023] [Indexed: 06/01/2023] Open
Abstract
This work aimed to investigate whether global radiomic features (GRFs) from mammograms can predict difficult-to-interpret normal cases (NCs). Assessments from 537 readers interpreting 239 normal mammograms were used to categorise cases as 120 difficult-to-interpret and 119 easy-to-interpret based on cases having the highest and lowest difficulty scores, respectively. Using lattice- and squared-based approaches, 34 handcrafted GRFs per image were extracted and normalised. Three classifiers were constructed: (i) CC and (ii) MLO using the GRFs from corresponding craniocaudal and mediolateral oblique images only, based on the random forest technique for distinguishing difficult- from easy-to-interpret NCs, and (iii) CC + MLO using the median predictive scores from both CC and MLO models. Useful GRFs for the CC and MLO models were recognised using a scree test. The CC and MLO models were trained and validated using leave-one-out cross-validation. The models' performances were assessed by the AUC and compared using the DeLong test. A Kruskal-Wallis test was used to examine if the 34 GRFs differed between difficult- and easy-to-interpret NCs and if difficulty level based on the traditional breast density (BD) categories differed among 115 low-BD and 124 high-BD NCs. The CC + MLO model achieved higher performance (0.71 AUC) than the individual CC and MLO models alone (0.66 each), but the differences were not statistically significant (all p > 0.05). Six GRFs were identified to be valuable in describing difficult-to-interpret NCs. Twenty features, when compared between difficult- and easy-to-interpret NCs, differed significantly (p < 0.05). No statistically significant difference was observed in difficulty between low- and high-BD NCs (p = 0.709). GRF mammographic analysis can predict difficult-to-interpret NCs.
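Two steps described above, combining the per-view scores and scoring a model by AUC, can be sketched as follows. The function names and example scores are hypothetical, and the rank-based (Mann-Whitney) AUC estimate shown here is a standard formulation rather than the authors' implementation. Note that with only two views, the median of the CC and MLO scores is simply their mean.

```python
from statistics import median


def combine_views(cc_scores, mlo_scores):
    """Per-case combined score: the median of the CC and MLO model scores."""
    return [median(pair) for pair in zip(cc_scores, mlo_scores)]


def auc(positive_scores, negative_scores):
    """Probability that a random positive case outscores a random negative one.

    Ties count as half a win, which matches the Mann-Whitney U estimate.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical predictive scores (higher = more likely difficult to interpret).
difficult = combine_views([0.8, 0.6, 0.4], [0.6, 0.7, 0.5])
easy = combine_views([0.3, 0.5, 0.2], [0.4, 0.4, 0.3])
combined_auc = auc(difficult, easy)
```

An AUC of 0.5 corresponds to chance-level ranking, so values of 0.66 to 0.71 indicate modest discrimination between difficult and easy cases.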
Affiliation(s)
- Somphone Siviengphanom
  - Medical Image Optimisation and Perception Group, Discipline of Medical Imaging Science, Sydney School of Health Sciences, Faculty of Medicine and Health, the University of Sydney, Sydney, NSW, 2006, Australia
- Ziba Gandomkar
  - Medical Image Optimisation and Perception Group, Discipline of Medical Imaging Science, Sydney School of Health Sciences, Faculty of Medicine and Health, the University of Sydney, Sydney, NSW, 2006, Australia
- Sarah J Lewis
  - Medical Image Optimisation and Perception Group, Discipline of Medical Imaging Science, Sydney School of Health Sciences, Faculty of Medicine and Health, the University of Sydney, Sydney, NSW, 2006, Australia
- Patrick C Brennan
  - Medical Image Optimisation and Perception Group, Discipline of Medical Imaging Science, Sydney School of Health Sciences, Faculty of Medicine and Health, the University of Sydney, Sydney, NSW, 2006, Australia
4
Alshabibi AS, Suleiman ME, Albeshan SM, Heard R, Brennan PC. Variations in breast cancer detection rates during mammogram-reading sessions: does experience have an impact? Br J Radiol 2022; 95:20210895. [PMID: 34735290 PMCID: PMC8722243 DOI: 10.1259/bjr.20210895] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2023] Open
Abstract
OBJECTIVES To examine whether radiologists' performances are consistent throughout a reading session and whether any changes in performance over the reading task differ depending on experience of the reader. METHODS The performance of ten radiologists reading a test set of 60 mammographic cases without breaks was assessed using an ANOVA, 2 × 3 factorial design. Participants were categorized as more (≥2,000 mammogram readings per year) or less (<2,000 readings per year) experienced. Three series of 20 cases were chosen to ensure comparable difficulty and presented in the same sequence to all readers. It usually takes around 30 min for a radiologist to complete each of the 20-case series, resulting in a total of 90 min for the 60 mammographic cases. The sensitivity, specificity, lesion sensitivity, and area under the ROC curve were calculated for each series. We hypothesized that the order in which a series was read (i.e. fixed-series sequence) would have a significant main effect on the participants' performance. We also determined if significant interactions exist between the fixed-series sequence and radiologist experience. RESULTS Significant linear interactions were found between experience and the fixed sequence of the series for sensitivity (F[1] = 5.762, p = .04, partial η2 = .41) and lesion sensitivity (F[1] = 6.993, p = .03, partial η2 = .46). The two groups' mean scores were similar for the first series but progressively diverged. By the end of the third series, significant differences in sensitivity and lesion sensitivity were evident, with the more experienced individuals demonstrating improving and the less experienced declining performance. Neither experience nor series sequence significantly affected the specificity or the area under the ROC curve.
CONCLUSIONS Radiologists' performance may change considerably during a reading session, apparently as a function of experience, with less experienced radiologists declining in sensitivity and lesion sensitivity while more experienced radiologists actually improve. With the increasing demands on radiologists to undertake high-volume reporting, we suggest that junior radiologists be made aware of possible sensitivity and lesion sensitivity deterioration over time so they can schedule breaks during continuous reading sessions that are appropriate to them, rather than try to emulate their more experienced colleagues. ADVANCES IN KNOWLEDGE Less-experienced radiologists demonstrated a reduction in mammographic diagnostic accuracy in later stages of the reporting sessions. This may suggest that extending the duration of reporting sessions to compensate for increasing workloads may not represent the optimal solution for less-experienced radiologists.
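The case-level sensitivity and specificity calculated for each series can be sketched as below. The data layout (one boolean per case for ground truth and for the reader's recall decision) and the function name are assumptions for illustration. Lesion sensitivity and the area under the ROC curve additionally need lesion locations and confidence ratings, so they are left out of this sketch.

```python
def sensitivity_specificity(has_cancer, recalled):
    """Return (sensitivity, specificity) for one reader on one series.

    has_cancer[i] is True when case i contains a cancer (ground truth).
    recalled[i] is True when the reader flagged case i as abnormal.
    """
    if len(has_cancer) != len(recalled):
        raise ValueError("truth and decisions must cover the same cases")
    pairs = list(zip(has_cancer, recalled))
    true_pos = sum(1 for truth, call in pairs if truth and call)
    false_neg = sum(1 for truth, call in pairs if truth and not call)
    true_neg = sum(1 for truth, call in pairs if not truth and not call)
    false_pos = sum(1 for truth, call in pairs if not truth and call)
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("series needs at least one cancer and one normal case")
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical five-case series: two cancers, three normal cases.
truth = [True, True, False, False, False]
calls = [True, False, False, False, True]
sens, spec = sensitivity_specificity(truth, calls)
```

Computing the pair separately for each of the three series, per reader, gives the repeated measures that a sequence-by-experience analysis would compare.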
Affiliation(s)
- Moayyad E Suleiman
  - The Medical Image Optimisation and Perception Group (MIOPeG), Faculty of Medicine and Health, The University of Sydney, Susan Wakil Health Building, Camperdown, Australia
- Salman M Albeshan
  - Department of Radiology and Medical Imaging, The College of Applied Medical Sciences of King Saud University, Riyadh, Saudi Arabia
- Robert Heard
  - The Medical Image Optimisation and Perception Group (MIOPeG), Faculty of Medicine and Health, The University of Sydney, Susan Wakil Health Building, Camperdown, Australia
- Patrick C Brennan
  - The Medical Image Optimisation and Perception Group (MIOPeG), Faculty of Medicine and Health, The University of Sydney, Susan Wakil Health Building, Camperdown, Australia
5
Chen Y, James JJ, Michalopoulou E, Darker IT, Jenkins J. The relationship between missed breast cancers on mammography in a test-set based assessment scheme and real-life performance in a National Breast Screening Programme. Eur J Radiol 2021; 142:109881. [PMID: 34352657 DOI: 10.1016/j.ejrad.2021.109881] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2021] [Revised: 07/16/2021] [Accepted: 07/22/2021] [Indexed: 11/30/2022]
Abstract
PURPOSE This retrospective study determined whether a test-set based assessment scheme (PERFORMS) used in a national breast screening programme could be used to predict real-life performance by investigating if the number of cancers missed by mammography readers in real-life related to the number of cancers missed in the PERFORMS test-set and whether real-life reading volumes affected performance. METHOD Data was obtained from consenting readers in the screening programme in England (NHSBSP) where double reading is standard. The rate of cancers missed by individual first readers but correctly identified by second readers was compared with the number of cancers missed in the PERFORMS test-set over a 3-year period. NHSBSP readers are required to interpret at least 1500 cases per year as a first reader, so results were compared between readers who exceeded this target and those that did not. Parametric and non-parametric correlations were calculated. RESULTS Amongst the 536 readers, there was a highly significant positive correlation between the real-life and PERFORMS test-set missed cancer metrics (Pearson Correlation = 0.228, n = 536, p < .0001, Spearman's rho = 0.265, n = 536, p < .0001). There was no significant difference in rates of missed cancers between the 452 readers who exceeded the 1500 first read per year target and those who did not (t(94.2) = -1.87, p = .0643, r = 0.19). CONCLUSIONS The use of a test-set based assessment scheme accurately reflects real-life mammography reading performance, indicating that it can be a useful tool in identifying poor reader performance.
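The parametric and non-parametric correlations mentioned above can be sketched in plain Python. The function names and the per-reader numbers are hypothetical; Spearman's rho is computed here as the Pearson correlation of average ranks, which is the standard definition when ties are present.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-reader values: real-life missed-cancer rate (per 1,000
# first reads) and number of cancers missed in a test set.
real_life_missed = [0.4, 0.9, 0.6, 1.2, 0.5]
test_set_missed = [2, 5, 2, 4, 3]
r = pearson(real_life_missed, test_set_missed)
rho = spearman(real_life_missed, test_set_missed)
```

With a sample as large as 536 readers, even a weak correlation of around 0.2 to 0.3 can be highly significant, so the p-value and the strength of the association are worth reading separately.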
Affiliation(s)
- Yan Chen
  - University of Nottingham, School of Medicine, Division of Cancer and Stem Cells, City Hospital Campus, Hucknall Road, Nottingham NG5 1PB, United Kingdom
- Jonathan J James
  - Nottingham University Hospitals NHS Trust, Nottingham Breast Institute, City Hospital Campus, Hucknall Road, Nottingham NG5 1PB, United Kingdom
- Eleni Michalopoulou
  - University of Nottingham, School of Medicine, Division of Cancer and Stem Cells, City Hospital Campus, Hucknall Road, Nottingham NG5 1PB, United Kingdom
- Iain T Darker
  - University of Nottingham, School of Medicine, Division of Cancer and Stem Cells, City Hospital Campus, Hucknall Road, Nottingham NG5 1PB, United Kingdom
- Jacquie Jenkins
  - Public Health England, Vulcan House Steel, 6 Millsands, Sheffield S3 8NH, United Kingdom
6
Hadadi I, Rae W, Clarke J, McEntee M, Ekpo E. Breast cancer detection: Comparison of digital mammography and digital breast tomosynthesis across non-dense and dense breasts. Radiography (Lond) 2021; 27:1027-1032. [PMID: 33906803 DOI: 10.1016/j.radi.2021.04.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/16/2021] [Revised: 03/24/2021] [Accepted: 04/07/2021] [Indexed: 10/21/2022]
Abstract
INTRODUCTION Breast density is associated with an increase in breast cancer risk and limits early detection of the disease. This study assesses the diagnostic performance of mammogram readers in digital mammography (DM) and digital breast tomosynthesis (DBT). METHODS Eleven breast readers with 1-39 years of experience reading mammograms and 0-4 years of experience reading DBT participated in the study. All readers independently interpreted 60 DM cases (40 normal/20 abnormal) and 35 DBT cases (20 normal/15 abnormal). Sensitivity, specificity, ROC AUC, and diagnostic confidence were calculated and compared between DM and DBT. RESULTS DBT significantly improved diagnostic confidence in both dense breasts (p = 0.03) and non-dense breasts (p = 0.003) but not in other diagnostic performance metrics. Specificity was higher in DM for readers with >7 years' experience (p = 0.03) in reading mammography, non-radiologists (p = 0.04), readers who had completed a 3-6 months training fellowship in breast imaging (p = 0.04), and those with ≤2 years' experience in reading DBT (p = 0.02), particularly in non-dense breasts. CONCLUSION Diagnostic confidence was higher in DBT when compared to DM. In contrast, other performance metrics appeared to be similar or better with DM and may be influenced by the lack of experience of the reader cohort in reading DBT. IMPLICATIONS FOR PRACTICE The benefits of DBT may not be entirely accrued until radiologists attain expertise in DBT interpretation. Specificity of DBT varied according to reader characteristics, and these characteristics may be useful for optimising pairing strategies in independent double reading of DBT as practiced in Australia to reduce false positive diagnostic errors.
Affiliation(s)
- I Hadadi
  - Medical Image Optimisation and Perception Group, Discipline of Medical Imaging Science, Faculty of Medicine and Health, The University of Sydney, Australia; Department of Radiological Sciences, Faculty of Applied Medical Sciences, King Khalid University, Saudi Arabia
- W Rae
  - Medical Image Optimisation and Perception Group, Discipline of Medical Imaging Science, Faculty of Medicine and Health, The University of Sydney, Australia
- J Clarke
  - Medical Image Optimisation and Perception Group, Discipline of Medical Imaging Science, Faculty of Medicine and Health, The University of Sydney, Australia
- M McEntee
  - Medical Image Optimisation and Perception Group, Discipline of Medical Imaging Science, Faculty of Medicine and Health, The University of Sydney, Australia; University College Cork, Discipline of Diagnostic Radiography, UG 12 Áras Watson, Brookfield Health Sciences, College Road, Cork, T12 AK54, Ireland
- E Ekpo
  - Medical Image Optimisation and Perception Group, Discipline of Medical Imaging Science, Faculty of Medicine and Health, The University of Sydney, Australia; Orange Radiology, Laboratories and Research Centre, Calabar, Nigeria
7
Qenam BA, Li T, Tapia K, Brennan PC. The roles of clinical audit and test sets in promoting the quality of breast screening: a scoping review. Clin Radiol 2020; 75:794.e1-794.e6. [PMID: 32139003 DOI: 10.1016/j.crad.2020.01.015] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/22/2019] [Accepted: 01/29/2020] [Indexed: 12/24/2022]
Abstract
Breast screening programmes enhance the probability of early breast cancer detection in many countries worldwide; however, the success of these efforts is highly dependent on the ability of breast screen readers to detect abnormalities in the screened population, which has low prevalence. Therefore, this task can be challenging. Clinical audit is a key quality assurance measure that aims to keep the screen reading performance within acceptable standards. Auditing, nonetheless, is a lengthy process, and its accuracy is dependent on available clinical data, which often can be limited. Mammographic standardised test sets are a different screen reading evaluation approach that provides participants with instant feedback based on a simulated environment. Although a test set provides unique evaluative qualities, its ability to represent clinical performance is debated. This article describes the distinctive roles of clinical audit and test sets in measuring and improving the quality of breast screening and highlights the relationship between test sets and clinical performance.
Affiliation(s)
- B A Qenam
  - BREAST, Medical Imaging Science, Faculty of Health Sciences, The University of Sydney, Cumberland Campus, 75 East St, Lidcombe, NSW, 2141, Australia; Department of Radiological Sciences, College of Applied Medical Sciences, King Saud University, P.O. Box 10219, Riyadh, 11432, Saudi Arabia
- T Li
  - BREAST, Medical Imaging Science, Faculty of Health Sciences, The University of Sydney, Cumberland Campus, 75 East St, Lidcombe, NSW, 2141, Australia; Medical Image Optimisation and Perception Research Group (MIOPeG), Medical Imaging Science, Faculty of Health Sciences, The University of Sydney, Cumberland Campus, 75 East St, Lidcombe, NSW 2141, Australia
- K Tapia
  - BREAST, Medical Imaging Science, Faculty of Health Sciences, The University of Sydney, Cumberland Campus, 75 East St, Lidcombe, NSW, 2141, Australia
- P C Brennan
  - BREAST, Medical Imaging Science, Faculty of Health Sciences, The University of Sydney, Cumberland Campus, 75 East St, Lidcombe, NSW, 2141, Australia; Medical Image Optimisation and Perception Research Group (MIOPeG), Medical Imaging Science, Faculty of Health Sciences, The University of Sydney, Cumberland Campus, 75 East St, Lidcombe, NSW 2141, Australia
8
Thigpen D, Rapelyea J. Test Sets and Real-Life Performance: Can One Predict the Other? Radiol Imaging Cancer 2020; 2:e200126. [PMID: 33779660 PMCID: PMC7983750 DOI: 10.1148/rycan.2020200126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2020] [Revised: 09/04/2020] [Accepted: 09/08/2020] [Indexed: 06/12/2023]
Affiliation(s)
- Denise Thigpen
  - Department of Radiology, The George Washington University Medical Center, 2150 Pennsylvania Ave NW, Washington, DC 20037
- Jocelyn Rapelyea
  - Department of Radiology, The George Washington University Medical Center, 2150 Pennsylvania Ave NW, Washington, DC 20037
9
Alshabibi AS, Suleiman ME, Tapia KA, Heard R, Brennan PC. Impact of time of day on radiology image interpretations. Clin Radiol 2020; 75:746-756. [PMID: 32576366 DOI: 10.1016/j.crad.2020.05.004] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 02/12/2020] [Accepted: 05/05/2020] [Indexed: 11/25/2022]
Abstract
AIM To examine the impact of the time of day on radiologists' mammography reading performance. MATERIALS AND METHODS Retrospective mammographic reading assessment data were collected from the BreastScreen Reader Assessment Strategy database and included timestamps of the readings and reader-specific demographic data of 197 radiologists. The radiologists performed the readings in a workshop setting with test case sets enriched with malignancies (one-third of cases were malignant). The collected data were evaluated with an analysis of covariance to determine whether time of day influenced radiologists' specificity, lesion sensitivity or the jackknife alternative free-response receiver operating characteristic (JAFROC). RESULTS After adjusting for radiologist experience and fellowship, specificity varied significantly by time of day (p=0.027), but there was no evidence of any significant impact on lesion sensitivity (p=0.441) or JAFROC (p=0.120). The collected data demonstrated that specificity during the late morning (10.00-12.00) was 71.7%; this was significantly lower than in the early morning (08.00-10.00) and mid-afternoon (14.00-16.00), which were 82.74% (p=0.003) and 81.39% (p=0.031), respectively. Specificity during the late afternoon (16.00-18.00) was 73.95%; this was significantly lower than in the early morning (08.00-10.00) and mid-afternoon (14.00-16.00), which were 82.74% (p=0.003) and 81.39% (p=0.031), respectively. CONCLUSION The results indicated that the time of day may influence radiologists' performance, specifically their ability to identify normal images correctly.
Affiliation(s)
- A S Alshabibi
  - Faculty of Health Sciences, Medical Radiation Sciences, University of Sydney, New South Wales, Australia
- M E Suleiman
  - Faculty of Health Sciences, Medical Radiation Sciences, University of Sydney, New South Wales, Australia
- K A Tapia
  - Faculty of Health Sciences, Medical Radiation Sciences, University of Sydney, New South Wales, Australia
- R Heard
  - Faculty of Health Sciences, Medical Radiation Sciences, University of Sydney, New South Wales, Australia
- P C Brennan
  - Faculty of Health Sciences, Medical Radiation Sciences, University of Sydney, New South Wales, Australia
10
Trieu PDY, Tapia K, Frazer H, Lee W, Brennan P. Improvement of Cancer Detection on Mammograms via BREAST Test Sets. Acad Radiol 2019; 26:e341-e347. [PMID: 30826148 DOI: 10.1016/j.acra.2018.12.017] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 11/06/2018] [Revised: 12/18/2018] [Accepted: 12/19/2018] [Indexed: 11/27/2022]
Abstract
BACKGROUND Breast Screen Reader Assessment Strategy (BREAST) is an innovative training and research program for radiologists in Australia and New Zealand. The aim of this study is to evaluate the efficacy of BREAST test sets in improving readers' performance in detecting cancers on mammograms. MATERIALS AND METHODS Between 2011 and 2018, 50 radiologists (40 fellows, 10 registrars) completed three BREAST test sets and 17 radiologists completed four test sets. Each test set contained 20 biopsy-proven cancer cases and 40 normal cases. Immediate image-based feedback was available to readers after they completed each test set, which allowed them to compare their selections with the truth. Case specificity, case sensitivity, lesion sensitivity, the Receiver Operating Characteristic (ROC) Area Under the Curve (AUC) and Jackknife Free-Response Receiver Operating Characteristic (JAFROC) Figure of Merit (FOM) were calculated for each reader. The Kruskal-Wallis test was used to compare the scores of radiologists and registrars across all test sets, whilst the Wilcoxon signed-rank test was used to compare scores between pairs of test sets. RESULTS Significant improvements in lesion sensitivity ranging from 21% to 31% were found in radiologists completing later test sets compared to the first test set (p ≤ 0.01). Eighty-three percent of radiologists achieved higher performance in lesion sensitivity after they completed the first read. Registrars had significantly better scores in the third test set compared to the first set, with mean increases of 79% in lesion sensitivity (p = 0.005) and 37% in JAFROC (p = 0.02). Sixty percent and 100% of registrars increased their scores in lesion sensitivity in the second and third reads, respectively, compared to the first read, while the percentage of registrars with higher scores in JAFROC was 80%.
CONCLUSION Introduction of BREAST into national training programs appears to have an important impact in promoting diagnostic efficacy amongst radiologists and radiology registrars undergoing mammographic readings.
11
Miglioretti DL, Ichikawa L, Smith RA, Buist DS, Carney PA, Geller B, Monsees B, Onega T, Rosenberg R, Sickles EA, Yankaskas BC, Kerlikowske K. Correlation Between Screening Mammography Interpretive Performance on a Test Set and Performance in Clinical Practice. Acad Radiol 2017; 24:1256-1264. [PMID: 28551400 DOI: 10.1016/j.acra.2017.03.016] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 11/21/2016] [Revised: 03/16/2017] [Accepted: 03/17/2017] [Indexed: 10/19/2022]
Abstract
RATIONALE AND OBJECTIVES Evidence is inconsistent about whether radiologists' interpretive performance on a screening mammography test set reflects their performance in clinical practice. This study aimed to estimate the correlation between test set and clinical performance and determine if the correlation is influenced by cancer prevalence or lesion difficulty in the test set. MATERIALS AND METHODS This institutional review board-approved study randomized 83 radiologists from six Breast Cancer Surveillance Consortium registries to assess one of four test sets of 109 screening mammograms each; 48 radiologists completed a fifth test set of 110 mammograms 2 years later. Test sets differed in number of cancer cases and difficulty of lesion detection. Test set sensitivity and specificity were estimated using woman-level and breast-level recall with cancer status and expert opinion as gold standards. Clinical performance was estimated using women-level recall with cancer status as the gold standard. Spearman rank correlations between test set and clinical performance with 95% confidence intervals (CI) were estimated. RESULTS For test sets with fewer cancers (N = 15) that were more difficult to detect, correlations were weak to moderate for sensitivity (woman level = 0.46, 95% CI = 0.16, 0.69; breast level = 0.35, 95% CI = 0.03, 0.61) and weak for specificity (0.24, 95% CI = 0.01, 0.45) relative to expert recall. Correlations for test sets with more cancers (N = 30) were close to 0 and not statistically significant. CONCLUSIONS Correlations between screening performance on a test set and performance in clinical practice are not strong. Test set performance more accurately reflects performance in clinical practice if cancer prevalence is low and lesions are challenging to detect.
12
Hofvind S, Bennett RL, Brisson J, Lee W, Pelletier E, Flugelman A, Geller B. Audit feedback on reading performance of screening mammograms: An international comparison. J Med Screen 2016; 23:150-9. [DOI: 10.1177/0969141315610790] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 04/02/2015] [Accepted: 09/17/2015] [Indexed: 01/16/2023]
Abstract
Objective Providing feedback to mammography radiologists and facilities may improve interpretive performance. We conducted a web-based survey to investigate how and why such feedback is undertaken and used in mammographic screening programmes. Methods The survey was sent to representatives in 30 International Cancer Screening Network member countries where mammographic screening is offered. Results Seventeen programmes in 14 countries responded to the survey. Audit feedback was aimed at readers in 14 programmes, and facilities in 12 programmes. Monitoring quality assurance was the most common purpose of audit feedback. Screening volume, recall rate, and rate of screen-detected cancers were typically reported performance measures. Audit reports were commonly provided annually, but more frequently when target guidelines were not reached. Conclusion The purpose, target audience, performance measures included, form and frequency of the audit feedback varied amongst mammographic screening programmes. These variations may provide a basis for those developing and improving such programmes.
Affiliation(s)
- S Hofvind
  - Department of Screening, Cancer Registry of Norway, Oslo, Norway
- RL Bennett
  - Nottingham University Hospitals NHS Trust, Nottingham, UK
- J Brisson
  - Centre de Recherche du, CHU de Québec and Centre des Maladies du Sein Deschênes-Fabia, Hôpital du Saint-Sacrement, Quebec, Canada
- W Lee
  - Discipline Medical Radiation Sciences, The University of Sydney, Lidcombe, NSW, Australia
- E Pelletier
  - Institut National de santé Publique du Québec, Canada
- A Flugelman
  - CHS National Cancer Control Center, Lady Davis Carmel Medical Center, Haifa, Israel
- B Geller
  - Department of Family Medicine, University of Vermont, Burlington, VT, USA