1
Trieu PD, Lewis SJ, Li T, Ho K, Tapia KA, Brennan PC. Reader characteristics and mammogram features associated with breast imaging reporting scores. Br J Radiol 2020; 93:20200363. [DOI: 10.1259/bjr.20200363] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/05/2022] Open
Abstract
Objectives: This study aims to explore the reading performance of radiologists in detecting cancers on mammograms using the Tabar Breast Imaging Reporting and Data System (BIRADS) classification and to identify factors related to breast imaging reporting scores. Methods: 117 readings of five different mammogram test sets, each containing 20 cancer and 40 normal cases, were performed by Australian radiologists. Each radiologist evaluated the mammograms using the BIRADS lexicon: category 1, negative; category 2, benign findings; category 3, equivocal findings (Recall); category 4, suspicious findings (Recall); and category 5, highly suggestive of malignancy (Recall). Performance metrics (true positive, false positive, true negative, and false negative) were calculated for each radiologist, and the distribution of reporting categories was analyzed in reader-based and case-based groups. The association of reader characteristics and case features with the categories was examined using Mann-Whitney U and Kruskal-Wallis tests. Results: 38% of cancer-containing mammograms were reported with category 3, falling to 32.3% with category 4 and 16.2% with category 5, while 16.6% and 10.3% of cancer cases were marked with categories 1 and 2. Female readers had lower false-negative rates than male readers when using categories 1 and 2 for cancer cases (p < 0.01). A similar pattern was found for BreastScreen readers and readers who had completed breast reading fellowships compared with non-BreastScreen and non-fellowship readers (p < 0.05). Radiologists who read a low number of cases per week were more likely to record cancer cases with category 4, while those who read a high number were more likely to use category 3 (p < 0.01). Discrete mass and asymmetric density were the two types of abnormality most often reported as equivocal findings with category 3 (47–50%; p = 0.005), while spiculated masses or stellate lesions were most often rated as highly suggestive of malignancy with category 5 (26%; p = 0.001). Conclusions: Most radiologists used category 3 when reporting cancer mammograms. Gender, working for BreastScreen, fellowship completion, and number of cases read per week were factors associated with scoring selection. Radiologists reported higher Tabar BIRADS categories for some types of abnormality on mammograms than for others. Advances in knowledge: The study identified factors associated with the decisions of radiologists in assigning a Tabar BIRADS score to mammograms with abnormalities. These findings will be useful for individual training programs to improve the confidence of radiologists in recognizing abnormal lesions on screening mammograms.
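As an illustration of how the performance metrics named in the Methods could be tallied, the sketch below dichotomizes each reading by the recall rule stated in the abstract (categories 3 to 5 are recalls). The function names and the example readings are hypothetical and are not taken from the study.

```python
from collections import Counter

# Categories labelled "(Recall)" in the abstract; 1 and 2 are treated as no recall.
RECALL_CATEGORIES = {3, 4, 5}


def confusion_counts(readings):
    """Tally TP/FN/FP/TN from (category, has_cancer) pairs for one reader."""
    counts = Counter()
    for category, has_cancer in readings:
        recalled = category in RECALL_CATEGORIES
        if has_cancer:
            counts["TP" if recalled else "FN"] += 1
        else:
            counts["FP" if recalled else "TN"] += 1
    return counts


def sensitivity(counts):
    """Fraction of cancer cases that were recalled."""
    return counts["TP"] / (counts["TP"] + counts["FN"])


def specificity(counts):
    """Fraction of normal cases that were not recalled."""
    return counts["TN"] / (counts["TN"] + counts["FP"])


# Hypothetical readings for one reader: (Tabar BIRADS category, case contains cancer)
example = [(5, True), (3, True), (1, True), (2, False), (1, False), (4, False)]
counts = confusion_counts(example)
print(dict(counts))
```

Under this rule a cancer case scored 1 or 2 counts as a false negative, which is the quantity compared between reader groups in the Results.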
Affiliation(s)
- Phuong Dung (Yun) Trieu
- Discipline of Medical Imaging Sciences, Faculty of Medicine and Health, The University of Sydney, 75 East Street, Lidcombe, New South Wales 2141, Australia
- Sarah J Lewis
- Discipline of Medical Imaging Sciences, Faculty of Medicine and Health, The University of Sydney, 75 East Street, Lidcombe, New South Wales 2141, Australia
- Tong Li
- Discipline of Medical Imaging Sciences, Faculty of Medicine and Health, The University of Sydney, 75 East Street, Lidcombe, New South Wales 2141, Australia
- Karen Ho
- Discipline of Medical Imaging Sciences, Faculty of Medicine and Health, The University of Sydney, 75 East Street, Lidcombe, New South Wales 2141, Australia
- Kriscia A Tapia
- Discipline of Medical Imaging Sciences, Faculty of Medicine and Health, The University of Sydney, 75 East Street, Lidcombe, New South Wales 2141, Australia
- Patrick C Brennan
- Discipline of Medical Imaging Sciences, Faculty of Medicine and Health, The University of Sydney, 75 East Street, Lidcombe, New South Wales 2141, Australia
2
Classification of Microcalcification Clusters in Digital Mammograms Using a Stack Generalization Based Classifier. J Imaging 2019; 5:jimaging5090076. [PMID: 34460670 PMCID: PMC8320960 DOI: 10.3390/jimaging5090076] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 08/07/2019] [Revised: 09/07/2019] [Accepted: 09/09/2019] [Indexed: 12/24/2022] Open
Abstract
This paper presents a machine learning based approach for the discrimination of malignant and benign microcalcification (MC) clusters in digital mammograms. A series of morphological operations was carried out to facilitate feature extraction from segmented microcalcifications. A combination of morphological, texture, and distribution features was extracted from individual MC components and MC clusters, and a correlation-based feature selection technique was used. The clinical relevance of the selected features is discussed. The proposed method was evaluated using three different databases: OPTIMAM Mammography Image Database (OMI-DB), Digital Database for Screening Mammography (DDSM), and Mammographic Image Analysis Society (MIAS) database. The best classification accuracy (95.00±0.57%) was achieved for OPTIMAM using a stack generalization classifier with 10-fold cross validation, obtaining an Az value of 0.97±0.01.
3
Jackson RL, Double CR, Munro HJ, Lynch J, Tapia KA, Trieu PD, Alakhras M, Ganesan A, Do TD, Soh BP, Brennan PC, Puslednik P. Breast Cancer Diagnostic Efficacy in a Developing South-East Asian Country. Asian Pac J Cancer Prev 2019; 20:727-731. [PMID: 30909671 PMCID: PMC6825776 DOI: 10.31557/apjcp.2019.20.3.727] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/25/2022] Open
Abstract
Background: Breast cancer is increasing in prevalence amongst South East (SE) Asian women, highlighting the need for high-quality, early diagnoses. This study investigated radiologists' detection efficacy in a developing (DC) and a developed (DDC) SE Asian country, as compared to Australian radiologists. Methods: Using a test set of 60 mammographic cases, 20 containing cancer, JAFROC figures of merit (FOM) and ROC areas under the curve (AUC) were calculated, as well as location sensitivity, sensitivity and specificity. The test set was examined by 35, 15, and 53 radiologists from the DC, the DDC and Australia, respectively. Results: DC radiologists, compared to both groups of counterparts, demonstrated significantly lower JAFROC FOM, ROC AUC and specificity scores. DC radiologists had a significantly lower location sensitivity than Australian radiologists. DC radiologists also demonstrated significantly lower values for age, hours of reading per week, and years of mammography experience when compared with other radiologists. Conclusion: Significant differences in breast cancer detection parameters can be attributed to the experience of DC radiologists. The development of inexpensive, innovative, interactive training programs is discussed. This nonuniform level of breast cancer detection between countries must be addressed to achieve the World Health Organisation goal of health equity.
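To make the ROC AUC figure concrete, the sketch below computes an empirical AUC from per-case confidence ratings by pairwise comparison, which is equivalent to the normalized Mann-Whitney U statistic. It is plain case-level ROC AUC only and does not reproduce the JAFROC figure of merit, which also uses lesion location. The ratings shown are invented for illustration.

```python
def roc_auc(cancer_scores, normal_scores):
    """Empirical ROC AUC: probability that a cancer case is rated higher
    than a normal case, counting ties as one half."""
    wins = 0.0
    for c in cancer_scores:
        for n in normal_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cancer_scores) * len(normal_scores))


# Hypothetical 1-5 confidence ratings from one reader
cancer = [5, 4, 3, 2]
normal = [1, 2, 3, 1]
auc = roc_auc(cancer, normal)
print(auc)  # 0.875
```

A reader who rates every cancer case above every normal case scores 1.0, and a reader rating at chance scores about 0.5.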
Affiliation(s)
- Callan R Double
- St Matthews Catholic School, Mudgee, New South Wales, Australia
- Hayden J Munro
- St Matthews Catholic School, Mudgee, New South Wales, Australia
- Jessica Lynch
- St Matthews Catholic School, Mudgee, New South Wales, Australia
- Kriscia A Tapia
- Faculty of Health Sciences, The University of Sydney, Australia
- Phuong Dung Trieu
- Faculty of Health Sciences, The University of Sydney, Australia; Department of Medical Imaging, Ho Chi Minh City University of Medicine and Pharmacy, Vietnam
- Maram Alakhras
- Faculty of Health Sciences, The University of Sydney, Australia
- Aarthi Ganesan
- Faculty of Health Sciences, The University of Sydney, Australia
- Thuan Doan Do
- Department of Diagnostic Imaging, Vietnam National Cancer Hospital, Vietnam
- Puslednik Puslednik
- St Matthews Catholic School, Mudgee, New South Wales, Australia; Faculty of Health Sciences, The University of Sydney, Australia
4
Ahsen ME, Ayvaci MUS, Raghunathan S. When Algorithmic Predictions Use Human-Generated Data: A Bias-Aware Classification Algorithm for Breast Cancer Diagnosis. Information Systems Research 2019. [DOI: 10.1287/isre.2018.0789] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Indexed: 11/20/2022]
Affiliation(s)
- Mehmet Eren Ahsen
- Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York 10029
5
Assessing Resident Performance in Screening Mammography: Development of a Quantitative Algorithm. Acad Radiol 2018; 25:659-664. [PMID: 29366681 DOI: 10.1016/j.acra.2017.11.006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/06/2017] [Revised: 11/16/2017] [Accepted: 11/21/2017] [Indexed: 11/23/2022]
Abstract
RATIONALE AND OBJECTIVES: This study aims to provide objective performance data and feedback, including examination volumes, recall rates, and concordance with faculty interpretations, for residents performing independent interpretation of screening mammography examinations. METHOD AND MATERIALS: Residents (r) and faculty (f) interpret screening mammograms separately and identify non-callbacks (NCBs) and callbacks (CBs). Residents review all discordant results. The numbers of concordant interpretations (fCB-rCB and fNCB-rNCB) and discordant interpretations (fCB-rNCB and fNCB-rCB) are entered into a macro-driven spreadsheet. These macros weight the data according to the perceived clinical impact of the resident's decision. Weighted outcomes are combined with volumes to generate a weighted mammography performance score. Rotation-specific goals are assigned for the weighted score, screening volumes, recall rate relative to faculty, and concordance rates. Residents receive one point for achieving each goal. RESULTS: Between July 2013 and May 2017, 18,747 mammography examinations were reviewed by 31 residents, in 71 resident rotations, over 246 resident weeks. Mean resident recall rate was 9.9% and decreased significantly with resident level (R): R2 = 11.3% vs R3 = 9.4% and R4 = 9.2%. Mean resident-faculty discordance rate was 10% and decreased significantly from R2 = 12% to R4 = 9.6%. Weighted performance scores ranged from 1.1 to 2.0 (mean 1.6, standard deviation 0.17), but did not change with rotation experience. Residents had a mean goal achievement score of 2.6 (standard deviation 0.47). CONCLUSIONS: This method provides residents with easily accessible, case-by-case, individualized screening outcome data over the longitudinal period of their residency, and provides an objective method of assessing resident screening mammography performance.
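The unweighted quantities described in the Methods can be derived directly from the four concordance counts, as in the sketch below. The class name, field names and example counts are hypothetical. The abstract does not state the clinical-impact weights used by the spreadsheet macros, so the weighted performance score is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class RotationCounts:
    """Counts for one rotation; f = faculty, r = resident,
    CB = callback, NCB = non-callback."""
    f_cb_r_cb: int    # concordant: both call back
    f_ncb_r_ncb: int  # concordant: neither calls back
    f_cb_r_ncb: int   # discordant: faculty calls back, resident does not
    f_ncb_r_cb: int   # discordant: resident calls back, faculty does not

    @property
    def total(self) -> int:
        return self.f_cb_r_cb + self.f_ncb_r_ncb + self.f_cb_r_ncb + self.f_ncb_r_cb

    def discordance_rate(self) -> float:
        return (self.f_cb_r_ncb + self.f_ncb_r_cb) / self.total

    def resident_recall_rate(self) -> float:
        return (self.f_cb_r_cb + self.f_ncb_r_cb) / self.total

    def faculty_recall_rate(self) -> float:
        return (self.f_cb_r_cb + self.f_cb_r_ncb) / self.total


# Hypothetical rotation of 100 examinations
rotation = RotationCounts(f_cb_r_cb=8, f_ncb_r_ncb=80, f_cb_r_ncb=4, f_ncb_r_cb=8)
print(rotation.discordance_rate(), rotation.resident_recall_rate(), rotation.faculty_recall_rate())
```

Comparing `resident_recall_rate()` with `faculty_recall_rate()` gives the "recall rate relative to faculty" that the abstract lists among the rotation goals.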
6
Poot JD, Chetlen AL. A Simulation Screening Mammography Module Created for Instruction and Assessment: Radiology Residents vs National Benchmarks. Acad Radiol 2016; 23:1454-1462. [PMID: 27637285 DOI: 10.1016/j.acra.2016.07.006] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 03/18/2016] [Revised: 05/05/2016] [Accepted: 07/01/2016] [Indexed: 12/29/2022]
Abstract
RATIONALE AND OBJECTIVES: To improve mammographic screening training and breast cancer detection, radiology residents participated in a simulation screening mammography module in which they interpreted an enriched set of screening mammograms with known outcomes. This pilot research study evaluates the effectiveness of the simulation module while tracking the progress, efficiency, and accuracy of radiology resident interpretations, and also compares their performance against national benchmarks. MATERIALS AND METHODS: A simulation module was created with 266 digital screening mammograms enriched with high-risk breast lesions (seven cases) and breast malignancies (65 cases). Over a period of 27 months, 39 radiology residents participated in the simulation screening mammography module. Resident sensitivity and specificity were compared to the Breast Cancer Surveillance Consortium (BCSC; data through 2009) national benchmark and the American College of Radiology (ACR) Breast Imaging Reporting and Data System (BI-RADS) acceptable screening mammography audit ranges. RESULTS: The sensitivity, the percentage of cancers with an abnormal initial interpretation (BI-RADS 0), among residents was 84.5%, similar to the BCSC benchmark sensitivity of 84.9% (sensitivity for tissue diagnosis of cancer within 1 year following the initial examination) and within the acceptable ACR BI-RADS medical audit range of ≥75%. The specificity, the percentage of noncancers that had a negative image interpretation (BI-RADS 1 or 2), among residents was 83.2%, compared with 90.3% reported in the BCSC benchmark data, and was lower than the suggested ACR BI-RADS range of 88%-95%. CONCLUSIONS: Using simulation modules for interpretation of screening mammograms is a promising method for training radiology residents to detect breast cancer and to help them achieve competence toward national benchmarks.
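The comparison against the audit ranges quoted in the Results can be expressed as a simple range check. The thresholds and the resident figures below are the ones stated in the abstract; the constant and function names are hypothetical.

```python
# Audit ranges as quoted in the abstract (percentages)
ACR_MIN_SENSITIVITY = 75.0
ACR_SPECIFICITY_RANGE = (88.0, 95.0)


def audit_check(sensitivity_pct, specificity_pct):
    """Report whether each metric falls inside the quoted audit range."""
    low, high = ACR_SPECIFICITY_RANGE
    return {
        "sensitivity_ok": sensitivity_pct >= ACR_MIN_SENSITIVITY,
        "specificity_ok": low <= specificity_pct <= high,
    }


# Resident results reported in the abstract
resident_result = audit_check(sensitivity_pct=84.5, specificity_pct=83.2)
print(resident_result)  # {'sensitivity_ok': True, 'specificity_ok': False}
```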
7
Mitry D, Zutis K, Dhillon B, Peto T, Hayat S, Khaw KT, Morgan JE, Moncur W, Trucco E, Foster PJ. The Accuracy and Reliability of Crowdsource Annotations of Digital Retinal Images. Transl Vis Sci Technol 2016; 5:6. [PMID: 27668130 PMCID: PMC5032847 DOI: 10.1167/tvst.5.5.6] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 02/03/2016] [Accepted: 07/07/2016] [Indexed: 11/24/2022] Open
Abstract
Purpose: Crowdsourcing is based on outsourcing computationally intensive tasks to numerous individuals in the online community who have no formal training. Our aim was to develop a novel online tool designed to facilitate large-scale annotation of digital retinal images, and to assess the accuracy of crowdsource grading using this tool, comparing it to expert classification. Methods: We used 100 retinal fundus photograph images with predetermined disease criteria selected by two experts from a large cohort study. The Amazon Mechanical Turk Web platform was used to drive traffic to our site so anonymous workers could perform a classification and annotation task of the fundus photographs in our dataset after a short training exercise. Three groups were assessed: masters only, nonmasters only and nonmasters with compulsory training. We calculated the sensitivity, specificity, and area under the curve (AUC) of receiver operating characteristic (ROC) plots for all classifications compared to expert grading, and used the Dice coefficient and consensus threshold to assess annotation accuracy. Results: In total, we received 5389 annotations for 84 images (excluding 16 training images) in 2 weeks. A specificity and sensitivity of 71% (95% confidence interval [CI], 69%–74%) and 87% (95% CI, 86%–88%) were achieved for all classifications. The AUC in this study for all classifications combined was 0.93 (95% CI, 0.91–0.96). For image annotation, a maximal Dice coefficient (∼0.6) was achieved with a consensus threshold of 0.25. Conclusions: This study supports the hypothesis that annotation of abnormalities in retinal images by ophthalmologically naive individuals is comparable to expert annotation. The highest AUC and agreement with expert annotation were achieved in the nonmasters with compulsory training group. Translational Relevance: The use of crowdsourcing as a technique for retinal image analysis may be comparable to expert graders and has the potential to deliver timely, accurate, and cost-effective image analysis.
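For readers unfamiliar with the annotation metrics, the sketch below shows one plausible reading of them: a consensus mask keeps pixels marked by at least a given fraction of workers, and the Dice coefficient measures overlap between that mask and the expert mask. The exact consensus rule used in the study is not given in the abstract, so the "at least" rule, the function names and the toy pixel sets are all assumptions.

```python
from collections import Counter


def consensus_mask(annotations, threshold):
    """Pixels marked by at least `threshold` (a fraction) of annotators.
    `annotations` is a list of pixel-coordinate sets, one per annotator."""
    votes = Counter()
    for pixels in annotations:
        votes.update(set(pixels))
    n = len(annotations)
    return {p for p, v in votes.items() if v / n >= threshold}


def dice(a, b):
    """Dice coefficient 2|A∩B| / (|A| + |B|); defined as 1.0 when both are empty."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical annotations from four workers and one expert mask
workers = [
    {(0, 0), (0, 1)},
    {(0, 1), (1, 1)},
    {(0, 1)},
    {(2, 2), (0, 0)},
]
expert = {(0, 0), (0, 1), (1, 1)}

loose = consensus_mask(workers, 0.25)   # any pixel marked by at least 1 of 4
strict = consensus_mask(workers, 0.5)   # pixels marked by at least 2 of 4
print(dice(loose, expert), dice(strict, expert))
```

Sweeping the threshold and keeping the value with the highest Dice score is one way a "maximal Dice coefficient at a consensus threshold" could be located.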
Affiliation(s)
- Danny Mitry
- NIHR Biomedical Research Centre at Moorfields Eye Hospital & UCL Institute of Ophthalmology, London, UK
- Kris Zutis
- VAMPIRE project, School of Science and Engineering, University of Dundee, Dundee, UK
- Baljean Dhillon
- Centre for Clinical Brain Sciences, University of Edinburgh and Princess Alexandra Eye Pavilion, Edinburgh, UK
- Tunde Peto
- NIHR Biomedical Research Centre at Moorfields Eye Hospital & UCL Institute of Ophthalmology, London, UK
- Shabina Hayat
- Department of Public Health and Primary Care, University of Cambridge Strangeways Research Laboratory, Worts Causeway, Cambridge, UK
- Kay-Tee Khaw
- Department of Clinical Gerontology, Addenbrookes Hospital, University of Cambridge, Cambridge, UK
- James E Morgan
- School of Optometry and Vision Sciences, Cardiff University, Cardiff, UK
- Wendy Moncur
- Duncan of Jordanstone College of Arts and Design, University of Dundee, Dundee, UK
- Emanuele Trucco
- VAMPIRE project, School of Science and Engineering, University of Dundee, Dundee, UK
- Paul J Foster
- NIHR Biomedical Research Centre at Moorfields Eye Hospital & UCL Institute of Ophthalmology, London, UK