1
Hadjiiski L, Cha K, Chan HP, Drukker K, Morra L, Näppi JJ, Sahiner B, Yoshida H, Chen Q, Deserno TM, Greenspan H, Huisman H, Huo Z, Mazurchuk R, Petrick N, Regge D, Samala R, Summers RM, Suzuki K, Tourassi G, Vergara D, Armato SG. AAPM task group report 273: Recommendations on best practices for AI and machine learning for computer-aided diagnosis in medical imaging. Med Phys 2023; 50:e1-e24. [PMID: 36565447 DOI: 10.1002/mp.16188] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 07/13/2022] [Revised: 11/13/2022] [Accepted: 11/22/2022] [Indexed: 12/25/2022]
Abstract
Rapid advances in artificial intelligence (AI) and machine learning, and specifically in deep learning (DL) techniques, have enabled broad application of these methods in health care. The promise of the DL approach has spurred further interest in computer-aided diagnosis (CAD) development and applications using both "traditional" machine learning methods and newer DL-based methods. We use the term CAD-AI to refer to this expanded clinical decision support environment that uses traditional and DL-based AI methods. Numerous studies have been published to date on the development of machine learning tools for computer-aided, or AI-assisted, clinical tasks. However, most of these machine learning models are not ready for clinical deployment. It is of paramount importance to ensure that a clinical decision support tool undergoes proper training and rigorous validation of its generalizability and robustness before adoption for patient care in the clinic. To address these important issues, the American Association of Physicists in Medicine (AAPM) Computer-Aided Image Analysis Subcommittee (CADSC) is charged, in part, to develop recommendations on practices and standards for the development and performance assessment of computer-aided decision support systems. The committee has previously published two opinion papers on the evaluation of CAD systems and issues associated with user training and quality assurance of these systems in the clinic. With machine learning techniques continuing to evolve and CAD applications expanding to new stages of the patient care process, the current task group report considers the broader issues common to the development of most, if not all, CAD-AI applications and their translation from the bench to the clinic. The goal is to bring attention to the proper training and validation of machine learning algorithms that may improve their generalizability and reliability and accelerate the adoption of CAD-AI systems for clinical decision support.
Affiliation(s)
- Lubomir Hadjiiski
  - Department of Radiology, University of Michigan, Ann Arbor, Michigan, USA
- Kenny Cha
  - U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Heang-Ping Chan
  - Department of Radiology, University of Michigan, Ann Arbor, Michigan, USA
- Karen Drukker
  - Department of Radiology, University of Chicago, Chicago, Illinois, USA
- Lia Morra
  - Department of Control and Computer Engineering, Politecnico di Torino, Torino, Italy
- Janne J Näppi
  - 3D Imaging Research, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Berkman Sahiner
  - U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Hiroyuki Yoshida
  - 3D Imaging Research, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Quan Chen
  - Department of Radiation Medicine, University of Kentucky, Lexington, Kentucky, USA
- Thomas M Deserno
  - Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Hannover Medical School, Braunschweig, Germany
- Hayit Greenspan
  - Department of Biomedical Engineering, Faculty of Engineering, Tel Aviv University, Tel Aviv, Israel & Department of Radiology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Henkjan Huisman
  - Radboud Institute for Health Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
- Zhimin Huo
  - Tencent America, Palo Alto, California, USA
- Richard Mazurchuk
  - Division of Cancer Prevention, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, USA
- Daniele Regge
  - Radiology Unit, Candiolo Cancer Institute, FPO-IRCCS, Candiolo, Italy; Department of Surgical Sciences, University of Turin, Turin, Italy
- Ravi Samala
  - U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Ronald M Summers
  - Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bethesda, Maryland, USA
- Kenji Suzuki
  - Institute of Innovative Research, Tokyo Institute of Technology, Tokyo, Japan
- Daniel Vergara
  - Department of Radiology, Yale New Haven Hospital, New Haven, Connecticut, USA
- Samuel G Armato
  - Department of Radiology, University of Chicago, Chicago, Illinois, USA
2
A Novel Medical Image Enhancement Algorithm for Breast Cancer Detection on Mammography Images Using Machine Learning. Diagnostics (Basel) 2023; 13:348. [PMID: 36766453 PMCID: PMC9914723 DOI: 10.3390/diagnostics13030348] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2022] [Revised: 01/13/2023] [Accepted: 01/13/2023] [Indexed: 01/19/2023]
Abstract
Mammography is the preferred method for breast cancer screening. In this study, computer-aided diagnosis (CAD) systems were used to improve the quality of mammography images and to detect suspicious areas. The main contribution of this study is to identify the optimal combination of pre-processing algorithms for better interpretation and classification of mammography images, because pre-processing algorithms significantly affect the accuracy of segmentation and classification methods. To that end, the effect of different combinations of pre-processing methods on differentiating benign and malignant breast lesions was investigated. All image processing algorithms used for lesion detection were applied to the mini-MIAS database. In the first step, label information and the pectoral muscle, which result from image acquisition, were removed. In the second step, median filter (MF), contrast limited adaptive histogram equalization (CLAHE), and unsharp masking (USM) algorithms were applied in different combinations to increase the resolution and visibility of the images. In the third step, suspicious regions were extracted from the mammograms using the k-means clustering technique, and features were extracted from the resulting ROIs. Finally, the feature datasets were classified as normal/abnormal and as benign/malignant (two-class classification) using machine learning algorithms, and the test performance measures of the classification methods were examined. In both classifications, lower classification performance was obtained when the CLAHE algorithm was used alone as a pre-processing method than with the other pre-processing combinations. When the median filter and unsharp masking algorithms were added to the CLAHE algorithm, the performance of the classification methods increased. In terms of classification success, Support Vector Machines, Random Forest, and Neural Networks showed the best performance.
Comparing the performances of the classification methods showed that different pre-processing algorithms were effective in detecting the presence of breast lesions and in distinguishing benign from malignant lesions.
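Two of the pre-processing steps named in this abstract, the median filter and unsharp masking, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the 3x3 window, the box blur used inside the unsharp mask, the `amount` parameter, and all function names are assumptions chosen for illustration, and CLAHE and k-means clustering are left out.

```python
import numpy as np


def _windows3(img: np.ndarray) -> np.ndarray:
    """Stack the nine 3x3-neighbourhood shifts of img (edges replicated)."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    return np.stack([padded[r:r + h, c:c + w] for r in range(3) for c in range(3)])


def median_filter3(img: np.ndarray) -> np.ndarray:
    """3x3 median filter: suppresses isolated noisy pixels."""
    return np.median(_windows3(img.astype(float)), axis=0)


def box_blur3(img: np.ndarray) -> np.ndarray:
    """3x3 mean filter, used here as the blur inside the unsharp mask (an assumption)."""
    return np.mean(_windows3(img.astype(float)), axis=0)


def unsharp_mask(img: np.ndarray, amount: float = 1.0) -> np.ndarray:
    """Sharpen by adding back the detail removed by blurring: img + amount * (img - blur)."""
    img = img.astype(float)
    return img + amount * (img - box_blur3(img))


# Hypothetical usage: denoise first, then sharpen.
image = np.full((5, 5), 10.0)
image[2, 2] = 255.0  # a single noisy pixel
enhanced = unsharp_mask(median_filter3(image))
```

Running the median filter before the sharpening step keeps the unsharp mask from amplifying isolated noise, which is one plausible reason to combine the two; the order used in the study is not stated in the abstract.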
3
A Review of Computer-Aided Breast Cancer Diagnosis Using Sequential Mammograms. Tomography 2022; 8:2874-2892. [PMID: 36548533 PMCID: PMC9785714 DOI: 10.3390/tomography8060241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2022] [Revised: 11/18/2022] [Accepted: 12/02/2022] [Indexed: 12/12/2022]
Abstract
Radiologists assess the results of mammography, the key screening tool for the detection of breast cancer, to determine the presence of malignancy. They routinely compare recent and prior mammographic views to identify changes between screenings. A new lesion, or a region that is changing rapidly, is more likely to be suspicious than a lesion that remains unchanged, which is usually benign. However, visual evaluation of mammograms is challenging even for expert radiologists. For this reason, various Computer-Aided Diagnosis (CAD) algorithms are being developed to assist in the diagnosis of abnormal breast findings using mammograms. Most current CAD systems do so using only the most recent mammogram. This paper reviews the development of methods that emulate the radiological approach and perform automatic segmentation and/or classification of breast abnormalities using sequential mammogram pairs. It begins by demonstrating the importance of using prior views in mammography, through a review of studies that compared the performance of expert and less-trained radiologists. Next, image registration techniques and their application to mammography are presented. Subsequently, studies that implemented temporal analysis or subtraction of temporally sequential mammograms are summarized. Finally, the open access mammography datasets are described. This review can serve as a thorough introduction to the use of prior information in breast cancer CAD systems and also provides indicative directions to guide future applications.
4
Hadjiiski LM, Cha KH, Cohan RH, Chan HP, Caoili EM, Davenport MS, Samala RK, Weizer AZ, Alva A, Kirova-Nedyalkova G, Shampain K, Meyer N, Barkmeier D, Woolen SA, Shankar PR, Francis IR, Palmbos PL. Intraobserver Variability in Bladder Cancer Treatment Response Assessment With and Without Computerized Decision Support. Tomography 2021; 6:194-202. [PMID: 32548296 PMCID: PMC7289252 DOI: 10.18383/j.tom.2020.00013] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/11/2022]
Abstract
We evaluated the intraobserver variability of physicians aided by a computerized decision-support system for treatment response assessment (CDSS-T) to identify patients who show complete response to neoadjuvant chemotherapy for bladder cancer, and the effects of the intraobserver variability on physicians' assessment accuracy. A CDSS-T tool was developed that uses a combination of deep learning neural network and radiomic features from computed tomography (CT) scans to detect bladder cancers that have fully responded to neoadjuvant treatment. Pre- and postchemotherapy CT scans of 157 bladder cancers from 123 patients were collected. In a multireader, multicase observer study, physician-observers estimated the likelihood of pathologic T0 disease by viewing paired pre/posttreatment CT scans placed side by side on an in-house-developed graphical user interface. Five abdominal radiologists, 4 diagnostic radiology residents, 2 oncologists, and 1 urologist participated as observers. They first provided an estimate without CDSS-T and then with CDSS-T. A subset of cases was evaluated twice to study the intraobserver variability and its effects on observer consistency. The mean areas under the curves for assessment of pathologic T0 disease were 0.85 for CDSS-T alone, 0.76 for physicians without CDSS-T and improved to 0.80 for physicians with CDSS-T (P = .001) in the original evaluation, and 0.78 for physicians without CDSS-T and improved to 0.81 for physicians with CDSS-T (P = .010) in the repeated evaluation. The intraobserver variability was significantly reduced with CDSS-T (P < .0001). The CDSS-T can significantly reduce physicians' variability and improve their accuracy for identifying complete response of muscle-invasive bladder cancer to neoadjuvant chemotherapy.
Affiliation(s)
- Ajjai Alva
  - Internal Medicine, Division of Hematology-Oncology, University of Michigan, Ann Arbor, MI
- Sean A Woolen
  - Department of Radiology, University of California, San Francisco, Medical Center, San Francisco, CA
- Phillip L Palmbos
  - Internal Medicine, Division of Hematology-Oncology, University of Michigan, Ann Arbor, MI
5
A Review of the Role of Augmented Intelligence in Breast Imaging: From Automated Breast Density Assessment to Risk Stratification. AJR Am J Roentgenol 2019; 212:259-270. [DOI: 10.2214/ajr.18.20391] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 01/07/2023]
6
A comparative study of Ki-67 antigen expression between luminal A and triple-negative subtypes of breast cancer. Med Oncol 2017; 34:156. [DOI: 10.1007/s12032-017-1019-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 07/23/2017] [Accepted: 07/31/2017] [Indexed: 12/29/2022]
7
Mehdy MM, Ng PY, Shair EF, Saleh NIM, Gomes C. Artificial Neural Networks in Image Processing for Early Detection of Breast Cancer. Computational and Mathematical Methods in Medicine 2017; 2017:2610628. [PMID: 28473865 PMCID: PMC5394406 DOI: 10.1155/2017/2610628] [Citation(s) in RCA: 75] [Impact Index Per Article: 10.7] [Received: 01/15/2017] [Accepted: 03/09/2017] [Indexed: 12/26/2022]
Abstract
Medical imaging techniques have widely been in use in the diagnosis and detection of breast cancer. The drawback of applying these techniques is the large time consumption in the manual diagnosis of each image pattern by a professional radiologist. Automated classifiers could substantially upgrade the diagnosis process, in terms of both accuracy and time requirement by distinguishing benign and malignant patterns automatically. Neural network (NN) plays an important role in this respect, especially in the application of breast cancer detection. Despite the large number of publications that describe the utilization of NN in various medical techniques, only a few reviews are available that guide the development of these algorithms to enhance the detection techniques with respect to specificity and sensitivity. The purpose of this review is to analyze the contents of recently published literature with special attention to techniques and states of the art of NN in medical imaging. We discuss the usage of NN in four different medical imaging applications to show that NN is not restricted to few areas of medicine. Types of NN used, along with the various types of feeding data, have been reviewed. We also address hybrid NN adaptation in breast cancer detection.
Affiliation(s)
- M. M. Mehdy
  - Department of Computer and Communication System Engineering, Universiti Putra Malaysia, Serdang, Selangor, Malaysia
- P. Y. Ng
  - Department of Computer and Communication System Engineering, Universiti Putra Malaysia, Serdang, Selangor, Malaysia
- E. F. Shair
  - Department of Electrical and Electronics Engineering, Universiti Putra Malaysia, Serdang, Selangor, Malaysia
- N. I. Md Saleh
  - Department of Chemical and Environmental Engineering, Universiti Putra Malaysia, Serdang, Selangor, Malaysia
- C. Gomes
  - Department of Electrical and Electronics Engineering, Universiti Putra Malaysia, Serdang, Selangor, Malaysia
8
Lee J, Nishikawa RM, Reiser I, Boone JM, Lindfors KK. Local curvature analysis for classifying breast tumors: Preliminary analysis in dedicated breast CT. Med Phys 2016; 42:5479-89. [PMID: 26328996 DOI: 10.1118/1.4928479] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Indexed: 01/11/2023]
Abstract
PURPOSE The purpose of this study is to measure the effectiveness of local curvature measures as novel image features for classifying breast tumors. METHODS A total of 119 breast lesions from 104 noncontrast dedicated breast computed tomography images of women were used in this study. Volumetric segmentation was done using a seed-based segmentation algorithm and then a triangulated surface was extracted from the resulting segmentation. Total, mean, and Gaussian curvatures were then computed. Normalized curvatures were used as classification features. In addition, traditional image features were also extracted and a forward feature selection scheme was used to select the optimal feature set. Logistic regression was used as a classifier and leave-one-out cross-validation was utilized to evaluate the classification performances of the features. The area under the receiver operating characteristic curve (AUC, area under curve) was used as a figure of merit. RESULTS Among curvature measures, the normalized total curvature (CT) showed the best classification performance (AUC of 0.74), while the others showed no classification power individually. Five traditional image features (two shape, two margin, and one texture descriptors) were selected via the feature selection scheme and its resulting classifier achieved an AUC of 0.83. Among those five features, the radial gradient index (RGI), which is a margin descriptor, showed the best classification performance (AUC of 0.73). A classifier combining RGI and CT yielded an AUC of 0.81, which showed similar performance (i.e., no statistically significant difference) to the classifier with the above five traditional image features. Additional comparisons in AUC values between classifiers using different combinations of traditional image features and CT were conducted. The results showed that CT was able to replace the other four image features for the classification task. 
CONCLUSIONS The normalized curvature measure contains useful information in classifying breast tumors. Using this, one can reduce the number of features in a classifier, which may result in more robust classifiers for different datasets.
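The evaluation scheme described in this abstract, leave-one-out cross-validation scored by the area under the ROC curve, can be sketched in plain Python. The sketch is illustrative only: the single-feature class-mean scorer is a stand-in for the logistic regression used in the study, and the function names and toy feature values are assumptions.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def loo_scores(features, labels):
    """Leave-one-out: score each case with a model fitted on all other cases."""
    scores = []
    for i, x in enumerate(features):
        train = [(f, y) for j, (f, y) in enumerate(zip(features, labels)) if j != i]
        pos = [f for f, y in train if y == 1]
        neg = [f for f, y in train if y == 0]
        mean_pos = sum(pos) / len(pos)
        mean_neg = sum(neg) / len(neg)
        # Stand-in classifier (assumption): signed distance from the midpoint
        # between the two class means, oriented so positives score higher.
        midpoint = (mean_pos + mean_neg) / 2
        direction = 1.0 if mean_pos >= mean_neg else -1.0
        scores.append(direction * (x - midpoint))
    return scores


# Hypothetical single-feature data: 0 = benign, 1 = malignant.
feature_values = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
case_labels = [0, 0, 0, 1, 1, 1]
print(auc(loo_scores(feature_values, case_labels), case_labels))
```

Each held-out score comes from a slightly different fitted model, so pooling them into one AUC is a convenience that works best when the folds produce scores on a comparable scale.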
Affiliation(s)
- Juhun Lee
  - Department of Radiology, University of Pittsburgh, Pittsburgh, Pennsylvania 15213
- Robert M Nishikawa
  - Department of Radiology, University of Pittsburgh, Pittsburgh, Pennsylvania 15213
- Ingrid Reiser
  - Department of Radiology, University of Chicago, Chicago, Illinois 60637
- John M Boone
  - Department of Radiology, University of California Davis Medical Center, Sacramento, California 95817
- Karen K Lindfors
  - Department of Radiology, University of California Davis Medical Center, Sacramento, California 95817
9
Veisy A, Lotfinejad S, Salehi K, Zhian F. Risk of breast cancer in relation to reproductive factors in North-West of Iran, 2013-2014. Asian Pac J Cancer Prev 2015; 16:451-5. [PMID: 25684470 DOI: 10.7314/apjcp.2015.16.2.451] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Indexed: 11/10/2022]
Abstract
More than one million new patients are diagnosed with breast cancer annually worldwide. In developed countries, breast cancer is the most common malignancy diagnosed among women, and in developing regions, it often ranks second to cervical cancer. This study aimed to investigate the relationship between the incidence of breast cancer and reproductive factors in the North-West of Iran. This retrospective analytical case-control study was conducted with 235 breast cancer patients and 235 women in the control group. Data collection tools included a set of interview questions and patient medical records. Data were analyzed using the t-test, Chi-square test, Fisher's test, and Pearson correlation coefficient. Significantly increased risk of breast cancer was associated with older age at first pregnancy, age at menopause, and history of contraceptive use. A trend toward decreasing risk was observed with increasing parity. The findings showed no association between breast cancer and age at menarche. The study results suggest that physiological and reproductive factors may play important roles in the development of breast cancer among Iranian women.
Affiliation(s)
- Afsaneh Veisy
  - Department of Midwifery, Mahabad Branch, Islamic Azad University, Iran
10
Bozek J, Kallenberg M, Grgic M, Karssemeijer N. Use of volumetric features for temporal comparison of mass lesions in full field digital mammograms. Med Phys 2014; 41:021902. [PMID: 24506623 DOI: 10.1118/1.4860956] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 11/07/2022]
Abstract
PURPOSE Temporal comparison of lesions might improve classification between benign and malignant lesions in full-field digital mammograms (FFDM). The authors compare the use of volumetric features for lesion classification, which are computed from dense tissue thickness maps, to the use of mammographic lesion area. Use of dense tissue thickness maps for lesion characterization is advantageous, since it results in lesion features that are invariant to acquisition parameters. METHODS The dataset used in the analysis consisted of 60 temporal mammogram pairs comprising 120 mediolateral oblique or craniocaudal views with a total of 65 lesions, of which 41 were benign and 24 malignant. The authors analyzed the performance of four volumetric features, area, and four other commonly used features obtained from temporal mammogram pairs, current mammograms, and prior mammograms. The authors evaluated the individual performance of all features and of different feature sets. The authors used linear discriminant analysis with leave-one-out cross validation to classify different feature sets. RESULTS Volumetric features from temporal mammogram pairs achieved the best individual performance, as measured by the area under the receiver operating characteristic curve (Az value). Volume change (Az = 0.88) achieved higher Az value than projected lesion area change (Az = 0.78) in the temporal comparison of lesions. Best performance was achieved with a set that consisted of a set of features extracted from the current exam combined with four volumetric features representing changes with respect to the prior mammogram (Az = 0.90). This was significantly better (p = 0.005) than the performance obtained using features from the current exam only (Az = 0.77). 
CONCLUSIONS Volumetric features from temporal mammogram pairs combined with features from the single exam significantly improve discrimination of benign and malignant lesions in FFDM mammograms compared to using only single exam features. In the comparison with prior mammograms, use of volumetric change may lead to better performance than use of lesion area change.
Affiliation(s)
- Jelena Bozek
  - Faculty of Electrical Engineering and Computing, University of Zagreb, Unska 3, HR-10000 Zagreb, Croatia
- Michiel Kallenberg
  - Department of Radiology, Radboud University Nijmegen Medical Centre, Geert Grooteplein Zuid 18, 6525 GA Nijmegen, The Netherlands
- Mislav Grgic
  - Faculty of Electrical Engineering and Computing, University of Zagreb, Unska 3, HR-10000 Zagreb, Croatia
- Nico Karssemeijer
  - Department of Radiology, Radboud University Nijmegen Medical Centre, Geert Grooteplein Zuid 18, 6525 GA Nijmegen, The Netherlands
11
Ayer T, Chen Q, Burnside ES. Artificial neural networks in mammography interpretation and diagnostic decision making. Computational and Mathematical Methods in Medicine 2013; 2013:832509. [PMID: 23781276 PMCID: PMC3677609 DOI: 10.1155/2013/832509] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 01/18/2013] [Accepted: 04/22/2013] [Indexed: 11/27/2022]
Abstract
Screening mammography is the most effective means for early detection of breast cancer. Although general rules for discriminating malignant and benign lesions exist, radiologists are unable to perfectly detect and classify all lesions as malignant and benign, for many reasons which include, but are not limited to, overlap of features that distinguish malignancy, difficulty in estimating disease risk, and variability in recommended management. When predictive variables are numerous and interact, ad hoc decision making strategies based on experience and memory may lead to systematic errors and variability in practice. The integration of computer models to help radiologists increase the accuracy of mammography examinations in diagnostic decision making has gained increasing attention in the last two decades. In this study, we provide an overview of one of the most commonly used models, artificial neural networks (ANNs), in mammography interpretation and diagnostic decision making and discuss important features in mammography interpretation. We conclude by discussing several common limitations of existing research on ANN-based detection and diagnostic models and provide possible future research directions.
Affiliation(s)
- Turgay Ayer
- H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, 765 Ferst Dr., Atlanta, GA 30332, USA.
12
Evaluating imaging and computer-aided detection and diagnosis devices at the FDA. Acad Radiol 2012; 19:463-77. [PMID: 22306064 DOI: 10.1016/j.acra.2011.12.016] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.4] [Received: 10/25/2011] [Revised: 12/22/2011] [Accepted: 12/28/2011] [Indexed: 11/22/2022]
Abstract
This report summarizes the Joint FDA-MIPS Workshop on Methods for the Evaluation of Imaging and Computer-Assist Devices. The purpose of the workshop was to gather information on the current state of the science and facilitate consensus development on statistical methods and study designs for the evaluation of imaging devices to support US Food and Drug Administration submissions. Additionally, participants expected to identify gaps in knowledge and unmet needs that should be addressed in future research. This summary is intended to document the topics that were discussed at the meeting and disseminate the lessons that have been learned through past studies of imaging and computer-aided detection and diagnosis device performance.
13
Clinically missed cancer: how effectively can radiologists use computer-aided detection? AJR Am J Roentgenol 2012; 198:708-16. [PMID: 22358014 DOI: 10.2214/ajr.11.6423] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Indexed: 11/18/2022]
Abstract
OBJECTIVE The purpose of this study was to determine the effectiveness with which radiologists can use computer-aided detection (CADe) to detect cancer missed at screening. MATERIALS AND METHODS An observer study was performed to measure the ability of radiologists to detect breast cancer on mammograms with and without CADe. The images in the study were from 300 analog mammographic examinations. In 234 cases the mammograms were read clinically as normal and free of cancer for at least 2 subsequent years. In the other 66 cases, cancers were missed clinically. In 256 cases, current and previous mammograms were available. Eight radiologists read the dataset and recorded a BI-RADS assessment, the location of the lesion, and their level of confidence that the patient should be recalled for diagnostic workup for each suspicious lesion. Jackknife alternative free-response receiver operating characteristic analysis was used. RESULTS The jackknife alternative free-response receiver operating characteristic figure of merit was 0.641 without aid and 0.659 with aid (p = 0.06; 95% CI, -0.001 to 0.036). The sensitivity increased 9.9% (95% CI, 3.4-19%) and the callback rate 12.1% (95% CI, 7.3-20%) with CADe. Both increases were statistically significant (p < 0.001). Radiologists on average ignored 71% of correct computer prompts. CONCLUSION Use of CADe can increase radiologist sensitivity 10% with a comparable increase in recall rate. There is potential for CADe to have a bigger clinical impact because radiologists failed to recognize a correct computer prompt in 71% of missed cancer cases [corrected].
14
Singh S, Maxwell J, Baker JA, Nicholas JL, Lo JY. Computer-aided classification of breast masses: performance and interobserver variability of expert radiologists versus residents. Radiology 2010; 258:73-80. [PMID: 20971779 DOI: 10.1148/radiol.10081308] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Indexed: 11/11/2022]
Abstract
PURPOSE To evaluate the interobserver variability in descriptions of breast masses by dedicated breast imagers and radiology residents and determine how any differences in lesion description affect the performance of a computer-aided diagnosis (CAD) computer classification system. MATERIALS AND METHODS Institutional review board approval was obtained for this HIPAA-compliant study, and the requirement to obtain informed consent was waived. Images of 50 breast lesions were individually interpreted by seven dedicated breast imagers and 10 radiology residents, yielding 850 lesion interpretations. Lesions were described with use of 11 descriptors from the Breast Imaging Reporting and Data System, and interobserver variability was calculated with the Cohen κ statistic. Those 11 features were selected, along with patient age, and merged together by a linear discriminant analysis (LDA) classification model trained by using 1005 previously existing cases. Variability in the recommendations of the computer model for different observers was also calculated with the Cohen κ statistic. RESULTS A significant difference was observed for six lesion features, and radiology residents had greater interobserver variability in their selection of five of the six features than did dedicated breast imagers. The LDA model accurately classified lesions for both sets of observers (area under the receiver operating characteristic curve = 0.94 for residents and 0.96 for dedicated imagers). Sensitivity was maintained at 100% for residents and improved from 98% to 100% for dedicated breast imagers. For residents, the computer model could potentially improve the specificity from 20% to 40% (P < .01) and the κ value from 0.09 to 0.53 (P < .001). For dedicated breast imagers, the computer model could increase the specificity from 34% to 43% (P = .16) and the κ value from 0.21 to 0.61 (P < .001). 
CONCLUSION Among findings showing a significant difference, there was greater interobserver variability in lesion descriptions among residents; however, an LDA model using data from either dedicated breast imagers or residents yielded a consistently high performance in the differentiation of benign from malignant breast lesions, demonstrating potential for improving specificity and decreasing interobserver variability in biopsy recommendations.
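The Cohen κ statistic used in this study to quantify interobserver variability can be written out for a single pair of raters. The sketch below is a generic textbook formulation, not the authors' code; the function name and the example ratings are hypothetical, and how the study aggregated κ across its 17 readers is not described here.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    # Chance agreement: product of the two raters' marginal proportions, summed over categories.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one identical category")
    return (observed - expected) / (1.0 - expected)


# Hypothetical descriptor choices from two readers for four lesions.
reader_1 = ["benign", "benign", "malignant", "malignant"]
reader_2 = ["benign", "malignant", "malignant", "malignant"]
print(cohen_kappa(reader_1, reader_2))  # 0.5
```

A κ of 0 means agreement no better than chance and 1 means perfect agreement, which is the scale on which the reported improvements (for example from 0.09 to 0.53 for residents) should be read.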
Affiliation(s)
- Swatee Singh
- Carl E. Ravin Advanced Imaging Laboratories, Duke University Medical Center, 2424 Erwin Rd, Ste 302, Durham, NC 27705, USA.
|
15
|
Chan HP, Wu YT, Sahiner B, Wei J, Helvie MA, Zhang Y, Moore RH, Kopans DB, Hadjiiski L, Way T. Characterization of masses in digital breast tomosynthesis: comparison of machine learning in projection views and reconstructed slices. Med Phys 2010; 37:3576-86. [PMID: 20831065 DOI: 10.1118/1.3432570] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Indexed: 11/07/2022] Open
Abstract
PURPOSE In digital breast tomosynthesis (DBT), quasi-three-dimensional (3D) structural information is reconstructed from a small number of 2D projection view (PV) mammograms acquired over a limited angular range. The authors developed preliminary computer-aided diagnosis (CADx) methods for classification of malignant and benign masses and compared the effectiveness of analyzing lesion characteristics in the reconstructed DBT slices and in the PVs. METHODS A data set of MLO view DBT of 99 patients containing 107 masses (56 malignant and 51 benign) was collected at the Massachusetts General Hospital with IRB approval. The DBTs were obtained with a GE prototype system which acquired 11 PVs over a 50 degree arc. The authors reconstructed the DBTs at 1 mm slice interval using a simultaneous algebraic reconstruction technique. The region of interest (ROI) containing the mass was marked by a radiologist in the DBT volume and the corresponding ROIs on the PVs were derived based on the imaging geometry. The subsequent processes were fully automated. For classification of masses using the DBT-slice approach, the mass on each slice was segmented by an active contour model initialized with adaptive k-means clustering. A spiculation likelihood map was generated by analysis of the gradient directions around the mass margin and spiculation features were extracted from the map. The rubber band straightening transform (RBST) was applied to a band of pixels around the segmented mass boundary. The RBST image was enhanced by Sobel filtering in the horizontal and vertical directions, from which run-length statistics texture features were extracted. Morphological features including those from the normalized radial length were designed to describe the mass shape. 
A feature space composed of the spiculation features, texture features, and morphological features extracted from the central slice alone and seven feature spaces obtained by averaging the corresponding features from three to 19 slices centered at the central slice were compared. For classification of masses using the PV approach, a feature extraction process similar to that described above for the DBT approach was performed on the ROIs from the individual PVs. Six feature spaces obtained from the central PV alone and by averaging the corresponding features from three to 11 PVs were formed. In each feature space for either the DBT-slice or the PV approach, a linear discriminant analysis classifier with stepwise feature selection was trained and tested using a two-loop leave-one-case-out resampling procedure. Simplex optimization was used to guide feature selection automatically within the training set in each leave-one-case-out cycle. The performance of the classifiers was evaluated by the area (Az) under the receiver operating characteristic curve. RESULTS The test Az values from the DBT-slice approach ranged from 0.87 +/- 0.03 to 0.93 +/- 0.02, while those from the PV approach ranged from 0.78 +/- 0.04 to 0.84 +/- 0.04. The highest test Az of 0.93 +/- 0.02 from the nine-DBT-slice feature space was significantly (p = 0.006) better than the highest test Az of 0.84 +/- 0.04 from the nine-PV feature space. CONCLUSION The features of breast lesions extracted from the DBT slices consistently provided higher classification accuracy than those extracted from the PV images.
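The key point of the leave-one-case-out procedure described above is that all samples from the held-out case are excluded before feature selection, so the selection never sees test data. The sketch below illustrates only that structure. As a loudly labeled simplification, a single-feature nearest-class-mean rule stands in for the stepwise LDA with simplex optimization used by the authors, and all names and data are hypothetical.

```python
from statistics import mean


def leave_one_case_out(case_ids):
    """Yield (train_idx, test_idx); every sample of one case is held out together."""
    for held_out in sorted(set(case_ids)):
        train = [i for i, c in enumerate(case_ids) if c != held_out]
        test = [i for i, c in enumerate(case_ids) if c == held_out]
        yield train, test


def select_feature(X, y, idx):
    """Pick the feature with the largest class-mean gap, using training rows only."""
    def gap(j):
        pos = [X[i][j] for i in idx if y[i] == 1]
        neg = [X[i][j] for i in idx if y[i] == 0]
        return abs(mean(pos) - mean(neg))
    return max(range(len(X[0])), key=gap)


def loco_test_scores(X, y, case_ids):
    """Return one test score per sample; positive scores favor class 1."""
    scores = [None] * len(y)
    for train, test in leave_one_case_out(case_ids):
        j = select_feature(X, y, train)  # selection inside the training fold
        pos_mean = mean(X[i][j] for i in train if y[i] == 1)
        neg_mean = mean(X[i][j] for i in train if y[i] == 0)
        midpoint = (pos_mean + neg_mean) / 2
        sign = 1.0 if pos_mean >= neg_mean else -1.0
        for i in test:
            scores[i] = sign * (X[i][j] - midpoint)
    return scores


# Hypothetical data: 5 masses from 4 cases (case "c" has two masses), 2 features.
X = [[0.1, 0.5], [0.2, 0.4], [0.9, 0.5], [0.8, 0.4], [0.85, 0.45]]
y = [0, 0, 1, 1, 1]
case_ids = ["a", "b", "c", "c", "d"]
scores = loco_test_scores(X, y, case_ids)
```

Grouping by case rather than by mass matters here because the data set has 107 masses in 99 patients; holding out single masses could leave a sibling mass from the same patient in the training set.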
Affiliation(s)
- Heang-Ping Chan
- Department of Radiology, University of Michigan, Ann Arbor, Michigan 48109, USA.
|
16
|
Ayer T, Alagoz O, Chhatwal J, Shavlik JW, Kahn CE, Burnside ES. Breast cancer risk estimation with artificial neural networks revisited: discrimination and calibration. Cancer 2010; 116:3310-21. [PMID: 20564067 DOI: 10.1002/cncr.25081] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.2] [Indexed: 11/07/2022]
Abstract
BACKGROUND Discriminating malignant breast lesions from benign ones and accurately predicting the risk of breast cancer for individual patients are crucial to successful clinical decisions. In the past, several artificial neural network (ANN) models have been developed for breast cancer-risk prediction. All studies have reported discrimination performance, but not one has assessed calibration, which is an equally important measure for accurate risk prediction. In this study, the authors have evaluated whether an ANN trained on a large prospectively collected dataset of consecutive mammography findings can discriminate between benign and malignant disease and accurately predict the probability of breast cancer for individual patients. METHODS The dataset consisted of 62,219 consecutively collected mammography findings matched with the Wisconsin State Cancer Reporting System. The authors built a 3-layer feedforward ANN with 1000 hidden-layer nodes. The authors trained and tested their ANN by using 10-fold cross-validation to predict the risk of breast cancer. The authors used the area under the receiver-operating characteristic curve (AUC), sensitivity, and specificity to evaluate discriminative performance of the radiologists and their ANN. The authors assessed the accuracy of risk prediction (ie, calibration) of their ANN by using the Hosmer-Lemeshow (H-L) goodness-of-fit test. RESULTS Their ANN demonstrated superior discrimination (AUC, 0.965) compared with the radiologists (AUC, 0.939; P<.001). The authors' ANN was also well calibrated as shown by an H-L goodness of fit P-value of .13. CONCLUSIONS The authors' ANN can effectively discriminate malignant abnormalities from benign ones and accurately predict the risk of breast cancer for individual abnormalities.
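Calibration, as opposed to discrimination, asks whether predicted probabilities match observed event rates. The following is a minimal sketch of the Hosmer-Lemeshow statistic, assuming equal-size risk groups formed by sorting on predicted probability; the grouping rule, function name, and toy inputs are assumptions, since the abstract does not give these details.

```python
def hosmer_lemeshow_statistic(probs, outcomes, groups=10):
    """Return (H-L chi-square statistic, degrees of freedom = groups - 2)."""
    if len(probs) != len(outcomes) or len(probs) < groups:
        raise ValueError("need matching inputs with at least one sample per group")
    # Stable sort on predicted probability only, so ties keep their input order.
    pairs = sorted(zip(probs, outcomes), key=lambda pair: pair[0])
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        n_g = len(chunk)
        expected = sum(p for p, _ in chunk)   # expected number of events
        observed = sum(o for _, o in chunk)   # observed number of events
        mean_p = expected / n_g
        variance = n_g * mean_p * (1.0 - mean_p)
        if variance == 0.0:
            raise ValueError("a group has mean predicted probability of 0 or 1")
        statistic += (observed - expected) ** 2 / variance
    return statistic, groups - 2


# Hypothetical predictions: every finding assigned a 50% risk.
well_calibrated = hosmer_lemeshow_statistic([0.5] * 6, [1, 0, 1, 0, 1, 0], groups=3)
miscalibrated = hosmer_lemeshow_statistic([0.5] * 6, [1, 1, 1, 1, 1, 1], groups=3)
```

The P-value is the upper tail of a chi-square distribution with the returned degrees of freedom (for instance via `scipy.stats.chi2.sf`, if SciPy is available). A large P-value, such as the .13 reported above, means no significant departure from good calibration was detected.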
Affiliation(s)
- Turgay Ayer
- Industrial and Systems Engineering Department, University of Wisconsin, Madison, Wisconsin 53792-3252, USA
|
17
|
Ayer T, Ayvaci MUS, Liu ZX, Alagoz O, Burnside ES. Computer-aided diagnostic models in breast cancer screening. Imaging in Medicine 2010; 2:313-323. [PMID: 20835372 PMCID: PMC2936490 DOI: 10.2217/iim.10.24] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Indexed: 01/11/2023]
Abstract
Mammography is the most common modality for breast cancer detection and diagnosis and is often complemented by ultrasound and MRI. However, similarities between early signs of breast cancer and normal structures in these images make detection and diagnosis of breast cancer a difficult task. To aid physicians in detection and diagnosis, computer-aided detection and computer-aided diagnostic (CADx) models have been proposed. A large number of studies have been published for both computer-aided detection and CADx models in the last 20 years. The purpose of this article is to provide a comprehensive survey of the CADx models that have been proposed to aid in mammography, ultrasound and MRI interpretation. We summarize the noteworthy studies according to the screening modality they consider and describe the type of computer model, input data size, feature selection method, input feature type, reference standard and performance measures for each study. We also list the limitations of the existing CADx models and provide several possible future research directions.
Affiliation(s)
- Turgay Ayer
- Industrial & Systems Engineering Department, University of Wisconsin, Madison, WI, USA
- Mehmet US Ayvaci
- Industrial & Systems Engineering Department, University of Wisconsin, Madison, WI, USA
- Ze Xiu Liu
- Industrial & Systems Engineering Department, University of Wisconsin, Madison, WI, USA
- Oguzhan Alagoz
- Industrial & Systems Engineering Department, University of Wisconsin, Madison, WI, USA
- Department of Population Health Sciences, University of Wisconsin, Madison, WI, USA
- Elizabeth S Burnside
- Industrial & Systems Engineering Department, University of Wisconsin, Madison, WI, USA
- Department of Biostatistics & Medical Informatics, University of Wisconsin, Madison, WI, USA
|
18
|
Way T, Chan HP, Hadjiiski L, Sahiner B, Chughtai A, Song TK, Poopat C, Stojanovska J, Frank L, Attili A, Bogot N, Cascade PN, Kazerooni EA. Computer-aided diagnosis of lung nodules on CT scans: ROC study of its effect on radiologists' performance. Acad Radiol 2010; 17:323-32. [PMID: 20152726 DOI: 10.1016/j.acra.2009.10.016] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Received: 07/02/2009] [Revised: 10/02/2009] [Accepted: 10/13/2009] [Indexed: 10/19/2022]
Abstract
RATIONALE AND OBJECTIVES The aim of this study was to evaluate the effect of computer-aided diagnosis (CAD) on radiologists' estimates of the likelihood of malignancy of lung nodules on computed tomographic (CT) imaging. METHODS AND MATERIALS A total of 256 lung nodules (124 malignant, 132 benign) were retrospectively collected from the thoracic CT scans of 152 patients. An automated CAD system was developed to characterize and provide malignancy ratings for lung nodules on CT volumetric images. An observer study was conducted using receiver-operating characteristic analysis to evaluate the effect of CAD on radiologists' characterization of lung nodules. Six fellowship-trained thoracic radiologists served as readers. The readers rated the likelihood of malignancy on a scale of 0% to 100% and recommended appropriate action first without CAD and then with CAD. The observer ratings were analyzed using the Dorfman-Berbaum-Metz multireader, multicase method. RESULTS The CAD system achieved a test area under the receiver-operating characteristic curve (A(z)) of 0.857 +/- 0.023 using the perimeter, two nodule radii measures, two texture features, and two gradient field features. All six radiologists obtained improved performance with CAD. The average A(z) of the radiologists improved significantly (P < .01) from 0.833 (range, 0.817-0.847) to 0.853 (range, 0.834-0.887). CONCLUSION CAD has the potential to increase radiologists' accuracy in assessing the likelihood of malignancy of lung nodules on CT imaging.
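The area under the ROC curve can be read as the probability that a randomly chosen malignant nodule receives a higher rating than a randomly chosen benign one. The sketch below computes the empirical (Mann-Whitney) AUC from 0% to 100% likelihood ratings. This is a swapped-in illustration: the study reports a fitted A(z) and uses the Dorfman-Berbaum-Metz multireader, multicase analysis, neither of which is reproduced here, and the ratings shown are hypothetical.

```python
def empirical_auc(ratings, labels):
    """Empirical AUC: fraction of (malignant, benign) pairs ranked correctly.

    Ties between a malignant and a benign rating count as half a correct pair.
    """
    malignant = [r for r, label in zip(ratings, labels) if label == 1]
    benign = [r for r, label in zip(ratings, labels) if label == 0]
    if not malignant or not benign:
        raise ValueError("need at least one malignant and one benign case")
    correct = 0.0
    for m in malignant:
        for b in benign:
            if m > b:
                correct += 1.0
            elif m == b:
                correct += 0.5
    return correct / (len(malignant) * len(benign))


# Hypothetical likelihood-of-malignancy ratings (percent) from one reader.
ratings = [90, 80, 30, 20]
labels = [1, 0, 1, 0]  # 1 = malignant, 0 = benign
auc = empirical_auc(ratings, labels)
```

Comparing such an AUC without and with CAD for a single reader shows the direction of the effect, but significance across readers and cases requires a multireader method like the one named in the abstract.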
|
19
|
Computer-aided detection in full-field digital mammography in a clinical population: performance of radiologist and technologists. Breast Cancer Res Treat 2009; 120:499-506. [PMID: 19418215 DOI: 10.1007/s10549-009-0409-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 01/17/2009] [Accepted: 04/21/2009] [Indexed: 01/08/2023]
Abstract
The purpose of the study was to evaluate the impact of a computer-aided detection (CAD) system on the performance of mammogram readers in interpreting digital mammograms in a clinical population. Furthermore, the ability of a CAD system to detect breast cancer in digital mammography was studied in comparison to the performance of radiologists and technologists as mammogram readers. Digital mammograms of 1,048 consecutive patients were evaluated by a radiologist and three technologists. Abnormalities were recorded and an imaging conclusion was given as a BI-RADS score before and after CAD analysis. Pathology results during 12 months follow up were used as a reference standard for breast cancer. Fifty-one malignancies were found in 50 patients. Sensitivity and specificity were computed before and after CAD analysis and provided with 95% CIs. In order to assess the detection rate of malignancies by CAD and the observers, the pathological locations of these 51 breast cancers were matched with the locations of the CAD marks and the mammographic locations that were considered to be suspicious by the observers. For all observers, the sensitivity rates did not change after application of CAD. A mean sensitivity of 92% was found for all technologists and 84% for the radiologist. For two technologists, the specificity decreased (from 84 to 83% and from 77 to 75%). For the radiologist and one technologist, the application of CAD did not have any impact on the specificity rates (95 and 83%, respectively). CAD detected 78% of all malignancies. Five malignancies were indicated by CAD without being noticed as suspicious by the observers. In conclusion, the results show that systematic application of CAD in a clinical patient population failed to improve the overall sensitivity of mammogram interpretation by the readers and was associated with an increase in false-positive results. However, CAD marked five malignancies that were missed by the different readers.
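Sensitivity and specificity with 95% CIs, as reported above, can be computed from the four cells of a reader's confusion table. The sketch below uses the Wilson score interval, which is an assumption because the abstract does not state which interval method was used; the counts are hypothetical and chosen only to resemble the study's scale.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need 0 <= successes <= total and total > 0")
    p = successes / total
    denom = 1.0 + z * z / total
    center = (p + z * z / (2.0 * total)) / denom
    half_width = z * sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return center - half_width, center + half_width


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-table counts."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical reader: 43 of 51 cancers flagged, 900 of 997 normals cleared.
sens, spec = sensitivity_specificity(tp=43, fn=8, tn=900, fp=97)
sens_ci = wilson_interval(43, 51)
spec_ci = wilson_interval(900, 997)
```

With only about 50 cancers, the sensitivity interval is wide, which is worth keeping in mind when reading "sensitivity did not change after application of CAD".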
|
20
|
Elter M, Horsch A. CADx of mammographic masses and clustered microcalcifications: A review. Med Phys 2009; 36:2052-68. [PMID: 19610294 DOI: 10.1118/1.3121511] [Citation(s) in RCA: 141] [Impact Index Per Article: 9.4] [Indexed: 11/07/2022] Open
Affiliation(s)
- Matthias Elter
- Fraunhofer Institute for Integrated Circuits, Am Wolfsmantel 33, 91058 Erlangen, Germany.
|
21
|
Filev P, Hadjiiski L, Chan HP, Sahiner B, Ge J, Helvie MA, Roubidoux M, Zhou C. Automated regional registration and characterization of corresponding microcalcification clusters on temporal pairs of mammograms for interval change analysis. Med Phys 2009; 35:5340-50. [PMID: 19175093 DOI: 10.1118/1.3002311] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/07/2022] Open
Abstract
A computerized regional registration and characterization system for analysis of microcalcification clusters on serial mammograms is being developed in our laboratory. The system consists of two stages. In the first stage, based on the location of a detected cluster on the current mammogram, a regional registration procedure identifies the local area on the prior that may contain the corresponding cluster. A search program is used to detect cluster candidates within the local area. The detected cluster on the current image is then paired with the cluster candidates on the prior image to form true (TP-TP) or false (TP-FP) pairs. Automatically extracted features were used in a newly designed correspondence classifier to reduce the number of false pairs. In the second stage, a temporal classifier, based on both current and prior information, is used if a cluster has been detected on the prior image, and a current classifier, based on current information alone, is used if no prior cluster has been detected. The data set used in this study consisted of 261 serial pairs containing biopsy-proven calcification clusters. An MQSA radiologist identified the corresponding clusters on the mammograms. On the priors, the radiologist rated the subtlety of 30 clusters (out of the 261 clusters) as 9 or 10 on a scale of 1 (very obvious) to 10 (very subtle). Leave-one-case-out resampling was used for feature selection and classification in both the correspondence and malignant/benign classification schemes. The search program detected 91.2% (238/261) of the clusters on the priors with an average of 0.42 FPs/image. The correspondence classifier identified 86.6% (226/261) of the TP-TP pairs with 20 false matches (0.08 FPs/image) relative to the entire set of 261 image pairs. In the malignant/benign classification stage the temporal classifier achieved a test A(z) of 0.81 for the 246 pairs which contained a detection on the prior. 
In addition, a classifier was designed by using the clusters on the current mammograms only. It achieved a test A(z) of 0.72 in classifying the clusters as malignant and benign. The difference between the performance of the temporal classifier and the current classifier was statistically significant (p=0.0014). Our interval change analysis system can detect the corresponding cluster on the prior mammogram with high sensitivity, and classify them with a satisfactory accuracy.
Affiliation(s)
- Peter Filev
- Department of Radiology, The University of Michigan, Ann Arbor, Michigan 48109-0904, USA
|
22
|
Zheng B. Breast Cancer: Computer-Aided Detection. Methods of Cancer Diagnosis, Therapy and Prognosis 2008:5-27. [DOI: 10.1007/978-1-4020-8369-3_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/01/2023]
|
23
|
Li H, Giger ML, Yuan Y, Chen W, Horsch K, Lan L, Jamieson AR, Sennett CA, Jansen SA. Evaluation of computer-aided diagnosis on a large clinical full-field digital mammographic dataset. Acad Radiol 2008; 15:1437-45. [PMID: 18995194 DOI: 10.1016/j.acra.2008.05.004] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Received: 02/15/2008] [Revised: 05/07/2008] [Accepted: 03/11/2008] [Indexed: 10/21/2022]
Abstract
RATIONALE AND OBJECTIVES To convert and optimize our previously developed computerized analysis methods for use with images from full-field digital mammography (FFDM) for breast mass classification to aid in the diagnosis of breast cancer. MATERIALS AND METHODS An institutional review board approved protocol was obtained, with waiver of consent for retrospective use of mammograms and pathology data. Seven hundred thirty-nine FFDM images, which contained 287 biopsy-proven breast mass lesions, of which 148 lesions were malignant and 139 lesions were benign, were retrospectively collected. Lesion margins were delineated by an expert breast radiologist and were used as the truth for lesion-segmentation evaluation. Our computerized image analysis method consisted of several steps: 1) identified lesions were automatically extracted from the parenchymal background using computerized segmentation methods; 2) a set of image characteristics (mathematic descriptors) were automatically extracted from image data of the lesions and surrounding tissues; and 3) selected features were merged into an estimate of the probability of malignancy using a Bayesian artificial neural network classifier. Performance of the analyses was evaluated at various stages of the conversion using receiver-operating characteristic analysis. RESULTS An area under the curve value of 0.81 was obtained in the task of distinguishing between malignant and benign mass lesions in a round-robin by case evaluation on the entire FFDM dataset. We failed to show a statistically significant difference (P = .83) compared to results from our previous study in which the computerized classification was performed on digitized screen-film mammograms. CONCLUSIONS Our computerized analysis methods developed on digitized screen-film mammography can be converted for use with FFDM. 
Results show that the computerized analysis methods for the diagnosis of breast mass lesions on FFDM are promising, and can potentially be used to aid clinicians in the diagnostic interpretation of FFDM.
|
24
|
Brenner RJ. Computer-assisted detection in clinical practice: medical legal considerations. Semin Roentgenol 2008; 42:280-6. [PMID: 17919530 DOI: 10.1053/j.ro.2007.07.004] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/11/2022]
Affiliation(s)
- R James Brenner
- Breast Imaging Section, University of California, San Francisco, San Francisco, California 94115-1667, USA.
|
25
|
Berbaum KS. God, like the Devil, is in the details. Acad Radiol 2006; 13:1311-6. [PMID: 17070448 DOI: 10.1016/j.acra.2006.09.053] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/22/2006] [Revised: 09/22/2006] [Accepted: 09/22/2006] [Indexed: 10/24/2022]
|