1
Whitney HM, Drukker K, Vieceli M, Van Dusen A, de Oliveira M, Abe H, Giger ML. Role of sureness in evaluating AI/CADx: Lesion-based repeatability of machine learning classification performance on breast MRI. Med Phys 2024; 51:1812-1821. [PMID: 37602841 PMCID: PMC10879454 DOI: 10.1002/mp.16673] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Revised: 07/24/2023] [Accepted: 07/24/2023] [Indexed: 08/22/2023]
Abstract
BACKGROUND Artificial intelligence/computer-aided diagnosis (AI/CADx) and its use of radiomics have shown potential in diagnosis and prognosis of breast cancer. Performance metrics such as the area under the receiver operating characteristic (ROC) curve (AUC) are frequently used as figures of merit for the evaluation of CADx. Methods for evaluating lesion-based measures of performance may enhance the assessment of AI/CADx pipelines, particularly when comparing performance between classifiers. PURPOSE The purpose of this study was to investigate the use case of two standard classifiers to (1) compare overall classification performance of the classifiers in the task of distinguishing between benign and malignant breast lesions using radiomic features extracted from dynamic contrast-enhanced magnetic resonance (DCE-MR) images, (2) define a new repeatability metric (termed sureness), and (3) use sureness to examine if one classifier provides an advantage in AI diagnostic performance by lesion when using radiomic features. METHODS Images of 1052 breast lesions (201 benign, 851 cancers) had been retrospectively collected under HIPAA/IRB compliance. The lesions had been segmented automatically using a fuzzy c-means method and thirty-two radiomic features had been extracted. Classification was investigated for the task of malignant lesions (81% of the dataset) versus benign lesions (19%). Two classifiers (linear discriminant analysis [LDA] and support vector machines [SVM]) were trained and tested within 0.632 bootstrap analyses (2000 iterations). Whole-set classification performance was evaluated at two levels: (1) the 0.632+ bias-corrected area under the ROC curve (AUC) and (2) performance metric curves, which give variability in operating sensitivity and specificity at a target operating point (95% target sensitivity). Sureness was defined as 1 minus the width of the 95% confidence interval of the classifier output, computed for each lesion and each classifier.
Lesion-based repeatability was evaluated at two levels: (1) repeatability profiles, which represent the distribution of sureness across the decision threshold and (2) sureness of each lesion. The latter was used to identify lesions with better sureness with one classifier over another while maintaining lesion-based performance across the bootstrap iterations. RESULTS In classification performance assessment, the median and 95% CI of difference in AUC between the two classifiers did not show evidence of difference (ΔAUC = -0.003 [-0.031, 0.018]). Both classifiers achieved the target sensitivity. Sureness was more consistent across the classifier output range for the SVM classifier than the LDA classifier. The SVM resulted in a net gain of 33 benign lesions and 307 cancers with higher sureness and maintained lesion-based performance. However, with the LDA there was a notable percentage of benign lesions (42%) with better sureness but lower lesion-based performance. CONCLUSIONS When there is no evidence for difference in performance between classifiers using AUC or other performance summary measures, a lesion-based sureness metric may provide additional insight into AI pipeline design. These findings present and emphasize the utility of lesion-based repeatability via sureness in AI/CADx as a complementary enhancement to other evaluation measures.
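The sureness metric described in this abstract can be illustrated with a minimal sketch. It assumes that classifier outputs are scaled to [0, 1], that the 95% confidence interval is taken as the 2.5th to 97.5th percentile range of a lesion's outputs across the bootstrap iterations in which it was tested, and that sureness is 1 minus the width of that interval. The function names `percentile` and `sureness` are hypothetical and are not taken from the paper.

```python
def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of a pre-sorted list."""
    if not sorted_vals:
        raise ValueError("need at least one value")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def sureness(outputs):
    """Assumed definition: 1 minus the width of the 95% percentile interval
    of one lesion's classifier outputs across bootstrap iterations."""
    s = sorted(outputs)
    width = percentile(s, 97.5) - percentile(s, 2.5)
    return 1.0 - width


# One lesion scored consistently, one scored erratically (made-up values).
stable_lesion = [0.80, 0.81, 0.79, 0.80, 0.82, 0.80]
erratic_lesion = [0.10, 0.90, 0.40, 0.75, 0.20, 0.60]
print(sureness(stable_lesion), sureness(erratic_lesion))
```

Under these assumptions, a lesion whose output barely moves between iterations has sureness near 1, while a lesion whose output spans most of the scale has sureness near 0.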
Affiliation(s)
- Heather M. Whitney
- Department of Radiology, The University of Chicago, Chicago, IL 60637, USA
- Karen Drukker
- Department of Radiology, The University of Chicago, Chicago, IL 60637, USA
- Michael Vieceli
- Department of Physics, Wheaton College, Wheaton, IL 60187, USA
- Amy Van Dusen
- Department of Physics, Wheaton College, Wheaton, IL 60187, USA
- Hiroyuki Abe
- Department of Radiology, The University of Chicago, Chicago, IL 60637, USA
- Maryellen L. Giger
- Department of Radiology, The University of Chicago, Chicago, IL 60637, USA
2
Huang EP, O'Connor JPB, McShane LM, Giger ML, Lambin P, Kinahan PE, Siegel EL, Shankar LK. Criteria for the translation of radiomics into clinically useful tests. Nat Rev Clin Oncol 2023; 20:69-82. [PMID: 36443594 PMCID: PMC9707172 DOI: 10.1038/s41571-022-00707-0] [Citation(s) in RCA: 66] [Impact Index Per Article: 66.0] [Accepted: 11/02/2022] [Indexed: 11/29/2022]
Abstract
Computer-extracted tumour characteristics have been incorporated into medical imaging computer-aided diagnosis (CAD) algorithms for decades. With the advent of radiomics, an extension of CAD involving high-throughput computer-extracted quantitative characterization of healthy or pathological structures and processes as captured by medical imaging, interest in such computer-extracted measurements has increased substantially. However, despite the thousands of radiomic studies, the number of settings in which radiomics has been successfully translated into a clinically useful tool or has obtained FDA clearance is comparatively small. This relative dearth might be attributable to factors such as the varying imaging and radiomic feature extraction protocols used from study to study, the numerous potential pitfalls in the analysis of radiomic data, and the lack of studies showing that acting upon a radiomic-based tool leads to a favourable benefit-risk balance for the patient. Several guidelines on specific aspects of radiomic data acquisition and analysis are already available, although a similar roadmap for the overall process of translating radiomics into tools that can be used in clinical care is needed. Herein, we provide 16 criteria for the effective execution of this process in the hopes that they will guide the development of more clinically useful radiomic tests in the future.
Affiliation(s)
- Erich P Huang
- Division of Cancer Treatment and Diagnosis, National Cancer Institute, National Institutes of Health, Rockville, MD, USA
- James P B O'Connor
- Division of Radiotherapy and Imaging, Institute of Cancer Research, London, UK
- Lisa M McShane
- Division of Cancer Treatment and Diagnosis, National Cancer Institute, National Institutes of Health, Rockville, MD, USA
- Philippe Lambin
- Department of Precision Medicine, Maastricht University, Maastricht, Netherlands
- Paul E Kinahan
- Department of Radiology, University of Washington, Seattle, WA, USA
- Eliot L Siegel
- Department of Diagnostic Radiology, University of Maryland, Baltimore, MD, USA
- Lalitha K Shankar
- Division of Cancer Treatment and Diagnosis, National Cancer Institute, National Institutes of Health, Rockville, MD, USA
3
Whitney HM, Drukker K, Giger ML. Performance metric curve analysis framework to assess impact of the decision variable threshold, disease prevalence, and dataset variability in two-class classification. J Med Imaging (Bellingham) 2022; 9:035502. [PMID: 35656541 PMCID: PMC9152992 DOI: 10.1117/1.jmi.9.3.035502] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/13/2021] [Accepted: 05/11/2022] [Indexed: 08/23/2023]
Abstract
Purpose: The aim of this study is to (1) demonstrate a graphical method and interpretation framework to extend performance evaluation beyond receiver operating characteristic curve analysis and (2) assess the impact of disease prevalence and variability in training and testing sets, particularly when a specific operating point is used. Approach: The proposed performance metric curves (PMCs) simultaneously assess sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV), and the 95% confidence intervals thereof, as a function of the threshold for the decision variable. We investigated the utility of PMCs using six example operating points associated with commonly used methods to select operating points (including the Youden index and maximum mutual information). As an example, we applied PMCs to the task of distinguishing between malignant and benign breast lesions using human-engineered radiomic features extracted from dynamic contrast-enhanced magnetic resonance images. The dataset had 1885 lesions, with the images acquired in 2015 and 2016 serving as the training set (1450 lesions) and those acquired in 2017 as the test set (435 lesions). Our study used this dataset in two ways: (1) the clinical dataset itself and (2) simulated datasets with features based on the clinical set but with five different disease prevalences. The median and 95% CI of the number of type I (false positive) and type II (false negative) errors were determined for each operating point of interest. Results: PMCs from both the clinical and simulated datasets demonstrated that PMCs could support interpretation of the impact of decision threshold choice on type I and type II errors of classification, particularly relevant to prevalence. Conclusion: PMCs allow simultaneous evaluation of the four performance metrics of sensitivity, specificity, PPV, and NPV as a function of the decision threshold. 
This may create a better understanding of two-class classifier performance in machine learning.
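The core idea of a performance metric curve, evaluating sensitivity, specificity, PPV, and NPV together at each decision threshold, can be sketched as follows. This is an illustrative sketch only: it omits the 95% confidence intervals that the paper reports, it assumes a case is called positive when its score is at or above the threshold, and the function names and the toy scores are hypothetical.

```python
def confusion_counts(scores, labels, threshold):
    """Count TP, FP, TN, FN when score >= threshold is called positive."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def safe_ratio(numerator, denominator):
    """Return None instead of dividing by zero (metric undefined)."""
    return numerator / denominator if denominator else None


def performance_metric_curve(scores, labels, thresholds):
    """Sensitivity, specificity, PPV, and NPV at each threshold."""
    curve = []
    for t in thresholds:
        tp, fp, tn, fn = confusion_counts(scores, labels, t)
        curve.append({
            "threshold": t,
            "sensitivity": safe_ratio(tp, tp + fn),
            "specificity": safe_ratio(tn, tn + fp),
            "ppv": safe_ratio(tp, tp + fp),
            "npv": safe_ratio(tn, tn + fn),
        })
    return curve


# Toy data: 1 = malignant, 0 = benign (made-up values).
scores = [0.10, 0.40, 0.35, 0.80]
labels = [0, 0, 1, 1]
for point in performance_metric_curve(scores, labels, [0.3, 0.5, 0.9]):
    print(point)
```

Because PPV and NPV depend on prevalence while sensitivity and specificity do not, rerunning such a sketch on datasets resampled to different prevalences is one way to see the effect the abstract describes.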
Affiliation(s)
- Heather M. Whitney
- University of Chicago, Department of Radiology, Chicago, Illinois, United States
- Wheaton College, Department of Physics, Wheaton, Illinois, United States
- Karen Drukker
- University of Chicago, Department of Radiology, Chicago, Illinois, United States
- Maryellen L. Giger
- University of Chicago, Department of Radiology, Chicago, Illinois, United States
4
Whitney HM, Li H, Ji Y, Liu P, Giger ML. Multi-Stage Harmonization for Robust AI across Breast MR Databases. Cancers (Basel) 2021; 13:4809. [PMID: 34638294 PMCID: PMC8508003 DOI: 10.3390/cancers13194809] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/02/2021] [Revised: 09/16/2021] [Accepted: 09/18/2021] [Indexed: 12/22/2022]
Abstract
Simple Summary Batch harmonization of radiomic features extracted from magnetic resonance images of breast lesions from two databases was applied to an artificial intelligence/machine learning classification workflow. Training and independent test sets from the two databases, as well as the combination of them, were used in pre-harmonization and post-harmonization forms to investigate the generalizability of performance in the task of distinguishing between malignant and benign lesions. Most training and independent test scenarios were statistically equivalent, demonstrating that batch harmonization with feature selection harmonization can potentially develop generalizable classification models. Abstract Radiomic features extracted from medical images may demonstrate a batch effect when cases come from different sources. We investigated classification performance using training and independent test sets drawn from two sources using both pre-harmonization and post-harmonization features. In this retrospective study, a database of thirty-two radiomic features, extracted from DCE-MR images of breast lesions after fuzzy c-means segmentation, was collected. There were 944 unique lesions in Database A (208 benign lesions, 736 cancers) and 1986 unique lesions in Database B (481 benign lesions, 1505 cancers). The lesions from each database were divided by year of image acquisition into training and independent test sets, separately by database and in combination. ComBat batch harmonization was conducted on the combined training set to minimize the batch effect on eligible features by database. The empirical Bayes estimates from the feature harmonization were applied to the eligible features of the combined independent test set. The training sets (A, B, and combined) were then used in training linear discriminant analysis classifiers after stepwise feature selection. The classifiers were then run on the A, B, and combined independent test sets. 
Classification performance was compared using pre-harmonization features to post-harmonization features, including their corresponding feature selection, evaluated using the area under the receiver operating characteristic curve (AUC) as the figure of merit. Four out of five training and independent test scenarios demonstrated statistically equivalent classification performance when compared pre- and post-harmonization. These results demonstrate that translation of machine learning techniques with batch data harmonization can potentially yield generalizable models that maintain classification performance.
Affiliation(s)
- Heather M. Whitney
- Department of Radiology, The University of Chicago, Chicago, IL 60637, USA
- Department of Physics, Wheaton College, Wheaton, IL 60187, USA
- Correspondence: H.M.W.; M.L.G.
- Hui Li
- Department of Radiology, The University of Chicago, Chicago, IL 60637, USA
- Yu Ji
- Department of Radiology, The University of Chicago, Chicago, IL 60637, USA
- Tianjin Medical University Cancer Institute and Hospital, Tianjin 300060, China
- Peifang Liu
- Tianjin Medical University Cancer Institute and Hospital, Tianjin 300060, China
- Maryellen L. Giger
- Department of Radiology, The University of Chicago, Chicago, IL 60637, USA
- Correspondence: H.M.W.; M.L.G.
5
Sahiner B, Pezeshk A, Hadjiiski LM, Wang X, Drukker K, Cha KH, Summers RM, Giger ML. Deep learning in medical imaging and radiation therapy. Med Phys 2019; 46:e1-e36. [PMID: 30367497 PMCID: PMC9560030 DOI: 10.1002/mp.13264] [Citation(s) in RCA: 379] [Impact Index Per Article: 75.8] [Received: 01/04/2018] [Revised: 09/18/2018] [Accepted: 10/09/2018] [Indexed: 12/15/2022]
Abstract
The goals of this review paper on deep learning (DL) in medical imaging and radiation therapy are to (a) summarize what has been achieved to date; (b) identify common and unique challenges, and strategies that researchers have taken to address these challenges; and (c) identify some of the promising avenues for the future both in terms of applications as well as technical innovations. We introduce the general principles of DL and convolutional neural networks, survey five major areas of application of DL in medical imaging and radiation therapy, identify common themes, discuss methods for dataset expansion, and conclude by summarizing lessons learned, remaining challenges, and future directions.
Affiliation(s)
- Berkman Sahiner
- DIDSR/OSEL/CDRH, U.S. Food and Drug Administration, Silver Spring, MD 20993, USA
- Aria Pezeshk
- DIDSR/OSEL/CDRH, U.S. Food and Drug Administration, Silver Spring, MD 20993, USA
- Xiaosong Wang
- Imaging Biomarkers and Computer-aided Diagnosis Lab, Radiology and Imaging Sciences, NIH Clinical Center, Bethesda, MD 20892-1182, USA
- Karen Drukker
- Department of Radiology, University of Chicago, Chicago, IL 60637, USA
- Kenny H. Cha
- DIDSR/OSEL/CDRH, U.S. Food and Drug Administration, Silver Spring, MD 20993, USA
- Ronald M. Summers
- Imaging Biomarkers and Computer-aided Diagnosis Lab, Radiology and Imaging Sciences, NIH Clinical Center, Bethesda, MD 20892-1182, USA
6
Moon WK, Huang YS, Lo CM, Huang CS, Bae MS, Kim WH, Chen JH, Chang RF. Computer-aided diagnosis for distinguishing between triple-negative breast cancer and fibroadenomas based on ultrasound texture features. Med Phys 2015; 42:3024-35. [PMID: 26127055 DOI: 10.1118/1.4921123] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Indexed: 01/09/2023]
Abstract
PURPOSE Triple-negative breast cancer (TNBC), an aggressive subtype, is frequently misclassified as fibroadenoma due to benign morphologic features on breast ultrasound (US). This study aims to develop a computer-aided diagnosis (CAD) system based on texture features for distinguishing between TNBC and benign fibroadenomas in US images. METHODS US images of 169 pathology-proven tumors (mean size, 1.65 cm; range, 0.7-3.0 cm) composed of 84 benign fibroadenomas and 85 TNBC tumors are used in this study. After a tumor is segmented out using the level-set method, morphological, conventional texture, and multiresolution gray-scale invariant texture feature sets are computed using a best-fitting ellipse, gray-level co-occurrence matrices, and the ranklet transform, respectively. The linear support vector machine with leave-one-out cross-validation schema is used as a classifier, and the diagnostic performance is assessed with receiver operating characteristic curve analysis. RESULTS The Az values of the morphology, conventional texture, and multiresolution gray-scale invariant texture feature sets are 0.8470 [95% confidence intervals (CIs), 0.7826-0.8973], 0.8542 (95% CI, 0.7911-0.9030), and 0.9695 (95% CI, 0.9376-0.9865), respectively. The Az of the CAD system based on the combined feature sets is 0.9702 (95% CI, 0.9334-0.9882). CONCLUSIONS The CAD system based on texture features extracted via the ranklet transform may be useful for improving the ability to discriminate between TNBC and benign fibroadenomas.
Affiliation(s)
- Woo Kyung Moon
- Department of Radiology, Seoul National University Hospital and Seoul National University College of Medicine, Seoul 110-744, South Korea
- Yao-Sian Huang
- Department of Computer Science and Information Engineering, National Taiwan University, Taipei 10617, Taiwan, Republic of China
- Chung-Ming Lo
- Department of Computer Science and Information Engineering, National Taiwan University, Taipei 10617, Taiwan, Republic of China
- Chiun-Sheng Huang
- Department of Surgery, National Taiwan University Hospital and National Taiwan University College of Medicine, Taipei 10041, Taiwan, Republic of China and Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei 10617, Taiwan, Republic of China
- Min Sun Bae
- Department of Radiology, Seoul National University Hospital and Seoul National University College of Medicine, Seoul 110-744, South Korea
- Won Hwa Kim
- Department of Radiology, Seoul National University Hospital and Seoul National University College of Medicine, Seoul 110-744, South Korea
- Jeon-Hor Chen
- Center for Functional Onco-Imaging and Department of Radiological Science, University of California, Irvine, California 92868 and Department of Radiology, E-Da Hospital and I-Shou University, Kaohsiung 82445, Taiwan, Republic of China
- Ruey-Feng Chang
- Department of Computer Science and Information Engineering, National Taiwan University, Taipei 10617, Taiwan, Republic of China and Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei 10617, Taiwan, Republic of China
7
Drukker K, Sennett CA, Giger ML. Computerized detection of breast cancer on automated breast ultrasound imaging of women with dense breasts. Med Phys 2014; 41:012901. [PMID: 24387528 DOI: 10.1118/1.4837196] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Indexed: 12/28/2022]
Abstract
PURPOSE Develop a computer-aided detection method and investigate its feasibility for detection of breast cancer in automated 3D ultrasound images of women with dense breasts. METHODS The HIPAA compliant study involved a dataset of volumetric ultrasound image data, "views," acquired with an automated U-Systems Somo•V® ABUS system for 185 asymptomatic women with dense breasts (BI-RADS Composition/Density 3 or 4). For each patient, three whole-breast views (3D image volumes) per breast were acquired. A total of 52 patients had breast cancer (61 cancers), diagnosed through any follow-up at most 365 days after the original screening mammogram. Thirty-one of these patients (32 cancers) had a screening mammogram with a clinically assigned BI-RADS Assessment Category 1 or 2, i.e., were mammographically negative. All software used for analysis was developed in-house and involved 3 steps: (1) detection of initial tumor candidates, (2) characterization of candidates, and (3) elimination of false-positive candidates. Performance was assessed by calculating the cancer detection sensitivity as a function of the number of "marks" (detections) per view. RESULTS At a single mark per view, i.e., six marks per patient, the median detection sensitivity by cancer was 50.0% (16/32) ± 6% for patients with a screening mammogram-assigned BI-RADS category 1 or 2 (similar to the radiologists' sensitivity of 49.9% for this dataset in a prior reader study) and 45.9% (28/61) ± 4% for all patients. CONCLUSIONS Promising detection sensitivity was obtained for the computer on a 3D ultrasound dataset of women with dense breasts at a rate of false-positive detections that may be acceptable for clinical implementation.
Affiliation(s)
- Karen Drukker
- Department of Radiology, MC2026, The University of Chicago, 5841 South Maryland Avenue, Chicago, Illinois 60637
- Charlene A Sennett
- Department of Radiology, MC2026, The University of Chicago, 5841 South Maryland Avenue, Chicago, Illinois 60637
- Maryellen L Giger
- Department of Radiology, MC2026, The University of Chicago, 5841 South Maryland Avenue, Chicago, Illinois 60637
8
Montejo LD, Jia J, Kim HK, Netz UJ, Blaschke S, Müller GA, Hielscher AH. Computer-aided diagnosis of rheumatoid arthritis with optical tomography, Part 1: feature extraction. J Biomed Opt 2013; 18:076001. [PMID: 23856915 PMCID: PMC3710917 DOI: 10.1117/1.jbo.18.7.076001] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 05/04/2023]
Abstract
This is the first part of a two-part paper on the application of computer-aided diagnosis to diffuse optical tomography (DOT). An approach for extracting heuristic features from DOT images and a method for using these features to diagnose rheumatoid arthritis (RA) are presented. Feature extraction is the focus of Part 1, while the utility of five classification algorithms is evaluated in Part 2. The framework is validated on a set of 219 DOT images of proximal interphalangeal (PIP) joints. Overall, 594 features are extracted from the absorption and scattering images of each joint. Three major findings are deduced. First, DOT images of subjects with RA are statistically different (p<0.05) from images of subjects without RA for over 90% of the features investigated. Second, DOT images of subjects with RA that do not have detectable effusion, erosion, or synovitis (as determined by MRI and ultrasound) are statistically indistinguishable from DOT images of subjects with RA that do exhibit effusion, erosion, or synovitis. Thus, this subset of subjects may be diagnosed with RA from DOT images while they would go undetected by reviews of MRI or ultrasound images. Third, scattering coefficient images yield better one-dimensional classifiers. A total of three features yield a Youden index greater than 0.8. These findings suggest that DOT may be capable of distinguishing between PIP joints that are healthy and those affected by RA with or without effusion, erosion, or synovitis.
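The Youden index mentioned in this abstract is J = sensitivity + specificity - 1, and a one-dimensional classifier is simply a threshold on a single feature. The sketch below scans candidate thresholds on one feature and keeps the one with the largest J. It assumes that larger feature values indicate disease, and the function name and the feature values are made up for illustration; they are not from the paper.

```python
def best_youden_threshold(diseased_values, healthy_values):
    """Scan observed feature values as thresholds and return (J, threshold)
    with the largest Youden index, J = sensitivity + specificity - 1.
    Assumption: a case is called diseased when its value >= threshold."""
    best_j, best_threshold = float("-inf"), None
    for t in sorted(set(diseased_values) | set(healthy_values)):
        sensitivity = sum(v >= t for v in diseased_values) / len(diseased_values)
        specificity = sum(v < t for v in healthy_values) / len(healthy_values)
        j = sensitivity + specificity - 1
        if j > best_j:  # strict comparison keeps the lowest tied threshold
            best_j, best_threshold = j, t
    return best_j, best_threshold


# Hypothetical single-feature values for joints with and without disease.
diseased = [3.0, 4.0, 5.0, 2.5]
healthy = [0.5, 1.0, 2.0, 2.8]
print(best_youden_threshold(diseased, healthy))
```

J ranges from 0 for a feature with no discriminating value at that threshold to 1 for perfect separation, so the abstract's "Youden index greater than 0.8" describes features that separate the two groups almost completely on their own.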
Affiliation(s)
- Ludguier D. Montejo
- Columbia University, Department of Biomedical Engineering, New York, New York 10027
- Address all correspondence to: Ludguier D. Montejo and Andreas H. Hielscher, Columbia University, Department of Biomedical Engineering, 500 West 120th Street, ET 351 Mudd Bldg, MC8904, New York, New York 10027. Ludguier D. Montejo, Tel: 212-854-2320; Fax: 212-854-8725. Andreas H. Hielscher, Tel: 212-854-5020; Fax: 212-854-8725.
- Jingfei Jia
- Columbia University, Department of Biomedical Engineering, New York, New York 10027
- Hyun K. Kim
- Columbia University Medical Center, Department of Radiology, New York, New York 10032
- Uwe J. Netz
- Laser-und Medizin-Technologie GmbH Berlin, Berlin-Dahlem, 14195, Germany
- Charité-Universitätsmedizin Berlin, Department of Medical Physics and Laser Medicine, Berlin 10117, Germany
- Sabine Blaschke
- University Medical Center Göttingen, Department of Nephrology and Rheumatology, Göttingen 37075, Germany
- Gerhard A. Müller
- University Medical Center Göttingen, Department of Nephrology and Rheumatology, Göttingen 37075, Germany
- Andreas H. Hielscher
- Columbia University, Department of Biomedical Engineering, New York, New York 10027
- Columbia University Medical Center, Department of Radiology, New York, New York 10032
- Columbia University, Department of Electrical Engineering, New York, New York 10025
9
Quantitative ultrasound image analysis of axillary lymph node status in breast cancer patients. Int J Comput Assist Radiol Surg 2013; 8:895-903. [DOI: 10.1007/s11548-013-0829-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 12/20/2012] [Accepted: 03/06/2013] [Indexed: 11/27/2022]
10
Gomez W, Pereira WCA, Infantosi AFC. Analysis of co-occurrence texture statistics as a function of gray-level quantization for classifying breast ultrasound. IEEE Trans Med Imaging 2012; 31:1889-99. [PMID: 22759441 DOI: 10.1109/tmi.2012.2206398] [Citation(s) in RCA: 96] [Impact Index Per Article: 8.0] [Indexed: 05/04/2023]
Abstract
In this paper, we investigated the behavior of 22 co-occurrence statistics combined with six gray-scale quantization levels to classify breast lesions on ultrasound (BUS) images. The database of 436 BUS images used in this investigation comprised 217 carcinoma and 219 benign lesion images. The region delimited by a minimum bounding rectangle around the lesion was employed to calculate the gray-level co-occurrence matrix (GLCM). Next, 22 co-occurrence statistics were computed for six quantization levels (8, 16, 32, 64, 128, and 256), four orientations (0°, 45°, 90°, and 135°), and ten distances (1, 2, ..., 10 pixels). Also, to reduce feature space dimensionality, texture descriptors of the same distance were averaged over all orientations, which is a common practice in the literature. Thereafter, the feature space was ranked using a mutual information technique with the minimal-redundancy-maximal-relevance (mRMR) criterion. Fisher linear discriminant analysis (FLDA) was applied to assess the discrimination power of texture features, by adding the first m-ranked features to the classification procedure iteratively until all of them were considered. The area under the ROC curve (AUC) was used as the figure of merit to measure the performance of the classifier. It was observed that averaging texture descriptors of the same distance negatively impacts classification performance, since the best AUC of 0.81 was achieved with 32 gray levels and 109 features. On the other hand, regarding the single texture features (i.e., without the averaging procedure), the quantization level does not impact the discrimination power, since AUC = 0.87 was obtained for all six quantization levels. Moreover, the number of features was reduced (to between 17 and 24 features). The texture descriptors that contributed notably to distinguishing breast lesions were contrast and correlation computed from GLCMs with an orientation of 90° and a distance of more than five pixels.
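As an illustration of the quantities named in this abstract, the following sketch quantizes an image to a chosen number of gray levels, builds a symmetric, normalized GLCM for one pixel offset, and computes the contrast and correlation statistics. It is a plain-Python sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the quantization rule is a simple uniform binning of 8-bit values, and the test image is the small textbook-style example commonly used to demonstrate GLCMs.

```python
def quantize(image, levels, max_value=255):
    """Uniformly bin integer pixel values in [0, max_value] into `levels` bins."""
    return [[min(p * levels // (max_value + 1), levels - 1) for p in row]
            for row in image]


def glcm(image, levels, d_row, d_col):
    """Symmetric, normalized co-occurrence matrix for offset (d_row, d_col).
    A 90-degree orientation at distance d corresponds to (d_row, d_col) = (-d, 0);
    because each pair is counted in both directions, the sign does not matter."""
    counts = [[0] * levels for _ in range(levels)]
    n_rows, n_cols = len(image), len(image[0])
    for r in range(n_rows):
        for c in range(n_cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < n_rows and 0 <= c2 < n_cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Sum of p(i, j) * (i - j)^2 over all gray-level pairs."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def correlation(p):
    """Correlation of the row and column gray levels under p.
    Returns None for a constant image, where the variance is zero."""
    n = len(p)
    mean = sum(i * p[i][j] for i in range(n) for j in range(n))
    var = sum((i - mean) ** 2 * p[i][j] for i in range(n) for j in range(n))
    if var == 0:
        return None
    cov = sum((i - mean) * (j - mean) * p[i][j]
              for i in range(n) for j in range(n))
    return cov / var  # symmetric matrix: row and column variances are equal


# 4x4 image already expressed in 4 gray levels, horizontal offset of 1 pixel.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
p = glcm(image, levels=4, d_row=0, d_col=1)
print(contrast(p), correlation(p))
```

Averaging such statistics over the four orientations at a fixed distance, as the abstract describes, would amount to calling `glcm` once per offset and averaging the resulting feature values.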
Affiliation(s)
- W Gomez
- Technology Information Laboratory, Center for Research and Advanced Studies of the National Polytechnic Institute, Ciudad Victoria, 87130 Tamaulipas, Mexico.
11
Pesce LL, Horsch K, Drukker K, Metz CE. Semiparametric estimation of the relationship between ROC operating points and the test-result scale: application to the proper binormal model. Acad Radiol 2011; 18:1537-48. [PMID: 22055797 DOI: 10.1016/j.acra.2011.08.003] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 03/09/2011] [Revised: 07/21/2011] [Accepted: 08/07/2011] [Indexed: 11/29/2022]
Abstract
RATIONALE AND OBJECTIVES Semiparametric methods provide smooth and continuous receiver operating characteristic (ROC) curve fits to ordinal test results and require only that the data follow some unknown monotonic transformation of the model's assumed distributions. The quantitative relationship between cutoff settings or individual test-result values on the data scale and points on the estimated ROC curve is lost in this procedure, however. To recover that relationship in a principled way, we propose a new algorithm for "proper" ROC curves and illustrate it by use of the proper binormal model. MATERIALS AND METHODS Several authors have proposed the use of multinomial distributions to fit semiparametric ROC curves by maximum-likelihood estimation. The resulting approach requires nuisance parameters that specify interval probabilities associated with the data, which are used subsequently as a basis for estimating values of the curve parameters of primary interest. In the method described here, we employ those "nuisance" parameters to recover the relationship between any ordinal test-result scale and true-positive fraction, false-positive fraction, and likelihood ratio. Computer simulations based on the proper binormal model were used to evaluate our approach in estimating those relationships and to assess the coverage of its confidence intervals for realistically sized datasets. RESULTS In our simulations, the method reliably estimated simple relationships between test-result values and the several ROC quantities. CONCLUSION The proposed approach provides an effective and reliable semiparametric method with which to estimate the relationship between cutoff settings or individual test-result values and corresponding points on the ROC curve.
Affiliation(s)
- Lorenzo L Pesce
- Department of Radiology, MC 2026, The University of Chicago Medical Center, 5841 S Maryland Avenue, Chicago, IL 60637-1470, USA