51
Zanca F, Chakraborty DP, Marchal G, Bosmans H. Consistency of methods for analysing location-specific data. Radiation Protection Dosimetry 2010; 139:52-56. [PMID: 20159917 PMCID: PMC2868070 DOI: 10.1093/rpd/ncq030]
Abstract
Although the receiver operating characteristic (ROC) method is the acknowledged gold standard for imaging system assessment, it ignores localisation information and does not differentiate between multiple abnormalities per case. As the free-response ROC (FROC) method uses localisation information and more closely resembles the clinical reporting process, it is being increasingly used. A number of methods have been proposed to analyse the data that result from an FROC study: jackknife alternative FROC (JAFROC) and a variant termed JAFROC1, initial detection and candidate analysis (IDCA) and ROC analysis via the reduction of the multiple ratings on a case to a single rating. The focus of this paper was to compare JAFROC1, IDCA and the ROC analysis methods using a clinical FROC human data set. All methods agreed on the ordering of the modalities and all yielded statistically significant differences of the figures-of-merit, i.e. p < 0.05. Both IDCA and JAFROC1 yielded much smaller p-values than ROC. The results are consistent with a recent simulation-based validation study comparing these and other methods. In conclusion, IDCA or JAFROC1 analysis of FROC human data may be superior to ROC analysis at detecting modality differences.
Affiliation(s)
- F Zanca
- Department of Radiology, Leuven University Center of Medical Physics in Radiology, University Hospitals Leuven, 3000 Leuven, Belgium.
52
Abstract
Detection of multiple lesions in images is a medically important task and free-response receiver operating characteristic (FROC) analyses and its variants, such as alternative FROC (AFROC) analyses, are commonly used to quantify performance in such tasks. However, ideal observers that optimize FROC or AFROC performance metrics have not yet been formulated in the general case. If available, such ideal observers may turn out to be valuable for imaging system optimization and in the design of computer aided diagnosis techniques for lesion detection in medical images. In this paper, we derive ideal AFROC and FROC observers. They are ideal in that they maximize, amongst all decision strategies, the area, or any partial area, under the associated AFROC or FROC curve. Calculation of observer performance for these ideal observers is computationally quite complex. We can reduce this complexity by considering forms of these observers that use false positive reports derived from signal-absent images only. We also consider a Bayes risk analysis for the multiple-signal detection task with an appropriate definition of costs. A general decision strategy that minimizes Bayes risk is derived. With particular cost constraints, this general decision strategy reduces to the decision strategy associated with the ideal AFROC or FROC observer.
53
Long X, Cleveland WL, Yao YL. Multiclass detection of cells in multicontrast composite images. Comput Biol Med 2010; 40:168-78. [PMID: 20022596 PMCID: PMC2870534 DOI: 10.1016/j.compbiomed.2009.11.013] [Received: 02/15/2009] [Revised: 10/14/2009] [Accepted: 11/24/2009]
Abstract
In this paper, we describe a framework for multiclass cell detection in composite images consisting of images obtained with three different contrast methods for transmitted light illumination (referred to as multicontrast composite images). Compared to previous multiclass cell detection results [1], the use of multicontrast composite images was found to improve the detection accuracy by introducing more discriminatory information into the system. Preprocessing multicontrast composite images with Kernel PCA was found to be superior to traditional linear PCA preprocessing, especially in difficult classification scenarios where high-order nonlinear correlations are expected to be important. Systematic study of our approach under different overlap conditions suggests that it possesses sufficient speed and accuracy for use in some practical systems.
Affiliation(s)
- Xi Long
- Mechanical Engineering Department, Columbia University, New York, NY 10027, USA
54
Smal I, Loog M, Niessen W, Meijering E. Quantitative comparison of spot detection methods in fluorescence microscopy. IEEE Transactions on Medical Imaging 2010; 29:282-301. [PMID: 19556194 DOI: 10.1109/tmi.2009.2025127]
Abstract
Quantitative analysis of biological image data generally involves the detection of many subresolution spots. Especially in live cell imaging, for which fluorescence microscopy is often used, the signal-to-noise ratio (SNR) can be extremely low, making automated spot detection a very challenging task. In the past, many methods have been proposed to perform this task, but a thorough quantitative evaluation and comparison of these methods is lacking in the literature. In this paper, we evaluate the performance of the most frequently used detection methods for this purpose. These include seven unsupervised and two supervised methods. We perform experiments on synthetic images of three different types, for which the ground truth was available, as well as on real image data sets acquired for two different biological studies, for which we obtained expert manual annotations to compare with. The results from both types of experiments suggest that for very low SNRs (≈ 2), the supervised (machine learning) methods perform best overall. Of the unsupervised methods, the detectors based on the so-called h-dome transform from mathematical morphology or the multiscale variance-stabilizing transform perform comparably, and have the advantage that they do not require a cumbersome learning stage. At high SNRs (> 5), the difference in performance of all considered detectors becomes negligible.
Affiliation(s)
- Ihor Smal
- Biomedical Imaging Group Rotterdam, Departments of Medical Informatics and Radiology, Erasmus MC, Rotterdam, The Netherlands.
55
Yanagawa M, Honda O, Yoshida S, Ono Y, Inoue A, Daimon T, Sumikawa H, Mihara N, Johkoh T, Tomiyama N, Nakamura H. Commercially available computer-aided detection system for pulmonary nodules on thin-section images using 64 detectors-row CT: preliminary study of 48 cases. Acad Radiol 2009; 16:924-33. [PMID: 19394873 DOI: 10.1016/j.acra.2009.01.030] [Received: 11/18/2008] [Revised: 01/27/2009] [Accepted: 01/27/2009]
Abstract
RATIONALE AND OBJECTIVES Most studies of computer-aided detection (CAD) for pulmonary nodules have focused on solid nodule detection. The aim of this study was to evaluate the performance of a commercially available CAD system in the detection of pulmonary nodules with or without ground-glass opacity (GGO) using 64-detector-row computed tomography compared to visual interpretation. MATERIALS AND METHODS Computed tomographic examinations were performed on 48 patients with existing or suspicious pulmonary nodules on chest radiography. Three radiologists independently reported the location and pattern (GGO, solid, or part solid) of each nodule candidate on computed tomographic scans, assigned each a confidence score, and then analyzed all scans using the CAD system. A reference standard was established by a consensus panel of different radiologists, who found 229 noncalcified nodules with diameters ≥ 4 mm. True-positive and false-positive results and confidence levels were used to generate jackknife alternative free-response receiver-operating characteristic plots. RESULTS The sensitivity for GGO of the 3 radiologists (60%-80%) was significantly higher than that of the CAD system (21%) (McNemar's test, P < .0001). For overall and solid nodules, the figure-of-merit values without and with the CAD system were significantly different (P = .005-.04) on jackknife alternative free-response receiver-operating characteristic analysis. For GGO and part-solid nodules, the figure-of-merit values with the CAD system were greater than those without the CAD system, but the differences were not significant. CONCLUSION Radiologists are significantly superior to this CAD system in the detection of GGO, but the CAD system can still play a complementary role in detecting nodules with or without GGO.
56
Rao RB, Yakhnenko O, Krishnapuram B. KDD cup 2008 and the workshop on mining medical data. ACM SIGKDD Explorations Newsletter 2008. [DOI: 10.1145/1540276.1540288]
Abstract
In this report we summarize the KDD Cup 2008 task, which addressed a problem of early breast cancer detection. We describe the data and the challenges, the results and summarize the algorithms used by the winning teams. We also summarize the workshop on Mining Medical Data held in conjunction with SIGKDD on August 24, 2008 in Las Vegas, NV that brought together researchers working on various aspects of applying machine learning and data mining to challenging tasks in medical and health care domains.
57
Hirose T, Nitta N, Shiraishi J, Nagatani Y, Takahashi M, Murata K. Evaluation of computer-aided diagnosis (CAD) software for the detection of lung nodules on multidetector row computed tomography (MDCT): JAFROC study for the improvement in radiologists' diagnostic accuracy. Acad Radiol 2008; 15:1505-12. [PMID: 19000867 DOI: 10.1016/j.acra.2008.06.009] [Received: 05/16/2008] [Revised: 06/11/2008] [Accepted: 06/12/2008]
Abstract
RATIONALE AND OBJECTIVES The aim of this study was to evaluate the usefulness of computer-aided diagnosis (CAD) software for the detection of lung nodules on multidetector-row computed tomography (MDCT) in terms of improvement in radiologists' diagnostic accuracy in detecting lung nodules, using jackknife free-response receiver-operating characteristic (JAFROC) analysis. MATERIALS AND METHODS Twenty-one patients (6 without and 15 with lung nodules) were selected randomly from 120 consecutive thoracic computed tomographic examinations. The gold standard for the presence or absence of nodules in the observer study was determined by consensus of two radiologists. Six expert radiologists participated in a free-response receiver operating characteristic study for the detection of lung nodules on MDCT, in which cases were interpreted first without and then with the output of CAD software. Radiologists were asked to indicate the locations of lung nodule candidates on the monitor with their confidence ratings for the presence of lung nodules. RESULTS The performance of the CAD software indicated that the sensitivity in detecting lung nodules was 71.4%, with 0.95 false-positive results per case. When radiologists used the CAD software, the average sensitivity improved from 39.5% to 81.0%, with an increase in the average number of false-positive results from 0.14 to 0.89 per case. The average figure-of-merit values for the six radiologists were 0.390 without and 0.845 with the output of the CAD software, and there was a statistically significant difference (P < .0001) using the JAFROC analysis. CONCLUSION The CAD software for the detection of lung nodules on MDCT has the potential to assist radiologists by increasing their accuracy.
58
Chakraborty DP. Validation and statistical power comparison of methods for analyzing free-response observer performance studies. Acad Radiol 2008; 15:1554-66. [PMID: 19000872 DOI: 10.1016/j.acra.2008.07.018] [Received: 06/24/2008] [Revised: 07/16/2008] [Accepted: 07/17/2008]
Abstract
RATIONALE AND OBJECTIVES The aim of this work was to validate and compare the statistical powers of proposed methods for analyzing free-response data using a search-model-based simulator. MATERIALS AND METHODS A free-response data simulator is described that can model a single reader interpreting the same cases in two modalities, or two computer-aided detection (CAD) algorithms, or two human observers, interpreting the same cases in one modality. A variance components model, analogous to the Roe and Metz receiver-operating characteristic (ROC) data simulator, is described; it models intracase and intermodality correlations in free-response studies. Two generic observers were simulated: a quasi-human observer and a quasi-CAD algorithm. Null hypothesis (NH) validity and statistical powers of ROC, jackknife alternative free-response operating characteristic (JAFROC), a variant of JAFROC termed JAFROC-1, initial detection and candidate analysis (IDCA), and a nonparametric (NP) approach were investigated. RESULTS All methods had valid NH behavior over a wide range of simulator parameters. For equal numbers of normal and abnormal cases, for the human observer, the statistical power ranking of the methods was JAFROC-1 > JAFROC > (IDCA ≈ NP) > ROC. For the CAD algorithm, the ranking was (NP ≈ IDCA) > (JAFROC-1 ≈ JAFROC) > ROC. In either case, the statistical power of the highest ranked method exceeded that of the lowest ranked method by about a factor of two. Dependence of statistical power on simulator parameters followed expected trends. For data sets with more abnormal cases than normal cases, JAFROC-1 power significantly exceeded JAFROC power. CONCLUSION Based on this work, the recommendation is to use JAFROC-1 for human observers (including human observers with CAD assist) and the NP method for evaluating CAD algorithms.
59
Singh S, Tourassi GD, Baker JA, Samei E, Lo JY. Automated breast mass detection in 3D reconstructed tomosynthesis volumes: a featureless approach. Med Phys 2008; 35:3626-36. [PMID: 18777923 DOI: 10.1118/1.2953562]
Abstract
The purpose of this study was to propose and implement a computer aided detection (CADe) tool for breast tomosynthesis. This task was accomplished in two stages: a highly sensitive mass detector followed by a false positive (FP) reduction stage. Breast tomosynthesis data from 100 human subject cases were used, of which 25 subjects had one or more mass lesions and the rest were normal. For stage 1, filter parameters were optimized via a grid search. The CADe-identified suspicious locations were reconstructed to yield 3D CADe volumes of interest. The first stage yielded a maximum sensitivity of 93% with 7.7 FPs/breast volume. Unlike traditional CADe algorithms, in which the second stage FP reduction is done via feature extraction and analysis, information theory principles were used with mutual information as a similarity metric. Three schemes were proposed, all using leave-one-case-out cross validation sampling. The three schemes, A, B, and C, differed in the composition of their knowledge base of regions of interest (ROIs). Scheme A's knowledge base was comprised of all the mass and FP ROIs generated by the first stage of the algorithm. Scheme B had a knowledge base that contained information from mass ROIs and randomly extracted normal ROIs. Scheme C had information from three sources: masses, FPs, and normal ROIs. Also, performance was assessed as a function of the composition of the knowledge base in terms of the number of FP or normal ROIs needed by the system to reach optimal performance. The results indicated that the knowledge base needed no more than 20 times as many FPs and 30 times as many normal ROIs as masses to attain maximal performance. The best overall system performance was 85% sensitivity with 2.4 FPs per breast volume for scheme A, 3.6 FPs per breast volume for scheme B, and 3 FPs per breast volume for scheme C.
Affiliation(s)
- Swatee Singh
- Department of Radiology, Duke University Medical Center, Durham, North Carolina 27705, USA.
60
Song T, Bandos AI, Rockette HE, Gur D. On comparing methods for discriminating between actually negative and actually positive subjects with FROC type data. Med Phys 2008; 35:1547-58. [PMID: 18491549 DOI: 10.1118/1.2890410]
Abstract
The task of searching and detecting multiple abnormalities depicted on an image, or a series of images, is a common problem in different areas such as military target detection or diagnostic medical imaging. A free response receiver operating characteristic (FROC) approach for assessing performance in many of these scenarios entails marking the locations of suspected abnormalities and indicating a level of suspicion at each of the marked locations. One of the important characteristics of a system being evaluated under the FROC paradigm is its performance in the conventional ROC domain, namely classifying a subject (or a unit of interest) as "negative" or "positive" in regard to the presence of the abnormality (or any of the abnormalities) of interest. With FROC data we can compare subjects by specifying a function of multiple scores within a subject. This approach allows formulating subject-based ROC type indices that can be estimated using existing ROC concepts. In this article we focus on indices that reflect the ability of the system to discriminate between actually negative and actually positive subjects. We consider a previously proposed index that is based on the comparison of the highest scores on subjects and two new indices that are based on potentially more stable comparison functions, namely comparison of average scores and stochastic dominance. Based on these indices we develop nonparametric procedures for comparing subject-based discriminative ability of diagnostic systems being evaluated under the FROC paradigm. We also investigate the properties of the statistical procedures in a simulation study.
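As a rough illustration of the subject-based indices described in this abstract, the sketch below collapses each subject's mark scores to a single value (highest score or average score) and then estimates the conventional ROC area with a Mann-Whitney count. The function names, the toy scores, and the choice to give unmarked subjects a score of negative infinity are assumptions made for this example; the stochastic dominance index and the paper's nonparametric comparison procedures are not reproduced here.

```python
from statistics import mean

# Assumption: a subject with no marks gets the lowest possible score.
NO_MARK = float("-inf")

def reduce_scores(marks, how="max"):
    """Collapse one subject's mark scores to a single subject-level score."""
    if not marks:
        return NO_MARK
    return max(marks) if how == "max" else mean(marks)

def empirical_auc(negatives, positives):
    """Mann-Whitney estimate: P(pos > neg) + 0.5 * P(pos == neg)."""
    wins = 0.0
    for n in negatives:
        for p in positives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(negatives) * len(positives))

# Hypothetical toy data: one list of mark scores per subject.
negative_subjects = [[], [0.2], [0.4, 0.1]]
positive_subjects = [[0.9, 0.3], [0.35], []]

for how in ("max", "mean"):
    neg = [reduce_scores(m, how) for m in negative_subjects]
    pos = [reduce_scores(m, how) for m in positive_subjects]
    print(how, round(empirical_auc(neg, pos), 4))
```

On this toy data the two reductions give different areas, which is the kind of sensitivity to the comparison function that the abstract alludes to.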
Affiliation(s)
- Tao Song
- Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, USA
61
Bandos AI, Rockette HE, Song T, Gur D. Area under the free-response ROC curve (FROC) and a related summary index. Biometrics 2008; 65:247-56. [PMID: 18479482 DOI: 10.1111/j.1541-0420.2008.01049.x]
Abstract
Free-response assessment of diagnostic systems continues to gain acceptance in areas related to the detection, localization, and classification of one or more "abnormalities" within a subject. A free-response receiver operating characteristic (FROC) curve is a tool for characterizing the performance of a free-response system at all decision thresholds simultaneously. Although the importance of a single index summarizing the entire curve over all decision thresholds is well recognized in ROC analysis (e.g., area under the ROC curve), currently there is no widely accepted summary of a system being evaluated under the FROC paradigm. In this article, we propose a new index of the free-response performance at all decision thresholds simultaneously, and develop a nonparametric method for its analysis. Algebraically, the proposed summary index is the area under the empirical FROC curve penalized for the number of erroneous marks, rewarded for the fraction of detected abnormalities, and adjusted for the effect of the target size (or "acceptance radius"). Geometrically, the proposed index can be interpreted as a measure of average performance superiority over an artificial "guessing" free-response process and it represents an analogy to the area between the ROC curve and the "guessing" or diagonal line. We derive the ideal bootstrap estimator of the variance, which can be used for a resampling-free construction of asymptotic bootstrap confidence intervals and for sample size estimation using standard expressions. The proposed procedure is free from any parametric assumptions and does not require an assumption of independence of observations within a subject. We provide an example with a dataset sampled from a diagnostic imaging study and conduct simulations that demonstrate the appropriateness of the developed procedure for the considered sample sizes and ranges of parameters.
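For orientation, an empirical FROC curve can be traced by sweeping a decision threshold over the mark scores and recording, at each threshold, the number of false marks per subject against the fraction of abnormalities detected. The sketch below shows only that plain curve; the function name, argument layout, and toy scores are assumptions for illustration, and the penalized summary index and bootstrap variance estimator proposed in the paper are not implemented.

```python
def empirical_froc(tp_scores, fp_scores, n_subjects, n_abnormalities):
    """Return (false marks per subject, fraction detected) operating points.

    tp_scores: scores of marks that hit an abnormality (one per detected one).
    fp_scores: scores of marks that hit nothing (erroneous marks).
    """
    thresholds = sorted(set(tp_scores) | set(fp_scores), reverse=True)
    points = [(0.0, 0.0)]  # strictest threshold: nothing is marked
    for t in thresholds:
        fp_rate = sum(s >= t for s in fp_scores) / n_subjects
        tp_frac = sum(s >= t for s in tp_scores) / n_abnormalities
        points.append((fp_rate, tp_frac))
    return points

# Hypothetical toy study: 2 subjects, 4 abnormalities, 3 of them marked.
curve = empirical_froc(
    tp_scores=[0.9, 0.7, 0.4],
    fp_scores=[0.8, 0.3],
    n_subjects=2,
    n_abnormalities=4,
)
for fp_rate, tp_frac in curve:
    print(fp_rate, tp_frac)
```

The curve stops at 0.75 on the vertical axis because one abnormality was never marked, which illustrates why the horizontal extent of an FROC curve is not fixed in advance the way an ROC curve's is.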
Affiliation(s)
- Andriy I Bandos
- Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, USA.
62
Chakraborty DP, Yoon HJ. Operating characteristics predicted by models for diagnostic tasks involving lesion localization. Med Phys 2008; 35:435-45. [PMID: 18383663 DOI: 10.1118/1.2820902]
Abstract
In 1996 Swensson published an observer model that predicted receiver operating characteristic (ROC), localization ROC (LROC), free-response ROC (FROC) and alternative FROC (AFROC) curves, thereby achieving "unification" of different observer performance paradigms. More recently a model termed initial detection and candidate analysis (IDCA) has been proposed for fitting computer aided detection (CAD) generated FROC data, and recently a search model for human observer FROC data has been proposed. The purpose of this study was to derive IDCA and the search model based expressions for operating characteristics, and to compare the predictions to the Swensson model. For three out of four mammography CAD data sets all models yielded good fits in the high-confidence region, i.e., near the lower end of the plots. The search model and IDCA tended to better fit the data in the low-confidence region, i.e., near the upper end of the plots, particularly for FROC curves for which the Swensson model predictions departed markedly from the data. For one data set none of the models yielded satisfactory fits. A unique characteristic of search model and IDCA predicted operating characteristics is that the operating point is not allowed to move continuously to the lowest confidence limit of the corresponding Swensson model curves. This prediction is actually observed in the CAD raw data and it is the primary reason for the poor FROC fits of the Swensson model in the low-confidence region.
Affiliation(s)
- D P Chakraborty
- Department of Radiology, University of Pittsburgh, 3520 Forbes Avenue, Parkvale Building, Room 109, Pittsburgh, Pennsylvania 15261, USA.
63
Tsuchiya Y, Kodera Y, Tanaka R, Sanada S. Quantitative kinetic analysis of lung nodules using the temporal subtraction technique in dynamic chest radiographies performed with a flat panel detector. J Digit Imaging 2008; 22:126-35. [PMID: 18415648 DOI: 10.1007/s10278-008-9116-1] [Received: 12/09/2007] [Revised: 02/07/2008] [Accepted: 03/04/2008]
Abstract
Early detection and treatment of lung cancer is one of the most effective means of reducing cancer mortality, and to this end, chest X-ray radiography has been widely used as a screening method. A related technique based on the development of computer analysis and a flat panel detector (FPD) has enabled the functional evaluation of respiratory kinetics in the chest and is expected to be introduced into clinical practice in the near future. In this study, we developed a computer analysis algorithm to detect lung nodules and to evaluate quantitative kinetics. Breathing chest radiographs obtained by modified FPD and breath synchronization utilizing diaphragmatic analysis of vector movement were converted into four static images by sequential temporal subtraction processing, morphological enhancement processing, kinetic visualization processing, and lung region detection processing. An artificial neural network analyzed these density patterns to detect the true nodules and draw their kinetic tracks. Both the algorithm performance and the evaluation of clinical effectiveness of seven normal patients and simulated nodules showed sufficient detecting capability and kinetic imaging function without significant differences. Our technique can quantitatively evaluate the kinetic range of nodules and is effective in detecting a nodule on a breathing chest radiograph. Moreover, the application of this technique is expected to extend computer-aided diagnosis systems and facilitate the development of an automatic planning system for radiation therapy.
Affiliation(s)
- Yuichiro Tsuchiya
- Department of Radiology, Shizuoka Children's Hospital, 860 Urushiyama, Aoi-ku, Shizuoka, 420-8660, Japan.
64
Zanca F, Van Ongeval C, Jacobs J, Marchal G, Bosmans H. A quantitative method for evaluating the detectability of lesions in digital mammography. Radiation Protection Dosimetry 2008; 129:214-218. [PMID: 18319282 DOI: 10.1093/rpd/ncn049]
Abstract
This study presents a quantitative method for evaluating the detectability of microcalcifications in digital mammography. Four hundred and twenty microcalcifications (with various morphology, size and contrast), simulated with a previously validated method, were used for the creation of image datasets. Lesions were inserted into 163 regions of interest of 59 selected raw digital mammograms with various anatomical backgrounds and acquired with a Siemens Novation DR. After processing, these composite images were scored by experienced radiologists, who located multiple simulated lesions and rated them under conditions of free-search. For statistical analysis, free-response receiver-operating characteristic curves were plotted; the use of the jackknife free-response receiver-operating characteristic method was also investigated. The main advantage of this methodology is that the exact number of inserted microcalcifications is known and that the lesions are fully characterised in terms of pathology, size, morphology and peak contrast. A first application has been the evaluation of the effect of anatomical background on microcalcification detection. Preliminary findings in this study indicate that this method may be a promising tool to evaluate factors that have an influence on the detectability of lesions, such as the clinical processing or the viewing conditions.
Affiliation(s)
- F Zanca
- University Hospitals Leuven, Herestraat 49, 3000 Leuven, Belgium.
65
Ikedo Y, Fukuoka D, Hara T, Fujita H, Takada E, Endo T, Morita T. Development of a fully automatic scheme for detection of masses in whole breast ultrasound images. Med Phys 2008; 34:4378-88. [PMID: 18072503 DOI: 10.1118/1.2795825]
Abstract
Ultrasonography has been used for breast cancer screening in Japan. Screening using a conventional hand-held probe is operator dependent and thus it is possible that some areas of the breast may not be scanned. To overcome such problems, a mechanical whole breast ultrasound (US) scanner has been proposed and developed for screening purposes. However, another issue is that radiologists might tire while interpreting all images in a large-volume screening; this increases the likelihood that masses may remain undetected. Therefore, the aim of this study is to develop a fully automatic scheme for the detection of masses in whole breast US images in order to assist the interpretations of radiologists and potentially improve the screening accuracy. The authors' database comprised 109 whole breast US images, which include 36 masses (16 malignant masses, 5 fibroadenomas, and 15 cysts). A whole breast US image with 84 slice images (interval between two slice images: 2 mm) was obtained by the ASU-1004 US scanner (ALOKA Co., Ltd., Japan). The feature based on the edge directions in each slice and a method for subtracting between the slice images were used for the detection of masses in the authors' proposed scheme. The Canny edge detector was applied to detect edges in US images; these edges were classified as near-vertical edges or near-horizontal edges using a morphological method. The positions of mass candidates were located using the near-vertical edges as a cue. Then, the located positions were segmented by the watershed algorithm and mass candidate regions were detected using the segmented regions and the low-density regions extracted by the slice subtraction method. For the removal of false positives (FPs), rule-based schemes and a quadratic discriminant analysis were applied for the distinction between masses and FPs. As a result, the sensitivity of the authors' scheme for the detection of masses was 80.6% (29/36) with 3.8 FPs per whole breast image. The authors' scheme for computer-aided detection may be useful in improving the screening performance and efficiency.
Affiliation(s)
- Yuji Ikedo
- Department of Intelligent Image Information, Division of Regeneration and Advanced Medical Sciences, Graduate School of Medicine, Gifu University, 1-1 Yanagido, Gifu 501-1194, Japan.
66
Uchiyama Y, Yokoyama R, Ando H, Asano T, Kato H, Yamakawa H, Yamakawa H, Hara T, Iwama T, Hoshi H, Fujita H. Computer-aided diagnosis scheme for detection of lacunar infarcts on MR images. Acad Radiol 2007; 14:1554-61. [PMID: 18035284 DOI: 10.1016/j.acra.2007.09.012] [Received: 12/25/2006] [Revised: 09/07/2007] [Accepted: 09/13/2007]
Abstract
RATIONALE AND OBJECTIVES The detection and management of asymptomatic lacunar infarcts on magnetic resonance (MR) images are important tasks for radiologists to ensure the prevention of severe cerebral infarctions. However, accurate identification of the lacunar infarcts on MR images is a difficult task for the radiologists. Therefore the purpose of this study was to develop a computer-aided diagnosis scheme for the detection of lacunar infarcts to assist radiologists' interpretation as a "second opinion." MATERIALS AND METHODS Our database comprised 1,143 T1- and 1,143 T2-weighted images obtained from 132 patients. The locations of the lacunar infarcts were determined by experienced neuroradiologists. We first segmented the cerebral region in a T1-weighted image by using a region growing technique for restricting the search area of lacunar infarcts. For identifying the initial lacunar infarct candidates, a top-hat transform and multiple-phase binarization were then applied to the T2-weighted image within the segmented cerebral region. For eliminating the false positives (FPs), we determined 12 features: the locations x and y, signal intensity differences in the T1- and T2-weighted images, nodular components from a scale of 1 to 4, and nodular and linear components from a scale of 1 to 4. The nodular components and the linear components were obtained using a filter bank technique. The rule-based schemes and a support vector machine with 12 features were applied to the regions of the initial candidates for distinguishing between lacunar infarcts and FPs. RESULTS Our computerized scheme was evaluated by using a holdout method. The sensitivity of the detection of lacunar infarcts was 96.8% (90/93) with 0.76 FP per image. CONCLUSIONS Our computerized scheme would be useful in assisting radiologists in identifying lacunar infarcts in MR images.
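To give a feel for how a top-hat transform isolates small bright structures, the sketch below applies a white top-hat (the signal minus its morphological opening) to a one-dimensional intensity profile. The abstract does not say which top-hat variant, structuring element, or dimensionality the authors used, so the white variant, the flat window, the function names, and the toy profile are all assumptions for illustration.

```python
def erode(signal, half_width):
    """Grey-scale erosion: minimum over a flat window, clipped at the borders."""
    return [
        min(signal[max(0, i - half_width): i + half_width + 1])
        for i in range(len(signal))
    ]

def dilate(signal, half_width):
    """Grey-scale dilation: maximum over a flat window, clipped at the borders."""
    return [
        max(signal[max(0, i - half_width): i + half_width + 1])
        for i in range(len(signal))
    ]

def white_top_hat(signal, half_width):
    """Signal minus its opening: keeps bright details narrower than the window."""
    opened = dilate(erode(signal, half_width), half_width)
    return [s - o for s, o in zip(signal, opened)]

# Hypothetical intensity profile: a narrow bright spot at index 3 and a
# broad plateau over indices 6 to 10.
profile = [1, 1, 1, 5, 1, 1, 2, 2, 2, 2, 2, 1]
print(white_top_hat(profile, half_width=1))
```

With a three-sample window the narrow spot survives while the wide plateau is removed, which is why a top-hat step is a natural way to propose small candidates before a thresholding stage.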
Affiliation(s)
- Yoshikazu Uchiyama
- Department of Intelligent Image Information, Graduate School of Medicine, Gifu University, Japan.
|
67
|
ROC analysis in medical imaging: a tutorial review of the literature. Radiol Phys Technol 2007; 1:2-12. [PMID: 20821157 DOI: 10.1007/s12194-007-0002-1] [Citation(s) in RCA: 77] [Impact Index Per Article: 4.5] [Received: 09/24/2007] [Accepted: 09/25/2007] [Indexed: 10/22/2022]
|
68
|
Sundaram P, Zomorodian A, Beaulieu C, Napel S. Colon polyp detection using smoothed shape operators: preliminary results. Med Image Anal 2007; 12:99-119. [PMID: 17910934 DOI: 10.1016/j.media.2007.08.001] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.8] [Received: 02/06/2007] [Revised: 07/25/2007] [Accepted: 08/03/2007] [Indexed: 10/22/2022]
Abstract
Computer-aided detection (CAD) algorithms identify locations in computed tomographic (CT) images of the colon that are most likely to contain polyps. Existing CAD methods treat the CT data as a voxelized, volume image. They estimate a curvature-based feature at the mucosal surface voxels. However, curvature is a smooth notion, while our data are discrete and noisy. As a second order differential quantity, curvature amplifies noise. In this paper, we present the smoothed shape operators method (SSO), which uses a geometry processing approach. We extract a triangle mesh representation of the colon surface, and estimate curvature on this surface using the shape operator. We then smooth the shape operators on the surface iteratively. Throughout, we use techniques explicitly designed for discrete geometry. All our computation occurs on the surface, rather than in the voxel grid. We evaluate our algorithm on patient data and provide free-response receiver-operating characteristic performance analysis over all size ranges of polyps. We also provide confidence intervals for our performance estimates. We compare our performance with the surface normal overlap (SNO) method for the same data. A preliminary evaluation of our method on 35 patients yielded the following results (polyp diameter range; sensitivity; false positives/case): (≥10 mm; 100%; 17.5), (5-10 mm; 89.7%; 21.23), (<5 mm; 59.1%; 23.9) and (overall; 80.3%; 23.9). The evaluation of the SNO method yielded: (≥10 mm; 75%; 17.5), (5-10 mm; 43.1%; 21.23), (<5 mm; 15.9%; 23.9) and (overall; 38.5%; 23.9).
Affiliation(s)
- P Sundaram
- Department of Radiology, Stanford University, Stanford, CA 94305, United States.
|
69
|
Abstract
Computer-aided detection (CAD) has been attracting extensive research interest during the last two decades. It is recognized that the full potential of CAD can only be realized by improving the performance and robustness of CAD algorithms, and this requires good evaluation methodology that would permit CAD designers to optimize their algorithms. Free-response receiver operating characteristic (FROC) curves are widely used to assess CAD performance; however, evaluation rarely proceeds beyond determination of lesion localization fraction (sensitivity) at an arbitrarily selected value of nonlesion localizations (false marks) per image. This work describes a FROC curve fitting procedure that uses a recent model of visual search that serves as a framework for the free-response task. A maximum likelihood procedure for estimating the parameters of the model from free-response data and fitting CAD generated FROC curves was implemented. Procedures were implemented to estimate two figures of merit and associated statistics such as 95% confidence intervals and goodness of fit. One of the figures of merit does not require the arbitrary specification of an operating point at which to evaluate CAD performance. For comparison, a related method termed initial detection and candidate analysis, which is applicable when all suspicious regions are reported, was also implemented. The two methods were tested on seven mammography CAD data sets and both yielded good to excellent fits. The search model approach has the advantage that it can potentially be applied to radiologist generated free-response data where not all suspicious regions are reported, only the ones that are deemed sufficiently suspicious to warrant clinical follow-up. This work represents the first practical application of the search model to an important evaluation problem in diagnostic radiology. Software based on this work is expected to benefit CAD developers working in diverse areas of medical imaging.
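The empirical FROC curve mentioned in this abstract, lesion localization fraction plotted against nonlesion localizations per image, can be illustrated with a minimal Python sketch. This is not the curve-fitting procedure of the paper; the function name `froc_points` and its arguments are hypothetical, and the marks are assumed to have already been scored as lesion or nonlesion localizations.

```python
def froc_points(lesion_scores, nonlesion_scores, n_lesions, n_images):
    """Empirical FROC operating points, one per distinct score threshold.

    lesion_scores    -- scores of marks scored as lesion localizations
    nonlesion_scores -- scores of marks scored as nonlesion localizations
    n_lesions        -- total number of lesions (marked or not)
    n_images         -- total number of images
    Returns a list of (NLF, LLF) pairs, strictest threshold first.
    """
    thresholds = sorted(set(lesion_scores) | set(nonlesion_scores), reverse=True)
    points = []
    for t in thresholds:
        # Lesion localization fraction: marked lesions at or above t.
        llf = sum(s >= t for s in lesion_scores) / n_lesions
        # Nonlesion localizations per image at or above t.
        nlf = sum(s >= t for s in nonlesion_scores) / n_images
        points.append((nlf, llf))
    return points


# Illustrative data only: 4 lesions on 2 images, 2 lesions marked, 2 false marks.
points = froc_points([0.9, 0.6], [0.8, 0.3], n_lesions=4, n_images=2)
```

Reporting sensitivity at one arbitrarily chosen NLF value amounts to reading a single point off this list, which is the practice the abstract argues against.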
Affiliation(s)
- Hong Jun Yoon
- Department of Radiology, University of Pittsburgh, Pittsburgh, PA 15261
- Bin Zheng
- Department of Radiology, University of Pittsburgh, Pittsburgh, PA 15261
- Berkman Sahiner
- Department of Radiology, University of Michigan, Ann Arbor, MI 48109
|
70
|
Gur D, Rockette HE, Bandos AI. "Binary" and "non-binary" detection tasks: are current performance measures optimal? Acad Radiol 2007; 14:871-6. [PMID: 17626312 DOI: 10.1016/j.acra.2007.03.014] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/20/2022]
Abstract
We have observed that a very large fraction of responses for several detection tasks during the performance of observer studies are in the extreme ranges of lower than 11% or higher than 89% regardless of the actual presence or absence of the abnormality in question or its subjectively rated "subtleness." This observation raises questions regarding the validity and appropriateness of using multicategory rating scales for such detection tasks. Monte Carlo simulation of binary and multicategory ratings for these tasks demonstrates that the use of the former (binary) often results in a less biased and more precise summary index and hence may lead to a higher statistical power for determining differences between modalities.
Affiliation(s)
- David Gur
- Department of Radiology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA.
|
71
|
Kann MG, Sheetlin SL, Park Y, Bryant SH, Spouge JL. The identification of complete domains within protein sequences using accurate E-values for semi-global alignment. Nucleic Acids Res 2007; 35:4678-85. [PMID: 17596268 PMCID: PMC1950549 DOI: 10.1093/nar/gkm414] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 11/14/2022] Open
Abstract
The sequencing of complete genomes has created a pressing need for automated annotation of gene function. Because domains are the basic units of protein function and evolution, a gene can be annotated from a domain database by aligning domains to the corresponding protein sequence. Ideally, complete domains are aligned to protein subsequences, in a ‘semi-global alignment’. Local alignment, which aligns pieces of domains to subsequences, is common in high-throughput annotation applications, however. It is a mature technique, with the heuristics and accurate E-values required for screening large databases and evaluating the screening results. Hidden Markov models (HMMs) provide an alternative theoretical framework for semi-global alignment, but their use is limited because they lack heuristic acceleration and accurate E-values. Our new tool, GLOBAL, overcomes some limitations of previous semi-global HMMs: it has accurate E-values and the possibility of the heuristic acceleration required for high-throughput applications. Moreover, according to a standard of truth based on protein structure, two semi-global HMM alignment tools (GLOBAL and HMMer) had comparable performance in identifying complete domains, but distinctly outperformed two tools based on local alignment. When searching for complete protein domains, therefore, GLOBAL avoids disadvantages commonly associated with HMMs, yet maintains their superior retrieval performance.
Affiliation(s)
- John L. Spouge
- *To whom correspondence should be addressed. 301 402 9310; 301 480 2484
|
72
|
Broemeling LD. Detection and Localization in Test Accuracy: A Bayesian Perspective. COMMUN STAT-THEOR M 2007. [DOI: 10.1080/03610920601126027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
|
73
|
Abstract
When a person or an algorithm searches for targets throughout an image, there are no discrete trials, and the probability of a false alarm cannot be computed. Instead, what is observable is the rate of production of false alarms (per image, say), and data analysis uses the free-response version of signal detection theory. A previously-proposed model implies a power relationship for the free-response operating characteristic. The present letter introduces an extra parameter. The relationship between the logarithms of probability of correct detection and rate of false alarm production is no longer forced to be linear.
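The link between a power relationship and log-log linearity stated in this abstract can be written out explicitly. The symbols here are assumptions for illustration, since the abstract does not name them: P is the probability of correct detection, λ is the rate of false alarms per image, and a and b are constants of the earlier two-parameter model.

```latex
P = a\,\lambda^{b}
\quad\Longrightarrow\quad
\log P = \log a + b\,\log \lambda
```

Under the earlier model, a plot of log P against log λ is therefore a straight line with slope b. The abstract states only that the added parameter removes this constraint; the functional form used in the letter is not given here, so it is not reproduced.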
|
74
|
Ge J, Hadjiiski LM, Sahiner B, Wei J, Helvie MA, Zhou C, Chan HP. Computer-aided detection system for clustered microcalcifications: comparison of performance on full-field digital mammograms and digitized screen-film mammograms. Phys Med Biol 2007; 52:981-1000. [PMID: 17264365 PMCID: PMC2742213 DOI: 10.1088/0031-9155/52/4/008] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 11/11/2022]
Abstract
We have developed a computer-aided detection (CAD) system to detect clustered microcalcifications automatically on full-field digital mammograms (FFDMs) and a CAD system for screen-film mammograms (SFMs). The two systems used the same computer vision algorithms but their false positive (FP) classifiers were trained separately with sample images of each modality. In this study, we compared the performance of the CAD systems for detection of clustered microcalcifications on pairs of FFDM and SFM obtained from the same patient. For case-based performance evaluation, the FFDM CAD system achieved detection sensitivities of 70%, 80% and 90% at an average FP cluster rate of 0.07, 0.16 and 0.63 per image, compared with an average FP cluster rate of 0.15, 0.38 and 2.02 per image for the SFM CAD system. The difference was statistically significant with the alternative free-response receiver operating characteristic (AFROC) analysis. When evaluated on data sets negative for microcalcification clusters, the average FP cluster rates of the FFDM CAD system were 0.04, 0.11 and 0.33 per image at detection sensitivity level of 70%, 80% and 90% compared with an average FP cluster rate of 0.08, 0.14 and 0.50 per image for the SFM CAD system. When evaluated for malignant cases only, the difference of the performance of the two CAD systems was not statistically significant with AFROC analysis.
Affiliation(s)
- Jun Ge
- Department of Radiology, University of Michigan, CGC B2103, 1500 E Medical Center Drive, Ann Arbor, MI 48109, USA.
|
75
|
Shiraishi J, Li F, Doi K. Computer-aided diagnosis for improved detection of lung nodules by use of posterior-anterior and lateral chest radiographs. Acad Radiol 2007; 14:28-37. [PMID: 17178363 PMCID: PMC1892186 DOI: 10.1016/j.acra.2006.09.057] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.4] [Received: 06/16/2006] [Revised: 09/25/2006] [Accepted: 09/26/2006] [Indexed: 12/01/2022]
Abstract
RATIONALE AND OBJECTIVES We developed a computerized scheme for detection of lung nodules in the lateral views of chest radiographs, in order to improve the overall performance in combination with the computer-aided diagnostic (CAD) scheme for posterior-anterior (PA) views. MATERIALS AND METHODS We used 106 pairs of PA and lateral views of chest radiographs (122 lung nodules) for development of the CAD scheme. In the CAD scheme for lateral views, initial candidates of lung nodules were identified by use of a nodule enhancement filter based on the edge gradients. Thirty-four image features extracted from the original and the nodule-enhanced images were used for the rule-based scheme and for artificial neural networks (ANNs) for removal of some false-positive candidates. The computer performance was evaluated with a leave-one-case-out test method for ANNs. For PA views, we used the existing CAD scheme, which was trained with one-half of 924 chest images and then tested with the remaining images. RESULTS When the CAD scheme was applied only to PA views, the sensitivity in the detection of lung nodules was 70.5%, with 4.9 false positives per image. Although the performance of the computerized scheme for lateral views was relatively low (60.7% sensitivity with 1.7 false positives per image), the overall sensitivity (86.9%) was improved (6.6 false positives per two views), because 20 (16.4%) of the 122 nodules were detected only on lateral views. CONCLUSIONS The CAD scheme by use of lateral-view images has the potential to improve the overall performance for detection of lung nodules on chest radiographs when combined with a conventional CAD scheme for standard PA views.
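The combined figures reported in this abstract can be checked with simple arithmetic. In the Python sketch below, the count of 86 nodules detected on PA views is an assumption inferred from the reported 70.5% of 122; the other inputs are taken directly from the abstract.

```python
n_nodules = 122          # reported: lung nodules in the 106 image pairs
pa_detected = 86         # ASSUMPTION: 86/122 rounds to the reported 70.5%
lateral_only = 20        # reported: nodules detected only on lateral views

pa_sensitivity = round(100 * pa_detected / n_nodules, 1)
lateral_only_share = round(100 * lateral_only / n_nodules, 1)

# A nodule counts as detected if either view finds it, so the nodules
# found only on the lateral view add directly to the PA detections.
combined_sensitivity = round(100 * (pa_detected + lateral_only) / n_nodules, 1)

# False positives from the two views are summed per pair of views.
fp_per_two_views = round(4.9 + 1.7, 1)
```

The sketch also shows why the gain in sensitivity has a cost: false positives from both views accumulate, so the combined operating point is quoted per two views rather than per image.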
Affiliation(s)
- Junji Shiraishi
- Kurt Rossmann Laboratories for Radiologic Image Research, Department of Radiology, The University of Chicago, MC2026, 5841 S. Maryland Avenue, Chicago, IL 60637, USA.
|
76
|
Chakraborty D, Yoon HJ, Mello-Thoms C. Spatial localization accuracy of radiologists in free-response studies: Inferring perceptual FROC curves from mark-rating data. Acad Radiol 2007; 14:4-18. [PMID: 17178361 PMCID: PMC1829298 DOI: 10.1016/j.acra.2006.10.015] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.5] [Received: 09/12/2006] [Revised: 10/24/2006] [Accepted: 10/26/2006] [Indexed: 11/25/2022]
Abstract
RATIONALE AND OBJECTIVES Free-response data consist of a set of mark-rating pairs. Before analysis, the data are classified or "scored" into lesion and non-lesion localizations. The scoring is done by choosing an acceptance-radius and classifying marks within the acceptance-radius of lesion centers as lesion localizations; all other marks are classified as non-lesion localizations. The scored data are plotted as a free-response receiver operating characteristic (FROC) curve, essentially a plot of appropriately normalized numbers of lesion localizations vs. non-lesion localizations. Scored FROC curves are frequently used to compare imaging systems and computer-aided detection (CAD) algorithms. However, the choice of acceptance-radius is arbitrary. This makes it difficult to compare curves from different studies and to estimate true performance. MATERIALS AND METHODS To resolve this issue the concept of two types of marks is introduced: perceptual hits and perceptual misses. A perceptual hit is a mark made in response to the observer seeing the lesion. A perceptual miss is a mark made in response to the observer seeing a (lesion-like) non-lesion. A method of estimating the most probable numbers of perceptual hits and misses is described. This allows one to plot a perceptual FROC operating point and by extension a perceptual FROC curve. Unlike a scored FROC operating point, a perceptual point is independent of the choice of acceptance-radius. The method does not allow one to identify individual marks as perceptual hits or misses, only the most probable numbers. It is based on a three-parameter statistical model of the spatial distributions of perceptual hits and misses relative to lesion centers. RESULTS The method has been applied to an observer dataset in which mammographers and residents with different levels of experience were asked to locate lesions in mammograms.
The perceptual operating points suggest superior performance for the mammographers and equivalent performance for residents in the first and second mammography rotations. These results and the model validation are preliminary as they are based on a small dataset. CONCLUSION The significance of this study is showing that it is possible to probabilistically determine if a mark resulted from seeing a lesion or a non-lesion. Using the method developed in this study one could perform acceptance-radius independent estimation of observer performance.
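The conventional acceptance-radius scoring that this abstract describes (and argues is arbitrary) can be sketched in a few lines of Python. The function name `score_marks` and the data layout are hypothetical, and the sketch makes no attempt to handle several marks falling on the same lesion; it does not implement the perceptual hit and miss estimation proposed in the paper.

```python
import math


def score_marks(marks, lesion_centers, radius):
    """Split mark ratings into lesion and non-lesion localizations.

    marks          -- iterable of (x, y, rating) tuples
    lesion_centers -- iterable of (x, y) lesion centers
    radius         -- acceptance radius, in the same units as the coordinates
    """
    lesion_locs, nonlesion_locs = [], []
    for x, y, rating in marks:
        # A mark is a lesion localization if it lies within the
        # acceptance radius of at least one lesion center.
        hit = any(math.hypot(x - cx, y - cy) <= radius
                  for cx, cy in lesion_centers)
        (lesion_locs if hit else nonlesion_locs).append(rating)
    return lesion_locs, nonlesion_locs


# Illustrative data only: one lesion, three marks.
marks = [(10, 10, 0.9), (14, 10, 0.5), (50, 50, 0.7)]
lesions = [(10, 10)]
tight = score_marks(marks, lesions, radius=3)
loose = score_marks(marks, lesions, radius=5)
```

With the same marks, the mark 4 units from the lesion center switches class between the two radii, so the scored operating point moves even though the observer's behavior is unchanged.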
Affiliation(s)
- Dev Chakraborty
- Department of Radiology, University of Pittsburgh, 3520 5th Avenue, Suite 300, Pittsburgh, PA 15261, USA.
|
77
|
Shiraishi J, Li Q, Suzuki K, Engelmann R, Doi K. Computer-aided diagnostic scheme for the detection of lung nodules on chest radiographs: localized search method based on anatomical classification. Med Phys 2006; 33:2642-53. [PMID: 16898468 DOI: 10.1118/1.2208739] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.7] [Indexed: 11/07/2022] Open
Abstract
We developed an advanced computer-aided diagnostic (CAD) scheme for the detection of various types of lung nodules on chest radiographs intended for implementation in clinical situations. We used 924 digitized chest images (992 noncalcified nodules) which had a 500 x 500 matrix size with a 1024 gray scale. The images were divided randomly into two sets which were used for training and testing of the computerized scheme. In this scheme, the lung field was first segmented by use of a ribcage detection technique, and then a large search area (448 x 448 matrix size) within the chest image was automatically determined by taking into account the locations of a midline and a top edge of the segmented ribcage. In order to detect lung nodule candidates based on a localized search method, we divided the entire search area into 7 x 7 regions of interest (ROIs: 64 x 64 matrix size). In the next step, each ROI was classified anatomically into apical, peripheral, hilar, and diaphragm/heart regions by use of its image features. Identification of lung nodule candidates and extraction of image features were applied for each localized region (128 x 128 matrix size), each having its central part (64 x 64 matrix size) located at a position corresponding to a ROI that was classified anatomically in the previous step. Initial candidates were identified by use of the nodule-enhanced image obtained with the average radial-gradient filtering technique, in which the filter size was varied adaptively depending on the location and the anatomical classification of the ROI. We extracted 57 image features from the original and nodule-enhanced images based on geometric, gray-level, background structure, and edge-gradient features. In addition, 14 image features were obtained from the corresponding locations in the contralateral subtraction image. 
A total of 71 image features were employed for three sequential artificial neural networks (ANNs) in order to reduce the number of false-positive candidates. All parameters for ANNs, i.e., the number of iterations, slope of sigmoid functions, learning rate, and threshold values for removing the false positives, were determined automatically by use of a bootstrap technique with training cases. We employed four different combinations of training and test image data sets, which were selected randomly from the 924 cases. By use of our localized search method based on anatomical classification, the average sensitivity was increased to 92.5% with 59.3 false positives per image at the level of initial detection for four different sets of test cases, whereas our previous technique achieved a sensitivity of 82.8% with 56.8 false positives per image. The computer performance in the final step obtained from four different data sets indicated that the average sensitivity in detecting lung nodules was 70.1% with 5.0 false positives per image for testing cases and 70.4% sensitivity with 4.2 false positives per image for training cases. The advanced CAD scheme involving the localized search method with anatomical classification provided improved detection of pulmonary nodules on chest radiographs for 924 lung nodule cases.
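The geometry of the localized search described in this abstract can be made concrete with a small Python sketch: a 448 x 448 search area tiled into 7 x 7 ROIs of 64 x 64, each centered inside a 128 x 128 local region. The function names are hypothetical, coordinates are relative to the search area, and how the authors handle local regions that extend past the border is not stated, so the sketch simply returns the unclipped bounds.

```python
def roi_grid(search_size=448, roi_size=64):
    """Top-left (row, col) corners of the non-overlapping ROIs."""
    n = search_size // roi_size  # 448 // 64 = 7 ROIs per side
    return [(r * roi_size, c * roi_size) for r in range(n) for c in range(n)]


def local_region(roi_origin, roi_size=64, region_size=128):
    """Bounds (top, left, bottom, right) of the region centered on an ROI."""
    margin = (region_size - roi_size) // 2  # 32 pixels on each side
    top, left = roi_origin
    return (top - margin, left - margin,
            top - margin + region_size, left - margin + region_size)


rois = roi_grid()
region = local_region((64, 64))

# Feature count reported in the abstract: 57 from the original and
# nodule-enhanced images plus 14 from the contralateral subtraction image.
n_features = 57 + 14
```

Because each local region is twice the ROI width, neighboring regions overlap by half, which lets a candidate near an ROI edge still be analyzed with full surrounding context.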
Affiliation(s)
- Junji Shiraishi
- Kurt Rossmann Laboratories for Radiologic Image Research, Department of Radiology, The University of Chicago, 5841 S. Maryland Avenue, MC 2026, Chicago, Illinois 60637, USA.
|
78
|
Popescu LM, Lewitt RM. Small nodule detectability evaluation using a generalized scan-statistic model. Phys Med Biol 2006; 51:6225-44. [PMID: 17110782 DOI: 10.1088/0031-9155/51/23/020] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.9] [Indexed: 11/12/2022]
Abstract
This paper investigates the use of the scan statistic for evaluating the detectability of small nodules in medical images. The scan-statistic method is often used in applications in which random fields must be searched for abnormal local features. Several results of the detection with localization theory are reviewed and a generalization is presented using the noise nodule distribution obtained by scanning arbitrary areas. One benefit of the noise nodule model is that it enables determination of the scan-statistic distribution by using only a few image samples in a way suitable both for simulation and experimental setups. Also, based on the noise nodule model, the case of multiple targets per image is addressed and an image abnormality test using the likelihood ratio and an alternative test using multiple decision thresholds are derived. The results obtained reveal that in the case of low contrast nodules or multiple nodules the usual test strategy based on a single decision threshold underperforms compared with the alternative tests. That is a consequence of the fact that not only the contrast or the size, but also the number of suspicious nodules is a clue indicating the image abnormality. In the case of the likelihood ratio test, the multiple clues are unified in a single decision variable. Other tests that process multiple clues differently do not necessarily produce a unique ROC curve, as shown in examples using a test involving two decision thresholds. We present examples with two-dimensional time-of-flight (TOF) and non-TOF PET image sets analysed using the scan statistic for different search areas, as well as the fixed position observer.
Affiliation(s)
- Lucreţiu M Popescu
- Department of Radiology, University of Pennsylvania, 423 Guardian Drive, 4th floor Blockley Hall, Philadelphia, PA 19104-6021, USA.
|
79
|
Chakraborty DP. Analysis of location specific observer performance data: validated extensions of the jackknife free-response (JAFROC) method. Acad Radiol 2006; 13:1187-93. [PMID: 16979067 DOI: 10.1016/j.acra.2006.06.016] [Citation(s) in RCA: 98] [Impact Index Per Article: 5.4] [Received: 05/03/2006] [Revised: 05/03/2006] [Accepted: 06/20/2006] [Indexed: 11/25/2022]
Abstract
RATIONALE AND OBJECTIVES The free-response paradigm is being increasingly used in the assessment of medical imaging systems. The currently implemented method of analyzing the data, namely jackknife free-response (JAFROC) analysis, has some validation and applicability limitations. The purpose of this work is to address these limitations. MATERIALS AND METHODS The general principles of modality evaluation and methodology validation are reviewed. A model for simulating free-response data was used to test the statistical validity of several methods of analyzing the data. The methods differed only in the choice of the figure of merit used to quantify performance. Statistical validity was judged by investigating the behaviors of the methods under null hypothesis conditions of no difference between modalities. RESULTS The validity of the different methods of analyzing the data was found to be dependent on the choice of figure of merit. A figure of merit is identified that accommodates abnormal images with multiple (one or more) lesions, detections of which could have different clinical significances (weights). This figure of merit is shown to be statistically valid. An extension of the analysis to single reader interpretations of images from different modalities is also shown to be statistically valid. CONCLUSION With the validated enhancements, JAFROC is expected to be of greater utility to users of the free-response method. The extension to single-reader interpretations should be of particular value to developers of image processing algorithms, including developers of computer-aided diagnosis algorithms.
Affiliation(s)
- Dev P Chakraborty
- Department of Radiology, University of Pittsburgh, 3520 Fifth Avenue, Suite 300, Pittsburgh, PA 15261, USA.
|
80
|
Chakraborty DP. A search model and figure of merit for observer data acquired according to the free-response paradigm. Phys Med Biol 2006; 51:3449-62. [PMID: 16825742 PMCID: PMC2230665 DOI: 10.1088/0031-9155/51/14/012] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.3] [Indexed: 11/12/2022]
Abstract
Search is a basic activity that is performed routinely in many different tasks. In the context of medical imaging it involves locating lesions in images under conditions of uncertainty regarding the number and locations of lesions that may be present. A search model is presented that applies to situations, as in the free-response paradigm, where on each image the number of normal regions that could be mistaken for lesions is unknown, and the number of observer generated localizations of suspicious regions (marks) is unpredictable. The search model is based on a two-stage model that has been proposed in the literature, according to which, at the first stage (the preattentive stage) the observer uses mainly peripheral vision to identify likely lesion candidates, and at the second stage the observer decides (i.e., cognitively evaluates) whether or not to report the candidates. The search model regards the unpredictable numbers of lesion and non-lesion localizations as random variables and models them via appropriate statistical distributions. The model has three parameters quantifying the lesion signal-to-noise ratio, the observer's expertise at rejecting non-lesion locations, and the observer's expertise at finding lesions. A figure-of-merit quantifying the observer's search performance is described. The search model bears a close resemblance to the initial detection and candidate analysis (IDCA) model that has been recently proposed for analysing computer aided detection (CAD) algorithms. The ability to analytically model and quantify the search process would enable more powerful assessment and optimization of performance in these activities, which could be highly significant.
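A two-stage search model of the kind described in this abstract can be simulated with a short Python sketch. The abstract says only that the numbers of localizations are modeled "via appropriate statistical distributions"; the Poisson count for non-lesion candidates, the binomial count for found lesions, and the unit-variance normal decision variables used below are assumptions made for illustration, as are the parameter names `mu`, `lam` and `nu`.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson(lam) count (Knuth's method, suitable for small lam)."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_image(rng, n_lesions, mu, lam, nu):
    """Simulate first-stage candidates and their decision variables.

    mu  -- lesion signal-to-noise ratio (mean of lesion decision variable)
    lam -- mean number of non-lesion candidates per image (ASSUMED Poisson)
    nu  -- probability that each lesion becomes a candidate (ASSUMED binomial)
    Returns (noise_values, lesion_values); the second stage would report
    only those candidates whose value exceeds the observer's threshold.
    """
    n_noise = poisson(rng, lam)
    n_found = sum(rng.random() < nu for _ in range(n_lesions))
    noise_values = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
    lesion_values = [rng.gauss(mu, 1.0) for _ in range(n_found)]
    return noise_values, lesion_values


rng = random.Random(0)  # fixed seed so the sketch is repeatable
noise_values, lesion_values = simulate_image(rng, n_lesions=3, mu=2.0,
                                             lam=1.5, nu=0.8)
```

In this parameterization a smaller `lam` corresponds to better rejection of non-lesion locations and a larger `nu` to better lesion finding, matching the three roles the abstract assigns to the model parameters.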
Affiliation(s)
- D P Chakraborty
- Department of Radiology, University of Pittsburgh, 3520 5th Avenue, Suite 300, Pittsburgh, PA 15261, USA.
|
81
|
Long X, Cleveland WL, Yao YL. Automatic detection of unstained viable cells in bright field images using a support vector machine with an improved training procedure. Comput Biol Med 2006; 36:339-62. [PMID: 16488772 DOI: 10.1016/j.compbiomed.2004.12.002] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.7] [Received: 09/07/2004] [Accepted: 12/03/2004] [Indexed: 11/25/2022]
Abstract
Detection of unstained viable cells in bright field images is an inherently difficult task due to the immense variability of cell appearance. Traditionally, it has required human observers. However, in high-throughput robotic systems, an automatic procedure is essential. In this paper, we formulate viable cell detection as a supervised, binary pattern recognition problem and show that a support vector machine (SVM) with an improved training algorithm provides highly effective cell identification. In the case of cell detection, the binary classification problem generates two classes, one of which is much larger than the other. In addition, the total number of samples is extremely large. This combination represents a difficult problem for SVMs. We solved this problem with an iterative training procedure ("Compensatory Iterative Sample Selection", CISS). This procedure, which was systematically studied under various class size ratios and overlap conditions, was found to outperform several commonly used methods, primarily owing to its ability to choose the most representative samples for the decision boundary. Its speed and accuracy are sufficient for use in a practical system.
Affiliation(s)
- Xi Long
- Mechanical Engineering Department, Columbia University, 500 West 120th Street, 220 Mudd. MC 4703, New York, NY 10027, USA
|
82
|
Offiah AC, Moon L, Hall CM, Todd-Pokropek A. Diagnostic accuracy of fracture detection in suspected non-accidental injury: the effect of edge enhancement and digital display on observer performance. Clin Radiol 2006; 61:163-73. [PMID: 16439222 DOI: 10.1016/j.crad.2005.09.004] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Received: 03/18/2005] [Revised: 09/13/2005] [Accepted: 09/13/2005] [Indexed: 11/17/2022]
Abstract
AIM To compare the effect of varying degrees of edge enhancement and method of digital image display on fracture detection in suspected non-accidental injury (NAI). MATERIALS AND METHODS Fifty radiographs from post-mortem skeletal surveys in 13 children with suspected NAI were selected. Images were obtained using a Fuji 5000R computed radiography system. Hard copies were printed with edge enhancement factors 0, 0.5 and 1.2. Images (edge enhancement 0.5) were also displayed on a 1K² monitor. Six observers independently evaluated all 200 images for the presence of abnormality. Observers also scored each image for visualization of soft tissues, visualization of trabecular markings and overall image quality. The paired Student's t-test and location receiver operating curve (ROC) analysis were used to compare quality scores and diagnostic accuracy of each display method. Individual and pooled true-positive rates (sensitivity) were determined. For the purposes of ROC analysis, histology was taken as the gold standard. RESULTS There was no difference in duration of hard and soft-copy reading sessions (p=0.76). After image manipulation soft-copy radiographs scored significantly better for image quality than hard copy (p<0.0001). Pooled observer sensitivity (at a specificity of 90%) was below 50% for all display methods. Diagnostic accuracy varied significantly between observers. Diagnostic accuracy of individual observers was not affected by display method. CONCLUSION In suspected NAI, diagnostic accuracy of fracture detection is generally low. Diagnostic accuracy appears to be affected more by observer-related factors than by the method of digital image display.
Affiliation(s)
- A C Offiah
- Department of Radiology, Great Ormond Street Hospital for Children, London, UK.
|
83
|
Wei J, Sahiner B, Hadjiiski LM, Chan HP, Petrick N, Helvie MA, Roubidoux MA, Ge J, Zhou C. Computer-aided detection of breast masses on full field digital mammograms. Med Phys 2006; 32:2827-38. [PMID: 16266097 PMCID: PMC2742215 DOI: 10.1118/1.1997327] [Citation(s) in RCA: 80] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022] Open
Abstract
We are developing a computer-aided detection (CAD) system for breast masses on full field digital mammographic (FFDM) images. To develop a CAD system that is independent of the FFDM manufacturer's proprietary preprocessing methods, we used the raw FFDM image as input and developed a multiresolution preprocessing scheme for image enhancement. A two-stage prescreening method that combines gradient field analysis with gray level information was developed to identify mass candidates on the processed images. The suspicious structure in each identified region was extracted by clustering-based region growing. Morphological and spatial gray-level dependence texture features were extracted for each suspicious object. Stepwise linear discriminant analysis (LDA) with simplex optimization was used to select the most useful features. Finally, rule-based and LDA classifiers were designed to differentiate masses from normal tissues. Two data sets were collected: a mass data set containing 110 cases of two-view mammograms with a total of 220 images, and a no-mass data set containing 90 cases of two-view mammograms with a total of 180 images. All cases were acquired with a GE Senographe 2000D FFDM system. The true locations of the masses were identified by an experienced radiologist. Free-response receiver operating characteristic analysis was used to evaluate the performance of the CAD system. It was found that our CAD system achieved a case-based sensitivity of 70%, 80%, and 90% at 0.72, 1.08, and 1.82 false positive (FP) marks/image on the mass data set. The FP rates on the no-mass data set were 0.85, 1.31, and 2.14 FP marks/image, respectively, at the corresponding sensitivities. This study demonstrated the usefulness of our CAD techniques for automated detection of masses on FFDM images.
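As a rough illustration of the free-response evaluation mentioned in this abstract, the sketch below computes FROC operating points (sensitivity versus false-positive marks per image) by sweeping a score threshold. The function name `froc_curve`, the mark layout, and the toy data are hypothetical. The sketch uses lesion-based sensitivity, whereas the abstract reports case-based sensitivity, and it is not the authors' implementation.

```python
from typing import List, Optional, Tuple

# A mark is (score, lesion_id); lesion_id is None for a false-positive mark.
Mark = Tuple[float, Optional[int]]


def froc_curve(marks: List[Mark], n_lesions: int, n_images: int) -> List[Tuple[float, float, float]]:
    """Return (threshold, FP marks per image, lesion sensitivity) for each distinct score."""
    thresholds = sorted({score for score, _ in marks}, reverse=True)
    points = []
    for t in thresholds:
        kept = [(s, lid) for s, lid in marks if s >= t]
        # A lesion counts once, even if several marks hit it.
        hit_lesions = {lid for _, lid in kept if lid is not None}
        n_fp = sum(1 for _, lid in kept if lid is None)
        points.append((t, n_fp / n_images, len(hit_lesions) / n_lesions))
    return points


# Toy data (assumed): 2 true lesions, 4 images, 6 marks.
toy_marks = [(0.9, 0), (0.8, None), (0.7, 1), (0.6, None), (0.5, 0), (0.4, None)]
for threshold, fp_per_image, sensitivity in froc_curve(toy_marks, n_lesions=2, n_images=4):
    print(f"t={threshold:.1f}  FP/image={fp_per_image:.2f}  sensitivity={sensitivity:.2f}")
```

Reading off the false-positive rate at a fixed sensitivity, as the abstract does for 70%, 80%, and 90%, amounts to picking the first point on this curve that reaches the target sensitivity.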
Affiliation(s)
- Jun Wei
- Department of Radiology, University of Michigan, Ann Arbor, Michigan 48109, USA.
|
84
|
Arodź T, Kurdziel M, Popiela TJ, Sevre EOD, Yuen DA. Detection of clustered microcalcifications in small field digital mammography. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2006; 81:56-65. [PMID: 16310282 DOI: 10.1016/j.cmpb.2005.10.002] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/20/2004] [Revised: 10/02/2005] [Accepted: 10/04/2005] [Indexed: 05/05/2023]
Abstract
The most frequent symptoms of ductal carcinoma recognised by mammography are clusters of microcalcifications. Their detection from mammograms is difficult, especially for glandular breasts. We present a new computer-aided detection system for small field digital mammography in planning of breast biopsy. The system processes the mammograms in several steps. First, we filter the original picture with a filter that is sensitive to microcalcification contrast shape. Then, we enhance the mammogram contrast by using a wavelet-based sharpening algorithm. Afterwards, we present to the radiologist, for visual analysis, the contrast-enhanced mammogram with suggested positions of microcalcification clusters. We have evaluated the usefulness of the system with the help of four experienced radiologists, who found that it significantly improves the detection of microcalcifications in small field digital mammography.
Affiliation(s)
- Tomasz Arodź
- Institute of Computer Science, AGH University of Science and Technology, al. Mickiewicza 30, 30-059 Kraków, Poland.
|
85
|
Arora R, Kundel HL, Beam CA. Perceptually based FROC analysis. Acad Radiol 2005; 12:1567-74. [PMID: 16321746 DOI: 10.1016/j.acra.2005.06.015] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/08/2005] [Revised: 06/28/2005] [Accepted: 06/29/2005] [Indexed: 11/26/2022]
Abstract
RATIONALE AND OBJECTIVES Analysis of reading data when cases have multiple targets and/or the reader is required to localize targets is difficult. One approach to this free-response operating characteristic (FROC) problem is for images to be segmented (eg, with quadrants) by the investigator and a segment-level analysis be conducted with the case as a nesting factor. In this report, we introduce an alternative method that uses the visual scan path of the reader to segment the image. We evaluate the new method by applying it to data from a mammography reading experiment. MATERIALS AND METHODS The gaze scan path of one radiologist was recorded as she scanned 40 mammograms for masses and microcalcifications. The observer is an experienced mammographer and was not one of the authors. In addition, the reader provided a rating indicating the degree of suspicion for any suspected targets she identified and localized. We then established "perceptual regions" by using a clustering algorithm on the visual fixations. We combined ratings given to specific locations indicated by the reader with the segmentation from the visual scan to generate a series of ratings classified for whether the perceptually based region associated with the rating contained or did not contain a known target. We analyzed data generated by our method from all 40 cases by using the conventional maximum-likelihood method based on the binormal model. Finally, we tested goodness-of-fit of the binormal model to the data by using chi-square. RESULTS Maximum-likelihood estimation led to a model that did not fit the data (P < .001). However, examination of the observed and expected counts suggests that the binormal assumption does not hold for segments that contain targets and a bimodal distribution model might be preferred. CONCLUSION Our new method provides an alternative approach to analysis of the FROC experiment. It needs to be developed further. 
Specifically, we propose that a mixture model extension of the binormal model be developed for ratings data arising from perceptually based FROC experiments. A disadvantage to our method is the requirement to record the scan path of the reader. However, we believe that adding such information to receiver operating characteristic (ROC) curve analysis will pay off when appropriate statistical models have been identified because we believe our data support our hypothesis that the perceptual scanning of images by humans deconvolves interpretation correlation. If true, this hypothesis implies that conventional statistical methods for ROC analysis based on independent data can be applied to the analysis of FROC data after conditioning on the scan path of the observer.
Affiliation(s)
- Rachna Arora
- Biostatistics Core, H. Lee Moffitt Cancer Center and Research Institute, 12902 Magnolia Drive, Tampa, Florida 33612, USA.
|
86
|
Penedo M, Souto M, Tahoces PG, Carreira JM, Villalón J, Porto G, Seoane C, Vidal JJ, Berbaum KS, Chakraborty DP, Fajardo LL. Free-response receiver operating characteristic evaluation of lossy JPEG2000 and object-based set partitioning in hierarchical trees compression of digitized mammograms. Radiology 2005; 237:450-7. [PMID: 16244253 DOI: 10.1148/radiol.2372040996] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
Abstract
PURPOSE To assess the effects of two irreversible wavelet-based compression algorithms--Joint Photographic Experts Group (JPEG) 2000 and object-based set partitioning in hierarchical trees (SPIHT)--on the detection of clusters of microcalcifications and masses on digitized mammograms. MATERIALS AND METHODS The use of the images in this retrospective image-collection study was approved by the institutional review board, and patient informed consent was not required. One hundred twelve mammographic images (28 with one or two clusters of microcalcifications, 19 with one mass, 17 with both abnormal findings, and 48 with normal findings) obtained in 60 women who ranged in age from 25 to 79 years were digitized and compressed at 40:1 and 80:1 by using the JPEG2000 and object-based SPIHT methods. Five experienced radiologists were asked to locate and rate clusters of microcalcifications and masses on the original and compressed images in a free-response receiver operating characteristic (FROC) data acquisition paradigm. Observer performance was evaluated with the jackknife FROC method. RESULTS The mean FROC figures of merit for detecting clusters of microcalcifications, masses, and both radiographic findings on uncompressed images were 0.80, 0.81, and 0.72, respectively. With object-based SPIHT 80:1 compression, the corresponding values were larger than the values for uncompressed images by 0.005, 0.009, and -0.005, respectively. The 95% confidence interval for the differences in figures of merit between compressed and uncompressed images was -0.039, 0.033 for the microcalcification finding; -0.055, 0.034 for the mass finding; and -0.039, 0.030 for both findings. Because each of these confidence intervals includes zero, no significant difference in detection accuracy between uncompressed and object-based SPIHT 80:1 compression was observed at a P value of 5%. 
The F test of the null hypothesis that all of the modes (uncompressed and four compressed modes) were equivalent yielded the following results: F = 0.255, P = .903 for the microcalcification finding; F = 0.340, P = .848 for the mass finding; and F = 0.122, P = .975 for both findings. CONCLUSION To within the accuracy of these measurements, lossy compression of digital mammographic data at 80:1 with JPEG2000 or the object-based SPIHT algorithm can be performed without decreasing the rate of detection of clusters of microcalcifications and masses.
Affiliation(s)
- Mónica Penedo
- Laboratorio de Imagen Médica, Hospital General Universitario Gregorio Marañón, C/Ibiza 43, 28009 Madrid, Spain.
|
87
|
Coxson HO, Baile EM, King GG, Mayo JR. Diagnosis of subsegmental pulmonary emboli: a multi-center study using a porcine model. J Thorac Imaging 2005; 20:24-31. [PMID: 15729119 DOI: 10.1097/01.rti.0000155044.82156.ad] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
We measured sensitivity, positive predictive value, and free-response receiver operating characteristic (FROC) of 20 radiologists detecting subsegmental-sized pulmonary emboli in a porcine model using either contrast-enhanced computed tomography (CT) or digital subtraction (DS) pulmonary angiography. Colored methacrylate beads (4.2 and 3.8 mm diameter) were injected into 9 anesthetized juvenile pigs. CT and DS pulmonary angiography images were obtained before and after a pulmonary infiltrate was introduced into the lower lobes. Following imaging, the pigs were euthanized, and the pulmonary arterial tree was cast using clear methacrylate allowing direct visualization of emboli. The 20 radiologists used a custom-made computer application to display the images on their personal computer and record their diagnoses. The results were mailed electronically to the coordinating center for comparison with the cast of the pulmonary vasculature. Twenty-three emboli were included in the statistical analysis. Overall sensitivity for spiral CT and angiography, respectively, was 60 ± 18% and 72 ± 11% (P = 0.06). Positive predictive value for spiral CT and angiography, respectively, was 49 ± 24% and 58 ± 23% (P = 0.25). There was a large variation in both sensitivity and positive predictive value between readers. There was no difference in sensitivity or positive predictive value between radiologists from community or academic centers (P > 0.27). FROC analysis showed no significant difference between CT and DS (P = 0.27). In conclusion, in this porcine model, there is no overall diagnostic advantage to using DS pulmonary angiography rather than contrast-enhanced spiral CT for the diagnosis of PE when images are interpreted by radiologists located in either academic or community hospital settings.
Affiliation(s)
- Harvey O Coxson
- Department of Radiology, Vancouver General Hospital, Vancouver, BC, Canada.
|
88
|
Partain CL, Chan HP, Gelovani JG, Giger ML, Izatt JA, Jolesz FA, Kandarpa K, Li KCP, McNitt-Gray M, Napel S, Summers RM, Gazelle GS. Biomedical Imaging Research Opportunities Workshop II: Report and Recommendations. Radiology 2005; 236:389-403. [PMID: 16040898 DOI: 10.1148/radiol.2362041876] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/02/2023]
Affiliation(s)
- C Leon Partain
- Dept of Radiology, Vanderbilt Univ Medical Ctr, RR-1223, MCN, 1161 21st Ave South, Nashville, TN 37232-2675, USA
|
89
|
Bornefalk H. Estimation and comparison of CAD system performance in clinical settings. Acad Radiol 2005; 12:687-94. [PMID: 15935967 DOI: 10.1016/j.acra.2005.02.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2004] [Revised: 01/01/2005] [Accepted: 02/03/2005] [Indexed: 10/25/2022]
Abstract
RATIONALE AND OBJECTIVES Computer-aided detection (CAD) systems are frequently compared using free-response receiver operating characteristic (FROC) curves. While there are ample statistical methods for comparing FROC curves, when one is interested in comparing the outcomes of 2 CAD systems applied in a typical clinical setting, there is the additional matter of correctly determining the system operating point. This article shows how the effect of the sampling error on determining the correct CAD operating point can be captured. By incorporating this uncertainty, a method is presented that allows estimation of the probability with which a particular CAD system performs better than another on unseen data in a clinical setting. MATERIALS AND METHODS The distribution of possible clinical outcomes from 2 artificial CAD systems with different FROC curves is examined. The sampling error is captured by the distribution of possible system thresholds of the classifying machine that yields a specified sensitivity. After introducing a measure of superiority, the probability of one system being superior to the other can be determined. RESULTS It is shown that for 2 typical mammography CAD systems, each trained on independent representative datasets of 100 cases, the FROC curves must be separated by 0.20 false positives per image in order to conclude that there is a 90% probability that one is better than the other in a clinical setting. Also, there is no apparent gain in increasing the size of the training set beyond 100 cases. DISCUSSION CAD systems for mammography are modeled for illustrative purposes, but the method presented is applicable to any computer-aided detection system evaluated with FROC curves. The presented method is designed to construct confidence intervals around possible clinical outcomes and to assess the importance of training set size and separation between FROC curves of systems trained on different datasets.
Affiliation(s)
- Hans Bornefalk
- Royal Institute of Technology, AlbaNova University Center, Department of Physics, SE--106 91 Stockholm, Sweden.
|
90
|
Bilello M, Gokturk SB, Desser T, Napel S, Jeffrey RB, Beaulieu CF. Automatic detection and classification of hypodense hepatic lesions on contrast-enhanced venous-phase CT. Med Phys 2005; 31:2584-93. [PMID: 15487741 DOI: 10.1118/1.1782674] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022] Open
Abstract
The objective of this work was to develop and validate algorithms for detection and classification of hypodense hepatic lesions, specifically cysts, hemangiomas, and metastases from CT scans in the portal venous phase of enhancement. Fifty-six CT sections from 51 patients were used as representative of common hypodense liver lesions, including 22 simple cysts, 11 hemangiomas, 22 metastases, and 1 image containing both a cyst and a hemangioma. The detection algorithm uses intensity-based histogram methods to find central lesions, followed by liver contour refinement to identify peripheral lesions. The classification algorithm operates on the focal lesions identified during detection, and includes shape-based segmentation, edge pixel weighting, and lesion texture filtering. Support vector machines are then used to perform a pair-wise lesion classification. For the detection algorithm, 80% lesion sensitivity was achieved at approximately 0.3 false positives (FP) per slice for central lesions, and 0.5 FP per slice for peripheral lesions, giving a total of 0.8 FP per section. For 90% sensitivity, the total number of FP rises to about 2.2 per section. The pair-wise classification yielded good discrimination between cysts and metastases (at 95% sensitivity for detection of metastases, only about 5% of cysts are incorrectly classified as metastases), perfect discrimination between hemangiomas and cysts, and was least accurate in discriminating between hemangiomas and metastases (at 90% sensitivity for detection of hemangiomas, about 28% of metastases were incorrectly classified as hemangiomas). Initial implementations of our algorithms are promising for automating liver lesion detection and classification.
Affiliation(s)
- Michel Bilello
- Department of Computer Science, Stanford University, Stanford, California 94305, USA.
|
91
|
Affiliation(s)
- Nancy A Obuchowski
- Department of Biostatistics and Epidemiology, Cleveland Clinic Foundation, 9500 Euclid Ave., Cleveland, OH, USA
|
92
|
Chakraborty DP. Recent advances in observer performance methodology: jackknife free-response ROC (JAFROC). RADIATION PROTECTION DOSIMETRY 2005; 114:26-31. [PMID: 15933077 DOI: 10.1093/rpd/nch512] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/02/2023]
Abstract
The jackknife free-response receiver operating characteristic (JAFROC) method allows quantitative analysis of observer data such as those obtained when radiologists interpret images that may contain more than one lesion and report a location for each perceived lesion. The method was recently validated with a perception-based simulation model that incorporated the detectability parameter of the standard binormal ROC model, and in addition allowed simultaneous samples from both noise and signal distributions. The total number of noise samples is an important new parameter that measures reader expertise. The new sampling model incorporates search, which is an integral part of lesion detection that has not been possible to model until now. The model was used to generate simulated FROC ratings data, which were used to assess the statistical validity of JAFROC analysis. We found that JAFROC analysis is a statistically valid approach for analysing FROC data and that JAFROC analysis exhibited significantly greater statistical power than the existing ROC approach.
Affiliation(s)
- Dev P Chakraborty
- University of Pittsburgh, Keystone Building Suite 300, 3520 5th Avenue, Pittsburgh, PA 15213, USA.
|
93
|
Kann MG, Thiessen PA, Panchenko AR, Schäffer AA, Altschul SF, Bryant SH. A structure-based method for protein sequence alignment. Bioinformatics 2004; 21:1451-6. [PMID: 15613392 DOI: 10.1093/bioinformatics/bti233] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION With the continuing rapid growth of protein sequence data, protein sequence comparison methods have become the most widely used tools of bioinformatics. Among these methods are those that use position-specific scoring matrices (PSSMs) to describe protein families. PSSMs can capture information about conserved patterns within families, which can be used to increase the sensitivity of searches for related sequences. Certain types of structural information, however, are not generally captured by PSSM search methods. Here we introduce a program, Structure-based ALignment TOol (SALTO), that aligns protein query sequences to PSSMs using rules for placing and scoring gaps that are consistent with the conserved regions of domain alignments from NCBI's Conserved Domain Database. RESULTS In most cases, the alignment scores obtained using the local alignment version follow an extreme value distribution. SALTO's performance in finding related sequences and producing accurate alignments is similar to or better than that of IMPALA; one advantage of SALTO is that it imposes an explicit gapping model on each protein family. AVAILABILITY A stand-alone version of the program that can generate global or local alignments is available by ftp distribution (ftp://ftp.ncbi.nih.gov/pub/SALTO/), and has been incorporated into the Cn3D structure/alignment viewer. CONTACT bryant@ncbi.nlm.nih.gov.
Affiliation(s)
- Maricel G Kann
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, Bethesda, MD 20894, USA
|
94
|
Rubin GD, Lyo JK, Paik DS, Sherbondy AJ, Chow LC, Leung AN, Mindelzun R, Schraedley-Desmond PK, Zinck SE, Naidich DP, Napel S. Pulmonary nodules on multi-detector row CT scans: performance comparison of radiologists and computer-aided detection. Radiology 2004; 234:274-83. [PMID: 15537839 DOI: 10.1148/radiol.2341040589] [Citation(s) in RCA: 174] [Impact Index Per Article: 8.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2022]
Abstract
PURPOSE To compare the performance of radiologists and of a computer-aided detection (CAD) algorithm for pulmonary nodule detection on thin-section thoracic computed tomographic (CT) scans. MATERIALS AND METHODS The study was approved by the institutional review board. The requirement of informed consent was waived. Twenty outpatients (age range, 15-91 years; mean, 64 years) were examined with chest CT (multi-detector row scanner, four detector rows, 1.25-mm section thickness, and 0.6-mm interval) for pulmonary nodules. Three radiologists independently analyzed CT scans, recorded the locus of each nodule candidate, and assigned each a confidence score. A CAD algorithm with parameters chosen by using cross validation was applied to the 20 scans. The reference standard was established by two experienced thoracic radiologists in consensus, with blind review of all nodule candidates and free search for additional nodules at a dedicated workstation for three-dimensional image analysis. True-positive (TP) and false-positive (FP) results and confidence levels were used to generate free-response receiver operating characteristic (ROC) plots. Double-reading performance was determined on the basis of TP detections by either reader. RESULTS The 20 scans showed 195 noncalcified nodules with a diameter of 3 mm or more (reference reading). Area under the alternative free-response ROC curve was 0.54, 0.48, 0.55, and 0.36 for CAD and readers 1-3, respectively. Differences between reader 3 and CAD and between readers 2 and 3 were significant (P < .05); those between CAD and readers 1 and 2 were not significant. Mean sensitivity for individual readings was 50% (range, 41%-60%); double reading resulted in increase to 63% (range, 56%-67%). With CAD used at a threshold allowing only three FP detections per CT scan, mean sensitivity was increased to 76% (range, 73%-78%). 
CAD complemented individual readers by detecting additional nodules more effectively than did a second reader; CAD-reader weighted kappa values were significantly lower than reader-reader weighted kappa values (Wilcoxon rank sum test, P < .05). CONCLUSION With CAD used at a level allowing only three FP detections per CT scan, sensitivity was substantially higher than with conventional double reading.
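The double-reading rule described in this abstract (a nodule counts as detected if either reader finds it) can be sketched as a set union. The function names and the toy nodule identifiers below are hypothetical and are not data from the study.

```python
from typing import Set


def sensitivity(detected: Set[int], reference: Set[int]) -> float:
    """Fraction of reference nodules that appear among the detections."""
    return len(detected & reference) / len(reference)


def double_reading_sensitivity(reader_a: Set[int], reader_b: Set[int], reference: Set[int]) -> float:
    """A nodule is a true positive if either reader detected it."""
    return sensitivity(reader_a | reader_b, reference)


# Toy example (assumed): 10 reference nodules, identified by integer IDs.
reference_nodules = set(range(1, 11))
reader_a = {1, 2, 3, 4, 5}
reader_b = {4, 5, 6, 7, 11}  # ID 11 is not in the reference, so it is a false positive

print(sensitivity(reader_a, reference_nodules))
print(sensitivity(reader_b, reference_nodules))
print(double_reading_sensitivity(reader_a, reader_b, reference_nodules))
```

Double reading can never score below the better of the two readers on sensitivity, but the union also pools their false positives, which this sketch does not count.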
Affiliation(s)
- Geoffrey D Rubin
- Department of Radiology, Stanford University School of Medicine, 300 Pasteur Drive, S-072, Stanford, CA 94305-5105, USA.
|
95
|
Suryanarayanan S, Karellas A, Vedantham S, Waldrop SM, D’Orsi CJ. A perceptual evaluation of JPEG 2000 image compression for digital mammography: contrast-detail characteristics. J Digit Imaging 2004; 17:64-70. [PMID: 15255520 PMCID: PMC3043965 DOI: 10.1007/s10278-003-1728-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022] Open
Abstract
In this investigation the effect of JPEG 2000 compression on the contrast-detail (CD) characteristics of digital mammography images was studied using an alternative forced choice (AFC) technique. Images of a contrast-detail phantom, acquired using a clinical full-field digital mammography system, were compressed using a commercially available software product (JPEG 2000). Data compression was achieved at ratios of 1:1, 10:1, 20:1, and 30:1 and the images were reviewed by seven observers on a high-resolution display. Psychophysical detection characteristics were first computed by fitting perception data using a maximum-likelihood technique from which CD curves were derived at 50%, 62.5%, and 75% threshold levels. Statistical analysis indicated no significant difference in the perception of mean disk thickness up to 20:1 compression except for disk diameter of 1 mm. All other compression combinations exhibited significant degradation in CD characteristics.
Affiliation(s)
- Andrew Karellas
- Department of Radiology, Emory University School of Medicine, 1364 Clifton Rd. NE, Atlanta, GA 30322 USA
- Srinivasan Vedantham
- Department of Radiology, Emory University School of Medicine, 1364 Clifton Rd. NE, Atlanta, GA 30322 USA
- Sandra M. Waldrop
- Department of Radiology, Emory University School of Medicine, 1364 Clifton Rd. NE, Atlanta, GA 30322 USA
- Carl J. D’Orsi
- Department of Radiology, Emory University School of Medicine, 1364 Clifton Rd. NE, Atlanta, GA 30322 USA
|
96
|
Edwards DC, Lan L, Metz CE, Giger ML, Nishikawa RM. Estimating three-class ideal observer decision variables for computerized detection and classification of mammographic mass lesions. Med Phys 2004; 31:81-90. [PMID: 14761024 DOI: 10.1118/1.1631912] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022] Open
Abstract
We are using Bayesian artificial neural networks (BANNs) to classify mammographic masses in schemes for computer-aided diagnosis, and we are extending this methodology to a three-class classification task. We investigated whether a BANN can estimate ideal observer decision variables to distinguish malignant, benign, and false-positive computer detections. Five features were calculated for 63 malignant and 29 benign computer-detected mass lesions, and for 1049 false-positive computer detections, in 440 mammograms randomly divided into a training and testing set. A BANN was trained on the training set features and applied to the testing set features. We then used a known relation between three-class ideal observer decision variables and that used by a two-class ideal observer when two of three classes are grouped into one class, giving one decision variable for distinguishing malignant from nonmalignant detections, and a second for distinguishing true-positive from false-positive computer detections. For comparison, we grouped the training data into two classes in the same two ways and trained two-class BANNs for these two tasks. The three-class BANN decision variables were essentially identical in performance to the specifically trained two-class BANNs, with the average difference in area under the ROC curves being less than 0.0035 and no differences in area being statistically significant. Thus, the BANN outputs obey the same theoretical relationship as do the three-class and two-class ideal observer decision variables, which is consistent with the claim that the three-class BANN output can provide good estimates of the decision variables used by a three-class ideal observer.
Affiliation(s)
- Darrin C Edwards
- Department of Radiology, The University of Chicago, Chicago, Illinois 60637, USA.
|
97
|
Shiraishi J, Sanada S, Sawada M, Yoshida A, Ishida T, Kano A, Ichikawa K, Suzuki K, Hara T. [Data: report from the JSRT sectional committee on medical imaging "recommended references on radiological imaging research"]. Nihon Hoshasen Gijutsu Gakkai Zasshi 2004; 60:1085-100. [PMID: 15389165 DOI: 10.6009/jjrt.kj00000922568] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/30/2023]
|
98
|
Chakraborty DP, Berbaum KS. Observer studies involving detection and localization: Modeling, analysis, and validation. Med Phys 2004; 31:2313-30. [PMID: 15377098 DOI: 10.1118/1.1769352] [Citation(s) in RCA: 255] [Impact Index Per Article: 12.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022] Open
Abstract
Although the receiver operating characteristic (ROC) paradigm is the accepted method for evaluation of diagnostic imaging systems, it has some serious shortcomings inasmuch as it is restricted to one observer report per image. By contrast the free-response ROC (FROC) paradigm and associated analysis method allows the observer to report multiple abnormalities within each imaging study, and uses the location of reported abnormalities to improve the measurement. Because the ROC method cannot accommodate multiple responses or use location information, its statistical power will suffer. The FROC paradigm/analysis has not enjoyed widespread acceptance because of concern about whether responses made to the same diagnostic study can be treated as independent. We propose a new jackknife FROC analysis method (JAFROC) that does not make the independence assumption. The new analysis method combines elements of FROC and the Dorfman-Berbaum-Metz (DBM) methods. To compare JAFROC to an earlier free-response analysis method (specifically the alternative free-response, or AFROC method), and to the DBM method, which uses conventional ROC scoring, we developed a model for generating simulated FROC data. The simulation model is based on an eye-movement model of how experts evaluate images. It allowed us to examine null hypothesis (NH) behavior and statistical power of the different methods. We found that AFROC analysis did not pass the NH test, being unduly conservative. Both the JAFROC method and the DBM method passed the NH test, but JAFROC had more statistical power than the DBM method. The results of this comparison suggest that future studies of diagnostic performance may enjoy improved statistical power or reduced sample size requirements through the use of the JAFROC method.
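The abstract does not spell out the figure of merit. As a hedged sketch only, the code below computes a Wilcoxon-type statistic of the kind commonly associated with JAFROC: the probability that a lesion's rating exceeds the highest false-positive rating on a normal image. The function names, the use of negative infinity for unmarked lesions and unmarked normal images, and the tie handling are all assumptions made for illustration, not details taken from the paper. The jackknife significance testing is omitted.

```python
import math
from typing import List


def wilcoxon_kernel(lesion_rating: float, noise_rating: float) -> float:
    """1 if the lesion outranks the noise rating, 0.5 on a tie, else 0."""
    if lesion_rating > noise_rating:
        return 1.0
    if lesion_rating == noise_rating:
        return 0.5
    return 0.0


def jafroc_like_fom(highest_fp_on_normals: List[float], lesion_ratings: List[float]) -> float:
    """Average the kernel over every (normal image, lesion) pair."""
    total = sum(
        wilcoxon_kernel(lesion, noise)
        for noise in highest_fp_on_normals
        for lesion in lesion_ratings
    )
    return total / (len(highest_fp_on_normals) * len(lesion_ratings))


UNMARKED = -math.inf  # assumed placeholder for "no mark was made"

# Toy data (assumed): two normal images and three lesions.
normals = [UNMARKED, 2.0]       # highest false-positive rating on each normal image
lesions = [3.0, 1.0, UNMARKED]  # rating given to each lesion; one lesion was missed
print(jafroc_like_fom(normals, lesions))
```

Because each lesion contributes its own term, a reader who finds two of three lesions on a case scores differently from one who finds only one, which is information that a single case-level ROC rating discards.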
Affiliation(s)
- Dev P Chakraborty
- Department of Radiology, University of Pittsburgh, Pittsburgh, Pennsylvania 15213, USA.
99
Dodd LE, Wagner RF, Armato SG, McNitt-Gray MF, Beiden S, Chan HP, Gur D, McLennan G, Metz CE, Petrick N, Sahiner B, Sayre J. Assessment methodologies and statistical issues for computer-aided diagnosis of lung nodules in computed tomography: contemporary research topics relevant to the lung image database consortium. Acad Radiol 2004; 11:462-75. [PMID: 15109018 DOI: 10.1016/s1076-6332(03)00814-6] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.0] [Indexed: 01/02/2023]
Abstract
Cancer of the lung and bronchus is the leading fatal malignancy in the United States. Five-year survival is low, but treatment of early stage disease considerably improves chances of survival. Advances in multidetector-row computed tomography technology provide detection of smaller lung nodules and offer a potentially effective screening tool. The large number of images per exam, however, requires considerable radiologist time for interpretation and is an impediment to clinical throughput. Thus, computer-aided diagnosis (CAD) methods are needed to assist radiologists with their decision making. To promote the development of CAD methods, the National Cancer Institute formed the Lung Image Database Consortium (LIDC). The LIDC is charged with developing the consensus and standards necessary to create an image database of multidetector-row computed tomography lung images as a resource for CAD researchers. To develop such a prospective database, its potential uses must be anticipated. The ultimate applications will influence the information that must be included along with the images, the relevant measures of algorithm performance, and the number of required images. In this article we outline assessment methodologies and statistical issues as they relate to several potential uses of the LIDC database. We review methods for performance assessment and discuss issues of defining "truth" as well as the complications that arise when truth information is not available. We also discuss issues about sizing and populating a database.
Affiliation(s)
- Lori E Dodd
- Biometrics Research Branch, Division of Cancer Treatment and Diagnosis, National Cancer Institute, 6130 Executive Blvd, MSC 7434, Bethesda, MD 20892, USA.
100
Coppini G, Diciotti S, Falchini M, Villari N, Valli G. Neural networks for computer-aided diagnosis: detection of lung nodules in chest radiograms. IEEE Trans Inf Technol Biomed 2004; 7:344-57. [PMID: 15000360 DOI: 10.1109/titb.2003.821313] [Citation(s) in RCA: 82] [Impact Index Per Article: 4.1] [Indexed: 11/10/2022]
Abstract
The paper describes a neural-network-based system for the computer-aided detection of lung nodules in chest radiograms. Our approach is based on multiscale processing and artificial neural networks (ANNs). The problem of nodule detection is addressed with a two-stage architecture including: 1) an attention-focusing subsystem that processes whole radiographs to locate possible nodular regions, ensuring high sensitivity; 2) a validation subsystem that processes regions of interest to evaluate the likelihood of the presence of a nodule, so as to reduce false alarms and increase detection specificity. Biologically inspired filters (both LoG and Gabor kernels) are used to enhance salient image features. ANNs of the feedforward type are employed, which allow an efficient use of a priori knowledge about the shape of nodules and the background structure. The images from the public JSRT database, including 247 radiograms, were used to build and test the system. We performed a further test by using a second private database with 65 radiograms collected and annotated at the Radiology Department of the University of Florence. Both data sets include nodule and nonnodule radiographs. The use of a public data set along with independent testing with a different image set makes the comparison with other systems easier and allows a deeper understanding of system behavior. Experimental results are described by ROC/FROC analysis. For the JSRT database, we observed that by varying sensitivity from 60 to 75% the number of false alarms per image lies in the range 4-10, while accuracy is in the range 95.7-98.0%. When the second data set was used, comparable results were obtained. The observed system performance supports the undertaking of system validation in clinical settings.