1
Sobiecki A, Hadjiiski LM, Chan HP, Samala RK, Zhou C, Stojanovska J, Agarwal PP. Detection of Severe Lung Infection on Chest Radiographs of COVID-19 Patients: Robustness of AI Models across Multi-Institutional Data. Diagnostics (Basel) 2024; 14:341. [PMID: 38337857 PMCID: PMC10855789 DOI: 10.3390/diagnostics14030341] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Revised: 01/24/2024] [Accepted: 01/30/2024] [Indexed: 02/12/2024]
Abstract
The diagnosis of severe COVID-19 lung infection is important because it carries a higher risk for the patient and requires prompt treatment with oxygen therapy and hospitalization while those with less severe lung infection often stay on observation. Also, severe infections are more likely to have long-standing residual changes in their lungs and may need follow-up imaging. We have developed deep learning neural network models for classifying severe vs. non-severe lung infections in COVID-19 patients on chest radiographs (CXR). A deep learning U-Net model was developed to segment the lungs. Inception-v1 and Inception-v4 models were trained for the classification of severe vs. non-severe COVID-19 infection. Four CXR datasets from multi-country and multi-institutional sources were used to develop and evaluate the models. The combined dataset consisted of 5748 cases and 6193 CXR images with physicians' severity ratings as reference standard. The area under the receiver operating characteristic curve (AUC) was used to evaluate model performance. We studied the reproducibility of classification performance using the different combinations of training and validation data sets. We also evaluated the generalizability of the trained deep learning models using both independent internal and external test sets. The Inception-v1 based models achieved AUC ranging between 0.81 ± 0.02 and 0.84 ± 0.0, while the Inception-v4 models achieved AUC in the range of 0.85 ± 0.06 and 0.89 ± 0.01, on the independent test sets, respectively. These results demonstrate the promise of using deep learning models in differentiating COVID-19 patients with severe from non-severe lung infection on chest radiographs.
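The AUC figure of merit reported in this abstract has a simple rank interpretation: it is the probability that a randomly chosen severe case receives a higher score than a randomly chosen non-severe case, with ties counted as one half. The sketch below illustrates that definition only; it is not the authors' evaluation code, and the function name `auc_from_scores` and the toy score lists are hypothetical.

```python
def auc_from_scores(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties contribute 0.5, which matches the Mann-Whitney U formulation.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier outputs (not data from the study).
severe = [0.9, 0.8, 0.55, 0.4]
non_severe = [0.7, 0.3, 0.2, 0.1]
auc = auc_from_scores(severe, non_severe)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a small illustration; a rank-based implementation would be preferable for datasets of several thousand images.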
Affiliation(s)
- André Sobiecki
- Department of Radiology, University of Michigan, Ann Arbor, MI 48109, USA; (A.S.); (H.-P.C.); (C.Z.); (P.P.A.)
- Lubomir M. Hadjiiski
- Department of Radiology, University of Michigan, Ann Arbor, MI 48109, USA; (A.S.); (H.-P.C.); (C.Z.); (P.P.A.)
- Heang-Ping Chan
- Department of Radiology, University of Michigan, Ann Arbor, MI 48109, USA; (A.S.); (H.-P.C.); (C.Z.); (P.P.A.)
- Ravi K. Samala
- Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, MD 20993, USA;
- Chuan Zhou
- Department of Radiology, University of Michigan, Ann Arbor, MI 48109, USA; (A.S.); (H.-P.C.); (C.Z.); (P.P.A.)
- Prachi P. Agarwal
- Department of Radiology, University of Michigan, Ann Arbor, MI 48109, USA; (A.S.); (H.-P.C.); (C.Z.); (P.P.A.)
2
Loizidou K, Skouroumouni G, Nikolaou C, Pitris C. Automatic Breast Mass Segmentation and Classification Using Subtraction of Temporally Sequential Digital Mammograms. IEEE Journal of Translational Engineering in Health and Medicine 2022; 10:1801111. [PMID: 36519002 PMCID: PMC9744267 DOI: 10.1109/jtehm.2022.3219891] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/19/2022] [Revised: 10/10/2022] [Accepted: 10/29/2022] [Indexed: 11/06/2022]
Abstract
OBJECTIVE Cancer remains a major cause of morbidity and mortality globally, with 1 in 5 of all new cancers arising in the breast. The introduction of mammography for the radiological diagnosis of breast abnormalities significantly decreased mortality rates. Accurate detection and classification of breast masses in mammograms is especially challenging for various reasons, including low contrast and the normal variations of breast tissue density. Various Computer-Aided Diagnosis (CAD) systems are being developed to assist radiologists with the accurate classification of breast abnormalities. METHODS In this study, subtraction of temporally sequential digital mammograms and machine learning are proposed for the automatic segmentation and classification of masses. The performance of the algorithm was evaluated on a dataset created especially for the purposes of this study, with 320 images from 80 patients (two time points and two views of each breast) with mass locations precisely annotated by two radiologists. RESULTS Ninety-six features were extracted and ten classifiers were tested in a leave-one-patient-out and k-fold cross-validation process. Using Neural Networks, the detection of masses was 99.9% accurate. The classification accuracy of the masses as benign or suspicious increased from 92.6%, using the state-of-the-art temporal analysis, to 98%, using the proposed methodology. The improvement was statistically significant (p-value < 0.05). CONCLUSION These results demonstrate the effectiveness of the subtraction of temporally consecutive mammograms for the diagnosis of breast masses. Clinical and Translational Impact Statement: The proposed algorithm has the potential to substantially contribute to the development of automated breast cancer Computer-Aided Diagnosis systems with significant impact on patient prognosis.
Affiliation(s)
- Kosmia Loizidou
- KIOS Research and Innovation Center of Excellence, Department of Electrical and Computer Engineering, University of Cyprus, 2109 Nicosia, Cyprus
- Costas Pitris
- KIOS Research and Innovation Center of Excellence, Department of Electrical and Computer Engineering, University of Cyprus, 2109 Nicosia, Cyprus
3
Fitzjohn J, Zhou C, Chase JG. Breast cancer diagnosis using frequency decomposition of surface motion of actuated breast tissue. Front Oncol 2022; 12:969530. [DOI: 10.3389/fonc.2022.969530] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Accepted: 10/12/2022] [Indexed: 11/06/2022]
Abstract
This paper presents a computationally simple diagnostic algorithm for breast cancer using a non-invasive Digital Image Elasto Tomography (DIET) system. N=14 women (28 breasts, 13 cancerous) underwent a clinical trial using the DIET system following mammography diagnosis. The screening involves steady state sinusoidal vibrations applied to the free hanging breast with cameras used to capture tissue motion. Image reconstruction methods provide surface displacement data for approximately 14,000 reference points on the breast surface. The breast surface was segmented into four radial and four vertical segments. The frequency decomposition of reference point motion in each segment was compared. Segments on the same vertical band were hypothesised to have similar frequency content in healthy breasts, with significant differences indicating a tumor, based on the stiffness dependence of frequency and tumors being 4 to 10 times stiffer than healthy tissue. Twelve breast configurations were used to test robustness of the method. Optimal breast configuration for the 26 breasts analysed (13 cancerous, 13 healthy) resulted in 85% sensitivity and 77% specificity. Combining two opposite configurations resulted in correct diagnosis of all cancerous breasts with 100% sensitivity and 69% specificity. Bootstrapping was used to fit a smooth receiver operator characteristic (ROC) curve to compare breast configuration performance, with an optimal area under the curve (AUC) of 0.85. The results show diagnostic accuracy comparable to or better than mammography, with the added benefits of DIET screening, including portability, non-invasive screening, and no breast compression, with potential to increase screening participation and equity, improving outcomes for women.
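The sensitivity and specificity figures quoted above follow from a 2x2 confusion table over the 13 cancerous and 13 healthy breasts. The sketch below shows the arithmetic; the individual counts (11 true positives, 10 true negatives, and 9 true negatives for the combined configuration) are an assumption chosen to be consistent with the rounded percentages in the abstract, which does not state them, and the function name is hypothetical.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-table counts."""
    sensitivity = tp / (tp + fn)  # fraction of cancerous breasts detected
    specificity = tn / (tn + fp)  # fraction of healthy breasts cleared
    return sensitivity, specificity


# Assumed counts for the optimal single configuration (13 + 13 breasts).
sens, spec = sensitivity_specificity(tp=11, fn=2, tn=10, fp=3)

# Assumed counts for the two combined opposite configurations.
sens_c, spec_c = sensitivity_specificity(tp=13, fn=0, tn=9, fp=4)

print(f"single:   sensitivity {sens:.0%}, specificity {spec:.0%}")
print(f"combined: sensitivity {sens_c:.0%}, specificity {spec_c:.0%}")
```

With only 13 cases per class, each additional correct or incorrect call moves either figure by about 7.7 percentage points, which is worth keeping in mind when comparing the configurations.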
4
Abbey CK, Li J, Gang GJ, Stayman JW. Assessment of Boundary Discrimination Performance in a Printed Phantom. Proceedings of SPIE--the International Society for Optical Engineering 2022; 12035:120350N. [PMID: 37051612 PMCID: PMC10089594 DOI: 10.1117/12.2612622] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/14/2023]
Abstract
Printed phantoms hold great potential as a tool for examining task-based image quality of x-ray imaging systems. Their ability to produce complex shapes rendered in materials with adjustable attenuation coefficients allows a new level of flexibility in the design of tasks for the evaluation of physical imaging systems. We investigate performance in a fine "boundary discrimination" task in which fine features at the margin of a clearly visible "lesion" are used to classify the lesion as malignant or benign. These tasks are appealing because of their relevance to clinical tasks, and because they typically emphasize higher spatial frequencies relative to more common lesion detection tasks. A 3D printed phantom containing cylindrical shells of varying thickness was used to generate lesion profiles that differed in their edge profiles. This was intended to approximate lesions with indistinct margins that are clinically associated with malignancy. Wall thickness in the phantom ranged from 0.4 mm to 0.8 mm, which allows for task difficulty to be varied by choosing different thicknesses to represent malignant and benign lesions. The phantom was immersed in a tub filled with water and potassium phosphate to approximate the attenuating background, and imaged repeatedly on a benchtop cone-beam CT scanner. After preparing the image data (reconstruction, ROI selection, sub-pixel registration), we find that the mean frequency of the lesion profile is 0.11 cyc/mm. The mean frequency of the lesion-difference profile, representative of the discrimination task, is approximately 6 times larger. Model observers show appropriate dose performance in these tasks as well.
Affiliation(s)
- Craig K Abbey
- Department of Psychological and Brain Sciences, University of California Santa Barbara
- Junyuan Li
- Department of Biomedical Engineering, Johns Hopkins University
- Grace J Gang
- Department of Biomedical Engineering, Johns Hopkins University
5
Automated Classification of Breast Cancer Lesions for Digitised Mammograms via Computer-Aided Diagnosis System. Journal of Applied Science & Process Engineering 2021. [DOI: 10.33736/jaspe.3517.2021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Women with breast cancer have a high risk of death. Digitised mammograms can be used to detect the early stage of breast cancer. However, digitised mammograms suffer from low contrast, which may lead to misdiagnosis. This paper proposes a Computer-Aided Diagnosis (CAD) system for automated classification of breast cancer lesions using a modified image processing technique of Fuzzy Anisotropic Diffusion Histogram Equalization Contrast Adaptive Limited (FADHECAL) incorporated with Multilevel Otsu Thresholding on digitised mammograms. Four main blocks were used in this CAD system, namely: (i) Pre-processing and Enhancement block; (ii) Segmentation block; (iii) Regions of Interest (ROIs) Extraction block; and (iv) Classification block. The CAD system was tested on 30 digitised mammograms retrieved from the Mini-Mammographic Image Analysis Society (MIAS) database with various degrees of severity and background tissues. The proposed CAD system showed a high accuracy of 96.67% for the detection of breast cancer lesions.
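The segmentation block above relies on Otsu thresholding, which picks the grey level that maximises the between-class variance of the image histogram. The sketch below shows the basic single-threshold version only, as a simplification of the multilevel variant named in the abstract; it is not the authors' FADHECAL pipeline, and the function name `otsu_threshold` and the toy pixel list are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t maximising between-class variance.

    Pixels with value <= t form the background class, the rest the
    foreground class. `pixels` is a flat list of ints in [0, levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    w0 = 0          # number of background pixels so far
    sum0 = 0        # sum of background grey levels so far
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # no background pixels yet
        w1 = total - w0
        if w1 == 0:
            break               # no foreground pixels left
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Toy bimodal "image": dark tissue near 10, bright lesion near 200.
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
mask = [p > t for p in pixels]  # True where the pixel is foreground
```

A multilevel version searches over several thresholds jointly using the same between-class variance criterion, at a correspondingly higher search cost.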
6
Breast Cancer Segmentation Methods: Current Status and Future Potentials. BioMed Research International 2021; 2021:9962109. [PMID: 34337066 PMCID: PMC8321730 DOI: 10.1155/2021/9962109] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 03/22/2021] [Revised: 05/14/2021] [Accepted: 06/11/2021] [Indexed: 12/24/2022]
Abstract
Early breast cancer detection is one of the most important issues that need to be addressed worldwide as it can help increase the survival rate of patients. Mammograms have been used to detect breast cancer in the early stages; if detected in the early stages, it can drastically reduce treatment costs. The detection of tumours in the breast depends on segmentation techniques. Segmentation plays a significant role in image analysis and includes detection, feature extraction, classification, and treatment. Segmentation helps physicians quantify the volume of tissue in the breast for treatment planning. In this work, we have grouped segmentation methods into three groups: classical segmentation that includes region-, threshold-, and edge-based segmentation; machine learning segmentation; and supervised and unsupervised and deep learning segmentation. The findings of our study revealed that region-based segmentation is frequently used for classical methods, and the most frequently used techniques are region growing. Further, a median filter is a robust tool for removing noise. Moreover, the MIAS database is frequently used in classical segmentation methods. Meanwhile, in machine learning segmentation, unsupervised machine learning methods are more frequently used, and U-Net is frequently used for mammogram image segmentation because it does not require many annotated images compared with other deep learning models. Furthermore, reviewed papers revealed that it is possible to train a deep learning model without performing any preprocessing or postprocessing and also showed that the U-Net model is frequently used for mammogram segmentation. The U-Net model is frequently used because it does not require many annotated images and also because of the presence of high-performance GPU computing, which makes it easy to train networks with more layers. 
Additionally, we identified the widely used mammogram databases, of which 3 are public and 28 are private.
7
Hadjiiski LM, Cha KH, Cohan RH, Chan HP, Caoili EM, Davenport MS, Samala RK, Weizer AZ, Alva A, Kirova-Nedyalkova G, Shampain K, Meyer N, Barkmeier D, Woolen SA, Shankar PR, Francis IR, Palmbos PL. Intraobserver Variability in Bladder Cancer Treatment Response Assessment With and Without Computerized Decision Support. Tomography 2021; 6:194-202. [PMID: 32548296 PMCID: PMC7289252 DOI: 10.18383/j.tom.2020.00013] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/11/2022]
Abstract
We evaluated the intraobserver variability of physicians aided by a computerized decision-support system for treatment response assessment (CDSS-T) to identify patients who show complete response to neoadjuvant chemotherapy for bladder cancer, and the effects of the intraobserver variability on physicians' assessment accuracy. A CDSS-T tool was developed that uses a combination of deep learning neural network and radiomic features from computed tomography (CT) scans to detect bladder cancers that have fully responded to neoadjuvant treatment. Pre- and postchemotherapy CT scans of 157 bladder cancers from 123 patients were collected. In a multireader, multicase observer study, physician-observers estimated the likelihood of pathologic T0 disease by viewing paired pre/posttreatment CT scans placed side by side on an in-house-developed graphical user interface. Five abdominal radiologists, 4 diagnostic radiology residents, 2 oncologists, and 1 urologist participated as observers. They first provided an estimate without CDSS-T and then with CDSS-T. A subset of cases was evaluated twice to study the intraobserver variability and its effects on observer consistency. The mean areas under the curves for assessment of pathologic T0 disease were 0.85 for CDSS-T alone, 0.76 for physicians without CDSS-T and improved to 0.80 for physicians with CDSS-T (P = .001) in the original evaluation, and 0.78 for physicians without CDSS-T and improved to 0.81 for physicians with CDSS-T (P = .010) in the repeated evaluation. The intraobserver variability was significantly reduced with CDSS-T (P < .0001). The CDSS-T can significantly reduce physicians' variability and improve their accuracy for identifying complete response of muscle-invasive bladder cancer to neoadjuvant chemotherapy.
Affiliation(s)
- Ajjai Alva
- Internal Medicine, Division of Hematology-Oncology, University of Michigan, Ann Arbor, MI
- Sean A Woolen
- Department of Radiology, University of California, San Francisco, Medical Center, San Francisco, CA
- Phillip L Palmbos
- Internal Medicine, Division of Hematology-Oncology, University of Michigan, Ann Arbor, MI
8
Hegdé J. Deep learning can be used to train naïve, nonprofessional observers to detect diagnostic visual patterns of certain cancers in mammograms: a proof-of-principle study. J Med Imaging (Bellingham) 2020; 7:022410. [PMID: 32042860 PMCID: PMC6998757 DOI: 10.1117/1.jmi.7.2.022410] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/23/2019] [Accepted: 12/26/2019] [Indexed: 11/27/2022]
Abstract
The scientific, clinical, and pedagogical significance of devising methodologies to train nonprofessional subjects to recognize diagnostic visual patterns in medical images has been broadly recognized. However, systematic approaches to doing so remain poorly established. Using mammography as an exemplar case, we use a series of experiments to demonstrate that deep learning (DL) techniques can, in principle, be used to train naïve subjects to reliably detect certain diagnostic visual patterns of cancer in medical images. In the main experiment, subjects were required to learn to detect statistical visual patterns diagnostic of cancer in mammograms using only the mammograms and feedback provided following the subjects’ response. We found not only that the subjects learned to perform the task at statistically significant levels, but also that their eye movements related to image scrutiny changed in a learning-dependent fashion. Two additional, smaller exploratory experiments suggested that allowing subjects to re-examine the mammogram in light of various items of diagnostic information may help further improve DL of the diagnostic patterns. Finally, a fourth small, exploratory experiment suggested that the image information learned was similar across subjects. Together, these results prove the principle that DL methodologies can be used to train nonprofessional subjects to reliably perform those aspects of medical image perception tasks that depend on visual pattern recognition expertise.
Affiliation(s)
- Jay Hegdé
- Augusta University, Medical College of Georgia, Departments of Neuroscience and Regenerative Medicine and Ophthalmology, Augusta, Georgia, United States
9
Rezaianzadeh A, Sepandi M, Rahimikazerooni S. Assessment of Breast Cancer Risk in an Iranian Female Population Using Bayesian Networks with Varying Node Number. Asian Pac J Cancer Prev 2016; 17:4913-4916. [PMID: 28032495 PMCID: PMC5454695 DOI: 10.22034/apjcp.2016.17.11.4913] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/27/2022]
Abstract
Objective: As a source of information, medical data can feature hidden relationships. However, the high volume of datasets and complexity of decision-making in medicine introduce difficulties for analysis and interpretation, and processing steps may be needed before the data can be used by clinicians in their work. This study focused on the use of Bayesian models with different numbers of nodes to aid clinicians in breast cancer risk estimation. Methods: Bayesian networks (BNs) built from a retrospectively collected dataset including mammographic details, risk factor exposure, and clinical findings were assessed for prediction of the probability of breast cancer in individual patients. Area under the receiver-operating characteristic curve (AUC), accuracy, sensitivity, specificity, and positive and negative predictive values were used to evaluate discriminative performance. Result: A network incorporating selected features performed better (AUC = 0.94) than that incorporating all the features (AUC = 0.93). The results revealed no significant difference among 3 models regarding performance indices at the 5% significance level. Conclusion: BNs could effectively discriminate malignant from benign abnormalities and accurately predict the risk of breast cancer in individuals. Moreover, the overall performance of the 9-node BN was better, and due to the lower number of nodes it might be more readily applied in clinical settings.
Affiliation(s)
- Abbas Rezaianzadeh
- Colorectal Research Center, Shiraz University of Medical Sciences. Shiraz, Iran.
10
Lee J, Nishikawa RM, Reiser I, Boone JM, Lindfors KK. Local curvature analysis for classifying breast tumors: Preliminary analysis in dedicated breast CT. Med Phys 2016; 42:5479-89. [PMID: 26328996 DOI: 10.1118/1.4928479] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Indexed: 01/11/2023]
Abstract
PURPOSE The purpose of this study is to measure the effectiveness of local curvature measures as novel image features for classifying breast tumors. METHODS A total of 119 breast lesions from 104 noncontrast dedicated breast computed tomography images of women were used in this study. Volumetric segmentation was done using a seed-based segmentation algorithm and then a triangulated surface was extracted from the resulting segmentation. Total, mean, and Gaussian curvatures were then computed. Normalized curvatures were used as classification features. In addition, traditional image features were also extracted and a forward feature selection scheme was used to select the optimal feature set. Logistic regression was used as a classifier and leave-one-out cross-validation was utilized to evaluate the classification performances of the features. The area under the receiver operating characteristic curve (AUC, area under curve) was used as a figure of merit. RESULTS Among curvature measures, the normalized total curvature (CT) showed the best classification performance (AUC of 0.74), while the others showed no classification power individually. Five traditional image features (two shape, two margin, and one texture descriptors) were selected via the feature selection scheme and its resulting classifier achieved an AUC of 0.83. Among those five features, the radial gradient index (RGI), which is a margin descriptor, showed the best classification performance (AUC of 0.73). A classifier combining RGI and CT yielded an AUC of 0.81, which showed similar performance (i.e., no statistically significant difference) to the classifier with the above five traditional image features. Additional comparisons in AUC values between classifiers using different combinations of traditional image features and CT were conducted. The results showed that CT was able to replace the other four image features for the classification task. 
CONCLUSIONS The normalized curvature measure contains useful information in classifying breast tumors. Using this, one can reduce the number of features in a classifier, which may result in more robust classifiers for different datasets.
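The evaluation scheme described in this abstract, logistic regression scored by leave-one-out cross-validation, can be sketched as follows. This is an illustration under stated assumptions rather than the study's implementation: it uses a single hypothetical feature, plain gradient descent with an arbitrary learning rate and epoch count, and toy data that does not come from the paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.5, epochs=500):
    """Fit a one-feature logistic regression by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def leave_one_out_scores(xs, ys):
    """Score each case with a model trained on all the other cases."""
    scores = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        w, b = fit_logistic(train_x, train_y)
        scores.append(sigmoid(w * xs[i] + b))
    return scores


# Hypothetical normalized feature values: 0 = benign, 1 = malignant.
features = [0.1, 0.2, 0.3, 0.4, 1.6, 1.7, 1.8, 1.9]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

scores = leave_one_out_scores(features, labels)
accuracy = sum((s >= 0.5) == bool(y) for s, y in zip(scores, labels)) / len(labels)
print(f"leave-one-out accuracy: {accuracy:.2f}")
```

Because every held-out score comes from a model that never saw that case, the pooled scores can be passed to an ROC analysis to obtain a cross-validated AUC of the kind reported in the abstract.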
Affiliation(s)
- Juhun Lee
- Department of Radiology, University of Pittsburgh, Pittsburgh, Pennsylvania 15213
- Robert M Nishikawa
- Department of Radiology, University of Pittsburgh, Pittsburgh, Pennsylvania 15213
- Ingrid Reiser
- Department of Radiology, University of Chicago, Chicago, Illinois 60637
- John M Boone
- Department of Radiology, University of California Davis Medical Center, Sacramento, California 95817
- Karen K Lindfors
- Department of Radiology, University of California Davis Medical Center, Sacramento, California 95817
11
Jorritsma W, Cnossen F, van Ooijen PMA. Improving the radiologist-CAD interaction: designing for appropriate trust. Clin Radiol 2014; 70:115-22. [PMID: 25459198 DOI: 10.1016/j.crad.2014.09.017] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Received: 06/13/2014] [Revised: 09/17/2014] [Accepted: 09/19/2014] [Indexed: 12/25/2022]
Abstract
Computer-aided diagnosis (CAD) has great potential to improve radiologists' diagnostic performance. However, the reported performance of the radiologist-CAD team is lower than what might be expected based on the performance of the radiologist and the CAD system in isolation. This indicates that the interaction between radiologists and the CAD system is not optimal. An important factor in the interaction between humans and automated aids (such as CAD) is trust. Suboptimal performance of the human-automation team is often caused by an inappropriate level of trust in the automation. In this review, we examine the role of trust in the radiologist-CAD interaction and suggest ways to improve the output of the CAD system so that it allows radiologists to calibrate their trust in the CAD system more effectively. Observer studies of the CAD systems show that radiologists often have an inappropriate level of trust in the CAD system. They sometimes under-trust CAD, thereby reducing its potential benefits, and sometimes over-trust it, leading to diagnostic errors they would not have made without CAD. Based on the literature on trust in human-automation interaction and the results of CAD observer studies, we have identified four ways to improve the output of CAD so that it allows radiologists to form a more appropriate level of trust in CAD. Designing CAD systems for appropriate trust is important and can improve the performance of the radiologist-CAD team. Future CAD research and development should acknowledge the importance of the radiologist-CAD interaction, and specifically the role of trust therein, in order to create the perfect artificial partner for the radiologist. This review focuses on the role of trust in the radiologist-CAD interaction. The aim of the review is to encourage CAD developers to design for appropriate trust and thereby improve the performance of the radiologist-CAD team.
Affiliation(s)
- W Jorritsma
- Department of Radiology, University of Groningen, University Medical Center Groningen, Hanzeplein 1, 9713 GZ, Groningen, The Netherlands.
- F Cnossen
- Institute of Artificial Intelligence and Cognitive Engineering, University of Groningen, Nijenborgh 9, 9747 AG, Groningen, The Netherlands
- P M A van Ooijen
- Department of Radiology, University of Groningen, University Medical Center Groningen, Hanzeplein 1, 9713 GZ, Groningen, The Netherlands; Center for Medical Imaging North East Netherlands, Hanzeplein 1, 9713 GZ, Groningen, The Netherlands
12
Yan XB, Xiong WQ, Hu L, Zhao K. Cancer prediction based on radical basis function neural network with particle swarm optimization. Asian Pac J Cancer Prev 2014; 15:7775-80. [PMID: 25292062 DOI: 10.7314/apjcp.2014.15.18.7775] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022]
Abstract
This paper addresses cancer prediction based on a radial basis function neural network optimized by particle swarm optimization. The hazard that cancer poses to people is increasing, and cancer is often difficult to cure. The occurrence of cancer can be predicted by computational methods so that people can take timely and effective measures to prevent it. In this paper, the occurrence of cancer is predicted by means of a Radial Basis Function Neural Network Optimized by Particle Swarm Optimization. The neural network parameters to be optimized include the weight vector between the network hidden layer and output layer, and the threshold of the output layer neurons. The experimental data were obtained from the Wisconsin breast cancer database. A total of 12 experiments were performed, each with a different setting of experimental result reliability. The findings show that the method can greatly and effectively improve the accuracy, reliability and stability of cancer prediction.
Affiliation(s)
- Xiao-Bo Yan
- College of Computer Science and Technology, Jilin University, Changchun, Jilin, China
13
Schalekamp S, van Ginneken B, Schaefer-Prokop CM, Karssemeijer N. Influence of study design in receiver operating characteristics studies: sequential versus independent reading. J Med Imaging (Bellingham) 2014; 1:015501. [PMID: 26158028 DOI: 10.1117/1.jmi.1.1.015501] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 12/16/2013] [Revised: 01/28/2014] [Accepted: 01/29/2014] [Indexed: 11/14/2022]
Abstract
Observer studies to assess new image processing devices or computer-aided diagnosis techniques are often performed, but little is known about the effect of the study design on observer performance results. We investigated the effect of the sequential and independent reading design on observer study results with respect to reader performance and their statistical power. For this we performed an observer study for the detection of lung nodules with bone-suppressed images (BSIs) compared with original chest radiographs. In a fully crossed observer study, eight observers assessed a series of 300 radiographs four times, including one assessment of the original radiograph with sequential BSI and two independent reading sessions with BSI. Observer performance was compared using multireader multicase receiver operating characteristics. No significant difference between the effect of BSI in the sequential and the independent reading sessions could be found ([Formula: see text]; [Formula: see text]). Compared with the original radiographs, increased performance with BSI was significant in the sequential and one of the independent reading sessions ([Formula: see text]; [Formula: see text]), and nonsignificant in the other independent reading session ([Formula: see text]). A strong increase of uncorrelated variance components was found in the independent reading sessions, masking the ability to demonstrate differences in observer performance across modalities. Therefore, the sequential reading design is the preferred design because it is less burdensome and has more statistical power.
Affiliation(s)
- Steven Schalekamp
- Radboud University, Nijmegen, Medical Center, Department of Radiology, Postbus 9101, 6500 HB Nijmegen, The Netherlands
- Bram van Ginneken
- Radboud University, Nijmegen, Medical Center, Department of Radiology, Postbus 9101, 6500 HB Nijmegen, The Netherlands
- Cornelia M Schaefer-Prokop
- Radboud University, Nijmegen, Medical Center, Department of Radiology, Postbus 9101, 6500 HB Nijmegen, The Netherlands; Meander Medical Center, Department of Radiology, Postbus 1502, 3800 BM Amersfoort, The Netherlands
- Nico Karssemeijer
- Radboud University, Nijmegen, Medical Center, Department of Radiology, Postbus 9101, 6500 HB Nijmegen, The Netherlands
| |
14
Abbey CK, Gallas BD, Boone JM, Niklason LT, Hadjiiski LM, Sahiner B, Samuelson FW. Comparative statistical properties of expected utility and area under the ROC curve for laboratory studies of observer performance in screening mammography. Acad Radiol 2014; 21:481-90. [PMID: 24594418 DOI: 10.1016/j.acra.2013.12.011] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 11/07/2013] [Revised: 12/11/2013] [Accepted: 12/11/2013] [Indexed: 11/25/2022]
Abstract
RATIONALE AND OBJECTIVES Our objective is to determine whether expected utility (EU) and the area under the receiver operator characteristic (AUC) are consistent with one another as endpoints of observer performance studies in mammography. These two measures characterize receiver operator characteristic performance somewhat differently. We compare these two study endpoints at the level of individual reader effects, statistical inference, and components of variance across readers and cases. MATERIALS AND METHODS We reanalyze three previously published laboratory observer performance studies that investigate various x-ray breast imaging modalities using EU and AUC. The EU measure is based on recent estimates of relative utility for screening mammography. RESULTS The AUC and EU measures are correlated across readers for individual modalities (r = 0.93) and differences in modalities (r = 0.94 to 0.98). Statistical inference for modality effects based on multi-reader multi-case analysis is very similar, with significant results (P < .05) in exactly the same conditions. Power analyses show mixed results across studies, with a small increase in power on average for EU that corresponds to approximately a 7% reduction in the number of readers. Despite a large number of crossing receiver operator characteristic curves (59% of readers), modality effects only rarely have opposite signs for EU and AUC (6%). CONCLUSIONS We do not find any evidence of systematic differences between EU and AUC in screening mammography observer studies. Thus, when utility approaches are viable (i.e., an appropriate value of relative utility exists), practical effects such as statistical efficiency may be used to choose study endpoints.
15
Comparison of dual-energy subtraction and electronic bone suppression combined with computer-aided detection on chest radiographs: effect on human observers' performance in nodule detection. AJR Am J Roentgenol 2013; 200:1006-13. [PMID: 23617482 DOI: 10.2214/ajr.12.8877] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Indexed: 12/21/2022]
Abstract
OBJECTIVE The objective of our study was to compare the effect of dual-energy subtraction and bone suppression software alone and in combination with computer-aided detection (CAD) on the performance of human observers in lung nodule detection. MATERIALS AND METHODS One hundred one patients with from one to five lung nodules measuring 5-29 mm and 42 subjects with no nodules were retrospectively selected and randomized. Three independent radiologists marked suspicious-appearing lesions on the original chest radiographs, dual-energy subtraction images, and bone-suppressed images before and after postprocessing with CAD. Marks of the observers and CAD marks were compared with CT as the reference standard. Data were analyzed using nonparametric tests and the jackknife alternative free-response receiver operating characteristic (JAFROC) method. RESULTS Using dual-energy subtraction alone (p = 0.0198) or CAD alone (p = 0.0095) improved the detection rate compared with using the original conventional chest radiograph. The combination of bone suppression and CAD provided the highest sensitivity (51.6%) and the original nonenhanced conventional chest radiograph alone provided the lowest (46.9%; p = 0.0049). Dual-energy subtraction and bone suppression provided the same false-positive (p = 0.2702) and true-positive (p = 0.8451) rates. Up to 22.9% of lesions were found only by the CAD program and were missed by the readers. JAFROC showed no difference in the performance between modalities (p = 0.2742-0.5442). CONCLUSION Dual-energy subtraction and the electronic bone suppression program used in this study provided similar detection rates for pulmonary nodules. Additionally, CAD alone or combined with bone suppression can significantly improve the sensitivity of human observers for pulmonary nodule detection.
16
Regularization in retrieval-driven classification of clustered microcalcifications for breast cancer. Int J Biomed Imaging 2012; 2012:463408. [PMID: 22919363 PMCID: PMC3418652 DOI: 10.1155/2012/463408] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 03/15/2012] [Accepted: 06/20/2012] [Indexed: 11/17/2022] Open
Abstract
We propose a regularization based approach for case-adaptive classification in computer-aided diagnosis (CAD) of breast cancer. The goal is to improve the classification accuracy on a query case by making use of a set of similar cases retrieved from an existing library of known cases. In the proposed approach, a prior is first derived from a traditional CAD classifier (which is typically pre-trained offline on a set of training cases). It is then used together with the retrieved similar cases to obtain an adaptive classifier on the query case. We consider two different forms for the regularization prior: one is fixed for all query cases and the other is allowed to vary with different query cases. In the experiments the proposed approach is demonstrated on a dataset of 1,006 clinical cases. The results show that it could achieve significant improvement in numerical efficiency compared with a previously proposed case-adaptive approach (by about an order of magnitude) while maintaining similar (or better) improvement in classification accuracy; it could also adapt faster in performance with a small number of retrieved cases. Measured by the area under the ROC curve (AUC), the regularization based approach achieved AUC = 0.8215, compared with AUC = 0.7329 for the baseline classifier (P-value = 0.001).
17
Horsch K, Pesce LL, Giger ML, Metz CE, Jiang Y. A scaling transformation for classifier output based on likelihood ratio: applications to a CAD workstation for diagnosis of breast cancer. Med Phys 2012; 39:2787-804. [PMID: 22559651 DOI: 10.1118/1.3700168] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/07/2022] Open
Abstract
PURPOSE The authors developed scaling methods that monotonically transform the output of one classifier to the "scale" of another. Such transformations affect the distribution of classifier output while leaving the ROC curve unchanged. In particular, they investigated transformations between radiologists and computer classifiers, with the goal of addressing the problem of comparing and interpreting case-specific values of output from two classifiers. METHODS Using both simulated and radiologists' rating data of breast imaging cases, the authors investigated a likelihood-ratio-scaling transformation, based on "matching" classifier likelihood ratios. For comparison, three other scaling transformations were investigated that were based on matching classifier true positive fraction, false positive fraction, or cumulative distribution function, respectively. The authors explored modifying the computer output to reflect the scale of the radiologist, as well as modifying the radiologist's ratings to reflect the scale of the computer. They also evaluated how dataset size affects the transformations. RESULTS When ROC curves of two classifiers differed substantially, the four transformations were found to be quite different. The likelihood-ratio scaling transformation was found to vary widely from radiologist to radiologist. Similar results were found for the other transformations. Our simulations explored the effect of database sizes on the accuracy of the estimation of our scaling transformations. CONCLUSIONS The likelihood-ratio-scaling transformation that the authors have developed and evaluated was shown to be capable of transforming computer and radiologist outputs to a common scale reliably, thereby allowing the comparison of the computer and radiologist outputs on the basis of a clinically relevant statistic.
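The statement that a monotonic transformation changes the distribution of classifier output while leaving the ROC curve unchanged can be checked numerically. The sketch below is only an illustration under stated assumptions: the scores are made up, the `auc` and `rescale` helpers are hypothetical names, and the logit map stands in for any strictly increasing transformation (it is not the likelihood-ratio scaling developed in the paper).

```python
import math


def auc(pos_scores, neg_scores):
    """Mann-Whitney estimate of the area under the ROC curve."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos_scores) * len(neg_scores))


def rescale(score):
    """A strictly increasing map on (0, 1); chosen only for illustration."""
    return math.log(score / (1.0 - score))


# Hypothetical classifier outputs, not data from the study.
malignant = [0.9, 0.8, 0.55, 0.4]
benign = [0.7, 0.3, 0.2, 0.1]

auc_raw = auc(malignant, benign)
auc_scaled = auc([rescale(s) for s in malignant],
                 [rescale(s) for s in benign])

print(auc_raw, auc_scaled)
```

Because the AUC depends only on how positive and negative cases are ranked relative to each other, both values agree even though the rescaled scores lie on a different numeric range.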
Affiliation(s)
- Karla Horsch
- Department of Radiology, The University of Chicago, Chicago, IL 60637, USA
18
Jing H, Yang Y, Nishikawa RM. Retrieval boosted computer-aided diagnosis of clustered microcalcifications for breast cancer. Med Phys 2012; 39:676-85. [PMID: 22320777 DOI: 10.1118/1.3675600] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 11/07/2022] Open
Abstract
PURPOSE The authors propose an image-retrieval based approach for case-adaptive classifier design in computer-aided diagnosis (CADx). The conventional approach in CADx is to first train a pattern-classifier based on a set of existing training samples and then apply this classifier to subsequent new cases. The purpose of this work is to improve the classification accuracy of a CADx classifier by making use of a set of known cases retrieved from a reference library that are similar to the case under consideration. METHODS In the proposed approach, the authors will first apply image-retrieval to obtain a set of lesion images from a library of known cases that have similar image features to a case being diagnosed (i.e., query). These retrieved cases are then used to optimize a pattern-classifier toward boosting its classification accuracy on the query case. The basic idea is to put more emphasis on those cases that are similar to the query. The proposed approach is demonstrated first using a linear classifier and then extended to a nonlinear classifier induced by kernel principal component analysis. RESULTS The proposed retrieval-driven approach was tested on a library of mammogram images from 1006 cases (646 benign and 360 malignant) obtained from multiple institutions and was demonstrated to yield significant improvement in classification performance. Measured by the area under the receiver operating characteristic curve (AUC), the case-adaptive approach could boost the classification performance of a linear classifier from AUC = 0.7415 to AUC = 0.7807; similar improvement was also obtained for a nonlinear classifier, with AUC boosted from 0.7527 to 0.7838. CONCLUSIONS Use of additional cases from a reference library that have similar image features can improve the classification accuracy of a CADx classifier on a query case. It can even outperform retraining the classifier with all the cases from the entire reference library. 
This implies that cases with similar image features are more relevant in defining the local decision boundary of the CADx classifier around the query.
Affiliation(s)
- Hao Jing
- Department of Electrical and Computer Engineering, Illinois Institute of Technology, Chicago, IL 60616, USA
19
Evaluating imaging and computer-aided detection and diagnosis devices at the FDA. Acad Radiol 2012; 19:463-77. [PMID: 22306064 DOI: 10.1016/j.acra.2011.12.016] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.4] [Received: 10/25/2011] [Revised: 12/22/2011] [Accepted: 12/28/2011] [Indexed: 11/22/2022]
Abstract
This report summarizes the Joint FDA-MIPS Workshop on Methods for the Evaluation of Imaging and Computer-Assist Devices. The purpose of the workshop was to gather information on the current state of the science and facilitate consensus development on statistical methods and study designs for the evaluation of imaging devices to support US Food and Drug Administration submissions. Additionally, participants expected to identify gaps in knowledge and unmet needs that should be addressed in future research. This summary is intended to document the topics that were discussed at the meeting and disseminate the lessons that have been learned through past studies of imaging and computer-aided detection and diagnosis device performance.
20
Abstract
Mammography is the best method for early detection of breast cancer, but about 10% to 30% of breast lesions are missed at screening because of limitations inherent to human observers. Computer-aided detection (CAD) is a relatively new technology that has been implemented in some mammography services to provide a double reading. Clinical studies have shown that CAD increases radiologists' sensitivity for detecting breast cancer by up to 21%. A CAD system is useful in situations where there is high interobserver variability, a shortage of trained observers, or no possibility of double reading by two or more radiologists. This review is motivated by the need to update the medical community on this tool as an auxiliary, quantitative, non-operator-dependent method that aims to improve the quality of breast cancer diagnosis.
21
Tian JW, Ning CP, Guo YH, Cheng HD, Tang XL. Effect of a novel segmentation algorithm on radiologists' diagnosis of breast masses using ultrasound imaging. Ultrasound in Medicine & Biology 2012; 38:119-127. [PMID: 22104530 DOI: 10.1016/j.ultrasmedbio.2011.09.011] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 01/22/2011] [Revised: 08/20/2011] [Accepted: 09/20/2011] [Indexed: 05/31/2023]
Abstract
We investigated the effect of using a novel segmentation algorithm on radiologists' sensitivity and specificity for discriminating malignant masses from benign masses using ultrasound. Five hundred ten conventional ultrasound images were processed by a novel segmentation algorithm. Five radiologists were invited to analyze the original and computerized images independently. Performances of radiologists with or without computer aid were evaluated by receiver operating characteristic (ROC) curve analysis. The masses became more obvious after being processed by the segmentation algorithm. Without using the algorithm, the areas under the ROC curve (Az) of the five radiologists ranged from 0.70 to 0.84. Using the algorithm, the Az increased significantly (range, 0.79-0.88; p < 0.001). The proposed segmentation algorithm could improve the radiologists' diagnostic performance by reducing the image speckle and extracting the mass margin characteristics.
Affiliation(s)
- Jia-Wei Tian
- Department of Ultrasound, Second Affiliated Hospital of Harbin Medical University, Harbin, PR China
22
Abstract
OBJECTIVE Calculating the sample size for a multireader, multicase study of readers' diagnostic accuracy is complicated. Studies in which patients can have multiple findings, as is common in many computer-aided detection (CAD) studies, are particularly challenging to design. MATERIALS AND METHODS We modified existing methods for sample size estimation for multireader, multicase studies to accommodate multiple findings on the same case. We use data from two large multireader, multicase CAD studies as ballpark estimates of parameter values. RESULTS Sample size tables are presented to provide an estimate of the number of patients and readers required for a multireader, multicase study with multiple findings per case; these estimates may be conservative for many CAD studies. Two figures can be used to adjust the number of readers when there is some data on the between-reader variability. CONCLUSION The sample size tables are useful in determining whether a proposed study is feasible with the available resources; however, it is important that investigators compute sample size for their particular study using any available pilot data.
23
Noroozian M, Hadjiiski L, Rahnama-Moghadam S, Klein KA, Jeffries DO, Pinsky RW, Chan HP, Carson PL, Helvie MA, Roubidoux MA. Digital breast tomosynthesis is comparable to mammographic spot views for mass characterization. Radiology 2011; 262:61-8. [PMID: 21998048 DOI: 10.1148/radiol.11101763] [Citation(s) in RCA: 124] [Impact Index Per Article: 9.5] [Indexed: 11/11/2022]
Abstract
PURPOSE To determine if digital breast tomosynthesis (DBT) performs comparably to mammographic spot views (MSVs) in characterizing breast masses as benign or malignant. MATERIALS AND METHODS This IRB-approved, HIPAA-compliant reader study obtained informed consent from all subjects. Four blinded Mammography Quality Standards Act-certified academic radiologists individually evaluated DBT images and MSVs of 67 masses (30 malignant, 37 benign) in 67 women (age range, 34-88 years). Images were viewed in random order at separate counterbalanced sessions and were rated for visibility (10-point scale), likelihood of malignancy (12-point scale), and Breast Imaging Reporting and Data System (BI-RADS) classification. Differences in mass visibility were analyzed by using the Wilcoxon matched-pairs signed-ranks test. Reader performance was measured by calculating the area under the receiver operating characteristic curve (A(z)) and partial area index above a sensitivity threshold of 0.90 (A(z)(0.90)) by using likelihood of malignancy ratings. Masses categorized as BI-RADS 4 or 5 were compared with histopathologic analysis to determine true-positive results for each modality. RESULTS Mean mass visibility ratings were slightly better with DBT (range, 3.2-4.4) than with MSV (range, 3.8-4.8) for all four readers, with one reader's improvement achieving statistical significance (P = .001). The A(z) ranged 0.89-0.93 for DBT and 0.88-0.93 for MSV (P ≥ .23). The A(z)((0.90)) ranged 0.36-0.52 for DBT and 0.25-0.40 for MSV (P ≥ .20). The readers characterized seven additional malignant masses as BI-RADS 4 or 5 with DBT than with MSV, at a cost of five false-positive biopsy recommendations, with a mean of 1.8 true-positive (range, 0-3) and 1.3 false-positive (range, -1 to 4) assessments per reader. CONCLUSION In this small study, mass characterization in terms of visibility ratings, reader performance, and BI-RADS assessment with DBT was similar to that with MSVs. 
Preliminary findings suggest that MSV might not be necessary for mass characterization when performing DBT.
Affiliation(s)
- Mitra Noroozian
- Department of Radiology, University of Michigan Health System, 1500 E Medical Center Dr, SPC 5326, Ann Arbor, MI 48109, USA.
24
Online mammographic images database for development and comparison of CAD schemes. J Digit Imaging 2011; 24:500-6. [PMID: 20480383 DOI: 10.1007/s10278-010-9297-2] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Indexed: 10/19/2022] Open
Abstract
Considering the difficulties in finding good-quality images for the development and test of computer-aided diagnosis (CAD), this paper presents a public online mammographic images database free for all interested viewers and aimed to help develop and evaluate CAD schemes. The digitalization of the mammographic images is made with suitable contrast and spatial resolution for processing purposes. The broad recuperation system allows the user to search for different images, exams, or patient characteristics. Comparison with other databases currently available has shown that the presented database has a sufficient number of images, is of high quality, and is the only one to include a functional search system.
25
Freedman MT, Lo SCB, Seibel JC, Bromley CM. Lung Nodules: Improved Detection with Software That Suppresses the Rib and Clavicle on Chest Radiographs. Radiology 2011; 260:265-73. [DOI: 10.1148/radiol.11100153] [Citation(s) in RCA: 73] [Impact Index Per Article: 5.6] [Indexed: 11/11/2022]
26
Cho HC, Hadjiiski L, Sahiner B, Chan HP, Helvie M, Paramagul C, Nees AV. Similarity evaluation in a content-based image retrieval (CBIR) CADx system for characterization of breast masses on ultrasound images. Med Phys 2011; 38:1820-31. [PMID: 21626916 DOI: 10.1118/1.3560877] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Indexed: 11/07/2022] Open
Abstract
PURPOSE The authors are developing a content-based image retrieval (CBIR) CADx system to assist radiologists in characterization of breast masses on ultrasound images. In this study, the authors compared seven similarity measures to be considered for the CBIR system. The similarity between the query and the retrieved masses was evaluated based on radiologists' visual similarity assessments. METHODS The CADx system retrieves masses that are similar to a query mass from a reference library based on computer-extracted features using a k-nearest neighbor (k-NN) approach. Among seven similarity measures evaluated for the CBIR system, four similarity measures including linear discriminant analysis (LDA), Bayesian neural network (BNN), cosine similarity measure (Cos), and Euclidean distance (ED) similarity measure were compared by radiologists' visual assessment. For LDA and BNN, the features of a query mass were combined first into a malignancy score and then masses with similar scores were retrieved. For Cos and ED, similar masses were retrieved based on the normalized dot product and the Euclidean distance, respectively, between two feature vectors. For the observer study, three most similar masses were retrieved for a given query mass with each method. All query-retrieved mass pairs were mixed and presented to the radiologists in random order. Three Mammography Quality Standards Act (MQSA) radiologists rated the similarity between each pair using a nine-point similarity scale (1 = very dissimilar, 9 = very similar). The accuracy of the CBIR CADx system using the different similarity measures to characterize malignant and benign masses was evaluated by ROC analysis. RESULTS The BNN measure used with the k-NN classifier provided slightly higher performance for classification of malignant and benign masses (A(z) values of 0.87) than those with the LDA, Cos, and ED measures (A(z) of 0.86, 0.84, and 0.81, respectively). 
The average similarity ratings of all radiologists for LDA, BNN, Cos, and ED were 4.71, 4.95, 5.18, and 5.32, respectively. The k-NN with the ED measures retrieved masses of significantly higher similarity (p < 0.008) than LDA and BNN. CONCLUSIONS Similarity measures using the resemblance of individual features in the multidimensional feature space can retrieve visually more similar masses than similarity measures using the resemblance of the classifier scores. A CBIR system that can most effectively retrieve similar masses to the query may not have the best A(z).
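The two feature-space measures named in this abstract, the normalized dot product (Cos) and the Euclidean distance (ED), can rank the same library differently. The following is a minimal sketch, assuming a toy two-dimensional feature space; the `retrieve` helper, the mass labels, and the feature values are hypothetical and are not taken from the study.

```python
import math


def cosine_similarity(a, b):
    """Normalized dot product between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query, library, k=3, measure="ed"):
    """Return the names of the k library entries most similar to the query."""
    if measure == "cos":
        # Larger cosine similarity means more similar.
        ranked = sorted(library,
                        key=lambda item: cosine_similarity(query, item[1]),
                        reverse=True)
    elif measure == "ed":
        # Smaller Euclidean distance means more similar.
        ranked = sorted(library,
                        key=lambda item: euclidean_distance(query, item[1]))
    else:
        raise ValueError("measure must be 'cos' or 'ed'")
    return [name for name, _ in ranked[:k]]


# Hypothetical reference library of (mass label, feature vector) pairs.
library = [
    ("A", [1.0, 0.0]),
    ("B", [2.0, 2.0]),
    ("C", [0.9, 0.2]),
    ("D", [10.0, 0.5]),
]
query = [1.0, 0.1]

print(retrieve(query, library, k=3, measure="ed"))
print(retrieve(query, library, k=3, measure="cos"))
```

In this toy library, entry D points in nearly the same direction as the query but has a much larger magnitude, so the cosine measure ranks it first while the Euclidean measure ranks it last. Differences of this kind are one reason the choice of similarity measure can change which masses are retrieved.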
Affiliation(s)
- Hyun-Chong Cho
- Department of Radiology, The University of Michigan, Ann Arbor Michigan 48109-0904, USA.
27
Goddard K, Roudsari A, Wyatt JC. Automation bias: a systematic review of frequency, effect mediators, and mitigators. J Am Med Inform Assoc 2011; 19:121-7. [PMID: 21685142 DOI: 10.1136/amiajnl-2011-000089] [Citation(s) in RCA: 148] [Impact Index Per Article: 11.4] [Indexed: 12/30/2022] Open
Abstract
Automation bias (AB)--the tendency to over-rely on automation--has been studied in various academic fields. Clinical decision support systems (CDSS) aim to benefit the clinical decision-making process. Although most research shows overall improved performance with use, there is often a failure to recognize the new errors that CDSS can introduce. With a focus on healthcare, a systematic review of the literature from a variety of research fields has been carried out, assessing the frequency and severity of AB, the effect mediators, and interventions potentially mitigating this effect. This is discussed alongside automation-induced complacency, or insufficient monitoring of automation output. A mix of subject specific and freetext terms around the themes of automation, human-automation interaction, and task performance and error were used to search article databases. Of 13 821 retrieved papers, 74 met the inclusion criteria. User factors such as cognitive style, decision support systems (DSS), and task specific experience mediated AB, as did attitudinal driving factors such as trust and confidence. Environmental mediators included workload, task complexity, and time constraint, which pressurized cognitive resources. Mitigators of AB included implementation factors such as training and emphasizing user accountability, and DSS design factors such as the position of advice on the screen, updated confidence levels attached to DSS output, and the provision of information versus recommendation. By uncovering the mechanisms by which AB operates, this review aims to help optimize the clinical decision-making process for CDSS developers and healthcare practitioners.
Affiliation(s)
- Kate Goddard
- Centre for Health Informatics, City University, London, UK.
28
Chen KY, Chen CN, Wu MH, Ho MC, Tai HC, Huang WC, Chung YC, Chen A, Chang KJ. Computerized detection and quantification of microcalcifications in thyroid nodules. Ultrasound in Medicine & Biology 2011; 37:870-878. [PMID: 21546154 DOI: 10.1016/j.ultrasmedbio.2011.03.002] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 05/14/2010] [Revised: 02/05/2011] [Accepted: 03/01/2011] [Indexed: 05/30/2023]
Abstract
To improve the ultrasonographic detection rates of thyroid cancers with microcalcifications, we propose to enhance the sensitivity of sonographic calcifications detection and to avoid interobserver variation by a computerized quantification method in a prospective setting. A total of 227 participants with 258 nodules were evaluated. Among them, two nodules were excluded for suspicious aspiration cytology results without pathologic proof. Among the remaining 256 nodules, the diagnosis of 181 nodules was verified by surgical pathology and the diagnosis of 75 was based on fine needle aspiration (FNA) biopsy results. There were 173 benign thyroid nodules and 83 malignant thyroid nodules, which included 74 papillary carcinomas. Patient clinical data were collected and the presence of calcifications on conventional gray-scale ultrasound images was retrospectively reviewed by a thyroid specialist. Quantification of cystic components and calcifications was automatically performed by a proprietary program (AmCAD-UT) implemented with methods proposed in this article. The calcification index (CI) was calculated after the cystic component was excluded. The CI between benign and malignant nodules diagnosed by combined FNA biopsy and surgical pathology results (total number, 256) showed a significant difference (p < 0.0001, AUC = 0.746). Furthermore, we excluded patients without surgical pathology results for further validation and the CI between benign and malignant nodules confirmed by pathology results (total number, 181) showed a significant difference (p < 0.0001, AUC = 0.763). To learn whether our computer program increased our diagnostic capabilities, we analyzed human investigators and their abilities to detect and evaluate. In this study, calcifications were noted in 48.19% (40 of 83) of malignant thyroid nodules and in 10.98% (19 of 173) of benign nodules. 
This new computer-aided diagnosis method to evaluate the sonographic calcifications of thyroid nodules is a more sensitive and more objective method. It can provide better sensitivity than conventional methods in the diagnosis of thyroid malignancies containing microcalcifications.
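The nodule counts and percentages quoted in this abstract can be re-derived from the stated numbers. The short check below is a sketch that uses only figures given in the abstract; the `percent` helper and variable names are introduced here for illustration.

```python
def percent(count, total):
    """Percentage rounded to two decimals, as reported in the abstract."""
    return round(100.0 * count / total, 2)


# Counts stated in the abstract.
benign_nodules = 173
malignant_nodules = 83
total_nodules = benign_nodules + malignant_nodules   # nodules analyzed
by_surgery, by_fna = 181, 75                          # verification method

malignant_rate = percent(40, malignant_nodules)  # calcifications noted in malignant nodules
benign_rate = percent(19, benign_nodules)        # calcifications noted in benign nodules

print(total_nodules, by_surgery + by_fna, malignant_rate, benign_rate)
```

Both ways of splitting the cohort (benign versus malignant, surgical pathology versus FNA) sum to the 256 analyzed nodules, and the two rates match the 48.19% and 10.98% reported.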
Affiliation(s)
- Kuen-Yuan Chen
- Department of Surgery, National Taiwan University Hospital, Taipei, Taiwan
29
Chen DR, Lai HW. Three-dimensional ultrasonography for breast malignancy detection. Expert Opin Med Diagn 2011; 5:253-61. [PMID: 23484500 DOI: 10.1517/17530059.2011.561314] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/05/2022]
Abstract
INTRODUCTION Breast ultrasound is used not only to differentiate a solid breast mass from a cyst and to assist in guided biopsy, but also to classify benign and malignant lesions, with good resolution gray-scale imaging equipped with color Doppler adequate for daily clinical practice in most circumstances. AREAS COVERED This article critically reviews three-dimensional (3D) ultrasound for the detection of breast malignancies in comparison with the popular two-dimensional ultrasound, highlighting the advantages it has over other imaging modalities as well as the drawbacks that are presented. In particular, the article looks at how 3D ultrasound planes help us to define more clearly the margins, that is, microlobulation and papillomas, of breast tumors. This paper also highlights how the resolution and multiple planes of 3D ultrasound can clearly demonstrate skin tumor infiltration for evaluation and how it can be used for planning, monitoring and treatment of breast cancer. EXPERT OPINION As with any new technology, 3D ultrasound has a learning curve and clinicians will need to master the technology in order to use this tool to its full potential. Although 3D ultrasound does have its limitations, a better understanding of its settings along with the optimization of image acquisition and a better ability to manipulate data during analysis will lead to 3D ultrasound becoming a useful tool for breast malignancy detection.
Affiliation(s)
- Dar-Ren Chen
- Changhua Christian Hospital, Comprehensive Breast Cancer Center, 135 Nanhsiao Street, Changhua 500, Taiwan; +886 4 723 8595 ext. 4871; +886 4 723 3715
30
Chan HP, Wu YT, Sahiner B, Wei J, Helvie MA, Zhang Y, Moore RH, Kopans DB, Hadjiiski L, Way T. Characterization of masses in digital breast tomosynthesis: comparison of machine learning in projection views and reconstructed slices. Med Phys 2010; 37:3576-86. [PMID: 20831065 DOI: 10.1118/1.3432570] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Indexed: 11/07/2022] Open
Abstract
PURPOSE In digital breast tomosynthesis (DBT), quasi-three-dimensional (3D) structural information is reconstructed from a small number of 2D projection view (PV) mammograms acquired over a limited angular range. The authors developed preliminary computer-aided diagnosis (CADx) methods for classification of malignant and benign masses and compared the effectiveness of analyzing lesion characteristics in the reconstructed DBT slices and in the PVs. METHODS A data set of MLO view DBT of 99 patients containing 107 masses (56 malignant and 51 benign) was collected at the Massachusetts General Hospital with IRB approval. The DBTs were obtained with a GE prototype system which acquired 11 PVs over a 50 degree arc. The authors reconstructed the DBTs at 1 mm slice interval using a simultaneous algebraic reconstruction technique. The region of interest (ROI) containing the mass was marked by a radiologist in the DBT volume and the corresponding ROIs on the PVs were derived based on the imaging geometry. The subsequent processes were fully automated. For classification of masses using the DBT-slice approach, the mass on each slice was segmented by an active contour model initialized with adaptive k-means clustering. A spiculation likelihood map was generated by analysis of the gradient directions around the mass margin and spiculation features were extracted from the map. The rubber band straightening transform (RBST) was applied to a band of pixels around the segmented mass boundary. The RBST image was enhanced by Sobel filtering in the horizontal and vertical directions, from which run-length statistics texture features were extracted. Morphological features including those from the normalized radial length were designed to describe the mass shape. 
A feature space composed of the spiculation features, texture features, and morphological features extracted from the central slice alone and seven feature spaces obtained by averaging the corresponding features from three to 19 slices centered at the central slice were compared. For classification of masses using the PV approach, a feature extraction process similar to that described above for the DBT approach was performed on the ROIs from the individual PVs. Six feature spaces obtained from the central PV alone and by averaging the corresponding features from three to 11 PVs were formed. In each feature space for either the DBT-slice or the PV approach, a linear discriminant analysis classifier with stepwise feature selection was trained and tested using a two-loop leave-one-case-out resampling procedure. Simplex optimization was used to guide feature selection automatically within the training set in each leave-one-case-out cycle. The performance of the classifiers was evaluated by the area (Az) under the receiver operating characteristic curve. RESULTS The test Az values from the DBT-slice approach ranged from 0.87 +/- 0.03 to 0.93 +/- 0.02, while those from the PV approach ranged from 0.78 +/- 0.04 to 0.84 +/- 0.04. The highest test Az of 0.93 +/- 0.02 from the nine-DBT-slice feature space was significantly (p = 0.006) better than the highest test Az of 0.84 +/- 0.04 from the nine-PV feature space. CONCLUSION The features of breast lesions extracted from the DBT slices consistently provided higher classification accuracy than those extracted from the PV images.
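The leave-one-case-out resampling mentioned in this abstract can be illustrated with a minimal Python sketch. It shows only an outer hold-out loop in which all masses from one patient are held out together; the case IDs are hypothetical, and the authors' two-loop procedure, feature selection, and classifier are not reproduced here. Any feature selection would have to run inside each training split.

```python
from collections import defaultdict


def leave_one_case_out(case_ids):
    """Yield (train_indices, test_indices), holding out one case at a time.

    All samples (e.g., masses) that share a case ID are held out together,
    so no patient contributes to both training and testing in a cycle.
    """
    by_case = defaultdict(list)
    for idx, case in enumerate(case_ids):
        by_case[case].append(idx)
    for held_out, test_idx in by_case.items():
        train_idx = [i for i, c in enumerate(case_ids) if c != held_out]
        yield train_idx, test_idx


# Hypothetical toy data: five masses from four patients.
case_ids = ["p1", "p1", "p2", "p3", "p4"]
splits = list(leave_one_case_out(case_ids))
for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)
```

Splitting by case rather than by mass matters when a patient has more than one lesion, as in a data set of 107 masses from 99 patients.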
Affiliation(s)
- Heang-Ping Chan
- Department of Radiology, University of Michigan, Ann Arbor, Michigan 48109, USA.
|
31
|
Ayer T, Ayvaci MUS, Liu ZX, Alagoz O, Burnside ES. Computer-aided diagnostic models in breast cancer screening. Imaging Med 2010; 2:313-323. [PMID: 20835372 PMCID: PMC2936490 DOI: 10.2217/iim.10.24] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Indexed: 01/11/2023]
Abstract
Mammography is the most common modality for breast cancer detection and diagnosis and is often complemented by ultrasound and MRI. However, similarities between early signs of breast cancer and normal structures in these images make detection and diagnosis of breast cancer a difficult task. To aid physicians in detection and diagnosis, computer-aided detection and computer-aided diagnostic (CADx) models have been proposed. A large number of studies have been published for both computer-aided detection and CADx models in the last 20 years. The purpose of this article is to provide a comprehensive survey of the CADx models that have been proposed to aid in mammography, ultrasound and MRI interpretation. We summarize the noteworthy studies according to the screening modality they consider and describe the type of computer model, input data size, feature selection method, input feature type, reference standard and performance measures for each study. We also list the limitations of the existing CADx models and provide several possible future research directions.
Affiliation(s)
- Turgay Ayer
- Industrial & Systems Engineering Department, University of Wisconsin, Madison, WI, USA
- Mehmet US Ayvaci
- Industrial & Systems Engineering Department, University of Wisconsin, Madison, WI, USA
- Ze Xiu Liu
- Industrial & Systems Engineering Department, University of Wisconsin, Madison, WI, USA
- Oguzhan Alagoz
- Industrial & Systems Engineering Department, University of Wisconsin, Madison, WI, USA
- Department of Population Health Sciences, University of Wisconsin, Madison, WI, USA
- Elizabeth S Burnside
- Industrial & Systems Engineering Department, University of Wisconsin, Madison, WI, USA
- Department of Biostatistics & Medical Informatics, University of Wisconsin, Madison, WI, USA
|
32
|
Timp S, Varela C, Karssemeijer N. Computer-aided diagnosis with temporal analysis to improve radiologists' interpretation of mammographic mass lesions. IEEE Trans Inf Technol Biomed 2010; 14:803-8. [PMID: 20403792 DOI: 10.1109/titb.2010.2043296] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Indexed: 11/06/2022]
Abstract
The purpose of this study was to evaluate the effect of independent reading with computer-aided diagnosis (CAD) and independent double reading on radiologists' performance in characterizing mass lesions on serial mammograms. Six radiologists rated 198 cases, 99 benign and 99 malignant. For each case, the mammograms from two consecutive screening rounds were available. The mass was visible on the prior view in 40% of the cases. Independently, a CAD program also rated each mass lesion making use of information from prior and current views. The following reading situations were compared: single reading, independent reading with CAD, and independent double reading. Independent reading with CAD was implemented by averaging the scaled ratings from each radiologist and the scaled CAD scores. We implemented independent double reading by averaging the scaled scores from two radiologists. Results were evaluated using receiver-operating characteristic (ROC) methodology and multiple reader multiple case analysis. The average performance, measured as the area under the ROC curve (A(z) value), was 0.80 for the single-reading mode. For independent double reading, the average performance improved to 0.81. This improvement was not significant. For independent interpretation with CAD, the average performance significantly increased to 0.83 (P < 0.05). We conclude that CAD technology with temporal analysis has the potential to help radiologists with the task of discriminating between benign and malignant masses.
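The combination rule described in this abstract (averaging scaled ratings from two independent readers, one of which may be the CAD program) might look like the following Python sketch. The abstract does not say how the ratings were scaled, so the min-max rescaling here is an assumption, and all ratings and function names are hypothetical.

```python
def min_max_scale(scores):
    """Linearly rescale scores to [0, 1].

    Assumed scaling: the abstract does not specify the method used.
    """
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Degenerate case: every case got the same score.
        return [0.5 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def combine_independent(ratings_a, ratings_b):
    """Average two readers' scaled ratings case by case."""
    a = min_max_scale(ratings_a)
    b = min_max_scale(ratings_b)
    return [(x + y) / 2 for x, y in zip(a, b)]


# Hypothetical data: one radiologist (0-100 scale) and CAD scores (0-1 scale).
radiologist = [10, 55, 90, 30]
cad = [0.2, 0.4, 0.8, 0.5]
combined = combine_independent(radiologist, cad)
print(combined)
```

The same function covers independent double reading if the second argument holds another radiologist's ratings instead of CAD scores.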
Affiliation(s)
- Sheila Timp
- Department of Radiology, Radboud University, Nijmegen Medical Centre, HB Nijmegen, The Netherlands.
|
33
|
Cui J, Sahiner B, Chan HP, Nees A, Paramagul C, Hadjiiski LM, Zhou C, Shi J. A new automated method for the segmentation and characterization of breast masses on ultrasound images. Med Phys 2009; 36:1553-65. [PMID: 19544771 DOI: 10.1118/1.3110069] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.1] [Indexed: 12/26/2022]
Abstract
Segmentation is one of the first steps in most computer-aided diagnosis systems for characterization of masses as malignant or benign. In this study, the authors designed an automated method for segmentation of breast masses on ultrasound (US) images. The method automatically estimated an initial contour based on a manually identified point approximately at the mass center. A two-stage active contour method iteratively refined the initial contour and performed self-examination and correction on the segmentation result. To evaluate the method, the authors compared it with manual segmentation by two experienced radiologists (R1 and R2) on a data set of 488 US images from 250 biopsy-proven masses (100 malignant and 150 benign). Two area overlap ratios (AOR1 and AOR2) and an area error measure were used as performance measures to evaluate the segmentation accuracy. Values for AOR1, defined as the ratio of the intersection of the computer and the reference segmented areas to the reference segmented area, were 0.82 +/- 0.16 and 0.84 +/- 0.18, respectively, when manually segmented mass regions by R1 and R2 were used as the reference. Although this indicated a high agreement between the computer and manual segmentations, the two radiologists' manual segmentation results were significantly (p < 0.03) more consistent, with AOR1 = 0.84 +/- 0.16 and 0.91 +/- 0.12, respectively, when the segmented regions by R1 and R2 were used as the reference. To evaluate the segmentation method in terms of lesion classification accuracy, feature spaces were formed by extracting texture, width-to-height, and posterior shadowing features based on either automated computer segmentation or the radiologists' manual segmentation. A linear discriminant analysis classifier was designed using stepwise feature selection and two-fold cross validation to characterize the mass as malignant or benign. 
For features extracted from computer segmentation, the case-based test A(z) values ranged from 0.88 +/- 0.03 to 0.92 +/- 0.02, indicating a comparable performance to those extracted from manual segmentation by radiologists (A(z) value range: 0.87 +/- 0.03 to 0.90 +/- 0.03).
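The AOR1 measure defined in this abstract (intersection of the computer and reference segmented areas divided by the reference segmented area) can be sketched in Python as follows. The masks are represented as sets of pixel coordinates, which is an illustrative choice, and the rectangular regions are hypothetical.

```python
def area_overlap_ratio(computer_mask, reference_mask):
    """AOR1: |computer AND reference| / |reference|.

    Masks are sets of (row, col) pixel coordinates.
    """
    if not reference_mask:
        raise ValueError("reference mask is empty")
    return len(computer_mask & reference_mask) / len(reference_mask)


def rect(r0, r1, c0, c1):
    """Hypothetical helper: a filled rectangle as a set of pixels."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


reference = rect(0, 10, 0, 10)   # 100 pixels
computer = rect(1, 10, 0, 12)    # 108 pixels, 90 of them inside the reference
aor1 = area_overlap_ratio(computer, reference)
print(aor1)
```

As defined, this ratio does not penalize computer-segmented pixels that fall outside the reference region, so on its own it cannot distinguish a tight segmentation from an over-segmented one.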
Affiliation(s)
- Jing Cui
- Department of Radiology, The University of Michigan, Ann Arbor Michigan 48109-0904, USA.
|
34
|
Elter M, Horsch A. CADx of mammographic masses and clustered microcalcifications: A review. Med Phys 2009; 36:2052-68. [PMID: 19610294 DOI: 10.1118/1.3121511] [Citation(s) in RCA: 141] [Impact Index Per Article: 9.4] [Indexed: 11/07/2022]
Affiliation(s)
- Matthias Elter
- Fraunhofer Institute for Integrated Circuits, Am Wolfsmantel 33, 91058 Erlangen, Germany.
|
35
|
Burnside ES, Davis J, Chhatwal J, Alagoz O, Lindstrom MJ, Geller BM, Littenberg B, Shaffer KA, Kahn CE, Page CD. Probabilistic computer model developed from clinical data in national mammography database format to classify mammographic findings. Radiology 2009; 251:663-72. [PMID: 19366902 DOI: 10.1148/radiol.2513081346] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.9] [Indexed: 11/11/2022]
Abstract
PURPOSE To determine whether a Bayesian network trained on a large database of patient demographic risk factors and radiologist-observed findings from consecutive clinical mammography examinations can exceed radiologist performance in the classification of mammographic findings as benign or malignant. MATERIALS AND METHODS The institutional review board exempted this HIPAA-compliant retrospective study from requiring informed consent. Structured reports from 48 744 consecutive pooled screening and diagnostic mammography examinations in 18 269 patients from April 5, 1999 to February 9, 2004 were collected. Mammographic findings were matched with a state cancer registry, which served as the reference standard. By using 10-fold cross validation, the Bayesian network was tested and trained to estimate breast cancer risk by using demographic risk factors (age, family and personal history of breast cancer, and use of hormone replacement therapy) and mammographic findings recorded in the Breast Imaging Reporting and Data System lexicon. The performance of radiologists compared with the Bayesian network was evaluated by using area under the receiver operating characteristic curve (AUC), sensitivity, and specificity. RESULTS The Bayesian network significantly exceeded the performance of interpreting radiologists in terms of AUC (0.960 vs 0.939, P = .002), sensitivity (90.0% vs 85.3%, P < .001), and specificity (93.0% vs 88.1%, P < .001). CONCLUSION On the basis of prospectively collected variables, the evaluated Bayesian network can predict the probability of breast cancer and exceed interpreting radiologist performance. Bayesian networks may help radiologists improve mammographic interpretation.
Affiliation(s)
- Elizabeth S Burnside
- Department of Radiology, University of Wisconsin School of Medicine and Public Health, E3/311 Clinical Science Center, 600 Highland Ave, Madison, WI 53792-3252, USA.
|
36
|
Filev P, Hadjiiski L, Chan HP, Sahiner B, Ge J, Helvie MA, Roubidoux M, Zhou C. Automated regional registration and characterization of corresponding microcalcification clusters on temporal pairs of mammograms for interval change analysis. Med Phys 2009; 35:5340-50. [PMID: 19175093 DOI: 10.1118/1.3002311] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/07/2022]
Abstract
A computerized regional registration and characterization system for analysis of microcalcification clusters on serial mammograms is being developed in our laboratory. The system consists of two stages. In the first stage, based on the location of a detected cluster on the current mammogram, a regional registration procedure identifies the local area on the prior that may contain the corresponding cluster. A search program is used to detect cluster candidates within the local area. The detected cluster on the current image is then paired with the cluster candidates on the prior image to form true (TP-TP) or false (TP-FP) pairs. Automatically extracted features were used in a newly designed correspondence classifier to reduce the number of false pairs. In the second stage, a temporal classifier, based on both current and prior information, is used if a cluster has been detected on the prior image, and a current classifier, based on current information alone, is used if no prior cluster has been detected. The data set used in this study consisted of 261 serial pairs containing biopsy-proven calcification clusters. An MQSA radiologist identified the corresponding clusters on the mammograms. On the priors, the radiologist rated the subtlety of 30 clusters (out of the 261 clusters) as 9 or 10 on a scale of 1 (very obvious) to 10 (very subtle). Leave-one-case-out resampling was used for feature selection and classification in both the correspondence and malignant/benign classification schemes. The search program detected 91.2% (238/261) of the clusters on the priors with an average of 0.42 FPs/image. The correspondence classifier identified 86.6% (226/261) of the TP-TP pairs with 20 false matches (0.08 FPs/image) relative to the entire set of 261 image pairs. In the malignant/benign classification stage the temporal classifier achieved a test A(z) of 0.81 for the 246 pairs which contained a detection on the prior. 
In addition, a classifier was designed by using the clusters on the current mammograms only. It achieved a test A(z) of 0.72 in classifying the clusters as malignant and benign. The difference between the performance of the temporal classifier and the current classifier was statistically significant (p=0.0014). Our interval change analysis system can detect the corresponding cluster on the prior mammogram with high sensitivity, and classify them with a satisfactory accuracy.
Affiliation(s)
- Peter Filev
- Department of Radiology, The University of Michigan, Ann Arbor, Michigan 48109-0904, USA
|
37
|
Giger ML, Chan HP, Boone J. Anniversary paper: History and status of CAD and quantitative image analysis: the role of Medical Physics and AAPM. Med Phys 2009; 35:5799-820. [PMID: 19175137 PMCID: PMC2673617 DOI: 10.1118/1.3013555] [Citation(s) in RCA: 165] [Impact Index Per Article: 11.0] [Indexed: 01/17/2023]
Abstract
The roles of physicists in medical imaging have expanded over the years, from the study of imaging systems (sources and detectors) and dose to the assessment of image quality and perception, the development of image processing techniques, and the development of image analysis methods to assist in detection and diagnosis. The latter is a natural extension of medical physicists' goals in developing imaging techniques to help physicians acquire diagnostic information and improve clinical decisions. Studies indicate that radiologists do not detect all abnormalities on images that are visible on retrospective review, and they do not always correctly characterize abnormalities that are found. Since the 1950s, the potential use of computers had been considered for analysis of radiographic abnormalities. In the mid-1980s, however, medical physicists and radiologists began major research efforts for computer-aided detection or computer-aided diagnosis (CAD), that is, using the computer output as an aid to radiologists (as opposed to a completely automatic computer interpretation), focusing initially on methods for the detection of lesions on chest radiographs and mammograms. Since then, extensive investigations of computerized image analysis for detection or diagnosis of abnormalities in a variety of 2D and 3D medical images have been conducted. The growth of CAD over the past 20 years has been tremendous, from the early days of time-consuming film digitization and CPU-intensive computations on a limited number of cases to its current status in which developed CAD approaches are evaluated rigorously on large clinically relevant databases.
CAD research by medical physicists includes many aspects: collecting relevant normal and pathological cases; developing computer algorithms appropriate for the medical interpretation task including those for segmentation, feature extraction, and classifier design; developing methodology for assessing CAD performance; validating the algorithms using appropriate cases to measure performance and robustness; conducting observer studies with which to evaluate radiologists in the diagnostic task without and with the use of the computer aid; and ultimately assessing performance with a clinical trial. Medical physicists also have an important role in quantitative imaging, by validating the quantitative integrity of scanners and developing imaging techniques, and image analysis tools that extract quantitative data in a more accurate and automated fashion. As imaging systems become more complex and the need for better quantitative information from images grows, the future includes the combined research efforts from physicists working in CAD with those working on quantitative imaging systems to readily yield information on morphology, function, molecular structure, and more, from animal imaging research to clinical patient care. A historical review of CAD and a discussion of challenges for the future are presented here, along with the extension to quantitative image analysis.
Affiliation(s)
- Maryellen L Giger
- Department of Radiology, University of Chicago, Chicago, Illinois 60637, USA.
|
38
|
Abstract
PURPOSE OF REVIEW Computer-aided diagnosis (CAD) is a technology used for the detection and characterization of cancer. Although CAD is not limited to a single type of cancer, a large number of CAD systems to date have been designed and used for breast cancer. The aim of this review is to discuss the current state of the CAD systems for breast-cancer diagnosis, their application as a second reader in clinical practice, and studies that have evaluated the effect of CAD on radiologists' performance. RECENT FINDINGS A large number of CAD applications are being developed for different imaging modalities. Owing to commercially available Food and Drug Administration (FDA) approved systems, the main clinical use of CAD to date is for screen-film mammography. Many studies have shown that CAD improves radiologists' performance. A large number of academic institutions have devoted a substantial research effort to developing CAD methods. SUMMARY CAD systems will play an increasingly important role in the clinic as a second reader. Clinical trials have shown that CAD can improve the accuracy of breast-cancer detection. Preclinical studies have demonstrated the potential of CAD to improve the classification of malignant and benign lesions. An increased number of CAD systems are being developed for different breast-imaging modalities.
Affiliation(s)
- Lubomir Hadjiiski
- Department of Radiology, University of Michigan, Ann Arbor, Michigan 48109-0904, USA.
|
39
|
Angelo MF, Patrocinio AC, Schiabel H, Medeiros RB, Pires SR. Comparing mammographic images. IEEE Eng Med Biol Mag 2008; 27:74-81. [PMID: 18519185 DOI: 10.1109/memb.2008.919024] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/08/2022]
Affiliation(s)
- Michele F Angelo
- Departamento de Tecnologia/Universidade Estadual de Feira de Santana-UEFS, Av. Universitária, Feira de Santana-BA, Brazil.
|
40
|
Horsch K, Giger ML, Metz CE. Potential effect of different radiologist reporting methods on studies showing benefit of CAD. Acad Radiol 2008; 15:139-52. [PMID: 18206613 DOI: 10.1016/j.acra.2007.09.015] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 08/03/2007] [Revised: 09/12/2007] [Accepted: 09/13/2007] [Indexed: 11/24/2022]
Abstract
RATIONALE AND OBJECTIVES To investigate the effect of different reporting methods and performance measures on the assessment of the benefit of computer-aided diagnosis (CAD) in characterizing malignant and benign breast lesions on mammography and sonography. MATERIALS AND METHODS In a previous study, 10 observers provided three types of reporting data (probability of malignancy [PM] estimates, Breast Imaging Reporting and Data System [BI-RADS] ratings, and biopsy decisions), both without and with CAD. The current study compares alternative performance measures computed from the three types of reporting data. The area under the receiver operating characteristic curve (AUC) was computed from both the PM estimates and the BI-RADS ratings, whereas sensitivity and specificity were computed from all three data types. Sensitivity and specificity values calculated from either the PM estimates or the BI-RADS ratings were determined by setting both constant and user-dependent thresholds. Student's t-tests were used to evaluate the statistical significance of the differences in the performance measures without and with CAD. RESULTS The average AUC values of the 10 observers calculated from either PM estimates or BI-RADS ratings demonstrated statistically significant improvements in performance with CAD, increasing from 0.87 to 0.92 or 0.93, respectively. However, the statistical significance of improvements in sensitivity or specificity depended on the type of reporting data used. CONCLUSIONS Use of different types of reporting data in the computation of sensitivity and specificity may result in different conclusions concerning the benefit of CAD. Meaningful determination of sensitivity and specificity from PM estimates requires the use of user-dependent thresholds.
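To make the distinction in this abstract concrete, the Python sketch below computes a threshold-free AUC and a threshold-dependent sensitivity/specificity pair from the same ratings. It uses the nonparametric (Mann-Whitney) AUC estimate, which is an assumption and not necessarily the ROC fitting method the authors used; the scores and the threshold of 50 are hypothetical.

```python
def empirical_auc(pos_scores, neg_scores):
    """Fraction of (malignant, benign) pairs ranked correctly; ties count 1/2."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def sens_spec(pos_scores, neg_scores, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(s >= threshold for s in pos_scores)
    tn = sum(s < threshold for s in neg_scores)
    return tp / len(pos_scores), tn / len(neg_scores)


# Hypothetical probability-of-malignancy estimates (percent).
malignant = [80, 65, 90, 40]
benign = [10, 40, 30, 70, 5]

auc = empirical_auc(malignant, benign)
sens, spec = sens_spec(malignant, benign, threshold=50)
print(auc, sens, spec)
```

Changing `threshold` moves sensitivity and specificity along the ROC curve while the AUC stays fixed, which is one reason a constant threshold applied to readers who use the rating scale differently can give conclusions that differ from the AUC.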
|
41
|
Diagnostic Accuracy and Reading Time to Detect Intracranial Aneurysms on MR Angiography Using a Computer-Aided Diagnosis System. AJR Am J Roentgenol 2008; 190:459-65. [DOI: 10.2214/ajr.07.2642] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Indexed: 11/18/2022]
|
42
|
Helvie M. Improving mammographic interpretation: double reading and computer-aided diagnosis. Radiol Clin North Am 2007; 45:801-11, vi. [PMID: 17888770 DOI: 10.1016/j.rcl.2007.06.004] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Indexed: 11/23/2022]
Abstract
This article discusses two commonly used techniques advocated to improve screening mammography performance: double reading (DR) and computer-aided detection (CAD). Analysis of these methods is incomplete because no randomized controlled trials have been performed to assess changes in survival. Although DR and CAD have shown improvement in sensitivity, specificity often has decreased. Balancing which parameter is more important involves health care policy, costs, cultural factors, legal risk, and patient preference.
Affiliation(s)
- Mark Helvie
- Department of Radiology, University of Michigan Health System, 1500 East Medical Center Drive, TC 2910N, Ann Arbor, MI 48109-0326, USA.
|
43
|
Braun M, Pölcher M, Schrading S, Zivanovic O, Kowalski T, Flucke U, Leutner C, Park-Simon TW, Rudlowski C, Kuhn W, Kuhl CK. Influence of preoperative MRI on the surgical management of patients with operable breast cancer. Breast Cancer Res Treat 2007; 111:179-87. [PMID: 17906928 DOI: 10.1007/s10549-007-9767-5] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.2] [Received: 01/06/2007] [Accepted: 09/18/2007] [Indexed: 10/22/2022]
Abstract
PURPOSE Evaluation of the impact of preoperative magnetic resonance imaging (MRI) of the breast on the clinical management of patients with operable breast cancer (BC). METHODS Retrospective analysis of 160 patients with operable breast cancer (stages Tis through T4), treated from 2002 through 2004. All patients underwent a full mammographic assessment, high frequency breast ultrasound, and breast MRI. The impact of preoperative MRI was evaluated for each patient with regard to changes in the therapeutic procedure. Patient and tumor characteristics were analyzed to identify possible patient subgroups that predominantly would benefit from preoperative MRI. RESULTS Preoperative MRI affected the clinical management in 44 of 160 patients (27.5%). In 30 cases (18.75%) additional in situ or invasive cancers or a more widespread tumor extent were diagnosed correctly which went undetected by clinical palpation, mammography, and breast ultrasound. In 14 cases (8.75%) additional surgical procedures were performed based on suspicious MRI findings that turned out to be benign in final pathology. Age, menopausal status, breast density, tumor characteristics (type, tumor size, grading), ER-, PR- and HER2-receptor features did not significantly differ between patients in whom breast MRI affected the clinical management and patients for whom MRI provided no additional information. CONCLUSIONS Preoperative breast MRI changes surgical management of patients with operable breast cancer. MRI detects additional invasive carcinoma and proves to be a powerful supplement to the conventional work-up in the clinical management of breast cancer. This advantage is independent of patient- and tumor-specific characteristics.
Affiliation(s)
- Michael Braun
- Department of Obstetrics and Gynecology, University of Bonn, Sigmund-Freud-Street 25, 53105, Bonn, Germany.
|
44
|
Khorasani R. Role of information technology in improving the quality of care in radiology: an overview. J Am Coll Radiol 2007; 2:1035-6. [PMID: 17411989 DOI: 10.1016/j.jacr.2005.09.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/06/2005] [Indexed: 10/25/2022]
Affiliation(s)
- Ramin Khorasani
- Department of Radiology and Center for Evidence-Based Imaging, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA.
|
45
|
Timp S, Varela C, Karssemeijer N. Temporal change analysis for characterization of mass lesions in mammography. IEEE Trans Med Imaging 2007; 26:945-53. [PMID: 17649908 DOI: 10.1109/tmi.2007.897392] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.0] [Indexed: 05/04/2023]
Abstract
In this paper, we present a fully automated computer-aided diagnosis (CAD) program to detect temporal changes in mammographic masses between two consecutive screening rounds. The goal of this work was to improve the characterization of mass lesions by adding information about the tumor behavior over time. Towards this goal we previously developed a regional registration technique that finds for each mass lesion on the current view a location on the prior view where the mass was most likely to develop. For the task of interval change analysis, we designed two kinds of temporal features: difference features and similarity features. Difference features indicate the (relative) change in feature values determined on prior and current views. These features may be especially useful for lesions that are visible on both views. Similarity features measure whether two regions are comparable in appearance and may be useful for lesions that are visible on the prior view as well as for newly developing lesions. We evaluated the classification performance with and without the use of temporal features on a dataset consisting of 465 temporal mammogram pairs, 238 benign, and 227 malignant. We used cross validation to partition the dataset into a training set and a test set. The training set was used to train a support vector machine classifier and the test set to evaluate the classifier. The average A(z) value (area under the receiver operating characteristic curve) for classifying each lesion was 0.74 without temporal features and 0.77 with the use of temporal features. The improvement obtained by adding temporal features was statistically significant (P = 0.005). In particular, similarity features contributed to this improvement. Furthermore, we found that the improvement was comparable for masses that were visible and for masses that were not visible on the prior view. 
These results show that the use of temporal features is an effective approach to improve the characterization of masses.
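The difference features described in this abstract, which capture the (relative) change in feature values between prior and current views, could be computed along the lines of the Python sketch below. The exact formula is not given in the abstract, so the normalization by the prior value is an assumption, and the feature names and values are hypothetical.

```python
def relative_difference(current, prior, eps=1e-12):
    """Per-feature relative change: (current - prior) / |prior|.

    Assumed formula; eps guards against division by zero when a prior
    feature value is exactly zero.
    """
    return [(c - p) / max(abs(p), eps) for c, p in zip(current, prior)]


# Hypothetical feature vectors: [mass size in mm, contrast].
prior_features = [12.0, 0.40]
current_features = [15.0, 0.50]

diff = relative_difference(current_features, prior_features)
print(diff)
```

A difference feature of this kind is only meaningful when the lesion is measurable on both views, which matches the abstract's remark that such features may be most useful for lesions visible on both the prior and the current mammogram.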
|
46
|
Gur D, Rockette HE, Bandos AI. "Binary" and "non-binary" detection tasks: are current performance measures optimal? Acad Radiol 2007; 14:871-6. [PMID: 17626312 DOI: 10.1016/j.acra.2007.03.014] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/20/2022]
Abstract
We have observed that a very large fraction of responses for several detection tasks during the performance of observer studies are in the extreme ranges of lower than 11% or higher than 89% regardless of the actual presence or absence of the abnormality in question or its subjectively rated "subtleness." This observation raises questions regarding the validity and appropriateness of using multicategory rating scales for such detection tasks. Monte Carlo simulation of binary and multicategory ratings for these tasks demonstrates that the use of the former (binary) often results in a less biased and more precise summary index and hence may lead to a higher statistical power for determining differences between modalities.
Affiliation(s)
- David Gur
- Department of Radiology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA.
|
47
|
O'Connor SD, Yao J, Summers RM. Lytic Metastases in Thoracolumbar Spine: Computer-aided Detection at CT—Preliminary Study. Radiology 2007; 242:811-6. [PMID: 17325068 DOI: 10.1148/radiol.2423060260] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.9] [Indexed: 11/11/2022]
Abstract
PURPOSE To evaluate the sensitivity of a computer-aided detection (CAD) system for detection of lytic thoracolumbar spinal lesions at body CT, with results of manual lesion segmentation as the reference standard. MATERIALS AND METHODS The study was HIPAA compliant and institutional review board approved; the institutional review board waived the need for informed consent. The CAD system segments the spine on CT images and searches for detections that match size, shape, location, and attenuation criteria. To reduce false-positive findings, 16 features for each detection were computed and fed to a classifier trained with manually segmented lesions. The data set consisted of CT studies of 50 patients (30 men, 20 women; range, 18-82 years; mean, 54.8 years) with 28 lesions. Studies were assigned to either a training (29 studies) or testing (21 studies) set. Sensitivities and false-positive rates (FPRs) for training and testing sets were calculated for these lesions, which were probable lytic metastases with areas 0.8 cm(2) or greater. RESULTS Training set sensitivity was 0.83 (10 of 12; 95% confidence interval: 0.51, 0.97), with an FPR of 7.4 per patient. Test set sensitivity was 0.94 (15 of 16; 95% confidence interval: 0.68, 1.00), with an FPR of 4.5 per patient. There was no significant difference between the CAD sensitivities of the training and test sets (P = .56). Of three false-negative findings, two were due to incomplete segmentation of the vertebral pedicle, and the third was rejected by the classifier. False-positive detections were most often attributable to veins that connect the basivertebral vein with the anterior venous plexus (106 [34%] of 310) and to low-attenuating disks (83 [27%] of 310). CONCLUSION This CAD system successfully identified probable lytic metastases in the thoracolumbar spine and generalized well to an independent testing set.
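The sensitivities above are binomial proportions (10 of 12 and 15 of 16), each reported with a 95% confidence interval. The abstract does not say which interval method was used, so the sketch below uses the exact Clopper-Pearson interval as an assumption; its bounds can differ slightly from the published ones. The function names are illustrative only.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(g, target, iterations=100):
    """Bisection on [0, 1] for g(p) = target, where g is increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for x successes in n trials."""
    # Lower bound: p such that P(X >= x) = alpha / 2.
    lower = 0.0 if x == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper bound: p such that P(X <= x) = alpha / 2, i.e. P(X > x) = 1 - alpha / 2.
    upper = 1.0 if x == n else _solve_increasing(
        lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper


for detected, total in [(10, 12), (15, 16)]:  # training and test sets from the abstract
    low, high = clopper_pearson(detected, total)
    print(f"{detected}/{total}: sensitivity {detected / total:.2f}, "
          f"95% CI ({low:.2f}, {high:.2f})")
```

With so few lesions per set, the intervals are wide, which is consistent with the abstract's finding of no significant difference between training and test sensitivity.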
Affiliation(s)
- Stacy D O'Connor
- Diagnostic Radiology Department, Clinical Center, National Institutes of Health, Bldg 10, Room 1C351, 10 Center Dr, MSC 1182, Bethesda, MD 20892-1182, USA
|
48
|
Sahiner B, Chan HP, Roubidoux MA, Hadjiiski LM, Helvie MA, Paramagul C, Bailey J, Nees AV, Blane C. Malignant and benign breast masses on 3D US volumetric images: effect of computer-aided diagnosis on radiologist accuracy. Radiology 2007; 242:716-24. [PMID: 17244717 PMCID: PMC2800986 DOI: 10.1148/radiol.2423051464] [Citation(s) in RCA: 103] [Impact Index Per Article: 6.1] [Indexed: 11/11/2022]
Abstract
PURPOSE To retrospectively investigate the effect of using a custom-designed computer classifier on radiologists' sensitivity and specificity for discriminating malignant masses from benign masses on three-dimensional (3D) volumetric ultrasonographic (US) images, with histologic analysis serving as the reference standard. MATERIALS AND METHODS Informed consent and institutional review board approval were obtained. Our data set contained 3D US volumetric images obtained in 101 women (average age, 51 years; age range, 25-86 years) with 101 biopsy-proved breast masses (45 benign, 56 malignant). A computer algorithm was designed to automatically delineate mass boundaries and extract features on the basis of segmented mass shapes and margins. A computer classifier was used to merge features into a malignancy score. Five experienced radiologists participated as readers. Each radiologist read cases first without computer-aided diagnosis (CAD) and immediately thereafter with CAD. Observers' malignancy rating data were analyzed with the receiver operating characteristic (ROC) curve. RESULTS Without CAD, the five radiologists had an average area under the ROC curve (A(z)) of 0.83 (range, 0.81-0.87). With CAD, the average A(z) increased significantly (P = .006) to 0.90 (range, 0.86-0.93). When a 2% likelihood of malignancy was used as the threshold for biopsy recommendation, the average sensitivity of radiologists increased from 96% to 98% with CAD, while the average specificity for this data set decreased from 22% to 19%. If a biopsy recommendation threshold could be chosen such that sensitivity would be maintained at 96%, specificity would increase to 45% with CAD. CONCLUSION Use of a computer algorithm may improve radiologists' accuracy in distinguishing malignant from benign breast masses on 3D US volumetric images.
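The trade-off described above, where sensitivity and specificity both depend on the likelihood-of-malignancy threshold chosen for a biopsy recommendation, can be sketched as follows. The ratings and labels are invented toy values, not study data, and treating a rating at or above the threshold as a biopsy recommendation is an assumption.

```python
def sensitivity_specificity(ratings, labels, threshold):
    """Sensitivity and specificity when ratings >= threshold recommend biopsy.

    ratings: likelihood-of-malignancy estimates in percent (0-100).
    labels:  1 for malignant, 0 for benign (both classes must be present).
    """
    tp = fn = tn = fp = 0
    for rating, label in zip(ratings, labels):
        recommend_biopsy = rating >= threshold  # assumed decision rule
        if label == 1:
            if recommend_biopsy:
                tp += 1
            else:
                fn += 1
        else:
            if recommend_biopsy:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy ratings: four malignant masses followed by four benign ones.
toy_ratings = [1, 3, 50, 80, 1, 2, 10, 0]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

for threshold in (2, 20):
    sens, spec = sensitivity_specificity(toy_ratings, toy_labels, threshold)
    print(f"threshold {threshold}%: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

A low threshold such as 2% favors sensitivity at the cost of specificity, which is why the abstract also reports specificity at a fixed sensitivity of 96%.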
Affiliation(s)
- Berkman Sahiner
- Department of Radiology, University of Michigan Medical Center, CGC B2102, 1500 E Medical Center Dr, Ann Arbor, MI 48109-0904, USA.
|
49
|
Hadjiiski L, Chan HP, Sahiner B, Helvie MA, Roubidoux MA. Quasi-continuous and discrete confidence rating scales for observer performance studies: Effects on ROC analysis. Acad Radiol 2007; 14:38-48. [PMID: 17178364 PMCID: PMC2976672 DOI: 10.1016/j.acra.2006.09.048] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.0] [Received: 05/12/2006] [Revised: 09/15/2006] [Accepted: 09/15/2006] [Indexed: 11/23/2022]
Abstract
RATIONALE AND OBJECTIVES To examine the effects of the number of categories in the rating scale used in an observer experiment on the results of ROC analysis by a simulation study. MATERIALS AND METHODS We have previously evaluated the effects of computer-aided diagnosis on radiologists' characterization of malignant and benign breast masses in serial mammograms. The evaluation of the likelihood of malignancy was performed on a quasi-continuous (0-100 points) confidence rating scale. In this study, we simulated the use of discrete confidence rating scales with fewer categories and analyzed the results with receiver operating characteristic (ROC) methodology. The observers' estimates of the likelihood of malignancy were also mapped to BI-RADS assessments with five and seven categories and ROC analysis was performed. The area under the ROC curve and the partial area index obtained from ROC analysis of the different confidence rating scales were compared. RESULTS The fitted ROC curves and the performance indices did not change significantly when the confidence rating scales were varied from 6 to 101 points, provided that the estimated operating points obtained directly from the data were distributed relatively evenly over the entire range of true-positive fraction (TPF) and false-positive fraction (FPF). The mapping of the likelihood of malignancy observer data to the seven-category BI-RADS assessment scale allowed reliable ROC analysis, whereas mapping to the five-category BI-RADS scale could cause erratic ROC curve fitting because of the lack of operating points in the mid-range or failure in ROC curve fitting because of data degeneration for some observers. CONCLUSION ROC analysis of discrete confidence rating scales with few but relatively evenly distributed data points over the entire FPF and TPF range is comparable to that of a quasi-continuous rating scale. However, ROC analysis of discrete confidence rating scales with few and unevenly distributed data points may cause unreliable estimations.
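The effect of coarsening a quasi-continuous rating scale can be explored with a small sketch. The study analyzed fitted ROC curves; the code below instead uses the empirical (Mann-Whitney) area under the curve, a simpler stand-in chosen for illustration. The equal-width binning in `quantize` and the toy ratings are assumptions, not the study's mapping or data.

```python
def quantize(score, n_categories, max_score=100):
    """Map an integer 0..max_score rating onto n_categories equal-width bins."""
    return (score * n_categories) // (max_score + 1)


def empirical_auc(positive_ratings, negative_ratings):
    """Mann-Whitney estimate of AUC; tied pairs count as one half."""
    wins = 0.0
    for pos in positive_ratings:
        for neg in negative_ratings:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_ratings) * len(negative_ratings))


# Hypothetical toy ratings on a 0-100 scale.
malignant = [90, 75, 60, 85, 40, 95]
benign = [10, 30, 55, 20, 65, 5]

for n_categories in (101, 6, 2):
    pos = [quantize(s, n_categories) for s in malignant]
    neg = [quantize(s, n_categories) for s in benign]
    print(f"{n_categories:>3} categories: AUC = {empirical_auc(pos, neg):.3f}")
```

Coarser scales create more tied ratings and fewer operating points, which is the mechanism behind the unreliable estimates the abstract describes for sparse or uneven scales.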
Affiliation(s)
- Lubomir Hadjiiski
- Department of Radiology, The University of Michigan, CGC B2102, 1500 East Medical Center Drive, Ann Arbor, MI 48109-0904, USA.
|
50
|
Landau DA. Doctors – A species on the verge of extinction? Med Hypotheses 2007; 68:245-9. [PMID: 17052859 DOI: 10.1016/j.mehy.2006.08.038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2006] [Accepted: 08/23/2006] [Indexed: 11/20/2022]
Abstract
Medicine is undergoing profound change, but the basic format of the medical encounter has remained unchanged. Nevertheless, medicine in the 22nd century may be fully computerized, and a possible model is briefly depicted in this paper. Computer applications are constantly increasing their share in medical diagnosis, and may ultimately replace physicians. Treatment decisions have been submitted to standardized treatment guidelines, which may be applied more efficiently by computer applications. Although hundreds of studies have evaluated computerized tools in diagnosis and treatment, the possibility that computer applications may replace human physicians in the future is rarely raised. The effects of this process on doctors and medicine may be tremendous and will probably be felt even in early stages, and therefore, this process should be a subject of open discussion.
|