
For: Kupinski MA, Hoppin JW, Clarkson E, Barrett HH, Kastis GA. Estimation in medical imaging without a gold standard. Acad Radiol 2002;9:290-7. [PMID: 11887945 PMCID: PMC3143018 DOI: 10.1016/s1076-6332(03)80372-0] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.3] [Indexed: 11/17/2022]


Cited by Other Article(s)

Liu Z, Mhlanga JC, Xia H, Siegel BA, Jha AK. Need for Objective Task-Based Evaluation of Image Segmentation Algorithms for Quantitative PET: A Study with ACRIN 6668/RTOG 0235 Multicenter Clinical Trial Data. J Nucl Med 2024;65:jnumed.123.266018. [PMID: 38360049 PMCID: PMC10924158 DOI: 10.2967/jnumed.123.266018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Revised: 12/19/2023] [Accepted: 12/19/2023] [Indexed: 02/17/2024] Open

Abstract

Reliable performance of PET segmentation algorithms on clinically relevant tasks is required for their clinical translation. However, these algorithms are typically evaluated using figures of merit (FoMs) that are not explicitly designed to correlate with clinical task performance. Such FoMs include the Dice similarity coefficient (DSC), the Jaccard similarity coefficient (JSC), and the Hausdorff distance (HD). The objective of this study was to investigate whether evaluating PET segmentation algorithms using these task-agnostic FoMs yields interpretations consistent with evaluation on clinically relevant quantitative tasks. Methods: We conducted a retrospective study to assess the concordance in the evaluation of segmentation algorithms using the DSC, JSC, and HD and on the tasks of estimating the metabolic tumor volume (MTV) and total lesion glycolysis (TLG) of primary tumors from PET images of patients with non-small cell lung cancer. The PET images were collected from the American College of Radiology Imaging Network 6668/Radiation Therapy Oncology Group 0235 multicenter clinical trial data. The study was conducted in 2 contexts: (1) evaluating conventional segmentation algorithms, namely those based on thresholding (SUVmax40% and SUVmax50%), boundary detection (Snakes), and stochastic modeling (Markov random field-Gaussian mixture model); (2) evaluating the impact of network depth and loss function on the performance of a state-of-the-art U-net-based segmentation algorithm. Results: Evaluation of conventional segmentation algorithms based on the DSC, JSC, and HD showed that SUVmax40% significantly outperformed SUVmax50%. However, SUVmax40% yielded lower accuracy on the tasks of estimating MTV and TLG, with a 51% and 54% increase, respectively, in the ensemble normalized bias. Similarly, the Markov random field-Gaussian mixture model significantly outperformed Snakes on the basis of the task-agnostic FoMs but yielded a 24% increased bias in estimated MTV. 
For the U-net-based algorithm, our evaluation showed that although the network depth did not significantly alter the DSC, JSC, and HD values, a deeper network yielded substantially higher accuracy in the estimated MTV and TLG, with a decreased bias of 91% and 87%, respectively. Additionally, whereas there was no significant difference in the DSC, JSC, and HD values for different loss functions, up to a 73% and 58% difference in the bias of the estimated MTV and TLG, respectively, existed. Conclusion: Evaluation of PET segmentation algorithms using task-agnostic FoMs could yield findings discordant with evaluation on clinically relevant quantitative tasks. This study emphasizes the need for objective task-based evaluation of image segmentation algorithms for quantitative PET.
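The discordance described in this abstract can be illustrated with a minimal sketch. It assumes segmentations are represented as sets of voxel indices and uses the voxel count as a stand-in for metabolic tumor volume; the function names and toy masks are hypothetical and are not taken from the study. The Hausdorff distance is omitted for brevity.

```python
def dice(seg, truth):
    """Dice similarity coefficient between two sets of voxel indices."""
    if not seg and not truth:
        return 1.0
    return 2 * len(seg & truth) / (len(seg) + len(truth))


def jaccard(seg, truth):
    """Jaccard similarity coefficient between two sets of voxel indices."""
    if not seg and not truth:
        return 1.0
    return len(seg & truth) / len(seg | truth)


def normalized_volume_error(seg, truth):
    """Signed volume error relative to the true volume (voxel count as volume proxy)."""
    return (len(seg) - len(truth)) / len(truth)


# Toy one-dimensional "tumor" of 10 voxels (hypothetical data).
truth = set(range(10))
# Candidate A: correct size, but shifted by two voxels.
shifted = set(range(2, 12))
# Candidate B: covers the whole tumor plus five extra voxels.
oversized = set(range(15))

for name, seg in (("shifted", shifted), ("oversized", oversized)):
    print(name, dice(seg, truth), jaccard(seg, truth),
          normalized_volume_error(seg, truth))
```

In this toy case both candidates have the same DSC and the same JSC, yet the shifted mask has zero volume error while the oversized mask overestimates the volume by 50%. This is only meant to show how overlap-based figures of merit can hide differences on a volume-estimation task.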


Liu Y, Jha AK. How accurately can quantitative imaging methods be ranked without ground truth: An upper bound on no-gold-standard evaluation. PROCEEDINGS OF SPIE--THE INTERNATIONAL SOCIETY FOR OPTICAL ENGINEERING 2024;12929:129290W. [PMID: 39610808 PMCID: PMC11601990 DOI: 10.1117/12.3006888] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2024]

Liu Z, Mhlanga JC, Siegel BA, Jha AK. Need for objective task-based evaluation of AI-based segmentation methods for quantitative PET. PROCEEDINGS OF SPIE--THE INTERNATIONAL SOCIETY FOR OPTICAL ENGINEERING 2023;12467:124670R. [PMID: 37990707 PMCID: PMC10659582 DOI: 10.1117/12.2647894] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/23/2023]

Liu Z, Moon HS, Li Z, Laforest R, Perlmutter JS, Norris SA, Jha AK. A tissue-fraction estimation-based segmentation method for quantitative dopamine transporter SPECT. Med Phys 2022;49:5121-5137. [PMID: 35635327 PMCID: PMC9703616 DOI: 10.1002/mp.15778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2022] [Revised: 04/25/2022] [Accepted: 05/16/2022] [Indexed: 11/09/2022] Open

Abstract

BACKGROUND

Quantitative measures of dopamine transporter (DaT) uptake in caudate, putamen, and globus pallidus (GP) derived from dopamine transporter-single-photon emission computed tomography (DaT-SPECT) images have potential as biomarkers for measuring the severity of Parkinson's disease. Reliable quantification of this uptake requires accurate segmentation of the considered regions. However, segmentation of these regions from DaT-SPECT images is challenging, a major reason being partial-volume effects (PVEs) in SPECT. The PVEs arise from two sources, namely the limited system resolution and reconstruction of images over finite-sized voxel grids. The limited system resolution results in blurred boundaries of the different regions. The finite voxel size leads to tissue-fraction effects (TFEs), that is, voxels contain a mixture of regions. Thus, there is an important need for methods that can account for the PVEs, including the TFEs, and accurately segment the caudate, putamen, and GP from DaT-SPECT images.

PURPOSE

Design and objectively evaluate a fully automated tissue-fraction estimation-based segmentation method that segments the caudate, putamen, and GP from DaT-SPECT images.

METHODS

The proposed method estimates the posterior mean of the fractional volumes occupied by the caudate, putamen, and GP within each voxel of a three-dimensional DaT-SPECT image. The estimate is obtained by minimizing a cost function based on the binary cross-entropy loss between the true and estimated fractional volumes over a population of SPECT images, where the distribution of true fractional volumes is obtained from existing populations of clinical magnetic resonance images. The method is implemented using a supervised deep-learning-based approach.
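The cost function named above can be sketched for a handful of voxels. This is only an illustration of binary cross-entropy with fractional (non-binary) targets, assuming each value is the fraction of a voxel occupied by one region; the function name, the `eps` clamp, and the toy values are assumptions, and the sketch does not reproduce the authors' network or training procedure.

```python
import math


def binary_cross_entropy(true_fracs, est_fracs, eps=1e-12):
    """Mean binary cross-entropy between true and estimated fractional volumes.

    Both inputs are sequences of values in [0, 1], one per voxel.
    Estimates are clamped away from 0 and 1 so the logarithms stay finite.
    """
    total = 0.0
    for v, v_hat in zip(true_fracs, est_fracs):
        v_hat = min(max(v_hat, eps), 1.0 - eps)
        total -= v * math.log(v_hat) + (1.0 - v) * math.log(1.0 - v_hat)
    return total / len(true_fracs)


# Hypothetical fractions of three voxels occupied by one region.
true_fracs = [0.0, 0.3, 1.0]

loss_exact = binary_cross_entropy(true_fracs, [0.0, 0.3, 1.0])
loss_off = binary_cross_entropy(true_fracs, [0.0, 0.5, 1.0])
loss_hard = binary_cross_entropy(true_fracs, [0.0, 0.0, 1.0])
print(loss_exact, loss_off, loss_hard)
```

With fractional targets the loss is smallest when the estimate equals the true fraction, so a hard 0/1 label for the mixed voxel is penalized more than a soft estimate that is merely inaccurate.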

RESULTS

Evaluations using clinically guided highly realistic simulation studies show that the proposed method accurately segmented the caudate, putamen, and GP with high mean Dice similarity coefficients of ∼ 0.80 and significantly outperformed (p < 0.01) all other considered segmentation methods. Further, an objective evaluation of the proposed method on the task of quantifying regional uptake shows that the method yielded reliable quantification with low ensemble normalized root mean square error (NRMSE) < 20% for all the considered regions. In particular, the method yielded an even lower ensemble NRMSE of ∼ 10% for the caudate and putamen.

CONCLUSIONS

The proposed tissue-fraction estimation-based segmentation method for DaT-SPECT images demonstrated the ability to accurately segment the caudate, putamen, and GP, and reliably quantify the uptake within these regions. The results motivate further evaluation of the method with physical-phantom and patient studies.


Liu Z, Li Z, Mhlanga JC, Siegel BA, Jha AK. No-gold-standard evaluation of quantitative imaging methods in the presence of correlated noise. PROCEEDINGS OF SPIE--THE INTERNATIONAL SOCIETY FOR OPTICAL ENGINEERING 2022;12035:120350M. [PMID: 36465994 PMCID: PMC9717481 DOI: 10.1117/12.2605762] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/17/2023]

Abstract

Objective evaluation of quantitative imaging (QI) methods with patient data is highly desirable, but is hindered by the lack or unreliability of an available gold standard. To address this issue, techniques that can evaluate QI methods without access to a gold standard are being actively developed. These techniques assume that the true and measured values are linearly related by a slope, bias, and Gaussian-distributed noise term, where the noise between measurements made by different methods is independent of each other. However, this noise arises in the process of measuring the same quantitative value, and thus can be correlated. To address this limitation, we propose a no-gold-standard evaluation (NGSE) technique that models this correlated noise by a multi-variate Gaussian distribution parameterized by a covariance matrix. We derive a maximum-likelihood-based approach to estimate the parameters that describe the relationship between the true and measured values, without any knowledge of the true values. We then use the estimated slopes and diagonal elements of the covariance matrix to compute the noise-to-slope ratio (NSR) to rank the QI methods on the basis of precision. The proposed NGSE technique was evaluated with multiple numerical experiments. Our results showed that the technique reliably estimated the NSR values and yielded accurate rankings of the considered methods for 83% of 160 trials. In particular, the technique correctly identified the most precise method for ∼ 97% of the trials. Overall, this study demonstrates the efficacy of the NGSE technique to accurately rank different QI methods when correlated noise is present, and without access to any knowledge of the ground truth. The results motivate further validation of this technique with realistic simulation studies and patient data.
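The ranking step described in this abstract can be sketched as follows, assuming the slopes and the noise covariance matrix have already been estimated (the maximum-likelihood estimation itself is not shown). The noise-to-slope ratio of each method is taken here as the square root of its diagonal covariance element divided by its slope, consistent with the description above; the function names and numerical values are hypothetical, and slopes are assumed positive.

```python
import math


def noise_to_slope_ratios(slopes, covariance):
    """NSR per method: noise standard deviation divided by slope.

    `covariance` is a square matrix (list of lists); only its diagonal,
    the per-method noise variances, enters the NSR.
    """
    return [math.sqrt(covariance[k][k]) / slopes[k] for k in range(len(slopes))]


def rank_by_precision(slopes, covariance):
    """Method indices ordered from most precise (lowest NSR) to least precise."""
    nsr = noise_to_slope_ratios(slopes, covariance)
    return sorted(range(len(nsr)), key=lambda k: nsr[k])


# Hypothetical estimates for three methods measuring the same quantity.
slopes = [1.0, 0.8, 1.2]
covariance = [
    [0.04, 0.01, 0.00],
    [0.01, 0.09, 0.02],
    [0.00, 0.02, 0.01],
]

nsr = noise_to_slope_ratios(slopes, covariance)
ranking = rank_by_precision(slopes, covariance)
print(nsr, ranking)
```

The off-diagonal terms model the correlation between methods and matter during parameter estimation, but they do not appear in the NSR itself.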


Jha AK, Myers KJ, Obuchowski NA, Liu Z, Rahman MA, Saboury B, Rahmim A, Siegel BA. Objective Task-Based Evaluation of Artificial Intelligence-Based Medical Imaging Methods: Framework, Strategies, and Role of the Physician. PET Clin 2021;16:493-511. [PMID: 34537127 DOI: 10.1016/j.cpet.2021.06.013] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Indexed: 12/23/2022]

Madan H, Berlot R, Ray NJ, Pernus F, Spiclin Z. Practical Priors for Bayesian Inference of Latent Biomarkers. IEEE J Biomed Health Inform 2019;24:396-406. [PMID: 31581104 DOI: 10.1109/jbhi.2019.2945077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]

Madan H, Pernuš F, Špiclin Ž. Reference-free error estimation for multiple measurement methods. Stat Methods Med Res 2018;28:2196-2209. [PMID: 29384043 DOI: 10.1177/0962280217754231] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 12/20/2022]

Harmonic subtraction for evaluating right ventricle ejection fraction from planar equilibrium radionuclide angiography. Int J Cardiovasc Imaging 2017;33:1857-1862. [DOI: 10.1007/s10554-017-1164-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2017] [Accepted: 05/08/2017] [Indexed: 10/19/2022]

Osadebey M, Pedersen M, Arnold D, Wendel-Mitoraj K. Bayesian framework inspired no-reference region-of-interest quality measure for brain MRI images. J Med Imaging (Bellingham) 2017. [PMID: 28630885 DOI: 10.1117/1.jmi.4.2.025504] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 11/14/2022] Open

Jha AK, Mena E, Caffo B, Ashrafinia S, Rahmim A, Frey E, Subramaniam RM. Practical no-gold-standard evaluation framework for quantitative imaging methods: application to lesion segmentation in positron emission tomography. J Med Imaging (Bellingham) 2017;4:011011. [PMID: 28331883 DOI: 10.1117/1.jmi.4.1.011011] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 06/01/2016] [Accepted: 02/09/2017] [Indexed: 11/14/2022] Open

Jha AK, Frey E. No-gold-standard evaluation of image-acquisition methods using patient data. PROCEEDINGS OF SPIE--THE INTERNATIONAL SOCIETY FOR OPTICAL ENGINEERING 2017;10136. [PMID: 28596636 DOI: 10.1117/12.2255902] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/14/2022]

Jha AK, Caffo B, Frey EC. A no-gold-standard technique for objective assessment of quantitative nuclear-medicine imaging methods. Phys Med Biol 2016;61:2780-800. [PMID: 26982626 DOI: 10.1088/0031-9155/61/7/2780] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Indexed: 11/11/2022]

Abstract

The objective optimization and evaluation of nuclear-medicine quantitative imaging methods using patient data is highly desirable but often hindered by the lack of a gold standard. Previously, a regression-without-truth (RWT) approach has been proposed for evaluating quantitative imaging methods in the absence of a gold standard, but this approach implicitly assumes that bounds on the distribution of true values are known. Several quantitative imaging methods in nuclear-medicine imaging measure parameters where these bounds are not known, such as the activity concentration in an organ or the volume of a tumor. We extended upon the RWT approach to develop a no-gold-standard (NGS) technique for objectively evaluating such quantitative nuclear-medicine imaging methods with patient data in the absence of any ground truth. Using the parameters estimated with the NGS technique, a figure of merit, the noise-to-slope ratio (NSR), can be computed, which can rank the methods on the basis of precision. An issue with NGS evaluation techniques is the requirement of a large number of patient studies. To reduce this requirement, the proposed method explored the use of multiple quantitative measurements from the same patient, such as the activity concentration values from different organs in the same patient. The proposed technique was evaluated using rigorous numerical experiments and using data from realistic simulation studies. The numerical experiments demonstrated that the NSR was estimated accurately using the proposed NGS technique when the bounds on the distribution of true values were not precisely known, thus serving as a very reliable metric for ranking the methods on the basis of precision. In the realistic simulation study, the NGS technique was used to rank reconstruction methods for quantitative single-photon emission computed tomography (SPECT) based on their performance on the task of estimating the mean activity concentration within a known volume of interest. 
Results showed that the proposed technique provided accurate ranking of the reconstruction methods for 97.5% of the 50 noise realizations. Further, the technique was robust to the choice of evaluated reconstruction methods. The simulation study pointed to possible violations of the assumptions made in the NGS technique under clinical scenarios. However, numerical experiments indicated that the NGS technique was robust in ranking methods even when there was some degree of such violation.


Sullivan DC, Obuchowski NA, Kessler LG, Raunig DL, Gatsonis C, Huang EP, Kondratovich M, McShane LM, Reeves AP, Barboriak DP, Guimaraes AR, Wahl RL. Metrology Standards for Quantitative Imaging Biomarkers. Radiology 2015;277:813-25. [PMID: 26267831 PMCID: PMC4666097 DOI: 10.1148/radiol.2015142202] [Citation(s) in RCA: 305] [Impact Index Per Article: 30.5] [Indexed: 02/01/2023]

Affiliation(s)

Daniel C. Sullivan, Nancy A. Obuchowski, Larry G. Kessler, David L. Raunig, Constantine Gatsonis, Erich P. Huang, Marina Kondratovich, Lisa M. McShane, Anthony P. Reeves, Daniel P. Barboriak, Alexander R. Guimaraes, Richard L. Wahl
From the Department of Radiology, Duke University Medical Center, Box 2715, Durham, NC 27710 (D.C.S., D.P.B.); Department of Quantitative Health Sciences, Cleveland Clinic Foundation, Cleveland, Ohio (N.A.O.); Department of Public Health, University of Washington, Seattle, Wash (L.G.K.); Department of Informatics, ICON Medical, Washington, Pa (D.L.R.); Center for Statistical Sciences, Brown University, Providence, RI (C.G.); National Cancer Institute, Bethesda, Md (E.P.H., L.M.M.); Center for Devices and Radiological Health, U.S. Food and Drug Administration, White Oak, Md (M.K.); Department of Electrical and Computer Engineering, Cornell University, Ithaca, NY (A.P.R.); Department of Radiology, Oregon Health & Science University, Portland, Ore (A.R.G.); and Mallinckrodt Institute of Radiology, Washington University School of Medicine, St Louis, Mo (R.L.W.)
For the RSNA-QIBA Metrology Working Group


Lebenberg J, Lalande A, Clarysse P, Buvat I, Casta C, Cochet A, Constantinidès C, Cousty J, de Cesare A, Jehan-Besson S, Lefort M, Najman L, Roullot E, Sarry L, Tilmant C, Frouin F, Garreau M. Improved Estimation of Cardiac Function Parameters Using a Combination of Independent Automated Segmentation Results in Cardiovascular Magnetic Resonance Imaging. PLoS One 2015;10:e0135715. [PMID: 26287691 PMCID: PMC4545395 DOI: 10.1371/journal.pone.0135715] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 04/24/2015] [Accepted: 07/24/2015] [Indexed: 11/18/2022] Open

Affiliation(s)

Jessica Lebenberg Laboratoire d’Imagerie Biomédicale, Institut National de la Santé et de la Recherche Médicale, Centre National de la Recherche Scientifique, Université Pierre et Marie Curie, Paris, France; École Spéciale de Mécanique et d’Électricité-Sudria, Ivry-sur-Seine, France
Alain Lalande Laboratoire Electronique, Informatique et Image, Centre National de la Recherche Scientifique, Université de Bourgogne, Dijon, France
Patrick Clarysse Centre de Recherche en Acquisition et Traitement de l’Image pour la Santé, Centre National de la Recherche Scientifique, Institut National de la Santé et de la Recherche Médicale, Institut National des Sciences Appliquées Lyon, Université de Lyon, Villeurbanne, France
Irene Buvat Unité d’Imagerie Moléculaire In Vivo, Service Hospitalier Frédéric Joliot, Institut National de la Santé et de la Recherche Médicale, Centre National de la Recherche Scientifique, Commissariat à l’Energie Atomique, Université Paris Sud, Orsay, France
Christopher Casta Centre de Recherche en Acquisition et Traitement de l’Image pour la Santé, Centre National de la Recherche Scientifique, Institut National de la Santé et de la Recherche Médicale, Institut National des Sciences Appliquées Lyon, Université de Lyon, Villeurbanne, France
Alexandre Cochet Laboratoire Electronique, Informatique et Image, Centre National de la Recherche Scientifique, Université de Bourgogne, Dijon, France
Constantin Constantinidès Laboratoire d’Imagerie Biomédicale, Institut National de la Santé et de la Recherche Médicale, Centre National de la Recherche Scientifique, Université Pierre et Marie Curie, Paris, France; École Spéciale de Mécanique et d’Électricité-Sudria, Ivry-sur-Seine, France
Jean Cousty Laboratoire d’Informatique Gaspard Monge, Centre National de la Recherche Scientifique, Université Paris-Est Marne-la-Vallée, École Supérieure d’Ingénieurs en Électrotechnique et Électronique, Marne-la-Vallée, France
Alain de Cesare Laboratoire d’Imagerie Biomédicale, Institut National de la Santé et de la Recherche Médicale, Centre National de la Recherche Scientifique, Université Pierre et Marie Curie, Paris, France
Stephanie Jehan-Besson Groupe de Recherche en Informatique, Image, Automatique et Instrumentation de Caen, Centre National de la Recherche Scientifique, Caen, France
Muriel Lefort Laboratoire d’Imagerie Biomédicale, Institut National de la Santé et de la Recherche Médicale, Centre National de la Recherche Scientifique, Université Pierre et Marie Curie, Paris, France
Laurent Najman Laboratoire d’Informatique Gaspard Monge, Centre National de la Recherche Scientifique, Université Paris-Est Marne-la-Vallée, École Supérieure d’Ingénieurs en Électrotechnique et Électronique, Marne-la-Vallée, France
Elodie Roullot École Spéciale de Mécanique et d’Électricité-Sudria, Ivry-sur-Seine, France
Laurent Sarry Image Science for Interventional Techniques, Centre National de la Recherche Scientifique, Université d’Auvergne, Clermont-Ferrand, France
Christophe Tilmant Institut Pascal, Centre National de la Recherche Scientifique, Université Blaise Pascal, Clermont-Ferrand, France
Frederique Frouin Unité d’Imagerie Moléculaire In Vivo, Service Hospitalier Frédéric Joliot, Institut National de la Santé et de la Recherche Médicale, Centre National de la Recherche Scientifique, Commissariat à l’Energie Atomique, Université Paris Sud, Orsay, France
Mireille Garreau Laboratoire de Traitement du Signal et des Images, Institut National de la Santé et de la Recherche Médicale, Université de Rennes, Rennes, France

Branscum AJ, Johnson WO, Hanson TE, Baron AT. Flexible regression models for ROC and risk analysis, with or without a gold standard. Stat Med 2015;34:3997-4015. [DOI: 10.1002/sim.6610] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 12/16/2014] [Accepted: 07/06/2015] [Indexed: 11/07/2022]

Jha AK, Song N, Caffo B, Frey EC. Objective evaluation of reconstruction methods for quantitative SPECT imaging in the absence of ground truth. Proc SPIE Int Soc Opt Eng 2015;9416:94161K. [PMID: 26430292 DOI: 10.1117/12.2081286] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/14/2022]

Liu H, Wang J, Xu X, Song E, Wang Q, Jin R, Hung CC, Fei B. A robust and accurate center-frequency estimation (RACE) algorithm for improving motion estimation performance of SinMod on tagged cardiac MR images without known tagging parameters. Magn Reson Imaging 2014;32:1139-55. [PMID: 25087857 DOI: 10.1016/j.mri.2014.07.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 01/20/2014] [Revised: 04/30/2014] [Accepted: 07/24/2014] [Indexed: 10/25/2022]

Obuchowski NA, Reeves AP, Huang EP, Wang XF, Buckler AJ, Kim HJG, Barnhart HX, Jackson EF, Giger ML, Pennello G, Toledano AY, Kalpathy-Cramer J, Apanasovich TV, Kinahan PE, Myers KJ, Goldgof DB, Barboriak DP, Gillies RJ, Schwartz LH, Sullivan DC. Quantitative imaging biomarkers: a review of statistical methods for computer algorithm comparisons. Stat Methods Med Res 2014;24:68-106. [PMID: 24919829 DOI: 10.1177/0962280214537390] [Citation(s) in RCA: 123] [Impact Index Per Article: 11.2] [Indexed: 12/28/2022]

Four-Dimensional Image Reconstruction Strategies in Cardiac-Gated and Respiratory-Gated PET Imaging. PET Clin 2012;8:51-67. [PMID: 27157815 DOI: 10.1016/j.cpet.2012.10.005] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.7] [Indexed: 01/27/2023]

Lebenberg J, Buvat I, Lalande A, Clarysse P, Casta C, Cochet A, Constantinides C, Cousty J, de Cesare A, Jehan-Besson S, Lefort M, Najman L, Roullot E, Sarry L, Tilmant C, Garreau M, Frouin F. Nonsupervised ranking of different segmentation approaches: application to the estimation of the left ventricular ejection fraction from cardiac cine MRI sequences. IEEE Trans Med Imaging 2012;31:1651-60. [PMID: 22665506 DOI: 10.1109/tmi.2012.2201737] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Indexed: 06/01/2023]

Jha AK, Kupinski MA, Rodríguez JJ, Stephen RM, Stopeck AT. Task-based evaluation of segmentation algorithms for diffusion-weighted MRI without using a gold standard. Phys Med Biol 2012;57:4425-46. [PMID: 22713231 DOI: 10.1088/0031-9155/57/13/4425] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Indexed: 01/12/2023]

Manatunga AK, Binongo JNG, Taylor AT. Computer-aided diagnosis of renal obstruction: utility of log-linear modeling versus standard ROC and kappa analysis. EJNMMI Res 2011;1:1-8. [PMID: 21935501 PMCID: PMC3175375 DOI: 10.1186/2191-219x-1-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/10/2022] Open

Abstract

Background

The accuracy of computer-aided diagnosis (CAD) software is best evaluated by comparison to a gold standard that represents the true status of disease. In many settings, however, the true status of disease cannot be known, and accuracy is evaluated against the interpretations of an expert panel. Common statistical approaches to evaluate accuracy include receiver operating characteristic (ROC) and kappa analysis, but both of these methods have significant limitations and cannot answer the question of equivalence: Is the CAD performance equivalent to that of an expert? The goal of this study is to show the strength of log-linear analysis over standard ROC and kappa statistics in evaluating the accuracy of computer-aided diagnosis of renal obstruction compared to the diagnosis provided by expert readers.

Methods

Log-linear modeling was used to analyze a previously published database in which ROC and kappa statistics had been used to compare diuresis renography scan interpretations (non-obstructed, equivocal, or obstructed) generated by a renal expert system (RENEX) in 185 kidneys (95 patients) with the independent and consensus scan interpretations of three experts. The experts were blinded to clinical information and prospectively and independently graded each kidney as obstructed, equivocal, or non-obstructed.

Results

Log-linear modeling showed that RENEX and the expert consensus had beyond-chance agreement in both non-obstructed and obstructed readings (both p < 0.0001). Moreover, pairwise agreement between experts and pairwise agreement between each expert and RENEX were not significantly different (p = 0.41, 0.95, 0.81 for the non-obstructed, equivocal, and obstructed categories, respectively). Similarly, the three-way agreement of the three experts and the three-way agreement of two experts and RENEX were not significantly different for the non-obstructed (p = 0.79) and obstructed (p = 0.49) categories.

Conclusion

Log-linear modeling showed that RENEX was equivalent to any expert in rating kidneys, particularly in the obstructed and non-obstructed categories. This conclusion, which could not be derived from the original ROC and kappa analysis, emphasizes and illustrates the role and importance of log-linear modeling in the absence of a gold standard. The log-linear analysis also provides additional evidence that RENEX has the potential to assist in the interpretation of diuresis renography studies.

Zaidi H, El Naqa I. PET-guided delineation of radiation therapy treatment volumes: a survey of image segmentation techniques. Eur J Nucl Med Mol Imaging 2010;37:2165-87. [PMID: 20336455 DOI: 10.1007/s00259-010-1423-3] [Citation(s) in RCA: 227] [Impact Index Per Article: 15.1] [Received: 10/27/2009] [Accepted: 02/20/2010] [Indexed: 12/23/2022]

Jha AK, Kupinski MA, Rodríguez JJ, Stephen RM, Stopeck AT. Evaluating segmentation algorithms for diffusion-weighted MR images: a task-based approach. Proc SPIE Int Soc Opt Eng 2010;7627:76270L. [PMID: 21152379 PMCID: PMC2997747 DOI: 10.1117/12.845515] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022]

Pennello G, Thompson L. Experience with reviewing Bayesian medical device trials. J Biopharm Stat 2008;18:81-115. [PMID: 18161543 DOI: 10.1080/10543400701668274] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.3] [Indexed: 10/22/2022]

Ducreux D, Buvat I, Meder JF, Mikulis D, Crawley A, Fredy D, TerBrugge K, Lasjaunias P, Bittoun J. Perfusion-weighted MR imaging studies in brain hypervascular diseases: comparison of arterial input function extractions for perfusion measurement. AJNR Am J Neuroradiol 2006;27:1059-69. [PMID: 16687543 PMCID: PMC7975726] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/09/2023]

Khalil MM, Elgazzar A, Khalil W. Evaluation of left ventricular ejection fraction by the quantitative algorithms QGS, ECTb, LMC and LVGTF using gated myocardial perfusion SPECT: investigation of relative accuracy. Nucl Med Commun 2006;27:321-32. [PMID: 16531917 DOI: 10.1097/01.mnm.0000202861.67293.95] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Indexed: 11/26/2022]

Abstract

AIM

To compare the quantitative algorithms Emory Cardiac Toolbox (ECTb), quantitative gated SPECT (QGS), layer of maximum counts (LMC), and left ventricular global thickening fraction (LVGTF) using gated myocardial tomography in the calculation of the left ventricular ejection fraction using the regression without truth (RWT) technique.

MATERIALS AND METHODS

Seventy-four consecutive patients (59 males) were included in the study. All patients underwent stress-rest myocardial perfusion SPECT using Tc-tetrofosmin. Analysis of variance (ANOVA), the paired Student's t-test, the Pearson correlation coefficient and Bland-Altman analysis were used to compare the methods. Relative accuracy was assessed by RWT.

RESULTS

ANOVA revealed a significant difference among the methods in calculating the ejection fraction. RWT showed that ECTb and QGS outperformed the other two methods. The ECTb was slightly better than QGS, and LMC was slightly better than LVGTF. QGS and ECTb achieved good correlations in end diastolic volume, end systolic volume and ejection fraction measurements. One-way ANOVA demonstrated that QGS was the only software program affected by the category of the perfusion summed stress score (SSS), P=0.038. The ejection fraction determined by the QGS, ECTb and LVGTF methods correlated significantly with defect size (r=0.545, P<0.0001; r=0.530, P<0.0001; and r=0.419, P<0.0001, respectively), but the LMC method was not significantly correlated (r=0.216, P=0.067).

CONCLUSIONS

There was considerable variation among the quantitative gated SPECT methods in the evaluation of the ejection fraction. RWT revealed that ECTb and QGS outperformed the other two methods with respect to the bias and precision of the measurements. Pair-wise correlations of the four methods ranged from mild to good, with wide limits of agreement. The RWT results provided important information for ranking the quantitative gated SPECT methods.

Kupinski MA, Hoppin JW, Krasnow J, Dahlberg S, Leppo JA, King MA, Clarkson E, Barrett HH. Comparing cardiac ejection fraction estimation algorithms without a gold standard. Acad Radiol 2006;13:329-37. [PMID: 16488845 PMCID: PMC2464280 DOI: 10.1016/j.acra.2005.12.005] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.0] [Received: 09/23/2005] [Revised: 12/01/2005] [Accepted: 12/02/2005] [Indexed: 11/17/2022]

Abstract

RATIONALE AND OBJECTIVES

Imaging and estimation of left ventricular function have major diagnostic and prognostic importance in patients with coronary artery disease. It is vital that the method used to estimate cardiac ejection fraction (EF) allows the observer to best perform this task. To measure task-based performance, one must clearly define the task in question, the observer performing the task, and the patient population being imaged. In this report, the task is to accurately and precisely measure cardiac EF, and the observers are human-assisted computer algorithms that analyze the images and estimate cardiac EF. It is very difficult to measure the performance of an observer by using clinical data because estimation tasks typically lack a gold standard. A solution to this "no-gold-standard" problem, called regression without truth (RWT), was recently proposed.

MATERIALS AND METHODS

Results of three different software packages used to analyze gated, cardiac, and nuclear medicine images, each of which uses a different algorithm to estimate a patient's cardiac EF, are compared. The three methods are the Emory method, Quantitative Gated Single-Photon Emission Computed Tomographic method, and the Wackers-Liu Circumferential Quantification method. The same set of images is used as input to each of the three algorithms. Data were analyzed from the three different algorithms by using RWT to determine which produces the best estimates of cardiac EF in terms of accuracy and precision.

RESULTS AND DISCUSSION

In performing this study, three different consistency checks were developed to ensure that the RWT method is working properly. The Emory method of estimating EF slightly outperformed the other two methods. In addition, the RWT method passed all three consistency checks, garnering confidence in the method and its application to clinical data.
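
For orientation, the RWT idea named in this abstract can be sketched as follows. This is only an illustrative outline, and the notation is an assumption introduced here rather than something given in this listing: $\hat{\theta}_{pm}$ is the EF estimate from method $m$ for patient $p$, $\Theta_p$ is that patient's unknown true EF, and $\mathrm{pr}(\Theta)$ is an assumed parametric distribution of true values across the patient population.

```latex
% Assumed linear relation between each method's estimate and the unknown truth
\hat{\theta}_{pm} = a_m \Theta_p + b_m + \epsilon_{pm},
\qquad \epsilon_{pm} \sim \mathcal{N}\!\left(0, \sigma_m^2\right)

% Likelihood of the observed estimates, with the unknown truth integrated out
L\!\left(\{a_m, b_m, \sigma_m\}\right)
  = \prod_{p=1}^{P} \int \mathrm{pr}(\Theta_p)
    \prod_{m=1}^{M} \frac{1}{\sqrt{2\pi\sigma_m^2}}
    \exp\!\left[-\frac{\left(\hat{\theta}_{pm} - a_m \Theta_p - b_m\right)^2}{2\sigma_m^2}\right]
    \, d\Theta_p
```

Under these assumptions, maximizing $L$ over the slopes $a_m$, intercepts $b_m$, and noise levels $\sigma_m$ needs no gold standard, because the true values are marginalized rather than measured. The fitted parameters then give a basis for comparing methods on accuracy (slope and intercept) and precision (noise level).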

Ruiz-de-Jesus O, Yanez-Suarez O, Jimenez-Angeles L, Vallejo-Venegas E. Software phantom for the synthesis of equilibrium radionuclide ventriculography images. Conf Proc IEEE Eng Med Biol Soc 2006;2006:1085-8. [PMID: 17946442 DOI: 10.1109/iembs.2006.260276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2023]

Hoppin JW, Kupinski MA, Wilson DW, Peterson T, Gershman B, Kastis G, Clarkson E, Furenlid L, Barrett HH. Evaluating Estimation Techniques in Medical Imaging Without a Gold Standard: Experimental Validation. Proc SPIE Int Soc Opt Eng 2003;5034:10.1117/12.480330. [PMID: 26346933 PMCID: PMC4558919 DOI: 10.1117/12.480330] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/14/2022]

Hoppin JW, Kupinski MA, Kastis GA, Clarkson E, Barrett HH. Objective comparison of quantitative imaging modalities without the use of a gold standard. IEEE Trans Med Imaging 2002;21:441-9. [PMID: 12071615 PMCID: PMC3150581 DOI: 10.1109/tmi.2002.1009380] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.3] [Indexed: 05/23/2023]