1
Su J, Li M, Lin Y, Xiong L, Yuan C, Zhou Z, Yan K. Deep learning-driven multi-view multi-task image quality assessment method for chest CT image. Biomed Eng Online 2023; 22:117. [PMID: 38057850 DOI: 10.1186/s12938-023-01183-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2023] [Accepted: 11/27/2023] [Indexed: 12/08/2023] Open
Abstract
BACKGROUND Chest computed tomography (CT) image quality impacts radiologists' diagnoses. Pre-diagnostic image quality assessment is essential but labor-intensive and may have human limitations (fatigue, perceptual biases, and cognitive biases). This study aims to develop and validate a deep learning (DL)-driven multi-view multi-task image quality assessment (M²IQA) method for assessing the quality of chest CT images in patients, to determine if they are suitable for assessing the patient's physical condition. METHODS This retrospective study utilizes and analyzes chest CT images from 327 patients. Among them, 1613 images from 286 patients are used for model training and validation, while the remaining 41 patients are reserved as an additional test set for conducting ablation studies, comparative studies, and observer studies. The M²IQA method is driven by DL technology and employs a multi-view fusion strategy, which incorporates three scanning planes (coronal, axial, and sagittal). It assesses image quality for multiple tasks, including inspiration evaluation, position evaluation, radiation protection evaluation, and artifact evaluation. Four algorithms (pixel threshold, neural statistics, region measurement, and distance measurement) have been proposed, each tailored for specific evaluation tasks, with the aim of optimizing the evaluation performance of the M²IQA method. RESULTS In the additional test set, the M²IQA method achieved 87% precision, 93% sensitivity, 69% specificity, and a 0.90 F1-score. Extensive ablation and comparative studies have demonstrated the effectiveness of the proposed algorithms and the generalization performance of the proposed method across various assessment tasks. CONCLUSION This study develops and validates a DL-driven M²IQA method, complemented by four proposed algorithms. It holds great promise in automating the assessment of chest CT image quality. The performance of this method, as well as the effectiveness of the four algorithms, is demonstrated on an additional test set.
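The F1-score reported in this abstract can be cross-checked against the reported precision and sensitivity, because F1 is the harmonic mean of precision and recall (recall being another name for sensitivity). The following is a minimal Python sketch that assumes only the rounded values quoted in the abstract; the function name `f1_score` is illustrative and does not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (recall == sensitivity)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values quoted in the abstract for the additional test set.
reported_precision = 0.87
reported_sensitivity = 0.93

f1 = f1_score(reported_precision, reported_sensitivity)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.899"
```

The result rounds to 0.90, which agrees with the reported F1-score. Specificity (69%) does not enter the F1 calculation, so it has to be read separately.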
Affiliation(s)
- Jialin Su
  - School of Optoelectronic and Communication Engineering, Xiamen University of Technology, Xiamen, 361024, China
- Meifang Li
  - Department of Medical Imaging, Affiliated Hospital of Putian University, Putian, 351100, China
  - School of Clinical Medicine, Fujian Medical University, Fuzhou, 350122, China
- Yongping Lin
  - School of Optoelectronic and Communication Engineering, Xiamen University of Technology, Xiamen, 361024, China
- Liu Xiong
  - School of Optoelectronic and Communication Engineering, Xiamen University of Technology, Xiamen, 361024, China
- Caixing Yuan
  - Department of Medical Imaging, Affiliated Hospital of Putian University, Putian, 351100, China
- Zhimin Zhou
  - Department of Medical Imaging, Affiliated Hospital of Putian University, Putian, 351100, China
- Kunlong Yan
  - Department of Medical Imaging, Affiliated Hospital of Putian University, Putian, 351100, China
2
Baird GL, Bernstein MH, Atalay MK. Bias, Noise, and Consensus Interpretation in Radiology. Radiology 2022; 305:E69. [PMID: 35972359 DOI: 10.1148/radiol.220055] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/11/2022]
Affiliation(s)
- Grayson L Baird
  - Brown Radiology Human Factors Lab, Department of Diagnostic Imaging, Warren Alpert Medical School of Brown University, Rhode Island Hospital, 593 Eddy St, Providence, RI 02903
- Michael H Bernstein
  - Brown Radiology Human Factors Lab, Department of Diagnostic Imaging, Warren Alpert Medical School of Brown University, Rhode Island Hospital, 593 Eddy St, Providence, RI 02903
- Michael K Atalay
  - Brown Radiology Human Factors Lab, Department of Diagnostic Imaging, Warren Alpert Medical School of Brown University, Rhode Island Hospital, 593 Eddy St, Providence, RI 02903
  - Department of Diagnostic Imaging, Warren Alpert Medical School of Brown University, Providence, RI
3
Assessment of Diagnostic Imaging Sector in Public Hospitals in Northern Jordan. Healthcare (Basel) 2022; 10:healthcare10061136. [PMID: 35742187 PMCID: PMC9223183 DOI: 10.3390/healthcare10061136] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/25/2022] [Revised: 05/30/2022] [Accepted: 06/16/2022] [Indexed: 12/15/2022] Open
Abstract
The most effective diagnostic methods in the medical field are diagnostic imaging techniques such as radiography, computed tomography (CT), magnetic resonance imaging (MRI), and nuclear medicine, which are used to visualize internal body to diagnose it, determine potential treatment, and evaluate and forecast care results. Therefore, the purpose of this research is to assess the diagnostic imaging sector, at three major public hospitals in the northern part of Jordan, according to regional and global requirements. The assessment approach was based on knowledge of the accessibility of diagnostic imaging equipment and its quality assurance and performance, the quantity and efficiency of radiological technologists, and the design of radiology units and medical imaging chambers in many aspects based on the use of two tools, a questionnaire and checklists, to accomplish a comprehensive evaluation. The response rate of radiological technologists was 66%. The assessment reveals a noticeable increase in the number of radiological technologists in general with high academic qualification level. Additionally, the number of diagnostic imaging equipment in Jordan revealed a large deficiency in the population–device balance, and through checklists that evaluated both CT and MRI units, it was revealed that the rate of following global requirements and occupational health and safety (OHS) standards was high. The basic supplies available in both the CT and MRI units alike were high, which indicates the high quality of healthcare provided in Jordan.
4
Stępień I, Oszust M. A Brief Survey on No-Reference Image Quality Assessment Methods for Magnetic Resonance Images. J Imaging 2022; 8:160. [PMID: 35735959 PMCID: PMC9224540 DOI: 10.3390/jimaging8060160] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/28/2022] [Revised: 05/31/2022] [Accepted: 06/01/2022] [Indexed: 02/08/2023] Open
Abstract
No-reference image quality assessment (NR-IQA) methods automatically and objectively predict the perceptual quality of images without access to a reference image. Therefore, due to the lack of pristine images in most medical image acquisition systems, they play a major role in supporting the examination of resulting images and may affect subsequent treatment. Their usage is particularly important in magnetic resonance imaging (MRI) characterized by long acquisition times and a variety of factors that influence the quality of images. In this work, a survey covering recently introduced NR-IQA methods for the assessment of MR images is presented. First, typical distortions are reviewed and then popular NR methods are characterized, taking into account the way in which they describe MR images and create quality models for prediction. The survey also includes protocols used to evaluate the methods and popular benchmark databases. Finally, emerging challenges are outlined along with an indication of the trends towards creating accurate image prediction models.
Affiliation(s)
- Igor Stępień
  - Doctoral School of Engineering and Technical Sciences, Rzeszow University of Technology, al. Powstancow Warszawy 12, 35-959 Rzeszow, Poland
- Mariusz Oszust
  - Department of Computer and Control Engineering, Rzeszow University of Technology, Wincentego Pola 2, 35-959 Rzeszow, Poland
5
Diagnosis Efficacy of Cone-Beam Computed Tomography in Endodontics—A Systematic Review of High-Level-Evidence Studies. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12030938] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/29/2022]
Abstract
Introduction: The integration of clinical inspection and diagnostic imaging forms the basis for endodontic diagnosis, decision making, treatment planning, and outcome assessments. In recent years, CBCT imaging has become a common diagnostic tool in endodontics. CBCT should only be used to ensure that the benefits to the patient exceed the risks. As such, our aim in this study was to evaluate the high level diagnostic efficacy studies and their risk of bias. Methods: A systematic search of the literature was conducted to identify studies evaluating the use of CBCT imaging in endodontics. The following databases were searched: Medline (PubMed), Scopus, and Cochrane Central. The identified studies were subjected to rigorous inclusion criteria. Studies considered as having a high efficacy level were then subjected to a risk of bias assessment using the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy. Results: Initially, 1568 articles were identified for possible inclusion in the review. Following title and abstract assessment, duplicate removal, and a full-text evaluation, 22 studies were included. Of those studies, 2 had a low risk of bias and 20 had a high risk of bias. Six studies investigated non-surgical treatment, eight investigated surgical treatment, two investigated both non-surgical and surgical treatment, and six studies investigated diagnostic thinking or decision making. Conclusion: The evidence for the influence of CBCT on decision making and treatment outcomes in endodontics is predominantly based on studies with a high risk of bias.
6
Lévêque L, Outtas M, Liu H, Zhang L. Comparative study of the methodologies used for subjective medical image quality assessment. Phys Med Biol 2021; 66. [PMID: 34225264 DOI: 10.1088/1361-6560/ac1157] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 02/08/2021] [Accepted: 07/05/2021] [Indexed: 11/12/2022]
Abstract
Healthcare professionals have been increasingly viewing medical images and videos in their routine clinical practice, and this in a wide variety of environments. Both the perception and interpretation of medical visual information, across all branches of practice or medical specialties (e.g. diagnostic, therapeutic, or surgical medicine), career stages, and practice settings (e.g. emergency care), appear to be critical for patient care. However, medical images and videos are not self-explanatory and, therefore, need to be interpreted by humans, i.e. medical experts. In addition, various types of degradations and artifacts may appear during image acquisition or processing, and consequently affect medical imaging data. Such distortions tend to impact viewers' quality of experience, as well as their clinical practice. It is accordingly essential to better understand how medical experts perceive the quality of visual content. Thankfully, progress has been made in the recent literature towards such understanding. In this article, we present an up-to-date state-of-the-art of relatively recent (i.e. no more than ten years old) existing studies on the subjective quality assessment of medical images and videos, as well as research works using task-based approaches. Furthermore, we discuss the merits and drawbacks of the methodologies used, and we provide recommendations about experimental designs and statistical processes to evaluate the perception of medical images and videos for future studies, which could then be used to optimise the visual experience of image readers in real clinical practice. Finally, we tackle the issue of the lack of available annotated medical image and video quality databases, which appear to be indispensable for the development of new dedicated objective metrics.
Affiliation(s)
- Lucie Lévêque
  - Nantes Laboratory of Digital Sciences (LS2N), University of Nantes, Nantes, France
- Meriem Outtas
  - Department of Industrial Computer Science and Electronics, National Institute of Applied Sciences, Rennes, France
- Hantao Liu
  - School of Computer Science and Informatics, Cardiff University, Cardiff, United Kingdom
- Lu Zhang
  - Department of Industrial Computer Science and Electronics, National Institute of Applied Sciences, Rennes, France
7
Image Quality Assessment to Emulate Experts’ Perception in Lumbar MRI Using Machine Learning. APPLIED SCIENCES-BASEL 2021. [DOI: 10.3390/app11146616] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/29/2022]
Abstract
Medical image quality is crucial to obtaining reliable diagnostics. Most quality controls rely on routine tests using phantoms, which do not reflect closely the reality of images obtained on patients and do not reflect directly the quality perceived by radiologists. The purpose of this work is to develop a method that classifies the image quality perceived by radiologists in MR images. The focus was set on lumbar images as they are widely used with different challenges. Three neuroradiologists evaluated the image quality of a dataset that included T1-weighting images in axial and sagittal orientation, and sagittal T2-weighting. In parallel, we introduced the computational assessment using a wide range of features extracted from the images, then fed them into a classifier system. A total of 95 exams were used, from our local hospital and a public database, and part of the images was manipulated to broaden the distribution quality of the dataset. Good recall of 82% and an area under curve (AUC) of 77% were obtained on average in testing condition, using a Support Vector Machine. Even though the actual implementation still relies on user interaction to extract features, the results are promising with respect to a potential implementation for monitoring image quality online with the acquisition process.
8
9
Obuchowicz R, Piórkowski A, Urbanik A, Strzelecki M. Influence of Acquisition Time on MR Image Quality Estimated with Nonparametric Measures Based on Texture Features. BIOMED RESEARCH INTERNATIONAL 2019; 2019:3706581. [PMID: 31828100 PMCID: PMC6886329 DOI: 10.1155/2019/3706581] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 06/12/2019] [Revised: 08/06/2019] [Accepted: 09/01/2019] [Indexed: 12/21/2022]
Abstract
Correlation of parametrized image texture features (ITF) analyses conducted in different regions of interest (ROIs) overcomes limitations and reliably reflects image quality. The aim of this study is to propose a nonparametrical method and classify the quality of a magnetic resonance (MR) image that has undergone controlled degradation by using textural features in the image. Images of 41 patients, 17 women and 24 men, aged between 23 and 56 years were analyzed. T2-weighted sagittal sequences of the lumbar spine, cervical spine, and knee and T2-weighted coronal sequences of the shoulder and wrist were generated. The implementation of parallel imaging with the use of GRAPPA2, GRAPPA3, and GRAPPA4 led to a substantial reduction in the scanning time but also degraded image quality. The number of degraded image textural features was correlated with the scanning time. Longer scan times correlated with markedly higher ITF image persistence in comparison with images computed with reduced scan times. Higher ITF preservation was observed in images of bones in the spine and femur as compared to images of soft tissues, i.e., tendons and muscles. Finally, a nonparametrized image quality assessment based on an analysis of the ITF, computed for different tissues, correlating with the changes in acquisition time of the MR images, was successfully developed. The correlation between acquisition time and the number of reproducible features present in an MR image was found to yield the necessary assumptions to calculate the quality index.
Affiliation(s)
- Rafał Obuchowicz
  - Department of Diagnostic Imaging, Jagiellonian University Medical College, Kraków 31-501, Poland
- Adam Piórkowski
  - Department of Biocybernetics and Biomedical Engineering, AGH University of Science and Technology, Kraków 30-059, Poland
- Andrzej Urbanik
  - Department of Diagnostic Imaging, Jagiellonian University Medical College, Kraków 31-501, Poland
- Michał Strzelecki
  - Institute of Electronics, Łódź University of Technology, Łódź 90-924, Poland
10
Shaw CB, Foster BH, Borgese M, Boutin RD, Bateni C, Boonsri P, Bayne CO, Szabo RM, Nayak KS, Chaudhari AJ. Real-time three-dimensional MRI for the assessment of dynamic carpal instability. PLoS One 2019; 14:e0222704. [PMID: 31536561 PMCID: PMC6752861 DOI: 10.1371/journal.pone.0222704] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 04/07/2019] [Accepted: 09/03/2019] [Indexed: 12/11/2022] Open
Abstract
Background Carpal instability is defined as a condition where wrist motion and/or loading creates mechanical dysfunction, resulting in weakness, pain and decreased function. When conventional methods do not identify the instability patterns, yet clinical signs of instability exist, the diagnosis of dynamic instability is often suggested to describe carpal derangement manifested only during the wrist’s active motion or stress. We addressed the question: can advanced MRI techniques provide quantitative means to evaluate dynamic carpal instability and supplement standard static MRI acquisition? Our objectives were to (i) develop a real-time, three-dimensional MRI method to image the carpal joints during their active, uninterrupted motion; and (ii) demonstrate feasibility of the method for assessing metrics relevant to dynamic carpal instability, thus overcoming limitations of standard MRI. Methods Twenty wrists (bilateral wrists of ten healthy participants) were scanned during radial-ulnar deviation and clenched-fist maneuvers. Images resulting from two real-time MRI pulse sequences, four sparse data-acquisition schemes, and three constrained image reconstruction techniques were compared. Image quality was assessed via blinded scoring by three radiologists and quantitative imaging metrics. Results Real-time MRI data-acquisition employing sparse radial sampling with a gradient-recalled-echo acquisition and constrained iterative reconstruction appeared to provide a practical tradeoff between imaging speed (temporal resolution up to 135 ms per slice) and image quality. The method effectively reduced streaking artifacts arising from data undersampling and enabled the derivation of quantitative measures pertinent to evaluating dynamic carpal instability. Conclusion This study demonstrates that real-time, three-dimensional MRI of the moving wrist is feasible and may be useful for the evaluation of dynamic carpal instability.
Affiliation(s)
- Calvin B. Shaw
  - Department of Radiology, University of California Davis, Sacramento, California, United States of America
- Brent H. Foster
  - Department of Biomedical Engineering, University of California Davis, Davis, California, United States of America
- Marissa Borgese
  - Department of Radiology, University of California Davis, Sacramento, California, United States of America
- Robert D. Boutin
  - Department of Radiology, University of California Davis, Sacramento, California, United States of America
- Cyrus Bateni
  - Department of Radiology, University of California Davis, Sacramento, California, United States of America
- Pattira Boonsri
  - Department of Radiology, University of California Davis, Sacramento, California, United States of America
- Christopher O. Bayne
  - Department of Orthopaedic Surgery, University of California Davis, Sacramento, California, United States of America
- Robert M. Szabo
  - Department of Orthopaedic Surgery, University of California Davis, Sacramento, California, United States of America
- Krishna S. Nayak
  - Ming Hsieh Department of Electrical Engineering, University of Southern California, Los Angeles, California, United States of America
- Abhijit J. Chaudhari
  - Department of Radiology, University of California Davis, Sacramento, California, United States of America
11
Sheridan H, Reingold EM. The Holistic Processing Account of Visual Expertise in Medical Image Perception: A Review. Front Psychol 2017; 8:1620. [PMID: 29033865 PMCID: PMC5627012 DOI: 10.3389/fpsyg.2017.01620] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 01/05/2017] [Accepted: 09/04/2017] [Indexed: 12/11/2022] Open
Abstract
In the field of medical image perception, the holistic processing perspective contends that experts can rapidly extract global information about the image, which can be used to guide their subsequent search of the image (Swensson, 1980; Nodine and Kundel, 1987; Kundel et al., 2007). In this review, we discuss the empirical evidence supporting three different predictions that can be derived from the holistic processing perspective: Expertise in medical image perception is domain-specific, experts use parafoveal and/or peripheral vision to process large regions of the image in parallel, and experts benefit from a rapid initial glimpse of an image. In addition, we discuss a pivotal recent study (Litchfield and Donovan, 2016) that seems to contradict the assumption that experts benefit from a rapid initial glimpse of the image. To reconcile this finding with the existing literature, we suggest that global processing may serve multiple functions that extend beyond the initial glimpse of the image. Finally, we discuss future research directions, and we highlight the connections between the holistic processing account and similar theoretical perspectives and findings from other domains of visual expertise.
Affiliation(s)
- Heather Sheridan
  - Department of Psychology, University at Albany, State University of New York, Albany, NY, United States
- Eyal M. Reingold
  - Department of Psychology, University of Toronto, Mississauga, ON, Canada
12
Geist JR. The efficacy of diagnostic imaging should guide oral and maxillofacial radiology research. Oral Surg Oral Med Oral Pathol Oral Radiol 2017; 124:211-213. [PMID: 28698118 DOI: 10.1016/j.oooo.2017.06.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/30/2017] [Accepted: 06/02/2017] [Indexed: 01/14/2023]
Affiliation(s)
- James R Geist
  - Editor, OMR Section, Professor, Department of Biomedical and Diagnostic Sciences, University of Detroit Mercy School of Dentistry, Detroit, MI, USA
13
14
15
Levenson RM, Krupinski EA, Navarro VM, Wasserman EA. Pigeons (Columba livia) as Trainable Observers of Pathology and Radiology Breast Cancer Images. PLoS One 2015; 10:e0141357. [PMID: 26581091 PMCID: PMC4651348 DOI: 10.1371/journal.pone.0141357] [Citation(s) in RCA: 49] [Impact Index Per Article: 5.4] [Received: 08/25/2015] [Accepted: 10/07/2015] [Indexed: 02/07/2023] Open
Abstract
Pathologists and radiologists spend years acquiring and refining their medically essential visual skills, so it is of considerable interest to understand how this process actually unfolds and what image features and properties are critical for accurate diagnostic performance. Key insights into human behavioral tasks can often be obtained by using appropriate animal models. We report here that pigeons (Columba livia)—which share many visual system properties with humans—can serve as promising surrogate observers of medical images, a capability not previously documented. The birds proved to have a remarkable ability to distinguish benign from malignant human breast histopathology after training with differential food reinforcement; even more importantly, the pigeons were able to generalize what they had learned when confronted with novel image sets. The birds’ histological accuracy, like that of humans, was modestly affected by the presence or absence of color as well as by degrees of image compression, but these impacts could be ameliorated with further training. Turning to radiology, the birds proved to be similarly capable of detecting cancer-relevant microcalcifications on mammogram images. However, when given a different (and for humans quite difficult) task—namely, classification of suspicious mammographic densities (masses)—the pigeons proved to be capable only of image memorization and were unable to successfully generalize when shown novel examples. The birds’ successes and difficulties suggest that pigeons are well-suited to help us better understand human medical image perception, and may also prove useful in performance assessment and development of medical imaging hardware, image processing, and image analysis tools.
Affiliation(s)
- Richard M Levenson
  - Department of Pathology and Laboratory Medicine, University of California Davis Medical Center, Sacramento, California, United States of America
- Elizabeth A Krupinski
  - Department of Radiology & Imaging Sciences, College of Medicine, Emory University, Atlanta, Georgia, United States of America
- Victor M Navarro
  - Department of Psychological and Brain Sciences, The University of Iowa, Iowa City, Iowa, United States of America
- Edward A Wasserman
  - Department of Psychological and Brain Sciences, The University of Iowa, Iowa City, Iowa, United States of America
16
The Diagnostic Efficacy of Cone-beam Computed Tomography in Endodontics: A Systematic Review and Analysis by a Hierarchical Model of Efficacy. J Endod 2015; 41:1008-14. [DOI: 10.1016/j.joen.2015.02.021] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.8] [Received: 12/03/2014] [Revised: 02/05/2015] [Accepted: 02/14/2015] [Indexed: 01/21/2023]
17
18
Mallett S, Halligan S, Collins GS, Altman DG. Exploration of analysis methods for diagnostic imaging tests: problems with ROC AUC and confidence scores in CT colonography. PLoS One 2014; 9:e107633. [PMID: 25353643 PMCID: PMC4212964 DOI: 10.1371/journal.pone.0107633] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 01/27/2014] [Accepted: 08/19/2014] [Indexed: 11/18/2022] Open
Abstract
BACKGROUND Different methods of evaluating diagnostic performance when comparing diagnostic tests may lead to different results. We compared two such approaches, sensitivity and specificity with area under the Receiver Operating Characteristic Curve (ROC AUC) for the evaluation of CT colonography for the detection of polyps, either with or without computer assisted detection. METHODS In a multireader multicase study of 10 readers and 107 cases we compared sensitivity and specificity, using radiological reporting of the presence or absence of polyps, to ROC AUC calculated from confidence scores concerning the presence of polyps. Both methods were assessed against a reference standard. Here we focus on five readers, selected to illustrate issues in design and analysis. We compared diagnostic measures within readers, showing that differences in results are due to statistical methods. RESULTS Reader performance varied widely depending on whether sensitivity and specificity or ROC AUC was used. There were problems using confidence scores; in assigning scores to all cases; in use of zero scores when no polyps were identified; the bimodal non-normal distribution of scores; fitting ROC curves due to extrapolation beyond the study data; and the undue influence of a few false positive results. Variation due to use of different ROC methods exceeded differences between test results for ROC AUC. CONCLUSIONS The confidence scores recorded in our study violated many assumptions of ROC AUC methods, rendering these methods inappropriate. The problems we identified will apply to other detection studies using confidence scores. We found sensitivity and specificity were a more reliable and clinically appropriate method to compare diagnostic tests.
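The two kinds of measure compared in this abstract are computed from different inputs: sensitivity and specificity come from binary present/absent reports, while ROC AUC comes from confidence scores. The sketch below illustrates that difference in Python with made-up toy data (not taken from the study). All names are hypothetical, and the AUC here is the simple rank-based (Mann-Whitney) estimate, which is an assumption for illustration and not necessarily any of the ROC methods the authors compared.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    truth: Sequence[int], reported: Sequence[int]
) -> Tuple[float, float]:
    """Sensitivity and specificity from binary reports (1 = polyp present)."""
    tp = sum(1 for t, r in zip(truth, reported) if t == 1 and r == 1)
    fn = sum(1 for t, r in zip(truth, reported) if t == 1 and r == 0)
    tn = sum(1 for t, r in zip(truth, reported) if t == 0 and r == 0)
    fp = sum(1 for t, r in zip(truth, reported) if t == 0 and r == 1)
    return tp / (tp + fn), tn / (tn + fp)


def rank_auc(truth: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as one half."""
    pos = [s for t, s in zip(truth, scores) if t == 1]
    neg = [s for t, s in zip(truth, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 4 cases with polyps, 4 without.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
reported = [1, 1, 1, 0, 0, 0, 0, 1]          # binary reader reports
scores = [90, 80, 70, 0, 0, 0, 0, 60]        # confidence; 0 when nothing was seen

sens, spec = sensitivity_specificity(truth, reported)
auc = rank_auc(truth, scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc:.5f}")
# prints "sensitivity=0.75 specificity=0.75 AUC=0.84375"
```

In this toy example the missed polyp and the three true negatives all carry a score of zero, so they form tied pairs that each contribute one half to the AUC. This is a small-scale picture of the zero-score issue the abstract lists among the problems with confidence scores.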
Affiliation(s)
- Susan Mallett
  - Department of Primary Care Health Sciences, University of Oxford, Oxford, United Kingdom
- Steve Halligan
  - Centre for Medical Imaging, University College London, London, United Kingdom
- Gary S. Collins
  - Centre for Statistics in Medicine, University of Oxford, Oxford, United Kingdom
- Doug G. Altman
  - Centre for Statistics in Medicine, University of Oxford, Oxford, United Kingdom
19
Kruse C, Spin-Neto R, Wenzel A, Kirkevang LL. Cone beam computed tomography and periapical lesions: a systematic review analysing studies on diagnostic efficacy by a hierarchical model. Int Endod J 2014; 48:815-28. [DOI: 10.1111/iej.12388] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.8] [Received: 06/23/2014] [Accepted: 09/30/2014] [Indexed: 11/27/2022]
Affiliation(s)
- C. Kruse
  - Oral Radiology, Department of Dentistry, Aarhus University, Aarhus, Denmark
- R. Spin-Neto
  - Oral Radiology, Department of Dentistry, Aarhus University, Aarhus, Denmark
- A. Wenzel
  - Oral Radiology, Department of Dentistry, Aarhus University, Aarhus, Denmark
  - Radiology, Department of Dentistry, Copenhagen University, Copenhagen, Denmark
- L.-L. Kirkevang
  - Oral Radiology, Department of Dentistry, Aarhus University, Aarhus, Denmark
  - Department of Endodontics, Institute of Clinical Dentistry, Faculty of Dentistry, Oslo University, Oslo, Norway
20
Observer study of a prototype clinical decision support system for breast cancer diagnosis using dynamic contrast-enhanced MRI. AJR Am J Roentgenol 2013; 200:277-83. [PMID: 23345346 DOI: 10.2214/ajr.12.8718] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/18/2022]
Abstract
OBJECTIVE The purpose of this article is to evaluate the performance of radiologists using a prototype clinical decision support system to diagnose and manage patients with breast cancer based on dynamic contrast-enhanced MRI studies. MATERIALS AND METHODS The study was conducted with three breast radiologists and two breast imaging fellows who gave patient treatment recommendations and confidence ratings, both without and with computer aid. The computer aid presented similar cases from a retrieval database of 192 lesions (96 malignant and 96 benign) for a test set of 97 mass lesions (46 malignant and 51 benign). The performance of each observer was quantified by receiver operating characteristic analysis. The radiologists' confidence in their recommendations was analyzed with respect to the query case pathologic diagnosis, perceived usefulness of the similar cases, and the accuracy of the computer in retrieving cases of the correct diagnosis. The statistical significance in the performance measure differences was determined by using a two-tailed Student t test for paired data. RESULTS For each observer, the area under the receiver operating characteristic curve did not change significantly with the use of the computer aid (from a mean of 0.8 to a mean of 0.8; p = 0.61). The average confidence of three of the five observers increased significantly with the computer aid (from 5.9 to 6.3 [p < 0.001], from 7.0 to 7.2 [p = 0.04], and from 4.4 to 5.4 [p < 0.001], respectively). The confidence change of the radiologists was more frequent and larger for malignant lesions where the computer was correct. However, for benign lesions, even when the computer was correct, the confidence of the radiologists did not necessarily change. CONCLUSION The presentation of similar cases reinforced radiologists' confidence rating in the diagnosis of malignant lesions; however, it did not change their confidence rating for benign lesions or reduce the number of unnecessary biopsies in managing patients with breast cancer using dynamic contrast-enhanced MRI under the limited study conditions.
|
21
|
|
22
|
Muhogora W, Padovani R, Bonutti F, Msaki P, Kazema R. Performance evaluation of three computed radiography systems using methods recommended in American Association of Physicists in Medicine Report 93. J Med Phys 2011; 36:138-46. [PMID: 21897559 PMCID: PMC3159220 DOI: 10.4103/0971-6203.83478] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 10/14/2010] [Revised: 01/25/2011] [Accepted: 01/25/2011] [Indexed: 11/13/2022]
Abstract
The performance of three clinical computed radiography (CR) systems (Agfa CR 75 with CRMD 4.0 image plates, Kodak CR 850 with Kodak GP plates, and Kodak CR 850A with Kodak GP plates) was evaluated using six tests recommended in American Association of Physicists in Medicine Report 93. The results indicated variable performance, with the majority being within acceptable limits. The variations were mainly attributed to differences in detector formulations, plate readers' characteristics, and aging effects. The differences in the mean low contrast scores between the imaging systems for three observers were statistically significant between Agfa and Kodak CR 850A (P=0.009) and between the Kodak CR systems (P=0.006), probably because of the differences in age. However, the differences were not statistically significant between Agfa and Kodak CR 850 (P=0.284), suggesting similar perceived image quality. The study demonstrates the need to implement a quality control program regularly.
|
23
|
Efficacy of clinical gait analysis: A systematic review. Gait Posture 2011; 34:149-53. [PMID: 21646022 DOI: 10.1016/j.gaitpost.2011.03.027] [Citation(s) in RCA: 176] [Impact Index Per Article: 13.5] [Received: 10/25/2010] [Revised: 02/03/2011] [Accepted: 03/10/2011] [Indexed: 02/02/2023]
Abstract
The aim of this systematic review was to evaluate and summarize the current evidence base related to the clinical efficacy of gait analysis. A literature review was conducted to identify references related to human gait analysis published between January 2000 and September 2009 plus relevant older references. The references were assessed independently by four reviewers using a hierarchical model of efficacy adapted for gait analysis, and final scores were agreed upon by at least three of the four reviewers. 1528 references were identified relating to human instrumented gait analysis. Of these, 116 original articles addressed technical accuracy efficacy, 89 addressed diagnostic accuracy efficacy, 11 addressed diagnostic thinking and treatment efficacy, seven addressed patient outcomes efficacy, and one addressed societal efficacy, with some of the articles addressing multiple levels of efficacy. This body of literature provides strong evidence for the technical, diagnostic accuracy, diagnostic thinking and treatment efficacy of gait analysis. The existing evidence also indicates efficacy at the higher levels of patient outcomes and societal cost-effectiveness, but this evidence is more sparse and does not include any randomized controlled trials. Thus, the current evidence supports the clinical efficacy of gait analysis, particularly at the lower levels of efficacy, but additional research is needed to strengthen the evidence base at the higher levels of efficacy.
|
24
|
Abstract
Medical images constitute a core portion of the information a physician utilizes to render diagnostic and treatment decisions. At a fundamental level, this diagnostic process involves two basic processes: visually inspecting the image (visual perception) and rendering an interpretation (cognition). The likelihood of error in the interpretation of medical images is, unfortunately, not negligible. Errors do occur, and patients' lives are impacted, underscoring our need to understand how physicians interact with the information in an image during the interpretation process. With improved understanding, we can develop ways to further improve decision making and, thus, to improve patient care. The science of medical image perception is dedicated to understanding and improving the clinical interpretation process.
|
25
|
Jensen A, Geller BM, Gard CC, Miglioretti DL, Yankaskas B, Carney PA, Rosenberg RD, Vejborg I, Lynge E. Performance of diagnostic mammography differs in the United States and Denmark. Int J Cancer 2010; 127:1905-12. [PMID: 20104518 DOI: 10.1002/ijc.25198] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/11/2022]
Abstract
Diagnostic mammography is the primary imaging modality to diagnose breast cancer. However, few studies have evaluated variability in diagnostic mammography performance in communities, and none has done so between countries. We compared diagnostic mammography performance in community-based settings in the United States and Denmark. The performance of 93,585 diagnostic mammograms from 180 facilities contributing data to the US Breast Cancer Surveillance Consortium (BCSC) from 1999 to 2001 was compared to that of all 51,313 diagnostic mammograms performed at Danish clinics in 2000. We used the imaging workup's final assessment to determine sensitivity, specificity and an estimate of accuracy: area under the receiver operating characteristic (ROC) curve (AUC). Diagnostic mammography had slightly higher sensitivity in the United States (85%) than in Denmark (82%). In contrast, it had higher specificity in Denmark (99%) than in the United States (93%). The AUC was high in both countries: 0.91 in the United States and 0.95 in Denmark. Denmark's higher accuracy may result from supplementary ultrasound examinations, which are provided to 74% of Danish women but only 37% to 52% of US women. In addition, Danish mammography facilities specialize in either diagnosis or screening, possibly leading to greater diagnostic mammography expertise in facilities dedicated to symptomatic patients. Performance of community-based diagnostic mammography settings varied markedly between the 2 countries, indicating that it can be further optimized.
Affiliation(s)
- Allan Jensen
- Institute of Public Health, University of Copenhagen, Copenhagen K, Denmark.
|
26
|
Reed W, Poulos A, Rickard M, Brennan P. Reader practice in mammography screen reporting in Australia. J Med Imaging Radiat Oncol 2010; 53:530-7. [PMID: 20002284 DOI: 10.1111/j.1754-9485.2009.02119.x] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
Abstract
Reader variability is a problem in mammography image reporting and compromises the efficacy of screening programmes. The purpose of this exploratory study was to survey reader practice in reporting screening mammograms in Australia to identify aspects of practice that warrant further investigation. Mammography reporting practice and influences on concentration and attention were investigated by using an original questionnaire distributed to screen readers in Australia. A response rate of 71% (83 out of 117) was achieved. Demographic data indicated that the majority of readers were over 46 years of age (73%), have been reporting on screening mammograms for over 10 years (61%), take less than 1 min to report upon a screening mammogram examination (66%), report up to 200 examinations in a single session (83%) and take up to 2 h to report one session (61%). A majority report on more than 5000 examinations annually (66%); 93% of participants regard their search strategy as systematic, 87% agreed that their concentration can vary throughout a session, 64% agreed that the relatively low number of positives can lead to lapses in concentration and attention and almost all (94%) participants agreed that methods to maximise concentration should be explored. Participants identified a range of influences on concentration within their working environment including volume of images reported in one session, image types and aspects of the physical environment. This study has provided important evidence of the need to investigate adverse influences on concentration during mammography screen reporting.
Affiliation(s)
- W Reed
- Discipline of Medical Radiation Sciences, Faculty of Health Sciences, The University of Sydney, Sydney, New South Wales, Australia.
|
27
|
Kim KJ, Lee KH, Kang HS, Kim SY, Kim YH, Kim B, Seo J, Mantiuk R. Objective index of image fidelity for JPEG2000 compressed body CT images. Med Phys 2009; 36:3218-26. [DOI: 10.1118/1.3129159] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 11/07/2022]
|
28
|
Kim B, Lee KH, Kim KJ, Richter T, Kang HS, Kim SY, Kim YH, Seo J. JPEG2000 3D compression vs 2D compression: An assessment of artifact amount and computing time in compressing thin-section abdomen CT images. Med Phys 2009; 36:835-44. [DOI: 10.1118/1.3075824] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 11/07/2022]
|
29
|
Performance assessments of diagnostic systems under the FROC paradigm: experimental, analytical, and results interpretation issues. Acad Radiol 2008; 15:1312-5. [PMID: 18790403 DOI: 10.1016/j.acra.2008.05.006] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 04/22/2008] [Revised: 05/22/2008] [Accepted: 04/29/2008] [Indexed: 11/22/2022]
Abstract
As use of free response receiver-operating characteristic (FROC) curves gains more acceptance for quantitatively assessing the performance of diagnostic systems, it is important that the experimentalist understands the possible role of this approach as one of the experimental design paradigms that are available to him or her among all other approaches as well as some of the issues associated with FROC type studies. In a number of experimental scenarios, the FROC paradigm and associated analytical tools have theoretical and practical advantages over both the binary and the ROC approaches to performance assessments of diagnostic systems, but it also has some limitations related to experimental design, data analyses, clinical relevance, and complexity in the interpretation of the results. These issues are rarely discussed and are the focus of this work.
|
30
|
ROC analysis in medical imaging: a tutorial review of the literature. Radiol Phys Technol 2007; 1:2-12. [PMID: 20821157 DOI: 10.1007/s12194-007-0002-1] [Citation(s) in RCA: 77] [Impact Index Per Article: 4.5] [Received: 09/24/2007] [Accepted: 09/25/2007] [Indexed: 10/22/2022]
|