1. Hoeijmakers EJI, Martens B, Hendriks BMF, Mihl C, Miclea RL, Backes WH, Wildberger JE, Zijta FM, Gietema HA, Nelemans PJ, Jeukens CRLPN. How subjective CT image quality assessment becomes surprisingly reliable: pairwise comparisons instead of Likert scale. Eur Radiol 2024; 34:4494-4503. [PMID: 38165429 PMCID: PMC11213789 DOI: 10.1007/s00330-023-10493-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/26/2023] [Revised: 09/22/2023] [Accepted: 10/29/2023] [Indexed: 01/03/2024]
Abstract
OBJECTIVES The aim of this study is to improve the reliability of subjective image quality (IQ) assessment using a pairwise comparison (PC) method instead of a Likert scale method in abdominal CT scans. METHODS Abdominal CT scans (single-center) were retrospectively selected between September 2019 and February 2020 in a prior study. Sample variance in IQ was obtained by adding artificial noise using dedicated reconstruction software, including reconstructions with filtered backprojection and varying iterative reconstruction strengths. Two datasets (each n = 50) were composed with either higher or lower IQ variation, with the 25 original scans being part of both datasets. Using in-house developed software, six observers (five radiologists, one resident) rated both datasets via both the PC method (forcing observers to choose preferred scans out of pairs of scans, resulting in a ranking) and a 5-point Likert scale. The PC method was optimized using a sorting algorithm to minimize necessary comparisons. The inter- and intraobserver agreements were assessed for both methods with the intraclass correlation coefficient (ICC). RESULTS Twenty-five patients (mean age 61 years ± 15.5; 56% men) were evaluated. The ICC for interobserver agreement for the high-variation dataset increased from 0.665 (95% CI 0.396-0.814) to 0.785 (95% CI 0.676-0.867) when the PC method was used instead of a Likert scale. For the low-variation dataset, the ICC increased from 0.276 (95% CI 0.034-0.500) to 0.562 (95% CI 0.337-0.729). Intraobserver agreement increased for four out of six observers. CONCLUSION The PC method is more reliable for subjective IQ assessment, as indicated by improved inter- and intraobserver agreement. CLINICAL RELEVANCE STATEMENT This study shows that the pairwise comparison method is a more reliable method for subjective image quality assessment. Improved reliability is of key importance for optimization studies, validation of automatic image quality assessment algorithms, and training of AI algorithms. KEY POINTS • Subjective assessment of diagnostic image quality via Likert scale has limited reliability. • A pairwise comparison method improves the inter- and intraobserver agreement. • The pairwise comparison method is more reliable for CT optimization studies.
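The abstract describes turning forced pairwise choices into a ranking, with a sorting algorithm keeping the number of comparisons low. The sketch below illustrates that idea only; the abstract does not name the algorithm used, so Python's built-in sort is swapped in here, and the `prefer` callback, the scan names, and the noise values are hypothetical.

```python
from functools import cmp_to_key


def rank_by_pairwise_choice(scans, prefer):
    """Rank scans from most to least preferred using only pairwise choices.

    `prefer(a, b)` is a hypothetical callback that returns True when the
    observer prefers scan `a` over scan `b`.
    """
    comparisons = 0

    def compare(a, b):
        nonlocal comparisons
        comparisons += 1  # each call stands for one pair shown to the observer
        return -1 if prefer(a, b) else 1

    ranking = sorted(scans, key=cmp_to_key(compare))
    return ranking, comparisons


# Simulated observer (assumption): always prefers the scan with less noise.
noise = {"scan_a": 12.0, "scan_b": 8.5, "scan_c": 20.1, "scan_d": 15.3}
ranking, n_comparisons = rank_by_pairwise_choice(
    list(noise), lambda a, b: noise[a] < noise[b]
)
print(ranking, n_comparisons)
```

A comparison-based sort needs on the order of n log n pairs rather than all n(n-1)/2, which is presumably why a sorting algorithm reduces observer workload. A real observer may be inconsistent between pairs, which this deterministic simulation does not model.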
Affiliation(s)
- Eva J I Hoeijmakers
  - Department of Radiology and Nuclear Medicine, Maastricht University Medical Centre+, P. Debyelaan 25, Maastricht, 6229 HX, The Netherlands
- Bibi Martens
  - Department of Radiology and Nuclear Medicine, Maastricht University Medical Centre+, P. Debyelaan 25, Maastricht, 6229 HX, The Netherlands
  - CARIM School for Cardiovascular Diseases, Maastricht University, Universiteitssingel 50, Maastricht, 6229 ER, The Netherlands
- Babs M F Hendriks
  - Department of Radiology and Nuclear Medicine, Maastricht University Medical Centre+, P. Debyelaan 25, Maastricht, 6229 HX, The Netherlands
  - CARIM School for Cardiovascular Diseases, Maastricht University, Universiteitssingel 50, Maastricht, 6229 ER, The Netherlands
- Casper Mihl
  - Department of Radiology and Nuclear Medicine, Maastricht University Medical Centre+, P. Debyelaan 25, Maastricht, 6229 HX, The Netherlands
  - CARIM School for Cardiovascular Diseases, Maastricht University, Universiteitssingel 50, Maastricht, 6229 ER, The Netherlands
- Razvan L Miclea
  - Department of Radiology and Nuclear Medicine, Maastricht University Medical Centre+, P. Debyelaan 25, Maastricht, 6229 HX, The Netherlands
- Walter H Backes
  - Department of Radiology and Nuclear Medicine, Maastricht University Medical Centre+, P. Debyelaan 25, Maastricht, 6229 HX, The Netherlands
  - Department of Neurology and School for Mental health and Neuroscience (MheNs), Maastricht University Medical Centre+, P. Debyelaan 25, Maastricht, 6229 HX, The Netherlands
- Joachim E Wildberger
  - Department of Radiology and Nuclear Medicine, Maastricht University Medical Centre+, P. Debyelaan 25, Maastricht, 6229 HX, The Netherlands
  - CARIM School for Cardiovascular Diseases, Maastricht University, Universiteitssingel 50, Maastricht, 6229 ER, The Netherlands
- Frank M Zijta
  - Department of Radiology and Nuclear Medicine, Maastricht University Medical Centre+, P. Debyelaan 25, Maastricht, 6229 HX, The Netherlands
- Hester A Gietema
  - Department of Radiology and Nuclear Medicine, Maastricht University Medical Centre+, P. Debyelaan 25, Maastricht, 6229 HX, The Netherlands
  - GROW School for Oncology and Reproduction, Maastricht University, Universiteitssingel 50, Maastricht, 6229 ER, The Netherlands
- Patricia J Nelemans
  - Department of Epidemiology, Maastricht University, Universiteitssingel 50, Maastricht, 6229 ER, The Netherlands
- Cécile R L P N Jeukens
  - Department of Radiology and Nuclear Medicine, Maastricht University Medical Centre+, P. Debyelaan 25, Maastricht, 6229 HX, The Netherlands
2. Hasan N, Rizk C, Marzooq F, Khan K, AlKhaja M, Babikir E. Assessment of image quality and establishment of local acceptable quality dose for computed tomography based on patient effective diameter. J Med Imaging (Bellingham) 2024; 11:043502. [PMID: 39157448 PMCID: PMC11328147 DOI: 10.1117/1.jmi.11.4.043502] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Revised: 05/20/2024] [Accepted: 07/19/2024] [Indexed: 08/20/2024]
Abstract
Purpose We aim to develop modified clinical indication (CI)-based image quality scoring criteria (IQSC) for assessing image quality (IQ) and establishing acceptable quality doses (AQDs) in adult computed tomography (CT) examinations, based on CIs and patient sizes. Approach CT images, volume CT dose index (CTDIvol), and dose length product (DLP) were collected retrospectively between September 2020 and September 2021 for eight common CIs from two CT scanners at a central hospital in the Kingdom of Bahrain. Using the modified CI-based IQSC and a Likert scale (0 to 4), three radiologists assessed the IQ of each examination. AQDs were then established as the median value of CTDIvol and DLP for images with an average score of 3 and compared to national diagnostic reference levels (NDRLs). Results Out of 581 examinations, 60 were excluded from the study due to average scores above or below 3. The established AQDs were lower than the NDRLs for all CIs, except AQDs/CTDIvol for oncologic follow-up for large patients (28 versus 26 mGy) in scanner A, besides abdominal pain for medium patients (16 versus 15 mGy) and large patients (34 versus 27 mGy), and diverticulitis/appendicitis for medium patients (15 versus 12 mGy) and large patients (33 versus 30 mGy) in scanner B, indicating the need for optimization. Conclusions CI-based IQSC is crucial for IQ assessment and establishing AQDs according to patient size. It identifies stations requiring optimization of patient radiation exposure.
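The AQD rule stated in the abstract (median CTDIvol and DLP over examinations whose average score is 3) can be sketched as follows. The records, the tuple layout, and the function name are hypothetical illustrations, not data or code from the study, and the grouping by clinical indication and patient size is left out for brevity.

```python
from statistics import median

# Hypothetical records: (mean observer score, CTDIvol in mGy, DLP in mGy*cm).
exams = [
    (3.0, 14.0, 620.0),
    (3.0, 16.0, 700.0),
    (2.7, 9.0, 410.0),   # excluded: average score below 3
    (3.0, 15.0, 655.0),
    (3.3, 22.0, 980.0),  # excluded: average score above 3
]


def acceptable_quality_dose(records, target_score=3.0):
    """Return (median CTDIvol, median DLP) over exams scoring exactly target_score."""
    kept = [r for r in records if r[0] == target_score]
    if not kept:
        raise ValueError("no examinations with the target average score")
    return median(r[1] for r in kept), median(r[2] for r in kept)


aqd_ctdi, aqd_dlp = acceptable_quality_dose(exams)
print(aqd_ctdi, aqd_dlp)
```

In practice the same function would be applied once per clinical indication and size group, and each result compared with the corresponding national diagnostic reference level.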
Affiliation(s)
- Nada Hasan
  - University of Bahrain, Environment and Sustainable Development Program, College of Science, Zallaq, Kingdom of Bahrain
  - Salmaniya Medical Complex, Department of Radiology, Manama, Kingdom of Bahrain
- Chadia Rizk
  - National Council for Scientific Research, Lebanese Atomic Energy Commission, Beirut, Lebanon
- Fatema Marzooq
  - Salmaniya Medical Complex, Department of Radiology, Manama, Kingdom of Bahrain
- Khalid Khan
  - Salmaniya Medical Complex, Department of Radiology, Manama, Kingdom of Bahrain
- Maryam AlKhaja
  - Salmaniya Medical Complex, Department of Radiology, Manama, Kingdom of Bahrain
- Esameldeen Babikir
  - University of Bahrain, College of Health and Sport Sciences, Department of Allied Health Sciences, Zallaq, Kingdom of Bahrain
3. Liu X, Li H, Wang S, Yang S, Zhang G, Xu Y, Yang H, Shan F. CT radiomics to differentiate neuroendocrine neoplasm from adenocarcinoma in patients with a peripheral solid pulmonary nodule: a multicenter study. Front Oncol 2024; 14:1420213. [PMID: 38952551 PMCID: PMC11215045 DOI: 10.3389/fonc.2024.1420213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2024] [Accepted: 06/03/2024] [Indexed: 07/03/2024]
Abstract
Purpose To construct and validate a computed tomography (CT) radiomics model for differentiating lung neuroendocrine neoplasm (LNEN) from lung adenocarcinoma (LADC) manifesting as a peripheral solid nodule (PSN) to aid in early clinical decision-making. Methods A total of 445 patients with pathologically confirmed LNEN and LADC from June 2016 to July 2023 were retrospectively included from five medical centers. Those patients were split into the training set (n = 316; 158 LNEN) and external test set (n = 129; 43 LNEN), the former including the cross-validation (CV) training set and CV test set using ten-fold CV. The support vector machine (SVM) classifier was used to develop the semantic, radiomics and merged models. The diagnostic performances were evaluated by the area under the receiver operating characteristic curve (AUC) and compared by the DeLong test. Preoperative neuron-specific enolase (NSE) levels were collected as a clinical predictor. Results In the training set, the AUCs of the radiomics model (0.878 [95% CI: 0.836, 0.915]) and merged model (0.884 [95% CI: 0.844, 0.919]) significantly outperformed the semantic model (0.718 [95% CI: 0.663, 0.769]; both p < .001). In the external test set, the AUCs of the radiomics model (0.787 [95% CI: 0.696, 0.871]), merged model (0.807 [95% CI: 0.720, 0.889]) and semantic model (0.729 [95% CI: 0.631, 0.811]) did not exhibit statistical differences. The radiomics model outperformed NSE in sensitivity in the training set (85.3% vs 20.0%; p < .001) and external test set (88.9% vs 40.7%; p = .002). Conclusion The CT radiomics model could non-invasively, effectively and sensitively predict LNEN and LADC presenting as a PSN to assist in treatment strategy selection.
Affiliation(s)
- Xiaoyu Liu
  - Department of Radiology, Shanghai Public Health Clinical Center, Fudan University, Shanghai, China
- Hongjian Li
  - Department of Radiology, Affiliated Hospital of North Sichuan Medical College, North Sichuan Medical College, Nanchong, China
- Shengping Wang
  - Department of Radiology, Fudan University Shanghai Cancer Center, Fudan University, Shanghai, China
- Shan Yang
  - Department of Radiology, Zhongshan Hospital, Fudan University, Shanghai, China
- Guobin Zhang
  - Department of Radiology, Shanghai Sixth People’s Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Yonghua Xu
  - Department of Imaging and Interventional Radiology, Zhongshan-Xuhui Hospital of Fudan University, Fudan University, Shanghai, China
- Hanfeng Yang
  - Department of Radiology, Affiliated Hospital of North Sichuan Medical College, North Sichuan Medical College, Nanchong, China
- Fei Shan
  - Department of Radiology, Shanghai Public Health Clinical Center, Fudan University, Shanghai, China
4. Nigro S, Filardi M, Tafuri B, Nicolardi M, De Blasi R, Giugno A, Gnoni V, Milella G, Urso D, Zoccolella S, Logroscino G. Deep Learning-based Approach for Brainstem and Ventricular MR Planimetry: Application in Patients with Progressive Supranuclear Palsy. Radiol Artif Intell 2024; 6:e230151. [PMID: 38506619 PMCID: PMC11140505 DOI: 10.1148/ryai.230151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Revised: 02/01/2024] [Accepted: 03/06/2024] [Indexed: 03/21/2024]
Abstract
Purpose To develop a fast and fully automated deep learning (DL)-based method for the MRI planimetric segmentation and measurement of the brainstem and ventricular structures most affected in patients with progressive supranuclear palsy (PSP). Materials and Methods In this retrospective study, T1-weighted MR images in healthy controls (n = 84) were used to train DL models for segmenting the midbrain, pons, middle cerebellar peduncle (MCP), superior cerebellar peduncle (SCP), third ventricle, and frontal horns (FHs). Internal, external, and clinical test datasets (n = 305) were used to assess segmentation model reliability. DL masks from test datasets were used to automatically extract midbrain and pons areas and the width of MCP, SCP, third ventricle, and FHs. Automated measurements were compared with those manually performed by an expert radiologist. Finally, these measures were combined to calculate the midbrain to pons area ratio, MR parkinsonism index (MRPI), and MRPI 2.0, which were used to differentiate patients with PSP (n = 71) from those with Parkinson disease (PD) (n = 129). Results Dice coefficients above 0.85 were found for all brain regions when comparing manual and DL-based segmentations. A strong correlation was observed between automated and manual measurements (Spearman ρ > 0.80, P < .001). DL-based measurements showed excellent performance in differentiating patients with PSP from those with PD, with an area under the receiver operating characteristic curve above 0.92. Conclusion The automated approach successfully segmented and measured the brainstem and ventricular structures. DL-based models may represent a useful approach to support the diagnosis of PSP and potentially other conditions associated with brainstem and ventricular alterations. Keywords: MR Imaging, Brain/Brain Stem, Segmentation, Quantification, Diagnosis, Convolutional Neural Network Supplemental material is available for this article. 
© RSNA, 2024 See also the commentary by Mohajer in this issue.
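The abstract reports Dice coefficients above 0.85 between manual and DL-based segmentations. As a reminder of what that metric measures, here is a minimal pure-Python sketch of the standard Dice formula, 2|A ∩ B| / (|A| + |B|); the tiny masks and the function name are hypothetical and unrelated to the study's data.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap of two binary masks given as equally shaped nested lists."""
    flat_a = [bool(v) for row in mask_a for v in row]
    flat_b = [bool(v) for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must contain the same number of pixels")
    intersection = sum(a and b for a, b in zip(flat_a, flat_b))
    total = sum(flat_a) + sum(flat_b)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy 2 x 3 masks (assumption): manual has 3 foreground pixels, automated has 2.
manual = [[0, 1, 1],
          [0, 1, 0]]
automated = [[0, 1, 1],
             [0, 0, 0]]
dice = dice_coefficient(manual, automated)
print(dice)
```

A value of 1.0 means identical masks and 0.0 means no overlap.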
Affiliation(s)
- Salvatore Nigro
- Marco Filardi
- Benedetta Tafuri
- Martina Nicolardi
- Roberto De Blasi
- Alessia Giugno
- Valentina Gnoni
- Giammarco Milella
- Daniele Urso
- Stefano Zoccolella
- Giancarlo Logroscino
From the Center for Neurodegenerative Diseases and the Aging Brain, University of Bari Aldo Moro, Pia Fondazione Cardinale G. Panico, 73039 Tricase, Italy (S.N., M.F., B.T., A.G., V.G., D.U., G.L.); Department of Translational Biomedicine and Neuroscience (DiBraiN), University of Bari Aldo Moro, Bari, Italy (M.F., B.T., G.M., G.L.); Department of Radiology, Pia Fondazione Cardinale G. Panico, Tricase, Italy (M.N., R.D.B.); Department of Neurosciences, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, England (D.U.); and Operative Unit of Neurology, San Paolo Hospital, ASL Bari, Bari, Italy (S.Z.)
5. Udin MH, Armstrong S, Kai A, Doyle S, Ionita CN, Pokharel S, Sharma UC. Lightweight preprocessing and template matching facilitate streamlined ischemic myocardial scar classification. J Med Imaging (Bellingham) 2024; 11:024503. [PMID: 38525295 PMCID: PMC10956816 DOI: 10.1117/1.jmi.11.2.024503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Revised: 01/12/2024] [Accepted: 03/07/2024] [Indexed: 03/26/2024]
Abstract
Purpose Ischemic myocardial scarring (IMS) is a common outcome of coronary artery disease that potentially leads to lethal arrhythmias and heart failure. Late-gadolinium-enhanced cardiac magnetic resonance (CMR) imaging scans have served as the diagnostic bedrock for IMS, with recent advancements in machine learning enabling enhanced scar classification. However, the trade-off for these improvements is intensive computational and time demands. As a solution, we propose a combination of lightweight preprocessing (LWP) and template matching (TM) to streamline IMS classification. Approach CMR images from 279 patients (151 IMS, 128 control) were classified for IMS presence using two convolutional neural networks (CNNs) and TM, both with and without LWP. Evaluation metrics included accuracy, sensitivity, specificity, F1-score, area under the receiver operating characteristic curve (AUROC), and processing time. External testing dataset analysis encompassed patient-level classifications (PLCs) and a CNN versus TM classification comparison (CVTCC). Results LWP enhanced the speed of both CNNs (4.9x) and TM (21.9x). Furthermore, in the absence of LWP, TM outpaced CNNs by over 10x, while with LWP, TM was more than 100x faster. Additionally, TM performed similarly to the CNNs in accuracy, sensitivity, specificity, F1-score, and AUROC, with PLCs demonstrating improvements across all five metrics. Moreover, the CVTCC revealed a substantial 90.9% agreement. Conclusions Our results highlight the effectiveness of LWP and TM in streamlining IMS classification. Anticipated enhancements to LWP's region of interest (ROI) isolation and TM's ROI targeting are expected to boost accuracy, positioning them as a potential alternative to CNNs for IMS classification, supporting the need for further research.
Affiliation(s)
- Michael H. Udin
  - University at Buffalo, Department of Biomedical Engineering, Buffalo, New York, United States
  - Canon Stroke and Vascular Research Center, Buffalo, New York, United States
  - Roswell Park Comprehensive Cancer Center, Department of Pathology, Buffalo, New York, United States
  - University at Buffalo, Jacobs School of Medicine, Department of Medicine, Buffalo, New York, United States
- Sara Armstrong
  - University at Buffalo, Jacobs School of Medicine, Department of Medicine, Buffalo, New York, United States
- Alice Kai
  - University at Buffalo, Jacobs School of Medicine, Department of Medicine, Buffalo, New York, United States
- Scott Doyle
  - University at Buffalo, Department of Biomedical Engineering, Buffalo, New York, United States
- Ciprian N. Ionita
  - University at Buffalo, Department of Biomedical Engineering, Buffalo, New York, United States
  - Canon Stroke and Vascular Research Center, Buffalo, New York, United States
- Saraswati Pokharel
  - University at Buffalo, Department of Biomedical Engineering, Buffalo, New York, United States
  - Roswell Park Comprehensive Cancer Center, Department of Pathology, Buffalo, New York, United States
- Umesh C. Sharma
  - University at Buffalo, Jacobs School of Medicine, Department of Medicine, Buffalo, New York, United States
6. Yi W, Zhao J, Tang W, Yin H, Yu L, Wang Y, Tian W. Deep learning-based high-accuracy detection for lumbar and cervical degenerative disease on T2-weighted MR images. Eur Spine J 2023; 32:3807-3814. [PMID: 36943484 DOI: 10.1007/s00586-023-07641-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/24/2022] [Revised: 01/31/2023] [Accepted: 03/05/2023] [Indexed: 03/23/2023]
Abstract
PURPOSE To develop and validate a deep learning (DL) model for detecting lumbar degenerative disease in both sagittal and axial views of T2-weighted MRI and evaluate its generalized performance in detecting cervical degenerative disease. METHODS T2-weighted MRI scans of 804 patients with symptoms of lumbar degenerative disease were retrospectively collected from three hospitals. The training dataset (n = 456) and internal validation dataset (n = 134) were randomly selected from center I. Two external validation datasets comprising 100 and 114 patients were from center II and center III, respectively. A DL model based on 3D ResNet18 and transformer architecture was proposed to detect lumbar degenerative disease. In addition, a cervical MR image dataset comprising 200 patients from an independent hospital was used to evaluate the generalized performance of the DL model. The diagnostic performance was assessed by the free-response receiver operating characteristic (fROC) curve and precision-recall (PR) curve. Precision, recall, and F1-score were used to measure the DL model. RESULTS A total of 2497 three-dimensional retrogression annotations were labeled for training (n = 1157) and multicenter validation (n = 1340). The DL model showed excellent detection efficiency in the internal validation dataset, with the F1-score reaching 0.971 and 0.903 on the sagittal and axial MR images, respectively. Good performance was also observed in external validation dataset I (F1-score, 0.768 on sagittal MR images and 0.837 on axial MR images) and external validation dataset II (F1-score, 0.787 on sagittal MR images and 0.770 on axial MR images). Furthermore, the robustness of the DL model was demonstrated via transfer learning and generalized performance evaluation on the external cervical dataset, with the F1-score reaching 0.931 and 0.919 on the sagittal and axial MR images, respectively. CONCLUSION The proposed DL model can automatically detect lumbar and cervical degenerative disease on T2-weighted MR images with good performance, robustness, and feasibility in clinical practice.
Affiliation(s)
- Wei Yi
  - Department of Spine Surgery, Beijing Jishuitan Hospital, Beijing, 100035, China
- Jingwei Zhao
  - Beijing Jishuitan Hospital, Research Unit of Intelligent Orthopedics, Chinese Academy of Medical Sciences, Beijing, China
- Wen Tang
  - Institute of Advanced Research, Infervision Medical Technology Co., Ltd, Beijing, Beijing, China
- Hongkun Yin
  - Institute of Advanced Research, Infervision Medical Technology Co., Ltd, Beijing, Beijing, China
- Lifeng Yu
  - The Second Hospital of Zhangjiakou City, Zhangjiakou, Hebei Province, China
- Yaohui Wang
  - Department of Trauma, Beijing Water Conservancy Hospital, Beijing, 10036, China
- Wei Tian
  - Beijing Jishuitan Hospital, Research Unit of Intelligent Orthopedics, Chinese Academy of Medical Sciences, Beijing, China
7. Maleki F, Ovens K, Gupta R, Reinhold C, Spatz A, Forghani R. Generalizability of Machine Learning Models: Quantitative Evaluation of Three Methodological Pitfalls. Radiol Artif Intell 2022; 5:e220028. [PMID: 36721408 PMCID: PMC9885377 DOI: 10.1148/ryai.220028] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 02/13/2022] [Revised: 10/10/2022] [Accepted: 10/24/2022] [Indexed: 11/17/2022]
Abstract
Purpose To investigate the impact of the following three methodological pitfalls on model generalizability: (a) violation of the independence assumption, (b) model evaluation with an inappropriate performance indicator or baseline for comparison, and (c) batch effect. Materials and Methods The authors used retrospective CT, histopathologic analysis, and radiography datasets to develop machine learning models with and without the three methodological pitfalls to quantitatively illustrate their effect on model performance and generalizability. F1 score was used to measure performance, and differences in performance between models developed with and without errors were assessed using the Wilcoxon rank sum test when applicable. Results Violation of the independence assumption by applying oversampling, feature selection, and data augmentation before splitting data into training, validation, and test sets seemingly improved model F1 scores by 71.2% for predicting local recurrence and 5.0% for predicting 3-year overall survival in head and neck cancer and by 46.0% for distinguishing histopathologic patterns in lung cancer. Randomly distributing data points for a patient across datasets superficially improved the F1 score by 21.8%. High model performance metrics did not indicate high-quality lung segmentation. In the presence of a batch effect, a model built for pneumonia detection had an F1 score of 98.7% but correctly classified only 3.86% of samples from a new dataset of healthy patients. Conclusion Machine learning models developed with these methodological pitfalls, which are undetectable during internal evaluation, produce inaccurate predictions; thus, understanding and avoiding these pitfalls is necessary for developing generalizable models. Keywords: Random Forest, Diagnosis, Prognosis, Convolutional Neural Network (CNN), Medical Image Analysis, Generalizability, Machine Learning, Deep Learning, Model Evaluation Supplemental material is available for this article.
Published under a CC BY 4.0 license.
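One pitfall named in the abstract is scattering a single patient's data points across the training and test sets. A minimal sketch of the usual remedy, splitting by patient rather than by sample, is shown below; the function name, the `(patient_id, item)` sample layout, and the toy data are assumptions for illustration and are not taken from the paper.

```python
import random


def split_by_patient(samples, test_fraction=0.25, seed=0):
    """Split (patient_id, item) samples so that no patient appears in both sets."""
    patient_ids = sorted({pid for pid, _ in samples})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(patient_ids)
    n_test = max(1, round(len(patient_ids) * test_fraction))
    test_ids = set(patient_ids[:n_test])
    train = [s for s in samples if s[0] not in test_ids]
    test = [s for s in samples if s[0] in test_ids]
    return train, test


# Toy data (assumption): 4 patients with 3 images each.
samples = [(f"patient_{i // 3}", f"image_{i}") for i in range(12)]
train, test = split_by_patient(samples)
print(len(train), len(test))
```

Following the abstract's first pitfall, steps such as oversampling, feature selection, and data augmentation would then be applied to the training portion only, after this split.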
8. Jason Jeong J, Patel B, Banerjee I. GAN augmentation for multiclass image classification using hemorrhage detection as a case-study. J Med Imaging (Bellingham) 2022; 9:035504. [PMID: 35769344 DOI: 10.1117/1.jmi.9.3.035504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2021] [Accepted: 05/31/2022] [Indexed: 11/14/2022]
Abstract
Purpose: In recent years, the development and exploration of deeper and more complex deep learning models has been on the rise. However, large heterogeneous datasets to support efficient training of deep learning models are lacking. While linear image transformations have traditionally been used for augmentation, the recent development of generative adversarial networks (GANs) could theoretically allow us to generate an infinite amount of data from the real distribution to support deep learning model training. Recently, the Radiological Society of North America (RSNA) curated a multiclass hemorrhage detection challenge dataset that includes over 800,000 images for hemorrhage detection, but all high-performing models were trained using traditional data augmentation techniques. Given the wide variety of options, the choice of augmentation for image classification often follows a trial-and-error policy. Approach: We designed a conditional DCGAN (cDCGAN) and in parallel trained multiple popular GAN models to use as online augmentations, and compared them to traditional augmentation methods for the hemorrhage case study. Results: Our experiments show that, for the super-minority class (epidural hemorrhage), cDCGAN augmentation yielded at least a 2× improvement in performance over the traditionally augmented model using the same classifier configuration. Conclusion: This shows that for complex and imbalanced datasets, traditional solutions to data imbalance may not be sufficient, and more complex and diverse data augmentation methods such as GANs may be required.
Affiliation(s)
- Jiwoong Jason Jeong
  - Arizona State University, Ira A. Fulton Schools of Engineering, Tempe, Arizona, United States
- Bhavik Patel
  - Mayo Clinic, Department of Radiology, Arizona, United States
- Imon Banerjee
  - Arizona State University, Ira A. Fulton Schools of Engineering, Tempe, Arizona, United States
  - Mayo Clinic, Department of Radiology, Arizona, United States
|
9
|
Bento M, Fantini I, Park J, Rittner L, Frayne R. Deep Learning in Large and Multi-Site Structural Brain MR Imaging Datasets. Front Neuroinform 2022; 15:805669. [PMID: 35126080 PMCID: PMC8811356 DOI: 10.3389/fninf.2021.805669] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 10/30/2021] [Accepted: 12/27/2021] [Indexed: 12/22/2022]
Abstract
Large, multi-site, heterogeneous brain imaging datasets are increasingly required for the training, validation, and testing of advanced deep learning (DL)-based automated tools, including structural magnetic resonance (MR) image-based diagnostic and treatment monitoring approaches. When assembling a number of smaller datasets to form a larger dataset, understanding the underlying variability between different acquisition and processing protocols across the aggregated dataset (termed “batch effects”) is critical. The presence of variation in the training dataset is important as it more closely reflects the true underlying data distribution and, thus, may enhance the overall generalizability of the tool. However, the impact of batch effects must be carefully evaluated in order to avoid undesirable effects that, for example, may reduce performance measures. Batch effects can result from many sources, including differences in acquisition equipment, imaging technique and parameters, as well as applied processing methodologies. Their impact, both beneficial and adversarial, must be considered when developing tools to ensure that their outputs are related to the proposed clinical or research question (i.e., actual disease-related or pathological changes) and are not simply due to the peculiarities of underlying batch effects in the aggregated dataset. We reviewed applications of DL in structural brain MR imaging that aggregated images from neuroimaging datasets, typically acquired at multiple sites. We examined datasets containing both healthy control participants and patients that were acquired using varying acquisition protocols. First, we discussed issues around Data Access and enumerated the key characteristics of some commonly used publicly available brain datasets. 
Then we reviewed methods for correcting batch effects by exploring the two main classes of approaches: Data Harmonization that uses data standardization, quality control protocols or other similar algorithms and procedures to explicitly understand and minimize unwanted batch effects; and Domain Adaptation that develops DL tools that implicitly handle the batch effects by using approaches to achieve reliable and robust results. In this narrative review, we highlighted the advantages and disadvantages of both classes of DL approaches, and described key challenges to be addressed in future studies.
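As a rough illustration of the Data Harmonization class mentioned in this abstract, the sketch below standardizes a scalar image-derived feature within each acquisition site. It is an assumption-laden toy, not a method taken from the review: the function name `standardize_per_site` is hypothetical, and per-site z-scoring is only one of the simplest forms of data standardization.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_per_site(values, sites):
    """Z-score each value using the mean and SD of its own site.

    values[i] is a scalar feature (e.g. mean intensity) of sample i and
    sites[i] is a hypothetical label of the site or scanner that acquired it.
    """
    grouped = defaultdict(list)
    for value, site in zip(values, sites):
        grouped[site].append(value)

    # Per-site location and scale (population SD, so one-sample sites give 0).
    stats = {site: (mean(vs), pstdev(vs)) for site, vs in grouped.items()}

    standardized = []
    for value, site in zip(values, sites):
        mu, sigma = stats[site]
        standardized.append((value - mu) / sigma if sigma > 0 else 0.0)
    return standardized
```

A caveat on this design: per-site standardization removes every between-site difference in mean and scale, including real ones. If sites differ in case mix (for example, one site contributes mostly patients and another mostly healthy controls), this step can erase disease-related signal along with the batch effect.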
Affiliation(s)
- Mariana Bento
  - Electrical and Software Engineering, Schulich School of Engineering, University of Calgary, Calgary, AB, Canada
  - Hotchkiss Brain Institute, University of Calgary, Calgary, AB, Canada
  - Calgary Image Processing and Analysis Centre, Foothills Medical Centre, Calgary, AB, Canada
- Irene Fantini
  - School of Electrical and Computer Engineering, University of Campinas, Campinas, Brazil
- Justin Park
  - Hotchkiss Brain Institute, University of Calgary, Calgary, AB, Canada
  - Calgary Image Processing and Analysis Centre, Foothills Medical Centre, Calgary, AB, Canada
  - Radiology and Clinical Neurosciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Leticia Rittner
  - School of Electrical and Computer Engineering, University of Campinas, Campinas, Brazil
- Richard Frayne
  - Hotchkiss Brain Institute, University of Calgary, Calgary, AB, Canada
  - Calgary Image Processing and Analysis Centre, Foothills Medical Centre, Calgary, AB, Canada
  - Radiology and Clinical Neurosciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
  - Seaman Family MR Research Centre, Foothills Medical Centre, Calgary, AB, Canada

Correspondence: Mariana Bento
|
10
|
Deep learning-accelerated T2-weighted imaging of the prostate: Impact of further acceleration with lower spatial resolution on image quality. Eur J Radiol 2021; 145:110012. [PMID: 34753082 DOI: 10.1016/j.ejrad.2021.110012] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 09/22/2021] [Revised: 10/19/2021] [Accepted: 10/26/2021] [Indexed: 12/24/2022]
Abstract
PURPOSE To compare image quality in prostate MRI among standard T2-weighted imaging (T2-std), accelerated T2-weighted imaging (T2WI) with high resolution (T2-HR) and more accelerated T2WI with lower resolution (T2-LR) using both conventional reconstruction (C) and deep learning reconstruction (DL). MATERIALS AND METHODS In 46 consecutive patients, T2-std, T2-HR and T2-LR were acquired in 3:32 min, 1:06 min and 0:52 min, respectively. Both reconstruction techniques (C and DL) were applied to T2-HR and T2-LR. Five sets of images (T2-std, T2-HRC, T2-LRC, T2-HRDL, and T2-LRDL) for each patient were independently evaluated by two radiologists. Quantitative analysis including the signal-to-noise ratio (SNR) and contrast ratio (CR) and qualitative analysis with a 5-point scale for the sharpness of structures, ghosting or other artifacts, noise and overall image quality were performed. RESULTS The SNR was not different in either the peripheral zone (PZ) or transition zone (TZ) between T2-LRDL and T2-std with the median value of 21.7 versus 22.6 in PZ and 16.5 versus 17.3 in TZ, respectively. The CR between the prostate gland and muscle was significantly lower on T2-HRC and T2-LRC than on T2-std. Most of the evaluated factors showed significantly lower scores on T2-HRC and T2-LRC than on T2-std. Although noise and overall image quality on T2-HRDL and other artifacts on T2-LRDL were rated significantly lower than on T2-std (median value 4.0 versus 4.5, P < 0.001; 4.5 versus 5.0, P = 0.001; 4.5 versus 5.0, P = 0.006, respectively), other factors did not differ between T2-std and T2-HRDL or T2-LRDL. CONCLUSION DL is useful to improve image quality in accelerated T2WI of the prostate gland. Using DL, accelerated T2WI with lower spatial resolution than T2-std can be achieved with similar image quality in much shorter scan time (75.5% reduction in the acquisition time).
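The reported 75.5% reduction in acquisition time can be checked with a few lines of arithmetic. The sketch below assumes the T2-LR acquisition time is read as 52 seconds (0:52), which is the reading consistent with the stated reduction relative to the 3:32 min standard sequence; the helper names `to_seconds` and `reduction_percent` are illustrative only.

```python
def to_seconds(mm_ss):
    """Convert an 'm:ss' acquisition time string to seconds."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)


def reduction_percent(reference, accelerated):
    """Percentage reduction in scan time relative to the reference scan."""
    ref = to_seconds(reference)
    acc = to_seconds(accelerated)
    return 100.0 * (ref - acc) / ref


# T2-std (3:32 = 212 s) versus T2-LR (0:52 = 52 s): (212 - 52) / 212
t2_lr_reduction = round(reduction_percent("3:32", "0:52"), 1)  # 75.5
```

By the same calculation, T2-HR (1:06 = 66 s) corresponds to a reduction of about 68.9% relative to T2-std; that figure is derived here and is not stated in the abstract.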
|
11
|
Cherian Kurian N, Sethi A, Reddy Konduru A, Mahajan A, Rane SU. A 2021 update on cancer image analytics with deep learning. WIREs Data Mining and Knowledge Discovery 2021. [DOI: 10.1002/widm.1410] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 02/05/2023]
Affiliation(s)
- Nikhil Cherian Kurian
  - Department of Electrical Engineering, Indian Institute of Technology Bombay, Mumbai, India
- Amit Sethi
  - Department of Electrical Engineering, Indian Institute of Technology Bombay, Mumbai, India
- Anil Reddy Konduru
  - Department of Pathology, Tata Memorial Center‐ACTREC, HBNI, Navi Mumbai, India
- Abhishek Mahajan
  - Department of Radiology, Tata Memorial Hospital, HBNI, Mumbai, India
- Swapnil Ulhas Rane
  - Department of Pathology, Tata Memorial Center‐ACTREC, HBNI, Navi Mumbai, India
|