1. Bayerl N, Adams LC, Cavallaro A, Bäuerle T, Schlicht M, Wullich B, Hartmann A, Uder M, Ellmann S. Assessment of a fully-automated diagnostic AI software in prostate MRI: Clinical evaluation and histopathological correlation. Eur J Radiol 2024; 181:111790. [PMID: 39520837] [DOI: 10.1016/j.ejrad.2024.111790]
Abstract
OBJECTIVE This study aims to evaluate the diagnostic performance of a commercial, fully-automated, artificial intelligence (AI) driven software tool in identifying and grading prostate lesions in prostate MRI, using histopathological findings as the reference standard, while contextualizing its performance within the framework of PI-RADS v2.1 criteria.
MATERIAL AND METHODS This retrospective study analyzed 123 patients who underwent multiparametric prostate MRI followed by systematic and targeted biopsies. MRI protocols adhered to international guidelines and included T2-weighted, diffusion-weighted, T1-weighted, and dynamic contrast-enhanced imaging. The AI software tool mdprostate was integrated into the Picture Archiving and Communication System to automatically segment the prostate, calculate prostate volume, and classify lesions according to PI-RADS scores using biparametric T2-weighted and diffusion-weighted imaging. Histopathological analysis of biopsy cores served as the reference standard. Diagnostic performance metrics, including sensitivity, specificity, positive and negative predictive value (PPV, NPV), and area under the ROC curve (AUC), were calculated.
RESULTS mdprostate demonstrated 100% sensitivity at a PI-RADS ≥ 2 cutoff, effectively ruling out both clinically significant and non-significant prostate cancers for lesions below this threshold. For detecting clinically significant prostate cancer (csPCa) at a PI-RADS ≥ 4 cutoff, mdprostate achieved a sensitivity of 85.5% and a specificity of 63.2%. The AUC for detecting cancers of any grade was 0.803. The performance metrics of mdprostate were comparable to those reported in two meta-analyses of PI-RADS v2.1, with no significant differences in sensitivity and specificity (p > 0.05).
CONCLUSION The evaluated AI tool demonstrated high diagnostic performance in identifying and grading prostate lesions, with results comparable to those reported in meta-analyses of expert readers using PI-RADS v2.1. Its ability to standardize evaluations and potentially reduce variability underscores its potential as a valuable adjunct in the prostate cancer diagnostic pathway. The high accuracy of mdprostate, particularly in ruling out prostate cancers, highlights its clinical utility by reducing workload and enhancing patient outcomes.
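A note on the metrics: sensitivity, specificity, PPV, and NPV all follow from a 2×2 confusion matrix of AI calls against the histopathological reference. A minimal Python sketch; the counts below are constructed for illustration to roughly match the reported PI-RADS ≥ 4 operating point and are not taken from the study:

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard diagnostic-performance metrics from a 2x2 confusion matrix.

    tp/fp/fn/tn count test calls against the reference standard.
    """
    return {
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

# Hypothetical counts for a PI-RADS >= 4 cutoff (illustrative only):
m = diagnostic_metrics(tp=59, fp=21, fn=10, tn=36)
```

With these made-up counts, sensitivity is 59/69 ≈ 0.855 and specificity 36/57 ≈ 0.632, the same order as the values reported above.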
Affiliation(s)
- Nadine Bayerl: Friedrich-Alexander-Universität Erlangen-Nürnberg, Institute of Radiology, University Hospital Erlangen, Maximiliansplatz 3, 91054 Erlangen, Germany.
- Lisa C Adams: Technical University of Munich, Department of Diagnostic and Interventional Radiology, Ismaninger Str. 22, 81675 Munich, Germany.
- Alexander Cavallaro: Friedrich-Alexander-Universität Erlangen-Nürnberg, Institute of Radiology, University Hospital Erlangen, Maximiliansplatz 3, 91054 Erlangen, Germany.
- Tobias Bäuerle: Friedrich-Alexander-Universität Erlangen-Nürnberg, Institute of Radiology, University Hospital Erlangen, Maximiliansplatz 3, 91054 Erlangen, Germany; University Medical Center of Johannes Gutenberg-University Mainz, Department of Diagnostic and Interventional Radiology, Langenbeckstr. 1, 55131 Mainz, Germany.
- Michael Schlicht: Sozialstiftung Bamberg, Clinic of Internal Medicine III, Hanst-Schütz Str. 3, 96050 Bamberg, Germany.
- Bernd Wullich: Friedrich-Alexander-Universität Erlangen-Nürnberg, Clinic of Urology and Pediatric Urology, University Hospital Erlangen, Maximiliansplatz 1, 91054 Erlangen, Germany; Comprehensive Cancer Center Erlangen-EMN (CCC ER-EMN), 91054 Erlangen, Germany; Bavarian Cancer Research Center (BZKF), 91054 Erlangen, Germany.
- Arndt Hartmann: Comprehensive Cancer Center Erlangen-EMN (CCC ER-EMN), 91054 Erlangen, Germany; Bavarian Cancer Research Center (BZKF), 91054 Erlangen, Germany; Friedrich-Alexander-Universität Erlangen-Nürnberg, Institute of Pathology, University Hospital Erlangen, Krankenhausstr. 8-10, 91054 Erlangen, Germany.
- Michael Uder: Friedrich-Alexander-Universität Erlangen-Nürnberg, Institute of Radiology, University Hospital Erlangen, Maximiliansplatz 3, 91054 Erlangen, Germany.
- Stephan Ellmann: Friedrich-Alexander-Universität Erlangen-Nürnberg, Institute of Radiology, University Hospital Erlangen, Maximiliansplatz 3, 91054 Erlangen, Germany; Radiologisch-Nuklearmedizinisches Zentrum (RNZ.), Martin-Richter-Straße 43, 90489 Nürnberg, Germany.
2. Langkilde F, Masaba P, Edenbrandt L, Gren M, Halil A, Hellström M, Larsson M, Naeem AA, Wallström J, Maier SE, Jäderling F. Manual prostate MRI segmentation by readers with different experience: a study of the learning progress. Eur Radiol 2024; 34:4801-4809. [PMID: 38165432] [PMCID: PMC11213744] [DOI: 10.1007/s00330-023-10515-4]
Abstract
OBJECTIVE To evaluate the learning progress of less experienced readers in prostate MRI segmentation.
MATERIALS AND METHODS One hundred bi-parametric prostate MRI scans were retrospectively selected from the Göteborg Prostate Cancer Screening 2 Trial (single center). Nine readers with varying degrees of segmentation experience were involved: one expert radiologist, two experienced radiology residents, two inexperienced radiology residents, and four novices. The task was to segment the whole prostate gland, with the expert's segmentations serving as reference. For all readers except three novices, the 100 MRI scans were divided into five rounds (cases 1-10, 11-25, 26-50, 51-75, 76-100); three novices segmented only 50 cases (three rounds). After each round, a one-on-one feedback session between the expert and the reader was held, covering systematic errors and potential improvements for the next round. A Dice similarity coefficient (DSC) > 0.8 was considered accurate.
RESULTS Using DSC > 0.8 as the threshold, the novices produced 194 accurate segmentations out of 250 (77.6%); the residents produced 397/400 (99.2%). In round 1, the novices had 19/40 (47.5%) accurate segmentations, in round 2, 41/60 (68.3%), and in round 3, 84/100 (84.0%), indicating learning progress.
CONCLUSIONS Radiology residents, regardless of prior experience, showed high segmentation accuracy. Novices showed larger interindividual variation and lower segmentation accuracy than radiology residents. To prepare datasets for artificial intelligence (AI) development, employing radiology residents seems safe and provides a good balance between cost-effectiveness and segmentation accuracy. Employing novices should only be considered on an individual basis.
CLINICAL RELEVANCE STATEMENT Employing radiology residents for prostate MRI segmentation seems safe and can potentially reduce the workload of expert radiologists. Employing novices should only be considered on an individual basis.
KEY POINTS
• Using less experienced readers for prostate MRI segmentation is cost-effective but may reduce quality.
• Radiology residents provided high-accuracy segmentations, while novices showed large inter-reader variability.
• To prepare datasets for AI development, employing radiology residents seems safe and might provide a good balance between cost-effectiveness and segmentation accuracy, while novices should only be employed on an individual basis.
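The DSC > 0.8 accuracy criterion used in this study is straightforward to compute from two binary masks. A minimal NumPy sketch (function and variable names are mine, and the tiny masks are purely illustrative):

```python
import numpy as np

def dice(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Dice similarity coefficient between two binary segmentation masks."""
    a = mask_a.astype(bool)
    b = mask_b.astype(bool)
    denom = a.sum() + b.sum()
    if denom == 0:  # both masks empty: define as perfect agreement
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / denom

reader = np.zeros((4, 4), dtype=bool)
reader[1:3, 1:3] = True           # a 2x2 "prostate" drawn by the reader
reference = np.zeros((4, 4), dtype=bool)
reference[1:3, 1:4] = True        # a slightly larger expert reference
score = dice(reader, reference)   # 2*4 / (4 + 6) = 0.8
```

Under the study's criterion, this hypothetical reader segmentation sits exactly at the DSC = 0.8 boundary.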
Affiliation(s)
- Fredrik Langkilde: Department of Radiology, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden; Department of Radiology, Sahlgrenska University Hospital, Gothenburg, Sweden.
- Patrick Masaba: Department of Molecular Medicine and Surgery (MMK), Karolinska Institutet, Stockholm, Sweden.
- Lars Edenbrandt: Department of Molecular and Clinical Medicine, Institute of Medicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden; Department of Clinical Physiology, Sahlgrenska University Hospital, Gothenburg, Sweden.
- Magnus Gren: Department of Radiology, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden; Department of Radiology, Sahlgrenska University Hospital, Gothenburg, Sweden.
- Airin Halil: Department of Radiology, Sahlgrenska University Hospital, Gothenburg, Sweden.
- Mikael Hellström: Department of Radiology, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden; Department of Radiology, Sahlgrenska University Hospital, Gothenburg, Sweden.
- Ameer Ali Naeem: Department of Radiology, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden.
- Jonas Wallström: Department of Radiology, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden; Department of Radiology, Sahlgrenska University Hospital, Gothenburg, Sweden.
- Stephan E Maier: Department of Radiology, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden; Department of Radiology, Sahlgrenska University Hospital, Gothenburg, Sweden; Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
- Fredrik Jäderling: Department of Molecular Medicine and Surgery (MMK), Karolinska Institutet, Stockholm, Sweden; Department of Diagnostic Radiology, Capio S:T Göran's Hospital, Stockholm, Sweden.
3. Fassia MK, Balasubramanian A, Woo S, Vargas HA, Hricak H, Konukoglu E, Becker AS. Deep Learning Prostate MRI Segmentation Accuracy and Robustness: A Systematic Review. Radiol Artif Intell 2024; 6:e230138. [PMID: 38568094] [PMCID: PMC11294957] [DOI: 10.1148/ryai.230138]
Abstract
Purpose To investigate the accuracy and robustness of prostate segmentation using deep learning across various training data sizes, MRI vendors, prostate zones, and testing methods relative to fellowship-trained diagnostic radiologists.
Materials and Methods In this systematic review, Embase, PubMed, Scopus, and Web of Science databases were queried for English-language articles using keywords and related terms for prostate MRI segmentation and deep learning algorithms dated to July 31, 2022. A total of 691 articles from the search query were collected and subsequently filtered to 48 on the basis of predefined inclusion and exclusion criteria. Multiple characteristics were extracted from selected studies, such as deep learning algorithm performance, MRI vendor, and training dataset features. The primary outcome was comparison of mean Dice similarity coefficient (DSC) for prostate segmentation for deep learning algorithms versus diagnostic radiologists.
Results Forty-eight studies were included. Most published deep learning algorithms for whole prostate gland segmentation (39 of 42 [93%]) had a DSC at or above expert level (DSC ≥ 0.86). The mean DSC was 0.79 ± 0.06 (SD) for peripheral zone, 0.87 ± 0.05 for transition zone, and 0.90 ± 0.04 for whole prostate gland segmentation. For selected studies that used one major MRI vendor, the mean DSCs were as follows: General Electric (three of 48 studies), 0.92 ± 0.03; Philips (four of 48 studies), 0.92 ± 0.02; and Siemens (six of 48 studies), 0.91 ± 0.03.
Conclusion Deep learning algorithms for prostate MRI segmentation demonstrated accuracy similar to that of expert radiologists despite varying parameters; therefore, future research should shift toward evaluating segmentation robustness and patient outcomes across diverse clinical settings.
Keywords: MRI, Genital/Reproductive, Prostate Segmentation, Deep Learning
Systematic review registration: osf.io/nxaev © RSNA, 2024
Affiliation(s)
- Mohammad-Kasim Fassia, Adithya Balasubramanian, Sungmin Woo, Hebert Alberto Vargas, Hedvig Hricak, Ender Konukoglu, Anton S. Becker
- From the Departments of Radiology (M.K.F.) and Urology (A.B.), New York-Presbyterian Weill Cornell Medical Center, 525 E 68th St, New York, NY 10065-4870; Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY (S.W., H.A.V., H.H., A.S.B.); and Department of Biomedical Imaging, ETH-Zurich, Zurich, Switzerland (E.K.).
4. Johnson LA, Harmon SA, Yilmaz EC, Lin Y, Belue MJ, Merriman KM, Lay NS, Sanford TH, Sarma KV, Arnold CW, Xu Z, Roth HR, Yang D, Tetreault J, Xu D, Patel KR, Gurram S, Wood BJ, Citrin DE, Pinto PA, Choyke PL, Turkbey B. Automated prostate gland segmentation in challenging clinical cases: comparison of three artificial intelligence methods. Abdom Radiol (NY) 2024; 49:1545-1556. [PMID: 38512516] [DOI: 10.1007/s00261-024-04242-7]
Abstract
OBJECTIVE Automated methods for prostate segmentation on MRI are typically developed under ideal scanning and anatomical conditions. This study evaluates three different prostate segmentation AI algorithms in a challenging population of patients with prior treatments, variable anatomic characteristics, complex clinical history, or atypical MRI acquisition parameters.
MATERIALS AND METHODS A single-institution retrospective database was queried for the following conditions at prostate MRI: prior prostate-specific oncologic treatment, transurethral resection of the prostate (TURP), abdominal perineal resection (APR), hip prosthesis (HP), diversity of prostate volumes (large ≥ 150 cc, small ≤ 25 cc), whole-gland tumor burden, magnet strength, noted poor quality, and various scanners (outside/vendors). Final inclusion criteria required availability of an axial T2-weighted (T2W) sequence and a corresponding prostate organ segmentation from an expert radiologist. Three previously developed algorithms were evaluated: (1) a deep learning (DL)-based model, (2) a commercially available shape-based model, and (3) a federated DL-based model. The Dice Similarity Coefficient (DSC) was calculated relative to the expert segmentations. DSC by model and scan factors was evaluated with the Wilcoxon signed-rank test and a linear mixed effects (LMER) model.
RESULTS 683 scans (651 patients) met inclusion criteria (mean prostate volume 60.1 cc [range 9.05-329 cc]). Overall DSC scores for models 1, 2, and 3 were 0.916 (0.707-0.971), 0.873 (0-0.997), and 0.894 (0.025-0.961), respectively, with the DL-based models demonstrating significantly higher performance (p < 0.01). In sub-group analysis by factors, Model 1 outperformed Model 2 (all p < 0.05) and Model 3 (all p < 0.001). Performance of all models was negatively impacted by prostate volume and poor signal quality (p < 0.01). Shape-based factors influenced the DL models (p < 0.001), while signal factors influenced all models (p < 0.001).
CONCLUSION Factors affecting the anatomical and signal conditions of the prostate gland can adversely impact both DL-based and non-DL-based segmentation models.
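Paired, non-normally distributed per-scan DSC scores are why studies like this one reach for the Wilcoxon signed-rank test. A self-contained sketch of the test statistic only (tie-averaged ranks, zero differences dropped; the p-value computation is omitted), using made-up scores rather than the study's data:

```python
import numpy as np

def wilcoxon_statistic(x, y) -> float:
    """Wilcoxon signed-rank statistic W for paired samples x and y.

    Drops zero differences, assigns tie-averaged ranks to |differences|,
    and returns the smaller of the positive- and negative-rank sums.
    """
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    d = d[d != 0.0]
    absd = np.abs(d)
    ranks = np.empty(len(absd))
    ranks[np.argsort(absd)] = np.arange(1, len(absd) + 1)
    for v in np.unique(absd):        # average ranks over tied |d| values
        ranks[absd == v] = ranks[absd == v].mean()
    w_pos = ranks[d > 0].sum()
    w_neg = ranks[d < 0].sum()
    return min(w_pos, w_neg)

# Hypothetical per-scan DSC values for two models (illustrative only):
model_a = [0.93, 0.91, 0.95, 0.90, 0.92]
model_b = [0.88, 0.92, 0.89, 0.85, 0.90]
w = wilcoxon_statistic(model_a, model_b)
```

In practice one would use a vetted implementation (e.g. `scipy.stats.wilcoxon`) that also returns the p-value; the sketch is only meant to show what the statistic measures.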
Affiliation(s)
- Latrice A Johnson: Molecular Imaging Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA.
- Stephanie A Harmon: Molecular Imaging Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA.
- Enis C Yilmaz: Molecular Imaging Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA.
- Yue Lin: Molecular Imaging Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA.
- Mason J Belue: Molecular Imaging Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA.
- Katie M Merriman: Molecular Imaging Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA.
- Nathan S Lay: Molecular Imaging Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA.
- Karthik V Sarma: Department of Psychiatry and Behavioral Sciences, University of California, San Francisco, CA, USA.
- Corey W Arnold: Department of Radiology, University of California, Los Angeles, Los Angeles, CA, USA.
- Ziyue Xu: NVIDIA Corporation, Santa Clara, CA, USA.
- Dong Yang: NVIDIA Corporation, Santa Clara, CA, USA.
- Daguang Xu: NVIDIA Corporation, Santa Clara, CA, USA.
- Krishnan R Patel: Radiation Oncology Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA.
- Sandeep Gurram: Urologic Oncology Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA.
- Bradford J Wood: Center for Interventional Oncology, National Cancer Institute, NIH, Bethesda, MD, USA; Department of Radiology, Clinical Center, NIH, Bethesda, MD, USA.
- Deborah E Citrin: Radiation Oncology Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA.
- Peter A Pinto: Urologic Oncology Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA.
- Peter L Choyke: Molecular Imaging Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA.
- Baris Turkbey: Molecular Imaging Branch (B.T.), National Cancer Institute, National Institutes of Health, 10 Center Dr., MSC 1182, Building 10, Room B3B85, Bethesda, MD, 20892, USA.
5. Lambert B, Forbes F, Doyle S, Dehaene H, Dojat M. Trustworthy clinical AI solutions: A unified review of uncertainty quantification in Deep Learning models for medical image analysis. Artif Intell Med 2024; 150:102830. [PMID: 38553168] [DOI: 10.1016/j.artmed.2024.102830]
Abstract
Clinical acceptance of Deep Learning (DL) models remains low relative to the number of high-performing solutions reported in the literature, and end users are particularly reluctant to rely on their opaque predictions. Uncertainty quantification methods have been proposed as a potential solution, reducing the black-box effect of DL models and increasing the interpretability and acceptability of results for the final user. In this review, we provide an overview of existing methods to quantify the uncertainty associated with DL predictions. We focus on applications to medical image analysis, which present specific challenges due to the high dimensionality of images and their variable quality, as well as constraints associated with real-world clinical routine. Moreover, we discuss the concept of structural uncertainty, a corpus of methods to facilitate the alignment of segmentation uncertainty estimates with clinical attention. We then discuss the evaluation protocols used to validate the relevance of uncertainty estimates. Finally, we highlight the open challenges for uncertainty quantification in the medical field.
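Among the simplest voxelwise uncertainty measures that reviews of this kind cover is the entropy of a model's per-class probabilities. A minimal NumPy sketch (the function name and example values are mine):

```python
import numpy as np

def predictive_entropy(probs: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Voxelwise predictive entropy of class probabilities.

    probs has shape (..., n_classes) and sums to 1 over the last axis.
    The result is near 0 for a confident one-hot prediction and reaches
    log(n_classes) for a maximally uncertain uniform prediction.
    """
    p = np.clip(probs, eps, 1.0)  # avoid log(0)
    return -(p * np.log(p)).sum(axis=-1)

# A confident voxel vs. an ambiguous one (illustrative values):
voxels = np.array([[0.99, 0.01],
                   [0.50, 0.50]])
h = predictive_entropy(voxels)  # low for the first voxel, log(2) for the second
```

Applied to a whole probability map, this yields an uncertainty image that can be overlaid on the segmentation to direct clinical attention.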
Affiliation(s)
- Benjamin Lambert: Univ. Grenoble Alpes, Inserm, U1216, Grenoble Institut des Neurosciences, Grenoble, 38000, France; Pixyl Research and Development Laboratory, Grenoble, 38000, France.
- Florence Forbes: Univ. Grenoble Alpes, Inria, CNRS, Grenoble INP, LJK, Grenoble, 38000, France.
- Senan Doyle: Pixyl Research and Development Laboratory, Grenoble, 38000, France.
- Harmonie Dehaene: Pixyl Research and Development Laboratory, Grenoble, 38000, France.
- Michel Dojat: Univ. Grenoble Alpes, Inserm, U1216, Grenoble Institut des Neurosciences, Grenoble, 38000, France.
6. Molière S, Hamzaoui D, Granger B, Montagne S, Allera A, Ezziane M, Luzurier A, Quint R, Kalai M, Ayache N, Delingette H, Renard-Penna R. Reference standard for the evaluation of automatic segmentation algorithms: Quantification of inter observer variability of manual delineation of prostate contour on MRI. Diagn Interv Imaging 2024; 105:65-73. [PMID: 37822196] [DOI: 10.1016/j.diii.2023.08.001]
Abstract
PURPOSE The purpose of this study was to investigate inter-reader variability in manual prostate contour segmentation on magnetic resonance imaging (MRI) examinations and to determine the optimal number of readers required to establish a reliable reference standard.
MATERIALS AND METHODS Seven radiologists with various experience independently performed manual segmentation of the prostate contour (whole-gland [WG] and transition zone [TZ]) on 40 prostate MRI examinations obtained in 40 patients. Inter-reader variability in prostate contour delineations was estimated using standard metrics (Dice similarity coefficient [DSC], Hausdorff distance, and volume-based metrics). The impact of the number of readers (from two to seven) on segmentation variability was assessed using pairwise metrics (consistency) and metrics with respect to a reference segmentation (conformity), obtained either with majority voting or with the simultaneous truth and performance level estimation (STAPLE) algorithm.
RESULTS The average segmentation DSC for two readers in pairwise comparison was 0.919 for WG and 0.876 for TZ. Variability decreased with the number of readers: the interquartile ranges of the DSC were 0.076 (WG) / 0.021 (TZ) for configurations with two readers, 0.005 (WG) / 0.012 (TZ) for configurations with three readers, and 0.002 (WG) / 0.0037 (TZ) for configurations with six readers. The interquartile range decreased slightly faster between two and three readers than between three and six readers. When using consensus methods, variability often reached its minimum with three readers (with STAPLE, DSC = 0.96 [range: 0.945-0.971] for WG and DSC = 0.94 [range: 0.912-0.957] for TZ), and the interquartile range was minimal for configurations with three readers.
CONCLUSION The number of readers affects inter-reader variability, in terms of both inter-reader consistency and conformity to a reference. Variability is minimal with three readers, or three readers represent a tipping point in the evolution of variability, for both pairwise metrics and metrics computed with respect to a reference. Accordingly, three readers may represent an optimal number for establishing reference segmentations for artificial intelligence applications.
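Of the two consensus strategies the study compares, majority voting is the simpler: a voxel enters the reference mask when more than half of the readers delineated it. A minimal sketch with hypothetical reader masks (STAPLE, which additionally weights each reader by an estimated performance level, is substantially more involved and not shown):

```python
import numpy as np

def majority_vote(masks: list) -> np.ndarray:
    """Consensus binary mask: foreground where > half of the readers agree."""
    stack = np.stack([np.asarray(m, dtype=bool) for m in masks])
    # strict majority: vote count * 2 must exceed the number of readers
    return stack.sum(axis=0) * 2 > len(masks)

# Three hypothetical reader masks over a 1D strip of four voxels:
r1 = [1, 1, 1, 0]
r2 = [1, 1, 0, 0]
r3 = [1, 0, 0, 0]
consensus = majority_vote([r1, r2, r3])  # [True, True, False, False]
```

With an odd number of readers the strict majority is unambiguous; with an even number, voxels marked by exactly half the readers are excluded under this rule.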
Affiliation(s)
- Sébastien Molière: Department of Radiology, Hôpitaux Universitaire de Strasbourg, Hôpital de Hautepierre, 67200, Strasbourg, France; Breast and Thyroid Imaging Unit, Institut de Cancérologie Strasbourg Europe, 67200, Strasbourg, France; IGBMC, Institut de Génétique et de Biologie Moléculaire et Cellulaire, 67400, Illkirch, France.
- Dimitri Hamzaoui: Inria, Epione Team, Sophia Antipolis, Université Côte d'Azur, 06902, Nice, France.
- Benjamin Granger: Sorbonne Université, INSERM, Institut Pierre Louis d'Epidémiologie et de Santé Publique, IPLESP, AP-HP, Hôpital Pitié Salpêtrière, Département de Santé Publique, 75013, Paris, France.
- Sarah Montagne: Department of Radiology, Hôpital Tenon, Assistance Publique-Hôpitaux de Paris, 75020, Paris, France; Department of Radiology, Hôpital Pitié-Salpétrière, Assistance Publique-Hôpitaux de Paris, 75013, Paris, France; GRC N° 5, Oncotype-Uro, Sorbonne Université, 75020, Paris, France.
- Alexandre Allera: Department of Radiology, Hôpital Pitié-Salpétrière, Assistance Publique-Hôpitaux de Paris, 75013, Paris, France.
- Malek Ezziane: Department of Radiology, Hôpital Pitié-Salpétrière, Assistance Publique-Hôpitaux de Paris, 75013, Paris, France.
- Anna Luzurier: Department of Radiology, Hôpital Pitié-Salpétrière, Assistance Publique-Hôpitaux de Paris, 75013, Paris, France.
- Raphaelle Quint: Department of Radiology, Hôpital Pitié-Salpétrière, Assistance Publique-Hôpitaux de Paris, 75013, Paris, France.
- Mehdi Kalai: Department of Radiology, Hôpital Pitié-Salpétrière, Assistance Publique-Hôpitaux de Paris, 75013, Paris, France.
- Nicholas Ayache: Department of Radiology, Hôpitaux Universitaire de Strasbourg, Hôpital de Hautepierre, 67200, Strasbourg, France.
- Hervé Delingette: Department of Radiology, Hôpitaux Universitaire de Strasbourg, Hôpital de Hautepierre, 67200, Strasbourg, France.
- Raphaële Renard-Penna: Department of Radiology, Hôpital Tenon, Assistance Publique-Hôpitaux de Paris, 75020, Paris, France; Department of Radiology, Hôpital Pitié-Salpétrière, Assistance Publique-Hôpitaux de Paris, 75013, Paris, France; GRC N° 5, Oncotype-Uro, Sorbonne Université, 75020, Paris, France.
7. Kaneko M, Magoulianitis V, Ramacciotti LS, Raman A, Paralkar D, Chen A, Chu TN, Yang Y, Xue J, Yang J, Liu J, Jadvar DS, Gill K, Cacciamani GE, Nikias CL, Duddalwar V, Jay Kuo CC, Gill IS, Abreu AL. The Novel Green Learning Artificial Intelligence for Prostate Cancer Imaging: A Balanced Alternative to Deep Learning and Radiomics. Urol Clin North Am 2024; 51:1-13. [PMID: 37945095] [DOI: 10.1016/j.ucl.2023.08.001]
Abstract
The application of artificial intelligence (AI) to prostate magnetic resonance imaging (MRI) has shown promising results. Several AI systems have been developed to automatically analyze prostate MRI for segmentation, cancer detection, and region-of-interest characterization, thereby assisting clinicians in their decision-making process. Deep learning, the current trend in imaging AI, has limitations, including its lack of transparency (the "black box" problem), large data-processing requirements, and excessive energy consumption. In this narrative review, the authors provide an overview of recent advances in AI for prostate cancer diagnosis and introduce their next-generation AI model, Green Learning, as a promising solution.
Collapse
Affiliation(s)
- Masatomo Kaneko
- USC Institute of Urology and Catherine & Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; USC Institute of Urology, Center for Image-Guided Surgery, Focal Therapy and Artificial Intelligence for Prostate Cancer; Department of Urology, Graduate School of Medical Science, Kyoto Prefectural University of Medicine, Kyoto, Japan
| | - Vasileios Magoulianitis
- Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, USA
| | - Lorenzo Storino Ramacciotti
- USC Institute of Urology and Catherine & Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; USC Institute of Urology, Center for Image-Guided Surgery, Focal Therapy and Artificial Intelligence for Prostate Cancer
- Alex Raman: Western University of Health Sciences, Pomona, CA, USA
- Divyangi Paralkar: USC Institute of Urology and Catherine & Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; USC Institute of Urology, Center for Image-Guided Surgery, Focal Therapy and Artificial Intelligence for Prostate Cancer
- Andrew Chen: USC Institute of Urology and Catherine & Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; USC Institute of Urology, Center for Image-Guided Surgery, Focal Therapy and Artificial Intelligence for Prostate Cancer
- Timothy N Chu: USC Institute of Urology and Catherine & Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; USC Institute of Urology, Center for Image-Guided Surgery, Focal Therapy and Artificial Intelligence for Prostate Cancer
- Yijing Yang: Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, USA
- Jintang Xue: Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, USA
- Jiaxin Yang: Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, USA
- Jinyuan Liu: Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, USA
- Donya S Jadvar: Dornsife School of Letters and Science, University of Southern California, Los Angeles, CA, USA
- Karanvir Gill: USC Institute of Urology and Catherine & Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; USC Institute of Urology, Center for Image-Guided Surgery, Focal Therapy and Artificial Intelligence for Prostate Cancer
- Giovanni E Cacciamani: USC Institute of Urology and Catherine & Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; USC Institute of Urology, Center for Image-Guided Surgery, Focal Therapy and Artificial Intelligence for Prostate Cancer; Department of Radiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Chrysostomos L Nikias: Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, USA
- Vinay Duddalwar: Department of Radiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- C-C Jay Kuo: Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, USA
- Inderbir S Gill: USC Institute of Urology and Catherine & Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Andre Luis Abreu: USC Institute of Urology and Catherine & Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; USC Institute of Urology, Center for Image-Guided Surgery, Focal Therapy and Artificial Intelligence for Prostate Cancer; Department of Radiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
|
8
|
Ostmeier S, Axelrod B, Isensee F, Bertels J, Mlynash M, Christensen S, Lansberg MG, Albers GW, Sheth R, Verhaaren BFJ, Mahammedi A, Li LJ, Zaharchuk G, Heit JJ. USE-Evaluator: Performance metrics for medical image segmentation models supervised by uncertain, small or empty reference annotations in neuroimaging. Med Image Anal 2023; 90:102927. [PMID: 37672900 DOI: 10.1016/j.media.2023.102927] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/19/2022] [Revised: 07/08/2023] [Accepted: 08/03/2023] [Indexed: 09/08/2023]
Abstract
Performance metrics for medical image segmentation models are used to measure the agreement between the reference annotation and the predicted segmentation. Usually, overlap metrics such as the Dice coefficient are used to evaluate the performance of these models so that results remain comparable. However, there is a mismatch between the distributions of cases and the difficulty level of segmentation tasks in public data sets compared to clinical practice. Common metrics used to assess performance fail to capture the impact of this mismatch, particularly when dealing with datasets in clinical settings that involve challenging segmentation tasks, pathologies with low signal, and reference annotations that are uncertain, small, or empty. Limitations of common metrics may result in ineffective machine learning research in designing and optimizing models. To effectively evaluate the clinical value of such models, it is essential to consider factors such as the uncertainty associated with reference annotations, the ability to accurately measure performance regardless of the size of the reference annotation volume, and the classification of cases where reference annotations are empty. We study how uncertain, small, and empty reference annotations influence the value of metrics on an in-house stroke data set, regardless of the model. We examine metric behavior on the predictions of a standard deep learning framework in order to identify suitable metrics in such a setting. We compare our results to the BRATS 2019 and Spinal Cord public data sets. We show how uncertain, small, or empty reference annotations require a rethinking of the evaluation. The evaluation code was released to encourage further analysis of this topic: https://github.com/SophieOstmeier/UncertainSmallEmpty.git.
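The failure mode described above can be made concrete with a small sketch. This is an illustrative toy implementation in NumPy, not the released USE-Evaluator code, and the convention of scoring an all-empty mask pair as 1.0 is an assumption:

```python
import numpy as np

def dice(pred, ref, empty_value=1.0):
    """Dice coefficient between binary masks.

    When both masks are empty the ratio is 0/0; `empty_value` (an assumed
    convention) scores 'correctly predicted nothing' as perfect agreement.
    """
    pred, ref = np.asarray(pred, bool), np.asarray(ref, bool)
    denom = pred.sum() + ref.sum()
    if denom == 0:
        return empty_value
    return 2.0 * np.logical_and(pred, ref).sum() / denom

# One false-positive voxel against an empty reference drives Dice to 0,
# while the same single error against a large reference barely registers.
ref_empty = np.zeros((10, 10), bool)
pred = ref_empty.copy(); pred[0, 0] = True
print(dice(pred, ref_empty))                  # 0.0

ref_large = np.zeros((10, 10), bool); ref_large[2:8, 2:8] = True
pred_large = ref_large.copy(); pred_large[0, 0] = True
print(round(dice(pred_large, ref_large), 3))  # 0.986
```

The asymmetry shown by the two prints is exactly why small and empty reference annotations call for complementary metrics rather than overlap scores alone.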
Affiliation(s)
- Sophie Ostmeier: Stanford University, Center of Academic Medicine, 453 Quarry Rd, Palo Alto, CA 94304, United States of America
- Brian Axelrod: Stanford University, Center of Academic Medicine, 453 Quarry Rd, Palo Alto, CA 94304, United States of America
- Fabian Isensee: Division of Medical Image Computing, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
- Michael Mlynash: Stanford University, Center of Academic Medicine, 453 Quarry Rd, Palo Alto, CA 94304, United States of America
- Maarten G Lansberg: Stanford University, Center of Academic Medicine, 453 Quarry Rd, Palo Alto, CA 94304, United States of America
- Gregory W Albers: Stanford University, Center of Academic Medicine, 453 Quarry Rd, Palo Alto, CA 94304, United States of America
- Abdelkader Mahammedi: Stanford University, Center of Academic Medicine, 453 Quarry Rd, Palo Alto, CA 94304, United States of America
- Li-Jia Li: Stanford University, Center of Academic Medicine, 453 Quarry Rd, Palo Alto, CA 94304, United States of America
- Greg Zaharchuk: Stanford University, Center of Academic Medicine, 453 Quarry Rd, Palo Alto, CA 94304, United States of America
- Jeremy J Heit: Stanford University, Center of Academic Medicine, 453 Quarry Rd, Palo Alto, CA 94304, United States of America
|
9
|
Jin L, Ma Z, Li H, Gao F, Gao P, Yang N, Li D, Li M, Geng D. Interobserver Agreement in Automatic Segmentation Annotation of Prostate Magnetic Resonance Imaging. Bioengineering (Basel) 2023; 10:1340. [PMID: 38135930 PMCID: PMC10740636 DOI: 10.3390/bioengineering10121340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/10/2023] [Revised: 11/10/2023] [Accepted: 11/17/2023] [Indexed: 12/24/2023] Open
Abstract
We aimed to compare the performance and interobserver agreement of radiologists manually segmenting images or those assisted by automatic segmentation. We further aimed to reduce interobserver variability and improve the consistency of radiomics features. This retrospective study included 327 patients diagnosed with prostate cancer from September 2016 to June 2018; images from 228 patients were used for automatic segmentation construction, and images from the remaining 99 were used for testing. First, four radiologists with varying experience levels retrospectively segmented 99 axial prostate images manually using T2-weighted fat-suppressed magnetic resonance imaging. Automatic segmentation was performed 2 weeks later. The Pyradiomics software package v3.1.0 was used to extract the texture features. The Dice coefficient and intraclass correlation coefficient (ICC) were used to evaluate segmentation performance and the interobserver consistency of prostate radiomics. The Wilcoxon rank sum test was used to compare the paired samples, with the significance level set at p < 0.05. The Dice coefficient was used to accurately measure the spatial overlap of manually delineated images. Across all 99 prostate segmentation results, the manual and automatic segmentations of the senior group were significantly better than those of the junior group (p < 0.05). Automatic segmentation was more consistent than manual segmentation (p < 0.05), and the average ICC reached >0.85. The automatic segmentation annotation performance of junior radiologists was similar to that of senior radiologists performing manual segmentation. The ICC of radiomics features increased to excellent consistency (0.925 [0.888~0.950]). Automatic segmentation annotation provided better results than manual segmentation by radiologists.
Our findings indicate that automatic segmentation annotation helps reduce variability in the perception and interpretation between radiologists with different experience levels and ensures the stability of radiomics features.
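For readers unfamiliar with the agreement statistic used above, a two-way random-effects, absolute-agreement, single-rater ICC(2,1) can be computed from an ANOVA decomposition. This is a textbook-style sketch under assumed conventions, not the study's analysis code:

```python
import numpy as np

def icc2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is an (n_subjects, k_raters) array; the mean squares come from
    the standard two-way ANOVA decomposition (rows = subjects, cols = raters).
    """
    x = np.asarray(ratings, float)
    n, k = x.shape
    grand = x.mean()
    row_means = x.mean(axis=1)   # per-subject means
    col_means = x.mean(axis=0)   # per-rater means
    msr = k * np.sum((row_means - grand) ** 2) / (n - 1)           # subjects
    msc = n * np.sum((col_means - grand) ** 2) / (k - 1)           # raters
    mse = np.sum((x - row_means[:, None] - col_means[None, :] + grand) ** 2) \
          / ((n - 1) * (k - 1))                                    # residual
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

A constant offset between raters lowers this absolute-agreement ICC even when their rankings of subjects agree perfectly, which is why it is a stricter consistency measure than correlation.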
Affiliation(s)
- Liang Jin: Radiology Department, Huashan Hospital, Affiliated with Fudan University, Shanghai 200040, China; Radiology Department, Huadong Hospital, Affiliated with Fudan University, Shanghai 200040, China
- Zhuangxuan Ma: Radiology Department, Huadong Hospital, Affiliated with Fudan University, Shanghai 200040, China
- Haiqing Li: Radiology Department, Huashan Hospital, Affiliated with Fudan University, Shanghai 200040, China
- Feng Gao: Radiology Department, Huadong Hospital, Affiliated with Fudan University, Shanghai 200040, China
- Pan Gao: Radiology Department, Huadong Hospital, Affiliated with Fudan University, Shanghai 200040, China
- Nan Yang: Radiology Department, Huadong Hospital, Affiliated with Fudan University, Shanghai 200040, China
- Dechun Li: Radiology Department, Huadong Hospital, Affiliated with Fudan University, Shanghai 200040, China
- Ming Li: Radiology Department, Huadong Hospital, Affiliated with Fudan University, Shanghai 200040, China; Institute of Functional and Molecular Medical Imaging, Shanghai 200040, China
- Daoying Geng: Radiology Department, Huashan Hospital, Affiliated with Fudan University, Shanghai 200040, China; Institute of Functional and Molecular Medical Imaging, Shanghai 200040, China
|
10
|
Meglič J, Sunoqrot MRS, Bathen TF, Elschot M. Label-set impact on deep learning-based prostate segmentation on MRI. Insights Imaging 2023; 14:157. [PMID: 37749333 PMCID: PMC10519913 DOI: 10.1186/s13244-023-01502-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2023] [Accepted: 08/12/2023] [Indexed: 09/27/2023] Open
Abstract
BACKGROUND Prostate segmentation is an essential step in computer-aided detection and diagnosis systems for prostate cancer. Deep learning (DL)-based methods provide good performance for prostate gland and zones segmentation, but little is known about the impact of manual segmentation (that is, label) selection on their performance. In this work, we investigated these effects by obtaining two different expert label-sets for the PROSTATEx I challenge training dataset (n = 198) and using them, in addition to an in-house dataset (n = 233), to assess the effect on segmentation performance. The automatic segmentation method we used was nnU-Net. RESULTS The selection of training/testing label-set had a significant (p < 0.001) impact on model performance. Furthermore, it was found that model performance was significantly (p < 0.001) higher when the model was trained and tested with the same label-set. Moreover, the results showed that agreement between automatic segmentations was significantly (p < 0.0001) higher than agreement between manual segmentations and that the models were able to outperform the human label-sets used to train them. CONCLUSIONS We investigated the impact of label-set selection on the performance of a DL-based prostate segmentation model. We found that the use of different sets of manual prostate gland and zone segmentations has a measurable impact on model performance. Nevertheless, DL-based segmentation appeared to have a greater inter-reader agreement than manual segmentation. More thought should be given to the label-set, with a focus on multicenter manual segmentation and agreement on common procedures. CRITICAL RELEVANCE STATEMENT Label-set selection significantly impacts the performance of a deep learning-based prostate segmentation model. Models trained with different label-sets showed higher agreement with each other than the manual segmentations did. KEY POINTS • Label-set selection has a significant impact on the performance of automatic segmentation models.
• Deep learning-based models demonstrated true learning rather than simply mimicking the label-set. • Automatic segmentation appears to have a greater inter-reader agreement than manual segmentation.
Affiliation(s)
- Jakob Meglič: Department of Circulation and Medical Imaging, Norwegian University of Science and Technology - NTNU, 7030, Trondheim, Norway; Faculty of Medicine, University of Ljubljana, 1000, Ljubljana, Slovenia
- Mohammed R S Sunoqrot: Department of Circulation and Medical Imaging, Norwegian University of Science and Technology - NTNU, 7030, Trondheim, Norway; Department of Radiology and Nuclear Medicine, St. Olavs Hospital, Trondheim University Hospital, 7030, Trondheim, Norway
- Tone Frost Bathen: Department of Circulation and Medical Imaging, Norwegian University of Science and Technology - NTNU, 7030, Trondheim, Norway; Department of Radiology and Nuclear Medicine, St. Olavs Hospital, Trondheim University Hospital, 7030, Trondheim, Norway
- Mattijs Elschot: Department of Circulation and Medical Imaging, Norwegian University of Science and Technology - NTNU, 7030, Trondheim, Norway; Department of Radiology and Nuclear Medicine, St. Olavs Hospital, Trondheim University Hospital, 7030, Trondheim, Norway
|
11
|
Isaksson LJ, Summers P, Mastroleo F, Marvaso G, Corrao G, Vincini MG, Zaffaroni M, Ceci F, Petralia G, Orecchia R, Jereczek-Fossa BA. Automatic Segmentation with Deep Learning in Radiotherapy. Cancers (Basel) 2023; 15:4389. [PMID: 37686665 PMCID: PMC10486603 DOI: 10.3390/cancers15174389] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2023] [Revised: 08/28/2023] [Accepted: 08/30/2023] [Indexed: 09/10/2023] Open
Abstract
This review provides a formal overview of current automatic segmentation studies that use deep learning in radiotherapy. It covers 807 published papers and includes multiple cancer sites, image types (CT/MRI/PET), and segmentation methods. We collect key statistics about the papers to uncover commonalities, trends, and methods, and identify areas where more research might be needed. Moreover, we analyzed the corpus by posing explicit questions aimed at providing high-quality and actionable insights, including: "What should researchers think about when starting a segmentation study?", "How can research practices in medical image segmentation be improved?", "What is missing from the current corpus?", and more. This allowed us to provide practical guidelines on how to conduct a good segmentation study in today's competitive environment that will be useful for future research within the field, regardless of the specific radiotherapeutic subfield. To aid in our analysis, we used the large language model ChatGPT to condense information.
Affiliation(s)
- Lars Johannes Isaksson: Division of Radiation Oncology, IEO European Institute of Oncology IRCCS, 20141 Milan, Italy; Department of Oncology and Hemato-Oncology, University of Milan, 20141 Milan, Italy
- Paul Summers: Division of Radiology, IEO European Institute of Oncology IRCCS, 20141 Milan, Italy
- Federico Mastroleo: Division of Radiation Oncology, IEO European Institute of Oncology IRCCS, 20141 Milan, Italy; Department of Translational Medicine, University of Piemonte Orientale (UPO), 20188 Novara, Italy
- Giulia Marvaso: Division of Radiation Oncology, IEO European Institute of Oncology IRCCS, 20141 Milan, Italy
- Giulia Corrao: Division of Radiation Oncology, IEO European Institute of Oncology IRCCS, 20141 Milan, Italy
- Maria Giulia Vincini: Division of Radiation Oncology, IEO European Institute of Oncology IRCCS, 20141 Milan, Italy
- Mattia Zaffaroni: Division of Radiation Oncology, IEO European Institute of Oncology IRCCS, 20141 Milan, Italy
- Francesco Ceci: Department of Oncology and Hemato-Oncology, University of Milan, 20141 Milan, Italy; Division of Nuclear Medicine, IEO European Institute of Oncology IRCCS, 20141 Milan, Italy
- Giuseppe Petralia: Department of Oncology and Hemato-Oncology, University of Milan, 20141 Milan, Italy; Precision Imaging and Research Unit, Department of Medical Imaging and Radiation Sciences, IEO European Institute of Oncology IRCCS, 20141 Milan, Italy
- Roberto Orecchia: Scientific Directorate, IEO European Institute of Oncology IRCCS, 20141 Milan, Italy
- Barbara Alicja Jereczek-Fossa: Division of Radiation Oncology, IEO European Institute of Oncology IRCCS, 20141 Milan, Italy; Department of Oncology and Hemato-Oncology, University of Milan, 20141 Milan, Italy
|
12
|
Thimansson E, Bengtsson J, Baubeta E, Engman J, Flondell-Sité D, Bjartell A, Zackrisson S. Deep learning algorithm performs similarly to radiologists in the assessment of prostate volume on MRI. Eur Radiol 2023; 33:2519-2528. [PMID: 36371606 PMCID: PMC10017633 DOI: 10.1007/s00330-022-09239-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/12/2022] [Revised: 09/26/2022] [Accepted: 10/13/2022] [Indexed: 11/15/2022]
Abstract
OBJECTIVES Prostate volume (PV) in combination with prostate specific antigen (PSA) yields PSA density which is an increasingly important biomarker. Calculating PV from MRI is a time-consuming, radiologist-dependent task. The aim of this study was to assess whether a deep learning algorithm can replace PI-RADS 2.1 based ellipsoid formula (EF) for calculating PV. METHODS Eight different measures of PV were retrospectively collected for each of 124 patients who underwent radical prostatectomy and preoperative MRI of the prostate (multicenter and multi-scanner MRI at 1.5 and 3 T). Agreement between volumes obtained from the deep learning algorithm (PVDL) and ellipsoid formula by two radiologists (PVEF1 and PVEF2) was evaluated against the reference standard PV obtained by manual planimetry by an expert radiologist (PVMPE). A sensitivity analysis was performed using a prostatectomy specimen as the reference standard. Inter-reader agreement was evaluated between the radiologists using the ellipsoid formula and between the expert and inexperienced radiologists performing manual planimetry. RESULTS PVDL showed better agreement and precision than PVEF1 and PVEF2 using the reference standard PVMPE (mean difference [95% limits of agreement] PVDL: -0.33 [-10.80; 10.14], PVEF1: -3.83 [-19.55; 11.89], PVEF2: -3.05 [-18.55; 12.45]) or the PV determined based on specimen weight (PVDL: -4.22 [-22.52; 14.07], PVEF1: -7.89 [-30.50; 14.73], PVEF2: -6.97 [-30.13; 16.18]). Inter-reader agreement was excellent between the two experienced radiologists using the ellipsoid formula and was good between expert and inexperienced radiologists performing manual planimetry. CONCLUSION Deep learning algorithm performs similarly to radiologists in the assessment of prostate volume on MRI. KEY POINTS • A commercially available deep learning algorithm performs similarly to radiologists in the assessment of prostate volume on MRI.
• The deep-learning algorithm was previously untrained on this heterogeneous multicenter day-to-day practice MRI data set.
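For context, the PI-RADS ellipsoid-formula volume and the reported "mean difference [95% limits of agreement]" (Bland-Altman) statistics can be sketched as follows. The function names and the example values are hypothetical illustrations, not taken from the study:

```python
import numpy as np

def ellipsoid_volume(width_cm, height_cm, length_cm):
    """PI-RADS-style ellipsoid formula: W x H x L x pi/6 (~0.52 x W x H x L)."""
    return width_cm * height_cm * length_cm * np.pi / 6.0

def limits_of_agreement(measured, reference):
    """Bland-Altman mean difference and 95% limits of agreement.

    Returns (mean difference, lower limit, upper limit) of measured - reference.
    """
    d = np.asarray(measured, float) - np.asarray(reference, float)
    m, s = d.mean(), d.std(ddof=1)
    return m, m - 1.96 * s, m + 1.96 * s

# Hypothetical volumes (mL): algorithm estimates vs. manual planimetry reference.
mean_diff, lower, upper = limits_of_agreement([42.0, 55.0, 61.0, 38.0],
                                              [40.0, 57.0, 60.0, 39.0])
```

A narrower [lower; upper] interval at a near-zero mean difference is what "better agreement and precision" refers to in the results above.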
Affiliation(s)
- Erik Thimansson: Department of Translational Medicine, Diagnostic Radiology, Lund University, Carl-Bertil Laurells gata 9, SE-205 02, Malmö, Sweden; Department of Radiology, Helsingborg Hospital, Helsingborg, Sweden
- J Bengtsson: Department of Clinical Sciences, Diagnostic Radiology, Lund University, Lund, Sweden; Department of Imaging and Functional Medicine, Skåne University Hospital, Malmö, Sweden; Department of Imaging and Functional Medicine, Skåne University Hospital, Lund, Sweden
- E Baubeta: Department of Translational Medicine, Diagnostic Radiology, Lund University, Carl-Bertil Laurells gata 9, SE-205 02, Malmö, Sweden; Department of Imaging and Functional Medicine, Skåne University Hospital, Malmö, Sweden; Department of Imaging and Functional Medicine, Skåne University Hospital, Lund, Sweden
- J Engman: Department of Translational Medicine, Diagnostic Radiology, Lund University, Carl-Bertil Laurells gata 9, SE-205 02, Malmö, Sweden; Department of Imaging and Functional Medicine, Skåne University Hospital, Malmö, Sweden; Department of Imaging and Functional Medicine, Skåne University Hospital, Lund, Sweden
- D Flondell-Sité: Department of Translational Medicine, Urological Cancers, Lund University, Malmö, Sweden; Department of Urology, Skåne University Hospital, Malmö, Sweden
- A Bjartell: Department of Translational Medicine, Urological Cancers, Lund University, Malmö, Sweden; Department of Urology, Skåne University Hospital, Malmö, Sweden
- S Zackrisson: Department of Translational Medicine, Diagnostic Radiology, Lund University, Carl-Bertil Laurells gata 9, SE-205 02, Malmö, Sweden; Department of Imaging and Functional Medicine, Skåne University Hospital, Malmö, Sweden; Department of Imaging and Functional Medicine, Skåne University Hospital, Lund, Sweden
|
13
|
Montazerolghaem M, Sun Y, Sasso G, Haworth A. U-Net Architecture for Prostate Segmentation: The Impact of Loss Function on System Performance. Bioengineering (Basel) 2023; 10:412. [PMID: 37106600 PMCID: PMC10135670 DOI: 10.3390/bioengineering10040412] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/06/2023] [Revised: 03/19/2023] [Accepted: 03/23/2023] [Indexed: 03/29/2023] Open
Abstract
Segmentation of the prostate gland from magnetic resonance images is rapidly becoming a standard of care in prostate cancer radiotherapy treatment planning. Automating this process has the potential to improve accuracy and efficiency. However, the performance and accuracy of deep learning models varies depending on the design and optimal tuning of the hyper-parameters. In this study, we examine the effect of loss functions on the performance of deep-learning-based prostate segmentation models. A U-Net model for prostate segmentation using T2-weighted images from a local dataset was trained and performance compared when using nine different loss functions, including: Binary Cross-Entropy (BCE), Intersection over Union (IoU), Dice, BCE and Dice (BCE + Dice), weighted BCE and Dice (W (BCE + Dice)), Focal, Tversky, Focal Tversky, and Surface loss functions. Model outputs were compared using several metrics on a five-fold cross-validation set. Ranking of model performance was found to be dependent on the metric used to measure performance, but in general, W (BCE + Dice) and Focal Tversky performed well for all metrics (whole gland Dice similarity coefficient (DSC): 0.71 and 0.74; 95HD: 6.66 and 7.42; Ravid 0.05 and 0.18, respectively) and Surface loss generally ranked lowest (DSC: 0.40; 95HD: 13.64; Ravid −0.09). When comparing the performance of the models for the mid-gland, apex, and base parts of the prostate gland, the models’ performance was lower for the apex and base compared to the mid-gland. In conclusion, we have demonstrated that the performance of a deep learning model for prostate segmentation can be affected by the choice of loss function. For prostate segmentation, it would appear that compound loss functions generally outperform single loss functions such as Surface loss.
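The single and compound losses compared above can be sketched in a few lines of NumPy (predicted probabilities `p`, binary targets `y`). This illustrates the loss families only; it is not the paper's training code, and the smoothing constant `eps` and the 0.5/0.5 weights are assumptions:

```python
import numpy as np

def bce_loss(p, y, eps=1e-7):
    """Binary cross-entropy averaged over a flattened probability map."""
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def dice_loss(p, y, eps=1e-7):
    """Soft Dice loss: 1 minus the soft Dice overlap."""
    return 1.0 - (2.0 * np.sum(p * y) + eps) / (np.sum(p) + np.sum(y) + eps)

def tversky_loss(p, y, alpha=0.5, beta=0.5, eps=1e-7):
    """Tversky loss; alpha/beta trade off false positives vs. false negatives.
    With alpha = beta = 0.5 it reduces to the soft Dice loss."""
    tp = np.sum(p * y)
    fp = np.sum(p * (1 - y))
    fn = np.sum((1 - p) * y)
    return 1.0 - (tp + eps) / (tp + alpha * fp + beta * fn + eps)

def weighted_bce_dice(p, y, w_bce=0.5, w_dice=0.5):
    """Compound W(BCE + Dice) loss as a weighted sum of the two terms."""
    return w_bce * bce_loss(p, y) + w_dice * dice_loss(p, y)
```

One plausible reading of the ranking above is that compound losses such as `weighted_bce_dice` combine the stable per-pixel gradients of BCE with the overlap-driven signal of Dice.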
|
14
|
Xu L, Zhang G, Zhang D, Zhang J, Zhang X, Bai X, Chen L, Peng Q, Jin R, Mao L, Li X, Jin Z, Sun H. Development and clinical utility analysis of a prostate zonal segmentation model on T2-weighted imaging: a multicenter study. Insights Imaging 2023; 14:44. [PMID: 36928683 PMCID: PMC10020392 DOI: 10.1186/s13244-023-01394-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2022] [Accepted: 02/19/2023] [Indexed: 03/18/2023] Open
Abstract
OBJECTIVES To automatically segment prostate central gland (CG) and peripheral zone (PZ) on T2-weighted imaging using deep learning and assess the model's clinical utility by comparing it with a radiologist annotation and analyzing relevant influencing factors, especially the prostate zonal volume. METHODS A 3D U-Net-based model was trained with 223 patients from one institution and tested using one internal testing group (n = 93) and two external testing datasets, including one public dataset (ETDpub, n = 141) and one private dataset from two centers (ETDpri, n = 59). The Dice similarity coefficients (DSCs), 95th Hausdorff distance (95HD), and average boundary distance (ABD) were calculated to evaluate the model's performance and further compared with a junior radiologist's performance in ETDpub. To investigate factors influencing the model performance, patients' clinical characteristics, prostate morphology, and image parameters in ETDpri were collected and analyzed using beta regression. RESULTS The DSCs in the internal testing group, ETDpub, and ETDpri were 0.909, 0.889, and 0.869 for CG, and 0.844, 0.755, and 0.764 for PZ, respectively. The mean 95HD and ABD were less than 7.0 and 1.3 for both zones. The U-Net model outperformed the junior radiologist, having a higher DSC (0.769 vs. 0.706) and higher intraclass correlation coefficient for volume estimation in PZ (0.836 vs. 0.668). CG volume and Magnetic Resonance (MR) vendor were significant influencing factors for CG and PZ segmentation. CONCLUSIONS The 3D U-Net model showed good performance for CG and PZ auto-segmentation in all the testing groups and outperformed the junior radiologist for PZ segmentation. The model performance was susceptible to prostate morphology and MR scanner parameters.
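The 95HD metric reported above is the 95th-percentile variant of the Hausdorff distance, which is less sensitive to boundary outliers than the classic maximum. A brute-force sketch over 2D boundary point sets (assuming coordinates already in physical units; not the study's implementation) is:

```python
import numpy as np

def hd95(points_a, points_b):
    """95th-percentile Hausdorff distance between two point sets of shape (N, 2).

    For every point, take the distance to the nearest point of the other set;
    HD95 is the 95th percentile of all such nearest distances, pooled from
    both directions.
    """
    a = np.asarray(points_a, float)
    b = np.asarray(points_b, float)
    # Pairwise Euclidean distances via broadcasting: shape (N, M).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    nearest = np.concatenate([d.min(axis=1), d.min(axis=0)])
    return np.percentile(nearest, 95)
```

In practice the point sets would come from the contours of the predicted and reference zone masks; taking the 95th percentile discards the worst 5% of boundary disagreements.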
Affiliation(s)
- Lili Xu: Department of Radiology, State Key Laboratory of Complex Severe and Rare Disease, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, Shuaifuyuan No.1, Wangfujing Street, Dongcheng District, Beijing, 100730, China; National Center for Quality Control of Radiology, Beijing, China
- Gumuyang Zhang: Department of Radiology, State Key Laboratory of Complex Severe and Rare Disease, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, Shuaifuyuan No.1, Wangfujing Street, Dongcheng District, Beijing, 100730, China
- Daming Zhang: Department of Radiology, State Key Laboratory of Complex Severe and Rare Disease, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, Shuaifuyuan No.1, Wangfujing Street, Dongcheng District, Beijing, 100730, China
- Jiahui Zhang: Department of Radiology, State Key Laboratory of Complex Severe and Rare Disease, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, Shuaifuyuan No.1, Wangfujing Street, Dongcheng District, Beijing, 100730, China
- Xiaoxiao Zhang: Department of Radiology, State Key Laboratory of Complex Severe and Rare Disease, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, Shuaifuyuan No.1, Wangfujing Street, Dongcheng District, Beijing, 100730, China
- Xin Bai: Department of Radiology, State Key Laboratory of Complex Severe and Rare Disease, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, Shuaifuyuan No.1, Wangfujing Street, Dongcheng District, Beijing, 100730, China
- Li Chen: Department of Radiology, State Key Laboratory of Complex Severe and Rare Disease, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, Shuaifuyuan No.1, Wangfujing Street, Dongcheng District, Beijing, 100730, China
- Qianyu Peng: Department of Radiology, State Key Laboratory of Complex Severe and Rare Disease, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, Shuaifuyuan No.1, Wangfujing Street, Dongcheng District, Beijing, 100730, China
- Ru Jin: Department of Radiology, State Key Laboratory of Complex Severe and Rare Disease, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, Shuaifuyuan No.1, Wangfujing Street, Dongcheng District, Beijing, 100730, China
- Li Mao: AI Lab, Deepwise Healthcare, Beijing, China
- Xiuli Li: AI Lab, Deepwise Healthcare, Beijing, China
- Zhengyu Jin: Department of Radiology, State Key Laboratory of Complex Severe and Rare Disease, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, Shuaifuyuan No.1, Wangfujing Street, Dongcheng District, Beijing, 100730, China; National Center for Quality Control of Radiology, Beijing, China
- Hao Sun: Department of Radiology, State Key Laboratory of Complex Severe and Rare Disease, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, Shuaifuyuan No.1, Wangfujing Street, Dongcheng District, Beijing, 100730, China; National Center for Quality Control of Radiology, Beijing, China
|
15
|
Kushwaha A, Mourad RF, Heist K, Tariq H, Chan HP, Ross BD, Chenevert TL, Malyarenko D, Hadjiiski LM. Improved Repeatability of Mouse Tibia Volume Segmentation in Murine Myelofibrosis Model Using Deep Learning. Tomography 2023; 9:589-602. [PMID: 36961007 PMCID: PMC10037585 DOI: 10.3390/tomography9020048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2023] [Revised: 03/02/2023] [Accepted: 03/03/2023] [Indexed: 03/09/2023] Open
Abstract
A murine model of myelofibrosis in tibia was used in a co-clinical trial to evaluate segmentation methods for application of image-based biomarkers to assess disease status. The dataset (32 mice with 157 3D MRI scans including 49 test-retest pairs scanned on consecutive days) was split into approximately 70% training, 10% validation, and 20% test subsets. Two expert annotators (EA1 and EA2) performed manual segmentations of the mouse tibia (EA1: all data; EA2: test and validation). Attention U-net (A-U-net) model performance was assessed for accuracy with respect to EA1 reference using the average Jaccard index (AJI), volume intersection ratio (AVI), volume error (AVE), and Hausdorff distance (AHD) for four training scenarios: full training, two half-splits, and a single-mouse subset. The repeatability of computer versus expert segmentations for tibia volume of test-retest pairs was assessed by within-subject coefficient of variance (%wCV). A-U-net models trained on full and half-split training sets achieved similar average accuracy (with respect to EA1 annotations) for test set: AJI = 83-84%, AVI = 89-90%, AVE = 2-3%, and AHD = 0.5 mm-0.7 mm, exceeding EA2 accuracy: AJI = 81%, AVI = 83%, AVE = 14%, and AHD = 0.3 mm. The A-U-net model repeatability wCV [95% CI]: 3 [2, 5]% was notably better than that of expert annotators EA1: 5 [4, 9]% and EA2: 8 [6, 13]%. The developed deep learning model effectively automates murine bone marrow segmentation with accuracy comparable to human annotators and substantially improved repeatability.
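The %wCV repeatability statistic used above can be sketched from test-retest volume pairs. The exact estimator in the study may differ, so this standard pair-difference formula is an assumption:

```python
import numpy as np

def within_subject_cv(scan1, scan2):
    """%wCV from test-retest volume pairs.

    For each subject, the within-subject variance estimated from one
    test-retest pair is d^2 / 2 (d = retest difference); the squared CV is
    that variance over the squared pair mean, and %wCV is the square root of
    the mean squared CV, times 100.
    """
    v1 = np.asarray(scan1, float)
    v2 = np.asarray(scan2, float)
    pair_mean = (v1 + v2) / 2.0
    cv_sq = ((v1 - v2) ** 2 / 2.0) / pair_mean ** 2
    return 100.0 * np.sqrt(cv_sq.mean())
```

A lower %wCV means the measurement pipeline (here, the segmentation model) reproduces the same volume on consecutive-day scans, which is the property the co-clinical trial evaluates.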
16
A Soft Label Method for Medical Image Segmentation with Multirater Annotations. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2023; 2023:1883597. [PMID: 36851939 PMCID: PMC9966563 DOI: 10.1155/2023/1883597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 06/22/2022] [Revised: 10/04/2022] [Accepted: 10/06/2022] [Indexed: 02/20/2023]
Abstract
In medical image analysis, collecting multiple annotations from different clinical raters is a typical practice to mitigate possible diagnostic errors. For such multirater label learning problems, beyond majority voting, it is common practice to train the model on soft labels, i.e., full probability distributions obtained by averaging the raters' annotations, so that the model benefits from the uncertainty contained in the soft labels. However, the potential information contained in soft labels is rarely studied, and it may be the key to improving the performance of medical image segmentation with multirater annotations. In this work, we aim to improve soft label methods by leveraging interpretable information from multiple raters. Considering that mis-segmentation occurs in areas with weak annotation supervision and high image difficulty, we propose to reduce the reliance on locally uncertain soft labels and increase the focus on image features. Therefore, we introduce local self-ensembling learning with consistency regularization, forcing the model to concentrate more on features rather than annotations, especially in regions with high uncertainty as measured by the pixelwise interclass variance. Furthermore, we utilize a label smoothing technique to flatten each rater's annotation, alleviating overconfidence at structural edges in the annotations. Without introducing additional parameters, our method improves the accuracy of the soft label baseline by 4.2% and 2.7% on a synthetic dataset and a fundus dataset, respectively. In addition, quantitative comparisons show that our method consistently outperforms existing multirater strategies as well as state-of-the-art methods. This work provides a simple yet effective solution for the widespread multirater label segmentation problems in clinical diagnosis.
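The soft-label ingredients described above (rater averaging, per-rater label smoothing, and pixelwise interclass variance as an uncertainty measure) can be sketched as follows; function names are illustrative, not taken from the paper.

```python
import numpy as np

def soft_label(annotations):
    """Average several raters' binary masks into a per-pixel
    probability map (the soft label)."""
    return np.mean(np.asarray(annotations, dtype=float), axis=0)

def smooth_label(mask, eps=0.1):
    """Label smoothing: pull one rater's hard 0/1 mask toward 0.5,
    flattening overconfident structural edges."""
    mask = np.asarray(mask, dtype=float)
    return mask * (1.0 - eps) + 0.5 * eps

def interclass_variance(soft):
    """Pixelwise uncertainty proxy: variance of a Bernoulli variable
    with probability given by the soft label (maximal, 0.25, at p = 0.5)."""
    soft = np.asarray(soft, dtype=float)
    return soft * (1.0 - soft)
```

Pixels where raters disagree get soft-label values near 0.5 and hence high interclass variance, which is where the method shifts supervision from annotations toward image features.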
17
Isaksson LJ, Pepa M, Summers P, Zaffaroni M, Vincini MG, Corrao G, Mazzola GC, Rotondi M, Lo Presti G, Raimondi S, Gandini S, Volpe S, Haron Z, Alessi S, Pricolo P, Mistretta FA, Luzzago S, Cattani F, Musi G, Cobelli OD, Cremonesi M, Orecchia R, Marvaso G, Petralia G, Jereczek-Fossa BA. Comparison of automated segmentation techniques for magnetic resonance images of the prostate. BMC Med Imaging 2023; 23:32. [PMID: 36774463 PMCID: PMC9921124 DOI: 10.1186/s12880-023-00974-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/06/2022] [Accepted: 01/20/2023] [Indexed: 02/13/2023] Open
Abstract
BACKGROUND Contouring of anatomical regions is a crucial step in the medical workflow and is both time-consuming and prone to intra- and inter-observer variability. This study compares different strategies for automatic segmentation of the prostate in T2-weighted MRIs. METHODS This study included 100 patients diagnosed with prostate adenocarcinoma who had undergone multi-parametric MRI and prostatectomy. From the T2-weighted MR images, ground truth segmentation masks were established by consensus from two expert radiologists. The prostate was then automatically contoured with six different methods: (1) a multi-atlas algorithm, (2) a proprietary algorithm in the Syngo.Via medical imaging software, and four deep learning models: (3) a V-net trained from scratch, (4) a pre-trained 2D U-net, (5) a GAN extension of the 2D U-net, and (6) a segmentation-adapted EfficientDet architecture. The resulting segmentations were compared and scored against the ground truth masks with one 70/30 and one 50/50 train/test data split. We also analyzed the association between segmentation performance and clinical variables. RESULTS The best performing method was the adapted EfficientDet (model 6), achieving a mean Dice coefficient of 0.914, a mean absolute volume difference of 5.9%, a mean surface distance (MSD) of 1.93 pixels, and a mean 95th percentile Hausdorff distance of 3.77 pixels. The deep learning models were less prone to serious errors (0.854 minimum Dice and 4.02 maximum MSD), and no significant relationship was found between segmentation performance and clinical variables. CONCLUSIONS Deep learning-based segmentation techniques can consistently achieve Dice coefficients of 0.9 or above with as few as 50 training patients, regardless of architectural archetype. The atlas-based and Syngo.Via methods found in commercial clinical software performed significantly worse (0.855-0.887 Dice).
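The headline metrics in this comparison, Dice coefficient and absolute volume difference, are straightforward to compute from binary masks. A minimal sketch (illustrative only, not the study's evaluation code):

```python
import numpy as np

def dice(a, b):
    """Dice coefficient between two boolean masks: 2|A∩B| / (|A|+|B|)."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

def abs_volume_diff_pct(pred, ref, voxel_ml=1.0):
    """Absolute volume difference as a percentage of the reference volume."""
    vp = np.asarray(pred, bool).sum() * voxel_ml
    vr = np.asarray(ref, bool).sum() * voxel_ml
    return 100.0 * abs(vp - vr) / vr
```

Dice rewards overlap while the volume difference catches systematic over- or under-segmentation even when overlap is good, which is why studies like this report both.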
Affiliation(s)
- Lars Johannes Isaksson: Department of Radiation Oncology, IEO European Institute of Oncology IRCCS, Milan, Italy
- Matteo Pepa: Department of Radiation Oncology, IEO European Institute of Oncology IRCCS, Milan, Italy
- Paul Summers: Division of Radiology, IEO European Institute of Oncology IRCCS, Milan, Italy
- Mattia Zaffaroni: Department of Radiation Oncology, IEO European Institute of Oncology IRCCS, Milan, Italy
- Maria Giulia Vincini: Department of Radiation Oncology, IEO European Institute of Oncology IRCCS, Milan, Italy
- Giulia Corrao: Department of Radiation Oncology, IEO European Institute of Oncology IRCCS, Milan, Italy
- Giovanni Carlo Mazzola: Department of Radiation Oncology, IEO European Institute of Oncology IRCCS, Milan, Italy; Department of Oncology and Hemato-Oncology, University of Milan, Milan, Italy
- Marco Rotondi: Department of Radiation Oncology, IEO European Institute of Oncology IRCCS, Milan, Italy; Department of Oncology and Hemato-Oncology, University of Milan, Milan, Italy
- Giuliana Lo Presti: Molecular and Pharmaco-Epidemiology Unit, Department of Experimental Oncology, IEO European Institute of Oncology IRCCS, Milan, Italy
- Sara Raimondi: Molecular and Pharmaco-Epidemiology Unit, Department of Experimental Oncology, IEO European Institute of Oncology IRCCS, Milan, Italy
- Sara Gandini: Molecular and Pharmaco-Epidemiology Unit, Department of Experimental Oncology, IEO European Institute of Oncology IRCCS, Milan, Italy
- Stefania Volpe: Department of Radiation Oncology, IEO European Institute of Oncology IRCCS, Milan, Italy; Department of Oncology and Hemato-Oncology, University of Milan, Milan, Italy
- Zaharudin Haron: Radiology Department, National Cancer Institute, Putrajaya, Malaysia
- Sarah Alessi: Division of Radiology, IEO European Institute of Oncology IRCCS, Milan, Italy
- Paola Pricolo: Division of Radiology, IEO European Institute of Oncology IRCCS, Milan, Italy
- Francesco Alessandro Mistretta: Department of Oncology and Hemato-Oncology, University of Milan, Milan, Italy; Division of Urology, IEO European Institute of Oncology IRCCS, Milan, Italy
- Stefano Luzzago: Department of Oncology and Hemato-Oncology, University of Milan, Milan, Italy; Division of Urology, IEO European Institute of Oncology IRCCS, Milan, Italy
- Federica Cattani: Medical Physics Unit, IEO European Institute of Oncology IRCCS, Milan, Italy
- Gennaro Musi: Department of Oncology and Hemato-Oncology, University of Milan, Milan, Italy; Division of Urology, IEO European Institute of Oncology IRCCS, Milan, Italy
- Ottavio De Cobelli: Department of Oncology and Hemato-Oncology, University of Milan, Milan, Italy; Division of Urology, IEO European Institute of Oncology IRCCS, Milan, Italy
- Marta Cremonesi: Radiation Research Unit, IEO European Institute of Oncology IRCCS, Milan, Italy
- Roberto Orecchia: Scientific Direction, IEO European Institute of Oncology IRCCS, Milan, Italy
- Giulia Marvaso: Department of Radiation Oncology, IEO European Institute of Oncology IRCCS, Milan, Italy; Department of Oncology and Hemato-Oncology, University of Milan, Milan, Italy
- Giuseppe Petralia: Department of Oncology and Hemato-Oncology, University of Milan, Milan, Italy; Precision Imaging and Research Unit, Department of Medical Imaging and Radiation Sciences, IEO European Institute of Oncology IRCCS, Milan, Italy
- Barbara Alicja Jereczek-Fossa: Department of Radiation Oncology, IEO European Institute of Oncology IRCCS, Milan, Italy; Department of Oncology and Hemato-Oncology, University of Milan, Milan, Italy
18
Canellas R, Kohli MD, Westphalen AC. The Evidence for Using Artificial Intelligence to Enhance Prostate Cancer MR Imaging. Curr Oncol Rep 2023; 25:243-250. [PMID: 36749494 DOI: 10.1007/s11912-023-01371-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 11/14/2022] [Indexed: 02/08/2023]
Abstract
PURPOSE OF REVIEW The purpose of this review is to summarize the current status of artificial intelligence applied to prostate cancer MR imaging. RECENT FINDINGS Artificial intelligence has been applied to prostate cancer MR imaging to improve its diagnostic accuracy and the reproducibility of interpretation. Multiple models have been tested for gland segmentation and volume calculation, automated lesion detection, localization, and characterization, as well as prediction of tumor aggressiveness and tumor recurrence. Studies show, for example, that very robust automated gland segmentation and volume calculations can be achieved and that lesions can be detected and accurately characterized. Although results are promising, we should view them with caution. Most studies included a small sample of patients from a single institution, and most models did not undergo proper external validation. More research with larger, well-designed studies is needed to develop reliable artificial intelligence tools.
Affiliation(s)
- Rodrigo Canellas: Department of Radiology, University of Washington, 1959 NE Pacific St., 2nd Floor, Seattle, WA, 98195, USA
- Marc D Kohli: Clinical Informatics, Department of Radiology and Biomedical Imaging, University of California, San Francisco, CA, 94143, USA; Imaging Informatics, UCSF Health, 500 Parnassus Ave, 3rd Floor, San Francisco, CA, 94143, USA
- Antonio C Westphalen: Department of Radiology, Department of Urology, and Department of Radiation Oncology, University of Washington, 1959 NE Pacific St., 2nd Floor, Seattle, WA, 98195, USA
19
Wu C, Montagne S, Hamzaoui D, Ayache N, Delingette H, Renard-Penna R. Automatic segmentation of prostate zonal anatomy on MRI: a systematic review of the literature. Insights Imaging 2022; 13:202. [PMID: 36543901 PMCID: PMC9772373 DOI: 10.1186/s13244-022-01340-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/11/2022] [Accepted: 11/27/2022] [Indexed: 12/24/2022] Open
Abstract
OBJECTIVES Accurate zonal segmentation of prostate boundaries on MRI is a critical prerequisite for automated prostate cancer detection based on PI-RADS. Many articles have been published describing deep learning methods offering great promise for fast and accurate segmentation of prostate zonal anatomy. The objective of this review was to provide a detailed analysis and comparison of the applicability and efficiency of published methods for automatic segmentation of prostate zonal anatomy by systematically reviewing the current literature. METHODS A systematic search following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines was conducted up to June 30, 2021, using the PubMed, ScienceDirect, Web of Science and EMBase databases. Risk of bias and applicability were assessed based on the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) criteria, adjusted with the Checklist for Artificial Intelligence in Medical Imaging (CLAIM). RESULTS A total of 458 articles were identified, and 33 were included and reviewed. Only 2 articles had a low risk of bias for all four QUADAS-2 domains. In the remainder, insufficient detail about database constitution and the segmentation protocol introduced sources of bias (inclusion criteria, MRI acquisition, ground truth). Eighteen different types of terminology for prostate zone segmentation were found, whereas 4 anatomic zones are described on MRI. Only 2 authors used a blinded reading, and 4 assessed inter-observer variability. CONCLUSIONS Our review identified numerous methodological flaws and underlined biases that precluded quantitative analysis. This implies low robustness and low applicability in clinical practice of the evaluated methods. Indeed, there is not yet consensus on quality criteria for database constitution and zonal segmentation methodology.
Affiliation(s)
- Carine Wu: Sorbonne Université, Paris, France; Academic Department of Radiology, Hôpital Tenon, Assistance Publique des Hôpitaux de Paris, 4 Rue de La Chine, 75020, Paris, France
- Sarah Montagne: Sorbonne Université, Paris, France; Academic Department of Radiology, Hôpital Tenon, Assistance Publique des Hôpitaux de Paris, 4 Rue de La Chine, 75020, Paris, France; Academic Department of Radiology, Hôpital Pitié-Salpétrière, Assistance Publique des Hôpitaux de Paris, Paris, France; GRC N° 5, Oncotype-Uro, Sorbonne Université, Paris, France
- Dimitri Hamzaoui: Inria, Epione Team, Sophia Antipolis, Université Côte d'Azur, Nice, France
- Nicholas Ayache: Inria, Epione Team, Sophia Antipolis, Université Côte d'Azur, Nice, France
- Hervé Delingette: Inria, Epione Team, Sophia Antipolis, Université Côte d'Azur, Nice, France
- Raphaële Renard-Penna: Sorbonne Université, Paris, France; Academic Department of Radiology, Hôpital Tenon, Assistance Publique des Hôpitaux de Paris, 4 Rue de La Chine, 75020, Paris, France; Academic Department of Radiology, Hôpital Pitié-Salpétrière, Assistance Publique des Hôpitaux de Paris, Paris, France; GRC N° 5, Oncotype-Uro, Sorbonne Université, Paris, France
20
Hamzaoui D, Montagne S, Renard-Penna R, Ayache N, Delingette H. Automatic zonal segmentation of the prostate from 2D and 3D T2-weighted MRI and evaluation for clinical use. J Med Imaging (Bellingham) 2022; 9:024001. [PMID: 35300345 PMCID: PMC8920492 DOI: 10.1117/1.jmi.9.2.024001] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/24/2021] [Accepted: 02/23/2022] [Indexed: 11/14/2022] Open
Abstract
Purpose: An accurate zonal segmentation of the prostate is required for prostate cancer (PCa) management with MRI. Approach: The aim of this work is to present UFNet, a deep learning-based method for automatic zonal segmentation of the prostate from T2-weighted (T2w) MRI. It takes into account the image anisotropy, includes both spatial and channelwise attention mechanisms, and uses loss functions to enforce prostate partition. The method was applied to a private multicentric three-dimensional T2w MRI dataset and to the public two-dimensional T2w MRI dataset ProstateX. To assess the model performance, the structures segmented by the algorithm on the private dataset were compared with those obtained by seven radiologists of various experience levels. Results: On the private dataset, we obtained a Dice score (DSC) of 93.90 ± 2.85 for the whole gland (WG), 91.00 ± 4.34 for the transition zone (TZ), and 79.08 ± 7.08 for the peripheral zone (PZ). Results were significantly better than those of the other networks compared (p < 0.05). On ProstateX, we obtained a DSC of 90.90 ± 2.94 for WG, 86.84 ± 4.33 for TZ, and 78.40 ± 7.31 for PZ. These results are similar to state-of-the-art results and, on the private dataset, are coherent with those obtained by radiologists. Zonal locations and sectorial positions of lesions annotated by radiologists were also preserved. Conclusions: Deep learning-based methods can provide an accurate zonal segmentation of the prostate leading to a consistent zonal location and sectorial position of lesions, and can therefore be used as a supporting tool for PCa diagnosis.
Affiliation(s)
- Dimitri Hamzaoui: Université Côte d'Azur, Inria, Epione Project-Team, Sophia Antipolis, Valbonne, France
- Sarah Montagne: Sorbonne Université, Radiology Department, CHU La Pitié Salpétrière/Tenon, Paris, France
- Raphaële Renard-Penna: Sorbonne Université, Radiology Department, CHU La Pitié Salpétrière/Tenon, Paris, France
- Nicholas Ayache: Université Côte d'Azur, Inria, Epione Project-Team, Sophia Antipolis, Valbonne, France
- Hervé Delingette: Université Côte d'Azur, Inria, Epione Project-Team, Sophia Antipolis, Valbonne, France
21
Liu X, Xing F, Marin T, Fakhri GE, Woo J. Variational Inference for Quantifying Inter-observer Variability in Segmentation of Anatomical Structures. PROCEEDINGS OF SPIE--THE INTERNATIONAL SOCIETY FOR OPTICAL ENGINEERING 2022; 12032:120321M. [PMID: 36303579 PMCID: PMC9603619 DOI: 10.1117/12.2604547] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
Lesions or organ boundaries visible through medical imaging data are often ambiguous, thus resulting in significant variations in multi-reader delineations, i.e., the source of aleatoric uncertainty. In particular, quantifying the inter-observer variability of manual annotations with Magnetic Resonance (MR) Imaging data plays a crucial role in establishing a reference standard for various diagnosis and treatment tasks. Most segmentation methods, however, simply model a mapping from an image to its single segmentation map and do not take the disagreement of annotators into consideration. In order to account for inter-observer variability, without sacrificing accuracy, we propose a novel variational inference framework to model the distribution of plausible segmentation maps, given a specific MR image, which explicitly represents the multi-reader variability. Specifically, we resort to a latent vector to encode the multi-reader variability and counteract the inherent information loss in the imaging data. Then, we apply a variational autoencoder network and optimize its evidence lower bound (ELBO) to efficiently approximate the distribution of the segmentation map, given an MR image. Experimental results on the QUBIQ brain growth MRI segmentation dataset with seven annotators demonstrate the effectiveness of our approach.
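The ELBO objective mentioned above combines a reconstruction likelihood with a KL term against the latent prior. A minimal numerical sketch for a diagonal-Gaussian posterior and a pixelwise Bernoulli decoder (illustrative only; the paper's network architecture and training loop are not reproduced here):

```python
import numpy as np

def kl_std_normal(mu, logvar):
    """KL(q||p) between a diagonal Gaussian q = N(mu, diag(exp(logvar)))
    and the standard normal prior, summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)

def bernoulli_ll(target, prob, eps=1e-7):
    """Pixelwise Bernoulli log-likelihood of a (possibly soft) target
    segmentation under the decoder's predicted probabilities."""
    prob = np.clip(np.asarray(prob, float), eps, 1.0 - eps)
    target = np.asarray(target, float)
    return np.sum(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob))

def elbo(target, prob, mu, logvar):
    """Evidence lower bound: reconstruction term minus KL term."""
    return bernoulli_ll(target, prob) - kl_std_normal(mu, logvar)
```

At inference, sampling different latent vectors from the prior and decoding each one yields a set of plausible segmentation maps, which is how such a model expresses multi-reader variability.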
Affiliation(s)
- Xiaofeng Liu: Gordon Center for Medical Imaging, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Fangxu Xing: Gordon Center for Medical Imaging, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Thibault Marin: Gordon Center for Medical Imaging, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Georges El Fakhri: Gordon Center for Medical Imaging, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Jonghye Woo: Gordon Center for Medical Imaging, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
22
Rouvière O, Moldovan PC, Vlachomitrou A, Gouttard S, Riche B, Groth A, Rabotnikov M, Ruffion A, Colombel M, Crouzet S, Weese J, Rabilloud M. Combined model-based and deep learning-based automated 3D zonal segmentation of the prostate on T2-weighted MR images: clinical evaluation. Eur Radiol 2022; 32:3248-3259. [PMID: 35001157 DOI: 10.1007/s00330-021-08408-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/06/2021] [Revised: 09/28/2021] [Accepted: 10/09/2021] [Indexed: 11/04/2022]
Abstract
OBJECTIVE To train and to test for prostate zonal segmentation an existing algorithm already trained for whole-gland segmentation. METHODS The algorithm, combining model-based and deep learning-based approaches, was trained for zonal segmentation using the NCI-ISBI-2013 dataset and 70 T2-weighted datasets acquired at an academic centre. Test datasets were randomly selected among examinations performed at this centre on one of two scanners (General Electric, 1.5 T; Philips, 3 T) not used for training. Automated segmentations were corrected by two independent radiologists. When segmentation was initiated outside the prostate, images were cropped and segmentation repeated. Factors influencing the algorithm's mean Dice similarity coefficient (DSC) and its precision were assessed using beta regression. RESULTS Eighty-two test datasets were selected; one was excluded. In 13/81 datasets, segmentation started outside the prostate, but zonal segmentation was possible after image cropping. Depending on the radiologist chosen as reference, the algorithm's median DSCs were 96.4/97.4%, 91.8/93.0% and 79.9/89.6% for whole-gland, central gland and anterior fibromuscular stroma (AFMS) segmentations, respectively. DSCs comparing radiologists' delineations were 95.8%, 93.6% and 81.7%, respectively. For all segmentation tasks, the scanner used for imaging significantly influenced the mean DSC and its precision, and the mean DSC was significantly lower in cases with initial segmentation outside the prostate. For central gland segmentation, the mean DSC was also significantly lower in larger prostates. The radiologist chosen as reference had no significant impact, except for AFMS segmentation. CONCLUSIONS The algorithm's performance fell within the range of inter-reader variability but remained significantly impacted by the scanner used for imaging.
KEY POINTS • Median Dice similarity coefficients obtained by the algorithm fell within human inter-reader variability for the three segmentation tasks (whole gland, central gland, anterior fibromuscular stroma). • The scanner used for imaging significantly impacted the performance of the automated segmentation for the three segmentation tasks. • The performance of the automated segmentation of the anterior fibromuscular stroma was highly variable across patients and showed also high variability across the two radiologists.
Affiliation(s)
- Olivier Rouvière: Department of Urinary and Vascular Imaging, Hôpital Edouard Herriot, Hospices Civils de Lyon, Pavillon B, 5 place d'Arsonval, F-69437, Lyon, France; Université de Lyon, F-69003, Lyon, France; Faculté de Médecine Lyon Est, Université Lyon 1, F-69003, Lyon, France; INSERM, LabTau, U1032, Lyon, France
- Paul Cezar Moldovan: Department of Urinary and Vascular Imaging, Hôpital Edouard Herriot, Hospices Civils de Lyon, Pavillon B, 5 place d'Arsonval, F-69437, Lyon, France
- Anna Vlachomitrou: Philips France, 33 rue de Verdun, CS 60 055, 92156, Suresnes Cedex, France
- Sylvain Gouttard: Department of Urinary and Vascular Imaging, Hôpital Edouard Herriot, Hospices Civils de Lyon, Pavillon B, 5 place d'Arsonval, F-69437, Lyon, France
- Benjamin Riche: Service de Biostatistique Et Bioinformatique, Pôle Santé Publique, Hospices Civils de Lyon, F-69003, Lyon, France; Laboratoire de Biométrie Et Biologie Évolutive, Équipe Biostatistique-Santé, UMR 5558, CNRS, F-69100, Villeurbanne, France
- Alexandra Groth: Philips Research, Röntgenstrasse 24-26, 22335, Hamburg, Germany
- Alain Ruffion: Department of Urology, Centre Hospitalier Lyon Sud, Hospices Civils de Lyon, F-69310, Pierre-Bénite, France
- Marc Colombel: Université de Lyon, F-69003, Lyon, France; Faculté de Médecine Lyon Est, Université Lyon 1, F-69003, Lyon, France; Department of Urology, Hôpital Edouard Herriot, Hospices Civils de Lyon, F-69437, Lyon, France
- Sébastien Crouzet: Department of Urology, Hôpital Edouard Herriot, Hospices Civils de Lyon, F-69437, Lyon, France
- Juergen Weese: Philips Research, Röntgenstrasse 24-26, 22335, Hamburg, Germany
- Muriel Rabilloud: Université de Lyon, F-69003, Lyon, France; Faculté de Médecine Lyon Est, Université Lyon 1, F-69003, Lyon, France; Service de Biostatistique Et Bioinformatique, Pôle Santé Publique, Hospices Civils de Lyon, F-69003, Lyon, France; Laboratoire de Biométrie Et Biologie Évolutive, Équipe Biostatistique-Santé, UMR 5558, CNRS, F-69100, Villeurbanne, France
23
Ghafoor S, Becker AS, Woo S, Causa Andrieu PI, Stocker D, Gangai N, Hricak H, Vargas HA. Comparison of PI-RADS Versions 2.0 and 2.1 for MRI-based Calculation of the Prostate Volume. Acad Radiol 2021; 28:1548-1556. [PMID: 32814644 DOI: 10.1016/j.acra.2020.07.027] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/24/2020] [Revised: 07/20/2020] [Accepted: 07/21/2020] [Indexed: 01/25/2023]
Abstract
RATIONALE AND OBJECTIVES Prostate gland volume (PGV) should be routinely included in MRI reports of the prostate. The recently updated Prostate Imaging Reporting and Data System (PI-RADS) version 2.1 includes a change in the recommended measurement method for PGV compared to version 2.0. The purpose of this study was to evaluate the agreement of MRI-based PGV calculations with the volumetric manual slice-by-slice prostate segmentation as a reference standard using the linear measurements per PI-RADS versions 2.0 and 2.1. Furthermore, to assess inter-reader agreement for the different measurement approaches, determine the influence of an enlarged transition zone on measurement accuracy and to assess the value of the bullet formula for PGV calculation. MATERIALS AND METHODS Ninety-five consecutive treatment-naive patients undergoing prostate MRI were retrospectively analyzed. Prostates were manually contoured and segmented on axial T2-weighted images. Four different radiologists independently measured the prostate in three dimensions according to PI-RADS v2.0 and v2.1, respectively. MRI-based PGV was calculated using the ellipsoid and bullet formulas. Calculated volumes were compared to the reference manual segmentations using Wilcoxon signed-rank test. Inter-reader agreement was calculated using intraclass correlation coefficient (ICC). RESULTS Inter-reader agreement was excellent for the ellipsoid and bullet formulas using PI-RADS v2.0 (ICC 0.985 and 0.987) and v2.1 (ICC 0.990 and 0.994), respectively. The median difference from the reference standard using the ellipsoid formula derived PGV was 0.4 mL (interquartile range, -3.9 to 5.1 mL) for PI-RADS v2.0 (p = 0.393) and 2.6 mL (interquartile range, -1.6 to 7.3 mL) for v2.1 (p < 0.001) with a median difference of 2.2 mL. The bullet formula overestimated PGV by a median of 13.3 mL using PI-RADS v2.0 (p < 0.001) and 16.0 mL using v2.1 (p < 0.001). 
In the presence of an enlarged transition zone, the PGV tended to be higher than the reference standard with the ellipsoid formula for both PI-RADS v2.0 (median difference of 4.7 mL; p = 0.018) and v2.1 (median difference of 5.7 mL; p < 0.001). CONCLUSION Inter-reader agreement was excellent for the calculated PGV with both methods. PI-RADS v2.0 measurements with the ellipsoid formula yielded the most accurate volume estimates. The differences between PI-RADS v2.0 and v2.1 were statistically significant and, although small in absolute terms, may be relevant in specific clinical scenarios such as prostate-specific antigen density calculation. These findings validate the use of the ellipsoid formula and highlight that the bullet formula should not be used for prostate volume estimation due to systematic overestimation.
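The two volume formulas compared here differ only in their constant factor. A sketch, assuming the standard ellipsoid factor π/6 (≈0.52) and the bullet factor commonly given as 5π/24 (≈0.65), which makes the bullet estimate exactly 1.25x the ellipsoid one and explains the systematic overestimation reported above:

```python
import math

def ellipsoid_volume(length, width, height):
    """Prolate-ellipsoid prostate volume from three orthogonal
    diameters: L * W * H * pi/6 (~0.52)."""
    return length * width * height * math.pi / 6.0

def bullet_volume(length, width, height):
    """Bullet-shaped (cylinder plus half-ellipsoid) volume:
    L * W * H * 5*pi/24 (~0.65), i.e. 1.25x the ellipsoid estimate."""
    return length * width * height * 5.0 * math.pi / 24.0
```

With the same three linear measurements, the choice of formula alone shifts the volume (and hence any PSA density derived from it) by a fixed 25%.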
Affiliation(s)
- Soleen Ghafoor: Department of Radiology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
- Anton S Becker: Department of Radiology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
- Sungmin Woo: Department of Radiology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
- Pamela I Causa Andrieu: Department of Radiology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
- Daniel Stocker: BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Natalie Gangai: Department of Radiology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
- Hedvig Hricak: Department of Radiology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
- Hebert Alberto Vargas: Department of Radiology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
24
Stanzione A, Ponsiglione A, Di Fiore GA, Picchi SG, Di Stasi M, Verde F, Petretta M, Imbriaco M, Cuocolo R. Prostate Volume Estimation on MRI: Accuracy and Effects of Ellipsoid and Bullet-Shaped Measurements on PSA Density. Acad Radiol 2021; 28:e219-e226. [PMID: 32553281 DOI: 10.1016/j.acra.2020.05.014] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/21/2020] [Revised: 05/13/2020] [Accepted: 05/14/2020] [Indexed: 12/14/2022]
Abstract
RATIONALE AND OBJECTIVES PSA density (PSAd), an important decision-making parameter for patients with suspected prostate cancer (PCa), is dependent on magnetic resonance imaging prostate volume (PV) estimation. We aimed to compare the accuracy of the ellipsoid and bullet-shaped formulas with manual whole-gland segmentation as the reference standard and to evaluate the corresponding PSAd diagnostic accuracy in predicting clinically significant PCa. MATERIALS AND METHODS We retrospectively analysed 195 patients with suspected PCa who underwent magnetic resonance imaging and prostate biopsy. Patients with PCa were categorized according to ISUP score. PV and the corresponding PSAd were calculated with manual segmentation (mPV and mPSAd) as well as with the ellipsoid (ePV and ePSAd) and bullet-shaped (bPV and bPSAd) formulas. Inter- and intra-reader reproducibility were assessed with Lin's concordance correlation coefficient and the intraclass correlation coefficient (ICC). A 2-way analysis of variance with post-hoc Bonferroni test was used for assessing PV differences. Predictive values of PSAd calculated with the different methods for detecting clinically significant PCa were evaluated by receiver operating characteristic curve analysis and Youden's index. RESULTS Both intra- (ρ = 0.99, ICC = 0.99) and inter-reader (ρ = 0.98, ICC = 0.98) reproducibility were excellent. No significant difference was found between ePV and the reference standard (p = 1.00). bPV was significantly different from both (p = 0.00). PSAd (mPSAd/ePSAd cut-off ≥ 0.15, bPSAd cut-off ≥ 0.12) had sensitivity = 69-70%, specificity = 72-75%, and areas under the curve = 0.757-0.760 (p = 0.70-0.88). CONCLUSIONS Our work shows that when using the bullet-shaped formula, a different PSAd cut-off must be considered to avoid PCa under-diagnosis and inaccurate risk stratification.
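The PSA density computation and the method-specific cut-offs discussed in this abstract reduce to a couple of lines; a hedged sketch (cut-off values taken from the abstract, function names mine):

```python
def psa_density(psa_ng_ml, volume_ml):
    """PSA density: serum PSA divided by prostate volume (ng/mL per mL)."""
    return psa_ng_ml / volume_ml

def flags_cs_pca(psad, cutoff=0.15):
    """Apply a method-specific PSAd cut-off: e.g. 0.15 for manual- or
    ellipsoid-derived volumes, 0.12 for bullet-derived volumes,
    per the abstract above."""
    return psad >= cutoff
```

Because the bullet formula systematically inflates the volume, it deflates PSAd for the same patient, which is why the abstract argues a lower cut-off (0.12 instead of 0.15) is needed to avoid under-diagnosis.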
Affiliation(s)
- Arnaldo Stanzione
- Department of Advanced Biomedical Sciences, University of Naples "Federico II", Naples, Italy
- Andrea Ponsiglione
- Department of Advanced Biomedical Sciences, University of Naples "Federico II", Naples, Italy
- Stefano Giusto Picchi
- Department of Advanced Biomedical Sciences, University of Naples "Federico II", Naples, Italy
- Martina Di Stasi
- Department of Advanced Biomedical Sciences, University of Naples "Federico II", Naples, Italy
- Francesco Verde
- Department of Advanced Biomedical Sciences, University of Naples "Federico II", Naples, Italy
- Mario Petretta
- Department of Translational Medical Sciences, University of Naples "Federico II", Naples, Italy
- Massimo Imbriaco
- Department of Advanced Biomedical Sciences, University of Naples "Federico II", Naples, Italy
- Renato Cuocolo
- Department of Advanced Biomedical Sciences, University of Naples "Federico II", Naples, Italy
|
25
|
Abstract
PURPOSE OF REVIEW The purpose of this review was to identify the most recent lines of research focusing on the application of artificial intelligence (AI) in the diagnosis and staging of prostate cancer (PCa) with imaging. RECENT FINDINGS The majority of studies focused on improving the interpretation of biparametric and multiparametric magnetic resonance imaging, and on the planning of image-guided biopsy. These initial studies showed that AI methods based on convolutional neural networks could achieve diagnostic performance close to that of radiologists. In addition, these methods could improve segmentation and reduce inter-reader variability. Methods based on both clinical and imaging findings could help identify high-grade PCa and more aggressive disease, thus guiding treatment decisions. Though these initial results are promising, only a few studies have addressed the repeatability and reproducibility of the investigated AI tools. Furthermore, large-scale validation studies are missing, and no diagnostic phase III or higher studies proving improved clinical decision-making outcomes have been conducted. SUMMARY AI techniques have the potential to significantly improve and simplify the diagnosis, risk stratification, and staging of PCa. Larger studies with a focus on quality standards are needed to allow widespread introduction of AI into clinical practice.
Affiliation(s)
- Pascal A T Baltzer
- Department of Biomedical Imaging and Image-guided Therapy, Medical University of Vienna, Vienna, Austria
|
26
|
Sarma KV, Raman AG, Dhinagar NJ, Priester AM, Harmon S, Sanford T, Mehralivand S, Turkbey B, Marks LS, Raman SS, Speier W, Arnold CW. Harnessing clinical annotations to improve deep learning performance in prostate segmentation. PLoS One 2021; 16:e0253829. [PMID: 34170972 PMCID: PMC8232529 DOI: 10.1371/journal.pone.0253829] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/24/2021] [Accepted: 06/13/2021] [Indexed: 12/09/2022] Open
Abstract
PURPOSE Developing large-scale datasets with research-quality annotations is challenging due to the high cost of refining clinically generated markup into high-precision annotations. We evaluated the direct use of a large dataset with only clinically generated annotations to develop high-performance segmentation models for small research-quality challenge datasets. MATERIALS AND METHODS We used a large retrospective dataset from our institution comprising 1,620 clinically generated segmentations, and two challenge datasets (PROMISE12: 50 patients; ProstateX-2: 99 patients). We trained a 3D U-Net convolutional neural network (CNN) segmentation model on our entire dataset and used that model as a template for training models on the challenge datasets. We also trained versions of the template model using ablated proportions of our dataset and evaluated the relative benefit of those templates for the final models. Finally, we trained a version of the template model using an out-of-domain brain cancer dataset and evaluated the relative benefit of that template for the final models. We used five-fold cross-validation (CV) for all training and evaluation across our entire dataset. RESULTS Our model achieved state-of-the-art performance on our large dataset (mean overall Dice 0.916, average Hausdorff distance 0.135 across CV folds). Using this model as a pre-trained template for refinement on the two external datasets significantly enhanced performance (30% and 49% improvements in Dice scores, respectively). Mean overall Dice and mean average Hausdorff distance were 0.912 and 0.15 for the ProstateX-2 dataset, and 0.852 and 0.581 for the PROMISE12 dataset. Even small quantities of data used to train the template enhanced performance, with significant improvements when 5% or more of the data were used.
CONCLUSION We trained a state-of-the-art model using unrefined clinical prostate annotations and found that its use as a template model significantly improved performance in other prostate segmentation tasks, even when trained with only 5% of the original dataset.
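The Hausdorff distances reported above measure boundary disagreement between a predicted and a reference segmentation. A minimal pure-Python sketch of the maximum and average variants, using small hypothetical 2D point sets in place of real contour voxels:

```python
import math

def directed_hd(a, b):
    """Max over points of a of the distance to the nearest point of b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric (maximum) Hausdorff distance between two point sets."""
    return max(directed_hd(a, b), directed_hd(b, a))

def average_hd(a, b):
    """Average Hausdorff distance: mean nearest-neighbour distance, symmetrized."""
    d_ab = sum(min(math.dist(p, q) for q in b) for p in a) / len(a)
    d_ba = sum(min(math.dist(p, q) for q in a) for p in b) / len(b)
    return (d_ab + d_ba) / 2

# Hypothetical boundary points of two segmentations
a = [(0, 0), (1, 0), (1, 1)]
b = [(0, 0), (1, 0), (2, 1)]
print(hausdorff(a, b))    # 1.0 (worst-case boundary mismatch)
print(average_hd(a, b))   # smaller: averaging dampens single outliers
```

The average variant is less sensitive to a single stray voxel than the maximum, which is why segmentation papers often report it alongside Dice.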
Affiliation(s)
- Karthik V. Sarma
- University of California, Los Angeles, Los Angeles, CA, United States of America
- Alex G. Raman
- University of California, Los Angeles, Los Angeles, CA, United States of America
- Western University of Health Sciences, Pomona, CA, United States of America
- Nikhil J. Dhinagar
- University of California, Los Angeles, Los Angeles, CA, United States of America
- Keck School of Medicine, University of Southern California, Los Angeles, CA, United States of America
- Alan M. Priester
- University of California, Los Angeles, Los Angeles, CA, United States of America
- Stephanie Harmon
- National Cancer Institute, National Institutes of Health, Bethesda, MD, United States of America
- Clinical Research Directorate, Frederick National Laboratory for Cancer Research, Frederick, MD, United States of America
- Thomas Sanford
- National Cancer Institute, National Institutes of Health, Bethesda, MD, United States of America
- SUNY Upstate Medical Center, Syracuse, NY, United States of America
- Sherif Mehralivand
- National Cancer Institute, National Institutes of Health, Bethesda, MD, United States of America
- Baris Turkbey
- National Cancer Institute, National Institutes of Health, Bethesda, MD, United States of America
- Leonard S. Marks
- University of California, Los Angeles, Los Angeles, CA, United States of America
- Steven S. Raman
- University of California, Los Angeles, Los Angeles, CA, United States of America
- William Speier
- University of California, Los Angeles, Los Angeles, CA, United States of America
- Corey W. Arnold
- University of California, Los Angeles, Los Angeles, CA, United States of America
|
27
|
Montagne S, Hamzaoui D, Allera A, Ezziane M, Luzurier A, Quint R, Kalai M, Ayache N, Delingette H, Renard-Penna R. Challenge of prostate MRI segmentation on T2-weighted images: inter-observer variability and impact of prostate morphology. Insights Imaging 2021; 12:71. [PMID: 34089410 PMCID: PMC8179870 DOI: 10.1186/s13244-021-01010-9] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/18/2021] [Accepted: 05/05/2021] [Indexed: 12/29/2022] Open
Abstract
Background Accurate prostate zonal segmentation on magnetic resonance images (MRI) is a critical prerequisite for automated prostate cancer detection. We aimed to assess the variability of manual prostate zonal segmentation by radiologists on T2-weighted (T2W) images, and to study factors that may influence it. Methods Seven radiologists of varying levels of experience segmented the whole prostate gland (WG) and the transition zone (TZ) on 40 axial T2W prostate MRI images (3D T2W images for all patients, and both 3D and 2D images for a subgroup of 12 patients). Segmentation variability was evaluated with respect to: anatomical and morphological variation of the prostate (volume, retro-urethral lobe, intensity contrast between zones, presence of a PI-RADS ≥ 3 lesion), variation in image acquisition (3D vs 2D T2W images), and reader experience. Several metrics, including the Dice score (DSC) and the Hausdorff distance, were used to evaluate differences, with both a pairwise and a consensus (STAPLE reference) comparison. Results DSC was 0.92 (± 0.02) and 0.94 (± 0.03) for the WG, and 0.88 (± 0.05) and 0.91 (± 0.05) for the TZ, for the pairwise and consensus-reference comparisons, respectively. Variability was significantly (p < 0.05) lower for the mid-gland (DSC 0.95 (± 0.02)), higher for the apex (0.90 (± 0.06)) and the base (0.87 (± 0.06)), and higher for smaller prostates (p < 0.001) and when contrast between zones was low (p < 0.05). The impact of the other studied factors was non-significant. Conclusions Variability is higher in the extreme parts of the gland, is influenced by changes in prostate morphology (volume, zone intensity ratio), and is relatively unaffected by the radiologist's level of expertise. Supplementary Information The online version contains supplementary material available at 10.1186/s13244-021-01010-9.
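The Dice scores above quantify voxel-overlap agreement between two readers' masks. A minimal sketch of the computation on sets of voxel indices (the masks here are hypothetical, shifted by one voxel column to create a realistic partial overlap):

```python
def dice(mask_a, mask_b):
    """Dice similarity coefficient: 2 * |A ∩ B| / (|A| + |B|); 1.0 = identical masks."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0  # convention: two empty masks agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical voxel index sets for two readers' whole-gland segmentations:
# reader 2's mask is reader 1's shifted by one column (10x10x5 vs 9x10x5 voxels)
reader1 = {(x, y, z) for x in range(10) for y in range(10) for z in range(5)}
reader2 = {(x, y, z) for x in range(1, 10) for y in range(10) for z in range(5)}
print(round(dice(reader1, reader2), 3))  # 0.947 -- high overlap despite the shift
```

Because Dice normalizes by total mask size, a one-voxel boundary shift matters much more for small structures, consistent with the higher variability the study reports for smaller prostates and the gland's apex and base.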
Affiliation(s)
- Sarah Montagne
- Academic Department of Radiology, Hôpital Pitié-Salpétrière, Assistance Publique des Hôpitaux de Paris, Paris, France
- Academic Department of Radiology, Hôpital Tenon, Assistance Publique des Hôpitaux de Paris, Paris, France
- Sorbonne Universités, GRC n° 5, Oncotype-Uro, Paris, France
- Dimitri Hamzaoui
- Inria, Epione Team, Université Côte D'Azur, Sophia Antipolis, Nice, France
- Alexandre Allera
- Academic Department of Radiology, Hôpital Pitié-Salpétrière, Assistance Publique des Hôpitaux de Paris, Paris, France
- Malek Ezziane
- Academic Department of Radiology, Hôpital Pitié-Salpétrière, Assistance Publique des Hôpitaux de Paris, Paris, France
- Anna Luzurier
- Academic Department of Radiology, Hôpital Pitié-Salpétrière, Assistance Publique des Hôpitaux de Paris, Paris, France
- Raphaelle Quint
- Academic Department of Radiology, Hôpital Pitié-Salpétrière, Assistance Publique des Hôpitaux de Paris, Paris, France
- Mehdi Kalai
- Academic Department of Radiology, Hôpital Pitié-Salpétrière, Assistance Publique des Hôpitaux de Paris, Paris, France
- Nicholas Ayache
- Inria, Epione Team, Université Côte D'Azur, Sophia Antipolis, Nice, France
- Hervé Delingette
- Inria, Epione Team, Université Côte D'Azur, Sophia Antipolis, Nice, France
- Raphaële Renard-Penna
- Academic Department of Radiology, Hôpital Pitié-Salpétrière, Assistance Publique des Hôpitaux de Paris, Paris, France
- Academic Department of Radiology, Hôpital Tenon, Assistance Publique des Hôpitaux de Paris, Paris, France
- Sorbonne Universités, GRC n° 5, Oncotype-Uro, Paris, France
|
28
|
Wang L, Kelly B, Lee EH, Wang H, Zheng J, Zhang W, Halabi S, Liu J, Tian Y, Han B, Huang C, Yeom KW, Deng K, Song J. Multi-classifier-based identification of COVID-19 from chest computed tomography using generalizable and interpretable radiomics features. Eur J Radiol 2021; 136:109552. [PMID: 33497881 PMCID: PMC7810032 DOI: 10.1016/j.ejrad.2021.109552] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2020] [Revised: 12/09/2020] [Accepted: 01/12/2021] [Indexed: 12/11/2022]
Abstract
PURPOSE To investigate the efficacy of radiomics in diagnosing patients with coronavirus disease (COVID-19) and other types of viral pneumonia with clinical symptoms and CT signs similar to those of COVID-19. METHODS Between 18 January 2020 and 20 May 2020, 110 SARS-CoV-2 positive and 108 SARS-CoV-2 negative patients were retrospectively recruited from three hospitals based on the inclusion criteria. Manual segmentation of pneumonia lesions on CT scans was performed by four radiologists. The latest version of Pyradiomics was used for feature extraction. Four classifiers (linear classifier, k-nearest neighbour, least absolute shrinkage and selection operator [LASSO], and random forest) were used to differentiate SARS-CoV-2 positive and SARS-CoV-2 negative patients. The performance of the classifiers and the radiologists was compared using ROC curve analysis and Kappa scores. RESULTS We manually segmented 16,053 CT slices, comprising 32,625 pneumonia lesions, from the CT scans of all patients. Using Pyradiomics, 120 radiomic features were extracted from each image. The key radiomic features screened by the different classifiers varied and led to significant differences in classification accuracy. LASSO achieved the best performance (sensitivity: 72.2%, specificity: 75.1%, AUC: 0.81) on the external validation dataset and attained excellent agreement (Kappa score: 0.89) with the radiologists (average sensitivity: 75.6%, specificity: 78.2%, AUC: 0.81). All classifiers indicated that "Original_Firstorder_RootMeanSquared" and "Original_Firstorder_Uniformity" were significant features for this task. CONCLUSIONS We identified radiomic features that were significantly associated with the classification of COVID-19 pneumonia across multiple classifiers. The quantifiable interpretation of the differences in features between the two groups extends our understanding of the CT imaging characteristics of COVID-19 pneumonia.
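The two first-order features that all classifiers selected can be computed directly from the intensities inside a segmented region. A simplified pure-Python sketch (Pyradiomics applies its own resampling and discretization settings, so these are approximations of its definitions, and the ROI values, in Hounsfield units, are hypothetical):

```python
import math
from collections import Counter

def root_mean_squared(values):
    """First-order RootMeanSquared: sqrt of the mean of squared intensities."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def uniformity(values, bin_width=25):
    """First-order Uniformity: sum of squared discretized-histogram probabilities.
    Near 1 for a homogeneous region, smaller for a heterogeneous one."""
    counts = Counter(int(v // bin_width) for v in values)
    n = len(values)
    return sum((c / n) ** 2 for c in counts.values())

# Hypothetical HU values inside a segmented pneumonia lesion:
# mostly ground-glass-range values with a denser consolidation component
roi = [-600, -580, -590, -300, -280, -610]
print(round(root_mean_squared(roi), 1))  # 514.0
print(round(uniformity(roi), 3))         # 0.389 -- mixed-density, low uniformity
```

RootMeanSquared captures overall intensity magnitude, while Uniformity captures how concentrated the intensity histogram is, which is why the pair can separate mixed-density COVID-19 lesions from more homogeneous patterns.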
Affiliation(s)
- Lu Wang
- School of Medical Informatics, China Medical University, Puhe Rd, Shenbei New District, Shenyang, Liaoning, 110122, China
- Brendan Kelly
- Department of Radiology, School of Medicine, Stanford University, 725 Welch Rd MC 5654, Palo Alto, CA, 94305, United States
- Edward H. Lee
- Department of Radiology, School of Medicine, Stanford University, 725 Welch Rd MC 5654, Palo Alto, CA, 94305, United States
- Hongmei Wang
- Department of Radiology, The First Affiliated Hospital of University of Science and Technology of China, No. 1 Swan Lake Road, Hefei, Anhui, 230036, China
- Jimmy Zheng
- Department of Radiology, School of Medicine, Stanford University, 725 Welch Rd MC 5654, Palo Alto, CA, 94305, United States
- Wei Zhang
- Department of Radiology, the Lu’an Affiliated Hospital, Anhui Medical University, No. 21 Wanxi Rd, Lu’an, Anhui, 237005, China
- Safwan Halabi
- Department of Radiology, School of Medicine, Stanford University, 725 Welch Rd MC 5654, Palo Alto, CA, 94305, United States
- Jining Liu
- Bengbu Medical College, Department of Imaging Medicine, 2600 Donghai Avenue, Bengbu, Anhui, 233030, China
- Yulong Tian
- Wannan Medical College, Department of Imaging Medicine and Nuclear Medicine, 22 Wenchang West Rd, Higher Education Park, Wuhu, Anhui, 241002, China
- Baoqin Han
- Wannan Medical College, Department of Imaging Medicine and Nuclear Medicine, 22 Wenchang West Rd, Higher Education Park, Wuhu, Anhui, 241002, China
- Chuanbin Huang
- Wannan Medical College, Department of Imaging Medicine and Nuclear Medicine, 22 Wenchang West Rd, Higher Education Park, Wuhu, Anhui, 241002, China
- Kristen W. Yeom
- Department of Radiology, School of Medicine, Stanford University, 725 Welch Rd MC 5654, Palo Alto, CA, 94305, United States
- Kexue Deng
- Department of Radiology, The First Affiliated Hospital of University of Science and Technology of China, No. 1 Swan Lake Road, Hefei, Anhui, 230036, China (corresponding author)
- Jiangdian Song
- School of Medical Informatics, China Medical University, Puhe Rd, Shenbei New District, Shenyang, Liaoning, 110122, China
- Department of Radiology, School of Medicine, Stanford University, 1201 Welch Rd, Lucas Center, Palo Alto, CA, 94305, United States
|
29
|
Magnetic Resonance Imaging Based Radiomic Models of Prostate Cancer: A Narrative Review. Cancers (Basel) 2021; 13:cancers13030552. [PMID: 33535569 PMCID: PMC7867056 DOI: 10.3390/cancers13030552] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/06/2020] [Revised: 01/18/2021] [Accepted: 01/27/2021] [Indexed: 12/11/2022] Open
Abstract
Simple Summary The increasing interest in implementing artificial intelligence in radiomic models has occurred alongside advancements in the tools used for computer-aided diagnosis. Such tools typically apply both statistical and machine learning methodologies to assess the various modalities used in medical image analysis. Specific to prostate cancer, the radiomics pipeline has multiple facets that are amenable to improvement. This review discusses the steps of a magnetic resonance imaging based radiomics pipeline. Present successes, existing opportunities for refinement, and the most pertinent pending steps leading to clinical validation are highlighted. Abstract The management of prostate cancer (PCa) depends on biomarkers of biological aggression, including an invasive biopsy to facilitate histopathological assessment of the tumor's grade. This review explores the technical processes of applying magnetic resonance imaging based radiomic models to the evaluation of PCa. By exploring how a deep radiomics approach further optimizes the prediction of a PCa's grade group, it becomes clear how this integration of artificial intelligence mitigates the major technological challenges faced by a traditional radiomic model: image acquisition, small datasets, image processing, labeling/segmentation, informative features, predicting molecular features, and incorporating predictive models. Other potential impacts of artificial intelligence on the personalized treatment of PCa are also discussed. The role of deep radiomics analysis (a deep texture analysis, which extracts features from convolutional neural network layers) is highlighted. Existing clinical work and upcoming clinical trials are reviewed, directing investigators to pertinent future directions in the field. For future progress to result in clinical translation, the field will likely require multi-institutional collaboration in producing prospectively populated and expertly labeled imaging libraries.
|
30
|
Test-time adaptable neural networks for robust medical image segmentation. Med Image Anal 2021; 68:101907. [DOI: 10.1016/j.media.2020.101907] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/09/2020] [Revised: 11/11/2020] [Accepted: 11/12/2020] [Indexed: 11/20/2022]
|
31
|
Prostate Cancer: Prostate-specific Membrane Antigen Positron-emission Tomography/Computed Tomography or Positron-emission Tomography/Magnetic Resonance Imaging for Staging. Top Magn Reson Imaging 2020; 29:59-66. [PMID: 32015295 DOI: 10.1097/rmr.0000000000000229] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/07/2023]
Abstract
Positron-emission tomography (PET) with prostate-specific membrane antigen (PSMA) has been increasingly used to image prostate cancer over the last decade. In the staging setting, several studies have already been published suggesting that PSMA PET can be a valuable tool. These findings, however, have not yet translated into guideline recommendations. Both PSMA PET/computed tomography (CT) and PET/magnetic resonance imaging have been investigated in the staging setting, showing a higher detection rate of prostate cancer lesions than the conventional imaging work-up, and some studies have already shown an impact on disease management. The aim of this review is to provide an overview of the existing published data on PSMA PET for staging prostate cancer, with emphasis on PET/magnetic resonance imaging. Although PSMA PET is a relatively new tool and not yet officially recommended for staging, there are more than 50 original studies in the literature assessing its performance in the staging setting of prostate cancer, as well as several meta-analyses.
|
32
|
Liechti MR, Muehlematter UJ, Schneider AF, Eberli D, Rupp NJ, Hötker AM, Donati OF, Becker AS. Manual prostate cancer segmentation in MRI: interreader agreement and volumetric correlation with transperineal template core needle biopsy. Eur Radiol 2020; 30:4806-4815. [PMID: 32306078 DOI: 10.1007/s00330-020-06786-w] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2019] [Revised: 02/16/2020] [Accepted: 03/02/2020] [Indexed: 11/26/2022]
Abstract
OBJECTIVES To assess interreader agreement of manual prostate cancer lesion segmentation on multiparametric MR images (mpMRI). The secondary aim was to compare tumor volume estimates between MRI segmentation and transperineal template saturation core needle biopsy (TTSB). METHODS We retrospectively reviewed patients who had undergone mpMRI of the prostate at our institution and who had received TTSB within 190 days of the examination. Seventy-eight cancer lesions with a Gleason score of at least 3 + 4 = 7 were manually segmented in T2-weighted images by 3 radiologists and 1 medical student. Twenty lesions were also segmented in apparent diffusion coefficient (ADC) and dynamic contrast-enhanced (DCE) series. First, 20 volumetric similarity scores were computed to quantify interreader agreement. Second, manually segmented cancer lesion volumes were compared with TTSB-derived estimates by Bland-Altman analysis and Wilcoxon testing. RESULTS Interreader agreement across all readers was only moderate, with a mean T2 Dice score of 0.57 (95% CI 0.39-0.70), a volumetric similarity coefficient of 0.74 (0.48-0.89), and a Hausdorff distance of 5.23 mm (3.17-9.32 mm). The discrepancy between MRI and TTSB volume estimates increased with tumor size and differed significantly between tumors with a Gleason score of 3 + 4 and higher-grade tumors (0.66 ml vs. 0.78 ml; p = 0.007). There were no significant differences between T2, ADC, and DCE segmentations. CONCLUSIONS We found at best moderate interreader agreement for manual prostate cancer segmentation in mpMRI. Additionally, our study suggests a systematic discrepancy between the tumor volume estimate from MRI segmentation and the TTSB core length, especially for large and high-grade tumors. KEY POINTS • Manual prostate cancer segmentation in mpMRI shows moderate interreader agreement. • There are no significant differences between T2, ADC, and DCE segmentation agreements.
• There is a systematic difference between volume estimates derived from biopsy and MRI.
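The Bland-Altman analysis used above summarizes agreement between two paired measurement methods as a bias (mean difference) plus 95% limits of agreement. A minimal pure-Python sketch; the paired volume estimates are hypothetical, chosen so the differences grow with lesion size, mirroring the size-dependent discrepancy the study reports:

```python
import math

def bland_altman(x, y):
    """Return (bias, lower limit, upper limit) for paired measurements x and y.
    Bias is the mean difference; limits are bias +/- 1.96 * SD of the differences."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))  # sample SD
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical paired tumor-volume estimates (mL): MRI segmentation vs biopsy-derived
mri = [0.8, 1.5, 2.3, 3.9, 5.2]
biopsy = [0.5, 1.1, 1.6, 2.9, 3.8]
bias, lo, hi = bland_altman(mri, biopsy)
print(round(bias, 2))  # 0.76 -- MRI systematically reads higher here
```

A non-zero bias with differences that widen as the mean grows is exactly the pattern of a systematic, size-dependent disagreement rather than random noise.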
Affiliation(s)
- Marc R Liechti
- Institute of Diagnostic and Interventional Radiology, University Hospital of Zurich, Zurich, Switzerland
- Urs J Muehlematter
- Institute of Diagnostic and Interventional Radiology, University Hospital of Zurich, Zurich, Switzerland
- Aurelia F Schneider
- Institute of Diagnostic and Interventional Radiology, University Hospital of Zurich, Zurich, Switzerland
- Daniel Eberli
- Department of Urology, University Hospital of Zurich, Zurich, Switzerland
- Niels J Rupp
- Department of Pathology and Molecular Pathology, University Hospital of Zurich, Zurich, Switzerland
- Andreas M Hötker
- Institute of Diagnostic and Interventional Radiology, University Hospital of Zurich, Zurich, Switzerland
- Olivio F Donati
- Institute of Diagnostic and Interventional Radiology, University Hospital of Zurich, Zurich, Switzerland
- Anton S Becker
- Institute of Diagnostic and Interventional Radiology, University Hospital of Zurich, Zurich, Switzerland
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
|