1. Li H, Liu H, von Busch H, Grimm R, Huisman H, Tong A, Winkel D, Penzkofer T, Shabunin I, Choi MH, Yang Q, Szolar D, Shea S, Coakley F, Harisinghani M, Oguz I, Comaniciu D, Kamen A, Lou B. Deep Learning-based Unsupervised Domain Adaptation via a Unified Model for Prostate Lesion Detection Using Multisite Biparametric MRI Datasets. Radiol Artif Intell 2024:e230521. [PMID: 39166972] [DOI: 10.1148/ryai.230521]
Abstract
"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content. Purpose To determine whether the unsupervised domain adaptation (UDA) method with generated images improves the performance of a supervised learning (SL) model for prostate cancer (PCa) detection using multisite bp-MRI datasets. Materials and Methods This retrospective study included data from 5,150 patients (14,191 samples) collected across nine different imaging centers. A novel UDA method using a unified generative model was developed for PCa detection using multisite bp-MRI datasets. This method translates diffusion-weighted imaging (DWI) acquisitions, including apparent diffusion coefficient (ADC) and individual DW images acquired using various b-values, to align with the style of images acquired using b-values recommended by Prostate Imaging Reporting and Data System (PI-RADS) guidelines. The generated ADC and DW images replace the original images for PCa detection. An independent set of 1,692 test cases (2,393 samples) was used for evaluation. The area under the receiver operating characteristic curve (AUC) was used as the primary metric, and statistical analysis was performed via bootstrapping. Results For all test cases, the AUC values for baseline SL and UDA methods were 0.73 and 0.79 (P < .001), respectively, for PI-RADS ≥ 3, and 0.77 and 0.80 (P < .001) for PI-RADS ≥ 4 PCa lesions. In the 361 test cases under the most unfavorable image acquisition setting, the AUC values for baseline SL and UDA were 0.49 and 0.76 (P < .001) for PI-RADS ≥ 3, and 0.50 and 0.77 (P < .001) for PI-RADS ≥ 4 PCa lesions. Conclusion UDA with generated images improved the performance of SL methods in PCa lesion detection across multisite datasets with various b values, especially for images acquired with significant deviations from the PI-RADS recommended DWI protocol (eg, with an extremely high b-value). ©RSNA, 2024.
2. Yoo Y, Gibson E, Zhao G, Re TJ, Parmar H, Das J, Wang H, Kim MM, Shen C, Lee Y, Kondziolka D, Ibrahim M, Lian J, Jain R, Zhu T, Comaniciu D, Balter JM, Cao Y. Extended nnU-Net for brain metastasis detection and segmentation in contrast-enhanced MRI with a large multi-institutional dataset. Int J Radiat Oncol Biol Phys 2024:S0360-3016(24)03138-9. [PMID: 39059508] [DOI: 10.1016/j.ijrobp.2024.07.2318]
Abstract
PURPOSE To investigate an extended self-adapting nnU-Net framework for detecting and segmenting brain metastases (BM) on MRI. APPROACH Six different nnU-Net systems with adaptive data sampling, adaptive Dice loss (ADL), or different patch/batch sizes were trained and tested for detecting and segmenting intraparenchymal BM with a size ≥ 2 mm on 3D post-Gd T1-weighted MRI volumes using 2092 patients from seven institutions (1712, 195, and 185 patients for training, validation, and testing, respectively). Gross tumor volumes (GTVs) of BM delineated by physicians for stereotactic radiosurgery (SRS) were collected retrospectively and curated at each institute. Additional centralized data curation was carried out by two radiologists to create GTVs of uncontoured BM and improve the accuracy of the ground truth. The training dataset was augmented with synthetic BMs in 1025 MRI volumes using a 3D generative pipeline. BM detection was evaluated by lesion-level sensitivity and false-positive (FP) rate. BM segmentation was assessed by lesion-level Dice similarity coefficient (DSC), 95th-percentile Hausdorff distance (HD95), and average HD. Performance was assessed across different BM sizes. Additional testing was performed using a second dataset of 206 patients. RESULTS Of the six nnU-Net systems, the nnU-Net with ADL achieved the best detection and segmentation performance on the first testing dataset. At an FP rate of 0.65±1.17, overall sensitivity was 0.904 for all sizes of BM, 0.966 for BM ≥ 0.1 cm3, and 0.824 for BM < 0.1 cm3. Mean values of DSC, HD95, and average HD over all detected BMs were 0.758, 1.45 mm, and 0.23 mm, respectively. On the second testing dataset, the model achieved a sensitivity of 0.907 at an FP rate of 0.57±0.85 for all BM sizes, and an average HD of 0.33 mm for all detected BM. CONCLUSIONS Our proposed extension of the self-configuring nnU-Net framework substantially improved small-BM detection sensitivity while maintaining a controlled FP rate. The clinical utility of the extended nnU-Net model for assisting early BM detection and SRS planning will be investigated.
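Lesion-level sensitivity and FP counts of the kind reported here are typically computed by connected-component matching between predicted and ground-truth masks. A hedged sketch, assuming binary 3D numpy masks and an any-voxel-overlap hit criterion (the paper's exact matching rule may differ):

```python
import numpy as np
from scipy import ndimage

def lesion_level_metrics(gt_mask, pred_mask):
    """Lesion-level sensitivity and false-positive count from binary 3D masks.
    A predicted component counts as a hit if it overlaps any ground-truth
    lesion (this overlap criterion is an assumption)."""
    gt_lab, n_gt = ndimage.label(gt_mask)      # connected components in GT
    pr_lab, n_pr = ndimage.label(pred_mask)    # connected components in prediction
    detected = sum(1 for i in range(1, n_gt + 1)
                   if pred_mask[gt_lab == i].any())
    false_pos = sum(1 for j in range(1, n_pr + 1)
                    if not gt_mask[pr_lab == j].any())
    sensitivity = detected / n_gt if n_gt else float("nan")
    return sensitivity, false_pos
```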
3. Mansoor A, Schmuecking I, Ghesu FC, Georgescu B, Grbic S, Vishwanath RS, Farri O, Ghosh R, Vunikili R, Zimmermann M, Sutcliffe J, Mendelsohn SL, Comaniciu D, Gefter WB. Large-Scale Study on AI's Impact on Identifying Chest Radiographs with No Actionable Disease in Outpatient Imaging. Acad Radiol 2024:S1076-6332(24)00390-8. [PMID: 38997881] [DOI: 10.1016/j.acra.2024.06.031]
Abstract
RATIONALE AND OBJECTIVES Given the high volume of chest radiographs, radiologists frequently encounter heavy workloads. In outpatient imaging, a substantial portion of chest radiographs show no actionable findings. Automatically identifying these cases could improve efficiency by facilitating shorter reading workflows. PURPOSE A large-scale study to assess the performance of AI in identifying chest radiographs with no actionable disease (NAD) in an outpatient imaging population using comprehensive, objective, and reproducible criteria for NAD. MATERIALS AND METHODS The independent validation study includes 15,000 patients with chest radiographs in posterior-anterior (PA) and lateral projections from an outpatient imaging center in the United States. Ground truth was established by reviewing CXR reports and classifying cases as NAD or actionable disease (AD). The NAD definition includes completely normal chest radiographs and radiographs with well-defined non-actionable findings. The AI NAD Analyzer (trained with 100 million multimodal images and fine-tuned on 1.3 million radiographs) utilizes a tandem system with image-level rule-in and compartment-level rule-out to provide case-level output as NAD or potential actionable disease (PAD). RESULTS A total of 14,057 cases met our eligibility criteria (age 56 ± 16.1 years; 55% women and 45% men). The prevalence of NAD cases in the study population was 70.7%. The AI NAD Analyzer correctly classified NAD cases with a sensitivity of 29.1% and a yield of 20.6%. The specificity was 98.9%, which corresponds to a miss rate of 0.3% of cases. Significant findings were missed in 0.06% of cases, while no cases with critical findings were missed by AI. CONCLUSION In an outpatient population, AI can identify 20% of chest radiographs as NAD with a very low rate of missed findings. These cases could potentially be read using a streamlined protocol, thus improving efficiency and reducing the daily workload for radiologists.
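The tandem rule-in/rule-out logic is described only at a high level; the sketch below illustrates one plausible reading, with wholly hypothetical score names and thresholds:

```python
def classify_case(nad_score, compartment_scores,
                  rule_in_thr=0.95, rule_out_thr=0.05):
    """Hypothetical tandem logic: the image-level model must rule the case in
    as NAD AND every compartment-level abnormality score must be ruled out.
    Thresholds and score semantics are illustrative assumptions."""
    if nad_score >= rule_in_thr and all(s <= rule_out_thr
                                        for s in compartment_scores.values()):
        return "NAD"   # no actionable disease: candidate for streamlined reading
    return "PAD"       # potential actionable disease: standard reading

# example: a high image-level NAD score is overridden by a suspicious lung score
print(classify_case(0.97, {"lungs": 0.30, "heart": 0.01, "bones": 0.02}))
```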
4. Islam S, Murthy VN, Neumann D, Das BK, Sharma P, Maier A, Comaniciu D, Ghesu FC. Self-supervised learning for interventional image analytics: toward robust device trackers. J Med Imaging (Bellingham) 2024; 11:035001. [PMID: 38756438] [PMCID: PMC11094643] [DOI: 10.1117/1.jmi.11.3.035001]
Abstract
Purpose The accurate detection and tracking of devices, such as guiding catheters in live X-ray image acquisitions, are essential prerequisites for endovascular cardiac interventions. This information is leveraged for procedural guidance, e.g., directing stent placements. To ensure procedural safety and efficacy, tracking must be highly robust, with no failures. Achieving this requires efficiently tackling challenges such as device obscuration by the contrast agent or by other external devices or wires, changes in the field of view or acquisition angle, and continuous movement due to cardiac and respiratory motion. Approach To overcome the aforementioned challenges, we propose an approach that learns spatio-temporal features from a very large data cohort of over 16 million interventional X-ray frames using self-supervision for image sequence data. Our approach is based on a masked image modeling technique that leverages frame-interpolation-based reconstruction to learn fine inter-frame temporal correspondences. The features encoded in the resulting model are fine-tuned downstream in a lightweight model. Results Our approach achieves state-of-the-art performance, in particular for robustness, compared with highly optimized reference solutions (that use multi-stage feature fusion or multi-task and flow regularization). The experiments show that our method achieves a 66.31% reduction in the maximum tracking error against the reference solutions (23.20% when flow regularization is used), achieving a success score of 97.95% at a 3× faster inference speed of 42 frames per second (on GPU). In addition, we achieve a 20% reduction in the standard deviation of errors, which indicates much more stable tracking performance. Conclusions The proposed data-driven approach achieves superior performance, particularly in robustness and speed, compared with the frequently used multi-modular approaches for device tracking. The results encourage the use of our approach in various other tasks within interventional image analytics that require effective understanding of spatio-temporal semantics.
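As an illustration of the frame-interpolation pretext task, here is a toy PyTorch sketch in which the middle frame of a clip is masked and reconstructed from its temporal neighbors; the tiny 3D-convolutional network and the masking scheme are placeholders, not the paper's architecture:

```python
import torch
import torch.nn as nn

class FrameInterpolator(nn.Module):
    """Toy stand-in for the masked-image-modeling pretext task: reconstruct a
    masked middle frame from its neighbours (architecture is an assumption)."""
    def __init__(self, ch=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, ch, 3, padding=1), nn.ReLU(),
            nn.Conv3d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv3d(ch, 1, 3, padding=1),
        )
    def forward(self, clip):                  # clip: (B, 1, T, H, W)
        return self.net(clip)

clip = torch.randn(2, 1, 5, 64, 64)           # batch of 5-frame X-ray clips
target = clip[:, :, 2:3].clone()              # middle frame is the target
masked = clip.clone()
masked[:, :, 2] = 0.0                         # zero out the middle frame
model = FrameInterpolator()
recon = model(masked)[:, :, 2:3]              # predict only the masked frame
loss = nn.functional.mse_loss(recon, target)  # inter-frame correspondence signal
loss.backward()
```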
5. Das BK, Zhao G, Islam S, Re TJ, Comaniciu D, Gibson E, Maier A. Co-ordinate-based positional embedding that captures resolution to enhance transformer's performance in medical image analysis. Sci Rep 2024; 14:9380. [PMID: 38654066] [DOI: 10.1038/s41598-024-59813-x]
Abstract
Vision transformers (ViTs) have revolutionized computer vision by employing self-attention instead of convolutional neural networks, demonstrating success through their ability to capture global dependencies and remove the spatial biases of locality. In medical imaging, where input data may differ in size and resolution, existing architectures require resampling or resizing during pre-processing, leading to potential spatial resolution loss and information degradation. This study proposes a co-ordinate-based embedding that encodes the geometry of medical images, capturing physical co-ordinate and resolution information without the need for resampling or resizing. The effectiveness of the proposed embedding is demonstrated through experiments with UNETR and SwinUNETR models for infarct segmentation on an MRI dataset with AxTrace and AxADC contrasts. The dataset consists of 1142 training, 133 validation, and 143 test subjects. With the addition of the co-ordinate-based positional embedding, the two models achieved substantial improvements in mean Dice score of 6.5% and 7.6%, respectively. The proposed embedding showed a statistically significant advantage (p < 0.0001) over alternative approaches. In conclusion, the proposed co-ordinate-based pixel-wise positional embedding method offers a promising solution for Transformer-based models in medical image analysis. It effectively leverages physical co-ordinate information to enhance performance without compromising spatial resolution and provides a foundation for future advancements in positional embedding techniques for medical applications.
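One plausible reading of such a co-ordinate-based embedding is to project each patch's physical (scanner-space) position, derived from the image origin and voxel spacing, into the token dimension and add it to the patch embeddings. A PyTorch sketch under that assumption (the projection layer and the choice of input features are guesses, not the published design):

```python
import torch
import torch.nn as nn

class CoordinateEmbedding(nn.Module):
    """Sketch: each patch token gets an embedding projected from its physical
    position (origin + index * spacing, in mm), so resolution is encoded
    without resampling. The linear projection is an assumption."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(3, dim)          # (x, y, z) in mm -> token dim

    def forward(self, tokens, index_grid, origin, spacing):
        # index_grid: (N, 3) integer patch indices; origin/spacing: (3,) in mm
        phys = origin + index_grid.float() * spacing   # physical coordinates
        return tokens + self.proj(phys)

tokens = torch.randn(1, 8, 32)                 # 8 patch tokens, dim 32
idx = torch.stack(torch.meshgrid(torch.arange(2), torch.arange(2),
                                 torch.arange(2), indexing="ij"), -1).reshape(-1, 3)
emb = CoordinateEmbedding(32)
out = emb(tokens, idx, torch.tensor([0., 0., 0.]), torch.tensor([0.9, 0.9, 3.0]))
```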
6. Yoo Y, Gibson E, Zhao G, Sandu A, Re T, Das J, Hesheng W, Kim MM, Shen C, Lee YZ, Kondziolka D, Ibrahim M, Lian J, Jain R, Zhu T, Parmar H, Comaniciu D, Balter J, Cao Y. An Automated Brain Metastasis Detection and Segmentation System from MRI with a Large Multi-Institutional Dataset. Int J Radiat Oncol Biol Phys 2023; 117:S88-S89. [PMID: 37784596] [DOI: 10.1016/j.ijrobp.2023.06.414]
Abstract
PURPOSE/OBJECTIVE(S) Automated systems for brain metastasis (BM) detection and segmentation from MRI, intended to assist early detection and stereotactic radiosurgery (SRS), have been reported, but most were based on relatively small datasets from single institutes. This work aims to develop and evaluate a system using a large multi-institutional dataset, and to improve both the identification of small/subtle BMs and the segmentation accuracy of large BMs. MATERIALS/METHODS A 3D U-Net system was trained and evaluated to detect and segment intraparenchymal BMs with a size > 2 mm using 1856 MRI volumes from 1791 patients treated with SRS from seven institutions (1539 volumes for training, 183 for validation, and 134 for testing). All patients had 3D post-Gd T1w MRI scans pre-SRS. Gross tumor volumes (GTVs) of BMs for SRS were first curated by each institute. Additional efforts, including central reviews by two radiologists, were then spent to create GTVs for the untreated and/or uncontoured BMs, improving the accuracy of the ground truth. The training dataset was augmented with synthetic BMs in 3773 MRIs using a 3D generative pipeline. Our system consists of two U-Nets, one using small 3D patches dedicated to detecting small BMs and another using large 3D patches for segmenting large BMs, and a random-forest-based fusion module that combines the two network outputs. The first U-Net was trained with 3D patches containing at least one BM < 0.1 cm3. For detection performance, we measured BM-level sensitivity and case-level false-positive (FP) rate. For segmentation performance, we measured BM-level Dice similarity coefficient (DSC) and 95th-percentile Hausdorff distance (HD95). We also stratified performance by BM size. RESULTS For 739 BMs in the 134 testing cases, the overall lesion-level sensitivity was 0.870 with an average case-level FP of 1.34±1.92 (95% CI: 1.02-1.67). The sensitivity was > 0.969 for BMs > 0.1 cm3, but dropped to 0.755 for BMs < 0.1 cm3 (Table 1). The average DSC and HD95 for all detected BMs were 0.786 and 1.35 mm. The worst performance, for BMs > 20 cm3, was caused by a case with an 83 cm3 GTV and artifacts in the MRI volume. CONCLUSION We achieved excellent detection sensitivity and segmentation accuracy for BMs > 0.1 cm3, and promising performance for small BMs (< 0.1 cm3) with a controlled FP rate using a large multi-institutional dataset. Clinical utility for assisting early detection and SRS planning will be investigated. Table 1: Per-lesion detection and segmentation performance stratified by individual BM size. N is the number of BMs in each category.
7. Ghesu FC, Georgescu B, Mansoor A, Yoo Y, Neumann D, Patel P, Vishwanath RS, Balter JM, Cao Y, Grbic S, Comaniciu D. Contrastive self-supervised learning from 100 million medical images with optional supervision. J Med Imaging (Bellingham) 2022; 9:064503. [PMID: 36466078] [PMCID: PMC9710476] [DOI: 10.1117/1.jmi.9.6.064503]
Abstract
Purpose Building accurate and robust artificial intelligence systems for medical image assessment requires the creation of large sets of annotated training examples. However, constructing such datasets is very costly due to the complex nature of annotation tasks, which often require expert knowledge (e.g., a radiologist). To counter this limitation, we propose a method to learn from medical images at scale in a self-supervised way. Approach Our approach, based on contrastive learning and online feature clustering, leverages training datasets of over 100,000,000 medical images of various modalities, including radiography, computed tomography (CT), magnetic resonance (MR) imaging, and ultrasonography (US). We propose to use the learned features to guide model training in supervised and hybrid self-supervised/supervised regimes on various downstream tasks. Results We highlight a number of advantages of this strategy on challenging image assessment problems in radiography, CT, and MR: (1) significant increase in accuracy compared to the state-of-the-art (e.g., area under the curve boost of 3% to 7% for detection of abnormalities from chest radiography scans and hemorrhage detection on brain CT); (2) acceleration of model convergence during training by up to 85% compared with using no pretraining (e.g., 83% when training a model for detection of brain metastases in MR scans); and (3) increase in robustness to various image augmentations, such as intensity variations, rotations, or scaling reflective of data variation seen in the field. Conclusions The proposed approach enables large gains in accuracy and robustness on challenging image assessment problems. The improvement is significant compared with other state-of-the-art approaches trained on medical or vision images (e.g., ImageNet).
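In the spirit of the contrastive-plus-online-clustering objective, here is a compact PyTorch sketch of a swapped-assignment loss over learnable prototypes; the published method uses an online (Sinkhorn-style) cluster assignment, which is simplified here to sharpened softmax targets:

```python
import torch
import torch.nn.functional as F

def swapped_prediction_loss(z1, z2, prototypes, temp=0.1):
    """Simplified swapped-assignment loss: each augmented view predicts the
    cluster assignment of the other. Sharpened softmax targets stand in for
    the paper's online clustering step (a simplification)."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    c = F.normalize(prototypes, dim=1)
    s1, s2 = z1 @ c.t() / temp, z2 @ c.t() / temp      # similarity to prototypes
    with torch.no_grad():                              # targets: sharpened assignments
        q1, q2 = F.softmax(s1 / 0.05, 1), F.softmax(s2 / 0.05, 1)
    return (-(q2 * F.log_softmax(s1, 1)).sum(1).mean()
            - (q1 * F.log_softmax(s2, 1)).sum(1).mean())

z1, z2 = torch.randn(8, 128), torch.randn(8, 128)      # embeddings of two views
protos = torch.randn(64, 128, requires_grad=True)      # 64 learnable prototypes
swapped_prediction_loss(z1, z2, protos).backward()
```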
8. Gibson E, Georgescu B, Ceccaldi P, Trigan PH, Yoo Y, Das J, Re TJ, Rs V, Balachandran A, Eibenberger E, Chekkoury A, Brehm B, Bodanapally UK, Nicolaou S, Sanelli PC, Schroeppel TJ, Flohr T, Comaniciu D, Lui YW. Artificial Intelligence with Statistical Confidence Scores for Detection of Acute or Subacute Hemorrhage on Noncontrast CT Head Scans. Radiol Artif Intell 2022; 4:e210115. [PMID: 35652116] [DOI: 10.1148/ryai.210115]
Abstract
Purpose To present a method that automatically detects, subtypes, and locates acute or subacute intracranial hemorrhage (ICH) on noncontrast CT (NCCT) head scans; generates detection confidence scores to identify high-confidence data subsets with higher accuracy; and improves radiology worklist prioritization. Such scores may enable clinicians to better use artificial intelligence (AI) tools. Materials and Methods This retrospective study included 46,057 studies from seven "internal" centers for development (training, architecture selection, hyperparameter tuning, and operating-point calibration; n = 25,946) and evaluation (n = 2947) and three "external" centers for calibration (n = 400) and evaluation (n = 16,764). Internal centers contributed developmental data, whereas external centers did not. Deep neural networks predicted the presence of ICH and subtypes (intraparenchymal, intraventricular, subarachnoid, subdural, and/or epidural hemorrhage) and segmentations per case. Two ICH confidence scores are discussed: a calibrated classifier entropy score and a Dempster-Shafer score. Evaluation was completed by using receiver operating characteristic curve analysis and report turnaround time (RTAT) modeling on the evaluation set and on confidence score-defined subsets using bootstrapping. Results The areas under the receiver operating characteristic curve for ICH were 0.97 (0.97, 0.98) and 0.95 (0.94, 0.95) on internal and external center data, respectively. On 80% of the data stratified by calibrated classifier and Dempster-Shafer scores, the system improved the Youden indexes, increasing them from 0.84 to 0.93 (calibrated classifier) and from 0.84 to 0.92 (Dempster-Shafer) for internal centers, and from 0.78 to 0.88 (calibrated classifier) and from 0.78 to 0.89 (Dempster-Shafer) for external centers (P < .001). Models estimated shorter RTAT for AI-prioritized worklists with confidence measures than for AI-prioritized worklists without confidence measures, shortening RTAT by 27% (calibrated classifier) and 27% (Dempster-Shafer) for internal centers, and by 25% (calibrated classifier) and 27% (Dempster-Shafer) for external centers (P < .001). Conclusion AI that provided statistical confidence measures for ICH detection on NCCT scans reliably detected and subtyped hemorrhages, identified high-confidence predictions, and improved worklist prioritization in simulation. Keywords: CT, Head/Neck, Hemorrhage, Convolutional Neural Network (CNN). Supplemental material is available for this article. © RSNA, 2022.
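The calibrated classifier entropy score can be illustrated directly: for a calibrated binary probability, normalized entropy gives a confidence that is highest when p is near 0 or 1. A small numpy sketch (the Dempster-Shafer score is a separate construction not reproduced here):

```python
import numpy as np

def entropy_confidence(p):
    """Confidence from a calibrated binary probability via normalized entropy:
    1 at p in {0, 1}, 0 at p = 0.5."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    h = -(p * np.log2(p) + (1 - p) * np.log2(1 - p))   # entropy in bits, max 1
    return 1.0 - h

probs = np.array([0.02, 0.50, 0.97])
conf = entropy_confidence(probs)
keep = probs[conf >= np.quantile(conf, 0.2)]           # retain the 80% most confident
```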
9. Jung HM, Yang R, Gefter WB, Ghesu FC, Mailhe B, Mansoor A, Grbic S, Comaniciu D, Vogt S, Mortani Barbosa EJ. Value of quantitative airspace disease measured on chest CT and chest radiography at initial diagnosis compared to clinical variables for prediction of severe COVID-19. J Med Imaging (Bellingham) 2022; 9:034003. [PMID: 35721308] [DOI: 10.1117/1.jmi.9.3.034003]
Abstract
Purpose: Rapid prognostication of COVID-19 patients is important for efficient resource allocation. We evaluated the relative prognostic value of baseline clinical variables (CVs), quantitative human-read chest CT (qCT), and AI-read chest radiograph (qCXR) airspace disease (AD) in predicting severe COVID-19. Approach: We retrospectively selected 131 COVID-19 patients (SARS-CoV-2 positive, March to October, 2020) at a tertiary hospital in the United States, who underwent chest CT and CXR within 48 hr of initial presentation. CVs included patient demographics and laboratory values; imaging variables included qCT volumetric percentage AD (POv) and qCXR area-based percentage AD (POa), assessed by a deep convolutional neural network. Our prognostic outcome was need for ICU admission. We compared the performance of three logistic regression models: using CVs known to be associated with prognosis (model I), using a dimension-reduced set of best predictor variables (model II), and using only age and AD (model III). Results: 60/131 patients required ICU admission, whereas 71/131 did not. Model I performed the poorest (AUC = 0.67 [0.58 to 0.76]; accuracy = 77%). Model II performed the best (AUC = 0.78 [0.71 to 0.86]; accuracy = 81%). Model III was equivalent (AUC = 0.75 [0.67 to 0.84]; accuracy = 80%). Both models II and III outperformed model I (AUC difference = 0.11 [0.02 to 0.19], p = 0.01; AUC difference = 0.08 [0.01 to 0.15], p = 0.04, respectively). Model II and III results did not change significantly when POv was replaced by POa. Conclusions: Severe COVID-19 can be predicted using only age and quantitative AD imaging metrics at initial diagnosis, which outperform the set of CVs. Moreover, AI-read qCXR can replace qCT metrics without loss of prognostic performance, promising more resource-efficient prognostication.
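Model III reduces to a two-feature logistic regression. A self-contained sketch with synthetic stand-in data (the real study fit and validated the model on the cohort described above; nothing here reproduces its data or coefficients):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Illustrative "model III": ICU admission from age and % airspace disease.
# Synthetic data stands in for the study cohort.
rng = np.random.default_rng(0)
age = rng.normal(60, 15, 131)
pov = rng.uniform(0, 60, 131)                 # % airspace disease (qCT POv)
y = (0.03 * age + 0.05 * pov + rng.normal(0, 1, 131) > 4).astype(int)

X = np.column_stack([age, pov])
model = LogisticRegression().fit(X, y)
print("in-sample AUC:", roc_auc_score(y, model.predict_proba(X)[:, 1]))
```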
10. Singh V, Kamaleswaran R, Chalfin D, Buño-Soto A, San Roman J, Rojas-Kenney E, Molinaro R, von Sengbusch S, Hodjat P, Comaniciu D, Kamen A. A deep learning approach for predicting severity of COVID-19 patients using a parsimonious set of laboratory markers. iScience 2021; 24:103523. [PMID: 34870131] [PMCID: PMC8626152] [DOI: 10.1016/j.isci.2021.103523]
Abstract
The SARS-CoV-2 virus has caused tremendous healthcare burden worldwide. Our focus was to develop a practical and easy-to-deploy system to predict the severe manifestation of disease in patients with COVID-19, with an aim to assist clinicians in triage and treatment decisions. Our proposed predictive algorithm is a trained artificial intelligence-based network using 8,427 COVID-19 patient records from four healthcare systems. The model provides a severity risk score along with likelihoods of various clinical outcomes, namely ventilator use and mortality. The trained model, using patient age and nine laboratory markers, predicts the need for ventilator use with an area under the curve (AUC) of 0.78 (95% CI: 0.77–0.82) and a negative predictive value (NPV) of 0.86 (95% CI: 0.84–0.88), and predicts in-hospital 30-day mortality with an AUC of 0.85 (95% CI: 0.84–0.86) and an NPV of 0.94 (95% CI: 0.92–0.96).
Highlights:
- An algorithm using 9 laboratory markers and age may predict severity in patients with COVID-19.
- The model was trained and tested on a multicenter sample of 10,937 patients.
- The algorithm can predict ventilator use (NPV, 0.86) and mortality (NPV, 0.94).
- The high NPV suggests utility as an adjunct to aid in triaging of patients with COVID-19.
11. Winkel DJ, Tong A, Lou B, Kamen A, Comaniciu D, Disselhorst JA, Rodríguez-Ruiz A, Huisman H, Szolar D, Shabunin I, Choi MH, Xing P, Penzkofer T, Grimm R, von Busch H, Boll DT. A Novel Deep Learning Based Computer-Aided Diagnosis System Improves the Accuracy and Efficiency of Radiologists in Reading Biparametric Magnetic Resonance Images of the Prostate: Results of a Multireader, Multicase Study. Invest Radiol 2021; 56:605-613. [PMID: 33787537] [DOI: 10.1097/rli.0000000000000780]
Abstract
OBJECTIVE The aim of this study was to evaluate the effect of a deep learning based computer-aided diagnosis (DL-CAD) system on radiologists' interpretation accuracy and efficiency in reading biparametric prostate magnetic resonance imaging scans. MATERIALS AND METHODS We selected 100 consecutive prostate magnetic resonance imaging cases from a publicly available data set (PROSTATEx Challenge) with and without histopathologically confirmed prostate cancer. Seven board-certified radiologists were tasked to read each case twice in 2 reading blocks (with and without the assistance of a DL-CAD), with a separation between the 2 reading sessions of at least 2 weeks. Reading tasks were to localize and classify lesions according to Prostate Imaging Reporting and Data System (PI-RADS) v2.0 and to assign a radiologist's level of suspicion score (scale from 1-5 in 0.5 increments; 1, benign; 5, malignant). Ground truth was established by consensus readings of 3 experienced radiologists. The detection performance (receiver operating characteristic curves), variability (Fleiss κ), and average reading time without DL-CAD assistance were evaluated. RESULTS The average accuracy of radiologists in terms of area under the curve in detecting clinically significant cases (PI-RADS ≥ 4) was 0.84 (95% confidence interval [CI], 0.79-0.89), whereas with DL-CAD it was 0.88 (95% CI, 0.83-0.94), an improvement of 4.4% (95% CI, 1.1%-7.7%; P = 0.010). Interreader concordance (in terms of Fleiss κ) increased from 0.22 to 0.36 (P = 0.003). Accuracy of radiologists in detecting cases with PI-RADS ≥ 3 was improved by 2.9% (P = 0.10). The median reading time in the unaided/aided scenario was reduced by 21% from 103 to 81 seconds (P < 0.001). CONCLUSIONS Using a DL-CAD system increased the diagnostic accuracy in detecting highly suspicious prostate lesions and reduced both the interreader variability and the reading time.
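Interreader concordance here is measured with Fleiss' κ, which generalizes Cohen's κ to more than two raters. A small numpy implementation of the standard formula, with a made-up 7-reader example:

```python
import numpy as np

def fleiss_kappa(counts):
    """Fleiss' kappa. `counts` is an (N subjects x k categories) matrix where
    each row gives how many of the n raters chose each category
    (every row sums to n)."""
    counts = np.asarray(counts, dtype=float)
    n = counts[0].sum()                                  # raters per subject
    p_i = ((counts ** 2).sum(1) - n) / (n * (n - 1))     # per-subject agreement
    p_j = counts.sum(0) / counts.sum()                   # category marginals
    p_bar, p_e = p_i.mean(), (p_j ** 2).sum()
    return (p_bar - p_e) / (1 - p_e)

# 7 readers rating 3 cases as benign / suspicious (illustrative numbers)
print(fleiss_kappa([[6, 1], [3, 4], [0, 7]]))
```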
12. Gündel S, Setio AAA, Ghesu FC, Grbic S, Georgescu B, Maier A, Comaniciu D. Robust classification from noisy labels: Integrating additional knowledge for chest radiography abnormality assessment. Med Image Anal 2021; 72:102087. [PMID: 34015595] [DOI: 10.1016/j.media.2021.102087]
Abstract
Chest radiography is the most common radiographic examination performed in daily clinical practice for the detection of various heart and lung abnormalities. The large amount of data to be read and reported, with more than 100 studies per day for a single radiologist, poses a challenge in consistently maintaining high interpretation accuracy. The introduction of large-scale public datasets has led to a series of novel systems for automated abnormality classification. However, the labels of these datasets were obtained from natural-language-processed medical reports, yielding a large degree of label noise that can impact performance. In this study, we propose novel training strategies that handle label noise from such suboptimal data. Prior label probabilities were measured on a subset of training data re-read by 4 board-certified radiologists and were used during training to increase the robustness of the trained model to label noise. Furthermore, we exploit the high comorbidity of abnormalities observed in chest radiography and incorporate this information to further reduce the impact of label noise. Additionally, anatomical knowledge is incorporated by training the system to predict lung and heart segmentation, as well as spatial knowledge labels. To deal with multiple datasets and images derived from various scanners that apply different post-processing techniques, we introduce a novel image normalization strategy. Experiments were performed on an extensive collection of 297,541 chest radiographs from 86,876 patients, leading to state-of-the-art performance for 17 abnormalities from 2 datasets. With an average AUC score of 0.880 across all abnormalities, our proposed training strategies significantly improve performance scores.
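One simple way to use measured prior label probabilities during training is to replace each noisy binary label with the estimated probability that the finding is truly present given that label, then train against these soft targets. A hedged PyTorch sketch of that idea (the paper's actual integration of the priors is more involved; the prior values here are hypothetical):

```python
import torch
import torch.nn.functional as F

def noise_aware_bce(logits, noisy_labels, p_pos_given_1, p_pos_given_0):
    """BCE against soft targets derived from prior label probabilities: each
    noisy label is replaced by the estimated probability that the finding is
    truly present given that label (one simple reading of the strategy)."""
    targets = torch.where(noisy_labels == 1,
                          torch.full_like(logits, p_pos_given_1),
                          torch.full_like(logits, p_pos_given_0))
    return F.binary_cross_entropy_with_logits(logits, targets)

logits = torch.randn(4, requires_grad=True)
labels = torch.tensor([1., 0., 1., 0.])
# priors as measured on a radiologist re-read subset (hypothetical values)
loss = noise_aware_bce(logits, labels, p_pos_given_1=0.9, p_pos_given_0=0.1)
loss.backward()
```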
13. Nael K, Gibson E, Yang C, Ceccaldi P, Yoo Y, Das J, Doshi A, Georgescu B, Janardhanan N, Odry B, Nadar M, Bush M, Re TJ, Huwer S, Josan S, von Busch H, Meyer H, Mendelson D, Drayer BP, Comaniciu D, Fayad ZA. Automated detection of critical findings in multi-parametric brain MRI using a system of 3D neural networks. Sci Rep 2021; 11:6876. [PMID: 33767226] [PMCID: PMC7994311] [DOI: 10.1038/s41598-021-86022-7]
Abstract
With the rapid growth and increasing use of brain MRI, there is interest in automated image classification to aid human interpretation and improve workflow. We aimed to train a deep convolutional neural network and assess its performance in identifying abnormal brain MRIs and critical intracranial findings, including acute infarction, acute hemorrhage, and mass effect. A total of 13,215 clinical brain MRI studies were categorized into training (74%), validation (9%), internal testing (8%), and external testing (8%) datasets. Up to eight contrasts were included from each brain MRI, and each image volume was reformatted to a common resolution to accommodate differences between scanners. After reviewing the radiology reports, three neuroradiologists assigned each study as abnormal vs normal and identified three critical findings: acute infarction, acute hemorrhage, and mass effect. A deep convolutional neural network was constructed from a combination of localization feature extraction (LFE) modules and global classifiers to identify the presence of 4 variables in brain MRIs: abnormal, acute infarction, acute hemorrhage, and mass effect. Training, validation, and testing sets were randomly defined on a patient basis. Training was performed on 9845 studies using balanced sampling to address class imbalance. Receiver operating characteristic (ROC) analysis was performed. The ROC analysis of our models for 1050 studies within our internal test data showed AUC/sensitivity/specificity of 0.91/83%/86% for normal versus abnormal brain MRI, 0.95/92%/88% for acute infarction, 0.90/89%/81% for acute hemorrhage, and 0.93/93%/85% for mass effect. For 1072 studies within our external test data, it showed AUC/sensitivity/specificity of 0.88/80%/80% for normal versus abnormal brain MRI, 0.97/90%/97% for acute infarction, 0.83/72%/88% for acute hemorrhage, and 0.87/79%/81% for mass effect. Our proposed deep convolutional network can accurately identify abnormal and critical intracranial findings on individual brain MRIs, while addressing the fact that some MR contrasts might not be available in individual studies.
14. Weikert T, Rapaka S, Grbic S, Re T, Chaganti S, Winkel DJ, Anastasopoulos C, Niemann T, Wiggli BJ, Bremerich J, Twerenbold R, Sommer G, Comaniciu D, Sauter AW. Prediction of Patient Management in COVID-19 Using Deep Learning-Based Fully Automated Extraction of Cardiothoracic CT Metrics and Laboratory Findings. Korean J Radiol 2021; 22:994-1004. [PMID: 33686818] [PMCID: PMC8154782] [DOI: 10.3348/kjr.2020.0994]
Abstract
Objective To extract pulmonary and cardiovascular metrics from chest CTs of patients with coronavirus disease 2019 (COVID-19) using a fully automated deep learning-based approach and assess their potential to predict patient management. Materials and Methods All initial chest CTs of patients who tested positive for severe acute respiratory syndrome coronavirus 2 at our emergency department between March 25 and April 25, 2020, were identified (n = 120). Three patient management groups were defined: group 1 (outpatient), group 2 (general ward), and group 3 (intensive care unit [ICU]). Multiple pulmonary and cardiovascular metrics were extracted from the chest CT images using deep learning. Additionally, six laboratory findings indicating inflammation and cellular damage were considered. Differences in CT metrics, laboratory findings, and demographics between the patient management groups were assessed. The potential of these parameters to predict patients' needs for intensive care (yes/no) was analyzed using logistic regression and receiver operating characteristic curves. Internal and external validity were assessed using 109 independent chest CT scans. Results While demographic parameters alone (sex and age) were not sufficient to predict ICU management status, both CT metrics alone (including both pulmonary and cardiovascular metrics; area under the curve [AUC] = 0.88; 95% confidence interval [CI] = 0.79–0.97) and laboratory findings alone (C-reactive protein, lactate dehydrogenase, white blood cell count, and albumin; AUC = 0.86; 95% CI = 0.77–0.94) were good classifiers. Excellent performance was achieved by a combination of demographic parameters, CT metrics, and laboratory findings (AUC = 0.91; 95% CI = 0.85–0.98). Application of a model that combined both pulmonary CT metrics and demographic parameters on a dataset from another hospital indicated its external validity (AUC = 0.77; 95% CI = 0.66–0.88). Conclusion Chest CT of patients with COVID-19 contains valuable information that can be accessed using automated image analysis. These metrics are useful for the prediction of patient management.
15. Liu S, Setio AAA, Ghesu FC, Gibson E, Grbic S, Georgescu B, Comaniciu D. No Surprises: Training Robust Lung Nodule Detection for Low-Dose CT Scans by Augmenting With Adversarial Attacks. IEEE Trans Med Imaging 2021; 40:335-345. [PMID: 32966215] [DOI: 10.1109/tmi.2020.3026261]
Abstract
Detecting malignant pulmonary nodules at an early stage can allow medical interventions which may increase the survival rate of lung cancer patients. Using computer vision techniques to detect nodules can improve the sensitivity and the speed of interpreting chest CT for lung cancer screening. Many studies have used CNNs to detect nodule candidates. Though such approaches have been shown to outperform conventional image-processing-based methods in detection accuracy, CNNs are also known to generalize poorly on under-represented samples in the training set and to be prone to imperceptible noise perturbations. Such limitations cannot be easily addressed by scaling up the dataset or the models. In this work, we propose to add adversarial synthetic nodules and adversarial attack samples to the training data to improve the generalization and robustness of lung nodule detection systems. To generate hard examples of nodules from a differentiable nodule synthesizer, we use projected gradient descent (PGD) to search the latent code within a bounded neighbourhood that would generate nodules to decrease the detector response. To make the network more robust to unanticipated noise perturbations, we use PGD to search for noise patterns that can trigger the network to give over-confident mistakes. By evaluating on two different benchmark datasets containing consensus annotations from three radiologists, we show that the proposed techniques can improve the detection performance on real CT data. To understand the limitations of both the conventional networks and the proposed augmented networks, we also perform stress tests on the false-positive reduction networks by feeding different types of artificially produced patches. We show that the augmented networks are more robust to under-represented nodules as well as more resistant to noise perturbations.
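The PGD search for loss-maximizing noise patterns is a standard technique. A PyTorch sketch of the input-space variant (the latent-space search through the nodule synthesizer is not reproduced here):

```python
import torch

def pgd_perturb(model, x, y, loss_fn, eps=0.01, alpha=0.002, steps=10):
    """Projected gradient descent in input space: find a bounded noise pattern
    that maximizes the detector's loss, to be added back as a training sample."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = loss_fn(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()   # ascend the loss
            delta.clamp_(-eps, eps)              # project onto the L-inf ball
        delta.grad.zero_()
    return (x + delta).detach()
```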
16. Chaganti S, Balachandran A, Chabin G, Cohen S, Flohr T, Georgescu B, Grenier P, Grbic S, Liu S, Mellot F, Murray N, Nicolaou S, Parker W, Re T, Sanelli P, Sauter AW, Xu Z, Yoo Y, Ziebandt V, Comaniciu D. Automated Quantification of CT Patterns Associated with COVID-19 from Chest CT. arXiv 2020:arXiv:2004.01279v7. [PMID: 32550252] [PMCID: PMC7280906]
Abstract
PURPOSE To present a method that automatically segments and quantifies abnormal CT patterns commonly present in coronavirus disease 2019 (COVID-19), namely ground glass opacities and consolidations. MATERIALS AND METHODS In this retrospective study, the proposed method takes as input a non-contrasted chest CT and segments the lesions, lungs, and lobes in three dimensions, based on a dataset of 9749 chest CT volumes. The method outputs two combined measures of the severity of lung and lobe involvement, quantifying both the extent of COVID-19 abnormalities and the presence of high opacities, based on deep learning and deep reinforcement learning. The first pair of measures (PO, PHO) is global, while the second (LSS, LHOS) is lobe-wise. Evaluation of the algorithm is reported on CTs of 200 participants (100 COVID-19-confirmed patients and 100 healthy controls) from institutions in Canada, Europe, and the United States, collected between 2002 and the present (April 2020). Ground truth was established by manual annotations of lesions, lungs, and lobes. Correlation and regression analyses were performed to compare the predictions to the ground truth. RESULTS The Pearson correlation coefficient between method prediction and ground truth for COVID-19 cases was 0.92 for PO (P < .001), 0.97 for PHO (P < .001), 0.91 for LSS (P < .001), and 0.90 for LHOS (P < .001). 98 of 100 healthy controls had a predicted PO of less than 1%; 2 had between 1% and 2%. Automated processing time to compute the severity scores was 10 seconds per case, compared with 30 minutes required for manual annotations. CONCLUSION A new method segments regions of CT abnormalities associated with COVID-19 and computes (PO, PHO), as well as (LSS, LHOS), severity scores.
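The global measures can be illustrated from binary masks: PO is the percentage of lung volume that is abnormal, and PHO the percentage occupied by high opacities. A numpy sketch, assuming voxel-aligned masks and that the lobe-wise LSS/LHOS repeat the same computation per lobe (the paper's exact definitions may differ in detail):

```python
import numpy as np

def opacity_scores(lesion_mask, high_opacity_mask, lung_mask):
    """Global severity measures from binary 3D masks: PO = % of lung volume
    that is abnormal; PHO = % occupied by high opacities (consolidation)."""
    lung = lung_mask.sum()
    po = 100.0 * np.logical_and(lesion_mask, lung_mask).sum() / lung
    pho = 100.0 * np.logical_and(high_opacity_mask, lung_mask).sum() / lung
    return po, pho
```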
17. Winkel DJ, Wetterauer C, Matthias MO, Lou B, Shi B, Kamen A, Comaniciu D, Seifert HH, Rentsch CA, Boll DT. Autonomous Detection and Classification of PI-RADS Lesions in an MRI Screening Population Incorporating Multicenter-Labeled Deep Learning and Biparametric Imaging: Proof of Concept. Diagnostics (Basel) 2020; 10:951. [PMID: 33202680] [PMCID: PMC7697194] [DOI: 10.3390/diagnostics10110951]
Abstract
Background: Opportunistic prostate cancer (PCa) screening is a controversial topic. Magnetic resonance imaging (MRI) has proven to detect prostate cancer with a high sensitivity and specificity, leading to the idea of performing an image-guided prostate cancer (PCa) screening. Methods: We evaluated a prospectively enrolled cohort of 49 healthy men participating in a dedicated image-guided PCa screening trial employing a biparametric MRI (bpMRI) protocol consisting of T2-weighted (T2w) and diffusion-weighted imaging (DWI) sequences. Datasets were analyzed both by human readers and by a fully automated artificial intelligence (AI) software using deep learning (DL). Agreement between the algorithm and the reports, which served as the ground truth, was compared on a per-case and per-lesion level using metrics of diagnostic accuracy and κ statistics. Results: The DL method yielded an 87% sensitivity (33/38) and 50% specificity (5/10) with a κ of 0.42. 12/28 (43%) Prostate Imaging Reporting and Data System (PI-RADS) 3, 16/22 (73%) PI-RADS 4, and 5/5 (100%) PI-RADS 5 lesions were detected compared with the ground truth. Targeted biopsy revealed PCa in six participants, all correctly diagnosed by both the human readers and the AI. Conclusions: The results of our study show that, in our AI-assisted, image-guided prostate cancer screening, the software solution was able to identify highly suspicious lesions and has the potential to effectively guide the targeted-biopsy workflow.
18. Ghesu FC, Georgescu B, Mansoor A, Yoo Y, Gibson E, Vishwanath RS, Balachandran A, Balter JM, Cao Y, Singh R, Digumarthy SR, Kalra MK, Grbic S, Comaniciu D. Quantifying and leveraging predictive uncertainty for medical image assessment. Med Image Anal 2020; 68:101855. [PMID: 33260116] [DOI: 10.1016/j.media.2020.101855]
Abstract
The interpretation of medical images is a challenging task, often complicated by the presence of artifacts, occlusions, limited contrast and more. Most notable is the case of chest radiography, where there is a high inter-rater variability in the detection and classification of abnormalities. This is largely due to inconclusive evidence in the data or subjective definitions of disease appearance. An additional example is the classification of anatomical views based on 2D Ultrasound images. Often, the anatomical context captured in a frame is not sufficient to recognize the underlying anatomy. Current machine learning solutions for these problems are typically limited to providing probabilistic predictions, relying on the capacity of underlying models to adapt to limited information and the high degree of label noise. In practice, however, this leads to overconfident systems with poor generalization on unseen data. To account for this, we propose a system that learns not only the probabilistic estimate for classification, but also an explicit uncertainty measure which captures the confidence of the system in the predicted output. We argue that this approach is essential to account for the inherent ambiguity characteristic of medical images from different radiologic exams including computed radiography, ultrasonography and magnetic resonance imaging. In our experiments we demonstrate that sample rejection based on the predicted uncertainty can significantly improve the ROC-AUC for various tasks, e.g., by 8% to 0.91 with an expected rejection rate of under 25% for the classification of different abnormalities in chest radiographs. In addition, we show that using uncertainty-driven bootstrapping to filter the training data, one can achieve a significant increase in robustness and accuracy. Finally, we present a multi-reader study showing that the predictive uncertainty is indicative of reader errors.
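The sample-rejection experiment has a simple form: discard the most-uncertain fraction of test cases and re-score the remainder. A sketch with synthetic scores and a toy uncertainty (in the paper, the uncertainty is an explicit model output, not derived from the probability as it is here):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def auc_with_rejection(y, probs, uncertainty, reject_rate=0.25):
    """Drop the most-uncertain fraction of samples and score the rest."""
    thr = np.quantile(uncertainty, 1 - reject_rate)
    keep = uncertainty < thr
    return roc_auc_score(y[keep], probs[keep]), keep.mean()

rng = np.random.default_rng(1)
y = rng.integers(0, 2, 500)
probs = np.clip(y * 0.6 + rng.normal(0.2, 0.25, 500), 0, 1)  # synthetic scores
unc = np.abs(probs - 0.5).max() - np.abs(probs - 0.5)        # toy uncertainty
print(auc_with_rejection(y, probs, unc))
```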
19. Wang DD, Qian Z, Vukicevic M, Engelhardt S, Kheradvar A, Zhang C, Little SH, Verjans J, Comaniciu D, O'Neill WW, Vannan MA. 3D Printing, Computational Modeling, and Artificial Intelligence for Structural Heart Disease. JACC Cardiovasc Imaging 2020; 14:41-60. [PMID: 32861647] [DOI: 10.1016/j.jcmg.2019.12.022]
Abstract
Structural heart disease (SHD) is a new field within cardiovascular medicine. Traditional imaging modalities fall short in supporting the needs of SHD interventions, as they have been constructed around the concept of disease diagnosis. SHD interventions disrupt traditional concepts of imaging by requiring imaging to plan, simulate, and predict intraprocedural outcomes. In transcatheter SHD interventions, the absence of a gold-standard open-cavity surgical field deprives physicians of the opportunity for tactile feedback and visual confirmation of cardiac anatomy. Hence, dependency on imaging in periprocedural guidance has led to the evolution of a new generation of procedural skillsets, the concept of a visual field, and technologies in the periprocedural planning period to accelerate preclinical device development and physician and patient education. Adoption of 3-dimensional (3D) printing in clinical care and procedural planning has demonstrated a reduction in the early-operator learning curve for transcatheter interventions. Integration of computational modeling with 3D printing has accelerated research and development understanding of fluid mechanics within device testing. Application of 3D printing and computational modeling, and ultimately the incorporation of artificial intelligence, is changing the landscape of physician training and the delivery of patient-centric care. Transcatheter structural heart interventions require an in-depth periprocedural understanding of cardiac pathophysiology and device interactions not afforded by traditional imaging metrics.
20. Chaganti S, Grenier P, Balachandran A, Chabin G, Cohen S, Flohr T, Georgescu B, Grbic S, Liu S, Mellot F, Murray N, Nicolaou S, Parker W, Re T, Sanelli P, Sauter AW, Xu Z, Yoo Y, Ziebandt V, Comaniciu D. Automated Quantification of CT Patterns Associated with COVID-19 from Chest CT. Radiol Artif Intell 2020; 2:e200048. [PMID: 33928255] [PMCID: PMC7392373] [DOI: 10.1148/ryai.2020200048]
Abstract
PURPOSE To present a method that automatically segments and quantifies abnormal CT patterns commonly present in coronavirus disease 2019 (COVID-19), namely ground glass opacities and consolidations. MATERIALS AND METHODS In this retrospective study, the proposed method takes as input a non-contrasted chest CT and segments the lesions, lungs, and lobes in three dimensions, based on a dataset of 9749 chest CT volumes. The method outputs two combined measures of the severity of lung and lobe involvement, quantifying both the extent of COVID-19 abnormalities and the presence of high opacities, based on deep learning and deep reinforcement learning. The first pair of measures (PO, PHO) is global, while the second (LSS, LHOS) is lobe-wise. Evaluation of the algorithm is reported on CTs of 200 participants (100 COVID-19-confirmed patients and 100 healthy controls) from institutions in Canada, Europe, and the United States, collected between 2002 and the present (April 2020). Ground truth was established by manual annotations of lesions, lungs, and lobes. Correlation and regression analyses were performed to compare the predictions to the ground truth. RESULTS The Pearson correlation coefficient between method prediction and ground truth for COVID-19 cases was 0.92 for PO (P < .001), 0.97 for PHO (P < .001), 0.91 for LSS (P < .001), and 0.90 for LHOS (P < .001). 98 of 100 healthy controls had a predicted PO of less than 1%; 2 had between 1% and 2%. Automated processing time to compute the severity scores was 10 seconds per case, compared with 30 minutes required for manual annotations. CONCLUSION A new method segments regions of CT abnormalities associated with COVID-19 and computes (PO, PHO), as well as (LSS, LHOS), severity scores.
21. Winkel DJ, Weikert TJ, Breit HC, Chabin G, Gibson E, Heye TJ, Comaniciu D, Boll DT. Validation of a fully automated liver segmentation algorithm using multi-scale deep reinforcement learning and comparison versus manual segmentation. Eur J Radiol 2020; 126:108918. [PMID: 32171914] [DOI: 10.1016/j.ejrad.2020.108918]
Abstract
PURPOSE To evaluate the performance of an artificial intelligence (AI) based software solution for liver volumetric analyses and to compare the results to manual contour segmentation. MATERIALS AND METHODS We retrospectively obtained 462 multiphasic CT datasets with six series for each patient: three different contrast phases and two slice-thickness reconstructions (1.5/5 mm), totaling 2772 series. AI-based liver volumes were determined using multi-scale deep reinforcement learning for 3D body-marker detection and 3D structure segmentation. The algorithm was trained for liver volumetry on approximately 5000 datasets. We computed the absolute error of each automatically and manually derived volume relative to the mean manual volume. The mean processing time per dataset was recorded for each method. Variations of liver volumes were compared using univariate generalized linear model analyses. A subgroup of 60 datasets was manually segmented by three radiologists, with a further subgroup of 20 segmented three times by each, to compare the automatically derived results with the ground truth. RESULTS The mean absolute error of the automatically derived measurement was 44.3 mL (representing 2.37% of the averaged liver volumes). The liver volume was neither dependent on the contrast phase (p = 0.697) nor on the slice thickness (p = 0.446). The mean processing time per dataset was 9.94 seconds with the algorithm, compared with 219.34 seconds for manual segmentation. We found excellent agreement between both approaches, with an ICC value of 0.996. CONCLUSION The results of our study demonstrate that AI-powered fully automated liver volumetric analyses can be done with excellent accuracy, reproducibility, robustness, speed, and agreement with manual segmentation.
22. Taghanaki SA, Zheng Y, Kevin Zhou S, Georgescu B, Sharma P, Xu D, Comaniciu D, Hamarneh G. Combo loss: Handling input and output imbalance in multi-organ segmentation. Comput Med Imaging Graph 2019; 75:24-33. [PMID: 31129477] [DOI: 10.1016/j.compmedimag.2019.04.005]
Abstract
Simultaneous segmentation of multiple organs from different medical imaging modalities is a crucial task, as it can be utilized for computer-aided diagnosis, computer-assisted surgery, and therapy planning. Thanks to recent advances in deep learning, several deep neural networks for medical image segmentation have been introduced successfully for this purpose. In this paper, we focus on learning a deep multi-organ segmentation network that labels voxels. In particular, we examine the critical choice of a loss function in order to handle the notorious imbalance problem that plagues both the input and output of a learning model. The input imbalance refers to the class imbalance in the input training samples (i.e., small foreground objects embedded in an abundance of background voxels, as well as organs of varying sizes). The output imbalance refers to the imbalance between the false positives and false negatives of the inference model. In order to tackle both types of imbalance during training and inference, we introduce a new curriculum-learning-based loss function. Specifically, we leverage the Dice similarity coefficient to deter model parameters from being held at bad local minima while gradually learning better model parameters by penalizing false positives/negatives using a cross-entropy term. We evaluated the proposed loss function on three datasets: whole-body positron emission tomography (PET) scans with 5 target organs, magnetic resonance imaging (MRI) prostate scans, and ultrasound echocardiography images with a single target organ, i.e., the left ventricle. We show that a simple network architecture with the proposed integrative loss function can outperform state-of-the-art methods, and that results of the competing methods can be improved when our proposed loss is used.
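The Combo loss pairs a Dice term (addressing input imbalance) with a β-weighted cross-entropy term that trades false negatives off against false positives (addressing output imbalance). A PyTorch sketch; sign and weighting conventions vary slightly from the published formulation:

```python
import torch

def combo_loss(probs, targets, alpha=0.5, beta=0.6, smooth=1.0, eps=1e-7):
    """Combo loss sketch: alpha balances the weighted cross-entropy term
    against the Dice term; beta > 0.5 penalizes false negatives more than
    false positives in the cross-entropy term."""
    p = probs.clamp(eps, 1 - eps)
    dice = (2 * (p * targets).sum() + smooth) / (p.sum() + targets.sum() + smooth)
    wce = -(beta * targets * p.log()
            + (1 - beta) * (1 - targets) * (1 - p).log()).mean()
    return alpha * wce + (1 - alpha) * (1 - dice)

probs = torch.rand(2, 1, 8, 8, requires_grad=True)     # toy predicted probabilities
targets = (torch.rand(2, 1, 8, 8) > 0.8).float()       # toy binary ground truth
combo_loss(probs, targets).backward()
```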
23. Dey D, Slomka PJ, Leeson P, Comaniciu D, Shrestha S, Sengupta PP, Marwick TH. Artificial Intelligence in Cardiovascular Imaging: JACC State-of-the-Art Review. J Am Coll Cardiol 2019; 73:1317-1335. [PMID: 30898208] [PMCID: PMC6474254] [DOI: 10.1016/j.jacc.2018.12.054]
Abstract
Data science is likely to lead to major changes in cardiovascular imaging. Problems with timing, efficiency, and missed diagnoses occur at all stages of the imaging chain. The application of artificial intelligence (AI) is dependent on robust data; the application of appropriate computational approaches and tools; and validation of its clinical application to image segmentation, automated measurements, and eventually, automated diagnosis. AI may reduce cost and improve value at the stages of image acquisition, interpretation, and decision-making. Moreover, the precision now possible with cardiovascular imaging, combined with "big data" from the electronic health record and pathology, is likely to better characterize disease and personalize therapy. This review summarizes recent promising applications of AI in cardiology and cardiac imaging, which potentially add value to patient care.
24. Ghesu FC, Georgescu B, Zheng Y, Grbic S, Maier A, Hornegger J, Comaniciu D. Multi-Scale Deep Reinforcement Learning for Real-Time 3D-Landmark Detection in CT Scans. IEEE Trans Pattern Anal Mach Intell 2019; 41:176-189. [PMID: 29990011] [DOI: 10.1109/tpami.2017.2782687]
Abstract
Robust and fast detection of anatomical structures is a prerequisite for both diagnostic and interventional medical image analysis. Current solutions for anatomy detection are typically based on machine learning techniques that exploit large annotated image databases in order to learn the appearance of the captured anatomy. These solutions are subject to several limitations, including the use of suboptimal feature engineering techniques and, most importantly, the use of computationally suboptimal search schemes for anatomy detection. To address these issues, we propose a method that follows a new paradigm by reformulating the detection problem as a behavior learning task for an artificial agent. We couple the modeling of the anatomy appearance and the object search in a unified behavioral framework, using the capabilities of deep reinforcement learning and multi-scale image analysis. In other words, an artificial agent is trained not only to distinguish the target anatomical object from the rest of the body but also to find the object by learning and following an optimal navigation path to the target object in the imaged volumetric space. We evaluated our approach on 1487 3D-CT volumes from 532 patients, totaling over 500,000 image slices, and show that it significantly outperforms state-of-the-art solutions in detecting several anatomical structures, with no failed cases from a clinical acceptance perspective, while also achieving a 20-30 percent higher detection accuracy. Most importantly, we improve the detection speed of the reference methods by 2-3 orders of magnitude, achieving unmatched real-time performance on large 3D-CT scans.
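At inference time, the trained agent simply follows greedy Q-values through the volume. A toy PyTorch sketch of that navigation loop (the Q-network, patch size, and single-scale loop are placeholders; training via deep Q-learning and the coarse-to-fine multi-scale schedule are omitted):

```python
import torch
import torch.nn as nn

# six unit moves along the three volume axes
ACTIONS = torch.tensor([[1,0,0],[-1,0,0],[0,1,0],[0,-1,0],[0,0,1],[0,0,-1]])

class QNet(nn.Module):
    """Tiny Q-network over a local intensity patch (a placeholder for the
    deeper networks used in the paper)."""
    def __init__(self, patch=15):
        super().__init__()
        self.fc = nn.Sequential(nn.Flatten(), nn.Linear(patch**3, 128),
                                nn.ReLU(), nn.Linear(128, 6))
    def forward(self, x):
        return self.fc(x)

def navigate(volume, qnet, start, steps=50, patch=15):
    """Greedy inference: from a start voxel, repeatedly move one voxel in the
    direction with the highest predicted Q-value (cubic volume assumed)."""
    pos = torch.tensor(start)
    r = patch // 2
    for _ in range(steps):
        x, y, z = pos.tolist()
        obs = volume[x-r:x+r+1, y-r:y+r+1, z-r:z+r+1].unsqueeze(0)
        a = qnet(obs).argmax(dim=1).item()
        pos = (pos + ACTIONS[a]).clamp(r, volume.size(0) - r - 1)
    return pos

vol = torch.randn(64, 64, 64)                 # stand-in CT volume
print(navigate(vol, QNet(), start=(32, 32, 32)))
```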
25. Itu L, Rapaka S, Passerini T, Georgescu B, Schwemmer C, Schoebinger M, Flohr T, Sharma P, Comaniciu D. Reply to Liu et al. J Appl Physiol (1985) 2018; 125:1353. [PMID: 30354943] [DOI: 10.1152/japplphysiol.00563.2018]