1
Na H, Kim EJ, Muller A, Butts C, Reilly E, Geng T, Romeo M, Ong A. Small Hemothoraces Not Drained on Admission: Initial Volume Predicts Need for Intervention. Am Surg 2024; 90:2232-2237. [PMID: 38780449 DOI: 10.1177/00031348241256087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2024]
Abstract
BACKGROUND Unlike large hemothoraces (HTX), small HTX after blunt trauma may be observed without drainage. We aimed to identify risk factors that predict the need for intervention in initially observed small HTX. METHODS A retrospective review of patients with blunt traumatic HTX from 2016 to 2022 was performed. Patients with small HTX (pleural fluid volume <400 mL on admission chest computerized tomography [CT]) were included. Patients were considered "initially observed" if there was no intervention for the HTX within 48 hours after admission. The primary outcome was any HTX-related intervention (open, thoracoscopic or percutaneous procedures) occurring after 48 hours and up to 6 months after injury. Univariable and multivariable statistical analyses were employed. A P-value of <.05 was considered significant. RESULTS Of 335 patients with HTX, 188 (59.6%) met inclusion criteria. Median (interquartile range) HTX volume was 90 (36-134) mL. One hundred and twenty-seven (68%) were initially observed. Of these, 31 (24%) had the primary outcome. These patients had a larger HTX volume (median, 129 vs 68 mL, P = .0001) and more rib fractures (median, 7 vs 4, P = .0002) than those without the primary outcome. Chest-related readmission occurred in 8 (6%) at a median of 20 days from injury. Of these, 7 required an HTX-related intervention. Logistic regression analysis found that both the number of rib fractures and HTX volume independently predicted the primary outcome. CONCLUSION For small HTX initially observed, number of rib fractures and initial volume predicted delayed HTX-related intervention.
Affiliation(s)
- HeeYun Na: Drexel University College of Medicine, Wyomissing, PA, USA
- Esther J Kim: Department of Surgery, Reading Hospital, West Reading, PA, USA
- Alison Muller: Department of Surgery, Reading Hospital, West Reading, PA, USA
- Eugene Reilly: Department of Surgery, Reading Hospital, West Reading, PA, USA
- Thomas Geng: Department of Surgery, Reading Hospital, West Reading, PA, USA
- Michael Romeo: Department of Radiology, Reading Hospital, West Reading, PA, USA
- Adrian Ong: Department of Surgery, Reading Hospital, West Reading, PA, USA
2
Zhao T, Meng X, Wang Z, Hu Y, Fan H, Han J, Zhu N, Niu F. Diagnostic evaluation of blunt chest trauma by imaging-based application of artificial intelligence: A review. Am J Emerg Med 2024; 85:35-43. [PMID: 39213808 DOI: 10.1016/j.ajem.2024.08.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2024] [Accepted: 08/12/2024] [Indexed: 09/04/2024] Open
Abstract
Artificial intelligence (AI) is becoming increasingly integral in clinical practice, such as during imaging tasks associated with the diagnosis and evaluation of blunt chest trauma (BCT). Due to significant advances in imaging-based deep learning, recent studies have demonstrated the efficacy of AI in the diagnosis of BCT, with a focus on rib fractures, pulmonary contusion, hemopneumothorax and others, demonstrating significant clinical progress. However, the complicated nature of BCT presents challenges in providing a comprehensive diagnosis and prognostic evaluation, and current deep learning research concentrates on specific clinical contexts, limiting its utility in addressing BCT intricacies. Here, we provide a review of the available evidence surrounding the potential utility of AI in BCT, and additionally identify the challenges impeding its development. This review offers insights on how to optimize the role of AI in the diagnostic evaluation of BCT, which can ultimately enhance patient care and outcomes in this critical clinical domain.
Affiliation(s)
- Tingting Zhao: The Department of Radiology, Tianjin University Tianjin Hospital, 406 Jiefang Southern Road, Tianjin, China; Graduate School, Tianjin University, Tianjin, China
- Xianghong Meng: The Department of Radiology, Tianjin University Tianjin Hospital, 406 Jiefang Southern Road, Tianjin, China; Graduate School, Tianjin University, Tianjin, China
- Zhi Wang: The Department of Radiology, Tianjin University Tianjin Hospital, 406 Jiefang Southern Road, Tianjin, China; Graduate School, Tianjin University, Tianjin, China
- Yongcheng Hu: The Department of Radiology, Tianjin University Tianjin Hospital, 406 Jiefang Southern Road, Tianjin, China
- Hongxing Fan: The Department of Radiology, Tianjin University Tianjin Hospital, 406 Jiefang Southern Road, Tianjin, China; Graduate School, Tianjin Medical University, Tianjin, China
- Jun Han: The Department of Radiology, Tianjin University Tianjin Hospital, 406 Jiefang Southern Road, Tianjin, China; Graduate School, Tianjin University, Tianjin, China
- Nana Zhu: The Department of Radiology, Tianjin University Tianjin Hospital, 406 Jiefang Southern Road, Tianjin, China; Graduate School, Tianjin Medical University, Tianjin, China
- Feige Niu: The Department of Radiology, Tianjin University Tianjin Hospital, 406 Jiefang Southern Road, Tianjin, China; Graduate School, Tianjin Medical University, Tianjin, China
3
Cheng CT, Lin HH, Hsu CP, Chen HW, Huang JF, Hsieh CH, Fu CY, Chung IF, Liao CH. Deep Learning for Automated Detection and Localization of Traumatic Abdominal Solid Organ Injuries on CT Scans. Journal of Imaging Informatics in Medicine 2024; 37:1113-1123. [PMID: 38366294 PMCID: PMC11169164 DOI: 10.1007/s10278-024-01038-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2023] [Revised: 01/31/2024] [Accepted: 02/01/2024] [Indexed: 02/18/2024]
Abstract
Computed tomography (CT) is the most commonly used diagnostic modality for blunt abdominal trauma (BAT), significantly influencing management approaches. Deep learning models (DLMs) have shown great promise in enhancing various aspects of clinical practice. There is limited literature available on the use of DLMs specifically for trauma image evaluation. In this study, we developed a DLM aimed at detecting solid organ injuries to assist medical professionals in rapidly identifying life-threatening injuries. The study enrolled patients from a single trauma center who received abdominal CT scans between 2008 and 2017. Patients with spleen, liver, or kidney injury were categorized as the solid organ injury group, while others were considered negative cases. Only images acquired from the trauma center were enrolled. A subset of images acquired in the last year was designated as the test set, and the remaining images were utilized to train and validate the detection models. The performance of each model was assessed using metrics such as the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, positive predictive value, and negative predictive value based on the best Youden index operating point. The study developed the models using 1302 (87%) scans for training and tested them on 194 (13%) scans. The spleen injury model demonstrated an accuracy of 0.938 and a specificity of 0.952. The accuracy and specificity of the liver injury model were reported as 0.820 and 0.847, respectively. The kidney injury model showed an accuracy of 0.959 and a specificity of 0.989. We developed a DLM that can automate the detection of solid organ injuries by abdominal CT scans with acceptable diagnostic accuracy. It cannot replace the role of clinicians, but we can expect it to be a potential tool to accelerate the process of therapeutic decisions for trauma care.
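The abstract above reports metrics "based on the best Youden index operating point". As a minimal sketch of what that selection usually means (not the authors' code), the Python below scans candidate thresholds over model scores, picks the one maximizing J = sensitivity + specificity - 1, and reports the listed metrics at that threshold. The function names, the `score >= threshold` decision rule, and the toy data are all illustrative assumptions.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(scores: Sequence[float], labels: Sequence[int],
                     threshold: float) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), calling a case positive when score >= threshold."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def _ratio(numerator: int, denominator: int) -> float:
    """Safe division; an empty denominator yields 0.0 (an assumed convention)."""
    return numerator / denominator if denominator else 0.0


def summary_metrics(scores: Sequence[float], labels: Sequence[int],
                    threshold: float) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, PPV and NPV at one threshold."""
    tp, fp, tn, fn = confusion_counts(scores, labels, threshold)
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


def best_youden_threshold(scores: Sequence[float],
                          labels: Sequence[int]) -> Tuple[float, float]:
    """Return (threshold, J) maximizing Youden's J; ties keep the lowest threshold."""
    if not scores:
        raise ValueError("scores must not be empty")
    best_threshold, best_j = scores[0], float("-inf")
    for threshold in sorted(set(scores)):
        m = summary_metrics(scores, labels, threshold)
        j = m["sensitivity"] + m["specificity"] - 1.0
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Toy data (hypothetical): four scans with model scores and injury labels.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_threshold, toy_j = best_youden_threshold(toy_scores, toy_labels)
toy_metrics = summary_metrics(toy_scores, toy_labels, toy_threshold)
```

Because the threshold is tuned on the same data it is evaluated on in this sketch, the resulting metrics would be optimistic; a held-out test set, as the study describes, avoids that.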
Affiliation(s)
- Chi-Tung Cheng: Department of Trauma and Emergency Surgery, Chang Gung Memorial Hospital, Linkou, Chang Gung University, Taoyuan, Taiwan
- Hou-Hsien Lin: Department of Trauma and Emergency Surgery, Chang Gung Memorial Hospital, Linkou, Chang Gung University, Taoyuan, Taiwan
- Chih-Po Hsu: Department of Trauma and Emergency Surgery, Chang Gung Memorial Hospital, Linkou, Chang Gung University, Taoyuan, Taiwan
- Huan-Wu Chen: Department of Medical Imaging & Intervention, Chang Gung Memorial Hospital, Linkou, Chang Gung University, Taoyuan, Taiwan
- Jen-Fu Huang: Department of Trauma and Emergency Surgery, Chang Gung Memorial Hospital, Linkou, Chang Gung University, Taoyuan, Taiwan
- Chi-Hsun Hsieh: Department of Trauma and Emergency Surgery, Chang Gung Memorial Hospital, Linkou, Chang Gung University, Taoyuan, Taiwan
- Chih-Yuan Fu: Department of Trauma and Emergency Surgery, Chang Gung Memorial Hospital, Linkou, Chang Gung University, Taoyuan, Taiwan
- I-Fang Chung: Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Chien-Hung Liao: Department of Trauma and Emergency Surgery, Chang Gung Memorial Hospital, Linkou, Chang Gung University, Taoyuan, Taiwan
4
Cheng CT, Ooyang CH, Kang SC, Liao CH. Applications of Deep Learning in Trauma Radiology: A Narrative Review. Biomed J 2024:100743. [PMID: 38679199 DOI: 10.1016/j.bj.2024.100743] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Revised: 03/26/2024] [Accepted: 04/24/2024] [Indexed: 05/01/2024] Open
Abstract
Diagnostic imaging is essential in modern trauma care for initial evaluation and identifying injuries requiring intervention. Deep learning (DL) has become mainstream in medical image analysis and has shown promising efficacy for classification, segmentation, and lesion detection. This narrative review provides the fundamental concepts for developing DL algorithms in trauma imaging and presents an overview of current progress in each modality. DL has been applied to detect free fluid on Focused Assessment with Sonography for Trauma (FAST), traumatic findings on chest and pelvic X-rays, and computed tomography (CT) scans, identify intracranial hemorrhage on head CT, detect vertebral fractures, and identify injuries to organs like the spleen, liver, and lungs on abdominal and chest CT. Future directions involve expanding dataset size and diversity through federated learning, enhancing model explainability and transparency to build clinician trust, and integrating multimodal data to provide more meaningful insights into traumatic injuries. Though some commercial artificial intelligence products are Food and Drug Administration-approved for clinical use in the trauma field, adoption remains limited, highlighting the need for multi-disciplinary teams to engineer practical, real-world solutions. Overall, DL shows immense potential to improve the efficiency and accuracy of trauma imaging, but thoughtful development and validation are critical to ensure these technologies positively impact patient care.
Affiliation(s)
- Chi-Tung Cheng: Department of Trauma and Emergency Surgery, Chang Gung Memorial Hospital, Linkou, Chang Gung University, Taoyuan, Taiwan
- Chun-Hsiang Ooyang: Department of Trauma and Emergency Surgery, Chang Gung Memorial Hospital, Linkou, Chang Gung University, Taoyuan, Taiwan
- Shih-Ching Kang: Department of Trauma and Emergency Surgery, Chang Gung Memorial Hospital, Linkou, Chang Gung University, Taoyuan, Taiwan
- Chien-Hung Liao: Department of Trauma and Emergency Surgery, Chang Gung Memorial Hospital, Linkou, Chang Gung University, Taoyuan, Taiwan
5
Sarkar N, Kumagai M, Meyr S, Pothapragada S, Unberath M, Li G, Ahmed SR, Smith EB, Davis MA, Khatri GD, Agrawal A, Delproposto ZS, Chen H, Caballero CG, Dreizin D. An ASER AI/ML expert panel formative user research study for an interpretable interactive splenic AAST grading graphical user interface prototype. Emerg Radiol 2024; 31:167-178. [PMID: 38302827 PMCID: PMC11257379 DOI: 10.1007/s10140-024-02202-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Accepted: 01/08/2024] [Indexed: 02/03/2024]
Abstract
PURPOSE The AAST Organ Injury Scale is widely adopted for splenic injury severity but suffers from only moderate inter-rater agreement. This work assesses SpleenPro, a prototype interactive explainable artificial intelligence/machine learning (AI/ML) diagnostic aid to support AAST grading, for effects on radiologist dwell time, agreement, clinical utility, and user acceptance. METHODS Two trauma radiology ad hoc expert panelists independently performed timed AAST grading on 76 admission CT studies with blunt splenic injury, first without AI/ML assistance, and after a 2-month washout period and randomization, with AI/ML assistance. To evaluate user acceptance, three versions of the SpleenPro user interface with increasing explainability were presented to four independent expert panelists with four example cases each. A structured interview consisting of Likert scales and free responses was conducted, with specific questions regarding dimensions of diagnostic utility (DU); mental support (MS); effort, workload, and frustration (EWF); trust and reliability (TR); and likelihood of future use (LFU). RESULTS SpleenPro significantly decreased interpretation times for both raters. Weighted Cohen's kappa increased from 0.53 to 0.70 with AI/ML assistance. During user acceptance interviews, increasing explainability was associated with improvement in Likert scores for MS, EWF, TR, and LFU. Expert panelists indicated the need for a combined early notification and grading functionality, PACS integration, and report autopopulation to improve DU. CONCLUSIONS SpleenPro was useful for improving objectivity of AAST grading and increasing mental support. Formative user research identified generalizable concepts including the need for a combined detection and grading pipeline and integration with the clinical workflow.
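The agreement figures above (0.53 rising to 0.70) are weighted Cohen's kappa values. The sketch below shows how such a statistic can be computed for two raters assigning ordinal grades. It is an illustration, not the study's analysis code: the abstract does not state whether linear or quadratic weights were used, so the `weighting` parameter and its default are assumptions, as are the function name and the example grades.

```python
from typing import Hashable, List, Sequence


def weighted_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable],
                   categories: Sequence[Hashable],
                   weighting: str = "linear") -> float:
    """Weighted Cohen's kappa for two raters over ordered categories.

    Disagreement weights grow with the distance between category indices:
    |i - j| / (k - 1) for "linear", its square for "quadratic".
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters need equally many (and at least one) ratings")
    if weighting not in ("linear", "quadratic"):
        raise ValueError("weighting must be 'linear' or 'quadratic'")
    k = len(categories)
    if k < 2:
        raise ValueError("need at least two categories")

    index = {category: i for i, category in enumerate(categories)}
    n = len(rater_a)

    # Observed joint proportions of (grade by A, grade by B).
    observed: List[List[float]] = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    row_marginals = [sum(observed[i]) for i in range(k)]
    col_marginals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i: int, j: int) -> float:
        distance = abs(i - j) / (k - 1)
        return distance if weighting == "linear" else distance * distance

    observed_disagreement = sum(
        weight(i, j) * observed[i][j] for i in range(k) for j in range(k))
    expected_disagreement = sum(
        weight(i, j) * row_marginals[i] * col_marginals[j]
        for i in range(k) for j in range(k))

    if expected_disagreement == 0.0:
        # Both raters used one identical category throughout; kappa is undefined.
        raise ValueError("kappa is undefined when expected disagreement is zero")
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical AAST grades (I-V coded 1-5) from two readers on six studies.
AAST_GRADES = [1, 2, 3, 4, 5]
reader_1 = [1, 2, 3, 4, 5, 3]
reader_2 = [1, 2, 4, 4, 5, 2]
kappa_linear = weighted_kappa(reader_1, reader_2, AAST_GRADES)
```

Weighted kappa suits ordinal scales such as AAST grade because a one-grade disagreement is penalized less than a three-grade disagreement, which unweighted kappa would treat identically.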
Affiliation(s)
- Nathan Sarkar: University of Maryland School of Medicine, 655 W. Baltimore Street, Baltimore, MD, 21201, USA
- Mitsuo Kumagai: University of Maryland College Park, 4603 Calvert Rd, College Park, MD, 20740, USA
- Samantha Meyr: University of Maryland College Park, 4603 Calvert Rd, College Park, MD, 20740, USA
- Sriya Pothapragada: University of Maryland College Park, 4603 Calvert Rd, College Park, MD, 20740, USA
- Mathias Unberath: Johns Hopkins University, 3400 N. Charles Street, Baltimore, MD, 21218, USA
- Guang Li: University of Maryland School of Medicine, 655 W. Baltimore Street, Baltimore, MD, 21201, USA
- Sagheer Rauf Ahmed: University of Maryland School of Medicine, 655 W. Baltimore Street, Baltimore, MD, 21201, USA; R Adams Cowley Shock Trauma Center, 22 S Greene St, Baltimore, MD, 21201, USA
- Elana Beth Smith: University of Maryland School of Medicine, 655 W. Baltimore Street, Baltimore, MD, 21201, USA; R Adams Cowley Shock Trauma Center, 22 S Greene St, Baltimore, MD, 21201, USA
- Anjali Agrawal: Teleradiology Solutions, 22 Lianfair Road Unit 6, Ardmore, PA, 19003, USA
- Haomin Chen: Johns Hopkins University, 3400 N. Charles Street, Baltimore, MD, 21218, USA
- David Dreizin: University of Maryland School of Medicine, 655 W. Baltimore Street, Baltimore, MD, 21201, USA; R Adams Cowley Shock Trauma Center, 22 S Greene St, Baltimore, MD, 21201, USA
6
Tewkesbury G, Beyer C, Eddinger K, McLauchlan N, Tran A, Cannon JW, Knollmann F. CT-based pleural effusion volume estimation formula demonstrates low accuracy and reproducibility for traumatic hemothorax. Injury 2024; 55:111112. [PMID: 37839918 DOI: 10.1016/j.injury.2023.111112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Accepted: 10/08/2023] [Indexed: 10/17/2023]
Abstract
PURPOSE We aimed to evaluate the accuracy and reproducibility of the CT-based volume estimation formula V = d² × h, where d and h represent the maximum depth and height of the effusion, for acute traumatic hemothorax. MATERIALS & METHODS Prospectively identified patients with CT showing acute traumatic hemothorax were considered. Volumes were retrospectively estimated using d² × h, then manually measured on axial images. Subgroup analysis was performed on borderline-sized hemothorax (200-400 mL). Measurements were repeated by three non-radiologists. Bland-Altman analysis was used to assess agreement between the two methods and agreement between raters for each method. RESULTS A total of 46 patients (median age 34; 36 men) with hemothorax volume 23-1622 mL (median 191 mL, IQR 99-324 mL) were evaluated. Limits of agreement between estimates and measured volumes were -718 to +842 mL (± 202 mL). Borderline-sized hemothorax (n = 13) limits of agreement were -300 to +121 mL (± 114 mL). Of all hemothorax, 85% (n = 39/46) were correctly stratified as over or under 300 mL, and of borderline-sized hemothorax, 54% (n = 7/13). Inter-rater limits of agreement were -251 to +350, -694 to +1019, and -696 to +957 for the estimation formula, respectively, and -124 to +190, -97 to +111, and -96 to +46 for the measured volume. DISCUSSION An estimation formula varies with actual hemothorax volume by hundreds of mL. There is low accuracy in stratifying hemothorax volumes close to 300 mL. Variability between raters was substantially higher with the estimation formula than with manual measurements.
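To make the quantities in this abstract concrete, here is a small Python sketch of the depth-squared-times-height estimate, Bland-Altman limits of agreement, and the over/under-threshold check. It is a sketch under stated assumptions rather than the study's method: measurements are assumed to be in centimeters (so the product is in mL), the limits are taken as bias ± 1.96 sample standard deviations, a volume exactly at the threshold is counted as "over", and all function names and sample numbers are hypothetical.

```python
import statistics
from typing import Sequence, Tuple


def estimate_volume_ml(depth_cm: float, height_cm: float) -> float:
    """Estimate effusion volume as depth squared times height.

    Assumes both inputs are in centimeters, so the result is in cm^3 (= mL).
    """
    return depth_cm ** 2 * height_cm


def bland_altman_limits(method_a: Sequence[float],
                        method_b: Sequence[float]) -> Tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) for paired measurements.

    Bias is the mean of (a - b); the limits are bias +/- 1.96 times the
    sample standard deviation of the differences.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(differences)
    spread = 1.96 * statistics.stdev(differences)
    return bias, bias - spread, bias + spread


def same_side_of_threshold(estimated_ml: float, measured_ml: float,
                           threshold_ml: float = 300.0) -> bool:
    """True when estimate and measurement fall on the same side of the threshold.

    A value equal to the threshold counts as "over" (an assumed convention).
    """
    return (estimated_ml >= threshold_ml) == (measured_ml >= threshold_ml)


# Hypothetical paired volumes (mL): formula estimates vs manual measurements.
estimated = [estimate_volume_ml(5.0, 12.0), estimate_volume_ml(3.0, 10.0),
             estimate_volume_ml(6.0, 14.0)]
measured = [260.0, 120.0, 410.0]
bias, lower, upper = bland_altman_limits(estimated, measured)
fraction_correct = sum(
    same_side_of_threshold(e, m) for e, m in zip(estimated, measured)
) / len(estimated)
```

Because depth enters the formula squared, a small error in the depth measurement is amplified in the estimate, which is consistent with the wide limits of agreement the abstract reports.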
Affiliation(s)
- Carl Beyer: Department of Surgery, Penn Medicine, United States
- Anne Tran: Perelman School of Medicine, University of Pennsylvania, United States
7
Zhang L, LaBelle W, Unberath M, Chen H, Hu J, Li G, Dreizin D. A vendor-agnostic, PACS integrated, and DICOM-compatible software-server pipeline for testing segmentation algorithms within the clinical radiology workflow. Front Med (Lausanne) 2023; 10:1241570. [PMID: 37954555 PMCID: PMC10637622 DOI: 10.3389/fmed.2023.1241570] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Accepted: 10/09/2023] [Indexed: 11/14/2023] Open
Abstract
Background Reproducible approaches are needed to bring AI/ML for medical image analysis closer to the bedside. Investigators wishing to shadow test cross-sectional medical imaging segmentation algorithms on new studies in real-time will benefit from simple tools that integrate PACS with on-premises image processing, allowing visualization of DICOM-compatible segmentation results and volumetric data at the radiology workstation. Purpose In this work, we develop and release a simple containerized and easily deployable pipeline for shadow testing of segmentation algorithms within the clinical workflow. Methods Our end-to-end automated pipeline has two major components: (1) a router/listener and anonymizer and an OHIF web viewer backstopped by a DCM4CHEE DICOM query/retrieve archive, deployed in the virtual infrastructure of our secure hospital intranet; and (2) an on-premises single-GPU workstation host for DICOM/NIfTI conversion steps and image processing. DICOM images are visualized in OHIF along with their segmentation masks and associated volumetry measurements (in mL) using DICOM SEG and structured report (SR) elements. Since nnU-net has emerged as a widely-used out-of-the-box method for training segmentation models with state-of-the-art performance, feasibility of our pipeline is demonstrated by recording clock times for a traumatic pelvic hematoma nnU-net model. Results Mean total clock time from PACS send by user to completion of transfer to the DCM4CHEE query/retrieve archive was 5 min 32 s (± SD of 1 min 26 s). This compares favorably to the report turnaround times for whole-body CT exams, which often exceed 30 min, and illustrates feasibility in the clinical setting where quantitative results would be expected prior to report sign-off. Inference times accounted for most of the total clock time, ranging from 2 min 41 s to 8 min 27 s. All other virtual and on-premises host steps combined ranged from a minimum of 34 s to a maximum of 48 s.
Conclusion The software worked seamlessly with an existing PACS and could be used for deployment of DL models within the radiology workflow for prospective testing on newly scanned patients. Once configured, the pipeline is executed through one command using a single shell script. The code is made publicly available through an open-source license at "https://github.com/vastc/," and includes a readme file providing pipeline config instructions for host names, series filter, other parameters, and citation instructions for this work.
Affiliation(s)
- Lei Zhang: School of Medicine, University of Maryland, Baltimore, MD, United States
- Wayne LaBelle: School of Medicine, University of Maryland, Baltimore, MD, United States
- Mathias Unberath: Department of Computer Science, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, United States
- Haomin Chen: Department of Computer Science, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, United States
- Jiazhen Hu: Department of Computer Science, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, United States
- Guang Li: School of Medicine, University of Maryland, Baltimore, MD, United States
- David Dreizin: School of Medicine, University of Maryland, Baltimore, MD, United States
8
Sarkar N, Zhang L, Campbell P, Liang Y, Li G, Khedr M, Khetan U, Dreizin D. Pulmonary contusion: automated deep learning-based quantitative visualization. Emerg Radiol 2023; 30:435-441. [PMID: 37318609 PMCID: PMC10527354 DOI: 10.1007/s10140-023-02149-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/11/2023] [Accepted: 06/07/2023] [Indexed: 06/16/2023]
Abstract
PURPOSE Rapid automated CT volumetry of pulmonary contusion may predict progression to Acute Respiratory Distress Syndrome (ARDS) and help guide early clinical management in at-risk trauma patients. This study aims to train and validate state-of-the-art deep learning models to quantify pulmonary contusion as a percentage of total lung volume (Lung Contusion Index, or auto-LCI) and assess the relationship between auto-LCI and relevant clinical outcomes. METHODS 302 adult patients (age ≥ 18) with pulmonary contusion were retrospectively identified from reports between 2016 and 2021. nnU-Net was trained on manual contusion and whole-lung segmentations. Point-of-care candidate variables for multivariate regression included oxygen saturation, heart rate, and systolic blood pressure on admission. Logistic regression was used to assess ARDS risk, and Cox proportional hazards models were used to determine differences in ICU length of stay and mechanical ventilation time. RESULTS Mean Volume Similarity Index and mean Dice scores were 0.82 and 0.67. Interclass correlation coefficient and Pearson r between ground-truth and predicted volumes were 0.90 and 0.91. 38 (14%) patients developed ARDS. In bivariate analysis, auto-LCI was associated with ARDS (p < 0.001), ICU admission (p < 0.001), and need for mechanical ventilation (p < 0.001). In multivariate analyses, auto-LCI was associated with ARDS (p = 0.04), longer length of stay in the ICU (p = 0.02) and longer time on mechanical ventilation (p = 0.04). AUC of multivariate regression to predict ARDS using auto-LCI and clinical variables was 0.70 while AUC using auto-LCI alone was 0.68. CONCLUSION Increasing auto-LCI values corresponded with increased risk of ARDS, longer ICU admissions, and longer periods of mechanical ventilation.
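The segmentation metrics and the contusion index named above can be illustrated on toy binary masks. The Python below is a sketch, not the study's pipeline: masks are flattened to lists of 0/1 voxels, the volume similarity uses the commonly cited definition 1 - |A - B| / (A + B) (the abstract does not spell out its formula), two empty masks are scored as perfect agreement by convention, and the function names and sample masks are hypothetical.

```python
from typing import Sequence


def dice_score(predicted: Sequence[int], truth: Sequence[int]) -> float:
    """Dice coefficient 2|A and B| / (|A| + |B|) for flattened binary masks."""
    if len(predicted) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(predicted, truth) if p and t)
    total = sum(1 for p in predicted if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly (assumed convention).
    return 2.0 * intersection / total if total else 1.0


def volume_similarity(predicted: Sequence[int], truth: Sequence[int]) -> float:
    """Volume similarity 1 - |A - B| / (A + B), ignoring spatial overlap."""
    volume_pred = sum(1 for p in predicted if p)
    volume_true = sum(1 for t in truth if t)
    total = volume_pred + volume_true
    return 1.0 - abs(volume_pred - volume_true) / total if total else 1.0


def lung_contusion_index(contusion_mask: Sequence[int],
                         lung_mask: Sequence[int]) -> float:
    """Contusion volume as a percentage of total lung volume."""
    lung_voxels = sum(1 for v in lung_mask if v)
    if lung_voxels == 0:
        raise ValueError("lung mask is empty")
    contusion_voxels = sum(1 for v in contusion_mask if v)
    return 100.0 * contusion_voxels / lung_voxels


# Hypothetical five-voxel masks: same volume, partly different location.
predicted_mask = [1, 1, 0, 0, 1]
truth_mask = [1, 0, 0, 1, 1]
toy_dice = dice_score(predicted_mask, truth_mask)
toy_vs = volume_similarity(predicted_mask, truth_mask)
toy_lci = lung_contusion_index([1, 1, 0, 0], [1, 1, 1, 1])
```

The toy masks show why the two scores can diverge, as in the reported 0.82 versus 0.67: volume similarity only compares sizes, so a prediction with the right volume in a partly wrong location still scores well on it while losing Dice overlap.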
Affiliation(s)
- Nathan Sarkar: Department of Diagnostic Radiology and Nuclear Medicine, R Adams Cowley Shock Trauma Center, University of Maryland School of Medicine, 22 S Greene St, Baltimore, MD, 21201, USA
- Lei Zhang: Department of Diagnostic Radiology and Nuclear Medicine, R Adams Cowley Shock Trauma Center, University of Maryland School of Medicine, 22 S Greene St, Baltimore, MD, 21201, USA
- Peter Campbell: Department of Diagnostic Radiology and Nuclear Medicine, R Adams Cowley Shock Trauma Center, University of Maryland School of Medicine, 22 S Greene St, Baltimore, MD, 21201, USA
- Yuanyuan Liang: Department of Epidemiology & Public Health, University of Maryland School of Medicine, Baltimore, MD, USA
- Guang Li: Department of Diagnostic Radiology and Nuclear Medicine, R Adams Cowley Shock Trauma Center, University of Maryland School of Medicine, 22 S Greene St, Baltimore, MD, 21201, USA
- Mustafa Khedr: Department of Diagnostic Radiology and Nuclear Medicine, R Adams Cowley Shock Trauma Center, University of Maryland School of Medicine, 22 S Greene St, Baltimore, MD, 21201, USA
- Udit Khetan: Department of Diagnostic Radiology and Nuclear Medicine, R Adams Cowley Shock Trauma Center, University of Maryland School of Medicine, 22 S Greene St, Baltimore, MD, 21201, USA
- David Dreizin: Department of Diagnostic Radiology and Nuclear Medicine, R Adams Cowley Shock Trauma Center, University of Maryland School of Medicine, 22 S Greene St, Baltimore, MD, 21201, USA
9
Dreizin D, Zhang L, Sarkar N, Bodanapally UK, Li G, Hu J, Chen H, Khedr M, Khetan U, Campbell P, Unberath M. Accelerating voxelwise annotation of cross-sectional imaging through AI collaborative labeling with quality assurance and bias mitigation. Frontiers in Radiology 2023; 3:1202412. [PMID: 37485306 PMCID: PMC10362988 DOI: 10.3389/fradi.2023.1202412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2023] [Accepted: 06/22/2023] [Indexed: 07/25/2023]
Abstract
Background Precision-medicine quantitative tools for cross-sectional imaging require painstaking labeling of targets that vary considerably in volume, prohibiting scaling of data annotation efforts and supervised training to large datasets for robust and generalizable clinical performance. A straightforward time-saving strategy involves manual editing of AI-generated labels, which we call AI-collaborative labeling (AICL). Factors affecting the efficacy and utility of such an approach are unknown. Reduction in time effort is not well documented. Further, edited AI labels may be prone to automation bias. Purpose In this pilot, using a cohort of CTs with intracavitary hemorrhage, we evaluate both time savings and AICL label quality and propose criteria that must be met for using AICL annotations as a high-throughput, high-quality ground truth. Methods 57 CT scans of patients with traumatic intracavitary hemorrhage were included. No participant recruited for this study had previously interpreted the scans. nnU-net models trained on small existing datasets for each feature (hemothorax/hemoperitoneum/pelvic hematoma; n = 77-253) were used in inference. Two common scenarios served as baseline comparisons: de novo expert manual labeling, and expert edits of trained staff labels. Parameters included time effort and image quality graded by a blinded independent expert using a 9-point scale. The observer also attempted to discriminate AICL and expert labels in a random subset (n = 18). Data were compared with ANOVA and post-hoc paired signed rank tests with Bonferroni correction. Results AICL reduced time effort 2.8-fold compared to staff label editing, and 8.7-fold compared to expert labeling (corrected p < 0.0006). Mean Likert grades for AICL (8.4, SD: 0.6) were significantly higher than for expert labels (7.8, SD: 0.9) and edited staff labels (7.7, SD: 0.8) (corrected p < 0.0006). The independent observer failed to correctly discriminate AI and human labels.
Conclusion For our use case and annotators, AICL facilitates rapid large-scale curation of high-quality ground truth. The proposed quality control regime can be employed by other investigators prior to embarking on AICL for segmentation tasks in large datasets.
Affiliation(s)
- David Dreizin: Department of Diagnostic Radiology and Nuclear Medicine, School of Medicine, University of Maryland, Baltimore, MD, United States
- Lei Zhang: Department of Diagnostic Radiology and Nuclear Medicine, School of Medicine, University of Maryland, Baltimore, MD, United States
- Nathan Sarkar: Department of Diagnostic Radiology and Nuclear Medicine, School of Medicine, University of Maryland, Baltimore, MD, United States
- Uttam K. Bodanapally: Department of Diagnostic Radiology and Nuclear Medicine, School of Medicine, University of Maryland, Baltimore, MD, United States
- Guang Li: Department of Diagnostic Radiology and Nuclear Medicine, School of Medicine, University of Maryland, Baltimore, MD, United States
- Jiazhen Hu: Johns Hopkins University, Baltimore, MD, United States
- Haomin Chen: Johns Hopkins University, Baltimore, MD, United States
- Mustafa Khedr: Department of Diagnostic Radiology and Nuclear Medicine, School of Medicine, University of Maryland, Baltimore, MD, United States
- Udit Khetan: Department of Diagnostic Radiology and Nuclear Medicine, School of Medicine, University of Maryland, Baltimore, MD, United States
- Peter Campbell: Department of Diagnostic Radiology and Nuclear Medicine, School of Medicine, University of Maryland, Baltimore, MD, United States
10
Chen H, Unberath M, Dreizin D. Toward automated interpretable AAST grading for blunt splenic injury. Emerg Radiol 2023; 30:41-50. [PMID: 36371579 PMCID: PMC10314366 DOI: 10.1007/s10140-022-02099-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 09/20/2022] [Accepted: 11/04/2022] [Indexed: 11/13/2022]
Abstract
BACKGROUND The American Association for the Surgery of Trauma (AAST) splenic organ injury scale (OIS) is the most frequently used CT-based grading system for blunt splenic trauma. However, reported inter-rater agreement is modest, and an algorithm that objectively automates grading based on transparent and verifiable criteria could serve as a high-trust diagnostic aid. PURPOSE To pilot the development of an automated interpretable multi-stage deep learning-based system to predict AAST grade from admission trauma CT. METHODS Our pipeline includes 4 parts: (1) automated splenic localization, (2) Faster R-CNN-based detection of pseudoaneurysms (PSA) and active bleeds (AB), (3) nnU-Net segmentation and quantification of splenic parenchymal disruption (SPD), and (4) a directed graph that infers AAST grades from detection and segmentation results. Training and validation is performed on a dataset of adult patients (age ≥ 18) with voxelwise labeling, consensus AAST grading, and hemorrhage-related outcome data (n = 174). RESULTS AAST classification agreement (weighted κ) between automated and consensus AAST grades was substantial (0.79). High-grade (IV and V) injuries were predicted with accuracy, positive predictive value, and negative predictive value of 92%, 95%, and 89%. The area under the curve for predicting hemorrhage control intervention was comparable between expert consensus and automated AAST grading (0.83 vs 0.88). The mean combined inference time for the pipeline was 96.9 s. CONCLUSIONS The results of our method were rapid and verifiable, with high agreement between automated and expert consensus grades. Diagnosis of high-grade lesions and prediction of hemorrhage control intervention produced accurate results in adult patients.
Affiliation(s)
- Haomin Chen: Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Mathias Unberath: Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- David Dreizin: Emergency and Trauma Imaging, Department of Diagnostic Radiology and Nuclear Medicine, R Adams Cowley Shock Trauma Center, University of Maryland School of Medicine, Baltimore, MD, USA