1
Reis EP, Blankemeier L, Zambrano Chaves JM, Jensen MEK, Yao S, Truyts CAM, Willis MH, Adams S, Amaro E, Boutin RD, Chaudhari AS. Automated abdominal CT contrast phase detection using an interpretable and open-source artificial intelligence algorithm. Eur Radiol 2024:10.1007/s00330-024-10769-6. [PMID: 38683384 DOI: 10.1007/s00330-024-10769-6] [Received: 10/24/2023] [Revised: 03/11/2024] [Accepted: 03/20/2024] [Indexed: 05/01/2024]
Abstract
OBJECTIVES To develop and validate an open-source artificial intelligence (AI) algorithm to accurately detect contrast phases in abdominal CT scans. MATERIALS AND METHODS This retrospective study developed an AI algorithm trained on 739 abdominal CT exams acquired from 2016 to 2021 in 200 unique patients, covering 1545 axial series. We segmented five key anatomic structures-aorta, portal vein, inferior vena cava, renal parenchyma, and renal pelvis-using TotalSegmentator, a deep learning-based tool for multi-organ segmentation, with a rule-based approach to extract the renal pelvis. Radiomics features were extracted from these structures and fed to a gradient-boosting classifier to identify four contrast phases: non-contrast, arterial, venous, and delayed. Internal and external validation were performed using the F1 score and other classification metrics; external validation used the "VinDr-Multiphase CT" dataset. RESULTS The training dataset consisted of 172 patients (mean age, 70 years ± 8; 22% women), and the internal test set included 28 patients (mean age, 68 years ± 8; 14% women). In internal validation, the classifier achieved an accuracy of 92.3% with an average F1 score of 90.7%. In external validation, the algorithm maintained an accuracy of 90.1% with an average F1 score of 82.6%. Shapley feature attribution analysis indicated that renal and vascular radiodensity values were the most important features for phase classification. CONCLUSION An open-source and interpretable AI algorithm accurately detects contrast phases in abdominal CT scans, with high accuracy and F1 scores in both internal and external validation, confirming its generalization capability. CLINICAL RELEVANCE STATEMENT Contrast phase detection in abdominal CT scans is a critical step for downstream AI applications, for deploying algorithms in the clinical setting, and for quantifying imaging biomarkers, ultimately allowing for better diagnostics and increased access to diagnostic imaging.
KEY POINTS Digital Imaging and Communications in Medicine (DICOM) labels are inaccurate for determining the abdominal CT scan phase. AI can accurately discriminate the contrast phase. Accurate contrast phase determination aids downstream AI applications and biomarker quantification.
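As a rough illustration of the pipeline this abstract describes (segmentation-derived radiomics features feeding a gradient-boosting classifier over four phases), here is a minimal sketch on synthetic data. The structure names, feature values, and enhancement patterns below are illustrative assumptions, not the authors' code or data:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical feature layout: one mean radiodensity (HU) value per
# segmented structure, one row per axial series.
STRUCTURES = ["aorta", "portal_vein", "ivc", "renal_parenchyma", "renal_pelvis"]
PHASES = ["non_contrast", "arterial", "venous", "delayed"]

rng = np.random.default_rng(42)
n_series = 400
y = rng.integers(0, len(PHASES), size=n_series)

# Made-up per-phase HU centers: aortic enhancement peaks in the arterial
# phase, renal-pelvis enhancement in the delayed phase, etc.
centers = np.array([
    [45.0, 40.0, 35.0, 30.0, 10.0],      # non-contrast
    [300.0, 120.0, 60.0, 90.0, 15.0],    # arterial
    [160.0, 180.0, 120.0, 110.0, 30.0],  # venous
    [110.0, 110.0, 90.0, 95.0, 250.0],   # delayed
])
X = centers[y] + rng.normal(scale=20.0, size=(n_series, len(STRUCTURES)))

clf = GradientBoostingClassifier(random_state=0).fit(X, y)
train_acc = clf.score(X, y)
```

With well-separated synthetic centers the classifier fits this toy problem almost perfectly; the real work in the paper lies in the segmentation and radiomics extraction that produce `X`.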
Affiliation(s)
- Eduardo Pontes Reis
- Department of Radiology, Stanford University, Stanford, CA, USA.
- Center for Artificial Intelligence in Medicine & Imaging (AIMI), Stanford University, Stanford, CA, USA.
- Hospital Israelita Albert Einstein, Sao Paulo, Brazil.
- Louis Blankemeier
- Department of Electrical Engineering, Stanford University, Stanford, CA, USA
- Juan Manuel Zambrano Chaves
- Department of Radiology, Stanford University, Stanford, CA, USA
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
- Sally Yao
- Department of Radiology, Stanford University, Stanford, CA, USA
- Marc H Willis
- Department of Radiology, Stanford University, Stanford, CA, USA
- Scott Adams
- Department of Radiology, Stanford University, Stanford, CA, USA
- Edson Amaro
- Hospital Israelita Albert Einstein, Sao Paulo, Brazil
- Robert D Boutin
- Department of Radiology, Stanford University, Stanford, CA, USA
- Akshay S Chaudhari
- Department of Radiology, Stanford University, Stanford, CA, USA
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
2
Li W, Lin HM, Lin A, Napoleone M, Moreland R, Murari A, Stepanov M, Ivanov E, Prasad AS, Shih G, Hu Z, Zulbayar S, Sejdić E, Colak E. Machine Learning Classification of Body Part, Imaging Axis, and Intravenous Contrast Enhancement on CT Imaging. Can Assoc Radiol J 2024; 75:82-91. [PMID: 37439250 DOI: 10.1177/08465371231180844] [Indexed: 07/14/2023]
Abstract
Purpose: To develop and evaluate machine learning models that automatically identify the body part(s) imaged, the axis of imaging, and the presence of intravenous contrast material in a CT series of images. Methods: This retrospective study included 6955 series from 1198 studies (501 females, 697 males; mean age 56.5 years) obtained between January 2010 and September 2021. Each series was annotated by a trained board-certified radiologist with labels covering 16 body parts, 3 imaging axes, and whether an intravenous contrast agent was used. The studies were randomly assigned to the training, validation, and testing sets in proportions of 70%, 20%, and 10%, respectively, to develop a 3D deep neural network for each classification task. External validation was conducted with a total of 35,272 series from 7 publicly available datasets. Classification accuracy was assessed independently for each task to evaluate model performance. Results: The accuracies for identifying the body parts, imaging axes, and the presence of intravenous contrast were 96.0% (95% CI: 94.6%, 97.2%), 99.2% (95% CI: 98.5%, 99.7%), and 97.5% (95% CI: 96.4%, 98.5%), respectively. The generalizability of the models was demonstrated through external validation, with accuracies of 89.7-97.8%, 98.6-100%, and 87.8-98.6% for the same tasks. Conclusions: The developed models demonstrated high performance on both internal and external testing in identifying key aspects of a CT series.
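The 70/20/10 assignment described above is done per study, so all series from one study share a partition and no patient-level leakage occurs between splits. A minimal sketch of such a split (assuming integer study IDs; this is not the authors' code):

```python
import numpy as np

def split_studies(study_ids, fracs=(0.7, 0.2, 0.1), seed=0):
    """Assign whole studies to train/val/test partitions.

    Splitting at the study level keeps every series from a given study in
    the same partition, which is what prevents leakage when one study
    contributes many series.
    """
    ids = np.array(sorted(set(study_ids)))
    rng = np.random.default_rng(seed)
    rng.shuffle(ids)
    n = len(ids)
    n_train, n_val = int(fracs[0] * n), int(fracs[1] * n)
    return (set(ids[:n_train]),
            set(ids[n_train:n_train + n_val]),
            set(ids[n_train + n_val:]))

# 1198 studies, as in the paper's internal dataset.
train, val, test = split_studies(range(1198))
```

Series are then routed to partitions by looking up their study ID in these sets.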
Affiliation(s)
- Wuqi Li
- The Edward S. Rogers Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, Canada
- Hui Ming Lin
- Department of Medical Imaging, Unity Health Toronto, Toronto, ON, Canada
- Amy Lin
- Department of Medical Imaging, Unity Health Toronto, Toronto, ON, Canada
- Department of Medical Imaging, Faculty of Medicine, University of Toronto, Toronto, ON, Canada
- Marc Napoleone
- Department of Medical Imaging, Unity Health Toronto, Toronto, ON, Canada
- Department of Medical Imaging, Faculty of Medicine, University of Toronto, Toronto, ON, Canada
- Robert Moreland
- Department of Medical Imaging, Unity Health Toronto, Toronto, ON, Canada
- Department of Medical Imaging, Faculty of Medicine, University of Toronto, Toronto, ON, Canada
- Alexis Murari
- The Edward S. Rogers Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, Canada
- Maxim Stepanov
- The Edward S. Rogers Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, Canada
- Eric Ivanov
- The Edward S. Rogers Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, Canada
- Abhinav Sanjeeva Prasad
- The Edward S. Rogers Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, Canada
- George Shih
- Department of Radiology, Weill Cornell Medicine, New York, NY, USA
- Zixuan Hu
- The Edward S. Rogers Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, Canada
- Suvd Zulbayar
- Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
- Ervin Sejdić
- The Edward S. Rogers Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, Canada
- North York General Hospital, Toronto, ON, Canada
- Errol Colak
- Department of Medical Imaging, Unity Health Toronto, Toronto, ON, Canada
- Department of Medical Imaging, Faculty of Medicine, University of Toronto, Toronto, ON, Canada
- Li Ka Shing Knowledge Institute, St Michael's Hospital, Unity Health Toronto, Toronto, ON, Canada
3
Na S, Ko Y, Ham SJ, Sung YS, Kim MH, Shin Y, Jung SC, Ju C, Kim BS, Yoon K, Kim KW. Sequence-Type Classification of Brain MRI for Acute Stroke Using a Self-Supervised Machine Learning Algorithm. Diagnostics (Basel) 2023; 14:70. [PMID: 38201379 PMCID: PMC10804387 DOI: 10.3390/diagnostics14010070] [Received: 11/21/2023] [Revised: 12/18/2023] [Accepted: 12/22/2023] [Indexed: 01/12/2024]
Abstract
We propose a self-supervised machine learning (ML) algorithm for sequence-type classification of brain MRI that uses a supervisory signal from DICOM metadata (i.e., a rule-based virtual label). A total of 1787 brain MRI datasets were constructed, including 1531 from hospitals and 256 from multi-center trial datasets. The ground truth (GT) was generated by two experienced image analysts and checked by a radiologist. An ML framework called ImageSort-net was developed using various features related to MRI acquisition parameters and trained with virtual labels derived from a rule-based labeling system, which act as labels for supervised learning. To evaluate the performance of ImageSort-net (MLvirtual), we compared it with models trained on human expert labels (MLhuman), using as a test set the blank cases that the rule-based labeling system failed to label in each dataset. The performance of MLvirtual was comparable to that of MLhuman (98.5% and 99%, respectively) in terms of overall accuracy when trained with hospital datasets. When trained with the relatively small multi-center trial dataset, its overall accuracy was lower than that of MLhuman (95.6% and 99.4%, respectively). After the two datasets were integrated and the model re-trained, MLvirtual showed higher accuracy (99.7%) than MLvirtual trained only on the multi-center dataset (95.6%). Additionally, the multi-center dataset inference performances after re-training of MLvirtual and MLhuman were identical (99.7%). Training ML algorithms on rule-based virtual labels thus achieved high accuracy for sequence-type classification of brain MRI and enabled us to build a sustainable self-learning system.
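A rule-based virtual labeler of the kind described (DICOM metadata in, sequence label or blank out) might look like the following sketch. The field names, thresholds, and sequence classes are illustrative assumptions, not ImageSort-net's actual rules:

```python
def virtual_label(meta):
    """Hypothetical rule-based virtual labeler over DICOM-style metadata.

    Returns a sequence-type string, or None when no rule fires; in the
    paper's framework, such blank cases are exactly what the trained ML
    model must handle.
    """
    desc = str(meta.get("SeriesDescription", "")).lower()
    te, tr = meta.get("EchoTime"), meta.get("RepetitionTime")
    if "flair" in desc:
        return "FLAIR"
    if "dwi" in desc or "diffusion" in desc:
        return "DWI"
    # Illustrative TE/TR thresholds for conventional spin-echo weighting.
    if te is not None and tr is not None:
        if te < 30 and tr < 800:
            return "T1"
        if te > 80 and tr > 2000:
            return "T2"
    return None  # rule failure -> blank label

labels = [virtual_label(m) for m in (
    {"SeriesDescription": "Ax FLAIR", "EchoTime": 120, "RepetitionTime": 9000},
    {"SeriesDescription": "ep2d diffusion"},
    {"SeriesDescription": "T1 MPRAGE", "EchoTime": 3.1, "RepetitionTime": 600},
    {"SeriesDescription": "unknown"},
)]
```

The virtual labels then stand in for human annotations when training the supervised classifier, which is what makes the overall system self-supervised.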
Affiliation(s)
- Seongwon Na
- Department of Computer Science and Engineering, Konkuk University, Seoul 05029, Republic of Korea
- Biomedical Research Center, Asan Institute for Life Sciences, Asan Medical Center, Seoul 05505, Republic of Korea
- Yousun Ko
- Department of Radiology and Research Institute of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul 05505, Republic of Korea
- Su Jung Ham
- Department of Radiology and Research Institute of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul 05505, Republic of Korea
- Yu Sub Sung
- Clinical Research Center, Asan Medical Center, Seoul 05505, Republic of Korea
- Department of Convergence Medicine, University of Ulsan College of Medicine, Seoul 05505, Republic of Korea
- Mi-Hyun Kim
- Trialinformatics Inc., Seoul 05505, Republic of Korea
- Department of Radiation Science & Technology, Jeonbuk National University, Jeonju 56212, Republic of Korea
- Youngbin Shin
- Biomedical Research Center, Asan Institute for Life Sciences, Asan Medical Center, Seoul 05505, Republic of Korea
- Seung Chai Jung
- Department of Radiology and Research Institute of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul 05505, Republic of Korea
- Chung Ju
- Shin Poong Pharm. Co., Ltd., Seoul 06246, Republic of Korea
- Graduate School of Clinical Pharmacy, CHA University, Pocheon-si 11160, Republic of Korea
- Byung Su Kim
- Shin Poong Pharm. Co., Ltd., Seoul 06246, Republic of Korea
- Kyoungro Yoon
- Department of Computer Science and Engineering, Konkuk University, Seoul 05029, Republic of Korea
- Department of Smart ICT Convergence Engineering, Konkuk University, Seoul 05029, Republic of Korea
- Kyung Won Kim
- Department of Radiology and Research Institute of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul 05505, Republic of Korea
4
Yoon JS, Yon CJ, Lee D, Lee JJ, Kang CH, Kang SB, Lee NK, Chang CB. Assessment of a novel deep learning-based software developed for automatic feature extraction and grading of radiographic knee osteoarthritis. BMC Musculoskelet Disord 2023; 24:869. [PMID: 37940935 PMCID: PMC10631128 DOI: 10.1186/s12891-023-06951-4] [Received: 03/22/2023] [Accepted: 10/10/2023] [Indexed: 11/10/2023]
Abstract
BACKGROUND The Kellgren-Lawrence (KL) grading system is the most widely used method to classify the severity of osteoarthritis (OA) of the knee. However, due to the ambiguity of its terminology, the KL system shows inferior inter- and intra-observer reliability. For a more reliable evaluation, we recently developed novel deep learning (DL) software, MediAI-OA, to extract each radiographic feature of knee OA and to grade OA severity based on the KL system. METHODS This research used data from the Osteoarthritis Initiative for training and validation of MediAI-OA: 44,193 radiographs were used for training and 810 for validation. The AI model was developed to automatically quantify the degree of joint space narrowing (JSN) of the medial and lateral tibiofemoral joints, to detect osteophytes in four regions of the knee joint (medial distal femur, lateral distal femur, medial proximal tibia, and lateral proximal tibia), to classify the KL grade, and to present the results of these three OA features together. The model was tested on 400 test datasets and the results were compared to the ground truth. The accuracy of the JSN quantification and osteophyte detection was evaluated. KL grade classification performance was evaluated by precision, recall, F1 score, accuracy, and Cohen's kappa coefficient. In addition, we defined KL grade 2 or higher as clinically significant OA, and the accuracy of OA diagnosis was obtained. RESULTS The mean squared error of JSN rate quantification was 0.067 and the average osteophyte detection accuracy of MediAI-OA was 0.84. The accuracy of KL grading was 0.83, and the kappa coefficient between the AI model and the ground truth was 0.768, which demonstrates substantial consistency. The OA diagnosis accuracy of the software was 0.92.
CONCLUSIONS The novel DL software MediAI-OA demonstrated satisfactory performance, comparable to that of experienced orthopedic surgeons and radiologists, for analyzing features of knee OA, KL grading, and OA diagnosis. Therefore, reliable KL grading can be performed, and the burden on radiologists can be reduced, by using MediAI-OA.
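The evaluation described above (Cohen's kappa on the five KL grades, plus accuracy after binarizing KL ≥ 2 as clinically significant OA) can be sketched with made-up grades; the arrays below are illustrative, not the study's data:

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score, accuracy_score

# Hypothetical per-knee KL grades (0-4): ground truth vs. model output.
gt   = np.array([0, 1, 2, 3, 4, 2, 1, 0, 3, 2])
pred = np.array([0, 1, 2, 3, 4, 1, 1, 0, 3, 3])

# Chance-corrected agreement on the full 5-grade scale.
kappa = cohen_kappa_score(gt, pred)

# Binarize KL >= 2 as "clinically significant OA" and score diagnosis accuracy.
oa_acc = accuracy_score(gt >= 2, pred >= 2)
```

Note how binarization forgives near-miss grading errors: `pred` disagrees with `gt` on two knees at the 5-grade level, but only one of those crosses the KL ≥ 2 threshold, so diagnostic accuracy exceeds grading accuracy, the same pattern as the reported 0.92 vs 0.83.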
Affiliation(s)
- Ji Soo Yoon
- Department of Orthopaedic Surgery, Seoul National University Bundang Hospital, Seongnam-Si, Republic of Korea
- Chang-Jin Yon
- Department of Orthopaedic Surgery, Keimyung University Dongsan Hospital, Daegu, Republic of Korea
- Chang Ho Kang
- Department of Radiology, Korea University Anam Hospital, Seoul, Republic of Korea
- Seung-Baik Kang
- Department of Orthopaedic Surgery, SMG-SNU Boramae Medical Center, Seoul, Republic of Korea
- Department of Orthopaedic Surgery, Seoul National University College of Medicine, Seoul, Republic of Korea
- Na-Kyoung Lee
- Department of Orthopaedic Surgery, Seoul National University Bundang Hospital, Seongnam-Si, Republic of Korea
- Chong Bum Chang
- Department of Orthopaedic Surgery, Seoul National University Bundang Hospital, Seongnam-Si, Republic of Korea
- Department of Orthopaedic Surgery, Seoul National University College of Medicine, Seoul, Republic of Korea
5
Fraiwan M, Al-Kofahi N, Ibnian A, Hanatleh O. Detection of developmental dysplasia of the hip in X-ray images using deep transfer learning. BMC Med Inform Decis Mak 2022; 22:216. [PMID: 35964072 PMCID: PMC9375244 DOI: 10.1186/s12911-022-01957-9] [Received: 05/22/2022] [Accepted: 07/30/2022] [Indexed: 01/14/2023]
Abstract
Background Developmental dysplasia of the hip (DDH) is a relatively common disorder in newborns, with a reported prevalence of 1–5 per 1000 births. It can lead to developmental abnormalities in terms of mechanical difficulties and a displacement of the joint (i.e., subluxation or dysplasia). An early diagnosis in the first few months from birth can drastically improve healing, render surgical intervention unnecessary and reduce bracing time. A pelvic X-ray inspection represents the gold standard for DDH diagnosis. Recent advances in deep learning artificial intelligence have enabled the use of many image-based medical decision-making applications. The present study employs deep transfer learning in detecting DDH in pelvic X-ray images without the need for explicit measurements. Methods Pelvic anteroposterior X-ray images from 354 subjects (120 DDH and 234 normal) were collected locally at two hospitals in northern Jordan. A system that accepts these images as input and classifies them as DDH or normal was developed using thirteen deep transfer learning models. Various performance metrics were evaluated in addition to the overfitting/underfitting behavior and the training times. Results The highest mean DDH detection accuracy was 96.3% achieved using the DarkNet53 model, although other models achieved comparable results. A common theme across all the models was the extremely high sensitivity (i.e., recall) value at the expense of specificity. The F1 score, precision, recall and specificity for DarkNet53 were 95%, 90.6%, 100% and 94.3%, respectively. Conclusions Our automated method appears to be a highly accurate DDH screening and diagnosis method. Moreover, the performance evaluation shows that it is possible to further improve the system by expanding the dataset to include more X-ray images.
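The reported trade-off (perfect recall at the expense of specificity) follows directly from the confusion-matrix definitions. A small helper, with made-up counts chosen to mirror 100% recall alongside imperfect specificity, illustrates it; these are not the study's confusion-matrix values:

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)               # sensitivity: no false negatives -> 1.0
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)          # false positives pull this below 1.0
    f1 = 2 * precision * recall / (precision + recall)
    return {"recall": recall, "precision": precision,
            "specificity": specificity, "f1": f1}

# Hypothetical test fold: 24 DDH and 47 normal images, every DDH case
# caught (fn=0) but 3 normals flagged as DDH (fp=3).
m = binary_metrics(tp=24, fp=3, tn=44, fn=0)
```

Because recall only depends on false negatives while specificity only depends on false positives, a screening model can trade a few false alarms for the guarantee that no DDH case is missed, which is usually the preferred operating point for a screening tool.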
Affiliation(s)
- Mohammad Fraiwan
- Department of Computer Engineering, Jordan University of Science and Technology, Irbid, Jordan.
- Noran Al-Kofahi
- Department of Internal Medicine, Jordan University of Science and Technology, Irbid, Jordan
- Ali Ibnian
- Department of Internal Medicine, Jordan University of Science and Technology, Irbid, Jordan
- Omar Hanatleh
- Department of Internal Medicine, Jordan University of Science and Technology, Irbid, Jordan