1
Kabir S, Pippi Salle JL, Chowdhury MEH, Abbas TO. Quantification of vesicoureteral reflux using machine learning. J Pediatr Urol 2024; 20:257-264. [PMID: 37980211] [DOI: 10.1016/j.jpurol.2023.10.030]
Abstract
INTRODUCTION The radiographic grading of voiding cystourethrogram (VCUG) images is often used to determine the clinical course and appropriate treatment in patients with vesicoureteral reflux (VUR). However, image-based evaluation of VUR remains highly subjective, so we developed a supervised machine learning model to grade VCUG data automatically and objectively. STUDY DESIGN A total of 113 VCUG images were gathered from public sources to compile the dataset for this study. For each image, VUR severity was graded by four pediatric radiologists and three pediatric urologists (low severity scored 1-3; high severity 4-5). Ground truth for each image was assigned based on the grade diagnosed by a majority of the expert assessors. Nine features were extracted from each VCUG image, and six machine learning models were then trained, validated, and tested using 'leave-one-out' cross-validation. All features were compared and contrasted, and the highest-ranked features were then used to train the final models. RESULTS The F1-score is a metric often used to indicate the performance accuracy of machine learning models. When using the highest-ranked VCUG image features, F1-scores for the support vector machine (SVM) and multi-layer perceptron (MLP) classifiers were 90.27% and 91.14%, respectively, indicating a high level of accuracy. When using all features combined, F1-scores were 89.37% for SVM and 90.27% for MLP. DISCUSSION These findings indicate that a distorted pattern of renal calyces is an accurate predictor of high-grade VUR. Machine learning protocols can be enhanced in the future to further improve objective grading of VUR.
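The evaluation protocol described above (leave-one-out cross-validation scored with the F1 metric) can be sketched in pure Python. The 1-nearest-neighbor classifier and the one-dimensional feature below are toy stand-ins, not the paper's SVM/MLP models or its nine image features, and the data are invented for illustration.

```python
# Toy sketch of leave-one-out cross-validation scored with F1.
# The 1-NN classifier and 1-D features are illustrative assumptions only.

def nearest_neighbor_predict(train_x, train_y, x):
    """Predict the label of x as the label of its closest training point."""
    best = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[best]

def leave_one_out_predictions(xs, ys):
    """Hold each sample out once, train on the rest, and predict the held-out sample."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        preds.append(nearest_neighbor_predict(train_x, train_y, xs[i]))
    return preds

def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy data: feature values cluster by class (0 = low-grade, 1 = high-grade VUR).
xs = [0.1, 0.2, 0.3, 0.9, 1.0, 1.1]
ys = [0, 0, 0, 1, 1, 1]
preds = leave_one_out_predictions(xs, ys)
print(f1_score(ys, preds))  # → 1.0 on this separable toy data
```

With only 113 images, leave-one-out makes maximal use of the data: every image serves as a test case exactly once while the model trains on the remaining 112.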
Affiliation(s)
- Saidul Kabir
- Department of Electrical and Electronic Engineering, University of Dhaka, Dhaka, 1000, Bangladesh
- Tariq O Abbas
- Urology Division, Surgery Department, Sidra Medicine, Qatar
2
Li Z, Tan Z, Wang Z, Tang W, Ren X, Fu J, Wang G, Chu H, Chen J, Duan Y, Zhuang L, Wu M. Development and multi-institutional validation of a deep learning model for grading of vesicoureteral reflux on voiding cystourethrogram: a retrospective multicenter study. EClinicalMedicine 2024; 69:102466. [PMID: 38361995] [PMCID: PMC10867607] [DOI: 10.1016/j.eclinm.2024.102466]
Abstract
Background Voiding cystourethrography (VCUG) is the gold standard for the diagnosis and grading of vesicoureteral reflux (VUR). However, VUR grading from voiding cystourethrograms is highly subjective, with low reliability. This study aimed to develop a deep learning model to improve the reliability of VUR grading on VCUG and to compare its performance with that of clinicians. Methods In this retrospective study in China, VCUG images collected at our institution between January 2019 and September 2022 served as an internal dataset for training, and four external datasets served as external testing sets for validation. Samples were divided into a training set (N = 1000), a validation set (N = 500), an internal testing set (N = 168), and an external testing set (N = 280). An ensemble learning-based model, Deep-VCUG, combining ResNet-101 backbones with a voting method, was developed to predict VUR grade. Grading performance was assessed using heatmaps, the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, accuracy, and F1 score in the internal and external testing sets. The performance of four clinicians (two pediatric urologists and two radiologists) in predicting VUR grade, with and without Deep-VCUG assistance, was explored in the external testing sets. Findings A total of 1948 VCUG images were collected (internal dataset = 1668; multicenter external dataset = 280). For unilateral VUR grading, Deep-VCUG achieved AUCs of 0.962 (95% confidence interval [CI]: 0.943-0.978) and 0.944 (95% CI: 0.921-0.964) in the internal and external testing sets, respectively. For bilateral VUR grading, Deep-VCUG also achieved high AUCs of 0.960 (95% CI: 0.922-0.983) and 0.924 (95% CI: 0.887-0.957). The voting-based Deep-VCUG model outperformed both single models and clinicians in classification from VCUG images. Moreover, with Deep-VCUG assistance, the classification ability of both junior and senior clinicians improved significantly.
Interpretation The Deep-VCUG model is a generalizable, objective, and accurate tool for vesicoureteral reflux grading based on VCUG imaging and effectively assisted clinicians in VUR grading. Funding This study was supported by the Natural Science Foundation of China, the "Fuqing Scholar" Student Scientific Research Program of Shanghai Medical College, Fudan University, and the Program of the Greater Bay Area Institute of Precision Medicine (Guangzhou).
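The ensemble voting step this abstract describes can be illustrated with a minimal soft-voting sketch: average the per-class probability outputs of several backbone models and take the argmax as the ensemble grade. The arrays below are random stand-ins for ResNet-101 softmax outputs, not the authors' code or data.

```python
import numpy as np

def soft_vote(prob_list):
    """Average class-probability matrices from several models, then take
    the argmax over classes as the ensemble's predicted grade."""
    avg = np.mean(np.stack(prob_list, axis=0), axis=0)  # (n_samples, n_classes)
    return avg.argmax(axis=1)

# Three hypothetical "models", two samples, five VUR grades (classes 0-4).
m1 = np.array([[0.6, 0.1, 0.1, 0.1, 0.1],
               [0.1, 0.1, 0.1, 0.1, 0.6]])
m2 = np.array([[0.5, 0.2, 0.1, 0.1, 0.1],
               [0.2, 0.1, 0.1, 0.1, 0.5]])
m3 = np.array([[0.2, 0.5, 0.1, 0.1, 0.1],
               [0.1, 0.1, 0.1, 0.2, 0.5]])
print(soft_vote([m1, m2, m3]))  # one predicted grade per sample
```

Averaging probabilities (soft voting) rather than counting discrete predictions (hard voting) lets a confident minority model outvote an uncertain majority, which is one plausible reason a voting ensemble can outperform any single backbone.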
Affiliation(s)
- Zhanchi Li
- Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Zelong Tan
- Department of Electronic Engineering, Tsinghua University, Beijing, China
- Zheyuan Wang
- Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
- Wenjuan Tang
- Department of Radiology, Shanghai Children's Hospital, School of Medicine, Shanghai Jiao Tong University, 355 Luding Road, Shanghai, 200062, China
- Xiang Ren
- Department of Radiology, Shanghai Children's Hospital, School of Medicine, Shanghai Jiao Tong University, 355 Luding Road, Shanghai, 200062, China
- Jinhua Fu
- Department of Radiology, Shanghai Children's Hospital, School of Medicine, Shanghai Jiao Tong University, 355 Luding Road, Shanghai, 200062, China
- Guangbing Wang
- Department of Urology, Puyang People's Hospital, Henan, China
- Han Chu
- Department of Urology, Anhui Provincial Children's Hospital, Anhui, China
- Jiarong Chen
- Department of Urology, The Children's Hospital of Guangxi Zhuang Autonomous Region, China
- Yuhe Duan
- Department of Urology, The Affiliated Hospital of Qingdao University, China
- Likai Zhuang
- Department of Urology, Children's Hospital of Fudan University, National Pediatric Medical Center of China, Shanghai, 201102, China
- Min Wu
- Department of Urology, Shanghai Children's Hospital, School of Medicine, Shanghai Jiao Tong University, 355 Luding Road, Shanghai, 200062, China
3
Abstract
Application of artificial intelligence (AI) is one of the hottest topics in medicine. Unlike traditional methods that rely heavily on statistical assumptions, machine learning algorithms can identify highly complex patterns in data, allowing robust predictions. Pediatric urology publications using AI methodology have increased exponentially in recent years. While these studies show great promise for better understanding of disease and patient care, we should be realistic about the challenges arising from the nature of pediatric urologic conditions and practice in order to continue to produce high-impact research.
Affiliation(s)
- Hsin-Hsiao Scott Wang
- Computational Healthcare Analytics Program, Department of Urology, Boston Children's Hospital, 300 Longwood Avenue, Boston, MA, USA
- Ranveer Vasdev
- Department of Urology, Mayo Clinic Rochester, 200 1st Street Southwest, Rochester, MN 55905, USA
- Caleb P Nelson
- Clinical and Health Services Research, Department of Urology, Boston Children's Hospital, 300 Longwood Avenue, Boston, MA, USA
4
Harris C, Finn KR, Kieseler ML, Maechler MR, Tse PU. DeepAction: a MATLAB toolbox for automated classification of animal behavior in video. Sci Rep 2023; 13:2688. [PMID: 36792716] [PMCID: PMC9932075] [DOI: 10.1038/s41598-023-29574-0]
Abstract
The identification of animal behavior in video is a critical but time-consuming task in many areas of research. Here, we introduce DeepAction, a deep learning-based toolbox for automatically annotating animal behavior in video. Our approach uses features extracted from raw video frames by a pretrained convolutional neural network to train a recurrent neural network classifier. We evaluate the classifier on two benchmark rodent datasets and one octopus dataset. We show that it achieves high accuracy, requires little training data, and surpasses both human agreement and most comparable existing methods. We also create a confidence score for classifier output and show that it provides an accurate estimate of classifier performance and reduces the time human annotators require to review and correct automatically produced annotations. We release our system and accompanying annotation interface as an open-source MATLAB toolbox.
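One common way to build the kind of clip-level confidence score this abstract describes is to average, over frames, the classifier's maximum class probability; low scores then flag clips that a human annotator should review. This is an illustrative sketch of that general idea (written in Python rather than the toolbox's MATLAB), not DeepAction's actual implementation.

```python
import numpy as np

def clip_confidence(frame_probs):
    """frame_probs: (n_frames, n_classes) per-frame class probabilities.
    Returns the mean of each frame's maximum class probability."""
    return float(np.max(frame_probs, axis=1).mean())

# A clip the classifier is sure about vs. one where probabilities are flat.
certain = np.array([[0.90, 0.05, 0.05],
                    [0.80, 0.10, 0.10]])
uncertain = np.array([[0.40, 0.30, 0.30],
                      [0.34, 0.33, 0.33]])
print(clip_confidence(certain) > clip_confidence(uncertain))  # → True
```

Ranking clips by such a score lets annotators spend their review time only on the low-confidence tail, which is how automated annotation can reduce total human effort.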
Affiliation(s)
- Carl Harris
- grid.254880.30000 0001 2179 2404Department of Psychological and Brain Science, Dartmouth College, Hanover, NH 03755 USA
| | - Kelly R. Finn
- grid.254880.30000 0001 2179 2404Department of Psychological and Brain Science, Dartmouth College, Hanover, NH 03755 USA ,grid.254880.30000 0001 2179 2404Neukom Institute, Dartmouth College, Hanover, NH 03755 USA
| | - Marie-Luise Kieseler
- grid.254880.30000 0001 2179 2404Department of Psychological and Brain Science, Dartmouth College, Hanover, NH 03755 USA
| | - Marvin R. Maechler
- grid.254880.30000 0001 2179 2404Department of Psychological and Brain Science, Dartmouth College, Hanover, NH 03755 USA
| | - Peter U. Tse
- grid.254880.30000 0001 2179 2404Department of Psychological and Brain Science, Dartmouth College, Hanover, NH 03755 USA
| |
5
Lee W, Lee S, Lee D, Jun K, Ahn DH, Kim MS. Deep learning-based ADHD and ADHD-RISK classification technology through the recognition of children's abnormal behaviors during the robot-led ADHD screening game. Sensors (Basel) 2022; 23:278. [PMID: 36616875] [PMCID: PMC9824867] [DOI: 10.3390/s23010278]
Abstract
Although attention deficit hyperactivity disorder (ADHD) in children is rising worldwide, fewer studies have focused on screening than on the treatment of ADHD. Most previous ADHD classification studies distinguished only the ADHD and normal classes. However, medical professionals believe that better distinguishing an ADHD-RISK class would assist them socially and medically. We created a projection-based game in which stimuli and responses can be observed to better understand children's abnormal behavior. The developed screening game is divided into 11 stages: children play five games, each divided into a waiting stage and a game stage, yielding 10 stages, and an additional explanation stage, in which the robot waits while explaining the first game, completes the set. Herein, we classified the normal, ADHD-RISK, and ADHD classes using skeleton data obtained through these ADHD screening games and a bidirectional long short-term memory (LSTM)-based deep learning model. We verified the importance of each stage by passing each stage's features through a channel attention layer. The final classification accuracy for the three classes was 98.15% using the bidirectional LSTM with channel attention model. Additionally, the attention scores obtained from the channel attention layer indicated that data from the latter part of the game are heavily involved in learning the ADHD-RISK case. These results imply that for ADHD-RISK, as the games repeat, children's attention decreases in the second half.
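The channel-attention idea described above can be sketched as follows: assign a learned score to each game stage, convert the scores to softmax weights, and pool the per-stage features by those weights before classification; the weights then indicate which stages the model relies on. This is a toy NumPy sketch under invented scores and features, not the authors' BiLSTM model.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(stage_features, stage_scores):
    """stage_features: (n_stages, dim) per-stage feature vectors.
    stage_scores: (n_stages,) learned importance scores.
    Returns the attention weights and the weighted-sum pooled feature."""
    weights = softmax(stage_scores)
    pooled = (weights[:, None] * stage_features).sum(axis=0)
    return weights, pooled

features = np.ones((11, 4))                 # 11 stages, 4-dim features each
scores = np.array([0.0] * 9 + [2.0, 2.0])   # hypothetical: late stages score higher
weights, pooled = attend(features, scores)
print(weights.argmax())                     # index of the most-attended stage
```

Because the weights sum to one, inspecting them after training gives exactly the kind of per-stage importance reading the abstract reports, with the later stages dominating in the ADHD-RISK case.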
Affiliation(s)
- Wonjun Lee
- School of Integrated Technology, Gwangju Institute of Science and Technology, Gwangju 61005, Republic of Korea
- Sanghyub Lee
- School of Integrated Technology, Gwangju Institute of Science and Technology, Gwangju 61005, Republic of Korea
- Deokwon Lee
- School of Integrated Technology, Gwangju Institute of Science and Technology, Gwangju 61005, Republic of Korea
- Kooksung Jun
- School of Integrated Technology, Gwangju Institute of Science and Technology, Gwangju 61005, Republic of Korea
- Dong Hyun Ahn
- Department of Psychiatry, Hanyang University Hospital, Seoul 04763, Republic of Korea
- Mun Sang Kim
- School of Integrated Technology, Gwangju Institute of Science and Technology, Gwangju 61005, Republic of Korea
6
Son H, Lee S, Kim K, Koo KI, Hwang CH. Deep learning-based quantitative estimation of lymphedema-induced fibrosis using three-dimensional computed tomography images. Sci Rep 2022; 12:15371. [PMID: 36100619] [PMCID: PMC9470678] [DOI: 10.1038/s41598-022-19204-6]
Abstract
In lymphedema, proinflammatory cytokine-mediated progressive cascades invariably occur, leading to macroscopic fibrosis. However, no methods are practically available for measuring lymphedema-induced fibrosis before it deteriorates. Technically, CT can visualize fibrosis in both superficial and deep locations. To standardize measurement, verification of deep learning (DL)-based recognition was performed. A cross-sectional, observational cohort trial was conducted. After narrowing the window width of the absorptive values in the CT images, a SegNet-based semantic segmentation model classifying every pixel into 5 classes (air, skin, muscle/water, fat, and fibrosis) was trained (65%), validated (15%), and tested (20%). Four indices were then formulated and compared with the standardized circumference difference ratio (SCDR) and bioelectrical impedance (BEI) results. In total, 2138 CT images from 27 patients with chronic unilateral lymphedema were analyzed. For fibrosis segmentation, the mean boundary F1 score and accuracy were 0.868 and 0.776, respectively. Among the 19 subindices of the 4 indices, 73.7% correlated with the BEI (partial correlation coefficient: 0.420–0.875) and 13.2% correlated with the SCDR (0.406–0.460). The mean subindex of Index 2, $\left( \frac{P_{Fibrosis\ in\ Affected} - P_{Fibrosis\ in\ Unaffected}}{P_{Limb\ in\ Unaffected}} \right)$, presented the highest correlation. DL has potential applications in CT image-based recognition of lymphedema-induced fibrosis. The subtraction-type formula might be the most promising estimation method.
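The Index 2 subindex quoted in this abstract is a simple ratio: the difference in fibrosis pixel counts between the affected and unaffected limbs, normalized by the unaffected limb's total pixel count. A minimal sketch, with hypothetical pixel counts and variable names of my own choosing:

```python
def index2(fib_affected, fib_unaffected, limb_unaffected):
    """(P_fibrosis_in_affected - P_fibrosis_in_unaffected) / P_limb_in_unaffected,
    where each P is a pixel count from the segmentation masks."""
    return (fib_affected - fib_unaffected) / limb_unaffected

# Hypothetical counts: 1200 fibrosis pixels in the affected limb, 400 in the
# unaffected limb, 20000 total limb pixels in the unaffected limb.
print(index2(1200, 400, 20000))  # → 0.04
```

Subtracting the unaffected limb's fibrosis count uses the patient's own healthy side as an internal baseline, which may explain why this subtraction-type subindex correlated best with the impedance reference.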