1. Du X, Ding X, Xi M, Lv Y, Qiu S, Liu Q. A Data Augmentation Method for Motor Imagery EEG Signals Based on DCGAN-GP Network. Brain Sci 2024; 14:375. PMID: 38672024; PMCID: PMC11048538; DOI: 10.3390/brainsci14040375.
Abstract
Motor imagery electroencephalography (EEG) signals have garnered attention in brain-computer interface (BCI) research due to their potential in promoting motor rehabilitation and control. However, the limited availability of labeled data poses challenges for training robust classifiers. In this study, we propose a novel data augmentation method utilizing an improved Deep Convolutional Generative Adversarial Network with Gradient Penalty (DCGAN-GP) to address this issue. We transformed raw EEG signals into two-dimensional time-frequency maps and employed a DCGAN-GP network to generate synthetic time-frequency representations resembling real data. Validation experiments were conducted on the BCI IV 2b dataset, comparing the performance of classifiers trained with augmented and unaugmented data. Results demonstrated that classifiers trained with synthetic data exhibited enhanced robustness across multiple subjects and achieved higher classification accuracy. Our findings highlight the effectiveness of utilizing DCGAN-GP-generated synthetic EEG data to improve classifier performance in distinguishing different motor imagery tasks. Thus, the proposed data augmentation method based on a DCGAN-GP offers a promising avenue for enhancing BCI system performance, overcoming data scarcity challenges, and bolstering classifier robustness, thereby providing substantial support for the broader adoption of BCI technology in real-world applications.
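Editor's note: the "GP" in DCGAN-GP denotes a gradient penalty on the discriminator. The abstract does not give the formulation, so the sketch below shows the standard WGAN-GP penalty it presumably adapts, written in PyTorch; the names `critic`, `real`, `fake`, and `lambda_gp` are illustrative assumptions, not code from the paper.

```python
import torch

def gradient_penalty(critic, real, fake, lambda_gp=10.0):
    """Standard WGAN-GP term: penalize the critic when the norm of its
    gradient, evaluated at random interpolations between real and
    generated samples, deviates from 1. A generic sketch only."""
    batch = real.size(0)
    eps = torch.rand(batch, 1, 1, 1, device=real.device)  # per-sample mix
    interp = (eps * real + (1.0 - eps) * fake).requires_grad_(True)
    scores = critic(interp)
    grads, = torch.autograd.grad(scores.sum(), interp, create_graph=True)
    grad_norm = grads.reshape(batch, -1).norm(2, dim=1)
    return lambda_gp * ((grad_norm - 1.0) ** 2).mean()
```

The 4-D shape of `eps` assumes image-like inputs (here, the 2-D time-frequency maps described in the abstract, batched as NCHW tensors).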
Affiliation(s)
- Xiaohui Ding
- Communication and Network Laboratory, Dalian University, Dalian 116622, China; (X.D.); (M.X.); (Y.L.); (S.Q.); (Q.L.)
2. Wang X, Mi Y, Zhang X. 3D human pose data augmentation using Generative Adversarial Networks for robotic-assisted movement quality assessment. Front Neurorobot 2024; 18:1371385. PMID: 38644903; PMCID: PMC11032046; DOI: 10.3389/fnbot.2024.1371385.
Abstract
In the realm of human motion recognition systems, the augmentation of 3D human pose data plays a pivotal role in enriching and enhancing the quality of original datasets through the generation of synthetic data. This augmentation is vital for addressing the current research gaps in diversity and complexity, particularly when dealing with rare or complex human movements. Our study introduces a groundbreaking approach employing Generative Adversarial Networks (GANs), coupled with Support Vector Machine (SVM) and DenseNet, further enhanced by robot-assisted technology to improve the precision and efficiency of data collection. The GANs in our model are responsible for generating highly realistic and diverse 3D human motion data, while SVM aids in the effective classification of this data. DenseNet is utilized for the extraction of key features, facilitating a comprehensive and integrated approach that significantly elevates both the data augmentation process and the model's ability to process and analyze complex human movements. The experimental outcomes underscore our model's exceptional performance in motion quality assessment, showcasing a substantial improvement over traditional methods in terms of classification accuracy and data processing efficiency. These results validate the effectiveness of our integrated network model, setting a solid foundation for future advancements in the field. Our research not only introduces innovative methodologies for 3D human pose data enhancement but also provides substantial technical support for practical applications across various domains, including sports science, rehabilitation medicine, and virtual reality. By combining advanced algorithmic strategies with robotic technologies, our work addresses key challenges in data augmentation and motion quality assessment, paving the way for new research and development opportunities in these critical areas.
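Editor's note: the abstract's division of labor (DenseNet for feature extraction, SVM for classification) can be sketched with off-the-shelf components. The hypothetical version below uses a frozen DenseNet-121 and an RBF SVM; the DenseNet variant, the rendering of pose data to images, and all data are placeholders, not details from the paper.

```python
import numpy as np
import torch
from torchvision.models import densenet121
from sklearn.svm import SVC

# Frozen DenseNet as a feature extractor; replacing the classifier head
# with an identity exposes the 1024-d penultimate features.
backbone = densenet121(weights=None)
backbone.classifier = torch.nn.Identity()
backbone.eval()

def extract_features(images):            # images: (N, 3, 224, 224)
    with torch.no_grad():
        return backbone(images).numpy()

# Stand-in batch: images rendered from real plus GAN-augmented 3D poses.
images = torch.randn(8, 3, 224, 224)
labels = np.array([0, 1] * 4)            # stand-in movement-quality classes

svm = SVC(kernel="rbf").fit(extract_features(images), labels)
print(svm.predict(extract_features(images[:2])))
```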
Affiliation(s)
- Xuefeng Wang
- College of Sports, Woosuk University, Jeonju, Republic of Korea
- Yang Mi
- College of Sports and Health, Linyi University, Linyi, China
- Xiang Zhang
- Department of Information Engineering, Linyi Technician Institute, Linyi, China
3. Jiang Q, Sun H, Deng W, Chen L, Li Q, Xie J, Pan X, Cheng Y, Chen X, Wang Y, Li Y, Wang X, Liu S, Xiao Y. Super Resolution of Pulmonary Nodules Target Reconstruction Using a Two-Channel GAN Models. Acad Radiol 2024:S1076-6332(24)00086-2. PMID: 38458886; DOI: 10.1016/j.acra.2024.02.016.
Abstract
RATIONALE AND OBJECTIVES To develop a Dual generative-adversarial-network (GAN) Cascaded Network (DGCN) for generating super-resolution computed tomography (SRCT) images from normal-resolution CT (NRCT) images and evaluate the performance of DGCN in multi-center datasets. MATERIALS AND METHODS This retrospective study included 278 patients with chest CT from two hospitals between January 2020 and June 2023, and each patient had all three examinations: NRCT (512×512 matrix CT images with a resolution of 0.70 mm × 0.70 mm × 1.0 mm), high-resolution CT (HRCT, 1024×1024 matrix CT images with a resolution of 0.35 mm × 0.35 mm × 1.0 mm), and ultra-high-resolution CT (UHRCT, 1024×1024 matrix CT images with a resolution of 0.17 mm × 0.17 mm × 0.5 mm). Initially, a deep chest CT super-resolution residual network (DCRN) was built to generate HRCT from NRCT. Subsequently, we employed the DCRN as a pre-trained model for the training of DGCN to further enhance resolution along all three axes, ultimately yielding SRCT. PSNR, SSIM, FID, subjective evaluation scores, and objective evaluation parameters related to pulmonary nodule segmentation in the testing set were recorded and analyzed. RESULTS DCRN obtained a PSNR of 52.16, SSIM of 0.9941, FID of 137.713, and an average diameter difference of 0.0981 mm. DGCN obtained a PSNR of 46.50, SSIM of 0.9990, FID of 166.421, and an average diameter difference of 0.0981 mm on 39 testing cases. There were no significant differences between the SRCT and UHRCT images in subjective evaluation. CONCLUSION Our model exhibited a significant enhancement in generating HRCT and SRCT images and outperformed established methods regarding image quality and clinical segmentation accuracy across both internal and external testing datasets.
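Editor's note: the two image-fidelity metrics reported for DCRN and DGCN (PSNR, SSIM) can be reproduced with scikit-image as in the sketch below; the arrays stand in for a matched reference/generated CT slice pair and are not the study's data.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

# Stand-ins for a reference UHRCT slice and a generated SRCT slice.
rng = np.random.default_rng(0)
reference = rng.random((256, 256)).astype(np.float32)
noise = 0.01 * rng.standard_normal((256, 256), dtype=np.float32)
generated = np.clip(reference + noise, 0.0, 1.0)

psnr = peak_signal_noise_ratio(reference, generated, data_range=1.0)
ssim = structural_similarity(reference, generated, data_range=1.0)
print(f"PSNR = {psnr:.2f} dB, SSIM = {ssim:.4f}")
```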
Affiliation(s)
- Qinling Jiang
- Department of Radiology, Changzheng Hospital, Naval Medical University, Shanghai 200003, China
- Hongbiao Sun
- Department of Radiology, Changzheng Hospital, Naval Medical University, Shanghai 200003, China
- Wei Deng
- Shanghai United Imaging Intelligence Co. Ltd., Shanghai 200232, China
- Lei Chen
- Shanghai United Imaging Intelligence Co. Ltd., Shanghai 200232, China
- Qingchu Li
- Department of Radiology, Changzheng Hospital, Naval Medical University, Shanghai 200003, China
- Jicai Xie
- Department of Radiology, The Second People's Hospital of Yuhuan, 317699, China
- Xianpan Pan
- Shanghai United Imaging Intelligence Co. Ltd., Shanghai 200232, China
- Yuxin Cheng
- Department of Radiology, Changzheng Hospital, Naval Medical University, Shanghai 200003, China
- Xin Chen
- Department of Radiology, Changzheng Hospital, Naval Medical University, Shanghai 200003, China
- Yunmeng Wang
- Department of Radiology, Changzheng Hospital, Naval Medical University, Shanghai 200003, China
- Yanran Li
- University of Queensland, Brisbane 4072, Australia
- Xiang Wang
- Department of Radiology, Changzheng Hospital, Naval Medical University, Shanghai 200003, China
- Shiyuan Liu
- Department of Radiology, Changzheng Hospital, Naval Medical University, Shanghai 200003, China
- Yi Xiao
- Department of Radiology, Changzheng Hospital, Naval Medical University, Shanghai 200003, China
4. Vagni M, Tran HE, Romano A, Chiloiro G, Boldrini L, Zormpas-Petridis K, Kawula M, Landry G, Kurz C, Corradini S, Belka C, Indovina L, Gambacorta MA, Placidi L, Cusumano D. Auto-segmentation of pelvic organs at risk on 0.35T MRI using 2D and 3D Generative Adversarial Network models. Phys Med 2024; 119:103297. PMID: 38310680; DOI: 10.1016/j.ejmp.2024.103297.
Abstract
PURPOSE Manual recontouring of targets and Organs At Risk (OARs) is a time-consuming and operator-dependent task. We explored the potential of Generative Adversarial Networks (GAN) to auto-segment the rectum, bladder and femoral heads on 0.35T MRIs to accelerate the online MRI-guided-Radiotherapy (MRIgRT) workflow. METHODS 3D planning MRIs from 60 prostate cancer patients treated with a 0.35T MR-Linac were collected. A 3D GAN architecture and its equivalent 2D version were trained, validated and tested on 40, 10 and 10 patients respectively. The volumetric Dice Similarity Coefficient (DSC) and 95th percentile Hausdorff Distance (HD95th) were computed against expert-drawn ground-truth delineations. The networks were also validated on an independent external dataset of 16 patients. RESULTS In the internal test set, the 3D and 2D GANs showed DSC/HD95th of 0.83/9.72 mm and 0.81/10.65 mm for the rectum, 0.92/5.91 mm and 0.85/15.72 mm for the bladder, and 0.94/3.62 mm and 0.90/9.49 mm for the femoral heads. In the external test set, the performance was 0.74/31.13 mm and 0.72/25.07 mm for the rectum, 0.92/9.46 mm and 0.88/11.28 mm for the bladder, and 0.89/7.00 mm and 0.88/10.06 mm for the femoral heads. The 3D and 2D GANs required on average 1.44 s and 6.59 s, respectively, to generate the OARs' volumetric segmentation for a single patient. CONCLUSIONS The proposed 3D GAN auto-segments pelvic OARs on 0.35T MRI with high accuracy in both the internal and the external test sets, outperforming its 2D equivalent in both segmentation robustness and volume generation time.
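Editor's note: the two reported segmentation metrics are compact to implement. The sketch below computes Dice on whole boolean masks and a simplified HD95 that pools distances from all mask voxels rather than extracted surfaces; that simplification is a common approximation and an assumption here, not the paper's exact procedure.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def dice(a, b):
    """Volumetric Dice Similarity Coefficient for boolean masks."""
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

def hd95(a, b, spacing=(1.0, 1.0, 1.0)):
    """Approximate 95th-percentile symmetric Hausdorff distance (mm):
    distance transforms give each voxel's distance to the other mask,
    and the pooled distances are summarized at the 95th percentile."""
    dist_to_a = distance_transform_edt(~a, sampling=spacing)
    dist_to_b = distance_transform_edt(~b, sampling=spacing)
    return np.percentile(np.hstack([dist_to_a[b], dist_to_b[a]]), 95)
```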
Affiliation(s)
- Marica Vagni
- Fondazione Policlinico Universitario "Agostino Gemelli" IRCCS, Rome, Italy
- Huong Elena Tran
- Fondazione Policlinico Universitario "Agostino Gemelli" IRCCS, Rome, Italy
- Angela Romano
- Fondazione Policlinico Universitario "Agostino Gemelli" IRCCS, Rome, Italy
- Giuditta Chiloiro
- Fondazione Policlinico Universitario "Agostino Gemelli" IRCCS, Rome, Italy
- Luca Boldrini
- Fondazione Policlinico Universitario "Agostino Gemelli" IRCCS, Rome, Italy
- Maria Kawula
- Department of Radiation Oncology, University Hospital, LMU Munich, Munich, Germany
- Guillaume Landry
- Department of Radiation Oncology, University Hospital, LMU Munich, Munich, Germany
- Christopher Kurz
- Department of Radiation Oncology, University Hospital, LMU Munich, Munich, Germany
- Stefanie Corradini
- Department of Radiation Oncology, University Hospital, LMU Munich, Munich, Germany
- Claus Belka
- Department of Radiation Oncology, University Hospital, LMU Munich, Munich, Germany; German Cancer Consortium (DKTK), Department of Radiation Oncology, Munich, Germany
- Luca Indovina
- Fondazione Policlinico Universitario "Agostino Gemelli" IRCCS, Rome, Italy
- Lorenzo Placidi
- Fondazione Policlinico Universitario "Agostino Gemelli" IRCCS, Rome, Italy
- Davide Cusumano
- Fondazione Policlinico Universitario "Agostino Gemelli" IRCCS, Rome, Italy; Mater Olbia Hospital, Olbia, SS, Italy
5. Sohn J, Shin H, Lee J, Kim HC. Validation of Electrocardiogram Based Photoplethysmogram Generated Using U-Net Based Generative Adversarial Networks. J Healthc Inform Res 2024; 8:140-157. PMID: 38273980; PMCID: PMC10805750; DOI: 10.1007/s41666-023-00156-z.
Abstract
The photoplethysmogram (PPG) plays an important role in alerting to atrial fibrillation (AF). While the importance of PPG is emphasized, there is an insufficient amount of openly available atrial fibrillation PPG data. We propose a U-net-based generative adversarial network (GAN) which synthesizes PPG from paired electrocardiogram (ECG) signals. To measure the performance of the proposed GAN, we compared the generated PPG to reference PPG in terms of morphology similarity and also examined its influence on AF detection classifier performance. First, morphology was compared using two different metrics against the reference signal: percent root mean square difference (PRD) and Pearson correlation coefficient. The mean PRD and Pearson correlation coefficient were 27% and 0.94, respectively. Heart rate variability (HRV) of the reference AF ECG and the generated PPG were compared as well. The p-value of the paired t-test was 0.248, indicating that no significant difference was observed between the two HRV values. Second, to validate the generated AF PPG dataset, four different datasets were prepared combining the generated PPG and real AF PPG. Each dataset was used to optimize a classification model while maintaining the same architecture. A test dataset was prepared to test the performance of each optimized model. Subsequently, these datasets were used to test whether the generated data benefit the training of an AF classifier. Comparing the performance metrics of each optimized model, the training dataset consisting of generated and real AF PPG showed a test accuracy of 0.962, which was close to that of the dataset consisting only of real AF PPG data at 0.961. Furthermore, both models yielded the same F1 score of 0.969. Lastly, using only the generated AF PPG dataset resulted in a test accuracy of 0.945, indicating that the GAN was capable of generating valuable AF PPG. Therefore, it can be concluded that the generated AF PPG can be used to augment insufficient data. To summarize, this study proposes a GAN-based method to generate atrial fibrillation PPG that can be used for training atrial fibrillation PPG classification models.
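Editor's note: the two morphology metrics used here are easy to reproduce. The sketch below implements one common definition of PRD alongside SciPy's Pearson correlation, on synthetic stand-in waveforms rather than real PPG.

```python
import numpy as np
from scipy.stats import pearsonr

def prd(reference, generated):
    """Percent root-mean-square difference (one common definition)."""
    return 100.0 * np.sqrt(np.sum((reference - generated) ** 2)
                           / np.sum(reference ** 2))

t = np.linspace(0.0, 10.0, 2000)
reference = np.sin(2 * np.pi * 1.2 * t)   # ~72 bpm pulse-wave proxy
generated = reference + 0.1 * np.random.default_rng(0).standard_normal(t.size)

print(f"PRD = {prd(reference, generated):.1f}%")
print(f"Pearson r = {pearsonr(reference, generated)[0]:.3f}")
```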
Affiliation(s)
- Jangjay Sohn
- Institute of Medical & Biological Engineering, Medical Research Center, Seoul National University College of Medicine, Seoul, Korea
- Department of Electronic Engineering, Hanyang University, Seoul, Korea
- Heean Shin
- Samsung SDS R&D Center, Seoul, Republic of Korea
- Joonnyong Lee
- Mellowing Factory Co., Ltd., 131 Sapeyong-daero 57-gil, Seocho-gu, Seoul, 06535 Republic of Korea
- Hee Chan Kim
- Interdisciplinary Program in Bioengineering, Seoul National University, Seoul, Korea
- Department of Biomedical Engineering, Seoul National University College of Medicine, 103, Daehak-ro, Jongno-gu, Seoul, 03080 Republic of Korea
6. Xu Z, Tang J, Qi C, Yao D, Liu C, Zhan Y, Lukasiewicz T. Cross-domain attention-guided generative data augmentation for medical image analysis with limited data. Comput Biol Med 2024; 168:107744. PMID: 38006826; DOI: 10.1016/j.compbiomed.2023.107744.
Abstract
Data augmentation is widely applied to medical image analysis tasks in limited datasets with imbalanced classes and insufficient annotations. However, traditional augmentation techniques cannot supply extra information, making the performance of diagnosis unsatisfactory. GAN-based generative methods have thus been proposed to obtain additional useful information to realize more effective data augmentation; but existing generative data augmentation techniques mainly encounter two problems: (i) current generative data augmentation lacks the capability to use cross-domain differential information to extend limited datasets; (ii) existing generative methods cannot provide effective supervised information in medical image segmentation tasks. To solve these problems, we propose an attention-guided cross-domain tumor image generation model (CDA-GAN) with an information enhancement strategy. The CDA-GAN can generate diverse samples to expand the scale of datasets, improving the performance of medical image diagnosis and treatment tasks. In particular, we incorporate channel attention into a CycleGAN-based cross-domain generation network that captures inter-domain information and generates positive or negative samples of brain tumors. In addition, we propose a semi-supervised spatial attention strategy to guide spatial information of features at the pixel level in tumor generation. Furthermore, we add spectral normalization to the discriminator to prevent mode collapse and stabilize the training procedure. Finally, to resolve an inapplicability problem in the segmentation task, we further propose an application strategy of using this data augmentation model to achieve more accurate medical image segmentation with limited data. Experimental studies on two public brain tumor datasets (BraTS and TCIA) show that the proposed CDA-GAN model greatly outperforms the state-of-the-art generative data augmentation in both practical medical image classification tasks and segmentation tasks; e.g., CDA-GAN is 0.50%, 1.72%, 2.05%, and 0.21% better than the best SOTA baseline in terms of ACC, AUC, Recall, and F1, respectively, in the classification task of BraTS, while its improvements w.r.t. the best SOTA baseline in terms of Dice, Sens, HD95, and mIOU in the segmentation task of TCIA are 2.50%, 0.90%, 14.96%, and 4.18%, respectively.
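Editor's note: two of this abstract's building blocks, CycleGAN's cycle-consistency objective and channel attention, can be sketched generically in PyTorch as below. The squeeze-and-excitation form of the attention is an assumption, since the abstract does not specify the exact design; `G_ab`/`G_ba` stand for any pair of image-to-image generators.

```python
import torch
import torch.nn.functional as F

def cycle_consistency_loss(G_ab, G_ba, real_a, real_b, lam=10.0):
    """CycleGAN-style objective: A->B->A and B->A->B translations
    should reconstruct their inputs (L1 reconstruction error)."""
    rec_a = G_ba(G_ab(real_a))
    rec_b = G_ab(G_ba(real_b))
    return lam * (F.l1_loss(rec_a, real_a) + F.l1_loss(rec_b, real_b))

class ChannelAttention(torch.nn.Module):
    """Squeeze-and-excitation style gate: re-weight feature channels
    using globally pooled statistics (one plausible form of the
    channel attention the abstract mentions)."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.fc = torch.nn.Sequential(
            torch.nn.Linear(channels, channels // reduction),
            torch.nn.ReLU(inplace=True),
            torch.nn.Linear(channels // reduction, channels),
            torch.nn.Sigmoid())

    def forward(self, x):                      # x: (N, C, H, W)
        weights = self.fc(x.mean(dim=(2, 3)))  # squeeze -> (N, C)
        return x * weights[:, :, None, None]   # excite: scale channels
```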
Affiliation(s)
- Zhenghua Xu
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin, China
- Jiaqi Tang
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin, China
- Chang Qi
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin, China; Institute of Logic and Computation, Vienna University of Technology, Vienna, Austria
- Dan Yao
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin, China
- Caihua Liu
- College of Computer Science and Technology, Civil Aviation University of China, Tianjin, China
- Yuefu Zhan
- Department of Radiology, Hainan Women and Children's Medical Center, Haikou, China
- Thomas Lukasiewicz
- Institute of Logic and Computation, Vienna University of Technology, Vienna, Austria; Department of Computer Science, University of Oxford, Oxford, United Kingdom
7. Belue MJ, Harmon SA, Masoudi S, Barrett T, Law YM, Purysko AS, Panebianco V, Yilmaz EC, Lin Y, Jadda PK, Raavi S, Wood BJ, Pinto PA, Choyke PL, Turkbey B. Quality of T2-weighted MRI re-acquisition versus deep learning GAN image reconstruction: A multi-reader study. Eur J Radiol 2024; 170:111259. PMID: 38128256; PMCID: PMC10842312; DOI: 10.1016/j.ejrad.2023.111259.
Abstract
PURPOSE To evaluate CycleGAN's ability to enhance T2-weighted image (T2WI) quality. METHOD A CycleGAN algorithm was used to enhance T2WI quality. 96 patients (192 scans) were identified who underwent repeat axial T2WI because of poor quality on the first attempt (RAD1), with improved quality on re-acquisition (RAD2). The CycleGAN pipeline produced deep learning (DL) classifier scores (0-1) for quality quantification and generated enhanced versions, QI1 and QI2, from RAD1 and RAD2, respectively. A subset (n = 20 patients) was selected for a blinded, multi-reader study, in which four radiologists rated T2WI quality on a scale of 1-4. The multi-reader study presented readers with 60 image pairs (RAD1 vs RAD2, RAD1 vs QI1, and RAD2 vs QI2), allowing them to select sequence preferences and quantify the quality changes. RESULTS The DL classifier correctly discerned 71.9% of quality classes, identifying 90.6% (96/106) of poor-quality and 48.8% (42/86) of diagnostic original sequences (RAD1, RAD2). CycleGAN images (QI1, QI2) demonstrated quantitative improvements, with consistently higher DL classifier scores than original scans (p < 0.001). In the multi-reader analysis, however, CycleGAN demonstrated no qualitative improvements: in most patients, QI2 showed diminished overall quality and more motion than RAD2, while noise levels remained similar (8/20). No readers preferred QI2 to RAD2 for diagnosis. CONCLUSION Despite quantitative enhancements with CycleGAN, there was no qualitative boost in T2WI diagnostic quality, noise, or motion. Expert radiologists did not favor CycleGAN images over standard scans, highlighting the divide between quantitative and qualitative metrics.
Affiliation(s)
- Mason J Belue
- Molecular Imaging Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Stephanie A Harmon
- Molecular Imaging Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Tristan Barrett
- Department of Radiology, University of Cambridge, Cambridge, England
- Yan Mee Law
- Department of Radiology, Singapore General Hospital, Singapore
- Andrei S Purysko
- Section of Abdominal Imaging, Imaging Institute, Cleveland Clinic, Cleveland, OH, USA
- Enis C Yilmaz
- Molecular Imaging Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Yue Lin
- Molecular Imaging Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Pavan Kumar Jadda
- Center for Information Technology, National Institutes of Health, Bethesda, MD, USA
- Sitarama Raavi
- Center for Information Technology, National Institutes of Health, Bethesda, MD, USA
- Bradford J Wood
- Center for Interventional Oncology, National Cancer Institute, NIH, Bethesda, MD, USA; Department of Radiology, Clinical Center, National Institutes of Health, Bethesda, Maryland, USA
- Peter A Pinto
- Urologic Oncology Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Peter L Choyke
- Molecular Imaging Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Baris Turkbey
- Molecular Imaging Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
8. Thangaraj PM, Shankar SV, Oikonomou EK, Khera R. RCT-Twin-GAN Generates Digital Twins of Randomized Control Trials Adapted to Real-world Patients to Enhance their Inference and Application. medRxiv 2023:2023.12.06.23299464 [preprint]. PMID: 38106089; PMCID: PMC10723568; DOI: 10.1101/2023.12.06.23299464.
Abstract
Background Randomized clinical trials (RCTs) are designed to produce evidence in selected populations. Assessing their effects in the real world is essential to changing medical practice; however, key populations are historically underrepresented in RCTs. We define an approach to simulate RCT-based effects in real-world settings using RCT digital twins reflecting the covariate patterns in an electronic health record (EHR). Methods We developed a Generative Adversarial Network (GAN) model, RCT-Twin-GAN, which generates a digital twin of an RCT (RCT-Twin) conditioned on covariate distributions from an EHR cohort. We improved upon a traditional tabular conditional GAN, CTGAN, with a loss function adapted for data distributions and by conditioning on multiple discrete and continuous covariates simultaneously. We assessed the similarity between a Heart Failure with preserved Ejection Fraction (HFpEF) RCT (TOPCAT), a Yale HFpEF EHR cohort, and RCT-Twin. We also evaluated cardiovascular event-free survival stratified by spironolactone (treatment) use. Results By applying RCT-Twin-GAN to 3445 TOPCAT participants and conditioning on 3445 Yale EHR HFpEF patients, we generated RCT-Twin datasets of 1141 to 3445 patients, depending on covariate conditioning and model parameters. RCT-Twin randomly allocated spironolactone (S)/placebo (P) arms like an RCT, was similar to the RCT by a multi-dimensional distance metric, and balanced covariates (median absolute standardized mean difference (MASMD) 0.017, IQR 0.0034-0.030). The 5 EHR-conditioned covariates in RCT-Twin matched the EHR more closely than the RCT did (MASMD 0.008 vs 0.63, IQR 0.005-0.018 vs 0.59-1.11). RCT-Twin reproduced the overall effect size seen in TOPCAT (5-year cardiovascular composite outcome odds ratio (95% confidence interval) of 0.89 (0.75-1.06) in the RCT vs 0.85 (0.69-1.04) in RCT-Twin). Conclusions RCT-Twin-GAN simulates RCT-derived effects in real-world patients by translating these effects to the covariate distributions of EHR patients. This key methodological advance may enable the direct translation of RCT-derived effects into real-world patient populations and may support causal inference in real-world settings.
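Editor's note: covariate balance in the twin is summarized by the median absolute standardized mean difference. A minimal version of that check for one continuous covariate is sketched below, with synthetic arms standing in for the generated spironolactone/placebo data.

```python
import numpy as np

def standardized_mean_difference(x_a, x_b):
    """Absolute standardized mean difference for one continuous
    covariate across two arms; the paper summarizes balance with
    the median ASMD (MASMD) taken over all covariates."""
    pooled_sd = np.sqrt((np.var(x_a, ddof=1) + np.var(x_b, ddof=1)) / 2)
    return abs(np.mean(x_a) - np.mean(x_b)) / pooled_sd

# Stand-in covariate (e.g., age) in the two randomized twin arms.
rng = np.random.default_rng(0)
arm_s = rng.normal(70, 9, size=500)   # spironolactone arm
arm_p = rng.normal(70, 9, size=500)   # placebo arm
print(f"ASMD = {standardized_mean_difference(arm_s, arm_p):.3f}")
```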
Affiliation(s)
- Phyllis M Thangaraj
- Section of Cardiovascular Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, CT, USA
- Sumukh Vasisht Shankar
- Section of Cardiovascular Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, CT, USA
- Evangelos K Oikonomou
- Section of Cardiovascular Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, CT, USA
- Rohan Khera
- Section of Cardiovascular Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, CT, USA
- Section of Health Informatics, Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
- Center for Outcomes Research and Evaluation (CORE), Yale New Haven Hospital, New Haven, CT, USA
9. Salas J, Saha A, Ravela S. Learning inter-annual flood loss risk models from historical flood insurance claims. J Environ Manage 2023; 347:118862. PMID: 37806269; DOI: 10.1016/j.jenvman.2023.118862.
Abstract
Flooding is a natural hazard that causes substantial loss of lives and livelihoods worldwide. Developing predictive models for flood-induced financial losses is crucial for applications such as insurance underwriting. This research uses the National Flood Insurance Program (NFIP) dataset between 2000 and 2020 to evaluate the predictive skill of past data in predicting near-future flood loss risk. Our approach applies neural networks (Conditional Generative Adversarial Networks), decision trees (Extreme Gradient Boosting), and kernel-based regressors (Gaussian Processes) to estimate pointwise losses. It aggregates them over intervals using a bias-corrected Burr-Pareto distribution to predict risk. The regression models help identify the most informative predictors and highlight crucial factors influencing flood-related financial losses. Applying our approach to quantify the county-level coastal flood loss risk in eight US Southern states yields an R² of 0.807, substantially outperforming related work using stage-damage curves. More detailed testing on 11 counties with significant claims in the NFIP dataset reveals that Extreme Gradient Boosting yields the most favorable results, and bias correction significantly improves the similarity between the predicted and reference claim amount distributions. Our experiments also show that, with climate change already under way, the difference in near-future flood-loss risk predictions between shifting and expanding historical training-data windows is insignificant.
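Editor's note: a pointwise loss regressor of the kind compared here can be set up in a few lines. The sketch below uses the xgboost package's scikit-learn interface on synthetic filler data, with log-transformed targets as one plausible handling of heavy-tailed claim amounts; the features, target transform, and hyperparameters are assumptions, not the paper's stated pipeline.

```python
import numpy as np
from xgboost import XGBRegressor

# Synthetic stand-in: rows could be county-years, columns predictors
# such as rainfall, elevation, and policy coverage; the target is a
# heavy-tailed claim amount. None of this is the NFIP data.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 6))
y = np.exp(X @ rng.normal(size=6) + 0.3 * rng.normal(size=500))

model = XGBRegressor(n_estimators=300, max_depth=4, learning_rate=0.05)
model.fit(X, np.log(y))                 # regress on log claim amounts
pred = np.exp(model.predict(X))         # back-transform to dollars
print(pred[:5])
```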
Affiliation(s)
- Joaquin Salas
- Earth Signals and Systems Group, Earth, Atmospheric and Planetary Sciences, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139-4307, United States of America; CICATA Querétaro. Instituto Politécnico Nacional, Cerro Blanco 141, Colinas del Cimatario, Querétaro, Querétaro, 76090, Mexico.
- Anamitra Saha
- Earth Signals and Systems Group, Earth, Atmospheric and Planetary Sciences, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139-4307, United States of America
- Sai Ravela
- Earth Signals and Systems Group, Earth, Atmospheric and Planetary Sciences, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139-4307, United States of America
10. Shi D, Zhang W, He S, Chen Y, Song F, Liu S, Wang R, Zheng Y, He M. Translation of Color Fundus Photography into Fluorescein Angiography Using Deep Learning for Enhanced Diabetic Retinopathy Screening. Ophthalmol Sci 2023; 3:100401. PMID: 38025160; PMCID: PMC10630672; DOI: 10.1016/j.xops.2023.100401.
Abstract
Purpose To develop and validate a deep learning model that can transform color fundus (CF) photography into corresponding venous and late-phase fundus fluorescein angiography (FFA) images. Design Cross-sectional study. Participants We included 51 370 CF-venous FFA pairs and 14 644 CF-late FFA pairs from 4438 patients for model development. External testing involved 50 eyes with CF-FFA pairs and 2 public datasets for diabetic retinopathy (DR) classification, with 86 952 CF from EyePACs and 1744 CF from MESSIDOR2. Methods We trained a deep learning model to transform CF into corresponding venous and late-phase FFA images. The quality of the translated FFA images was evaluated quantitatively on the internal test set and subjectively on 100 eyes with CF-FFA paired images (50 from external), based on the realism of the global image, anatomical landmarks (macula, optic disc, and vessels), and lesions. Moreover, we validated the clinical utility of the translated FFA for classifying 5-class DR and diabetic macular edema (DME) in the EyePACs and MESSIDOR2 datasets. Main Outcome Measures Image generation was quantitatively assessed by structural similarity measures (SSIM), and subjectively by 2 clinical experts on a 5-point scale (1 denotes real FFA); intragrader agreement was assessed by kappa. The DR classification accuracy was assessed by the area under the receiver operating characteristic curve. Results The SSIM of the translated FFA images was > 0.6, and the subjective quality scores ranged from 1.37 to 2.60. Both experts reported similar quality scores with substantial agreement (all kappas > 0.8). Adding the generated FFA on top of CF improved DR classification in the EyePACs and MESSIDOR2 datasets, with the area under the receiver operating characteristic curve increasing from 0.912 to 0.939 on the EyePACs dataset and from 0.952 to 0.972 on the MESSIDOR2 dataset. The DME area under the receiver operating characteristic curve also increased from 0.927 to 0.974 in the MESSIDOR2 dataset. Conclusions Our CF-to-FFA framework produced realistic FFA images. Moreover, adding the translated FFA images on top of CF improved the accuracy of DR screening. These results suggest that CF-to-FFA translation could be used as a surrogate method when FFA examination is not feasible and as a simple add-on to improve DR screening. Financial Disclosures Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
Affiliation(s)
- Danli Shi
- School of Optometry, The Hong Kong Polytechnic University, Kowloon, Hong Kong
- Research Centre for SHARP Vision (RCSV), The Hong Kong Polytechnic University, Kowloon, Hong Kong
- Weiyi Zhang
- School of Optometry, The Hong Kong Polytechnic University, Kowloon, Hong Kong
- Shuang He
- State Key Laboratory of Ophthalmology, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Zhongshan Ophthalmic Center, Guangdong Provincial Clinical Research Center for Ocular Diseases, Sun Yat-sen University, Guangzhou, China
- Yanxian Chen
- School of Optometry, The Hong Kong Polytechnic University, Kowloon, Hong Kong
- Fan Song
- School of Optometry, The Hong Kong Polytechnic University, Kowloon, Hong Kong
- Shunming Liu
- Department of Ophthalmology, Guangdong Academy of Medical Sciences, Guangdong Provincial People's Hospital, Guangzhou, China
- Ruobing Wang
- Department of Ophthalmology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Yingfeng Zheng
- State Key Laboratory of Ophthalmology, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Zhongshan Ophthalmic Center, Guangdong Provincial Clinical Research Center for Ocular Diseases, Sun Yat-sen University, Guangzhou, China
- Mingguang He
- School of Optometry, The Hong Kong Polytechnic University, Kowloon, Hong Kong
- Research Centre for SHARP Vision (RCSV), The Hong Kong Polytechnic University, Kowloon, Hong Kong
- Department of Ophthalmology, Guangdong Academy of Medical Sciences, Guangdong Provincial People's Hospital, Guangzhou, China
11. Madokoro H, Sato K, Nix S, Chiyonobu S, Nagayoshi T, Sato K. OutcropHyBNet: Hybrid Backbone Networks with Data Augmentation for Accurate Stratum Semantic Segmentation of Monocular Outcrop Images in Carbon Capture and Storage Applications. Sensors (Basel) 2023; 23:8809. PMID: 37960509; PMCID: PMC10650223; DOI: 10.3390/s23218809.
Abstract
The rapid advance of climate change and global warming has widespread impacts on society, including ecosystems, water security, food production, health, and infrastructure. Of the global emission reductions required, approximately 74% is expected to come from cutting carbon dioxide (CO2) emissions in energy supply and demand. Carbon Capture and Storage (CCS) has attained global recognition as a preeminent approach for the mitigation of atmospheric carbon dioxide levels, primarily by means of capturing and storing CO2 emissions originating from fossil fuel systems. Currently, geological models for storage location determination in CCS rely on limited sampling data from borehole surveys, which poses accuracy challenges. To tackle this challenge, our research project focuses on analyzing exposed rock formations, known as outcrops, with the goal of identifying the most effective backbone networks for classifying various strata types in outcrop images. We leverage deep learning-based outcrop semantic segmentation techniques using hybrid backbone networks, named OutcropHyBNet, to achieve accurate and efficient lithological classification, while considering texture features and without compromising computational efficiency. We conducted accuracy comparisons using publicly available benchmark datasets, as well as an original dataset expanded through random sampling of 13 outcrop images obtained using a stationary camera installed on the ground. Additionally, we evaluated the efficacy of data augmentation through image synthesis using Only Adversarial Supervision for Semantic Image Synthesis (OASIS). Evaluation experiments on two public benchmark datasets revealed insights into the classification characteristics of different classes. The results demonstrate the superiority of Convolutional Neural Networks (CNNs), specifically DeepLabv3, and Vision Transformers (ViTs), particularly SegFormer, under specific conditions. These findings contribute to advancing accurate lithological classification in geological studies using deep learning methodologies. In the evaluation experiments conducted on ground-level images obtained using a stationary camera and aerial images captured using a drone, we successfully demonstrated the superior performance of SegFormer across all categories.
Affiliation(s)
- Hirokazu Madokoro
- Faculty of Software and Information Science, Iwate Prefectural University, Takizawa 020-0693, Japan
- Kodai Sato
- Faculty of Systems Science and Technology, Akita Prefectural University, Yurihonjo 015-0055, Japan
- Stephanie Nix
- Faculty of Software and Information Science, Iwate Prefectural University, Takizawa 020-0693, Japan
- Shun Chiyonobu
- Graduate School of International Resource Sciences, Akita University, Akita 010-8502, Japan
- Takeshi Nagayoshi
- Faculty of Bioresource Sciences, Akita Prefectural University, Akita 010-0195, Japan
- Kazuhito Sato
- Faculty of Systems Science and Technology, Akita Prefectural University, Yurihonjo 015-0055, Japan
12. Lim B, Seth I, Kah S, Sofiadellis F, Ross RJ, Rozen WM, Cuomo R. Using Generative Artificial Intelligence Tools in Cosmetic Surgery: A Study on Rhinoplasty, Facelifts, and Blepharoplasty Procedures. J Clin Med 2023; 12:6524. PMID: 37892665; PMCID: PMC10607912; DOI: 10.3390/jcm12206524.
Abstract
Artificial intelligence (AI), notably Generative Adversarial Networks (GANs), has the potential to transform medical and patient education. Leveraging GANs in medical fields, especially cosmetic surgery, provides a plethora of benefits, including upholding patient confidentiality, ensuring broad exposure to diverse patient scenarios, and democratizing medical education. This study investigated the capacity of the AI models DALL-E 2, Midjourney, and Blue Willow to generate realistic images pertinent to cosmetic surgery. We combined the generative powers of ChatGPT-4 and Google's BARD with these GANs to produce images of various noses, faces, and eyelids. Four board-certified plastic surgeons evaluated the generated images, eliminating the need for real patient photographs. Notably, the generated images predominantly showcased female faces with lighter skin tones, lacking representation of males, older women, and those with a body mass index above 20. The integration of AI in cosmetic surgery offers enhanced patient education and training but demands careful and ethical incorporation to ensure comprehensive representation and uphold medical standards.
Affiliation(s)
- Bryan Lim
- Department of Plastic and Reconstructive Surgery, Peninsula Health, Frankston, VIC 3199, Australia
- Central Clinical School, Faculty of Medicine, Monash University, Melbourne, VIC 3004, Australia
- Ishith Seth
- Department of Plastic and Reconstructive Surgery, Peninsula Health, Frankston, VIC 3199, Australia
- Central Clinical School, Faculty of Medicine, Monash University, Melbourne, VIC 3004, Australia
- Skyler Kah
- Department of Plastic and Reconstructive Surgery, Peninsula Health, Frankston, VIC 3199, Australia
- Foti Sofiadellis
- Department of Plastic and Reconstructive Surgery, Peninsula Health, Frankston, VIC 3199, Australia
- Richard J. Ross
- Department of Plastic and Reconstructive Surgery, Peninsula Health, Frankston, VIC 3199, Australia
- Warren M. Rozen
- Department of Plastic and Reconstructive Surgery, Peninsula Health, Frankston, VIC 3199, Australia
- Central Clinical School, Faculty of Medicine, Monash University, Melbourne, VIC 3004, Australia
- Roberto Cuomo
- Plastic Surgery Unit, Department of Medicine, Surgery and Neuroscience, University of Siena, 53100 Siena, Italy
13. Guerreiro J, Tomás P, Garcia N, Aidos H. Super-resolution of magnetic resonance images using Generative Adversarial Networks. Comput Med Imaging Graph 2023; 108:102280. PMID: 37597380; DOI: 10.1016/j.compmedimag.2023.102280.
Abstract
Magnetic Resonance Imaging (MRI) typically comes at the cost of small spatial coverage, high expenses and long scan times. Accelerating MRI acquisition by taking fewer measurements offers the potential to relax these inherent trade-offs. Recent breakthroughs in the field of Machine Learning have shown that high-resolution (HR) images can be recovered from low-resolution (LR) signals via super-resolution (SR). In particular, a novel class of neural networks named Generative Adversarial Networks (GANs) has introduced an alternative way of conceiving models capable of generating data. GANs can learn to infer details based on some prior information, subsequently recovering missing data. Accordingly, they hold great potential for MRI reconstruction and acceleration tasks. This paper conducts a review of GAN-based SR methods, exhibiting the ability of GANs to upscale MRIs by a scale factor of ×4 while maintaining trustworthy, high-frequency details. Although quantitative results suggest SRResCycGAN outperforms other popular deep learning methods in recovering ×4 downgraded images, qualitative results show that Beby-GAN achieves the best perceptual quality. Together, these findings indicate that GAN-based methods have the capacity to reduce medical costs and patient distress, and even to enable new MRI applications where imaging is currently too slow or expensive.
Affiliation(s)
- João Guerreiro
- INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal.
- Pedro Tomás
- INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal
- Nuno Garcia
- LASIGE, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal
- Helena Aidos
- LASIGE, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal
14. Huang F, Deng Y. TCGAN: Convolutional Generative Adversarial Network for time series classification and clustering. Neural Netw 2023; 165:868-883. PMID: 37433231; DOI: 10.1016/j.neunet.2023.06.033.
Abstract
Recent works have demonstrated the superiority of supervised Convolutional Neural Networks (CNNs) in learning hierarchical representations from time series data for successful classification. These methods require sufficiently large labeled datasets for stable learning; however, acquiring high-quality labeled time series data can be costly and potentially infeasible. Generative Adversarial Networks (GANs) have achieved great success in enhancing unsupervised and semi-supervised learning. Nonetheless, to the best of our knowledge, it remains unclear how effectively GANs can serve as a general-purpose solution to learn representations for time series recognition, i.e., classification and clustering. The above considerations inspire us to introduce a Time-series Convolutional GAN (TCGAN). TCGAN learns by playing an adversarial game between two one-dimensional CNNs (i.e., a generator and a discriminator) in the absence of label information. Parts of the trained TCGAN are then reused to construct a representation encoder to empower linear recognition methods. We conducted comprehensive experiments on synthetic and real-world datasets. The results demonstrate that TCGAN is faster and more accurate than existing time-series GANs. The learned representations enable simple classification and clustering methods to achieve superior and stable performance. Furthermore, TCGAN retains high efficacy in scenarios with few labeled and imbalanced-label data. Our work provides a promising path to effectively utilize abundant unlabeled time series data.
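Editor's note: the reuse idea, train adversarially and then recycle part of the discriminator as a representation encoder for simple linear recognizers, can be sketched as below in PyTorch. The 1-D CNN layer sizes and all data are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression

# A toy 1-D CNN discriminator; after GAN training, its real/fake head
# is dropped and the rest serves as a frozen feature encoder.
discriminator = nn.Sequential(
    nn.Conv1d(1, 32, kernel_size=5, stride=2, padding=2), nn.LeakyReLU(0.2),
    nn.Conv1d(32, 64, kernel_size=5, stride=2, padding=2), nn.LeakyReLU(0.2),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(64, 1),                       # real/fake output head
)
encoder = nn.Sequential(*list(discriminator.children())[:-1])  # drop head
encoder.eval()

x = torch.randn(16, 1, 128)                 # 16 stand-in series, length 128
with torch.no_grad():
    z = encoder(x).numpy()                  # (16, 64) representations

y = [0] * 8 + [1] * 8                       # stand-in class labels
clf = LogisticRegression().fit(z, y)        # simple linear recognizer
print(clf.predict(z[:4]))
```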
Affiliation(s)
- Fanling Huang
- School of Software, Tsinghua University, Beijing, China.
- Yangdong Deng
- School of Software, Tsinghua University, Beijing, China.
15. Pérez E, Ventura S. Progressive growing of Generative Adversarial Networks for improving data augmentation and skin cancer diagnosis. Artif Intell Med 2023; 141:102556. PMID: 37295899; DOI: 10.1016/j.artmed.2023.102556.
Abstract
Early melanoma diagnosis is the most important factor in the treatment of skin cancer and can effectively reduce mortality rates. Recently, Generative Adversarial Networks have been used to augment data, prevent overfitting and improve the diagnostic capacity of models. However, their application remains a challenging task due to the high levels of inter- and intra-class variance seen in skin images, limited amounts of data, and model instability. We present a more robust Progressive Growing of Generative Adversarial Networks based on residual learning, a technique known to ease the training of deep networks. The stability of the training process was increased by feeding each block additional inputs from preceding blocks. The architecture is able to produce plausible, photorealistic synthetic 512 × 512 skin images, even with small dermoscopic and non-dermoscopic skin image datasets as problem domains. In this manner, we tackle the lack-of-data and imbalance problems. Additionally, the proposed approach leverages a skin lesion boundary segmentation algorithm and transfer learning to enhance the diagnosis of melanoma. Inception score and Matthews Correlation Coefficient were used to measure the performance of the models. The architecture was evaluated qualitatively and quantitatively through an extensive experimental study on sixteen datasets, illustrating its effectiveness in the diagnosis of melanoma. Finally, it significantly outperformed four state-of-the-art data augmentation techniques applied in five convolutional neural network models. The results indicated that a larger number of trainable parameters does not necessarily yield better performance in melanoma diagnosis.
Affiliation(s)
- Eduardo Pérez
- Andalusian Research Institute in Data Science and Computational Intelligence (DaSCI). University of Córdoba, Córdoba, Spain; Maimónides Biomedical Research Institute of Córdoba (IMIBIC). University of Córdoba, Córdoba, Spain
- Sebastián Ventura
- Andalusian Research Institute in Data Science and Computational Intelligence (DaSCI). University of Córdoba, Córdoba, Spain; Maimónides Biomedical Research Institute of Córdoba (IMIBIC). University of Córdoba, Córdoba, Spain.
16. Zhang M, Lin H, Takagi S, Cao Y, Shahabi C, Xiong L. CSGAN: Modality-Aware Trajectory Generation via Clustering-based Sequence GAN. IEEE Int Conf Mob Data Manag 2023; 2023:148-157. PMID: 37965426; PMCID: PMC10644148; DOI: 10.1109/mdm58254.2023.00032.
Abstract
Human mobility data is useful for various applications in urban planning, transportation, and public health, but collecting and sharing real-world trajectories can be challenging due to privacy and data quality issues. To address these problems, recent research focuses on generating synthetic trajectories, mainly using generative adversarial networks (GANs) trained on real-world trajectories. In this paper, we hypothesize that by explicitly capturing the modality of transportation (e.g., walking, biking, driving), we can generate not only more diverse and representative trajectories for different modalities but also more realistic trajectories that preserve geographical-density, trajectory-level, and transition-level properties by capturing both cross-modality and modality-specific patterns. Towards this end, we propose a Clustering-based Sequence Generative Adversarial Network (CSGAN) that simultaneously clusters the trajectories based on their modalities and learns the essential properties of real-world trajectories to generate realistic and representative synthetic trajectories. To measure the effectiveness of generated trajectories, in addition to typical density- and trajectory-level statistics, we define several new metrics for a comprehensive evaluation, including modality distribution and transition probabilities both globally and within each modality. Our extensive experiments with real-world datasets show the superiority of our model in various metrics over state-of-the-art models.
17. Penhaskashi J, Sekimoto O, Chiappelli F. Permafrost viremia and immune tweening. Bioinformation 2023; 19:685-691. PMID: 37885785; PMCID: PMC10598357; DOI: 10.6026/97320630019685.
Abstract
The immune system, an exquisitely regulated physiological system, utilizes a wide spectrum of soluble factors and multiple cell populations and subpopulations at diverse states of maturation to monitor and protect the organism against foreign organisms. Immune surveillance is ensured by distinguishing self-antigens from self-associated with non-self (e.g., viral) peptides presented by major histocompatibility complexes (MHC). Pathology is often identified as unregulated inflammatory responses (e.g., cytokine storm), or recognizing self as a non-self entity (i.e., auto-immunity). Artificial intelligence (AI), and in particular specific machine learning (ML) paradigms (e.g., Deep Learning [DL]) proffer powerful algorithms to better understand and more accurately predict immune responses, immune regulation and homeostasis, and immune reactivity to challenges (i.e., immune allostasis) by their intrinsic ability to interpret immune parameters, pathways and events by analyzing large amounts of complex data and drawing predictive inferences (i.e., immune tweening). We propose here that DL models play an increasingly significant role in better defining and characterizing immunological surveillance to ancient and novel virus species released by thawing permafrost.
Affiliation(s)
- Jaden Penhaskashi
- Division of West Valley Dental Implant Center, Encino, CA 91316, USA
- Francesco Chiappelli
- Dental Group of Sherman Oaks, CA 91403, USA
- Center for the Health Sciences, UCLA, Los Angeles, CA, USA
18. Sheikh ZA, Singh Y, Singh PK, Gonçalves PJS. Defending the Defender: Adversarial Learning Based Defending Strategy for Learning Based Security Methods in Cyber-Physical Systems (CPS). Sensors (Basel) 2023; 23:5459. PMID: 37420626; DOI: 10.3390/s23125459.
Abstract
Cyber-Physical Systems (CPS) are prone to many security exploitations due to the greater attack surface introduced by their cyber component, which by nature is remotely accessible and not isolated. Attacks, meanwhile, grow in complexity, aiming for more powerful exploits and evasion of detection. Security infringements thus call the real-world applicability of CPS into question. Researchers have been developing new and robust techniques to enhance the security of these systems. Many techniques and security aspects are being considered to build robust security systems; these include attack prevention, attack detection, and attack mitigation as security development techniques, with confidentiality, integrity, and availability among the important security aspects. In this paper, we propose machine learning-based intelligent attack detection strategies, which have evolved as a result of the failure of traditional signature-based techniques to detect zero-day attacks and attacks of a complex nature. Many researchers have evaluated the feasibility of learning models in the security domain and pointed out their capability to detect known as well as unknown attacks (zero-day attacks). However, these learning models are also vulnerable to adversarial attacks such as poisoning attacks, evasion attacks, and exploration attacks. To provide a security mechanism that is both robust and intelligent, we propose an adversarial learning-based defense strategy to ensure CPS security and confer resilience against adversarial attacks. We evaluate the proposed strategy through the implementation of Random Forest (RF), Artificial Neural Network (ANN), and Long Short-Term Memory (LSTM) models on the ToN_IoT network dataset and an adversarial dataset generated through a Generative Adversarial Network (GAN) model.
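Editor's note: at its core, the proposed defense is adversarial training: augment the detector's training set with adversarial samples so it remains accurate under attack. The sketch below uses random-noise perturbations as a crude stand-in for the paper's GAN-generated adversarial dataset, and a Random Forest as one of the evaluated learners; the toy labeling rule and all parameters are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(42)
X_clean = rng.normal(size=(1000, 20))                 # stand-in features
y_clean = (X_clean[:, 0] + X_clean[:, 1] > 0).astype(int)  # toy label rule
X_adv = X_clean + rng.normal(scale=0.3, size=X_clean.shape)  # perturbed
y_adv = y_clean                      # perturbation assumed label-preserving

# Adversarial learning, simplest form: train on clean + adversarial data.
X_train = np.vstack([X_clean, X_adv])
y_train = np.concatenate([y_clean, y_adv])
rf = RandomForestClassifier(n_estimators=200).fit(X_train, y_train)

print("accuracy on adversarial samples:",
      accuracy_score(y_adv, rf.predict(X_adv)))
```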
Affiliation(s)
- Zakir Ahmad Sheikh
- Department of Computer Science and Information Technology, Central University of Jammu, Rahya Suchani, Bagla, Jammu 181143, India
- Yashwant Singh
- Department of Computer Science and Information Technology, Central University of Jammu, Rahya Suchani, Bagla, Jammu 181143, India
- Pradeep Kumar Singh
- STME, Narsee Monjee Institute of Management Studies (NMIMS) Deemed to be University, Maharashtra 400056, India
19
|
Veturi YA, Woof W, Lazebnik T, Moghul I, Woodward-Court P, Wagner SK, Cabral de Guimarães TA, Daich Varela M, Liefers B, Patel PJ, Beck S, Webster AR, Mahroo O, Keane PA, Michaelides M, Balaskas K, Pontikos N. SynthEye: Investigating the Impact of Synthetic Data on Artificial Intelligence-assisted Gene Diagnosis of Inherited Retinal Disease. Ophthalmol Sci 2023; 3:100258. [PMID: 36685715 PMCID: PMC9852957 DOI: 10.1016/j.xops.2022.100258] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/30/2022] [Revised: 11/08/2022] [Accepted: 11/09/2022] [Indexed: 11/23/2022]
Abstract
Purpose Rare disease diagnosis is challenging for medical image-based artificial intelligence due to a natural class imbalance in datasets, leading to biased prediction models. Inherited retinal diseases (IRDs) are a research domain that particularly faces this issue. This study investigates the applicability of synthetic data in improving artificial intelligence-enabled diagnosis of IRDs using generative adversarial networks (GANs). Design Diagnostic study of gene-labeled fundus autofluorescence (FAF) IRD images using deep learning. Participants Moorfields Eye Hospital (MEH) dataset of 15 692 FAF images obtained from 1800 patients with a confirmed genetic diagnosis of 1 of 36 IRD genes. Methods A StyleGAN2 model is trained on the IRD dataset to generate 512 × 512 resolution images. Convolutional neural networks are trained for classification using different synthetically augmented datasets, including real IRD images plus 1800 and 3600 synthetic images, and a fully rebalanced dataset. We also perform an experiment with only synthetic data. All models are compared against a baseline convolutional neural network trained only on real data. Main Outcome Measures We evaluated synthetic data quality using a Visual Turing Test conducted with 4 ophthalmologists from MEH. Synthetic and real images were compared using feature space visualization, similarity analysis to detect memorized images, and the Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) score for no-reference-based quality evaluation. Convolutional neural network diagnostic performance was determined on a held-out test set using the area under the receiver operating characteristic curve (AUROC) and Cohen's kappa (κ). Results An average true recognition rate of 63% and fake recognition rate of 47% were obtained from the Visual Turing Test. Thus, a considerable proportion of the synthetic images were classified as real by clinical experts. Similarity analysis showed that the synthetic images were not copies of the real images, indicating that the GAN did not memorize its training images and was able to generalize. However, BRISQUE score analysis indicated that synthetic images were of significantly lower quality overall than real images (P < 0.05). Comparing the rebalanced model (RB) with the baseline (R), no significant change in the average AUROC and κ was found (R-AUROC = 0.86 [0.85-0.88], RB-AUROC = 0.88 [0.86-0.89], R-κ = 0.51 [0.49-0.53], and RB-κ = 0.52 [0.50-0.54]). The synthetic data trained model (S) achieved performance similar to the baseline (S-AUROC = 0.86 [0.85-0.87], S-κ = 0.48 [0.46-0.50]). Conclusions Synthetic generation of realistic IRD FAF images is feasible. Synthetic data augmentation does not deliver improvements in classification performance. However, synthetic data alone deliver performance similar to real data, and hence may be useful as a proxy for real data. Financial Disclosure(s): Proprietary or commercial disclosure may be found after the references.
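As a hedged illustration of the two evaluation metrics named above, the sketch below computes a macro-averaged one-vs-rest AUROC and Cohen's κ with scikit-learn; the 36-class labels and scores are random stand-ins, not the study's data.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, cohen_kappa_score

rng = np.random.default_rng(0)
n_classes = 36                                # 36 IRD genes in the MEH dataset
y_true = rng.integers(0, n_classes, size=500)
scores = rng.random((500, n_classes))
scores /= scores.sum(axis=1, keepdims=True)   # pseudo-probabilities per class

# One-vs-rest multiclass AUROC and chance-corrected agreement (kappa)
auroc = roc_auc_score(y_true, scores, multi_class="ovr", average="macro")
kappa = cohen_kappa_score(y_true, scores.argmax(axis=1))
print(f"AUROC={auroc:.3f}, kappa={kappa:.3f}")
```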
Collapse
Key Words
- AUROC, area under the receiver operating characteristic curve
- BRISQUE, Blind/Referenceless Image Spatial Quality Evaluator
- Class imbalance
- Clinical Decision-Support Model
- DL, deep learning
- Deep Learning
- FAF, fundus autofluorescence
- FRR, Fake Recognition Rate
- GAN, generative adversarial network
- Generative Adversarial Networks
- IRD, inherited retinal disease
- Inherited Retinal Diseases
- MEH, Moorfields Eye Hospital
- R, baseline model
- RB, rebalanced model
- S, synthetic data trained model
- Synthetic data
- TRR, True Recognition Rate
- UMAP, Uniform Manifold Approximation and Projection
Collapse
Affiliation(s)
- Yoga Advaith Veturi
- University College London Institute of Ophthalmology, University College London, London, UK
- Moorfields Eye Hospital, London, UK
| | - William Woof
- University College London Institute of Ophthalmology, University College London, London, UK
- Moorfields Eye Hospital, London, UK
| | - Teddy Lazebnik
- University College London Cancer Institute, University College London, London, UK
| | | | - Peter Woodward-Court
- University College London Institute of Ophthalmology, University College London, London, UK
- Moorfields Eye Hospital, London, UK
| | - Siegfried K. Wagner
- University College London Institute of Ophthalmology, University College London, London, UK
- Moorfields Eye Hospital, London, UK
| | | | - Malena Daich Varela
- University College London Institute of Ophthalmology, University College London, London, UK
- Moorfields Eye Hospital, London, UK
| | | | | | - Stephan Beck
- University College London Cancer Institute, University College London, London, UK
| | - Andrew R. Webster
- University College London Institute of Ophthalmology, University College London, London, UK
- Moorfields Eye Hospital, London, UK
| | - Omar Mahroo
- University College London Institute of Ophthalmology, University College London, London, UK
- Moorfields Eye Hospital, London, UK
| | - Pearse A. Keane
- University College London Institute of Ophthalmology, University College London, London, UK
- Moorfields Eye Hospital, London, UK
| | - Michel Michaelides
- University College London Institute of Ophthalmology, University College London, London, UK
- Moorfields Eye Hospital, London, UK
| | - Konstantinos Balaskas
- University College London Institute of Ophthalmology, University College London, London, UK
- Moorfields Eye Hospital, London, UK
| | - Nikolas Pontikos
- University College London Institute of Ophthalmology, University College London, London, UK
- Moorfields Eye Hospital, London, UK
| |
Collapse
|
20
|
Kim HS, Ha EG, Lee A, Choi YJ, Jeon KJ, Han SS, Lee C. Refinement of image quality in panoramic radiography using a generative adversarial network. Dentomaxillofac Radiol 2023:20230007. [PMID: 37129509 DOI: 10.1259/dmfr.20230007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/03/2023] Open
Abstract
OBJECTIVE We aimed to develop and assess the clinical usefulness of a generative adversarial network (GAN) model for improving image quality in panoramic radiography. METHODS Panoramic radiographs obtained at Yonsei University Dental Hospital were randomly selected for study inclusion (n = 100). Datasets with degraded image quality (n = 400) were prepared using four different processing methods: blur, noise, blur with noise, and blur in the anterior teeth region. The images were distributed to the training and test datasets in a ratio of 9:1 for each group. The Pix2Pix GAN model was trained using pairs of the original and degraded image datasets for 100 epochs. The peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) were obtained for the test dataset, and two oral and maxillofacial radiologists rated the quality of clinical images. RESULTS Among the degraded images, the GAN model enabled the greatest improvement in those with blur in the region of the anterior teeth but was least effective in improving images exhibiting blur with noise (PSNR, 36.27 > 32.74; SSIM, 0.90 > 0.82). While the mean clinical image quality score of the original radiographs was 44.6 out of 46.0, the highest and lowest predicted scores were observed in the blur (45.2) and noise (36.0) groups. CONCLUSION The GAN model developed in this study has the potential to improve panoramic radiographs with degraded image quality, both quantitatively and qualitatively. As the model performs better in refining blurred images, further research is required to identify the most effective methods for handling noisy images.
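For reference, the sketch below shows how the two reported quality metrics (PSNR and SSIM) can be computed with scikit-image; the image pair is a random stand-in, not a panoramic radiograph.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(0)
original = rng.random((256, 512)).astype(np.float32)   # stand-in "original"
degraded = np.clip(original + 0.05 * rng.standard_normal(original.shape),
                   0, 1).astype(np.float32)            # stand-in "restored"

# Higher PSNR (dB) and SSIM (0..1) mean the restoration is closer to the original
psnr = peak_signal_noise_ratio(original, degraded, data_range=1.0)
ssim = structural_similarity(original, degraded, data_range=1.0)
print(f"PSNR={psnr:.2f} dB, SSIM={ssim:.3f}")
```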
Collapse
Affiliation(s)
- Hak-Sun Kim
- Department of Oral and Maxillofacial Radiology, Yonsei University College of Dentistry, Seoul, South Korea
| | - Eun-Gyu Ha
- Department of Electrical and Electronic Engineering, Yonsei University College of Engineering, Seoul, South Korea
| | - Ari Lee
- Department of Oral and Maxillofacial Radiology, Yonsei University College of Dentistry, Seoul, South Korea
| | - Yoon Joo Choi
- Department of Oral and Maxillofacial Radiology, Yonsei University College of Dentistry, Seoul, South Korea
| | - Kug Jin Jeon
- Department of Oral and Maxillofacial Radiology, Yonsei University College of Dentistry, Seoul, South Korea
| | - Sang-Sun Han
- Department of Oral and Maxillofacial Radiology, Yonsei University College of Dentistry, Seoul, South Korea
| | - Chena Lee
- Department of Oral and Maxillofacial Radiology, Yonsei University College of Dentistry, Seoul, South Korea
- Institute for Innovation in Digital Healthcare, Yonsei University, Seoul, South Korea
| |
Collapse
|
21
|
Zhu W, Qiu P, Farazi M, Nandakumar K, Dumitrascu OM, Wang Y. OPTIMAL TRANSPORT GUIDED UNSUPERVISED LEARNING FOR ENHANCING LOW-QUALITY RETINAL IMAGES. Proc IEEE Int Symp Biomed Imaging 2023; 2023:10.1109/isbi53787.2023.10230719. [PMID: 37736573 PMCID: PMC10513403 DOI: 10.1109/isbi53787.2023.10230719] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/23/2023]
Abstract
Real-world non-mydriatic retinal fundus photography is prone to artifacts, imperfections, and low quality when certain ocular or systemic co-morbidities exist. Artifacts may result in inaccuracy or ambiguity in clinical diagnoses. In this paper, we propose a simple but effective end-to-end framework for enhancing poor-quality retinal fundus images. Leveraging optimal transport theory, we propose an unpaired image-to-image translation scheme for transporting low-quality images to their high-quality counterparts. We theoretically prove that a Generative Adversarial Network (GAN) model with one generator and one discriminator is sufficient for this task. Furthermore, to mitigate the inconsistency of information between the low-quality images and their enhancements, an information consistency mechanism is proposed to maximally maintain structural consistency (optic discs, blood vessels, lesions) between the source and enhanced domains. Extensive experiments were conducted on the EyeQ dataset to demonstrate the superiority of our proposed method perceptually and quantitatively.
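A hedged sketch of the general pattern of pairing an adversarial objective with a structure-preserving term; the plain L1 penalty and the weight `lam` below are illustrative assumptions, not the paper's exact information-consistency mechanism.

```python
import torch
import torch.nn.functional as F

def generator_loss(d_fake_logits, enhanced, low_quality, lam=10.0):
    """Adversarial term plus a structure-preservation term (illustrative)."""
    # Non-saturating adversarial loss: try to fool the discriminator
    adv = F.binary_cross_entropy_with_logits(
        d_fake_logits, torch.ones_like(d_fake_logits))
    # Consistency term: keep the enhancement close to the input so
    # vessels, discs, and lesions are not hallucinated away
    consistency = F.l1_loss(enhanced, low_quality)
    return adv + lam * consistency
```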
Collapse
Affiliation(s)
- Wenhui Zhu
- School of Computing and Augmented Intelligence, Arizona State University, AZ 85281, USA
| | - Peijie Qiu
- McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO 63130, USA
| | - Mohammad Farazi
- School of Computing and Augmented Intelligence, Arizona State University, AZ 85281, USA
| | | | | | - Yalin Wang
- School of Computing and Augmented Intelligence, Arizona State University, AZ 85281, USA
| |
Collapse
|
22
|
Huang Z, Wang J, Lu X, Mohd Zain A, Yu G. scGGAN: single-cell RNA-seq imputation by graph-based generative adversarial network. Brief Bioinform 2023; 24:7024714. [PMID: 36733262 DOI: 10.1093/bib/bbad040] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/08/2022] [Revised: 12/21/2022] [Accepted: 01/18/2023] [Indexed: 02/04/2023] Open
Abstract
Single-cell RNA sequencing (scRNA-seq) data typically contain a large number of missing values, which often results in the loss of critical gene signaling information and seriously limits downstream analysis. Deep learning-based imputation methods can often handle scRNA-seq data better than shallow ones, but most of them do not consider the inherent relations between genes, even though the expression of a gene is often regulated by other genes. Therefore, it is essential to impute scRNA-seq data by taking gene-to-gene relations into account. We propose a novel model (named scGGAN) to impute scRNA-seq data that learns the gene-to-gene relations with Graph Convolutional Networks (GCN) and the global scRNA-seq data distribution with Generative Adversarial Networks (GAN). scGGAN first leverages single-cell and bulk genomics data to explore inherent relations between genes and builds a more compact gene relation network to jointly capture the homogeneous and heterogeneous information. Then, it constructs a GCN-based GAN model that integrates the scRNA-seq data, gene sequence data, and gene relation network for generating scRNA-seq data, and trains the model through adversarial learning. Finally, it utilizes the data generated by the trained GCN-based GAN model to impute the scRNA-seq data. Experiments on simulated and real scRNA-seq datasets show that scGGAN can effectively identify dropout events, recover biologically meaningful expressions, determine subcellular states and types, and improve differential expression analysis and temporal dynamics analysis. Ablation experiments confirm that both the gene relation network and the gene sequence data help the imputation of scRNA-seq data.
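To make the graph component concrete, here is a minimal sketch of one standard GCN propagation step over a gene-relation graph, following H' = ReLU(D^{-1/2}(A+I)D^{-1/2} H W); the graph, feature sizes, and weights are random stand-ins, not scGGAN's architecture.

```python
import torch

def gcn_layer(adj, feats, weight):
    """One GCN step with self-loops and symmetric degree normalization."""
    a_hat = adj + torch.eye(adj.size(0))                 # add self-loops
    d_inv_sqrt = a_hat.sum(1).pow(-0.5)                  # D^{-1/2}
    a_norm = d_inv_sqrt[:, None] * a_hat * d_inv_sqrt[None, :]
    return torch.relu(a_norm @ feats @ weight)

n_genes, in_dim, out_dim = 100, 16, 8
adj = (torch.rand(n_genes, n_genes) > 0.9).float()      # stand-in relations
adj = ((adj + adj.T) > 0).float()                        # make it symmetric
feats = torch.randn(n_genes, in_dim)                     # stand-in expression
weight = torch.randn(in_dim, out_dim)
print(gcn_layer(adj, feats, weight).shape)               # torch.Size([100, 8])
```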
Collapse
Affiliation(s)
- Zimo Huang
- MEng student at School of Software, Shandong University, China
| | - Jun Wang
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, China
| | - Xudong Lu
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, China
| | | | - Guoxian Yu
- School of Software, Shandong University, China
| |
Collapse
|
23
|
Qin J, Gao F, Wang Z, Wong DC, Zhao Z, Relton SD, Fang H. A novel temporal generative adversarial network for electrocardiography anomaly detection. Artif Intell Med 2023; 136:102489. [PMID: 36710067 DOI: 10.1016/j.artmed.2023.102489] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/05/2022] [Revised: 11/28/2022] [Accepted: 01/09/2023] [Indexed: 01/15/2023]
Abstract
Cardiac abnormality detection from Electrocardiogram (ECG) signals is a common task for cardiologists. To facilitate efficient and objective detection, automated ECG classification methods based on deep learning have been developed in recent years. Despite their impressive performance, these methods perform poorly when presented with cardiac abnormalities that are not well represented, or absent, in the training data. To this end, we propose a novel one-class-classification-based ECG anomaly detection generative adversarial network (GAN). Specifically, we embedded a Bi-directional Long Short-Term Memory (Bi-LSTM) layer into a GAN architecture and used a mini-batch discrimination training strategy in the discriminator to synthesize ECG signals. Our method generates samples that match the data distribution of normal signals from the healthy group so that a generalised anomaly detector can be built reliably. The experimental results demonstrate that our method outperforms several state-of-the-art semi-supervised-learning-based ECG anomaly detection algorithms and robustly detects the unknown anomaly class in the MIT-BIH arrhythmia database. Experiments show that our method achieves an accuracy of 95.5% and an AUC of 95.9%, outperforming the most competitive baseline by 0.7% and 1.7%, respectively. Our method may prove to be a helpful diagnostic tool for cardiologists identifying arrhythmias.
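A hedged sketch of the kind of Bi-LSTM discriminator described above, scoring 1-D ECG windows as real or fake; the layer sizes and the 360 Hz one-second window are hypothetical, and mini-batch discrimination is omitted for brevity.

```python
import torch
import torch.nn as nn

class BiLSTMDiscriminator(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        # Bidirectional LSTM reads the ECG window forward and backward
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden,
                            batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)   # real-vs-fake logit

    def forward(self, x):                      # x: (batch, time, 1)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])           # score from the last time step

d = BiLSTMDiscriminator()
ecg = torch.randn(8, 360, 1)                   # 8 stand-in windows at 360 Hz
print(d(ecg).shape)                            # torch.Size([8, 1])
```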
Collapse
Affiliation(s)
- Jing Qin
- College of Software Engineering, Dalian University, Dalian, China.
| | - Fujie Gao
- College of Information Engineering, Dalian University, Dalian, China.
| | - Zumin Wang
- College of Information Engineering, Dalian University, Dalian, China.
| | - David C Wong
- Department of Computer Science and Centre for Health Informatics, University of Manchester, Manchester, UK.
| | - Zhibin Zhao
- School of Mechanical Engineering, Xi'an Jiaotong University, Xi'an, China.
| | - Samuel D Relton
- Leeds Institute of Health Sciences, University of Leeds, Leeds, UK.
| | - Hui Fang
- Department of Computer Science, Loughborough University, Loughborough, UK.
| |
Collapse
|
24
|
Kim H, Kim C, Kim H, Cho S, Hwang E. Panoptic blind image inpainting. ISA Trans 2023; 132:208-221. [PMID: 36372606 DOI: 10.1016/j.isatra.2022.10.030] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/27/2021] [Revised: 10/24/2022] [Accepted: 10/24/2022] [Indexed: 06/16/2023]
Abstract
In autonomous driving, scene understanding is a critical task for recognizing the driving environment and dangerous situations. Here, a variety of factors, including foreign objects on the lens, cloudy weather, and light blur, often reduce the accuracy of scene recognition. In this paper, we propose a new blind image inpainting model that accurately reconstructs images in a real environment where there is no ground truth for restoration. To this end, we first introduce a panoptic map to represent content information in detail and design an encoder-decoder structure to predict the panoptic map and the corrupted region mask. Then, we construct an image inpainting model that utilizes the information of the predicted map. Lastly, we present a mask refinement process to improve the accuracy of map prediction. To evaluate the effectiveness of the proposed model, we compared the restoration results of various inpainting methods on the Cityscapes and COCO datasets. Experimental results show that the proposed model outperforms other blind image inpainting models in terms of L1/L2 losses, PSNR, and SSIM, and achieves performance similar to other image inpainting techniques that utilize additional information.
Collapse
Affiliation(s)
- Hyungjoon Kim
- School of Computer Science, Semyung University, Jecheon, Republic of Korea.
| | - ChungIl Kim
- Korea Electronics Technology Institute, Seongnam, Republic of Korea.
| | - Hyeonwoo Kim
- School of Electrical Engineering, Korea University, Seoul, Republic of Korea.
| | - Seongkuk Cho
- School of Electrical Engineering, Korea University, Seoul, Republic of Korea.
| | - Eenjun Hwang
- School of Electrical Engineering, Korea University, Seoul, Republic of Korea.
| |
Collapse
|
25
|
Han T, Wu J, Luo W, Wang H, Jin Z, Qu L. Review of Generative Adversarial Networks in mono- and cross-modal biomedical image registration. Front Neuroinform 2022; 16:933230. [PMID: 36483313 PMCID: PMC9724825 DOI: 10.3389/fninf.2022.933230] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2022] [Accepted: 10/13/2022] [Indexed: 09/19/2023] Open
Abstract
Biomedical image registration refers to aligning corresponding anatomical structures among different images, which is critical to many tasks, such as brain atlas building, tumor growth monitoring, and image fusion-based medical diagnosis. However, high-throughput biomedical image registration remains challenging due to inherent variations in the intensity, texture, and anatomy resulting from different imaging modalities, different sample preparation methods, or different developmental stages of the imaged subject. Recently, Generative Adversarial Networks (GAN) have attracted increasing interest in both mono- and cross-modal biomedical image registrations due to their special ability to eliminate the modal variance and their adversarial training strategy. This paper provides a comprehensive survey of the GAN-based mono- and cross-modal biomedical image registration methods. According to the different implementation strategies, we organize the GAN-based mono- and cross-modal biomedical image registration methods into four categories: modality translation, symmetric learning, adversarial strategies, and joint training. The key concepts, the main contributions, and the advantages and disadvantages of the different strategies are summarized and discussed. Finally, we analyze the statistics of all the cited works from different points of view and reveal future trends for GAN-based biomedical image registration studies.
Collapse
Affiliation(s)
- Tingting Han
- Ministry of Education Key Laboratory of Intelligent Computing and Signal Processing, Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University, Hefei, China
| | - Jun Wu
- Ministry of Education Key Laboratory of Intelligent Computing and Signal Processing, Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University, Hefei, China
| | - Wenting Luo
- Ministry of Education Key Laboratory of Intelligent Computing and Signal Processing, Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University, Hefei, China
| | - Huiming Wang
- Ministry of Education Key Laboratory of Intelligent Computing and Signal Processing, Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University, Hefei, China
| | - Zhe Jin
- School of Artificial Intelligence, Anhui University, Hefei, China
| | - Lei Qu
- Ministry of Education Key Laboratory of Intelligent Computing and Signal Processing, Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University, Hefei, China
- Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
- SEU-ALLEN Joint Center, Institute for Brain and Intelligence, Southeast University, Nanjing, China
| |
Collapse
|
26
|
Pattanaik A, Balabantaray RC. Enhancement of license plate recognition performance using Xception with Mish activation function. Multimed Tools Appl 2022; 82:16793-16815. [PMID: 36258895 PMCID: PMC9560886 DOI: 10.1007/s11042-022-13922-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 12/15/2021] [Revised: 04/12/2022] [Accepted: 09/12/2022] [Indexed: 06/16/2023]
Abstract
The current breakthroughs in the highway research sector have resulted in greater awareness of and focus on the construction of an effective Intelligent Transportation System (ITS). One of the most actively researched areas is Vehicle License Plate Recognition (VLPR), concerned with determining the characters contained in a vehicle's License Plate (LP). Many existing methods have been used to deal with different environmental complexity factors but offer limited support for motion deblurring. The aim of our research is to provide an effective and robust solution for recognizing the characters in license plates under complex environmental conditions. Our proposed approach is capable of handling not only motion-blurred LPs but also characters in various types of low-resolution and blurred license plates, illegible vehicle plates, license plates in different weather, light, and traffic conditions, as well as on high-speed vehicles. Our research provides a series of approaches to execute the different steps of the character recognition process. The proposed approach introduces a Generative Adversarial Network (GAN) with a Discrete Cosine Transform (DCT) discriminator (DCTGAN), a joint image super-resolution and deblurring approach that uses the discrete cosine transform, with low computational complexity, to remove various types of blur and complexities from license plates. License plates are detected using the Improved Bernsen Algorithm (IBA) with Connected Component Analysis (CCA). Finally, with the aid of the proposed Xception model with transfer learning, the characters in LPs are recognized. No segmentation technique is used to split the characters. Four benchmark datasets, namely the Stanford Cars, FZU Cars, HumAIn 2019 Challenge, and Application-Oriented License Plate (AOLP) datasets, as well as our own collected dataset, were used for validation of the proposed algorithm. Our dataset includes images of vehicles captured in different lighting and weather conditions such as sunny, rainy, cloudy, blurred, low illumination, foggy, and night. The suggested strategy outperforms current best practices both quantitatively and qualitatively.
Collapse
Affiliation(s)
- Anmol Pattanaik
- International Institute of Information Technology Bhubaneswar, Odisha, India
| | | |
Collapse
|
27
|
Abstract
A continuing outbreak of the pneumonia-related disease caused by the novel coronavirus has been recorded worldwide and has become a global health problem. This research aims to generate a constructive training data set for a neural network to detect COVID-19 from X-ray images. The generation of medical images is a challenging issue in the field of deep learning. Medical image datasets are frequently unbalanced; using such datasets to train a deep neural network model to correctly classify medical conditions typically leads to over-fitting on majority class samples. Data augmentation is commonly used on training data to expand the dataset, but it may not be beneficial in medical domains with limited data. This paper proposes a data generation model using a Deep Convolutional Generative Adversarial Network (DCGAN), which generates fake instances with properties comparable to the original data. The model achieved a Fréchet Inception Distance (FID) of 23.78, indicating that the generated images are close to the original data. Deep transfer learning-based models VGG-16, InceptionV3, and MobileNet were chosen as the backbone for COVID-19 detection. The present study aims to increase the dataset using the DCGAN data augmentation technique to improve classifier performance.
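As a hedged reference for the quality metric quoted above, the sketch below computes the standard FID between two sets of Inception activations (here random stand-ins): the Fréchet distance between Gaussians fitted to each set.

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(act_real, act_fake):
    """FID = ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 * sqrt(S1 @ S2))."""
    mu1, mu2 = act_real.mean(0), act_fake.mean(0)
    s1 = np.cov(act_real, rowvar=False)
    s2 = np.cov(act_fake, rowvar=False)
    covmean = sqrtm(s1 @ s2)
    if np.iscomplexobj(covmean):       # drop tiny imaginary parts from sqrtm
        covmean = covmean.real
    return float(((mu1 - mu2) ** 2).sum() + np.trace(s1 + s2 - 2 * covmean))

rng = np.random.default_rng(0)
# Stand-ins for Inception feature vectors of real and generated X-rays
print(fid(rng.standard_normal((200, 64)), rng.standard_normal((200, 64))))
```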
Collapse
Affiliation(s)
| | - Ravi Subban
- Dept of Computer Science, School of Engineering and Technology, Pondicherry University, India
| | - Nelson Kennedy Babu C
- Computer Science and Engineering, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai, India
| |
Collapse
|
28
|
Dunphy K, Fekri MN, Grolinger K, Sadhu A. Data Augmentation for Deep-Learning-Based Multiclass Structural Damage Detection Using Limited Information. Sensors (Basel) 2022; 22:6193. [PMID: 36015955 PMCID: PMC9412832 DOI: 10.3390/s22166193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 07/10/2022] [Revised: 08/05/2022] [Accepted: 08/16/2022] [Indexed: 06/15/2023]
Abstract
The deterioration of infrastructure health has become more prevalent on a global scale during the 21st century. Aging infrastructure, as well as structures damaged by natural disasters, has prompted the research community to improve state-of-the-art methodologies for conducting Structural Health Monitoring (SHM). The necessity for efficient SHM arises from the hazards damaged infrastructure imposes, often resulting in structural collapse, economic loss, and human fatalities. Furthermore, day-to-day operations in affected areas are limited until an inspection is performed to assess the level of damage experienced by the structure and determine the required rehabilitation. However, human-based inspections are often labor-intensive, inefficient, subjective, and restricted to accessible site locations, which ultimately limits our ability to collect large amounts of data from inspection sites. Though Deep-Learning (DL) methods have been heavily explored in the past decade to rectify the limitations of traditional methods and automate structural inspection, data scarcity remains prevalent within the field of SHM. The absence of sufficiently large, balanced, and generalized databases to train DL-based models often results in inaccurate and biased damage predictions. Recently, Generative Adversarial Networks (GANs) have received attention from the SHM community as a data augmentation tool by which a training dataset can be expanded to improve damage classification. However, no existing studies within the SHM field investigate the performance of DL-based multiclass damage identification using synthetic data generated from GANs. Therefore, this paper investigates the performance of a convolutional neural network architecture using synthetic images generated from a GAN for multiclass damage detection of concrete surfaces. Through this study, it was determined that the average classification performance of the proposed CNN on hybrid datasets decreased by 10.6% and 7.4% for the validation and testing datasets when compared to the same model trained entirely on real samples. Moreover, each model's performance decreased on average by 1.6% when comparing a singular model trained with real samples and the same model trained with both real and synthetic samples for a given training configuration. The correlation between classification accuracy and the amount and diversity of synthetic data used for data augmentation is quantified, and the effect of using limited data to train existing GAN architectures is investigated. It was observed that the diversity of the samples decreases and correlation increases with the increase in the number of synthetic samples.
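A hedged sketch of one way to assemble the hybrid real-plus-synthetic training sets studied above, with a controllable synthetic share; the arrays, the helper name, and the 30% default are illustrative assumptions, not the paper's protocol.

```python
import numpy as np

def make_hybrid(real_x, real_y, synth_x, synth_y, synth_frac=0.3, seed=0):
    """Return a shuffled training set whose synthetic share is synth_frac."""
    rng = np.random.default_rng(seed)
    # Solve n_synth / (n_real + n_synth) = synth_frac for n_synth
    n_synth = int(len(real_x) * synth_frac / (1.0 - synth_frac))
    idx = rng.choice(len(synth_x), size=min(n_synth, len(synth_x)),
                     replace=False)
    x = np.concatenate([real_x, synth_x[idx]])
    y = np.concatenate([real_y, synth_y[idx]])
    order = rng.permutation(len(x))            # shuffle real and synthetic
    return x[order], y[order]

rng = np.random.default_rng(1)                 # stand-in feature arrays
real_x, real_y = rng.random((700, 8)), rng.integers(0, 4, 700)
synth_x, synth_y = rng.random((300, 8)), rng.integers(0, 4, 300)
x, y = make_hybrid(real_x, real_y, synth_x, synth_y)
print(x.shape, y.shape)                        # (1000, 8) (1000,)
```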
Collapse
Affiliation(s)
- Kyle Dunphy
- Department of Civil and Environmental Engineering, Western University, London, ON N6A 3K7, Canada
| | - Mohammad Navid Fekri
- Department of Electrical and Computer Engineering, Western University, London, ON N6A 3K7, Canada
| | - Katarina Grolinger
- Department of Electrical and Computer Engineering, Western University, London, ON N6A 3K7, Canada
| | - Ayan Sadhu
- Department of Civil and Environmental Engineering, Western University, London, ON N6A 3K7, Canada
| |
Collapse
|
29
|
Xiong YT, Zeng W, Xu L, Guo JX, Liu C, Chen JT, Du XY, Tang W. Virtual reconstruction of midfacial bone defect based on generative adversarial network. Head Face Med 2022; 18:19. [PMID: 35761334 PMCID: PMC9235085 DOI: 10.1186/s13005-022-00325-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2022] [Accepted: 05/19/2022] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND The study aims to evaluate the accuracy of a generative adversarial network (GAN) for reconstructing bony midfacial defects. METHODS According to anatomy, the bony midface was divided into five subunit structural regions, and artificial defects were manually created on the corresponding CT images. The GAN was trained to restore the artificial defects to their previous normal shape and then tested. Clinical defects were reconstructed by the trained GAN, with midspan defects used for qualitative evaluation and unilateral defects used for quantitative evaluation. The cosine similarity and the mean error were used to evaluate the accuracy of reconstruction. The Mann-Whitney U test was used to detect whether reconstruction errors were consistent between artificial and unilateral clinical defects. RESULTS This study included 518 normal CT datasets, with 415 in the training set and 103 in the testing set, and 17 real patient datasets, with 2 midspan defects and 15 unilateral defects. Reconstruction of midspan clinical defects, as assessed by experts, was acceptable. The cosine similarity in the reconstruction of artificial defects and unilateral clinical defects was 0.97 ± 0.01 and 0.96 ± 0.01, respectively (P = 0.695). The mean error in the reconstruction of artificial defects and unilateral clinical defects was 0.59 ± 0.31 mm and 0.48 ± 0.08 mm, respectively (P = 0.09). CONCLUSION GAN-based virtual reconstruction technology reached high accuracy on the testing set, and statistical tests suggest that it can achieve similar results on real patient data. This study preliminarily addresses the problem of bony midfacial defects without a reference.
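For illustration, the sketch below computes the two accuracy measures named above on flattened arrays; note the study's mean error is measured in millimetres, whereas this stand-in version is a simple voxelwise variant on random volumes.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two flattened volumes."""
    a, b = a.ravel().astype(float), b.ravel().astype(float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def mean_abs_error(a, b):
    """Simple voxelwise mean error between reconstruction and truth."""
    return float(np.mean(np.abs(a.astype(float) - b.astype(float))))

rng = np.random.default_rng(0)
truth = rng.random((64, 64, 64))                      # stand-in CT volume
recon = truth + 0.01 * rng.standard_normal(truth.shape)
print(cosine_similarity(truth, recon), mean_abs_error(truth, recon))
```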
Collapse
Affiliation(s)
- Yu-Tao Xiong
- State Key Laboratory of Oral Diseases and National Clinical Research Centre for Oral Diseases and Department of Oral and Maxillofacial Surgery, West China Hospital of Stomatology, Sichuan University, No.14, 3rd section of Ren Min Nan Road, Chengdu, 610041, China
| | - Wei Zeng
- State Key Laboratory of Oral Diseases and National Clinical Research Centre for Oral Diseases and Department of Oral and Maxillofacial Surgery, West China Hospital of Stomatology, Sichuan University, No.14, 3rd section of Ren Min Nan Road, Chengdu, 610041, China
| | - Lei Xu
- Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu, 610065, China
| | - Ji-Xiang Guo
- Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu, 610065, China
| | - Chang Liu
- State Key Laboratory of Oral Diseases and National Clinical Research Centre for Oral Diseases and Department of Oral and Maxillofacial Surgery, West China Hospital of Stomatology, Sichuan University, No.14, 3rd section of Ren Min Nan Road, Chengdu, 610041, China
| | - Jun-Tian Chen
- State Key Laboratory of Oral Diseases and National Clinical Research Centre for Oral Diseases and Department of Oral and Maxillofacial Surgery, West China Hospital of Stomatology, Sichuan University, No.14, 3rd section of Ren Min Nan Road, Chengdu, 610041, China
| | - Xin-Ya Du
- Department of Stomatology, the People's Hospital of Longhua, Shenzhen, 518109, China
| | - Wei Tang
- State Key Laboratory of Oral Diseases and National Clinical Research Centre for Oral Diseases and Department of Oral and Maxillofacial Surgery, West China Hospital of Stomatology, Sichuan University, No.14, 3rd section of Ren Min Nan Road, Chengdu, 610041, China.
| |
Collapse
|
30
|
Zhang Y, Wa S, Zhang L, Lv C. Automatic Plant Disease Detection Based on Tranvolution Detection Network With GAN Modules Using Leaf Images. Front Plant Sci 2022; 13:875693. [PMID: 35693164 PMCID: PMC9178295 DOI: 10.3389/fpls.2022.875693] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/14/2022] [Accepted: 04/27/2022] [Indexed: 05/31/2023]
Abstract
The detection of plant disease is of vital importance in practical agricultural production. It monitors plant growth and health and helps ensure that agricultural planting and harvesting proceed successfully. In recent decades, the maturation of computer vision technology has provided more possibilities for implementing plant disease detection. Nonetheless, detecting plant diseases is typically hindered by factors such as variations in illumination and weather when capturing images and the number of leaves or organs containing diseases in one image. Meanwhile, traditional deep learning-based algorithms suffer from multiple deficiencies in this area of research: (1) training models necessitates a significant investment in hardware and a large amount of data; (2) due to their slow inference speed, models are hard to adapt to practical production; (3) models are unable to generalize well enough. Given these impediments, this study proposes a Tranvolution detection network with GAN modules for plant disease detection. First, a generative model was added ahead of the backbone, and GAN models were added to the attention extraction module to construct the GAN modules. Then, the Transformer was modified and incorporated with the CNN, yielding the proposed Tranvolution architecture. Finally, we validated the performance of different combinations of generative models. Experimental outcomes demonstrated that the proposed method achieved 51.7% precision, 48.1% recall, and 50.3% mAP. Furthermore, the SAGAN model was the best in the attention extraction module, while WGAN performed best for image augmentation. Additionally, we deployed the proposed model on the Hbird E203 and devised an intelligent agricultural robot to put the model into practical agricultural use.
Collapse
Affiliation(s)
- Yan Zhang
- College of Information and Electrical Engineering, China Agricultural University, Beijing, China
| | - Shiyun Wa
- College of Information and Electrical Engineering, China Agricultural University, Beijing, China
| | - Longxiang Zhang
- College of Science, China Agricultural University, Beijing, China
| | - Chunli Lv
- College of Information and Electrical Engineering, China Agricultural University, Beijing, China
| |
Collapse
|
31
|
Kaabachi B, Despraz J, Meurers T, Prasser F, Raisaro JL. Generation and Evaluation of Synthetic Data in a University Hospital Setting. Stud Health Technol Inform 2022; 294:141-142. [PMID: 35612040 DOI: 10.3233/shti220420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
In this study, we propose a unified evaluation framework for systematically assessing the utility-privacy trade-off of synthetic data generation (SDG) models. These SDG models are adapted to deal with longitudinal or tabular data stemming from electronic health records (EHR) containing both discrete and numeric features. Our evaluation framework considers different data sharing scenarios and attacker models.
Collapse
Affiliation(s)
- Bayrem Kaabachi
- Centre Hospitalier Universitaire Vaudois (CHUV), Lausanne, Switzerland
- Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland
| | - Jérémie Despraz
- Centre Hospitalier Universitaire Vaudois (CHUV), Lausanne, Switzerland
| | - Thierry Meurers
- Berlin Institute of Health @ Charité - Universitätsmedizin Berlin, Germany
| | - Fabian Prasser
- Berlin Institute of Health @ Charité - Universitätsmedizin Berlin, Germany
| | | |
Collapse
|
32
|
Kossen T, Hirzel MA, Madai VI, Boenisch F, Hennemuth A, Hildebrand K, Pokutta S, Sharma K, Hilbert A, Sobesky J, Galinovic I, Khalil AA, Fiebach JB, Frey D. Toward Sharing Brain Images: Differentially Private TOF-MRA Images With Segmentation Labels Using Generative Adversarial Networks. Front Artif Intell 2022; 5:813842. [PMID: 35586223 PMCID: PMC9108458 DOI: 10.3389/frai.2022.813842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/12/2021] [Accepted: 03/31/2022] [Indexed: 12/03/2022] Open
Abstract
Sharing labeled data is crucial to acquire large datasets for various Deep Learning applications. In medical imaging, this is often not feasible due to privacy regulations. Whereas anonymization would be a solution, standard techniques have been shown to be partially reversible. Here, synthetic data using a Generative Adversarial Network (GAN) with differential privacy guarantees could be a solution to ensure the patient's privacy while maintaining the predictive properties of the data. In this study, we implemented a Wasserstein GAN (WGAN) with and without differential privacy guarantees to generate privacy-preserving labeled Time-of-Flight Magnetic Resonance Angiography (TOF-MRA) image patches for brain vessel segmentation. The synthesized image-label pairs were used to train a U-net which was evaluated in terms of the segmentation performance on real patient images from two different datasets. Additionally, the Fréchet Inception Distance (FID) was calculated between the generated images and the real images to assess their similarity. During the evaluation using the U-Net and the FID, we explored the effect of different levels of privacy which was represented by the parameter ϵ. With stricter privacy guarantees, the segmentation performance and the similarity to the real patient images in terms of FID decreased. Our best segmentation model, trained on synthetic and private data, achieved a Dice Similarity Coefficient (DSC) of 0.75 for ϵ = 7.4 compared to 0.84 for ϵ = ∞ in a brain vessel segmentation paradigm (DSC of 0.69 and 0.88 on the second test set, respectively). We identified a threshold of ϵ <5 for which the performance (DSC <0.61) became unstable and not usable. Our synthesized labeled TOF-MRA images with strict privacy guarantees retained predictive properties necessary for segmenting the brain vessels. Although further research is warranted regarding generalizability to other imaging modalities and performance improvement, our results mark an encouraging first step for privacy-preserving data sharing in medical imaging.
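As a hedged aside, the sketch below computes the Dice Similarity Coefficient (DSC) used above to score the vessel segmentations; the binary masks are random stand-ins, not TOF-MRA labels.

```python
import numpy as np

def dice(pred, target, eps=1e-7):
    """DSC = 2 * |pred AND target| / (|pred| + |target|), in [0, 1]."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

rng = np.random.default_rng(0)
pred = rng.random((128, 128)) > 0.5      # stand-in predicted vessel mask
target = rng.random((128, 128)) > 0.5    # stand-in ground-truth mask
print(f"DSC = {dice(pred, target):.3f}")
```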
Collapse
Affiliation(s)
- Tabea Kossen
- CLAIM-Charité Lab for AI in Medicine, Charité Universitätsmedizin Berlin, Berlin, Germany
- Department of Computer Engineering and Microelectronics, Computer Vision & Remote Sensing, Technical University Berlin, Berlin, Germany
| | - Manuel A. Hirzel
- CLAIM-Charité Lab for AI in Medicine, Charité Universitätsmedizin Berlin, Berlin, Germany
| | - Vince I. Madai
- CLAIM-Charité Lab for AI in Medicine, Charité Universitätsmedizin Berlin, Berlin, Germany
- QUEST Center for Responsible Research, Berlin Institute of Health (BIH), Charité-Universitätsmedizin Berlin, Berlin, Germany
- Faculty of Computing, Engineering and the Built Environment, School of Computing and Digital Technology, Birmingham City University, Birmingham, United Kingdom
| | | | - Anja Hennemuth
- Department of Computer Engineering and Microelectronics, Computer Vision & Remote Sensing, Technical University Berlin, Berlin, Germany
- Institute for Imaging Science and Computational Modelling in Cardiovascular Medicine, Charité Universitätsmedizin Berlin, Berlin, Germany
- Fraunhofer MEVIS, Bremen, Germany
| | - Kristian Hildebrand
- Department VI Computer Science and Media, Berlin University of Applied Sciences and Technology, Berlin, Germany
| | - Sebastian Pokutta
- Department for AI in Society, Science, and Technology, Zuse Institute Berlin, Berlin, Germany
- Institute of Mathematics, Technical University Berlin, Berlin, Germany
| | - Kartikey Sharma
- Department for AI in Society, Science, and Technology, Zuse Institute Berlin, Berlin, Germany
| | - Adam Hilbert
- CLAIM-Charité Lab for AI in Medicine, Charité Universitätsmedizin Berlin, Berlin, Germany
| | - Jan Sobesky
- Johanna-Etienne-Hospital, Neuss, Germany
- Centre for Stroke Research Berlin, Charité Universitätsmedizin Berlin, Berlin, Germany
| | - Ivana Galinovic
- Centre for Stroke Research Berlin, Charité Universitätsmedizin Berlin, Berlin, Germany
| | - Ahmed A. Khalil
- Centre for Stroke Research Berlin, Charité Universitätsmedizin Berlin, Berlin, Germany
- Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Mind, Brain, Body Institute, Berlin School of Mind and Brain, Humboldt-Universität Berlin, Berlin, Germany
| | - Jochen B. Fiebach
- Centre for Stroke Research Berlin, Charité Universitätsmedizin Berlin, Berlin, Germany
| | - Dietmar Frey
- CLAIM-Charité Lab for AI in Medicine, Charité Universitätsmedizin Berlin, Berlin, Germany
| |
Collapse
|
33
|
Apostolopoulos ID, Papathanasiou ND, Apostolopoulos DJ, Panayiotakis GS. Applications of Generative Adversarial Networks (GANs) in Positron Emission Tomography (PET) imaging: A review. Eur J Nucl Med Mol Imaging 2022. [PMID: 35451611 DOI: 10.1007/s00259-022-05805-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2021] [Accepted: 04/12/2022] [Indexed: 11/04/2022]
Abstract
PURPOSE This paper reviews recent applications of Generative Adversarial Networks (GANs) in Positron Emission Tomography (PET) imaging. Recent advances in Deep Learning (DL) and GANs catalysed the research of their applications in medical imaging modalities. As a result, several unique GAN topologies have emerged and been assessed in an experimental environment over the last two years. METHODS The present work extensively describes GAN architectures and their applications in PET imaging. The identification of relevant publications was performed via approved publication indexing websites and repositories. Web of Science, Scopus, and Google Scholar were the major sources of information. RESULTS The research identified a hundred articles that address PET imaging applications such as attenuation correction, de-noising, scatter correction, removal of artefacts, image fusion, high-dose image estimation, super-resolution, segmentation, and cross-modality synthesis. These applications are presented and accompanied by the corresponding research works. CONCLUSION GANs are rapidly employed in PET imaging tasks. However, specific limitations must be eliminated to reach their full potential and gain the medical community's trust in everyday clinical practice.
Collapse
|
34
|
Kierdorf J, Weber I, Kicherer A, Zabawa L, Drees L, Roscher R. Behind the Leaves: Estimation of Occluded Grapevine Berries With Conditional Generative Adversarial Networks. Front Artif Intell 2022; 5:830026. [PMID: 35402903 PMCID: PMC8990779 DOI: 10.3389/frai.2022.830026] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/06/2021] [Accepted: 02/28/2022] [Indexed: 11/30/2022] Open
Abstract
The need for accurate yield estimates for viticulture is becoming more important due to increasing competition in the wine market worldwide. One of the most promising methods to estimate the harvest is berry counting, as it can be approached non-destructively, and its process can be automated. In this article, we present a method that addresses the challenge of occluded berries with leaves to obtain a more accurate estimate of the number of berries that will enable a better estimate of the harvest. We use generative adversarial networks, a deep learning-based approach that generates a highly probable scenario behind the leaves exploiting learned patterns from images with non-occluded berries. Our experiments show that the estimate of the number of berries after applying our method is closer to the manually counted reference. In contrast to applying a factor to the berry count, our approach better adapts to local conditions by directly involving the appearance of the visible berries. Furthermore, we show that our approach can identify which areas in the image should be changed by adding new berries without explicitly requiring information about hidden areas.
Collapse
Affiliation(s)
- Jana Kierdorf
- Remote Sensing Group, Institute of Geodesy and Geoinformation, University of Bonn, Bonn, Germany
| | - Immanuel Weber
- Application Center for Machine Learning and Sensor Technology, University of Applied Sciences Koblenz, Koblenz, Germany
| | - Anna Kicherer
- Julius Kühn-Institut (JKI), Federal Research Centre for Cultivated Plants, Institute for Grapevine Breeding Geilweilerhof, Siebeldingen, Germany
| | - Laura Zabawa
- Geodesy Group, Institute of Geodesy and Geoinformation, University of Bonn, Bonn, Germany
| | - Lukas Drees
- Remote Sensing Group, Institute of Geodesy and Geoinformation, University of Bonn, Bonn, Germany
| | - Ribana Roscher
- Remote Sensing Group, Institute of Geodesy and Geoinformation, University of Bonn, Bonn, Germany
| |
Collapse
|
35
|
Ganjdanesh A, Zhang J, Chew EY, Ding Y, Huang H, Chen W. LONGL-Net: temporal correlation structure guided deep learning model to predict longitudinal age-related macular degeneration severity. PNAS Nexus 2022; 1:pgab003. [PMID: 35360552 PMCID: PMC8962776 DOI: 10.1093/pnasnexus/pgab003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/27/2021] [Accepted: 11/15/2021] [Indexed: 01/28/2023]
Abstract
Age-related macular degeneration (AMD) is the principal cause of blindness in developed countries, and the number of people affected is projected to reach 288 million by 2040. Therefore, automated grading and prediction methods can be highly beneficial for recognizing subjects susceptible to late AMD and enabling clinicians to start preventive actions for them. Clinically, AMD severity is quantified from Color Fundus Photographs (CFP) of the retina, and many machine-learning-based methods have been proposed for grading AMD severity. However, few models have been developed to predict the longitudinal progression status, i.e. predicting future late-AMD risk based on the current CFP, which is more clinically interesting. In this paper, we propose a new deep-learning-based classification model (LONGL-Net) that can simultaneously grade the current CFP and predict the longitudinal outcome, i.e. whether the subject will have late AMD at a future time-point. We design a new temporal-correlation-structure-guided Generative Adversarial Network model that learns the interrelations of temporal changes in CFPs at consecutive time-points and provides interpretability for the classifier's decisions by forecasting AMD symptoms in future CFPs. We used about 30,000 CFP images from 4,628 participants in the Age-Related Eye Disease Study. Our classifier showed an average AUC of 0.905 (95% CI: 0.886-0.922) and accuracy of 0.762 (95% CI: 0.733-0.792) on the 3-class problem of simultaneously grading the current time-point's AMD condition and predicting subjects' late-AMD progression at the future time-point. We further validated our model on the UK Biobank dataset, where it showed an average accuracy of 0.905 and sensitivity of 0.797 in grading 300 CFP images.
Collapse
Affiliation(s)
- Alireza Ganjdanesh
- Department of Electrical and Computer Engineering, Swanson School of Engineering, University of Pittsburgh, Pittsburgh, PA 15261, USA
| | - Jipeng Zhang
- Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA 15213, USA
| | - Emily Y Chew
- Division of Epidemiology and Clinical Applications, National Eye Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Ying Ding
- Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA 15213, USA
| | - Heng Huang
- Department of Electrical and Computer Engineering, Swanson School of Engineering, University of Pittsburgh, Pittsburgh, PA 15261, USA
| | - Wei Chen
- Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Division of Pulmonary Medicine, Department of Pediatrics, UPMC Children's Hospital of Pittsburgh, University of Pittsburgh, Pittsburgh, PA 15219, USA
| |
Collapse
|
36
|
Argilaga A, Zhuang D. Predicting the Non-Deterministic Response of a Micro-Scale Mechanical Model Using Generative Adversarial Networks. Materials (Basel) 2022; 15:965. [PMID: 35160911 PMCID: PMC8838419 DOI: 10.3390/ma15030965] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/30/2021] [Revised: 01/19/2022] [Accepted: 01/23/2022] [Indexed: 01/27/2023]
Abstract
Recent improvements in micro-scale material descriptions allow increasingly refined multiscale models to be built in geomechanics. This often comes at the expense of computational cost, which can eventually become prohibitive. Among other characteristics, the non-determinism of a micro-scale response makes its replacement by a surrogate particularly challenging. Machine Learning (ML) is a promising technique for substituting physics-based models; nevertheless, existing ML algorithms for the prediction of material response do not integrate non-determinism in the learning process. Is it possible to use the numerical output of the latest micro-scale descriptions to train an ML algorithm that will then provide a response at a much lower computational cost? A series of ML algorithms with different levels of depth and supervision are trained using a data-driven approach. Gaussian Process Regression (GPR), Self-Organizing Maps (SOM), and Generative Adversarial Networks (GANs) are tested, and the latter is retained because of its superior results. A modified GAN with lower network depth showed good performance in the generation of failure probability maps, with good reproduction of the non-deterministic micro-scale response. The trained generator can be incorporated into existing multiscale models, allowing the costly micro-scale computations to be, at least partially, bypassed.
Collapse
Affiliation(s)
- Albert Argilaga
- MOE Key Laboratory of Soft Soils and Geoenvironmental Engineering, Zhejiang University, Hangzhou 310058, China;
| | - Duanyang Zhuang
- MOE Key Laboratory of Soft Soils and Geoenvironmental Engineering, Zhejiang University, Hangzhou 310058, China;
- Center for Hypergravity Experiment and Interdisciplinary Research, Zhejiang University, Hangzhou 310058, China
| |
Collapse
|
37
|
Abstract
In the past few years, de novo molecular design has increasingly used generative models from the emergent field of Deep Learning, proposing novel compounds that are likely to possess desired properties or activities. De novo molecular design finds applications in different fields ranging from drug discovery and materials sciences to biotechnology. A panoply of deep generative models, including architectures such as Recurrent Neural Networks, Autoencoders, and Generative Adversarial Networks, can be trained on existing data sets and provide for the generation of novel compounds. Typically, the new compounds follow the same underlying statistical distributions of properties exhibited in the training data set. Additionally, different optimization strategies, including transfer learning, Bayesian optimization, reinforcement learning, and conditional generation, can direct the generation process toward desired aims regarding biological activities, synthesis processes, or chemical features. Given the recent emergence of these technologies and their relevance, this work presents a systematic and critical review of deep generative models and related optimization methods for targeted compound design, and their applications.
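A hedged sketch of the simplest of the architectures named above: a character-level recurrent network that can be trained on SMILES strings and sampled for new ones. The toy vocabulary, layer sizes, and start/end tokens are illustrative assumptions; the untrained model below emits random characters.

```python
import torch
import torch.nn as nn

VOCAB = list("^$CNO()=c1")              # ^ start, $ end; toy SMILES alphabet
stoi = {ch: i for i, ch in enumerate(VOCAB)}

class SmilesRNN(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(len(VOCAB), 16)
        self.rnn = nn.GRU(16, hidden, batch_first=True)
        self.out = nn.Linear(hidden, len(VOCAB))

    def forward(self, x, h=None):       # x: (batch, time) token ids
        z, h = self.rnn(self.emb(x), h)
        return self.out(z), h           # per-step next-character logits

@torch.no_grad()
def sample(model, max_len=40):
    """Autoregressively sample one string, character by character."""
    x, h, out = torch.tensor([[stoi["^"]]]), None, []
    for _ in range(max_len):
        logits, h = model(x, h)
        x = torch.multinomial(logits[:, -1].softmax(-1), 1)
        ch = VOCAB[x.item()]
        if ch == "$":                   # stop at the end token
            break
        out.append(ch)
    return "".join(out)

print(sample(SmilesRNN()))              # untrained: random characters
```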
Collapse
Affiliation(s)
- Tiago Sousa
- Centre of Biological Engineering, Campus Gualtar, University of Minho, 4710-057 Braga, Portugal
| | - João Correia
- Centre of Biological Engineering, Campus Gualtar, University of Minho, 4710-057 Braga, Portugal
| | - Vítor Pereira
- Centre of Biological Engineering, Campus Gualtar, University of Minho, 4710-057 Braga, Portugal
| | - Miguel Rocha
- Centre of Biological Engineering, Campus Gualtar, University of Minho, 4710-057 Braga, Portugal
| |
Collapse
|
38
|
Hou X, Zhang X, Liang H, Shen L, Lai Z, Wan J. GuidedStyle: Attribute knowledge guided style manipulation for semantic face editing. Neural Netw 2022; 145:209-20. [PMID: 34768091 DOI: 10.1016/j.neunet.2021.10.017] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2021] [Revised: 09/29/2021] [Accepted: 10/21/2021] [Indexed: 11/21/2022]
Abstract
Although significant progress has been made in synthesizing high-quality and visually realistic face images with unconditional Generative Adversarial Networks (GANs), there is still a lack of control over the generation process needed to achieve semantic face editing. In this paper, we propose a novel learning framework, called GuidedStyle, to achieve semantic face editing on a pretrained StyleGAN by guiding the image generation process with a knowledge network. Furthermore, we allow an attention mechanism in the StyleGAN generator to adaptively select a single layer for style manipulation. As a result, our method is able to perform disentangled and controllable edits along various attributes, including smiling, eyeglasses, gender, mustache, hair color and attractiveness. Both qualitative and quantitative results demonstrate the superiority of our method over competing methods for semantic face editing. Moreover, we show that our model can also be applied to different types of real and artistic face editing, demonstrating strong generalization ability.
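A hedged sketch of the general mechanism behind this kind of editing (not the GuidedStyle code itself): move a StyleGAN latent code along an attribute direction at a single chosen layer, leaving the other layers untouched. The latent code and direction below are random stand-ins.

```python
# Sketch of single-layer style manipulation in a StyleGAN-like latent space.
# The latent code and attribute direction are placeholders, not learned quantities.
import torch

num_layers, latent_dim = 18, 512
w = torch.randn(num_layers, latent_dim)           # stand-in for a real W+ latent code
smile_direction = torch.randn(latent_dim)         # stand-in for a knowledge-guided direction
smile_direction /= smile_direction.norm()

def edit(w, direction, layer, strength):
    """Apply the edit to one layer only, mimicking per-layer style manipulation."""
    w_edit = w.clone()
    w_edit[layer] += strength * direction
    return w_edit

w_smiling = edit(w, smile_direction, layer=4, strength=3.0)
# w_smiling would then be fed to the pretrained StyleGAN synthesis network.
```

Restricting the edit to one adaptively selected layer is what keeps the change disentangled from the other attributes.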
Collapse
|
39
|
Hu L, Zhou DW, Zha YF, Li L, He H, Xu WH, Qian L, Zhang YK, Fu CX, Hu H, Zhao JG. Synthesizing High- b-Value Diffusion-weighted Imaging of the Prostate Using Generative Adversarial Networks. Radiol Artif Intell 2021; 3:e200237. [PMID: 34617025 DOI: 10.1148/ryai.2021200237] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/29/2020] [Revised: 04/11/2021] [Accepted: 05/18/2021] [Indexed: 11/11/2022]
Abstract
Purpose: To develop and evaluate a diffusion-weighted imaging (DWI) deep learning framework based on the generative adversarial network (GAN) to generate synthetic high-b-value (b = 1500 sec/mm2) DWI (SYNb1500) sets from acquired standard-b-value (b = 800 sec/mm2) DWI (ACQb800) and acquired standard-b-value (b = 1000 sec/mm2) DWI (ACQb1000) sets. Materials and Methods: This retrospective multicenter study included 395 patients who underwent prostate multiparametric MRI. The cohort was split into internal training (96 patients) and external testing (299 patients) datasets. To create SYNb1500 sets from ACQb800 and ACQb1000 sets, a GAN-based deep learning model (M0) was developed by using the internal dataset. M0 was trained and compared with a conventional model based on the cycle GAN (Mcyc), and was further optimized by using denoising and edge-enhancement techniques, yielding an optimized version of M0 (Opt-M0). Synthetic sets were then generated by the M0 and the Opt-M0 (SYNb1500 and Opt-SYNb1500, respectively) from the ACQb800 and ACQb1000 sets of the external testing dataset. For comparison, traditional calculated (b = 1500 sec/mm2) DWI (CALb1500) sets were also obtained. Reader ratings for image quality and prostate cancer detection were performed on the acquired high-b-value (b = 1500 sec/mm2) DWI (ACQb1500), CALb1500, SYNb1500, and Opt-SYNb1500 sets. Wilcoxon signed rank tests were used to compare the readers' scores. A multiple-reader multiple-case receiver operating characteristic curve was used to compare the diagnostic utility of each DWI set. Results: When compared with the Mcyc, the M0 yielded a lower mean squared difference and higher mean scores for the peak signal-to-noise ratio, structural similarity, and feature similarity (P < .001 for all). Opt-SYNb1500 resulted in significantly better image quality (P ≤ .001 for all) and a higher mean area under the curve than ACQb1500 and CALb1500 (P ≤ .042 for all). Conclusion: A deep learning framework based on a GAN is a promising method to synthesize realistic high-b-value DWI sets with good image quality and accuracy in prostate cancer detection. Keywords: Prostate Cancer, Abdomen/GI, Diffusion-weighted Imaging, Deep Learning Framework, High b Value, Generative Adversarial Networks. © RSNA, 2021. Supplemental material is available for this article.
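A minimal sketch of the paired-translation setup such a framework rests on (not the authors' M0): a generator takes the two standard-b-value images as input channels and is trained with an adversarial term plus an L1 term against the acquired high-b-value target. The toy networks, the L1 weight of 100, and the tensor shapes are all assumptions.

```python
# One generator training step of a pix2pix-style paired GAN for b-value synthesis.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(32, 1, 3, padding=1))           # toy generator
D = nn.Sequential(nn.Conv2d(3, 32, 3, stride=2), nn.ReLU(),
                  nn.Conv2d(32, 1, 3))                      # toy conditional discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
l1, bce = nn.L1Loss(), nn.BCEWithLogitsLoss()

acq_b800  = torch.rand(4, 1, 64, 64)    # placeholder batches standing in for real DWI sets
acq_b1000 = torch.rand(4, 1, 64, 64)
acq_b1500 = torch.rand(4, 1, 64, 64)

inputs = torch.cat([acq_b800, acq_b1000], dim=1)
syn_b1500 = G(inputs)
pred_fake = D(torch.cat([inputs, syn_b1500], dim=1))
# The generator wants D to call its output real; L1 anchors it to the acquired target.
loss_g = bce(pred_fake, torch.ones_like(pred_fake)) + 100.0 * l1(syn_b1500, acq_b1500)
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```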
Collapse
Affiliation(s)
- Lei Hu
- Department of Diagnostic and Interventional Radiology, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, 600 Yi Shan Road, Shanghai 200233, China
| | - Da-Wei Zhou
- State Key Laboratory of Integrated Services Networks, School of Telecommunications Engineering, Xidian University, Xi'an, China
| | - Yun-Fei Zha
- Department of Radiology, Renmin Hospital, Wuhan University, Wuhan, China
| | - Liang Li
- Department of Radiology, Renmin Hospital, Wuhan University, Wuhan, China
| | - Huan He
- Department of Radiology, Renmin Hospital, Wuhan University, Wuhan, China
| | - Wen-Hao Xu
- Department of Diagnostic and Interventional Radiology, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, 600 Yi Shan Road, Shanghai 200233, China
| | - Li Qian
- Department of Radiology, Renmin Hospital, Wuhan University, Wuhan, China
| | - Yi-Kun Zhang
- Department of Radiology, Renmin Hospital, Wuhan University, Wuhan, China
| | - Cai-Xia Fu
- MR Application Development, Siemens Shenzhen MR, Shenzhen, China
| | - Hui Hu
- Department of Radiology, The Affiliated Renmin Hospital of Jiangsu University, Zhenjiang, China
| | - Jun-Gong Zhao
- Department of Diagnostic and Interventional Radiology, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, 600 Yi Shan Road, Shanghai 200233, China
| |
Collapse
|
40
|
Liu X, Xing F, Fakhri GE, Woo J. A UNIFIED CONDITIONAL DISENTANGLEMENT FRAMEWORK FOR MULTIMODAL BRAIN MR IMAGE TRANSLATION. Proc IEEE Int Symp Biomed Imaging 2021; 2021. [PMID: 34567419 DOI: 10.1109/isbi48211.2021.9433897] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
Multimodal MRI provides complementary and clinically relevant information for probing tissue condition and characterizing various diseases. However, it is often difficult to acquire sufficient modalities from the same subject due to limitations in study plans, while quantitative analysis is still demanded. In this work, we propose a unified conditional disentanglement framework to synthesize any arbitrary modality from an input modality. Our framework hinges on a cycle-constrained conditional adversarial training approach, whereby it can extract a modality-invariant anatomical feature with a modality-agnostic encoder and generate a target modality with a conditioned decoder. We validate our framework on four MRI modalities, including T1-weighted, T1 contrast-enhanced, T2-weighted, and FLAIR MRI, from the BraTS'18 database, showing superior synthesis quality over the comparison methods. In addition, we report results from experiments on a tumor segmentation task carried out with synthesized data.
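A sketch under stated assumptions (this is not the paper's code; the toy layers and modality code are illustrative) of the encoder/decoder split described above: a single encoder maps any input modality to a shared anatomical feature, and the decoder is conditioned on a one-hot code naming the target modality.

```python
# Modality-agnostic encoder plus a decoder conditioned on the target modality.
import torch
import torch.nn as nn

MODALITIES = ["T1", "T1ce", "T2", "FLAIR"]

class Encoder(nn.Module):            # any input modality -> shared anatomical feature
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(16, 16, 3, padding=1))
    def forward(self, x):
        return self.net(x)

class ConditionedDecoder(nn.Module):  # shared feature + target code -> target modality
    def __init__(self, n_mod=len(MODALITIES)):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(16 + n_mod, 16, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(16, 1, 3, padding=1))
    def forward(self, feat, code):
        # Broadcast the one-hot modality code as extra constant feature maps.
        maps = code[:, :, None, None].expand(-1, -1, feat.size(2), feat.size(3))
        return self.net(torch.cat([feat, maps], dim=1))

enc, dec = Encoder(), ConditionedDecoder()
t1 = torch.rand(2, 1, 64, 64)
code_flair = torch.eye(len(MODALITIES))[[3, 3]]      # request FLAIR as the output modality
flair_hat = dec(enc(t1), code_flair)                 # T1 -> synthetic FLAIR
```

The cycle constraint in the paper additionally maps the synthesized image back to the input modality so the shared feature stays anatomy-only.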
Collapse
Affiliation(s)
- Xiaofeng Liu
- Dept. of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Fangxu Xing
- Dept. of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Georges El Fakhri
- Dept. of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Jonghye Woo
- Dept. of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| |
Collapse
|
41
|
Giudice O, Guarnera L, Battiato S. Fighting Deepfakes by Detecting GAN DCT Anomalies. J Imaging 2021; 7:128. [PMID: 34460764 DOI: 10.3390/jimaging7080128] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/28/2021] [Revised: 07/23/2021] [Accepted: 07/26/2021] [Indexed: 11/16/2022] Open
Abstract
To properly counter the Deepfake phenomenon, new Deepfake detection algorithms need to be designed: the misuse of this formidable A.I. technology has serious consequences for the private life of every person involved. The state of the art abounds with solutions that use deep neural networks to detect fake multimedia content, but unfortunately these algorithms appear to be neither generalizable nor explainable. However, traces left by Generative Adversarial Network (GAN) engines during the creation of Deepfakes can be detected by analyzing ad hoc frequencies. For this reason, in this paper we propose a new pipeline able to detect the so-called GAN Specific Frequencies (GSF), which represent a unique fingerprint of the different generative architectures. Anomalous frequencies were detected by employing the Discrete Cosine Transform (DCT), and the β statistics inferred from the AC coefficient distributions proved to be the key to recognizing GAN-generated data. Robustness tests were also carried out to demonstrate the effectiveness of the technique under different attacks on images, such as JPEG compression, mirroring, rotation, scaling, and the addition of randomly sized rectangles. Experiments demonstrate that the method is innovative, exceeds the state of the art and also gives many insights in terms of explainability.
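A simplified sketch of the underlying idea (not the authors' exact β estimator): compute 8x8 block DCTs over the image and summarize each AC frequency's distribution across blocks; GAN upsampling tends to leave anomalous energy at specific frequencies, so this 64-dimensional summary can serve as a fingerprint for a downstream classifier. The per-frequency standard deviation used here is a deliberate simplification of the paper's statistics.

```python
# Block-DCT frequency fingerprint, a simplified stand-in for the GSF/β analysis.
import numpy as np
from scipy.fft import dctn

def block_dct_stats(img, block=8):
    h, w = (s - s % block for s in img.shape)      # crop to a multiple of the block size
    coeffs = []
    for i in range(0, h, block):
        for j in range(0, w, block):
            coeffs.append(dctn(img[i:i+block, j:j+block], norm="ortho"))
    coeffs = np.stack(coeffs)                      # (n_blocks, 8, 8)
    stats = coeffs.std(axis=0)                     # spread of each frequency across blocks
    stats[0, 0] = 0.0                              # drop the DC term; keep AC statistics only
    return stats.ravel()                           # 64-dim frequency fingerprint

gray = np.random.rand(256, 256)                    # placeholder for a grayscale face image
fingerprint = block_dct_stats(gray)                # feed to any downstream classifier
```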
Collapse
|
42
|
Klasen M, Ahrens D, Eberle J, Steinhage V. Image-based Automated Species Identification: Can Virtual Data Augmentation Overcome Problems of Insufficient Sampling? Syst Biol 2021; 71:320-333. [PMID: 34143222 DOI: 10.1093/sysbio/syab048] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/08/2020] [Revised: 06/10/2021] [Accepted: 06/16/2021] [Indexed: 11/13/2022] Open
Abstract
Automated species identification and delimitation is challenging, particularly in rare and thus often scarcely sampled species, which do not allow sufficient discrimination of infraspecific versus interspecific variation. Typical problems arising from either low or exaggerated interspecific morphological differentiation are best met by automated methods of machine learning that learn efficient and effective species identification from training samples. However, limited infraspecific sampling remains a key challenge in machine learning as well. In this study, we assessed whether a data augmentation approach may help to overcome the problem of scarce training data in automated visual species identification. The stepwise augmentation of data comprised image rotation as well as visual and virtual augmentation. The visual data augmentation applies classic approaches of data augmentation and the generation of artificial images using a Generative Adversarial Network (GAN) approach. Descriptive feature vectors are derived from bottleneck features of a VGG-16 convolutional neural network (CNN) and are then stepwise reduced in dimensionality using Global Average Pooling and PCA to prevent overfitting. Finally, the data augmentation employs synthetic additional sampling in feature space by an oversampling algorithm in vector space (SMOTE). Applied to four different image datasets, which include scarab beetle genitalia (Pleophylla, Schizonycha) as well as wing patterns of bees (Osmia) and cattleheart butterflies (Parides), our augmentation approach outperformed, in terms of identification accuracy, both a deep learning baseline trained on non-augmented data and a traditional 2D morphometric approach (Procrustes analysis of scarab beetle genitalia).
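The feature-space stage of such a pipeline is easy to sketch; the following uses random placeholders for the CNN bottleneck features and standard library implementations of PCA and SMOTE (this is a generic illustration, not the study's exact configuration).

```python
# Dimensionality reduction followed by SMOTE oversampling in feature space.
import numpy as np
from sklearn.decomposition import PCA
from imblearn.over_sampling import SMOTE   # assumes the imbalanced-learn package is installed

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 512))                        # stand-in for VGG-16 bottleneck features
y = np.array([0] * 50 + [1] * 10)                     # a rare species with few samples

X_red = PCA(n_components=20).fit_transform(X)         # reduce dimensionality to limit overfitting
X_aug, y_aug = SMOTE(k_neighbors=5, random_state=0).fit_resample(X_red, y)
print(X_aug.shape, np.bincount(y_aug))                # minority class synthetically balanced
```

SMOTE interpolates between nearest neighbors of the minority class in the reduced space, which is what "synthetic additional sampling in feature space" amounts to.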
Collapse
Affiliation(s)
- Morris Klasen
- Department of Computer Science IV, University of Bonn, Endenicher Allee 19A, 53115 Bonn, Germany
| | - Dirk Ahrens
- Zoologisches Forschungsmuseum Alexander Koenig, Adenauerallee 160, 53113 Bonn, Germany
| | - Jonas Eberle
- Zoologisches Forschungsmuseum Alexander Koenig, Adenauerallee 160, 53113 Bonn, Germany; Paris-Lodron-Universität, Zoologische Evolutionsbiologie, Hellbrunner Straße 34, 5020 Salzburg, Austria
| | - Volker Steinhage
- Department of Computer Science IV, University of Bonn, Endenicher Allee 19A, 53115 Bonn, Germany
| |
Collapse
|
43
|
Marzullo A, Moccia S, Catellani M, Calimeri F, Momi ED. Towards realistic laparoscopic image generation using image-domain translation. Comput Methods Programs Biomed 2021; 200:105834. [PMID: 33229016 DOI: 10.1016/j.cmpb.2020.105834] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/28/2020] [Accepted: 11/05/2020] [Indexed: 06/11/2023]
Abstract
Background and Objectives: Over the last decade, Deep Learning (DL) has revolutionized data analysis in many areas, including medical imaging. However, there is a bottleneck in the advancement of DL in the surgical field, namely a shortage of large-scale data, which in turn may be attributed to the lack of a structured and standardized methodology for storing and analyzing surgical images in clinical centres. Furthermore, accurate manual annotations are expensive and time-consuming. The synthesis of artificial images can be of great help here; in this context, in recent years, the use of Generative Adversarial Networks (GANs) has achieved promising results in producing photo-realistic images. Methods: In this study, a method for Minimally Invasive Surgery (MIS) image synthesis is proposed. To this aim, the generative adversarial network pix2pix is trained to generate paired annotated MIS images by transforming rough segmentations of surgical instruments and tissues into realistic images. An additional regularization term was added to the original optimization problem in order to enhance the realism of surgical tools with respect to the background. Results: Quantitative and qualitative (i.e., human-based) evaluations of the generated images were carried out in order to assess the effectiveness of the method. Conclusions: Experimental results show that the proposed method is able to translate MIS segmentations into realistic MIS images, which can in turn be used to augment existing data sets and help overcome the lack of useful images; this allows physicians and algorithms to take advantage of new annotated instances for their training.
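A hedged sketch of one way such a regularization term could look (the weighting scheme below is an illustrative assumption, not the paper's exact formulation): weight the reconstruction loss more heavily inside the instrument mask so tools render more faithfully than the background.

```python
# Instrument-weighted L1 term added on top of the usual pix2pix objective.
import torch

def weighted_l1(fake, real, tool_mask, tool_weight=5.0):
    # tool_mask is 1 on surgical-instrument pixels, 0 elsewhere; weight is assumed.
    weights = 1.0 + (tool_weight - 1.0) * tool_mask
    return (weights * (fake - real).abs()).mean()

fake = torch.rand(2, 3, 64, 64)                       # generator output (placeholder)
real = torch.rand(2, 3, 64, 64)                       # real laparoscopic frame (placeholder)
mask = (torch.rand(2, 1, 64, 64) > 0.8).float()       # rough instrument segmentation
loss = weighted_l1(fake, real, mask)   # added to the adversarial + L1 pix2pix losses
```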
Collapse
Affiliation(s)
- Aldo Marzullo
- Department of Mathematics and Computer Science, University of Calabria, Rende, Italy.
| | - Sara Moccia
- Department of Information Engineering, Università Politecnica delle Marche, Ancona, Italy; Department of Advanced Robotics, Istituto Italiano di Tecnologia, Genoa, Italy
| | - Michele Catellani
- Department of Urology, European Institute of Oncology, IRCCS, Milan, Italy
| | - Francesco Calimeri
- Department of Mathematics and Computer Science, University of Calabria, Rende, Italy
| | - Elena De Momi
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
| |
Collapse
|
44
|
Kan CNE, Gilat-Schmidt T, Ye DH. Enhancing Reproductive Organ Segmentation in Pediatric CT via Adversarial Learning. Proc SPIE Int Soc Opt Eng 2021; 11596:1159612. [PMID: 33994628 PMCID: PMC8122493 DOI: 10.1117/12.2582127] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]
Abstract
Accurately segmenting organs in abdominal computed tomography (CT) scans is crucial for clinical applications such as pre-operative planning and dose estimation. With the recent advent of deep learning algorithms, many robust frameworks have been proposed for organ segmentation in abdominal CT images. However, many of these frameworks require large amounts of training data in order to achieve high segmentation accuracy. Pediatric abdominal CT images containing reproductive organs are particularly hard to obtain since these organs are extremely sensitive to ionizing radiation. Hence, it is extremely challenging to train automatic segmentation algorithms on organs such as the uterus and the prostate. To address these issues, we propose a novel segmentation network with a built-in auxiliary classifier generative adversarial network (ACGAN) that conditionally generates additional features during training. The proposed CFG-SegNet (conditional feature generation segmentation network) is trained on a single loss function which combines adversarial loss, reconstruction loss, auxiliary classifier loss and segmentation loss. 2.5D segmentation experiments are performed on a custom data set containing 24 female CT volumes containing the uterus and 40 male CT volumes containing the prostate. CFG-SegNet achieves an average segmentation accuracy of 0.929 DSC (Dice Similarity Coefficient) on the prostate and 0.724 DSC on the uterus with 4-fold cross validation. The results show that our network is high-performing and has the potential to precisely segment difficult organs with few available training images.
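The single combined objective described above is straightforward to compose; the sketch below uses placeholder tensors, and the equal weighting of the four terms is an assumption rather than the published configuration.

```python
# Combined adversarial + reconstruction + auxiliary-classifier + segmentation loss.
import torch
import torch.nn.functional as F

def dice_loss(pred, target, eps=1e-6):
    inter = (pred * target).sum()
    return 1 - (2 * inter + eps) / (pred.sum() + target.sum() + eps)

seg_pred   = torch.rand(2, 1, 64, 64)           # sigmoid output of the segmenter (placeholder)
seg_target = (torch.rand(2, 1, 64, 64) > 0.5).float()
d_fake     = torch.randn(2, 1)                  # discriminator logits on generated features
cls_logits = torch.randn(2, 2)                  # auxiliary classifier logits (e.g., organ type)
cls_target = torch.tensor([0, 1])
recon, real = torch.rand(2, 1, 64, 64), torch.rand(2, 1, 64, 64)

loss = (F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))  # adversarial
        + F.l1_loss(recon, real)                                             # reconstruction
        + F.cross_entropy(cls_logits, cls_target)                            # auxiliary classifier
        + dice_loss(seg_pred, seg_target))                                   # segmentation
```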
Collapse
Affiliation(s)
- Chi Nok Enoch Kan
- Department of Electrical and Computer Engineering, Marquette University, Milwaukee, USA
| | - Taly Gilat-Schmidt
- Department of Electrical and Computer Engineering, Marquette University, Milwaukee, USA
| | - Dong Hye Ye
- Department of Electrical and Computer Engineering, Marquette University, Milwaukee, USA
| |
Collapse
|
45
|
Kahembwe E, Ramamoorthy S. Lower dimensional kernels for video discriminators. Neural Netw 2020; 132:506-20. [PMID: 33039788 DOI: 10.1016/j.neunet.2020.09.016] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2019] [Revised: 09/13/2020] [Accepted: 09/14/2020] [Indexed: 11/22/2022]
Abstract
This work presents an analysis of the discriminators used in Generative Adversarial Networks (GANs) for video. We show that unconstrained video discriminator architectures induce a loss surface with high curvature, which makes optimization difficult. We also show that this curvature becomes more extreme as the maximal kernel dimension of video discriminators increases. With these observations in hand, we propose a methodology for the design of a family of efficient Lower-Dimensional Video Discriminators for GANs (LDVD-GANs). The proposed methodology improves the performance and efficiency of the video GAN models it is applied to and demonstrates good performance on complex and diverse datasets such as UCF-101. In particular, we show that LDVDs can double the performance of Temporal-GANs and provide state-of-the-art performance on a single GPU.
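An illustrative sketch of the kernel-dimension idea (not the paper's exact architecture): replace a full 3D kernel with lower-dimensional kernels, here a 2D spatial kernel followed by a 1D temporal one, so no single kernel spans all three spatio-temporal axes at once.

```python
# Full 3D kernel versus a factorization into lower-dimensional kernels.
import torch
import torch.nn as nn

full_3d = nn.Conv3d(3, 64, kernel_size=(3, 3, 3), padding=1)           # 3D kernel
low_dim = nn.Sequential(
    nn.Conv3d(3, 64, kernel_size=(1, 3, 3), padding=(0, 1, 1)),        # 2D spatial kernel
    nn.Conv3d(64, 64, kernel_size=(3, 1, 1), padding=(1, 0, 0)),       # 1D temporal kernel
)

video = torch.rand(1, 3, 16, 64, 64)       # (batch, channels, frames, height, width)
assert full_3d(video).shape == low_dim(video).shape    # same output shape, lower-dimensional kernels
```

Per the analysis above, the lower maximal kernel dimension is what smooths the discriminator's loss surface and eases optimization.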
Collapse
|
46
|
Nishimura Y, Nakamura Y, Ishiguro H. Human interaction behavior modeling using Generative Adversarial Networks. Neural Netw 2020; 132:521-31. [PMID: 33039789 DOI: 10.1016/j.neunet.2020.09.019] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/21/2020] [Revised: 09/01/2020] [Accepted: 09/25/2020] [Indexed: 11/22/2022]
Abstract
Recently, considerable research has focused on personal assistant robots, and robots capable of rich human-like communication are expected. Among humans, non-verbal elements contribute to effective and dynamic communication. However, people use a wide range of diverse gestures, and a robot capable of expressing various human gestures has not been realized. In this study, we address human behavior modeling during interaction using a deep generative model. In the proposed method, to consider interaction motion, three factors, i.e., interaction intensity, time evolution, and time resolution, are embedded in the network structure. Subjective evaluation results suggest that the proposed method can generate high-quality human motions.
Collapse
|
47
|
Wen J, Shi Y, Zhou X, Xue Y. Crop Disease Classification on Inadequate Low-Resolution Target Images. Sensors (Basel) 2020; 20:E4601. [PMID: 32824352 DOI: 10.3390/s20164601] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/01/2020] [Revised: 08/13/2020] [Accepted: 08/13/2020] [Indexed: 12/01/2022]
Abstract
Currently, various agricultural image classification tasks are carried out on high-resolution images. However, in some cases enough high-resolution images cannot be obtained for classification, which significantly affects classification performance. In this paper, we design a crop disease classification network based on Enhanced Super-Resolution Generative Adversarial Networks (ESRGAN) for the case where only an insufficient number of low-resolution target images are available. First, ESRGAN is used to recover super-resolution crop images from low-resolution images. Transfer learning is applied in model training to compensate for the lack of training samples. Then, we test the performance of the generated super-resolution images on the crop disease classification task. Extensive experiments show that using the fine-tuned ESRGAN model can recover realistic crop information and improve the accuracy of crop disease classification, compared with four other image super-resolution methods.
Collapse
|
48
|
Zhao M, Liu X, Liu H, Wong KKL. Super-resolution of cardiac magnetic resonance images using Laplacian Pyramid based on Generative Adversarial Networks. Comput Med Imaging Graph 2020; 80:101698. [PMID: 31935666 DOI: 10.1016/j.compmedimag.2020.101698] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2019] [Revised: 12/28/2019] [Accepted: 01/02/2020] [Indexed: 11/24/2022]
Abstract
BACKGROUND AND OBJECTIVE: Cardiac magnetic resonance imaging (MRI) can assist in both functional and structural analysis of the heart, but due to hardware and physical limitations, high-resolution MRI scanning is time-consuming and the peak signal-to-noise ratio (PSNR) is low. Existing super-resolution methods attempt to resolve this issue, but shortcomings remain, such as hallucinated details after super-resolution and low precision after reconstruction. To address these problems, we propose the Laplacian Pyramid Generative Adversarial Network (LSRGAN) in order to generate visually better cardiac images and thereby aid physician diagnosis and treatment. METHODS AND RESULTS: To address the problem of low image resolution, we used the Laplacian Pyramid to analyze the high-frequency detail features of super-resolution (SR) reconstructions of images at different pixel sizes. To eliminate gradient disappearance, we implemented a least squares loss function for the discriminator, and we introduce the residual-dense block (RDB) as the basic network building unit to generate higher-quality images. The experimental results show that the LSRGAN can effectively avoid illusory details after super-resolution and has the best reconstruction quality. Compared with state-of-the-art methods, our proposed algorithm generates higher-quality super-resolution images with higher peak signal-to-noise ratio and structural similarity (SSIM) scores. CONCLUSION: We implemented a novel LSRGAN network model, which reduces insufficient resolution and hallucinated details in MRI super-resolution. Our research presents a superior super-resolution method for medical experts to diagnose and treat myocardial ischemia and myocardial infarction.
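A minimal sketch of the two named ingredients, with illustrative layer sizes rather than the paper's: a residual-dense block (dense connections plus local residual learning) and a least-squares discriminator loss.

```python
# Residual-dense block and least-squares GAN discriminator loss (illustrative sizes).
import torch
import torch.nn as nn

class ResidualDenseBlock(nn.Module):
    def __init__(self, ch=32, growth=16, n_layers=3):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv2d(ch + i * growth, growth, 3, padding=1) for i in range(n_layers))
        self.fuse = nn.Conv2d(ch + n_layers * growth, ch, 1)   # local feature fusion

    def forward(self, x):
        feats = [x]
        for conv in self.convs:
            feats.append(torch.relu(conv(torch.cat(feats, dim=1))))  # dense connections
        return x + self.fuse(torch.cat(feats, dim=1))                # local residual learning

def lsgan_d_loss(d_real, d_fake):
    # Least-squares loss penalizes distance to the 1/0 targets, easing vanishing gradients.
    return 0.5 * ((d_real - 1) ** 2).mean() + 0.5 * (d_fake ** 2).mean()

x = torch.rand(1, 32, 64, 64)
print(ResidualDenseBlock()(x).shape)   # shape-preserving building block for the generator
```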
Collapse
Affiliation(s)
- Ming Zhao
- School of Computer Science and Engineering, Central South University, Changsha, 410000, China
| | - Xinhong Liu
- School of Computer Science and Engineering, Central South University, Changsha, 410000, China
| | - Hui Liu
- Computer Science Department, Missouri State University, Springfield, 62701, United States
| | - Kelvin K L Wong
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China.
| |
Collapse
|
49
|
Ali T, Jan S, Alkhodre A, Nauman M, Amin M, Siddiqui MS. DeepMoney: counterfeit money detection using generative adversarial networks. PeerJ Comput Sci 2019; 5:e216. [PMID: 33816869 PMCID: PMC7924467 DOI: 10.7717/peerj-cs.216] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/04/2018] [Accepted: 07/16/2019] [Indexed: 06/12/2023]
Abstract
Conventional paper currency and modern electronic currency are two important modes of transaction. In several parts of the world, the conventional methodology has clear precedence over its electronic counterpart. However, the identification of forged currency notes is becoming an increasingly crucial problem because of the new and improved tactics employed by counterfeiters. In this paper, a machine-assisted system, dubbed DeepMoney, is proposed to discriminate fake notes from genuine ones. For this purpose, state-of-the-art machine learning models called Generative Adversarial Networks (GANs) are employed. GANs use unsupervised learning to train a model that can then be used to perform supervised predictions. This flexibility provides the best of both worlds by allowing unlabelled data to be trained on whilst still making concrete predictions. The technique was applied to Pakistani banknotes, and state-of-the-art image processing and feature recognition techniques were used to design the overall approach for validating input. Augmented samples of images were used in the experiments, which show that a high-precision machine can be developed to recognize genuine paper money. An accuracy of 80% has been achieved. The code is available as open source to allow others to reproduce and build upon the efforts already made.
Collapse
Affiliation(s)
- Toqeer Ali
- Faculty of Computer and Information Systems, Islamic University of Madinah, Madinah, Saudi Arabia
| | - Salman Jan
- Malaysian Institute of Information Technology, University Kuala Lumpur, Kuala Lumpur, Malaysia
| | - Ahmad Alkhodre
- Faculty of Computer and Information Systems, Islamic University of Madinah, Madinah, Saudi Arabia
| | | | | | - Muhammad Shoaib Siddiqui
- Faculty of Computer and Information Systems, Islamic University of Madinah, Madinah, Saudi Arabia
| |
Collapse
|
50
|
Zhang S, Wang T, Peng Y, Dong J. A hierarchically trained generative network for robust facial symmetrization. Technol Health Care 2019; 27:217-227. [PMID: 31045541 PMCID: PMC6598010 DOI: 10.3233/thc-199021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/04/2022]
Abstract
Face symmetrization has extensive applications in both medical and academic fields, such as facial disorder diagnosis. The human face possesses an important characteristic known as symmetry. However, in many scenarios perfect symmetry does not exist in human faces, which has given rise to a large number of studies around this topic, for example facial palsy evaluation and facial beauty evaluation based on facial symmetry analysis, among many others. Currently, there is still very limited research dedicated to automatic facial symmetrization. Most existing studies only used their own implementations of facial symmetrization to assist their interdisciplinary academic research. Limitations can thus be noticed in their methods, such as the requirement for manual intervention. Furthermore, most existing methods rely on facial landmark detection algorithms for automatic facial symmetrization. Though the accuracy of these landmark detection algorithms is promising, uncontrolled conditions in facial images can still negatively impact the performance of symmetrical face production. To this end, this paper presents a joint-loss enhanced deep generative network model for automatic facial symmetrization, achieved by a full facial image analysis. The joint loss consists of a pair of adversarial losses and an identity loss. The adversarial losses try to make the generated symmetrical face as realistic as possible, while the identity loss helps to constrain the output to have the same identity as the person in the original input as much as possible. Rather than an end-to-end learning strategy, the proposed model is constructed by a multi-stage training process, which avoids the demand for a large set of symmetrical faces as training data. Experiments are conducted with comparisons against several existing methods based on some of the most popular facial landmark detection algorithms, and competitive results of the proposed method are demonstrated.
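A hedged sketch of how such a joint loss can be composed (not the paper's implementation): a generator-side adversarial term plus an identity term that compares face embeddings of the input and the symmetrized output. The embedding network below is a random stand-in for a real face-recognition model, and the weighting is an assumption.

```python
# Adversarial + identity joint loss with a placeholder identity embedder.
import torch
import torch.nn as nn
import torch.nn.functional as F

embedder = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                         nn.Linear(16, 128))        # placeholder identity embedder

def joint_loss(d_fake_logits, sym_face, input_face, id_weight=1.0):
    adv = F.binary_cross_entropy_with_logits(d_fake_logits, torch.ones_like(d_fake_logits))
    e_in = F.normalize(embedder(input_face), dim=1)
    e_out = F.normalize(embedder(sym_face), dim=1)
    identity = (1 - (e_in * e_out).sum(dim=1)).mean()   # cosine distance between identities
    return adv + id_weight * identity

fake_logits = torch.randn(2, 1)
sym, orig = torch.rand(2, 3, 64, 64), torch.rand(2, 3, 64, 64)
print(joint_loss(fake_logits, sym, orig))
```

The identity term is what keeps the symmetrized face recognizably the same person while the adversarial term pushes it toward realism.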
Collapse
Affiliation(s)
- Shu Zhang
- Ocean University of China, Qingdao, Shandong, China
| | - Ting Wang
- Shandong University of Science and Technology, Qingdao, Shandong, China
| | - Yanjun Peng
- Shandong University of Science and Technology, Qingdao, Shandong, China
| | - Junyu Dong
- Ocean University of China, Qingdao, Shandong, China
| |
Collapse
|