1
Jung D, Kang M, Park SH, Guezzi N, Yu J. Unsupervised speckle noise reduction technique for clinical ultrasound imaging. Ultrasonography 2024;43:327-344. [PMID: 39155463] [PMCID: PMC11374585] [DOI: 10.14366/usg.24005]
Abstract
PURPOSE Deep learning-based image enhancement has significant potential in the field of ultrasound image processing, as it can accurately model complicated nonlinear artifacts and noise, such as ultrasonic speckle patterns. However, acquiring clean, noise-free reference images for training deep learning networks presents significant challenges. This study introduces an unsupervised deep learning framework, termed speckle-to-speckle (S2S), designed for speckle and noise suppression. This framework can complete its training without the need for clean (speckle-free) reference images. METHODS The proposed network leverages statistical reasoning to train mutually on pairs of in vivo images, each with distinct speckle patterns and noise. It then infers speckle- and noise-free images without needing clean reference images. This approach significantly reduces the time, cost, and effort experts need to invest in annotating reference images manually. RESULTS The experimental results demonstrated that the proposed approach outperformed existing techniques in terms of the signal-to-noise ratio, contrast-to-noise ratio, structural similarity index, edge preservation index, and processing time (up to 86 times faster). It also performed excellently on images obtained from ultrasound scanners other than the ones used in this work. CONCLUSION S2S demonstrates the potential of employing an unsupervised learning-based technique in medical imaging applications, where acquiring a ground truth reference is challenging.
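For readers unfamiliar with training a denoiser without clean targets, the following is a minimal, generic Noise2Noise-style sketch of that idea. It is not the authors' S2S implementation; the tiny network, loss, and random stand-in data are illustrative assumptions only.

```python
# Generic sketch: train a denoiser on pairs of noisy realizations, no clean target.
# NOT the authors' S2S code; network, loss, and data pipeline are placeholders.
import torch
import torch.nn as nn

class TinyDenoiser(nn.Module):
    """A deliberately small CNN standing in for the real denoising network."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )
    def forward(self, x):
        return self.net(x)

model = TinyDenoiser()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

def train_step(speckled_a, speckled_b):
    """One update: predict one speckled realization of the scene from the other."""
    optimizer.zero_grad()
    pred = model(speckled_a)          # denoised estimate from view A
    loss = loss_fn(pred, speckled_b)  # supervised only by the other noisy view
    loss.backward()
    optimizer.step()
    return loss.item()

# Random stand-in batch (replace with paired ultrasound frames of the same anatomy).
a, b = torch.rand(4, 1, 64, 64), torch.rand(4, 1, 64, 64)
print(train_step(a, b))
```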
Affiliation(s)
- Dongkyu Jung
- Department of Robotics and Mechatronics Engineering, Daegu Gyeongbuk Institute of Science and Technology, Daegu, Korea
- Myeongkyun Kang
- Department of Robotics and Mechatronics Engineering, Daegu Gyeongbuk Institute of Science and Technology, Daegu, Korea
- Sang Hyun Park
- Department of Robotics and Mechatronics Engineering, Daegu Gyeongbuk Institute of Science and Technology, Daegu, Korea
- Nizar Guezzi
- Department of Robotics and Mechatronics Engineering, Daegu Gyeongbuk Institute of Science and Technology, Daegu, Korea
- Jaesok Yu
- Department of Robotics and Mechatronics Engineering, Daegu Gyeongbuk Institute of Science and Technology, Daegu, Korea
- The Interdisciplinary Studies of Artificial Intelligence, Daegu Gyeongbuk Institute of Science and Technology, Daegu, Korea
2
Wu S, Kurugol S, Kleinman PK, Ecklund K, Walters M, Connolly SA, Johnston P, Tsai A. Deep generative model of the distal tibial classic metaphyseal lesion in infants: assessment of synthetic images. Radiology Advances 2024;1:umae018. [PMID: 39171131] [PMCID: PMC11335364] [DOI: 10.1093/radadv/umae018]
Abstract
Background The classic metaphyseal lesion (CML) is a distinctive fracture highly specific to infant abuse. To increase the size and diversity of the training CML database for automated deep-learning detection of this fracture, we developed a mask conditional diffusion model (MaC-DM) to generate synthetic images with and without CMLs. Purpose To objectively and subjectively assess the synthetic radiographic images with and without CMLs generated by MaC-DM. Materials and Methods For retrospective testing, we randomly chose 100 real images (50 normals and 50 with CMLs; 39 infants, male = 22, female = 17; mean age = 4.1 months; SD = 3.1 months) from an existing distal tibia dataset (177 normal, 73 with CMLs), and generated 100 synthetic distal tibia images via MaC-DM (50 normals and 50 with CMLs). These test images were shown to 3 blinded radiologists. In the first session, radiologists determined if the images were normal or had CMLs. In the second session, they determined if the images were real or synthetic. We analyzed the radiologists' interpretations and employed t-distributed stochastic neighbor embedding technique to analyze the data distribution of the test images. Results When presented with the 200 images (100 synthetic, 100 with CMLs), radiologists reliably and accurately diagnosed CMLs (kappa = 0.90, 95% CI = [0.88-0.92]; accuracy = 92%, 95% CI = [89-97]). However, they were inaccurate in differentiating between real and synthetic images (kappa = 0.05, 95% CI = [0.03-0.07]; accuracy = 53%, 95% CI = [49-59]). The t-distributed stochastic neighbor embedding analysis showed substantial differences in the data distribution between normal images and those with CMLs (area under the curve = 0.996, 95% CI = [0.992-1.000], P < .01), but minor differences between real and synthetic images (area under the curve = 0.566, 95% CI = [0.486-0.647], P = .11). Conclusion Radiologists accurately diagnosed images with distal tibial CMLs but were unable to distinguish real from synthetically generated ones, indicating that our generative model could synthesize realistic images. Thus, MaC-DM holds promise as an effective strategy for data augmentation in training machine-learning models for diagnosis of distal tibial CMLs.
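As a rough illustration of the kind of embedding analysis described above (not the authors' actual pipeline), the sketch below projects placeholder feature vectors with t-SNE and scores how separable the real and synthetic groups are; the feature extractor, labels, and classifier are assumptions.

```python
# Generic sketch: 2-D t-SNE embedding of image features plus a simple separability score.
import numpy as np
from sklearn.manifold import TSNE
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
features = rng.normal(size=(200, 128))     # stand-in image feature vectors
is_real = np.array([1] * 100 + [0] * 100)  # 1 = real image, 0 = synthetic

# Two-dimensional t-SNE embedding of the feature vectors.
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(features)

# One simple separability score: AUC of a linear classifier on the embedding.
clf = LogisticRegression().fit(embedding, is_real)
auc = roc_auc_score(is_real, clf.predict_proba(embedding)[:, 1])
print(f"real-vs-synthetic separability AUC: {auc:.3f}")  # ~0.5 means indistinguishable
```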
Affiliation(s)
- Shaoju Wu
- Department of Radiology, Boston Children’s Hospital, Harvard Medical School, Boston, MA 02115, United States
- Sila Kurugol
- Department of Radiology, Boston Children’s Hospital, Harvard Medical School, Boston, MA 02115, United States
- Paul K Kleinman
- Department of Radiology, Boston Children’s Hospital, Harvard Medical School, Boston, MA 02115, United States
- Kirsten Ecklund
- Department of Radiology, Boston Children’s Hospital, Harvard Medical School, Boston, MA 02115, United States
- Michele Walters
- Department of Radiology, Boston Children’s Hospital, Harvard Medical School, Boston, MA 02115, United States
- Susan A Connolly
- Department of Radiology, Boston Children’s Hospital, Harvard Medical School, Boston, MA 02115, United States
- Patrick Johnston
- Department of Radiology, Boston Children’s Hospital, Harvard Medical School, Boston, MA 02115, United States
- Andy Tsai
- Department of Radiology, Boston Children’s Hospital, Harvard Medical School, Boston, MA 02115, United States
3
Kamran SA, Hossain KF, Ong J, Waisberg E, Zaman N, Baker SA, Lee AG, Tavakkoli A. FA4SANS-GAN: A Novel Machine Learning Generative Adversarial Network to Further Understand Ophthalmic Changes in Spaceflight Associated Neuro-Ocular Syndrome (SANS). Ophthalmology Science 2024;4:100493. [PMID: 38682031] [PMCID: PMC11046204] [DOI: 10.1016/j.xops.2024.100493]
Abstract
Purpose To provide an automated system for synthesizing fluorescein angiography (FA) images from color fundus photographs, averting the risks associated with fluorescein dye, and to extend its future application to the detection of spaceflight associated neuro-ocular syndrome (SANS) in spaceflight, where resources are limited. Design Development and validation of a novel conditional generative adversarial network (GAN) trained on a limited amount of FA and color fundus images from diabetic retinopathy and control cases. Participants Paired color fundus and FA images from unique patients were collected from a publicly available study. Methods FA4SANS-GAN was trained to generate FA images from color fundus photographs using 2 multiscale generators coupled with 2 patch-GAN discriminators. Eight hundred fifty color fundus and FA images were utilized for training by augmenting images from 17 unique patients. The model was evaluated on 56 fluorescein images collected from 14 unique patients. In addition, it was compared with 3 other GAN architectures trained on the same data set. Furthermore, the robustness of the models was tested against acquisition noise, along with their ability to retain structural information when artificially created biological markers were introduced. Main Outcome Measures For GAN synthesis, the Fréchet Inception Distance (FID) and Kernel Inception Distance (KID) metrics; for statistical significance, two one-sided tests (TOST) based on Welch's t test. Results On test FA images, the mean FID for FA4SANS-GAN was 39.8 (standard deviation, 9.9), which is better than the GANgio model's mean of 43.2 (standard deviation, 13.7), Pix2PixHD's mean of 57.3 (standard deviation, 11.5), and Pix2Pix's mean of 67.5 (standard deviation, 11.7). Similarly for KID, FA4SANS-GAN achieved a mean of 0.00278 (standard deviation, 0.00167), which is better than the other 3 models' mean KIDs of 0.00303 (standard deviation, 0.00216), 0.00609 (standard deviation, 0.00238), and 0.00784 (standard deviation, 0.00218). In the TOST analysis, FA4SANS-GAN's advantage was statistically significant versus GANgio (P = 0.006); versus Pix2PixHD (P < 0.00001); and versus Pix2Pix (P < 0.00001). Conclusions Our study has shown FA4SANS-GAN to be statistically significantly better on 2 GAN synthesis metrics. Moreover, it is robust against acquisition noise and can retain clear biological markers compared with the other 3 GAN architectures. Deployment of this model could be crucial on the International Space Station for detecting SANS. Financial Disclosures The authors have no proprietary or commercial interest in any materials discussed in this article.
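The FID metric used above can be computed from Inception feature statistics. The following is a hedged sketch with low-dimensional random stand-in features (real FID uses 2048-dimensional Inception pool features), not the study's code.

```python
# Sketch of the Fréchet Inception Distance given feature vectors for real and generated images.
import numpy as np
from scipy.linalg import sqrtm

def fid(feats_real, feats_fake):
    """FID = ||mu_r - mu_f||^2 + Tr(C_r + C_f - 2 (C_r C_f)^(1/2))."""
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_f = np.cov(feats_fake, rowvar=False)
    covmean = sqrtm(cov_r @ cov_f)
    if np.iscomplexobj(covmean):          # discard tiny numerical imaginary parts
        covmean = covmean.real
    diff = mu_r - mu_f
    return diff @ diff + np.trace(cov_r + cov_f - 2.0 * covmean)

rng = np.random.default_rng(0)
real = rng.normal(size=(56, 64))   # stand-in features of real FA images (64-D for speed)
fake = rng.normal(size=(56, 64))   # stand-in features of synthesized FA images
print(f"FID: {fid(real, fake):.2f}")
```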
Affiliation(s)
- Sharif Amit Kamran
- Human-Machine Perception Laboratory, Department of Computer Science and Engineering, University of Nevada, Reno, Reno, Nevada
- Khondker Fariha Hossain
- Human-Machine Perception Laboratory, Department of Computer Science and Engineering, University of Nevada, Reno, Reno, Nevada
- Joshua Ong
- Department of Ophthalmology and Visual Sciences, University of Michigan Kellogg Eye Center, Ann Arbor, Michigan
- Ethan Waisberg
- Department of Ophthalmology, University College Dublin School of Medicine, Belfield, Dublin, Ireland
- Nasif Zaman
- Human-Machine Perception Laboratory, Department of Computer Science and Engineering, University of Nevada, Reno, Reno, Nevada
- Salah A. Baker
- Department of Physiology and Cell Biology, University of Nevada School of Medicine, Reno, Nevada
- Andrew G. Lee
- Center for Space Medicine, Baylor College of Medicine, Houston, Texas
- Department of Ophthalmology, Blanton Eye Institute, Houston Methodist Hospital, Houston, Texas
- Houston Methodist Research Institute, Houston Methodist Hospital, Houston, Texas
- Departments of Ophthalmology, Neurology, and Neurosurgery, Weill Cornell Medicine, New York, New York
- Department of Ophthalmology, University of Texas Medical Branch, Galveston, Texas
- Department of Ophthalmology, University of Texas MD Anderson Cancer Center, Houston, Texas
- Department of Ophthalmology, Texas A&M College of Medicine, Texas
- Department of Ophthalmology, The University of Iowa Hospitals and Clinics, Iowa City, Iowa
- Alireza Tavakkoli
- Human-Machine Perception Laboratory, Department of Computer Science and Engineering, University of Nevada, Reno, Reno, Nevada
4
Dai H, Wang J, Zhong Q, Chen T, Liu H, Zhang X, Lu R. A GAN-based anomaly detector using multi-feature fusion and selection. Sci Rep 2024;14:5259. [PMID: 38438429] [PMCID: PMC11222451] [DOI: 10.1038/s41598-024-52378-9]
Abstract
In numerous applications, abnormal samples are hard to collect, limiting the use of well-established supervised learning methods. GAN-based models trained in an unsupervised manner on a single feature set have been proposed that simultaneously consider the reconstruction error and the latent-space deviation between normal and abnormal samples. However, their ability to capture the input distribution of each feature set is limited. Hence, we propose an unsupervised, multi-feature model, Wave-GANomaly, trained only on normal samples to learn their distribution. The model predicts whether a given sample is normal or not by its deviation from the distribution of normal samples. Wave-GANomaly fuses and selects from the wave-based features extracted by the WaveBlock module and the convolution-based features. The WaveBlock has been shown to improve performance on image classification, object detection, and segmentation tasks. As a result, Wave-GANomaly achieves the best average area under the curve (AUC) on the Canadian Institute for Advanced Research (CIFAR)-10 dataset (94.3%) and on the Modified National Institute of Standards and Technology (MNIST) dataset (91.0%) when compared to existing state-of-the-art anomaly detectors such as GANomaly, Skip-GANomaly, and the skip-attention generative adversarial network (SAGAN). We further verify our method on a self-curated real-world dataset; the results show that it outperforms GANomaly, which uses only a single feature set for training.
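A generic sketch of a GANomaly-style anomaly score of the kind described above, combining reconstruction error with latent-space deviation and evaluating detection with ROC AUC; the weighting and the stand-in arrays are assumptions, not the paper's implementation.

```python
# Sketch: weighted reconstruction + latent deviation anomaly score, evaluated with AUC.
import numpy as np
from sklearn.metrics import roc_auc_score

def anomaly_score(x, x_rec, z, z_rec, lam=0.9):
    """Per-sample weighted sum of image reconstruction error and latent deviation."""
    rec_err = np.mean(np.abs(x - x_rec), axis=(1, 2, 3))   # image-space L1 error
    lat_err = np.mean((z - z_rec) ** 2, axis=1)            # latent-space L2 error
    return lam * rec_err + (1.0 - lam) * lat_err

rng = np.random.default_rng(0)
x = rng.random((100, 1, 32, 32)); x_rec = x + 0.05 * rng.random(x.shape)   # inputs vs. reconstructions
z = rng.random((100, 64));        z_rec = z + 0.05 * rng.random(z.shape)   # encoded vs. re-encoded latents
labels = rng.integers(0, 2, size=100)        # 1 = anomalous, 0 = normal (stand-in labels)
scores = anomaly_score(x, x_rec, z, z_rec)
print("AUC:", roc_auc_score(labels, scores))
```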
Affiliation(s)
- Huafeng Dai
- Tsinghua University, Beijing, China
- LCFC (Hefei) Electronics Technology Co., Ltd., Hefei, Anhui, China
- Hefei LCFC Information Technology Co., Ltd., Hefei, Anhui, China
- Jyunrong Wang
- LCFC (Hefei) Electronics Technology Co., Ltd., Hefei, Anhui, China
- Hefei LCFC Information Technology Co., Ltd., Hefei, Anhui, China
- Hefei University of Technology, Hefei, Anhui, China
- Quan Zhong
- LCFC (Hefei) Electronics Technology Co., Ltd., Hefei, Anhui, China
- Hefei LCFC Information Technology Co., Ltd., Hefei, Anhui, China
- Taogen Chen
- LCFC (Hefei) Electronics Technology Co., Ltd., Hefei, Anhui, China
- Hefei LCFC Information Technology Co., Ltd., Hefei, Anhui, China
- Hao Liu
- LCFC (Hefei) Electronics Technology Co., Ltd., Hefei, Anhui, China
- Hefei LCFC Information Technology Co., Ltd., Hefei, Anhui, China
- Xuegang Zhang
- LCFC (Hefei) Electronics Technology Co., Ltd., Hefei, Anhui, China
- Hefei LCFC Information Technology Co., Ltd., Hefei, Anhui, China
- Rongsheng Lu
- Hefei University of Technology, Hefei, Anhui, China
5
Castellano G, Esposito A, Lella E, Montanaro G, Vessio G. Automated detection of Alzheimer's disease: a multi-modal approach with 3D MRI and amyloid PET. Sci Rep 2024;14:5210. [PMID: 38433282] [PMCID: PMC10909869] [DOI: 10.1038/s41598-024-56001-9]
Abstract
Recent advances in deep learning and imaging technologies have revolutionized automated medical image analysis, especially in diagnosing Alzheimer's disease through neuroimaging. Despite the availability of various imaging modalities for the same patient, the development of multi-modal models leveraging these modalities remains underexplored. This paper addresses this gap by proposing and evaluating classification models using 2D and 3D MRI images and amyloid PET scans in uni-modal and multi-modal frameworks. Our findings demonstrate that models using volumetric data learn more effective representations than those using only 2D images. Furthermore, integrating multiple modalities enhances model performance over single-modality approaches significantly. We achieved state-of-the-art performance on the OASIS-3 cohort. Additionally, explainability analyses with Grad-CAM indicate that our model focuses on crucial AD-related regions for its predictions, underscoring its potential to aid in understanding the disease's causes.
Affiliation(s)
- Andrea Esposito
- Department of Computer Science, University of Bari Aldo Moro, Bari, Italy
- Eufemia Lella
- Sirio - Research & Innovation, Sidea Group, Bari, Italy
- Gennaro Vessio
- Department of Computer Science, University of Bari Aldo Moro, Bari, Italy
6
Kim K, Cho K, Jang R, Kyung S, Lee S, Ham S, Choi E, Hong GS, Kim N. Updated Primer on Generative Artificial Intelligence and Large Language Models in Medical Imaging for Medical Professionals. Korean J Radiol 2024;25:224-242. [PMID: 38413108] [PMCID: PMC10912493] [DOI: 10.3348/kjr.2023.0818]
Abstract
The emergence of Chat Generative Pre-trained Transformer (ChatGPT), a chatbot developed by OpenAI, has garnered interest in the application of generative artificial intelligence (AI) models in the medical field. This review summarizes different generative AI models and their potential applications in the field of medicine and explores the evolving landscape of Generative Adversarial Networks and diffusion models since the introduction of generative AI models. These models have made valuable contributions to the field of radiology. Furthermore, this review also explores the significance of synthetic data in addressing privacy concerns and augmenting data diversity and quality within the medical domain, in addition to emphasizing the role of inversion in the investigation of generative models and outlining an approach to replicate this process. We provide an overview of Large Language Models, such as GPTs and bidirectional encoder representations (BERTs), that focus on prominent representatives and discuss recent initiatives involving language-vision models in radiology, including innovative large language and vision assistant for biomedicine (LLaVa-Med), to illustrate their practical application. This comprehensive review offers insights into the wide-ranging applications of generative AI models in clinical research and emphasizes their transformative potential.
Affiliation(s)
- Kiduk Kim
- Department of Convergence Medicine, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Republic of Korea
- Kyungjin Cho
- Department of Biomedical Engineering, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Sunggu Kyung
- Department of Biomedical Engineering, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Soyoung Lee
- Department of Biomedical Engineering, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Sungwon Ham
- Healthcare Readiness Institute for Unified Korea, Korea University Ansan Hospital, Korea University College of Medicine, Ansan, Republic of Korea
- Edward Choi
- Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
- Gil-Sun Hong
- Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Republic of Korea
- Namkug Kim
- Department of Convergence Medicine, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Republic of Korea
- Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Republic of Korea
7
Kim J, Li Y, Shin BS. Volumetric Imitation Generative Adversarial Networks for Anatomical Human Body Modeling. Bioengineering (Basel) 2024;11:163. [PMID: 38391649] [PMCID: PMC10886047] [DOI: 10.3390/bioengineering11020163]
Abstract
Volumetric representation is a technique used to express 3D objects in various fields, such as medical applications. On the other hand, tomography images for reconstructing volumetric data have limited utilization because they contain personal information. Existing GAN-based medical image generation techniques can produce virtual tomographic images for volume reconstruction while preserving the patient's privacy. Nevertheless, these images often do not consider vertical correlations between the adjacent slices, leading to erroneous results in 3D reconstruction. Furthermore, while volume generation techniques have been introduced, they often focus on surface modeling, making it challenging to represent the internal anatomical features accurately. This paper proposes volumetric imitation GAN (VI-GAN), which imitates a human anatomical model to generate volumetric data. The primary goal of this model is to capture the attributes and 3D structure, including the external shape, internal slices, and the relationship between the vertical slices of the human anatomical model. The proposed network consists of a generator for feature extraction and up-sampling based on a 3D U-Net and ResNet structure and a 3D-convolution-based LFFB (local feature fusion block). In addition, a discriminator utilizes 3D convolution to evaluate the authenticity of the generated volume compared to the ground truth. VI-GAN also devises reconstruction loss, including feature and similarity losses, to converge the generated volumetric data into a human anatomical model. In this experiment, the CT data of 234 people were used to assess the reliability of the results. When using volume evaluation metrics to measure similarity, VI-GAN generated a volume that realistically represented the human anatomical model compared to existing volume generation methods.
Affiliation(s)
- Jion Kim
- Department of Electrical and Computer Engineering, Inha University, Incheon 22212, Republic of Korea
- Yan Li
- Department of Electrical and Computer Engineering, Inha University, Incheon 22212, Republic of Korea
- Byeong-Seok Shin
- Department of Electrical and Computer Engineering, Inha University, Incheon 22212, Republic of Korea
8
Khosravi P, Mohammadi S, Zahiri F, Khodarahmi M, Zahiri J. AI-Enhanced Detection of Clinically Relevant Structural and Functional Anomalies in MRI: Traversing the Landscape of Conventional to Explainable Approaches. J Magn Reson Imaging 2024. [PMID: 38243677] [DOI: 10.1002/jmri.29247]
Abstract
Anomaly detection in medical imaging, particularly within the realm of magnetic resonance imaging (MRI), stands as a vital area of research with far-reaching implications across various medical fields. This review meticulously examines the integration of artificial intelligence (AI) in anomaly detection for MR images, spotlighting its transformative impact on medical diagnostics. We delve into the forefront of AI applications in MRI, exploring advanced machine learning (ML) and deep learning (DL) methodologies that are pivotal in enhancing the precision of diagnostic processes. The review provides a detailed analysis of preprocessing, feature extraction, classification, and segmentation techniques, alongside a comprehensive evaluation of commonly used metrics. Further, this paper explores the latest developments in ensemble methods and explainable AI, offering insights into future directions and potential breakthroughs. This review synthesizes current insights, offering a valuable guide for researchers, clinicians, and medical imaging experts. It highlights AI's crucial role in improving the precision and speed of detecting key structural and functional irregularities in MRI. Our exploration of innovative techniques and trends furthers MRI technology development, aiming to refine diagnostics, tailor treatments, and elevate patient care outcomes. LEVEL OF EVIDENCE: 5 TECHNICAL EFFICACY: Stage 1.
Affiliation(s)
- Pegah Khosravi
- Department of Biological Sciences, New York City College of Technology, CUNY, New York City, New York, USA
- The CUNY Graduate Center, City University of New York, New York City, New York, USA
- Saber Mohammadi
- Department of Biological Sciences, New York City College of Technology, CUNY, New York City, New York, USA
- Department of Biophysics, Tarbiat Modares University, Tehran, Iran
- Fatemeh Zahiri
- Department of Cell and Molecular Sciences, Kharazmi University, Tehran, Iran
- Javad Zahiri
- Department of Neuroscience, University of California San Diego, San Diego, California, USA
9
Klarák J, Andok R, Malík P, Kuric I, Ritomský M, Klačková I, Tsai HY. From Anomaly Detection to Defect Classification. Sensors (Basel) 2024;24:429. [PMID: 38257523] [PMCID: PMC10821230] [DOI: 10.3390/s24020429]
Abstract
This paper proposes a new approach to defect detection system design focused on exact damaged areas, demonstrated through visual data containing gear wheel images. The main advantage of the system is the capability to detect a wide range of defect patterns occurring in datasets. The methodology is built on three processes that combine different approaches from unsupervised and supervised methods. The first step is a search for anomalies, performed by defining the correct areas on the controlled object using the autoencoder approach. As a result, the differences between the original and autoencoder-generated images are obtained. These are divided into clusters using the clustering method (DBSCAN). Based on the clusters, the regions of interest are subsequently defined and classified using the pre-trained Xception network classifier. The main result is a system capable of focusing on exact defect areas using a sequence of unsupervised learning (autoencoder), unsupervised learning (clustering), and supervised learning (classification) methods (U2S-CNN). On the tested samples, there were 177 detected regions against 205 actual damaged areas; 108 regions were detected correctly, and 69 regions were labeled incorrectly. This paper describes a proof of concept for defect detection by highlighting exact defect areas. It can thus be an alternative to using detectors such as YOLO methods, reconstructors, autoencoders, transformers, etc.
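The middle, clustering step of the pipeline described above can be illustrated as follows; the threshold and DBSCAN parameters are placeholders rather than the authors' settings.

```python
# Sketch: cluster pixels where the autoencoder reconstruction differs from the input,
# so that each cluster becomes a candidate defect region for the later classifier.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
original = rng.random((128, 128))
reconstructed = original.copy()
reconstructed[40:50, 60:72] += 0.8        # simulate an area the autoencoder cannot reproduce

diff = np.abs(original - reconstructed)
ys, xs = np.nonzero(diff > 0.5)           # pixels with a large reconstruction error
coords = np.column_stack([ys, xs])

labels = DBSCAN(eps=3, min_samples=5).fit_predict(coords)
for cluster_id in set(labels) - {-1}:     # -1 marks DBSCAN noise points
    pts = coords[labels == cluster_id]
    y0, x0 = pts.min(axis=0); y1, x1 = pts.max(axis=0)
    print(f"candidate region {cluster_id}: rows {y0}-{y1}, cols {x0}-{x1}")
```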
Affiliation(s)
- Jaromír Klarák
- Institute of Informatics, Slovak Academy of Sciences, 845 07 Bratislava, Slovakia; (R.A.); (P.M.); (M.R.)
- Robert Andok
- Institute of Informatics, Slovak Academy of Sciences, 845 07 Bratislava, Slovakia; (R.A.); (P.M.); (M.R.)
- Peter Malík
- Institute of Informatics, Slovak Academy of Sciences, 845 07 Bratislava, Slovakia; (R.A.); (P.M.); (M.R.)
- Ivan Kuric
- Department of Automation and Production Systems, Faculty of Mechanical Engineering, University of Zilina, 010 26 Zilina, Slovakia; (I.K.); (I.K.)
- Mário Ritomský
- Institute of Informatics, Slovak Academy of Sciences, 845 07 Bratislava, Slovakia; (R.A.); (P.M.); (M.R.)
- Ivana Klačková
- Department of Automation and Production Systems, Faculty of Mechanical Engineering, University of Zilina, 010 26 Zilina, Slovakia; (I.K.); (I.K.)
- Hung-Yin Tsai
- Department of Power Mechanical Engineering, National Tsing Hua University, Hsinchu 30013, Taiwan
10
Siddiquee MMR, Shah J, Wu T, Chong C, Schwedt TJ, Dumkrieger G, Nikolova S, Li B. Brainomaly: Unsupervised Neurologic Disease Detection Utilizing Unannotated T1-weighted Brain MR Images. IEEE Winter Conference on Applications of Computer Vision (WACV) 2024;2024:7558-7567. [PMID: 38720667] [PMCID: PMC11078334] [DOI: 10.1109/wacv57701.2024.00740]
Abstract
Harnessing the power of deep neural networks in the medical imaging domain is challenging due to the difficulties in acquiring large annotated datasets, especially for rare diseases, which involve high costs, time, and effort for annotation. Unsupervised disease detection methods, such as anomaly detection, can significantly reduce human effort in these scenarios. While anomaly detection typically focuses on learning from images of healthy subjects only, real-world situations often present unannotated datasets with a mixture of healthy and diseased subjects. Recent studies have demonstrated that utilizing such unannotated images can improve unsupervised disease and anomaly detection. However, these methods do not utilize knowledge specific to registered neuroimages, resulting in a subpar performance in neurologic disease detection. To address this limitation, we propose Brainomaly, a GAN-based image-to-image translation method specifically designed for neurologic disease detection. Brainomaly not only offers tailored image-to-image translation suitable for neuroimages but also leverages unannotated mixed images to achieve superior neurologic disease detection. Additionally, we address the issue of model selection for inference without annotated samples by proposing a pseudo-AUC metric, further enhancing Brainomaly's detection performance. Extensive experiments and ablation studies demonstrate that Brainomaly outperforms existing state-of-the-art unsupervised disease and anomaly detection methods by significant margins in Alzheimer's disease detection using a publicly available dataset and headache detection using an institutional dataset. The code is available from https://github.com/mahfuzmohammad/Brainomaly.
Affiliation(s)
- Jay Shah
- Arizona State University
- ASU-Mayo Center for Innovative Imaging
- Teresa Wu
- Arizona State University
- ASU-Mayo Center for Innovative Imaging
- Baoxin Li
- Arizona State University
- ASU-Mayo Center for Innovative Imaging
11
Hong GS, Jang M, Kyung S, Cho K, Jeong J, Lee GY, Shin K, Kim KD, Ryu SM, Seo JB, Lee SM, Kim N. Overcoming the Challenges in the Development and Implementation of Artificial Intelligence in Radiology: A Comprehensive Review of Solutions Beyond Supervised Learning. Korean J Radiol 2023;24:1061-1080. [PMID: 37724586] [PMCID: PMC10613849] [DOI: 10.3348/kjr.2023.0393]
Abstract
Artificial intelligence (AI) in radiology is a rapidly developing field with several prospective clinical studies demonstrating its benefits in clinical practice. In 2022, the Korean Society of Radiology held a forum to discuss the challenges and drawbacks in AI development and implementation. Various barriers hinder the successful application and widespread adoption of AI in radiology, such as limited annotated data, data privacy and security, data heterogeneity, imbalanced data, model interpretability, overfitting, and integration with clinical workflows. In this review, some of the various possible solutions to these challenges are presented and discussed; these include training with longitudinal and multimodal datasets, dense training with multitask learning and multimodal learning, self-supervised contrastive learning, various image modifications and syntheses using generative models, explainable AI, causal learning, federated learning with large data models, and digital twins.
Affiliation(s)
- Gil-Sun Hong
- Department of Radiology and Research Institute of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Miso Jang
- Department of Convergence Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Sunggu Kyung
- Department of Biomedical Engineering, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Kyungjin Cho
- Department of Convergence Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Department of Biomedical Engineering, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Jiheon Jeong
- Department of Convergence Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Grace Yoojin Lee
- Department of Convergence Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Keewon Shin
- Laboratory for Biosignal Analysis and Perioperative Outcome Research, Biomedical Engineering Center, Asan Institute of Lifesciences, Asan Medical Center, Seoul, Republic of Korea
- Ki Duk Kim
- Department of Convergence Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Seung Min Ryu
- Department of Orthopedic Surgery, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Joon Beom Seo
- Department of Radiology and Research Institute of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Sang Min Lee
- Department of Radiology and Research Institute of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Namkug Kim
- Department of Radiology and Research Institute of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Department of Convergence Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
12
Shin K, Lee JS, Lee JY, Lee H, Kim J, Byeon JS, Jung HY, Kim DH, Kim N. An Image Turing Test on Realistic Gastroscopy Images Generated by Using the Progressive Growing of Generative Adversarial Networks. J Digit Imaging 2023;36:1760-1769. [PMID: 36914855] [PMCID: PMC10406771] [DOI: 10.1007/s10278-023-00803-2]
Abstract
Generative adversarial networks (GAN) in medicine are valuable techniques for augmenting unbalanced rare data, anomaly detection, and avoiding patient privacy issues. However, there have been limits to generating high-quality endoscopic images with various characteristics, such as peristalsis, viewpoints, light sources, and mucous patterns. This study used the progressive growing of GAN (PGGAN) on a dataset of normal images to confirm its ability to generate high-quality gastrointestinal images and investigated what barriers PGGAN faces in generating endoscopic images. We trained the PGGAN with 107,060 gastroscopy images from 4165 normal patients to generate highly realistic 512 × 512-pixel images. For the evaluation, visual Turing tests were conducted on 100 real and 100 synthetic images, in which 19 endoscopists judged the authenticity of each image. The endoscopists were divided into three groups based on their years of clinical experience for subgroup analysis. The overall accuracy, sensitivity, and specificity of the 19 endoscopists were 61.3%, 70.3%, and 52.4%, respectively. The mean accuracy of the three endoscopist groups was 62.4% (Group I), 59.8% (Group II), and 59.1% (Group III), which was not considered a significant difference. There were no statistically significant differences according to the location in the stomach. However, the real images containing the anatomical landmark pylorus had higher detection sensitivity. The images generated by PGGAN showed highly realistic depictions that were difficult to distinguish, regardless of the endoscopists' expertise. However, it was necessary to establish GANs that could better represent the rugal folds and mucous membrane texture.
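The reader-study statistics quoted above (accuracy, sensitivity, specificity, kappa) can be computed from response vectors as in the sketch below; the responses here are fabricated stand-ins, and kappa is shown against ground truth rather than between readers.

```python
# Sketch: visual Turing test metrics from one reader's real-vs-synthetic judgments.
import numpy as np
from sklearn.metrics import cohen_kappa_score, confusion_matrix

rng = np.random.default_rng(0)
truth = np.array([1] * 100 + [0] * 100)                       # 1 = real image, 0 = synthetic
reader = np.where(rng.random(200) < 0.6, truth, 1 - truth)    # an imperfect simulated reader

tn, fp, fn, tp = confusion_matrix(truth, reader).ravel()
sensitivity = tp / (tp + fn)          # real images correctly judged as real
specificity = tn / (tn + fp)          # synthetic images correctly judged as synthetic
accuracy = (tp + tn) / truth.size
kappa = cohen_kappa_score(truth, reader)   # chance-corrected agreement with ground truth
print(f"acc={accuracy:.2f} sens={sensitivity:.2f} spec={specificity:.2f} kappa={kappa:.2f}")
```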
Affiliation(s)
- Keewon Shin
- Biomedical Engineering Research Center, Asan Medical Center, Seoul, Republic of Korea
- Jung Su Lee
- Department of Gastroenterology, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Republic of Korea
- Seoul Samsung Internal Medicine Clinic, Seoul, Republic of Korea
- Ji Young Lee
- Department of Health Screening and Promotion Center, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Republic of Korea
- Hyunsu Lee
- Department of Medical Informatics, Keimyung University School of Medicine, Daegu, Republic of Korea
- Jeongseok Kim
- Department of Internal Medicine, Keimyung University School of Medicine, Daegu, Republic of Korea
- Jeong-Sik Byeon
- Department of Gastroenterology, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Republic of Korea
- Hwoon-Yong Jung
- Department of Gastroenterology, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Republic of Korea
- Do Hoon Kim
- Department of Gastroenterology, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Republic of Korea
- Namkug Kim
- Biomedical Engineering Research Center, Asan Medical Center, Seoul, Republic of Korea
- Department of Convergence Medicine, University of Ulsan College of Medicine & Asan Medical Center, Seoul, Republic of Korea
13
Mostafa AM, Zakariah M, Aldakheel EA. Brain Tumor Segmentation Using Deep Learning on MRI Images. Diagnostics (Basel) 2023;13:1562. [PMID: 37174953] [PMCID: PMC10177460] [DOI: 10.3390/diagnostics13091562]
Abstract
Brain tumor (BT) diagnosis is a lengthy process, and great skill and expertise are required from radiologists. As the number of patients has expanded, so has the amount of data to be processed, making previous techniques both costly and ineffective. Many academics have examined a range of reliable and quick techniques for identifying and categorizing BTs. Recently, deep learning (DL) methods have gained popularity for creating computer algorithms that can quickly and reliably diagnose or segment BTs. To identify BTs in medical images, DL permits a pre-trained convolutional neural network (CNN) model. The suggested magnetic resonance imaging (MRI) images of BTs are included in the BT segmentation dataset, which was created as a benchmark for developing and evaluating algorithms for BT segmentation and diagnosis. There are 335 annotated MRI images in the collection. For the purpose of developing and testing BT segmentation and diagnosis algorithms, the brain tumor segmentation (BraTS) dataset was produced. A deep CNN was also utilized in the model-building process for segmenting BTs using the BraTS dataset. To train the model, a categorical cross-entropy loss function and an optimizer, such as Adam, were employed. Finally, the model's output successfully identified and segmented BTs in the dataset, attaining a validation accuracy of 98%.
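A minimal sketch of the stated training configuration (categorical cross-entropy loss with the Adam optimizer) on a toy per-pixel classifier; the architecture below is a placeholder, not the paper's network.

```python
# Sketch: compiling a toy segmentation CNN with categorical cross-entropy and Adam.
import tensorflow as tf
from tensorflow.keras import layers, models

num_classes = 2  # e.g., background vs. tumor
model = models.Sequential([
    layers.Conv2D(16, 3, padding="same", activation="relu", input_shape=(128, 128, 1)),
    layers.Conv2D(16, 3, padding="same", activation="relu"),
    layers.Conv2D(num_classes, 1, activation="softmax"),   # per-pixel class probabilities
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss="categorical_crossentropy",   # expects one-hot masks of shape (H, W, num_classes)
    metrics=["accuracy"],
)
model.summary()
```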
Affiliation(s)
- Almetwally M Mostafa
- Department of Information Systems, College of Computer and Information Sciences, King Saud University, P.O. Box 51178, Riyadh 11543, Saudi Arabia
- Mohammed Zakariah
- Department of Computer Science, College of Computer and Information Science, King Saud University, P.O. Box 51178, Riyadh 11543, Saudi Arabia
- Eman Abdullah Aldakheel
- Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
14
Chung YW, Choi IY. Detection of abnormal extraocular muscles in small datasets of computed tomography images using a three-dimensional variational autoencoder. Sci Rep 2023;13:1765. [PMID: 36720904] [PMCID: PMC9889739] [DOI: 10.1038/s41598-023-28082-5]
Abstract
We sought to establish an unsupervised algorithm with a three-dimensional (3D) variational autoencoder model (VAE) for the detection of abnormal extraocular muscles in small datasets of orbital computed tomography (CT) images. 334 CT images of normal orbits and 96 of abnormal orbits diagnosed as thyroid eye disease were used for training and validation; 24 normal and 11 abnormal orbits were used for the test. A 3D VAE was developed and trained. All images were preprocessed to emphasize extraocular muscles and to suppress background noise (e.g., high signal intensity from bones). The optimal cut-off value was identified through receiver operating characteristic (ROC) curve analysis. The ability of the model to detect muscles of abnormal size was assessed by visualization. The model achieved a sensitivity of 79.2%, specificity of 72.7%, accuracy of 77.1%, F1-score of 0.667, and AUROC of 0.801. Abnormal CT images correctly identified by the model showed differences in the reconstruction of extraocular muscles. The proposed model showed potential to detect abnormalities in extraocular muscles using a small dataset, similar to the diagnostic approach used by physicians. Unsupervised learning could serve as an alternative detection method for medical imaging studies in which annotation is difficult or impossible to perform.
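Choosing an operating cut-off from a reconstruction-error score via ROC analysis, as described above, can be sketched as follows (Youden's J on stand-in scores, not the study's data).

```python
# Sketch: ROC-based cut-off selection for a VAE-style reconstruction-error anomaly score.
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(0)
labels = np.array([0] * 24 + [1] * 11)                    # 0 = normal orbit, 1 = abnormal (stand-in)
scores = np.concatenate([rng.normal(0.20, 0.05, 24),      # normals reconstruct well (low error)
                         rng.normal(0.35, 0.08, 11)])     # abnormal muscles reconstruct poorly

fpr, tpr, thresholds = roc_curve(labels, scores)
best = np.argmax(tpr - fpr)                               # Youden's J statistic picks the cut-off
print(f"AUROC={roc_auc_score(labels, scores):.3f}, "
      f"cut-off={thresholds[best]:.3f}, sens={tpr[best]:.2f}, spec={1 - fpr[best]:.2f}")
```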
Affiliation(s)
- Yeon Woong Chung
- Department of Ophthalmology and Visual Science, St. Vincent's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea
- Department of Medical Informatics, College of Medicine, The Catholic University of Korea, Banpo Dae-Ro 222, Seoul, 06591, Republic of Korea
- In Young Choi
- Department of Medical Informatics, College of Medicine, The Catholic University of Korea, Banpo Dae-Ro 222, Seoul, 06591, Republic of Korea
15
Contrastive Learning with Dynamic Weighting and Jigsaw Augmentation for Brain Tumor Classification in MRI. Neural Process Lett 2023. [DOI: 10.1007/s11063-022-11108-w]
16
Optimization of facial skin temperature-based anomaly detection model considering diurnal variation. Artificial Life and Robotics 2023. [DOI: 10.1007/s10015-023-00853-3]
17
Wei Y, Yang M, Xu L, Liu M, Zhang F, Xie T, Cheng X, Wang X, Che F, Li Q, Xu Q, Huang Z, Liu M. Novel Computed-Tomography-Based Transformer Models for the Noninvasive Prediction of PD-1 in Pre-Operative Settings. Cancers (Basel) 2023;15:658. [PMID: 36765615] [PMCID: PMC9913645] [DOI: 10.3390/cancers15030658]
Abstract
The expression status of programmed cell death protein 1 (PD-1) in patients with hepatocellular carcinoma (HCC) is associated with the response to PD-1/PD-L1 checkpoint blockade treatment. Thus, accurately and preoperatively identifying PD-1 status has great clinical implications for constructing personalized treatment strategies. To investigate the preoperative predictive value of a transformer-based model for identifying PD-1 expression status, 93 HCC patients were included, with 75 in the training cohort (2859 images) and 18 in the testing cohort (670 images). We propose a transformer-based network architecture, ResTransNet, that efficiently employs convolutional neural networks (CNNs) and self-attention mechanisms to automatically acquire a persuasive feature, from which a nonlinear classifier obtains a prediction score. The area under the curve, receiver operating characteristic curve, and decision curves were applied to evaluate the prediction model's performance. Then, Kaplan-Meier survival analyses were applied to evaluate overall survival (OS) and recurrence-free survival (RFS) in PD-1-positive and PD-1-negative patients. The proposed transformer-based model obtained an accuracy of 88.2% with a sensitivity of 88.5%, a specificity of 88.9%, and an area under the curve of 91.1% in the testing cohort.
Affiliation(s)
- Yi Wei
- Department of Radiology, West China Hospital, Sichuan University, Chengdu 610000, China
- Meiyi Yang
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610000, China
- Lifeng Xu
- Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People’s Hospital, Quzhou 324000, China
- Minghui Liu
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610000, China
- Feng Zhang
- Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People’s Hospital, Quzhou 324000, China
- Tianshu Xie
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610000, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China
- Xuan Cheng
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610000, China
- Xiaomin Wang
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610000, China
- Feng Che
- Department of Radiology, West China Hospital, Sichuan University, Chengdu 610000, China
- Qian Li
- Department of Radiology, West China Hospital, Sichuan University, Chengdu 610000, China
- Qing Xu
- Institute of Clinical Pathology, West China Hospital, Sichuan University, Chengdu 610000, China
- Zixing Huang
- Department of Radiology, West China Hospital, Sichuan University, Chengdu 610000, China
- Correspondence: (Z.H.); (M.L.)
- Ming Liu
- Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People’s Hospital, Quzhou 324000, China
- Correspondence: (Z.H.); (M.L.)
18
Kolte S, Bhowmik N, Dhiraj. Threat Object-based anomaly detection in X-ray images using GAN-based ensembles. Neural Comput Appl 2022;35:1-16. [PMID: 36532881] [PMCID: PMC9734403] [DOI: 10.1007/s00521-022-08029-z]
Abstract
The problem of detecting dangerous or prohibited objects in luggage is a very important step in the implementation of security setups at airports, banks, government buildings, etc. At present, the most common techniques for detecting such dangerous objects are intelligent data analysis algorithms, such as deep learning applied to X-ray imaging, or employing a human workforce to infer the presence of these threat objects in the obtained X-ray images. One of the major challenges when using deep-learning methods to detect such objects is the lack of high-quality threat image data containing the "dangerous" objects (objects of interest) relative to the non-threat image data in practical scenarios. To tackle this data scarcity problem, anomaly detection techniques using normal data samples have shown great promise. Also, among the available deep learning strategies for anomaly detection in computer vision applications, generative adversarial networks have achieved state-of-the-art results. Considering these insights, we adopted a recently proposed architecture known as Skip-GANomaly and devised a modified version of it using a UNet++-style generator, which performed better than Skip-GANomaly, achieving an AUC of 94.94% on Compass-XP, a public X-ray dataset. Finally, for better latent space exploration, we combine these two architectures into an ensemble, which gives another boost to performance, achieving an AUC of 96.8% on the same Compass-XP dataset. To further validate the effectiveness of the ensemble-based architecture, its performance was tested with patch-based training data on a subset of randomly chosen images from another large public X-ray dataset named SIXray, obtaining an AUC of 75.3% on this reduced dataset. To demonstrate the prowess of the discriminator and to bring some explainability to the working of our ensemble, we used Uniform Manifold Approximation and Projection to plot the latent-space vectors for the dangerous and non-dangerous objects of the test set; this analysis indicates that the ensemble learns better features for separating the anomalous class from the non-anomalous class than the individual architectures do. Thus, our proposed architecture provides state-of-the-art results for threat object detection. Most importantly, our models are able to detect threat objects without ever being trained on images containing threat objects.
Affiliation(s)
- Shreyas Kolte
- Birla Institute of Technology and Science, Pilani, India
- Dhiraj
- CSIR-CEERI, Pilani, India
19
A Systematic Literature Review on Applications of GAN-Synthesized Images for Brain MRI. Future Internet 2022. [DOI: 10.3390/fi14120351]
Abstract
With the advances in brain imaging, magnetic resonance imaging (MRI) is evolving as a popular radiological tool in clinical diagnosis. Deep learning (DL) methods can detect abnormalities in brain images without an extensive manual feature extraction process. Generative adversarial network (GAN)-synthesized images have many applications in this field besides augmentation, such as image translation, registration, super-resolution, denoising, motion correction, segmentation, reconstruction, and contrast enhancement. The existing literature was reviewed systematically to understand the role of GAN-synthesized dummy images in brain disease diagnosis. Web of Science and Scopus databases were extensively searched to find relevant studies from the last 6 years to write this systematic literature review (SLR). Predefined inclusion and exclusion criteria helped in filtering the search results. Data extraction is based on related research questions (RQ). This SLR identifies various loss functions used in the above applications and software to process brain MRIs. A comparative study of existing evaluation metrics for GAN-synthesized images helps choose the proper metric for an application. GAN-synthesized images will have a crucial role in the clinical sector in the coming years, and this paper gives a baseline for other researchers in the field.
20
Kim E, Cho HH, Kwon J, Oh YT, Ko ES, Park H. Tumor-Attentive Segmentation-Guided GAN for Synthesizing Breast Contrast-Enhanced MRI Without Contrast Agents. IEEE J Transl Eng Health Med 2022;11:32-43. [PMID: 36478773] [PMCID: PMC9721354] [DOI: 10.1109/jtehm.2022.3221918]
Abstract
OBJECTIVE Breast dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) is a sensitive imaging technique critical for breast cancer diagnosis. However, the administration of contrast agents poses a potential risk. This can be avoided if contrast-enhanced MRI can be obtained without using contrast agents. Thus, we aimed to generate T1-weighted contrast-enhanced MRI (ceT1) images from pre-contrast T1 weighted MRI (preT1) images in the breast. METHODS We proposed a generative adversarial network to synthesize ceT1 from preT1 breast images that adopted a local discriminator and segmentation task network to focus specifically on the tumor region in addition to the whole breast. The segmentation network performed a related task of segmentation of the tumor region, which allowed important tumor-related information to be enhanced. In addition, edge maps were included to provide explicit shape and structural information. Our approach was evaluated and compared with other methods in the local (n = 306) and external validation (n = 140) cohorts. Four evaluation metrics of normalized mean squared error (NRMSE), Pearson cross-correlation coefficients (CC), peak signal-to-noise ratio (PSNR), and structural similarity index map (SSIM) for the whole breast and tumor region were measured. An ablation study was performed to evaluate the incremental benefits of various components in our approach. RESULTS Our approach performed the best with an NRMSE 25.65, PSNR 54.80 dB, SSIM 0.91, and CC 0.88 on average, in the local test set. CONCLUSION Performance gains were replicated in the validation cohort. SIGNIFICANCE We hope that our method will help patients avoid potentially harmful contrast agents. Clinical and Translational Impact Statement-Contrast agents are necessary to obtain DCE-MRI which is essential in breast cancer diagnosis. However, administration of contrast agents may cause side effects such as nephrogenic systemic fibrosis and risk of toxic residue deposits. Our approach can generate DCE-MRI without contrast agents using a generative deep neural network. Thus, our approach could help patients avoid potentially harmful contrast agents resulting in an improved diagnosis and treatment workflow for breast cancer.
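The four similarity measures reported above can be computed for a synthesized/reference image pair as in this sketch; the images here are random stand-ins, not the study's data.

```python
# Sketch: NRMSE, PSNR, SSIM, and Pearson CC between a reference and a synthesized image.
import numpy as np
from scipy.stats import pearsonr
from skimage.metrics import (normalized_root_mse, peak_signal_noise_ratio,
                             structural_similarity)

rng = np.random.default_rng(0)
reference = rng.random((256, 256))                                        # real ceT1 slice (stand-in)
synthesized = np.clip(reference + 0.05 * rng.random((256, 256)), 0, 1)    # generated slice (stand-in)

nrmse = normalized_root_mse(reference, synthesized)
psnr = peak_signal_noise_ratio(reference, synthesized, data_range=1.0)
ssim = structural_similarity(reference, synthesized, data_range=1.0)
cc, _ = pearsonr(reference.ravel(), synthesized.ravel())
print(f"NRMSE={nrmse:.3f} PSNR={psnr:.1f} dB SSIM={ssim:.3f} CC={cc:.3f}")
```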
Affiliation(s)
- Eunjin Kim
- Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon 16419, South Korea
- Hwan-Ho Cho
- Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon 16419, South Korea
- Department of Medical Artificial Intelligence, Konyang University, Daejeon 35365, South Korea
- Junmo Kwon
- Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon 16419, South Korea
- Young-Tack Oh
- Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon 16419, South Korea
- Eun Sook Ko
- Samsung Medical Center, Department of Radiology, School of Medicine, Sungkyunkwan University, Seoul 06351, South Korea
- Hyunjin Park
- School of Electronic and Electrical Engineering, Sungkyunkwan University, Suwon 16419, South Korea
- Center for Neuroscience Imaging Research, Institute for Basic Science, Suwon 16419, South Korea
| |
Collapse
|
21
|
Yang M, He X, Xu L, Liu M, Deng J, Cheng X, Wei Y, Li Q, Wan S, Zhang F, Wu L, Wang X, Song B, Liu M. CT-based transformer model for non-invasively predicting the Fuhrman nuclear grade of clear cell renal cell carcinoma. Front Oncol 2022; 12:961779. [PMID: 36249050 PMCID: PMC9555088 DOI: 10.3389/fonc.2022.961779] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/05/2022] [Accepted: 08/29/2022] [Indexed: 11/13/2022] Open
Abstract
Background Clear cell renal cell carcinoma (ccRCC) is the most common malignant tumor in the urinary system and the predominant subtype of malignant renal tumors, with high mortality. Biopsy is the main examination used to determine ccRCC grade, but it can lead to unavoidable complications and sampling bias. Therefore, non-invasive technology (e.g., CT examination) for ccRCC grading is attracting increasing attention. However, noisy labels exist in CT images, which may contain multiple grades but carry only a single label, making prediction difficult. Aim We proposed a Transformer-based deep learning algorithm using CT images to improve the diagnostic accuracy of ccRCC grading. Methods We collected patients with pathologically proven ccRCC diagnosed from April 2010 to December 2018 as the training and internal test dataset, containing 759 patients. We propose a transformer-based network architecture that efficiently employs convolutional neural networks (CNNs) and self-attention mechanisms to extract persuasive features automatically, followed by a nonlinear classifier. We integrate different training models to improve the accuracy and robustness of the model. The average classification accuracy, sensitivity, specificity, and area under the curve (AUC) are used as indicators to evaluate model quality, and various current deep learning algorithms are run in the comparative experiments to show the advantages of the proposed method. Results The mean accuracy, sensitivity, specificity, and AUC achieved by the CNN were 82.3%, 89.4%, 83.2%, and 85.7%, respectively. In contrast, the proposed Transformer-based model obtains a mean accuracy of 87.1% with a sensitivity of 91.3%, a specificity of 85.3%, and an AUC of 90.3%. The integrated model acquires a better performance (86.5% accuracy and an AUC of 91.2%). Conclusion A transformer-based network performs better than traditional deep learning algorithms in terms of the accuracy of ccRCC prediction. Meanwhile, the transformer has a certain advantage in dealing with noisy labels in CT images of ccRCC. This method is promising for application to other medical tasks (e.g., the grading of neurogliomas and meningiomas).
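The CNN-plus-self-attention design described in this abstract can be illustrated with a small PyTorch sketch; the layer sizes, token pooling, and two-class head below are assumptions for illustration, not the published architecture.

```python
import torch
import torch.nn as nn

class ConvTransformerGrader(nn.Module):
    """Illustrative CNN + self-attention classifier for binary tumor grading."""
    def __init__(self, n_classes=2, dim=128):
        super().__init__()
        # Convolutional stem extracts local texture features from a CT patch
        self.stem = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Self-attention layers model long-range dependencies between locations
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, n_classes)  # classification head

    def forward(self, x):                      # x: (B, 1, H, W)
        f = self.stem(x)                       # (B, dim, H/4, W/4)
        tokens = f.flatten(2).transpose(1, 2)  # (B, N, dim) sequence of patches
        tokens = self.encoder(tokens)
        return self.head(tokens.mean(dim=1))   # average-pool tokens, classify

logits = ConvTransformerGrader()(torch.randn(2, 1, 64, 64))
```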
Collapse
Affiliation(s)
- Meiyi Yang
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
| | - Xiaopeng He
- Department of Radiology, Affiliated Hospital of Southwest Medical University, Luzhou, China
| | - Lifeng Xu
- Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People’s Hospital, Quzhou, China
| | - Minghui Liu
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
| | - Jiali Deng
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
| | - Xuan Cheng
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
| | - Yi Wei
- Department of Radiology, West China Hospital, Sichuan University, Chengdu, China
| | - Qian Li
- Department of Radiology, West China Hospital, Sichuan University, Chengdu, China
| | - Shang Wan
- Department of Radiology, West China Hospital, Sichuan University, Chengdu, China
| | - Feng Zhang
- Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People’s Hospital, Quzhou, China
| | - Lei Wu
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
| | - Xiaomin Wang
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
| | - Bin Song
- Department of Radiology, West China Hospital, Sichuan University, Chengdu, China
- *Correspondence: Ming Liu; Bin Song
| | - Ming Liu
- Quzhou Affiliated Hospital of Wenzhou Medical University, Quzhou People’s Hospital, Quzhou, China
- *Correspondence: Ming Liu; Bin Song
| |
Collapse
|
22
|
Reyana A, Kautish S, Yahia IS, Mohamed AW. MTEDS: Multivariant Time Series-Based Encoder-Decoder System for Anomaly Detection. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:4728063. [PMID: 36211006 PMCID: PMC9534607 DOI: 10.1155/2022/4728063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/19/2022] [Accepted: 08/27/2022] [Indexed: 11/17/2022]
Abstract
Intrusion detection systems examine the computer or network for potential security vulnerabilities. Time series data are real-valued, and the nature of the data influences the type of anomaly detection. Network anomalies are operations that deviate from the norm; they can cause a wide range of device malfunctions, overloads, and network intrusions, disrupting the network's normal operation and services. The paper proposes a new multivariant time series-based encoder-decoder system for dealing with anomalies in time series data with multiple variables. A radical loss function is defined to update network weights via backpropagation, and anomaly scores are used to evaluate performance. According to the findings, the anomaly score is more stable and traceable, with fewer false positives and negatives. The proposed system's efficiency is compared with three existing approaches: the Multi-Scale Convolutional Recurrent Encoder-Decoder, the Autoregressive Moving Average model, and the Long Short-Term Memory Encoder-Decoder. The results show that the proposed technique achieves the highest precision of 1 at a noise level of 0.2, and it maintains greater precision at noise factors of 0.25, 0.3, 0.35, and 0.4, demonstrating its effectiveness.
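A minimal sketch of the encoder-decoder idea behind such systems, assuming an LSTM autoencoder and a mean-squared reconstruction error as the anomaly score (the paper's actual loss and architecture details are not reproduced here):

```python
import torch
import torch.nn as nn

class SeqAutoencoder(nn.Module):
    """Sketch of a multivariate time-series encoder-decoder; sizes are assumed."""
    def __init__(self, n_features=5, hidden=32):
        super().__init__()
        self.encoder = nn.LSTM(n_features, hidden, batch_first=True)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_features)

    def forward(self, x):                       # x: (B, T, n_features)
        _, (h, _) = self.encoder(x)             # summarize the window
        z = h[-1].unsqueeze(1).repeat(1, x.size(1), 1)  # repeat latent per step
        dec, _ = self.decoder(z)
        return self.out(dec)                    # reconstructed window

def anomaly_score(model, x):
    """Per-window anomaly score: mean squared reconstruction error."""
    with torch.no_grad():
        recon = model(x)
    return ((recon - x) ** 2).mean(dim=(1, 2))  # higher = more anomalous

scores = anomaly_score(SeqAutoencoder(), torch.randn(4, 100, 5))
```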
Collapse
Affiliation(s)
- A. Reyana
- Department of Computer Science and Engineering, Karunya Institute of Technology and Sciences, Coimbatore, Tamilnadu, India
| | - Sandeep Kautish
- Department of Computer Science and Engineering, LBEF Campus, Kathmandu, Nepal
| | - I. S. Yahia
- Department of Physics, College of Science, King Khalid University, P.O. Box 9004, Abha 61413, Saudi Arabia
- Research Center for Advanced Materials Science (RCAMS), King Khalid University, P.O. Box 9004, Abha 61413, Saudi Arabia
- Department of Physics, Faculty of Education, Ain Shams University, Roxy, Cairo 11757, Egypt
| | - Ali Wagdy Mohamed
- Operations Research Department, Faculty of Graduate Studies for Statistical Research, Cairo University, Giza 12613, Egypt
- Department of Mathematics and Actuarial Science School of Sciences Engineering, The American University in Cairo, Cairo 11835, Egypt
| |
Collapse
|
23
|
Zhao J, Hou X, Pan M, Zhang H. Attention-based generative adversarial network in medical imaging: A narrative review. Comput Biol Med 2022; 149:105948. [PMID: 35994931 DOI: 10.1016/j.compbiomed.2022.105948] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/17/2022] [Revised: 07/24/2022] [Accepted: 08/06/2022] [Indexed: 11/18/2022]
Abstract
As a popular probabilistic generative model, the generative adversarial network (GAN) has been successfully used not only in natural image processing, but also in medical image analysis and computer-aided diagnosis. Despite the various advantages, the applications of GAN in medical image analysis face new challenges. The introduction of attention mechanisms, which resemble the human visual system that focuses on the task-related local image area for certain information extraction, has drawn increasing interest. Recently proposed transformer-based architectures that leverage the self-attention mechanism encode long-range dependencies and learn highly expressive representations. This motivates us to summarize the applications of transformer-based GANs for medical image analysis. We reviewed recent advances in techniques combining various attention modules with different adversarial training schemes, and their applications in medical segmentation, synthesis and detection. Several recent studies have shown that attention modules can be effectively incorporated into a GAN model to detect lesion areas and precisely extract diagnosis-related feature information, thus providing a useful tool for medical image processing and diagnosis. This review indicates that research on GANs and attention mechanisms for medical image analysis is still at an early stage despite the great potential. We highlight that the attention-based generative adversarial network is an efficient and promising computational model for advancing future research and applications in medical image analysis.
Collapse
Affiliation(s)
- Jing Zhao
- School of Engineering Medicine, Beihang University, Beijing, 100191, China; School of Biological Science and Medical Engineering, Beihang University, Beijing, 100191, China
| | - Xiaoyuan Hou
- School of Engineering Medicine, Beihang University, Beijing, 100191, China; School of Biological Science and Medical Engineering, Beihang University, Beijing, 100191, China
| | - Meiqing Pan
- School of Engineering Medicine, Beihang University, Beijing, 100191, China; School of Biological Science and Medical Engineering, Beihang University, Beijing, 100191, China
| | - Hui Zhang
- School of Engineering Medicine, Beihang University, Beijing, 100191, China; Key Laboratory of Biomechanics and Mechanobiology (Beihang University), Ministry of Education, Beijing, 100191, China; Key Laboratory of Big Data-Based Precision Medicine (Beihang University), Ministry of Industry and Information Technology of the People's Republic of China, Beijing, 100191, China.
| |
Collapse
|
24
|
Sheng K, Offersen CM, Middleton J, Carlsen JF, Truelsen TC, Pai A, Johansen J, Nielsen MB. Automated Identification of Multiple Findings on Brain MRI for Improving Scan Acquisition and Interpretation Workflows: A Systematic Review. Diagnostics (Basel) 2022; 12:diagnostics12081878. [PMID: 36010228 PMCID: PMC9406456 DOI: 10.3390/diagnostics12081878] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2022] [Revised: 07/30/2022] [Accepted: 08/01/2022] [Indexed: 11/16/2022] Open
Abstract
We conducted a systematic review of the current status of machine learning (ML) algorithms’ ability to identify multiple brain diseases, and we evaluated their applicability for improving existing scan acquisition and interpretation workflows. PubMed Medline, Ovid Embase, Scopus, Web of Science, and IEEE Xplore literature databases were searched for relevant studies published between January 2017 and February 2022. The quality of the included studies was assessed using the Quality Assessment of Diagnostic Accuracy Studies 2 tool. The applicability of ML algorithms for successful workflow improvement was qualitatively assessed based on the satisfaction of three clinical requirements. A total of 19 studies were included for qualitative synthesis. The included studies performed classification tasks (n = 12) and segmentation tasks (n = 7). For classification algorithms, the area under the receiver operating characteristic curve (AUC) ranged from 0.765 to 0.997, while accuracy, sensitivity, and specificity ranged from 80% to 100%, 72% to 100%, and 65% to 100%, respectively. For segmentation algorithms, the Dice coefficient ranged from 0.300 to 0.912. No studies satisfied all clinical requirements for successful workflow improvements due to key limitations pertaining to the study’s design, study data, reference standards, and performance reporting. Standardized reporting guidelines tailored for ML in radiology, prospective study designs, and multi-site testing could help alleviate this.
Collapse
Affiliation(s)
- Kaining Sheng
- Department of Radiology, Copenhagen University Hospital Rigshospitalet, 2100 Copenhagen, Denmark; (C.M.O.); (J.F.C.); (A.P.); (M.B.N.)
- Department of Clinical Medicine, University of Copenhagen, 2200 Copenhagen, Denmark;
- Correspondence:
| | - Cecilie Mørck Offersen
- Department of Radiology, Copenhagen University Hospital Rigshospitalet, 2100 Copenhagen, Denmark; (C.M.O.); (J.F.C.); (A.P.); (M.B.N.)
- Department of Clinical Medicine, University of Copenhagen, 2200 Copenhagen, Denmark;
| | - Jon Middleton
- Department of Computer Science, University of Copenhagen, 2200 Copenhagen, Denmark; (J.M.); (J.J.)
- Cerebriu A/S, 1127 Copenhagen, Denmark
| | - Jonathan Frederik Carlsen
- Department of Radiology, Copenhagen University Hospital Rigshospitalet, 2100 Copenhagen, Denmark; (C.M.O.); (J.F.C.); (A.P.); (M.B.N.)
- Department of Clinical Medicine, University of Copenhagen, 2200 Copenhagen, Denmark;
| | - Thomas Clement Truelsen
- Department of Clinical Medicine, University of Copenhagen, 2200 Copenhagen, Denmark;
- Department of Neurology, Copenhagen University Hospital Rigshospitalet, 2100 Copenhagen, Denmark
| | - Akshay Pai
- Department of Radiology, Copenhagen University Hospital Rigshospitalet, 2100 Copenhagen, Denmark; (C.M.O.); (J.F.C.); (A.P.); (M.B.N.)
- Cerebriu A/S, 1127 Copenhagen, Denmark
| | - Jacob Johansen
- Department of Computer Science, University of Copenhagen, 2200 Copenhagen, Denmark; (J.M.); (J.J.)
- Cerebriu A/S, 1127 Copenhagen, Denmark
| | - Michael Bachmann Nielsen
- Department of Radiology, Copenhagen University Hospital Rigshospitalet, 2100 Copenhagen, Denmark; (C.M.O.); (J.F.C.); (A.P.); (M.B.N.)
- Department of Clinical Medicine, University of Copenhagen, 2200 Copenhagen, Denmark;
| |
Collapse
|
25
|
Alrashedy HHN, Almansour AF, Ibrahim DM, Hammoudeh MAA. BrainGAN: Brain MRI Image Generation and Classification Framework Using GAN Architectures and CNN Models. SENSORS (BASEL, SWITZERLAND) 2022; 22:4297. [PMID: 35684918 PMCID: PMC9185441 DOI: 10.3390/s22114297] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/04/2022] [Revised: 06/02/2022] [Accepted: 06/03/2022] [Indexed: 02/01/2023]
Abstract
Deep learning models have been used in several domains; however, adjustments are still required before they can be applied in sensitive areas such as medical imaging, where the technology is needed under time constraints and the level of accuracy must assure trustworthiness. Because of privacy concerns, machine learning applications in the medical field often cannot access enough medical data; for example, the lack of brain MRI images makes it difficult to classify brain tumors using image-based classification. The solution to this challenge was achieved through the application of Generative Adversarial Network (GAN)-based augmentation techniques. Deep Convolutional GAN (DCGAN) and Vanilla GAN are two examples of GAN architectures used for image generation. In this paper, a framework, denoted as BrainGAN, for generating and classifying brain MRI images using GAN architectures and deep learning models was proposed. Consequently, this study proposed an automatic way to check that generated images are satisfactory. It uses three models: CNN, MobileNetV2, and ResNet152V2. The deep transfer models were trained with images generated by Vanilla GAN and DCGAN, and their performance was then evaluated on a test set composed of real brain MRI images. From the results of the experiment, it was found that the ResNet152V2 model outperformed the other two models. The ResNet152V2 achieved 99.09% accuracy, 99.12% precision, 99.08% recall, 99.51% area under the curve (AUC), and 0.196 loss based on the brain MRI images generated by the DCGAN architecture.
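A minimal DCGAN-style generator, of the kind such frameworks use for image synthesis, can be sketched as follows; the channel widths and 64x64 single-channel output are illustrative assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

class DCGANGenerator(nn.Module):
    """Minimal DCGAN-style generator producing 64x64 single-channel images."""
    def __init__(self, z_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, 256, 4, 1, 0), nn.BatchNorm2d(256), nn.ReLU(),
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, 2, 1), nn.Tanh(),   # 64x64 output
        )

    def forward(self, z):               # z: (B, z_dim, 1, 1) noise vector
        return self.net(z)

fake_mri = DCGANGenerator()(torch.randn(8, 100, 1, 1))   # (8, 1, 64, 64)
```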
Collapse
Affiliation(s)
- Halima Hamid N. Alrashedy
- Department of Information Technology, College of Computer Qassim University, Buraydah 51452, Saudi Arabia; (H.H.N.A.); (A.F.A.); (D.M.I.)
| | - Atheer Fahad Almansour
- Department of Information Technology, College of Computer Qassim University, Buraydah 51452, Saudi Arabia; (H.H.N.A.); (A.F.A.); (D.M.I.)
| | - Dina M. Ibrahim
- Department of Information Technology, College of Computer Qassim University, Buraydah 51452, Saudi Arabia; (H.H.N.A.); (A.F.A.); (D.M.I.)
- Computers and Control Engineering Department, Faculty of Engineering, Tanta University, Tanta 31733, Egypt
| | - Mohammad Ali A. Hammoudeh
- Department of Information Technology, College of Computer Qassim University, Buraydah 51452, Saudi Arabia; (H.H.N.A.); (A.F.A.); (D.M.I.)
| |
Collapse
|
26
|
Ali H, Biswas R, Ali F, Shah U, Alamgir A, Mousa O, Shah Z. The role of generative adversarial networks in brain MRI: a scoping review. Insights Imaging 2022; 13:98. [PMID: 35662369 PMCID: PMC9167371 DOI: 10.1186/s13244-022-01237-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2022] [Accepted: 05/11/2022] [Indexed: 11/23/2022] Open
Abstract
The performance of artificial intelligence (AI) for brain MRI can improve if enough data are made available. Generative adversarial networks (GANs) have shown great potential for generating synthetic MRI data that capture the distribution of real MRI. GANs are also popular for segmentation, noise removal, and super-resolution of brain MRI images. This scoping review aims to explore how GAN methods are being used on brain MRI data, as reported in the literature. The review describes the different applications of GANs for brain MRI, presents the most commonly used GAN architectures, and summarizes the publicly available brain MRI datasets for advancing the research and development of GAN-based approaches. This review followed the guidelines of PRISMA-ScR to perform the study search and selection. The search was conducted on five popular scientific databases. The screening and selection of studies were performed by two independent reviewers, followed by validation by a third reviewer. Finally, the data were synthesized using a narrative approach. This review included 139 studies out of 789 search results. The most common use case of GANs was the synthesis of brain MRI images for data augmentation. GANs were also used to segment brain tumors and translate healthy images to diseased images or CT to MRI and vice versa. The included studies showed that GANs could enhance the performance of AI methods used on brain MRI imaging data. However, more efforts are needed to translate GAN-based methods into clinical applications.
Collapse
Affiliation(s)
- Hazrat Ali
- College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, 34110, Doha, Qatar.
| | - Rafiul Biswas
- College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, 34110, Doha, Qatar
| | - Farida Ali
- College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, 34110, Doha, Qatar
| | - Uzair Shah
- College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, 34110, Doha, Qatar
| | - Asma Alamgir
- College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, 34110, Doha, Qatar
| | - Osama Mousa
- College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, 34110, Doha, Qatar
| | - Zubair Shah
- College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, 34110, Doha, Qatar.
| |
Collapse
|
27
|
Anomaly Detection in Multi-Host Environment Based on Federated Hypersphere Classifier. ELECTRONICS 2022. [DOI: 10.3390/electronics11101529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/07/2022]
Abstract
Detecting anomalous inputs is essential in many mission-critical systems in various domains, particularly cybersecurity. Deep neural network-based anomaly detection methods have been successful for such tasks thanks to recent advancements in deep learning technology. Nevertheless, the existing methods have considered somewhat idealized problems where it is enough to learn a single detector based on a single dataset. In this paper, we consider a more practical problem where multiple hosts in an organization collect their own input data, data sharing among the hosts is prohibited for security reasons, and only a few hosts have experienced abnormal inputs. Furthermore, the data distribution across the hosts can be skewed; for example, a particular type of input may be observed by only a limited subset of hosts. We propose the federated hypersphere classifier (FHC), a new anomaly detection method based on an improved hypersphere classifier suited to running in the federated learning framework to perform anomaly detection in such an environment. Our experiments with image and network intrusion detection datasets show that our method outperforms state-of-the-art anomaly detection methods trained in a host-wise fashion, by learning a consensus model as if we had accessed the input data from all hosts but without communicating such data.
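The core ingredients described here, a one-class hypersphere objective trained locally on each host plus server-side parameter averaging, can be sketched as follows; the toy encoder, random data, and single communication round are assumptions for illustration, not the FHC implementation.

```python
import copy
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Small feature encoder; the hypersphere is fitted in its output space."""
    def __init__(self, in_dim=20, out_dim=8):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU(),
                                 nn.Linear(32, out_dim))
    def forward(self, x):
        return self.net(x)

def hypersphere_loss(z, center):
    """One-class objective: pull normal embeddings toward a shared center."""
    return ((z - center) ** 2).sum(dim=1).mean()

def fed_avg(state_dicts):
    """FedAvg: average model parameters from all hosts (no raw data shared)."""
    avg = copy.deepcopy(state_dicts[0])
    for key in avg:
        avg[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(0)
    return avg

# One communication round over hosts' local (private) normal data, sketched:
center = torch.zeros(8)
hosts = [torch.randn(64, 20) for _ in range(3)]        # per-host local data
global_model, local_states = Encoder(), []
for data in hosts:
    local = copy.deepcopy(global_model)
    opt = torch.optim.SGD(local.parameters(), lr=1e-2)
    loss = hypersphere_loss(local(data), center)
    opt.zero_grad(); loss.backward(); opt.step()       # one local update step
    local_states.append(local.state_dict())
global_model.load_state_dict(fed_avg(local_states))    # aggregate on server
```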
Collapse
|
28
|
De Nardin A, Mishra P, Foresti GL, Piciarelli C. Masked Transformer for image Anomaly Localization. Int J Neural Syst 2022; 32:2250030. [DOI: 10.1142/s0129065722500307] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
|
29
|
Qu C, Zou Y, Ma Y, Chen Q, Luo J, Fan H, Jia Z, Gong Q, Chen T. Diagnostic Performance of Generative Adversarial Network-Based Deep Learning Methods for Alzheimer’s Disease: A Systematic Review and Meta-Analysis. Front Aging Neurosci 2022; 14:841696. [PMID: 35527734 PMCID: PMC9068970 DOI: 10.3389/fnagi.2022.841696] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/22/2021] [Accepted: 03/03/2022] [Indexed: 12/28/2022] Open
Abstract
Alzheimer’s disease (AD) is the most common form of dementia. Currently, only symptomatic management is available, and early diagnosis and intervention are crucial for AD treatment. As a recent deep learning strategy, generative adversarial networks (GANs) are expected to benefit AD diagnosis, but their performance remains to be verified. This study provided a systematic review on the application of the GAN-based deep learning method in the diagnosis of AD and conducted a meta-analysis to evaluate its diagnostic performance. A search of the following electronic databases was performed by two researchers independently in August 2021: MEDLINE (PubMed), Cochrane Library, EMBASE, and Web of Science. The Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool was applied to assess the quality of the included studies. The accuracy of the model applied in the diagnosis of AD was determined by calculating odds ratios (ORs) with 95% confidence intervals (CIs). A bivariate random-effects model was used to calculate the pooled sensitivity and specificity with their 95% CIs. Fourteen studies were included, 11 of which were included in the meta-analysis. The overall quality of the included studies was high according to the QUADAS-2 assessment. For the AD vs. cognitively normal (CN) classification, the GAN-based deep learning method exhibited better performance than the non-GAN method, with significantly higher accuracy (OR 1.425, 95% CI: 1.150–1.766, P = 0.001), pooled sensitivity (0.88 vs. 0.83), pooled specificity (0.93 vs. 0.89), and area under the curve (AUC) of the summary receiver operating characteristic curve (SROC) (0.96 vs. 0.93). For the progressing MCI (pMCI) vs. stable MCI (sMCI) classification, the GAN method exhibited no significant increase in the accuracy (OR 1.149, 95% CI: 0.878–1.505, P = 0.310) or the pooled sensitivity (0.66 vs. 0.66). The pooled specificity and AUC of the SROC in the GAN group were slightly higher than those in the non-GAN group (0.81 vs. 0.78 and 0.81 vs. 0.80, respectively). The present results suggested that the GAN-based deep learning method performed well in the task of AD vs. CN classification. However, the diagnostic performance of GAN in the task of pMCI vs. sMCI classification needs to be improved. Systematic Review Registration: [PROSPERO], Identifier: [CRD42021275294].
Collapse
Affiliation(s)
- Changxing Qu
- Huaxi MR Research Center (HMRRC), Department of Radiology, West China Hospital, Sichuan University, Chengdu, China
- State Key Laboratory of Oral Diseases, National Clinical Research Center for Oral Diseases, West China School of Stomatology, Sichuan University, Chengdu, China
| | - Yinxi Zou
- Huaxi MR Research Center (HMRRC), Department of Radiology, West China Hospital, Sichuan University, Chengdu, China
- West China School of Medicine, Sichuan University, Chengdu, China
| | - Yingqiao Ma
- Huaxi MR Research Center (HMRRC), Department of Radiology, West China Hospital, Sichuan University, Chengdu, China
| | - Qin Chen
- Department of Neurology, West China Hospital, Sichuan University, Chengdu, China
| | - Jiawei Luo
- West China Biomedical Big Data Center, West China Clinical Medical College of Sichuan University, Chengdu, China
| | - Huiyong Fan
- College of Education Science, Bohai University, Jinzhou, China
| | - Zhiyun Jia
- Huaxi MR Research Center (HMRRC), Department of Radiology, West China Hospital, Sichuan University, Chengdu, China
| | - Qiyong Gong
- Huaxi MR Research Center (HMRRC), Department of Radiology, West China Hospital, Sichuan University, Chengdu, China
- Department of Radiology, West China Xiamen Hospital of Sichuan University, Xiamen, China
- Qiyong Gong,
| | - Taolin Chen
- Huaxi MR Research Center (HMRRC), Department of Radiology, West China Hospital, Sichuan University, Chengdu, China
- Research Unit of Psychoradiology, Chinese Academy of Medical Sciences, Chengdu, China
- Functional and Molecular Imaging Key Laboratory of Sichuan Province, Department of Radiology, West China Hospital of Sichuan University, Chengdu, China
- *Correspondence: Taolin Chen,
| |
Collapse
|
30
|
Express Construction for GANs from Latent Representation to Data Distribution. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12083910] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/04/2023]
Abstract
Generative Adversarial Networks (GANs) are powerful generative models for numerous tasks and datasets. However, most of the existing models suffer from mode collapse. The most recent research indicates that the reason is that the optimal transportation map from random noise to the data distribution is discontinuous, while deep neural networks (DNNs) can only approximate continuous maps. The latent representation is a better raw material than random noise for constructing a transportation map to the data distribution: because it is a low-dimensional representation related to the data distribution, the construction procedure is more like an expansion than starting from scratch. In this way we can also search for more transportation maps with smoother transformations. Thus, we propose a new training methodology for GANs, named Express Construction, to search for more transportation maps and speed up training. The key idea is to train GANs in two independent phases that successively yield the latent representation and the data distribution. To this end, an Auto-Encoder is trained to map the real data into the latent space, and two pairs of generators and discriminators are used to produce them. To the best of our knowledge, we are the first to decompose the training procedure of GAN models into two simpler phases, thus tackling the mode collapse problem without much additional computational cost. We also provide theoretical steps toward understanding the training dynamics of this procedure and prove our assumptions. No extra hyper-parameters are used in the proposed method, which indicates that Express Construction can be used to train any GAN model. Extensive experiments are conducted to verify the performance of realistic image generation and the resistance to mode collapse. The results show that the proposed method is lightweight, effective, and less prone to mode collapse.
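A compressed sketch of the two-phase idea follows, assuming a single latent-space GAN stands in for the paper's two generator-discriminator pairs and using toy layer sizes and a single update step per phase.

```python
import torch
import torch.nn as nn

# Phase 1: train an auto-encoder so real data gets a low-dimensional latent code.
# Phase 2: train a GAN whose generator maps noise to latent codes; the decoder
# turns generated codes into data. (Sizes and single steps are illustrative.)
enc = nn.Sequential(nn.Linear(784, 64))
dec = nn.Sequential(nn.Linear(64, 784), nn.Tanh())
gen = nn.Sequential(nn.Linear(16, 64))                    # noise -> latent code
disc = nn.Sequential(nn.Linear(64, 1))                    # real vs fake latent
bce = nn.BCEWithLogitsLoss()

x = torch.randn(32, 784)                                  # stand-in data batch
# --- Phase 1: reconstruction only ---
ae_opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), 1e-3)
loss_ae = ((dec(enc(x)) - x) ** 2).mean()
ae_opt.zero_grad(); loss_ae.backward(); ae_opt.step()

# --- Phase 2: adversarial game played in the latent space only ---
g_opt = torch.optim.Adam(gen.parameters(), 1e-3)
d_opt = torch.optim.Adam(disc.parameters(), 1e-3)
z_real = enc(x).detach()                                  # latent codes of real data
z_fake = gen(torch.randn(32, 16))
loss_d = bce(disc(z_real), torch.ones(32, 1)) + \
         bce(disc(z_fake.detach()), torch.zeros(32, 1))
d_opt.zero_grad(); loss_d.backward(); d_opt.step()
loss_g = bce(disc(z_fake), torch.ones(32, 1))
g_opt.zero_grad(); loss_g.backward(); g_opt.step()

samples = dec(gen(torch.randn(8, 16)))                    # generated data points
```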
Collapse
|
31
|
Bi XA, Li L, Wang Z, Wang Y, Luo X, Xu L. IHGC-GAN: influence hypergraph convolutional generative adversarial network for risk prediction of late mild cognitive impairment based on imaging genetic data. Brief Bioinform 2022; 23:6554128. [PMID: 35348583 DOI: 10.1093/bib/bbac093] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2021] [Revised: 01/28/2022] [Accepted: 02/23/2022] [Indexed: 11/13/2022] Open
Abstract
Predicting disease progression in the initial stage to implement early intervention and treatment can effectively prevent the further deterioration of the condition. Traditional methods for medical data analysis usually fail to perform well because of their incapability for mining the correlation pattern of pathogenies. Therefore, many calculation methods have been excavated from the field of deep learning. In this study, we propose a novel method of influence hypergraph convolutional generative adversarial network (IHGC-GAN) for disease risk prediction. First, a hypergraph is constructed with genes and brain regions as nodes. Then, an influence transmission model is built to portray the associations between nodes and the transmission rule of disease information. Third, an IHGC-GAN method is constructed based on this model. This method innovatively combines the graph convolutional network (GCN) and GAN. The GCN is used as the generator in GAN to spread and update the lesion information of nodes in the brain region-gene hypergraph. Finally, the prediction accuracy of the method is improved by the mutual competition and repeated iteration between generator and discriminator. This method can not only capture the evolutionary pattern from early mild cognitive impairment (EMCI) to late MCI (LMCI) but also extract the pathogenic factors and predict the deterioration risk from EMCI to LMCI. The results on the two datasets indicate that the IHGC-GAN method has better prediction performance than the advanced methods in a variety of indicators.
Collapse
Affiliation(s)
- Xia-An Bi
- Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, and the College of Information Science and Engineering in Hunan Normal University, Changsha 410081, P.R. China
| | - Lou Li
- Department of Computing, School of Information Science and Engineering, Hunan Normal University, Changsha, China
| | - Zizheng Wang
- Department of Computing, School of Information Science and Engineering, Hunan Normal University, Changsha, China
| | - Yu Wang
- Department of Computing, School of Information Science and Engineering, Hunan Normal University, Changsha, China
| | - Xun Luo
- Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, and the College of Information Science and Engineering in Hunan Normal University, Changsha 410081, P.R. China
| | - Luyun Xu
- College of Business, Hunan Normal University, Changsha 410081, P.R. China
| |
Collapse
|
32
|
Generative Adversarial Networks in Brain Imaging: A Narrative Review. J Imaging 2022; 8:jimaging8040083. [PMID: 35448210 PMCID: PMC9028488 DOI: 10.3390/jimaging8040083] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2021] [Revised: 03/08/2022] [Accepted: 03/15/2022] [Indexed: 02/04/2023] Open
Abstract
Artificial intelligence (AI) is expected to have a major effect on radiology as it demonstrated remarkable progress in many clinical tasks, mostly regarding the detection, segmentation, classification, monitoring, and prediction of diseases. Generative Adversarial Networks have been proposed as one of the most exciting applications of deep learning in radiology. GANs are a new approach to deep learning that leverages adversarial learning to tackle a wide array of computer vision challenges. Brain radiology was one of the first fields where GANs found their application. In neuroradiology, indeed, GANs open unexplored scenarios, allowing new processes such as image-to-image and cross-modality synthesis, image reconstruction, image segmentation, image synthesis, data augmentation, disease progression models, and brain decoding. In this narrative review, we will provide an introduction to GANs in brain imaging, discussing the clinical potential of GANs, future clinical applications, as well as pitfalls that radiologists should be aware of.
Collapse
|
33
|
Huhtanen H, Nyman M, Mohsen T, Virkki A, Karlsson A, Hirvonen J. Automated detection of pulmonary embolism from CT-angiograms using deep learning. BMC Med Imaging 2022; 22:43. [PMID: 35282821 PMCID: PMC8919639 DOI: 10.1186/s12880-022-00763-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2021] [Accepted: 02/21/2022] [Indexed: 12/22/2022] Open
Abstract
Background The aim of this study was to develop and evaluate a deep neural network model in the automated detection of pulmonary embolism (PE) from computed tomography pulmonary angiograms (CTPAs) using only weakly labelled training data. Methods We developed a deep neural network model consisting of two parts: a convolutional neural network architecture called InceptionResNet V2 and a long short-term memory network to process whole CTPA stacks as sequences of slices. Two versions of the model were created using either chest X-rays (Model A) or natural images (Model B) as pre-training data. We retrospectively collected 600 CTPAs to use in training and validation and 200 CTPAs to use in testing. CTPAs were annotated only with binary labels on both stack- and slice-based levels. Performance of the models was evaluated with ROC and precision–recall curves, specificity, sensitivity, accuracy, as well as positive and negative predictive values. Results Both models performed well on both stack- and slice-based levels. On the stack-based level, Model A reached specificity and sensitivity of 93.5% and 86.6%, respectively, outperforming Model B slightly (specificity 90.7% and sensitivity 83.5%). However, the difference between their ROC AUC scores was not statistically significant (0.94 vs 0.91, p = 0.07). Conclusions We show that a deep learning model trained with a relatively small, weakly annotated dataset can achieve excellent performance results in detecting PE from CTPAs. Supplementary Information The online version contains supplementary material available at 10.1186/s12880-022-00763-z.
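The weakly supervised stack-level design (per-slice CNN features fed to a recurrent layer) can be sketched as follows; ResNet-18 stands in for the InceptionResNet V2 backbone, and all sizes are illustrative assumptions rather than the published configuration.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class StackClassifier(nn.Module):
    """Sketch: per-slice CNN features + LSTM over the slice sequence,
    supervised only by a stack-level (weak) binary label."""
    def __init__(self, hidden=256):
        super().__init__()
        cnn = resnet18(weights=None)
        cnn.fc = nn.Identity()                   # keep 512-d slice embeddings
        self.cnn = cnn
        self.lstm = nn.LSTM(512, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)         # one stack-level PE logit

    def forward(self, stack):                    # stack: (B, T, 3, H, W)
        b, t = stack.shape[:2]
        feats = self.cnn(stack.flatten(0, 1))    # (B*T, 512) slice features
        feats = feats.view(b, t, -1)
        _, (h, _) = self.lstm(feats)             # summarize the slice sequence
        return self.head(h[-1])

logits = StackClassifier()(torch.randn(1, 4, 3, 224, 224))
```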
Collapse
Affiliation(s)
- Heidi Huhtanen
- Department of Radiology, University of Turku and Turku University Hospital, Turku, Finland.
| | - Mikko Nyman
- Department of Radiology, University of Turku and Turku University Hospital, Turku, Finland
| | | | - Arho Virkki
- Auria Clinical Informatics, Turku University Hospital, Turku, Finland; Department of Mathematics and Statistics, University of Turku, Turku, Finland
| | - Antti Karlsson
- Auria Biobank, Turku University Hospital, University of Turku, Turku, Finland
| | - Jussi Hirvonen
- Department of Radiology, University of Turku and Turku University Hospital, Turku, Finland
| |
Collapse
|
34
|
Wang W, Tamhane A, Santos C, Rzasa JR, Clark JH, Canares TL, Unberath M. Pediatric Otoscopy Video Screening With Shift Contrastive Anomaly Detection. Front Digit Health 2022; 3:810427. [PMID: 35224535 PMCID: PMC8866874 DOI: 10.3389/fdgth.2021.810427] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2021] [Accepted: 12/28/2021] [Indexed: 11/13/2022] Open
Abstract
Ear-related concerns and symptoms represent the leading indication for seeking pediatric healthcare attention. Despite the high incidence of such encounters, the diagnostic process for commonly encountered diseases of the middle and external ear presents a significant challenge. Much of this challenge stems from the lack of cost-effective diagnostic testing, which means that the presence or absence of ear pathology must be determined clinically. Research has, however, demonstrated considerable variation among clinicians in their ability to accurately diagnose and consequently manage ear pathology. With recent advances in computer vision and machine learning, there is increasing interest in helping clinicians to accurately diagnose middle and external ear pathology with computer-aided systems. It has been shown that AI has the capacity to analyze a single clinical image captured during the examination of the ear canal and eardrum and determine the likelihood that a pathognomonic pattern for a specific diagnosis is present. The capture of such an image can, however, be challenging, especially for inexperienced clinicians. To help mitigate this technical challenge, we have developed and tested a method using video sequences. The videos were collected using a commercially available otoscope smartphone attachment in an urban, tertiary-care pediatric emergency department. We present a two-stage method that first identifies valid frames by detecting and extracting eardrum patches from the video sequence and, second, performs the proposed shift contrastive anomaly detection (SCAD) to flag the otoscopy video sequences as normal or abnormal. Our method achieves an AUROC of 88.0% on the patient level and also outperforms the average of a group of 25 clinicians in a comparative study, which is the largest such comparison published to date. We conclude that the presented method achieves a promising first step toward the automated analysis of otoscopy video.
Collapse
Affiliation(s)
- Weiyao Wang
- Department of Computer Science, Johns Hopkins University School of Engineering, Baltimore, MD, United States
- *Correspondence: Weiyao Wang
| | - Aniruddha Tamhane
- Department of Computer Science, Johns Hopkins University School of Engineering, Baltimore, MD, United States
| | - Christine Santos
- Department of Pediatrics, Johns Hopkins University School of Medicine, Baltimore, MD, United States
| | - John R. Rzasa
- Robert E. Fischell Institute for Biomedical Devices, University of Maryland, College Park, MD, United States
| | - James H. Clark
- Department of Otolaryngology, Johns Hopkins University School of Medicine, Baltimore, MD, United States
| | - Therese L. Canares
- Department of Pediatrics, Johns Hopkins University School of Medicine, Baltimore, MD, United States
| | - Mathias Unberath
- Department of Computer Science, Johns Hopkins University School of Engineering, Baltimore, MD, United States
| |
Collapse
|
35
|
Salmanpour MR, Shamsaei M, Hajianfar G, Soltanian-Zadeh H, Rahmim A. Longitudinal clustering analysis and prediction of Parkinson's disease progression using radiomics and hybrid machine learning. Quant Imaging Med Surg 2022; 12:906-919. [PMID: 35111593 DOI: 10.21037/qims-21-425] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/23/2021] [Accepted: 08/13/2021] [Indexed: 11/06/2022]
Abstract
BACKGROUND We employed machine learning approaches to (I) determine distinct progression trajectories in Parkinson's disease (PD) (unsupervised clustering task), and (II) predict progression trajectories (supervised prediction task), from early (years 0 and 1) data, making use of clinical and imaging features. METHODS We studied PD subjects derived from longitudinal datasets (years 0, 1, 2 & 4; Parkinson's Progression Markers Initiative). We extracted and analyzed 981 features, including motor, non-motor, and radiomics features extracted for each region of interest (ROIs: left/right caudate and putamen) using our standardized environment for radiomics analysis (SERA) software. Segmentation of ROIs on dopamine transporter single photon emission computed tomography (DAT SPECT) images was performed via magnetic resonance imaging (MRI). After performing cross-sectional clustering on 885 subjects (original dataset) to identify disease subtypes, we identified optimal longitudinal trajectories using hybrid machine learning systems (HMLS), including principal component analysis (PCA) + K-Means algorithms (KMA) followed by the Bayesian information criterion (BIC), Calinski-Harabasz criterion (CHC), and elbow criterion (EC). Subsequently, prediction of the identified trajectories from early year data was performed using multiple HMLSs including 16 Dimension Reduction Algorithms (DRA) and 10 classification algorithms. RESULTS We identified 3 distinct progression trajectories. Hotelling's t squared test (HTST) showed that the identified trajectories were distinct. The trajectories included those with (I, II) disease escalation (2 trajectories, 27% and 38% of patients) and (III) stable disease (1 trajectory, 35% of patients). For trajectory prediction from early year data, HMLSs including the stochastic neighbor embedding algorithm (SNEA, as a DRA) as well as the locally linear embedding algorithm (LLEA, as a DRA), linked with the new probabilistic neural network classifier (NPNNC, as a classifier), resulted in accuracies of 78.4% and 79.2% respectively, while other HMLSs such as SNEA + Lib_SVM (library for support vector machines) and t_SNE (t-distributed stochastic neighbor embedding) + NPNNC resulted in 76.5% and 76.1% respectively. CONCLUSIONS This study moves beyond cross-sectional PD subtyping to clustering of longitudinal disease trajectories. We conclude that combining medical information with SPECT-based radiomics features, and optimal utilization of HMLSs, can identify distinct disease trajectories in PD patients, and enable effective prediction of disease trajectories from early year data.
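The unsupervised clustering stage can be illustrated with a short scikit-learn sketch, assuming PCA followed by K-Means with the Calinski-Harabasz criterion used to pick the number of trajectories; the random feature matrix below is a stand-in for the real clinical and radiomics features.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.metrics import calinski_harabasz_score

# Reduce longitudinal features with PCA, then select the number of K-Means
# clusters (trajectories) that maximizes the Calinski-Harabasz score.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 981))              # subjects x features (illustrative)

X_low = PCA(n_components=10, random_state=0).fit_transform(X)

scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_low)
    scores[k] = calinski_harabasz_score(X_low, labels)

best_k = max(scores, key=scores.get)         # chosen number of trajectories
trajectories = KMeans(n_clusters=best_k, n_init=10,
                      random_state=0).fit_predict(X_low)
```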
Collapse
Affiliation(s)
- Mohammad R Salmanpour
- Department of Energy Engineering and Physics, Amirkabir University of Technology, Tehran, Iran; Department of Physics & Astronomy, University of British Columbia, Vancouver BC, Canada
| | - Mojtaba Shamsaei
- Department of Energy Engineering and Physics, Amirkabir University of Technology, Tehran, Iran
| | - Ghasem Hajianfar
- Rajaie Cardiovascular Medical and Research Center, Iran University of Medical Science, Tehran, Iran
| | - Hamid Soltanian-Zadeh
- CIPCE, School of Electrical & Computer Engineering, University of Tehran, Tehran, Iran; Departments of Radiology and Research Administration, Henry Ford Health System, Detroit, USA
| | - Arman Rahmim
- Department of Physics & Astronomy, University of British Columbia, Vancouver BC, Canada; Department of Radiology, University of British Columbia, Vancouver BC, Canada
| |
Collapse
|
36
|
Cengil E, Çınar A. The effect of deep feature concatenation in the classification problem: An approach on COVID-19 disease detection. INTERNATIONAL JOURNAL OF IMAGING SYSTEMS AND TECHNOLOGY 2022; 32:26-40. [PMID: 34898851 PMCID: PMC8653237 DOI: 10.1002/ima.22659] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/08/2021] [Revised: 08/04/2021] [Accepted: 09/16/2021] [Indexed: 06/01/2023]
Abstract
In image classification applications, the most important thing is to obtain useful features. Convolutional neural networks automatically learn the extracted features during training, and the classification process is carried out with the obtained features. Therefore, obtaining successful features is critical to achieving high classification success. This article focuses on providing effective features to enhance classification performance, building on the success of feature concatenation for classification. First, features are extracted by the feature transfer method from the AlexNet, Xception, NASNetLarge, and EfficientNet-B0 architectures, which are known to be successful in classification problems. Concatenating these features creates a new feature set. The method is completed by subjecting the features to various classification algorithms. The proposed pipeline is applied to three datasets for COVID-19 disease detection: "COVID-19 Image Dataset," "COVID-19 Pneumonia Normal Chest X-ray (PA) Dataset," and "COVID-19 Radiography Database." All three datasets contain three classes (normal, COVID, and pneumonia). The best classification accuracies for the three datasets are 98.8%, 95.9%, and 99.6%, respectively; sensitivity, precision, specificity, and F1-score values are also reported. The contribution of the paper is as follows: COVID-19 is similar to other lung infections, which makes diagnosis difficult, and the virus's rapid spread necessitates detecting cases as soon as possible. There has been increasing interest in computer-aided deep learning models to meet these requirements. The use of the proposed method will be beneficial as it provides high accuracy.
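The feature-concatenation pipeline can be sketched in a few lines; untrained AlexNet and EfficientNet-B0 backbones and a linear SVM stand in here for the full set of architectures and classifiers evaluated in the paper.

```python
import torch
import torch.nn as nn
from torchvision.models import alexnet, efficientnet_b0
from sklearn.svm import SVC

def extract_features(imgs):
    """Concatenate embeddings from two backbones (classifier heads removed)."""
    a = alexnet(weights=None); a.classifier = nn.Identity()
    e = efficientnet_b0(weights=None); e.classifier = nn.Identity()
    with torch.no_grad():
        return torch.cat([a(imgs), e(imgs)], dim=1).numpy()

imgs = torch.randn(10, 3, 224, 224)           # stand-in chest image batch
labels = [0, 1, 2, 0, 1, 2, 0, 1, 2, 0]       # normal / COVID / pneumonia
feats = extract_features(imgs)
clf = SVC(kernel="linear").fit(feats, labels)  # classifier on concatenated set
pred = clf.predict(feats)
```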
Collapse
Affiliation(s)
- Emine Cengil
- Department of Computer Engineering, Faculty of Engineering, Firat University, Elazig, Turkey
| | - Ahmet Çınar
- Department of Computer Engineering, Faculty of Engineering, Firat University, Elazig, Turkey
| |
Collapse
|
37
|
Abstract
Due to the outbreak of lung infections caused by the coronavirus disease (COVID-19), humans have to face an unprecedented and devastating global health crisis. Since chest computed tomography (CT) images of COVID-19 patients contain abundant pathological features closely related to this disease, rapid detection and diagnosis based on CT images is of great significance for the treatment of patients and blocking the spread of the disease. In particular, the segmentation of the COVID-19 CT lung-infected area can quantify and evaluate the severity of the disease. However, due to the blurred boundaries and low contrast between the infected and the non-infected areas in COVID-19 CT images, the manual segmentation of the COVID-19 lesion is laborious and places high demands on the operator. Quick and accurate segmentation of COVID-19 lesions from CT images based on deep learning has drawn increasing attention. To effectively improve the segmentation effect of COVID-19 lung infection, a modified UNet network that combines the squeeze-and-attention (SA) and dense atrous spatial pyramid pooling (Dense ASPP) modules (SD-UNet) is proposed, fusing global context and multi-scale information. Specifically, the SA module is introduced to strengthen the attention of pixel grouping and fully exploit the global context information, allowing the network to better mine the differences and connections between pixels. The Dense ASPP module is utilized to capture multi-scale information of COVID-19 lesions. Moreover, to eliminate the interference of background noise outside the lungs and highlight the texture features of the lung lesion area, we extract in advance the lung area from the CT images in the pre-processing stage. Finally, we evaluate our method using the binary-class and multi-class COVID-19 lung infection segmentation datasets. The experimental results show that the metrics of Sensitivity, Dice Similarity Coefficient, Accuracy, Specificity, and Jaccard Similarity are 0.8988 (0.6169), 0.8696 (0.5936), 0.9906 (0.9821), 0.9932 (0.9907), and 0.7702 (0.4788), respectively, for the binary-class (multi-class) segmentation task in the proposed SD-UNet. The result of the COVID-19 lung infection area segmented by SD-UNet is closer to the ground truth compared to several existing models such as CE-Net, DeepLab v3+, UNet++, and other models, which further proves that a more accurate segmentation effect can be achieved by our method. It has the potential to assist doctors in making a more accurate and rapid diagnosis and quantitative assessment of COVID-19.
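The multi-scale context component can be illustrated with a minimal atrous spatial pyramid pooling block; the dilation rates and channel widths below are assumptions, and the paper's Dense ASPP additionally chains the dilated branches densely rather than running them purely in parallel.

```python
import torch
import torch.nn as nn

class ASPP(nn.Module):
    """Minimal atrous spatial pyramid pooling: parallel dilated convolutions
    capture multi-scale context, then a 1x1 convolution fuses the branches."""
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates
        ])
        self.fuse = nn.Conv2d(out_ch * len(rates), out_ch, 1)

    def forward(self, x):
        multi_scale = torch.cat([b(x) for b in self.branches], dim=1)
        return self.fuse(multi_scale)

y = ASPP(64, 32)(torch.randn(1, 64, 32, 32))   # same spatial size, fused context
```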
Collapse
|
38
|
Automatic Liver Segmentation in CT Images with Enhanced GAN and Mask Region-Based CNN Architectures. BIOMED RESEARCH INTERNATIONAL 2021; 2021:9956983. [PMID: 34957310 PMCID: PMC8702320 DOI: 10.1155/2021/9956983] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/29/2021] [Revised: 09/22/2021] [Accepted: 11/26/2021] [Indexed: 01/10/2023]
Abstract
Liver image segmentation has been increasingly employed for key medical purposes, including liver functional assessment, disease diagnosis, and treatment. In this work, we introduce a liver image segmentation method based on generative adversarial networks (GANs) and mask region-based convolutional neural networks (Mask R-CNN). Firstly, since most resulting images have noisy features, we further explored the combination of Mask R-CNN and GANs in order to enhance the pixel-wise classification. Secondly, k-means clustering was used to lock the image aspect ratio, in order to get more essential anchors which can help boost the segmentation performance. Finally, we proposed a GAN Mask R-CNN algorithm which achieved superior performance in comparison with the conventional Mask R-CNN, Mask-CNN, and k-means algorithms in terms of the Dice similarity coefficient (DSC) and the MICCAI metrics. The proposed algorithm also achieved superior performance in comparison with ten state-of-the-art algorithms in terms of six Boolean indicators. We hope that our work can be effectively used to optimize the segmentation and classification of liver anomalies.
Collapse
|
39
|
de Farias EC, di Noia C, Han C, Sala E, Castelli M, Rundo L. Impact of GAN-based lesion-focused medical image super-resolution on the robustness of radiomic features. Sci Rep 2021; 11:21361. [PMID: 34725417 PMCID: PMC8560955 DOI: 10.1038/s41598-021-00898-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/06/2021] [Accepted: 10/13/2021] [Indexed: 12/25/2022] Open
Abstract
Robust machine learning models based on radiomic features might allow for accurate diagnosis, prognosis, and medical decision-making. Unfortunately, the lack of standardized radiomic feature extraction has hampered their clinical use. Since the radiomic features tend to be affected by low voxel statistics in regions of interest, increasing the sample size would improve their robustness in clinical studies. Therefore, we propose a Generative Adversarial Network (GAN)-based lesion-focused framework for Computed Tomography (CT) image Super-Resolution (SR); for the lesion (i.e., cancer) patch-focused training, we incorporate Spatial Pyramid Pooling (SPP) into GAN-Constrained by the Identical, Residual, and Cycle Learning Ensemble (GAN-CIRCLE). At [Formula: see text] SR, the proposed model achieved better perceptual quality with less blurring than the other considered state-of-the-art SR methods, while producing comparable results at [Formula: see text] SR. We also evaluated the robustness of our model's radiomic feature in terms of quantization on a different lung cancer CT dataset using Principal Component Analysis (PCA). Intriguingly, the most important radiomic features in our PCA-based analysis were the most robust features extracted on the GAN-super-resolved images. These achievements pave the way for the application of GAN-based image Super-Resolution techniques for studies of radiomics for robust biomarker discovery.
Collapse
Affiliation(s)
- Erick Costa de Farias
- NOVA Information Management School (NOVA IMS), Universidade Nova de Lisboa, 1070-312, Lisbon, Portugal
| | - Christian di Noia
- Department of Physics, University of Milano-Bicocca, 20126, Milan, Italy
| | - Changhee Han
- Saitama Prefectural University, Saitama, 343-8540, Japan
| | - Evis Sala
- Department of Radiology, University of Cambridge, Cambridge, CB2 0QQ, UK
- Cancer Research UK Cambridge Centre, University of Cambridge, Cambridge, CB2 0RE, UK
| | - Mauro Castelli
- NOVA Information Management School (NOVA IMS), Universidade Nova de Lisboa, 1070-312, Lisbon, Portugal.
| | - Leonardo Rundo
- Department of Radiology, University of Cambridge, Cambridge, CB2 0QQ, UK.
- Cancer Research UK Cambridge Centre, University of Cambridge, Cambridge, CB2 0RE, UK.
| |
Collapse
|
40
|
Nicholaus IT, Park JR, Jung K, Lee JS, Kang DK. Anomaly Detection of Water Level Using Deep Autoencoder. SENSORS 2021; 21:s21196679. [PMID: 34640997 PMCID: PMC8512605 DOI: 10.3390/s21196679] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/19/2021] [Revised: 09/22/2021] [Accepted: 09/23/2021] [Indexed: 11/16/2022]
Abstract
Anomaly detection is one of the crucial tasks in daily infrastructure operations as it can prevent massive damage to devices or resources, which may then lead to catastrophic outcomes. To address this challenge, we propose an automated solution to detect anomaly pattern(s) of the water levels and report the analysis and time/point(s) of abnormality. This research is motivated by the difficulty and time involved in managing facilities responsible for controlling water levels, given the rare occurrence of abnormal patterns. Consequently, we employed a deep autoencoder, one type of artificial neural network architecture, to learn different patterns from the given sequences of data points and reconstruct them. The reconstructed patterns from the deep autoencoder are then used together with a threshold to separate abnormal patterns from normal ones. We used a stream of time-series data collected from sensors to train the model and then evaluated it, ready for deployment as the anomaly detection system framework. We ran extensive experiments on sensor data from water tanks. Our analysis shows why we conclude that the vanilla deep autoencoder is the most effective solution in this scenario.
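A minimal sketch of the reconstruction-threshold idea, assuming a small dense autoencoder over fixed-length windows and a percentile rule for the threshold; the window length, layer sizes, and synthetic signal are illustrative assumptions.

```python
import torch
import torch.nn as nn

window = 48
model = nn.Sequential(nn.Linear(window, 16), nn.ReLU(),
                      nn.Linear(16, window))                 # dense autoencoder
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Train only on windows of "normal" water-level readings (synthetic here)
normal = torch.sin(torch.linspace(0, 50, 2000)).unfold(0, window, 1)
for _ in range(200):
    recon = model(normal)
    loss = ((recon - normal) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    err = ((model(normal) - normal) ** 2).mean(dim=1)
threshold = torch.quantile(err, 0.99)        # tolerate rare normal fluctuations

def is_anomalous(window_batch):
    """Flag windows whose reconstruction error exceeds the learned threshold."""
    with torch.no_grad():
        e = ((model(window_batch) - window_batch) ** 2).mean(dim=1)
    return e > threshold
```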
Collapse
Affiliation(s)
| | | | | | | | - Dae-Ki Kang
- Department of Computer Engineering, Dongseo University, Busan 47011, Korea;
- Correspondence: ; Tel.: +82-51-320-1724
| |
Collapse
|
41
|
Yeung M, Sala E, Schönlieb CB, Rundo L. Focus U-Net: A novel dual attention-gated CNN for polyp segmentation during colonoscopy. Comput Biol Med 2021; 137:104815. [PMID: 34507156 PMCID: PMC8505797 DOI: 10.1016/j.compbiomed.2021.104815] [Citation(s) in RCA: 42] [Impact Index Per Article: 14.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2021] [Revised: 08/26/2021] [Accepted: 08/26/2021] [Indexed: 02/07/2023]
Abstract
BACKGROUND Colonoscopy remains the gold-standard screening for colorectal cancer. However, significant miss rates for polyps have been reported, particularly when there are multiple small adenomas. This presents an opportunity to leverage computer-aided systems to support clinicians and reduce the number of polyps missed. METHOD In this work we introduce the Focus U-Net, a novel dual attention-gated deep neural network, which combines efficient spatial and channel-based attention into a single Focus Gate module to encourage selective learning of polyp features. The Focus U-Net incorporates several further architectural modifications, including the addition of short-range skip connections and deep supervision. Furthermore, we introduce the Hybrid Focal loss, a new compound loss function based on the Focal loss and Focal Tversky loss, designed to handle class-imbalanced image segmentation. For our experiments, we selected five public datasets containing images of polyps obtained during optical colonoscopy: CVC-ClinicDB, Kvasir-SEG, CVC-ColonDB, ETIS-Larib PolypDB and EndoScene test set. We first perform a series of ablation studies and then evaluate the Focus U-Net on the CVC-ClinicDB and Kvasir-SEG datasets separately, and on a combined dataset of all five public datasets. To evaluate model performance, we use the Dice similarity coefficient (DSC) and Intersection over Union (IoU) metrics. RESULTS Our model achieves state-of-the-art results for both CVC-ClinicDB and Kvasir-SEG, with a mean DSC of 0.941 and 0.910, respectively. When evaluated on a combination of five public polyp datasets, our model similarly achieves state-of-the-art results with a mean DSC of 0.878 and mean IoU of 0.809, a 14% and 15% improvement over the previous state-of-the-art results of 0.768 and 0.702, respectively. CONCLUSIONS This study shows the potential for deep learning to provide fast and accurate polyp segmentation results for use during colonoscopy. The Focus U-Net may be adapted for future use in newer non-invasive colorectal cancer screening and more broadly to other biomedical image segmentation tasks similarly involving class imbalance and requiring efficiency.
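As a hedged sketch of the compound loss described above, the snippet below sums a focal (cross-entropy-based) term and a focal Tversky (region-based) term for binary segmentation; the weighting lam and the hyperparameter defaults are assumptions, not the settings reported for the Focus U-Net.

```python
# Illustrative "hybrid focal" loss: focal loss + focal Tversky loss for binary masks.
# probs are sigmoid outputs in [0, 1]; target is a binary mask of the same shape.
import torch

def focal_loss(probs, target, gamma=2.0, eps=1e-7):
    probs = probs.clamp(eps, 1 - eps)
    pt = torch.where(target > 0.5, probs, 1 - probs)   # probability of the true class
    return (-(1 - pt) ** gamma * pt.log()).mean()

def focal_tversky_loss(probs, target, alpha=0.7, beta=0.3, gamma=0.75, eps=1e-7):
    tp = (probs * target).sum()
    fn = ((1 - probs) * target).sum()
    fp = (probs * (1 - target)).sum()
    tversky = (tp + eps) / (tp + alpha * fn + beta * fp + eps)
    return (1 - tversky) ** gamma

def hybrid_focal_loss(probs, target, lam=0.5):
    # lam balances the pixel-wise focal term against the region-based Tversky term
    return lam * focal_loss(probs, target) + (1 - lam) * focal_tversky_loss(probs, target)
```

Weighting the Tversky false negatives more heavily (alpha > beta) is one common way such losses counter the class imbalance the abstract highlights.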
Collapse
Affiliation(s)
- Michael Yeung
- Department of Radiology, University of Cambridge, Cambridge, CB2 0QQ, United Kingdom; School of Clinical Medicine, University of Cambridge, Cambridge, CB2 0SP, United Kingdom.
| | - Evis Sala
- Department of Radiology, University of Cambridge, Cambridge, CB2 0QQ, United Kingdom; Cancer Research UK Cambridge Centre, University of Cambridge, Cambridge, CB2 0RE, United Kingdom.
| | - Carola-Bibiane Schönlieb
- Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, CB3 0WA, United Kingdom.
| | - Leonardo Rundo
- Department of Radiology, University of Cambridge, Cambridge, CB2 0QQ, United Kingdom; Cancer Research UK Cambridge Centre, University of Cambridge, Cambridge, CB2 0RE, United Kingdom.
| |
Collapse
|
42
|
A Histogram-Based Low-Complexity Approach for the Effective Detection of COVID-19 Disease from CT and X-ray Images. APPLIED SCIENCES-BASEL 2021. [DOI: 10.3390/app11198867] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
Abstract
The global COVID-19 pandemic has certainly posed one of the more difficult challenges for researchers in the current century. The development of an automatic diagnostic tool able to detect the disease at an early stage could undoubtedly offer a great advantage in the battle against the pandemic. In this regard, most research efforts have focused on applying Deep Learning (DL) techniques to chest images, including traditional chest X-rays (CXRs) and Computed Tomography (CT) scans. Although these approaches have demonstrated their effectiveness in detecting COVID-19, they have high computational complexity and require large datasets for training. In addition, a large amount of COVID-19 CXRs and CT scans may not be available to researchers. To this end, in this paper we propose an approach based on the evaluation of the histogram of a common class of images, which is taken as the target. A suitable inter-histogram distance measures how far this target histogram is from the histogram evaluated on a test image: if this distance exceeds a threshold, the test image is labeled as an anomaly, i.e., the scan belongs to a patient affected by COVID-19. Extensive experimental results and comparisons with benchmark state-of-the-art methods support the effectiveness of the developed approach and demonstrate that, at least when the images of the considered datasets are sufficiently homogeneous (i.e., few outliers are present), complex-to-implement DL techniques are not really needed to detect COVID-19 effectively. Despite the simplicity of the proposed approach, all the considered metrics (i.e., accuracy, precision, recall, and F-measure) reach a value of 1.0 on the selected datasets, a result comparable to the corresponding state-of-the-art DNN approaches, but with remarkable computational simplicity.
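The histogram-distance rule described in this abstract can be sketched as follows; the chi-square distance, the bin count, and the threshold choice are assumptions for illustration rather than the exact design of the paper.

```python
# Histogram-based anomaly labelling: a test image is flagged when its grey-level
# histogram is farther than a threshold from a reference ("target") histogram
# built from normal scans (illustrative sketch).
import numpy as np

def grey_histogram(img, bins=256):
    hist, _ = np.histogram(img.ravel(), bins=bins, range=(0, 255), density=True)
    return hist

def chi_square_distance(h1, h2, eps=1e-12):
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def is_anomalous(test_img, target_hist, threshold):
    return chi_square_distance(grey_histogram(test_img), target_hist) > threshold

# Example: target histogram averaged over a list of normal 2-D image arrays
# target_hist = np.mean([grey_histogram(im) for im in normal_images], axis=0)
```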
Collapse
|
43
|
Using Convolutional Encoder Networks to Determine the Optimal Magnetic Resonance Image for the Automatic Segmentation of Multiple Sclerosis. APPLIED SCIENCES-BASEL 2021. [DOI: 10.3390/app11188335] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Multiple Sclerosis (MS) is a neuroinflammatory demyelinating disease that affects over 2,000,000 individuals worldwide. It is characterized by white matter lesions that are identified through the segmentation of magnetic resonance images (MRIs). Manual segmentation is very time-intensive because radiologists spend a great deal of time labeling T1-weighted, T2-weighted, and FLAIR MRIs. In response, deep learning models have been created to reduce segmentation time by automatically detecting lesions. These models often use individual MRI sequences as well as combinations such as FLAIR2, the multiplication of the FLAIR and T2 sequences. Unlike many other studies, this work seeks to determine an optimal single MRI sequence, saving additional time by removing the need to acquire other sequences. With this in mind, four Convolutional Encoder Networks (CENs) with different network architectures (U-Net, U-Net++, Linknet, and Feature Pyramid Network) were used to ensure that the optimal sequence applies to a wide array of deep learning models. Each model used a pretrained ResNeXt-50 encoder in order to conserve memory and train faster. Training and testing were performed on two public datasets with 30 and 15 patients. Fisher's exact test was used to evaluate statistical significance, and the automatic segmentation times were compiled for the top two models. This work determined that FLAIR is the optimal sequence based on the Dice Similarity Coefficient (DSC) and Intersection over Union (IoU). Using FLAIR, the U-Net++ with the ResNeXt-50 encoder achieved a high DSC of 0.7159.
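Two concrete details mentioned above, the FLAIR2 combination (multiplication of FLAIR and T2) and the Dice/IoU overlap metrics used to compare sequences, can be sketched as follows; the intensity normalisation and the helper names are assumptions.

```python
# FLAIR2 combination and segmentation-overlap metrics (illustrative sketch).
import numpy as np

def flair2(flair, t2):
    """FLAIR2 as the element-wise product of intensity-normalised FLAIR and T2 volumes."""
    f = (flair - flair.min()) / (flair.max() - flair.min() + 1e-8)
    t = (t2 - t2.min()) / (t2.max() - t2.min() + 1e-8)
    return f * t

def dice_iou(pred_mask, true_mask):
    """Dice Similarity Coefficient and Intersection over Union for binary masks."""
    pred, true = pred_mask.astype(bool), true_mask.astype(bool)
    inter = np.logical_and(pred, true).sum()
    dsc = 2 * inter / (pred.sum() + true.sum() + 1e-8)
    iou = inter / (np.logical_or(pred, true).sum() + 1e-8)
    return dsc, iou
```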
Collapse
|
44
|
Gomi T, Sakai R, Hara H, Watanabe Y, Mizukami S. Usefulness of a Metal Artifact Reduction Algorithm in Digital Tomosynthesis Using a Combination of Hybrid Generative Adversarial Networks. Diagnostics (Basel) 2021; 11:diagnostics11091629. [PMID: 34573971 PMCID: PMC8467368 DOI: 10.3390/diagnostics11091629] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2021] [Revised: 08/30/2021] [Accepted: 08/30/2021] [Indexed: 11/22/2022] Open
Abstract
In this study, a novel combination of hybrid generative adversarial networks (GANs), comprising a cycle-consistent GAN, pix2pix, and a mask pyramid network (MPN) (CGpM-metal artifact reduction [MAR]), was developed using projection data to reduce metal artifacts and the radiation dose during digital tomosynthesis. The CGpM-MAR algorithm was compared with conventional filtered back projection (FBP) without MAR, FBP with MAR, and a convolutional neural network MAR. The MAR rates were compared using the artifact index (AI) and a Gumbel distribution analysis of the largest variation, using a prosthesis phantom at various radiation doses. The novel CGpM-MAR yielded adequately effective overall performance in terms of AI. The resulting images were good regardless of the type of metal used in the prosthesis phantom (p < 0.05), with good artifact removal at a 55% radiation-dose reduction. Furthermore, CGpM-MAR yielded the minimum value in the largest-variation (Gumbel) model at the 55% radiation-dose reduction. Regarding the AI and the Gumbel distribution analysis, the novel CGpM-MAR yielded superior MAR compared with the conventional reconstruction algorithms with and without MAR at the 55% radiation-dose reduction and presented features most similar to the reference FBP. CGpM-MAR represents a promising method for metal-artifact and radiation-dose reduction in clinical practice.
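The artifact index (AI) used above is commonly computed from the variance in an artifact-affected region of interest relative to an artifact-free reference region; the sketch below assumes that common definition, which may differ in detail from the one used in this study.

```python
# One common artifact-index definition: sqrt(var(artifact ROI) - var(reference ROI)).
# The choice of ROIs and this exact formula are assumptions, not taken from the paper.
import numpy as np

def artifact_index(artifact_roi, reference_roi):
    var_a = np.var(artifact_roi.astype(float))
    var_r = np.var(reference_roi.astype(float))
    return float(np.sqrt(max(var_a - var_r, 0.0)))
```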
Collapse
|
45
|
Brain Tumor Detection and Classification on MR Images by a Deep Wavelet Auto-Encoder Model. Diagnostics (Basel) 2021; 11:diagnostics11091589. [PMID: 34573931 PMCID: PMC8471235 DOI: 10.3390/diagnostics11091589] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/02/2021] [Revised: 08/18/2021] [Accepted: 08/18/2021] [Indexed: 11/16/2022] Open
Abstract
The process of diagnosing brain tumors is very complicated for many reasons, including the brain's synaptic structure, size, and shape. Machine learning techniques are employed to help doctors detect brain tumors and support their decisions. In recent years, deep learning techniques have achieved great success in medical image analysis. This paper proposes a deep wavelet autoencoder model, named the "DWAE model", employed to classify each input data slice as tumor (abnormal) or no tumor (normal). A high-pass filter was used to reveal the heterogeneity of the MRI images, and its output was integrated with the input images; a median filter was then applied to merge slices. The quality of the output slices was improved by highlighting edges and smoothing the input MR brain images. A 4-connected seed-growing method was then applied, since thresholding clusters pixels of equal intensity in the input MR data. The segmented MR image slices are fed to the proposed two-layer deep wavelet auto-encoder, with 200 hidden units in the first layer and 400 hidden units in the second layer. A softmax layer is trained and tested to identify normal and abnormal MR images. The contribution of the deep wavelet auto-encoder model lies in its analysis of the pixel patterns of MR brain images and its ability to detect and classify tumors with high accuracy, short processing time, and low validation loss. To train and test the overall performance of the proposed model, we utilized 2500 MR brain images from the BRATS2012, BRATS2013, BRATS2014, BRATS2015, 2015 challenge, and ISLES datasets, which consist of normal and abnormal images. The experimental results show that the proposed model achieved an accuracy of 99.3%, a validation loss of 0.1, and low FPR and FNR values. This demonstrates that the proposed DWAE model can facilitate the automatic detection of brain tumors.
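The 4-connected seed-growing step mentioned above can be sketched as a standard seeded region-growing routine; the intensity tolerance and stopping rule are assumptions, not the values used in the paper.

```python
# Seeded region growing with 4-connectivity: starting from a seed pixel, neighbours
# are added while their intensity stays within a tolerance of the seed (illustrative).
import numpy as np
from collections import deque

def region_grow_4connected(img, seed, tol=10.0):
    """img: 2-D array; seed: (row, col) tuple; returns a boolean mask of the grown region."""
    h, w = img.shape
    seed_val = float(img[seed])
    mask = np.zeros((h, w), dtype=bool)
    queue = deque([seed])
    mask[seed] = True
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):  # 4-neighbours
            if 0 <= nr < h and 0 <= nc < w and not mask[nr, nc]:
                if abs(float(img[nr, nc]) - seed_val) <= tol:
                    mask[nr, nc] = True
                    queue.append((nr, nc))
    return mask
```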
Collapse
|
46
|
Denck J, Guehring J, Maier A, Rothgang E. Enhanced Magnetic Resonance Image Synthesis with Contrast-Aware Generative Adversarial Networks. J Imaging 2021; 7:133. [PMID: 34460769 PMCID: PMC8404922 DOI: 10.3390/jimaging7080133] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/12/2021] [Revised: 07/27/2021] [Accepted: 07/30/2021] [Indexed: 01/17/2023] Open
Abstract
A magnetic resonance imaging (MRI) exam typically consists of the acquisition of multiple MR pulse sequences, which are required for a reliable diagnosis. With the rise of generative deep learning models, approaches for the synthesis of MR images are developed to either synthesize additional MR contrasts, generate synthetic data, or augment existing data for AI training. While current generative approaches allow only the synthesis of specific sets of MR contrasts, we developed a method to generate synthetic MR images with adjustable image contrast. Therefore, we trained a generative adversarial network (GAN) with a separate auxiliary classifier (AC) network to generate synthetic MR knee images conditioned on various acquisition parameters (repetition time, echo time, and image orientation). The AC determined the repetition time with a mean absolute error (MAE) of 239.6 ms, the echo time with an MAE of 1.6 ms, and the image orientation with an accuracy of 100%. Therefore, it can properly condition the generator network during training. Moreover, in a visual Turing test, two experts mislabeled 40.5% of real and synthetic MR images, demonstrating that the image quality of the generated synthetic and real MR images is comparable. This work can support radiologists and technologists during the parameterization of MR sequences by previewing the yielded MR contrast, can serve as a valuable tool for radiology training, and can be used for customized data generation to support AI training.
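A minimal sketch of the auxiliary-classifier idea described above is given below: a small head that regresses repetition time (TR) and echo time (TE) and classifies image orientation on top of shared features, so that it could condition a GAN generator during training; the backbone, feature size, and three-way orientation split are assumptions, not the authors' network.

```python
# Auxiliary head predicting acquisition parameters (illustrative sketch).
import torch
import torch.nn as nn

class AcquisitionParamHead(nn.Module):
    def __init__(self, feat_dim=256, n_orientations=3):
        super().__init__()
        self.tr_head = nn.Linear(feat_dim, 1)                 # repetition time (ms), regression
        self.te_head = nn.Linear(feat_dim, 1)                 # echo time (ms), regression
        self.orient_head = nn.Linear(feat_dim, n_orientations)  # orientation, classification

    def forward(self, features):
        return self.tr_head(features), self.te_head(features), self.orient_head(features)

# Training would typically combine, e.g., L1 losses on TR/TE with cross-entropy on
# orientation, and feed the gradients back to condition the generator.
```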
Collapse
Affiliation(s)
- Jonas Denck
- Pattern Recognition Lab, Department of Computer Science, Friedrich-Alexander Universität Erlangen-Nürnberg, 91058 Erlangen, Germany;
- Siemens Healthcare GmbH, 91052 Erlangen, Germany;
- Department of Industrial Engineering and Health, Technical University of Applied Sciences Amberg-Weiden, 92637 Weiden, Germany;
| | | | - Andreas Maier
- Pattern Recognition Lab, Department of Computer Science, Friedrich-Alexander Universität Erlangen-Nürnberg, 91058 Erlangen, Germany;
| | - Eva Rothgang
- Department of Industrial Engineering and Health, Technical University of Applied Sciences Amberg-Weiden, 92637 Weiden, Germany;
| |
Collapse
|
47
|
Dai Y, Gao Y, Liu F. TransMed: Transformers Advance Multi-Modal Medical Image Classification. Diagnostics (Basel) 2021; 11:diagnostics11081384. [PMID: 34441318 PMCID: PMC8391808 DOI: 10.3390/diagnostics11081384] [Citation(s) in RCA: 72] [Impact Index Per Article: 24.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2021] [Revised: 07/07/2021] [Accepted: 07/28/2021] [Indexed: 12/24/2022] Open
Abstract
Over the past decade, convolutional neural networks (CNNs) have shown very competitive performance in medical image analysis tasks, such as disease classification, tumor segmentation, and lesion detection. CNNs have great advantages in extracting local features of images. However, due to the locality of the convolution operation, they cannot deal with long-range relationships well. Recently, transformers have been applied to computer vision and have achieved remarkable success on large-scale datasets. Compared with natural images, multi-modal medical images have explicit and important long-range dependencies, and effective multi-modal fusion strategies can greatly improve the performance of deep models. This prompted us to study transformer-based structures and apply them to multi-modal medical images. Existing transformer-based network architectures require large-scale datasets to achieve better performance, but medical imaging datasets are relatively small, which makes it difficult to apply pure transformers to medical image analysis. Therefore, we propose TransMed for multi-modal medical image classification. TransMed combines the advantages of CNNs and transformers to efficiently extract low-level features of images and establish long-range dependencies between modalities. We evaluated our model on two datasets, parotid gland tumor classification and knee injury classification. Combining our contributions, we achieve improvements of 10.1% and 1.9% in average accuracy, respectively, outperforming other state-of-the-art CNN-based models. The results of the proposed method are promising and have tremendous potential to be applied to a large number of medical image analysis tasks. To the best of our knowledge, this is the first work to apply transformers to multi-modal medical image classification.
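A hedged sketch of the CNN-plus-transformer pattern described above: a small CNN extracts low-level patch features from each modality, the tokens from all modalities are concatenated, and a transformer encoder models long-range cross-modal dependencies before classification. All layer sizes and the class name HybridCnnTransformer are assumptions, not the TransMed architecture itself.

```python
# Hybrid CNN + transformer classifier for multi-modal images (illustrative sketch).
import torch
import torch.nn as nn

class HybridCnnTransformer(nn.Module):
    def __init__(self, n_classes=2, dim=128):
        super().__init__()
        self.cnn = nn.Sequential(                       # shared low-level feature extractor
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.classifier = nn.Linear(dim, n_classes)

    def forward(self, modalities):
        # modalities: list of (batch, 1, H, W) tensors, one per MR sequence/modality
        tokens = []
        for x in modalities:
            f = self.cnn(x)                              # (B, dim, h, w)
            tokens.append(f.flatten(2).transpose(1, 2))  # (B, h*w, dim) patch tokens
        seq = torch.cat(tokens, dim=1)                   # concatenate tokens across modalities
        enc = self.transformer(seq)                      # long-range, cross-modal attention
        return self.classifier(enc.mean(dim=1))          # pool tokens, then classify

if __name__ == "__main__":
    model = HybridCnnTransformer()
    t1, t2 = torch.randn(2, 1, 64, 64), torch.randn(2, 1, 64, 64)
    print(model([t1, t2]).shape)                         # torch.Size([2, 2])
```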
Collapse
Affiliation(s)
- Yin Dai
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang 110169, China; (Y.D.); (Y.G.)
- Engineering Center on Medical Imaging and Intelligent Analysis, Ministry Education, Northeastern University, Shenyang 110169, China
| | - Yifan Gao
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang 110169, China; (Y.D.); (Y.G.)
| | - Fayu Liu
- Department of Oromaxillofacial-Head and Neck Surgery, School of Stomatology, China Medical University, Shenyang 110002, China
- Correspondence:
| |
Collapse
|
49
|
Dong Y, Wang L, Cheng S, Li Y. FAC-Net: Feedback Attention Network Based on Context Encoder Network for Skin Lesion Segmentation. SENSORS 2021; 21:s21155172. [PMID: 34372409 PMCID: PMC8347551 DOI: 10.3390/s21155172] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/05/2021] [Revised: 07/27/2021] [Accepted: 07/27/2021] [Indexed: 11/25/2022]
Abstract
Considerable research and surveys indicate that skin lesions are an early symptom of skin cancer, and the segmentation of skin lesions remains a hot research topic. Dermatological datasets used in skin lesion segmentation tasks generate a large number of parameters when the data are augmented, limiting the application of smart assisted medicine in real life. Hence, this paper proposes an effective feedback attention network (FAC-Net). The network is equipped with a feedback fusion block (FFB) and an attention mechanism block (AMB); through the combination of these two modules, richer and more specific feature maps can be obtained without data augmentation. We carried out numerous experiments on public datasets (ISIC2018, ISBI2017, ISBI2016) and used metrics such as the Jaccard index (JA) and Dice coefficient (DC) to evaluate the segmentation results. On the ISIC2018 dataset, we obtained a DC of 91.19% and a JA of 83.99%; compared with the baseline network, both of these main metrics improved by more than 1%. The metrics also improved on the other two datasets. The experiments demonstrate that, without any augmentation of the datasets, our lightweight model can achieve better segmentation performance than most deep learning architectures.
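As an illustration of the kind of attention gating mentioned above, the block below combines a channel gate and a spatial gate; it is a generic sketch under assumed layer choices, not a reproduction of the FAC-Net attention mechanism block.

```python
# Generic channel + spatial attention gating for a feature map (illustrative sketch).
import torch
import torch.nn as nn

class SimpleAttentionBlock(nn.Module):
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.channel_gate = nn.Sequential(               # squeeze-and-excitation style gate
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid(),
        )
        self.spatial_gate = nn.Sequential(               # single-channel spatial gate
            nn.Conv2d(channels, 1, kernel_size=7, padding=3), nn.Sigmoid(),
        )

    def forward(self, x):
        x = x * self.channel_gate(x)     # re-weight channels
        return x * self.spatial_gate(x)  # re-weight spatial locations
```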
Collapse
|
50
|
Yang D, Liu J, Wang Y, Xu B, Wang X. Application of a Generative Adversarial Network in Image Reconstruction of Magnetic Induction Tomography. SENSORS 2021; 21:s21113869. [PMID: 34205157 PMCID: PMC8199933 DOI: 10.3390/s21113869] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/28/2021] [Revised: 05/29/2021] [Accepted: 05/31/2021] [Indexed: 11/16/2022]
Abstract
Image reconstruction in magnetic induction tomography (MIT) is an ill-posed problem, and its non-linear characteristics make the solution difficult. In this paper, a method based on a Generative Adversarial Network (GAN) is presented to tackle these barriers. First, the principle of MIT is analyzed. The search for the global optimum of the conductivity distribution is then described as a training process, and the GAN model is proposed. Finally, the image is reconstructed by part of the model (the generator). All datasets were obtained from an eight-channel MIT model built with COMSOL Multiphysics software. The voltage measurement samples are used as input to the trained network, and its output is an estimate of the internal conductivity distribution for image reconstruction. The results of the proposed model were compared with traditional algorithms, showing that the average root mean squared error of the reconstructions obtained by the proposed method is 0.090 and the average correlation coefficient with the original images is 0.940, better than the corresponding indicators of the BPNN and Tikhonov regularization algorithms. Accordingly, the GAN was able to fit the non-linear relationship between input and output, and the visual results also show that it resolved the usual artifact problems of the traditional algorithm and the hot pixels of L2 regularization, which is of great significance for other ill-posed or non-linear problems.
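The two evaluation metrics quoted above, root mean squared error and the correlation coefficient between reconstructed and original conductivity images, can be computed as follows; flattening and any normalisation of the conductivity maps are assumptions.

```python
# RMSE and Pearson correlation between a reconstructed conductivity image and its
# ground truth (illustrative sketch).
import numpy as np

def rmse(recon, truth):
    return float(np.sqrt(np.mean((recon.astype(float) - truth.astype(float)) ** 2)))

def correlation_coefficient(recon, truth):
    return float(np.corrcoef(recon.ravel(), truth.ravel())[0, 1])
```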
Collapse
Affiliation(s)
- Dan Yang
- Key Laboratory of Data Analytics and Optimization for Smart Industry, Northeastern University, Shenyang 110819, China; (J.L.); (X.W.)
- Key Laboratory of Infrared Optoelectric Materials and Micro-Nano Devices, Shenyang 110819, China;
- College of Information Science and Engineering, Northeastern University, Shenyang 110819, China
- Correspondence: ; Tel.: +86-135-1428-6842
| | - Jiahua Liu
- Key Laboratory of Data Analytics and Optimization for Smart Industry, Northeastern University, Shenyang 110819, China; (J.L.); (X.W.)
- Key Laboratory of Infrared Optoelectric Materials and Micro-Nano Devices, Shenyang 110819, China;
| | - Yuchen Wang
- Key Laboratory of Infrared Optoelectric Materials and Micro-Nano Devices, Shenyang 110819, China;
- College of Information Science and Engineering, Northeastern University, Shenyang 110819, China
| | - Bin Xu
- College of Computer Science and Engineering, Northeastern University, Shenyang 110819, China;
| | - Xu Wang
- Key Laboratory of Data Analytics and Optimization for Smart Industry, Northeastern University, Shenyang 110819, China; (J.L.); (X.W.)
| |
Collapse
|