1
Ibrahim M, Khalil YA, Amirrajab S, Sun C, Breeuwer M, Pluim J, Elen B, Ertaylan G, Dumontier M. Generative AI for synthetic data across multiple medical modalities: A systematic review of recent developments and challenges. Comput Biol Med 2025;189:109834. [PMID: 40023073] [DOI: 10.1016/j.compbiomed.2025.109834]
Abstract
This paper presents a comprehensive systematic review of generative models (GANs, VAEs, DMs, and LLMs) used to synthesize various medical data types, including imaging (dermoscopic, mammographic, ultrasound, CT, MRI, and X-ray), text, time-series, and tabular data (EHR). Unlike previous narrowly focused reviews, our study encompasses a broad array of medical data modalities and explores various generative models. Our aim is to offer insights into their current and future applications in medical research, particularly in the context of synthesis applications, generation techniques, and evaluation methods, as well as providing a GitHub repository as a dynamic resource for ongoing collaboration and innovation. Our search strategy queries databases such as Scopus, PubMed, and ArXiv, focusing on recent works from January 2021 to November 2023, excluding reviews and perspectives. This period emphasizes recent advancements beyond GANs, which have been extensively covered in previous reviews. The survey also emphasizes conditional generation, an aspect not covered in comparable reviews. Key contributions include a broad, multi-modality scope that identifies cross-modality insights and opportunities unavailable in single-modality surveys. While core generative techniques are transferable, we find that synthesis methods often lack sufficient integration of patient-specific context, clinical knowledge, and modality-specific requirements tailored to the unique characteristics of medical data. Conditional models leveraging textual conditioning and multimodal synthesis remain underexplored but offer promising directions for innovation.
Our findings are structured around three themes: (1) Synthesis applications, highlighting clinically valid synthesis applications and significant gaps in using synthetic data beyond augmentation, such as for validation and evaluation; (2) Generation techniques, identifying gaps in personalization and cross-modality innovation; and (3) Evaluation methods, revealing the absence of standardized benchmarks, the need for large-scale validation, and the importance of privacy-aware, clinically relevant evaluation frameworks. These findings emphasize the need for benchmarking and comparative studies to promote openness and collaboration.
Affiliation(s)
- Mahmoud Ibrahim
- Institute of Data Science, Faculty of Science and Engineering, Maastricht University, Maastricht, The Netherlands; Department of Advanced Computing Sciences, Faculty of Science and Engineering, Maastricht University, Maastricht, The Netherlands; VITO, Belgium.
- Yasmina Al Khalil
- Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
- Sina Amirrajab
- Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
- Chang Sun
- Institute of Data Science, Faculty of Science and Engineering, Maastricht University, Maastricht, The Netherlands; Department of Advanced Computing Sciences, Faculty of Science and Engineering, Maastricht University, Maastricht, The Netherlands
- Marcel Breeuwer
- Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
- Josien Pluim
- Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
- Michel Dumontier
- Institute of Data Science, Faculty of Science and Engineering, Maastricht University, Maastricht, The Netherlands; Department of Advanced Computing Sciences, Faculty of Science and Engineering, Maastricht University, Maastricht, The Netherlands
2
Tian C, Xi Y, Ma Y, Chen C, Wu C, Ru K, Li W, Zhao M. Harnessing Deep Learning for Accurate Pathological Assessment of Brain Tumor Cell Types. J Imaging Inform Med 2025;38:1098-1111. [PMID: 39150595] [PMCID: PMC11950525] [DOI: 10.1007/s10278-024-01107-9]
Abstract
Primary diffuse central nervous system large B-cell lymphoma (CNS-pDLBCL) and high-grade glioma (HGG) often present similarly, clinically and on imaging, making differentiation challenging. This similarity can complicate pathologists' diagnostic efforts, yet accurately distinguishing between these conditions is crucial for guiding treatment decisions. This study leverages a deep learning model to classify brain tumor pathology images, addressing the common issue of limited medical imaging data. Instead of training a convolutional neural network (CNN) from scratch, we employ a pre-trained network for extracting deep features, which are then used by a support vector machine (SVM) for classification. Our evaluation shows that the ResNet50 (TL + SVM) model achieves a 97.4% accuracy, based on tenfold cross-validation on the test set. These results highlight the synergy between deep learning and traditional diagnostics, potentially setting a new standard for accuracy and efficiency in the pathological diagnosis of brain tumors.
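The TL + SVM pipeline this abstract describes (features from a frozen pre-trained CNN fed to a classical SVM) can be sketched as follows. The random 8-dimensional blobs below are stand-ins for real deep features (e.g. a ResNet50 penultimate layer), and the hinge-loss trainer is a minimal illustrative implementation, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for deep features from a frozen, pre-trained CNN (e.g. the
# 2048-d ResNet50 penultimate layer): two separable 8-d blobs, one per class.
X = np.vstack([rng.normal(-1.0, 0.3, (50, 8)), rng.normal(1.0, 0.3, (50, 8))])
y = np.array([-1] * 50 + [1] * 50)

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Primal linear SVM trained by sub-gradient descent on the hinge loss."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        viol = y * (X @ w + b) < 1            # margin violators
        grad_w, grad_b = lam * w, 0.0
        if viol.any():
            grad_w = grad_w - (y[viol, None] * X[viol]).mean(axis=0)
            grad_b = -y[viol].mean()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

w, b = train_linear_svm(X, y)
accuracy = float(np.mean(np.sign(X @ w + b) == y))
```

In the actual study the feature extractor would be a network pre-trained on ImageNet with its classification head removed; only the SVM is fit to the pathology data, which is what makes the approach viable with limited images.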
Affiliation(s)
- Chongxuan Tian
- School of Control Science and Engineering, Shandong University, Jinan, Shandong, 250061, China
- Yue Xi
- Shandong Provincial Hospital affiliated to Shandong First Medical University, Jinan, Shandong, China
- Yuting Ma
- Shandong Provincial Hospital affiliated to Shandong First Medical University, Jinan, Shandong, China
- Cai Chen
- Shandong Institute of Advanced Technology, Chinese Academy of Sciences, Jinan, Shandong, China
- Cong Wu
- Shandong Provincial Hospital affiliated to Shandong First Medical University, Jinan, Shandong, China
- Kun Ru
- Department of Pathology, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, Shandong, China
- Wei Li
- School of Control Science and Engineering, Shandong University, Jinan, Shandong, 250061, China.
- Miaoqing Zhao
- Department of Pathology, Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, Shandong, China.
3
Kebaili A, Lapuyade-Lahorgue J, Vera P, Ruan S. Multi-modal MRI synthesis with conditional latent diffusion models for data augmentation in tumor segmentation. Comput Med Imaging Graph 2025;123:102532. [PMID: 40121926] [DOI: 10.1016/j.compmedimag.2025.102532]
Abstract
Multimodality is often necessary for improving object segmentation tasks, especially in the case of multilabel tasks, such as tumor segmentation, which is crucial for clinical diagnosis and treatment planning. However, a major challenge in utilizing multimodality with deep learning remains: the limited availability of annotated training data, primarily due to the time-consuming acquisition process and the necessity for expert annotations. Although deep learning has significantly advanced many tasks in medical imaging, conventional augmentation techniques are often insufficient due to the inherent complexity of volumetric medical data. To address this problem, we propose an innovative slice-based latent diffusion architecture for the generation of 3D multi-modal images and their corresponding multi-label masks. Our approach enables the simultaneous generation of the image and mask in a slice-by-slice fashion, leveraging a positional encoding and a Latent Aggregation module to maintain spatial coherence and capture slice sequentiality. This method effectively reduces the computational complexity and memory demands typically associated with diffusion models. Additionally, we condition our architecture on tumor characteristics to generate a diverse array of tumor variations and enhance texture using a refining module that acts like a super-resolution mechanism, mitigating the inherent blurriness caused by data scarcity in the autoencoder. We evaluate the effectiveness of our synthesized volumes using the BRATS2021 dataset to segment the tumor with three tissue labels and compare them with other state-of-the-art diffusion models through a downstream segmentation task, demonstrating the superior performance and efficiency of our method. While our primary application is tumor segmentation, this method can be readily adapted to other modalities. Code is available at https://github.com/Arksyd96/multi-modal-mri-and-mask-synthesis-with-conditional-slice-based-ldm.
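The positional encoding used to keep slice sequentiality can be illustrated with the standard sinusoidal scheme: each 2D slice receives a unique position vector that a slice-wise generator can use to recover its place in the 3D volume. The layout below is an assumption for illustration, not the paper's exact design.

```python
import numpy as np

def slice_positional_encoding(num_slices, dim):
    """Sinusoidal positional encoding (Vaswani et al. style): row s encodes
    the position of slice s in the volume with dim/2 sine-cosine pairs of
    geometrically spaced frequencies."""
    pos = np.arange(num_slices)[:, None]          # (S, 1)
    i = np.arange(dim // 2)[None, :]              # (1, D/2)
    angles = pos / (10000 ** (2 * i / dim))
    pe = np.zeros((num_slices, dim))
    pe[:, 0::2] = np.sin(angles)                  # even dims: sine
    pe[:, 1::2] = np.cos(angles)                  # odd dims: cosine
    return pe

pe = slice_positional_encoding(num_slices=32, dim=16)
```

Concatenating (or adding) `pe[s]` to the latent of slice `s` gives the denoiser an unambiguous slice index, which is what lets a 2D slice-based model maintain 3D coherence at a fraction of a full 3D diffusion model's memory cost.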
Affiliation(s)
- Aghiles Kebaili
- AIMS, Quantif, University of Rouen Normandy, Rouen, 76000, Normandy, France
- Pierre Vera
- AIMS, Quantif, University of Rouen Normandy, Rouen, 76000, Normandy, France; CLCC Henri Becquerel, Rouen, 76038, Normandy, France
- Su Ruan
- AIMS, Quantif, University of Rouen Normandy, Rouen, 76000, Normandy, France.
4
Rasool MJA, Abdusalomov A, Kutlimuratov A, Ahamed MJA, Mirzakhalilov S, Shavkatovich Buriboev A, Jeon HS. PixMed-Enhancer: An Efficient Approach for Medical Image Augmentation. Bioengineering (Basel) 2025;12:235. [PMID: 40150699] [PMCID: PMC11939228] [DOI: 10.3390/bioengineering12030235]
Abstract
AI-powered medical imaging faces persistent challenges, such as limited datasets, class imbalances, and high computational costs. To overcome these barriers, we introduce PixMed-Enhancer, a novel conditional GAN that integrates the ghost module into its encoder, a pioneering approach that achieves efficient feature extraction while significantly reducing the computational complexity without compromising performance. Our method features a hybrid loss function, uniquely combining binary cross-entropy (BCE) and a Structural Similarity Index Measure (SSIM), to ensure pixel-level precision while enhancing the perceptual realism. Additionally, the use of conditional input masks offers unparalleled control over the generation of tumor features, marking a breakthrough in fine-grained dataset augmentation for segmentation and diagnostic tasks. Rigorous testing on diverse datasets establishes PixMed-Enhancer as a state-of-the-art solution, excelling in its realism, structural fidelity, and computational efficiency. PixMed-Enhancer establishes a robust foundation for real-world clinical applications in AI-driven medical imaging.
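A hybrid BCE + SSIM loss of the kind described can be sketched in a few lines. The global (single-window) SSIM and the `alpha` weighting below are simplifying assumptions for illustration; the paper's exact formulation is not given here, and a production SSIM would use a sliding Gaussian window.

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Pixel-wise binary cross-entropy for predictions in (0, 1)."""
    p = np.clip(pred, eps, 1 - eps)
    return -np.mean(target * np.log(p) + (1 - target) * np.log(1 - p))

def ssim_global(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Single-window SSIM over the whole image (global means/variances)."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def hybrid_loss(pred, target, alpha=0.5):
    # alpha trades pixel-level fidelity (BCE) against perceptual structure
    # (1 - SSIM); the weighting is a hypothetical choice, not the paper's.
    return alpha * bce(pred, target) + (1 - alpha) * (1 - ssim_global(pred, target))

rng = np.random.default_rng(0)
pred = rng.random((64, 64))
loss_same = hybrid_loss(pred, pred)   # small: perfect structural agreement
```

Combining the two terms is what lets the generator be penalized both for wrong pixel intensities and for lost structure, which neither term captures alone.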
Affiliation(s)
- M. J. Aashik Rasool
- Department of Computer Engineering, Gachon University, Sujeong-gu, Seongnam-si 13120, Republic of Korea
- Akmalbek Abdusalomov
- Department of Computer Engineering, Gachon University, Sujeong-gu, Seongnam-si 13120, Republic of Korea
- Department of Artificial Intelligence, Tashkent State University of Economics, Tashkent 100066, Uzbekistan
- Alpamis Kutlimuratov
- Department of Artificial Intelligence, Tashkent State University of Economics, Tashkent 100066, Uzbekistan
- M. J. Akeel Ahamed
- Department of Computer Engineering, Gachon University, Sujeong-gu, Seongnam-si 13120, Republic of Korea
- Sanjar Mirzakhalilov
- Department of Artificial Intelligence, Tashkent State University of Economics, Tashkent 100066, Uzbekistan
- Abror Shavkatovich Buriboev
- Department of Computer Engineering, Gachon University, Sujeong-gu, Seongnam-si 13120, Republic of Korea
- Heung Seok Jeon
- Department of Software Technology, Konkuk University, Chungju 27478, Republic of Korea
5
Jiang Y, Manem VSK. Data augmented lung cancer prediction framework using the nested case control NLST cohort. Front Oncol 2025;15:1492758. [PMID: 40071099] [PMCID: PMC11893409] [DOI: 10.3389/fonc.2025.1492758]
Abstract
Purpose In the context of lung cancer screening, the scarcity of well-labeled medical images poses a significant challenge to implementing supervised deep learning methods. While data augmentation is an effective technique for countering the difficulties caused by insufficient data, it has not been fully explored in the context of lung cancer screening. In this study, we analyzed state-of-the-art (SOTA) data augmentation techniques for binary lung cancer prediction. Methods To comprehensively evaluate the efficiency of data augmentation approaches, we considered the nested case control National Lung Screening Trial (NLST) cohort, comprising 253 individuals with non-contrast CT scans. The CT scans were pre-processed into three-dimensional volumes based on the lung nodule annotations. Subsequently, we evaluated five basic (online) and two generative model-based (offline) data augmentation methods with ten SOTA 3D deep learning-based lung cancer prediction models. Results Our results demonstrated that the performance improvement from data augmentation was highly dependent on the approach used. The Cutmix method resulted in the highest average performance improvement across all three metrics: 1.07%, 3.29%, and 1.19% for accuracy, F1 score, and AUC, respectively. MobileNetV2 with a simple data augmentation approach achieved the best AUC of 0.8719 among all lung cancer predictors, a 7.62% improvement over baseline. Furthermore, the MED-DDPM data augmentation approach improved prediction performance by rebalancing the training set and adding a moderate amount of synthetic data. Conclusions The effectiveness of online and offline data augmentation methods was highly sensitive to the prediction model, highlighting the importance of carefully selecting the optimal data augmentation method.
Our findings suggest that certain traditional methods can provide more stable and higher performance than SOTA online data augmentation approaches. Overall, these results offer meaningful insights for the development and clinical integration of data-augmented deep learning tools for lung cancer screening.
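The CutMix augmentation that performed best can be sketched for a single 2D slice. The function below is a minimal illustrative implementation of the published technique (Yun et al., 2019), not the study's code; the 3D CT version would paste a box rather than a rectangle.

```python
import numpy as np

rng = np.random.default_rng(0)

def cutmix(img_a, img_b, label_a, label_b, alpha=1.0):
    """Paste a random rectangle from img_b into img_a and mix the labels
    in proportion to the pasted area."""
    h, w = img_a.shape
    lam = rng.beta(alpha, alpha)                       # target fraction kept from img_a
    cut_h, cut_w = int(h * np.sqrt(1 - lam)), int(w * np.sqrt(1 - lam))
    cy, cx = rng.integers(h), rng.integers(w)          # rectangle centre
    y1, y2 = np.clip([cy - cut_h // 2, cy + cut_h // 2], 0, h)
    x1, x2 = np.clip([cx - cut_w // 2, cx + cut_w // 2], 0, w)
    mixed = img_a.copy()
    mixed[y1:y2, x1:x2] = img_b[y1:y2, x1:x2]
    lam_adj = 1 - (y2 - y1) * (x2 - x1) / (h * w)      # actual kept fraction
    return mixed, lam_adj * label_a + (1 - lam_adj) * label_b

# Toy 64x64 "slices": all-zero benign vs all-one malignant, soft-mixed label.
slice_a, slice_b = np.zeros((64, 64)), np.ones((64, 64))
mixed, mixed_label = cutmix(slice_a, slice_b, label_a=0.0, label_b=1.0)
```

Because the label is mixed by the actual pasted area, the classifier is trained on soft targets that reflect exactly how much of each source image survives in the composite.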
Affiliation(s)
- Yifan Jiang
- Centre de Recherche du CHU de Québec, Université Laval, Québec, QC, Canada
- Département de Biologie Moléculaire, Biochimie Médicale et Pathologie, Université Laval, Québec, QC, Canada
- Venkata S. K. Manem
- Centre de Recherche du CHU de Québec, Université Laval, Québec, QC, Canada
- Département de Biologie Moléculaire, Biochimie Médicale et Pathologie, Université Laval, Québec, QC, Canada
- Institut Universitaire de Cardiologie et de Pneumologie de Québec, Québec, QC, Canada
6
Halkiopoulos C, Gkintoni E, Aroutzidis A, Antonopoulou H. Advances in Neuroimaging and Deep Learning for Emotion Detection: A Systematic Review of Cognitive Neuroscience and Algorithmic Innovations. Diagnostics (Basel) 2025;15:456. [PMID: 40002607] [PMCID: PMC11854508] [DOI: 10.3390/diagnostics15040456]
Abstract
Background/Objectives: This systematic review integrates neuroimaging techniques with deep learning approaches to emotion detection. It aims to merge cognitive neuroscience insights with advanced algorithmic methods in pursuit of an enhanced understanding and application of emotion recognition. Methods: The study was conducted following PRISMA guidelines, involving a rigorous selection process that resulted in the inclusion of 64 empirical studies that explore neuroimaging modalities such as fMRI, EEG, and MEG, discussing their capabilities and limitations in emotion recognition. It further evaluates deep learning architectures, including neural networks, CNNs, and GANs, in terms of their roles in classifying emotions from various domains: human-computer interaction, mental health, marketing, and more. Ethical and practical challenges in implementing these systems are also analyzed. Results: The review identifies fMRI as a powerful but resource-intensive modality, while EEG and MEG are more accessible with high temporal resolution but limited spatial accuracy. Deep learning models, especially CNNs and GANs, have performed well in classifying emotions, though they do not always require large and diverse datasets. Combining neuroimaging data with behavioral and cognitive features improves classification performance. However, ethical challenges, such as data privacy and bias, remain significant concerns. Conclusions: The study emphasized the efficiency of neuroimaging and deep learning in emotion detection, while various ethical and technical challenges were also highlighted. Future research should integrate behavioral and cognitive neuroscience advances, establish ethical guidelines, and explore innovative methods to enhance system reliability and applicability.
Affiliation(s)
- Constantinos Halkiopoulos
- Department of Management Science and Technology, University of Patras, 26334 Patras, Greece
- Evgenia Gkintoni
- Department of Educational Sciences and Social Work, University of Patras, 26504 Patras, Greece
- Anthimos Aroutzidis
- Department of Management Science and Technology, University of Patras, 26334 Patras, Greece
- Hera Antonopoulou
- Department of Management Science and Technology, University of Patras, 26334 Patras, Greece
7
Xie Y, Hao ZW, Wang XM, Wang HL, Yang JM, Zhou H, Wang XD, Zhang JY, Yang HW, Liu PR, Ye ZW. Dual-Stream Attention-Based Classification Network for Tibial Plateau Fractures via Diffusion Model Augmentation and Segmentation Map Integration. Curr Med Sci 2025;45:57-69. [PMID: 39998767] [DOI: 10.1007/s11596-025-00008-4]
Abstract
OBJECTIVE This study aimed to explore a novel method that integrates segmentation-guided classification and diffusion model augmentation to realize automatic classification of tibial plateau fractures (TPFs). METHODS YOLOv8n-cls was used to construct a baseline model on data from 3781 patients from the Orthopedic Trauma Center of Wuhan Union Hospital. Additionally, a segmentation-guided classification approach was proposed. To enhance the dataset, a diffusion model was employed for data augmentation. RESULTS The novel method integrating segmentation-guided classification and diffusion model augmentation significantly improved the accuracy and robustness of fracture classification. The average accuracy of classification for TPFs rose from 0.844 to 0.896. The comprehensive performance of the dual-stream model was also significantly enhanced after many rounds of training, with both the macro-area under the curve (AUC) and the micro-AUC increasing from 0.94 to 0.97. By utilizing diffusion model augmentation and segmentation map integration, the model demonstrated superior efficacy in identifying Schatzker I, achieving an accuracy of 0.880. It yielded an accuracy of 0.898 for Schatzker II and III, 0.913 for Schatzker IV, and 0.887 for Schatzker V and VI; for intercondylar ridge fractures, the accuracy was 0.923. CONCLUSION The dual-stream attention-based classification network, validated by extensive experiments, exhibited great potential in predicting the classification of TPFs. This method facilitates automatic TPF assessment and may assist surgeons in the rapid formulation of surgical plans.
Affiliation(s)
- Yi Xie
- Department of Orthopedics Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Laboratory of Intelligent Medicine Research, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Zhi-Wei Hao
- School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, 430074, China
- Xin-Meng Wang
- Key Laboratory of Clinical Biochemistry Testing in Universities of Yunnan Province, School of Basic Medical Sciences, Dali University, Dali, 671003, China
- Hong-Lin Wang
- Department of Orthopedics Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Laboratory of Intelligent Medicine Research, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Jia-Ming Yang
- Department of Orthopedics Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Laboratory of Intelligent Medicine Research, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Hong Zhou
- Department of Orthopedics Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Laboratory of Intelligent Medicine Research, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Xu-Dong Wang
- Department of Orthopedics Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Laboratory of Intelligent Medicine Research, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Jia-Yao Zhang
- Department of Orthopedics Surgery, Fujian Provincial Hospital, Fuzhou, 350001, China
- Hui-Wen Yang
- Laboratory of Intelligent Medicine Research, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China.
- Department of Otorhinolaryngology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China.
- Peng-Ran Liu
- Department of Orthopedics Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China.
- Laboratory of Intelligent Medicine Research, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China.
- Zhe-Wei Ye
- Department of Orthopedics Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China.
- Laboratory of Intelligent Medicine Research, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China.
8
Dohmen M, Klemens MA, Baltruschat IM, Truong T, Lenga M. Similarity and quality metrics for MR image-to-image translation. Sci Rep 2025;15:3853. [PMID: 39890963] [PMCID: PMC11785996] [DOI: 10.1038/s41598-025-87358-0]
Abstract
Image-to-image translation can have a large impact in medical imaging, as images can be synthetically transformed to other modalities, sequence types, higher resolutions, or lower noise levels. To ensure patient safety, these methods should be validated by human readers, which requires considerable time and cost. Quantitative metrics can effectively complement such studies and provide reproducible and objective assessment of synthetic images. If a reference is available, the similarity of MR images is frequently evaluated by SSIM and PSNR metrics, even though these metrics can be insensitive, or overly sensitive, to specific distortions. When reference images are not available, non-reference quality metrics can reliably detect specific distortions, such as blurriness. To provide an overview of distortion sensitivity, we quantitatively analyze 11 similarity (reference) and 12 quality (non-reference) metrics for assessing synthetic images. We additionally include a metric on a downstream segmentation task. We investigate the sensitivity regarding 11 kinds of distortions and typical MR artifacts, and analyze the influence of different normalization methods on each metric and distortion. Finally, we derive recommendations for effective usage of the analyzed similarity and quality metrics for the evaluation of image-to-image translation models.
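One of the reference metrics under study, PSNR, can be written in a few lines, together with a toy demonstration of the insensitivity the abstract mentions: two visually very different distortions (a uniform intensity shift vs. random noise) can yield nearly the same score. The images below are illustrative random data, not the study's.

```python
import numpy as np

def psnr(ref, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    mse = np.mean((ref - test) ** 2)
    return np.inf if mse == 0 else 10 * np.log10(data_range ** 2 / mse)

rng = np.random.default_rng(0)
ref = rng.random((64, 64))
shifted = np.clip(ref + 0.05, 0.0, 1.0)                          # uniform intensity shift
noisy = np.clip(ref + rng.normal(0, 0.05, ref.shape), 0.0, 1.0)  # random noise
# Both distortions have MSE near 0.05**2, so their PSNRs are close even
# though one is a systematic bias and the other is noise.
p_shift, p_noise = psnr(ref, shifted), psnr(ref, noisy)
```

This is exactly why the study pairs reference metrics with non-reference quality metrics and a downstream segmentation task: a single scalar like PSNR cannot distinguish distortion types of equal energy.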
9
Mao Y, Kim J, Podina L, Kohandel M. Dilated SE-DenseNet for brain tumor MRI classification. Sci Rep 2025;15:3596. [PMID: 39875423] [PMCID: PMC11775108] [DOI: 10.1038/s41598-025-86752-y]
Abstract
In the field of medical imaging, particularly MRI-based brain tumor classification, we propose an advanced convolutional neural network (CNN) leveraging the DenseNet-121 architecture, enhanced with dilated convolutional layers and Squeeze-and-Excitation (SE) networks' attention mechanisms. This novel approach aims to improve upon state-of-the-art methods of tumor identification. Our model, trained and evaluated on a comprehensive Kaggle brain tumor dataset, demonstrated superior performance over established convolution-based and transformer-based models: ResNet-101, VGG-19, original DenseNet-121, MobileNet-V2, ViT-L/16, and Swin-B across key metrics: F1-score, accuracy, precision, and recall. The results underscore the effectiveness of our architectural enhancements in medical image analysis. Future research directions include optimizing dilation layers and exploring various architectural configurations. The study highlights the significant role of machine learning in improving diagnostic accuracy in medical imaging, with potential applications extending beyond brain tumor detection to other medical imaging tasks.
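The Squeeze-and-Excitation attention mechanism grafted onto DenseNet-121 can be sketched channel-wise in NumPy: global-average-pool each channel ("squeeze"), pass the result through a small bottleneck ("excitation"), and rescale the channels with the resulting sigmoid gates. The weights below are random stand-ins for learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def se_block(feat, w1, w2):
    """Squeeze-and-Excitation over a (C, H, W) feature map:
    squeeze -> ReLU bottleneck -> sigmoid gates -> channel rescaling."""
    s = feat.mean(axis=(1, 2))                 # squeeze: (C,)
    z = np.maximum(w1 @ s, 0.0)                # excitation, ReLU: (C//r,)
    gates = 1.0 / (1.0 + np.exp(-(w2 @ z)))    # sigmoid gates in (0, 1): (C,)
    return feat * gates[:, None, None]         # per-channel rescaling

C, r = 8, 2                                    # channels, reduction ratio
feat = rng.random((C, 12, 12))
w1 = rng.normal(0, 0.5, (C // r, C))           # hypothetical weights; learned
w2 = rng.normal(0, 0.5, (C, C // r))           # in a trained network
out = se_block(feat, w1, w2)
```

The reduction ratio `r` keeps the bottleneck cheap, so the block adds channel-wise attention to DenseNet features at negligible parameter cost.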
Affiliation(s)
- Yuannong Mao
- Department of Applied Mathematics, University of Waterloo, Waterloo, ON, N2L 3G1, Canada.
- Jiwook Kim
- Department of Applied Mathematics, University of Waterloo, Waterloo, ON, N2L 3G1, Canada
- Lena Podina
- David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, N2L 3G1, Canada
- Mohammad Kohandel
- Department of Applied Mathematics, University of Waterloo, Waterloo, ON, N2L 3G1, Canada.
10
Abdi-Sargezeh B, Shirani S, Valentin A, Alarcon G, Sanei S. EEG-to-EEG: Scalp-to-Intracranial EEG Translation Using a Combination of Variational Autoencoder and Generative Adversarial Networks. Sensors (Basel) 2025;25:494. [PMID: 39860864] [PMCID: PMC11769358] [DOI: 10.3390/s25020494]
Abstract
A generative adversarial network (GAN) makes it possible to map a data sample from one domain to another. It has been extensively employed in image-to-image and text-to-image translation. We propose an EEG-to-EEG translation model to map scalp-mounted EEG (scEEG) sensor signals to intracranial EEG (iEEG) sensor signals recorded by foramen ovale sensors inserted into the brain. The model is based on a GAN structure in which a conditional GAN (cGAN) is combined with a variational autoencoder (VAE), termed VAE-cGAN. scEEG sensors are plagued by noise and suffer from low resolution, whereas iEEG sensor recordings enjoy high resolution. Here, we consider the task of mapping the scEEG sensor information to iEEG sensors to enhance the scEEG resolution. In this study, our EEG data contain epileptic interictal epileptiform discharges (IEDs), whose identification is crucial in clinical practice. The proposed VAE-cGAN is first employed to map the scEEG to iEEG; the IEDs are then detected from the resulting iEEG. Our model achieves a classification accuracy of 76%, an increase of 11%, 8%, and 3%, respectively, over the previously proposed least-square regression, asymmetric autoencoder, and asymmetric-symmetric autoencoder mapping models.
Affiliation(s)
- Bahman Abdi-Sargezeh
- Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford OX1 2JD, UK
- Sepehr Shirani
- Department of Clinical Neuroscience, King’s College London, London WC2R 2LS, UK
- Antonio Valentin
- Department of Clinical Neuroscience, King’s College London, London WC2R 2LS, UK
- Gonzalo Alarcon
- School of Medical Sciences, University of Manchester, Manchester M13 9PL, UK
- Saeid Sanei
- Department of Electrical and Electronic Engineering, Imperial College London, London SW7 2AZ, UK
11
Zhao M, Guo H, Cao X, Dai J, Wang Z, Zhao J, Peng C. Bibliometric analysis of research on the application of deep learning to ophthalmology. Quant Imaging Med Surg 2025;15:852-866. [PMID: 39839016] [PMCID: PMC11744151] [DOI: 10.21037/qims-24-1340]
Abstract
Background: In recent years, deep learning has become a popular area of research and has revolutionized the diagnosis and prediction of ocular diseases, especially fundus diseases. This study aimed to conduct a bibliometric analysis of deep learning in the field of ophthalmology to describe international research trends and examine current research directions. Methods: This cross-sectional bibliometric analysis examined the development of research on deep learning in ophthalmology and its sub-topics from 2015 to 2024. VOSviewer (visualization of similarities) was used to analyze and evaluate 3,055 articles. Article data were collected on September 11, 2024, and downloaded from the Web of Science Core Collection (WOSCC) in plain-text format. Results: A total of 3,055 relevant articles on the WOSCC published from 2015 to 2024 were included in the analysis. The first article on the application of deep learning to ophthalmology was published in 2015, and the number of articles on the subject has grown significantly since 2019. China was the most productive country (n=1,187), followed by the United States (n=673). Sun Yat-sen University was the institution with the most publications, and Cheng and Bogunovic were the most frequently published authors. Four clusters were identified through a co-occurrence cluster analysis of high-frequency keywords: (I) deep learning for the segmentation and feature extraction of ophthalmic images; (II) deep learning for the automatic detection and classification of ophthalmic images; (III) application of deep learning to ophthalmic imaging techniques; and (IV) deep learning for the diagnosis and management of ophthalmic diseases. Conclusions: The analysis of fundus images and the clinical application of deep learning techniques have emerged as prominent research areas in ophthalmology. The substantial increase in publications and citations signifies the expanding impact of, and global collaboration in, deep learning research applied to ophthalmology. By identifying four distinct clusters of sub-topics, this study contributes to the understanding of current trends and potential future advances in the field.
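The keyword co-occurrence clustering this study performed in VOSviewer can be approximated with a few lines of standard-library Python. The sketch below (hypothetical helper names, not the authors' pipeline) builds a pair-count graph from per-article keyword lists and groups keywords by connected components:

```python
from collections import Counter
from itertools import combinations

def cooccurrence(keyword_lists):
    """Count how often each pair of keywords appears in the same article."""
    counts = Counter()
    for kws in keyword_lists:
        for a, b in combinations(sorted(set(kws)), 2):
            counts[(a, b)] += 1
    return counts

def cluster_keywords(counts, min_links=1):
    """Group keywords into clusters: connected components of the
    co-occurrence graph, found with a small union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), n in counts.items():
        if n >= min_links:
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[ra] = rb

    groups = {}
    for k in list(parent):
        groups.setdefault(find(k), set()).add(k)
    return list(groups.values())
```

VOSviewer additionally weights links and optimizes a modularity-like objective; connected components are the simplest stand-in for the idea of keyword clusters.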
Affiliation(s)
- Min Zhao
- Department of Ophthalmology, the Fourth Affiliated Hospital of China Medical University, Shenyang, China
- Haoxin Guo
- Department of Information Center, the First Hospital of China Medical University, Shenyang, China
- Xindan Cao
- Department of Ophthalmology, the Fourth Affiliated Hospital of China Medical University, Shenyang, China
- Junshi Dai
- Department of Ophthalmology, the Fourth Affiliated Hospital of China Medical University, Shenyang, China
- Zhongqing Wang
- Department of Information Center, the First Hospital of China Medical University, Shenyang, China
- Jiangyue Zhao
- Department of Ophthalmology, the Fourth Affiliated Hospital of China Medical University, Shenyang, China
- Cheng Peng
- Department of Ophthalmology, the Fourth Affiliated Hospital of China Medical University, Shenyang, China
12
Sriwatana K, Puttanawarut C, Suwan Y, Achakulvisut T. Explainable Deep Learning for Glaucomatous Visual Field Prediction: Artifact Correction Enhances Transformer Models. Transl Vis Sci Technol 2025; 14:22. [PMID: 39847375] [PMCID: PMC11758932] [DOI: 10.1167/tvst.14.1.22]
Abstract
Purpose: To develop a deep learning approach that restores artifact-laden optical coherence tomography (OCT) scans and predicts functional loss on the 24-2 Humphrey Visual Field (HVF) test. Methods: This cross-sectional, retrospective study used 1674 visual field (VF)-OCT pairs from 951 eyes for training and 429 pairs from 345 eyes for testing. Peripapillary retinal nerve fiber layer (RNFL) thickness map artifacts were corrected using a generative diffusion model. Three convolutional neural networks and two transformer-based models were trained on original and artifact-corrected datasets to estimate the 54 sensitivity thresholds of the 24-2 HVF test. Results: Predictive performance was measured with root mean square error (RMSE) and mean absolute error (MAE), and explainability was evaluated through GradCAM, attention maps, and dimensionality reduction techniques. The Distillation with No Labels (DINO) Vision Transformer (ViT) trained on artifact-corrected datasets achieved the highest accuracy (RMSE = 4.44 dB, 95% confidence interval [CI] = 4.07-4.82 dB; MAE = 3.46 dB, 95% CI = 3.14-3.79 dB) and the greatest interpretability, improving global RMSE and MAE by 0.15 dB (P < 0.05) compared with performance on the original maps. Feature maps and visualization tools indicate that artifacts compromise DINO-ViT's predictive ability but that performance improves with artifact correction. Conclusions: Combining self-supervised ViTs with generative artifact correction strengthens the correlation between glaucomatous structure and function. Translational Relevance: Our approach offers a comprehensive tool for glaucoma management, facilitates the exploration of structure-function correlations in research, and underscores the importance of addressing artifacts in the clinical interpretation of OCT.
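The RMSE/MAE point estimates with percentile-bootstrap confidence intervals reported above are easy to reproduce generically. A minimal NumPy sketch (not the authors' code; the resampling unit and bootstrap count are assumptions):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error over predicted sensitivity thresholds."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error over predicted sensitivity thresholds."""
    return float(np.mean(np.abs(y_true - y_pred)))

def bootstrap_ci(metric, y_true, y_pred, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample (true, pred) pairs with
    replacement and take the empirical alpha/2 quantiles."""
    rng = np.random.default_rng(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)
        stats.append(metric(y_true[idx], y_pred[idx]))
    lo, hi = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lo), float(hi)
```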
Affiliation(s)
- Kornchanok Sriwatana
- Department of Biomedical Engineering, Faculty of Engineering, Mahidol University, Nakhon Pathom, Thailand
- Faculty of Medicine Ramathibodi Hospital, Mahidol University, Bangkok, Thailand
- Chanon Puttanawarut
- Chakri Naruebodindra Medical Institute, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Samut Prakan, Thailand
- Department of Clinical Epidemiology and Biostatistics, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Bangkok, Thailand
- Yanin Suwan
- Department of Ophthalmology, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Bangkok, Thailand
- Titipat Achakulvisut
- Department of Biomedical Engineering, Faculty of Engineering, Mahidol University, Nakhon Pathom, Thailand
13
Salle G, Andrade-Miranda G, Conze PH, Boussion N, Bert J, Visvikis D, Jaouen V. Cross-Modal Tumor Segmentation Using Generative Blending Augmentation and Self-Training. IEEE Trans Biomed Eng 2025; 72:370-380. [PMID: 38557627] [DOI: 10.1109/tbme.2024.3384014]
Abstract
OBJECTIVES: Data scarcity and domain shifts lead to biased training sets that do not accurately represent deployment conditions. A related practical problem is cross-modal image segmentation, where the objective is to segment unlabelled images using previously labelled datasets from other imaging modalities. METHODS: We propose a cross-modal segmentation method based on conventional image synthesis boosted by a new data augmentation technique called Generative Blending Augmentation (GBA). GBA leverages a SinGAN model to learn representative generative features from a single training image in order to realistically diversify tumor appearances. In this way, we compensate for image synthesis errors and improve the generalization power of a downstream segmentation model. The proposed augmentation is further combined with an iterative self-training procedure that leverages pseudo labels at each pass. RESULTS: The proposed solution ranked first for vestibular schwannoma (VS) segmentation during the validation and test phases of the MICCAI CrossMoDA 2022 challenge, with the best mean Dice similarity and average symmetric surface distance measures. CONCLUSION AND SIGNIFICANCE: Local contrast alteration of tumor appearances and iterative self-training with pseudo labels are likely to lead to performance improvements in a variety of segmentation contexts.
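The iterative self-training loop described here, train, pseudo-label the unlabelled pool, keep confident predictions, retrain, can be illustrated with a toy nearest-centroid classifier standing in for the segmentation network (everything below is a deliberately simplified sketch, not the paper's method):

```python
import numpy as np

def nearest_centroid_fit(X, y):
    """Fit per-class centroids (toy stand-in for model training)."""
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def predict_with_confidence(centroids, X):
    """Predict the closest class; confidence = margin between the two
    nearest centroids (a crude proxy for softmax confidence)."""
    classes = sorted(centroids)
    d = np.stack([np.linalg.norm(X - centroids[c], axis=1) for c in classes], axis=1)
    pred = np.array(classes)[d.argmin(axis=1)]
    dsort = np.sort(d, axis=1)
    conf = dsort[:, 1] - dsort[:, 0]
    return pred, conf

def self_train(Xl, yl, Xu, rounds=3, thresh=0.5):
    """Each pass: refit, pseudo-label the unlabelled set, and retrain on
    labelled data plus the confident pseudo-labels."""
    X, y = Xl.copy(), yl.copy()
    for _ in range(rounds):
        model = nearest_centroid_fit(X, y)
        pred, conf = predict_with_confidence(model, Xu)
        keep = conf >= thresh
        if not keep.any():
            break
        X = np.vstack([Xl, Xu[keep]])
        y = np.concatenate([yl, pred[keep]])
    return nearest_centroid_fit(X, y)
```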
14
Buttar AM, Shaheen Z, Gumaei AH, Mosleh MAA, Gupta I, Alzanin SM, Akbar MA. Enhanced neurological anomaly detection in MRI images using deep convolutional neural networks. Front Med (Lausanne) 2024; 11:1504545. [PMID: 39802885] [PMCID: PMC11717658] [DOI: 10.3389/fmed.2024.1504545]
Abstract
Introduction: Neurodegenerative diseases, including Parkinson's, Alzheimer's, and epilepsy, pose significant diagnostic and treatment challenges due to their complexity and the gradual degeneration of central nervous system structures. This study introduces a deep learning framework designed to automate neuro-diagnostics, addressing the limitations of current manual interpretation methods, which are often time-consuming and prone to variability. Methods: We propose a specialized deep convolutional neural network (DCNN) framework aimed at detecting and classifying neurological anomalies in MRI data. Our approach incorporates key preprocessing techniques, such as noise reduction and intensity normalization of MRI scans, alongside an optimized model architecture. The model employs Rectified Linear Unit (ReLU) activation functions, the Adam optimizer, and a random search strategy to fine-tune hyper-parameters such as the learning rate, batch size, and number of neurons in fully connected layers. To ensure reliability and broad applicability, k-fold cross-validation was used. Results and discussion: Our DCNN achieved a classification accuracy of 98.44%, surpassing well-known models such as ResNet-50 and AlexNet when evaluated on a comprehensive MRI dataset. Precision, recall, and F1-score were calculated separately, confirming the robustness and efficiency of the model across evaluation criteria. Statistical analyses, including ANOVA and t-tests, further validated the significance of the observed performance improvements. This model represents an important step toward a fully automated system for diagnosing and planning treatment for neurological diseases. The high accuracy of our framework highlights its potential to improve diagnostic workflows by enabling precise detection, tracking disease progression, and supporting personalized treatment strategies. While the results are promising, further research is necessary to assess how the model performs across different clinical scenarios. Future studies could integrate additional data types, such as longitudinal imaging and multimodal techniques, to further enhance diagnostic accuracy and clinical utility. These findings mark a significant advancement in applying deep learning to neuro-diagnostics, with promising implications for improving patient outcomes and clinical practices.
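The random-search strategy used here for the learning rate, batch size, and dense-layer width follows a standard recipe: sample configurations, score each on validation data, keep the best. A generic sketch (the objective and search space below are placeholders, not the paper's actual setup):

```python
import random

def random_search(objective, space, n_trials=20, seed=42):
    """Sample hyper-parameter configurations at random; keep the best."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = {
            "lr": 10 ** rng.uniform(*space["log10_lr"]),   # log-uniform sampling
            "batch_size": rng.choice(space["batch_size"]),
            "dense_units": rng.choice(space["dense_units"]),
        }
        score = objective(cfg)  # e.g. validation accuracy of a trained model
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# illustrative search space
space = {"log10_lr": (-5, -1), "batch_size": [16, 32, 64], "dense_units": [64, 128, 256]}
```

In practice `objective` would train the DCNN once per configuration; sampling the learning rate on a log scale is the usual choice because its useful range spans orders of magnitude.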
Affiliation(s)
- Ahmed Mateen Buttar
- Department of Computer Science, University of Agriculture Faisalabad, Faisalabad, Pakistan
- Zubair Shaheen
- Department of Computer Science, University of Agriculture Faisalabad, Faisalabad, Pakistan
- Abdu H. Gumaei
- Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam bin Abdulaziz University, Al-Kharj, Saudi Arabia
- Mogeeb A. A. Mosleh
- Faculty of Engineering and Information Technology, Taiz University, Taiz, Yemen
- Faculty of Engineering and Computing, University of Science and Technology, Aden, Yemen
- Indrajeet Gupta
- School of Computer Science & AI, SR University, Warangal, Telangana, India
- Samah M. Alzanin
- Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam bin Abdulaziz University, Al-Kharj, Saudi Arabia
15
Munguía-Siu A, Vergara I, Espinoza-Rodríguez JH. The Use of Hybrid CNN-RNN Deep Learning Models to Discriminate Tumor Tissue in Dynamic Breast Thermography. J Imaging 2024; 10:329. [PMID: 39728226] [PMCID: PMC11728322] [DOI: 10.3390/jimaging10120329]
Abstract
Breast cancer is one of the leading causes of death for women worldwide, and early detection can help reduce the death rate. Infrared thermography has gained popularity as a non-invasive and rapid method for detecting this pathology and can be further enhanced by applying neural networks to extract spatial and even temporal data derived from breast thermographic images if they are acquired sequentially. In this study, we evaluated hybrid convolutional-recurrent neural network (CNN-RNN) models based on five state-of-the-art pre-trained CNN architectures coupled with three RNNs to discern tumor abnormalities in dynamic breast thermographic images. The hybrid architecture that achieved the best performance for detecting breast cancer was VGG16-LSTM, which showed accuracy (ACC), sensitivity (SENS), and specificity (SPEC) of 95.72%, 92.76%, and 98.68%, respectively, with a CPU runtime of 3.9 s. However, the hybrid architecture that showed the fastest CPU runtime was AlexNet-RNN with 0.61 s, although with lower performance (ACC: 80.59%, SENS: 68.52%, SPEC: 92.76%), but still superior to AlexNet (ACC: 69.41%, SENS: 52.63%, SPEC: 86.18%) with 0.44 s. Our findings show that hybrid CNN-RNN models outperform stand-alone CNN models, indicating that temporal data recovery from dynamic breast thermographs is possible without significantly compromising classifier runtime.
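The hybrid idea, per-frame spatial features from a CNN backbone fed to a recurrent head over the thermographic sequence, can be outlined in NumPy. In this sketch `cnn_features` is a trivial placeholder for a pre-trained backbone such as VGG16, and the recurrent unit is a plain Elman step; none of this is the authors' implementation:

```python
import numpy as np

def cnn_features(frame, n_features=8):
    """Stand-in for a pre-trained CNN backbone (e.g. VGG16): here just
    pooled intensity statistics per row block, purely illustrative."""
    return frame.reshape(n_features, -1).mean(axis=1)

def rnn_step(h, x, Wh, Wx, b):
    """One step of a vanilla (Elman) recurrent unit."""
    return np.tanh(Wh @ h + Wx @ x + b)

def classify_sequence(frames, Wh, Wx, b, w_out):
    """Run the recurrent head over per-frame features; the final hidden
    state feeds a logistic output (tumor vs. no tumor)."""
    h = np.zeros(Wh.shape[0])
    for frame in frames:
        h = rnn_step(h, cnn_features(frame), Wh, Wx, b)
    logit = float(w_out @ h)
    return 1.0 / (1.0 + np.exp(-logit))
```

The design choice the paper reports, that temporal pooling over the dynamic sequence beats a single-frame CNN, corresponds here to the recurrence carrying information across frames before the final decision.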
Affiliation(s)
- Andrés Munguía-Siu
- Department of Computing, Electronics and Mechatronics, Universidad de las Américas Puebla, Sta. Catarina Martir, San Andrés Cholula 72810, Mexico
- Irene Vergara
- Department of Immunology, Instituto de Investigaciones Biomédicas, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
- Juan Horacio Espinoza-Rodríguez
- Department of Computing, Electronics and Mechatronics, Universidad de las Américas Puebla, Sta. Catarina Martir, San Andrés Cholula 72810, Mexico
16
Álvarez-Chaves H, Spruit M, R-Moreno MD. Improving ED admissions forecasting by using generative AI: An approach based on DGAN. Comput Methods Programs Biomed 2024; 256:108363. [PMID: 39182250] [DOI: 10.1016/j.cmpb.2024.108363]
Abstract
BACKGROUND AND OBJECTIVE: Generative deep learning has emerged in recent years as a significant player in the artificial intelligence field. Synthesizing new data while maintaining the features of reality has revolutionized deep learning, proving particularly useful in contexts where obtaining data is challenging. The objective of this study is to employ the DoppelGANger algorithm, a cutting-edge approach based on generative adversarial networks for time series, to enhance patient admissions forecasting in a hospital Emergency Department. METHODS: We employed the DoppelGANger algorithm in a sequential methodology, conditioning generated time series with unique attributes to optimize data utilization. After confirming the successful creation of synthetic data with new attribute values, we adopted the Train-Synthetic-Test-Real framework to ensure the reliability of our synthetic-data validation. We then augmented the original series with synthetic data to enhance the Prophet model's performance. This process was applied to two datasets derived from the original: one with four years of training followed by one year of testing, and another with three years of training and two years of testing. RESULTS: The experimental results show that the generative model outperformed Prophet on the forecasting task, improving the SMAPE from 7.30 to 6.99 with the four-year training set and from 22.84 to 7.41 with the three-year training set, all in daily aggregations. For the data-replacement task, the Prophet SMAPE values decreased to 6.84 and 7.18 for the four- and three-year sets on the same aggregation. Additionally, data augmentation reduced the SMAPE to 6.79 for the one-year test set and achieved 8.56 for the two-year test set, surpassing the performance of the same Prophet model trained only on real data. Results for the remaining aggregations were consistent. CONCLUSIONS: The findings of this study suggest that employing a generative algorithm to extend a training dataset can effectively enhance predictive models in the domain of Emergency Department admissions. This improvement can lead to more efficient resource allocation and patient management.
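SMAPE, the error measure used throughout these comparisons, is straightforward to reproduce. One common definition is shown below (variants differ in the denominator, so the paper's exact figures may use a slightly different form):

```python
import numpy as np

def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error, in percent:
    100 * mean( |pred - true| / ((|true| + |pred|) / 2) )."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2.0
    return float(100.0 * np.mean(np.abs(y_pred - y_true) / denom))
```

In a Train-Synthetic-Test-Real evaluation, the same metric is computed twice: once for a model trained on synthetic admissions and once for a model trained on real ones, both tested on held-out real data.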
Affiliation(s)
- Marco Spruit
- Leiden University Medical Center, Department of Public Health and Primary Care, 2333 ZA, Leiden, The Netherlands
- María D R-Moreno
- Universidad de Alcalá, Escuela Politécnica Superior, 28805, Madrid, Spain
17
Tang Y, Yang C, Wang Y, Zhang Y, Xin J, Zhang H, Xie H. Uncovering neural substrates across Alzheimer's disease stages using contrastive variational autoencoder. Cereb Cortex 2024; 34:bhae393. [PMID: 39363728] [DOI: 10.1093/cercor/bhae393]
Abstract
Alzheimer's disease is the most common major neurocognitive disorder. Although no cure currently exists, understanding the neurobiological substrate underlying Alzheimer's disease progression will facilitate early diagnosis and treatment, slow disease progression, and improve prognosis. In this study, we aimed to understand the morphological changes underlying Alzheimer's disease progression using structural magnetic resonance imaging data from cognitively normal individuals, individuals with mild cognitive impairment, and individuals with Alzheimer's disease via a contrastive variational autoencoder model. We used the contrastive variational autoencoder to generate synthetic data to boost downstream classification performance. Because it can parse out nonclinical factors such as age and gender, the contrastive variational autoencoder facilitated a purer comparison between Alzheimer's disease stages, identifying the pathological changes specific to disease progression. We showed that brain morphological changes across Alzheimer's disease stages were significantly associated with individuals' neurofilament light chain concentration, a potential biomarker for Alzheimer's disease, highlighting the biological plausibility of our results.
Affiliation(s)
- Yan Tang
- School of Electronic Information, Central South University, Changsha, 410148, China
- School of Computer Science and Engineering, Central South University, Changsha, 410083, China
- Chao Yang
- School of Computer Science and Engineering, Central South University, Changsha, 410083, China
- Yuqi Wang
- School of Computer Science and Engineering, Central South University, Changsha, 410083, China
- Yunhao Zhang
- School of Computer Science and Engineering, Central South University, Changsha, 410083, China
- Jiang Xin
- School of Computer Science and Engineering, Central South University, Changsha, 410083, China
- Hao Zhang
- School of Electronic Information, Central South University, Changsha, 410148, China
- Hua Xie
- Center for Neuroscience Research, Children's National Hospital, Washington, DC 20906, USA
- Department of Neurology, George Washington University School of Medicine, Washington, DC 20037, USA
18
Fujii Y, Uchida D, Sato R, Obata T, Akihiro M, Miyamoto K, Morimoto K, Terasawa H, Yamazaki T, Matsumoto K, Horiguchi S, Tsutsumi K, Kato H, Inoue H, Cho T, Tanimoto T, Ohto A, Kawahara Y, Otsuka M. Effectiveness of data-augmentation on deep learning in evaluating rapid on-site cytopathology at endoscopic ultrasound-guided fine needle aspiration. Sci Rep 2024; 14:22441. [PMID: 39341885] [PMCID: PMC11439075] [DOI: 10.1038/s41598-024-72312-3]
Abstract
Rapid on-site cytopathology evaluation (ROSE) is considered an effective method to increase the diagnostic ability of endoscopic ultrasound-guided fine needle aspiration (EUS-FNA); however, ROSE is unavailable in most institutes worldwide due to the shortage of cytopathologists. To overcome this situation, we created an artificial intelligence (AI)-based system (the ROSE-AI system), trained with augmented data, to evaluate the slide images acquired by EUS-FNA. This study aimed to clarify the effects of such data augmentation on establishing an effective ROSE-AI system by comparing the efficacy of various data-augmentation techniques, including geometric transformation, color-space transformation, and kernel filtering. Using five-fold cross-validation, we compared the effect of each technique on the diagnostic ability of the ROSE-AI system. We collected 4059 divided EUS-FNA slide images from 36 patients with pancreatic cancer and nine patients with non-pancreatic cancer. Without data augmentation, the ROSE-AI system had a sensitivity, specificity, and accuracy of 87.5%, 79.7%, and 83.7%, respectively. While some data-augmentation techniques decreased diagnostic ability, the system trained only with data augmented by geometric transformation had the highest diagnostic accuracy (88.2%). We successfully developed a prototype ROSE-AI system with high diagnostic ability. Each data-augmentation technique may have different compatibility with AI-mediated diagnostics, and geometric transformation was the most effective for the ROSE-AI system.
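Of the augmentation families compared here, the geometric transforms are the simplest to reproduce: flips and 90-degree rotations preserve cytology labels while multiplying the training set. A minimal sketch (illustrative, not the authors' exact pipeline):

```python
import numpy as np

def geometric_augment(img, rng):
    """Apply a random label-preserving geometric transform:
    optional horizontal/vertical flip plus a 0/90/180/270 rotation."""
    if rng.random() < 0.5:
        img = np.fliplr(img)
    if rng.random() < 0.5:
        img = np.flipud(img)
    return np.rot90(img, k=int(rng.integers(0, 4)))

def augment_dataset(images, n_copies, seed=0):
    """Produce n_copies randomly transformed versions of each image."""
    rng = np.random.default_rng(seed)
    return [geometric_augment(img, rng) for img in images for _ in range(n_copies)]
```

Color-space transforms and kernel filters would slot into `geometric_augment` the same way, but, as the study found, they are not guaranteed to help: they can alter staining cues the classifier relies on.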
Affiliation(s)
- Yuki Fujii
- Department of Gastroenterology and Hepatology, Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Science, 2-5-1, Shikata-Cho, Kita-Ku, Okayama, Okayama, Japan
- Daisuke Uchida
- Department of Gastroenterology and Hepatology, Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Science, 2-5-1, Shikata-Cho, Kita-Ku, Okayama, Okayama, Japan
- Ryosuke Sato
- Department of Gastroenterology and Hepatology, Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Science, 2-5-1, Shikata-Cho, Kita-Ku, Okayama, Okayama, Japan
- Taisuke Obata
- Department of Gastroenterology and Hepatology, Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Science, 2-5-1, Shikata-Cho, Kita-Ku, Okayama, Okayama, Japan
- Matsumi Akihiro
- Department of Gastroenterology and Hepatology, Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Science, 2-5-1, Shikata-Cho, Kita-Ku, Okayama, Okayama, Japan
- Kazuya Miyamoto
- Department of Gastroenterology and Hepatology, Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Science, 2-5-1, Shikata-Cho, Kita-Ku, Okayama, Okayama, Japan
- Kosaku Morimoto
- Department of Gastroenterology and Hepatology, Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Science, 2-5-1, Shikata-Cho, Kita-Ku, Okayama, Okayama, Japan
- Hiroyuki Terasawa
- Department of Gastroenterology and Hepatology, Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Science, 2-5-1, Shikata-Cho, Kita-Ku, Okayama, Okayama, Japan
- Tatsuhiro Yamazaki
- Department of Gastroenterology and Hepatology, Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Science, 2-5-1, Shikata-Cho, Kita-Ku, Okayama, Okayama, Japan
- Kazuyuki Matsumoto
- Department of Gastroenterology and Hepatology, Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Science, 2-5-1, Shikata-Cho, Kita-Ku, Okayama, Okayama, Japan
- Shigeru Horiguchi
- Department of Gastroenterology and Hepatology, Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Science, 2-5-1, Shikata-Cho, Kita-Ku, Okayama, Okayama, Japan
- Koichiro Tsutsumi
- Department of Gastroenterology and Hepatology, Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Science, 2-5-1, Shikata-Cho, Kita-Ku, Okayama, Okayama, Japan
- Hironari Kato
- Department of Gastroenterology and Hepatology, Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Science, 2-5-1, Shikata-Cho, Kita-Ku, Okayama, Okayama, Japan
- Hirofumi Inoue
- Department of Pathology, Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Science, Okayama, Japan
- Ten Cho
- Business Strategy Division, Ryobi Systems Co., Ltd., Okayama, Japan
- Akimitsu Ohto
- Business Strategy Division, Ryobi Systems Co., Ltd., Okayama, Japan
- Yoshiro Kawahara
- Department of Practical Gastrointestinal Endoscopy, Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Science, Okayama, Japan
- Motoyuki Otsuka
- Department of Gastroenterology and Hepatology, Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Science, 2-5-1, Shikata-Cho, Kita-Ku, Okayama, Okayama, Japan
19
Wang X, Wu Y, Li J, Li Y, Xu S. Deep Learning-Assisted Automatic Diagnosis of Anterior Cruciate Ligament Tear in Knee Magnetic Resonance Images. Tomography 2024; 10:1263-1276. [PMID: 39195729] [PMCID: PMC11487377] [DOI: 10.3390/tomography10080094]
Abstract
Anterior cruciate ligament (ACL) tears are prevalent knee injuries, particularly among active individuals. Accurate and timely diagnosis is essential for determining the optimal treatment strategy and assessing patient prognosis. Previous studies have demonstrated the successful application of deep learning techniques to medical image analysis. This study aimed to develop a deep learning model for detecting ACL tears in knee magnetic resonance imaging (MRI) to enhance diagnostic accuracy and efficiency. The proposed model consists of three main modules: a Dual-Scale Data Augmentation (DDA) module to enrich the training data on both the spatial and layer scales; a selective group attention (SG) module to capture relationships across the layer, channel, and spatial scales; and a fusion module to explore the inter-relationships among the various perspectives and produce the final classification. To ensure a fair comparison, the study utilized a public dataset from MRNet, comprising knee MRI scans from 1250 exams, with a focus on three distinct views: axial, coronal, and sagittal. The experimental results demonstrate the superior performance of the proposed model, termed SGNET, in ACL tear detection compared with other models, achieving an accuracy of 0.9250, a sensitivity of 0.9259, a specificity of 0.9242, and an AUC of 0.9747.
Affiliation(s)
- Xuanwei Wang
- Department of Orthopedics, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310000, China
- Jiafeng Li
- Department of Orthopedics, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310000, China
- Yifan Li
- Department of Orthopedics, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310000, China
- Sanzhong Xu
- Department of Orthopedics, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310000, China
20
Sarmadi A, Razavi ZS, van Wijnen AJ, Soltani M. Comparative analysis of vision transformers and convolutional neural networks in osteoporosis detection from X-ray images. Sci Rep 2024; 14:18007. [PMID: 39097627] [PMCID: PMC11297930] [DOI: 10.1038/s41598-024-69119-7]
Abstract
In this study, we investigated the potential of the Vision Transformer (ViT) for medical image analysis. Diagnosing osteoporosis from X-ray images is a substantial classification problem, which we addressed with ViT models. As a basis for comparison, we conducted a parallel analysis of the same problem using traditional convolutional neural networks (CNNs), well-established and widely used techniques for image classification. Our findings indicate that ViT can achieve superior outcomes compared with CNNs. Furthermore, given access to a sufficient quantity of training data, both methods become more likely to arrive at appropriate solutions to critical problems.
Affiliation(s)
- Ali Sarmadi
- Department of Mechanical Engineering, K. N. Toosi University of Technology, Tehran, Iran
- Zahra Sadat Razavi
- Department of Mechanical Engineering, K. N. Toosi University of Technology, Tehran, Iran
- Physiology Research Center, Iran University Medical Sciences, Tehran, Iran
- Biochemistry Research Center, Iran University Medical Sciences, Tehran, Iran
- Andre J van Wijnen
- Department of Biochemistry, University of Vermont, Burlington, VT, USA
- Department of Internal Medicine, Erasmus University Medical Center, Rotterdam, The Netherlands
- Madjid Soltani
- Department of Mechanical Engineering, K. N. Toosi University of Technology, Tehran, Iran
- Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, Canada
- Centre for Biotechnology and Bioengineering (CBB), University of Waterloo, Waterloo, Canada
- Department of Integrative Oncology, BC Cancer Research Institute, Vancouver, Canada
- Centre for Sustainable Business, International Business University, Toronto, Canada
21
Prasad PJR, Fretland AA, Albregtsen F, Elle OJ, Kumar RP. Transfer Learning with Interpretability: Liver Segmentation in CT and MR using Limited Dataset. Annu Int Conf IEEE Eng Med Biol Soc 2024; 2024:1-5. [PMID: 40039644] [DOI: 10.1109/embc53108.2024.10782042]
Abstract
Liver segmentation using deep learning (DL) is a widely discussed topic for both computed tomography (CT) and magnetic resonance (MR) imaging. However, the development of DL networks often encounters a major hurdle: the need for large amounts of data. In this paper, we present a general solution to this challenge by leveraging transfer learning with a conventional UNet architecture for CT and MR liver parenchyma segmentation. Our method capitalizes on publicly available data from one modality to train the model, thereby learning the features of the specific task; limited cases from the target domain are then used for further training. By comparing against a 2D diffusion model on the same parenchyma segmentation tasks, we show that this approach avoids the need for resource-intensive models in scenarios where time and data are constrained. Moreover, to ensure transparency in the model's predictions, we incorporated interpretability using Explainable Artificial Intelligence (XAI), which provides meaningful visual explanations of the segmentation outputs and fosters trust in the model's decisions. Our approach demonstrates good performance on both modalities, with a mean test Dice of 90.01% in CT and 79.05% in MR, underscoring the effectiveness of transfer learning for developing a pre-trained CT and MR liver segmentation model with XAI.
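The transfer-learning recipe described, initialise from the source modality, then fine-tune on a handful of target-domain cases, reduces to two operations on a parameter dictionary. A framework-agnostic sketch with hypothetical parameter names (real UNet training would use PyTorch or similar; this only illustrates the weight-handling logic):

```python
import numpy as np

def transfer_init(pretrained, target):
    """Initialise target parameters from a model pre-trained on the
    source modality (e.g. CT), copying every weight whose shape matches."""
    out = dict(target)
    for name, w in pretrained.items():
        if name in out and out[name].shape == w.shape:
            out[name] = w.copy()
    return out

def trainable_params(params, freeze_prefix="encoder."):
    """Select only the parameters to fine-tune on the limited target
    cases, freezing the transferred encoder."""
    return {n: p for n, p in params.items() if not n.startswith(freeze_prefix)}
```

Freezing the encoder is one common choice; the paper's exact fine-tuning schedule (which layers are updated, and for how long) is not specified here.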
22
Aouadi S, Torfeh T, Bouhali O, Yoganathan SA, Paloor S, Chandramouli S, Hammoud R, Al-Hammadi N. Prediction of cervix cancer stage and grade from diffusion weighted imaging using EfficientNet. Biomed Phys Eng Express 2024; 10:045042. [PMID: 38815562] [DOI: 10.1088/2057-1976/ad5207]
Abstract
Purpose. This study aims to introduce an innovative noninvasive method that leverages a single image for both grading and staging prediction. The grade and stage of cervix cancer (CC) are determined from diffusion-weighted imaging (DWI), in particular apparent diffusion coefficient (ADC) maps, using deep convolutional neural networks (DCNN). Methods. A dataset of 85 patients with annotated tumor stage (I, II, III, and IV) was retrospectively collected; of these, 66 also had an annotated grade (II or III), and the remaining patients had no reported grade. The study was IRB approved. For each patient, sagittal and axial slices containing the gross tumor volume (GTV) were extracted from ADC maps. These were computed using the mono-exponential model from diffusion-weighted images (b-values = 0, 100, 1000) acquired prior to radiotherapy treatment. Balanced training sets were created using the Synthetic Minority Oversampling Technique (SMOTE) and fed to the DCNN. EfficientNetB0 and EfficientNetB3 were transferred from the ImageNet application to binary and four-class classification tasks. Five-fold stratified cross-validation was performed for the assessment of the networks. Multiple evaluation metrics were computed, including the area under the receiver operating characteristic curve (AUC). Comparisons with ResNet50, Xception, and radiomic analysis were performed. Results. For grade prediction, EfficientNetB3 gave the best performance with AUC = 0.924. For stage prediction, EfficientNetB0 was the best with AUC = 0.931. The difference between the two models was, however, small and not statistically significant. EfficientNetB0-B3 outperformed ResNet50 (AUC = 0.71) and Xception (AUC = 0.89) in stage prediction, and demonstrated comparable results in grade classification, where AUCs of 0.89 and 0.90 were achieved by ResNet50 and Xception, respectively. DCNN outperformed radiomic analysis, which gave AUC = 0.67 (grade) and AUC = 0.66 (stage). Conclusion. The prediction of CC grade and stage from ADC maps is feasible by adapting EfficientNet approaches to the medical context.
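For context on the ADC maps used as network input: the mono-exponential model referenced above is S(b) = S0 · exp(-b · ADC), fitted over the acquired b-values. A minimal per-voxel linear least-squares sketch (not the authors' implementation; the array shapes and the noiseless test voxel are assumptions):

```python
import numpy as np

def adc_monoexponential(signals: np.ndarray, b_values: np.ndarray) -> np.ndarray:
    """Per-voxel ADC from the mono-exponential DWI model S(b) = S0 * exp(-b * ADC).

    signals: array of shape (n_b, ...) with one DWI volume per b-value.
    Returns ADC in the reciprocal units of b (typically mm^2/s).
    """
    # Work in log space: ln S = ln S0 - b * ADC, one linear fit per voxel.
    log_s = np.log(np.clip(signals, 1e-6, None)).reshape(len(b_values), -1)
    design = np.stack([np.ones(len(b_values)), -b_values.astype(float)], axis=1)
    coeffs, *_ = np.linalg.lstsq(design, log_s, rcond=None)
    return coeffs[1].reshape(signals.shape[1:])

# Synthetic voxel with known ADC = 1.5e-3 mm^2/s, b = 0, 100, 1000 s/mm^2 (as in the study).
b = np.array([0.0, 100.0, 1000.0])
s = 1000.0 * np.exp(-b * 1.5e-3)
print(adc_monoexponential(s[:, None], b)[0])  # ≈ 0.0015
```

In practice the fit is run over every voxel of the registered DWI volumes to produce the ADC map from which the GTV slices are extracted.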
Affiliation(s)
- Souha Aouadi
- Department of Radiation Oncology, National Center for Cancer Care and Research, Hamad Medical Corporation, PO Box 3050, Doha, Qatar
- Tarraf Torfeh
- Department of Radiation Oncology, National Center for Cancer Care and Research, Hamad Medical Corporation, PO Box 3050, Doha, Qatar
- Othmane Bouhali
- Department of Science, Texas A&M University at Qatar, PO Box 23874, Education City, Doha, Qatar
- S A Yoganathan
- Department of Radiation Oncology, National Center for Cancer Care and Research, Hamad Medical Corporation, PO Box 3050, Doha, Qatar
- Satheesh Paloor
- Department of Radiation Oncology, National Center for Cancer Care and Research, Hamad Medical Corporation, PO Box 3050, Doha, Qatar
- Suparna Chandramouli
- Department of Radiation Oncology, National Center for Cancer Care and Research, Hamad Medical Corporation, PO Box 3050, Doha, Qatar
- Rabih Hammoud
- Department of Radiation Oncology, National Center for Cancer Care and Research, Hamad Medical Corporation, PO Box 3050, Doha, Qatar
- Noora Al-Hammadi
- Department of Radiation Oncology, National Center for Cancer Care and Research, Hamad Medical Corporation, PO Box 3050, Doha, Qatar
23
Carriero A, Groenhoff L, Vologina E, Basile P, Albera M. Deep Learning in Breast Cancer Imaging: State of the Art and Recent Advancements in Early 2024. Diagnostics (Basel) 2024; 14:848. [PMID: 38667493 PMCID: PMC11048882 DOI: 10.3390/diagnostics14080848] [Received: 02/29/2024] [Revised: 04/07/2024] [Accepted: 04/17/2024] [Indexed: 04/28/2024]
Abstract
The rapid advancement of artificial intelligence (AI) has significantly impacted various aspects of healthcare, particularly in the medical imaging field. This review focuses on recent developments in the application of deep learning (DL) techniques to breast cancer imaging. DL models, a subset of AI algorithms inspired by human brain architecture, have demonstrated remarkable success in analyzing complex medical images, enhancing diagnostic precision, and streamlining workflows. DL models have been applied to breast cancer diagnosis via mammography, ultrasonography, and magnetic resonance imaging. Furthermore, DL-based radiomic approaches may play a role in breast cancer risk assessment, prognosis prediction, and therapeutic response monitoring. Nevertheless, several challenges have limited the widespread adoption of AI techniques in clinical practice, emphasizing the importance of rigorous validation, interpretability, and technical considerations when implementing DL solutions. By examining fundamental concepts in DL techniques applied to medical imaging and synthesizing the latest advancements and trends, this narrative review aims to provide valuable and up-to-date insights for radiologists seeking to harness the power of AI in breast cancer care.
Affiliation(s)
- Léon Groenhoff
- Radiology Department, Maggiore della Carità Hospital, 28100 Novara, Italy; (A.C.); (E.V.); (P.B.); (M.A.)
24
Kryszan K, Wylęgała A, Kijonka M, Potrawa P, Walasz M, Wylęgała E, Orzechowska-Wylęgała B. Artificial-Intelligence-Enhanced Analysis of In Vivo Confocal Microscopy in Corneal Diseases: A Review. Diagnostics (Basel) 2024; 14:694. [PMID: 38611606 PMCID: PMC11011861 DOI: 10.3390/diagnostics14070694] [Received: 02/08/2024] [Revised: 03/13/2024] [Accepted: 03/22/2024] [Indexed: 04/14/2024]
Abstract
Artificial intelligence (AI) has seen significant progress in medical diagnostics, particularly in image and video analysis. This review focuses on the application of AI in analyzing in vivo confocal microscopy (IVCM) images for corneal diseases. The cornea, as an exposed and delicate part of the body, necessitates the precise diagnosis of various conditions. Convolutional neural networks (CNNs), a key component of deep learning, are a powerful tool for image data analysis. This review highlights AI applications in diagnosing keratitis, dry eye disease, and diabetic corneal neuropathy. It discusses the potential of AI in detecting infectious agents, analyzing corneal nerve morphology, and identifying subtle changes in nerve fiber characteristics in diabetic corneal neuropathy. However, challenges remain, including limited datasets, overfitting, low-quality images, and unrepresentative training data; this review explores augmentation techniques and the importance of feature engineering to address them. Further obstacles include the "black-box" nature of AI models and the need for explainable AI (XAI). Expanding datasets, fostering collaborative efforts, and developing user-friendly AI tools are crucial for enhancing the acceptance and integration of AI into clinical practice.
Affiliation(s)
- Katarzyna Kryszan
- Chair and Clinical Department of Ophthalmology, School of Medicine in Zabrze, Medical University of Silesia in Katowice, District Railway Hospital, 40-760 Katowice, Poland; (A.W.); (M.K.); (E.W.)
- Department of Ophthalmology, District Railway Hospital in Katowice, 40-760 Katowice, Poland; (P.P.); (M.W.)
- Adam Wylęgała
- Chair and Clinical Department of Ophthalmology, School of Medicine in Zabrze, Medical University of Silesia in Katowice, District Railway Hospital, 40-760 Katowice, Poland; (A.W.); (M.K.); (E.W.)
- Health Promotion and Obesity Management, Pathophysiology Department, Medical University of Silesia in Katowice, 40-752 Katowice, Poland
- Magdalena Kijonka
- Chair and Clinical Department of Ophthalmology, School of Medicine in Zabrze, Medical University of Silesia in Katowice, District Railway Hospital, 40-760 Katowice, Poland; (A.W.); (M.K.); (E.W.)
- Department of Ophthalmology, District Railway Hospital in Katowice, 40-760 Katowice, Poland; (P.P.); (M.W.)
- Patrycja Potrawa
- Department of Ophthalmology, District Railway Hospital in Katowice, 40-760 Katowice, Poland; (P.P.); (M.W.)
- Mateusz Walasz
- Department of Ophthalmology, District Railway Hospital in Katowice, 40-760 Katowice, Poland; (P.P.); (M.W.)
- Edward Wylęgała
- Chair and Clinical Department of Ophthalmology, School of Medicine in Zabrze, Medical University of Silesia in Katowice, District Railway Hospital, 40-760 Katowice, Poland; (A.W.); (M.K.); (E.W.)
- Department of Ophthalmology, District Railway Hospital in Katowice, 40-760 Katowice, Poland; (P.P.); (M.W.)
- Bogusława Orzechowska-Wylęgała
- Department of Pediatric Otolaryngology, Head and Neck Surgery, Chair of Pediatric Surgery, Medical University of Silesia, 40-760 Katowice, Poland
25
Zhao X, Zang D, Wang S, Shen Z, Xuan K, Wei Z, Wang Z, Zheng R, Wu X, Li Z, Wang Q, Qi Z, Zhang L. sTBI-GAN: An adversarial learning approach for data synthesis on traumatic brain segmentation. Comput Med Imaging Graph 2024; 112:102325. [PMID: 38228021 DOI: 10.1016/j.compmedimag.2024.102325] [Received: 05/25/2023] [Revised: 11/18/2023] [Accepted: 12/12/2023] [Indexed: 01/18/2024]
Abstract
Automatic brain segmentation of magnetic resonance images (MRIs) from severe traumatic brain injury (sTBI) patients is critical for brain abnormality assessment and brain network analysis. Constructing an sTBI brain segmentation model requires manually annotated MR scans of sTBI patients, which is challenging because it is quite impractical to obtain sufficient annotations for sTBI images with large deformations and lesion erosion. Data augmentation techniques can be applied to alleviate the issue of limited training samples. However, conventional data augmentation strategies such as spatial and intensity transformation are unable to synthesize the deformations and lesions of traumatic brains, which limits the performance of the subsequent segmentation task. To address these issues, we propose a novel medical image inpainting model named sTBI-GAN to synthesize labeled sTBI MR scans by adversarial inpainting. The main strength of our sTBI-GAN method is that it generates sTBI images and the corresponding labels simultaneously, which has not been achieved in previous inpainting methods for medical images. We first generate the inpainted image under the guidance of edge information in a coarse-to-fine manner, and the synthesized MR image is then used as the prior for label inpainting. Furthermore, we introduce a registration-based template augmentation pipeline to increase the diversity of the synthesized image pairs and enhance the capacity of data augmentation. Experimental results show that the proposed sTBI-GAN method can synthesize high-quality labeled sTBI images, which greatly improves 2D and 3D traumatic brain segmentation performance compared with the alternatives. Code is available at .
Affiliation(s)
- Xiangyu Zhao
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Di Zang
- Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China; National Center for Neurological Disorders, Shanghai, China; Shanghai Key Laboratory of Brain Function and Restoration and Neural Regeneration, Shanghai, China; State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, School of Basic Medical Sciences and Institutes of Brain Science, Fudan University, China
- Sheng Wang
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Zhenrong Shen
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Kai Xuan
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Zeyu Wei
- Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China; National Center for Neurological Disorders, Shanghai, China; Shanghai Key Laboratory of Brain Function and Restoration and Neural Regeneration, Shanghai, China; State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, School of Basic Medical Sciences and Institutes of Brain Science, Fudan University, China
- Zhe Wang
- Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China; National Center for Neurological Disorders, Shanghai, China; Shanghai Key Laboratory of Brain Function and Restoration and Neural Regeneration, Shanghai, China; State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, School of Basic Medical Sciences and Institutes of Brain Science, Fudan University, China
- Ruizhe Zheng
- Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China; National Center for Neurological Disorders, Shanghai, China; Shanghai Key Laboratory of Brain Function and Restoration and Neural Regeneration, Shanghai, China; State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, School of Basic Medical Sciences and Institutes of Brain Science, Fudan University, China
- Xuehai Wu
- Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China; National Center for Neurological Disorders, Shanghai, China; Shanghai Key Laboratory of Brain Function and Restoration and Neural Regeneration, Shanghai, China; State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, School of Basic Medical Sciences and Institutes of Brain Science, Fudan University, China
- Zheren Li
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Qian Wang
- School of Biomedical Engineering, ShanghaiTech University, Shanghai, China
- Zengxin Qi
- Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China; National Center for Neurological Disorders, Shanghai, China; Shanghai Key Laboratory of Brain Function and Restoration and Neural Regeneration, Shanghai, China; State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, School of Basic Medical Sciences and Institutes of Brain Science, Fudan University, China
- Lichi Zhang
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
26
Rillig MC, Mansour I, Hempel S, Bi M, König-Ries B, Kasirzadeh A. How widespread use of generative AI for images and video can affect the environment and the science of ecology. Ecol Lett 2024; 27:e14397. [PMID: 38430051 DOI: 10.1111/ele.14397] [Received: 12/04/2023] [Revised: 01/29/2024] [Accepted: 02/18/2024] [Indexed: 03/03/2024]
Abstract
Generative artificial intelligence (AI) models will have broad impacts on society including the scientific enterprise; ecology and environmental science will be no exception. Here, we discuss the potential opportunities and risks of advanced generative AI for visual material (images and video) for the science of ecology and the environment itself. There are clearly opportunities for positive impacts, related to improved communication, for example; we also see possibilities for ecological research to benefit from generative AI (e.g., image gap filling, biodiversity surveys, and improved citizen science). However, there are also risks, threatening to undermine the credibility of our science, mostly related to actions of bad actors, for example in terms of spreading fake information or committing fraud. Risks need to be mitigated at the level of government regulatory measures, but we also highlight what can be done right now, including discussing issues with the next generation of ecologists and transforming towards radically open science workflows.
Affiliation(s)
- Matthias C Rillig
- Institute of Biology, Freie Universität Berlin, Berlin, Germany
- Berlin-Brandenburg Institute of Advanced Biodiversity Research (BBIB), Berlin, Germany
- India Mansour
- Institute of Biology, Freie Universität Berlin, Berlin, Germany
- Berlin-Brandenburg Institute of Advanced Biodiversity Research (BBIB), Berlin, Germany
- Stefan Hempel
- Institute of Biology, Freie Universität Berlin, Berlin, Germany
- Berlin-Brandenburg Institute of Advanced Biodiversity Research (BBIB), Berlin, Germany
- Mohan Bi
- Institute of Biology, Freie Universität Berlin, Berlin, Germany
- Berlin-Brandenburg Institute of Advanced Biodiversity Research (BBIB), Berlin, Germany
- Birgitta König-Ries
- Heinz-Nixdorf Chair for Distributed Information Systems, Institute for Informatics, Friedrich Schiller University Jena, Jena, Germany
- Atoosa Kasirzadeh
- The University of Edinburgh, Edinburgh, UK
- Alan Turing Institute, London, UK
27
Wieneke H, Voigt I. Principles of artificial intelligence and its application in cardiovascular medicine. Clin Cardiol 2024; 47:e24148. [PMID: 37721424 PMCID: PMC10766001 DOI: 10.1002/clc.24148] [Received: 06/24/2023] [Revised: 08/23/2023] [Accepted: 08/29/2023] [Indexed: 09/19/2023]
Abstract
Artificial intelligence (AI) represents a rapidly developing field. Its use can improve diagnosis and therapy in many areas of medicine. Despite this enormous progress, many physicians perceive it as a black box and are skeptical about it. This review presents the basics of machine learning. Different classifications of artificial intelligence are given, such as supervised versus unsupervised and discriminative versus generative AI. Analogies to human intelligence are discussed insofar as algorithms are oriented toward it. In a second step, the most common models, such as random forest, k-means clustering, convolutional neural networks, and transformers, are presented in a way that the underlying idea can be understood, and corresponding applications in cardiovascular medicine are named for each model. The overview is intended to show that the term artificial intelligence covers a wide range of different concepts. It should help physicians understand the principles of AI so that they can make up their own minds about its application in cardiology, and it should enable them to critically evaluate results obtained with the help of AI.
Affiliation(s)
- Heinrich Wieneke
- Department of Cardiology and Angiology, Contilia Heart and Vascular Center, Elisabeth-Krankenhaus Essen, Essen, Germany
- Ingo Voigt
- Department of Cardiology and Angiology, Contilia Heart and Vascular Center, Elisabeth-Krankenhaus Essen, Essen, Germany
28
Yoon M, Park JJ, Hur T, Hua CH, Hussain M, Lee S, Choi DJ. Application and Potential of Artificial Intelligence in Heart Failure: Past, Present, and Future. Int J Heart Fail 2024; 6:11-19. [PMID: 38303917 PMCID: PMC10827704 DOI: 10.36628/ijhf.2023.0050] [Received: 09/05/2023] [Revised: 11/24/2023] [Accepted: 11/26/2023] [Indexed: 02/03/2024]
Abstract
The prevalence of heart failure (HF) is increasing, necessitating accurate diagnosis and tailored treatment. The accumulation of clinical information from patients with HF generates big data, which poses challenges for traditional analytical methods. To address this, big data approaches and artificial intelligence (AI) have been developed that can effectively predict future observations and outcomes, enabling precise diagnoses and personalized treatments of patients with HF. Machine learning (ML) is a subfield of AI that allows computers to analyze data, find patterns, and make predictions without explicit instructions. ML can be supervised, unsupervised, or semi-supervised. Deep learning is a branch of ML that uses artificial neural networks with multiple layers to find complex patterns. These AI technologies have shown significant potential in various aspects of HF research, including diagnosis, outcome prediction, classification of HF phenotypes, and optimization of treatment strategies. In addition, integrating multiple data sources, such as electrocardiography, electronic health records, and imaging data, can enhance the diagnostic accuracy of AI algorithms. Currently, wearable devices and remote monitoring aided by AI enable the earlier detection of HF and improved patient care. This review focuses on the rationale behind utilizing AI in HF and explores its various applications.
Affiliation(s)
- Minjae Yoon
- Division of Cardiology, Department of Internal Medicine, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Korea
- Jin Joo Park
- Division of Cardiology, Department of Internal Medicine, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Korea
- Taeho Hur
- Division of Cardiology, Department of Internal Medicine, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Korea
- Department of Computer Science and Engineering, Kyung Hee University, Yongin, Korea
- Cam-Hao Hua
- Department of Computer Science and Engineering, Kyung Hee University, Yongin, Korea
- Musarrat Hussain
- Department of Computer Science and Engineering, Kyung Hee University, Yongin, Korea
- Sungyoung Lee
- Department of Computer Science and Engineering, Kyung Hee University, Yongin, Korea
- Dong-Ju Choi
- Division of Cardiology, Department of Internal Medicine, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Korea
29
Hossain MM, Hossain MM, Arefin MB, Akhtar F, Blake J. Combining State-of-the-Art Pre-Trained Deep Learning Models: A Noble Approach for Skin Cancer Detection Using Max Voting Ensemble. Diagnostics (Basel) 2023; 14:89. [PMID: 38201399 PMCID: PMC10795598 DOI: 10.3390/diagnostics14010089] [Received: 10/03/2023] [Revised: 12/21/2023] [Accepted: 12/22/2023] [Indexed: 01/12/2024]
Abstract
Skin cancer poses a significant healthcare challenge, requiring precise and prompt diagnosis for effective treatment. While recent advances in deep learning have dramatically improved medical image analysis, including skin cancer classification, ensemble methods offer a pathway for further enhancing diagnostic accuracy. This study introduces an approach employing the Max Voting Ensemble Technique for robust skin cancer classification on the ISIC 2018 Task 1-2 dataset. We incorporate a range of pre-trained deep neural networks, including MobileNetV2, AlexNet, VGG16, ResNet50, DenseNet201, DenseNet121, InceptionV3, ResNet50V2, InceptionResNetV2, and Xception. These models were extensively trained on skin cancer datasets, achieving individual accuracies ranging from 77.20% to 91.90%. Our method leverages their synergistic capabilities by combining complementary features to further elevate classification performance. In our approach, input images undergo preprocessing for model compatibility, and the ensemble integrates the pre-trained models with their architectures and weights preserved. For each skin lesion image under examination, every model produces a prediction; these are then aggregated using the max voting technique, with the majority-voted class serving as the final classification. Through comprehensive testing on a diverse dataset, our ensemble outperformed the individual models, attaining an accuracy of 93.18% and an AUC of 0.9320, demonstrating superior diagnostic reliability and accuracy. We also evaluated the method on the HAM10000 dataset to ensure its generalizability. Our ensemble delivers a robust, reliable, and effective tool for skin cancer classification. By utilizing the power of advanced deep neural networks, we aim to assist healthcare professionals in achieving timely and accurate diagnoses, ultimately reducing mortality rates and enhancing patient outcomes.
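The max voting step described above reduces to counting each model's predicted label per image and keeping the majority class. A minimal sketch with hypothetical predictions (invented labels, not the paper's models or data):

```python
import numpy as np

def max_voting(predictions: np.ndarray) -> np.ndarray:
    """Hard majority vote across models.

    predictions: shape (n_models, n_samples) of integer class labels.
    Ties break toward the smallest class label (argmax convention).
    """
    n_classes = predictions.max() + 1
    # Count the votes for each class down the model axis, one column per sample.
    votes = np.apply_along_axis(np.bincount, 0, predictions, minlength=n_classes)
    return votes.argmax(axis=0)

# Three hypothetical classifiers voting on four lesion images (0 = benign, 1 = malignant).
preds = np.array([[0, 1, 1, 0],
                  [0, 1, 0, 0],
                  [1, 1, 1, 0]])
print(max_voting(preds))  # [0 1 1 0]
```

Soft-voting variants would instead average the models' class probabilities before the argmax; the abstract describes the hard (label-count) form.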
Affiliation(s)
- Md. Mamun Hossain
- Department of Computer Science and Engineering, Bangladesh Army University of Science and Technology, Saidpur 5310, Bangladesh
- Md. Moazzem Hossain
- Department of Computer Science and Engineering, Bangladesh Army University of Science and Technology, Saidpur 5310, Bangladesh
- Most. Binoee Arefin
- Department of Computer Science and Engineering, Bangladesh Army University of Science and Technology, Saidpur 5310, Bangladesh
- Fahima Akhtar
- Department of Computer Science and Engineering, Bangladesh Army University of Science and Technology, Saidpur 5310, Bangladesh
- John Blake
- School of Computer Science and Engineering, University of Aizu, Aizuwakamatsu 965-8580, Japan
30
Tatar OC, Akay MA, Metin S. DraiNet: AI-driven decision support in pneumothorax and pleural effusion management. Pediatr Surg Int 2023; 40:30. [PMID: 38151565 DOI: 10.1007/s00383-023-05609-5] [Accepted: 11/24/2023] [Indexed: 12/29/2023]
Abstract
OBJECTIVE This study presents DraiNet, a deep learning model developed to detect pneumothorax and pleural effusion in pediatric patients and aid in assessing the necessity for tube thoracostomy. The primary goal is to utilize DraiNet as a decision support tool to enhance clinical decision-making in the management of these conditions. METHODS DraiNet was trained on a diverse dataset of pediatric CT scans, carefully annotated by experienced surgeons. The model incorporated advanced object detection techniques and underwent evaluation using standard metrics, such as mean Average Precision (mAP), to assess its performance. RESULTS DraiNet achieved an impressive mAP score of 0.964, demonstrating high accuracy in detecting and precisely localizing abnormalities associated with pneumothorax and pleural effusion. The model's precision and recall further confirmed its ability to effectively predict positive cases. CONCLUSION The integration of DraiNet as an AI-driven decision support system marks a significant advancement in pediatric healthcare. By combining deep learning algorithms with clinical expertise, DraiNet provides a valuable tool for non-surgical teams and emergency room doctors, aiding them in making informed decisions about surgical interventions. With its remarkable mAP score of 0.964, DraiNet has the potential to enhance patient outcomes and optimize the management of critical conditions, including pneumothorax and pleural effusion.
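Detection metrics such as the reported mAP are built on intersection-over-union (IoU): a predicted box counts as a true positive when its IoU with a ground-truth box exceeds a threshold, and mAP averages the resulting precision-recall curves over classes. A minimal IoU sketch (illustrative boxes only, not DraiNet's code):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Overlap rectangle: clamp to zero width/height when the boxes are disjoint.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Hypothetical predicted vs. ground-truth pneumothorax boxes.
pred_box = (0, 0, 100, 100)   # area 10000
gt_box = (50, 0, 150, 100)    # area 10000, intersection 5000
print(iou(pred_box, gt_box))  # 5000 / 15000 ≈ 0.333
```

At the common 0.5 IoU threshold this prediction would count as a miss; the paper's mAP of 0.964 implies most DraiNet boxes overlap their targets far more tightly than this toy example.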
Affiliation(s)
- Ozan Can Tatar
- Department of General Surgery, School of Medicine, Kocaeli University, 41000, Kocaeli, Turkey
- Information Systems Engineering, Faculty of Technology, Kocaeli University, Kocaeli, Turkey
- Mustafa Alper Akay
- Department of Pediatric Surgery, School of Medicine, Kocaeli University, Kocaeli, Turkey
- Semih Metin
- Department of Pediatric Surgery, School of Medicine, Kocaeli University, Kocaeli, Turkey
31
Pinto-Coelho L. How Artificial Intelligence Is Shaping Medical Imaging Technology: A Survey of Innovations and Applications. Bioengineering (Basel) 2023; 10:1435. [PMID: 38136026 PMCID: PMC10740686 DOI: 10.3390/bioengineering10121435] [Received: 11/18/2023] [Revised: 12/12/2023] [Accepted: 12/15/2023] [Indexed: 12/24/2023]
Abstract
The integration of artificial intelligence (AI) into medical imaging has ushered in an era of transformation in healthcare. This literature review explores the latest innovations and applications of AI in the field, highlighting its profound impact on medical diagnosis and patient care. The innovation segment examines cutting-edge developments in AI, such as deep learning algorithms, convolutional neural networks, and generative adversarial networks, which have significantly improved the accuracy and efficiency of medical image analysis. These innovations have enabled the rapid and accurate detection of abnormalities, from identifying tumors during radiological examinations to detecting early signs of eye disease in retinal images. The article also surveys applications of AI in medical imaging across radiology, pathology, cardiology, and more. AI-based diagnostic tools not only speed up the interpretation of complex images but also improve the early detection of disease, ultimately delivering better outcomes for patients. Additionally, AI-based image processing facilitates personalized treatment plans, thereby optimizing healthcare delivery. By combining cutting-edge AI techniques with their practical applications, this review underscores the paradigm shift that AI has brought to medical imaging and makes clear that AI will continue shaping the future of healthcare in profound and positive ways.
Affiliation(s)
- Luís Pinto-Coelho
- ISEP—School of Engineering, Polytechnic Institute of Porto, 4200-465 Porto, Portugal
- INESCTEC, Campus of the Engineering Faculty of the University of Porto, 4200-465 Porto, Portugal
32
Schaudt D, Späte C, von Schwerin R, Reichert M, von Schwerin M, Beer M, Kloth C. A Critical Assessment of Generative Models for Synthetic Data Augmentation on Limited Pneumonia X-ray Data. Bioengineering (Basel) 2023; 10:1421. [PMID: 38136012 PMCID: PMC10741143 DOI: 10.3390/bioengineering10121421] [Received: 11/08/2023] [Revised: 11/28/2023] [Accepted: 12/12/2023] [Indexed: 12/24/2023]
Abstract
In medical imaging, deep learning models serve as invaluable tools for expediting diagnoses and aiding specialized medical professionals in making clinical decisions. However, effectively training deep learning models typically necessitates substantial quantities of high-quality data, a resource often lacking in many medical imaging scenarios. One way to overcome this deficiency is to artificially generate such images. Therefore, in this comparative study we train five generative models to artificially increase the amount of available data in such a scenario. This synthetic data approach is evaluated on a downstream classification task, predicting four causes of pneumonia as well as healthy cases on 1082 chest X-ray images. Quantitative and medical assessments show that a Generative Adversarial Network (GAN)-based approach significantly outperforms more recent diffusion-based approaches on this limited dataset, with better image quality and pathological plausibility. By evaluating five different classification models and varying the amount of additional training data, we show that better image quality surprisingly does not translate into improved classification performance. Class-specific metrics like precision, recall, and F1-score show a substantial improvement from using synthetic images, emphasizing the data-rebalancing effect for less frequent classes. However, overall performance does not improve for most models and configurations, except for a DreamBooth approach, which shows a +0.52 improvement in overall accuracy. The large variance of performance impact in this study suggests careful consideration when utilizing generative models in limited-data scenarios, especially given the unexpected negative correlation between image quality and downstream classification improvement.
Affiliation(s)
- Daniel Schaudt
- Institute of Databases and Information Systems, Ulm University, James-Franck-Ring, 89081 Ulm, Germany
- Christian Späte
- DASU Transferzentrum für Digitalisierung, Analytics und Data Science Ulm, Olgastraße 94, 89073 Ulm, Germany
- Reinhold von Schwerin
- Department of Computer Science, Ulm University of Applied Science, Albert–Einstein–Allee 55, 89081 Ulm, Germany
- Manfred Reichert
- Institute of Databases and Information Systems, Ulm University, James-Franck-Ring, 89081 Ulm, Germany
- Marianne von Schwerin
- Department of Computer Science, Ulm University of Applied Science, Albert–Einstein–Allee 55, 89081 Ulm, Germany
- Meinrad Beer
- Department of Radiology, University Hospital of Ulm, Albert–Einstein–Allee 23, 89081 Ulm, Germany
- Christopher Kloth
- Department of Radiology, University Hospital of Ulm, Albert–Einstein–Allee 23, 89081 Ulm, Germany
33
Esmaeili F, Cassie E, Nguyen HPT, Plank NOV, Unsworth CP, Wang A. Utilizing Deep Learning Algorithms for Signal Processing in Electrochemical Biosensors: From Data Augmentation to Detection and Quantification of Chemicals of Interest. Bioengineering (Basel) 2023; 10:1348. [PMID: 38135939 PMCID: PMC10740562 DOI: 10.3390/bioengineering10121348] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2023] [Revised: 11/14/2023] [Accepted: 11/21/2023] [Indexed: 12/24/2023] Open
Abstract
Nanomaterial-based aptasensors serve as useful instruments for detecting small biological entities. This work utilizes data gathered from three electrochemical aptamer-based sensors that differ in receptors, analytes of interest, and signal lengths. Our ultimate objective was the automatic detection and quantification of target analytes from a segment of the signal recorded by these sensors. First, we proposed a data augmentation method using conditional variational autoencoders to address data scarcity. Second, we employed recurrent networks for signal extrapolation, ensuring uniform signal lengths. Third, we developed seven deep learning classification models (GRU, unidirectional LSTM (ULSTM), bidirectional LSTM (BLSTM), ConvGRU, ConvULSTM, ConvBLSTM, and CNN) to identify and quantify specific analyte concentrations for six distinct classes, ranging from the absence of analyte to 10 μM. Finally, a second classification model was created to distinguish between abnormal and normal data segments, detect the presence or absence of analytes in the sample, and, if an analyte was detected, identify it and quantify its concentration. Evaluation of the time-series forecasting showed that the GRU-based network outperformed the ULSTM and BLSTM networks. Regarding the classification models, it turned out that signal extrapolation was not effective in improving classification performance. Comparing the role of network architecture in classification performance, hybrid networks combining convolutional and recurrent layers, as well as CNN networks, achieved 82% to 99% accuracy across all three datasets. Using the short-time Fourier transform (STFT) as the preprocessing technique further improved performance on all datasets, with accuracies from 84% to 99%.
These findings underscore the effectiveness of suitable data preprocessing methods in enhancing neural network performance, enabling automatic analyte identification and quantification from electrochemical aptasensor signals.
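As a rough illustration of the STFT preprocessing step described above, the following numpy-only sketch slides a Hann window over a signal and takes the magnitude spectrum of each frame. The window and hop sizes, and the synthetic "sensor trace" (a slow drift plus an oscillation), are arbitrary assumptions for the sketch, not values from the paper.

```python
import numpy as np

def stft_features(signal, win=64, hop=32):
    """Naive short-time Fourier transform: slide a Hann window over the
    signal and return the magnitude spectrum of each frame
    (shape: n_frames x n_bins)."""
    window = np.hanning(win)
    frames = [signal[i:i + win] * window
              for i in range(0, len(signal) - win + 1, hop)]
    return np.abs(np.fft.rfft(np.stack(frames), axis=1))

# Hypothetical aptasensor current trace: a slow drift plus a
# 50-sample-period oscillation, standing in for a real sensor segment.
t = np.arange(1024)
trace = 0.01 * t + np.sin(2 * np.pi * t / 50)
spec = stft_features(trace)   # time-frequency features for a classifier
```

The resulting 2-D magnitude array is the kind of time-frequency representation that a convolutional or hybrid classifier would consume in place of the raw trace.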
Affiliation(s)
- Fatemeh Esmaeili
- Department of Engineering Science, University of Auckland, Auckland 1010, New Zealand; (F.E.); (C.P.U.)
- The MacDiarmid Institute for Advanced Materials and Nanotechnology, Victoria University of Wellington, Wellington 6021, New Zealand; (E.C.); (H.P.T.N.); (N.O.V.P.)
- Erica Cassie
- The MacDiarmid Institute for Advanced Materials and Nanotechnology, Victoria University of Wellington, Wellington 6021, New Zealand; (E.C.); (H.P.T.N.); (N.O.V.P.)
- School of Chemical and Physical Sciences, Victoria University of Wellington, Wellington 6021, New Zealand
- Hong Phan T. Nguyen
- The MacDiarmid Institute for Advanced Materials and Nanotechnology, Victoria University of Wellington, Wellington 6021, New Zealand; (E.C.); (H.P.T.N.); (N.O.V.P.)
- School of Chemical and Physical Sciences, Victoria University of Wellington, Wellington 6021, New Zealand
- Natalie O. V. Plank
- The MacDiarmid Institute for Advanced Materials and Nanotechnology, Victoria University of Wellington, Wellington 6021, New Zealand; (E.C.); (H.P.T.N.); (N.O.V.P.)
- School of Chemical and Physical Sciences, Victoria University of Wellington, Wellington 6021, New Zealand
- Charles P. Unsworth
- Department of Engineering Science, University of Auckland, Auckland 1010, New Zealand; (F.E.); (C.P.U.)
- The MacDiarmid Institute for Advanced Materials and Nanotechnology, Victoria University of Wellington, Wellington 6021, New Zealand; (E.C.); (H.P.T.N.); (N.O.V.P.)
- Alan Wang
- Auckland Bioengineering Institute, University of Auckland, Auckland 1010, New Zealand
- Center for Medical Imaging, Faculty of Medical and Health Sciences, University of Auckland, Auckland 1010, New Zealand
- Centre for Brain Research, University of Auckland, Auckland 1010, New Zealand
34
Yousefpour Shahrivar R, Karami F, Karami E. Enhancing Fetal Anomaly Detection in Ultrasonography Images: A Review of Machine Learning-Based Approaches. Biomimetics (Basel) 2023; 8:519. [PMID: 37999160 PMCID: PMC10669151 DOI: 10.3390/biomimetics8070519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2023] [Revised: 10/05/2023] [Accepted: 10/26/2023] [Indexed: 11/25/2023] Open
Abstract
Fetal development is a critical phase in prenatal care, demanding the timely identification of anomalies in ultrasound images to safeguard the well-being of both the unborn child and the mother. Medical imaging has played a pivotal role in detecting fetal abnormalities and malformations. However, despite significant advances in ultrasound technology, accurately identifying irregularities in prenatal images continues to pose considerable challenges, often demanding substantial time and expertise from medical professionals. In this review, we survey recent developments in machine learning (ML) methods applied to fetal ultrasound images. Specifically, we focus on a range of ML algorithms employed in the context of fetal ultrasound, encompassing tasks such as image classification, object recognition, and segmentation. We highlight how these innovative approaches can enhance ultrasound-based fetal anomaly detection and provide insights for future research and clinical implementations, emphasizing open problems where further investigation can contribute to more effective ultrasound-based fetal anomaly detection.
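Segmentation results of the kind surveyed above are commonly scored with the Dice coefficient, the overlap measure most segmentation papers in this area report. A minimal numpy sketch follows; the 4x4 masks are invented toys standing in for real fetal-structure segmentations.

```python
import numpy as np

def dice_coefficient(pred, truth):
    """Dice overlap between a predicted and a ground-truth binary mask:
    2*|P & T| / (|P| + |T|), ranging from 0 (no overlap) to 1 (identical)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    return 2.0 * inter / denom if denom else 1.0

# Toy 4x4 masks: the prediction covers one pixel more than the ground truth.
pred  = np.array([[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]])
truth = np.array([[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]])
score = dice_coefficient(pred, truth)   # 2*3 / (4+3) = 6/7
```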
Affiliation(s)
- Ramin Yousefpour Shahrivar
- Department of Biology, College of Convergent Sciences and Technologies, Science and Research Branch, Islamic Azad University, Tehran, 14515-775, Iran
- Fatemeh Karami
- Department of Medical Genetics, Applied Biophotonics Research Center, Science and Research Branch, Islamic Azad University, Tehran, 14515-775, Iran
- Ebrahim Karami
- Department of Engineering and Applied Sciences, Memorial University of Newfoundland, St. John’s, NL A1B 3X5, Canada
35
Amarù S, Marelli D, Ciocca G, Schettini R. DALib: A Curated Repository of Libraries for Data Augmentation in Computer Vision. J Imaging 2023; 9:232. [PMID: 37888340 PMCID: PMC10607570 DOI: 10.3390/jimaging9100232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2023] [Revised: 10/13/2023] [Accepted: 10/17/2023] [Indexed: 10/28/2023] Open
Abstract
Data augmentation is a fundamental technique in machine learning that plays a crucial role in expanding the size of training datasets. By applying various transformations or modifications to existing data, data augmentation enhances the generalization and robustness of machine learning models. In recent years, the development of several libraries has simplified the use of diverse data augmentation strategies across different tasks. This paper explores the most widely adopted libraries specifically designed for data augmentation in computer vision tasks. We aim to provide a comprehensive survey of publicly available data augmentation libraries, helping practitioners navigate these resources effectively. Through a curated taxonomy, we present an organized classification of the approaches employed by these libraries, along with accompanying application examples. By examining the techniques of each library, practitioners can make informed decisions in selecting the most suitable augmentation techniques for their computer vision projects. To make this information accessible, we have created a dedicated public website named DALib, which serves as a centralized repository where the taxonomy, methods, and examples associated with the surveyed libraries can be explored. By offering this comprehensive resource, we aim to empower practitioners and contribute to the advancement of computer vision research and applications through the effective use of data augmentation techniques.
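The geometric transforms offered by virtually every library in the taxonomy above (flips, rotations, crops, and so on) can be illustrated with a minimal numpy sketch. This is a toy stand-in, not any specific library's API; real libraries compose many such transforms with richer parameterization.

```python
import numpy as np

def augment(image, rng):
    """Apply a random horizontal flip and a random 90-degree rotation --
    two geometric transforms found in virtually every augmentation library.
    Pixel values are only rearranged, never altered."""
    if rng.random() < 0.5:
        image = image[:, ::-1]          # horizontal flip
    k = rng.integers(0, 4)              # rotate by 0, 90, 180, or 270 degrees
    return np.rot90(image, k)

rng = np.random.default_rng(0)
img = np.arange(16).reshape(4, 4)       # toy 4x4 "image"
out = augment(img, rng)
```

Because these transforms are pure rearrangements, the augmented image always contains exactly the original pixel values, which is a cheap invariant to check when wiring augmentation into a training pipeline.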
Affiliation(s)
- Davide Marelli
- Department of Informatics, Systems and Communication, University of Milano-Bicocca, Viale Sarca 336, 20126 Milano, Italy; (S.A.); (G.C.); (R.S.)