1
Garcea F, Serra A, Lamberti F, Morra L. Data augmentation for medical imaging: A systematic literature review. Comput Biol Med 2023; 152:106391. [PMID: 36549032 DOI: 10.1016/j.compbiomed.2022.106391] [Citation(s) in RCA: 21] [Impact Index Per Article: 21.0] [Received: 07/01/2022] [Revised: 11/22/2022] [Accepted: 11/29/2022] [Indexed: 12/13/2022]
Abstract
Recent advances in Deep Learning have largely benefited from larger and more diverse training sets. However, collecting large datasets for medical imaging is still a challenge due to privacy concerns and labeling costs. Data augmentation makes it possible to greatly expand the amount and variety of data available for training without actually collecting new samples. Data augmentation techniques range from simple yet surprisingly effective transformations such as cropping, padding, and flipping, to complex generative models. Depending on the nature of the input and the visual task, different data augmentation strategies are likely to perform differently. For this reason, it is conceivable that medical imaging requires specific augmentation strategies that generate plausible data samples and enable effective regularization of deep neural networks. Data augmentation can also be used to augment specific classes that are underrepresented in the training set, e.g., to generate artificial lesions. The goal of this systematic literature review is to investigate which data augmentation strategies are used in the medical domain and how they affect the performance of clinical tasks such as classification, segmentation, and lesion detection. To this end, a comprehensive analysis of more than 300 articles published in recent years (2018-2022) was conducted. The results highlight the effectiveness of data augmentation across organs, modalities, tasks, and dataset sizes, and suggest potential avenues for future research.
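The simple transformations named in this abstract (cropping, padding, flipping) can be illustrated with a minimal sketch. It is not taken from the reviewed paper: the function names, the one-pixel padding, and the 50% flip probability are assumptions chosen for illustration, and images are represented as plain lists of rows.

```python
import random

def hflip(img):
    """Mirror a 2D image (list of rows) left to right."""
    return [row[::-1] for row in img]

def pad(img, p, fill=0):
    """Surround the image with a border of p fill-valued pixels."""
    width = len(img[0]) + 2 * p
    border = [[fill] * width for _ in range(p)]
    body = [[fill] * p + list(row) + [fill] * p for row in img]
    return border + body + [[fill] * width for _ in range(p)]

def random_crop(img, h, w, rng=random):
    """Cut an h x w window at a random position inside the image."""
    top = rng.randint(0, len(img) - h)      # randint is inclusive at both ends
    left = rng.randint(0, len(img[0]) - w)
    return [row[left:left + w] for row in img[top:top + h]]

def augment(img, rng=random):
    """Hypothetical pipeline: pad by 1, crop back to size, flip half the time."""
    h, w = len(img), len(img[0])
    out = random_crop(pad(img, 1), h, w, rng)
    return hflip(out) if rng.random() < 0.5 else out
```

Each call to `augment` yields a slightly shifted and possibly mirrored copy of the same size as the input, so the training set grows in variety without new acquisitions. Whether a flip is anatomically plausible depends on the organ and modality, which is the kind of domain-specific concern the abstract raises.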
Affiliation(s)
- Fabio Garcea
- Dipartimento di Automatica e Informatica, Politecnico di Torino, C.so Duca degli Abruzzi, 24, Torino, 10129, Italy
- Alessio Serra
- Dipartimento di Automatica e Informatica, Politecnico di Torino, C.so Duca degli Abruzzi, 24, Torino, 10129, Italy
- Fabrizio Lamberti
- Dipartimento di Automatica e Informatica, Politecnico di Torino, C.so Duca degli Abruzzi, 24, Torino, 10129, Italy
- Lia Morra
- Dipartimento di Automatica e Informatica, Politecnico di Torino, C.so Duca degli Abruzzi, 24, Torino, 10129, Italy
2
Silva F, Pereira T, Neves I, Morgado J, Freitas C, Malafaia M, Sousa J, Fonseca J, Negrão E, Flor de Lima B, Correia da Silva M, Madureira AJ, Ramos I, Costa JL, Hespanhol V, Cunha A, Oliveira HP. Towards Machine Learning-Aided Lung Cancer Clinical Routines: Approaches and Open Challenges. J Pers Med 2022; 12:jpm12030480. [PMID: 35330479 PMCID: PMC8950137 DOI: 10.3390/jpm12030480] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/07/2022] [Revised: 02/28/2022] [Accepted: 03/10/2022] [Indexed: 12/15/2022] Open
Abstract
Advancements in the development of computer-aided decision (CAD) systems for clinical routines provide unquestionable benefits in connecting human medical expertise with machine intelligence, to achieve better quality healthcare. Considering the high incidence and mortality associated with lung cancer, there is a need for the most accurate clinical procedures; thus, the possibility of using artificial intelligence (AI) tools for decision support is becoming a closer reality. At every stage of the lung cancer clinical pathway, specific obstacles are identified and “motivate” the application of innovative AI solutions. This work provides a comprehensive review of the most recent research dedicated to the development of CAD tools using computed tomography images for lung cancer-related tasks. We discuss the major challenges and provide critical perspectives on future directions. Although we focus on lung cancer in this review, we also provide a clearer definition of the path used to integrate AI in healthcare, emphasizing fundamental research points that are crucial for overcoming current barriers.
Affiliation(s)
- Francisco Silva
- INESC TEC—Institute for Systems and Computer Engineering, Technology and Science, 4200-465 Porto, Portugal; (I.N.); (J.M.); (M.M.); (J.S.); (J.F.); (A.C.); (H.P.O.)
- FCUP—Faculty of Science, University of Porto, 4169-007 Porto, Portugal
- Correspondence: (F.S.); (T.P.)
- Tania Pereira
- INESC TEC—Institute for Systems and Computer Engineering, Technology and Science, 4200-465 Porto, Portugal; (I.N.); (J.M.); (M.M.); (J.S.); (J.F.); (A.C.); (H.P.O.)
- Correspondence: (F.S.); (T.P.)
- Inês Neves
- INESC TEC—Institute for Systems and Computer Engineering, Technology and Science, 4200-465 Porto, Portugal; (I.N.); (J.M.); (M.M.); (J.S.); (J.F.); (A.C.); (H.P.O.)
- ICBAS—Abel Salazar Biomedical Sciences Institute, University of Porto, 4050-313 Porto, Portugal
- Joana Morgado
- INESC TEC—Institute for Systems and Computer Engineering, Technology and Science, 4200-465 Porto, Portugal; (I.N.); (J.M.); (M.M.); (J.S.); (J.F.); (A.C.); (H.P.O.)
- Cláudia Freitas
- CHUSJ—Centro Hospitalar e Universitário de São João, 4200-319 Porto, Portugal; (C.F.); (E.N.); (B.F.d.L.); (M.C.d.S.); (A.J.M.); (I.R.); (V.H.)
- FMUP—Faculty of Medicine, University of Porto, 4200-319 Porto, Portugal;
- Mafalda Malafaia
- INESC TEC—Institute for Systems and Computer Engineering, Technology and Science, 4200-465 Porto, Portugal; (I.N.); (J.M.); (M.M.); (J.S.); (J.F.); (A.C.); (H.P.O.)
- FEUP—Faculty of Engineering, University of Porto, 4200-465 Porto, Portugal
- Joana Sousa
- INESC TEC—Institute for Systems and Computer Engineering, Technology and Science, 4200-465 Porto, Portugal; (I.N.); (J.M.); (M.M.); (J.S.); (J.F.); (A.C.); (H.P.O.)
- João Fonseca
- INESC TEC—Institute for Systems and Computer Engineering, Technology and Science, 4200-465 Porto, Portugal; (I.N.); (J.M.); (M.M.); (J.S.); (J.F.); (A.C.); (H.P.O.)
- FEUP—Faculty of Engineering, University of Porto, 4200-465 Porto, Portugal
- Eduardo Negrão
- CHUSJ—Centro Hospitalar e Universitário de São João, 4200-319 Porto, Portugal; (C.F.); (E.N.); (B.F.d.L.); (M.C.d.S.); (A.J.M.); (I.R.); (V.H.)
- Beatriz Flor de Lima
- CHUSJ—Centro Hospitalar e Universitário de São João, 4200-319 Porto, Portugal; (C.F.); (E.N.); (B.F.d.L.); (M.C.d.S.); (A.J.M.); (I.R.); (V.H.)
- Miguel Correia da Silva
- CHUSJ—Centro Hospitalar e Universitário de São João, 4200-319 Porto, Portugal; (C.F.); (E.N.); (B.F.d.L.); (M.C.d.S.); (A.J.M.); (I.R.); (V.H.)
- António J. Madureira
- CHUSJ—Centro Hospitalar e Universitário de São João, 4200-319 Porto, Portugal; (C.F.); (E.N.); (B.F.d.L.); (M.C.d.S.); (A.J.M.); (I.R.); (V.H.)
- FMUP—Faculty of Medicine, University of Porto, 4200-319 Porto, Portugal;
- Isabel Ramos
- CHUSJ—Centro Hospitalar e Universitário de São João, 4200-319 Porto, Portugal; (C.F.); (E.N.); (B.F.d.L.); (M.C.d.S.); (A.J.M.); (I.R.); (V.H.)
- FMUP—Faculty of Medicine, University of Porto, 4200-319 Porto, Portugal;
- José Luis Costa
- FMUP—Faculty of Medicine, University of Porto, 4200-319 Porto, Portugal;
- i3S—Instituto de Investigação e Inovação em Saúde, Universidade do Porto, 4200-135 Porto, Portugal
- IPATIMUP—Institute of Molecular Pathology and Immunology of the University of Porto, 4200-135 Porto, Portugal
- Venceslau Hespanhol
- CHUSJ—Centro Hospitalar e Universitário de São João, 4200-319 Porto, Portugal; (C.F.); (E.N.); (B.F.d.L.); (M.C.d.S.); (A.J.M.); (I.R.); (V.H.)
- FMUP—Faculty of Medicine, University of Porto, 4200-319 Porto, Portugal;
- António Cunha
- INESC TEC—Institute for Systems and Computer Engineering, Technology and Science, 4200-465 Porto, Portugal; (I.N.); (J.M.); (M.M.); (J.S.); (J.F.); (A.C.); (H.P.O.)
- UTAD—University of Trás-os-Montes and Alto Douro, 5001-801 Vila Real, Portugal
- Hélder P. Oliveira
- INESC TEC—Institute for Systems and Computer Engineering, Technology and Science, 4200-465 Porto, Portugal; (I.N.); (J.M.); (M.M.); (J.S.); (J.F.); (A.C.); (H.P.O.)
- FCUP—Faculty of Science, University of Porto, 4169-007 Porto, Portugal
3
Chen Y, Yang XH, Wei Z, Heidari AA, Zheng N, Li Z, Chen H, Hu H, Zhou Q, Guan Q. Generative Adversarial Networks in Medical Image augmentation: A review. Comput Biol Med 2022; 144:105382. [PMID: 35276550 DOI: 10.1016/j.compbiomed.2022.105382] [Citation(s) in RCA: 49] [Impact Index Per Article: 24.5] [Received: 01/30/2022] [Revised: 02/25/2022] [Accepted: 03/02/2022] [Indexed: 12/31/2022]
Abstract
OBJECT: With the development of deep learning, the number of training samples for medical image-based diagnosis and treatment models is increasing. Generative Adversarial Networks (GANs) have attracted attention in medical image processing due to their excellent image generation capabilities and have been widely used in data augmentation. In this paper, a comprehensive and systematic review and analysis of medical image augmentation work is carried out, and its research status and development prospects are reviewed. METHOD: This paper reviews 105 papers related to medical image augmentation, mainly collected from Elsevier, IEEE Xplore, and Springer and published from 2018 to 2021. We counted these papers by the organ depicted in the images, and catalogued the medical image datasets that appear in them, the loss functions used in model training, and the quantitative evaluation metrics of image augmentation. We also briefly introduce the literature collected from three journals and three conferences that have received attention in medical image processing. RESULT: First, we summarize the advantages of various augmentation models, loss functions, and evaluation metrics. Researchers can use this information as a reference when designing augmentation tasks. Second, we explore the relationship between augmentation models and the size of the training set, and tease out the role that augmentation models may play when the quality of the training set is limited. Third, the number of published papers shows that the development momentum of this research field remains strong. Furthermore, we discuss the existing limitations of this type of model and suggest possible research directions. CONCLUSION: We discuss GAN-based medical image augmentation work in detail. This method effectively alleviates the challenge of limited training samples for medical image diagnosis and treatment models. It is hoped that this review will benefit researchers interested in this field.
Affiliation(s)
- Yizhou Chen
- College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China.
- Xu-Hua Yang
- College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China.
- Zihan Wei
- College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China.
- Ali Asghar Heidari
- School of Surveying and Geospatial Engineering, College of Engineering, University of Tehran, Tehran, Iran; Department of Computer Science, School of Computing, National University of Singapore, Singapore, Singapore.
- Nenggan Zheng
- Qiushi Academy for Advanced Studies, Zhejiang University, Hangzhou, Zhejiang, China.
- Zhicheng Li
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China.
- Huiling Chen
- College of Computer Science and Artificial Intelligence, Wenzhou University, Wenzhou, Zhejiang, 325035, China.
- Haigen Hu
- College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China.
- Qianwei Zhou
- College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China.
- Qiu Guan
- College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China.
4
Weakly supervised learning for classification of lung cytological images using attention-based multiple instance learning. Sci Rep 2021; 11:20317. [PMID: 34645863 PMCID: PMC8514584 DOI: 10.1038/s41598-021-99246-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/03/2021] [Accepted: 09/22/2021] [Indexed: 11/09/2022] Open
Abstract
In cytological examination, suspicious cells are evaluated regarding malignancy and cancer type. To assist this, we previously proposed an automated method based on supervised learning that classifies cells in lung cytological images as benign or malignant. However, it is often difficult to label all cells. In this study, we developed a weakly supervised method for the classification of benign and malignant lung cells in cytological images using attention-based deep multiple instance learning (AD MIL). Images of lung cytological specimens were divided into small patch images and stored in bags. Each bag was then labeled as benign or malignant, and classification was conducted using AD MIL. The distribution of attention weights was also calculated as a color map to confirm the presence of malignant cells in the image. AD MIL using the AlexNet-like convolutional neural network model showed the best classification performance, with an accuracy of 0.916, which was better than that of supervised learning. In addition, an attention map of the entire image based on the attention weight allowed AD MIL to focus on most malignant cells. Our weakly supervised method automatically classifies cytological images with acceptable accuracy based on supervised learning without complex annotations.
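The bag-level attention step this abstract describes can be sketched as follows. This is an illustrative version of the commonly used attention pooling for multiple instance learning (a softmax over per-instance scores `w · tanh(V h)`), not the authors' implementation; the function name, the toy patch embeddings, and the weights `V` and `w` are all made up, whereas in practice they would be learned jointly with the CNN feature extractor.

```python
import math

def attention_pool(bag, V, w):
    """Return (attention weights, pooled bag embedding).

    bag: list of K instance feature vectors, each of length D
    V:   L x D projection matrix (list of rows)
    w:   scoring vector of length L
    """
    scores = []
    for h in bag:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wj * tj for wj, tj in zip(w, hidden)))
    # Numerically stable softmax over the instances in the bag.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The bag embedding is the attention-weighted sum of instance features.
    dim = len(bag[0])
    pooled = [sum(a * h[d] for a, h in zip(weights, bag)) for d in range(dim)]
    return weights, pooled

# Toy bag of three patch embeddings (hypothetical values).
bag = [[0.1, 0.0], [0.2, 0.1], [2.0, 1.5]]
V = [[1.0, 0.0], [0.0, 1.0]]
w = [1.0, 1.0]
weights, pooled = attention_pool(bag, V, w)
```

Only the bag receives a label, so the classifier is trained on `pooled`; the per-patch `weights` are what would be rendered as the color map mentioned in the abstract to show which patches drove the decision.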
5
Chlap P, Min H, Vandenberg N, Dowling J, Holloway L, Haworth A. A review of medical image data augmentation techniques for deep learning applications. J Med Imaging Radiat Oncol 2021; 65:545-563. [PMID: 34145766 DOI: 10.1111/1754-9485.13261] [Citation(s) in RCA: 148] [Impact Index Per Article: 49.3] [Received: 02/07/2021] [Accepted: 05/23/2021] [Indexed: 12/21/2022]
Abstract
Research in artificial intelligence for radiology and radiotherapy has recently become increasingly reliant on the use of deep learning-based algorithms. While the models these algorithms produce can significantly outperform more traditional machine learning methods, they do rely on larger datasets being available for training. To address this issue, data augmentation has become a popular method for increasing the size of a training dataset, particularly in fields where large datasets aren't typically available, which is often the case when working with medical images. Data augmentation aims to generate additional data which is used to train the model and has been shown to improve performance when validated on a separate unseen dataset. This approach has become commonplace, so to help understand the types of data augmentation techniques used in state-of-the-art deep learning models, we conducted a systematic review of the literature where data augmentation was utilised on medical images (limited to CT and MRI) to train a deep learning model. Articles were categorised into basic, deformable, deep learning or other data augmentation techniques. As artificial intelligence models trained using augmented data make their way into the clinic, this review aims to give an insight into these techniques and confidence in the validity of the models produced.
Affiliation(s)
- Phillip Chlap
- South Western Sydney Clinical School, University of New South Wales, Sydney, New South Wales, Australia; Ingham Institute for Applied Medical Research, Sydney, New South Wales, Australia; Liverpool and Macarthur Cancer Therapy Centre, Liverpool Hospital, Sydney, New South Wales, Australia
- Hang Min
- South Western Sydney Clinical School, University of New South Wales, Sydney, New South Wales, Australia; Ingham Institute for Applied Medical Research, Sydney, New South Wales, Australia; The Australian e-Health and Research Centre, CSIRO Health and Biosecurity, Brisbane, Queensland, Australia
- Nym Vandenberg
- Institute of Medical Physics, University of Sydney, Sydney, New South Wales, Australia
- Jason Dowling
- South Western Sydney Clinical School, University of New South Wales, Sydney, New South Wales, Australia; The Australian e-Health and Research Centre, CSIRO Health and Biosecurity, Brisbane, Queensland, Australia
- Lois Holloway
- South Western Sydney Clinical School, University of New South Wales, Sydney, New South Wales, Australia; Ingham Institute for Applied Medical Research, Sydney, New South Wales, Australia; Liverpool and Macarthur Cancer Therapy Centre, Liverpool Hospital, Sydney, New South Wales, Australia; Institute of Medical Physics, University of Sydney, Sydney, New South Wales, Australia; Centre for Medical Radiation Physics, University of Wollongong, Wollongong, New South Wales, Australia
- Annette Haworth
- Institute of Medical Physics, University of Sydney, Sydney, New South Wales, Australia
6
Yu S, Zhang S, Wang B, Dun H, Xu L, Huang X, Shi E, Feng X. Generative adversarial network based data augmentation to improve cervical cell classification model. Mathematical Biosciences and Engineering: MBE 2021; 18:1740-1752. [PMID: 33757208 DOI: 10.3934/mbe.2021090] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/12/2023]
Abstract
The survival rate of cervical cancer can be improved by early screening. However, screening is a heavy task for pathologists. Thus, an automatic cervical cell classification model is proposed to assist pathologists in screening. In cervical cell classification, the number of abnormal cells is small, and the ratio of abnormal cells to normal cells is small too. To deal with the small-sample and class-imbalance problems, a generative adversarial network (GAN) trained on images of abnormal cells is proposed to generate additional images of abnormal cells. A convolutional neural network (CNN) is then trained using both generated and real images. We design four experiments: 1) training the CNN on under-sampled images of normal cells and the real images of abnormal cells; 2) pre-training the CNN on another dataset and fine-tuning it on real images of cells; 3) training the CNN on generated images of abnormal cells and the real images; 4) pre-training the CNN on generated images of abnormal cells and fine-tuning it on real images of cells. Comparing these experimental results, we find that 1) GAN-generated images of abnormal cells can effectively address the small-sample and class-imbalance problems in cervical cell classification; 2) the CNN model pre-trained on generated images and fine-tuned on real images achieves the best performance, with an AUC of 0.984.
Affiliation(s)
- Suxiang Yu
- Department of Pathology, The Fourth Central Hospital of Baoding City, Baoding 072350, China
- Shuai Zhang
- Department of Computer Science, The University of Manchester, Manchester M13 9PL, UK
- Bin Wang
- Department of Pathology, The Fourth Central Hospital of Baoding City, Baoding 072350, China
- Hua Dun
- Department of Pathology, The Fourth Central Hospital of Baoding City, Baoding 072350, China
- Long Xu
- Solar Activity Prediction Center, National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100012, China
- Xin Huang
- Solar Activity Prediction Center, National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100012, China
- Ermin Shi
- Department of Information Technology, The Fourth Central Hospital of Baoding City, Baoding 072350, China
- Xinxing Feng
- Endocrinology and Cardiovascular Disease Centre, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100037, China
7
Deep Neural Networks for Dental Implant System Classification. Biomolecules 2020; 10:biom10070984. [PMID: 32630195 PMCID: PMC7407934 DOI: 10.3390/biom10070984] [Citation(s) in RCA: 69] [Impact Index Per Article: 17.3] [Received: 05/30/2020] [Revised: 06/27/2020] [Accepted: 06/29/2020] [Indexed: 02/08/2023] Open
Abstract
In this study, we used panoramic X-ray images to classify and clarify the accuracy of different dental implant brands via deep convolutional neural networks (CNNs) with transfer-learning strategies. For objective labeling, 8859 implant images of 11 implant systems were used from digital panoramic radiographs obtained from patients who underwent dental implant treatment at Kagawa Prefectural Central Hospital, Japan, between 2005 and 2019. Five deep CNN models (specifically, a basic CNN with three convolutional layers, VGG16 and VGG19 transfer-learning models, and finely tuned VGG16 and VGG19) were evaluated for implant classification. Among the five models, the finely tuned VGG16 model exhibited the highest implant classification performance. The finely tuned VGG19 was second best, followed by the normal transfer-learning VGG16. We confirmed that the finely tuned VGG16 and VGG19 CNNs could accurately classify dental implant systems from 11 types of panoramic X-ray images.