1
|
Nanni L, Paci M, Brahnam S, Lumini A. Feature transforms for image data augmentation. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07645-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
A problem with convolutional neural networks (CNNs) is that they require large datasets to obtain adequate robustness; on small datasets, they are prone to overfitting. Many methods have been proposed to overcome this shortcoming with CNNs. In cases where additional samples cannot easily be collected, a common approach is to generate more data points from existing data using an augmentation technique. In image classification, many augmentation approaches utilize simple image manipulation algorithms. In this work, we propose some new methods for data augmentation based on several image transformations: the Fourier transform (FT), the Radon transform (RT), and the discrete cosine transform (DCT). These and other data augmentation methods are considered in order to quantify their effectiveness in creating ensembles of neural networks. The novelty of this research is to consider different strategies for data augmentation to generate training sets from which to train several classifiers which are combined into an ensemble. Specifically, the idea is to create an ensemble based on a kind of bagging of the training set, where each model is trained on a different training set obtained by augmenting the original training set with different approaches. We build ensembles on the data level by adding images generated by combining fourteen augmentation approaches, with three based on FT, RT, and DCT, proposed here for the first time. Pretrained ResNet50 networks are finetuned on training sets that include images derived from each augmentation method. These networks and several fusions are evaluated and compared across eleven benchmarks. Results show that building ensembles on the data level by combining different data augmentation methods produces classifiers that not only compete with the state of the art but often surpass the best approaches reported in the literature.
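As a rough illustration of what a DCT-based augmentation could look like, the sketch below transforms a grayscale image into the DCT domain, randomly zeroes some coefficients while keeping the DC term, and transforms back. The abstract does not specify how the coefficients are perturbed, so the random-dropping rule, the function names `dct_matrix` and `dct_augment`, and the `drop_prob` parameter are all assumptions made for this example, not the authors' method.

```python
import numpy as np


def dct_matrix(n):
    """Return the n x n orthonormal DCT-II matrix."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # DC row has its own normalisation
    return m


def dct_augment(image, drop_prob=0.5, rng=None):
    """Hypothetical augmentation: zero random DCT coefficients of a 2-D image.

    The DC coefficient is always kept, so the mean intensity is preserved.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape
    d_h, d_w = dct_matrix(h), dct_matrix(w)
    coeffs = d_h @ image @ d_w.T  # forward 2-D DCT
    keep = rng.random(coeffs.shape) >= drop_prob  # True where a coefficient survives
    keep[0, 0] = True  # never drop the DC term
    return d_h.T @ (coeffs * keep) @ d_w  # inverse 2-D DCT
```

Because the transform matrices are orthonormal, setting `drop_prob=0.0` reproduces the input image up to floating-point error, which makes the sketch easy to sanity-check before using it to generate extra training images.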
|
2
|
Deep Feature Fusion and Optimization-Based Approach for Stomach Disease Classification. SENSORS 2022; 22:s22072801. [PMID: 35408415 PMCID: PMC9003289 DOI: 10.3390/s22072801] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/03/2022] [Revised: 03/26/2022] [Accepted: 04/02/2022] [Indexed: 01/10/2023]
Abstract
Cancer is the deadliest disease among all the diseases and the main cause of human mortality. Several types of cancer sicken the human body and affect organs. Among all the types of cancer, stomach cancer is the most dangerous disease that spreads rapidly and needs to be diagnosed at an early stage. The early diagnosis of stomach cancer is essential to reduce the mortality rate. The manual diagnosis process is time-consuming, requires many tests, and depends on the availability of an expert doctor. Therefore, automated techniques are required to diagnose stomach infections from endoscopic images. Many computerized techniques have been introduced in the literature, but due to a few challenges (i.e., high similarity between the healthy and infected regions, irrelevant feature extraction, and so on), there is much room to improve the accuracy and reduce the computational time. In this paper, a deep-learning-based stomach disease classification method employing deep feature extraction, fusion, and optimization using WCE images is proposed. The proposed method comprises several phases: data augmentation to increase the number of dataset images, deep transfer learning for deep feature extraction, feature fusion of the extracted deep features, optimization of the fused feature matrix with a modified dragonfly optimization method, and final classification of the stomach disease. The feature extraction phase employed two pre-trained deep CNN models (Inception v3 and DenseNet-201), performing activation on feature derivation layers. Later, parallel concatenation was performed on the deep-derived features, which were then optimized using the meta-heuristic method named the dragonfly algorithm. The optimized feature matrix was classified by employing machine-learning algorithms and achieved an accuracy of 99.8% on the combined stomach disease dataset. A comparison with state-of-the-art techniques shows improved accuracy.
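The fusion step described above can be pictured as column-wise concatenation of two per-image feature matrices, followed by keeping a subset of the fused columns. The sketch below is only an illustration under assumptions: the helper names `fuse_features` and `select_features` are hypothetical, and the boolean mask merely stands in for whatever subset the modified dragonfly optimizer would return, since the abstract does not describe that algorithm in detail.

```python
import numpy as np


def fuse_features(feats_a, feats_b):
    """Concatenate two (n_samples, n_features) matrices along the feature axis."""
    if feats_a.shape[0] != feats_b.shape[0]:
        raise ValueError("both feature matrices must describe the same samples")
    return np.concatenate([feats_a, feats_b], axis=1)


def select_features(fused, mask):
    """Keep only the fused columns flagged in a boolean mask.

    The mask is a placeholder for the output of a feature optimizer
    (a dragonfly-style metaheuristic in the abstract above).
    """
    mask = np.asarray(mask, dtype=bool)
    if mask.shape != (fused.shape[1],):
        raise ValueError("mask needs one entry per fused feature")
    return fused[:, mask]
```

In practice `feats_a` and `feats_b` would hold activations extracted from the two pre-trained networks for the same images, and the selected columns would be passed to a conventional classifier.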
|
3
|
Zhang R, Guo Z, Sun Y, Lu Q, Xu Z, Yao Z, Duan M, Liu S, Ren Y, Huang L, Zhou F. COVID19XrayNet: A Two-Step Transfer Learning Model for the COVID-19 Detecting Problem Based on a Limited Number of Chest X-Ray Images. Interdiscip Sci 2020; 12:555-565. [PMID: 32959234 PMCID: PMC7505483 DOI: 10.1007/s12539-020-00393-5] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 04/24/2020] [Revised: 09/02/2020] [Accepted: 09/05/2020] [Indexed: 12/31/2022]
Abstract
The novel coronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused a major pandemic outbreak recently. Various diagnostic technologies have been under active development. The novel coronavirus disease (COVID-19) may induce pulmonary failures, and chest X-ray imaging has become one of the major confirmed diagnostic technologies. The very limited number of publicly available samples has rendered the training of deep neural networks unstable and inaccurate. This study proposed a two-step transfer learning pipeline and a deep residual network framework, COVID19XrayNet, for the COVID-19 detection problem based on chest X-ray images. COVID19XrayNet first tunes the transferred model on a large dataset of chest X-ray images, and the model is then further tuned using a small dataset of annotated chest X-ray images. The final model achieved 0.9108 accuracy. The experimental data also suggested that the model may be improved as more training samples are released.
Affiliation(s)
- Ruochi Zhang
  - BioKnow Health Informatics Lab, College of Computer Science and Technology, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, Jilin, China
- Zhehao Guo
  - School of Computing and Information, University of Pittsburgh, 135 N Bellefield Ave, Pittsburgh, PA, 15213, USA
- Yue Sun
  - School of Computing and Information, University of Pittsburgh, 135 N Bellefield Ave, Pittsburgh, PA, 15213, USA
- Qi Lu
  - School of Computing and Information, University of Pittsburgh, 135 N Bellefield Ave, Pittsburgh, PA, 15213, USA
- Zijian Xu
  - School of Computing and Information, University of Pittsburgh, 135 N Bellefield Ave, Pittsburgh, PA, 15213, USA
- Zhaomin Yao
  - BioKnow Health Informatics Lab, College of Computer Science and Technology, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, Jilin, China
- Meiyu Duan
  - BioKnow Health Informatics Lab, College of Computer Science and Technology, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, Jilin, China
- Shuai Liu
  - BioKnow Health Informatics Lab, College of Computer Science and Technology, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, Jilin, China
- Yanjiao Ren
  - College of Information Technology, Jilin Agricultural University, Changchun, 130118, Jilin, China
- Lan Huang
  - BioKnow Health Informatics Lab, College of Computer Science and Technology, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, Jilin, China
- Fengfeng Zhou
  - BioKnow Health Informatics Lab, College of Computer Science and Technology, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, Jilin, China
|
4
|
Feng X, Hao X, Shi R, Xia Z, Huang L, Yu Q, Zhou F. Detection and Comparative Analysis of Methylomic Biomarkers of Rheumatoid Arthritis. Front Genet 2020; 11:238. [PMID: 32292416 PMCID: PMC7119472 DOI: 10.3389/fgene.2020.00238] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/11/2019] [Accepted: 02/28/2020] [Indexed: 01/05/2023]
Abstract
Rheumatoid arthritis (RA) is a common autoimmune disorder influenced by both genetic and environmental factors. To investigate possible contributions of DNA methylation to the etiology of RA with minimum confounding genetic heterogeneity, we investigated genome-wide DNA methylation in disease-discordant monozygotic twin pairs. This study hypothesized that methylomic biomarkers might facilitate accurate RA detection. A comprehensive series of biomarker detection algorithms were utilized to find the best methylomic biomarkers for detecting RA patients using the methylomic data of the peripheral blood samples. The best model achieved 100.00% in accuracy (Acc) with 81 methylomic biomarkers and a 10-fold cross-validation (10FCV) strategy. Some of the methylomic biomarkers were experimentally confirmed to be associated with the onset or development of RA. It is also interesting to observe that many of the detected biomarkers were from chromosome Y, supporting the knowledge that RA has a significant gender discrepancy.
Affiliation(s)
- Xin Feng
  - Department of Epidemiology and Biostatistics, School of Public Health, Jilin University, Changchun, China; Jilin Institute of Chemical Technology, Jilin, China; BioKnow Health Informatics Lab, College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
- Xubing Hao
  - BioKnow Health Informatics Lab, College of Software, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
- Ruoyao Shi
  - BioKnow Health Informatics Lab, College of Life Sciences, Jilin University, Changchun, China
- Zhiqiang Xia
  - BioKnow Health Informatics Lab, College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
- Lan Huang
  - College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
- Qiong Yu
  - Department of Epidemiology and Biostatistics, School of Public Health, Jilin University, Changchun, China
- Fengfeng Zhou
  - BioKnow Health Informatics Lab, College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
|
5
|
Zhang Y, Chen C, Duan M, Liu S, Huang L, Zhou F. BioDog, biomarker detection for improving identification power of breast cancer histologic grade in methylomics. Epigenomics 2019; 11:1717-1732. [PMID: 31625763 DOI: 10.2217/epi-2019-0230] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 12/12/2022]
Abstract
Aim: Breast cancer histologic grade (HG) is a well-established prognostic factor. This study aimed to select methylomic biomarkers to predict breast cancer HGs. Materials & methods: The proposed algorithm BioDog first used a correlation bias reduction strategy to eliminate redundant features. Then incremental feature selection was applied to find the features with a high HG prediction accuracy. The sequential backward feature elimination strategy was employed to further refine the biomarkers. A comparison with existing algorithms was conducted. The HG-specific somatic mutations were investigated. Results & conclusions: BioDog achieved an accuracy of 0.9973 using 92 methylomic biomarkers for predicting breast cancer HGs. Many of these biomarkers were within the genes and lncRNAs associated with the HG development in breast cancer or other cancer types.
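Incremental feature selection, as mentioned in the methods above, is commonly implemented by scoring growing prefixes of a ranked feature list and keeping the prefix with the best score. The sketch below shows that generic idea only; it is not the BioDog implementation, and the function name `incremental_feature_selection` and the `evaluate` callback (for example, a cross-validated accuracy) are assumptions introduced for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def incremental_feature_selection(
    ranked: Sequence[str],
    evaluate: Callable[[List[str]], float],
) -> Tuple[List[str], float]:
    """Score the top-1, top-2, ... prefixes of a ranked feature list.

    `ranked` lists features from most to least promising; `evaluate` is a
    caller-supplied scoring function. Returns the best prefix and its score;
    ties are resolved in favour of the shorter prefix.
    """
    best_score = float("-inf")
    best_k = 0
    for k in range(1, len(ranked) + 1):
        score = evaluate(list(ranked[:k]))
        if score > best_score:  # strict comparison keeps the smaller subset on ties
            best_score, best_k = score, k
    return list(ranked[:best_k]), best_score
```

A backward elimination pass, as the abstract describes, could then start from the returned subset and try removing one feature at a time with the same `evaluate` callback.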
Affiliation(s)
- Yexian Zhang
  - College of Computer Science & Technology, & Key Laboratory of Symbolic Computation & Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, PR China
- Chaorong Chen
  - College of Software, Jilin University, Changchun, Jilin 130012, PR China
- Meiyu Duan
  - College of Computer Science & Technology, & Key Laboratory of Symbolic Computation & Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, PR China
- Shuai Liu
  - College of Computer Science & Technology, & Key Laboratory of Symbolic Computation & Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, PR China
- Lan Huang
  - College of Computer Science & Technology, & Key Laboratory of Symbolic Computation & Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, PR China
- Fengfeng Zhou
  - College of Computer Science & Technology, & Key Laboratory of Symbolic Computation & Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, PR China
|
6
|
Feng X, Hao X, Xin R, Gao X, Liu M, Li F, Wang Y, Shi R, Zhao S, Zhou F. Detecting Methylomic Biomarkers of Pediatric Autism in the Peripheral Blood Leukocytes. Interdiscip Sci 2019; 11:237-246. [DOI: 10.1007/s12539-019-00328-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 11/11/2018] [Revised: 03/25/2019] [Accepted: 03/28/2019] [Indexed: 12/12/2022]
|
7
|
Feng X, Li J, Li H, Chen H, Li F, Liu Q, You ZH, Zhou F. Age Is Important for the Early-Stage Detection of Breast Cancer on Both Transcriptomic and Methylomic Biomarkers. Front Genet 2019; 10:212. [PMID: 30984234 PMCID: PMC6448048 DOI: 10.3389/fgene.2019.00212] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 11/18/2018] [Accepted: 02/27/2019] [Indexed: 12/27/2022]
Abstract
Patients of different ages have different rates of cell development and metabolism. As a result, age should be an essential part of how a disease diagnosis model is trained and optimized. Unfortunately, most of the existing studies have not taken age into account. This study demonstrated that disease diagnosis models could be improved by merely applying individual models for patients of different age groups. Both transcriptomes and methylomes of the TCGA breast cancer dataset (TCGA-BRCA) were utilized for the analysis procedure of feature selection and classification. Our experimental data strongly suggested that disease diagnosis modeling should integrate patient age into the whole experimental design.
Affiliation(s)
- Xin Feng
  - BioKnow Health Informatics Lab, College of Computer Science and Technology, Jilin University, Changchun, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
- Jialiang Li
  - BioKnow Health Informatics Lab, College of Software, Jilin University, Changchun, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
- Han Li
  - BioKnow Health Informatics Lab, College of Software, Jilin University, Changchun, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
- Hang Chen
  - BioKnow Health Informatics Lab, College of Computer Science and Technology, Jilin University, Changchun, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
- Fei Li
  - BioKnow Health Informatics Lab, College of Software, Jilin University, Changchun, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
- Quewang Liu
  - BioKnow Health Informatics Lab, College of Computer Science and Technology, Jilin University, Changchun, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
- Zhu-Hong You
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
- Fengfeng Zhou
  - BioKnow Health Informatics Lab, College of Computer Science and Technology, Jilin University, Changchun, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China; BioKnow Health Informatics Lab, College of Software, Jilin University, Changchun, China
|