1
Zhang L, Wang J, Chang R, Wang W. Investigation of the effectiveness of a classification method based on improved DAE feature extraction for hepatitis C prediction. Sci Rep 2024; 14:9143. [PMID: 38644402 PMCID: PMC11033254 DOI: 10.1038/s41598-024-59785-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Accepted: 04/15/2024] [Indexed: 04/23/2024] Open
Abstract
Hepatitis C, a particularly dangerous form of viral hepatitis caused by hepatitis C virus (HCV) infection, is a major socio-economic and public health problem. With the rapid development of deep learning, it has become common practice to apply deep learning in healthcare to improve the effectiveness and accuracy of disease identification. To improve the effectiveness and accuracy of hepatitis C detection, this study proposes an improved denoising autoencoder (IDAE) and applies it to hepatitis C detection. A conventional denoising autoencoder introduces random noise at the input layer of the encoder. However, encoders that directly add random noise may mask certain intrinsic properties of the data, making it challenging to learn deeper features. In this study, the problem of information loss in traditional denoising autoencoders is addressed by incorporating the concept of residual neural networks into an enhanced denoising autoencoder. In our experimental study, we applied this enhanced denoising autoencoder to the open-source Hepatitis C dataset and obtained significant results in feature extraction. While existing baseline machine learning methods have less than 90% accuracy and integrated algorithms and traditional autoencoders reach only 95% accuracy, the IDAE achieves 99% accuracy in the downstream hepatitis C classification task, a 9 percentage point improvement over a single algorithm and a nearly 4 percentage point improvement over integrated algorithms and other autoencoders. These results demonstrate that IDAE can effectively capture key disease features and improve the accuracy of disease prediction in hepatitis C data. This indicates that IDAE has the potential to be widely used in the detection and management of hepatitis C and similar diseases, especially in the development of early warning systems, progression prediction and personalised treatment strategies.
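To make the idea in this abstract concrete, the following is a minimal NumPy sketch of a denoising autoencoder forward pass with a residual (skip) connection. It is not the authors' IDAE: the abstract does not say where the residual path is placed, so the skip from the corrupted input to the output, the layer sizes, the noise level and all function names are assumptions chosen only for illustration.

```python
import numpy as np

def init_params(n_in, n_hidden, rng):
    """Small random weights for a one-hidden-layer autoencoder (illustrative sizes)."""
    return {
        "W_enc": rng.normal(0.0, 0.1, size=(n_in, n_hidden)),
        "b_enc": np.zeros(n_hidden),
        "W_dec": rng.normal(0.0, 0.1, size=(n_hidden, n_in)),
        "b_dec": np.zeros(n_in),
    }

def forward(params, x, noise_std, rng, residual=True):
    """Corrupt x with Gaussian noise, encode, decode, and optionally add a skip path."""
    x_noisy = x + rng.normal(0.0, noise_std, size=x.shape)
    h = np.tanh(x_noisy @ params["W_enc"] + params["b_enc"])  # bottleneck features
    x_hat = h @ params["W_dec"] + params["b_dec"]
    if residual:
        # Assumed skip connection: the decoder output is a correction added to
        # the corrupted input, so input information is not lost in the bottleneck.
        x_hat = x_hat + x_noisy
    return h, x_hat

def reconstruction_loss(x, x_hat):
    """Mean squared error against the clean input, the usual denoising target."""
    return float(np.mean((x - x_hat) ** 2))

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 12))  # 8 synthetic samples with 12 features each
params = init_params(12, 4, rng)
features, x_hat = forward(params, x, noise_std=0.1, rng=rng)
loss = reconstruction_loss(x, x_hat)
```

In a pipeline like the one described, `features` would be passed to a downstream classifier after the weights have been trained; training is omitted here to keep the sketch short.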
Affiliation(s)
- Lin Zhang: Zhejiang Hospital of Integrated Traditional Chinese and Western Medicine, Hangzhou, 310003, China
- Jixin Wang: Department of Statistics and Mathematics, Zhejiang Gongshang University, Hangzhou, 310018, China
- Rui Chang: Department of ICU, Jining No.1 People's Hospital, Jining, 272100, China
- Weigang Wang: Department of Statistics and Mathematics, Zhejiang Gongshang University, Hangzhou, 310018, China
2
Nasser M, Yusof UK. Deep Learning Based Methods for Breast Cancer Diagnosis: A Systematic Review and Future Direction. Diagnostics (Basel) 2023; 13:diagnostics13010161. [PMID: 36611453 PMCID: PMC9818155 DOI: 10.3390/diagnostics13010161] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 11/26/2022] [Revised: 12/19/2022] [Accepted: 12/19/2022] [Indexed: 01/06/2023] Open
Abstract
Breast cancer is one of the precarious conditions that affect women, and a substantive cure has not yet been discovered for it. With the advent of Artificial intelligence (AI), recently, deep learning techniques have been used effectively in breast cancer detection, facilitating early diagnosis and therefore increasing the chances of patients' survival. Compared to classical machine learning techniques, deep learning requires less human intervention for similar feature extraction. This study presents a systematic literature review on the deep learning-based methods for breast cancer detection that can guide practitioners and researchers in understanding the challenges and new trends in the field. Particularly, different deep learning-based methods for breast cancer detection are investigated, focusing on the genomics and histopathological imaging data. The study specifically adopts the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA), which offer a detailed analysis and synthesis of the published articles. Several studies were searched and gathered, and after the eligibility screening and quality evaluation, 98 articles were identified. The results of the review indicated that the Convolutional Neural Network (CNN) is the most accurate and extensively used model for breast cancer detection, and the accuracy metrics are the most popular method used for performance evaluation. Moreover, datasets utilized for breast cancer detection and the evaluation metrics are also studied. Finally, the challenges and future research direction in breast cancer detection based on deep learning models are also investigated to help researchers and practitioners acquire in-depth knowledge of and insight into the area.
3
Chen S, Zang Y, Xu B, Lu B, Ma R, Miao P, Chen B. An Unsupervised Deep Learning-Based Model Using Multiomics Data to Predict Prognosis of Patients with Stomach Adenocarcinoma. Computational and Mathematical Methods in Medicine 2022; 2022:5844846. [PMID: 36339684 PMCID: PMC9633210 DOI: 10.1155/2022/5844846] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/19/2022] [Revised: 09/25/2022] [Accepted: 10/08/2022] [Indexed: 09/08/2023]
Abstract
METHODS Patients (363 in total) with stomach adenocarcinoma from The Cancer Genome Atlas (TCGA) cohort were included. An autoencoder was constructed to integrate the RNA sequencing, miRNA sequencing, and methylation data. The features of the bottleneck layer were used to perform the k-means clustering algorithm to obtain different subgroups for evaluating the prognosis-related risk of stomach adenocarcinoma. The model's robustness was verified using a 10-fold cross-validation (CV). Survival was analyzed by the Kaplan-Meier method. Univariate and multivariate Cox regression was used to estimate hazard risk. The model was validated in three independent cohorts with different endpoints. RESULTS The patients were divided into low-risk and high-risk groups according to the k-means clustering algorithm. The high-risk group had a significantly higher risk of poor survival (log-rank P value = 2.80e-06; adjusted hazard ratio = 2.386, 95% confidence interval: 1.607-3.543), a concordance index (C-index) of 0.714, and a Brier score of 0.184. The model performed well both in the 10-fold CV procedure and three independent cohorts from the Gene Expression Omnibus (GEO) repository. CONCLUSIONS A robust and generalizable model based on the autoencoder was proposed to integrate multiomics data and predict the prognosis of patients with stomach adenocarcinoma. The model demonstrates better performance than two alternative approaches on prognosis prediction. The results might provide the grounds for further exploring the potential biomarkers to predict the prognosis of patients with stomach adenocarcinoma.
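The abstract reports a concordance index (C-index) for the two risk groups. For readers unfamiliar with the metric, here is a minimal pure-Python sketch of Harrell's C-index. The function name and the toy survival times, event flags and risk scores are invented for illustration and are unrelated to the TCGA cohort or to the authors' code.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when subject i has an observed event
    (events[i] == 1) strictly before subject j's recorded time. The pair is
    concordant when the subject who failed earlier has the higher risk score;
    tied risk scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data: survival time, event flag (1 = death observed, 0 = censored).
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]

# Continuous risk scores that order the patients perfectly.
c_perfect = concordance_index(times, events, [0.9, 0.7, 0.4, 0.1])

# Binary group labels (1 = high risk, 0 = low risk), as a two-cluster
# assignment would produce; ties within a group count as half.
c_groups = concordance_index(times, events, [1, 1, 0, 0])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives some context for the reported figure.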
Affiliation(s)
- Sizhen Chen: Department of Epidemiology and Biostatistics, School of Public Health, Southeast University, Nanjing 210009, China
- Yiteng Zang: Department of Epidemiology and Biostatistics, School of Public Health, Southeast University, Nanjing 210009, China
- Biyun Xu: Department of Biostatistics, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing 210008, China
- Beier Lu: Department of Epidemiology and Biostatistics, School of Public Health, Southeast University, Nanjing 210009, China
- Rongji Ma: Department of Epidemiology and Biostatistics, School of Public Health, Southeast University, Nanjing 210009, China
- Pengcheng Miao: Department of Epidemiology and Biostatistics, School of Public Health, Southeast University, Nanjing 210009, China
- Bingwei Chen: Department of Epidemiology and Biostatistics, School of Public Health, Southeast University, Nanjing 210009, China
4
Nassif AB, Talib MA, Nasir Q, Afadar Y, Elgendy O. Breast cancer detection using artificial intelligence techniques: A systematic literature review. Artif Intell Med 2022; 127:102276. [DOI: 10.1016/j.artmed.2022.102276] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/16/2021] [Revised: 10/18/2021] [Accepted: 03/04/2022] [Indexed: 02/07/2023]
5
Hira MT, Razzaque MA, Angione C, Scrivens J, Sawan S, Sarker M. Integrated multi-omics analysis of ovarian cancer using variational autoencoders. Sci Rep 2021; 11:6265. [PMID: 33737557 PMCID: PMC7973750 DOI: 10.1038/s41598-021-85285-4] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 08/19/2020] [Accepted: 02/28/2021] [Indexed: 02/06/2023] Open
Abstract
Cancer is a complex disease that deregulates cellular functions at various molecular levels (e.g., DNA, RNA, and proteins). Integrated multi-omics analysis of data from these levels is necessary to understand the aberrant cellular functions accountable for cancer and its development. In recent years, Deep Learning (DL) approaches have become a useful tool in integrated multi-omics analysis of cancer data. However, high dimensional multi-omics data are generally imbalanced, with too many molecular features and relatively few patient samples. This imbalance makes a DL-based integrated multi-omics analysis difficult. DL-based dimensionality reduction techniques, including the variational autoencoder (VAE), are a potential solution to balance high dimensional multi-omics data. However, there are few VAE-based integrated multi-omics analyses, and they are limited to pancancer. In this work, we did an integrated multi-omics analysis of ovarian cancer using the compressed features learned through VAE and an improved version of VAE, namely Maximum Mean Discrepancy VAE (MMD-VAE). First, we designed and developed a DL architecture for VAE and MMD-VAE. Then we used the architecture for mono-omics, integrated di-omics and tri-omics data analysis of ovarian cancer through cancer samples identification, molecular subtypes clustering and classification, and survival analysis. The results show that MMD-VAE and VAE-based compressed features can respectively classify the transcriptional subtypes of the TCGA datasets with an accuracy in the range of 93.2-95.5% and 87.1-95.7%. Also, survival analysis results show that VAE and MMD-VAE based compressed representation of omics data can be used in cancer prognosis.
Based on the results, we can conclude that (i) VAE and MMD-VAE outperform existing dimensionality reduction techniques, (ii) integrated multi-omics analyses perform better than or similarly to their mono-omics counterparts, and (iii) MMD-VAE performs better than VAE in most omics datasets.
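The abstract names Maximum Mean Discrepancy as the ingredient that distinguishes MMD-VAE from a plain VAE. As a rough illustration only, the sketch below estimates squared MMD between two samples with a Gaussian kernel using NumPy. The kernel choice, the bandwidth, the synthetic two-dimensional "latent" samples and the function names are assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth ** 2))

def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples x and y."""
    k_xx = rbf_kernel(x, x, bandwidth).mean()
    k_yy = rbf_kernel(y, y, bandwidth).mean()
    k_xy = rbf_kernel(x, y, bandwidth).mean()
    return float(k_xx + k_yy - 2.0 * k_xy)

rng = np.random.default_rng(0)
prior = rng.normal(size=(200, 2))             # stand-in for samples from the prior
shifted = rng.normal(loc=3.0, size=(200, 2))  # stand-in for poorly matched latent codes

mmd_same = mmd_squared(prior, prior)       # identical samples give zero
mmd_shifted = mmd_squared(prior, shifted)  # mismatched samples give a clearly positive value
```

In an MMD-based autoencoder, a term of this kind between encoded samples and prior samples is typically added to the reconstruction loss as a regulariser; the weighting of that term is a further design choice not covered here.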
Affiliation(s)
- Muta Tah Hira: School of Health and Life Sciences, Teesside University, Middlesbrough, TS4 3BX, UK
- M A Razzaque: School of Computing, Eng. & Digital Tech., Teesside University, Middlesbrough, TS4 3BX, UK
- Claudio Angione: School of Computing, Eng. & Digital Tech., Teesside University, Middlesbrough, TS4 3BX, UK
- James Scrivens: School of Health and Life Sciences, Teesside University, Middlesbrough, TS4 3BX, UK
- Saladin Sawan: The James Cook University Hospital, Middlesbrough, TS4 3BW, UK
- Mosharraf Sarker: School of Health and Life Sciences, Teesside University, Middlesbrough, TS4 3BX, UK
6
Dai X, Lian X, Wang G, Shang J, Zhang L, Zhang Q, Lei H, Yan Y, Wang Y, Zou H. Mapping the amelogenin protein expression during porcine molar crown development. Ann Anat 2021; 234:151665. [PMID: 33400984 DOI: 10.1016/j.aanat.2020.151665] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2020] [Revised: 12/13/2020] [Accepted: 12/16/2020] [Indexed: 11/16/2022]
Abstract
INTRODUCTION Amelogenin (AMEL) plays critical roles during enamel and dentin matrix deposition and mineralization. Most studies have focused on the expression patterns of AMEL through the bud, cap, and bell stages. The spatial-temporal expression of AMEL protein during different mineralization stages, especially from the presence of crypts to the crown-completed stage, remains unknown. Thus, the distribution pattern of AMEL in tooth crown formation from Nolla Stage 1 to 6 was investigated. METHODS Porcine mandibular molar tooth germs from Nolla Stage 1 to 6 were obtained. The dynamic morphologic changes of tooth germs were examined by X-ray and surgical operating microscope. The AMEL protein expression was evaluated immunohistochemically, then analyzed semi-quantitatively, and further visualized via heat map. RESULTS Tooth germs continuously increased in size from Nolla Stage 1 to 6. AMEL expression in the newly formed enamel remained negative but was intensely positive in the previously formed enamel from Stage 1 to 3. The adjacent enamel-dentin junction (EDJ) was strongly positive during the whole process. In predentin, AMEL was weakly expressed at Stage 1, dramatically up-regulated from Stage 2 to Stage 3, and then down-regulated, although it remained clearly visible throughout the process. AMEL expression in dentin decreased during dentin matrix secretion and mineralization. CONCLUSIONS This study identified the dynamic distribution of AMEL during porcine tooth crown formation. Semi-quantitative analysis and heat maps proved to be reliable tools for demonstrating the AMEL distribution pattern.
Affiliation(s)
- Xiaohua Dai: Tianjin Key Laboratory of Oral and Maxillofacial Function Reconstruction, Tianjin Stomatological Hospital, The Affiliated Stomatological Hospital of Nankai University, Tianjin 300041, China
- Xiaoli Lian: Tianjin Key Laboratory of Oral and Maxillofacial Function Reconstruction, Tianjin Stomatological Hospital, The Affiliated Stomatological Hospital of Nankai University, Tianjin 300041, China
- Guanhua Wang: Tianjin Key Laboratory of Oral and Maxillofacial Function Reconstruction, Tianjin Stomatological Hospital, The Affiliated Stomatological Hospital of Nankai University, Tianjin 300041, China
- Jianwei Shang: Tianjin Key Laboratory of Oral and Maxillofacial Function Reconstruction, Tianjin Stomatological Hospital, The Affiliated Stomatological Hospital of Nankai University, Tianjin 300041, China; Department of Oral Pathology, Tianjin Stomatological Hospital, The Affiliated Stomatological Hospital of Nankai University, Tianjin 300041, China
- Le Zhang: Department of Oral Pathology, Tianjin Stomatological Hospital, The Affiliated Stomatological Hospital of Nankai University, Tianjin 300041, China
- Qingzhi Zhang: Department of Oral and Maxillofacial Surgery, Tianjin Stomatological Hospital, The Affiliated Stomatological Hospital of Nankai University, Tianjin 300041, China
- Han Lei: Tianjin Key Laboratory of Oral and Maxillofacial Function Reconstruction, Tianjin Stomatological Hospital, The Affiliated Stomatological Hospital of Nankai University, Tianjin 300041, China; Department of Oral & Maxillofacial Radiology, Tianjin Stomatological Hospital, The Affiliated Stomatological Hospital of Nankai University, Tianjin 300041, China
- Yan Yan: Tianjin Key Laboratory of Oral and Maxillofacial Function Reconstruction, Tianjin Stomatological Hospital, The Affiliated Stomatological Hospital of Nankai University, Tianjin 300041, China
- Yue Wang: Tianjin Key Laboratory of Oral and Maxillofacial Function Reconstruction, Tianjin Stomatological Hospital, The Affiliated Stomatological Hospital of Nankai University, Tianjin 300041, China; School of Medicine, Nankai University, Tianjin 300071, China
- Huiru Zou: Tianjin Key Laboratory of Oral and Maxillofacial Function Reconstruction, Tianjin Stomatological Hospital, The Affiliated Stomatological Hospital of Nankai University, Tianjin 300041, China
7
Autoencoded DNA methylation data to predict breast cancer recurrence: Machine learning models and gene-weight significance. Artif Intell Med 2020; 110:101976. [PMID: 33250148 DOI: 10.1016/j.artmed.2020.101976] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 09/09/2019] [Revised: 08/05/2020] [Accepted: 10/18/2020] [Indexed: 12/29/2022]
Abstract
Breast cancer is the most frequent cancer in women and the second most frequent overall after lung cancer. Although the 5-year survival rate of breast cancer is relatively high, recurrence is also common and often involves metastasis, with its consequent threat to patients. DNA methylation-derived databases have become an interesting primary source for supervised knowledge extraction regarding breast cancer. Unfortunately, the study of DNA methylation involves the processing of hundreds of thousands of features for every patient. DNA methylation data are characterized by High Dimension Low Sample Size, which raises well-known issues for feature selection and generation. Autoencoders (AEs) appear as a specific technique for conducting nonlinear feature fusion. Our main objective in this work is to design a procedure to summarize DNA methylation by taking advantage of AEs. Our proposal is able to generate new features from the values of CpG sites of patients with and without recurrence. Then, a limited set of relevant genes to characterize breast cancer recurrence is proposed by the application of survival analysis and a weighted ranking of genes according to the distribution of their CpG sites. To test our proposal we have selected a dataset from The Cancer Genome Atlas data portal and an AE with a single hidden layer. The literature and enrichment analysis (based on genomic context and functional annotation) conducted regarding the genes obtained with our experiment confirmed that all of these genes were related to breast cancer recurrence.
8
Impact of Gene Biomarker Discovery Tools Based on Protein–Protein Interaction and Machine Learning on Performance of Artificial Intelligence Models in Predicting Clinical Stages of Breast Cancer. Interdiscip Sci 2020; 12:476-486. [DOI: 10.1007/s12539-020-00390-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/13/2020] [Revised: 08/21/2020] [Accepted: 08/31/2020] [Indexed: 12/27/2022]
9
Li Z, Yang K, Zhang L, Wei C, Yang P, Xu W. Classification of Thyroid Nodules with Stacked Denoising Sparse Autoencoder. Int J Endocrinol 2020; 2020:9015713. [PMID: 33488708 PMCID: PMC7787836 DOI: 10.1155/2020/9015713] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/17/2020] [Revised: 11/20/2020] [Accepted: 11/26/2020] [Indexed: 02/05/2023] Open
Abstract
PURPOSE Several commercial tests have been used for the classification of indeterminate thyroid nodules in cytology. However, geographic inconvenience and high cost limit their widespread use. This study aims to develop a classifier for convenient clinical use. METHODS Gene expression data of thyroid nodule tissues were collected from three public databases. Immune-related genes were used to construct the classifier with a stacked denoising sparse autoencoder. RESULTS The classifier performed well in discriminating malignant and benign thyroid nodules, with an area under the curve of 0.785 [0.638-0.931], accuracy of 92.9% [92.7-93.0%], sensitivity of 98.6% [95.9-101.3%], specificity of 58.3% [30.4-86.2%], positive likelihood ratio of 2.367 [1.211-4.625], and negative likelihood ratio of 0.024 [0.003-0.177]. In the cancer prevalence range of 20-40% for indeterminate thyroid nodules in cytology, the range of positive predictive value of this classifier was 37-61%, and the range of negative predictive value was 98-99%. CONCLUSION The classifier developed in this study has superb discriminative ability for thyroid nodules. However, it needs validation in cytologically indeterminate thyroid nodules before clinical use.
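The predictive values and likelihood ratios in this abstract follow from sensitivity, specificity and prevalence by Bayes' rule. The short sketch below reproduces that arithmetic from the rounded figures quoted above; the function names are chosen here for illustration, and small differences from the published likelihood ratios are expected because the inputs are rounded.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-), which do not depend on prevalence."""
    return sensitivity / (1.0 - specificity), (1.0 - sensitivity) / specificity

SENSITIVITY = 0.986  # reported sensitivity
SPECIFICITY = 0.583  # reported specificity

ppv_20, npv_20 = predictive_values(SENSITIVITY, SPECIFICITY, 0.20)
ppv_40, npv_40 = predictive_values(SENSITIVITY, SPECIFICITY, 0.40)
lr_pos, lr_neg = likelihood_ratios(SENSITIVITY, SPECIFICITY)

for label, ppv, npv in (("20%", ppv_20, npv_20), ("40%", ppv_40, npv_40)):
    print(f"prevalence {label}: PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

With these rounded inputs the formulas give a positive predictive value of about 37% and 61%, and a negative predictive value of about 99% and 98%, at 20% and 40% prevalence respectively. A high sensitivity paired with a modest specificity is what makes a negative result far more informative than a positive one here.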
Affiliation(s)
- Zexin Li: Health Care Center, The First Affiliated Hospital of Shantou University Medical College, No. 57, Changping Road, Shantou 515041, China
- Kaiji Yang: Department of Radiology, The First Affiliated Hospital of Shantou University Medical College, No. 57, Changping Road, Shantou 515041, China
- Lili Zhang: Health Care Center, The First Affiliated Hospital of Shantou University Medical College, No. 57, Changping Road, Shantou 515041, China
- Chiju Wei: Multidisciplinary Research Center, Shantou University, No. 243, Daxue Road, Shantou 515063, China
- Peixuan Yang: Health Care Center, The First Affiliated Hospital of Shantou University Medical College, No. 57, Changping Road, Shantou 515041, China
- Wencan Xu: Department of Endocrinology, The First Affiliated Hospital of Shantou University Medical College, No. 57, Changping Road, Shantou 515041, China
10
Applications of Bioinformatics in Cancer. Cancers (Basel) 2019; 11:cancers11111630. [PMID: 31652939 PMCID: PMC6893424 DOI: 10.3390/cancers11111630] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 10/22/2019] [Accepted: 10/23/2019] [Indexed: 01/02/2023] Open