1
Li J, Shi J, Chen J, Du Z, Huang L. Self-attention random forest for breast cancer image classification. Front Oncol 2023; 13:1043463. [PMID: 36814814 PMCID: PMC9939756 DOI: 10.3389/fonc.2023.1043463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2022] [Accepted: 01/09/2023] [Indexed: 02/08/2023]
Abstract
Introduction: Early screening and diagnosis of breast cancer can detect hidden disease in time and effectively improve patient survival. Accurate classification of breast cancer images is therefore key to auxiliary diagnosis. Methods: After extracting multi-scale fusion features from breast cancer images with a pyramid gray-level co-occurrence matrix, we present a Self-Attention Random Forest (SARF) model as a classifier that explains the importance of the fusion features and adaptively refines them, thereby improving classification accuracy. In addition, we use the GridSearchCV technique to optimize the model's hyperparameters, which largely avoids the limitations of manually selected parameters. Results: To demonstrate the effectiveness of our method, we validate it on the BreaKHis breast cancer histopathological image dataset. The proposed method achieves an average accuracy of 92.96% and a micro-average AUC of 0.9588 for eight-class classification, and an average accuracy of 97.16% and an AUC of 0.9713 for binary classification. Discussion: To verify the generality of the proposed model, we also conduct experiments on the MIAS dataset, where it reaches an average classification accuracy of 98.79%. The experimental results show that the proposed method outperforms other state-of-the-art methods. Furthermore, we can analyze the influence of different types of features on the proposed model, providing a theoretical basis for its further optimization.
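The abstract above does not spell out how the pyramid gray-level co-occurrence matrix is built, so the following is only a minimal sketch of the general idea under stated assumptions: a symmetric co-occurrence count at a horizontal offset, a contrast statistic, and a pyramid formed by keeping every second pixel. The function names (`glcm`, `normalize`, `contrast`, `downsample`, `pyramid_contrast`) and the choice of contrast as the feature are illustrative assumptions, not the authors' implementation.

```python
def glcm(image, dx=1, dy=0, levels=None, symmetric=True):
    """Count co-occurring gray-level pairs (i, j) at pixel offset (dx, dy)."""
    rows, cols = len(image), len(image[0])
    if levels is None:
        levels = max(max(row) for row in image) + 1
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    counts[j][i] += 1  # also count the pair in reverse order
    return counts


def normalize(counts):
    """Turn raw counts into joint probabilities (unchanged if empty)."""
    total = sum(sum(row) for row in counts)
    if total == 0:
        return counts
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """GLCM contrast: sum over (i, j) of p[i][j] * (i - j)^2."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def downsample(image):
    """One pyramid step (assumed): keep every second row and column."""
    return [row[::2] for row in image[::2]]


def pyramid_contrast(image, n_levels=2, levels=None):
    """Concatenate the contrast feature across pyramid levels."""
    if levels is None:
        levels = max(max(row) for row in image) + 1
    features = []
    for _ in range(n_levels):
        if len(image[0]) < 2:  # no horizontal neighbours left
            break
        features.append(contrast(normalize(glcm(image, levels=levels))))
        image = downsample(image)
    return features


if __name__ == "__main__":
    img = [[0, 0, 1, 1],
           [0, 0, 1, 1],
           [0, 2, 2, 2],
           [2, 2, 3, 3]]
    print(glcm(img))
    print(pyramid_contrast(img, n_levels=2))
```

In practice the resulting multi-scale feature vector would be passed to a classifier; the hyperparameter search mentioned in the abstract (GridSearchCV) is outside the scope of this sketch.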
2
Zhu Q, Wang H, Xu B, Zhang Z, Shao W, Zhang D. Multimodal Triplet Attention Network for Brain Disease Diagnosis. IEEE Transactions on Medical Imaging 2022; 41:3884-3894. [PMID: 35969575 DOI: 10.1109/tmi.2022.3199032] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/15/2023]
Abstract
Multi-modal imaging data fusion has attracted much attention in medical data analysis because it can provide complementary information for more accurate analysis. Integrating functional and structural multi-modal imaging data has been increasingly used in the diagnosis of brain diseases, such as epilepsy. Most of the existing methods focus on the feature space fusion of different modalities but ignore the valuable high-order relationships among samples and the discriminative fused features for classification. In this paper, we propose a novel framework that fuses data from two modalities, functional MRI (fMRI) and diffusion tensor imaging (DTI), for epilepsy diagnosis, and that effectively captures the complementary information and discriminative features from different modalities by high-order feature extraction with the attention mechanism. Specifically, we propose a triple network to explore the discriminative information from the high-order representation feature space learned from multi-modal data. Meanwhile, self-attention is introduced to adaptively estimate the degree of importance between brain regions, and the cross-attention mechanism is utilized to extract complementary information from fMRI and DTI. Finally, we use the triple loss function to adjust the distance between samples in the common representation space. We evaluate the proposed method on the epilepsy dataset collected from Jinling Hospital, and the experimental results demonstrate that our method is significantly superior to several state-of-the-art diagnosis approaches.
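The loss that adjusts distances between samples in the common representation space is not given in the abstract. Assuming it corresponds to the standard triplet margin loss (an assumption suggested by the paper's title), a minimal sketch looks like this; the function names and the default margin of 1.0 are illustrative.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative sample is at least `margin`
    farther from the anchor than the positive sample is.
    """
    return max(0.0, euclidean(anchor, positive)
               - euclidean(anchor, negative) + margin)


if __name__ == "__main__":
    # Negative is far away: the constraint is satisfied, loss is 0.
    print(triplet_loss([0, 0], [0, 1], [3, 4]))
    # Negative is as close as the positive: loss equals the margin.
    print(triplet_loss([0, 0], [0, 1], [1, 0]))
```

Minimizing this quantity pulls same-class embeddings together and pushes different-class embeddings apart, which is the behavior the abstract describes.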
3
Gene Expression Analysis through Parallel Non-Negative Matrix Factorization. Computation 2021. [DOI: 10.3390/computation9100106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Gene expression analysis is a principal tool for explaining the behavior of genes in an organism exposed to different experimental conditions. Many clustering algorithms have been proposed in the state of the art. The amount of biological data is overwhelming, and its high-dimensional structure exceeds the capacity of most current computational architectures. Optimizing computational time and memory consumption therefore becomes a decisive factor in choosing a clustering algorithm. We propose a clustering algorithm based on Non-negative Matrix Factorization and K-means that reduces data dimensionality while preserving the biological context and prioritizing gene selection; it is implemented in parallel GPU-based environments through the CUDA library. A well-known dataset is used in our tests, and the quality of the results is measured through the Rand and Accuracy indices. The results show a speedup of 6.22× compared to the sequential version. The algorithm is competitive in the analysis of biological datasets and is invariant with respect to the number of classes and the size of the gene expression matrix.
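The Rand index mentioned above as a quality measure can be sketched as follows. This is the textbook definition (fraction of sample pairs on which two labelings agree about being in the same or in different clusters), not code from the paper; the function name `rand_index` is illustrative.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of sample pairs on which the two clusterings agree.

    A pair counts as an agreement when both labelings put the two
    samples in the same cluster, or both put them in different clusters.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label lists must have the same length")
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        raise ValueError("need at least two samples")
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)


if __name__ == "__main__":
    # Same partition with renamed cluster ids: perfect agreement.
    print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
    # One sample moved to the other cluster.
    print(rand_index([0, 0, 1, 1], [0, 1, 1, 1]))
```

Because the index only compares pair memberships, it does not depend on how the cluster ids are named.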
4
Guo W, Liang W, Deng Q, Zou X. A Multimodal Affinity Fusion Network for Predicting the Survival of Breast Cancer Patients. Front Genet 2021; 12:709027. [PMID: 34490038 PMCID: PMC8417828 DOI: 10.3389/fgene.2021.709027] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/13/2021] [Accepted: 06/29/2021] [Indexed: 01/27/2023]
Abstract
Accurate survival prediction of breast cancer is important for improving patient care. Approaches using multiple heterogeneous modalities such as gene expression, copy number alteration, and clinical data have shown significant advantages over those with only one modality for patient survival prediction. However, existing survival prediction methods tend to ignore the structured information between patients and multimodal data. We propose a multimodal data fusion model based on a novel multimodal affinity fusion network (MAFN) for survival prediction of breast cancer by integrating gene expression, copy number alteration, and clinical data. First, a stack-based shallow self-attention network is utilized to guide the amplification of tiny lesion regions on the original data, which locates and enhances the survival-related features. Then, an affinity fusion module is proposed to map the structured information between patients and multimodal data. The module endows the network with a stronger fusion feature representation and discrimination capability. Finally, the fusion feature embedding and a specific feature embedding from a triple modal network are fused to classify each patient as a long-term or short-term survivor. As expected, the evaluation results on comprehensive performance indicate that MAFN achieves better predictive performance than existing methods. Additionally, our method can be extended to survival prediction for other cancers, providing a new strategy for the prognosis of other diseases.
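The abstract does not describe the internals of its shallow self-attention network. As a generic illustration of what a self-attention step computes, here is a minimal sketch of scaled dot-product self-attention with identity projections (queries, keys, and values all equal to the input rows). The absence of learned projection matrices is a simplifying assumption, and the function names are illustrative.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(X):
    """Scaled dot-product self-attention with Q = K = V = X.

    X is a list of feature vectors (one per token or feature group).
    Returns (outputs, weights): each output row is a weighted average
    of all input rows, and weights[i][j] says how strongly row i
    attends to row j.
    """
    d = len(X[0])
    outputs, weights = [], []
    for q in X:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in X]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(w[t] * X[t][c] for t in range(len(X))) for c in range(d)]
        )
    return outputs, weights


if __name__ == "__main__":
    out, w = self_attention([[1.0, 0.0], [0.0, 1.0]])
    print(w)    # each row attends more strongly to itself
    print(out)
```

A trained network would additionally learn separate projections for queries, keys, and values, so the attention weights reflect learned rather than raw similarity.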
Affiliation(s)
- Weizhou Guo: College of Computer and Information Science, Southwest University, Chongqing, China
- Wenbin Liang: Key Laboratory of Luminescence Analysis and Molecular Sensing, Ministry of Education, College of Chemistry and Chemical Engineering, Southwest University, Chongqing, China
- Qingchun Deng: Department of Gynecology, The Second Affiliated Hospital of Hainan Medical University, Hainan, China
- Xianchun Zou: College of Computer and Information Science, Southwest University, Chongqing, China
5
Kim JY, Lee YS, Yu J, Park Y, Lee SK, Lee M, Lee JE, Kim SW, Nam SJ, Park YH, Ahn JS, Kang M, Im YH. Deep Learning-Based Prediction Model for Breast Cancer Recurrence Using Adjuvant Breast Cancer Cohort in Tertiary Cancer Center Registry. Front Oncol 2021; 11:596364. [PMID: 34017679 PMCID: PMC8129587 DOI: 10.3389/fonc.2021.596364] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 08/19/2020] [Accepted: 02/17/2021] [Indexed: 01/06/2023]
Abstract
Several prognosis prediction models have been developed for breast cancer (BC) patients with curative surgery, but there is still an unmet need to precisely determine BC prognosis for individual BC patients in real time. This is a retrospective analysis of data collected from the adjuvant BC registry at Samsung Medical Center between January 2000 and December 2016. The initial data set contained 325 clinical data elements: baseline characteristics with demographics, clinical and pathologic information, and follow-up clinical information including laboratory and imaging data during surveillance. The Weibull Time To Event Recurrent Neural Network (WTTE-RNN) by Martinsson was implemented for machine learning. We searched for the optimal window size for the time-stamped inputs. To develop the prediction model, data from 13,117 patients were split into training (60%), validation (20%), and test (20%) sets. The median follow-up duration was 4.7 years and the median number of visits was 8.4. We identified 32 features related to BC recurrence and considered them in further analyses. Point-in-time performance was calculated using Harrell's C-index and the area under the curve (AUC) at the 2-, 5-, and 7-year points. After 200 training epochs with a batch size of 100, the C-index reached 0.92 for the training data set and 0.89 for the validation and test data sets. The AUC values were 0.90 at the 2-year point, 0.91 at the 5-year point, and 0.91 at the 7-year point. The deep learning-based final model outperformed three other machine learning-based models. In terms of pathologic characteristics, the median absolute error (MAE) and weighted mean absolute error (wMAE) were as low as 3.5%. This BC prognosis model, which determines the probability of BC recurrence in real time, was developed with an RNN machine learning model using information from the time of BC diagnosis and the follow-up period.
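Harrell's C-index, used above as a performance measure, can be sketched in a few lines. This follows the common textbook definition for right-censored data (a pair is comparable when the subject with the shorter time had an observed event) and is not the study's own evaluation code; the function name and the handling of tied times (skipped) are assumptions of this sketch.

```python
def harrell_c_index(times, events, risk_scores):
    """Concordance index for right-censored survival data.

    times       : observed follow-up time per subject
    events      : 1/True if the event (e.g. recurrence) was observed,
                  0/False if the subject was censored
    risk_scores : model output, higher meaning higher predicted risk

    A pair (i, j) is comparable when subject i had an observed event
    and a strictly shorter time than subject j. The pair is concordant
    when the model gives i the higher risk; tied scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    times = [2, 4, 6, 8]
    events = [1, 1, 0, 1]
    print(harrell_c_index(times, events, [0.9, 0.7, 0.5, 0.1]))
    print(harrell_c_index(times, events, [0.9, 0.1, 0.5, 0.7]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the 0.89 to 0.92 range reported in the abstract.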
Affiliation(s)
- Ji-Yeon Kim: Division of Hematology-Oncology, Department of Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea
- Yong Seok Lee: Digital Health Business Team, Samsung SDS, Seoul, South Korea
- Jonghan Yu: Department of Surgery, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea
- Youngmin Park: Digital Health Business Team, Samsung SDS, Seoul, South Korea
- Se Kyung Lee: Department of Surgery, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea
- Minyoung Lee: Digital Health Business Team, Samsung SDS, Seoul, South Korea
- Jeong Eon Lee: Department of Surgery, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea
- Seok Won Kim: Department of Surgery, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea
- Seok Jin Nam: Department of Surgery, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea
- Yeon Hee Park: Division of Hematology-Oncology, Department of Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea
- Jin Seok Ahn: Division of Hematology-Oncology, Department of Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea
- Mira Kang: Center for Health Promotion, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea
- Young-Hyuck Im: Division of Hematology-Oncology, Department of Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea
6
Kuang S, Wang L. Identification and analysis of consensus RNA motifs binding to the genome regulator CTCF. NAR Genom Bioinform 2021; 2:lqaa031. [PMID: 33575587 PMCID: PMC7671415 DOI: 10.1093/nargab/lqaa031] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/03/2019] [Revised: 03/21/2020] [Accepted: 04/28/2020] [Indexed: 12/14/2022]
Abstract
CCCTC-binding factor (CTCF) is a key regulator of 3D genome organization and gene expression. Recent studies suggest that RNA transcripts, mostly long non-coding RNAs (lncRNAs), can serve as locus-specific factors to bind and recruit CTCF to the chromatin. However, it remains unclear whether specific sequence patterns are shared by the CTCF-binding RNA sites, and no RNA motif has been reported so far for CTCF binding. In this study, we have developed DeepLncCTCF, a new deep learning model based on a convolutional neural network and a bidirectional long short-term memory network, to discover the RNA recognition patterns of CTCF and identify candidate lncRNAs binding to CTCF. When evaluated on two different datasets, human U2OS dataset and mouse ESC dataset, DeepLncCTCF was shown to be able to accurately predict CTCF-binding RNA sites from nucleotide sequence. By examining the sequence features learned by DeepLncCTCF, we discovered a novel RNA motif with the consensus sequence, AGAUNGGA, for potential CTCF binding in humans. Furthermore, the applicability of DeepLncCTCF was demonstrated by identifying nearly 5000 candidate lncRNAs that might bind to CTCF in the nucleus. Our results provide useful information for understanding the molecular mechanisms of CTCF function in 3D genome organization.
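To make the consensus sequence AGAUNGGA concrete, the sketch below scans an RNA sequence for it, treating N as any nucleotide (the usual IUPAC reading, assumed here). This is plain pattern matching for illustration and is unrelated to how DeepLncCTCF itself makes predictions; the function names and the DNA-to-RNA conversion of the input are assumptions.

```python
import re

# Minimal IUPAC subset needed for this motif; N matches any RNA base.
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U", "N": "[ACGU]"}


def motif_to_regex(motif):
    """Compile a motif into a lookahead regex so overlapping hits are found."""
    body = "".join(IUPAC[ch] for ch in motif)
    return re.compile("(?=(" + body + "))")


def find_motif(sequence, motif="AGAUNGGA"):
    """Return 0-based start positions of the motif in the sequence.

    The input is upper-cased and T is converted to U, so DNA-style
    transcripts can be scanned as well.
    """
    rna = sequence.upper().replace("T", "U")
    return [m.start() for m in motif_to_regex(motif).finditer(rna)]


if __name__ == "__main__":
    print(find_motif("CCAGAUCGGAUU"))
    print(find_motif("agatgggaagataggac"))
```

A motif hit found this way is only a candidate site; the abstract describes the motif as indicating potential CTCF binding, not confirmed binding.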
Affiliation(s)
- Shuzhen Kuang: Department of Genetics and Biochemistry, Clemson University, Clemson, SC 29634, USA; Department of Biological Sciences, Clemson University, Clemson, SC 29634, USA
- Liangjiang Wang: Department of Genetics and Biochemistry, Clemson University, Clemson, SC 29634, USA
7
Hassaine A, Salimi-Khorshidi G, Canoy D, Rahimi K. Untangling the complexity of multimorbidity with machine learning. Mech Ageing Dev 2020; 190:111325. [PMID: 32768443 PMCID: PMC7493712 DOI: 10.1016/j.mad.2020.111325] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 05/19/2020] [Revised: 07/28/2020] [Accepted: 07/30/2020] [Indexed: 12/20/2022]
Abstract
The prevalence of multimorbidity has been increasing in recent years, posing a major burden for health care delivery and service. Understanding its determinants and impact is proving to be a challenge yet it offers new opportunities for research to go beyond the study of diseases in isolation. In this paper, we review how the field of machine learning provides many tools for addressing research challenges in multimorbidity. We highlight recent advances in promising methods such as matrix factorisation, deep learning, and topological data analysis and how these can take multimorbidity research beyond cross-sectional, expert-driven or confirmatory approaches to gain a better understanding of evolving patterns of multimorbidity. We discuss the challenges and opportunities of machine learning to identify likely causal links between previously poorly understood disease associations while giving an estimate of the uncertainty on such associations. We finally summarise some of the challenges for wider clinical adoption of machine learning research tools and propose some solutions.
Affiliation(s)
- Abdelaali Hassaine: Deep Medicine, Oxford Martin School, University of Oxford, Oxford, United Kingdom; NIHR Oxford Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, United Kingdom; Nuffield Department of Women's and Reproductive Health, University of Oxford, Oxford, United Kingdom
- Gholamreza Salimi-Khorshidi: Deep Medicine, Oxford Martin School, University of Oxford, Oxford, United Kingdom; Nuffield Department of Women's and Reproductive Health, University of Oxford, Oxford, United Kingdom
- Dexter Canoy: Deep Medicine, Oxford Martin School, University of Oxford, Oxford, United Kingdom; NIHR Oxford Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, United Kingdom; Nuffield Department of Women's and Reproductive Health, University of Oxford, Oxford, United Kingdom
- Kazem Rahimi: Deep Medicine, Oxford Martin School, University of Oxford, Oxford, United Kingdom; NIHR Oxford Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, United Kingdom; Nuffield Department of Women's and Reproductive Health, University of Oxford, Oxford, United Kingdom
8
Zhu N, Hou J, Ma G, Guo S, Zhao C, Chen B. Co-expression network analysis identifies a gene signature as a predictive biomarker for energy metabolism in osteosarcoma. Cancer Cell Int 2020; 20:259. [PMID: 32581649 PMCID: PMC7310058 DOI: 10.1186/s12935-020-01352-2] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 10/16/2019] [Accepted: 06/15/2020] [Indexed: 02/08/2023]
Abstract
Background: Osteosarcoma (OS) is a common malignant bone tumor originating in the interstitial tissues and occurring mostly in adolescents and young adults. Energy metabolism is a prerequisite for cancer cell growth, proliferation, invasion, and metastasis. However, the gene signatures associated with energy metabolism and the underlying molecular mechanisms that drive them are unknown. Methods: Energy metabolism-related genes were obtained from the TARGET database. We applied the “NMF” algorithm to classify putative signature genes into subtypes based on energy metabolism. Key genes related to progression were identified by weighted gene co-expression network analysis (WGCNA). Based on least absolute shrinkage and selection operator (LASSO) Cox proportional hazards regression analyses, a gene signature for the prediction of OS progression and prognosis was established. Robustness and estimation evaluations and comparison against other models were used to evaluate the prognostic performance of our model. Results: Two subtypes associated with energy metabolism were determined using the “NMF” algorithm, and significant modules related to energy metabolism were identified by WGCNA. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses suggested that the genes in the significant modules were enriched in kinase, immune metabolism processes, and metabolism-related pathways. We constructed a seven-gene signature consisting of SLC18B1, RBMXL1, DOK3, HS3ST2, ATP6V0D1, CCAR1, and C1QTNF1 to be used for OS progression and prognosis. Upregulation of CCAR1 and C1QTNF1 was associated with augmented OS risk, whereas increased expression of SLC18B1, RBMXL1, DOK3, HS3ST2, and ATP6V0D1 was correlated with a diminished risk of OS. We confirmed that the seven-gene signature was robust and superior to the earlier models evaluated; therefore, it may be used for timely OS diagnosis, treatment, and prognosis.
Conclusions: The seven-gene signature related to OS energy metabolism developed here could be used in the early diagnosis, treatment, and prognosis of OS.
Affiliation(s)
- Naiqiang Zhu: Department of Minimally Invasive Spinal Surgery, The Affiliated Hospital of Chengde Medical College, Chengde, 067000 China
- Jingyi Hou: Chengde Medical College, Chengde, 067000 China
- Guiyun Ma: Department of Minimally Invasive Spinal Surgery, The Affiliated Hospital of Chengde Medical College, Chengde, 067000 China
- Shuai Guo: Department of Minimally Invasive Spinal Surgery, The Affiliated Hospital of Chengde Medical College, Chengde, 067000 China
- Chengliang Zhao: Department of Minimally Invasive Spinal Surgery, The Affiliated Hospital of Chengde Medical College, Chengde, 067000 China
- Bin Chen: Department of Minimally Invasive Spinal Surgery, The Affiliated Hospital of Chengde Medical College, Chengde, 067000 China