1
|
Cai Y, Zhou N, Zhao J, Li W, Wang S. CSSEC: An adaptive approach integrating consensus and specific self-expressive coefficients for multi-omics cancer subtyping. Methods 2025; 235:26-33. [PMID: 39880224 DOI: 10.1016/j.ymeth.2025.01.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2024] [Revised: 01/05/2025] [Accepted: 01/16/2025] [Indexed: 01/31/2025]
Abstract
Cancer is a complex and heterogeneous disease, and accurate cancer subtyping can significantly improve patient survival rates. The complexity of cancer spans multiple omics levels, and analyzing multi-omics data for cancer subtyping has become a major focus of research. However, extracting complementary information from different omics data sources and adaptively integrating them remains a major challenge. To address this, we proposed an adaptive approach integrating consensus and specific self-expressive coefficients for multi-omics cancer subtyping (CSSEC). First, independent self-expressive networks are applied to each omics to calculate coefficient matrices that measure patient similarity. Then, two feature graph convolutional network modules capture consensus and specific similarity features using the top-K relevant features. Finally, the multi-omics self-expressive coefficient matrix is constructed from the consensus and specific similarity features. Furthermore, joint consistency and disparity constraints are applied to regularize the fusion of the self-expressive coefficients. Experimental results demonstrate that CSSEC outperforms existing state-of-the-art methods in survival analysis. Moreover, case studies on kidney cancer confirm that the cancer subtypes identified by CSSEC are biologically significant. The complete code is available at https://github.com/ykxhs/CSSEC.
Affiliation(s)
- Yueyi Cai
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China.
- Nan Zhou
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China.
- Junran Zhao
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China.
- Weihua Li
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China.
- Shunfang Wang
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China.
|
2
|
Wu Y, Xie L. AI-driven multi-omics integration for multi-scale predictive modeling of genotype-environment-phenotype relationships. Comput Struct Biotechnol J 2025; 27:265-277. [PMID: 39886532 PMCID: PMC11779603 DOI: 10.1016/j.csbj.2024.12.030] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/31/2024] [Revised: 12/22/2024] [Accepted: 12/26/2024] [Indexed: 02/01/2025]
Abstract
Despite the wealth of single-cell multi-omics data, it remains challenging to predict the consequences of novel genetic and chemical perturbations in the human body. It requires knowledge of molecular interactions at all biological levels, encompassing disease models and humans. Current machine learning methods primarily establish statistical correlations between genotypes and phenotypes but struggle to identify physiologically significant causal factors, limiting their predictive power. Key challenges in predictive modeling include scarcity of labeled data, generalization across different domains, and disentangling causation from correlation. In light of recent advances in multi-omics data integration, we propose a new artificial intelligence (AI)-powered biology-inspired multi-scale modeling framework to tackle these issues. This framework will integrate multi-omics data across biological levels, organism hierarchies, and species to predict genotype-environment-phenotype relationships under various conditions. AI models inspired by biology may identify novel molecular targets, biomarkers, pharmaceutical agents, and personalized medicines for presently unmet medical needs.
Affiliation(s)
- You Wu
- Ph.D. Program in Computer Science, The Graduate Center, The City University of New York, New York, NY, USA
- Lei Xie
- Ph.D. Program in Computer Science, The Graduate Center, The City University of New York, New York, NY, USA
- Ph.D. Program in Biology and Biochemistry, The Graduate Center, The City University of New York, New York, NY, USA
- Department of Computer Science, Hunter College, The City University of New York, New York, NY, USA
- Helen & Robert Appel Alzheimer's Disease Research Institute, Feil Family Brain & Mind Research Institute, Weill Cornell Medicine, Cornell University, New York, NY, USA
|
3
|
Lin J, Deng W, Wei J, Zheng J, Chen K, Chai H, Zeng T, Tang H. GD-Net: An Integrated Multimodal Information Model Based on Deep Learning for Cancer Outcome Prediction and Informative Feature Selection. J Cell Mol Med 2024; 28:e70221. [PMID: 39628446 PMCID: PMC11615516 DOI: 10.1111/jcmm.70221] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2024] [Revised: 09/27/2024] [Accepted: 11/08/2024] [Indexed: 12/08/2024]
Abstract
Multimodal information provides valuable resources for cancer prognosis and survival prediction. However, the computational integration of this heterogeneous information poses significant challenges due to the complex interactions between molecules from different biological modalities and the limited sample size. Here, we introduce GD-Net, a Graph Deep learning algorithm that enhances the accuracy of survival prediction, reaching an average accuracy of 72% through early fusion of multimodal information; it includes an interpretable and lightweight XGBoost module to efficiently extract informative features. First, we applied GD-Net to eight cancer datasets and achieved superior performance compared to benchmarking methods, with a C-index that was 7.9% higher on average. The ablation experiments strongly supported that multi-modal integration could significantly improve accuracy over the single-modality model. In an in-depth case study of liver cancer, 319 differential genes, 15 differential miRNAs and 155 methylated differential genes based on the predicted risk subgroups were identified as the informative features, and we statistically and biologically validated the efficacy of these key molecules in internal and external test datasets. The comprehensive independent validations demonstrated that GD-Net is accurate and competitive in predicting different cancer outcomes in real-time, and it is an effective tool for identifying new multimodal prognosis biomarkers.
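For readers unfamiliar with the C-index reported in this abstract, the following is a minimal sketch of how a concordance index is typically computed for survival predictions. It is not GD-Net's evaluation code: the function name `concordance_index`, the toy data, and the choice to skip pairs with tied survival times are all assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Fraction of comparable patient pairs whose predicted risks are
    ordered consistently with their observed survival times.

    times  -- observed follow-up times
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    risks  -- predicted risk scores (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplifying assumption: pairs with tied times are skipped.
        # A pair is also not comparable if the shorter time is censored.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # higher risk for the earlier event: correct
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data, not taken from the paper.
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]
print(concordance_index(times, events, risks))
```

A value of 0.5 corresponds to random ordering and 1.0 to a perfect ranking, so a reported gain in C-index reflects better ordering of patients by risk rather than better calibration of absolute survival times.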
Affiliation(s)
- Junqi Lin
- School of Mathematics, Foshan University, Foshan, China
- Weizhen Deng
- School of Mathematics, Foshan University, Foshan, China
- Junyu Wei
- School of Mathematics, Foshan University, Foshan, China
- Kenan Chen
- School of Mathematics, Foshan University, Foshan, China
- Hua Chai
- School of Mathematics, Foshan University, Foshan, China
- Tao Zeng
- Guangzhou National Laboratory, Guangzhou, China
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Laboratory, Guangzhou Medical University, Guangzhou, China
- Hui Tang
- School of Mathematics, Foshan University, Foshan, China
|
4
|
Ballard JL, Wang Z, Li W, Shen L, Long Q. Deep learning-based approaches for multi-omics data integration and analysis. BioData Min 2024; 17:38. [PMID: 39358793 PMCID: PMC11446004 DOI: 10.1186/s13040-024-00391-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2024] [Accepted: 09/06/2024] [Indexed: 10/04/2024]
Abstract
BACKGROUND: The rapid growth of deep learning, as well as the vast and ever-growing amount of available data, have provided ample opportunity for advances in fusion and analysis of complex and heterogeneous data types. Different data modalities provide complementary information that can be leveraged to gain a more complete understanding of each subject. In the biomedical domain, multi-omics data includes molecular (genomics, transcriptomics, proteomics, epigenomics, metabolomics, etc.) and imaging (radiomics, pathomics) modalities which, when combined, have the potential to improve performance on prediction, classification, clustering and other tasks. Deep learning encompasses a wide variety of methods, each of which has certain strengths and weaknesses for multi-omics integration.
METHOD: In this review, we categorize recent deep learning-based approaches by their basic architectures and discuss their unique capabilities in relation to one another. We also discuss some emerging themes advancing the field of multi-omics integration.
RESULTS: Deep learning-based multi-omics integration methods were categorized broadly into non-generative (feedforward neural networks, graph convolutional neural networks, and autoencoders) and generative (variational methods, generative adversarial models, and a generative pretrained model). Generative methods have the advantage of being able to impose constraints on the shared representations to enforce certain properties or incorporate prior knowledge. They can also be used to generate or impute missing modalities. Recent advances achieved by these methods include the ability to handle incomplete data as well as going beyond the traditional molecular omics data types to integrate other modalities such as imaging data.
CONCLUSION: We expect to see further growth in methods that can handle missingness, as this is a common challenge in working with complex and heterogeneous data. Additionally, methods that integrate more data types are expected to improve performance on downstream tasks by capturing a comprehensive view of each sample.
Affiliation(s)
- Jenna L Ballard
- Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, 3700 Hamilton Walk, Philadelphia, PA, 19104, USA.
- Zexuan Wang
- Graduate Group in Applied Mathematics and Computational Science, University of Pennsylvania, 209 S. 33rd Street, Philadelphia, PA, 19104, USA
- Wenrui Li
- Department of Statistics, University of Connecticut, 215 Glenbrook Road, Storrs, CT, 06269, USA
- Li Shen
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, 423 Guardian Drive, Philadelphia, PA, 19104, USA.
- Qi Long
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, 423 Guardian Drive, Philadelphia, PA, 19104, USA.
|
5
|
Zhao Y, Li X, Zhou C, Peng H, Zheng Z, Chen J, Ding W. A review of cancer data fusion methods based on deep learning. INFORMATION FUSION 2024; 108:102361. [DOI: 10.1016/j.inffus.2024.102361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/04/2025]
|
6
|
Li M, Guo H, Wang K, Kang C, Yin Y, Zhang H. AVBAE-MODFR: A novel deep learning framework of embedding and feature selection on multi-omics data for pan-cancer classification. Comput Biol Med 2024; 177:108614. [PMID: 38796884 DOI: 10.1016/j.compbiomed.2024.108614] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2023] [Revised: 02/27/2024] [Accepted: 05/11/2024] [Indexed: 05/29/2024]
Abstract
Integration analysis of cancer multi-omics data for pan-cancer classification has the potential for clinical applications in various aspects such as tumor diagnosis, analyzing clinically significant features, and providing precision medicine. In these applications, the embedding and feature selection on high-dimensional multi-omics data is clinically necessary. Recently, deep learning algorithms have become the most promising cancer multi-omics integration analysis methods, due to their powerful capability of capturing nonlinear relationships. Developing effective deep learning architectures for cancer multi-omics embedding and feature selection remains a challenge for researchers in view of high dimensionality and heterogeneity. In this paper, we propose a novel two-phase deep learning model named AVBAE-MODFR for pan-cancer classification. AVBAE-MODFR achieves embedding by a multi2multi autoencoder based on the adversarial variational Bayes method and further performs feature selection utilizing a dual-net-based feature ranking method. AVBAE-MODFR utilizes AVBAE to pre-train the network parameters, which improves the classification performance and enhances feature ranking stability in MODFR. Firstly, AVBAE learns high-quality representation among multiple omics features for unsupervised pan-cancer classification. We design an efficient discriminator architecture to distinguish the latent distributions for updating forward variational parameters. Secondly, we propose MODFR to simultaneously evaluate multi-omics feature importance for feature selection by training a designed multi2one selector network, where the efficient evaluation approach based on the average gradient of random mask subsets can avoid bias caused by input feature drift. We conduct experiments on the TCGA pan-cancer dataset and compare it with four state-of-the-art methods for each phase. The results show the superiority of AVBAE-MODFR over SOTA methods.
Affiliation(s)
- Minghe Li
- National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, Engineering Research Center of Trusted Behavior Intelligence, Ministry of Education, College of Artificial Intelligence, Nankai University, Tongyan Road, Tianjin, China
- Huike Guo
- National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, Engineering Research Center of Trusted Behavior Intelligence, Ministry of Education, College of Artificial Intelligence, Nankai University, Tongyan Road, Tianjin, China
- Keao Wang
- National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, Engineering Research Center of Trusted Behavior Intelligence, Ministry of Education, College of Artificial Intelligence, Nankai University, Tongyan Road, Tianjin, China
- Chuanze Kang
- National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, Engineering Research Center of Trusted Behavior Intelligence, Ministry of Education, College of Artificial Intelligence, Nankai University, Tongyan Road, Tianjin, China
- Yanbin Yin
- Department of Food Science and Technology, University of Nebraska - Lincoln, NE, USA
- Han Zhang
- National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, Engineering Research Center of Trusted Behavior Intelligence, Ministry of Education, College of Artificial Intelligence, Nankai University, Tongyan Road, Tianjin, China.
|
7
|
Wei K, Qian F, Li Y, Zeng T, Huang T. Integrating multi-omics data of childhood asthma using a deep association model. FUNDAMENTAL RESEARCH 2024; 4:738-751. [PMID: 39156565 PMCID: PMC11330118 DOI: 10.1016/j.fmre.2024.03.022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2023] [Revised: 03/06/2024] [Accepted: 03/17/2024] [Indexed: 08/20/2024]
Abstract
Childhood asthma is one of the most common respiratory diseases with rising mortality and morbidity. Multi-omics data provide a new opportunity to explore collaborative biomarkers and corresponding diagnostic models of childhood asthma. To capture the nonlinear association of multi-omics data and improve the interpretability of the diagnostic model, we proposed a novel deep association model (DAM) and a corresponding efficient analysis framework. First, the Deep Subspace Reconstruction was used to fuse the omics data and diagnostic information, thereby correcting the distribution of the original omics data and reducing the influence of unnecessary data noise. Second, the Joint Deep Semi-Negative Matrix Factorization was applied to identify different latent sample patterns and extract biomarkers from different omics data levels. Third, our newly proposed Deep Orthogonal Canonical Correlation Analysis can rank features in the collaborative module, which can then be used to construct a diagnostic model that accounts for nonlinear correlation between different omics data levels. Using DAM, we deeply analyzed the transcriptome and methylation data of childhood asthma. The effectiveness of DAM is verified from the perspectives of algorithm performance and biological significance on the independent test dataset, by ablation experiment and comparison with many baseline methods from clinical and biological studies. The DAM-induced diagnostic model can achieve a prediction AUC of 0.912, which is higher than that of many alternative methods. Meanwhile, relevant pathways and biomarkers of childhood asthma are also recognized to be collectively altered on the gene expression and methylation levels. As an interpretable machine learning approach, DAM simultaneously considers the non-linear associations among samples and those among biological features, which should help explore interpretative biomarker candidates and efficient diagnostic models from multi-omics data analysis for human complex diseases.
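As context for the AUC figure quoted in this abstract, the sketch below shows one standard way to compute the area under the ROC curve: as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one. This is a generic illustration, not the DAM authors' evaluation code; the function name `roc_auc` and the toy labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels -- 1 for positive (e.g. asthma) samples, 0 for negative (control)
    scores -- predicted scores, higher meaning more likely positive
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Each positive/negative pair contributes 1 if ranked correctly,
    # 0.5 if the scores tie, and 0 otherwise.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the paper.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.7, 0.8, 0.3, 0.6, 0.6]
print(roc_auc(labels, scores))
```

The pairwise form is quadratic in the number of samples, which is acceptable for a small illustration; rank-based formulations give the same value more efficiently on large cohorts.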
Affiliation(s)
- Kai Wei
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Guoke Ningbo Life Science and Health Industry Research Institute, Ningbo 315000, China
- Fang Qian
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Yixue Li
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Guangzhou National Laboratory, Guangzhou 510000, China
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Laboratory, Guangzhou Medical University, Guangzhou 510000, China
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, China
- Tao Zeng
- Guangzhou National Laboratory, Guangzhou 510000, China
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Laboratory, Guangzhou Medical University, Guangzhou 510000, China
- Tao Huang
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
|
8
|
Yang H, Zhao L, Li D, An C, Fang X, Chen Y, Liu J, Xiao T, Wang Z. Subtype-WGME enables whole-genome-wide multi-omics cancer subtyping. CELL REPORTS METHODS 2024; 4:100781. [PMID: 38761803 PMCID: PMC11228280 DOI: 10.1016/j.crmeth.2024.100781] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2023] [Revised: 01/05/2024] [Accepted: 04/26/2024] [Indexed: 05/20/2024]
Abstract
We present an innovative strategy for integrating whole-genome-wide multi-omics data, which facilitates adaptive amalgamation by leveraging hidden layer features derived from high-dimensional omics data through a multi-task encoder. Empirical evaluations on eight benchmark cancer datasets substantiated that our proposed framework outstripped the comparative algorithms in cancer subtyping, delivering superior subtyping outcomes. Building upon these subtyping results, we establish a robust pipeline for identifying whole-genome-wide biomarkers, unearthing 195 significant biomarkers. Furthermore, we conduct an exhaustive analysis to assess the importance of each omic and non-coding region features at the whole-genome-wide level during cancer subtyping. Our investigation shows that both omics and non-coding region features substantially impact cancer development and survival prognosis. This study emphasizes the potential and practical implications of integrating genome-wide data in cancer research, demonstrating the potency of comprehensive genomic characterization. Additionally, our findings offer insightful perspectives for multi-omics analysis employing deep learning methodologies.
Affiliation(s)
- Hai Yang
- Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Liang Zhao
- Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Dongdong Li
- Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Congcong An
- Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Xiaoyang Fang
- Cornell Tech, Cornell University, New York, NY 14853, USA
- Yiwen Chen
- Center for Continuing and Lifelong Education, National University of Singapore, Singapore 119077, Singapore
- Jingping Liu
- Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Ting Xiao
- Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Zhe Wang
- Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China.
|
9
|
Lan W, Liao H, Chen Q, Zhu L, Pan Y, Chen YPP. DeepKEGG: a multi-omics data integration framework with biological insights for cancer recurrence prediction and biomarker discovery. Brief Bioinform 2024; 25:bbae185. [PMID: 38678587 PMCID: PMC11056029 DOI: 10.1093/bib/bbae185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2024] [Revised: 03/07/2024] [Accepted: 04/09/2024] [Indexed: 05/01/2024]
Abstract
Deep learning-based multi-omics data integration methods have the capability to reveal the mechanisms of cancer development, discover cancer biomarkers and identify pathogenic targets. However, current methods ignore the potential correlations between samples in integrating multi-omics data. In addition, providing accurate biological explanations still poses significant challenges due to the complexity of deep learning models. Therefore, there is an urgent need for a deep learning-based multi-omics integration method to explore the potential correlations between samples and provide model interpretability. Herein, we propose a novel interpretable multi-omics data integration method (DeepKEGG) for cancer recurrence prediction and biomarker discovery. In DeepKEGG, a biological hierarchical module is designed for local connections of neuron nodes and model interpretability based on the biological relationship between genes/miRNAs and pathways. In addition, a pathway self-attention module is constructed to explore the correlation between different samples and generate the potential pathway feature representation for enhancing the prediction performance of the model. Lastly, an attribution-based feature importance calculation method is utilized to discover biomarkers related to cancer recurrence and provide a biological interpretation of the model. Experimental results demonstrate that DeepKEGG outperforms other state-of-the-art methods in 5-fold cross validation. Furthermore, case studies also indicate that DeepKEGG serves as an effective tool for biomarker discovery. The code is available at https://github.com/lanbiolab/DeepKEGG.
Affiliation(s)
- Wei Lan
- Guangxi Key Laboratory of Multimedia Communications and Network Technology, School of Computer, Electronic and Information, Guangxi University, No. 100 Daxue Road, Xixiangtang District, Nanning 530004, China
- Haibo Liao
- Guangxi Key Laboratory of Multimedia Communications and Network Technology, School of Computer, Electronic and Information, Guangxi University, No. 100 Daxue Road, Xixiangtang District, Nanning 530004, China
- Qingfeng Chen
- Guangxi Key Laboratory of Multimedia Communications and Network Technology, School of Computer, Electronic and Information, Guangxi University, No. 100 Daxue Road, Xixiangtang District, Nanning 530004, China
- Lingzhi Zhu
- School of Computer and Information Science, Hunan Institute of Technology, No. 18 Henghua Road, Zhuhui District, Hengyang 421002, China
- Yi Pan
- School of Computer Science and Control Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, No. 1068 Xueyuan Avenue, Shenzhen University Town, Nanshan District, Shenzhen 518055, China
- Yi-Ping Phoebe Chen
- Department of Computer Science and Information Technology, La Trobe University, Plenty Rd, Bundoora, Melbourne, Victoria 3086, Australia
|
10
|
Yuan C, Yu XT, Wang J, Shu B, Wang XY, Huang C, Lv X, Peng QQ, Qi WH, Zhang J, Zheng Y, Wang SJ, Liang QQ, Shi Q, Li T, Huang H, Mei ZD, Zhang HT, Xu HB, Cui J, Wang H, Zhang H, Shi BH, Sun P, Zhang H, Ma ZL, Feng Y, Chen L, Zeng T, Tang DZ, Wang YJ. Multi-modal molecular determinants of clinically relevant osteoporosis subtypes. Cell Discov 2024; 10:28. [PMID: 38472169 PMCID: PMC10933295 DOI: 10.1038/s41421-024-00652-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Accepted: 01/24/2024] [Indexed: 03/14/2024]
Abstract
Due to a rapidly aging global population, osteoporosis and the associated risk of bone fractures have become a wide-spread public health problem. However, osteoporosis is very heterogeneous, and the existing standard diagnostic measure is not sufficient to accurately identify all patients at risk of osteoporotic fractures and to guide therapy. Here, we constructed the first prospective multi-omics atlas of the largest osteoporosis cohort to date (longitudinal data from 366 participants at three time points), and also implemented an explainable data-intensive analysis framework (DLSF: Deep Latent Space Fusion) for an omnigenic model based on a multi-modal approach that can capture the multi-modal molecular signatures (M3S) as explicit functional representations of hidden genotypes. Accordingly, through DLSF, we identified two subtypes of the osteoporosis population in Chinese individuals with corresponding molecular phenotypes, i.e., clinical intervention relevant subtypes (CISs), in which bone mineral density benefits response to calcium supplements in 2-year follow-up samples. Many snpGenes associated with these molecular phenotypes reveal diverse candidate biological mechanisms underlying osteoporosis, with xQTL preferences of osteoporosis and its subtypes indicating an omnigenic effect on different biological domains. Finally, these two subtypes were found to have different relevance to prior fracture and different fracture risk according to 4-year follow-up data. Thus, in clinical application, M3S could help us further develop improved diagnostic and treatment strategies for osteoporosis and identify a new composite index for fracture prediction, which were remarkably validated in an independent cohort (166 participants).
Affiliation(s)
- Chunchun Yuan
- Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Key Laboratory of Theory and Therapy of Muscles and Bones, Ministry of Education, Shanghai, China
- Spine Institute, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China
- Xiang-Tian Yu
- Clinical Research Center, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Jing Wang
- Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Key Laboratory of Theory and Therapy of Muscles and Bones, Ministry of Education, Shanghai, China
- Shanghai Geriatric Institute of Chinese Medicine, Shanghai, China
- Bing Shu
- Key Laboratory of Theory and Therapy of Muscles and Bones, Ministry of Education, Shanghai, China
- Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Xiao-Yun Wang
- Shanghai Research Institute of Acupuncture and Meridian, Shanghai, China
- Chen Huang
- Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Key Laboratory of Theory and Therapy of Muscles and Bones, Ministry of Education, Shanghai, China
- Spine Institute, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China
- Xia Lv
- Hudong Hospital of Shanghai, Shanghai, China
- Qian-Qian Peng
- CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Wen-Hao Qi
- Ministry of Education Key Laboratory of Contemporary Anthropology, Department of Anthropology and Human Genetics, School of Life Science, Fudan University, Shanghai, China
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
- Jing Zhang
- Green Valley (Shanghai) Pharmaceuticals Co., Ltd., Shanghai, China
- Yan Zheng
- Ministry of Education Key Laboratory of Contemporary Anthropology, Department of Anthropology and Human Genetics, School of Life Science, Fudan University, Shanghai, China
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
- Si-Jia Wang
- CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Qian-Qian Liang
- Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Key Laboratory of Theory and Therapy of Muscles and Bones, Ministry of Education, Shanghai, China
- Spine Institute, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China
- Qi Shi
- Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Key Laboratory of Theory and Therapy of Muscles and Bones, Ministry of Education, Shanghai, China
- Spine Institute, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China
- Shanghai University of Traditional Chinese Medicine, Shanghai, China
| | - Ting Li
- CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| | - He Huang
- CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Ministry of Education Key Laboratory of Contemporary Anthropology, Department of Anthropology and Human Genetics, School of Life Science, Fudan University, Shanghai, China
| | - Zhen-Dong Mei
- Ministry of Education Key Laboratory of Contemporary Anthropology, Department of Anthropology and Human Genetics, School of Life Science, Fudan University, Shanghai, China
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Fudan University, Shanghai, China
| | - Hai-Tao Zhang
- Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Key Laboratory of Theory and Therapy of Muscles and Bones, Ministry of Education, Shanghai, China
- Spine Institute, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China
| | - Hong-Bin Xu
- Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Key Laboratory of Theory and Therapy of Muscles and Bones, Ministry of Education, Shanghai, China
- Spine Institute, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China
| | - Jiarui Cui
- Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Key Laboratory of Theory and Therapy of Muscles and Bones, Ministry of Education, Shanghai, China
- Spine Institute, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China
| | - Hongyu Wang
- Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Key Laboratory of Theory and Therapy of Muscles and Bones, Ministry of Education, Shanghai, China
- Spine Institute, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China
| | - Hong Zhang
- Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Key Laboratory of Theory and Therapy of Muscles and Bones, Ministry of Education, Shanghai, China
- Spine Institute, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China
| | - Bin-Hao Shi
- Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Key Laboratory of Theory and Therapy of Muscles and Bones, Ministry of Education, Shanghai, China
- Spine Institute, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China
| | - Pan Sun
- Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Key Laboratory of Theory and Therapy of Muscles and Bones, Ministry of Education, Shanghai, China
- Spine Institute, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China
| | - Hui Zhang
- Hudong Hospital of Shanghai, Shanghai, China
| | | | - Yuan Feng
- Green Valley (Shanghai) Pharmaceuticals Co., Ltd., Shanghai, China
| | - Luonan Chen
- State Key Laboratory of Cell Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai, China.
| | - Tao Zeng
- Guangzhou National Laboratory, Guangzhou, China.
| | - De-Zhi Tang
- Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, China.
- Key Laboratory of Theory and Therapy of Muscles and Bones, Ministry of Education, Shanghai, China.
- Spine Institute, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China.
| | - Yong-Jun Wang
- Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, China.
- Key Laboratory of Theory and Therapy of Muscles and Bones, Ministry of Education, Shanghai, China.
- Spine Institute, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China.
- Shanghai University of Traditional Chinese Medicine, Shanghai, China.
| |
|
11
|
Zhu S, Wang W, Fang W, Cui M. Autoencoder-assisted latent representation learning for survival prediction and multi-view clustering on multi-omics cancer subtyping. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023; 20:21098-21119. [PMID: 38124589 DOI: 10.3934/mbe.2023933] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
Cancer subtyping (or cancer subtype identification) based on multi-omics data has played an important role in advancing diagnosis, prognosis and treatment, which has triggered the development of advanced multi-view clustering algorithms. However, the high dimensionality and heterogeneity of multi-omics data strongly affect the performance of these methods. In this paper, we propose to learn an informative latent representation with an autoencoder (AE) to naturally capture nonlinear omic features in lower dimensions, which helps identify the similarity of patients. Moreover, to take advantage of survival or other clinical information, a multi-omic survival analysis approach is embedded when integrating the similarity graphs of heterogeneous data at the multi-omics level. Then, a clustering method is applied to the integrated similarity to generate subtype groups. In the experimental part, the effectiveness of the proposed framework is confirmed by evaluating five different multi-omics datasets taken from The Cancer Genome Atlas. The results show that the AE-assisted multi-omics clustering method can identify clinically significant cancer subtypes.
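As a rough illustration of the similarity-integration step described in this abstract, the sketch below turns per-omics latent representations into patient similarity matrices and fuses them. It is not the authors' implementation: the latent matrices are assumed to come from already-trained autoencoders, the Gaussian kernel and its `sigma` are illustrative choices, and a plain average stands in for the survival-informed integration the abstract mentions. The function names are hypothetical.

```python
import numpy as np


def gaussian_similarity(z, sigma=1.0):
    """Patient-by-patient similarity from one omics latent matrix (rows = patients).

    Assumption: a Gaussian (RBF) kernel on Euclidean distance; the source
    abstract does not specify the kernel.
    """
    sq = np.sum(z ** 2, axis=1)
    # Squared Euclidean distances via the expansion |a-b|^2 = |a|^2 + |b|^2 - 2ab.
    d2 = sq[:, None] + sq[None, :] - 2.0 * (z @ z.T)
    d2 = np.maximum(d2, 0.0)  # clip tiny negative values caused by rounding
    return np.exp(-d2 / (2.0 * sigma ** 2))


def fuse_similarities(latents_by_omics, sigma=1.0):
    """Average the per-omics similarity matrices.

    Assumption: an unweighted mean is used here as a placeholder for the
    survival-informed integration described in the abstract.
    """
    sims = [gaussian_similarity(z, sigma) for z in latents_by_omics]
    return np.mean(sims, axis=0)


# Toy example: 4 patients, two omics layers with latent sizes 3 and 2.
rng = np.random.default_rng(0)
latent_expression = rng.normal(size=(4, 3))
latent_methylation = rng.normal(size=(4, 2))
fused = fuse_similarities([latent_expression, latent_methylation])
```

The fused matrix could then be passed to any similarity-based clustering method, such as spectral clustering, to obtain subtype groups.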
Affiliation(s)
- Shuwei Zhu
- School of Artificial Intelligence and Computer Science, Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi 214122, China
| | - Wenping Wang
- School of Artificial Intelligence and Computer Science, Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi 214122, China
| | - Wei Fang
- School of Artificial Intelligence and Computer Science, Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi 214122, China
| | - Meiji Cui
- School of Intelligent Manufacturing, Nanjing University of Science and Technology, Nanjing 210094, China
| |
|
12
|
Liang J, Li ZW, Sun ZN, Bi Y, Cheng H, Zeng T, Guo WF. Latent space search based multimodal optimization with personalized edge-network biomarker for multi-purpose early disease prediction. Brief Bioinform 2023; 24:bbad364. [PMID: 37833844 DOI: 10.1093/bib/bbad364] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Revised: 09/06/2023] [Accepted: 09/20/2023] [Indexed: 10/15/2023] Open
Abstract
Considering that cancer results from the co-mutation of several essential genes in individual patients, researchers have begun to focus on identifying personalized edge-network biomarkers (PEBs) using personalized edge-network analysis for clinical practice. However, most existing methods ignore the optimization of PEBs when multimodal biomarkers exist in multi-purpose early disease prediction (MPEDP). To solve this problem, this study proposes a novel model (MMPDENB-RBM) that combines personalized dynamic edge-network biomarker (PDENB) theory, a multimodal optimization strategy and a latent space search scheme to identify biomarkers with different configurations of PDENB modules (i.e. to effectively identify multimodal PDENBs). The application to the three largest cancer omics datasets from The Cancer Genome Atlas database (i.e. breast invasive carcinoma, lung squamous cell carcinoma and lung adenocarcinoma) showed that the MMPDENB-RBM model could predict the critical cancer state more effectively than other advanced methods. Our model also had better convergence, diversity and multimodal properties, as well as more effective optimization ability, than the other state-of-the-art methods. In particular, the multimodal PDENBs identified were simultaneously more enriched with different functional biomarkers, such as tissue-specific synthetic lethality edge-biomarkers including cancer driver genes and disease marker genes. Importantly, in line with our aim, these multimodal biomarkers carry diverse biological and biomedical significance for drug target screening, survival risk assessment and novel biomedical insight, as expected for the multiple purposes of personalized early disease prediction. In summary, the present study demonstrates the multimodal property of PDENBs, especially therapeutic biomarkers with greater biological significance, which can help with MPEDP for individual cancer patients.
Affiliation(s)
- Jing Liang
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
- State Key Laboratory of Intelligent Agricultural Power Equipment, Zhengzhou University, Luoyang 471000, China
| | - Zong-Wei Li
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
| | - Ze-Ning Sun
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
| | - Ying Bi
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
| | - Han Cheng
- School of Life Sciences, Zhengzhou University, Zhengzhou 450001, China
| | - Tao Zeng
- Guangzhou National Laboratory, Guangzhou 510005, China
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Laboratory, 510005, Guangzhou Medical University
| | - Wei-Feng Guo
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
- State Key Laboratory of Intelligent Agricultural Power Equipment, Zhengzhou University, Luoyang 471000, China
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
| |
|
13
|
Chen W, Wang H, Liang C. Deep multi-view contrastive learning for cancer subtype identification. Brief Bioinform 2023; 24:bbad282. [PMID: 37539822 DOI: 10.1093/bib/bbad282] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/22/2023] [Revised: 05/29/2023] [Accepted: 07/19/2023] [Indexed: 08/05/2023] Open
Abstract
Cancer heterogeneity has posed great challenges in exploring precise therapeutic strategies for cancer treatment. The identification of cancer subtypes aims to detect patients with distinct molecular profiles and thus could provide new clues on effective clinical therapies. While great efforts have been made, it remains challenging to develop powerful computational methods that can efficiently integrate multi-omics datasets for the task. In this paper, we propose a novel self-supervised learning model called Deep Multi-view Contrastive Learning (DMCL) for cancer subtype identification. Specifically, by incorporating the reconstruction loss, contrastive loss and clustering loss into a unified framework, our model simultaneously encodes the sample discriminative information into the extracted feature representations and well preserves the sample cluster structures in the embedded space. Moreover, DMCL is an end-to-end framework where the cancer subtypes could be directly obtained from the model outputs. We compare DMCL with eight alternatives ranging from classic cancer subtype identification methods to recently developed state-of-the-art systems on 10 widely used cancer multi-omics datasets as well as an integrated dataset, and the experimental results validate the superior performance of our method. We further conduct a case study on liver cancer and the analysis results indicate that different subtypes might have different responses to the selected chemotherapeutic drugs.
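To make the contrastive term in such a combined objective more concrete, the following sketch computes an InfoNCE-style loss between embeddings of the same patients from two omics views. This is a generic formulation chosen for illustration, not the exact DMCL loss: the function name, the temperature value and the pairing of views are assumptions, and the reconstruction and clustering terms are not shown.

```python
import numpy as np


def info_nce(z_a, z_b, temperature=0.5):
    """InfoNCE-style contrastive loss between two views of the same patients.

    Row i of z_a and row i of z_b are treated as a positive pair; every other
    row of z_b acts as a negative for patient i. Assumption: cosine similarity
    with a fixed temperature, which the source abstract does not specify.
    """
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = (z_a @ z_b.T) / temperature
    # Subtract the row maximum for numerical stability before the softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The positive pair for patient i sits on the diagonal.
    return float(-np.mean(np.diag(log_prob)))


# Toy check: aligned views should score a lower loss than mismatched ones.
view_a = np.eye(4)
aligned_loss = info_nce(view_a, np.eye(4))
mismatched_loss = info_nce(view_a, np.roll(np.eye(4), 1, axis=0))
```

In a unified framework of the kind the abstract describes, a term like this would be added to the reconstruction and clustering losses, typically with hypothetical weighting coefficients that need tuning.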
Affiliation(s)
- Wenlan Chen
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
| | - Hong Wang
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
| | - Cheng Liang
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
| |
|
14
|
Lee M. Deep Learning Techniques with Genomic Data in Cancer Prognosis: A Comprehensive Review of the 2021-2023 Literature. BIOLOGY 2023; 12:893. [PMID: 37508326 PMCID: PMC10376033 DOI: 10.3390/biology12070893] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 06/16/2023] [Accepted: 06/20/2023] [Indexed: 07/30/2023]
Abstract
Deep learning has brought about a significant transformation in machine learning, leading to an array of novel methodologies and consequently broadening its influence. The application of deep learning in various sectors, especially biomedical data analysis, has initiated a period filled with noteworthy scientific developments. This trend has strongly influenced cancer prognosis, where the interpretation of genomic data for survival analysis has become a central research focus. The capacity of deep learning to decode intricate patterns embedded within high-dimensional genomic data has provoked a paradigm shift in our understanding of cancer survival. Given the swift progression in this field, there is an urgent need for a comprehensive review that focuses on the most influential studies from 2021 to 2023. This review, through its careful selection and thorough exploration of dominant trends and methodologies, strives to fulfill this need. The paper aims to enhance our existing understanding of the application of deep learning in cancer survival analysis, while also highlighting promising directions for future research in this vibrant and rapidly growing field.
Affiliation(s)
- Minhyeok Lee
- School of Electrical and Electronics Engineering, Chung-Ang University, Seoul 06974, Republic of Korea
| |
|
15
|
Chen Z, Yang Z, Zhu L, Gao P, Matsubara T, Kanaya S, Altaf-Ul-Amin M. Learning vector quantized representation for cancer subtypes identification. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2023; 236:107543. [PMID: 37100024 DOI: 10.1016/j.cmpb.2023.107543] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/27/2022] [Revised: 02/13/2023] [Accepted: 04/07/2023] [Indexed: 05/21/2023]
Abstract
BACKGROUND AND OBJECTIVE: Defining and separating cancer subtypes is essential for facilitating personalized therapy modality and prognosis of patients. The definition of subtypes has been constantly recalibrated as a result of our deepened understanding. During this recalibration, researchers often rely on clustering of cancer data to provide an intuitive visual reference that could reveal the intrinsic characteristics of subtypes. The data being clustered are often omics data, such as transcriptomics, that have strong correlations to the underlying biological mechanism. However, while existing studies have shown promising results, they suffer from issues associated with omics data, namely sample scarcity and high dimensionality, and they impose unrealistic assumptions in order to extract useful features from the data while avoiding overfitting to spurious correlations. METHODS: This paper proposes to leverage a recent strong generative model, the Vector-Quantized Variational AutoEncoder, to tackle the data issues and extract discrete representations that are crucial to the quality of subsequent clustering by retaining only information relevant to reconstructing the input. RESULTS: Extensive experiments and medical analysis on multiple datasets comprising 10 distinct cancers demonstrate that the proposed clustering results can significantly and robustly improve prognosis over prevalent subtyping systems. CONCLUSION: Our proposal does not impose strict assumptions on data distribution; moreover, its latent features are better representations of the transcriptomic data in different cancer subtypes, capable of yielding superior clustering performance with any mainstream clustering method.
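The discrete representations mentioned in the METHODS part come from the vector-quantization step of a Vector-Quantized Variational AutoEncoder: each continuous encoder output is replaced by its nearest entry in a learned codebook. The sketch below shows only that lookup, under the assumption of squared Euclidean distance; the encoder, decoder, codebook training and loss terms are omitted, and the function name and toy values are hypothetical.

```python
import numpy as np


def quantize(latents, codebook):
    """Map each latent vector to its nearest codebook entry.

    latents:  array of shape (n_samples, dim), continuous encoder outputs.
    codebook: array of shape (n_codes, dim), assumed to be already learned.
    Returns the chosen code indices and the quantized vectors.
    """
    # Pairwise squared Euclidean distances, shape (n_samples, n_codes).
    diffs = latents[:, None, :] - codebook[None, :, :]
    dist2 = np.sum(diffs ** 2, axis=2)
    indices = np.argmin(dist2, axis=1)
    return indices, codebook[indices]


# Toy example with a two-entry codebook in two dimensions.
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
latents = np.array([[0.1, -0.1], [0.9, 1.2]])
indices, quantized = quantize(latents, codebook)
```

Downstream clustering can then operate on the quantized vectors (or on the code indices), which is what makes the representation discrete.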
Affiliation(s)
- Zheng Chen
- Graduate School of Engineering Science, Osaka University, Japan.
| | - Ziwei Yang
- Graduate School of Science and Technology, Nara Institute of Science and Technology, Japan
| | - Lingwei Zhu
- Department of Computing Science, University of Alberta, Canada
| | - Peng Gao
- Institute for Quantitative Biosciences, University of Tokyo, Japan
| | | | - Shigehiko Kanaya
- Graduate School of Science and Technology, Nara Institute of Science and Technology, Japan; Data Science Center, Nara Institute of Science and Technology, Japan
| | - Md Altaf-Ul-Amin
- Graduate School of Science and Technology, Nara Institute of Science and Technology, Japan
| |
|
16
|
Zhao J, Zhao B, Song X, Lyu C, Chen W, Xiong Y, Wei DQ. Subtype-DCC: decoupled contrastive clustering method for cancer subtype identification based on multi-omics data. Brief Bioinform 2023; 24:7005165. [PMID: 36702755 DOI: 10.1093/bib/bbad025] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 10/21/2022] [Revised: 12/21/2022] [Accepted: 01/08/2023] [Indexed: 01/28/2023] Open
Abstract
Due to the high heterogeneity and complexity of cancers, patients with different cancer subtypes often have distinct groups of genomic and clinical characteristics. Therefore, the discovery and identification of cancer subtypes are crucial to cancer diagnosis, prognosis and treatment. Recent technological advances have accelerated the increasing availability of multi-omics data for cancer subtyping. To take advantage of the complementary information from multi-omics data, it is necessary to develop computational models that can represent and integrate different layers of data into a single framework. Here, we propose a decoupled contrastive clustering method (Subtype-DCC) based on multi-omics data integration for clustering to identify cancer subtypes. The idea of contrastive learning is introduced into deep clustering based on deep neural networks to learn clustering-friendly representations. Experimental results demonstrate the superior performance of the proposed Subtype-DCC model in identifying cancer subtypes over the currently available state-of-the-art clustering methods. The strength of Subtype-DCC is also supported by the survival and clinical analysis.
Affiliation(s)
- Jing Zhao
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Bowen Zhao
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Xiaotong Song
- School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Chujun Lyu
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Weizhi Chen
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Yi Xiong
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
| | - Dong-Qing Wei
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
- Peng Cheng Laboratory, Vanke Cloud City Phase I Building 8, Xili Street, Nanshan District, Shenzhen, Guangdong, 518055, China
- Zhongjing Research and Industrialization Institute of Chinese Medicine, Zhongguancun Scientific Park, Meixi, Nanyang, Henan, 473006, China
| |
|
17
|
Liu Y, Li Y, Zeng T. Multi-omics of extracellular vesicles: An integrative representation of functional mediators and perspectives on lung disease study. FRONTIERS IN BIOINFORMATICS 2023; 3:1117271. [PMID: 36844931 PMCID: PMC9947558 DOI: 10.3389/fbinf.2023.1117271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2022] [Accepted: 01/31/2023] [Indexed: 02/11/2023] Open
Abstract
Extracellular vesicles are secreted by almost all cell types. EVs include a broader component known as exosomes that participate in cell-cell and tissue-tissue communication via carrying diverse biological signals from one cell type or tissue to another. EVs play roles as communication messengers of the intercellular network to mediate different physiological activities or pathological changes. In particular, most EVs are natural carriers of functional cargo such as DNA, RNA, and proteins, and thus they are relevant to advancing personalized targeted therapies in clinical practice. For the application of EVs, novel bioinformatic models and methods based on high-throughput technologies and multi-omics data are required to provide a deeper understanding of their biological and biomedical characteristics. These include qualitative and quantitative representation for identifying cargo markers, local cellular communication inference for tracing the origin and production of EVs, and distant organ communication reconstruction for targeting the influential microenvironment and transferable activators. Thus, this perspective paper introduces EVs in the context of multi-omics and provides an integrative bioinformatic viewpoint of the state of current research on EVs and their applications.
Affiliation(s)
| | - Yixue Li
- *Correspondence: Yixue Li; Tao Zeng
| | - Tao Zeng
- *Correspondence: Yixue Li; Tao Zeng
| |
|
18
|
Rodríguez Ruiz N, Abd Own S, Ekström Smedby K, Eloranta S, Koch S, Wästerlid T, Krstic A, Boman M. Data-driven support to decision-making in molecular tumour boards for lymphoma: A design science approach. Front Oncol 2022; 12:984021. [PMID: 36457495 PMCID: PMC9705761 DOI: 10.3389/fonc.2022.984021] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/01/2022] [Accepted: 10/03/2022] [Indexed: 09/10/2024] Open
Abstract
Background: The increasing amount of molecular data and knowledge about genomic alterations from next-generation sequencing processes together allow for a greater understanding of individual patients, thereby advancing precision medicine. Molecular tumour boards (MTBs) feature multidisciplinary teams of clinical experts who meet to discuss complex individual cancer cases. Preparing the meetings is a manual and time-consuming process. Purpose: To design a clinical decision support system to improve the multimodal data interpretation in molecular tumour board meetings for lymphoma patients at Karolinska University Hospital, Stockholm, Sweden. We investigated user needs and system requirements, explored the employment of artificial intelligence, and evaluated the proposed design with primary stakeholders. Methods: Design science methodology was used to form and evaluate the proposed artefact. Requirements elicitation was done through a scoping review followed by five semi-structured interviews. We used UML Use Case diagrams to model user interaction and UML Activity diagrams to inform the proposed flow of control in the system. Additionally, we modelled the current and future workflow for MTB meetings and its proposed machine learning pipeline. Interactive sessions with end-users validated the initial requirements based on a fictitious patient scenario, which helped further refine the system. Results: The analysis showed that an interactive, secure Web-based information system supporting the preparation of the meeting, multidisciplinary discussions, and clinical decision-making could address the identified requirements. Integrating artificial intelligence via continual learning and multimodal data fusion were identified as crucial elements that could provide accurate diagnosis and treatment recommendations. Impact: Our work is of methodological importance in that using artificial intelligence for molecular tumour boards is novel. We provide a consolidated proof-of-concept system that could support the end-to-end clinical decision-making process and positively and immediately impact patients. Conclusion: Augmenting a digital decision support system for molecular tumour boards with retrospective patient material is promising. This generates realistic and constructive material for human learning, and also digital data for continual learning by data-driven artificial intelligence approaches. The latter makes the future system adaptable to human bias, improving adequacy and decision quality over time and over tasks, while building and maintaining a digital log.
Affiliation(s)
- Núria Rodríguez Ruiz
- Department of Learning, Informatics, Management and Ethics (LIME), Health Informatics Centre, Karolinska Institutet, Stockholm, Sweden
| | - Sulaf Abd Own
- Department of Medicine Solna, Clinical Epidemiology Division, Karolinska Institutet, Stockholm, Sweden
- Department of Laboratory Medicine, Division of Pathology, Karolinska University Hospital Huddinge, Stockholm, Sweden
| | - Karin Ekström Smedby
- Department of Medicine Solna, Clinical Epidemiology Division, Karolinska Institutet, Stockholm, Sweden
- Department of Hematology, Karolinska University Hospital, Stockholm, Sweden
| | - Sandra Eloranta
- Department of Medicine Solna, Clinical Epidemiology Division, Karolinska Institutet, Stockholm, Sweden
| | - Sabine Koch
- Department of Learning, Informatics, Management and Ethics (LIME), Health Informatics Centre, Karolinska Institutet, Stockholm, Sweden
| | - Tove Wästerlid
- Department of Medicine Solna, Clinical Epidemiology Division, Karolinska Institutet, Stockholm, Sweden
- Department of Hematology, Karolinska University Hospital, Stockholm, Sweden
| | - Aleksandra Krstic
- Center for Hematology and Regenerative Medicine, Karolinska Institutet, Stockholm, Sweden
- Department of Laboratory Medicine, Karolinska Institutet, Stockholm, Sweden
| | - Magnus Boman
- Department of Learning, Informatics, Management and Ethics (LIME), Health Informatics Centre, Karolinska Institutet, Stockholm, Sweden
- School of Electrical Engineering and Computer Science (EECS)/Software and Computer Systems, KTH Royal Institute of Technology, Stockholm, Sweden
| |
|
19
|
Madhumita, Paul S. Capturing the latent space of an Autoencoder for multi-omics integration and cancer subtyping. Comput Biol Med 2022; 148:105832. [PMID: 35834966 DOI: 10.1016/j.compbiomed.2022.105832] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/16/2022] [Revised: 06/15/2022] [Accepted: 07/03/2022] [Indexed: 11/29/2022]
Abstract
BACKGROUND AND OBJECTIVE: The motivation behind cancer subtyping is to identify subgroups of cancer patients with distinguishable phenotypes of clinical importance. It can assist in the advancement of subtype-targeted treatments. Subtype identification is a complicated task and therefore requires multi-omics data integration to identify the precise patient subgroups. Over the years, several computational attempts have been made to identify cancer subtypes accurately using integrative multi-omics analysis. Some studies have used Autoencoders (AE) to capture multi-omics feature integration in lower dimensions for identifying subtypes in specific types of cancer. However, it remains necessary to capture a highly informative latent space by learning deep AE architectures that attain satisfactory generalized performance. Therefore, in this study, a novel AE-assisted cancer subtyping framework is presented that utilizes the compressed latent space of a Sparse AE neural network for multi-omics clustering. METHODS: The proposed framework first performs a supervised feature selection based on the survival status of the patients. The selected features from each of the omic data are passed to the AE. The information embedded in the latent space of the trained AE neural networks is then used for cancer subtyping using Spectral clustering. The AE architecture designed in this study exhaustively searches for the best compression of multi-omics data by varying the number of neurons in the hidden layers and penalizing activations within the layers. RESULTS AND CONCLUSION: The proposed framework is applied to five different multi-omics cancer datasets taken from The Cancer Genome Atlas. It is observed that, for getting a robust information bottleneck, a compression to 10-20% of the input features along with an L1 regularization penalty of 0.01 or 0.001 performs well for most of the cancer datasets. Clustering performed on this latent representation generates clusters with better silhouette scores and significantly varying survival patterns. For further biological assessment, differential expression analysis is performed between the identified subtypes of Glioblastoma multiforme (GBM), followed by enrichment analysis of the differentially expressed biomarkers. Several pathways and disease ontology terms coherent with GBM are found to be significantly associated. Varying responses of the identified GBM subtypes towards the drug Temozolomide are also tested to demonstrate their clinical importance. Hence, the study shows that AE-assisted multi-omics integration can be used for the prediction of clinically significant cancer subtypes.
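The reported settings (a bottleneck of 10-20% of the input features and an L1 penalty of 0.01 or 0.001 on activations) can be made concrete with a small sketch of how the bottleneck width and the penalized objective might be computed. This is an illustration under stated assumptions, not the authors' code: the helper names are hypothetical, mean squared error is assumed as the reconstruction loss, and the penalty is taken as the per-sample sum of absolute activations averaged over the batch.

```python
import numpy as np


def bottleneck_size(n_features, ratio):
    """Width of the latent layer for a given compression ratio (e.g. 0.1 to 0.2)."""
    return max(1, int(round(n_features * ratio)))


def sparse_ae_loss(x, x_hat, hidden, l1=0.01):
    """Reconstruction error plus an L1 penalty on hidden activations.

    Assumptions: mean squared error for reconstruction, and an activation
    penalty equal to l1 times the batch mean of the per-sample L1 norm.
    """
    mse = np.mean((x - x_hat) ** 2)
    penalty = l1 * np.mean(np.sum(np.abs(hidden), axis=1))
    return float(mse + penalty)


# Toy example: 1000 selected input features compressed to 10% of their number.
latent_width = bottleneck_size(1000, 0.10)

x = np.ones((2, 4))
perfect_reconstruction = x.copy()
dense_hidden = np.ones((2, 3))    # every hidden unit active
silent_hidden = np.zeros((2, 3))  # no hidden unit active
loss_dense = sparse_ae_loss(x, perfect_reconstruction, dense_hidden, l1=0.01)
loss_silent = sparse_ae_loss(x, perfect_reconstruction, silent_hidden, l1=0.01)
```

With identical reconstructions, the penalty makes the denser activation pattern more expensive, which is how the L1 term pushes the latent code toward sparsity.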
Affiliation(s)
- Madhumita
- Department of Bioscience and Bioengineering, Indian Institute of Technology, Jodhpur, 342037, Rajasthan, India.
| | - Sushmita Paul
- Department of Bioscience and Bioengineering, Indian Institute of Technology, Jodhpur, 342037, Rajasthan, India; School of Artificial Intelligence and Data Science, Indian Institute of Technology, Jodhpur, 342037, Rajasthan, India.
| |
|
20
|
Zhang G, Peng Z, Yan C, Wang J, Luo J, Luo H. MultiGATAE: A Novel Cancer Subtype Identification Method Based on Multi-Omics and Attention Mechanism. Front Genet 2022; 13:855629. [PMID: 35391797 PMCID: PMC8979770 DOI: 10.3389/fgene.2022.855629] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/15/2022] [Accepted: 02/14/2022] [Indexed: 11/13/2022] Open
Abstract
Cancer is one of the leading causes of death worldwide, which creates an urgent need for effective treatment. However, cancer is highly heterogeneous, meaning that one cancer can be divided into several subtypes with distinct pathogenesis and outcomes. This is considered the main problem limiting the precision treatment of cancer. Thus, cancer subtype identification is of great importance for cancer diagnosis and treatment. In this work, we propose a deep learning method based on multi-omics and an attention mechanism to effectively identify cancer subtypes. We first used similarity network fusion to integrate multi-omics data and construct a similarity graph. Then, the similarity graph and the feature matrix of the patients are input into a graph autoencoder composed of a graph attention network and an omics-level attention mechanism to learn an embedding representation. The K-means clustering method is applied to the embedding representation to identify cancer subtypes. The experiment on eight TCGA datasets confirmed that our proposed method performs better for cancer subtype identification than the other state-of-the-art methods. The source codes of our method are available at https://github.com/kataomoi7/multiGATAE.
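The final step named in this abstract, K-means on the learned patient embeddings, can be sketched in a few lines. The version below is a plain Lloyd's-algorithm implementation written for illustration only; it is not taken from the linked repository, the embeddings are toy values standing in for the graph autoencoder output, and the function name, initialization scheme and iteration limit are assumptions.

```python
import numpy as np


def kmeans(x, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm on an embedding matrix (rows = patients).

    Assumption: centers are initialized from k distinct random samples;
    production code would normally use a library implementation instead.
    """
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every patient to the nearest center.
        dist2 = np.sum((x[:, None, :] - centers[None, :, :]) ** 2, axis=2)
        labels = np.argmin(dist2, axis=1)
        # Recompute centers; keep the old center if a cluster becomes empty.
        new_centers = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Toy embeddings: two well-separated groups of three patients each.
embeddings = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
])
labels, centers = kmeans(embeddings, k=2)
```

The resulting labels play the role of candidate subtypes; in practice the number of clusters `k` has to be chosen separately, for example with silhouette scores or survival analysis.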
Affiliation(s)
- Ge Zhang
- School of Computer and Information Engineering, Henan University, Kaifeng, China
| | - Zhen Peng
- School of Computer and Information Engineering, Henan University, Kaifeng, China
| | - Chaokun Yan
- School of Computer and Information Engineering, Henan University, Kaifeng, China
| | - Jianlin Wang
- School of Computer and Information Engineering, Henan University, Kaifeng, China
| | - Junwei Luo
- College of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, China
| | - Huimin Luo
- School of Computer and Information Engineering, Henan University, Kaifeng, China
| |
|