1
Jiang L, Xu C, Bai Y, Liu A, Gong Y, Wang YP, Deng HW. Autosurv: interpretable deep learning framework for cancer survival analysis incorporating clinical and multi-omics data. NPJ Precis Oncol 2024; 8:4. [PMID: 38182734 PMCID: PMC10770412 DOI: 10.1038/s41698-023-00494-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Accepted: 12/05/2023] [Indexed: 01/07/2024] [Open Access]
Abstract
Accurate prognosis for cancer patients can provide critical information for optimizing treatment plans and improving life quality. Combining omics data and demographic/clinical information can offer a more comprehensive view of cancer prognosis than using omics or clinical data alone and can also reveal the underlying disease mechanisms at the molecular level. In this study, we developed and validated a deep learning framework to extract information from high-dimensional gene expression and miRNA expression data and conduct prognosis prediction for breast cancer and ovarian-cancer patients using multiple independent multi-omics datasets. Our model achieved significantly better prognosis prediction than the current machine learning and deep learning approaches in various settings. Moreover, an interpretation method was applied to tackle the "black-box" nature of deep neural networks and we identified features (i.e., genes, miRNA, demographic/clinical variables) that were important to distinguish predicted high- and low-risk patients. The significance of the identified features was partially supported by previous studies.
Affiliation(s)
- Lindong Jiang
  - Tulane Center of Biomedical Informatics and Genomics, School of Medicine, Tulane University, New Orleans, LA, 70112, USA
- Chao Xu
  - Department of Biostatistics and Epidemiology, University of Oklahoma Health Sciences Center, Oklahoma City, OK, 73104, USA
- Yuntong Bai
  - Department of Biomedical Engineering, School of Science and Engineering, Tulane University, New Orleans, LA, 70118, USA
- Anqi Liu
  - Tulane Center of Biomedical Informatics and Genomics, School of Medicine, Tulane University, New Orleans, LA, 70112, USA
- Yun Gong
  - Tulane Center of Biomedical Informatics and Genomics, School of Medicine, Tulane University, New Orleans, LA, 70112, USA
- Yu-Ping Wang
  - Department of Biomedical Engineering, School of Science and Engineering, Tulane University, New Orleans, LA, 70118, USA
- Hong-Wen Deng
  - Tulane Center of Biomedical Informatics and Genomics, School of Medicine, Tulane University, New Orleans, LA, 70112, USA
2
Huang Y, Li J, Li M, Aparasu RR. Application of machine learning in predicting survival outcomes involving real-world data: a scoping review. BMC Med Res Methodol 2023; 23:268. [PMID: 37957593 PMCID: PMC10641971 DOI: 10.1186/s12874-023-02078-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/18/2023] [Accepted: 10/20/2023] [Indexed: 11/15/2023] [Open Access]
Abstract
BACKGROUND: Despite the interest in machine learning (ML) algorithms for analyzing real-world data (RWD) in healthcare, the use of ML in predicting time-to-event data, a common scenario in clinical practice, is less explored. ML models are capable of algorithmically learning from large, complex datasets and can offer advantages in predicting time-to-event data. We reviewed the recent applications of ML for survival analysis using RWD in healthcare. METHODS: PUBMED and EMBASE were searched from database inception through March 2023 to identify peer-reviewed English-language studies of ML models for predicting time-to-event outcomes using the RWD. Two reviewers extracted information on the data source, patient population, survival outcome, ML algorithms, and the Area Under the Curve (AUC). RESULTS: Of 257 citations, 28 publications were included. Random survival forests (N = 16, 57%) and neural networks (N = 11, 39%) were the most popular ML algorithms. There was variability across AUC for these ML models (median 0.789, range 0.6-0.950). ML algorithms were predominately considered for predicting overall survival in oncology (N = 12, 43%). ML survival models were often used to predict disease prognosis or clinical events (N = 27, 96%) in the oncology, while less were used for treatment outcomes (N = 1, 4%). CONCLUSIONS: The ML algorithms, random survival forests and neural networks, are mainly used for RWD to predict survival outcomes such as disease prognosis or clinical events in the oncology. This review shows that more opportunities remain to apply these ML algorithms to inform treatment decision-making in clinical practice. More methodological work is also needed to ensure the utility and applicability of ML models in survival outcomes.
Affiliation(s)
- Yinan Huang
  - Department of Pharmacy Administration, School of Pharmacy, University of Mississippi, University, MS, 38677, USA
- Jieni Li
  - Department of Pharmaceutical Health Outcomes and Policy, College of Pharmacy, University of Houston, Houston, TX, 77204, USA
- Mai Li
  - Department of Industrial Engineering, Cullen College of Engineering, University of Houston, Houston, TX, USA
- Rajender R Aparasu
  - Department of Pharmaceutical Health Outcomes and Policy, College of Pharmacy, University of Houston, Houston, TX, 77204, USA
3
Cheng KP, Shen WX, Jiang YY, Chen Y, Chen YZ, Tan Y. Deep learning of 2D-Restructured gene expression representations for improved low-sample therapeutic response prediction. Comput Biol Med 2023; 164:107245. [PMID: 37480677 DOI: 10.1016/j.compbiomed.2023.107245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2023] [Revised: 06/27/2023] [Accepted: 07/07/2023] [Indexed: 07/24/2023]
Abstract
Clinical outcome prediction is important for stratified therapeutics. Machine learning (ML) and deep learning (DL) methods facilitate therapeutic response prediction from transcriptomic profiles of cells and clinical samples. Clinical transcriptomic DL is challenged by the low-sample sizes (34-286 subjects), high-dimensionality (up to 21,653 genes) and unordered nature of clinical transcriptomic data. The established methods rely on ML algorithms at accuracy levels of 0.6-0.8 AUC/ACC values. Low-sample DL algorithms are needed for enhanced prediction capability. Here, an unsupervised manifold-guided algorithm was employed for restructuring transcriptomic data into ordered image-like 2D-representations, followed by efficient DL of these 2D-representations with deep ConvNets. Our DL models significantly outperformed the state-of-the-art (SOTA) ML models on 82% of 17 low-sample benchmark datasets (53% with >0.05 AUC/ACC improvement). They are more robust than the SOTA models in cross-cohort prediction tasks, and in identifying robust biomarkers and response-dependent variational patterns consistent with experimental indications.
Affiliation(s)
- Kai Ping Cheng
  - The State Key Laboratory of Chemical Oncogenomics, Key Laboratory of Chemical Biology, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, PR China
  - Institute of Biomedical Health Technology and Engineering, Shenzhen Bay Laboratory, Shenzhen, 518132, PR China
- Wan Xiang Shen
  - Bioinformatics and Drug Design Group, Department of Pharmacy, Center for Computational Science and Engineering, National University of Singapore, 117543, Singapore
- Yu Yang Jiang
  - School of Pharmaceutical Sciences, Tsinghua University, Beijing, 100084, PR China
- Yan Chen
  - The State Key Laboratory of Chemical Oncogenomics, Key Laboratory of Chemical Biology, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, PR China
- Yu Zong Chen
  - The State Key Laboratory of Chemical Oncogenomics, Key Laboratory of Chemical Biology, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, PR China
  - Institute of Biomedical Health Technology and Engineering, Shenzhen Bay Laboratory, Shenzhen, 518132, PR China
- Ying Tan
  - The State Key Laboratory of Chemical Oncogenomics, Key Laboratory of Chemical Biology, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, PR China
  - The Institute of Drug Discovery Technology, Ningbo University, Ningbo, 315211, PR China
  - Shenzhen Kivita Innovative Drug Discovery Institute, Shenzhen, 518110, PR China
4
Azher ZL, Suvarna A, Chen JQ, Zhang Z, Christensen BC, Salas LA, Vaickus LJ, Levy JJ. Assessment of emerging pretraining strategies in interpretable multimodal deep learning for cancer prognostication. BioData Min 2023; 16:23. [PMID: 37481666 PMCID: PMC10363299 DOI: 10.1186/s13040-023-00338-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/22/2022] [Accepted: 07/05/2023] [Indexed: 07/24/2023] [Open Access]
Abstract
BACKGROUND: Deep learning models can infer cancer patient prognosis from molecular and anatomic pathology information. Recent studies that leveraged information from complementary multimodal data improved prognostication, further illustrating the potential utility of such methods. However, current approaches: 1) do not comprehensively leverage biological and histomorphological relationships and 2) make use of emerging strategies to "pretrain" models (i.e., train models on a slightly orthogonal dataset/modeling objective) which may aid prognostication by reducing the amount of information required for achieving optimal performance. In addition, model interpretation is crucial for facilitating the clinical adoption of deep learning methods by fostering practitioner understanding and trust in the technology. METHODS: Here, we develop an interpretable multimodal modeling framework that combines DNA methylation, gene expression, and histopathology (i.e., tissue slides) data, and we compare performance of crossmodal pretraining, contrastive learning, and transfer learning versus the standard procedure. RESULTS: Our models outperform the existing state-of-the-art method (average 11.54% C-index increase), and baseline clinically driven models (average 11.7% C-index increase). Model interpretations elucidate consideration of biologically meaningful factors in making prognosis predictions. DISCUSSION: Our results demonstrate that the selection of pretraining strategies is crucial for obtaining highly accurate prognostication models, even more so than devising an innovative model architecture, and further emphasize the all-important role of the tumor microenvironment on disease progression.
Affiliation(s)
- Zarif L Azher
  - Thomas Jefferson High School for Science and Technology, Alexandria, VA, USA
- Anish Suvarna
  - Thomas Jefferson High School for Science and Technology, Alexandria, VA, USA
- Ji-Qing Chen
  - Cancer Biology Graduate Program, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
  - Program in Quantitative Biomedical Sciences, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
  - Department of Epidemiology, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
- Ze Zhang
  - Program in Quantitative Biomedical Sciences, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
  - Department of Epidemiology, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
- Brock C Christensen
  - Department of Epidemiology, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
  - Department of Molecular and Systems Biology, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
  - Department of Community and Family Medicine, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
- Lucas A Salas
  - Department of Epidemiology, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
  - Department of Molecular and Systems Biology, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
  - Integrative Neuroscience at Dartmouth (IND) Graduate Program, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
- Louis J Vaickus
  - Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Health, Lebanon, NH, USA
- Joshua J Levy
  - Program in Quantitative Biomedical Sciences, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
  - Department of Epidemiology, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
  - Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Health, Lebanon, NH, USA
  - Department of Dermatology, Dartmouth Health, Lebanon, NH, USA
5
Sun B, Chen L. Interpretable deep learning for improving cancer patient survival based on personal transcriptomes. Sci Rep 2023; 13:11344. [PMID: 37443344 PMCID: PMC10344908 DOI: 10.1038/s41598-023-38429-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/05/2022] [Accepted: 07/07/2023] [Indexed: 07/15/2023] [Open Access]
Abstract
Precision medicine chooses the optimal drug for a patient by considering individual differences. With the tremendous amount of data accumulated for cancers, we develop an interpretable neural network to predict cancer patient survival based on drug prescriptions and personal transcriptomes (CancerIDP). The deep learning model achieves 96% classification accuracy in distinguishing short-lived from long-lived patients. The Pearson correlation between predicted and actual months-to-death values is as high as 0.937. About 27.4% of patients may survive longer with an alternative medicine chosen by our deep learning model. The median survival time of all patients can increase by 3.9 months. Our interpretable neural network model reveals the most discriminating pathways in the decision-making process, which will further facilitate mechanistic studies of drug development for cancers.
Affiliation(s)
- Bo Sun
  - Department of Quantitative and Computational Biology, University of Southern California, 1050 Childs Way, Los Angeles, CA, 90089, USA
- Liang Chen
  - Department of Quantitative and Computational Biology, University of Southern California, 1050 Childs Way, Los Angeles, CA, 90089, USA
6
Sheehy J, Rutledge H, Acharya UR, Loh HW, Gururajan R, Tao X, Zhou X, Li Y, Gurney T, Kondalsamy-Chennakesavan S. Gynecological cancer prognosis using machine learning techniques: A systematic review of last three decades (1990–2022). Artif Intell Med 2023; 139:102536. [PMID: 37100507 DOI: 10.1016/j.artmed.2023.102536] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 10/27/2021] [Revised: 03/19/2023] [Accepted: 03/23/2023] [Indexed: 03/30/2023]
Abstract
OBJECTIVE: Many Computer Aided Prognostic (CAP) systems based on machine learning techniques have been proposed in the field of oncology. The objective of this systematic review was to assess and critically appraise the methodologies and approaches used in predicting the prognosis of gynecological cancers using CAPs. METHODS: Electronic databases were used to systematically search for studies utilizing machine learning methods in gynecological cancers. Study risk of bias (ROB) and applicability were assessed using the PROBAST tool. 139 studies met the inclusion criteria, of which 71 predicted outcomes for ovarian cancer patients, 41 predicted outcomes for cervical cancer patients, 28 predicted outcomes for uterine cancer patients, and 2 predicted outcomes for gynecological malignancies broadly. RESULTS: Random forest (22.30%) and support vector machine (21.58%) classifiers were used most commonly. Use of clinicopathological, genomic and radiomic data as predictors was observed in 48.20%, 51.08% and 17.27% of studies, respectively, with some studies using multiple modalities. 21.58% of studies were externally validated. Twenty-three individual studies compared ML and non-ML methods. Study quality was highly variable and methodologies, statistical reporting and outcome measures were inconsistent, preventing generalized commentary or meta-analysis of performance outcomes. CONCLUSION: There is significant variability in model development when prognosticating gynecological malignancies with respect to variable selection, machine learning (ML) methods and endpoint selection. This heterogeneity prevents meta-analysis and conclusions regarding the superiority of ML methods. Furthermore, PROBAST-mediated ROB and applicability analysis demonstrates concern for the translatability of existing models. This review identifies ways that this can be improved upon in future works to develop robust, clinically translatable models within this promising field.
7
Ghosh Roy G, Geard N, Verspoor K, He S. MPVNN: Mutated Pathway Visible Neural Network architecture for interpretable prediction of cancer-specific survival risk. Bioinformatics 2022; 38:5026-5032. [PMID: 36124954 DOI: 10.1093/bioinformatics/btac636] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2022] [Revised: 08/04/2022] [Accepted: 09/16/2022] [Indexed: 12/24/2022] [Open Access]
Abstract
MOTIVATION: Survival risk prediction using gene expression data is important in making treatment decisions in cancer. Standard neural network (NN) survival analysis models are black boxes with a lack of interpretability. More interpretable visible neural network architectures are designed using biological pathway knowledge. But they do not model how pathway structures can change for particular cancer types. RESULTS: We propose a novel Mutated Pathway Visible Neural Network (MPVNN) architecture, designed using prior signaling pathway knowledge and random replacement of known pathway edges using gene mutation data simulating signal flow disruption. As a case study, we use the PI3K-Akt pathway and demonstrate overall improved cancer-specific survival risk prediction of MPVNN over other similar-sized NN and standard survival analysis methods. We show that trained MPVNN architecture interpretation, which points to smaller sets of genes connected by signal flow within the PI3K-Akt pathway that is important in risk prediction for particular cancer types, is reliable. AVAILABILITY AND IMPLEMENTATION: The data and code are available at https://github.com/gourabghoshroy/MPVNN. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Gourab Ghosh Roy
  - School of Computer Science, University of Birmingham, Birmingham B15 2TT, UK
  - School of Computing and Information Systems, University of Melbourne, Melbourne 3052, Australia
- Nicholas Geard
  - School of Computing and Information Systems, University of Melbourne, Melbourne 3052, Australia
- Karin Verspoor
  - School of Computing and Information Systems, University of Melbourne, Melbourne 3052, Australia
  - School of Computing Technologies, RMIT University, Melbourne 3000, Australia
- Shan He
  - School of Computer Science, University of Birmingham, Birmingham B15 2TT, UK
8
A Scoping Review of Deep Learning in Cancer Nursing Combined with Augmented Reality: the Era of Intelligent Nursing is Coming. Asia Pac J Oncol Nurs 2022; 9:100135. [PMID: 36276884 PMCID: PMC9579790 DOI: 10.1016/j.apjon.2022.100135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2022] [Accepted: 08/22/2022] [Indexed: 11/30/2022] [Open Access]
Abstract
Artificial intelligence has been developing greatly in the field of medicine. As a new research hotspot of artificial intelligence, deep learning (DL) has been widely applied in the fields of cancer risk assessment, symptom recognition, and cancer detection. Therefore, nursing care issues in terms of consuming time and energy, lower accuracy, and lower efficiency can be solved with applying DL in caring cancer patients. In addition, augmented reality (AR) has great navigation potential through combining computer-generated virtual elements with the real world. Thus, DL + AR may facilitate patients with cancer to possess a brand-new model of nursing care that is more intelligent, mobile, and adapted to the information age, compared to traditional nursing. With the advent of the era of intelligent nursing, future nursing models can not only learn from the DL + AR model to meet the needs of patients with cancer but also reduce nursing workload, save healthcare resources, and improve work efficiency, the quality of nursing care, as well as the quality of life for cancer patients.
9
Risk Stratification for Breast Cancer Patient by Simultaneous Learning of Molecular Subtype and Survival Outcome Using Genetic Algorithm-Based Gene Set Selection. Cancers (Basel) 2022; 14:4120. [PMID: 36077657 PMCID: PMC9454699 DOI: 10.3390/cancers14174120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2022] [Revised: 08/18/2022] [Accepted: 08/20/2022] [Indexed: 11/26/2022] [Open Access]
Abstract
Simple Summary: Patient stratification is clinically important because it allows us to understand the characteristics and establish treatment strategies for a group. Transcriptomic data play an important role in determining molecular subtypes and predicting survival. In the case of breast cancer, although the order of prognosis according to molecular subtypes is well known, there is heterogeneity even within a subtype. Therefore, patient stratification considering both molecular subtypes and survival outcomes is required. In this study, a methodology to handle this problem is presented. A genetic algorithm is used to select a set of genes, and a risk score is assigned to each patient using their expression level. According to the risk score, patients are ordered and stratified considering molecular subtypes and survival outcomes. Consequently, informative genes for patient stratification with respect to both aspects could be nominated, and the usefulness of the risk score was shown through comparison with other indicators.
Abstract: Patient stratification is a clinically important task because it allows us to establish and develop efficient treatment strategies for particular groups of patients. Molecular subtypes have been successfully defined using transcriptomic profiles, and they are used effectively in clinical practice, e.g., PAM50 subtypes of breast cancer. Survival prediction contributed to understanding diseases and also identifying genes related to prognosis. It is desirable to stratify patients considering these two aspects simultaneously. However, there are no methods for patient stratification that consider molecular subtypes and survival outcomes at once. Here, we propose a methodology to deal with the problem. A genetic algorithm is used to select a gene set from transcriptome data, and their expression quantities are utilized to assign a risk score to each patient. The patients are ordered and stratified according to the score. A gene set was selected by our method on a breast cancer cohort (TCGA-BRCA), and we examined its clinical utility using an independent cohort (SCAN-B). In this experiment, our method was successful in stratifying patients with respect to both molecular subtype and survival outcome. We demonstrated that the orders of patients were consistent across repeated experiments, and prognostic genes were successfully nominated. Additionally, it was observed that the risk score can be used to evaluate the molecular aggressiveness of individual patients.
10
Liu H, Kurc T. Deep learning for survival analysis in breast cancer with whole slide image data. Bioinformatics 2022; 38:3629-3637. [PMID: 35674341 PMCID: PMC9272797 DOI: 10.1093/bioinformatics/btac381] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/05/2021] [Revised: 04/22/2022] [Accepted: 06/04/2022] [Indexed: 02/07/2023] [Open Access]
Abstract
MOTIVATION: Whole slide tissue images contain detailed data on the sub-cellular structure of cancer. Quantitative analyses of this data can lead to novel biomarkers for better cancer diagnosis and prognosis and can improve our understanding of cancer mechanisms. Such analyses are challenging to execute because of the sizes and complexity of whole slide image data and relatively limited volume of training data for machine learning methods. RESULTS: We propose and experimentally evaluate a multi-resolution deep learning method for breast cancer survival analysis. The proposed method integrates image data at multiple resolutions and tumor, lymphocyte and nuclear segmentation results from deep learning models. Our results show that this approach can significantly improve the deep learning model performance compared to using only the original image data. The proposed approach achieves a c-index value of 0.706 compared to a c-index value of 0.551 from an approach that uses only color image data at the highest image resolution. Furthermore, when clinical features (sex, age and cancer stage) are combined with image data, the proposed approach achieves a c-index of 0.773. AVAILABILITY AND IMPLEMENTATION: https://github.com/SBU-BMI/deep_survival_analysis.
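Editorial note: the c-index figures quoted in this abstract are concordance indices, where 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk. As a rough illustration only, the sketch below computes Harrell's C-index in plain Python. It is not the authors' code; the function name `concordance_index`, the tie handling (pairs with equal times are skipped, tied risk scores count as half), and the toy data are all assumptions made for this example.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times  : observed follow-up time per patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    risks  : predicted risk score per patient (higher = worse prognosis)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplifying assumption: skip pairs with tied times. A pair is also
        # not comparable when the shorter time is censored.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # earlier event got the higher risk score
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up toy data: four patients, the third one censored.
toy_times = [5, 3, 9, 7]
toy_events = [1, 1, 0, 1]
toy_risks = [0.6, 0.9, 0.1, 0.3]
print(concordance_index(toy_times, toy_events, toy_risks))  # → 1.0
```

The pairwise loop is quadratic in the number of patients, which is fine for illustration; established survival-analysis libraries provide faster and more carefully tie-aware implementations.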
Affiliation(s)
- Huidong Liu
  - Department of Computer Science, Stony Brook University, Stony Brook, NY 11794, USA
- Tahsin Kurc
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY 11794, USA
11
Affiliation(s)
- Qixian Zhong
  - Department of Statistics and Data Science, School of Economics, Xiamen University
- Jane-Ling Wang
  - Department of Statistics, University of California, Davis
12
Kang M, Ko E, Mersha TB. A roadmap for multi-omics data integration using deep learning. Brief Bioinform 2022; 23:bbab454. [PMID: 34791014 PMCID: PMC8769688 DOI: 10.1093/bib/bbab454] [Citation(s) in RCA: 79] [Impact Index Per Article: 39.5] [Received: 07/27/2021] [Revised: 09/30/2021] [Accepted: 10/05/2021] [Indexed: 12/18/2022] [Open Access]
Abstract
High-throughput next-generation sequencing now makes it possible to generate a vast amount of multi-omics data for various applications. These data have revolutionized biomedical research by providing a more comprehensive understanding of the biological systems and molecular mechanisms of disease development. Recently, deep learning (DL) algorithms have become one of the most promising methods in multi-omics data analysis, due to their predictive performance and capability of capturing nonlinear and hierarchical features. While integrating and translating multi-omics data into useful functional insights remain the biggest bottleneck, there is a clear trend towards incorporating multi-omics analysis in biomedical research to help explain the complex relationships between molecular layers. Multi-omics data have a role to improve prevention, early detection and prediction; monitor progression; interpret patterns and endotyping; and design personalized treatments. In this review, we outline a roadmap of multi-omics integration using DL and offer a practical perspective into the advantages, challenges and barriers to the implementation of DL in multi-omics data.
Affiliation(s)
- Mingon Kang
  - Department of Computer Science at the University of Nevada, Las Vegas, NV, USA
- Euiseong Ko
  - Department of Computer Science at the University of Nevada, Las Vegas, NV, USA
- Tesfaye B Mersha
  - Department of Pediatrics, Cincinnati Children’s Hospital Medical Center, University of Cincinnati, Cincinnati, OH, USA
13
Lai X, Zhou J, Wessely A, Heppt M, Maier A, Berking C, Vera J, Zhang L. A disease network-based deep learning approach for characterizing melanoma. Int J Cancer 2021; 150:1029-1044. [PMID: 34716589 DOI: 10.1002/ijc.33860] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/26/2021] [Revised: 10/08/2021] [Accepted: 10/19/2021] [Indexed: 12/12/2022]
Abstract
Multiple types of genomic variations are present in cutaneous melanoma and some of the genomic features may have an impact on the prognosis of the disease. The access to genomics data via public repositories such as The Cancer Genome Atlas (TCGA) allows for a better understanding of melanoma at the molecular level, therefore making characterization of substantial heterogeneity in melanoma patients possible. Here, we proposed an approach that integrates genomics data, a disease network, and a deep learning model to classify melanoma patients for prognosis, assess the impact of genomic features on the classification and provide interpretation to the impactful features. We integrated genomics data into a melanoma network and applied an autoencoder model to identify subgroups in TCGA melanoma patients. The model utilizes communities identified in the network to effectively reduce the dimensionality of genomics data into a patient score profile. Based on the score profile, we identified three patient subtypes that show different survival times. Furthermore, we quantified and ranked the impact of genomic features on the patient score profile using a machine-learning technique. Follow-up analysis of the top-ranking features provided us with the biological interpretation of them at both pathway and molecular levels, such as their mutation and interactome profiles in melanoma and their involvement in pathways associated with signaling transduction, immune system and cell cycle. Taken together, we demonstrated the ability of the approach to identify disease subgroups using a deep learning model that captures the most relevant information of genomics data in the melanoma network.
Affiliation(s)
- Xin Lai
  - Department of Dermatology, Universitätsklinikum Erlangen and Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
  - Deutsches Zentrum Immuntherapie, Erlangen, Germany
  - Comprehensive Cancer Center Erlangen, Erlangen, Germany
- Jinfei Zhou
  - College of Computer Science, Sichuan University, Chengdu, China
- Anja Wessely
  - Department of Dermatology, Universitätsklinikum Erlangen and Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
  - Deutsches Zentrum Immuntherapie, Erlangen, Germany
  - Comprehensive Cancer Center Erlangen, Erlangen, Germany
- Markus Heppt
  - Department of Dermatology, Universitätsklinikum Erlangen and Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
  - Deutsches Zentrum Immuntherapie, Erlangen, Germany
  - Comprehensive Cancer Center Erlangen, Erlangen, Germany
- Andreas Maier
  - Pattern Recognition Lab, Department of Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Carola Berking
  - Department of Dermatology, Universitätsklinikum Erlangen and Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
  - Deutsches Zentrum Immuntherapie, Erlangen, Germany
  - Comprehensive Cancer Center Erlangen, Erlangen, Germany
- Julio Vera
  - Department of Dermatology, Universitätsklinikum Erlangen and Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
  - Deutsches Zentrum Immuntherapie, Erlangen, Germany
  - Comprehensive Cancer Center Erlangen, Erlangen, Germany
- Le Zhang
  - College of Computer Science, Sichuan University, Chengdu, China
| |
|
14
|
Tran KA, Kondrashova O, Bradley A, Williams ED, Pearson JV, Waddell N. Deep learning in cancer diagnosis, prognosis and treatment selection. Genome Med 2021; 13:152. [PMID: 34579788 PMCID: PMC8477474 DOI: 10.1186/s13073-021-00968-x] [Citation(s) in RCA: 217] [Impact Index Per Article: 72.3] [Received: 12/13/2020] [Accepted: 09/12/2021] [Indexed: 12/13/2022] Open
Abstract
Deep learning is a subdiscipline of artificial intelligence that uses a machine learning technique called artificial neural networks to extract patterns and make predictions from large data sets. The increasing adoption of deep learning across healthcare domains together with the availability of highly characterised cancer datasets has accelerated research into the utility of deep learning in the analysis of the complex biology of cancer. While early results are promising, this is a rapidly evolving field with new knowledge emerging in both cancer biology and deep learning. In this review, we provide an overview of emerging deep learning techniques and how they are being applied to oncology. We focus on the deep learning applications for omics data types, including genomic, methylation and transcriptomic data, as well as histopathology-based genomic inference, and provide perspectives on how the different data types can be integrated to develop decision support tools. We provide specific examples of how deep learning may be applied in cancer diagnosis, prognosis and treatment management. We also assess the current limitations and challenges for the application of deep learning in precision oncology, including the lack of phenotypically rich data and the need for more explainable deep learning models. Finally, we conclude with a discussion of how current obstacles can be overcome to enable future clinical utilisation of deep learning.
Affiliation(s)
- Khoa A. Tran
- Department of Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, 4006 Australia
- School of Biomedical Sciences, Faculty of Health, Queensland University of Technology (QUT), Brisbane, 4059 Australia
| | - Olga Kondrashova
- Department of Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, 4006 Australia
| | - Andrew Bradley
- Faculty of Engineering, Queensland University of Technology (QUT), Brisbane, 4000 Australia
| | - Elizabeth D. Williams
- School of Biomedical Sciences, Faculty of Health, Queensland University of Technology (QUT), Brisbane, 4059 Australia
- Australian Prostate Cancer Research Centre - Queensland (APCRC-Q) and Queensland Bladder Cancer Initiative (QBCI), Brisbane, 4102 Australia
| | - John V. Pearson
- Department of Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, 4006 Australia
| | - Nicola Waddell
- Department of Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, 4006 Australia
| |
|
15
|
Jamshidi A, Pelletier JP, Labbe A, Abram F, Martel-Pelletier J, Droit A. Machine Learning-Based Individualized Survival Prediction Model for Total Knee Replacement in Osteoarthritis: Data From the Osteoarthritis Initiative. Arthritis Care Res (Hoboken) 2021; 73:1518-1527. [PMID: 33749148 DOI: 10.1002/acr.24601] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 08/20/2020] [Accepted: 03/18/2021] [Indexed: 12/28/2022]
Abstract
OBJECTIVE By using machine learning, our study aimed to build a model to predict risk and time to total knee replacement (TKR) of an osteoarthritic knee. METHODS Features were from the Osteoarthritis Initiative (OAI) cohort at baseline. Using the lasso method for variable selection in the Cox regression model, we identified the 10 most important characteristics among 1,107 features. The prognostic power of the selected features was assessed by the Kaplan-Meier method and applied to 7 machine learning methods: Cox, DeepSurv, random forests algorithm, linear/kernel support vector machine (SVM), and linear/neural multi-task logistic regression models. As some of the 10 first-found features included similar radiographic measurements, we further looked at using the least number of features without compromising the accuracy of the model. Prediction performance was assessed by the concordance index, Brier score, and time-dependent area under the curve (AUC). RESULTS Ten features were identified and included radiographs, bone marrow lesions of the medial condyle on magnetic resonance imaging, hyaluronic acid injection, performance measure, medical history, and knee-related symptoms. The methodologies Cox, DeepSurv, and linear SVM demonstrated the highest accuracy (concordance index scores of 0.85, Brier score of 0.02, and an AUC of 0.87). DeepSurv was chosen to build the prediction model to estimate the time to TKR for a given knee. Moreover, we were able to decrease the features to only 3 and maintain the high accuracy (concordance index of 0.85, Brier score of 0.02, and AUC of 0.86), which included bone marrow lesions, Kellgren/Lawrence grade, and knee-related symptoms, to predict risk and time of a TKR event. CONCLUSION For the first time, we developed a model using the OAI cohort to predict with high accuracy if a given osteoarthritic knee would require TKR, when a TKR would be required, and who would likely progress fast toward this event.
Affiliation(s)
- Afshin Jamshidi
- University of Montreal Hospital Research Centre, Montreal, Quebec, Canada, and Laval University Hospital Research Centre, Montreal, Quebec, Canada
| | - Arnaud Droit
- Laval University Hospital Research Centre, Quebec, Canada
| |
|
16
|
Oh JH, Choi W, Ko E, Kang M, Tannenbaum A, Deasy JO. PathCNN: interpretable convolutional neural networks for survival prediction and pathway analysis applied to glioblastoma. Bioinformatics 2021; 37:i443-i450. [PMID: 34252964 PMCID: PMC8336441 DOI: 10.1093/bioinformatics/btab285] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 01/20/2023] Open
Abstract
MOTIVATION Convolutional neural networks (CNNs) have achieved great success in the areas of image processing and computer vision, handling grid-structured inputs and efficiently capturing local dependencies through multiple levels of abstraction. However, a lack of interpretability remains a key barrier to the adoption of deep neural networks, particularly in predictive modeling of disease outcomes. Moreover, because biological array data are generally represented in a non-grid structured format, CNNs cannot be applied directly. RESULTS To address these issues, we propose a novel method, called PathCNN, that constructs an interpretable CNN model on integrated multi-omics data using a newly defined pathway image. PathCNN showed promising predictive performance in differentiating between long-term survival (LTS) and non-LTS when applied to glioblastoma multiforme (GBM). The adoption of a visualization tool coupled with statistical analysis enabled the identification of plausible pathways associated with survival in GBM. In summary, PathCNN demonstrates that CNNs can be effectively applied to multi-omics data in an interpretable manner, resulting in promising predictive power while identifying key biological correlates of disease. AVAILABILITY AND IMPLEMENTATION The source code is freely available at: https://github.com/mskspi/PathCNN.
Affiliation(s)
- Jung Hun Oh
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Wookjin Choi
- Department of Computer Science, Virginia State University, Petersburg, VA 23806, USA
| | - Euiseong Ko
- Department of Computer Science, University of Nevada, Las Vegas, NV 89154, USA
| | - Mingon Kang
- Department of Computer Science, University of Nevada, Las Vegas, NV 89154, USA
| | - Allen Tannenbaum
- Departments of Computer Science and Applied Mathematics & Statistics, Stony Brook University, New York, NY 11794, USA
| | - Joseph O Deasy
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| |
|
17
|
Picard M, Scott-Boyer MP, Bodein A, Périn O, Droit A. Integration strategies of multi-omics data for machine learning analysis. Comput Struct Biotechnol J 2021; 19:3735-3746. [PMID: 34285775 PMCID: PMC8258788 DOI: 10.1016/j.csbj.2021.06.030] [Citation(s) in RCA: 148] [Impact Index Per Article: 49.3] [Received: 03/30/2021] [Revised: 06/17/2021] [Accepted: 06/21/2021] [Indexed: 12/25/2022] Open
Abstract
Increased availability of high-throughput technologies has generated an ever-growing number of omics data that seek to portray many different but complementary biological layers including genomics, epigenomics, transcriptomics, proteomics, and metabolomics. New insights from these data have been obtained by machine learning algorithms that have produced diagnostic and classification biomarkers. Most biomarkers obtained to date, however, only include one omic measurement at a time and thus do not take full advantage of recent multi-omics experiments that now capture the entire complexity of biological systems. Multi-omics data integration strategies are needed to combine the complementary knowledge brought by each omics layer. We have summarized the most recent data integration methods/frameworks into five different integration strategies: early, mixed, intermediate, late and hierarchical. In this mini-review, we focus on challenges and existing multi-omics integration strategies by paying special attention to machine learning applications.
Affiliation(s)
- Milan Picard
- Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
| | - Marie-Pier Scott-Boyer
- Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
| | - Antoine Bodein
- Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
| | - Olivier Périn
- Digital Sciences Department, L'Oréal Advanced Research, Aulnay-sous-bois, France
| | - Arnaud Droit
- Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
- Corresponding author.
| |
|
18
|
A Review of Explainable Deep Learning Cancer Detection Models in Medical Imaging. APPLIED SCIENCES-BASEL 2021. [DOI: 10.3390/app11104573] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Indexed: 02/07/2023]
Abstract
Deep learning has demonstrated remarkable accuracy analyzing images for cancer detection tasks in recent years. The accuracy that has been achieved rivals radiologists and is suitable for implementation as a clinical tool. However, a significant problem is that these models are black-box algorithms therefore they are intrinsically unexplainable. This creates a barrier for clinical implementation due to lack of trust and transparency that is a characteristic of black box algorithms. Additionally, recent regulations prevent the implementation of unexplainable models in clinical settings which further demonstrates a need for explainability. To mitigate these concerns, there have been recent studies that attempt to overcome these issues by modifying deep learning architectures or providing after-the-fact explanations. A review of the deep learning explanation literature focused on cancer detection using MR images is presented here. The gap between what clinicians deem explainable and what current methods provide is discussed and future suggestions to close this gap are provided.
|
19
|
Shi F, Wang J, Shi J, Wu Z, Wang Q, Tang Z, He K, Shi Y, Shen D. Review of Artificial Intelligence Techniques in Imaging Data Acquisition, Segmentation, and Diagnosis for COVID-19. IEEE Rev Biomed Eng 2021; 14:4-15. [PMID: 32305937 DOI: 10.1109/rbme.2020.2987975] [Citation(s) in RCA: 491] [Impact Index Per Article: 163.7] [Indexed: 02/06/2023]
Abstract
The pandemic of coronavirus disease 2019 (COVID-19) is spreading all over the world. Medical imaging such as X-ray and computed tomography (CT) plays an essential role in the global fight against COVID-19, whereas the recently emerging artificial intelligence (AI) technologies further strengthen the power of the imaging tools and help medical specialists. We hereby review the rapid responses in the community of medical imaging (empowered by AI) toward COVID-19. For example, AI-empowered image acquisition can significantly help automate the scanning procedure and also reshape the workflow with minimal contact to patients, providing the best protection to the imaging technicians. Also, AI can improve work efficiency by accurate delineation of infections in X-ray and CT images, facilitating subsequent quantification. Moreover, the computer-aided platforms help radiologists make clinical decisions, i.e., for disease diagnosis, tracking, and prognosis. In this review paper, we thus cover the entire pipeline of medical imaging and analysis techniques involved with COVID-19, including image acquisition, segmentation, diagnosis, and follow-up. We particularly focus on the integration of AI with X-ray and CT, both of which are widely used in the frontline hospitals, in order to depict the latest progress of medical imaging and radiology fighting against COVID-19.
|
20
|
Ramirez R, Chiu YC, Zhang S, Ramirez J, Chen Y, Huang Y, Jin YF. Prediction and interpretation of cancer survival using graph convolution neural networks. Methods 2021; 192:120-130. [PMID: 33484826 DOI: 10.1016/j.ymeth.2021.01.004] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 07/28/2020] [Revised: 01/07/2021] [Accepted: 01/12/2021] [Indexed: 12/13/2022] Open
Abstract
The survival rate of cancer has increased significantly during the past two decades for breast, prostate, testicular, and colon cancer, while the brain and pancreatic cancers have a much lower median survival rate that has not improved much over the last forty years. This has imposed the challenge of finding gene markers for early cancer detection and treatment strategies. Different methods including regression-based Cox-PH, artificial neural networks, and recently deep learning algorithms have been proposed to predict the survival rate for cancers. We established in this work a novel graph convolution neural network (GCNN) approach called Surv_GCNN to predict the survival rate for 13 different cancer types using the TCGA dataset. For each cancer type, 6 Surv_GCNN models with graphs generated by correlation analysis, GeneMania database, and correlation + GeneMania were trained with and without clinical data to predict the risk score (RS). The performance of the 6 Surv_GCNN models was compared with two other existing models, Cox-PH and Cox-nnet. The results showed that Cox-PH has the worst performance among 8 tested models across the 13 cancer types while Surv_GCNN models with clinical data reported the best overall performance, outperforming other competing models in 7 out of 13 cancer types including BLCA, BRCA, COAD, LUSC, SARC, STAD, and UCEC. A novel network-based interpretation of Surv_GCNN was also proposed to identify potential gene markers for breast cancer. The signatures learned by the nodes in the hidden layer of Surv_GCNN were identified and were linked to potential gene markers by network modularization. The identified gene markers for breast cancer have been compared to a total of 213 gene markers from three widely cited lists for breast cancer survival analysis. About 57% of gene markers obtained by Surv_GCNN with correlation + GeneMania graph either overlap or directly interact with the 213 genes, confirming the effectiveness of the identified markers by Surv_GCNN.
Affiliation(s)
- Ricardo Ramirez
- Department of Electrical and Computer Engineering, The University of Texas at San Antonio, San Antonio, TX 78249, USA
| | - Yu-Chiao Chiu
- Greehey Children's Cancer Research Institute, The University of Texas Health San Antonio, San Antonio, Texas 78229, USA
| | - SongYao Zhang
- Key Laboratory of Information Fusion Technology of Ministry of Education, Department of Intelligent Science And Technology, School of Automation, Northwestern Polytechnical University, Xi'an, China
| | - Joshua Ramirez
- Department of Electrical and Computer Engineering, The University of Texas at San Antonio, San Antonio, TX 78249, USA
| | - Yidong Chen
- Greehey Children's Cancer Research Institute, The University of Texas Health San Antonio, San Antonio, Texas 78229, USA; Department of Population Health Sciences, The University of Texas Health San Antonio, San Antonio, TX 78229, USA
| | - Yufei Huang
- Department of Electrical and Computer Engineering, The University of Texas at San Antonio, San Antonio, TX 78249, USA; Department of Population Health Sciences, The University of Texas Health San Antonio, San Antonio, TX 78229, USA
| | - Yu-Fang Jin
- Department of Electrical and Computer Engineering, The University of Texas at San Antonio, San Antonio, TX 78249, USA.
| |
|
21
|
Bhargava A, Bansal A. Novel coronavirus (COVID-19) diagnosis using computer vision and artificial intelligence techniques: a review. MULTIMEDIA TOOLS AND APPLICATIONS 2021; 80:19931-19946. [PMID: 33686333 PMCID: PMC7928188 DOI: 10.1007/s11042-021-10714-5] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 06/21/2020] [Revised: 10/23/2020] [Accepted: 02/10/2021] [Indexed: 05/07/2023]
Abstract
The universal transmission of pandemic COVID-19 (Coronavirus) causes an immediate need to commit in the fight across the whole human population. The emergencies for human health care are limited for this abrupt outbreak and abandoned environment. In this situation, inventive automation like computer vision (machine learning, deep learning, artificial intelligence) and medical imaging (computed tomography, X-Ray) has developed an encouraging solution against COVID-19. In recent months, different techniques using image processing have been developed by various researchers. In this paper, a major review on image acquisition, segmentation, diagnosis, avoidance, and management is presented. An analytical comparison of the various algorithms proposed by researchers for coronavirus has been carried out. Also, challenges and motivation for research in the future to deal with coronavirus are indicated. The clinical impact and use of computer vision and deep learning were discussed and we hope that dermatologists may have better understanding of these areas from the study.
|
22
|
Li X, Li Y, Yu X, Jin F. Identification and validation of stemness-related lncRNA prognostic signature for breast cancer. J Transl Med 2020; 18:331. [PMID: 32867770 PMCID: PMC7461324 DOI: 10.1186/s12967-020-02497-4] [Citation(s) in RCA: 64] [Impact Index Per Article: 16.0] [Received: 03/09/2020] [Accepted: 08/21/2020] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND Long noncoding RNAs (lncRNAs) are emerging as crucial contributors to the development of breast cancer and are involved in the stemness regulation of breast cancer stem cells (BCSCs). LncRNAs are closely associated with the prognosis of breast cancer patients. It is critical to identify BCSC-related lncRNAs with prognostic value in breast cancer. METHODS A co-expression network of BCSC-related mRNAs-lncRNAs from The Cancer Genome Atlas (TCGA) was constructed. Univariate and multivariate Cox proportional hazards analyses were used to identify a stemness risk model with prognostic value. Kaplan-Meier analysis, univariate and multivariate Cox regression analyses and receiver operating characteristic (ROC) curve analysis were performed to validate the risk model. Principal component analysis (PCA) and Gene Set Enrichment Analysis (GSEA) functional annotation were conducted to analyze the risk model. RESULTS In this study, BCSC-related lncRNAs in breast cancer were identified. We evaluated the prognostic value of these BCSC-related lncRNAs and eventually obtained a prognostic risk model consisting of 12 BCSC-related lncRNAs (Z68871.1, LINC00578, AC097639.1, AP003119.3, AP001207.3, LINC00668, AL122010.1, AC245297.3, LINC01871, AP000851.2, AC022509.2 and SEMA3B-AS1). The risk model was further verified as a novel independent prognostic factor for breast cancer patients based on the calculated risk score. Moreover, based on the risk model, the low-risk and high-risk groups displayed different stemness statuses. CONCLUSIONS These findings suggested that the 12 BCSC-related lncRNA signature might be a promising prognostic factor for breast cancer and can promote the management of BCSC-related therapy in clinical practice.
Affiliation(s)
- Xiaoying Li
- Department of Breast Surgery, The First Affiliated Hospital of China Medical University, 155 Nanjing Road, Shenyang, 110001, China.,Department of Cell Biology, Key Laboratory of Cell Biology, Ministry of Public Health, and Key Laboratory of Medical Cell Biology, Ministry of Education, China Medical University, 77 Puhe Road, Shenyang, 110122, China
| | - Yang Li
- Department of Cell Biology, Key Laboratory of Cell Biology, Ministry of Public Health, and Key Laboratory of Medical Cell Biology, Ministry of Education, China Medical University, 77 Puhe Road, Shenyang, 110122, China
| | - Xinmiao Yu
- Department of Breast Surgery, The First Affiliated Hospital of China Medical University, 155 Nanjing Road, Shenyang, 110001, China.
| | - Feng Jin
- Department of Breast Surgery, The First Affiliated Hospital of China Medical University, 155 Nanjing Road, Shenyang, 110001, China.
| |
|