1
Yang H, Shi Y, Lin A, Qi C, Liu Z, Cheng Q, Miao K, Zhang J, Luo P. PESSA: A web tool for pathway enrichment score-based survival analysis in cancer. PLoS Comput Biol 2024; 20:e1012024. [PMID: 38717988 PMCID: PMC11078417 DOI: 10.1371/journal.pcbi.1012024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2023] [Accepted: 03/26/2024] [Indexed: 05/12/2024]
Abstract
The activation levels of biologically significant gene sets are emerging tumor molecular markers and play an irreplaceable role in tumor research; however, web-based tools for prognostic analyses that use them as tumor molecular markers remain scarce. We developed PESSA, a web-based tool for survival analysis using gene set activation levels. All data analyses were implemented in R. Activation levels of The Molecular Signatures Database (MSigDB) gene sets were assessed using the single sample gene set enrichment analysis (ssGSEA) method based on data from the Gene Expression Omnibus (GEO), The Cancer Genome Atlas (TCGA), The European Genome-phenome Archive (EGA) and supplementary tables of articles. PESSA was used to perform median and optimal cut-off dichotomous grouping of ssGSEA scores for each dataset, relying on the survival and survminer packages for survival analysis and visualization. PESSA is an open-access web tool for visualizing the results of tumor prognostic analyses using gene set activation levels. A total of 238 datasets from the GEO, TCGA, EGA, and supplementary tables of articles, covering 51 cancer types and 13 survival outcome types, and 13,434 tumor-related gene sets obtained from MSigDB were used for pre-grouping. Users can obtain the results, including Kaplan-Meier analyses based on the median and optimal cut-off values with accompanying visualization plots, and Cox regression analyses of dichotomous and continuous variables, by selecting the gene set markers of interest. PESSA (https://smuonco.shinyapps.io/PESSA/ or http://robinl-lab.com/PESSA) is a large-scale web-based tumor survival analysis tool covering a large amount of data that creatively uses predefined gene set activation levels as molecular markers of tumors.
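The median-based grouping and Kaplan-Meier step described in this abstract can be illustrated with a minimal Python sketch. This is not PESSA's code (the abstract states that PESSA is implemented in R with the survival and survminer packages); the function names `median_split` and `kaplan_meier` and the toy scores, times and event flags are hypothetical, and the optimal cut-off search and Cox regression are not shown.

```python
from statistics import median


def median_split(scores):
    """Label each sample 'high' if its score exceeds the cohort median, else 'low'."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events: 1 = event observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every sample that shares this time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        # Events and censored samples both leave the risk set.
        at_risk -= removed
    return curve


# Hypothetical toy cohort: enrichment scores, follow-up times, event flags.
scores = [0.12, 0.45, 0.33, 0.80, 0.27, 0.66]
times = [5, 12, 8, 3, 20, 6]
events = [1, 0, 1, 1, 0, 1]

groups = median_split(scores)
high_idx = [i for i, g in enumerate(groups) if g == "high"]
high_curve = kaplan_meier([times[i] for i in high_idx],
                          [events[i] for i in high_idx])
```

In this sketch, samples whose score equals the median fall into the "low" group; that tie rule is an arbitrary choice made for the example.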
Affiliation(s)
- Hong Yang
  - Department of Oncology, Zhujiang Hospital, Southern Medical University, Haizhu District, Guangzhou, Guangdong, China
  - The First School of Clinical Medicine, Southern Medical University, Baiyun District, Guangzhou, Guangdong, China
- Ying Shi
  - Department of Oncology, Zhujiang Hospital, Southern Medical University, Haizhu District, Guangzhou, Guangdong, China
  - The Second School of Clinical Medicine, Southern Medical University, Baiyun District, Guangzhou, Guangdong, China
- Anqi Lin
  - Department of Oncology, Zhujiang Hospital, Southern Medical University, Haizhu District, Guangzhou, Guangdong, China
- Chang Qi
  - Institute of Logic and Computation, TU Wien, Austria
- Zaoqu Liu
  - State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing, China
  - State Key Laboratory of Medical Molecular Biology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences, Department of Pathophysiology, Peking Union Medical College, Beijing, China
- Quan Cheng
  - Department of Neurosurgery, Xiangya Hospital, Central South University, Changsha, Hunan, China
  - National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
- Kai Miao
  - Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau SAR, China
  - MoE Frontiers Science Center for Precision Oncology, University of Macau, Macau SAR, China
- Jian Zhang
  - Department of Oncology, Zhujiang Hospital, Southern Medical University, Haizhu District, Guangzhou, Guangdong, China
- Peng Luo
  - Department of Oncology, Zhujiang Hospital, Southern Medical University, Haizhu District, Guangzhou, Guangdong, China
2
Zhang J, Liu L, Zhang W, Li X, Zhao C, Li S, Li J, Le TD. miRspongeR 2.0: an enhanced R package for exploring miRNA sponge regulation. Bioinform Adv 2022; 2:vbac063. [PMID: 36699386 PMCID: PMC9710667 DOI: 10.1093/bioadv/vbac063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2022] [Revised: 08/24/2022] [Accepted: 09/01/2022] [Indexed: 02/01/2023]
Abstract
Summary: MicroRNA (miRNA) sponges influence the capability of miRNA-mediated gene silencing by competing for shared miRNA response elements and play significant roles in many physiological and pathological processes. It has been proved that computational or dry-lab approaches are useful to guide wet-lab experiments for uncovering miRNA sponge regulation. However, all of the existing tools only allow the analysis of miRNA sponge regulation regarding a group of samples, rather than the miRNA sponge regulation unique to individual samples. Furthermore, most existing tools do not allow parallel computing for the fast identification of miRNA sponge regulation. Here, we present an enhanced version of our R/Bioconductor package, miRspongeR 2.0. Compared with the original version introduced in 2019, this package extends the resolution of miRNA sponge regulation from the multi-sample level to the single-sample level. Moreover, it supports the identification of miRNA sponge networks using parallel computing, and the construction of sample-sample correlation networks. It also provides more computational methods to infer miRNA sponge regulation and expands the ground truth for validation. With these new features, we anticipate that miRspongeR 2.0 will further accelerate the research on miRNA sponges with higher resolution and more utilities.
Availability and implementation: http://bioconductor.org/packages/miRspongeR/.
Supplementary information: Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Lin Liu
  - UniSA STEM, University of South Australia, Mawson Lakes, SA 5095, Australia
- Wu Zhang
  - Department of Molecular Biology, School of Agriculture and Biological Sciences, Dali University, Dali 671003, China
- Xiaomei Li
  - UniSA STEM, University of South Australia, Mawson Lakes, SA 5095, Australia
- Chunwen Zhao
  - Department of Information and Electronic Engineering, School of Engineering, Dali University, Dali 671003, China
- Sijing Li
  - Department of Information and Electronic Engineering, School of Engineering, Dali University, Dali 671003, China
- Jiuyong Li
  - UniSA STEM, University of South Australia, Mawson Lakes, SA 5095, Australia
- Thuc Duy Le
  - To whom correspondence should be addressed.
3
Yang S, Zeng L, Jin X, Lin H, Song J. Feature Genes in Neuroblastoma Distinguishing High-Risk and Non-high-Risk Neuroblastoma Patients: Development and Validation Combining Random Forest With Artificial Neural Network. Front Med (Lausanne) 2022; 9:882348. [PMID: 35911385 PMCID: PMC9336509 DOI: 10.3389/fmed.2022.882348] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/23/2022] [Accepted: 06/13/2022] [Indexed: 11/13/2022]
Abstract
There is a significant difference in prognosis among different risk groups. Therefore, it is of great significance to correctly identify the risk grouping of children. Using the genomic data of neuroblastoma samples in public databases, we used GSE49710 as the training set to calculate the feature genes of the high-risk and non-high-risk group samples based on the random forest (RF) algorithm and artificial neural network (ANN) algorithm. The screening results of RF showed that EPS8L1, PLCD4, CHD5, NTRK1, and SLC22A4 were the feature differentially expressed genes (DEGs) of high-risk neuroblastoma. The prediction model based on gene expression data in this study showed high overall accuracy and precision in both the training set and the test set (AUC = 0.998 in GSE49710 and AUC = 0.858 in GSE73517). Kaplan–Meier plotter showed that the overall survival and progression-free survival of patients in the low-risk subgroup were significantly better than those in the high-risk subgroup [HR: 3.86 (95% CI: 2.44–6.10) and HR: 3.03 (95% CI: 2.03–4.52), respectively]. Our ANN-based model has better classification performance than the SVM-based model and XGBoost-based model. Nevertheless, more convincing data sets and machine learning algorithms will be needed to build diagnostic models for individual organization types in the future.
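As a reading aid for the AUC values quoted in this abstract, the following is a minimal Python sketch of how an AUC can be computed from binary labels and model scores, using the pairwise (Mann-Whitney) formulation. It does not reproduce the authors' RF/ANN pipeline; the function name `auc` and the toy labels and scores are hypothetical and are not drawn from GSE49710 or GSE73517.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 = positive class (e.g. high-risk), 0 = negative class.
    Returns the fraction of (positive, negative) pairs in which the
    positive sample gets the higher score; ties count as half.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy predictions for six samples.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
toy_auc = auc(toy_labels, toy_scores)
```

An AUC of 1.0 means every positive sample is scored above every negative one, and 0.5 corresponds to random ranking.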
Affiliation(s)
- Sha Yang
  - Department of Surgery, Children’s Hospital of Chongqing Medical University, Chongqing, China
  - Ministry of Education Key Laboratory of Child Development and Disorders, Chongqing, China
  - National Clinical Research Center for Child Health and Disorders, Chongqing, China
  - China International Science and Technology Cooperation Base of Child Development and Critical Disorders, Chongqing, China
  - Chongqing Key Laboratory of Pediatrics, Chongqing, China
  - Chongqing Engineering Research Center of Stem Cell Therapy, Chongqing, China
  - Children’s Hospital of Chongqing Medical University, Chongqing, China
- Lingfeng Zeng
  - Department of Nephrology, The Second Xiangya Hospital of Central South University, Changsha, China
- Xin Jin
  - Ministry of Education Key Laboratory of Child Development and Disorders, Chongqing, China
  - National Clinical Research Center for Child Health and Disorders, Chongqing, China
  - China International Science and Technology Cooperation Base of Child Development and Critical Disorders, Chongqing, China
  - Chongqing Key Laboratory of Pediatrics, Chongqing, China
  - Chongqing Engineering Research Center of Stem Cell Therapy, Chongqing, China
  - Children’s Hospital of Chongqing Medical University, Chongqing, China
  - Department of Cardiacthoracic, Children’s Hospital of Chongqing Medical University, Chongqing, China
- Huapeng Lin
  - Department of Intensive Care Unit, Affiliated Hangzhou First People’s Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Jianning Song
  - Department of General Surgery, Guiqian International General Hospital, Guiyang, China
  - *Correspondence: Jianning Song
4
Song Y, Li J, Mao Y, Zhang X. ceRNAshiny: An Interactive R/Shiny App for Identification and Analysis of ceRNA Regulation. Front Mol Biosci 2022; 9:865408. [PMID: 35647026 PMCID: PMC9136144 DOI: 10.3389/fmolb.2022.865408] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/29/2022] [Accepted: 04/13/2022] [Indexed: 12/13/2022]
Abstract
The competing endogenous RNA (ceRNA) network is a newly discovered form of post-transcriptional regulation that controls both physiological and pathological processes. A growing number of studies have built on this theory to explore the function of novel non-coding RNAs, pseudogenes, circular RNAs, and messenger RNAs. Although there are several R packages and computational tools for analyzing ceRNA networks, there remains an urgent need for easy-to-use computational tools to identify ceRNA regulation. Besides, conventional tools were mainly devoted to investigating ceRNAs in malignancies rather than in neurodegenerative diseases. To fill this gap, we developed ceRNAshiny, an interactive R/Shiny application that integrates widely used computational methods and databases to construct, analyze, and visualize the ceRNA network, including differential gene analysis and functional annotation. In addition, the demo data in ceRNAshiny support ceRNA network analyses of neurodegenerative diseases such as Parkinson's disease. Overall, ceRNAshiny is a user-friendly application that benefits all researchers, especially those who lack an established bioinformatic pipeline and are interested in studying ceRNA networks.
Affiliation(s)
- Yueqiang Song
  - State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
- Jia Li
  - State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
- Yiming Mao
  - Department of Thoracic Surgery, Suzhou Kowloon Hospital, School of Medicine, Shanghai Jiao Tong University, Suzhou, China
- Xi Zhang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
  - Department of Rehabilitation, Huashan Hospital, Fudan University, Shanghai, China
5
Kang Q, Meng J, Su C, Luan Y. Mining plant endogenous target mimics from miRNA-lncRNA interactions based on dual-path parallel ensemble pruning method. Brief Bioinform 2021; 23:6399881. [PMID: 34662389 DOI: 10.1093/bib/bbab440] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 08/03/2021] [Revised: 09/07/2021] [Accepted: 09/24/2021] [Indexed: 12/14/2022]
Abstract
The interactions between microRNAs (miRNAs) and long non-coding RNAs (lncRNAs) play important roles in biological activities. Specifically, lncRNAs acting as endogenous target mimics (eTMs) can bind miRNAs to regulate the expression of target messenger RNAs (mRNAs). A growing number of studies focus on animals, but studies on plants are scarce and many functions of plant eTMs are unknown. This study first proposes a novel ensemble pruning protocol for predicting plant miRNA-lncRNA interactions. It adaptively prunes the base models based on a dual-path parallel ensemble method to meet the challenge of cross-species prediction. Potential eTMs are then mined from the predicted results. The expression levels of RNAs are identified through biological experiments to construct the lncRNA-miRNA-mRNA regulatory network, and the functions of potential eTMs are inferred through enrichment analysis. Experimental results show that the proposed protocol outperforms existing methods and state-of-the-art predictors on various plant species. A total of 17 potential eTMs are verified by biological experiments to be involved in 22 regulations, and 14 potential eTMs are inferred by Gene Ontology enrichment analysis to be involved in 63 functions, which is significant for further research.
Affiliation(s)
- Qiang Kang
  - School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
- Jun Meng
  - School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
- Chenglin Su
  - School of Bioengineering, Dalian University of Technology, Dalian, Liaoning 116024, China
- Yushi Luan
  - School of Bioengineering, Dalian University of Technology, Dalian, Liaoning 116024, China