1
Shireman JM, White Q, Ni Z, Mohanty C, Cai Y, Zhao L, Agrawal N, Gonugunta N, Wang X, Mccarthy L, Kasulabada V, Pattnaik A, Ahmed AU, Miller J, Kulwin C, Cohen-Gadol A, Payner T, Lin CT, Savage JJ, Lane B, Shiue K, Kamer A, Shah M, Iyer G, Watson G, Kendziorski C, Dey M. Genomic analysis of human brain metastases treated with stereotactic radiosurgery reveals unique signature based on treatment failure. iScience 2024; 27:109601. [PMID: 38623341 PMCID: PMC11016778 DOI: 10.1016/j.isci.2024.109601] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Revised: 03/12/2024] [Accepted: 03/25/2024] [Indexed: 04/17/2024] Open
Abstract
Stereotactic radiosurgery (SRS) has been shown to be efficacious for the treatment of limited brain metastasis (BM); however, the effects of SRS on human brain metastases have yet to be studied. We performed genomic analysis on resected brain metastases from patients whose resected lesion was previously treated with SRS. Our analyses demonstrated for the first time that patients possess a distinct genomic signature based on type of treatment failure including local failure, leptomeningeal spread, and radio-necrosis. Examination of the center and peripheral edge of the tumors treated with SRS indicated differential DNA damage distribution and an enrichment for tumor suppressor mutations and DNA damage repair pathways along the peripheral edge. Furthermore, the two clinical modalities used to deliver SRS, LINAC and GK, demonstrated differential effects on the tumor landscape even between controlled primary sites. Our study provides, in humans, biological evidence of differential effects of SRS across BMs.
Affiliation(s)
- Jack M. Shireman
  - Department of Neurosurgery, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
- Quinn White
  - Department of Biostatistics and Medical Informatics, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
- Zijian Ni
  - Department of Biostatistics and Medical Informatics, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
- Chitrasen Mohanty
  - Department of Biostatistics and Medical Informatics, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
- Yujia Cai
  - Department of Biostatistics and Medical Informatics, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
- Lei Zhao
  - Department of Neurosurgery, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
- Namita Agrawal
  - Department of Radiation Oncology, Indiana University School of Medicine, Indianapolis, IN, USA
- Nikita Gonugunta
  - Department of Neurosurgery, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
- Xiaohu Wang
  - Department of Neurosurgery, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
- Liam Mccarthy
  - Department of Neurosurgery, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
- Varshitha Kasulabada
  - Department of Neurosurgery, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
- Akshita Pattnaik
  - Department of Neurosurgery, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
- Atique U. Ahmed
  - Department of Neurological Surgery, Northwestern University, Chicago, IL, USA
- James Miller
  - Department of Neurological Surgery, Indiana University School of Medicine, Indianapolis, IN, USA
- Charles Kulwin
  - Goodman Campbell Brain and Spine Neurological Surgery, Indianapolis, IN, USA
- Aaron Cohen-Gadol
  - Department of Neurological Surgery, Indiana University School of Medicine, Indianapolis, IN, USA
- Troy Payner
  - Goodman Campbell Brain and Spine Neurological Surgery, Indianapolis, IN, USA
- Chih-Ta Lin
  - Department of Neurological Surgery, Indiana University School of Medicine, Indianapolis, IN, USA
- Jesse J. Savage
  - Department of Neurological Surgery, Indiana University School of Medicine, Indianapolis, IN, USA
- Brandon Lane
  - Department of Neurological Surgery, Indiana University School of Medicine, Indianapolis, IN, USA
- Kevin Shiue
  - Department of Radiation Oncology, Indiana University School of Medicine, Indianapolis, IN, USA
- Aaron Kamer
  - Department of Clinical Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN, USA
- Mitesh Shah
  - Department of Neurological Surgery, Indiana University School of Medicine, Indianapolis, IN, USA
- Gopal Iyer
  - Department of Human Oncology, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
- Gordon Watson
  - Department of Radiation Oncology, Indiana University School of Medicine, Indianapolis, IN, USA
- Christina Kendziorski
  - Department of Biostatistics and Medical Informatics, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
- Mahua Dey
  - Department of Neurosurgery, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
2
Tao Y, Zhang Q, Wang H, Yang X, Mu H. Alternative splicing and related RNA binding proteins in human health and disease. Signal Transduct Target Ther 2024; 9:26. [PMID: 38302461 PMCID: PMC10835012 DOI: 10.1038/s41392-024-01734-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2023] [Revised: 12/18/2023] [Accepted: 12/27/2023] [Indexed: 02/03/2024] Open
Abstract
Alternative splicing (AS) serves as a pivotal mechanism in transcriptional regulation, engendering transcript diversity, and modifications in protein structure and functionality. Across varying tissues, developmental stages, or under specific conditions, AS gives rise to distinct splice isoforms. This implies that these isoforms possess unique temporal and spatial roles, thereby associating AS with standard biological activities and diseases. Among these, AS-related RNA-binding proteins (RBPs) play an instrumental role in regulating alternative splicing events. Under physiological conditions, the diversity of proteins mediated by AS influences the structure, function, interaction, and localization of proteins, thereby participating in the differentiation and development of an array of tissues and organs. Under pathological conditions, alterations in AS are linked with various diseases, particularly cancer. These changes can lead to modifications in gene splicing patterns, culminating in changes or loss of protein functionality. For instance, in cancer, abnormalities in AS and RBPs may result in aberrant expression of cancer-associated genes, thereby promoting the onset and progression of tumors. AS and RBPs are also associated with numerous neurodegenerative diseases and autoimmune diseases. Consequently, the study of AS across different tissues holds significant value. This review provides a detailed account of the recent advancements in the study of alternative splicing and AS-related RNA-binding proteins in tissue development and diseases, which aids in deepening the understanding of gene expression complexity and offers new insights and methodologies for precision medicine.
Affiliation(s)
- Yining Tao
  - Department of Orthopedics, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, 200000, Shanghai, China
  - Shanghai Bone Tumor Institution, 200000, Shanghai, China
- Qi Zhang
  - Department of Biochemistry and Molecular Cell Biology, Shanghai Key Laboratory for Tumor Microenvironment and Inflammation, Shanghai Jiao Tong University School of Medicine, 200000, Shanghai, China
- Haoyu Wang
  - Department of Orthopedics, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, 200000, Shanghai, China
  - Shanghai Bone Tumor Institution, 200000, Shanghai, China
- Xiyu Yang
  - Department of Orthopedics, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, 200000, Shanghai, China
  - Shanghai Bone Tumor Institution, 200000, Shanghai, China
- Haoran Mu
  - Department of Orthopedics, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, 200000, Shanghai, China.
  - Shanghai Bone Tumor Institution, 200000, Shanghai, China.
3
Shen F, Hu C, Huang X, He H, Yang D, Zhao J, Yang X. Advances in alternative splicing identification: deep learning and pantranscriptome. Front Plant Sci 2023; 14:1232466. [PMID: 37790793 PMCID: PMC10544900 DOI: 10.3389/fpls.2023.1232466] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Accepted: 08/28/2023] [Indexed: 10/05/2023]
Abstract
In plants, alternative splicing is a crucial mechanism for regulating gene expression at the post-transcriptional level; by generating multiple mature mRNA isoforms, it produces diverse proteins and diversifies gene regulation. Due to the complexity and variability of this process, accurate identification of splicing events is a vital step in studying alternative splicing. This article presents the application of alternative splicing algorithms with or without reference genomes in plants, as well as the integration of advanced deep learning techniques for improved detection accuracy. In addition, we discuss alternative splicing studies in the pan-genomic background and the usefulness of integrated strategies for fully profiling alternative splicing.
Affiliation(s)
- Fei Shen
  - Institute of Biotechnology, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
- Chenyang Hu
  - Institute of Biotechnology, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
  - Shanxi Key Lab of Chinese Jujube, College of Life Science, Yan’an University, Yan’an, Shanxi, China
- Xin Huang
  - Institute of Biotechnology, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
- Hao He
  - Institute of Biotechnology, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
- Deng Yang
  - Institute of Biotechnology, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
- Jirong Zhao
  - Shanxi Key Lab of Chinese Jujube, College of Life Science, Yan’an University, Yan’an, Shanxi, China
- Xiaozeng Yang
  - Institute of Biotechnology, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
4
Liu Y, Yang C, Li HD, Wang J. IsoFrog: a reversible jump Markov Chain Monte Carlo feature selection-based method for predicting isoform functions. Bioinformatics 2023; 39:btad530. [PMID: 37647643 PMCID: PMC10491952 DOI: 10.1093/bioinformatics/btad530] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Revised: 07/21/2023] [Accepted: 08/29/2023] [Indexed: 09/01/2023] Open
Abstract
MOTIVATION: A single gene may yield several isoforms with different functions through alternative splicing. Continuous efforts are devoted to developing machine-learning methods to predict isoform functions. However, existing methods do not consider the relevance of each feature to specific functions and ignore the noise caused by the irrelevant features. In this case, we hypothesize that constructing a feature selection framework to extract the function-relevant features might help improve the model accuracy in isoform function prediction. RESULTS: In this article, we present a feature selection-based approach named IsoFrog to predict isoform functions. First, IsoFrog adopts a reversible jump Markov Chain Monte Carlo (RJMCMC)-based feature selection framework to assess the feature importance to gene functions. Second, a sequential feature selection procedure is applied to select a subset of function-relevant features. This strategy screens the relevant features for the specific function while eliminating irrelevant ones, improving the effectiveness of the input features. Then, the selected features are input into our proposed method, modified domain-invariant partial least squares, which prioritizes the most likely positive isoform for each positive MIG and utilizes diPLS for isoform function prediction. Tested on three datasets, our method achieves superior performance over six state-of-the-art methods, and the RJMCMC-based feature selection framework outperforms three classic feature selection methods. We expect this proposed methodology will promote the identification of isoform functions and further inspire the development of new methods. AVAILABILITY AND IMPLEMENTATION: IsoFrog is freely available at https://github.com/genemine/IsoFrog.
Affiliation(s)
- Yiwei Liu
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, P.R. China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, Hunan 410083, P.R. China
- Changhuo Yang
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, P.R. China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, Hunan 410083, P.R. China
- Hong-Dong Li
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, P.R. China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, Hunan 410083, P.R. China
- Jianxin Wang
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, P.R. China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, Hunan 410083, P.R. China
5
Shireman JM, White Q, Agrawal N, Ni Z, Chen G, Zhao L, Gonugunta N, Wang X, Mccarthy L, Kasulabada V, Pattnaik A, Ahmed AU, Miller J, Kulwin C, Cohen-Gadol A, Payner T, Lin CT, Savage JJ, Lane B, Shiue K, Kamer A, Shah M, Iyer G, Watson G, Kendziorski C, Dey M. Genomic Analysis of Human Brain Metastases Treated with Stereotactic Radiosurgery Under the Phase-II Clinical Trial (NCT03398694) Reveals DNA Damage Repair at the Peripheral Tumor Edge. medRxiv 2023:2023.04.15.23288491. [PMID: 37131583 PMCID: PMC10153341 DOI: 10.1101/2023.04.15.23288491] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Stereotactic Radiosurgery (SRS) is one of the leading treatment modalities for oligo brain metastasis (BM); however, no comprehensive genomic data assessing the effect of radiation on BM in humans exist. Leveraging a unique opportunity, as part of the clinical trial (NCT03398694), we collected post-SRS tumor samples, with SRS delivered via Gamma-knife or LINAC, from the core and peripheral edges of the resected tumor to characterize the genomic effects of SRS overall as well as of the SRS delivery modality. Using these rare patient samples, we show that SRS results in significant genomic changes at DNA and RNA levels throughout the tumor. Mutations and expression profiles of peripheral tumor samples indicated interaction with surrounding brain tissue as well as elevated DNA damage repair. Central samples show GSEA enrichment for cellular apoptosis while peripheral samples carried an increase in tumor suppressor mutations. There are significant differences in the transcriptomic profile at the periphery between Gamma-knife and LINAC.
Affiliation(s)
- Jack M. Shireman
  - Department of Neurosurgery, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
- Quinn White
  - Department of Biostatistics and Medical Informatics, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
- Namita Agrawal
  - Department of Radiation Oncology, Indiana University School of Medicine, Indianapolis, IN, USA
- Zijian Ni
  - Department of Biostatistics and Medical Informatics, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
- Grace Chen
  - Department of Biostatistics and Medical Informatics, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
- Lei Zhao
  - Department of Neurosurgery, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
- Nikita Gonugunta
  - Department of Neurosurgery, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
- Xiaohu Wang
  - Department of Neurosurgery, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
- Liam Mccarthy
  - Department of Neurosurgery, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
- Varshitha Kasulabada
  - Department of Neurosurgery, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
- Akshita Pattnaik
  - Department of Neurosurgery, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
- Atique U. Ahmed
  - Department of Neurological Surgery, Northwestern University, Chicago, IL, USA
- James Miller
  - Department of Neurological Surgery, Indiana University School of Medicine, Indianapolis, IN, USA
- Charles Kulwin
  - Goodman Campbell Brain and Spine Neurological Surgery, Indianapolis, IN, USA
- Aaron Cohen-Gadol
  - Department of Neurological Surgery, Indiana University School of Medicine, Indianapolis, IN, USA
- Troy Payner
  - Goodman Campbell Brain and Spine Neurological Surgery, Indianapolis, IN, USA
- Chih-Ta Lin
  - Department of Neurological Surgery, Indiana University School of Medicine, Indianapolis, IN, USA
- Jesse J. Savage
  - Department of Neurological Surgery, Indiana University School of Medicine, Indianapolis, IN, USA
- Brandon Lane
  - Department of Neurological Surgery, Indiana University School of Medicine, Indianapolis, IN, USA
- Kevin Shiue
  - Department of Radiation Oncology, Indiana University School of Medicine, Indianapolis, IN, USA
- Aaron Kamer
  - Department of Radiation Oncology, Indiana University School of Medicine, Indianapolis, IN, USA
- Mitesh Shah
  - Department of Neurological Surgery, Indiana University School of Medicine, Indianapolis, IN, USA
- Gopal Iyer
  - Department of Human Oncology, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
- Gordon Watson
  - Department of Radiation Oncology, Indiana University School of Medicine, Indianapolis, IN, USA
- Christina Kendziorski
  - Department of Biostatistics and Medical Informatics, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
- Mahua Dey
  - Department of Neurosurgery, University of Wisconsin Madison School of Medicine and Public Health, Madison, WI, USA
6
Qiu S, Yu G, Lu X, Domeniconi C, Guo M. Isoform function prediction by Gene Ontology embedding. Bioinformatics 2022; 38:4581-4588. [PMID: 35997558 DOI: 10.1093/bioinformatics/btac576] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/29/2022] [Revised: 07/13/2022] [Accepted: 08/22/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION: High-resolution annotation of gene functions is a central task in functional genomics. Multiple proteoforms translated from alternatively spliced isoforms of a single gene are the actual function performers and greatly increase functional diversity. The specific functions of different isoforms can help decipher the molecular basis of various complex diseases at a finer granularity. Multi-instance learning (MIL)-based solutions have been developed to distribute gene (bag)-level Gene Ontology (GO) annotations to isoforms (instances), but they simply presume that only one isoform is responsible for a particular annotation of the gene, neglect the hierarchical structures and semantics of massive GO terms (labels), or can only handle dozens of terms. RESULTS: We propose an effective approach, IsofunGO, to differentiate massive functions of isoforms by GO embedding. In particular, IsofunGO first introduces an attributed hierarchical network to model massive GO terms, and a GO network embedding strategy to learn compact representations of GO terms and project GO annotations of genes into compressed ones; this strategy not only explores and preserves the hierarchy between GO terms but also greatly reduces the prediction load. Next, it develops an attention-based MIL network to fuse genomics and transcriptomics data of isoforms and predict isoform functions by referring to the compressed annotations. Extensive experiments on benchmark datasets demonstrate the efficacy of IsofunGO. Both the GO embedding and the attention mechanism can boost performance and interpretability. AVAILABILITY AND IMPLEMENTATION: The code of IsofunGO is available at http://www.sdu-idea.cn/codes.php?name=IsofunGO. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
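As an illustration of the attention-based multi-instance idea mentioned in the abstract above, the following is a minimal sketch of attention pooling over the isoforms (instances) of one gene (bag). It is not the IsofunGO implementation: the function names `softmax` and `attention_pool` are hypothetical, and the attention scores are passed in as plain numbers, whereas a trained network would compute them with learned layers.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(instance_features, instance_scores):
    """Pool instance (isoform) feature vectors into one bag (gene) vector.

    instance_features: list of equal-length feature vectors, one per isoform.
    instance_scores:   one raw attention score per isoform (assumed to be
                       given here; hypothetical stand-in for a learned layer).
    Returns (bag_vector, attention_weights).
    """
    if len(instance_features) != len(instance_scores):
        raise ValueError("need exactly one score per instance")
    weights = softmax(instance_scores)
    dim = len(instance_features[0])
    bag = [
        sum(w * feat[d] for w, feat in zip(weights, instance_features))
        for d in range(dim)
    ]
    return bag, weights


# Two isoforms of one gene with equal attention scores: each gets weight 0.5.
bag, weights = attention_pool([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
print(bag, weights)
```

The attention weights double as a rough interpretability signal: an isoform with a larger weight contributes more to the gene-level representation.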
Affiliation(s)
- Sichao Qiu
  - School of Software, Shandong University, Jinan, Shandong 250101, China
  - Joint SDU-NTU Centre for Artificial Intelligence Research, Shandong University, Jinan, Shandong 250101, China
- Guoxian Yu
  - School of Software, Shandong University, Jinan, Shandong 250101, China
  - Joint SDU-NTU Centre for Artificial Intelligence Research, Shandong University, Jinan, Shandong 250101, China
- Xudong Lu
  - School of Software, Shandong University, Jinan, Shandong 250101, China
  - Joint SDU-NTU Centre for Artificial Intelligence Research, Shandong University, Jinan, Shandong 250101, China
- Maozu Guo
  - College of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China
7
Yu G, Huang Q, Zhang X, Guo M, Wang J. Tissue Specificity Based Isoform Function Prediction. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:3048-3059. [PMID: 34185647 DOI: 10.1109/tcbb.2021.3093167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Alternative splicing enables a gene to be spliced into different isoforms and hence protein variants. Identifying the individual functions of these isoforms helps decipher the functional diversity of proteins. Although much effort has been devoted to automatic gene function prediction, little has been directed toward computational isoform function prediction, mainly because functional annotations of isoforms are unavailable (or scanty). Existing efforts directly combine multiple RNA-seq datasets without accounting for the important tissue specificity of alternative splicing. To bridge this gap, we introduce a novel approach called TS-Isofun to predict the functions of isoforms by integrating multiple functional association networks with respect to tissue specificity. TS-Isofun first constructs tissue-specific isoform functional association networks from multiple RNA-seq datasets, tissue by tissue. Next, TS-Isofun assigns weights to these networks and models the tissue specificity by selectively integrating them with adaptive weights. It then introduces a joint matrix factorization-based data fusion model to leverage the integrated network, gene-level data and functional annotations of genes to infer the functions of isoforms. To achieve coherent weight assignment and isoform function prediction, TS-Isofun jointly optimizes the weights of individual networks and the isoform function prediction in a unified objective function. Experimental results show that TS-Isofun significantly outperforms state-of-the-art methods and that accounting for tissue specificity contributes to more accurate isoform function prediction.
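The network integration step described in the abstract above can be pictured with a small sketch. This is only an assumption-laden illustration, not TS-Isofun itself: the function name `integrate_networks` is hypothetical, and the weights are fixed by hand, whereas the abstract states that TS-Isofun learns them jointly with the prediction objective.

```python
def integrate_networks(networks, weights):
    """Combine tissue-specific networks into one by a convex weighted sum.

    networks: list of square adjacency matrices (lists of lists) of equal
              size, one per tissue.
    weights:  one non-negative weight per network; they are normalized to
              sum to 1. Hand-set here (assumption), not learned.
    """
    if len(networks) != len(weights):
        raise ValueError("need exactly one weight per network")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    normalized = [w / total for w in weights]
    n = len(networks[0])
    combined = [[0.0] * n for _ in range(n)]
    for net, w in zip(networks, normalized):
        for i in range(n):
            for j in range(n):
                combined[i][j] += w * net[i][j]
    return combined


# Hypothetical two-isoform networks for two tissues.
brain = [[0.0, 0.8], [0.8, 0.0]]
liver = [[0.0, 0.4], [0.4, 0.0]]
# Weights 3:1 normalize to 0.75 and 0.25.
print(integrate_networks([brain, liver], [3.0, 1.0]))
```

Normalizing the weights keeps the combined edge strengths on the same scale as the inputs, so a tissue with a larger weight pulls the integrated network toward its own association pattern.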
8
Hao DC, Chen H, Xiao PG, Jiang T. A Global Analysis of Alternative Splicing of Dichocarpum Medicinal Plants, Ranunculales. Curr Genomics 2022; 23:207-216. [PMID: 36777007 PMCID: PMC9878827 DOI: 10.2174/1389202923666220527112929] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2022] [Revised: 04/19/2022] [Accepted: 04/26/2022] [Indexed: 11/22/2022] Open
Abstract
Background: Multiple isoforms are often generated from a single gene via Alternative Splicing (AS) in plants, which significantly increases the functional diversity of the plant genome. Although gene functions are well studied, the specific functions of isoforms are little known; therefore, accurate prediction of isoform functions is much needed. Methods: Here we perform the first global analysis of AS of Dichocarpum, a medicinal genus of Ranunculales, by utilizing full-length transcriptome datasets of five Chinese endemic Dichocarpum taxa. Multiple software tools were used to identify AS events, gene function was annotated based on seven databases, and the protein-coding sequence of each AS isoform was translated into an amino acid sequence. The self-developed software DIFFUSE was used to predict the functions of AS isoforms. Results: Among 8,485 genes with AS events, genes with two isoforms were the most numerous (6,038), followed by those with three isoforms and four isoforms. Retained intron (RI, 551) was predominant among 1,037 AS events, followed by alternative 3' splice sites and alternative 5' splice sites. The software DIFFUSE was effective in predicting functions of Dichocarpum isoforms, which had not previously been explored. When compared with the sequence alignment-based database annotations, DIFFUSE performed better in differentiating isoform functions. The DIFFUSE predictions on the terms GO:0003677 (DNA binding) and GO:0010333 (terpene synthase activity) agreed with the biological features of transcript isoforms. Conclusion: Numerous AS events were identified for the first time from full-length transcriptome datasets of five Dichocarpum taxa, and functions of AS isoforms were successfully predicted by the self-developed software DIFFUSE. The global analysis of Dichocarpum AS events and the prediction of isoform functions can help in understanding the metabolic regulation of medicinal taxa and their pharmaceutical exploration.
Affiliation(s)
- Da-Cheng Hao
  - Biotechnology Institute, School of Environment and Chemical Engineering, Dalian Jiaotong University, Dalian 116028, China
  - Institute of Molecular Plant Sciences, University of Edinburgh, Edinburgh EH9 3BF, UK
  - Address correspondence to these authors at the School of Environment and Chemical Engineering, Dalian Jiaotong University, Dalian 116028, China; Tel: 0086-411-84572552; E-mail: ; and Department of Computer Science and Engineering, University of California, Riverside, CA, USA; Tel/Fax: 001-951-827-2991; E-mail:
- Hao Chen
  - Department of Computer Science and Engineering, University of California, Riverside, CA, USA
  - Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - These authors contributed equally to this work.
- Pei-Gen Xiao
  - Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences, Beijing 100193, China
- Tao Jiang
  - Department of Computer Science and Engineering, University of California, Riverside, CA, USA
  - Bioinformatics Division, BNRIST/Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
  - Address correspondence to these authors at the School of Environment and Chemical Engineering, Dalian Jiaotong University, Dalian 116028, China; Tel: 0086-411-84572552; E-mail: ; and Department of Computer Science and Engineering, University of California, Riverside, CA, USA; Tel/Fax: 001-951-827-2991; E-mail:
9
Wang J, Zhang L, Zeng A, Xia D, Yu J, Yu G. DeepIII: Predicting Isoform-Isoform Interactions by Deep Neural Networks and Data Fusion. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:2177-2187. [PMID: 33764878 DOI: 10.1109/tcbb.2021.3068875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Alternative splicing enables a gene to give rise to different isoforms and the corresponding proteoforms, which actually accomplish the various biological functions of a living body. Isoform-isoform interactions (IIIs) provide a higher-resolution interactome for exploring cellular processes and disease mechanisms than the canonically studied protein-protein interactions (PPIs), which are often recorded at the coarse gene level. Knowledge of IIIs is critical for mapping pathways and understanding protein complexity and functional diversity, but known IIIs are scarce. In this paper, we propose a deep learning based method called DeepIII to systematically predict genome-wide IIIs by integrating diverse data sources, including RNA-seq datasets of different human tissues, exon array data, domain-domain interactions (DDIs) of proteins, nucleotide sequences and amino acid sequences. In particular, DeepIII fuses these data to learn the representation of isoform pairs with a four-layer deep neural network, and then performs binary classification on the learnt representation to predict IIIs. Experimental results show that DeepIII achieves prediction performance superior to state-of-the-art solutions and that the III network constructed by DeepIII gives more accurate isoform function prediction. Case studies further confirm that DeepIII can differentiate the individual interaction partners of different isoforms spliced from the same gene. The code and datasets of DeepIII are available at http://mlda.swu.edu.cn/codes.php?name=DeepIII.
10
Yu G, Yang Y, Yan Y, Guo M, Zhang X, Wang J. DeepIDA: Predicting Isoform-Disease Associations by Data Fusion and Deep Neural Networks. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:2166-2176. [PMID: 33571094 DOI: 10.1109/tcbb.2021.3058801] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/12/2023]
Abstract
Alternative splicing produces different isoforms from the same gene locus; it is an important mechanism for regulating gene expression and proteome diversity. Although the prediction of gene (ncRNA)-disease associations has been extensively studied, few (or no) computational solutions have been proposed for the prediction of isoform-disease associations (IDAs) at a large scale, mainly due to the lack of disease annotations of isoforms. However, increasing evidence confirms the associations between diseases and isoforms, which can more precisely uncover the pathology of complex diseases. Therefore, it is highly desirable to predict IDAs. To bridge this gap, we propose a deep neural network based solution (DeepIDA) to fuse multi-type genomics and transcriptomics data to predict IDAs. In particular, DeepIDA uses gene-isoform relations to dispatch gene-disease associations to isoforms. In addition, it utilizes two DNN sub-networks with different structures to capture nucleotide and expression features of isoforms, Gene Ontology data and miRNA target data, respectively. After that, these two sub-networks are merged in a dense layer to predict IDAs. The experimental results on public datasets show that DeepIDA can effectively predict IDAs with AUPRC (area under the precision-recall curve) of 0.9141, macro F-measure of 0.9155, G-mean of 0.9278 and balanced accuracy of 0.9303 across 732 diseases, which are much higher than those of competitive methods. Further study on sixteen isoform-disease association cases again corroborates the superiority of DeepIDA. The code of DeepIDA is available at http://mlda.swu.edu.cn/codes.php?name=DeepIDA.
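For readers unfamiliar with the imbalance-aware metrics quoted in the abstract above, here is a minimal sketch of how balanced accuracy, G-mean and the F-measure are commonly computed for a single binary task. The abstract does not spell out the exact definitions or the averaging over the 732 diseases, so the standard textbook formulas are assumed here; the function name `binary_metrics` and the toy labels are hypothetical.

```python
import math


def binary_metrics(y_true, y_pred):
    """Return (balanced_accuracy, g_mean, f1) for binary labels in {0, 1}.

    Standard definitions (assumed, not taken from the paper):
      sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)
      balanced accuracy = (sensitivity + specificity) / 2
      G-mean = sqrt(sensitivity * specificity)
      F1 = harmonic mean of precision and sensitivity
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0

    balanced_accuracy = (sensitivity + specificity) / 2
    g_mean = math.sqrt(sensitivity * specificity)
    if precision + sensitivity:
        f1 = 2 * precision * sensitivity / (precision + sensitivity)
    else:
        f1 = 0.0
    return balanced_accuracy, g_mean, f1


# Toy example: 3 positives and 5 negatives, one error in each class.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
print(binary_metrics(y_true, y_pred))
```

A macro F-measure, as reported in the abstract, would typically average such per-task F scores over all tasks; that averaging step is left out of this sketch.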
11
Verta JP, Jacobs A. The role of alternative splicing in adaptation and evolution. Trends Ecol Evol 2021; 37:299-308. [PMID: 34920907 DOI: 10.1016/j.tree.2021.11.010] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 08/19/2021] [Revised: 10/26/2021] [Accepted: 11/19/2021] [Indexed: 01/02/2023]
Abstract
Regulation of gene expression plays a central role in adaptive divergence and evolution. Although the role of gene regulation in microevolutionary processes is gaining wide acceptance, most studies have only investigated the evolution of transcript levels, ignoring the potentially significant role of transcript structures. We argue that variation in alternative splicing plays an important and widely unexplored role in adaptation (e.g., by increasing transcriptome and/or proteome diversity, or buffering potentially deleterious genetic variation). New studies increasingly highlight the potential for independent evolution in alternative splicing and transcript level, providing alternative paths for selection to act upon. We propose that alternative splicing and transcript levels can provide contrasting, nonredundant mechanisms of equal importance for adaptive diversification of gene function and regulation.
Affiliation(s)
- Jukka-Pekka Verta
- Organismal and Evolutionary Biology Research Programme, University of Helsinki, Viikinkaari 9, 00790, Helsinki, Finland.
- Arne Jacobs
- Institute of Biodiversity, Animal Health, and Comparative Medicine, University of Glasgow, G12 8QQ, Glasgow, UK.
12
Yu G, Zhou G, Zhang X, Domeniconi C, Guo M. DMIL-IsoFun: predicting isoform function using deep multi-instance learning. Bioinformatics 2021; 37:4818-4825. [PMID: 34282449 DOI: 10.1093/bioinformatics/btab532] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/13/2021] [Revised: 06/20/2021] [Accepted: 07/16/2021] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION Alternative splicing creates considerable proteomic diversity and complexity from a relatively limited genome. Proteoforms translated from alternatively spliced isoforms of a gene actually execute the biological functions of this gene, and so reflect the functional knowledge of genes at a finer granularity. Recently, some computational approaches have been proposed to differentiate isoform functions using sequence and expression data. However, their performance is far from satisfactory, mainly due to the imbalance and lack of annotations at the isoform level and the difficulty of modeling gene-isoform relations. RESULTS We propose a deep multi-instance learning based framework (DMIL-IsoFun) to differentiate the functions of isoforms. DMIL-IsoFun first introduces a multi-instance learning convolutional neural network, trained with isoform sequences and gene-level annotations, to extract the feature vectors and initialize the annotations of isoforms. It then uses a class-imbalance Graph Convolution Network to refine the annotations of individual isoforms based on the isoform co-expression network and the extracted features. Extensive experimental results show that DMIL-IsoFun improves the Smin and Fmax of state-of-the-art solutions by at least 29.6% and 40.8%. The effectiveness of DMIL-IsoFun is further confirmed on a testbed of human multiple-isoform genes and on maize isoforms related to photosynthesis. AVAILABILITY The code and data are available at http://www.sdu-idea.cn/codes.php?name=DMIL-Isofun. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
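The multi-instance learning setup mentioned in this abstract can be illustrated with a minimal sketch. Under the standard MIL assumption, a gene is a "bag" whose isoforms are "instances": only the gene-level label is known, and the bag is positive if at least one instance is positive. The code below uses max-pooling over per-isoform scores and a bag-level cross-entropy loss. The function names and scores are hypothetical, and max-pooling is a common textbook choice, not necessarily the aggregation DMIL-IsoFun uses.

```python
import math

def bag_score(instance_scores):
    """Gene-level score under the standard MIL assumption:
    a gene has the function if at least one isoform does (max-pooling)."""
    if not instance_scores:
        raise ValueError("a gene (bag) needs at least one isoform (instance)")
    return max(instance_scores)

def bag_loss(instance_scores, gene_label):
    """Binary cross-entropy between the pooled score and the gene-level label."""
    eps = 1e-12  # clamp to avoid log(0)
    p = min(max(bag_score(instance_scores), eps), 1 - eps)
    return -(gene_label * math.log(p) + (1 - gene_label) * math.log(1 - p))

# Hypothetical per-isoform probabilities for one gene annotated with a function.
isoform_scores = [0.1, 0.7, 0.3]
print(bag_score(isoform_scores), bag_loss(isoform_scores, gene_label=1))
```

Because only the highest-scoring isoform drives the loss, training signal from gene-level labels flows to the isoform most likely responsible for the function, which is how isoform-level predictions can be learned without isoform-level annotations.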
Affiliation(s)
- Guoxian Yu
- School of Software, Shandong University, Jinan, 250101, China; College of Computer and Information Sciences, Southwest University, Chongqing, 400715, China; Computer, Electrical, and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, SA
- Guangjie Zhou
- School of Software, Shandong University, Jinan, 250101, China; College of Computer and Information Sciences, Southwest University, Chongqing, 400715, China
- Xiangliang Zhang
- Computer, Electrical, and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, SA
- Carlotta Domeniconi
- Department of Computer Science, George Mason University, Fairfax, 22030, USA
- Maozu Guo
- School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China
13
Aledavood E, Forte A, Estarellas C, Javier Luque F. Structural basis of the selective activation of enzyme isoforms: Allosteric response to activators of β1- and β2-containing AMPK complexes. Comput Struct Biotechnol J 2021; 19:3394-3406. [PMID: 34194666 PMCID: PMC8217686 DOI: 10.1016/j.csbj.2021.05.056] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/03/2021] [Revised: 05/30/2021] [Accepted: 05/30/2021] [Indexed: 12/21/2022] Open
Abstract
AMP-activated protein kinase (AMPK) is a key energy sensor regulating cell metabolism in response to energy supply and demand. The evolutionary adaptation of AMPK to different tissues is accomplished through the expression of distinct isoforms that can form up to 12 complexes, which exhibit notable differences in their sensitivity to allosteric activators. To shed light on the molecular determinants of the allosteric regulation of this energy sensor, we have examined the structural and dynamical properties of β1- and β2-containing AMPK complexes formed with small molecule activators A-769662 and SC4, and dissected the mechanical response leading to active-like enzyme conformations through the analysis of interaction networks between structural domains. The results reveal the mechanical sensitivity of the α2β1 complex, in contrast with a larger resilience of the α2β2 species, especially regarding modulation by A-769662. Furthermore, binding of activators to α2β1 consistently promotes the pre-organization of the ATP-binding site, favoring the adoption of activated states of the enzyme. These findings are discussed in light of the changes in the residue content of β-subunit isoforms, particularly regarding the β1Asn111 → β2Asp111 substitution as a key factor in modulating the mechanical sensitivity of β1- and β2-containing AMPK complexes. Our studies pave the way for the design of activators tailored for improving the therapeutic treatment of tissue-specific metabolic disorders.
Affiliation(s)
- Alessia Forte
- Department of Nutrition, Food Science and Gastronomy, Faculty of Pharmacy and Food Sciences, Institute of Biomedicine (IBUB) and Institute of Theoretical and Computational Chemistry (IQTCUB), University of Barcelona, Av. Prat de la Riba 171, Santa Coloma de Gramenet 08921, Spain
14
Li HD, Yang C, Zhang Z, Yang M, Wu FX, Omenn GS, Wang J. IsoResolve: predicting splice isoform functions by integrating gene and isoform-level features with domain adaptation. Bioinformatics 2021; 37:522-530. [PMID: 32966552 PMCID: PMC8088322 DOI: 10.1093/bioinformatics/btaa829] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/04/2020] [Revised: 08/12/2020] [Accepted: 09/09/2020] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION High resolution annotation of gene functions is a central goal in functional genomics. A single gene may produce multiple isoforms with different functions through alternative splicing. Conventional approaches, however, consider a gene as a single entity without differentiating these functionally different isoforms. Towards understanding gene functions at higher resolution, recent efforts have focused on predicting the functions of isoforms. However, the performance of existing methods is far from satisfactory mainly because of the lack of isoform-level functional annotation. RESULTS We present IsoResolve, a novel approach for isoform function prediction, which leverages the information from gene function prediction models with domain adaptation (DA). IsoResolve treats gene-level and isoform-level features as source and target domains, respectively. It uses DA to project the two domains into a latent variable space in such a way that the latent variables from the two domains have similar distribution, which enables the gene domain information to be leveraged for isoform function prediction. We systematically evaluated the performance of IsoResolve in predicting functions. Compared with five state-of-the-art methods, IsoResolve achieved significantly better performance. IsoResolve was further validated by case studies of genes with isoform-level functional annotation. AVAILABILITY AND IMPLEMENTATION IsoResolve is freely available at https://github.com/genemine/IsoResolve. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hong-Dong Li
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering
- Changhuo Yang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering
- Zhimin Zhang
- College of Chemistry and Chemical Engineering, Central South University, Changsha, Hunan 410083, China
- Mengyun Yang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering
- Fang-Xiang Wu
- Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK S7N5A9, Canada
- Gilbert S Omenn
- Institute for Systems Biology, Seattle, WA 98101, USA; Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI 48109-2218, USA
- Jianxin Wang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering
15
Yu G, Zeng J, Wang J, Zhang H, Zhang X, Guo M. Imbalance deep multi‐instance learning for predicting isoform–isoform interactions. Int J Intell Syst 2021. [DOI: 10.1002/int.22402] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/15/2022]
Affiliation(s)
- Guoxian Yu
- School of Software Shandong University Jinan China
- College of Computer and Information Science Southwest University Chongqing China
- Joint SDU‐NTU Centre for Artificial Intelligence Research Shandong University Jinan China
- Jie Zeng
- College of Computer and Information Science Southwest University Chongqing China
- Jun Wang
- College of Computer and Information Science Southwest University Chongqing China
- Joint SDU‐NTU Centre for Artificial Intelligence Research Shandong University Jinan China
- Hong Zhang
- College of Computer and Information Science Southwest University Chongqing China
- Xiangliang Zhang
- CEMSE King Abdullah University of Science and Technology Thuwal Saudi Arabia
- Maozu Guo
- School of Electrical and Information Engineering Beijing University of Civil Engineering and Architecture Beijing China
16
Chen H, Shaw D, Bu D, Jiang T. FINER: enhancing the prediction of tissue-specific functions of isoforms by refining isoform interaction networks. NAR Genom Bioinform 2021; 3:lqab057. [PMID: 34169280 PMCID: PMC8219044 DOI: 10.1093/nargab/lqab057] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/04/2021] [Revised: 05/18/2021] [Accepted: 06/03/2021] [Indexed: 12/24/2022] Open
Abstract
Annotating the functions of gene products is a mainstay in biology. A variety of databases have been established to record functional knowledge at the gene level. However, functional annotations at the isoform resolution are in great demand in many biological applications. Although critical information in biological processes such as protein-protein interactions (PPIs) is often used to study gene functions, it does not directly help differentiate the functions of isoforms, as the 'proteins' in the existing PPIs generally refer to 'genes'. On the other hand, the prediction of isoform functions and prediction of isoform-isoform interactions, though inherently intertwined, have so far been treated as independent computational problems in the literature. Here, we present FINER, a unified framework to jointly predict isoform functions and refine PPIs from the gene level to the isoform level, enabling both tasks to benefit from each other. Extensive computational experiments on human tissue-specific data demonstrate that FINER is able to gain at least 5.16% in AUC and 15.1% in AUPRC for functional prediction across multiple tissues by refining noisy PPIs, resulting in significant improvement over the state-of-the-art methods. Some in-depth analyses reveal consistency between FINER's predictions and the tissue specificity as well as subcellular localization of isoforms.
Affiliation(s)
- Hao Chen
- Department of Computer Science and Engineering, University of California, Riverside, CA 92521, USA
- Dipan Shaw
- Department of Computer Science and Engineering, University of California, Riverside, CA 92521, USA
- Dongbo Bu
- Key Lab of Intelligent Information Process, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Tao Jiang
- Department of Computer Science and Engineering, University of California, Riverside, CA 92521, USA
- Bioinformatics Division, BNRIST/Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
17
Pozo F, Martinez-Gomez L, Walsh TA, Rodriguez JM, Di Domenico T, Abascal F, Vazquez J, Tress ML. Assessing the functional relevance of splice isoforms. NAR Genom Bioinform 2021; 3:lqab044. [PMID: 34046593 PMCID: PMC8140736 DOI: 10.1093/nargab/lqab044] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 11/27/2020] [Revised: 04/22/2021] [Accepted: 05/17/2021] [Indexed: 12/20/2022] Open
Abstract
Alternative splicing of messenger RNA can generate an array of mature transcripts, but it is not clear how many go on to produce functionally relevant protein isoforms. There is only limited evidence for alternative proteins in proteomics analyses and data from population genetic variation studies indicate that most alternative exons are evolving neutrally. Determining which transcripts produce biologically important isoforms is key to understanding isoform function and to interpreting the real impact of somatic mutations and germline variations. Here we have developed a method, TRIFID, to classify the functional importance of splice isoforms. TRIFID was trained on isoforms detected in large-scale proteomics analyses and distinguishes these biologically important splice isoforms with high confidence. Isoforms predicted as functionally important by the algorithm had measurable cross species conservation and significantly fewer broken functional domains. Additionally, exons that code for these functionally important protein isoforms are under purifying selection, while exons from low scoring transcripts largely appear to be evolving neutrally. TRIFID has been developed for the human genome, but it could in principle be applied to other well-annotated species. We believe that this method will generate valuable insights into the cellular importance of alternative splicing.
Affiliation(s)
- Fernando Pozo
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Laura Martinez-Gomez
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Thomas A Walsh
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- José Manuel Rodriguez
- Cardiovascular Proteomics Laboratory, Centro Nacional de Investigaciones Cardiovasculares (CNIC), Madrid, Spain
- Tomas Di Domenico
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Federico Abascal
- Somatic Evolution Group, Wellcome Sanger Institute, Hinxton CB10 1SA, UK
- Jesús Vazquez
- Cardiovascular Proteomics Laboratory, Centro Nacional de Investigaciones Cardiovasculares (CNIC), Madrid, Spain
- Michael L Tress
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
18
Huang Q, Wang J, Zhang X, Guo M, Yu G. IsoDA: Isoform-Disease Association Prediction by Multiomics Data Fusion. J Comput Biol 2021; 28:804-819. [PMID: 33826865 DOI: 10.1089/cmb.2020.0626] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/23/2022] Open
Abstract
A gene can be spliced into different isoforms by alternative splicing, which contributes to the functional diversity of protein species. Computational prediction of gene-disease associations (GDAs) has been studied for decades. However, identifying isoform-disease associations (IDAs) at a large scale, which can decipher pathology at a more granular level, is rarely explored. The main bottlenecks are the lack of IDAs in current databases and the difficulty of multilevel omics data fusion. To bridge this gap, we propose a computational approach called Isoform-Disease Association prediction by multiomics data fusion (IsoDA) to predict IDAs. Based on the relationship between a gene and its spliced isoforms, IsoDA first introduces a dispatch and aggregation term to dispatch gene-disease associations to individual isoforms and, in reverse, aggregate these dispatched associations to their hosting genes. At the same time, it fuses genome, transcriptome, and proteome data by joint matrix factorization to improve the prediction of IDAs. Experimental results show that IsoDA significantly outperforms related state-of-the-art methods at both the gene level and the isoform level. A case study further shows that IsoDA credibly identifies three isoforms spliced from apolipoprotein E, which have individual associations with Alzheimer's disease, and two isoforms spliced from vascular endothelial growth factor A, which have different associations with coronary heart disease. The code of IsoDA is available at http://mlda.swu.edu.cn/codes.php?name=IsoDA.
Affiliation(s)
- Qiuyue Huang
- College of Computer and Information Science, Southwest University, Chongqing, China; School of Software, Shandong University, Jinan, China
- Jun Wang
- School of Software, Shandong University, Jinan, China
- Xiangliang Zhang
- Department of Computer Science, Computer, Electrical and Mathematical Science and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Maozu Guo
- Department of Computer Science, College of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China
- Guoxian Yu
- College of Computer and Information Science, Southwest University, Chongqing, China; School of Software, Shandong University, Jinan, China; Department of Computer Science, Computer, Electrical and Mathematical Science and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
19
Shaw D, Chen H, Xie M, Jiang T. DeepLPI: a multimodal deep learning method for predicting the interactions between lncRNAs and protein isoforms. BMC Bioinformatics 2021; 22:24. [PMID: 33461501 PMCID: PMC7814738 DOI: 10.1186/s12859-020-03914-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/15/2020] [Accepted: 11/30/2020] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND Long non-coding RNAs (lncRNAs) regulate diverse biological processes via interactions with proteins. Since the experimental methods to identify these interactions are expensive and time-consuming, many computational methods have been proposed. Although these computational methods have achieved promising prediction performance, they neglect the fact that a gene may encode multiple protein isoforms and different isoforms of the same gene may interact differently with the same lncRNA. RESULTS In this study, we propose a novel method, DeepLPI, for predicting the interactions between lncRNAs and protein isoforms. Our method uses sequence and structure data to extract intrinsic features and expression data to extract topological features. To combine these different data, we adopt a hybrid framework that integrates a multimodal deep learning neural network and a conditional random field. To overcome the lack of known interactions between lncRNAs and protein isoforms, we apply a multiple instance learning (MIL) approach. In our experiment concerning the human lncRNA-protein interactions in the NPInter v3.0 database, DeepLPI improved the prediction performance by 4.7% in terms of AUC and 5.9% in terms of AUPRC over the state-of-the-art methods. Our further correlation analyses between interacting lncRNAs and protein isoforms also illustrated that their co-expression information helped predict the interactions. Finally, we give some examples where DeepLPI was able to outperform the other methods in predicting mouse lncRNA-protein interactions and novel human lncRNA-protein interactions. CONCLUSION Our results demonstrated that the use of isoforms and MIL contributed significantly to the improvement of performance in predicting lncRNA and protein interactions. We believe that such an approach would find more applications in predicting other functional roles of RNAs and proteins.
Affiliation(s)
- Dipan Shaw
- Department of Computer Science and Engineering, University of California, Riverside, CA 92521 USA
- Hao Chen
- Department of Computer Science and Engineering, University of California, Riverside, CA 92521 USA
- Minzhu Xie
- College of Information Science and Engineering, Hunan Normal University, Changsha, China
- Tao Jiang
- Department of Computer Science and Engineering, University of California, Riverside, CA 92521 USA
- Bioinformatics Division, BNRIST/Department of Computer Science and Technology, Tsinghua University, Beijing, China
20
Brunet MA, Lucier JF, Levesque M, Leblanc S, Jacques JF, Al-Saedi HRH, Guilloy N, Grenier F, Avino M, Fournier I, Salzet M, Ouangraoua A, Scott M, Boisvert FM, Roucou X. OpenProt 2021: deeper functional annotation of the coding potential of eukaryotic genomes. Nucleic Acids Res 2021; 49:D380-D388. [PMID: 33179748 PMCID: PMC7779043 DOI: 10.1093/nar/gkaa1036] [Citation(s) in RCA: 57] [Impact Index Per Article: 19.0] [Received: 09/14/2020] [Revised: 10/15/2020] [Accepted: 10/16/2020] [Indexed: 12/12/2022] Open
Abstract
OpenProt (www.openprot.org) is the first proteogenomic resource supporting a polycistronic annotation model for eukaryotic genomes. It provides a deeper annotation of open reading frames (ORFs) while mining experimental data for supporting evidence using cutting-edge algorithms. This update presents the major improvements since the initial release of OpenProt. All species support recent NCBI RefSeq and Ensembl annotations, with changes in annotations being reported in OpenProt. Using the 131 ribosome profiling datasets re-analysed by OpenProt to date, non-AUG initiation starts are reported alongside a confidence score of the initiating codon. From the 177 mass spectrometry datasets re-analysed by OpenProt to date, the unicity of the detected peptides is controlled at each implementation. Furthermore, to guide the users, detectability statistics and protein relationships (isoforms) are now reported for each protein. Finally, to foster access to deeper ORF annotation independently of one's bioinformatics skills or computational resources, OpenProt now offers a data analysis platform. Users can submit their dataset for analysis and receive the results from the analysis by OpenProt. All data on OpenProt are freely available and downloadable for each species, the release-based format ensuring a continuous access to the data. Thus, OpenProt enables a more comprehensive annotation of eukaryotic genomes and fosters functional proteomic discoveries.
Affiliation(s)
- Marie A Brunet
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université Laval, Quebec City, QC G1V0A6, Canada
- Jean-François Lucier
- Center for Computational Science, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
- Biology Department, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
- Maxime Levesque
- Center for Computational Science, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
- Biology Department, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
- Sébastien Leblanc
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université Laval, Quebec City, QC G1V0A6, Canada
- Jean-Francois Jacques
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université Laval, Quebec City, QC G1V0A6, Canada
- Hassan R H Al-Saedi
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- Noé Guilloy
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université Laval, Quebec City, QC G1V0A6, Canada
- Frederic Grenier
- Center for Computational Science, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
- Biology Department, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
- Mariano Avino
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- Isabelle Fournier
- INSERM U1192, Laboratoire Protéomique, Réponse Inflammatoire & Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France
- Michel Salzet
- INSERM U1192, Laboratoire Protéomique, Réponse Inflammatoire & Spectrométrie de Masse (PRISM), Université de Lille, F-59000 Lille, France
- Aïda Ouangraoua
- Informatics Department, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada
- Michelle S Scott
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- François-Michel Boisvert
- Department of Immunology and Cellular Biology, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
- Xavier Roucou
- Department of Biochemistry and Functional Genomics, Université de Sherbrooke, 3201 Jean Mignault, Sherbrooke, QC J1E 4K8, Canada
- PROTEO, Quebec Network for Research on Protein Function, Structure, and Engineering, Université Laval, Quebec City, QC G1V0A6, Canada
21
Sciarrillo R, Wojtuszkiewicz A, Assaraf YG, Jansen G, Kaspers GJL, Giovannetti E, Cloos J. The role of alternative splicing in cancer: From oncogenesis to drug resistance. Drug Resist Updat 2020; 53:100728. [PMID: 33070093 DOI: 10.1016/j.drup.2020.100728] [Citation(s) in RCA: 110] [Impact Index Per Article: 27.5] [Received: 08/28/2020] [Revised: 09/17/2020] [Accepted: 09/21/2020] [Indexed: 12/15/2022]
Abstract
Alternative splicing is a tightly regulated process whereby non-coding sequences of pre-mRNA are removed and protein-coding segments are assembled in diverse combinations, ultimately giving rise to proteins with distinct or even opposing functions. In the past decade, whole genome/transcriptome sequencing studies revealed the high complexity of splicing regulation, which occurs co-transcriptionally and is influenced by chromatin status and mRNA modifications. Consequently, splicing profiles of both healthy and malignant cells display high diversity and alternative splicing was shown to be widely deregulated in multiple cancer types. In particular, mutations in pre-mRNA regulatory sequences, splicing regulators and chromatin modifiers, as well as differential expression of splicing factors are important contributors to cancer pathogenesis. It has become clear that these aberrations contribute to many facets of cancer, including oncogenic transformation, cancer progression, response to anticancer drug treatment as well as resistance to therapy. In this respect, alternative splicing was shown to perturb the expression of a broad spectrum of relevant genes involved in drug uptake/metabolism (i.e. SLC29A1, dCK, FPGS, and TP), activation of nuclear receptor pathways (i.e. GR, AR), regulation of apoptosis (i.e. MCL1, BCL-X, and FAS) and modulation of response to immunotherapy (CD19). Furthermore, aberrant splicing constitutes an important source of novel cancer biomarkers and the spliceosome machinery represents an attractive target for a novel and rapidly expanding class of therapeutic agents. Small molecule inhibitors targeting SF3B1 or splice factor kinases were highly cytotoxic against a wide range of cancer models, including drug-resistant cells. Importantly, these effects are enhanced in specific cancer subsets, such as splicing factor-mutated and c-MYC-driven tumors. Furthermore, pre-clinical studies report synergistic effects of spliceosome modulators in combination with conventional antitumor agents. These strategies based on the use of low dose splicing modulators could shift the therapeutic window towards decreased toxicity in healthy tissues. Here we provide an extensive overview of the latest findings in the field of regulation of splicing in cancer, including molecular mechanisms by which cancer cells harness alternative splicing to drive oncogenesis and evade anticancer drug treatment as well as splicing-based vulnerabilities that can provide novel treatment opportunities. Furthermore, we discuss current challenges arising from genome-wide detection and prediction methods of aberrant splicing, as well as unravelling functional relevance of the plethora of cancer-related splicing alterations.
Collapse
Affiliation(s)
- Rocco Sciarrillo
- Department of Hematology, Amsterdam UMC, VU University Medical Center, Cancer Center Amsterdam, Amsterdam, Netherlands; Department of Pediatric Oncology, Emma's Children's Hospital, Amsterdam UMC, VU University Medical Center, Cancer Center Amsterdam, Amsterdam, Netherlands; Department of Medical Oncology, Amsterdam UMC, VU University Medical Center, Cancer Center Amsterdam, Amsterdam, Netherlands
- Anna Wojtuszkiewicz
- Department of Hematology, Amsterdam UMC, VU University Medical Center, Cancer Center Amsterdam, Amsterdam, Netherlands
- Yehuda G Assaraf
- The Fred Wyszkowski Cancer Research Laboratory, Department of Biology, Technion-Israel Institute of Technology, Haifa 3200003, Israel
- Gerrit Jansen
- Amsterdam Immunology and Rheumatology Center, Amsterdam UMC, VU University Medical Center, Cancer Center Amsterdam, Amsterdam, Netherlands
- Gertjan J L Kaspers
- Department of Pediatric Oncology, Emma's Children's Hospital, Amsterdam UMC, VU University Medical Center, Cancer Center Amsterdam, Amsterdam, Netherlands; Princess Máxima Center for Pediatric Oncology, Utrecht, Netherlands
- Elisa Giovannetti
- Department of Medical Oncology, Amsterdam UMC, VU University Medical Center, Cancer Center Amsterdam, Amsterdam, Netherlands; Fondazione Pisana per la Scienza, Pisa, Italy
- Jacqueline Cloos
- Department of Hematology, Amsterdam UMC, VU University Medical Center, Cancer Center Amsterdam, Amsterdam, Netherlands
|
22
|
Mishra SK, Muthye V, Kandoi G. Computational Methods for Predicting Functions at the mRNA Isoform Level. Int J Mol Sci 2020; 21:ijms21165686. [PMID: 32784445 PMCID: PMC7460821 DOI: 10.3390/ijms21165686] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/13/2020] [Revised: 08/05/2020] [Accepted: 08/06/2020] [Indexed: 11/16/2022] Open
Abstract
Multiple mRNA isoforms of the same gene are produced via alternative splicing, a biological mechanism that regulates protein diversity while maintaining genome size. Alternatively spliced mRNA isoforms of the same gene may sometimes have very similar sequences, but they can have significantly diverse effects on cellular function and regulation. The products of alternative splicing have important and diverse functional roles, such as in the response to environmental stress, the regulation of gene expression, and human heritable and plant diseases. The mRNA isoforms of the same gene can have dramatically different functions. Despite the functional importance of mRNA isoforms, very little has been done to annotate their functions. Recent years have, however, seen the development of several computational methods aimed at predicting biological functions at the mRNA isoform level. These methods use a wide array of proteo-genomic data to develop machine learning-based mRNA isoform function prediction tools. In this review, we discuss the computational methods developed for predicting biological function at the individual mRNA isoform level.
|
23
|
Isoform-Disease Association Prediction by Data Fusion. BIOINFORMATICS RESEARCH AND APPLICATIONS 2020. [DOI: 10.1007/978-3-030-57821-3_5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/22/2022]
|
24
|
Shi Q, Chen W, Huang S, Wang Y, Xue Z. Deep learning for mining protein data. Brief Bioinform 2019; 22:194-218. [PMID: 31867611 DOI: 10.1093/bib/bbz156] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2019] [Revised: 10/21/2019] [Accepted: 11/07/2019] [Indexed: 01/16/2023] Open
Abstract
The recent emergence of deep learning to characterize complex patterns of protein big data reveals its potential to address the classic challenges in the field of protein data mining. Much research has revealed the promise of deep learning as a powerful tool to transform protein big data into valuable knowledge, leading to scientific discoveries and practical solutions. In this review, we summarize recent publications on deep learning predictive approaches in the field of mining protein data. The application architectures of these methods include multilayer perceptrons, stacked autoencoders, deep belief networks, two- or three-dimensional convolutional neural networks, recurrent neural networks, graph neural networks, and complex neural networks and are described from five perspectives: residue-level prediction, sequence-level prediction, three-dimensional structural analysis, interaction prediction, and mass spectrometry data mining. The advantages and deficiencies of these architectures are presented in relation to various tasks in protein data mining. Additionally, some practical issues and their future directions are discussed, such as robust deep learning for protein noisy data, architecture optimization for specific tasks, efficient deep learning for limited protein data, multimodal deep learning for heterogeneous protein data, and interpretable deep learning for protein understanding. This review provides comprehensive perspectives on general deep learning techniques for protein data analysis.
Affiliation(s)
- Qiang Shi
- School of Software Engineering, Huazhong University of Science and Technology. His main interests cover machine learning especially deep learning, protein data analysis, and big data mining
- Weiya Chen
- School of Software Engineering, Huazhong University of Science & Technology, Wuhan, China. His research interests cover bioinformatics, virtual reality, and data visualization
- Siqi Huang
- Software Engineering at Huazhong University of Science and Technology, focusing on machine learning and data mining
- Yan Wang
- School of life, University of Science & Technology; her main interests cover protein structure and function prediction and big data mining
- Zhidong Xue
- School of Software Engineering, Huazhong University of Science & Technology, Wuhan, China. His research interests cover bioinformatics, machine learning, and image processing
|