1
Hegde M, Girisa S, Kunnumakkara AB. A compilation of bioinformatic approaches to identify novel downstream targets for the detection and prophylaxis of cancer. Advances in Protein Chemistry and Structural Biology 2023; 134:75-113. [PMID: 36858743 DOI: 10.1016/bs.apcsb.2022.11.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/22/2023]
Abstract
The paradigm of cancer genomics has been radically changed by developments in next-generation sequencing (NGS) technologies, making it possible to envisage individualized treatment based on tumor and stromal cell genomes in a clinical setting within a short timeframe. The abundance of data has led to new avenues for studying coordinated alterations that impair biological processes, which in turn has increased the demand for bioinformatic tools for pathway analysis. Most of this work has concentrated on optimizing certain algorithms to obtain quicker and more accurate results, yet large volumes of the resulting algorithm-based data remain difficult for biologists and clinicians to access, download and reanalyze. In the present study, we have listed the bioinformatics algorithms and user-friendly graphical user interface (GUI) tools that enable code-independent analysis of big data without compromising quality or time. We have also described the advantages and drawbacks of each of these platforms. Additionally, we emphasize the importance of creating new, more user-friendly solutions to provide better access to open data, and discuss relevant problems such as data sharing and patient privacy.
Affiliation(s)
- Mangala Hegde
  - Cancer Biology Laboratory, Department of Biosciences and Bioengineering, Indian Institute of Technology (IIT) Guwahati, Guwahati, Assam, India
- Sosmitha Girisa
  - Cancer Biology Laboratory, Department of Biosciences and Bioengineering, Indian Institute of Technology (IIT) Guwahati, Guwahati, Assam, India
- Ajaikumar B Kunnumakkara
  - Cancer Biology Laboratory, Department of Biosciences and Bioengineering, Indian Institute of Technology (IIT) Guwahati, Guwahati, Assam, India
2
Rodriguez JM, Pozo F, Cerdán-Vélez D, Di Domenico T, Vázquez J, Tress M. APPRIS: selecting functionally important isoforms. Nucleic Acids Res 2022; 50:D54-D59. [PMID: 34755885 PMCID: PMC8728124 DOI: 10.1093/nar/gkab1058] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 09/15/2021] [Revised: 10/14/2021] [Accepted: 10/20/2021] [Indexed: 12/20/2022]
Abstract
APPRIS (https://appris.bioinfo.cnio.es) is a well-established database housing annotations for protein isoforms for a range of species. APPRIS selects principal isoforms based on protein structure and function features and on cross-species conservation. Most coding genes produce a single main protein isoform and the principal isoforms chosen by the APPRIS database best represent this main cellular isoform. Human genetic data, experimental protein evidence and the distribution of clinical variants all support the relevance of APPRIS principal isoforms. APPRIS annotations and principal isoforms have now been expanded to 10 model organisms. In this paper we highlight the most recent updates to the database. APPRIS annotations have been generated for two new species, cow and chicken, the protein structural information has been augmented with reliable models from the EMBL-EBI AlphaFold database, and we have substantially expanded the confirmatory proteomics evidence available for the human genome. The most significant change in APPRIS has been the implementation of TRIFID functional isoform scores. TRIFID functional scores are assigned to all splice isoforms, and APPRIS uses the TRIFID functional scores and proteomics evidence to determine principal isoforms when core methods cannot.
Affiliation(s)
- Jose Manuel Rodriguez
  - Cardiovascular Proteomics Laboratory, Centro Nacional de Investigaciones Cardiovasculares Carlos III (CNIC), 28029 Madrid, Spain
- Fernando Pozo
  - Bioinformatics Institute, Spanish National Cancer Research Centre (CNIO), Madrid, 28029, Spain
- Daniel Cerdán-Vélez
  - Bioinformatics Institute, Spanish National Cancer Research Centre (CNIO), Madrid, 28029, Spain
- Tomás Di Domenico
  - Bioinformatics Institute, Spanish National Cancer Research Centre (CNIO), Madrid, 28029, Spain
- Jesús Vázquez
  - Cardiovascular Proteomics Laboratory, Centro Nacional de Investigaciones Cardiovasculares Carlos III (CNIC), 28029 Madrid, Spain
  - CIBER de Enfermedades Cardiovasculares (CIBERCV), 28029 Madrid, Spain
- Michael L Tress
  - Bioinformatics Institute, Spanish National Cancer Research Centre (CNIO), Madrid, 28029, Spain
3
De Paoli-Iseppi R, Gleeson J, Clark MB. Isoform Age - Splice Isoform Profiling Using Long-Read Technologies. Front Mol Biosci 2021; 8:711733. [PMID: 34409069 PMCID: PMC8364947 DOI: 10.3389/fmolb.2021.711733] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/19/2021] [Accepted: 07/19/2021] [Indexed: 01/12/2023]
Abstract
Alternative splicing (AS) of RNA is a key mechanism that results in the expression of multiple transcript isoforms from single genes and leads to an increase in the complexity of both the transcriptome and proteome. Regulation of AS is critical for the correct functioning of many biological pathways, while disruption of AS can be directly pathogenic in diseases such as cancer or cause risk for complex disorders. Current short-read sequencing technologies achieve high read depth but are limited in their ability to resolve complex isoforms. In this review we examine how long-read sequencing (LRS) technologies can address this challenge by covering the entire RNA sequence in a single read and thereby distinguish isoform changes that could impact RNA regulation or protein function. Coupling LRS with technologies such as single cell sequencing, targeted sequencing and spatial transcriptomics is producing a rapidly expanding suite of technological approaches to profile alternative splicing at the isoform level with unprecedented detail. In addition, integrating LRS with genotype now allows the impact of genetic variation on isoform expression to be determined. Recent results demonstrate the potential of these techniques to elucidate the landscape of splicing, including in tissues such as the brain where AS is particularly prevalent. Finally, we also discuss how AS can impact protein function, potentially leading to novel therapeutic targets for a range of diseases.
Affiliation(s)
- Michael B. Clark
  - Centre for Stem Cell Systems, Department of Anatomy and Physiology, The University of Melbourne, Parkville, VIC, Australia
4
Sulakhe D, D'Souza M, Wang S, Balasubramanian S, Athri P, Xie B, Canzar S, Agam G, Gilliam TC, Maltsev N. Exploring the functional impact of alternative splicing on human protein isoforms using available annotation sources. Brief Bioinform 2020; 20:1754-1768. [PMID: 29931155 DOI: 10.1093/bib/bby047] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 01/19/2018] [Revised: 05/02/2018] [Indexed: 12/30/2022]
Abstract
In recent years, the emphasis of scientific inquiry has shifted from whole-genome analyses to an understanding of cellular responses specific to tissue, developmental stage or environmental conditions. One of the central mechanisms underlying the diversity and adaptability of the contextual responses is alternative splicing (AS). It enables a single gene to encode multiple isoforms with distinct biological functions. However, to date, the functions of the vast majority of differentially spliced protein isoforms are not known. Integration of genomic, proteomic, functional, phenotypic and contextual information is essential for supporting isoform-based modeling and analysis. Such integrative proteogenomics approaches promise to provide insights into the functions of the alternatively spliced protein isoforms and provide high-confidence hypotheses to be validated experimentally. This manuscript provides a survey of the public databases supporting isoform-based biology. It also presents an overview of the potential global impact of AS on the human canonical gene functions, molecular interactions and cellular pathways.
Affiliation(s)
- Dinanath Sulakhe
  - Department of Human Genetics, University of Chicago, 920 E. 58th Street, Chicago, IL, USA; Computation Institute, University of Chicago, 5735 S. Ellis Avenue, Chicago, IL, USA
- Mark D'Souza
  - Department of Human Genetics, University of Chicago, 920 E. 58th Street, Chicago, IL, USA
- Sheng Wang
  - Department of Human Genetics, University of Chicago, 920 E. 58th Street, Chicago, IL, USA; Toyota Technological Institute at Chicago, 6045 S. Kenwood Avenue, Chicago, IL, USA
- Sandhya Balasubramanian
  - Department of Human Genetics, University of Chicago, 920 E. 58th Street, Chicago, IL, USA; Genentech, Inc. 1 DNA Way, Mail Stop: 35-6J, South San Francisco, CA, USA
- Prashanth Athri
  - Department of Computer Science and Engineering, Amrita School of Engineering, Bengaluru, Amrita Vishwa Vidyapeetham, Kasavanahalli, Carmelaram P.O., Bengaluru, Karnataka, India
- Bingqing Xie
  - Department of Human Genetics, University of Chicago, 920 E. 58th Street, Chicago, IL, USA; Department of Computer Science, Illinois Institute of Technology, Chicago, IL, USA
- Stefan Canzar
  - Toyota Technological Institute at Chicago, 6045 S. Kenwood Avenue, Chicago, IL, USA; Gene Center, Ludwig-Maximilians-Universität München, Munich, Germany
- Gady Agam
  - Department of Computer Science, Illinois Institute of Technology, Chicago, IL, USA
- T Conrad Gilliam
  - Department of Human Genetics, University of Chicago, 920 E. 58th Street, Chicago, IL, USA; Computation Institute, University of Chicago, 5735 S. Ellis Avenue, Chicago, IL, USA
- Natalia Maltsev
  - Department of Human Genetics, University of Chicago, 920 E. 58th Street, Chicago, IL, USA; Computation Institute, University of Chicago, 5735 S. Ellis Avenue, Chicago, IL, USA
5
Rodriguez JM, Rodriguez-Rivas J, Di Domenico T, Vázquez J, Valencia A, Tress ML. APPRIS 2017: principal isoforms for multiple gene sets. Nucleic Acids Res 2019; 46:D213-D217. [PMID: 29069475 PMCID: PMC5753224 DOI: 10.1093/nar/gkx997] [Citation(s) in RCA: 86] [Impact Index Per Article: 17.2] [Received: 09/15/2017] [Accepted: 10/19/2017] [Indexed: 01/23/2023]
Abstract
The APPRIS database (http://appris-tools.org) uses protein structural and functional features and information from cross-species conservation to annotate splice isoforms in protein-coding genes. APPRIS selects a single protein isoform, the 'principal' isoform, as the reference for each gene based on these annotations. A single main splice isoform reflects the biological reality for most protein-coding genes, and APPRIS principal isoforms are the best predictors of these main protein isoforms. Here, we present the updates to the database, new developments that include the addition of three new species (chimpanzee, Drosophila melanogaster and Caenorhabditis elegans), the expansion of APPRIS to cover the RefSeq gene set and the UniProtKB proteome for six species, and refinements in the core methods that make up the annotation pipeline. In addition, APPRIS now provides a measure of reliability for individual principal isoforms and updates with each release of the GENCODE/Ensembl and RefSeq reference sets. The individual GENCODE/Ensembl, RefSeq and UniProtKB reference gene sets for six organisms have been merged to produce common sets of splice variants.
Affiliation(s)
- Jose Manuel Rodriguez
  - Spanish National Bioinformatics Institute (INB), Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
- Juan Rodriguez-Rivas
  - Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
- Tomás Di Domenico
  - Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
- Jesús Vázquez
  - Cardiovascular Proteomics Laboratory, Centro Nacional de Investigaciones Cardiovasculares Carlos III (CNIC), 28029 Madrid, Spain; CIBER de Enfermedades Cardiovasculares (CIBERCV), 28029 Madrid, Spain
- Alfonso Valencia
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona E-08010, Spain; Life Sciences Department, Barcelona Supercomputing Centre (BSC-CNS), Barcelona E-08034, Spain
- Michael L Tress
  - Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain
6
Tapial J, Ha KCH, Sterne-Weiler T, Gohr A, Braunschweig U, Hermoso-Pulido A, Quesnel-Vallières M, Permanyer J, Sodaei R, Marquez Y, Cozzuto L, Wang X, Gómez-Velázquez M, Rayon T, Manzanares M, Ponomarenko J, Blencowe BJ, Irimia M. An atlas of alternative splicing profiles and functional associations reveals new regulatory programs and genes that simultaneously express multiple major isoforms. Genome Res 2017; 27:1759-1768. [PMID: 28855263 PMCID: PMC5630039 DOI: 10.1101/gr.220962.117] [Citation(s) in RCA: 245] [Impact Index Per Article: 35.0] [Received: 01/23/2017] [Accepted: 08/09/2017] [Indexed: 12/29/2022]
Abstract
Alternative splicing (AS) generates remarkable regulatory and proteomic complexity in metazoans. However, the functions of most AS events are not known, and programs of regulated splicing remain to be identified. To address these challenges, we describe the Vertebrate Alternative Splicing and Transcription Database (VastDB), the largest resource of genome-wide, quantitative profiles of AS events assembled to date. VastDB provides readily accessible quantitative information on the inclusion levels and functional associations of AS events detected in RNA-seq data from diverse vertebrate cell and tissue types, as well as developmental stages. The VastDB profiles reveal extensive new intergenic and intragenic regulatory relationships among different classes of AS and previously unknown and conserved landscapes of tissue-regulated exons. Contrary to recent reports concluding that nearly all human genes express a single major isoform, VastDB provides evidence that at least 48% of multiexonic protein-coding genes express multiple splice variants that are highly regulated in a cell/tissue-specific manner, and that >18% of genes simultaneously express multiple major isoforms across diverse cell and tissue types. Isoforms encoded by the latter set of genes are generally coexpressed in the same cells and are often engaged by translating ribosomes. Moreover, they are encoded by genes that are significantly enriched in functions associated with transcriptional control, implying they may have an important and wide-ranging role in controlling cellular activities. VastDB thus provides an unprecedented resource for investigations of AS function and regulation.
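The "inclusion levels" mentioned in this abstract are commonly reported as percent spliced in (PSI). A minimal sketch of that common definition follows; the function name, the tissue labels and the read counts are hypothetical and purely illustrative, and the simple ratio shown here is an assumption that does not reproduce whatever normalization the VastDB pipeline applies.

```python
from typing import Optional


def percent_spliced_in(inclusion_reads: int, exclusion_reads: int) -> Optional[float]:
    """Return PSI (0-100) for one splicing event, or None if no reads cover it.

    Assumes the simple ratio inclusion / (inclusion + exclusion); real
    pipelines usually add read-length and junction-count normalization.
    """
    if inclusion_reads < 0 or exclusion_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return None  # event not covered in this sample
    return 100.0 * inclusion_reads / total


# Hypothetical (inclusion, exclusion) read counts for one exon in three tissues.
counts = {"brain": (90, 10), "liver": (20, 80), "muscle": (0, 0)}
psi_by_tissue = {
    tissue: percent_spliced_in(inc, exc) for tissue, (inc, exc) in counts.items()
}
```

Returning `None` for uncovered events, rather than 0, keeps "exon skipped" distinct from "no data", which matters when comparing profiles across tissues.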
Affiliation(s)
- Javier Tapial
  - EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
- Kevin C H Ha
  - Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- André Gohr
  - EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
- Antonio Hermoso-Pulido
  - EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
- Mathieu Quesnel-Vallières
  - Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Jon Permanyer
  - EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
- Reza Sodaei
  - EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
- Yamile Marquez
  - EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
- Luca Cozzuto
  - EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
- Xinchen Wang
  - Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Melisa Gómez-Velázquez
  - Centro Nacional de Investigaciones Cardiovasculares Carlos III (CNIC), 28029 Madrid, Spain
- Teresa Rayon
  - Centro Nacional de Investigaciones Cardiovasculares Carlos III (CNIC), 28029 Madrid, Spain
- Miguel Manzanares
  - Centro Nacional de Investigaciones Cardiovasculares Carlos III (CNIC), 28029 Madrid, Spain
- Julia Ponomarenko
  - EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
- Manuel Irimia
  - EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
7
Xing Y, Zhao X, Yu T, Liang D, Li J, Wei G, Liu G, Cui X, Zhao H, Cai L. MiasDB: A Database of Molecular Interactions Associated with Alternative Splicing of Human Pre-mRNAs. PLoS One 2016; 11:e0155443. [PMID: 27167218 PMCID: PMC4864242 DOI: 10.1371/journal.pone.0155443] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 11/19/2015] [Accepted: 04/28/2016] [Indexed: 12/21/2022]
Abstract
Alternative splicing (AS) is pervasive in human multi-exon genes and is a major contributor to expansion of the transcriptome and proteome diversity. The accurate recognition of alternative splice sites is regulated by information contained in networks of protein-protein and protein-RNA interactions. However, the mechanisms leading to splice site selection are not fully understood. Although numerous databases have been built to describe AS, molecular interaction databases associated with AS have only recently emerged. In this study, we present a new database, MiasDB, that provides a description of molecular interactions associated with human AS events. This database covers 938 interactions between human splicing factors, RNA elements, transcription factors, kinases and modified histones for 173 human AS events. Every entry includes the interaction partners, interaction type, experimental methods, AS type, tissue specificity or disease-relevant information, a simple description of the functionally tested interaction in the AS event and references. The database can be queried easily using a web server (http://47.88.84.236/Miasdb). We display some interaction figures for several genes. With this database, users can view the regulation network describing AS events for 12 given genes.
Affiliation(s)
- Yongqiang Xing
  - School of Life Science and Technology, Inner Mongolia University of Science and Technology, Baotou, 014010, China
- Xiujuan Zhao
  - School of Life Science and Technology, Inner Mongolia University of Science and Technology, Baotou, 014010, China
- Tao Yu
  - School of Science, Inner Mongolia University of Science and Technology, Baotou, 014010, China
- Dong Liang
  - School of Life Science and Technology, Inner Mongolia University of Science and Technology, Baotou, 014010, China
- Jun Li
  - School of Life Science and Technology, Inner Mongolia University of Science and Technology, Baotou, 014010, China
- Guanyun Wei
  - School of Life Science and Technology, Inner Mongolia University of Science and Technology, Baotou, 014010, China
- Guoqing Liu
  - School of Life Science and Technology, Inner Mongolia University of Science and Technology, Baotou, 014010, China
- Xiangjun Cui
  - School of Life Science and Technology, Inner Mongolia University of Science and Technology, Baotou, 014010, China
- Hongyu Zhao
  - School of Life Science and Technology, Inner Mongolia University of Science and Technology, Baotou, 014010, China
- Lu Cai
  - School of Life Science and Technology, Inner Mongolia University of Science and Technology, Baotou, 014010, China
8
Zhang Y. Network analysis reveals stage-specific changes in zebrafish embryo development using time course whole transcriptome profiling and prior biological knowledge. BioData Min 2015; 8:26. [PMID: 26322129 PMCID: PMC4552361 DOI: 10.1186/s13040-015-0057-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/24/2014] [Accepted: 07/30/2015] [Indexed: 11/10/2022]
Abstract
BACKGROUND Molecular networks act as the backbone of molecular activities within cells, offering a unique opportunity to better understand the mechanism of diseases. While network data usually constitute only static network maps, integrating them with time course gene expression information can provide clues to the dynamic features of these networks and unravel the mechanistic driver genes characterizing cellular responses. Time course gene expression data allow us to broadly "watch" the dynamics of the system. However, one challenge in the analysis of such data is to establish and characterize the interplay among genes that are altered at different time points in the context of a biological process or functional category. Integrative analysis of these data sources will lead us to a more complete understanding of how biological entities (e.g., genes and proteins) coordinately perform their biological functions in biological systems. RESULTS In this paper, we introduced a novel network-based approach to extract functional knowledge from time-dependent biological processes at a system level using time course mRNA sequencing data in zebrafish embryo development. The proposed method was applied to investigate 1α, 25(OH)2D3-altered mechanisms in zebrafish embryo development. We applied the proposed method to a public zebrafish time course mRNA-Seq dataset, containing two different treatments along four time points. We constructed networks between gene ontology biological process categories, which were enriched in differentially expressed genes between consecutive time points and different conditions. The temporal propagation of 1α, 25-Dihydroxyvitamin D3-altered transcriptional changes started from a few genes that were altered initially at an earlier stage, to large groups of biologically coherent genes at later stages. The most notable biological processes included neuronal and retinal development and generalized stress response.
In addition, we also investigated the relationship among biological processes enriched in co-expressed genes under different conditions. The enriched biological processes include translation elongation, nucleosome assembly, and retina development. These network dynamics provide new insights into the impact of 1α, 25-Dihydroxyvitamin D3 treatment in bone and cartilage development. CONCLUSION We developed a network-based approach to analyzing the DEGs at different time points by integrating molecular interactions and gene ontology information. These results demonstrate that the proposed approach can provide insight into the molecular mechanisms taking place in vertebrate embryo development upon treatment with 1α, 25(OH)2D3. Our approach enables the monitoring of biological processes that can serve as a basis for generating new testable hypotheses. Such a network-based integration approach can be easily extended to any temporal- or condition-dependent genomic data analyses.
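The abstract does not say which statistical test was used to call a gene ontology category "enriched" in differentially expressed genes. A common choice is the one-sided hypergeometric test, sketched below as an assumption rather than as the author's method; the function name and the example numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int, drawn: int, hits: int) -> float:
    """One-sided hypergeometric p-value P(X >= hits).

    population: all genes tested
    annotated:  genes in the population belonging to the GO category
    drawn:      differentially expressed genes (DEGs)
    hits:       DEGs that belong to the GO category
    """
    if not (0 <= annotated <= population and 0 <= drawn <= population):
        raise ValueError("annotated and drawn must lie between 0 and population")
    upper = min(annotated, drawn)
    # Sum the upper tail exactly with integer arithmetic, then divide once.
    tail = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(hits, upper + 1)
    )
    return tail / comb(population, drawn)


# Hypothetical example: 10 genes, 5 in the category, 5 DEGs, all 5 in the category.
p_value = hypergeom_enrichment_p(population=10, annotated=5, drawn=5, hits=5)
```

In practice such p-values would be computed per category and per pair of consecutive time points, then corrected for multiple testing before any category network is built.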
Affiliation(s)
- Yuji Zhang
  - Division of Biostatistics and Bioinformatics, University of Maryland Greenebaum Cancer Center, Baltimore, USA; Department of Epidemiology and Public Health, University of Maryland School of Medicine, Baltimore, USA
9
Cui H, Dhroso A, Johnson N, Korkin D. The variation game: Cracking complex genetic disorders with NGS and omics data. Methods 2015; 79-80:18-31. [PMID: 25944472 DOI: 10.1016/j.ymeth.2015.04.018] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 12/13/2014] [Revised: 03/27/2015] [Accepted: 04/17/2015] [Indexed: 12/14/2022]
Abstract
Tremendous advances in Next Generation Sequencing (NGS) and high-throughput omics methods have brought us one step closer to a mechanistic understanding of complex disease at the molecular level. In this review, we discuss four basic regulatory mechanisms implicated in complex genetic diseases, such as cancer, neurological disorders, heart disease, diabetes, and many others. The mechanisms, including genetic variations, copy-number variations, posttranscriptional variations, and epigenetic variations, can be detected using a variety of NGS methods. We propose that malfunctions detected in these mechanisms are not necessarily independent, since these malfunctions are often found associated with the same disease and targeting the same gene, group of genes, or functional pathway. As an example, we discuss possible rewiring effects of the cancer-associated genetic, structural, and posttranscriptional variations on the protein-protein interaction (PPI) network centered around the P53 protein. The review highlights the multi-layered complexity of common genetic disorders and suggests that integration of NGS and omics data is a critical step in developing new computational methods capable of deciphering this complexity.
Affiliation(s)
- Hongzhu Cui
  - Department of Computer Science, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA 01609, United States
- Andi Dhroso
  - Department of Computer Science, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA 01609, United States
- Nathan Johnson
  - Department of Computer Science, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA 01609, United States
- Dmitry Korkin
  - Department of Computer Science, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA 01609, United States; Bioinformatics and Computational Biology Program, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA 01609, United States
10
Shargunov AV, Krasnov GS, Ponomarenko EA, Lisitsa AV, Shurdov MA, Zverev VV, Archakov AI, Blinov VM. Tissue-Specific Alternative Splicing Analysis Reveals the Diversity of Chromosome 18 Transcriptome. J Proteome Res 2013; 13:173-82. [DOI: 10.1021/pr400808u] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 01/08/2023]
Affiliation(s)
- Alexander V. Shargunov
  - I. I. Mechnikov Institute of Vaccines and Sera of the Russian Academy of Medical Sciences, 5A, Maly Kazenny per., 105064 Moscow, Russia
  - Bioinformatics and Postgenome Research, V. N. Orekhovich Institute of Biomedical Chemistry of the Russian Academy of Medical Sciences, 10, Pogodinskaya Street, 119121 Moscow, Russia
- George S. Krasnov
  - I. I. Mechnikov Institute of Vaccines and Sera of the Russian Academy of Medical Sciences, 5A, Maly Kazenny per., 105064 Moscow, Russia
  - Bioinformatics and Postgenome Research, V. N. Orekhovich Institute of Biomedical Chemistry of the Russian Academy of Medical Sciences, 10, Pogodinskaya Street, 119121 Moscow, Russia
- Elena A. Ponomarenko
  - Bioinformatics and Postgenome Research, V. N. Orekhovich Institute of Biomedical Chemistry of the Russian Academy of Medical Sciences, 10, Pogodinskaya Street, 119121 Moscow, Russia
  - LLC PostGenTech, 10, Pogodinskaya Street, 119121 Moscow, Russia
- Andrey V. Lisitsa
  - Bioinformatics and Postgenome Research, V. N. Orekhovich Institute of Biomedical Chemistry of the Russian Academy of Medical Sciences, 10, Pogodinskaya Street, 119121 Moscow, Russia
  - LLC PostGenTech, 10, Pogodinskaya Street, 119121 Moscow, Russia
- Vitaliy V. Zverev
  - I. I. Mechnikov Institute of Vaccines and Sera of the Russian Academy of Medical Sciences, 5A, Maly Kazenny per., 105064 Moscow, Russia
- Alexander I. Archakov
  - Bioinformatics and Postgenome Research, V. N. Orekhovich Institute of Biomedical Chemistry of the Russian Academy of Medical Sciences, 10, Pogodinskaya Street, 119121 Moscow, Russia
- Vladimir M. Blinov
  - I. I. Mechnikov Institute of Vaccines and Sera of the Russian Academy of Medical Sciences, 5A, Maly Kazenny per., 105064 Moscow, Russia
  - Bioinformatics and Postgenome Research, V. N. Orekhovich Institute of Biomedical Chemistry of the Russian Academy of Medical Sciences, 10, Pogodinskaya Street, 119121 Moscow, Russia
11
Systematic analysis of intron size and abundance parameters in diverse lineages. Science China Life Sciences 2013; 56:968-74. [DOI: 10.1007/s11427-013-4540-y] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 07/20/2013] [Accepted: 08/10/2013] [Indexed: 01/06/2023]
12
Krasnov GS, Dmitriev AA, Lakunina VA, Kirpiy AA, Kudryavtseva AV. Targeting VDAC-bound hexokinase II: a promising approach for concomitant anti-cancer therapy. Expert Opin Ther Targets 2013; 17:1221-33. [DOI: 10.1517/14728222.2013.833607] [Citation(s) in RCA: 104] [Impact Index Per Article: 9.5] [Indexed: 12/22/2022]
13
Bianchi V, Colantoni A, Calderone A, Ausiello G, Ferrè F, Helmer-Citterich M. DBATE: database of alternative transcripts expression. Database: The Journal of Biological Databases and Curation 2013; 2013:bat050. [PMID: 23842462 PMCID: PMC5654372 DOI: 10.1093/database/bat050] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 12/12/2022]
Abstract
The use of high-throughput RNA sequencing technology (RNA-seq) allows whole transcriptome analysis, providing an unbiased and unabridged view of alternative transcript expression. Coupling splicing variant-specific expression with its functional inference is still an open and difficult issue, for which we created the DataBase of Alternative Transcripts Expression (DBATE), a web-based repository storing expression values and functional annotation of alternative splicing variants. We processed 13 large RNA-seq panels from healthy human tissues and disease conditions, reporting expression levels and functional annotations gathered and integrated from different sources for each splicing variant, using a variant-specific annotation transfer pipeline. The possibility of performing complex queries by cross-referencing different functional annotations permits the retrieval of desired subsets of splicing variant expression values that can be visualized in several ways, from simple to more informative. DBATE is intended as a novel tool to help appreciate how, and possibly why, the transcriptome expression is shaped. Database URL: http://bioinformatica.uniroma2.it/DBATE/.
Affiliation(s)
- Valerio Bianchi
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica s.n.c., Rome 00133, Italy
|
14
|
Krasnov GS, Dmitriev AA, Snezhkina AV, Kudryavtseva AV. Deregulation of glycolysis in cancer: glyceraldehyde-3-phosphate dehydrogenase as a therapeutic target. Expert Opin Ther Targets 2013; 17:681-93. [DOI: 10.1517/14728222.2013.775253] [Citation(s) in RCA: 72] [Impact Index Per Article: 6.5] [Indexed: 12/22/2022]
|
15
|
Kelemen O, Convertini P, Zhang Z, Wen Y, Shen M, Falaleeva M, Stamm S. Function of alternative splicing. Gene 2013; 514:1-30. [PMID: 22909801 PMCID: PMC5632952 DOI: 10.1016/j.gene.2012.07.083] [Citation(s) in RCA: 515] [Impact Index Per Article: 46.8] [Received: 04/30/2012] [Revised: 07/21/2012] [Accepted: 07/30/2012] [Indexed: 12/15/2022]
Abstract
Almost all polymerase II transcripts undergo alternative pre-mRNA splicing. Here, we review the functions of alternative splicing events that have been experimentally determined. The overall function of alternative splicing is to increase the diversity of mRNAs expressed from the genome. Alternative splicing changes proteins encoded by mRNAs, which has profound functional effects. Experimental analysis of these protein isoforms showed that alternative splicing regulates binding between proteins, between proteins and nucleic acids as well as between proteins and membranes. Alternative splicing regulates the localization of proteins, their enzymatic properties and their interaction with ligands. In most cases, changes caused by individual splicing isoforms are small. However, cells typically coordinate numerous changes in 'splicing programs', which can have strong effects on cell proliferation, cell survival and properties of the nervous system. Due to its widespread usage and molecular versatility, alternative splicing emerges as a central element in gene regulation that interferes with almost every biological function analyzed.
Affiliation(s)
- Olga Kelemen
- Department of Molecular and Cellular Biochemistry, University of Kentucky, Lexington, Kentucky, United States of America
- Paolo Convertini
- Department of Molecular and Cellular Biochemistry, University of Kentucky, Lexington, Kentucky, United States of America
- Zhaiyi Zhang
- Department of Molecular and Cellular Biochemistry, University of Kentucky, Lexington, Kentucky, United States of America
- Yuan Wen
- Department of Molecular and Cellular Biochemistry, University of Kentucky, Lexington, Kentucky, United States of America
- Manli Shen
- Department of Molecular and Cellular Biochemistry, University of Kentucky, Lexington, Kentucky, United States of America
- Marina Falaleeva
- Department of Molecular and Cellular Biochemistry, University of Kentucky, Lexington, Kentucky, United States of America
- Stefan Stamm
- Department of Molecular and Cellular Biochemistry, University of Kentucky, Lexington, Kentucky, United States of America
|
16
|
Rodriguez JM, Maietta P, Ezkurdia I, Pietrelli A, Wesselink JJ, Lopez G, Valencia A, Tress ML. APPRIS: annotation of principal and alternative splice isoforms. Nucleic Acids Res 2012; 41:D110-7. [PMID: 23161672 PMCID: PMC3531113 DOI: 10.1093/nar/gks1058] [Citation(s) in RCA: 153] [Impact Index Per Article: 12.8] [Indexed: 12/03/2022] Open
Abstract
Here, we present APPRIS (http://appris.bioinfo.cnio.es), a database that houses annotations of human splice isoforms. APPRIS has been designed to provide value to manual annotations of the human genome by adding reliable protein structural and functional data and information from cross-species conservation. The visual representation of the annotations provided by APPRIS for each gene allows annotators and researchers alike to easily identify functional changes brought about by splicing events. In addition to collecting, integrating and analyzing reliable predictions of the effect of splicing events, APPRIS also selects a single reference sequence for each gene, here termed the principal isoform, based on the annotations of structure, function and conservation for each transcript. APPRIS identifies a principal isoform for 85% of the protein-coding genes in the GENCODE 7 release for ENSEMBL. Analysis of the APPRIS data shows that at least 70% of the alternative (non-principal) variants would lose important functional or structural information relative to the principal isoform.
|
17
|
Shionyu M, Takahashi KI, Go M. AS-EAST: a functional annotation tool for putative proteins encoded by alternatively spliced transcripts. Bioinformatics 2012; 28:2076-7. [PMID: 22645168 PMCID: PMC3400965 DOI: 10.1093/bioinformatics/bts320] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 01/14/2023] Open
Abstract
Summary: Alternative Splicing Effects ASsessment Tools (AS-EAST) is an online tool for the functional annotation of putative proteins encoded by transcripts generated by alternative splicing (AS). When provided with a transcript sequence, AS-EAST identifies regions altered by AS events in the putative protein sequence encoded by the transcript. Users can evaluate the predicted function of the putative protein by inspecting whether functional domains are included in the altered regions. Moreover, users can infer the loss of inter-molecular interactions in the protein network according to whether the AS events affect interaction residues observed in the 3D structure of the reference isoform. The information obtained from AS-EAST will help to design experimental analyses for the functional significance of novel splice isoforms. Availability: The online tool is freely available at http://as-alps.nagahama-i-bio.ac.jp/ASEAST/. Contact: m_shionyu@nagahama-i-bio.ac.jp
Affiliation(s)
- Masafumi Shionyu
- Department of Computer Bioscience, Nagahama Institute of Bio-Science and Technology, 1266, Tamura-cho, Nagahama, Shiga 526-0829, Japan.
|
18
|
Podder S, Ghosh TC. Evolutionary dynamics of human autoimmune disease genes and malfunctioned immunological genes. BMC Evol Biol 2012; 12:10. [PMID: 22276655 PMCID: PMC3347981 DOI: 10.1186/1471-2148-12-10] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/29/2011] [Accepted: 01/25/2012] [Indexed: 02/01/2023] Open
Abstract
Background: One of the main issues of molecular evolution is to divulge the principles in dictating the evolutionary rate differences among various gene classes. Immunological genes have received considerable attention in evolutionary biology as candidates for local adaptation and for studying functionally important polymorphisms. The normal structure and function of immunological genes will be distorted when they experience mutations leading to immunological dysfunctions. Results: Here, we examined the fundamental differences between the genes which on mutation give rise to autoimmune or other immune system related diseases and the immunological genes that do not cause any disease phenotypes. Although the disease genes examined are analogous to non-disease genes in product, expression, function, and pathway affiliation, a statistically significant decrease in evolutionary rate has been found in autoimmune disease genes relative to all other immune related diseases and non-disease genes. Possible ways of accumulation of mutation in the three steps of the central dogma (DNA-mRNA-Protein) have been studied to trace the mutational effects predisposed to disease consequence and acquiring higher selection pressure. Principal Component Analysis and Multivariate Regression Analysis have established the predominant role of single nucleotide polymorphisms in guiding the evolutionary rate of immunological disease and non-disease genes followed by m-RNA abundance, paralogs number, fraction of phosphorylation residue, alternatively spliced exon, protein residue burial and protein disorder. Conclusions: Our study provides an empirical insight into the etiology of autoimmune disease genes and other immunological diseases. The immediate utility of our study is to help in disease gene identification and may also help in medicinal improvement of immune related disease.
|
19
|
Floris M, Raimondo D, Leoni G, Orsini M, Marcatili P, Tramontano A. MAISTAS: a tool for automatic structural evaluation of alternative splicing products. Bioinformatics 2011; 27:1625-9. [PMID: 21498402 PMCID: PMC3106191 DOI: 10.1093/bioinformatics/btr198] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 02/05/2023] Open
Abstract
Motivation: Analysis of the human genome revealed that the amount of transcribed sequence is an order of magnitude greater than the number of predicted and well-characterized genes. A sizeable fraction of these transcripts is related to alternatively spliced forms of known protein coding genes. Inspection of the alternatively spliced transcripts identified in the pilot phase of the ENCODE project has clearly shown that often their structure might substantially differ from that of other isoforms of the same gene, and therefore that they might perform unrelated functions, or that they might even not correspond to a functional protein. Identifying these cases is obviously relevant for the functional assignment of gene products and for the interpretation of the effect of variations in the corresponding proteins. Results: Here we describe a publicly available tool that, given a gene or a protein, retrieves and analyses all its annotated isoforms, provides users with three-dimensional models of the isoform(s) of his/her interest whenever possible and automatically assesses whether homology derived structural models correspond to plausible structures. This information is clearly relevant. When the homology model of some isoforms of a gene does not seem structurally plausible, the implications are that either they assume a structure unrelated to that of the other isoforms of the same gene with presumably significant functional differences, or do not correspond to functional products. We provide indications that the second hypothesis is likely to be true for a substantial fraction of the cases. Availability: http://maistas.bioinformatica.crs4.it/. Contact: anna.tramontano@uniroma1.it
Affiliation(s)
- Matteo Floris
- CRS4-Bioinformatics Laboratory, c/o Sardegna Ricerche Scientific Park, Pula, 09010 Cagliari, Italy
|
20
|
Hegyi H, Kalmar L, Horvath T, Tompa P. Verification of alternative splicing variants based on domain integrity, truncation length and intrinsic protein disorder. Nucleic Acids Res 2010; 39:1208-19. [PMID: 20972208 PMCID: PMC3045584 DOI: 10.1093/nar/gkq843] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.9] [Indexed: 01/23/2023] Open
Abstract
According to current estimations ∼95% of multi-exonic human protein-coding genes undergo alternative splicing (AS). However, for 4000 human proteins in PDB, only 14 human proteins have structures of at least two alternative isoforms. Surveying these structural isoforms revealed that the maximum insertion accommodated by an isoform of a fully ordered protein domain was 5 amino acids; other instances of domain changes involved intrinsic structural disorder. After collecting 505 minor isoforms of human proteins with evidence for their existence we analyzed their length, protein disorder and exposed hydrophobic surface. We found that strict rules govern the selection of alternative splice variants aimed to preserve the integrity of globular domains: alternative splice sites (i) tend to avoid globular domains or (ii) affect them only marginally or (iii) tend to coincide with a location where the exposed hydrophobic surface is minimal or (iv) the protein is disordered. We also observed an inverse correlation between the domain fraction lost and the full length of the minor isoform containing the domain, possibly indicating a buffering effect for the isoform protein counteracting the domain truncation effect. These observations provide the basis for a prediction method (currently under development) to predict the viability of splice variants.
Affiliation(s)
- Hedi Hegyi
- Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, PO Box 7, 1518 Budapest, Hungary.
|
21
|
de la Grange P, Gratadou L, Delord M, Dutertre M, Auboeuf D. Splicing factor and exon profiling across human tissues. Nucleic Acids Res 2010; 38:2825-38. [PMID: 20110256 PMCID: PMC2875023 DOI: 10.1093/nar/gkq008] [Citation(s) in RCA: 103] [Impact Index Per Article: 7.4] [Indexed: 02/06/2023] Open
Abstract
It has been shown that alternative splicing is especially prevalent in brain and testis when compared to other tissues. To test whether there is a specific propensity of these tissues to generate splicing variants, we used a single source of high-density microarray data to perform both splicing factor and exon expression profiling across 11 normal human tissues. Paired comparisons between tissues and an original exon-based statistical group analysis demonstrated after extensive RT-PCR validation that the cerebellum, testis, and spleen had the largest proportion of differentially expressed alternative exons. Variations at the exon level correlated with a larger number of splicing factors being expressed at a high level in the cerebellum, testis and spleen than in other tissues. However, this splicing factor expression profile was similar to a more global gene expression pattern as a larger number of genes had a high expression level in the cerebellum, testis and spleen. In addition to providing a unique resource on expression profiling of alternative splicing variants and splicing factors across human tissues, this study demonstrates that the higher prevalence of alternative splicing in a subset of tissues originates from the larger number of genes, including splicing factors, being expressed than in other tissues.
Affiliation(s)
- Pierre de la Grange
- GenoSplice technology, Centre Hayem, Hôpital Saint-Louis, 1 avenue Claude Vellefaux, 75010, Paris, France.
|
22
|
Zhang Z, Zhou L, Wang P, Liu Y, Chen X, Hu L, Kong X. Divergence of exonic splicing elements after gene duplication and the impact on gene structures. Genome Biol 2009; 10:R120. [PMID: 19883501 PMCID: PMC3091315 DOI: 10.1186/gb-2009-10-11-r120] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Received: 06/15/2009] [Revised: 09/28/2009] [Accepted: 11/02/2009] [Indexed: 12/18/2022] Open
Abstract
An analysis of human exonic splicing elements in duplicated genes reveals their important role in the generation of new gene structures. Background: The origin of new genes and their contribution to functional novelty has been the subject of considerable interest. There has been much progress in understanding the mechanisms by which new genes originate. Here we examine a novel way that new gene structures could originate, namely through the evolution of new alternative splicing isoforms after gene duplication. Results: We studied the divergence of exonic splicing enhancers and silencers after gene duplication and the contributions of such divergence to the generation of new splicing isoforms. We found that exonic splicing enhancers and exonic splicing silencers diverge especially fast shortly after gene duplication. About 10% and 5% of paralogous exons undergo significantly asymmetric evolution of exonic splicing enhancers and silencers, respectively. When compared to pre-duplication ancestors, we found that there is a significant overall loss of exonic splicing enhancers and the magnitude increases with duplication age. Detailed examination reveals net gains and losses of exonic splicing enhancers and silencers in different copies and paralog clusters after gene duplication. Furthermore, we found that exonic splicing enhancer and silencer changes are mainly caused by synonymous mutations, though nonsynonymous changes also contribute. Finally, we found that exonic splicing enhancer and silencer divergence results in exon splicing state transitions (from constitutive to alternative or vice versa), and that the proportion of paralogous exon pairs with different splicing states also increases over time, consistent with previous predictions. Conclusions: Our results suggest that exonic splicing enhancer and silencer changes after gene duplication have important roles in alternative splicing divergence and that these changes contribute to the generation of new gene structures.
Affiliation(s)
- Zhenguo Zhang
- The Key Laboratory of Stem Cell Biology, Institute of Health Sciences, Shanghai Institutes for Biological Sciences (SIBS), Chinese Academy of Sciences (CAS) and Shanghai Jiao Tong University School of Medicine (SJTUSM), 225 South Chong Qing Road, Shanghai 200025, PR China.
|