1
Chen G, Chen J, Qi L, Yin Y, Lin Z, Wen H, Zhang S, Xiao C, Bello SF, Zhang X, Nie Q, Luo W. Bulk and single-cell alternative splicing analyses reveal roles of TRA2B in myogenic differentiation. Cell Prolif 2024; 57:e13545. [PMID: 37705195 PMCID: PMC10849790 DOI: 10.1111/cpr.13545] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 08/08/2023] [Accepted: 08/28/2023] [Indexed: 09/15/2023] Open
Abstract
Alternative splicing (AS) disruption has been linked to disorders of muscle development, as well as muscular atrophy. However, the precise changes in AS patterns that occur during myogenesis are not well understood. Here, we employed isoform long-reads RNA-seq (Iso-seq) and single-cell RNA-seq (scRNA-seq) to investigate the AS landscape during myogenesis. Our Iso-seq data identified 61,146 full-length isoforms representing 11,682 expressed genes, of which over 52% were novel. We identified 38,022 AS events, with most of these events altering coding sequences and exhibiting stage-specific splicing patterns. We identified AS dynamics in different types of muscle cells through scRNA-seq analysis, revealing genes essential for the contractile muscle system and cytoskeleton that undergo differential splicing across cell types. Gene-splicing analysis demonstrated that AS acts as a regulator, independent of changes in overall gene expression. Two isoforms of splicing factor TRA2B play distinct roles in myogenic differentiation by triggering AS of TGFBR2 to regulate canonical TGF-β signalling cascades differently. Our study provides a valuable transcriptome resource for myogenesis and reveals the complexity of AS and its regulation during myogenesis.
Affiliation(s)
- Genghua Chen
- College of Animal Science, South China Agricultural University, Guangzhou, Guangdong, China
- Guangdong Provincial Key Lab of Agro‐Animal Genomics and Molecular Breeding, Lingnan Guangdong Laboratory of Modern Agriculture & State Key Laboratory for Conservation and Utilization of Subtropical Agro‐Bioresources, Key Laboratory of Chicken Genetics, Breeding and Reproduction, Ministry of Agriculture, Guangzhou, Guangdong, China
- State Key Laboratory of Livestock and Poultry Breeding, and Lingnan Guangdong Laboratory of Agriculture, South China Agricultural University, Guangzhou, China
- Jiahui Chen
- College of Animal Science, South China Agricultural University, Guangzhou, Guangdong, China
- Guangdong Provincial Key Lab of Agro‐Animal Genomics and Molecular Breeding, Lingnan Guangdong Laboratory of Modern Agriculture & State Key Laboratory for Conservation and Utilization of Subtropical Agro‐Bioresources, Key Laboratory of Chicken Genetics, Breeding and Reproduction, Ministry of Agriculture, Guangzhou, Guangdong, China
- State Key Laboratory of Livestock and Poultry Breeding, and Lingnan Guangdong Laboratory of Agriculture, South China Agricultural University, Guangzhou, China
- Lin Qi
- College of Animal Science, South China Agricultural University, Guangzhou, Guangdong, China
- Guangdong Provincial Key Lab of Agro‐Animal Genomics and Molecular Breeding, Lingnan Guangdong Laboratory of Modern Agriculture & State Key Laboratory for Conservation and Utilization of Subtropical Agro‐Bioresources, Key Laboratory of Chicken Genetics, Breeding and Reproduction, Ministry of Agriculture, Guangzhou, Guangdong, China
- State Key Laboratory of Livestock and Poultry Breeding, and Lingnan Guangdong Laboratory of Agriculture, South China Agricultural University, Guangzhou, China
- Yunqian Yin
- College of Animal Science, South China Agricultural University, Guangzhou, Guangdong, China
- Guangdong Provincial Key Lab of Agro‐Animal Genomics and Molecular Breeding, Lingnan Guangdong Laboratory of Modern Agriculture & State Key Laboratory for Conservation and Utilization of Subtropical Agro‐Bioresources, Key Laboratory of Chicken Genetics, Breeding and Reproduction, Ministry of Agriculture, Guangzhou, Guangdong, China
- State Key Laboratory of Livestock and Poultry Breeding, and Lingnan Guangdong Laboratory of Agriculture, South China Agricultural University, Guangzhou, China
- Zetong Lin
- College of Animal Science, South China Agricultural University, Guangzhou, Guangdong, China
- Guangdong Provincial Key Lab of Agro‐Animal Genomics and Molecular Breeding, Lingnan Guangdong Laboratory of Modern Agriculture & State Key Laboratory for Conservation and Utilization of Subtropical Agro‐Bioresources, Key Laboratory of Chicken Genetics, Breeding and Reproduction, Ministry of Agriculture, Guangzhou, Guangdong, China
- State Key Laboratory of Livestock and Poultry Breeding, and Lingnan Guangdong Laboratory of Agriculture, South China Agricultural University, Guangzhou, China
- Huaqiang Wen
- College of Animal Science, South China Agricultural University, Guangzhou, Guangdong, China
- Guangdong Provincial Key Lab of Agro‐Animal Genomics and Molecular Breeding, Lingnan Guangdong Laboratory of Modern Agriculture & State Key Laboratory for Conservation and Utilization of Subtropical Agro‐Bioresources, Key Laboratory of Chicken Genetics, Breeding and Reproduction, Ministry of Agriculture, Guangzhou, Guangdong, China
- State Key Laboratory of Livestock and Poultry Breeding, and Lingnan Guangdong Laboratory of Agriculture, South China Agricultural University, Guangzhou, China
- Shuai Zhang
- College of Animal Science, South China Agricultural University, Guangzhou, Guangdong, China
- Guangdong Provincial Key Lab of Agro‐Animal Genomics and Molecular Breeding, Lingnan Guangdong Laboratory of Modern Agriculture & State Key Laboratory for Conservation and Utilization of Subtropical Agro‐Bioresources, Key Laboratory of Chicken Genetics, Breeding and Reproduction, Ministry of Agriculture, Guangzhou, Guangdong, China
- State Key Laboratory of Livestock and Poultry Breeding, and Lingnan Guangdong Laboratory of Agriculture, South China Agricultural University, Guangzhou, China
- Chuanyun Xiao
- Human and Animal Physiology, Wageningen University, Wageningen, The Netherlands
- Semiu Folaniyi Bello
- College of Animal Science, South China Agricultural University, Guangzhou, Guangdong, China
- Guangdong Provincial Key Lab of Agro‐Animal Genomics and Molecular Breeding, Lingnan Guangdong Laboratory of Modern Agriculture & State Key Laboratory for Conservation and Utilization of Subtropical Agro‐Bioresources, Key Laboratory of Chicken Genetics, Breeding and Reproduction, Ministry of Agriculture, Guangzhou, Guangdong, China
- State Key Laboratory of Livestock and Poultry Breeding, and Lingnan Guangdong Laboratory of Agriculture, South China Agricultural University, Guangzhou, China
- Xiquan Zhang
- College of Animal Science, South China Agricultural University, Guangzhou, Guangdong, China
- Guangdong Provincial Key Lab of Agro‐Animal Genomics and Molecular Breeding, Lingnan Guangdong Laboratory of Modern Agriculture & State Key Laboratory for Conservation and Utilization of Subtropical Agro‐Bioresources, Key Laboratory of Chicken Genetics, Breeding and Reproduction, Ministry of Agriculture, Guangzhou, Guangdong, China
- State Key Laboratory of Livestock and Poultry Breeding, and Lingnan Guangdong Laboratory of Agriculture, South China Agricultural University, Guangzhou, China
- Qinghua Nie
- College of Animal Science, South China Agricultural University, Guangzhou, Guangdong, China
- Guangdong Provincial Key Lab of Agro‐Animal Genomics and Molecular Breeding, Lingnan Guangdong Laboratory of Modern Agriculture & State Key Laboratory for Conservation and Utilization of Subtropical Agro‐Bioresources, Key Laboratory of Chicken Genetics, Breeding and Reproduction, Ministry of Agriculture, Guangzhou, Guangdong, China
- State Key Laboratory of Livestock and Poultry Breeding, and Lingnan Guangdong Laboratory of Agriculture, South China Agricultural University, Guangzhou, China
- Wen Luo
- College of Animal Science, South China Agricultural University, Guangzhou, Guangdong, China
- Guangdong Provincial Key Lab of Agro‐Animal Genomics and Molecular Breeding, Lingnan Guangdong Laboratory of Modern Agriculture & State Key Laboratory for Conservation and Utilization of Subtropical Agro‐Bioresources, Key Laboratory of Chicken Genetics, Breeding and Reproduction, Ministry of Agriculture, Guangzhou, Guangdong, China
- State Key Laboratory of Livestock and Poultry Breeding, and Lingnan Guangdong Laboratory of Agriculture, South China Agricultural University, Guangzhou, China
2
Zhang M, Zheng S, Liang JQ. Transcriptional and reverse transcriptional regulation of host genes by human endogenous retroviruses in cancers. Front Microbiol 2022; 13:946296. [PMID: 35928153 PMCID: PMC9343867 DOI: 10.3389/fmicb.2022.946296] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2022] [Accepted: 06/29/2022] [Indexed: 11/16/2022] Open
Abstract
Human endogenous retroviruses (HERVs) originated from ancient retroviral infections of germline cells millions of years ago and have evolved as part of the host genome. HERVs not only retain the capacity as retroelements but also regulate host genes. The expansion of HERVs involves transcription by RNA polymerase II, reverse transcription, and re-integration into the host genome. Fast progress in deep sequencing and functional analysis has revealed the importance of domesticated copies of HERVs, including their regulatory sequences, transcripts, and proteins in normal cells. However, evidence also suggests the involvement of HERVs in the development and progression of many types of cancer. Here we summarize the current state of knowledge about the expression of HERVs, transcriptional regulation of host genes by HERVs, and the functions of HERVs in reverse transcription and gene editing with their reverse transcriptase.
Affiliation(s)
- Mengwen Zhang
- The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China
- Ministry of Education Key Laboratory of Cancer Prevention and Intervention, Second Affiliated Hospital, Cancer Institute, Zhejiang University School of Medicine, Hangzhou, China
- Shu Zheng
- The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China
- Ministry of Education Key Laboratory of Cancer Prevention and Intervention, Second Affiliated Hospital, Cancer Institute, Zhejiang University School of Medicine, Hangzhou, China
- *Correspondence: Shu Zheng
- Jessie Qiaoyi Liang
- Department of Medicine and Therapeutics, Faculty of Medicine, Center for Gut Microbiota Research, Li Ka Shing Institute of Health Sciences, Shenzhen Research Institute, The Chinese University of Hong Kong, Hong Kong, Hong Kong SAR, China
- *Correspondence: Jessie Qiaoyi Liang
3
Fletcher S, Bellgard MI, Price L, Akkari AP, Wilton SD. Translational development of splice-modifying antisense oligomers. Expert Opin Biol Ther 2016; 17:15-30. [PMID: 27805416 DOI: 10.1080/14712598.2017.1250880] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Indexed: 12/31/2022]
Abstract
Introduction: Antisense nucleic acid analogues can interact with pre-mRNA motifs and influence exon or splice site selection and thereby alter gene expression. Design of antisense molecules to target specific motifs can result in either exon exclusion or exon inclusion during splicing. Novel drugs exploiting the antisense concept are targeting rare, life-limiting diseases; however, the potential exists to treat a wide range of conditions by antisense-mediated splice intervention. Areas covered: In this review, the authors discuss the clinical translation of novel molecular therapeutics to address the fatal neuromuscular disorders Duchenne muscular dystrophy and spinal muscular atrophy. The review also highlights difficulties posed by issues pertaining to restricted participant numbers, variable phenotype and disease progression, and the identification and validation of study endpoints. Expert opinion: Translation of novel therapeutics for Duchenne muscular dystrophy and spinal muscular atrophy has been greatly advanced by multidisciplinary research, academic-industry partnerships and in particular, the engagement and support of the patient community. Sponsors, supporters and regulators are cooperating to deliver new drugs and identify and define meaningful outcome measures. Non-conventional and adaptive trial design could be particularly suited to clinical evaluation of novel therapeutics and strategies to treat serious, rare diseases that may be problematic to study using more conventional clinical trial structures.
Affiliation(s)
- S Fletcher
- a Centre for Neuromuscular and Neurological Disorders, University of Western Australia, Nedlands, Western Australia, Australia
- b Western Australian Neuroscience Research Institute, Nedlands, Western Australia, Australia
- c Centre for Comparative Genomics, Murdoch University, Western Australia, Australia
- M I Bellgard
- b Western Australian Neuroscience Research Institute, Nedlands, Western Australia, Australia
- c Centre for Comparative Genomics, Murdoch University, Western Australia, Australia
- L Price
- a Centre for Neuromuscular and Neurological Disorders, University of Western Australia, Nedlands, Western Australia, Australia
- b Western Australian Neuroscience Research Institute, Nedlands, Western Australia, Australia
- c Centre for Comparative Genomics, Murdoch University, Western Australia, Australia
- A P Akkari
- b Western Australian Neuroscience Research Institute, Nedlands, Western Australia, Australia
- c Centre for Comparative Genomics, Murdoch University, Western Australia, Australia
- d Shiraz Pharmaceuticals, Inc, Chapel Hill, NC, USA
- S D Wilton
- a Centre for Neuromuscular and Neurological Disorders, University of Western Australia, Nedlands, Western Australia, Australia
- b Western Australian Neuroscience Research Institute, Nedlands, Western Australia, Australia
- c Centre for Comparative Genomics, Murdoch University, Western Australia, Australia
4
Zhou K, Salamov A, Kuo A, Aerts AL, Kong X, Grigoriev IV. Alternative splicing acting as a bridge in evolution. Stem Cell Investig 2015; 2:19. [PMID: 27358887 DOI: 10.3978/j.issn.2306-9759.2015.10.01] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 10/12/2015] [Accepted: 10/15/2015] [Indexed: 12/15/2022]
Abstract
Background: Alternative splicing (AS) regulates diverse cellular and developmental functions through alternative protein structures of different isoforms. Alternative exons dominate AS in vertebrates; however, very little is known about the extent and function of AS in lower eukaryotes. To understand the role of introns in gene evolution, we examined AS from a green algal and five fungal genomes using a novel EST-based gene-modeling algorithm (COMBEST). Methods: AS from each genome was classified with COMBEST that maps EST sequences to genomes to build gene models. Various aspects of AS were analyzed through statistical methods. The interplay of intron 3n length, phase, coding property, and intron retention (RI) were examined with Chi-square testing. Results: With 3 to 834 times EST coverage, we identified up to 73% of AS in intron-containing genes and found preponderance of RI among 11 types of AS. The number of exons, expression level, and maximum intron length correlated with number of AS per gene (NAG), and intron-rich genes suppressed AS. Genes with AS were more ancient, and AS was conserved among fungal genomes. Among stopless introns, non-retained introns (NRI) avoided, but major RI preferred 3n length. In contrast, stop-containing introns showed uniform distribution among 3n, 3n+1, and 3n+2 lengths. We found a clue to the intron phase enigma: it was the coding function of introns involved in AS that dictates the intron phase bias. Conclusions: Majority of AS is non-functional, and the extent of AS is suppressed for intron-rich genes. RI through 3n length, stop codon, and phase bias bridges the transition from functionless to functional alternative isoforms.
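The 3n / 3n+1 / 3n+2 length classes and the stopless versus stop-containing distinction used in this abstract can be illustrated with a short Python sketch. The function names, the toy sequences, and the `frame_offset` parameter are assumptions made for illustration only; this is not the COMBEST algorithm, whose details the abstract does not give.

```python
# Illustrative only: classify an intron by length modulo 3 and by whether
# it carries a stop codon in a chosen reading frame.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def length_class(intron_seq: str) -> str:
    """Return '3n', '3n+1', or '3n+2' according to intron length modulo 3."""
    remainder = len(intron_seq) % 3
    return "3n" if remainder == 0 else f"3n+{remainder}"


def has_stop_codon(intron_seq: str, frame_offset: int = 0) -> bool:
    """Check for a stop codon when the intron is read in the given frame.

    frame_offset (0, 1, or 2) is the number of leading bases skipped before
    the first complete codon. It is a hypothetical stand-in for the frame
    carried over from the upstream exon.
    """
    seq = intron_seq.upper()
    return any(
        seq[i:i + 3] in STOP_CODONS
        for i in range(frame_offset, len(seq) - 2, 3)
    )


def describe(intron_seq: str, frame_offset: int = 0) -> tuple:
    """Combine both labels for one intron."""
    stop_label = (
        "stop-containing" if has_stop_codon(intron_seq, frame_offset)
        else "stopless"
    )
    return length_class(intron_seq), stop_label


print(describe("GTAAGT"))       # ('3n', 'stopless')
print(describe("GTTAAGT", 2))   # ('3n+1', 'stop-containing')
```

A retained intron that is both 3n in length and stopless in the inherited frame would extend the protein without shifting the downstream reading frame, which is why these two properties are examined together.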
Affiliation(s)
- Kemin Zhou
- 1 US Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA; 2 Roche Molecular Diagnostics, 4300 Hacienda Drive, Pleasanton, CA 94588, USA; 3 Department of Clinical Medicine, Kunming University of Science and Technology, Kunming 650031, China
- Asaf Salamov
- 1 US Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA; 2 Roche Molecular Diagnostics, 4300 Hacienda Drive, Pleasanton, CA 94588, USA; 3 Department of Clinical Medicine, Kunming University of Science and Technology, Kunming 650031, China
- Alan Kuo
- 1 US Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA; 2 Roche Molecular Diagnostics, 4300 Hacienda Drive, Pleasanton, CA 94588, USA; 3 Department of Clinical Medicine, Kunming University of Science and Technology, Kunming 650031, China
- Andrea L Aerts
- 1 US Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA; 2 Roche Molecular Diagnostics, 4300 Hacienda Drive, Pleasanton, CA 94588, USA; 3 Department of Clinical Medicine, Kunming University of Science and Technology, Kunming 650031, China
- Xiangyang Kong
- 1 US Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA; 2 Roche Molecular Diagnostics, 4300 Hacienda Drive, Pleasanton, CA 94588, USA; 3 Department of Clinical Medicine, Kunming University of Science and Technology, Kunming 650031, China
- Igor V Grigoriev
- 1 US Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA; 2 Roche Molecular Diagnostics, 4300 Hacienda Drive, Pleasanton, CA 94588, USA; 3 Department of Clinical Medicine, Kunming University of Science and Technology, Kunming 650031, China
5
Lipovich L, Hou ZC, Jia H, Sinkler C, McGowen M, Sterner KN, Weckle A, Sugalski AB, Pipes L, Gatti DL, Mason CE, Sherwood CC, Hof PR, Kuzawa CW, Grossman LI, Goodman M, Wildman DE. High-throughput RNA sequencing reveals structural differences of orthologous brain-expressed genes between western lowland gorillas and humans. J Comp Neurol 2015; 524:288-308. [PMID: 26132897 DOI: 10.1002/cne.23843] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 03/30/2015] [Revised: 06/20/2015] [Accepted: 06/23/2015] [Indexed: 12/22/2022]
Abstract
The human brain and human cognitive abilities are strikingly different from those of other great apes despite relatively modest genome sequence divergence. However, little is presently known about the interspecies divergence in gene structure and transcription that might contribute to these phenotypic differences. To date, most comparative studies of gene structure in the brain have examined humans, chimpanzees, and macaque monkeys. To add to this body of knowledge, we analyze here the brain transcriptome of the western lowland gorilla (Gorilla gorilla gorilla), an African great ape species that is phylogenetically closely related to humans, but with a brain that is approximately one-third the size. Manual transcriptome curation from a sample of the planum temporale region of the neocortex revealed 12 protein-coding genes and one noncoding-RNA gene with exons in the gorilla unmatched by public transcriptome data from the orthologous human loci. These interspecies gene structure differences accounted for a total of 134 amino acids in proteins found in the gorilla that were absent from protein products of the orthologous human genes. Proteins varying in structure between human and gorilla were involved in immunity and energy metabolism, suggesting their relevance to phenotypic differences. This gorilla neocortical transcriptome comprises an empirical, not homology- or prediction-driven, resource for orthologous gene comparisons between human and gorilla. These findings provide a unique repository of the sequences and structures of thousands of genes transcribed in the gorilla brain, pointing to candidate genes that may contribute to the traits distinguishing humans from other closely related great apes.
Affiliation(s)
- Leonard Lipovich
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201
- Department of Neurology, School of Medicine, Wayne State University, Detroit, Michigan, 48201
- Zhuo-Cheng Hou
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201
- Department of Animal Genetics, China Agricultural University, Beijing, China
- Hui Jia
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201
- Christopher Sinkler
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201
- Michael McGowen
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201
- School of Biological and Chemical Sciences, Queen Mary, University of London, London, United Kingdom
- Kirstin N Sterner
- Department of Anthropology, University of Oregon, Eugene, Oregon, 97403
- Amy Weckle
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201
- Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana, Illinois, 61801
- Department of Molecular and Integrative Physiology, University of Illinois, Urbana, Illinois, 61801
- Amara B Sugalski
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201
- Lenore Pipes
- Department of Physiology and Biophysics, Weill Cornell Medical College, New York, New York, 10021
- Domenico L Gatti
- Department of Biochemistry and Molecular Biology, School of Medicine, Wayne State University, Detroit, Michigan, 48201
- Cardiovascular Research Institute, School of Medicine, Wayne State University, Detroit, Michigan, 48201
- Christopher E Mason
- Department of Physiology and Biophysics, Weill Cornell Medical College, New York, New York, 10021
- Chet C Sherwood
- Department of Anthropology and the Center for the Advanced Study of Human Paleobiology, The George Washington University, Washington, DC, 20052
- Patrick R Hof
- Fishberg Department of Neuroscience and Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, New York, 10029
- New York Consortium in Evolutionary Primatology, New York, New York, 10024
- Lawrence I Grossman
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201
- Morris Goodman
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201
- Department of Anatomy and Cell Biology, School of Medicine, Wayne State University, Detroit, Michigan, 48201
- Derek E Wildman
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201
- Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana, Illinois, 61801
- Department of Molecular and Integrative Physiology, University of Illinois, Urbana, Illinois, 61801
6
Comprehensive analysis of alternative splicing in Digitalis purpurea by strand-specific RNA-Seq. PLoS One 2014; 9:e106001. [PMID: 25167195 PMCID: PMC4148352 DOI: 10.1371/journal.pone.0106001] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 05/23/2014] [Accepted: 07/25/2014] [Indexed: 12/23/2022] Open
Abstract
Digitalis purpurea (D. purpurea) is one of the most important medicinal plants and is well known in the treatment of heart failure because of the cardiac glycosides that are its main active compounds. However, in the absence of strand specific sequencing information, the post-transcriptional mechanism of gene regulation in D. purpurea thus far remains unknown. In this study, a strand-specific RNA-Seq library was constructed and sequenced using Illumina HiSeq platforms to characterize the transcriptome of D. purpurea with a focus on alternative splicing (AS) events and the effect of AS on protein domains. De novo RNA-Seq assembly resulted in 48,475 genes. Based on the assembled transcripts, we reported a list of 3,265 AS genes, including 5,408 AS events in D. purpurea. Interestingly, both glycosyltransferases and monooxygenase, which were involved in the biosynthesis of cardiac glycosides, are regulated by AS. A total of 2,422 AS events occurred in coding regions, and 959 AS events were located in the regions of 882 unique protein domains, which could affect protein function. This D. purpurea transcriptome study substantially increased the expressed sequence resource and presented a better understanding of post-transcriptional regulation to further facilitate the medicinal applications of D. purpurea for human health.
7
Lo WS, Gardiner E, Xu Z, Lau CF, Wang F, Zhou JJ, Mendlein JD, Nangle LA, Chiang KP, Yang XL, Au KF, Wong WH, Guo M, Zhang M, Schimmel P. Human tRNA synthetase catalytic nulls with diverse functions. Science 2014; 345:328-32. [PMID: 25035493 PMCID: PMC4188629 DOI: 10.1126/science.1252943] [Citation(s) in RCA: 97] [Impact Index Per Article: 9.7] [Indexed: 12/11/2022]
Abstract
Genetic efficiency in higher organisms depends on mechanisms to create multiple functions from single genes. To investigate this question for an enzyme family, we chose aminoacyl tRNA synthetases (AARSs). They are exceptional in their progressive and accretive proliferation of noncatalytic domains as the Tree of Life is ascended. Here we report discovery of a large number of natural catalytic nulls (CNs) for each human AARS. Splicing events retain noncatalytic domains while ablating the catalytic domain to create CNs with diverse functions. Each synthetase is converted into several new signaling proteins with biological activities "orthogonal" to that of the catalytic parent. We suggest that splice variants with nonenzymatic functions may be more general, as evidenced by recent findings of other catalytically inactive splice-variant enzymes.
Affiliation(s)
- Wing-Sze Lo
- IAS HKUST-Scripps R&D Laboratory, Institute for Advanced Study, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Pangu Biopharma, Edinburgh Tower, The Landmark, 15 Queen's Road Central, Hong Kong, China
- Elisabeth Gardiner
- The Scripps Laboratories for tRNA Synthetase Research, The Scripps Research Institute, 10650 North Torrey Pines Road, La Jolla, CA 92037, USA
- aTyr Pharma, 3545 John Hopkins Court, Suite 250, San Diego, CA 92121, USA
- Zhiwen Xu
- IAS HKUST-Scripps R&D Laboratory, Institute for Advanced Study, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Pangu Biopharma, Edinburgh Tower, The Landmark, 15 Queen's Road Central, Hong Kong, China
- Ching-Fun Lau
- IAS HKUST-Scripps R&D Laboratory, Institute for Advanced Study, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Pangu Biopharma, Edinburgh Tower, The Landmark, 15 Queen's Road Central, Hong Kong, China
- Feng Wang
- IAS HKUST-Scripps R&D Laboratory, Institute for Advanced Study, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Pangu Biopharma, Edinburgh Tower, The Landmark, 15 Queen's Road Central, Hong Kong, China
- Jie J Zhou
- IAS HKUST-Scripps R&D Laboratory, Institute for Advanced Study, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Pangu Biopharma, Edinburgh Tower, The Landmark, 15 Queen's Road Central, Hong Kong, China
- John D Mendlein
- aTyr Pharma, 3545 John Hopkins Court, Suite 250, San Diego, CA 92121, USA
- Leslie A Nangle
- aTyr Pharma, 3545 John Hopkins Court, Suite 250, San Diego, CA 92121, USA
- Kyle P Chiang
- aTyr Pharma, 3545 John Hopkins Court, Suite 250, San Diego, CA 92121, USA
- Xiang-Lei Yang
- IAS HKUST-Scripps R&D Laboratory, Institute for Advanced Study, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- The Scripps Laboratories for tRNA Synthetase Research, The Scripps Research Institute, 10650 North Torrey Pines Road, La Jolla, CA 92037, USA
- Kin-Fai Au
- Department of Internal Medicine, University of Iowa, Iowa City, IA 52242, USA
- Wing Hung Wong
- Department of Statistics, Stanford University, Stanford, CA 94305, USA
- Min Guo
- The Scripps Laboratories for tRNA Synthetase Research, Scripps Florida, 130 Scripps Way, Jupiter, FL 33458, USA
- Mingjie Zhang
- IAS HKUST-Scripps R&D Laboratory, Institute for Advanced Study, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Division of Life Science, State Key Laboratory of Molecular Neuroscience, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Paul Schimmel
- IAS HKUST-Scripps R&D Laboratory, Institute for Advanced Study, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- The Scripps Laboratories for tRNA Synthetase Research, The Scripps Research Institute, 10650 North Torrey Pines Road, La Jolla, CA 92037, USA
- The Scripps Laboratories for tRNA Synthetase Research, Scripps Florida, 130 Scripps Way, Jupiter, FL 33458, USA
8
Robinson TJ, Forte E, Salinas RE, Puri S, Marengo M, Garcia-Blanco MA, Luftig MA. SplicerEX: a tool for the automated detection and classification of mRNA changes from conventional and splice-sensitive microarray expression data. RNA (New York, N.Y.) 2012; 18:1435-1445. [PMID: 22736799 PMCID: PMC3404365 DOI: 10.1261/rna.033621.112] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 04/02/2012] [Accepted: 05/04/2012] [Indexed: 06/01/2023]
Abstract
The key postulate that one gene encodes one protein has been overhauled with the discovery that one gene can generate multiple RNA transcripts through alternative mRNA processing. In this study, we describe SplicerEX, a novel and uniquely motivated algorithm designed for experimental biologists that (1) detects widespread changes in mRNA isoforms from both conventional and splice sensitive microarray data, (2) automatically categorizes mechanistic changes in mRNA processing, and (3) mitigates known technological artifacts of exon array-based detection of alternative splicing resulting from 5' and 3' signal attenuation, background detection limits, and saturation of probe set signal intensity. In this study, we used SplicerEX to compare conventional and exon-based Affymetrix microarray data in a model of EBV transformation of primary human B cells. We demonstrated superior detection of 3'-located changes in mRNA processing by the Affymetrix U133 GeneChip relative to the Human Exon Array. SplicerEX-identified exon-level changes in the EBV infection model were confirmed by RT-PCR and revealed a novel set of EBV-regulated mRNA isoform changes in caspases 6, 7, and 8. Finally, SplicerEX as compared with MiDAS analysis of publicly available microarray data provided more efficiently categorized mRNA isoform changes with a significantly higher proportion of hits supported by previously annotated alternative processing events. Therefore, SplicerEX provides an important tool for the biologist interested in studying changes in mRNA isoform usage from conventional or splice-sensitive microarray platforms, especially considering the expansive amount of archival microarray data generated over the past decade. SplicerEX is freely available upon request.
Affiliation(s)
- Shaan Puri
  - Department of Molecular Genetics and Microbiology
- Matthew Marengo
  - Department of Molecular Genetics and Microbiology
  - Center for RNA Biology
- Mariano A. Garcia-Blanco
  - Department of Molecular Genetics and Microbiology
  - Center for RNA Biology
  - Department of Medicine, and
  - Center for Virology, Duke University Medical Center, Durham, North Carolina 27710, USA
- Micah A. Luftig
  - Department of Molecular Genetics and Microbiology
  - Center for RNA Biology
  - Department of Medicine, and
  - Center for Virology, Duke University Medical Center, Durham, North Carolina 27710, USA
9
Davis MJ, Shin CJ, Jing N, Ragan MA. Rewiring the dynamic interactome. Molecular BioSystems 2012; 8:2054-66, 2013. [DOI: 10.1039/c2mb25050k] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 11/21/2022]
10
Widespread establishment and regulatory impact of Alu exons in human genes. Proc Natl Acad Sci U S A 2011; 108:2837-42. [PMID: 21282640 DOI: 10.1073/pnas.1012834108] [Citation(s) in RCA: 100] [Impact Index Per Article: 7.7] [Indexed: 11/18/2022] Open
Abstract
The Alu element has been a major source of new exons during primate evolution. Thousands of human genes contain spliced exons derived from Alu elements. However, identifying Alu exons that have acquired genuine biological functions remains a major challenge. We investigated the creation and establishment of Alu exons in human genes, using transcriptome profiles of human tissues generated by high-throughput RNA sequencing (RNA-Seq) combined with extensive RT-PCR analysis. More than 25% of Alu exons analyzed by RNA-Seq have estimated transcript inclusion levels of at least 50% in the human cerebellum, indicating widespread establishment of Alu exons in human genes. Genes encoding zinc finger transcription factors have significantly higher levels of Alu exonization. Importantly, Alu exons with high splicing activities are strongly enriched in the 5'-UTR, and two-thirds (10/15) of 5'-UTR Alu exons tested by luciferase reporter assays significantly alter mRNA translational efficiency. Mutational analysis reveals the specific molecular mechanisms by which newly created 5'-UTR Alu exons modulate translational efficiency, such as the creation or elongation of upstream ORFs that repress the translation of the primary ORFs. This study presents genomic evidence that a major functional consequence of Alu exonization is the lineage-specific evolution of translational regulation. Moreover, the preferential creation and establishment of Alu exons in zinc finger genes suggest that Alu exonization may have globally affected the evolution of primate and human transcriptomes by regulating the protein production of master transcriptional regulators in specific lineages.
11
Zhang Z, Stamm S. Analysis of mutations that influence pre-mRNA splicing. Methods in Molecular Biology (Clifton, N.J.) 2010; 703:137-60. [PMID: 21125488 DOI: 10.1007/978-1-59745-248-9_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Abstract
A rapidly increasing number of human diseases are now recognized as being caused by the selection of wrong splice sites. In most cases, these changes in alternative splice site selection are due to single nucleotide exchanges in splicing regulatory elements. This chapter describes the use of bioinformatics tools to predict the influence of a mutation on alternative pre-mRNA splicing and the experimental testing of these predictions. The bioinformatic analysis determines the influence of a mutation on splicing enhancers and silencers, splice sites and RNA secondary structures. This approach generates hypotheses that are tested using splicing reporter constructs, which are then analyzed in transfection assays. We describe a recombination-based system that allows for the generation of splicing reporter constructs in the first week and their subsequent analysis in the second week.
Affiliation(s)
- Zhaiyi Zhang
  - Department of Molecular and Cellular Biochemistry, Biomedical Biological Sciences Research Building, College of Medicine, University of Kentucky, Lexington, KY, USA.
12
Nygard AB, Jørgensen CB, Cirera S, Fredholm M. Investigation of Tissue-Specific Human Orthologous Alternative Splice Events in Pig. Anim Biotechnol 2010; 21:203-16. [DOI: 10.1080/10495398.2010.497729] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 01/22/2023]
13
Shimada MK, Hayakawa Y, Takeda JI, Gojobori T, Imanishi T. A comprehensive survey of human polymorphisms at conserved splice dinucleotides and its evolutionary relationship with alternative splicing. BMC Evol Biol 2010; 10:122. [PMID: 20433709 PMCID: PMC2882926 DOI: 10.1186/1471-2148-10-122] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 10/30/2009] [Accepted: 04/30/2010] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND: Alternative splicing (AS) is a key molecular process that endows biological functions with diversity and complexity. Generally, functional redundancy leads to the generation of new functions through relaxation of selective pressure in evolution, as exemplified by duplicated genes. It is also known that alternatively spliced exons (ASEs) are subject to relaxed selective pressure. Within consensus sequences at the splice junctions, the most conserved sites are dinucleotides at both ends of introns (splice dinucleotides). However, a small number of single nucleotide polymorphisms (SNPs) occur at splice dinucleotides. An intriguing question relating to the evolution of AS diversity is whether mutations at splice dinucleotides are maintained as polymorphisms and produce diversity in splice patterns within the human population. We therefore surveyed validated SNPs in the database dbSNP located at splice dinucleotides of all human genes that are defined by the H-Invitational Database. RESULTS: We found 212 validated SNPs at splice dinucleotides (sdSNPs); these were confirmed to be consistent with the GT-AG rule at either allele. Moreover, 53 of them were observed to neighbor ASEs (AE dinucleotides). No significant differences were observed between sdSNPs at AE dinucleotides and those at constitutive exons (CE dinucleotides) in SNP properties including average heterozygosity, SNP density, ratio of predicted alleles consistent with the GT-AG rule, and scores of splice sites formed with the predicted allele. We also found that the proportion of non-conserved exons was higher for exons with sdSNPs than for other exons. CONCLUSIONS: sdSNPs are found at CE dinucleotides in addition to those at AE dinucleotides, suggesting two possibilities. First, sdSNPs at CE dinucleotides may be robust against sdSNPs because of unknown mechanisms. Second, similar to sdSNPs at AE dinucleotides, those at CE dinucleotides cause differences in AS patterns because of the arbitrariness in the classification of exons into alternative and constitutive type that varies according to the dataset. Taking into account the absence of differences in sdSNP properties between those at AE and CE dinucleotides, the increased proportion of non-conserved exons found in exons flanked by sdSNPs suggests the hypothesis that sdSNPs are maintained at the splice dinucleotides of newly generated exons at which negative selection pressure is relaxed.
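The check mentioned in the abstract, whether an intron remains consistent with the GT-AG rule under each allele of a SNP at a splice dinucleotide, can be illustrated with a short sketch. This is not the authors' pipeline: the function names, the toy intron sequence, and the 0-based position convention are assumptions made only for this example.

```python
def obeys_gt_ag(intron: str) -> bool:
    """Return True if the intron starts with GT and ends with AG (DNA, 5'->3')."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


def alleles_obeying_rule(intron: str, pos: int, alleles: str) -> dict:
    """Substitute each allele at 0-based position `pos` and test the GT-AG rule.

    `pos` is assumed to fall on a splice dinucleotide, i.e. one of
    positions 0, 1, len(intron) - 2 or len(intron) - 1.
    """
    return {
        allele: obeys_gt_ag(intron[:pos] + allele + intron[pos + 1:])
        for allele in alleles
    }


def consistent_at_either_allele(intron: str, pos: int, alleles: str) -> bool:
    """True if at least one allele keeps the intron consistent with GT-AG."""
    return any(alleles_obeying_rule(intron, pos, alleles).values())


if __name__ == "__main__":
    toy_intron = "GTAAGTCCCTTTCAG"  # hypothetical sequence, for illustration only
    print(alleles_obeying_rule(toy_intron, 1, "TC"))
```

In this toy case the T allele preserves the donor GT while the C allele produces GC, so the SNP is consistent with the rule at one of its two alleles.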
Affiliation(s)
- Makoto K Shimada
  - Biomedicinal Information Research Center, National Institute of Advanced Industrial Science and Technology, 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan
  - Japan Biological Informatics Consortium, 10F TIME24 Building, 2-45 Aomi, Koto-ku, Tokyo 135-0064, Japan
  - Institute for Comprehensive Medical Science, Fujita Health University, 1-98 Dengakugakubo, Kutsukake-cho, Toyoake, Aichi 470-1192, Japan
- Yosuke Hayakawa
  - Japan Biological Informatics Consortium, 10F TIME24 Building, 2-45 Aomi, Koto-ku, Tokyo 135-0064, Japan
  - Hitachi Software Engineering Co., Ltd., 1-1-43 Suehirocho, Tsurumi-ku, Yokohama 230-0045, Japan
- Jun-ichi Takeda
  - Biomedicinal Information Research Center, National Institute of Advanced Industrial Science and Technology, 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan
  - Japan Biological Informatics Consortium, 10F TIME24 Building, 2-45 Aomi, Koto-ku, Tokyo 135-0064, Japan
- Takashi Gojobori
  - Biomedicinal Information Research Center, National Institute of Advanced Industrial Science and Technology, 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan
  - Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, 1111 Yata, Mishima, Shizuoka 411-8540, Japan
- Tadashi Imanishi
  - Biomedicinal Information Research Center, National Institute of Advanced Industrial Science and Technology, 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan
14
Robinson TJ, Dinan MA, Dewhirst M, Garcia-Blanco MA, Pearson JL. SplicerAV: a tool for mining microarray expression data for changes in RNA processing. BMC Bioinformatics 2010; 11:108. [PMID: 20184770 PMCID: PMC2838864 DOI: 10.1186/1471-2105-11-108] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 09/23/2009] [Accepted: 02/25/2010] [Indexed: 12/22/2022] Open
Abstract
Background: Over the past two decades more than fifty thousand unique clinical and biological samples have been assayed using the Affymetrix HG-U133 and HG-U95 GeneChip microarray platforms. This substantial repository has been used extensively to characterize changes in gene expression between biological samples, but has not been previously mined en masse for changes in mRNA processing. We explored the possibility of using HG-U133 microarray data to identify changes in alternative mRNA processing in several available archival datasets. Results: Data from these and other gene expression microarrays can now be mined for changes in transcript isoform abundance using a program described here, SplicerAV. Using in vivo and in vitro breast cancer microarray datasets, SplicerAV was able to perform both gene and isoform specific expression profiling within the same microarray dataset. Our reanalysis of Affymetrix U133 plus 2.0 data generated by in vitro over-expression of HRAS, E2F3, beta-catenin (CTNNB1), SRC, and MYC identified several hundred oncogene-induced mRNA isoform changes, one of which recognized a previously unknown mechanism of EGFR family activation. Using clinical data, SplicerAV predicted 241 isoform changes between low and high grade breast tumors; with changes enriched among genes coding for guanyl-nucleotide exchange factors, metalloprotease inhibitors, and mRNA processing factors. Isoform changes in 15 genes were associated with aggressive cancer across the three breast cancer datasets. Conclusions: Using SplicerAV, we identified several hundred previously uncharacterized isoform changes induced by in vitro oncogene over-expression and revealed a previously unknown mechanism of EGFR activation in human mammary epithelial cells. We analyzed Affymetrix GeneChip data from over 400 human breast tumors in three independent studies, making this the largest clinical dataset analyzed for en masse changes in alternative mRNA processing. The capacity to detect RNA isoform changes in archival microarray data using SplicerAV allowed us to carry out the first analysis of isoform specific mRNA changes directly associated with cancer survival.
Affiliation(s)
- Timothy J Robinson
  - Molecular Cancer Biology Program, Duke University Medical Center, Durham, USA
15
Kanapin AA, Mulder N, Kuznetsov VA. Projection of gene-protein networks to the functional space of the proteome and its application to analysis of organism complexity. BMC Genomics 2010; 11 Suppl 1:S4. [PMID: 20158875 PMCID: PMC2822532 DOI: 10.1186/1471-2164-11-s1-s4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022] Open
Abstract
We consider the problem of biological complexity via a projection of protein-coding genes of complex organisms onto the functional space of the proteome. The latter can be defined as a set of all functions committed by proteins of an organism. Alternative splicing (AS) allows an organism to generate diverse mature RNA transcripts from a single mRNA strand and thus it could be one of the key mechanisms of increasing of functional complexity of the organism's proteome and a driving force of biological evolution. Thus, the projection of transcription units (TU) and alternative splice-variant (SV) forms onto proteome functional space could generate new types of relational networks (e.g. SV-protein function networks, SFN) and lead to discoveries of novel evolutionarily conservative functional modules. Such types of networks might provide new reliable characteristics of organism complexity and a better understanding of the evolutionary integration and plasticity of interconnection of genome-transcriptome-proteome functions.
16
Abstract
The number of known alternative human isoforms has been increasing steadily with the amount of available transcription data. To date, over 100 000 isoforms have been detected in EST libraries, and at least 75% of human genes have at least one alternative isoform. In this paper, we propose that most alternative splicing events are the result of noise in the splicing process. We show that the number of isoforms and their abundance can be predicted by a simple stochastic noise model that takes into account two factors: the number of introns in a gene and the expression level of a gene. The results strongly support the hypothesis that most alternative splicing is a consequence of stochastic noise in the splicing machinery, and has no functional significance. The results are also consistent with error rates tuned to ensure that an adequate level of functional product is produced and to reduce the toxic effect of accumulation of misfolding proteins. Based on simulation of sampling of virtual cDNA libraries, we estimate that error rates range from 1 to 10% depending on the number of introns and the expression level of a gene.
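The abstract names the two inputs of the noise model (intron count and expression level) but gives no formula. The following toy sketch therefore rests on an assumption that is not taken from the paper: each intron is mis-spliced independently with a fixed per-intron error probability. Under that assumption, the share of transcripts carrying at least one splicing error grows with the number of introns, and the expected number of erroneous transcripts scales with expression level. All function and parameter names are hypothetical.

```python
def fraction_misspliced(n_introns: int, per_intron_error: float) -> float:
    """Probability that a transcript contains at least one splicing error,
    assuming independent errors at each intron (an illustrative assumption)."""
    if n_introns < 0 or not 0.0 <= per_intron_error <= 1.0:
        raise ValueError("need n_introns >= 0 and 0 <= per_intron_error <= 1")
    return 1.0 - (1.0 - per_intron_error) ** n_introns


def expected_misspliced(n_transcripts: int, n_introns: int,
                        per_intron_error: float) -> float:
    """Expected count of erroneous transcripts among `n_transcripts` copies."""
    return n_transcripts * fraction_misspliced(n_introns, per_intron_error)


if __name__ == "__main__":
    # Compare a short and a long gene at the same hypothetical error rate.
    for introns in (2, 10, 30):
        print(introns, round(fraction_misspliced(introns, 0.01), 4))
```

Even with a per-intron error rate at the low end of the 1 to 10% range quoted in the abstract, a gene with many introns and high expression would be expected to yield a noticeable number of aberrant transcripts in a deeply sampled library, which is the intuition behind the noise hypothesis.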
Affiliation(s)
- Eugene Melamud
  - Center for Advanced Research in Biotechnology, University of Maryland Biotechnology Institute, 9600 Gudelsky Drive, Rockville, MD 20850, USA.
17
Abstract
Even though nearly every human gene has at least one alternative splice form, very little is so far known about the structure and function of resulting protein products. It is becoming increasingly clear that a significant fraction of all isoforms are products of noisy selection of splice sites and thus contribute little to actual functional diversity, and may potentially be deleterious. In this study, we examine the impact of alternative splicing on protein sequence and structure in three datasets: alternative splicing events conserved across multiple species, alternative splicing events in genes that are strongly linked to disease and all observed alternative splicing events. We find that the vast majority of all alternative isoforms result in unstable protein conformations. In contrast to that, the small subset of isoforms conserved across species tends to maintain protein structural integrity to a greater extent. Alternative splicing in disease-associated genes produces unstable structures just as frequently as all other genes, indicating that selection to reduce the effects of alternative splicing on this set is not especially pronounced. Overall, the properties of alternative spliced proteins are consistent with the outcome of noisy selection of splice sites by splicing machinery.
Affiliation(s)
- Eugene Melamud
  - Center for Advanced Research in Biotechnology, University of Maryland Biotechnology Institute, 9600 Gudelsky Drive, Rockville, MD 20850, USA.
18
Ragan C, Cloonan N, Grimmond SM, Zuker M, Ragan MA. Transcriptome-wide prediction of miRNA targets in human and mouse using FASTH. PLoS One 2009; 4:e5745. [PMID: 19478946 PMCID: PMC2684643 DOI: 10.1371/journal.pone.0005745] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 03/02/2009] [Accepted: 04/29/2009] [Indexed: 12/21/2022] Open
Abstract
Transcriptional regulation by microRNAs (miRNAs) involves complementary base-pairing at target sites on mRNAs, yielding complex secondary structures. Here we introduce an efficient computational approach and software (FASTH) for genome-scale prediction of miRNA target sites based on minimizing the free energy of duplex structure. We apply our approach to identify miRNA target sites in the human and mouse transcriptomes. Our results show that short sequence motifs in the 5′ end of miRNAs frequently match mRNAs perfectly, not only at validated target sites but additionally at many other, energetically favourable sites. High-quality matching regions are abundant and occur at similar frequencies in all mRNA regions, not only the 3′UTR. About one-third of potential miRNA target sites are reassigned to different mRNA regions, or gained or lost altogether, among different transcript isoforms from the same gene. Many potential miRNA target sites predicted in human are not found in mouse, and vice-versa, but among those that do occur in orthologous human and mouse mRNAs most are situated in corresponding mRNA regions, i.e. these sites are themselves orthologous. Using a luciferase assay in HEK293 cells, we validate four of six predicted miRNA-mRNA interactions, with the mRNA level reduced by an average of 73%. We demonstrate that a thermodynamically based computational approach to prediction of miRNA binding sites on mRNAs can be scaled to analyse complete mammalian transcriptome datasets. These results confirm and extend the scope of miRNA-mediated species- and transcript-specific regulation in different cell types, tissues and developmental conditions.
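The observation that short motifs at the 5′ end of a miRNA often match mRNAs perfectly can be illustrated with a simple seed search. This sketch is deliberately not FASTH, which ranks sites by duplex free energy; it only looks for exact Watson-Crick complements of the seed, here assumed to be miRNA nucleotides 2 to 8. The function names and the example sequences are illustrative assumptions.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA string (A, C, G, U only)."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def find_seed_sites(mirna: str, mrna: str,
                    seed_start: int = 1, seed_end: int = 8) -> list:
    """Return 0-based mRNA positions that pair perfectly with the miRNA seed.

    The seed is mirna[seed_start:seed_end]; the default slice covers
    nucleotides 2-8 counted from the miRNA 5' end.
    """
    target = reverse_complement(mirna[seed_start:seed_end])
    mrna = mrna.upper()
    sites = []
    i = mrna.find(target)
    while i != -1:
        sites.append(i)
        i = mrna.find(target, i + 1)  # step by one to allow overlapping sites
    return sites


if __name__ == "__main__":
    mirna = "UGAGGUAGUAGGUUGUAUAGUU"   # illustrative miRNA sequence
    mrna = "AAACUACCUCAAA"             # toy mRNA fragment
    print(find_seed_sites(mirna, mrna))
```

An exact seed search of this kind treats all matches as equal; a thermodynamic approach such as the one described in the abstract additionally scores the whole duplex, which is why it can rank sites outside the 3′UTR as energetically favourable.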
Affiliation(s)
- Chikako Ragan
  - The University of Queensland, Institute for Molecular Bioscience, and ARC Centre of Excellence in Bioinformatics, Brisbane, Australia
- Nicole Cloonan
  - The University of Queensland, Institute for Molecular Bioscience, and ARC Centre of Excellence in Bioinformatics, Brisbane, Australia
- Sean M. Grimmond
  - The University of Queensland, Institute for Molecular Bioscience, and ARC Centre of Excellence in Bioinformatics, Brisbane, Australia
- Michael Zuker
  - Rensselaer Polytechnic Institute, Troy, New York, United States of America
- Mark A. Ragan
  - The University of Queensland, Institute for Molecular Bioscience, and ARC Centre of Excellence in Bioinformatics, Brisbane, Australia
19
A global view of cancer-specific transcript variants by subtractive transcriptome-wide analysis. PLoS One 2009; 4:e4732. [PMID: 19266097 PMCID: PMC2648985 DOI: 10.1371/journal.pone.0004732] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.4] [Received: 11/06/2008] [Accepted: 01/29/2009] [Indexed: 01/18/2023] Open
Abstract
BACKGROUND: Alternative pre-mRNA splicing (AS) plays a central role in generating complex proteomes and influences development and disease. However, the regulation and etiology of AS in human tumorigenesis is not well understood. METHODOLOGY/PRINCIPAL FINDINGS: A Basic Local Alignment Search Tool database was constructed for the expressed sequence tags (ESTs) from all available databases of human cancer and normal tissues. An insertion or deletion in the alignment of EST/EST was used to identify alternatively spliced transcripts. Alignment of the ESTs with the genomic sequence was further used to confirm AS. Alternatively spliced transcripts in each tissue were then subtractively cross-screened to obtain tissue-specific variants. We systematically identified and characterized cancer/tissue-specific and alternatively spliced variants in the human genome based on a global view. We identified 15,093 cancer-specific variants of 9,989 genes from 27 types of human cancers and 14,376 normal tissue-specific variants of 7,240 genes from 35 normal tissues, which cover the main types of human tumors and normal tissues. Approximately 70% of these transcripts are novel. These data were integrated into a database HCSAS (http://202.114.72.39/database/human.html, pass:68756253). Moreover, we observed that the cancer-specific AS of both oncogenes and tumor suppressor genes are associated with specific cancer types. Cancer shows a preference in the selection of alternative splice-sites and utilization of alternative splicing types. CONCLUSIONS/SIGNIFICANCE: These features of human cancer, together with the discovery of huge numbers of novel splice forms for cancer-associated genes, suggest an important and global role of cancer-specific AS during human tumorigenesis. We advise the use of cancer-specific alternative splicing as a potential source of new diagnostic, prognostic, predictive, and therapeutic tools for human cancer. The global view of cancer-specific AS is not only useful for exploring the complexity of the cancer transcriptome but also widens the eyeshot of clinical research.
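The subtractive cross-screening step can be pictured as a set difference: a variant counts as tissue-specific when it appears in one tissue and in none of the others. The sketch below is a minimal illustration of that idea only; it assumes variants have already been reduced to comparable identifiers, and the function name, tissue labels, and variant IDs are hypothetical rather than taken from the study.

```python
def tissue_specific_variants(variants_by_tissue: dict) -> dict:
    """For each tissue, keep only the variants not observed in any other tissue."""
    specific = {}
    for tissue, variants in variants_by_tissue.items():
        # Pool the variants seen in every other tissue.
        others = set().union(
            *(v for t, v in variants_by_tissue.items() if t != tissue)
        )
        specific[tissue] = set(variants) - others
    return specific


if __name__ == "__main__":
    observed = {
        "tumor_A": {"var1", "var2", "var4"},   # hypothetical variant IDs
        "tumor_B": {"var2", "var5"},
        "normal": {"var2", "var3", "var4"},
    }
    print(tissue_specific_variants(observed))
```

A real screen would also need a rule for deciding when two EST-derived variants are the same event; this sketch sidesteps that by assuming identical identifiers.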
20
Tolvanen M, Ojala PJ, Törönen P, Anderson H, Partanen J, Turpeinen H. Interspliced transcription chimeras: neglected pathological mechanism infiltrating gene accession queries? J Biomed Inform 2008; 42:382-9. [PMID: 19041732 DOI: 10.1016/j.jbi.2008.11.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 08/18/2008] [Revised: 11/03/2008] [Accepted: 11/06/2008] [Indexed: 10/21/2022]
Abstract
Over half of the DNA of mammalian genomes is transcribed, and one of the emerging enigmas in the field of RNA research is intergenic splicing or transcription induced chimerism. We argue that fused low-copy-number transcripts constitute neglected pathological mechanism akin to copy number variation, due to loss of stoichiometric subunit ratios in protein complexes. An obstacle for transcriptomics meta-analysis of published microarrays is the traditional nomenclature of merged transcript neighbors under same accession codes. Tandem transcripts cover 4-20% of genomes but are only loosely overlapping in population. They were most enriched in systems medicine annotations concerning neurology, thalassemia and genital disorders in the GeneGo Inc. MetaCore-MetaDrug(TM) knowledgebase, evaluated with external randomizations here. Clinical transcriptomics is good news since new disease etiologies offer new remedies. We identified homeotic HOX-transfactors centered around BMI-1, the Grb2 adaptor network, the kallikrein system, and thalassemia RNA surveillance as vulnerable hotspot chimeras. As a cure, RNA interference would require verification of chimerism from symptomatic tissue contra healthy control tissue from the same patient.
Affiliation(s)
- Martti Tolvanen
  - Institute of Medical Technology, University of Tampere, Finland
21
Shimada MK, Matsumoto R, Hayakawa Y, Sanbonmatsu R, Gough C, Yamaguchi-Kabata Y, Yamasaki C, Imanishi T, Gojobori T. VarySysDB: a human genetic polymorphism database based on all H-InvDB transcripts. Nucleic Acids Res 2008; 37:D810-5. [PMID: 18953038 PMCID: PMC2686441 DOI: 10.1093/nar/gkn798] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 11/14/2022] Open
Abstract
Creation of a vast variety of proteins is accomplished by genetic variation and a variety of alternative splicing transcripts. Currently, however, the abundant available data on genetic variation and the transcriptome are stored independently and in a dispersed fashion. In order to provide a research resource regarding the effects of human genetic polymorphism on various transcripts, we developed VarySysDB, a genetic polymorphism database based on 187,156 extensively annotated matured mRNA transcripts from 36,073 loci provided by H-InvDB. VarySysDB offers information encompassing published human genetic polymorphisms for each of these transcripts separately. This allows comparisons of effects derived from a polymorphism on different transcripts. The published information we analyzed includes single nucleotide polymorphisms and deletion-insertion polymorphisms from dbSNP, copy number variations from Database of Genomic Variants, short tandem repeats and single amino acid repeats from H-InvDB and linkage disequilibrium regions from D-HaploDB. The information can be searched and retrieved by features, functions and effects of polymorphisms, as well as by keywords. VarySysDB combines two kinds of viewers, GBrowse and Sequence View, to facilitate understanding of the positional relationship among polymorphisms, genome, transcripts, loci and functional domains. We expect that VarySysDB will yield useful information on polymorphisms affecting gene expression and phenotypes. VarySysDB is available at http://h-invitational.jp/varygene/.
Affiliation(s)
- Makoto K Shimada
  - Integrated Database and Systems Biology Team, Biomedicinal Information Research Center, National Institute of Advanced Industrial Science and Technology, Japan Biological Informatics Consortium, Hitachi Software Engineering Co., Ltd., Tokyo, Japan
22
Yamaguchi-Kabata Y, Shimada MK, Hayakawa Y, Minoshima S, Chakraborty R, Gojobori T, Imanishi T. Distribution and effects of nonsense polymorphisms in human genes. PLoS One 2008; 3:e3393. [PMID: 18852891 PMCID: PMC2561068 DOI: 10.1371/journal.pone.0003393] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Received: 03/06/2008] [Accepted: 09/03/2008] [Indexed: 11/20/2022] Open
Abstract
Background: A great amount of data has been accumulated on genetic variations in the human genome, but we still do not know much about how the genetic variations affect gene function. In particular, little is known about the distribution of nonsense polymorphisms in human genes despite their drastic effects on gene products. Methodology/Principal Findings: To detect polymorphisms affecting gene function, we analyzed all publicly available polymorphisms in a database for single nucleotide polymorphisms (dbSNP build 125) located in the exons of 36,712 known and predicted protein-coding genes that were defined in an annotation project of all human genes and transcripts (H-InvDB ver3.8). We found a total of 252,555 single nucleotide polymorphisms (SNPs) and 8,479 insertion and deletions in the representative transcripts in these genes. The SNPs located in ORFs include 40,484 synonymous and 53,754 nonsynonymous SNPs, and 1,258 SNPs that were predicted to be nonsense SNPs or read-through SNPs. We estimated the density of nonsense SNPs to be 0.85×10⁻³ per site, which is lower than that of nonsynonymous SNPs (2.1×10⁻³ per site). On average, nonsense SNPs were located 250 codons upstream of the original termination codon, with the substitution occurring most frequently at the first codon position. Of the nonsense SNPs, 581 were predicted to cause nonsense-mediated decay (NMD) of transcripts that would prevent translation. We found that nonsense SNPs causing NMD were more common in genes involving kinase activity and transport. The remaining 602 nonsense SNPs are predicted to produce truncated polypeptides, with an average truncation of 75 amino acids. In addition, 110 read-through SNPs at termination codons were detected. Conclusion/Significance: Our comprehensive exploration of nonsense polymorphisms showed that nonsense SNPs exist at a lower density than nonsynonymous SNPs, suggesting that nonsense mutations have more severe effects than amino acid changes. The correspondence of nonsense SNPs to known pathological variants suggests that phenotypic effects of nonsense SNPs have been reported for only a small fraction of nonsense SNPs, and that nonsense SNPs causing NMD are more likely to be involved in phenotypic variations. These nonsense SNPs may include pathological variants that have not yet been reported. These data are available from Transcript View of H-InvDB and VarySysDB (http://h-invitational.jp/varygene/).
Affiliation(s)
- Yumi Yamaguchi-Kabata
  - Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Makoto K. Shimada
  - Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
  - Japan Biological Information Research Center, Japan Biological Informatics Consortium, Tokyo, Japan
- Yosuke Hayakawa
  - Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
  - Japan Biological Information Research Center, Japan Biological Informatics Consortium, Tokyo, Japan
- Ranajit Chakraborty
  - Center for Genome Information, University of Cincinnati, Cincinnati, Ohio, United States of America
- Takashi Gojobori
  - Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
  - Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Mishima, Shizuoka, Japan
- Tadashi Imanishi
  - Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
23
Takeda JI, Suzuki Y, Sakate R, Sato Y, Seki M, Irie T, Takeuchi N, Ueda T, Nakao M, Sugano S, Gojobori T, Imanishi T. Low conservation and species-specific evolution of alternative splicing in humans and mice: comparative genomics analysis using well-annotated full-length cDNAs. Nucleic Acids Res 2008; 36:6386-95. [PMID: 18838389 PMCID: PMC2582632 DOI: 10.1093/nar/gkn677] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Indexed: 12/20/2022] Open
Abstract
Using full-length cDNA sequences, we compared alternative splicing (AS) in humans and mice. The alignment of the human and mouse genomes showed that 86% of 199 426 total exons in human AS variants were conserved in the mouse genome. Of the 20 392 total human AS variants, however, 59% consisted of all conserved exons. Comparing AS patterns between human and mouse transcripts revealed that only 431 transcripts from 189 loci were perfectly conserved AS variants. To exclude the possibility that the full-length human cDNAs used in the present study, especially those with retained introns, were cloning artefacts or prematurely spliced transcripts, we experimentally validated 34 such cases. Our results indicate that even retained-intron type transcripts are typically expressed in a highly controlled manner and interact with translating ribosomes. We found non-conserved AS exons to be predominantly outside the coding sequences (CDSs). This suggests that non-conserved exons in the CDSs of transcripts cause functional constraint. These findings should enhance our understanding of the relationship between AS and species specificity of human genes.
Affiliation(s)
- Jun-Ichi Takeda
- Integrated Database and Systems Biology Team, Biomedicinal Information Research Center, National Institute of Advanced Industrial Science and Technology, AIST Bio-IT Research Building, Aomi 2-42, Koto-ku, Tokyo, Japan
|
24
|
Floris M, Orsini M, Thanaraj TA. Splice-mediated Variants of Proteins (SpliVaP) - data and characterization of changes in signatures among protein isoforms due to alternative splicing. BMC Genomics 2008; 9:453. [PMID: 18831736 PMCID: PMC2573899 DOI: 10.1186/1471-2164-9-453] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 02/11/2008] [Accepted: 10/02/2008] [Indexed: 12/22/2022] Open
Abstract
BACKGROUND It is often the case that mammalian genes are alternatively spliced; the resulting alternate transcripts often encode protein isoforms that differ in amino acid sequences. Changes among the protein isoforms can alter the cellular properties of proteins. The effect can range from a subtle modulation to a complete loss of function. RESULTS (i) We examined human splice-mediated protein isoforms (as extracted from a manually curated data set, and from a computationally predicted data set) for differences in the annotation for protein signatures (Pfam domains and PRINTS fingerprints) and we characterized the differences and their effects on protein functionalities. An important question addressed relates to the extent of protein isoforms that may lack any known function in the cell. (ii) We present a database that reports differences in protein signatures among human splice-mediated protein isoform sequences. CONCLUSION (i) Characterization: The work points to distinct sets of alternatively spliced genes with varying degrees of annotation for the splice-mediated protein isoforms. Protein molecular functions seen to be often affected are those that relate to: binding, catalytic, transcription regulation, structural molecule, transporter, motor, and antioxidant; and the processes that are often affected are nucleic acid binding, signal transduction, and protein-protein interactions. Signatures are often included/excluded and truncated in length among protein isoforms; truncation is seen as the predominant type of change. Analysis points to the following novel aspects: (a) Analysis using data from the manually curated Vega indicates that one in 8.9 genes can lead to a protein isoform of no "known" function; and one in 18 expressed protein isoforms can be such an "orphan" isoform; the corresponding numbers as seen with the computationally predicted ASD data set are: one in 4.9 genes and one in 9.8 isoforms.
(b) When swapping of signatures occurs, it is often between those of the same functional classification. (c) Pfam domains can occur in varying lengths, and PRINTS fingerprints can occur with varying number of constituent motifs among isoforms - since such a variation is seen in a large number of genes, it could be a general mechanism to modulate protein function. (ii) DATA The reported resource (at http://www.bioinformatica.crs4.org/tools/dbs/splivap/) gives the community access to data on splice-mediated protein isoforms (with value-added annotation such as association with diseases) through changes in protein signatures.
Affiliation(s)
- Matteo Floris
- CRS4-Bioinformatica, Parco Scientifico e Technologico, POLARIS, Edificio 3, 09010 PULA (CA), Sardinia, Italy.
|
25
|
A general definition and nomenclature for alternative splicing events. PLoS Comput Biol 2008; 4:e1000147. [PMID: 18688268 PMCID: PMC2467475 DOI: 10.1371/journal.pcbi.1000147] [Citation(s) in RCA: 175] [Impact Index Per Article: 10.9] [Received: 12/26/2007] [Accepted: 07/01/2008] [Indexed: 11/19/2022] Open
Abstract
Understanding the molecular mechanisms responsible for the regulation of the transcriptome present in eukaryotic cells is one of the most challenging tasks in the postgenomic era. In this regard, alternative splicing (AS) is a key phenomenon contributing to the production of different mature transcripts from the same primary RNA sequence. As a plethora of different transcript forms is available in databases, a first step to uncover the biology that drives AS is to identify the different types of reflected splicing variation. In this work, we present a general definition of the AS event along with a notation system that involves the relative positions of the splice sites. This nomenclature univocally and dynamically assigns a specific “AS code” to every possible pattern of splicing variation. On the basis of this definition and the corresponding codes, we have developed a computational tool (AStalavista) that automatically characterizes the complete landscape of AS events in a given transcript annotation of a genome, thus providing a platform to investigate the transcriptome diversity across genes, chromosomes, and species. Our analysis reveals that a substantial part—in human more than a quarter—of the observed splicing variations are ignored in common classification pipelines. We have used AStalavista to investigate and to compare the AS landscape of different reference annotation sets in human and in other metazoan species and found that proportions of AS events change substantially depending on the annotation protocol, species-specific attributes, and coding constraints acting on the transcripts. The AStalavista system therefore provides a general framework to conduct specific studies investigating the occurrence, impact, and regulation of AS. The genome sequence is said to be an organism's blueprint, a set of instructions driving the organism's biology. 
The unfolding of these instructions—the so-called genes—is initiated by the transcription of DNA into RNA molecules, which subsequently are processed before they can take their functional role. During this processing step, initially identical RNA molecules may result in different products through a process known as alternative splicing (AS). AS therefore allows for widening the diversity from the limited repertoire of genes, and it is often postulated as an explanation for the apparent paradox that complex and simple organisms resemble in their number of genes; it characterizes species, individuals, and developmental and cellular conditions. Comparing the differences of AS products between cells may help to reveal the broad molecular basis underlying phenotypic differences—for instance, between a cancer and a normal cell. An obstacle for such comparisons has been that, so far, no paradigm existed to delineate each single quantum of AS, so-called AS events. Here, we describe a possibility of exhaustively decomposing AS complements into qualitatively different groups of events and a nomenclature to unequivocally denote them. This typological catalogue of AS events along with their observed frequencies represent the AS landscape, and we propose a procedure to automatically identify such landscapes. We use it to describe the human AS landscape and to investigate how it has changed throughout evolution.
|
26
|
Haas BJ, Salzberg SL, Zhu W, Pertea M, Allen JE, Orvis J, White O, Buell CR, Wortman JR. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol 2008; 9:R7. [PMID: 18190707 PMCID: PMC2395244 DOI: 10.1186/gb-2008-9-1-r7] [Citation(s) in RCA: 2056] [Impact Index Per Article: 128.5] [Received: 09/26/2007] [Revised: 12/17/2007] [Accepted: 01/11/2008] [Indexed: 01/16/2023] Open
Abstract
EVidenceModeler (EVM) is presented as an automated eukaryotic gene structure annotation tool that reports eukaryotic gene structures as a weighted consensus of all available evidence. EVM, when combined with the Program to Assemble Spliced Alignments (PASA), yields a comprehensive, configurable annotation system that predicts protein-coding genes and alternatively spliced isoforms. Our experiments on both rice and human genome sequences demonstrate that EVM produces automated gene structure annotation approaching the quality of manual curation.
Affiliation(s)
- Brian J Haas
- J Craig Venter Institute, The Institute for Genomic Research, Rockville, Maryland 20850, USA.
|
27
|
Robin G, Cowieson NP, Guncar G, Forwood JK, Listwan P, Hume DA, Kobe B, Martin JL, Huber T. A general target selection method for crystallographic proteomics. Methods Mol Biol 2008; 426:27-35. [PMID: 18542855 DOI: 10.1007/978-1-60327-058-8_2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/26/2023]
Abstract
Increasing the success in obtaining structures and maximizing the value of the structures determined are the two major goals of target selection in structural proteomics. This chapter presents an efficient and flexible target selection procedure supplemented with a Web-based resource that is suitable for small- to large-scale structural genomics projects that use crystallography as the major means of structure determination. Based on three criteria, biological significance, structural novelty, and "crystallizability," the approach first removes (filters) targets that do not meet minimal criteria and then ranks the remaining targets based on their "crystallizability" estimates. This novel procedure was designed to maximize selection efficiency, and its prevailing criteria categories make it suitable for a broad range of structural proteomics projects.
Affiliation(s)
- Gautier Robin
- Institute for Molecular Bioscience, University of Queensland, Brisbane, Australia
|
28
|
Mangone M, Macmenamin P, Zegar C, Piano F, Gunsalus KC. UTRome.org: a platform for 3'UTR biology in C. elegans. Nucleic Acids Res 2007; 36:D57-62. [PMID: 17986455 PMCID: PMC2238901 DOI: 10.1093/nar/gkm946] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.0] [Indexed: 01/15/2023] Open
Abstract
Three-prime untranslated regions (3′UTRs) are widely recognized as important post-transcriptional regulatory regions of mRNAs. RNA-binding proteins and small non-coding RNAs such as microRNAs (miRNAs) bind to functional elements within 3′UTRs to influence mRNA stability, translation and localization. These interactions play many important roles in development, metabolism and disease. However, even in the most well-annotated metazoan genomes, 3′UTRs and their functional elements are not well defined. Comprehensive and accurate genome-wide annotation of 3′UTRs and their functional elements is thus critical. We have developed an open-access database, available at http://www.UTRome.org, to provide a rich and comprehensive resource for 3′UTR biology in the well-characterized, experimentally tractable model system Caenorhabditis elegans. UTRome.org combines data from public repositories and a large-scale effort we are undertaking to characterize 3′UTRs and their functional elements in C. elegans, including 3′UTR sequences, graphical displays, predicted and validated functional elements, secondary structure predictions and detailed data from our cloning pipeline. UTRome.org will grow substantially over time to encompass individual 3′UTR isoforms for the majority of genes, new and revised functional elements, and in vivo data on 3′UTR function as they become available. The UTRome database thus represents a powerful tool to better understand the biology of 3′UTRs.
Affiliation(s)
- Marco Mangone
- Department of Biology and Center for Genomics and Systems Biology, New York University, 100 Washington Square East, New York, NY 10003, USA
|
29
|
Matsuya A, Sakate R, Kawahara Y, Koyanagi KO, Sato Y, Fujii Y, Yamasaki C, Habara T, Nakaoka H, Todokoro F, Yamaguchi K, Endo T, Oota S, Makalowski W, Ikeo K, Suzuki Y, Hanada K, Hashimoto K, Hirai M, Iwama H, Saitou N, Hiraki AT, Jin L, Kaneko Y, Kanno M, Murakami K, Noda AO, Saichi N, Sanbonmatsu R, Suzuki M, Takeda JI, Tanaka M, Gojobori T, Imanishi T, Itoh T. Evola: Ortholog database of all human genes in H-InvDB with manual curation of phylogenetic trees. Nucleic Acids Res 2007; 36:D787-92. [PMID: 17982176 PMCID: PMC2238928 DOI: 10.1093/nar/gkm878] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Indexed: 11/15/2022] Open
Abstract
Orthologs are genes in different species that evolved from a common ancestral gene by speciation. Currently, with the rapid growth of transcriptome data of various species, more reliable orthology information is a prerequisite for further studies. However, detection of orthologs could be erroneous if pairwise distance-based methods, such as reciprocal BLAST searches, are utilized. Thus, as a sub-database of H-InvDB, an integrated database of annotated human genes (http://h-invitational.jp/), we constructed a fully curated database of evolutionary features of human genes, called ‘Evola’. In the process of ortholog detection, computational analysis based on conserved genome synteny and transcript sequence similarity was followed by manual curation by researchers examining phylogenetic trees. In total, 18 968 human genes have orthologs among 11 vertebrates (chimpanzee, mouse, cow, chicken, zebrafish, etc.), either computationally detected or manually curated. Evola provides amino acid sequence alignments and phylogenetic trees of orthologs and homologs. In ‘dN/dS view’, natural selection on genes can be analyzed between human and other species. In ‘Locus maps’, all transcript variants and their exon/intron structures can be compared among orthologous gene loci. We expect Evola to serve as a comprehensive and reliable database for comparative analyses that yield new knowledge about human genes. Evola is available at http://www.h-invitational.jp/evola/.
Affiliation(s)
- Akihiro Matsuya
- Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics Consortium, Tokyo, Japan
|
30
|
Abstract
Although the number of protein-encoding human genes is more limited than many had estimated, the human transcript repertoire is much more diverse than anticipated. In part, transcript diversity is generated through the use of alternative promoters and alternate splicing. In addition, based on discoveries using technologies such as full-length cDNA libraries and whole genome tiling microarrays, it is now likely that non-protein-encoding transcripts comprise a substantial fraction of the human RNA population. Much attention is currently focused on understanding the role of alternative promoters in generating transcript diversity, both for non-protein-encoding (ncRNAs) and protein-encoding RNAs.
|
31
|
He C, Zuo Z, Chen H, Zhang L, Zhou F, Cheng H, Zhou R. Genome-wide detection of testis- and testicular cancer-specific alternative splicing. Carcinogenesis 2007; 28:2484-90. [PMID: 17724370 DOI: 10.1093/carcin/bgm194] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Indexed: 11/14/2022] Open
Abstract
Alternative pre-messenger RNA (mRNA) splicing is a key molecular event that allows for protein diversity and plays important roles in development and disease. The regulation of alternative pre-mRNA splicing during spermatogenesis and its etiological role in testicular tumorigenesis are yet to be characterized. Here, by genome-wide analysis, we describe features that distinguish the patterns of alternative pre-mRNA splicing among human testis, testicular cancer and mouse testis. Through computationally subtractive analysis, we detected 80 testis-specific transcript candidates in human testis, 175 in human testicular cancer and 262 in mouse testis, which were integrated into a database. Reverse transcription-polymerase chain reaction confirmed that most of these transcript candidates from mouse testis were testis specific. Around 40% of the transcripts were from unknown/hypothetical genes, which were useful for further functional analysis. These transcripts did not overlap, indicating a lack of evolutionary conservation. Further chromosome mapping showed distinct chromosomal preference of alternative pre-mRNA splicing events. Comparative analysis indicated that alternative pre-mRNA splicing in human testicular tumor shared some characteristics and trends with that in mouse testis. Moreover, human testicular tumor tended to use rare splice sites, and the sequences adjacent to dominant splice sites also differed between normal testis and testicular tumor. These special features of alternative pre-mRNA splicing in human testicular tumor suggested that testicular tumorigenesis was involved in multiple steps/levels of alternative splicing events. Alternative splicing appears promising as a source of new clinical diagnostic, prognostic and therapeutic strategies for testicular tumors.
Affiliation(s)
- Chunjiang He
- Department of Genetics and Center for Developmental Biology, College of Life Sciences, Wuhan University, Wuhan 430072, P. R. China
|
32
|
Ralph SA. Subcellular multitasking - multiple destinations and roles for the Plasmodium falcilysin protease. Mol Microbiol 2007; 63:309-13. [PMID: 17241197 DOI: 10.1111/j.1365-2958.2006.05528.x] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/28/2022]
Abstract
The Plasmodium falcilysin protease is an M16-family protease that has been previously identified as a food vacuole enzyme that participates in the breakdown of haemoglobin. Plant homologues of this protease are responsible for breaking down transit peptides that have been processed in mitochondria and plastids, and in this issue of Molecular Microbiology, Ponpuak and colleagues show that falcilysin participates in degradation of transit peptides and haemoglobin in discrete subcellular organelles. The recruitment of a gene product from one cellular compartment to another is a recurring phenomenon in molecular evolutionary biology, and arises through a number of distinct mechanisms. Plasmodium accomplishes this triple act by targeting products of the single falcilysin gene to multiple compartments.
Affiliation(s)
- Stuart A Ralph
- The Walter and Eliza Hall Institute of Medical Research, Melbourne, Victoria 3050, Australia.
|
33
|
Takeda JI, Suzuki Y, Nakao M, Kuroda T, Sugano S, Gojobori T, Imanishi T. H-DBAS: alternative splicing database of completely sequenced and manually annotated full-length cDNAs based on H-Invitational. Nucleic Acids Res 2007; 35:D104-9. [PMID: 17130147 PMCID: PMC1716722 DOI: 10.1093/nar/gkl854] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.1] [Received: 08/15/2006] [Revised: 10/09/2006] [Accepted: 10/10/2006] [Indexed: 11/13/2022] Open
Abstract
The Human-transcriptome DataBase for Alternative Splicing (H-DBAS) is a specialized database of alternatively spliced human transcripts. In this database, each of the alternative splicing (AS) variants corresponds to a completely sequenced and carefully annotated human full-length cDNA, one of those collected for the H-Invitational human-transcriptome annotation meeting. H-DBAS contains 38,664 representative alternative splicing variants (RASVs) in 11,744 loci, in total. The data are retrievable by various manually annotated features of AS, such as AS patterns, the resulting alterations in the encoded amino acids and affected protein motifs, GO terms, predicted subcellular localization signals and transmembrane domains. The database also records recently identified very complex patterns of AS, in which two distinct genes seemed to be bridged, nested or degenerated (multiple CDS): in all three cases, completely unrelated proteins are encoded by a single locus. By using AS Viewer, each AS event can be analyzed in the context of full-length cDNAs, enabling the user's empirical understanding of the relation between an AS event and the consequent alterations in the encoded amino acid sequences together with various kinds of affected protein motifs. H-DBAS is accessible at http://jbirc.jbic.or.jp/h-dbas/.
Affiliation(s)
- Jun-ichi Takeda
- Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics Consortium, AIST Bio-IT Research Building, Aomi 2-42, Koto-ku, Tokyo 135-0064, Japan
- Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, AIST Bio-IT Research Building, Aomi 2-42, Koto-ku, Tokyo 135-0064, Japan
- Yutaka Suzuki
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, the University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8562, Japan
- Mitsuteru Nakao
- Computational Biology Research Center, National Institute of Advanced Science and Technology, AIST Bio-IT Research Building, Aomi 2-42, Koto-ku, Tokyo 135-0064, Japan
- Kazusa DNA Research Institute, 2-6-7 Kazusa-Kamatari, Kisarazu, Chiba 292-0818, Japan
- Tsuyoshi Kuroda
- Maze Corporation, TS Building 1013-20-2 Hatagaya, Shibuya-ku, Tokyo 151-0072, Japan
- Sumio Sugano
- Department of Medical Genome Sciences, Graduate School of Frontier Sciences, the University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8562, Japan
- Takashi Gojobori
- Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, AIST Bio-IT Research Building, Aomi 2-42, Koto-ku, Tokyo 135-0064, Japan
- Center for Information Biology and DDBJ, National Institute of Genetics, 1111 Yata, Mishima, Shizuoka 411-8540, Japan
- Tadashi Imanishi
- Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, AIST Bio-IT Research Building, Aomi 2-42, Koto-ku, Tokyo 135-0064, Japan
- Graduate School of Information Science and Technology, Hokkaido University, North 14, West 9, Kita-ku, Sapporo, Hokkaido 060-0814, Japan
|
34
|
Xing Y, Lee C. Relating alternative splicing to proteome complexity and genome evolution. Adv Exp Med Biol 2007; 623:36-49. [PMID: 18380339 DOI: 10.1007/978-0-387-77374-2_3] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.4] [Indexed: 01/28/2023]
Abstract
Prior to genomics, studies of alternative splicing primarily focused on the function and mechanism of alternative splicing in individual genes and exons. This has changed dramatically since the late 1990s. High-throughput genomics technologies, such as EST sequencing and microarrays designed to detect changes in splicing, led to genome-wide discoveries and quantification of alternative splicing in a wide range of species from human to Arabidopsis. Consensus estimates of AS frequency in the human genome grew from less than 5% in the mid-1990s to as high as 60-74% now. The rapid growth in sequence and microarray data for alternative splicing has made it possible to look into the global impact of alternative splicing on protein function and evolution of genomes. In this chapter, we review recent research on alternative splicing's impact on proteomic complexity and its role in genome evolution.
Affiliation(s)
- Yi Xing
- Department of Internal Medicine, Roy J. and Lucille A. Carver College of Medicine, University of Iowa, Iowa City, USA
|
35
|
del Val C, Kuryshev VY, Glatting KH, Ernst P, Hotz-Wagenblatt A, Poustka A, Suhai S, Wiemann S. CAFTAN: a tool for fast mapping, and quality assessment of cDNAs. BMC Bioinformatics 2006; 7:473. [PMID: 17064411 PMCID: PMC1636072 DOI: 10.1186/1471-2105-7-473] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2006] [Accepted: 10/25/2006] [Indexed: 11/10/2022] Open
Abstract
Background The German cDNA Consortium has been cloning full length cDNAs and continued with their exploitation in protein localization experiments and cellular assays. However, the efficient use of large cDNA resources requires the development of strategies that are capable of a speedy selection of truly useful cDNAs from biological and experimental noise. To this end we have developed a new high-throughput analysis tool, CAFTAN, which simplifies these efforts and thus fills the gap between large-scale cDNA collections and their systematic annotation and application in functional genomics. Results CAFTAN is built around the mapping of cDNAs to the genome assembly, and the subsequent analysis of their genomic context. It uses sequence features like the presence and type of PolyA signals, inner and flanking repeats, the GC-content, splice site types, etc. All these features are evaluated in individual tests and classify cDNAs according to their sequence quality and likelihood to have been generated from fully processed mRNAs. Additionally, CAFTAN compares the coordinates of mapped cDNAs with the genomic coordinates of reference sets from publicly available resources (e.g., VEGA, ENSEMBL). This provides detailed information about overlapping exons and the structural classification of cDNAs with respect to the reference set of splice variants. The evaluation of CAFTAN showed that it is able to correctly classify more than 85% of 5950 selected "known protein-coding" VEGA cDNAs as high quality multi- or single-exon. It identified as good 80.6% of the single exon cDNAs and 85% of the multiple exon cDNAs. The program is written in Perl and in a modular way, allowing the adoption of this strategy to other tasks like EST-annotation, or to extend it by adding new classification rules and new organism databases as they become available. We think that it is a very useful program for the annotation and research of unfinished genomes.
Conclusion CAFTAN is a high-throughput sequence analysis tool, which performs a fast and reliable quality prediction of cDNAs. Several thousands of cDNAs can be analyzed in a short time, giving the curator/scientist a first quick overview about the quality and the already existing annotation of a set of cDNAs. It supports the rejection of low quality cDNAs and helps in the selection of likely novel splice variants, and/or completely novel transcripts for new experiments.
Affiliation(s)
- Coral del Val
- DKFZ, German Cancer Research Center, Division Molecular Biophysics, Im Neuenheimer Feld 580, D-69120 Heidelberg, Germany
- DKFZ, German Cancer Research Center, Division of Molecular Genome Analysis, Im Neuenheimer Feld 580, D-69120 Heidelberg, Germany
- Dept. Computer Science and Artificial Intelligence, ETSI Informatics University of Granada, C/Daniel Saucedo Aranda s/n 18071, Granada, Spain
- Vladimir Yurjevich Kuryshev
- DKFZ, German Cancer Research Center, Division of Molecular Genome Analysis, Im Neuenheimer Feld 580, D-69120 Heidelberg, Germany
- Karl-Heinz Glatting
- DKFZ, German Cancer Research Center, Division Molecular Biophysics, Im Neuenheimer Feld 580, D-69120 Heidelberg, Germany
- Peter Ernst
- DKFZ, German Cancer Research Center, Division Molecular Biophysics, Im Neuenheimer Feld 580, D-69120 Heidelberg, Germany
- Agnes Hotz-Wagenblatt
- DKFZ, German Cancer Research Center, Division Molecular Biophysics, Im Neuenheimer Feld 580, D-69120 Heidelberg, Germany
- Annemarie Poustka
- DKFZ, German Cancer Research Center, Division of Molecular Genome Analysis, Im Neuenheimer Feld 580, D-69120 Heidelberg, Germany
- Sandor Suhai
- DKFZ, German Cancer Research Center, Division Molecular Biophysics, Im Neuenheimer Feld 580, D-69120 Heidelberg, Germany
- Stefan Wiemann
- DKFZ, German Cancer Research Center, Division of Molecular Genome Analysis, Im Neuenheimer Feld 580, D-69120 Heidelberg, Germany
|