1
Brown SJ, Stoilov P, Xing Y. Chromatin and epigenetic regulation of pre-mRNA processing. Hum Mol Genet 2012; 21:R90-6. [PMID: 22936691 DOI: 10.1093/hmg/dds353] [Citation(s) in RCA: 72] [Impact Index Per Article: 6.0] [Indexed: 12/12/2022] Open
Abstract
New data are revealing a complex landscape of gene regulation shaped by chromatin states that extend into the bodies of transcribed genes and associate with distinct RNA elements such as exons, introns and polyadenylation sites. Exons are characterized by increased levels of nucleosome positioning, DNA methylation and certain histone modifications. As pre-mRNA splicing occurs co-transcriptionally, changes in the transcription elongation rate or epigenetic marks can influence exon splicing. These new discoveries broaden our understanding of the epigenetic code and ascribe a novel role for chromatin in controlling pre-mRNA processing. In this review, we summarize the recently discovered interplay between the modulation of chromatin states and pre-mRNA processing, with a particular focus on how these processes communicate with one another to control gene expression.
Affiliation(s)
- Seth J Brown
- Department of Internal Medicine, University of Iowa, Iowa City, IA 52242, USA
2
Coyne RS, Thiagarajan M, Jones KM, Wortman JR, Tallon LJ, Haas BJ, Cassidy-Hanley DM, Wiley EA, Smith JJ, Collins K, Lee SR, Couvillion MT, Liu Y, Garg J, Pearlman RE, Hamilton EP, Orias E, Eisen JA, Methé BA. Refined annotation and assembly of the Tetrahymena thermophila genome sequence through EST analysis, comparative genomic hybridization, and targeted gap closure. BMC Genomics 2008; 9:562. [PMID: 19036158 PMCID: PMC2612030 DOI: 10.1186/1471-2164-9-562] [Citation(s) in RCA: 76] [Impact Index Per Article: 4.8] [Received: 08/08/2008] [Accepted: 11/26/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Tetrahymena thermophila, a widely studied model for cellular and molecular biology, is a binucleated single-celled organism with a germline micronucleus (MIC) and somatic macronucleus (MAC). The recent draft MAC genome assembly revealed low sequence repetitiveness, a result of the epigenetic removal of invasive DNA elements found only in the MIC genome. Such low repetitiveness makes complete closure of the MAC genome a feasible goal, which to achieve would require standard closure methods as well as removal of minor MIC contamination of the MAC genome assembly. Highly accurate preliminary annotation of Tetrahymena's coding potential was hindered by the lack of both comparative genomic sequence information from close relatives and significant amounts of cDNA evidence, thus limiting the value of the genomic information and also leaving unanswered certain questions, such as the frequency of alternative splicing.
RESULTS: We addressed the problem of MIC contamination using comparative genomic hybridization with purified MIC and MAC DNA probes against a whole genome oligonucleotide microarray, allowing the identification of 763 genome scaffolds likely to contain MIC-limited DNA sequences. We also employed standard genome closure methods to essentially finish over 60% of the MAC genome. For the improvement of annotation, we have sequenced and analyzed over 60,000 verified EST reads from a variety of cellular growth and development conditions. Using this EST evidence, a combination of automated and manual reannotation efforts led to updates that affect 16% of the current protein-coding gene models. By comparing EST abundance, many genes showing apparent differential expression between these conditions were identified. Rare instances of alternative splicing and uses of the non-standard amino acid selenocysteine were also identified.
CONCLUSION: We report here significant progress in genome closure and reannotation of Tetrahymena thermophila. Our experience to date suggests that complete closure of the MAC genome is attainable. Using the new EST evidence, automated and manual curation has resulted in substantial improvements to the over 24,000 gene models, which will be valuable to researchers studying this model organism as well as for comparative genomics purposes.
Affiliation(s)
- Robert S Coyne
- J. Craig Venter Institute (formerly The Institute for Genomic Research), 9704 Medical Center Dr., Rockville, MD, USA.
3
Sinha R, Hiller M, Pudimat R, Gausmann U, Platzer M, Backofen R. Improved identification of conserved cassette exons using Bayesian networks. BMC Bioinformatics 2008; 9:477. [PMID: 19014490 PMCID: PMC2621368 DOI: 10.1186/1471-2105-9-477] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 05/28/2008] [Accepted: 11/12/2008] [Indexed: 12/14/2022] Open
Abstract
Background: Alternative splicing is a major contributor to the diversity of eukaryotic transcriptomes and proteomes. Currently, large scale detection of alternative splicing using expressed sequence tags (ESTs) or microarrays does not capture all alternative splicing events. Moreover, for many species genomic data is being produced at a far greater rate than corresponding transcript data, hence in silico methods of predicting alternative splicing have to be improved.
Results: Here, we show that the use of Bayesian networks (BNs) allows accurate prediction of evolutionarily conserved exon skipping events. At a stringent false positive rate of 0.5%, our BN achieves an improved true positive rate of 61%, compared to a previously reported 50% on the same dataset using support vector machines (SVMs). Incorporating several novel discriminative features such as intronic splicing regulatory elements leads to the improvement. Features related to mRNA secondary structure increase the prediction performance, corroborating previous findings that secondary structures are important for exon recognition. Random labelling tests rule out overfitting. Cross-validation on another dataset confirms the increased performance. When using the same dataset and the same set of features, the BN matches the performance of an SVM in earlier literature. Remarkably, we could show that about half of the exons which are labelled constitutive but receive a high probability of being alternative by the BN, are in fact alternative exons according to the latest EST data. Finally, we predict exon skipping without using conservation-based features, and achieve a true positive rate of 29% at a false positive rate of 0.5%.
Conclusion: BNs can be used to achieve accurate identification of alternative exons and provide clues about possible dependencies between relevant features. The near-identical performance of the BN and SVM when using the same features shows that good classification depends more on features than on the choice of classifier. Conservation based features continue to be the most informative, and hence distinguishing alternative exons from constitutive ones without using conservation based features remains a challenging problem.
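The figures quoted in this abstract are true positive rates measured at a fixed false positive rate (0.5%). As an illustration of that metric only, here is a minimal Python sketch; the function name `tpr_at_fpr` and the toy scores are hypothetical and are not taken from the paper, and the paper's own evaluation procedure may differ.

```python
def tpr_at_fpr(scores, labels, max_fpr):
    """Best true positive rate over all thresholds whose FPR is <= max_fpr.

    scores: classifier outputs, higher means "more likely alternative".
    labels: 1 for alternative (positive), 0 for constitutive (negative).
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")

    best_tpr = 0.0
    # Try every distinct score as a threshold (predict positive if score >= threshold).
    for threshold in set(scores):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y != 1)
        if fp / negatives <= max_fpr:
            best_tpr = max(best_tpr, tp / positives)
    return best_tpr


# Hypothetical toy data, not results from the paper.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
toy_labels = [1, 1, 0, 1, 0, 0]
print(tpr_at_fpr(toy_scores, toy_labels, max_fpr=0.0))
```

Fixing the false positive rate first and then reading off the true positive rate makes two classifiers comparable at the same level of stringency, which is how the 61% and 50% figures above are contrasted.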
Affiliation(s)
- Rileen Sinha
- Genome Analysis, Leibniz Institute for Age Research, Fritz Lipmann Institute, Jena, Germany.
4
Chen L, Zheng S. Identify alternative splicing events based on position-specific evolutionary conservation. PLoS One 2008; 3:e2806. [PMID: 18665247 PMCID: PMC2467489 DOI: 10.1371/journal.pone.0002806] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Received: 04/07/2008] [Accepted: 07/07/2008] [Indexed: 11/19/2022] Open
Abstract
The evolution of eukaryotes is accompanied by the increased complexity of alternative splicing, which greatly expands genome information. One of the greatest challenges in the post-genome era is a complete revelation of the human transcriptome with consideration of alternative splicing. Here, we introduce a comparative genomics approach to systematically identify alternative splicing events based on the differential evolutionary conservation between exons and introns and the high-quality annotation of the ENCODE regions. Specifically, we focus on exons that are included in some transcripts but are completely spliced out for others and we call them conditional exons. First, we characterize distinguishing features among conditional exons, constitutive exons and introns. One of the most important features is the position-specific conservation score. There are dramatic differences in conservation scores between conditional exons and constitutive exons. More importantly, the differences are position-specific. For flanking intronic regions, the differences between conditional exons and constitutive exons are also position-specific. Using the Random Forests algorithm, we can classify conditional exons with high specificities (97% for the identification of conditional exons from intron regions and 95% for the classification of known exons) and fair sensitivities (64% and 32%, respectively). We applied the method to the human genome and identified 39,640 introns that actually contain conditional exons and classified 8,813 conditional exons from the current RefSeq exon list. Among those, 31,673 introns containing conditional exons and 5,294 conditional exons classified from known exons cannot be inferred from RefSeq, UCSC or Ensembl annotations. Some of these de novo predictions were experimentally verified.
Affiliation(s)
- Liang Chen
- Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, California, United States of America.
5
Bruedigam C, Koedam M, Chiba H, Eijken M, van Leeuwen JPTM. Evidence for multiple peroxisome proliferator-activated receptor gamma transcripts in bone: fine-tuning by hormonal regulation and mRNA stability. FEBS Lett 2008; 582:1618-24. [PMID: 18435931 DOI: 10.1016/j.febslet.2008.04.012] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Received: 03/17/2008] [Revised: 04/08/2008] [Accepted: 04/11/2008] [Indexed: 12/12/2022]
Abstract
The expression, regulation and functional significance of multiple peroxisome proliferator-activated receptor gamma transcript variants in bone were studied. PPARG transcripts giving rise to PPARγ-1 protein were expressed in human osteoblasts, whereas PPARG-2 transcript and protein remained virtually absent. PPARG expression underwent homologous regulation, was upregulated during differentiation and directly induced by the osteogenic hormone dexamethasone, suggesting a role for PPARγ-1 in osteogenesis. Differences between the stabilities of PPARG-1, -3 and -4 were observed. We hypothesize that cell-specific expression patterns of multiple PPARG transcript variants encoding the same protein but differing in mRNA stabilities enable a fine-tuning of PPARG action, which eventually supports a well-adjusted signal transduction between the cell and its environment.
Affiliation(s)
- Claudia Bruedigam
- Department of Internal Medicine, Erasmus MC, P.O. Box 2040, 3000 CA Rotterdam, The Netherlands
6
Barberan-Soler S, Zahler AM. Alternative splicing regulation during C. elegans development: splicing factors as regulated targets. PLoS Genet 2008; 4:e1000001. [PMID: 18454200 PMCID: PMC2265522 DOI: 10.1371/journal.pgen.1000001] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.4] [Received: 07/16/2007] [Accepted: 01/15/2008] [Indexed: 11/19/2022] Open
Abstract
Alternative splicing generates protein diversity and allows for post-transcriptional gene regulation. Estimates suggest that 10% of the genes in Caenorhabditis elegans undergo alternative splicing. We constructed a splicing-sensitive microarray to detect alternative splicing for 352 cassette exons and tested for changes in alternative splicing of these genes during development. We found that the microarray data predicted that 62/352 (∼18%) of the alternative splicing events studied show a strong change in the relative levels of the spliced isoforms (>4-fold) during development. Confirmation of the microarray data by RT-PCR was obtained for 70% of randomly selected genes tested. Among the genes with the most developmentally regulated alternative splicing was the hnRNP F/H splicing factor homolog, W02D3.11 – now named hrpf-1. For the cassette exon of hrpf-1, the inclusion isoform comprises 65% of hrpf-1 steady state messages in embryos but only 0.1% in the first larval stage. This dramatic change in the alternative splicing of an alternative splicing factor suggests a complex cascade of splicing regulation during development. We analyzed splicing in embryos from a strain with a mutation in the splicing factor sym-2, another hnRNP F/H homolog. We found that approximately half of the genes with large alternative splicing changes between the embryo and L1 stages are regulated by sym-2 in embryos. An analysis of the role of nonsense-mediated decay in regulating steady-state alternative mRNA isoforms was performed. We found that 8% of the 352 events studied have alternative isoforms whose relative steady-state levels in embryos change more than 4-fold in a nonsense-mediated decay mutant, including hrpf-1. Strikingly, 53% of these alternative splicing events that are affected by NMD in our experiment are not obvious substrates for NMD based on the presence of premature termination codons. This suggests that the targeting of splicing factors by NMD may have downstream effects on alternative splicing regulation.
Alternative splicing is a mechanism for generating more than one messenger RNA from a given gene. The alternative transcripts can encode different proteins that share some regions in common but have modified functions, thus increasing the number of proteins encoded by the genome. Alternative splicing can also lead to the production of mRNA isoforms that are then subject to degradation by the nonsense-mediated decay pathway, thus providing a mechanism to down-regulate gene expression without decreasing transcription. Examples of cell type-specific, hormone-responsive, and developmentally-regulated alternative splicing have been described. We decided to measure the extent of developmentally regulated alternative splicing in the nematode model organism Caenorhabditis elegans. We developed a DNA microarray that can measure the alternative splicing of 352 cassette exons simultaneously and used it to probe alternative splicing in RNA extracted from embryos, the four larval stages, and adults. We show that 18% of the alternatively spliced genes tested show >4-fold changes in alternative splicing during development. In addition, we show that one of the most regulated genes is itself a splicing factor, providing support for a model in which a cascade of alternative splicing regulation occurs during development.
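The abstract reports changes in the relative levels of spliced isoforms as fold changes with a 4-fold cutoff, but does not say how the fold change was computed. The sketch below assumes one common definition, the ratio of inclusion-to-skipping ratios between two stages; the function names are hypothetical, and only the 65%, 0.1%, and 62/352 figures come from the abstract.

```python
def inclusion_to_skipping_ratio(inclusion_fraction):
    """Convert the inclusion isoform's share of messages into an inclusion:skipping ratio."""
    if not 0.0 < inclusion_fraction < 1.0:
        raise ValueError("inclusion fraction must be strictly between 0 and 1")
    return inclusion_fraction / (1.0 - inclusion_fraction)


def isoform_fold_change(fraction_a, fraction_b):
    """Fold change in isoform ratio between two samples (assumed definition, always >= 1)."""
    ratio_a = inclusion_to_skipping_ratio(fraction_a)
    ratio_b = inclusion_to_skipping_ratio(fraction_b)
    return max(ratio_a, ratio_b) / min(ratio_a, ratio_b)


# hrpf-1 cassette exon: 65% inclusion in embryos, 0.1% in the first larval stage.
embryo_inclusion = 0.65
l1_inclusion = 0.001
print(isoform_fold_change(embryo_inclusion, l1_inclusion) > 4)

# Share of the 352 events with a strong change, as a percentage.
print(round(100 * 62 / 352, 1))
```

Under this assumed definition the hrpf-1 change is far beyond the 4-fold cutoff, and 62 of 352 events rounds to the roughly 18% quoted in the abstract.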
Affiliation(s)
- Sergio Barberan-Soler
- Department of MCD Biology, University of California Santa Cruz, Santa Cruz, California, United States of America
- Center for Molecular Biology of RNA, University of California Santa Cruz, Santa Cruz, California, United States of America
- Alan M. Zahler
- Department of MCD Biology, University of California Santa Cruz, Santa Cruz, California, United States of America
- Center for Molecular Biology of RNA, University of California Santa Cruz, Santa Cruz, California, United States of America
- * E-mail:
7
Holste D, Ohler U. Strategies for identifying RNA splicing regulatory motifs and predicting alternative splicing events. PLoS Comput Biol 2008; 4:e21. [PMID: 18225947 PMCID: PMC2217580 DOI: 10.1371/journal.pcbi.0040021] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Indexed: 01/09/2023] Open
Affiliation(s)
- Dirk Holste
- * To whom correspondence should be addressed. E-mail: (UO), (DH)
- Uwe Ohler
- * To whom correspondence should be addressed. E-mail: (UO), (DH)
8
Kashyap L, Sharma RK. Alternative splicing: a paradoxical qudo in eukaryotic genomes. Bioinformation 2007; 2:155-6. [PMID: 21670794 PMCID: PMC2255073 DOI: 10.6026/97320630002155] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 12/03/2007] [Revised: 12/08/2007] [Accepted: 12/11/2007] [Indexed: 11/23/2022] Open
Abstract
One of the most remarkable observations stemming from the sequencing of genomes of diverse species is that the number of protein-coding genes in an organism does not correlate with its overall cellular complexity. Alternative splicing, a key mechanism for generating protein complexity, has been suggested as one of the major explanations for this discrepancy between the number of genes and genome complexity. Determining the extent and importance of alternative splicing required the confluence of critical advances in data acquisition, improved understanding of biological processes and the development of fast and accurate computational analysis tools. Although many model organisms have now been completely sequenced, we are still very far from understanding the exact frequency of alternative splicing from these sequenced genomes. This paper will highlight some recent progress and future challenges for functional genomics and bioinformatics in this rapidly developing area.
Affiliation(s)
- Luv Kashyap
- Department of Biochemistry, Faculty of Life Sciences, Aligarh Muslim University, Aligarh, India
- Ravi Kumar Sharma
- Botany Division, Central Drug Research Institute, M G Marg, Lucknow, India
9
Leparc GG, Mitra RD. A sensitive procedure to detect alternatively spliced mRNA in pooled-tissue samples. Nucleic Acids Res 2007; 35:e146. [PMID: 18000005 PMCID: PMC2175357 DOI: 10.1093/nar/gkm989] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/13/2022] Open
Abstract
One important goal of genomics is to explore the extent of alternative splicing in the transcriptome and generate a comprehensive catalog of splice forms. New computational and experimental approaches have led to an increase in the number of predicted alternatively spliced transcripts; however, validation of these predictions has not kept pace. In this work, we systematically explore different methods for the validation of cassette exons predicted by computational methods or tiling microarrays. Our goal was to find a procedure that is cost effective, sensitive and specific. We examined three ways of priming the reverse transcription (RT) reaction—poly-dT priming, random priming and pooled exon-specific priming. We also examined two strategies for PCR amplification—flanking PCR, which uses primers that hybridize to the constitutive exons flanking the predicted exon, and a semi-nested PCR with a primer that targets the predicted exon. We found that the combination of RT using a pool of gene-specific primers followed by semi-nested PCR resulted in a significant increase in sensitivity over the most commonly used methodology (97% of the test set was detected versus 14%). Our method was also highly specific—no false positives were detected using a test set of true negatives. Finally, we demonstrate that this method is able to detect alternative exons with a high sensitivity from whole-organism RNA, allowing all tissues to be sampled in a single experiment. The protocol developed here is an accurate and cost-effective way to validate predictions of alternative splicing.
Affiliation(s)
- Germán Gastón Leparc
- Department of Genetics and Center for Genome Sciences, Washington University in St Louis, 4444 Forest Park Parkway, Campus Box 8510, St Louis, MO 63108, USA