251
Kolle G, Shepherd JL, Gardiner B, Kassahn KS, Cloonan N, Wood DLA, Nourbakhsh E, Taylor DF, Wani S, Chy HS, Zhou Q, McKernan K, Kuersten S, Laslett AL, Grimmond SM. Deep-transcriptome and ribonome sequencing redefines the molecular networks of pluripotency and the extracellular space in human embryonic stem cells. Genome Res 2011; 21:2014-25. [PMID: 22042643 DOI: 10.1101/gr.119321.110] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Indexed: 12/25/2022]
Abstract
Recent RNA-sequencing studies have shown remarkable complexity in the mammalian transcriptome. The ultimate impact of this complexity on the predicted proteomic output is less well defined. We have undertaken strand-specific RNA sequencing of multiple cellular RNA fractions (>20 Gb) to uncover the transcriptional complexity of human embryonic stem cells (hESCs). We have shown that human embryonic stem (ES) cells display a high degree of transcriptional diversity, with more than half of active genes generating RNAs that differ from conventional gene models. We found evidence that more than 1000 genes express long 5' and/or extended 3'UTRs, which was confirmed by "virtual Northern" analysis. Exhaustive sequencing of the membrane-polysome and cytosolic/untranslated fractions of hESCs was used to identify RNAs encoding peptides destined for secretion and the extracellular space and to demonstrate preferential selection of transcription complexity for translation in vitro. The impact of this newly defined complexity on known gene-centric network models such as the Plurinet and the cell surface signaling machinery in human ES cells revealed a significant expansion of known transcript isoforms at play, many predicting possible alternative functions based on sequence alterations within key functional domains.
Affiliation(s)
- Gabriel Kolle
- Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of Queensland, Queensland 4072, Australia
252
Sun Y, Wang Y, Hu Y, Chen G, Ma H. Comparative analysis of neural transcriptomes and functional implication of unannotated intronic expression. BMC Genomics 2011; 12:494. [PMID: 21985610 PMCID: PMC3228559 DOI: 10.1186/1471-2164-12-494] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 04/13/2011] [Accepted: 10/10/2011] [Indexed: 12/13/2022] Open
Abstract
Background: The transcriptome and its regulation bridge the genome and the phenome. Recent RNA-seq studies unveiled complex transcriptomes with previously unknown transcripts and functions. To investigate the characteristics of neural transcriptomes and possible functions of previously unknown transcripts, we analyzed and compared nine recent RNA-seq datasets corresponding to tissues/organs ranging from stem cell, embryonic brain cortex to adult whole brain. Results: We found that the neural and stem cell transcriptomes share global similarity in both gene and chromosomal expression, but are quite different from those of liver or muscle. We also found an unusually high level of unannotated expression in mouse embryonic brains. The intronic unannotated expression was found to be strongly associated with genes annotated for neurogenesis, axon guidance, negative regulation of transcription, and neural transmission. These functions are the hallmarks of the late embryonic stage cortex, and crucial for synaptogenesis and neural circuit formation. Conclusions: Our results revealed unique global and local landscapes of neural transcriptomes. It also suggested potential functional roles for previously unknown transcripts actively expressed in the developing brain cortex. Our findings provide new insights into potentially novel genes, gene functions and regulatory mechanisms in early brain development.
Affiliation(s)
- Yazhou Sun
- Department of Biology, Pennsylvania State University, University Park, PA 16802, USA
253
Saxena A, Carninci P. Long non-coding RNA modifies chromatin: epigenetic silencing by long non-coding RNAs. Bioessays 2011; 33:830-9. [PMID: 21915889 PMCID: PMC3258546 DOI: 10.1002/bies.201100084] [Citation(s) in RCA: 151] [Impact Index Per Article: 11.6] [Received: 06/22/2011] [Revised: 08/09/2011] [Accepted: 08/10/2011] [Indexed: 12/16/2022]
Abstract
Common themes are emerging in the molecular mechanisms of long non-coding RNA-mediated gene repression. Long non-coding RNAs (lncRNAs) participate in targeted gene silencing through chromatin remodelling, nuclear reorganisation, formation of a silencing domain and precise control over the entry of genes into silent compartments. The similarities suggest that these are fundamental processes of transcription regulation governed by lncRNAs. These findings have paved the way for analogous investigations on other lncRNAs and chromatin remodelling enzymes. Here we discuss these common mechanisms and provide our view on other molecules that warrant similar investigations. We also present our concepts on the possible mechanisms that may facilitate the exit of genes from the silencing domains and their potential therapeutic applications. Finally, we point to future areas of research and put forward our recommendations for improvements in resources and applications of existing technologies towards targeted outcomes in this active area of research.
Affiliation(s)
- Alka Saxena
- Omics Science Center, RIKEN Yokohama Institute, 1-7-22 Suehiro Cho, Tsurumi Ku, Yokohama, Kanagawa 230-0045, Japan
254
Knowling S, Morris KV. Non-coding RNA and antisense RNA. Nature's trash or treasure? Biochimie 2011; 93:1922-7. [PMID: 21843589 DOI: 10.1016/j.biochi.2011.07.031] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.4] [Received: 05/31/2011] [Accepted: 07/29/2011] [Indexed: 01/13/2023]
Abstract
Although control of cellular function has classically been considered the responsibility of proteins, research over the last decade has elucidated many roles for RNA in regulation of not only the proteins that control cellular functions but also for the cellular functions themselves. In parallel to this advancement in knowledge about the regulatory roles of RNA there has been an explosion of knowledge about the role that epigenetics plays in controlling not only long-term cellular fate but also the short-term regulatory control of genes. Of particular interest is the crossover between these two worlds, a world where RNA can act out its part and subsequently elicit chromatin modifications that alter cellular function. Two main categories of RNA are examined here, non-coding RNA and antisense RNA both of which perform vital functions in controlling numerous genes, proteins and RNA itself. As the activities of non-coding and antisense RNA in both normal and aberrant cellular function are elucidated, so does the number of possible targets for pharmacopeic intervention.
Affiliation(s)
- Stuart Knowling
- Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 N. Torrey Pines Road, La Jolla, CA 92037, USA.
255
Valen E, Preker P, Andersen PR, Zhao X, Chen Y, Ender C, Dueck A, Meister G, Sandelin A, Jensen TH. Biogenic mechanisms and utilization of small RNAs derived from human protein-coding genes. Nat Struct Mol Biol 2011; 18:1075-82. [PMID: 21822281 DOI: 10.1038/nsmb.2091] [Citation(s) in RCA: 83] [Impact Index Per Article: 6.4] [Received: 02/22/2011] [Accepted: 05/23/2011] [Indexed: 11/09/2022]
Abstract
Efforts to catalog eukaryotic transcripts have uncovered many small RNAs (sRNAs) derived from gene termini and splice sites. Their biogenesis pathways are largely unknown, but a mechanism based on backtracking of RNA polymerase II (RNAPII) has been suggested. By sequencing transcripts 12-100 nucleotides in length from cells depleted of major RNA degradation enzymes and RNAs associated with Argonaute (AGO1/2) effector proteins, we provide mechanistic models for sRNA production. We suggest that neither splice site-associated (SSa) nor transcription start site-associated (TSSa) RNAs arise from RNAPII backtracking. Instead, SSa RNAs are largely degradation products of splicing intermediates, whereas TSSa RNAs probably derive from nascent RNAs protected by stalled RNAPII against nucleolysis. We also reveal new AGO1/2-associated RNAs derived from 3' ends of introns and from mRNA 3' UTRs that appear to draw from noncanonical microRNA biogenesis pathways.
Affiliation(s)
- Eivind Valen
- The Bioinformatics Centre, Department of Biology and the Biotech Research and Innovation Centre (BRIC), University of Copenhagen, Copenhagen, Denmark
256
Abstract
Recent advances in high-throughput sequencing have facilitated the genome-wide studies of small non-coding RNAs (sRNAs). Numerous studies have highlighted the role of various classes of sRNAs at different levels of gene regulation and disease. The fast growth of sequence data and the diversity of sRNA species have prompted the need to organise them in annotation databases. There are currently several databases that collect sRNA data. Various tools are provided for access, with special emphasis on the well-characterised family of micro-RNAs. The striking heterogeneity of the new classes of sRNAs and the lack of sufficient functional annotation, however, make integration of these datasets a difficult task. This review describes the currently available databases for human sRNAs that are accessible via the internet, and some of the large datasets for human sRNAs from high-throughput sequencing experiments that are so far only available as supplementary data in publications. Some of the main issues related to the integration and annotation of sRNA datasets are also discussed.
Affiliation(s)
- Eneritz Agirre
- Department of Computational Genomics, Universitat Pompeu Fabra, Dr. Aiguader 88, E08003 Barcelona, Spain
257
Taft RJ, Hawkins PG, Mattick JS, Morris KV. The relationship between transcription initiation RNAs and CCCTC-binding factor (CTCF) localization. Epigenetics Chromatin 2011; 4:13. [PMID: 21813016 PMCID: PMC3170176 DOI: 10.1186/1756-8935-4-13] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Received: 04/01/2011] [Accepted: 08/03/2011] [Indexed: 01/05/2023] Open
Abstract
BACKGROUND: Transcription initiation RNAs (tiRNAs) are nuclear localized 18 nucleotide RNAs derived from sequences immediately downstream of RNA polymerase II (RNAPII) transcription start sites. Previous reports have shown that tiRNAs are intimately correlated with gene expression, RNA polymerase II binding and behaviors, and epigenetic marks associated with transcription initiation, but not elongation. RESULTS: In the present work, we show that tiRNAs are commonly found at genomic CCCTC-binding factor (CTCF) binding sites in human and mouse, and that CTCF sites that colocalize with RNAPII are highly enriched for tiRNAs. To directly investigate the relationship between tiRNAs and CTCF we examined tiRNAs originating near the intronic CTCF binding site in the human tumor suppressor gene, p21 (cyclin-dependent kinase inhibitor 1A gene, also known as CDKN1A). Inhibition of CTCF-proximal tiRNAs resulted in increased CTCF localization and increased p21 expression, while overexpression of CTCF-proximal tiRNA mimics decreased CTCF localization and p21 expression. We also found that tiRNA-regulated CTCF binding influences the levels of trimethylated H3K27 at the alternate upstream p21 promoter, and affects the levels of alternate p21 (p21alt) transcripts. Extending these studies to another randomly selected locus with conserved CTCF binding we found that depletion of tiRNA alters nucleosome density proximal to sites of tiRNA biogenesis. CONCLUSIONS: Taken together, these data suggest that tiRNAs modulate local epigenetic structure, which in turn regulates CTCF localization.
Affiliation(s)
- Ryan J Taft
- Institute for Molecular Bioscience, University of Queensland, St Lucia, Queensland 4072, Australia
- Peter G Hawkins
- Department of Molecular and Experimental Medicine, The Kellogg School of Science and Technology, The Scripps Research Institute, La Jolla, CA 92037, USA
- The Kellogg School of Science and Technology, The Scripps Research Institute, La Jolla, CA 92037, USA
- John S Mattick
- Institute for Molecular Bioscience, University of Queensland, St Lucia, Queensland 4072, Australia
- Kevin V Morris
- Department of Molecular and Experimental Medicine, The Kellogg School of Science and Technology, The Scripps Research Institute, La Jolla, CA 92037, USA
258
Dinger ME, Gascoigne DK, Mattick JS. The evolution of RNAs with multiple functions. Biochimie 2011; 93:2013-8. [PMID: 21802485 DOI: 10.1016/j.biochi.2011.07.018] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.9] [Received: 05/26/2011] [Accepted: 07/12/2011] [Indexed: 01/21/2023]
Abstract
Increasing numbers of transcripts have been reported to transmit both protein-coding and regulatory information. Apart from challenging our conception of the gene, this observation raises the question as to what extent this phenomenon occurs across the genome and how and why such dual encoding of function has evolved in the eukaryotic genome. To address this question, we consider the evolutionary path of genes in the earliest forms of life on Earth, where it is generally regarded that proteins evolved from a cellular machinery based entirely within RNA. This led to the domination of protein-coding genes in the genomes of microorganisms, although it is likely that RNA never lost its other capacities and functionalities, as evidenced by cis-acting riboswitches and UTRs. On the basis that the subsequent evolution of a more sophisticated regulatory architecture to provide higher levels of epigenetic control and accurate spatiotemporal expression in developmentally complex organisms is a complicated task, we hypothesize: (i) that mRNAs have been and remain subject to secondary selection to provide trans-acting regulatory capability in parallel with protein-coding functions; (ii) that some and perhaps many protein-coding loci, possibly as a consequence of gene duplication, have lost protein-coding functions en route to acquiring more sophisticated trans-regulatory functions; (iii) that many transcripts have become subject to secondary processing to release different products; and (iv) that novel proteins have emerged within loci that previously evolved functionality as regulatory RNAs. In support of the idea that there is a dynamic flux between different types of informational RNAs in both evolutionary and real time, we review recent observations that have arisen from transcriptomic surveys of complex eukaryotes and reconsider how these observations impact on the notion that apparently discrete loci may express transcripts with more than one function. 
In conclusion, we posit that many eukaryotic loci have evolved the capacity to transact a multitude of overlapping and potentially independent functions as both regulatory and protein-coding RNAs.
Affiliation(s)
- Marcel E Dinger
- Institute for Molecular Bioscience, University of Queensland, 306 Carmody Road, St Lucia, QLD 4072, Australia.
259
Abstract
Despite recent controversies, the evidence that the majority of the human genome is transcribed into RNA remains strong.
260
Tisseur M, Kwapisz M, Morillon A. Pervasive transcription - Lessons from yeast. Biochimie 2011; 93:1889-96. [PMID: 21771634 DOI: 10.1016/j.biochi.2011.07.001] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.4] [Received: 05/03/2011] [Accepted: 07/04/2011] [Indexed: 10/18/2022]
Abstract
Pervasive transcription is now accepted to be a general feature of eukaryotic genomes, generating short and long non-coding RNAs (ncRNAs). Growing number of examples have shown that regulatory ncRNAs can control gene expression and chromatin domain formation. In this review, we discuss recent reports that show that Saccharomyces cerevisiae's genome also supports pervasive transcription, which is strongly controlled by RNA decay pathways and nucleosome positioning. We therefore propose that S. cerevisiae is an excellent model for studying large ncRNAs, which has already provided important examples of antisense-mediated transcriptional silencing.
Affiliation(s)
- Mathieu Tisseur
- ncRNA, Epigenetic and Genome Fluidity, Institut Curie, Centre de Recherche, CNRS UMR3244, Université Pierre et Marie Curie, 26 rue d'Ulm, 75248 Paris Cedex 05, France
261
Abstract
Eukaryotic genomes accommodate numerous types of information within diverse DNA and RNA sequence elements. At many loci, these elements overlap and the same sequence is read multiple times during the production, processing, localization, function and turnover of a single transcript. Moreover, two or more transcripts from the same locus might use a common sequence in different ways, to perform distinct biological roles. Recent results show that many transcripts also undergo post-transcriptional cleavage to release specific fragments, which can then function independently. This phenomenon appears remarkably widespread, with even well-documented transcript classes such as messenger RNAs yielding fragments. RNA fragmentation significantly expands the already extraordinary spectrum of transcripts present within eukaryotic cells, and also calls into question how the 'gene' should be defined.
262
Wei W, Pelechano V, Järvelin AI, Steinmetz LM. Functional consequences of bidirectional promoters. Trends Genet 2011; 27:267-76. [PMID: 21601935 PMCID: PMC3123404 DOI: 10.1016/j.tig.2011.04.002] [Citation(s) in RCA: 156] [Impact Index Per Article: 12.0] [Received: 03/13/2011] [Revised: 04/20/2011] [Accepted: 04/20/2011] [Indexed: 02/07/2023]
Abstract
Several studies have shown that promoters of protein-coding genes are origins of pervasive non-coding RNA transcription and can initiate transcription in both directions. However, only recently have researchers begun to elucidate the functional implications of this bidirectionality and non-coding RNA production. Increasing evidence indicates that non-coding transcription at promoters influences the expression of protein-coding genes, revealing a new layer of transcriptional regulation. This regulation acts at multiple levels, from modifying local chromatin to enabling regional signal spreading and more distal regulation. Moreover, the bidirectional activity of a promoter is regulated at multiple points during transcription, giving rise to diverse types of transcripts.
Affiliation(s)
- Lars M. Steinmetz
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
263
Sobala A, Hutvagner G. Transfer RNA-derived fragments: origins, processing, and functions. Wiley Interdiscip Rev RNA 2011; 2:853-62. [PMID: 21976287 DOI: 10.1002/wrna.96] [Citation(s) in RCA: 147] [Impact Index Per Article: 11.3] [Indexed: 12/13/2022]
Abstract
Deep sequencing approaches have revealed multiple types of small RNAs with known and unknown functions. In this review we focus on a recently identified group of small RNAs that are derived from transfer RNAs (tRNAs), tRNA fragments (tRFs). We review the mechanism of their processing and their functions in mammalian cells, and highlight points of possible cross-talk between tRFs and the canonical small RNA pathway characterized by small interfering RNAs (siRNAs), microRNAs (miRNAs), and Piwi-interacting RNAs (piRNAs). We also propose a nomenclature that is based on their processing characteristics.
Affiliation(s)
- Andrew Sobala
- Wellcome Trust Centre for Gene Regulation and Expression, Dundee University, Dundee, UK
264
265
Antisense RNA polymerase II divergent transcripts are P-TEFb dependent and substrates for the RNA exosome. Proc Natl Acad Sci U S A 2011; 108:10460-5. [PMID: 21670248 DOI: 10.1073/pnas.1106630108] [Citation(s) in RCA: 143] [Impact Index Per Article: 11.0] [Indexed: 12/30/2022] Open
Abstract
Divergent transcription occurs at the majority of RNA polymerase II (RNAPII) promoters in mouse embryonic stem cells (mESCs), and this activity correlates with CpG islands. Here we report the characterization of upstream antisense transcription in regions encoding transcription start site associated RNAs (TSSa-RNAs) at four divergent CpG island promoters: Isg20l1, Tcea1, Txn1, and Sf3b1. We find that upstream antisense RNAs (uaRNAs) have distinct capped 5' termini and heterogeneous nonpolyadenylated 3' ends. uaRNAs are short-lived with average half-lives of 18 minutes and are present at 1-4 copies per cell, approximately one RNA per DNA template. Exosome depletion stabilizes uaRNAs. These uaRNAs are probably initiation products because their capped termini correlate with peaks of paused RNAPII. The pausing factors NELF and DSIF are associated with these antisense polymerases and their sense partners. Knockdown of either NELF or DSIF results in an increase in the levels of uaRNAs. Consistent with P-TEFb controlling release from pausing, treatment with its inhibitor, flavopiridol, decreases uaRNA and nascent mRNA transcripts with similar kinetics. Finally, Isg20l1 induction reveals equivalent increases in transcriptional activity in sense and antisense directions. Together these data show divergent polymerases are regulated after P-TEFb recruitment with uaRNA levels controlled by the exosome.
266
Kaikkonen MU, Lam MT, Glass CK. Non-coding RNAs as regulators of gene expression and epigenetics. Cardiovasc Res 2011; 90:430-40. [PMID: 21558279 PMCID: PMC3096308 DOI: 10.1093/cvr/cvr097] [Citation(s) in RCA: 410] [Impact Index Per Article: 31.5] [Received: 11/19/2010] [Revised: 03/24/2011] [Accepted: 04/01/2011] [Indexed: 02/07/2023] Open
Abstract
Genome-wide studies have revealed that mammalian genomes are pervasively transcribed. This has led to the identification and isolation of novel classes of non-coding RNAs (ncRNAs) that influence gene expression by a variety of mechanisms. Here we review the characteristics and functions of regulatory ncRNAs in chromatin remodelling and at multiple levels of transcriptional and post-transcriptional regulation. We also describe the potential roles of ncRNAs in vascular biology and in mediating epigenetic modifications that might play roles in cardiovascular disease susceptibility. The emerging recognition of the diverse functions of ncRNAs in regulation of gene expression suggests that they may represent new targets for therapeutic intervention.
Affiliation(s)
- Minna U. Kaikkonen
- Department of Cellular and Molecular Medicine, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0651, USA
- Department of Biotechnology and Molecular Medicine 1, A.I. Virtanen Institute, University of Eastern Finland, PO Box 1627, 70120 Kuopio, Finland
- Michael T.Y. Lam
- Department of Cellular and Molecular Medicine, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0651, USA
- The Medical Scientist Training Program, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0651, USA
- Christopher K. Glass
- Department of Cellular and Molecular Medicine, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0651, USA
- Department of Medicine, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0651, USA
267
Kanamori-Katayama M, Itoh M, Kawaji H, Lassmann T, Katayama S, Kojima M, Bertin N, Kaiho A, Ninomiya N, Daub CO, Carninci P, Forrest ARR, Hayashizaki Y. Unamplified cap analysis of gene expression on a single-molecule sequencer. Genome Res 2011; 21:1150-9. [PMID: 21596820 DOI: 10.1101/gr.115469.110] [Citation(s) in RCA: 137] [Impact Index Per Article: 10.5] [Indexed: 02/06/2023]
Abstract
We report the development of a simplified cap analysis of gene expression (CAGE) protocol adapted for single-molecule sequencers that avoids second strand synthesis, ligation, digestion, and PCR. HeliScopeCAGE directly sequences the 3' end of cap trapped first-strand cDNAs. As with previous versions of CAGE, we better define transcription start sites (TSS) than known models, identify novel regions of transcription and alternative promoters, and find two major classes of TSS signal, sharp peaks and broad regions. However, using this protocol, we observe reproducible evidence of regulation at the much finer level of individual TSS positions. The libraries are quantitative over 5 orders of magnitude and highly reproducible (Pearson's correlation coefficient of 0.987). We have also scaled down the sample requirement to 5 μg of total RNA for a standard HeliScopeCAGE library and 100 ng for a low-quantity version. When the same RNA was run as 5-μg and 100-ng versions, the 100 ng was still able to detect expression for ∼60% of the 13,468 loci detected by a 5-μg library using the same threshold, allowing comparative analysis of even rare cell populations. Testing the protocol for differential gene expression measurements on triplicate HeLa and THP-1 samples, we find that the log fold change compared to Illumina microarray measurements is highly correlated (0.871). In addition, HeliScopeCAGE finds differential expression for thousands more loci including those with probes on the array. Finally, although the majority of tags are 5' associated, we also observe a low level of signal on exons that is useful for defining gene structures.
268
Min IM, Waterfall JJ, Core LJ, Munroe RJ, Schimenti J, Lis JT. Regulating RNA polymerase pausing and transcription elongation in embryonic stem cells. Genes Dev 2011; 25:742-54. [PMID: 21460038 DOI: 10.1101/gad.2005511] [Citation(s) in RCA: 250] [Impact Index Per Article: 19.2] [Indexed: 12/22/2022]
Abstract
Transitions between pluripotent stem cells and differentiated cells are executed by key transcription regulators. Comparative measurements of RNA polymerase distribution over the genome's primary transcription units in different cell states can identify the genes and steps in the transcription cycle that are regulated during such transitions. To identify the complete transcriptional profiles of RNA polymerases with high sensitivity and resolution, as well as the critical regulated steps upon which regulatory factors act, we used genome-wide nuclear run-on (GRO-seq) to map the density and orientation of transcriptionally engaged RNA polymerases in mouse embryonic stem cells (ESCs) and mouse embryonic fibroblasts (MEFs). In both cell types, progression of a promoter-proximal, paused RNA polymerase II (Pol II) into productive elongation is a rate-limiting step in transcription of ∼40% of mRNA-encoding genes. Importantly, quantitative comparisons between cell types reveal that transcription is controlled frequently at paused Pol II's entry into elongation. Furthermore, "bivalent" ESC genes (exhibiting both active and repressive histone modifications) bound by Polycomb group complexes PRC1 (Polycomb-repressive complex 1) and PRC2 show dramatically reduced levels of paused Pol II at promoters relative to an average gene. In contrast, bivalent promoters bound by only PRC2 allow Pol II pausing, but it is confined to extremely 5' proximal regions. Altogether, these findings identify rate-limiting targets for transcription regulation during cell differentiation.
Affiliation(s)
- Irene M Min
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA
269
Khaitan D, Dinger ME, Mazar J, Crawford J, Smith MA, Mattick JS, Perera RJ. The melanoma-upregulated long noncoding RNA SPRY4-IT1 modulates apoptosis and invasion. Cancer Res 2011; 71:3852-62. [PMID: 21558391 DOI: 10.1158/0008-5472.can-10-4460] [Citation(s) in RCA: 375] [Impact Index Per Article: 28.8] [Indexed: 12/17/2022]
Abstract
The identification of cancer-associated long noncoding RNAs (lncRNAs) and the investigation of their molecular and biological functions are important to understand the molecular biology of cancer and its progression. Although the functions of lncRNAs and the mechanisms regulating their expression are largely unknown, recent studies are beginning to unravel their importance in human health and disease. Here, we report that a number of lncRNAs are differentially expressed in melanoma cell lines in comparison to melanocytes and keratinocyte controls. One of these lncRNAs, SPRY4-IT1 (GenBank accession ID AK024556), is derived from an intron of the SPRY4 gene and is predicted to contain several long hairpins in its secondary structure. RNA-FISH analysis showed that SPRY4-IT1 is predominantly localized in the cytoplasm of melanoma cells, and SPRY4-IT1 RNAi knockdown results in defects in cell growth, differentiation, and higher rates of apoptosis in melanoma cell lines. Differential expression of both SPRY4 and SPRY4-IT1 was also detected in vivo, in 30 distinct patient samples, classified as primary in situ, regional metastatic, distant metastatic, and nodal metastatic melanoma. The elevated expression of SPRY4-IT1 in melanoma cells compared to melanocytes, its accumulation in cell cytoplasm, and effects on cell dynamics, including increased rate of wound closure on SPRY4-IT1 overexpression, suggest that the higher expression of SPRY4-IT1 may have an important role in the molecular etiology of human melanoma.
Affiliation(s)
- Divya Khaitan
- Sanford Burnham Medical Research Institute, Orlando, Florida 32827, USA
270
Mattick JS. The central role of RNA in human development and cognition. FEBS Lett 2011; 585:1600-16. [DOI: 10.1016/j.febslet.2011.05.001] [Citation(s) in RCA: 149] [Impact Index Per Article: 11.5] [Received: 04/29/2011] [Accepted: 05/03/2011] [Indexed: 12/22/2022]
|
271
|
Hansen TB, Kjems J, Bramsen JB. Enhancing miRNA annotation confidence in miRBase by continuous cross dataset analysis. RNA Biol 2011; 8:378-83. [PMID: 21558790 DOI: 10.4161/rna.8.3.14333] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022] Open
Abstract
The immaculate annotation of all microRNAs (miRNAs) is a prerequisite to study their biological function on a genome-wide scale. However, the original criteria for proper miRNA annotation seem unsuited for the automated analysis of the immense number of small RNA reads available in next generation sequencing (NGS) datasets. Here we analyze the confidence of past miRNA annotation in miRBase by cross-analyzing publicly available NGS datasets using strengthened annotation requirements. Our analysis highlights that a large number of annotated human miRNAs in miRBase seem to require more experimental validation to be confidently annotated. Notably, our dataset analysis also identified almost 300 currently non-annotated miRNA*s and 28 novel miRNAs. These observations hereby greatly increase the confidence of past miRNA annotation in miRBase but also illustrate the usefulness of continuously re-evaluating NGS datasets in the identification of novel miRNAs.
Affiliation(s)
- Thomas B Hansen
- Department of Molecular Biology, Interdisciplinary Nanoscience Center (iNANO), Aarhus University, Aarhus, Denmark.
|
272
|
Gibb EA, Brown CJ, Lam WL. The functional role of long non-coding RNA in human carcinomas. Mol Cancer 2011; 10:38. [PMID: 21489289 PMCID: PMC3098824 DOI: 10.1186/1476-4598-10-38] [Citation(s) in RCA: 1314] [Impact Index Per Article: 101.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2011] [Accepted: 04/13/2011] [Indexed: 12/15/2022] Open
Abstract
Long non-coding RNAs (lncRNAs) are emerging as new players in the cancer paradigm, demonstrating potential roles in both oncogenic and tumor suppressive pathways. These novel genes are frequently aberrantly expressed in a variety of human cancers; however, the biological functions of the vast majority remain unknown. Recently, evidence has begun to accumulate describing the molecular mechanisms by which these RNA species function, providing insight into the functional roles they may play in tumorigenesis. In this review, we highlight the emerging functional role of lncRNAs in human cancer.
Affiliation(s)
- Ewan A Gibb
- British Columbia Cancer Agency Research Centre, Vancouver, Canada.
|
273
|
Genome-wide analysis of the 5' and 3' ends of vaccinia virus early mRNAs delineates regulatory sequences of annotated and anomalous transcripts. J Virol 2011; 85:5897-909. [PMID: 21490097 DOI: 10.1128/jvi.00428-11] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022] Open
Abstract
Poxviruses are large DNA viruses that encode a multisubunit RNA polymerase, stage-specific transcription factors, and enzymes that cap and polyadenylate mRNAs within the cytoplasm of infected animal cells. Genome-wide microarray and RNA-seq technologies have been used to profile the transcriptome of vaccinia virus (VACV), the prototype member of the family. Here, we adapted tag-based methods in conjunction with SOLiD and Illumina deep sequencing platforms to determine the precise 5' and 3' ends of VACV early mRNAs and map the putative transcription start sites (TSSs) and polyadenylation sites (PASs). Individual and clustered TSSs were found preceding 104 annotated open reading frames (ORFs), excluding pseudogenes. In the majority of cases, a 15-nucleotide consensus core motif was present upstream of the ORF. This motif, however, was also present at numerous other locations, indicating that it was insufficient for transcription initiation. Further analysis revealed a 10-nucleotide AT-rich spacer following functional core motifs that may facilitate DNA unwinding. Additional putative TSSs occurred in anomalous locations that may expand the functional repertoire of the VACV genome. However, many of the anomalous TSSs lacked an upstream core motif, raising the possibility that they arose by a processing mechanism as has been proposed for eukaryotic systems. Discrete and clustered PASs occurred about 40 nucleotides after an UUUUUNU termination signal. However, a large number of PASs were not preceded by this motif, suggesting alternative polyadenylation mechanisms. Pyrimidine-rich coding strand sequences were found immediately upstream of both types of PASs, signifying an additional feature of VACV 3'-end formation and polyadenylation.
|
274
|
Wery M, Kwapisz M, Morillon A. Noncoding RNAs in gene regulation. WILEY INTERDISCIPLINARY REVIEWS-SYSTEMS BIOLOGY AND MEDICINE 2011; 3:728-38. [PMID: 21381218 DOI: 10.1002/wsbm.148] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/11/2023]
Abstract
RNAs have been traditionally viewed as intermediates between DNA and proteins. However, there is a growing body of literature indicating that noncoding RNAs (ncRNAs) are key players for gene regulation, genome stability, and chromatin modification. In addition to the well-known small interfering RNAs and microRNAs acting in transcriptional and posttranscriptional gene silencing, recent advances in the field of transcriptome exploration have revealed novel sets of new small and large ncRNAs. Many of them appear to be conserved across mammals, and abnormal expression of several ncRNAs has been linked to a wide variety of human diseases, such as cancer. Here, we review the different classes of ncRNAs identified to date, in yeast and mammals, and we discuss the mechanisms by which they affect gene regulation.
Affiliation(s)
- Maxime Wery
- Institut Curie, Centre de Recherche, Paris, France
|
275
|
Euskirchen GM, Auerbach RK, Davidov E, Gianoulis TA, Zhong G, Rozowsky J, Bhardwaj N, Gerstein MB, Snyder M. Diverse roles and interactions of the SWI/SNF chromatin remodeling complex revealed using global approaches. PLoS Genet 2011; 7:e1002008. [PMID: 21408204 PMCID: PMC3048368 DOI: 10.1371/journal.pgen.1002008] [Citation(s) in RCA: 172] [Impact Index Per Article: 13.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/04/2010] [Accepted: 01/04/2011] [Indexed: 12/22/2022] Open
Abstract
A systems understanding of nuclear organization and events is critical for determining how cells divide, differentiate, and respond to stimuli and for identifying the causes of diseases. Chromatin remodeling complexes such as SWI/SNF have been implicated in a wide variety of cellular processes including gene expression, nuclear organization, centromere function, and chromosomal stability, and mutations in SWI/SNF components have been linked to several types of cancer. To better understand the biological processes in which chromatin remodeling proteins participate, we globally mapped binding regions for several components of the SWI/SNF complex throughout the human genome using ChIP-Seq. SWI/SNF components were found to lie near regulatory elements integral to transcription (e.g. 5′ ends, RNA Polymerases II and III, and enhancers) as well as regions critical for chromosome organization (e.g. CTCF, lamins, and DNA replication origins). Interestingly we also find that certain configurations of SWI/SNF subunits are associated with transcripts that have higher levels of expression, whereas other configurations of SWI/SNF factors are associated with transcripts that have lower levels of expression. To further elucidate the association of SWI/SNF subunits with each other as well as with other nuclear proteins, we also analyzed SWI/SNF immunoprecipitated complexes by mass spectrometry. Individual SWI/SNF factors are associated with their own family members, as well as with cellular constituents such as nuclear matrix proteins, key transcription factors, and centromere components, implying a ubiquitous role in gene regulation and nuclear function. We find an overrepresentation of both SWI/SNF-associated regions and proteins in cell cycle and chromosome organization. 
Taken together the results from our ChIP and immunoprecipitation experiments suggest that SWI/SNF facilitates gene regulation and genome function more broadly and through a greater diversity of interactions than previously appreciated.
Genetic information and programming are not entirely contained in DNA sequence but are also governed by chromatin structure. Gaining a greater understanding of chromatin remodeling complexes can bridge gaps between processes in the genome and the epigenome and can offer insights into diseases such as cancer. We identified targets of the chromatin remodeling complex, SWI/SNF, on a genome-wide scale using ChIP-Seq. We also identify proteins that co-purify with its various components via immunoprecipitation combined with mass spectrometry. By integrating these newly-identified regions with a combination of novel and published data sources, we identify pathways and cellular compartments in which SWI/SNF plays a major role as well as discern general characteristics of SWI/SNF target sites. Our parallel evaluations of multiple SWI/SNF factors indicate that these subunits are found in highly dynamic and combinatorial assemblies. Our study presents the first genome-wide and unified view of multiple SWI/SNF components and also provides a valuable resource to the scientific community as an important data source to be integrated with future genomic and epigenomic studies.
Affiliation(s)
- Ghia M. Euskirchen
- Department of Genetics, Stanford University School of Medicine, Stanford, California, United States of America
- Raymond K. Auerbach
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Eugene Davidov
- PerkinElmer, Shelton, Connecticut, United States of America
- Tara A. Gianoulis
- Department of Genetics and Wyss Institute for Bio-Inspired Engineering, Harvard Medical School, Boston, Massachusetts, United States of America
- Guoneng Zhong
- Yale Center for Medical Informatics, Yale University, New Haven, Connecticut, United States of America
- Joel Rozowsky
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, United States of America
- Nitin Bhardwaj
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Mark B. Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, United States of America
- Michael Snyder
- Department of Genetics, Stanford University School of Medicine, Stanford, California, United States of America
|
276
|
Yamashita R, Sathira NP, Kanai A, Tanimoto K, Arauchi T, Tanaka Y, Hashimoto SI, Sugano S, Nakai K, Suzuki Y. Genome-wide characterization of transcriptional start sites in humans by integrative transcriptome analysis. Genome Res 2011; 21:775-89. [PMID: 21372179 DOI: 10.1101/gr.110254.110] [Citation(s) in RCA: 100] [Impact Index Per Article: 7.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022]
Abstract
We performed a genome-wide analysis of transcriptional start sites (TSSs) in human genes by multifaceted use of a massively parallel sequencer. By analyzing 800 million sequences that were obtained from various types of transcriptome analyses, we characterized 140 million TSS tags in 12 human cell types. Despite the large number of TSS clusters (TSCs), the number of TSCs was observed to decrease sharply with increasing expression levels. Highly expressed TSCs exhibited several characteristic features: Nucleosome-seq analysis revealed highly ordered nucleosome structures, ChIP-seq analysis detected clear RNA polymerase II binding signals in their surrounding regions, evaluations of previously sequenced and newly shotgun-sequenced complete cDNA sequences showed that they encode preferable transcripts for protein translation, and RNA-seq analysis of polysome-incorporated RNAs yielded direct evidence that those transcripts are actually translated into proteins. We also demonstrate that integrative interpretation of transcriptome data is essential for the selection of putative alternative promoter TSCs, two of which also have protein consequences. Furthermore, discriminative chromatin features that separate TSCs at different expression levels were found for both genic TSCs and intergenic TSCs. The collected integrative information should provide a useful basis for future biological characterization of TSCs.
Affiliation(s)
- Riu Yamashita
- Frontier Research Initiative, Institute of Medical Science, The University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639, Japan
|
277
|
Merkel A, Guigó R. Review of ‘Cap-analysis gene expression’. Bioessays 2011. [DOI: 10.1002/bies.201000144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
|
278
|
McCormick KP, Willmann MR, Meyers BC. Experimental design, preprocessing, normalization and differential expression analysis of small RNA sequencing experiments. SILENCE 2011; 2:2. [PMID: 21356093 PMCID: PMC3055805 DOI: 10.1186/1758-907x-2-2] [Citation(s) in RCA: 78] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/22/2010] [Accepted: 02/28/2011] [Indexed: 01/30/2023]
Abstract
Prior to the advent of new, deep sequencing methods, small RNA (sRNA) discovery was dependent on Sanger sequencing, which was time-consuming and limited knowledge to only the most abundant sRNA. The innovation of large-scale, next-generation sequencing has exponentially increased knowledge of the biology, diversity and abundance of sRNA populations. In this review, we discuss issues involved in the design of sRNA sequencing experiments, including choosing a sequencing platform, inherent biases that affect sRNA measurements and replication. We outline the steps involved in preprocessing sRNA sequencing data and review both the principles behind and the current options for normalization. Finally, we discuss differential expression analysis in the absence and presence of biological replicates. While our focus is on sRNA sequencing experiments, many of the principles discussed are applicable to the sequencing of other RNA populations.
Affiliation(s)
- Kevin P McCormick
- Department of Plant and Soil Sciences and Delaware Biotechnology Institute, University of Delaware, Newark, DE 19711, USA
- Matthew R Willmann
- Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA
- Blake C Meyers
- Department of Plant and Soil Sciences and Delaware Biotechnology Institute, University of Delaware, Newark, DE 19711, USA
|
279
|
Khalil AM, Rinn JL. RNA-protein interactions in human health and disease. Semin Cell Dev Biol 2011; 22:359-65. [PMID: 21333748 DOI: 10.1016/j.semcdb.2011.02.016] [Citation(s) in RCA: 104] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2010] [Accepted: 02/11/2011] [Indexed: 11/19/2022]
Abstract
It is now clear that the genomes of many organisms encode thousands of large and small non-coding (nc)RNAs. However, relative to the discovery of ncRNAs the functions and mechanisms of ncRNAs remain disproportionately understood. One intriguing observation is that many ncRNAs are found to be associated with protein complexes including those involved in transcription regulation, post-transcriptional silencing, and epigenetic regulation. These observations suggest that the functions and mechanisms of many of these ncRNAs may depend on their interactions with various protein complexes within the cell. In this review we discuss well known examples as well as newly emerging evidence of widespread RNA-protein interactions in distinct biological processes in a wide range of organisms, and highlight the importance of developing new technologies to dissect these interactions. Finally, we propose that mis-regulation of ncRNA interactions with their protein partners may contribute to human disease, and open up a novel approach to therapeutic interventions.
Affiliation(s)
- Ahmad M Khalil
- The Broad Institute of Harvard and MIT, Cambridge, MA, USA.
|
280
|
Abstract
New DNA sequencing technologies have provided novel insights into eukaryotic genomes, epigenomes, and the transcriptome, including the identification of new non-coding RNA (ncRNA) classes such as promoter-associated RNAs and long RNAs. Moreover, it is now clear that up to 90% of eukaryotic genomes are transcribed, generating an extraordinary range of RNAs with no coding capacity. Taken together, these new discoveries are modifying the status quo in genomic science by demonstrating that the eukaryotic gene pool is divided into two distinct categories of transcripts: protein-coding and non-coding. The function of the majority of ncRNAs produced by the transcriptome is largely unknown; however, it is probable that many are associated with epigenetic mechanisms. The purpose of this review is to describe the most recent discoveries in the ncRNA field that implicate these molecules as key players in the epigenome.
Affiliation(s)
- Fabrício F Costa
- Cancer Biology and Epigenomics Program, Children's Memorial Research Center and Northwestern University's Feinberg School of Medicine, 2300 Children's Plaza, Chicago, IL, USA.
|
281
|
Ozsolak F, Kapranov P, Foissac S, Kim SW, Fishilevich E, Monaghan AP, John B, Milos PM. Comprehensive polyadenylation site maps in yeast and human reveal pervasive alternative polyadenylation. Cell 2011; 143:1018-29. [PMID: 21145465 DOI: 10.1016/j.cell.2010.11.020] [Citation(s) in RCA: 316] [Impact Index Per Article: 24.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2010] [Revised: 09/28/2010] [Accepted: 11/09/2010] [Indexed: 01/12/2023]
Abstract
The emerging discoveries on the link between polyadenylation and disease states underline the need to fully characterize genome-wide polyadenylation states. Here, we report comprehensive maps of global polyadenylation events in human and yeast generated using refinements to the Direct RNA Sequencing technology. This direct approach provides a quantitative view of genome-wide polyadenylation states in a strand-specific manner and requires only attomole RNA quantities. The polyadenylation profiles revealed an abundance of unannotated polyadenylation sites, alternative polyadenylation patterns, and regulatory element-associated poly(A)(+) RNAs. We observed differences in sequence composition surrounding canonical and noncanonical human polyadenylation sites, suggesting novel noncoding RNA-specific polyadenylation mechanisms in humans. Furthermore, we observed the correlation level between sense and antisense transcripts to depend on gene expression levels, supporting the view that overlapping transcription from opposite strands may play a regulatory role. Our data provide a comprehensive view of the polyadenylation state and overlapping transcription.
Affiliation(s)
- Fatih Ozsolak
- Helicos BioSciences Corporation, Cambridge, MA 02139, USA.
|
282
|
Abstract
The application of new and less biased methods to study the transcriptional output from genomes, such as tiling arrays and deep sequencing, has revealed that most of the genome is transcribed and that there is substantial overlap of transcripts derived from the two strands of DNA. In protein coding regions, the map of transcripts is very complex due to small transcripts from the flanking ends of the transcription unit, the use of multiple start and stop sites for the main transcript, production of multiple functional RNA molecules from the same primary transcript, and RNA molecules made by independent transcription from within the unit. In genomic regions separating those that encode proteins or highly abundant RNA molecules with known function, transcripts are generally of low abundance and short-lived. In most of these cases, it is unclear to what extent a function is related to transcription per se or to the RNA products.
Affiliation(s)
- Henrik Nielsen
- Department of Cellular and Molecular Medicine, The Panum Institute, University of Copenhagen, Copenhagen, Denmark.
|
283
|
Knowling S, Morris KV. Epigenetic regulation of gene expression in human cells by noncoding RNAs. PROGRESS IN MOLECULAR BIOLOGY AND TRANSLATIONAL SCIENCE 2011; 102:1-10. [PMID: 21846567 DOI: 10.1016/b978-0-12-415795-8.00003-9] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/02/2022]
Abstract
Emerging evidence has begun to suggest that a vast array of noncoding RNAs is operative in human cells, with some containing the ability to directly modulate gene transcription. While observations of noncoding-RNA-based epigenetic regulation of gene expression were in the past relegated to imprinted or X-linked genes, it is now becoming apparent that several different genes in differentiated cells may be under some form of RNA-based regulatory control. Studies have begun to discern certain aspects of an underlying mechanism of action whereby noncoding RNAs modulate gene transcription. Much of the evidence suggests that noncoding RNAs are functional in controlling gene transcription by the targeted recruitment of epigenetic silencing complexes to homology-containing loci in the genome. The results of these studies, as well as the implications that a vast array of noncoding-RNA-based regulatory networks may be operative in human cells, are discussed. Knowledge of this emerging RNA-based epigenetic regulatory network has implications in cellular evolution as well as in an entirely new area of pharmacopeia.
Affiliation(s)
- Stuart Knowling
- Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, California, USA
|
284
|
Burroughs AM, Ando Y, de Hoon MJL, Tomaru Y, Suzuki H, Hayashizaki Y, Daub CO. Deep-sequencing of human Argonaute-associated small RNAs provides insight into miRNA sorting and reveals Argonaute association with RNA fragments of diverse origin. RNA Biol 2011; 8:158-77. [PMID: 21282978 DOI: 10.4161/rna.8.1.14300] [Citation(s) in RCA: 229] [Impact Index Per Article: 17.6] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022] Open
Abstract
While several studies have focused on the relationship between individual miRNA loci or classes of small RNA with human Argonaute (AGO) proteins, a comprehensive, global analysis of the RNA content associating with different AGO proteins has yet to be performed. We have compared the content of deep sequenced RNA extracted from immunoprecipitation experiments with the AGO1, AGO2, and AGO3 proteins. Consistent with previous observations, sequence tags derived from miRNA loci globally associate in approximately equivalent amounts with AGO1, AGO2, and AGO3. Exceptions include miR-182, miR-222, and miR-223*, which could be coupled to processes targeting the loci for interaction with specific AGO proteins. A closer inspection of the data, however, supports the presence of an unusual sorting mechanism wherein a subset of miRNA loci give rise to distinct isomirs which preferentially associate with distinct AGO proteins in a significantly differential manner. We also identify the complete set of short RNA derived from non-miRNA sources including tRNA, snRNA, snoRNA, vRNA, and mRNA associating with the AGO proteins, many of which are predicted to play roles in post-transcriptional gene silencing. We also observe enrichment of tags mapping to promoter regions of genes, suggesting that a fraction of the recently-identified promoter-associated small RNAs in humans could function through interaction with AGO proteins. Finally, we observe antisense miRNA transcripts are frequently present in low copy numbers across a range of diverse miRNA loci and these transcripts appear to associate with AGO proteins.
|
285
|
Saxena A, Carninci P. Whole transcriptome analysis: what are we still missing? WILEY INTERDISCIPLINARY REVIEWS-SYSTEMS BIOLOGY AND MEDICINE 2010; 3:527-43. [PMID: 21197667 DOI: 10.1002/wsbm.135] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]
Abstract
New technologies such as tag-based sequencing and tiling arrays have provided unique insights into the transcriptional output of cells. Many new RNA classes have been uncovered in the past decade, despite limitations in current technologies. Even as the repertoire of known functional elements of the transcriptome increases and contemporary technologies become mainstream, inadequacies in conventional protocols for library preparation, sequencing and mapping continue to hamper revelation of the entire transcriptome of cells. In this article, we review current protocols and outline their deficiencies. We also provide our view on what we may be overlooking in the transcriptome, despite exhaustive investigations, and indicate future areas of technological development and research.
Affiliation(s)
- Alka Saxena
- Omics Science Center, RIKEN Yokohama Institute, Tsurumi, Japan
|
286
|
Abstract
In the few years since its initial application, massively parallel cDNA sequencing, or RNA-seq, has allowed many advances in the characterization and quantification of transcriptomes. Recently, several developments in RNA-seq methods have provided an even more complete characterization of RNA transcripts. These developments include improvements in transcription start site mapping, strand-specific measurements, gene fusion detection, small RNA characterization and detection of alternative splicing events. Ongoing developments promise further advances in the application of RNA-seq, particularly direct RNA sequencing and approaches that allow RNA quantification from very small amounts of cellular materials.
Affiliation(s)
- Fatih Ozsolak
- Helicos BioSciences Corporation, One Kendall Square, Cambridge, Massachusetts 02139, USA.
|
287
|
Hoskins RA, Landolin JM, Brown JB, Sandler JE, Takahashi H, Lassmann T, Yu C, Booth BW, Zhang D, Wan KH, Yang L, Boley N, Andrews J, Kaufman TC, Graveley BR, Bickel PJ, Carninci P, Carlson JW, Celniker SE. Genome-wide analysis of promoter architecture in Drosophila melanogaster. Genome Res 2010; 21:182-92. [PMID: 21177961 DOI: 10.1101/gr.112466.110] [Citation(s) in RCA: 167] [Impact Index Per Article: 11.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
Abstract
Core promoters are critical regions for gene regulation in higher eukaryotes. However, the boundaries of promoter regions, the relative rates of initiation at the transcription start sites (TSSs) distributed within them, and the functional significance of promoter architecture remain poorly understood. We produced a high-resolution map of promoters active in the Drosophila melanogaster embryo by integrating data from three independent and complementary methods: 21 million cap analysis of gene expression (CAGE) tags, 1.2 million RNA ligase mediated rapid amplification of cDNA ends (RLM-RACE) reads, and 50,000 cap-trapped expressed sequence tags (ESTs). We defined 12,454 promoters of 8037 genes. Our analysis indicates that, due to non-promoter-associated RNA background signal, previous studies have likely overestimated the number of promoter-associated CAGE clusters by fivefold. We show that TSS distributions form a complex continuum of shapes, and that promoters active in the embryo and adult have highly similar shapes in 95% of cases. This suggests that these distributions are generally determined by static elements such as local DNA sequence and are not modulated by dynamic signals such as histone modifications. Transcription factor binding motifs are differentially enriched as a function of promoter shape, and peaked promoter shape is correlated with both temporal and spatial regulation of gene expression. Our results contribute to the emerging view that core promoters are functionally diverse and control patterning of gene expression in Drosophila and mammals.
Affiliation(s)
- Roger A Hoskins
- Department of Genome Dynamics, Life Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 97420, USA
|
288
|
Cherbas L, Willingham A, Zhang D, Yang L, Zou Y, Eads BD, Carlson JW, Landolin JM, Kapranov P, Dumais J, Samsonova A, Choi JH, Roberts J, Davis CA, Tang H, van Baren MJ, Ghosh S, Dobin A, Bell K, Lin W, Langton L, Duff MO, Tenney AE, Zaleski C, Brent MR, Hoskins RA, Kaufman TC, Andrews J, Graveley BR, Perrimon N, Celniker SE, Gingeras TR, Cherbas P. The transcriptional diversity of 25 Drosophila cell lines. Genome Res 2010; 21:301-14. [PMID: 21177962 DOI: 10.1101/gr.112961.110] [Citation(s) in RCA: 191] [Impact Index Per Article: 13.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]
Abstract
Drosophila melanogaster cell lines are important resources for cell biologists. Here, we catalog the expression of exons, genes, and unannotated transcriptional signals for 25 lines. Unannotated transcription is substantial (typically 19% of euchromatic signal). Conservatively, we identify 1405 novel transcribed regions; 684 of these appear to be new exons of neighboring, often distant, genes. Sixty-four percent of genes are expressed detectably in at least one line, but only 21% are detected in all lines. Each cell line expresses, on average, 5885 genes, including a common set of 3109. Expression levels vary over several orders of magnitude. Major signaling pathways are well represented: most differentiation pathways are "off" and survival/growth pathways "on." Roughly 50% of the genes expressed by each line are not part of the common set, and these show considerable individuality. Thirty-one percent are expressed at a higher level in at least one cell line than in any single developmental stage, suggesting that each line is enriched for genes characteristic of small sets of cells. Most remarkable is that imaginal disc-derived lines can generally be assigned, on the basis of expression, to small territories within developing discs. These mappings reveal unexpected stability of even fine-grained spatial determination. No two cell lines show identical transcription factor expression. We conclude that each line has retained features of an individual founder cell superimposed on a common "cell line" gene expression pattern.
Affiliation(s)
- Lucy Cherbas
- Center for Genomics and Bioinformatics, Indiana University, Bloomington, Indiana 47405, USA.
|
289
|
Lu ZJ, Yip KY, Wang G, Shou C, Hillier LW, Khurana E, Agarwal A, Auerbach R, Rozowsky J, Cheng C, Kato M, Miller DM, Slack F, Snyder M, Waterston RH, Reinke V, Gerstein MB. Prediction and characterization of noncoding RNAs in C. elegans by integrating conservation, secondary structure, and high-throughput sequencing and array data. Genome Res 2010; 21:276-85. [PMID: 21177971 DOI: 10.1101/gr.110189.110] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.1] [Indexed: 11/25/2022]
Abstract
We present an integrative machine learning method, incRNA, for whole-genome identification of noncoding RNAs (ncRNAs). It combines a large amount of expression data, RNA secondary-structure stability, and evolutionary conservation at the protein and nucleic-acid level. Using the incRNA model and data from the modENCODE consortium, we are able to separate known C. elegans ncRNAs from coding sequences and other genomic elements with a high level of accuracy (97% AUC on an independent validation set), and find more than 7000 novel ncRNA candidates, among which more than 1000 are located in the intergenic regions of C. elegans genome. Based on the validation set, we estimate that 91% of the approximately 7000 novel ncRNA candidates are true positives. We then analyze 15 novel ncRNA candidates by RT-PCR, detecting the expression for 14. In addition, we characterize the properties of all the novel ncRNA candidates and find that they have distinct expression patterns across developmental stages and tend to use novel RNA structural families. We also find that they are often targeted by specific transcription factors (∼59% of intergenic novel ncRNA candidates). Overall, our study identifies many new potential ncRNAs in C. elegans and provides a method that can be adapted to other organisms.
Affiliation(s)
- Zhi John Lu
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut 06520, USA
|
290
|
Munafó DB, Robb GB. Optimization of enzymatic reaction conditions for generating representative pools of cDNA from small RNA. RNA (NEW YORK, N.Y.) 2010; 16:2537-52. [PMID: 20921270 PMCID: PMC2995414 DOI: 10.1261/rna.2242610] [Citation(s) in RCA: 103] [Impact Index Per Article: 7.4] [Received: 04/26/2010] [Accepted: 08/30/2010] [Indexed: 05/23/2023]
Abstract
Small regulatory RNA repertoires in biological samples are heterogeneous mixtures that may include species arising from varied biosynthetic pathways and modification events. Small RNA profiling and discovery approaches ought to capture molecules in a way that is representative of expression level. It follows that the effects of RNA modifications on representation should be minimized. The collection of high-quality, representative data, therefore, will be highly dependent on bias-free sample manipulation in advance of quantification. We examined the impact of 2'-O-methylation of the 3'-terminal nucleotide of small RNA on key enzymatic reactions of standard front-end manipulation schemes. Here we report that this common modification negatively influences the representation of these small RNA species. Deficits occurred at multiple steps as determined by gel analysis of synthetic input RNA and by quantification and sequencing of derived cDNA pools. We describe methods to minimize the effects of 2'-O-methyl modification of small RNA 3'-termini using T4 RNA ligase 2 truncated, and other optimized reaction conditions, demonstrating their use by quantifying representation of miRNAs and piRNAs in cDNA pools prepared from biological samples.
|
291
|
Valor LM, Barco A. Hippocampal gene profiling: toward a systems biology of the hippocampus. Hippocampus 2010; 22:929-41. [PMID: 21080408 DOI: 10.1002/hipo.20888] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Accepted: 08/24/2010] [Indexed: 01/17/2023]
Abstract
Transcriptomics and proteomics approaches give a unique perspective for understanding brain and hippocampal functions but also pose unique challenges because of the singular complexity of the nervous system. The proliferation of genome-wide expression studies during the last decade has provided important insight into the molecular underpinnings of brain anatomy, neural plasticity, and neurological diseases. Microarray technology has dominated transcriptomics research, but this situation is rapidly changing with the recent technological advances in high-throughput sequencing. The full potential of transcriptomics in the neurosciences will be achieved as a result of its integration with other "-omics" disciplines as well as the development of novel analytical bioinformatics and systems biology tools for meta-analysis. Here, we review some of the most relevant advances in the gene profiling of the hippocampus, its relationship with proteomics approaches, and the promising perspectives for the future.
Affiliation(s)
- Luis M Valor
- Instituto de Neurociencias de Alicante, Universidad Miguel Hernández-Consejo Superior de Investigaciones Científicas, Campus de Sant Joan, Apt. 18, Sant Joan d'Alacant, 03550, Alicante, Spain
|
292
|
Mercer TR, Wilhelm D, Dinger ME, Soldà G, Korbie DJ, Glazov EA, Truong V, Schwenke M, Simons C, Matthaei KI, Saint R, Koopman P, Mattick JS. Expression of distinct RNAs from 3' untranslated regions. Nucleic Acids Res 2010; 39:2393-403. [PMID: 21075793 PMCID: PMC3064787 DOI: 10.1093/nar/gkq1158] [Citation(s) in RCA: 162] [Impact Index Per Article: 11.6] [Indexed: 11/14/2022] Open
Abstract
The 3′ untranslated regions (3′UTRs) of eukaryotic genes regulate mRNA stability, localization and translation. Here, we present evidence that large numbers of 3′UTRs in human, mouse and fly are also expressed separately from the associated protein-coding sequences to which they are normally linked, likely by post-transcriptional cleavage. Analysis of CAGE (capped analysis of gene expression), SAGE (serial analysis of gene expression) and cDNA libraries, as well as microarray expression profiles, demonstrates that the independent expression of 3′UTRs is a regulated and conserved genome-wide phenomenon. We characterize the expression of several 3′UTR-derived RNAs (uaRNAs) in detail in mouse embryos, showing by in situ hybridization that these transcripts are expressed in a cell- and subcellular-specific manner. Our results suggest that 3′UTR sequences can function not only in cis to regulate protein expression, but also intrinsically and independently in trans, likely as noncoding RNAs, a conclusion supported by a number of previous genetic studies. Our findings suggest novel functions for 3′UTRs, as well as caution in the use of 3′UTR sequence probes to analyze gene expression.
Affiliation(s)
- Tim R Mercer
- Institute for Molecular Biosciences, The University of Queensland, Brisbane, QLD 4072, Australia
|
293
|
Underwood JG, Uzilov AV, Katzman S, Onodera CS, Mainzer JE, Mathews DH, Lowe TM, Salama SR, Haussler D. FragSeq: transcriptome-wide RNA structure probing using high-throughput sequencing. Nat Methods 2010; 7:995-1001. [PMID: 21057495 PMCID: PMC3247016 DOI: 10.1038/nmeth.1529] [Citation(s) in RCA: 242] [Impact Index Per Article: 17.3] [Received: 08/16/2010] [Accepted: 10/13/2010] [Indexed: 01/07/2023]
Abstract
Previous efforts to determine structures of non-coding RNA (ncRNA) probed only one RNA at a time with enzymes and chemicals, using gel electrophoresis to identify reactive positions. To accelerate RNA structure inference, we have developed FragSeq, a high-throughput RNA structure probing method that uses high-throughput RNA sequencing on fragments generated by nuclease P1, which specifically cleaves single stranded nucleic acids. In experiments probing the entire mouse nuclear transcriptome, we show that we can accurately and simultaneously map single-stranded regions (ssRNA) in multiple ncRNAs with known structure. We carried out probing in two cell types to demonstrate reproducibility. We also identified and experimentally validated structured regions in ncRNAs never previously probed.
Affiliation(s)
- Jason G Underwood
- Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, California, USA
|
294
|
Mercer TR, Dinger ME, Bracken CP, Kolle G, Szubert JM, Korbie DJ, Askarian-Amiri ME, Gardiner BB, Goodall GJ, Grimmond SM, Mattick JS. Regulated post-transcriptional RNA cleavage diversifies the eukaryotic transcriptome. Genome Res 2010; 20:1639-50. [PMID: 21045082 DOI: 10.1101/gr.112128.110] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.8] [Indexed: 01/13/2023]
Abstract
The complexity of the eukaryotic transcriptome is generated by the interplay of transcription initiation, termination, alternative splicing, and other forms of post-transcriptional modification. It was recently shown that RNA transcripts may also undergo cleavage and secondary 5' capping. Here, we show that post-transcriptional cleavage of RNA contributes to the diversification of the transcriptome by generating a range of small RNAs and long coding and noncoding RNAs. Using genome-wide histone modification and RNA polymerase II occupancy data, we confirm that the vast majority of intraexonic CAGE tags are derived from post-transcriptional processing. By comparing exonic CAGE tags to tissue-matched PARE data, we show that the cleavage and subsequent secondary capping is regulated in a developmental-stage- and tissue-specific manner. Furthermore, we find evidence of prevalent RNA cleavage in numerous transcriptomic data sets, including SAGE, cDNA, small RNA libraries, and deep-sequenced size-fractionated pools of RNA. These cleavage products include mRNA variants that retain the potential to be translated into shortened functional protein isoforms. We conclude that post-transcriptional RNA cleavage is a key mechanism that expands the functional repertoire and scope for regulatory control of the eukaryotic transcriptome.
Affiliation(s)
- Tim R Mercer
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
|
295
|
Ørom UA, Derrien T, Beringer M, Gumireddy K, Gardini A, Bussotti G, Lai F, Zytnicki M, Notredame C, Huang Q, Guigo R, Shiekhattar R. Long noncoding RNAs with enhancer-like function in human cells. Cell 2010; 143:46-58. [PMID: 20887892 DOI: 10.1016/j.cell.2010.09.001] [Citation(s) in RCA: 1429] [Impact Index Per Article: 102.1] [Received: 04/23/2010] [Revised: 07/01/2010] [Accepted: 08/13/2010] [Indexed: 12/21/2022]
Abstract
While the long noncoding RNAs (ncRNAs) constitute a large portion of the mammalian transcriptome, their biological functions have remained elusive. A few long ncRNAs that have been studied in any detail silence gene expression in processes such as X-inactivation and imprinting. We used a GENCODE annotation of the human genome to characterize over a thousand long ncRNAs that are expressed in multiple cell lines. Unexpectedly, we found an enhancer-like function for a set of these long ncRNAs in human cell lines. Depletion of a number of ncRNAs led to decreased expression of their neighboring protein-coding genes, including the master regulator of hematopoiesis, SCL (also called TAL1), Snai1 and Snai2. Using heterologous transcription assays we demonstrated a requirement for the ncRNAs in activation of gene expression. These results reveal an unanticipated role for a class of long ncRNAs in activation of critical regulators of development and differentiation.
|
296
|
Cooper DN, Chen JM, Ball EV, Howells K, Mort M, Phillips AD, Chuzhanova N, Krawczak M, Kehrer-Sawatzki H, Stenson PD. Genes, mutations, and human inherited disease at the dawn of the age of personalized genomics. Hum Mutat 2010; 31:631-55. [PMID: 20506564 DOI: 10.1002/humu.21260] [Citation(s) in RCA: 117] [Impact Index Per Article: 8.4] [Indexed: 12/12/2022]
Abstract
The number of reported germline mutations in human nuclear genes, either underlying or associated with inherited disease, has now exceeded 100,000 in more than 3,700 different genes. The availability of these data has both revolutionized the study of the morbid anatomy of the human genome and facilitated "personalized genomics." With approximately 300 new "inherited disease genes" (and approximately 10,000 new mutations) being identified annually, it is pertinent to ask how many "inherited disease genes" there are in the human genome, how many mutations reside within them, and where such lesions are likely to be located? To address these questions, it is necessary not only to reconsider how we define human genes but also to explore notions of gene "essentiality" and "dispensability." Answers to these questions are now emerging from recent novel insights into genome structure and function and through complete genome sequence information derived from multiple individual human genomes. However, a change in focus toward screening functional genomic elements as opposed to genes sensu stricto will be required if we are to capitalize fully on recent technical and conceptual advances and identify new types of disease-associated mutation within noncoding regions remote from the genes whose function they disrupt.
Affiliation(s)
- David N Cooper
- Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff CF14 4XN, United Kingdom.
|
297
|
Kapranov P, Ozsolak F, Kim SW, Foissac S, Lipson D, Hart C, Roels S, Borel C, Antonarakis SE, Monaghan AP, John B, Milos PM. New class of gene-termini-associated human RNAs suggests a novel RNA copying mechanism. Nature 2010; 466:642-6. [PMID: 20671709 DOI: 10.1038/nature09190] [Citation(s) in RCA: 76] [Impact Index Per Article: 5.4] [Received: 07/02/2009] [Accepted: 05/20/2010] [Indexed: 11/09/2022]
Abstract
Small (<200 nucleotide) RNA (sRNA) profiling of human cells using various technologies demonstrates unexpected complexity of sRNAs with hundreds of thousands of sRNA species present. Genetic and in vitro studies show that these RNAs are not merely degradation products of longer transcripts but could indeed have a function. Furthermore, profiling of RNAs, including the sRNAs, can reveal not only novel transcripts, but also make clear predictions about the existence and properties of novel biochemical pathways operating in a cell. For example, sRNA profiling in human cells indicated the existence of an unknown capping mechanism operating on cleaved RNA, a biochemical component of which was later identified. Here we show that human cells contain a novel type of sRNA that has non-genomically encoded 5' poly(U) tails. The presence of these RNAs at the termini of genes, specifically at the very 3' ends of known mRNAs, strongly argues for the presence of a yet uncharacterized endogenous biochemical pathway in cells that can copy RNA. We show that this pathway can operate on multiple genes, with specific enrichment towards transcript-encoding components of the translational machinery. Finally, we show that genes are also flanked by sense, 3' polyadenylated sRNAs that are likely to be capped.
Affiliation(s)
- Philipp Kapranov
- Helicos BioSciences Corporation, 1 Kendall Sq. Ste B7301 Cambridge, Massachusetts 02139-1671, USA.
|
298
|
Sun H, Wu J, Wickramasinghe P, Pal S, Gupta R, Bhattacharyya A, Agosto-Perez FJ, Showe LC, Huang THM, Davuluri RV. Genome-wide mapping of RNA Pol-II promoter usage in mouse tissues by ChIP-seq. Nucleic Acids Res 2010; 39:190-201. [PMID: 20843783 PMCID: PMC3017616 DOI: 10.1093/nar/gkq775] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.4] [Indexed: 11/12/2022] Open
Abstract
Alternative promoters that are differentially used in various cellular contexts and tissue types add to the transcriptional complexity in mammalian genome. Identification of alternative promoters and the annotation of their activity in different tissues is one of the major challenges in understanding the transcriptional regulation of the mammalian genes and their isoforms. To determine the use of alternative promoters in different tissues, we performed ChIP-seq experiments using antibody against RNA Pol-II, in five adult mouse tissues (brain, liver, lung, spleen and kidney). Our analysis identified 38 639 Pol-II promoters, including 12 270 novel promoters, for both protein coding and non-coding mouse genes. Of these, 6384 promoters are tissue specific which are CpG poor and we find that only 34% of the novel promoters are located in CpG-rich regions, suggesting that novel promoters are mostly tissue specific. By identifying the Pol-II bound promoter(s) of each annotated gene in a given tissue, we found that 37% of the protein coding genes use alternative promoters in the five mouse tissues. The promoter annotations and ChIP-seq data presented here will aid ongoing efforts of characterizing gene regulatory regions in mammalian genomes.
Affiliation(s)
- Hao Sun
- Center for Systems and Computational Biology, Molecular and Cellular Oncogenesis Program, The Wistar Institute, Philadelphia, PA 19104, USA
|
299
|
|
300
|
Comprehensive comparative analysis of strand-specific RNA sequencing methods. Nat Methods 2010; 7:709-15. [PMID: 20711195 PMCID: PMC3005310 DOI: 10.1038/nmeth.1491] [Citation(s) in RCA: 529] [Impact Index Per Article: 37.8] [Received: 03/26/2010] [Accepted: 07/20/2010] [Indexed: 12/17/2022]
Abstract
Strand-specific, massively parallel cDNA sequencing (RNA-seq) is a powerful tool for transcript discovery, genome annotation and expression profiling. There are multiple published methods for strand-specific RNA-seq, but no consensus exists as to how to choose between them. Here we developed a comprehensive computational pipeline to compare library quality metrics from any RNA-seq method. Using the well-annotated Saccharomyces cerevisiae transcriptome as a benchmark, we compared seven library-construction protocols, including both published and our own methods. We found marked differences in strand specificity, library complexity, evenness and continuity of coverage, agreement with known annotations and accuracy for expression profiling. Weighing each method's performance and ease, we identified the dUTP second-strand marking and the Illumina RNA ligation methods as the leading protocols, with the former benefitting from the current availability of paired-end sequencing. Our analysis provides a comprehensive benchmark, and our computational pipeline is applicable for assessment of future protocols in other organisms.
|